Demo

Talk Data to Me: Conversational AI for FAIR and Accessible Biomedical Data Discovery

16 September 2025 | 10:15 | 80/1-001 - Globe of Science and Innovation - 1st Floor

This demo will discuss how conversational interfaces can bridge technical, linguistic, and policy gaps in data discovery—particularly for regulated data that must remain access-controlled. It also explores the integration of safeguards such as access permissions, response transparency, and bias auditing.

The biomedical research ecosystem is rich in data but poor in discoverability. Researchers often struggle to identify and evaluate datasets across fragmented platforms, inconsistent metadata schemas, and access-restricted environments. Traditional search interfaces fail to accommodate the exploratory and interdisciplinary nature of modern research.

We propose a new paradigm: using conversational AI to transform metadata search into natural dialogue. Powered by large language models, the chatbot prototype on Synapse.org interprets user intent, translates natural language into structured queries, and surfaces metadata summaries—enabling ethical and efficient discovery even in regulated domains.

Users can ask nuanced questions like “Which Alzheimer's datasets involve Type II Diabetes in patients over 60?” and receive synthesized metadata responses, access notes, and provenance trails. The chatbot supports non-technical users and facilitates equitable access by acting as a semantic translator between scientific domains. Moreover, logs of user queries inform improvements in metadata quality and usability.
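As a rough illustration of the translation step described above, the sketch below shows how the example question might be represented as a structured query and matched against metadata records. Everything here is hypothetical: the `DatasetRecord` and `StructuredQuery` types, their field names, and the sample catalog are assumptions made for illustration, not the Synapse.org schema or the prototype's actual implementation. The language-model step is not shown; the query object is written by hand to stand in for what a model might emit.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    name: str
    conditions: Set[str]
    min_age: int
    max_age: int
    access: str  # e.g. "open" or "controlled"


@dataclass
class StructuredQuery:
    """Hand-written stand-in for what a language model might produce."""
    required_conditions: Set[str] = field(default_factory=set)
    age_over: Optional[int] = None


def search(records: List[DatasetRecord], query: StructuredQuery) -> List[DatasetRecord]:
    """Return records that cover every required condition and age constraint."""
    hits = []
    for record in records:
        # Every requested condition must appear in the record's metadata.
        if not query.required_conditions <= record.conditions:
            continue
        # "Over N" means the cohort must include participants older than N.
        if query.age_over is not None and record.max_age <= query.age_over:
            continue
        hits.append(record)
    return hits


def summarize(record: DatasetRecord) -> str:
    """Expose only metadata and an access note, never the data itself."""
    return f"{record.name} (access: {record.access})"


# Invented sample catalog, for illustration only.
CATALOG = [
    DatasetRecord("Study A", {"Alzheimer's disease", "Type II diabetes"}, 55, 90, "controlled"),
    DatasetRecord("Study B", {"Alzheimer's disease"}, 65, 95, "open"),
    DatasetRecord("Study C", {"Alzheimer's disease", "Type II diabetes"}, 40, 60, "controlled"),
]

# "Which Alzheimer's datasets involve Type II Diabetes in patients over 60?"
QUERY = StructuredQuery(
    required_conditions={"Alzheimer's disease", "Type II diabetes"},
    age_over=60,
)

if __name__ == "__main__":
    for hit in search(CATALOG, QUERY):
        print(summarize(hit))
```

In this sketch the summary carries an access note alongside the dataset name, so a user can learn that a controlled dataset exists without seeing its contents.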

We argue that conversational AI is not just an interface improvement, but a step toward inclusive and intuitive open science infrastructure.

Presenters

Susheel Varma

Chief Data Officer
Susheel Varma is the Chief Data Officer at Sage Bionetworks, where he leads strategy across all scientific research and biomedical data management teams. Prior to joining Sage Bionetworks, Susheel was the Head of Artificial Intelligence and Data Science at the Information Commissioner’s Office, where he was at the forefront of developing and implementing cutting-edge technology and AI regulations, balancing technological advancements with critical considerations in data privacy and ethical AI use. Before that, he was the Chief Technology Officer at Health Data Research UK, where he led the digital, data, and technology strategy, playing a pivotal role in transforming health data research across the UK. He spearheaded initiatives like the Health Data Research Innovation Gateway, Trusted Research Environments (TREs) and the National Cohort Discovery Service, significantly advancing interoperability and secure data sharing across the health and research sectors. In his earlier roles at EMBL-EBI and ELIXIR, he led an international portfolio of data science projects across multiple jurisdictions and research infrastructures, including EMBL, ELIXIR, EOSC and GA4GH. He holds an Executive MBA specializing in the Management of Research Infrastructures and a Doctor of Philosophy in Computational Systems Biology from the University of Sheffield. He is also an elected Fellow of the British Computer Society, in recognition of his career contributions to developing innovative solutions at the intersection of academia, healthcare and technology.