scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery
Abstract
We present scPilot, the first systematic framework for omics-native reasoning, in which a large language model (LLM) converses in natural language while directly inspecting single-cell RNA-seq data and calling bioinformatics tools on demand. scPilot converts core analyses (cell-type annotation, developmental-trajectory reconstruction, and transcription-factor targeting) into step-by-step reasoning problems that the model must solve, justify, and, when needed, revise in light of new evidence.
To measure progress, we release scBench, a suite of nine expertly curated datasets and graders that faithfully evaluate the omics-native reasoning capability of scPilot across different LLMs. With o1, iterative omics-native reasoning lifts average annotation accuracy by 11%; with Gemini 2.5 Pro, it cuts trajectory graph-edit distance by 30% relative to one-shot prompting. The experiments also reveal systematic failure modes in gene-regulatory prediction.
Grounding LLMs in raw omics data yields transparent, auditable analyses and opens a path toward fully automated, interpretable, and scientifically robust single-cell workflows.
