Medical AI Needs Its Physics
From ChatBots to Real AI
This recent paper from DeepMind is beautiful.
What I love about it is that it takes a realistic approach to solving problems with LLMs: don’t force LLMs to do what they can’t do.
This is so important in the field of medicine. Let me briefly outline where my thoughts are with AI in Medicine.
ChatBot Alone
Just last week we saw a cool paper, also from Google.
There are some cool bits to this, but suffice it to say: it tries to push everything about medicine through the LLM funnel. And LLMs can’t reason; they don’t have the architecture for it, which makes genuine reasoning unlikely to emerge.
We also see, time and time again, that they can’t generalize to novel situations with common sense. It sure looks like RLHF heuristically removes nonsensical outputs, but removing nonsense is very different from building sense.
Suffice it to say — cool, but the wrong tool for medical reasoning.
Principle-Driven
LLMs are decent translators — sometimes, if the sun hits them at the right angle.
What we need behind the translation is a reasoning engine, one that relies on different principles, or dynamics, than those present in language/grammar.
The Physics-Based AI community has been doing incredible work here for a while.
And now we’re seeing this approach make its way to scale. The architecture of AlphaGeometry is something to watch very closely.
What I love about this: it uses LLMs as translators, but offloads the real reasoning to architectures that are more congruent with the job: the symbolic engine, in this case.
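To make that division of labor concrete, here is a toy sketch in Python (made-up names and rules, not AlphaGeometry’s actual system): a symbolic engine forward-chains hand-written rules, and the “LLM” is only consulted, as a suggester, when deduction gets stuck.

```python
# Toy illustration of the neuro-symbolic pattern (NOT AlphaGeometry's code).
# The symbolic engine owns the reasoning; the "LLM" only proposes an
# auxiliary fact when the engine is stuck. All rules and names are invented.

RULES = [  # premises -> conclusion
    ({"angle_sum_known", "two_angles_known"}, "third_angle_known"),
    ({"third_angle_known", "side_known"}, "triangle_solved"),
]

def deduce_closure(facts):
    """Forward-chain the symbolic rules until nothing new can be derived."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def llm_suggest(facts):
    """Stand-in for the language model: propose an auxiliary construction.
    In the real system this is a trained model; here it is canned."""
    return "side_known"

facts = deduce_closure({"angle_sum_known", "two_angles_known"})
if "triangle_solved" not in facts:      # stuck: ask the "translator" for help
    facts.add(llm_suggest(facts))
    facts = deduce_closure(facts)       # the engine, not the LLM, finishes the job
print("triangle_solved" in facts)       # True
```

The key property is that every derived fact comes from the rule engine, so the final chain of steps is checkable; the language model never gets to assert a conclusion on its own.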
I want to say more, but I’ll spend more time with the paper before writing something more long-form. Hopefully, you understand why I’m excited at this stage.
Caveats
To do good science, we have to think critically about the work. I obviously like the work above, but it comes with its own set of problems.
- We know the rules of Geometry. We don’t know the rules of Medicine. But some of us are working on that…
- DeepMind’s History of Hype. Skin cancer, Coding, Matrix Multiplication, Protein Folding, etc. are all things that DeepMind has supposedly solved once and for all… except the universe hasn’t gotten the memo.
- Lack of Transparency. We’ve seen so many earlier projects fall flat, and without critical, transparent, maybe-slow assessment by the broader research community, we can’t efficiently build on those projects.
- Synthetic Data. I love the synthetic data approach. It helped me in one of my main PhD papers (https://pubmed.ncbi.nlm.nih.gov/36288215/). But it comes with its own caveats: you’re baking in the very principles that you’re trying to infer (see the toy sketch after this list). In geometry, this is fine. But in medicine, we can’t rely solely on synthetic data to get something that converges to reality… yet.
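To make that caveat concrete, here is a tiny, entirely hypothetical sketch (not from the paper above, nor from my own work): we invent a rule, simulate data from it, and then “discover” exactly the rule we put in.

```python
import random

# Hypothetical example of baking in the principle you want to infer:
# the generator itself encodes the law (risk = 2 * dose + noise).

def synthetic_case(rng):
    """Simulate one 'patient' from a rule we invented ourselves."""
    dose = rng.uniform(0.0, 1.0)
    risk = 2.0 * dose + rng.gauss(0.0, 0.05)
    return dose, risk

rng = random.Random(0)
data = [synthetic_case(rng) for _ in range(1000)]

# Least-squares slope (through the origin, for brevity) lands near 2.0,
# i.e. exactly the coefficient we wrote into the generator.
slope = sum(d * r for d, r in data) / sum(d * d for d, _ in data)
print(round(slope, 2))
```

In geometry the axioms really are known, so generating from them is legitimate; in medicine the true generator isn’t ours to write down, which is the whole problem.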
Summary
I’m really liking the work coming out of Google’s AI teams: they’re staying focused on principles, not statistical correlations in token space, however impressive those correlations can be.
Sadly, lots of LLM efforts are going down the trinkets path, but it’s good to see efforts-at-scale that use LLMs in ways more commensurate with their capabilities.
I hope to see progress made more ethically: work that plugs into academic communities and takes critical assessment more seriously.