June 2023. ChatGPT was six months old. We’d launched our first AI feature, Magic Canvas, three months before. I was meeting with our friends at Joann Fabric (may they rest in polyester) to explain what GenAI was about. They weren’t alone. Everyone was trying to figure out what it meant.
Not much has changed since then.
To try to make sense of it all, I took a moment of solitude with my keyboard. I decided to write down the principles that I believed were most likely to be durable, at least for the next while. Now you have only my word that I copied this faithfully from my Google Docs history – it would be a lot more compelling if I had dated receipts. (No sir, I really did mean to buy that lottery ticket!)
That said, if you take me at my word, it’s surprising how, in this fast-moving world… some things have turned out just the way we expected.
- Core AI models will improve faster than we can innovate. Don’t invest too much time in making the state of the art perform better. The state of the art will perform better on its own.
This continues to be true. Startups keep springing up whose core benefit is “prompt/finetune AI to make it better at ___”. They keep getting eaten by the models just getting better all by themselves. Then new ones keep sprouting.
- Models will leapfrog each other. Don’t build or buy lock-in to any specific technology, provider, or model. Even among the best models, some are dramatically more capable than others at any given time. (E.g. currently GPT-4 is far better than Anthropic’s Claude, though much slower.)
The leaders keep swapping places, but nobody is the durable champion. Gemini 2.5 Pro was the coding champ, then o3, then Opus, and last week I switched to GPT-5-Codex. This is not a place to get attached to any one model or model provider.
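To make that concrete, here’s a minimal sketch of the kind of thin routing layer I mean. Nothing here is a real SDK call – the adapters are stubs standing in for whatever providers you actually use – but the shape is the point: app code never touches a provider directly, so swapping models is a config change rather than a rewrite.

```python
from typing import Callable, Dict

# Hypothetical adapters: in real code each of these would wrap that provider's SDK.
# They are stubs here because the point is the boundary, not the integration.
ADAPTERS: Dict[str, Callable[[str], str]] = {
    "openai":    lambda prompt: f"(openai answer to: {prompt})",
    "anthropic": lambda prompt: f"(anthropic answer to: {prompt})",
    "google":    lambda prompt: f"(google answer to: {prompt})",
}

DEFAULT_MODEL = "anthropic"  # the only line that changes when the leaderboard changes

def complete(prompt: str, model: str | None = None) -> str:
    """Everything in the app calls this; no provider SDK leaks past it."""
    return ADAPTERS[model or DEFAULT_MODEL](prompt)

print(complete("Summarize this changelog in one sentence."))
```

In real life you’d add retries, streaming, and per-model prompt quirks, but the boundary is the part that keeps you un-locked-in.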
- Hosted LLMs will be more capable than self-hosted. Because of limited access to high-end GPUs and near-term closed source advantages, we should expect to pay providers for the most capable (biggest, smartest, most features) LLMs – not self-host.
This may seem obvious, but two years ago the self-hosted image-generation models were as good as, or even better than, what the big labs had to offer. It’s clear that we won’t be seeing that again any time soon.
- Small fine-tuned LLMs will outperform big ones at specific tasks. While big companies will have the best big models, we can host small, specialized models that are some combination of faster, cheaper, more power efficient, and better at a specific task.
I’m not sure ‘better at a specific task’ proved true. Finetuning can make smaller models almost as good as big ones at specific tasks.
- LLM token space will continue to increase, and large LLMs will become easily fine-tunable. It will become easy to incorporate lots of information in LLMs. Investments in embeddings are temporary workarounds, not strategic investments.
Eh, I did OK on this. Leading LLMs are hitting 1-2M tokens while Llama boasts 10M. But the range models can actually use reliably is much smaller. RAG/embeddings will likely be with us forever, because no matter how big token windows are, your data is bigger.
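Since I keep leaning on RAG, here’s a bare-bones sketch of the idea. The `embed()` function below is a toy stand-in for a real embedding model – assume yours returns meaningful vectors – but the flow is the same: index your corpus once, retrieve the few chunks that matter, and put only those into the prompt.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real embedding model; returns a unit-length vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

# The corpus is bigger than any context window, so index it up front.
documents = ["Refund policy: ...", "Shipping times: ...", "Loyalty program: ..."]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    scores = doc_vectors @ embed(query)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

# Stuff only the relevant slice into the prompt, not the whole corpus.
context = "\n".join(retrieve("How long do refunds take?"))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: How long do refunds take?"
```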
- Multimodal LLMs are the future. We should think about input and output of images, video, audio, and speech with all of our projects.
Well, yes. All I can say is this did not seem obvious at the time.
- Truth is the rarest commodity. Garbage in, garbage out, no matter how good the technology gets. Human-verified information that’s known to be correct is the most durable strategic investment. Look for scalable ways to validate information.
Most people still don’t get this. Validated, ground-truth data is the greatest asset in the AI world. LLMs know the patterns; you want to anchor them in the facts.
- AI Knowledge goes out of date weekly. The entire AI space is moving extraordinarily fast. What you know today may not be true tomorrow. What you thought was impossible or impractical today may very well be an open source project tomorrow. It is incumbent on all of us to track the progress more aggressively than we typically do for other areas of software.
Two years later and not a moment’s rest in sight.
- Take a hard look at AI features in SaaS apps. There is a lot of vaporware, snake oil, and mediocrity. Set the bar high, and be sure to do a thorough evaluation on our own data before committing.
The vaporware used to be just saying ‘AI!’ and maybe having some machine learning feature that was half a decade old. Now businesses are integrating LLMs, badly. Buyer beware.
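What does ‘a thorough evaluation on our own data’ look like? Often something as unglamorous as this sketch: a small set of human-verified question/answer pairs and a pass rate. The `ask_vendor_feature()` function is a placeholder for whatever product is being trialed.

```python
# A tiny evaluation harness: run the vendor's AI feature against questions
# we already know the answers to, and count how often it gets them right.

golden_set = [
    {"question": "What is our return window?", "expected": "30 days"},
    {"question": "Which plan includes SSO?",   "expected": "Enterprise"},
    # ...ideally a few hundred of these, drawn from real tickets and docs.
]

def ask_vendor_feature(question: str) -> str:
    """Placeholder for the SaaS feature under evaluation."""
    return "Returns are accepted within 30 days."  # stubbed response

def pass_rate(cases: list[dict]) -> float:
    hits = sum(
        1 for case in cases
        if case["expected"].lower() in ask_vendor_feature(case["question"]).lower()
    )
    return hits / len(cases)

print(f"Pass rate on our own data: {pass_rate(golden_set):.0%}")
```

Substring matching is crude, but the principle holds: the bar is set by your data, not the vendor’s demo.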
- When choosing 3rd party services, avoid long-term commitments. That’s both financial commitments, and big integration projects – because competitors are leapfrogging each other. It’s not the time to place a big bet.
This has never been more true. Innovation is blindingly fast, and today’s leader may be gone tomorrow. It’s a great time to be experimenting and innovating, but a bad time to be making long-term technology purchases.
It’s surprising how, in this time of lightning change, some of these core principles are holding steady. Will they last another two years? I’m betting on yes.
(You might want to subscribe or follow me on Twitter so you don’t miss new articles)