9 Comments
Author

Hi Jiayu - certainly, and I would love to take a look at the translated post as well!

Regarding the "cache hit rate of 30%" part: if I understand correctly, a "cache hit" means no actual search happens, right? If so, 30% of searches cost nothing and their related revenue is pure profit. Accordingly, taking ChatGPT-Equivalent search as an example, with an overall 75% profit margin, the cost of each actual (uncached) search should be 0.055 * (1 - 75%) / (1 - 30%) ≈ 0.020, not 0.055 * (1 - 75%) * (1 - 30%) ≈ 0.010.
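
A quick sketch of the arithmetic in this comment, using the figures quoted above ($0.055 of revenue per ChatGPT-Equivalent search, a 75% overall margin, a 30% cache hit rate); the variable names are illustrative, not from the post:

```python
# Figures quoted in the comment above; names are illustrative only.
revenue_per_search = 0.055   # $ of revenue per ChatGPT-Equivalent search
gross_margin = 0.75          # overall profit margin
cache_hit_rate = 0.30        # fraction of queries served from cache at ~zero cost

# Average cost across ALL queries, cached or not, implied by the margin:
avg_cost = revenue_per_search * (1 - gross_margin)          # = 0.01375

# If cached queries cost nothing, only the uncached 70% carry that cost,
# so each actual (uncached) search must cost MORE than the average:
cost_per_actual_search = avg_cost / (1 - cache_hit_rate)    # ~0.0196, i.e. ~0.020

# The post's formula multiplies instead of dividing:
post_formula = avg_cost * (1 - cache_hit_rate)              # ~0.0096, i.e. ~0.010

print(f"cost per actual search: ${cost_per_actual_search:.4f}")
print(f"post's formula gives:   ${post_formula:.4f}")
```

Dividing spreads the same total cost over only the 70% of queries that actually run a search, which is why the per-search figure comes out higher than the average, not lower.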

Great! Well done.

Great article, with loads of in-depth knowledge of the details and specifics of LLMs. I am particularly intrigued that the rapid drop in training cost comes from GPU process-node efficiency and adjustments to model precision, not necessarily from the architecture of the models themselves. The comparison of training cost between the A100 and GCP's TPU is also very enlightening: one would expect the TPU to be way more efficient with much higher utilization, and the difference in utilization is impressive compared to Meta's 20% with the A100, according to LeCun. Curious to know how AWS's Trainium would compare, or is it probably not tuned for LLMs? Also, AMD's GPUs are underrepresented among the AI giants, understandably. As for Graphcore, Cerebras, SambaNova, Groq, and even Tenstorrent, I'm not sure how much $$$ opportunity is left for them given the ever-higher fab costs and the versatility of new AI models, with AI training power increasingly concentrated into fewer hands.

This might be the best public-facing post on LLM economics that I've seen so far. Any chance you have a Twitter account or some other channel we can follow for more regular updates?

Hi, Sunyan. Your blog is very enlightening! I'm wondering if we could translate this post into Chinese and publish it on our WeChat official account. We would keep the original link and state where it was translated from. I believe this would benefit more people! Thank you!
