T̪alapu

LLMs are not compilers or interpreters

We humans use analogies and metaphors to understand concepts. But it is important to remember that they are only approximations, not the concept itself.

Saying that an LLM is like a new, higher level of abstraction that compiles natural language into code is just an analogy. It is not the equivalent of a C compiler creating machine code.

Natural language is ambiguous. This has been known for millennia, and we worked around that problem by creating new languages that are unambiguous or at least have as little ambiguity as possible. Mathematics is one such example. Some mathematical ideas cannot accurately be expressed in natural language. We have to use mathematical notation.

Code is like math. It's an unambiguous expression of a computation. When going from C to machine code, we are going from one unambiguous language to another.

When an LLM creates code from natural language, there is inherent ambiguity in the process. This means we cannot "program in English". Now one might say we can create validation conditions to make sure that the LLM takes a spec and implements it correctly. But the validation conditions themselves are written in natural language and will have ambiguity. We have just kicked the ambiguity can a bit further down the road.
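To make the ambiguity concrete, here is a minimal sketch in Python. The English spec, the sample names, and both function names are made up for illustration. It shows one plain-English instruction that admits at least two reasonable implementations, each of which is a faithful reading of the words.

```python
# Hypothetical English spec: "Sort the names alphabetically."
names = ["bob", "Alice", "carol", "Dave"]


def sort_by_code_point(items):
    # Reading 1: compare raw code points, so uppercase letters
    # sort before lowercase ones.
    return sorted(items)


def sort_ignoring_case(items):
    # Reading 2: compare case-insensitively, the way a phone book would.
    return sorted(items, key=str.casefold)


by_code_point = sort_by_code_point(names)
ignoring_case = sort_ignoring_case(names)

print(by_code_point)  # ['Alice', 'Dave', 'bob', 'carol']
print(ignoring_case)  # ['Alice', 'bob', 'carol', 'Dave']
```

Both functions "sort the names alphabetically", yet they disagree on the result. The English sentence does not say which one was meant. Each function, on the other hand, says exactly one thing.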

This is not an argument against the utility of LLMs. They are quite useful. But to produce useful software, we need a way to resolve the ambiguity inherent in an LLM's output. A human working in tandem with an LLM can do this quite easily. I'm not so sure about other attempts, like building a harness around the LLM. Such a harness will need a spec, and as an old joke goes, a spec that is unambiguous is called code.