Let’s have Claude formulate the contradiction in Lean 4 and delegate the reasoning to a tool that is good at that. (Just like I wouldn’t do long division by hand.)
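Below is a minimal Lean 4 sketch of the Socrates example from Russ's list further down the thread; the type and predicate names (Person, Man, Mortal) are illustrative, not from the thread. It states the three premises as hypotheses and derives False, which is the formal sense in which the sentence set is contradictory.

theorem socrates_contradiction
    (Person : Type) (Socrates : Person)
    (Man Mortal : Person → Prop)
    -- Premise 1: all men are mortal.
    (all_men_mortal : ∀ p, Man p → Mortal p)
    -- Premise 2: Socrates is a man.
    (socrates_is_man : Man Socrates)
    -- Premise 3: Socrates is immortal, i.e. not mortal.
    (socrates_immortal : ¬ Mortal Socrates) : False :=
  -- Premises 1 and 2 give `Mortal Socrates`, which premise 3 refutes.
  socrates_immortal (all_men_mortal Socrates socrates_is_man)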
From: Russ Abbott <[email protected]>
Sent: Friday, September 12, 2025 3:51 PM
To: Marcus Daniels <[email protected]>
Cc: The Friday Morning Applied Complexity Coffee Group <[email protected]>
Subject: Re: [FRIAM] Hallucinations

Marcus,

You're right, and I was wrong. I was much too insistent that LLMs don't understand the text they manipulate.

A couple of weeks ago, I asked ChatGPT to embed (encode) a sentence and then decode it back to natural language. It said it didn't have access to the tools to do exactly that, but it would show me what the result would look like. The input sentence was: “I ate an apple because I was hungry. The apple was rotten. I got sick. My friend ate a banana. The banana was not rotten. My friend didn’t get sick.”

ChatGPT simulated embedding/encoding the sentence as a vector. It then produced what it claimed was a reasonable natural language approximation of that vector. The result was: "A person and their friend ate fruit. One of the fruits was rotten, which caused sickness, while the other was fresh and did not cause illness."

If ChatGPT can be believed, this is quite impressive. It implies that the embedding/encoding of natural language text includes something like the essential semantics of the original text. I had forgotten all about this when I wrote my post about hallucinations. I apologize.

What I would like to do now -- and perhaps someone can help figure out if any tools are available to do this -- is to explore more carefully the sorts of information embeddings/encodings contain. For example, what would one get if one encoded and then decoded Chomsky's famous sentence: "Colorless green ideas sleep furiously." What would one get if one encoded -> decoded a contradiction: "All men are mortal. Socrates is a man; Socrates is immortal." What about: "The integer 3 is larger than the integer 9." Or "The American Revolutionary War occurred during the 19th century. George Washington led the American troops in that war. George Washington's tenure as the inaugural president of the United States began on April 30, 1789." Etc.

-- Russ Abbott <https://russabbott.substack.com/> (Click for my Substack)
Professor Emeritus, Computer Science
California State University, Los Angeles
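One way to start on the exploration Russ asks about above is with an open-source sentence encoder; the sketch below assumes the sentence-transformers library and a small general-purpose model, which is my choice, not something from the thread (Russ's experiment used whatever internal representation ChatGPT has). It only encodes and compares -- genuinely decoding an embedding back to text would need a separately trained inverter -- but nearest-neighbor similarity already gives a rough probe of how much of the semantics the vector retains.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

original = ("I ate an apple because I was hungry. The apple was rotten. "
            "I got sick. My friend ate a banana. The banana was not rotten. "
            "My friend didn't get sick.")

candidates = [
    # ChatGPT's proposed "decoding"
    "A person and their friend ate fruit. One of the fruits was rotten, "
    "which caused sickness, while the other was fresh and did not cause illness.",
    # a semantically scrambled control
    "A person and their friend ate fruit. Both fruits were fresh and nobody got sick.",
    # Chomsky's sentence, for contrast
    "Colorless green ideas sleep furiously.",
]

emb_orig = model.encode(original, convert_to_tensor=True)
emb_cand = model.encode(candidates, convert_to_tensor=True)

# Cosine similarity: higher means the encoder places the texts closer in meaning.
for text, score in zip(candidates, util.cos_sim(emb_orig, emb_cand)[0]):
    print(f"{score.item():.3f}  {text[:60]}...")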
On Thu, Sep 11, 2025 at 6:59 PM Marcus Daniels <[email protected]> wrote:

It often works with the frontier models to take a computational science or theory paper and to have them implement the idea expressed in some computer language. One can also often invert that program back into natural language (and/or with LaTeX equations). Further, one can translate between very different formal languages (imperative vs. functional), which would be hard work for most people. These summaries and transformations work so well that tools like GitHub Copilot will periodically perform a conversation summary, and simply drop the whole conversation and start over with crystallized context (due to context window limitations). When it picks up after that, one will often see a few syntax or API misunderstandings before it regroups to where it was. What this pivoting ease implies to me is that LLMs have a deep semantic representation of the conversation (and knowledge and these skills). It certainly is not just a matter of mating token sequences with some deft smoothing.

Another example that has come up for me recently is using LLMs to predict simulation or solver outputs. When faced with learning large arrays of numbers, what a model does is more like capturing a picture than memorizing a sequence of digits. It doesn't know, without some help, why number boundaries, signs, and decimal points are important. Only through hard-won experience does it learn that the most and least significant digits should be treated differently. Syntax is a hint one can offer through weak scaffolding penalties (outside of the training material). It learns the semantics first; strong syntax penalties can get in the way of learning semantics by creating problematic energy barriers. While LLMs are huge, the Chinchilla optimality criterion (roughly 20 training tokens per parameter) forces regularization. There's some flood fill, but I don't think it can hold up for idiosyncratic lexical patterns.
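For scale on Marcus's Chinchilla remark, a back-of-envelope calculation (the model sizes below are illustrative, not from the thread): the compute-optimal recipe pairs roughly 20 training tokens with each parameter, so the data budget grows with the model and leaves little slack for memorizing idiosyncratic lexical patterns.

# Rough Chinchilla-style data budgets: ~20 tokens per parameter.
for params in (7e9, 70e9, 400e9):
    tokens = 20 * params
    print(f"{params / 1e9:>5.0f}B params -> ~{tokens / 1e12:.1f}T training tokens")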
From: Friam <[email protected]> on behalf of Santafe <[email protected]>
Date: Thursday, September 11, 2025 at 5:12 PM
To: [email protected], The Friday Morning Applied Complexity Coffee Group <[email protected]>
Subject: Re: [FRIAM] Hallucinations

In your post, Russ, you say: “They are trained to produce fluent language, not to produce valid statements.” Is that actually, operationally, what they are trained to do? I speak from a position of ignorance here, but my impression is that they are trained to effectively stitch together fragments of varying lengths, according to rules for what stitchings are “compatible”.

My thinking here is metaphorical, to homologous recombination in DNA. Some regions that don’t start out contiguous can be concatenated by DNA repair machinery, because under the physics to which it responds, they have plausible enough overlap that it considers them “compatible” or eligible to be identified at the join region, their “mismatches” edited out. Other pairs are so dissimilar that, under its operating physics, the repair machinery will effectively never join them. My metaphor isn’t great, in the sense that if what LLMs (for human speech) are doing is “next-word prediction”, that says that the notion of “joining” is reduced formally to appending next-words onto strings. Though, to the extent that certain substrings of next-words are extremely frequently attested across the corpus of all the training expressions, one would expect to see extended sequences essentially reproduced as fragments with large probability.

If my previous two characterizations aren’t fundamentally wrong, it would follow that fluent speech generation becomes possible because the compatible-joining relations are sufficiently strong in human languages that the attention structures or other feed-forward aspects of the architecture have no trouble capturing them in parameters, even though human linguists trying to write them as rewrite rules from which a computer could generate native-like speech failed for decades to get anywhere close to that. My interpretation here would be consistent with what I believed was the main watershed change in the LLMs: that the parametric models would, ultimately, have terribly few parameters, whereas the LLMs can flood-fill a corpus with parameters, and then try to drip out the parts that don’t “stick to” some pattern in the data, and are regarded as the excess entropy from the sampling algorithm that the training is supposed to recognize and remove.

It is easy to imagine that fluent speech has far more regularities than rule-book linguists captured parametrically, but still few enough that LLMs have no trouble attaching to almost all of them, with parameters to spare. Hence fluent speech could be epiphenomenal on what they are (operationally, mechanistically) being trained to do, but a natural summary statistic for the effectiveness of that training, and of course the one that drives market engagement.

But if the above is the case, then the question of when they get “the syntax” right and “the semantics” wrong would seem to turn on how much context from the training set is needed to identify semantically as well as syntactically appropriate “allowed joins” of fragments. When short fragments contain enough of their own context to constrain most of the semantics, the stitching training algorithm has no reason to perform any worse at revealing the semantic signal in the training set than the syntactic one. But if probability needs to be withheld for a long time in the prediction model, driving it to prioritize a much smaller number of longer or more remote assembled inputs from the training data, it could still do fine on syntax but fail to “find” and “render” the semantic signal in the training data, even if that signal is present in principle.

I would not feel a need to use terms like “understanding” anywhere in the above to make predictions of what kinds of successes or failures an LLM might deliver from the user’s perspective. It seems to me like something that all lives in the domain of hardness-of-search combinatorics in data-spaces with a lot of difficult structure.

Eric

On Sep 10, 2025, at 7:02, Russ Abbott <[email protected]> wrote:

OpenAI just published a paper on hallucinations <https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf> as well as a post summarizing the paper <https://openai.com/index/why-language-models-hallucinate/>. The two of them seem wrong-headed in such a simple and obvious way that I'm surprised the issue they discuss is still alive.

The paper and post point out that LLMs are trained to generate fluent language--which they do extraordinarily well. The paper and post also point out that LLMs are not trained to distinguish valid from invalid statements. Given those facts about LLMs, it's not clear why one should expect LLMs to be able to distinguish true statements from false statements--and hence why one should expect to be able to prevent LLMs from hallucinating. In other words, LLMs are built to generate text; they are not built to understand the texts they generate, and certainly not to be able to determine whether the texts they generate make factually correct or incorrect statements. Please see my post <https://russabbott.substack.com/p/why-language-models-hallucinate-according> elaborating on this. Why is this not obvious, and why is OpenAI still talking about it?

-- Russ Abbott <https://russabbott.substack.com/> (Click for my Substack)
Professor Emeritus, Computer Science
California State University, Los Angeles
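As a toy illustration of Eric's point above that "joining" reduces formally to appending next words, with frequently attested continuations dominating, here is a bigram counter over a scrap of text from the thread. Real LLMs condition on long contexts with learned attention; this is only meant to make the stitching picture concrete.

import random
from collections import defaultdict, Counter

corpus = (
    "the apple was rotten . the banana was not rotten . "
    "i ate an apple because i was hungry . my friend ate a banana ."
).split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start: str, length: int = 8) -> str:
    """Append next words sampled in proportion to how often they follow."""
    out = [start]
    for _ in range(length):
        counts = follows.get(out[-1])
        if not counts:
            break
        words, weights = zip(*counts.items())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

print(generate("the"))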
.- .-.. .-.. / ..-. --- --- - . .-. ... / .- .-. . / .-- .-. --- -. --. / ... --- -- . / .- .-. . / ..- ... . ..-. ..- .-..
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
