Thanks, I'm looking for something like this. From a quick scan of the
papers, though, I don't see a way for me to experiment with inputs of my
own. Do you know of any such system?
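For concreteness, a minimal sketch of the kind of experiment I have in mind,
assuming the Hugging Face transformers package and GPT-2 as a stand-in model.
It only exposes the raw per-token hidden-state vectors, not the kind of
feature analysis those papers do:

    # Sketch: inspect the hidden-state vectors a small open model builds
    # for an arbitrary input sentence.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

    inputs = tokenizer("I ate an apple because I was hungry.",
                       return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # One tensor per layer (embeddings + 12 transformer blocks for GPT-2),
    # each of shape (batch=1, num_tokens, hidden_size=768).
    for layer, h in enumerate(outputs.hidden_states):
        print(layer, tuple(h.shape))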



On Sat, Sep 13, 2025 at 9:23 PM Marcus Daniels <[email protected]> wrote:

> There is a significant literature on these topics…
>
> https://transformer-circuits.pub/2024/scaling-monosemanticity/
> https://arxiv.org/abs/1702.01135
> https://arxiv.org/html/2502.02470v2#abstract
>
>
>
> *From: *Russ Abbott <[email protected]>
> *Date: *Saturday, September 13, 2025 at 5:46 PM
> *To: *Marcus Daniels <[email protected]>
> *Cc: *The Friday Morning Applied Complexity Coffee Group <
> [email protected]>
> *Subject: *Re: [FRIAM] Hallucinations
>
> I wasn't expecting the LLM process to parallel the human process. I want
> to know what LLMs produce as an embedding and what that embedding might
> allow them to do. This is all about what an LLM can do on its own (without
> external tools). This isn't a new question. I'm surprised there hasn't been
> more work along these lines.
>
>
> On Fri, Sep 12, 2025 at 7:31 PM Marcus Daniels <[email protected]>
> wrote:
>
> Understanding how language and logic become encoded in deep neural nets is
> an interesting topic.  However, we don’t have that expectation of one
> another.   “Your argument is plausible, but has your synaptic connectivity
> (a bunch of floats) been imaged and studied for correctness?”
>
> If one wants to be sure that reasoning is sound, relying on human or LLM
> intuition is the wrong way to go about it.   Instead, with MCP one can
> leverage LLM language skills to formulate (and refine & run) code that
> checks whether the proofs they generate (or hallucinate) are sound.
>
>
>
> *From: *Russ Abbott <[email protected]>
> *Date: *Friday, September 12, 2025 at 7:16 PM
> *To: *Marcus Daniels <[email protected]>
> *Cc: *The Friday Morning Applied Complexity Coffee Group <
> [email protected]>
> *Subject: *Re: [FRIAM] Hallucinations
>
> I'm not sure what your point is. You would expect Lean to be able to do
> that. Also, my example didn't say that the combination of statements is
> invalid. You added that explicitly and asked Lean to confirm it.
>
>
>
> My interest is what the encoding of the natural-language input looks like
> and--since it just looks like a sequence of floats--what it conveys in the
> context of that particular learned embedding framework.
>
>
>
> -- Russ
>
>
>
> On Fri, Sep 12, 2025, 5:04 PM Marcus Daniels <[email protected]> wrote:
>
> Let’s have Claude formulate the contradiction in Lean 4 and delegate the
> reasoning to a tool that is good at that.
> (Just like I wouldn’t do long division by hand.)
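>
> Something like the following minimal Lean 4 sketch is what I have in mind
> for the Socrates example (Claude's actual formalization could differ; the
> names here are illustrative):
>
>     -- If Lean accepts this term, the three statements are jointly
>     -- contradictory: False follows from them.
>     example (Person : Type) (socrates : Person)
>         (man mortal : Person → Prop)
>         (all_men_mortal : ∀ p, man p → mortal p)
>         (socrates_is_man : man socrates)
>         (socrates_immortal : ¬ mortal socrates) : False :=
>       socrates_immortal (all_men_mortal socrates socrates_is_man)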
>
>
>
> *From:* Russ Abbott <[email protected]>
> *Sent:* Friday, September 12, 2025 3:51 PM
> *To:* Marcus Daniels <[email protected]>
> *Cc:* The Friday Morning Applied Complexity Coffee Group <
> [email protected]>
> *Subject:* Re: [FRIAM] Hallucinations
>
>
>
> Marcus, You're right, and I was wrong. I was much too insistent that LLMs
> don't understand the text they manipulate.
>
>
>
> A couple of weeks ago, I asked ChatGPT to embed (encode) a sentence and
> then decode it back to natural language. It said it didn't have access to
> the tools to do exactly that, but it would show me what the result would
> look like.
>
>
>
> The input sentence was: “I ate an apple because I was hungry. The apple
> was rotten. I got sick. My friend ate a banana. The banana was not rotten.
> My friend didn’t get sick.”
>
>
>
> ChatGPT simulated embedding/encoding the sentence as a vector. It then
> produced what it claimed was a reasonable natural language approximation of
> that vector. The result was: "A person and their friend ate fruit. One of
> the fruits was rotten, which caused sickness, while the other was fresh and
> did not cause illness."
>
>
>
> If ChatGPT can be believed, this is quite impressive. It implies that the
> embedding/encoding of natural language text includes something like the
> essential semantics of the original text. I had forgotten all about this
> when I wrote my post about hallucinations. I apologize.
>
>
>
> What I would like to do now -- and perhaps someone can help figure out if
> any tools are available to do this -- is to explore more carefully the
> sorts of information embeddings/encodings contain. For example, what would
> one get if one encoded and then decoded Chomsky's famous sentence:
> "Colorless green ideas sleep furiously." What would one get if one encoded
> -> decoded a contradiction: "All men are mortal. Socrates is a man;
> Socrates is immortal." What about: "The integer 3 is larger than the
> integer 9." Or "The American Revolutionary War occurred during the 19th
> century. George Washington led the American troops in that war. George
> Washington's tenure as the inaugural president of the United States began
> on April 30, 1789." Etc.
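>
> One partial way to probe this, short of true inversion back to text (which
> needs something along the lines of the vec2text work), is to embed the test
> sentences with an open sentence encoder and compare them. A minimal sketch,
> assuming the sentence-transformers package; the model name is just a common
> default:
>
>     # Sketch: do the rotten-apple sentences and ChatGPT's claimed decoding
>     # actually land near each other in embedding space?
>     from sentence_transformers import SentenceTransformer, util
>
>     model = SentenceTransformer("all-MiniLM-L6-v2")
>     original = ("I ate an apple because I was hungry. The apple was rotten. "
>                 "I got sick. My friend ate a banana. The banana was not "
>                 "rotten. My friend didn't get sick.")
>     paraphrase = ("A person and their friend ate fruit. One of the fruits "
>                   "was rotten, which caused sickness, while the other was "
>                   "fresh and did not cause illness.")
>     unrelated = "Colorless green ideas sleep furiously."
>
>     embeddings = model.encode([original, paraphrase, unrelated])
>     print(util.cos_sim(embeddings[0], embeddings[1]))  # expect fairly high
>     print(util.cos_sim(embeddings[0], embeddings[2]))  # expect much lower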
>
>
>
> -- Russ Abbott <https://russabbott.substack.com/>  (Click for my Substack)
>
> Professor Emeritus, Computer Science
> California State University, Los Angeles
>
>
>
> On Thu, Sep 11, 2025 at 6:59 PM Marcus Daniels <[email protected]>
> wrote:
>
> It often works with the frontier models to take a computational science
> or theory paper and have them implement the ideas it expresses in some
> computer language.   One can also often invert that program back into
> natural language (and/or with LaTeX equations).  Further, one can translate
> between very different formal languages (imperative vs. functional), which
> would be hard work for most people.
>
> These summaries and transformations work so well that tools like GitHub
> Copilot will periodically perform a conversation summary, and simply drop
> the whole conversation and start over with crystallized context (due to
> context window limitations).   When it picks up after that, one will often
> see a few syntax or API misunderstandings before it regroups to where it
> was.
>
>
>
> What this pivoting ease implies to me is that LLMs have a deep semantic
> representation of the conversation (and knowledge and these skills).   It
> certainly is not just a matter of mating token sequences with some deft
> smoothing.
>
>
>
> Another example that has come up for me recently is using LLMs to predict
> simulation or solver outputs.  When faced with learning large arrays of
> numbers, what it does is more like capturing a picture than a sequence of
> digits.  It doesn’t know, without some help, why number boundaries,
> signs, and decimal points are important.   Only through hard-won experience
> does it learn that the most and least significant digits should be treated
> differently.   Syntax is a hint one can offer through weak scaffolding
> penalties (outside of the training material).  It learns the semantics
> first.   Strong syntax penalties can get in the way of learning semantics
> by creating problematic energy barriers.
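>
> A small illustration of the digit point, assuming OpenAI's tiktoken package
> (other tokenizers chunk numbers differently):
>
>     # Sketch: how a byte-pair tokenizer happens to chunk a number. The
>     # pieces need not line up with digit positions, signs, or the decimal
>     # point.
>     import tiktoken
>
>     enc = tiktoken.get_encoding("cl100k_base")
>     for text in ["-1234567.89", "3 is larger than 9"]:
>         pieces = [enc.decode([t]) for t in enc.encode(text)]
>         print(text, "->", pieces)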
>
>
>
> While LLMs are huge, the Chinchilla optimality criterion (20 tokens per
> parameter) forces regularization.  There’s some flood fill, but I don’t
> think it can hold up for idiosyncratic lexical patterns.
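>
> For scale, that ratio implies roughly 1.4 trillion training tokens for a
> 70-billion-parameter model; a back-of-envelope sketch:
>
>     # Rough Chinchilla-style estimate: compute-optimal training tokens
>     # ~ 20 x parameter count.
>     def chinchilla_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
>         return tokens_per_param * n_params
>
>     print(f"{chinchilla_tokens(70e9):.1e}")  # ~1.4e+12 tokens for 70B params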
>
>
>
> *From: *Friam <[email protected]> on behalf of Santafe <
> [email protected]>
> *Date: *Thursday, September 11, 2025 at 5:12 PM
> *To: *[email protected] <[email protected]>, The Friday Morning
> Applied Complexity Coffee Group <[email protected]>
> *Subject: *Re: [FRIAM] Hallucinations
>
> In your post, Russ, you say:
>
>
>
> “They are trained to produce fluent language, not to produce valid
> statements.“
>
>
>
> Is that actually, operationally, what they are trained to do?  I speak
> from a position of ignorance here, but my impression is that they are
> trained to effectively stitch together fragments of varying lengths,
> according to rules for what stitchings are “compatible”.
>
>
>
> My thinking here is metaphorical, to homologous recombination in DNA.
> Some regions that don’t start out contiguous can be concatenated by DNA
> repair machinery, because under the physics to which it responds, they have
> plausible enough overlap that it considers them “compatible” or eligible to
> be identified at the join region, their “mis-matches” edited out.  Other
> pairs are so dissimilar that, under its operating physics, the repair
> machinery will effectively never join them.
>
>
>
> My metaphor isn’t great, in the sense that if what LLMs (for human speech)
> are doing is “next-word prediction”, that says that the notion of “joining”
> is reduced formally to appending next-words onto strings.  Though, to the
> extent that certain substrings of next-words are extremely frequently
> attested across the corpus of all the training expressions, one would
> expect to see extended sequences essentially reproduced as fragments with
> large probability.
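>
> For concreteness, the “next-word prediction” objective, as I understand it,
> amounts to the following (a sketch with stand-in tensors in place of a real
> model and corpus, assuming PyTorch):
>
>     # Sketch of the next-token objective: the model's output at position t
>     # is scored against the token that actually appears at position t+1.
>     import torch
>     import torch.nn.functional as F
>
>     vocab_size, seq_len = 50_000, 8
>     logits = torch.randn(1, seq_len, vocab_size)         # stand-in model output
>     tokens = torch.randint(0, vocab_size, (1, seq_len))  # stand-in training text
>
>     loss = F.cross_entropy(
>         logits[:, :-1].reshape(-1, vocab_size),  # predictions at positions 0..T-2
>         tokens[:, 1:].reshape(-1),               # targets: the next token each time
>     )
>     print(loss.item())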
>
>
>
> If my previous two characterizations aren’t fundamentally wrong, it would
> follow that fluent speech-generation becomes possible because the
> compatible-joining relations are sufficiently strong in human languages
> that the attention structures or other feed-forward aspects of the
> architecture have no trouble capturing them in parameters, even though
> human linguists trying to write them as re-write rules from which a
> computer could generate native-like speech failed for decades to get
> anywhere close to that.  My interpretation here would be consistent with
> what I believed was the main watershed change in the LLMs: that the
> parametric models would, ultimately, have terribly few parameters, whereas
> the LLMs can flood-fill a corpus with parameters, and then try to drip out
> the parts that don’t “stick to” some pattern in the data, and are regarded
> as the excess entropy from the sampling algorithm that the training is
> supposed to recognize and remove.  It is easy to imagine that fluent speech
> has far more regularities than rule-book linguists captured parametrically,
> but still few enough that LLMs can have no trouble attaching to almost-all
> of them, with parameters to spare.  Hence fluent speech could be
> epiphenomenal on what they are (operationally, mechanistically) being
> trained to do, but a natural summary statistic for the effectiveness of
> that training, and of course the one that drives market engagement.
>
>
>
> But if the above is the case, then the question of when they get “the
> syntax” right and “the semantics” wrong, would seem to turn on how much
> context from the training set is needed to identify semantically as well as
> syntactically appropriate “allowed joins” of fragments.  When short
> fragments contain enough of their own context to constrain most of the
> semantics, the stitching training algorithm has no reason to perform any
> worse at revealing the semantic signal in the training set than the
> syntactic one.  But if probability needs to be withheld for a long time in
> the prediction model, driving it to prioritize a much smaller number of
> longer or more remote assembled inputs from the training data, it could
> still do fine on syntax but fail to “find” and “render” the semantic signal
> in the training data, even if that signal is present in principle.
>
>
>
> I would not feel a need to use terms like “understanding” anywhere in the
> above, to make predictions of what kinds of successes or failures an LLM
> might deliver from the user’s perspective.  It seems to me like something
> that all lives in the domain of hardness-of-search combinatorics in
> data-spaces with a lot of difficult structure.
>
>
>
> Eric
>
>
>
>
>
> On Sep 10, 2025, at 7:02, Russ Abbott <[email protected]> wrote:
>
>
>
>
>
> OpenAI just published a paper on hallucinations
> <https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf>
> as well as a post summarizing the paper
> <https://openai.com/index/why-language-models-hallucinate/>.
> The two of them seem wrong-headed in such a simple and obvious way that I'm
> surprised the issue they discuss is still alive.
>
>
>
> The paper and post point out that LLMs are trained to generate fluent
> language--which they do extraordinarily well. The paper and post also point
> out that LLMs are not trained to distinguish valid from invalid statements.
> Given those facts about LLMs, it's not clear why one should expect LLMs to
> be able to distinguish true statements from false statements--and hence why
> one should expect to be able to prevent LLMs from hallucinating.
>
>
>
> In other words, LLMs are built to generate text; they are not built to
> understand the texts they generate and certainly not to be able to
> determine whether the texts they generate make factually correct or
> incorrect statements.
>
>
>
> Please see my post
> <https://russabbott.substack.com/p/why-language-models-hallucinate-according>
> elaborating on this.
>
>
>
> Why is this not obvious, and why is OpenAI still talking about it?
>
>
>
> -- Russ Abbott <https://russabbott.substack.com/>  (Click for my Substack)
>
> Professor Emeritus, Computer Science
> California State University, Los Angeles
>
>
>
>
>
.- .-.. .-.. / ..-. --- --- - . .-. ... / .- .-. . / .-- .-. --- -. --. / ... 
--- -- . / .- .-. . / ..- ... . ..-. ..- .-..
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe   /   Thursdays 9a-12p Zoom 
https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives:  5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
  1/2003 thru 6/2021  http://friam.383.s1.nabble.com/
