I have a question: why is it that when a chatbot makes stuff up, it's called hallucinating? If I say something wrong, on the other hand, I'm either Douglas Adams kinds of wrong, or (purely to avoid rule-lawyering: give context) I'm just making things up (the fine and noble art of the white lie). So why is it that chatbots get called hallucinating, while a person is either making things up or has just confused someone? Why do we say "the chatbot hallucinated" rather than "it started pulling ********** out of its digital and highly trained electronic backside and sitting part" (aka BSing and telling fat lies)? For instance, years ago ChatGPT would say 2+2 is 4, internet trolls went "oh really?", and soon it was 2+2=5 according to ChatGPT (i.e., making stuff up!). Why is that called hallucinating rather than BSing and making sh**** up?
On Thu, Sep 11, 2025 at 11:30 AM Marcus Daniels <[email protected]> wrote:

> It seems to me that we are full circle back to a Turing Test. If the LLM
> encodes and demonstrates skill (they certainly do), and these skills can
> progress a solution of some real-world problem, then it is just empty
> chauvinism to say they don’t understand a topic.
>
> *From:* Friam <[email protected]> on behalf of Steve Smith <[email protected]>
> *Date:* Thursday, September 11, 2025 at 10:12 AM
> *To:* [email protected] <[email protected]>
> *Subject:* Re: [FRIAM] Hallucinations
>
> I find LLM engagement to be somewhere between that with a highly plausible
> gossip and a well researched survey paper in a subject I am interested in?
>
> Where a given conversation lands in this interval almost exclusively seems
> to rely on my care in crafting my prompts.
>
> I don't expect 'truth' out of either gossip or a survey paper... just
> 'perspective'?
>
> On 9/11/25 10:55 am, glen wrote:
> > OK. You're right in principle. But we might want to think of this in the
> > context of all algorithms. For example, let's say you run an FFT on a
> > signal and it outputs some frequencies. Does the signal *actually*
> > contain or express those frequencies? Or is it just an inference that we
> > find reliable?
> >
> > The same is true of the LLM inferences. Whether one ascribes truth or
> > falsity to those inferences is only relevant to metaphysicians and
> > philosophers. What matters is how reliable the inferences are when we do
> > some task. Yelling at the kids on your lawn doesn't achieve anything.
> > It's better to go out there and talk to them. 8^D
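
An aside on glen's FFT point above: the transform reports frequency components whether or not anyone thinks the signal "really" contains them; we just find the inference reliable. A minimal sketch of that, assuming NumPy is available; the sample rate, the 50 Hz tone, and the noise level are made-up numbers for illustration.

    # The FFT "infers" frequencies from any input; it has no notion of
    # whether the signal was "meant" to contain them.
    import numpy as np

    rng = np.random.default_rng(0)
    fs = 1000                        # sample rate in Hz (arbitrary choice)
    t = np.arange(0, 1, 1 / fs)      # one second of samples
    # A 50 Hz sine plus noise; the noise has no "intended" frequencies at all.
    signal = np.sin(2 * np.pi * 50 * t) + 0.5 * rng.standard_normal(t.size)

    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(t.size, d=1 / fs)
    peak = freqs[np.argmax(np.abs(spectrum))]
    print(f"dominant inferred frequency: {peak:.1f} Hz")
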
> > On 9/10/25 8:38 PM, Russ Abbott wrote:
> >> Glen, I wish people would stop talking about whether LLM-generated
> >> sentences are true or false. The mechanisms LLMs employ to generate a
> >> sentence have nothing to do with whether the sentence turns out to be
> >> true or false. A sentence may have a higher probability of being true
> >> if the training data consisted entirely of true sentences. (Even that's
> >> not guaranteed; similar true sentences might have their components
> >> interchanged when used during generation.) But the point is: the
> >> transformer process has no connection to the validity of its output. If
> >> an LLM reliably generates true sentences, no credit is due to the
> >> transformer. If the training data consists entirely of true/false
> >> sentences, the generated output is more likely to be true/false. Output
> >> validity plays no role in how an LLM generates its output.
> >>
> >> Marcus, if an LLM is trained entirely on false statements, its
> >> "confidence" in its output will presumably be the same as it would be
> >> if it were trained entirely on true statements. Truthfulness is not a
> >> consideration in the generation process. Speaking of a need to reduce
> >> ambiguity suggests that the LLM understands the input and realizes it
> >> might have multiple meanings. But of course, LLMs don't understand
> >> anything, they don't realize anything, and they can't take meaning into
> >> consideration when generating output.
> >>
> >> On Tue, Sep 9, 2025 at 5:20 PM glen <[email protected]> wrote:
> >>
> >> It's unfortunate jargon [⛧]. So it's nothing like whether an LLM is red
> >> (unless you adopt a jargonal definition of "red"). And your example is
> >> a great one for understanding how language fluency *is* at least
> >> somewhat correlated with fidelity. The statistical probability of the
> >> phrase "LLMs hallucinate" is >> 0, whereas the prob for the phrase
> >> "LLMs are red" is vanishingly small. It would be the same for black
> >> swans and Lewis Carroll writings *if* they weren't canonical teaching
> >> devices. It can't be that sophisticated if children think it's funny.
> >>
> >> But imagine all the woo out there where words like "entropy" or
> >> "entanglement" are used falsely. IDK for sure, but my guess is the
> >> false sentences outnumber the true ones by a lot. So the LLM has a high
> >> probability of forming false sentences.
> >>
> >> Of course, in that sense, if a physicist finds themselves talking to an
> >> expert in the "Law of Attraction" (e.g. the movie "The Secret") and
> >> makes scientifically true statements about entanglement, the guru may
> >> well judge them as false. So there's "true in context" (validity) and
> >> "ontologically true" (soundness). A sentence can be true in context but
> >> false in the world and vice versa, depending on who's in control of the
> >> reinforcement.
> >>
> >> [⛧] We could discuss the strength of the analogy between human
> >> hallucination and LLM "hallucination", especially in the context of
> >> prediction coding. But we don't need to. Just consider it jargon and
> >> move on.
> >>
> >> On 9/9/25 4:37 PM, Russ Abbott wrote:
> >> > Marcus, Glen,
> >> >
> >> > Your responses are much too sophisticated for me. Now that I'm
> >> > retired (and, in truth, probably before as well), I tend to think in
> >> > much simpler terms.
> >> >
> >> > My basic point was to express my surprise at realizing that it makes
> >> > as much sense to ask whether an LLM hallucinates as it does to ask
> >> > whether an LLM is red. It's a category mismatch--at least I now
> >> > think so.
> >> >
> >> > -- Russ <https://russabbott.substack.com/>
> >> >
> >> > On Tue, Sep 9, 2025 at 3:45 PM glen <[email protected]> wrote:
> >> >
> >> > The question of whether fluency is (well) correlated to accuracy
> >> > seems to assume something like mentalizing, the idea that there's a
> >> > correspondence between minds mediated by a correspondence between
> >> > the structure of the world and the structure of our minds/language.
> >> > We've talked about the "interface theory of perception", where
> >> > Hoffman (I think?) argues we're more likely to learn *false* things
> >> > than we are true things. And we've argued about realism, pragmatism,
> >> > prediction coding, and everything else under the sun on this list.
> >> >
> >> > So it doesn't surprise me if most people assume there will be more
> >> > true statements in the corpus than false statements, at least in
> >> > domains where there exists a common sense, where the laity *can*
> >> > perceive the truth. In things like quantum mechanics or whatever,
> >> > then all bets are off because there are probably more false
> >> > sentences than true ones.
> >> >
> >> > If there are more true than false sentences in the corpus, then
> >> > reinforcement methods like Marcus' only bear a small burden (in lay
> >> > domains). The implicit fidelity does the lion's share.
> >> > But in those domains where counter-intuitive facts dominate, the
> >> > reinforcement does the most work.
> >> >
> >> > On 9/9/25 3:12 PM, Marcus Daniels wrote:
> >> > > Three ways come to mind.. I would guess that OpenAI, Google,
> >> > > Anthropic, and xAI are far more sophisticated..
> >> > >
> >> > > 1. Add a softmax penalty to the loss that tracks non-factual
> >> > > statements or grammatical constraints. Cross entropy may not
> >> > > understand that some parts of content are more important than
> >> > > others.
> >> > > 2. Change how the beam search works during inference to skip
> >> > > sequences that fail certain predicates – like a lookahead that
> >> > > says “Oh, I can’t say that..”
> >> > > 3. Grade the output, either using human or non-LLM supervision,
> >> > > and re-train.
> >> > >
> >> > > *From:* Friam <[email protected]> *On Behalf Of* Russ Abbott
> >> > > *Sent:* Tuesday, September 9, 2025 3:03 PM
> >> > > *To:* The Friday Morning Applied Complexity Coffee Group <[email protected]>
> >> > > *Subject:* [FRIAM] Hallucinations
> >> > >
> >> > > OpenAI just published a paper on hallucinations
> >> > > <https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf>
> >> > > as well as a post summarizing the paper
> >> > > <https://openai.com/index/why-language-models-hallucinate/>. The
> >> > > two of them seem wrong-headed in such a simple and obvious way
> >> > > that I'm surprised the issue they discuss is still alive.
> >> > >
> >> > > The paper and post point out that LLMs are trained to generate
> >> > > fluent language--which they do extraordinarily well. The paper and
> >> > > post also point out that LLMs are not trained to distinguish valid
> >> > > from invalid statements. Given those facts about LLMs, it's not
> >> > > clear why one should expect LLMs to be able to distinguish true
> >> > > statements from false statements--and hence why one should expect
> >> > > to be able to prevent LLMs from hallucinating.
> >> > >
> >> > > In other words, LLMs are built to generate text; they are not
> >> > > built to understand the texts they generate and certainly not to
> >> > > be able to determine whether the texts they generate make
> >> > > factually correct or incorrect statements.
> >> > >
> >> > > Please see my post
> >> > > <https://russabbott.substack.com/p/why-language-models-hallucinate-according>
> >> > > elaborating on this.
> >> > >
> >> > > Why is this not obvious, and why is OpenAI still talking about it?
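
A footnote on Marcus's option 2 above: the lookahead that says "Oh, I can't say that.." can be pictured as an ordinary beam search plus a predicate that discards candidate continuations. Below is a toy sketch, not how any of the named vendors actually does it; the bigram "model", the banned statement 2+2 = 5, and the beam width are all invented for illustration and stand in for whatever a real decoder would do with its logits.

    # Toy beam search with a lookahead predicate that skips sequences
    # failing a constraint (here: never emit the non-fact "2+2 = 5").
    import heapq
    import math

    probs = {  # invented next-token distributions
        "<s>": {"2+2": 1.0},
        "2+2": {"=": 1.0},
        "=":   {"4": 0.6, "5": 0.4},
        "4":   {"</s>": 1.0},
        "5":   {"</s>": 1.0},
    }

    def allowed(tokens):
        # Predicate: reject any hypothesis that contains the token "5".
        return "5" not in tokens

    def beam_search(width=2, max_len=6):
        beams = [(0.0, ["<s>"])]            # (negative log prob, tokens)
        for _ in range(max_len):
            candidates = []
            for score, toks in beams:
                if toks[-1] == "</s>":      # finished hypothesis, keep as-is
                    candidates.append((score, toks))
                    continue
                for nxt, p in probs[toks[-1]].items():
                    new = toks + [nxt]
                    if allowed(new):        # the lookahead filter
                        candidates.append((score - math.log(p), new))
            beams = heapq.nsmallest(width, candidates)
        return beams[0][1]

    print(" ".join(beam_search()))          # -> <s> 2+2 = 4 </s>

Of course the filter only suppresses what the predicate knows how to name; nothing in the generator itself tracks truth, which is roughly Russ's point.
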
.- .-.. .-.. / ..-. --- --- - . .-. ... / .- .-. . / .-- .-. --- -. --. / ... --- -- . / .- .-. . / ..- ... . ..-. ..- .-..
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
  1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
