VaultGemma: A Differentially Private Gemma Model
https://services.google.com/fh/files/blogs/vaultgemma_tech_report.pdf
While this is great for privacy, I think the fuzziness it provides applies to
all facts, right? So if someone's out there defaming π, some of that defamation
may well bleed over into what we do or do not know about e. I guess it depends
on the number of doclets defaming π and how the defamation is constructed. So
when π and e had their viral courtroom battle over the abuse in their
relationship, if the majority of doclets assert that e is an attention-seeking
gold-digger, that same criticism might apply to π or worse, all the irrational
numbers? Regular Joes like √2 will suffer the most from the celebrity fuzz.
So does DP-SGD fuzzify all facts? What's the impact on fidelity/accuracy? Maybe
it'll make the output more trustworthy because it'll subtly alter statements to
make them more general, less specific, less concrete? But then if there's a
domain where there's one valiant soul shouting truth in a wilderness of
bullsh¡t, it'll be less likely to make it into the model?
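For what it's worth, here is a minimal sketch of what one DP-SGD step does mechanically, assuming the textbook recipe (clip each example's gradient, sum, add Gaussian noise, average). It is not VaultGemma's training code; the function name and parameters (dp_sgd_step, clip_norm, noise_multiplier) and the toy gradients are made up for illustration.

```python
import numpy as np

def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update (hypothetical helper, not a library API)."""
    # L2 norm of each example's gradient, one row per example.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale down any gradient whose norm exceeds clip_norm; leave the rest alone.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise calibrated to the clipping bound, added once to the sum.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)
    return weights - lr * noisy_mean

rng = np.random.default_rng(0)
w = np.zeros(2)
# 99 "majority" examples pulling gently on axis 0, one lone voice
# pulling very hard on axis 1 (toy numbers, purely illustrative).
grads = np.array([[0.1, 0.0]] * 99 + [[0.0, 50.0]])

no_noise = dp_sgd_step(w, grads, lr=1.0, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
with_noise = dp_sgd_step(w, grads, lr=1.0, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
print("clipping only:", no_noise)
print("clipping + noise:", with_noise)
```

In this toy, clipping caps the lone voice's pull at clip_norm divided by the batch size, no matter how loudly it shouts, and the noise lands on every coordinate alike. Whether that shows up as "fuzzified facts" in a trained model is an empirical question the sketch doesn't answer.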
On 9/9/25 5:19 PM, glen wrote:
It's unfortunate jargon [⛧]. So it's nothing like whether an LLM is red (unless you adopt a jargonal definition of
"red"). And your example is a great one for understanding how language fluency *is* at least somewhat
correlated with fidelity. The statistical probability of the phrase "LLMs hallucinate" is >> 0,
whereas the prob for the phrase "LLMs are red" is vanishingly small. It would be the same for black swans
and Lewis Carroll writings *if* they weren't canonical teaching devices. It can't be that sophisticated if children
think it's funny.
But imagine all the woo out there where words like "entropy" or "entanglement"
are used falsely. IDK for sure, but my guess is the false sentences outnumber the true ones by a
lot. So the LLM has a high probability of forming false sentences.
Of course, in that sense, if a physicist finds themselves talking to an expert in the "Law of Attraction"
(e.g. the movie "The Secret") and makes scientifically true statements about entanglement, the guru may well
judge them as false. So there's "true in context" (validity) and "ontologically true" (soundness).
A sentence can be true in context but false in the world and vice versa, depending on who's in control of the
reinforcement.
[⛧] We could discuss the strength of the analogy between human hallucination and LLM
"hallucination", especially in the context of predictive coding. But we don't
need to. Just consider it jargon and move on.
On 9/9/25 4:37 PM, Russ Abbott wrote:
Marcus, Glen,
Your responses are much too sophisticated for me. Now that I'm retired (and, in
truth, probably before as well), I tend to think in much simpler terms.
My basic point was to express my surprise at realizing that it makes as much
sense to ask whether an LLM hallucinates as it does to ask whether an LLM is
red. It's a category mismatch--at least I now think so.
-- Russ <https://russabbott.substack.com/>
On Tue, Sep 9, 2025 at 3:45 PM glen wrote:
The question of whether fluency is (well) correlated with accuracy seems to assume
something like mentalizing, the idea that there's a correspondence between minds mediated
by a correspondence between the structure of the world and the structure of our
minds/language. We've talked about the "interface theory of perception", where
Hoffman (I think?) argues we're more likely to learn *false* things than we are true
things. And we've argued about realism, pragmatism, predictive coding, and everything
else under the sun on this list.
So it doesn't surprise me if most people assume there will be more true
statements in the corpus than false statements, at least in domains where there
exists a common sense, where the laity *can* perceive the truth. In things like
quantum mechanics or whatever, all bets are off because there are probably
more false sentences than true ones.
If there are more true than false sentences in the corpus, then
reinforcement methods like Marcus' only bear a small burden (in lay domains).
The implicit fidelity does the lion's share. But in those domains where
counter-intuitive facts dominate, the reinforcement does the most work.
On 9/9/25 3:12 PM, Marcus Daniels wrote:
> Three ways come to mind.. I would guess that OpenAI, Google, Anthropic,
and xAI are far more sophisticated..
>
> 1. Add a softmax penalty to the loss that tracks non-factual statements
or grammatical constraints. Cross entropy may not understand that some parts of
content are more important than others.
> 2. Change how the beam search works during inference to skip sequences
that fail certain predicates – like a lookahead that says “Oh, I can’t say that..”
> 3. Grade the output, either using human or non-LLM supervision, and
re-train.
>
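A toy sketch of option 2 above, assuming a beam search over a tiny hand-made bigram table. Everything here is hypothetical (the TOY table, toy_model, no_red, and the function name); it only illustrates dropping candidate sequences that fail a predicate before they are ranked, and it says nothing about what any lab actually does.

```python
import math

def constrained_beam_search(next_token_logprobs, is_allowed, start,
                            beam_width, max_len, eos="<eos>"):
    """Beam search that discards any extension failing is_allowed (illustrative)."""
    beams = [(0.0, [start])]   # (cumulative log-prob, token sequence)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for tok, logp in next_token_logprobs(seq).items():
                new_seq = seq + [tok]
                # The "Oh, I can't say that.." lookahead: skip before ranking.
                if not is_allowed(new_seq):
                    continue
                candidates.append((score + logp, new_seq))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            (finished if seq[-1] == eos else beams).append((score, seq))
        if not beams:
            break
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1] if pool else None

# Made-up bigram "model": next-token log-probs keyed by the last token.
TOY = {
    "<s>": {"LLMs": 0.0},
    "LLMs": {"are": math.log(0.6), "hallucinate": math.log(0.4)},
    "are": {"red": math.log(0.7), "fluent": math.log(0.3)},
    "red": {"<eos>": 0.0},
    "fluent": {"<eos>": 0.0},
    "hallucinate": {"<eos>": 0.0},
}

def toy_model(seq):
    return TOY[seq[-1]]

def no_red(seq):
    # Hypothetical predicate standing in for a real fact or grammar check.
    return "red" not in seq

unconstrained = constrained_beam_search(toy_model, lambda s: True, "<s>", 2, 4)
constrained = constrained_beam_search(toy_model, no_red, "<s>", 2, 4)
print(" ".join(unconstrained))
print(" ".join(constrained))
```

In this toy, the unconstrained search settles on "LLMs are red" because the table makes it the likeliest string, and the predicate pushes the search to "LLMs hallucinate" instead. The hard part in practice is the predicate itself, which here is a one-line string check.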
> *From:* Friam *On Behalf Of* Russ Abbott
> *Sent:* Tuesday, September 9, 2025 3:03 PM
> *To:* The Friday Morning Applied Complexity Coffee Group
> *Subject:* [FRIAM] Hallucinations
>
> OpenAI just published a paper on hallucinations
<https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf>
as well as a post summarizing the paper
<https://openai.com/index/why-language-models-hallucinate/>. The two of them seem wrong-headed
in such a simple and obvious way that I'm surprised the issue they discuss is still alive.
>
> The paper and post point out that LLMs are trained to generate fluent
language--which they do extraordinarily well. The paper and post also point out
that LLMs are not trained to distinguish valid from invalid statements. Given
those facts about LLMs, it's not clear why one should expect LLMs to be able to
distinguish true statements from false statements--and hence why one should expect
to be able to prevent LLMs from hallucinating.
>
> In other words, LLMs are built to generate text; they are not built to
understand the texts they generate and certainly not to be able to determine
whether the texts they generate make factually correct or incorrect statements.
>
> Please see my post
<https://russabbott.substack.com/p/why-language-models-hallucinate-according>
elaborating on this.
>
> Why is this not obvious, and why is OpenAI still talking about it?
>
--
¡sıɹƎ ןıɐH ⊥ ɐןןǝdoɹ ǝ uǝןƃ
Ἐν τῷ ἄλλοις αἴλουροι τὰς ἐχθροὺς ὀξύνονται, ἐγὼ τοὺς φίλους μου ὀξύνομαι ἵνα
σῶμαι αὐτούς.
.- .-.. .-.. / ..-. --- --- - . .-. ... / .- .-. . / .-- .-. --- -. --. / ...
--- -- . / .- .-. . / ..- ... . ..-. ..- .-..
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom
https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
1/2003 thru 6/2021 http://friam.383.s1.nabble.com/