Strange findings debugging bayes results

hg user Thu, 16 Feb 2023 01:19:09 -0800

I was investigating a batch of bitcoin spam: different subjects,
different senders (all from Gmail), different text, and a different PDF
attachment each time.


Unfortunately, at that time my Bayes db was polluted and they all got
BAYES_50 (0.8).

I have now retested the messages against a recreated Bayes db and got
some BAYES_999 hits. So I dug in to understand whether I had already
seen this spam...

But the debug result was unpleasant:
dbg: bayes: tokenized header: 92 tokens
dbg: bayes: token 'HX-Received:Jan' => 0.998028449502134
dbg: bayes: token 'HX-Google-DKIM-Signature:20210112' => 0.997244532803181
dbg: bayes: token 'H*r:sk:<START_OF_RECIPIENT_EMAIL_ADDRESS>' => 0.997244532803181
dbg: bayes: token 'H*r:a05' => 0.995425742574258
dbg: bayes: token 'HAuthentication-Results:sk:<MY_SA_HOSTNAME>.' => 0.986543689320388
dbg: bayes: token 'HX-Google-DKIM-Signature:reply-to' => 0.916110175863517
dbg: bayes: token 'H*r:2002' => 0.877842810325844
dbg: bayes: token 'HAuthentication-Results:2048-bit' => 0.858520043212023
dbg: bayes: token 'HAuthentication-Results:pass' => 0.855193895034317
dbg: bayes: score = 0.999997915091326


Every score is based on headers: very generic headers, and some
related to my setup.

Not a single token from the message body....
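
A quick way to check this on other messages is to split the debug
output mechanically. Below is a minimal Python sketch, not part of
SpamAssassin itself. The function names are hypothetical, and the
header test is only a heuristic assumed from the log above: tokens
that start with 'H' and contain a ':' (such as 'H*r:2002' or
'HX-Received:Jan') are counted as header tokens, everything else as
body tokens.

```python
import re

# Matches debug lines such as:
#   dbg: bayes: token 'H*r:2002' => 0.877842810325844
# \s* after "=>" also accepts a probability wrapped onto the next line.
TOKEN_LINE = re.compile(r"bayes: token '(?P<token>.*)'\s*=>\s*(?P<prob>\d+(?:\.\d+)?)")


def parse_bayes_tokens(debug_text):
    """Return (token, probability) pairs found in bayes debug output."""
    return [
        (m.group("token"), float(m.group("prob")))
        for m in TOKEN_LINE.finditer(debug_text)
    ]


def looks_like_header_token(token):
    # Heuristic (an assumption, see the text above): 'H' prefix followed
    # by a "name:value" shape. A body word containing ':' and starting
    # with 'H' would be misclassified.
    return token.startswith("H") and ":" in token[1:]


def split_header_body(pairs):
    """Split parsed pairs into (header_tokens, body_tokens)."""
    header = [p for p in pairs if looks_like_header_token(p[0])]
    body = [p for p in pairs if not looks_like_header_token(p[0])]
    return header, body


if __name__ == "__main__":
    sample = (
        "dbg: bayes: token 'HX-Received:Jan' => 0.998028449502134\n"
        "dbg: bayes: token 'H*r:2002' => 0.877842810325844\n"
        "dbg: bayes: token 'HAuthentication-Results:pass' => 0.855193895034317\n"
    )
    header, body = split_header_body(parse_bayes_tokens(sample))
    print(len(header), "header tokens,", len(body), "body tokens")
```

Fed the three sample lines taken from the log above, it reports three
header tokens and no body tokens, which matches what the full debug
output shows.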
