> My problem is this: I'm using squirrelmail,

As your only email access?

> and to keep an eye on false negatives (I define those as real mails
> that get shuttled to spam, just to keep things clear) I have a 'spam'
> folder. As anyone that uses sqmail knows, it gets very slow when any
> folder contains more than a few hundred messages.

<g>  Try several thousand, as a number of customers have reported to

Actually, it's only spewed out error messages in a very few cases.

> But, since my
> filter is trained very well, I'd like to send autolearned spams to
> /mail/Trash (ultimately to /dev/null) so I don't have to deal with
> those.

Mmm.  Dangerous - I've seen FPs get autolearned as spam once or twice. 

What I do on my accounts is set up a "big-spam" folder, and rely on the
X-Spam-Level header to move mail there.  Anything scoring 15 or higher
gets 15 or more stars in X-Spam-Level, and I have this:

* ^X-Spam-Level:.\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*

before the check that files spam in my "main" spam folder.

With the well-tuned 2.64+SURBL systems I have, ~80% of the spam usually
ends up in the "big-spam" folder.
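
If the regex is hard to read, here is a rough Python sketch of the same
test. The function names are mine, purely for illustration; only the
15-star cutoff comes from the recipe above. It is not what procmail
actually runs, just the same check spelled out.

```python
from email import message_from_string

BIG_SPAM_STARS = 15  # cutoff taken from the procmail condition above


def count_spam_stars(raw_message: str) -> int:
    """Return the number of leading '*' in X-Spam-Level (0 if absent)."""
    msg = message_from_string(raw_message)
    level = (msg.get("X-Spam-Level") or "").strip()
    # Count only the unbroken run of stars at the start of the value.
    return len(level) - len(level.lstrip("*"))


def is_big_spam(raw_message: str) -> bool:
    """True when the message carries at least BIG_SPAM_STARS stars."""
    return count_spam_stars(raw_message) >= BIG_SPAM_STARS
```

A message that fails this test simply falls through to whatever check
files ordinary spam afterwards.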

> I figured just setting bayes_auto_learn_threshold_spam 6 would
> work great. It really does not do much of anything. I've decreased
> it to 3, and to 1, but it really doesnt make a difference. I found
> these relevant lines in a debug:

> debug: auto-learn? ham=0.1, spam=1, body-points=0, head-points=-2.82,
> learned-points=1.886
> debug: auto-learn? no: scored as spam but too few body points (0 < 3)

These two entries are the critical ones;  note the body-points and
head-points.  To be autolearned as spam, a message must hit tests worth
a total of 3 points or more on header tests, and a total of 3 points or
more on body tests.
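
To make that concrete, here is a simplified Python model of the
decision, matching the debug lines quoted above. It is a sketch of the
rule as I described it, not SpamAssassin's actual code; the function
name, the return strings, and the exact comparison operators are my
assumptions, and the score of 5.0 in the usage line is a made-up value
since the debug output doesn't show it.

```python
REQUIRED_HEAD_POINTS = 3.0  # minimum total from header tests
REQUIRED_BODY_POINTS = 3.0  # minimum total from body tests


def autolearn_decision(score: float, body_points: float, head_points: float,
                       ham_threshold: float, spam_threshold: float) -> str:
    """Simplified model of the autolearn check (assumed comparisons)."""
    if score < ham_threshold:
        return "ham"
    if score < spam_threshold:
        return "no: score between thresholds"
    # Scored as spam: both point totals must also clear the bar.
    if body_points < REQUIRED_BODY_POINTS:
        return "no: too few body points"
    if head_points < REQUIRED_HEAD_POINTS:
        return "no: too few head points"
    return "spam"


# Thresholds and point totals from the debug output; score is hypothetical.
print(autolearn_decision(5.0, 0.0, -2.82, ham_threshold=0.1, spam_threshold=1.0))
```

This is why lowering the spam threshold to 1 changes nothing here: the
message clears the threshold either way and is then rejected on body
points.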

I notice you're still using the default autolearn-as-ham setting (the
ham=0.1 in your debug output);  this is dangerous as very low-scoring
spam can get autolearned as ham.  I've dropped it to -0.01 on my systems
to prevent this.

> What, exactly, is going on here? The head points I can explain (this
> is a spam I saved that had already come to me) but the body points -
> I don't understand. It also wasn't clear to me until this debug that
> the autolearn had its own scoring system.

Not entirely;  to decide whether to autolearn a message, the scores are
calculated with one of the "no-Bayes" score sets.  Which one is used
depends on whether you've got network tests disabled or not.
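
A small Python sketch of that selection, assuming the usual four-column
score layout as I recall it from the configuration docs (set 0: no
Bayes, no network tests; set 1: no Bayes, network tests; sets 2 and 3:
the same with Bayes on). The rule names and scores below are invented
for illustration only.

```python
def autolearn_score_set(network_tests_enabled: bool) -> int:
    """Index of the no-Bayes score set: 1 with network tests, 0 without."""
    return 1 if network_tests_enabled else 0


def autolearn_points(hits, rule_scores, network_tests_enabled: bool) -> float:
    """Sum the hit rules' scores using the no-Bayes score set."""
    column = autolearn_score_set(network_tests_enabled)
    return sum(rule_scores[name][column] for name in hits)


# Hypothetical rules, each with four scores (one per score set).
RULE_SCORES = {
    "HYPOTHETICAL_BODY_RULE": (2.0, 1.5, 2.5, 1.0),
    "HYPOTHETICAL_HEAD_RULE": (1.0, 0.5, 1.5, 0.25),
}
```

So the total used for the autolearn decision can differ from the score
stamped on the message, which is calculated with Bayes taken into
account.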

Get your mouse off of there!  You don't know where that email has been!
