Look at:
http://useast.spamassassin.org/doc/Mail_SpamAssassin_Conf.html#learning%20options
bayes_ignore_header header_name
If you receive mail filtered by upstream mail systems, like a spam-filtering
ISP or mailing list, and that service adds new headers (as most of them do),
these headers may provide inappropriate cues to the Bayesian classifier,
allowing it to take a ``short cut''. To avoid this, list the headers using
this setting. Example:
bayes_ignore_header X-Upstream-Spamfilter
bayes_ignore_header X-Upstream-SomethingElse
An example:
http://www.stearns.org/doc/spamassassin-setup.current.html#autoreporting
--Larry
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:spamassassin-talk-[EMAIL PROTECTED]
> On Behalf Of Ross Vandegrift
> Sent: Monday, January 19, 2004 2:53 PM
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] Bayes mis-learning problem
>
> Hey everyone,
>
> We're currently coping with a false-positive crisis that's
> sweeping our email with 2.60, mostly due to scores of the Bayes filter.
> We run SA site-wide on an incoming MX host, so individual users do not
> have access to train the Bayes database. Moreover, our primary client
> program is Pegasus Mail for DOS, which provides no real way to get raw
> messages out unmodified (it hoses CR/LF, forces line wraps, and cats
> MIME parts together).
>
> So I'm going through some of our Bayes tokens trying to decide
> if I should dump the current database and start over. I've noticed
> really bad things like this:
>
> 0.892 381 112 1069183901 HTo:[EMAIL PROTECTED]
> 0.905 75 19 1069183901 HTo:[EMAIL PROTECTED]
> 0.997 17 0 1069183901 HTo:[EMAIL PROTECTED]
>
> This looks really horrible! Just by virtue of my boss's email having a
> "To: [EMAIL PROTECTED]", it'll almost certainly be tagged as spam. The
> database is trained with nham=13685 and nspam=5652. Autolearning is
> enabled and has default thresholds.
>
> This is alarming at first. But when I think about it and realize
> that most of us get more spam than ham, Bayes is right. Unfortunately,
> that's really, really the wrong thing to do. Is there a way to exempt
> some headers from processing?
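
For anyone wondering where the token probabilities quoted above come from, here is a rough sketch in Python. It assumes the dump columns are probability, spam hits, ham hits, last-access time, and token, and it uses a plain unsmoothed ratio of hit rates; the function name is made up for illustration, and this is not taken from the SpamAssassin source.

```python
def raw_token_spam_probability(spam_hits, ham_hits, nspam, nham):
    """Unsmoothed estimate of P(spam | token) from per-token hit counts.

    Assumption: each count is normalised by the size of its own corpus,
    so a larger ham corpus does not automatically drag the score down.
    """
    spam_rate = spam_hits / nspam  # share of learned spam containing the token
    ham_rate = ham_hits / nham     # share of learned ham containing the token
    return spam_rate / (spam_rate + ham_rate)


# Corpus sizes quoted in the message above.
NSPAM, NHAM = 5652, 13685

# (reported probability, spam hits, ham hits) for the three HTo: tokens.
dump_rows = [(0.892, 381, 112), (0.905, 75, 19), (0.997, 17, 0)]

for reported, spam_hits, ham_hits in dump_rows:
    estimate = raw_token_spam_probability(spam_hits, ham_hits, NSPAM, NHAM)
    print(f"reported={reported:.3f} estimate={estimate:.3f}")
```

The first two rows come out at about 0.892 and 0.905, matching the dump. The third comes out at exactly 1.0 rather than 0.997, which suggests the real classifier applies some smoothing for low-count tokens that this sketch leaves out.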
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk