dalchri wrote:
> Hello,
>
> I completed configuring all my network tests and the bayes database has
> passed 200 ham messages and is being used.  The bayes database has been
> accumulating knowledge so far through autolearn.
>
> I was concerned about how one-sided the autolearning has been, since over 90%
> of our email is spam.  To avoid FPs, I put our customer database of email
> addresses into a manual whitelist.
>
> Although these addresses are making it through fine, only a few are being
> reported as autolearn=ham in the X-Spam-Status header, most are being
> reported as autolearn=no.
>
> Is there any way to force these messages through the autolearn process?
>   
No. In fact, the autolearner currently ignores manual whitelists on
purpose when deciding whether it should autolearn.

This is largely done to prevent whitelisting mistakes from creating a
"bayes hangover", where the autolearning causes a lot of mistakenly
whitelisted spam to get learned as nonspam.

This risk is quite realistic if you're using whitelist_from, particularly
if you whitelist whole domains, and inevitable if you use "whitelist_from
[EMAIL PROTECTED]". This is because whitelist_from offers no protection at
all against forgery. Fundamentally, whitelist_from is a tool of last
resort, and exists only for a few rare situations where no other option
works. (Were it not for those situations, there are strong arguments
that would likely result in whitelist_from being removed from SA.)

OK, I suppose I lied a bit: you could modify the tflags for the
USER_IN_WHITELIST rule so it no longer has userconf or noautolearn. That
should cause the autolearner to start considering the score of the
whitelist, which will almost certainly result in most of the messages
being learned as nonspam. (However, if they score really high in the
BAYES_* rules, it will still refuse to autolearn something that strongly
contradicts the existing training.)
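
As a rough sketch of that change, something like the following in
local.cf should do it. It assumes the stock definition of the rule carries
the flags "userconf nice noautolearn" and that a later tflags line
replaces the earlier one; check the rule files shipped with your
version before relying on it.

```
# local.cf -- sketch only, not tested against any particular SA release.
# Assumption: the stock rule is defined roughly as
#   tflags USER_IN_WHITELIST userconf nice noautolearn
# Redefining tflags here keeps "nice" (negative-scoring rule) and
# drops both userconf and noautolearn.
tflags USER_IN_WHITELIST nice
```

Run "spamassassin --lint" afterwards to confirm the configuration still
parses cleanly.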

However, proceed with due caution, and only if you're using
whitelist_from_rcvd or whitelist_from_spf. Don't do this with
whitelist_from.
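
For illustration, forgery-resistant entries look roughly like this. The
address pattern and relay name are placeholders, not anything from your
setup, and whitelist_from_spf needs the SPF plugin loaded.

```
# local.cf -- illustrative entries; example.com and mail.example.com
# are placeholders.

# Whitelist only when the relay that handed the message to your
# trusted servers has reverse DNS matching the given domain:
whitelist_from_rcvd  *@example.com  mail.example.com

# Whitelist only when the message passes SPF for the sender's domain:
whitelist_from_spf   *@example.com
```

Note that, as far as I know, whitelist_from_spf entries are scored by a
separate rule (USER_IN_SPF_WHITELIST), so check which rule actually
fires in X-Spam-Status before changing any tflags.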



