dalchri wrote:
> Hello,
>
> I completed configuring all my network tests and the bayes database has
> passed 200 ham messages and is being used. The bayes database has been
> accumulating knowledge so far through autolearn.
>
> I was concerned about how one sided the autolearning has been since over 90%
> of our email is spam. To avoid FP, I put our customer database of email
> addresses into a manual whitelist.
>
> Although these addresses are making it through fine, only a few are being
> reported as autolearn=ham in the X-Spam-Status header, most are being
> reported as autolearn=no.
>
> Is there any way to force these messages through the autolearn process?
>

No, in fact, the autolearner currently intentionally ignores manual whitelists when deciding if it should autolearn.
This is largely done to prevent whitelisting mistakes from creating a "bayes hangover", where autolearning causes a lot of mistakenly whitelisted spam to get learned as nonspam. This risk is quite realistic if you're using whitelist_from, particularly if you whitelist whole domains, and inevitable if you use "whitelist_from [EMAIL PROTECTED]". This is because whitelist_from offers no protection at all against forgery.

Fundamentally, whitelist_from is a tool of last resort, and only exists for a few rare situations where no other option exists. (Were it not for those situations, there are strong arguments that would likely result in whitelist_from being removed from SA.)

OK, I suppose I lied a bit: you could modify the tflags for the USER_IN_WHITELIST rule so it no longer has userconf or noautolearn. That should cause the autolearner to start considering the score of the whitelist rule, which will almost certainly result in most of the messages being learned as nonspam. (However, if they score really high in the BAYES_* rules, it will still refuse to autolearn something that strongly contradicts the existing training.)

However, proceed with due caution, and only if you're using whitelist_from_rcvd or whitelist_from_spf. Don't do this with whitelist_from.
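
For reference, here is a rough sketch of what the tflags override might look like in local.cf. It assumes the stock rules define USER_IN_WHITELIST with "userconf nice noautolearn", so check the rule files in your own install first. The address and relay domain are placeholders, not anything from your setup.

```
# local.cf: sketch only, verify against the tflags your installed rules use.
# Assumed stock definition:
#   tflags USER_IN_WHITELIST userconf nice noautolearn
# A later tflags line replaces the whole flag set for the rule, so keep
# "nice" (it is a negative-scoring rule) and leave out the other two.
tflags USER_IN_WHITELIST nice

# Use an entry that also checks the relay, not plain whitelist_from.
# Placeholder address and relay domain:
whitelist_from_rcvd customer@example.com example.com
```

Run "spamassassin --lint" afterwards to catch typos. Also, as far as I recall, whitelist_from_spf entries hit a separate rule (USER_IN_SPF_WHITELIST), so if you go that route, that is the rule whose tflags you would need to look at.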