On Thu, 19 Apr 2007, Craig Carriere wrote:
Does this really mean that auto-learn is "out of balance"? My first
guess is that this site probably relies only on SA to combat spam and
does little at the MTA level to reject UBE mail. They may even run a
catch-all account which would markedly increase his spam count if he is
not rejecting for non-existent users. At my small mail server, even with
conservative MTA restrictions in place, our spam hits outnumber ham by
probably 4-5 to 1. It is just the nature of the beast. I do agree that
he needs to manually train his bayes bases and probably keep feeding ham
into the bayes engine after it starts to fire.
As an aside, do you use any MTA restrictions and/or greylisting?
I'm using Postfix+ClamAV+SA on our two border filter servers. Roughly 95%
of all inbound messages are weeded out before they reach the internal
server our customers use.
I also use a couple of internal blacklists and greylisting. And I had set
the following values in local.cf:
bayes_auto_learn_threshold_nonspam 0.01
bayes_auto_learn_threshold_spam 18.0
Their normal defaults are 0.1 and 12.0 respectively. I set stricter
values for auto-learn because you have hardly any control over which
messages get learned in either direction. Some others on this list have
the auto-learn values set even higher.
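
To make the effect of those two thresholds concrete, here is a rough
Python sketch of the decision they control. It is a simplification, not
SpamAssassin's actual code: the function name auto_learn_decision is
hypothetical, the exact boundary handling (strictly below / at or above)
is an assumption, and the real auto-learner applies extra conditions
beyond the raw score that are left out here.

```python
# Simplified illustration of threshold-based auto-learning.
# Hypothetical helper, not part of SpamAssassin.

NONSPAM_THRESHOLD = 0.01  # bayes_auto_learn_threshold_nonspam from local.cf
SPAM_THRESHOLD = 18.0     # bayes_auto_learn_threshold_spam from local.cf


def auto_learn_decision(score,
                        nonspam_threshold=NONSPAM_THRESHOLD,
                        spam_threshold=SPAM_THRESHOLD):
    """Return 'ham', 'spam', or None when the message is not auto-learned."""
    if score < nonspam_threshold:   # assumed: strictly below learns as ham
        return "ham"
    if score >= spam_threshold:     # assumed: at or above learns as spam
        return "spam"
    return None                     # in between: left for manual training


if __name__ == "__main__":
    for s in (-1.2, 0.05, 15.0, 20.0):
        print(s, "defaults:", auto_learn_decision(s, 0.1, 12.0),
              "tightened:", auto_learn_decision(s))
```

With the tightened values, a message scoring 0.05 or 15.0 is no longer
learned automatically, whereas the defaults would have learned it as ham
and spam respectively.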
As for the numbers mentioned by the OP, 25,000 spam to 180 ham works out
to roughly 139 to 1. That is a lot more than your ~5 to 1. I would not
have expected auto-learn to be that far off.