Kai Schaetzl wrote:
Arthur Kerpician wrote on Thu, 09 Apr 2009 09:41:22 +0300:

The docs mention that after 5000 spam and ham messages have been learned, SpamAssassin doesn't improve spam detection much.

Do they? What is meant is that once you reach some threshold, the detection rate no longer improves as quickly as before. You can't get any better than "nearly everything". But it will drop if no new tokens get added.

What is the best practice to optimize the bayes detection? Should I stop auto-learning after reaching the 5000 mark and then re-train from scratch from time to time?

No, keep the automatic training (unless there are too many FPs in the autotrained messages). Do a regular manual expire, so old tokens are purged.

I don't get many FPs or FNs after upgrading to 3.2.5 and retraining bayes. But if I keep auto-learning enabled, I should monitor the trained spam and ham counts and manually train ham whenever spam exceeds it (as it always will). So from time to time I should feed ham manually to sa-learn until it reaches the spam level again. Is this correct? If it is, I think it's rather time-consuming to keep checking the trained ham/spam counts and leveling them.
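
For what it's worth, checking the counts does not have to be done by eye. `sa-learn --dump magic` prints the Bayes counters (`nspam`, `nham`, `ntokens` and so on), and a few lines of Python can pull out the ratio. This is only a rough sketch: the sample output below uses made-up numbers, the helper names `parse_magic` and `spam_to_ham_ratio` are hypothetical, and the column layout is assumed to match what 3.2.x prints.

```python
import re

# Illustrative sample of `sa-learn --dump magic` output.
# The numbers are made up; paste in your own output to use this for real.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       6120          0  non-token data: nspam
0.000          0       2040          0  non-token data: nham
0.000          0     151200          0  non-token data: ntokens
"""

# Assumed layout: four whitespace-separated columns, then "non-token data: <label>".
# The third column carries the counter value.
LINE_RE = re.compile(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+)$")


def parse_magic(dump):
    """Map each 'non-token data' label to the integer in the third column."""
    counts = {}
    for line in dump.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(2).strip()] = int(match.group(1))
    return counts


def spam_to_ham_ratio(counts):
    """Return nspam / nham, or infinity when no ham has been learned yet."""
    nham = counts.get("nham", 0)
    if nham == 0:
        return float("inf")
    return counts.get("nspam", 0) / nham


counts = parse_magic(SAMPLE_DUMP)
ratio = spam_to_ham_ratio(counts)
print(f"nspam={counts['nspam']} nham={counts['nham']} ratio={ratio:.1f}")
```

Run from cron against the real dump, something like this could send a note only when the ratio drifts past whatever limit seems sensible for the site, instead of checking by hand.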

I was thinking of increasing bayes_auto_learn_threshold_spam to a higher number, so that less spam is auto-learned. Is this ok?
