Re: bayes learn best practice

2009-04-14 Thread John Hardin
On Tue, 14 Apr 2009, Arthur Kerpician wrote: Again, if I choose to learn *all* spam and *all* ham, I'll end up with big differences between their levels in bayes, which will affect spam detection. Not really, when you consider the volume of spam you receive far exceeds the volume of ham you

Re: bayes learn best practice

2009-04-14 Thread Arthur Kerpician
Kai Schaetzl wrote: Arthur Kerpician wrote on Thu, 09 Apr 2009 20:25:42 +0300: . So from time to time I should feed ham manually to sa-learn, until it reaches the spam level again. Is this correct? If it is, I think it's rather time-consuming to always check the trained ham/spam and level

Re: bayes learn best practice

2009-04-13 Thread Kai Schaetzl
Arthur Kerpician wrote on Thu, 09 Apr 2009 20:25:42 +0300: > . So from time to time I should > feed ham manually to sa-learn, until it reaches the spam level again. Is > this correct? If it is, I think it's rather time-consuming to always > check the trained ham/spam and level them. There is n

Re: bayes learn best practice

2009-04-09 Thread Michael Scheidell
Arthur Kerpician wrote: I was thinking to increase bayes_auto_learn_threshold_spam to a higher number, so less spam is auto-learned. Is this ok? I try to keep it at a 10 to 1 ratio (dropping the _ham threshold and increasing the _spam threshold), basically trying to mimic the global stats o

Re: bayes learn best practice

2009-04-09 Thread Arthur Kerpician
Kai Schaetzl wrote: Arthur Kerpician wrote on Thu, 09 Apr 2009 09:41:22 +0300: The docs mention that after 5000 spam and ham learned, spamassassin doesn't improve spam detection much. Do they? What is meant is that once you reach some threshold the detection rate doesn't improve as g

Re: bayes learn best practice

2009-04-09 Thread Kai Schaetzl
Arthur Kerpician wrote on Thu, 09 Apr 2009 09:41:22 +0300: > The docs mention that after 5000 spam and ham learned, > spamassassin doesn't improve spam detection much. Do they? What is meant is that once you reach some threshold the detection rate doesn't improve as well as before. You can't ge

Re: bayes learn best practice

2009-04-09 Thread John Hardin
On Thu, 9 Apr 2009, Arthur Kerpician wrote: I tried to manually keep both spam and ham at the same level in the bayes db but it seems that spamassassin is learning spam twice as fast as ham. Not surprising, as raw email traffic has a very skewed spam:ham ratio. Surely you've heard the stats

bayes learn best practice

2009-04-08 Thread Arthur Kerpician
Hi, I recently upgraded to 3.2.5 and re-trained the bayes db from scratch. Auto-learn is on, so now I have about 6000 mails trained as spam and 3000 as ham. I tried to manually keep both spam and ham at the same level in the bayes db, but it seems that spamassassin is learning spam twice as fast