Re: Bayes scoring

2010-08-02 Thread RW
On Mon, 2 Aug 2010 05:51:25 -0700 (PDT) andrij wrote:
> How many tokens are used by the SA's bayes classifier to
> calculate the probability that the mail is spam/ham?
It varies. It uses all the tokens above a minimum token strength, up to a maximum of 150.
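A rough sketch of the selection rule described in this reply, not SpamAssassin's actual code. It assumes that "token strength" means the distance of a token's spam probability from the neutral value 0.5. The function name, the `min_strength` parameter, and the example probabilities are all hypothetical; only the cap of 150 comes from the message.

```python
from typing import Dict, List, Tuple


def select_significant_tokens(
    token_probs: Dict[str, float],
    min_strength: float,
    max_tokens: int = 150,  # upper limit quoted in the reply above
) -> List[Tuple[str, float]]:
    """Pick the tokens that would feed into the final spam probability.

    Assumption: strength is |p - 0.5|, i.e. how far a token's spam
    probability lies from the neutral midpoint.
    """
    # Keep only tokens whose strength reaches the minimum.
    strong = [
        (token, prob)
        for token, prob in token_probs.items()
        if abs(prob - 0.5) >= min_strength
    ]
    # Strongest tokens first, then cut the list at the maximum.
    strong.sort(key=lambda item: abs(item[1] - 0.5), reverse=True)
    return strong[:max_tokens]


# Hypothetical per-token spam probabilities for one message.
example = {"viagra": 0.99, "hello": 0.52, "meeting": 0.05, "the": 0.5}
selected = select_significant_tokens(example, min_strength=0.3)
```

Under this rule the number of tokens used depends on the message: a mail with few strong tokens contributes fewer than 150, which matches the "it varies" answer.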

Re: DB tokens expiration

2010-08-02 Thread RW
On Mon, 2 Aug 2010 05:29:32 -0700 (PDT) andrij wrote:
> Hi all,
>
> after I trained the bayes classifier with several thousand
> e-mails I ran "sa-learn --dump magic" and I got the following:
>
> Why was the number of ntokens not reduced to 15?
The expiry algorithm isn't very good.

Re: sa-compile has no effect (under Windows.......)

2010-08-02 Thread Daniel McDonald
On 8/2/10 7:53 AM, "Daniel Lemke" wrote:
> Yet Another Ninja wrote:
>> compiled rules only affects body & rawbody rules.
>> Network tests won't be affected and are probably the reason for the lack
>> of a massive difference.
>
> Good advice, I disabled all the other plugins and ran

Re: sa-compile has no effect

2010-08-02 Thread Karsten Bräckelmann
On Mon, 2010-08-02 at 05:53 -0700, Daniel Lemke wrote:
> Yet Another Ninja wrote:
> > compiled rules only affects body & rawbody rules.
> > Network tests won't be affected and are probably the reason for the lack
> > of a massive difference.
>
> Good advice, I disabled all the other plugins and r

Re: Bayes scoring

2010-08-02 Thread andrij
Daniel Lemke wrote:
> andrij wrote:
>> I ran the bayes classifier on more than 4500 e-mails. All (except for about
>> 100 e-mails) contained test=BAYES_*. Does anybody have any idea why these
>> 100 e-mails were not scored by the bayes classifier?
>
> Do you have any shortcircuit enab

Re: Bayes scoring

2010-08-02 Thread Daniel Lemke
andrij wrote:
> I ran the bayes classifier on more than 4500 e-mails. All (except for about
> 100 e-mails) contained test=BAYES_*. Does anybody have any idea why these
> 100 e-mails were not scored by the bayes classifier?
Do you have any shortcircuit enabled? Could you post a raw example of

Re: sa-compile has no effect (under Windows.......)

2010-08-02 Thread Daniel Lemke
Yet Another Ninja wrote:
> compiled rules only affects body & rawbody rules.
> Network tests won't be affected and are probably the reason for the lack
> of a massive difference.
Good advice, I disabled all the other plugins and ran spamassassin in local test mode, processing a huge text

Bayes scoring

2010-08-02 Thread andrij
Hi all, I ran the bayes classifier on more than 4500 e-mails. All (except for about 100 e-mails) contained test=BAYES_*. Does anybody have any idea why these 100 e-mails were not scored by the bayes classifier? At http://www.paulgraham.com/spam.html, it is written that "When new mail arrives, it is

DB tokens expiration

2010-08-02 Thread andrij
Hi all, after I trained the bayes classifier with several thousand e-mails I ran "sa-learn --dump magic" and I got the following:
0.000 0 3 0 non-token data: bayes db version
0.000 0 5367 0 non-token data: nspam
0.000 0 3792