On Sun, 2009-08-02 at 02:00 +0100, RW wrote:
> On Sun, 02 Aug 2009 01:42:21 +0200 Karsten Bräckelmann wrote:
> > > when I learn bayes by hand (sa-learn --spam --file mail) that this
> > > mail is spam? I have explicit set in local.cf bayes_min_spam_num 1.
> > > This means that for bayes is sufficient one mail for
> > > learning (according to me). But it doesn't work.
> >
> > Do NOT do that.
> >
> > Unless you *really* understand the implications. Which you don't.
> > It's a default for a reason.
> >
> > It's a counter-measure against bad learning, to force at least some
> > MINIMAL manual training, before auto-learning kicks in. You just
> > side-stepped that.
>
> AFAIK it doesn't affect autolearning at all, bayes_min_spam_num &
> bayes_min_ham_num control when scoring starts.

Well, it *does* nonetheless. *shrug*

As per the docs, that threshold controls when Bayes activates. Nothing
more, nothing less. Want to see for yourself?

  $ echo | spamassassin --cf='score EMPTY_MESSAGE 6' --cf='score MISSING_DATE 6'
  X-Spam-Status: Yes, score=17.3 required=8.0 tests=EMPTY_MESSAGE,MISSING_DATE,
          MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,
          NO_RELAYS,TVD_SPACE_RATIO autolearn=spam version=3.2.5

  $ sa-learn --dump magic
  0.000          0          3          0  non-token data: bayes db version
  0.000          0          2          0  non-token data: nspam
  0.000          0          1          0  non-token data: nham
  0.000          0         20          0  non-token data: ntokens

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
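
For anyone repeating the check, the nspam and nham counters from `sa-learn --dump magic` can be compared against the two thresholds with a small helper. This is only a sketch. The function name `bayes_ready` is hypothetical and not part of SpamAssassin. It assumes the counter value sits in the third column, as in the dump shown in this message, and it falls back to 200 for both thresholds, which is the documented default of bayes_min_spam_num and bayes_min_ham_num.

```shell
#!/bin/sh
# Hypothetical helper, not part of SpamAssassin: reads `sa-learn --dump magic`
# output on stdin and reports whether nspam and nham have both reached the
# given thresholds. Arguments: min_spam min_ham (both default to 200).
bayes_ready() {
    min_spam=${1:-200}
    min_ham=${2:-200}
    awk -v ms="$min_spam" -v mh="$min_ham" '
        # Assumption: the third column of a "non-token data" line is the value.
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            # Force numeric comparison; missing counters count as 0.
            state = (nspam + 0 >= ms + 0 && nham + 0 >= mh + 0) ? "active" : "inactive"
            printf "nspam=%d nham=%d bayes=%s\n", nspam, nham, state
        }'
}
```

Typical use would be `sa-learn --dump magic | bayes_ready 1 1`, passing whatever values are set in local.cf. It only reports whether the Bayes rules would be active for scoring. It says nothing about whether a message gets auto-learned.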