On Sun, 2013-11-10 at 02:39 -0200, Sergio Durigan Junior wrote:
> On Sunday, November 10 2013, Karsten Bräckelmann wrote:
> > > So, I now have yet another question.  I let auto_learn active for SA,
> > > and now for every false-negative SA will learn that it is not spam,
> >
> > No.  False negative (not classified spam, although it is) is NOT what
> > triggers auto-learn ham.
>
> All right, I misunderstood things then.  I assumed that because of
> sa-learn --dump magic output:
>
> 0.000          0         37          0  non-token data: nham
>
> And this number increases every time I receive a message (whether it is
> a false-negative or a true-negative).  Since I have too little spam to
> train, it is hard to keep up with the number of ham received.

nham is the "Number of HAM" learned, counted in messages. Likewise,
nspam is the number of spam messages learned. Keep training until both
are at least 200 -- with default settings, SA does not use Bayes for
scoring at all until then.

Keep an eye on the autolearn bit of the X-Spam-Status header. If
autolearn=ham shows up frequently for FNs, there's a problem somewhere.
To debug that, we'd need the X-Spam headers and preferably the full, raw
message put up on a pastebin. Do some initial training first, though.

There's one thing worrying in your comment: "whether false-negative or
true-negative". You DO have spam also, right? I mean, classified spam is
not just silently discarded without you ever seeing it? That would be
really bad at this stage. Take it, verify it, learn it.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}