Re: SA-learn (spamassassin)

Karsten Bräckelmann Sun, 02 Aug 2009 15:25:37 -0700

On Sun, 2009-08-02 at 14:43 -0700, an anonymous Nabble user wrote:
> To Karsten Bräckelmann-2: I want to apologize for my approach. I post on
> Ubuntu and other forums because I am desperate: my assignment was to
> install, configure, and run an antispam setup (SpamAssassin, ClamAV,
> ClamSMTP, Razor, Postfix). Now I am under pressure because tomorrow I have
> to deliver my solution to my boss, and I must explain to him how it works.


Good luck with that.


Utterly fucked-up quoting, err, dumping of previous posts intermixed
with comments, fixicated.

> > the number of spam exceeding the bayes_min_spam_num value does not activate
> > Bayes *learn*ing. It means that Bayes will classify mail -- based on what it
> > learned before.

> > it keeps track of *tokens*, and the number of times they have been
> > seen in ham or spam.

> Your explanation is confusing to me, because you claim the min_spam_num
> value means that Bayes will classify mail based on what it learned
> before. My min_spam_num value is 1. I get the first mail, Subject:
> viagra; body: viagra, and run sa-learn --spam on it. Then I get a new
> mail, Subject: viagra; body: viagra. What will Bayes do, according to
> you? Keep in mind your own words.

Bayes will check the tokens against its database. Based on the number of
occurrences of each token in ham and spam, Bayes will return whether the
mail appears spammy or hammy (based on what it learned before), and its
confidence of that assessment.

This classification (ham or spam) and confidence will be scored by SA.
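
To make the idea of per-token ham/spam counts concrete, here is a minimal
Python sketch. It is a toy under stated assumptions, NOT SpamAssassin's
actual Bayes implementation: the class ToyBayes, its smoothing, and the
simple averaging of per-token values are all made up for illustration.

```python
from collections import Counter


class ToyBayes:
    """Toy token-count classifier; not SpamAssassin's real algorithm."""

    def __init__(self):
        self.spam_tokens = Counter()  # token -> number of spam mails containing it
        self.ham_tokens = Counter()   # token -> number of ham mails containing it
        self.nspam = 0
        self.nham = 0

    def learn(self, tokens, is_spam):
        # Count each distinct token once per learned message.
        if is_spam:
            self.spam_tokens.update(set(tokens))
            self.nspam += 1
        else:
            self.ham_tokens.update(set(tokens))
            self.nham += 1

    def token_spamminess(self, token):
        # Smoothed share of spam vs. ham messages that contained the token.
        s = (self.spam_tokens[token] + 1) / (self.nspam + 2)
        h = (self.ham_tokens[token] + 1) / (self.nham + 2)
        return s / (s + h)

    def classify(self, tokens):
        # Average over known tokens: above 0.5 looks spammy, below looks hammy.
        known = [t for t in set(tokens)
                 if t in self.spam_tokens or t in self.ham_tokens]
        if not known:
            return 0.5  # nothing learned about these tokens: no opinion
        return sum(self.token_spamminess(t) for t in known) / len(known)


bayes = ToyBayes()
bayes.learn(["viagra", "cheap"], is_spam=True)
bayes.learn(["meeting", "agenda"], is_spam=False)
print(bayes.classify(["viagra"]), bayes.classify(["meeting"]))
```

The returned value plays both roles described above: which side of 0.5 it
falls on is the spammy/hammy verdict, and its distance from 0.5 is a crude
stand-in for confidence.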

Keep in mind there are a LOT more tokens in a message than merely the
words in the Subject and Body. This DOES have a severe impact on your
results, if your "test spam" is a self-generated message with the word
Viagra as Subject and Body. Nope, this is not a proper test environment.
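
A rough Python sketch of why such a message yields more tokens than the
two visible words. The function toy_tokens and its "hdr:" prefix are
hypothetical and do not mirror SpamAssassin's real tokenizer; the sample
message is invented for the example.

```python
import email
import re


def toy_tokens(raw_message):
    """Hypothetical tokenizer: header words become tokens too."""
    msg = email.message_from_string(raw_message)
    tokens = []
    for name, value in msg.items():
        # Every header contributes tokens, tagged with the header name.
        for word in re.findall(r"[^\s<>;,]+", str(value)):
            tokens.append(f"hdr:{name}:{word}")
    body = msg.get_payload()
    if isinstance(body, str):  # skip multipart payloads in this sketch
        tokens.extend(re.findall(r"\w+", body.lower()))
    return tokens


# A self-generated "test spam", sent locally (invented sample).
raw = (
    "Received: from localhost (localhost [127.0.0.1])\n"
    "From: me@example.com\n"
    "To: me@example.com\n"
    "Subject: viagra\n"
    "\n"
    "viagra\n"
)
for token in toy_tokens(raw):
    print(token)
```

Most of the tokens here come from headers that say "local test mail", not
"spam", which is why learning from such a message skews the results.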


> > The bayes_min_(ham|spam)_num values ONLY control how many messages
> > Bayes needs to have learned before it should start classifying mail.

> => my Bayes can classify mail (because the min_spam_num value is 1, the
> condition is met). What now? Will my new mail be marked as spam?
> Or will it get a higher score...?

It will be classified (by Bayes) based on the tokens in the message and
the previously learned statistics. Bayes does NOT only mark spam. It
can also report that a message looks like ham.
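
The gating role of the two minimums can be sketched as follows. This
assumes, as the SpamAssassin documentation describes, that both the ham
and the spam minimum must be met and that each defaults to 200; the
function name bayes_may_classify is made up for this example.

```python
def bayes_may_classify(nspam, nham, min_spam_num=200, min_ham_num=200):
    """Return True once enough ham AND spam have been learned.

    Hypothetical helper: it only decides whether Bayes is consulted at
    all, not what verdict Bayes returns for a given message.
    """
    return nspam >= min_spam_num and nham >= min_ham_num


# One learned spam with bayes_min_spam_num lowered to 1, but no ham
# learned and bayes_min_ham_num left at its default.
print(bayes_may_classify(nspam=1, nham=0, min_spam_num=1))
```

Lowering only the spam minimum is therefore not enough in this sketch: with
no ham learned, the ham condition still blocks classification.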

Anyway, I asked you before to provide the output of sa-learn --dump magic.
You didn't. Given the intro, I seriously wonder whether the user you train
Bayes as and the user that scans mail are even the same.


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
