On 7/26/2010 5:58 AM, andrij wrote:
> Hi all,
>
> I am new to spamassassin and bayes classifier. I have several questions and
> I will greatly appreciate your help with that.
>
> 1) Training of the bayes classifier with _multipart_ e-mails (e.g., an
> e-mail contains other e-mails within its body). If I set
> "bayes_ignore_header Some-header", will bayes classifier ignore (while
> learning) the header "Some-header" in the nested messages as well?

As far as SA is concerned, this is a single message with a single set of
headers.  Bayes will ignore the specified header in the main message,
but not any copies of it in the body (where the nested e-mails are
stored).  If you want them treated as separate messages, you will need
to run something to split them into separate files and then learn each
file on its own.
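
For the splitting step, something along these lines might do as a starting
point.  It is only a rough sketch using Python's standard email module, and
it assumes the nested e-mails are attached as message/rfc822 parts (if they
are pasted inline as plain text, this will not find them).  The function
name is made up for illustration.

```python
import email
from email import policy


def extract_nested_messages(raw_bytes):
    """Return each message/rfc822 part of an e-mail as its own bytes blob.

    Assumption: nested e-mails are real message/rfc822 attachments.
    """
    outer = email.message_from_bytes(raw_bytes, policy=policy.default)
    nested = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries exactly one message as its payload.
            inner = part.get_payload(0)
            nested.append(inner.as_bytes())
    return nested
```

Each returned blob can then be written to its own file and passed to
sa-learn as spam or ham, so that its headers are seen as headers rather
than as body text.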

> 2) Evaluating whether an email is spam or not. Again, if I set
> "bayes_ignore_header Some-header", will the bayes classifier ignore the
> header while evaluating an e-mail?

Yes.  That's what it's for.

> 3) Evaluating whether an email is spam or not. Does the bayes classifier
> analyze headers if I have, for example, the following rule: "body BAYES_05
> eval:check_bayes('0.00', '0.05')". According to the
> http://wiki.apache.org/spamassassin/WritingRules : "Body rules also include
> the Subject as the first line of the body content". So, any headers that
> precede subject header are not considered by the bayes classifier?

I don't have an answer for you here, just another question:  Why do
you want to mess with the bayes rules?  They work very well as-is, as
long as you make sure the database is being fed properly (learning spam
as spam and ham as ham, with a decent mix of both being learned on a
regular basis).

-- 
Bowie
