On 7/26/2010 5:58 AM, andrij wrote:
> Hi all,
>
> I am new to spamassassin and bayes classifier. I have several questions and
> I will greatly appreciate your help with that.
>
> 1) Training of the bayes classifier with _multipart_ e-mails (e.g., an
> e-mail contains other e-mails within its body). If I set
> "bayes_ignore_header Some-header", will bayes classifier ignore (while
> learning) the header "Some-header" in the nested messages as well?
As far as SA is concerned, this is a single message with a single set of headers. Bayes will ignore the specified header in the main message, but not in the body (where the rest of the e-mails are stored). If you want them treated as separate messages, you will need to run something to split them into separate files and then learn them.

> 2) Evaluating whether an email is spam or not. Again, if I set
> "bayes_ignore_header Some-header", will the bayes classifier ignore the
> header while evaluating an e-mail?

Yes. That's what it's for.

> 3) Evaluating whether an email is spam or not. Does the bayes classifier
> analyze headers if I have, for example, the following rule: "body BAYES_05
> eval:check_bayes('0.00', '0.05')". According to the
> http://wiki.apache.org/spamassassin/WritingRules : "Body rules also include
> the Subject as the first line of the body content". So, any headers that
> precede subject header are not considered by the bayes classifier?

I don't have an answer for you here, but just another question. Why do you want to mess with the bayes rules? They work very well as-is as long as you make sure the database is being fed properly (learning spam as spam and ham as ham with a decent mix of both being learned on a regular basis).

-- 
Bowie
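
Regarding the "split them into separate files" step in the answer to question 1: the following is only a minimal sketch, and it assumes the nested e-mails are attached as message/rfc822 MIME parts (if they are pasted inline as plain text, this will not find them). The function name split_nested_messages and the "nested-NNNN.eml" file naming are made up for illustration; only Python's standard email package is used.

```python
import email
from email import policy
from pathlib import Path


def split_nested_messages(raw_bytes, out_dir):
    """Write every message/rfc822 part found in raw_bytes to its own file.

    Hypothetical helper: returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    outer = email.message_from_bytes(raw_bytes, policy=policy.default)
    written = []

    # walk() visits the outer message and every sub-part, including
    # parts that sit inside an already nested message.
    for part in outer.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        payload = part.get_payload()
        # A parsed message/rfc822 part carries a one-element list
        # holding the nested message; skip anything malformed.
        if not isinstance(payload, list) or not payload:
            continue
        nested = payload[0]
        path = out_dir / f"nested-{len(written):04d}.eml"
        path.write_bytes(nested.as_bytes())
        written.append(path)

    return written
```

Each file written this way is a standalone message with its own header block, so a bayes_ignore_header setting should then apply to its headers during learning. The output directory can be handed to sa-learn with --spam or --ham as appropriate.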