Hi all,

For clarification I'll sketch the flow of our mail to sa-learn.

Internet -> 2 redundant Exim mailservers with SA -> 2 redundant Notes servers
 -> User -> Spam DB on Notes server -> via fetchmail back to the Exim server
   -> sa-learn

That will of course add at least one new Received: line (for fetchmail)
and various other header lines from Notes. Additionally, the message
might already carry a subject tag added by the Exim relays if it scored
high enough when it was first received. Note that this tag is _not_
added by SA, but by the MTA itself.

A sample message:

Received: from nsext02.abit.de [10.150.1.42]
        by localhost with POP3 (fetchmail-6.2.5)
        for [EMAIL PROTECTED] (single-drop); Wed, 21 Dec 2005 09:35:43 +0100 (CET)
Received: from mx01.abit.de ([10.150.1.52])
          by nsext02.abit.de (Lotus Domino Release 6.5.4FP1)
          with ESMTP id 2005110406561735-1513 ;
          Fri, 4 Nov 2005 06:56:17 +0100
Received: from [61.97.149.100] (helo=easynet.co.uk)
        by mx01.abit.de with smtp (Exim 4.52)
        id 1EXuYn-0002Jd-5B
        for [EMAIL PROTECTED]; Fri, 04 Nov 2005 06:56:22 +0100
Received: from 233.208.214.174 by smtp.covlink.co.uk;
        Fri, 04 Nov 2005 05:54:33 +0000
Message-ID: <[EMAIL PROTECTED]>
From: "Annabelle Osborn" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Date: Fri, 04 Nov 2005 01:54:04 -0400
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1158
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165
X-Spam-Score: 5.1 (+++++)
X-Spam-Report: Software zur Erkennung von "Spam" auf dem Rechner
        mx02.abit.de
        hat die eingegangene E-mail als mögliche "Spam"-Nachricht identifiziert.
        Die ursprüngliche Nachricht wurde an diesen Bericht angehängt, so dass
        Sie sie anschauen können (falls es doch eine legitime E-Mail ist) oder
        ähnliche unerwünschte Nachrichten in Zukunft markieren können.
        Bei Fragen zu diesem Vorgang wenden Sie sich bitte an
        the administrator of that system
        Vorschau: Online pharmacy - Visit our online store and save. Save
        up to 80% compared to normal rates. All popular drugs are available!
        [...]
        Inhaltsanalyse im Detail:   (5.1 Punkte, 3.0 benötigt)
        Pkte Regelname              Beschreibung
        ---- ---------------------- --------------------------------------------------
        0.1 RAZOR2_CF_RANGE_51_100 BODY: Razor2 Spam-Bewertung liegt zwischen 51 und 100
        [cf: 100]
        3.5 BAYES_99               BODY: Spamwahrscheinlichkeit nach Bayes-Test: 99-100%
        [score: 1.0000]
        1.5 RAZOR2_CHECK           Gelistet im "Razor2"-System (http://razor.sf.net/)
X-Spam-Flag: YES
X-Spam-Category: REDIRECT
Subject: XX_SPAM_REDIRECT_XX (score 5.1) Top 10 popular pharmacy drugs
X-MIMETrack: Itemize by SMTP Server on NSEXT02/ABIT(Release 6.5.4FP1|June 19, 2005) at
 04.11.2005 06:56:17,
        Serialize by POP3 Server on NSEXT02/ABIT(Release 6.5.4FP1|June 19, 2005) at
 21.12.2005 09:35:43
Content-Type: text/plain;
        charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

[content of the message would come here]

Now the big question: will those subject tags or the Notes header lines
confuse sa-learn and lead to wrong Bayes tokens?
I am especially concerned about the subject tags added by our MTA, and
about the fact that the message now has more Received lines than it had
when it first arrived at our MTA.

I already added the following lines to the local.cf for SA:

bayes_auto_learn 0
bayes_ignore_header X-Spam-Score
bayes_ignore_header X-Spam-Report
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Category
bayes_ignore_header X-MIMETrack
bayes_auto_expire 0

If my assumptions are correct, that leaves the additional Received lines
and the subject tag. Should I filter those out via a script before
feeding the messages to sa-learn, or can I safely ignore them?
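
To make the question concrete, here is a minimal sketch of the kind of
pre-filter I have in mind, in case filtering turns out to be necessary.
It is only an illustration under several assumptions: the function name
strip_local_additions is made up, the tag pattern is generalised from
the single XX_SPAM_REDIRECT_XX example above, the hops to drop are
recognised simply by the strings "fetchmail" and "Lotus Domino" in the
Received line, and the message is assumed to use LF line endings.

```python
import re

# Assumption: all MTA tags look like the one in the sample Subject line,
# i.e. "XX_SPAM_<WORD>_XX (score <number>) ". Only REDIRECT was observed.
TAG_RE = re.compile(
    rb"^(Subject:[ \t]*)XX_SPAM_[A-Z]+_XX \(score -?[0-9.]+\)[ \t]*",
    re.IGNORECASE,
)

# Assumption: Received lines added on the way back to sa-learn can be
# recognised by these strings (fetchmail and the Notes/Domino server).
LOCAL_HOP_RE = re.compile(rb"fetchmail|Lotus Domino", re.IGNORECASE)


def split_headers(raw):
    """Split a raw message into a list of header fields and the rest.

    Folded continuation lines (starting with space or tab) stay attached
    to the field they belong to. Assumes LF line endings.
    """
    head, sep, body = raw.partition(b"\n\n")
    fields = []
    for line in head.split(b"\n"):
        if line[:1] in (b" ", b"\t") and fields:
            fields[-1] += b"\n" + line  # continuation of previous field
        else:
            fields.append(line)
    return fields, sep + body


def strip_local_additions(raw):
    """Drop locally added Received fields and the MTA subject tag."""
    fields, rest = split_headers(raw)
    kept = []
    for field in fields:
        is_received = field.lower().startswith(b"received:")
        if is_received and LOCAL_HOP_RE.search(field):
            continue  # hop added after the original delivery
        kept.append(TAG_RE.sub(rb"\1", field))  # no-op for other fields
    return b"\n".join(kept) + rest
```

The function would be called on the raw bytes of each fetched message
(for example read from sys.stdin.buffer), and the result written out and
handed to sa-learn --spam. The body and all other headers are passed
through untouched.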

I hope my message was not too confusing, but it's still early ;)

regards
        sash

--------------------------------------------------
Sascha Runschke
Netzwerk Administration
IT-Services

ABIT AG
Robert-Bosch-Str. 1
40668 Meerbusch

Tel.:+49 (0) 2150.9153.226
Mobil:+49 (0) 173.5419665
mailto:[EMAIL PROTECTED]

http://www.abit.net
http://www.abit-epos.net
---------------------------------
Sicherheitshinweis zur E-Mail Kommunikation /
  Security note regarding email communication:
http://www.abit.net/sicherheitshinweis.html
