RE: Manually training SpamAssassin by forwarding mail

Joe Polk 4 Feb 2005 19:38:35 -0000

First, I had understood that Bayes can learn from previously tagged emails
without the SpamAssassin tags being stripped first. Has this changed?


Second, all of my users use a webmail client, though they can use OE if they
wish. It is probably best for them to use IMAP so that server-side scanning
is easier to set up. I currently have two scripts that run nightly. The first
takes everything in the user's /home/user/mail/Spam folder, learns it as
spam, then empties the folder. The second does the same for Ham, but moves
that mail to a Cleaned folder afterwards. All the user has to do is move
untagged spam into Spam and false positives into Ham.
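
For anyone who wants to reproduce the nightly job, here is a rough Python sketch of the idea rather than my actual scripts. It assumes each folder is a single mbox file (with maildir you would point sa-learn at the directory and drop --mbox), and the names learn_command and train_user are made up for illustration. It does no mailbox locking, so it should only run when nothing else is writing to the folders.

```python
import shutil
import subprocess
from pathlib import Path


def learn_command(kind, mbox_path):
    """Build the sa-learn command line for one mbox file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", str(mbox_path)]


def train_user(mail_dir, run=subprocess.run):
    """Learn Spam and Ham for one user, then empty or archive the folders.

    `run` is injectable so the file handling can be exercised without
    sa-learn installed.
    """
    mail_dir = Path(mail_dir)
    spam = mail_dir / "Spam"
    ham = mail_dir / "Ham"
    cleaned = mail_dir / "Cleaned"

    if spam.exists() and spam.stat().st_size > 0:
        run(learn_command("spam", spam), check=True)
        spam.write_bytes(b"")  # empty the Spam folder

    if ham.exists() and ham.stat().st_size > 0:
        run(learn_command("ham", ham), check=True)
        # Append the learned ham to Cleaned, then empty Ham.
        with ham.open("rb") as src, cleaned.open("ab") as dst:
            shutil.copyfileobj(src, dst)
        ham.write_bytes(b"")
```

Called from cron as something like train_user("/home/user/mail") for each user. Because check=True makes a failed sa-learn raise an exception, a folder is only emptied after it has been learned successfully.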

--
<<JAV>>


---------- Original Message -----------
From: "Sander Holthaus - Orange XL" <[EMAIL PROTECTED]>
To: "'SpamAssassin Users'" <users@spamassassin.apache.org>
Cc: "'Stuart Johnston'" <[EMAIL PROTECTED]>, "'Peter Marshall'"
<[EMAIL PROTECTED]>
Sent: Fri, 4 Feb 2005 19:47:40 +0100
Subject: RE: Manually training SpamAssassin by forwarding mail

> > -----Original Message-----
> > From: Stuart Johnston [mailto:[EMAIL PROTECTED] 
> > Sent: Friday, February 04, 2005 7:35 PM
> > To: Peter Marshall; SpamAssassin Users
> > Subject: Re: Manually training SpamAssassin by forwarding mail
> > 
> > Peter Marshall wrote:
> > > Stuart Johnston wrote:
> > > 
> > >> Peter Marshall wrote:
> > >>
> > >>> Kevin Sullivan wrote:
> > >>>
> > >>>> --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote:
> > >>>>
> > >>>>> I've been interested in offering customers a way to manually train
> > >>>>> the SpamAssassin Bayes filter for ham and spam (to reduce false
> > >>>>> positives and negatives). However, I can only find documentation
> > >>>>> on this for local mailboxes and IMAP. Most users, however, retrieve
> > >>>>> their mail through POP and use Outlook (Express) as their mail
> > >>>>> client. Is there a way to train SpamAssassin with such a setup
> > >>>>> (e.g. forwarding mail with Outlook (Express) using SMTP)?
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> If you want to do a lot of programming, you could save all incoming
> > >>>> messages for a few days in a database somewhere.  When a user
> > >>>> forwards a message to a special "ham" or "spam" mailbox, you pull
> > >>>> the message-id from the message and use it to recover the original
> > >>>> message from your database.
> > >>>>
> > >>>>     -Kevin
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> My question is the same as Henrik's. I have a bunch of email that is
> > >>> spam (either tagged by SpamAssassin or not tagged at all). I
> > >>> forwarded it as an attachment to a "spam" mailbox. What do I have
> > >>> to do now before I can get Bayes to learn the messages? I read you
> > >>> have to remove the headers. Could anyone give me a little more
> > >>> detail?
> > >>
> > >>
> > >>
> > >> I use a modified version of the DMZS-sa-learn.pl from: 
> > >> http://www.dmzs.com/tools/files/spam.phtml
> > >>
> > >> When someone forwards a spam to me, I move the message to a special
> > >> IMAP folder that gets processed by the script.  My additions look
> > >> something like:
> > >>
> > >> use Email::MIME;
> > >> ...
> > >> my $msg = Email::MIME->new($raw_message_body);
> > >>
> > >> my @parts = $msg->parts;
> > >>
> > >> foreach (@parts) {
> > >>   if ($_->content_type =~ m|message/rfc822|) {
> > >>     sa_learn($_->body_raw);
> > >>   }
> > >> }
> > >>
> > >>
> > >> I've tested this with messages forwarded as attachments from Outlook
> > >> and Thunderbird.  I'm not sure how effective it is, though.  I'm sure
> > >> that it still loses something in translation.  All-IMAP is
> > >> really the way to go if you can.
> > >>
> > >>
> > >> Stuart Johnston
> > >>
> > >>
> > > But I have no IMAP, only POP. Users would forward (as attachment)
> > > to a mailbox, and then I have to run sa-learn, I assume as root?
> > > 
> > > Will the stuff you posted work for this setup as well?
> > > 
> > > Would there be big problems just running it after the forward as
> > > attachment?
> > 
> > The code I posted only shows how you can extract the attached 
> > spam from the email.  You'll need to write your own code to 
> > integrate it into your particular setup.
> > 
> > BTW, in Outlook, you can easily attach multiple spams to one 
> > message and this code should handle it.
> 
> CTRL-a, right click, "Forward Items" will indeed do the trick.
> 
> > > 
> > > Can users also forward, as an attachment, mail that was
> > > already marked as spam? Is there any advantage to this?
> > 
> > If you use Bayes auto learn, I suspect that this wouldn't do much. 
> > Otherwise, it might help.
> 
> I would check the headers of the forwarded messages to see if their
> spam score is above your auto-learning threshold. If it is,
> relearning is perhaps quite useless. You might wonder why they
> received the message anyway (I would think that something that is
> good enough to autolearn is good enough to refuse or discard).
> 
> Kind Regards,
> Sander Holthaus
------- End of Original Message -------
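
For the POP-only case discussed above, the two steps (pulling the attached originals out of a forwarded mail, then checking whether SpamAssassin already auto-learned them) can be sketched with Python's standard email module. This is an illustration, not a port of Stuart's Perl. The function names attached_messages and spam_status are hypothetical, and I am assuming the usual X-Spam-Status layout with "score=" (or the older "hits=") and "autolearn=" fields. Re-serialising the attachment may not be byte-identical to what the server originally received.

```python
import email
import re


def _rfc822_parts(part):
    """Yield attached messages as bytes, without descending into them."""
    if part.get_content_type() == "message/rfc822" and part.is_multipart():
        # A parsed message/rfc822 part holds exactly one Message object.
        yield part.get_payload(0).as_bytes()
    elif part.is_multipart():
        for sub in part.get_payload():
            yield from _rfc822_parts(sub)


def attached_messages(raw_bytes):
    """Return a generator over every message forwarded as an attachment."""
    return _rfc822_parts(email.message_from_bytes(raw_bytes))


def spam_status(raw_bytes):
    """Return (score, autolearn) from X-Spam-Status, or None for missing fields."""
    msg = email.message_from_bytes(raw_bytes)
    # The header is often folded over several lines; collapse the whitespace.
    header = " ".join(str(msg.get("X-Spam-Status", "")).split())
    score = re.search(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)", header)
    auto = re.search(r"\bautolearn=(\w+)", header)
    return (
        float(score.group(1)) if score else None,
        auto.group(1) if auto else None,
    )
```

Each item from attached_messages() could be written to a temporary file and passed to sa-learn. If spam_status() reports autolearn as "spam" or "ham", the message was most likely learned already and feeding it again probably adds little.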
