First, I had understood that Bayes can learn previously tagged emails without stripping the SpamAssassin tags. Has this changed?
Second, all of my users use a webmail client, though they can use OE if they wish. It is probably best for them to use IMAP so that server-side scanning can be set up more easily. I currently have two scripts that run nightly. The first takes everything in the user's /home/user/mail/Spam folder, learns it as spam, and then empties the folder. The second does the same for Ham, but moves that mail to a Cleaned folder. All the user has to do is move untagged spam into Spam and false positives into Ham.

-- <<JAV>>

---------- Original Message -----------
From: "Sander Holthaus - Orange XL" <[EMAIL PROTECTED]>
To: "'SpamAssassin Users'" <users@spamassassin.apache.org>
Cc: "'Stuart Johnston'" <[EMAIL PROTECTED]>, "'Peter Marshall'" <[EMAIL PROTECTED]>
Sent: Fri, 4 Feb 2005 19:47:40 +0100
Subject: RE: Manually training SpamAssassin by forwarding mail

> > -----Original Message-----
> > From: Stuart Johnston [mailto:[EMAIL PROTECTED]
> > Sent: Friday, February 04, 2005 7:35 PM
> > To: Peter Marshall; SpamAssassin Users
> > Subject: Re: Manually training SpamAssassin by forwarding mail
> >
> > Peter Marshall wrote:
> > > Stuart Johnston wrote:
> > >
> > >> Peter Marshall wrote:
> > >>
> > >>> Kevin Sullivan wrote:
> > >>>
> > >>>> --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote:
> > >>>>
> > >>>>> I've been interested in offering customers a way to manually
> > >>>>> train the SpamAssassin Bayes filter for ham and spam (to reduce
> > >>>>> false positives and negatives). However, I can only find
> > >>>>> documentation for this for local mailboxes and IMAP. Most users,
> > >>>>> however, retrieve their mail through POP and use Outlook
> > >>>>> (Express) as their mail client. Is there a way to train
> > >>>>> SpamAssassin with such a setup (e.g. forwarding mail with
> > >>>>> Outlook (Express) using SMTP)?
> > >>>>
> > >>>> If you want to do a lot of programming, you could save all
> > >>>> incoming messages for a few days in a database somewhere. When a
> > >>>> user forwards a message to a special "ham" or "spam" mailbox, you
> > >>>> pull the message-id from the message and use it to recover the
> > >>>> original message from your database.
> > >>>>
> > >>>> -Kevin
> > >>>
> > >>> My question is the same as Henrik's: I have a bunch of email that
> > >>> is spam (either tagged by SpamAssassin or not tagged at all). I
> > >>> forwarded it as an attachment to a "spam" mailbox. What do I have
> > >>> to do now before I can get Bayes to learn the message? I read
> > >>> that you have to remove the headers. Could anyone give me a
> > >>> little more detail?
> > >>
> > >> I use a modified version of the DMZS-sa-learn.pl from:
> > >> http://www.dmzs.com/tools/files/spam.phtml
> > >>
> > >> When someone forwards a spam to me, I move the message to a
> > >> special IMAP folder that gets processed by the script. My
> > >> additions look something like:
> > >>
> > >> use Email::MIME;
> > >> ...
> > >> # Parse the forwarded mail that wraps the original spam
> > >> my $msg = Email::MIME->new($raw_message_body);
> > >>
> > >> my @parts = $msg->parts;
> > >>
> > >> # Learn every attached message (message/rfc822 part)
> > >> foreach (@parts) {
> > >>     if ($_->content_type =~ m|message/rfc822|) {
> > >>         sa_learn($_->body_raw);
> > >>     }
> > >> }
> > >>
> > >> I've tested this with messages forwarded as attachments from
> > >> Outlook and Thunderbird. I'm not sure how effective it is, though.
> > >> I'm sure that it still loses something in the translation. All-IMAP
> > >> is really the way to go if you can.
> > >>
> > >> Stuart Johnston
> > >>
> > > But I have no IMAP, only POP. They would forward (as attachment)
> > > to a mailbox, and then I have to run sa-learn ... I assume as root?
> > >
> > > Will the stuff you posted work for this setup as well?
> > >
> > > Would there be big problems just running it after the forward as
> > > attachment?
> >
> > The code I posted only shows how you can extract the attached
> > spam from the email. You'll need to write your own code to
> > integrate it into your particular setup.
> >
> > BTW, in Outlook, you can easily attach multiple spams to one
> > message and this code should handle it.
>
> CTRL-a, right click, "Forward Items" will indeed do the trick.
>
> > >
> > > Can users also forward as attachment mail that was already
> > > marked as spam, or is there any advantage to this?
> >
> > If you use Bayes auto learn, I suspect that this wouldn't do much.
> > Otherwise, it might help.
>
> I would check the headers of the forwarded messages to see if their
> spam score is above your auto-learning threshold. If it is,
> relearning is perhaps quite useless. You might wonder why they
> received the message anyway
> (I would think that something that is good enough to autolearn is
> good enough to refuse or discard).
>
> Kind Regards,
> Sander Holthaus
------- End of Original Message -------
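
For reference, the two ideas in the quoted thread (pulling the attached spam out of a forwarded mail, then checking its score against the auto-learn threshold) could be sketched roughly as below. This is a Python sketch rather than the Perl quoted above, and only an illustration: the function names and the sample message are hypothetical, and the 12.0 threshold is an assumption (I believe it is the default bayes_auto_learn_threshold_spam, but check your own configuration). Handing the surviving messages to sa-learn is left to whatever script fits your setup.

```python
import email
import re
from email import policy
from email.message import EmailMessage

# X-Spam-Status carries "score=7.9" (SpamAssassin 3.x) or "hits=7.9" (2.x).
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def attached_messages(raw):
    """Return the message/rfc822 attachments of a forwarded mail (bytes in)."""
    outer = email.message_from_bytes(raw, policy=policy.default)
    found = []
    # Only top-level parts are inspected, like the Perl loop over $msg->parts.
    for part in outer.iter_parts():
        if part.get_content_type() == "message/rfc822":
            # The payload of such a part is a list holding the original mail.
            found.append(part.get_payload(0))
    return found


def spam_score(msg):
    """Return the score recorded in X-Spam-Status, or None if untagged."""
    match = SCORE_RE.search(str(msg.get("X-Spam-Status", "")))
    return float(match.group(1)) if match else None


def worth_learning(msg, autolearn_threshold=12.0):
    """Skip mail that scored high enough to have been auto-learned already."""
    score = spam_score(msg)
    return score is None or score < autolearn_threshold


def build_sample_forward():
    """Hypothetical test data: one forward carrying two attached spams."""
    tagged = EmailMessage()
    tagged["Subject"] = "cheap pills"
    tagged["X-Spam-Status"] = "Yes, score=15.2 required=5.0 autolearn=spam"
    tagged.set_content("buy now\n")

    untagged = EmailMessage()
    untagged["Subject"] = "hello friend"
    untagged.set_content("wire me money\n")

    outer = EmailMessage()
    outer["Subject"] = "FW: spam"
    outer.set_content("See attached.\n")
    outer.add_attachment(tagged)
    outer.add_attachment(untagged)
    return outer.as_bytes()


if __name__ == "__main__":
    for original in attached_messages(build_sample_forward()):
        print(original["Subject"], spam_score(original), worth_learning(original))
```

Only the top-level attachments are examined, which matches the "Forward Items" case where several spams are attached to one message. Untagged mail is treated as worth learning, since it has no score to compare.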