[N.B.: Your prior correspondent is not able to post to this list, so we only saw your side of that exchange.]

On 23 Jul 2018, at 19:38 (-0400), Nick Bright wrote:

> When requesting submissions from users for use with sa-learn, if they are going to forward the message somewhere; is it best for that to be forwarded as an attachment, or forwarded inline? Will sa-learn automatically understand "the spam is attached" if it's an attachment?
>
> Learning from a mailbox of my own spam (with full headers - the actual mails) is quite different from users *forwarding* spam for training.
>
> So I ask: what is the best practice for learning submissions when using site-wide bayes?

The goal is to get a copy of the message that is identical to what SA saw when it arrived. For IMAP users, this is easiest to get with a 'missed spam' mailbox into which users can move messages for learning. If you must rely on forwarded submissions, make sure users forward messages as attachments: inline forwarding discards the original headers and reformats the body, so what arrives is no longer the message SA scanned. Have the submission address deliver into a mailbox that is processed to extract the 'message/rfc822' MIME object(s) in those submissions, and learn those, not the submission mail itself.

Learning ham is harder, because generally speaking it is not a good idea to deliver mail that SA believes is spam *at all* unless you can't reject it in SMTP. As a result, users don't have 'false positive' samples to submit (although their irate would-be correspondents could...). In an IMAP environment, you can identify borderline ham that is useful to learn by looking at tagging and archiving. If the user assigns a keyword to a message and/or moves it to a mailbox (other than ones with names like Junk and Spam and Trash) you can usually be sure it is ham. If your users are trainable (it DOES happen...) you might even get them to use specific keywords and/or archival mailboxes and use those to feed ham training. In a POP3 environment, this is a much harder problem to solve.
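The IMAP heuristic above can be sketched as a small predicate. Everything here is an assumption for illustration: the function name looks_like_ham, the list of junk-like mailbox names, and the choice to treat a `$Junk` keyword as disqualifying. How you obtain the mailbox name and flags for each message depends on your mail store and is left out.

```python
import re

# Assumed names; adjust to whatever your clients and server actually use.
JUNK_MAILBOXES = {"junk", "spam", "trash"}
JUNK_KEYWORDS = {"$junk", "junk"}


def looks_like_ham(mailbox: str, flags: set[str]) -> bool:
    """Guess whether a message is ham from where the user filed or tagged it."""
    # Split hierarchical names such as "Archive/2018" or "INBOX.Junk".
    components = [c.lower() for c in re.split(r"[/.]", mailbox) if c]
    if any(c in JUNK_MAILBOXES for c in components):
        return False

    # IMAP system flags start with a backslash (\Seen, \Answered, ...);
    # anything else is a keyword set by the user or the mail client.
    keywords = {f.lower() for f in flags if not f.startswith("\\")}
    if keywords & JUNK_KEYWORDS:
        return False

    filed_elsewhere = mailbox.upper() != "INBOX"
    return filed_elsewhere or bool(keywords)
```

Messages that pass such a check could be queued for `sa-learn --ham`. Some clients set keywords automatically, so a real deployment would likely want an explicit list of keywords that count.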

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole
