Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam

David Jones Tue, 13 Feb 2018 10:34:35 -0800

On 02/13/2018 11:45 AM, Horváth Szabolcs wrote:

Reindl Harald [mailto:h.rei...@thelounge.net] wrote:

I think I have no control over what is learnt automatically.

surely, don't do autolearning at all


This is a mail gateway for multiple companies. I'm not supposed to read e-mails 
on it, or to pick out mails that could be used for learning ham.
And I can't ask users to use a "ham" mailbox, because they are not IT experts, 
sometimes they have problems with a simple mail forwarding.

If you aren't allowed to check specific emails with a suspicious subject or that are reported as spam by your users, there's no way you can do your job of accurately filtering email.

Without autolearning and without the help of the end-users, I can't build a 
proper ham bayes database, can I?

SA's autolearning doesn't use the results from BAYES_* rules, since that could make incorrect training even worse, so you are going to have to build local rules or get help from RBLs and other SA plugins to reach the autolearning thresholds.
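To make that point concrete, here is a rough Python model of the autolearn decision. It is only a sketch, not SpamAssassin's code: the Hit class, the rule names and the scores are all made up for illustration, the threshold values are what I recall as the documented defaults for bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam (check them against your installed version), and real SA applies further safeguards that are left out here.

```python
from dataclasses import dataclass

# Assumed defaults of bayes_auto_learn_threshold_nonspam and
# bayes_auto_learn_threshold_spam; verify against your SA version.
THRESHOLD_NONSPAM = 0.1
THRESHOLD_SPAM = 12.0
# Assumed minimum header and body points needed to autolearn as spam.
MIN_AREA_POINTS = 3.0


@dataclass
class Hit:
    name: str     # rule name (hypothetical names in the example below)
    score: float  # points the rule contributed
    area: str     # "header" or "body" (a simplification)


def autolearn_decision(hits):
    """Return "spam", "ham" or "no" for a list of rule hits."""
    # BAYES_* hits are left out, so Bayes never trains itself.
    usable = [h for h in hits if not h.name.startswith("BAYES_")]
    total = sum(h.score for h in usable)
    header = sum(h.score for h in usable if h.area == "header")
    body = sum(h.score for h in usable if h.area == "body")

    if (total >= THRESHOLD_SPAM
            and header >= MIN_AREA_POINTS
            and body >= MIN_AREA_POINTS):
        return "spam"
    if total <= THRESHOLD_NONSPAM:
        return "ham"
    return "no"


# A strong Bayes hit alone does not get the message autolearned as spam.
bayes_only = [Hit("BAYES_99", 3.5, "body"),
              Hit("LOCAL_HYPOTHETICAL_BODY", 0.5, "body")]
print(autolearn_decision(bayes_only))

# Local rules or RBL hits are what push it over the spam threshold.
with_rbl = bayes_only + [Hit("LOCAL_HYPOTHETICAL_URI", 7.5, "body"),
                         Hit("LOCAL_HYPOTHETICAL_RBL", 5.0, "header")]
print(autolearn_decision(with_rbl))
```

In this model the first message stays untrained however sure Bayes is, which is why the non-Bayes rules have to carry the score.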

With non-English email flow, it's more challenging. If no RBLs hit, then you really must train your Bayes properly, which requires some way to accurately determine the ham and spam. You must keep a copy of the ham and spam corpora and be allowed to review suspicious email.

Can you set up a split copy of the email that can redact the recipient or anonymize it enough to allow for review? If not, your filtering is not going to be accurate.
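As a starting point for such a redacted copy, something like the following Python sketch might do. It uses only the standard library; the function name, the header list, the salt and the redacted-...@example.invalid placeholder are my own assumptions, not anything SA provides. It only rewrites the listed recipient headers, so addresses in Received lines or in the body would still need separate handling.

```python
import hashlib
from email import message_from_string

# Headers assumed to carry recipient addresses; extend for your gateway.
RECIPIENT_HEADERS = ("To", "Cc", "Bcc", "Delivered-To",
                     "X-Original-To", "Envelope-To")


def redact_recipients(raw, salt="change-me"):
    """Return a copy of the raw message with recipient headers replaced."""
    msg = message_from_string(raw)
    for name in RECIPIENT_HEADERS:
        values = msg.get_all(name)
        if not values:
            continue
        del msg[name]  # removes every occurrence of this header
        for value in values:
            # A salted hash keeps mail for the same recipient groupable
            # without showing the address itself.
            digest = hashlib.sha256((salt + str(value)).encode("utf-8"))
            msg[name] = "redacted-%s@example.invalid" % digest.hexdigest()[:12]
    return msg.as_string()


sample = ("From: sender@example.com\n"
          "To: victim@example.org\n"
          "Subject: hi\n"
          "\n"
          "body text\n")
redacted = redact_recipients(sample)
print(redacted)
```

The sender, subject and body are left alone on purpose, since those are what you need to see to judge ham versus spam.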

Best regards
   Szabolcs


--
David Jones
