Re: How to automatically train each users Bayes?

RW Fri, 27 Mar 2015 14:40:59 -0700

On Fri, 27 Mar 2015 20:03:18 +0100
Michael wrote:

> On 27.03.2015 19:09, RW wrote:
> > On Fri, 27 Mar 2015 15:16:13 +0000

> > "cur" doesn't imply that the mail has been read; for that you
> > need to check the seen flag in the filename, an S somewhere after
> > the colon.
> 
> Yes, that's true. But if I'm right, new mails stay in "new" until the
> appropriate folder in the IMAP client has been opened, right? I just
> assume, if the user has some false negatives in the folder, he will
> either immediately delete it or just move it into the Spam folder.

People can have mail clients running unattended in the background,
often on multiple devices, so you can't assume it's been seen by a
human.
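
As a rough illustration of the seen-flag check, here is a minimal Python
sketch. It assumes the standard Maildir ":2," info suffix; the function
names are made up for this example and are not part of any tool
mentioned in this thread.

```python
def maildir_flags(filename):
    """Return the set of Maildir flag letters found in a message filename."""
    # The flags follow the ":2," info marker at the end of the name.
    # rpartition returns an empty separator when the marker is absent,
    # which is the usual case for files that are still in "new".
    _, sep, info = filename.rpartition(":2,")
    if not sep:
        return set()
    return set(info)


def is_seen(filename):
    """True only if the S (seen) flag is set after the colon."""
    return "S" in maildir_flags(filename)


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    for name in ("1427482859.M1P2.host:2,RS",
                 "1427482859.M1P2.host,S=1234:2,",
                 "1427482859.M1P2.host"):
        print(name, is_seen(name))
```

Note that some servers put size fields such as "S=1234" before the
colon, so searching the whole filename for an S would give wrong
answers; only the part after ":2," counts.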

> > You could also supplement spam training by autolearning only spam,
> > e.g. I have:
> > 
> > bayes_auto_learn 1
> > bayes_auto_learn_on_error 1
> > bayes_auto_learn_threshold_nonspam -2000.0
> 
> But that learns spam only if its score is above 12.0. And learns no
> nonspam.

That's why I suggested using it to "*supplement* spam training". When it
works, autotraining does have the advantage of happening in real-time.

> And then maybe the default config which auto learns spam and
> ham is already the best...

The default doesn't learn ham well; I'd only do that as a last resort.

> My setup is already configured to retrain when the user moves mail from
> Inbox to Spam or from Spam to another folder.

This is a really poor way of training Bayes because it trains on SA
misclassifications rather than Bayes misclassifications. It's a poor
way of training spam and very much worse at training ham.  

On Fri, 27 Mar 2015 20:14:03 +0100
Matus UHLAR - fantomas wrote:

> >On 27.03.2015 19:54, Matus UHLAR - fantomas wrote:
> >> the easiest way is to train on false positives and false negatives.
> >> dovecot imapd has plugin to train when mail is moved from/to spam.
> 
> On 27.03.15 20:10, Michael wrote:
> >My concerns are the following:
> >Sometimes a new kind of spam appears. This new kind often gets low
> >scores so that they are just 0.1 to 0.5 points above the limit. And
> >the auto learner gets no hit.
> >If the same spam then comes from another sending server, the score is
> >just a little bit below the border so that I'm getting a
> >false-negative. If the previous spam would have already been
> >learned, the second mail would have been scored as spam.
> 
> I don't get this. 

By the sound of it the OP is already using the dovecot plugin or
equivalent.

The first spam wasn't autolearned and was correctly identified as
spam. In this case the plugin doesn't provide a way of training it,
even if it has BAYES_00, because it's already in the spam folder.
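
To make the distinction concrete, here is a minimal Python sketch of
selecting mail for training by the Bayes result rather than by the
overall SA verdict. It assumes the BAYES_nn rule name appears in the
tests= list of the X-Spam-Status header; the function names, the cutoff
of 50, and the choice to train when no Bayes rule hit are all
assumptions for illustration, not SpamAssassin behaviour.

```python
import re

# Matches rule names such as BAYES_00, BAYES_50 or BAYES_99.
BAYES_RULE = re.compile(r"\bBAYES_(\d{2,3})\b")


def bayes_bucket(spam_status):
    """Return the number from the first BAYES_nn rule, or None if absent."""
    match = BAYES_RULE.search(spam_status)
    return int(match.group(1)) if match else None


def needs_bayes_training(spam_status, is_spam):
    """Decide whether a human-verified message is worth training.

    is_spam is the human verdict. The overall SA verdict is ignored on
    purpose: only the Bayes result is compared with the human verdict.
    """
    bucket = bayes_bucket(spam_status)
    if bucket is None:
        # Assumption: no Bayes rule hit, so train anyway.
        return True
    if is_spam:
        # Spam that Bayes rated neutral or ham-like.
        return bucket <= 50
    # Ham that Bayes rated neutral or spam-like.
    return bucket >= 50


if __name__ == "__main__":
    # Hypothetical header value: SA says spam, but Bayes said BAYES_00.
    status = "Yes, score=5.3 required=5.0 tests=BAYES_00,HTML_MESSAGE autolearn=no"
    print(needs_bayes_training(status, is_spam=True))
```

With a check like this, the first spam in the example would be picked
up for training even though it was already filed correctly in the spam
folder.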

People keep recommending the plugin, but IMO it's a poor choice for
SpamAssassin.
