On Thu, 18 Mar 2021 14:01:28 +0100
Matus UHLAR - fantomas wrote:

> >On Wed, 17 Mar 2021 10:42:14 -0400 Kris Deugau wrote:  
> 
> >> My own experience has been that accumulating blobs of ham/spam and
> >> just repeatedly running sa-learn over those works just fine.  It
> >> also reduces the incidence of tokens from somewhat rarer mail
> >> automatically expiring out of Bayes, leading to FPs and FNs.  
> 
> On 17.03.21 22:01, RW wrote:
> >It won't do that by default. You would need to have something removing
> >the signature hashes from the database.  
> 
> oh, yes, it does:
> 
>        bayes_auto_expire             (default: 1)

I meant that sa-learn will ignore mail that's already been trained. So,
by default, rerunning it over a corpus that has already been trained
won't prevent any tokens from expiring.
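
To illustrate the point, here is a toy model in Python. It is only a
sketch and not SpamAssassin's code: the class, its method names and the
timestamps are all made up, and it ignores everything else that can
touch a token (such as scanning new mail). It just shows that if a
message's signature is already recorded as seen, learning it again is a
no-op, so the tokens' access times stay old and expiry still removes
them.

```python
# Toy model only: not SpamAssassin's implementation; all names are hypothetical.
class ToyBayes:
    def __init__(self):
        self.seen = set()        # signatures of messages already learned
        self.token_atime = {}    # token -> last time it was touched

    def learn(self, signature, tokens, now):
        """Learn a message; return False if it was already learned."""
        if signature in self.seen:
            return False         # skipped: token atimes are left untouched
        self.seen.add(signature)
        for tok in tokens:
            self.token_atime[tok] = now
        return True

    def expire(self, cutoff):
        """Drop tokens whose atime is older than cutoff."""
        self.token_atime = {
            t: a for t, a in self.token_atime.items() if a >= cutoff
        }


db = ToyBayes()
first = db.learn("msg-1", ["rare", "token"], now=100)
again = db.learn("msg-1", ["rare", "token"], now=200)  # re-run over same corpus
atime_after_rerun = db.token_atime["rare"]             # unchanged by the re-run
db.expire(cutoff=150)                                  # "rare" is dropped anyway
```

In this model the re-run would only refresh the tokens if something had
removed "msg-1" from the seen set first.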

Redis does support ageing-out signatures, but I don't see why you would
want to retrain on old mail at the expense of losing tokens from
new mail. You'll also end up with a database where very old emails
have been trained many times and recent, more relevant, FPs & FNs have
only been trained once.
