On Thu, 18 Mar 2021 14:01:28 +0100
Matus UHLAR - fantomas wrote:

> >On Wed, 17 Mar 2021 10:42:14 -0400
> >Kris Deugau wrote:
> >
> >> My own experience has been that accumulating blobs of ham/spam and
> >> just repeatedly running sa-learn over those works just fine. It
> >> also reduces the incidence of tokens from somewhat rarer mail
> >> automatically expiring out of Bayes, leading to FPs and FNs.
>
> On 17.03.21 22:01, RW wrote:
> >It wont do that by default. You would need to have something removing
> >the signature hashes from the database.
>
> oh, yes, it does:
>
> bayes_auto_expire (default: 1)

I meant that sa-learn will ignore mail that has already been trained. So, by default, rerunning it over a corpus that has already been trained won't prevent any tokens from expiring.

Redis does support ageing out signatures, but I don't see why you would want to retrain on old mail at the expense of losing tokens from new mail. You'll also end up with a database where very old emails have been trained many times and recent, more relevant, FPs & FNs have only been trained once.
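
To illustrate the point about already-trained mail being skipped, here is a rough toy model in Python. It is not SpamAssassin's actual storage code: the class `ToyBayesStore`, its method names, and the integer timestamps are all made up for this sketch. It only assumes what is described above, namely that a message whose signature is already recorded is ignored on relearning, so its tokens are not refreshed and can still expire.

```python
from collections import Counter


class ToyBayesStore:
    """Hypothetical toy model of a Bayes store; not SpamAssassin's real code."""

    def __init__(self):
        self.seen = set()              # signatures of messages already trained
        self.token_counts = Counter()  # token -> number of messages containing it
        self.token_atime = {}          # token -> last time the token was touched

    def learn(self, msg_id, tokens, now):
        """Train on a message; return False if it was already trained."""
        if msg_id in self.seen:
            # Already trained: the message is skipped and no token is touched.
            return False
        self.seen.add(msg_id)
        for tok in set(tokens):
            self.token_counts[tok] += 1
            self.token_atime[tok] = now
        return True

    def expire(self, cutoff):
        """Drop tokens last touched before cutoff; return how many were dropped."""
        old = [t for t, atime in self.token_atime.items() if atime < cutoff]
        for t in old:
            del self.token_atime[t]
            del self.token_counts[t]
        return len(old)


store = ToyBayesStore()
store.learn("msg-1", ["rare", "token"], now=100)   # first run: trained
store.learn("msg-1", ["rare", "token"], now=900)   # rerun: skipped, atimes unchanged
dropped = store.expire(cutoff=500)                 # both tokens still expire
```

In this model the second `learn` call changes nothing, so rerunning over the same corpus does not protect the tokens unless something first removes the signature from `seen`.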