On 28.01.2015 at 16:52, Axb wrote:
On 01/28/2015 04:38 PM, Reindl Harald wrote:

"bayes_seen" is AFAIK relevant in the context of sa-learn to avoid
re-training the same messages again and again - and it has its own bugs,
because for a few messages it contains random parts of the message
itself, so firing sa-learn at the whole corpus would add these messages
to "bayes_toks" each time

see two example snippets below
hence "bayes_toks" is that large here

-rw------- 1 sa-milt sa-milt 5,4K 2015-01-28 16:34 bayes_journal
-rw------- 1 sa-milt sa-milt 1,3M 2015-01-28 16:12 bayes_seen
-rw------- 1 sa-milt sa-milt  40M 2015-01-28 16:33 bayes_toks
-rw------- 1 sa-milt sa-milt   98 2014-08-21 17:47 user_prefs
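
For illustration, here is a quick way to see the de-duplication that
"bayes_seen" provides - a sketch, not from the original mail; the message
path is hypothetical and the exact output wording may vary by
SpamAssassin version:

  # first run learns the message:
  sa-learn --spam /path/to/some-spam.eml
  #   Learned tokens from 1 message(s) (1 message(s) examined)

  # second run is skipped, because bayes_seen records the message ID:
  sa-learn --spam /path/to/some-spam.eml
  #   Learned tokens from 0 message(s) (1 message(s) examined)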
_________________________________________________

something here does NOT make sense

1.3 MB of bayes_seen against 40 MB of bayes_toks.

someone please correct me if I'm wrong:

AFAIK this probably means you've deleted bayes_seen, so Bayes has lost
its record of what it has processed and will relearn stuff you
already fed it.

no, I explained what happens in the part you stripped from the quote - it randomly contains complete message parts, no matter how often I delete *any file* in the user home and rebuild from scratch

if I delete "bayes_seen", then that happens after a complete reset, with sa-learn.sh using sa-learn to *rebuild from scratch* from the permanently stored raw mails in the "ham" and "spam" folders
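
The sa-learn.sh mentioned above is not shown in this thread; a minimal
sketch of such a rebuild, assuming the "ham" and "spam" folders live in
the user home and that only stock sa-learn options are used:

  #!/bin/sh
  # wipe the whole Bayes database (bayes_toks, bayes_seen, journal)
  sa-learn --clear
  # re-train from the permanently stored raw mails
  sa-learn --ham  /path/to/userhome/ham
  sa-learn --spam /path/to/userhome/spam
  # merge the journal back into the databases
  sa-learn --sync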
