Dave Koontz wrote:
> Theo and all.  I know this topic comes up on occasion, but I am not sure
> I've ever seen an explanation as to why the bayes_seen file is not auto
> pruned along with the bayes db file.  Since tokens expire in the main DB
> file, what is the purpose of having a seen file to unlearn tokens which
> may have long ago been purged?  IMO, it would seem logical to also
> purge the seen file at some sort of cycle so it can't grow so
> excessively large.
>
In order to expire from bayes_seen you have to know that there are no
longer any tokens from a given msg in the bayes_token database. This is
a hard problem, mapping tokens to msgs, so it wasn't done. Likewise no
one ever did anything about expiring the bayes_seen entries.

Sounds like a good project, there might even be a bugzilla enhancement
opened already. Patches are welcome.

Michael

> Theo Van Dinter wrote:
>> On Wed, Sep 19, 2007 at 03:23:50PM -0600, Mr. Gus wrote:
>>
>>>> The file bayes_seen has grown in size to 256GB! (274992939008)
>>>> How do I cap the size limit of this file? I want to have it not grow larger
>>>> then say 800mb at the most!
>>>>
>>> You need to expire old bayes tokens. The limit is set not as a size, but as
>>>
>> Expiring bayes tokens does nothing to the bayes_seen file. There is no
>> expiry
>> for bayes_seen.
>>
>> If the seen file is bigger than you'd like, I'd just rm the file.
>>
>>
>
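
To make the point about mapping tokens to msgs concrete, here is a minimal toy sketch in Python. It is not SpamAssassin code and does not reflect its on-disk formats; the class name `ToySeenStore`, its methods, and especially the `msg_tokens` reverse index are all hypothetical. The sketch only illustrates the extra bookkeeping that expiring seen entries would seem to require: a per-message record of which tokens it contributed, so that a seen entry can be dropped once none of those tokens survive.

```python
from collections import defaultdict


class ToySeenStore:
    """Hypothetical toy model; NOT SpamAssassin's actual storage or API."""

    def __init__(self):
        self.tokens = {}  # token -> last-touched time (stand-in for the token db)
        self.seen = {}    # msgid -> label (stand-in for bayes_seen)
        # Assumed extra reverse index: msgid -> tokens learned from that message.
        self.msg_tokens = defaultdict(set)

    def learn(self, msgid, words, label, now):
        """Record a message as seen and touch each of its tokens."""
        self.seen[msgid] = label
        for word in words:
            self.tokens[word] = now
            self.msg_tokens[msgid].add(word)

    def expire_tokens(self, cutoff):
        """Drop tokens last touched before `cutoff`."""
        for token in [t for t, atime in self.tokens.items() if atime < cutoff]:
            del self.tokens[token]

    def expire_seen(self):
        """Drop seen entries whose tokens are all gone; return their msgids."""
        stale = sorted(
            msgid
            for msgid, toks in self.msg_tokens.items()
            if not any(t in self.tokens for t in toks)
        )
        for msgid in stale:
            del self.seen[msgid]
            del self.msg_tokens[msgid]
        return stale


store = ToySeenStore()
store.learn("m1", ["viagra", "cheap"], "s", now=1)
store.learn("m2", ["cheap", "meeting"], "h", now=5)

store.expire_tokens(cutoff=3)   # removes "viagra" only; "cheap" was touched at 5
print(store.expire_seen())      # [] because "cheap" still ties m1 to the token db

store.expire_tokens(cutoff=10)  # removes the remaining tokens
print(store.expire_seen())      # ['m1', 'm2']
```

Note that a single shared token keeps an old message's seen entry alive, and the reverse index itself grows with every learned message, which is presumably part of why this is a hard problem rather than a quick patch.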