Dave Koontz wrote:
> Theo and all.  I know this topic comes up on occasion, but I am not sure
> I've ever seen an explanation as to why the bayes_seen file is not auto
> pruned along with the bayes db file.  Since tokens expire in the main DB
> file, what is the purpose of having a seen file to unlearn tokens which
> may have long ago been purged?   IMO, it would seem logical to also
> purge the seen file at some sort of cycle so it can't grow so
> excessively large.
> 

To expire an entry from bayes_seen, you have to know that no tokens from
that message remain in the bayes_token database.  Mapping tokens back to
messages is a hard problem, so it wasn't done.  As a result, no one ever
implemented expiry for bayes_seen entries.
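
To make the bookkeeping concrete, here is a rough Python sketch of what a
token-to-message mapping could look like.  This is purely illustrative and is
not how SpamAssassin stores anything: the class SeenIndex and its methods
learn and expire_token are hypothetical names, and the data lives in memory
rather than in a DB file.  It also hints at the cost, since the extra index
has to record every token of every learned message.

```python
# Hypothetical sketch only: NOT SpamAssassin's storage layout or API.
from collections import defaultdict


class SeenIndex:
    """Track which live tokens each learned message contributed."""

    def __init__(self):
        self.msg_tokens = {}                # msg_id -> set of live tokens
        self.token_msgs = defaultdict(set)  # token  -> set of msg_ids

    def learn(self, msg_id, tokens):
        """Record a learned message and the tokens it contributed."""
        toks = set(tokens)
        self.msg_tokens[msg_id] = toks
        for tok in toks:
            self.token_msgs[tok].add(msg_id)

    def expire_token(self, token):
        """Drop a token; return the msg_ids that have no live tokens left."""
        expired = []
        for msg_id in self.token_msgs.pop(token, set()):
            live = self.msg_tokens.get(msg_id)
            if live is None:
                continue  # message entry was already expired
            live.discard(token)
            if not live:
                # No tokens from this message remain, so its "seen"
                # entry could be dropped as well.
                del self.msg_tokens[msg_id]
                expired.append(msg_id)
        return sorted(expired)
```

In this sketch a message only becomes eligible for removal once the last of
its tokens has expired, which is the condition described above.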

Sounds like a good project; there might even be a Bugzilla enhancement
request open already.

Patches are welcome.

Michael



> Theo Van Dinter wrote:
>> On Wed, Sep 19, 2007 at 03:23:50PM -0600, Mr. Gus wrote:
>>   
>>>> The file bayes_seen has grown in size to 256GB!  (274992939008)
>>>> How do I cap the size limit of this file? I want to have it not grow larger
>>>> than, say, 800 MB at the most!
>>>>       
>>> You need to expire old bayes tokens. The limit is set not as a size, but as
>>>     
>> Expiring bayes tokens does nothing to the bayes_seen file.  There is no 
>> expiry
>> for bayes_seen.
>>
>> If the seen file is bigger than you'd like, I'd just rm the file.
>>
>>   
> 
