https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6942

--- Comment #15 from AXB <axb.li...@gmail.com> ---
(In reply to Mark Martinec from comment #14)
> > > > 12240246  non-token data: nspam
> > > >  5076877  non-token data: nham
> > > > used_memory_human:3.10G
> > > Suggesting: bayes_auto_learn_on_error 1
> > 
> > why?
> 
> From the stats it seems to me like a large number of tokens,
> and 3 GB of resident storage is on a high side and probably
> growing still at the same rate (until expiration kicks in).

This is two weeks' worth of tokens, running on a dedicated Redis box with 32 GB
of RAM.
Since the last Redis server upgrade, memory usage has decreased quite a bit;
it was peaking at 4.7 GB of Redis usage.

DB size is pretty stable so it seems expiration has been reliable.


99% of the spam is fed via traps - NOT production flow.
The ham is production ham, learned in autolearn mode, and there's no false
learning. (autolearn_force does wonders with both ham and spam.)
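
For reference, a minimal local.cf sketch of the kind of setup described above
(Redis-backed Bayes with autolearning). The server address, database number,
TTLs, and thresholds here are illustrative assumptions, not the poster's actual
values; only the option names are real SpamAssassin settings:

```
# Bayes on a Redis backend (Mail::SpamAssassin::BayesStore::Redis)
bayes_store_module  Mail::SpamAssassin::BayesStore::Redis
# server/database are placeholders - point at your own Redis instance
bayes_sql_dsn       server=127.0.0.1:6379;database=2
# token/seen expiry, in seconds (example: ~2 weeks)
bayes_token_ttl     1209600
bayes_seen_ttl      1209600

# Autolearning (thresholds shown are illustrative)
bayes_auto_learn                    1
bayes_auto_learn_threshold_nonspam  0.1
bayes_auto_learn_threshold_spam     12.0
# The train-on-error strategy under discussion: only learn messages
# that Bayes misclassified, which slows token growth
# bayes_auto_learn_on_error         1

# autolearn_force is a per-rule tflag, e.g.:
# tflags SOME_RULE  autolearn_force
```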

> The bayes_auto_learn_on_error can reduce the growth rate
> substantially, without sacrificing much on the quality of
> results. Some studies even indicated that a learn_on_error
> strategy increased the classification quality (but I won't
> speculate on that here).

I'm not worried about size, and speed is so fast that it could only be beaten
by turning off Bayes completely.

As to classification quality, I see no errors in either direction.
I don't quite see how this setting could make things even better, but I'm open
to education.
