Adam Katz <antis...@khopis.com> writes:

> Micah Anderson wrote:
>>> Also, to see how experienced your Bayes knowledge is - use "$ sa-learn
>>> --dump magic"
>> 
>> This shows me that I have no idea what these magic things are :) Does
>> this tell you anything useful? 
>> 
>> 0.000          0          3          0  non-token data: bayes db version
>> 0.000          0    6798614          0  non-token data: nspam
>> 0.000          0   19136753          0  non-token data: nham
>> 0.000          0 1063157695          0  non-token data: ntokens
>> 0.000          0 1241301616          0  non-token data: oldest atime
>> 0.000          0 1241416889          0  non-token data: newest atime
>> 0.000          0          0          0  non-token data: last journal sync atime
>> 0.000          0 1241344830          0  non-token data: last expiry atime
>> 0.000          0      43200          0  non-token data: last expire atime delta
>> 0.000          0     496607          0  non-token data: last expire reduction count
>
> Eh?  Last journal sync atime is Jan 1 1970?
> Try running:   sa-learn --sync

Doesn't seem to change the 'last journal sync atime' from 0.

> If that helps, put it in your nightly SpamAssassin cron job
> (and/or revisit your custom teaching scripts).

In fact, I've been running that from cron every night. 

I'm using a MySQL DB and I've got the following set in my local.cf:

# We want to expire via cronjob, rather than having one of our spamd
# children do it. 
bayes_auto_expire                  0

# no effect
bayes_learn_to_journal             0

> A quick primer (since this doesn't really exist anywhere...):  The
> three zeroed columns are always zero.
>
> bayes db version is self-explanatory.
> nspam is the number of spam messages on record.  bayes needs >200.

Should be fine: 6798649

> nham is the number of ham messages on record.  bayes needs >200.

Also should be fine: 19160960

> ntokens is the number of 'words' noted in the system.

lots of tokens: 1065483803

> oldest atime is the oldest access time of the oldest token (I think).

I've got 1241474416, which would be Mon May  4 15:00:16 PDT 2009,
which is just yesterday... it doesn't seem right that this would be
the oldest access time, especially for 1065483803 tokens!
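
For anyone who wants to check these values: the atime fields are Unix
epoch seconds. Here is a minimal Python sketch of the conversion. The
helper name epoch_to_utc is made up for illustration, and it prints UTC
rather than my local PDT:

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> str:
    """Format Unix epoch seconds as a UTC timestamp string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S UTC"
    )


# Values taken from the dump output in this thread.
for label, value in [
    ("last journal sync atime", 0),
    ("oldest atime", 1241474416),
]:
    print(f"{label}: {epoch_to_utc(value)}")
```

The zero value comes out as 1970-01-01 00:00:00 UTC, which is the
"Jan 1 1970" Adam noticed, and 1241474416 comes out as 2009-05-04
22:00:16 UTC, i.e. 15:00:16 PDT.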

> the rest of the times should be self-explanatory.
> last expire reduction count is the number of tokens removed from the
> last expiration run (I think).

Ok, that seems to be counting, so something is being expired:

0.000          0     840628          0  non-token data: last expire reduction count
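
Each entry in the dump has the same shape: four numeric columns (only
the third carries a value), then a "non-token data:" label. Here is a
rough Python sketch for pulling those values into a dict. It assumes
each entry sits on one unwrapped line. The name parse_magic and the
200-message check (taken from Adam's note) are illustrative only, not
part of SpamAssassin:

```python
import re

# Score column, a zero, the value, another zero, then the label.
MAGIC_LINE = re.compile(
    r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data:\s*(.+?)\s*$"
)


def parse_magic(text: str) -> dict:
    """Map each 'non-token data' label to its integer value."""
    values = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values


# Sample lines copied from the dump earlier in this thread.
sample = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0    6798614          0  non-token data: nspam
0.000          0   19136753          0  non-token data: nham
0.000          0     496607          0  non-token data: last expire reduction count
"""

magic = parse_magic(sample)
# Adam's rule of thumb: Bayes needs more than 200 spam and 200 ham.
trained = magic["nspam"] > 200 and magic["nham"] > 200
print(magic, trained)
```

Lines that a mail client has wrapped (like the ones quoted above) will
not match the pattern, so this would need the raw command output.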

This is all very interesting info, I appreciate the
explanation. However, my original question still stands.

micah
