Theo Van Dinter wrote:

On Fri, Sep 19, 2003 at 10:42:16AM -0400, Pete O'Hara wrote:


bayes_expiry_max_db_size 50000 # this is to force an auto expiry



That's probably going to confuse things a lot.


As the docs say:

      bayes_expiry_max_db_size      (default: 150000)
          What should be the maximum size of the Bayes tokens
          database?  When expiry occurs, the Bayes system will
          keep either 75% of the maximum value, or 100,000
          tokens, whichever has a larger value.  150,000 tokens
          is roughly equivalent to a 8Mb database file.

The code enforces at least 100k tokens in the DB at any time.  Other than
that, you'd have to run with -D to get debug output...
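
For reference, the rule quoted from the docs can be written out as a few lines of Python. This is only a sketch of the documented arithmetic, not SpamAssassin's actual code; the names keep_size, goal_reduction, MIN_KEEP_TOKENS and KEEP_FRACTION are made up for the illustration.

```python
# Illustrative only: mirrors the rule quoted from the docs, not the real SA source.
MIN_KEEP_TOKENS = 100_000   # floor stated in the docs
KEEP_FRACTION = 0.75        # "75% of the maximum value"


def keep_size(max_db_size: int) -> int:
    """Tokens to keep after expiry for a given bayes_expiry_max_db_size."""
    return max(int(max_db_size * KEEP_FRACTION), MIN_KEEP_TOKENS)


def goal_reduction(ntokens: int, max_db_size: int) -> int:
    """How many tokens an expiry run would aim to remove (never negative)."""
    return max(ntokens - keep_size(max_db_size), 0)


if __name__ == "__main__":
    print(keep_size(150_000))               # default setting
    print(keep_size(50_000))                # the setting used in this thread
    print(goal_reduction(165_010, 50_000))  # ntokens reported in this thread
```

With 50000 the 75% figure is 37,500, so the 100,000 floor applies, and a database of 165,010 tokens sits 65,010 tokens above that keep size.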

Yes, I figured that if for some reason the 50k was too low I should end up with 100k, but here I have 165k, and this is what is confusing me.
0.000 0 165010 0 non-token data: ntokens


Oops. I thought that I attached -D output but I messed up. It is attached.

Pete






The problem is that I am not seeing bayes auto expiring tokens.
I am running SA-2.60-rc5, DB_File-1.806

What would cause an old lock file and a bayes_toks.new that is
static (not being written to and just hanging around)? I have seen
users with memory problems that cause this, but they seem to have
mail problems and database access issues that I don't have. The logs
show that the BAYES_XX tests are being used.

[EMAIL PROTECTED] .spamassassin]$ ls -ltr
total 6348
-rw-------   1 vscan    vscan          37 Sep 17 19:29 bayes.lock.mail.testdomain.com.19706
drwxrwxr-x   2 vscan    vscan        4096 Sep 18 09:38 backup
-rw-r-----   1 vscan    vscan        1098 Sep 18 14:39 user_prefs
-rw-------   1 vscan    vscan     1081344 Sep 18 15:32 bayes_toks.new
-rw-------   1 vscan    vscan     4943872 Sep 18 15:36 bayes_toks
-rw-------   1 vscan    vscan       13498 Sep 18 15:36 bayes_journal
-rw-------   1 vscan    vscan     1323008 Sep 18 15:36 bayes_seen
[EMAIL PROTECTED] .spamassassin]$ sa-learn --dump magic > /var/tmp/check.ad

-- I believe auto expiry is enabled, but how do I know for sure?
-- (bayes_auto_expire 1 is in /etc/mail/spamassassin/local.cf, which is
-- being read; see below.) It's not expiring AFAIK. I have
-- bayes_expiry_max_db_size 50000. I know that with such a small size the
-- result should be 100,000 tokens.
[EMAIL PROTECTED] .spamassassin]$ cat /var/tmp/check.ad
0.000          0          2          0  non-token data: bayes db version
0.000          0       7718          0  non-token data: nspam
0.000          0       6448          0  non-token data: nham
0.000          0     165010          0  non-token data: ntokens
0.000          0 1062343120          0  non-token data: oldest atime
0.000          0 1063910443          0  non-token data: newest atime
0.000          0 1063910445          0  non-token data: last journal sync atime
0.000          0 1063725493          0  non-token data: last expiry atime
0.000          0    1382400          0  non-token data: last expire atime delta
0.000          0      66791          0  non-token data: last expire reduction count
[EMAIL PROTECTED] .spamassassin]$ 
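
The atime lines in the dump are easier to read as dates. The sketch below parses a few lines copied from the dump above and converts them. It assumes the atime values are Unix epoch seconds (the magnitudes fit September 2003) and that the third column holds the value; parse_magic and as_utc are hypothetical helper names, not part of sa-learn.

```python
from datetime import datetime, timezone

# Sample lines copied from the dump above.
DUMP = """\
0.000          0     165010          0  non-token data: ntokens
0.000          0 1062343120          0  non-token data: oldest atime
0.000          0 1063910443          0  non-token data: newest atime
0.000          0 1063725493          0  non-token data: last expiry atime
0.000          0    1382400          0  non-token data: last expire atime delta
0.000          0      66791          0  non-token data: last expire reduction count
"""


def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        values[label.strip()] = int(fields.split()[2])
    return values


def as_utc(epoch):
    """Interpret a value as Unix epoch seconds (an assumption) in UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


magic = parse_magic(DUMP)
for key in ("oldest atime", "newest atime", "last expiry atime"):
    print(f"{key:18} {as_utc(magic[key]).isoformat()}")
print("last expire atime delta in days:",
      magic["last expire atime delta"] / 86400)
```

Read this way, the last expire atime delta of 1382400 seconds is exactly 16 days.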

-- unfortunately the sa-learn run blew away the bayes_toks.new, so I can't
-- use that for any debugging
[EMAIL PROTECTED] .spamassassin]$ sa-learn --spam --mbox /var/tmp/spam/spam112 -D
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: Final PATH set to: /usr/bin:/usr/sbin:/bin:/sbin
debug: using "/usr/share/spamassassin" for default rules dir
debug: using "/etc/mail/spamassassin" for site rules dir
debug: using "/home/vscan/.spamassassin/user_prefs" for user prefs file

-- these are due to old or erroneous entries and I assume they aren't a problem 
debug: Failed to parse line in SpamAssassin configuration, skipping: ok_languages
debug: Failed to parse line in SpamAssassin configuration, skipping: ok_locales
debug: Failed to parse line in SpamAssassin configuration, skipping: defang_mime 0
debug: bayes: 25648 tie-ing to DB file R/O /home/vscan/.spamassassin/bayes_toks
debug: bayes: 25648 tie-ing to DB file R/O /home/vscan/.spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: Score set 2 chosen.
debug: Initialising learner
debug: Initialising learner
debug: Syncing Bayes journal and expiring old tokens...
debug: lock: 25648 created /home/vscan/.spamassassin/bayes.lock.mail.testdomain.com.25648
debug: lock: 25648 trying to get lock on /home/vscan/.spamassassin/bayes with 0 retries
debug: lock: 25648 link to /home/vscan/.spamassassin/bayes.lock: link ok
debug: bayes: 25648 tie-ing to DB file R/W /home/vscan/.spamassassin/bayes_toks
debug: bayes: 25648 tie-ing to DB file R/W /home/vscan/.spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: synced Bayes databases from journal in 1 seconds: 554 unique entries (740 total entries)

-- have "bayes_expiry_max_db_size 50000" to try to force an auto expire 
debug: bayes: expiry check keep size, 75% of max: 37500
debug: bayes: expiry keep size too small, resetting to 100,000 tokens
debug: bayes: token count: 165454, final goal reduction size: 65454
debug: bayes: First pass?  Current: 1063914795, Last: 1063725493, atime: 1382400, count: 66791, newdelta: 1410637, ratio: 1.02042655911021
debug: bayes: 25648 untie-ing
debug: bayes: 25648 untie-ing db_toks
debug: bayes: 25648 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 25648 unlink /home/vscan/.spamassassin/bayes.lock

-- why were 155616 tokens kept? should have been 100,000 I thought
debug: expired old Bayes database entries in 84 seconds: 155616 entries kept, 9838 deleted
debug: Syncing complete.
debug: Learning Spam
debug: uri tests: Done uriRE
debug: lock: 25648 created /home/vscan/.spamassassin/bayes.lock.mail.testdomain.com.25648
debug: lock: 25648 trying to get lock on /home/vscan/.spamassassin/bayes with 0 retries
debug: lock: 25648 link to /home/vscan/.spamassassin/bayes.lock: link ok
debug: bayes: 25648 tie-ing to DB file R/W /home/vscan/.spamassassin/bayes_toks
debug: bayes: 25648 tie-ing to DB file R/W /home/vscan/.spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: [EMAIL PROTECTED]: already learnt correctly, not learning twice
Learned from 0 message(s) (1 message(s) examined).
debug: bayes: 25648 untie-ing
debug: bayes: 25648 untie-ing db_toks
debug: bayes: 25648 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 25648 unlink /home/vscan/.spamassassin/bayes.lock
[EMAIL PROTECTED] .spamassassin]$ sa-learn --dump magic > /var/tmp/check.ae

[EMAIL PROTECTED] .spamassassin]$ cat /var/tmp/check.ae
0.000          0          2          0  non-token data: bayes db version
0.000          0       7743          0  non-token data: nspam
0.000          0       6448          0  non-token data: nham
0.000          0     155695          0  non-token data: ntokens
0.000          0 1062504719          0  non-token data: oldest atime
0.000          0 1063915425          0  non-token data: newest atime
0.000          0 1063914958          0  non-token data: last journal sync atime
0.000          0 1063914879          0  non-token data: last expiry atime
0.000          0    1410637          0  non-token data: last expire atime delta
0.000          0       9838          0  non-token data: last expire reduction count
[EMAIL PROTECTED] .spamassassin]$ 
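
To make the discrepancy concrete, here is a small Python check of the numbers in the -D output and the two dumps. It only verifies that the logged figures are consistent with each other; it does not explain why so few tokens were removed. The relation between newdelta and the earlier values is an observation from these numbers, not something taken from the SA source, and the variable names are my own.

```python
# Numbers copied from the -D output and the two dumps in this message.
tokens_before = 165454                    # "token count" in the expiry debug line
goal = 65454                              # "final goal reduction size"
kept, deleted = 155616, 9838              # "entries kept" / "deleted"
last_delta, last_count = 1382400, 66791   # from the first dump
new_delta = 1410637                       # "newdelta" in the First pass line

# kept + deleted accounts for every token that was present
assert kept + deleted == tokens_before
# the goal corresponds to keeping 100,000 tokens
assert tokens_before - goal == 100_000

shortfall = goal - deleted
print("tokens still above the 100,000 keep size:", kept - 100_000)
print("shortfall against the goal:", shortfall)
print("last_count / goal:", last_count / goal)
print("last_delta * last_count // goal:", last_delta * last_count // goal)
```

The run removed 9838 of the 65454 tokens it was aiming for, so the database is still 55616 tokens above the 100,000 keep size. The logged ratio and newdelta match last_count / goal and last_delta * last_count // goal respectively.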
