Title: Re: Bayes database
Thanks for posting this script. Unfortunately, it didn't work for me. I did the following:
 
Synced the journal (sa-learn --rebuild)
Checked the db version (sa-learn --dump magic | head -1); it should be 2, and it was
Ran db-to-text2.pl -o bayes_toks > bayes_toks.txt
It changed the atime of hundreds of tokens:
 
Resetting atime of key in the future:
 <key>H*c:alternative</key><ts>411973</ts><th>56913</th><atime>1735776000</atime>
......
Resetting atime of key in the future:
 <key>listing</key><ts>1278</ts><th>2979</th><atime>1735776000</atime>
.................
Resetting atime of key in the future:
 <key>94120-7334</key><ts>4</ts><th>1</th><atime>1735776000</atime>
..................................................................
Resetting atime of key in the future:
 <key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
But after I ran db-to-text2.pl -i bayes_toks < bayes_toks.txt and then ran the auto-expire, the expiry still didn't work.
 
Expiry output after atime adjustment:
 
debug: bayes: found bayes db version 2
debug: bayes: expiry check keep size, 75% of max: 60000
debug: bayes: expiry keep size too small, resetting to 100,000 tokens
debug: bayes: token count: 4179196, final goal reduction size: 4079196
debug: bayes: First pass?  Current: 1092831331, Last: 1092826832, atime: 736899888, count: 0, newdelta: 0, ratio: 0
debug: bayes: something fishy, calculating atime (first pass)
debug: bayes: couldn't find a good delta atime, need more token difference, skipping expire.
debug: Syncing complete.
debug: bayes: 32574 untie-ing
debug: bayes: 32574 untie-ing db_toks
debug: bayes: 32574 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 32574 unlink /usr/local/share/spamassassin/run/bayes.lock
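
The token counts in the debug output above are consistent with some simple arithmetic, sketched here in Python. This only reproduces the numbers in the log and is not SpamAssassin's actual code. The maximum db size of 80000 is an assumption inferred from "75% of max: 60000", and the function name expiry_goal is made up for illustration.

```python
KEEP_FRACTION = 0.75   # "expiry check keep size, 75% of max"
MIN_KEEP = 100_000     # "expiry keep size too small, resetting to 100,000 tokens"

def expiry_goal(token_count, max_db_size):
    """Return (keep_size, goal_reduction) as the log lines suggest."""
    keep = int(max_db_size * KEEP_FRACTION)
    if keep < MIN_KEEP:
        keep = MIN_KEEP
    return keep, token_count - keep

# token count taken from the log; max_db_size=80000 is inferred, not confirmed
keep, reduction = expiry_goal(4_179_196, 80_000)
print(keep, reduction)
```

With those inputs the keep size is bumped to 100000, and the goal reduction comes out at 4079196, matching the "final goal reduction size" line.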
 
So I ran db-to-text2.pl again:
 
Resetting atime of key in the future:
 <key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
Resetting atime of key in the future:
 <key>UD:htm</key><ts>42786</ts><th>26947</th><atime>1735776000</atime>
....................................................
bayes_toks: 4179192 keys copied
bayes_toks: 146 future-keys reset
 
The fact that it found atimes in the future on the second pass suggests that the import of bayes_toks.txt is not working properly, because a grep on bayes_toks.txt turned up no future atimes.  Expiry still didn't work.
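
A check for future atimes in the text dump can be made more precise than a plain grep. Below is a minimal sketch in Python (not part of db-to-text2.pl). It assumes the dump uses the same <key>..</key><ts>..</ts><th>..</th><atime>..</atime> line format that appears in the warnings above. The function name future_tokens and the atime on the second sample line are made up for illustration.

```python
import re
import time

# Line format assumed from the "Resetting atime" warnings shown earlier.
TOKEN_RE = re.compile(
    r"<key>(?P<key>.*?)</key><ts>(?P<ts>\d+)</ts>"
    r"<th>(?P<th>\d+)</th><atime>(?P<atime>\d+)</atime>"
)

def future_tokens(lines, now=None):
    """Return (key, atime) pairs whose atime is later than `now`."""
    now = int(time.time()) if now is None else now
    hits = []
    for line in lines:
        m = TOKEN_RE.search(line)
        if m and int(m.group("atime")) > now:
            hits.append((m.group("key"), int(m.group("atime"))))
    return hits

sample = [
    " <key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>",
    # hypothetical atime, only here to show a line that is not reported
    " <key>listing</key><ts>1278</ts><th>2979</th><atime>1092000000</atime>",
]
# Compare against the "last expiry atime" from the dump rather than the clock.
print(future_tokens(sample, now=1092836039))
```

To scan a real dump, pass an open file instead of the sample list, for example future_tokens(open("bayes_toks.txt", errors="replace")).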
 
Running sa-learn --dump magic showed that the newest atime is in 2025 and the oldest is in 2000, both of which are incorrect:
 
0.000          0          2          0  non-token data: bayes db version
0.000          0     632798          0  non-token data: nspam
0.000          0     619738          0  non-token data: nham
0.000          0    4179196          0  non-token data: ntokens
0.000          0  952965268          0  non-token data: oldest atime
0.000          0 1735776000          0  non-token data: newest atime
0.000          0 1092825211          0  non-token data: last journal sync atime
0.000          0 1092836039          0  non-token data: last expiry atime
0.000          0  736899888          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
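
The atime values in the dump are Unix epoch seconds. For anyone wanting to double-check the years, here is a small Python sketch that decodes them to UTC. The values are copied from the dump above, and the helper name epoch_to_utc is made up for illustration.

```python
from datetime import datetime, timezone

# Values copied from the "sa-learn --dump magic" output above.
MAGIC = {
    "oldest atime": 952965268,
    "newest atime": 1735776000,
    "last journal sync atime": 1092825211,
    "last expiry atime": 1092836039,
}

def epoch_to_utc(ts):
    """Format epoch seconds as a UTC date string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

for name, ts in MAGIC.items():
    print(f"{name:25s} {ts:>10d}  {epoch_to_utc(ts)}")
```
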
Any suggestions?  No errors are reported after the import back into the database, so I'm not sure what is going wrong here.
 
Cheers,
Zoe



From: Martin Schröder [mailto:[EMAIL PROTECTED]
Sent: Tue 17/08/2004 17:59
To: [EMAIL PROTECTED]
Subject: Re: Bayes database

On 2004-08-17 18:28:48 +0200, Andy Spiegl wrote:
> You don't have to delete your bayes database!
>
> In April I had the same problem and I ended up extending and fixing the tool
> http://spamassassin.taint.org/devel/db-to-text.pl.txt
> and posting it to the mailing list.  I asked that someone puts it on the

THANKS!

expired old Bayes database entries in 661 seconds
145343 entries kept, 455855 deleted

:-))

Best regards
        Martin
--
               Martin Schröder, [EMAIL PROTECTED]
     ArtCom GmbH, Lise-Meitner-Str 5, 28359 Bremen, Germany
          Voice +49 421 20419-44 / Fax +49 421 20419-10
                    http://www.artcom-gmbh.de


