Thanks for posting this script; however, it didn't work for me. I did the following:
Sync the journal (sa-learn --rebuild)
Check (sa-learn --dump magic | head -1) for the db version value, which should be 2 - it was
Run db-to-text2.pl -o bayes_toks > bayes_toks.txt
It changed the atime of hundreds of tokens:
Resetting atime of key in the future:
<key>H*c:alternative</key><ts>411973</ts><th>56913</th><atime>1735776000</atime>
......
Resetting atime of key in the future:
<key>listing</key><ts>1278</ts><th>2979</th><atime>1735776000</atime>
.................
Resetting atime of key in the future:
<key>94120-7334</key><ts>4</ts><th>1</th><atime>1735776000</atime>
..................................................................
Resetting atime of key in the future:
<key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
But after I ran db-to-text2.pl -i bayes_toks < bayes_toks.txt and then ran the auto-expire, the expiry still didn't work.
Expiry output after atime adjustment:
debug: bayes: found bayes db version 2
debug: bayes: expiry check keep size, 75% of max: 60000
debug: bayes: expiry keep size too small, resetting to 100,000 tokens
debug: bayes: token count: 4179196, final goal reduction size: 4079196
debug: bayes: First pass? Current: 1092831331, Last: 1092826832, atime: 736899888, count: 0, newdelta: 0, ratio: 0
debug: bayes: something fishy, calculating atime (first pass)
debug: bayes: couldn't find a good delta atime, need more token difference, skipping expire.
debug: Syncing complete.
debug: bayes: 32574 untie-ing
debug: bayes: 32574 untie-ing db_toks
debug: bayes: 32574 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 32574 unlink /usr/local/share/spamassassin/run/bayes.lock
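The numbers in that debug output can be checked by hand. The following is only a rough sketch of the arithmetic the log lines imply, not SpamAssassin's actual code; the maximum of 80000 is an assumption inferred from "75% of max: 60000", and the function name is made up for illustration.

```python
# Values taken from the expiry debug output above.
MIN_KEEP = 100_000        # "resetting to 100,000 tokens"
TOKEN_COUNT = 4_179_196   # "token count"

# Assumption: inferred from "75% of max: 60000" (60000 / 0.75).
MAX_DB_SIZE = 80_000


def expiry_goal(token_count, max_db_size):
    """Return (keep_size, goal_reduction) as the debug lines suggest."""
    keep = max_db_size * 3 // 4      # 75% of the configured maximum
    if keep < MIN_KEEP:              # "expiry keep size too small"
        keep = MIN_KEEP
    return keep, token_count - keep


keep, goal = expiry_goal(TOKEN_COUNT, MAX_DB_SIZE)
print(keep, goal)
```

With these inputs the goal reduction comes out at 4079196, the same figure as in the log, so the size calculation itself looks fine and the run only gives up later, at the atime delta step.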
So I ran it again:
Resetting atime of key in the future:
<key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>
...............................
Resetting atime of key in the future:
<key>UD:htm</key><ts>42786</ts><th>26947</th><atime>1735776000</atime>
....................................................
bayes_toks: 4179192 keys copied
bayes_toks: 146 future-keys reset
The fact that it was still finding atimes in the future on the second pass means that the import of bayes_toks.txt must not be working properly, because I could not find any future atimes when grepping bayes_toks.txt. Expiry still didn't work.
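For anyone who wants to repeat the check on the text dump, here is a small sketch of it in Python. It assumes the dump has one token per line in the same `<key>...</key><ts>...</ts><th>...</th><atime>...</atime>` form that the script prints in its warnings; that layout is a guess from the output above. The second sample line uses a made-up atime purely for illustration.

```python
import re
import time

# Assumed line layout, taken from the warning output of the script.
ATIME_RE = re.compile(r"<key>(.*?)</key>.*?<atime>(\d+)</atime>")


def future_keys(lines, now=None):
    """Return (key, atime) pairs whose atime lies after `now`."""
    if now is None:
        now = int(time.time())
    found = []
    for line in lines:
        match = ATIME_RE.search(line)
        if match and int(match.group(2)) > now:
            found.append((match.group(1), int(match.group(2))))
    return found


sample = [
    "<key>editions</key><ts>253</ts><th>495</th><atime>1735776000</atime>",
    # Hypothetical atime, not from the real database:
    "<key>listing</key><ts>1278</ts><th>2979</th><atime>1092000000</atime>",
]
# 1092831331 is the "Current" value from the expiry debug output.
print(future_keys(sample, now=1092831331))
```

To run it against the real file, pass an open file handle instead of the sample list, for example `future_keys(open("bayes_toks.txt", errors="replace"))`.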
Running sa-learn --dump magic showed that the newest atime is in 2025 and the oldest is in 2000, both of which are incorrect:
0.000 0 2 0 non-token data: bayes db version
0.000 0 632798 0 non-token data: nspam
0.000 0 619738 0 non-token data: nham
0.000 0 4179196 0 non-token data: ntokens
0.000 0 952965268 0 non-token data: oldest atime
0.000 0 1735776000 0 non-token data: newest atime
0.000 0 1092825211 0 non-token data: last journal sync atime
0.000 0 1092836039 0 non-token data: last expiry atime
0.000 0 736899888 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
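The atime values are Unix epoch seconds, so they are easy to turn into dates. A minimal sketch, using only the numbers from the dump above (the helper name is made up for illustration):

```python
from datetime import datetime, timezone


def to_utc(epoch):
    """Format Unix epoch seconds as a UTC date string."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Values copied from the sa-learn --dump magic output.
magic = {
    "oldest atime": 952965268,
    "newest atime": 1735776000,
    "last journal sync atime": 1092825211,
    "last expiry atime": 1092836039,
}

for name, value in magic.items():
    print(f"{name}: {to_utc(value)} UTC")
```

The newest atime, 1735776000, is exactly 2025-01-02 00:00:00 UTC, and the oldest falls in March 2000.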
Any suggestions? No errors are reported after the import back into the database, so I'm not sure what is going wrong here.
Cheers,
Zoe
From: Martin Schröder [mailto:[EMAIL PROTECTED]]
Sent: Tue 17/08/2004 17:59
To: [EMAIL PROTECTED]
Subject: Re: Bayes database
On 2004-08-17 18:28:48 +0200, Andy Spiegl wrote:
> You don't have to delete your bayes database!
>
> In April I had the same problem and I ended up extending and fixing the tool
> http://spamassassin.taint.org/devel/db-to-text.pl.txt
> and posting it to the mailing list. I asked that someone puts it on the
THANKS!
expired old Bayes database entries in 661 seconds
145343 entries kept, 455855 deleted
:-))
Best regards
Martin
--
Martin Schröder, [EMAIL PROTECTED]
ArtCom GmbH, Lise-Meitner-Str 5, 28359 Bremen, Germany
Voice +49 421 20419-44 / Fax +49 421 20419-10
http://www.artcom-gmbh.de
