Theo Van Dinter wrote on Thu, 12 Feb 2004 10:54:07 -0500:

> $ db_dump -p -f out .spamassassin/bayes_toks
> $ sa-learn --dump data | perl -nle 'print if ( (split)[3] > time )' > out2

I overlooked something here at first. At a quick glance it looked like a 
sequence, with line 2 depending on line 1, but it isn't. I think just 
doing an
sa-learn --dump data > dump.file
is what I need. I then get everything neatly arranged in columns and just 
need to strip out all the lines with a negative value in the fourth column, 
like this one:
0.958          1          0     -17982  bgiek
Interestingly, all of them seem to be spam tokens and all have -17982.
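
A rough sketch of that filtering step, assuming the column layout of the line above (probability, spam count, ham count, atime, token) and using awk instead of hand editing. The function names and the file name dump.clean are made up for illustration; only "sa-learn --dump data" comes from this thread.

```shell
# Keep only the token lines whose 4th column (the atime) is not negative.
# Reads a dump on stdin and writes the surviving lines unchanged to stdout.
strip_negative_atime() {
    awk '$4 >= 0'
}

# Print each distinct negative atime and how many lines carry it,
# to check whether all the bad entries really share one value.
count_negative_atime() {
    awk '$4 < 0 { c[$4]++ } END { for (v in c) print v, c[v] }'
}

# Hypothetical usage (dump.clean is a placeholder name):
#   sa-learn --dump data | strip_negative_atime > dump.clean
#   sa-learn --dump data | count_negative_atime
```

This only cleans the text dump; turning the cleaned dump back into a database is a separate step.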

Then I need to rebuild the database from that. Michael sent me a script which 
is supposed to do that, and the interesting thing is that it *seems* to 
create a valid db of exactly the same size as before, but the file is not 
byte-for-byte identical. (For testing purposes I dumped from a *valid*, 
non-corrupted db and then recreated it with his tool, so there aren't any 
mistakes I could have introduced by editing.) sa-learn identifies it as a v0 
database and does not show any tokens or other data in it with 
"--dump magic". When I run --force-expire over it, it starts converting the 
db to v2, and after that it still lists no tokens and all four atime values 
show the current time. No errors whatsoever are shown. Michael says his tool 
creates a v2 database, but sa-learn identifies it as v0 and converts it to v2 
without an error. Weird.
I'm going to post his code here once he gives his OK.

Kai

-- 

Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org


