Theo Van Dinter wrote on Thu, 12 Feb 2004 13:11:55 -0500:

> > What does this mean? That most tokens are within the same time range
> > or that most tokens are way too old??? How can I figure this out?
> 
> Well, the data listing there tells you.

Tells you, not me ;-) I can read that stuff to a certain extent, but only  
understand portions of it.

> The tokens in your DB are all
> over 256 days old.

This is simply "impossible" because auto-learned items are added daily and 
I also train it on a spam mailbox sometimes. However, it's possible that a 
great portion of the db is quite old considering the fact that it didn't 
expire for a while and we learned several thousand spam and ham mails at 
the beginning.

> 
> > 0.000          0     -17982          0  non-token data: newest atime
> 
> That's not possible.

I have "-17982" on three machines, always the same value. This db started 
out as Bayes DB version 1 (or 0?), possibly with SA 2.43, then was carried 
over to two other machines, which were then upgraded through the 2.5x and 
2.6x versions.
There's also no "oldest atime". Wouldn't that suggest that possibly all 
dates are in the future?

When I run sa-learn --dump data, what do I need to look for? Everything 
over 1076607731 (= last expiry atime, so near the current date)?

0.958          1          0 1051805273  low_interest
0.206         17         12 1075495400  HX-MIMETrack:Release

For instance, the records above: are these valid records/dates? If so, then 
I'm wondering why it can't display an oldest atime (if I understand 
correctly what atime means). What's the exact meaning of "atime"? Is this 
the time when the token was added to the db? I think the times above are in 
the past, so it should be able to show an oldest atime, shouldn't it?

I'm sure there is a command which converts that Unix timestamp (assuming 
it is one) to something human-readable, but I don't know it.
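In case it helps anyone following along: assuming the atime column really is a plain Unix timestamp (seconds since 1970-01-01 UTC), a quick sketch in Python converts it (GNU date can do the same with `date -d @1128239545`):

```python
import time

# Convert a Unix timestamp to a human-readable date.
# 1128239545 is one of the suspicious atimes from the dump above.
print(time.ctime(1128239545))   # local time
print(time.asctime(time.gmtime(1128239545)))  # UTC
```

Both lines should land in October 2005, i.e. well in the future relative to the February 2004 date of this mail.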

I read the dumping instructions etc. in
Message-ID: <[EMAIL PROTECTED]>
Didn't understand everything, though.
I now have a readable dump of the incorrect records (at least I hope I 
do).
Most of these records seem to be way in the future:
0.518        219         37 1128239545  review
0.978          2          0 1104581966  8:ѣ
0.958          1          0 1128052147  lkalowhbrd
0.994          8          0 1093712392  WEST
0.942         90          1 1128239545  REQUIRED

Couldn't I simply remove these from bayes_toks or "out"? I'm not keen on 
fixing them. It's only about 50 KB.
So remove the token and any lines until the next token? Is that the 
correct thing to do? (Next thing then: learn how to convert this back to 
bayes_toks.)
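A sketch of that removal step, under two assumptions not confirmed anywhere in this thread: that each token record in the readable dump is one line laid out as prob / nspam / nham / atime / token (matching the sample lines above), and that 1076607731 (the last expiry atime) is a sane cutoff. The helper name `is_future` is mine, not part of any SA tool:

```python
# Hypothetical filter for a readable sa-learn dump: flag token records
# whose atime is later than the last expiry atime (i.e. in the future).
CUTOFF = 1076607731  # last expiry atime from the stats earlier in this mail

def is_future(line, cutoff=CUTOFF):
    fields = line.split(None, 4)  # prob, nspam, nham, atime, token (assumed)
    if len(fields) < 5:
        return False  # not a token record; leave it alone
    try:
        atime = int(fields[3])
    except ValueError:
        return False
    return atime > cutoff

lines = [
    "0.518        219         37 1128239545  review",
    "0.206         17         12 1075495400  HX-MIMETrack:Release",
]
print([is_future(l) for l in lines])  # → [True, False]
```

Writing the non-flagged lines back out and restoring the result is left open here, since I'm not sure of the exact restore invocation for this DB version.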

straight.php
 \e0\c4\97\cf>
 \db\d5\d4\cb\c9
 \f0\91\05\dc>
 
For instance, should I remove that entry completely?
What is CVVV / CV?

Thanks,

Kai

-- 

Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org


