Theo Van Dinter wrote on Thu, 12 Feb 2004 13:11:55 -0500:

> > What does this mean? That most tokens are within the same time range or that
> > most tokens are way too old ??? How can I figure this out?
>
> Well, the data listing there tells you.

Tells you, not me ;-) I can read that stuff to a certain extent, but only understand portions of it.

> The tokens in your DB are all
> over 256 days old.

This is simply "impossible" because auto-learned items are added daily and I also have it learn from a spam mailbox sometimes. However, it's possible that a great portion of the db is quite old, considering that it didn't expire for a while and we learned several thousand spam and ham mails at the beginning.

> > 0.000 0 -17982 0 non-token data: newest atime
>
> That's not possible.

I have "-17982" on three machines, always the same value. This db started out as Bayes DB version 1 (or 0?), possibly with SA 2.43, then was carried over to two other machines, and they also got upgraded to 2.5x and 2.6x versions consecutively. There's also no "oldest atime". Wouldn't that suggest that possibly all dates are in the future?

When I do a sa-learn --dump data, what do I need to look for? Everything over 1076607731 (= last expiry atime, so near current date)?

  0.958 1 0 1051805273 low_interest
  0.206 17 12 1075495400 HX-MIMETrack:Release

For instance, are the records above valid records/dates? If so, then I'm wondering why it can't display an oldest atime (if I understand correctly what atime means). What's the exact meaning of "atime"? Is this the time when the token was added to the db? I think the times above are in the past, so it should be able to show an oldest atime, shouldn't it? I'm sure there is a command which converts that Unix timestamp (assuming it is one) to something human-readable, but I don't know it.

I read the dumping instructions etc. in Message-ID: <[EMAIL PROTECTED]>. Didn't understand everything, though. I now have a readable dump of the incorrect records (at least I hope I have).

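On the timestamp question: a minimal sketch in Python, assuming the fourth column of the dump lines is a Unix timestamp in seconds since 1970-01-01 UTC. The helper name `to_utc` is made up for this illustration and is not part of SpamAssassin.

```python
from datetime import datetime, timezone

def to_utc(atime: int) -> str:
    """Format a Unix timestamp (seconds since the epoch) as a UTC date string."""
    return datetime.fromtimestamp(atime, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")

# Timestamps quoted in this message: two token atimes and the last expiry atime.
for atime in (1051805273, 1075495400, 1076607731):
    print(atime, "->", to_utc(atime))
```

Under that assumption, 1076607731 comes out as 2004-02-12 17:42:11 UTC, which fits "near current date", and the two token atimes fall on 2003-05-01 and 2004-01-30, so both lie in the past.
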
Most of these records seem to be way in the future:

  0.518 219 37 1128239545 review
  0.978 2 0 1104581966 8:ѣ
  0.958 1 0 1128052147 lkalowhbrd
  0.994 8 0 1093712392 WEST
  0.942 90 1 1128239545 REQUIRED

Couldn't I simply remove these from bayes_toks or "out"? I'm not keen on fixing them. It's only about 50 KB. So remove the token and any lines until the next token? Is that the correct thing to do? (Next thing then: learn how to convert this back to bayes_toks.)

  straight.php
  \e0\c4\97\cf>
  \db\d5\d4\cb\c9
  \f0\91\05\dc>

For instance, remove that completely? What is CVVV / CV?

Thanks,

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org
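To check which dumped records carry an atime in the future, something like the following Python sketch could work. It assumes each token line of `sa-learn --dump data` has the layout seen in this message (probability, spam count, ham count, atime, token) and skips anything that does not parse. The name `future_tokens` and the sample reference time are illustrative only, not SpamAssassin functions.

```python
def future_tokens(lines, now):
    """Return (atime, token) pairs whose atime lies after `now` (Unix seconds)."""
    hits = []
    for line in lines:
        # Assumed layout: probability, spam count, ham count, atime, token.
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # not a token record
        try:
            atime = int(parts[3])
        except ValueError:
            continue  # fourth column is not a number
        if atime > now:
            hits.append((atime, parts[4].rstrip("\n")))
    return hits

sample = [
    "0.958 1 0 1051805273 low_interest",
    "0.206 17 12 1075495400 HX-MIMETrack:Release",
    "0.518 219 37 1128239545 review",
    "0.994 8 0 1093712392 WEST",
]
# 1076607731 is the last expiry atime quoted in this message, used here as "now".
for atime, token in future_tokens(sample, now=1076607731):
    print(atime, token)
```

In real use the lines would come from the saved dump file and `now` from `int(time.time())`. With the sample above, only the "review" and "WEST" records are reported.
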
