Thanks, Theo. I would love to run my bayes_toks through db_dump and fix the "broken" records. However, I am not familiar with the format. Is there an existing script, or a site, that will let me properly remove the entries with bad atime values?
thanks

adam

On Tue, 2004-02-10 at 11:44, Theo Van Dinter wrote:
> On Tue, Feb 10, 2004 at 09:31:34AM -0500, Adam Denenberg wrote:
> > debug: bayes: expiry check keep size, 75% of max: 750000
>
> Ok, so your max size is 1_000_000 tokens.
>
> > debug: bayes: token count: 2588992, final goal reduction size: 1838992
>
> Your DB says you have ~2.6m tokens, so to get to the goal of 750k tokens,
> you need to remove ~1.8m tokens.
>
> > debug: bayes: First pass? Current: 1076421753, Last: 1076394691, atime:
> > 1382400, count: 1019, newdelta: 765, ratio: 1804.70264965653
>
> Not looking at the other things, the ratio is way off, so expiry isn't going
> to work.
>
> > debug: bayes: atime     token reduction
> > debug: bayes: ========  ===============
> > debug: bayes: 43200     2595384
> > debug: bayes: 86400     2595384
> > debug: bayes: 172800    2595384
> > debug: bayes: 345600    2595384
> > debug: bayes: 691200    2595384
> > debug: bayes: 1382400   2595384
> > debug: bayes: 2764800   2595384
> > debug: bayes: 5529600   2595384
> > debug: bayes: 11059200  2595384
> > debug: bayes: 22118400  2595384
>
> The interesting thing here is that you only have 2588992 tokens in the DB
> (magic token), but the atime/reduction chart shows 2595384 being removed
> (actual loop through DB tokens)... What's up with that?
>
> What the above chart says is that no matter what atime you use, you'll
> be expirying too many tokens. Now, the atime deltas here are populated
> sets via newest_atime - token_atime. Since your newest atime is far
> far in the future as Matt already pointed out (1134906269 == Sun Dec
> 18 06:44:29 2005 EST), all of your tokens are "older" than 256 days
> (last line in the chart).
>
> So ... I would do 2 things. 1) fix the db. unless you're _very sure_
> about the internal db format, "rm bayes_*". if you are used to the
> format, do a db_dump, edit the output and modify the "future" token
> atimes to be something more reasonable, modify the newest atime magic
> token, do a db_load. 2) if you save your messages, find the one that
> caused the problem and attach it to the ticket specified below...
>
> FYI: For 3.0.0, I just put in some code that stops this kind of thing from
> happening (if the calculated message atime is determined to be more than
> 1 day in the future, it just uses the current time() value instead).
> If a 2.64 release happens, the fix will probably go in there too:
> http://bugzilla.spamassassin.org/show_bug.cgi?id=3025
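
For reference, the arithmetic and the "future atime" guard described in the quoted message can be sketched in a few lines of Python. This is only an illustration, not SpamAssassin's code (which is Perl), and it does not read or write the on-disk bayes_toks format. The function names, the strict ">" comparisons, and the plain list of atimes are assumptions made for the example.

```python
import time

ONE_DAY = 86400  # seconds


def reduction_goal(token_count, max_db_size, keep_fraction=0.75):
    """Tokens to remove so that keep_fraction of max_db_size remain."""
    keep = int(max_db_size * keep_fraction)
    return max(token_count - keep, 0)


def clamp_future_atime(atime, now=None, max_skew=ONE_DAY):
    """Mimic the guard described for 3.0.0: if atime is more than
    max_skew seconds in the future, use the current time instead."""
    if now is None:
        now = int(time.time())
    return now if atime > now + max_skew else atime


def tokens_older_than(atimes, newest_atime, delta):
    """Count tokens whose age (newest_atime - token_atime) exceeds delta."""
    return sum(1 for a in atimes if newest_atime - a > delta)


if __name__ == "__main__":
    now = 1076421753          # "Current" value from the debug output
    bogus_newest = 1134906269  # the far-future newest atime
    sample_atimes = [1076394691, 1076000000, 1075000000]  # made-up tokens

    print(reduction_goal(2588992, 1_000_000))
    print(tokens_older_than(sample_atimes, bogus_newest, 22118400))
    fixed_newest = clamp_future_atime(bogus_newest, now=now)
    print(tokens_older_than(sample_atimes, fixed_newest, 22118400))
```

With the bogus newest atime, every sample token looks older than 22118400 seconds (256 days), which is the pattern in the chart. Once the newest atime is clamped to the current time, none of them do.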
