Theo Van Dinter wrote on Tue, 10 Feb 2004 11:44:24 -0500:

> FYI: For 3.0.0, I just put in some code that stops this kind of thing from
> happening (if the calculated message atime is determined to be more than
> 1 day in the future, it just uses the current time() value instead).
> If a 2.64 release happens, the fix will probably go in there too:
> http://bugzilla.spamassassin.org/show_bug.cgi?id=3025
>

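The behaviour described in the quoted fix can be sketched roughly as follows. This is an illustrative Python sketch, not the actual SpamAssassin (Perl) code: the function name `clamp_atime` and the `now` parameter are hypothetical, and only the one-day threshold and the fallback to the current time are taken from the description above.

```python
import time

ONE_DAY = 24 * 60 * 60  # seconds; the "1 day" threshold from the quoted description


def clamp_atime(message_atime, now=None):
    """Return message_atime unless it lies more than one day in the future.

    Hypothetical helper: a message with a bogus future date would otherwise
    stamp its tokens with an atime far ahead of every other token.
    """
    if now is None:
        now = int(time.time())
    if message_atime > now + ONE_DAY:
        return now  # fall back to the current time
    return message_atime
```

For example, `clamp_atime(now + 2 * ONE_DAY, now=now)` returns `now`, while a timestamp only a few hours ahead is returned unchanged.
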
I think I'm hitting the same problem:

debug: bayes: found bayes db version 2
debug: bayes: expiry check keep size, 75% of max: 112500
debug: bayes: token count: 638040, final goal reduction size: 525540
debug: bayes: First pass?  Current: 1076602270, Last: 1076601983, atime: 0, count: 0, newdelta: 0, ratio: 0
debug: bayes: Can't use estimation method for expiry, something fishy, calculating optimal atime delta (first pass)

If I understand correctly the database should have only 112500 (must be
the 2.63 default), so it's been failing for quite some time if it's now
at over 600.000. The token reduction count stays at

debug: bayes: 43200 637929
debug: bayes: 22118400 637929

so, it would expire almost everything. What does this mean? That most
tokens are within the same time range or that most tokens are way too
old ??? How can I figure this out? This is a db which started around
summer/autumn last year with some learning and is continually growing
since then, with around 17.000 spam and 3.000 ham at the moment.

I'm not sure what the next means, does it help to better understand the
above?

0.000          0     -17982          0  non-token data: newest atime
0.000          0 1076601982          0  non-token data: last journal sync atime
0.000          0 1076602431          0  non-token data: last expiry atime

I "fixed" this now by setting

bayes_expiry_max_db_size 1000000

Is there a way I can sanitize the db? I don't really want to throw it
away.

The interesting thing is that I have this problem on two machines but it
was detectable only on one of them. We use a milter (MailCorral) which
hands the mail over to spamd. The timeout for that is 60 seconds. I
didn't note any increase in spam or other problems on that machine.
Since MailCorral isn't actively developed anymore I'm looking for
alternatives and set up MailScanner + SA on another machine, copied the
old Bayes and other SA stuff over and keep sending a small portion of
the spamtrap spam we get directly to that machine.

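As a rough cross-check of the figures in the debug output earlier in this message, the expiry arithmetic can be reproduced in a few lines of Python. This is only a sketch under assumptions: `expiry_goal` and `candidate_deltas` are hypothetical names, the maximum of 150000 is inferred from 112500 being reported as 75% of max, and the doubling of the delta between 43200 and 22118400 is an assumption that happens to fit (43200 * 2^9 = 22118400), since only those two values appear in the output.

```python
def expiry_goal(token_count, max_db_size, keep_fraction=0.75):
    """Return (keep_size, reduction) for one expiry run.

    keep_size is the target token count after expiry (75% of max in the
    debug output); reduction is how many tokens would have to go.
    """
    keep_size = int(max_db_size * keep_fraction)
    reduction = max(token_count - keep_size, 0)
    return keep_size, reduction


def candidate_deltas(start=43200, steps=10):
    """Assumed series of atime deltas in seconds: 12 hours, doubled each step."""
    return [start * 2 ** i for i in range(steps)]


# Inferred maximum of 150000 tokens, with the token count from the log.
keep_size, reduction = expiry_goal(638040, 150000)
print(keep_size, reduction)

# With the max raised to 1000000, no reduction is needed at this token count.
print(expiry_goal(638040, 1000000))

deltas = candidate_deltas()
print(deltas[0], deltas[-1])
```

Under these assumptions the first call reproduces the 112500 and 525540 from the log, and with a max of 1000000 the reduction drops to zero, which would be consistent with the workaround simply postponing expiry rather than repairing the atimes.
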
Almost immediately I had a lot of SA time-outs and searching the list I
finally found the articles about the "fishy" atime delta. MailScanner
uses a smaller time-out by default, I think 20 seconds or so, that's
still unchanged yet. So, one could imagine that the problem wasn't
detected because the longer time-out allowed for finishing the hanging
expiry. However, this doesn't seem to be the case. Most of the time the
spamd result comes after a few seconds. I'm not seeing much if any spamd
time-outs in the logs of the first machine.

Is there something different between spamd and sa, so that the problem
would exist but only visually emerge with SA but not with spamd? Like
that spamd isn't trying the auto-expire with every message but just once
a day while it happens with each invocation of spamassassin?

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org
