bayes expiry not happening when it should
~$ grep '^bayes_expiry_max_db_size' ~/.spamassassin/user_prefs | awk '{print $2}'
200
~$ sa-learn --force-expire
bayes: synced databases from journal in 0 seconds: 2784 unique entries (2805 total entries)
~$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      24501          0  non-token data: nspam
0.000          0      23548          0  non-token data: nham
0.000          0    2009202          0  non-token data: ntokens
0.000          0     100071          0  non-token data: oldest atime
0.000          0 1438755640          0  non-token data: newest atime
0.000          0 1438755988          0  non-token data: last journal sync atime
0.000          0 1438756034          0  non-token data: last expiry atime
0.000          0   11059200          0  non-token data: last expire atime delta
0.000          0      20174          0  non-token data: last expire reduction count

??wth??? I thought I _finally_ understood this stuff :-(

-- 
Please *no* private copies of mailing list or newsgroup messages.
Rule 420: All persons more than eight miles high to leave the court.
Re: bayes expiry not happening when it should
On Tue, 4 Aug 2015 23:36:51 -0700 Ian Zimmerman wrote:

> ~$ grep '^bayes_expiry_max_db_size' ~/.spamassassin/user_prefs | awk '{print $2}'
> 200
> ~$ sa-learn --force-expire
> 0.000          0    2009202          0  non-token data: ntokens
>
> ??wth??? I thought I _finally_ understood this stuff :-(

The number of tokens is within 0.5% of the configured value. It's designed to produce a value between 75% and roughly 150%.
Re: bayes expiry not happening when it should
On 2015-08-05 12:58 +0100, RW wrote:

> The number of tokens is within 0.5% of the configured value. It's
> designed to produce a value between 75% and roughly 150%.

I can't quite parse that answer, so let's be more specific. Doc says:

    bayes_expiry_max_db_size (default: 150000)
        What should be the maximum size of the Bayes tokens database?
        When expiry occurs, the Bayes system will keep either 75% of the
        maximum value, or 100,000 tokens, whichever has a larger value.

From this (and the more elaborate description in the EXPIRATION section, which I've also read) I thought it worked roughly like this:

    if (ntokens < bayes_expiry_max_db_size)
        do_nothing()
    else
        goal_ntokens = max(100000, 0.75 * bayes_expiry_max_db_size)
        while (ntokens > goal_ntokens)
            kill_oldest_tokens()

If I misunderstood, how/where? Sorry for my density :-(
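The keep target in the documentation excerpt quoted above can be worked out directly. This is a minimal sketch in Python; `keep_goal` is a hypothetical helper name, and it only restates the documentation sentence, not what sa-learn actually executes.

```python
def keep_goal(max_db_size: int) -> int:
    """Token count the documentation says is kept when expiry occurs:
    the larger of 75% of the configured maximum and 100,000 tokens."""
    three_quarters = max_db_size * 3 // 4  # integer form of 0.75 * max_db_size
    return max(three_quarters, 100_000)
```

For the documented default of 150000 this gives 112500. For any configured maximum below roughly 133,000 the 100,000-token floor is the larger value.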
Re: bayes expiry not happening when it should
On Wed, 5 Aug 2015 07:47:20 -0700 Ian Zimmerman wrote:

> On 2015-08-05 12:58 +0100, RW wrote:
>
> > The number of tokens is within 0.5% of the configured value. It's
> > designed to produce a value between 75% and roughly 150%.
>
> I can't quite parse that answer, so let's be more specific. Doc says:
>
>     bayes_expiry_max_db_size (default: 150000)
>         What should be the maximum size of the Bayes tokens database?
>         When expiry occurs, the Bayes system will keep either 75% of the
>         maximum value, or 100,000 tokens, whichever has a larger value.
>
> From this (and the more elaborate description in the EXPIRATION
> section, which I've also read) I thought it worked roughly like this:
>
>     if (ntokens < bayes_expiry_max_db_size)
>         do_nothing()

That bit is only for auto-expiry.

>     goal_ntokens = max(100000, 0.75 * bayes_expiry_max_db_size)
>     while (ntokens > goal_ntokens)
>         kill_oldest_tokens()

What it actually does is estimate a cut-off time and then delete all tokens older than that. How it gets the cut-off time is described in the next two sections: EXPIRE LOGIC and ESTIMATION PASS LOGIC.
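To illustrate why deleting by a cut-off time does not land exactly on the token goal, here is a simplified sketch in Python. It is not SpamAssassin's code. The function names are hypothetical. The candidate deltas that double from 12 hours are an assumption, although the "last expire atime delta" of 11059200 seconds in the dump above does equal 43200 × 2^8, or 128 days. The selection rule (remove as many tokens as possible without exceeding the goal) is also an assumption made for the example.

```python
HALF_DAY = 12 * 60 * 60  # 43200 seconds


def choose_delta(atimes, newest, goal_reduction, steps=10):
    """Pick a coarse age threshold in seconds from doubling candidates.

    Assumed rule, for illustration only: take the candidate that removes
    the most tokens without removing more than goal_reduction.
    Returns (delta, removed), or None if no candidate qualifies.
    """
    best = None
    for k in range(steps):
        delta = HALF_DAY * 2 ** k  # 12 h, 1 d, 2 d, ... 256 d
        removed = sum(1 for t in atimes if t < newest - delta)
        if 0 < removed <= goal_reduction:
            if best is None or removed > best[1]:
                best = (delta, removed)
    return best


def expire(atimes, newest, goal_reduction):
    """Delete every token whose atime is older than the chosen cut-off.

    Returns (surviving_atimes, delta); delta is None if nothing was expired.
    """
    choice = choose_delta(atimes, newest, goal_reduction)
    if choice is None:
        return list(atimes), None
    delta, _ = choice
    return [t for t in atimes if t >= newest - delta], delta
```

Because only a handful of cut-offs are considered and every token older than the chosen one is removed at once, the number of surviving tokens is whatever falls on the newer side of the cut-off. It need not match the goal that a token-by-token loop would reach.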
Re: bayes expiry not happening when it should
On 2015-08-05 19:34 +0100, RW wrote:

> What it actually does is estimate a cut-off time and then delete all
> tokens older than that. How it gets the cut-off time is described in
> the next two sections: EXPIRE LOGIC and ESTIMATION PASS LOGIC.

OMG. For one thing, are the clauses in the definition of "weird" conjunctive or disjunctive?

A more insolent question: why this complexity? Why can't I force an expire when I feel like it? :-P Or can I?