Anyone have any suggestions on tuning a large global Bayes db for stability and sanity? I've got my fingers in the pie of a moderately large mail cluster, but I haven't yet found a Bayes configuration that's sane and stable for any extended period. Wiping it completely about once a week seems to provide "acceptable" filtering performance (we have a number of addon rulesets), but I still see spam in my inbox with BAYES_00 - a sure sign of a mistuned Bayes database.
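For reference, the weekly wipe is nothing more sophisticated than roughly the following, after which autolearn starts repopulating the db from scratch:

    # throw away the global Bayes db and let autolearn rebuild it
    sa-learn --clear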

Past experience with (much) smaller systems has shown stable behaviour with bayes_expiry_max_db_size set to 1500000 (a ~40MB BDB Bayes db): daily expiry runs delete ~25-35K tokens against a mail volume of ~3K messages/day. However, the larger system (MySQL, currently with bayes_expiry_max_db_size at 3000000, on-disk tables running ~100MB) only seems to be expiring that same 25-35K tokens per run, even though autolearn is picking up 1.5M+ new tokens from ~300K messages a day. Reading through the docs on token expiry, I'd expect it to be far more aggressive than that. (Among other things, I really don't want to bump bayes_expiry_max_db_size up by two orders of magnitude; up to ~5M should be fine, and I could see going as high as 7.5M if really necessary.)
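For concreteness, the relevant bits of the larger system's setup look roughly like the following (DSN, credentials, and cron timing are illustrative stand-ins, not the literal values):

    # local.cf -- global Bayes stored in MySQL, shared across the cluster
    bayes_store_module        Mail::SpamAssassin::BayesStore::MySQL
    bayes_sql_dsn             DBI:mysql:bayes:dbhost
    bayes_sql_username        bayes
    bayes_sql_password        ...

    # cap on token count; expiry is forced from a nightly cron run
    # rather than left to opportunistic auto-expiry
    bayes_expiry_max_db_size  3000000
    bayes_auto_expire         0

with the daily expiry driven by something like:

    # crontab: nightly forced expiry run
    30 3 * * *  sa-learn --force-expire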

I'm not even really sure what questions to ask to get more detail; sa-learn -D doesn't spit out *enough* detail about the expiry process for me to tell whether something is actually going wrong there.
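The closest I've gotten to useful diagnostics is bracketing a forced expiry with magic-token dumps and restricting the debug output to the bayes facility, something like:

    # token count, atime range, last expire atime delta / reduction count
    sa-learn --dump magic

    # force an expiry with bayes debugging enabled
    sa-learn -D bayes --force-expire

    # ...and see how much (or how little) actually went away
    sa-learn --dump magic

which at least shows whether the token count and the last-expire stats move at all, but still not *why* expiry settles on the atime delta it does.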

-kgd
