David B Funk wrote:
> Something's really wrong here, those "dump magic" numbers don't
> match up with the size of your bayes files. For example, you have a
> non-empty 'bayes_journal' file but the last journal sync atime is
> zero (implying never synced).

I wasn't clear about this, other than showing that my cron runs
--force-expire every hour: I have 'bayes_auto_expire 0' set and then
run 'sa-learn --force-expire' hourly from cron. Will that have an
effect on this? I fear I hid that too deeply. Sorry.

But otherwise, yes, something is wrong. But what? I am confident that
if I clear and start again it will be okay. But I am hoping to learn
how to avoid the problem. Most of the time this ticks along like
clockwork on its own. I rarely look at it and don't do anything
except feed messages to sa-learn.

> The size of your bayes_seen file is consistent with several million
> messages learned, not a few tens-of-thousands.

It processes mailing list messages and sees a relatively high volume
of mail every day. I should figure out how much sometime. It is very
active in terms of daily input.

> Are you -sure- those bayes files correspond to the bayes database
> your "dump magic" is reporting? (which one is your SA using for its
> operations?)

As sure as I can be without having coded it myself. A frontend
machine processes the email. There is only one user on the frontend
machine. The frontend machine runs spamc to submit the email to a
second, dedicated backend spamd machine. The spamd machine has only
the same named user and records the user field in the syslog
appropriately. The file timestamps are current. It must be using
those files. If not, why would they be updating?

> If you watch that "bayes_journal" file over an hour or two does it
> gradually increase in size then suddenly drop? (that's normal
> operation). If so then the 'last journal sync atime' should
> correspond to when it dropped in size (the sync operation). When the
> journal cycles the nspam/nham should go up.

I will need to track this for a while and get back with an answer.
Can I manually walk it through a test sequence of --force-expire
and/or --sync operations and gather useful data directly?

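A rough sketch of how that tracking could be done, assuming the bayes
files live in the default ~/.spamassassin directory (BAYES_DIR, and
the helper names file_size and sample, are made up for illustration).
It records the journal size next to the "dump magic" counters, so a
sync should show up as a size drop with a matching change in the
'last journal sync atime' line:

```shell
#!/bin/sh
# Sketch only. BAYES_DIR is an assumption: adjust it if bayes_path
# points somewhere other than the default location.
BAYES_DIR="${BAYES_DIR:-$HOME/.spamassassin}"

# Print the size in bytes of a file, or 0 if it does not exist.
file_size() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# Print one timestamped sample: the journal size, followed by the
# counters and atime lines from 'sa-learn --dump magic'.
sample() {
    printf '%s journal_bytes=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(file_size "$BAYES_DIR/bayes_journal")"
    sa-learn --dump magic | grep -E 'nspam|nham|ntokens|atime'
}

# Passive watch, one sample every five minutes:
#   while :; do sample; sleep 300; done >> /tmp/bayes-watch.log
#
# Manual walk-through, sampling before and after each step:
#   sample; sa-learn --sync; sample; sa-learn --force-expire; sample
```

Run as the same user spamd uses, the manual sequence should show
whether --sync empties the journal and updates the sync atime, and
whether --force-expire changes ntokens.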
While exploring I did a 'sa-learn --backup > /tmp/sa-learn.backup.out'
and 'wc -l /tmp/sa-learn.backup.out' returned 700351 lines. Not sure
that is useful information but it might give an idea of something.

There isn't anything personal or private in the bayes_* files. But
they are somewhat large. I would be happy to make them available
directly to anyone who wished to peek at them to get a better idea of
what is happening.

> If you learn some spam/ham by hand do the nspam/nham counters go up?

Yes. I just did a test to verify. One ham, checked, incremented nham
counter, one spam, checked, incremented nspam counter.

Thanks!
Bob
