RE: Bayes expiring during message test
> > I am running SA 3.0.2 on a Windows 2003 server. As previously posted to this list I have had a problem where SA seems unable to remove a bayes lock file or something like that. First of all, I was wondering if anyone knows what the error message being displayed means and what might be causing it?

> First, that's NOT an error message. You are running SA in debug mode, and you are seeing a debug message. All it means is just what it says: part of the SA code is refreshing its hold on the database lock. It's not failing anything, it's normal. The expiry process on a non-SQL based bayes DB refreshes often to avoid having another SA process assume the lock is stale and delete it (see subset_running_expire_tok in BayesStore/DBM.pm). If the message bothers you, don't run SA with -D.

It's not that it bothers me. It takes an age to get to that stage, and I thought that the expiry had already taken place. It spends a long time on the line:

debug: bayes: expiry max exponent: 9

I thought that all this time meant that it would have already processed the expiry, and so the only thing preventing it from completing was this message about the bayes.lock file. I now understand that the expiry hasn't taken place yet and that this repeated message isn't an indication of anything going wrong, but just SA making sure that its lock isn't overridden.

> > Secondly, in my local.cf file I have:
> >
> > bayes_expiry_max_db_size 50
> >
> > Why is it expiring the database when it is only 11 MB big?

> That sounds about right. The comments about 150k tokens being 8 MB are outdated and belong to SA 2.6x. In 2.6x tokens were text strings, and thus rather large. SA 3.0 tokens are SHA1 hashes (16 bytes), plus a few extra bytes for atime, nspam, nham. I'm not sure of the exact size of the tokens, but 11 MB does sound feasible. My own ballpark guess at the format runs 13 MB for 500k tokens. Unfortunately, I don't run SA 3.x at this time, so I can't verify that.

Ok. I thought that a 500k database would be much larger. Since I specified the larger database (I can't remember how big the default one is), the bayes db file doesn't seem to be any bigger, so I was expecting it to grow more before expiry. Is it not recommended to have a db of more than 500k? Presumably the larger it is, the slower it is. Is that the only reason to keep the db size down?

> > Why is it expiring the database during a message scan?

> Because SA does that by default. In some SA environments SA only runs when messages are being scanned. It's got to expire at some point, so it does it once in a while during a message scan. This is on by default, otherwise users that just call spamassassin instead of using spamd would have their bayes files grow without bound.

How can this be? ... Actually I guess if you use autolearning, then you don't need to run sa-learn separately. Before I noticed this happening, I thought that it would only auto-expire during the learning process. I do a batch learn overnight. If it starts expiring during the scan of a message, it messes up the timeout that my mail server has.

> > Is there a command line option to prevent it from expiring during a scan of a message?

> No, but there's a config option you can add to local.cf:
>
> bayes_auto_expire 0

This will prevent it auto-expiring during an sa-learn batch as well, I presume, so I will have to schedule an expiry specifically.

> > I presume if you use the --no-sync option when learning messages, it just creates a journal file which can then be synchronised later with the main bayes db, is this correct?

> Yes, or you can run sa-learn --sync to cause a sync check to occur. Or you can use sa-learn --force-expire, which will force a sync and expire to run, regardless of perceived need.

Thanks Matt for setting me straight.

Ben
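
For anyone wanting to script the scheduled expiry discussed above, here is a minimal sketch, not a tested recipe. It assumes `bayes_auto_expire 0` is already set in local.cf and that `sa-learn` is on the PATH. The function names (`build_nightly_commands`, `run_nightly`) and the corpus directories are hypothetical; only the `--no-sync`, `--spam`, `--ham` and `--force-expire` options come from sa-learn itself. Python is used simply because it runs the same way on Windows.

```python
import subprocess
from typing import List

# Assumption: sa-learn is reachable under this name; adjust for your install.
SA_LEARN = "sa-learn"


def build_nightly_commands(spam_dir: str, ham_dir: str) -> List[List[str]]:
    """Return the sa-learn invocations for one nightly run, in order.

    spam_dir and ham_dir are hypothetical corpus directories.
    """
    return [
        # Learn without syncing, so the merge into the main db is deferred.
        [SA_LEARN, "--no-sync", "--spam", spam_dir],
        [SA_LEARN, "--no-sync", "--ham", ham_dir],
        # Force a sync and an expiry run, regardless of perceived need.
        [SA_LEARN, "--force-expire"],
    ]


def run_nightly(spam_dir: str, ham_dir: str) -> None:
    """Run each command in turn and stop at the first failure."""
    for argv in build_nightly_commands(spam_dir, ham_dir):
        subprocess.run(argv, check=True)
```

Calling `run_nightly(...)` from a scheduled task overnight would keep the expiry out of the message-scan path, so it should no longer collide with the mail server's scan timeout. Building the command list separately from running it makes it easy to print and check the commands before scheduling them.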
Bayes expiring during message test
I am running SA 3.0.2 on a Windows 2003 server. As previously posted to this list I have had a problem where SA seems unable to remove a bayes lock file or something like that. I include complete logs below to show what it is like - I apologise for the size of it.

First of all, I was wondering if anyone knows what the error message being displayed means and what might be causing it?

Secondly, in my local.cf file I have:

bayes_expiry_max_db_size 50

Why is it expiring the database when it is only 11 MB big?

Why is it expiring the database during a message scan?

Is there a command line option to prevent it from expiring during a scan of a message?

I presume if you use the --no-sync option when learning messages, it just creates a journal file which can then be synchronised later with the main bayes db, is this correct?

Thanks for your help,

Ben

SA debug log:

debug: SpamAssassin version 3.0.2
debug: Score set 0 chosen.
debug: running in taint mode? no
debug: defining getpwuid() wrapper using 'unknown' as username
debug: using F:\Perl\etc\mail\spamassassin\init.pre for site rules init.pre
debug: config: read file F:\Perl\etc\mail\spamassassin\init.pre
debug: using F:\Perl/share/spamassassin for default rules dir
debug: config: read file F:\Perl/share/spamassassin/10_misc.cf
debug: config: read file F:\Perl/share/spamassassin/20_anti_ratware.cf
debug: config: read file F:\Perl/share/spamassassin/20_body_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_compensate.cf
debug: config: read file F:\Perl/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_drugs.cf
debug: config: read file F:\Perl/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_head_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_html_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_meta_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_phrases.cf
debug: config: read file F:\Perl/share/spamassassin/20_porn.cf
debug: config: read file F:\Perl/share/spamassassin/20_ratware.cf
debug: config: read file F:\Perl/share/spamassassin/20_uri_tests.cf
debug: config: read file F:\Perl/share/spamassassin/23_bayes.cf
debug: config: read file F:\Perl/share/spamassassin/25_body_tests_es.cf
debug: config: read file F:\Perl/share/spamassassin/25_hashcash.cf
debug: config: read file F:\Perl/share/spamassassin/25_spf.cf
debug: config: read file F:\Perl/share/spamassassin/25_uribl.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_de.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_fr.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_nl.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_pl.cf
debug: config: read file F:\Perl/share/spamassassin/50_scores.cf
debug: config: read file F:\Perl/share/spamassassin/60_whitelist.cf
debug: config: read file F:\Perl/share/spamassassin/70_antidrug.cf
debug: config: read file F:\Perl/share/spamassassin/70_ben_smallcap.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_adult.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_fraud_post25x.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_oem.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_random.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_spoof.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_unsub.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_uri0.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_uri1.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_uri3.cf
debug: config: read file F:\Perl/share/spamassassin/local.cf
debug: using F:\Perl/etc/mail/spamassassin for site rules dir
debug: config: read file F:\Perl/etc/mail/spamassassin/local.cf
debug: using F:\Documents and Settings\LocalService/.spamassassin for user state dir
debug: using F:\Documents and Settings\LocalService/.spamassassin/user_prefs for user prefs file
debug: config: read file F:\Documents and Settings\LocalService/.spamassassin/user_prefs
debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22730b0)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::Hashcash=HASH(0x25cf708)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::SPF=HASH(0x25e1ba0)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22730b0) implements 'parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0x25cf708) implements 'parse_config'
debug: bayes: 2700 tie-ing to DB file R/O F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes_toks
debug: bayes: 2700 tie-ing to DB file R/O F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 3 chosen.
debug: