RE: Bayes expiring during message test

2005-05-24 Thread Ben Wylie
 I am running SA 3.02 on a Windows 2003 server.
 As previously posted to this list I have had a problem where SA seems 
 unable to remove a bayes lock file or something like that.
 
 First of all, I was wondering if anyone knows what the error message that
 is being displayed means and what might be causing it?

 First, that's NOT an error message.  You are running SA in debug mode, and
 you are seeing a debug message. All it means is just what it says: part of
 the SA code is refreshing its hold on the database lock. It's not failing
 anything, it's normal.
 
 The expiry process on a non-SQL based bayes DB refreshes often to avoid 
 having another SA process assume the lock is stale and delete it. (see 
 sub set_running_expire_tok in BayesStore/DBM.pm)
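
To illustrate the refresh idea described above, here is a minimal Python sketch. It is not SA's actual code: the lock file name, the 600-second stale threshold and the helper names are all assumptions made for the example. The lock holder periodically touches the lock file, and a competing process treats the lock as stale only when it has not been touched for longer than the threshold.

```python
import os
import tempfile
import time

STALE_AFTER_SECONDS = 600  # assumed threshold for the example, not taken from SA


def refresh_lock(lock_path, now=None):
    """Touch the lock file so its mtime shows the holder is still alive."""
    now = time.time() if now is None else now
    os.utime(lock_path, (now, now))


def lock_is_stale(lock_path, now=None, stale_after=STALE_AFTER_SECONDS):
    """Return True if the lock has not been refreshed within stale_after seconds."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(lock_path)) > stale_after


def demo():
    """Compare a long-running holder with and without a refresh."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "bayes.lock")  # hypothetical file name
        open(lock_path, "w").close()
        start = 1_000_000_000.0  # fixed timestamp keeps the demo deterministic
        refresh_lock(lock_path, now=start)
        # 700 s later with no refresh: another process would call the lock stale.
        stale_without_refresh = lock_is_stale(lock_path, now=start + 700)
        # The holder refreshes at 500 s, so at 700 s the lock is only 200 s old.
        refresh_lock(lock_path, now=start + 500)
        stale_with_refresh = lock_is_stale(lock_path, now=start + 700)
        return stale_without_refresh, stale_with_refresh
```

In this sketch `demo()` reports the lock as stale when the holder never refreshes and as live when it does, which is why a long expiry run keeps printing refresh messages in debug mode.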
 
 If the message bothers you, don't run SA with -D.

It's not that it bothers me. It takes an age to get to that stage, and I
thought that the expiry had already taken place.
It spends a long time on the line:
debug: bayes: expiry max exponent: 9
I thought that all this time meant that it would have already processed the
expiry, and so the only thing that was preventing it from completing was
this message about the bayes.lock file. I now understand that the expiry
hasn't taken place yet and that this repeated message isn't an indication of
anything going wrong; it is just SA making sure that its lock isn't
overridden.

 Secondly, in my local.cf file I have:
 bayes_expiry_max_db_size 50
 Why is it expiring the database when it is only 11mb big?

 That sounds about right. The comments about 150k tokens being 8mb are 
 outdated and belong to SA 2.6x. In 2.6x tokens were text strings, and thus
 rather large.

 SA 3.0 tokens are SHA1 hashes (16 bytes), plus a few extra bytes for 
 atime, nspam, nham. I'm not sure of the exact size of the tokens, but 11mb 
 does sound feasible. My own ballpark guess at the format runs 13mb for 
 500k tokens.

 Unfortunately, I don't run SA 3.x at this time, so I can't verify that.
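
The ballpark above can be turned into a quick back-of-the-envelope check. The Python sketch below assumes that file size scales linearly with token count and that 1 MB is 1,000,000 bytes. Both are simplifications, and the per-token figure is derived only from the "13mb for 500k tokens" guess, not from the real DB format.

```python
# Figures taken from the ballpark guess in this thread (1 MB = 10**6 bytes assumed).
BALLPARK_BYTES = 13_000_000
BALLPARK_TOKENS = 500_000


def bytes_per_token():
    """Average bytes per token implied by the ballpark guess."""
    return BALLPARK_BYTES / BALLPARK_TOKENS


def estimated_tokens(db_size_bytes):
    """Rough token count for a DB file of the given size, assuming linear scaling."""
    return round(db_size_bytes / bytes_per_token())


print(bytes_per_token())             # 26.0
print(estimated_tokens(11_000_000))  # 423077
```

On those assumptions an 11mb file would already hold roughly 420k tokens, so the file size alone says little about how close the DB is to its token limit.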

Ok. I thought that a 500k database would be much larger. Since I specified
the larger database (I can't remember how big the default one is), the bayes
db file doesn't seem to be any bigger, so I was expecting it to grow more
before expiry. Is it not recommended to have a db of more than 500k?
Presumably the larger it is the slower it is. Is that the only reason to
keep the db size down?

 Why is it expiring the database during a message scan?

 Because SA does that by default. In some SA environments SA only runs when
 messages are being scanned. It's got to expire at some point, so it does 
 it once in a while during a message scan. This is on by default, otherwise
 users that just call spamassassin instead of using spamd would have 
 their bayes files grow without bound.

How can this be? ... Actually I guess if you use autolearning, then you
don't need to run sa-learn separately. Before I noticed this happening, I
thought that it would only autoexpire during the learning process. I do a
batch learn overnight. If it starts expiring during scanning of a message,
it messes up the timeout that my mailserver has.

 Is there a command line option to prevent it from expiring during a scan 
 of a message?

 No, but there's a config option you can add to local.cf:
 bayes_auto_expire 0

This will prevent it auto expiring during a sa-learn batch as well I
presume, so I will have to schedule an expiry specifically.

 I presume if you use the --no-sync option when learning messages, it just
 creates a journal file which can then be synchronised later with the main
 bayes db, is this correct?

 Yes, or you can run sa-learn --sync to cause a sync check to occur.

 Or you can use sa-learn --force-expire which will force a sync and expire
 to run, regardless of perceived need.
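
Putting the options from this thread together, a nightly batch could look roughly like the Python sketch below. It uses only the sa-learn switches mentioned here (--no-sync, --force-expire) plus --spam and --ham. The directory paths and function names are placeholders for illustration, and it assumes `sa-learn` can be invoked under that name on the PATH. Nothing is executed until `run_nightly` is called.

```python
import subprocess


def build_nightly_commands(spam_dir, ham_dir):
    """Return the sa-learn invocations for one nightly run, in order."""
    return [
        # Learn into the journal only; skip the sync after each step.
        ["sa-learn", "--no-sync", "--spam", spam_dir],
        ["sa-learn", "--no-sync", "--ham", ham_dir],
        # One final step that forces a sync and an expiry run.
        ["sa-learn", "--force-expire"],
    ]


def run_nightly(spam_dir, ham_dir):
    """Run the commands in sequence, stopping at the first failure."""
    for cmd in build_nightly_commands(spam_dir, ham_dir):
        subprocess.run(cmd, check=True)
```

With bayes_auto_expire 0 in local.cf, the last command is then the only point at which expiry happens, so it no longer competes with the mail server's scan timeout.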

Thanks Matt for setting me straight.

Ben




Bayes expiring during message test

2005-05-23 Thread Ben Wylie
I am running SA 3.02 on a Windows 2003 server.
As previously posted to this list I have had a problem where SA seems unable
to remove a bayes lock file or something like that.

I include complete logs below to show what it is like - I apologise for the
size of it.

First of all, I was wondering if anyone knows what the error message that is
being displayed means and what might be causing it?
Secondly, in my local.cf file I have:
bayes_expiry_max_db_size 50
Why is it expiring the database when it is only 11mb big?
Why is it expiring the database during a message scan?
Is there a command line option to prevent it from expiring during a scan of
a message?
I presume if you use the --no-sync option when learning messages, it just
creates a journal file which can then be synchronised later with the main
bayes db, is this correct?

Thanks for your help,

Ben

SA debug log:

debug: SpamAssassin version 3.0.2
debug: Score set 0 chosen.
debug: running in taint mode? no
debug: defining getpwuid() wrapper using 'unknown' as username
debug: using F:\Perl\etc\mail\spamassassin\init.pre for site rules init.pre
debug: config: read file F:\Perl\etc\mail\spamassassin\init.pre
debug: using F:\Perl/share/spamassassin for default rules dir
debug: config: read file F:\Perl/share/spamassassin/10_misc.cf
debug: config: read file F:\Perl/share/spamassassin/20_anti_ratware.cf
debug: config: read file F:\Perl/share/spamassassin/20_body_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_compensate.cf
debug: config: read file F:\Perl/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_drugs.cf
debug: config: read file F:\Perl/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_head_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_html_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_meta_tests.cf
debug: config: read file F:\Perl/share/spamassassin/20_phrases.cf
debug: config: read file F:\Perl/share/spamassassin/20_porn.cf
debug: config: read file F:\Perl/share/spamassassin/20_ratware.cf
debug: config: read file F:\Perl/share/spamassassin/20_uri_tests.cf
debug: config: read file F:\Perl/share/spamassassin/23_bayes.cf
debug: config: read file F:\Perl/share/spamassassin/25_body_tests_es.cf
debug: config: read file F:\Perl/share/spamassassin/25_hashcash.cf
debug: config: read file F:\Perl/share/spamassassin/25_spf.cf
debug: config: read file F:\Perl/share/spamassassin/25_uribl.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_de.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_fr.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_nl.cf
debug: config: read file F:\Perl/share/spamassassin/30_text_pl.cf
debug: config: read file F:\Perl/share/spamassassin/50_scores.cf
debug: config: read file F:\Perl/share/spamassassin/60_whitelist.cf
debug: config: read file F:\Perl/share/spamassassin/70_antidrug.cf
debug: config: read file F:\Perl/share/spamassassin/70_ben_smallcap.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_adult.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_fraud_post25x.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_oem.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_random.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_spoof.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_unsub.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_uri0.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_uri1.cf
debug: config: read file F:\Perl/share/spamassassin/70_sare_uri3.cf
debug: config: read file F:\Perl/share/spamassassin/local.cf
debug: using F:\Perl/etc/mail/spamassassin for site rules dir
debug: config: read file F:\Perl/etc/mail/spamassassin/local.cf
debug: using F:\Documents and Settings\LocalService/.spamassassin for user state dir
debug: using F:\Documents and Settings\LocalService/.spamassassin/user_prefs for user prefs file
debug: config: read file F:\Documents and Settings\LocalService/.spamassassin/user_prefs
debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22730b0)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::Hashcash=HASH(0x25cf708)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::SPF=HASH(0x25e1ba0)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22730b0) implements 'parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0x25cf708) implements 'parse_config'
debug: bayes: 2700 tie-ing to DB file R/O F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes_toks
debug: bayes: 2700 tie-ing to DB file R/O F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 3 chosen.
debug: