[Mimedefang] sa-learn load, blocking?

2005-05-10 Thread Adam Porter
Apologies if this is slightly off-topic; I'm guessing this is mainly a 
SpamAssassin thing, though I'm not positive.

I'm using Red Hat Enterprise Linux 3 on a 2.4GHz Xeon with 2GB of RAM, 
MIMEDefang 2.41, McAfee (uvscan) and SpamAssassin 2.63.  We use global 
Bayes learning like this:

use_bayes 1
bayes_auto_learn 1
bayes_file_mode 0770
bayes_path /var/spool/MIMEDefang/bayes
bayes_auto_learn_threshold_spam 8.0
bayes_learn_to_journal 1
bayes_auto_expire 0

Once a night, this gets run out of cron to synchronize the database:

/usr/bin/sa-learn -p /etc/mail/spamassassin/sa-mimedefang.cf --force-expire

The problem comes when trying to feed the database ham and spam this way:

/usr/bin/sa-learn --showdots --spam --dbpath /var/spool/MIMEDefang/ \
/home/$person/Maildir/.spam/cur --no-rebuild

(or --ham with the ham folder instead of --spam with the spam folder.)

We find that sa-learn seems to lock something (the database?), and 
when we feed it hundreds of messages at a time, the load average 
skyrockets and the machine starts to reject inbound messages due to 
sendmail's load threshold (already raised to 30)!  (RAM usage does not 
seem to be a problem in this case.)
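
One way the feeding step above could be throttled, offered only as a 
sketch and not as a confirmed fix: hand sa-learn a small batch of 
messages at a time, at low priority, with a pause between batches so 
that mail scanning gets a turn.  The names SA_LEARN, DBPATH, BATCH_SIZE, 
PAUSE, run_batch and learn_in_batches are illustrative only; the sa-learn 
options are the ones already used above.

```shell
#!/bin/sh
# Sketch: feed sa-learn in small, low-priority batches.
# All variable and function names here are hypothetical.
SA_LEARN=${SA_LEARN:-/usr/bin/sa-learn}
DBPATH=${DBPATH:-/var/spool/MIMEDefang/}
BATCH_SIZE=${BATCH_SIZE:-25}   # messages per sa-learn invocation
PAUSE=${PAUSE:-10}             # seconds to sleep between batches

# run_batch KIND FILE...   (KIND is "spam" or "ham")
run_batch() {
    kind=$1
    shift
    # stdin is redirected so sa-learn cannot consume the file list
    # that the calling loop is still reading.
    nice -n 19 "$SA_LEARN" "--$kind" --dbpath "$DBPATH" --no-rebuild "$@" </dev/null
}

# learn_in_batches KIND DIR   (DIR is e.g. a Maildir "cur" directory)
learn_in_batches() {
    kind=$1
    dir=$2
    find "$dir" -type f | {
        set --                      # start with an empty batch
        while IFS= read -r msg; do
            set -- "$@" "$msg"      # append this message to the batch
            if [ "$#" -ge "$BATCH_SIZE" ]; then
                run_batch "$kind" "$@"
                set --
                sleep "$PAUSE"
            fi
        done
        # Learn whatever is left over in the final partial batch.
        if [ "$#" -gt 0 ]; then
            run_batch "$kind" "$@"
        fi
    }
}

# Example (not run here):
# learn_in_batches spam "/home/$person/Maildir/.spam/cur"
```

Because --no-rebuild is kept, the journal still has to be synchronized 
afterwards, which the nightly cron job described above is meant to do.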

I'm thinking of upgrading (both spamassassin and Mimedefang) but would 
feel better about the downtime involved if I knew this problem would go 
away.

Suggestions welcome!
Thanks,
-Adam
___
Visit http://www.mimedefang.org and http://www.canit.ca
MIMEDefang mailing list
MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang


Re: [Mimedefang] Tracking down the delay (Razor timeout!)

2004-01-28 Thread Adam Porter
[EMAIL PROTECTED] wrote:
> I personally have chosen not to use all of the various RBLs that
> SpamAssassin uses by default and have instead just enabled
> RCVD_IN_SBL[1], HABEAS_VIOLATOR[2], RCVD_IN_DYNABLOCK[3] and created my
> own RBL rules to access SpamHaus' XBL list[4].
>
> What I wound up doing was just looking through the SpamAssassin file
> 20_dnsbl_tests.cf and seeing what the various tests were.  I then looked
> around at the URLs mentioned in the comments to get a feel for how each
> of the lists was run.  In particular I was interested in the philosophy
> of how things got added and removed from the lists and how well they
> were technically set-up.
>
> I then decided on the lists mentioned above as they met my criteria.
> You might wish to do something similar.  By only using a select group of
> RBLs I figure I'm saving time (not having to query so many lists),
> keeping inbound delays to a minimum and reducing the chance of false
> positives.
>
> M-

Thanks much for your time & wisdom.  This is a disciplined way to use 
the RBLs, as opposed to blindly using all that show up in the 
SpamAssassin rules.  I'll definitely do some exploring & trying out the 
various services to see which we want to use.  In the meantime, 
shortening those delay limits will definitely help us protect our 
performance if any of them has problems.

Cheers,
-Adam


[Mimedefang] Tracking down the delay (Razor timeout!)

2004-01-28 Thread Adam Porter
> I don't have a complete answer for you, but it occurs to me that you might
> want to tinker with the SpamAssassin configuration options rbl_timeout
> and razor_timeout.  According to Mail::SpamAssassin::Conf(3pm) the
> defaults for these are 15 seconds for the rbl stuff and 10 seconds for
> the razor2 stuff.  Assuming the default settings for both, that sounds
> like it could account for at least 25 seconds.
>
> Perhaps setting these down to 5 seconds each and then restarting
> everything might be a useful experiment.  If the delays drop after that,
> you'll know you're in the right ballpark.

Thanks, this seems like good advice!  I already had rbl_timeout at 5 so 
I shortened razor_timeout to 5 also and sure enough, my delays are ~10 
seconds now.
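
For reference, the change described above comes down to two lines of 
SpamAssassin configuration.  Which file they belong in depends on the 
install (whichever file the SpamAssassin instance used by MIMEDefang 
reads; that location is an assumption here), and per the quoted advice 
everything is restarted afterwards:

```
rbl_timeout 5
razor_timeout 5
```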

I may have narrowed the problem down to razor.  Looks like I'm getting 
intermittent delays (which could explain why I have sometimes run 
spamassassin all the way through without delay).  Often I run this once 
and it completes quickly, but on the 2nd and all subsequent attempts 
razor hangs until it times out on wonder.cloudmark.com:

# razor-check -d -debuglevel=9 /tmp/spam2.txt

[...]

Jan 28 14:09:00.272986 check[29149]: [ 8] preparing 2 queries
Jan 28 14:09:00.273370 check[29149]: [ 8] sending 1 batches
Jan 28 14:09:00.273584 check[29149]: [ 4] wonder.cloudmark.com << 108
Jan 28 14:09:00.273727 check[29149]: [ 6] -a=c&e=4&ep4=7542-10&s=LDMBvNPlqNgucLUsV_snpo9cxOcA
a=c&e=4&ep4=7542-10&s=10fgyuwqXkOs1rGoOYlmLj-I1EYA
.

(Pauses here for long time)

Jan 28 14:09:15.266090 check[29149]: [ 1] razor-check error: check 2: Timed out (15 sec) while reading from wonder.cloudmark.com
check 2: Timed out (15 sec) while reading from wonder.cloudmark.com
#
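
Since the hang is intermittent, it can help to repeat the same check 
several times and note how long each run takes.  A minimal sketch 
follows; time_runs is a hypothetical helper name, it simply wraps 
whatever command it is given, and it assumes a `date` that supports 
`+%s` (as on Linux).

```shell
# time_runs N CMD [ARGS...]
# Run CMD N times, printing wall-clock seconds and exit status per run.
time_runs() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1      # discard the command's own output
        status=$?
        end=$(date +%s)
        echo "run $i: $((end - start))s, exit $status"
        i=$((i + 1))
    done
}

# Example (not run here): five back-to-back checks of the same message.
# time_runs 5 razor-check /tmp/spam2.txt
```

A run that takes about as long as the configured timeout is a likely 
candidate for the hang shown above.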

So I guess my questions now are:  How do I find & set up a good, 
reliable set of RBLs?  Do I need to invest a lot of time or can I 
automate it?  Is this an anomaly with cloudmark's db/service or does 
this kind of thing happen a lot?  (PS: I re-initialized my razor client 
but it hasn't helped.)

My apologies for dumping my razor problems on the MD list... If it 
weren't for great software like MD, I might never have bothered with 
spamassassin, razor, etc...

Cheers,
-Adam


[Mimedefang] Tracking down the delay

2004-01-28 Thread Adam Porter
I'm getting long delays (30+ seconds) for every message with 
MD/SpamAssassin/Razor.  I know this comes up a lot but I'd like to learn 
how to debug things a little better.

Environment: Red Hat Linux 9 on Intel (gobs of CPU/RAM) with sendmail 
8.12.8+Milter (to be upgraded once this is solved), MD 2.38, 
SpamAssassin 2.63, Razor Agents 2.36, NAI/McAfee uvscan.

1. The delays seem to go away if I disable MD in the sendmail conf.  So 
I guess sendmail is innocent.

2. The delays seem to go away when I disable RBL checks.  So this points 
to razor/RBL stuff.

3. When I run spamassassin at the command line on a spam message, 
there's no delay.  I can't explain this.

4. When I run uvscan at the command line on a message, there's no delay.  
So I'm guessing NAI uvscan is probably innocent too.

5. When I upgraded razor-agents last week, the delays went down to 3-4 
seconds for a while but now they've gotten long (30+ seconds) again.

I've got spamassassin and MD logging but can't find any RBL debug info 
except for the delay times.  Can anyone point me towards debugging this 
properly?  I've tried playing with debug options for spamassassin and 
razor but haven't managed to produce any RBL-related messages.
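
One low-tech way to see whether a particular blocklist is slow is to 
time a DNS lookup against it by hand, outside of SpamAssassin.  The 
sketch below assumes the `host` utility is installed and uses the 
conventional DNSBL test address 127.0.0.2 (queried in reversed form, as 
2.0.0.127.<zone>).  time_dnsbl and LOOKUP are made-up names for 
illustration, and a hand-run lookup will not necessarily behave the same 
as the lookups SpamAssassin makes.

```shell
# LOOKUP is a hypothetical override so another resolver tool can be used.
LOOKUP=${LOOKUP:-host}

# time_dnsbl ZONE...
# Time one test lookup per DNSBL zone and print seconds and exit status.
time_dnsbl() {
    for zone in "$@"; do
        start=$(date +%s)
        "$LOOKUP" "2.0.0.127.$zone" >/dev/null 2>&1
        status=$?
        end=$(date +%s)
        echo "$zone: $((end - start))s (exit $status)"
    done
}

# Example (not run here): check the Spamhaus SBL zone.
# time_dnsbl sbl.spamhaus.org
```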

Thanks!

-Adam


