Re: 3.0.3 uses all CPUs after tie

2005-06-03 Thread Matthew Daubenspeck
On Thu, Jun 02, 2005 at 11:40:39AM -0700, Justin Mason wrote:
> can you repro this reliably?  if so, output from -D and/or an strace
> -f -p $spamdpid would be helpful.

From top:

28702 nobody    25   0  781m 714m 1796 R 99.9 35.5   4:11.72 spamd

That's the runaway process.

# strace -f -p 28702
Process 28702 attached - interrupt to quit

That's all it does. I never see anything else. It then continues to chew
up both processors until I killall and restart spamd. If I kill just
that PID, another spamd PID takes over and uses 100% CPU.

About the only thing I can do is run a cron script that kills all of
spamd and restarts it. However, that is a VERY ugly fix :)
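
A gentler variant of that cron job would only act when a child is actually spinning. The following is a minimal sketch of the detection half, not anything shipped with SpamAssassin: the names `parse_ps`, `find_runaways` and `snapshot` are made up for illustration, and it assumes a procps-style `ps` that accepts `-eo pid,user,pcpu,stat,comm`. Note that `ps` reports CPU averaged over the life of the process, so it reads lower than `top` does; the threshold is a parameter for that reason.

```python
import subprocess
from typing import List, NamedTuple


class Proc(NamedTuple):
    pid: int
    user: str
    cpu: float    # percent CPU as reported by ps
    state: str    # e.g. "S", "Ss", "R"
    command: str


def parse_ps(ps_output: str) -> List[Proc]:
    """Parse lines of the form: PID USER %CPU STAT COMMAND."""
    procs = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip blank lines and the header row
        pid, user, cpu, state, command = fields
        procs.append(Proc(int(pid), user, float(cpu), state, command))
    return procs


def find_runaways(procs, name="spamd", cpu_threshold=90.0):
    """Return running processes named `name` at or above the threshold."""
    return [p for p in procs
            if p.command.startswith(name)
            and p.state.startswith("R")
            and p.cpu >= cpu_threshold]


def snapshot() -> str:
    """Take a live snapshot (assumes a procps-style ps is installed)."""
    return subprocess.run(
        ["ps", "-eo", "pid,user,pcpu,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
```

A cron wrapper could call `find_runaways(parse_ps(snapshot()))` and restart spamd only when the list is non-empty. What to do with the offending PIDs is left out on purpose, since killing a single child did not help here.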

Thanks.


Re: 3.0.3 uses all CPUs after tie

2005-06-03 Thread Michael Parker
Matthew Daubenspeck wrote:

> On Thu, Jun 02, 2005 at 11:40:39AM -0700, Justin Mason wrote:
>> can you repro this reliably? if so, output from -D and/or an strace
>> -f -p $spamdpid would be helpful.
>
> From top:
>
> 28702 nobody 25 0 781m 714m 1796 R 99.9 35.5 4:11.72 spamd
>
> That's the runaway process.
>
> # strace -f -p 28702
> Process 28702 attached - interrupt to quit
>
> That's all it does. I never see anything else. It then continues to chew
> up both processors until I killall and restart spamd. If I kill just
> that PID, another spamd PID takes over and uses 100% CPU.
>
> About the only thing I can do is run a cron script that kills all of
> spamd and restarts it. However, that is a VERY ugly fix :)
>
> Thanks.

Exim? If so, are you limiting the size of the messages sent to spamd?

Michael



Re: 3.0.3 uses all CPUs after tie

2005-06-02 Thread Justin Mason
can you repro this reliably?  if so, output from -D and/or an strace
-f -p $spamdpid would be helpful.

where does tie come in? (from the subj line).

--j.

Matthew Daubenspeck writes:
> I am using SpamAssassin 3.0.3 on a Gentoo AMD64 system with exim and
> exiscan. This has worked VERY well for months without a single issue.
> All of a sudden spamd eventually uses all of both CPUs and nearly
> locks the machine. I have tried downgrading to 3.0.2 with the same
> result. I have been using several of the RulesDuJour rule sets and
> first started to suspect those.
>
> I removed all of the files from /etc/mail/spamassassin except for the
> following local.cf:
>
> required_hits              5
> skip_rbl_checks            0
> use_bayes                  0
> score HELO_DYNAMIC_IPADDR  2
> score ALL_TRUSTED          0
> use_auto_whitelist         0
>
> When spamd is running normally its processes look like this:
>
> # ps aux | grep spamd
> root     29434  0.0  1.6  66712 33828 ?  Ss  21:13  0:00 /usr/sbin/spamd -d -r /var/run/spamd.pid -m 5 -c -H
> root     29442  0.1  1.8  69712 37152 ?  S   21:13  0:00 spamd child
> root     29443  0.0  1.7  68852 36300 ?  S   21:13  0:00 spamd child
> root     29444  0.0  1.7  68444 35904 ?  S   21:13  0:00 spamd child
> root     29445  0.0  1.7  68124 35584 ?  S   21:13  0:00 spamd child
> root     29446  0.0  1.7  68160 35600 ?  S   21:13  0:00 spamd child
>
> When both CPUs are pegged at 100%, they look like this:
>
> # ps aux | grep spamd
> root     10097  0.2  5.6 152336 117208 ?  Ss  10:32  0:06 /usr/sbin/spamd -d -r /var/run/spamd.pid -m 5 -c -H
> root     10378  0.9  6.8 176116 141012 ?  S   10:32  0:19 spamd child
> root     10379  1.0  6.6 170452 136024 ?  S   10:32  0:22 spamd child
> root     10380  0.9  6.8 174528 140080 ?  S   10:32  0:19 spamd child
> nobody   10381 27.1 38.0 818616 783476 ?  R   10:32  9:20 spamd child
> root     10382  0.7  6.4 167376 133004 ?  S   10:32  0:16 spamd child
>
> I'm sure pasting that to a message screwed everything up, so you can
> also see them at http://daubnet.dyndns.org:3000/foo/spamassassin
>
> For some reason, one of the processes switches from being owned by root
> to owned by nobody. Its state also changes from S to R. The only way I
> can clear this is by killing all spamd processes and restarting the
> service. I was initially using bayes, but thought that might have
> something to do with it so I disabled it. This made no difference.
>
> I've tried everything I can think of but nothing makes any difference. I
> have searched the archives and can't seem to find a solution. I know the
> list has heard this a million times, but I have changed nothing as far
> as settings in months :)
>
> Any suggestions?
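
The second listing above also shows the bad child by memory alone: its RSS is several times that of its siblings. As a rough sketch of how one might pick it out of `ps aux` output automatically (the helper names `parse_ps_aux` and `memory_outliers` and the factor of 3 are assumptions for illustration, not part of spamd or any existing tool):

```python
from statistics import median
from typing import List, NamedTuple


class PsRow(NamedTuple):
    user: str
    pid: int
    cpu: float
    rss_kb: int
    state: str
    command: str


def parse_ps_aux(text: str) -> List[PsRow]:
    """Parse `ps aux` rows:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND."""
    rows = []
    for line in text.splitlines():
        f = line.split(None, 10)
        if len(f) < 11 or not f[1].isdigit():
            continue  # skip the header row and truncated lines
        rows.append(PsRow(f[0], int(f[1]), float(f[2]), int(f[5]), f[7], f[10]))
    return rows


def memory_outliers(rows: List[PsRow], factor: float = 3.0) -> List[PsRow]:
    """Return rows whose RSS exceeds `factor` times the median RSS."""
    if not rows:
        return []
    typical = median(r.rss_kb for r in rows)
    return [r for r in rows if r.rss_kb > factor * typical]
```

Fed the six rows of the pegged listing, the median RSS is about 138 MB, so only PID 10381 crosses the 3x line. Comparing against siblings rather than a fixed limit avoids guessing what a normal spamd child size is on a given box.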



Re: 3.0.3 uses all CPUs after tie

2005-06-02 Thread Matthew Daubenspeck
On Thu, Jun 02, 2005 at 11:40:39AM -0700, Justin Mason wrote:
> can you repro this reliably?  if so, output from -D and/or an strace
> -f -p $spamdpid would be helpful.

It randomly happens after an hour or so of use. Next time it happens I
will try both and send it to the list.

> where does tie come in? (from the subj line).

Whoops. That should have been time :)


Re: 3.0.3 uses all CPUs after tie

2005-06-02 Thread Thomas Jacob
> It randomly happens after an hour or so of use. Next time it happens I
> will try both and send it to the list.

To follow up on the Debian thread with the same problem:

Since this seems to have happened to several people over the last few
days, could it be that this is not in fact exim/exiscan related, but some
sort of bug in, or attack on, spamassassin/perl through spam containing
certain triggers, causing buffer overflows?

I analyzed our scanning logs a bit today, from the times when the memory
usage exploded, and there was nothing unusual about the size or number
of scanned mails.

