Re: Do you experience problems with 3.1.8?

2007-03-12 Thread maillist

Michał Jęczalik wrote:

Hello,

after upgrading from 3.1.7 I have numerous problems with my spamd. It 
hangs under high load and becomes permanently unresponsive. 
Following advice I found on the devel list, I'm using 
--round-robin now and it hangs less often. But now I have a lot of 
~/.spamassassin/bayes_toks.expire[pid] lockfiles that don't disappear 
and quickly eat up the user's quota. Interestingly, on another host 
with similar load everything works OK. Anyway - am I the 
only one experiencing these problems? There's no mention of it on the 
devel list and none here - what's wrong? :) In this situation 
3.1.8 is quite unusable for me and I'm thinking about downgrading. The 
only reason I haven't done it already is that I'm not sure it is 
a simple task - my users won't stand another spamassassin blackout 
after the numerous spam floods caused by those hang-ups in the past 
couple of days. ;-)

How did you upgrade?
What OS?
What MDA?
When you say hangs what do you mean?


Re: Do you experience problems with 3.1.8?

2007-03-12 Thread Michał Jęczalik

On Mon, 12 Mar 2007, maillist wrote:


How did you upgrade?


perl Makefile.PL etc ;-)


What OS?


Linux 2.4


What MDA?


It is completely unrelated to the MDA. I invoke spamd with inetd and spamc 
with procmail, but the problem is in spamd itself. One could probably 
reproduce it by feeding messages manually to spamc. As far as I can tell 
from the devel list, the guys there are aware of this problem, but they 
seem to be satisfied with the temporary (?) solution of --round-robin so 
far. But it doesn't fix the problem, it just seems to make it happen less 
often.


Oh, I've just noticed it died again. Well, killall spamd... ;-)


When you say hangs what do you mean?


This is what I mean:

 5707 ?  Ss  0:02 /usr/bin/perl -T -w /usr/bin/spamd --max-children=14 --round-robin
 5805 ?  R  58:05 spamd child
 5826 ?  S   3:10 spamd child
 5851 ?  R  31:03 spamd child
 5862 ?  R  26:19 spamd child
 5873 ?  R  26:11 spamd child
 5882 ?  R  26:09 spamd child
15341 ?  R  18:15 spamd child
17651 ?  R  16:09 spamd child
22972 ?  R  16:16 spamd child
 9744 ?  R  10:47 spamd child
14581 ?  S   1:37 spamd child
18379 ?  R  10:18 spamd child
21493 ?  R   7:21 spamd child
24789 ?  R   6:43 spamd child

And a nice bunch of spamc processes - some probably hung waiting for output 
from spamd, and some continuously trying to connect and feed in incoming 
mail (and giving up after some retries, passing the message on unscored).


A last sane response of every spamd's child is processing message 
--
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04



Re: Do you experience problems with 3.1.8?

2007-03-12 Thread maillist

make uninstall
perl Makefile.PL etc ;-)

Sorry man, I'm stumped.  It just seems like it must be an issue with the 
upgrade.


-=Aubrey=-


Re: Do you experience problems with 3.1.8?

2007-03-12 Thread Daryl C. W. O'Shea

Michal Jeczalik wrote:

On Mon, 12 Mar 2007, Daryl C. W. O'Shea wrote:

This has nothing to do with 3.1.8 specifically.  The same thing would 
happen with 3.1.7.  Reverting to an earlier SA version will do nothing 
for you.


spamd isn't hanging up, it's doing bayes expiries, as you can tell 
from having the bayes_toks.expire* lock files left after you kill off 
the child process(es) doing the expiry.  Since you're killing off the 
expiries before they complete, this will (of course) keep happening.


If your system is too loaded to deal with bayes auto expiries, disable 
bayes_auto_expire and then schedule them to be done via a cron job 
using sa-learn --force-expire -u username.
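
A minimal sketch of that cron approach, assuming auto expiry has been turned off with a "bayes_auto_expire 0" line in the SpamAssassin configuration. The helper name expire_all, the SA_LEARN variable and the user names are hypothetical; only the sa-learn --force-expire -u invocation comes from the advice above.

```shell
#!/bin/sh
# Hypothetical cron helper: force Bayes expiry for one user at a time,
# so that only a single expiry is ever running on the box.
# SA_LEARN is an assumption; point it at your sa-learn binary if needed.
SA_LEARN="${SA_LEARN:-sa-learn}"

expire_all() {
    # Each argument is a username; the loop runs the expiries serially.
    for user in "$@"; do
        "$SA_LEARN" --force-expire -u "$user" \
            || echo "expiry failed for $user" >&2
    done
}

# Usage (user names are placeholders): expire_all alice bob
```

Running the expiries one after another from a nightly job, rather than letting each spamd child start its own, keeps memory use bounded to a single expiry at a time.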


OK, I'll try disabling autoexpire, but the fact is that I had no 
problems with 3.1.7.


No changes were made to how expiries are done between 3.1.7 (and a lot 
further back) and 3.1.8.  It's most likely just that as time has gone on 
more of your users' databases have become ready for expiry, whereas 
before expiries were less frequent and thus manageable.


Daryl



Re: Do you experience problems with 3.1.8?

2007-03-12 Thread Daryl C. W. O'Shea

Michal Jeczalik wrote:

BTW - if it hangs up, it hangs up *completely* until I restart it. If it 
goes down at midnight, then spamd is unresponsive until 8am when I get up 
and do something. There are no log messages during this period. It's 
*dead* in the full meaning of the word. :) So I'm not as sure as you are 
that it's only a matter of auto expiry - would a single auto-expire task 
lock up a frontend process for so long?!


If it's as busy as you said it was (it hangs up during high load) and 
all or most of the children are trying to do expiries, it could take months 
to complete, especially if you don't have the physical memory to do it 
(read: a whole lot of RAM if multiple expiries are happening).


Disable auto expiry, do serialized expiries via cron, and see if the 
problem stops.  Actually, you don't even need to do the expiries to stop 
the problem, just disable auto expiries.  If spamd stops hanging then 
it's the auto expiries causing the problem.


Experience tells me that if the spamd children are actually using CPU 
time and they're not spewing errors all over your syslog, then it's an 
expiry issue.
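
One rough way to check for interrupted expiries is to look for the leftover bayes_toks.expire* files mentioned earlier in the thread. The sketch below assumes home directories live under /home and that each user's Bayes files sit in ~/.spamassassin; the function name list_expire_files is hypothetical.

```shell
#!/bin/sh
# Sketch: list leftover Bayes expiry files below a base directory.
# The /home default is an assumption; the file name pattern is the
# bayes_toks.expire[pid] form reported in the thread.
list_expire_files() {
    base="${1:-/home}"
    # Match <base>/.../.spamassassin/bayes_toks.expire<pid>, regular files only.
    find "$base" -path '*/.spamassassin/bayes_toks.expire*' -type f 2>/dev/null
}

# Usage: list_expire_files /home
```

Files that persist after the spamd children have been killed suggest an expiry was cut short, which fits the explanation given above.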



Daryl