Re: Re[2]: [qmailtoaster] SA issue - again

2006-12-23 Thread Peter Peltonen

On 12/23/06, Peter Peltonen [EMAIL PROTECTED] wrote:

According to the data that sar has collected, the CPU has been at
least 80 % idle the whole time. top showed mysql and spamd using
0-20 % of the CPU at times, never much more.

After restarting spamd I have not been able to reproduce this.

I'll continue watching the system.


Unfortunately approx 24h after the restart the problem occurred again.

2006-12-23 17:30:43.646384500 [2074] warn: spamd: timeout: (300 second
timeout while trying  to PROCESS) at /usr/bin/spamd line 1686, GEN47
line 4.

As today is Saturday, not much load has been hitting the server.
The processors are mostly idle.

What I also noticed was a lot of httpd processes (about 50, when we
normally have about 10). After I issued `service httpd stop` they took
about 10-20 seconds to stop.

There are no other errors in any logs.

I have now disabled autolearn; we'll see if that helps. If not, I'll
try rebooting the server.

Regards,
Peter

-
QmailToaster hosted by: VR Hosted http://www.vr.org
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re[2]: [qmailtoaster] SA issue - again

2006-12-22 Thread Alexey Loukianov
Greetings, Peter.

On 22 December 2006, 18:59:10 you wrote:
 We have quite the default spamassassin setup (no extra or custom
 rules) and are using spam filtering only for a few domains. And no
 cluster setup, this is a single server.
I.e. mysql, spamassassin, apache and QmailToaster are all set up on
the same server?

 After spamd restart everything is ok so it seems that spamd is freezing mysql?
If the answer to the question above is 'yes', then that might be the
cause. If you've got system accounting enabled, try using the RedHat
tools to check what state the system was in at the moment the problems
occurred. If not, wait for the trouble to show up again and check
the ps, top, vmstat and iostat output. I suspect that spamd is
eating up all the CPU and thus blocking mysql.
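
One way to test the "spamd is eating the CPU" theory the next time the hang occurs is to total CPU usage per command name. The sketch below is only an illustration: it assumes the text comes from `ps -eo pcpu,comm`, and the helper name `cpu_by_command` is made up for this example. Note that the %CPU figure from ps is averaged over each process's lifetime, so top or sar remain the better tools for short spikes.

```python
from collections import defaultdict


def cpu_by_command(ps_output):
    """Sum the %CPU column per command name.

    Assumes input in the shape produced by `ps -eo pcpu,comm`:
    a header line followed by rows of "<percent> <command>".
    """
    totals = defaultdict(float)
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # blank or truncated row
        try:
            cpu = float(fields[0])
        except ValueError:
            continue  # header line ("%CPU COMMAND") or malformed row
        totals[fields[1].strip()] += cpu
    return dict(totals)


# Possible live usage (not run here):
#   import subprocess
#   out = subprocess.run(["ps", "-eo", "pcpu,comm"],
#                        capture_output=True, text=True).stdout
#   print(cpu_by_command(out).get("spamd", 0.0))
```

If the spamd total stays low while mail is stuck, as sar suggested on Peter's server, CPU starvation is probably not the cause.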

-- 
Best regards,
 Alexey Loukianov  mailto:[EMAIL PROTECTED]
 System Engineer,
 IT Department,
 Lavtech Corp





Re[2]: [qmailtoaster] SA issue - again

2006-12-22 Thread Alexey Loukianov
Greetings, Peter.

On 22 December 2006, 18:59:10 you wrote:
 If someone has an idea how to debug mysql to find out what is really
 causing the hangups, please let me know.
Forgot to say: try disabling the Bayes engine of SA and check whether
it helps. In my case the issue went away as soon as I disabled Bayes.

To disable the SA Bayes engine, add the following to local.cf:
use_bayes 0
use_bayes_rules 0
bayes_auto_learn 0
bayes_auto_expire 0
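
After editing local.cf it is easy to leave one of the four options out, or to have a later line turn it back on. The following is a minimal sketch of a checker, not part of SpamAssassin: the function names are invented for this example, and the comment handling is simplified (everything after `#` is dropped, ignoring the `\#` escape).

```python
REQUIRED_OFF = (
    "use_bayes",
    "use_bayes_rules",
    "bayes_auto_learn",
    "bayes_auto_expire",
)


def bayes_settings(config_text):
    """Return the Bayes-related options found in SpamAssassin config text.

    Later lines overwrite earlier ones, so the last value wins.
    """
    settings = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        if name.startswith("bayes_") or name.startswith("use_bayes"):
            settings[name] = value.strip()
    return settings


def bayes_fully_disabled(config_text):
    """True only if all four options from the message above are set to 0."""
    found = bayes_settings(config_text)
    return all(found.get(name) == "0" for name in REQUIRED_OFF)
```

To use it, read the contents of your local.cf (its location varies by installation, so the path is left to the reader) and pass the text to `bayes_fully_disabled`. spamd still has to be restarted for the change to take effect.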

-- 
Best regards,
 Alexey Loukianov  mailto:[EMAIL PROTECTED]
 System Engineer,
 IT Department,
 Lavtech Corp





Re[2]: [qmailtoaster] SA issue - again

2006-12-20 Thread Alexey Loukianov
Greetings, Eric.

On 19 December 2006, 20:05:37 you wrote:

 Alexey Loukianov wrote:
 Hello all,
 
 I'm forced to run a bunch of dedicated SA servers to be able to
 handle processing of all the incoming mail to the corporate servers.
 All the SA hosts use the same HA mysql as the Bayes storage DB, and
 a simple kind of load balancing for SA is achieved with the built-in
 spamc functionality.
 
 From time to time some of the SA servers get stuck. This shows up in
 the logs like this:
 
 # qmlog -s @4000458765251acf0b74.s spamd | grep -E 'error|warn'
 2006-12-19 00:53:00.508318500 [10802] warn: spamd: timeout: (300 second 
 timeout while trying to PROCESS) at /usr/bin/spamd line 1686, GEN6356 line 
 254.
 2006-12-19 00:53:14.979634500 [11224] warn: spamd: timeout: (300 second 
 timeout while trying to PROCESS) at /usr/bin/spamd line 1686, GEN6395 line 
 337.
 2006-12-19 00:53:20.724340500 [11225] warn: spamd: timeout: (300 second 
 timeout while trying to PROCESS) at /usr/bin/spamd line 1686, GEN6400 line 
 319.
 2006-12-19 00:53:25.781288500 [11226] warn: spamd: timeout: (300 second 
 timeout while trying to PROCESS) at /usr/bin/spamd line 1686, GEN6403 line 
 300.
 2006-12-19 00:53:44.309364500 [10261] warn: spamd: timeout: (300 second 
 timeout while trying to PROCESS) at /usr/bin/spamd line 1686, GEN6428 line 
 319.
 2006-12-19 00:58:13.590168500 [10802] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6359 line 334.
 2006-12-19 00:58:13.590507500 [10802] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6359 line 334.
 2006-12-19 00:58:16.081281500 [11224] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6401 line 664.
 2006-12-19 00:58:16.081622500 [11224] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6401 line 664.
 2006-12-19 00:58:24.804143500 [11225] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6401 line 253.
 2006-12-19 00:58:24.804156500 [11225] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6401 line 253.
 2006-12-19 00:58:33.883826500 [11226] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6406 line 321.
 2006-12-19 00:58:33.883837500 [11226] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6406 line 321.
 2006-12-19 00:58:43.214455500 [10261] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6430 line 1371.
 2006-12-19 00:58:43.214841500 [10261] error: child processing timeout at 
 /usr/bin/spamd line 1085, GEN6430 line 1371.
 
 From the moment the first warning shows up in the logs until spamd
 is restarted by hand, processing is stuck, and all the spamd
 processes die after a 300 sec timeout, with a corresponding message in
 the logs.
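
To see which spamd children are affected and how long each one hangs, the timeout entries can be grouped by PID. This is a rough sketch, assuming the log lines look exactly like the qmlog output quoted above (timestamp, PID in brackets, then `warn:` or `error:`); the function names are invented for this example. Wrapped continuation lines carry no timestamp and are skipped.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the first physical line of an entry, e.g.
#   2006-12-19 00:53:00.508318500 [10802] warn: spamd: timeout: (300 second
LINE_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \[(\d+)\] (warn|error): (.*)$"
)


def summarize_timeouts(lines):
    """Group timeout entries by PID.

    Returns {pid: {"warn": [datetime, ...], "error": [datetime, ...]}}.
    Fractional seconds are dropped.
    """
    events = defaultdict(lambda: {"warn": [], "error": []})
    for line in lines:
        m = LINE_RE.match(line)
        if not m or "timeout" not in m.group(4):
            continue  # continuation line or unrelated entry
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        events[int(m.group(2))][m.group(3)].append(stamp)
    return dict(events)


def first_gap_seconds(entry):
    """Seconds from a child's first warning to its first error, or None."""
    if not entry["warn"] or not entry["error"]:
        return None
    return (min(entry["error"]) - min(entry["warn"])).total_seconds()
```

For PID 10802 in the log above, the first warning is at 00:53:00 and the first error at 00:58:13, a gap of about 313 seconds.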
 
 I know that this issue is related to the Bayes rules, because if I turn
 them off in local.cf no hangs happen. Some time ago E.S. mentioned on
 the list that there's an issue in the current SA that might cause such
 timeouts, and that it's connected with the SA Bayes autoexpire function.
 No problem: I turned autoexpire off in local.cf, restarted spamd, headed
 over to crontab and set up an hourly job to force token expiry.
 
 Nevertheless, after about 11 hours the hang happened again.
 
 Is anybody else experiencing a similar issue?
 

 I haven't seen this problem since turning autoexpire off.

 Sounds like a mysql problem (either getting to or w/in). Any indications on
 the mysql side?
The logs show that this is not a mysql problem. We've got an HA cluster
for mysql here, with a Hitachi TagmaStore AMS200 as its storage, so
there should be no problems with mysql in any case. The problems
are SA related; it looks like some kind of bug in it, as a simple restart
of spamd fixes things for a while.

 P.S. The new qmlog has grep built in. Perhaps I should've made it egrep?
I'm still using an older QMTP RPM, and I think there's no need for qmlog
to gain excess functionality. It's against The Unix Way (TM), and it
will always be easier for an experienced sysadmin to pipe to grep/egrep
than to try to remember the correct options for every utility he or she
uses.

-- 
Best regards,
 Alexey Loukianov  mailto:[EMAIL PROTECTED]
 System Engineer,
 IT Department,
 Lavtech Corp

