Re: per user bayes db: auto_expiry problem, spamd child timeout, very long scantimes

2007-03-01 Thread mailinglists
 On Tue, 27 Feb 2007 [EMAIL PROTECTED] wrote:

 Some emails have a scantime of more than 900 seconds.

 I do not see a relation to a huge load on the SpamAssassin Servers
 (I have 2 of them). The timeout problems happen when there is
 small load (10 out of 20 spamds marked Busy) as well as when there
 are 45 spamds forked with 35 marked Busy.

 That really smells like swap thrashing. How much memory is in your SA
 servers, and what does procinfo / top report for swap used vs. swap
 available when things are going pear-shaped?

RAM doesn't seem to be the issue here. Both spamd boxes are equipped with
4GB of RAM. Although all of it is in use, 2.4GB and 1GB respectively are
used for disk cache. Swap space is untouched, and swap pages in/out per
second are near zero. CPU load on both boxes peaks at 10%.

Since both spamd boxes read and write their bayes data on one and the same
MySQL box, I suspect the problem lies there.
This morning I ran a manual sa-learn --force-expire --sync job for the
users who were hit by the timeout problem during the night. While the job
was running I saw several timeout errors on the bayes DB as well as on
certain spamd children.
In the coming days I will try to reproduce the problem by stressing the
SQL-based bayes db with a parallelized sa-learn --force-expire job.
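
A minimal sketch of what such a parallelized stress run might look like,
assuming sa-learn is on the PATH and that -u selects the user whose SQL
bayes data is expired. The worker count, the 900s per-run limit and the
helper names are my own placeholders, not tested values:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(user):
    # -u/--username tells sa-learn whose SQL-backed bayes data to work on.
    return ["sa-learn", "--force-expire", "--sync", "-u", user]


def run_expire(user, timeout=900):
    # Returns (user, exit code); None means the run exceeded `timeout` seconds.
    try:
        proc = subprocess.run(build_command(user), capture_output=True,
                              timeout=timeout)
        return user, proc.returncode
    except subprocess.TimeoutExpired:
        return user, None


def run_parallel(users, workers=4, runner=run_expire):
    # Threads suffice: each worker only waits on an external process.
    # Results come back in the same order as `users`.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, users))
```

Calling run_parallel() with the list of affected users and a growing number
of workers should show at what level of concurrency the MySQL box starts to
produce timeouts.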

Philipp






 --
  John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
  [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
 ---
   Users mistake widespread adoption of Microsoft Office as the
   development of a standard document format.
 ---
  14 days until Albert Einstein's 128th Birthday






per user bayes db: auto_expiry problem, spamd child timeout, very long scantimes

2007-02-27 Thread mailinglists
Hi all

I run a site with more than 2000 mailboxes using Postfix, SA 3.1.8 and
procmail. Every user has his own bayes db. allow_user_rules is
disabled.

I have a number of problems:

A number of emails pass spamd unfiltered due to a spamd child timeout.
Looking at the scantime, it is often far more than the 220s defined as
the timeout value. Some emails have a scantime of more than 900 seconds.
Although I use SARE rules, I do not blame them, because I already had this
problem with SA 3.0|1.x.
It is possible that this problem is linked to the second problem: I have a
timeout on auto_expiry.
To address both issues I followed the hints and tips that were discussed
here not long ago. Yesterday I disabled auto_expiry and now run
sa-learn --force-expire --sync manually for those users that are affected
by the expiry problem. I cannot possibly run a force-expire job from a daily
cron job for all users; that would simply use up all 24 hours of the day.
I have also noticed that some users have 1 to 2 million tokens in their
bayes db. A number between 150k and 200k is normal.
The default for bayes_expiry_max_db_size is 150,000 and I haven't changed
this value.
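
To put rough numbers on this, here is a small sketch. It works out how much
time each user would get if all expiry runs had to fit into one day, and
picks out the users whose token count is above the limit. The
(username, token_count) pairs are assumed to come from the bayes store; with
the stock SQL schema I believe that would be the bayes_vars table, but that
is an assumption, and the function names are made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_user(num_users, window_seconds=SECONDS_PER_DAY):
    # Time budget per user if all expiry runs happen one after another.
    return window_seconds / num_users


def users_over_limit(rows, max_db_size=150_000):
    # rows: iterable of (username, token_count) pairs.
    # Returns the users above the limit, largest token count first.
    over = [(name, count) for name, count in rows if count > max_db_size]
    return sorted(over, key=lambda row: row[1], reverse=True)
```

For 2000 users, seconds_per_user(2000) gives 43.2 seconds each, so a
sequential nightly run only works if the job is restricted to the users
returned by users_over_limit().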

What are the possible reasons why auto_expiry wouldn't expire such a huge
number of tokens?

I do not see a relation to a huge load on the SpamAssassin Servers (I have
2 of them). The timeout problems happen when there is small load (10 out
of 20 spamds marked Busy) as well as when there are 45 spamds forked with
35 marked Busy.

I wonder if I have to migrate from bayes db per user to a site-wide bayes
db. What would change?

In particular, these are the error messages:
spamd[27428]: child processing timeout at spamd line 1086, GEN209 line 503.
spamd[3692]: bayes: expire_old_tokens: child processing timeout at spamd line 1086, GEN245 line 56.
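
To see how often each of the two kinds of timeout occurs, the log lines
could be tallied with something like the sketch below. The pattern is
derived only from the two messages above, so other variants of the message
may not match, and the labels "scan" and "expiry" are my own:

```python
import re
from collections import Counter

# Matches "spamd[PID]: <optional context>child processing timeout".
TIMEOUT_RE = re.compile(
    r"spamd\[(?P<pid>\d+)\]: (?P<context>.*?)child processing timeout"
)


def classify(line):
    # Returns "expiry" for timeouts inside expire_old_tokens, "scan" for
    # other child timeouts, and None for lines that are not timeouts.
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    return "expiry" if "expire_old_tokens" in match.group("context") else "scan"


def count_timeouts(lines):
    # Tally timeout kinds over an iterable of log lines.
    return Counter(kind for kind in map(classify, lines) if kind)
```

Feeding the mail log through count_timeouts() per day would show whether the
plain scan timeouts rise and fall together with the expiry timeouts.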

Thank you very much in advance for any hints. I'd be really grateful.

Philipp




Re: per user bayes db: auto_expiry problem, spamd child timeout, very long scantimes

2007-02-27 Thread John D. Hardin
On Tue, 27 Feb 2007 [EMAIL PROTECTED] wrote:

 Some emails have a scantime of more than 900 seconds.
 
 I do not see a relation to a huge load on the SpamAssassin Servers
 (I have 2 of them). The timeout problems happen when there is
 small load (10 out of 20 spamds marked Busy) as well as when there
 are 45 spamds forked with 35 marked Busy.

That really smells like swap thrashing. How much memory is in your SA 
servers, and what does procinfo / top report for swap used vs. swap 
available when things are going pear-shaped?

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Users mistake widespread adoption of Microsoft Office as the
  development of a standard document format.
---
 14 days until Albert Einstein's 128th Birthday