Re: spamd keeps running at 99% CPU until i kill the process

2007-09-11 Thread Stuart Gall


On 11 Sep 2007, at 05:15, Thomas Schulz wrote:


If I remember correctly, in 3.2.x the logic was changed to have the child
process the mail first and then do the expire.  I don't know if that would
have fixed your problem or not.


This problem is with 3.2.3.
I noticed that there were always two spamd processes at 99%, so perhaps two
processes try to do the expire simultaneously and block?


Setting bayes_auto_expire to 0 in local.cf has fixed the problem.
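
If two processes really do attempt the expire at the same time, a manual or cron-driven expiry can at least be serialised with a lock. The following is only a sketch: the lock path and the function name run_exclusive are illustrative and not part of SpamAssassin, and a stale lock directory left behind by a crash would have to be removed by hand.

```shell
#!/bin/sh
# Run a command only if no other instance currently holds the lock.
# mkdir is atomic, so only one caller can succeed in creating the directory.
# LOCKDIR and run_exclusive are illustrative names, not SpamAssassin features.
LOCKDIR="${LOCKDIR:-/tmp/sa-expire.lock}"

run_exclusive() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        "$@"                      # run the command given as arguments
        rc=$?
        rmdir "$LOCKDIR"          # release the lock
        return "$rc"
    else
        echo "another expire appears to be running, skipping" >&2
        return 75                 # arbitrary "try again later" status
    fi
}

# Usage (assumes sa-learn is on PATH):
#   run_exclusive sa-learn --force-expire
```

A second caller returns immediately with status 75 instead of starting a competing expire.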




Tom Schulz
Applied Dynamics Intl.
[EMAIL PROTECTED]





Re: spamd keeps running at 99% CPU until i kill the process

2007-09-11 Thread Micke Andersson

My comment referred purely to the SMTP error "421 SMTP incoming data
timeout - message abandoned", which on one of my SMTP servers was solved by
the fix I described (disabling TCP window scaling).
I have only ever seen that particular error on an Exim4 server; none of my
Sendmail or Postfix installations suffer from it!
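
The tcp_window_scaling setting discussed in this thread can be inspected before it is changed, so the change is easy to revert. A small sketch, assuming the Linux procfs path named in the thread; the function name and its optional path argument are illustrative:

```shell
#!/bin/sh
# Report the current TCP window scaling setting (1 = enabled, 0 = disabled).
# The default path is the Linux procfs entry mentioned in the thread;
# window_scaling_state and its optional argument are illustrative.
window_scaling_state() {
    f="${1:-/proc/sys/net/ipv4/tcp_window_scaling}"
    if [ -r "$f" ]; then
        case "$(cat "$f")" in
            1) echo "enabled" ;;
            0) echo "disabled" ;;
            *) echo "unknown" ;;
        esac
    else
        echo "unavailable"
    fi
}

window_scaling_state
# To change it until the next reboot (as root):
#   echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
```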

/Micke


Re: spamd keeps running at 99% CPU until i kill the process

2007-09-10 Thread Stuart Gall


On 30 Aug 2007, at 16:55, Micke Andersson wrote:


Richard Hobbs wrote:

Hello,

To add information to this problem, it appears that spamd does
eventually give up after 5 minutes - which then bounces the message back
to the sender stating:

  421 SMTP incoming data timeout - message abandoned

Obviously, this cannot keep happening, but i don't know how to stop it...


Any advice greatly appreciated.

Thanks again,
Richard.


Hi, I have had exactly the same problem as you, with about the very
same setup as you!
The problem was actually a TCP problem, and not a SpamAssassin problem!
From a bunch of tcpdumps from my side and a good partner's side, we
finally tracked the problem down to TCP window scaling
(tcp_window_scaling), which is set to 1 (true) on Debian and seems to
cause a problem for Exim. My solution was to set this parameter to 0
(false).

Easiest done by
echo 0 > /proc/sys/net/ipv4/tcp_window_scaling

Furthermore, this seems to be a problem mostly with old versions of
Sendmail, when they were sending to us and the email had some kind of
attachment as well!
(I don't really know where the bug is, whether it is in Exim or TCP; I
found a solution and am pleased with that.)


Hope this helps you out, a bit off topic though!



This may have fixed the problem simply by severely reducing the data
transmission speed.
With window scaling off, your maximum window size is a mere 64K,
hardly enough these days.
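
To put a number on the 64K limit: a single TCP connection cannot carry more than one window of data per round trip, so the ceiling is roughly window size divided by round-trip time. A rough sketch of that arithmetic; the function name and the example round-trip times are illustrative assumptions:

```shell
#!/bin/sh
# Upper bound on single-connection TCP throughput: window / round-trip time.
# 65535 bytes is the largest window that can be advertised without scaling.
# max_bps and the example RTT values are illustrative.
max_bps() {
    rtt_ms="$1"
    window_bytes=65535
    # bits per second = bytes * 8 / (rtt_ms / 1000)
    echo $(( window_bytes * 8 * 1000 / rtt_ms ))
}

max_bps 100   # 100 ms round trip
max_bps 10    # 10 ms round trip
```

At a 100 ms round trip this comes to about 5.2 Mbit/s, and at 10 ms to about 52 Mbit/s.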


I have had this problem since upgrading to the latest SA.
But at the same time I had also taken the opportunity to add a load
of spam and ham to bayes.


It makes much more sense that spamd is trying to do a bayes expire
and taking too long.

sa-learn --force-expire took a good 4 minutes.

Setting auto expire off as suggested in this thread has fixed the
problem for me.


THEREFORE
The actual problem must be that the bayes expire takes longer than
spamd's --timeout-child setting: each new spamd child tries to do a
bayes expire but never has enough time to complete it.
Otherwise one would expect just a single spamd to go to 99%, not every
spamd.


Right?

Perhaps the bayes auto expire code should be moved to the parent
process to prevent this problem.










Re: spamd keeps running at 99% CPU until i kill the process

2007-09-03 Thread Richard Hobbs
Hello,

Why is that? Surely updates are released for current, stable versions of
spamassassin? Most of the world can't run SVN version! lol

Any ideas why this might be?

Thanks again,
Richard.


[EMAIL PROTECTED] wrote:
 However, if there are no updates
 
 Which I bet you will be surprised to find will always be the case
 these days, unless you use yet unreleased SVN versions of spamassassin.
 
 _
 This e-mail has been scanned for viruses by Verizon Business Internet Managed 
 Scanning Services - powered by MessageLabs. For further information visit 
 http://www.verizonbusiness.com/uk
 
 

-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Speech Technology Group
Email: [EMAIL PROTECTED]
Web: http://www.toshiba-europe.com/research/
Tel: +44 1223 376964    Mobile: +44 7811 803377


Re: spamd keeps running at 99% CPU until i kill the process

2007-09-03 Thread Justin Mason

Rules for the SVN version are promoted and demoted automatically, and
updates are generated without human intervention.  However, it's
potentially unstable, and could cause false positives.  We haven't yet had
the time to work out an agreed way to implement a safe version of
automatic update generation for the stable tree, so for now, the stable
tree is reliant on committers hand-copying the rules -- and this hasn't
been happening much recently.

--j.



Re: spamd keeps running at 99% CPU until i kill the process

2007-09-03 Thread Daryl C. W. O'Shea

Richard Hobbs wrote:

Hello,

Michael Parker wrote:

Richard Hobbs wrote:

Hello,

Could the size of bayes_seen and bayes_toks be causing this timeout?


No, those aren't really that big, but it does look like you have an
expiration problem.


I've heard that this file should be around 10MB on a standard system,
so surely 80MB is huge?


No.


Also, our mail server deals with around
1000-1500 emails per day, and is a single-CPU Pentium(R) 4 CPU 2.26GHz
with 512MB RAM - should this be able to cope?


Certainly.


Well, i've just run sa-learn --force-expire manually, and despite
saying it had removed a whole load of entries from the database, the
bayes_toks and bayes_seen files are still the same size (80MB and 10MB,
respectively).


They're sparse files, so they're not going to shrink in reported size.
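
One way to see the difference between a file's reported size and the disk space it actually occupies is to compare the two directly. A sketch using only standard tools; size_report is an illustrative helper name, and the demonstration uses a throwaway file rather than the real Bayes databases:

```shell
#!/bin/sh
# Compare a file's apparent size with the disk space allocated to it.
# wc -c gives the apparent size in bytes; du -k gives allocated kilobytes.
# size_report is an illustrative helper name.
size_report() {
    f="$1"
    apparent=$(wc -c < "$f" | tr -d ' ')
    allocated_kb=$(du -k "$f" | cut -f1)
    echo "apparent=${apparent} bytes allocated=${allocated_kb} KB"
}

# Demonstration with a throwaway file of 1 MiB apparent size and no data:
demo=$(mktemp)
dd if=/dev/null of="$demo" bs=1 seek=1048576 2>/dev/null
size_report "$demo"
rm -f "$demo"
```

For the real files this would be, for example, size_report ~/.spamassassin/bayes_toks (the path is an assumption and depends on the setup).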


Daryl


Re: spamd keeps running at 99% CPU until i kill the process

2007-09-03 Thread jidanni
JM> the stable tree is reliant on committers hand-copying the rules --
JM> and this hasn't been happening much recently.

Idea: hand copy them anyway once a month or less in the interim.


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-31 Thread jidanni
 However, if there are no updates

Which I bet you will be surprised to find will always be the case
these days, unless you use yet unreleased SVN versions of spamassassin.


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-30 Thread Richard Hobbs
Hello,

What does spamassassin --lint do? I can't find the lint option in the
man page...

Thanks again,
Richard.


John D. Hardin wrote:
 On Wed, 29 Aug 2007, Richard Hobbs wrote:
 
 Would you guys also recommend putting the command /usr/bin/sa-update
 in the crontab as well?

 If so, should i be using any particular parameters? And how often should
 it be run to be worthwhile?
 
 I run it weekly...
 
 /etc/cron.weekly/spamassassin-update:
   /usr/bin/sa-update && spamassassin --lint && service spamassassin restart
 


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-30 Thread Richard Hobbs
Hello,

Don't worry - i know what lint does now, but when i run it, i get this:

mail:/etc/cron.weekly# spamassassin --lint
[25465] warn: config: failed to parse line, skipping: rewrite_subject 1
[25465] warn: lint: 1 issues detected, please rerun with debug enabled
for more information
mail:/etc/cron.weekly#

But, if i remove the rewrite_subject 1 line, it stops rewriting the
subject line (i think)!

Any suggestions?

Thanks again,
Richard.




Re: spamd keeps running at 99% CPU until i kill the process

2007-08-30 Thread Richard Hobbs
Hello,

Sorry for so many emails, but it's sorted - removing "rewrite_subject 1"
from the config does not stop the subject line from being rewritten, so
i'm done!

However, with regards to the cronjob, i'd like some intelligent
reporting if possible, so i've constructed this:

==
echo "Updating rules..." && /usr/bin/sa-update &&
echo "Done, now checking config syntax..." && spamassassin --lint &&
echo "Done, now restarting spamd..." && /etc/init.d/spamassassin restart &&
echo "Done"
==

However, if there are no updates (and sa-update exits with an exit code
of 1) i would like to print "No updates, exiting" and then exit.

So i modified it to become this:

==
echo "Updating rules..." && /usr/bin/sa-update &&
echo "Done, now checking config syntax..." || echo "No updates, exiting"; exit &&
spamassassin --lint && echo "Done, now restarting spamd..." &&
/etc/init.d/spamassassin restart && echo "Done"
==

However, whether there are updates or not, the script still exits - do
you know which syntax i need to use here? (i'm using bash, btw).

Thanks again,
Richard.




Re: spamd keeps running at 99% CPU until i kill the process

2007-08-30 Thread John D. Hardin

I'd recommend rewriting that to use if ... then ... else ... fi
syntax. Pasting stuff together with && and || is only manageable when
the logic is simple and straightforward.

Perhaps (off the top of my head):

echo "Updating rules..."

# sa-update exits 0 only when new rules were downloaded and installed
if /usr/bin/sa-update
then
    echo "Done, now checking config syntax..."
    if spamassassin --lint
    then
        echo "Syntax OK, now restarting spamd..."
        /etc/init.d/spamassassin restart
        echo "Done"
    else
        echo "Syntax error in update, please manually review!"
    fi
else
    # any non-zero exit status (including "no updates") ends up here
    echo "No updates, exiting."
fi
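
The script above treats every non-zero exit status as "no updates". A possible refinement, as a sketch only: distinguish the "no updates" status (1, per this thread) from other failures. Treating every other status as an error is an assumption, and classify_update is an illustrative helper name.

```shell
#!/bin/sh
# Classify an updater's exit status: 0 = updates installed, 1 = nothing new,
# anything else = treat as an error. The 0/1 meanings follow the thread;
# the handling of other codes is an assumption. classify_update is illustrative.
classify_update() {
    "$@" >&2                      # keep the command's own output off stdout
    case $? in
        0) echo "updated" ;;
        1) echo "no-updates" ;;
        *) echo "error" ;;
    esac
}

# Usage (assumes sa-update lives at this path):
#   case "$(classify_update /usr/bin/sa-update)" in
#       updated)    spamassassin --lint && /etc/init.d/spamassassin restart ;;
#       no-updates) echo "No updates, exiting." ;;
#       error)      echo "sa-update failed, please review" >&2 ;;
#   esac
```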
 
--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  There is no doubt in my mind that millions of lives could have been
  saved if the people were not brainwashed about gun ownership and
  had been well armed. ... Gun haters always want to forget the Warsaw
  Ghetto uprising, which is a perfect example of how a ragtag,
  half-starved group of Jews took 10 handguns and made asses out of
  the Nazis.-- Theodore Haas, Dachau Survivor
---
 20 days until Talk Like a Pirate day



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-29 Thread Richard Hobbs
Hello,

Anthony Peacock wrote:
 Hi,
 
 Richard Hobbs wrote:
 Hello,

 John D. Hardin wrote:
 On Tue, 28 Aug 2007, Richard Hobbs wrote:

 Could the size of bayes_seen and bayes_toks be causing this
 timeout?
 Yes.

 If so, what can i do about this?
 Disable automatic Bayes expiry and do a manual expiration run, and
 allow it to complete.

 So what you (and Michael Parker) are saying is that it's not the
 checking of spam against the tokens that is causing the timeout, it's
 the automatic expiration of old tokens that conveniently gets tagged
 onto the same operation, right?

 If so, let me get this straight - an email comes in and goes off to
 spamd. spamd then checks the message against the tokens to determine
 whether it's spam or not, then runs an expiry of old tokens (or perhaps
 it happens the other way around), and only then returns the mail to
 exim. The expiration of old tokens takes a lot longer than the spam
 checking and as a result it's timing out.

 So, if i disable automatic expiration, spamd will only attempt one
 operation at a time, (checking of spam against the tokens) and should
 therefore not timeout, correct?

 But, if i do disable automatic expiration, i will have to remember to do
 it manually, or via cron.

 Is this all correct?
 
 Yes.  Set up expiration in a cron job, once per day is usually fine.

OK, will do - i assume sa-learn --force-expire is the command to run
via cron, right?

Also, how do i disable automatic expiration?

Thanks again,
Richard.


 Would a suitable alternative be to delete bayes_seen and bayes_toks,
 then restart spamd? I know i would be deleting everything that had been
 learned over the last period of time, but starting afresh may not be a
 bad thing, seeing as the rules in the database are probably 6 months to
 a year old now (we've not been using spamd for a year or so, because i
 broke it and had no time to fix it!).

 Please let me know your thoughts, and also let me know whether deleting
 both of those files is a good way to go.
 
 No.
 
 



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-29 Thread Anthony Peacock

Hi,

Richard Hobbs wrote:

Hello,

SNIP



Is this all correct?

Yes.  Set up expiration in a cron job, once per day is usually fine.


OK, will do - i assume sa-learn --force-expire is the command to run
via cron, right?


Yup! This is my crontab line:

00 22 * * * /usr/local/bin/sa-learn --force-expire



Also, how do i disable automatic expiration?


Set:

bayes_auto_expire  0

In your local.cf file.
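
For anyone scripting this change, a sketch of setting the option idempotently. The helper name disable_auto_expire is illustrative, and the caller supplies the path to the config file:

```shell
#!/bin/sh
# Ensure "bayes_auto_expire 0" is the only active bayes_auto_expire line
# in a SpamAssassin config file. Commented-out lines are left untouched.
# disable_auto_expire is an illustrative helper; the caller passes the path.
disable_auto_expire() {
    cf="$1"
    [ -f "$cf" ] || return 1
    tmp="$(mktemp)" || return 1
    # Copy everything except an existing active setting.
    grep -v '^[[:space:]]*bayes_auto_expire[[:space:]]' "$cf" > "$tmp" || true
    echo "bayes_auto_expire 0" >> "$tmp"
    cat "$tmp" > "$cf"            # overwrite in place, keeping owner and mode
    rm -f "$tmp"
}

# Usage (path as used in this thread):
#   disable_auto_expire /etc/spamassassin/local.cf
#   spamassassin --lint && /etc/init.d/spamassassin restart
```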



Thanks again,
Richard.



Would a suitable alternative be to delete bayes_seen and bayes_toks,
then restart spamd? I know i would be deleting everything that had been
learned over the last period of time, but starting afresh may not be a
bad thing, seeing as the rules in the database are probably 6 months to
a year old now (we've not been using spamd for a year or so, because i
broke it and had no time to fix it!).

Please let me know your thoughts, and also let me know whether deleting
both of those files is a good way to go.

No.







--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW:http://www.chime.ucl.ac.uk/~rmhiajp/
A CAT scan should take less time than a PET scan.  For a CAT scan,
 they're only looking for one thing, whereas a PET scan could result in
 a lot of things.- Carl Princi, 2002/07/19


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-29 Thread Richard Hobbs
Hello,

Thank you - this is now done! :-)

I shall see how it goes...

Richard.




Re: spamd keeps running at 99% CPU until i kill the process

2007-08-29 Thread Richard Hobbs
Hello,

Would you guys also recommend putting the command /usr/bin/sa-update
in the crontab as well?

If so, should i be using any particular parameters? And how often should
it be run to be worthwhile?

Thanks again,
Richard.




Re: spamd keeps running at 99% CPU until i kill the process

2007-08-29 Thread John D. Hardin
On Wed, 29 Aug 2007, Richard Hobbs wrote:

 Would you guys also recommend putting the command /usr/bin/sa-update
 in the crontab as well?
 
 If so, should i be using any particular parameters? And how often should
 it be run to be worthwhile?

I run it weekly...

/etc/cron.weekly/spamassassin-update:
  /usr/bin/sa-update && spamassassin --lint && service spamassassin restart

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The difference is that Unix has had thirty years of technical
  types demanding basic functionality of it. And the Macintosh has
  had fifteen years of interface fascist users shaping its progress.
  Windows has the hairpin turns of the Microsoft marketing machine
  and that's all.-- Red Drag Diva
---
 21 days until Talk Like a Pirate day



spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Richard Hobbs
Hello,

We are running SpamAssassin version 3.1.7-deb running on Perl version
5.8.4 on a Debian Sarge box with exim4.

I am using the following router:

==
sa_router:
   no_verify
   check_local_user
   # When to scan a message :
   # - it isn't already flagged as spam from Spamassassin
   # - it isn't already scanned
   # - it isn't local
   # - it isn't from one internal domain user to another
   condition = ${if and { \
 {!eq {$received_protocol}{spam-scanned}} \
 {!eq {$received_protocol}{local}} \
 {!eq {$sender_address_domain}{$domain}} \
 } \
 {1}{0}}
   driver= accept
   transport = sa_spamcheck
   local_parts = /etc/spamassassinUsers
==

and the following transport:

==
sa_spamcheck:
   driver = pipe
   command = /usr/sbin/exim4 -oMr spam-scanned -bS
   use_bsmtp = true
   transport_filter = /usr/bin/spamc
   home_directory = /tmp
   current_directory = /tmp
   user = spamcheck
   group = spamcheck
   log_output = true
   return_fail_output = true
   return_path_add = false
   message_prefix =
   message_suffix =
==

This always used to work perfectly until i upgraded spamassassin using
apt-get to the version above (which was done in order to give us access
to the sa-update command).

Now, on some emails spamd eats up 99% CPU and doesn't return until i
kill it, at which point the email is delivered as non-spam, whether it
is spam or not.

Does anyone know why this is happening? It's destroying the use of our
mail server, lol!

Also, FYI, here is a copy of /etc/spamassassin/local.cf:

==
# This is the right place to customize your installation of SpamAssassin.
#
# See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
# tweaked.
#
# Only a small subset of options are listed below
#
###

rewrite_subject 1

#   Add *SPAM* to the Subject header of spam e-mails
#
rewrite_header Subject SPAM (_HITS_/_REQD_):


#   Save spam messages as a message/rfc822 MIME attachment instead of
#   modifying the original message (0: off, 2: use text/plain instead)
#
report_safe 1


#   Set which networks or hosts are considered 'trusted' by your mail
#   server (i.e. not spammers)
#
# trusted_networks 212.17.35.
trusted_networks 192.168.3.
trusted_networks 193.128.142.


#   Set file-locking method (flock is not safe over NFS, but is faster)
#
# lock_method flock


#   Set the threshold at which a message is considered spam (default: 5.0)
#
required_score 2.5


#   Use Bayesian classifier (default: 1)
#
# use_bayes 1


#   Bayesian classifier auto-learning (default: 1)
#
# bayes_auto_learn 1


#   Set headers which may provide inappropriate cues to the Bayesian
#   classifier
#
# bayes_ignore_header X-Bogosity
# bayes_ignore_header X-Spam-Flag
# bayes_ignore_header X-Spam-Status

# WHITELIST
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
==

Thanks in advance for any assistance! It's greatly appreciated!

Richard.



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Mark Martinec
Richard,

 To add information to this problem, it appears that spamd does
 eventually give up after 5 minutes

Capture a message causing trouble from an MTA queue,
and feed it to the command-line spamassassin with the -t -D options.
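
A sketch of what that could look like, keeping the debug output (written to stderr) separate from the test report. SA_CMD, debug_scan and the file names are illustrative; only the -t and -D options come from the advice above:

```shell
#!/bin/sh
# Run a saved message through the command-line scanner in test mode (-t)
# with debug output (-D), writing the debug log to a separate file.
# SA_CMD, debug_scan and the file names are illustrative.
SA_CMD="${SA_CMD:-spamassassin}"

debug_scan() {
    msg="$1"
    log="$2"
    "$SA_CMD" -t -D < "$msg" 2> "$log"
}

# Usage:
#   debug_scan /tmp/suspect.eml /tmp/suspect.debug > /tmp/suspect.report
#   tail /tmp/suspect.debug    # what the scanner was doing last
```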

  Mark


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Richard Hobbs
Hello,

Mark Martinec wrote:
 Richard,
 
 To add information to this problem, it appears that spamd does
 eventually give up after 5 minutes
 
 Capture a message causing trouble from an MTA queue,
 and feed it to the command-line spamassassin with the -t -D options.

I would love to do this, but it's not the same messages all the time -
in a batch of identical messages, some get through and others cause the
hanging. It seems to be completely random.

Do you have any other ideas?

Thanks again,
Richard.



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Richard Hobbs
Hello,

To add information to this problem, it appears that spamd does
eventually give up after 5 minutes - which then bounces the message back
to the sender stating:

  421 SMTP incoming data timeout - message abandoned

Obviously, this cannot keep happening, but i don't know how to stop it...

Any advice greatly appreciated.

Thanks again,
Richard.


Richard Hobbs wrote:
 Hello,
 
 We are running SpamAssassin version 3.1.7-deb running on Perl version
 5.8.4 on a Debian Sarge box with exim4.
 
 I am using the following router:
 
 ==
 sa_router:
     no_verify
     check_local_user
     # When to scan a message :
     # - it isn't already flagged as spam from Spamassassin
     # - it isn't already scanned
     # - it isn't local
     # - it isn't from one internal domain user to another
     condition = ${if and { \
       {!eq {$received_protocol}{spam-scanned}} \
       {!eq {$received_protocol}{local}} \
       {!eq {$sender_address_domain}{$domain}} \
       } \
       {1}{0}}
     driver = accept
     transport = sa_spamcheck
     local_parts = /etc/spamassassinUsers
 ==
 
 and the following transport:
 
 ==
 sa_spamcheck:
     driver = pipe
     command = /usr/sbin/exim4 -oMr spam-scanned -bS
     use_bsmtp = true
     transport_filter = /usr/bin/spamc
     home_directory = /tmp
     current_directory = /tmp
     user = spamcheck
     group = spamcheck
     log_output = true
     return_fail_output = true
     return_path_add = false
     message_prefix =
     message_suffix =
 ==
 
 This always used to work perfectly until i upgraded spamassassin using
 apt-get to the version above (which was done in order to give us access
 to the sa-update command).
 
 Now, on some emails spamd eats up 99% CPU and doesn't return until i
 kill it, at which point the email is delivered as non-spam, whether it
 is spam or not.
 
 Does anyone know why this is happening? It's destroying the use of our
 mail server, lol!
 
 Also, FYI, here is a copy of /etc/spamassassin/local.cf:
 
 ==
 # This is the right place to customize your installation of SpamAssassin.
 #
 # See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
 # tweaked.
 #
 # Only a small subset of options are listed below
 #
 ###
 
 rewrite_subject 1
 
 #   Add *SPAM* to the Subject header of spam e-mails
 #
 rewrite_header Subject SPAM (_HITS_/_REQD_):
 
 
 #   Save spam messages as a message/rfc822 MIME attachment instead of
 #   modifying the original message (0: off, 2: use text/plain instead)
 #
 report_safe 1
 
 
 #   Set which networks or hosts are considered 'trusted' by your mail
 #   server (i.e. not spammers)
 #
 # trusted_networks 212.17.35.
 trusted_networks 192.168.3.
 trusted_networks 193.128.142.
 
 
 #   Set file-locking method (flock is not safe over NFS, but is faster)
 #
 # lock_method flock
 
 
 #   Set the threshold at which a message is considered spam (default: 5.0)
 #
 required_score 2.5
 
 
 #   Use Bayesian classifier (default: 1)
 #
 # use_bayes 1
 
 
 #   Bayesian classifier auto-learning (default: 1)
 #
 # bayes_auto_learn 1
 
 
 #   Set headers which may provide inappropriate cues to the Bayesian
 #   classifier
 #
 # bayes_ignore_header X-Bogosity
 # bayes_ignore_header X-Spam-Flag
 # bayes_ignore_header X-Spam-Status
 
 # WHITELIST
 whitelist_from [EMAIL PROTECTED]
 whitelist_from [EMAIL PROTECTED]
 whitelist_from [EMAIL PROTECTED]
 whitelist_from [EMAIL PROTECTED]
 whitelist_from [EMAIL PROTECTED]
 whitelist_from [EMAIL PROTECTED]
 whitelist_from [EMAIL PROTECTED]
 ==
 
 Thanks in advance for any assistance! It's greatly appreciated!
 
 Richard.
 

-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Speech Technology Group
Email: [EMAIL PROTECTED]
Web: http://www.toshiba-europe.com/research/
Tel: +44 1223 376964    Mobile: +44 7811 803377


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Richard Hobbs
Hello,

Could the size of bayes_seen and bayes_toks be causing this timeout?

==
mail:/home/spamcheck/.spamassassin# ls -l
total 101060
-rw-------  1 spamcheck spamcheck  2637824 2007-08-28 12:42 auto-whitelist
-rw-------  1 spamcheck spamcheck        6 2007-02-27 16:35 auto-whitelist.mutex
-rw-------  1 spamcheck spamcheck       56 2007-08-28 12:52 bayes.lock
-rw-------  1 spamcheck spamcheck        6 2007-02-27 16:33 bayes.mutex
-rw-------  1 spamcheck spamcheck 10530816 2007-08-28 11:11 bayes_seen
-rw-------  1 spamcheck spamcheck 83738624 2007-08-28 12:52 bayes_toks
-rw-------  1 spamcheck spamcheck  1298432 2007-02-27 18:14 bayes_toks.expire1698
-rw-------  1 spamcheck spamcheck  1327104 2007-02-28 02:17 bayes_toks.expire17474
-rw-------  1 spamcheck spamcheck  2473984 2007-02-28 02:31 bayes_toks.expire17865
-rw-------  1 spamcheck spamcheck  1302528 2007-02-28 02:48 bayes_toks.expire17866
-rw-------  1 spamcheck spamcheck  1335296 2007-02-28 03:17 bayes_toks.expire18715
-rw-------  1 spamcheck spamcheck  2572288 2007-02-28 03:37 bayes_toks.expire19618
-rw-------  1 spamcheck spamcheck   655360 2007-02-28 09:19 bayes_toks.expire28928
-rw-------  1 spamcheck spamcheck  1302528 2007-08-28 10:14 bayes_toks.expire3058
-rw-------  1 spamcheck spamcheck  1302528 2007-08-28 10:34 bayes_toks.expire3059
-rw-------  1 spamcheck spamcheck  1298432 2007-02-27 17:33 bayes_toks.expire31684
-rw-------  1 spamcheck spamcheck  1302528 2007-02-27 18:42 bayes_toks.expire31685
-rw-------  1 spamcheck spamcheck  5017600 2007-08-28 11:08 bayes_toks.expire4625
-rw-------  1 spamcheck spamcheck  5013504 2007-08-28 12:35 bayes_toks.expire7150
-rw-------  1 spamcheck spamcheck  5013504 2007-08-28 12:52 bayes_toks.expire7925
-rw-r--r--  1 spamcheck spamcheck     1175 2005-07-20 14:23 user_prefs
mail:/home/spamcheck/.spamassassin#
==

If so, what can i do about this?

Thanks again,
Richard.



-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Speech Technology Group
Email: [EMAIL PROTECTED]
Web: http://www.toshiba-europe.com/research/
Tel: +44 1223 376964    Mobile: +44 7811 803377


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Michael Parker
Richard Hobbs wrote:
 Hello,
 
 Could the size of bayes_seen and bayes_toks be causing this timeout?
 

No, those aren't really that big, but it does look like you have an
expiration problem.

To solve your immediate problem you could just turn off bayes, that will
get mail flowing again and then you can address your real problem.

I'd guess that the spike in CPU you are seeing is due to bayes
expiration running and then getting killed off.  Expiration can be very
IO intensive and it locks the database while running, which can cause a
backup depending on your setup.  To make matters worse it looks like
you've got something timing out and killing off the expiration so it's
not allowed to complete.

Long term, I'd suggest turning off auto expiration and then running
sa-learn --force-expire via a cronjob.  I believe there is good
information on setting this up on the wiki.
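
As a sketch of that setup: bayes_auto_expire and sa-learn --force-expire
are the standard SpamAssassin option and command, while the cron file
name, the 03:30 schedule and the /usr/bin path are assumptions to adapt.
spamd needs a restart to pick up the local.cf change.

```
# In /etc/spamassassin/local.cf:
# stop spamd from attempting token expiry while it is scanning mail
bayes_auto_expire 0

# In /etc/cron.d/sa-expire (hypothetical file name):
# expire once a day at 03:30, as the user that owns the Bayes database
30 3 * * * spamcheck /usr/bin/sa-learn --force-expire
```

Running the job as spamcheck matters here because the database in this
thread lives under /home/spamcheck/.spamassassin.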

You can remove any *.expire* files that are older than 5 minutes, they
are left over from previously timed out expiration attempts.
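
One possible way to do that cleanup, assuming GNU find (for -maxdepth,
-mmin and -delete). The function name clean_stale_expire is
hypothetical; the file name pattern comes from the listing earlier in
the thread:

```shell
#!/bin/sh
# clean_stale_expire DIR: delete bayes_toks.expire* work files in DIR that
# were last modified more than 5 minutes ago, printing each one removed.
# Files touched more recently may belong to an expiry that is still running.
clean_stale_expire() {
    dir=$1
    find "$dir" -maxdepth 1 -type f -name 'bayes_toks.expire*' \
        -mmin +5 -print -delete
}

# Usage:
#   clean_stale_expire /home/spamcheck/.spamassassin
```

The pattern only matches the temporary work files, so bayes_toks and
bayes_seen themselves are left alone.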

Like I said, check out the wiki, you should find more information there
on the problem and possible solutions.

Michael



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Richard Hobbs
Hello,

Michael Parker wrote:
 Richard Hobbs wrote:
 Hello,

 Could the size of bayes_seen and bayes_toks be causing this timeout?

 
 No, those aren't really that big, but it does look like you have an
 expiration problem.

I've heard that this file should be around 10MB on a standard system,
so surely 80MB is huge? Also, our mail server deals with around
1000-1500 emails per day, and is a single-CPU Pentium(R) 4 CPU 2.26GHz
with 512MB RAM - should this be able to cope?

 To solve your immediate problem you could just turn off bayes, that will
 get mail flowing again and then you can address your real problem.

How effective would spam checking be with bayes disabled, though? I know
it's better than a broken solution, but just out of interest...

 I'd guess that the spike in CPU you are seeing is due to bayes
 expiration running and then getting killed off.  Expiration can be very
 IO intensive and it locks the database while running, which can cause a
 backup depending on your setup.  To make matters worse it looks like
 you've got something timing out and killing off the expiration so it's
 not allowed to complete.

That sounds about right - i have no idea what spamd is doing during
that time other than running a lot of pread system calls, but that
would make sense. I think exim gives 5 minutes to the spam checker, and
then kills it off and rejects the mail with the error in my previous mail.

 Long term, I'd suggest turning off auto expiration and then running
 sa-learn --force-expire via a cronjob.  I believe there is good
 information on setting this up on the wiki.

Well, i've just run sa-learn --force-expire manually, and despite
saying it had removed a whole load of entries from the database, the
bayes_toks and bayes_seen files are still the same size (80MB and 10MB,
respectively).
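
File size alone may not show whether anything was expired. A hedged
sketch for checking the token count instead, using the "ntokens" line
that sa-learn --dump magic prints (the helper name bayes_ntokens is
hypothetical, and it should be run as the user that owns the database):

```shell
#!/bin/sh
# bayes_ntokens: print the number of tokens currently in the Bayes database,
# taken from the third column of the "ntokens" line of `sa-learn --dump magic`.
bayes_ntokens() {
    sa-learn --dump magic | awk '/non-token data: ntokens/ { print $3 }'
}

# Usage: compare the count before and after an expiry run
#   before=$(bayes_ntokens)
#   sa-learn --force-expire
#   after=$(bayes_ntokens)
#   echo "tokens: $before -> $after"
```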

 You can remove any *.expire* files that are older than 5 minutes, they
 are left over from previously timed out expiration attempts.

Done.

 Like I said, check out the wiki, you should find more information there
 on the problem and possible solutions.

Hmm... will do, but you think it's all working perfectly, just slower
than we'd like, right? Please correct me if that statement is incorrect.

Thanks again,
Richard.



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread John D. Hardin
On Tue, 28 Aug 2007, Richard Hobbs wrote:

 Could the size of bayes_seen and bayes_toks be causing this timeout?

Yes.

 If so, what can i do about this?

Disable automatic Bayes expiry and do a manual expiration run, and 
allow it to complete.

Once that's done you *can* turn automatic expiry back on to maintain 
it, but you might just get back into the same situation. You may want 
to leave automatic expiry off, and use cron to schedule a manual 
expiry at a low-traffic time of day.
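
To see whether a manual run really is allowed to complete, and how long
it takes compared with the 5-minute timeout mentioned in this thread, a
small wrapper like the following could be used. This is only a sketch;
the function name timed_expire is hypothetical and date +%s is assumed
to be available:

```shell
#!/bin/sh
# timed_expire: run a manual Bayes expiry, then report its exit status and
# how many seconds it took. Returns the exit status of sa-learn.
timed_expire() {
    start=$(date +%s)
    sa-learn --force-expire
    status=$?
    end=$(date +%s)
    echo "sa-learn --force-expire exited $status after $((end - start))s"
    return "$status"
}

# Usage (as the user that owns the Bayes database):
#   timed_expire
```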

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Taking my gun away because I *might* shoot someone is like cutting
  my tongue out because I *might* yell Fire! in a crowded theater.
  -- Peter Venetoklis
---
 Today: Exercise Your Rights day



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Zbigniew Szalbot

Hello,

One thing - you were given excellent advice today on the exim list by
Graeme. Why don't you follow it?

quote
Anyway, this is probably your problem:
http://spamassassin.apache.org/advisories/cve-2007-0451.txt Upgrade to
3.1.8 if you can.
/quote
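
A quick way to check whether the installed version is already at 3.1.8
or later, as a sketch only. It assumes GNU sort (for -V) and that the
first line of spamassassin -V looks like "SpamAssassin version
3.1.7-deb", as reported earlier in the thread; the function names are
hypothetical:

```shell
#!/bin/sh
# sa_version: print the installed SpamAssassin version number (e.g. 3.1.7),
# parsed from the "SpamAssassin version ..." line of `spamassassin -V`.
sa_version() {
    spamassassin -V | sed -n 's/^SpamAssassin version \([0-9][0-9.]*\).*/\1/p'
}

# version_at_least HAVE WANT: succeed if version HAVE is >= version WANT.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   version_at_least "$(sa_version)" 3.1.8 || echo "upgrade recommended"
```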

On Tue, 28 Aug 2007 15:09:14 +, Richard Hobbs
[EMAIL PROTECTED] wrote:
 No, those aren't really that big, but it does look like you have an
 expiration problem.
 
 I've heard that this file should be around 10MB on a standard system,
 so surely 80MB is huge? Also, our mail server deals with around
 1000-1500 emails per day, and is a single-CPU Pentium(R) 4 CPU 2.26GHz
 with 512MB RAM - should this be able to cope?

More than enough. I handle about 26K emails per day on a Pentium III
PC with 512MB RAM, and the system is almost idle most of the time.


-- 
Zbigniew Szalbot
www.slowo.pl
www.lcwords.com



Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Anthony Peacock

Hi,

Richard Hobbs wrote:
 Hello,

 John D. Hardin wrote:
 On Tue, 28 Aug 2007, Richard Hobbs wrote:

 Could the size of bayes_seen and bayes_toks be causing this timeout?

 Yes.

 If so, what can i do about this?

 Disable automatic Bayes expiry and do a manual expiration run, and
 allow it to complete.

 So what you (and Michael Parker) are saying is that it's not the
 checking of spam against the tokens that is causing the timeout, it's
 the automatic expiration of old tokens that conveniently gets tagged
 onto the same operation, right?

 If so, let me get this straight - an email comes in and goes off to
 spamd. spamd then checks the message against the tokens to determine
 whether it's spam or not, then runs an expiry of old tokens (or perhaps
 it happens the other way around), and only then returns the mail to
 exim. The expiration of old tokens takes a lot longer than the spam
 checking and as a result it's timing out.

 So, if i disable automatic expiration, spamd will only attempt one
 operation at a time, (checking of spam against the tokens) and should
 therefore not timeout, correct?

 But, if i do disable automatic expiration, i will have to remember to do
 it manually, or via cron.

 Is this all correct?

Yes.  Set up expiration in a cron job, once per day is usually fine.

 Would a suitable alternative be to delete bayes_seen and bayes_toks,
 then restart spamd? I know i would be deleting everything that had been
 learned over the last period of time, but starting afresh may not be a
 bad thing, seeing as the rules in the database are probably 6 months to
 a year old now (we've not been using spamd for a year or so, because i
 broke it and had no time to fix it!).

 Please let me know your thoughts, and also let me know whether deleting
 both of those files is a good way to go.


No.


--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW:http://www.chime.ucl.ac.uk/~rmhiajp/
A CAT scan should take less time than a PET scan.  For a CAT scan,
 they're only looking for one thing, whereas a PET scan could result in
 a lot of things.- Carl Princi, 2002/07/19


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread Richard Hobbs
Hello,

John D. Hardin wrote:
 On Tue, 28 Aug 2007, Richard Hobbs wrote:
 
 Could the size of bayes_seen and bayes_toks be causing this timeout?
 
 Yes.
 
 If so, what can i do about this?
 
 Disable automatic Bayes expiry and do a manual expiration run, and 
 allow it to complete.

So what you (and Michael Parker) are saying is that it's not the
checking of spam against the tokens that is causing the timeout, it's
the automatic expiration of old tokens that conveniently gets tagged
onto the same operation, right?

If so, let me get this straight - an email comes in and goes off to
spamd. spamd then checks the message against the tokens to determine
whether it's spam or not, then runs an expiry of old tokens (or perhaps
it happens the other way around), and only then returns the mail to
exim. The expiration of old tokens takes a lot longer than the spam
checking and as a result it's timing out.

So, if i disable automatic expiration, spamd will only attempt one
operation at a time, (checking of spam against the tokens) and should
therefore not timeout, correct?

But, if i do disable automatic expiration, i will have to remember to do
it manually, or via cron.

Is this all correct?

Would a suitable alternative be to delete bayes_seen and bayes_toks,
then restart spamd? I know i would be deleting everything that had been
learned over the last period of time, but starting afresh may not be a
bad thing, seeing as the rules in the database are probably 6 months to
a year old now (we've not been using spamd for a year or so, because i
broke it and had no time to fix it!).

Please let me know your thoughts, and also let me know whether deleting
both of those files is a good way to go.

Thanks again,
Richard.


 Once that's done you *can* turn automatic expiry back on to maintain 
 it, but you might just get back into the same situation. You may want 
 to leave automatic expiry off, and use cron to schedule a manual 
 expiry at a low-traffic time of day.
 
 
 

-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Speech Technology Group
Email: [EMAIL PROTECTED]
Web: http://www.toshiba-europe.com/research/
Tel: +44 1223 376964    Mobile: +44 7811 803377


Re: spamd keeps running at 99% CPU until i kill the process

2007-08-28 Thread John D. Hardin
On Tue, 28 Aug 2007, Richard Hobbs wrote:

 Is this all correct?

Spot on.
 
 Would a suitable alternative be to delete bayes_seen and
 bayes_toks, then restart spamd? I know i would be deleting
 everything that had been learned over the last period of time, but
 starting afresh may not be a bad thing, seeing as the rules in the
 database are probably 6 months to a year old now (we've not been
 using spamd for a year or so, because i broke it and had no time
 to fix it!).

How accurately is Bayes classifying messages? Why zap it entirely if 
it's doing a good job?
 
 Please let me know your thoughts, and also let me know whether
 deleting both of those files is a good way to go.

You can safely delete the files with expire in their filename, those
are temporary work files. If you delete the entire Bayes database, 
then Bayes won't do anything until it's learned more new spams and 
hams.

I would suggest not zapping all of bayes.

Personally, I don't use autolearn, but then I have a very small user 
base and can easily manually train.
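
For reference, manual training along those lines might look like the
sketch below. sa-learn --spam and --ham are the standard options; the
function name train_bayes and the directory layout (one message per
file, sorted by hand) are assumptions:

```shell
#!/bin/sh
# train_bayes SPAM_DIR HAM_DIR: feed hand-sorted messages to the Bayes
# database, spam first, then ham. Stops if the spam run fails.
train_bayes() {
    sa-learn --spam "$1" && sa-learn --ham "$2"
}

# Usage (hypothetical directories):
#   train_bayes ~/mail-samples/spam ~/mail-samples/ham
```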

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  It is sadly humorous that those who are the most shrilly vocal
  about bemoaning the increasing violations of civil liberties by
  the federal government and comparing the president to Hitler are
  also those who are working hardest to ensure the citizens of our
  nation are disarmed and unable to effectively resist that same
  government. Who do these people think will protect them from the
  Jackbooted Thugs they are so worried about?
---
 Today: Exercise Your Rights day