Limit SA to scan messages 100k and below

2010-03-31 Thread Keith De Souza
Hi Guys,

My current sysadmin has now left the company and I'm new to SA and Exim.
Needless to say, I have been assigned the task of looking after the
server. I'm hoping I've come to the right place for my questions to be
answered.

The system I have is running on:

Gentoo Base System release 1.12.10
SpamAssassin version 3.2.5
  running on Perl version 5.8.8
Exim version 4.69

Here is my spamd.conf file:

=
SPAMD_OPTS="-m 25 -H -u mail -D"

# spamd stores its pid in this file. If you use the -u option to
# run spamd under another user, you might need to adjust it.

PIDFILE="/var/run/spamd.pid"

# SPAMD_NICELEVEL lets you set the 'nice'ness of the running
# spamd process

SPAMD_NICELEVEL=1
=

I've read somewhere that the default setting for SA to scan a message is
500k.

Can I reduce this, so that SA scans messages 100k and below?


Many Thanks in advance


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Mikael Syska
Hi

On Wed, Mar 31, 2010 at 2:24 PM, Keith De Souza wrote:
> Hi Guys,
>

[snip]

>
> I've read somewhere that the default setting for SA to scan a message is
> 500k.
>
> Can I reduce this, so that SA scans messages 100k and below?

Have you tried Google first?
http://www.google.dk/#hl=da&safe=off&q=spamd+scan+messages+size&meta=&aq=f&aqi=&aql=&oq=&gs_rfai=&fp=15904d39482f0df0

Maybe this one: http://spamassassin.apache.org/full/3.2.x/doc/spamc.html

I'm no expert on spamc, but these seem to be the right settings to go for.

But is there a reason for dropping it?

> Many Thanks in advance
>
>
>

Kind regards,
Mikael Syska


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Karsten Bräckelmann
On Wed, 2010-03-31 at 13:24 +0100, Keith De Souza wrote:
> My current sysadmin has now left the company and I'm new to SA and
> Exim. [...]

> I've read somewhere that the default setting for SA to scan a message
> is 500k.

That's actually the default for spamc. Messages exceeding the threshold
just won't be passed to spamd. SA (and spamd) will check everything it
gets passed.

> Can I reduce this, so that SA scans messages 100k and below?

You need to change whatever glue you are using to pass messages to SA,
and skip the scanning for messages larger than your desired threshold.

That said, IMHO 100k is rather low. Why do you want that particular
threshold?

  guenther


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Mikael Syska
Hi,

Remember to reply to the mailing list, so other users can follow
this too.

On Wed, Mar 31, 2010 at 2:54 PM, Keith De Souza wrote:
> Hi,
>
>>> But is there a reason for dropping it?
>
> I'm seeing a few errors in my Exim logs where mail from legitimate senders
> is not coming through:
>
> ===
> 2010-03-31 01:22:25 1Nwlbc-0001QS-Ua
> H=host81-136-197-86.in-addr.btopenworld.com (mail.duke.tv) [81.136.197.86]
> F=<l...@dukeandearl.com> temporarily rejected after DATA
> ===
>
> And after checking my SA logs:
>
> ===
> Mar 31 01:25:51 mailserver spamd[5379]: spamd: result: . -4 -
> GENESIS_PHONENUMBER07
> scantime=300.0,size=24337,user=nobody,uid=8,required_score=3.2,rhost=localhost,raddr=127.0.0.1,rport=42308,mid=<c7d27527.8a78%l...@dukeandearl.com>,autolearn=unavailable
> ==

Your required score is very low, but that's not the problem.

> I'm trying to understand why it is taking 300.0 seconds to scan a message
> only 24 KB in size.

Reducing the scan size is not the way to go; there could be other
problems, such as SA rules or RBLs timing out.

Are you running "sa-update"?

> I'm beginning to think that because SA is taking so long to scan the
> message, it is timing out,
> and hence Exim is returning a "temporarily rejected after DATA".
>
> My thought so far is to reduce the message size that SA will scan
> and see if the scan time reduces.

Are there lots of mails in the queue?

> I may be wrong in my troubleshooting methods, but I'm not sure why this is
> happening at present.
>
> Many Thanks

Kind regards


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Keith De Souza
Hi

>> You need to change whatever glue you are using to pass messages to SA,
>> and skip the scanning for messages larger than your desired threshold.

Sorry, as I'm new to SA, can you elaborate on what you mean by glue?

>> That said, IMHO 100k is rather low. Why do you want that particular
>> threshold?

Judging from your response, I may be wrong in what I need to do:

Basically, I'm seeing a few errors in my Exim logs where mail from
legitimate senders is not coming through:

===
2010-03-31 01:22:25 1Nwlbc-0001QS-Ua
H=host81-136-197-86.in-addr.btopenworld.com (mail.duke.tv) [81.136.197.86]
F=<l...@dukeandearl.com> temporarily rejected after DATA
===

And after checking my SA logs:

===
Mar 31 01:25:51 mailserver spamd[5379]: spamd: result: . -4 -
GENESIS_PHONENUMBER07
scantime=300.0,size=24337,user=nobody,uid=8,required_score=3.2,rhost=localhost,raddr=127.0.0.1,rport=42308,mid=<c7d27527.8a78%l...@dukeandearl.com>,autolearn=unavailable
==

I'm trying to understand why it is taking 300.0 seconds to scan a message
only 24 KB in size.
I'm beginning to think that because SA is taking so long to scan the
message, it is timing out,
and hence Exim is returning a "temporarily rejected after DATA".

My thought so far is to reduce the message size that SA will scan
and see if the scan time reduces.
I may be wrong in my troubleshooting methods, but I'm not sure why this is
happening at present.

Many Thanks








Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Keith De Souza
Hi

Oops only realized after I had sent you the message - but will do.

>> Are you running "sa-update"?

I might not be, how can I check?

>> Are there lots of mails in the queue?

No mails in the queue. I should also say that mail is coming in fine
and we are receiving it, but certain legitimate mail (like the one sent) is
not, and SA takes 300.0 seconds to scan it.

I'm also receiving these in my logs:

spam acl condition: error reading from spamd socket: Connection timed out

Many Thanks


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Jeff Mincy
   From: Keith De Souza 
   Date: Wed, 31 Mar 2010 14:10:50 +0100
   
   Hi
   
   >> You need to change whatever glue you are using to pass messages to SA,
   >> and skip the scanning for messages larger than your desired threshold.

   Sorry, as I'm new to SA, can you elaborate on what you mean by glue?

   >> That said, IMHO 100k is rather low. Why do you want that particular
   >> threshold?

   Judging from your response, I may be wrong in what I need to do:

   Basically, I'm seeing a few errors in my Exim logs where mail from
   legitimate senders is not coming through:

300 seconds looks like a timeout.   Something is giving up after
waiting 300 seconds.

Note the autolearn=unavailable.   I'd guess that you are getting
locked out from the Bayes database.   You probably had a Bayes expire
running at the same time.   There should be messages about this in a
log file.

If this is the case you can turn off bayes_auto_expire and run expire
from cron.  You could also try learning to the journal and doing
sa-learn --sync periodically from cron.
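
To spot these cases in bulk, the key=value fields of a spamd result line can be pulled out with a few lines of Python. This is only a sketch: the sample line is the one posted in this thread with the mid= field left out (its address is elided in the archive), and the 300-second limit is an assumption taken from the scantime shown there.

```python
import re

# Matches key=value pairs such as "scantime=300.0" in a spamd result line.
FIELD_RE = re.compile(r"(\w+)=([^,\s]*)")


def parse_result(line: str) -> dict:
    """Extract the key=value fields from a 'spamd: result:' log line."""
    return dict(FIELD_RE.findall(line))


def looks_like_timeout(line: str, limit: float = 300.0) -> bool:
    """True when the scan ran for at least `limit` seconds (assumed limit)."""
    fields = parse_result(line)
    return float(fields.get("scantime", 0)) >= limit


# Log line from this thread, without the mid= field.
LINE = (
    "Mar 31 01:25:51 mailserver spamd[5379]: spamd: result: . -4 - "
    "GENESIS_PHONENUMBER07 scantime=300.0,size=24337,user=nobody,uid=8,"
    "required_score=3.2,rhost=localhost,raddr=127.0.0.1,rport=42308,"
    "autolearn=unavailable"
)

fields = parse_result(LINE)
print(fields["scantime"], fields["size"], fields["autolearn"])
print(looks_like_timeout(LINE))
```

A scantime that lands exactly on a round number while the message is small is a hint that something was waited on until it gave up, rather than that the scan itself was slow.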

-jeff

   


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Bowie Bailey
Keith De Souza wrote:
>
> I'm trying to understand why it is taking 300.0 seconds to scan a
> message only 24 KB in size.
> I'm beginning to think that because SA is taking so long to scan the
> message, it is timing out,
> and hence Exim is returning a "temporarily rejected after DATA".
>
> My thought so far is to reduce the message size that SA will scan
> and see if the scan time reduces.
> I may be wrong in my troubleshooting methods, but I'm not sure why this
> is happening at present.

My first suggestion to anyone who is having problems with SA running
slowly is to check memory usage.

You posted previously that your conf file contained this:

SPAMD_OPTS="-m 25 -H -u mail -D"

"-m 25" means that spamd may run up to 25 child processes.  On my system
(with a few extra rulesets), the spamd processes take up about 60-70M
each.  How much memory do you have?  You need to make sure that the
machine doesn't go into swap.  If it does, SA will slow down
dramatically.  Try running the "free" command to see how much memory you
have available.  If you are close to the edge, you may want to lower the
number of processes.
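
The arithmetic behind that advice can be sketched in Python. The 60-70 MB per process figure is the one quoted above for one particular system; the 512 MB of free memory is a hypothetical value chosen only for illustration, so measure both on your own machine.

```python
def spamd_memory_mb(children: int, per_child_mb: float) -> float:
    """Approximate memory used by `children` spamd processes."""
    return children * per_child_mb


def max_children(available_mb: float, per_child_mb: float) -> int:
    """Largest child count that fits in `available_mb` without swapping."""
    return int(available_mb // per_child_mb)


# Figures from the post: 25 children at roughly 60-70 MB each.
low = spamd_memory_mb(25, 60)
high = spamd_memory_mb(25, 70)
print(f"-m 25 needs roughly {low:.0f}-{high:.0f} MB")

# Hypothetical machine with 512 MB free for spamd.
print(max_children(512, 70))
```

With those figures a full set of 25 children wants about 1.5 to 1.75 GB, while 512 MB of free memory only has room for 7 children of 70 MB each.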

"-H" is a command to change the home directory and generally requires an
argument, so I'm not sure what it's doing here.

"-u mail" means spamd is running as the user "mail".  So when you are
testing, manually learning the Bayes db, etc, make sure you are logged
in as "mail" so that you are using the same settings and databases as spamd.

"-D" puts spamd into debug mode.  Aside from filling up your logs with
excess debug information, this will probably slightly increase the
memory use and slow down the scanning process.  If you don't need it for
some reason, get rid of it.

-- 
Bowie


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Charles Gregory

On Wed, 31 Mar 2010, Keith De Souza wrote:

> Sorry, as I'm new to SA, can you elaborate on what you mean by glue?


Geek terminology for the program, script or other mechanism that
'connects' your MTA and your SA. I.e., the calling MTA or its script must do
the size check, then decide *whether* to call SA.
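
As a rough illustration of what such glue does, here is a minimal Python sketch. It assumes a script-based setup in which the message is piped through the spamc client; the function names and the 100 KB limit are made up for this example. The "spam acl condition" log entry earlier in the thread suggests Exim may be talking to spamd directly, in which case the equivalent check would live in the Exim configuration instead.

```python
import subprocess

# Hypothetical threshold for this sketch: 100 KB, as asked in the thread.
MAX_SCAN_BYTES = 100 * 1024


def should_scan(message: bytes, limit: int = MAX_SCAN_BYTES) -> bool:
    """Return True when the message is small enough to hand to SA."""
    return len(message) <= limit


def scan_if_small(message: bytes, limit: int = MAX_SCAN_BYTES) -> bytes:
    """Pass small messages through spamc; return large ones untouched."""
    if not should_scan(message, limit):
        return message  # too big: skip SA entirely
    # spamc reads the message on stdin and writes the processed
    # message to stdout.
    result = subprocess.run(
        ["spamc"], input=message, capture_output=True, check=True
    )
    return result.stdout
```

spamc can also apply a size limit itself through its -s (max size) option, so a wrapper like this is only needed when the calling side has to make the decision.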



> I'm trying to understand why it is taking 300.0 seconds to scan a message
> only 24 KB in size.


1) Server is overloaded. Your load only has to go 10-20% over your
system's 'maximum capacity' to cause processing times to jump from 20
seconds up to five minutes or more.


2) Something that SA relies upon, like your DNS server, is taking way
too long to do its job. Check that your DNS has a reasonable timeout
value. Otherwise it could be waiting for a non-existent domain.

This would be the case if the problem occurs for certain addresses,
or more often on spam (which comes from 'unknown' systems) than on
legitimate mail.


3) There may be a 'locking' issue with any databases (Bayes?) that SA
uses. Again, this may only become a problem under heavy load, with too
many concurrent SA processes.



> My thought so far is to reduce the message size that SA will scan
> and see if the scan time reduces.


It is a better idea to try and reduce the number of emails that SA will 
process at the same time.


- C


Re: Limit SA to scan messages 100k and below

2010-03-31 Thread Martin Gregorie
On Wed, 2010-03-31 at 15:06 +0200, Mikael Syska wrote:
> > I'm trying to understand why it is taking 300.0 seconds to scan a message
> > only 24 KB in size.
> 
Use the sysstat tool-set to find out what's going on in your system and
fix that.

I agree with those who say that -m 25 is too large a value. If that's
the problem then you don't need to use the sysstat programs to see it -
just run 'top' and you'll see the swap space used value changing and
that kswapd is busy. Try simply deleting the -m option, which uses the
default of 5 children, and see how SA performance changes. 

To provide some guide numbers I looked at my two SA setups, which both
use the default number of children:

- My SA rule development rig runs on a 1.5GHz CoreDuo laptop with 1GB
  RAM. It can scan my two biggest spam test messages (412KB and 360KB) 
  in 21 seconds: however scan time depends on the message content: the
  412KB message only takes 1 second to scan by itself while the 360KB one 
  takes the other 20 seconds. This set-up uses SA 3.3.0

- My main server is a lot smaller: an 866MHz P3 with 512 MB RAM. It runs
  SA 3.2.5. Here are the numbers from its set of maillogs:

  Messages scanned: 2758
  Message size: min 2072  avg 7223  max 417840   bytes
  Scan times:   min 0.7   avg 2.247 max 21.1 seconds

  I'm using the default SA child process populations. This machine is
  also running getmail, Postfix, Dovecot, named, ntpd, Samba, Apache and
  PostgreSQL and is used for Java development as well.


Martin





Re: Limit SA to scan messages 100k and below

2010-04-01 Thread Keith De Souza
Hi Guys,

Firstly, many thanks for all your replies.

I've now made some changes to my spamd conf file (/etc/conf.d/spamd) based
on the replies given.
This is what it looks like now:

==
SPAMD_OPTS="-m 6 -H -u mail -D --timeout-child=60"

# spamd stores its pid in this file. If you use the -u option to
# run spamd under another user, you might need to adjust it.

PIDFILE="/var/run/spamd.pid"

# SPAMD_NICELEVEL lets you set the 'nice'ness of the running
# spamd process

#SPAMD_NICELEVEL=1
==

I've also hashed out the "SPAMD_NICELEVEL=1", not sure why it was there in
the first place.
Any ideas what this entry does?

I've then added "--timeout-child=60". Will this mean that the child
processes will time out after 60 seconds and let the message through for Exim
to process?

By the way the errors in the logs have gone away after the changes made.
Also the processing load on the server has dropped
dramatically.

Many Thanks


Re: Limit SA to scan messages 100k and below

2010-04-01 Thread John Hardin

On Thu, 1 Apr 2010, Keith De Souza wrote:


> I've now made some changes to my spamd conf file (/etc/conf.d/spamd) based
> on the replies given.
> This is what it looks like now:
>
> ==
> SPAMD_OPTS="-m 6 -H -u mail -D --timeout-child=60"


You don't need -D (debugging output) unless you're actively 
troubleshooting a problem. Remove that.



> # SPAMD_NICELEVEL lets you set the 'nice'ness of the running
> # spamd process
>
> #SPAMD_NICELEVEL=1
> ==
>
> I've also hashed out the "SPAMD_NICELEVEL=1", not sure why it was there in
> the first place.
> Any ideas what this entry does?


It allows you to adjust the relative priority of spam processing. If SA is 
not invoked during SMTP (i.e. not during the interactive part of mail 
exchange, where the computer on the other end has to wait for it to finish 
processing before it can go on to the next message it wants to send), then 
you can reduce the priority of SA to give higher priority to interactive 
operations (e.g. to the SMTP exchange, to webmail that's running on the 
same host, etc.) - if the spam scan is taking place in the background, 
what does it matter if it takes 25 seconds or 30? You may want to improve 
the response of activities a user is actually waiting on.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Gun Control is marketed to the public using the appealing delusion
  that violent criminals will obey the law.
---
 Today: April Fools' day


Re: Limit SA to scan messages 100k and below

2010-04-03 Thread Keith De Souza
>> It allows you to adjust the relative priority of spam processing. If SA
>> is not invoked during SMTP (i.e. not during the interactive part of mail
>> exchange, where the computer on the other end has to wait for it to finish
>> processing before it can go on to the next message it wants to send), then
>> you can reduce the priority of SA to give higher priority to interactive
>> operations (e.g. to the SMTP exchange, to webmail that's running on the
>> same host, etc.) - if the spam scan is taking place in the background, what
>> does it matter if it takes 25 seconds or 30? You may want to improve the
>> response of activities a user is actually waiting on.

Thanks for the explanation [?]

Re: Limit SA to scan messages 100k and below

2010-04-04 Thread Keith De Souza
Hi John

>> I take it the [?] means you didn't understand what I was trying to
>> explain?
>> I'd be happy to try again, if you wish to understand what "niceness"
>> means.

Sorry didn't realise, it was meant to be a smiley symbol ;-) but was
replaced by a question mark instead.

In essence, from what you're saying, the niceness level in SpamAssassin will
dictate the priority of processing. I'm assuming the higher the niceness
level is set, the lower the priority it has?

Should I be setting this or can it be left out? My server is only running
Exim, Apache (for only one website) and SA.

Also I've set the timeout to 60secs:

=
SPAMD_OPTS="-m 6 -H -u mail --timeout-child=60"
=

Does this mean that SA will scan the message for up to 60 seconds before it
is let through?
In essence, what will happen to the message after the timeout? Will the
message just be let through?

Many thanks :-)


Re: Limit SA to scan messages 100k and below

2010-04-04 Thread John Hardin

On Sun, 4 Apr 2010, Keith De Souza wrote:


> Hi John
>
>> I take it the [?] means you didn't understand what I was trying to
>> explain? I'd be happy to try again, if you wish to understand what
>> "niceness" means.
>
> Sorry didn't realise, it was meant to be a smiley symbol ;-) but was
> replaced by a question mark instead.


Ah. Character set issues, I guess.

> In essence, from what you're saying, the niceness level in SpamAssassin
> will dictate the priority of processing. I'm assuming the higher the
> niceness level is set, the lower the priority it has?


Correct.

> Should I be setting this or can it be left out? My server is only
> running Exim, Apache (for only one website) and SA.


You'd only worry about it if users started complaining about delays in 
interactive tasks on that box.


If you get a lot of mail and the response for the website is suffering 
then you could look into adjusting the priority of spamd. If you are 
calling spamd during SMTP then reducing the priority of spamd would make 
the remote MTA wait longer to get the "mail accepted for delivery" 
message. Priority tuning can be a tradeoff, but generally you want to tune 
to minimize the delay that actual people are seeing in interactive tasks.



> Also I've set the timeout to 60secs:
>
> =
> SPAMD_OPTS="-m 6 -H -u mail --timeout-child=60"
> =
>
> Does this mean that SA will scan the message for up to 60 seconds before it
> is let through?


I think so. I have never explored that option myself, as I don't care how 
long it takes to scan a message.



> In essence, what will happen to the message after the timeout? Will the
> message just be let through?


That depends on how your glue is configured to respond to spamd reporting
an error. In general, discarding the message is a bad idea, but the MTA
could either deliver the message unscanned, or return it to the queue to
try again later.
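
The two safe choices can be sketched as a small policy function in Python. Everything here is hypothetical and only illustrates the decision: `scan` stands in for whatever call hands the message to spamd, and the `Action` names are invented for this example rather than taken from Exim or SpamAssassin.

```python
from enum import Enum
from typing import Callable, Tuple


class Action(Enum):
    DELIVER = "deliver"  # scanned, or accepted unscanned
    DEFER = "defer"      # temporary failure; the sender retries later


def handle_message(
    message: bytes,
    scan: Callable[[bytes], bytes],
    on_error: Action = Action.DEFER,
) -> Tuple[Action, bytes]:
    """Scan a message and apply the configured policy if the scan fails."""
    try:
        return Action.DELIVER, scan(message)
    except (TimeoutError, OSError):
        # The scanner gave up or was unreachable: never discard the
        # message, either defer it or pass it on unscanned.
        return on_error, message
```

The "temporarily rejected after DATA" entries earlier in the thread look like the defer policy in action: the message is not lost, but the sending server has to try again.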


I have to ask, is your mail really so time-critical that you're not
willing to wait two minutes for spamd to do its job?


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  How can you reason with someone who thinks we're on a glidepath to
  a police state and yet their solution is to grant the government a
  monopoly on force? They are insane.
---
 9 days until Thomas Jefferson's 267th Birthday


Re: Limit SA to scan messages 100k and below

2010-04-05 Thread Keith De Souza
Hi John

>> I have to ask, is your mail really so time-critical that you're not
>> willing to wait two minutes for spamd to do its job?

No reason really. Initially it was set to the default (300 secs), which I
thought was what was causing the errors in the logs.
I've set it to 60 secs just as a test to eliminate all possibilities. So far
all seems to be fine on the server. I am going to wind this
up over the next few days to see how it pans out.

Many thanks