RE: [OT] Uptime was [scan times up!]

2004-10-05 Thread Nate Schindler
they mean microsoft equipment... :)

-Original Message-
From: Andy Jezierski [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 05, 2004 3:13 PM
To: users@spamassassin.apache.org
Subject: [OT] Uptime was [scan times up!]






Ken Goods <[EMAIL PROTECTED]> wrote on 10/05/2004 04:50:30 PM:

> Spamassassin, and ClamAV. It is currently processing 5 to 8 thousand emails
> a day and has been up for 68 days. Here's a current snapshot of top:
>

A sad day is coming on Thursday, I have to re-boot a router at one of our
remote locations to install a new card.
I always loved showing this to people who insisted that you should re-boot
equipment periodically.

anrtr1>sh ver
Cisco Internetwork Operating System Software
IOS (tm) 3600 Software (C3640-JS-M), Version 12.1(5)T,  RELEASE SOFTWARE (fc1)
Copyright (c) 1986-2000 by cisco Systems, Inc.
Compiled Sat 11-Nov-00 07:24 by ccai
Image text-base: 0x60008950, data-base: 0x61476000

ROM: System Bootstrap, Version 11.1(20)AA2, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)

anrtr1 uptime is 3 years, 23 weeks, 1 day, 15 hours, 58 minutes
System returned to ROM by power-on
System restarted at 23:10:31 PDT Thu Apr 26 2001
System image file is "flash:c3640-js-mz.121-5.T.bin"


Heavy Sigh...

Andy



huge increase in false negatives...

2004-10-05 Thread Steve Sobel
So I installed SA3.0 something like 10 days ago, and up until two days
ago, it was spectacular!  I got so little spam it was great...

For the last couple/few days though, Spam volume that gets through has
taken a sharp increase again.   Is this happening for anyone else?

I'm just curious if anyone else has had a sharp increase in spam the past
few days too.  come to think of it - I think the increase started once I
started running sa-learn on the Learn Spam folders in the accounts on my
machine.. however backwards that seems!

Any insight on this would be appreciated...

Steve


whitelist not working after 3.0 upgrade

2004-10-05 Thread Gary Quiring
I upgraded to SA 3.0 and the whitelist_from entries in my
/etc/mail/spamassassin/local.cf file are no longer working.  I am using
RH 8.  I ran spamassassin --lint and it reported no errors in the file.
All my whitelisted users are coming in with positive scores.

I have entries like:

whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]


[OT] Uptime was [scan times up!]

2004-10-05 Thread Andy Jezierski




Ken Goods <[EMAIL PROTECTED]> wrote on 10/05/2004 04:50:30 PM:

> Spamassassin, and ClamAV. It is currently processing 5 to 8 thousand emails
> a day and has been up for 68 days. Here's a current snapshot of top:
>

A sad day is coming on Thursday, I have to re-boot a router at one of our
remote locations to install a new card.
I always loved showing this to people who insisted that you should re-boot
equipment periodically.

anrtr1>sh ver
Cisco Internetwork Operating System Software
IOS (tm) 3600 Software (C3640-JS-M), Version 12.1(5)T,  RELEASE SOFTWARE (fc1)
Copyright (c) 1986-2000 by cisco Systems, Inc.
Compiled Sat 11-Nov-00 07:24 by ccai
Image text-base: 0x60008950, data-base: 0x61476000

ROM: System Bootstrap, Version 11.1(20)AA2, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)

anrtr1 uptime is 3 years, 23 weeks, 1 day, 15 hours, 58 minutes
System returned to ROM by power-on
System restarted at 23:10:31 PDT Thu Apr 26 2001
System image file is "flash:c3640-js-mz.121-5.T.bin"


Heavy Sigh...

Andy



RE: scan times up!

2004-10-05 Thread Ken Goods
Andy Jezierski scribbled on Tuesday, October 05, 2004 2:31 PM:

> There's never anything wrong with going on an archeological dig in the
> computer room.  I'm sometimes asked "What's that relic in the corner?  Does
> it actually have a hard drive?" and people are amazed when I tell them
> "That's what's keeping your mailbox spam free." It works, scan times are
> still fairly reasonable, so hey.
> 
> 180Mhz of PentiumPro power, and spamd uses every single one of them.
> 
> Andy

Andy, Chris, and all,

The bone-yard is exactly where I got the pieces to put together our
(production) email gateway which is filtering our corporate emails. Check
this out... An old 200Mhz Pentium Pro Dell with a 4GB IDE drive and 192MB
ram. Running RedHat9.0, Sendmail, and the latest versions of MailScanner,
Spamassassin, and ClamAV. It is currently processing 5 to 8 thousand emails
a day and has been up for 68 days. Here's a current snapshot of top:

[EMAIL PROTECTED] root]# top
 14:41:09  up 68 days, 21:53,  1 user,  load average: 0.60, 0.30, 0.19
53 processes: 51 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:   0.3% user   0.3% system   0.0% nice   0.0% iowait  99.2% idle
Mem:   190676k av,  162724k used,   27952k free,   0k shrd,   13948k buff
       112160k actv,   16012k in_d,     484k in_c
Swap:  192772k av,  111580k used,   81192k free                   21020k cached

I get a couple "Spamassassin Timed Out" messages a day but that could be a
network lookup (DNS, SURBL, etc... who knows), other than that it runs like
a top and keeps us ~97% spam-free and 100% virus free. All stock with no
modifications except for a few whitelist entries.

Machine for email gateway. $0.00
Software for email gateway $0.00
Manhours to maintain.. $0.00
Unbelievably talented open source 
programming wizards to make it all possible... PRICELESS!

Thanks guys and keep up the excellent work!


Ken Goods
Network Administrator
AIA Insurance, Inc.


RE: scan times up!

2004-10-05 Thread Morris Jones
On Tue, 5 Oct 2004, scohen wrote:

> Thanks for the reply. So what's with the SA eats all my memory thread and
> the memory footprint thread? Those two threads and your Scan times up
> thread makes it seem that there are problems.

I think there's a problem with spamd and memory use, and we've tripped
over it a bit, but we now know of a fairly easy workaround.

What I'm finding is that my scan times are significantly faster than
2.64. I wasn't expecting that; it might be an effect of the new spamd
architecture.  Memory usage is well within reason for a well-equipped
machine.  My server has 512M of RAM, and spamd and its children use
about a third of the memory (most of the time).

It's even possible that this memory issue existed in 2.64, but wasn't
visible because spamd children were terminated after each scan.  You can
replicate that behavior in 3.0.0 with the --max-conn-per-child option,
and that's worked well for me so far.

That said, the overall performance and especially accuracy of 3.0.0
has been much higher on my server.

Mojo
-- 
Morris Jones <*>
Monrovia, CA
[EMAIL PROTECTED]
http://www.whiteoaks.com



RE: scan times up!

2004-10-05 Thread Andy Jezierski




Chris Santerre <[EMAIL PROTECTED]> wrote on 10/05/2004
03:48:57 PM:

> Good grief NO! You read this wrong. I'm running it on a system that
> archeologists are interested in! I think the Boston Computer museum left me
> a message wanting to take the system away! Hell I'm running on a system that
> couldn't run a PC game from 2 years ago!!  3.0 caused my old iron to hit
> swap a lot at busy times. It's not SA's fault, but my budget of $10.99 that
> causes it :)
>
> --Chris (Seriously, my 4 yr old has a computer twice as powerful!)

There's never anything wrong with going on an archeological dig in the
computer room.  I'm sometimes asked "What's that relic in the corner?  Does
it actually have a hard drive?" and people are amazed when I tell them
"That's what's keeping your mailbox spam free." It works, scan times are
still fairly reasonable, so hey.

180Mhz of PentiumPro power, and spamd uses every single one of them.

Andy



RE: scan times up!

2004-10-05 Thread Gary Smith
Chris, 
 
Your priorities are wrong...  Give the wife and kids the old hardware. :)
 
It seems that AWL could also be to blame.  Looking at some of the threads on 
performance and memory issues, everyone seems to have AWL configured.  When we 
ran 3.0.0 rc4 in development it seemed to work fine even under load.  Those 
systems used bayes and SURBL but not AWL.  I didn't see any performance or 
real memory problems.
 
Just my $0.02.
 
Gary



From: Chris Santerre [mailto:[EMAIL PROTECTED]
Sent: Tue 10/5/2004 1:48 PM
To: 'scohen'
Cc: Spamassassin-Talk (E-mail)
Subject: RE: scan times up!





>-Original Message-
>From: scohen [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, October 05, 2004 4:35 PM
>To: Chris Santerre
>Cc: Spamassassin-Talk (E-mail)
>Subject: RE: scan times up!
>
>
>
>
>On Tue, 5 Oct 2004, Chris Santerre wrote:
>
>> If anyone remembers this thread, I have more feedback.
>>
>> After disabling Bayes, AWL, and reducing the system to 4 children, I now am
>> running average scan times of 3.5 seconds. Much better!
>>
>> You devs are some seriously sexy coders!
>>
>> --Chris (Bayes?..poppycock!)
>
>I haven't been reading this list for a couple of weeks. Are you
>seriously saying that in order to get good performance out SA3.0 you have
>to disable bayes and only run 4 children? With the complaints of poor
>performance and increased memory usage is there any reason to put this on
>a production system?
>

Good grief NO! You read this wrong. I'm running it on a system that
archeologists are interested in! I think the Boston Computer museum left me
a message wanting to take the system away! Hell I'm running on a system that
couldn't run a PC game from 2 years ago!!  3.0 caused my old iron to hit
swap a lot at busy times. It's not SA's fault, but my budget of $10.99 that
causes it :)

--Chris (Seriously, my 4 yr old has a computer twice as powerful!)




RE: scan times up!

2004-10-05 Thread scohen


On Tue, 5 Oct 2004, Chris Santerre wrote:

>
>
> >-Original Message-
> >From: scohen [mailto:[EMAIL PROTECTED]
> >Sent: Tuesday, October 05, 2004 4:35 PM
> >To: Chris Santerre
> >Cc: Spamassassin-Talk (E-mail)
> >Subject: RE: scan times up!

> >I haven't been reading this list for a couple of weeks. Are you
> >seriously saying that in order to get good performance out SA3.0 you have
> >to disable bayes and only run 4 children? With the complaints of poor
> >performance and increased memory usage is there any reason to put this on
> >a production system?
> >
>
> Good grief NO! You read this wrong. I'm running it on a system that
> archeologists are interested in! I think the Boston Computer museum left me
> a message wanting to take the system away! Hell I'm running on a system that
> couldn't run a PC game from 2 years ago!!  3.0 caused my old iron to hit
> swap a lot at busy times. It's not SA's fault, but my budget of $10.99 that
> causes it :)
>
> --Chris (Seriously, my 4 yr old has a computer twice as powerful!)
>
Thanks for the reply. So what's with the SA eats all my memory thread and
the memory footprint thread? Those two threads and your Scan times up
thread makes it seem that there are problems.

Steve Cohen



Re: bogus "sa-learn --dump magic" report

2004-10-05 Thread Bill Landry
- Original Message - 
From: "Theo Van Dinter" <[EMAIL PROTECTED]>

> On Wed, Sep 29, 2004 at 03:51:28PM -0700, Bill Landry wrote:
> > Hmmm, where else could this configuration issue be, Theo, since none of my
> > CF files contain a "-" in the test definitions?  Grep results:
>
> Run spamassassin with -D, it'll tell you what files it's reading.  Could be
> /usr/share/spamassassin/*.cf, user_prefs, etc.
>
> > And like I said, "spamassassin --lint" comes back with nothing - should it
> > not detect this apparent configuration issue, as well?  I can send you the
> > "spamassassin --lint -D" output, if you would like.
>
> It should (not knowing what is causing the issue I can't answer for certain),
> but there's nothing in the code that I know of which would be converting
> underscore to dash, so it has to be a config file somewhere.

Update on this issue:
=
[EMAIL PROTECTED] billl]# ls -l /home/billl/Mail-SpamAssassin-3.0.0/sa-learn
-rwxr-xr-x  1 root root  36543 Sep 26 03:12 /home/billl/Mail-SpamAssassin-3.0.0/sa-learn

[EMAIL PROTECTED] billl]# /home/billl/Mail-SpamAssassin-3.0.0/sa-learn --dump magic
error: rule 'RCVD_IN_DNSRBL-DUN' has invalid characters (not Alphanumeric + Underscore)
error: rule 'RCVD_IN_DNSRBL-SPAM' has invalid characters (not Alphanumeric + Underscore)
error: rule 'RCVD_IN_DSBL-MULTI' has invalid characters (not Alphanumeric + Underscore)
error: rule 'RCVD_IN_SECURITY-SAGE' has invalid characters (not Alphanumeric + Underscore)
error: rule 'DNS_FROM_RFCI-ABUSE' has invalid characters (not Alphanumeric + Underscore)
error: rule 'DNS_FROM_MAILPOLICE-BULK' has invalid characters (not Alphanumeric + Underscore)
error: rule 'DNS_FROM_MAILPOLICE-PORN' has invalid characters (not Alphanumeric + Underscore)
error: rule 'DNS_FROM_RFCI-POSTMASTER' has invalid characters (not Alphanumeric + Underscore)
error: rule 'DNS_FROM_RFCI-PIGS' has invalid characters (not Alphanumeric + Underscore)
0.000  0          3  0  non-token data: bayes db version
0.000  0     153178  0  non-token data: nspam
0.000  0      75561  0  non-token data: nham
0.000  0     256644  0  non-token data: ntokens
0.000  0 1096903685  0  non-token data: oldest atime
0.000  0 1097004810  0  non-token data: newest atime
0.000  0 1097005094  0  non-token data: last journal sync atime
0.000  0 1096990378  0  non-token data: last expiry atime
0.000  0      43200  0  non-token data: last expire atime delta
0.000  0      83493  0  non-token data: last expire reduction count
=
[EMAIL PROTECTED] billl]# ls -l /usr/bin/sa-learn
-r-xr-xr-x  1 root root  36633 Sep 26 03:12 /usr/bin/sa-learn

[EMAIL PROTECTED] billl]# /usr/bin/sa-learn --dump magic
0.000  0          3  0  non-token data: bayes db version
0.000  0     153178  0  non-token data: nspam
0.000  0      75561  0  non-token data: nham
0.000  0     256644  0  non-token data: ntokens
0.000  0 1096903685  0  non-token data: oldest atime
0.000  0 1097004810  0  non-token data: newest atime
0.000  0 1097005094  0  non-token data: last journal sync atime
0.000  0 1096990378  0  non-token data: last expiry atime
0.000  0      43200  0  non-token data: last expire atime delta
0.000  0      83493  0  non-token data: last expire reduction count
=

Guess I figured that the sa-learn in the distribution directory would be the
same as the one installed during "make install".  Guess I figured wrong...

Bill



RE: scan times up!

2004-10-05 Thread Chris Santerre


>-Original Message-
>From: scohen [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, October 05, 2004 4:35 PM
>To: Chris Santerre
>Cc: Spamassassin-Talk (E-mail)
>Subject: RE: scan times up!
>
>
>
>
>On Tue, 5 Oct 2004, Chris Santerre wrote:
>
>> If anyone remembers this thread, I have more feedback.
>>
>> After disabling Bayes, AWL, and reducing the system to 4 children, I now am
>> running average scan times of 3.5 seconds. Much better!
>>
>> You devs are some seriously sexy coders!
>>
>> --Chris (Bayes?..poppycock!)
>
>I haven't been reading this list for a couple of weeks. Are you
>seriously saying that in order to get good performance out SA3.0 you have
>to disable bayes and only run 4 children? With the complaints of poor
>performance and increased memory usage is there any reason to put this on
>a production system?
>

Good grief NO! You read this wrong. I'm running it on a system that
archeologists are interested in! I think the Boston Computer museum left me
a message wanting to take the system away! Hell I'm running on a system that
couldn't run a PC game from 2 years ago!!  3.0 caused my old iron to hit
swap a lot at busy times. It's not SA's fault, but my budget of $10.99 that
causes it :)

--Chris (Seriously, my 4 yr old has a computer twice as powerful!)


RE: scan times up!

2004-10-05 Thread scohen


On Tue, 5 Oct 2004, Chris Santerre wrote:

> If anyone remembers this thread, I have more feedback.
>
> After disabling Bayes, AWL, and reducing the system to 4 children, I now am
> running average scan times of 3.5 seconds. Much better!
>
> You devs are some seriously sexy coders!
>
> --Chris (Bayes?..poppycock!)

I haven't been reading this list for a couple of weeks. Are you
seriously saying that in order to get good performance out SA3.0 you have
to disable bayes and only run 4 children? With the complaints of poor
performance and increased memory usage is there any reason to put this on
a production system?

Steve Cohen



RE: SA 3.0 is eating up all my memory!!!

2004-10-05 Thread Doug Block

I had this problem until I set the --max-conn-per-child option to 1.

This makes spamd kill each child process once it has finished scanning a
single message.
Not the best answer, I know, but it keeps memory use in check.





RE: locating/translating geography of IP addresses

2004-10-05 Thread Chris Stone
GeoIP - http://sourceforge.net/projects/geoip/
 

-Original Message-
From: Diffenderfer, Randy [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 05, 2004 7:09 AM
To: 'users@spamassassin.apache.org'
Subject: locating/translating geography of IP addresses

Folks, 

Just recently I recall someone mentioning some code or a tool to relate IP
addresses to originating geography.  I haven't located the reference by
searching the archives.

So, can someone recall the reference or point me at it? 

Thanks! 
rnd 




RE: HUMOR: Legit ham subject

2004-10-05 Thread Chris Santerre


>-Original Message-
>From: Daniel Quinlan [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, October 05, 2004 2:50 PM
>To: Chris Santerre
>Cc: Spamassassin-Talk (E-mail)
>Subject: Re: HUMOR: Legit ham subject
>
>
>Chris Santerre <[EMAIL PROTECTED]> writes:
>
>> This is from an FP I got this morning. Legit ham subject!
>> 
>> "See Children's Letters To God- as low as $25*!":-) 
>> 
>> It's the name of a Broadway show. Good grief these legit newsletters need to
>> meet us halfway!
>
>Which rules caused the FP?

Content analysis details:   (8.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.8 SARE_SUB_AS_LOW_AS     Subject contains apparent spammer phrasing
 0.4 MY_POWERMTA            PowerMTA is evil
 1.7 MSGID_FROM_MTA_ID      Message-Id for external message added locally
 0.6 MY_PHRS_LOW            BODY: low scoring phrases found
 0.4 EXCUSE_19              BODY: Claims you opted-in or registered
 0.7 MY_LEARN               BODY: Learn something new?
 1.0 MY_PHRS_MED            BODY: medium scoring phrases found
 0.3 URI_OFFERS             URI: Message has link to company offers
 0.0 HTML_60_70             BODY: Message is 60% to 70% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.6 SARE_HTML_FONT_INVIS2  RAW: contains HTML color which is likely spamsign
 0.2 MY_CLICK               RAW: Asks to click
 0.6 MY_WIDTH               RAW: Width 1 or 0
 0.4 MY_OBFU_LONG           RAW: One long word!
 0.6 MY_HEIGHT              RAW: Height 1 or 0
 0.2 UPPERCASE_25_50        message body is 25-50% uppercase
 0.0 MSGID_FROM_MTA_HEADER  Message-Id was added by a relay
If I added right, 5.5 points came from my custom rules, not the normal 3.0
install. I haven't changed my custom (non-SARE) rules since the upgrade from
2.4x to 3.0. I'm thinking of removing them for now, since SARE+SURBL+SA 3.0
already scores spam high enough.

So no worries mate!

--Chris



Re: sa 3.0.0 - same performance after training

2004-10-05 Thread Matt Kettler
At 03:28 AM 10/5/2004, Insems Citam wrote:
> Still no idea what could be wrong, any ideas guys?

No, because I still don't have a clear vision of what you did :)

What criteria did you use when training? Did you just train the spam and 
ham mailboxes that SA generated, or did you hand-sort prior to training?

If you just trained SA based on its previous classifications without any 
hand sorting, why would you expect any sort of change at all?

Are the BAYES_* rules showing up in the Spam-Status headers of the messages 
in the second run? What's the mix like?

Try gathering some Bayes-specific stats using grep. I'd suggest looking at 
BAYES_00, BAYES_50, and BAYES_99 to start with.

  grep BAYES_00 spam.mbox |wc -l
  grep BAYES_50 spam.mbox |wc -l
  grep BAYES_99 spam.mbox |wc -l
  grep BAYES_00 ham.mbox |wc -l
  grep BAYES_50 ham.mbox |wc -l
  grep BAYES_99 ham.mbox |wc -l
Compare those to the total counts of each mailbox.
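
The six grep commands above can be folded into one loop. This is a minimal sketch, not part of the original advice: the function name bayes_stats is made up for illustration, and it assumes the mailboxes are in mbox format, where each message starts with a "From " line, so that the total can be counted the same way.

```shell
#!/bin/sh
# bayes_stats (hypothetical helper): for each mbox file given, print how many
# lines mention BAYES_00, BAYES_50 and BAYES_99, plus the number of messages.
bayes_stats() {
  for mbox in "$@"; do
    for rule in BAYES_00 BAYES_50 BAYES_99; do
      # grep -c counts matching lines, like "grep ... | wc -l".
      # "|| true" keeps a zero count from being treated as a failure.
      count=$(grep -c "$rule" "$mbox" || true)
      echo "$mbox $rule $count"
    done
    # Assumption: mbox format, one "From " separator line per message.
    total=$(grep -c '^From ' "$mbox" || true)
    echo "$mbox total $total"
  done
}
```

Run it as "bayes_stats spam.mbox ham.mbox". As with the plain grep commands, a rule name that appears on more than one line of a message is counted once per line, so treat the numbers as approximate.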


RE: SA 3.0 is eating up all my memory!!!

2004-10-05 Thread Dallas L. Engelken
> Is there a Perl equivalent to the Unix 'setrlimit' or 'ulimit'
> function? (IE something to set the max data size that a process 
> is allowed to use).

I use djb's softlimit and supervise my spamd process with daemontools.
I softlimit spamd at 100MB just to prevent children from running away with
all the memory.  It works well.

d
-- 
Dallas Engelken
NMGI
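
A minimal sketch of what a daemontools run script for the setup described above might look like. The service directory, the spamd path, the spamd options, and the exact byte count are assumptions, not details from the message. spamd is started without -d so that supervise can keep it in the foreground.

```shell
#!/bin/sh
# Hypothetical /service/spamd/run script (the path is an assumption).
# softlimit -m caps memory per process, in bytes; 104857600 bytes = 100 MB.
# Each spamd child inherits the limit, so a runaway child fails its
# allocation instead of taking all of the machine's memory.
exec softlimit -m 104857600 /usr/bin/spamd -c --max-conn-per-child=10
```

The limit applies to each process separately, so total memory use can still reach roughly the limit multiplied by the number of children.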


Re: HUMOR: Legit ham subject

2004-10-05 Thread Daniel Quinlan
Chris Santerre <[EMAIL PROTECTED]> writes:

> This is from an FP I got this morning. Legit ham subject!
> 
> "See Children's Letters To God- as low as $25*!":-) 
> 
> It's the name of a Broadway show. Good grief these legit newsletters need to
> meet us halfway!

Which rules caused the FP?

Daniel

-- 
Daniel Quinlan ApacheCon! 13-17 November (3 SpamAssassin
http://www.pathname.com/~quinlan/  http://www.apachecon.com/  sessions & more)


Re: Memory footprint of spamd 3.0

2004-10-05 Thread Kai Schaetzl
Morris Jones wrote on Tue, 5 Oct 2004 10:22:42 -0700 (PDT):

> I watched a spamd child grow to 250MB yesterday on a single message.
>

This can happen sometimes (rarely) with 2.6x as well! I have already seen 
900 MB spamds.


Kai

-- 

Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org





Re: spamc sanity check failures on Suse 9.1

2004-10-05 Thread John Beranek
John Beranek wrote:
> I've not got a spamassassin RPM installed at. I just installed
                                              ^all

Note to self: Proofread a sentence after rewording it. :)

John.
--
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake


Re: spamc sanity check failures on Suse 9.1

2004-10-05 Thread John Beranek
Michael W Cocke wrote:
> On Tue, 05 Oct 2004 12:00:26 +0100, you wrote:
>
>> I've just upgraded to SpamAssassin 3.0 and Suse 9.1, and have now
>> started noticing errors in my mail logs. I'm not sure if it's the OS
>> upgrade or the SpamAssassin upgrade.
>
> You've probably already checked this, but for the record - the
> /etc/sysconfig/spamd that ships with suse is VERY wrong for SA 3.0.
> Did you fix the command line parms?

I've not got a spamassassin RPM installed at. I just installed 
Mail::SpamAssassin from CPAN, and added a /etc/init.d/spamd I found 
somewhere on the net...

John.
--
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake


RE: scan times up!

2004-10-05 Thread Chris Santerre
If anyone remembers this thread, I have more feedback.

After disabling Bayes, AWL, and reducing the system to 4 children, I now am
running average scan times of 3.5 seconds. Much better!

You devs are some seriously sexy coders! 

--Chris (Bayes?..poppycock!)



>> -Original Message-
>> From: Chris Santerre [mailto:[EMAIL PROTECTED]
>> Sent: Friday, October 01, 2004 7:26 AM
>> To: 'Nick Leverton'; Spamassassin-Talk (E-mail)
>> Subject: RE: scan times up!
>> 
>> 
>> 
>> >-Original Message-
>> >From: Nick Leverton [mailto:[EMAIL PROTECTED]
>> >Sent: Friday, October 01, 2004 6:49 AM
>> >To: Spamassassin-Talk (E-mail)
>> >Subject: Re: scan times up!
>> >
>> >
>> >On Thu, Sep 30, 2004 at 05:10:27PM -0400, Chris Santerre wrote:
>> >> Well...
>> >>
>> >> ver   avg scan time
>> >> 2.4x  2.7 seconds
>> >> 3.0   30.4 seconds
>> >>
>> >> OH MY! Network test :)
>> >>
>> >> Any longer and I might just be doing greylisting by accident. ;)
>> >
>> >Have you got a local (on-site, preferably on-machine) DNS cache ?
>> >This makes a lot of difference to the DNS-based network tests (which
>> >is to say, most of them).  One mail probably won't see much difference,
>> >but when the next one comes in, many of its lookups are
>> >already cached :)
>> >
>> 
>> This is also on my list of TTD. I'm running on some very old iron as well.
>> The 5 children might be bothering the system a little. I may reduce that.
>> 
>> I'll post some feedback if my users ever let me get back to it :)
>> 
>> --Chris
>


Re: Memory footprint of spamd 3.0

2004-10-05 Thread Morris Jones
On Tue, 5 Oct 2004, Michael Parker wrote:

> On Tue, Oct 05, 2004 at 10:22:42AM -0700, Morris Jones wrote:
> > 
> > I watched a spamd child grow to 250MB yesterday on a single message.  I
> > have a suspicion that the memory usage growth is happening on a whitelist
> > or bayes database maintenance event of some sort.
> > 
> 
> For folks that are seeing huge jumps in memory, instead of gradual
> growth, how are you calling SA?
> 
> Thanks,
> Michael

I'm running spamd -d -c --max-conn-per-child=10

In /etc/procmailrc I have:

:0fw
* < 256000
| /usr/bin/spamc

Mojo
-- 
Morris Jones <*>
Monrovia, CA
[EMAIL PROTECTED]
http://www.whiteoaks.com



Re: Memory footprint of spamd 3.0

2004-10-05 Thread Michael Parker
On Tue, Oct 05, 2004 at 10:22:42AM -0700, Morris Jones wrote:
> 
> I watched a spamd child grow to 250MB yesterday on a single message.  I
> have a suspicion that the memory usage growth is happening on a whitelist
> or bayes database maintenance event of some sort.
> 

For folks that are seeing huge jumps in memory, instead of gradual
growth, how are you calling SA?

Thanks,
Michael


Re: Memory footprint of spamd 3.0

2004-10-05 Thread Morris Jones
On Tue, 5 Oct 2004, Theo Van Dinter wrote:

> Well, it's not running a single child, it's that each child should
> only run 1 message before dying.  It's right in the spamd docs, but
> "--max-conn-per-child=1" is what you're looking for.

I set mine to --max-conn-per-child=10 as a compromise, and it's doing very
well for me.  At least if a 250MB spamd pops up, it won't live very long.

I watched a spamd child grow to 250MB yesterday on a single message.  I
have a suspicion that the memory usage growth is happening on a whitelist
or bayes database maintenance event of some sort.

One solution might be to have a child terminate after one of these events
to reclaim the memory.

Of course, here I am speculating about it without actually diving into
the code to propose a solution.

Regards,
Mojo
-- 
Morris Jones <*>
Monrovia, CA
[EMAIL PROTECTED]
http://www.whiteoaks.com



deleting spam

2004-10-05 Thread KyleReynolds
I am running SpamAssassin 2.64 on FreeBSD 5.2.1.  It is working fine for
tagging spam, but now that we are satisfied with its performance, we want
to start deleting obvious spam instead of just tagging it.  I have followed
(I believe...) the instructions for deleting spam with a certain score by
doing the following:

here is my user_prefs:

__
# SpamAssassin user preferences file.  See 'perldoc Mail::SpamAssassin::Conf'
# for details of what can be tweaked.
###

# How many hits before a mail is considered spam.
# required_hits 5

# Whitelist and blacklist addresses are now file-glob-style patterns, so
# "[EMAIL PROTECTED]", "[EMAIL PROTECTED]", or "*.domain.net" will all work.
# whitelist_from    [EMAIL PROTECTED]

# Add your own customised scores for some tests below.  The default scores are
# read from the installed spamassassin rules files, but you can override them
# here.  To see the list of tests and their default scores, go to
# http://spamassassin.org/tests.html .
#
# score SYMBOLIC_TEST_NAME n.nn

# Speakers of Asian languages, like Chinese, Japanese and Korean, will almost
# definitely want to uncomment the following lines.  They will switch off some
# rules that detect 8-bit characters, which commonly trigger on mails using CJK
# character sets, or that assume a western-style charset is in use.
#
# score HTML_COMMENT_8BITS  0
# score UPPERCASE_25_50 0
# score UPPERCASE_50_75 0
# score UPPERCASE_75_100    0

add_header all Level _STARS(*)_



and my procmailrc:

# SpamAssassin sample procmailrc
#
# Pipe the mail through spamassassin (replace 'spamassassin' with 'spamc'
# if you use the spamc/spamd combination)
#
# The condition line ensures that only messages smaller than 250 kB
# (250 * 1024 = 256000 bytes) are processed by SpamAssassin. Most spam
# isn't bigger than a few k and working with big messages can bring
# SpamAssassin to its knees.
#
# The lock file ensures that only 1 spamassassin invocation happens
# at 1 time, to keep the load down.
#

:0fw: spamc.lock
* < 256000
| spamc

# Mails with a score of 15 or higher are almost certainly spam (with 0.05%
# false positives according to rules/STATISTICS.txt). Let's put them in a
# different mbox. (This one is optional.)
#:0:
#* ^X-Spam-Level: \*\*\*\*
#/dev/null
:0 H
*^X-Spam-Level: \*\*\*\*\*\*
/dev/null
#almost-certainly-spam

# All mail tagged as spam (eg. with a score higher than the set threshold)
# is moved to "probably-spam".
:0:
* ^X-Spam-Status: Yes
/dev/null
#probably-spam

# Work around procmail bug: any output on stderr will cause the "F" in "From"
# to be dropped.  This will re-add it.
:0
* ^^rom[ ]
{
  LOG="*** Dropped F off From_ header! Fixing up. "

  :0 fhw
  | sed -e '1s/^/F/'
}

__

Could anyone spot any mistakes or offer any reason why this isn't working?

I appreciate any advice,

Thanks,



KR





HUMOR: Legit ham subject

2004-10-05 Thread Chris Santerre
This is from an FP I got this morning. Legit ham subject!

"See Children's Letters To God- as low as $25*!":-) 

It's the name of a Broadway show. Good grief these legit newsletters need to
meet us halfway!

--Chris


Re: Memory footprint of spamd 3.0

2004-10-05 Thread Theo Van Dinter
On Mon, Oct 04, 2004 at 10:42:54PM -0700, Loren Wilton wrote:
> There is an option to only run a single child, which is claimed to be
> equivalent to the 2.6x implementation.  I don't recall the option
> (something=1), but Theo posted it within the last day here.  And I'm almost
> positive it is in the docs somewhere.

Well, it's not running a single child, it's that each child should
only run 1 message before dying.  It's right in the spamd docs, but
"--max-conn-per-child=1" is what you're looking for.

It's not a "fork per message" model, it'll still prefork, but the child will
only survive 1 message then die, and there's no config copying going on, etc.

"fork per message" was simple, but very inefficient.  The "prefork" model
is more efficient (and has the potential to become even more efficient),
but it's also a lot more complex.

> 2. The copying of the config back and forth with preforking has a few
> minor but serious problems.

It's really not efficient either, but the main problems with it right
now are all caused by Storable module bugs, iirc.

> 3. There is a problem with spamd children getting hung out to dry on a
> read that never completes.

Seems to only affect a handful of people though, fyi.

> 4. I suspect that 3.0 is inherently less memory efficient than 2.6x; but
> probably not by a huge amount.

Yes and no.  3.0 does a hell of a lot more than 2.6 did, so that means more
memory usage.

-- 
Randomly Generated Tagline:
#else /* !STDSTDIO */ /* The big, slow, and stupid way */
  -- Larry Wall in str.c from the perl source code




Re: Memory footprint of spamd 3.0

2004-10-05 Thread Theo Van Dinter
On Tue, Oct 05, 2004 at 09:06:04AM -0600, Sherwood Botsford wrote:
> Perhaps code to say, "What's my present memory usage?"
> when it starts up, and a periodic "Hmm my memory usage has
> grown by 20%. time to die."

Unfortunately, there's no real way to do that.

> Or perhaps a config entry/commandline parameter that says
> filter a hundred messages then croak.

Already exists.  RTFM. :)   The default is 200 per child, btw.

-- 
Randomly Generated Tagline:
"My opinions may have changed, but not the fact that I am right."
  - Ashleigh Brilliant




Re: spamc sanity check failures on Suse 9.1

2004-10-05 Thread Michael W Cocke
On Tue, 05 Oct 2004 12:00:26 +0100, you wrote:

>I've just upgraded to SpamAssassin 3.0 and Suse 9.1, and have now 
>started noticing errors in my mail logs. I'm not sure if it's the OS 
>upgrade or the SpamAssassin upgrade.

You've probably already checked this, but for the record - the
/etc/sysconfig/spamd that ships with suse is VERY wrong for SA 3.0.
Did you fix the command line parms?

Mike-

--
If you can keep your head while those around you are losing theirs...
You may have a great career as a network administrator ahead!
--
Please note - Due to the intense volume of spam, we have installed 
site-wide spam filters at catherders.com.  If email from you bounces,
try non-HTML, non-encoded, non-attachments,



Re: Memory footprint of spamd 3.0

2004-10-05 Thread Sherwood Botsford
On Monday 04 October 2004 23:24, Jerry Glomph Black wrote:
> Good enough argument, but the new version has killed off
> two machines that have successfully run SA/spamd with no
> problems since Jan 2002.
>
> The machines leak into oblivion, and Something Bad
> happens (major daemons killed off, etc.).
>
> There is definitely some kind of memory resource problem
> (if not an outright leak) going on, which reverting to
> the old fork-on-demand -might- help.

Perhaps code to say, "What's my present memory usage?"
when it starts up, and a periodic "Hmm my memory usage has
grown by 20%. time to die."

Or perhaps a config entry/commandline parameter that says
filter a hundred messages then croak.
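
The first idea above (retire once memory has grown by 20% over the startup figure) could look roughly like this. It is only a Python sketch, not spamd code: `peak_rss()` and `should_retire()` are hypothetical names, and it assumes a Unix platform where `resource.getrusage` reports a usable `ru_maxrss`. The unit of that field differs between systems, so only the ratio is used, and because it is a peak value it can show growth but never shrinkage.

```python
import resource


def peak_rss():
    """Peak resident set size of this process, in platform-dependent units."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def should_retire(baseline, current, max_growth=0.20):
    """True once usage has grown by more than max_growth over the baseline."""
    if baseline <= 0:
        return False  # no usable baseline, so never retire on this criterion
    return (current - baseline) / baseline > max_growth


# Record the baseline at startup, then check again between messages.
baseline = peak_rss()
retire_now = should_retire(baseline, peak_rss())
```

A worker would call `should_retire()` after each message and exit cleanly when it returns true, leaving the parent to start a replacement.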

In the meantime, how about an hourly cron job that kills and
restarts spamd?

E.g.:  /usr/local/bin/spamd -r /var/run/spamd.pid -d
Crontab entry:
0  *  *  *  *  root  kill -HUP `cat /var/run/spamd.pid`




-- 
Sherwood Botsford
St. John's School of Alberta



Re: locating/translating geography of IP addresses

2004-10-05 Thread Martin Hepworth
Hi
GeoIP springs to mind..
http://freshmeat.net/projects/geoip/
--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

Diffenderfer, Randy wrote:
> Folks,
>
> Just recently I recall someone mentioning some code or a tool to relate
> IP addresses to originating geography.  I haven't located the reference
> by searching the archives.
>
> So, can someone recall the reference or point me at it?
>
> Thanks!
> rnd


Re: locating/translating geography of IP addresses

2004-10-05 Thread Jerry Glomph Black
I tried a bunch of methods, then plunked down fifty bucks for a 1-year 
subscription to an IP -> country database from http://www.ip2location.com

They also have more finely detailed databases for more $$.
They are trivial to use with MySQL or whatever database program you want.

On Tue, 5 Oct 2004, Diffenderfer, Randy wrote:
> Folks,
>
> Just recently I recall someone mentioning some code or a tool to relate IP
> addresses to originating geography.  I haven't located the reference by
> searching the archives.
>
> So, can someone recall the reference or point me at it?


locating/translating geography of IP addresses

2004-10-05 Thread Diffenderfer, Randy

Folks,


Just recently I recall someone mentioning some code or a tool to relate IP addresses to originating geography.  I haven't located the reference by searching the archives.

So, can someone recall the reference or point me at it?


Thanks!
rnd





Re: SPF not working

2004-10-05 Thread Brett Romero
- Original Message - 
From: "Dermot McNally" <[EMAIL PROTECTED]>
To: "Daniel Quinlan" <[EMAIL PROTECTED]>
Cc: "Raymond Dijkxhoorn" <[EMAIL PROTECTED]>; "Real Magnet - Brett 
Romero" <[EMAIL PROTECTED]>; 
Sent: Tuesday, October 05, 2004 5:03 AM
Subject: Re: SPF not working


> Daniel Quinlan wrote:
> > 2. Put this single line into init.pre which should be in
> >    /etc/spamassassin or /etc/mail/spamassassin:
> >
> > --- start of cut text --
> > loadplugin Mail::SpamAssassin::Plugin::SPF
> > --- end
>
> Interesting - I'm having a similar problem as far as I can see. None of my
> incoming mail is picking up any scores that seem related to SPF, even when
> I try to provoke it with faked mails.
>
> I _do_ have this line in init.pre. However, our SA runs out of MimeDefang,
> and I have a feeling it somehow bypasses init.pre. I certainly seem to get
> different scoring when I pass the same messages through the spamassassin
> command-line tool.
>
> Is there any _other_ way of loading the plugin?
> Dermot

According to the debug output, my SPF plugin is loading.  I'm concerned
about the line parsing errors I see for the scores I've entered in local.cf.
That might be why SPF is not running.  Not sure though.

Brett 



Re: SPF not working

2004-10-05 Thread Dermot McNally
Daniel Quinlan wrote:
> 2. Put this single line into init.pre which should be in
>/etc/spamassassin or /etc/mail/spamassassin:
>
> --- start of cut text --
> loadplugin Mail::SpamAssassin::Plugin::SPF
> --- end 
Interesting - I'm having a similar problem as far as I can see. None of 
my incoming mail is picking up any scores that seem related to SPF, even 
when I try to provoke it with faked mails.

I _do_ have this line in init.pre. However, our SA runs out of 
MimeDefang, and I have a feeling it somehow bypasses init.pre. I 
certainly seem to get different scoring when I pass the same messages 
through the spamassassin command-line tool.

Is there any _other_ way of loading the plugin?
Dermot


spamc sanity check failures on Suse 9.1

2004-10-05 Thread John Beranek
I've just upgraded to SpamAssassin 3.0 and Suse 9.1, and have now 
started noticing errors in my mail logs. I'm not sure if it's the OS 
upgrade or the SpamAssassin upgrade.

Version info:
$ uname -a
Linux linda 2.6.4-52-default #1 Wed Apr 7 02:08:30 UTC 2004 i686 athlon 
i386 GNU/Linux

$ spamassassin --version
SpamAssassin version 3.0.0
  running on Perl version 5.8.3
I run spamd with no options, ditto spamc.
The log messages are attached, because of the long lines, I hope the 
mailing list doesn't strip attachments.

John.
P.S. A web search turned up problems with previous spamassassin versions 
with OSes that use UTF-8 locales by default, and changed spamd and spamc 
to run with LANG=C. This seemed to make it fail more often, not less...

--
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake

Oct  5 11:47:03 linda spamd[14355]: connection from localhost [127.0.0.1] at port 43967
Oct  5 11:47:03 linda spamd[14355]: info: setuid to jberanek succeeded
Oct  5 11:47:03 linda spamd[14355]: processing message <[EMAIL PROTECTED]> for jberanek:500.
Oct  5 11:47:04 linda spamd[14498]: clean message (-5.9/5.0) for jberanek:500 in 0.8 seconds, 1019 bytes.
Oct  5 11:47:04 linda spamd[14498]: result: . -5 - ALL_TRUSTED,BAYES_00 scantime=0.8,size=1019,mid=<[EMAIL PROTECTED]>,bayes=0,autolearn=ham
Oct  5 11:47:13 linda spamd[14355]: clean message (-5.9/5.0) for jberanek:500 in 10.1 seconds, 1019 bytes.
Oct  5 11:47:13 linda spamd[14355]: result: . -5 - ALL_TRUSTED,BAYES_00 scantime=10.1,size=1019,mid=<[EMAIL PROTECTED]>,bayes=0,autolearn=unavailable
Oct  5 11:47:13 linda spamc[14497]: failed sanity check, 1205 bytes claimed, 2487 bytes seen


Re: Cyrillic chars in rule regex ?

2004-10-05 Thread Eugene Morozov
Shane Metler wrote:
> Hi there,
>
> Has anyone constructed Spam Assassin rules that can match Cyrillic
> characters?
>
> I know this is more of a RegEx question, but I have been very
> unsuccessful at finding out how to match Cyrillic characters in Spam
> Assassin rules.
>
> Can anyone offer a little advice or point me to the appropriate method?
> These Russian spams are the only group I've been unable to stop.

This is impossible in an unpatched SpamAssassin. First, there are at
least three encodings in use for Russian: cp1251, koi8-r and utf-8.
Second, SA treats messages as raw bytes, so Unicode properties like
\p{Cyrillic} will not work.
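
To make the encoding point concrete, here is a small Python sketch (not SpamAssassin code) that encodes one Cyrillic word in the three encodings named above and prints the resulting bytes. The sample word and the names `WORD`, `ENCODINGS` and `encoded_forms` are illustrative assumptions.

```python
# The Russian word for "spam", written with escapes so the source stays ASCII.
WORD = "\u0441\u043f\u0430\u043c"

ENCODINGS = ("cp1251", "koi8_r", "utf-8")


def encoded_forms(word, encodings=ENCODINGS):
    """Map each encoding name to the raw bytes it produces for word."""
    return {name: word.encode(name) for name in encodings}


for name, raw in encoded_forms(WORD).items():
    # Print the bytes as space-separated hex pairs.
    print(f"{name:8} {raw.hex(' ')}")
```

The same four letters come out as three different byte sequences, which is why a rule that matches raw bytes has to list every encoding it wants to catch.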

I have a patch for SA that converts messages to Unicode and allows using 
Unicode regexps in rules:
http://bugzilla.spamassassin.org/show_bug.cgi?id=3244

Warning: this patch is untested and even I don't use it. It's just a 
proof of concept.

For Russian spam you can also use the ok_languages and ok_locales options.


RE: Building SA3 on RH9

2004-10-05 Thread Alan Munday
> -Original Message-
> From: Theo Van Dinter [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 05, 2004 12:24 AM
> To: users@spamassassin.apache.org
> Subject: Re: Building SA3 on RH9
> 
> > I have however not found perl(Parse::Syslog) or
> > perl(Statistics::Distributions) needed for spamassassin-tools.
> 
> Unless you're planning to do development (and even then it's
> not needed), I wouldn't bother with spamassassin-tools.

Theo

No, I'm not up to doing development... yet.

Thanks for the information.

Alan



sa 3.0.0 - same performance after training

2004-10-05 Thread Insems Citam



Still no idea what could be wrong. Any ideas, guys?

- Original Message - 
From: Insems Citam 
To: users@spamassassin.apache.org 

Sent: Saturday, October 02, 2004 11:24 AM
Subject: sa 3.0.0 - same performance after training

Hi!
 
I recently installed SpamAssassin 3.0.0.
What I'm trying to do is have it scan a mixed mailbox, split the mail into
two separate mailboxes (spam.mbox and ham.mbox), and then analyze how
successful it was.
I do this with:

formail -s procmail -m sa.check < mixed.mbox

where sa.check contains:

:0fw
| spamassassin

:0:
* ^X-Spam-Status:(.*\<)?Yes
spam.mbox

:0:
ham.mbox
 
So far, everything works OK.
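
For anyone who wants to cross-check the counts without procmail, the splitting step alone can be sketched in Python with the standard `mailbox` module. This assumes the messages in the mixed mbox have already been passed through spamassassin (so they carry an X-Spam-Status header); the function name `split_by_spam_status` is hypothetical.

```python
import mailbox


def split_by_spam_status(mixed_path, spam_path, ham_path):
    """Copy each message into spam or ham based on its X-Spam-Status header."""
    mixed = mailbox.mbox(mixed_path, create=False)
    spam = mailbox.mbox(spam_path)
    ham = mailbox.mbox(ham_path)
    counts = {"spam": 0, "ham": 0}
    try:
        for message in mixed:
            status = message.get("X-Spam-Status", "")
            # A missing header is treated as ham, like the catch-all recipe.
            if status.strip().lower().startswith("yes"):
                spam.add(message)
                counts["spam"] += 1
            else:
                ham.add(message)
                counts["ham"] += 1
        spam.flush()
        ham.flush()
    finally:
        mixed.close()
        spam.close()
        ham.close()
    return counts
```

Comparing the returned counts before and after sa-learn gives the same before/after numbers as counting messages in spam.mbox and ham.mbox.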
Then I run sa-learn on my hand-scanned mailboxes Spam and Ham (see stats
below; it says it learned from all the messages I fed to it), and rerun
SpamAssassin the same way I did the first time.
 
Here's where my problem lies:
Actual stats:
Mixed: 1812
Spam: 1150
Ham: 662
 
Stats after 2nd run:
Spam.mbox: 956
Ham.mbox: 855
... The thing is, the stats I get after the second run are the same as
they were after the first!!! But they very much shouldn't be, as the Mixed
mailbox is actually the Spam and Ham mailboxes put together (and SA learned
from them before it did the 2nd run!).
It is as if it didn't learn at all (but it said it did).

Before I tried this on SA 3.0.0 I did it on SA 2.64, where it worked fine.
The results after the 1st run were not as good as after SA 3.0.0's 1st
run, but after I ran sa-learn on it, it got all the mail correct in the
2nd run.
 
What bothers me is what am I doing wrong? Is there 
some catch with SA 3.0.0? Something I've overlooked?
 
Please help!
 
Thanks,
Insems


Filter doesn't seem to work with sa-3.0 with a wellknown medic.

2004-10-05 Thread Zsolt Koppany
Hi,

Emails come through with very well-known medications in the mail body, such
as (I replaced 'V' with 'X'): Xiagra.

How can I fix it?

Results:

X-Spam-Level: ***
X-Spam-Status: No, score=3.2 required=4.0 tests=BAYES_60,DRUGS_DIET,
DRUGS_ERECTILE,MSGID_DOLLARS autolearn=no version=3.0.0
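
Reading the header above: the message scored 3.2 against a required 4.0, so it missed the threshold by 0.8 even though two DRUGS rules fired. A small Python sketch for pulling those fields out of a (possibly folded) X-Spam-Status value follows; `parse_spam_status` is a hypothetical helper, not part of SpamAssassin.

```python
import re


def parse_spam_status(value):
    """Extract verdict, score, threshold and rule names from a header value."""
    unfolded = " ".join(value.split())  # undo header folding
    head = re.match(
        r"(Yes|No), (?:score|hits)=(-?[\d.]+) required=([\d.]+)", unfolded
    )
    if head is None:
        return None
    # Rule names are comma separated; folding may leave a space after a comma.
    tests = re.search(r"tests=([A-Z0-9_]+(?:,\s*[A-Z0-9_]+)*)", unfolded)
    names = re.split(r",\s*", tests.group(1)) if tests else []
    score, required = float(head.group(2)), float(head.group(3))
    return {
        "flagged": head.group(1) == "Yes",
        "score": score,
        "required": required,
        "shortfall": round(required - score, 3),
        "tests": names,
    }


status = parse_spam_status(
    "No, score=3.2 required=4.0 tests=BAYES_60,DRUGS_DIET,\n"
    " DRUGS_ERECTILE,MSGID_DOLLARS autolearn=no version=3.0.0"
)
```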


Zsolt




Re: SA 3.0 is eating up all my memory!!!

2004-10-05 Thread Simon Byrnand
> On Tue, 5 Oct 2004, Jon Trulson wrote:
>
>> On Mon, 4 Oct 2004, Luis Hernan Otegui wrote:
>>
>> > Well, a weekend update:
>> > Nothing has changed here. I removed EVERYTHING (except for local.cf)
>> > from /etc/mail/spamassassin, and still it chews as much memory as it
>> > could get. I limited the number of childs to five (removed the -m
>> > switch in the startup script), and nothing changed. The only
>> > "improvement" was that instead of 20 processes claiming all the
>> > memory, there were only five trying to freeze my box... But the oldest
>> > one still is a big memory grabber: It reached up to 133 MB, and NEVER
>> > got any lower, it just keeps grabbing and grabbing memory... Seems
>> > pretty much strange to me...
>>
>>  Same thing I saw, except in my case, it was 320MB.  Once a child
>> had it, it never let it go until terminated (or hit the default 200
>> connection limit).
>
> Is there a Perl equivalent to the Unix 'setrlimit' or 'ulimit'
> function? (IE something to set the max data size that a process is
> allowed to use).
>
> Just set it to limit the child processes to something reasonable,
> (say 50~100MB) and have them die if it is exceeded.

Is there any reason why you couldn't just use the unix ulimit command in
the script that launches the spamd daemon? Are the spamd children threads
or processes? If they're just forked processes, shouldn't they inherit
the ulimit values of the parent?

If one of them went over its ulimit it would be killed. Whether the
message being processed would then pass through unscanned or not, and
whether the parent spamd would notice and respawn a replacement is another
matter though... :)
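
On the inheritance question: resource limits set in a process are inherited by the processes it forks and are kept across exec. The Python sketch below demonstrates just that, under the assumption of a Linux-like system where RLIMIT_AS (address space) is supported. The 2 GiB figure and the names `LIMIT_BYTES`, `CHILD_CODE` and `limit_seen_by_grandchild` are illustrative; nothing here is spamd code.

```python
import resource
import subprocess
import sys

LIMIT_BYTES = 2 * 1024 ** 3  # illustrative 2 GiB cap on address space

# Code run by the launched process: it forks once more, and the grandchild
# reports the soft limit it sees.
CHILD_CODE = """
import os, resource
pid = os.fork()
if pid == 0:
    print(resource.getrlimit(resource.RLIMIT_AS)[0], flush=True)
    os._exit(0)
os.waitpid(pid, 0)
"""


def limit_seen_by_grandchild(limit_bytes=LIMIT_BYTES):
    """Set a soft limit before exec and return what a forked grandchild sees."""

    def apply_limit():
        # Runs in the child between fork and exec, like ulimit in a script.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = limit_bytes
        else:
            soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=apply_limit,
        stdout=subprocess.PIPE,
        check=True,
    )
    return int(result.stdout.decode().strip())
```

What happens to the message being scanned when a child hits the limit, and whether the parent respawns a replacement, is a separate question that this sketch does not answer.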

Regards,
Simon





Re: SA 3.0 is eating up all my memory!!!

2004-10-05 Thread David B Funk
On Tue, 5 Oct 2004, Jon Trulson wrote:
> On Mon, 4 Oct 2004, Luis Hernan Otegui wrote:
>
> > Well, a weekend update:
> > Nothing has changed here. I removed EVERYTHING (except for local.cf)
> > from /etc/mail/spamassassin, and still it chews as much memory as it
> > could get. I limited the number of childs to five (removed the -m
> > switch in the startup script), and nothing changed. The only
> > "improvement" was that instead of 20 processes claiming all the
> > memory, there were only five trying to freeze my box... But the oldest
> > one still is a big memory grabber: It reached up to 133 MB, and NEVER
> > got any lower, it just keeps grabbing and grabbing memory... Seems
> > pretty much strange to me...
>
> Same thing I saw, except in my case, it was 320MB.  Once a child
> had it, it never let it go until terminated (or hit the default 200
> connection limit).

Is there a Perl equivalent to the Unix 'setrlimit' or 'ulimit'
function? (IE something to set the max data size that a process is
allowed to use).

Just set it to limit the child processes to something reasonable,
(say 50~100MB) and have them die if it is exceeded.

--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: SA 3.0 is eating up all my memory!!!

2004-10-05 Thread Jon Trulson
On Mon, 4 Oct 2004, Luis Hernán Otegui wrote:
> Well, a weekend update:
> Nothing has changed here. I removed EVERYTHING (except for local.cf)
> from /etc/mail/spamassassin, and still it chews as much memory as it
> could get. I limited the number of childs to five (removed the -m
> switch in the startup script), and nothing changed. The only
> "improvement" was that instead of 20 processes claiming all the
> memory, there were only five trying to freeze my box... But the oldest
> one still is a big memory grabber: It reached up to 133 MB, and NEVER
> got any lower, it just keeps grabbing and grabbing memory... Seems
> pretty much strange to me...

Same thing I saw, except in my case, it was 320MB.  Once a child
had it, it never let it go until terminated (or hit the default 200
connection limit).

[...]
--
Jon Trulsonmailto:[EMAIL PROTECTED]
ID: 1A9A2B09, FP: C23F328A721264E7 B6188192EC733962
PGP keys at http://radscan.com/~jon/PGPKeys.txt
#include 
"I am Nomad." -Nomad


RE: SA 3.0 is eating up all my memory!!!

2004-10-05 Thread Jon Trulson
On Fri, 1 Oct 2004, Morris Jones wrote:
> I found 3.0 pushing my machine into swapping as well this afternoon -- a
> first for me.  I stopped and restarted my smtp server and spamd, and it's
> back to normal for now.
> I'm beginning to think I might be better off running spamassassin in
> unique processes instead of as a daemon.  The load time was never terribly
> bad, and they certainly can't leak.

See my response in a previous thread on this problem.  For kicks,
try passing --max-conn-per-child=1 to spamd and see if your machine will
last longer :)  Mine did...

--
Jon Trulsonmailto:[EMAIL PROTECTED]
ID: 1A9A2B09, FP: C23F328A721264E7 B6188192EC733962
PGP keys at http://radscan.com/~jon/PGPKeys.txt
#include 
"I am Nomad." -Nomad


Re: 3.0 scanning delays

2004-10-05 Thread Jon Trulson
On Fri, 1 Oct 2004, Luis Hernán Otegui wrote:
> Same thing here, except that it also eats as much memory as it can...
> Scan times keep growing bigger and bigger in time...

I saw this problem too on our scanning machine (dual Xeon HT, 1GB
RAM), upgraded to SA 3.0 over the weekend.  After a while (4-8 hours) it
would get slower and slower (to the point the milter on the mail gateway
would time out waiting for spamd to finish a message), and then unscanned
email would be delivered.

I tracked it down (partially) to 3 or more of the spamd children
jumping up to around 320MB allocated RAM and staying there.  Easy to suck
up a gig that way.  As more of the spamd children 'blew up', the system
became slower due to the increased swapping.

By default each spamd child will handle 200 connections before
terminating and allowing the 'master' to start a new child.  After several
hours, these blown-up spamds would bring the machine to its knees.

What I did was add '--max-conn-per-child=1' to the spamd start
line.  This causes each child to die after handling one connection.  I
still see the occasional 'blow up' for a spamd child, but at least now the
memory gets released as soon as that particular child has finished
scanning its message.

Since then, I haven't had any more problems - running 2 days now
without requiring a manual restart to regain control of the machine.

Of course this is really just a workaround.  spamd really should
release its allocated mem after handling a message.  I have no idea what
causes a spamd to explode like that - a 'special' message that exploits
some bug in spamd?  You guys might try that option to spamd and see if it
helps.
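
The effect of a per-child connection cap can be illustrated outside spamd. Python's `multiprocessing.Pool` has a `maxtasksperchild` argument that plays a similar role: a worker exits after that many tasks and the pool starts a fresh one, so whatever memory the old worker held goes back to the OS. The sketch below is only an analogy, not spamd's implementation; `scan` and `run_pool` are hypothetical names, and the "fork" start method assumes a Unix system.

```python
import multiprocessing
import os


def scan(message):
    """Stand-in for scanning one message; returns the PID of the worker."""
    return os.getpid()


def run_pool(messages, children=1, max_conn_per_child=1):
    """Process messages with workers that exit after a fixed number of tasks."""
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=children, maxtasksperchild=max_conn_per_child) as pool:
        # chunksize=1 makes every message count as one task for the cap.
        return pool.map(scan, messages, chunksize=1)


if __name__ == "__main__":
    pids = run_pool(range(6), children=1, max_conn_per_child=2)
    print(len(set(pids)), "distinct workers handled", len(pids), "messages")
```

The trade-off is the one described above: more process start-up work per message in exchange for a hard bound on how long any bloated child can live.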

> On Thu, 30 Sep 2004 11:56:01 -0600, Shane Hickey
> <[EMAIL PROTECTED]> wrote:
> > So, I take it that no one is seeing these weird spamd delays but me?  Rats.
> >
> > Shane Hickey <[EMAIL PROTECTED]> [2004-09-29 14:11]:
> > > Howdy all.  I'm running version 3.0.0 on Gentoo Linux (using the
> > > 3.0.0-r1 ebuild).  The machine is a dual P3/450 and it is also running
> > > sendmail 8.12.11 and it handles mail for 20 or so domains with less
> > > than 20 users total.  So, the mail volume is pretty low.
> > >
> > > I'm running spamd in the following manner:
> > >
> > > /usr/sbin/spamd -d -r /var/run/spamd/spamd.pid -u mail -x -m 10 -L
> > >
> > > I'm running spamc out of my /etc/procmailrc (with no options).
> > >
> > > What I've noticed is that after spamd has been running for a little
> > > while, it starts to take longer and longer to check each message.
> > > Here is a snippet of my times from 2.64:
> > >
> > > clean message (-104.9/5.0) for user1:8 in 0.8 seconds, 1129 bytes.
> > > clean message (-104.9/5.0) for user2:8 in 0.9 seconds, 1231 bytes.
> > > clean message (-104.9/5.0) for user1:8 in 0.8 seconds, 1231 bytes.
> > > clean message (-4.9/5.0) for user1:8 in 1.1 seconds, 1046 bytes.
> > >
> > > When I first start spamd, I see times that are very close to this.
> > > But, within 10-20 minutes, they start to climb.  Here is how they look
> > > right now (I started spamd 40 minutes ago).
> > >
> > > clean message (-102.8/5.0) for user1:8 in 5.8 seconds, 1282 bytes.
> > > clean message (-5.0/5.0) for user2:8 in 41.8 seconds, 2867 bytes.
> > > clean message (-100.0/5.0) for user3:8 in 37.8 seconds, 2250 bytes.
> > >
> > > If I let spamd run for several hours, I'll see times near 200 seconds
> > > per message and it seems to keep increasing.
> > >
> > > I have always had "skip_rbl_checks 1" in my local.cf.  But, I've been
> > > trying to isolate what's caused this new slowness, so I've also tried
> > > to first disable razor2, dcc and pyzor and that didn't seem to make
> > > much difference.  Then I set use_bayes to 0 and that seems to help a
> > > little bit, but I still see long delays.  The delayed times that I
> > > show above are for this configuration:
> > >
> > > # Enable the Bayes system
> > > use_bayes   0
> > > # Enable or disable network checks
> > > skip_rbl_checks 1
> > > use_razor2  1
> > > use_dcc 1
> > > use_pyzor   1
> > >
> > > I also tried "lock_method flock" and I didn't see much success there
> > > either.  Anyway, I was hoping someone else had seen this behavior
> > > or maybe someone could shed some light on what might be the cause of
> > > this?
> > >
> > > Thanks,
> > > Shane
> > > --
> > > Shane Hickey <[EMAIL PROTECTED]>: Network/System Consultant
> > > GPG KeyID: 777CBF3F
> > > Key fingerprint: 254F B2AC 9939 C715 278C  DA95 4109 9F69 777C BF3F
> > > Listening to: The Courtship of Birdy Numnum - The
> > > Parapalegic-Homoerotic Episode
> >
> > --
> > Shane Hickey <[EMAIL PROTECTED]>: Network/System Consultant
> > GPG KeyID: 777CBF3F
> > Key fingerprint: 254F B2AC 9939 C715 278C  DA95 4109 9F69 777C BF3F
> > Listening to: The Styrenes - Cold Meat


--
Jon Trulsonmailto:[EMAIL PROTECTED]
ID: 1A9A2B09, FP: C23F328A721264E7 B6188192EC733962
PGP keys at http://radscan.com/~jon/PGPKeys.txt
#include 
"I am Nomad." -Nomad


Re: Memory footprint of spamd 3.0

2004-10-05 Thread Loren Wilton
> Any chance in going back to something that actually worked?   I tried
> running a 2.64 version of spamd, but got a mountain of bayes-related
> errors.

There is an option to only run a single child, which is claimed to be
equivalent to the 2.6x implementation.  I don't recall the option
(something=1), but Theo posted it within the last day here.  And I'm almost
positive it is in the docs somewhere.

Just watching what people have been reporting, I've come to several
tentative conclusions on 3.0 as it currently stands:

1. I'm about 70% convinced there is an undiscovered memory leak or other
   resource leak that has the equivalent result.
2. The copying of the config back and forth with preforking has a few
   minor but serious problems.
3. There is a problem with spamd children getting hung out to dry on a
   read that never completes.
4. I suspect that 3.0 is inherently less memory efficient than 2.6x; but
   probably not by a huge amount.

I think that when the first three problems are addressed and solved that 3.0
will become a whole lot more generally usable.

Loren



Re: Spamassassin qmail-scanner hack

2004-10-05 Thread Loren Wilton
> But as I was told by so many smart people on this mailing list..
> Spamassassin rewrites the headers. So how the heck is the bad header
> still in the message after it is run through spamc/spamd ??
>   Well it's because the header is not a header. It's part of the message
> body.  So Spamassassin does

FWIW, you must have gotten some broken spams, or else that is a very new
characteristic.  Looking back thru the last month's spam, every spam that
has that fake header in it (or one of two near variations) actually had the
line in the headers, where it will (unfortunately) get stripped before it
can be analyzed.

Fortunately something over 90% of the spams using that line also have other
very easy to identify characteristics that will not get stripped.  :-)

Loren



Re: Memory footprint of spamd 3.0

2004-10-05 Thread Jerry Glomph Black
Good enough argument, but the new version has killed off two machines
that have successfully run SA/spamd with no problems since Jan 2002.
The machines leak into oblivion, and Something Bad happens (major daemons killed 
off, etc.).

There is definitely some kind of memory resource problem (if not an outright
leak) going on, which reverting to the old fork-on-demand -might- help.

I always marvelled at how SA was different from most perl programs, since they
generally hog a machine's resources if running long enough.   Perhaps this is 
finally happening with this persistent-process model.

Any chance in going back to something that actually worked?   I tried running a 
2.64 version of spamd, but got a mountain of bayes-related errors.


On Mon, 4 Oct 2004, Justin Mason wrote:
> Jerry Glomph Black writes:
> > spamd 3.0 does preforking of the child processes.
> >
> > Nothing wrong with that, but WHY do the children have such enormous RSS
> > numbers already when started (>20 Meg per process)?  To me, this makes
> > no sense.
> >
> > 3.0 has rendered two decent machines of mine useless by snarfing up all
> > the RAM.
> >
> > Can this be put back to the old fork-on-demand model?   I'm convinced
> > that the current forking scheme is broken.
> >
> > Is the old 2.6x spamd code compatible enough to run with all the 3.0
> > perl apparatus?
>
> most of this memory is shared.  spamd preloads as much as possible
> upfront, to maximise memory sharing -- including all the rules, compiled
> into perl code etc.  recent investigation (can't recall bug #) is showing
> that this is working, too...


Spamassassin qmail-scanner hack

2004-10-05 Thread Eggleton, Michael

Hello All,

  (This is not directly a spamassassin issue, but may be very useful to anyone using a spamassassin/qmail setup) 

  I have been having an issue with spam sent to my clients and not
quarantined even though the score was way over the limit.  This was
happening because of the following fake line added by the spammer:

X-Spam-Status: No, hits=-5.9 required=5.0 tests=AWL,NO_REAL_NAME autolearn=no
    version=2.60-spam20030926a

  I figured this was a header and that qmail-scanner was reading this
header and not the real header:

X-Spam-Status: Yes, hits=9.9 required=5.0 tests=DNS_FROM_RFCI_DSN,HTML_70_80,
    HTML_FONTCOLOR_UNKNOWN,HTML_IMAGE_ONLY_06,HTML_MESSAGE,
    MIME_BOUND_NEXTPART,MIME_MISSING_BOUNDARY,RCVD_IN_DSBL,
    RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,RCVD_IN_XBL,RCVD_NUMERIC_HELO,
    UPPERCASE_25_50 autolearn=no version=2.64

  But as I was told by so many smart people on this mailing list...
Spamassassin rewrites the headers. So how the heck is the bad header still
in the message after it is run through spamc/spamd ??

  Well, it's because the header is not a header... It's part of the message
body.  So Spamassassin does not see it as a header and does not remove or
replace it.  Because it is still in the message after the message is spit
back out of spamc, qmail-scanner sees this line last, overwrites the real
score with the spammer's score, lets the message through the system, and
does not quarantine it.

So I hacked qmail-scanner to stop this from happening:

Original lines:

  while () {
    print SOUT;
  }

New lines:

  while () {
    if (/^X-Spam-Status: (Yes|No), (hits|score)=(-?[\d\.]*) required=([\d\.]*)/) {
      # HACK HACK HACK: drop this line instead of copying it to SOUT
    }
    else {
      print SOUT;
    }
  }

Hope this is helpful for anyone who was having this issue.
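
The underlying distinction (a status line in the header block versus a look-alike line in the body) can be shown with a short Python sketch. This is not qmail-scanner code: `naive_status` mimics a scanner that trusts the last matching line anywhere in the message, `header_status` reads only the real headers, and the sample message is made up for illustration.

```python
import email
import re

STATUS_LINE = re.compile(
    r"^X-Spam-Status: (Yes|No), (?:hits|score)=", re.MULTILINE
)

# Illustrative message: a real header, then a fake one planted in the body.
MESSAGE = (
    "From: someone@example.com\n"
    "X-Spam-Status: Yes, hits=9.9 required=5.0 tests=HTML_MESSAGE\n"
    "\tautolearn=no version=2.64\n"
    "Subject: test\n"
    "\n"
    "X-Spam-Status: No, hits=-5.9 required=5.0 tests=AWL autolearn=no\n"
    "body text\n"
)


def naive_status(raw_message):
    """Verdict from the last matching line anywhere (the failure mode)."""
    found = STATUS_LINE.findall(raw_message)
    return found[-1] if found else None


def header_status(raw_message):
    """Verdict taken from the header block only; body lines are ignored."""
    value = email.message_from_string(raw_message).get("X-Spam-Status", "")
    match = re.match(r"(Yes|No),", value)
    return match.group(1) if match else None


print(naive_status(MESSAGE), header_status(MESSAGE))
```

Stopping the header scan at the first blank line is an alternative to dropping every matching line, since it leaves the message body untouched.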

Thx

Mike




Re: Memory footprint of spamd 3.0

2004-10-05 Thread Justin Mason

Jerry Glomph Black writes:
> spamd 3.0 does preforking of the child processes.
> 
> Nothing wrong with that, but WHY do the children have such enormous RSS
> numbers already when started (>20 Meg per process)?  To me, this makes
> no sense.
> 
> 3.0 has rendered two decent machines of mine useless by snarfing up all
> the RAM.
> 
> Can this be put back to the old fork-on-demand model?   I'm convinced
> that the current forking scheme is broken.
> 
> Is the old 2.6x spamd code compatible enough to run with all the 3.0
> perl apparatus?

most of this memory is shared.  spamd preloads as much as possible
upfront, to maximise memory sharing -- including all the rules, compiled
into perl code etc.  recent investigation (can't recall bug #) is showing
that this is working, too...

--j.



Memory footprint of spamd 3.0

2004-10-05 Thread Jerry Glomph Black
spamd 3.0 does preforking of the child processes.
Nothing wrong with that, but WHY do the children have such enormous RSS numbers 
already when started (>20 Meg per process)?  To me, this makes no sense.

3.0 has rendered two decent machines of mine useless by snarfing up all the RAM.
Can this be put back to the old fork-on-demand model?   I'm convinced that the 
current forking scheme is broken.

Is the old 2.6x spamd code compatible enough to run with all the 3.0 perl 
apparatus?




NO_DNS_FOR_FROM

2004-10-05 Thread Alan
Prior to upgrading to SA3, my maillog used to show quite a few hits for
'NO_DNS_FOR_FROM'. When SA3 came out, I rebuilt my server (RH8, MailScanner
4.34.4) and installed SA3. All seems to be working well, I see hits for SPF,
Razor2, and surbl tests etc., but I no longer see ANY hits for NO_DNS_FOR_FROM.
None at all.

I saw the threads discussing old versions of Net::DNS, but I have confirmed that
I have the latest version installed:
  Spamassassin -D reports:
  debug: is Net::DNS::Resolver available? yes
  debug: Net::DNS version: 0.48

  CPAN reports:
  Net::DNS is up to date.

Even when I feed SA an email with a non-existent From domain, this rule is not
hit. Anyone have any ideas on how I should proceed?

Thanks!
-Alan



Re: Cyrillic chars in rule regex ?

2004-10-05 Thread Loren Wilton
I think you may have to figure out what the actual 8 bit value of those
characters are and then go to hex escape sequences in the re.  Ugly, but it
can be made to work.  The trick is extracting the actual 8-bit values you
will need to match.  A hex dump of the mail comes to mind.
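
As a rough illustration of that approach, the Python sketch below turns a Cyrillic word into a hex-escaped alternation covering several encodings and tests it against raw bytes. The sample word, the encoding list and the helper names are assumptions for the example; whether a given SpamAssassin rule type sees the bytes in this form is not something the sketch checks.

```python
import re

# The Russian word for "spam", written with escapes so the source stays ASCII.
WORD = "\u0441\u043f\u0430\u043c"
ENCODINGS = ("cp1251", "koi8_r", "utf-8")


def hex_escaped(raw):
    """Render bytes as a run of \\xNN regex escapes."""
    return "".join(f"\\x{byte:02x}" for byte in raw)


def byte_pattern(word, encodings=ENCODINGS):
    """Alternation matching word as encoded in any of the given encodings."""
    alternatives = [hex_escaped(word.encode(name)) for name in encodings]
    return "(?:" + "|".join(alternatives) + ")"


pattern = byte_pattern(WORD)
matcher = re.compile(pattern.encode("ascii"))  # bytes regex over raw mail data
print(pattern)
```

The printed pattern is plain ASCII, so it can be pasted into a rule file. It is case-sensitive as written, and short byte runs can collide across encodings, so longer words make safer patterns.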

Loren



Re: SA 3.0 and SURBL obfuscation

2004-10-05 Thread Loren Wilton
Is that still broken in 3.0?  I thought sure they would have fixed that
blank line parsing problem!

Loren


> I've noticed some technique to avoid running SURBL check.
> There's an appropriate part of spam message:
> 
> http://aircraft.com href=
>
> "http://ca-t.com/free/?org";>MORE INFO
> HERE
> 
> 
> 
> 
> 
> http://leveled.com href=
>
> "http://ca-t.com/rm.html";>no thanx
> 



Re: Memory usage spikes ...

2004-10-05 Thread Morris Jones
When the memory usage spikes up on a spamd child, it does so on a single
message; from 36K or so up to 250M.  That one message shows up in my
logs taking a _long_ time:

Oct  4 18:46:06 devilrock spamd[1800]: identified spam (12.8/5.0) for :500 in 165.3 seconds, 1490 bytes.

Most of the spams are identified in 2-4 seconds.

I decided to watch my logs for one of these to come by, and save out the
message.

But that was really useless.  The message itself doesn't trigger the
memory spike.  In this case the triggering message was a simple little
random spam, with a few random words thrown in.  Repeat runs through
spamc give a two second result and no memory spike.

So something else is causing the memory spike.  Something to do with
bayes db maintenance perhaps?
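
Watching the logs for these outliers can be automated. The Python sketch below picks out scans slower than a threshold from spamd log lines of the form shown above. It assumes one log entry per line; the 30-second threshold, the name `slow_scans` and the second sample line are made up for illustration.

```python
import re

SCAN_LINE = re.compile(
    r"spamd\[(?P<pid>\d+)\]: (?P<verdict>clean message|identified spam) "
    r"\((?P<score>-?[\d.]+)/(?P<required>[\d.]+)\) for (?P<user>\S*) "
    r"in (?P<seconds>[\d.]+) seconds, (?P<size>\d+) bytes"
)


def slow_scans(lines, threshold=30.0):
    """Return (pid, seconds, size) for every scan at or above the threshold."""
    hits = []
    for line in lines:
        match = SCAN_LINE.search(line)
        if match and float(match.group("seconds")) >= threshold:
            hits.append(
                (
                    int(match.group("pid")),
                    float(match.group("seconds")),
                    int(match.group("size")),
                )
            )
    return hits


SAMPLE = [
    "Oct  4 18:46:06 devilrock spamd[1800]: identified spam (12.8/5.0) "
    "for :500 in 165.3 seconds, 1490 bytes.",
    # Hypothetical fast scan, added for contrast.
    "Oct  4 18:46:09 devilrock spamd[1801]: clean message (0.3/5.0) "
    "for :500 in 2.1 seconds, 2210 bytes.",
]
print(slow_scans(SAMPLE))
```

Logging the PID alongside the time makes it possible to line a slow scan up with the child that ballooned in top.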

Best regards,
Mojo

On Sat, 2 Oct 2004, Morris Jones wrote:

> Yesterday I commented that I was seeing spamd children eating a lot of
> memory, pushing the machine into swap.  I've been keeping an eye on
> the spamd children this morning.
> 
> Overnight, all five children were using around 4 meg.  This morning
> sometime, one spamd child shot up to 250M:
> 
> Mem:   513948K av,  504660K used,    9288K free,       0K shrd,   15532K buff
> Swap: 1052216K av,  263780K used,  788436K free                   68408K cached
>
>   PID  PPID USER  SIZE STAT %CPU %MEM COMMAND
>  1537 15624 root  250M S     0.0 44.5 spamd child
> 25394 15624 root 40056 S     0.0  6.1 spamd child
>  1432 15624 root 38932 S     0.0  6.0 spamd child
>  1241 15624 root 38768 S     0.0  6.0 spamd child
>  1754 15624 root 39308 S     0.0  6.0 spamd child
> 
> 
> Yesterday afternoon when I killed and restarted spamd, they were all using
> about that much.
> 
> Mojo
> 

-- 
Morris Jones <*>
Monrovia, CA
[EMAIL PROTECTED]
http://www.whiteoaks.com



Bayes problem: Interrupted system call

2004-10-05 Thread Oscar Retana
I am using SA3.0, and I have a heavy load of incoming email. When the
load is low, SA works fine. But I had to turn off bayes because SA was
getting stuck trying to get the lock for the bayes databases when
the load was high:

bayes.db_* R/W: lock failed: Interrupted system call
bayes.db_* R/W: lock failed: Interrupted system call
bayes.db_* R/W: lock failed: Interrupted system call
...
One thing I noticed was that the bayes database size was about 40MB. Too
big, isn't it? I think my configuration sets limits (I don't know whether
these limits refer to tokens, bytes, kbytes or mbytes; this is poorly
documented).

But then I reset the databases, and the problem continued, even with
small databases.

Has anyone seen something like this?
A last question: what would count as a heavy load for SA?
Thanks a lot!
- Oscar.

P.S. For your reference, my bayes settings:
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 0
bayes_auto_learn_threshold_nonspam 12
bayes_journal_max_size 5
bayes_expiry_max_db_size 75000
bayes_auto_expire 1
bayes_path /var/spool/spamassassin/bayes/bayes.db
bayes_file_mode 0660
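
One thing visible in the settings above: bayes_auto_learn_threshold_nonspam appears twice (with 0 and 12) and there is no bayes_auto_learn_threshold_spam line, so the second entry may have been meant as the spam threshold; that is a guess, not something the post confirms. A quick Python check for repeated setting names is sketched below. `repeated_keys` is a hypothetical helper, and it only makes sense for single-valued settings, since some SpamAssassin directives are legitimately repeated.

```python
from collections import Counter

SETTINGS = """\
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 0
bayes_auto_learn_threshold_nonspam 12
bayes_journal_max_size 5
bayes_expiry_max_db_size 75000
bayes_auto_expire 1
bayes_path /var/spool/spamassassin/bayes/bayes.db
bayes_file_mode 0660
"""


def repeated_keys(config_text):
    """Names of settings that appear on more than one non-comment line."""
    keys = []
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            keys.append(line.split(None, 1)[0])
    return sorted(name for name, count in Counter(keys).items() if count > 1)


print(repeated_keys(SETTINGS))
```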



Re: Building SA3 on RH9

2004-10-05 Thread Ed Greshko
On Tue, 2004-10-05 at 07:18, Alan Munday wrote:

> I have however not found perl(Parse::Syslog) or
> perl(Statistics::Distributions) needed for spamassassin-tools.
> 
> I just wanted to check the best method of getting these modules as I know RH
> can be a bit fussy and I want to avoid getting modules in the wrong part of
> the perl tree. 

I've found the use of cpan2rpm and then installing the modules using the
resulting rpms to work the best.

http://perl.arix.com/cpan2rpm/

-- 
"I think the problem, to be quite honest with you, is that you've never
actually known what the question is."

--The computer "Deep Thought" in "Hitchhiker's Guide to The Galaxy"