RE: Ammount of the RAM used by spamd childs

2006-09-30 Thread Balzi Andrea
Thanks All!

Now I have about 80MB per child.

Andrea


Re: Earthlink emails

2006-09-30 Thread Ramprasad
On Fri, 2006-09-29 at 11:20 -0400, Michel Vaillancourt wrote:
 Ramprasad wrote:
  On Fri, 2006-09-29 at 08:12 -0400, Michel Vaillancourt wrote:
  Ramprasad wrote:
  Why not SPF ??
 Over two thirds of the email I receive that is UCE/Spam has an 
  SPF_PASS associated with it from SA.  All SPF seems to do is make the 
  stupid spammers look more stupid.  The clever ones aren't affected.
 
  I have a script that automatically blocks SPF-pass domains sending spam
  consistently. you could make good use of the SPF_PASS too. 
  
 
   Care to share?  This would be very handy.
 
This is a Perl script that is part of a larger module, and not exactly
worth sharing. But the idea is very simple:

* A cron script on each machine parses the logs for SPF_PASS mails with
an SA score above 15 and puts the matching log lines in a file in the
HTTP area.

* The rbldns server wgets these files from the different servers and
finds the top sender domains that send spam.

* Delete all whitelisted domains from the list, along with those domains
that are also sending a lot of ham to valid ids (I get this from a MySQL
query against my reports db).

* Put the remaining domains into the rbldns blacklist and restart the
rbldns server so that Postfix uses them.
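As a rough illustration of the steps above (this is not the Perl script itself), the aggregation could be sketched in Python as below. The log-line layout, the `sender=`/`score=`/`tests=` field names, the `min_hits` cutoff and the example domains are all assumptions invented for the sketch; only the SPF_PASS condition and the score threshold of 15 come from the description.

```python
import re
from collections import Counter

# Hypothetical log format; real MTA/SpamAssassin log lines will look different.
LINE_RE = re.compile(
    r"sender=<[^@>]*@(?P<domain>[^>]+)> "
    r"score=(?P<score>-?\d+(?:\.\d+)?) "
    r"tests=(?P<tests>\S*)"
)

def spf_pass_spam_domains(lines, min_score=15.0):
    """Count sender domains of mails that hit SPF_PASS and scored above min_score."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        tests = m.group("tests").split(",")
        if "SPF_PASS" in tests and float(m.group("score")) > min_score:
            counts[m.group("domain").lower()] += 1
    return counts

def build_blacklist(counts, whitelist, ham_senders, min_hits=2):
    """Drop whitelisted domains and known ham senders; keep repeat offenders."""
    return sorted(
        domain for domain, hits in counts.items()
        if hits >= min_hits
        and domain not in whitelist
        and domain not in ham_senders
    )

sample = [
    "sender=<a@spam.example> score=22.1 tests=SPF_PASS,URIBL_SBL",
    "sender=<b@spam.example> score=18.0 tests=SPF_PASS",
    "sender=<c@good.example> score=16.5 tests=SPF_PASS",
    "sender=<d@good.example> score=17.0 tests=SPF_PASS",
    "sender=<e@other.example> score=3.0 tests=SPF_PASS",
]
counts = spf_pass_spam_domains(sample)
blacklist = build_blacklist(counts, whitelist={"good.example"}, ham_senders=set())
print(blacklist)
```

In a real deployment the `ham_senders` set would come from the reports database query, and the resulting list would be written out in whatever zone format the rbldns server expects.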





  What is the point accepting the mail and the entire data and then
  scanning for DK when It should have ideally been rejected after 
  mail from:
 
 That would be the exact point of DK at the Postfix/ MTA level.
  
  How. All the while I thought dkfilter helps me block after dataend ? Do
  I have to RTFM again ? 
  
   My mistake..  this one runs as a content filter.  The same author is 
 working on a DKIM Proxy that would be your first point-of-contact and handle 
 the mail from intercept.  I got confused.
 
  
  So I let SA do the testing .. which catches the spams but eats resources
  of my servers. When you receive 3-5 million mails a day you tend to
  bother more about resources
 
 I would humbly submit to you that if you move that much traffic, you 
  should be able to justify one more MX machine in the pool and implementing 
  DK.
 
  We have 8 dual Xeons already for this much traffic, and the servers are
  always loaded with all kinds of tests enabled in SA.
  
   I'm curious... what is the RAM/ MHz spec of your machines?  5M mail/day 
 is 7 mail per second per machine...  at a median 8 seconds mail handle time, 
 that is 57 mail in the pipes at any one time...  50MB for SA or anti-virus 
 per message works to about 3GB of RAM in use.  I can see your concern.  
 However, again, I'd say that even two more machines in the pool would bring 
 that down to ~2GB of RAM in use per machine, and that should give you the 
 cycles and memory to run SPF queries as well as DK filters.
 
4GB RAM, 2 x 3GHz Xeon with HT.
But I think you too would know mail never arrives uniformly at 7/s.
There are peak times when my mailservers touch 43k/hour, while at
night they may be sleeping with the rest of us. And at peak times the
mail delay starts killing us. (That's exactly when I start sending 450
to bad domains.)
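For reference, the back-of-envelope figures quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it takes the 5M mails/day, 8 machines, 8 second handling time and 50MB per message from the thread, and it assumes that the 43k/hour peak is a per-machine figure (it would be below the daily average otherwise).

```python
SECONDS_PER_DAY = 86_400
DAILY_MAILS = 5_000_000          # figure from the thread
PEAK_RATE = 43_000 / 3600        # msgs/s, assuming 43k/hour is per machine

def avg_rate(machines):
    """Average messages per second per machine for the daily volume."""
    return DAILY_MAILS / SECONDS_PER_DAY / machines

def in_flight(msgs_per_sec, handle_secs):
    """Little's law: messages in the pipeline = arrival rate x handling time."""
    return msgs_per_sec * handle_secs

def ram_gb(msgs_per_sec, handle_secs, mb_per_msg=50):
    """Approximate RAM in use, taking 1GB as 1000MB like the estimate above."""
    return in_flight(msgs_per_sec, handle_secs) * mb_per_msg / 1000

for label, rate in [
    ("8 machines, average", avg_rate(8)),
    ("10 machines, average", avg_rate(10)),
    ("1 machine, peak hour", PEAK_RATE),
]:
    print(f"{label}: {rate:.1f} msg/s, "
          f"{in_flight(rate, 8):.0f} in flight, "
          f"{ram_gb(rate, 8):.1f} GB")
```

Under those assumptions the peak-hour line comes out above the 4GB installed per machine, which fits the observation that delays appear at peak times rather than on average.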





   I do understand the notion your boss might not be willing to put 
 another $5K down to deal with the problem.  However, as anyone  can attest 
 to, good customer service costs money to provide.
 



Re: Ammount of the RAM used by spamd childs

2006-09-30 Thread Matt Kettler
Balzi Andrea wrote:
 Thanks All!

 Now I have about 80MB per child.

 Andrea

   
You're distinctly NOT welcome.

I don't help folks who outright blacklist whole ISP's with millions of
legitimate users in order to prevent a portion of spam. Particularly
when that ISP is one I'm using.

Perhaps now that your spamd's are reasonable, you can ditch some of
these absurdly ignorant approaches to spam control:

 A message (from [EMAIL PROTECTED]) was received at 30 Sep 2006  3:14:30 
 +.

 The following addresses had delivery problems:

 [EMAIL PROTECTED]
   Permanent Failure: 
 550-mail_drop_because_comcast.net_is_in_our_blacklist_/_mail_scartata_perche'
   Delivery last attempted at Sat, 30 Sep 2006 03:14:46 -
   
   



Re: Ammount of the RAM used by spamd childs

2006-09-30 Thread jdow

From: Bowie Bailey [EMAIL PROTECTED]


Balzi Andrea wrote:

 -Original Message-
[...]
  every child occupies approximately 450MB of RAM.
  
  My server is a GNU/Linux Debian 3.1r2 with spamassassin v3.1.5 and
  Perl v5.8.4. Isn't 450MB too much for a single child?
 
 That is a bit excessive.  My first guess is that you have WAY
 too many add-on rule sets (or you are using old ones that should
 not be used). 
 
 Which rule sets are you currently using?
 


I'm using the default rules of SpamAssassin 3.1.5 with the following
rules downloaded from rulesemporium:

ANTIDRUG


Antidrug is not needed with current versions of SA.


BLACKLIST_URI


You should use the ws.surbl.org version of this blacklist instead.

See here for more info:
http://wiki.apache.org/spamassassin/SURBL


BLACKLIST


This is a 16M rulefile and probably a major contributor to your memory
load.


SARE_SPAMCOP_TOP200


The current versions of SA already use this list as a network test.
If you have network tests enabled, you don't need this.

Other than that, all I can say is that you have quite a few rules.
You may want to try removing some of them and restarting spamd.  Just
do some trial and error and see which ones make the most difference.


You named the big ones. I use more rule sets than he quoted and only
use about 66 megs.

28211 root  16   0 75368  66m 2400 S  0.0  6.6   0:24.43 spamd

{^_^}



Re: Non-blocklisted embedded URLs are getting hits on URIBL_AB_SURBL and URIBL_PH_SURBL in SpamAssassin 3.1.5

2006-09-30 Thread Justin Mason

David Ulevitch writes:
 From: Chris [EMAIL PROTECTED]
 To: users@spamassassin.apache.org
 Date: Friday, September 29, 2006, 3:59:03 PM
 Subject: Non-blocklisted embedded URLs are getting hits on  
 URIBL_AB_SURBL and URIBL_PH_SURBL in SpamAssassin 3.1.5

 ===8==Original message text===
 On Thursday 28 September 2006 1:17 am, Donald Craig wrote:
 And Theo Van Dinter pointed out:
 You're not by chance using the opendns.{com,org} folks for DNS,  
 are you?

 Of course.  I'm an idiot.  I switched to OpenDNS a couple of weeks  
 back.
 Time to return from whence I came.  Thank you,

Donald,

We handle DNSBLs but not URIBLs, at the moment.  Passing along to  
Noah to see what he can do.  Sorry you had this happen to your  
SpamAssassin scoring. (Time to check mine... :-) )

You can resolve this behavior by turning off typo correction in your  
preferences page and it'll work again with us returning NXDOMAIN  
(RCODE=3) instead of doing the typo correction service.  Hopefully we  
can get more granular with that in the future.

If you are on a dynamic IP, well, just sit tight for a couple more  
weeks or email me to start beta testing some code this week to handle  
dynamic IPs (and that offer is for anyone).

David -- 

Thanks for commenting, and good to hear it doesn't affect traditional
DNSBL lookups.   It sounds like we should probably add a temporary
SpamAssassin FAQ entry for this?

--j.

Thanks,
David Ulevitch (from OpenDNS)


 Don Craig
 
 I'm getting matches whenever I have an embedded URL
 on URIBL_AB_SURBL and URIBL_PH_SURBL -
 unless the URL is actually in URIBL_SBL, in which case the
 logic for all the flavors of URIBL_XX_SURBL seems
 to work correctly.  I have verified the
 absence of the incorrectly matching URLs from SURBL
 with lookups in http://www.rulesemporium.com/cgi-bin/uribl.cgi

 This is SpamAssassin 3.1.5, all was fine in 3.1.2.

 For now I have set both those tests to 0.00.

 Don Craig
 Yes, OpenDNS definitely caused problems for me also:

 Sep  1 21:51:25 localhost spamd[10939]: uridnsbl: bogus rr for
 domain=otwaloow.com, rule=URIBL_XS_SURBL, id=8880
 rr=otwaloow.com.xs.surbl.org. 1 IN A 208.67.219.40
 at /usr/lib/perl5/site_perl/5.8.5/Mail/SpamAssassin/Plugin/URIDNSBL.pm line 626.

 Theo pointed out the errors of my ways:

 The error is saying that it's looking for a 127/8 result, but it gets
 208.67.219.40 (which resolves to a *.opendns.com name btw).  So I would
 say that yes, the problems are related to changing your nameservers.


 -- 
 Chris

 ===8===End of original message text===
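The "bogus rr" check described in the quoted text can be illustrated with a small Python sketch. This is not SpamAssassin's own code (that lives in the Perl URIDNSBL plugin); the function name is made up for the example, and it only shows the idea that a genuine DNSBL/URIBL answer is expected to fall inside 127.0.0.0/8, so an address substituted by a resolver stands out.

```python
import ipaddress

# DNSBL/URIBL zones answer with addresses inside 127.0.0.0/8.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")

def is_plausible_dnsbl_answer(a_record: str) -> bool:
    """Return True if the A record lies in the range a blocklist would return."""
    return ipaddress.ip_address(a_record) in LOOPBACK_NET

print(is_plausible_dnsbl_answer("127.0.0.2"))      # typical blocklist answer
print(is_plausible_dnsbl_answer("208.67.219.40"))  # address from the log above
```

A resolver that returns NXDOMAIN for unlisted names never reaches this check at all, which is why turning off the typo correction restores normal behaviour.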





Re: Non-blocklisted embedded URLs are getting hits on URIBL_AB_SURBL and URIBL_PH_SURBL in SpamAssassin 3.1.5

2006-09-30 Thread David Ulevitch

On Sep 30, 2006, at 3:30 AM, Justin Mason wrote:


David Ulevitch writes:


Donald,

We handle DNSBLs but not URIBLs, at the moment.  Passing along to
Noah to see what he can do.  Sorry you had this happen to your
SpamAssassin scoring. (Time to check mine... :-) )

You can resolve this behavior by turning off typo correction in your
preferences page and it'll work again with us returning NXDOMAIN
(RCODE=3) instead of doing the typo correction service.  Hopefully we
can get more granular with that in the future.

If you are on a dynamic IP, well, just sit tight for a couple more
weeks or email me to start beta testing some code this week to handle
dynamic IPs (and that offer is for anyone).


David --

Thanks for commenting, and good to hear it doesn't affect traditional
DNSBL lookups.   It sounds like we should probably add a temporary
SpamAssassin FAQ entry for this?



Justin,

That sounds like a good idea.  Want me to write one up for you in the  
style of the SA FAQ or is there enough in my post above to toss one  
in until we are better able to address URIBLs?


-david



Re: SA gone mad, times out and stucks

2006-09-30 Thread Jürgen Herz
Jürgen Herz wrote:
 Bowie Bailey wrote:
 If your --force-expire only took 19 seconds, I would guess that you
 are not talking to the same database.  Make sure you are logged in as
 the same user that is having the problem when you run the
 --force-expire.
 
 Uh, that's a very good point. You can be right, --force-expire as that
 actual user took 641 secs.
 Have reenabled bayes_auto_expire now and will see.

Manual --force-expire seems to have helped. I only get one timeout per
day since then, from what I see when multiple mails come in at the same time.

What I still get and do not understand is
warn: bayes: cannot open bayes databases /var/spool/exim4/.spamassassin/bayes_* R/W: lock failed: File exists

Thanks for all your help so far.

Regards,
Jürgen


Re: SA gone mad, times out and stucks

2006-09-30 Thread Andreas Pettersson

Jürgen Herz wrote:


What I still get and do not understand is
warn: bayes: cannot open bayes databases /var/spool/exim4/.spamassassin/bayes_* R/W: lock failed: File exists
 



Make sure the file permissions haven't changed since you ran the manual 
expire.


Regards,
Andreas



Re: TQMcube Geo Zone config files

2006-09-30 Thread Andreas Pettersson

Andreas Pettersson wrote:

In case anybody is interested, I've compiled a config file for the 
geo zone at TQM http://tqmcube.com/worldzone.php
It might not be of great use, but it is interesting to gather some 
statistics on where the mails come from.


Files found here
http://anp.ath.cx/tqmcube/



I have updated tqmcube_world.cf with the -lastexternal setting on the 
set name, so that only the connecting IP address is checked instead of 
the whole chain of relays.


Regards,
Andreas



SpamAssassin MX Gateway Server

2006-09-30 Thread Russ B.
I have a unique but interesting problem:

I have a farm of servers that use Sendmail/ProcMail/SpamAssassin.

Due to their very heavy loads and my custom rules, I have built a
dual-proc-dual-core FBSD AMD64 bit OS server to do nothing but my major
spam knockdowns and processing to send back to the
Sendmail/Procmail/SpamAssassin server farm.

On my gateway MX server, I'm using Postfix/AmavisD and SpamAssassin, and
it works great. It flat out rejects mail with a spam score over 150,
it tags mail as spam if it scores over 15, and it adds the spam
headers over 15 as well. If mail scores UNDER 15, it gets neither a
score nor headers.

Then, on the Sendmail farm, I use this recipe, which works great:

:0:
* ! ^X-Spam-Status: YES
{
:0fw
* < 256000
|/usr/local/bin/spamc -f
}

:0:
* ^X-Spam-Status: Yes
$HOME/mail/Caught-Spam

Basically, anything that arrives over 15 in score, will have that
SPAM-STATUS header embedded, so it does NOT run SpamAssassin on this
server, and just puts it in the Caught-Spam. If it has LOWER than a score
of 15 from the MX, then the MX server didn't put a header on it, so it's
processed here and filed here.

Why do that? Because my users on the sendmail server farm have a whole
variety of score choices they are using, so I want their specific score to
be utilized - but by making the score on the MX 15, I'm saving the 
sendmail server from a WHOLE LOT of processing, and nobody's going to have
a default score over 15... so that's a safe number?

Make sense? This works great. The MX gets the mail, knocks down the
really bad spam, tags the medium spam and lets the end servers re-score
the questionable stuff to the user preferences.

Ok - my question/problem is this:

Is there a way I can run spamc (or spamassassin) so that it doesn't
actually RESCORE/REPROCESS the mail (the large amount of work), but
instead just looks at the user's required score (required_score  6.0) and
only re-tags the X-Spam-Status flag to YES or No??

See, in my current setup (as explained above):

MX server scores it as spam score 205 -- sendmail farm nukes it

MX server scores it as spam score 16 - MX tags it as spam -- sendmail
farm just files it in the user's Caught-Spam folder.

MX server scores it as score 7, which is below questionable as 15, so it
doesn't score it --- sendmail then runs spamass on it, rescores it and
then files it to user's settings.



Re: SpamAssassin MX Gateway Server

2006-09-30 Thread Russ B.
Fix to above post's last lines:

MX server scores it as spam score 200 -- MX server just nukes it

MX server scores it as spam score 16 - MX tags it as spam -- sendmail
farm just files it in the user's Caught-Spam folder.

MX server scores it as score 7, which is below questionable which is set
to 15, so it doesn't score it (nor gives it any spam headers) ---
sendmail then runs spamass on it, rescores it and then files it to user's
settings.




Re: Setting up DKIM and DomainKeys mail signing and verification

2006-09-30 Thread SM

At 12:32 28-09-2006, Henrik Ostergaard wrote:

This sounds promising! But I have distributed, moving users and therefore
use pop-before-smtp for authentication, which means that my IP list is in a
hash table, which is not in CIDR format. :-(


dk-filter and dkim-filter support pop-before-smtp.

Regards,
-sm 



Re: sa-learn and Caught spams

2006-09-30 Thread Bill Horne
On Wed, 2006-09-27 at 21:00 -0400, Matt Kettler wrote:
 Bill Horne wrote:
 
  I have a follow on question, so I'll add it to this thread:
 
  Assuming that it's a good idea to feed Caught spams through sa-learn
  in order to reinforce the tokens that might not have been autolearned,
  how do I tell SA to ignore the  SPAM  notice in the subject? I
  have ignore-header commands in local.cf for the X-Spam-Status: Yes and
  other spam headers, but how do I skip only a portion of the subject?
 
 Provided it's a markup your SpamAssassin generated, SA will
 automatically ignore it when learning.

Matt,

Thanks for your reply: I apologize for not writing more clearly.

I'm running Exim4 with Exiscan, so the  SPAM  prefix is being
added to the Subject header by Exim4, not SA. 

The question is: can sa-learn be taught to ignore specific parts of the
subject line, or do I need to filter its input with a separate process?
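If a separate filtering process turns out to be the way to go, a minimal Python sketch might look like the following. The `TAG` value is a placeholder, since the exact prefix the MTA adds is site specific; the function name is also invented for the example. The sketch assumes LF line endings and that the prefix sits at the very start of the Subject header.

```python
import re

# Placeholder: replace with the exact prefix your MTA puts on the Subject.
TAG = "[SPAM] "

def strip_subject_tag(raw: bytes, tag: str = TAG) -> bytes:
    """Remove `tag` from the start of the Subject header; leave the body untouched."""
    # Headers end at the first blank line; only the header part is edited.
    head, sep, body = raw.partition(b"\n\n")
    pattern = re.compile(
        rb"^(Subject:[ \t]*)" + re.escape(tag.encode()),
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.sub(rb"\1", head, count=1) + sep + body

message = (
    b"From: someone@example.com\n"
    b"Subject: [SPAM] Cheap pills\n"
    b"\n"
    b"Subject: [SPAM] this line is body text\n"
)
cleaned = strip_subject_tag(message)
print(cleaned.decode())
```

Wrapped in a small script that reads a message on standard input and writes the result to standard output, this could sit in a pipeline in front of `sa-learn --spam`, assuming sa-learn is reading the message from standard input.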

TIA.

Bill Horne




Re: SpamAssassin MX Gateway Server

2006-09-30 Thread Jerry Bell
I'm not the authority on such things, but I don't believe it's possible
without some customization.

I really wanted to ask you, though, how you handle mail rejection on the
inner layer of mail servers?  If mail gets through your front end SA box
and needs to be rejected because it's to an invalid address or some other
reason, how do you handle that?

Jerry


 I have a unique but interesting problem:

 I have a farm of servers that use Sendmail/ProcMail/SpamAssassin.

 Due to their very heavy loads and my custom rules, I have built a
 dual-proc-dual-core FBSD AMD64 bit OS server to do nothing but my major
 spam knockdowns and processing to send back to the
 Sendmail/Procmail/SpamAssassin server farm.

 On my gateway MX server, I'm using Postfix/AmavisD and SpamAssassin, and
 it works great. It flat out rejects mail with a spam score over 150,
 it tags mail as spam if it scores over 15, and it adds the spam
 headers over 15 as well. If mail scores UNDER 15, it gets neither a
 score nor headers.

 Then, on the Sendmail farm, I use this recipe, which works great:

 :0:
 * ! ^X-Spam-Status: YES
 {
 :0fw
 * < 256000
 |/usr/local/bin/spamc -f
 }

 :0:
 * ^X-Spam-Status: Yes
 $HOME/mail/Caught-Spam

 Basically, anything that arrives over 15 in score, will have that
 SPAM-STATUS header embedded, so it does NOT run SpamAssassin on this
 server, and just puts it in the Caught-Spam. If it has LOWER than a score
 of 15 from the MX, then the MX server didn't put a header on it, so it's
 processed here and filed here.

 Why do that? Because my users on the sendmail server farm have a whole
 variety of score choices they are using, so I want their specific score to
 be utilized - but by making the score on the MX 15, I'm saving the
 sendmail server from a WHOLE LOT of processing, and nobody's going to have
 a default score over 15... so that's a safe number?

 Make sense? This works great. The MX gets the mail, knocks down the
 really bad spam, tags the medium spam and lets the end servers re-score
 the questionable stuff to the user preferences.

 Ok - my question/problem is this:

 Is there a way I can run spamc (or spamassassin) so that it doesn't
 actually RESCORE/REPROCESS the mail (the large amount of work), but
 instead just looks at the user's required score (required_score  6.0) and
 only re-tags the X-Spam-Status flag to YES or No??

 See, in my current setup (as explained above):

 MX server scores it as spam score 205 -- sendmail farm nukes it

 MX server scores it as spam score 16 - MX tags it as spam -- sendmail
 farm just files it in the user's Caught-Spam folder.

 MX server scores it as score 7, which is below questionable as 15, so it
 doesn't score it --- sendmail then runs spamass on it, rescores it and
 then files it to user's settings.






Re: SpamAssassin MX Gateway Server

2006-09-30 Thread Daniel Staal
--As of September 30, 2006 12:32:41 PM -0500, Russ B. is alleged to have 
said:



Basically, anything that arrives over 15 in score, will have that
SPAM-STATUS header embedded, so it does NOT run SpamAssassin on this
server, and just puts it in the Caught-Spam. If it has LOWER than a score
of 15 from the MX, then the MX server didn't put a header on it, so it's
processed here and filed here.

Why do that? Because my users on the sendmail server farm have a whole
variety of score choices they are using, so I want their specific score to
be utilized - but by making the score on the MX 15, I'm saving the
sendmail server from a WHOLE LOT of processing, and nobody's going to have
a default score over 15... so that's a safe number?


--As for the rest, it is mine.

Just as a thought: Since you are running procmail on them anyway, it should 
be possible to have a script in there that reads the desired score and uses 
the score count Spamassassin embeds in the 'X-Spam-Level:' header to filter.


It wouldn't reformat the mail (at least not without a lot of work), but you 
could at least file it differently...
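A minimal sketch of such a script, in Python, is shown below. It assumes the message carries an 'X-Spam-Level:' header made of '*' characters, one per whole point of score, and the function names and sample threshold are invented for the example. Note the granularity: a score of 6.9 shows up as six stars, so fractional thresholds cannot be matched exactly this way.

```python
from email import message_from_string

def spam_level(raw_message: str) -> int:
    """Return the whole-number score encoded as stars in X-Spam-Level (0 if absent)."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Level", "").count("*")

def is_spam_for_user(raw_message: str, required_score: float) -> bool:
    """Compare the star count against one user's own threshold."""
    return spam_level(raw_message) >= required_score

sample = (
    "From: someone@example.com\n"
    "X-Spam-Level: *******\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(spam_level(sample))
print(is_spam_for_user(sample, required_score=6.0))
print(is_spam_for_user(sample, required_score=10.0))
```

Hooked into procmail as an exit-status condition, a script along these lines would let the recipe file the mail per user without running the full scan again, provided the gateway adds the level header to every message it passes on.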


Daniel T. Staal

---
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---


Re: SpamAssassin MX Gateway Server

2006-09-30 Thread jdow

From: Daniel Staal [EMAIL PROTECTED]

--As of September 30, 2006 12:32:41 PM -0500, Russ B. is alleged to have said:


Basically, anything that arrives over 15 in score, will have that
SPAM-STATUS header embedded, so it does NOT run SpamAssassin on this
server, and just puts it in the Caught-Spam. If it has LOWER than a score
of 15 from the MX, then the MX server didn't put a header on it, so it's
processed here and filed here.

Why do that? Because my users on the sendmail server farm have a whole
variety of score choices they are using, so I want their specific score to
be utilized - but by making the score on the MX 15, I'm saving the
sendmail server from a WHOLE LOT of processing, and nobody's going to have
a default score over 15... so that's a safe number?


--As for the rest, it is mine.

Just as a thought: Since you are running procmail on them anyway, it should
be possible to have a script in there that reads the desired score and uses
the score count Spamassassin embeds in the 'X-Spam-Level:' header to filter.


It wouldn't reformat the mail (at least not without a lot of work), but you
could at least file it differently...


If you can have per user rules and system wide Bayes it becomes real
easy to have the per user rules be one line, their spam threshold. Of
course, with per user Bayes you can have far better anti-spam because
you are not dealing with one person's ham is another person's spam.
But it gets to be a maintenance nightmare as the number of users goes
up and the user computer sophistication goes down.

{^_^}