Re: General assistance
Chris Santerre wrote:
> I would like to make a quick comment to everyone who has helped in this thread:
>
> Great job. Seriously. Some good answers here. Can we all take a minute to make sure these answers are posted somewhere on the SA wikis for future reference? It's been a while since we had a push for additions.
>
> http://wiki.apache.org/spamassassin/
> and
> http://www.exit0.us/
>
> Your chance to preserve your helpful info in the anals of history. (That almost sounds painful!)
>
> Thanks!
>
> Chris Santerre

Chris and all,

I apologize for being so slow in getting to this; things came up.

I found a page in the wiki I had not seen, and could not find a link for, titled FasterPerformance. It explains the DNS cache solution. I saw no sense in rewriting an already excellent text. I also added a page titled ChooseYourRules with my thoughts.

Both pages are now linked under "Performance Tips" at http://wiki.apache.org/spamassassin/UsingSpamAssassin

DAve

--
This message was checked by forty monkeys and found to not contain any SPAM whatsoever. Your monkeys may vary
Re: General assistance
Chris Santerre wrote:
> -Original Message-
> From: DAve [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, February 14, 2006 3:14 PM
> To: users@spamassassin.apache.org
> Subject: Re: General assistance
>
> > Chris Santerre wrote:
> > > I would like to make a quick comment to everyone who has helped in this thread:
> > >
> > > Great job. Seriously. Some good answers here. Can we all take a minute to make sure these answers are posted somewhere on the SA wikis for future reference? It's been a while since we had a push for additions.
> > >
> > > http://wiki.apache.org/spamassassin/
> > > and
> > > http://www.exit0.us/
> >
> > Cool, never saw that before.
> >
> > > Your chance to preserve your helpful info in the anals of history. (That almost sounds painful!)
> > >
> > > Thanks!
> >
> > Tell me what parts should be added, and where to put them,
> >
> > Tips and Tricks?
> > Performance Hints?
> > Managing High Load?
> >
> > and I will add what I can.
> >
> > DAve
>
> That's the beauty of a wiki, put it anywhere you like. We can always change it. ;)
>
> --Chris

Don't get me started on wikis, I still have nightmares about faq-o-matics. No one is worse, more negligent, or lazier about documentation than a sysadmin. I know because I am one, and I have two documentation projects I haven't even started yet (whoops).

Anyone who thought that sysadmins would self-document through a wiki had a screw loose or a drinking problem.

But I will stop crying now and endeavor to become part of the solution! ;^)

DAve
RE: General assistance
> -Original Message-
> From: DAve [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, February 14, 2006 3:14 PM
> To: users@spamassassin.apache.org
> Subject: Re: General assistance
>
> Chris Santerre wrote:
> > I would like to make a quick comment to everyone who has helped in this thread:
> >
> > Great job. Seriously. Some good answers here. Can we all take a minute to make sure these answers are posted somewhere on the SA wikis for future reference? It's been a while since we had a push for additions.
> >
> > http://wiki.apache.org/spamassassin/
> > and
> > http://www.exit0.us/
>
> Cool, never saw that before.
>
> > Your chance to preserve your helpful info in the anals of history. (That almost sounds painful!)
> >
> > Thanks!
>
> Tell me what parts should be added, and where to put them,
>
> Tips and Tricks?
> Performance Hints?
> Managing High Load?
>
> and I will add what I can.
>
> DAve

That's the beauty of a wiki, put it anywhere you like. We can always change it. ;)

--Chris
Re: General assistance
Chris Santerre wrote: I would like to make a quick comment to everyone who has helped in this thread: Great job. Seriously. Some good answers here. Can we we all take a minute to make sure these answers are posted somewhere on the SA wiki's for future reference? Its been a while since we had a push for additions. http://wiki.apache.org/spamassassin/ and http://www.exit0.us/ Cool, never saw that before. Your chance to preserve your helpful info in the anals of history. (That almost sounds painful!) Thanks! Tell me what parts should be added, and where to put them, Tips and Tricks? Performance Hints? Managing High Load? and I will add what I can. DAve
RE: General assistance
I would like to make a quick comment to everyone who has helped in this thread:

Great job. Seriously. Some good answers here. Can we all take a minute to make sure these answers are posted somewhere on the SA wikis for future reference? It's been a while since we had a push for additions.

http://wiki.apache.org/spamassassin/
and
http://www.exit0.us/

Your chance to preserve your helpful info in the anals of history. (That almost sounds painful!)

Thanks!

Chris Santerre
SysAdmin and SARE/URIBL ninja
http://www.uribl.com
http://www.rulesemporium.com

> -Original Message-
> From: Ed Russell [mailto:[EMAIL PROTECTED]]
> Sent: Friday, February 10, 2006 4:42 PM
> To: users@spamassassin.apache.org
> Subject: RE: General assistance
>
> I was doing some reading and I am beginning to look into Rules Du Jour. I see there are quite a large number of rulesets to choose from when utilizing this. Does anyone have any advice on which ones would be safe?
>
> Ed
>
> ---
> Talk is cheap since supply always exceeds demand.
> ---
>
> -Original Message-
> From: DAve [mailto:[EMAIL PROTECTED]]
> Sent: Friday, February 10, 2006 4:30 PM
> To: users@spamassassin.apache.org
> Subject: Re: General assistance
>
> Bowie Bailey wrote:
> > DAve wrote:
> >
> >>Ed Russell wrote:
> >>
> >>>2. Once this is in place should I re-activate pyzor, dcc or razor? Is one better than the other? Are there advantages to either?
> >>
> >>I use neither, though I think I am in the minority. I routinely check my spam and I have found that bayes, razor, dcc, and most of the SARE rules catch little if any spam "for me". So I don't run them and save the CPU for additional spamd processes.
> >
> > That's odd. Bayes, Razor2, DCC work quite well for me. Check out my stats from today:
> >
> > TOP SPAM RULES FIRED
> >
> > RANK  RULE NAME                  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
> >    1  RAZOR2_CF_RANGE_51_100      1280      5.02    48.05    83.33    0.98
> >    2  RAZOR2_CHECK                1259      4.94    47.26    81.97    1.15
> >    3  RAZOR2_CF_RANGE_E8_51_100   1164      4.56    43.69    75.78    0.27
> >
> > Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%.
>
> They tagged plenty of spam for me, no doubt about that. But they caught only a few spam that SA wouldn't have caught without them. It is rare that bayes points on top of existing points ever made the score squeak over the threshold.
>
> Not using them however, dropped my CPU, network, and memory requirements so much I could run twice as many spamd processes. Processing time went from an average of 10 seconds (with all SARE rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no razor, no dcc).
>
> All the SARE rules loaded makes spamd run about 45-75mb each, selective SARE rules and I can see spamd drop to 23-35mb. More spamd, faster spamd.
>
> Of course tomorrow, everything could change ;^)
>
> DAve
Re: General assistance
On Feb 14, 2006, at 10:47 AM, DAve wrote:
> Daniel Cañas Montero wrote:
> > On Feb 11, 2006, at 3:14 PM, Ed Russell wrote:
> > > I have to say a heartfelt THANK YOU to everyone who contributed to this thread. My filter is working 500% more efficient than it ever was. I have done the following:
> > >
> > > 1. Installed djbdns and I am using dnscache as I was told. I have increased the cache size to 100 Megabytes and completely disabled logging after determining it was working properly.
> >
> > How do you disable logging completely? I use multilog and filter out all the lines so it logs nothing. Is there a way to tell dnscache not to actually spit anything out?
>
> Only by removing code from dnscache I believe. Most people just limit what, if anything, is picked up by multilog. I've never tried it but it would be interesting to see what
> #svc -d /service/dnscache/log
> would do. That would remove any need to modify your log/run script. I know some people just redirect dnscache output to /dev/null. I sometimes need to see what the stats are for dnscache when checking SA (URIBL SURBL), so I've never done it.
>
> DAve

OK. That is what I do currently... have '-*' in my multilog, but I thought there might be a way to avoid having a 'dummy' multilog process running. I have also tried not starting the multilog (i.e. #svc -d /service/dnscache/log), and it seems to work... but I wasn't sure if it would do anything funny over the long run, so I started it up again.

Maybe this is a dumb question, but is it OK for a process monitored by supervise not to have a corresponding multilog running to capture its output?
Re: General assistance
Daniel Cañas Montero wrote:
> On Feb 11, 2006, at 3:14 PM, Ed Russell wrote:
> > I have to say a heartfelt THANK YOU to everyone who contributed to this thread. My filter is working 500% more efficient than it ever was. I have done the following:
> >
> > 1. Installed djbdns and I am using dnscache as I was told. I have increased the cache size to 100 Megabytes and completely disabled logging after determining it was working properly.
>
> How do you disable logging completely? I use multilog and filter out all the lines so it logs nothing. Is there a way to tell dnscache not to actually spit anything out?

Only by removing code from dnscache, I believe. Most people just limit what, if anything, is picked up by multilog. I've never tried it but it would be interesting to see what
#svc -d /service/dnscache/log
would do. That would remove any need to modify your log/run script. I know some people just redirect dnscache output to /dev/null. I sometimes need to see what the stats are for dnscache when checking SA (URIBL SURBL), so I've never done it.

DAve

> > 2. I have implemented rbl at the MTA level, I use relays.ordb.org and sbl-xbl.spamhaus.org.
> >
> > 3. I have implemented Rules Du Jour. I selected a subset of the SARE rules and misc others.
> >
> > 4. I have turned back on pyzor, razor and dcc.
> >
> > Scanning times are well within tolerance with a minimal impact on delivery time. See below (email addresses removed for privacy):
RE: General assistance
[EMAIL PROTECTED] log]# cat /etc/dnscache/log/run
#!/bin/sh
#exec setuidgid gdnslog multilog t ./main
exec setuidgid gdnslog multilog -*

You can see that, as opposed to

    multilog t ./main

I use

    multilog -*

That will do it. Enjoy.

Ed

---
Talk is cheap since supply always exceeds demand.
---

-Original Message-
From: Daniel Cañas Montero [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 14, 2006 11:14 AM
To: users@spamassassin.apache.org
Subject: Re: General assistance

On Feb 11, 2006, at 3:14 PM, Ed Russell wrote:
> I have to say a heartfelt THANK YOU to everyone who contributed to this thread. My filter is working 500% more efficient than it ever was. I have done the following:
>
> 1. Installed djbdns and I am using dnscache as I was told. I have increased the cache size to 100 Megabytes and completely disabled logging after determining it was working properly.

How do you disable logging completely? I use multilog and filter out all the lines so it logs nothing. Is there a way to tell dnscache not to actually spit anything out?

> 2. I have implemented rbl at the MTA level, I use relays.ordb.org and sbl-xbl.spamhaus.org.
>
> 3. I have implemented Rules Du Jour. I selected a subset of the SARE rules and misc others.
>
> 4. I have turned back on pyzor, razor and dcc.
>
> Scanning times are well within tolerance with a minimal impact on delivery time. See below (email addresses removed for privacy):
Re: General assistance
On Feb 11, 2006, at 3:14 PM, Ed Russell wrote:
> I have to say a heartfelt THANK YOU to everyone who contributed to this thread. My filter is working 500% more efficient than it ever was. I have done the following:
>
> 1. Installed djbdns and I am using dnscache as I was told. I have increased the cache size to 100 Megabytes and completely disabled logging after determining it was working properly.

How do you disable logging completely? I use multilog and filter out all the lines so it logs nothing. Is there a way to tell dnscache not to actually spit anything out?

> 2. I have implemented rbl at the MTA level, I use relays.ordb.org and sbl-xbl.spamhaus.org.
>
> 3. I have implemented Rules Du Jour. I selected a subset of the SARE rules and misc others.
>
> 4. I have turned back on pyzor, razor and dcc.
>
> Scanning times are well within tolerance with a minimal impact on delivery time. See below (email addresses removed for privacy):
Re: General assistance
Bowie Bailey wrote: DAve wrote: Bowie Bailey wrote: DAve wrote: Ed Russell wrote: 2. Once this is in place should I re-activate pzyor, dcc or razor? Is one better than the other? Are there advantages to either? I use neither, though I think I am in the minority. I routinely check my spam and I have found that bayes, rayzor, dcc, and most of the SARE rules catch little if any spam "for me". So I don't run them and save the CPU for additional spamd processes. That's odd. Bayes, Razor2, DCC work quite well for me. Check out my stats from today: Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%. They tagged plenty of spam for me, no doubt about that. But they caught only a few spam that SA wouldn't have caught without them. It is rare that bayes points on top of existing points ever made the score squeek over the threshold. Not using them however, dropped my CPU, network, and memory requirements so much I could run twice as many spamd processes. Processing time went from an average of 10 seconds (with all SARE rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no razor, no dcc). All the SARE rules loaded makes spamd run about 45-75mb each, selective SARE rules and I can see spamd drop to 23-35mb. More spamd, faster spamd. I guess this is a definite case of 'YMMV'. With Bayes, Razor2, DCC, and 15 SARE rulesets, my average scantime is 2.5 seconds, but each process is 46M and I usually only have 2 or 3 running of a max of 8 (although with 1G of ram, I've got plenty of headroom if I need to add more). I have 26 configured spamds this week, I have enough ram to run 40, though I hit the wall with the CPUs at that point. I generally have 15 to 25 running all the time, 24x7. Even late at night I can checkin with the server and find only 2 or three spamd processes sleeping. Each is consuming 30 to 35mb of ram, currently. 
The bottom line is that if you have a low to medium mail volume and a decent server, you can probably turn it all on and not worry too much about it. If you have a high volume of mail, or a slower server, you may need to be a bit more picky with your rulesets and features. My advice is this: Try it with Bayes, Razor2, DCC, and Pyzor. Install any of the SARE rulesets you think might be useful. Then monitor your server and see what happens. You can use the 'top' command to see how much memory is in use and how much each spamd process is using. You should try to configure things such that the server never uses swap. If SA goes into swap, your performance will drop through the floor. To see your average scantime (assuming that SA is logging to syslog), you can use this command string (or drop it in a script file): grep -e 'clean message' -e 'identified spam' /var/log/maillog | perl -ne 'if (/in (\d+\.\d+) seconds/) { $time += $1; $cnt++;} } $avg = $time/$cnt; print "$avg\n"; {' Note that this command string should be all on one line. Your mailreader will probably split it... If everything is working well, you're good. If you are using too much memory, remove some of the extra rules or reduce the number of spamd children. If your scantimes are too slow and the machine is not swapping, then you should experiment with disabling Bayes, DCC, or Razor or removing rules. Also, as others have pointed out, a local caching nameserver on your SA machine can go a long way towards reducing lag from the network tests. That is a good synopsis of what I went through in determining cost(in resources) vs benefit with each SA option/plugin/ruleset. I'm sure most admins have done the same, and it is excellent advice for Ed. I couldn't have said better myself. DAve
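The memory figures traded in this exchange lend themselves to a quick back-of-envelope check of how many spamd children a box can hold before it starts to swap. What follows is only a rough sketch: max_spamd_children is a hypothetical helper (not part of SpamAssassin), and the 256 MB set aside for the OS and MTA is an assumed figure you would replace with your own.

```shell
#!/bin/sh
# Rough estimate of how many spamd children fit in RAM without swapping.
# max_spamd_children is a hypothetical helper, not part of SpamAssassin.
# Arguments: total RAM in MB, RAM reserved for everything else in MB,
# and the observed per-child size in MB (for example, as read from top).
max_spamd_children() {
    total_mb=$1
    reserve_mb=$2
    per_child_mb=$3
    # Nothing fits if the reserve eats all RAM or the child size is bogus.
    if [ "$per_child_mb" -le 0 ] || [ "$total_mb" -le "$reserve_mb" ]; then
        echo 0
        return
    fi
    # Integer division rounds down, which errs on the safe side.
    echo $(( (total_mb - reserve_mb) / per_child_mb ))
}

# Bowie's figures: 1 GB of RAM, 46 MB per child; the 256 MB reserve is assumed.
max_spamd_children 1024 256 46
```

With those inputs the estimate is 16 children, comfortably above the max of 8 mentioned in the thread. The result is only a ceiling for spamd's -m setting; as DAve notes, CPU may become the limit before memory does.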
RE: General assistance
Thanks for the advice, it's well suited. FYI, my average scan time is: 7.73104733769435 I have enabled pyzor, razor and dcc. All looks fine for now. Of course this is a work in progress and I will have to keep a close eye on it. Ed --- Talk is cheap since supply always exceeds demand. --- -Original Message- From: Bowie Bailey [mailto:[EMAIL PROTECTED] Sent: Monday, February 13, 2006 1:20 PM To: users@spamassassin.apache.org Subject: RE: General assistance DAve wrote: > Bowie Bailey wrote: > > DAve wrote: > > > > > Ed Russell wrote: > > > > > > > 2. Once this is in place should I re-activate pzyor, dcc or > > > > razor? Is one better than the other? Are there advantages to > > > > either? > > > > > > I use neither, though I think I am in the minority. I routinely > > > check my spam and I have found that bayes, rayzor, dcc, and most > > > of the SARE rules catch little if any spam "for me". So I don't > > > run them and save the CPU for additional spamd processes. > > > > That's odd. Bayes, Razor2, DCC work quite well for me. Check out > > my stats from today: > > > > Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%. > > They tagged plenty of spam for me, no doubt about that. But they > caught only a few spam that SA wouldn't have caught without them. It > is rare that bayes points on top of existing points ever made the > score squeek over the threshold. > > Not using them however, dropped my CPU, network, and memory > requirements so much I could run twice as many spamd processes. > Processing time went from an average of 10 seconds (with all SARE > rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no > razor, no dcc). > > All the SARE rules loaded makes spamd run about 45-75mb each, > selective SARE rules and I can see spamd drop to 23-35mb. More spamd, > faster spamd. I guess this is a definite case of 'YMMV'. 
With Bayes, Razor2, DCC, and 15 SARE rulesets, my average scantime is 2.5 seconds, but each process is 46M and I usually only have 2 or 3 running of a max of 8 (although with 1G of ram, I've got plenty of headroom if I need to add more). The bottom line is that if you have a low to medium mail volume and a decent server, you can probably turn it all on and not worry too much about it. If you have a high volume of mail, or a slower server, you may need to be a bit more picky with your rulesets and features. My advice is this: Try it with Bayes, Razor2, DCC, and Pyzor. Install any of the SARE rulesets you think might be useful. Then monitor your server and see what happens. You can use the 'top' command to see how much memory is in use and how much each spamd process is using. You should try to configure things such that the server never uses swap. If SA goes into swap, your performance will drop through the floor. To see your average scantime (assuming that SA is logging to syslog), you can use this command string (or drop it in a script file): grep -e 'clean message' -e 'identified spam' /var/log/maillog | perl -ne 'if (/in (\d+\.\d+) seconds/) { $time += $1; $cnt++;} } $avg = $time/$cnt; print "$avg\n"; {' Note that this command string should be all on one line. Your mailreader will probably split it... If everything is working well, you're good. If you are using too much memory, remove some of the extra rules or reduce the number of spamd children. If your scantimes are too slow and the machine is not swapping, then you should experiment with disabling Bayes, DCC, or Razor or removing rules. Also, as others have pointed out, a local caching nameserver on your SA machine can go a long way towards reducing lag from the network tests. -- Bowie
RE: General assistance
DAve wrote: > Bowie Bailey wrote: > > DAve wrote: > > > > > Ed Russell wrote: > > > > > > > 2. Once this is in place should I re-activate pzyor, dcc or > > > > razor? Is one better than the other? Are there advantages to > > > > either? > > > > > > I use neither, though I think I am in the minority. I routinely > > > check my spam and I have found that bayes, rayzor, dcc, and most > > > of the SARE rules catch little if any spam "for me". So I don't > > > run them and save the CPU for additional spamd processes. > > > > That's odd. Bayes, Razor2, DCC work quite well for me. Check out > > my stats from today: > > > > Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%. > > They tagged plenty of spam for me, no doubt about that. But they > caught only a few spam that SA wouldn't have caught without them. It > is rare that bayes points on top of existing points ever made the > score squeek over the threshold. > > Not using them however, dropped my CPU, network, and memory > requirements so much I could run twice as many spamd processes. > Processing time went from an average of 10 seconds (with all SARE > rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no > razor, no dcc). > > All the SARE rules loaded makes spamd run about 45-75mb each, > selective SARE rules and I can see spamd drop to 23-35mb. More spamd, > faster spamd. I guess this is a definite case of 'YMMV'. With Bayes, Razor2, DCC, and 15 SARE rulesets, my average scantime is 2.5 seconds, but each process is 46M and I usually only have 2 or 3 running of a max of 8 (although with 1G of ram, I've got plenty of headroom if I need to add more). The bottom line is that if you have a low to medium mail volume and a decent server, you can probably turn it all on and not worry too much about it. If you have a high volume of mail, or a slower server, you may need to be a bit more picky with your rulesets and features. My advice is this: Try it with Bayes, Razor2, DCC, and Pyzor. 
Install any of the SARE rulesets you think might be useful. Then monitor your server and see what happens. You can use the 'top' command to see how much memory is in use and how much each spamd process is using. You should try to configure things such that the server never uses swap. If SA goes into swap, your performance will drop through the floor. To see your average scantime (assuming that SA is logging to syslog), you can use this command string (or drop it in a script file): grep -e 'clean message' -e 'identified spam' /var/log/maillog | perl -ne 'if (/in (\d+\.\d+) seconds/) { $time += $1; $cnt++;} } $avg = $time/$cnt; print "$avg\n"; {' Note that this command string should be all on one line. Your mailreader will probably split it... If everything is working well, you're good. If you are using too much memory, remove some of the extra rules or reduce the number of spamd children. If your scantimes are too slow and the machine is not swapping, then you should experiment with disabling Bayes, DCC, or Razor or removing rules. Also, as others have pointed out, a local caching nameserver on your SA machine can go a long way towards reducing lag from the network tests. -- Bowie
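For anyone whose mail reader mangles the Perl one-liner above, here is a rough equivalent as a small shell function. It is a sketch under assumptions: avg_scantime is a made-up name, and it expects spamd log lines containing "clean message" or "identified spam" followed by "in N.N seconds", as in the log excerpts posted in this thread. Unlike the one-liner, it prints a message instead of dividing by zero when no lines match, and it rounds to two decimals.

```shell
#!/bin/sh
# avg_scantime is a hypothetical helper: it reads spamd log lines on
# stdin and prints the mean scan time in seconds.
avg_scantime() {
    awk '
        /clean message|identified spam/ {
            # Locate the "in N.N seconds" phrase and pull out the number.
            if (match($0, /in [0-9]+\.[0-9]+ seconds/)) {
                # Skip "in " (3 chars) and drop " seconds" (8 chars).
                total += substr($0, RSTART + 3, RLENGTH - 11)
                count++
            }
        }
        END {
            if (count > 0) printf "%.2f\n", total / count
            else print "no scan times found"
        }
    '
}

# Usage (log path taken from the thread; adjust for your syslog setup):
# avg_scantime < /var/log/maillog
```

Reading from stdin keeps the function usable on rotated or filtered logs, for example by piping in the output of grep for a single day.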
Re: General assistance
Ed Russell wrote:
> I have to say a heartfelt THANK YOU to everyone who contributed to this thread. My filter is working 500% more efficient than it ever was. I have done the following:
>
> 1. Installed djbdns and I am using dnscache as I was told. I have increased the cache size to 100 Megabytes and completely disabled logging after determining it was working properly.
>
> 2. I have implemented rbl at the MTA level, I use relays.ordb.org and sbl-xbl.spamhaus.org.
>
> 3. I have implemented Rules Du Jour. I selected a subset of the SARE rules and misc others.
>
> 4. I have turned back on pyzor, razor and dcc.
>
> Scanning times are well within tolerance with a minimal impact on delivery time. See below (email addresses removed for privacy):
>
> Feb 11 16:10:18 as spamd[4137]: spamd: identified spam (31.3/4.0) for [EMAIL PROTECTED] :99 in 4.5 seconds, 1178 bytes.
> Feb 11 16:10:18 as spamd[363]: spamd: clean message (1.2/4.0) for [EMAIL PROTECTED] :99 in 3.1 seconds, 8939 bytes.
> Feb 11 16:10:19 as spamd[4218]: spamd: clean message (0.0/4.0) for [EMAIL PROTECTED] :99 in 5.4 seconds, 2245 bytes.
>
> I have some final questions though,
>
> a. Can I get any statistics from rblsmtpd (I know this isn't a group devoted to it, but I figured I would ask)? I would like to know how many got dropped and from where.

I don't use it anymore, as my qmail toasters are not allowed traffic from the outside, only from my MailScanner servers. I run Sendmail and do my rbl checks there. But I would think this would get you a quick count:

#cd /var/log/qmail/smtpd/
#cat current [EMAIL PROTECTED] | grep rblsmtpd | wc -l
#cat current [EMAIL PROTECTED] | grep relays.ordb.org | wc -l
#cat current [EMAIL PROTECTED] | grep sbl-xbl.spamhaus.org | wc -l

Script from there forward and you can glean just about as much as you care to sift through. awk, sed, Ruby, or Perl are your friends there. You can check access times on the logs to make sure you are checking today's or yesterday's logs.

> b.
Does anyone have any utilities to get statistics from SA? Such as what rules triggered spam etc etc. I have seen some posts with some interesting looking reports. Currently I only use a hacked together script I wrote to give me the raw amount of spam caught per day which greps "identified spam" on maillog and then gives me a wc -l. I see that has been answered already. Once again, thanks so much to everyone. This group is simply amazing. I second that! DAve Ed -Original Message- From: DAve [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 1:19 PM To: users@spamassassin.apache.org Subject: Re: General assistance Ed Russell wrote: User validation is going to be tough or all but impossible. This box forwards off the mail to an NT box running SL Mail. There is no easy way to get a userlist out of this product. In addition the users change daily and some even use multi-drops. You don't need to get a user list, you just need to ask the destination server if the user exists before accepting the message. This is what milter-ahead does on my MailScanner servers. I process and forward to servers running qmail(my toasters) and Exchange, GroupMail, Groupwise, Sendmail(my clients servers). All respond correctly to milter-ahead. I do not know of a way to duplicate milter-ahead in qmail without requiring something like vpopmail or LDAP. Did you look at using dnscache? That might buy you enough breathing room to shop around for a solution to user verification. DAve Ed --- Talk is cheap since supply always exceeds demand. --- -Original Message- From: DAve [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 12:39 PM To: users@spamassassin.apache.org Subject: Re: General assistance Ed Russell wrote: [EMAIL PROTECTED] smtpd]# spamassassin --version SpamAssassin version 3.1.0 running on Perl version 5.8.7 Spamd running with: OPTIONS="-L -x -d -u nobody -m 45" No user verification or RBL at the MTA level. Absolutely do user verification. 
I can throw out from 20% to 80% of my traffic depending on the current level of dictionary and Joe-Job attacks. Since you are processing ahead of your clients Exchange boxes I'm not sure how you can do that with qmail. I do it on my gateways running MailScanner via milter-ahead, and on my toasters via checkuser in vpopmail. There might be a way to get qmail to check with an Exchange box to validate a user without running vpopmail, but I won't know it. DAve 12:20pm up 4:05, 1 user, load average: 9.49, 9.23, 9.23 313 processes: 300 sleeping, 12 running, 1 zombie, 0 stopped CPU states: 18.9% user, 16.6% system, 0.0% nice, 64.4% idle Mem: 2009856K av, 711560K used, 1298296K free, 353776K shrd, 129268K buff Swap: 2097136K av, 0K used, 20
Re: General assistance
> b. Does anyone have any utilities to get statistics from SA? Such as Can't help you on your first question, but likely someone else can. On the second question, there are two different stats scripts. Confusingly enough they are BOTH named sa_stats.pl. One is distributed with SA itself. I forget the directory where it ends up, but digging for sa_stats.pl should turn it up. The other one was written by Dallas, and is available on the rulesemporium website. I believe both of these just dig through the log to get their answers. Loren
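If all you need is a quick tally before installing either sa_stats.pl, digging through the log by hand is not hard. The sketch below is not one of those scripts; spam_tally is a hypothetical helper that assumes spamd log lines of the form posted earlier in this thread ("spamd: identified spam ..." and "spamd: clean message ..."). It does not report which rules fired.

```shell
#!/bin/sh
# spam_tally is a hypothetical helper, not one of the sa_stats.pl scripts.
# It reads spamd log lines on stdin and prints spam and ham counts.
spam_tally() {
    awk '
        /spamd: identified spam/ { spam++ }
        /spamd: clean message/   { ham++ }
        END {
            total = spam + ham
            # Unset counters print as 0 under %d.
            printf "spam: %d\nham: %d\n", spam, ham
            if (total > 0) printf "spam share: %.1f%%\n", 100 * spam / total
        }
    '
}

# Usage (log path taken from the thread; adjust for your syslog setup):
# spam_tally < /var/log/maillog
```

Because it reads stdin, you can narrow it to one day by filtering first, for example with grep on the date prefix of the syslog lines.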
RE: General assistance
I have to say a heartfelt THANK YOU to everyone who contributed to this thread. My filter is working 500% more efficient that it ever was. I have done the following: 1. Installed djbdns and I am using dnscache as I was told. I have increased the cache size to 100 Megabytes and completely disabled logging after determining it was working properly. 2. I have implemented rbl at the MTA level, I use relays.ordb.org and sbl-xbl.spamhaus.org. 3. I have implemented Rules Du Jour. I selected a subset of the SARE rules and misc others. 4. I have turned back on pyzor, razor and dcc. Scanning times are well within tolerance with a minimal impact on delivery time. See below (email addresses removed for privacy): Feb 11 16:10:18 as spamd[4137]: spamd: identified spam (31.3/4.0) for [EMAIL PROTECTED] :99 in 4.5 seconds, 1178 bytes. Feb 11 16:10:18 as spamd[363]: spamd: clean message (1.2/4.0) for [EMAIL PROTECTED] :99 in 3.1 seconds, 8939 bytes. Feb 11 16:10:19 as spamd[4218]: spamd: clean message (0.0/4.0) for [EMAIL PROTECTED] :99 in 5.4 seconds, 2245 bytes. I have some final questions though, a. Can I get any statistics from rblsmtpd (I know this isn't a group devoted to it, but I figured I would ask)? I would like to know how many got dropped and from where. b. Does anyone have any utilities to get statistics from SA? Such as what rules triggered spam etc etc. I have seen some posts with some interesting looking reports. Currently I only use a hacked together script I wrote to give me the raw amount of spam caught per day which greps "identified spam" on maillog and then gives me a wc -l. Once again, thanks so much to everyone. This group is simply amazing. Ed -Original Message- From: DAve [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 1:19 PM To: users@spamassassin.apache.org Subject: Re: General assistance Ed Russell wrote: > User validation is going to be tough or all but impossible. This box > forwards off the mail to an NT box running SL Mail. 
There is no easy way to > get a userlist out of this product. In addition the users change daily and > some even use multi-drops. You don't need to get a user list, you just need to ask the destination server if the user exists before accepting the message. This is what milter-ahead does on my MailScanner servers. I process and forward to servers running qmail(my toasters) and Exchange, GroupMail, Groupwise, Sendmail(my clients servers). All respond correctly to milter-ahead. I do not know of a way to duplicate milter-ahead in qmail without requiring something like vpopmail or LDAP. Did you look at using dnscache? That might buy you enough breathing room to shop around for a solution to user verification. DAve > > Ed > > > --- > > Talk is cheap since supply always exceeds demand. > > --- > > > -Original Message- > From: DAve [mailto:[EMAIL PROTECTED] > Sent: Friday, February 10, 2006 12:39 PM > To: users@spamassassin.apache.org > Subject: Re: General assistance > > Ed Russell wrote: > >>[EMAIL PROTECTED] smtpd]# spamassassin --version >>SpamAssassin version 3.1.0 >> running on Perl version 5.8.7 >> >> >>Spamd running with: >>OPTIONS="-L -x -d -u nobody -m 45" >> >>No user verification or RBL at the MTA level. > > > Absolutely do user verification. I can throw out from 20% to 80% of my > traffic depending on the current level of dictionary and Joe-Job > attacks. Since you are processing ahead of your clients Exchange boxes > I'm not sure how you can do that with qmail. I do it on my gateways > running MailScanner via milter-ahead, and on my toasters via checkuser > in vpopmail. > > There might be a way to get qmail to check with an Exchange box to > validate a user without running vpopmail, but I won't know it. 
> > DAve > > >> >>12:20pm up 4:05, 1 user, load average: 9.49, 9.23, 9.23 >>313 processes: 300 sleeping, 12 running, 1 zombie, 0 stopped >>CPU states: 18.9% user, 16.6% system, 0.0% nice, 64.4% idle >>Mem: 2009856K av, 711560K used, 1298296K free, 353776K shrd, 129268K >>buff >>Swap: 2097136K av, 0K used, 2097136K free 225380K >>cached >> >>As you can see I have loads of head room as far as memory goes. I was >>looking into integrating RBL into Qmail, but with the very high volume I > > am > >>quite concerned that this will introduce a slowdown. If I increase the >>inbound concurrent rate I eventually run into qmail-scanner problems with >>reformime. Is there anything else I need consider? >> >
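On Ed's question (b) at the top of this message, about getting more than a raw "identified spam" count out of maillog: a minimal sketch, assuming the spamd log lines look exactly like the three samples Ed pasted. The regular expression and the function name summarize are illustrative only, not part of SpamAssassin, and other spamd versions may log a different format.

```python
import re

# Matches lines such as:
#   spamd: identified spam (31.3/4.0) for user:99 in 4.5 seconds, 1178 bytes.
# The pattern is an assumption based on the sample lines in this thread.
LINE_RE = re.compile(
    r"spamd: (identified spam|clean message) "
    r"\((-?[\d.]+)/(-?[\d.]+)\) for .*? in ([\d.]+) seconds, (\d+) bytes"
)

SAMPLE = [
    "Feb 11 16:10:18 as spamd[4137]: spamd: identified spam (31.3/4.0) "
    "for [EMAIL PROTECTED] :99 in 4.5 seconds, 1178 bytes.",
    "Feb 11 16:10:18 as spamd[363]: spamd: clean message (1.2/4.0) "
    "for [EMAIL PROTECTED] :99 in 3.1 seconds, 8939 bytes.",
    "Feb 11 16:10:19 as spamd[4218]: spamd: clean message (0.0/4.0) "
    "for [EMAIL PROTECTED] :99 in 5.4 seconds, 2245 bytes.",
]


def summarize(lines):
    """Count spam and clean verdicts and average the scan time."""
    counts = {"spam": 0, "clean": 0}
    total_seconds = 0.0
    total_bytes = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a spamd verdict line
        kind = "spam" if match.group(1) == "identified spam" else "clean"
        counts[kind] += 1
        total_seconds += float(match.group(4))
        total_bytes += int(match.group(5))
    scanned = counts["spam"] + counts["clean"]
    return {
        "spam": counts["spam"],
        "clean": counts["clean"],
        "avg_seconds": total_seconds / scanned if scanned else 0.0,
        "total_bytes": total_bytes,
    }
```

To run it against a real log, pass an open file object instead of SAMPLE, since the function only needs an iterable of lines.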
RE: General assistance
You are completely correct, qmail-scanner does use spamc to talk to the already running spamd. I just had trouble explaining what the setup was I may indeed look into having procmail be the agent for Spamassassin. As for automatic deletion, well that's a decision we made and for the most part it works. We just ensure that we are not too aggressive on the rules. Ed -Original Message- From: jdow [mailto:[EMAIL PROTECTED] Sent: Saturday, February 11, 2006 12:28 AM To: users@spamassassin.apache.org Subject: Re: General assistance No, Ed, qmail-scanner should not initiate spamd. It should use spamc to call the already running spamd. I hope that is what you mean. That is what stood the hairs on end. It made me wonder if you really knew what was going on. {o.o} And seriously, if you are using procmail it's perhaps better to fire off spamc from procmail. That way you can skip SA scanning for some specific addresses, if you want. Or you can skip SA scanning if the message size is too big. If you're running procmail anyway it might as well be the agent for running SpamAssassin. That way you are SURE the markups are there for when you delegate the spam to /dev/null. And as a general rule I believe dumping mail to /dev/null is asking for "I sent you the ebay notifications you needed! I can't help it if your spam filter deleted them! Why'd you give me a bad review, you [EMAIL PROTECTED] 3-)(#$*&&&!" {o.o} - Original Message - From: "Ed Russell" <[EMAIL PROTECTED]> To: Sent: 2006 February, 10, Friday 20:11 Subject: RE: General assistance >I think you are confused as to how I have set this up. Qmail-scanner is my > replacement qmail queue. Qmail simply receives mail from the outside world, > then passes it to qmail-scanner for processing. Qmail-scanner initiates > spamd which scans the mail and off it goes. From there procmail will look > at the mail and determine if the spam status is marked in the header, if yes > it kills the mail, if not it passes it along. 
Keep in mind no users > whatsoever live on this box. It is as I mentioned a pass through filter. > > > > -Original Message- > From: jdow [mailto:[EMAIL PROTECTED] > Sent: Friday, February 10, 2006 10:55 PM > To: users@spamassassin.apache.org > Subject: Re: General assistance > > From: "Ed Russell" <[EMAIL PROTECTED]> > >> If everyone would indulge me I would like to put forth the setup I am >> utilizing and get some feedback. I have a box that I have been using for >> some time which acts as a pass-through filter for many domains (currently >> about 100) for spam, this is a fairly high traffic server processing about >> 150,000 to 200,000 messages per day. I use the following method. >> >> Based upon a redhat 6.2 box running kernel 2.2.26, PIV with 2 Gigs of RAM. >> >> Qmail runs which accepts the email from the world (with a >> concurrencyincoming of 100) and passes it through qmail-scanner (which > calls >> spamd) and spamassassin which checks the email and writes spam status to > the >> header. Each message gets then passed through a procmail filter which > will >> delete it if it is spam. The procmail filter is: > > I note the other answers and thought I'd comment because the above > description of your mail topology raised the hairs on the back of my > neck. (And that takes doing considering their length. {^_-}) > > First I not you say Qmail (it's own punishment) feets qmail-scanner. > The qmail-scanner calls spamd? Naw, can't DO that. AND it calls > spamassassin? That's even stranger. But then it goes to procmail for > the delivery. > > My topology is somewhat different but useful. If you are using qmail-scanner > only to make the spamassassin run and the procmail run then jettison it > and go to procmail directly. That MAY reduce the machine load a little. > Also make sure spamd is running, exactly once, from your /etc/init.d > files or the equivalent on BSDs. You'd then use spamc to get to the > SpamAssassin run. You show some data below. 
(I am not sure what the > EXITCODE is supposed to do for you. I never set it here. But that may > be because I use procmail alone. It exits and mail is "delivered" either > to a diversion directory, /dev/null, or the user's mailbox.) > > Anyway, you can call spamc from inside procmail this way: > > :0 > * < 50 > * !^List-Id: .*(spamassassin\.apache.\org) > | /usr/bin/spamc -t 150 -u $USER > >> :0 >> * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\* >> { >>EXITCODE=99 >>:0 >>/dev/null >> } >> >> :0 >> * ^X-Spam-Status: Yes >>
Re: General assistance
On Friday, 10 February 2006 22:42 Ed Russell wrote:
> I was doing some reading and I am beginning to look into Rules Du
> Jour. I see there are quite a large number of rulesets to choose
> from when utilizing this. Does anyone have any advice on what ones
> would be safe?

I use these:

SARE_ADULT
SARE_OBFU0
SARE_OBFU1
SARE_URI0
SARE_REDIRECT_POST300
SARE_HTML0
SARE_HEADER0
SARE_SPECIFIC
SARE_BML
SARE_FRAUD
SARE_SPOOF
SARE_GENLSUBJ0
SARE_UNSUB
SARE_WHITELIST_RCVD
SARE_WHITELIST_SPF
ZMI_GERMAN

The last one is specific to German-language spam.

Additionally, I use the blacklist by William Stearns for postfix, running a cron job:

    rsync -qL rsync.sa-blacklist.stearns.org::wstearns/sa-blacklist/sa-blacklist.current.reject /etc/postfix/sender_blacklist ; postmap /etc/postfix/sender_blacklist

That's much better than the blacklist by SARE, as it consumes less memory and is faster: a drop by the MTA is generally faster than handing the message over to SA.

Regards, zmi

-- 
// Michael Monnerie, Ing.BSc --- it-management Michael Monnerie
// http://zmi.at Tel: 0660/4156531 Linux 2.6.11
// PGP Key: "lynx -source http://zmi.at/zmi2.asc | gpg --import"
// Fingerprint: EB93 ED8A 1DCD BB6C F952 F7F4 3911 B933 7054 5879
// Keyserver: www.keyserver.net Key-ID: 0x70545879
Re: General assistance
On Friday, 10 February 2006 19:32 Ed Russell wrote:
> 1. Does anyone have an opinion as to what RBL to contact? I
> know there are quite a few.

sbl-xbl.spamhaus.org, multi.surbl.org, safe.dnsbl.sorbs.net, dnsbl.njabl.org, bl.spamcop.net, relays.ordb.org

I use those at MTA level. That dropped 62,000 messages, and only 378 spams were detected by SA during that time. I guess that saved a lot of CPU.

Since you seem to have a problem with DNS queries ("if I disable RBL checks and razor, pyzor and dcc the delay goes away"), I would suggest:

- do the RBL checks at the MTA already
- get permission from the RBL maintainers to make a zone transfer to your box, and run a local named or whatever. That way you only have local DNS queries, which should help a lot.

> 2. Once this is in place should I re-activate pyzor, dcc or
> razor? Is one better than the other? Are there advantages to
> either?

Each of them is different; together they help a lot. I use all of them, but I'm not in a situation where I have problems with delay. First try RBL at the MTA, and possibly you will have enough CPU cycles left then to reactivate those checks.

Regards, zmi

-- 
// Michael Monnerie, Ing.BSc --- it-management Michael Monnerie
// http://zmi.at Tel: 0660/4156531 Linux 2.6.11
// PGP Key: "lynx -source http://zmi.at/zmi2.asc | gpg --import"
// Fingerprint: EB93 ED8A 1DCD BB6C F952 F7F4 3911 B933 7054 5879
// Keyserver: www.keyserver.net Key-ID: 0x70545879
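For anyone wondering why RBL checks are so sensitive to DNS speed: each check is one DNS lookup of the connecting IP, with its octets reversed, under the list's zone. A minimal sketch of that convention follows. The function names are made up for illustration, the zone is just one of those named in this thread (check that a zone is still operating before using it), and the resolver argument exists only so the lookup can be swapped out when testing.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name an IPv4 DNSBL check looks up.

    192.0.2.10 checked against example.zone becomes
    10.2.0.192.example.zone
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the DNSBL answers with an A record for this IP.

    A listed address resolves (conventionally to 127.0.0.x); an
    unlisted one fails to resolve, which surfaces as socket.gaierror.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

Every one of those lookups goes through whatever resolver the box uses, which is why a local cache or a local copy of the zone takes so much latency out of the path.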
Re: General assistance
No, Ed, qmail-scanner should not initiate spamd. It should use spamc to call the already running spamd. I hope that is what you mean. That is what stood the hairs on end. It made me wonder if you really knew what was going on. {o.o} And seriously, if you are using procmail it's perhaps better to fire off spamc from procmail. That way you can skip SA scanning for some specific addresses, if you want. Or you can skip SA scanning if the message size is too big. If you're running procmail anyway it might as well be the agent for running SpamAssassin. That way you are SURE the markups are there for when you delegate the spam to /dev/null. And as a general rule I believe dumping mail to /dev/null is asking for "I sent you the ebay notifications you needed! I can't help it if your spam filter deleted them! Why'd you give me a bad review, you [EMAIL PROTECTED] 3-)(#$*&&&!" {o.o} - Original Message - From: "Ed Russell" <[EMAIL PROTECTED]> To: Sent: 2006 February, 10, Friday 20:11 Subject: RE: General assistance I think you are confused as to how I have set this up. Qmail-scanner is my replacement qmail queue. Qmail simply receives mail from the outside world, then passes it to qmail-scanner for processing. Qmail-scanner initiates spamd which scans the mail and off it goes. From there procmail will look at the mail and determine if the spam status is marked in the header, if yes it kills the mail, if not it passes it along. Keep in mind no users whatsoever live on this box. It is as I mentioned a pass through filter. -Original Message- From: jdow [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 10:55 PM To: users@spamassassin.apache.org Subject: Re: General assistance From: "Ed Russell" <[EMAIL PROTECTED]> If everyone would indulge me I would like to put forth the setup I am utilizing and get some feedback. 
I have a box that I have been using for some time which acts as a pass-through filter for many domains (currently about 100) for spam, this is a fairly high traffic server processing about 150,000 to 200,000 messages per day. I use the following method. Based upon a redhat 6.2 box running kernel 2.2.26, PIV with 2 Gigs of RAM. Qmail runs which accepts the email from the world (with a concurrencyincoming of 100) and passes it through qmail-scanner (which calls spamd) and spamassassin which checks the email and writes spam status to the header. Each message gets then passed through a procmail filter which will delete it if it is spam. The procmail filter is: I note the other answers and thought I'd comment because the above description of your mail topology raised the hairs on the back of my neck. (And that takes doing considering their length. {^_-}) First I not you say Qmail (it's own punishment) feets qmail-scanner. The qmail-scanner calls spamd? Naw, can't DO that. AND it calls spamassassin? That's even stranger. But then it goes to procmail for the delivery. My topology is somewhat different but useful. If you are using qmail-scanner only to make the spamassassin run and the procmail run then jettison it and go to procmail directly. That MAY reduce the machine load a little. Also make sure spamd is running, exactly once, from your /etc/init.d files or the equivalent on BSDs. You'd then use spamc to get to the SpamAssassin run. You show some data below. (I am not sure what the EXITCODE is supposed to do for you. I never set it here. But that may be because I use procmail alone. It exits and mail is "delivered" either to a diversion directory, /dev/null, or the user's mailbox.) 
Anyway, you can call spamc from inside procmail this way:

:0
* < 50
* !^List-Id: .*(spamassassin\.apache\.org)
| /usr/bin/spamc -t 150 -u $USER

> :0
> * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
> {
>   EXITCODE=99
>   :0
>   /dev/null
> }
>
> :0
> * ^X-Spam-Status: Yes
> {
>   EXITCODE=99
>   :0
>   /dev/null
> }
>
> :0
> * ^^rom[ ]
> {
>   LOG="*** Dropped F off From_ header! Fixing up. "
>
>   :0 fhw
>   | sed -e '1s/^/F/'
> }
>
> :0
> /dev/null
>
> Mail that is clean gets passed off to a second qmail install which then
> delivers the mail to our servers using smtproutes.

Ouch. And what is that final redirect of EVERYTHING to /dev/null? I just let procmail deliver it.

{o.o}
RE: General assistance
I think you are confused as to how I have set this up. Qmail-scanner is my replacement qmail queue. Qmail simply receives mail from the outside world, then passes it to qmail-scanner for processing. Qmail-scanner initiates spamd which scans the mail and off it goes. From there procmail will look at the mail and determine if the spam status is marked in the header, if yes it kills the mail, if not it passes it along. Keep in mind no users whatsoever live on this box. It is as I mentioned a pass through filter. -Original Message- From: jdow [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 10:55 PM To: users@spamassassin.apache.org Subject: Re: General assistance From: "Ed Russell" <[EMAIL PROTECTED]> > If everyone would indulge me I would like to put forth the setup I am > utilizing and get some feedback. I have a box that I have been using for > some time which acts as a pass-through filter for many domains (currently > about 100) for spam, this is a fairly high traffic server processing about > 150,000 to 200,000 messages per day. I use the following method. > > Based upon a redhat 6.2 box running kernel 2.2.26, PIV with 2 Gigs of RAM. > > Qmail runs which accepts the email from the world (with a > concurrencyincoming of 100) and passes it through qmail-scanner (which calls > spamd) and spamassassin which checks the email and writes spam status to the > header. Each message gets then passed through a procmail filter which will > delete it if it is spam. The procmail filter is: I note the other answers and thought I'd comment because the above description of your mail topology raised the hairs on the back of my neck. (And that takes doing considering their length. {^_-}) First I not you say Qmail (it's own punishment) feets qmail-scanner. The qmail-scanner calls spamd? Naw, can't DO that. AND it calls spamassassin? That's even stranger. But then it goes to procmail for the delivery. My topology is somewhat different but useful. 
If you are using qmail-scanner only to make the spamassassin run and the procmail run then jettison it and go to procmail directly. That MAY reduce the machine load a little. Also make sure spamd is running, exactly once, from your /etc/init.d files or the equivalent on BSDs. You'd then use spamc to get to the SpamAssassin run. You show some data below. (I am not sure what the EXITCODE is supposed to do for you. I never set it here. But that may be because I use procmail alone. It exits and mail is "delivered" either to a diversion directory, /dev/null, or the user's mailbox.) Anyway, you can call spamc from inside procmail this way: :0 * < 50 * !^List-Id: .*(spamassassin\.apache.\org) | /usr/bin/spamc -t 150 -u $USER > :0 > * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\* > { >EXITCODE=99 >:0 >/dev/null > } > > :0 > * ^X-Spam-Status: Yes > { >EXITCODE=99 >:0 >/dev/null > } > > :0 > * ^^rom[ ] > { > LOG="*** Dropped F off From_ header! Fixing up. " > > :0 fhw > | sed -e '1s/^/F/' > } > > :0 > /dev/null > > Mail that is clean gets passed off to a second qmail install which then > delivers the mail to our servers using smtproutes. Ouch. And what is that final redirect of EVERYTHING to /dev/null? I just let procmail deliver it. {o.o}
Re: General assistance
From: "Ed Russell" <[EMAIL PROTECTED]> If everyone would indulge me I would like to put forth the setup I am utilizing and get some feedback. I have a box that I have been using for some time which acts as a pass-through filter for many domains (currently about 100) for spam, this is a fairly high traffic server processing about 150,000 to 200,000 messages per day. I use the following method. Based upon a redhat 6.2 box running kernel 2.2.26, PIV with 2 Gigs of RAM. Qmail runs which accepts the email from the world (with a concurrencyincoming of 100) and passes it through qmail-scanner (which calls spamd) and spamassassin which checks the email and writes spam status to the header. Each message gets then passed through a procmail filter which will delete it if it is spam. The procmail filter is: I note the other answers and thought I'd comment because the above description of your mail topology raised the hairs on the back of my neck. (And that takes doing considering their length. {^_-}) First I not you say Qmail (it's own punishment) feets qmail-scanner. The qmail-scanner calls spamd? Naw, can't DO that. AND it calls spamassassin? That's even stranger. But then it goes to procmail for the delivery. My topology is somewhat different but useful. If you are using qmail-scanner only to make the spamassassin run and the procmail run then jettison it and go to procmail directly. That MAY reduce the machine load a little. Also make sure spamd is running, exactly once, from your /etc/init.d files or the equivalent on BSDs. You'd then use spamc to get to the SpamAssassin run. You show some data below. (I am not sure what the EXITCODE is supposed to do for you. I never set it here. But that may be because I use procmail alone. It exits and mail is "delivered" either to a diversion directory, /dev/null, or the user's mailbox.) 
Anyway, you can call spamc from inside procmail this way:

:0
* < 50
* !^List-Id: .*(spamassassin\.apache\.org)
| /usr/bin/spamc -t 150 -u $USER

> :0
> * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
> {
>   EXITCODE=99
>   :0
>   /dev/null
> }
>
> :0
> * ^X-Spam-Status: Yes
> {
>   EXITCODE=99
>   :0
>   /dev/null
> }
>
> :0
> * ^^rom[ ]
> {
>   LOG="*** Dropped F off From_ header! Fixing up. "
>
>   :0 fhw
>   | sed -e '1s/^/F/'
> }
>
> :0
> /dev/null
>
> Mail that is clean gets passed off to a second qmail install which then
> delivers the mail to our servers using smtproutes.

Ouch. And what is that final redirect of EVERYTHING to /dev/null? I just let procmail deliver it.

{o.o}
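For readers who do not speak procmail, the two spam recipes in Ed's filter boil down to a pair of header tests. A rough sketch of that decision in Python, assuming the message has already been through SpamAssassin and carries its X-Spam-Level and X-Spam-Status headers; the function name should_drop is invented for illustration, and this is not how procmail itself is implemented.

```python
from email import message_from_string


def should_drop(raw_message, star_threshold=15):
    """Mirror the two header tests in the recipes quoted above.

    Drop when X-Spam-Level starts with at least `star_threshold`
    asterisks (15 in Ed's recipe), or when X-Spam-Status starts
    with "Yes". Matching is case-insensitive, as procmail's is by
    default.
    """
    msg = message_from_string(raw_message)
    level = (msg.get("X-Spam-Level") or "").strip()
    status = (msg.get("X-Spam-Status") or "").strip()
    if level.startswith("*" * star_threshold):
        return True
    return status.lower().startswith("yes")
```

Note that this only models the verdict. Whether a dropped message should really go to /dev/null rather than a diversion folder is the policy question raised above.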
Re: General assistance
Ed Russell wrote: I was doing some reading and I am beginning to look into Rules Du Jour. I see there are quite a large number of rulesets to choose from when utilizing this. Does anyone have any advice on what ones would be safe? My experience with SARE has been they try very hard to classify their rules based on their ability to hit spam correctly, and ham incorrectly. After the first year using SARE I now just trust in their judgment ;^). I generally get the zero rules when there is a choice (rules that hit only spam in testing, named with a zero) and try them first. I choose the rules based on the spam I am seeing slip through. I generally never adjust their assigned points either. These have always been good performers for me. 70_sare_html0.cf 70_sare_adult.cf 70_sare_oem.cf 70_sare_obfu.cf This one is proving useful over the past few weeks, 70_sare_stocks.cf I almost always grab any new rules announced and give them a try for a few days as well. DAve Ed --- Talk is cheap since supply always exceeds demand. --- -Original Message- From: DAve [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 4:30 PM To: users@spamassassin.apache.org Subject: Re: General assistance Bowie Bailey wrote: DAve wrote: Ed Russell wrote: 2. Once this is in place should I re-activate pzyor, dcc or razor? Is one better than the other? Are there advantages to either? I use neither, though I think I am in the minority. I routinely check my spam and I have found that bayes, rayzor, dcc, and most of the SARE rules catch little if any spam "for me". So I don't run them and save the CPU for additional spamd processes. That's odd. Bayes, Razor2, DCC work quite well for me. Check out my stats from today: TOP SPAM RULES FIRED RANKRULE NAME COUNT %OFRULES %OFMAIL %OFSPAM %OFHAM 1RAZOR2_CF_RANGE_51_100 1280 5.02 48.05 83.33 0.98 2RAZOR2_CHECK 1259 4.94 47.26 81.97 1.15 3RAZOR2_CF_RANGE_E8_51_1001164 4.56 43.69 75.78 0.27 Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%. 
They tagged plenty of spam for me, no doubt about that. But they caught only a few spam that SA wouldn't have caught without them. It is rare that bayes points on top of existing points ever made the score squeek over the threshold. Not using them however, dropped my CPU, network, and memory requirements so much I could run twice as many spamd processes. Processing time went from an average of 10 seconds (with all SARE rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no razor, no dcc). All the SARE rules loaded makes spamd run about 45-75mb each, selective SARE rules and I can see spamd drop to 23-35mb. More spamd, faster spamd. Of course tommorrow, everything could change ;^) DAve
Re: General assistance
Joey wrote:
> Dave,
>
> What parameters are you using for logging with the caching name server?
> I currently use this:
>
> logging {
>     category lame-servers { null; };
> };
>
> Thanks,
> Joey

I was speaking of dnscache, the program, not dnscache as in "a caching DNS server". See http://cr.yp.to/djbdns.html. It can log in a very verbose way, generating gigabytes of log files a day. What you have above is for Bind.

DAve

-Original Message-
From: DAve [mailto:[EMAIL PROTECTED]
Sent: Friday, February 10, 2006 12:28 PM
To: users@spamassassin.apache.org
Subject: Re: General assistance

Ed Russell wrote:

If everyone would indulge me I would like to put forth the setup I am utilizing and get some feedback. I have a box that I have been using for some time which acts as a pass-through filter for many domains (currently about 100) for spam, this is a fairly high traffic server processing about 150,000 to 200,000 messages per day. I use the following method.

Based upon a redhat 6.2 box running kernel 2.2.26, PIV with 2 Gigs of RAM.

Qmail runs which accepts the email from the world (with a concurrencyincoming of 100) and passes it through qmail-scanner (which calls spamd) and spamassassin which checks the email and writes spam status to the header. Each message gets then passed through a procmail filter which will delete it if it is spam. The procmail filter is:

:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
{
  EXITCODE=99
  :0
  /dev/null
}

:0
* ^X-Spam-Status: Yes
{
  EXITCODE=99
  :0
  /dev/null
}

:0
* ^^rom[ ]
{
  LOG="*** Dropped F off From_ header! Fixing up. "

  :0 fhw
  | sed -e '1s/^/F/'
}

:0
/dev/null

Mail that is clean gets passed off to a second qmail install which then delivers the mail to our servers using smtproutes.

This has been working fine for a few years now, but recently we have experienced major delays in the processing of email. Due to the very high volume pretty much all the time the system is handling 100 concurrent incoming pieces of email.
Of course with everything else going on it is not uncommon for this system to have up to 400 processes running. Sometimes mail can take hours to get through to its destination. What I have discovered is that if I disable RBL checks and razor, pyzor and dcc the delay goes away. However, the effectiveness of the filter reduces. Am I completely off base in the way I have this all setup? I have went with a higher speed HD to increase the threshold on file I/O. Can I tune the performance of razor etc while maintaining delivery time? Is there anything else I should be considering? If I have not explained things well or more information is needed I will certainly provide anything. Thanks Since you are running qmail, consider doing your rbl checks in qmail-smtpd. No sense scanning a message if you can drop it at the door first. Also, are your running dnscache? I run dnscache on all my servers, web, webmail, toasters, etc. It can speed things up considerably as it will cache your RBL lookups, SURBL lookups, etc. It's a nice thing to do for the URIBL and SURBL folks too. If you do run dnscache, consider turning logging off once you are configured and satisfied it works as intended. dnscache can keep a disk pretty busy with it's potential to log a lot of data. DAve
Re: General assistance
> I was doing some reading and I am beginning to look into Rules Du Jour.
> I see there are quite a large number of rulesets to choose from when
> utilizing this. Does anyone have any advice on what ones would be safe?

I use these:

SARE_ADULT
SARE_BAYES_POISON_NXM
SARE_FRAUD
SARE_HEADER0
SARE_HEADER1
SARE_HTML0
SARE_OBFU0
SARE_OEM
SARE_RANDOM
SARE_REDIRECT_POST300
SARE_SPAMCOP_TOP200
SARE_SPECIFIC
SARE_SPOOF
SARE_STOCKS
SARE_WHITELIST_RCVD
SARE_WHITELIST_SPF

This is on a server with 165 domains and several hundred users. In this environment, I'd rather create false negatives than false positives, hence my choice of rulesets.

On my box at home, where it's just me and my wife receiving mail, I add these as well:

BOGUSVIRUS
SARE_BML
SARE_EVILNUMBERS0
SARE_GENLSUBJ0
SARE_URI0
TRIPWIRE

On the work box, I use the SBL/XBL lists from Spamhaus and bogusmx.rfc-ignorant.org at the MTA level. At home, I add dynablock.njabl.org, dsn.rfc-ignorant.org, blackholes.mail-abuse.org, relays.mail-abuse.org, dialups.mail-abuse.org, and ws.surbl.org, pretty much in that order. The lower ranked ones rarely trigger (and they're probably redundant, but I don't really care).
RE: General assistance
Dave, What paramters are you using for logging with the caching name server? I currently use this: logging { category lame-servers { null; }; }; Thanks, Joey -Original Message- From: DAve [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 12:28 PM To: users@spamassassin.apache.org Subject: Re: General assistance Ed Russell wrote: > If everyone would indulge me I would like to put forth the setup I am > utilizing and get some feedback. I have a box that I have been using for > some time which acts as a pass-through filter for many domains > (currently about 100) for spam, this is a fairly high traffic server > processing about 150,000 to 200,000 messages per day. I use the following method. > > Based upon a redhat 6.2 box running kernel 2.2.26, PIV with 2 Gigs of RAM. > > Qmail runs which accepts the email from the world (with a > concurrencyincoming of 100) and passes it through qmail-scanner (which > calls > spamd) and spamassassin which checks the email and writes spam status > to the header. Each message gets then passed through a procmail > filter which will delete it if it is spam. The procmail filter is: > > :0 > * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\* { > EXITCODE=99 > :0 > /dev/null > } > > :0 > * ^X-Spam-Status: Yes > { > EXITCODE=99 > :0 > /dev/null > } > > :0 > * ^^rom[ ] > { > LOG="*** Dropped F off From_ header! Fixing up. " > > :0 fhw > | sed -e '1s/^/F/' > } > > :0 > /dev/null > > Mail that is clean gets passed off to a second qmail install which > then delivers the mail to our servers using smtproutes. > > This has been working fine for a few years now, but recently we have > experienced major delays in the processing of email. Due to the very > high volume pretty much all the time the system is handling 100 > concurrent incoming pieces of email. Of course with everything else > going on it is not uncommon for this system to have up to 400 > processes running. Sometimes mail can take hours to get through to > its destination. 
What I have discovered is that if I disable RBL > checks and razor, pyzor and dcc the delay goes away. However, the effectiveness of the filter reduces. > > Am I completely off base in the way I have this all setup? I have > went with a higher speed HD to increase the threshold on file I/O. > Can I tune the performance of razor etc while maintaining delivery > time? Is there anything else I should be considering? If I have not > explained things well or more information is needed I will certainly provide anything. > > Thanks Since you are running qmail, consider doing your rbl checks in qmail-smtpd. No sense scanning a message if you can drop it at the door first. Also, are your running dnscache? I run dnscache on all my servers, web, webmail, toasters, etc. It can speed things up considerably as it will cache your RBL lookups, SURBL lookups, etc. It's a nice thing to do for the URIBL and SURBL folks too. If you do run dnscache, consider turning logging off once you are configured and satisfied it works as intended. dnscache can keep a disk pretty busy with it's potential to log a lot of data. DAve
RE: General assistance
I was doing some reading and I am beginning to look into Rules Du Jour. I see there are quite a large number of rulesets to choose from when utilizing this. Does anyone have any advice on what ones would be safe? Ed --- Talk is cheap since supply always exceeds demand. --- -Original Message- From: DAve [mailto:[EMAIL PROTECTED] Sent: Friday, February 10, 2006 4:30 PM To: users@spamassassin.apache.org Subject: Re: General assistance Bowie Bailey wrote: > DAve wrote: > >>Ed Russell wrote: >> >>>2. Once this is in place should I re-activate pzyor, dcc or razor? >>>Is one better than the other? Are there advantages to either? >> >>I use neither, though I think I am in the minority. I routinely check >> my spam and I have found that bayes, rayzor, dcc, and most of the >>SARE rules catch little if any spam "for me". So I don't run them and >>save the CPU for additional spamd processes. > > > That's odd. Bayes, Razor2, DCC work quite well for me. Check out my > stats from today: > > TOP SPAM RULES FIRED > > RANKRULE NAME COUNT %OFRULES %OFMAIL %OFSPAM > %OFHAM > >1RAZOR2_CF_RANGE_51_100 1280 5.02 48.05 83.33 > 0.98 >2RAZOR2_CHECK 1259 4.94 47.26 81.97 > 1.15 >3RAZOR2_CF_RANGE_E8_51_1001164 4.56 43.69 75.78 > 0.27 > > > Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%. > They tagged plenty of spam for me, no doubt about that. But they caught only a few spam that SA wouldn't have caught without them. It is rare that bayes points on top of existing points ever made the score squeek over the threshold. Not using them however, dropped my CPU, network, and memory requirements so much I could run twice as many spamd processes. Processing time went from an average of 10 seconds (with all SARE rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no razor, no dcc). All the SARE rules loaded makes spamd run about 45-75mb each, selective SARE rules and I can see spamd drop to 23-35mb. More spamd, faster spamd. 
Of course tomorrow, everything could change ;^) DAve
Re: General assistance
Bowie Bailey wrote:
> DAve wrote:
>> Ed Russell wrote:
>>> 2. Once this is in place should I re-activate pyzor, dcc or razor?
>>> Is one better than the other? Are there advantages to either?
>>
>> I use neither, though I think I am in the minority. I routinely check
>> my spam and I have found that bayes, razor, dcc, and most of the
>> SARE rules catch little if any spam "for me". So I don't run them and
>> save the CPU for additional spamd processes.
>
> That's odd. Bayes, Razor2, DCC work quite well for me. Check out my
> stats from today:
>
> TOP SPAM RULES FIRED
>
> RANK  RULE NAME                  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
>    1  RAZOR2_CF_RANGE_51_100      1280      5.02    48.05    83.33    0.98
>    2  RAZOR2_CHECK                1259      4.94    47.26    81.97    1.15
>    3  RAZOR2_CF_RANGE_E8_51_100   1164      4.56    43.69    75.78    0.27
>
> Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%.

They tagged plenty of spam for me, no doubt about that. But they caught only a few spam that SA wouldn't have caught without them. It is rare that bayes points on top of existing points ever made the score squeak over the threshold.

Not using them, however, dropped my CPU, network, and memory requirements so much I could run twice as many spamd processes. Processing time went from an average of 10 seconds (with all SARE rules, bayes, DCC, Razor) to 2 seconds (limited SARE, no bayes, no razor, no dcc). With all the SARE rules loaded, spamd runs at about 45-75 MB each; with selective SARE rules I can see spamd drop to 23-35 MB.

More spamd, faster spamd.

Of course tomorrow, everything could change ;^)

DAve
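To put rough numbers on "more spamd, faster spamd": a back-of-the-envelope sketch using the upper ends of the per-child sizes and the two scan times quoted above. The 1500 MB RAM budget is purely an assumed figure for illustration, and the model ignores CPU, disk and network limits, so treat the output as a ceiling rather than a prediction.

```python
SECONDS_PER_DAY = 86400


def spamd_capacity(ram_budget_mb, mb_per_child, seconds_per_message):
    """Return (children that fit in RAM, messages per day at best).

    Assumes every child is always busy and each message takes
    exactly `seconds_per_message`; real throughput will be lower.
    """
    children = ram_budget_mb // mb_per_child
    per_day = children / seconds_per_message * SECONDS_PER_DAY
    return children, per_day


# All rules plus bayes/razor/dcc: 75 MB per child, 10 s per message.
heavy = spamd_capacity(1500, 75, 10)   # (20, 172800.0)

# Selective rules, no network checks: 35 MB per child, 2 s per message.
light = spamd_capacity(1500, 35, 2)    # (42, 1814400.0)
```

Under those assumptions the heavy configuration tops out below the 150,000 to 200,000 messages a day mentioned earlier in the thread, while the light one has roughly ten times the headroom.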
RE: General assistance
DAve wrote:
> Ed Russell wrote:
> >
> > 2. Once this is in place should I re-activate pyzor, dcc or razor?
> > Is one better than the other? Are there advantages to either?
>
> I use none of them, though I think I am in the minority. I routinely check
> my spam and I have found that bayes, razor, dcc, and most of the
> SARE rules catch little if any spam "for me". So I don't run them and
> save the CPU for additional spamd processes.

That's odd. Bayes, Razor2, DCC work quite well for me. Check out my stats from today:

TOP SPAM RULES FIRED

RANK  RULE NAME                  COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
   1  RAZOR2_CF_RANGE_51_100      1280      5.02    48.05    83.33    0.98
   2  RAZOR2_CHECK                1259      4.94    47.26    81.97    1.15
   3  RAZOR2_CF_RANGE_E8_51_100   1164      4.56    43.69    75.78    0.27
   4  URIBL_BLACK                 1147      4.50    43.06    74.67    0.44
   5  HTML_MESSAGE                1071      4.20    40.20    69.73   44.50
   6  DCC_CHECK                   1046      4.10    39.26    68.10    6.56
   7  BAYES_99                     985      3.86    36.97    64.13    0.44
   8  DIGEST_MULTIPLE              937      3.67    35.17    61.00    0.35
   9  URIBL_JP_SURBL               927      3.63    34.80    60.35    0.09
  10  URIBL_SBL                    903      3.54    33.90    58.79    0.35
  11  URIBL_WS_SURBL               797      3.12    29.92    51.89    0.27
  12  RCVD_IN_XBL                  719      2.82    26.99    46.81    0.00
  13  RCVD_IN_BL_SPAMCOP_NET       669      2.62    25.11    43.55    0.98
  14  URIBL_OB_SURBL               653      2.56    24.51    42.51    0.09
  15  URIBL_SC_SURBL               552      2.16    20.72    35.94    0.00
  16  RAZOR2_CF_RANGE_E4_51_100    550      2.16    20.65    35.81    0.71
  17  RCVD_IN_SORBS_DUL            448      1.76    16.82    29.17    0.27
  18  MIME_HTML_ONLY               438      1.72    16.44    28.52    7.18
  19  RCVD_IN_NJABL_DUL            348      1.36    13.06    22.66    0.27
  20  RCVD_IN_SBL                  330      1.29    12.39    21.48    0.09

Razor2 caught 83% of the spam, DCC caught 68%, and Bayes got 64%.

> Bottom line, my clients would rather have 95% of the spam stopped and
> a 20 second delivery time than 100% of spam caught and a two minute
> delivery time. As always ;^) YMMV. Set up a honeypot account and check
> its contents daily. That will tell you if the choices you make are
> correct or not.
>
> DAve
>
> PS. While bayes/razor/dcc don't provide a benefit for me, I find
> URIBL and SURBL are responsible for catching at the very least 70% of
> my spam and at times 90%+. I also move SARE rules and custom rules in
> and out weekly, depending on the type of traffic I see. Right now
> SARE_OEM and SARE_STOCK are helping out. Next week it might be
> SARE_ADULT.

Agreed on URIBL and SURBL. Both of those have good showings in my stats as well.

I don't swap out the SARE rules. I use most of them and just let them run. My server doesn't see quite enough traffic for that to create a problem. They don't catch as much as the net rules, but they do help out from time to time.

-- 
Bowie
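As a sanity check on the report, the percentage columns can be turned back into the totals they imply. This is only a sketch and rests on an assumption the report itself does not state: that COUNT is the number of spam messages a rule hit, %OFMAIL is that count over all scanned mail, and %OFSPAM is that count over all spam. The helper name `implied_total` is made up for illustration.

```python
# (COUNT, %OFMAIL, %OFSPAM) for three rows copied from the report above.
rows = {
    "RAZOR2_CF_RANGE_51_100": (1280, 48.05, 83.33),
    "DCC_CHECK": (1046, 39.26, 68.10),
    "BAYES_99": (985, 36.97, 64.13),
}


def implied_total(count, percent):
    """Total number of messages such that `count` is `percent` percent of it."""
    return round(count / (percent / 100))


totals = {
    rule: (implied_total(count, pct_mail), implied_total(count, pct_spam))
    for rule, (count, pct_mail, pct_spam) in rows.items()
}
for rule, (all_mail, all_spam) in totals.items():
    print(f"{rule}: about {all_mail} messages scanned, about {all_spam} of them spam")
```

Under that reading, all three rows agree on roughly 2,664 scanned messages, of which about 1,536 were spam, which is consistent with the 83%, 68%, and 64% figures quoted in the message.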
Re: General assistance
Ed Russell wrote:
> My homework is:
>
> 1. Install and configure dnscache.
> 2. Look into RBL at the MTA.
> 3. Begin to investigate user authentication at the MTA.
>
> Some questions,
>
> 1. Does anyone have an opinion as to what RBL to contact? I know there
> are quite a few.

I have tried several with different levels of success. We have clients who get a lot of mail from self-administered servers on DSL, Pacific Rim, Eastern Europe, etc., so I have to be careful which RBL I choose. I have been using http://ordb.org and http://sbl-xbl.spamhaus.org for the past year and have no complaints (I should say my clients have no complaints). I believe RBLs are like spam rules: what works for me may not work for you. Your mail will determine whether a particular RBL is a good fit or not.

> 2. Once this is in place should I re-activate pyzor, dcc or razor?
> Is one better than the other? Are there advantages to either?

I use none of them, though I think I am in the minority. I routinely check my spam and I have found that bayes, razor, dcc, and most of the SARE rules catch little if any spam "for me". So I don't run them and save the CPU for additional spamd processes.

Bottom line, my clients would rather have 95% of the spam stopped and a 20 second delivery time than 100% of spam caught and a two minute delivery time. As always ;^) YMMV. Set up a honeypot account and check its contents daily. That will tell you if the choices you make are correct or not.

DAve

PS. While bayes/razor/dcc don't provide a benefit for me, I find URIBL and SURBL are responsible for catching at the very least 70% of my spam and at times 90%+. I also move SARE rules and custom rules in and out weekly, depending on the type of traffic I see. Right now SARE_OEM and SARE_STOCK are helping out. Next week it might be SARE_ADULT.
RE: General assistance
Ed Russell wrote:
> 1. Does anyone have an opinion as to what RBL to contact? I know
> there are quite a few.

openrbl.org has a reasonably comprehensive list.

-- 
Matthew.van.Eerde (at) hbinc.com 805.964.4554 x902
Hispanic Business Inc./HireDiversity.com
Software Engineer
RE: General assistance
> -Original Message-
> From: Ed Russell [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 10, 2006 12:32 PM
> To: users@spamassassin.apache.org
> Subject: RE: General assistance
>
> My homework is:
>
> 1. Install and configure dnscache.
> 2. Look into RBL at the MTA.
> 3. Begin to investigate user authentication at the MTA.
>
> Some questions,
>
> 1. Does anyone have an opinion as to what RBL to contact? I know there
> are quite a few.
>

We use sbl-xbl.spamhaus.org and I know a lot of others on this list do the same. I do know that FPs concerning this RBL have been mentioned on this list, but I have never encountered one. It is a popular enough list that anyone who ends up on it usually works quickly to get off of it. If there is any one list that most people use, SBL+XBL is definitely it. Go to http://www.spamhaus.org for more info.

Kris
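For anyone who has not looked at how an RBL check works on the wire, here is a minimal sketch: the client reverses the octets of the connecting IPv4 address, appends the list's zone, and asks DNS for an A record. An answer means "listed", a name error means "not listed". The zone name is the one mentioned in this thread as written in 2006; check that a zone is still operated and that its usage terms allow your volume before querying it. The helper names `dnsbl_query_name` and `is_listed` are made up for illustration, and treating every lookup failure as "not listed" is a simplifying choice of this sketch, not a recommendation.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name for an IPv4 address: reversed octets + zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """True if the zone answers with an A record for the address.

    Any resolver error (including "no such name") is treated as not listed,
    so a DNS outage fails open instead of blocking mail.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


# 127.0.0.2 is the conventional DNSBL test address; this only prints the
# name that would be queried and does not touch the network.
print(dnsbl_query_name("127.0.0.2", "sbl-xbl.spamhaus.org"))
```

Because every inbound connection triggers one such query per list, a local caching resolver makes a large difference at the volumes discussed in this thread.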
RE: General assistance
My homework is:

1. Install and configure dnscache.
2. Look into RBL at the MTA.
3. Begin to investigate user authentication at the MTA.

Some questions,

1. Does anyone have an opinion as to what RBL to contact? I know there are quite a few.

2. Once this is in place should I re-activate pyzor, dcc or razor? Is one better than the other? Are there advantages to either?
Re: General assistance
Ed Russell wrote:
> User validation is going to be tough or all but impossible. This box
> forwards off the mail to an NT box running SL Mail. There is no easy way to
> get a userlist out of this product. In addition the users change daily and
> some even use multi-drops.

You don't need to get a user list; you just need to ask the destination server if the user exists before accepting the message. This is what milter-ahead does on my MailScanner servers. I process and forward to servers running qmail (my toasters) and Exchange, GroupMail, Groupwise, Sendmail (my clients' servers). All respond correctly to milter-ahead.

I do not know of a way to duplicate milter-ahead in qmail without requiring something like vpopmail or LDAP.

Did you look at using dnscache? That might buy you enough breathing room to shop around for a solution to user verification.

DAve
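The "ask the destination server if the user exists" idea can be sketched with nothing more than an SMTP conversation that stops after RCPT TO. This is not how milter-ahead is implemented; it is only an illustration of the call-ahead principle using Python's standard smtplib. The function names, the 10 second timeout, and the three-way accept/reject/defer mapping are assumptions of this sketch.

```python
import smtplib


def classify_rcpt_reply(code):
    """Map an SMTP RCPT TO reply code to accept / reject / defer."""
    if 200 <= code < 300:
        return "accept"
    if 500 <= code < 600:
        return "reject"
    return "defer"  # 4xx or anything unexpected: let the sender retry later


def recipient_status(host, recipient, sender="", port=25, timeout=10):
    """Ask `host` about `recipient` without sending any message data.

    An empty `sender` produces the null reverse-path (MAIL FROM:<>).
    The connection is closed with QUIT when the block exits.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(sender)
        code, _reply = smtp.rcpt(recipient)
    return classify_rcpt_reply(code)


# Example of the decision logic only; no connection is made here.
for code in (250, 451, 550):
    print(code, classify_rcpt_reply(code))
```

This only helps when the destination server actually rejects unknown recipients at RCPT time; a server that accepts everything and bounces later will answer "accept" for every address.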
Look for messages your MTA can drop before sending to SA. Kris
Re: General assistance
Ed Russell wrote:
> I think there is some confusion: this box does not act as a gateway to
> Exchange. I do not use this product in this scenario, but in others.

I used Exchange as an example, and did state as such. My response made it appear you were looking for an Exchange solution. My apologies.

DAve
RE: General assistance
I think there is some confusion: this box does not act as a gateway to Exchange. I do not use this product in this scenario, but in others.

Ed

---
Talk is cheap since supply always exceeds demand.
---
RE: General assistance
User validation is going to be tough or all but impossible. This box forwards off the mail to an NT box running SL Mail. There is no easy way to get a userlist out of this product. In addition the users change daily and some even use multi-drops.

Ed

---
Talk is cheap since supply always exceeds demand.
---
RE: General assistance
DAve wrote:
> Ed Russell wrote:
> >
> > No user verification or RBL at the MTA level.
>
> Absolutely do user verification. I can throw out from 20% to 80% of my
> traffic depending on the current level of dictionary and Joe-Job
> attacks. Since you are processing ahead of your clients' Exchange boxes
> I'm not sure how you can do that with qmail. I do it on my gateways
> running MailScanner via milter-ahead, and on my toasters via checkuser
> in vpopmail.
>
> There might be a way to get qmail to check with an Exchange box to
> validate a user without running vpopmail, but I don't know of one.

IIRC, Exchange can act as an LDAP server, so you may be able to do user verification via LDAP lookups.

-- 
Bowie
RE: General assistance
Ed Russell wrote:
>
> No user verification or RBL at the MTA level.

You really should consider finding a way to do user verification at the MTA level. You can greatly reduce your server's load if you don't accept mail for nonexistent users.

To give you an example, so far today my server has rejected 7000 messages to unknown users and only delivered 2000 messages. That's 7000 messages that SA and ClamAV didn't have to scan. Also, if I had accepted those messages, I would have had to drop another 7000 non-delivery messages into my delivery queue (most of which would have sat in the queue for a week before double-bouncing back to postmaster).

-- 
Bowie
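Putting Bowie's two numbers side by side makes the saving explicit. The sketch below assumes that rejected and delivered messages are the only two outcomes being counted, which the message does not state; the function name `rejected_share` is made up for illustration.

```python
def rejected_share(rejected, delivered):
    """Fraction of incoming messages turned away before any scanning."""
    return rejected / (rejected + delivered)


share = rejected_share(rejected=7000, delivered=2000)
print(f"{share:.1%} of incoming messages never reached SA or ClamAV")
```

That works out to a little under 78%, which sits near the top of the 20% to 80% range DAve reports for his own servers.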
Re: General assistance
Ed Russell wrote:
> [EMAIL PROTECTED] smtpd]# spamassassin --version
> SpamAssassin version 3.1.0
>   running on Perl version 5.8.7
>
> Spamd running with:
> OPTIONS="-L -x -d -u nobody -m 45"
>
> No user verification or RBL at the MTA level.

Absolutely do user verification. I can throw out from 20% to 80% of my traffic depending on the current level of dictionary and Joe-Job attacks. Since you are processing ahead of your clients' Exchange boxes I'm not sure how you can do that with qmail. I do it on my gateways running MailScanner via milter-ahead, and on my toasters via checkuser in vpopmail.

There might be a way to get qmail to check with an Exchange box to validate a user without running vpopmail, but I don't know of one.

DAve
Re: General assistance
Ed Russell wrote:
> If everyone would indulge me I would like to put forth the setup I am utilizing and get some feedback.
>
> I have a box that I have been using for some time which acts as a pass-through spam filter for many domains (currently about 100). This is a fairly high traffic server, processing about 150,000 to 200,000 messages per day. I use the following method.
>
> The box runs Red Hat 6.2 with kernel 2.2.26 on a PIV with 2 Gigs of RAM. Qmail accepts the email from the world (with a concurrencyincoming of 100) and passes it through qmail-scanner (which calls spamd) and spamassassin, which checks the email and writes the spam status to the header. Each message then gets passed through a procmail filter which will delete it if it is spam. The procmail filter is:
>
>   :0
>   * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
>   {
>     EXITCODE=99
>     :0
>     /dev/null
>   }
>
>   :0
>   * ^X-Spam-Status: Yes
>   {
>     EXITCODE=99
>     :0
>     /dev/null
>   }
>
>   :0
>   * ^^rom[ ]
>   {
>     LOG="*** Dropped F off From_ header! Fixing up. "
>     :0 fhw
>     | sed -e '1s/^/F/'
>   }
>
>   :0
>   /dev/null
>
> Mail that is clean gets passed off to a second qmail install which then delivers the mail to our servers using smtproutes.
>
> This has been working fine for a few years now, but recently we have experienced major delays in the processing of email. Due to the very high volume, the system is handling 100 concurrent incoming pieces of email pretty much all the time. Of course with everything else going on it is not uncommon for this system to have up to 400 processes running. Sometimes mail can take hours to get through to its destination.
>
> What I have discovered is that if I disable RBL checks and razor, pyzor and dcc the delay goes away. However, the effectiveness of the filter is reduced.
>
> Am I completely off base in the way I have this all set up? I have gone with a higher speed HD to increase the threshold on file I/O. Can I tune the performance of razor etc. while maintaining delivery time? Is there anything else I should be considering? If I have not explained things well or more information is needed I will certainly provide anything.
>
> Thanks

Since you are running qmail, consider doing your RBL checks in qmail-smtpd. No sense scanning a message if you can drop it at the door first.

Also, are you running dnscache? I run dnscache on all my servers: web, webmail, toasters, etc. It can speed things up considerably as it will cache your RBL lookups, SURBL lookups, etc. It's a nice thing to do for the URIBL and SURBL folks too.

If you do run dnscache, consider turning logging off once you are configured and satisfied it works as intended. dnscache can keep a disk pretty busy with its potential to log a lot of data.

DAve
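Why a local cache helps so much here: the same handful of blacklist and URI-list names get asked for over and over, and every repeat that is answered locally is one less round trip that a spamd child sits waiting on. The sketch below is a toy in-process cache, not dnscache itself; the class name, the 300 second TTL, and the stand-in upstream function are all made up for illustration.

```python
import time


class CachingResolver:
    """Toy caching resolver: remembers answers (including misses) for `ttl` seconds."""

    def __init__(self, upstream, ttl=300.0, clock=time.monotonic):
        self._upstream = upstream      # function: name -> answer or None
        self._ttl = ttl
        self._clock = clock
        self._cache = {}               # name -> (expiry time, answer)
        self.upstream_queries = 0

    def lookup(self, name):
        now = self._clock()
        entry = self._cache.get(name)
        if entry is not None and entry[0] > now:
            return entry[1]            # still fresh: no upstream query needed
        self.upstream_queries += 1
        answer = self._upstream(name)
        self._cache[name] = (now + self._ttl, answer)
        return answer


def fake_upstream(name):
    """Stand-in for a real DNS query; 'lists' only the test address."""
    return "127.0.0.2" if name.startswith("2.0.0.127.") else None


resolver = CachingResolver(fake_upstream)
for _ in range(100):
    resolver.lookup("2.0.0.127.sbl-xbl.spamhaus.org")
print(f"100 lookups, {resolver.upstream_queries} upstream query")
```

A real caching resolver honours the TTL supplied with each DNS answer instead of a fixed value, but the effect on repeated lookups is the same.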
RE: General assistance
[EMAIL PROTECTED] smtpd]# spamassassin --version
SpamAssassin version 3.1.0
  running on Perl version 5.8.7

Spamd running with:
OPTIONS="-L -x -d -u nobody -m 45"

No user verification or RBL at the MTA level.

  12:20pm up 4:05, 1 user, load average: 9.49, 9.23, 9.23
  313 processes: 300 sleeping, 12 running, 1 zombie, 0 stopped
  CPU states: 18.9% user, 16.6% system, 0.0% nice, 64.4% idle
  Mem:  2009856K av, 711560K used, 1298296K free, 353776K shrd, 129268K buff
  Swap: 2097136K av, 0K used, 2097136K free 225380K cached

As you can see I have loads of headroom as far as memory goes. I was looking into integrating RBL into Qmail, but with the very high volume I am quite concerned that this will introduce a slowdown. If I increase the inbound concurrent rate I eventually run into qmail-scanner problems with reformime. Is there anything else I need to consider?

Ed

---
Talk is cheap since supply always exceeds demand.
---
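To put a rough number on that memory headroom, the free figure from the top output can be divided by a per-child size. This is a back-of-the-envelope sketch only: the per-child sizes are the ones DAve reports for his own servers elsewhere in the thread, not measurements from Ed's box, and the calculation ignores memory pages shared between spamd children as well as everything else that would compete for the same RAM. The function name `children_that_fit` is made up for illustration.

```python
def children_that_fit(free_kb, per_child_mb):
    """How many processes of `per_child_mb` megabytes fit in `free_kb` kilobytes."""
    return free_kb // (per_child_mb * 1024)


FREE_KB = 1298296  # "free" figure from the top output above

estimates = {
    "all SARE rules, 75 MB each": children_that_fit(FREE_KB, 75),
    "all SARE rules, 45 MB each": children_that_fit(FREE_KB, 45),
    "selective SARE, 35 MB each": children_that_fit(FREE_KB, 35),
    "selective SARE, 23 MB each": children_that_fit(FREE_KB, 23),
}
for label, extra in estimates.items():
    print(f"{label}: room for about {extra} more spamd children")
```

Even the most pessimistic case leaves room for additional children on top of the current `-m 45`, which fits the suggestion in this thread that the box is waiting on lookups and is not short of memory or CPU.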
RE: General assistance
> -Original Message-
> From: Ed Russell [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 10, 2006 10:51 AM
> To: users@spamassassin.apache.org
> Subject: General assistance
>
> Am I completely off base in the way I have this all set up? I have gone
> with a higher speed HD to increase the threshold on file I/O. Can I tune
> the performance of razor etc. while maintaining delivery time? Is there
> anything else I should be considering? If I have not explained things
> well or more information is needed I will certainly provide anything.
>

A few questions I have:

What SA version are you running? spamassassin --version
What do you have --max-children set to?
How much memory do you have free when the box is fully loaded?

I'm trying to see if you have any headroom left to have more spamd children running. It sounds like your problem is with waiting on DNS returns. This should mean that you have plenty of processing power remaining, just not enough children to handle the requests.

Other things to consider:

Do you use RBLs at the MTA level?
Do you have user verification at the MTA level?

Look for messages your MTA can drop before sending to SA.

Kris