Re: Simple question TRUE or FALSE
David Velásquez Restrepo wrote:

> Hi,
>
> I have been using spamassassin for a long time to review a lot (a lot!)
> of incoming mail. Today I have a machine that runs nothing but
> spamassassin, because of its high CPU and MEM requirements. Just to be
> clear (maybe I have something wrong), the question is:
>
> Q) With spamassassin you need about 20 to 30 seconds per email message
> and LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

False for A - however, lots of RAM and CPU are essential for any reasonably high throughput (this applies to many platforms). We process around 5 million incoming messages per day through the spamd layer, and each message takes around 0.5 seconds for spamd to process - this includes DNS (URI) checks, per-mailbox database lookups, local rulesets and DCC checks.

The key is to have your resources local - if you're relying on external lookups to the Internet then everything becomes very variable...

Paul
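To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes the 5 million messages arrive evenly over 24 hours (real traffic is bursty, so peaks will be higher) and uses Little's law to estimate how many scans are in flight at once. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_concurrency(messages_per_day: float, seconds_per_message: float) -> float:
    """Little's law: mean scans in flight = arrival rate * mean scan time."""
    arrival_rate = messages_per_day / SECONDS_PER_DAY  # messages per second
    return arrival_rate * seconds_per_message


if __name__ == "__main__":
    rate = 5_000_000 / SECONDS_PER_DAY
    print(f"{rate:.1f} messages/second")  # 57.9 messages/second
    # About 28.9 scans running at any moment at 0.5 s per message.
    print(f"{average_concurrency(5_000_000, 0.5):.1f} scans in flight")
```

At 20 to 30 seconds per message the same arrival rate would need well over a thousand concurrent scans, which is why the per-message time matters so much at this volume.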
RE: Simple question TRUE or FALSE (More data to answer this question)
> My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi
> drives can only scan a message in 4-5 seconds. At least that was my scan
> time with a completely default setup, running spamd/spamass-milter, SA
> 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while
> (since I updated SA, the milter, and sendmail), but I have a good feeling
> most of my processing time was spent waiting for DNS responses.
>
> Any input into my situation would be appreciated. I'd love to be able to
> get down to 2-3 seconds, basically cutting my processing time in half!

I only checked the timings of the last 10 or so mails, to show that it was much faster than the 20-30 seconds mentioned. But especially for you ;-) I have now calculated the mean SA check time over the last 7 days on the 1GHz/512MB server, and it is 3.854 seconds.

This server has Suse Linux, postfix 2.2.3, Amavisd-new 2.3.1, SA 3.03, Clamav, Razor, DCC. Network tests are enabled, there is no local DNS server, and it uses only the standard SA CF files except for a small local.cf.

Menno van Bennekom
Re: Simple question TRUE or FALSE (More data to answer this question)
From: "Jon Dossey" <[EMAIL PROTECTED]> > From: Menno van Bennekom [mailto:[EMAIL PROTECTED] > To: David Velásquez Restrepo > Subject: Re: Simple question TRUE or FALSE (More data to answer this > question) > > > Q) With spamassassin (and all the above info) you need about 20 to 30 > > seconds per email message and LOTS of RAM and CPU: > > a) TRUE > > b) FALSE > My answer is b), False. > I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that > server usually takes 2 or 3 seconds per message. > Like already posted, some of your rulesets are unnecessary because they > are included in SA (standard rulesets or SURBL). > Did you check 'cat messages | spamassassin -D' to see what part takes most > time? DNS time-outs can take a lot of time for example (also checkable > with tcpdump port 53). > Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail > but I use postfix (and amavisd-new) and I think it's quite memory and CPU > efficient. > Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of the network tests, like other people that have posted 2-3 second message processing times, is that correct? My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi drives can only scan a message in 4-5 seconds. At least that was my scan time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated SA, the milter, and sendmail), but I have a good feeling most of my processing time was spent waiting for DNS responses. Any input into my situation would be appreciated. I'd love to be able to get down to 2-3 seconds, basically cutting my processing time in half! [JDOW>>] Jon, I am using these rules from the sources that follow the names. (I built my own GetRules script.) 
99_FVGT_Tripwire.cf,http://www.rulesemporium.com/rules/
99_OBFU_drugs.cf,http://www.rulesemporium.com/rules/Testing/
99_sare_fraud_post25x.cf,http://www.rulesemporium.com/rules/
99_FVGT_DomainDigits.cf,http://www.rulesemporium.com/rules/Testing/
99_FVGT_meta.cf,http://www.rulesemporium.com/rules/
88_FVGT_body.cf,http://www.rulesemporium.com/rules/
88_FVGT_rawbody.cf,http://www.rulesemporium.com/rules/
88_FVGT_subject.cf,http://www.rulesemporium.com/rules/
88_FVGT_headers.cf,http://www.rulesemporium.com/rules/
72_sare_bml_post25x.cf,http://www.rulesemporium.com/rules/
72_sare_redirect_post3.0.0.cf,http://www.rulesemporium.com/rules/
70_sare_highrisk.cf,http://www.rulesemporium.com/rules/
70_sare_adult.cf,http://www.rulesemporium.com/rules/
70_sare_bayes_poison_nxm.cf,http://www.rulesemporium.com/rules/
70_sare_oem.cf,http://www.rulesemporium.com/rules/
70_sare_random.cf,http://www.rulesemporium.com/rules/
70_sare_spoof.cf,http://www.rulesemporium.com/rules/
70_sare_header.cf,http://www.rulesemporium.com/rules/
70_sare_header_eng.cf,http://www.rulesemporium.com/rules/
70_sare_html.cf,http://www.rulesemporium.com/rules/
70_sare_html_eng.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj_eng.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj0.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj1.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj2.cf,http://www.rulesemporium.com/rules/
70_sare_specific.cf,http://www.rulesemporium.com/rules/
70_sare_unsub.cf,http://www.rulesemporium.com/rules/
70_sare_uri0.cf,http://www.rulesemporium.com/rules/
70_sare_uri1.cf,http://www.rulesemporium.com/rules/
70_sare_uri_eng.cf,http://www.rulesemporium.com/rules/
70_sare_obfu0.cf,http://www.rulesemporium.com/rules/
70_sare_obfu1.cf,http://www.rulesemporium.com/rules/
chickenpox.cf,http://www.rulesemporium.com/rules/
ratware.cf,http://www.rulesemporium.com/rules/
useless.cf,http://www.rulesemporium.com/rules/
weeds_2.cf,http://www.rulesemporium.com/rules/

Spamc/Spamd takes 2 seconds to scan a small spam message and spit it out.

$ spamc
RE: Simple question TRUE or FALSE (More data to answer this question)
>> I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that >> server usually takes 2 or 3 seconds per message. >> Like already posted, some of your rulesets are unnecessary because they >> are included in SA (standard rulesets or SURBL). >> Did you check 'cat messages | spamassassin -D' to see what part takes most >> time? DNS time-outs can take a lot of time for example (also checkable >> with tcpdump port 53). >> Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail >> but I use postfix (and amavisd-new) and I think it's quite memory and CPU >> efficient. >> > >Please don't take this as me doubting you - but how in the world are you >able to scan a message in 2-3 seconds? I assume you're running some of >the network tests, like other people that have posted 2-3 second message >processing times, is that correct? > >My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi >drives can only scan a message in 4-5 seconds. At least that was my scan >time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, >RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated >SA, the milter, and sendmail), but I have a good feeling most of my processing >time was spent waiting for DNS responses. > >Any input into my situation would be appreciated. I'd love to be able to get >down to 2-3 seconds, basically cutting my processing time in half! > >.jon I'll describe my setup, and that may give you some insight. It's almost certainly what you think: network tests. My setup uses Compaq ML570s, 4 700MHz Xeon CPUs each, 2G of ram, RAID 0+1 disk arrays. They do virus scanning, spam scanning, and various other mail related tasks which all (of course) take resources. 
These machines rarely go above 700M consumed, and only really run more than 50% busy (over a several minute window) on Monday morning, or when a spammer has decided that it would be a wonderful idea to hit every single address of ours that they have in rapid succession.

The sa-stats routine returns the following data, based on yesterday's logs: the average scan time was 1.44s, average ham scan time 1.11s, average spam scan time 1.62s. The total number of messages scanned was 225,850. It would be much higher, but we don't scan outbound email, and we also block mail using a sendmail milter derived from rbl-milter, which blocks when 2 (or more) of the RBLs that we use agree.

To speed up the network tests, we take advantage of any RBL provider that offers rsync access to their lists (njabl, dsbl, surbl, others), and then (almost) only use those ones. Our scan times went up after I added a few others (sbl-xbl, and bl.spamcop), but those ones are really fast anyway.

Each machine runs a local caching DNS server, and the locally hosted RBLs are served by an rbldnsd server. Conveniently, rbldnsd makes it easy to run a private URIBL, which is occasionally nice.

Our site-wide bayes database lives in SQL, because it's more convenient to share among multiple machines that way, and it has the added benefit of being faster.

I don't run Razor or DCC or Pyzor.

A pile of custom rules and the SARE rulesets finish the setup. I've probably forgotten something, but those are the important things.

Anyway, I hope that helps someone :) The setup works nicely, with nary a hitch, thanks to everyone who makes it possible!

- Austin.
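The "block when 2 (or more) RBLs agree" policy can be sketched roughly as below. This is only an illustration of the voting idea, not the actual rbl-milter derivative described above. It assumes the per-list lookups have already been done and are passed in as booleans; the function name and the sample results are hypothetical.

```python
def should_block(listings: dict[str, bool], threshold: int = 2) -> bool:
    """Return True when at least `threshold` blocklists list the sender."""
    hits = sum(1 for listed in listings.values() if listed)
    return hits >= threshold


if __name__ == "__main__":
    # Hypothetical lookup results for one connecting IP address.
    results = {"njabl": True, "dsbl": False, "sbl-xbl": True}
    print(should_block(results))  # True: two lists agree
```

Requiring agreement between lists trades a little catch rate for fewer false positives from any single list.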
Re: Simple question TRUE or FALSE (More data to answer this question)
Jon Dossey wrote:

> Please don't take this as me doubting you - but how in the world are you able
> to scan a message in 2-3 seconds? I assume you're running some of the
> network tests, like other people that have posted 2-3 second message
> processing times, is that correct?
>
> My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi
> drives can only scan a message in 4-5 seconds. At least that was my scan
> time with a completely default setup, running spamd/spamass-milter, SA 3.0.1,
> RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I
> updated SA, the milter, and sendmail), but I have a good feeling most of my
> processing time was spent waiting for DNS responses.

Using SA 2.64 with bayes, razor, dcc (w/dccifd), Mail::SpamCopURI, and 31 .cf files in my /etc/mail/spamassassin, I'm able to do it in this timeframe.

# time spamc
X-Spam-Status: Yes, hits=999.5 required=5.0 tests=BAYES_00,DCC_CHECK,
    DNS_FROM_RFCI_DSN,GTUBE,RAZOR2_CF_RANGE_11_50,RAZOR2_CHECK
    autolearn=no version=2.64

real    0m2.583s
user    0m0.000s
sys     0m0.000s

System is a single CPU p4 celeron 2ghz with 512mb of ram, and a caching resolver DNS on localhost.
Re: Simple question TRUE or FALSE (More data to answer this question)
Jon Dossey wrote:

> Please don't take this as me doubting you - but how in the world are
> you able to scan a message in 2-3 seconds? I assume you're running
> some of the network tests, like other people that have posted 2-3
> second message processing times, is that correct?
>
> My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored
> scsi drives can only scan a message in 4-5 seconds.

My $450 machine is a 2ghz AMD Sempron with 1gb RAM and old/slow IDE drives, and it processes messages in 4 seconds:

May 19 16:53:52 goose spampd[13065]: 2005/05/19-16:53:51 CONNECT TCP Peer: "127.0.0.1:36813" Local: "127.0.0.1:27"
[...]
May 19 16:53:56 goose spampd[13065]: Closed connections

I do a ~dozen RBL lookups, URIDNSBL, Razor2 lookups, ClamAV scans, my own LDAP tests, and some more stuff

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/
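For anyone who wants to pull the same number out of their own logs, here is a minimal sketch that takes the two syslog lines quoted above and computes the elapsed time from their leading timestamps. Syslog stamps carry no year, so the code supplies one (2005, an assumption taken from the date inside the first line), and it assumes an English locale for the month name. The helper names are made up for illustration.

```python
from datetime import datetime


def syslog_time(line: str, year: int = 2005) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    The year is not part of the stamp, so it is passed in (an assumption).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def elapsed_seconds(first_line: str, last_line: str) -> float:
    """Seconds between the syslog stamps of two log lines."""
    return (syslog_time(last_line) - syslog_time(first_line)).total_seconds()


connect = ('May 19 16:53:52 goose spampd[13065]: 2005/05/19-16:53:51 '
           'CONNECT TCP Peer: "127.0.0.1:36813" Local: "127.0.0.1:27"')
closed = "May 19 16:53:56 goose spampd[13065]: Closed connections"

if __name__ == "__main__":
    print(elapsed_seconds(connect, closed))  # 4.0
```

Note that spampd's own timestamp in the first line is one second earlier than the syslog stamp, so timings at this resolution are only good to about a second.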
Re: Simple question TRUE or FALSE (More data to answer this question)
On Thu, 19 May 2005 16:33:36 -0500 "Jon Dossey" <[EMAIL PROTECTED]> wrote: > > From: Menno van Bennekom [mailto:[EMAIL PROTECTED] > > To: David Velásquez Restrepo > > Subject: Re: Simple question TRUE or FALSE (More data to answer > > this question) > > > > > Q) With spamassassin (and all the above info) you need about > > > 20 to 30 seconds per email message and LOTS of RAM and CPU: > > > a) TRUE > > > b) FALSE > > My answer is b), False. > > I have a mailserver here that has a 1Ghz CPU and 512MB RAM and > > SA on that server usually takes 2 or 3 seconds per message. > > Like already posted, some of your rulesets are unnecessary > > because they are included in SA (standard rulesets or SURBL). > > Did you check 'cat messages | spamassassin -D' to see what part > > takes most time? DNS time-outs can take a lot of time for > > example (also checkable with tcpdump port 53). > > Also your SMTP-server (xmail?) takes a lot of cpu. I've never > > used Xmail but I use postfix (and amavisd-new) and I think it's > > quite memory and CPU efficient. > > > > Please don't take this as me doubting you - but how in the world > are you able to scan a message in 2-3 seconds? I assume you're > running some of the network tests, like other people that have > posted 2-3 second message processing times, is that correct? > > My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB > mirrored scsi drives can only scan a message in 4-5 seconds. At > least that was my scan time with a completely default setup, > running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail > 8.13.1. I haven't checked in a while (since I updated SA, the > milter, and sendmail), but I have a good feeling most of my > processing time was spent waiting for DNS responses. > > Any input into my situation would be appreciated. I'd love to be > able to get down to 2-3 seconds, basically cutting my processing > time in half! > > .jon > For me, spamstats returns ... 
Average message analysis time       : 5.64 seconds
Average spam analysis time          : 5.54 seconds
Average clean message analysis time : 5.76 seconds

What is my mail server running? A single PII/400 - 768MB - SCSI disks - Debian Woody w/SA from www.backports.org

-- 
Raquel

Often people attempt to live their lives backwards: they try to have more things, or more money, in order to do more of what they want so they will be happier. The way it actually works is the reverse. You must first be who you really are, then, do what you need to do, in order to have what you want.
--Margaret Young
Re: Simple question TRUE or FALSE (More data to answer this question)
Justin Mason wrote: > > jdow writes: > >>>You are using larger chunks of VIRT than I am. I use about 60M where >>>you are using 98M. I run with "--max-conn-per-child=15". You win a >>>little if you either add RAM or cut down to "-m2" or "-m3". You do >>>have a fair amount of cache in use. Once that happens you flounder >>>around in cache swapping when running spamassassin. > > > the fundamental problem is that he's not using spamd. Well, that's about 3/4 of his problem Justin. The other 1/4 is he's also got a massively oversized configuration with duplicate rulesets, and large unsupported rulesets like bigevil. Switching to spamd would help him in speed, and will limit the memory usage by limiting the number of children, but the per-child memory usage will still be high until he gets rid of bigevil. > > rule of thumb: if you see performance issues, and you're not using > spamd, STOP RIGHT THERE and start using spamd ;) Unless you're using some other persistent daemon for integration that uses the Mail::SpamAssassin API, such as MailScanner or similar. (note: just using Mail::SpamAssassin in a perl script that gets called for each message as the OP is doesn't count. It's got to be a persistent daemon that doesn't get re-invoked for every message in order to be comparable to spamd.)
RE: Simple question TRUE or FALSE (More data to answer this question)
> Please don't take this as me doubting you - but how in the world are you > able to scan a message in 2-3 seconds? I assume you're running some of Personally, I rarely have any processing times over 1 second. Most of mine are between 0.3 and 0.9 seconds per message. I do not run any network tests, however. Stock SpamAssassin rules, the only modifications I've made have been some scoring adjustments. This is on an AMD64 3000+ with 1GB of DDR400 RAM, running OpenBSD 3.6-STABLE, spamd/spamc, and qmail. Benny -- "You come from a long line of scary women." -- Ranger, "Three To Get Deadly"
Re: Simple question TRUE or FALSE (More data to answer this question)
> Software:
> --
> A perl script which takes some file and tests it using Mail::SpamAssassin to

Which version of SA?

> Using: Net test, Bayes, Razor2, DCC, Pyzor, SPF Test (and everything else
> suggested by spamassassin)
> Rules:
> rules_du_jour:
> http://www.rulesemporium.com/rules/bigevil.cf

That's your first problem. We've been telling people for months to GET RID OF THIS THING. It is probably causing 80% of your problems.

> http://mywebpages.comcast.net/mkettler/sa/antidrug.cf

If you are on 3.x you shouldn't be running this, it is built in. If you aren't running 3.x, why not?

> http://www.rulesemporium.com/rules/99_sare_fraud_post25x.cf
> http://www.rulesemporium.com/rules/99_sare_fraud_pre25x.cf
> http://www.rulesemporium.com/rules/72_sare_bml_post25x.cf
> http://www.rulesemporium.com/rules/71_sare_bml_pre25x.cf

I think you aren't reading the rule descriptions on our site. Those are two files: ONE is supposed to be used if you are on 2.4x or before, and the OTHER if you are on 2.5x or later. It is physically impossible for a version of SA to be BOTH before and after 2.50.

> http://www.rulesemporium.com/rules/71_sare_redirect_pre3.0.0.cf
> http://www.rulesemporium.com/rules/72_sare_redirect_post3.0.0.cf

Same basic problem. Here you are claiming that your version of SA is both before and after 3.0.0. Up above you claimed it was both before and after 2.50.

Throw out the junk you shouldn't have in those rule sets and things might work better.

Loren
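A quick way to spot this kind of mistake in a rules list is to look for a "pre" and a "post" file of the same family and version sitting side by side. The sketch below assumes the naming pattern seen in the URLs quoted above (a numeric prefix, a family name, then pre or post and a version); it is not an official SARE tool, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches names such as 99_sare_fraud_pre25x.cf or 72_sare_redirect_post3.0.0.cf.
PATTERN = re.compile(
    r"^\d+_(?P<family>.+)_(?P<side>pre|post)(?P<version>[0-9.x]+)\.cf$"
)


def conflicting_rulesets(names_or_urls):
    """Return sorted (pre_file, post_file) pairs installed together."""
    sides = defaultdict(dict)
    for item in names_or_urls:
        name = item.rsplit("/", 1)[-1]  # accept a bare name or a full URL
        match = PATTERN.match(name)
        if match:
            key = (match["family"], match["version"])
            sides[key][match["side"]] = name
    return sorted(
        (found["pre"], found["post"])
        for found in sides.values()
        if len(found) == 2
    )


if __name__ == "__main__":
    rules = [
        "http://www.rulesemporium.com/rules/99_sare_fraud_post25x.cf",
        "http://www.rulesemporium.com/rules/99_sare_fraud_pre25x.cf",
        "http://www.rulesemporium.com/rules/72_sare_bml_post25x.cf",
        "http://www.rulesemporium.com/rules/71_sare_bml_pre25x.cf",
        "http://www.rulesemporium.com/rules/71_sare_redirect_pre3.0.0.cf",
        "http://www.rulesemporium.com/rules/72_sare_redirect_post3.0.0.cf",
        "http://www.rulesemporium.com/rules/70_sare_adult.cf",
    ]
    for pre, post in conflicting_rulesets(rules):
        print(f"keep only one of: {pre} / {post}")
```

Which file of each pair to keep depends on the installed SA version, which the original poster had not stated.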
RE: Simple question TRUE or FALSE (More data to answer this question)
> From: Menno van Bennekom [mailto:[EMAIL PROTECTED] > To: David Velásquez Restrepo > Subject: Re: Simple question TRUE or FALSE (More data to answer this > question) > > > Q) With spamassassin (and all the above info) you need about 20 to 30 > > seconds per email message and LOTS of RAM and CPU: > > a) TRUE > > b) FALSE > My answer is b), False. > I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that > server usually takes 2 or 3 seconds per message. > Like already posted, some of your rulesets are unnecessary because they > are included in SA (standard rulesets or SURBL). > Did you check 'cat messages | spamassassin -D' to see what part takes most > time? DNS time-outs can take a lot of time for example (also checkable > with tcpdump port 53). > Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail > but I use postfix (and amavisd-new) and I think it's quite memory and CPU > efficient. > Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of the network tests, like other people that have posted 2-3 second message processing times, is that correct? My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi drives can only scan a message in 4-5 seconds. At least that was my scan time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated SA, the milter, and sendmail), but I have a good feeling most of my processing time was spent waiting for DNS responses. Any input into my situation would be appreciated. I'd love to be able to get down to 2-3 seconds, basically cutting my processing time in half! .jon
Re: Simple question TRUE or FALSE
Marcel Veldhuizen writes:

> At 06:00 19-5-2005, Justin Mason wrote:
> > > Memory usage can be quite huge if you have many custom rulesets,
> > > because SA 3.0.x forks into several processes which all insist on
> > > making their own copy of the ruleset in memory :( When I still used
> > > the RDJ bigevil list (amongst others), it would use 96 MB of memory
> > > for each SA process.
> >
> > actually, most of this *is* shared, it's just that linux can no
> > longer report this accurately.
>
> What makes you think that? Total used memory on my system is consistent
> with SpamAssassin processing not sharing any significant amount of memory.
> Also it reports the memory sharing just fine on applications such as Apache?

No, it doesn't ;)

It is consistent with that scenario -- but that scenario is *NOT* what's happening. That's exactly the problem.

Red Hat 2.4 kernels, and all kernels >= 2.6.0, report only shared library usage in the "SHR" column. Therefore memory that is copy-on-write shared between multiple processes' code and data segments is not counted.

http://wiki.apache.org/spamassassin/TopSharedMemoryBug has the details.

--j.
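One way to get a second opinion on sharing, on Linux kernels that provide /proc/<pid>/smaps, is to sum the per-mapping Shared_* and Private_* fields instead of trusting top's SHR column. This is an assumption on my part: smaps is not mentioned anywhere in this thread and is only available on newer kernels. The sketch below parses smaps-style text; the sample figures are invented purely to exercise the parser and are not measurements from any spamd process.

```python
import re

# smaps reports these four counters, in kB, once per memory mapping.
FIELD = re.compile(
    r"^(Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+) kB$",
    re.MULTILINE,
)


def shared_private_kb(smaps_text: str) -> dict:
    """Sum shared and private page counters (in kB) across all mappings."""
    totals = {"shared": 0, "private": 0}
    for field, kb in FIELD.findall(smaps_text):
        key = "shared" if field.startswith("Shared") else "private"
        totals[key] += int(kb)
    return totals


# Invented sample in the smaps field format (two mappings).
SAMPLE = """\
Shared_Clean:       1200 kB
Shared_Dirty:      60000 kB
Private_Clean:        40 kB
Private_Dirty:      9000 kB
Shared_Clean:        300 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
"""

if __name__ == "__main__":
    print(shared_private_kb(SAMPLE))  # {'shared': 61500, 'private': 9040}
```

On a live system the text would come from reading /proc/<pid>/smaps for one spamd child, which usually requires running as the same user or as root.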
RE: Simple question TRUE or FALSE (More data to answer this question)
> > Q) With spamassassin (and all the above info) you need
> > about 20 to 30 seconds per email message and LOTS of RAM and CPU:
> > a) TRUE
> > b) FALSE
> My answer is b), False.
> I have a mailserver here that has a 1Ghz CPU and 512MB RAM
> and SA on that server usually takes 2 or 3 seconds per message.
> Like already posted, some of your rulesets are unnecessary
> because they are included in SA (standard rulesets or SURBL).
> Did you check 'cat messages | spamassassin -D' to see what
> part takes most time? DNS time-outs can take a lot of time
> for example (also checkable with tcpdump port 53).
> Also your SMTP-server (xmail?) takes a lot of cpu. I've never
> used Xmail but I use postfix (and amavisd-new) and I think
> it's quite memory and CPU efficient.

I second that, for a 2.4GHz Celeron with 1GB RAM running SA + Postfix hooked to a MySQL DB!

However, on slow boxes I've seen SA take 15 seconds per test, although CPU never climbed to 20-30%. The tests took so long because of a mixture of rulesets for SA < v3.x and SA >= v3.x. After cleaning up the mess, the performance gain was immediate.

If there are problems with DNS, you are advised to use a DNS cache right on the SA box, or at least one that's physically in the same network / subnet.

Philipp
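To check whether DNS is the slow part before installing a cache, one rough approach is to time the same lookups twice and compare the first attempt with the repeat. The helper below is a sketch under that assumption; it takes the resolver as a parameter so it can be tried without touching the network, and its name is made up for illustration.

```python
import time


def time_lookups(resolve, names, repeats=2):
    """Return {name: [seconds per attempt]} for each name.

    `resolve` is any callable that takes a hostname, for example
    socket.gethostbyname from the standard library.
    """
    timings = {}
    for name in names:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            try:
                resolve(name)
            except OSError:
                # Failed or timed-out lookups still cost time, so the
                # sample is kept rather than dropped.
                pass
            samples.append(time.perf_counter() - start)
        timings[name] = samples
    return timings
```

With the standard library resolver this would be called as time_lookups(socket.gethostbyname, ["example.com"]). If the second attempt is not much faster than the first, nothing nearby is caching the answer, and lookups that take seconds rather than milliseconds point at time-outs of the kind Menno mentions.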
Re: Simple question TRUE or FALSE
> David Velásquez Restrepo wrote:
>
> > Hi,
> >
> > I have been using spamassassin for a long time to review a lot
> > (a lot!) of incoming mail. Today I have a machine that runs nothing
> > but spamassassin, because of its high CPU and MEM requirements.
> > Just to be clear (maybe I have something wrong), the question is:
> >
> > Q) With spamassassin you need about 20 to 30 seconds per
> > email message and LOTS of RAM and CPU:
> > a) TRUE
> > b) FALSE

I have seen this type of behaviour from spamassassin in my installation. It has always been a result of something wrong with my perl install OR missing modules that I have configured SA to use.

Check the output of "spamassassin --lint". If your messages take 20 to 30 seconds to be scanned then you'll see the reason in that output.

Kevin W. Gagel
Network Administrator
Information Technology Services
(250) 561-5848 local 448
The College of New Caledonia, http://www.cnc.bc.ca
Re: Simple question TRUE or FALSE
David Velásquez Restrepo wrote:

> Hi,
>
> I have been using spamassassin for a long time to review a lot (a lot!)
> of incoming mail. Today I have a machine that runs nothing but
> spamassassin, because of its high CPU and MEM requirements. Just to be
> clear (maybe I have something wrong), the question is:
>
> Q) With spamassassin you need about 20 to 30 seconds per email message
> and LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

False.

On my home system, which admittedly doesn't see a lot of mail volume, it takes between four and six seconds to scan a message. It sometimes takes longer if some other process is using a lot of memory, because that machine is kind of short on RAM. It's a 500 MHz DEC AlphaPC. I'm not doing DNS caching on that one, so a lot of that time may be spent waiting for DNS blacklists to respond.

A quick check of the mail server at work, which is faster and uses a caching DNS server, shows most messages are being scanned in under 2 seconds.

If you're seeing 20 to 30 second scan times, your server is probably overloaded. Maybe you don't have enough RAM and you're swapping to disk.
Re: Simple question TRUE or FALSE (More data to answer this question)
David Velásquez Restrepo wrote:

> Software:
> --
> A perl script which takes some file and tests it using Mail::SpamAssassin
> to get its spam score level

If your script isn't persistent, I'd ditch it and use spamc/spamd as Justin Mason suggested. You'll save a lot of processor time from three things using this approach:

1) spamd parses the rulesets when it loads, instead of on a per-message basis.

2) You'll avoid invoking a perl process on a per-message basis, which is a huge waste of CPU time. The perl processes will be preforked by spamd, and only spamc (a compiled utility) gets invoked per-message.

3) spamc has a built-in message size limit, so you'll avoid scanning messages with large attachments that are unlikely to be spam anyway.

> http://www.rulesemporium.com/rules/bigevil.cf

Matt Y already pointed this out, but just to underline it, bigevil will waste TRULY massive amounts of resources on your system. Even the author of bigevil (Chris S.) strongly recommends that nobody use it, and if you go to the website now, it's been deleted to prevent anyone from using it anymore.

You should easily cut 30MB or more off the size of your processes if you remove bigevil.

In general it looks like you downloaded every optional ruleset in the world and added it to your configuration before you started off. I would strongly discourage taking that kind of approach with any kind of server application, and it's especially true for spamassassin.

Start off running SA without *ANY* add-on rulesets, then start adding them a few at a time. This way, if you add a bloated ruleset like bigevil, the cause of the problem is immediately obvious. Be very wary of any ruleset which has a .cf file that's greater than 64k in size.

Matt Y's comments on duplicated rulesets (such as antidrug.cf, and having both the pre and post 2.5x versions of several rulesets) are also valid.
> Q) With spamassassin (and all the above info) you need about 20 to 30 seconds > per email message and LOTS of RAM and CPU: >a) TRUE >b) FALSE a) TRUE, due to misconfiguration. With some tuning based on the tips above, this will readily change to b) FALSE.
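The 64k rule of thumb above is easy to check mechanically. Here is a small sketch that lists the .cf files in a rules directory that exceed the limit; it assumes "64k" means 64 KiB, and the function name and the directory in the usage note are only examples.

```python
from pathlib import Path

SIZE_LIMIT = 64 * 1024  # "64k" read as 64 KiB (an assumption)


def oversized_rulesets(rules_dir, limit=SIZE_LIMIT):
    """Return sorted (filename, size_in_bytes) for .cf files above `limit`."""
    return sorted(
        (path.name, path.stat().st_size)
        for path in Path(rules_dir).glob("*.cf")
        if path.stat().st_size > limit
    )
```

Calling oversized_rulesets("/etc/mail/spamassassin") would be a typical use, adjusted to wherever your add-on rules actually live. A large file is only a hint that a ruleset deserves a closer look, not proof that it is the problem.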
Re: Simple question TRUE or FALSE (More data to answer this question)
jdow writes:

> You are using larger chunks of VIRT than I am. I use about 60M where
> you are using 98M. I run with "--max-conn-per-child=15". You win a
> little if you either add RAM or cut down to "-m2" or "-m3". You do
> have a fair amount of cache in use. Once that happens you flounder
> around in cache swapping when running spamassassin.

the fundamental problem is that he's not using spamd.

rule of thumb: if you see performance issues, and you're not using spamd, STOP RIGHT THERE and start using spamd ;)

--j.
Re: Simple question TRUE or FALSE (More data to answer this question)
> Q) With spamassassin (and all the above info) you need about 20 to 30 > seconds per email message and LOTS of RAM and CPU: > a) TRUE > b) FALSE My answer is b), False. I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that server usually takes 2 or 3 seconds per message. Like already posted, some of your rulesets are unnecessary because they are included in SA (standard rulesets or SURBL). Did you check 'cat messages | spamassassin -D' to see what part takes most time? DNS time-outs can take a lot of time for example (also checkable with tcpdump port 53). Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail but I use postfix (and amavisd-new) and I think it's quite memory and CPU efficient. Menno van Bennekom
Re: Simple question TRUE or FALSE (More data to answer this question)
From: "David Velásquez Restrepo" <[EMAIL PROTECTED]>

> Software:
> --
> A Perl script which takes a file and tests it using Mail::SpamAssassin to
> get its spam score level
> OS: gentoo 2005.0
> MTA: postfix
>
> SpamAssassin:
> --
> Using: Net test, Bayes, Razor2, DCC, Pyzor, SPF Test (and everything else
> suggested by spamassassin)
> Rules:
> rules_du_jour:
> http://www.rulesemporium.com/rules/99_FVGT_Tripwire.cf
> http://www.rulesemporium.com/rules/bigevil.cf
> http://mywebpages.comcast.net/mkettler/sa/antidrug.cf
> http://www.rulesemporium.com/rules/evilnumbers.cf
> http://www.stearns.org/sa-blacklist/sa-blacklist.current
> http://www.stearns.org/sa-blacklist/sa-blacklist.current.uri.cf
> http://www.stearns.org/sa-blacklist/random.current.cf
> http://www.timj.co.uk/linux/bogus-virus-warnings.cf
> http://www.rulesemporium.com/rules/70_sare_adult.cf
> http://www.rulesemporium.com/rules/99_sare_fraud_post25x.cf
> http://www.rulesemporium.com/rules/99_sare_fraud_pre25x.cf
> http://www.rulesemporium.com/rules/72_sare_bml_post25x.cf
> http://www.rulesemporium.com/rules/71_sare_bml_pre25x.cf
> http://www.rulesemporium.com/rules/70_sare_ratware.cf
> http://www.rulesemporium.com/rules/70_sare_spoof.cf
> http://www.rulesemporium.com/rules/70_sare_bayes_poison_nxm.cf
> http://www.rulesemporium.com/rules/70_sare_oem.cf
> http://www.rulesemporium.com/rules/70_sare_random.cf
> http://www.rulesemporium.com/rules/70_sare_header.cf
> http://www.rulesemporium.com/rules/70_sare_html.cf
> http://www.rulesemporium.com/rules/70_sare_specific.cf
> http://www.rulesemporium.com/rules/71_sare_redirect_pre3.0.0.cf
> http://www.rulesemporium.com/rules/72_sare_redirect_post3.0.0.cf
> http://www.rulesemporium.com/rules/70_sare_uri0.cf
> http://www.rulesemporium.com/rules/70_sare_uri1.cf
> http://www.rulesemporium.com/rules/70_sare_uri2.cf
> http://www.rulesemporium.com/rules/70_sare_uri3.cf
> http://www.rulesemporium.com/rules/70_sare_uri_eng.cf
> http://www.rulesemporium.com/rules/70_sare_uri_arc.cf
>
> Runtime:
> --
> 4 processes in parallel mode
>
> Hardware:
> --
> Intel Pentium III - 1ghz - 512RAM (pci133)
>
> top:
> ---
> top - 23:03:27 up 10:39,  2 users,  load average: 5.47, 5.35, 5.19
> Tasks:  62 total,   2 running,  60 sleeping,   0 stopped,   0 zombie
> Cpu(s): 93.7% us,  5.7% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.6% hi,  0.0% si
> Mem:    514036k total,   490044k used,    23992k free,     6892k buffers
> Swap:   987988k total,    49672k used,   938316k free,    38012k cached
>
>   PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
> 27220 xmail  19   0 98680  71m 3064 R 99.9 14.2  2:38.51 /progs/xmail/bin/mx_parser/mx_parser.pl - 1
> 27603 xmail  15   0  100m  95m 3064 S 36.8 19.0  2:06.76 /progs/xmail/bin/mx_parser/mx_parser.pl - 5
> 28171 xmail  16   0 93604  87m 3064 D 28.9 17.4  1:11.20 /progs/xmail/bin/mx_parser/mx_parser.pl - 4
> 27516 xmail  17   0 94644  88m 3064 D 13.1 17.6  2:03.70 /progs/xmail/bin/mx_parser/mx_parser.pl - 2
> 27308 xmail  18   0 97960  73m 3064 D 10.5 14.5  2:35.46 /progs/xmail/bin/mx_parser/mx_parser.pl - 3
>
> So, here it goes again the "simple", but not short, question:
> Q) With spamassassin (and all the above info) you need about 20 to 30
> seconds per email message and LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

Given the way you phrase that belligerent assertion I am tempted to simply answer "true" and leave you floundering. It is obvious that, the way you have it configured, you're going to take 20-30 seconds, so the obvious answer is "true", for you.

Now, if you asked, "Am I doing something wrong?" and approached it from that direction, you might discover you can run tests in about 5 to 7 seconds each on your machine. I'll be presumptuous and figure this is what you really mean.

For the run times you cite you may have a BL configuration problem, such as trying to use a dead BL somewhere. One other thing that can cause this is a DNS problem.

You are using larger chunks of VIRT than I am. I use about 60M where you are using 98M. I run with "--max-conn-per-child=15". You win a little if you either add RAM or cut down to "-m2" or "-m3". You do have a fair amount of cache in use. Once that happens you flounder around in cache swapping when running spamassassin.

{^_^}
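To put the memory remark in numbers, here is a minimal Python sketch that adds up the RES column of the five mx_parser.pl processes from the quoted top output and compares the sum with the machine's total RAM. The helper name `to_kib` is made up for illustration. RES counts shared pages once per process, so the total is an upper bound rather than a measurement.

```python
def to_kib(field: str) -> int:
    """Convert a top memory field such as '71m', '514036k' or '98680' to KiB."""
    if field.endswith("m"):
        return int(float(field[:-1]) * 1024)
    if field.endswith("k"):
        return int(field[:-1])
    return int(field)  # top prints unsuffixed values in KiB


# RES values and total RAM copied from the top output quoted above.
res_fields = ["71m", "95m", "87m", "88m", "73m"]
total_kib = to_kib("514036k")

res_kib = sum(to_kib(f) for f in res_fields)
share = res_kib / total_kib

print(f"resident: {res_kib / 1024:.0f} MiB of {total_kib / 1024:.0f} MiB ({share:.0%})")
# prints: resident: 414 MiB of 502 MiB (82%)
```

With roughly four fifths of RAM nominally resident in the scanner processes, there is little room left for the page cache, which is consistent with the advice to add RAM or run fewer children.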
Re: Simple question TRUE or FALSE
David Velásquez Restrepo wrote:
> Hi,
>
> I have been using SpamAssassin for a long time to review a lot (a lot!)
> of incoming mail. Today I have a machine dedicated to running
> SpamAssassin, due to its high CPU and memory requirements. Just to be
> clear (maybe I have something misconfigured), the question is:
>
> Q) With spamassassin you need about 20 to 30 seconds per email message and
> LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

FALSE. My SA runs on a Pentium IV 3 GHz system with 512 MB. The average processing time per email over the last 100,000 or so emails is 2.8 seconds.
Re: Simple question TRUE or FALSE
David,

It depends on what you call lots of RAM, CPU, etc.

My old scanner took about 5 seconds to scan an email with SA (URI RBLs, Bayes, two normal RBLs, lots of extra SARE rules, etc.), Sophos, ClamAV and the extra checks MailScanner does, and to dump the email into a MySQL DB for reports. Since emails would normally be batched up into a few messages (2-5 on average), it is difficult to get a single-email timing. That was a 500 MHz Celeron with 512 MB RAM and an IDE disk. It would top out at about 17,000 messages per day with an average size of 26 KB.

The new scanner (P4 2.8 GHz, 1.5 GB RAM, SATA disk) takes around 2 seconds per average batch and tops out at around 70,000 messages per day (without much O/S tuning).

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

David Velásquez Restrepo wrote:
> Hi,
>
> I have been using SpamAssassin for a long time to review a lot (a lot!)
> of incoming mail. Today I have a machine dedicated to running
> SpamAssassin, due to its high CPU and memory requirements. Just to be
> clear (maybe I have something misconfigured), the question is:
>
> Q) With spamassassin you need about 20 to 30 seconds per email message and
> LOTS of RAM and CPU:
> a) TRUE
> b) FALSE
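As a rough cross-check of the daily ceilings quoted above, the following Python sketch converts messages per day into the average number of wall-clock seconds available per message. It assumes mail arrives evenly around the clock, which real traffic does not, and the function name is invented for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_message(messages_per_day: int) -> float:
    """Average wall-clock seconds available per message at a daily volume."""
    return SECONDS_PER_DAY / messages_per_day


old_box = seconds_per_message(17_000)  # 500 MHz Celeron ceiling from the post
new_box = seconds_per_message(70_000)  # 2.8 GHz P4 ceiling from the post

print(f"old scanner: {old_box:.2f} s per message")  # prints: old scanner: 5.08 s per message
print(f"new scanner: {new_box:.2f} s per message")  # prints: new scanner: 1.23 s per message
```

The figure for the old box is close to the reported scan time of about 5 seconds, which is what you would expect if that machine was scanning more or less continuously at its limit.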
Re: Simple question TRUE or FALSE
At 06:00 19-5-2005, Justin Mason wrote:
>> Memory usage can be quite huge if you have many custom rulesets, because SA
>> 3.0.x forks into several processes which all insist on making their own
>> copy of the ruleset in memory :( When I still used the RDJ bigevil list
>> (amongst others), it would use 96 MB of memory for each SA process.
>
> actually, most of this *is* shared, it's just that linux can no longer
> report this accurately.

What makes you think that? Total used memory on my system is consistent with the SpamAssassin processes not sharing any significant amount of memory. Also, it reports memory sharing just fine for applications such as Apache?
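One way to look at the sharing question directly, on Linux kernels that provide /proc/<pid>/smaps (an assumption; it may not exist on the systems discussed in this thread), is to total the Shared_* and Private_* fields for a scanner process. The Python sketch below only parses text in that format; the function name and the sample numbers are invented for illustration.

```python
import re

# Matches lines such as "Private_Dirty:      50 kB" from /proc/<pid>/smaps.
FIELD = re.compile(r"^(Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+) kB$")


def shared_vs_private(smaps_text: str) -> tuple[int, int]:
    """Return (shared_kb, private_kb) summed over all mappings."""
    shared = private = 0
    for line in smaps_text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue
        if match.group(1).startswith("Shared"):
            shared += int(match.group(2))
        else:
            private += int(match.group(2))
    return shared, private


# Invented sample in smaps format, not taken from a real process.
sample = """Rss:                 100 kB
Shared_Clean:         40 kB
Shared_Dirty:          0 kB
Private_Clean:        10 kB
Private_Dirty:        50 kB
"""
print(shared_vs_private(sample))  # prints: (40, 60)
```

For a live process the input would be the contents of `/proc/<pid>/smaps`, read with sufficient privileges. A large private total per child would support the "own copy of the ruleset" reading; a large shared total would support the "mostly shared" reading.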
Re: Simple question TRUE or FALSE (More data to answer this question)
use spamd.

--j.
Re: Simple question TRUE or FALSE (More data to answer this question)
Hi David,

A few quick tips to help performance...

David Velásquez Restrepo said:

SNIP

> http://www.rulesemporium.com/rules/bigevil.cf

Do not, I repeat, do not use this file; it grew way too big. This type of test is better handled by SURBL.

> http://mywebpages.comcast.net/mkettler/sa/antidrug.cf

If you are running SA >= 3.0.0, antidrug is built in to SA.

> http://www.stearns.org/sa-blacklist/sa-blacklist.current
> http://www.stearns.org/sa-blacklist/sa-blacklist.current.uri.cf

Might want to drop these as well in favor of SURBL tests, at least the uri version.

> http://www.rulesemporium.com/rules/99_sare_fraud_post25x.cf
> http://www.rulesemporium.com/rules/99_sare_fraud_pre25x.cf

Depending on your SA version, run only one of the above rulesets.

> http://www.rulesemporium.com/rules/72_sare_bml_post25x.cf
> http://www.rulesemporium.com/rules/71_sare_bml_pre25x.cf

Depending on your SA version, run only one of the above rulesets.

> http://www.rulesemporium.com/rules/71_sare_redirect_pre3.0.0.cf
> http://www.rulesemporium.com/rules/72_sare_redirect_post3.0.0.cf

Depending on your SA version, run only one of the above rulesets.

SNIP

Are you running a caching DNS server? A caching nameserver will help quite a bit with the net tests.

> So, here it goes again the "simple", but not short, question:
> Q) With spamassassin (and all the above info) you need about 20 to 30
> seconds per email message and LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

Correct the above items and see how it runs after the changes.

Cheers,
matt
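The pruning advice above can be sketched as a small Python filter over the rule URLs. This is an illustration only: the function name is invented, and reading "pre25x"/"post25x" as "before/from SA 2.5x" and "pre3.0.0"/"post3.0.0" as "before/from SA 3.0.0" is an assumption based on the file names. It drops only the uri variant of sa-blacklist, the one the message singles out.

```python
def prune_rulesets(urls, sa_version):
    """Return the rule URLs worth keeping for a SpamAssassin version tuple."""
    keep = []
    for url in urls:
        name = url.rsplit("/", 1)[-1]
        if name in ("bigevil.cf", "sa-blacklist.current.uri.cf"):
            continue  # advice above: use SURBL tests instead
        if name == "antidrug.cf" and sa_version >= (3, 0, 0):
            continue  # built in from SA 3.0.0 on
        # Assumed meaning of the pre/post suffixes; keep one file of each pair.
        if "_pre25x" in name and sa_version >= (2, 5, 0):
            continue
        if "_post25x" in name and sa_version < (2, 5, 0):
            continue
        if "_pre3.0.0" in name and sa_version >= (3, 0, 0):
            continue
        if "_post3.0.0" in name and sa_version < (3, 0, 0):
            continue
        keep.append(url)
    return keep


base = "http://www.rulesemporium.com/rules/"
rules = [
    base + "bigevil.cf",
    "http://mywebpages.comcast.net/mkettler/sa/antidrug.cf",
    base + "99_sare_fraud_post25x.cf",
    base + "99_sare_fraud_pre25x.cf",
    base + "71_sare_redirect_pre3.0.0.cf",
    base + "72_sare_redirect_post3.0.0.cf",
    base + "70_sare_adult.cf",
]
for url in prune_rulesets(rules, (3, 0, 3)):
    print(url)
```

For a 3.0.x install this keeps the post25x and post3.0.0 files plus the unpaired rulesets, and drops bigevil, antidrug and the pre-version variants.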
Re: Simple question TRUE or FALSE
Marcel Veldhuizen writes:
> At 03:25 19-5-2005, David Velásquez Restrepo wrote:
>
> >Q) With spamassassin you need about 20 to 30 seconds per email message and
> >LOTS of RAM and CPU:
> >a) TRUE
> >b) FALSE
>
> False. It depends on your settings and custom rulesets, but scanning a
> single message takes about 4-5 seconds on Athlon 800 home box. Of course,
> suppose it would be scanning 10 messages in parallel, it would take
> 'longer' per message.
>
> Memory usage can be quite huge if you have many custom rulesets, because SA
> 3.0.x forks into several processes which all insist on making their own
> copy of the ruleset in memory :( When I still used the RDJ bigevil list
> (amongst others), it would use 96 MB of memory for each SA process.

Actually, most of this *is* shared; it's just that Linux can no longer report this accurately.

FALSE, anyway. As Marcel notes, 20 seconds is waaay too long.

--j.
Re: Simple question TRUE or FALSE (More data to answer this question)
Software:
--
A Perl script which takes a file and tests it using Mail::SpamAssassin to get its spam score level
OS: gentoo 2005.0
MTA: postfix

SpamAssassin:
--
Using: Net test, Bayes, Razor2, DCC, Pyzor, SPF Test (and everything else suggested by spamassassin)
Rules:
rules_du_jour:
http://www.rulesemporium.com/rules/99_FVGT_Tripwire.cf
http://www.rulesemporium.com/rules/bigevil.cf
http://mywebpages.comcast.net/mkettler/sa/antidrug.cf
http://www.rulesemporium.com/rules/evilnumbers.cf
http://www.stearns.org/sa-blacklist/sa-blacklist.current
http://www.stearns.org/sa-blacklist/sa-blacklist.current.uri.cf
http://www.stearns.org/sa-blacklist/random.current.cf
http://www.timj.co.uk/linux/bogus-virus-warnings.cf
http://www.rulesemporium.com/rules/70_sare_adult.cf
http://www.rulesemporium.com/rules/99_sare_fraud_post25x.cf
http://www.rulesemporium.com/rules/99_sare_fraud_pre25x.cf
http://www.rulesemporium.com/rules/72_sare_bml_post25x.cf
http://www.rulesemporium.com/rules/71_sare_bml_pre25x.cf
http://www.rulesemporium.com/rules/70_sare_ratware.cf
http://www.rulesemporium.com/rules/70_sare_spoof.cf
http://www.rulesemporium.com/rules/70_sare_bayes_poison_nxm.cf
http://www.rulesemporium.com/rules/70_sare_oem.cf
http://www.rulesemporium.com/rules/70_sare_random.cf
http://www.rulesemporium.com/rules/70_sare_header.cf
http://www.rulesemporium.com/rules/70_sare_html.cf
http://www.rulesemporium.com/rules/70_sare_specific.cf
http://www.rulesemporium.com/rules/71_sare_redirect_pre3.0.0.cf
http://www.rulesemporium.com/rules/72_sare_redirect_post3.0.0.cf
http://www.rulesemporium.com/rules/70_sare_uri0.cf
http://www.rulesemporium.com/rules/70_sare_uri1.cf
http://www.rulesemporium.com/rules/70_sare_uri2.cf
http://www.rulesemporium.com/rules/70_sare_uri3.cf
http://www.rulesemporium.com/rules/70_sare_uri_eng.cf
http://www.rulesemporium.com/rules/70_sare_uri_arc.cf

Runtime:
--
4 processes in parallel mode

Hardware:
--
Intel Pentium III - 1ghz - 512RAM (pci133)

top:
---
top - 23:03:27 up 10:39,  2 users,  load average: 5.47, 5.35, 5.19
Tasks:  62 total,   2 running,  60 sleeping,   0 stopped,   0 zombie
Cpu(s): 93.7% us,  5.7% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.6% hi,  0.0% si
Mem:    514036k total,   490044k used,    23992k free,     6892k buffers
Swap:   987988k total,    49672k used,   938316k free,    38012k cached

  PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
27220 xmail  19   0 98680  71m 3064 R 99.9 14.2  2:38.51 /progs/xmail/bin/mx_parser/mx_parser.pl - 1
27603 xmail  15   0  100m  95m 3064 S 36.8 19.0  2:06.76 /progs/xmail/bin/mx_parser/mx_parser.pl - 5
28171 xmail  16   0 93604  87m 3064 D 28.9 17.4  1:11.20 /progs/xmail/bin/mx_parser/mx_parser.pl - 4
27516 xmail  17   0 94644  88m 3064 D 13.1 17.6  2:03.70 /progs/xmail/bin/mx_parser/mx_parser.pl - 2
27308 xmail  18   0 97960  73m 3064 D 10.5 14.5  2:35.46 /progs/xmail/bin/mx_parser/mx_parser.pl - 3

So, here it goes again the "simple", but not short, question:

Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
a) TRUE
b) FALSE
Re: Simple question TRUE or FALSE
> The question is:

No, the questionS ARE:

> Q) With spamassassin you need about 20 to 30 seconds per email message
> b) FALSE
> and LOTS of RAM
> a) TRUE
> and LOTS of CPU:
> b) FALSE
Re: Simple question TRUE or FALSE
From: "Marcel Veldhuizen" <[EMAIL PROTECTED]>

> At 03:25 19-5-2005, David Velásquez Restrepo wrote:
>
> >Q) With spamassassin you need about 20 to 30 seconds per email message and
> >LOTS of RAM and CPU:
> >a) TRUE
> >b) FALSE
>
> False. It depends on your settings and custom rulesets, but scanning a
> single message takes about 4-5 seconds on Athlon 800 home box. Of course,
> suppose it would be scanning 10 messages in parallel, it would take
> 'longer' per message.

[JDOW>>] Trust me on this one: a run takes incredibly longer on a 66 MHz Pentium with 256 MB of memory. I've seen it take long enough to time out sendmail.

{^_-}
Re: Simple question TRUE or FALSE
From: "David Velásquez Restrepo" <[EMAIL PROTECTED]>

> Hi,
>
> I have been using SpamAssassin for a long time to review a lot (a lot!)
> of incoming mail. Today I have a machine dedicated to running
> SpamAssassin, due to its high CPU and memory requirements. Just to be
> clear (maybe I have something misconfigured), the question is:
>
> Q) With spamassassin you need about 20 to 30 seconds per email message and
> LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

c) IT DEPENDS

How much memory do you have? How fast is the machine? How many spamd processes are running? How many rule sets are running? Are you using spamc and spamd or simply spamassassin itself? Is DNS set up properly? Are you using BLs? Yatta and more yatta.

With 3.02 and a HUGE bundle of SARE rules, on a 1 GiB machine running at 2 GHz, I get these times processing one of my sample spams through the spamc route:

0.00user 0.00system 0:02.91elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+196minor)pagefaults 0swaps

Without the "IT DEPENDS" clause this question is the same as asking a poor sod if he has stopped beating his wife yet. It presumes a state not in evidence.

{^_^}
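The measurement above can be repeated with a few lines of Python. The sketch below times any command that reads a message on standard input, the way spamc does; `time_command` is a name invented for this example, and the demonstration call runs a no-op Python process so that the block works on machines without spamd installed.

```python
import subprocess
import sys
import time


def time_command(cmd, stdin_bytes=b""):
    """Run cmd, feed it stdin_bytes, and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd,
        input=stdin_bytes,
        stdout=subprocess.DEVNULL,  # the rewritten message is not needed here
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode


# Stand-in command so the sketch runs anywhere; swap in ["spamc"] on a real box.
elapsed, rc = time_command([sys.executable, "-c", "pass"])
print(f"elapsed: {elapsed:.2f}s, exit code {rc}")
```

On a host running spamd, calling `time_command(["spamc"], message_bytes)` with the raw bytes of a sample message gives the same elapsed wall-clock figure that the `time` output reports. Averaging over many different messages gives a fairer number than a single run.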
Re: Simple question TRUE or FALSE
On Wed, May 18, 2005 at 08:25:53PM -0500, David Velásquez Restrepo wrote:
> Q) With spamassassin you need about 20 to 30 seconds per email message and
> LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

Can't answer this question with the information provided. As a general answer, though: b, due to the "and". Usually, 20-30 seconds means you're having network timeout issues, or you have an overloaded/underpowered machine. "LOTS" could mean anything, but generally as much memory/CPU as possible is a good idea.

--
Randomly Generated Tagline:
"Hey, you know what'd cheer you up? You should get yourself a puppy." -Amy
"A puppy? Nibbler loved to eat puppies" -Leela
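Since network timeouts are named as the usual cause, a quick check is to time name lookups from the scanning host. The Python sketch below is a rough illustration: the function names are invented, it uses the system resolver through `socket.gethostbyname`, and the list of names to test is left to the reader. Failed lookups are timed as well, because a lookup that fails slowly is exactly the symptom of a dead list or a broken resolver.

```python
import socket
import time


def time_lookups(names, resolve=socket.gethostbyname):
    """Time one lookup per name; return {name: (seconds, resolved_ok)}."""
    results = {}
    for name in names:
        start = time.perf_counter()
        try:
            resolve(name)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results[name] = (time.perf_counter() - start, ok)
    return results


def slow_names(results, threshold=1.0):
    """Names whose lookup took at least `threshold` seconds, failed or not."""
    return sorted(name for name, (secs, _ok) in results.items() if secs >= threshold)
```

Calling `slow_names(time_lookups([...]))` with the hostnames of the lists in use shows which lookups are stalling. Anything that takes whole seconds will add directly to the per-message scan time whenever network tests are enabled.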
Re: Simple question TRUE or FALSE
At 03:25 19-5-2005, David Velásquez Restrepo wrote:
> Q) With spamassassin you need about 20 to 30 seconds per email message and
> LOTS of RAM and CPU:
> a) TRUE
> b) FALSE

False. It depends on your settings and custom rulesets, but scanning a single message takes about 4-5 seconds on an Athlon 800 home box. Of course, if it were scanning 10 messages in parallel, it would take 'longer' per message.

Memory usage can be quite huge if you have many custom rulesets, because SA 3.0.x forks into several processes which all insist on making their own copy of the ruleset in memory :( When I still used the RDJ bigevil list (amongst others), it would use 96 MB of memory for each SA process. Now that I've trashed bigevil and am using URIDNSBL instead, each process uses about 32 MB of memory.
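The per-process figures above translate into how many scanner processes fit in RAM. The Python sketch below is back-of-the-envelope arithmetic under the pessimistic assumption that nothing is shared between processes; the function name and the 128 MB reserve for the OS and other services are invented for illustration.

```python
def max_children(ram_mb: int, per_child_mb: int, reserve_mb: int = 128) -> int:
    """Scanner processes that fit in RAM if no memory is shared between them."""
    return max(0, (ram_mb - reserve_mb) // per_child_mb)


with_bigevil = max_children(512, 96)     # 96 MB per process, as reported above
without_bigevil = max_children(512, 32)  # 32 MB per process after dropping it

print(with_bigevil, without_bigevil)  # prints: 4 12
```

On a 512 MB machine, dropping the oversized ruleset triples the number of processes that fit before swapping starts, under these assumptions.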
Simple question TRUE or FALSE
Hi,

I have been using SpamAssassin for a long time to review a lot (a lot!) of incoming mail. Today I have a machine dedicated to running SpamAssassin, due to its high CPU and memory requirements. Just to be clear (maybe I have something misconfigured), the question is:

Q) With spamassassin you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
a) TRUE
b) FALSE