Re: Question regarding meta rule handling
On Wed, Aug 03, 2005 at 08:18:16AM +0200, Sven Riedel wrote:
> header __X Content-Type =~ /^(message|multipart)/i
> rawbody __Y /\S/
> meta Z ( !X && !Y )
>
> and yet the rule triggers for me. Doing a

Of course. __X != X ... :)
Question regarding meta rule handling
Hi,

a while back someone kindly posted a rule here that matches on empty mails:

header __X Content-Type =~ /^(message|multipart)/i
rawbody __Y /\S/
meta Z ( !X && !Y )

Now I find that Z matches on all mails - investigation shows that Y matches on all non-whitespace, as it should, and X doesn't match. So I would assume that

( !0 && !1 ) = ( 1 && 0 ) = 0

and yet the rule triggers for me. Doing a

spamassassin -t -D < testmessage

doesn't show me anything useful about why this rule triggers. Any ideas?

Oh, and then I tried to disable the rule by assigning a score of 0 (the wiki page on writing rules states that rules with a score of 0 aren't processed). And yet this rule keeps turning up in my X-Spam-Status header. I'm a bit puzzled at this point.

Regs,
Sven
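The follow-up in this thread points out that the sub-rules are named __X and __Y while the meta rule refers to X and Y. A minimal sketch of the resulting logic, assuming that a name with no matching rule simply counts as "did not hit" (that is an assumption about SpamAssassin's behaviour, not something this thread confirms). The hits dictionary and the hit() helper are made up for illustration.

```python
# Sub-rule results as reported in the post: __X did not match, __Y did.
hits = {"__X": 0, "__Y": 1}

def hit(name):
    # Assumption: a rule name that was never defined behaves like "no hit" (0).
    return hits.get(name, 0)

# meta Z ( !X && !Y )      -> X and Y are undefined names, so both negations are true
z_as_posted = (not hit("X")) and (not hit("Y"))

# meta Z ( !__X && !__Y )  -> uses the sub-rules that were actually defined
z_corrected = (not hit("__X")) and (not hit("__Y"))

print("as posted:", z_as_posted)
print("corrected:", z_corrected)
```

Under that assumption the posted rule fires on every mail, while the version that names __X and __Y only fires when neither sub-rule hits.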
RE: Load balancing spamd
> How do you (make and) balance the calls to the AV servers? How do you (make and) balance the calls to the spamd machines? I am very interested in these details!

We just call them in order case on the connection line. On two of the 4 SMTP gateways we use node 1 as the primary and node 2 as the secondary; on the other two, just the opposite. I know this is the poor man's way of doing this, but we are lazy and haven't made our way to using something like LVS.

> We are edging up to 95K a day now on only two machines. You can imagine we are anxious to start using the other boxes we have rarin' to go!

Ironically, when we first started this we had everything running on 4 machines and it started choking. So we went with the two back ends. It choked. Then we kicked the -m from 30 to 6. 6 is a small number but it seems to be working fine. We have found for our environment that 6 to 8 works well.

> > We recently upgrade all of the hardware to Dell Dimension 4700's with 1.5gb ram each. Budget was $5200.
> >
> > Machines are idle.
>
> Sweet. ;)

And it was overall cheap.

> Why? Because your DNS costs to query your RBL list in Postfix is very heavy/slowing you down? Are you going to mirror just one chosen RBL out there or a combination of several??
>
> Do you run DCC in your SA environment? If so, you are over their recommended limit for hosting a DCC server (we are nearing it - 100K a day I think). Do you run a DCC server for yourself? Any issues to be aware of?

It's on the TODO list. Item 629 I believe... :) There are other pressing items to fix/work on. This is working great but will be readdressed during the next maintenance upgrade (which is about every 90 days).

Gary
Re: Bayes: not enough usable tokens found
Hum. I'm a little confused by that SA score stuff on the bottom of the message. If it refers to a message that should be spam you have two serious problems. If it referred to a message from this list you may have a serious problem and a less serious problem.

 pts rule name          description
---- ------------------ ------------------------------------------
-3.3 ALL_TRUSTED        Did not pass through any untrusted hosts
 0.0 HTML_30_40         BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE       BODY: HTML included in message
 0.0 HTML_TITLE_EMPTY   BODY: HTML title contains no text
-2.6 BAYES_00           BODY: Bayesian spam probability is 0 to 1%

In general ALL_TRUSTED shouldn't be firing for messages coming from an external source. This makes me wonder if you have trusted_hosts and trusted_networks set correctly.

In general SA (and especially Bayes) shouldn't be seeing this list, since it has a lot of real spam floating through it, and other spammy tokens. It is far better to use postfix or whatever your router is to bypass this list around SA.

If that header referred to a spam, BAYES_00 says that Bayes thought it was guaranteed ham. That would be a sign that you have a corrupted bayes database.

Loren
Re: Bayes: not enough usable tokens found
Mike Cavanagh wrote:
> Hum. I can see some messages are being caught via the Bayes test, but I would think Bayes would find many more as I have close to 5000 SPAM in the Bayes system. I get at most 15 messages a day flagged as SPAM while I receive approx. 100 messages a day as non-SPAM but should be flagged as SPAM. I have started to include the Spamassassin footer on all messages to get a handle on what passes in the "non-Spam" messages. Any thoughts on how to improve this would be helpful.
>
>  pts rule name          description
> ---- ------------------ ------------------------------------------
> -3.3 ALL_TRUSTED        Did not pass through any untrusted hosts

http://wiki.apache.org/spamassassin/TrustPath
OT: RBL for dynamic "no reverse DNS" lookups
I'm trying to find an RBL which will return a standard RBL return code (like "127.0.0.2") if/when the IP passed to the RBL doesn't have a reverse DNS entry.

(1) I know that SA doesn't have a need for this as another function is already available in SA for this. But I need this for a **different** utility, not SA (which is why I said, "OT").

(2) This other utility doesn't have the option to check for "no reverse DNS", but CAN do whatever general RBL lookups I tell it to do. Also, I don't have access to this utility's source code. However, if I can find this kind of RBL I mentioned, then I can use this utility's RBL lookups against that kind of RBL to accomplish checking a message's sending server for "no reverse DNS". But, again, doing lookups on (reversedIP).in-addr.arpa is NOT an option in this utility because it **only** works with the traditional RBL responses, which are always numeric, unlike reverse DNS lookups.

(3) I know that some aggressive RBLs factor in "no reverse DNS"... but, instead, I'm looking for an RBL which would do a DYNAMIC lookup to see if there is "no reverse DNS", even if that RBL hasn't checked that IP before or hasn't previously added that IP to its "no reverse DNS" nameserver database.

(4) And, of course, I understand that it is NOT a good idea to block **solely** due to a sending server's IP not having a reverse DNS lookup. Rather, I'm using this for auditing, testing, and other things.

Thanks,
Rob McEwen
PowerView Systems
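To make the distinction in point (2) concrete, here is a rough sketch of the two query names involved. A traditional RBL lookup asks for an address record under the list's zone and gets a numeric answer such as 127.0.0.2, while a reverse DNS lookup asks for a PTR record under in-addr.arpa and gets a host name back. The zone dnsbl.example.org and the sample address are placeholders, not a real list, and nothing here touches the network.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Name a traditional RBL lookup would resolve: reversed octets + list zone."""
    addr = ipaddress.IPv4Address(ip)          # validates the address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def ptr_query_name(ip):
    """Name a reverse DNS (PTR) lookup would resolve for the same address."""
    return ipaddress.IPv4Address(ip).reverse_pointer

# Hypothetical zone and documentation-range address, for illustration only.
print(dnsbl_query_name("192.0.2.25", "dnsbl.example.org"))
print(ptr_query_name("192.0.2.25"))
```

Both names start with the same reversed octets; only the suffix and the record type of the answer differ, which is why a list that answers under its own zone could in principle stand in for the PTR check.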
Re: Bayes: not enough usable tokens found
Hum. I can see some messages are being caught via the Bayes test, but I would think Bayes would find many more as I have close to 5000 SPAM in the Bayes system. I get at most 15 messages a day flagged as SPAM while I receive approx. 100 messages a day as non-SPAM but should be flagged as SPAM. I have started to include the Spamassassin footer on all messages to get a handle on what passes in the "non-Spam" messages. Any thoughts on how to improve this would be helpful.

Thanks,
Mike

Loren Wilton wrote:
> > What does this message mean??
> > debug: cannot use bayes on this message; not enough usable tokens found
> > debug: bayes: not scoring message, returning undef
>
> Unless you are seeing this a whole lot, I don't think you are doing anything wrong. I think this just means that the particular mail didn't much match anything Bayes had seen before, so it didn't feel competent to assign a score to it. I would have expected that to be a bayes_50 case, but it looks like it just decided to bypass the message.
>
> Loren

Spam detection software, running on the system "fred.5cs.com", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see [EMAIL PROTECTED] for details.

Content preview: Hum. I can see some messages are being caught via the Bayes test, but I would think Bayes would find many more as I have close to 5000 SPAM in the Bayes system. I get at most 15 messages a day flagged as SPAM while I receive approx. 100 messages a day as non-SPAM but should be flagged as SPAM. [...]

Content analysis details: (-5.9 points, 10.0 required)

 pts rule name          description
---- ------------------ ------------------------------------------
-3.3 ALL_TRUSTED        Did not pass through any untrusted hosts
 0.0 HTML_30_40         BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE       BODY: HTML included in message
 0.0 HTML_TITLE_EMPTY   BODY: HTML title contains no text
-2.6 BAYES_00           BODY: Bayesian spam probability is 0 to 1%
                        [score: 0.]
Re: Bayes: not enough usable tokens found
> What does this message mean??
> debug: cannot use bayes on this message; not enough usable tokens found
> debug: bayes: not scoring message, returning undef

Unless you are seeing this a whole lot, I don't think you are doing anything wrong. I think this just means that the particular mail didn't much match anything Bayes had seen before, so it didn't feel competent to assign a score to it. I would have expected that to be a bayes_50 case, but it looks like it just decided to bypass the message.

Loren
RE: Load balancing spamd
--- "Gary W. Smith" <[EMAIL PROTECTED]> wrote:

> We have 4 front end servers running postfix. These servers call an AV process on two additional AV servers behind the wall. Then these servers

Does "these" mean the AV servers call spamd, or does it go back to the MTA first?

How do you (make and) balance the calls to the AV servers? How do you (make and) balance the calls to the spamd machines? I am very interested in these details!

> call spamd on two additional servers behind the wall. Those two servers have a simple MySQL cluster (running Linux-HA and DRBD).
>
> In all we have 8 boxes that handle all of our email for our clients. We are generating about 170k emails per day coming into the network.

We are edging up to 95K a day now on only two machines. You can imagine we are anxious to start using the other boxes we have rarin' to go!

> We recently upgrade all of the hardware to Dell Dimension 4700's with 1.5gb ram each. Budget was $5200.
>
> Machines are idle.

Sweet. ;)

> Something new we have been looking at as well. We are looking at setting up simple relays that will run RBL on the front end and then just hand them off to our 4 backend servers. But since it works right now we're not going to fix it.

Why? Because your DNS costs to query your RBL list in Postfix is very heavy/slowing you down? Are you going to mirror just one chosen RBL out there or a combination of several??

Do you run DCC in your SA environment? If so, you are over their recommended limit for hosting a DCC server (we are nearing it - 100K a day I think). Do you run a DCC server for yourself? Any issues to be aware of?

Thanks a TON!!
Bayes: not enough usable tokens found
What does this message mean??

debug: cannot use bayes on this message; not enough usable tokens found
debug: bayes: not scoring message, returning undef

I am using MimeDefang Ver. 2.52 and SpamAssassin Ver. 3.0.4

Below is:
current status of bayes database (sa-learn --dump=magic)
sa-mimedefang.cf
spamassassin --lint --debug

What am I doing wrong? I am sure this is something simple, I just can't seem to see it.

Thanks,
Mike.

* SA-LEARN Status:

/usr/local/bin/sa-learn --username=mimedefang --siteconfigpath=/etc/mail/spamassassin --dump=magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       4275          0  non-token data: nspam
0.000          0        765          0  non-token data: nham
0.000          0     148928          0  non-token data: ntokens
0.000          0 1120235107          0  non-token data: oldest atime
0.000          0 1123040192          0  non-token data: newest atime
0.000          0 1123030366          0  non-token data: last journal sync atime
0.000          0 1123000571          0  non-token data: last expiry atime
0.000          0    2764800          0  non-token data: last expire atime delta
0.000          0       2580          0  non-token data: last expire reduction count

* Sa-mimedefang.cf:

required_hits 10
ok_locales en, zh
skip_rbl_checks 0 # Go ahead and check anyways
use_bayes 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 12.0
bayes_learn_during_report 1
bayes_path /etc/mail/spamassassin/bayes
bayes_file_mode 0700
bayes_min_ham_num 200
bayes_min_spam_num 200
bayes_use_hapaxes 1
bayes_use_chi2_combining 1
bayes_auto_expire 1
bayes_learn_to_journal 0
bayes_journal_max_size 102400
use_dcc 1
use_pyzor 1
use_razor2 1

* Spamassassin Lint:

spamassassin -D --lint --siteconfigpath=/etc/mail/spamassassin
debug: SpamAssassin version 3.0.4
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/ccs/bin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/opt/sfw/bin', keeping.
debug: Final PATH set to: /usr/sbin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/opt/sfw/bin
debug: diag: module not installed: DBI ('require' failed)
debug: diag: module installed: DB_File, version 1.811
debug: diag: module installed: Digest::SHA1, version 2.07
debug: diag: module installed: IO::Socket::UNIX, version 1.21
debug: diag: module installed: MIME::Base64, version 3.03
debug: diag: module installed: Net::DNS, version 0.46
debug: diag: module not installed: Net::LDAP ('require' failed)
debug: diag: module installed: Razor2::Client::Agent, version 2.40
debug: diag: module installed: Storable, version 2.09
debug: diag: module installed: URI, version 1.30
debug: ignore: using a test message to lint rules
debug: using "/opt/sfw/share/spamassassin" for default rules dir
debug: config: read file /opt/sfw/share/spamassassin/10_misc.cf
debug: config: read file /opt/sfw/share/spamassassin/20_anti_ratware.cf
debug: config: read file /opt/sfw/share/spamassassin/20_body_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_compensate.cf
debug: config: read file /opt/sfw/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_drugs.cf
debug: config: read file /opt/sfw/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_head_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_html_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_meta_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_phrases.cf
debug: config: read file /opt/sfw/share/spamassassin/20_porn.cf
debug: config: read file /opt/sfw/share/spamassassin/20_ratware.cf
debug: config: read file /opt/sfw/share/spamassassin/20_uri_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/23_bayes.cf
debug: config: read file /opt/sfw/share/spamassassin/25_body_tests_es.cf
debug: config: read file /opt/sfw/share/spamassassin/25_hashcash.cf
debug: config: read file /opt/sfw/share/spamassassin/25_spf.cf
debug: config: read file /opt/sfw/share/spamassassin/25_uribl.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_de.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_fr.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_nl.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_pl.cf
debug: config: read file /opt/sfw/share/spamassassin/50_scores.cf
debug: config: read file /opt/sfw/share/spamassassin/60_whitelist.cf
debug: using "/etc/mail/spamassassin
Re: Load balancing spamd
--- Charles Sprickman <[EMAIL PROTECTED]> wrote:

> On Tue, 2 Aug 2005, email builder wrote:
>
> > Technically, this should be feasible with just plain DNS load balancing, but in our current medium/low budget scenario, we don't have the rackspace to have numerous boxes that are dedicated ONLY to SA/clam, thus our desire is to figure out a way to *WEIGHT* our spamd balancing.
>
> I've been very happy with DNS load balancing. The frontend mxer runs tinydns on a local zone "blah.local.domain.com", and an instance of dnscache with the round-robin patch is pointed to in resolv.conf. While I thought that the load balancing would be a little "rough", looking at the stats I sent 17011 messages through #1, 17025 through #2, and 17016 through #3 yesterday. I can also weight this by having multiple records, ie:
>
> spamd1 gets three identical entries in tinydns
> spamd2 gets three identical entries in tinydns
> spamd3 gets three identical entries in tinydns
> spamd4 gets one entry

O, some good bits! We have always been plenty satisfied with Bind, but maybe this is the straw that broke the camel's back unless anyone knows if Bind will behave the same way if we have multiple entries for one host??

> that will leave spamd4 seeing about 1/3 the load of the other boxes. It's not "clustering", but when using the "-d" flag:
>
> -d host
>     Connect to spamd server on given host. If host resolves to multiple addresses, then spamc will fail-over to the other addresses, if the first one cannot be connected to.
>
> it should hit another box if one goes down. Or some easy scripting could remove the appropriate entries from tinydns if one machine stops responding.
>
> Speaking of low budget, we have three SA boxes, each of which has a 2GHz AMD processor, 1GB RAM. The first two cost about $550, the last one about $425. They are pretty crappy boxes with no RAID, etc., but it's cheaper for me to keep one more box than needed in the equation than to build out a few "uber spamd" boxes. They are in mini-atx cases, so they barely take up more room than an equivalent number of 1U boxes. I spawn 30 spamd children on each. I have been very happy with the performance so far.
>
> > I'm surprised there's not a lot of folks out there who have done this before?
>
> Maybe they're all cheap like me. :)

Awesome! Thanks for the advice!!!
RE: Load balancing spamd
We have 4 front end servers running postfix. These servers call an AV process on two additional AV servers behind the wall. Then these servers call spamd on two additional servers behind the wall. Those two servers have a simple MySQL cluster (running Linux-HA and DRBD).

In all we have 8 boxes that handle all of our email for our clients. We are generating about 170k emails per day coming into the network. We recently upgraded all of the hardware to Dell Dimension 4700's with 1.5gb ram each. Budget was $5200.

Machines are idle.

Something new we have been looking at as well. We are looking at setting up simple relays that will run RBL on the front end and then just hand them off to our 4 backend servers. But since it works right now we're not going to fix it.
Re: Runaway processes
> Pretty much answered in my following mail. In general each child might use 30-60mb under NORMAL circumstances, so the amount of memory on your machine will determine the upper limit for number of children.

so 8 would be max on a 512meg system (what I have). I still have free ram after firing off 15 but I'll take it back down and see what happens.

> In most cases you shouldn't really need less than about 20 connections

sounds like a place to start. thanks.

> PS: Could you post plain text rather than html if convenient?

sure

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com
Re: Load balancing spamd
On Tue, 2 Aug 2005, email builder wrote:

> Technically, this should be feasible with just plain DNS load balancing, but in our current medium/low budget scenario, we don't have the rackspace to have numerous boxes that are dedicated ONLY to SA/clam, thus our desire is to figure out a way to *WEIGHT* our spamd balancing.

I've been very happy with DNS load balancing. The frontend mxer runs tinydns on a local zone "blah.local.domain.com", and an instance of dnscache with the round-robin patch is pointed to in resolv.conf. While I thought that the load balancing would be a little "rough", looking at the stats I sent 17011 messages through #1, 17025 through #2, and 17016 through #3 yesterday. I can also weight this by having multiple records, ie:

spamd1 gets three identical entries in tinydns
spamd2 gets three identical entries in tinydns
spamd3 gets three identical entries in tinydns
spamd4 gets one entry

that will leave spamd4 seeing about 1/3 the load of the other boxes. It's not "clustering", but when using the "-d" flag:

-d host
    Connect to spamd server on given host. If host resolves to multiple addresses, then spamc will fail-over to the other addresses, if the first one cannot be connected to.

it should hit another box if one goes down. Or some easy scripting could remove the appropriate entries from tinydns if one machine stops responding.

Speaking of low budget, we have three SA boxes, each of which has a 2GHz AMD processor, 1GB RAM. The first two cost about $550, the last one about $425. They are pretty crappy boxes with no RAID, etc., but it's cheaper for me to keep one more box than needed in the equation than to build out a few "uber spamd" boxes. They are in mini-atx cases, so they barely take up more room than an equivalent number of 1U boxes. I spawn 30 spamd children on each. I have been very happy with the performance so far.

> I'm surprised there's not a lot of folks out there who have done this before?

Maybe they're all cheap like me. :)

Charles
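The weighting described in this message can be checked with a little arithmetic. The sketch below assumes each lookup picks one record uniformly at random from the full set (an idealisation of round-robin DNS; real resolver and cache behaviour may differ), and uses the three/three/three/one record counts from the message. The function name is made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# One list entry per DNS record, as described in the message.
records = ["spamd1"] * 3 + ["spamd2"] * 3 + ["spamd3"] * 3 + ["spamd4"]

def expected_share(records):
    """Expected fraction of connections per host under uniform record selection."""
    counts = Counter(records)
    total = len(records)
    return {host: Fraction(n, total) for host, n in counts.items()}

shares = expected_share(records)
for host, share in sorted(shares.items()):
    print(host, share, f"({float(share):.0%})")

# spamd4 relative to one of the fully weighted boxes
print("spamd4 vs spamd1:", shares["spamd4"] / shares["spamd1"])
```

With ten records in total, each of the first three boxes expects three tenths of the traffic and spamd4 one tenth, which is the "about 1/3 the load of the other boxes" figure quoted above.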
Re: Load balancing spamd
--- Jason Frisvold <[EMAIL PROTECTED]> wrote:

> On 8/1/05, email builder <[EMAIL PROTECTED]> wrote:
> > Even if I had forgotten the -A, I think I would have been seeing connection refused notices, but right now, it just seems to time out. I'm pretty sure this is a LVS question more than a spamc/d question, since I've no problems with the latter -- I am only asking here to see if anyone else does SA weighted load balancing.
>
> I kinda went the other way around.. I have multiple mail machines, each with their own instance of spamd. I use a Cisco 7206 VXR to do the load balancing. Works like a charm.

Wow, a bit out of our price range here. :)

We have also considered just continuing to build out MTA boxes each with an Amavis/Clamd and SA on them to share our increasing load (just use LVS to balance the incoming SMTP traffic and there is little reason to worry about balancing SA or Amavis/Clam), but our first choice is to split the "layers" -- have a couple separate machines that just do MTA-ish things, and a separate set of boxes that serve as a "SA (and Clam-av) farm". The thing that's better about doing it that way is the redundancy that you don't get if you aren't sharing spamd instances across all your MTA machines.

Technically, this should be feasible with just plain DNS load balancing, but in our current medium/low budget scenario, we don't have the rackspace to have numerous boxes that are dedicated ONLY to SA/clam, thus our desire is to figure out a way to *WEIGHT* our spamd balancing.

I'm surprised there's not a lot of folks out there who have done this before?

Thanks again!
Re: Increase Performance howto
From: "Dhanny Kosasih" <[EMAIL PROTECTED]>

> I tested my qmail with more than 14000 spam (i used qmail-inject in my script). If i use QSheff + ClamAV + SpamAssassin, my server process 14000 emails in 1 hour, and if i only use qmail my server process 14000 emails in 1/3 hours. How can i increase my server performance ? I don't understand what 'max-connection' and '-m' for, can u tell me what is that ?

If you do not already have a large amount of memory then adding memory is one of the sovereign cures for slow SpamAssassin. As soon as it goes to swap you're dead. More processor also helps. Fewer rule sets leads to poorer filtering and faster operation.

You are already processing much faster than my 1GHz Athlon which has a gigabyte of ram. With all the rules I run it takes on the order of a second and a half to scan for single messages. With multiple messages at once there is some net advantage to the multiprocessing that happens.

It may be time to split the server into two machines.

{^_^}
Re: Forwarding mail address
From: "Alexandre Cruz" <[EMAIL PROTECTED]>

> Hi all,
>
> I do understand that this can sound as a very newbie question, however i have a doubt that i can't find an answer. We are using Spamassassin with procmail/sendmail. It is working fine, however, spam mail is being forwarded for a mail account, which is no longer valid. I've been looking where this address is in the configuration, in order to forward those mails to another account, but no luck. Any suggestion?

Is fetchmail involved? If so then you might have to change contents of either /etc/fetchmailrc or that person's account's .fetchmailrc file. For the /etc/fetchmailrc case all you need to do is redirect that person's email by changing the local address stanza.

If a fetchmail is running for each account then you would need to disable that account's fetchmail startup, wherever that happens. Then add lines to some other account's .fetchmailrc to poll for and receive the mail instead.

If you are not using fetchmail you need to punt a little. The sendmail (or substitute) files might need an alias on that account. Others can suggest tactics for this case.

{^_^}
Re: Runaway processes
> is it better to run five children with 20 connections each, or 20 children with five connections each?

Pretty much answered in my following mail. In general each child might use 30-60mb under NORMAL circumstances, so the amount of memory on your machine will determine the upper limit for number of children.

In most cases you shouldn't really need less than about 20 connections (mails processed before dying) per child. If you do it may be a sign of other configuration problems in the system, such as not limiting the size of large mails going through SA.

Loren

PS: Could you post plain text rather than html if convenient? OE makes quoting from HTML a bloody pain. :-(
Re: Increase Performance howto
> I tested my qmail with more than 14000 spam (i used qmail-inject in my script). If i use QSheff + ClamAV + SpamAssassin, my server process 14000 emails in 1 hour, and if i only use qmail my server process 14000 emails in 1/3 hours. How can i increase my server performance ? I don't understand what 'max-connection' and '-m' for, can u tell me what is that ?

I just did a long reply on this subject, look at the thread 'runaway processes'.

Loren
Re: Runaway processes
> So you are running 30 per child and 6 children? 180 total. How many messages a day are you handling? I upped my children from 5 to 15 thinking that would help but it hasn't. I was thinking of taking connections down to 5 or 6 on 15 children. Maybe I have it backwards? I don't have anything else running on this computer at all so I was thinking I wanted to use up all the memory with children. Is that off?

30 connections on 6 children is a reasonable number for many smaller sites, the type that average probably less than 10K mails/day, at a guess. It should work reasonably well on the typical system with at least 500MB of memory and a 500MHz or faster processor. With a slower processor, or certainly with less memory, you might want to take the number of children down, and possibly the number of connections.

Simple description of how this stuff works: spamd fires off some number of children determined by -m, with a default of 5. Each child takes some amount of memory. This is typically 30-60MB *per child* depending on the number of rules files you have. It will start a bit smaller than that, and will typically grow over the first dozen or so mails.

If you have a lot of rules so your spamd children are taking 60MB each, 5 * 60 = 300MB. You had better have a 512MB system or larger or you will be in heap big trouble. Even at 30MB, 30 * 5 = 150MB. This would probably work in a 256MB system, but maybe not. You might want -m 3 or so in this case.

Each child will process --max-conn-per-child messages before it dies and a new child is created in its place. If all mail was pretty much the same, and if the children did nothing but process mail, this really shouldn't matter. But the real fact is that all mail isn't the same. Some mails are very large. They should be limited to 250K or so, but some programs like qmail don't necessarily limit the mail size in the standard configuration.

It is NOT a direct relation from mail size to spamd child size! A 250KB mail might easily crank a child up to 250MB! Once the child gets big, it just stays that way. If you feed large mails to SA, you can get some really fat children. 5 children at 250MB each aren't going to fit well in a 512MB system.

If you only let each child process a few messages before dying, then if it happens to process one large message and gets big, it will only stay big for a few messages before going away. Chances are relatively small that all the children will manage to get fat at the same time, so you will probably survive just fine. With a large value of --max-conn-per-child (like the default) it is pretty easy to get all the children fat at once.

Spamd children also do other things than just process mail, like doing database expiration runs. These tend to get the children very fat, especially if you have a database that has somehow gotten out of control. Again, this causes Bad Things(tm) if it happens to a lot of the children at once.

Loren
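The memory arithmetic in this message can be put into a small sketch. It only restates the figures quoted here (30-60MB per child normally, up to 250MB after a large mail); the function names and the `reserve_mb` allowance for the rest of the system are assumptions made for illustration, not anything spamd itself provides.

```python
def spamd_footprint_mb(children: int, per_child_mb: int) -> int:
    """Worst-case memory use if every child grows to per_child_mb."""
    return children * per_child_mb


def max_children(ram_mb: int, per_child_mb: int, reserve_mb: int) -> int:
    """Largest -m value whose worst case still fits in RAM.

    reserve_mb is a guessed allowance for the OS and the mailer
    (an assumption, tune it for the machine in question).
    """
    usable = ram_mb - reserve_mb
    return max(usable // per_child_mb, 0)


if __name__ == "__main__":
    for per_child in (30, 60, 250):
        total = spamd_footprint_mb(5, per_child)
        print(f"5 children at {per_child}MB each: {total}MB")
    # 150MB of headroom is a guess, not a measured figure.
    print("256MB box, 30MB children: -m", max_children(256, 30, 150))
    print("512MB box, 60MB children: -m", max_children(512, 60, 150))
```

With the guessed 150MB of headroom, the 256MB case comes out at three children, which is in line with the "-m 3 or so" suggestion; a different headroom figure will give a different answer.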
Re: Personal Bayes Score
Matthew Yette wrote:
> Dankos,
>
> Put this into your /etc/mail/spamassassin/local.cf:
>
> user_scores_sql_custom_query SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_ OR username = '@GLOBAL' OR username = _DOMAIN_ ORDER BY username ASC
>
> That will make per-user preferences priority, and then roll back to the GLOBAL if the user doesn't have a preference specified.

If I am running spamd with the -u [user] option and use your configuration, the GLOBAL configuration is never used; is that correct? If not, what is the correct parameter I must use?

Regards, dankos.
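To see why the ORDER BY in the quoted query matters, here is a minimal sketch using an in-memory SQLite database. It assumes a `userpref` table with `username`, `preference` and `value` columns, and it assumes rows are applied in the order returned, with later rows overriding earlier ones; the sample users and values are made up, and real placeholder substitution (`_TABLE_`, `_USERNAME_`, `_DOMAIN_`) is replaced here by ordinary bound parameters.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway table with hypothetical sample preferences."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO userpref VALUES (?, ?, ?)",
        [
            ("dankos@example.com", "required_score", "3.0"),
            ("@GLOBAL", "required_score", "5.0"),
            ("@GLOBAL", "report_safe", "1"),
        ],
    )
    return conn


def load_prefs(conn: sqlite3.Connection, username: str, domain: str) -> dict:
    """Run the query and apply rows in order; a later row replaces an earlier one."""
    rows = conn.execute(
        "SELECT preference, value FROM userpref "
        "WHERE username = ? OR username = '@GLOBAL' OR username = ? "
        "ORDER BY username ASC",
        (username, domain),
    ).fetchall()
    prefs = {}
    for preference, value in rows:
        prefs[preference] = value
    return prefs


if __name__ == "__main__":
    conn = make_demo_db()
    print(load_prefs(conn, "dankos@example.com", "example.com"))
    print(load_prefs(conn, "nobody@example.com", "example.com"))
```

In plain byte ordering `@` sorts before letters, so the `@GLOBAL` rows come back first and a user row for a name starting with a letter is applied last. Where a domain row (or a username starting with a digit) lands depends on how the actual strings sort, so it is worth checking with real data.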
Re: Runaway processes
Is it better to run five children with 20 connections each, or 20 children with five connections each?

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com
Increase Performance howto
I tested my qmail with more than 14000 spam (I used qmail-inject in my script). If I use QSheff + ClamAV + SpamAssassin, my server processes 14000 emails in 1 hour, and if I only use qmail my server processes 14000 emails in 1/3 hour. How can I increase my server performance? I don't understand what 'max-connection' and '-m' are for; can you tell me what they are?

Regards, dankos.
Re: Runaway processes
Most strange. Could you give us the listing from top or the like? The normal case, as you are probably aware, is that the children get fat (use a lot of memory) and your system goes into thrashing. This sounds like you have some other problem. Are you using AWL (it is on by default in 3.x) or Bayes? Possibly they are all trying to do expire runs on a huge database that has somehow managed to grow out of control.

The other parameter you might want to set, if you haven't, and if the kids are getting fat, is --max-conn-per-child. It defaults to something quite large, and setting it down to 20 or so has helped many people.

Loren
RE: Runaway processes
Herb Martin wrote:
>> When people ask why I haven't upgraded from 2.64 yet... I'm waiting
>> until a week goes by without a new thread about runaway / way-slow /
>> resource-eating SA 3.0.X processes! :-)
>
> I suspect your wait is over. 3.10 (due any day now) + 1 week
> should make you happy.
>
> Improved thread handling and for me it works even in pre-Release.

That's great news. I'll give it a try after the initial shakedown.

I must add, however, that SA 2.64 with "Spamcop URI" (SURBL), Bayes, DCC and a dash of SARE has been doing a great job here, 98% - 99% of spam caught with minimal FP's. Together with MailScanner and a virus scanner it's handling 15,000 emails per day on an old 800MHz PIII box, with the load average usually in the 0.30 range. And it's rock-solid; I've never needed to kill an SA process in over a year of uptime.

Pierre
Re: userpref with mysql does not work
Martin Tanzer wrote:
> My setup:
> Debian 3.1 (sarge) with the provided spamassassin package (3.0.3-2)
> Postfix, spamassassin bound to postfix (no amavisd-new)
> There are no users on the machine, all mails are forwarded to another
> mailserver through the transport file.
>
> Any ideas?

It seems pretty clear what is happening. In your "test" above you did the right thing, calling spamc with -u, and of course it worked correctly. Now, when you are calling it via Postfix you are no longer sending the correct address to spamc, either by not using the -u command-line parameter at all or by simply sending spam as the username. Fix how you are calling spamc and all will be well.

Michael
Re: Forwarding mail address
> I understand that this may sound like a very newbie question, but I have a problem I can't find an answer to. We are using SpamAssassin with procmail/sendmail. It is working fine; however, spam mail is being forwarded to a mail account which is no longer valid. I've been looking for where this address is in the configuration, in order to forward those mails to another account, but no luck. Any suggestions?

Track the mail through every step it would take through your system, checking each place where it could change system usernames and/or be forwarded to another address:

1. virtusertable
2. aliases file(s)
3. .forward file in user's home directory
4. System-wide procmailrc
5. User-specific .procmailrc

Mike Jackson
Tech Administrator, Datahost
www.datahost.com
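The checklist above can be walked mechanically. The sketch below simply searches a list of files for the dead address; the file paths are common locations on a sendmail/procmail system and are assumptions (your virtusertable or aliases may live elsewhere), and `old-account@example.com` is a placeholder.

```python
from pathlib import Path

# Common locations only; these paths are assumptions and vary by system.
CANDIDATES = [
    "/etc/mail/virtusertable",
    "/etc/aliases",
    "/etc/mail/aliases",
    "/etc/procmailrc",
    "~/.forward",
    "~/.procmailrc",
]


def find_address(address, paths=CANDIDATES):
    """Return (path, line_number, line) for every line mentioning address."""
    hits = []
    needle = address.lower()
    for raw in paths:
        path = Path(raw).expanduser()
        if not path.is_file():
            continue  # file does not exist on this system
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable, e.g. insufficient permissions
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_address("old-account@example.com"):
        print(f"{path}:{number}: {line}")
```

Run it as root (or as each affected user) so the per-user `.forward` and `.procmailrc` files are readable; hashed map files such as `virtusertable.db` are not plain text, so check the source file they were built from.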
Re: Forwarding mail address
At 09:00 AM 8/2/2005, you wrote:
> Hi all,
>
> I understand that this may sound like a very newbie question, but I have a problem I can't find an answer to. We are using SpamAssassin with procmail/sendmail. It is working fine; however, spam mail is being forwarded to a mail account which is no longer valid. I've been looking for where this address is in the configuration, in order to forward those mails to another account, but no luck. Any suggestions?

You won't find it. SpamAssassin doesn't forward mail. It scans mail. This is something that would need to be done on your mailer, or with a procmail recipe, depending on your mail setup.
Re: Forwarding mail address
Alexandre Cruz wrote:
> Hi all,
>
> I understand that this may sound like a very newbie question, but I
> have a problem I can't find an answer to. We are using SpamAssassin with
> procmail/sendmail. It is working fine; however, spam mail is being
> forwarded to a mail account which is no longer valid. I've been
> looking for where this address is in the configuration, in order to forward
> those mails to another account, but no luck. Any suggestions?
>
> Best regards,

Look at your procmailrc. SpamAssassin itself can't forward mail, so it's not going to be in any of the SA config files.
Forwarding mail address
Hi all,

I understand that this may sound like a very newbie question, but I have a problem I can't find an answer to. We are using SpamAssassin with procmail/sendmail. It is working fine; however, spam mail is being forwarded to a mail account which is no longer valid. I've been looking for where this address is in the configuration, in order to forward those mails to another account, but no luck. Any suggestions?

Best regards,
Alexandre Cruz
Re: runaway processes
My setup is as follows: FreeBSD 4.10, SpamAssassin 3.0.4, Perl 5.8, using Bayes and a pile o' SARE rules. It scanned 34484 messages last night and the only time we see lags is when the Bayes database is expiring. The startup script is as follows:

/usr/local/bin/spamd --max-children=6 --max-conn-per-child=20 -d -x -u daemon -s local0

HTH, Tom
RE: Runaway processes
> > When people ask why I haven't upgraded from 2.64 yet... I'm waiting
> > until a week goes by without a new thread about runaway / way-slow /
> > resource-eating SA 3.0.X processes! :-)

I suspect your wait is over. 3.10 (due any day now) + 1 week should make you happy.

Improved thread handling and for me it works even in pre-Release.

-- Herb Martin
Re: Runaway processes
Sorry, no, that didn't come out right. There are only six children running at any time. Each will process 30 messages, then restart. The machine processed about 3200 messages yesterday, so each child restarted about once every 2.5-3 hours.

Mike Jackson
Tech Administrator, Datahost
www.datahost.com

- Original Message -
From: "Frank M. Cook" <[EMAIL PROTECTED]>
To: "Mike Jackson" <[EMAIL PROTECTED]>
Cc:
Sent: Tuesday, August 02, 2005 08:21
Subject: Re: Runaway processes

So you are running 30 per child and 6 children? 180 total. How many messages a day are you handling? I upped my children from 5 to 15 thinking that would help but it hasn't. I was thinking of taking connections down to 5 or 6 on 15 children. Maybe I have it backwards? I don't have anything else running on this computer at all so I was thinking I wanted to use up all the memory with children. Is that off?

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com
Re: Runaway processes
So you are running 30 per child and 6 children? 180 total. How many messages a day are you handling? I upped my children from 5 to 15 thinking that would help but it hasn't. I was thinking of taking connections down to 5 or 6 on 15 children. Maybe I have it backwards? I don't have anything else running on this computer at all so I was thinking I wanted to use up all the memory with children. Is that off?

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com
Re: Runaway processes
Pierre Thomson wrote:
>> I'm running SA 3.0.4 on OpenBSD with Perl 5.8.6 & Exim V4.52.
>>
>> I'm noticing that SA seems to have a big problem with child processes just "running away", never terminating and eating CPU.
>>
>> My mailservers can't cope, and I'm looking at having to switch off SA. (Not something I really want to do..)
>>
>> No matter what I set "-m" to spamd, they all just go into this endless death spiral..
>
> When people ask why I haven't upgraded from 2.64 yet... I'm waiting until a week goes by without a new thread about runaway / way-slow / resource-eating SA 3.0.X processes! :-)
>
> Pierre

It's good to know I'm not the only one with this issue.
RE: Runaway processes
> I'm running SA 3.0.4 on OpenBSD with Perl 5.8.6 & Exim V4.52.
>
> I'm noticing that SA seems to have a big problem with child
> processes just "running away", never terminating and eating CPU.
>
> My mailservers can't cope, and I'm looking at having to switch
> off SA. (Not something I really want to do..)
>
> No matter what I set "-m" to spamd, they all just go into this
> endless death spiral..

When people ask why I haven't upgraded from 2.64 yet... I'm waiting until a week goes by without a new thread about runaway / way-slow / resource-eating SA 3.0.X processes! :-)

Pierre
Re: Runaway processes
I've been fighting a problem which may turn out to be similar. My SpamAssassin just starts falling behind, and runaway children could be the cause. I'm going to try adjusting the max connections per child setting (check the docs for exact syntax). The default is 200. Maybe someone else will jump in with a recommended number, but I'm thinking the default may be way too high. A lower number will cause each child to shut down sooner: when the max number is reached, the child is stopped and a new one is created.

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com
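The effect of a lower per-child limit can be illustrated with a toy model. This is only a sketch under stated assumptions: messages are dealt to children round-robin (spamd is not claimed to schedule this way), a child's memory jumps to whatever its largest message needed and stays there until the child is replaced, and the 30MB and 250MB figures are the rough sizes mentioned elsewhere in this thread. The function name and the traffic pattern are made up.

```python
def peak_memory_mb(needs_mb, children, max_conn, base_mb=30):
    """Peak total memory across all children for a stream of messages.

    needs_mb[i] is the memory (MB) that message i pushes a child up to.
    A child keeps its largest size until it has handled max_conn
    messages, then it is replaced by a fresh child of base_mb.
    """
    footprint = [base_mb] * children
    handled = [0] * children
    peak = sum(footprint)
    for i, need in enumerate(needs_mb):
        child = i % children  # round-robin dispatch (an assumption)
        footprint[child] = max(footprint[child], need)
        peak = max(peak, sum(footprint))
        handled[child] += 1
        if handled[child] >= max_conn:
            footprint[child] = base_mb  # child exits, a new one starts
            handled[child] = 0
    return peak


if __name__ == "__main__":
    # Hypothetical traffic: every 7th message is a large one.
    stream = [250 if i % 7 == 0 else 30 for i in range(200)]
    for limit in (200, 2):
        print(f"max conn per child {limit}: peak",
              peak_memory_mb(stream, children=5, max_conn=limit), "MB")
```

In this model the default of 200 lets all five children end up fat at the same time, while a very small limit keeps at most one fat child alive at once. Real traffic is burstier than this, and starting a new child has a cost of its own, so treat the numbers as an illustration of the trend rather than a recommendation.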
Re: Load balancing spamd
On 8/1/05, email builder <[EMAIL PROTECTED]> wrote:
> Even if I had forgotten the -A, I think I would have been seeing connection
> refused notices, but right now, it just seems to time out. I'm pretty sure
> this is a LVS question more than a spamc/d question, since I've no problems
> with the latter -- I am only asking here to see if anyone else does SA
> weighted load balancing.

I kinda went the other way around. I have multiple mail machines, each with their own instance of spamd. I use a Cisco 7206 VXR to do the load balancing. Works like a charm.

> Thanks!

-- Jason 'XenoPhage' Frisvold [EMAIL PROTECTED]
Runaway processes
I'm running SA 3.0.4 on OpenBSD with Perl 5.8.6 & Exim V4.52. I'm noticing that SA seems to have a big problem with child processes just "running away", never terminating and eating CPU. My mailservers can't cope, and I'm looking at having to switch off SA. (Not something I really want to do..) No matter what I set "-m" to spamd, they all just go into this endless death spiral.. GTG Gordon Ross, Network Manager/Rheolwr Rhydwaith Countryside Council for Wales/Cyngor Cefn Gwlad Cymru
Re: Qmail + spamassassin + squirellmail
Dhanny Kosasih wrote:
> Hi,
>
> Does anybody know how to install qmail + SpamAssassin + SquirrelMail (so that it can report spam to SpamAssassin)? And how can I make SpamAssassin autolearn spam?
>
> Regards, dankos.

Here are two "toaster" documents I used:

http://sylvestre.ledru.info/howto/howto_qmail_vpopmail.php#vpopmail
http://www.differentpla.net/node/view/165

Good luck! Peace...

Tom
Re: unwanted breakthrough
> SARE_ADLTSUB2 Subject =~ /\b(?:blow|climax|enlarg(e|ment)|fuck|inter+acial|lick|porn|penis|pervert|pussy|tits|tight|vagina|virgins?)\b/i
>
> Fix the rule, don't ditch the \b's for such a broad rule..
>
> Besides, the whole rule is subject to all kinds of obfuscation tricks. P.e.n.i.s
> still won't match, nor any other character-insertion obfuscation.
>
> I'd suggest creating obfu rules to detect obfuscations, and don't try to expand
> the scope of this already over-broad rule. (which will match a few FP cases
> as-is such as "your photo enlargement is ready")

Um, I was going to point out that this rule is in the _adult set, not the _obfu set.

Loren