Re: Q. about spam directed towards highest MX Record?
Rob McEwen wrote:
> (CCing Marc Perkel because I seem to recall him knowing about this)
>
> Not that I'd ever outright block based on this one factor alone, but...
> Does anyone have any stats about what percentage of spam is directed
> towards the highest MX Record? (that is, where there is more than one MX
> record?)
>
> Also, has anyone ever seen ANY legit mail go to the highest MX record when
> no mail server failure occurred?

I get lots of mail from a number of different Domino servers delivered to my lowest preference MXes. I've always suspected it was something IBM had done to Domino to "improve queue performance" but I've never looked into it.

Daryl
Re: really slow spamd scan
> > 14 seconds may be just the delay for the various network tests to
> > respond.
> You mean the tests from SA? I have googled for this kind of situations
> and I found I am the slowest. If I stop the spamd, the delivery will
> be much faster.

I mean it depends on how your SA is configured. Some of the tests rely on data in remote databases (all the RBL tests, Razor/Pyzor, DCC...). Depending on your network connectivity, it can take some time before you get a response for these various tests.

But once again, this is not an issue because it does not create any load on your server: while SA is waiting for a network test to finish, the server can process other emails.

My average SA processing time is between 10 and 15 seconds.

Olivier
Re: sa-learn and "Caught" spams
> For instance, given the explanations above, I'll > start a system to automatically learn from my 'checkspam' folder, but not > my 'highspam' folder. Remember that your 'highspam' may be separated from 'checkspam' largely based on network tests; I often see identical messages with a 6-8 point variance depending on the connecting IP and the envelope sender. So if I don't feed the Bayes the first message because it scored a safe 12 points, that might mean the second sneaks through because it hits BAYES_20 instead of BAYES_50, or whatever. My own theory is "Learn 'em all and let Bayes sort 'em out." -- Dave Pooser Cat-Herder-in-Chief, Pooserville.com "NOTHING says love like a monkey. It's a fuzzy screeching bundle of tenderness!" -- QueenOfWands.net
Re: really slow spamd scan
On 9/28/06, Olivier Nicole <[EMAIL PROTECTED]> wrote:
> > I am quite new to SA (a week of SA life), and the SA is working, the
> > thing is, SA is incredibly slow on my server (2.8GHZ CPU + 2GB Memory
> > + Qmail + Qmail-scanner). Here's a typical scan log:
> >
> > result: . 0 - SPF_PASS scantime=14.7,size=1689 ...
>
> Hi,
>
> Problem is not that it is slow. That SA takes 14 seconds to deliver a
> message is not an issue, email is not a real time process anyway and
> transiting email from one gateway to another can take minutes or hours.

The scantime=14.7 does not mean the scan time of spamassassin?

> Problem would be if SA would make high CPU load on your server.
>
> 14 seconds may be just the delay for the various network tests to respond.

You mean the tests from SA? I have googled for this kind of situations and I found I am the slowest. If I stop the spamd, the delivery will be much faster.

> Bests
> Olivier

Thanks very much for the suggestion!

Deephay
Re: Q. about spam directed towards highest MX Record?
> Also, has anyone ever seen ANY legit mail go to the highest MX record when > no mail server failure occurred? I've seen a tiny amount-- little enough that I earlier set my primary to dump any messages received from my tertiary MX into a quarantine folder for my review, but since I got ImageInfo.pm working properly I haven't noticed any spam make it through mail3 unscathed. -- Dave Pooser Cat-Herder-in-Chief Pooserville.com "Dogs are what puppies turn into if you don't eat 'em before they go all stringy." --Sgt. Schlock
Re: really slow spamd scan
> I am quite new to SA (a week of SA life), and the SA is working, the
> thing is, SA is incredibly slow on my server (2.8GHZ CPU + 2GB Memory
> + Qmail + Qmail-scanner). Here's a typical scan log:
>
> result: . 0 - SPF_PASS scantime=14.7,size=1689 ...

Hi,

The problem is not that it is slow. That SA takes 14 seconds to deliver a message is not an issue; email is not a real-time process anyway, and transiting email from one gateway to another can take minutes or hours.

The problem would be if SA put a high CPU load on your server.

14 seconds may be just the delay for the various network tests to respond.

Bests

Olivier
really slow spamd scan
Greetings all, I am quite new to SA (a week of SA life), and the SA is working, the thing is, SA is incredibly slow on my server (2.8GHZ CPU + 2GB Memory + Qmail + Qmail-scanner). Here's a typical scan log: result: . 0 - SPF_PASS scantime=14.7,size=1689 ... . And I have checked the SA wiki and found there is a note saying if you are using UTF-8 locale, the performance can be low. What I am wondering is: Will it be that slow? Any suggestion is appreciated. Deephay
Re: FORGED_YAHOO_RCVD?
What's your trusted_networks look like? Based on the headers below you'll need to set it manually.

By default SA assumes that all the "private range" hosts are part of your network, plus the first non-private host. However, in this case, the first non-private host is Yahoo's server. That's bad.

Jim Davis wrote:
> This autoresponse from Yahoo abuse crept over the spam line, mostly
> because of a hit on FORGED_YAHOO_RCVD... but it's not clear from the
> headers why that would be. This is from a Fedora Core 5 system
> running SpamAssassin 3.1.3 under amavisd-new 2.4.2:
>
>> Return-Path: <[EMAIL PROTECTED]>
>> Received: from xenopodid.cs.arizona.edu (xenopodid.cs.arizona.edu [192.12.69.105])
>>   by email.cs.arizona.edu (8.13.3/8.13.3) with ESMTP id k8RFY9pl088354
>>   for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:09 -0700 (MST)
>>   (envelope-from [EMAIL PROTECTED])
>> Received: from localhost (xenopodid.cs.arizona.edu [127.0.0.1])
>>   by xenopodid.cs.arizona.edu (Postfix) with ESMTP id 6FCBA6DCCEA
>>   for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:09 -0700 (MST)
>> X-Virus-Scanned: amavisd-new at cs.arizona.edu
>> X-Spam-Flag: YES
>> X-Spam-Score: 5.019
>> X-Spam-Level: *
>> X-Spam-Status: Yes, score=5.019 tagged_above=- required=5
>>   tests=[BAYES_40=-0.185, DNS_FROM_RFC_ABUSE=0.2,
>>   DNS_FROM_RFC_POST=1.708, DNS_FROM_RFC_WHOIS=1.447,
>>   FORGED_YAHOO_RCVD=1.849]
>> Received: from xenopodid.cs.arizona.edu ([127.0.0.1])
>>   by localhost (xenopodid.cs.arizona.edu [127.0.0.1]) (amavisd-new, port 10024)
>>   with ESMTP id QAQkAwgDuuF1 for <[EMAIL PROTECTED]>;
>>   Wed, 27 Sep 2006 08:34:03 -0700 (MST)
>> Received: from cheltenham.cs.arizona.edu (cheltenham.cs.arizona.edu [192.12.69.60])
>>   by xenopodid.cs.arizona.edu (Postfix) with ESMTP id 90AAC6DCCD1
>>   for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:03 -0700 (MST)
>> Received: from mail-relay1.yahoo.com (mail-relay1.yahoo.com [216.145.48.34])
>>   by cheltenham.cs.arizona.edu (8.13.4/8.13.4) with ESMTP id k8RFY0Gj014456
>>   for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:03 -0700 (MST)
>>   (envelope-from [EMAIL PROTECTED])
>> Received: from speedster.cc.kana.corp.yahoo.com (speedster.cc.kana.corp.yahoo.com [207.126.228.28])
>>   by mail-relay1.yahoo.com (8.13.6/8.13.6/mr1) with SMTP id k8RFMTSi086721
>>   for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:22:29 -0700 (PDT)
>> Message-Id: <[EMAIL PROTECTED]>
>> Precedence: bulk
>> Auto-Submitted: auto-replied
>> Date: Wed, 27 Sep 2006 08:22:28 -0700
>> To: [EMAIL PROTECTED]
>> Subject: A message from Yahoo! Customer Care (KMM37445094V70533L0KM)
>> From: Yahoo! Mail <[EMAIL PROTECTED]>
>> Reply-To: Yahoo! Mail <[EMAIL PROTECTED]>
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset = "us-ascii"
>> Content-Transfer-Encoding: 7bit
>> X-Mailer: KANA Response 7.0.1.142
>> X-UID: 371043
>>
>> Thank you for contacting Yahoo! Customer Care to answer your question. A
>> support representative will get back to you within 48 hours regarding
>> your issue. Until then, feel free to visit our online help center at
>> http://help.yahoo.com/
>> for answers if you have not already done so.
Re: Received header unparseable
A second attempt tests much better. Added at line 747:

    # Received: from ([10.0.0.6]) by myfirewall; Thu,
    #    13 Mar 2003 06:26:21 -0500 (EST)
    if (/^from \(\[(${IP_ADDRESS})\]\) by myfirewall/) {
      $mta_looked_up_dns = 1;
      $helo = $1; $ip = $1; $by = 'myfirewall'; goto enough;
    }

benthere-nine wrote:
>
> In a desperate newbie attempt to fix this problem myself, I added the
> following lines to Received.pm at line 895:
>
>     # Received: from ([10.0.0.6]) by myfirewall; Thu,
>     #    13 Mar 2003 06:26:21 -0500 (EST)
>     if (/^from \(\[(${IP_ADDRESS})\]\) by myfirewall/) {
>       $ip = $1; $by = 'my.firewall.ip.addr'; goto enough;
>     }
>
> Ummm, it didn't work, but it didn't break anything. How can I make this
> work? Add a "$helo =" ?
>
> Thanks.
>
> benthere-nine wrote:
>>
>> My firewall puts a received header on every e-mail it
>> forwards to SA 3.1.5:
>>
>> Received: from f66108.upc-f.chello.nl ([80.56.66.108])
>>   by myfirewall; Tue, 26 Sep 2006 12:35:52 -0500
>>   (Central Daylight Time)
>>
>> But when my firewall can't find a DNS entry for the
>> e-mail's last relay IP address, it just puts in a
>> blank space:
>>
>> Received: from ([201.19.179.63]) by myfirewall; Tue,
>>   26 Sep 2006 12:35:53 -0500 (Central Daylight Time)
>>
>> 20_head_tests.cf hits on this as an UNPARSEABLE_RELAY.
>> SA isn't able to look up that IP address on all the
>> network tests.
>>
>> I'm e-mailing Tech Support for the company that
>> publishes the firewall software, but is there anything
>> that can be done on the SA side?
>>
>> Thank you very much.

--
View this message in context: http://www.nabble.com/Received-header-unparseable-tf2340368.html#a6539503
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: sa-learn and "Caught" spams
--As of September 27, 2006 5:43:28 PM -0700, Kelson is alleged to have said:

> Daniel T. Staal wrote:
>> True. So... Optimal is obviously to train, once and correctly, on all
>> messages. Sending a message through that has been trained will consume
>> *some* resources, but less than one that still needs to be learned. So
>> the exact balance is a complicated question. ;)
>
> I just train on everything. If it's already learned from a message, it
> takes a few resources for it to recognize that, but almost certainly less
> time than it would have taken me to separate them out.

--As for the rest, it is mine.

Depends on the setup. For instance, given the explanations above, I'll start a system to automatically learn from my 'checkspam' folder, but not my 'highspam' folder. I have procmail automatically sort my spam by score, so I can pay extra attention to low-scoring spam. (Which is more likely to be ham which was misplaced than the high-scoring spam.) So, since I *already* have them separated out, I can avoid the double-check. ;)

Anyway, I just knew that there was an automatic system, and at the very least there is *some* load to re-learning, even if a full analysis is skipped. It would be interesting to see how much it actually is, compared to an easy filter. If I find time, I may try to figure out a good test.

Daniel T. Staal

---
This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law.
---
Re: an stupide config question
Philippe Couas wrote:
> Hi,
>
> I have migrate from Spamassassin 2.63 to 3.15.1, that' seems running,
> somes mail are flaged and rpm -a seee new version.
> But previously rules and local.cf was in /etc/mail/spamassasin, and
> theses files are not modified by my rpm -Uvh.

The "stock" rules should not have been in /etc/mail/spamassassin. Only add-on rules and your local.cf belong under any part of /etc/. This is true for ALL versions of SA (or at least going back to 2.2x).

The "stock" rules (50_scores.cf and friends) should be in /usr/share/spamassassin, or possibly /usr/local/share/spamassassin.

> I want know if config files are always in same directory ??

They should be. However, the best way to be sure is to check. Some packages install the stock rules under /usr/local/share instead of /usr/share, so be sure not to end up with duplicates in both places.

> Regards
>
> Philippe COUAS
> Responsable Développement
> INFODEV S.A.
Re: sa-learn and "Caught" spams
Bill Horne wrote: > > I have a "follow on" question, so I'll add it to this thread: > > Assuming that it's a good idea to feed "Caught" spams through sa-learn > in order to reinforce the tokens that might not have been autolearned, > how do I tell SA to ignore the " SPAM " notice in the subject? I > have ignore-header commands in local.cf for the "X-Spam-Status: Yes" and > other spam headers, but how do I skip only a portion of the subject? Provided it's a markup your SpamAssassin generated, SA will automatically ignore it when learning.
Re: sa-learn and "Caught" spams
Daniel T. Staal wrote:
> True. So... Optimal is obviously to train, once and correctly, on all
> messages. Sending a message through that has been trained will consume
> *some* resources, but less than one that still needs to be learned. So
> the exact balance is a complicated question. ;)

I just train on everything. If it's already learned from a message, it takes a few resources for it to recognize that, but almost certainly less time than it would have taken me to separate them out.

--
Kelson Vibber
SpeedGate Communications
Re: sa-learn and "Caught" spams
Daniel T. Staal wrote:
>
> While I in general agree with this, I was under the impression that
> spamassassin will auto-learn from messages it marks. (At least, past a
> certain threshold.)

Actually, that's not entirely true. There's more than just a threshold. In fact, the score you see isn't even the score compared against the threshold.

Score computation generalities:

1) The score is computed as if bayes was disabled. This includes changing the score set.
2) Any rule with the "noautolearn" tflag (i.e. white/blacklist commands) is discarded.

From there, the criteria to learn as spam using this "learning score" are:

1) score above threshold (default 12.0)
2) at least 3.0 points from header rules
3) at least 3.0 points from body rules
4) Existing bayes learning must not result in the message matching a BAYES_* rule with a score less than -1.0
5) The bayes R/W lock must be available on the first try, i.e. no other autolearn, manual learn or expiry processes are running.

And note that because of 2 and 3, the score needs to be over 6.0, regardless of what you have the threshold set to.

If any of the above aren't met, autolearning will not happen.

In general the autolearner tries very hard to be ABSOLUTELY POSITIVE a message is spam before autolearning it. So relying on autolearning to learn all or even most of your spam isn't a very good idea. It's not going to learn all your spam. It just won't.

> In which case, feeding the spam messages to it again
> would bias the database towards spam, as the messages are being learned
> twice.

Actually, as Jim pointed out, it will skip message-ids that are already in the bayes DB. Also, this skip isn't particularly slow, so you're not wasting a ton of CPU by re-feeding messages that were already auto learned.

> So the question would have to be: Does Spamassassin automatically update
> the Bayes database from (some/any) messages it flags as spam or ham?

Some, yes. Most, no. Score less than 6, never.
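
The autolearn-as-spam criteria listed in this message can be sketched as a small predicate. This is only an illustration of the checks as described in the post, not SpamAssassin's actual code; the class, field and function names are hypothetical, and the inputs are assumed to have been computed already (score without Bayes and without "noautolearn" rules).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LearnInputs:
    """Hypothetical container for the values the post says are consulted."""
    learning_score: float              # score recomputed as if Bayes were disabled
    header_points: float               # points contributed by header rules
    body_points: float                 # points contributed by body rules
    bayes_rule_score: Optional[float]  # score of the BAYES_* rule hit, if any
    lock_acquired: bool                # Bayes R/W lock obtained on the first try


def would_autolearn_as_spam(m: LearnInputs, threshold: float = 12.0) -> bool:
    """Return True only if every criterion from the post is met."""
    if m.learning_score <= threshold:       # 1) score above threshold
        return False
    if m.header_points < 3.0:               # 2) at least 3.0 from header rules
        return False
    if m.body_points < 3.0:                 # 3) at least 3.0 from body rules
        return False
    if m.bayes_rule_score is not None and m.bayes_rule_score < -1.0:
        return False                        # 4) Bayes already leans strongly to ham
    return m.lock_acquired                  # 5) lock available on the first try
```

With these checks, a message scoring 14.0 with only 2.0 header points is not learned, which is the point the post makes about the threshold alone not being enough.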
Re: Non-blocklisted embedded URLs are getting hits on URIBL_AB_SURBL and URIBL_PH_SURBL in SpamAssassin 3.1.5
On Wed, Sep 27, 2006 at 02:26:41PM -0700, Donald Craig wrote:
> I'm getting matches whenever I have an embedded URL
> on URIBL_AB_SURBL and URIBL_PH_SURBL -

You're not by chance using the opendns.{com,org} folks for DNS, are you?

--
Randomly Selected Tagline:
"You can tell that I got this out from the newspaper because it looks like I cut it out with a spatula." - Jim Duncan
Non-blocklisted embedded URLs are getting hits on URIBL_AB_SURBL and URIBL_PH_SURBL in SpamAssassin 3.1.5
I'm getting matches whenever I have an embedded URL on URIBL_AB_SURBL and URIBL_PH_SURBL - unless the URL is actually in URIBL_SBL, in which case the logic for all the flavors of URIBL_XX_SURBL seems to work correctly. I have verified the absence of the incorrectly matching URLs from SURBL with lookups in http://www.rulesemporium.com/cgi-bin/uribl.cgi This is SpamAssassin 3.1.5, all was fine in 3.1.2. For now I have set both those tests to 0.00. Don Craig
RE: duplicate emails
Loren Wilton wrote:
> occa_phishing.cf
> occa_replica.cf
>
> I have no knowledge of these. From the rules you show these aren't
> particularly worthwhile (nor all that well written rules). There are a
> number of SARE rules that cover this area much more thoroughly, and I
> believe these days even a number of standard rules in this area. I'd dump
> these files.
>
> I forget how you said you have SA integrated. If you are using spamc/spamd
> as the interface then you can just kill spamd and restart it. Depending on
> your system distribution the script to do that sometimes has various names
> and locations.
>
> Not every setup uses spamd though. I think Mailscanner integrates SA
> directly, and in this case you have to bounce mailscanner.
>
> So what are the pieces of your mail system again? And which OS distro?
> Someone will likely know what to knock over the head to restart SA on that
> configuration.

This machine is running RedHat AS 3 with qmail and spamassassin 3.0.4. We are running spamd. So you are saying that I should find where spamd is running and restart it? "Knocking over the head" is what either I need or this crazy machine needs.

For all who read this today, I am not sure whether I will be able to see posts. I will definitely not be able to see them until tomorrow morning. Whenever I am able to see them again I will say that your posts are valuable to me and I hope to be able to utilize your expertise, so please let me know what you think.

One last note about my outage today: as a last check I looked at our internal email server, which is running Microsoft Exchange 2000. When I examined that machine, the C: drive, a 6 GB partition, had only 116 MB free. I have been attempting to free up space on it to see if that may be causing my sporadic email delivery problems.

I am still attempting to figure out what is causing my problems so I appreciate all advice.

Steve Ingraham
Re: sa-learn and "Caught" spams
On Wed, 2006-09-27 at 06:37 +, Mike Woods wrote: > Hi guys, bit of a query regarding sa-learn and messages that have > already been tagged as spam. > > We have spamassassin scanning mail via amavisd and sending any caught > spams to a spam folder in the users accounts (using plus addressing), > we've also been getting users to drop any missed spams into this spam > folder so we can train spamassassin on them, at present I have a script > that moves *only* the missed spams to a master folder for sa-learn, my > question is simple, would there be any benefit in including the mails > identified as spam in this process, I know sa-learn looks for common > patterns in spams to identify them as spam but im unsure if adding known > spams in would be beneficial in this ? I have a "follow on" question, so I'll add it to this thread: Assuming that it's a good idea to feed "Caught" spams through sa-learn in order to reinforce the tokens that might not have been autolearned, how do I tell SA to ignore the " SPAM " notice in the subject? I have ignore-header commands in local.cf for the "X-Spam-Status: Yes" and other spam headers, but how do I skip only a portion of the subject? TIA. Bill
RE: Newbie Rule Question
On Wed, 27 Sep 2006, Shue, Daniel G. wrote:
> # Catch anything from 8:00 PM to 6:00 AM and score it
> header   RCVD_AT_NIGHT  Date =~ /..., .. ... [0,2][0-5]:..:..*/
> score    RCVD_AT_NIGHT  0.001
> describe RCVD_AT_NIGHT  Email was received between 8:00PM and 6:00AM

If you want to score based on when the message was *received* I suggest you match against the Received: header *your* mail relay adds. The Date: header is subject to forgery, will be in an unpredictable time zone, and is supposed to indicate when the message was *sent*.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 People seem to have this obsession with objects and tools as being dangerous in and of themselves, as though a weapon will act of its own accord to cause harm. A weapon is just a force multiplier. It's *humans* that are (or are not) dangerous.
---
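
To illustrate the suggestion above, here is a minimal Python sketch (outside SpamAssassin itself) that reads the hour from the timestamp a relay appends to a Received: header. It assumes the timestamp follows the last ';' in the header value, as in the headers posted elsewhere in this archive; the function names and the 20:00 to 06:00 window defaults are hypothetical.

```python
from email.utils import parsedate_to_datetime


def received_hour(received_value: str) -> int:
    """Hour of day (in the relay's own UTC offset) from a Received: header value."""
    # The relay's timestamp is assumed to follow the last ';' in the header.
    _, _, stamp = received_value.rpartition(";")
    return parsedate_to_datetime(stamp.strip()).hour


def received_at_night(received_value: str, start: int = 20, end: int = 6) -> bool:
    """True if the relay stamped the message between start:00 and end:00."""
    hour = received_hour(received_value)
    return hour >= start or hour < end
```

Because the hour comes from the header your own relay wrote, it is in that relay's time zone rather than the sender's, which is the advantage pointed out in the reply.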
RE: Stats of rules ?
Chris wrote:
> On Tuesday 26 September 2006 2:50 pm, Bowie Bailey wrote:
> > Noc Phibee wrote:
> > > Hi
> > >
> > > on my spamassassin server, i use a lot of rules ..
> > > personnal and downloaded.
> > >
> > > Anyone know if they have a tools for know in 24h or 48h
> > > if a rules are used or not ?
> >
> > If you just want to know if the rule is getting hits, you can do a
> > simple grep against your maillog file.
> >
> > For more in-depth stats, try this script:
> >
> > http://www.rulesemporium.com/programs/sa-stats.txt
> >
> > Rename it to sa-stats.pl before you run it.
>
> Your script is still running great over here. If he's looking for
> something different than what sa-stats.pl provides and if your script
> is for public consumption, you may want to suggest it to him. I've
> also got it running daily in a cronjob if he wants something like
> that.

My script can be used as well, although it is more for add-on rules in particular. It does not give stats on any of the built-in rules.

I'm attaching an updated version. I have fixed the rulename detection so that it will pick up on the fuzzyocr rules now (it will list their score as 0 since they don't have a score line associated).

--
Bowie

[Attachment: sa-addon-stats.pl]
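
As a rough stand-in for the "simple grep" idea, the following Python sketch counts rule hits from spamd result lines. The line layout is assumed from the single sample posted in this archive ("result: . 0 - SPF_PASS scantime=14.7,size=1689"); real logs may differ, and the function name is hypothetical. It is not a replacement for either script mentioned above.

```python
import re
from collections import Counter

# Assumed layout: "result: <flag> <score> - <RULE1,RULE2,...> scantime=..."
RESULT_LINE = re.compile(r"result: \S+ -?\d+ - (\S*) scantime=")


def count_rule_hits(lines):
    """Count how often each rule name appears in spamd result lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_LINE.search(line)
        if match and match.group(1):
            counts.update(match.group(1).split(","))
    return counts
```

Feeding it the last 24 or 48 hours of log lines and listing the rules with a zero count would answer the original question of which rules are never used.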
RE: Newbie Rule Question
Ok guys, I figured it out... w/ Loren's help of course! :) Here's what I came up with:

# Catch anything from 8:00 PM to 6:00 AM and score it
header   RCVD_AT_NIGHT  Date =~ /..., .. ... [0,2][0-5]:..:..*/
score    RCVD_AT_NIGHT  0.001
describe RCVD_AT_NIGHT  Email was received between 8:00PM and 6:00AM

# Catch anything from 3:00PM to 5:00PM for TESTING
#header   RCVD_DURING_DAY  Date =~ /..., .. ... 1[5,6,7]:..:..*/
#score    RCVD_DURING_DAY  -0.001
#describe RCVD_DURING_DAY  TESTING - Email was received between 3:00PM and 5:00PM

What I can't figure out, though, is why you have to write "Date =~" rather than "Date: ". Is that syntax SA understands? I also don't know what to do about the time zone. Any ideas on that?

I'm going to test it out tonight and see how it goes; let me know if you're interested in the outcome. I think it's a good idea myself, our people here will never get ham after 8 PM. And as you can see... my scores are really low right now. I may move them up some, but not too much.

-----Original Message-----
From: Brent Kennedy [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 27, 2006 3:39 PM
To: users@spamassassin.apache.org
Subject: RE: Newbie Rule Question
Importance: Low

Nice, I like that! Most of our spam also comes in during the wee hours of the morning.. I think adding a half point or even a point would help even more. Though, I have trained and continue to train both of my servers and they are pretty effective. We get 3500 mails a day of which 70% are classified as spam and that doesn't count the email addresses I have receive blocked on (about another 1500 emails for them).

-Brent

-----Original Message-----
From: Loren Wilton [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 27, 2006 3:04 PM
To: users@spamassassin.apache.org
Subject: Re: Newbie Rule Question

> I need to check, "Date: Wed, 27 Sep 2006 14:17:17 -0400" and I've looked

Quite ignoring the arguments people will make against this (including me) you could do something like the following. Of course remember the date header is when the mail was made in whatever timezone it was made, not in YOUR timezone. For that you would want to check the timestamp in the received header that your system adds, not in the date header.

#Catch 1700 to 0700
header A_BAD_TIME Date =~ /\d\s(?:1[789]|2\d|0[01234567]):/
score  A_BAD_TIME 0.2

Loren
RE: Newbie Rule Question
Nice, I like that! Most of our spam also comes in during the wee hours of the morning.. I think adding a half point or even a point would help even more. Though, I have trained and continue to train both of my servers and they are pretty effective. We get 3500 mails a day of which 70% are classified as spam and that doesn't count the email addresses I have receive blocked on (about another 1500 emails for them).

-Brent

-----Original Message-----
From: Loren Wilton [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 27, 2006 3:04 PM
To: users@spamassassin.apache.org
Subject: Re: Newbie Rule Question

> I need to check, "Date: Wed, 27 Sep 2006 14:17:17 -0400" and I've looked

Quite ignoring the arguments people will make against this (including me) you could do something like the following. Of course remember the date header is when the mail was made in whatever timezone it was made, not in YOUR timezone. For that you would want to check the timestamp in the received header that your system adds, not in the date header.

#Catch 1700 to 0700
header A_BAD_TIME Date =~ /\d\s(?:1[789]|2\d|0[01234567]):/
score  A_BAD_TIME 0.2

Loren
spamassassin 3.1.4
Installed this today and removed bogofilter. I also installed spamc, and noticed that one of the suggested installs was libnet-ident-perl. Is anyone using this with spamassassin, or is it a separate module by itself?

Regards - Richard
Re: Q. about spam directed towards highest MX Record?
Rob McEwen wrote:
> (CCing Marc Perkel because I seem to recall him knowing about this)
>
> Not that I'd ever outright block based on this one factor alone, but...
> Does anyone have any stats about what percentage of spam is directed
> towards the highest MX Record? (that is, where there is more than one MX
> record?)

Our lowest priority MX is just a store and forward box left over from when backup MXs were useful. We only keep it around because a few (getting fewer) clients say the PC magazine pundits say you need one. So they pay.

We do all the normal user validation, greylisting, RBLs, same as our other servers but the spammers insist on using it. Here are the stats for yesterday:

total messages   total viruses   total spam
--------------   -------------   ----------
120,242          1,681           106,102

> Also, has anyone ever seen ANY legit mail go to the highest MX record when
> no mail server failure occurred?

Just about any MS Exchange server. I have never had a valid message from qmail/Sendmail/Postfix/Exim go to that server. Always Exchange, and generally from a small business with a "shrink wrap admin" running the mail services.

DAve

--
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
Re: Newbie Rule Question
> I need to check, "Date: Wed, 27 Sep 2006 14:17:17 -0400" and I've looked

Quite ignoring the arguments people will make against this (including me) you could do something like the following. Of course remember the date header is when the mail was made in whatever timezone it was made, not in YOUR timezone. For that you would want to check the timestamp in the received header that your system adds, not in the date header.

#Catch 1700 to 0700
header A_BAD_TIME Date =~ /\d\s(?:1[789]|2\d|0[01234567]):/
score  A_BAD_TIME 0.2

Loren
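
The pattern in the A_BAD_TIME rule can be tried outside SpamAssassin before deploying it. The sketch below copies the regex from this message into Python's `re` module (the syntax used here behaves the same way in both engines, as far as I can tell); the helper name is hypothetical.

```python
import re

# Pattern copied from the A_BAD_TIME rule in this message.
A_BAD_TIME = re.compile(r"\d\s(?:1[789]|2\d|0[01234567]):")


def hits_bad_time(date_header: str) -> bool:
    """True if the Date: header value would match the rule's pattern."""
    return bool(A_BAD_TIME.search(date_header))
```

Trying a few values shows that the sample header from the question (14:17:17) does not match, while 17:00 through 23:59 and 00:00 through 07:59 do. Note that the `0[01234567]` branch includes the whole 07:xx hour, so the window runs slightly past the "0700" in the rule's comment.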
Re: Newbie Rule Question
> Hi folks, > I'm a newbie to SA and have looked at a few tutorials on writing > custom rules, but they all seem to be too simple for what I want to do. > That, or I'm not smart enough to figure it out on my own. What I'm > needing is some guidance on how to write a custom rule that looks at the > creation time in the header, and if its between 8 PM to 7 AM score > it appropriately. I know that this might make some of you cringe at the > thought of it, but I would say that 95% - 99% of all of ham is sent > between 8 AM to 6 PM. The only this that really comes through later > than that may be some valid newsletters that really don't make a > difference. And besides, I'm not talking about scoring it at 999.000, > just add maybe .8 or even 1.000. I think it would be a handy little > rule if I had the brains to figure it out. Here is the header tag that > I need to check, "Date: Wed, 27 Sep 2006 14:17:17 -0400" and I've looked > at the date rules built into SA but I can't see how I could possibly > create a rule with regex to do what I want. Well... maybe. let me > think on it, if anyone has any ideas, please let me know! How about creating two configuration files, each with a different threshold. Then use cron to switch which file is used at 8am and 6pm. Cheers, Peter Smith
Newbie Rule Question
Hi folks,

I'm a newbie to SA and have looked at a few tutorials on writing custom rules, but they all seem to be too simple for what I want to do. That, or I'm not smart enough to figure it out on my own.

What I need is some guidance on how to write a custom rule that looks at the creation time in the header and, if it's between 8 PM and 7 AM, scores it appropriately. I know that this might make some of you cringe at the thought of it, but I would say that 95% - 99% of all of our ham is sent between 8 AM and 6 PM. The only thing that really comes through later than that may be some valid newsletters that really don't make a difference. And besides, I'm not talking about scoring it at 999.000, just adding maybe .8 or even 1.000. I think it would be a handy little rule if I had the brains to figure it out.

Here is the header tag that I need to check, "Date: Wed, 27 Sep 2006 14:17:17 -0400", and I've looked at the date rules built into SA but I can't see how I could possibly create a rule with regex to do what I want. Well... maybe. Let me think on it. If anyone has any ideas, please let me know!

Thanks a bunch!
Q. about spam directed towards highest MX Record?
(CCing Marc Perkel because I seem to recall him knowing about this) Not that I'd ever outright block based on this one factor alone, but... Does anyone have any stats about what percentage of spam is directed towards the highest MX Record? (that is, where there is more than one MX record?) Also, has anyone ever seen ANY legit mail go to the highest MX record when no mail server failure occurred? Thanks! Rob McEwen PowerView Systems [EMAIL PROTECTED] (478) 475-9032
Re: [qmailtoaster] duplicate emails
> Hi,
> have a look at rulesemporium.com
> There are descriptions of the rules, and definitely you should use only
> one out of each set of similarly named ones
> Wolfgang Hamann

Be careful there. It depends on what you mean by "similarly named". It is perfectly valid to have

    70_sare_html0.cf
    70_sare_html1.cf
    70_sare_html2.cf
    70_sare_html3.cf
    70_sare_html4.cf

on your system. Each higher numbered file adds more 'dangerous' tests to the previous one.

It would NOT be valid to have only

    70_sare_html1.cf
    70_sare_html3.cf
    70_sare_html4.cf

These depend on html0, so you would need that. And it would not make much sense to have 3 and 4 without 2.

Also, the following would be wrong:

    70_sare_html.cf
    70_sare_html0.cf
    70_sare_html1.cf
    70_sare_html2.cf

The "html" file includes html0, html1, and I believe most all the others except maybe html4. So if you had the above configuration you would have a whole lot of duplicate rules.

The other thing to look out for is versioned files:

    70_sare_whitelist_pre30.cf
    72_sare_bml_post23x.cf
    99_sare_fraud_post25x.cf

This is legal, IF you are running 2.63. However, assuming we had

    70_sare_whitelist_pre30.cf
    70_sare_whitelist_post30.cf

it would NOT be valid to have BOTH of those in your configuration.

Loren
Re: sa-learn and "Caught" spams
> Which means, for the original question, that re-learning the already caught spams will have very little effect other than wasting some processor cycles. Doing what he is doing right now is probably best. This is assuming that they were auto-learned. Not all systems are configured for auto-learning. (Mine isn't.) So in that case, if you don't manually learn, they don't get learned. Loren
Re: duplicate emails
occa_phishing.cf occa_replica.cf I have no knowledge of these. From the rules you show these aren't particularly worthwhile (nor all that well written rules). There are a number of SARE rules that cover this area much more thoroughly, and I believe these days even a number of standard rules in this area. I'd dump these files. I forget how you said you have SA integrated. If you are using spamc/spamd as the interface then you can just kill spamd and restart it. Depending on your system distribution the script to do that sometimes has various names and locations. Not every setup uses spamd though. I think Mailscanner integrates SA directly, and in this case you have to bounce mailscanner. So what are the pieces of your mail system again? And which OS distro? Someone will likely know what to knock over the head to restart SA on that configuration. Loren
RE: sa-learn and "Caught" spams
Mike Woods wrote: > The internet is a great place for raising more questions than it > answers :D > > Given all the opinions I think I will move the caught spam's into the > learning cycle however i'm also going to make sure that each spam is > only ever fed through the system once, this wont be a problem since I > already make use of their checksums to avoid duplicating files and I > had intended to use it to remove old spam anyway. Why not simply turn off autolearning? Then you can feed everything to sa-learn and not worry about it. -- Bowie
RE: sa-learn and "Caught" spams
> From: Mike Woods [mailto:[EMAIL PROTECTED] > > The internet is a great place for raising more questions than it answers > :D > > Given all the opinions I think I will move the caught spam's into the > learning cycle however i'm also going to make sure that each spam is > only ever fed through the system once, this wont be a problem since I > already make use of their checksums to avoid duplicating files and I had > intended to use it to remove old spam anyway. If you look at the X-Spam-Status header, it will tell you if the message was already autolearned: X-Spam-Status: Yes, score=19.5 required=5.0 tests=... (list of tests)... autolearn=spam version=3.1.5
RE: Migrate dependencies problem
On Wed, 27 Sep 2006 12:50:37 -0400, Bowie Bailey <[EMAIL PROTECTED]> wrote: >Benny Pedersen wrote: >> On Wed, September 27, 2006 16:26, Sietse van Zanen wrote: >> > It's best to use cpan for this. It's very easy to use and will >> > automagically resolve any dependencies. >> >> just one problem with cpan is it will not solve rpm depndice >> >> > Other way is find the modules on http://rpmfind.net/ >> > Specify your search as perl-net-dns etc. >> >> package maintainer needs to make it better > >I attempted to install SA via rpm one time. After fighting with Perl >module dependencies for a couple of hours, I gave up and installed it >with a single CPAN command. I've found yum to be about the easiest. It seems CPAN has been throwing all sorts of errors lately on CentOS. I've ended up installing the Perl modules through yum as well. Thus far it's been a painless task (cue all hell breaking loose on my next install) :-D Nigel
RE: Bayes poisoning (was Re: your mail)
Peter Smith wrote: > > > The messages are simply a random stream of words, with punctuation > > > scattered in them. No HTML, no URLs being advertised, no excessive > > > capitalisation, just meaningless text. > > I'm cautious about feeding these messages to sa-learn as spam, in > case it has a negative impact on genuine messages. The punctuation is > pretty good - full stops every dozen words or so, the odd comma. In > fact, it's probably better punctuation than most of my users use:) At > the moment I'm just black-listing host or netblocks which this junk > is coming from. As long as you learn the messages as spam, they will have no negative impact. The only way these messages could cause problems is if they get autolearned as ham instead of spam. -- Bowie
RE: Migrate dependencies problem
Benny Pedersen wrote: > On Wed, September 27, 2006 16:26, Sietse van Zanen wrote: > > It's best to use cpan for this. It's very easy to use and will > > automagically resolve any dependencies. > > just one problem with cpan is it will not solve rpm depndice > > > Other way is find the modules on http://rpmfind.net/ > > Specify your search as perl-net-dns etc. > > package maintainer needs to make it better I attempted to install SA via rpm one time. After fighting with Perl module dependencies for a couple of hours, I gave up and installed it with a single CPAN command. -- Bowie
FORGED_YAHOO_RCVD?
This autoresponse from Yahoo abuse crept over the spam line, mostly because of a hit on FORGED_YAHOO_RCVD... but it's not clear from the headers why that would be. This is from a Fedora Core 5 system running SpamAssassin 3.1.3 under amavisd-new 2.4.2:

Return-Path: <[EMAIL PROTECTED]>
Received: from xenopodid.cs.arizona.edu (xenopodid.cs.arizona.edu [192.12.69.105]) by email.cs.arizona.edu (8.13.3/8.13.3) with ESMTP id k8RFY9pl088354 for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:09 -0700 (MST) (envelope-from [EMAIL PROTECTED])
Received: from localhost (xenopodid.cs.arizona.edu [127.0.0.1]) by xenopodid.cs.arizona.edu (Postfix) with ESMTP id 6FCBA6DCCEA for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:09 -0700 (MST)
X-Virus-Scanned: amavisd-new at cs.arizona.edu
X-Spam-Flag: YES
X-Spam-Score: 5.019
X-Spam-Level: *
X-Spam-Status: Yes, score=5.019 tagged_above=- required=5 tests=[BAYES_40=-0.185, DNS_FROM_RFC_ABUSE=0.2, DNS_FROM_RFC_POST=1.708, DNS_FROM_RFC_WHOIS=1.447, FORGED_YAHOO_RCVD=1.849]
Received: from xenopodid.cs.arizona.edu ([127.0.0.1]) by localhost (xenopodid.cs.arizona.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QAQkAwgDuuF1 for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:03 -0700 (MST)
Received: from cheltenham.cs.arizona.edu (cheltenham.cs.arizona.edu [192.12.69.60]) by xenopodid.cs.arizona.edu (Postfix) with ESMTP id 90AAC6DCCD1 for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:03 -0700 (MST)
Received: from mail-relay1.yahoo.com (mail-relay1.yahoo.com [216.145.48.34]) by cheltenham.cs.arizona.edu (8.13.4/8.13.4) with ESMTP id k8RFY0Gj014456 for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:34:03 -0700 (MST) (envelope-from [EMAIL PROTECTED])
Received: from speedster.cc.kana.corp.yahoo.com (speedster.cc.kana.corp.yahoo.com [207.126.228.28]) by mail-relay1.yahoo.com (8.13.6/8.13.6/mr1) with SMTP id k8RFMTSi086721 for <[EMAIL PROTECTED]>; Wed, 27 Sep 2006 08:22:29 -0700 (PDT)
Message-Id: <[EMAIL PROTECTED]>
Precedence: bulk
Auto-Submitted: auto-replied
Date: Wed, 27 Sep 2006 08:22:28 -0700
To: [EMAIL PROTECTED]
Subject: A message from Yahoo! Customer Care (KMM37445094V70533L0KM)
From: Yahoo! Mail <[EMAIL PROTECTED]>
Reply-To: Yahoo! Mail <[EMAIL PROTECTED]>
MIME-Version: 1.0
Content-Type: text/plain; charset = "us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: KANA Response 7.0.1.142
X-UID: 371043

Thank you for contacting Yahoo! Customer Care to answer your question. A support representative will get back to you within 48 hours regarding your issue. Until then, feel free to visit our online help center at http://help.yahoo.com/ for answers if you have not already done so.
RE: your mail
John D. Hardin wrote: > On Wed, 27 Sep 2006, Peter Smith wrote: > > > The messages are simply a random stream of words, with punctuation > > scattered in them. No HTML, no URLs being advertised, no excessive > > capitalisation, just meaningless text. > > Technically, then, it's not spam. Spam requires a commercial message > of some sort. :) That depends on whose definition you use. I would say that any unsolicited and unwanted email qualifies as spam. > > As such, SA is finding very little to complain about, and is even > > lowering the scoring because the bayes filtering deems it to be > > good. > > I'm torn about whether or not to train on such messages. I do hand > training so I keep pretty tight control over what gets trained. I use very simple criteria for Bayes training. If it's something I want in the inbox, I train it as ham. If it's something I don't want in the inbox, I train it as spam. Messages with random garbage in them are definitely in the second set. :) -- Bowie
RE: [qmailtoaster] duplicate emails
Hi, have a look at rulesemporium.com There are descriptions of the rules, and definitely you should use only one out of each set of similarly named ones Wolfgang Hamann >> 70_sare_evilnum1.cf >> 70_sare_evilnum2.cf >> 70_sare_header0.cf >> 70_sare_header.cf >> 70_sare_header_eng.cf >> 70_sare_html0.cf >> 70_sare_html1.cf >> 70_sare_html2.cf >> 70_sare_html3.cf >> 70_sare_html4.cf >> 70_sare_html_eng.cf >> 70_sare_oem.cf >> 70_sare_random.cf >> 70_sare_ratware.cf >> 70_sare_specfic.cf >> 70_sare_uri0.cf >> 70_sare_uri.cf >> 70_sare_whitlelist.cf >> 70_sare_whitelist_pre30.cf >> 72_sare_bml_post23x.cf >> 99_sare_fraud_post25x.cf >> antidrug.cf >> blacklist.cf >> blacklist-uri.cf >> bogus-virus-warnings.cf >> >>
RE: Migrate dependencies problem
On Wed, September 27, 2006 16:26, Sietse van Zanen wrote: > It's best to use cpan for this. It's very easy to use and will automagically > resolve any > dependencies. The one problem with cpan is that it will not satisfy rpm dependencies. > Other way is find the modules on http://rpmfind.net/ > Specify your search as perl-net-dns etc. The package maintainer needs to make it better. -- "This message was sent using 100% recycled spam mails."
Re: sa-learn and "Caught" spams
The internet is a great place for raising more questions than it answers :D Given all the opinions I think I will move the caught spam's into the learning cycle however i'm also going to make sure that each spam is only ever fed through the system once, this wont be a problem since I already make use of their checksums to avoid duplicating files and I had intended to use it to remove old spam anyway. Ta guys, much food for thought :D -- Mike Woods Systems Administrator
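The "feed each spam only once" idea above might look something like this. It is a sketch only: the post does not say which checksum is in use, so SHA-256 over the raw message bytes is an assumption, and `new_messages` is a hypothetical helper name.

```python
import hashlib


def new_messages(messages, seen_digests):
    """Return the messages whose checksum has not been seen before.

    messages     -- iterable of raw messages as bytes
    seen_digests -- set of hex digests, updated in place so that it can
                    be saved and reused between runs
    """
    fresh = []
    for raw in messages:
        digest = hashlib.sha256(raw).hexdigest()
        if digest not in seen_digests:
            seen_digests.add(digest)
            fresh.append(raw)
    return fresh
```

Note that this only catches byte-for-byte duplicates. The same spam delivered to two users will usually differ in its Received headers and so will hash differently.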
A stupid config question
Hi, I have migrated from SpamAssassin 2.63 to 3.1.5-1. It seems to be running: some mail is flagged and rpm -a sees the new version. But previously the rules and local.cf were in /etc/mail/spamassassin, and these files were not modified by my rpm -Uvh. I want to know whether the config files are still in the same directory? Regards Philippe COUAS Responsable Développement INFODEV S.A.
Re: sa-learn and "Caught" spams
On Wed, September 27, 2006 11:38 am, Nels Lindquist said: > Daniel T. Staal wrote: > >> On Wed, September 27, 2006 11:10 am, Jim Maul said: >> >>> I believe that SA will not learn a message it has seen before so >>> multiple sa-learn's will not have any affect. >> >> Actually, that was my impression too. >> >> Which means, for the orginal question, that re-learning the already >> caught spams will have very little effect other than wasting some >> processor cycles. Doing what he is doing right now is probably best. > > Except that there's a significant difference between "already caught" and > "already learned" spam. The threshold for learning is much higher (and > has specific requirements WRT point contributions of various types) so > it's definitely possible to have, for example, a message that was > correctly flagged as spam entirely due to network tests that was not > auto-learned. Training such messages then reinforces Bayes on the > content side, so future messages that look similar but perhaps have a new > URL that hasn't hit the blacklists yet can still be flagged. True. So... Optimal is obviously to train, once and correctly, on all messages. Sending a message through that has been trained will consume *some* resources, but less than one that still needs to be learned. So the exact balance is a complicated question. ;) Daniel T. Staal --- This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law. ---
Re: sa-learn and "Caught" spams
Daniel T. Staal wrote: > On Wed, September 27, 2006 11:10 am, Jim Maul said: > >> I believe that SA will not learn a message it has seen before so >> multiple sa-learn's will not have any affect. > > Actually, that was my impression too. > > Which means, for the orginal question, that re-learning the already caught > spams will have very little effect other than wasting some processor > cycles. Doing what he is doing right now is probably best. Except that there's a significant difference between "already caught" and "already learned" spam. The threshold for learning is much higher (and has specific requirements WRT point contributions of various types) so it's definitely possible to have, for example, a message that was correctly flagged as spam entirely due to network tests that was not auto-learned. Training such messages then reinforces Bayes on the content side, so future messages that look similar but perhaps have a new URL that hasn't hit the blacklists yet can still be flagged. Nels Lindquist
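The "caught" versus "learned" distinction can be read straight off the X-Spam-Status header, which (as noted elsewhere in this thread) carries an autolearn= field when SpamAssassin writes it. A small Python sketch, with hypothetical function names, that pulls the field out:

```python
import re

AUTOLEARN_RE = re.compile(r"\bautolearn=([a-z]+)")


def autolearn_result(x_spam_status):
    """Return the autolearn value from an X-Spam-Status header, or None."""
    m = AUTOLEARN_RE.search(x_spam_status)
    return m.group(1) if m else None


def was_autolearned(x_spam_status):
    """True only if the header says the message was learned as spam or ham."""
    return autolearn_result(x_spam_status) in ("spam", "ham")
```

A message flagged as spam for which `was_autolearned` is false is exactly the case described above: caught, but not yet in the Bayes database. Headers written by other front ends may not include the field at all, in which case this returns None and the message is treated as not learned.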
Re: Received header unparseable
In a desperate newbie attempt to fix this problem myself, I added the following lines to Received.pm at line 895:

    # Received: from ([10.0.0.6]) by myfirewall; Thu,
    #   13 Mar 2003 06:26:21 -0500 (EST)
    if (/^from \(\[(${IP_ADDRESS})\]\) by myfirewall/) {
      $ip = $1; $by = 'my.firewall.ip.addr'; goto enough;
    }

Ummm, it didn't work, but it didn't break anything. How can I make this work? Add a "$helo =" ? Thanks.

benthere-nine wrote:
>
> My firewall puts a received header on every e-mail it
> forwards to SA 3.1.5:
>
> Received: from f66108.upc-f.chello.nl ([80.56.66.108])
> by myfirewall; Tue, 26 Sep 2006 12:35:52 -0500
> (Central Daylight Time)
>
> But when my firewall can't find a DNS entry for the
> e-mail's last relay IP address, it just puts in a
> blank space:
>
> Received: from ([201.19.179.63]) by myfirewall; Tue,
> 26 Sep 2006 12:35:53 -0500 (Central Daylight Time)
>
> 20_head_tests.cf hits on this as an UNPARSEABLE_RELAY.
> SA isn't able to look up that IP address on all the
> network tests.
>
> I'm e-mailing Tech Support for the company that
> publishes the firewall software, but is there anything
> that can be done on the SA side?
>
> Thank you very much.
>
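One way to narrow this down is to try the pattern outside SpamAssassin first. The Python sketch below is not a patch for Received.pm and does not explain why the edit above had no effect. It only checks the regex against the two header shapes quoted in this thread. `IP` is a simplified stand-in for `${IP_ADDRESS}`, and tolerating any amount of whitespace after "from" is an untested guess based on the firewall putting "a blank space" where the hostname would be.

```python
import re

# Simplified dotted-quad pattern; an assumption standing in for ${IP_ADDRESS}.
IP = r"\d{1,3}(?:\.\d{1,3}){3}"

# "from" followed directly by ([ip]), with any amount of whitespace between.
FIREWALL_RE = re.compile(r"^from\s*\(\[(" + IP + r")\]\)\s+by\s+myfirewall\b")


def parse_firewall_received(value):
    """Parse the hostname-less Received form; return None for anything else."""
    m = FIREWALL_RE.match(value.strip())
    if not m:
        return None
    # No HELO or reverse DNS is available in this form, so leave them empty.
    return {"ip": m.group(1), "helo": "", "rdns": "", "by": "myfirewall"}
```

If the pattern matches the real header here but the Perl version still never fires, the more likely explanation is that the inserted block is not reached for this header, which is something only a look at the surrounding code in Received.pm can confirm.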
Re: sa-learn and "Caught" spams
On Wed, September 27, 2006 11:10 am, Jim Maul said: > Daniel T. Staal wrote: >> On Wed, September 27, 2006 10:43 am, Matt Kettler said: >>> Mike Woods wrote: Hi guys, bit of a query regarding sa-learn and messages that have already been tagged as spam. We have spamassassin scanning mail via amavisd and sending any caught spams to a spam folder in the users accounts (using plus addressing), we've also been getting users to drop any missed spams into this spam folder so we can train spamassassin on them, at present I have a script that moves *only* the missed spams to a master folder for sa-learn, my question is simple, would there be any benefit in including the mails identified as spam in this process, I know sa-learn looks for common patterns in spams to identify them as spam but im unsure if adding known spams in would be beneficial in this ? >>> YES. There is DEFINITELY a benefit to learning messages tagged as spam. >>> Even if they got BAYES_99. >>> >>> Why? because spam mutates over time, and even if a spam got bayes_99, >>> it >>> may still have new variants of "hot" words in it that will help it keep >>> hitting the same kind of spam as it changes. If you wait till this kind >>> of message mutates enough to no longer be bayes_99, you've put yourself >>> behind the curve, and now you have to catch up to the new variant. >> >> While I in general agree with this, I was under the impression that >> spamassassin will auto-learn from messages it marks. (At least, past a >> certain threshold.) In which case, feeding the spam messages to it >> again >> would bias the database towards spam, as the messages are being learned >> twice. > > I believe that SA will not learn a message it has seen before so > multiple sa-learn's will not have any affect. Actually, that was my impression too. Which means, for the orginal question, that re-learning the already caught spams will have very little effect other than wasting some processor cycles. 
Doing what he is doing right now is probably best. Daniel T. Staal
RE: duplicate emails
>sa-blacklist.cf >sa-blacklist.current.uri.cf >Get rid of these! They are evil and probably the root of your problem! > (They are also long depreciated and very out of date, so wouldn't be doing >much even if they didn't kill your system.) I have removed those from /etc/mail/spamassassin. >occa_phishing.cf >occa_replica.cf >I have no knowledge of these. As far as I can tell these are rules files created by the previous system manager. I am not aware if they are functional or not. The occa_phishing.cf file was setup to stop phishing emails.Here is the content of that file: body OCCA_PHISH_COMFED_RULE /Commercial Federal/ score OCCA_PHISH_COMFED_RULE 0.2 describe OCCA_PHISH_COMFED_RULE This rule tries to eliminate phishing using comfed The occa_replica.cf file was set up to stop spam emails for replica rolex watches. Here is the content of that file: body OCCA_ROLEX_RULE /Rolex/ score OCCA_ROLEX_RULE 0.1 describe OCCA_ROLEX_RULE This rule tries to eliminate Rolex replica spam body OCCA_REPLICA_RULE /Replica/ score OCCA_REPLICA_RULE 0.1 describe OCCA_REPLICA_RULE replica watches meta OCCA_REPL_ROL_RULE (OCCA_ROLEX_RULE + OCCA_REPLICA_RULE > .1) score OCCA_REPL_ROL_RULE 2 >random.cf >random.current.cf >I'm not sure what the 'current' one is, but I strongly suspect one of these >is not necessary. They look identical. I removed the random.current.cf file. >antidrug.cf >You shoudn't be using this unless you are on 2.6x or earlier. Since 3.0 >antidrug has been part of the stock rules. We are running 3.0.4. I removed antidrug.cf >blacklist.cf >blacklist-uri.cf >I'm not sure what these are, but they may be early versions of sa->blacklist, >and probably a bad thing to have. Removed. >70_sare_whitelist_pre30.cf >72_sare_bml_post23x.cf >99_sare_fraud_post25x.cf >And which version of SA are you running? This tells me you are on 2.6x, >since you have to be after 2.5x and before 3.0. Is that really true? 
If >so, upgrade to a current version of SA and drop the whitelist_pre30 and >replace it with whitelist, and possibly pick other versions of the other >two >files. The information on the Spamassassin version shows that we are running version 3.0.4. I suspect there were several rules sets that were from previous versions of SA. I am not sure how long the previous administrator was using SA. I received no information concerning SA when I came on board so I am trying to understand how and why all of our systems are set up the way they are. This information about what should or should not be used is very helpful. I also have another question concerning the spamassassin control. I understand that I should be able to restart spamassassin by using: /etc/init.d/spamassassin restart However, there are no files in /etc/init.d/ for spamassassin so I get an error message stating: Bash: etc/init.d/spamassassin: no such file or directory. The only way I have been able to restart spamassassin is to restart the server. If spamassassin is not in /etc/init.d where would it be and how can I find it? Thank you, Steve Ingraham
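A side note on the occa_replica.cf rules quoted above, sketched in Python because it is easier to test than a live rule set. The sketch assumes that inside a `meta` expression a rule name evaluates to 1 when that rule hit and 0 otherwise (not to its score), which is how I understand SpamAssassin meta rules to work. Under that assumption, `(OCCA_ROLEX_RULE + OCCA_REPLICA_RULE > .1)` fires when either word is present, not only when both are. The function names are made up for the illustration.

```python
import re

# The two body rules from occa_replica.cf, patterns copied as written
# (case-sensitive, since the originals carry no /i modifier).
BODY_RULES = {
    "OCCA_ROLEX_RULE": re.compile(r"Rolex"),
    "OCCA_REPLICA_RULE": re.compile(r"Replica"),
}


def rule_hits(body):
    """Map each rule name to 1 if its pattern matches the body, else 0."""
    return {name: int(bool(rx.search(body))) for name, rx in BODY_RULES.items()}


def meta_as_written(hits):
    """The meta expression as it appears in the file: sum > .1"""
    return hits["OCCA_ROLEX_RULE"] + hits["OCCA_REPLICA_RULE"] > 0.1


def meta_requiring_both(hits):
    """Variant that is only true when both body rules hit: sum > 1"""
    return hits["OCCA_ROLEX_RULE"] + hits["OCCA_REPLICA_RULE"] > 1
```

So a legitimate message that merely mentions "Rolex" would pick up the 2 points from the meta rule on top of the 0.1 from the body rule, while "replica rolex" in lower case would not match at all.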
Re: sa-learn and "Caught" spams
Daniel T. Staal wrote: On Wed, September 27, 2006 10:43 am, Matt Kettler said: Mike Woods wrote: Hi guys, bit of a query regarding sa-learn and messages that have already been tagged as spam. We have spamassassin scanning mail via amavisd and sending any caught spams to a spam folder in the users accounts (using plus addressing), we've also been getting users to drop any missed spams into this spam folder so we can train spamassassin on them, at present I have a script that moves *only* the missed spams to a master folder for sa-learn, my question is simple, would there be any benefit in including the mails identified as spam in this process, I know sa-learn looks for common patterns in spams to identify them as spam but im unsure if adding known spams in would be beneficial in this ? YES. There is DEFINITELY a benefit to learning messages tagged as spam. Even if they got BAYES_99. Why? because spam mutates over time, and even if a spam got bayes_99, it may still have new variants of "hot" words in it that will help it keep hitting the same kind of spam as it changes. If you wait till this kind of message mutates enough to no longer be bayes_99, you've put yourself behind the curve, and now you have to catch up to the new variant. While I in general agree with this, I was under the impression that spamassassin will auto-learn from messages it marks. (At least, past a certain threshold.) In which case, feeding the spam messages to it again would bias the database towards spam, as the messages are being learned twice. I believe that SA will not learn a message it has seen before so multiple sa-learn's will not have any affect. So the question would have to be: Does Spamassassin automatically update the Bayes database from (some/any) messages it flags as spam or ham? I would think only if you try to reverse/forget the original learning. Daniel T. Staal -Jim
Re: sa-learn and "Caught" spams
On Wed, September 27, 2006 10:43 am, Matt Kettler said: > Mike Woods wrote: >> Hi guys, bit of a query regarding sa-learn and messages that have >> already been tagged as spam. >> >> We have spamassassin scanning mail via amavisd and sending any caught >> spams to a spam folder in the users accounts (using plus addressing), >> we've also been getting users to drop any missed spams into this spam >> folder so we can train spamassassin on them, at present I have a >> script that moves *only* the missed spams to a master folder for >> sa-learn, my question is simple, would there be any benefit in >> including the mails identified as spam in this process, I know >> sa-learn looks for common patterns in spams to identify them as spam >> but im unsure if adding known spams in would be beneficial in this ? > > YES. There is DEFINITELY a benefit to learning messages tagged as spam. > Even if they got BAYES_99. > > Why? because spam mutates over time, and even if a spam got bayes_99, it > may still have new variants of "hot" words in it that will help it keep > hitting the same kind of spam as it changes. If you wait till this kind > of message mutates enough to no longer be bayes_99, you've put yourself > behind the curve, and now you have to catch up to the new variant. While I in general agree with this, I was under the impression that spamassassin will auto-learn from messages it marks. (At least, past a certain threshold.) In which case, feeding the spam messages to it again would bias the database towards spam, as the messages are being learned twice. So the question would have to be: Does Spamassassin automatically update the Bayes database from (some/any) messages it flags as spam or ham? Daniel T. Staal
Bayes poisoning (was Re:)
> Are you runing net tests? It sounds like someone has a broken zombie net > that is supposed to be sending out gif spams, but they forgot the images. > Net tests would probably catch these easily. Well I'm using the following: score DCC_CHECK 1.0 score PYZOR_CHECK 1.0 score RAZOR_CHECK 1.0 score RCVD_IN_BL_SPAMCOP_NET 3.0 score X_CHINESE_RELAY 1.5 score X_KOREAN_RELAY 1.5 score X_SPAMHAUS 1.5 But none of these are being triggered by the offending messages. Are there other net checks I should be using? Thanks, Peter Smith
Bayes poisoning (was Re: your mail)
>> The messages are simply a random stream of words, with punctuation >> scattered in them. No HTML, no URLs being advertised, no excessive >> capitalisation, just meaningless text. > > Technically, then, it's not spam. Spam requires a commercial message > of some sort. :) Yeah, I think I said 'junk' rather than spam. I wonder if such mail has a name? > I would agree that it's an attempt to poison your bayes database, > assuming that you have autolearn turned on, either by skewing the > scores towards ham or by bloating the database. Do you think the perpetrators are poisoning the bayes db with a view to sending spam at a later date? We aren't a big organisation - few hundred mail boxes - so it seems rather long lengths for a spammer to go to. Another suggestion was that the spammer had intended to attach an image, which hadn't got through. Given the technical competence of many spammers, it seems more likely they screwed up and forgot to attach the image. But I'm just guessing here. >> Any thoughts on what I can do about these messages? Even with >> bayes turned off, they would still fail to score more than say 2 >> or 3. Each message contains a different paragraph of random text, >> so it's not possible to pick out keywords; and the messages are >> coming from dialup machines, so blocking IP isn't going to be very >> effective. > > Look for punctuation? A good deal of the random bayes poison at one > time was totally without punctuation. I'm cautious about feeding these messages to sa-learn as spam, in case it has a negative impact on genuine messages. The punctuation is pretty good - full stops every dozen words or so, the odd comma. In fact, it's probably better punctuation than most of my users use:) At the moment I'm just black-listing host or netblocks which this junk is coming from. Apologies for not setting a subject in my original mail by the way Peter Smith
Re: sa-learn and "Caught" spams
Mike Woods wrote: > Hi guys, bit of a query regarding sa-learn and messages that have > already been tagged as spam. > > We have spamassassin scanning mail via amavisd and sending any caught > spams to a spam folder in the users accounts (using plus addressing), > we've also been getting users to drop any missed spams into this spam > folder so we can train spamassassin on them, at present I have a > script that moves *only* the missed spams to a master folder for > sa-learn, my question is simple, would there be any benefit in > including the mails identified as spam in this process, I know > sa-learn looks for common patterns in spams to identify them as spam > but im unsure if adding known spams in would be beneficial in this ? YES. There is DEFINITELY a benefit to learning messages tagged as spam. Even if they got BAYES_99. Why? because spam mutates over time, and even if a spam got bayes_99, it may still have new variants of "hot" words in it that will help it keep hitting the same kind of spam as it changes. If you wait till this kind of message mutates enough to no longer be bayes_99, you've put yourself behind the curve, and now you have to catch up to the new variant. In general: DO NOT intentionally try to bias the training of your bayes database for any reason. That's just self-inflicted bayes poison. If it's spam, train it as spam. Do not hold back because of "ham-like" content. Do not hold back because it was already tagged. If it's spam, train it as such. The same goes for nonspam training. Don't hold back training any emails that you don't want to be tagged, even if they contain "spam words". SpamAssassin's bayes system will handle the gray cases just fine. It does particularly well at this because of the chi-squared combining, as compared to the results of simple averaging.
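To make the last point a little more concrete, here is a Python sketch of chi-squared combining next to plain averaging. It follows the Fisher/Robinson scheme as popularised by the SpamBayes project. Whether it matches SpamAssassin's own implementation term for term is an assumption on my part, and the function names are made up. Each input is a per-token spam probability between 0 and 1.

```python
import math


def chi2q(x, df):
    """Probability that a chi-squared variable with even df exceeds x."""
    m = x / 2.0
    term = total = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine_chi2(probs):
    """Combine token probabilities using Fisher's method in both directions."""
    # Clamp away from 0 and 1 so the logarithms stay finite.
    probs = [min(max(p, 1e-6), 1.0 - 1e-6) for p in probs]
    n = len(probs)
    spam_evidence = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spam_evidence - ham_evidence + 1.0) / 2.0


def combine_mean(probs):
    """Simple averaging, for comparison; probs must be non-empty."""
    return sum(probs) / len(probs)
```

With six tokens at 0.9 each, averaging can never say more than 0.9, while the chi-squared result comes out above 0.99 because agreeing evidence accumulates. One strongly spammy token and one strongly hammy token cancel to 0.5, the "unsure" middle, which is the behaviour that lets mixed content sort itself out.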
Re: your mail
On Wed, 27 Sep 2006, Peter Smith wrote: > The messages are simply a random stream of words, with punctuation > scattered in them. No HTML, no URLs being advertised, no excessive > capitalisation, just meaningless text. Technically, then, it's not spam. Spam requires a commercial message of some sort. :) > As such, SA is finding very little to complain about, and is even > lowering the scoring because the bayes filtering deems it to be > good. I'm torn about whether or not to train on such messages. I do hand training so I keep pretty tight control over what gets trained. I would agree that it's an attempt to poison your bayes database, assuming that you have autolearn turned on, either by skewing the scores towards ham or by bloating the database. > Any thoughts on what I can do about these messages? Even with > bayes turned off, they would still fail to score more than say 2 > or 3. Each message contains a different paragraph of random text, > so it's not possible to pick out keywords; and the messages are > coming from dialup machines, so blocking IP isn't going to be very > effective. Look for punctuation? A good deal of the random bayes poison at one time was totally without punctuation. -- John Hardin KA7OHZ ICQ#15735746 http://www.impsec.org/~jhardin/ [EMAIL PROTECTED] FALaholic #11174 pgpk -a [EMAIL PROTECTED] key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- ...every time I sit down in front of a Windows machine I feel as if the computer is just a place for the manufacturers to put their advertising. -- fwadling on Y! SCOX --
RE: Migrate dependencies problem
It's best to use cpan for this. It's very easy to use and will automagically resolve any dependencies. The other way is to find the modules on http://rpmfind.net/ Specify your search as perl-net-dns etc. -Sietse

From: Philippe Couas
Sent: Wed 27-Sep-06 16:15
To: users@spamassassin.apache.org
Subject: Migrate dependencies problem

Hi, I want to migrate from SpamAssassin 2.63 to 3.1.5-1 on my mail server on Redhat9.
1 I use perl 5.8.0
2 I have stopped spamd
3 run "sa-relearn --rebuild"
4 rpm -Uvh spamassassin-3.1.5-1.rh9.rf.i386.rpm
warning: spamassassin-3.1.5-1.rh9.rf.i386.rpm: V3 DSA signature: NOKEY, key ID 6b8d79e6
error: Failed dependencies:
    perl(Digest::SHA1) is needed by spamassassin-3.1.5-1.rh9.rf
    perl(Net::DNS) is needed by spamassassin-3.1.5-1.rh9.rf
    perl(Time::HiRes) is needed by spamassassin-3.1.5-1.rh9.rf
Where can I find these optional Perl packages, and how do I install them? Regards Philippe Philippe COUAS Responsable Développement INFODEV S.A.
Re: [qmailtoaster] duplicate emails
sa-blacklist.cf sa-blacklist.current.uri.cf Get rid of these! They are evil and probably the root of your problem! (They are also long depreciated and very out of date, so wouldn't be doing much even if they didn't kill your system.) occa_phishing.cf occa_replica.cf I have no knowledge of these. random.cf random.current.cf I'm not sure what the 'current' one is, but I strongly suspect one of these is not necessary. antidrug.cf You shoudn't be using this unless you are on 2.6x or earlier. Since 3.0 antidrug has been part of the stock rules. blacklist.cf blacklist-uri.cf I'm not sure what these are, but they may be early versions of sa-blacklist, and probably a bad thing to have. 70_sare_whitelist_pre30.cf 72_sare_bml_post23x.cf 99_sare_fraud_post25x.cf And which version of SA are you running? This tells me you are on 2.6x, since you have to be after 2.5x and before 3.0. Is that really true? If so, upgrade to a current version of SA and drop the whitelist_pre30 and replace it with whitelist, and possibly pick other versions of the other two files. Loren - Original Message - From: "Steve Ingraham" <[EMAIL PROTECTED]> To: ; ; Sent: Wednesday, September 27, 2006 6:47 AM Subject: RE: [qmailtoaster] duplicate emails Jdow wrote: Steve, it might help if you listed which rule sets. There are some which are obscenely large and others that are obsolete. Maybe we can prune the list for you a little. As some have mentioned I may have too many rules. I would like to know what is a must have and what I should not use. Here is a list of what is currently in the /etc/mail/spamassassin/ folder: CURRENTLY IN /ETC/MAIL/SPAMASSASSIN 70_sare_adult.cf 70_sare_bayes_poison_nxm.cf 70_sare_evilnum0.cf occa_phishing.cf occa_replica.cf sa-blacklist.cf sa-blacklist.current.uri.cf tripwire.cf chickenpox.cf init.pre random.cf random.current.cf weeds2.cf local.cf The rules below were moved yesterday and placed in a different folder. 
Once I moved these and restarted spamassassin by rebooting the server it was no longer bogging down and duplicating emails. REMOVED YESTERDAY FROM /ETC/MAIL/SPAMASSASSIN 70_sare_evilnum1.cf 70_sare_evilnum2.cf 70_sare_header0.cf 70_sare_header.cf 70_sare_header_eng.cf 70_sare_html0.cf 70_sare_html1.cf 70_sare_html2.cf 70_sare_html3.cf 70_sare_html4.cf 70_sare_html_eng.cf 70_sare_oem.cf 70_sare_random.cf 70_sare_ratware.cf 70_sare_specfic.cf 70_sare_uri0.cf 70_sare_uri.cf 70_sare_whitlelist.cf 70_sare_whitelist_pre30.cf 72_sare_bml_post23x.cf 99_sare_fraud_post25x.cf antidrug.cf blacklist.cf blacklist-uri.cf bogus-virus-warnings.cf Here is the content of my config file for rules_du_jour: TRUSTED_RULESETS="TRIPWIRE ANTIDRUG SARE_EVILNUMBERS0 BLACKLIST BLACKLIST_URI RANDOMVAL BOGUSVIRUS SARE_ADULT SARE_FRAUD SARE_BAYES_POISON_NXM SARE_OEM SARE_RANDOM SARE_HEADER SARE_HEADER0 SARE_HEADER_ENG SARE_HTML0 SARE_HTML1 SARE_HTML2 SARE_HTML3 SARE_HTML4 SARE_HTML_ENG SARE_RATWARE SARE_SPECIFIC SARE_URI SARE_BML_POST25X SARE_WHITELIST SARE_WHITELIST_PRE30" SA_DIR="/etc/mail/spamassassin" MAIL_ADDRESS="[EMAIL PROTECTED]" SA_RESTART="killall -HUP spamd" I have quite a few users who get a lot of spam, especially pharmaceuticals and stocks delivered to their mailboxes. They are why I began trying to work on the spamassassin filtering. An interesting note I have observed but do not understand why it is happening. When I updated the rules on Monday, many users started seeing an increase number of spam in their mailboxes. One user who was getting a great deal of duplicate emails was also seeing a huge increase in the total numbers of spam emails. Where she would receive 100 spam emails per day before the rules_du_jour update, afterwards she was seeing 800 or 900 spam emails per day. Much of it was porn spam that she was not seeing before the update to the rules files. I would appreciate any advice and/or education offered on the spam filtering. 
Thanks, Steve Ingraham {^_^} - Original Message - From: "Steve Ingraham" <[EMAIL PROTECTED]> Steve Ingraham wrote: I need help with a problem. Our users are seeing some multiple duplicate emails coming from the same sender. This is not occurring with every email so there does not seem to be any pattern to which incoming emails will be duplicated and which ones won't. They are also reporting that duplicate emails are sent when they send to an outside email. Has anyone experienced this problem before? What could be causing this to occur and what can I do to stop this? I am running qmailtoaster and spamassassin as an external email gateway. There has been nothing changed with qmail but I did update some rules in SA using rules_du_jour yesterday. Would these rules updates cause this problem? If so, what would have changed? Jake Vickers wrote: If your system is low on resources (ie: RAM), then the spamd process can take too long, making Toaster think the ma
Migrate dependencies problem
Hi, I want to migrate from SpamAssassin 2.63 to 3.1.5 on my mail server on Red Hat 9.
1. I use Perl 5.8.0.
2. I have stopped spamd.
3. I ran "sa-learn --rebuild".
4. rpm -Uvh spamassassin-3.1.5-1.rh9.rf.i386.rpm
warning: spamassassin-3.1.5-1.rh9.rf.i386.rpm: V3 DSA signature: NOKEY, key ID 6b8d79e6
error: Failed dependencies:
perl(Digest::SHA1) is needed by spamassassin-3.1.5-1.rh9.rf
perl(Net::DNS) is needed by spamassassin-3.1.5-1.rh9.rf
perl(Time::HiRes) is needed by spamassassin-3.1.5-1.rh9.rf
Where can I find these optional Perl packages, and how do I install them? Regards, Philippe
Philippe COUAS Responsable Développement INFODEV S.A.
Re: sa-learn and "Caught" spams
Mike Woods wrote:

> Hi guys, bit of a query regarding sa-learn and messages that have already been tagged as spam. We have SpamAssassin scanning mail via amavisd and sending any caught spams to a spam folder in the users' accounts (using plus addressing). We've also been getting users to drop any missed spams into this spam folder so we can train SpamAssassin on them. At present I have a script that moves *only* the missed spams to a master folder for sa-learn. My question is simple: would there be any benefit in including the mails already identified as spam in this process? I know sa-learn looks for common patterns in spams to identify them as spam, but I'm unsure if adding known spams would be beneficial.

If I understand you right: I relearn the messages that get *only* e.g. BAYES_00, BAYES_50, BAYES_80 etc., and it seems to help. On a closer look, only a percentage of these "retrained" messages is actually added to the Bayes database ...

> Ta for any help! --- Mike Woods Systems Administrator

hth MH
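The selection described in the reply above (relearn only what Bayes did not already rate highly) could be sketched roughly as follows. This is an illustration, not MH's actual script: it assumes each message carries an `X-Spam-Status` header with a `tests=` list containing a `BAYES_nn` rule name, and the function names and the default threshold of 99 are invented for the example.

```python
import email
import re

# Matches rule names such as BAYES_00, BAYES_50, BAYES_99 in the tests= list.
BAYES_RULE = re.compile(r"\bBAYES_(\d{2})\b")


def bayes_bucket(raw_message):
    """Return the BAYES_nn bucket as an int, or None if no Bayes rule hit."""
    msg = email.message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    match = BAYES_RULE.search(status)
    return int(match.group(1)) if match else None


def needs_retraining(raw_message, threshold=99):
    """True when Bayes was absent or rated the message below the threshold.

    The threshold is an assumption for illustration; pick whatever cut-off
    suits your own mail.
    """
    bucket = bayes_bucket(raw_message)
    return bucket is None or bucket < threshold
```

Messages for which `needs_retraining` returns True would then be copied to the master folder that sa-learn reads. Whether to feed the already-caught spam as well is the open question of this thread, so treat the filter as optional.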
RE: [qmailtoaster] duplicate emails
Jdow wrote: >Steve, it might help if you listed which rule sets. There are some >which are obscenely large and others that are obsolete. Maybe we >can prune the list for you a little. As some have mentioned I may have too many rules. I would like to know what is a must have and what I should not use. Here is a list of what is currently in the /etc/mail/spamassassin/ folder: CURRENTLY IN /ETC/MAIL/SPAMASSASSIN 70_sare_adult.cf 70_sare_bayes_poison_nxm.cf 70_sare_evilnum0.cf occa_phishing.cf occa_replica.cf sa-blacklist.cf sa-blacklist.current.uri.cf tripwire.cf chickenpox.cf init.pre random.cf random.current.cf weeds2.cf local.cf The rules below were moved yesterday and placed in a different folder. Once I moved these and restarted spamassassin by rebooting the server it was no longer bogging down and duplicating emails. REMOVED YESTERDAY FROM /ETC/MAIL/SPAMASSASSIN 70_sare_evilnum1.cf 70_sare_evilnum2.cf 70_sare_header0.cf 70_sare_header.cf 70_sare_header_eng.cf 70_sare_html0.cf 70_sare_html1.cf 70_sare_html2.cf 70_sare_html3.cf 70_sare_html4.cf 70_sare_html_eng.cf 70_sare_oem.cf 70_sare_random.cf 70_sare_ratware.cf 70_sare_specfic.cf 70_sare_uri0.cf 70_sare_uri.cf 70_sare_whitlelist.cf 70_sare_whitelist_pre30.cf 72_sare_bml_post23x.cf 99_sare_fraud_post25x.cf antidrug.cf blacklist.cf blacklist-uri.cf bogus-virus-warnings.cf Here is the content of my config file for rules_du_jour: TRUSTED_RULESETS="TRIPWIRE ANTIDRUG SARE_EVILNUMBERS0 BLACKLIST BLACKLIST_URI RANDOMVAL BOGUSVIRUS SARE_ADULT SARE_FRAUD SARE_BAYES_POISON_NXM SARE_OEM SARE_RANDOM SARE_HEADER SARE_HEADER0 SARE_HEADER_ENG SARE_HTML0 SARE_HTML1 SARE_HTML2 SARE_HTML3 SARE_HTML4 SARE_HTML_ENG SARE_RATWARE SARE_SPECIFIC SARE_URI SARE_BML_POST25X SARE_WHITELIST SARE_WHITELIST_PRE30" SA_DIR="/etc/mail/spamassassin" MAIL_ADDRESS="[EMAIL PROTECTED]" SA_RESTART="killall -HUP spamd" I have quite a few users who get a lot of spam, especially pharmaceuticals and stocks delivered to their mailboxes. 
They are why I began trying to work on the spamassassin filtering. An interesting note I have observed but do not understand why it is happening. When I updated the rules on Monday, many users started seeing an increase number of spam in their mailboxes. One user who was getting a great deal of duplicate emails was also seeing a huge increase in the total numbers of spam emails. Where she would receive 100 spam emails per day before the rules_du_jour update, afterwards she was seeing 800 or 900 spam emails per day. Much of it was porn spam that she was not seeing before the update to the rules files. I would appreciate any advice and/or education offered on the spam filtering. Thanks, Steve Ingraham {^_^} - Original Message - From: "Steve Ingraham" <[EMAIL PROTECTED]> Steve Ingraham wrote: I need help with a problem. Our users are seeing some multiple duplicate emails coming from the same sender. This is not occurring with every email so there does not seem to be any pattern to which incoming emails will be duplicated and which ones won't. They are also reporting that duplicate emails are sent when they send to an outside email. Has anyone experienced this problem before? What could be causing this to occur and what can I do to stop this? I am running qmailtoaster and spamassassin as an external email gateway. There has been nothing changed with qmail but I did update some rules in SA using rules_du_jour yesterday. Would these rules updates cause this problem? If so, what would have changed? Jake Vickers wrote: If your system is low on resources (ie: RAM), then the spamd process can take too long, making Toaster think the mail got lost somewhere, so it resends it. Might want to check and see how much RAM you're using. I want to thank everyone who posted a reply on my inquiry. I believe Jake Vickers was right about the problem. The RAM on the email server was bogged down since yesterday when I updated the various .cf files using rules_du_jour. 
I had included just a handful of rules from RDJ, but it appears that RDJ uses far too much of my server's resources for me to use it to update my SpamAssassin rules. It was slowing down the server so much that simple functions were not responding. This appears to have affected the delivery of emails. In fact, I noticed that my original messages to these mailing lists took several hours to post and were also duplicated. I resolved the problem by moving the various rules .cf files out of the /etc/mail/spamassassin folder and restarting SpamAssassin. If anyone has a simple way of updating rules for SpamAssassin I would welcome your input. I still need to update the rules, as a great number of spam emails have been coming through to users. Specifically, we are getting a lot of the pharmaceutical spam and the stock spam. Again, thanks to everyone for the posts. Steve Ingraham
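Since several replies in this thread point at oversized rule sets, a quick way to see which .cf files dominate the rules folder might look like the sketch below. The function name is made up, and file size is only a rough proxy for what spamd costs in RAM once the rules are compiled, so read the output as a hint rather than a measurement.

```python
from pathlib import Path


def rule_files_by_size(sa_dir):
    """Return (size_in_bytes, filename) pairs for *.cf files, largest first."""
    files = [p for p in Path(sa_dir).glob("*.cf") if p.is_file()]
    return sorted(((p.stat().st_size, p.name) for p in files), reverse=True)


if __name__ == "__main__":
    # Directory taken from the message above; change it for other layouts.
    for size, name in rule_files_by_size("/etc/mail/spamassassin"):
        print(f"{size / 1024:8.1f} KiB  {name}")
```

Running it before and after a rules_du_jour update would show which downloads grew the folder the most.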
sa-learn and "Caught" spams
Hi guys, bit of a query regarding sa-learn and messages that have already been tagged as spam. We have SpamAssassin scanning mail via amavisd and sending any caught spams to a spam folder in the users' accounts (using plus addressing). We've also been getting users to drop any missed spams into this spam folder so we can train SpamAssassin on them. At present I have a script that moves *only* the missed spams to a master folder for sa-learn. My question is simple: would there be any benefit in including the mails already identified as spam in this process? I know sa-learn looks for common patterns in spams to identify them as spam, but I'm unsure if adding known spams would be beneficial. Ta for any help! --- Mike Woods Systems Administrator
Re: performance question
* [EMAIL PROTECTED] <[EMAIL PROTECTED]>: > Hi, > > As we have seen the amount of incoming mail increase by 25% in the last > few months, our customer is willing to invest in an extra mail relay. > I was thinking about a system with Sun's T1 chipset, (like the sunfire > T1000), I'm thinking the threaded nature of this chipset would work well > with the type of computing going on on a typical mailrelay (lots of > processes all doing relatively short bursts of cpu) ? Any ideas ? Good CPU, Large RAM for ramdisk (temp files during message testing) and fast disks. [EMAIL PROTECTED] > > Tom. > > > > > > Martin Hepworth <[EMAIL PROTECTED]> > 27/09/2006 11:33 > > To: [EMAIL PROTECTED] > cc: users@spamassassin.apache.org > Subject:Re: performance question > > > [EMAIL PROTECTED] wrote: > > Hi, > > > > I would like your opinion if our mailrelay is properly tuned: > > > > I have a mailrelay (sendmail / mimedefang / spamassassin with fuzzyocr, > > razor and dcc) running on a Sun V20Z with 6 GB Ram and 2 AMD 1.8Ghz > cpu's > > on Solaris 10. > > it currently handles 95000 mails per day (most of it spam ofcourse). > Load > > is currently constantly around 6 - 7, average scan time of a mail is > about > > 7 seconds. %io according to top is almost 0. > > > > mimedefang spool is running on a ramdisk, and all the software is at the > > > most recent version > > > > Does this seem like a normal load ? I had hoped that our upgrade from > > solaris 9 to solaris 10 would have drastically improved the number of > > emails the system could process (or lower the load for the same amout of > > > emails), but I don't notice a major improvement. > > > > I'd appreciate your input on this matter. > > > > regards, > > tom. > > > > > Don't confuse the load average figure with 'overloaded' systems. > > the load figure just means X processes are waiting for some resource > (CPU or disk or network or). 
> Depending on your setup you may find it useful to use milter-ahead (or a free equivalent) to drop email to unknown users before you send it off to spamassassin etc for further processing. I drop over 2/3rds of my traffic that way.
>
> Given it's only 7 seconds to scan the email I'd say your system is more than handling the traffic being processed.
>
> --
> Martin Hepworth
> Senior Systems Administrator
> Solid State Logic
> Tel: +44 (0)1865 842300
>
> **
> This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager.
>
> This footnote confirms that this email message has been swept for the presence of computer viruses and is believed to be clean.
> **

--
state of mind
Agentur für Kommunikation und Design
Patrick Koetter       Tel: 089 45227227
Echinger Strasse 3    Fax: 089 45227226
85386 Eching          Web: http://www.state-of-mind.de
Re: performance question
* [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Hi,
>
> As we have seen the amount of incoming mail increase by 25% in the last few months, our customer is willing to invest in an extra mail relay. I was thinking about a system with Sun's T1 chipset, (like the sunfire T1000), I'm thinking the threaded nature of this chipset would work well with the type of computing going on on a typical mailrelay (lots of processes all doing relatively short bursts of cpu) ? Any ideas ?

Mail relays are I/O bound, not CPU bound. At least mine is.

--
Ralf Hildebrandt (i.A. des IT-Zentrums) [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin    Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF    send no mail to [EMAIL PROTECTED]
Re: performance question
Hi, As we have seen the amount of incoming mail increase by 25% in the last few months, our customer is willing to invest in an extra mail relay. I was thinking about a system with Sun's T1 chipset, (like the sunfire T1000), I'm thinking the threaded nature of this chipset would work well with the type of computing going on on a typical mailrelay (lots of processes all doing relatively short bursts of cpu) ? Any ideas ? Tom. Martin Hepworth <[EMAIL PROTECTED]> 27/09/2006 11:33 To: [EMAIL PROTECTED] cc: users@spamassassin.apache.org Subject:Re: performance question [EMAIL PROTECTED] wrote: > Hi, > > I would like your opinion if our mailrelay is properly tuned: > > I have a mailrelay (sendmail / mimedefang / spamassassin with fuzzyocr, > razor and dcc) running on a Sun V20Z with 6 GB Ram and 2 AMD 1.8Ghz cpu's > on Solaris 10. > it currently handles 95000 mails per day (most of it spam ofcourse). Load > is currently constantly around 6 - 7, average scan time of a mail is about > 7 seconds. %io according to top is almost 0. > > mimedefang spool is running on a ramdisk, and all the software is at the > most recent version > > Does this seem like a normal load ? I had hoped that our upgrade from > solaris 9 to solaris 10 would have drastically improved the number of > emails the system could process (or lower the load for the same amout of > emails), but I don't notice a major improvement. > > I'd appreciate your input on this matter. > > regards, > tom. > > Don't confuse the load average figure with 'overloaded' systems. the load figure just means X processes are waiting for some resource (CPU or disk or network or). Depending on your setup you may find it useful to use milter-ahead (or a free equivalent) to drop email to unknown users before you send it off to spamassassin etc for further processing. I drop over 2/3rds of my traffic that way. Given it's only 7 seconds to scan the email I'd says your system is more than handling the traffic being processed. 
-- Martin Hepworth Senior Systems Administrator Solid State Logic Tel: +44 (0)1865 842300
RE:
> -Original Message-
> From: Peter Smith [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 26, 2006 8:08 PM
> To: users@spamassassin.apache.org
> Subject:
>
> Hi,
>
> Over the last week, my machine (Fedora, SA 3.1.3, qmail, qmail-scanner-queue.pl) has been receiving a fair amount of junk mail which is not being tagged as spam; in fact the total scores are negative.
>
> The messages are simply a random stream of words, with punctuation scattered in them. No HTML, no URLs being advertised, no excessive capitalisation, just meaningless text. The message headers are pretty clean too, apart from the From field being false.

A right wing conspiracy to poison the global Bayesian database.
Re: performance question
* [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Hi,
>
> I would like your opinion on whether our mail relay is properly tuned:
>
> I have a mail relay (sendmail / mimedefang / spamassassin with fuzzyocr, razor and dcc) running on a Sun V20Z with 6 GB RAM and 2 AMD 1.8 GHz CPUs on Solaris 10. It currently handles 95000 mails per day (most of it spam, of course). Load is currently constantly around 6 - 7, and the average scan time of a mail is about 7 seconds. %io according to top is almost 0.

We have a VERY similar setup and a similar amount of mail, but we use Postfix and amavisd-new (which uses SpamAssassin with FuzzyOcr and can do its own defanging). amavisd-new uses a ramdisk as well. I guess this reduces the number of processes/forks and performs a bit better.

* Linux 2.6.18
* 2 Xeon 2.8 GHz processors and 3 GB RAM
* Peak load at noon (11:30 now) is about 4.64

The average scan time is below your 7s...

--
Ralf Hildebrandt (i.A. des IT-Zentrums) [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin    Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF    send no mail to [EMAIL PROTECTED]
Re: performance question
[EMAIL PROTECTED] wrote:
> Hi,
>
> I would like your opinion on whether our mail relay is properly tuned:
>
> I have a mail relay (sendmail / mimedefang / spamassassin with fuzzyocr, razor and dcc) running on a Sun V20Z with 6 GB RAM and 2 AMD 1.8 GHz CPUs on Solaris 10. It currently handles 95000 mails per day (most of it spam, of course). Load is currently constantly around 6 - 7, and the average scan time of a mail is about 7 seconds. %io according to top is almost 0.
>
> The mimedefang spool is running on a ramdisk, and all the software is at the most recent version.
>
> Does this seem like a normal load? I had hoped that our upgrade from Solaris 9 to Solaris 10 would have drastically improved the number of emails the system could process (or lowered the load for the same amount of emails), but I don't notice a major improvement.
>
> I'd appreciate your input on this matter.
>
> regards,
> tom.

Don't confuse the load average figure with an 'overloaded' system. The load figure just means X processes are waiting for some resource (CPU, disk, network and so on).

Depending on your setup you may find it useful to use milter-ahead (or a free equivalent) to drop email to unknown users before you send it off to SpamAssassin etc. for further processing. I drop over 2/3rds of my traffic that way.

Given it's only 7 seconds to scan the email, I'd say your system is more than handling the traffic being processed.

--
Martin Hepworth
Senior Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
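The point about load average can be checked with a little arithmetic using the figures from the question (95000 mails per day, about 7 seconds per scan). By Little's law, the average number of scans in progress is the arrival rate times the time each one takes. The sketch below assumes, unrealistically, that mail arrives evenly around the clock, so real peaks will be higher; the function name is invented for the example.

```python
SECONDS_PER_DAY = 86400.0


def average_concurrent_scans(mails_per_day, scan_seconds):
    """Little's law: mean number in the system = arrival rate * time in system."""
    arrival_rate = mails_per_day / SECONDS_PER_DAY  # mails per second
    return arrival_rate * scan_seconds


if __name__ == "__main__":
    # Figures from the original question.
    print(round(average_concurrent_scans(95000, 7), 1))      # 7.7
    # If roughly 2/3 of the traffic were rejected before scanning.
    print(round(average_concurrent_scans(95000 / 3, 7), 1))  # 2.6
```

Roughly 7.7 scans in flight on average is in the same range as the reported load of 6 - 7, which fits the reading that the load figure reflects processes waiting (often on network tests) rather than a box that is out of CPU.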
performance question
Hi,

I would like your opinion on whether our mail relay is properly tuned.

I have a mail relay (sendmail / mimedefang / spamassassin with fuzzyocr, razor and dcc) running on a Sun V20Z with 6 GB RAM and 2 AMD 1.8 GHz CPUs on Solaris 10. It currently handles 95000 mails per day (most of it spam, of course). Load is currently constantly around 6 - 7, and the average scan time of a mail is about 7 seconds. %io according to top is almost 0.

The mimedefang spool is running on a ramdisk, and all the software is at the most recent version.

Does this seem like a normal load? I had hoped that our upgrade from Solaris 9 to Solaris 10 would have drastically improved the number of emails the system could process (or lowered the load for the same amount of emails), but I don't notice a major improvement.

I'd appreciate your input on this matter.

regards,
tom.
Re: Infuriating gif spam...
Bill Landry wrote:
>> Version 2.3j works much better... I'd previously been using version 2.3b for which I had an ebuild for gentoo.
>>
>> One thing I have noticed, however, is a number of errors/warnings which spamd sticks into /var/log/messages when it is started:
>>
>> --
>> Sep 26 17:20:48 server spamd[25563]: Subroutine new redefined at /etc/mail/spamassassin/FuzzyOcr.pm line 122.
>> --
>>
>> Have I somehow loaded this module twice? I didn't get these messages until I upgraded to version 2.3j from 2.3b
>
> No problem here, these are just informational messages that only recently showed up for me with the more recent versions of the FuzzyOcr plugin, as well. However, with the two latest versions, it only gets written to the log once during start-up, not with each image file that gets scanned like I was seeing a few versions back.

Jorge Valdes replied to me (though I can't find his email on the list) - he said to look at v310.pre - I had an unnecessary line:

> loadplugin FuzzyOcr /etc/mail/spamassassin/FuzzyOcr.pm

After having commented that out, 2.3j works just as well as it did before and I don't get the warnings any more.
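For anyone else chasing a "Subroutine new redefined" warning, a duplicate `loadplugin` line like the one found above can be hunted down mechanically. The sketch below is only an illustration: the function names are made up, it scans the *.pre and *.cf files in /etc/mail/spamassassin only, and the file name `FuzzyOcr.cf` in the usage note is a hypothetical second location, not something confirmed in this thread. Commented-out lines are ignored because the pattern requires `loadplugin` to be the first word on the line.

```python
import re
from collections import defaultdict
from pathlib import Path

# An active loadplugin line: optional indentation, then the directive and name.
LOADPLUGIN = re.compile(r"^\s*loadplugin\s+(\S+)")


def find_loadplugin_lines(text, source="<string>"):
    """Return (plugin_name, source, line_number) for each active loadplugin."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = LOADPLUGIN.match(line)
        if match:
            hits.append((match.group(1), source, lineno))
    return hits


def duplicated_plugins(hits):
    """Map plugin name -> list of (source, line) for plugins loaded twice or more."""
    by_plugin = defaultdict(list)
    for plugin, source, lineno in hits:
        by_plugin[plugin].append((source, lineno))
    return {p: locs for p, locs in by_plugin.items() if len(locs) > 1}


if __name__ == "__main__":
    conf_dir = Path("/etc/mail/spamassassin")
    all_hits = []
    for path in sorted(conf_dir.glob("*.pre")) + sorted(conf_dir.glob("*.cf")):
        all_hits += find_loadplugin_lines(path.read_text(errors="replace"), str(path))
    for plugin, locations in duplicated_plugins(all_hits).items():
        print(plugin, "is loaded more than once:", locations)
```

If, say, both v310.pre and a FuzzyOcr.cf contained an active `loadplugin FuzzyOcr` line, the script would list both locations, and commenting one of them out (as was done above) should silence the warning.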