Re: against this spam mail...
What I did against this: first, have a virtusertable that lists all your users and, at the end, has something like

@mydomain.edu.tr error: sorry no one by that name

(the syntax may be off, I am writing this from the top of my head), so mail for unknown users is rejected outright before it has to go through SpamAssassin, etc.

The second thing I did: hack the sendmail source so that when BadRcptThrottle is reached, it closes the connection instead.

Life has been peaceful since :)

-t

David B Funk wrote:
On Wed, 18 May 2005, Jeff Chan wrote:
On Wednesday, May 18, 2005, 12:05:13 AM, Monty Ree wrote:

Hello, all. When I look at the maillog, I see lots of entries like the ones below. I guess some spammer is sending spam mails from [EMAIL PROTECTED] to [EMAIL PROTECTED]. So the mail server load is high from accepting this spam and replying with User unknown. Is there any good way or solution against this kind of spam? Thanks in advance.

May 18 15:11:04 mail02 sendmail[22487]: j4I6B4i22487: [EMAIL PROTECTED]... User unknown
May 18 15:11:04 mail02 sendmail[22490]: j4I6B4i22490: [EMAIL PROTECTED]... User unknown
May 18 15:11:04 mail02 sendmail[22493]: j4I6B4i22493: [EMAIL PROTECTED]... User unknown

This is called a dictionary attack. If you search for that and sendmail, you may find some answers. It's not specifically a SpamAssassin question.

For sendmail, enable the BadRcptThrottle threshold. This feature causes sendmail to rate-limit transactions once a specified number of bad recipients have been seen. sendmail will still have to tell the spammers No No No, but at a slower rate, so they don't drive up your server load average. (The default is 20, I've got mine set to 3 ;)

Combine this with ConnectionRateThrottle and MaxDaemonChildren to limit the total simultaneous sessions, to prevent your SpamAssassin from being driven into meltdown by these kinds of attacks.

You can also add in dnsbl lists such as xbl.spamhaus.org to block connections by infected PCs at the SMTP level. Lots of this kind of trash is coming from 'bot nets' and can be blocked by good dnsbl lists.
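To judge how hard a dictionary attack like the one in the log excerpt is hitting a server, the "User unknown" lines can be tallied per minute. The following is only a minimal Python sketch: the regular expression is modelled on the three sendmail lines quoted in this thread, and the function name and sample lines (with addresses shown as <redacted>) are hypothetical.

```python
import re
from collections import Counter

# Captures the timestamp down to the minute for sendmail lines that end in
# "User unknown". Modelled on the log lines quoted in the thread; other
# syslog layouts would need a different pattern.
UNKNOWN_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+sendmail\[\d+\]:.*User unknown\s*$"
)

def count_unknown_per_minute(lines):
    """Return a Counter mapping 'Mon DD HH:MM' to the number of rejections."""
    counts = Counter()
    for line in lines:
        match = UNKNOWN_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample input in the same shape as the quoted maillog.
sample = [
    "May 18 15:11:04 mail02 sendmail[22487]: j4I6B4i22487: <redacted>... User unknown",
    "May 18 15:11:04 mail02 sendmail[22490]: j4I6B4i22490: <redacted>... User unknown",
    "May 18 15:11:04 mail02 sendmail[22493]: j4I6B4i22493: <redacted>... User unknown",
    "May 18 15:12:10 mail02 sendmail[22501]: j4I6CAi22501: from=<redacted>, size=799",
]

for minute, hits in count_unknown_per_minute(sample).items():
    print(minute, hits)
```

A sustained count well above a BadRcptThrottle-style threshold in a single minute is the pattern the replies above describe.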
German Spam local.conf
Hello, I am totally unskilled in configuring SpamAssassin, but I found this URL: http://www.exit0.us/index.php?pagename=GermanSoberSpamBounceRules

I put the text into my /etc/mail/spamassassin/local.conf and restarted SpamAssassin, but spamc doesn't use the rules :-(. I have tried:

spamc 1116480451.P29200Q0M105826.isengard.skynet\:2\,Sc

Return-path: [EMAIL PROTECTED]
Received: from mmp.a1.net [194.48.124.160] by localhost with POP3 (fetchmail-6.2.5) for [EMAIL PROTECTED] (single-drop); Wed, 18 May 2005 09:50:05 +0200 (CEST)
Received: from uxmtat02-real.net.mobilkom.at (uxmtat02-real.net.mobilkom.at [194.48.125.55]) by msgstore4.net.mobilkom.at (A1.net ORGANIZER - Alle Emails und Termine fest im Griff) with ESMTP id [EMAIL PROTECTED] for [EMAIL PROTECTED]; Wed, 18 May 2005 09:48:30 +0200 (MEST)
Received: from okioweoj.com (N712P016.adsl.highway.telekom.at [62.47.32.240]) by smtpin2.a1.net (A1.net ORGANIZER - Alle Emails und Termine fest im Griff) with SMTP id [EMAIL PROTECTED] for [EMAIL PROTECTED] (ORCPT [EMAIL PROTECTED]); Wed, 18 May 2005 09:48:30 +0200 (MEST)
Date: Wed, 18 May 2005 07:44:32 + (GMT)
From: [EMAIL PROTECTED]
Subject: Auslaenderpolitik
To: [EMAIL PROTECTED]
Message-id: [EMAIL PROTECTED]
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
Importance: Normal
X-Priority: 3 (Normal)
Original-recipient: rfc822;[EMAIL PROTECTED]
X-Virus-Status: No
X-Virus-Checker-Version: clamassassin 1.2.2 with clamscan / ClamAV 0.85/884/Wed May 18 00:14:26 2005
X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on isengard.skynet
X-Spam-Level:
X-Spam-Status: No, score=4.5 required=5.0 tests=AWL,BAYES_80, DNS_FROM_RFC_POST,NO_REAL_NAME,PRIORITY_NO_NAME,RCVD_IN_SORBS_WEB autolearn=no version=3.0.3

Lese selbst: http://www.mjoelnirsseite.de/2100.htm

Here is my local.conf:

# These values can be overridden by editing ~/.spamassassin/user_prefs.cf
# (see spamassassin(1) for details)
# These should be safe assumptions and allow for simple visual sifting
# without risking lost emails.
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

#ruleset to catch german sober virus spam bounces
#adapted from Raymond Dijkxhoor's original ruleset by Kevin Peuhkurinen
#http://www.exit0.us/index.php?pagename=GermanSoberSpamBounceRules
body __PROLO_GSPAMB01 /Gegen das Vergessen/i
body __PROLO_GSPAMB02 /Verbrechen der deutschen Frau/i
body __PROLO_GSPAMB03 /Dresden Bombing Is To Be Regretted Enormously/i
body __PROLO_GSPAMB04 /Graeberschaendung auf bundesdeutsche Anordnung/i
body __PROLO_GSPAMB05 /Deutsche Buerger trauen sich nicht \.\.\./i
body __PROLO_GSPAMB06 /S\.O\.S. Kiez\! Polizei schlaegt Alarm/i
body __PROLO_GSPAMB07 /Schily ueber Deutschland/i
body __PROLO_GSPAMB08 /Massenhafter Steuerbetrug durch auslaendische Arbeitnehmer/i
body __PROLO_GSPAMB09 /The Whore Lived Like a German/i
body __PROLO_GSPAMB10 /Transparenz ist das Mindeste/i
body __PROLO_GSPAMB11 /Volk wird nur zum zahlen gebraucht!/i
body __PROLO_GSPAMB12 /Trotz Stellenabbau/i
body __PROLO_GSPAMB13 /Augen auf/i
body __PROLO_GSPAMB14 /Armenian Genocide Plagues Ankara 90 Years On/i
body __PROLO_GSPAMB15 /Du wirst ausspioniert \.\.\.\.\!/i
body __PROLO_GSPAMB16 /Dresden 1945/i
body __PROLO_GSPAMB17 /Blutige Selbstjustiz/i
body __PROLO_GSPAMB18 /Turkish Tabloid Enrages Germany with Nazi Comparisons/i
body __PROLO_GSPAMB19 /Multi-Kulturell \= Multi-Kriminell/i
body __PROLO_GSPAMB20 /60 Jahre Befreiung: Wer feiert mit?/i
body __PROLO_GSPAMB21 /Vorbildliche Aktion/i
body __PROLO_GSPAMB22 /Auf Streife durch den Berliner Wedding/i
body __PROLO_GSPAMB23 /Tuerkei in die EU/i
body __PROLO_GSPAMB24 /Paranoider Deutschenmoerder kommt in Psychiatrie/i
body __PROLO_GSPAMB25 /Hier sind wir Lehrer die einzigen Auslaender/i
body __PROLO_GSPAMB26 /4\,8 Mill\. Osteuropaeer durch Fischer-Volmer Erlass/i
body __PROLO_GSPAMB27 /Du wirst zum Sklaven gemacht\!\!\!/i
body __PROLO_GSPAMB28 /Deutsche werden kuenftig beim Arzt abgezockt/i
body __PROLO_GSPAMB29 /Auslaenderpolitik/i
body __PROLO_GSPAMB30 /Auslaender bevorzugt/i
body __PROLO_GSPAMB31 /Can you believe this still happens today/i
header __KP_DAEMON From =~ /(?:postmaster|daemon|administrator|automated|subsystem)/i
header __KP_UNDELIV Subject =~ /(?:undeliverable|returned|failed|failure)/i
meta KP_GSPAMB01 __PROLO_GSPAMB01 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB02 __PROLO_GSPAMB02 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB03 __PROLO_GSPAMB03 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB04 __PROLO_GSPAMB04 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB05 __PROLO_GSPAMB05 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB06 __PROLO_GSPAMB06 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB07 __PROLO_GSPAMB07 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB08 __PROLO_GSPAMB08 && (__KP_DAEMON || __KP_UNDELIV)
meta KP_GSPAMB09 __PROLO_GSPAMB09 && (__KP_DAEMON ||
Re: Gee i wish....
jdow wrote:

Gee, I wish there was a way I could tell spamassassin I want a 0.2 score on a given test for each time it is hit within a message. I have some spams I could drive up over 20 points with such a rule that might hit in half the hams I receive all of one or two times.

Ah, so you want a counting rule. Could be useful.

Bob
sa-learn and big messages
Hello! If I feed a big mail (32 MB) to sa-learn, it takes a long time: I have to wait 50 seconds, and the sa-learn process needs 332 MB of RAM. What can I do to speed this up?

Ingo
Re: German Spam local.conf
In an older episode (Thursday 19 May 2005 08:20), Patrick Steiner wrote:

Hello, I am totally unskilled in configuring SpamAssassin, but I found this URL: http://www.exit0.us/index.php?pagename=GermanSoberSpamBounceRules

That file will only detect non-delivery notifications that have been sent in reply to the Sober spam. You are running it on one of the spams, not on a non-delivery notification.

I put the text into my /etc/mail/spamassassin/local.conf

As pointed out here before, that has to be local.cf, or any .cf file, not local.conf.

From: [EMAIL PROTECTED]
does not match
header __KP_DAEMON From =~ /(?:postmaster|daemon|administrator|automated|subsystem)/i

Subject: Auslaenderpolitik
does not match
header __KP_UNDELIV Subject =~ /(?:undeliverable|returned|failed|failure)/i

Rules to detect the original Sober spam can be found at
http://mailscanner.prolocation.net/german.cf
http://weir.dattitu.de/archives/9-Filtering-Sober-P.html

regards,
wolfgang
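The point made in this reply can be checked by hand. Below is a rough Python sketch, not SpamAssassin itself: the two header patterns are copied from the quoted rules and translated to Python's re module, and it is an assumption that each meta rule requires the body phrase AND one of the two header sub-rules. The function name and the example addresses are hypothetical, since the real From header is redacted in the archive.

```python
import re

# Python translations of the two header sub-rules quoted in the thread.
KP_DAEMON = re.compile(r"(?:postmaster|daemon|administrator|automated|subsystem)", re.I)
KP_UNDELIV = re.compile(r"(?:undeliverable|returned|failed|failure)", re.I)
# One of the body phrases from the ruleset (__PROLO_GSPAMB29).
PROLO_GSPAMB29 = re.compile(r"Auslaenderpolitik", re.I)

def bounce_rule_hits(from_header, subject, body):
    """Assumed meta logic: body phrase AND (daemon-like From OR bounce-like Subject)."""
    body_hit = bool(PROLO_GSPAMB29.search(body))
    header_hit = bool(KP_DAEMON.search(from_header) or KP_UNDELIV.search(subject))
    return body_hit and header_hit

# The spam itself: phrase present, but an ordinary sender and subject.
print(bounce_rule_hits("someone@example.com", "Auslaenderpolitik",
                       "Auslaenderpolitik Lese selbst"))
# A bounce of that spam: the sender looks like a mail daemon.
print(bounce_rule_hits("MAILER-DAEMON@example.com",
                       "Undeliverable: Auslaenderpolitik",
                       "Auslaenderpolitik Lese selbst"))
```

Under these assumptions only the second message would trigger the rule, which is consistent with the explanation that the ruleset targets bounces rather than the original spam.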
Re: spamc and spamd in different servers
The problem is fixed. By default, spamd listens only on 127.0.0.1. The correct options for my purpose are:

/usr/sbin/spamd -m 10 -i -A 172.19.3.1 -A 172.19.3.2 -A 127.0.0.1 -d

where -A permits connections from 3.1 and 3.2 (plus localhost) and -i makes spamd listen on all of its interfaces.

Ciao.

On Wed, 18-05-2005 at 10:36 -0500, Andy Jezierski wrote:
Paco Yepes [EMAIL PROTECTED] wrote on 05/18/2005 08:16:09 AM:

I want to connect spamc on IP 172.19.3.1 to spamd on IP 172.19.2.1. spamd is running on 2.1 with the following options:

# ps -ef | grep spamd
root 11192 1 0 14:20 ? 00:00:00 /usr/sbin/spamd -m 10 -A 172.19.3.1 -A 172.19.3.2 -A 127.0.0.1 -d --pidfile=/var/run/spamd.pid
root 11193 11192 0 14:20 ? 00:00:00 spamd child
root 11194 11192 0 14:20 ? 00:00:00 spamd child

and it works fine with connections on 127.0.0.1

[snip]

Your -A option is incorrect; the correct format is -A ipaddr,ipaddr,ipaddr

Andy
Problemes = not the same score into spamassassin 3.0.3
Hi, I have a small problem and don't see where my error is. On my first SpamAssassin server, a test email gives:

May 19 10:15:49 gw spamd[16048]: connection from srv1.ophelys.org [127.0.0.1] at port 45692
May 19 10:15:49 gw spamd[16048]: info: setuid to root succeeded
May 19 10:15:49 gw spamd[16048]: Still running as root: user not specified with -u, not found, or set to root. Fall back to nobody.
May 19 10:15:49 gw spamd[16048]: checking message [EMAIL PROTECTED] for root:65534.
May 19 10:15:49 gw spamd[16048]: identified spam (998.1/4.9) for root:65534 in 0.2 seconds, 799 bytes.
May 19 10:15:49 gw spamd[16048]: result: Y 998 - ALL_TRUSTED,AWL,FM_MULTI_ODD2,GTUBE scantime=0.2,size=799,mid=[EMAIL PROTECTED],autolearn=failed

and on the second server, a new one but with the same local.cf:

May 19 10:25:19 gw spamd[12950]: connection from srv2.ophelys.org [127.0.0.1] at port 39551
May 19 10:25:19 gw spamd[12950]: info: setuid to root succeeded
May 19 10:25:19 gw spamd[12950]: Still running as root: user not specified with -u, not found, or set to root. Fall back to nobody.
May 19 10:25:19 gw spamd[12950]: processing message [EMAIL PROTECTED] for root:65534.
May 19 10:25:20 gw spamd[12950]: clean message (-2.7/4.9) for root:65534 in 0.2 seconds, 799 bytes.
May 19 10:25:20 gw spamd[12950]: result: . -2 - ALL_TRUSTED,AWL,DNS_FROM_AHBL_RHSBL scantime=0.2,size=799,mid=[EMAIL PROTECTED],autolearn=failed

Why -2.7?
Re: Simple question TRUE or FALSE
At 06:00 19-5-2005, Justin Mason wrote:

Memory usage can be quite huge if you have many custom rulesets, because SA 3.0.x forks into several processes which all insist on making their own copy of the ruleset in memory :( When I still used the RDJ bigevil list (amongst others), it would use 96 MB of memory for each SA process.

actually, most of this *is* shared, it's just that linux can no longer report this accurately.

What makes you think that? Total used memory on my system is consistent with SpamAssassin processes not sharing any significant amount of memory. Also, it reports the memory sharing just fine for applications such as Apache?
bayes_auto_learn_threshold_spam not working?
Hello! I am a little confused by Bayes. I have set bayes_auto_learn_threshold_spam = 6. Why was the following mail not autolearned as spam? No user_prefs or anything else is set.

May 19 10:25:16 mail spamd[6531]: result: Y 14 - BIZ_TLD,DATE_IN_PAST_06_12,FROM_ENDS_IN_NUMS,LOCAL_OBFU_DUNG_BIZ,LOCAL_OBFU_DUNG_GAMES,LOCAL_OBFU_DUNG_SOFTWARE,TDE_FM_BU_EXCUSE,TDE_RO_BV_GRATIS,TDE_WS_BV_GARANTIEN,TDE_WS_BV_KEINRISIKO,TDE_WS_BV_PREIS1,TDE_WS_BV_RABATT,TDE_WS_BV_SPARPREIS,TDE_WS_BV_WORD_ALLES,TDE_WS_BV_WORD_GARANTIE,TDE_WS_BV_WORD_JETZT scantime=2.1,size=6817,mid=[EMAIL PROTECTED],autolearn=no

Ingo
Re: Exim Mail Server
Jeffrey N. Miller wrote:

I use Spamassassin with Sendmail and I am thinking about going to Exim. Does it make a difference to Spamassassin with mimedefang?

I'm about to go the same way. I believe mimedefang only works with sendmail, but Exim will do all the mimedefang things with a small bit of persuasion.

Cheers
Bill

--
What's the difference between Linux and Windoze?
Linux - Thousands of programmers are working *WITH* you.
Windoze - Thousands of programmers are working *AGAINST* you.
Re: Spam Percentages
Hamie wrote:
Martin Hepworth wrote:
Fred wrote:
Ben Hanson wrote:

Shortly after the first of the year, I noticed the percentage of spam messages for our organization dropped consistently by 10-15%.

Ben

I see between 83-85% spam. We use SARE rules + my own home-brew rules + the new BLACK uribl lists + unreleased SARE rules. In the past 24 hours the numbers are:

spam-reject 55,967
mail-in 11,089
total-mail 67,056

Viruses are not included in this count; they would skew things due to the recent increase in new viruses. http://www.rulesemporium.com might have some helpful rules for you to add to your setup.

On another topic, I see just as many user-unknowns as I reject spam. That's because we are an ISP and customers like to switch stuff around often ;)

Frederic Tarasevicius
Internet Information Services, Inc.
http://www.i-is.com/
810-794-4400

Fred

70% of my inbound traffic is for unknown users, 20% spam/malware and 10% real mail.

How do you count 'unknown users'? Accurately, I mean...

I can examine the reject log in exim to get counts.

Assuming you don't accept email in the first place if the user is unknown (or you might, I guess, but it seems like unnecessary processing to me), most spammers that I can see in our logs just keep retrying again and again and again...

Yes, but given that 70% of my inbound traffic is a pretty constant figure, I'm not seeing this. Also, when rejecting 70% of my traffic at MTA connection, the small amount of processing to look up a valid email address is way, way less than having to SA scan all these emails.

For example, on our mail server I reject far more than I accept. Yet the rejects are in most cases repeated, as spammers appear to be a thick bunch and don't take a 5xx very well. Currently I have 'discussions' with various people round here over the fact that we 'only' catch about 5-10% of our total accepted email in SA as spam, yet MessageLabs et al always like to quote the (to me) alarmist figures of 80% of email being spam etc. But then we reject email from unverified addresses and don't accept email for unknown users at the border MTA, not at SA (and so don't have an accurate count of them).

H

Lucky you; even taking out the unknown users I'm running 75% spam on my inbound.

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This footnote confirms that this email message has been swept for the presence of computer viruses and is believed to be clean. **
Re: Simple question TRUE or FALSE
David

It depends on what you call lots of RAM, CPU etc.

My old scanner took about 5 seconds to scan email with SA (URI_RBLs, Bayes, two normal RBLs, lots of extra SARE rules etc.), Sophos, ClamAV and the extra checks MailScanner does, and to dump the email into a MySQL DB for reports. Given that emails would normally be batched up into a few messages (2-5 on average), it is difficult to get a timing for a single email. That was a 500 MHz Celeron with 512 MB RAM and an IDE disk. I would top out at about 17,000 messages per day with an average size of 26 KB.

The new scanner (P4 2.8 GHz, 1.5 GB RAM, SATA disk) takes around 2 seconds per average batch and tops out at around 70,000 messages per day (without much O/S tuning).

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

David Velásquez Restrepo wrote:

Hi, I have been using SpamAssassin to review a lot (a lot!) of incoming mail for a long time. Today I have a machine that runs only SpamAssassin, due to the high CPU and memory requirements. Just to be clear (maybe I have something wrong), the question is:

Q) With spamassassin you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
a) TRUE
b) FALSE
Re: Exim with Spamassassin and mimedefang
On Wed, 18 May 2005, Jeffrey N. Miller wrote:

I want to set up an SMTP relay filtering spam and viruses. The relay will relay the mail to my Exchange server. Are there well-documented HOWTOs on setting this up using Exim, Spamassassin, Mimedefang and good virus scanning software? I see HOWTOs using sendmail, but I want to switch to Exim. Or am I just making things hard?

Recent versions of Exim come with support for SpamAssassin and anti-virus software built in. See the section in the documentation about Content Scanning.

Tony.
--
f.a.n.finch [EMAIL PROTECTED] http://dotat.at/
BISCAY: WEST 5 OR 6 BECOMING VARIABLE 3 OR 4. SHOWERS AT FIRST. MODERATE OR GOOD.
Re: Problemes = not the same score into spamassassin 3.0.3
At 10:32 19-5-2005, Phibee Network operation Center wrote:

I have a small problem and don't see where my error is. On my first SpamAssassin server, a test email gives:

May 19 10:15:49 gw spamd[16048]: result: Y 998 - ALL_TRUSTED,AWL,FM_MULTI_ODD2,GTUBE scantime=0.2,size=799,mid=[EMAIL PROTECTED],autolearn=failed

and on the second server, a new one but with the same local.cf:

May 19 10:25:20 gw spamd[12950]: result: . -2 - ALL_TRUSTED,AWL,DNS_FROM_AHBL_RHSBL scantime=0.2,size=799,mid=[EMAIL PROTECTED],autolearn=failed

The GTUBE rule is defined in the (standard) 20_body_tests.cf rule file. So I'd say for some reason it must not be processing the rules in that file. Try spamassassin -D --lint and see if 20_body_tests.cf is being loaded.

Then there is the DNS_FROM_AHBL_RHSBL difference, but I don't see how that could make such a big difference.
Rather too refreshing: bayes.lock
I've just had an issue with SpamAssassin and Bayes. I am running SpamAssassin 3.0.2 on Windows. Basically it stopped working, and no SA checks were completing. I tried to see what was happening with

spamassassin -D --lint

It started OK and decided it needed to do an expiry, which took quite a long time. When that had completed, it printed the following line:

debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock

repeated over and over again, for perhaps 20 minutes. Fortunately, the last time I tried it (to get a copy of the log to post), it did eventually complete, with the following end to the log:

debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
debug: bayes: 3392 untie-ing
debug: bayes: 3392 untie-ing db_toks
debug: bayes: 3392 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 3392 unlink F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock
debug: expired old Bayes database entries in 1533 seconds: 485101 entries kept, 18889 deleted
debug: Syncing complete.
debug: registering glue method for check_uridnsbl (Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22727b4))
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22727b4) implements 'check_tick'
debug: running raw-body-text per-line regexp tests; score so far=-0.287
debug: running full-text regexp tests; score so far=-0.287
debug: DCCifd is not available: no r/w dccifd socket found.
debug: Running tests for priority: 500
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x22727b4) implements 'check_post_dnsbl'
debug: running meta tests; score so far=-0.287
debug: running header regexp tests; score so far=0.939
debug: running body-text per-line regexp tests; score so far=0.939
debug: running uri tests; score so far=0.939
debug: running raw-body-text per-line regexp tests; score so far=0.939
debug: running full-text regexp tests; score so far=0.939
debug: Running tests for priority: 1000
debug: running meta tests; score so far=0.939
debug: running header regexp tests; score so far=0.939
debug: running body-text per-line regexp tests; score so far=0.939
debug: running uri tests; score so far=0.939
debug: running raw-body-text per-line regexp tests; score so far=0.939
debug: running full-text regexp tests; score so far=0.939
debug: is spam? score=0.939 required=2.4
debug: tests=BAYES_05,MISSING_HEADERS,MISSING_SUBJECT,NO_REAL_NAME
debug: subtests=__HAS_MSGID,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__SANE_MSGID,__UNUSABLE_MSGID

The problem, which is no longer a problem, is: why was it doing

debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock

over and over again? I tried removing the bayes.lock file, but it kept going and replaced the file. Is there anything I can do to stop this happening again?

Thanks
Ben
Re: Exim with Spamassassin and mimedefang
Jeffrey N. Miller wrote:

I want to set up an SMTP relay filtering spam and viruses. The relay will relay the mail to my Exchange server. Are there well-documented HOWTOs on setting this up using Exim, Spamassassin, Mimedefang and good virus scanning software? I see HOWTOs using sendmail, but I want to switch to Exim. Or am I just making things hard?

It doesn't use mimedefang, but http://slett.net/spam-filtering-for-mx/index.html is an excellent resource for setting up SA with Exim. Highly recommended.

Kevin
Re: Simple question TRUE or FALSE
David Velásquez Restrepo wrote:

Hi, I have been using SpamAssassin to review a lot (a lot!) of incoming mail for a long time. Today I have a machine that runs only SpamAssassin, due to the high CPU and memory requirements. Just to be clear (maybe I have something wrong), the question is:

Q) With spamassassin you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
a) TRUE
b) FALSE

FALSE. My SA runs on a Pentium IV 3 GHz system with 512 MB of RAM. The average processing time per email for the last 100,000 or so emails is 2.8 seconds.
Re: German Spam local.conf
Alan Premselaar wrote:

first, make sure your file is named local.cf and not local.conf. also, you don't appear to have any scores assigned to these rules. I'm not a rules writing expert, but these rules may not actually do anything for you without scores assigned.

ALL rules receive a default score of 1, with the exception that rules with names that begin with __ do not add to the total score for the message.

-kgd
--
Get your mouse off of there! You don't know where that email has been!
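The scoring behaviour described in this reply can be illustrated with a few lines of Python. This is only a sketch of the stated rule (default score of 1, names starting with __ contribute nothing), not SpamAssassin's actual implementation; the function name and the explicit score of 3.5 are made up for the example.

```python
def total_score(hit_rules, scores=None, default=1.0):
    """Sum the scores of the rules that hit.

    Rules whose names start with "__" are sub-rules: they can feed meta
    rules but add nothing to the message total. Any other rule without an
    explicit score gets the default of 1.
    """
    scores = scores or {}
    total = 0.0
    for name in hit_rules:
        if name.startswith("__"):
            continue
        total += scores.get(name, default)
    return total

hits = ["__PROLO_GSPAMB29", "__KP_DAEMON", "KP_GSPAMB29"]
print(total_score(hits))                         # only KP_GSPAMB29 counts
print(total_score(hits, {"KP_GSPAMB29": 3.5}))   # hypothetical explicit score
```

So a ruleset made of __ sub-rules plus meta rules still scores without explicit score lines, but each meta rule that fires adds just the default of 1.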
Re: Simple question TRUE or FALSE (More data to answer this question)
From: David Velásquez Restrepo [EMAIL PROTECTED]

Software:
--
A Perl script which takes a file and tests it using Mail::SpamAssassin to get its spam score level
OS: gentoo 2005.0
MTA: postfix

SpamAssassin:
--
Using: net tests, Bayes, Razor2, DCC, Pyzor, SPF test (and everything else suggested by spamassassin)

Rules: rules_du_jour:
http://www.rulesemporium.com/rules/99_FVGT_Tripwire.cf
http://www.rulesemporium.com/rules/bigevil.cf
http://mywebpages.comcast.net/mkettler/sa/antidrug.cf
http://www.rulesemporium.com/rules/evilnumbers.cf
http://www.stearns.org/sa-blacklist/sa-blacklist.current
http://www.stearns.org/sa-blacklist/sa-blacklist.current.uri.cf
http://www.stearns.org/sa-blacklist/random.current.cf
http://www.timj.co.uk/linux/bogus-virus-warnings.cf
http://www.rulesemporium.com/rules/70_sare_adult.cf
http://www.rulesemporium.com/rules/99_sare_fraud_post25x.cf
http://www.rulesemporium.com/rules/99_sare_fraud_pre25x.cf
http://www.rulesemporium.com/rules/72_sare_bml_post25x.cf
http://www.rulesemporium.com/rules/71_sare_bml_pre25x.cf
http://www.rulesemporium.com/rules/70_sare_ratware.cf
http://www.rulesemporium.com/rules/70_sare_spoof.cf
http://www.rulesemporium.com/rules/70_sare_bayes_poison_nxm.cf
http://www.rulesemporium.com/rules/70_sare_oem.cf
http://www.rulesemporium.com/rules/70_sare_random.cf
http://www.rulesemporium.com/rules/70_sare_header.cf
http://www.rulesemporium.com/rules/70_sare_html.cf
http://www.rulesemporium.com/rules/70_sare_specific.cf
http://www.rulesemporium.com/rules/71_sare_redirect_pre3.0.0.cf
http://www.rulesemporium.com/rules/72_sare_redirect_post3.0.0.cf
http://www.rulesemporium.com/rules/70_sare_uri0.cf
http://www.rulesemporium.com/rules/70_sare_uri1.cf
http://www.rulesemporium.com/rules/70_sare_uri2.cf
http://www.rulesemporium.com/rules/70_sare_uri3.cf
http://www.rulesemporium.com/rules/70_sare_uri_eng.cf
http://www.rulesemporium.com/rules/70_sare_uri_arc.cf

Runtime:
--
4 processes in parallel mode

Hardware:
--
Intel Pentium III, 1 GHz, 512 MB RAM (pci133)

top:
---
top - 23:03:27 up 10:39, 2 users, load average: 5.47, 5.35, 5.19
Tasks: 62 total, 2 running, 60 sleeping, 0 stopped, 0 zombie
Cpu(s): 93.7% us, 5.7% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.6% hi, 0.0% si
Mem: 514036k total, 490044k used, 23992k free, 6892k buffers
Swap: 987988k total, 49672k used, 938316k free, 38012k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27220 xmail 19 0 98680 71m 3064 R 99.9 14.2 2:38.51 /progs/xmail/bin/mx_parser/mx_parser.pl - 1
27603 xmail 15 0 100m 95m 3064 S 36.8 19.0 2:06.76 /progs/xmail/bin/mx_parser/mx_parser.pl - 5
28171 xmail 16 0 93604 87m 3064 D 28.9 17.4 1:11.20 /progs/xmail/bin/mx_parser/mx_parser.pl - 4
27516 xmail 17 0 94644 88m 3064 D 13.1 17.6 2:03.70 /progs/xmail/bin/mx_parser/mx_parser.pl - 2
27308 xmail 18 0 97960 73m 3064 D 10.5 14.5 2:35.46 /progs/xmail/bin/mx_parser/mx_parser.pl - 3

So, here it goes again, the simple but not short question:

Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
a) TRUE
b) FALSE

Given the way you phrase that belligerent assertion, I am tempted to simply answer true and leave you floundering. It is obvious that, the way you have it configured, you're going to take 20-30 seconds, so the obvious answer is true, for you.

Now, if you asked "Am I doing something wrong?" and approached it from that direction, you might discover you can run tests in about 5 to 7 seconds each on your machine. I'll be presumptuous and figure this is what you really mean.

For the run times you cite, you may have a BL configuration problem, such as trying to use a dead BL somewhere. One other thing that can cause this is a DNS problem.

You are using larger chunks of VIRT than I am. I use about 60M where you are using 98M. I run with --max-conn-per-child=15. You win a little if you either add RAM or cut down to -m2 or -m3. You do have a fair amount of cache in use. Once that happens you flounder around in cache swapping when running spamassassin.

{^_^}
Re: sa-learn and big messages
From: Ingo Reinhart [EMAIL PROTECTED]

Hello! If I commit a big mail (32 MB) to sa-learn it need long time. I must wait 50 sec. and the sa-learn process need 332 MB RAM. What can I do for faster proceed?

In procmail the incantation is something like:

:0 fw: spamassassin.lock
* 25
| /usr/bin/spamc -t 150

For other tools it's probably at least vaguely similar. You bypass spamassassin for large messages. And that's a job for something outside spamassassin itself.

{^_^}
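The advice to bypass SpamAssassin for large messages, handled outside SpamAssassin itself, can be sketched in Python as a simple size gate. This is an illustration only: the 256 KB limit is a hypothetical value (the size condition in the recipe above did not survive in the archive), and the function and constant names are made up.

```python
# Hypothetical size limit; the thread does not state the actual value.
MAX_SCAN_BYTES = 256 * 1024

def should_scan(message_bytes, limit=MAX_SCAN_BYTES):
    """Return True if the message is small enough to hand to the filter."""
    return len(message_bytes) <= limit

def split_by_size(messages, limit=MAX_SCAN_BYTES):
    """Partition raw messages into (to_scan, to_bypass) lists."""
    to_scan, to_bypass = [], []
    for msg in messages:
        (to_scan if should_scan(msg, limit) else to_bypass).append(msg)
    return to_scan, to_bypass

small = b"Subject: hi\n\nshort body\n"
large = b"Subject: big\n\n" + b"x" * (MAX_SCAN_BYTES + 1)
scan, bypass = split_by_size([small, large])
print(len(scan), "to scan,", len(bypass), "bypassed")
```

The same gate would apply before training: a 32 MB message like the one in the question would simply never reach sa-learn.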
Re: Gee i wish....
From: Bob Proulx [EMAIL PROTECTED]

jdow wrote:

Gee, I wish there was a way I could tell spamassassin I want a 0.2 score on a given test for each time it is hit within a message. I have some spams I could drive up over 20 points with such a rule that might hit in half the hams I receive all of one or two times.

Ah, so you want a counting rule. Could be useful.

Very. There is one phish type spam I get that has huge numbers of http markups, around each letter in the text. A counter on that message would push its score into the stratosphere. And another technique that seems to be floating around, from wareglobe.com, is multiple alternating Subject and From headers. A counting rule plus a meta rule would nail them to a tree neatly.

It really looks like it is time to figure out how to count rule hits. This is probably not required for most rules. But for some it should be an available option.

{^_^}
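The counting rule wished for in this thread can be sketched outside SpamAssassin in a few lines of Python. This is a toy illustration of the idea (a fixed score per match, here the 0.2 mentioned above), not an existing SpamAssassin feature; the function name, the optional cap and the example pattern are assumptions.

```python
import re

def counting_score(pattern, text, per_hit=0.2, cap=None):
    """Count every match of `pattern` in `text` and score each hit.

    Returns (hits, score). `cap`, if given, limits the total score so that
    a single rule cannot dominate the result.
    """
    hits = len(re.findall(pattern, text))
    score = hits * per_hit
    if cap is not None:
        score = min(score, cap)
    return hits, score

# A ham with one or two markups barely moves; a spam wrapping every
# letter in markup climbs quickly.
ham = "<font>Hello</font> there"
spam = "".join("<font>%s</font>" % ch for ch in "viagra" * 20)

print(counting_score(r"<font", ham))
print(counting_score(r"<font", spam))
print(counting_score(r"<font", spam, cap=5.0))
```

With 0.2 per hit, a message that matches once or twice gains well under a point, while one that matches a hundred times or more passes the 20 points mentioned above, which is the asymmetry the poster is after.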
Re: Exim Mail Server
On Wed, 18 May 2005 16:23:48 -0500 Jeffrey N. Miller [EMAIL PROTECTED] wrote:

I use Spamassassin with Sendmail and I am thinking about going to Exim. Does it make a difference to Spamassassin with mimedefang?

I'm not fully sure of all the capabilities and implications of mimedefang, but a few comments that might be helpful:

a) my HOWTO at http://www.timj.co.uk/linux/Exim-SpamAndVirusScanning.pdf has some (hopefully) helpful notes about using SpamAssassin with Exim. You can call SA directly from Exim including (optionally) making decisions about rejection at SMTP time.

b) Exim has many powerful facilities for rejecting mail including MIME parsing functions, some of which may be an optional substitute for some of mimedefang's features, some of which may not. However...

c) in general you can run ANY software (including mimedefang or SpamAssassin) against a message and make decisions based on the answer. You can use arbitrary software as a transport filter for modifying the message. (Some of the mimedefang feature list looks like modifying the message to me, e.g. adding boilerplate text.)

d) the exim-users list over at exim-users@exim.org is pretty helpful at answering any Exim-specific questions :)

Tim
RE: Argument isn't numeric in addition (+) at /usr/local/lib/perl5/site_perl/5.8.2/Mail/SpamAssassin/Conf.pm line 743.
First of all, I double-posted this question to the list by mistake, and I apologize. Secondly, I installed this version of SpamAssassin from the FreeBSD port of 3.0.3. I built the port again and checked the Conf.pm file inside the port folder against the installed Conf.pm file, and there are no differences. This is the section in question:

***
See B<report_safe_copy_headers> if you want to copy headers from the original mail into tagged messages.

=cut

  push (@cmds, {
    setting => 'report_safe',
    default => 1,
    code => sub {
      my ($self, $key, $value, $line) = @_;
      $self->{report_safe} = $value+0;
      if (! $self->{report_safe}) {
        $self->{headers_spam}->{Report} = "_REPORT_";
      }
    }
  });

=back
***

Does it seem correct?

Regards,
John Schneider, Information Systems Manager, GVA DAUM
Worldwide Real Estate Solutions
123 S Figueroa Street Suite 400
Los Angeles, CA 90012
213-270-2262 Direct
213-947-1431 Fax
WEBSITE: www.gvadaum.com
EMAIL: [EMAIL PROTECTED]

-----Original Message-----
From: Theo Van Dinter [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 17, 2005 9:55 PM
To: users@spamassassin.apache.org
Subject: Re: Argument isn't numeric in addition (+) at /usr/local/lib/perl5/site_perl/5.8.2/Mail/SpamAssassin/Conf.pm line 743.

On Tue, May 17, 2005 at 05:50:32PM -0700, John Schneider wrote:

Argument isn't numeric in addition (+) at /usr/local/lib/perl5/site_perl/5.8.2/Mail/SpamAssassin/Conf.pm line 743. I've searched .. What could it be?

Bad config line. Looking at the code, a bad report_safe line.

--
Randomly Generated Tagline: Earth men are real men!
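Since the quoted code adds 0 to whatever follows report_safe, a non-numeric value on that config line would produce exactly this kind of warning. A quick way to hunt for such a line is sketched below in Python; it is a hypothetical helper, not part of SpamAssassin, and it assumes a simple "option value" layout with # starting a comment.

```python
def find_bad_report_safe(config_text):
    """Return (line_number, raw_line) for report_safe lines with a non-numeric value."""
    bad = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        parts = line.split(None, 1)
        if parts and parts[0] == "report_safe":
            value = parts[1].strip() if len(parts) == 2 else ""
            if not value.isdigit():
                bad.append((lineno, raw))
    return bad

# Hypothetical config contents for illustration.
example = "required_hits 5\nreport_safe 0\nreport_safe yes\n# report_safe off\n"
for lineno, raw in find_bad_report_safe(example):
    print("line %d: %r" % (lineno, raw))
```

Running the same check over local.cf and any per-user preference files would narrow down which line the warning refers to.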
Re: Simple question TRUE or FALSE (More data to answer this question)
Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
a) TRUE
b) FALSE

My answer is b), False. I have a mailserver here that has a 1 GHz CPU and 512 MB RAM, and SA on that server usually takes 2 or 3 seconds per message.

As already posted, some of your rulesets are unnecessary because they are included in SA (standard rulesets or SURBL). Did you check 'cat messages | spamassassin -D' to see what part takes the most time? DNS time-outs can take a lot of time, for example (also checkable with tcpdump port 53).

Also, your SMTP server (xmail?) takes a lot of CPU. I've never used Xmail, but I use postfix (and amavisd-new) and I think it's quite memory and CPU efficient.

Menno van Bennekom
Re: bayes_auto_learn_threshold_spam not working?
On Thu, May 19, 2005 at 11:43:39AM +0200, Ingo Reinhart wrote: Why is the following mail not autolearn as spam? No user.prefs or else is set. As usual, run with -D. It will tell you. -- Randomly Generated Tagline: Marriage is a three ring circus: engagement ring, wedding ring, and suffering. - Unknown
RE: German Spam local.conf
I would like to be removed from this distribution list; does anyone have an idea how to do that? -----Original Message----- From: Kris Deugau [mailto:[EMAIL PROTECTED] Sent: Thursday, May 19, 2005 6:16 AM To: users@spamassassin.apache.org Subject: Re: German Spam local.conf Alan Premselaar wrote: first, make sure your file is named local.cf and not local.conf. also, you don't appear to have any scores assigned to these rules. I'm not a rules writing expert, but these rules may not actually do anything for you without scores assigned. ALL rules receive a default score of 1 - with the exception that rules with names that begin with __ do not add to the total score for the message. -kgd -- Get your mouse off of there! You don't know where that email has been! ___ The information contained in this message and any attachment may be proprietary, confidential, and privileged or subject to the work product doctrine and thus protected from disclosure. If the reader of this message is not the intended recipient, or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify me immediately by replying to this message and deleting it and all copies and backups thereof. Thank you.
Re: Simple question TRUE or FALSE (More data to answer this question)
jdow writes: You are using larger chunks of VIRT than I am. I use about 60M where you are using 98M. I run with --max-conn-per-child=15. You win a little if you either add RAM or cut down to -m2 or -m3. You do have a fair amount of cache in use. Once that happens you flounder around in cache swapping when running spamassassin. the fundamental problem is that he's not using spamd. rule of thumb: if you see performance issues, and you're not using spamd, STOP RIGHT THERE and start using spamd ;) --j.
Re: sa-learn and big messages
Ingo Reinhart wrote: Hello! If I commit a big mail (32 MB) to sa-learn it need long time. I must wait 50 sec. and the sa-learn process need 332 MB RAM. What can I do for faster proceed? Don't commit such a large message to sa-learn? Seriously, sa-learn isn't designed to handle such a huge input message. If you're using a version of SA older than 3.0.1, you can improve the memory usage somewhat with an upgrade, but even that's not going to make things fast with such a large input email. See bug: http://bugzilla.spamassassin.org/show_bug.cgi?id=3876
Re: Problemes = not the same score into spamassassin 3.0.3
Phibee Network operation Center wrote: Hi i have a small problems and dont see where is my errors .. Have you tried spamassassin --lint on the second box? At casual glance it looks like the second box is missing a lot of the standard ruleset, as GTUBE failed to fire.
Re: Simple question TRUE or FALSE (More data to answer this question)
David Velásquez Restrepo wrote: Software: -- A perl script which takes some file and tests it using Mail::SpamAssassin to get its spam score level

If your script isn't persistent, I'd ditch it and use spamc/spamd as Justin Mason suggested. You'll save a lot of processor time from three things using this approach:

1) spamd parses the rulesets when it loads, instead of on a per-message basis.
2) You'll avoid invoking a perl process on a per-message basis, which is a huge waste of CPU time. The perl processes will be preforked by spamd, and only spamc (a compiled utility) gets invoked per-message.
3) spamc has a built-in message size limit, so you'll avoid scanning messages with large attachments that are unlikely to be spam anyway.

http://www.rulesemporium.com/rules/bigevil.cf

Matt Y already pointed this out, but just to underline it, bigevil will waste TRULY massive amounts of resources on your system. Even the author of bigevil (Chris S.) strongly recommends that nobody use it, and if you go to the website now, it's been deleted to prevent anyone from using it anymore. You should easily cut 30MB or more off the size of your processes if you remove bigevil.

In general it looks like you downloaded every optional ruleset in the world and added it to your configuration before you started off. I would strongly discourage that kind of approach for any kind of server application, and it's especially true for spamassassin. Start off running SA without *ANY* add-on rulesets, then start adding them a few at a time. This way if you add a bloated ruleset like bigevil, the cause of the problem is immediately obvious. Be very wary of any ruleset which has a .cf file that's greater than 64k in size. Matt Y's comments on duplicated rulesets (such as antidrug.cf, and having both the pre and post 2.5x versions of several rulesets) are also valid.
Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU: a) TRUE b) FALSE a) TRUE, due to misconfiguration. With some tuning based on the tips above, this will readily change to b) FALSE.
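The 64k rule of thumb for add-on rulesets mentioned above is easy to check mechanically. What follows is only a sketch, not a SpamAssassin tool: the function name `oversized_rulesets` is made up for illustration, and the directory you pass in is whatever holds your .cf files.

```python
import os

def oversized_rulesets(rules_dir, limit_bytes=64 * 1024):
    """Return (filename, size) pairs for .cf files larger than limit_bytes.

    Hypothetical helper: the 64 KiB default mirrors the rule of thumb from
    the thread, it is not a limit enforced by SpamAssassin itself.
    """
    found = []
    for name in os.listdir(rules_dir):
        path = os.path.join(rules_dir, name)
        # Only regular files ending in .cf are treated as rulesets here.
        if name.endswith(".cf") and os.path.isfile(path):
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((name, size))
    # Largest first, so the most likely memory hog is at the top.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling it on a rules directory gives a short list of files to look at first; a large file is a reason to investigate, not proof that the ruleset is bad.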
Re: Simple question TRUE or FALSE
David Velásquez Restrepo wrote: Hi, I have been using spamassassin for a long time to review a lot (a lot!) of incoming mail. Today I have a machine that runs only spamassassin, due to the high CPU and MEM requirements. Just to be clear (maybe I have something wrong), the question is: Q) With spamassassin you need about 20 to 30 seconds per email message and LOTS of RAM and CPU: a) TRUE b) FALSE False. On my home system, which admittedly doesn't see a lot of mail volume, it takes between four and six seconds to scan a message. It sometimes takes longer if some other process is using a lot of memory, because that machine is kind of short on RAM. It's a 500 MHz DEC AlphaPC. I'm not doing DNS caching on that one, so a lot of that time may be waiting for DNS blacklists to respond. A quick check of the mail server at work, which is faster and uses a caching DNS server, shows most messages are being scanned in under 2 seconds. If you're seeing 20 to 30 second scan times, your server is probably overloaded. Maybe you don't have enough RAM and you're swapping to disk.
Re: Argument isn't numeric in addition (+) at /usr/local/lib/perl5/site_perl/5.8.2/Mail/SpamAssassin/Conf.pm line 743.
John Schneider wrote: First of all - I double-posted this Q to the list by mistake and I apologize. Secondly, I installed this version of SpamAssassin from the FreeBSD port of 3.0.3. I built the port again and checked the Conf.pm file inside the port folder against the installed Conf.pm file and there are no differences.

Yes, the Conf.pm looks correct. But that's not what you should be looking at. It's not even the right file. As per Theo's suggestion, look for a report_safe statement in your local.cf or user_prefs (or any other config file). You probably have a line like:

report_safe

When it needs to be:

report_safe 1

Or something similar. report_safe *MUST* be followed by a parameter, and if it's not, you'll generate the error you're reporting.
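The check described above (find any report_safe line that is missing its numeric parameter) can be sketched in a few lines. This is an illustration only: `bad_report_safe_lines` is a made-up name, and the parsing is a simplification that looks at this one setting, not a reimplementation of how SpamAssassin reads its config files.

```python
import re

def bad_report_safe_lines(config_text):
    """Return (line_number, line) pairs where report_safe lacks a numeric argument.

    Simplified, hypothetical parser: strips '#' comments and splits on
    whitespace, which is enough to spot a bare 'report_safe' line.
    """
    problems = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        parts = line.split()
        if parts[0] != "report_safe":
            continue
        # Exactly one argument, made only of digits, is treated as valid here.
        if len(parts) != 2 or not re.fullmatch(r"\d+", parts[1]):
            problems.append((number, raw))
    return problems
```

Feeding it the contents of local.cf and user_prefs in turn would point at the offending line number in each file.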
Two uris hit uribl, should count double ?
Hi, I got a spam today, and it had two spamvertised websites in them. Both uris hit uribl.com's black list: * 3.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist * [URIs: ridofallthebad dot com noveltyrenewed dot com] Shouldn't the score be 6, because it caught two uris ? Niek
Re: Rather too refreshing: bayes.lock
On 5/19/05, Ben Wylie [EMAIL PROTECTED] wrote: debug: refresh: 3392 refresh F:/DOCUME~1/ADMINI~1/SPAMAS~1/bayes.lock [...] debug: expired old Bayes database entries in 1533 seconds: 485101 entries No real help here, but another data point: I've been seeing occasional problems on my Linux workstation with 3.0 leaving bayes.lock files behind, throughout the 3.0 series. I suspect it does have something to do with syncing the journals and/or token expiry, because although it always happens after spamd has run, it does not happen in any observably predictable way.
RE: Simple question TRUE or FALSE (More data to answer this question)
Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU: a) TRUE b) FALSE My answer is b), False. I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that server usually takes 2 or 3 seconds per message. Like already posted, some of your rulesets are unnecessary because they are included in SA (standard rulesets or SURBL). Did you check 'cat messages | spamassassin -D' to see what part takes most time? DNS time-outs can take a lot of time for example (also checkable with tcpdump port 53). Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail but I use postfix (and amavisd-new) and I think it's quite memory and CPU efficient.

For a 2.4GHz Celeron with 1GB RAM, running SA + Postfix hooked to a MySQL DB, I second that! However, on slow boxes I've seen SA tests taking 15 seconds while CPU never climbed to 20-30%. The tests took so long because of a mixture of rulesets for SA < v3.x and SA >= v3.x. After cleaning up the mess, the performance gain was immediate. If there are problems with DNS, you are advised to use a DNS cache right on the SA box, or at least one that's physically in the same network / subnet. Philipp
Re: German Spam local.conf
[EMAIL PROTECTED] wrote: I would like to be removed from this distrubtion list, anyone have an idea how to do that? Read the message headers for *ANY* post on the list: list-unsubscribe: mailto:[EMAIL PROTECTED] Note: if your mail reader won't show you these headers, it's broken. This header is the RFC required method for mailing lists to advertise how to unsubscribe. Most lists which are of a technical nature conform to this, so keep an eye out for it in the future.
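For a mail reader that hides these headers, the unsubscribe address can be pulled out of a saved message with a few lines of code. The sketch below uses Python's standard `email` package; the function name `unsubscribe_targets` is invented for illustration, and any address used with it in examples is a placeholder, not this list's real address.

```python
from email import message_from_string

def unsubscribe_targets(raw_message):
    """Return the URLs listed in a message's List-Unsubscribe header.

    Header lookup is case-insensitive, so 'list-unsubscribe' works too.
    An empty list means the header is absent.
    """
    msg = message_from_string(raw_message)
    value = msg.get("List-Unsubscribe", "")
    # The header may carry several comma-separated <...> entries.
    return [part.strip().strip("<>") for part in value.split(",") if part.strip()]
```

The result is typically a mailto: URL, sometimes alongside an http URL, and sending an empty mail to the mailto: target is the usual way to leave.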
Re: Two uris hit uribl, should count double ?
Niek wrote: Hi, I got a spam today, and it had two spamvertised websites in them. Both uris hit uribl.com's black list: * 3.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist * [URIs: ridofallthebad dot com noveltyrenewed dot com] Shouldn't the score be 6, because it caught two uris ? No, spamassassin rules don't add like that. They're either hit, or not hit.
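The "hit or not hit" behaviour described in that answer can be shown with a toy scorer. This is a sketch of the idea only, not SpamAssassin's code: `total_score` and its arguments are hypothetical, and the default of 1 for unscored rules and 0 for rules whose names start with `__` follows what was said elsewhere on the list.

```python
def total_score(hits, scores):
    """Sum rule scores, counting each rule once however many items matched.

    hits:   iterable of (rule_name, matched_item) pairs
    scores: dict mapping rule_name -> score
    """
    fired = {rule for rule, _item in hits}  # a rule either fired or it did not
    total = 0.0
    for rule in fired:
        if rule.startswith("__"):
            continue  # sub-rules do not add to the message score
        total += scores.get(rule, 1.0)
    return total
```

With two URIs both matching URIBL_BLACK at 3.0, the result is 3.0, not 6.0, which is the behaviour Niek observed.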
RE: Simple question TRUE or FALSE (More data to answer this question)
From: Menno van Bennekom [mailto:[EMAIL PROTECTED] To: David Velásquez Restrepo Subject: Re: Simple question TRUE or FALSE (More data to answer this question) Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU: a) TRUE b) FALSE My answer is b), False. I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that server usually takes 2 or 3 seconds per message. Like already posted, some of your rulesets are unnecessary because they are included in SA (standard rulesets or SURBL). Did you check 'cat messages | spamassassin -D' to see what part takes most time? DNS time-outs can take a lot of time for example (also checkable with tcpdump port 53). Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail but I use postfix (and amavisd-new) and I think it's quite memory and CPU efficient. Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of the network tests, like other people that have posted 2-3 second message processing times, is that correct? My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi drives can only scan a message in 4-5 seconds. At least that was my scan time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated SA, the milter, and sendmail), but I have a good feeling most of my processing time was spent waiting for DNS responses. Any input into my situation would be appreciated. I'd love to be able to get down to 2-3 seconds, basically cutting my processing time in half! .jon
Re: Test Posting
At 11:06 AM Thursday, 5/19/2005, Jake Colman wrote -= Does this work? My last two posts did not seem to make it to the list.. No... }B-) Ed Kasky ~ Randomly Generated Quote (31 of 477): Be gentle to all and stern with yourself. - St. Teresa of Avila
Bayes_journal and locking
On our mail cluster we have spamassassin running with bayes turned on but auto-learning turned off; we then hand-train the bayes db and push it out to all the machines. However, even with bayes_auto_learn turned off, spamd is still creating the bayes_journal files. That isn't a problem until spamd decides to flush the journal file to nowhere: it still locks the bayes db file, causing concurrent spamd processes to be unable to use the db, resulting in inconsistent scores every few seconds. The quick and dirty solution was to create a +i (immutable) bayes_journal file, so spamd never writes to the journal and never tries to flush it... except it now throws an error every time it tries to write to bayes_journal. Just wondering if anyone out there has run into this problem before? Is there a more elegant solution? Here is the pertinent section of local.cf:

use_bayes 1
bayes_path /etc/mail/bayes/bayes
bayes_auto_learn 0
bayes_auto_expire 0
bayes_use_hapaxes 1
bayes_use_chi2_combining 1
bayes_learn_to_journal 0

Thx in advance for any help. -Rocky -- __ what's with today, today? Email: [EMAIL PROTECTED] PGP: http://rocky.mindphone.org/rocky_mindphone.org.gpg
RE: Simple question TRUE or FALSE (More data to answer this question)
Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of Personally, I rarely have any processing times over 1 second. Most of mine are between 0.3 and 0.9 seconds per message. I do not run any network tests, however. Stock SpamAssassin rules, the only modifications I've made have been some scoring adjustments. This is on an AMD64 3000+ with 1GB of DDR400 RAM, running OpenBSD 3.6-STABLE, spamd/spamc, and qmail. Benny -- You come from a long line of scary women. -- Ranger, Three To Get Deadly
Re: Simple question TRUE or FALSE (More data to answer this question)
Justin Mason wrote: jdow writes: You are using larger chunks of VIRT than I am. I use about 60M where you are using 98M. I run with --max-conn-per-child=15. You win a little if you either add RAM or cut down to -m2 or -m3. You do have a fair amount of cache in use. Once that happens you flounder around in cache swapping when running spamassassin. the fundamental problem is that he's not using spamd. Well, that's about 3/4 of his problem Justin. The other 1/4 is he's also got a massively oversized configuration with duplicate rulesets, and large unsupported rulesets like bigevil. Switching to spamd would help him in speed, and will limit the memory usage by limiting the number of children, but the per-child memory usage will still be high until he gets rid of bigevil. rule of thumb: if you see performance issues, and you're not using spamd, STOP RIGHT THERE and start using spamd ;) Unless you're using some other persistent daemon for integration that uses the Mail::SpamAssassin API, such as MailScanner or similar. (note: just using Mail::SpamAssassin in a perl script that gets called for each message as the OP is doesn't count. It's got to be a persistent daemon that doesn't get re-invoked for every message in order to be comparable to spamd.)
Re: SA Sometimes Being Bypassed?
Jake Colman wrote: If my sendmail server is down, a backup MX in a different domain catches all my email. When my sendmail server comes back up, the backup MX dumps all the mail it's been holding for me. It seems that all the email sent to me in this manner bypasses my SA filtering. Why should this be? I believe that what I am saying is accurate because if I examine the email headers for emails sent by the backup MX, they do not have my X-Spam headers. How do you call spamassassin for your normal mail? Without knowing how normal mail gets to SA, it's hard to guess why mail from the secondary isn't getting to SA.
Re: sa-learn and big messages
Ingo Reinhart wrote: Hello! If I commit a big mail (32 MB) to sa-learn it need long time. I must wait 50 sec. and the sa-learn process need 332 MB RAM. What can I do for faster proceed? Ingo Um... since messages over 250k (default) won't be scanned by SA, why bother sa-learning anything over this limit? SA isn't going to scan it anyway. -Jim
Re: bayes_auto_learn_threshold_spam not working?
Ingo Reinhart wrote: Hello! I am a little confused with the bayes. I have set bayes_auto_learn_threshold_spam = 6. Why is the following mail not autolearn as spam? No user.prefs or else is set.

May 19 10:25:16 mail spamd[6531]: result: Y 14 - BIZ_TLD,DATE_IN_PAST_06_12,FROM_ENDS_IN_NUMS,LOCAL_OBFU_DUNG_BIZ,LOCAL_OBFU_DUNG_GAMES,LOCAL_OBFU_DUNG_SOFTWARE,TDE_FM_BU_EXCUSE,TDE_RO_BV_GRATIS,TDE_WS_BV_GARANTIEN,TDE_WS_BV_KEINRISIKO,TDE_WS_BV_PREIS1,TDE_WS_BV_RABATT,TDE_WS_BV_SPARPREIS,TDE_WS_BV_WORD_ALLES,TDE_WS_BV_WORD_GARANTIE,TDE_WS_BV_WORD_JETZT scantime=2.1,size=6817,mid=[EMAIL PROTECTED],autolearn=no

Because there are a lot more things involved with autolearning than simply the score of the message. A certain number of points from body rules as well as from header rules is needed. There are actually a bunch of other criteria as well, but I think this is why it did not autolearn. Check the SA docs; all of the tests are in there somewhere. -Jim
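The point above, that the total score alone does not decide autolearning, can be sketched as a small predicate. Everything here is an assumption made for illustration: the function name is made up, the 12.0 threshold and the 3-point header and body minimums are defaults recalled from the SpamAssassin documentation and should be verified there, and the score used for autolearning is not necessarily the score shown in the log.

```python
def would_autolearn_spam(header_points, body_points, learn_score,
                         threshold_spam=12.0, min_header=3.0, min_body=3.0):
    """Toy version of the autolearn-as-spam decision.

    All three conditions must hold; a high total with too few header
    points (or too few body points) is not learned. The default values
    are assumptions, check them against the SpamAssassin docs.
    """
    return (learn_score >= threshold_spam
            and header_points >= min_header
            and body_points >= min_body)
```

Under these assumptions, a message scoring 14 mostly from body rules, with only about 1 point from header rules, would not be learned even with the spam threshold lowered to 6.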
Re: Perl IMAP client
On May 19, 2005, at 15:34, Bret Miller wrote: I'd like to knock together a utility for invoking SA against messages in an IMAP store, and it seems logical to build it as a Perl program using an IMAP package and Mail::SpamAssassin. Can anyone recommend a good Perl IMAP package? I have used Mail::IMAPClient. It works decently enough.
Re: sa-learn and big messages
Jim Maul wrote: Ingo Reinhart wrote: Hello! If I commit a big mail (32 MB) to sa-learn it need long time. I must wait 50 sec. and the sa-learn process need 332 MB RAM. What can I do for faster proceed? Ingo um..since messages over 250k (default) wont be scanned by SA, why bother sa-learning anything over this limit? Sa isnt going to scan it anyway. -Jim Based on the way bayes works, that doesn't make much sense Jim. Bayes doesn't learn messages, it learns tokens from within messages. Really, you don't care if SA is going to scan messages of the same size or not. You care if it will scan messages with some of the same content. It's quite possible the 32mb is a large version of a message that's normally short. For example logwatch output. The only reason training the 32mb message would be pointless would be if it only contained content that would be in similarly large messages. Minor Note of Clarification: that 250k default limit applies to those who use spamd, which admittedly Ingo does use. But it is not inherent in spamassassin in general (i.e. those using the API or spamassassin command-line don't have this feature unless implemented elsewhere)
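For those calling the API or the command-line tools, where no size limit is applied automatically, a size gate in front of training can be sketched as below. The helper name `trainable` is hypothetical and the 250k default is simply the figure quoted in this thread; as noted above, skipping large messages is a policy choice, since a big message can still contain useful tokens.

```python
import os

def trainable(paths, max_bytes=250 * 1024):
    """Split message files into (small_enough, skipped) by size on disk.

    Hypothetical pre-filter to run before handing files to a learner;
    it does not inspect message content at all.
    """
    keep, skip = [], []
    for path in paths:
        if os.path.getsize(path) <= max_bytes:
            keep.append(path)
        else:
            skip.append(path)
    return keep, skip
```

The skipped list is returned rather than dropped so that oversized messages can be reviewed by hand instead of being silently ignored.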
Re: Simple question TRUE or FALSE (More data to answer this question)
Jon Dossey wrote: Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of the network tests, like other people that have posted 2-3 second message processing times, is that correct? My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi drives can only scan a message in 4-5 seconds. At least that was my scan time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated SA, the milter, and sendmail), but I have a good feeling most of my processing time was spent waiting for DNS responses.

Using SA 2.64 with bayes, razor, dcc (w/dccifd), Mail::SpamCopURI, and 31 .cf files in my /etc/mail/spamassassin I'm able to do it in this timeframe.

# time spamc < sample-spam.txt
<snip>
X-Spam-Status: Yes, hits=999.5 required=5.0 tests=BAYES_00,DCC_CHECK,
        DNS_FROM_RFCI_DSN,GTUBE,RAZOR2_CF_RANGE_11_50,RAZOR2_CHECK
        autolearn=no version=2.64
<snip>

real 0m2.583s
user 0m0.000s
sys  0m0.000s

System is a single CPU p4 celeron 2ghz with 512mb of ram, and a caching resolver DNS on localhost.
RE: Simple question TRUE or FALSE (More data to answer this question)
I have a mailserver here that has a 1Ghz CPU and 512MB RAM and SA on that server usually takes 2 or 3 seconds per message. Like already posted, some of your rulesets are unnecessary because they are included in SA (standard rulesets or SURBL). Did you check 'cat messages | spamassassin -D' to see what part takes most time? DNS time-outs can take a lot of time for example (also checkable with tcpdump port 53). Also your SMTP-server (xmail?) takes a lot of cpu. I've never used Xmail but I use postfix (and amavisd-new) and I think it's quite memory and CPU efficient. Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of the network tests, like other people that have posted 2-3 second message processing times, is that correct? My Dl360 with dual 1.266ghz CPU's, 2GB of RAM, and dual 18GB mirrored scsi drives can only scan a message in 4-5 seconds. At least that was my scan time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated SA, the milter, and sendmail), but I have a good feeling most of my processing time was spent waiting for DNS responses. Any input into my situation would be appreciated. I'd love to be able to get down to 2-3 seconds, basically cutting my processing time in half! .jon I'll describe my setup, and that may give you some insight. It's almost certainly what you think: network tests. My setup uses Compaq ML570s, 4 700MHz Xeon CPUs each, 2G of ram, RAID 0+1 disk arrays. They do virus scanning, spam scanning, and various other mail related tasks which all (of course) take resources. These machines rarely go above 700M consumed, and only really run more than 50% busy (over a several minute window) on Monday morning, or when a spammer has decided that it would be a wonderful idea to hit every single address of ours that they have in rapid succession. 
The sa-stats routine returns the following data: based on yesterday's logs, the average scan time was 1.44s, average ham scan time 1.11s, average spam scan time 1.62s. The total number of messages scanned was 225,850. It would be much higher, but we don't scan outbound email, and we also block mail using a sendmail milter derived from rbl-milter, which blocks when 2 (or more) of the RBLs that we use agree. To speed up the network tests, we take advantage of any RBL provider that offers rsync access to their lists (njabl, dsbl, surbl, others), and then (almost) only use those ones. Our scan times went up after I added a few others (sbl-xbl, and bl.spamcop), but those ones are really fast anyway. Each machine runs a local caching DNS server, and the locally hosted RBLs are served by an rbldnsd server. Conveniently, rbldnsd makes it easy to run a private URIBL, which is occasionally nice. Our site-wide bayes database lives in SQL, because it's more convenient to share among multiple machines that way, and has the added benefit of being faster. I don't run Razor or DCC or Pyzor. A pile of custom rules, and SARE rulesets finish the setup. I've probably forgotten something, but those are the important things. Anyway, I hope that helps someone :) The setup works nicely, with nary a hitch, thanks to everyone who makes it possible! - Austin.
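The "block when 2 or more RBLs agree" policy described above can be sketched as a small voting function. This is not the milter's actual code: the names are hypothetical, and the DNS lookup is passed in as a callable (`is_listed`) so the logic can be shown without network access. The one firm detail is how a DNSBL query name is formed: the IPv4 octets are reversed and the list's zone is appended.

```python
def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    # 192.0.2.1 in zone example.org becomes 1.2.0.192.example.org
    return ".".join(reversed(octets)) + "." + zone

def should_block(ipv4, zones, is_listed, min_agree=2):
    """Return True when at least min_agree zones list the address.

    is_listed is a caller-supplied function taking a query name and
    returning True/False (in practice, whether a DNS A lookup succeeds).
    """
    votes = sum(1 for zone in zones if is_listed(dnsbl_query_name(ipv4, zone)))
    return votes >= min_agree
```

Requiring agreement between lists trades a little catch rate for fewer false positives from any single list, which seems to be the reasoning behind the setup described.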
Re: OT: Perl IMAP client
Hi, I am using the netxap package in a similar context. There was an ages-old version on CPAN which needed a few patches to work (at least with Cyrus), but the version that came with SuSE Linux was up to date. Wolfgang Hamann

I'd like to knock together a utility for invoking SA against messages in an IMAP store, and it seems logical to build it as a Perl program using an IMAP package and Mail::SpamAssassin. Can anyone recommend a good Perl IMAP package? Server will be Dovecot on Fedora. My utility will take all messages in a folder of uncaught spam that aren't wrapped in a SA report, run them through the equivalent of sa-learn, wrap them in a SA report, and clear their seen/read state. Here's all the hits I get on CPAN for stuff about IMAP: http://search.cpan.org/search?m=all&q=imap&s=1&n=100
rulesdujour and old copies of rule files
Hi, I've noticed there is a buildup of old rules in my /etc/mail/spamassassin/RulesDuJour directory like this:

109543 May 10 19:07 bogus-virus-warnings.cf
 92609 Aug 10  2004 bogus-virus-warnings.cf.20040819-0402
 93896 Aug 19  2004 bogus-virus-warnings.cf.20040823-0423
 94241 Aug 23  2004 bogus-virus-warnings.cf.20040909-0403
 94292 Sep  9  2004 bogus-virus-warnings.cf.20041101-0453
100387 Oct 30  2004 bogus-virus-warnings.cf.20041103-0434
100389 Nov  2  2004 bogus-virus-warnings.cf.20041109-0406
100721 Nov  8  2004 bogus-virus-warnings.cf.20041217-0418
103643 Dec 16 08:23 bogus-virus-warnings.cf.20041218-0453
103635 Dec 17 10:44 bogus-virus-warnings.cf.20050103-0436
104973 Jan  2 05:22 bogus-virus-warnings.cf.20050114-0501
105986 Jan 13 18:43 bogus-virus-warnings.cf.20050520-0903

Since it seems to be just a history of the script changes, can I delete all these except for the first file? Also, does SpamAssassin ONLY look in the /etc/mail/spamassassin folder and no deeper, or does it recurse into all subdirectories in there as well? -- Regards, Peter Kiem Zordah IT - IT Consultancy and Internet Services Ph: (0414) 724-766 Fax: (07) 3344-5827 Web: www.zordah.net Email: [EMAIL PROTECTED]
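If the dated copies do turn out to be disposable, picking them out by name is straightforward. The sketch below only selects candidates and deletes nothing; the function name `old_backups` is made up, and the `name.cf.YYYYMMDD-HHMM` pattern is inferred from the listing above rather than from RulesDuJour's documentation.

```python
import re

# Matches e.g. bogus-virus-warnings.cf.20040819-0402 (pattern assumed from the listing)
BACKUP = re.compile(r"^(?P<base>.+\.cf)\.(?P<stamp>\d{8}-\d{4})$")

def old_backups(filenames, keep=1):
    """Return dated backup names, leaving the newest `keep` per ruleset out.

    The live file (plain name.cf) never matches and is never returned.
    """
    by_base = {}
    for name in filenames:
        match = BACKUP.match(name)
        if match:
            by_base.setdefault(match.group("base"), []).append(
                (match.group("stamp"), name))
    stale = []
    for entries in by_base.values():
        entries.sort()  # YYYYMMDD-HHMM stamps sort chronologically as text
        cut = max(len(entries) - keep, 0)
        stale.extend(name for _stamp, name in entries[:cut])
    return sorted(stale)
```

Keeping the most recent dated copy by default leaves one version to roll back to if a freshly downloaded ruleset misbehaves.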
Re: bayes_auto_learn_threshold_spam not working?
On Thursday 19 May 2005 10:43, Ingo Reinhart wrote: Hello! I am a little confused with the bayes. I have set bayes_auto_learn_threshold_spam = 6. Why is the following mail not autolearn as spam? No user.prefs or else is set. Sorry to snip your data Ingo, but I am in the process of working out a problem of my own with spamc/spamd, and in doing so, might be able to help you. I found this link helpful to me (although it didn't quite solve my particular problem) : http://wiki.apache.org/spamassassin/AutolearningNotWorking Lots of people seem to be confused by the autolearn=no statement in the default X-Spam-Status header. There are usually questions regarding whether or not no means SpamAssassin is not autolearning at all. What it actually means is that the specific message which includes the autolearn=no part was not autolearned, not that autolearning is disabled or somehow broken. (snip) If a message has already been learned by SpamAssassin, then that message will not be learned again. Therefore, if you run a message through SpamAssassin to see why it was classified as spam or ham, and it has already been learned, you will always get the result autolearn=no. (To see this more clearly, use the -D flag, and you will see debug output explaining that the message has already been learned.) I'm just going to 'piggyback' my query on top of yours if that's ok. My mail server runs SpamAssassin 3.0.3 on Fedora Core 3 (2.6.11-1.14_FC3). SA was installed via Perl. I had previously installed SA using up2date but due to the problem I will mention in a minute, I decided to try the Perl install to see if anything changed. It didn't. When I had the dynamic duo of spamc/spamd running, I noticed I was getting similar outputs to Ingo, as far as 'autolearn' was concerned, but with 'autolearn=failed' coming up more often. I never got 'autolearn=ham or spam'. 
Switching to using the spamassassin binary on its own, instead of spamc/spamd, produced better results, with 'autolearn=ham' coming up for every ham mail I got. I am very pleased with this, but can't help wondering why the spamc/spamd combo produces 'autolearn=failed' when the spamassassin binary doesn't. To clarify what I mean, here is the section of my procmailrc on my mail server that is relevant to SA:

:0fw: spamassassin.lock
| /usr/bin/spamassassin

(this always seems to make SA do an 'autolearn=ham')

Before, I had this in my procmailrc:

:0fw:
| /usr/bin/spamc

(This produces 'autolearn=failed', with the occasional 'autolearn=no')

From the aforementioned link, I found this explanation of the failure: failed: means that autolearning was attempted, but couldn't complete. This happens if SpamAssassin can't gain a lock on the Bayes database files, etc. Basically, what do I do now to enable the spamc/spamd combo to get proper locking? By the way, I did try adding the 'spamassassin.lock' entry to the second procmailrc excerpt above, but nothing changed. If it helps, this is my local.cf, in /etc/mail/spamassassin:

rewrite_header Subject [SPAM]
required_hits 4.8
# report_safe 1
# trusted_networks 212.17.35.
lock_method flock
# These addresses should never be marked as [SPAM].
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]
whitelist_from [EMAIL PROTECTED]

The 'lock_method flock' was initially commented out. I enabled it to experiment, as I don't use NFS. Sorry for the long post. Pete.
Re: Feeding bayes
jimsheffer wrote: [snip]
|
| Next, I tried the following, but it didn't like it. Here's what I did, and the response I received:
|
| Root# sa-learn --spam -C /etc/mail/spamassassin --showdots --dir /var/CommuniGate/Accounts/spam.macnt/INBOX.mbox
|
| Use of uninitialized value in quotemeta at ///Library/Perl/5.8.1/Mail/SpamAssassin.pm line 928.
| .
| Learned from 1 message(s) (1 message(s) examined).
|
| Any idea what I need to do? (there are over 200 emails in each account, but it looks like it read only 1)

--dir is probably not what you want to use, assuming that INBOX.mbox is not a directory; it's also deprecated (at least in SA 3.0.3). Further assuming that the file is in mbox format, you'll need to use the --mbox flag to sa-learn to tell it what to expect. Good Luck! Craig.
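One way to confirm that a file really is an mbox with the expected number of messages, before or after pointing sa-learn at it, is to count the messages independently. The sketch below uses Python's standard `mailbox` module; the function name is made up, and it assumes the file uses the classic "From " separator lines, which may or may not be how CommuniGate stores its mailboxes.

```python
import mailbox

def count_mbox_messages(path):
    """Return how many messages Python's mbox parser finds in the file.

    create=False makes a missing file an error instead of silently
    creating an empty mailbox.
    """
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

If this reports roughly 200 while sa-learn reports 1 message examined, the learner was most likely treating the whole file as a single message.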
Re: SA Being Bypassed?
On Thu, May 19, 2005 at 10:57:33AM -0400, Jake Colman wrote: Am I missing some obvious SA configuration that would address this? Is what I am seeing impossible to be true - which means that something else is going on and I am misstating this? It's an issue for however you have your filtering setup. SA filters anything that gets passed to it. There's no bypass method in SA itself. -- Randomly Generated Tagline: I dunno, I dream in Perl sometimes... -- Larry Wall in [EMAIL PROTECTED]
Re: --lint tells me I need 0.34 dns
On Thu, May 19, 2005 at 10:38:58PM -0400, Eric Wood wrote: I'm not a perl guru so my natural fear is that I'm going to be trapped doing any endless amount of perl updates just for this one problem. Hopefully only this rpm is needed. Here I go The INSTALL doc in the tarball has a list of the required and optional modules and what versions are necessary. If you installed the RPM as provided by a distribution, they should have included the proper versions of the other modules as well. If they didn't, you should let them know to update their packages. -- Randomly Generated Tagline: Love isn't hopeless. Look, maybe I'm no expert on the subject, but there was one time I got it right. -- Homer Simpson Another Simpson's Clip Show
Re: --lint tells me I need 0.34 dns
- Original Message -
From: Ed Kasky [EMAIL PROTECTED]

> Have you ever tried using the cpan shell to install perl modules? Try the following as root:
>
> perl -MCPAN -e shell
> o conf prerequisites_policy ask
> install Net::DNS
>
> I have found it much easier for perl modules. HTH

Cool. That did get rid of the dependency message from spamassassin --lint. I see that it installed this module in:

/usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi/Net

and it appears not to have affected the current 5.8.1-based module package:

# rpm -V perl-Net-DNS
#

If this is all there is to it, then you're the man, Ed!

Thanks,
-eric wood
Re: --lint tells me I need 0.34 dns
- Original Message -
From: Theo Van Dinter

> If you installed the RPM as provided by a distribution, they should have included the proper versions of the other modules as well. If they didn't, you should let them know to update their packages.

You're correct. It's the only spamassassin-3x rpm I could find on the net at:
http://dag.wieers.com/home-made/apt/

This guy's rpm didn't look out for the rpm dependency on a newer perl-Net-DNS module. But looking at the current spamassassin .spec file:

[EMAIL PROTECTED] Mail-SpamAssassin-3.0.3]# grep Req spamassassin.spec
Requires: perl-Mail-SpamAssassin = %{version}-%{release}
Requires: perl(Pod::Usage)
BuildRequires: perl = 5.6.1 perl(Digest::SHA1)
Requires: perl-Mail-SpamAssassin = %{version}-%{release}
Requires: perl = 5.6.1 perl(HTML::Parser) perl(Digest::SHA1)
BuildRequires: perl = 5.6.1 perl(HTML::Parser) perl(Digest::SHA1)

it doesn't really check for specific perl modules. Maybe the spamassassin package maintainers need to be informed. I seem to be working fine now.

-eric wood
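The point about the spec file can be checked mechanically: collect every perl(...) capability named on a Requires or BuildRequires line and see whether Net::DNS is among them. This is a rough sketch only. The spec text is copied from the grep output above (with line breaks reconstructed), and the function name perl_modules_required is invented for illustration.

```python
import re

# Requires/BuildRequires lines as quoted in the message above;
# the line breaks are a reconstruction.
SPEC_LINES = """\
Requires: perl-Mail-SpamAssassin = %{version}-%{release}
Requires: perl(Pod::Usage)
BuildRequires: perl = 5.6.1 perl(Digest::SHA1)
Requires: perl-Mail-SpamAssassin = %{version}-%{release}
Requires: perl = 5.6.1 perl(HTML::Parser) perl(Digest::SHA1)
BuildRequires: perl = 5.6.1 perl(HTML::Parser) perl(Digest::SHA1)
"""


def perl_modules_required(spec_text):
    """Return the set of perl(...) module names on dependency lines."""
    modules = set()
    for line in spec_text.splitlines():
        # Split on the first colon only, so "Pod::Usage" stays intact.
        tag, _, rest = line.partition(":")
        if tag.strip() in ("Requires", "BuildRequires"):
            modules.update(re.findall(r"perl\(([^)]+)\)", rest))
    return modules


modules = perl_modules_required(SPEC_LINES)
print(sorted(modules))
print("Net::DNS listed:", "Net::DNS" in modules)
```

On the lines quoted here, Net::DNS does not appear at all, which matches the observation that the package never asks for it.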
Re: --lint tells me I need 0.34 dns
On Thu, May 19, 2005 at 11:35:01PM -0400, Eric Wood wrote:
> You're correct. It's the only spamassassin-3x rpm I could find on the net at:
> http://dag.wieers.com/home-made/apt/

I'd just build it yourself. Docs are on the wiki/download page (iirc).

> doesn't really check for specific perl modules. Maybe the spamassassin package maintainers might need to be informed.

Yeah, this comes up periodically. Since Net::DNS isn't required for SA operation, it's not listed as required in the spec file. There doesn't seem to be a way to say "if perl(Net::DNS) is installed, require version 0.34 or higher".

-- 
Randomly Generated Tagline:
There are perfectly good answers to those questions, but they'll have to wait for another night. -- Homer Simpson, Homer's Barbershop Quartet
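The rule described above ("only complain if the module is present but too old") is easy to state outside of RPM. Below is a minimal sketch of that logic, not anything SpamAssassin or rpm actually ships: the function name optional_dep_ok is hypothetical, and it assumes versions are plain decimal numbers such as 0.34, compared numerically the way classic Perl $VERSION values are.

```python
from decimal import Decimal


def optional_dep_ok(installed, minimum):
    """Check an optional dependency (hypothetical helper).

    installed: version string, or None when the module is absent.
    minimum:   lowest acceptable version string, e.g. "0.34".
    """
    if installed is None:
        # Optional module not installed at all: nothing to enforce.
        return True
    # Assumption: decimal-style versions, so 0.4 counts as newer than 0.34.
    return Decimal(installed) >= Decimal(minimum)


print(optional_dep_ok(None, "0.34"))    # module absent
print(optional_dep_ok("0.31", "0.34"))  # present but too old
print(optional_dep_ok("0.48", "0.34"))  # present and new enough
```

A check like this has to run at install or lint time, because a plain spec-file Requires line can only say "must be installed", not "if installed, then at least this version".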
What is a caching name server?
Hello list, in several posts I have noticed people refer to a caching nameserver. What exactly is that? Would BIND 9.3.1 qualify? Any advice would be greatly appreciated. Regards, Devin
Re: Simple question TRUE or FALSE (More data to answer this question)
From: Jon Dossey [EMAIL PROTECTED]

> From: Menno van Bennekom [mailto:[EMAIL PROTECTED]
> To: David Velásquez Restrepo
> Subject: Re: Simple question TRUE or FALSE (More data to answer this question)
>
>>> Q) With spamassassin (and all the above info) you need about 20 to 30 seconds per email message and LOTS of RAM and CPU:
>>> a) TRUE
>>> b) FALSE
>>
>> My answer is b), False. I have a mailserver here that has a 1 GHz CPU and 512 MB RAM, and SA on that server usually takes 2 or 3 seconds per message. Like already posted, some of your rulesets are unnecessary because they are included in SA (standard rulesets or SURBL).
>>
>> Did you check 'cat messages | spamassassin -D' to see what part takes most time? DNS time-outs can take a lot of time, for example (also checkable with tcpdump port 53). Also your SMTP server (xmail?) takes a lot of CPU. I've never used Xmail, but I use postfix (and amavisd-new) and I think it's quite memory and CPU efficient.
>
> Please don't take this as me doubting you - but how in the world are you able to scan a message in 2-3 seconds? I assume you're running some of the network tests, like other people that have posted 2-3 second message processing times, is that correct?
>
> My DL360 with dual 1.266 GHz CPUs, 2 GB of RAM, and dual 18 GB mirrored SCSI drives can only scan a message in 4-5 seconds. At least that was my scan time with a completely default setup, running spamd/spamass-milter, SA 3.0.1, RedHat FC2, and sendmail 8.13.1. I haven't checked in a while (since I updated SA, the milter, and sendmail), but I have a good feeling most of my processing time was spent waiting for DNS responses.
>
> Any input into my situation would be appreciated. I'd love to be able to get down to 2-3 seconds, basically cutting my processing time in half!

[JDOW] Jon, I am using these rules from the sources that follow the names. (I built my own GetRules script.)
99_FVGT_Tripwire.cf,http://www.rulesemporium.com/rules/
99_OBFU_drugs.cf,http://www.rulesemporium.com/rules/Testing/
99_sare_fraud_post25x.cf,http://www.rulesemporium.com/rules/
99_FVGT_DomainDigits.cf,http://www.rulesemporium.com/rules/Testing/
99_FVGT_meta.cf,http://www.rulesemporium.com/rules/
88_FVGT_body.cf,http://www.rulesemporium.com/rules/
88_FVGT_rawbody.cf,http://www.rulesemporium.com/rules/
88_FVGT_subject.cf,http://www.rulesemporium.com/rules/
88_FVGT_headers.cf,http://www.rulesemporium.com/rules/
72_sare_bml_post25x.cf,http://www.rulesemporium.com/rules/
72_sare_redirect_post3.0.0.cf,http://www.rulesemporium.com/rules/
70_sare_highrisk.cf,http://www.rulesemporium.com/rules/
70_sare_adult.cf,http://www.rulesemporium.com/rules/
70_sare_bayes_poison_nxm.cf,http://www.rulesemporium.com/rules/
70_sare_oem.cf,http://www.rulesemporium.com/rules/
70_sare_random.cf,http://www.rulesemporium.com/rules/
70_sare_spoof.cf,http://www.rulesemporium.com/rules/
70_sare_header.cf,http://www.rulesemporium.com/rules/
70_sare_header_eng.cf,http://www.rulesemporium.com/rules/
70_sare_html.cf,http://www.rulesemporium.com/rules/
70_sare_html_eng.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj_eng.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj0.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj1.cf,http://www.rulesemporium.com/rules/
70_sare_genlsubj2.cf,http://www.rulesemporium.com/rules/
70_sare_specific.cf,http://www.rulesemporium.com/rules/
70_sare_unsub.cf,http://www.rulesemporium.com/rules/
70_sare_uri0.cf,http://www.rulesemporium.com/rules/
70_sare_uri1.cf,http://www.rulesemporium.com/rules/
70_sare_uri_eng.cf,http://www.rulesemporium.com/rules/
70_sare_obfu0.cf,http://www.rulesemporium.com/rules/
70_sare_obfu1.cf,http://www.rulesemporium.com/rules/
chickenpox.cf,http://www.rulesemporium.com/rules/
ratware.cf,http://www.rulesemporium.com/rules/
useless.cf,http://www.rulesemporium.com/rules/
weeds_2.cf,http://www.rulesemporium.com/rules/

Spamc/Spamd takes 2 seconds to scan a small spam message and spit it out.

$ spamc scott
0.00user 0.00system 0:01.97elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+190minor)pagefaults 0swaps

I am using the default BL tests for 3.02. (Am I insane running all those tests? Probably. Does it work? Excellently. Now, again, am I crazy running all those tests? Naw - if it works do not fix it.)

{^_-} - Proof that much of the time old age and guile really can defeat youth and enthusiasm.
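Since several posters in this thread suspect that most of the scan time is spent waiting on DNS, it can help to time a lookup directly, separately from SpamAssassin. The following is only a rough sketch: time_lookup is a made-up helper, it uses the system resolver through socket.gethostbyname by default, and it does not reproduce the DNSBL or SURBL queries SA itself sends.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.gethostbyname):
    """Return (elapsed_seconds, address_or_None) for one lookup.

    resolver is any callable taking a hostname; it is a parameter so
    the timing logic can be exercised without touching the network.
    """
    start = time.perf_counter()
    try:
        address = resolver(hostname)
    except OSError:
        # Failed lookups and time-outs still cost wall-clock time,
        # which is exactly what this is meant to expose.
        address = None
    elapsed = time.perf_counter() - start
    return elapsed, address


# Example call (needs network access):
# print(time_lookup("www.rulesemporium.com"))
```

Running the same lookup twice gives a rough idea of whether answers are being cached close by: if the second call is not noticeably faster than the first, every message is probably paying the full round trip to a remote name server.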
Re: --lint tells me I need 0.34 dns
> There doesn't seem to be a way to say if perl(Net::DNS) is installed, require version 0.34 or higher.

Which begins to make me wonder - why NOT require Net::DNS for SA, at least as far as the install goes? Is it that huge a package? If the user isn't going to use it, how much have they lost by being required to install it somewhere?

Loren