Re: SA TIMED OUT message debian sarge
On Friday November 3 2006 05:23, Matt Kettler wrote:
> I believe the option is $sa_timeout. Not sure what the default is, probably 30, which should be enough to prevent that problem, unless you have a LOT of SA instances contending for the AWL database.
>
> Try adding $sa_timeout = 60 to your amavisd.conf and "lock_method flock" to your spamassassin/local.cf (if you don't use NFS for DB storage).

A note for the archive: $sa_timeout is relevant primarily for versions older than 2.4.0. The time allowed for SA is now controlled primarily through $child_timeout, which defaults to 8 minutes; 2/3 of that is 5+ minutes. The 2.4.0 release notes say:

- added ability to kill externally running decoder process or a command-line virus scanner process if running for too long; ...; allowed time is calculated as 2/3 of the remaining time (initially at $child_timeout), but at least 10 seconds;
- use the same timeout calculation as above for calls to SA, taking $sa_timeout instead if that value is bigger than the calculated time, thus making $sa_timeout pretty much redundant;

Mark
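The timeout arithmetic quoted from the release notes can be sketched as follows. This is only an illustration of the stated rule (2/3 of the remaining time, at least 10 seconds, with $sa_timeout winning if it is larger); the function name and parameters are hypothetical and this is not amavisd-new's actual code.

```python
def allowed_time(remaining, sa_timeout=0.0, floor=10.0):
    """Sketch of the rule in the 2.4.0 release notes (not amavisd-new code).

    remaining  -- seconds left of the child's budget (starts at $child_timeout)
    sa_timeout -- the configured $sa_timeout; only wins if it is larger
    floor      -- the "at least 10 seconds" lower bound
    """
    budget = max(remaining * 2.0 / 3.0, floor)
    return max(budget, sa_timeout)


child_timeout = 8 * 60  # the 8-minute default mentioned above, in seconds
print(allowed_time(child_timeout))                 # a bit over 5 minutes
print(allowed_time(child_timeout, sa_timeout=60))  # $sa_timeout = 60 changes nothing
print(allowed_time(12))                            # the 10-second floor applies
```

With the default $child_timeout, a $sa_timeout of 30 or 60 is always smaller than the calculated budget, which is why the release notes call it pretty much redundant.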
Re: R: BIG increase in spam today
François Rousseau wrote:
> Greylisting is not always good... Greylisting inserts a delay in delivery, and sometimes the email has to be delivered fast.

I don't trust DNSBLs enough to completely block an email based only on them.

What about combining blacklisting and greylisting? I'd like to use greylists (with a long delay) for blacklisted emails only.

Has anybody already implemented it? Is there already something able to implement it?

Thanks.

--
Federico Giannici [EMAIL PROTECTED]
http://www.neomedia.it
R: R: BIG increase in spam today
> François Rousseau wrote:
>> Greylisting is not always good... Greylisting inserts a delay in delivery, and sometimes the email has to be delivered fast.
>
> I don't trust DNSBLs enough to completely block an email based only on them.
>
> What about combining blacklisting and greylisting? I'd like to use greylists (with a long delay) for blacklisted emails only.

This is a very interesting idea. Ah, these Italian brains! :)

> Has anybody already implemented it?

I use postfix, and something like that is suggested in postfix's SMTP Access Policy Delegation manual (http://www.postfix.org/SMTPD_POLICY_README.html). See "Greylisting mail from frequently forged domains" in there. That, however, uses a static list of frequently forged domains and check_sender_access to enforce greylisting on the listed domains. What you suggest is obviously more powerful.

Due to the dynamic nature of this test, I guess that, at least in the postfix case, it would need to be embedded into the greylisting server somehow: it seems postfix doesn't allow more than one policy server to be specified in the check_policy_service directive. So the code of a postgrey or postgreysql server would surely need to be adapted for this.

> Is there already something able to implement it?

As far as I know, no.

---
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100
NEVER send an e-mail to: [EMAIL PROTECTED]

> Thanks.
> --
> Federico Giannici [EMAIL PROTECTED]
> http://www.neomedia.it
Does a rule already exist for this?
I assume a rule already exists for this, but just on the remote chance that it doesn't... If the URL shown as the text of a hyperlink does not match the href, then the message should get more spam points. For example:

    <A HREF="http://StringA">http://StringB</A>

    if (StringA != StringB) { Add more spam points. }

Joe
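The comparison Joe describes can be sketched outside SpamAssassin like this. It is a standalone illustration using only Python's standard library, not an SA rule or plugin; the class and function names are made up for the example, and it compares host names only. Legitimate mail that shows one host and links through another (click trackers, for instance) would also be flagged by a check this naive.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def _host(url):
    """Return the lower-cased host of a URL, or None if it cannot be parsed."""
    try:
        return urlparse(url).hostname
    except ValueError:
        return None


class LinkMismatchFinder(HTMLParser):
    """Collect (href, link text) pairs whose hosts differ (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._href = None      # href of the <a> currently open, if any
        self._text = []        # text fragments seen inside that <a>
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        text = "".join(self._text).strip()
        # Only compare when the visible text itself looks like a URL.
        if text.lower().startswith(("http://", "https://")):
            href_host, text_host = _host(self._href), _host(text)
            if href_host and text_host and href_host != text_host:
                self.mismatches.append((self._href, text))
        self._href = None


def find_mismatches(html):
    finder = LinkMismatchFinder()
    finder.feed(html)
    finder.close()
    return finder.mismatches


print(find_mismatches('<A HREF="http://StringA">http://StringB</A>'))
```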
Re: Enabling/testing SPF?
On Fri, 2006-11-03 at 10:21 +, Henry Kwan wrote:
> Am finally getting around to making SPF records for our domains, so naturally I was fiddling with SA to see whether SPF checking was enabled. Running 3.1.7 with Mail-SPF-Query-1.999.1 installed. During make test, it seemed to pass all 36 tests in t/spf...ok. But when I do a debug test via spamassassin -D sample-nonspam.txt, it doesn't seem to return debug: registering glue method for check_for_spf_helo_pass (Mail::SpamAssassin::Plugin::SPF=HASH(0x8d21990)).
>
> I then sent a test email from another machine, forging an email with a domain known to have a good SPF record, and I didn't see any references to SPF in the tests section.
>
> So what might be the issue here? TIA for any insights.

Check the output of:

    spamassassin -D file 2>&1 | grep -i spf

Which MTA do you use? Your MTA must insert an X-Envelope-From: header (or similar).

Thanks
Ram
RE: how to show exact score for the tests in the headers
Hi,

I'm running SLES9. I've added

    add_header all Report _REPORT_

to the local.cf file, but I'm still getting headers without individual scores :( Like these:

    X-Spam-Status: Yes, hits=11.0 tag1=-999.0 tag2=5.0 kill=5.0 tests=BAYES_50, FROM_ILLEGAL_CHARS, HTML_60_70, HTML_MESSAGE, MIME_HTML_MOSTLY, RAZOR2_CF_RANGE_51_100, RAZOR2_CHECK, RCVD_IN_BL_SPAMCOP_NET, RCVD_IN_NJABL_DUL, SUBJ_ILLEGAL_CHARS
    X-Spam-Level: ***

These are the latest patched versions of SA and Amavis on SLES9:

    amavisd-new-20030616p9-3.6
    spamassassin-2.64-3.7

Is there still a way for me to get these scores for every test?

Best Regards,
Leon

-----Original Message-----
From: Gary V [mailto:[EMAIL PROTECTED]
Sent: Friday, November 03, 2006 12:57 AM
To: users@spamassassin.apache.org
Subject: Re: how to show exact score for the tests in the headers

>> I'm running a system with Cyrus+Postfix+Amavisd-new+SA+ClamAV. I've seen on this list that it is possible to show in the SA headers the exact score for every test that hit on a particular message, like this:
>>
>>     No, hits=-0.8 required=5.0 tests=BAYES_00=-2.599, DK_POLICY_SIGNSOME=0.001,DNS_FROM_RFC_ABUSE=0.2, FORGED_MUA_MOZILLA=1.593,SPF_PASS=-0.001 autolearn=no version=3.1.7
>>
>> My current SA headers look like this:
>>
>>     X-Spam-Status: Yes, hits=15.8 tag1=-999.0 tag2=5.0 kill=5.0 tests=BAYES_99, HTML_FONTCOLOR_RED, HTML_FONTCOLOR_UNSAFE, HTML_MESSAGE, MSGID_FROM_MTA_SHORT, RCVD_IN_NJABL_DUL, RCVD_IN_SORBS_DUL, RCVD_IN_SORBS_WEB, RCVD_IN_XBL
>>     X-Spam-Level: ***
>>
>> How should I change the configs (local.cf, amavis.conf, etc.?) so it looks like the upper example?
>
> To get the list of rules hit and their individual scores, add the following line to local.cf:
>
>     add_header all Report _REPORT_
>
> Run 'perldoc Mail::SpamAssassin::Conf' for details.
> --
> Chris

That will not help here as amavisd-new does not allow spamassassin to write headers. The problem here is an outdated amavisd-new. What distro are you running?
Gary V
Forged_Hotmail_Rcvd
I am wondering why this mail hit the test "2.5 FORGED_HOTMAIL_RCVD Forged hotmail.com 'Received:' header found". Can anyone help me understand why?

Received: from bay0-omc3-s8.bay0.hotmail.com [65.54.246.208] by qualispace.com with ESMTP (SMTPD-8.22) id ADCC0278; Fri, 03 Nov 2006 06:53:48 -0500
Received: from BAY124-W33 ([207.46.11.196]) by bay0-omc3-s8.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.1830); Fri, 3 Nov 2006 03:53:44 -0800
X-Originating-IP: [59.92.166.54]
X-Originating-Email: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Content-Type: multipart/alternative; boundary=_5e4c0420-8c86-4f99-8d53-6f3e557244f0_
From: PHILIP.P. ALEX [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject:
Date: Fri, 3 Nov 2006 11:53:44 +
MIME-Version: 1.0
Return-Path: [EMAIL PROTECTED]
X-Envelope-From: [EMAIL PROTECTED]
X-OriginalArrivalTime: 03 Nov 2006 11:53:44.0303 (UTC) FILETIME=[B696CFF0:01C6FF3E]
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on SERVER204
X-Spam-Level:
X-Spam-Status: Yes, score=4.7 required=4.5 tests=BAYES_50,FORGED_HOTMAIL_RCVD, HTML_MESSAGE,MISSING_SUBJECT,SPF_HELO_PASS autolearn=no version=3.0.1
X-Spam-Report:
 * -0.1 SPF_HELO_PASS SPF: HELO matches SPF record
 * 2.5 FORGED_HOTMAIL_RCVD Forged hotmail.com 'Received:' header found
 * 0.0 HTML_MESSAGE BODY: HTML included in message
 * 1.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
 * [score: 0.4910]
 * 1.2 MISSING_SUBJECT Missing Subject: header
X-IMail-Queuename:2dcc01be0d97; Demo: 2006-11-30
X-RCPT-TO: [EMAIL PROTECTED]
Status: U
X-IMail-Rule: H~X-Spam-Status: Yes:Junk Data- X-SPAM-STATUS: YES, SCORE=4.7
X-UIDL: 458491654
X-IMail-ThreadID: 2dcc01be0d97

Warm Regards,
Suhas
System Administrator
QualiSpace - A QuantumPages Enterprise
===
Tel India: +91 (22) 6792 - 1480
Tel US: +1 (614) 827 - 1224
Fax India: +91 (22) 2530 - 3166
URL: http://www.qualispace.com
===
For Any Technical Query Please Use: http://helpdesk.qualispace.com
QualiSpace Community Discussion forum: http://forum.qualispace.com
RE: Forged_Hotmail_Rcvd
-----Original Message-----
From: Suhas (QualiSpace) [mailto:[EMAIL PROTECTED]
Sent: Friday, November 03, 2006 8:00 AM
To: users@spamassassin.apache.org
Subject: Forged_Hotmail_Rcvd

> I am wondering why this mail failed in 2.5 FORGED_HOTMAIL_RCVD Forged hotmail.com 'Received:' header found test? Can anyone help me out in understanding why?

The test has hard-coded IP addresses, and Hotmail has added new servers. If you use SPF (you do), just disable that test by adding

    score FORGED_HOTMAIL_RCVD 0

to local.cf.
RE: Does a rule already exist for this?
Joe Flowers wrote:
> If the text with a URL in a hyperlink does not match the href, then the message should get more spam points.

This idea has been discussed before, and rejected. Too many false positives.

http://wiki.apache.org/spamassassin/AntiPhishFakeUrlRule
Re: Spam
> http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#item_clear_report_template

Thanks for the link. I will take a look at it in the next few days. But is there anything more I can do against the spam problem?

Marcus
BIZ_TLD and INFO_TLD
Aren't they a bit outdated? I have a couple of FPs caused by them scoring 2.whatever on an opt-in mailing list (at least, it seems so).

I know I can lower their scores. I was just wondering why their default score is so high: maybe when the .biz and .info TLDs started operating, they were mostly used by spammers... What about now? I guess there are legitimate (and unlucky) senders with .biz and .info domains.

---
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100
NEVER send an e-mail to: [EMAIL PROTECTED]
Amazon / RFCI false positives
Amazon.co.uk was listed by RFC-Ignorant at the start of this week, and it is now scoring more than 5: DNS_FROM_RFC_DSN 2.87, DNS_FROM_RFC_POST 1.44, FROM_EXCESS_BASE64 1.05. Tony. -- f.a.n.finch [EMAIL PROTECTED] http://dotat.at/ IRISH SEA: VARIABLE 3 OR LESS, BECOMING WESTERLY 4 OR 5 LATER. SMOOTH BECOMING SLIGHT. FAIR. GOOD.
Re: Amazon / RFCI false positives
* Tony Finch [EMAIL PROTECTED]:
> Amazon.co.uk was listed by RFC-Ignorant at the start of this week, and it is now scoring more than 5: DNS_FROM_RFC_DSN 2.87, DNS_FROM_RFC_POST 1.44, FROM_EXCESS_BASE64 1.05.

Amazon.co.uk is not listed:
http://www.rfc-ignorant.org/tools/lookup.php?domain=Amazon.co.uk

--
Ralf Hildebrandt (i.A. des IT-Zentrums) [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin    Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF    send no mail to [EMAIL PROTECTED]
Block "wrote:" spams
I am getting a lot of these "Bob wrote:" spams.

Does anyone know a way to write a rule so that a subject containing "wrote:" gets tagged?

Here is what I have:

    header WROTE_SUB Subject =~ /\bwrote\:\b/i
    describe WROTE_SUB Wrote in Subject
    score WROTE_SUB 3.0

--
Mike Yrabedra B^)
Re: Block "wrote:" spams
I've been getting the same and just wrote a rule for it today. I've got what you have listed below. Haven't tested it, though.

On 11/3/06, MIKE YRABEDRA [EMAIL PROTECTED] wrote:
> I am getting a lot of these "Bob wrote:" spams
>
> Anyone know a way to write the rule so if the subject has "wrote:" in the subject, tag it?
>
> Here is what I have?
>
>     header WROTE_SUB Subject =~ /\bwrote\:\b/i
>     describe WROTE_SUB Wrote in Subject
>     score WROTE_SUB 3.0
>
> --
> Mike Yrabedra B^)

--
-Juan
handy new rule-dev tip: --cf
Here's a nifty feature I added recently to SVN trunk that's quite useful if you're a rule developer. Basically, it allows you to set a line or two of configuration on the command line:

    spamassassin --cf=config

    --cf='config line'
        Add additional lines of configuration directly from the command-line, parsed after the configuration files are read. Multiple --cf arguments can be used, and each will be considered a separate line of configuration.

Here's the key benefit for rule developers: you can test rules against a message without editing a single file. For example:

    $ spamassassin -t --cf="body NEWRULE /text/" -Lt spam | grep NEW
     1.0 NEWRULE BODY: NEWRULE
    $

It's very nifty.

--j.
Re: Amazon / RFCI false positives
On Fri, 3 Nov 2006, Ralf Hildebrandt wrote:
> * Tony Finch [EMAIL PROTECTED]:
>> Amazon.co.uk was listed by RFC-Ignorant at the start of this week, and it is now scoring more than 5: DNS_FROM_RFC_DSN 2.87, DNS_FROM_RFC_POST 1.44, FROM_EXCESS_BASE64 1.05.
>
> Amazon.co.uk is not listed:
> http://www.rfc-ignorant.org/tools/lookup.php?domain=Amazon.co.uk

My mistake: I cited the wrong domain. Try bounces.amazon.com, which they use in the return path of their messages (I guess for all their international domains):
http://www.rfc-ignorant.org/tools/lookup.php?domain=bounces.amazon.com

Tony.
--
f.a.n.finch [EMAIL PROTECTED] http://dotat.at/
SOUTH SHANNON: SOUTHEASTERLY 3 OR 4. MODERATE. FAIR. GOOD.
RE: sa-learn training question(s)
Matt Kettler wrote:
> Jason Wellman wrote:
>> ... I have all incoming mail that is tagged as Spam delivered to a CaughtSpam IMAP box for each user. ... Should I also have sa-learn learn from the CaughtSpam folder? I have read some places that say yes, and some that say no.
>
> YES. Those that say no clearly do not know what they're talking about.

Ummm... Do you really want to sa-learn from an unverified spam folder?

> Let's face it.. if there was no point in learning tagged spam, why does the autolearner only kick in on high-scoring spam?

The autolearner kicks in only on high-scoring spam to avoid learning from false positives. Learning from the CaughtSpam folder is like dropping the autolearn threshold down to 5.0 and removing the header/body score requirements.

> That said, it will only learn the caught spam that wasn't already autolearned, but this is actually quite valuable as it will generally contain more of the borderline spam, which is important for bayes to know about.

You do want to learn from as much spam (and ham) as possible, but you want a human to sort it first. I would say that you should only learn from the IsSpam folder and encourage your users to copy the spam over from the CaughtSpam folder to the IsSpam folder after they've verified that there are no false positives.

>> Second question. It is easy to tell a user (and some of mine are non-tech folks) to put Spam in the IsSpam folder, but there isn't a way to really tell them that they need to put HAM in a certain folder; they just don't understand it. So my second question is: how are people feeding sa-learn good HAM?
>
> That depends a lot on the user. Some are good, some not so good. Most will generally do this only when they're getting FPs, but that's still handy.

Agreed. Just tell them to do it. If they do, great! If not, you still get the false positives. In the end, they are the ones responsible for making the Bayes DB effective. If they don't want to help, there's not much you can do about it.

>> I was toying with the idea of feeding in people's Sent folders along with all messages from their INBOX and Trash that were marked as read (I can pull these out using mboxgrep). This would also give me a larger sample of HAM than Spam, which I understand is a good thing. Can anyone poke holes in my logic on this, or point out a better source for me to scrape HAM to feed sa-learn?
>
> Well, doing inbox and trash, you'll learn as ham any false negatives that your user happened to read and did not move to IsSpam.. If you don't trust them to force-feed good ham, this might not be a good idea. Sent would appear to be fine.. unless your users are really dumb and frequently reply to spam.

Ditto. Sent might be ok, but Trash is probably a bad idea.

--
Bowie
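The folder-based training discussed above could be driven by a small script along these lines. This is a sketch only: it builds sa-learn command lines and prints them rather than running anything, it assumes the folders are mbox files (hence --mbox), and the folder paths are hypothetical.

```python
import shlex


def sa_learn_command(kind, mailbox, mbox=True):
    """Build an sa-learn command line for one folder.

    kind    -- "spam" or "ham"
    mailbox -- path to the folder (hypothetical paths in the example below)
    mbox    -- pass --mbox when the folder is a single mbox file
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", "--" + kind]
    if mbox:
        cmd.append("--mbox")
    cmd.append(mailbox)
    return cmd


# Human-verified spam goes in as spam; Sent mail goes in as ham.
# CaughtSpam and Trash are deliberately left out, as suggested in the thread.
plan = [
    sa_learn_command("spam", "/home/alice/mail/IsSpam"),
    sa_learn_command("ham", "/home/alice/mail/Sent"),
]
for cmd in plan:
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review what would be learned before handing them to a cron job or to subprocess.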
Re: R: BIG increase in spam today
Federico Giannici wrote:
> François Rousseau wrote:
>> Greylisting is not always good... Greylisting inserts a delay in delivery, and sometimes the email has to be delivered fast.
>
> I don't trust DNSBLs enough to completely block an email based only on them.
>
> What about combining blacklisting and greylisting? I'd like to use greylists (with a long delay) for blacklisted emails only.
>
> Has anybody already implemented it? Is there already something able to implement it?

From the milter-greylist README:

-- snip --
9 Using DNSRBL
==============

milter-greylist can use a DNSRBL to decide whether a host should be greylisted or whitelisted. For instance, let us say that you want to greylist any host appearing in the SORBS dynamic pool list (this includes DSL and cable pools). You would do this:

    # if IP a.b.c.d is positive, then nslookup of d.c.b.a.dnsbl.sorbs.net
    # returns 127.0.0.10
    dnsrbl "SORBS DUN" dnsbl.sorbs.net 127.0.0.10
    acl greylist dnsrbl "SORBS DUN"

You can combine it with variable greylisting delays so that dynamic hosts get a greylisting delay of 12 hours while other hosts only get 15 minutes:

    dnsrbl "SORBS DUN" dnsbl.sorbs.net 127.0.0.10
    acl greylist dnsrbl "SORBS DUN" delay 12h
    acl greylist default delay 15m

This feature was introduced in milter-greylist 2.1.7 and may not be fully stable. You need the --enable-dnsrbl flag to configure to use it. You must link milter-greylist with a thread-safe resolver, else the milter will be unstable (see the explanation in the SPF section).
-- snip --

Ken A
Pacific.Net

> Thanks.
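The lookup convention mentioned in the README comment (IP a.b.c.d becomes a query for d.c.b.a.dnsbl.sorbs.net) and the two-tier delay can be illustrated like this. It is a sketch, not milter-greylist code: the function names are invented, no DNS query is actually sent, and the caller is assumed to supply whatever A records a resolver returned.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)          # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return reversed_octets + "." + zone


def greylist_delay(dns_answers, listed_value="127.0.0.10"):
    """Pick the delay from the README example: 12h if listed, else 15m.

    dns_answers -- A records returned for the DNSBL query (empty if none)
    """
    return "12h" if listed_value in dns_answers else "15m"


print(dnsbl_query_name("192.0.2.99", "dnsbl.sorbs.net"))
print(greylist_delay(["127.0.0.10"]))
print(greylist_delay([]))
```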
Re: Relay Checker plugin v0.2
John Rudd wrote:
> I've put up a new version of Relay checker, in ...
>
> I expect I might, at some point, switch from using a dynamic score in the plugin, to a normal score. But that's the only change I expect to make, aside from bug fixes (if there are any), and/or a switch to using Net::DNS.

I wonder if there is any way for a plugin to hook into SA's DNS routines. That might be better than calling Net::DNS directly.
RE: BIG increase in spam today
> On Thursday, 2 November 2006 at 16:04, Amos wrote:
>> (...) Actually, it's getting to the extent that some at work are raising questions as to whether our SA setup will be able to maintain adequate protection from this growing onslaught.
>>
>> Amos
>
> Only AFTER adequate initial RBL filtering. Spamhaus does a great job here.

It's not doing as great as it used to here. The amount of spam that SA is processing is about 4X what it was in January. If this keeps up, we'll have to look at other possible options, maybe more RBLs?

Bret
Re: BIZ_TLD and INFO_TLD
Still seem to be mostly spammers here. There is a slight increase in ham, but I don't think it would really change the scores all that much. I have both of these domains scored at 5 with no problems. Loren
Re: Block "wrote:" spams
I haven't seen any of these. But if the spams universally have "single word wrote: stuff" as the subject, then I'd consider a more stringent rule:

    /^\w+\s+wrote:/i

or

    /^(?:\w+\s+){1,2}wrote:/i

or

    /^(?:re:\s*|fw:\s*){0,20}(?:\w+\s+){1,2}wrote:/i

Loren

----- Original Message -----
From: Juan Mas
To: MIKE YRABEDRA
Cc: spamassassin-users
Sent: Friday, November 03, 2006 7:15 AM
Subject: Re: Block "wrote:" spams

> I've been getting the same and just wrote a rule for it today. I've got what you have listed below. Haven't tested it, though.
>
> On 11/3/06, MIKE YRABEDRA [EMAIL PROTECTED] wrote:
>> I am getting a lot of these "Bob wrote: " spams
>>
>> Anyone know a way to write the rule so if the subject has "wrote:" in the subject, tag it?
>>
>> Here is what I have?
>>
>>     header WROTE_SUB Subject =~ /\bwrote\:\b/i
>>     describe WROTE_SUB Wrote in Subject
>>     score WROTE_SUB 3.0
>>
>> --
>> Mike Yrabedra B^)
>
> --
> -Juan
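The patterns in this thread can be tried against a few sample subjects before going into local.cf. The sketch below uses Python's re module rather than SpamAssassin itself; the constructs involved (\b, \w, \s, non-capturing groups, bounded repeats) behave the same way there, and the sample subjects are made up. One thing it shows: the trailing \b in the original rule only matches when a word character directly follows the colon, so a subject like "Bob wrote: hello" slips past it.

```python
import re

# The rule from the original question and the strictest of the suggested ones.
original = re.compile(r"\bwrote\:\b", re.IGNORECASE)
stricter = re.compile(
    r"^(?:re:\s*|fw:\s*){0,20}(?:\w+\s+){1,2}wrote:", re.IGNORECASE
)

# Made-up sample subjects for illustration.
subjects = [
    "Bob wrote:",
    "Re: Bob wrote: hello",
    "He wrote:this",
    "I wrote a letter",
]

for subject in subjects:
    print(
        repr(subject),
        "original:", bool(original.search(subject)),
        "stricter:", bool(stricter.search(subject)),
    )
```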
Re: BIZ_TLD and INFO_TLD
On 3 November 2006 at 18:20, Loren Wilton wrote:
> Still seem to be mostly spammers here. There is a slight increase in ham, but I don't think it would really change the scores all that much. I have both of these domains scored at 5 with no problems.

Why don't you use the simplex algorithm (or similar) to compute optimal scores?

--
With regards: Imre Péntek
E-Mail: [EMAIL PROTECTED]
Bayesian scores
Hello,

Why does BAYES_99 have a score of only 3.5 when 5.0 is required to identify a mail as spam? I think this rule should have a score of about 5.1 (or anything greater than 5.0).

--
With regards: Imre Péntek
E-Mail: [EMAIL PROTECTED]
Re: Bayesian scores
Péntek Imre wrote:
> Hello,
>
> Why does BAYES_99 have a score of only 3.5 when 5.0 is required to identify a mail as spam? I think this rule should have a score of about 5.1 (or anything greater than 5.0).

Because if it's wrong in its classification, then that one rule alone will cause a FP. The whole idea is that no single rule causes a message to be tagged either way (except maybe whitelist/blacklist).

Anyway, if you want, change the score of the rule. I've upped the scores on almost all bayes rules here because history has shown it to be incredibly accurate here.

-Jim
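The point about no single rule deciding the outcome comes down to simple addition: a message is tagged when the scores of the rules that hit add up to the required threshold. A tiny sketch, using the 3.5 and 5.1 values from this thread and a made-up second rule (OTHER_RULE and its 2.0 score are purely illustrative):

```python
def is_spam(hits, scores, required=5.0):
    """Tag a message when the summed scores of the rules that hit reach `required`."""
    return sum(scores[name] for name in hits) >= required


default_scores = {"BAYES_99": 3.5, "OTHER_RULE": 2.0}  # OTHER_RULE is hypothetical
raised_scores = {"BAYES_99": 5.1, "OTHER_RULE": 2.0}

# With BAYES_99 at 3.5, a Bayes hit alone is not enough; it needs company.
print(is_spam(["BAYES_99"], default_scores))
print(is_spam(["BAYES_99", "OTHER_RULE"], default_scores))

# With BAYES_99 at 5.1, a single (possibly mistaken) Bayes hit tags the message.
print(is_spam(["BAYES_99"], raised_scores))
```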
Re: Block "wrote:" spams
There's a rule that matches them in 3.1.x sa-update, fwiw.

--j.

Loren Wilton writes:
> I haven't seen any of these. But if the spams universally have "single word wrote: stuff" as the subject, then I'd consider a more stringent rule:
>
>     /^\w+\s+wrote:/i
>
> or
>
>     /^(?:\w+\s+){1,2}wrote:/i
>
> or
>
>     /^(?:re:\s*|fw:\s*){0,20}(?:\w+\s+){1,2}wrote:/i
>
> Loren
blocking mail gateways
I have started to receive a flood of spam that is getting through SpamAssassin on my server. I have my score set to 4, which I don't think is too high, but this spam is sometimes coming through with scores of .5 or 1.

I want to be able to block the email gateways these things are being sent from. I only have limited configuration of the SpamAssassin server through a web interface (I'm on a shared hosted server). It has a blank score field, and I was wondering if there is something I can put in there to tell it to block emails coming through these addresses.

Thanks.
Re: Bayesian scores
Jim Maul wrote:
> I've upped the scores on almost all bayes rules here because history has shown it to be incredibly accurate here.

Yes. BTW, so far I've got no FPs, but I still get false negatives that hit BAYES_99 at score 3.5, using this database:

    [5816] dbg: bayes: corpus size: nspam = 2757, nham = 1403

I built it from scratch myself, and it is still growing. With such a big database there's very little chance of a mistaken Bayesian score, and since I've built this database from scratch, I can also say that the same holds for small Bayesian databases. So I will use a score of 5.1 for BAYES_99, and I still suggest using this in the SA distribution too.

Thanks for helping me anyway.

--
With regards: Imre Péntek
E-Mail: [EMAIL PROTECTED]
Re: Bayesian scores
Péntek Imre wrote:
> Jim Maul wrote:
>> I've upped the scores on almost all bayes rules here because history has shown it to be incredibly accurate here.
>
> Yes. BTW, so far I've got no FPs, but I still get false negatives that hit BAYES_99 at score 3.5, using this database:
>
>     [5816] dbg: bayes: corpus size: nspam = 2757, nham = 1403
>
> I built it from scratch myself, and it is still growing. With such a big database there's very little chance of a mistaken Bayesian score, and since I've built this database from scratch, I can also say that the same holds for small Bayesian databases. So I will use a score of 5.1 for BAYES_99, and I still suggest using this in the SA distribution too.
>
> Thanks for helping me anyway.

If you are getting false negatives with 3.5, then you need to find a way to get more rules to hit. My average spam score here is 16.1, which is way over my 5.0 threshold. The trick is to increase the distance between your average spam and ham scores as much as possible, and then you can run with a higher spam threshold. If you have spam not getting tagged, you should increase the number of rules that trigger, not lower your threshold.

Are you using network tests, razor, surbl, add-on rules from SARE, etc.?

-Jim
Re: Enabling/testing SPF?
Ramprasad <ram at netcore.co.in> writes:
> Check the output of:
>
>     spamassassin -D file 2>&1 | grep -i spf
>
> Which MTA do you use? Your MTA must insert an X-Envelope-From: header (or similar).
>
> Thanks
> Ram

Hi. I'm using sendmail, so I see that I have to modify sendmail.cf by adding

    H?l?X-Envelope-From: $f

By the way, how can I add that bit via sendmail.mc instead of modifying sendmail.cf directly?

Anyway, this is what I get with the sample non-spam:

    [EMAIL PROTECTED] Mail-SpamAssassin-3.1.7]# spamassassin -D sample-nonspam.txt 2>&1 | grep -i spf
    [25342] dbg: config: read file /usr/share/spamassassin/25_spf.cf
    [25342] dbg: config: read file /usr/share/spamassassin/60_whitelist_spf.cf

Even with a piece of mail that I had saved from a domain that has a confirmed SPF record (and an X-Envelope-From: header), I get the same output as above.

Thanks.
--Henry

P.S. Sorry if this is a dupe. Wasn't sure if this got sent, as Pine complained about my mailbox when I tried to send it earlier.
Re: Bayesian scores
Jim Maul wrote:
> Are you using network tests, razor, surbl, add-on rules from SARE, etc.?

I can only guess, as I don't know how to find out for sure. I can find several spams marked with:

    RCVD_IN_BL_SPAMCOP_NET
    UNPARSEABLE_RELAY
    URIBL_AB_SURBL

Do these mean I also use network tests?

As far as I can see, I don't use razor. I will read the wiki page about it.

--
With regards: Imre Péntek Jr.
E-Mail: [EMAIL PROTECTED]
R: BIZ_TLD and INFO_TLD
> On 3 November 2006 at 18:20, Loren Wilton wrote:
>> Still seem to be mostly spammers here. There is a slight increase in ham, but I don't think it would really change the scores all that much. I have both of these domains scored at 5 with no problems.
>
> Why don't you use the simplex algorithm (or similar) to compute optimal scores?

I don't have a reliable ham corpus: my customers mostly use pop3...

---
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100
NEVER send an e-mail to: [EMAIL PROTECTED]

> --
> With regards: Imre Péntek
> E-Mail: [EMAIL PROTECTED]
Re: Bayesian scores
Péntek Imre wrote:
> Jim Maul wrote:
>> Are you using network tests, razor, surbl, add-on rules from SARE, etc.?
>
> I can only guess, as I don't know how to find out for sure. I can find several spams marked with: RCVD_IN_BL_SPAMCOP_NET UNPARSEABLE_RELAY URIBL_AB_SURBL. Do these mean I also use network tests?

I am not sure. It would seem so to me. Make sure you do not have -L being passed when starting spamd.

> As far as I can see, I don't use razor. I will read the wiki page about it.

Definitely! Razor is by far one of the top performing rules on many SA setups. It works great.

-Jim
Re: R: BIG increase in spam today
Federico Giannici wrote:
> François Rousseau wrote:
>> Greylisting is not always good... Greylisting inserts a delay in delivery, and sometimes the email has to be delivered fast.
>
> I don't trust DNSBLs enough to completely block an email based only on them.
>
> What about combining blacklisting and greylisting? I'd like to use greylists (with a long delay) for blacklisted emails only.
>
> Has anybody already implemented it? Is there already something able to implement it?

This was asked on the Postfix list recently:
http://groups.google.com/group/list.postfix.users/browse_thread/thread/5146269c41c5ca9d

The best answer was:
http://www.orangegroove.net/code/marbl/
Re: Bayesian scores
Jim Maul wrote:
> I am not sure. It would seem so to me. Make sure you do not have -L being passed when starting spamd.

I've started reading that wiki page, so now I can test for sure:

    $ spamassassin -t -D < spam > output 2>&1
    $ grep network output
    [6639] dbg: pyzor: network tests on, attempting Pyzor
    [6639] dbg: reporter: network tests on, attempting SpamCop

Thanks for the suggestions.

--
With regards: Imre Péntek
E-Mail: [EMAIL PROTECTED]
Re: sa-learn training question(s)
Thanks for the feedback. One last question that I am currently tossing around. Sitewide vs individual learning... I have a small domain, less then 50 users. Should I be looking at setting up a sitewide bayes database instead of individual ones? Again I find conflicting information when I dig into it on the web. I find myself thinking that one persons Spam may be another's legitimate advertising... - JasonOn 11/3/06, Bowie Bailey [EMAIL PROTECTED] wrote: Matt Kettler wrote: Jason Wellman wrote: ... I have all incoming mail that is tagged as Spam delivered to a CaughtSpam IMAP box for each user. ... Should I also have sa-learn from the CaughtSpam folder?I have read some places that say yes, and some that say no. YES. Those that say no clearly do not know what they're talking about. Ummm...Do you really want to sa-learn from an unverified spam folder? Lets face it.. if there was no point in learning tagged spam, why does the autolearner only kick in on high-scoring spam? The autolearner kicks in only on high-scoring spam to avoid learningfrom false positives.Learning from the CaughtSpam folder is likedropping the autolearn threshold down to 5.0 and removing the header/body score requirements. That said, it will only learn the caught spam that wasn't already autolearned, but this is actually quite valuable as it will generally contain more of the borderline spam which is important for bayes to know about.You do want to learn from as much spam (and ham) as possible, but youwant a human to sort it first.I would say that you should only learnfrom the IsSpam folder and encourage your users to copy the spam over from the CaughtSpam folder to the IsSpam folder after they'veverified that there are no false positives. 
Second question. It is easy to tell a user (and some of mine are non-tech folks) to put Spam in the IsSpam folder, but there isn't a way to really tell them that they need to put HAM in a certain folder, they just don't understand it. So my second question is how are people feeding sa-learn good HAM? That depends a lot on the user. Some are good, some not so good. Most will generally do this only when they're getting FPs, but that's still handy. Agreed. Just tell them to do it. If they do, great! If not, you still get the false positives. In the end, they are the ones responsible for making the Bayes DB effective. If they don't want to help, there's not much you can do about it. I was toying with the idea of feeding in people's Sent folders along with all messages from their INBOX and Trash that were marked as read (I can pull these out using mboxgrep). This would also give me a larger sample of HAM than Spam which I understand is a good thing. Can anyone poke holes in my logic on this, or point out a better source for me to scrape HAM to feed sa-learn? Well, doing inbox and trash, you'll autolearn any false-negatives that your user happened to read and did not move to the IsSpam.. If you don't trust them to force-feed good ham, this might not be a good idea. Sent would appear to be fine.. unless your users are really dumb and frequently reply to spam. Ditto. Sent might be ok, but Trash is probably a bad idea. -- Bowie
Re: Enabling/testing SPF?
Ramprasad ram at netcore.co.in writes:

  spamassassin -D < file 2>&1 | grep -i spf

Check the output. Which MTA do you use? Your MTA must insert an X-Envelope-From: header (or similar). Thanks, Ram

Hi, After some more banging my head against the wall, I discovered that SPF checking was disabled because I wasn't loading the plugin in my init.pre. Apparently my init.pre is so old that it never included a section on SPF. Every time I upgraded, the new version of SA would not replace my old init.pre, so the SPF plugin was never getting loaded. After I inserted the loadplugin section into init.pre and restarted spamd, SPF checking is now working. Doh! Thanks for your help.
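A note for the archive: the fix described above comes down to one line in init.pre. This is only a minimal sketch, assuming a SpamAssassin 3.1 style install where init.pre sits in /etc/mail/spamassassin (the path is an assumption and differs between distributions), and assuming the Perl SPF module the plugin relies on is already installed.

```
# /etc/mail/spamassassin/init.pre  (path is an assumption)
# Load the SPF plugin so that the SPF_* rules can run.
loadplugin Mail::SpamAssassin::Plugin::SPF
```

After restarting spamd, running a test message through spamassassin -D and grepping the debug output for "spf" should show whether the plugin is now active.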
How to disable IADB
Hi, One of my users gets lots of similar UCE, and learning doesn't help a bit. Investigating the report headers, it seems the mails trigger 'IADB' rules, which seem to be an RBL whitelist (70_iadb.cf, 20_dnsbl_tests.cf). Is there a way to disable this 'feature' without editing those files? Regards, -- Henk van Lingen, Systems Network Administrator, Dept. of Computer Science, Utrecht University. phone: +31-30-2535278 http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/
Re: How to disable IADB
On Fri, Nov 03, 2006 at 09:02:46PM +0100, Henk van Lingen wrote: Is there a way to disable this 'feature', without editting those files? Set the rule scores to 0. -- Randomly Selected Tagline: She's gonna say my name! --Ralph Wiggum Lisa Gets an A (Episode AABF03)
Re: SA TIMED OUT message debian sarge
On 11/3/06, Mark Martinec [EMAIL PROTECTED] wrote: On Friday November 3 2006 05:23, Matt Kettler wrote: I believe the option is $sa_timeout Not sure what the default is, probably 30. Which should be enough to prevent that problem, unless you have a LOT of sa instances contending for the AWL database. Try adding a $sa_timeout = 60 to your Amavisd.conf and lock_method flock to your spamassassin/local.cf (if you don't use NFS for DB storage.) Thanks for all the replies on this topic.. With a combination of the answers, i *seem* to have it sorted as well as a couple of good hints to increase speed etc. Thanks again.
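For reference, the SpamAssassin half of the suggestion quoted above is a single local.cf setting. A minimal sketch, assuming the Bayes and AWL database files live on a local disk (flock-style locking is not meant for NFS, as the quoted advice already says); the $sa_timeout change belongs in amavisd.conf and is not shown here.

```
# local.cf
# Use flock-based locking on the Bayes/AWL database files.
# Only appropriate when the databases are NOT stored on NFS.
lock_method flock
```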
Re: Relay Checker plugin v0.2
Stuart Johnston wrote: John Rudd wrote: I've put up a new version of Relay checker, in ... I expect I might, at some point, switch from using a dynamic score in the plugin, to a normal score. But that's the only change I expect to make, aside from bug fixes (if there are any), and/or a switch to using Net::DNS. I wonder if there is any way for a plugin to hook into SA's DNS routines. That might be better than calling Net::DNS directly. If anyone knows of a way, I'd look into it. I need to do both fwd and reverse lookups though.
Re: How to disable IADB
On Fri, Nov 03, 2006 at 03:06:10PM -0500, Theo Van Dinter wrote: On Fri, Nov 03, 2006 at 09:02:46PM +0100, Henk van Lingen wrote: Is there a way to disable this 'feature', without editting those files? Set the rule scores to 0.

Oke, of course. There are however 28 such rules at the moment.

  grep IADB /var/lib/spamassassin/3.001007/*/* | grep score | wc
       28      87    2879

They all get tested every time. I'd hoped for a 'skip_rbl_checks alike' check, or something. Thanks anyways, -- Henk van Lingen, Systems Network Administrator, Dept. of Computer Science, Utrecht University. phone: +31-30-2535278 http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/
RE: sa-learn training question(s)
Jason Wellman wrote: Thanks for the feedback. One last question that I am currently tossing around. Sitewide vs individual learning... I have a small domain, less then 50 users. Should I be looking at setting up a sitewide bayes database instead of individual ones? Again I find conflicting information when I dig into it on the web. I find myself thinking that one persons Spam may be another's legitimate advertising... In general, individual databases are better than site-wide *IF* they are well trained. A well trained site-wide database is better than a bunch of individual databases that only get autolearning. Also, keep in mind that each database will require learning from 200 ham and 200 spam before it becomes operational. So make sure your users aren't expecting an overnight improvement. -- Bowie
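If the site-wide route is chosen, the usual approach is to point every user at one shared database. The fragment below is only a sketch under stated assumptions: the directory /var/spamassassin/bayes is hypothetical, and it must exist and be writable by whatever account(s) run SpamAssassin and sa-learn.

```
# local.cf: one shared Bayes database for the whole site
# The last path component is the file name prefix
# (bayes_toks, bayes_seen, ...). Directory is hypothetical.
bayes_path      /var/spamassassin/bayes/bayes
# Group-writable files so several accounts can update the DB.
# Keep the x bits set; the mode may also be used for directories.
bayes_file_mode 0770
```

With per-user databases, simply leave bayes_path at its default, and each user gets a database under their own home directory.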
Re: Spam
you will get a format that's more suitable to put in the headers.

What do you mean? What do these two options do? I found nothing on the SpamAssassin site. At the moment I use Bayes and the emails are marked like this in the header: But some emails come through the SpamAssassin filter like this, from the header:

  Return-path: [EMAIL PROTECTED]
  Envelope-to: [EMAIL PROTECTED]
  Received: from xdsl-10369.wroclaw.dialog.net.pl ([84.40.242.129]) by 89-149-XXX-125.internetserviceteam.com with esmtp (Exim 4.50) id 1GfnGR-0002Fg-Nw for [EMAIL PROTECTED]; Fri, 03 Nov 2006 01:50:36 +0100
  Message-ID: [EMAIL PROTECTED]
  From: cases [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  Subject: me configure instead editing
  Date: Fri, 3 Nov 2006 01:51:19 +0100
  MIME-Version: 1.0
  Content-Type: multipart/related; type=multipart/alternative; boundary==_NextPart_000_0006_01C6FEEA.8EE1C180
  X-Priority: 3
  X-MSMail-Priority: Normal
  X-Mailer: Microsoft Outlook Express 6.00.2900.2869
  X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2962

It contains a picture in the body of the email and this text:

Function active am way! Destroy is without am closing in underlying handles odd even? Mutual exclusion see it exec? Its is Looking am direction? It exec why of earth a we say ill. Hash sha Melissa Schrumpf gtgtgti in gtgtgtthe. Without closing underlying handles odd even in actively redirected anyways. Active way Sciences Division in Technology. Output actually appears in. Gtgt yes similar issue program somehow or got confused a state. Gravereaux may tue is Tues Begin pgp Signed Hash sha. Appear am first remove. Way Sciences Division! More Wish console or unresponse when script of started from. Posted in will email visible? Parents a call them freaky true Thats.

So what can I do here? Thx, Marcus
Ham Learning
Hello, when I learn some emails as ham with sa-learn, I get this error message: Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/Mail/SpamAssassin/HTML.pm line 182. Can somebody explain to me what this means? Bye, Marcus
Re: How to disable IADB
On Fri, Nov 03, 2006 at 09:38:27PM +0100, Henk van Lingen wrote: Oke, of course. There are however 28 such rules at the moment.

Technically the only one that matters is __RCVD_IN_IADB:

  score __RCVD_IN_IADB 0

The rest look at the results generated by that rule, so if that rule doesn't run ...

I'd hoped for a 'skip_rbl_checks alike' check, or something.

patches to make rule groups welcome. :) -- Randomly Selected Tagline: It was real. At least, if it wasn't real, it did support them, and as that is what sofas are supposed to do, this, by any test that mattered, was a real sofa.
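Put together as a local override, the suggestion above might look like the fragment below. It is a sketch, assuming the override goes into a site file such as /etc/mail/spamassassin/local.cf (the file location is an assumption) so that the distributed 70_iadb.cf and 20_dnsbl_tests.cf stay untouched.

```
# local.cf (or any other .cf file in the site config directory)
# Zero the lookup sub-rule. The RCVD_IN_IADB_* rules only
# evaluate the result of this lookup, so they stop hitting too.
score __RCVD_IN_IADB 0
```

Running spamassassin --lint afterwards is a cheap way to confirm the file still parses, and a spamd restart is needed before the change takes effect.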
Re: Amazon / RFCI false positives
Seems pretty accurate to me since I have accounts that have been returning 550: User Unknown smtp rejects for 2+ years that still receive mail from Amazon on a weekly/monthly basis. Same thing for several airline mileage programs, big name stock brokerages, etc. On Friday 03 November 2006 08:23, Tony Finch wrote: On Fri, 3 Nov 2006, Ralf Hildebrandt wrote: * Tony Finch [EMAIL PROTECTED]: Amazon.co.uk was listed by RFC-Ignorant at the start of this week, and it is now scoring more than 5: DNS_FROM_RFC_DSN 2.87, DNS_FROM_RFC_POST 1.44, FROM_EXCESS_BASE64 1.05. Amazon.co.uk is not listed: http://www.rfc-ignorant.org/tools/lookup.php?domain=Amazon.co.uk My mistake: I cited the wrong domain. Try bounces.amazon.com which they use in the return path of their messages (I guess for all their international domains) http://www.rfc-ignorant.org/tools/lookup.php?domain=bounces.amazon.com Tony.
Re: How to disable IADB
Henk van Lingen wrote: On Fri, Nov 03, 2006 at 03:06:10PM -0500, Theo Van Dinter wrote: On Fri, Nov 03, 2006 at 09:02:46PM +0100, Henk van Lingen wrote: Is there a way to disable this 'feature', without editting those files? Set the rule scores to 0. Oke, of course. There are however 28 such rules at the moment. grep IADB /var/lib/spamassassin/3.001007/*/* | grep score | wc 28 87 2879 They all get tested every time. I'd hoped for a 'skip_rbl_checks alike' check, or something. Thanks anyways,

How about:

  perl -n -e 'if (/(score RCVD_IN_IADB\w*)/) { print "$1 0\n" }' \
    /var/lib/spamassassin/3.001003/updates_spamassassin_org/70_iadb.cf \
    > /etc/mail/spamassassin/disable_iadb.cf
SA TIMED OUT message debian sarge (new error)
Hi There, Looks like ive solved one issue, and another crops up!... I think that i may need to move to a mysql storage engine here? approx 17,000 messages a day incoming on this server. Any pointers here? - Thanks!!

Nov 4 11:39:40 mx1 amavis[32148]: (32148-07) SA TIMED OUT, backtrace: at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171\n\teval {...} called at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171\n\tMail::SpamAssassin::DBBasedAddrList::remove_entry('Mail::SpamAssassin::DBBasedAddrList=HASH(0xa881df0)', 'HASH(0xa6bc474)') called at /usr/share/perl5/Mail/SpamAssassin/AutoWhitelist.pm line 134\n\tMail::SpamAssassin::AutoWhitelist::check_address('Mail::SpamAssassin::AutoWhitelist=HASH(0xa87eba8)', '[EMAIL PROTECTED] adv.com', 82.227.79.148) called at /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 355\n\teval {...} called at /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 351\n\tMail::SpamAssassin::Plugin::AWL::check_from_in_auto_whitelist('Mail::SpamAssassin::Plugin::AWL=HASH(0xa09da08)', 'Mail::SpamAssassin::PerMsgStatus=HASH(0xa67060c)') called at (eval 280) line 7\n\tMail::SpamAssassin::PerMsgStatus::check_f...
RE: Amazon / RFCI false positives
-Original Message- From: Tony Finch [mailto:[EMAIL PROTECTED] On Behalf Of Tony Finch Sent: Friday, November 03, 2006 9:59 AM To: users@spamassassin.apache.org Subject: Amazon / RFCI false positives Amazon.co.uk was listed by RFC-Ignorant at the start of this week, and it is now scoring more than 5: DNS_FROM_RFC_DSN 2.87, DNS_FROM_RFC_POST 1.44, FROM_EXCESS_BASE64 1.05. Not a false positive if their servers are broken. Looks like their servers are broken. They can either fix their servers, or you can disable the tests.
RE: Amazon / RFCI false positives
-Original Message- From: Michael Scheidell Sent: Friday, November 03, 2006 6:32 PM To: Tony Finch; users@spamassassin.apache.org Subject: RE: Amazon / RFCI false positives -Original Message- From: Tony Finch [mailto:[EMAIL PROTECTED] On Behalf Of Tony Finch Sent: Friday, November 03, 2006 9:59 AM To: users@spamassassin.apache.org Subject: Amazon / RFCI false positives Amazon.co.uk was listed by RFC-Ignorant at the start of this week, and it is now scoring more than 5: DNS_FROM_RFC_DSN 2.87, DNS_FROM_RFC_POST 1.44, FROM_EXCESS_BASE64 1.05. Not a false positive if their servers are broken. Looks like their servers are broken. They can either fix their servers, or you can disable the tests.

Yep, still broken:

  host -t mx bounces.amazon.com
  bounces.amazon.com mail is handled by 10 bounces-0101.amazon.com.
  bounces.amazon.com mail is handled by 10 bounces-2102.amazon.com.
  bounces.amazon.com mail is handled by 10 bounces-2101.amazon.com.
  bounces.amazon.com mail is handled by 10 bounces-0102.amazon.com.

  smb-250# telnet bounces-0101.amazon.com 25
  Trying 207.171.178.149...
  telnet: connect to address 207.171.178.149: Connection refused
  telnet: Unable to connect to remote host

  smb-250# telnet bounces-2102.amazon.com 25
  Trying 207.171.160.55...
  telnet: connect to address 207.171.160.55: Connection refused
  telnet: Unable to connect to remote host

  smb-250# telnet bounces-2101.amazon.com 25
  Trying 207.171.160.54...
  telnet: connect to address 207.171.160.54: Connection refused
  telnet: Unable to connect to remote host

  smb-250# telnet bounces-0102.amazon.com 25
  Trying 207.171.178.150...
  telnet: connect to address 207.171.178.150: Connection refused
  telnet: Unable to connect to remote host
Re: Relay Checker plugin v0.2
John Rudd wrote: Stuart Johnston wrote: John Rudd wrote: I've put up a new version of Relay checker, in ... I expect I might, at some point, switch from using a dynamic score in the plugin, to a normal score. But that's the only change I expect to make, aside from bug fixes (if there are any), and/or a switch to using Net::DNS. I wonder if there is any way for a plugin to hook into SA's DNS routines. That might be better than calling Net::DNS directly. If anyone knows of a way, I'd look into it. I need to do both fwd and reverse lookups though.

The simple version might look like:

  # Get resolver
  my $dns = $pms->{parser_dns_pms};
  # Reverse lookup (IP address to hostname)
  my $hostname = $dns->lookup_ptr($ip);
  # Forward lookup (hostname to list of addresses)
  my @addrs = $dns->lookup_a($hostname);

I'm not sure if the above code is really in any way better than the way you have it now. There are also functions for doing DNS in the background, but I don't know if that would be practical or helpful for your plugin. You also might consider using the rdns that SA has already calculated, to save one query:

  $hostname = $relay->{rdns};
RE: SA TIMED OUT message debian sarge (new error)
Hi There, Looks like ive solved one issue, and another crops up!... I think that i may need to move to a mysql storage engine here? approx 17,000 messages a day incoming on this server. Any pointers here? - Thanks!!

Nov 4 11:39:40 mx1 amavis[32148]: (32148-07) SA TIMED OUT, backtrace: at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171\n\teval {...} called at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171\n\tMail::SpamAssassin::DBBasedAddrList::remove_entry('Mail::SpamAssassin::DBBasedAddrList=HASH(0xa881df0)', 'HASH(0xa6bc474)') called at /usr/share/perl5/Mail/SpamAssassin/AutoWhitelist.pm line 134\n\tMail::SpamAssassin::AutoWhitelist::check_address('Mail::SpamAssassin::AutoWhitelist=HASH(0xa87eba8)', '[EMAIL PROTECTED] adv.com', 82.227.79.148) called at /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 355\n\teval {...} called at /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 351\n\tMail::SpamAssassin::Plugin::AWL::check_from_in_auto_whitelist('Mail::SpamAssassin::Plugin::AWL=HASH(0xa09da08)', 'Mail::SpamAssassin::PerMsgStatus=HASH(0xa67060c)') called at (eval 280) line 7\n\tMail::SpamAssassin::PerMsgStatus::check_f...

This could simply be what SpamAssassin was doing at the point you ran out of time. One possible reason for timeouts is that sa-learn is running an expiry, and possibly learning a message at the same time. The Debian package of amavisd-new has a cron entry that runs --force-expire once a day (/etc/cron.daily/amavisd-new). You can disable opportunistic expiry by setting:

  bayes_auto_expire 0

in local.cf, but MAKE SURE the script works or Bayes will grow forever. Simply run it. If it takes a minute to run, it's very likely working. The script may be outdated also. The important part should read something like:

  su - amavis -c '/usr/bin/sa-learn --sync --force-expire >/dev/null'

Moving to MySQL helps considerably: http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html

Gary V
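For readers weighing the MySQL move mentioned above, the SpamAssassin side is a handful of local.cf lines. This is a sketch only: the database name sa_bayes, the account sa_user and the password are placeholders, and the tables have to be created first from the SQL schema shipped with SpamAssassin. The linked guide remains the fuller reference.

```
# local.cf: keep the Bayes data in MySQL instead of DB files
# Database name, user and password below are placeholders.
bayes_store_module        Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn             DBI:mysql:sa_bayes:localhost
bayes_sql_username        sa_user
bayes_sql_password        changeme
# Optional: store everything under one Bayes user name
# (here the amavis account) regardless of who runs the scan.
bayes_sql_override_username amavis
# Expiry then happens only through the daily cron job.
bayes_auto_expire         0
```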
Re: SA TIMED OUT message debian sarge (new error)
Simon, Looks like ive solved one issue, and another crops up!... I think that i may need to move to a mysql storage engine here? approx 17,000 messages a day incoming on this server. Any pointers here? - Thanks!! Nov 4 11:39:40 mx1 amavis[32148]: (32148-07) SA TIMED OUT, backtrace: at /usr/share/perl5/Mail/SpamAssassin/DBBasedAddrList.pm line 171 ... /usr/share/perl5/Mail/SpamAssassin/AutoWhitelist.pm line 134 ... /usr/share/perl5/Mail/SpamAssassin/Plugin/AWL.pm line 355 Move AWL to SQL, if you haven't already. It is not too bad to start from scratch with an empty AWL database, it is probably not worth salvaging your existing AWL. Mark
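A rough idea of what moving the AWL to SQL involves on the SpamAssassin side, as a sketch under assumptions: the database name sa_awl, the account sa_user and the password are placeholders, and the awl table must already exist (SpamAssassin ships a schema for it).

```
# local.cf: auto-whitelist in SQL instead of a DB file
# Database name, user and password below are placeholders.
auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn           DBI:mysql:sa_awl:localhost
user_awl_sql_username  sa_user
user_awl_sql_password  changeme
user_awl_sql_table     awl
```

Starting with an empty table, as suggested above, means no migration step is needed; the AWL simply rebuilds its history from new mail.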
RE: Amazon / RFCI false positives
On Fri, 3 Nov 2006, Michael Scheidell wrote: Not a false positive if their servers are broken. True from the RFCI point of view, but NOT true from the SpamAssassin point of view. These messages are wanted by their recipients so should not be scored as spam by SpamAssassin. Tony. -- f.a.n.finch [EMAIL PROTECTED] http://dotat.at/ LUNDY FASTNET: EAST 3 OR 4, BECOMING VARIABLE 3 OR LESS LATER. SLIGHT OCCASIONALLY MODERATE. FAIR. GOOD.
Re: sa-learn training question(s)
Bowie Bailey wrote: Matt Kettler wrote: Jason Wellman wrote: ... I have all incoming mail that is tagged as Spam delivered to a CaughtSpam IMAP box for each user. ... Should I also have sa-learn from the CaughtSpam folder? I have read some places that say yes, and some that say no. YES. Those that say no clearly do not know what they're talking about. Ummm...Do you really want to sa-learn from an unverified spam folder? Good point.. I took that to be a question about should I avoid sa-learning mail that was already tagged.. A massive misread on my part.. need more coffee. Lets face it.. if there was no point in learning tagged spam, why does the autolearner only kick in on high-scoring spam? The autolearner kicks in only on high-scoring spam to avoid learning from false positives. Learning from the CaughtSpam folder is like dropping the autolearn threshold down to 5.0 and removing the header/body score requirements. Again, I was off base, but if you re-read it in-context of how I originaly read it, it makes sense. That said, it will only learn the caught spam that wasn't already autolearned, but this is actually quite valuable as it will generally contain more of the borderline spam which is important for bayes to know about. You do want to learn from as much spam (and ham) as possible, but you want a human to sort it first Aye.
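The autolearn thresholds discussed in this thread are ordinary local.cf settings. The values below are the stock defaults as far as I know for the 3.1 series, shown only to make the comparison with the 5.0 tagging threshold concrete.

```
# local.cf: autolearn settings (believed to be the 3.1 defaults)
bayes_auto_learn                   1
# Only mail scoring at least this high is autolearned as spam ...
bayes_auto_learn_threshold_spam    12.0
# ... and only mail scoring below this is autolearned as ham.
bayes_auto_learn_threshold_nonspam 0.1
```

Feeding an unverified CaughtSpam folder to sa-learn --spam effectively bypasses the 12.0 gate, which is why a human check first is the safer habit.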
Re: Ham Learning
Markus Braun wrote: Hello, when i learn with sa-learn some emails as ham i get this error message: Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/Mail/SpamAssassin/HTML.pm line 182. Can somebody explain me what this mean? It's normal.. but AFAIK, that message should be suppressed in reasonably recent versions of SA.. AFAIK only early SA 3.0 or 2.6 should generate that message.
Re: Amazon / RFCI false positives
From: Tony Finch [EMAIL PROTECTED] On Fri, 3 Nov 2006, Michael Scheidell wrote: Not a false positive if their servers are broken. True from the RFCI point of view, but NOT true from the SpamAssassin point of view. These messages are wanted by their recipients so should not be scored as spam by SpamAssassin. Kinda tough, ain't it? You could setup a whitelist_from_rcvd for Amazon, though. {^_^}
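A sketch of the whitelist idea, with loud caveats: the sender pattern and the relay domain below are assumptions that should be checked against the Return-Path and Received headers of real Amazon mail on your own system, and whitelist_from_rcvd only works when trusted_networks is set correctly.

```
# local.cf
# Whitelist Amazon's bounce-address senders, but only when the
# mail was handed over by a relay whose rDNS ends in amazon.com.
# Both values are assumptions; verify them against real headers.
whitelist_from_rcvd *@bounces.amazon.com amazon.com

# Alternative mentioned earlier in the thread: disable the
# RFC-Ignorant tests outright (affects all senders).
# score DNS_FROM_RFC_DSN  0
# score DNS_FROM_RFC_POST 0
```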
Re: Bayesian scores
Modify the score if you think that is appropriate. (I do. I score it at 5.1. The .1 is so I can be obnoxious in arguments about this, like the argument which may start with your message.) If your Bayes is VERY well trained, with VERY few hams that come in as BAYES_99, like 1 in 1000 or less, then raising the score may be called for. Raise the score in modest steps until you see BAYES_99 on ham messages. Then back off a little. {^_-} - Original Message - From: Péntek Imre [EMAIL PROTECTED] Hello, Why does BAYES_99 have only the score 3.5 while 5.0 is required to identify a mail as spam? I think this rule should have a score of about 5.1 (or anything greater than 5.0). -- With regards: Imre Péntek E-Mail: [EMAIL PROTECTED]
Re: Bayesian scores
From: Jim Maul [EMAIL PROTECTED] Péntek Imre wrote: Jim Maul wrote: I've upped the scores on almost all bayes rules here because history has shown it to be incredibly accurate here. Yes. BTW so far I've got no FP but still get false negatives with score 3.5, BAYES_99, using this database: [5816] dbg: bayes: corpus size: nspam = 2757, nham = 1403 Built from scratch by myself, still growing. As I have so big database there's very little possibility of mistaken bayesian score, but as I've built this database from scratch, I can also state that the same stands for little bayesian databases too. So I will use score 5.1 for BAYES_99, and still suggest to use this in the SA distribution too. Thanks for helping me anyways. If you are getting false negatives with 3.5 then you need to find a way to get more rules to hit. My average spam score here is 16.1 which is way over my 5.0 threshold. The trick is to increase the distance between your average spam and ham scores as much as possible and then you can run with a higher spam threshold. If you have spam not getting tagged, you should increase rules that trigger, not lower your threshold. Are you using network tests, razor, surbl, add on rules from sare, etc? Jim, if a rule has a history of hitting wrong once in 1000 or 1 times the score should be moved up from what the perceptron shows modulo your mail flow. At 1000 messages a day finding one or even two hams in the spam folder because of a rule scored too high is not severely annoying. You can discover it. You can fix it. This goes for a low volume email system with per user rules and Bayes. For a largish ISP different rules of thumb must apply. Still, a really REALLY good rule can score pretty high before it reveals itself as a problem with false negatives and you have to lower the score a bit. BAYES_99 on a well trained system is one such rule. Tweak scores gently until your tolerance for false positives is exceeded. Then back off a bit, maybe even two notches. {^_^}
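For anyone following along, the override being debated is a one-line change. The sketch below uses the 5.1 figure from this thread purely as an illustration; it is not a recommendation, and it only makes sense on a well-trained database.

```
# local.cf
# Let BAYES_99 alone push a message over a 5.0 threshold.
# Raise in small steps and watch for ham hitting BAYES_99.
score BAYES_99 5.1
```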
Re: Block wrote: spams
And I would restart spamd after installing the rule. {^_-}

- Original Message - From: Loren Wilton [EMAIL PROTECTED]
I haven't seen any of these. But if the spams universally have single word "wrote:" stuff as the subject then I'd consider a more stringent rule:

  /^\w+\s+wrote:/i
or
  /^(?:\w+\s+){1,2}wrote:/i
or
  /^(?:re:\s*|fw:\s*){0,20}(?:\w+\s+){1,2}wrote:/i

Loren

- Original Message - From: Juan Mas
I've been getting the same and just wrote a rule for it today. I've got what you have listed below. Haven't tested it though.

On 11/3/06, MIKE YRABEDRA [EMAIL PROTECTED] wrote: I am getting a lot of these "Bob wrote:" spams. Anyone know a way to write the rule so if the subject has "wrote:" in the subject, tag it? Here is what I have:

  header WROTE_SUB Subject =~ /\bwrote\:\b/i
  describe WROTE_SUB Wrote in Subject
  score WROTE_SUB 3.0

-- Mike Yrabedra B^) -- -Juan
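Combining the pieces of this thread, a complete rule built on Loren's strictest pattern could be sketched as below. The rule name and score are carried over from Mike's draft; the description text is my own wording. One thing worth noting about the original draft: the trailing \b after the colon only matches when a letter, digit or underscore follows the colon directly, so a subject that is just "Bob wrote:" would probably not hit it.

```
# local.cf
# Subject is one or two words followed by "wrote:", optionally
# preceded by Re:/Fw: prefixes. Pattern taken from this thread.
header   WROTE_SUB  Subject =~ /^(?:re:\s*|fw:\s*){0,20}(?:\w+\s+){1,2}wrote:/i
describe WROTE_SUB  Subject looks like a bare "Name wrote:" line
score    WROTE_SUB  3.0
```

As noted above, spamd needs a restart before the new rule is picked up, and spamassassin --lint will catch syntax slips beforehand.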
Re: BIZ_TLD and INFO_TLD
From: Péntek Imre [EMAIL PROTECTED] Still seem to be mostly spammers here. There is a slight increase in ham, but I don't think it would really change the scores all that much. I have both of these domains scored at 5 with no problems. Why don't you use simplex algorithm (or similar) to compute optimal scores? Local experience and laziness. When it becomes a problem we lower it a little. I don't score them as high as he does, though. That's one of the joys of per user scores, rules, and bayes. {^_^}
Re: BIZ_TLD and INFO_TLD
From: Giampaolo Tomassoni [EMAIL PROTECTED] at 2006. november 3. 18.20 Loren Wilton wrote: Still seem to be mostly spammers here. There is a slight increase in ham, but I don't think it would really change the scores all that much. I have both of these domains scored at 5 with no problems. Why don't you use simplex algorithm (or similar) to compute optimal scores? I don't have a reliable ham corpus: my customers mostly use pop3... And getting one for an ISP must be a real sonovabitch, too. {^_^}
Re: Amazon / RFCI false positives
* Tony Finch [EMAIL PROTECTED]: My mistake: I cited the wrong domain. Try bounces.amazon.com which they use in the return path of their messages (I guess for all their international domains) http://www.rfc-ignorant.org/tools/lookup.php?domain=bounces.amazon.com

Yes, correct. My tests show that the MX hosts for bounces.amazon.com do indeed refuse all connections to them. WTF did they break? -- Ralf Hildebrandt (i.A. des IT-Zentrums) [EMAIL PROTECTED] Charite - Universitätsmedizin Berlin, Tel. +49 (0)30-450 570-155, Gemeinsame Einrichtung von FU- und HU-Berlin, Fax. +49 (0)30-450 570-962, IT-Zentrum Standort CBF, send no mail to [EMAIL PROTECTED]