Re: Perfect bayes filter ratio spam/ham
At 07:15 PM 8/27/2005, Torsten Bronger wrote:
> Which bayes filter ratio is better: 1:1 or the natural incoming ratio?

1:1, actually. I was a strong proponent of natural, but Dan Q corrected me. After a lot of thinking about the statistics, it made sense.
Problems with SpamAssassin 3.1 RC1 and MIMEDefang
This is a problem that some mimedefang people are experiencing with SA 3.1 rc1. (mimedefang slave processes are becoming un-killable due to a mis-feature in SA 3.1 which messes with SIGCHLD.)

Begin forwarded message:

From: "David F. Skoll" <[EMAIL PROTECTED]>
Date: August 27, 2005 6:01:28 PM PDT
To: mimedefang@lists.roaringpenguin.com
Subject: Re: [Mimedefang] Problems with SpamAssassin 3.1 RC1 and MIMEDefang

Martin Blapp wrote:
> Please download SA3.1 Pre 1 and try yourself.

I downloaded it, and didn't have to try anything; the problem was obvious after I read the SA 3.1 code. It's a bug in SA 3.1.

Look at the file "Dns.pm", in the routine enter_helper_run_mode. We see this code:

    # enforce SIGCHLD as DEFAULT; IGNORE causes spurious kernel warnings
    # on Red Hat NPTL kernels (bug 1536), and some users of the
    # Mail::SpamAssassin modules set SIGCHLD to be a fatal signal
    # for some reason! (bug 3507)
    $self->{old_sigchld_handler} = $SIG{CHLD};
    $SIG{CHLD} = 'DEFAULT';

There's a leave_helper_run_mode that resets the SIGCHLD handler to its old value. HOWEVER: If the slave dies sometime between enter_helper_run_mode and leave_helper_run_mode, the multiplexor never gets a SIGCHLD signal.

I don't know why the SA developers are even monkeying with the SIGCHLD handler in the Perl module; you'd have to ask them. It seems like a bad idea to me.

I think I have a workaround; I'll release a beta soon. In the meantime, I believe that turning off the embedded interpreter will make it work properly.

Regards,

David.
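To illustrate the kind of window David describes, here is a minimal Python sketch. It is not SpamAssassin or MIMEDefang code and does not reproduce their process layout; the function name `sigchld_seen` is made up for illustration. It installs a SIGCHLD handler, optionally swaps it for the default disposition while a child exits, and then restores it. The assumption being demonstrated is that on typical Unix systems the default disposition for SIGCHLD discards the signal, so a child exiting inside the window is never reported to the restored handler.

```python
import os
import signal
import time


def sigchld_seen(reset_to_default: bool) -> int:
    """Return how many SIGCHLD deliveries our handler saw for one child exit."""
    seen = []
    # Install "our" handler, standing in for code that wants to hear about SIGCHLD.
    original = signal.signal(signal.SIGCHLD, lambda signum, frame: seen.append(signum))
    try:
        saved = None
        if reset_to_default:
            # Analogue of enter_helper_run_mode: remember the handler, set DEFAULT.
            saved = signal.signal(signal.SIGCHLD, signal.SIG_DFL)
        pid = os.fork()
        if pid == 0:
            os._exit(0)        # child: exit immediately
        os.waitpid(pid, 0)     # parent: returns once the child has exited
        if reset_to_default:
            # Analogue of leave_helper_run_mode: put the old handler back.
            signal.signal(signal.SIGCHLD, saved)
        time.sleep(0.05)       # give any pending Python-level handler a chance to run
    finally:
        # Restore whatever was there before; None means "not installed from Python".
        signal.signal(signal.SIGCHLD, original if original is not None else signal.SIG_DFL)
    return len(seen)
```

With the handler left alone, the child's exit is reported once; with the handler swapped to the default for the duration, nothing is reported, even after the old handler is restored. Restoring the handler does not replay signals that arrived while it was absent.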
Re: HELO_DYNAMIC_IPADDR - score too high?
In an older episode (Saturday, 27. August 2005 19:24), Robert Menschel wrote:
> If you can send me the full email, with headers, so I can compose a
> whitelist_from_rcvd rule for it, and if you are personally certain
> they do not send spam from that From address, I'll add an entry for
> them into 70_sare_whitelist.cf

For the record: mail sent to Robert.
Perfect bayes filter ratio spam/ham
Hi there!

Which bayes filter ratio is better: 1:1 or the natural incoming ratio?

Bye,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
ICQ 264-296-646
Re: sa-learn over already-scanned spam
> If I pass this directory to sa-learn, will sa-learn detect the SA
> part and skip over it, or do I have to fear that the SA messages
> skew my bayes data?

Normally SA will do the right thing and recognize the markup. If this is from a really old version of SA and you are learning on a recent version, I suppose there could potentially be problems. However, I would tend not to learn spam more than about 6 months old anyway.

Loren
sa-learn over already-scanned spam
Hi there!

I have a directory with my old spam. Most of it has been recognised by SA as such, so its body got replaced by an SA message with the original body moved to an attachment.

If I pass this directory to sa-learn, will sa-learn detect the SA part and skip over it, or do I have to fear that the SA messages skew my bayes data?

Bye,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
ICQ 264-296-646
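For anyone who would rather unwrap such messages by hand before training, here is a minimal Python sketch using the standard `email` package. It assumes the original mail was attached as a `message/rfc822` part, which is how the wrapping described above usually looks; the function name `unwrap_original` is made up for illustration. The `spamassassin` command also has a `-d` (`--remove-markup`) option that does this job.

```python
from email import message_from_bytes, policy


def unwrap_original(raw: bytes) -> bytes:
    """Return the attached original message if there is one, else the input unchanged."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries exactly one embedded message.
            return part.get_payload(0).as_bytes()
    return raw
```

Messages that were never wrapped pass through untouched, so the function can be run over a mixed directory.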
Re: SURBL Redirection Problem
Craig McLean wrote:
> 3.1.0-rc1 nailed it to the wall.
>
> Craig.
>
> <...> domain
> | 4.5 URIBL_SC_SURBL Contains an URL listed in the SC SURBL blocklist
> | [URIs: moonboard.info]

Did you detect that with a redirector_pattern? I don't see that detected with a stock 3.1.0-rc1 here (no hint of it when SA is run with -Duri).

Daryl
DNS cache size for moderately busy sites?
Hello,

We just migrated to Tinydns from BIND and are looking at our cache size (OK, so I am really talking about dnscache, not tinydns itself). Looking at our cache logs from the last 12 hours (2am Friday night to 2pm Saturday afternoon), I see our "cache motion" is already 75MB of data. Wow. That's in a relatively low activity time for us. We get an average of somewhere under 100,000 mails a day.

I am curious what other people's cache sizes are set to. If the numbers we are seeing hold up (especially during peak), and if we wanted to cache 3 days' worth of DNS queries, it seems like we'd need something like a 500MB+ cache size. Is it me, or does that seem rather large? I wonder how efficient dnscache would be at that size anyway...

Thanks for any tips!
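The back-of-the-envelope estimate above can be written out as a short Python sketch. It assumes cache motion grows linearly with time, which is a simplification: the 12-hour sample was a quiet period, and the calculation ignores records that expire early because of their TTL. The function name `cache_mb_needed` is made up for illustration.

```python
def cache_mb_needed(motion_mb: float, window_hours: float, retain_hours: float) -> float:
    """Linearly scale observed cache motion (MB) to a longer retention window."""
    return motion_mb / window_hours * retain_hours


# 75 MB of cache motion in 12 hours, scaled to 3 days of retention.
estimate = cache_mb_needed(75, 12, 3 * 24)
print(estimate)  # 450.0
```

That gives 450MB before any allowance for peak hours, which matches the "500MB+" figure once some headroom is added.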
Re: SURBL Redirection Problem
3.1.0-rc1 nailed it to the wall.

Craig.

Ilan Aisic wrote:
|
|  pts rule name              description
| ---- ---------------------- ------------------------------------------
|  0.9 RCVD_BY_IP             Received by mail server with no name
| -6.0 USER_IN_WHITELIST_TO   User is listed in 'whitelist_to'
| -0.0 DK_VERIFIED            Domain Keys: signature passes verification
|  0.0 DK_SIGNED              Domain Keys: message has an unverified signature
|  3.2 FUZZY_PHARMACY         BODY: Attempt to obfuscate words in spam
|  1.3 INFO_TLD               URI: Contains an URL in the INFO top-level domain
|  1.0 LOCAL_INFO_TLD         URI: Contains an URL in the INFO top-level domain
|  4.5 URIBL_SC_SURBL         Contains an URL listed in the SC SURBL blocklist
|                             [URIs: moonboard.info]
|  2.1 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
|                             [URIs: moonboard.info]
|  3.0 URIBL_OB_SURBL         Contains an URL listed in the OB SURBL blocklist
|                             [URIs: moonboard.info]
|  3.8 URIBL_AB_SURBL         Contains an URL listed in the AB SURBL blocklist
|                             [URIs: moonboard.info]
|  2.0 URIBL_XS_SURBL         Has URI in XS - Testing
|                             [URIs: moonboard.info]
|  4.1 URIBL_JP_SURBL         Contains an URL listed in the JP SURBL blocklist
|                             [URIs: moonboard.info]
|  3.0 URIBL_SC2_SURBL        Has URI in SC2 at http://www.surbl.org/lists.html
|                             [URIs: moonboard.info]
|  1.7 SARE_OBFU_VISIT2       found apparent obfuscation of word used in spam
|
Re: SURBL Redirection Problem
Perhaps changing the uri check would be a short-term fix. There is a redirector pattern detector in SA which would be the right thing to fix. Loren
Re: HELO_DYNAMIC_IPADDR - score too high?
Hello wolfgang,

Saturday, August 27, 2005, 3:50:20 AM, you wrote:

w> we received a Duden newsletter (duden is *the* spelling
w> rules/grammar/dictionary publisher in germany) with the header:

Wolfgang,

If you can send me the full email, with headers, so I can compose a whitelist_from_rcvd rule for it, and if you are personally certain they do not send spam from that From address, I'll add an entry for them into 70_sare_whitelist.cf

Bob Menschel
Re: How does SA detect non-english language?
Hello John,

Friday, August 26, 2005, 6:25:14 AM, you wrote:

JH> Hello,
JH> We have had a complaint from a user that some of his Japanese mail
JH> (being received by us) is always marked by SA as spam. As a University
JH> it is natural for us to receive foreign mail messages.

Understood.

JH> X-Spam-Status: Yes, score=13.7 required=8.0 tests=BAYES_99,HTML_20_30,
JH> HTML_MESSAGE,MANGLED_LOOK,SARE_HTML_P_MANY3,SARE_RAND_2,
JH> SARE_RECV_IP_218216,SARE_SUB_ENC_ISO2022JP,SARE_SUB_PCT_LETTER,
JH> SUBJ_ALL_CAPS autolearn=unavailable version=3.0.4

JH> Unfortunately at the time I had left included in our site-wide
JH> configuration some of the specific 'ENG' SARE rules, so that explains
JH> the SARE_SUB_ENC_ISO2022JP matching and bumping the score up a bit. The
JH> SARE_RECV_IP_218216 is also a bit worrying (the message may have passed
JH> through a known spam relay).

If you're using the latest SARE version, SARE_RECV_IP_218216 should be scoring only 0.964, because we have detected ham coming through that range of servers (though spam:ham > 100:1). If you can send me some confirmed ham (full emails, headers and all), I can add those to my corpus and that will help drive the score down.

MANGLED_LOOK is the larger concern, with a score of 2.3. Like the ENG rules, the MANGLED rules file should not be used if you expect any significant non-English ham. I would remove that file from your collection. The 70_sare_obfu*.cf file set is slowly replacing MANGLED, and seems to be successful in avoiding most language problems.

SARE_RAND_2 also scores 2.5 -- that tests for a specific string suggesting that a broken ratware configuration inserted something like %RND into the email. I suppose it's possible, but it seems unlikely that the Japanese email would match that pattern. If you can send me the exact email which does so, maybe I can track that down.

SARE_HTML_P_MANY3 scores only 0.217, so that's not much of a concern.

SARE_SUB_PCT_LETTER with a score of 1.152 is also a significant contributor, matching a percent sign, followed by a single letter, then word break. There is no percent sign in the raw subject you posted, so I assume it's in the code after translation. Seems strange. Again, a copy of that exact email would help me analyze this.

The biggest concern, as Matt pointed out, is your BAYES_99. If this is indeed ham, then you need to train on these messages as ham, because your Bayes system believes firmly that these are spam.

Bob Menschel
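One possible way for a percent sign to show up after decoding, offered as a guess rather than a diagnosis of John's message: in ISO-2022-JP, katakana characters are encoded as two 7-bit bytes, and the first of them is 0x25, the ASCII percent sign. A rule that looks at the decoded bytes could therefore see `%` followed by a letter. In the Python sketch below, the regex is an assumption reconstructed from the description above (percent sign, single letter, word break); it is not the actual SARE rule, and the name `pct_letter_hits` is made up for illustration.

```python
import re

# Hypothetical stand-in for the rule as described in the thread; not the real SARE pattern.
PCT_LETTER = re.compile(rb"%[A-Za-z]\b")


def pct_letter_hits(text: str) -> list:
    """Encode text as ISO-2022-JP and return the byte sequences matching PCT_LETTER."""
    raw = text.encode("iso2022_jp")
    return PCT_LETTER.findall(raw)


print("チ".encode("iso2022_jp"))  # b'\x1b$B%A\x1b(B'
print(pct_letter_hits("チ"))      # [b'%A']
print(pct_letter_hits("Hello"))   # []
```

Whether SpamAssassin applies the rule to bytes in this form is not something this thread establishes, so the exact email would still be needed to confirm it.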
Re: HELO_DYNAMIC_IPADDR - score too high?
>Hi,
>
>we received a Duden newsletter (duden is *the* spelling
>rules/grammar/dictionary publisher in germany) with the header:
>
>Received: from ds80-237-180-34.dedicated.hosteurope.de
>(ds80-237-180-34.dedicated.hosteurope.de [80.237.180.34])
>by netra27.desy.de (DesyMail_In_27) with ESMTP id 3B5D6FB90A
>for ; Fri, 26 Aug 2005 17:00:38 +0200 (MEST)
>
>It got, among others, the scores
>4.4 HELO_DYNAMIC_IPADDR
>2.2 DCC_CHECK
>2.4 MIME_HTML_ONLY_MULTI
>
>This makes me wonder if HELO_DYNAMIC_IPADDR should get a lower score in SA in
>general - I have now lowered its score in our setup to reduce the FP risk.
>
>Cheers,
>
>wolfgang
>

Wolfgang,

Assuming you really do want the newsletter, you should also be adding it to the DCC whitelist. That way it won't trigger the DCC_CHECK *and* you won't be reporting it to the DCC servers (a separate choice, but one I use for any "signed-up-for" bulk mail). See the DCC man pages for examples and syntax.

Paul Shupak
[EMAIL PROTECTED]
SURBL Redirection Problem
This is a snippet from spam content I got:

http://chietaphi.com/catalog/redirect.php?action=url&goto=www.vxneev.moonboard.info/?100aa983aGd9080f4c0bfF3c1362f8e1";>Just VISlT EPharmaccy-By

It did not trigger any of the URI rules even though moonboard.info is listed in all the places. They have exploited a redirector script on chietaphi.com, which looks legit.

I think it should not be hard to improve the SA plugin for URI (check_dnsbl) to also check something as obvious as this redirection. Perhaps it can be done with a second call after parsing the string following the domain name and realizing it contains a URI.

--
Ilan Aisic
Registered Linux User 8124
http://counter.li.org
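The "second pass" idea can be sketched in Python as follows. This is only an illustration of the approach, not how SpamAssassin's URI handling works; the function name `embedded_hosts` and the hostname regex are assumptions made for the example. It scans the query-string values of a URL for anything that looks like a hostname, so that those hosts could be checked against the blocklists as well.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Deliberately loose hostname test: dot-separated labels ending in an alphabetic TLD.
_HOSTNAME = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def embedded_hosts(url: str) -> list:
    """Return hostnames found inside the query-string values of url."""
    hosts = []
    for _key, value in parse_qsl(urlsplit(url).query):
        # Redirector targets often omit the scheme, so supply one before parsing.
        candidate = value if "://" in value else "http://" + value
        host = urlsplit(candidate).hostname
        if host and _HOSTNAME.match(host):
            hosts.append(host)
    return hosts


spam_url = ("http://chietaphi.com/catalog/redirect.php"
            "?action=url&goto=www.vxneev.moonboard.info/?100aa983aGd9080f4c0bfF3c1362f8e1")
print(embedded_hosts(spam_url))  # ['www.vxneev.moonboard.info']
```

A blocklist lookup would still need to reduce the hostname to its registered domain (moonboard.info here), which this sketch does not attempt.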
HELO_DYNAMIC_IPADDR - score too high?
Hi,

we received a Duden newsletter (duden is *the* spelling rules/grammar/dictionary publisher in germany) with the header:

Received: from ds80-237-180-34.dedicated.hosteurope.de
(ds80-237-180-34.dedicated.hosteurope.de [80.237.180.34])
by netra27.desy.de (DesyMail_In_27) with ESMTP id 3B5D6FB90A
for ; Fri, 26 Aug 2005 17:00:38 +0200 (MEST)

It got, among others, the scores

4.4 HELO_DYNAMIC_IPADDR
2.2 DCC_CHECK
2.4 MIME_HTML_ONLY_MULTI

This makes me wonder if HELO_DYNAMIC_IPADDR should get a lower score in SA in general - I have now lowered its score in our setup to reduce the FP risk.

Cheers,

wolfgang
Re: Feature Request: dynamic trusted_networks
"jdow" wrote:

>> However,
>> if a message came from a client who gave SMTP-AUTH, it ought to be
>> "trusted" (and not subjected to the blacklist checks).
>
> Would you care to expound on your theory here. What makes you think
> a valid SPF is a sign of a good guy?

SMTP authentication has nothing - really nothing - to do with SPF.

-thh