SEM rules pushed into production again
Sorry for posting to both dev and users but it looks like the SEM lists have been pushed into production rules again and are showing up in 72_active.cf under 3.3.1. This happened around 23:00 (GMT-4) last night. Can someone fix that? --Blaine
Re: Spam Eating Monkey causing 100% false positives for large institutions
On 3/23/2011 9:56 AM, dar...@chaosreigns.com wrote: > In the recent sa-updates, the Spam Eating Monkey rules were > inappropriately enabled. If you hit them too much, they start returning > 100% false positives. Their listed limits are "more than 100,000 queries > per day or more than 5 queries per second for more than a few minutes". As soon as the bug was reported on the dev list I disabled the 127.0.0.255 response code to avoid any additional issues. I will be turning this functionality back on as soon as the SA rules are updated, which I assume will be soon. The 127.0.0.255 response code is only returned when someone has performed at least 100 million queries per day for 48 hours straight. During the first 48 hours the queries are simply ignored. Attempts were also made to contact several of the large (300M+) query sources but so far only one has responded with anything more than an autoresponder. It turns out that even large companies don't watch their systems closely enough to notice long delays and query failures against a blacklist. If this had been a planned action then policies would have been changed to reflect the nature of most SA users with regard to default blacklists. Unfortunately, the substantial traffic was simply dropped on SEM and the automatic policies did what they are designed to do: they protected the system. The result was another very stressed SEM admin calling me at 3AM. Personally, I don't think it is unreasonable to start returning this response code for someone performing well over 100M queries/day against a free list with a limit of 100K/day. This policy would most likely change if the SEM rules were ever part of the default SA rules. --Blaine
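As a concrete illustration of the policy described above, a client could treat the 127.0.0.255 answer as a signal to skip the list rather than score it. This sketch is mine, not SEM's published client logic; only the 127.0.0.255 semantics come from the post, and the treatment of other codes is an assumption:

```python
# Hedged sketch: classify a SEM DNSBL answer. Only the 127.0.0.255
# "excessive queries" meaning comes from the post above; everything
# else here is an illustrative assumption.
EXCESSIVE_QUERIES = "127.0.0.255"

def interpret_sem_answer(answer):
    """Classify the A record returned by a SEM lookup (None = NXDOMAIN)."""
    if answer is None:
        return "not-listed"
    if answer == EXCESSIVE_QUERIES:
        # Querying IP exceeded the free-use limits for 48+ hours;
        # treat as "no data", never as a spam indicator.
        return "blocked-for-excessive-queries"
    if answer.startswith("127."):
        return "listed"
    return "unexpected"  # answers outside 127/8 should be ignored
```

A scoring rule should only add points on "listed"; anything else is noise, which is exactly why the accidental enabling caused 100% false positives.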
Re: Understanding blacklisted mail from trusted domains
Alex wrote: > Why is s.twimg.com blacklisted on SEM_URI and SEM_URIRED? There was a rather painful flood of crap hitting our servers using images hosted at twimg.com. Looks like they were posting the images as profile pics and then linking directly to them on Twitter's dime. The domain has dropped off the list after a little nudge. --Blaine
Re: HTML in Messages
MY EYES!!11 Maybe it's time to repost the "best practices" for this list?!? --Blaine Disclaimer: This top post is the intellectual property of blah blah blah and urmom. Henrik K wrote: > > > Marc wrote: > >> Get a modern email client. Are you using a KSR33 teletype on a 110 >> baud modem? > > Idiot, HTML is supposed to be flashy and not a tool for lame blockquotes.
Re: How was your holiday weekend spam traffic?
Chris Santerre wrote: > I'm just curious this morning. I see a dip in spam trapped, but a pretty > big rise in blocking. I expected a lot worse over the long holiday > weekend. Did someone get arrested or something? Since last Wednesday I show about a 25% reduction in my spamtraps but a 30% increase in delivery attempts to my actual mail servers. At first glance it looks like the botnets slowed down but the snowshoe picked up. This might warrant further investigation... --Blaine
Re: Spam Eating Monkey?
Warren Togami wrote: > http://spameatingmonkey.com/usage.html > > Are these URI rules really valid syntax? They don't look right, and > spamassassin lint rejects them. I'm using all of those rules except for the backscatter one with no problems. They also lint fine for me. Are you watching for line wrap? --Blaine
Re: Harvested Fresh .cn URIBL
Terry Carmen wrote: > Instead of blacklisting new domains (which is apparently difficult to > do), why not blacklist all .cn domains (or simply all domains) newer > than xxx days? > > If they're older than xxx days and not yet on another blacklist for > sending actual spam, return a neutral response. How do you determine age? Whois queries really can't be used for the reasons I mentioned in my previous post. I guess one option is to record the date that the domain was first seen anywhere and then work from that, but what about the domains that are rarely used? One of the big problems I see with trying to look at .cn domains in the wild is the lack of data. How many people here deal with a large volume of mail that would have legitimate .cn domains? I seem to remember recently seeing that one of the big blacklists doesn't have enough non-English mail to work with. --Blaine
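The "first seen" idea above can be sketched in a few lines. The plain-dict storage and the function names here are illustrative assumptions, not an existing implementation; as noted, rarely-used domains simply never get a record:

```python
# Sketch of the "first seen" approach: since whois-based ages are
# impractical, record the date each domain is first observed and
# derive an age from that. Storage is a plain dict for illustration.
import datetime

first_seen = {}  # domain -> date first observed anywhere

def observe(domain, today):
    """Record a sighting; only the first one sticks."""
    first_seen.setdefault(domain, today)

def age_in_days(domain, today):
    """Days since first sighting, or None for never-seen domains."""
    seen = first_seen.get(domain)
    if seen is None:
        return None  # rarely-used domains may have no record at all
    return (today - seen).days

observe("example.cn", datetime.date(2009, 9, 1))
observe("example.cn", datetime.date(2009, 9, 4))  # later sighting ignored
```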
Re: Harvested Fresh .cn URIBL
Warren Togami wrote: > Opinions of this proposal? I would love to have a listing of recently registered .cn domains but until the TLD operator starts working with us that just isn't going to happen. Trying to perform a whois lookup on every domain is painfully slow. Once you get a high enough volume of .cn domains detected it will become impossible, and that assumes you are never rate limited. On top of that, most of the time when I do a whois lookup on a .cn domain I find the destination whois server unresponsive, stuck in a "maintenance mode", or returning no data except the domain name and the listed nameservers. Spam from .cn domains can be mitigated with the right rules and querying multiple lists. I know my users never see .cn domains in their inbox, and if I didn't run a blacklist I wouldn't either. --Blaine
Re: Spam Eating Monkey?
Warren Togami wrote: > I'll add your existing rules to the Sandbox for testing. Thank you! > But have you considered putting all the DNSBL's and URIBL's into > aggregated zones so you can cut down on redundant queries? Actually, the uri red list is an aggregate zone of my uri black, red and yellow lists. The main reason I haven't merged the black list with any of the other IP zones is because I haven't had enough user response on the other lists yet. Basically, the relevant zones are SEM-URIRED and SEM-BLACK, and each of them needs to be its own query because of the two completely different datasets. --Blaine
Re: Spam Eating Monkey?
Warren Togami wrote: > http://spameatingmonkey.com > > Anyone have any experience using these DNSBL and URIBL's? > > Is anyone from this site on this list? > > I wonder if we should add these rules to the sandbox for masschecks as > well. Since someone is bound to ask I figure I'll state right now that I have no objections to the SEM lists being included in the masschecks. In fact, I'm quite curious. I would also recommend adding AnonWhois.org to the list. --Blaine Fleming SEM Admin http://spameatingmonkey.com
Re: Understanding the hostKarma Lists
Marc Perkel wrote: > I like it. > > RCVD_IN_HOSTKARMA_BL > RCVD_IN_HOSTKARMA_WL > RCVD_IN_HOSTKARMA_YL > RCVD_IN_HOSTKARMA_BR > > Let's go with it. Marc, have you updated your wiki to reflect the new rules? I think that will pretty well settle any debate or question people have. --Blaine
Re: Hostkarma Blacklist Climbing the Charts
Marc Perkel wrote: > My NoBL list is similar to yellow except that you can skip the blacklist > lookup, but the IP might be whitelisted somewhere. I keep seeing IPs that are on both the NoBL *and* the blacklist. An example of this is 89.206.179.213. That IP currently returns 127.0.0.2 (blacklisted) and 127.0.0.5 (NoBL listed). Can you make sense of this entry? --Blaine
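For anyone reproducing the lookup above: a DNSBL query reverses the IP's octets and appends the list's zone, and each 127.0.0.x A record in the answer is an independent listing code, which is how one IP can carry both 127.0.0.2 and 127.0.0.5 at once. A minimal sketch (the hostKarma zone name is shown for illustration):

```python
# Sketch: build the reversed-octet query name used for a DNSBL lookup.
def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the list's zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone

name = dnsbl_query_name("89.206.179.213", "hostkarma.junkemailfilter.com")
# -> "213.179.206.89.hostkarma.junkemailfilter.com"
```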
Re: .cn domain age query?
Let's try this again, sending to the list this time. Sorry Mike! Mike Cardwell wrote: > That wouldn't help in this particular case: > > "All domains registered in the last 5 days under the .BIZ, .COM, .INFO, > .NAME, .NET and .US TLDs" > > Doesn't work for .cn's, or any other country level TLDs (apart from .us) Unfortunately, ccTLDs aren't very cooperative in matters such as this. There are a few exceptions but most of them will ignore requests for zone file access or outright tell you they can't for "security reasons". The operators of the .cn TLD are unwilling to work with me at all. If anyone has any contacts at various ccTLDs that are willing to grant people access to zone files then please let the list know. I'm sure there are several others that would like access. --Blaine
Re: workaround for DNS "search service"
Arvid Ephraim Picciani wrote: >> By any chance, didn't your ISP start "providing search service" for any >> web name that does not exist? > btw, what's the workaround for this? OpenDNS didn't work for me as they have > similar "features". > Do you simply query the BL's DNS service directly? It seems to me the best solution is to run your own local caching DNS server. If there is a reasonable number of duplicate queries then this could help performance substantially. Of course, I've never been a fan of using a DNS server I don't control, ever since providers realized they can make money by telling you what they think you were looking for. --Blaine
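The local caching resolver suggested above can be very small. This is a minimal sketch in unbound syntax; the choice of unbound and every value shown are assumptions, and any caching resolver such as BIND works equally well:

```conf
# Minimal local caching resolver (unbound.conf sketch, illustrative only).
# Recursing from the roots yourself means the ISP's "search service"
# can never rewrite NXDOMAIN answers for blacklist lookups.
server:
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
```

Point /etc/resolv.conf at 127.0.0.1 afterwards so SpamAssassin's DNSBL lookups go through the local cache instead of the ISP's servers.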
Re: Trying out a new concept
John Hardin wrote: >> This is why I started processing all the TLDs I was able to obtain access >> to. There is lag but the most it could be is about 24 hours and that >> assumes they register a new domain immediately after the TLD dumps the zone. > Does your data allow mapping domain name to registrar? If so, you might > want to try implementing a URIBL for the Evil Registrars as has been > discussed from time to time on the list... I've thought about doing that but it seems redundant since URIBL already does this. At least they seem to have it published on their site so I'm pretty sure it's included in their zones too. --Blaine
Re: Trying out a new concept
SM wrote: > Even if your traffic patterns are different, the hit rates shouldn't be > that low. There would be a difference if your MTA uses a DNSBL to reject > or if you apply other pre-content filtering techniques. It's not a matter of different traffic patterns as much as a matter of when I do the tests. Incoming mail that is accepted is subjected to many tests before it is even checked against the new domains list. If I put it closer to the front of the tests it would probably hit higher but I've never had much need to do so. --Blaine
Re: Trying out a new concept
John Hardin wrote: > Why is it so flippin' difficult to get a feed of newly-registered domain names? Because the TLDs hate giving people access to the data and certainly won't provide a feed without a bunch of cash involved. Even worse, all the ccTLDs pretty much refuse to even talk to you about access to the zones. This is why I started processing all the TLDs I was able to obtain access to. There is lag, but the most it could be is about 24 hours, and that assumes they register a new domain immediately after the TLD dumps the zone. Honestly, on my system I have less than 0.01% hits against a list of domains registered in the last five days so I've always considered the list a failure. However, several others are reporting excellent hit rates on it. I think it is because the test runs so far after everything else, though. --Blaine
Re: More spam after disabling local BIND ?
Jules Yasuna wrote: > Configuration (maybe more than you care to see, sorry) -- 1) platform: kubuntu 8.04 2) SA version: 3.2.4 3) options: add_header spam BB score=_SCORE_ report_safe 0 lock_method flock 4) using qmail -> procmail -> spamc -> spamd. ps ea | grep spam shows ... /usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir --username spamd -s /usr/local/bb/spamassassin/spamd.log -d --pidfile=/usr/local/bb/spamassassin/spamd.pid This snippet is from /etc/procmailrc: :0fw: spamassassin.lock * < 256000 | spamc -F /usr/local/bb/spamassassin/bb.spamc.conf cat bb.spamc.conf shows -u spamd -s 100 --headers SA has been working great! Very few spam messages get through. Then, we made ONE change to the machine. We turned off BIND and just resolve to the ISP name servers. After that, lots and lots of spam gets through. Not everything, just a lot more than when BIND was running locally. So, instead of having BIND running locally and forwarding to the name servers provided by our ISP, we turned off BIND and placed the ISP name server addresses in /etc/resolv.conf. Just for clarity, here is what we did to /etc/resolv.conf: # nameserver 192.168.1.17 (commented out our localhost, since BIND is no longer running) domain angels.bookus-boulet.com nameserver 66.189.0.29 nameserver 66.189.0.30 So, really that is all we did. After that, lots of spam gets through. Just to check, we turned our nameserver back on (and adjusted /etc/resolv.conf accordingly), and once again SA works great! So, please tell me what I am doing wrong here. Thanks in advance ... jules I'm wondering if your DNS servers are running slow or timing out. Have you tried running SA in debug mode and looking for DNS-related delays or issues? --Blaine
Re: rbldnsd blacklist question
Marc Perkel wrote: > Looking for opinions from people running RBL blacklists. I have a list > that contains a lot of name-based information. I'm about to add a lot more > information to the list and what will happen is that when you look up a > name you might get several results. For example, a hostname might be > blacklisted, be in a URIBL list, be in a day old bread list, and a NOT > QUIT list. So it might return 4 results like 127.0.0.2, 127.0.0.6, > 127.0.0.7, 127.0.0.8. Is this what would be considered "best practice"? > My thinking is that having one list that returns everything is very > efficient. Thoughts? +1 for bitmasking the data. --Blaine
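To make the "+1" concrete: with bitmasking, the zone returns a single A record whose low bits encode every list the name is on, instead of one A record per list. The bit assignments below are illustrative assumptions, not a scheme Marc published:

```python
# Sketch of the bitmask alternative: pack every listing into the low
# bits of a single 127.0.0.x answer. Bit values here are made up for
# illustration; a real zone would document its own assignments.
LISTS = {
    1: "blacklist",
    2: "uribl",
    4: "day-old-bread",
    8: "not-quit",
}

def decode(answer):
    """Expand a single 127.0.0.x bitmask answer into list names."""
    last_octet = int(answer.rsplit(".", 1)[1])
    return [name for bit, name in LISTS.items() if last_octet & bit]
```

One query, one answer, and the client tests individual bits, which is why a single aggregated zone is cheaper than four separate lookups.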
Re: New Day old Bread list trick
Marc Perkel wrote: > Blaine Fleming wrote: >> Marc Perkel wrote: >>> Blaine Fleming wrote: >>>> Marc Perkel wrote: >>>>> I just discovered the "Day old Bread" list of host names under 5 days old. I don't know where they get it but the list is very useful. >>>> I remember playing with this list a few years ago but now they seem to lag a few days behind. For example, as of right now, 'superbleached dot com' is not showing up in the list despite being registered on 09-14-2008. Because of this lag I created my own version of the list where I directly process the .ORG, .NET, .COM, .INFO, .US, .COOP and .BIZ TLD zones every day shortly after they are posted. I list all domains in those TLDs that are within the standard domain tasting period (currently 5 days). Would it be of value if I made this list available in RBLDNS format? I have limited resources but would be willing to make an effort if there is any interest. --Blaine >>> How do you get the list? I have the resources. I'd like to make it available. If you can tell me how to get the list myself I'll do it. Tell me what works best for you. I think the list would be very valuable as a factor to test for spam. >> I get the list by generating the data myself directly from the TLD zone files. It takes me about three hours to download and process the ~8.5GB of uncompressed data (~246 million lines) on a single server. I've made it available to anyone interested so contact me offlist with the IP or netblock if you would like access. --Blaine > Where do you download those zone files from? I've got some serious bandwidth and computing power. You have to execute a zone file access agreement with each of the TLDs you want to obtain the data from. There are several that are unresponsive and even more that just outright deny your access. --Blaine
Re: New Day old Bread list trick
Marc Perkel wrote: > Blaine Fleming wrote: >> Marc Perkel wrote: >>> I just discovered the "Day old Bread" list of host names under 5 days old. I don't know where they get it but the list is very useful. >> I remember playing with this list a few years ago but now they seem to lag a few days behind. For example, as of right now, 'superbleached dot com' is not showing up in the list despite being registered on 09-14-2008. Because of this lag I created my own version of the list where I directly process the .ORG, .NET, .COM, .INFO, .US, .COOP and .BIZ TLD zones every day shortly after they are posted. I list all domains in those TLDs that are within the standard domain tasting period (currently 5 days). Would it be of value if I made this list available in RBLDNS format? I have limited resources but would be willing to make an effort if there is any interest. --Blaine > How do you get the list? I have the resources. I'd like to make it available. If you can tell me how to get the list myself I'll do it. Tell me what works best for you. I think the list would be very valuable as a factor to test for spam. I get the list by generating the data myself directly from the TLD zone files. It takes me about three hours to download and process the ~8.5GB of uncompressed data (~246 million lines) on a single server. I've made it available to anyone interested so contact me offlist with the IP or netblock if you would like access. --Blaine
Re: New Day old Bread list trick
Marc Perkel wrote: > I just discovered the "Day old Bread" list of host names under 5 days old. > I don't know where they get it but the list is very useful. I remember playing with this list a few years ago but now they seem to lag a few days behind. For example, as of right now, 'superbleached dot com' is not showing up in the list despite being registered on 09-14-2008. Because of this lag I created my own version of the list where I directly process the .ORG, .NET, .COM, .INFO, .US, .COOP and .BIZ TLD zones every day shortly after they are posted. I list all domains in those TLDs that are within the standard domain tasting period (currently 5 days). Would it be of value if I made this list available in RBLDNS format? I have limited resources but would be willing to make an effort if there is any interest. --Blaine
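The daily processing described above boils down to a set difference plus output formatting. A rough sketch, where the 127.0.0.2 answer value and TXT text are illustrative assumptions in rbldnsd's dnset style, not the actual pipeline:

```python
# Sketch: a domain present in today's TLD zone dump but absent from
# yesterday's was (at most ~24 hours ago) newly registered, and gets
# emitted as an rbldnsd dnset entry. Values are illustrative only.
def newly_registered(todays_zone, yesterdays_zone):
    """Domains present today that were absent from yesterday's dump."""
    return sorted(set(todays_zone) - set(yesterdays_zone))

def rbldnsd_dnset(domains):
    """Format domains as rbldnsd dnset lines with an A value and TXT."""
    return ["%s :127.0.0.2:Recently registered domain" % d for d in domains]

new = newly_registered({"old.com", "fresh.com"}, {"old.com"})
# new == ["fresh.com"]
```

Tracking the first-seen date per domain (rather than a one-day diff) is what extends this to the full five-day tasting window.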
Re: DNS ISP Host List Available
John Hardin wrote: > So how is a proponent of the "Hunt down and kill spammers very messily" > FUSSP classified? In the US, they would be classified as a felon. --Blaine
Re: DNS ISP Host List Available
mouss wrote: > are you using an old imode phone :) The message was about 125KB. That's > less than a small photo (I say this because that's what a "smartphone" is > for, no?). Samsung SCH-i760 on Verizon that takes forever to download mail, so when something longer than about 4k comes in it takes a while. Doesn't really freeze the phone but it doesn't exactly respond well either. It works...mostly. > hope this mail is ok :) Not a problem! --Blaine
Re: DNS ISP Host List Available
Marc Perkel wrote: > Here's my list in dnsrbl format. I only do rsync so far to paid subscribers > or people who I'm trading with. Dude. Seriously. The data is appreciated but next time please post it on a website or something. Your mail pissed off my smart phone! It might not be the best device out there but it normally works for me. I'm more disgruntled about the frozen device than the email itself so feel free to wallop me with a frozen trout or whatever. --Blaine
Day Old Bread list performance
I haven't heard anything about the DOB list from Support Intelligence in several months and that was only to hear about timeouts. Is it still a viable list? Does anyone use it? I know it does still respond but I haven't used it in over a year. Back then it seemed to work well. I have access to several TLD zone files and have started generating my own list for internal use and might even make the data available via rsync if the Support Intelligence version is dead/closed/etc. Thoughts? --Blaine
Re: eudora and "password"
> installed current win version on an xp box, using last paid version of > eudora, inserted "127.0.0.1" in place of POP3 mail server as advised in > manual, but eudora wants a password every time one changes a server name, > and I cannot find one that works. Not really an SA issue but it is probably a config issue. Most proxies need you to set the POP3 username to the full login credentials like '[EMAIL PROTECTED]' so they know where to send your request. As someone that was happy with Eudora for the better part of 10 years, I can say that Eudora doesn't have a password that is causing you problems. --Blaine
Re: How many use CRM114?
Marc Perkel wrote: > CRM114? What's that? Can't quite figure out what it does. Is it a pony? :) CRM114 is another way to intelligently separate the spam from the ham. It is listed on the SA custom plugins page at http://wiki.apache.org/spamassassin/CustomPlugins . So far it works quite well and learns very fast. When I first saw it I wrote it off as some random plugin that didn't make much sense but it has proven to be a worthwhile addition/replacement for bayes. --Blaine
Re: Testing MD5-Sum of the Subject against a dnsbl
Benny Pedersen wrote: >> That is a good starting point for writing a plugin to do something similar >> but the OP wants to hash the subject not the body. > subject is part of the body Correct me if I'm wrong, but I believe that ixhash splits off the part after the blank line (body) and hashes it using the appropriate method(s). Last I checked, the "Subject:" line is in the header, before the blank line, in the part it discards. >> I started doing this a while ago in addition to using ixhash. > how ? I wrote a plugin that gets the exact subject line from SA, hashes it, then queries a remote server. I'm still debating whether subject hashing is worth it in the end as this rule overlaps with several others such as CRM114, ixhash and some other custom rules. After engineering a client-server plugin to provide realtime hash stats, it helps push spam over the threshold without too many false positives. > with the hashhack-server.pl ? The only part I found valuable for my environment was the regular expressions showing how it generates the hashes. Otherwise, I tossed it all out and rolled my own. I learned quickly that exporting the data to an rbldns zone was too slow so I took the approach of using a UDP messaging system to directly query a central server that contains all the data collected by my spamtraps. Doing this boosted the hash hit rate substantially over DNS data that was dumped every 5 minutes. I'm still revising the architecture but plan on releasing all the code shortly. In other words, I hate Perl and the code looks like crap so let me clean it up before I totally embarrass myself! :-) Hope that all makes sense as I'm not really good at explaining things. I'm more of a lock-him-in-the-back-and-let-him-code type of person. --Blaine
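For anyone curious what the subject-hashing idea boils down to, here is a rough sketch. The whitespace normalization and the zone name are my assumptions (MD5 comes from the thread's title), and the actual plugin described above queries a central server over UDP rather than building a DNS name:

```python
# Sketch: normalize the Subject header, MD5 it, and form a DNS-style
# query name against a hash zone. The zone name is a placeholder and
# the normalization rules are illustrative assumptions.
import hashlib
import re

def subject_hash(subject):
    """MD5 of the whitespace-collapsed, lowercased subject."""
    normalized = re.sub(r"\s+", " ", subject.strip().lower())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def hash_query_name(subject, zone="hash.example.invalid"):
    if not subject:
        return None  # missing subjects are more common in ham; don't query
    return subject_hash(subject) + "." + zone
```

The empty-subject guard reflects the advice elsewhere in this thread: score carefully and account for missing subjects rather than treating them as hits.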
Re: Testing MD5-Sum of the Subject against a dnsbl
Oops, still trying to get used to Thunderbird and didn't post this to the list. Benny Pedersen wrote: >> Is there a way to realise this in SA? > http://ixhash.sourceforge.net/ That is a good starting point for writing a plugin to do something similar but the OP wants to hash the subject, not the body. I started doing this a while ago in addition to using ixhash. After engineering a client-server plugin to provide realtime hash stats, it helps push spam over the threshold without too many false positives. The false positives I received were typically monthly notices, like the Verizon Wireless statement notice that hits thousands of boxes at the same time. I strongly recommend being careful with the scoring of the subject hash and be sure to account for missing subjects. I find a missing subject is more common in ham than spam. --Blaine
Re: Time to blacklist google.
> If gmail has a problem, then without a doubt, blacklist them until they fix > it. Seems pretty simple to me. I know that the ISPs I run mail systems for would lose their customers if they stopped getting mail from Google. The customer attitude is that the provider should take measures to block spam but don't you dare block a legitimate message for any reason. Of course, every situation is different. Personally, I'd rather put better filters in place at my end than expect Google to control it. Same goes for Yahoo and Microsoft. In theory I think it would be a good idea, but the number of mail systems required to get the point across is too high for it to actually happen soon. Looking at my mail history there is a lot of legitimate mail from Google and very little spam (so far). I do miss the days when spam filtering was a luxury and nobody really needed it. Now I'm running thousands of dollars of hardware to handle mail that is about 98% spam with a 99.995% successful filtering rate. --Blaine
Re: Is http://www.rulesemporium.com?
> I was not able to access http://www.rulesemporium.com - is this working or > has it moved somewhere? Works fine from here. The site is reachable and resolves to 72.52.4.74, which pings fine as well. --Blaine
How many use CRM114?
Slightly off-topic, but I'm curious, how many of you are using CRM114? How well does it work for you? Was it difficult to train? I've been looking at it and haven't found much except the official plugin guide and a single page saying that it works better than other learning methods. Any info would be appreciated. Thanks, Blaine