On 09/19, Mark wrote:
> [email protected] wrote:
> > The other 38 were notifications from livejournal.com, nothing spam
> > related, from 2011-08-02 to 2011-08-11.  It looks like you just had
> > livejournal.com listed as a spammer for those 10 days.  Those emails
> > are not hitting this rule now.
>
> livejournal.com has been whitelisted for years, so it's certainly not expected
> behaviour.

Any SA dev folks have opinions on this?  I'm up for assuming there was
somehow a problem on my end and removing these from my corpora if that's
what you devs think I should do.

Mark, I encourage you to include [email protected] in your replies.

> Perhaps you were using a DNS server that returned bad results. Some
> governments (e.g. China) intercept DNS requests and return their own IP. Some
> ISP's think they can do that too for NXDOMAIN results.

It seems unlikely.  I'm using a local bind server with two forwarders to my
hosting provider, linode.com, which is very open-source oriented, and seems
unlikely to pull something like that.  Although I'm happy to ask them via a
support request if there was a related incident during this time period.

The relevant rule is:

urirhssub       URIBL_WS_SURBL  multi.surbl.org.        A   4

Does that mean it could've matched anything ending in .4, or only
127.0.0.4?  Man page is Mail::SpamAssassin::Plugin::URIDNSBL.

> That should be preventable to a large extent by checking if the return code is
> within the 127/8 IP range.

Devs, if urirhssub with a value of "4" does not constrain to 127/8, we
should change the rules to match only, for example, 127.0.0.4.

> We don't control external DNS servers of course, so if one of them decides to
> return a 127/8 code due to whatever cause (e.g. cache poisoning), it will
> cause a false detection signal.

Indeed.

> Another possibility is DNS client error. That is known to occur with
> multithreaded and asynchronous dns clients. Typical is a race condition while
> accessing memory, causing a mix up of query returns.

Seems unlikely, mostly because of the time frame.

> Did the livejournal.com hits have specific subdomains?

I just looked for notifications from livejournal that didn't hit this rule
in the same time frame - there were none.  Everything I got from
livejournal.com from August 2nd to August 11th hit URIBL_WS_SURBL.
And all included these urls:

http://news.livejournal.com/
http://www.livejournal.com/manage/subscriptions/

Other URLs were generally of a subdomain <user>.livejournal.com.

> Also, I would expect that there would not be any query to SURBL for a domain
> that is on SA's internal frequently queried whitelist. livejournal.com should
> be on that list. Can you see if there were any changes/updates to SA that
> could have caused this?

The rules currently include:

25_uribl.cf:uridnsbl_skip_domain juno.com kernel.org livejournal.com lycos.com

Certainly looks to me like that shouldn't allow livejournal.com to be looked
up against SURBL.

Closest backup of those config files I have is 2011-08-23, and that file has
an md5 checksum identical to my current 25_uribl.cf.  Same as the backup
from 2011-07-01:

# md5sum panic-2011-07-01/var/lib/spamassassin/3.004000/updates_spamassassin_org/25_uribl.cf
64a27859c0a7cdafbd856dce3461c2f3  panic-2011-07-01/var/lib/spamassassin/3.004000/updates_spamassassin_org/25_uribl.cf

$ md5sum /var/lib/spamassassin/3.004000/updates_spamassassin_org/25_uribl.cf
64a27859c0a7cdafbd856dce3461c2f3  /var/lib/spamassassin/3.004000/updates_spamassassin_org/25_uribl.cf

So it shouldn't be possible for livejournal.com to hit URIBL_WS_SURBL.

I've removed the examples from my corpora.  I'd still like to know how it
happened.

Here's the simplest example I can find:
http://www.chaosreigns.com/sa/ws_surbl.txt

Only URLs that could hit URIBL_WS_SURBL are www.livejournal.com and
news.livejournal.com, right?  Yep.

spamassassin -D 2>&1 | grep multi.surbl | grep starting | less

Sep 19 17:22:39.564 [9037] dbg: async: starting: URI-DNSBL, DNSBL:multi.surbl.org.:news.livejournal.com (timeout 15.0s, min 3.0s)
Sep 19 17:22:39.569 [9037] dbg: async: starting: URI-DNSBL, DNSBL:multi.surbl.org.:www.livejournal.com (timeout 15.0s, min 3.0s)

That's current trunk output, so there's a bug causing uridnsbl_skip_domain
to not work?

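To make the two expectations above concrete, here is a rough sketch in Python of what I would expect to happen. It is not SpamAssassin's code, which is Perl. The function names bitmask_hit and is_skipped are made up for illustration. The sketch assumes the bare "4" in the rule is treated as a bitmask, and that a skipped domain also covers its subdomains. Whether the plugin applies the 127/8 guard is exactly the open question, so it is a switch here.

```python
import ipaddress

# DNSBL answers are conventionally returned inside 127.0.0.0/8.
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def bitmask_hit(answer, mask, require_loopback=True):
    """Return True if the A-record answer has any bit of `mask` set.

    With require_loopback=True the answer must also lie in 127/8, which
    is the guard suggested in this thread (an assumption, not confirmed
    SpamAssassin behaviour).
    """
    addr = ipaddress.ip_address(answer)
    if require_loopback and addr not in LOOPBACK:
        return False
    return bool(int(addr) & mask)


def is_skipped(host, skip_domains):
    """Return True if host equals, or is a subdomain of, a skipped domain."""
    labels = host.lower().rstrip(".").split(".")
    # Check every suffix of the host name: a.b.c, then b.c, then c.
    return any(".".join(labels[i:]) in skip_domains
               for i in range(len(labels)))


if __name__ == "__main__":
    skip = {"juno.com", "kernel.org", "livejournal.com", "lycos.com"}
    for host in ("news.livejournal.com", "www.livejournal.com"):
        print(host, "skipped:", is_skipped(host, skip))
    # 203.0.113.4 is a documentation address standing in for a
    # hijacked or otherwise bogus DNS answer.
    for answer in ("127.0.0.4", "127.0.0.2", "203.0.113.4"):
        print(answer,
              "guarded:", bitmask_hit(answer, 4),
              "unguarded:", bitmask_hit(answer, 4, require_loopback=False))
```

Under those assumptions neither livejournal.com host would be queried at all, and without the 127/8 guard an answer outside that range whose last octet has the 4 bit set would still count as a hit.
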
Opened bug:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6662

Even without uridnsbl_skip_domain I still can't explain why this rule hit,
and that still bothers me.

> Thanks, feedback on fp's is always very welcome with SURBL.

-- 
"Life is either a daring adventure or it is nothing at all."
- Helen Keller
http://www.ChaosReigns.com
