Warren Togami wrote:
> Overlap analysis shows the majority of XBL and PBL are also listed by
> Barracuda.  Furthermore Barracuda's list seems to have a similar hit
> % as XBL + PBL combined.  Is Barracuda known to aggregate Spamhaus
> data with their own?  If so we might be adding redundant scores in a 
> dangerous and undesirable manner.
> 
> Adam Katz sa-update channels contains DNSBL rule overlap adjustments
> in an attempt to compensate for what he calls "incestuous"
> blacklists.  I am beginning to think this is a good idea to explore
> for spamassassin upstream if in fact one blacklist is aggregating
> data from another blacklist.

I should say more about my overlap rules ("overlap" being the PC version
of what I called "incestuous" in earlier versions and in comments).

I've noticed that many of these blocklists overlap heavily on the
same ham.  Some of them syndicate common upstream sources, but more
importantly, they share the same propagation methods.

Spam traps are limited in what they can pick up while still staying
pure:  list subscription + unsubscription, catch-all accounts on
guessable or subtly "advertised" domains, cleaned-up stale email
accounts, feeding addresses to spam bots, and perhaps a few other
tricks.

This fishing for spam will lure the same spammers across the board, thus
the overlap.  This overlap is a problem because some spammers are smart
enough to cycle through relays and hope for one known (rightly or not)
for sending ham, or at least *not* known for sending spam.  Overlap from
DNSBLs can completely kill ham, and I think a multifaceted system like
SpamAssassin should not apply 5+ points (out of 5) to a message solely
from DNSBLs** when there are so many other tools available.  Real spam
will bump into something else.


That brings me to a big pet peeve of mine on DNSBLs:  they 'clean'
themselves of this problem by using DNSWLs ... and spammers know this.
The 'whitelisting' supplied by a DNSWL is in my opinion not appropriate
for a DNSBL to use.  Instead, a DNSBL-dedicated reference is needed,
perhaps even one that is not publicly available.

As to how such a thing would be populated ... that's a great question.
If it's anything that could be publicly accessible, I'd prefer DNSBLs to
either use NOTHING and let their users cross-check, or else use a
different return code to indicate the hit so that I can act on it
anyway.  *Especially* while DNSWLs lack an abuse-reporting mechanism.

I have seen SO much DNSWL'd spam that I've had to migrate to requiring
confirmation, like whitelist_auth as opposed to whitelist_from, but at
the DNSWL level.
In my khop-bl sa-update channel, any DNSWL'd message that doesn't pass
DKIM or SPF gains a point while any that does loses 2.25 (unless it's
already been lowered by overlapping DNSWL scores).  ... actually, I'm
surprised I gave it such a swing given spammers' increasing use of SPF
and DKIM.
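
To make that adjustment concrete, here is a rough sketch of the logic in Python rather than as actual SA rules.  The function and parameter names are made up for illustration, and treating "already lowered" as a simple flag that skips the bonus is my simplification; the channel itself expresses this through rule scores.

```python
def dnswl_adjustment(dnswl_listed, spf_pass, dkim_pass, already_lowered=False):
    """Return the score change for a message, given DNSWL and auth results.

    Illustrative only: names and the already_lowered flag are assumptions.
    """
    if not dnswl_listed:
        # Not on any DNSWL: this adjustment does not apply.
        return 0.0
    if spf_pass or dkim_pass:
        # Confirmed by SPF or DKIM: trust the whitelisting, unless an
        # overlapping DNSWL score has already lowered the total.
        return 0.0 if already_lowered else -2.25
    # Whitelisted but unconfirmed: add a point instead of trusting it.
    return 1.0


print(dnswl_adjustment(True, spf_pass=False, dkim_pass=False))
print(dnswl_adjustment(True, spf_pass=True, dkim_pass=False))
```

The swing between the two cases is 3.25 points, which is the part I said I was surprised by.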


** Another pet peeve:  Mail should not be able to be marked as spam from
a single category of detection mechanisms, aside from blacklists and
perhaps a fully trained and moderated learning algorithm.  I'd like to
cap each mechanism category at something like 3.5 points, perhaps
4.0 for something dynamically generated by incoming data (e.g. Bayes,
AWL), but SA makes this kind of capping *really* hard.
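
What I mean by a per-category cap could be sketched like this.  It is Python, not anything SA supports today; the category labels and the idea of tagging each rule hit with a category are assumptions for illustration, and only the 3.5 and 4.0 figures come from the paragraph above.

```python
# Hypothetical category caps: 4.0 for dynamically generated categories,
# 3.5 for everything else.
CATEGORY_CAPS = {"bayes": 4.0, "awl": 4.0}
DEFAULT_CAP = 3.5


def capped_total(hits):
    """Sum (category, score) hits, capping each category's total.

    Negative (ham-leaning) category totals pass through unchanged,
    since min() only limits the upper end.
    """
    totals = {}
    for category, score in hits:
        totals[category] = totals.get(category, 0.0) + score
    return sum(
        min(total, CATEGORY_CAPS.get(category, DEFAULT_CAP))
        for category, total in totals.items()
    )


# Three DNSBL hits worth 6.0 raw are held to 3.5, so the message still
# needs help from another category to reach a threshold of 5.
hits = [("dnsbl", 2.0), ("dnsbl", 2.5), ("dnsbl", 1.5), ("body", 1.0)]
print(capped_total(hits))
```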

DNSBL/URIBL/DNSWLs are the only place that this sticks out enough for me
to have remedied.  My IXHASH rule is specifically designed to avoid this
exact problem.  It uses the plugin's default of 0.1 per server hit and
makes their union the rule that gets the larger share of the points.  If I
had masscheck results, some servers' scores might go up, but the bulk
would still be applied by the meta rule.  SA 3.4 (or 3.3 if it's not too
late...) should (IMHO) include that sort of mechanism for DNSBLs.  Not
quite a cap, but close enough.
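
The shape of that scoring, sketched in Python instead of SA rule syntax:  each server hit is worth 0.1 and a single "union" rule carries the bulk once any server hits.  The 1.5 union score and the server names are placeholders I made up for this example; they are not the values in the channel.

```python
PER_SERVER_SCORE = 0.1  # per-server default mentioned above
UNION_SCORE = 1.5       # placeholder value, not the channel's real score


def ixhash_score(server_hits):
    """Score a message from a mapping of server name -> hit (bool)."""
    hit_count = sum(1 for hit in server_hits.values() if hit)
    if hit_count == 0:
        return 0.0
    # The union rule fires once no matter how many servers matched, so
    # overlapping servers add only 0.1 apiece on top of it.
    return UNION_SCORE + PER_SERVER_SCORE * hit_count


one = ixhash_score({"server-a": True, "server-b": False, "server-c": False})
all_three = ixhash_score({"server-a": True, "server-b": True, "server-c": True})
print(round(one, 2), round(all_three, 2))
```

Going from one server hit to three only adds 0.2 here, which is the "not quite a cap" behavior.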


The overlap rules in question are a part of my khop-bl channel, which is
published at http://khopesh.com/Anti-spam#sa-update_channels not too far
above my iXhash meta rule, which now includes the workaround update
discussed here not too long ago.
