SA has long gone to great lengths to extract URIs from things which are
not strictly URIs, on the basis that mail clients do the same and SA
needs to inspect such things for DNSBL lookups. I'm fine with this.

However, once in a while I come across a case where something is clearly
being extracted and canonicalized a little too enthusiastically, which
usually comes to my attention in the context of an FP due in large part
to a hit on our local DNSBL. (That listing, in turn, is likely due to
the same extraction and canonicalization on a batch of missed spam, plus
the minimal "is this an abused legit domain or a spammer domain" check I
do before adding an entry to the DNSBL.)

The latest case is mail from the Cornell Lab of Ornithology, which has
some message element from which SA extracts "none"; SA then canonicalizes
that to "none.com" and tries to look it up in DNSBLs. At a guess, it's an
image tag with a "background" attribute of "none".

"uridnsbl_skip_domain none" doesn't seem to suppress this lookup, either
in 3.4.6 or a recent test install from SVN trunk.

I've worked around this specific case, and past ones, in one way or
another, but I'd like to more precisely target the bad URI extraction.
In particular, I'd like to suppress this at the "random crap that looks
like a URI" stage rather than later on. I specifically do NOT want to
suppress lookups of the canonicalized URI, since that may be justifiably
listed on the local DNSBL.
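
To make the distinction concrete, here is a rough Python sketch of the
behaviour I'm after. It is not SA's code and not an existing option: the
names RAW_SKIP, canonicalize and domains_to_query are made up for
illustration, and the "append .com to a bare word" step is only my
assumption about what the canonicalization is doing in this case.

```python
# Hypothetical model of the two stages; NOT SpamAssassin's actual code.
RAW_SKIP = {"none"}  # raw extracted strings to discard before canonicalization


def canonicalize(raw):
    """Guess a domain from a raw candidate, appending .com to bare words."""
    host = raw.lower().strip()
    return host if "." in host else host + ".com"


def domains_to_query(raw_candidates, raw_skip=RAW_SKIP):
    """Drop unwanted raw strings first, then canonicalize what is left."""
    kept = [r for r in raw_candidates if r.lower().strip() not in raw_skip]
    return sorted({canonicalize(r) for r in kept})


# A bare "none" (e.g. from an HTML attribute) is dropped at the raw stage...
print(domains_to_query(["none", "example.org"]))
# ...but a literal "none.com" in the message is still looked up.
print(domains_to_query(["none.com"]))
```

The point is that the skip list is matched against the raw string, before
any canonicalization, so a real "none.com" in a message can still hit the
DNSBL.
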
Am I missing some configuration option that can do this, or am I left
with doing one of:
- just suppressing lookups of the canonicalized URI
- removing the canonicalized URI from the DNSBL, even though that listing
  may be justified while the *NON*-canonical version absolutely isn't
- applying the welcomelist_* sledgehammer

-kgd