On 2022-04-15 at 14:20:28 UTC-0400 (Fri, 15 Apr 2022 19:20:28 +0100)
Laura Atkins via mailop <la...@wordtothewise.com>
is rumored to have said:

> .eu.org <http://eu.org/> is, essentially, a tld. And .tlds have their own 
> reputation, too. Just this week a few of us were talking about ‘weird’ tlds. 
> One of the participants works at a filtering company, checked their stats and 
> said “this particular tld is 9x% spam.” So 9+ times in 10 when they see a 
> domain registered in that tld, it’s spam.

This is consistent with the SpamAssassin RuleQA stats. 
https://ruleqa.spamassassin.org/?daterev=&rule=%2FTLD shows the stats for rules 
with 'TLD' in their names. (N.B.: the SA sample size is nowhere near what the 
big providers have, and it probably skews differently.) The default rules 
channel includes a list of 'suspicious' TLDs that have been used predominantly 
in spam, at least at some point in the past. Mail hitting that list is reliably 
99%+ spam, and with some minor FP protection that goes over 99.9%. Some of 
those have been pulled out as 
test rules (T_*) in response to complaints by "legit" (stipulated, unchecked) 
senders. Every TLD tested individually that way has been found to be 
persistently at least 95% spam, except for .space, which has a slightly better 
record hovering around 90%.

It is entirely reasonable to see a TLD (or any zone used as a registry) as a 
meaningful attribute in spam filtering. There is a correlation in many cases. I 
cannot nail down what causes that correlation, but that doesn't affect its 
utility. OTOH, even with the drop in the spam/ham ratio in recent years, the 
majority of mail is still spam, so a 90% correlation isn't as significant as it 
sounds. A few years ago, a 90% spam source would have been *better* than the 
aggregate.
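
To put rough numbers on that base-rate point, here is a minimal Python sketch. 
The overall spam rates (50%, 85%, 95%) are illustrative assumptions, not 
measured figures, and the function names are made up for this example. It 
compares the odds that a message from a "90% spam" TLD is spam against the odds 
for mail in general.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds."""
    return p / (1.0 - p)


def odds_ratio(tld_spam_rate, base_spam_rate):
    """Spam odds for the TLD relative to spam odds for all mail.

    A value above 1.0 means the TLD is worse than the aggregate,
    1.0 means it carries no signal, below 1.0 means it is better.
    """
    return odds(tld_spam_rate) / odds(base_spam_rate)


if __name__ == "__main__":
    tld_rate = 0.90  # the "90% spam" TLD figure from the discussion
    # Hypothetical overall spam rates, chosen only for illustration.
    for base_rate in (0.50, 0.85, 0.95):
        ratio = odds_ratio(tld_rate, base_rate)
        print(f"base spam rate {base_rate:.0%}: odds ratio {ratio:.2f}")
```

With an assumed base rate above 90%, the ratio drops below 1, which is the 
"better than the aggregate" case; with a base rate near 50%, the same TLD is a 
strong spam sign.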

Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
mailop mailing list
