https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6926
Adam Katz <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #1 from Adam Katz <[email protected]> ---
> Do they provide their TLD list in machine readable format?

How about:

$ wget -qqO- https://www.iana.org/domains/root/db \
  |perl -ne 'if (m"/root/db/([^.]+)\.html") { print "$1\n" }' \
  > tld.txt

After that, you can run:

$ sed '/^ *ac ad/,/^ *zm zw/!d; s/^ *//; s/  */\n/g' \
    lib/Mail/SpamAssassin/Util/RegistrarBoundaries.pm \
  |grep -vwFf- tld.txt

Which currently reveals we're missing:

bl bq bv cw eh gb mf post sj ss sx um
(plus all the punycode IDNs, unless we track them elsewhere)

(I also ran the opposite.  We don't have any TLDs that aren't on IANA's list.)

We'll have to add these via util_rb_tld in sa-update in addition to
RegistrarBoundaries.pm so users don't have to wait for SA 3.4.0 to get this.

While on the ~tld topic, I see we don't yet include
https://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1
(for 2tld and 3tld).  I haven't vetted that to see if it's worthwhile, but in
doing some research a while back, it looked ideal.

-- 
You are receiving this mail because:
You are the assignee for the bug.
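
The comparison in the comment above (IANA's list minus the TLDs we already know) can also be sketched as a small offline helper that works on two local files. This is only an illustration under assumptions: missing_tlds is a hypothetical function name, not part of SpamAssassin, and it expects the known TLDs to have been extracted into a whitespace-separated file beforehand (for example with the sed command above). It matches whole lines (-x) rather than words (-w), and it drops empty patterns first, since an empty pattern makes grep -F match every line.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): print the TLDs that appear
# in an IANA-derived list but not in a list of already-known TLDs.
#   $1: file with one TLD per line (e.g. tld.txt from the wget/perl step)
#   $2: file with known TLDs separated by spaces, tabs, or newlines
missing_tlds() {
    iana_list=$1
    known_list=$2
    # Put each known TLD on its own line, lowercase it, and drop empty
    # lines so that no empty pattern reaches grep -F.
    tr -s ' \t' '\n\n' < "$known_list" \
        | tr 'A-Z' 'a-z' \
        | grep -v '^$' \
        | grep -vxFf - "$iana_list"
}

# Example use, assuming known.txt already holds the extracted TLD list:
#   missing_tlds tld.txt known.txt
```

Like the original pipeline, this reads the patterns from stdin, so the known list never has to be sorted.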
