As a follow-up to, but off-topic from, the bug report ...

> ------- Additional Comments From [EMAIL PROTECTED] 2004-01-25 02:18 -------
> I don't like the idea of having to run mass-checks manually and
> extracting domain names to check from that -- mostly because most
> people won't do it.
>
> How about this:
>
> - Extract registerable domain part using reportedly existing heuristics
>   (hostpart.spammer.co.uk -> spammer.co.uk)
>
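
To make the quoted heuristic concrete, here is a rough sketch of what I
understand it to mean. It is in Python purely for illustration; the logic
should port directly to Perl. The suffix set and the function name are my
own assumptions, and a real implementation would need a much more complete
list of two-label suffixes than the handful shown here.

```python
# Illustrative only: a tiny, hand-picked set of two-label suffixes under
# which registrations happen one level further down. NOT a complete list.
TWO_LABEL_SUFFIXES = {"co.uk", "org.uk", "com.au", "co.jp"}


def registrable_domain(host):
    """Guess the registrable part of a host name (hypothetical helper).

    Keeps the last three labels when the name ends in a known two-label
    suffix such as co.uk, and the last two labels otherwise.
    """
    labels = host.lower().rstrip(".").split(".")
    last_two = ".".join(labels[-2:])
    keep = 3 if last_two in TWO_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


if __name__ == "__main__":
    for name in ("hostpart.spammer.co.uk", "lane.nerbq.wealthnation.net"):
        print(name, "->", registrable_domain(name))
```

The weak spot is the suffix table: any country-code suffix missing from it
gets truncated one label too far.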

Over the weekend, I collected 3,600 host names associated with 16,300 URLs
extracted from about 80,000 spam messages going back to August of this year.
They're sorted in reverse dot order (top-level domain first, then each label
moving leftward), for example:

trimtram.net
trinketreach.net
www.try4free.net
www.ultrastats.net
umbrellacover.net
www.usagov.net
www.usaskylink.net
ns.usenetsolution.net
www.vacationpromo.net
mysite.verizon.net
viva-x.net
www.vivato.net
bradford.hfwnflvzxb.wealthnation.net
lane.nerbq.wealthnation.net
www.whitephantom.net
www.whitetrashsluts.net
www.whoringfor-college.net
www.wideep.net

As you can see, the wealthnation.net entries, for example, end up together
even though their host name prefixes are different.
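
In case "reverse dot order" is unclear, this is roughly the sort I mean,
sketched in Python for illustration only (the function name is made up for
this example). The key is simply the list of labels read right to left, so
hosts under the same domain land next to each other.

```python
def reverse_dot_key(host):
    """Sort key: the host's labels from right to left (hypothetical helper).

    "lane.nerbq.wealthnation.net" -> ["net", "wealthnation", "nerbq", "lane"]
    """
    return host.lower().split(".")[::-1]


if __name__ == "__main__":
    hosts = [
        "lane.nerbq.wealthnation.net",
        "www.vivato.net",
        "mysite.verizon.net",
        "bradford.hfwnflvzxb.wealthnation.net",
        "viva-x.net",
    ]
    for host in sorted(hosts, key=reverse_dot_key):
        print(host)
```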

Question: is there a Perl package that can be used to boil these down
to their domain name part, suitable for a whois lookup? Where I'm going
with this is to try to build a database of domains that share the same
registrar, technical point of contact, and so on. One approach I thought of
was to try a whois on the fully qualified host names above and, if it doesn't
succeed, remove the first component and try again, and so on, but that's not
very elegant.
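
For clarity, the strip-and-retry approach would look something like the
sketch below (Python for illustration only; it should translate to Perl
easily). The has_whois_record argument is a placeholder I made up: it stands
for whatever function actually performs the whois query and reports whether a
record came back. Nothing here talks to a real whois server.

```python
def candidate_domains(host):
    """Return the host name, then each shorter suffix, stopping before the
    bare top-level domain (hypothetical helper)."""
    labels = host.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


def first_registered(host, has_whois_record):
    """Try the longest name first and drop the leading label on each miss.

    has_whois_record is an assumed callable: name -> bool.
    Returns the first name with a record, or None if none matched.
    """
    for candidate in candidate_domains(host):
        if has_whois_record(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Stand-in for a real lookup: pretend only these names have records.
    known = {"wealthnation.net", "verizon.net"}
    print(first_registered("lane.nerbq.wealthnation.net", known.__contains__))
```

The inelegant part is the cost: each host can take several lookups, and two
hosts under the same domain repeat the same final query unless results are
cached.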

Regarding whois, I tried a few of the domains in the list and noticed
that whois turned up empty. Is there a database somewhere that relates
domain names to their registrar, or to a server that will reply with their
whois info?



