Well.  I see that there has been moderately vigorous discussion going
on since I posted my new information regarding port 43 exit statistics, which
is just what I had hoped for. :-)  I don't have responses for all of the
points raised in the followups so far, but I can comment on some of them.
     On Fri, 12 Jun 2009 07:54:55 -0400 Tim Wilde <t...@krellis.org> wrote:
>On 6/12/2009 3:29 AM, Scott Bennett wrote:
>> In other words, by restricting just port 43 exits to only the legitimate
>> whois
>> IP addresses, I eliminated at least 70% of *all* exits through my tor node,
>> which suggests to me that the vast, overwhelming majority of exits from the
>> tor network are illegitimate and place a terribly taxing load upon the tor
>> network as a whole.
>
>Scott,
>
>Thanks for your continued analysis, this is interesting information.
>However, the list of WHOIS servers you mentioned (and I snipped for
>brevity) is by no means a complete set of "the legitimate WHOIS IP
>addresses".  In fact, it's much much too small to draw any significant
>conclusions, for at least two major reasons:
>
>1) Any .com or .net WHOIS queries that hit whois.verisign-grs.com (aka
>whois.internic.net in your list) with a legitimate domain name will
>result in a referral to an individual registrar's WHOIS server, which
>will often be followed by the client, and would not be allowed by your
>exit policy.  There are potentially tens of thousands of these registrar
>WHOIS servers out there.

     I'm not at all sure that that is happening in this case.  My node's
exit policy leaves port 4321 (rwhois) wide open, yet the exit count for
the same time period covered in the statistics I posted last night is
only 22.
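
     For concreteness, an exit policy of the kind described here could be
generated roughly as follows.  This is only a sketch under stated
assumptions: the addresses are placeholders from the RFC 5737 documentation
range, not the whois servers in the actual list, and build_exit_policy is a
hypothetical helper name.  It relies on tor applying the first matching
ExitPolicy rule, so the accept lines have to come before the blanket reject.

```python
def build_exit_policy(whois_ips, whois_port=43, open_ports=(4321,)):
    """Return torrc ExitPolicy lines; tor applies the first rule that matches."""
    # Allow the whois port only to the listed server addresses.
    lines = [f"ExitPolicy accept {ip}:{whois_port}" for ip in whois_ips]
    # Refuse every other destination on the whois port.
    lines.append(f"ExitPolicy reject *:{whois_port}")
    # Ports left wide open, e.g. 4321 (rwhois).
    lines.extend(f"ExitPolicy accept *:{port}" for port in open_ports)
    return lines


# Placeholder addresses (RFC 5737), NOT real whois servers.
policy = build_exit_policy(["192.0.2.10", "192.0.2.11"])
print("\n".join(policy))
```

Dropping the accept lines and keeping only the reject line would give the
blanket treatment that the default policy applies to port 25.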
>
>2) Your list significantly excludes all ccTLD WHOIS servers.  While the

     Drat.  You're quite right.  I forgot all about those.  However, a
quick check shows that an awful lot of those are at the same IP addresses
for which I currently allow port 43 exits.  In other words, the whois
servers I've listed in my exit policy are also covering many of those
ccTLDs.

>numbers of domains registered in ccTLDs are not significant compared to
>.com/.net, their use is quite popular in a number of places,
>particularly in some where Tor is also quite popular, ie Germany.
>
>I'd be interested in seeing a comparison done with a more significantly
>complete list.  I understand you feel very strongly about sampling the

     I agree.  I'll try to add the ones I can find that are at IP addresses
distinct from the ones already allowed.

>contents of the traffic, and that's perfectly understandable and
>appropriate, but it is probably the only way to actually make a firm
>determination of how much of this exit traffic really is WHOIS, without
>crafting a VERY large Exit policy.  It may be possible, with
>appropriately engineered tools, to sample the traffic in a suitably
>anonymous way but still draw some conclusions, perhaps by simply
>attempting to determine if the TCP session involves mostly text or
>binary data.  That may still be a bit too intrusive, so I suppose we
>might just never know.
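
     The text-versus-binary check suggested above might look something like
the following.  It is a rough sketch, not a tested tool: the function names
and the 0.95 threshold are assumptions made for illustration, and it still
has to look at payload bytes, which is exactly the intrusiveness concern
raised above.  The idea is that only the resulting fraction would be kept
per connection, never the content itself.

```python
def printable_fraction(payload: bytes) -> float:
    """Fraction of bytes that are printable ASCII, tab, LF, or CR."""
    if not payload:
        return 0.0
    printable = sum(1 for b in payload if 32 <= b <= 126 or b in (9, 10, 13))
    return printable / len(payload)


def looks_like_text(payload: bytes, threshold: float = 0.95) -> bool:
    """Guess whether a payload is mostly text; the threshold is arbitrary."""
    return printable_fraction(payload) >= threshold


# A whois-style query is a short line of text ending in CRLF.
query = b"example.com\r\n"
print(looks_like_text(query))
# Every possible byte value once: mostly non-printable.
print(looks_like_text(bytes(range(256))))
```

A whois session should score close to 1.0, whereas encrypted or compressed
traffic on the same port should score far lower.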

     Well, I see the situation a bit differently.  First off, I just find
it very hard to understand how there could be five, ten, or more times as
many legitimate whois connections as https connections.  My own usage of
whois lookups is generally fewer than ten per week, mainly in tracking
down information about sources of junk mail, whereas I do untold numbers
of https web page fetches per week.
>
>Given these shortcomings in the list, I definitely wouldn't suggest that
>such a list be considered a "default", as you'll be blocking a
>potentially significant amount of legitimate WHOIS traffic.

     An alternative approach would be to treat a default for port 43 just
like the default treats port 25, I suppose.
>
>If you do attempt to dig up a more complete list of WHOIS servers, I'd
>certainly be interested to see what you come up with, but of course
>understand you're doing this all on your own time and dime, and would
>never suggest that you're by any means obligated to do so. :)
>
     As noted above, I'll get to the additions when I find an hour or so
free to do it.  I'll provide another update to the list once I've
accumulated more data with the expanded list.  However, I suspect at this
point anyway that the expanded list is unlikely to result in drastically
different exit counts relative to the counts for other ports.  As you say,
though, the truth will be in the data, not in my suspicions.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:       bennett at cs.niu.edu                              *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************
