#33018: Dir auths using an unsustainable 400+ mbit/s, need to diagnose and fix
----------------------------+------------------------
 Reporter:  arma            |          Owner:  (none)
     Type:  defect          |         Status:  new
 Priority:  Medium          |      Milestone:
Component:  Core Tor/Tor    |        Version:
 Severity:  Normal          |     Resolution:
 Keywords:  network-health  |  Actual Points:
Parent ID:                  |         Points:
 Reviewer:                  |        Sponsor:
----------------------------+------------------------

Comment (by arma):

 Initial impressions: these are requests to the DirPort (not the ORPort),
 and they're coming from many different IP addresses, most of which are not
 current relay IP addresses.

 I had 1000+ connections to my DirPort in TCP state ESTABLISHED, and the
 stats dump triggered by kill -USR1 said
 {{{
 For 1139 Directory connections: 43795985 used/48406528 allocated
 }}}
 i.e. at that moment I had already committed to answering about 43
 megabytes of dir info that I hadn't managed to push onto the network yet.
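
 For scale, here is a quick back-of-the-envelope on that log line. The
 helper names are made up for this sketch; only the three numbers come from
 the log output.

```c
#include <assert.h>
#include <stdint.h>

/* Figures copied from the kill -USR1 log line. */
#define DIR_CONNS        UINT64_C(1139)
#define BYTES_USED       UINT64_C(43795985)
#define BYTES_ALLOCATED  UINT64_C(48406528)

/* Average number of buffered bytes per directory connection (rounded down). */
static uint64_t avg_bytes_per_conn(uint64_t total_bytes, uint64_t n_conns)
{
  assert(n_conns > 0);
  return total_bytes / n_conns;
}

/* Share of the allocated buffer space that holds data, in whole percent
 * (rounded down). */
static unsigned percent_used(uint64_t used, uint64_t allocated)
{
  assert(allocated > 0 && used <= allocated);
  return (unsigned)(used * 100 / allocated);
}
```

 With these figures, avg_bytes_per_conn(BYTES_USED, DIR_CONNS) comes to
 38451, i.e. roughly 38 KB queued per connection, and
 percent_used(BYTES_USED, BYTES_ALLOCATED) comes to 90.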

 Most requests seem to be for "/tor/status-vote/current/consensus" which is
 the vanilla-flavored consensus, not the microdesc-flavored consensus that
 is actually in use by clients.

 It would be useful for Tor to collect statistics on how many requests, and
 how many bytes, go to each kind of dir object, split by whether the
 request came from a relay or a non-relay IP address.
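
 A minimal sketch of what such counters could look like, assuming a small
 fixed set of object kinds and a yes/no "is this a relay address" answer
 from elsewhere. All type and function names here are hypothetical and are
 not existing tor code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical object categories for this sketch; not tor's own types. */
typedef enum {
  DIR_OBJ_CONSENSUS_NS = 0,     /* vanilla-flavored consensus */
  DIR_OBJ_CONSENSUS_MICRODESC,  /* microdesc-flavored consensus */
  DIR_OBJ_OTHER,                /* anything else served over the DirPort */
  DIR_OBJ_N_KINDS
} dir_obj_kind_t;

typedef struct {
  uint64_t n_requests;
  uint64_t n_bytes;
} dir_req_counter_t;

/* One counter per (object kind, requester class) pair.
 * Second index: 0 = non-relay address, 1 = relay address. */
typedef struct {
  dir_req_counter_t counters[DIR_OBJ_N_KINDS][2];
} dir_req_stats_t;

/* Reset every counter to zero. */
static void dir_req_stats_init(dir_req_stats_t *stats)
{
  memset(stats, 0, sizeof(*stats));
}

/* Record one answered request of the given kind and size. */
static void dir_req_stats_note(dir_req_stats_t *stats, dir_obj_kind_t kind,
                               int from_relay, uint64_t n_bytes)
{
  assert((unsigned)kind < (unsigned)DIR_OBJ_N_KINDS);
  dir_req_counter_t *c = &stats->counters[kind][from_relay ? 1 : 0];
  c->n_requests++;
  c->n_bytes += n_bytes;
}
```

 The idea would be to call dir_req_stats_note() once per answered request
 and dump the table periodically, so that "mostly vanilla consensus fetches
 from non-relay addresses" shows up as a number rather than an impression.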

 Another idea for an improvement is that we might change Tors to only fetch
 from the dir auths once they have decided to publish their relay
 descriptor, i.e. if you are a relay but you are not reachable, you should
 stay on the "client" fetch schedule. That way it is easier to say that if
 you are fetching from moria1 but you are not a relay, it is surprising and
 weird. (Still a bit tricky though, because relays might connect to
 moria1's dirport from a different IP address than they write in their
 descriptor.)
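
 The proposed rule is small enough to sketch as a pair of predicates. The
 names below are hypothetical, and this is only an illustration of the
 idea, not how tor currently picks its fetch schedule.

```c
#include <stdbool.h>

/* Hypothetical schedule labels for this sketch. */
typedef enum {
  FETCH_SCHEDULE_CLIENT,  /* fetch like a client, not from the dir auths */
  FETCH_SCHEDULE_RELAY    /* fetch from the dir auths */
} fetch_schedule_t;

/* Proposed rule: a Tor configured as a relay stays on the client schedule
 * until it has decided to publish its relay descriptor. */
static fetch_schedule_t choose_fetch_schedule(bool configured_as_relay,
                                              bool decided_to_publish)
{
  if (configured_as_relay && decided_to_publish)
    return FETCH_SCHEDULE_RELAY;
  return FETCH_SCHEDULE_CLIENT;
}

/* Authority-side view: under that rule, a fetch from an address that is
 * not a known relay address looks surprising. This check is only a
 * heuristic, because a relay may connect from a different IP address than
 * the one in its descriptor. */
static bool fetch_looks_surprising(bool addr_is_known_relay_addr)
{
  return !addr_is_known_relay_addr;
}
```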

 Also, handle_get_current_consensus() checks
 {{{
   if (global_write_bucket_low(TO_CONN(conn), size_guess, 2)) {
     log_debug(LD_DIRSERV,
               "Client asked for network status lists, but we've been "
               "writing too many bytes lately. Sending 503 Dir busy.");
 }}}

 but global_write_bucket_low() says
 {{{
   if (authdir_mode(get_options()) && priority>1)
     return 0; /* there's always room to answer v2 if we're an auth dir */
 }}}

 I have commented out these two lines in global_write_bucket_low() on
 moria1, and now I am sending dozens of 503 responses per second. This is
 sort of sad for legit relays that want to get their answers, but I think
 it should make bandwidth available to other directory operations.
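
 To spell out the interaction between the two checks, here is a
 deliberately simplified model. The real global_write_bucket_low() does
 more than compare one number; the struct, the function names, and the
 single "bytes left in the write bucket" field are assumptions made for
 this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the server state that matters here. */
typedef struct {
  bool is_authdir;        /* are we a directory authority? */
  uint64_t write_bucket;  /* bytes we may still write (simplified) */
} model_server_t;

/* Simplified stand-in for global_write_bucket_low(): returns 1 if we
 * should consider ourselves too busy to write `attempt` more bytes.
 * `auth_exemption` models whether the authdir early return is present. */
static int model_write_bucket_low(const model_server_t *srv,
                                  uint64_t attempt, int priority,
                                  bool auth_exemption)
{
  if (auth_exemption && srv->is_authdir && priority > 1)
    return 0; /* the exemption: an authority never reports "low" here */
  return srv->write_bucket < attempt;
}

/* Simplified stand-in for the consensus handler: it asks with priority 2,
 * so on an authority the exemption always wins and 503 is never sent. */
static int model_consensus_status(const model_server_t *srv,
                                  uint64_t size_guess, bool auth_exemption)
{
  if (model_write_bucket_low(srv, size_guess, 2, auth_exemption))
    return 503; /* Dir busy */
  return 200;
}
```

 In this model, an authority with an empty write bucket still answers 200
 while the exemption is in place, and starts answering 503 once it is
 removed, which matches what I see on moria1.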

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33018#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs
