#33018: Dir auths using an unsustainable 400+ mbit/s, need to diagnose and fix
----------------------------+------------------------
 Reporter:  arma            |          Owner:  (none)
     Type:  defect          |         Status:  new
 Priority:  Medium          |      Milestone:
Component:  Core Tor/Tor    |        Version:
 Severity:  Normal          |     Resolution:
 Keywords:  network-health  |  Actual Points:
Parent ID:                  |         Points:
 Reviewer:                  |        Sponsor:
----------------------------+------------------------

Comment (by arma):

 Initial impressions: these are requests to the DirPort (not the ORPort), and they're coming from many different IP addresses, most of which are not current relay IP addresses.

 I had 1000+ connections to my DirPort in TCP state ESTABLISHED, and kill -USR1 said
 {{{
 For 1139 Directory connections: 43795985 used/48406528 allocated
 }}}
 i.e. at that moment I had already committed to answering 43 megabytes of dir info that I hadn't managed to push onto the network yet.

 Most requests seem to be for "/tor/status-vote/current/consensus" which is the vanilla-flavored consensus, not the microdesc-flavored consensus that is actually in use by clients.

 It would be useful for Tor to collect statistics about how many requests, and how many bytes, were for what sort of dir object, and came from relay vs non-relay IP addresses.

 Another idea for an improvement is that we might change Tors to only fetch from the dir auths once they have decided to publish their relay descriptor, i.e. if you are a relay but you are not reachable, you should stay on the "client" fetch schedule. That way it is easier to say that if you are fetching from moria1 but you are not a relay, it is surprising and weird. (Still a bit tricky though, because relays might connect to moria1's dirport from a different IP address than they write in their descriptor.)

 Also, handle_get_current_consensus() checks
 {{{
   if (global_write_bucket_low(TO_CONN(conn), size_guess, 2)) {
     log_debug(LD_DIRSERV,
               "Client asked for network status lists, but we've been "
               "writing too many bytes lately. Sending 503 Dir busy.");
 }}}
 but global_write_bucket_low() says
 {{{
   if (authdir_mode(get_options()) && priority>1)
     return 0; /* there's always room to answer v2 if we're an auth dir */
 }}}

 I have commented these lines out on moria1, and now I am sending dozens of 503 responses per second.
 This is sort of sad for legit relays that want to get their answers, but I think it should make bandwidth available to other directory operations.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33018#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
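
The interaction described in the comment, between the busy check in handle_get_current_consensus() and the authority exemption in global_write_bucket_low(), can be illustrated with a toy model. This is only a sketch under stated assumptions, not Tor's actual implementation: the type `sketch_state_t`, the function `sketch_write_bucket_low()`, the `honor_auth_exemption` switch, and the simple "bucket smaller than the request" threshold are all hypothetical names and rules invented for illustration. Only the shape of the exemption (`authdir_mode(...) && priority>1` returning 0) is taken from the quoted source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the state the real check consults. */
typedef struct {
  bool is_authdir;           /* models authdir_mode(get_options()) */
  bool honor_auth_exemption; /* false models commenting out the exemption */
  size_t bucket_bytes;       /* assumed: bytes we can still write right now */
} sketch_state_t;

/* Return true if the caller should refuse the request (e.g. send 503).
 * The threshold rule below is an assumption made for this sketch only. */
bool
sketch_write_bucket_low(const sketch_state_t *st, size_t attempt, int priority)
{
  assert(st != NULL);

  /* Mirrors the quoted exemption: an authority never reports "low" for
   * priority > 1, so the 503 branch in the caller is never taken. */
  if (st->honor_auth_exemption && st->is_authdir && priority > 1)
    return false;

  /* Assumed rule: the bucket is "low" if it cannot cover this response. */
  return st->bucket_bytes < attempt;
}
```

In this model, an authority with the exemption in place answers a priority-2 consensus request regardless of how little write budget remains; with `honor_auth_exemption` set to false, the same request is refused whenever the response would exceed the remaining budget, which corresponds to the 503 responses reported above.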
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs