#18809: Handle linked connections better during bootstrap -------------------------------------------------+------------------------- Reporter: teor | Owner: arma Type: defect | Status: Priority: Medium | needs_revision Component: Core Tor/Tor | Milestone: Tor: Severity: Normal | 0.2.8.x-final Keywords: must-fix-before-028-rc, | Version: Tor: TorCoreTeam201605, review-group-1 | 0.2.8.1-alpha Parent ID: | Resolution: Reviewer: andrea | Actual Points: medium | Points: medium | Sponsor: | SponsorS-can -------------------------------------------------+-------------------------
Comment (by arma): Replying to [comment:23 teor]: > Code review: Thanks for the review! > 91c5801 avoid following through on a consensus fetch if we have one already arriving > Suffers from a race condition where two connections could attach at the same time. But this is unlikely and relatively benign: we'll just fetch the consensus twice. Can you elaborate on this one? This part of Tor is single-threaded, so they can't attach at the same time. One of them attaches before the other, and the first one to win gets immediately moved to state AP_CONN_STATE_CONNECT_WAIT as soon as it's chosen, so the second one will know that there's already a winner. > a7665df close other consensus fetches when we get a consensus > There were two race conditions avoided by the previous code: > * the old function closed the connection immediately to avoid wasting data. But this meant that if a connection was receiving data or undergoing other operations, there could be errors. I think calling connection_mark_for_close is better, even if it's less efficient. The trouble is that connection_mark_for_close() just closes the dir conn -- not the underlying tls connection, and so it doesn't undo any begin cell already sent onto the network. So we can close the consensus request, but if it already sent its begin cell, we will still get a consensus coming back at us -- we'll just drop all the data cells because we won't be expecting it anymore. This behavior is similar to the problem in bug #16844. But to be clear, this new function connection_dir_close_consensus_fetches() is aiming to find and kill directory requests that haven't attached to a circuit yet. There should not be any directory requests that are attached to a circuit (i.e. fetching a consensus) at this point -- they would have been noticed inside 91c5801 and stopped there. > * connections can be initiated after connection_dir_list_by_purpose_and_resource() is called. These connections are never cleaned up. I think this is fixed by checking when we attach a stream in 91c5801. You mean the connection_dir_list_by_purpose_and_resource() call inside connection_dir_close_consensus_fetches()? That function is only called after we've called networkstatus_set_current_consensus(), so we should have a consensus in place. But! What about the case where we got a consensus, but we don't have the certs yet for verifying it? Then networkstatus_consensus_is_bootstrapping() will return 1, yes? Crappo. Does this mean we want to modify networkstatus_consensus_is_bootstrapping() so it says no if there is a consensus but it's in waiting-for-certs limbo? After all, we don't want to start launching more consensus fetches in that situation. > bcae392 avoid another redundant check > good, but I wonder if we'll want networkstatus_consensus_is_downloading_usable_flavor in future. probably not worth keeping it around Right, we can bring it back if we want it one day. > d5a9628 simplify more -- we only call these funcs when bootstrapping > makes sense, but you'd better document that those functions should only ever be called when bootstrapping The functions are named update_consensus_bootstrap_multiple_downloads and update_consensus_bootstrap_attempt_downloads(), and also their function comments start with "When we're bootstrapping, ". Should we do more? > 1f72653 fix a bug where relays would use the aggressive client bootstrapping retry number > I think the correct fix for this is to check `server_mode(options)` in `consensus_max_download_tries()`, and `return options->TestingConsensusMaxDownloadTries` if it's true. > > Otherwise, this bug is just waiting to happen again. Hm? I was actually thinking of just removing consensus_max_download_tries() entirely, since it is only called from two places, and one of the places wants the first clause of the function, and the other place wants the last clause of the function. I guess this is my instinct to simplify rather than complexify. > 507883e rip out the unit tests for the functions i removed > Please leave the parts of the unit tests that test existing functions, or remove those functions if they are unused: > * connection_dir_list_by_purpose_and_resource > * connection_dir_count_by_purpose_and_resource They are still used. Would you find it easy to add back whichever parts of the unit tests you want to keep? Or throw out this last commit and do whatever you think is smart? :) Thanks! -- Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18809#comment:24> Tor Bug Tracker & Wiki <https://trac.torproject.org/> The Tor Project: anonymity online _______________________________________________ tor-bugs mailing list tor-bugs@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs