#23603: hs: Cleanup race between circuit close and free with the HS circuitmap --------------------------+------------------------------------ Reporter: dgoulet | Owner: (none) Type: defect | Status: new Priority: High | Milestone: Tor: 0.3.2.x-final Component: Core Tor/Tor | Version: Severity: Normal | Resolution: Keywords: tor-hs | Actual Points: Parent ID: | Points: Reviewer: | Sponsor: --------------------------+------------------------------------
Comment (by dgoulet): Ok, those latest logs were enlightening and 99.9% certainty now :) The offending function is `hs_service_intro_circ_has_closed()` called in `circuit_about_to_free()`. Same scenario has above but just simpler for the sake of the example: 1. mark for close intro circuits 2. launch new ones to same intro points. 3. `about_to_free()` the circuits in step (1). --> This calls `has_closed()` which removes the intro point object if it has been retried too many times. 4. New circuit opens, boom no intro point object and we get the warning of `Unknown auth key...`. From asn's log, which suspend/wakeup many times his tor, we happen to try intro points many times (successfully) and in their 18h-24h lifetime, it was done more than 3 times which triggered this condition in the has closed function: {{{ if (ip->circuit_retries > MAX_INTRO_POINT_CIRCUIT_RETRIES) { remember_failing_intro_point(ip, desc, approx_time()); service_intro_point_remove(service, ip); service_intro_point_free(ip); }}} This can only happen if the service launch the new circuits *before* the free() of the previous circuits happens *and* the circuit retry counter has reached > 3. Couple of possible fixes here. First, we should NOT clean anything in that function. We should ONLY do that in `cleanup_intro_points()` which would have cleanup everything properly (circuit included). For this, we need to move the "remember failing intro point" logic in there in case we've tried too many times. Second, this `circuit_retries` counter should probably not be incremented when we just go out of hibernation and `tor` is assuming at that point that all circuits are not working. We either do a callback from hibernation into the HS subsystem to cleanup those counters or another possible option would be to reset to 0 the counter on a successful intro point establishment. -- Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23603#comment:6> Tor Bug Tracker & Wiki <https://trac.torproject.org/> The Tor Project: anonymity online
_______________________________________________ tor-bugs mailing list tor-bugs@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs