#32265: MS: Format an exit list from a previous exit list and exitmap output
----------------------------------+--------------------------------
 Reporter:  irl                   |          Owner:  irl
     Type:  task                  |         Status:  needs_revision
 Priority:  Medium                |      Milestone:
Component:  Metrics/Exit Scanner  |        Version:
 Severity:  Normal                |     Resolution:
 Keywords:                        |  Actual Points:
Parent ID:  #29654                |         Points:
 Reviewer:  karsten               |        Sponsor:
----------------------------------+--------------------------------

Comment (by irl):

 Replying to [comment:5 karsten]:
 > Glad to see that the rewrite is progressing so quickly!
 >
 > Couple remarks/questions:
 >  - Why 48 hours and not 24 hours? Doesn't the current exit scanner keep
 > scan results for 24 hours? I might be wrong, though. Let's use whatever
 > the current scanner does.

 https://2019.www.torproject.org/tordnsel/exitlist-spec.txt

 It discards relays that were not seen in the last 48 hours in a consensus.

 >  - Rather than downloading exit lists from CollecTor, wouldn't it be
 > sufficient to just read the latest exit list previously written by this
 > scanner? And if there's none, just assume that no previous scans have
 > happened. In theory, this should be all we need to learn.

 Probably, but this was a handy way to get test data and I wanted to try
 out the new Stem functionality. It would be nice to have a method to
 bootstrap a new scanner but this could just mean manually downloading the
 latest exit list and putting it in the right place.

 >  - It seems that `LastStatus` is only taken from exit lists downloaded
 > from CollecTor but never set by new measurements. We should make a plan
 > what to do with this field. Take it out? Populate it with consensus
 > valid-after times?

 Right, this is the tricky bit. Do you know if anything consumes the
 LastStatus or Published timestamps? Ideally we could just drop these but
 for now I'm synthesizing them from the timestamp of the last measurement
 which could be close enough for the consumers.

 >  - Does exitmap with the plugin use previous scans as input to decide
 > which relays to scan? I believe that it uses some logic to avoid scanning
 > relays too frequently. This has two effects: it doesn't generate more
 > load on the network and on single relays than necessary, and it ensures
 > that new relays are scanned sooner. As a result, the new scanner could be
 > run once or twice per hour, rather than every 2 or 3 hours (at 45 minutes
 > runtime).

 No. It scans the entire network every time. It does this asynchronously,
 and doesn't try to prioritize anything. Just whichever circuits are built
 first will be tested first. I was even thinking it could run continuously.
 If exit relays cannot cope with two HTTP requests an hour, perhaps they
 shouldn't be exit relays.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32265#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
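
 For illustration, the merge step discussed in this comment could look
 roughly like the sketch below. It is not the code attached to the ticket.
 The names `ExitEntry`, `merge_exit_list`, `format_exit_list` and `MAX_AGE`
 are hypothetical, and measurements are assumed to arrive as
 (fingerprint, address, time) tuples. Only the `ExitNode`, `Published`,
 `LastStatus` and `ExitAddress` keywords come from the exit list format.
 The sketch also assumes that the synthesized `LastStatus` can stand in for
 "last seen in a consensus" when applying the 48 hour cutoff.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=48)  # retention window mentioned in this comment
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class ExitEntry:
    """One relay in the exit list (hypothetical container for this sketch)."""
    fingerprint: str
    published: datetime
    last_status: datetime
    # Maps each observed exit address to the time it was last measured.
    addresses: dict = field(default_factory=dict)


def merge_exit_list(previous, measurements, now):
    """Merge the previous exit list with new measurements.

    `measurements` is an iterable of (fingerprint, address, measured_at).
    Entries whose LastStatus is older than MAX_AGE are dropped.
    """
    # Copy the previous entries so the caller's objects are not modified.
    entries = {
        e.fingerprint: ExitEntry(e.fingerprint, e.published, e.last_status,
                                 dict(e.addresses))
        for e in previous
    }
    for fingerprint, address, measured in measurements:
        entry = entries.get(fingerprint)
        if entry is None:
            entry = entries[fingerprint] = ExitEntry(fingerprint, measured, measured)
        entry.addresses[address] = measured
        # Assumption: synthesize both timestamps from the newest measurement,
        # since nothing else sets them for freshly scanned relays.
        latest = max(measured, entry.last_status)
        entry.published = entry.last_status = latest
    cutoff = now - MAX_AGE
    kept = [e for e in entries.values() if e.last_status >= cutoff]
    return sorted(kept, key=lambda e: e.fingerprint)


def format_exit_list(entries):
    """Render entries using the exit list keywords, one item per line."""
    lines = []
    for e in entries:
        lines.append(f"ExitNode {e.fingerprint}")
        lines.append(f"Published {e.published.strftime(TIME_FORMAT)}")
        lines.append(f"LastStatus {e.last_status.strftime(TIME_FORMAT)}")
        for address, measured in sorted(e.addresses.items()):
            lines.append(f"ExitAddress {address} {measured.strftime(TIME_FORMAT)}")
    return "".join(line + "\n" for line in lines)
```

 Bootstrapping a new scanner then amounts to calling `merge_exit_list`
 with an empty `previous` list, or with entries parsed from a manually
 downloaded exit list.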
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs