#29624: New version of exit list format
-------------------------------------+--------------------------------
 Reporter:  irl                      |          Owner:  karsten
     Type:  task                     |         Status:  needs_revision
 Priority:  Medium                   |      Milestone:
Component:  Metrics/Exit Scanner     |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:  #29650                   |         Points:
 Reviewer:  irl                      |        Sponsor:
-------------------------------------+--------------------------------
Changes (by karsten):

* status: needs_review => needs_revision

Comment:

Here are my notes from talking this over at today's meeting:

Replying to [comment:8 karsten]:
> Replying to [comment:7 notirl]:
> > We need to work on the use of words like "may". Unless Tor already has something for this, let's refer to RFC 2119.
>
> Makes sense. However, it's been a while since I wrote specs with those keywords, and I think I didn't get it right in all cases back then. Do you mind going through the spec at the end and correcting keywords accordingly?
>
> > I don't believe we need to prefix keywords with "Scanner". Was there a specific reason for this?
>
> My idea was to avoid future conflicts with keywords used in exit list entries, and in the header it matters the least to make keywords a bit longer. I don't feel strongly, though. Mild preference for keeping the prefix.
>
> > dir-spec uses kebab-case for keywords, not CamelCase.
> >
> > For fields that are already defined in dir-spec, like "contact", we should refer to those semantics instead of making up our own.
>
> Hmm, should we really mix CamelCase and kebab-case in a single document? I think I'd prefer to stay in CamelCase notation.

We made plans to use kebab-case keywords only in version 2. This means that version 2 won't be backward-compatible with version 1, which only uses CamelCase keywords. The API can still provide the same methods for accessing parts of an exit list, regardless of the version. Let's try this.

Related to this change, we're going to say "contact" rather than "ScannerContact" or "scanner-contact", and we're linking to version 3 of dir-spec to say that we're using the format specified there.

> > As above, for date/time formats.
>
> Hmm? I copied over the format from dir-spec. The formats should be equivalent. Or what do you mean?

Likewise, we're linking to dir-spec version 3.

> > We should be specific on our use of country codes. There are extensions added by the databases we are using, and we also use our own extensions.
> > Maybe we should talk to OONI and see what they are using too so we can be unified.
>
> I'm not sure what to gain from defining (or linking to) a set of allowed country codes. I consider this field mostly informational. But I don't really mind. In any case we could move forward with completing this spec and writing parsers, and we could later adapt the spec to define a subset of valid two-letter country codes.

For now we'll allow `[A-Z][A-Z]` as a valid two-letter country code, as specified in ISO 3166-1 alpha-2. We're writing these as uppercase and parsing them case-insensitively.

> > How does the "Downloaded" keyword work with signed documents? How do you see it being used?
>
> Signed documents are certainly a challenge. The issue is that this keyword is already being used: CollecTor adds it. A better choice (back then) would have been to use an annotation for this. But I think the `Created` keyword will supersede this keyword anyway. Still, it's there, which is why I included it in the spec. Maybe there's a better plan?

We might use `@downloaded-at` in CollecTor, but we're not going to specify a new line like this in version 2 of the exit list specification.

> > On point 1, this sounds OK. I am starting to think of exit lists in the new scanner context as a derived format from the raw measurement results, in a similar way that our current torperf files are derived from onionperf analysis results, which are derived from tor/tgen logs.
> >
> > As an aside, the format we are deriving from will most likely be [[https://pathspider.readthedocs.io/en/latest/using.html#data-formats|PATHspider ndjson]]. This is not important for the spec.
>
> Makes sense.
>
> > On point 2, this also sounds OK. Should we specify that an exit list should be used with a specific consensus in applications like ExoneraTor? I think no, we should always use the latest exit list and latest consensus to give the most up-to-date information available.
>
> Agreed, we should leave this up to the application.
>
> Changing back to needs_review for the open questions. Thanks!

I'm going to make changes as outlined above, and then irl is going to adapt the MAY/MUST/etc. parts.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29624#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs