Walking Onions: week 7 update

 On our current grant from the zcash foundation, I'm working on a
full specification for the Walking Onions design.  I'm going to try to
send out these updates once a week.

My previous updates are linked below:

 Week 1:
   formats, preliminaries, git repositories, binary diffs,
   metaformat decisions, and Merkle Tree trickery.

   https://lists.torproject.org/pipermail/tor-dev/2020-March/014178.html

 Week 2:
   Specifying details of SNIP and ENDIVE formats, in CDDL.

   https://lists.torproject.org/pipermail/tor-dev/2020-March/014181.html

 Week 3:
   Expanding ENDIVEs into SNIPs.

   https://lists.torproject.org/pipermail/tor-dev/2020-March/014194.html

Week 4:
   Voting (part 1) and extending circuits.

   https://lists.torproject.org/pipermail/tor-dev/2020-March/014207.html

Week 5:
   Voting (part 2)

   https://lists.torproject.org/pipermail/tor-dev/2020-April/014218.html

Week 6:
   Editing, voting rules, and client behavior (part 1)

   https://lists.torproject.org/pipermail/tor-dev/2020-April/014222.html


== Clustering exit ports

This week I tried to figure out how exits work with walking onions.
A naive approach might have 65535 different routing indices in the
ENDIVE (one for each TCP port), but that would give horrible
performance: it would make the expanded ENDIVEs absolutely huge.
Instead, we can cluster ports into "classes" based on which are
always supported or not supported together.  (For example, on the
day I scanned the directory, every relay supporting any port in the
range 49002-50000 supported all of them.)  There are about 200 such
classes on the current Tor network.

But 200 exit indices is still probably too much: it seems like we'll
need to cluster exit ports together in a more lossy way.  For
example, not every exit that supports port 80 supports port 443 or
vice versa.  But around 98% of them do -- so it we wouldn't be
losing too much by using one exit index for "443 and 80", and not
listing the 2% of exits that only support one of those ports.

I wrote a short program [EXIT-ANALYSIS] to experiment with
clustering strategies here, and settled on a greedy algorithm to get
down to a smaller number of port classes by iteratively combining
whichever pair of classes will cause us to lose the fewest number of
(relay:port) possibilities.  We can get down to 16 classes and lose
only 0.07% of all the relay:port possibilities; we can get down to 8
classes and lose only 0.2%.

That's not the whole story, though.  Some of what we're losing would
be things we'd miss.  When the algorithm gives us 16 classes, it
loses 75% of port 10000, 70% of port 23, and so on.  Clearly not all
ports are equivalent, and we should tune this approach to take that
into account.  We should also consider the bandwidth impact of these
changes, and retool the algorithm so that it minimizes the
bandwidth-per-port combination we have.

We'd also likely want to hand-edit these port lists
so that they make more sense semantically on their own.  That way,
we could take Teor's suggestion from last month and try to give exit
relays the options to opt in to or out of human-intelligible
partitions.

Further, we could save some complexity by requiring dual-stack exits
to support the same ports on IPv4 and IPv6.  That way we wouldn't
need to worry about divergent sets of port classes.

== So what do we do with these clusters?

If a client wants to connect to port 25, and uses a port-25-only
index, that leaks the client's intention to the middle node.  That's
not so good.  Instead, in the tech report [WHITEPAPER], one or both
of Chelsea Komlo and Ian Goldberg came up with a mechanism called
Delegated Verifiable Selection (DVS).  With DVS, when the client
sends a BEGIN cell, the relay can respond that it does not support
the requested port, along with a SNIP from a relay that does.  To
prevent the relay from choosing an arbitrary SNIP, the BEGIN cell
would have to include an auxiliary index value to be used when DVS
is in place, and the END cell would need room for a SNIP.

(Alternatively, clients could just make a 3-hop circuit and ask it
for SNIPs to be used as exits on some other circuit.)

== Voting on policy sets

These policy clusters (and a few other things!) are too complex for
the voting operations I had before.  Fortunately, it's safe to treat
them as opaque.

I've added a new voting operation to allow authorities to vote on a
bstr that is then decoded and treated as a cbor object (if it is valid
and well-formed).  It should be useful in other places too.

== Migrating port classes

Migrating from one set of port classes to another is nontrivial.
Clients who have the old root document will expect the indices to
correspond to one partition of ports, whereas clients with the new
root document will expect another partition.  To solve this, I've
designed the system so that two sets of exit indices can exist
concurrently while clients are moving to the newer partition of
ports.

Alternatively, we could give SNIPs a mechanism to contain a field
saying "Your root document must be at least this new."  I don't know
which approach would be simpler; both seem like too much complexity for a
feature that would only be used occasionally.

== Next steps

My next plan here is to write up onion services; I believe I've got
the design there mostly figured out, and just need to write it down.

After that, I'm planning to turn back and fill in all the missing
subsections that I've written so far, and write a short
introduction.  There are a lot of missing sections, so that won't be
super fast.  I'm also going to need to show the whole thing to a
beta reader or two to answer the question, "Would this make sense to
somebody who doesn't already know what it says?"


== Fun facts

The script to generate exit port classes [EXIT-ANALYSIS] was written in
Rust. If I'm counting right, it's only my second Rust program that
actually does something useful!  If you are one of those Rust
enthusiasts who would like to critique it, I wouldn't mind at all --
just send me comments offline or make notes on the repository.

(Only do this if you like critiquing newbie Rust that was never
actually meant for production use.  Don't do this if you just want
to point out all the things that make it unsuitable for production
use.)

== Not-so-fun facts

This week I've been fairly overwhelmed with the lead up to and down
from the recent layoffs at Tor, in response to a downturn in funding
during the COVID-19 pandemic [BLOGPOST]. Tor will continue, and this
work will continue with it--but it is still a hard blow to have to
lose so many excellent coworkers. (Yes, I'm still at Tor, for the
record.)

Do you know somebody who is interested in hiring one or more amazingly
talented programmers, engineers, communications specialists, or project
managers -- all with experience at productive remote work and all with
dedication to improving people's lives?  Please pass their information
along to tor-al...@lists.torproject.org , and it will (after moderation)
get forwarded to those affected.


[BLOGPOST] https://blog.torproject.org/covid19-impact-tor

[EXIT-ANALYSIS]
https://github.com/nmathewson/walking-onions-wip/tree/master/exit-analysis

[EXIT-PARTITION]
https://lists.torproject.org/pipermail/tor-dev/2020-March/014182.html

[WHITEPAPER] https://crysp.uwaterloo.ca/software/walkingonions/
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

Reply via email to