Hello everyone, I'm a long-time user of Tor and a first-time poster here.

Over the last few months, I have been working on a lightweight, client-only C++ 
implementation of the Tor protocol, intended to be used as an embedded library 
in other applications. It is now at the stage where it can complete 
bootstrapping, build circuits, as well as connect to services / host hidden 
services (v3). I have been building this primarily from the spec documents 
(which have generally been extremely helpful), as well as by stepping through 
the official Tor implementation when needed for troubleshooting and to confirm 
specifics. Before I release this to the wider world, I'd like to confirm a few 
points that may not be explicitly stated in the specs.

1. In general, what are the things to look out for when implementing the Tor 
protocol beyond "making it work" and validating all data (signatures, 
timestamps, etc.)? One thing I'm concerned about is the risk of fingerprinting 
where the spec does not completely specify behaviour, e.g. the order in which 
link specifiers are passed in an EXTEND2 cell, the exact criteria for when 
circuits are explicitly destroyed, etc. (I'm very excited to see the proposals 
around CBOR on this point, which would help greatly with knowing that a 
canonical data representation was used.)
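
To make the link specifier example concrete, here is a minimal sketch of how 
a client could funnel all encoding through one function so that the order on 
the wire never depends on how the caller happened to build the list. The 
NSPEC / LSTYPE / LSLEN / LSPEC layout and the type codes follow tor-spec; the 
names LinkSpecifier, emit_rank and encode_link_specifiers are hypothetical, and 
the particular order in emit_rank is only a placeholder assumption that would 
need to be checked against what the official client actually sends.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Link specifier types as listed in tor-spec for EXTEND2 cells.
enum class LinkSpecType : std::uint8_t {
    IPv4 = 0x00,       // 4-byte address + 2-byte port
    IPv6 = 0x01,       // 16-byte address + 2-byte port
    LegacyId = 0x02,   // 20-byte legacy identity digest
    Ed25519Id = 0x03   // 32-byte Ed25519 identity key
};

// Hypothetical container for one specifier (not taken from any real library).
struct LinkSpecifier {
    LinkSpecType type;
    std::vector<std::uint8_t> body;
};

// ASSUMPTION: this ranking is a placeholder. The point is that the order is
// fixed in exactly one place; the real values should mirror the official client.
int emit_rank(LinkSpecType t) {
    switch (t) {
        case LinkSpecType::IPv4:      return 0;
        case LinkSpecType::LegacyId:  return 1;
        case LinkSpecType::Ed25519Id: return 2;
        case LinkSpecType::IPv6:      return 3;
    }
    return 4;  // unknown types go last
}

// Serialise as NSPEC, then (LSTYPE, LSLEN, LSPEC) per specifier, always in
// the same order regardless of the order of the input list.
std::vector<std::uint8_t> encode_link_specifiers(std::vector<LinkSpecifier> specs) {
    if (specs.size() > 255) {
        throw std::length_error("too many link specifiers");
    }
    std::stable_sort(specs.begin(), specs.end(),
                     [](const LinkSpecifier& a, const LinkSpecifier& b) {
                         return emit_rank(a.type) < emit_rank(b.type);
                     });

    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(specs.size()));  // NSPEC
    for (const auto& s : specs) {
        if (s.body.size() > 255) {
            throw std::length_error("link specifier too long");
        }
        out.push_back(static_cast<std::uint8_t>(s.type));         // LSTYPE
        out.push_back(static_cast<std::uint8_t>(s.body.size()));  // LSLEN
        out.insert(out.end(), s.body.begin(), s.body.end());      // LSPEC
    }
    return out;
}
```

With this shape, matching the official client's behaviour becomes a one-line 
change in emit_rank rather than an audit of every call site.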

2. When it comes to bootstrapping, the official implementation appears to 
favour accessing directories via plaintext HTTP rather than connecting on the 
OR port and using CREATE_FAST / BEGIN_DIR. What is the motivation for using the 
plaintext option (and for that matter, having a plaintext HTTP service open at 
all)? While the OR will learn just as much about the client regardless, it 
seems like the default plaintext access to directory information unnecessarily 
gives away details of how clients engage with the Tor network to third parties.

3. When using bridges and in particular pluggable transports, how is the client 
intended to safely bootstrap in the cold start case where it does not know up 
front which bridge/relay it will be connected to (e.g. when using Snowflake)? 
The RSA identity can be accepted in blind faith based on the Tor handshake, and 
it's then possible to get the full details with CREATE_FAST / BEGIN_DIR, but 
how does a client know that it has been connected to a bridge that is "blessed" 
by the Tor network rather than a MITM actor?

4. Finally, if anyone reading has been involved with or close to the 
development of other unofficial Tor implementations, what are the lessons 
learned on this front? I'm aware of, among others, Orchid (last updated in 
2016), node-Tor (does not implement ECC) and torpy (does not implement hidden 
services v3). What caused these to fail or stall?

Many thanks,
P
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
