Re: [tor-dev] Status report - Stream-RTT

2013-10-08 Thread ra
On Saturday 10 August 2013 02:37:48 Damian Johnson wrote:
  Yup. It's unfortunate that tor decided to include an 'Exit' flag with
  such an unintuitive meaning. You're not the first person to be
  confused by it.
  
  Is this meaning at least documented somewhere and I have just read over
  it?
 
  Here's the relevant part of the spec...
 
 https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt#l1738

Patch submitted in ticket 9932[0].

Best,
Robert

[0] https://trac.torproject.org/projects/tor/ticket/9932


signature.asc
Description: This is a digitally signed message part.
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


[tor-dev] Status report - Stream-RTT

2013-09-14 Thread ra

tl;dr
When building a circuit, measuring the RTT a single time could provide better 
latency and anonymity while not affecting throughput. Multiple measurements 
could be used for running real-time applications like VoIP or optimizing  
throughput.


Despite the Tor network currently being in an unusual state, so to speak, I
have spent the last weeks looking into stream-RTT data of circuits. I gathered
the data shortly before and at the beginning of the huge botnet usage. This is
what I found:

As assumed, stream-RTT measurements of a single circuit do not sit at a fixed
value but are distributed, since they are subject to multiple influences. After
comparing stream-RTT distributions of multiple circuits, I found many
different shapes and realized that no single distribution fits them all.

The Time-To-First-Byte (TTFB) for fetching a small website over HTTP is used
to approximate the latency of a circuit. I used different methods to check the
correlation between the RTT of a circuit and its TTFB, all indicating a very
high correlation. Hence, the stream-RTTs of a circuit make a good estimator
for its TTFB and therefore its latency.

In terms of latency, using a single stream-RTT measurement (First-RTT)
performs better than the currently used method, CBT (circuit build timeout).
So far I haven't done any testing or calculations on the other metrics,
bandwidth and anonymity. I would assume the former to be unaffected by
First-RTT. The latter could probably be slightly increased if the percentage
of discarded circuits were reduced from 20% with CBT to 10% or 15% with
First-RTT, while still achieving a minor improvement in latency.

However, I would not recommend using First-RTT as a method for providing
low-latency circuits to applications, because it only gives a small hint about
the quality of a circuit and cannot guarantee that particular latency
properties hold for it. Still, First-RTT works pretty well compared to the
minimal effort it takes.
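
A minimal sketch of the First-RTT idea, assuming each candidate circuit has
exactly one RTT sample and that the slowest share is dropped by a simple rank
cutoff. The function name, the sample values and the cutoff rule are
illustrative assumptions; they are not taken from rttprober or from Tor's CBT
code.

```python
def keep_fastest(first_rtts, discard_fraction=0.2):
    """Return circuit ids whose single RTT sample is not among the slowest.

    first_rtts maps a circuit id to its one measured stream-RTT in seconds.
    discard_fraction is the share of circuits to drop: 0.2 mirrors the 20%
    quoted for CBT, 0.1 or 0.15 are the alternatives mentioned above.
    """
    if not 0 <= discard_fraction < 1:
        raise ValueError("discard_fraction must be in [0, 1)")

    ranked = sorted(first_rtts, key=first_rtts.get)  # fastest first
    # Small epsilon guards against float rounding just below a whole number.
    n_discard = int(len(ranked) * discard_fraction + 1e-9)
    return ranked[:len(ranked) - n_discard]


# Hypothetical measurements, in seconds.
samples = {"c1": 0.31, "c2": 0.95, "c3": 0.42, "c4": 0.27, "c5": 1.80}
kept = keep_fastest(samples, discard_fraction=0.2)
```

With five circuits and a 20% cutoff, only the slowest one is dropped. A
smaller discard_fraction keeps more circuits in the pool, which is the
trade-off against latency described above.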

Additionally, I experimented a lot with methods to provide a better estimator
for the latency properties of a circuit. They all need far more than a single
measurement and are therefore out of scope for the common case. Besides, they
cannot protect against suddenly changing circuit conditions. They could,
however, be used to enforce an application-specific maximum RTT for real-time
applications like VoIP. With similar techniques it should be possible to
detect circuits that include a node that's within its bandwidth limit. This
could be used for providing high-bandwidth circuits for applications like
BitTorrent.
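
A rough sketch of how several RTT samples could be checked against an
application-specific maximum. It assumes a nearest-rank percentile and a
minimum sample count; the function name, parameters and thresholds are
hypothetical and not part of the measurements described here.

```python
import math


def meets_max_rtt(rtt_samples, max_rtt, percentile=95, min_samples=5):
    """Decide whether a circuit's measured RTTs stay under max_rtt.

    Returns None when there are too few samples to judge, otherwise True if
    the given percentile (nearest-rank method) of the samples is <= max_rtt.
    """
    if len(rtt_samples) < min_samples:
        return None

    ordered = sorted(rtt_samples)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1] <= max_rtt


# Hypothetical samples in seconds, checked against a 400 ms budget.
voip_ok = meets_max_rtt([0.21, 0.25, 0.24, 0.30, 0.22, 0.26], max_rtt=0.4)
too_few = meets_max_rtt([0.21, 0.25], max_rtt=0.4)
```

A check like this only describes the samples already taken, so it shares the
limitation noted above: it says nothing about conditions that change after the
last measurement.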

Best,
Robert




Re: [tor-dev] Status report - Stream-RTT

2013-08-12 Thread ra
On Saturday 10 August 2013 23:52:44 Damian Johnson wrote:
 If I understand this correctly you're thinking that multiple calls to
 extend_circuit() cause parallel EXTENDCIRCUIT requests, and the first
 response would be used for both callers. Is that right?

Yes.

 If so then I would be very interested if you actually see that
 behaviour. Stem provides thread safe controller communication. See the
 msg() method of the BaseController - though the Controller's methods
 are called in parallel the actual socket requests are done in serial
 to prevent that exact issue that you describe.

That looks fine to me. I obviously drew the wrong conclusion from the issues I 
have encountered. My fault, sorry.

Best,
Robert




Re: [tor-dev] Status report - Stream-RTT

2013-08-10 Thread ra
On Saturday 10 August 2013 02:37:48 Damian Johnson wrote:
 Hi Robert. Here's the relevant part of the spec...
 
 https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt#l1738

Thanks. I will try to make that part more clear and open a ticket. 

  If requests are sent to Tor to create more than a single circuit at once,
  the mapping between circuit events and create-request is unknown because
  the circuit ID is not known until the LAUNCHED-event has been received.
  This is clearly an issue on Tor's side but one could argue that Stem
  should stop me from using it that way.
 
 Not sure that I follow. The extend_circuit() returns the circuit id
 (it's provided by the EXTENDCIRCUIT call). Are you saying that tor's
 EXTENDCIRCUIT response is wrong when done in parallel?

As far as I understand it, it's not necessarily wrong, but it might be the case
that a response that does not belong to the call is received first. Assume a
single program making two extend_circuit() calls within a short time. If the
first EXTENDED response is delayed for some reason, both calls receive the
EXTENDED response belonging to the second call, so both calls use the same
circuit ID. Another case, again with a single program making two
extend_circuit() calls within a short time: if the second call is made before
the first EXTENDED response is received, the second call will use the EXTENDED
response from the first call when it arrives, so again both calls use the same
circuit ID. Therefore the await_build parameter should be True by default
IMHO. In any case, it should be made clear that the await_build parameter
doesn't help when extend_circuit() is used by two separate programs/threads
that run concurrently. The user then has to do the locking of (at least) the
LAUNCHED event herself.

Besides, I could not find any filtering of Tor-internal circuit events. If a
Tor-internal circuit EXTENDED event occurs during an extend_circuit() call, the
wrong circuit ID will be used.

I hope this is not too confusing.

Best,
Robert




Re: [tor-dev] Status report - Stream-RTT

2013-08-10 Thread Damian Johnson
 As far as I understand it it's not necessarily wrong but it might be the case
 that a response that does not belong to the call is received first: Assume a
 single program making two extend_circuit() calls within a short time. If the
 first EXTENDED response is delayed for some reason, both calls receive the
 EXTENDED response belonging to the second call - both calls use the same
 circuit ID.

If I understand this correctly you're thinking that multiple calls to
extend_circuit() cause parallel EXTENDCIRCUIT requests, and the first
response would be used for both callers. Is that right?

If so then I would be very interested if you actually see that
behaviour. Stem provides thread safe controller communication. See the
msg() method of the BaseController - though the Controller's methods
are called in parallel the actual socket requests are done in serial
to prevent that exact issue that you describe.

Apologies if I'm misunderstanding what you're describing. -Damian


Re: [tor-dev] Status report - Stream-RTT

2013-08-09 Thread Damian Johnson
 Yup. It's unfortunate that tor decided to include an 'Exit' flag with
 such an unintuitive meaning. You're not the first person to be
 confused by it.

 Is this meaning at least documented somewhere and I have just read over it?

Hi Robert. Here's the relevant part of the spec...

https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt#l1738

 What kind of issue does that encounter? Is it a problem with stem's
 thread safety or an issue on tor's side?

 If requests are sent to Tor to create more than a single circuit at once, the
 mapping between circuit events and create-request is unknown because the
 circuit ID is not known until the LAUNCHED-event has been received.
 This is clearly an issue on Tor's side but one could argue that Stem should
 stop me from using it that way.

Not sure that I follow. The extend_circuit() returns the circuit id
(it's provided by the EXTENDCIRCUIT call). Are you saying that tor's
EXTENDCIRCUIT response is wrong when done in parallel?

 Not quite. The connect_port() function never raises an exception.
 Rather, if it fails to establish a control connection then it prints
 the issue to stdout and returns None. Also, the connection it provides
 is already authenticated.

 If Tor has ControlPort enabled without having HashedControlPassword set,
 authenticate() has to be called to authenticate the connection.
 Though this is not recommended I don't know which other default setting would
 be more appropriate.

I think there's some misunderstanding. Yes, when you establish a new
controller connection you need to call authenticate(), even if Tor
doesn't require any credentials.

connect_port() is a convenience function that does everything
(including authentication) for you. If tor requires a password then it
gives the user a password prompt. If it runs into an error then it
prints an explanation of the failure and returns None. Sounds like I
need some more documentation here...

Cheers! -Damian


Re: [tor-dev] Status report - Stream-RTT

2013-08-04 Thread Damian Johnson
Hi Robert, sorry about the delay. I couldn't sink the time a reply to
this thread deserved until now.

 -) When I wanted to check if a certain node is an exit node it took me some
 time to figure out that looking for an exit flag is not sufficient because
 some nodes are in fact exit nodes but don't have an exit flag.

Yup. It's unfortunate that tor decided to include an 'Exit' flag with
such an unintuitive meaning. You're not the first person to be
confused by it.

 One has to look at
 the node's exit policy, which is inaccessible by default because of
 microdescriptors. Maybe returning some meaningful message when one uses
 get_server_descriptor() and microdescriptors are enabled would help..?

Good idea! Done...

https://gitweb.torproject.org/stem.git/commitdiff/e78f1b7

 -) It is not safe to use extend_circuit in parallel for creating new circuits.
 I think this is not mentioned anywhere.

What kind of issue does that encounter? Is it a problem with stem's
thread safety or an issue on tor's side?

 or would like a code review then let me
 know.

 That would be awesome!

Few things I'm spotting offhand...

 self._lock.acquire()

Manual lock handling is risky. If anything within this block raises an
exception (and there are several points throughout your script where you
use Controller methods that can potentially raise errors) then the
lock won't be released.

The safer way of doing this is to use the 'with' keyword...

with self._lock:
  # do stuff

This is the same as...

self._lock.acquire()
try:
  # do stuff
finally:
  self._lock.release()
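
A small self-contained sketch showing the behaviour described above: the lock
is released even though the block raises. The function name and the simulated
error are made up for illustration.

```python
import threading

lock = threading.Lock()


def risky_update():
    # The 'with' block releases the lock on normal exit and on exceptions.
    with lock:
        raise RuntimeError("simulated failure inside the critical section")


try:
    risky_update()
except RuntimeError:
    pass

# If the lock had leaked, locked() would still report True here.
released_after_error = not lock.locked()
```

With a bare acquire() and no try/finally, the same failure would leave the
lock held and every later caller would block.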

 def read(self):
   ...
   return None

Not necessary. Methods return None by default.

 # pylint: disable-msg=R0902

You might want to look into pyflakes and pep8. I've found them to be
better static analysis tools.

 try:
   controller = connect_port()
 except SocketError:
   sys.stderr.write("ERROR: Couldn't connect to Tor.\n")
   sys.exit(1)
 controller.authenticate()

Not quite. The connect_port() function never raises an exception.
Rather, if it fails to establish a control connection then it prints
the issue to stdout and returns None. Also, the connection it provides
is already authenticated.

This should instead be...

controller = connect_port()

if not controller:
  sys.exit(1)  # failed to get a control connection


Cheers! -Damian


Re: [tor-dev] Status report - Stream-RTT

2013-07-28 Thread ra
On Saturday 27 July 2013 04:41:45 Damian Johnson wrote:
 Hi ra, glad to see that you're using stem! 

Sure, stem works really great!

 If you have any questions,
 suggestions, feature requests, 

These parts were a bit tricky to figure out for me:
-) When I wanted to check if a certain node is an exit node it took me some
time to figure out that looking for an exit flag is not sufficient because some
nodes are in fact exit nodes but don't have an exit flag. One has to look at
the node's exit policy, which is inaccessible by default because of
microdescriptors. Maybe returning some meaningful message when one uses
get_server_descriptor() and microdescriptors are enabled would help..?
-) It is not safe to use extend_circuit in parallel for creating new circuits. 
I think this is not mentioned anywhere.
-) Router status V2/V3 also took me some time but this has already been fixed.
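
For the first point, a simplified sketch of why "has the Exit flag" and "can
exit at all" differ. It assumes the flag roughly requires exits to at least
two of the ports 80, 443 and 6667, and it ignores the address-space part of
that rule. The helper names and the example policies are hypothetical, and
only wildcard-address rules are evaluated; this is not Stem's ExitPolicy.

```python
def port_matches(port_spec, port):
    """Match '*', a single port like '80', or a range like '6660-6667'."""
    if port_spec == "*":
        return True
    low, _, high = port_spec.partition("-")
    return int(low) <= port <= int(high or low)


def accepts_port(policy_rules, port):
    """First-match evaluation of wildcard-address rules for one port."""
    for rule in policy_rules:
        action, target = rule.split()
        address, _, port_spec = target.rpartition(":")
        if address != "*":
            continue  # simplification: address-specific rules are skipped
        if port_matches(port_spec, port):
            return action == "accept"
    return True  # no rule matched; this sketch treats that as accepted


def can_exit_at_all(policy_rules):
    return any(accepts_port(policy_rules, p) for p in range(1, 65536))


def would_get_exit_flag(policy_rules):
    return sum(accepts_port(policy_rules, p) for p in (80, 443, 6667)) >= 2


ssh_only = ["accept *:22", "reject *:*"]
web_exit = ["accept *:80", "accept *:443", "reject *:*"]
```

Under these assumptions the ssh_only relay can exit but would not carry the
flag, which is the case that caused the confusion above.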

 or would like a code review then let me
 know. 

That would be awesome!

 As of just four weeks ago the Controller started providing v3
 responses

I missed that obviously. Fixed in [0].

  circ.build_flags.count('IS_INTERNAL') == 0

 This would more commonly be done as...
 'IS_INTERNAL' not in circ.build_flags

Fixed in [0].

  try:
    controller.reset_conf('__DisablePredictedCircuits')
    controller.reset_conf('__LeaveStreamsUnattached')
    controller.close()
  except NameError:
    pass
 
 What raises a NameError?

This was a leftover where it has been possible that controller doesn't exist 
at that time. Fixed in [0].

  # close circuit, but ignore if it does not exist anymore
  
  try:
    self._controller.get_circuit(self._cid)
    self._controller.close_circuit(self._cid)
  except (ValueError, InvalidArguments):
    pass
 
 What is the purpose of the get_circuit() call? If it's not superfluous

It doesn't do any harm but is definitely superfluous. Fixed in [0].

  try:
    controller = Controller.from_port()
  except SocketError:
    sys.stderr.write("ERROR: Couldn't connect to Tor.\n")
    sys.exit(1)

  controller.authenticate()
 
 This is certainly a fine way of doing it, but you might want to also
 look at connection.connect_port()...
 
 https://stem.torproject.org/api/connection.html#stem.connection.connect_port
 
 It is intended to be a quick and easy method of getting a Controller
 for command-line applications. For instance, it will present a
 password prompt if tor is configured to use password authentication.
 Just realized I should have included it in a tutorial somewhere...

I didn't know that. Since the script now depends on stem version  1.0.1 
anyway, I integrated it.

Thank you for your feedback so far!

 Your code looks great! If you wouldn't mind I'd love to reference it
 on stem's examples page...

Sure, go ahead.

 Shall I reference 'https://bitbucket.org/ra_/tor-rtt/' or do you
 anticipate your project having a more permanent home? (this might be a
 question for Mike as much as you)

I would not mind but I don't have any plans for that. Mike only asked me to 
make the code accessible online.

Best,
Robert

[0] https://bitbucket.org/ra_/tor-rtt/commits/666e0b173871ba3f699c8bc07bfb156f653adf7a




[tor-dev] Status report - Stream-RTT

2013-07-26 Thread ra
Hi all!

During the last weeks I have been very busy working on my GSoC project which 
is about reducing the RTT of preemptively built circuits. 

There is now a single script called rttprober[0] that depends on a 
patched[1] Tor client running a certain configuration[2]. The goal is to  
measure RTTs of Tor circuits. It takes a few parameters as input: an 
authenticated Stem Tor controller for communication with the Tor client, the 
number of circuits to probe, the number of probes to be taken for each circuit 
and the number of circuits that should be probed concurrently. It outputs a 
tar file containing lzo-compressed serialized data with detailed node 
information, all circuit- and stream-events involved and the circuit build 
time for further analysis.
Since the RTT-measurements are run in parallel with very short locks it is 
important not to overload Tor nodes. Therefore a single node is not probed 
more than once at a time.
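
A minimal sketch of one way to enforce "no node probed more than once at a
time", assuming paths are lists of relay identifiers and that all probing
threads share one reservation object. The class and method names are
hypothetical and not copied from rttprober.

```python
import threading


class NodeReservations:
    """Track which relays are currently part of a circuit being probed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = set()

    def try_reserve(self, path):
        """Reserve every relay in path, or none if any is already in use."""
        with self._lock:
            if self._busy.intersection(path):
                return False
            self._busy.update(path)
            return True

    def release(self, path):
        """Free the relays of a path once its probes are done."""
        with self._lock:
            self._busy.difference_update(path)


reservations = NodeReservations()
first = reservations.try_reserve(["A", "B", "C"])
overlapping = reservations.try_reserve(["C", "D", "E"])  # shares relay C
reservations.release(["A", "B", "C"])
retry = reservations.try_reserve(["C", "D", "E"])
```

The lock is held only for the set operations, which keeps the critical
section short while probes themselves run in parallel.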

A first analysis of some measurements taken supports the original assumption 
that a Fréchet distribution fits both the circuit build times[3] and round trip 
times[4].
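
For reference, a small sketch of the Fréchet CDF and its inverse, which is
what a fitted distribution would be used for when picking a cutoff. The
parameter values below are made up and are not fitted to the linked plots.

```python
import math


def frechet_cdf(x, alpha, scale=1.0, location=0.0):
    """P(X <= x) for a Frechet distribution with shape alpha > 0."""
    if x <= location:
        return 0.0
    return math.exp(-(((x - location) / scale) ** -alpha))


def frechet_quantile(p, alpha, scale=1.0, location=0.0):
    """Inverse of frechet_cdf for 0 < p < 1."""
    return location + scale * (-math.log(p)) ** (-1.0 / alpha)


# Hypothetical parameters: the time below which 80% of samples would fall.
cutoff_80 = frechet_quantile(0.8, alpha=3.0, scale=0.5)
```

Feeding the quantile back into the CDF returns the original probability,
which is a quick sanity check for any fitted parameters.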

I will continue gathering and analyzing measurement data and will hopefully be 
able to draw some conclusions from that.

Best,
Robert


[0] https://bitbucket.org/ra_/tor-rtt/src/1127f6936086664981fc55b4dbc82b1570714140/rttprober.py?at=master
[1] https://bitbucket.org/ra_/tor-rtt/src/1127f6936086664981fc55b4dbc82b1570714140/patches?at=master
[2] https://bitbucket.org/ra_/tor-rtt/src/1127f6936086664981fc55b4dbc82b1570714140/torrc?at=master
[3] http://postimg.org/image/je8k5yydt/
[4] http://postimg.org/image/ktk90vxm7/


