Re: Load Balancing

2007-09-23 Thread Fabian Keil
Juliusz Chroboczek [EMAIL PROTECTED] wrote:

  I believe this results in a perceptible performance improvement for general 
  browsing.
 
 I think so too, but some people disagree.  Since I don't want to get
 into this discussion again, I refer you to the following friendly flamewar.

Additionally there's http://tor.eff.org/volunteer.html.en#Coding:

|We need a measurement study of Polipo vs Privoxy. Is Polipo in
|fact significantly faster, once you factor in the slow-down from Tor?
|Are the results the same on both Linux and Windows? Related, does Polipo
|handle more web sites correctly than Privoxy, or vice versa? Are there
|stability issues on any common platforms, e.g. Windows?

Looks like the first person who comes up with a reproducible
benchmark could make three projects happy at the same time.

 (Note that while the tone was not always as polite as it should have
 been, Fabian and I live in good friendship and mutual respect.)

I second that.

Fabian




Re: Load Balancing

2007-09-22 Thread Fabian Keil
Michael_google gmail_Gersten [EMAIL PROTECTED] wrote:

 On 9/21/07, Alexander W. Janssen [EMAIL PROTECTED] wrote:
  On 9/21/07, Nick Mathewson [EMAIL PROTECTED] wrote:
   Short answer: Tor tries to group many streams on a single circuit.  If
   we didn't, that would be way too much PK.
 
  Means, if the browser opens up 6 instances to grab stuff from a site,
  Tor consolidates all requests into a single circuit?
 
  Makes sense from a performance point of view.

 If you have a web page with 30 sub-fetches (images, style sheets,
 script files, etc), then they will all fetch over a single circuit.

That's likely, but not guaranteed. Circuits don't last forever.

Fabian




Re: Load Balancing

2007-09-22 Thread Robert Hogan
On Friday 21 September 2007 19:34:09 Alexander W. Janssen wrote:
 On 9/21/07, Arrakis [EMAIL PROTECTED] wrote:
  Hey guys, quick question.
 
  If I have Tor process running, and request a url that has 10 images to
  load from the same domain, do all the requests go through the same
  circuit, or does the tor process split up the requests across all the
  circuits?

 Interesting question. From what I understood a new circuit is created
 for every TCP-connection. If your browser grabs, for instance, 6
 images at the same time (6 loading instances == connections), Tor
 should open 6 different circuits.

 However, considering your question... It doesn't sound too efficient
 to me... The slides say "If the user wants to access a different site,
 Alice's Tor client selects a different path."
 I'm curious how strict I should read that...

 Site vs. TCP-connections?

 Alex.

The original question is answered later in the thread, but there is an 
interesting distinction between polipo and privoxy in the way this situation 
is handled:

- privoxy will use new streams on the same circuit for each of the images
- polipo will generally pipeline everything over the same stream

I believe this results in a perceptible performance improvement for general 
browsing. Someone please correct me if I've got this wrong, because I'm just 
working from observation.

-- 

Browse Anonymously Anywhere - http://anonymityanywhere.com
TorK- KDE Anonymity Manager - http://tork.sf.net
KlamAV  - KDE Anti-Virus- http://www.klamav.net



Re: Load Balancing

2007-09-22 Thread Scott Bennett
 On Fri, 21 Sep 2007 15:06:39 -0700 Michael_google gmail_Gersten
[EMAIL PROTECTED] wrote:
On 9/21/07, Alexander W. Janssen [EMAIL PROTECTED] wrote:
 On 9/21/07, Nick Mathewson [EMAIL PROTECTED] wrote:
  Short answer: Tor tries to group many streams on a single circuit.  If
  we didn't, that would be way too much PK.

 Means, if the browser opens up 6 instances to grab stuff from a site,
 Tor consolidates all requests into a single circuit?

 Makes sense from a performance point of view.

  yrs,
  Nick Mathewson

 Alex.

If you have a web page with 30 sub-fetches (images, style sheets,
script files, etc), then they will all fetch over a single circuit.

 Unless the circuit becomes old or is closed by a server going down,
of course.  Also, there is at least the possibility, however unlikely,
that one or more of the subordinate fetches will take a different circuit
because of some peculiarity in the relationship between the page and the
item on the page, e.g., the page is on a non-tor-related web server and the
item on the page is on a web server near a tor server allowing local exits
to that web server.

It does NOT make sense from a performance point of view. Since
everything will be encrypted, regardless of which circuit it takes,
there is no performance impact.

 On modern CPUs, the encryption-decryption workload has to be a minor
factor in the performance of a circuit.  That is why NumCPUs > 1 makes so
little difference.  Performance for most circuits will be limited by the
network performance characteristics of their various servers and the
Internet paths currently connecting them.

The question of network efficiency is an interesting one. A single
circuit will be slower than many circuits. However, each new circuit
will start off slower (TCP takes time to get up to speed). Many
established circuits will be faster than one established circuit. [1]

The question of anonymity is more interesting. When I asked on the
development list, I was told that using a single circuit rather than
many circuits helped users to remain anonymous. I didn't understand
the explanation, so I can't repeat it; I trust that the people who
have studied that more than I have know what they are talking about.

[1] This is more true statistically than absolutely. If you have many
circuits, some will be fast, and some slow. Rotating your usage,
concentrating on those circuits with the smallest queue, will send
more TCP channels over faster Tor circuits. However, with many
circuits, you pretty much guarantee that one will be slow. With a
single circuit, you have the "all eggs in one basket" case, and you
may have a very slow connection.

 The above [1] seems correct as far as it goes, but it needs a bit
of tweaking because separate circuits originating in a particular client
have a chance, and in the case of entry guard usage, a very high
chance, of sharing some tor server interconnections.  Tor servers normally
have no more than one open socket at a time to any other given tor server's
ORPort.  That socket may carry many circuit segments from different circuits
with no way to know, from the point of view of the tor server on either end
of the socket, whether any of those circuit segments originate from the same
client.  Each of those circuit segments may have many tunneled TCP streams
traversing it.
 Now back to those entry guards.  A client using entry guards picks a
few and connects sockets to their ORPorts.  As it builds circuits, all have
one of those entry guards as their first hop.  Therefore *all* circuits go
through one of those few (NumEntryGuards) sockets, and thus *all* tunneled
TCP streams also go through those few entry guards.  The default value of
NumEntryGuards is 3, so in that case, all circuits and all TCP streams
tunneled through them that originate from that user's browser will be split
across no more than three distinct first hops and are thus in competition
with each other for bandwidth to whichever entry guard(s) is(are) in use and
for bandwidth out the other side(s) of that(those) entry guard(s).  This
situation might also occur when UseEntryGuards = 0 because it is also the
same situation occurring whenever more than one tunneled TCP stream takes
a path sharing, at any hop, the same tor server as another tunneled TCP
stream.
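
To make the entry guard point concrete, here is a rough sketch (not Tor
code) that spreads a client's circuits over NumEntryGuards first hops and
counts how many circuits end up sharing each guard.  The uniform random
choice, the guard names and the function names are assumptions made purely
for illustration; real guard selection is more involved.

```python
import random
from collections import Counter

NUM_ENTRY_GUARDS = 3  # default value mentioned in the message above


def assign_circuits_to_guards(num_circuits, num_guards=NUM_ENTRY_GUARDS, seed=0):
    """Pick a first hop for each circuit from a fixed set of guards.

    Uniform random choice is a simplification, not Tor's selection logic.
    """
    rng = random.Random(seed)
    guards = [f"guard{i}" for i in range(num_guards)]  # hypothetical names
    return [rng.choice(guards) for _ in range(num_circuits)]


def circuits_per_guard(assignment):
    """Count how many circuits share each first hop."""
    return Counter(assignment)


assignment = assign_circuits_to_guards(10)
load = circuits_per_guard(assignment)
for guard, count in sorted(load.items()):
    print(guard, count)
```

However the ten circuits are distributed, they can never use more than
three distinct first hops, so at least one guard carries four or more of
them, and the streams on those circuits compete for that guard's bandwidth.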


  Scott Bennett, Comm. ASMELG, CFIAG
**
* Internet:   bennett at cs.niu.edu  *
**
* A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army.   *
*-- Gov. John Hancock, New York Journal, 28 January 1790 *
**


Re: Load Balancing

2007-09-22 Thread Juliusz Chroboczek
 - privoxy will use new streams on the same circuit for each of the images
 - polipo will generally pipeline everything over the same stream

Not quite.  Polipo will try to use up to n simultaneous connections to
a given server, where n is

  - 2 for a server that can do pipelining;
  - 4 for a server that can do persistent requests but not pipelining;
  - 8 for a server that cannot do persistent requests.

These magic constants are configurable.
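
As a rough illustration only (this is not Polipo's source, and the names
ServerCapability, MAX_CONNECTIONS and connections_to_open are made up for
this sketch), the rule above could be written like this, assuming that
"up to n" means we never open more connections than there are pending
requests:

```python
from enum import Enum


class ServerCapability(Enum):
    PIPELINING = "pipelining"   # server can do pipelining
    PERSISTENT = "persistent"   # persistent requests, but no pipelining
    NONE = "none"               # no persistent requests


# Limits as listed above; the table itself is illustrative, not Polipo's config.
MAX_CONNECTIONS = {
    ServerCapability.PIPELINING: 2,
    ServerCapability.PERSISTENT: 4,
    ServerCapability.NONE: 8,
}


def connections_to_open(capability, pending_requests, limits=MAX_CONNECTIONS):
    """Return how many simultaneous connections to use for one server."""
    return min(pending_requests, limits[capability])


print(connections_to_open(ServerCapability.PIPELINING, 30))
print(connections_to_open(ServerCapability.NONE, 5))
```

Passing a different limits table stands in for the fact that the constants
are configurable.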

Ideally, Polipo should choose the number of simultaneous connections
depending on an estimate of average queue length, but I haven't
thought about it seriously yet.

 I believe this results in a perceptible performance improvement for general 
 browsing.

I think so too, but some people disagree.  Since I don't want to get
into this discussion again, I refer you to the following friendly flamewar.
(Note that while the tone was not always as polite as it should have
been, Fabian and I live in good friendship and mutual respect.)

Me:
  http://archives.seul.org/or/talk/Apr-2007/msg00056.html

Fabian Keil:
  http://archives.seul.org/or/talk/Apr-2007/msg00063.html

Me:
  http://archives.seul.org/or/talk/Apr-2007/msg00066.html

Me clarifying:
  http://archives.seul.org/or/talk/Apr-2007/msg00069.html

You may also find this paper interesting:

  http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html

Juliusz


Re: Load Balancing

2007-09-22 Thread Roger Dingledine
On Fri, Sep 21, 2007 at 03:06:39PM -0700, Michael_google gmail_Gersten wrote:
 If you have a web page with 30 sub-fetches (images, style sheets,
 script files, etc), then they will all fetch over a single circuit.
 
 It does NOT make sense from a performance point of view. Since
 everything will be encrypted, regardless of which circuit it takes,
 there is no performance impact.

Actually, here's one reason why using circuits one by one rather than
all at once is good for performance and ultimately good for security.

Imagine you make five circuits preemptively. Scenario one is that you
use them one by one, discarding them and switching to a new one when the
current one becomes dirty (defined as first used more than 10 minutes
ago). Scenario two is that you use all five of them for fetching your
big web site, discarding them when they become dirty.

Now compare the two scenarios in terms of total number of circuits the
user needs to make over the course of a day.
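
A back-of-the-envelope version of that comparison, assuming (purely for
the sake of the arithmetic) a user who is active around the clock, so
that every circuit is replaced as soon as it becomes dirty.  The function
name and the "active all day" assumption are mine, not part of Tor:

```python
DIRTY_MINUTES = 10            # circuit is dirty 10 minutes after first use
MINUTES_PER_DAY = 24 * 60


def circuits_per_day(parallel_circuits,
                     active_minutes=MINUTES_PER_DAY,
                     dirty_minutes=DIRTY_MINUTES):
    """Circuits consumed if `parallel_circuits` are in use at any time."""
    windows = -(-active_minutes // dirty_minutes)  # ceiling division
    return parallel_circuits * windows


one_by_one = circuits_per_day(1)    # scenario one
all_at_once = circuits_per_day(5)   # scenario two

print(one_by_one, all_at_once)      # 144 720
```

Under that assumption scenario two needs five times as many circuits per
day as scenario one.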

Once upon a time, the value of 10 minutes was actually more like 1
minute. You see, the shorter it is, the fewer actions from the user are
linkable with each other based on being in the same circuit. But Tor
server operators complained they were using 100% cpu because they were
constantly handling new circuit creation requests. So we moved it back to
10 minutes -- bad for user security, but necessary to keep things working.

If the user started churning through circuits at five times the current
rate, we may end up forced to move the 10 minute value back even farther
to compensate, resulting in even more user connections becoming linkable.

Now, this isn't the whole story. Maybe there are really
good security improvements that can be had by not fetching
the whole site over a single connection -- see e.g. item #1 on
https://tor.eff.org/volunteer#Research or (a more tenuous connection)
http://freehaven.net/anonbib/#pet05-serjantov
But on the third hand, see
http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#SplitStreams

Until somebody works through both the security issues and the performance
issues in a convincing way, we will likely just stick with the current
behavior.

--Roger



Re: Load Balancing

2007-09-22 Thread Anthony DiPierro
On 9/22/07, Roger Dingledine [EMAIL PROTECTED] wrote:
 Once upon a time, the value of 10 minutes was actually more like 1
 minute. You see, the shorter it is, the fewer actions from the user are
 linkable with each other based on being in the same circuit. But Tor
 server operators complained they were using 100% cpu because they were
 constantly handling new circuit creation requests. So we moved it back to
 10 minutes -- bad for user security, but necessary to keep things working.

 If the user started churning through circuits at five times the current
 rate, we may end up forced to move the 10 minute value back even farther
 to compensate, resulting in even more user connections becoming linkable.

Why does new circuit creation use up so much CPU?  Are you spawning a
new thread for each circuit, or something?  If so, maybe there's a way
to be more efficient instead?

I know, I know, I shouldn't complain if I'm not going to offer code.
So don't consider this a complaint, just a question.


Re: Load Balancing

2007-09-22 Thread Roger Dingledine
On Sat, Sep 22, 2007 at 03:51:07PM -0400, Anthony DiPierro wrote:
  If the user started churning through circuits at five times the current
  rate, we may end up forced to move the 10 minute value back even farther
  to compensate, resulting in even more user connections becoming linkable.
 
 Why does new circuit creation use up so much CPU?  Are you spawning a
 new thread for each circuit, or something?  If so, maybe there's a way
 to be more efficient instead?

It's because of the crypto. Each server on the path needs to do two
public key operations for each new circuit -- one to ensure perfect
forward secrecy, and the other for authentication.
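
To put rough numbers on that, here is a small sketch.  The two operations
per server come from the paragraph above; the three-hop path and the
user who is active all day are assumptions for illustration, and the
function names are made up:

```python
PK_OPS_PER_SERVER_PER_CIRCUIT = 2   # as stated above
PATH_LENGTH = 3                     # assumption: a three-hop circuit


def pk_ops_per_circuit(path_length=PATH_LENGTH):
    """Public key operations done by all servers on one new circuit."""
    return path_length * PK_OPS_PER_SERVER_PER_CIRCUIT


def pk_ops_per_day(dirty_minutes, path_length=PATH_LENGTH,
                   active_minutes=24 * 60):
    """Server-side public key operations caused by one always-active user."""
    circuits = active_minutes // dirty_minutes
    return circuits * pk_ops_per_circuit(path_length)


print(pk_ops_per_day(10))   # 864
print(pk_ops_per_day(1))    # 8640
```

Going from a 10 minute to a 1 minute circuit lifetime multiplies the
public key work by ten, which matches the direction of the CPU complaints
described earlier in the thread.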

See e.g.
https://tor.eff.org/svn/trunk/doc/design-paper/tor-design.html#subsec:circuits
or Section 5.1 of
https://tor.eff.org/svn/trunk/doc/spec/tor-spec.txt

If you still want to read after those, also see
http://freehaven.net/anonbib/#overlier-pet2007
and
http://freehaven.net/anonbib/full/date.html#kate-pet2007

--Roger



Re: Load Balancing

2007-09-21 Thread Arrakis
Alex,

That is exactly the distinction I am looking for.

Does Tor care about the destination of the TCP request, when deciding to
make a new circuit, and thus will use one because it is already dirtied
by that domain?

Steve

Alexander W. Janssen wrote:
 However, considering your question... It doesn't sound too efficient
 to me... The slides say "If the user wants to access a different site,
 Alice's Tor client selects a different path."
 I'm curious how strict I should read that...
 
 Site vs. TCP-connections?
 
 Alex.



Re: Load Balancing

2007-09-21 Thread Alexander W. Janssen
On 9/21/07, Arrakis [EMAIL PROTECTED] wrote:
 Does Tor care about the destination of the TCP request, when deciding to
 make a new circuit, and thus will use one because it is already dirtied
 by that domain?

s/domain/IP-address ?

However, that's all up to the implementation of the internal SOCKS-proxy, too!
Just think: Your browser might send out 6 different requests in
different connections, but you don't know what the SOCKS-interface of
Tor makes of it... It might try to be clever and queue it up to a
single circuit.

Not that this is bad, but interesting to know.

If we get an answer, we should add it to the Tor Tech FAQ. It's a
pretty interesting question.

 Steve

Alex.

-- 
I am tired of all this sort of thing called science here... We have spent
millions in that sort of thing for the last few years, and it is time it
should be stopped.
 -- Simon Cameron, U.S. Senator, on the Smithsonian Institution, 1901.




Re: Load Balancing

2007-09-21 Thread Michael_google gmail_Gersten
On 9/21/07, Alexander W. Janssen [EMAIL PROTECTED] wrote:
 On 9/21/07, Nick Mathewson [EMAIL PROTECTED] wrote:
  Short answer: Tor tries to group many streams on a single circuit.  If
  we didn't, that would be way too much PK.

 Means, if the browser opens up 6 instances to grab stuff from a site,
 Tor consolidates all requests into a single circuit?

 Makes sense from a performance point of view.

  yrs,
  Nick Mathewson

 Alex.

If you have a web page with 30 sub-fetches (images, style sheets,
script files, etc), then they will all fetch over a single circuit.

It does NOT make sense from a performance point of view. Since
everything will be encrypted, regardless of which circuit it takes,
there is no performance impact.

The question of network efficiency is an interesting one. A single
circuit will be slower than many circuits. However, each new circuit
will start off slower (TCP takes time to get up to speed). Many
established circuits will be faster than one established circuit. [1]

The question of anonymity is more interesting. When I asked on the
development list, I was told that using a single circuit rather than
many circuits helped users to remain anonymous. I didn't understand
the explanation, so I can't repeat it; I trust that the people who
have studied that more than I have know what they are talking about.

[1] This is more true statistically than absolutely. If you have many
circuits, some will be fast, and some slow. Rotating your usage,
concentrating on those circuits with the smallest queue, will send
more TCP channels over faster Tor circuits. However, with many
circuits, you pretty much guarantee that one will be slow. With a
single circuit, you have the "all eggs in one basket" case, and you
may have a very slow connection.
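
For what it's worth, the "smallest queue" idea in [1] might look something
like the sketch below.  This is hypothetical scheduling, not what Tor
does (as noted, Tor groups streams on a single circuit); the Circuit class
and both function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    name: str
    queued_streams: int = 0


def pick_least_loaded(circuits):
    """Return the circuit with the smallest queue (first one wins ties)."""
    return min(circuits, key=lambda c: c.queued_streams)


def dispatch(circuits, num_streams):
    """Send each new stream to whichever circuit currently has the shortest queue."""
    placements = []
    for _ in range(num_streams):
        circuit = pick_least_loaded(circuits)
        circuit.queued_streams += 1
        placements.append(circuit.name)
    return placements


circuits = [Circuit("a", 3), Circuit("b", 0), Circuit("c", 1)]
placements = dispatch(circuits, 4)
print(placements)
```

Queue length is only a stand-in for speed here; a real implementation
would need some actual measure of how fast each circuit is.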