Re: Load Balancing
Juliusz Chroboczek [EMAIL PROTECTED] wrote:

> > I believe this results in a perceptible performance improvement for
> > general browsing.
>
> I think so too, but some people disagree. Since I don't want to get
> into this discussion again, I refer you to the following friendly
> flamewar.

Additionally there's http://tor.eff.org/volunteer.html.en#Coding:

|We need a measurement study of Polipo vs Privoxy. Is Polipo in
|fact significantly faster, once you factor in the slow-down from Tor?
|Are the results the same on both Linux and Windows? Related, does Polipo
|handle more web sites correctly than Privoxy, or vice versa? Are there
|stability issues on any common platforms, e.g. Windows?

Looks like the first person who comes up with a reproducible benchmark could make three projects happy at the same time.

> (Note that while the tone was not always as polite as it should have
> been, Fabian and I live in good friendship and mutual respect.)

I second that.

Fabian
Re: Load Balancing
Michael_google gmail_Gersten [EMAIL PROTECTED] wrote:

> On 9/21/07, Alexander W. Janssen [EMAIL PROTECTED] wrote:
> > On 9/21/07, Nick Mathewson [EMAIL PROTECTED] wrote:
> > > Short answer: Tor tries to group many streams on a single circuit.
> > > If we didn't, that would be way too much PK.
> >
> > Means, if the browser opens up 6 instances to grab stuff from a site,
> > Tor consolidates all requests into a single circuit? Makes sense from
> > a performance point of view.
>
> If you have a web page with 30 sub-fetches (images, style sheets,
> script files, etc), then they will all fetch over a single circuit.

That's likely, but not guaranteed. Circuits don't last forever.

Fabian
Re: Load Balancing
On Friday 21 September 2007 19:34:09 Alexander W. Janssen wrote:

> On 9/21/07, Arrakis [EMAIL PROTECTED] wrote:
> > Hey guys, quick question. If I have Tor process running, and request
> > a url that has 10 images to load from the same domain, do all the
> > requests go through the same circuit, or does the tor process split
> > up the requests across all the circuits?
>
> Interesting question. From what I understood a new circuit is created
> for every TCP-connection. If your browser grabs, for instance, 6 images
> at the same time (6 loading instances == connections), Tor should open
> 6 different circuits.
>
> However, considering your question... It doesn't sound too efficient
> to me... The slides say "If the user wants to access a different site,
> Alice's Tor client selects a different path." I'm curious how strictly
> I should read that... Site vs. TCP-connections?
>
> Alex.

The original question is answered later in the thread, but there is an interesting distinction between polipo and privoxy in the way this situation is handled:

- privoxy will use new streams on the same circuit for each of the images
- polipo will generally pipeline everything over the same stream

I believe this results in a perceptible performance improvement for general browsing. Someone please correct me if I've got this wrong, because I'm just working from observation.

--
Browse Anonymously Anywhere - http://anonymityanywhere.com
TorK - KDE Anonymity Manager - http://tork.sf.net
KlamAV - KDE Anti-Virus - http://www.klamav.net
Re: Load Balancing
On Fri, 21 Sep 2007 15:06:39 -0700 Michael_google gmail_Gersten [EMAIL PROTECTED] wrote:

> On 9/21/07, Alexander W. Janssen [EMAIL PROTECTED] wrote:
> > On 9/21/07, Nick Mathewson [EMAIL PROTECTED] wrote:
> > > Short answer: Tor tries to group many streams on a single circuit.
> > > If we didn't, that would be way too much PK.
> >
> > Means, if the browser opens up 6 instances to grab stuff from a site,
> > Tor consolidates all requests into a single circuit? Makes sense from
> > a performance point of view.
> >
> > > yrs,
> > > Nick Mathewson
> >
> > Alex.
>
> If you have a web page with 30 sub-fetches (images, style sheets,
> script files, etc), then they will all fetch over a single circuit.

Unless the circuit becomes old or is closed by a server going down, of course. Also, there is at least the possibility, however unlikely, that one or more of the subordinate fetches will take a different circuit because of some peculiarity in the relationship between the page and the item on the page, e.g., the page is on a non-Tor-related web server and the item on the page is on a web server near a Tor server allowing local exits to that web server.

> It does NOT make sense from a performance point of view. Since
> everything will be encrypted, regardless of which circuit it takes,
> there is no performance impact.

On modern CPUs, the encryption-decryption workload has to be a minor factor in the performance of a circuit. That is why NumCPUs 1 makes so little difference. Performance for most circuits will be limited by the network performance characteristics of their various servers and the Internet paths currently connecting them.

> The question of network efficiency is an interesting one. A single
> circuit will be slower than many circuits. However, each new circuit
> will start off slower (TCP takes time to get up to speed). Many
> established circuits will be faster than one established circuit. [1]
>
> The question of anonymity is more interesting. When I asked on the
> development list, I was told that using a single circuit rather than
> many circuits helped users to remain anonymous. I didn't understand
> the explanation, so I can't repeat it; I trust that the people who
> have studied that more than I have know what they are talking about.
>
> [1] This is more true statistically than absolutely. If you have many
> circuits, some will be fast, and some slow. Rotating your usage,
> concentrating on those circuits with the smallest queue, will send
> more TCP channels over faster Tor circuits. However, with many
> circuits, you pretty much guarantee that one will be slow. With a
> single circuit, you have the "all eggs in one basket" case, and you
> may have a very slow connection.

The above [1] seems correct as far as it goes, but it needs a bit of tweaking because separate circuits originating in a particular client have a chance, and in the case of entry guard usage, a very high chance, of sharing some Tor server interconnections.

Tor servers normally have no more than one open socket at a time to any other given Tor server's ORPort. That socket may carry many circuit segments from different circuits with no way to know, from the point of view of the Tor server on either end of the socket, whether any of those circuit segments originate from the same client. Each of those circuit segments may have many tunneled TCP streams traversing it.

Now back to those entry guards. A client using entry guards picks a few and connects sockets to their ORPorts. As it builds circuits, all have one of those entry guards as their first hop. Therefore *all* circuits go through one of those few (NumEntryGuards) sockets, and thus *all* tunneled TCP streams also go through those few entry guards. The default value of NumEntryGuards is 3, so in that case, all circuits and all TCP streams tunneled through them that originate from that user's browser will be split across no more than three distinct first hops. They are thus in competition with each other for bandwidth to whichever entry guards are in use and for bandwidth out the other side of those entry guards.

The same competition can occur when UseEntryGuards = 0, because it arises whenever more than one tunneled TCP stream takes a path that shares, at any hop, a Tor server with another tunneled TCP stream.

Scott Bennett, Comm. ASMELG, CFIAG
Internet: bennett at cs.niu.edu
"A well regulated and disciplined militia, is at all times a good objection to the introduction of that bane of all free governments -- a standing army."
-- Gov. John Hancock, New York Journal, 28 January 1790
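
The entry-guard argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: the relay names, the uniform random selection, and the helper `build_circuits` are hypothetical and do not reflect Tor's real (weighted) path selection. It shows only that however many circuits a client builds, their first hops are drawn from the same small set of guards.

```python
import random

def build_circuits(relays, num_circuits, num_entry_guards=3, hops=3, seed=0):
    """Toy model: pick a fixed guard set, then build circuits through it.

    Uniform random choice is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    guards = rng.sample(relays, num_entry_guards)
    circuits = []
    for _ in range(num_circuits):
        first = rng.choice(guards)  # every circuit starts at one of the guards
        rest = rng.sample([r for r in relays if r != first], hops - 1)
        circuits.append([first] + rest)
    return guards, circuits

relays = [f"relay{i}" for i in range(50)]  # hypothetical relay names
guards, circuits = build_circuits(relays, num_circuits=20)
first_hops = {c[0] for c in circuits}
print(len(circuits), "circuits share", len(first_hops), "distinct first hops")
```

Raising `num_circuits` in this model never raises the number of distinct first hops above `num_entry_guards`, which is the bandwidth-sharing point made in the message.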
Re: Load Balancing
> - privoxy will use new streams on the same circuit for each of the images
> - polipo will generally pipeline everything over the same stream

Not quite. Polipo will try to use up to n simultaneous connections to a given server, where n is

- 2 for a server that can do pipelining;
- 4 for a server that can do persistent requests but not pipelining;
- 8 for a server that cannot do persistent requests.

These magic constants are configurable. Ideally, Polipo should choose the number of simultaneous connections depending on an estimate of average queue length, but I haven't thought about it seriously yet.

> I believe this results in a perceptible performance improvement for
> general browsing.

I think so too, but some people disagree. Since I don't want to get into this discussion again, I refer you to the following friendly flamewar. (Note that while the tone was not always as polite as it should have been, Fabian and I live in good friendship and mutual respect.)

Me: http://archives.seul.org/or/talk/Apr-2007/msg00056.html
Fabian Keil: http://archives.seul.org/or/talk/Apr-2007/msg00063.html
Me: http://archives.seul.org/or/talk/Apr-2007/msg00066.html
Me clarifying: http://archives.seul.org/or/talk/Apr-2007/msg00069.html

You may also find this paper interesting: http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html

Juliusz
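
The connection limits described in this message can be written down as a small lookup. This is a minimal sketch, not Polipo's actual code: the function and parameter names are hypothetical, and only the values 2, 4 and 8 come from the message.

```python
def max_simultaneous_connections(pipelining: bool, persistent: bool) -> int:
    """Per-server connection cap, using the default values quoted above."""
    if pipelining:
        return 2  # server can pipeline requests
    if persistent:
        return 4  # persistent requests, but no pipelining
    return 8      # neither: one request per connection

for label, (pipelining, persistent) in [
    ("pipelining", (True, True)),
    ("persistent only", (False, True)),
    ("neither", (False, False)),
]:
    print(label, max_simultaneous_connections(pipelining, persistent))
```

The less a server can reuse a connection, the more parallel connections are allowed, which is why the description "pipeline everything over the same stream" is only an approximation.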
Re: Load Balancing
On Fri, Sep 21, 2007 at 03:06:39PM -0700, Michael_google gmail_Gersten wrote:

> If you have a web page with 30 sub-fetches (images, style sheets,
> script files, etc), then they will all fetch over a single circuit.
>
> It does NOT make sense from a performance point of view. Since
> everything will be encrypted, regardless of which circuit it takes,
> there is no performance impact.

Actually, here's one reason why using circuits one by one rather than all at once is good for performance and ultimately good for security.

Imagine you make five circuits preemptively. Scenario one is that you use them one by one, discarding them and switching to a new one when the current one becomes dirty (defined as first used more than 10 minutes ago). Scenario two is that you use all five of them for fetching your big web site, discarding them when they become dirty. Now compare the two scenarios in terms of total number of circuits the user needs to make over the course of a day.

Once upon a time, the value of 10 minutes was actually more like 1 minute. You see, the shorter it is, the fewer actions from the user are linkable with each other based on being in the same circuit. But Tor server operators complained they were using 100% CPU because they were constantly handling new circuit creation requests. So we moved it back to 10 minutes -- bad for user security, but necessary to keep things working.

If the user started churning through circuits at five times the current rate, we may end up forced to move the 10 minute value back even farther to compensate, resulting in even more user connections becoming linkable.

Now, this isn't the whole story. Maybe there are really good security improvements that can be had by not fetching the whole site over a single connection -- see e.g. item #1 on https://tor.eff.org/volunteer#Research or (a more tenuous connection) http://freehaven.net/anonbib/#pet05-serjantov

But on the third hand, see http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#SplitStreams

Until somebody works through both the security issues and the performance issues in a convincing way, we will likely just stick with the current behavior.

--Roger
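
The comparison of the two scenarios can be worked through as simple arithmetic. This is a rough sketch under stated assumptions: the user browses continuously, every circuit in use is discarded exactly when the 10 minute dirtiness limit expires, and the function name and the 8 hour figure are hypothetical.

```python
import math

def circuits_used(active_minutes, dirty_after_minutes=10, circuits_in_parallel=1):
    """Circuits consumed during continuous browsing in this toy model."""
    windows = math.ceil(active_minutes / dirty_after_minutes)
    return windows * circuits_in_parallel

active = 8 * 60  # hypothetical: eight hours of continuous browsing
one_by_one = circuits_used(active)                         # scenario one
all_five = circuits_used(active, circuits_in_parallel=5)   # scenario two
print(one_by_one, all_five, all_five / one_by_one)
```

In this model scenario two uses five times as many circuits as scenario one over the same period, which matches the "five times the current rate" figure in the message.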
Re: Load Balancing
On 9/22/07, Roger Dingledine [EMAIL PROTECTED] wrote:

> Once upon a time, the value of 10 minutes was actually more like 1
> minute. You see, the shorter it is, the fewer actions from the user
> are linkable with each other based on being in the same circuit. But
> Tor server operators complained they were using 100% CPU because they
> were constantly handling new circuit creation requests. So we moved it
> back to 10 minutes -- bad for user security, but necessary to keep
> things working.
>
> If the user started churning through circuits at five times the
> current rate, we may end up forced to move the 10 minute value back
> even farther to compensate, resulting in even more user connections
> becoming linkable.

Why does new circuit creation use up so much CPU? Are you spawning a new thread for each circuit, or something? If so, maybe there's a way to be more efficient instead?

I know, I know, I shouldn't complain if I'm not going to offer code. So don't consider this a complaint, just a question.
Re: Load Balancing
On Sat, Sep 22, 2007 at 03:51:07PM -0400, Anthony DiPierro wrote:

> > If the user started churning through circuits at five times the
> > current rate, we may end up forced to move the 10 minute value back
> > even farther to compensate, resulting in even more user connections
> > becoming linkable.
>
> Why does new circuit creation use up so much CPU? Are you spawning a
> new thread for each circuit, or something? If so, maybe there's a way
> to be more efficient instead?

It's because of the crypto. Each server on the path needs to do two public key operations for each new circuit -- one to ensure perfect forward secrecy, and the other for authentication.

See e.g. https://tor.eff.org/svn/trunk/doc/design-paper/tor-design.html#subsec:circuits or Section 5.1 of https://tor.eff.org/svn/trunk/doc/spec/tor-spec.txt

If you still want to read after those, also see http://freehaven.net/anonbib/#overlier-pet2007 and http://freehaven.net/anonbib/full/date.html#kate-pet2007

--Roger
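
To get a feel for how the public key cost scales, here is a back-of-the-envelope sketch. The two operations per server come from the message above; the three-hop path length is an assumption not stated in this thread, and the function name and circuit counts are hypothetical.

```python
def public_key_ops(circuits, hops=3, ops_per_server=2):
    """Total server-side public key operations for a number of new circuits.

    hops=3 is an assumed path length; ops_per_server=2 is from the message.
    """
    return circuits * hops * ops_per_server

for circuits in (100, 500):  # hypothetical circuit counts
    print(circuits, "circuits ->", public_key_ops(circuits), "public key operations")
```

The cost grows linearly with the number of circuits, so a client that builds five times as many circuits asks the network for five times as many public key operations.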
Re: Load Balancing
Alex,

That is exactly the distinction I am looking for. Does Tor care about the destination of the TCP request, when deciding to make a new circuit, and thus will use one because it is already dirtied by that domain?

Steve

Alexander W. Janssen wrote:

> However, considering your question... It doesn't sound too efficient
> to me... The slides say "If the user wants to access a different site,
> Alice's Tor client selects a different path." I'm curious how strictly
> I should read that... Site vs. TCP-connections?
>
> Alex.
Re: Load Balancing
On 9/21/07, Arrakis [EMAIL PROTECTED] wrote:

> Does Tor care about the destination of the TCP request, when deciding
> to make a new circuit, and thus will use one because it is already
> dirtied by that domain?

s/domain/IP-address ?

However, that's all up to the implementation of the internal SOCKS-proxy, too! Just think: Your browser might send out 6 different requests in different connections, but you don't know what the SOCKS-interface of Tor makes of it... It might try to be clever and queue it up to a single circuit. Not that this is bad, but interesting to know.

If we get an answer, we should put that up to the Tor Tech FAQ. It's a pretty interesting question.

> Steve

Alex.

--
"I am tired of all this sort of thing called science here... We have spent millions in that sort of thing for the last few years, and it is time it should be stopped."
-- Simon Cameron, U.S. Senator, on the Smithsonian Institution, 1901.
Re: Load Balancing
On 9/21/07, Alexander W. Janssen [EMAIL PROTECTED] wrote:

> On 9/21/07, Nick Mathewson [EMAIL PROTECTED] wrote:
> > Short answer: Tor tries to group many streams on a single circuit.
> > If we didn't, that would be way too much PK.
>
> Means, if the browser opens up 6 instances to grab stuff from a site,
> Tor consolidates all requests into a single circuit? Makes sense from
> a performance point of view.
>
> > yrs,
> > Nick Mathewson
>
> Alex.

If you have a web page with 30 sub-fetches (images, style sheets, script files, etc), then they will all fetch over a single circuit.

It does NOT make sense from a performance point of view. Since everything will be encrypted, regardless of which circuit it takes, there is no performance impact.

The question of network efficiency is an interesting one. A single circuit will be slower than many circuits. However, each new circuit will start off slower (TCP takes time to get up to speed). Many established circuits will be faster than one established circuit. [1]

The question of anonymity is more interesting. When I asked on the development list, I was told that using a single circuit rather than many circuits helped users to remain anonymous. I didn't understand the explanation, so I can't repeat it; I trust that the people who have studied that more than I have know what they are talking about.

[1] This is more true statistically than absolutely. If you have many circuits, some will be fast, and some slow. Rotating your usage, concentrating on those circuits with the smallest queue, will send more TCP channels over faster Tor circuits. However, with many circuits, you pretty much guarantee that one will be slow. With a single circuit, you have the "all eggs in one basket" case, and you may have a very slow connection.
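
Footnote [1] can be made concrete with a one-line probability calculation. This is a sketch under assumptions that are not in the message: each circuit is "slow" independently with the same probability, and the value 0.2 is purely hypothetical.

```python
def prob_at_least_one_slow(n_circuits, p_slow):
    """P(at least one of n independent circuits is slow) = 1 - (1 - p)^n."""
    return 1 - (1 - p_slow) ** n_circuits

for n in (1, 3, 5, 10):
    print(n, round(prob_at_least_one_slow(n, 0.2), 3))
```

Under these assumptions the chance of having at least one slow circuit rises quickly with the number of circuits, while with a single circuit the whole page load depends on that one circuit's speed.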