Re: [tor-dev] Scaling tor for a global population
Sebastian Hahn: On 27 Sep 2014, at 02:18, Mike Perry mikepe...@torproject.org wrote: If we were willing to tolerate 10% directory overhead this would allow for 5 times as many users. In other words, 100M daily connecting users. We would still need to find some way to fund the growth of the network to support this 40X increase, but there are no actual *technical* reasons why it cannot be done.

One thing there isn't an automatic answer to is whether our current users and our would-be users differ in their usage patterns. Currently, my intuition would be that most of our users are responsible for a relatively small amount of traffic, whereas we have some users who pull a lot of data. I wonder what happens with Netflix, YouTube and other services. We might at least want to try and figure this into the equation by estimating our average daily bytes sent/received per user, and comparing that to the bytes sent/received by our target group.

This is a great point, Sebastian. Some of the browser vendors do this sort of analytics, and other parties have also made some inferences independently. In particular, these two studies are old, and may not be accurate anymore, but they say that between 5-15% of each browser's userbase used Private Browsing Mode, depending on the browser (pg 9): https://www.usenix.org/legacy/event/sec10/tech/full_papers/Aggarwal.pdf And moreover, that the typical usage duration for Private Browsing Mode was 4-22 minutes, with 10 minutes being most common: https://blog.mozilla.org/metrics/2010/08/23/understanding-private-browsing/

Refreshed studies of this type of data would be very helpful for determining what size rollout we could handle, and whether we'd want to further limit it by making the Tor mode opt-in, not prominent, or otherwise less likely to be selected by most users, at least for the initial versions. We could also capture more detailed usage frequency analytics and bytecount statistics in an alpha or other trial version.
-- Mike Perry
___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Scaling tor for a global population
30.09.2014, 01:12 isis: isis [mash-up] [3]: Please, don't give all the shit relays to me as bridges. I think it's less important scalability-wise (right now) to have a strict cutoff rate for bridges, but eventually, when/if we ever have Bridge Bandwidth Authorities, BridgeDB should cut off anything below some completely arbitrary rate, like 100 KB/s. I've gotten a bridge (from http://bridges.torproject.org) which was 28 B/s. Yes, *bytes*. That thing was probably slowing down the rest of the Tor network just by *existing*: its molasses speeds block the Exit from continuing the response after a SENDME's worth of cells, which is probably eventually going to cause TCP timeouts on the Exit's side and a whole bunch of other messes.

I think the Tor Project requires a high number of bridges to make collection of all addresses harder for some adversaries. I'm aware that adversaries can outrun the number of bridges. This might or might not be valid until you shut down all vanilla bridges. Obviously bridges that provide too little bandwidth should not take part in the network.

What I usually recommend to users is based on their bandwidth and how frequently their IP changes. If their connection is fast and their IP never changes (e.g., a desktop or server), then run a non-exit relay [2]. For a laptop that moves to and from work, then a relay or bridge.

Actually, anything with a constantly changing IP is a terrible idea for a bridge. Think of it this way: BridgeDB hands you a bridge line on Monday. On Tuesday, the bridge's IP rotates to something else. Now your Tor doesn't work. Wah-wah, sucks for you!

The bridge being down is indeed a problem. I got the same recommendation of running a bridge on my home network, rather than a relay (or exit), from a person working for/on Tor. I passed that recommendation around, IIRC even on this list, and no one complained. It is problematic to have this information around and you, isis, speaking up so late.
The changing IP part is beneficial for censorship circumvention. It would be broken if clients could learn the new IP address automatically. Distribution of bridges just sucks for circumvention of censorship. (Not your work, isis.) That could be improved by handing out puzzles.

There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using bittorrent. I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay[.]

I don't think you are the Tor Project either, isis. You are working for the Tor Project. Well, depending on the view, all employees are the Tor Project. I'm not suggesting you would run those nodes for the NSA or that you work for them in any way. Thank you all for your work in upscaling Tor. Regards, Sebastian G. bastik
Re: [tor-dev] Scaling tor for a global population
Thanks isis. We worked on PIR-Tor a while ago, so my recollection could be a bit rusty, but here are some thoughts.

1) From a performance perspective, the best results are obtained by using the information-theoretic variant of PIR-Tor (ITPIR) (as opposed to the computational variant). As you correctly point out, we considered the three guard relays to be the PIR directory servers, with the assumption that not all three collude. (Our argument for the non-colluding assumption was that if all three guards are malicious, then the security offered by Tor is terrible already.) Note that ITPIR necessarily requires more than one server. Given Tor's move to a single guard design, the first challenge is to figure out alternative PIR servers for the client. (I haven't thought much about this challenge, but it seems that the single guard could be one server, and two other relays (maybe with guard flags) could play the role of PIR directory servers. I'll have to think more about this, but it is possible that this design continues to provide similar security properties to Tor.)

2) Even outside the context of PIR, there are several advantages to having some structure in the database, with fine-grained signatures on individual relay/micro descriptors (relays could also be grouped into blocks, with signatures at the block level). For example, if the Tor network were to use 2-hop circuits, then clients would have close to 0 (or at least significantly less than 1%) overhead for maintaining network state. Why? Because the first hop (guard) is fixed, fetching the directory to select the guard is only a *one-time* operation; note that much of the directory overhead in Mike Perry's analysis comes from *periodic fetches*.
Secondly, clients do not need to use PIR to fetch the exit relay in this setting, since guards know the identity of exits in a two-hop design anyway (the bandwidth overhead of fetching a single exit is simply the size of the descriptor/micro-descriptor -- resulting in *several orders of magnitude* bandwidth savings). (This optimization was referred to in the paper in the context of fetching middle relays.) (Of course there are anonymity implications of moving to two-hop designs. Perhaps the most significant concern is easy guard identification, but perhaps an additional 100 million clients + the move to a single guard reduces these concerns. The bandwidth savings could be very significant, especially in the context of mobile clients, which otherwise might use hundreds of MB per month just to maintain network state. Anyway, I digress.)

Please find some comments about your previous mail inline (below). On Tue, Sep 30, 2014 at 1:04 AM, isis i...@torproject.org wrote: 1. The authors assume that a client's three [sic] Guard Nodes will be the directory mirrors for that client, accepting PIR queries for descriptors. Right, please see (1) above. 3. There is zero handling of colluding directory mirrors. This seems incorrect? We used the Percy open source library to perform PIR, which does handle collusion among mirrors (to the best of my recollection). The parameters can be set such that one honest server is sufficient for security. 4. The *increase* to descriptor size is not well factored in to the analysis. 4.b. All of the Directory Authorities would need to sign each and every descriptor by itself. This would result in the current microdescriptors, which are sized from roughly 900 bytes to some of the larger ones which are 2.5 KB, increasing in size by roughly 4 KB (that's the size of the DirAuth signatures on the current consensus). I am not sure which signature scheme you are considering. In the paper, we talk about threshold BLS signatures, which have only 22 bytes of overhead.
(This might increase for better security guarantees, but 4KB overhead seems very conservative.) While one of the benefits of PIR would be that clients would no longer need to request the descriptors for all relays listed in the consensus, this benefit actually doesn't help as much as it would seem, because switching to a PIR scheme would require a client to make three of these roughly O(146449)-bit requests, every ten minutes or so, when the client wants a new circuit. Right, clients would roughly need 18 middle/exit relays (using PIR queries) over a three-hour interval to build a Tor circuit every 10 minutes. My recollection is that even in 2011, PIR seemed more efficient (we estimated a 12KB overhead per PIR query) than a full network fetch every three hours (12*18 = 216KB with PIR vs several MBs without PIR). Though these numbers might change depending on implementation details (including the use of micro-descriptors), the benefits would only improve with growing network size. Thanks, Prateek -- Prateek Mittal Assistant Professor Princeton University http://www.princeton.edu/~pmittal/
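A quick sanity check of those numbers, using only the figures quoted above (12 KB per PIR query, 18 fetches per three hours) plus an assumed ~3 MB standing in for "several MBs" of full network fetch:

```python
# Back-of-the-envelope: PIR lookups vs. a full network fetch over one
# three-hour window. The 3 MB full-fetch size is an assumption; the
# other figures are the 2011 estimates quoted in the discussion.
pir_query_kb = 12         # estimated cost of one PIR lookup
fetches_per_3h = 18       # ~1 new circuit per 10 minutes over 3 hours
full_fetch_kb = 3 * 1024  # assumed size of fetching the whole directory

pir_total_kb = pir_query_kb * fetches_per_3h
print(pir_total_kb)                         # 216 KB with PIR
print(round(full_fetch_kb / pir_total_kb))  # ~14x cheaper under these assumptions
```

As Prateek notes, the gap only widens as the network (and hence the full-fetch cost) grows, while the per-query PIR cost grows much more slowly.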
Re: [tor-dev] Scaling tor for a global population
On Tue, Sep 30, 2014 at 6:29 AM, Prateek Mittal pmit...@princeton.edu wrote: 4. The *increase* to descriptor size is not well factored in to the analysis. 4.b. All of the Directory Authorities would need to sign each and every descriptor by itself. This would result in the current microdescriptors, which are sized from roughly 900 bytes to some of the larger ones which are 2.5 KB, increasing in size by roughly 4 KB (that's the size of the DirAuth signatures on the current consensus). I am not sure which signature scheme you are considering. In the paper, we talk about threshold BLS signatures, which have only 22 bytes of overhead. (This might increase for better security guarantees, but 4KB overhead seems very conservative.) You can make this even lower, because every PIR query returns a block which is roughly the square root of the size of the entire database; you would only need a signature on each block. - Nikita -- Nikita Borisov - http://hatswitch.org/~nikita/ Associate Professor, Electrical and Computer Engineering Tel: +1 (217) 244-5385, Office: 460 CSL
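That square-root observation can be made concrete with a small sketch; the relay count is an assumed round number, and the 22-byte figure is the threshold BLS overhead mentioned above:

```python
import math

# With the database split into blocks of ~sqrt(N) descriptors and one
# PIR query returning one block, the authorities sign per block rather
# than per descriptor. N and the block layout here are illustrative.
n_relays = 6000
block_size = math.isqrt(n_relays)            # ~sqrt(N) descriptors per block
n_blocks = math.ceil(n_relays / block_size)  # blocks covering the database
sig_bytes = 22                               # threshold BLS signature size

print(n_blocks)              # 78 blocks instead of 6000 per-descriptor signatures
print(n_blocks * sig_bytes)  # total signature overhead with per-block signing
print(n_relays * sig_bytes)  # vs. signing every descriptor individually
```

Under these assumptions the signature overhead drops by roughly a factor of sqrt(N), and each signing run at the authorities shrinks by the same factor.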
Re: [tor-dev] Scaling tor for a global population
On Tue, Sep 30, 2014 at 12:04 AM, isis i...@torproject.org wrote: [1]: I assume that we need replication in Tor's use case. There are papers, such as the following: Kushilevitz, Eyal, and Rafail Ostrovsky. Replication is not needed: Single database, computationally-private information retrieval. Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS), 1997. http://www.cs.fiu.edu/~carbunar/teaching/cis5374/slides/pir-single.pdf whose results don't apply here, because they are aimed at computationally-hiding PIR schemes, and obviously Tor aims for information-theoretic security. Other than the PIR-Tor paper, I haven't found any literature which analyses PIR for anything close to Tor's use case. (Although I'd be stoked to hear about any such papers.) The current Tor design relies heavily on computational assumptions in all of its cryptography, so I don't think that the Tor network setting is a reason to use information-theoretic rather than computational PIR. We advocated ITPIR because it is much more efficient and because collusion among your three guards was already a significant problem. With the move to a single guard, that argument no longer makes as much sense. We do have an analysis in the paper of the computational PIR scenario, but it does impose significantly more (computational) overhead on the directory mirrors. - Nikita -- Nikita Borisov - http://hatswitch.org/~nikita/ Associate Professor, Electrical and Computer Engineering Tel: +1 (217) 244-5385, Office: 460 CSL
Re: [tor-dev] Scaling tor for a global population
On 09/30/2014 06:28 AM, AFO-Admin wrote: E.g. if you have a server with 2x E5-2683 v3 and a 10 Gbit/s pipe, you would need at least 14 IPs to use most of the CPU. Multicore support is hard and needs developers. Raising the limit from 2 relays per IP to x per IP has been discussed in the past and would be an easy change. -- Moritz Bartl https://www.torservers.net/
Re: [tor-dev] Scaling tor for a global population
Moritz Bartl transcribed 0.5K bytes: Raising the limit from 2 relays per IP to x per IP has been discussed in the past and would be an easy change. We *still* have that limit? I thought we killed it a long time ago. Can we kill it now? It's not going to do anything to prevent Sybils, it'll only prevent good relay operators on larger machines from giving more bandwidth. -- ♥Ⓐ isis agora lovecruft _ GPG: 4096R/A3ADB67A2CDB8B35 Current Keys: https://blog.patternsinthevoid.net/isis.txt
Re: [tor-dev] Scaling tor for a global population
isis: Moritz Bartl transcribed 0.5K bytes: Raising the limit from 2 relays per IP to x per IP has been discussed in the past and would be an easy change. We *still* have that limit? I thought we killed it a long time ago. Can we kill it now? It's not going to do anything to prevent Sybils, it'll only prevent good relay operators on larger machines from giving more bandwidth. If not kill it, raising it to 4-8 might make more sense and be a good interim step. We could lower it again if we get real multicore support (which I still think we should aim for, because it also improves the mix properties of fast relays to an external observer, but I realize it is no small task). -- Mike Perry
Re: [tor-dev] Scaling tor for a global population
Ryan Carboni transcribed 1.1K bytes: There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using bittorrent. I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay: https://globe.torproject.org/#/relay/26F6A7570E0F655DFDD054E79ACBB127112C2D7B https://globe.torproject.org/#/relay/3BC76060924A1B8DE69DF57C27E6C1EBB95ABE57 -- ♥Ⓐ isis agora lovecruft _ GPG: 4096R/A3ADB67A2CDB8B35 Current Keys: https://blog.patternsinthevoid.net/isis.txt Post Scriptum: Mike, for the numbers you're gathering on the tor-relays mailing list thread, the above exit relay costs me approximately $20/month.
Re: [tor-dev] Scaling tor for a global population
On Mon, Sep 29, 2014 at 4:36 PM, isis i...@torproject.org wrote: Ryan Carboni transcribed 1.1K bytes: There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using bittorrent. I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay: You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions.
Re: [tor-dev] Scaling tor for a global population
It would be trivial for a multimillion dollar organisation to sybil attack Tor even as it stands right now. On 30/09/2014 01:12, Ryan Carboni wrote: On Mon, Sep 29, 2014 at 4:36 PM, isis i...@torproject.org wrote: Ryan Carboni transcribed 1.1K bytes: There's one issue if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using bittorrent. I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay: You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions.
Re: [tor-dev] Scaling tor for a global population
On Mon, Sep 29, 2014 at 7:12 PM, Ryan Carboni rya...@gmail.com wrote: You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions. If you look at the numbers isis posted, all relays below the median contribute less than 3% of the overall bandwidth. I think cutting all of them off might be overkill, but even if you did, you would not make a Sybil attack appreciably harder. - Nikita -- Nikita Borisov - http://hatswitch.org/~nikita/ Associate Professor, Electrical and Computer Engineering Tel: +1 (217) 244-5385, Office: 460 CSL
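The shape of that claim is easy to reproduce with a toy model; the Zipf-style exponent and relay count below are assumptions chosen only to mimic a heavy-tailed capacity distribution, not Tor's measured one:

```python
# Illustrative sketch only: model relay capacities with a Zipf-like
# heavy tail (exponent 1.5 is an assumption, not measured data) and
# check what share of total bandwidth the bottom half of relays carries.
n = 6000  # assumed relay count
capacities = [1.0 / (rank ** 1.5) for rank in range(1, n + 1)]  # rank 1 = fastest
total = sum(capacities)
bottom_half_share = sum(capacities[n // 2:]) / total

print(round(100 * bottom_half_share, 2))  # well under 3%, matching the posted numbers
```

With any distribution this skewed, removing the bottom half of relays barely moves total capacity, which is why it also barely raises the cost of a Sybil attack measured in bandwidth.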
Re: [tor-dev] Scaling tor for a global population
Andrew Lewman transcribed 1.8K bytes: The last research paper I see directly addressing scalability is Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) or PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor)

Using Private Information Retrieval (PIR) to retrieve a 50 KB/s relay would likely make the current network [0] completely unusable.

___

First, the PIR-Tor paper makes a few naïve assumptions which grossly affect the analysis of the overhead for directory fetches:

1. The authors assume that a client's three [sic] Guard Nodes will be the directory mirrors for that client, accepting PIR queries for descriptors.
2. There is zero handling of cases where multiple directory mirrors have differing data.
3. There is zero handling of colluding directory mirrors.
4. The *increase* to descriptor size is not well factored in to the analysis.

In the PIR-Tor paper, §4.2:

 | Even if all three guard relays are compromised, they cannot actively
 | manipulate elements in the PIR database since they are signed by the
 | directory authorities[...]

Single descriptors are *not* signed; the set of all descriptors is. Therefore, if a client wanted to actually check that all (or a majority) of the Directory Authorities had signed a descriptor, that client would need to:

4.a. Download all of them and check. Which negates the whole point of this PIR thing.
4.b. All of the Directory Authorities would need to sign each and every descriptor by itself. This would result in the current microdescriptors, which are sized from roughly 900 bytes to some of the larger ones which are 2.5 KB, increasing in size by roughly 4 KB (that's the size of the DirAuth signatures on the current consensus).

Result: Each microdescriptor is 4X larger, and the Directory Authorities need to do several orders of magnitude more signatures.

___

Second, because the PIR-Tor paper doesn't seem to be what we would actually want to implement, the following is an analysis.
Using some safe assumptions about what security guarantees we would likely want to require from any PIR scheme we used... That is, we would likely want a 1-roundtrip (RT), b-private, b-Byzantine, k-out-of-m databases [1] PIR scheme. (Meaning that there are m total directory mirrors, only k of those actually answer a set of your queries, and you want the privacy of your queries and the authenticity of the answers to your queries to be protected even if b directory mirrors are malicious and colluding with one another.)

Then, with those assumptions in mind, from Theorem 7.7 in §7.1 of this review of PIR schemes [2] (which is, admittedly, a bit dated) used in one of Dan Boneh's classes, we have a size complexity [3] of

    O((k/3b) n^(1/[(k-1)/3]) m log_2 m)    where b < k/3

(Meaning that the number of malicious, colluding directory mirrors which answered is less than one third of the total number of mirrors which answered.)

    >>> import math
    >>> m = 3000
    >>> k = 30
    >>> b = 4
    >>> n = 160  # the number of bits in a relay fingerprint
    >>> int(math.ceil((float(k)/(3*float(b)))
    ...     * (n**(1.0/((float(k)-1.0)/float(3))))
    ...     * float(m) * math.log(float(m), 2.0)))
    146449

the above space complexity comes out to O(146449) bits per lookup. (Where, by lookup, I mean the set of all queries pertaining to fetching a single relay descriptor, since, in PIR, a query is usually for a single bit.) This would mean that adding any sane PIR scheme for directory requests would result in something like a 900X increase in the bytes used for each request. While one of the benefits of PIR would be that clients would no longer need to request the descriptors for all relays listed in the consensus, this benefit actually doesn't help as much as it would seem, because switching to a PIR scheme would require a client to make three of these roughly O(146449)-bit requests, every ten minutes or so, when the client wants a new circuit. Don't get me wrong; PIR is awesome.
But the current Tor network is likely *not* the correct real-world application of any of the current PIR schemes. At least, it isn't until we have some HUGE number of nodes in the network, which by #1 and #2 in my original reply to this thread, [4] we shouldn't.

[0]: Note, I said the current network. An imaginary future Tor network which has 10,000,000 relays would be different. And then note the very last paragraph, where I state that we probably shouldn't ever have 10,000,000 relays.

[1]: I assume that we need replication in Tor's use case. There are papers, such as the following: Kushilevitz, Eyal, and Rafail Ostrovsky. Replication is not needed: Single database, computationally-private information retrieval.
Re: [tor-dev] Scaling tor for a global population
On 28 Sep 2014, at 02:12, Tom Ritter t...@ritter.vg wrote: why not also change the consensus and related document formats to be something more efficient than ASCII text? Taking the latest consensus and doing some rough estimates, I found the following:

 Original consensus, xz-ed: 407K
 Change flags to uint16: ~399K
 +Removing names: 363K
 +Compressing IPv6 to 16 bytes + 4 bytes: 360K
 +Compressing IPv4 to 4 bytes + 4 bytes + 4 bytes: 315K
 +Compressing the datetime to 4 bytes: 291K
 +Compressing the version string to 4 bytes: 288K
 +Replacing reject 1-65K with a single byte: 287K
 +Replacing Bandwidth=# with 4 bytes: 273K

These numbers are optimistic - you won't see quite this much gain, but if I'm understanding you correctly that the consensus is painful, it seems like you could save at least 50K-70K out of 400K with relatively straightforward changes.

This analysis doesn't make much sense, I'm afraid. We use compression on the wire, so repeating flags as human-readable strings has a much lower overhead than you estimate, for example. Re-doing your estimates with actually compressed consensuses might make sense, but probably you'll see a lot less value. Cheers Sebastian
Re: [tor-dev] Scaling tor for a global population
On 27 Sep 2014, at 10:18 , tor-dev-requ...@lists.torproject.org wrote: Date: Fri, 26 Sep 2014 17:18:07 -0700 From: Mike Perry mikepe...@torproject.org To: tor-dev@lists.torproject.org Subject: Re: [tor-dev] Scaling tor for a global population 2. Cut off relays below the median capacity, and turn them into bridges. Relays in the top 10% of the network are 164 times faster than relays in the 50-60% range, 1400 times faster than relays in the 70-80% range, and 35000 times faster than relays in the 90-100% range. In fact, many relays are so slow that they provide fewer bytes to the network than it costs to tell all of our users about them. There should be a sweet spot where we can set this cutoff such that the overhead from directory activity balances the loss of capacity from these relays, as a function of userbase size. Result: ~2X reduction in consensus and directory size.

Do these extra relays function as spare (albeit slower) capacity in the network? If so, we'd have to be careful to set the cut-off low enough that any spikes in usage could be accommodated. Also, doesn't reducing the number of routers in the consensus risk harming network diversity? I understand that if these routers are a net burden on the network, then they're not actually contributing much to diversity overall. But what about users with obscure configurations, like New Zealand routers only, or a custom list of 20 trusted routers? (Is a restricted set of routers a common enough use case?) Do we risk pulling the data they need from the consensus? Then again, users with these sorts of configurations are probably technically savvy enough to fix any issues that occur with any changeover. T teor pgp 0xABFED1AC hkp://pgp.mit.edu/ http://0bin.net/paste/Mu92kPyphK0bqmbA#Zvt3gzMrSCAwDN6GKsUk7Q8G-eG+Y+BLpe7wtmU66Mx
Re: [tor-dev] Scaling tor for a global population
On 28 September 2014 07:00, Sebastian Hahn sebast...@torproject.org wrote: This analysis doesn't make much sense, I'm afraid. We use compression on the wire, so repeating flags as human-readable strings has a much lower overhead than you estimate, for example. Re-doing your estimates with actually compressed consensuses might make sense, but probably you'll see a lot less value. All of those numbers were after compressing the consensus document using xz, which is the best compression method I know. -tom
Re: [tor-dev] Scaling tor for a global population
On 28 Sep 2014, at 16:33, Tom Ritter t...@ritter.vg wrote: On 28 September 2014 07:00, Sebastian Hahn sebast...@torproject.org wrote: This analysis doesn't make much sense, I'm afraid. We use compression on the wire, so repeating flags as human-readable strings has a much lower overhead than you estimate, for example. Re-doing your estimates with actually compressed consensuses might make sense, but probably you'll see a lot less value. All of those numbers were after compressing the consensus document using xz, which is the best compression method I know. Sorry, I completely missed that you showed compressed numbers. Interesting results!
Re: [tor-dev] Scaling tor for a global population
There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using bittorrent.
Re: [tor-dev] Scaling tor for a global population
To avoid squashing the Tor network with all of these new clients, the company would almost certainly have to run some big relays to help compensate for the additional load. Another proposal would be some sort of incentive for running relays. -V
Re: [tor-dev] Scaling tor for a global population
Il 9/27/14, 2:33 AM, Mike Perry ha scritto: We could also handle controlled rollouts to fractions of their userbase to test the waters, and slowly add high capacity nodes to the network to support these new users, to ensure we have the people ready to accept payment for running the servers, and maintain diversity.

I read your very detailed estimations and improvement paths; I love it! However, I see that the main suggestion to increase the network capacity can be simplified as follows:

- improve big nodes' ability to push even more traffic
- add more big nodes

The other improvements reduce the consensus size and directory load, but don't specifically add network capacity. While this is the obvious way to add more capacity, I feel that it's going to have impacts such as:

1) reducing the diversity (and thus the anonymity, because few players will handle most of the network's traffic)
2) making it irrelevant for anyone to run their own small/volunteer relay

That sounds like the easier way to scale up in a defined amount of time and with a defined budget, but IMHO also one with consequences and pre-defined limits. I feel that the only way to scale up without limits and consequences is to have end-users become active elements of the network, where we have success stories such as Skype. End-users have important network resources available that can be estimated and used (with care). Not all end-users are equal: I'm now on a 2M Hyperlan line (damn digital divide!), but someone else in Stockholm or San Francisco is on a 1000M/100M fiber connection @home (not in a datacenter), and in Milan I've got a 100M/10M fiber! Those bandwidth resources are amazing, usually quite cheap (home broadband lines), and widely available in end-users' hands. IMHO those are the bandwidth resources, widely available, cheap, very diverse/sparse, that could help the Tor network to scale up. How to use them properly within/for the Tor network? That's a different topic.
But those big bandwidth resources are there, available under our feet, in our homes, and we're not using them! -naif
Re: [tor-dev] Scaling tor for a global population
Besides the fact that this could be a great opportunity for tor in many ways, I see two problems we should consider:

- This vast growth would be artificial. What happens to all the users and servers if they stop supporting the product or close down?
- IMHO it is a problem if the network expands because SOME users pay for it. Don't get me wrong: I know that many people are willing to spend their private money to run tor-relays ... but I know of no user who has to pay for it. (Except torplug, but that is a different story.)

Possible that I'm getting something completely wrong, but just for the record! -- M
Re: [tor-dev] Scaling tor for a global population
On 26 September 2014 22:28, Mike Perry mikepe...@torproject.org wrote: That's basically what I'm arguing: We can increase the capacity of the network by reducing directory waste but adding more high capacity relays to replace this waste, causing the overall directory to be the same size, but with more capacity. I'm sure that diffs will make a huge difference, but if you're focusing on the directory documents, why not also change the consensus and related document formats to be something more efficient than ASCII text? Taking the latest consensus and doing some rough estimates, I found the following:

 Original consensus, xz-ed: 407K
 Change flags to uint16: ~399K
 +Removing names: 363K
 +Compressing IPv6 to 16 bytes + 4 bytes: 360K
 +Compressing IPv4 to 4 bytes + 4 bytes + 4 bytes: 315K
 +Compressing the datetime to 4 bytes: 291K
 +Compressing the version string to 4 bytes: 288K
 +Replacing reject 1-65K with a single byte: 287K
 +Replacing Bandwidth=# with 4 bytes: 273K

These numbers are optimistic - you won't see quite this much gain, but if I'm understanding you correctly that the consensus is painful, it seems like you could save at least 50K-70K out of 400K with relatively straightforward changes. -tom
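A fixed-width entry along the lines Tom estimates can be sketched with stdlib struct packing. The field layout, flag bitfield, and version encoding below are illustrative assumptions, not a proposed wire format:

```python
import datetime
import ipaddress
import struct

# Sketch of a compact binary consensus router entry: IPv4 in 4 bytes,
# port in 2, flags as a uint16 bitfield, datetime/version/bandwidth in
# 4 bytes each. All field choices here are assumptions for illustration.
def pack_entry(ipv4, orport, flags, published, version, bandwidth):
    return struct.pack(
        "!4sHHIII",
        ipaddress.IPv4Address(ipv4).packed,  # 4-byte IPv4 address
        orport,                              # 2-byte ORPort
        flags,                               # uint16 flag bitfield
        int(published.timestamp()),          # 4-byte published time
        version,                             # 4-byte packed version
        bandwidth,                           # 4-byte bandwidth weight
    )

entry = pack_entry(
    "203.0.113.7", 9001, 0b0000000000000101,
    datetime.datetime(2014, 9, 30, tzinfo=datetime.timezone.utc),
    0x00020503,  # e.g. version 0.2.5.3 packed one byte per component
    5000,
)
print(len(entry))  # 20 bytes per relay, before any on-the-wire compression
```

At 20 bytes per relay before names, exit-policy summaries, and digests, this kind of packing is where the tens-of-KB savings in the table above come from, even after xz.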
Re: [tor-dev] Scaling tor for a global population
But because this fraction rises with both D and U, these research papers rightly point out that you can't keep adding relays *and* users and expect Tor to scale.

Why not have each client download only a fraction of the available directory? For example, hash each relay's ECC/RSA identity key (even md5 would do here, since it is only being used as a uniform mapping, not for security). A user connecting to the network generates an 8-bit random value and, based on it, downloads one of 256 directory shards.

Right now, any relay with more than ~100Mbit of capacity really needs to run an additional Tor relay instance on that link to make use of it. With AES-NI, this might go up to 300Mbit. Any plans to use ChaCha8 instead of AES? It would be an order of magnitude faster, and it is unlikely that ChaCha8 will become insecure enough to matter for web-sized, non-video traffic. ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
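The sharding idea above can be sketched in a few lines. This is only an illustration of the bucketing scheme being proposed (the function names and 20-byte keys are made up), not anything Tor implements:

```python
import hashlib
import os

NUM_SHARDS = 256

def shard_of_relay(identity_key: bytes) -> int:
    # Any fast hash works as a uniform mapping; md5 is suggested in the
    # email purely for bucketing, not for any security property.
    return hashlib.md5(identity_key).digest()[0]

def client_shard() -> int:
    # Each client draws one random byte and fetches only that 1/256 slice.
    return os.urandom(1)[0]

def build_shards(identity_keys):
    # Directory side: group the full relay list into 256 shards.
    shards = {i: [] for i in range(NUM_SHARDS)}
    for key in identity_keys:
        shards[shard_of_relay(key)].append(key)
    return shards
```

The obvious trade-off, which the papers cited earlier analyze, is that a client who only knows 1/256 of the relays builds paths from a smaller anonymity set, and a malicious directory could partition users by serving them distinguishable shards.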
Re: [tor-dev] Scaling tor for a global population
M. Ziebell: Besides the fact that this could be a great opportunity for Tor in many ways, I see two problems we should consider: this vast growth would be artificial. What happens to all the users and servers if they stop supporting the product or close down?

In this case, presumably the users would disappear, as this vendor's browser product would no longer include Tor (and I would hope they push out an update that removed the feature).

IMHO it is a problem if the network expands because only SOME users pay for it. Don't get me wrong. I know that many people are willing to spend their private money to run Tor relays ... but I know of no user who has to pay for it. (Except torplug, but that is a different story.)

Well, I think the model we are considering is that a vendor is monetizing their users some other way (perhaps because they purchased the phone that comes with Tor on it, or through advertising revenue during non-private mode). This vendor would presumably then have an interest in contributing either relays, money, or both to make sure the Tor network is fast enough to be useful to their users.

I am worried that this particular vendor is thinking about just using our Tor Browser patches without talking to us on the engineering side, but I guess this is still in the initial stages of discussion. -- Mike Perry
Re: [tor-dev] Scaling tor for a global population
Mike Perry: 5. Invest in the Tor network. Based purely on extrapolating from the Noisebridge relays, we could add ~300 relays and double the network capacity for $3M/yr, or about $1 per user per year (based on the user counts from https://metrics.torproject.org/users.html). Note that this value should be treated as a minimum estimate. We actually want to ensure diversity as we grow the network, which may make this number higher. I am working on better estimates using replies from: https://lists.torproject.org/pipermail/tor-relays/2014-September/005335.html

Automated donation/funding distribution mechanisms such as https://www.oniontip.com/ are especially interesting ways to do this (and can even automatically enforce our diversity goals), but more traditional partnerships are also possible. Result: 100% capacity increase for each O($3M/yr), or ~$1 per new user per year.

Naif's point about there being 100Mbit residential uplinks out there suggests that there may be a hybrid approach here. If this vendor could detect super-high-speed client uplinks, they could ask only these users if they wanted to be non-exit relays. But this is complicated, as it also requires understanding whether the user's ISP will get upset at the traffic consumption or at the fact that a listening TCP service is running. For example, I know Comcast calls their residential service unlimited, but yells at you if you transfer more than 250GB in a month, or if they discover any listening TCP ports on your IP address.

Even if we could figure these problems out by looking up ISP policy based on client IP address, I think we still need to fund exit relays. I don't think we can just enlist random home users' connections to be exits without giving them a wall of text explaining how to deal with issues that may arise. So this may be something to consider to reduce network expenditure, but it won't completely eliminate it.
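The ~$1-per-user figure can be sanity-checked with quick arithmetic. The relay count and budget come from the thread; the ~3M daily-user figure is an assumption read off the metrics page cited above, so treat every number here as a rough estimate rather than a measurement:

```python
# Figures from the thread (assumptions, not measurements):
budget_per_year = 3_000_000       # dollars/yr to double network capacity
new_relays = 300                  # extrapolated from the Noisebridge relays
new_users_supported = 3_000_000   # one doubling ~ one new cohort of users

cost_per_relay = budget_per_year / new_relays              # $10,000/relay/yr
cost_per_new_user = budget_per_year / new_users_supported  # ~$1/user/yr

# Scaling linearly to 100M daily users at the same marginal cost:
target_users = 100_000_000
estimated_budget = target_users * cost_per_new_user        # ~$100M/yr
```

The linear extrapolation is the weakest step: as the thread notes, diversity requirements and per-user traffic differences (Netflix, YouTube) could push the real marginal cost well above $1/user.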
-- Mike Perry
[tor-dev] Scaling tor for a global population
I had a conversation with a vendor yesterday. They are interested in including Tor as their private browsing mode: basically shipping a re-branded Tor Browser which lets people toggle the connectivity to the Tor network on and off. They very much like Tor Browser and would like to ship it to their customer base. Their product is used by 10-20% of the global market, out of roughly 2.8 billion global Internet users.

As Tor Browser is open source, they are already working on it. However, their concern is scaling up to handle some percentage of global users with Tor mode enabled. They're willing to entertain offering their resources to help us solve the scalability challenges of handling hundreds of millions of users and relays on Tor.

As this question keeps popping up from the business world, which is looking at privacy as the next must-have feature in their products, I'm trying to compile a list of tasks to solve to help us scale. The old 2008 three-year roadmap looks at performance: https://www.torproject.org/press/2008-12-19-roadmap-press-release.html.en

I've been through the specs, https://gitweb.torproject.org/torspec.git/tree/HEAD:/proposals, to see if there are proposals for scaling the network or directory authorities. I didn't see anything directly related. The last research papers I see directly addressing scalability are Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) and PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor). Is there a better list available for someone new to Tor to read up on the scalability challenges? -- Andrew pgp 0x6B4D6475 https://www.torproject.org/ +1-781-948-1982
Re: [tor-dev] Scaling tor for a global population
On 9/26/14, 4:58 PM, Andrew Lewman wrote: They very much like Tor Browser and would like to ship it to their customer base. Their product is 10-20% of the global market, out of roughly 2.8 billion global Internet users.

WOW!

Is there a better list available for someone new to Tor to read up on the scalability challenges?

As a basic concept, I don't think that Tor can scale up to huge numbers without making end users become an active part of the network routing. -- Fabio Pietrosanti (naif) HERMES - Center for Transparency and Digital Human Rights http://logioshermes.org - http://globaleaks.org - http://tor2web.org
Re: [tor-dev] Scaling tor for a global population
I think one of the important thoughts here, at least as an exit operator, is that having a large group like that behind us could significantly influence how Tor is seen, and I am sure that kind of backing could open many avenues for us. If we are to scale up, we can reduce (or optimise) CPU load per node, or we can have more ISPs who welcome Tor. Whilst there are lots of great people doing fantastic work expanding the set of ISPs who accept Tor, we need to perhaps revisit those ISPs who have shown themselves hostile towards us.

Do you believe this person or group could assist in persuading ISPs to open up to Tor exit operators like myself? If they are willing to offer their name as backing, I'd be more than happy to dedicate many hours per week to getting in touch with ISPs and trying to change their policies. If we see much success, I can easily co-ordinate a revamp of the good/bad ISP list, which has become a bit messy over the last few months.

Given the sheer volume of traffic my exits have pushed (petabytes a month), the amount of abuse complaints I've had, and even police raids, I am quite comfortable giving ISPs the honest picture. Some won't open up even with major backing, but I am sure we can convince some to change their policies when they see and hear from actual operators who aren't in prison (since many of them seem to equate running Tor exits with being a criminal or a guaranteed way to get in trouble with the police). - -T

On 26/09/2014 15:58, Andrew Lewman wrote: I had a conversation with a vendor yesterday. They are interested in including Tor as their private browsing mode [...]