Re: [tor-dev] Hidden Service Scaling
On 04/05/14 07:42, Christopher Baines wrote: On 04/05/14 11:43, waldo wrote: On 02/05/14 02:34, Christopher Baines wrote: On 02/05/14 00:45, waldo wrote:

I am worried about an attack coming from an evil IP, based on forced disconnection of the HS from the IP. I don't know if this is possible, but I am worried that picking a new circuit randomly could be highly problematic. Let's say I am the NSA and I own 10% of the routers, and I keep disconnecting your HS from an IP I control. If you select a new circuit randomly, even if the probabilities are low, it is only a matter of time until I force you to use a specific circuit, out of many, that is convenient to me: one that transfers your original IP as metadata through cooperating routers I own, and so does away with the anonymity of the hidden service.

Yeah, that does appear plausible. Are there guard nodes used for hidden service circuits (I forget)?

No idea. According to the docs at https://www.torproject.org/docs/hidden-services.html.en there are no guards in the circuits to the IP in step one (none are mentioned). They are definitely used in step five, to protect against a timing attack with a corrupt entry node. Even if they are used, I still see some problems. I mean, it looks convenient to try to reconnect to the same IP, but in real life you are going to find nodes that fail a lot, so if you picked an IP that has bad connectivity, reconnecting to it is not going to contribute at all to the scalability or availability of your HS; on the contrary.

I don't think a minority of bad IP's will do much to hurt a hidden service.

Hi Christopher. You are correct that a minority can't do much harm, but they don't contribute either, so what's the point in keeping them? I don't mean to be rude, but "minority" is also relative. Can you please tell us what the total number of IPs is? I ask because you were working on this, so you likely know better.
If it is 3, then one bad node is 33% of failed connections; if there are 50, one is only 2%. Clients will try connecting through all the IP's until giving up, and this will only happen when they initially connect. What I've noticed is that the initial connection is what causes most trouble. Once you establish a rendezvous with the HS, things go smoothly. That is my personal experience; I don't know what others experience. I've also noticed that highly used services like the hidden wiki tend to behave a whole lot better. No idea why. It could be related to the HSDir query and this: http://donncha.is/2013/05/trawling-tor-hidden-services/. I don't know if that was fixed.

Maybe a good idea would be to try to reconnect and, if it is failing too much, select another IP.

It currently does do this, but probably on a shorter time period than you are suggesting. It keeps a count of connection failures while trying to reconnect, but this is reset once a new connection is established.

Yes, I meant measuring over a larger time period and over different circuits, as the cause of disconnection could have been the circuit failing and not the IP. What happens if the HS just goes offline for a while? It keeps trying to connect, finds that it can't connect to the IPs, and picks another set? Are you differentiating that case? How do they coordinate which one publishes the descriptor? Which one puts up the first descriptor?

This gets complicated, as you need to ensure that each instance of the service is using the same introduction points.

I saw your answer to another person, and it seems related to this. You are saying: if the service keys (randomly generated keys per introduction point) are used, then this would complicate/cause problems with the multiple instances connecting to one introduction point. Only one key would be listed in the descriptor, which would only allow one instance to get the traffic. What if the instances interchange keys and use the same one?
Master/slave, for example, with one of them taking the master role if the master goes offline. Let's say the master instance creates the IPs and sends a message to the rest to connect there. How about changing the descriptor to host several keys per IP, in case the previous idea is not possible or too difficult? Why does this need to be ensured? Does it break something? I understand it could be convenient to have at least two, to avoid correlation attacks from the IP, but why all? What would happen if one instance goes offline? Does the whole thing break? The desirable behaviour in that case, I think, is that the other instances take over and split the load as if nothing had happened. I also think it is highly desirable that they are indistinguishable, to give away less information (enumeration, etc). If they use the same key, you could send the rendezvous message to all instances from the IP, as all would have the same private key and could decrypt it (if the IP is not shared by different HSes; I don't know if this is currently possible), so the message doesn't get lost in a failed circuit. If it is
Re: [tor-dev] Hidden Service Scaling
On 09/05/14 20:05, waldo wrote: On 04/05/14 07:42, Christopher Baines wrote: On 04/05/14 11:43, waldo wrote: On 02/05/14 02:34, Christopher Baines wrote: On 02/05/14 00:45, waldo wrote:

I am worried about an attack coming from an evil IP, based on forced disconnection of the HS from the IP. I don't know if this is possible, but I am worried that picking a new circuit randomly could be highly problematic. Let's say I am the NSA and I own 10% of the routers, and I keep disconnecting your HS from an IP I control. If you select a new circuit randomly, even if the probabilities are low, it is only a matter of time until I force you to use a specific circuit, out of many, that is convenient to me: one that transfers your original IP as metadata through cooperating routers I own, and so does away with the anonymity of the hidden service.

Yeah, that does appear plausible. Are there guard nodes used for hidden service circuits (I forget)?

No idea. According to the docs at https://www.torproject.org/docs/hidden-services.html.en there are no guards in the circuits to the IP in step one (none are mentioned). They are definitely used in step five, to protect against a timing attack with a corrupt entry node. Even if they are used, I still see some problems. I mean, it looks convenient to try to reconnect to the same IP, but in real life you are going to find nodes that fail a lot, so if you picked an IP that has bad connectivity, reconnecting to it is not going to contribute at all to the scalability or availability of your HS; on the contrary.

I don't think a minority of bad IP's will do much to hurt a hidden service.

Hi Christopher. You are correct that a minority can't do much harm, but they don't contribute either, so what's the point in keeping them? I don't mean to be rude, but "minority" is also relative. Can you please tell us what the total number of IPs is? I ask because you were working on this, so you likely know better.
If it is 3, then one bad node is 33% of failed connections; if there are 50, one is only 2%.

I agree that it would be good if the service could detect and avoid bad IP's, but I don't yet see a good method for doing so (that fits within the rest of the design). Regarding the number of IP's, unfortunately I also don't know. This is possible to look up though, as you could modify a node running in the real network to log the number of nodes it considers choosing for an IP (I just haven't got time to do that atm). There might also be an easier way to do it with some existing Tor network stats tool.

Maybe a good idea would be to try to reconnect and, if it is failing too much, select another IP.

It currently does do this, but probably on a shorter time period than you are suggesting. It keeps a count of connection failures while trying to reconnect, but this is reset once a new connection is established.

Yes, I meant measuring over a larger time period and over different circuits, as the cause of disconnection could have been the circuit failing and not the IP. What happens if the HS just goes offline for a while? It keeps trying to connect, finds that it can't connect to the IPs, and picks another set? Are you differentiating that case?

I am unsure what you mean here; can you clarify that you do mean the HS, and what "It" refers to?

How do they coordinate which one publishes the descriptor? Which one puts up the first descriptor?

So, starting from a state where you have no instances of a hidden service running: you start instance 1, it comes up and checks for a descriptor. This fails, as the service is new and has not been published before. It picks some introduction points (it does not matter how) and publishes a descriptor. You then start instance 2; like 1, it comes up and checks for a descriptor. This succeeds, and instance 2 then connects to each of the introduction points in the descriptor. So in terms of coordination, there is none.
You just have to start the instances one after the other (in the general case, you just have to start one before the rest). Thinking through this now has also brought up another point of interesting behaviour which I don't think I have tested: what happens if the descriptor contains one or more unreachable IP's at the time the second instance retrieves it... (just thought I would note this here).

This gets complicated, as you need to ensure that each instance of the service is using the same introduction points.

I saw your answer to another person, and it seems related to this. You are saying: if the service keys (randomly generated keys per introduction point) are used, then this would complicate/cause problems with the multiple instances connecting to one introduction point. Only one key would be listed in the descriptor, which would only allow one instance to get the traffic. What if the instances interchange keys and use the same one? Master/slave, for example, with one of them taking the master role if
Re: [tor-dev] Hidden Service Scaling
On 09/05/14 16:03, Christopher Baines wrote:

Maybe a good idea would be to try to reconnect and, if it is failing too much, select another IP.

It currently does do this, but probably on a shorter time period than you are suggesting. It keeps a count of connection failures while trying to reconnect, but this is reset once a new connection is established.

Yes, I meant measuring over a larger time period and over different circuits, as the cause of disconnection could have been the circuit failing and not the IP. What happens if the HS just goes offline for a while? It keeps trying to connect, finds that it can't connect to the IPs, and picks another set? Are you differentiating that case?

I am unsure what you mean here; can you clarify that you do mean the HS, and what "It" refers to?

Sorry, I meant the HS instance. If the instance goes offline, let's say the network interface goes down, it keeps running but can't create any circuits to the IP. It looks, according to your other explanations, like it reads the HSDir and tries to reconnect to the old IPs.

___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
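The coordination-free start-up flow Christopher describes (the first instance publishes a descriptor; later instances fetch it and join the same introduction points) can be sketched as follows. This is an illustrative model only, not Tor's actual code; `DescriptorDirectory` and `start_instance` are made-up names:

```python
import random

class DescriptorDirectory:
    """Toy stand-in for the HSDir system (illustrative, not Tor's API)."""
    def __init__(self):
        self._descriptors = {}

    def fetch(self, address):
        return self._descriptors.get(address)   # None if never published

    def publish(self, address, intro_points):
        self._descriptors[address] = list(intro_points)

def start_instance(address, hsdir, relays, num_intro_points=3):
    """Reuse the published descriptor's introduction points if one exists,
    otherwise pick a set and publish it.  No other coordination is needed:
    the first instance to start publishes, later instances just follow."""
    intro_points = hsdir.fetch(address)
    if intro_points is None:
        intro_points = random.sample(relays, num_intro_points)
        hsdir.publish(address, intro_points)
    return intro_points
```

Run twice against the same directory, both instances end up with an identical set of introduction points, which is the property the prototype relies on.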
Re: [tor-dev] Hidden Service Scaling
On 07/05/14 18:30, Michael Rogers wrote: On 07/05/14 17:32, Christopher Baines wrote:

What about the attack suggested by waldo, where a malicious IP repeatedly breaks the circuit until it's rebuilt through a malicious middle node? Are entry guards enough to protect the service's anonymity in that case?

I think it is a valid concern. Assume the attacker has identified their node as an IP and has the corresponding public key. They can then get the service to create new circuits to their node, by just causing the existing ones to fail. Using guard nodes for those circuits would seem to be helpful, as this would greatly reduce the chance that the attacker's nodes are used in the first hop. If guard nodes were used (assuming that they are currently not), you would have to be careful to act correctly when the guard node fails, in terms of using a different guard, or selecting a new guard to use instead (in an attempt to still connect to the introduction point).

Perhaps it would make sense to pick one or more IPs per guard, and change those IPs when the guard is changed? Then waldo's attack by a malicious IP would only ever discover one guard.

If you change the IP's when the guard is changed, this could break the consistency between different instances of the same service (assuming that the different instances are using different guards).
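The value of guards against this forced-rebuild attack can be put in rough numbers. Without a guard, every forced rebuild gives the attacker a fresh chance at the first hop; with a guard, the exposure is fixed once when the guard is chosen. A back-of-the-envelope sketch (illustrative only, not a full model of Tor's path selection):

```python
def p_attacker_first_hop(f, rebuilds, uses_guards):
    """Probability the attacker ever appears as the first hop, given they
    run a fraction f of relays and can force the given number of circuit
    rebuilds.  Without guards each rebuild samples a fresh first hop; with
    a guard the exposure is a one-off draw made when the guard is picked."""
    if uses_guards:
        return f                      # decided once, at guard selection
    return 1 - (1 - f) ** rebuilds    # fresh first hop every rebuild

# An attacker running 10% of relays who can force 100 rebuilds:
no_guard = p_attacker_first_hop(0.10, 100, uses_guards=False)   # ~0.99997
with_guard = p_attacker_first_hop(0.10, 100, uses_guards=True)  # 0.10
```

This is why the guard-per-IP suggestion matters: however many rebuilds the malicious IP forces, it can only ever probe the one guard tied to it.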
Re: [tor-dev] Hidden Service Scaling
On 06/05/14 22:07, Christopher Baines wrote: On 06/05/14 15:29, Michael Rogers wrote:

Does this mean that at present, the service builds a new IP circuit (to a new IP?) every time it receives a connection? If so, is it the IP or the service that closes the old circuit?

Not quite. When the service (instance, or instances) selects an introduction point, a circuit to that introduction point is built. This is a long-term circuit, through which the RELAY_COMMAND_INTRODUCE2 cells can be sent. This circuit enables the IP to contact the service when a client asks it to do so. Currently, an IP will close any existing circuits which are for a common purpose and service.

Thanks for the explanation! Cheers, Michael
Re: [tor-dev] Hidden Service Scaling
On 07/05/14 13:51, Michael Rogers wrote: On 06/05/14 22:17, Christopher Baines wrote:

If so, then yes. When I implemented the deterministic selection of introduction points, I had to implement a reconnection mechanism to ensure that the introduction point would only be changed if it had failed, and not in the case of intermittent network issues (the degree to which I have actually done this might vary).

Is it necessary to know why the circuit broke, or is it sufficient to try rebuilding the circuit, and pick a new IP if the old one isn't reachable?

I imagine that the service will still have to try connecting via an alternate route, as even if it was told that the introduction point is no longer available, it should still check anyway (to avoid being tricked).

What about the attack suggested by waldo, where a malicious IP repeatedly breaks the circuit until it's rebuilt through a malicious middle node? Are entry guards enough to protect the service's anonymity in that case?

I think it is a valid concern. Assume the attacker has identified their node as an IP and has the corresponding public key. They can then get the service to create new circuits to their node, by just causing the existing ones to fail. Using guard nodes for those circuits would seem to be helpful, as this would greatly reduce the chance that the attacker's nodes are used in the first hop. If guard nodes were used (assuming that they are currently not), you would have to be careful to act correctly when the guard node fails, in terms of using a different guard, or selecting a new guard to use instead (in an attempt to still connect to the introduction point).
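The behaviour discussed here — only give up on an introduction point after several attempts over independent routes, since a single broken circuit may just mean a bad relay on the path — might look roughly like the sketch below. The function name and the retry threshold are illustrative, not taken from the prototype:

```python
def should_replace_intro_point(ip, build_circuit, max_failures=5):
    """Try several independent circuits to the introduction point.
    Only if all attempts fail do we treat the IP itself as down and
    replace it; one failing circuit may just be a bad middle relay,
    or a malicious relay tearing the circuit down on purpose."""
    failures = 0
    while failures < max_failures:
        if build_circuit(ip):      # each call picks a fresh route
            return False           # reachable again: keep this IP
        failures += 1
    return True                    # consistently unreachable: replace it
```

Checking over an alternate route also covers the "avoid being tricked" case: a claim that the IP is gone is never taken at face value.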
Re: [tor-dev] Hidden Service Scaling
On 07/05/14 17:32, Christopher Baines wrote:

What about the attack suggested by waldo, where a malicious IP repeatedly breaks the circuit until it's rebuilt through a malicious middle node? Are entry guards enough to protect the service's anonymity in that case?

I think it is a valid concern. Assume the attacker has identified their node as an IP and has the corresponding public key. They can then get the service to create new circuits to their node, by just causing the existing ones to fail. Using guard nodes for those circuits would seem to be helpful, as this would greatly reduce the chance that the attacker's nodes are used in the first hop. If guard nodes were used (assuming that they are currently not), you would have to be careful to act correctly when the guard node fails, in terms of using a different guard, or selecting a new guard to use instead (in an attempt to still connect to the introduction point).

Perhaps it would make sense to pick one or more IPs per guard, and change those IPs when the guard is changed? Then waldo's attack by a malicious IP would only ever discover one guard.

Cheers, Michael
Re: [tor-dev] Hidden Service Scaling
Hi Christopher,

I'm interested in your work because the hidden service protocol doesn't seem to perform very well for hidden services running on mobile devices, which frequently lose network connectivity. I wonder if the situation can be improved by choosing introduction points deterministically.

On 30/04/14 22:06, Christopher Baines wrote:

- multiple connections for one service to an introduction point are allowed (previously, existing ones were closed)

Does this mean that at present, the service builds a new IP circuit (to a new IP?) every time it receives a connection? If so, is it the IP or the service that closes the old circuit?

Thanks, Michael
Re: [tor-dev] Hidden Service Scaling
On Sat, May 3, 2014 at 5:58 AM, Christopher Baines cbain...@gmail.com wrote: On 03/05/14 11:21, George Kadianakis wrote: On 08/10/13 06:52, Christopher Baines wrote:

In short, I modified tor such that: - The service's public key is used in the connection to introduction points (a return to the state as of the v0 descriptor)

Ah, this means that now IPs know which HSes they are serving (even if they don't have the HS descriptor). Why was this change necessary?

If the service keys (randomly generated keys per introduction point) are used, then this would complicate/cause problems with the multiple instances connecting to one introduction point. Only one key would be listed in the descriptor, which would only allow one instance to get the traffic. Using the same key is good; using the service's key is not great. One possible improvement might be to generate a key for an introduction point based off the identity of the introduction point, plus some other stuff to make it secure.

Would it make sense to solve this problem using a similar approach to the key blinding described in proposal 224? For example, if the public key is g^x and the introduction point has identity (e.g. fingerprint) y, then the IP blinding factor would be

t_{IP} = Hash(y | g^x)

and the IP-specific public key would be

P_{IP} = g^{x*t_{IP}}

This way the IP doesn't learn what HS it's serving if it doesn't know the descriptor, but any HS server that knows the secret key (x) can compute the IP secret key x*t_{IP}.

-- Nicholas Hopper Associate Professor, Computer Science & Engineering, University of Minnesota Visiting Research Director, The Tor Project
Re: [tor-dev] Hidden Service Scaling
On Tue, May 06, 2014 at 03:29:03PM +0100, Michael Rogers wrote:

I'm interested in your work because the hidden service protocol doesn't seem to perform very well for hidden services running on mobile devices, which frequently lose network connectivity. I wonder if the situation can be improved by choosing introduction points deterministically.

I think https://trac.torproject.org/projects/tor/ticket/8239 would resolve a lot of this problem. Somebody should write the patch. :)

--Roger
Re: [tor-dev] Hidden Service Scaling
On 06/05/14 20:13, Nicholas Hopper wrote: On Sat, May 3, 2014 at 5:58 AM, Christopher Baines cbain...@gmail.com wrote: On 03/05/14 11:21, George Kadianakis wrote: On 08/10/13 06:52, Christopher Baines wrote:

In short, I modified tor such that: - The service's public key is used in the connection to introduction points (a return to the state as of the v0 descriptor)

Ah, this means that now IPs know which HSes they are serving (even if they don't have the HS descriptor). Why was this change necessary?

If the service keys (randomly generated keys per introduction point) are used, then this would complicate/cause problems with the multiple instances connecting to one introduction point. Only one key would be listed in the descriptor, which would only allow one instance to get the traffic. Using the same key is good; using the service's key is not great. One possible improvement might be to generate a key for an introduction point based off the identity of the introduction point, plus some other stuff to make it secure.

Would it make sense to solve this problem using a similar approach to the key blinding described in proposal 224? For example, if the public key is g^x and the introduction point has identity (e.g. fingerprint) y, then the IP blinding factor would be t_{IP} = Hash(y | g^x) and the IP-specific public key would be P_{IP} = g^{x*t_{IP}}. This way the IP doesn't learn what HS it's serving if it doesn't know the descriptor, but any HS server that knows the secret key (x) can compute the IP secret key x*t_{IP}.

Yes, from the non-mathematical explanation, that seems to fit the requirements fine.
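The blinding scheme Nicholas sketches can be checked numerically. The toy code below uses a multiplicative group mod a prime purely for illustration; proposal 224 actually specifies Ed25519, and the real construction hashes in more context than this:

```python
import hashlib

# Toy multiplicative group, for illustration only: proposal 224 uses
# Ed25519, not integers mod p, and hashes more context into the factor.
P = 2**127 - 1          # a Mersenne prime; the group exponents work mod P - 1
G = 3

def blinding_factor(ip_fingerprint, public_key):
    """t_IP = Hash(y | g^x), reduced into the exponent group."""
    h = hashlib.sha256(ip_fingerprint + b"|" + str(public_key).encode())
    return int.from_bytes(h.digest(), "big") % (P - 1)

x = 123456789                      # service's long-term secret key
g_x = pow(G, x, P)                 # service's public key g^x

t = blinding_factor(b"ip-fingerprint-y", g_x)

# The service, knowing x, derives the IP-specific key from the secret x*t ...
p_ip_from_secret = pow(G, (x * t) % (P - 1), P)
# ... while anyone holding g^x and the IP identity derives the same
# IP-specific *public* key without ever learning x:
p_ip_from_public = pow(g_x, t, P)

assert p_ip_from_secret == p_ip_from_public
```

The final assertion is exactly the property needed: every instance holding x can serve the IP-specific key, while the IP itself sees only an unlinkable blinded key.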
Re: [tor-dev] Hidden Service Scaling
On 06/05/14 15:29, Michael Rogers wrote:

I'm interested in your work because the hidden service protocol doesn't seem to perform very well for hidden services running on mobile devices, which frequently lose network connectivity. I wonder if the situation can be improved by choosing introduction points deterministically.

Unfortunately, I don't really see how anything I have done could have helped with this. Assuming that the mobile device has maintained connectivity during the connection phase, and you now have the 6 hop circuit through the RP, the behaviour from then on is unchanged, and this is where I assume the problems with losing connectivity occur?

On 30/04/14 22:06, Christopher Baines wrote:

- multiple connections for one service to an introduction point are allowed (previously, existing ones were closed)

Does this mean that at present, the service builds a new IP circuit (to a new IP?) every time it receives a connection? If so, is it the IP or the service that closes the old circuit?

Not quite. When the service (instance, or instances) selects an introduction point, a circuit to that introduction point is built. This is a long-term circuit, through which the RELAY_COMMAND_INTRODUCE2 cells can be sent. This circuit enables the IP to contact the service when a client asks it to do so. Currently, an IP will close any existing circuits which are for a common purpose and service. The modification I attempt to describe above is the disabling of this functionality, so that a hidden service instance (or multiple instances of the same hidden service) can connect to the same introduction point through multiple circuits. There are also some additional modifications needed to make the RELAY_COMMAND_INTRODUCE2 handling work with multiple circuits.
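With that change, an introduction point can hold several circuits for the same service (one per instance), so it needs some policy for which circuit receives each RELAY_COMMAND_INTRODUCE2 cell. The sketch below uses round-robin purely as an illustration; the thread does not specify the prototype's actual dispatch policy, and the class is a toy model, not Tor code:

```python
import itertools

class IntroPointState:
    """Toy model of an introduction point that, instead of closing older
    circuits for the same service, keeps them all and hands each incoming
    INTRODUCE2 cell to the next circuit in turn (round-robin sketch)."""
    def __init__(self):
        self.circuits = []
        self._next = None

    def add_circuit(self, circ):
        # Another instance of the same service connected: keep the old
        # circuits open rather than closing them.
        self.circuits.append(circ)
        self._next = itertools.cycle(self.circuits)

    def deliver_introduce2(self, cell):
        circ = next(self._next)
        return (circ, cell)        # forward the cell down this circuit
```

Any such policy also gives load balancing for free: client introductions are spread across whichever instances are currently connected.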
Re: [tor-dev] Hidden Service Scaling
On 06/05/14 21:19, Roger Dingledine wrote: On Tue, May 06, 2014 at 03:29:03PM +0100, Michael Rogers wrote:

I'm interested in your work because the hidden service protocol doesn't seem to perform very well for hidden services running on mobile devices, which frequently lose network connectivity. I wonder if the situation can be improved by choosing introduction points deterministically.

I think https://trac.torproject.org/projects/tor/ticket/8239 would resolve a lot of this problem. Somebody should write the patch. :)

I have implemented this (or something similar). I will try to extract it as a patch, which can be applied independently of anything else I have changed. This might take a few weeks, as I have exams looming.
Re: [tor-dev] Hidden Service Scaling
On 06/05/14 22:07, Christopher Baines wrote: On 06/05/14 15:29, Michael Rogers wrote:

I'm interested in your work because the hidden service protocol doesn't seem to perform very well for hidden services running on mobile devices, which frequently lose network connectivity. I wonder if the situation can be improved by choosing introduction points deterministically.

Unfortunately, I don't really see how anything I have done could have helped with this. Assuming that the mobile device has maintained connectivity during the connection phase, and you now have the 6 hop circuit through the RP, the behaviour from then on is unchanged, and this is where I assume the problems with losing connectivity occur?

Right, attempt two: I think I may have misinterpreted what you said. The above response relates to client behaviour for hidden services. Am I correct in saying that you actually mean hosting the hidden service from a mobile device? If so, then yes. When I implemented the deterministic selection of introduction points, I had to implement a reconnection mechanism to ensure that the introduction point would only be changed if it had failed, and not in the case of intermittent network issues (the degree to which I have actually done this might vary).
Re: [tor-dev] Hidden Service Scaling
Christopher Baines cbain...@gmail.com writes: On 08/10/13 06:52, Christopher Baines wrote:

I have been looking at doing some work on Tor as part of my degree, and more specifically, looking at Hidden Services. One of the issues where I believe I might be able to make some progress is the Hidden Service Scaling issue as described here [1]. So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal of this is twofold: to reduce the probability of failure of a hidden service, and to increase hidden service scalability.

Previous threads on this subject:
https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html
https://lists.torproject.org/pipermail/tor-dev/2013-October/005674.html

I have now implemented a prototype for one possible design of how to allow distribution in hidden services. While developing this, I also made some modifications to chutney to allow for the tests I wanted to write.

Great! Here are a few small comments from quickly reading your post.

In short, I modified tor such that:
- The service's public key is used in the connection to introduction points (a return to the state as of the v0 descriptor)

Ah, this means that now IPs know which HSes they are serving (even if they don't have the HS descriptor). Why was this change necessary?

- multiple connections for one service to an introduction point are allowed (previously, existing ones were closed)
- tor will check for a descriptor when it needs to establish all of its introduction points, and connect to the ones in the descriptor (if it is available)
- Use an approach similar to the selection of the HSDir's for the selection of new introduction points (instead of a random selection)

As you note below, this suffers from the same issue that HSDirs suffer from. Why was this necessary? Is it to avoid race conditions?
Based on the previous point, I thought that the second node of an HS would be able to get the list of IPs by reading the descriptor of the first node.

- Attempt to reconnect to an introduction point if the connection is lost

With chutney, I added support for interacting with the nodes through Stem; I also moved control over starting the nodes to the test, as this allows for more complex behaviour. Currently the one major issue is that using an approach similar to the HSDir selection means that introduction points suffer from the same issue as HSDir's currently do [1]. I believe any satisfactory solution to the HSDir issue would resolve this problem also.

One other thing of note: tor currently allows building circuits to introduction points through existing introduction points, and selecting introduction points on circuits used to connect to other introduction points. These two issues mean that a failure in one introduction point can currently cause tor to change two introduction points. (I am not saying this needs changing, but you could adjust the circuit creation to prevent some extra work later if a failure occurs.)

Any comments regarding the above would be welcome.
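The HSDir-style deterministic selection mentioned above can be illustrated with a hash ring: hash the service identity to a point, order relays by hashed fingerprint, and take the first n at or after that point. This is a sketch of the general technique only, not the prototype's exact rule:

```python
import hashlib

def ring_position(data: bytes) -> int:
    """Map an identifier onto the hash ring (SHA-1 here, for illustration)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def pick_intro_points(service_id: bytes, relay_fingerprints, n=3):
    """Deterministic, HSDir-style choice: every instance hashing the same
    service identity over the same relay list picks the same n relays,
    namely the first n at or after the service's position on the ring."""
    target = ring_position(service_id)
    ordered = sorted(relay_fingerprints, key=ring_position)
    after = [fp for fp in ordered if ring_position(fp) >= target]
    wrapped = after + ordered          # wrap around the ring if needed
    return wrapped[:n]
```

The downside noted in the thread follows directly: the positions are predictable, so an adversary can place relays just after a target service's point, the same attack that affects HSDirs.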
Re: [tor-dev] Hidden Service Scaling
On 02/05/14 00:45, waldo wrote: On 30/04/14 17:06, Christopher Baines wrote: On 08/10/13 06:52, Christopher Baines wrote:

I have been looking at doing some work on Tor as part of my degree, and more specifically, looking at Hidden Services. One of the issues where I believe I might be able to make some progress is the Hidden Service Scaling issue as described here [1]. So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal of this is twofold: to reduce the probability of failure of a hidden service, and to increase hidden service scalability.

Previous threads on this subject:
https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html
https://lists.torproject.org/pipermail/tor-dev/2013-October/005674.html

I have now implemented a prototype for one possible design of how to allow distribution in hidden services. While developing this, I also made some modifications to chutney to allow for the tests I wanted to write. In short, I modified tor such that:
- The service's public key is used in the connection to introduction points (a return to the state as of the v0 descriptor)
- multiple connections for one service to an introduction point are allowed (previously, existing ones were closed)
- tor will check for a descriptor when it needs to establish all of its introduction points, and connect to the ones in the descriptor (if it is available)
- Use an approach similar to the selection of the HSDir's for the selection of new introduction points (instead of a random selection)
- Attempt to reconnect to an introduction point if the connection is lost

I appreciate your work, since hidden services are really bad, hard to reach sometimes at the moment. But... how do you do this in detail?
Sorry, but walking over your sources could be challenging if I don't know the original codebase you used, and it is going to take more time than if I just ask you. I also can't test as I don't have enough resources/know-how/time. In terms of the code, just when the circuit to an introduction point has failed, try to establish another one. I am unsure if I have taken the best approach in terms of code, but it does seem to work. I am worried about an attack coming from an evil IP, based on forced disconnection of the HS from the IP. I don't know if this is possible, but I am worried that picking a new circuit randomly could be highly problematic. Let's say I am the NSA and I own 10% of the routers, and I keep disconnecting your HS from an IP I control. If you select a new circuit randomly, even if the probabilities are low, eventually it is a matter of time until I force you to use a specific circuit, from those convenient to me, in order to have a possible circuit (out of many) that transfers your original IP as metadata through cooperative routers I own, and then do away with the anonymity of the hidden service. Yeah, that does appear plausible. Are there guard nodes used for hidden service circuits (I forget)? The big question I have is: what is the probability of this happening with the current Tor network size? If things are like I describe, is it a matter of seconds or thousands of years? I am unsure. I implemented this, as it was quite probable when testing with a small network using chutney. When testing the behaviour of the network when an introduction point fails, you need to have reconnection; otherwise, instances which connect to other introduction points through that failed introduction point will also see those working introduction points as failing, leading to the instances using different introduction points (what I was trying to avoid).
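The reconnect-then-give-up behaviour described here can be sketched roughly as follows. This is a hypothetical Python sketch, not the actual tor C code; the class, method names and the MAX_FAILURES threshold are all illustrative:

```python
# Hypothetical sketch of per-introduction-point failure tracking:
# retry a circuit to the same intro point on failure, replace the
# intro point after too many consecutive failures, and reset the
# counter once a connection is re-established (as described in the
# thread). Not real tor code.
MAX_FAILURES = 3  # illustrative threshold

class IntroPointState:
    def __init__(self, fingerprint):
        self.fingerprint = fingerprint
        self.consecutive_failures = 0

    def on_circuit_closed(self):
        """Called when the intro circuit is lost; returns the next action."""
        self.consecutive_failures += 1
        if self.consecutive_failures >= MAX_FAILURES:
            return "replace"    # abandon this intro point, select a new one
        return "reconnect"      # retry a circuit to the same intro point

    def on_circuit_established(self):
        # The failure count is reset once a new connection is established.
        self.consecutive_failures = 0

ip = IntroPointState("A" * 40)
actions = [ip.on_circuit_closed() for _ in range(3)]
ip.on_circuit_established()
```

Measuring failures over a longer window, and across different circuits, would distinguish a flaky circuit from a genuinely bad introduction point, which is the refinement waldo suggests above.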
Re: [tor-dev] Hidden Service Scaling
El 30/04/14 17:06, Christopher Baines escribió: On 08/10/13 06:52, Christopher Baines wrote: I have been looking at doing some work on Tor as part of my degree, and more specifically, looking at Hidden Services. One of the issues where I believe I might be able to make some progress is the Hidden Service Scaling issue as described here [1]. So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal of this is twofold: to reduce the probability of failure of a hidden service and to increase hidden service scalability. Previous threads on this subject: https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html https://lists.torproject.org/pipermail/tor-dev/2013-October/005674.html I have now implemented a prototype for one possible design of how to allow distribution in hidden services. While developing this, I also made some modifications to chutney to allow for the tests I wanted to write. In short, I modified tor such that: - The service's public key is used in the connection to introduction points (a return to the state as of the v0 descriptor) - Multiple connections from one service to an introduction point are allowed (previously, existing ones were closed) - Tor will check for a descriptor when it needs to establish all of its introduction points, and connect to the ones in the descriptor (if it is available) - Use an approach similar to the selection of HSDirs for the selection of new introduction points (instead of a random selection) - Attempt to reconnect to an introduction point, if the connection is lost I appreciate your work, since hidden services are really unreliable at the moment, sometimes hard to reach. But ... how do you do this in detail? Sorry, but walking over your sources could be challenging if I don't know the original codebase you used, and it is going to take more time than if I just ask you.
I also can't test as I don't have enough resources/know-how/time. I am worried about an attack coming from an evil IP, based on forced disconnection of the HS from the IP. I don't know if this is possible, but I am worried that picking a new circuit randomly could be highly problematic. Let's say I am the NSA and I own 10% of the routers, and I keep disconnecting your HS from an IP I control. If you select a new circuit randomly, even if the probabilities are low, eventually it is a matter of time until I force you to use a specific circuit, from those convenient to me, in order to have a possible circuit (out of many) that transfers your original IP as metadata through cooperative routers I own, and then do away with the anonymity of the hidden service. The big question I have is: what is the probability of this happening with the current Tor network size? If things are like I describe, is it a matter of seconds or thousands of years? With chutney, I added support for interacting with the nodes through Stem; I also moved the control over starting the nodes to the test, as this allows for more complex behaviour. Currently the one major issue is that using an approach similar to the HSDir selection means that introduction points suffer from the same issue as HSDirs currently do [1]. I believe any satisfactory solution to the HSDir issue would resolve this problem also. One other thing of note: tor currently allows building circuits to introduction points through existing introduction points, and selecting introduction points on circuits used to connect to other introduction points. These two issues mean that a failure in one introduction point can currently cause tor to change two introduction points. (I am not saying this needs changing, but you could adjust the circuit creation to prevent some extra work later if a failure occurs.) Any comments regarding the above would be welcome.
I have put the code for this up, but it should not be used for anything other than private testing (and will not work properly outside of chutney at the moment anyway). The modifications to tor can be found in the disths branch of: git://git.cbaines.net/tor.git The modifications and additional tests for chutney can be found in the disths branch of: git://git.cbaines.net/chutney.git To run the tests against the new code, you would do something along the lines of:

git clone -b disths git://git.cbaines.net/tor.git
git clone -b disths git://git.cbaines.net/chutney.git
cd tor
./autogen.sh
./configure
make clean all
cd ../chutney
git submodule update --init
export PATH=../tor/src/or:../tor/src/tools/:$PATH
ls networks/hs-* | xargs -n 1 ./chutney configure
ls networks/hs-* | xargs -n 1 ./chutney --quiet start

The last command should yield some output similar to:

networks/hs-dual-intro-fail-3 PASS
networks/hs-intro-fail-2 PASS
networks/hs-intro-fail-3 PASS
networks/hs-intro-select-2 PASS
networks/hs-start-3 PASS
Re: [tor-dev] Hidden Service Scaling
On 08/10/13 06:52, Christopher Baines wrote: I have been looking at doing some work on Tor as part of my degree, and more specifically, looking at Hidden Services. One of the issues where I believe I might be able to make some progress is the Hidden Service Scaling issue as described here [1]. So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal of this is twofold: to reduce the probability of failure of a hidden service and to increase hidden service scalability. Previous threads on this subject: https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html https://lists.torproject.org/pipermail/tor-dev/2013-October/005674.html I have now implemented a prototype for one possible design of how to allow distribution in hidden services. While developing this, I also made some modifications to chutney to allow for the tests I wanted to write. In short, I modified tor such that: - The service's public key is used in the connection to introduction points (a return to the state as of the v0 descriptor) - Multiple connections from one service to an introduction point are allowed (previously, existing ones were closed) - Tor will check for a descriptor when it needs to establish all of its introduction points, and connect to the ones in the descriptor (if it is available) - Use an approach similar to the selection of HSDirs for the selection of new introduction points (instead of a random selection) - Attempt to reconnect to an introduction point, if the connection is lost With chutney, I added support for interacting with the nodes through Stem; I also moved the control over starting the nodes to the test, as this allows for more complex behaviour. Currently the one major issue is that using an approach similar to the HSDir selection means that introduction points suffer from the same issue as HSDirs currently do [1].
I believe any satisfactory solution to the HSDir issue would resolve this problem also. One other thing of note: tor currently allows building circuits to introduction points through existing introduction points, and selecting introduction points on circuits used to connect to other introduction points. These two issues mean that a failure in one introduction point can currently cause tor to change two introduction points. (I am not saying this needs changing, but you could adjust the circuit creation to prevent some extra work later if a failure occurs.) Any comments regarding the above would be welcome. I have put the code for this up, but it should not be used for anything other than private testing (and will not work properly outside of chutney at the moment anyway). The modifications to tor can be found in the disths branch of: git://git.cbaines.net/tor.git The modifications and additional tests for chutney can be found in the disths branch of: git://git.cbaines.net/chutney.git To run the tests against the new code, you would do something along the lines of:

git clone -b disths git://git.cbaines.net/tor.git
git clone -b disths git://git.cbaines.net/chutney.git
cd tor
./autogen.sh
./configure
make clean all
cd ../chutney
git submodule update --init
export PATH=../tor/src/or:../tor/src/tools/:$PATH
ls networks/hs-* | xargs -n 1 ./chutney configure
ls networks/hs-* | xargs -n 1 ./chutney --quiet start

The last command should yield some output similar to:

networks/hs-dual-intro-fail-3 PASS
networks/hs-intro-fail-2 PASS
networks/hs-intro-fail-3 PASS
networks/hs-intro-select-2 PASS
networks/hs-start-3 PASS
networks/hs-stop-3 PASS
networks/hs-tripple-intro-fail-3 PASS

1: https://trac.torproject.org/projects/tor/ticket/8244
Re: [tor-dev] Hidden Service Scaling
On Sun, Oct 13, 2013 at 10:22:29PM +0100, Christopher Baines wrote: On 09/10/13 18:05, Matthew Finkel wrote: These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have +1 but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. Does it? Given the above, that is, each instance of the hidden service connects once to each introduction point. Then the number of instances of a hidden service is equal to the number of connections with that key that each introduction point sees. Ah, I missed something earlier, this makes more sense now. Thanks for reiterating that point. So, this having been said, do you have some thoughts on how to counter this? At this point, introduction points are selected at random; it will be unfortunate if they can build a profile of a hidden service's usage over time. There is also uncertainty around the replacement of failing introduction points. New ones have to be chosen, but as the service instances do not directly communicate, there could be some interesting behaviour unless this is done carefully. Is there a reason they shouldn't communicate with each other? I have avoided it so far, as it increases the complexity of both the implementation and the setup. However, this is probably a minor issue, as the major issue is how service providers would want to use this.
Complex hidden services (compared to hidden services with static content) will probably require either communication between instances, or communication from all instances to another server (or set of servers). It will surely increase the complexity; however, allowing the hidden service peers to coordinate their introduction points (and/or other information) could be a useful feature. This could be especially true if we want to address the "all introduction points know the number of hidden service instances that constitute a hidden service address" problem. As a general rule, we want to minimize the number of nodes that are given a privileged position within the network. As an example, if we go back to my earlier comment and assume all instances of a hidden service use the same introduction points, then a client will use any one of the introduction points with equal probability. Given this, an introduction point 1) knows the size (number of instances) of the hidden service, 2) can influence which hidden service instances are used by clients, 3) can communicate with a HS without knowing who it is, and 4) can potentially determine the geographical location of the hidden service's users (based on when it is used). These last few points are not unique to your design, and the last point is not unique to introduction points, but these leakages are important and we should try to account for them (and plug them, if possible). (This is not an exhaustive list.) I am aware that there are several undefined parts of the above description, e.g. how does an introduction point choose what circuit to use? But at the moment I am more interested in the wider picture. It would be good to get some feedback on this. 1: https://blog.torproject.org/blog/hidden-services-need-some-love 2: http://tor.stackexchange.com/questions/13/can-a-hidden-service-be-hosted-by-multiple-instances-of-tor/24#24 This is a good start!
Some important criteria you might also think about include how much you trust each component/node, and which nodes you want to be responsible for deciding where connections are routed. Also seriously think about how something like a botnet that uses hidden services might impact the reliability of your design (crazy idea, I know). I assume the characteristics of this are: 1 or more hidden service instances, connected to by very large numbers of clients, sending and receiving small amounts of information? Perhaps, but just think about the load an intro point can handle and sustain. If introduction points are where load balancing takes place, then does this affect the difficulty of attacking a hidden service? (For some undefined definition of 'attack'.) At the moment, I am really considering the redundancy and scalability of the service. Both of these could be helped by allowing for multi-instance hidden services (in a planned and thought-through manner). Hopefully allowing for this will increase the difficulty of attacking a hidden service, not directly, but by allowing the operators to use this functionality. Understood, and I
Re: [tor-dev] Hidden Service Scaling
Christopher Baines cbain...@gmail.com writes: On 10/10/13 23:28, Paul Syverson wrote: On Wed, Oct 09, 2013 at 03:02:47PM +0100, Christopher Baines wrote: On 09/10/13 11:41, Paul Syverson wrote: These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have +1 but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. You said something similar in response to Nick; specifically, you said I believe that to mask the state and possibly number of instances, you would have to at least have some of the introduction points connecting to multiple instances. I didn't understand why you said this in either place. Someone would have to know they had a complete list of introduction points to know the number of instances, but that would depend on how HS descriptors are created, stored, and distributed. From whom is this being hidden? You didn't state the adversary. Is it HS directory servers, intro point operators, potential clients of a hidden service? I don't see why any of these necessarily learns the state or number of instances simply because each intro point is chosen by a single instance (ignoring coincidental collisions if these choices are not coordinated). To clarify, I was interpreting the goal as only the service operator should know the number of instances. In particular, the adversary here is the introduction point.
If hidden service instances only ever create one circuit to each introduction point, each introduction point knows the number of instances of every service it is an introduction point for, as this is the same as the number of circuits for that service. I'm missing something. Suppose there is a hidden service with ten instances, each of which runs its own introduction point. How do any of these ten introduction points know the number of instances because they each see a single circuit from the hidden service? Ah, I have not been explicit enough when describing the behaviour I want to implement. In my original email, I set out that each instance of the service connects to each introduction point (this has also developed/changed a bit since that email). Unfortunately, I did not state the resultant behaviour I was looking to achieve (the above), just the changes to the protocol that would result in this behaviour. That's from the PoV of introduction points. On the other hand, a client that knows the onion address of an HS (or an HSDir, before https://lists.torproject.org/pipermail/tor-dev/2013-October/005534.html gets implemented) can still get the list of all IPs (at least with the current directory design). If some of the HS peers that correspond to those IPs are down, then a client can notice this by sending INTRODUCE1 cells to all the IPs and seeing which ones fail. As a more conditional attack (from the IP PoV), let's think of a super-HS with two HS peers, where each of them has one introduction point. If one of the HS peers goes down, then the other IP might be able to figure this out using the number of introductions it handles (if we assume that each IP used to do half of the introductions of the HS, then the number of introductions will increase when one HS peer goes down).
Re: [tor-dev] Hidden Service Scaling
On 10/10/13 23:28, Paul Syverson wrote: On Wed, Oct 09, 2013 at 03:02:47PM +0100, Christopher Baines wrote: On 09/10/13 11:41, Paul Syverson wrote: These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have +1 but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. You said something similar in response to Nick; specifically, you said I believe that to mask the state and possibly number of instances, you would have to at least have some of the introduction points connecting to multiple instances. I didn't understand why you said this in either place. Someone would have to know they had a complete list of introduction points to know the number of instances, but that would depend on how HS descriptors are created, stored, and distributed. From whom is this being hidden? You didn't state the adversary. Is it HS directory servers, intro point operators, potential clients of a hidden service? I don't see why any of these necessarily learns the state or number of instances simply because each intro point is chosen by a single instance (ignoring coincidental collisions if these choices are not coordinated). To clarify, I was interpreting the goal as only the service operator should know the number of instances. In particular, the adversary here is the introduction point.
If hidden service instances only ever create one circuit to each introduction point, each introduction point knows the number of instances of every service it is an introduction point for, as this is the same as the number of circuits for that service. I'm missing something. Suppose there is a hidden service with ten instances, each of which runs its own introduction point. How do any of these ten introduction points know the number of instances because they each see a single circuit from the hidden service? Ah, I have not been explicit enough when describing the behaviour I want to implement. In my original email, I set out that each instance of the service connects to each introduction point (this has also developed/changed a bit since that email). Unfortunately, I did not state the resultant behaviour I was looking to achieve (the above), just the changes to the protocol that would result in this behaviour. Also, in your response to Nick you said that not having instances share intro points in some way would place an upper bound on the number of instances. True, but if the number of available intro points greatly exceeds the likely number of instances, this is a non-issue. I don't really follow your reasoning. If there are a thousand possible introduction points for a given HS, and each instance runs say two intro points, then that bounds the number of instances at 500 (ignoring that the intro points for different instances overlap, q.v. below). I think this resolves itself with the above clarification. And come to think of it, not true: if the instances are sometimes choosing the same intro points, then this does not bound the number of instances possible (ignoring the number of HSes or instances for which a single intro point can serve as intro point at one time). Ok, but I was assuming the current behaviour of Tor, which I believe prevents instances using some of the same introduction points. Why?
If two different instances of the same HS operated completely independently (just for the sake of argument, I'm assuming there are good reasons this wouldn't happen in reality), then they wouldn't even know they were colliding on intro points. And neither would the intro points. Given my above clarification, the instances perform some coordination via the hidden service directories. When a new instance starts, it finds existing introduction points in exactly the same way a client (who wants to connect to the hidden service) does. Also, above you said If each instance just makes one circuit. Did you mean if there is a single intro point per instance? No, as you could have one instance that makes, say, 3 circuits to just one introduction point. This can help, as it can hide the number of instances from the introduction point. Off the top of my head, I'm guessing this would be a bad idea, since the multiple circuits with the same source and destination will create more observation opportunities for either compromised Tor nodes or underlying ASes, routers, etc. I don't have a specific attack in mind, but this seems a greater threat to locating a hidden service than would be
Re: [tor-dev] Hidden Service Scaling
If the goal is to prevent introduction points from guessing the number of instances because of multiple instances using the same introduction points, shouldn't this scheme work? 1. On deployment, all instances of a hidden service have a copy of a secret bitstring (maybe the private key for the hidden service, maybe an additional secret) and the number of instances N. Every instance also has a unique instance ID k in the range [0, N-1]. 2. When selecting an introduction point, an instance only considers candidates for which hash(introduction-point-address || shared-secret) mod N = k. With this system no two instances will ever connect to the same introduction point, and it doesn't require any synchronisation between the instances other than the initial instance ID assignment. But it relies on there being enough potential introduction points for which the equality holds. This will also mean that an introduction point knows that it is always being used by the same instance of a hidden service. If you want to avoid this, you could add the current day or hour or random time period to the hashed value, but then you might get a collision when a new time period begins. Apologies if this has already been discussed. --ll
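The scheme above can be sketched in a few lines of Python. This is a hypothetical illustration: SHA-256 stands in for whatever hash tor would actually use, and the relay addresses and secret are made up:

```python
# Sketch of the deterministic intro-point partitioning scheme:
# instance k of N only considers candidates whose hashed address
# falls in bucket k, so instances never collide on an intro point
# and need no runtime coordination beyond the initial ID assignment.
import hashlib

def bucket(addr, shared_secret, n):
    """Map an introduction-point address to a bucket in [0, n-1]."""
    h = hashlib.sha256(addr.encode() + shared_secret).digest()
    return int.from_bytes(h, "big") % n

def candidates_for_instance(k, n, addrs, shared_secret):
    """Candidates instance k is allowed to pick from."""
    return [a for a in addrs if bucket(a, shared_secret, n) == k]

secret = b"per-service shared secret"       # illustrative value
addrs = ["relay%03d" % i for i in range(50)]  # made-up candidate pool
N = 4
parts = [candidates_for_instance(k, N, addrs, secret) for k in range(N)]
```

Note the trade-off the author flags: the partition is stable, so each introduction point always sees the same instance unless a time period is mixed into the hash input.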
Re: [tor-dev] Hidden Service Scaling
On 09/10/13 01:16, Matthew Finkel wrote: So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal of this is twofold: to reduce the probability of failure of a hidden service and to increase hidden service scalability. These are excellent goals. It would be even better if you made a stronger statement about hidden service failure. Something closer to increase hidden service availability, but I won't bikeshed on the wording. I agree, that is clearer. I think what I am planning distils down to two main changes. Firstly, when an OP initialises a hidden service: currently, if you start a hidden service using an existing keypair and address, the new OP's introduction points replace the existing introduction points [2]. This does provide some redundancy (if slow), but no load balancing. So an interesting thing to note about this hack is that it does provide *some* load balancing. Not much, but some. The reason for this is that Tor clients cache hidden service descriptors, so that they don't need to refetch every time they want to connect. My current plan is to change this such that if the OP has an existing public/private keypair and address, it would attempt to look up the existing introduction points (probably over a Tor circuit). If found, it then establishes introduction circuits to those Tor servers. Then comes the second problem: following the above, the introduction point would then disconnect from any other connected OP using the same public key (unsure why, as a reason is not given in the rend-spec). This would need to change such that an introduction point can talk to more than one instance of the hidden service. It's important to think about the current design based on the assumption that a hidden service is a single node. Any modifications to this assumption will change the behavior of the various components.
The only interactions I currently believe can be affected are the Hidden Service instance - Introduction point(s) and Hidden Service instance - directory server. I need to go and read more about the latter, as I don't have all the information yet. These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have +1 but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. There is also uncertainty around the replacement of failing introduction points. New ones have to be chosen, but as the service instances do not directly communicate, there could be some interesting behaviour unless this is done carefully. I am also unsure how the lack of direct communication between the hidden service instances could affect the usability of this. I think what would be good to do is take some large, open source, distributed web applications and look at how/how not to set them up using various possible implementations of distributed hidden services. I am aware that there are several undefined parts of the above description, e.g. how does an introduction point choose what circuit to use? But at the moment I am more interested in the wider picture. It would be good to get some feedback on this. 1: https://blog.torproject.org/blog/hidden-services-need-some-love 2: http://tor.stackexchange.com/questions/13/can-a-hidden-service-be-hosted-by-multiple-instances-of-tor/24#24 This is a good start!
Some important criteria you might also think about include how much you trust each component/node, and which nodes you want to be responsible for deciding where connections are routed. Also seriously think about how something like a botnet that uses hidden services might impact the reliability of your design (crazy idea, I know). I assume the characteristics of this are: 1 or more hidden service instances, connected to by very large numbers of clients, sending and receiving small amounts of information?
Re: [tor-dev] Hidden Service Scaling
On Wed, Oct 09, 2013 at 09:58:07AM +0100, Christopher Baines wrote: On 09/10/13 01:16, Matthew Finkel wrote: Then comes the second problem: following the above, the introduction point would then disconnect from any other connected OP using the same public key (unsure why, as a reason is not given in the rend-spec). This would need to change such that an introduction point can talk to more than one instance of the hidden service. It's important to think about the current design based on the assumption that a hidden service is a single node. Any modifications to this assumption will change the behavior of the various components. The only interactions I currently believe can be affected are the Hidden Service instance - Introduction point(s) and Hidden Service instance - directory server. I need to go and read more about the latter, as I don't have all the information yet. Indeed. Lots of issues there. These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have +1 but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. You said something similar in response to Nick; specifically, you said I believe that to mask the state and possibly number of instances, you would have to at least have some of the introduction points connecting to multiple instances. I didn't understand why you said this in either place.
Someone would have to know they had a complete list of introduction points to know the number of instances, but that would depend on how HS descriptors are created, stored, and distributed. From whom is this being hidden? You didn't state the adversary. Is it HS directory servers, intro point operators, potential clients of a hidden service? I don't see why any of these necessarily learns the state or number of instances simply because each intro point is chosen by a single instance (ignoring coincidental collisions if these choices are not coordinated). Also, in your response to Nick you said that not having instances share intro points in some way would place an upper bound on the number of instances. True, but if the number of available intro points far exceeds the likely number of instances, this is a nonissue. And come to think of it, not true: if the instances are sometimes choosing the same intro points then this does not bound the number of instances possible (ignoring the number of HSes or instances for which a single intro point can serve as intro point at one time). Also, above you said "If each instance just makes one circuit". Did you mean if there is a single intro point per instance? Hard to say specifically without exploring more, but in general I would be more worried about what is revealed because circuits are built to common intro points by different instances, and the intro points can recognize and manage these (e.g., dropping redundant ones), than I would be because the number of intro points puts an upper bound on instances. HTH, Paul
Re: [tor-dev] Hidden Service Scaling
On 09/10/13 11:41, Paul Syverson wrote: These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have more than one, but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. You said something similar in response to Nick, specifically you said "I believe that to mask the state and possibly number of instances, you would have to at least have some of the introduction points connecting to multiple instances." I didn't understand why you said this in either place. Someone would have to know they had a complete list of introduction points to know the number of instances, but that would depend on how HS descriptors are created, stored, and distributed. From whom is this being hidden? You didn't state the adversary. Is it HS directory servers, intro point operators, potential clients of a hidden service? I don't see why any of these necessarily learns the state or number of instances simply because each intro point is chosen by a single instance (ignoring coincidental collisions if these choices are not coordinated). To clarify, I was interpreting the goal as only the service operator should know the number of instances. In particular, the adversary here is the introduction point. If hidden service instances only ever create one circuit to each introduction point, each introduction point knows the number of instances of every service it is an introduction point for, as this is the same as the number of circuits for that service. 
Also, in your response to Nick you said that not having instances share intro points in some way would place an upper bound on the number of instances. True, but if the number of available intro points far exceeds the likely number of instances, this is a nonissue. I don't really follow your reasoning. And come to think of it, not true: if the instances are sometimes choosing the same intro points then this does not bound the number of instances possible (ignoring the number of HSes or instances for which a single intro point can serve as intro point at one time). Ok, but I was assuming the current behaviour of Tor, which I believe prevents instances using some of the same introduction points. Also, above you said "If each instance just makes one circuit". Did you mean if there is a single intro point per instance? No, as you could have one instance that makes, say, 3 circuits to just one introduction point. This can help, as it can hide the number of instances from the introduction point. Hard to say specifically without exploring more, but in general I would be more worried about what is revealed because circuits are built to common intro points by different instances, and the intro points can recognize and manage these (e.g., dropping redundant ones), than I would be because the number of intro points puts an upper bound on instances. I don't quite understand the last part, but regarding introduction points handling more than one circuit for the same service: I think that having this helps possibly hide information (like the number of instances). This does depend on also allowing one instance to use multiple circuits, otherwise some information would be given away. I might try creating a wiki page on the Tor wiki to collect all of the information in this thread, as it might be a nice reference for discussion.
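The point that multiple circuits per instance can hide the instance count from an introduction point can be made concrete with a small sketch (plain Python, not Tor code; the function name and parameters are mine). An introduction point that only sees circuit counts cannot distinguish one instance opening three circuits from three instances opening one each:

```python
# Sketch (not Tor code): what an introduction point can infer from
# circuit counts alone, assuming each instance opens between 1 and
# some maximum number of circuits to it.

def possible_instance_counts(observed_circuits, max_circuits_per_instance):
    """Instance counts consistent with the observed number of circuits.

    n instances can produce anywhere from n to n * max circuits, so any
    n with n <= observed <= n * max is a possible explanation.
    """
    return [n for n in range(1, observed_circuits + 1)
            if n <= observed_circuits <= n * max_circuits_per_instance]

# One circuit per instance: the count is fully revealed.
print(possible_instance_counts(3, 1))   # [3]
# Up to three circuits per instance: 1, 2, or 3 instances all fit.
print(possible_instance_counts(3, 3))   # [1, 2, 3]
```

This is why allowing (and using) multiple circuits per instance, for both single- and multi-instance services, widens the set of explanations consistent with what the introduction point observes.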
Re: [tor-dev] Hidden Service Scaling
On Wed, Oct 09, 2013 at 09:58:07AM +0100, Christopher Baines wrote: On 09/10/13 01:16, Matthew Finkel wrote: Then comes the second problem, following the above, the introduction point would then disconnect from any other connected OP using the same public key (unsure why, as a reason is not given in the rend-spec). This would need to change such that an introduction point can talk to more than one instance of the hidden service. It's important to think about the current design based on the assumption that a hidden service is a single node. Any modifications to this assumption will change the behavior of the various components. The only interactions I currently believe can be affected are the Hidden Service instance - Introduction point(s) and Hidden Service instance - directory server. I need to go and read more about the latter, as I don't have all the information yet. Also, to be fair, one of the devs has already started working on upgrading various components of hidden services [3][4]. You may also want to read through these so you have an idea of what are some future plans. Also, keep in mind that the current design may not work well for this (scaling) use case. Perhaps also thinking about modifications to the current design that are backwards compatible will help. These two changes combined should help with the two goals. Reliability is improved by having multiple OPs providing the service, and having all of these accessible from the introduction points. Scalability is also improved, as you are not limited to one OP (as described above, currently you can also have more than one, but only one will receive most of the traffic, and fail-over is slow). Do you see any disadvantages to this design? So, care needs to be taken around the interaction between the hidden service instances and the introduction points. If each instance just makes one circuit, then this reveals the number of instances. Does it? 
There is also uncertainty around the replacement of failing introduction points. New ones have to be chosen, but as the service instances do not directly communicate, there could be some interesting behaviour unless this is done carefully. Is there a reason they shouldn't communicate with each other? I am also unsure how the lack of direct communication between the hidden service instances could affect the usability of this. I think what would be good to do is take some large, open source, distributed web applications and look at how/how not to set them up using various possible implementations of distributed hidden services. I am aware that there are several undefined parts of the above description, e.g. "how does an introduction point choose what circuit to use?", but at the moment I am more interested in the wider picture. It would be good to get some feedback on this. 1: https://blog.torproject.org/blog/hidden-services-need-some-love 2: http://tor.stackexchange.com/questions/13/can-a-hidden-service-be-hosted-by-multiple-instances-of-tor/24#24 This is a good start! Some important criteria you might also think about include how much you trust each component/node and which nodes you want to be responsible for deciding where connections are routed. Also seriously think about how something like a botnet that uses hidden services might impact the reliability of your design (crazy idea, I know). I assume the characteristics of this are: 1 or more hidden service instances, connected to by very large numbers of clients, sending and receiving small amounts of information? Perhaps, but just think about the load an intro point can handle and sustain. If introduction points are where load balancing takes place, then does this affect the difficulty of attacking a hidden service? (for some undefined definition of 'attack'.) 
[3] https://lists.torproject.org/pipermail/tor-dev/2013-October/005534.html [4] https://lists.torproject.org/pipermail/tor-dev/2013-October/005536.html
Re: [tor-dev] Hidden Service Scaling
On Tue, Oct 8, 2013 at 1:52 AM, Christopher Baines cbain...@gmail.com wrote: I have been looking at doing some work on Tor as part of my degree, and more specifically, looking at Hidden Services. One of the issues where I believe I might be able to make some progress is the Hidden Service Scaling issue as described here [1]. So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal of this is twofold: to reduce the probability of failure of a hidden service and to increase hidden service scalability. I think what I am planning distils down to two main changes. Firstly, when an OP initialises a hidden service: currently, if you start a hidden service using an existing keypair and address, the new OP's introduction points replace the existing introduction points [2]. This does provide some redundancy (if slow), but no load balancing. My current plan is to change this such that if the OP has an existing public/private keypair and address, it would attempt to look up the existing introduction points (probably over a Tor circuit). If found, it then establishes introduction circuits to those Tor servers. Then comes the second problem: following the above, the introduction point would then disconnect from any other connected OP using the same public key (unsure why, as a reason is not given in the rend-spec). This would need to change such that an introduction point can talk to more than one instance of the hidden service. So, let's figure out all our possibilities before we pick one, and talk about requirements a little. Alternative 1: Multiple hidden service descriptors. Each instance of a hidden service picks its own introduction points, and uploads a separate hidden service descriptor to a subset of the HSDir nodes handling that service. Alternative 2: Combined hidden service descriptors in the network. 
Each instance of a hidden service picks its own introduction points, and uploads something to every appropriate HSDir node. The HSDir nodes combine those somethings, somehow, into a hidden service descriptor. Alternative 3: Single hidden service descriptor, one service instance per intro point. Each instance of a hidden service picks its introduction points, and somehow they coordinate so that they, together, get a single unified list of all their introduction points. They use this list to make a single signed hidden service descriptor, and upload that to the appropriate HSDirs. Alternative 4: Single hidden service descriptor, multiple service instances per intro point. This is your design above, where there's one descriptor chosen by a single hidden service instance (or possibly made collaboratively?), and the rest of the service instances fetch it, learn which intro points they're supposed to be at, and parasitically establish fallback introduction circuits there. There are probably other alternatives too; let's see if we can think of some more. Here are some possible desirable things. I don't know if they're all important, or all worth it. Let's discuss! Goal 1) Obscure number of hidden service instances. Goal 2) No master hidden service instance. Goal 3) If there is a master hidden service instance, clean fail-over from one master to the next, undetectable by the network. Goal 4) Obscure which instances are up and which are down. What other goals should we have in this kind of design? -- Nick
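The coordination step in Alternative 3 can be sketched in a few lines (all names here are hypothetical, and the signature is faked with a keyed hash purely for illustration; real descriptors are signed with the service's key per the rend-spec). The key idea is that the merge must be deterministic, so every instance independently derives the same unified descriptor:

```python
# Sketch (hypothetical helper names, not the Tor implementation) of
# Alternative 3: each instance contributes its own introduction points,
# and together they produce one unified, signed hidden service descriptor.
import hashlib
import json

def build_unified_descriptor(per_instance_intro_points, service_private_key):
    """Merge every instance's intro-point list into a single descriptor.

    per_instance_intro_points: list of lists of intro point identities.
    service_private_key: stands in for the service's long-term key; the
    "signature" here is a keyed hash, purely for illustration.
    """
    # Deduplicate and sort so every instance derives the same list,
    # regardless of the order in which contributions arrive.
    unified = sorted({ip for points in per_instance_intro_points for ip in points})
    body = json.dumps({"introduction-points": unified})
    signature = hashlib.sha256((service_private_key + body).encode()).hexdigest()
    return {"body": body, "signature": signature}

a = build_unified_descriptor([["ip1", "ip2"], ["ip3"]], "demo-key")
b = build_unified_descriptor([["ip3"], ["ip2", "ip1"]], "demo-key")
print(a == b)  # True: contribution order does not matter
```

The deterministic merge is what makes "somehow they coordinate" tractable: instances only need to exchange their intro-point lists, not elect a leader to assemble the descriptor.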
Re: [tor-dev] Hidden Service Scaling
On 08/10/13 23:41, Nick Mathewson wrote: Here are some possible desirable things. I don't know if they're all important, or all worth it. Let's discuss! So, I think it makes more sense to cover the goals first. Goal 1) Obscure number of hidden service instances. Good to have, as it probably helps with the anonymity of a hidden service. This is a guess based on the assumption that attacks based on traffic analysis are harder if you don't know the number of servers that you are looking for. Goal 2) No master hidden service instance. Goal 3) If there is a master hidden service instance, clean fail-over from one master to the next, undetectable by the network. That sounds reasonable. Goal 4) Obscure which instances are up and which are down. I think it would be good to make the failure of a hidden service server (perhaps alone, or one of many for that service) indistinguishable from a breakage in any of the relays. If you don't have this property, distributing the service does little to help with attacks based on correlating server downtime with public events (power outages, network outages, ...). This is a specific form of this goal, which applies if you are in communication with an instance that goes down. What other goals should we have in this kind of design? Goal 5) It should cope (all the goals hold) with taking down (planned downtime) and bringing up instances. Goal 6) Adding instances should not reduce performance. I can see problems: if you have a large, powerful server, adding a smaller server could actually reduce performance if the load is distributed equally. Goal 7) It should be possible to move between a single instance and a multiple instance service easily (this might be a specific case of goal 5, or just need consolidating). Alternative 1: Multiple hidden service descriptors. Each instance of a hidden service picks its own introduction points, and uploads a separate hidden service descriptor to a subset of the HSDir nodes handling that service. 
This is close to breaking goal 1, as each instance would have to have at least 1 introduction point, which puts an upper bound on the number of instances. The way the OP picks the number of introduction points to create would have to be thought about with respect to this. Also, goal 4 could be broken, as if the service becomes unreachable through a subset of the introduction points, this probably means that one or more of the instances have gone down (assuming that an attacker can discover all the introduction points?). Alternative 2: Combined hidden service descriptors in the network. Each instance of a hidden service picks its own introduction points, and uploads something to every appropriate HSDir node. The HSDir nodes combine those somethings, somehow, into a hidden service descriptor. Same problem with goal 4 as Alternative 1. Probably also has problems obscuring the number of instances from the HSDirs. Alternative 3: Single hidden service descriptor, one service instance per intro point. Each instance of a hidden service picks its introduction points, and somehow they coordinate so that they, together, get a single unified list of all their introduction points. They use this list to make a single signed hidden service descriptor, and upload that to the appropriate HSDirs. Same problem with goal 4 as Alternative 1. I don't believe this has the same problem with the number of instances as Alternative 1, though. Alternative 4: Single hidden service descriptor, multiple service instances per intro point. This is your design above, where there's one descriptor chosen by a single hidden service instance (or possibly made collaboratively?), and the rest of the service instances fetch it, learn which intro points they're supposed to be at, and parasitically establish fallback introduction circuits there. 
I don't really see how choosing introduction points collaboratively would work, as it could lead to a separation between single instance services and multiple instance services, which could break goal 7. It would also require the instances to interact, which adds some complexity. As for the fallback circuits, they are probably better off being just circuits. This would be what provides the scaling. The way you do this would have to be thought out though, to avoid breaking goal 6. A simple algorithm would be for the introduction point to just use a round robin over all the circuits to that service, but allow a service to reject a connection (if it has too much load); the introduction point would then continue to the next circuit. The introduction point would also know the number of instances, if each instance only connected once. This could be masked by having instances make multiple connections to each introduction point (both in single instance and multiple instance services). While an external attacker might not be able to detect individual instance failure by trying to
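The round-robin-with-rejection scheme described above could look something like the following toy model (plain Python, not Tor's implementation; circuits are just labels, and `is_overloaded` stands in for an instance rejecting an introduction because of load):

```python
# Sketch (assumed behaviour, not Tor code) of an introduction point
# cycling through the circuits for a service: if an instance rejects a
# connection because it is overloaded, the introduction point simply
# continues to the next circuit.
from itertools import cycle

def pick_circuit(ring, n_circuits, is_overloaded):
    """Advance the round-robin ring, skipping overloaded instances.

    Returns None if every circuit's instance rejects the connection.
    """
    for _ in range(n_circuits):
        circuit = next(ring)
        if not is_overloaded(circuit):
            return circuit
    return None

# Three circuits; the instance behind "b" is overloaded.
ring = cycle(["a", "b", "c"])
overloaded = {"b"}
picks = [pick_circuit(ring, 3, lambda c: c in overloaded) for _ in range(4)]
print(picks)  # ['a', 'c', 'a', 'c'] -- the overloaded instance is skipped
```

Note how the rejection mechanism addresses the goal 6 concern: a smaller server can shed load it cannot handle rather than receiving an equal share regardless of capacity.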