Re: [tor-dev] Projects to combat/defeat data correlation
On Wed, 15 Jan 2014 21:16:20 +, Jim Rucker wrote:
> There was a story in the news recently of a Harvard student who used Tor to send a bomb threat to Harvard in order to cancel classes so he wouldn't have to take a test. He was apprehended within a day, which puts into question the anonymity of Tor.

This was because it was known that the threat was delivered via Tor, and that he was the only one in $(organizational unit of harvard) using Tor at that time, and he confessed when confronted with that. There was nothing that actually proved that he made the threat. (Unless this is a case of parallel construction, of course, which I don't assume.)

> ...
> Are there any projects in Tor being worked on to combat data correlation? For instance, relays that send/recv constant data rates continuously - capping data rates and padding partial or non-packets with random data to maintain the data rates

At the moment that would be prohibitively expensive. Also, it wouldn't guard against the scenario above - you can't be online and shoveling data all the time, so long-term correlation is still possible.

Andreas

--
Totally trivial. Famous last words.
From: Linus Torvalds torvalds@*.org
Date: Fri, 22 Jan 2010 07:29:21 -0800
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
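The identification described in this message amounts to intersecting observations: who was seen talking to Tor each time something of interest happened. A minimal Python sketch of that reasoning follows. The user names and observation windows are entirely made up for illustration and do not reflect any real logs; `candidates` is a hypothetical helper, not part of any Tor tool.

```python
def candidates(observations):
    """Intersect the sets of users seen using Tor in each observation window."""
    windows = iter(observations)
    remaining = set(next(windows, set()))
    for seen in windows:
        remaining &= set(seen)
    return remaining


# Hypothetical data: users on one local network who were connected to Tor
# during three separate events an observer cares about.
tor_users_by_event = [
    {"alice", "bob", "carol"},
    {"bob", "carol"},
    {"bob", "dave"},
]

suspects = candidates(tor_users_by_event)
print(sorted(suspects))
```

Padding to a constant rate would not change this picture: it hides what a user sends, not the fact that the user was connected during each window, so the candidate set still shrinks with every additional observation.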
Re: [tor-dev] Anyone wanting to write some Weather-tight code?
Hello Norbert and Karsten,

I have added a couple of attachments to the project's wiki page. The first one is a UML diagram of the data models used in the current Tor Weather. It should give us a good idea of the current implementation. The second attachment is the Design Document from the current Tor Weather application. It should give a decent idea of the important modules in the application and, more importantly, its workflow.

Thanks!
[tor-dev] Using MaxMind's GeoIP2 databases in tor, BridgeDB, metrics-*, Onionoo, etc.
Hi devs,

you probably know that we use MaxMind's GeoIP database in several of our products (list may not be exhaustive):

- tor: We ship little-t-tor with a geoip and a geoip6 file for clients to support excluding relays by country code and for relays to generate by-country statistics.
- BridgeDB: I vaguely recall that the BridgeDB service uses GeoIP data to return only bridges that are not blocked in a user's country. Or maybe that was a feature yet to be implemented.
- Onionoo: The Onionoo service uses MaxMind's city database to provide location information of relays. (It also uses MaxMind's ASN database to provide information on AS number and name.)
- metrics-db: I'm planning to use GeoIP data to resolve bridge IP addresses to country codes in the bridge descriptor sanitizing process.
- metrics-web: We have been using GeoIP data to provide statistics on relays by country. This is currently disabled because the implementation was eating too many resources, but I plan to put these statistics back.

However, the GeoIP database that we currently use has a big shortcoming: it replaces valid country codes with A1 or A2 whenever MaxMind thinks that a relay is an anonymizing proxy or satellite provider. That's why we currently repair their database by either automatically guessing what country code an A1 entry could have had [1, 2], or by manually looking it up in RIR delegation files [3, 4]. This is just a workaround. Also, I think BridgeDB doesn't repair its GeoIP database.

Here's the good news: MaxMind now provides their databases in new formats which provide the A1/A2 information in *addition* to the correct country codes [5, 6]. We should switch!

How do we switch? The first option is to ship their binary database files and include their APIs [7] in our products. It looks like there are APIs for C, Java, and Python, so all the languages we need for the tools listed above.

Pros: we can kick out our parsing and lookup code.
Cons: we need to check if their licenses are compatible, we have to kick out our parsing and lookup code and learn their APIs, and we add new dependencies.

Another option is to write a new tool that parses their full databases and converts them into file formats we already support. (This would also allow us to provide a custom format with multiple database versions which would be pretty useful for metrics, see #6471.) Also, it looks like their license, Creative Commons Attribution-ShareAlike 3.0 Unported, allows converting their database to a different format.

If we want to write such a tool, we have a few options:

- We use their database specification [8] and write our own parser using a language of our choice (read: whoever writes it pretty much decides). We could skip the binary search tree part of their files and only process the contents. Whenever they change their format, we'll have to adapt.
- We use their Python API [9] to build our parser, though it looks like that requires pip or easy_install and compiling their C API. I don't know enough about Python to assess what headaches that's going to cause.
- We use their Java API [10] to build our parser, though we're probably forced to use Maven rather than Ant. I don't have much experience with Maven. Also, using Java probably makes me the default (and only) maintainer, which I'd want to avoid if possible.

Thoughts? What other options did I miss, and what pros and cons did I overlook?

And is this something that people on this list would want to help with, once we have agreed on one of the options? If so, please feel free to join the discussion now and maybe influence which path we're going to take.
All the best,
Karsten

[1] https://gitweb.torproject.org/tor.git/blob/HEAD:/src/config/deanonymind.py
[2] https://gitweb.torproject.org/onionoo.git/blob/HEAD:/geoip/deanonymind.py
[3] https://gitweb.torproject.org/tor.git/blob/HEAD:/src/config/geoip-manual
[4] https://gitweb.torproject.org/onionoo.git/blob/HEAD:/geoip/geoip-manual
[5] http://dev.maxmind.com/geoip/geoip2/whats-new-in-geoip2/
[6] http://dev.maxmind.com/geoip/geoip2/geolite2/
[7] http://dev.maxmind.com/geoip/geoip2/downloadable/
[8] https://github.com/marklr/MaxMind-IPDB-perl/blob/master/docs/MaxMind-IPDB-spec.md
[9] https://pypi.python.org/pypi/geoip2
[10] https://github.com/maxmind/MaxMind-DB-Reader-java
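To make the "convert into a format we already support" option more concrete, here is a rough Python sketch of looking up a country code in a range-based text file and of one possible A1 repair heuristic. It assumes a line format of `low,high,CC` with IPv4 addresses as integers, which is how I understand tor's geoip file to be laid out. The sample ranges and country codes are invented, `parse_geoip`, `lookup` and `repair_a1` are hypothetical names, and the neighbor-based guess is only one plausible heuristic, not a description of what deanonymind.py actually does.

```python
import bisect
import ipaddress


def parse_geoip(lines):
    """Parse 'low,high,CC' lines (integer IPv4 ranges), skipping comments."""
    ranges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        low, high, country = line.split(",")
        ranges.append((int(low), int(high), country))
    ranges.sort()
    return ranges


def lookup(ranges, address):
    """Return the country code for an IPv4 address, or None if unmapped."""
    ip = int(ipaddress.IPv4Address(address))
    starts = [r[0] for r in ranges]  # rebuilt per call; fine for a sketch
    idx = bisect.bisect_right(starts, ip) - 1
    if idx >= 0 and ranges[idx][0] <= ip <= ranges[idx][1]:
        return ranges[idx][2]
    return None


def repair_a1(ranges):
    """Guess a country for A1 ranges squeezed between two identical neighbors."""
    repaired = list(ranges)
    for i in range(1, len(repaired) - 1):
        low, high, country = repaired[i]
        _, prev_high, prev_cc = repaired[i - 1]
        next_low, _, next_cc = repaired[i + 1]
        adjacent = prev_high + 1 == low and high + 1 == next_low
        if (country == "A1" and adjacent and prev_cc == next_cc
                and prev_cc not in ("A1", "A2")):
            repaired[i] = (low, high, prev_cc)
    return repaired


# Invented sample data: 1.0.0.0/24, then 1.0.1.0-1.0.3.255, then 1.0.4.0-1.0.7.255.
sample = [
    "# hypothetical excerpt",
    "16777216,16777471,AU",
    "16777472,16778239,A1",
    "16778240,16779263,AU",
]
ranges = parse_geoip(sample)
before = lookup(ranges, "1.0.2.1")
after = lookup(repair_a1(ranges), "1.0.2.1")
print(before, after)
```

With the new database formats the repair step would become unnecessary, since the real country code is available alongside the anonymizing-proxy flag; the conversion tool would only need to emit the `low,high,CC` lines.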
Re: [tor-dev] Projects to combat/defeat data correlation
On Wed, Jan 15, 2014 at 09:16:20PM -0600, Jim Rucker wrote:
> Are there any projects in Tor being worked on to combat data correlation? For instance, relays that send/recv constant data rates continuously - capping data rates and padding partial or non-packets with random data to maintain the data rates

The very quick answer without providing much detail is that you may want to look at scramblesuit [0][1]. It doesn't try to provide constant throughput, but (as the website says) we alter inter-arrival times and the transported protocol's packet length distribution. This isn't a perfect solution, and won't impress a GPA (global passive adversary), but it's a start if you're dealing with a localized passive observer.

- Matt

[0] http://www.cs.kau.se/philwint/scramblesuit/
[1] https://gitweb.torproject.org/user/phw/scramblesuit.git
Re: [tor-dev] Proposal 225: Strawman proposal: commit-and-reveal shared rng
I don't think that a solution which uses DKG is overkill; I think it would be more secure. The more all-or-nothing security provided by DKG-based schemes seems preferable to the sliding-scale-of-influence provided by coin-flipping ones. Then again, I don't know that much about coin-flipping protocols, so that could simply be me being ignorant.

On the other hand, he's right that DKG-based schemes would probably be more complex, at least with regard to the number of calculations. Take the Fouque protocol, for instance. I've actually implemented a proof-of-concept version of it and it's not at all efficient. Using a 1024-bit prime it can run for 5 participants in about 2 minutes (calculations only) on a single decent computer. I imagine a more speed-focused implementation running over 10 directory authorities would have an acceptable runtime, but 20 or 30 authorities might be pushing it. Luckily, other DKG protocols apparently have far less complexity [ O(n) as opposed to O(n^2) ] since they don't have the nice one-round-sharing-phase feature.

I guess it's all just a balancing act with no real correct answer. Security vs calculation complexity vs network traffic vs protocol complexity vs ...
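For readers less familiar with the coin-flipping side of this comparison, a bare-bones commit-and-reveal round can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the format or hash construction of Proposal 225: `commit` and `combine` are hypothetical helpers, SHA-256 is my choice, and the fixed byte strings stand in for values that would really come from a cryptographic random source.

```python
import hashlib


def commit(value: bytes) -> bytes:
    """Commitment to a secret value: here simply its SHA-256 digest."""
    return hashlib.sha256(value).digest()


def combine(commitments, reveals):
    """Check every reveal against its commitment, then hash all reveals together."""
    if len(commitments) != len(reveals):
        raise ValueError("need exactly one reveal per commitment")
    for expected, value in zip(commitments, reveals):
        if commit(value) != expected:
            raise ValueError("reveal does not match commitment")
    digest = hashlib.sha256()
    for value in reveals:
        digest.update(value)
    return digest.digest()


# Fixed stand-in secrets for three participants (real ones would be random).
reveals = [bytes([i]) * 32 for i in (1, 2, 3)]
commitments = [commit(r) for r in reveals]

shared = combine(commitments, reveals)
# If the last participant refuses to reveal, the rest can only combine what they have.
without_last = combine(commitments[:-1], reveals[:-1])
print(shared.hex() != without_last.hex())
```

The last two lines show the "sliding scale of influence": a participant who sees the other reveals first can compute both outcomes and choose whether to reveal, which buys one bit of influence per withholding participant. That is the kind of bias a DKG-based scheme aims to rule out.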
Re: [tor-dev] Projects to combat/defeat data correlation
In that case, would it then look like zero in $(organizational unit of harvard) using Tor and one in $(organizational unit of harvard) using scramblesuit?

I like the idea of the Tor pluggable transport combiner... wherein we could wrap a pseudo-random appearing obfuscation protocol (such as obfs3, scramblesuit etc) in a whitelisted obfuscation protocol such as http?, sshrproxy, hexchat etc.

I imagine the anonymity set would be much smaller for these combined transports... fewer people using them.

On Thu, Jan 16, 2014 at 12:54 PM, Matthew Finkel matthew.fin...@gmail.com wrote:
> On Wed, Jan 15, 2014 at 09:16:20PM -0600, Jim Rucker wrote:
>> Are there any projects in Tor being worked on to combat data correlation? For instance, relays that send/recv constant data rates continuously - capping data rates and padding partial or non-packets with random data to maintain the data rates
>
> The very quick answer without providing much detail is that you may want to look at scramblesuit [0][1]. It doesn't try to provide constant throughput, but (as the website says) we alter inter-arrival times and the transported protocol's packet length distribution. This isn't a perfect solution, and won't impress a GPA, but it's a start if you're dealing with a localized passive observer.
>
> - Matt
>
> [0] http://www.cs.kau.se/philwint/scramblesuit/
> [1] https://gitweb.torproject.org/user/phw/scramblesuit.git
Re: [tor-dev] Projects to combat/defeat data correlation
> I imagine the anonymity set would be much smaller for these combined transports... fewer people using them.

In my understanding, the anonymity set doesn't apply to use of PTs since this is only at the entry side. The exit side does not know[1] what PT the originator is using, so is unable to use that information to de-anonymise.

[1] At least, in theory it should not know; perhaps someone can check there are no side-channels? It would be pretty scary if the exit could work out that the originator is using PTs.
Re: [tor-dev] Projects to combat/defeat data correlation
On Wed, Jan 15, 2014 at 7:16 PM, Jim Rucker mrjim...@gmail.com wrote:
> [snip]
> From my understanding (please correct me if I'm wrong) Tor has a weakness in that if someone can monitor data going into the relays and going out of the exit nodes then they can defeat the anonymity of Tor by correlating the size and number of packets being sent to relays and comparing those with the packets leaving the exit nodes.
>
> Are there any projects in Tor being worked on to combat data correlation? For instance, relays that send/recv constant data rates continuously - capping data rates and padding partial or non-packets with random data to maintain the data rates

What you are referring to is a traffic confirmation attack. It's a deceptively hard problem --- even if the naive strategy of sending data at a constant rate worked (for some definition), it would be prohibitively expensive in practice. It is also worth reiterating that even if such a countermeasure were in place, it wouldn't conceal the fact that a specific user is connecting to the Tor network.

If you are interested in recent academic work on traffic analysis, you should have a look at [1] and [2]. They explore the related setting of website fingerprinting attacks and defenses (including the one you suggest).

-Kevin

[1] https://kpdyer.com/publications/oakland2012-peekaboo.pdf
[2] http://cacr.uwaterloo.ca/techreports/2013/cacr2013-30.pdf
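As a toy illustration of what a traffic confirmation attack looks like in its simplest form, the Python sketch below correlates per-interval byte counts seen on the entry side with those of several candidate flows on the exit side. It is only a sketch under strong assumptions: the numbers are invented, `pearson` and `best_match` are hypothetical helpers, and real attacks (and the defenses studied in the papers cited in this message) are considerably more involved.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long series; 0.0 if either is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_match(entry_flow, exit_flows):
    """Name of the exit-side flow whose volume pattern best tracks the entry flow."""
    return max(exit_flows, key=lambda name: pearson(entry_flow, exit_flows[name]))


# Invented byte counts (in KB) per one-second interval.
entry = [10, 0, 50, 5, 0, 40]
exits = {
    "flow_a": [9, 1, 48, 6, 0, 41],
    "flow_b": [20, 20, 20, 21, 19, 20],
    "flow_c": [0, 45, 3, 0, 50, 2],
}

match = best_match(entry, exits)
padded = pearson(entry, [20] * 6)  # a perfectly constant-rate flow
print(match, padded)
```

A perfectly constant-rate flow carries no volume pattern to correlate against, which is why padding is the obvious countermeasure; the cost of sending at that rate all the time is what makes it impractical, and it still does not hide that the user is connected to Tor.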
Re: [tor-dev] Dusting off dir-spec.txt
On Tue, Jan 14, 2014 at 1:56 PM, Karsten Loesing kars...@torproject.org wrote:
> [...]
> (Let me know if you prefer this review to happen in a ticket rather than here.)

Thanks, Karsten! I think it should ideally be a ticket?

-- Nick
Re: [tor-dev] Projects to combat/defeat data correlation
Ximin Luo wrote:
> In my understanding, the anonymity set doesn't apply to use of PTs since this is only at the entry side. The exit side does not know[1] what PT the originator is using, so is unable to use that information to de-anonymise.
>
> [1] At least, in theory it should not know; perhaps someone can check there are no side-channels? It would be pretty scary if the exit could work out that the originator is using PTs.

Anonymity is still a consideration, even if it's highly unlikely to be impinged upon by pluggable transports. For example, if a network notices someone connect to a known obfsproxy bridge, then they can make an educated guess that the person is using both Tor and obfsproxy. With flashproxy, this is of much less concern given address diversity. With bananaphone, it wouldn't really apply at all as far as I can see.

~Griffin
Re: [tor-dev] Review of Proposal 147: Eliminate the need for v2 directories in generating v3 directories
On Wed, Jan 15, 2014 at 9:15 PM, Roger Dingledine a...@mit.edu wrote:
> On Wed, Jan 15, 2014 at 01:08:03PM +0100, Karsten Loesing wrote:
>> I talked to Roger on IRC, and here's why this proposal may indeed be overkill: As of January 2013, there is only a single version 3 directory authority left that serves version 2 statuses: dizum. moria1 and tor26 have been rejecting version 2 requests for a long time, and it's mostly an oversight that dizum still serves them. The other six authorities have never generated version 2 statuses for others to be used as pre-voting opinions. So, it's basically not true that version 2 statuses are required for the version 3 protocol to work properly.
>
> See git commits 2e692bd8 and eaf5487d, which went into 0.2.2.12-alpha:
>
>   o Major bugfixes:
>     - Many relays have been falling out of the consensus lately because not enough authorities know about their descriptor for them to get a majority of votes. When we deprecated the v2 directory protocol, we got rid of the only way that v3 authorities can hear from each other about other descriptors. Now authorities examine every v3 vote for new descriptors, and fetch them from that authority. Bugfix on 0.2.1.23.
>
> That was the stopgap that made proposal 147 not so critical. I think based on Karsten's recent results that maybe it's enough.

Sounds good to me. Is this in dir-spec.txt? I'm not finding it at first glance. If it isn't, Karsten, would you be able to add it? Probably we should do that _after_ merging your dirspec branch.

-- Nick
Re: [tor-dev] Review of Proposal 147: Eliminate the need for v2 directories in generating v3 directories
On Wed, Jan 15, 2014 at 7:08 AM, Karsten Loesing kars...@torproject.org wrote:
> [...]
> I talked to Roger on IRC, and here's why this proposal may indeed be overkill: As of January 2013, there is only a single version 3 directory authority left that serves version 2 statuses: dizum. moria1 and tor26 have been rejecting version 2 requests for a long time, and it's mostly an oversight that dizum still serves them. The other six authorities have never generated version 2 statuses for others to be used as pre-voting opinions. So, it's basically not true that version 2 statuses are required for the version 3 protocol to work properly.
>
> Here's a possible way to move this forward.
>
> - Please review and merge my prop147tweaks branch that contains tweaks from our discussion above, regardless of whether this proposal will be implemented or not.

Done.

> - I'm going to run a quick analysis of archived vote documents to see how much authorities would have benefited from the others' votes before generating their own votes.
>
> - I'm going to ask Alex to disable version 2 statuses on dizum using DisableV2DirectoryInfo_ 1 to see what that does to the network. We should probably finish the 2048-bit RSA key upgrade first before changing yet another variable.

Sounds good.

> - If there's no convincing argument to implement opinion documents, we close this proposal as rejected.

Great.

> What do you think?

Sounds good.
Re: [tor-dev] Projects to combat/defeat data correlation
Yeah, I guess if the PT doesn't draw attention and the bridge IP is not known then one's Tor traffic may be somewhat obscured.

What about bananaphone? Do you mean the bananaphone PT? It is trivially detectable... more so than, say, a transport like obfs3 whose output looks like pseudorandom noise.

On Thu, Jan 16, 2014 at 8:33 PM, Griffin Boyce grif...@cryptolab.net wrote:
> Ximin Luo wrote:
>> In my understanding, the anonymity set doesn't apply to use of PTs since this is only at the entry side. The exit side does not know[1] what PT the originator is using, so is unable to use that information to de-anonymise.
>>
>> [1] At least, in theory it should not know; perhaps someone can check there are no side-channels? It would be pretty scary if the exit could work out that the originator is using PTs.
>
> Anonymity is still a consideration, even if it's highly unlikely to be impinged upon by pluggable transports. For example, if a network notices someone connect to a known obfsproxy bridge, then they can make an educated guess that the person is using both Tor and obfsproxy. With flashproxy, this is of much less concern given address diversity. With bananaphone, it wouldn't really apply at all as far as I can see.
>
> ~Griffin