Re: [tor-dev] Panopticlick summer project
Gunes Acar:
> My name is Gunes Acar, a 2nd year PhD student at Computer Security and
> Industrial Cryptography (COSIC) group of University of Leuven.
>
> I work with Prof. Claudia Diaz and study online tracking and browser
> fingerprinting. I'd like to work on "Panopticlick"
> (https://www.torproject.org/getinvolved/volunteer.html.en#panopticlick)
> summer project and other fingerprinting related issues which I tried to
> outline below:
>
> 1) Collaborate with Peter@EFF to port/open-source Panopticlick:
> https://trac.torproject.org/projects/tor/ticket/6119#comment:4
> a) implement necessary modifications - e.g. we won't be having cookies
> or real IP addresses to match returning visitors.
> b) consider security implications of storing fingerprints (e.g. what
> happens if someone gets access to fingerprint database?)
>
> 2) Add machine-readability support outlined in Tor Automation
> proposals:
> https://people.torproject.org/~boklm/automation/tor-automation-proposals.html#helper-fingerprint
> a) which one(s) should we implement? JSON, YAML, XML?
>
> 3) Survey the literature for fingerprinting attacks published since
> Panopticlick. Implement those that may apply to TBB:
> a) Canvas & WebGL fingerprinting (Mowery et al.) - make sure the patch
> at #6253 works
> b) JS engine fingerprinting (Mulazzani et al.)
> c) CSS & rendering engine fingerprinting, (Unger et al.)
> ...

This sounds good. We already have a fix for #1, but verification can't
hurt (the canvas should come back as all white unless the user allows
it). We also have a couple of fixes for CSS-based fingerprinting (fonts
and system colors) that are entropy-reduction efforts only. Actually
measuring the amount of entropy reduction here would be useful.

> 4) Check with real-world fingerprinting scripts to see if they collect
> anything that is not considered before. Check if TBB's FP
> countermeasures work against them. (We can use data from FPDetective
> study to find sites with fingerprinting scripts)

Great.

> 5) Backport new "attacks" found in 3 & 4 to EFF's Panopticlick in case
> they consider an update.

Unfortunately, the EFF has been reluctant to work with us in any way to
improve or re-deploy Panopticlick for our needs, hence the frustrated
tone of my other mail in this thread. It also seems that the EFF would
not permit your resulting work to be open source, which I believe is a
violation of the GSoC rules. I guess since you are not intending to
actually apply to GSoC, this is a moot point though. It's just also a
sore one for me, so I figured I'd poke it once more ;).

However, as I also said in my other mail, I actually think we may be
better served by developing something independent of Panopticlick. We
need per-TBB version breakdowns of all the statistics we record, so we
can measure the change in entropy as we deploy fixes and improvements to
our defenses, without previous datapoints biasing the distribution.

Other than some helper functions to store data and calculate entropy,
and one (or maybe two) simple fingerprinting tests, we should not need
any of the Panopticlick code for this project. It's also likely that our
DB schema will end up radically different, due to the need to segment
data by browser version (which may be input by the user), and the need
for many more (and more varied) tests than they have.

> 6) Convert fixed FP-related bugs into regression tests.
> https://trac.torproject.org/projects/tor/query?keywords=~tbb-fingerprinting&status=closed
>
> 7) Build test cases to check the severity of fingerprinting related
> open tickets, e.g.:
> https://trac.torproject.org/projects/tor/ticket/8770
> https://trac.torproject.org/projects/tor/ticket/10299
>
> 8) Work on potential fingerprinting bugs that ESR31 may bring.
>
> 9) ESR transitions seem to create a lot of FP-related issues that need
> to be checked manually (e.g. #9608). Consider developing a tool that
> iterates over the host objects of two browsers to compare them
> automatically (e.g. to pinpoint new objects, new methods, updated
> default values, etc.). Similar to "diff tool" mentioned here:
> https://people.torproject.org/~boklm/automation/tor-automation-proposals.html#helper-fingerprint

I am not sure this is helpful. In general, we only want to measure
fingerprintability *within* a specific browser version. To determine the
appearance of new APIs, it's probably simplest to review Mozilla's
Developer Documentation, i.e.:
https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/24

> 10) Evaluate the font-limits of TBB by checking the average # of fonts
> Top 1 Million sites use. We can either collect fresh data with
> FPDetective or use the existing (~1 year old) data.

Excellent.

> More on my background relevant to fingerprinting and TBB code base:
>
> We recently published a paper called "FPDetective: Dusting the Web for
> Fingerprinters" (CCS'13) to measure the prevalence of browser
> fingerprinting on the Internet.
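
The reply above asks for per-TBB-version breakdowns of the recorded
statistics and for helper functions that calculate entropy. A minimal
sketch of that bookkeeping, in Python, might look like the following.
The record layout, the function names, and the version strings and
window sizes in the sample data are all hypothetical; nothing here is
taken from Panopticlick or from any existing Tor code.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_by_version(records):
    """Group (version, value) pairs by version; compute entropy per group."""
    groups = defaultdict(list)
    for version, value in records:
        groups[version].append(value)
    return {version: shannon_entropy(vals) for version, vals in groups.items()}


# Hypothetical sample: reported window sizes under two browser versions.
records = [
    ("3.5", "1280x720"), ("3.5", "1366x768"),
    ("3.5", "1920x1080"), ("3.5", "1024x768"),
    ("3.6", "1000x600"), ("3.6", "1000x600"),
    ("3.6", "1000x600"), ("3.6", "1000x700"),
]
per_version = entropy_by_version(records)
```

With this made-up sample, the four distinct values under the first
version give 2 bits, and the mostly uniform values under the second give
about 0.81 bits. Keeping the groups separate is what stops older
datapoints from biasing the distribution measured for a newer version.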
Re: [tor-dev] GoSC - Website Fingerprinting project
On Tue, Mar 18, 2014 at 7:30 PM, Mike Perry wrote:
> [snip]
> Related: Do you happen to have any existing classifier code working
> already, by any chance?
>

If it helps, the code [2] from our website fingerprinting paper [1] is
public. It includes the edit-distance classifier [3] from [4], which
wasn't reported on in [1], I believe.

-Kevin

[1] https://kpdyer.com/publications/oakland2012-peekaboo.pdf
[2] https://github.com/kpdyer/website-fingerprinting
[3] https://github.com/kpdyer/website-fingerprinting/blob/master/classifiers/ESORICSClassifier.py
[4] http://dl.acm.org/citation.cfm?id=198

___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] GoSC - Website Fingerprinting project
Marc Juarez:
> Lunar:
> > Have you read Mike Perry's long blog post on the topic?
> > https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks
> >
> > It outlines future research work in evaluating the efficiency of
> > fingerprinting attacks, and also mentions a couple of promising defenses.
>
> Yes, I am aware of it and I'm currently working on a study to evaluate
> the efficiency of these attacks.
>
> As Mike Perry said in the post, most of the attacks give an unrealistic
> advantage to the adversary and probably countermeasures work much better
> than what has been shown so far.
>
> However, some of the results of these articles suggest that there exist
> coarse-grained traffic features that are invariant to randomized
> pipelines (RP, SPDY) and thus can still identify web pages (Dyer et
> al.). Also, edit-distance based classifiers broke some old versions of
> the RP implemented in Tor Browser.
>
> It's an open problem to see if these features actually uniquely identify
> web pages in larger worlds than the ones considered in the literature.
> In any case, link-padding strategies are specially designed to conceal
> these features with the minimal amount of cover traffic and are becoming
> affordable in terms of bandwidth.
>
> The project I propose would be directed to address this bug ticket:
>
> https://trac.torproject.org/projects/tor/ticket/7028
>
> For example, I would like to implement the common building blocks for
> link-padding countermeasures (such as a "traffic generator controller"
> in the onion proxy and the entry guard).

This sounds like a good summer-sized amount of work. I think I am in
agreement with George that pluggable transports are a good place to
start for prototyping this work. That way, you can experiment with
custom padding protocols easily, without needing to make invasive
changes to tor-core for each revision.

For example, it would be neat to be able to transmit a set of statistics
to your bridge node during the connection handshake or with the circuit
setup, so that you don't always have to request downstream padding cells
with an upstream cell, and downstream padding can arrive asynchronously
according to some probability or histogram distribution you specify. You
could also obviously specify a number of cells to send in response to a
padding cell request (from 0..N, where N is some reasonable cap similar
to a largeish web object size).

The current Tor link padding protocol supports neither of these
operations. More advanced padding protocols are also possible, but may
also be overkill. We can discuss those further if this sounds
interesting. I'd also like to hear any ideas you might have on the
design and/or implementation of such a protocol.

Related: Do you happen to have any existing classifier code working
already, by any chance?

One of the ideas I've been considering is taking a closer look at the
nearest-neighbor edit distances between page class labels, for the edit
distance based classifiers. This distance provides us with an estimate
of the ideal minimum cover traffic we will need to make testing
instances jump from one nearest-neighbor label to another (causing a
false positive). It will also decrease as the world size increases (more
class labels in the same amount of N-dimensional space).

A successful defense should change the distribution of edit distances of
test instances around their class labels (it will increase the
intra-class variance) and this in turn will increase the size of the
threshold around class labels for a given accuracy rate, reducing
accuracy or increasing false positives. It may also be the case that low
or no cost defenses (like a smarter use of SPDY) do this, too, but we'll
be able to see it for sure with padding.

Does this make sense?
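
The nearest-neighbor idea in the last paragraphs can be sketched roughly
as follows. This is only an illustration under stated assumptions: each
page class is represented by a single trace, traces are strings of cell
directions, and plain Levenshtein distance stands in for whatever
edit-distance variant a real classifier uses. The function names and the
sample traces are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def nearest_neighbor_margins(labelled_traces):
    """Map each class label to (nearest other label, edit distance to it).

    Needs at least two classes, otherwise there is no neighbor to find.
    """
    margins = {}
    for label, trace in labelled_traces.items():
        candidates = (
            (edit_distance(trace, other_trace), other)
            for other, other_trace in labelled_traces.items()
            if other != label
        )
        dist, nearest = min(candidates)
        margins[label] = (nearest, dist)
    return margins


# Hypothetical traces: "+" is an outgoing cell, "-" an incoming cell.
traces = {
    "site_a": "+-+---+--",
    "site_b": "+-+--+--",
    "site_c": "++----++----",
}
margins = nearest_neighbor_margins(traces)
```

Here site_a and site_b differ by a single incoming cell, so their margin
is 1: one cell of cover traffic would be enough to move an instance from
one label to the other. Adding more classes to `traces` can only shrink
or preserve each margin, which matches the expectation that the distance
decreases as the world size grows.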
--
Mike Perry
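
The two padding operations proposed in the message above (downstream
padding that arrives asynchronously according to a client-specified
histogram, and a capped number of cells per padding request) could be
prototyped along these lines. This is a sketch only: the histogram
format, the function names, and the numbers are assumptions made for
illustration, and none of it reflects an existing Tor or pluggable
transport API.

```python
import random


def sample_delay_ms(histogram, rng):
    """Pick a bin in proportion to its weight, then a uniform delay in it."""
    bins = list(histogram)
    weights = [histogram[b] for b in bins]
    low, high = rng.choices(bins, weights=weights, k=1)[0]
    return rng.uniform(low, high)


def padding_schedule(histogram, n_cells, seed=None):
    """Return absolute send times (ms) for n_cells downstream padding cells."""
    rng = random.Random(seed)
    now, times = 0.0, []
    for _ in range(n_cells):
        now += sample_delay_ms(histogram, rng)
        times.append(now)
    return times


def cells_for_request(requested, cap):
    """Clamp the cells sent in reply to one padding request to 0..cap."""
    return max(0, min(requested, cap))


# Hypothetical histogram sent at handshake time: {(low_ms, high_ms): weight}
histogram = {(0, 10): 5, (10, 50): 3, (50, 200): 1}
schedule = padding_schedule(histogram, n_cells=5, seed=1)
```

The bridge side would walk `schedule` and emit one padding cell at each
time without waiting for an upstream request, while `cells_for_request`
covers the request-driven case with N as the cap.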
Re: [tor-dev] Interested in GSoC - Hidden Service Naming or Hidden Service Searching
On 03/04/2014 11:31 AM, George Kadianakis wrote:
> AFAIK, you can submit multiple proposals. Even multiple proposals
> through different FOSS projects.
>
> Like I suggested in my previous mail, I would even encourage you to
> submit multiple proposals since the HS search engine project has
> gotten plenty of student attention lately.
>
> Cheers!

Sorry for the delayed reply, school had me busy.

What is the preferred way to get feedback on a full proposal? Is there a
way to submit a draft proposal on the GSoC website so that Tor devs can
read it and send me feedback, but I can revise it before the deadline?
Or should I just post a link in an e-mail to the Tor-Dev list?

Also, does Tor prefer proposals in plain text, PDF, or some other
format?

Thanks,
-Jeremy Rand
Re: [tor-dev] Panopticlick summer project
Yan Zhu:
> On 03/17/2014 04:41 AM, Gunes Acar wrote:
> > Hi Yan,
> >
> > Glad that you're interested in the project.
> > It'd be very nice to collaborate with you on this.
> >
> > Indeed, we've been corresponding with Peter for a related project and
> > I mentioned my intention to work as a middleman between EFF and Tor.
> >
>
> Great, it seems that Peter and I are both interested and willing to help.
>
> Regarding
> https://trac.torproject.org/projects/tor/ticket/6119#comment:10, Peter
> says he has some reluctance to open source the project (not the data)
> because it might make it easier for some websites to track visitors
> without their consent.

This might have been a valid concern 5 years ago, but now it's just a
joke. The tests on Panopticlick are ancient, widely known, and easy to
reproduce, and much more severe and invasive fingerprinting mechanisms
have since been developed/deployed in modern browsers. Moreover, only 2
of the tests it performs actually apply to Tor Browser users.

Banks in particular have already deployed some of the techniques we've
fixed that the EFF study entirely predates. These techniques (such as
localhost open port enumeration, OS theme fingerprinting, and
HTML5+WebGL canvas rendering+extraction+hashing) are far higher entropy
than browser resolution.

Not only should we (as Tor) publicly provide tests and easy-to-deploy
working PoC code for all of these vectors, we should also endeavor to
detail cases where major browser vendors are ignoring or exacerbating
this problem, and make it easy for everyone to test and observe this
behavior themselves.

Not sure if that means the EFF now has a conflict of interest with this
project for some ridiculous reason, but frankly any attempt at trying to
"hide" these techniques is downright silly. They are too well known
(most are publicly documented elsewhere, or at least on our bugtracker),
and there's waaay too much money on the other side of the fence in terms
of incentives to develop and deploy working attacks.

Further, starting from the EFF codebase might also be a hindrance to us.
It is not designed for measuring the effects of defenses. In fact, its
measurement mechanisms actively penalize any attempt at defense
development (because any approach to alter browser behavior instantly
makes you more unique than the previous userbase).

I actually think Panopticlick has of late done more to prevent browser
fingerprinting defense development than to encourage it. I would really
like to see it DIAF.

Here's hoping we can make something better!

--
Mike Perry
Re: [tor-dev] Panopticlick summer project
On 03/17/2014 04:41 AM, Gunes Acar wrote:
> Hi Yan,
>
> Glad that you're interested in the project.
> It'd be very nice to collaborate with you on this.
>
> Indeed, we've been corresponding with Peter for a related project and
> I mentioned my intention to work as a middleman between EFF and Tor.
>

Great, it seems that Peter and I are both interested and willing to help.

Regarding
https://trac.torproject.org/projects/tor/ticket/6119#comment:10, Peter
says he has some reluctance to open source the project (not the data)
because it might make it easier for some websites to track visitors
without their consent.

-Yan
Re: [tor-dev] Torbirdy
Hi,

Thanks for the feedback. I have done some work in JavaScript and C++ and
am comfortable coding in both, which is one of the reasons I chose the
project. So I'll go ahead and submit my application by tomorrow at the
latest. Also, can I edit the final project proposal once it's submitted
to Melange?

Anyway, thanks for all the help.

Regards
D