date:20140318

Re: [tor-dev] Panopticlick summer project

2014-03-18 Thread Mike Perry

Gunes Acar:
> My name is Gunes Acar, a 2nd year PhD student at Computer Security and
> Industrial Cryptography (COSIC) group of University of Leuven.
> 
> I work with Prof. Claudia Diaz and study online tracking and browser
> fingerprinting. I'd like to work on "Panopticlick"
> (https://www.torproject.org/getinvolved/volunteer.html.en#panopticlick)
> summer project and other fingerprinting-related issues which I tried to
> outline below:
> 
> 1) Collaborate with Peter@EFF to port/open-source Panopticlick:
> https://trac.torproject.org/projects/tor/ticket/6119#comment:4
> a) implement necessary modifications - e.g. we won't be having cookies
> or real IP addresses to match returning visitors.
> b) consider security implications of storing fingerprints (e.g. what
> happens if someone gets access to fingerprint database?)
> 
> 2) Add machine-readability support outlined in Tor Automation
> proposals:
> https://people.torproject.org/~boklm/automation/tor-automation-proposals.html#helper-fingerprint
> a) which one(s) should we implement? JSON, YAML, XML?
> 
> 3) Survey the literature for fingerprinting attacks published since
> Panopticlick. Implement those that may apply to TBB:
> a) Canvas & WebGL fingerprinting (Mowery et al.) - make sure the patch
> at #6253 works
> b) JS engine fingerprinting (Mulazzani et al.)
> c) CSS & rendering engine fingerprinting, (Unger et al.)
> ...

This sounds good. We already have a fix for #1 though, but verification
can't hurt (the canvas should come back as all white unless the user
allows it).

We also have a couple fixes for CSS-based fingerprinting (fonts and
system colors) that are entropy-reduction efforts only. Actually
measuring the amount of entropy reduction here would be useful.

> 4) Check real-world fingerprinting scripts to see if they collect
> anything that has not been considered before. Check if TBB's FP
> countermeasures work against them. (We can use data from FPDetective
> study to find sites with fingerprinting scripts)

Great.
 
> 5) Backport new "attacks" found in 3 & 4 to EFF's Panopticlick in case
> they consider an update.

Unfortunately, the EFF has been reluctant to work with us in any way to
improve or re-deploy Panopticlick for our needs, hence the frustrated
tone of my other mail in this thread. It also seems that the EFF would
not permit your resulting work to be open source, which I believe is a
violation of the GSoC rules. I guess since you are not intending to
actually apply to GSoC, this is a moot point though. It's just also a
sore one for me, so I figured I'd poke it once more ;).

However, as I also said in my other mail, I actually think we may be
better served by developing something independent of Panopticlick. We
need per-TBB version breakdowns of all the statistics we record, so we
can measure the change in entropy as we deploy fixes and improvements
to our defenses, without previous datapoints biasing the distribution.

Other than some helper functions to store data and calculate entropy,
and one (or maybe two) simple fingerprinting tests, we should not need
any of the Panopticlick code for this project. It's also likely that our
DB schema will end up radically different, due to the need to segment
data by browser version (which may be input by the user), and the need
for many more (and more varied) tests than they have.
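To make the per-version breakdown concrete, here is a minimal sketch of the kind of entropy helper described above. It assumes each stored record is a (TBB version, observed test value) pair; the record layout, the version strings, and the function names are hypothetical and are not taken from Panopticlick or from any existing Tor code.

```python
import math
from collections import Counter, defaultdict

def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_by_version(records):
    """Group (version, value) pairs by version and score each group on its own."""
    by_version = defaultdict(list)
    for version, value in records:
        by_version[version].append(value)
    return {v: shannon_entropy(vals) for v, vals in by_version.items()}

# Hypothetical records for a single test (window size), two made-up versions.
records = [
    ("3.5.2", "1000x600"), ("3.5.2", "1000x700"),
    ("3.5.2", "1200x800"), ("3.5.2", "1300x900"),
    ("3.5.3", "1000x600"), ("3.5.3", "1000x600"),
    ("3.5.3", "1000x600"), ("3.5.3", "1000x700"),
]
result = entropy_by_version(records)
for version, bits in sorted(result.items()):
    print(version, round(bits, 3))
```

Because each version is scored only against its own visitors, datapoints from older releases cannot bias the distribution for a newer one.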

> 6) Convert fixed FP-related bugs into regression tests.
> https://trac.torproject.org/projects/tor/query?keywords=~tbb-fingerprinting&status=closed
> 
> 7) Build test cases to check the severity of fingerprinting related
> open tickets, e.g.:
> https://trac.torproject.org/projects/tor/ticket/8770
> https://trac.torproject.org/projects/tor/ticket/10299
> 
> 8) Work on potential fingerprinting bugs that ESR31 may bring.
> 
> 9) ESR transitions seem to create a lot of FP-related issues that need
> to be checked manually (e.g. #9608). Consider developing a tool that
> iterates over the host objects of two browsers to compare them
> automatically (e.g. to pinpoint new objects, new methods, updated
> default values, etc.). Similar to "diff tool" mentioned here:
> https://people.torproject.org/~boklm/automation/tor-automation-proposals.html#helper-fingerprint

I am not sure this is helpful. In general, we only want to measure
fingerprintability *within* a specific browser version.

To determine the appearance of new APIs, it's probably best and simplest
to just review Mozilla's Developer Documentation, i.e.:
https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/24
 
> 10) Evaluate the font-limits of TBB by checking the average # of fonts
> Top 1 Million sites use. We can either collect fresh data with
> FPDetective or use the existing (~1 year old) data.

Excellent.

> More on my background relevant to fingerprinting and TBB code base:
> 
> We recently published a paper called "FPDetective: Dusting the Web for
> Fingerprinters" (CCS'13) to measure the prevalence of browser
> fingerprinting on the Internet.

Re: [tor-dev] GoSC - Website Fingerprinting project

2014-03-18 Thread Kevin P Dyer

On Tue, Mar 18, 2014 at 7:30 PM, Mike Perry wrote:

> [snip]
> Related: Do you happen to have any existing classifier code working
> already, by any chance?
>

If it helps, the code [2] from our website fingerprinting paper [1] is
public. It includes the edit-distance classifier [3] from [4], which wasn't
reported on in [1], I believe.

-Kevin

[1] https://kpdyer.com/publications/oakland2012-peekaboo.pdf
[2] https://github.com/kpdyer/website-fingerprinting
[3]
https://github.com/kpdyer/website-fingerprinting/blob/master/classifiers/ESORICSClassifier.py
[4] http://dl.acm.org/citation.cfm?id=198
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

Re: [tor-dev] GoSC - Website Fingerprinting project

2014-03-18 Thread Mike Perry

Marc Juarez:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> Lunar:
> > Have you read Mike Perry's long blog post on the topic?
> > https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks
> > 
> > It outlines future research work in evaluating the efficiency of
> > fingerprinting attacks, and also mentions a couple of promising defenses.
> 
> Yes, I am aware of it and I'm currently working on a study to evaluate
> the efficiency of these attacks.
> 
> As Mike Perry said in the post, most of the attacks give an unrealistic
> advantage to the adversary and probably countermeasures work much better
> than what has been shown so far.
> 
> However, some of the results of these articles suggest that there exist
> coarse-grained traffic features that are invariant to randomized
> pipelines (RP, SPDY) and thus can still identify web pages (Dyer et
> al.). Also, edit-distance based classifiers broke some old versions of
> the RP implemented in Tor Browser.
> 
> It's an open problem to see if these features actually uniquely identify
> web pages in larger worlds than the ones considered in the literature.
> In any case, link-padding strategies are specially designed to conceal
> these features with the minimal amount of cover traffic and are becoming
> affordable in terms of bandwidth.
> 
> The project I propose would be directed to address this bug ticket:
> 
> https://trac.torproject.org/projects/tor/ticket/7028
> 
> For example, I would like to implement the common building blocks for
> link-padding countermeasures (such as a "traffic generator controller"
> in the onion proxy and the entry guard).

This sounds like a good summer-sized amount of work. I think I am in
agreement with George that pluggable transports are a good place to
start for prototyping this work. That way, you can experiment with
custom padding protocols easily, without needing to make invasive
changes to tor-core for each revision.

For example, it would be neat to be able to transmit a set of statistics
to your bridge node during the connection handshake or with the circuit
setup, so that you don't always have to request downstream padding cells
with an upstream cell, and downstream padding can arrive asynchronously
according to some probability or histogram distribution you specify.

You could also obviously specify a number of cells to send in response
to a padding cell request (from 0..N, where N is some reasonable cap
similar to a largeish web object size). The current Tor link padding
protocol supports neither of these operations.
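As a rough illustration of the two operations described above, the sketch below draws padding delays from a client-specified histogram and picks a response burst size in 0..N. The histogram layout (millisecond bins mapped to relative weights), the uniform choice of burst size, and all names are assumptions made for the example; nothing here reflects an existing Tor or pluggable transport API.

```python
import random

def sample_delay_ms(histogram, rng):
    """Pick a bin with probability proportional to its weight,
    then a uniformly distributed delay inside that bin."""
    bins = list(histogram.keys())
    weights = list(histogram.values())
    low, high = rng.choices(bins, weights=weights, k=1)[0]
    return rng.uniform(low, high)

def cells_for_request(max_cells, rng):
    """Number of padding cells to send for one padding request, in 0..max_cells."""
    return rng.randint(0, max_cells)

# Hypothetical histogram: (low_ms, high_ms) -> relative weight.
histogram = {(0, 10): 5, (10, 50): 3, (50, 200): 1}
rng = random.Random(2014)  # fixed seed so repeated runs are comparable

delays = [sample_delay_ms(histogram, rng) for _ in range(1000)]
burst = cells_for_request(20, rng)
print(min(delays), max(delays), burst)
```

In a real prototype the bridge would receive the histogram once, at handshake or circuit setup, and then schedule downstream padding on its own timer rather than waiting for an upstream request.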

More advanced padding protocols are also possible, but may also be
overkill. We can discuss those further if this sounds interesting. I'd
also like to hear any ideas you might have on the design and/or
implementation of such a protocol.


Related: Do you happen to have any existing classifier code working
already, by any chance?

One of the ideas I've been considering is taking a closer look at the
nearest-neighbor edit distances between page class labels, for the edit
distance based classifiers. This distance provides us with an estimate
of the ideal minimum cover traffic we will need to make testing
instances jump from one nearest-neighbor label to another (causing a
false positive). It will also decrease as the world size increases (more
class labels in the same amount of N-dimensional space).

A successful defense should change the distribution of edit distances
of test instances around their class labels (it will increase the
intra-class variance) and this in turn will increase the size of the
threshold around class labels for a given accuracy rate, reducing
accuracy or increasing false positives.
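A small sketch of the nearest-neighbor measurement, under simplifying assumptions: each class label is a single representative trace written as a string of '+' (outgoing) and '-' (incoming) cells, and the distance is plain unit-cost Levenshtein. The published edit-distance classifiers use their own cost tuning, so this is only an illustration of the idea, and the traces are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit-cost edits)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]

def nearest_neighbor_distances(labels):
    """For each class label, the edit distance to its closest other label."""
    return {
        name: min(edit_distance(trace, other)
                  for other_name, other in labels.items()
                  if other_name != name)
        for name, trace in labels.items()
    }

# Hypothetical class labels: one representative trace per site.
labels = {
    "site_a": "+--+---",
    "site_b": "+--+----",
    "site_c": "++-++-++-",
}
nn = nearest_neighbor_distances(labels)
print(nn)
```

Under these assumptions, a label's nearest-neighbor distance is a rough count of how many cells would have to be added, dropped, or changed before that label collides with another one. Adding more labels to the dictionary can only keep each existing distance the same or shrink it.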

It may also be the case that low or no cost defenses (like a smarter use
of SPDY) do this, too, but we'll be able to see it for sure with
padding.

Does this make sense?


-- 
Mike Perry



Re: [tor-dev] Interested in GSoC - Hidden Service Naming or Hidden Service Searching

2014-03-18 Thread Jeremy Rand


On 03/04/2014 11:31 AM, George Kadianakis wrote:

> AFAIK, you can submit multiple proposals. Even multiple proposals
> through different FOSS projects. Like I suggested in my previous mail,
> I would even encourage you to submit multiple proposals since the HS
> search engine project has gotten plenty of student attention lately.
>
> Cheers!

Sorry for the delayed reply, school had me busy.

What is the preferred way to get feedback on a full proposal?  Is there 
a way to submit a draft proposal on the GSoC website so that Tor devs 
can read it and send me feedback, but I can revise it before the 
deadline?  Or should I just post a link in an e-mail to the Tor-Dev 
list?  Also, does Tor prefer proposals in plain text, PDF, or some other 
format?


Thanks,
-Jeremy Rand

Re: [tor-dev] Panopticlick summer project

2014-03-18 Thread Mike Perry

Yan Zhu:
> On 03/17/2014 04:41 AM, Gunes Acar wrote:
> > Hi Yan,
> > 
> > Glad that you're interested in the project.
> > It'd be very nice to collaborate with you on this.
> > 
> > Indeed, we've been corresponding with Peter for a related project and
> > I mentioned my intention to work as a middleman between EFF and Tor.
> > 
> 
> Great, it seems that Peter and I are both interested and willing to help.
> 
> Regarding
> https://trac.torproject.org/projects/tor/ticket/6119#comment:10, Peter
> says he has some reluctance to open source the project (not the data)
> because it might make it easier for some websites to track visitors
> without their consent.

This might have been a valid concern 5 years ago, but now it's just a
joke. The tests on Panopticlick are ancient, widely known, and easy to
reproduce, and much more severe and invasive mechanisms of
fingerprinting have since been developed/deployed in modern browsers.

Moreover, only 2 of the tests it performs actually apply to Tor Browser
users.

Banks in particular have already deployed some of the techniques we've
fixed that the EFF study entirely predates. And these techniques are far
higher entropy than browser resolution (such as localhost open port
enumeration, OS theme fingerprinting, and HTML5+WebGL canvas
rendering+extraction+hashing).

Not only should we (as Tor) publicly provide tests and easy-to-deploy
working PoC code for all of these vectors, we should also endeavor to
detail cases where major browser vendors are ignoring or exacerbating
this problem, and make it easy for everyone to test and observe this
behavior themselves.

Not sure if that means the EFF now has a conflict of interest with this
project for some ridiculous reason, but frankly any attempt at trying to
"hide" these techniques is downright silly. They are too well known
(most are publicly documented elsewhere, or at least on our bugtracker),
and there's waaay too much money on the other side of the fence in terms
of incentives to develop and deploy working attacks.

Further, starting from the EFF codebase might also be a hindrance to us.
It is not designed for measuring the effects of defenses. In fact, its
measurement mechanisms actively penalize any attempt at defense
development (because any approach to alter browser behavior instantly
makes you more unique than the previous userbase).
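The penalty can be seen with a little arithmetic. The sketch below computes the self-information (surprisal) of one observed value, first against a pooled dataset and then against only the users who run the defense. The populations and value names are invented for illustration and are not real Panopticlick data.

```python
import math
from collections import Counter

def surprisal_bits(value, population):
    """Self-information -log2(p) of `value` under the empirical
    distribution of `population`."""
    counts = Counter(population)
    return -math.log2(counts[value] / len(population))

# Hypothetical data: 992 earlier visitors spread evenly over 8 values,
# plus 8 users of a new defense who all report the same value.
undefended = ["theme-%d" % (i % 8) for i in range(992)]
defended = ["uniform"] * 8

pooled = surprisal_bits("uniform", undefended + defended)
segmented = surprisal_bits("uniform", defended)
typical_undefended = surprisal_bits("theme-0", undefended + defended)

print(round(pooled, 2), round(typical_undefended, 2), segmented == 0)
```

Scored against the pooled history, the defended users look rarer than a typical undefended visitor, even though within their own cohort they are indistinguishable from one another.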

I actually think Panopticlick has of late done more to prevent browser
fingerprinting defense development than to encourage it. I would really
like to see it DIAF.

Here's hoping we can make something better!

-- 
Mike Perry


Re: [tor-dev] Panopticlick summer project

2014-03-18 Thread Yan Zhu

On 03/17/2014 04:41 AM, Gunes Acar wrote:
> Hi Yan,
> 
> Glad that you're interested in the project.
> It'd be very nice to collaborate with you on this.
> 
> Indeed, we've been corresponding with Peter for a related project and
> I mentioned my intention to work as a middleman between EFF and Tor.
> 

Great, it seems that Peter and I are both interested and willing to help.

Regarding
https://trac.torproject.org/projects/tor/ticket/6119#comment:10, Peter
says he has some reluctance to open source the project (not the data)
because it might make it easier for some websites to track visitors
without their consent.

-Yan

Re: [tor-dev] Torbirdy

2014-03-18 Thread mujnabed

Hi,

Thanks for the feedback. I have done some work in JavaScript and C++ and
am comfortable coding in both, which is one of the reasons I chose the
project.

So I'll go ahead and submit my application by tomorrow at the latest.
Also, can I edit the final project proposal once it's submitted to Melange?

Anyway thanks for all the help.

Regards
D
