Re: Tor seems to have a huge security risk--please prove me wrong!

2010-09-02 Thread grarpamp
> believe that the "global external passive adversary" does exist
> though (via ... secret rooms that splice cables and copy off
> traffic in transit)

The historical existence and use of taps, whether for international/local
intrigue, criminal, research or black/white ops, with or without clear legal
authority, is well documented. Even more so is the public product line
developed / purchased and capable of use by various GPAs... carnivore,
narus, sql, tcpdump, fiber toys, etc. As is the base interest in research
towards any potential application. It should be assumed that GPAs are
actively present, at minimum in highly active research mode. At most,
that remains to be seen.

>  try to bring their success
>  rates low enough that their incentive might switch to becoming a
>  "local internal adversary", where they have to actually run Tor nodes
>  to get enough information to perform their attacks.

Further, simply because there is not sufficient evidence to the contrary,
and because the history of covert ops and secrecy is equally documented...
it should be assumed that any sufficiently large set of anonymity
nodes is, in fact, not run entirely by disinterested kids in their mom's basement.

Just because the IP says residential dsl/cable or some corp or colo
somewhere, or the node is even signed by some seemingly well known
internet figure... as opposed to mapping back to any given adversary...
does not give the user reason to dismiss it as harmless.

The monetary cost of owning a kilonode or two is of trivial impact to an
agent capable of making productive use of such a set.

Agreed, writing off a known [or unknown hypothetical] strong adversary
is far better than disbelief in it, or failing to see one at all.
***
To unsubscribe, send an e-mail to majord...@torproject.org with
unsubscribe or-talk in the body. http://archives.seul.org/or/talk/


Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-30 Thread Paul Syverson
On Sun, Aug 29, 2010 at 05:13:21PM -0700, Mike Perry wrote:
> Thus spake Paul Syverson (syver...@itd.nrl.navy.mil):
> 
> > On Sun, Aug 29, 2010 at 12:54:59AM -0700, Mike Perry wrote:
> > > Any classifier needs enough bits to differentiate between two
> > > potentially coincident events. This is also why Tor's fixed packet
> > > size performs better against known fingerprinting attacks. Because
> > > we've truncated the lower 8 bits off of all signatures that use size
> > > as a feature in their fingerprint classifiers. They need to work to
> > > find other sources of bits.
> > 
> > I disagree. Most of what you say about base rates etc. is valid and
> > should be taken into account, but that is not the only thing that is
> > going on. First, you have just stated one reason that correlation
> > should be easier than fingerprinting but then tried to claim it as
> > some sort of methodological flaw. Truncating the lower 8 bits does
> > have a significant impact on fingerprinting but little impact on
> > correlation because of the windows and datasets, just like you said.
> > But way more importantly, fingerprinting is inherently a passive
> > attack. You are sifting through a pile of known fingerprints looking
> > for matches and that's all you can do as an attacker. But it's easy to
> > induce any timing signature you want during a correlation attack. (It
> > seems to be completely unnecessary because of point 1, but it would be
> > trivial to add that if you wanted to.) Tor's current design has no
> > mechanism to counter active correlation. Proposed techniques, such as
> > in the recent paper by Aaron, Joan, and me, are clearly too expensive
> > and iffy at this stage of research. This is totally different for
> > fingerprinting. One could have an active attack similar to
> > fingerprinting in which one tries to alter a fingerprint to make it
> > more unique and then look for that fingerprint.  I don't want to get
> > into a terminological quibble, but that is not what I mean by
> > fingerprinting and would want to call it something else or start
> > calling fingerprinting 'passive fingerprinting', something like that.
> > Then there is the whole question of how effective this would be,
> > plus a lot more details to say what "this" is, but anyway I think
> > we have good reason to treat fingerprinting and correlation as different
> > but related problems unless we want to say something trivial like
> > "They are both just instances of pattern recognition."
> 
> Ah, of course. What I meant to say then was that "passive
> fingerprinting" really is the same problem as "passive correlation". 

But there might be significant value to solving just passive
fingerprinting relative to cost whereas the value of solving just
passive correlation seems really tiny if it leaves active correlation
untouched. More below.

> 
> I don't spend a whole lot of time worrying about the "global *active*
> adversary", because I don't believe that such an adversary can really
> exist in practical terms. However, it is good that your research
> considers active adversaries in general, because they can and do exist
> on more localized scales.
> 
> I do believe that the "global external passive adversary" does exist
> though (via the AT&T secret rooms that splice cables and copy off
> traffic in transit), and I think that the techniques used against
> "passive fingerprinting" can be very useful against that adversary. I
> also think a balance can be found to provide defenses against the
> "global external passive adversary" to try to bring their success
> rates low enough that their incentive might switch to becoming a
> "local internal adversary", where they have to actually run Tor nodes
> to get enough information to perform their attacks.
>  
> This is definitely a terminological quibble, but I think it is useful
> to consider these different adversary classes and attacks, and how
> they relate to one another. I think it is likely that we are able to
> easily defeat most cases of dragnet surveillance with very good
> passive fingerprinting defenses, but that various types of active
> surveillance may remain beyond our (practical) reach for quite some
> time.

I don't share your belief about global external passive adversaries on
the current Tor network. I do find it plausible that there could be
(but no idea if there actually are) widespread adversaries (internal
and/or external) capable of attacking double-digit percentages of the
network; however I don't think they would be anything approaching
global. But we can agree to disagree on our speculations here.  Your
paranoia may vary.

My main concern is that your characterization implies a false
dichotomy by assuming an adversary's capabilities are uniform wherever
he may exist, either active everywhere or passive everywhere (and
let's ignore for now that each of 'active' and 'passive' cover a
variety of attackers).

This is central to the distinction between fingerprinting and
corr

Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-29 Thread Mike Perry
Thus spake Gregory Maxwell (gmaxw...@gmail.com):

> On Sun, Aug 29, 2010 at 3:54 AM, Mike Perry  wrote:
> [snip]
> > Any classifier needs enough bits to differentiate between two
> > potentially coincident events. This is also why Tor's fixed packet
> > size performs better against known fingerprinting attacks. Because
> > we've truncated the lower 8 bits off of all signatures that use size
> > as a feature in their fingerprint classifiers. They need to work to
> > find other sources of bits.
> 
> If this is so— that people are trying to attack tor with size
> fingerprinting but failing because of the size quantization and then
> failing to publish because they got a non-result— then it is something
> which very much needs to be made public.

According to the research groups Roger has talked to, yes, this is the
case.
 
> Not only might future versions of tor make different design decisions
> with respect to cell size, other privacy applications would benefit
> from even a no-result in this area.

The problem though is that it's hard to publish a no-result, unless
it's a pretty surprising no-result, or at least a quantifiable
no-result. It's not terribly surprising that existing fingerprinting
techniques do not work well "out of the box" against Tor, because a
lot less information is available during a Tor session, and there is a
lot more noise (due to more than just the 512-byte cell size).

If someone actually worked hard and took all these things into
account, and still had a result that said "Fingerprinting on Tor does
not usually work unless you have fewer than X targets
and/or event rates below Y", it still probably would belong more in a
tech report than a full academic paper, unless it also came with
information-theoretic proofs that showed exactly why their
implementation got the results it did.


-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs




Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-29 Thread Mike Perry
Thus spake Paul Syverson (syver...@itd.nrl.navy.mil):

> On Sun, Aug 29, 2010 at 12:54:59AM -0700, Mike Perry wrote:
> > Any classifier needs enough bits to differentiate between two
> > potentially coincident events. This is also why Tor's fixed packet
> > size performs better against known fingerprinting attacks. Because
> > we've truncated the lower 8 bits off of all signatures that use size
> > as a feature in their fingerprint classifiers. They need to work to
> > find other sources of bits.
> 
> I disagree. Most of what you say about base rates etc. is valid and
> should be taken into account, but that is not the only thing that is
> going on. First, you have just stated one reason that correlation
> should be easier than fingerprinting but then tried to claim it as
> some sort of methodological flaw. Truncating the lower 8 bits does
> have a significant impact on fingerprinting but little impact on
> correlation because of the windows and datasets, just like you said.
> But way more importantly, fingerprinting is inherently a passive
> attack. You are sifting through a pile of known fingerprints looking
> for matches and that's all you can do as an attacker. But it's easy to
> induce any timing signature you want during a correlation attack. (It
> seems to be completely unnecessary because of point 1, but it would be
> trivial to add that if you wanted to.) Tor's current design has no
> mechanism to counter active correlation. Proposed techniques, such as
> in the recent paper by Aaron, Joan, and me, are clearly too expensive
> and iffy at this stage of research. This is totally different for
> fingerprinting. One could have an active attack similar to
> fingerprinting in which one tries to alter a fingerprint to make it
> more unique and then look for that fingerprint.  I don't want to get
> into a terminological quibble, but that is not what I mean by
> fingerprinting and would want to call it something else or start
> calling fingerprinting 'passive fingerprinting', something like that.
> Then there is the whole question of how effective this would be,
> plus a lot more details to say what "this" is, but anyway I think
> we have good reason to treat fingerprinting and correlation as different
> but related problems unless we want to say something trivial like
> "They are both just instances of pattern recognition."

Ah, of course. What I meant to say then was that "passive
fingerprinting" really is the same problem as "passive correlation". 

I don't spend a whole lot of time worrying about the "global *active*
adversary", because I don't believe that such an adversary can really
exist in practical terms. However, it is good that your research
considers active adversaries in general, because they can and do exist
on more localized scales.

I do believe that the "global external passive adversary" does exist
though (via the AT&T secret rooms that splice cables and copy off
traffic in transit), and I think that the techniques used against
"passive fingerprinting" can be very useful against that adversary. I
also think a balance can be found to provide defenses against the
"global external passive adversary" to try to bring their success
rates low enough that their incentive might switch to becoming a
"local internal adversary", where they have to actually run Tor nodes
to get enough information to perform their attacks.
 
This is definitely a terminological quibble, but I think it is useful
to consider these different adversary classes and attacks, and how
they relate to one another. I think it is likely that we are able to
easily defeat most cases of dragnet surveillance with very good
passive fingerprinting defenses, but that various types of active
surveillance may remain beyond our (practical) reach for quite some
time.
 
> > Personally, I believe that it may be possible to develop fingerprint
> > resistance mechanisms good enough to also begin to make inroads
> > against correlation, *if* the network is large enough to provide an
> > extremely high event rate. Say, the event rate of an Internet-scale
> > anonymity network.
> > 
> > For this reason, I think it is very important for academic research to
> > clearly state their event rates, and the entropy of their feature
> > extractors and classifiers. As well as source code and full data
> > traces, so that their results can be reproduced on larger numbers of
> > targets and with larger event rates, as I mentioned in my other reply.
> 
> We don't have the luxury of chemistry or even behavioral stuff like
> population biology of some species of fish to just hand out full
traces. There's this pesky little thing called user privacy that creates a
> tension we have that those fields don't. We could also argue more
> about the nature of research and publication criteria, but I suspect
> that we will quickly get way off topic in such a discussion, indeed
> have already started.

In most cases, we pretty intensely frown on these attacks on the live
Tor net

Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-29 Thread Paul Syverson
On Sun, Aug 29, 2010 at 12:54:59AM -0700, Mike Perry wrote:
> Thus spake Paul Syverson (syver...@itd.nrl.navy.mil):
> 
> > > For those who want more background, you can read more at item #1 on
> > > https://www.torproject.org/research.html.en#Ideas
> > > (I hoped to transition
> > > https://www.torproject.org/volunteer.html.en#Research over to that new
> > > page, but haven't gotten around to finishing)
> > 
> > Yes. Exploring defensive techniques would be good. Unlike correlation,
> > fingerprinting seems more likely to be amenable to traffic shaping;
> > although the study of this for countering correlation (as some of us
> > recently published at PETS ;>) may be an OK place to build on.
> > Personally I still think trust is going to play a bigger role as an
> > effective counter than general shaping, but one place we seem to be in
> > sync is that it all needs more study.
> 
> Yeah, though again I want to point out that what we are actually
> looking at when we intuitively believe fingerprinting to be easier to
> solve than correlation is the event rate from the base rate fallacy.
> 
> Otherwise, they really are the same problem. Correlation is merely the
> act of taking a live fingerprint and extracting a number of bits from
> it, and adding these bits to the number of bits obtained from a window
> of time during which the event was supposed to have occurred.
> 
> Or, to put it in terms of event rates, it is merely the case that much
> fewer potentially misclassified events happen during the very small
> window of time provided by correlation, as opposed to the much larger
> number of events that happen during a dragnet fingerprinting attempt.
> 
> Any classifier needs enough bits to differentiate between two
> potentially coincident events. This is also why Tor's fixed packet
> size performs better against known fingerprinting attacks. Because
> we've truncated the lower 8 bits off of all signatures that use size
> as a feature in their fingerprint classifiers. They need to work to
> find other sources of bits.


I disagree. Most of what you say about base rates etc. is valid and
should be taken into account, but that is not the only thing that is
going on. First, you have just stated one reason that correlation
should be easier than fingerprinting but then tried to claim it as
some sort of methodological flaw. Truncating the lower 8 bits does
have a significant impact on fingerprinting but little impact on
correlation because of the windows and datasets, just like you said.
But way more importantly, fingerprinting is inherently a passive
attack. You are sifting through a pile of known fingerprints looking
for matches and that's all you can do as an attacker. But it's easy to
induce any timing signature you want during a correlation attack. (It
seems to be completely unnecessary because of point 1, but it would be
trivial to add that if you wanted to.) Tor's current design has no
mechanism to counter active correlation. Proposed techniques, such as
in the recent paper by Aaron, Joan, and me, are clearly too expensive
and iffy at this stage of research. This is totally different for
fingerprinting. One could have an active attack similar to
fingerprinting in which one tries to alter a fingerprint to make it
more unique and then look for that fingerprint.  I don't want to get
into a terminological quibble, but that is not what I mean by
fingerprinting and would want to call it something else or start
calling fingerprinting 'passive fingerprinting', something like that.
Then there is the whole question of how effective this would be,
plus a lot more details to say what "this" is, but anyway I think
we have good reason to treat fingerprinting and correlation as different
but related problems unless we want to say something trivial like
"They are both just instances of pattern recognition."

> 
> Personally, I believe that it may be possible to develop fingerprint
> resistance mechanisms good enough to also begin to make inroads
> against correlation, *if* the network is large enough to provide an
> extremely high event rate. Say, the event rate of an Internet-scale
> anonymity network.
> 
> For this reason, I think it is very important for academic research to
> clearly state their event rates, and the entropy of their feature
> extractors and classifiers. As well as source code and full data
> traces, so that their results can be reproduced on larger numbers of
> targets and with larger event rates, as I mentioned in my other reply.

We don't have the luxury of chemistry or even behavioral stuff like
population biology of some species of fish to just hand out full
traces. There's this pesky little thing called user privacy that creates a
tension we have that those fields don't. We could also argue more
about the nature of research and publication criteria, but I suspect
that we will quickly get way off topic in such a discussion, indeed
have already started.

aloha,
Paul

Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-29 Thread Gregory Maxwell
On Sun, Aug 29, 2010 at 3:54 AM, Mike Perry  wrote:
[snip]
> Any classifier needs enough bits to differentiate between two
> potentially coincident events. This is also why Tor's fixed packet
> size performs better against known fingerprinting attacks. Because
> we've truncated the lower 8 bits off of all signatures that use size
> as a feature in their fingerprint classifiers. They need to work to
> find other sources of bits.

If this is so— that people are trying to attack tor with size
fingerprinting but failing because of the size quantization and then
failing to publish because they got a non-result— then it is something
which very much needs to be made public.

Not only might future versions of tor make different design decisions
with respect to cell size, other privacy applications would benefit
from even a no-result in this area.


Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-29 Thread Mike Perry
Thus spake Paul Syverson (syver...@itd.nrl.navy.mil):

> > For those who want more background, you can read more at item #1 on
> > https://www.torproject.org/research.html.en#Ideas
> > (I hoped to transition
> > https://www.torproject.org/volunteer.html.en#Research over to that new
> > page, but haven't gotten around to finishing)
> 
> Yes. Exploring defensive techniques would be good. Unlike correlation,
> fingerprinting seems more likely to be amenable to traffic shaping;
> although the study of this for countering correlation (as some of us
> recently published at PETS ;>) may be an OK place to build on.
> Personally I still think trust is going to play a bigger role as an
> effective counter than general shaping, but one place we seem to be in
> sync is that it all needs more study.

Yeah, though again I want to point out that what we are actually
looking at when we intuitively believe fingerprinting to be easier to
solve than correlation is the event rate from the base rate fallacy.

Otherwise, they really are the same problem. Correlation is merely the
act of taking a live fingerprint and extracting a number of bits from
it, and adding these bits to the number of bits obtained from a window
of time during which the event was supposed to have occurred.

Or, to put it in terms of event rates, it is merely the case that much
fewer potentially misclassified events happen during the very small
window of time provided by correlation, as opposed to the much larger
number of events that happen during a dragnet fingerprinting attempt.
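
The base-rate argument above can be made concrete with a toy Bayes calculation. This is only an illustration: the classifier quality numbers and event counts are invented, not measured against Tor.

```python
# Toy illustration of the base-rate effect: the same classifier is
# decisive in a narrow correlation window but nearly useless in a
# dragnet sweep. All numbers here are invented for illustration.

def posterior(tpr: float, fpr: float, candidate_events: int) -> float:
    """P(the flagged event is the real target | classifier fires),
    assuming exactly one true target among candidate_events events."""
    prior = 1.0 / candidate_events
    return (tpr * prior) / (tpr * prior + fpr * (1.0 - prior))

TPR, FPR = 0.99, 0.001  # hypothetical classifier quality

# Correlation: a tight time window leaves few coincident events.
print(posterior(TPR, FPR, candidate_events=10))         # ~0.99

# Dragnet fingerprinting: the "window" spans millions of events.
print(posterior(TPR, FPR, candidate_events=1_000_000))  # ~0.001
```

With identical per-event accuracy, shrinking the pool of coincident events from a million to ten is what turns a noisy fingerprint into a near-certain match.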

Any classifier needs enough bits to differentiate between two
potentially coincident events. This is also why Tor's fixed packet
size performs better against known fingerprinting attacks. Because
we've truncated the lower 8 bits off of all signatures that use size
as a feature in their fingerprint classifiers. They need to work to
find other sources of bits.
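
The bit-truncation point can be sketched in a few lines. The object sizes below are hypothetical; only the 512-byte cell size is taken from Tor.

```python
import math
from collections import Counter

CELL = 512  # Tor's fixed cell size, which quantizes observable lengths

def pad_to_cells(n: int) -> int:
    """Bytes observable on the wire after padding up to whole cells."""
    return math.ceil(n / CELL) * CELL

def feature_entropy(sizes) -> float:
    """Shannon entropy (in bits) of the size feature over a trace."""
    counts = Counter(sizes)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical transfer sizes for objects fetched during a session.
raw = [1307, 1999, 2048, 2230, 4411, 7012, 7100, 9813]
padded = [pad_to_cells(n) for n in raw]

print(feature_entropy(raw))     # 3.0 bits: all eight sizes distinct
print(feature_entropy(padded))  # 2.5 bits: padding merged several sizes
```

The size feature simply carries fewer bits after quantization, so the classifier has to find its bits elsewhere.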

Personally, I believe that it may be possible to develop fingerprint
resistance mechanisms good enough to also begin to make inroads
against correlation, *if* the network is large enough to provide an
extremely high event rate. Say, the event rate of an Internet-scale
anonymity network.

For this reason, I think it is very important for academic research to
clearly state their event rates, and the entropy of their feature
extractors and classifiers. As well as source code and full data
traces, so that their results can be reproduced on larger numbers of
targets and with larger event rates, as I mentioned in my other reply.

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs




Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-28 Thread Paul Syverson
On Sat, Aug 28, 2010 at 02:51:35PM -0400, Roger Dingledine wrote:
> On Sat, Aug 28, 2010 at 11:20:41AM -0400, Paul Syverson wrote:
> > What you describe is known in the literature as website fingerprinting
> > attacks,
> [snip]
> > Roughly, while Tor is not invulnerable to such an attack, it fares
> > pretty well, much better than other systems that this and earlier
> > papers examined, mostly because the uniform-size cells that Tor moves
> > all data in add lots of noise.
> 
> Maybe. Or maybe not. This is an open research area that continues to
> worry me.
> 
> I keep talking to professors and grad students who have started a paper
> showing that website fingerprinting works on Tor, and after a while they
> stop working on the paper because they can't get good results either way
> (they can't show that it works well, and they also can't show that it
> doesn't work well).
> 
> The real question I want to see answered is not "does it work" -- I bet
> it can work in some narrow situations even if it doesn't work well in
> the general case. Rather, I want to know how to make it work less well.
> But we need to have a better handle on how well it works before we can
> answer that harder question.

OK I'm confused. Sorry for being terse initially but I just wanted to
get out that website fingerprinting is a known problem not a new
surprise. But it sounds like you think you are contrasting with what I
said rather than extending the same points. I said Tor is not
invulnerable to the attack, only that the published research (I wasn't
talking about the abandoned projects) shows it's a lot less vulnerable
than other deployed systems examined in that research, like jondonym
or various VPNs.  Yes, of course that's subject to the experiments and
assumptions conducted so far. I also said that it's worthy of
continued examination and analysis even if it is not the demonstrated
problem for Tor that end-to-end correlation is.  Since it's a pretty
open research area, we cannot say some significant attack isn't around
the corner. That's always the case.  All we know yet is that the few
published results there are show a small fraction of websites seem to
be uniquely identifiable via existing techniques. What am I missing?

> 
> For those who want more background, you can read more at item #1 on
> https://www.torproject.org/research.html.en#Ideas
> (I hoped to transition
> https://www.torproject.org/volunteer.html.en#Research over to that new
> page, but haven't gotten around to finishing)

Yes. Exploring defensive techniques would be good. Unlike correlation,
fingerprinting seems more likely to be amenable to traffic shaping;
although the study of this for countering correlation (as some of us
recently published at PETS ;>) may be an OK place to build on.
Personally I still think trust is going to play a bigger role as an
effective counter than general shaping, but one place we seem to be in
sync is that it all needs more study.

aloha,
Paul


Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-28 Thread Mike Perry
Thus spake Roger Dingledine (a...@mit.edu):

> On Sat, Aug 28, 2010 at 11:20:41AM -0400, Paul Syverson wrote:
>
> I keep talking to professors and grad students who have started a paper
> showing that website fingerprinting works on Tor, and after a while they
> stop working on the paper because they can't get good results either way
> (they can't show that it works well, and they also can't show that it
> doesn't work well).
> 
> The real question I want to see answered is not "does it work" -- I bet
> it can work in some narrow situations even if it doesn't work well in
> the general case. Rather, I want to know how to make it work less well.
> But we need to have a better handle on how well it works before we can
> answer that harder question.

Yes. This is the approach we need to solve this problem. However, one
of the problems with getting it out of most academics is the bias
against easy reproducibility. In order for any of this research to be
usable by us, it must be immediately and easily verifiable and
reproducible in the face of both changing attacks, and changing
network protocols (such as UDP-Tor and SPDY). This means source code
and experimental logs and data.

Most computer science academia is inherently biased against providing
this data for various reasons, and while this works for large industry
with the budget and time to reproduce experiments without assistance,
it will not work for us. I believe it is the main reason we see
adoption lag of 5-10 years for typical research all over
computer-related academia. My guess is Tor does not have this much time to
fix these problems, hence we must demand better science from
researchers who claim to be solving Tor-related problems (or proving
attacks on Tor networks).

I've gone into a little more detail on this subject and the
shortcomings of timing attacks in general in my comments on Michal
Zalewski's blog about regular, non-Tor HTTPS timing attacks:
http://lcamtuf.blogspot.com/2010/06/https-is-not-very-good-privacy-tool.html#comment-form


-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs




Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-28 Thread Roger Dingledine
On Sat, Aug 28, 2010 at 11:20:41AM -0400, Paul Syverson wrote:
> What you describe is known in the literature as website fingerprinting
> attacks,
[snip]
> Roughly, while Tor is not invulnerable to such an attack, it fares
> pretty well, much better than other systems that this and earlier
> papers examined, mostly because the uniform-size cells that Tor moves
> all data in add lots of noise.

Maybe. Or maybe not. This is an open research area that continues to
worry me.

I keep talking to professors and grad students who have started a paper
showing that website fingerprinting works on Tor, and after a while they
stop working on the paper because they can't get good results either way
(they can't show that it works well, and they also can't show that it
doesn't work well).

The real question I want to see answered is not "does it work" -- I bet
it can work in some narrow situations even if it doesn't work well in
the general case. Rather, I want to know how to make it work less well.
But we need to have a better handle on how well it works before we can
answer that harder question.

For those who want more background, you can read more at item #1 on
https://www.torproject.org/research.html.en#Ideas
(I hoped to transition
https://www.torproject.org/volunteer.html.en#Research over to that new
page, but haven't gotten around to finishing)

or see my 25c3 talk from 2008:
http://events.ccc.de/congress/2008/Fahrplan/events/2977.en.html
http://media.torproject.org/video/25c3-2977-en-security_and_anonymity_vulnerabilities_in_tor.mp4

--Roger



Re: Tor seems to have a huge security risk--please prove me wrong!

2010-08-28 Thread Paul Syverson
Hi Hikki,

What you describe is known in the literature as website fingerprinting
attacks, and there have been several research papers published about
them. Consult freehaven.net/anonbib or type "website fingerprinting"
in your favorite search engine. I think the most recent paper on this
is "Website fingerprinting: attacking popular privacy enhancing
technologies with the multinomial naïve-Bayes classifier" by Herrmann
et al. at the 2009 ACM CCSW (Cloud Computing Security Workshop). It
will cite much of the relevant previous literature.

Roughly, while Tor is not invulnerable to such an attack, it fares
pretty well, much better than other systems that this and earlier
papers examined, mostly because the uniform-size cells that Tor moves
all data in add lots of noise.

The ability to identify destinations without seeing the destination
end of a connection (even with pretty low probability of typical
success) remains worthy of continued examination and analysis.
But end-to-end correlation remains the most significant
fact-of-life for all practical low-latency systems, including Tor.
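
To see why end-to-end correlation is the dominant threat, here is a minimal sketch with invented per-second cell counts; an observer at both ends needs nothing fancier than a plain correlation coefficient.

```python
# Minimal end-to-end correlation sketch. Traffic counts are invented;
# a real attacker would also handle latency offsets and packet loss.
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

entry_flow = [12, 0, 7, 30, 2, 19, 0, 5, 23, 1]  # cells/sec at the entry
exit_a     = [11, 1, 8, 28, 2, 20, 0, 4, 22, 1]  # the same flow, exit side
exit_b     = [9, 4, 11, 6, 8, 10, 7, 3, 6, 12]   # an unrelated flow

print(pearson(entry_flow, exit_a))  # close to 1: the flows match
print(pearson(entry_flow, exit_b))  # near 0: no relation
```

Uniform cells do little against this, since the signal lives in timing and volume rather than in individual sizes; an active attacker can even induce the pattern at will.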

aloha,
Paul

On Sat, Aug 28, 2010 at 06:51:13AM -0400, hi...@safe-mail.net wrote:
> There are a lot of discussions going on over at the Onion Forum, a Tor hidden 
> service board, regarding a possible attack on Tor's anonymity and safety. 
> It's called "classifier attacks" and seems to be a high probability attack 
> that may in a way unmask the encryption used by Tor, and in addition to that 
> reveal the source as in the user using Tor as the first part of the chain.
> 
> This subject seems to be either very unknown or very well silenced, so I'm 
> very interested in what the users of this mailing list have to say about it.
> 
> --
> 
> http://l6nvqsqivhrunqvs.onion/index.php?do=topic&id=12078
> 
> Here are two concerning posts:
> 
> -- QUOTE START --
> 
> It's really not that hard to understand the attack; I don't see why everyone 
> is having such a hard time getting it.
> 
> You encrypt X with a key and the output is Y. There are 2^256 possible Y 
> values, with a 256 bit Initialization vector. This means each time you 
> encrypt X, even with the same key, the resulting Y is a different bit string. 
> The Bit string of X becomes impossible to get unless you have the key and Y. 
> So, the encrypted information itself can not be fingerprinted because there 
> are 2^256 possible ciphertexts for a given plaintext/key.
> 
> However, the SIZE that X will be after encrypted can be determined. X always 
> produces a Y of the same size when encrypted with a given key length, even 
> though there are 2^256 possible ciphertexts there is ONE possible size for Y.
> 
> This by itself isn't that bad for small data. Cat and Dog produce the same 
> output size for the same key. Once you start getting into really big things, 
> like motion pictures etc, then it starts to be a lot more damaging because 
> there are not a whole lot of things that are 329,384,394,231 bits, and by 
> looking at the Y value you can tell how many bits the X value was if you know 
> the algorithm used. Classifier attacks work better with SIZE.
> 
> However, complexity is another issue. If there is a website with 25 small 
> images on it, then the adversary can see the size of all these different 
> encrypted images you are loading. Each image can be seen by the adversary as 
> a different object, and the size of these objects can be determined. Also, if 
> you follow links on a page that you visit, the adversary can see the same data 
> for each of these pages and become more and more certain of what you are 
> doing. Classifier attacks work better with COMPLEXITY.
> 
> If you encrypt LARGE data, or COMPLEX SETS of data, it does not matter if you 
> use AES-256... the bitstring of X can not be derived from Y without the key, 
> but enough characteristics of X stay in Y that the adversary can with high 
> probability say what Y would PROBABLY decrypt into if they had the key. This 
> does require the adversary to have SEEN the value of X at some point prior to 
> it being encrypted, but this is not really that hard now is it? Tor is used 
> to PROTECT YOU in case there IS an insider in your group... but an insider in 
> your group can fingerprint X regardless of if it is CP, a drug forum or a 
> secret military document.
> 
> Understand?
> 
> -- QUOTE END --
> 
> -- QUOTE START --
> 
> Oh yeah, it can be done with layers too so its not just the entry node / 
> infrastructure to worry about, although that is the biggest worry since you 
> are next in the chain.
> 
> X -> Y
> Y -> Z
> Z -> U
> 
> U can be used to determine the size of Z, Z can be used to determine the size 
> of Y, Y can be used to determine the size of X.
> 
> Layer-encrypted data can still be classified; it's just that the relay node isn't 
> looking for the fingerprint of X, it is looking for the fingerprint of Y, which 
> it can get with Z.
> 
> -- QUOTE END --
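
The core size-leak claim in the quoted posts is easy to demonstrate: a stream cipher's output is exactly as long as its input, so |Y| reveals |X| even though the bits of X are hidden. Below is a toy sketch; the SHA-256 counter-mode keystream is only a stand-in for a real cipher such as AES-CTR and must never be used for actual encryption.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHA-256 counter-mode keystream.
    For illustrating the length leak only; not a real cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))

key = b"hypothetical key"
for plaintext in (b"cat", b"dog", b"a much longer document ..."):
    ciphertext = keystream_xor(key, plaintext)
    assert len(ciphertext) == len(plaintext)  # the size survives encryption
    # XOR is symmetric, so the same function decrypts:
    assert keystream_xor(key, ciphertext) == plaintext
    print(len(plaintext), "->", len(ciphertext))
```

This length feature is precisely what Tor's fixed-size cells blunt: padding to 512-byte cells strips the low-order bits of |X| from the observer's view.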