Comments on proposal 238:
1. I’m not convinced that the proposed amount of obfuscation is sufficient for 
the HS descriptor count. Adding noise to cover the contribution in a single 
period of any single HS doesn’t cover its vector of contributions. Thus, if 
over time the number of HSes stays the same (or has some other pattern that can 
be guessed by the adversary), then the randomness of the noise in the 
descriptor counts can effectively be removed by taking, say, the 
average. The best solution to this that I can think of is to bin every k 
consecutive integers and report the bin of the count after noise has been 
added. Then over time an adversary can at worst determine that the number of 
HSes lies within a range k. This applies to the cell counts also.
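To make the averaging attack and the binning defense concrete, here is a rough Python sketch (assuming additive Laplace noise; the true count, noise scale, and bin width k below are made-up numbers for illustration):

```python
import math
import random

random.seed(0)  # deterministic only for the sake of the example

def laplace(scale):
    """Sample Laplace(0, scale) via inverse-CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

TRUE_COUNT = 1000   # hypothetical stable number of HS descriptors
SCALE = 50          # noise scale calibrated to hide a few publications
PERIODS = 10000

# Attack: if the true count is stable across periods, averaging the
# noisy reports removes the noise (the standard error shrinks like
# SCALE * sqrt(2 / PERIODS)).
reports = [TRUE_COUNT + laplace(SCALE) for _ in range(PERIODS)]
estimate = sum(reports) / len(reports)

# Defense: bin every K consecutive integers and report only the bin
# of the noisy count, so that repeated observations at best narrow
# the count down to a range of width K.
K = 8
def binned(noisy):
    return int(noisy // K) * K  # report the lower edge of the bin
```

The estimate above converges to the true count as the number of periods grows, which is exactly what the binning is meant to blunt.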

2. In 2.3, what exactly are “unique hidden-service identities”? .onion 
addresses?

3. It would hugely improve statistics accuracy to aggregate the statistics and 
only add noise once. However, this would require that the relays participate in 
a distributed protocol (e.g. [0]) rather than stick numbers in their extra-info 
docs.
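To illustrate the accuracy gain: if R relays each add independent Laplace noise, the aggregate carries noise roughly sqrt(R) times larger than a single draw added after aggregation. A quick simulation (relay count and scale are arbitrary):

```python
import math
import random

random.seed(1)  # deterministic only for the sake of the example

def laplace(scale):
    """Sample Laplace(0, scale) via inverse-CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

RELAYS = 100   # hypothetical number of reporting relays
SCALE = 50     # per-value Laplace noise scale
TRIALS = 2000

def std(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# Status quo: every relay adds its own noise, so the aggregate error
# is a sum of RELAYS independent draws.
per_relay = [sum(laplace(SCALE) for _ in range(RELAYS))
             for _ in range(TRIALS)]

# Alternative: aggregate first, then add noise once.
aggregated = [laplace(SCALE) for _ in range(TRIALS)]

ratio = std(per_relay) / std(aggregated)  # roughly sqrt(RELAYS)
```

With 100 relays the per-relay scheme's error is about 10x larger, which is the accuracy that a distributed protocol like [0] would recover.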

4. Some possible privacy issues with revealing descriptor publication counts:
  - You wish to use hidden services in a way that involves a lot of .onion 
addresses for your service. This will blow past our noise, which I am assuming 
is calibrated to hide any single publication (or a small constant number of 
them). Then the total count could reveal when this new service appeared and is 
active (assuming the number of other descriptor publications is stable or 
otherwise predictable, say because they correspond to public HSes whose status 
can be determined via a connection attempt).
  - You can factor out the noise over time if the total count is stable or 
otherwise predictable. This is the same issue as #1 above and using bins could 
work here as well.

[0] Our Data, Ourselves: Privacy via Distributed Noise Generation
  by Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and 
Moni Naor
  EUROCRYPT 2006
  <http://research.microsoft.com/pubs/65086/odo.pdf>

On Nov 25, 2014, at 5:14 PM, A. Johnson <aaron.m.john...@nrl.navy.mil> wrote:

> Hi George,
> 
>> I posted an initial draft of the proposal here:
>> https://lists.torproject.org/pipermail/tor-dev/2014-November/007863.html
>> Any feedback would be awesome.
> 
> OK, I’ll have a chance to look at this in the next few days.
> 
>> Specifically, I would be interested in understanding the concept of
>> additive noise a bit better. As you can see the proposal draft is
>> still using multiplicative noise, and if you think that additive is
>> better we should change it. Unfortunately, I couldn't find any good
>> resources on the Internet explaining the difference between additive
>> and multiplicative noise. Could you expand a bit on what you said
>> above? Or link to a paper that explains more? Or link to some other
>> system that is doing additive noise (or even better its implementation)?
> 
> The technical argument for differential privacy is explained in 
> <http://research.microsoft.com/en-us/projects/databaseprivacy/dwork.pdf>.  
> The definition appears in Def. 2, the Laplace mechanism is given in Eq. 3 of 
> Sec. 5, and Thm. 4 shows why that mechanism achieves differential privacy.
> 
> But that stuff is pretty dry. The basic idea is that you’re trying to cover the 
> contribution of any one sensitive input (e.g. a single user’s data or a 
> single component of a single user’s data). The noise that you need to cover 
> that doesn’t scale with the number of other users, and so you use additive 
> noise.
> 
> Hope that helps,
> Aaron
> _______________________________________________
> tor-dev mailing list
> tor-dev@lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
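A minimal sketch of the additive Laplace mechanism described in the quoted message above (the count and epsilon values are made up; scale = sensitivity/epsilon as in Eq. 3 of the Dwork survey):

```python
import math
import random

random.seed(2)  # deterministic only for the sake of the example

def laplace(scale):
    """Sample Laplace(0, scale) via inverse-CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(true_count, epsilon, sensitivity=1.0):
    """Laplace mechanism: adding Laplace(sensitivity / epsilon) noise
    gives epsilon-differential privacy. For a counting query, one
    input changes the result by at most 1, so sensitivity = 1 no
    matter how many users contribute -- which is why the noise is
    additive rather than multiplicative."""
    return true_count + laplace(sensitivity / epsilon)

# One noisy release of a hypothetical count of 500 with epsilon = 0.1,
# i.e. noise scale 10, independent of the count's magnitude:
release = noisy_count(500, epsilon=0.1)
```

Multiplicative noise, by contrast, grows with the count itself, which is more noise than hiding any single contribution requires.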

