On Fri, Dec 17, 2010 at 04:03:06PM +0100, Loic Dachary wrote:
> Ludovic Courtès wrote:
> > Hello!
> >
> > Loic Dachary <l...@dachary.org> writes:
> >
> > [...]
> >
> >   
> >> The key question, it seems, is :
> >>
> >>  * given the most frequent use cases of the collaborative search software
> >>  * given the known security issues that threaten any DHT + those
> >> specific to the application
> >>  * given the existing counter measures (ECRS among others)
> >>  => is it likely that users will be able to effectively use the software
> >> or will they be disrupted
> >>        by security problems at a frequency that will be discouraging for
> >> most ?
> >>     
> >
> > Then you have to choose an adversary model.  
> I will reveal my ignorance: what is an "adversary model" ?
> http://en.wikipedia.org/wiki/Adversary_%28online_algorithm%29 ?
> > What do you expect
> > wrongdoers to be able to do to make the system unusable?
> >   
> A user finds the answer to question Q by sending a request to the DHT
> node responsible for Q (the question Q is hashed into a DHT key). A
> malicious node may try to impersonate the node responsible for Q and
> return an answer that is irrelevant.
> 
> This is the first example that comes to mind. Emmanuel Benazera ( the
> lead of http://seeks-project.info/ ) may have others.

Hi,

I've read the thread with great interest.

The example by Loic sounds like a Sybil attack to me. 

Seeks uses LSH [1] over the DHT to build so-called 'search groups'. Every
group lists IP addresses of machines serving recommendations and/or results
on the group's topic (here topic = query).
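
To make the idea of LSH-derived search groups concrete, here is a minimal sketch. It is not Seeks' actual scheme: it assumes MinHash over the words of a query with a banding step, and the names (minhash_signature, group_keys) and parameters (bands, rows) are hypothetical. Each returned key would be looked up in the DHT to find a group; two queries that share many words are likely, though not guaranteed, to share at least one key.

```python
import hashlib
from typing import List


def _h(seed: int, token: str) -> int:
    """Seeded 64-bit hash of a token (one hash function per seed)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(query: str, num_hashes: int = 8) -> List[int]:
    """MinHash signature over the set of lowercased words in the query."""
    tokens = set(query.lower().split())
    if not tokens:
        raise ValueError("empty query")
    return [min(_h(seed, t) for t in tokens) for seed in range(num_hashes)]


def group_keys(query: str, bands: int = 4, rows: int = 2) -> List[str]:
    """Split the signature into bands; each band yields one DHT key.

    Assumption: a 'search group' is identified by the SHA-1 of a band.
    """
    sig = minhash_signature(query, bands * rows)
    keys = []
    for b in range(bands):
        chunk = sig[b * rows:(b + 1) * rows]
        data = f"{b}:" + ",".join(str(v) for v in chunk)
        keys.append(hashlib.sha1(data.encode("utf-8")).hexdigest())
    return keys
```

More rows per band makes groups stricter (fewer accidental matches); more bands makes it more likely that similar queries meet in some group.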

Injection of poisonous / spam content can be handled both at the informational
level (much as a spam filter does for a mailbox) and at the network level.

Attacks on the structure of the DHT itself may be more severe. I understand
that Sybil attacks can be contained with certificates and a master authority,
and that Chord is robust to Eclipse attacks as long as it does not use extra
geographical / topological information to steer its routing process.

Other problems appear to target the collaborative search application more
specifically. First, I am worried about potential 'flash crowds' caused by
the sudden popularity of certain queries. A related problem is the uneven
load across participating nodes caused by the higher frequency of certain
queries and words. I have selected technical solutions to these problems.

I would be very grateful to see the envisioned solutions below discussed here.

- uneven load due to high-frequency words can be mitigated at the
vocabulary level with standard techniques from the information retrieval
community [2].
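
A minimal sketch of the stop-word idea, assuming query words are filtered before being hashed into DHT keys. The word list is a tiny illustrative sample and the function names are hypothetical; a real deployment would use a fuller, per-language list.

```python
import hashlib

# Illustrative sample only; not a complete stop-word list.
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "in", "on", "to", "and", "or", "is"}
)


def index_terms(query):
    """Lowercase the query and drop stop words, preserving word order."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


def dht_key(term):
    """Hash a term into a 160-bit key, as hex (SHA-1 is an assumption)."""
    return hashlib.sha1(term.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for term in index_terms("The history of the DHT"):
        print(term, dht_key(term))
```

With this filter, very common words never become keys, so no single node ends up responsible for them.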

- flash crowds and frequent queries could be mitigated by moving Chord's
virtual nodes that are under heavy load to other nodes with more resources.
This would work much like today's virtual machines, which can be migrated
with little or no service downtime. Unfortunately I cannot find the
reference for the paper in which I first came across this idea. I'll post
it to this list when I find it. This solution requires a protocol for
exchanging resource statistics among nodes.
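
The migration idea can be sketched as a greedy rebalancing pass. This is not the algorithm from the paper mentioned above (which is not identified here); it assumes each physical node advertises a capacity and a per-virtual-node load through some stats exchange, and the data layout, threshold, and function names are all hypothetical.

```python
def utilization(node):
    """Fraction of a node's capacity consumed by its virtual nodes."""
    return sum(node["vnodes"].values()) / node["capacity"]


def rebalance(nodes, threshold=0.8):
    """Greedily migrate virtual nodes off overloaded physical nodes.

    nodes: {name: {"capacity": float, "vnodes": {vnode_id: load}}}
    Mutates `nodes` and returns a list of (vnode_id, source, destination).
    """
    moves = []
    by_load = sorted(nodes, key=lambda n: utilization(nodes[n]), reverse=True)
    for src_name in by_load:
        src = nodes[src_name]
        # Keep at least one virtual node so the host stays on the ring.
        while utilization(src) > threshold and len(src["vnodes"]) > 1:
            vnode = max(src["vnodes"], key=src["vnodes"].get)
            load = src["vnodes"][vnode]
            # Destinations that would stay under the threshold after the move.
            candidates = [
                n for n in nodes
                if n != src_name
                and (sum(nodes[n]["vnodes"].values()) + load)
                / nodes[n]["capacity"] <= threshold
            ]
            if not candidates:
                break  # nowhere to put the heaviest vnode; give up on this host
            dst_name = min(candidates, key=lambda n: utilization(nodes[n]))
            nodes[dst_name]["vnodes"][vnode] = src["vnodes"].pop(vnode)
            moves.append((vnode, src_name, dst_name))
    return moves
```

In a real DHT the load figures would come from the stats protocol, and a move would also have to transfer the keys owned by the virtual node.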

As I come from another field of computer science research, my experience
in these matters is limited. Are there other solutions that I may have missed
or overlooked? I would give serious attention to any recommendations and
pointers to more theoretical / technical information and code.

Sincerely,

Em.

[1] http://www.mit.edu/~andoni/LSH/
[2] http://en.wikipedia.org/wiki/Stop_words

_______________________________________________
p2p-hackers mailing list
p2p-hackers@lists.zooko.com
http://lists.zooko.com/mailman/listinfo/p2p-hackers