On Fri, Dec 17, 2010 at 04:03:06PM +0100, Loic Dachary wrote:
> Ludovic Courtès wrote:
> > Hello!
> >
> > Loic Dachary <l...@dachary.org> writes:
> >
> > [...]
> >
> >> The key question, it seems, is :
> >>
> >> * given the most frequent use cases of the collaborative search software
> >> * given the known security issues that threaten any DHT + those
> >> specific to the application
> >> * given the existing counter measures (ECRS among others)
> >> => is it likely that users will be able to effectively use the software
> >> or will they be disrupted
> >> by security problems at a frequency that will be discouraging for
> >> most ?
> >>
> >
> > Then you have to choose an adversary model.
> I will reveal my ignorance: what is an "adversary model" ?
> http://en.wikipedia.org/wiki/Adversary_%28online_algorithm%29 ?
> > What do you expect
> > wrongdoers to be able to do to make the system unusable?
> >
> A user finds the answer to question Q by sending a request to the DHT
> node responsible for Q (the question Q is hashed into a DHT key). A
> malicious node may try to impersonate the node responsible for Q and
> return an answer that is irrelevant.
>
> This is the first example that comes to mind. Emmanuel Benazera ( the
> lead of http://seeks-project.info/ ) may have others.

Hi,

I've read the thread with great interest. The example given by Loic
sounds like a Sybil attack to me.

Seeks uses LSH [1] over the DHT to build so-called 'search groups'.
Every group lists the IP addresses of machines serving recommendations
and/or results on the group's topic (here topic = query). Injection of
poisonous / spam content can be handled both at the informational level
(much like what a spam filter does for a mailbox) and at the network
level.

Attacks on the structure of the DHT itself may be more severe. I
understand that Sybil can be contained with certificates and a master
authority, and that Chord is robust to Eclipse as long as it does not
use extra geographical / topological information to steer its routing
process.

Other problems appear to target the collaborative search application
more specifically. First, I am worried about potential 'flash crowds'
caused by the sudden popularity of certain queries. A related problem
is the uneven load across participating nodes caused by the higher
frequency of certain queries and words.

I have picked candidate technical solutions to these problems, and I
would be very grateful to see them discussed here:

- Uneven load due to high-frequency words can be mitigated at the
  vocabulary level with standard techniques from the information
  retrieval community [2].

- Flash crowds and frequent queries could be mitigated by moving
  Chord's virtual nodes that are under heavy load to other nodes with
  more resources. This would work much like today's virtual machines,
  which can be migrated with little or no service downtime.
  Unfortunately I cannot find the reference of the paper in which I
  first came across this idea. I'll post it to this list when I find
  it. This solution requires a protocol for exchanging resource stats
  among nodes.

As I come from another field of computer science research, my
experience in these matters is limited. Are there other solutions that
I may have missed or overlooked?
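
To make the virtual-node proposal a bit more concrete, here is a rough
Python sketch of the placement decision only. It is not taken from Seeks
or from any Chord implementation. It assumes every host already knows
the load and capacity of the others (so the stats-exchange protocol is
left out), and the names PhysicalNode and rebalance, as well as the 0.8
utilization threshold, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalNode:
    """A host carrying several Chord virtual nodes (hypothetical model)."""
    name: str
    capacity: float  # assumed unit: requests/s the host can serve
    vnodes: dict = field(default_factory=dict)  # vnode id -> observed load

    @property
    def load(self) -> float:
        return sum(self.vnodes.values())

    @property
    def utilization(self) -> float:
        return self.load / self.capacity


def rebalance(nodes, high=0.8):
    """Greedily move virtual nodes off hosts whose utilization exceeds `high`.

    Returns the list of moves as (vnode_id, source_name, target_name).
    A move is only made if the target stays at or below `high`.
    """
    moves = []
    # Treat the most overloaded hosts first.
    for src in sorted(nodes, key=lambda n: n.utilization, reverse=True):
        while src.utilization > high and src.vnodes:
            others = [n for n in nodes if n is not src]
            if not others:
                break
            dst = min(others, key=lambda n: n.utilization)
            # Heaviest virtual node that fits on the least-loaded host.
            fitting = [(load, vid) for vid, load in src.vnodes.items()
                       if (dst.load + load) / dst.capacity <= high]
            if not fitting:
                break  # nothing can be moved without overloading dst
            _, vid = max(fitting)
            dst.vnodes[vid] = src.vnodes.pop(vid)
            moves.append((vid, src.name, dst.name))
    return moves


if __name__ == "__main__":
    hosts = [
        PhysicalNode("a", 100, {"v1": 60, "v2": 30, "v3": 5}),
        PhysicalNode("b", 200, {"v4": 20}),
        PhysicalNode("c", 100, {"v5": 50}),
    ]
    print(rebalance(hosts))
```

The greedy choice (heaviest virtual node that still fits on the least
utilized host) is only one possible policy. A real deployment would
also have to account for the cost of transferring the keys held by the
virtual node being moved.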

I would give serious attention to any recommendations and pointers to
more theoretical / technical information and code.

Sincerely,

Em.

[1] http://www.mit.edu/~andoni/LSH/
[2] http://en.wikipedia.org/wiki/Stop_words
_______________________________________________
p2p-hackers mailing list
p2p-hackers@lists.zooko.com
http://lists.zooko.com/mailman/listinfo/p2p-hackers