On Sun, Jan 4, 2015 at 2:49 PM, Serguei Osokine <[email protected]> wrote:
> I would also add the complicated search queries; not sure, maybe there
> is already a good solution somewhere, but generally, when different DHT
> nodes are responsible for parts of the key space, finding everything
> that has both "sex" and "city" might result in the unmanageable traffic
> volume. Both key words might yield lots of pointers to nodes with that
> content, and finding the intersection between these pointer sets might
> be really complicated, even if the result will be just a single node
> with "sex and the city" content.

There's a really neat solution to this problem called Hyperspace
Hashing, which has you model your data in a set of different dimensions
(i.e. a Euclidean hyperspace) and locate a point within the hyperspace
based on the hashes of the inputs. You then break the hyperspace down
into a finite number of subspaces and partition those among the
available servers. The familiar hash ring can be seen as the
one-dimensional version of this concept.

It doesn't quite fit the full-text search scenario you described, but it
does help express more powerful queries than just k/v lookup:

http://www.cs.cornell.edu/people/egs/papers/hyperdex-sigcomm.pdf
http://www.slideshare.net/DECK36/c36-hyperdexmeetupdec2013v2nogfx

-- 
Tony Arcieri
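
To make the idea concrete, here is a minimal sketch of the scheme described above, not HyperDex's actual implementation. The three-attribute schema, the number of cells per axis, the server names, and the modular region-to-server placement are all assumptions chosen for illustration. Each attribute value is hashed to a cell along its own axis, an object's region is the tuple of those cells, and a query that specifies only some attributes pins those axes and enumerates the rest.

```python
import hashlib
from itertools import product

# Hypothetical schema: one axis of the hyperspace per attribute.
ATTRIBUTES = ("title", "genre", "year")
# Assumption: every axis is cut into the same number of intervals.
CELLS_PER_AXIS = 4
# Hypothetical server names.
SERVERS = ["node-%d" % i for i in range(5)]


def axis_cell(value):
    """Hash one attribute value to an interval index along its axis."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % CELLS_PER_AXIS


def region_of(obj):
    """Region holding an object: one cell index per attribute."""
    return tuple(axis_cell(obj[a]) for a in ATTRIBUTES)


def server_for(region):
    """Place a region on a server (simple modular placement, an assumption)."""
    index = 0
    for cell in region:
        index = index * CELLS_PER_AXIS + cell
    return SERVERS[index % len(SERVERS)]


def regions_for_query(query):
    """Regions that could hold objects matching a partial query.

    Attributes present in the query pin their axis to a single cell;
    unspecified attributes leave every cell on that axis as a candidate.
    """
    axes = [
        [axis_cell(query[a])] if a in query else range(CELLS_PER_AXIS)
        for a in ATTRIBUTES
    ]
    return list(product(*axes))


def servers_for_query(query):
    """Servers that must be contacted to answer a partial query."""
    return sorted({server_for(r) for r in regions_for_query(query)})
```

With these assumed parameters the space has 4 * 4 * 4 = 64 regions. A query that pins two of the three attributes has to consider only 4 of them, and a query that pins all three resolves to exactly one region, which is the plain k/v lookup case.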
_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers
