Hi,

I'm trying to implement a custom RDD that essentially works as a
distributed hash table, i.e. the key space is split up into partitions and
within a partition, an element can be looked up efficiently by the key.
However, the RDD lookup() function (in PairRDDFunctions) is implemented in
a way that iterates through all elements of a partition to find the matching
ones.  Is there a better way to do what I want, short of just
implementing new methods on the custom RDD?
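
To make the question concrete, here is a minimal sketch of the behavior I'm
after, written in plain Python rather than against the Spark API. The class
HashPartitionedTable and its methods are hypothetical names made up for
illustration; each "partition" is assumed to be an in-memory dict, with at
most one value per key.

```python
class HashPartitionedTable:
    """Toy stand-in for the custom RDD: one dict per partition (assumption)."""

    def __init__(self, pairs, num_partitions):
        self.num_partitions = num_partitions
        self.partitions = [dict() for _ in range(num_partitions)]
        for key, value in pairs:
            # Route each pair to the partition that owns its key.
            self.partitions[self._partition_of(key)][key] = value

    def _partition_of(self, key):
        # Hash partitioning: the key alone determines the partition.
        return hash(key) % self.num_partitions

    def lookup_by_scan(self, key):
        # Scan-style lookup: pick the right partition, then walk
        # every element in it and keep the matches.
        part = self.partitions[self._partition_of(key)]
        return [v for k, v in part.items() if k == key]

    def lookup_by_index(self, key):
        # What I'd like instead: pick the right partition, then do a
        # direct hash lookup inside it without touching other elements.
        part = self.partitions[self._partition_of(key)]
        return [part[key]] if key in part else []
```

Both methods return the same result; the difference is only that the second
one avoids touching every element of the partition.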

Thanks,
Akshat
