Create a custom key class, implement its equals method, and make sure the hash method is consistent with it: two keys that compare equal must return the same hash. Use that key to map your rows, then join on it.
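A minimal sketch of such a key, in plain Python rather than against the Spark API. The names JoinKey and hash_join are made up for illustration, and the sketch assumes the simplest case, where equality is decided on K1 alone, so the hash uses K1 alone as well. The small dict-based join only stands in for what a join on (key, row) pairs would do.

```python
from collections import defaultdict


class JoinKey:
    """Hypothetical composite key (k1, k2, ..., kn), for illustration only.

    Assumption: two keys are equal when their first component matches,
    so the hash is computed from the first component only.
    """

    def __init__(self, *parts):
        self.parts = tuple(parts)

    def __eq__(self, other):
        if not isinstance(other, JoinKey):
            return NotImplemented
        return self.parts[0] == other.parts[0]

    def __hash__(self):
        # Use only fields that are guaranteed equal whenever __eq__ is True.
        return hash(self.parts[0])

    def __repr__(self):
        return "JoinKey%r" % (self.parts,)


def hash_join(left, right):
    """Inner-join two lists of (key, value) pairs using a hash index."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, (lv, rv)) for key, lv in left for rv in index.get(key, [])]


left = [(JoinKey("a", 1), "left-1"), (JoinKey("b", 2), "left-2")]
right = [(JoinKey("a", 9), "right-1"), (JoinKey("c", 2), "right-2")]
joined = hash_join(left, right)
```

Here only the two rows sharing K1 == "a" are joined. The rows that agree on K2 alone are not matched: covering that fallback rule would need a hash that is still identical for keys that agree only on K2, which this sketch does not attempt.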
On Sat, May 9, 2015 at 4:02 PM, Mathieu D <matd...@gmail.com> wrote:
> Hi folks,
>
> I need to join RDDs having composite keys like this : (K1, K2 ... Kn).
>
> The joining rule looks like this :
> * if left.K1 == right.K1, then we have a "true equality", and all K2... Kn
> are also equal.
> * if left.K1 != right.K1 but left.K2 == right.K2, I have a partial
> equality, and I also want the join to occur there.
> * if K2 don't match, then I test K3 and so on.
>
> Is there a way to implement a custom join with a given predicate to
> implement this ? (I would probably also need to provide a partitioner, and
> some sorting predicate).
>
> Left and right RDD are 1-10 millions lines long.
> Any idea ?
>
> Thanks
> Mathieu