Create a custom key class, implement its equals method, and make sure its
hash method is consistent with it (keys that compare equal must return the
same hash).
Then map each row to that key and join on it.
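
As a rough illustration, here is a minimal sketch in plain Python rather than Spark code. The names `CompositeKey` and `hash_join` are made up for the example, and `hash_join` is only a local stand-in for what a pair-RDD join does with keys. In Python the two methods are `__eq__` and `__hash__`; on the JVM they would be `equals` and `hashCode`. The sketch only covers exact matches on the whole key. The fallback rule (try K2 when K1 differs, then K3, and so on) is not handled here and would need extra thought, since it is not obvious how to give such a rule a consistent hash.

```python
from collections import defaultdict


class CompositeKey:
    """Hypothetical key wrapping the components (K1, ..., Kn) of a row."""

    def __init__(self, *parts):
        self.parts = tuple(parts)

    def __eq__(self, other):
        if not isinstance(other, CompositeKey):
            return NotImplemented
        return self.parts == other.parts

    def __hash__(self):
        # Built from the same fields as __eq__, so equal keys hash alike.
        return hash(self.parts)

    def __repr__(self):
        return "CompositeKey%r" % (self.parts,)


def hash_join(left, right):
    """Join two lists of (key, value) pairs on key equality.

    Local stand-in for a pair-RDD join: index one side by key,
    then probe the index with the keys of the other side.
    """
    index = defaultdict(list)
    for key, value in left:
        index[key].append(value)
    return [(key, (lv, rv)) for key, rv in right for lv in index.get(key, [])]


left = [(CompositeKey("a", 1), "left-1"), (CompositeKey("b", 2), "left-2")]
right = [(CompositeKey("a", 1), "right-1"), (CompositeKey("c", 3), "right-3")]
joined = hash_join(left, right)
print(joined)
```

Only the ("a", 1) rows pair up, because two distinct key objects with the same components compare equal and land in the same hash bucket. If `__hash__` were left at its default while `__eq__` was overridden, the lookup would miss (or, in Python 3, the class would become unhashable), which is the reason for keeping the two methods consistent.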



On Sat, May 9, 2015 at 4:02 PM, Mathieu D <matd...@gmail.com> wrote:

> Hi folks,
>
> I need to join RDDs that have composite keys like this: (K1, K2 ... Kn).
>
> The joining rule looks like this:
> * if left.K1 == right.K1, then we have a "true equality", and all K2 ... Kn
> are also equal.
> * if left.K1 != right.K1 but left.K2 == right.K2, I have a partial
> equality, and I also want the join to occur there.
> * if K2 doesn't match, then I test K3, and so on.
>
> Is there a way to write a custom join with a given predicate to
> implement this? (I would probably also need to provide a partitioner and
> some sorting predicate.)
>
> The left and right RDDs are 1-10 million lines long.
> Any ideas?
>
> Thanks
> Mathieu
>
