We use Scoobi on MapReduce (MR) to perform joins, and in particular we rely on Scoobi's blockJoin() API:

    /**
     * Perform an equijoin with another distributed list where this list is
     * considerably smaller than the right (but too large to fit in memory),
     * and where the keys of right may be particularly skewed.
     */
    def blockJoin[B : WireFormat](right: DList[(K, B)]): DList[(K, (A, B))] =
      Relational.blockJoin(left, right)

I am working on a POC. Which Spark join API(s) would you recommend to achieve something similar? Please suggest.

--
Deepak