For datasets structured as 

ds1
rowN col1
1       A
2       B
3       C
4       C
…

and

ds2
rowN col2
1       X
2       Y
3       Z
…

I want to do a left join:

Dataset<Row> joined = ds1.join(ds2, "rowN", "left_outer");

I read somewhere, on SO or this mailing list, that if Spark is aware that the
datasets are sorted, it can apply some optimizations to joins.
Is it possible to make this join more efficient/faster?
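
To illustrate the kind of optimization I have in mind, here is a minimal
plain-Java sketch of a merge-style left join over two inputs that are already
sorted by rowN. It is not Spark code and not a claim about how Spark implements
joins; the class and method names are made up for illustration, and it assumes
rowN is unique in ds2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Spark.
class MergeLeftJoinSketch {

    // Left outer join of two inputs, both sorted ascending by key (rowN).
    // Assumes keys are unique on the right side. Each input is scanned once,
    // so no hashing or re-sorting is needed.
    static List<String> leftJoinSorted(int[] leftKeys, String[] leftVals,
                                       int[] rightKeys, String[] rightVals) {
        List<String> out = new ArrayList<>();
        int j = 0; // cursor into the right input
        for (int i = 0; i < leftKeys.length; i++) {
            // Skip right rows whose key is smaller than the current left key.
            while (j < rightKeys.length && rightKeys[j] < leftKeys[i]) {
                j++;
            }
            boolean match = j < rightKeys.length && rightKeys[j] == leftKeys[i];
            // Unmatched left rows are kept, with null for the right column.
            String col2 = match ? rightVals[j] : null;
            out.add(leftKeys[i] + "," + leftVals[i] + "," + col2);
        }
        return out;
    }
}
```

With the sample rows above, row 4 of ds1 has no partner in ds2, so it would
come out with a null col2.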

Rohit
