Hi Romi,
Yes, you understand it correctly. The keys of rdd1 and rdd2 overlap: many keys
appear in both, some keys are in rdd1 but not in rdd2, and some keys are in
rdd2 but not in rdd1. rdd3 would then have the same keys as rdd1; rdd3 will
Dear Romi, Priya, Sujt and Shivaram and all,
I have spent many days thinking about this issue without finding a good enough
solution... I would appreciate any help.
There is an RDD, rdd1, and another RDD, rdd2
(rdd2 can be a PairRDD, or a DataFrame with two columns
Hi,
If I understand correctly:
rdd1 contains keys (of type StringDate)
rdd2 contains keys and values
and rdd3 contains all the keys, and the values from rdd2?
I think you should make both rdd1 and rdd2 PairRDDs, and then use an outer join.
Does that make sense?
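To illustrate the join suggested above, here is a minimal sketch in plain Python
that models the result without needing a Spark cluster. It assumes a left outer
join is what is wanted (every rdd1 key is kept, matching the statement that rdd3
has the same keys as rdd1), that keys in rdd2 are unique, and that missing values
become None. The sample dates, values, and the helper name left_outer_join_keys
are hypothetical and only for illustration.

```python
# Plain-Python model of a left outer join on keys (runs without Spark).
# Hypothetical sample data: date strings as keys, floats as values.
rdd1_keys = ["2015-09-19", "2015-09-20", "2015-09-21"]
rdd2_pairs = [("2015-09-20", 1.5), ("2015-09-21", 2.5), ("2015-09-22", 3.5)]


def left_outer_join_keys(keys, pairs):
    """Keep every key from `keys`; attach its value from `pairs`, or None."""
    lookup = dict(pairs)  # assumes each key appears at most once in `pairs`
    return [(k, lookup.get(k)) for k in keys]


rdd3 = left_outer_join_keys(rdd1_keys, rdd2_pairs)
print(rdd3)

# Rough PySpark equivalent, assuming rdd1 holds bare keys and rdd2 holds
# (key, value) pairs:
#   rdd3 = (rdd1.map(lambda k: (k, None))
#               .leftOuterJoin(rdd2)
#               .mapValues(lambda v: v[1]))
```

Here "2015-09-19" is kept with a value of None because it is only in rdd1, and
"2015-09-22" is dropped because it is only in rdd2. If keys that exist only in
rdd2 should also be kept, fullOuterJoin would be the call to look at instead.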
On Mon, Sep 21, 2015 at 8:37 PM Zhiliang