Consider the following left outer join:

    potentialDailyModificationsRDD =
        reducedDailyPairRDD.leftOuterJoin(baselinePairRDD)
            .partitionBy(new HashPartitioner(1024))
            .persist(StorageLevel.MEMORY_AND_DISK_SER());

Below are the record counts for the RDDs involved
Number of records for
If you have duplicate values for a key, join creates all pairs. E.g., if you
have 2 values for key X in RDD A and 2 values for key X in RDD B, then
A.join(B) will have 4 records for key X.
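
To make the pairing concrete, here is a minimal sketch in plain Java (no Spark dependency) that mimics how a left outer join pairs up duplicate keys. The class name JoinCardinality, the helper methods, and the sample keys and values are all made up for illustration; this is not Spark's implementation, only the same counting rule applied to in-memory lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Spark.
public class JoinCardinality {

    // Groups (key, value) pairs by key, keeping every duplicate value.
    static Map<String, List<String>> groupByKey(List<String[]> pairs) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] p : pairs) {
            grouped.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
        }
        return grouped;
    }

    // Emulates left-outer-join pairing: each left record is paired with every
    // right record sharing its key, or with "absent" if there is no match.
    static List<String> leftOuterJoin(List<String[]> left, List<String[]> right) {
        Map<String, List<String>> rightByKey = groupByKey(right);
        List<String> out = new ArrayList<>();
        for (String[] l : left) {
            List<String> matches = rightByKey.get(l[0]);
            if (matches == null) {
                out.add(l[0] + ": (" + l[1] + ", absent)");
            } else {
                for (String r : matches) {
                    out.add(l[0] + ": (" + l[1] + ", " + r + ")");
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample data: key X appears twice on each side, Y only on the left.
        List<String[]> a = Arrays.asList(
            new String[] {"X", "a1"}, new String[] {"X", "a2"}, new String[] {"Y", "a3"});
        List<String[]> b = Arrays.asList(
            new String[] {"X", "b1"}, new String[] {"X", "b2"});

        List<String> joined = leftOuterJoin(a, b);
        joined.forEach(System.out::println);
        // 2 x 2 pairs for X plus 1 unmatched record for Y
        System.out.println(joined.size() + " records from " + a.size() + " left records");
    }
}
```

With this sample data the left side has 3 records but the result has 5, so a left outer join can return more records than its left input whenever the right side has duplicate keys.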
On Thu, Feb 19, 2015 at 3:39 PM, Darin McBeath ddmcbe...@yahoo.com.invalid
wrote:
Consider the following left outer