Specify number of partitions with which to run DataFrame.join?

Matt Cheah Thu, 18 Jun 2015 11:50:45 -0700

Hi everyone,

I'm looking into switching raw RDD operations to DataFrame operations. When
I used JavaPairRDD.join(), I had the option to specify the number of
partitions with which to do the join. However, I don't see an equivalent
option in DataFrame.join(). Is there a way to specify the partitioning for a
DataFrame join operation as it is being computed? Or do I have to compute
the join and then repartition the result separately afterward?


Thanks,

-Matt Cheah
