Hi everyone,

I'm looking into switching from raw RDD operations to DataFrame operations. When
I used JavaPairRDD.join(), I had the option to specify the number of
partitions with which to do the join. However, I don't see an equivalent
option in DataFrame.join(). Is there a way to specify the partitioning for a
DataFrame join operation as it is being computed? Or do I have to compute
the join and repartition separately after?

Thanks,

-Matt Cheah

