Hi everyone, I'm looking into switching from raw RDD operations to DataFrame operations. When I used JavaPairRDD.join(), I had the option to specify the number of partitions to use for the join. However, I don't see an equivalent option in DataFrame.join(). Is there a way to specify the partitioning for a DataFrame join operation as it is being computed? Or do I have to compute the join and then repartition separately afterward?
Thanks, -Matt Cheah