Hi, I need to partition my data, represented as an RDD, into n folds, compute metrics within each fold, and finally average those metrics over all the folds. Can Spark do this data partitioning out of the box, or do I need to implement it myself? I know that RDD has a partitions method and a mapPartitions method, but I don't really understand the purpose and meaning of "partition" there — is it related to what I want to do?
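To make the question concrete, here is roughly the workflow I have in mind, sketched in plain Python (the make_folds helper is hypothetical — I am asking whether Spark offers an equivalent for RDDs):

```python
# Plain-Python sketch of the n-fold metric workflow I want on an RDD.
# make_folds is a hypothetical helper standing in for whatever Spark
# may (or may not) provide out of the box.

def make_folds(data, n):
    """Split data round-robin into n folds."""
    return [data[i::n] for i in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

data = list(range(100))
folds = make_folds(data, 5)

# Compute a per-fold metric (here the fold mean as a stand-in metric)
fold_metrics = [mean(fold) for fold in folds]

# Average the metric over all the folds
overall = mean(fold_metrics)
```

Ideally each fold would stay an RDD so the per-fold metric computation runs distributed.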
Cheers, Jaonary