I am using the following code to repartition a DataFrame by multiple columns:
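import org.apache.spark.sql.Column
val partitionColumns = Seq("date", "company_id").map(x => new Column(x))
val numPartitions = 100
// repartition takes its columns as varargs, so the Seq is expanded with ": _*"
val dfRepartitioned = df.repartition(numPartitions, partitionColumns: _*)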
I understand that if the number of distinct combinations of date and
company_id is at most 100, each combination will go to a different partition.
My question is: what happens when the number of combinations is larger than
100? Does repartition change its behavior if I switch the column order in
the definition of the partitionColumns variable?
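
In case it helps, here is a rough sketch of how I think this could be checked empirically. It assumes a Spark version that provides spark_partition_id and countDistinct in org.apache.spark.sql.functions; the helper name combinationsPerPartition and the output column names partition_id and combinations are made up for illustration.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, countDistinct, spark_partition_id}

// Hypothetical helper (not part of Spark): repartition by the given columns,
// then count how many distinct column combinations landed in each partition.
def combinationsPerPartition(
    df: DataFrame,
    numPartitions: Int,
    columnNames: Seq[String]): DataFrame = {
  val partitionColumns: Seq[Column] = columnNames.map(col)
  df.repartition(numPartitions, partitionColumns: _*)
    // Tag every row with the id of the partition it sits in after the shuffle.
    .withColumn("partition_id", spark_partition_id())
    .groupBy("partition_id")
    .agg(countDistinct(partitionColumns.head, partitionColumns.tail: _*).as("combinations"))
    .orderBy("partition_id")
}
```

Calling it once with Seq("date", "company_id") and once with Seq("company_id", "date") and comparing the two results should show whether the column order changes the assignment, and whether any partition holds more than one combination.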
Thanks
--
Cesar Flores