Re: does column order matter in dataframe.repartition?

2016-11-17 Thread Sean Owen
It's not true in general that 100 different partition keys go to 100 partitions -- it depends on the partitioner, and it wouldn't be true in the case of a default HashPartitioner. But yes, you'd expect a reasonably even distribution. What happens in all cases depends on the partitioner. I haven't
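
To illustrate the point about hash partitioning, here is a minimal sketch in plain Scala with no Spark dependency. It assumes the partition index is simply the key's hashCode taken modulo the partition count, which is the general idea behind a hash partitioner; DataFrame repartition uses its own internal hash, so real assignments will differ. The object and method names (HashPartitionSketch, partitionOf, occupiedPartitions) and the sample keys are made up for illustration.

```scala
// Sketch of hash partitioning in plain Scala (no Spark dependency).
// All names here are hypothetical and exist only for this example.
object HashPartitionSketch {
  // Map a possibly negative hash code into the range [0, mod).
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Partition index for a key: its hashCode modulo the partition count.
  def partitionOf(key: Any, numPartitions: Int): Int =
    nonNegativeMod(key.hashCode, numPartitions)

  // Number of distinct partitions that receive at least one key.
  def occupiedPartitions(keys: Seq[Any], numPartitions: Int): Int =
    keys.map(partitionOf(_, numPartitions)).distinct.size

  def main(args: Array[String]): Unit = {
    // Hypothetical (date, company_id) keys: 100 distinct combinations.
    val keys = for (day <- 1 to 10; company <- 1 to 10)
      yield (f"2016-11-$day%02d", company)
    println(s"distinct keys: ${keys.distinct.size}")
    println(s"partitions used: ${occupiedPartitions(keys, 100)}")
  }
}
```

Two distinct keys can collide: the strings "Aa" and "BB" have the same hashCode on the JVM, so they land in the same partition whatever the partition count is. With 100 distinct keys and 100 partitions, some partitions will therefore typically hold several keys while others stay empty.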

does column order matter in dataframe.repartition?

2016-11-17 Thread Cesar
I am using the following lines to repartition a data frame by multiple columns:

import org.apache.spark.sql.Column
val partitionColumns = Seq("date", "company_id").map(x => new Column(x))
val numPartitions = 100
val dfRepartitioined = df.repartition(numPartitions, partitionColumns: _*)

I understand that if the number of combinations of