Hi! We expected the sort order within each partition to be preserved after a DataFrame write. We use the following code to write out one file per partition, with the rows sorted by a column.
    df.repartition($"col1")
      .sortWithinPartitions("col1", "col2")
      .write
      .partitionBy("col1")
      .csv(path)
However, we observe an unexpected sort order in some files. Does Spark
guarantee sort order within partitions on write?
Thanks,
swebask
