[Spark Internals]: Is sort order preserved after partitioned write?

Swetha Baskaran Thu, 15 Sep 2022 20:43:13 -0700

Hi!

We expected the sort order within each partition to be preserved after a
DataFrame write. We use the following code to write out one file per
partition, with the rows sorted by a column.







    df
      .repartition($"col1")
      .sortWithinPartitions("col1", "col2")
      .write
      .partitionBy("col1")
      .csv(path)

However, we observe an unexpected sort order in some of the output files.
Does Spark guarantee that the sort order within partitions is preserved on
write?
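
For reference, here is a minimal sketch of how the order in the written files could be checked without Spark. It rests on several assumptions: the names SortCheck, keysOf, partFiles, unsortedFiles and the keyIndex parameter are hypothetical, the files have no header and no quoted commas, and keys are compared as strings (so numeric columns would need parsing first). Since partitionBy("col1") moves col1 into the directory name, only the order of col2 inside each part file is checked.

```scala
import java.io.File
import scala.io.Source

object SortCheck {
  // True when every adjacent pair of keys is in non-decreasing order.
  def isSorted[K](keys: Seq[K])(implicit ord: Ordering[K]): Boolean =
    keys.zip(keys.drop(1)).forall { case (a, b) => ord.lteq(a, b) }

  // Reads one CSV part file and extracts the sort key from each line.
  // Assumption: no header, no quoted commas, key found at column `keyIndex`.
  def keysOf(file: File, keyIndex: Int): Seq[String] = {
    val src = Source.fromFile(file)
    try src.getLines().map(_.split(",", -1)(keyIndex)).toList
    finally src.close()
  }

  // Recursively collects files whose names start with "part-".
  // This skips _SUCCESS markers and hidden .crc checksum files.
  def partFiles(dir: File): Seq[File] = {
    val entries = Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
    entries.flatMap { f =>
      if (f.isDirectory) partFiles(f)
      else if (f.getName.startsWith("part-")) Seq(f)
      else Seq.empty
    }
  }

  // Returns the part files whose keys are not in non-decreasing order.
  def unsortedFiles(root: File, keyIndex: Int): Seq[File] =
    partFiles(root).filterNot(f => isSorted(keysOf(f, keyIndex)))
}
```

Calling SortCheck.unsortedFiles(new File(path), keyIndex) with the column position of col2 in the written files would list the files that show the unexpected order.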


Thanks,
swebask
