Github user junegunn commented on the issue: https://github.com/apache/spark/pull/16347 Rebased to current master. The patch is simpler thanks to the refactoring done in [SPARK-18243](https://issues.apache.org/jira/browse/SPARK-18243). I can understand your rationale for wanting an explicit API on the writer side, but then the sort specification from `sortWithinPartitions` should be automatically propagated to the writer; otherwise the method is no longer compatible with `SORT BY` in Hive, and [the documentation](https://github.com/apache/spark/blob/v2.1.0/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L990) should be corrected accordingly. Care should also be taken with the `INSERT OVERWRITE TABLE ... DISTRIBUTE BY ... SORT BY ...` statement in Spark SQL so that it stays compatible with the same Hive SQL.
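To make the concern concrete, here is a hedged sketch of the two forms that are expected to line up with Hive's `DISTRIBUTE BY ... SORT BY ...`. The session variable `spark` and the table names `src` and `dst` are assumptions for illustration, not from the patch:

```scala
// Dataset API: repartition + sortWithinPartitions, then write.
// The question raised above is whether the per-partition sort order
// is propagated to the writer (i.e. survives into the written files).
spark.table("src")
  .repartition($"key")             // roughly DISTRIBUTE BY key
  .sortWithinPartitions($"value")  // roughly SORT BY value
  .write
  .mode("overwrite")
  .saveAsTable("dst")              // hypothetical target table

// Spark SQL: the statement that should behave the same way,
// for compatibility with the equivalent Hive SQL.
spark.sql("""
  INSERT OVERWRITE TABLE dst
  SELECT key, value FROM src
  DISTRIBUTE BY key SORT BY value
""")
```

If the writer discards the sort from `sortWithinPartitions`, the two forms diverge, which is the documentation-versus-behavior mismatch the comment points at.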