GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/23249

    [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

    ## What changes were proposed in this pull request?

    Some of the documentation for `Distribution/Partitioning` is stale and misleading. This PR fixes it:
    1. `ClusteredDistribution` has no intra-partition requirement.
    2. `OrderedDistribution` does not require tuples that share the same value to be colocated in the same partition.
    3. `RangePartitioning` can provide a weaker guarantee for a prefix of its `ordering` expressions.

    ## How was this patch tested?

    comment-only PR.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark doc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23249

----
commit 24ea28abd5a385351703335df33b26838d203fe3
Author: Wenchen Fan <wenchen@...>
Date:   2018-12-06T15:47:23Z

    improve the doc of Distribution/Partitioning

----
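
For readers who want to see the three documented points side by side, here is a minimal toy model in plain Python. It is not Spark code and does not use any Spark API. Partitions are modeled as lists of `(a, b)` tuples, and the helper names `is_clustered`, `is_ordered`, `full_key`, and `prefix_key`, as well as the sample data, are hypothetical. The sketch reflects one reading of the PR description: clustering only constrains which partition a key lands in, ordering only constrains adjacent partitions, and range partitioning on `(a, b)` gives a strict boundary for the full key but only a non-strict one for the prefix `a`.

```python
def is_clustered(partitions, key):
    """True if every key value appears in at most one partition.

    Row order inside a partition is never inspected (point 1).
    """
    owner = {}
    for index, rows in enumerate(partitions):
        for row in rows:
            if owner.setdefault(key(row), index) != index:
                return False
    return True


def is_ordered(partitions, key, strict=False):
    """True if rows are ordered across adjacent non-empty partitions.

    Non-strict: equal keys may straddle a partition boundary (point 2).
    Strict: every row of the later partition must be strictly larger.
    """
    non_empty = [rows for rows in partitions if rows]
    for left, right in zip(non_empty, non_empty[1:]):
        highest_left = max(key(row) for row in left)
        lowest_right = min(key(row) for row in right)
        if lowest_right < highest_left:
            return False
        if strict and lowest_right == highest_left:
            return False
    return True


def full_key(row):
    return row  # order by (a, b)


def prefix_key(row):
    return row[0]  # order by the prefix a only


# Hypothetical data laid out as if range partitioned on (a, b).
# Rows inside each partition are deliberately left unsorted.
range_parts = [[(1, 2), (1, 1)], [(1, 3), (2, 1)], [(3, 1), (2, 5)]]

# Hypothetical data laid out as if hash partitioned on a.
hash_parts = [[(2, 1), (2, 5)], [(1, 2), (1, 1), (1, 3)], [(3, 1)]]

checks = {
    "range: strict order on (a, b)": is_ordered(range_parts, full_key, strict=True),
    "range: non-strict order on prefix a": is_ordered(range_parts, prefix_key),
    "range: strict order on prefix a": is_ordered(range_parts, prefix_key, strict=True),
    "range: clustered on a": is_clustered(range_parts, prefix_key),
    "hash: clustered on a": is_clustered(hash_parts, prefix_key),
    "hash: ordered on a": is_ordered(hash_parts, prefix_key),
}

for name, result in checks.items():
    print(f"{name}: {result}")
```

In this toy data, rows with `a = 1` sit in two adjacent partitions of `range_parts`, so the layout is ordered on `a` without being clustered on `a`, and the boundary is strict only when the full `(a, b)` key is compared.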