HeartSaVioR commented on a change in pull request #31355: URL: https://github.com/apache/spark/pull/31355#discussion_r565785171
##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/distributions/OrderedDistribution.java
##########
@@ -32,4 +32,13 @@
    * Returns ordering expressions.
    */
   SortOrder[] ordering();
+
+  /**
+   * Returns the number of partitions required by this write.

Review comment:
   Hmm... you're right: we do seem to co-use this interface on both the read and write paths, so it might be confusing if the documentation only talks about writes.

   I'm not sure this method can be used on the read path as well, though. On the read path, the data source is already able to control the parallelism (i.e., the number of partitions), so there is no need to pass the requirement along and let Spark do it instead. So I guess this is a point of divergence between read and write.

   If we can explicitly and effectively classify the methods as "common", "read-specific", and "write-specific", that would be great, but let's hear others' voices on this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
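The split the reviewer suggests could be sketched roughly as below. This is only an illustration of the idea, not the real Spark connector API: the interface and method names (`WriteOrderedDistribution`, `requiredNumPartitions`, the `SortOrder` stub) are hypothetical, and the "0 means no requirement" convention is an assumption made for this sketch.

```java
// Hypothetical sketch of classifying distribution methods into "common"
// and "write-specific" pieces. SortOrder is stubbed; none of these types
// are the actual org.apache.spark.sql.connector interfaces.

interface SortOrder {} // stub standing in for the real SortOrder expression

// "Common" part: ordering is meaningful on both the read and write paths.
interface OrderedDistribution {
  SortOrder[] ordering();
}

// "Write-specific" part: only a write needs to request a partition count,
// since on the read path the source already controls its own parallelism.
interface WriteOrderedDistribution extends OrderedDistribution {
  /** Number of partitions required by this write; 0 means no requirement. */
  default int requiredNumPartitions() {
    return 0;
  }
}

public class DistributionSketch {
  public static void main(String[] args) {
    // A writer-side distribution that asks Spark for exactly 10 partitions.
    WriteOrderedDistribution dist = new WriteOrderedDistribution() {
      @Override
      public SortOrder[] ordering() {
        return new SortOrder[0];
      }

      @Override
      public int requiredNumPartitions() {
        return 10;
      }
    };
    System.out.println(dist.requiredNumPartitions());
    System.out.println(dist.ordering().length);
  }
}
```

With this shape, a read path would only ever see the `OrderedDistribution` supertype, so the write-only partition requirement never leaks into read-side documentation.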