HeartSaVioR commented on a change in pull request #31355:
URL: https://github.com/apache/spark/pull/31355#discussion_r565785171



##########
File path: 
sql/catalyst/src/main/java/org/apache/spark/sql/connector/distributions/OrderedDistribution.java
##########
@@ -32,4 +32,13 @@
    * Returns ordering expressions.
    */
   SortOrder[] ordering();
+
+  /**
+   * Returns the number of partitions required by this write.

Review comment:
       Hmm... you're right, we seem to be reusing this interface in both the 
read and write paths, so it could be confusing if the documentation only talks 
about writes.
   
   I'm not sure this method can be used in the read path as well, though. On 
the read path, the data source already controls the parallelism (i.e., the 
number of partitions), so there is no need to pass the requirement to Spark 
and have Spark enforce it instead.
   
   So I guess this is a point of divergence between read and write. If we can 
explicitly and effectively classify the methods as "common", "read-specific", 
and "write-specific", that would be great, but let's hear others' opinions on 
this.
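   One possible shape for such a split (a rough sketch only; the 
`WriteDistribution` and `requiredNumPartitions` names below are hypothetical 
and not part of the actual Spark connector API) could look like:

```java
// Rough sketch only: hypothetical names, not the real
// org.apache.spark.sql.connector.distributions API.

// "Common" part: describes the distribution itself, usable by both
// the read path and the write path.
interface Distribution { }

// "Write-specific" part: lets a sink ask Spark to repartition the
// data before writing. A read-side source would not implement this,
// since it already decides its own partitioning.
interface WriteDistribution extends Distribution {
  // Number of partitions Spark should produce for this write.
  int requiredNumPartitions();
}
```

   With a split like this, the write path could require `WriteDistribution` 
while the read path only sees the common `Distribution` part, making the 
divergence explicit in the types rather than in the Javadoc.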



