Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21145#discussion_r186498369

    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/ReadTask.java ---
    @@ -22,20 +22,20 @@
     import org.apache.spark.annotation.InterfaceStability;

     /**
    - * A reader factory returned by {@link DataSourceReader#createDataReaderFactories()} and is
    + * A read task returned by {@link DataSourceReader#createReadTasks()} and is
    --- End diff --

    Now I'm rethinking the suggestion: `InputSplit` is a well-known Hadoop class that we probably shouldn't duplicate. What about using `InputPartition` instead? That makes it clear that the partitioning is on the input data and uses the more common term in Spark.

    Is everyone okay with this? @jose-torres @gengliangwang @cloud-fan @henryr @arunmahadevan @gatorsmile?