Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20397#discussion_r164347780

    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReaderFactory.java ---
    @@ -22,19 +22,19 @@ import org.apache.spark.annotation.InterfaceStability;
     
     /**
    - * A read task returned by {@link DataSourceV2Reader#createReadTasks()} and is responsible for
    - * creating the actual data reader. The relationship between {@link ReadTask} and {@link DataReader}
    + * A reader factory returned by {@link DataSourceV2Reader#createDataReaderFactories()} and is responsible for
    + * creating the actual data reader. The relationship between {@link DataReaderFactory} and {@link DataReader}
      * is similar to the relationship between {@link Iterable} and {@link java.util.Iterator}.
      *
    - * Note that, the read task will be serialized and sent to executors, then the data reader will be
    - * created on executors and do the actual reading. So {@link ReadTask} must be serializable and
    + * Note that, the reader factory will be serialized and sent to executors, then the data reader will be
    + * created on executors and do the actual reading. So {@link DataReaderFactory} must be serializable and
      * {@link DataReader} doesn't need to be.
      */
     @InterfaceStability.Evolving
    -public interface ReadTask<T> extends Serializable {
    +public interface DataReaderFactory<T> extends Serializable {
       /**
    -   * The preferred locations where this read task can run faster, but Spark does not guarantee that
    +   * The preferred locations where this data reader factory can run faster, but Spark does not guarantee that
    --- End diff --

    `... where the data reader returned by this reader factory can run faster ...`
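
    To make the factory/reader split in the Javadoc above concrete, here is a minimal, self-contained sketch. It does not use Spark: the `DataReader` and `DataReaderFactory` interfaces below are simplified stand-ins written for illustration, and `RangeReaderFactory`, `RangeReader`, `roundTrip` and `readAll` are hypothetical names that do not appear in the PR. The serialization round trip only imitates shipping the factory to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ReaderFactorySketch {

    // Simplified stand-in for the reader: it does the actual reading and is NOT Serializable.
    interface DataReader<T> extends Closeable {
        boolean next();
        T get();
    }

    // Simplified stand-in for the factory: Serializable, so it can be shipped to where reading happens.
    interface DataReaderFactory<T> extends Serializable {
        // Hosts where the reader created by this factory would run faster (a hint only).
        default String[] preferredLocations() {
            return new String[0];
        }

        DataReader<T> createDataReader();
    }

    // Hypothetical factory describing the integers in [start, end).
    static class RangeReaderFactory implements DataReaderFactory<Integer> {
        private static final long serialVersionUID = 1L;
        private final int start;
        private final int end;

        RangeReaderFactory(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public DataReader<Integer> createDataReader() {
            return new RangeReader(start, end);
        }
    }

    // Hypothetical reader holding mutable cursor state; created only after deserialization.
    static class RangeReader implements DataReader<Integer> {
        private final int end;
        private int current;

        RangeReader(int start, int end) {
            this.current = start - 1;
            this.end = end;
        }

        @Override
        public boolean next() {
            current++;
            return current < end;
        }

        @Override
        public Integer get() {
            return current;
        }

        @Override
        public void close() {
            // Nothing to release in this sketch.
        }
    }

    // Imitates sending an object to another JVM by serializing and deserializing it.
    @SuppressWarnings("unchecked")
    static <F extends Serializable> F roundTrip(F value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (F) in.readObject();
        }
    }

    // Creates a reader from the factory and drains it, like Iterable.iterator() followed by iteration.
    static List<Integer> readAll(DataReaderFactory<Integer> factory) throws IOException {
        List<Integer> rows = new ArrayList<>();
        try (DataReader<Integer> reader = factory.createDataReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        DataReaderFactory<Integer> onDriver = new RangeReaderFactory(0, 3);
        DataReaderFactory<Integer> onExecutor = roundTrip(onDriver);
        System.out.println(readAll(onExecutor)); // prints [0, 1, 2]
    }
}
```

    In this sketch only the factory crosses the serialization boundary; the reader and its cursor state are created afterwards. That is why the suggested wording attaches "can run faster" to the data reader returned by the factory rather than to the factory itself.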