Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/21145 Neither name is perfect. `ReadTask` is not a real task, and it has a method named `createDataReader`, while the write side has `createDataWriter` in `DataWriterFactory`. `DataReaderFactory` is not a factory in the design-pattern sense either. I renamed `ReadTask` -> `DataReaderFactory` to make the read and write APIs consistent. The new name wasn't as misleading as one might expect, since the API in `DataSourceReader` is `List<DataReaderFactory<Row>> createDataReaderFactories();`. I'm sorry I didn't come up with a better name at the time. But **partially** changing the name to `ReadTask` now only makes things worse. If there is a name better than both, let's use it. Otherwise, I prefer `DataReaderFactory` to `ReadTask`.
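To make the naming symmetry concrete, here is a rough sketch of how the read and write sides line up. The interfaces below are simplified stand-ins that only borrow the names from the discussion; they are not Spark's actual definitions (those carry more methods and parameters). `ListReaderFactory`, `ListWriterFactory`, and `NamingSketch` are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; NOT Spark's real interfaces.
interface DataReader<T> {
    boolean next();   // advance to the next record, false when exhausted
    T get();          // current record
}

interface DataReaderFactory<T> {
    DataReader<T> createDataReader();   // read side
}

interface DataWriter<T> {
    void write(T record);
}

interface DataWriterFactory<T> {
    DataWriter<T> createDataWriter();   // write side, same naming pattern
}

// Hypothetical factory: one instance per partition of an in-memory list.
final class ListReaderFactory implements DataReaderFactory<String> {
    private final List<String> partition;

    ListReaderFactory(List<String> partition) {
        this.partition = partition;
    }

    @Override
    public DataReader<String> createDataReader() {
        return new DataReader<String>() {
            private int index = -1;

            @Override
            public boolean next() {
                return ++index < partition.size();
            }

            @Override
            public String get() {
                return partition.get(index);
            }
        };
    }
}

// Hypothetical factory whose writers append to a shared in-memory sink.
final class ListWriterFactory implements DataWriterFactory<String> {
    private final List<String> sink;

    ListWriterFactory(List<String> sink) {
        this.sink = sink;
    }

    @Override
    public DataWriter<String> createDataWriter() {
        return sink::add;
    }
}

final class NamingSketch {
    // Shaped like the `createDataReaderFactories()` signature quoted above,
    // with String in place of Row to keep the sketch self-contained.
    static List<DataReaderFactory<String>> createDataReaderFactories() {
        List<DataReaderFactory<String>> factories = new ArrayList<>();
        factories.add(new ListReaderFactory(Arrays.asList("a", "b")));
        factories.add(new ListReaderFactory(Arrays.asList("c")));
        return factories;
    }

    // Reads every partition through its factory and copies records to a sink.
    static List<String> copyAll() {
        List<String> sink = new ArrayList<>();
        DataWriter<String> writer = new ListWriterFactory(sink).createDataWriter();
        for (DataReaderFactory<String> factory : createDataReaderFactories()) {
            DataReader<String> reader = factory.createDataReader();
            while (reader.next()) {
                writer.write(reader.get());
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(copyAll());
    }
}
```

With both sides named `*Factory` and exposing `create*`, the pairing is at least uniform, which is the consistency argument above; renaming only the read side would break that pairing.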