[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-24073.
-----------------------------
    Resolution: Fixed
      Assignee: Ryan Blue

> DataSourceV2: Rename DataReaderFactory back to ReadTask.
> --------------------------------------------------------
>
>                 Key: SPARK-24073
>                 URL: https://issues.apache.org/jira/browse/SPARK-24073
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Ryan Blue
>            Assignee: Ryan Blue
>            Priority: Major
>             Fix For: 2.4.0
>
>
> Just before 2.3.0, SPARK-23219 renamed ReadTask to DataReaderFactory. The
> intent was to make the read and write APIs match (the write side uses
> DataWriterFactory), but the underlying problem is that the two classes are
> not equivalent.
>
> ReadTask/DataReader function as Iterable/Iterator. A ReadTask is specific to
> a single read task, in contrast to DataWriterFactory, where the same factory
> instance is used in all write tasks. ReadTask's purpose is to manage the
> lifecycle of a DataReader, with an explicit create operation to mirror the
> close operation. This is no longer clear from the API: DataReaderFactory
> appears to be more generic than it is, and it isn't clear why a set of them
> is produced for a read.
>
> We should rename DataReaderFactory back to ReadTask, which correctly conveys
> the purpose and use of the class.
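
The Iterable/Iterator relationship described in the issue can be illustrated with a minimal, self-contained Java sketch. The interfaces below are simplified stand-ins written for this example, not the actual Spark DataSourceV2 interfaces; `RangeReadTask`, `readAll`, and the class name `ReadTaskSketch` are hypothetical names introduced only for illustration. The point is that each task instance describes one unit of reading, and its explicit create call pairs with the reader's close call.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ReadTaskSketch {

    /** Simplified stand-in for DataReader: iterator-like, with an explicit close. */
    public interface DataReader<T> extends Closeable {
        boolean next() throws IOException; // advance; false when exhausted
        T get();                           // current record
    }

    /** Simplified stand-in for ReadTask: iterable-like, one instance per read task. */
    public interface ReadTask<T> extends Serializable {
        DataReader<T> createDataReader();  // explicit create, mirrored by close()
    }

    /** Hypothetical task that reads the integers in [start, end). */
    public static final class RangeReadTask implements ReadTask<Integer> {
        private final int start;
        private final int end;

        public RangeReadTask(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public DataReader<Integer> createDataReader() {
            return new DataReader<Integer>() {
                private int current = start - 1;

                @Override
                public boolean next() {
                    current++;
                    return current < end;
                }

                @Override
                public Integer get() {
                    return current;
                }

                @Override
                public void close() {
                    // Nothing to release in this sketch; a real reader would
                    // close files or connections here.
                }
            };
        }
    }

    /** Runs one task: create the reader, drain it, and always close it. */
    public static <T> List<T> readAll(ReadTask<T> task) throws IOException {
        List<T> rows = new ArrayList<>();
        try (DataReader<T> reader = task.createDataReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // A read produces a set of tasks, each covering its own slice of the data.
        List<ReadTask<Integer>> tasks = new ArrayList<>();
        tasks.add(new RangeReadTask(0, 3));
        tasks.add(new RangeReadTask(3, 5));
        for (ReadTask<Integer> task : tasks) {
            System.out.println(readAll(task));
        }
    }
}
```

In this sketch each `RangeReadTask` carries the state for exactly one slice, which is why a read yields a set of them. A write-side factory, by contrast, could be a single instance shared by every write task.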