[ https://issues.apache.org/jira/browse/ARROW-15271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479636#comment-17479636 ]
Dewey Dunnington commented on ARROW-15271:
------------------------------------------

Just collecting a few related code comments here:

- https://github.com/apache/arrow/blob/03219e21b42f17294fba3b3d2b22a9117fe0f080/r/R/dataset-scan.R#L89
- https://github.com/apache/arrow/blob/03219e21b42f17294fba3b3d2b22a9117fe0f080/r/R/query-engine.R#L23-L26
- https://github.com/apache/arrow/blob/03219e21b42f17294fba3b3d2b22a9117fe0f080/r/R/dataset-scan.R#L184

A related capability is writing files directly in a query plan using the {{WriteNode}} that was added in ARROW-13542. For example, there is a ticket open for using the {{WriteNode}} to write datasets (ARROW-14266). Writing files is useful but perhaps orthogonal to the ability to iterate over a {{RecordBatchReader}}, which is exemplified by the revamped {{map_batches()}} and its vignette addition.

> [R] Refactor do_exec_plan to return a RecordBatchReader
> -------------------------------------------------------
>
>                 Key: ARROW-15271
>                 URL: https://issues.apache.org/jira/browse/ARROW-15271
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>    Affects Versions: 6.0.1
>            Reporter: Will Jones
>            Priority: Major
>
> Right now
> [{{do_exec_plan}}|https://github.com/apache/arrow/blob/master/r/R/query-engine.R#L18]
> returns an Arrow table because {{head}}, {{tail}}, and {{arrange}} do. If
> ARROW-14289 is completed and similar work is done for {{arrange}}, we may be
> able to alter {{do_exec_plan}} to return a {{RecordBatchReader}} instead.
> The {{map_batches()}} implementation (ARROW-14029) could benefit from this
> refactor. And it might make ARROW-15040 more useful.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)