[ https://issues.apache.org/jira/browse/ARROW-11782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291706#comment-17291706 ]

Neal Richardson commented on ARROW-11782:
-----------------------------------------

I would love to delete ScanTask from the R bindings. It is exposed there only to support a (hacky, experimental) attempt to do computations on the stream of record batches, so that we can compute things we otherwise couldn't because the whole Table doesn't fit in memory. Scanner::ToBatches doesn't work in that case because everything would be materialized.

What I _really_ want is to be able to pass a function/lambda to something like ToTable or ToBatches and have that function applied to every record batch in the stream. I don't want to manage consuming the ScanTasks/RecordBatchIterators myself; I'd prefer to have the C++ library handle that. (In my current hacky use of ScanTasks, it's actually prohibitively slow because it has to consume the iterators single-threaded.)

> [GLib][Dataset] Remove bindings for internal classes
> ----------------------------------------------------
>
>                 Key: ARROW-11782
>                 URL: https://issues.apache.org/jira/browse/ARROW-11782
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: GLib
>    Affects Versions: 3.0.0
>            Reporter: Ben Kietzman
>            Priority: Major
>             Fix For: 4.0.0
>
>
> GLib and Ruby include bindings for internal classes such as ScanOptions,
> ScanContext, InMemoryScanTask, ScanTask, ... These are probably unnecessary
> and should be removed to present a simpler interface that is less prone to
> breakage when the wrapped classes are refactored.
> https://github.com/apache/arrow/pull/9532/checks?check_run_id=1974229719#step:8:2071

--
This message was sent by Atlassian Jira
(v8.3.4#803005)