westonpace commented on a change in pull request #10008:
URL: https://github.com/apache/arrow/pull/10008#discussion_r612500764
##########
File path: cpp/src/arrow/dataset/dataset.cc
##########
@@ -95,6 +95,33 @@ Result<ScanTaskIterator> InMemoryFragment::Scan(std::shared_ptr<ScanOptions> opt
return MakeMapIterator(fn, std::move(batches_it));
}
+Result<RecordBatchGenerator> InMemoryFragment::ScanBatchesAsync(
+ const ScanOptions& options) {
+ struct Generator {
+ Future<std::shared_ptr<RecordBatch>> operator()() {
+ if (batch_index >= self->record_batches_.size()) {
+ return AsyncGeneratorEnd<std::shared_ptr<RecordBatch>>();
+ }
+ const auto& next_parent = self->record_batches_[batch_index];
+ if (offset + batch_size < next_parent->num_rows()) {
+ offset += batch_size;
+ auto next = next_parent->Slice(offset, batch_size);
+ return Future<std::shared_ptr<RecordBatch>>::MakeFinished(std::move(next));
+ }
+ batch_index++;
+ auto next = next_parent->Slice(offset, batch_size);
+ return Future<std::shared_ptr<RecordBatch>>::MakeFinished(std::move(next));
Review comment:
Yep. This logic was all backwards. I've since changed it to a while
loop. Just pushed the change.
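
For readers following along: in the quoted hunk, `offset` is advanced before `Slice` is called (so the first `batch_size` rows of each parent appear to be skipped), and `offset` is not reset when `batch_index` moves on. The pushed change is not shown in this message, so the block below is only a minimal sketch of what a while-loop version of the slicing logic might look like. `SliceRef` and `SliceGenerator` are hypothetical names, parent batches are modeled as plain row counts instead of `RecordBatch` objects, and it assumes C++17 and `batch_size > 0`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a RecordBatch slice: which parent batch it comes
// from, the starting row, and the number of rows.
struct SliceRef {
  std::size_t batch_index;
  int64_t offset;
  int64_t length;
};

// Hypothetical generator: walks parent batches (described only by their row
// counts) and emits slices of at most batch_size rows each.
class SliceGenerator {
 public:
  SliceGenerator(std::vector<int64_t> parent_rows, int64_t batch_size)
      : parent_rows_(std::move(parent_rows)), batch_size_(batch_size) {
    assert(batch_size_ > 0);  // assumption: a zero batch size would never advance
  }

  // Returns the next slice, or std::nullopt once every parent is exhausted.
  std::optional<SliceRef> Next() {
    // Skip parents that are empty or already fully consumed, resetting the
    // offset each time we move to a new parent.
    while (batch_index_ < parent_rows_.size() &&
           offset_ >= parent_rows_[batch_index_]) {
      ++batch_index_;
      offset_ = 0;
    }
    if (batch_index_ >= parent_rows_.size()) {
      return std::nullopt;
    }
    const int64_t remaining = parent_rows_[batch_index_] - offset_;
    SliceRef next{batch_index_, offset_, std::min(batch_size_, remaining)};
    offset_ += next.length;  // advance only after the slice has been taken
    return next;
  }

 private:
  std::vector<int64_t> parent_rows_;
  int64_t batch_size_;
  std::size_t batch_index_ = 0;
  int64_t offset_ = 0;
};
```

In the real method, each `SliceRef` would presumably correspond to `record_batches_[batch_index]->Slice(offset, length)` wrapped in `Future<std::shared_ptr<RecordBatch>>::MakeFinished`, and the `std::nullopt` case to `AsyncGeneratorEnd<std::shared_ptr<RecordBatch>>()`.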