lidavidm commented on a change in pull request #12560:
URL: https://github.com/apache/arrow/pull/12560#discussion_r819561503



##########
File path: cpp/src/arrow/dataset/scanner.cc
##########
@@ -877,6 +882,15 @@ Result<compute::ExecNode*> MakeScanNode(compute::ExecPlan* 
plan,
         batch->values.emplace_back(partial.fragment.index);
         batch->values.emplace_back(partial.record_batch.index);
         batch->values.emplace_back(partial.record_batch.last);
+
+        auto filefragment_casted =
+            dynamic_cast<const FileFragment*>(partial.fragment.value.get());

Review comment:
       Use `arrow::internal::checked_cast`? This will be dynamic_cast in debug 
mode and static cast otherwise

##########
File path: cpp/src/arrow/dataset/scanner.cc
##########
@@ -877,6 +882,15 @@ Result<compute::ExecNode*> MakeScanNode(compute::ExecPlan* 
plan,
         batch->values.emplace_back(partial.fragment.index);
         batch->values.emplace_back(partial.record_batch.index);
         batch->values.emplace_back(partial.record_batch.last);
+
+        auto filefragment_casted =
+            dynamic_cast<const FileFragment*>(partial.fragment.value.get());
+        if (filefragment_casted != nullptr) {
+          batch->values.emplace_back(filefragment_casted->source().path());
+        } else {
+          batch->values.emplace_back("");

Review comment:
       Maybe something more descriptive? "(not a file)" or something? Can we 
call fragment->ToString() or something instead?

##########
File path: cpp/src/arrow/dataset/scanner.cc
##########
@@ -708,8 +709,12 @@ Result<ProjectionDescr> 
ProjectionDescr::FromNames(std::vector<std::string> name
   for (size_t i = 0; i < exprs.size(); ++i) {
     exprs[i] = compute::field_ref(names[i]);
   }
+  auto fields = dataset_schema.fields();
+  for (const auto& aug_field : kAugmentedFields) {
+    fields.push_back(aug_field);
+  }
   return ProjectionDescr::FromExpressions(std::move(exprs), std::move(names),
-                                          dataset_schema);
+                                          Schema(fields));

Review comment:
       Why is this needed now but not before? I wouldn't expect the projection 
to have access to these fields




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to