lidavidm commented on a change in pull request #12560:
URL: https://github.com/apache/arrow/pull/12560#discussion_r824809426



##########
File path: cpp/src/arrow/dataset/scanner_test.cc
##########
@@ -128,6 +128,18 @@ class TestScanner : public DatasetFixtureMixinWithParam<TestScannerParams> {
     AssertScanBatchesEquals(expected.get(), scanner.get());
   }
 
+  void AssertScanForAugmentedFields(std::shared_ptr<Scanner> scanner) {
+    auto result = scanner.get()->ToTable();
+    if (result.ok()) {

Review comment:
       Shouldn't `result` be OK here?
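
       To illustrate the concern, here is a minimal, self-contained sketch. `MiniResult`, `GuardedCheckPasses`, and `StrictCheckPasses` are hypothetical names made up for this example; they are not Arrow APIs and not the code in this PR. The sketch only shows why wrapping assertions in `if (result.ok())` can let a test pass when the scan itself failed.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a result type; NOT arrow::Result.
template <typename T>
struct MiniResult {
  std::optional<T> value;  // set on success
  std::string error;       // set on failure
  bool ok() const { return value.has_value(); }
};

// Guarded style: the check runs only on success, so an error result
// is silently ignored. Returns true when the "test" would pass.
bool GuardedCheckPasses(const MiniResult<int>& result, int expected_rows) {
  if (result.ok()) {
    return *result.value == expected_rows;
  }
  return true;  // nothing was checked, so nothing failed
}

// Strict style: a failed result is itself treated as a test failure.
bool StrictCheckPasses(const MiniResult<int>& result, int expected_rows) {
  if (!result.ok()) {
    return false;
  }
  return *result.value == expected_rows;
}
```

       In Arrow's own tests, the strict style is usually expressed with helpers such as `ASSERT_OK_AND_ASSIGN` from `arrow/testing/gtest_util.h`, which fail the test when the result is not OK.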

##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -245,9 +245,10 @@ cdef class Dataset(_Weakrefable):
             The columns will be passed down to Datasets and corresponding data
             fragments to avoid loading, copying, and deserializing columns
             that will not be required further down the compute chain.
-            By default all of the available columns are projected. Raises
-            an exception if any of the referenced column names does not exist
-            in the dataset's Schema.
+            By default all of the available columns along with the augmented fields
+            such as `batch_index`, `fragment_index`, `last_in_fragment` and `filename`
+            are projected. Raises an exception if any of the referenced column names

Review comment:
       This seems to contradict the test above. Isn't that test checking that these fields don't exist by default?

##########
File path: cpp/src/arrow/dataset/scanner_test.cc
##########
@@ -1503,6 +1518,7 @@ TEST(ScanNode, Schema) {
   fields.push_back(field("__fragment_index", int32()));
   fields.push_back(field("__batch_index", int32()));
   fields.push_back(field("__last_in_fragment", boolean()));
+  fields.push_back(field("__filename", utf8()));

Review comment:
       I don't see any test checking that we actually get the right values in this new field.



