sahil1105 commented on code in PR #43661:
URL: https://github.com/apache/arrow/pull/43661#discussion_r1715557796


##########
cpp/src/arrow/dataset/file_parquet.cc:
##########
@@ -555,6 +562,57 @@ Future<std::shared_ptr<parquet::arrow::FileReader>> ParquetFileFormat::GetReader
       });
 }
 
+struct CastingGenerator {

Review Comment:
   Thank you!
   Based on what I see, that is only responsible for casting the data to the 
logical type specified in the Parquet metadata, not to the Arrow type we want 
to convert to (the one in the `dataset_schema`). For strings, that seems to 
always map to a String type: `FromByteArray` is called by `GetArrowType`, 
which is called by `GetTypeForNode`, which is called by `NodeToSchemaField`, 
which is called in `SchemaManifest::Make` during the creation of the 
`LeafReader`. Am I missing something?
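
To make the concern above concrete, here is a minimal toy model of the two-step type resolution being described. It does not use Arrow's real classes: `ToyArrowType`, `TypeFromParquetMetadata`, and `NeedsCast` are hypothetical names invented for illustration, and the mapping simply mirrors the behavior described in the comment (a string-annotated byte array resolves to String, regardless of the dataset schema).

```cpp
// Toy stand-ins for Arrow type identifiers; these are NOT Arrow's real types.
enum class ToyArrowType { kString, kLargeString, kBinary };

// Hypothetical mirror of the reader-side mapping described above: the result
// depends only on the Parquet metadata, never on the dataset schema.
constexpr ToyArrowType TypeFromParquetMetadata(bool is_string_annotated) {
  return is_string_annotated ? ToyArrowType::kString : ToyArrowType::kBinary;
}

// A separate cast step is needed whenever the type the reader produces
// differs from the type requested by the dataset schema.
constexpr bool NeedsCast(ToyArrowType produced, ToyArrowType requested) {
  return produced != requested;
}

// A string column requested as LargeString still comes back as String,
// so an extra cast would be required in this model.
static_assert(NeedsCast(TypeFromParquetMetadata(true),
                        ToyArrowType::kLargeString),
              "reader output differs from the requested dataset type");
```

In this sketch, a dataset schema asking for a large string type would not be satisfied by the reader alone, which is the gap the question is about.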



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
