mbutrovich commented on code in PR #22026:
URL: https://github.com/apache/datafusion/pull/22026#discussion_r3191255136


##########
datafusion/datasource-parquet/src/opener.rs:
##########
@@ -807,11 +827,24 @@ impl MetadataLoadedParquetOpen {
         let needs_rewrite = prepared.predicate.is_some()
             || prepared.logical_file_schema != physical_file_schema;
         if needs_rewrite {
+            // When virtual columns are requested, augment the logical and
+            // physical schemas passed to the rewriter/simplifier with those
+            // fields. The rewriter identity-rewrites references found in both
+            // schemas, keeping virtual-column references as `Column` rather
+            // than replacing them with null literals; the simplifier needs
+            // them present so it can resolve their data types while walking
+            // expression trees. We keep `physical_file_schema` itself as the
+            // pure file schema so downstream predicate pushdown, pruning, and
+            // row filter construction stay unaffected.

Review Comment:
   Good point: arrow-rs doesn't support pushing down filters on virtual 
columns. I'll come up with a solution.
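
   To make the constraint concrete, here is a rough sketch of one possible 
shape for a fix, not the actual change in this PR. It assumes the predicate 
has already been split into conjuncts. `Conjunct`, `split_for_pushdown`, and 
the virtual column name `row_index` are hypothetical stand-ins, not DataFusion 
or arrow-rs APIs. Conjuncts that touch a virtual column stay out of the 
pushed-down filter and would be evaluated after the scan instead.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one conjunct of a predicate. Only the
/// referenced column names matter for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Conjunct {
    /// Human-readable form of the expression, for illustration only.
    text: String,
    /// Names of the columns the expression references.
    columns: Vec<String>,
}

/// Splits conjuncts into `(pushable, residual)`.
///
/// A conjunct is treated as pushable only if it references no virtual
/// column. Anything that touches a virtual column lands in `residual`,
/// on the assumption that it is evaluated after the scan rather than
/// inside the Parquet row filter.
fn split_for_pushdown(
    conjuncts: Vec<Conjunct>,
    virtual_columns: &HashSet<String>,
) -> (Vec<Conjunct>, Vec<Conjunct>) {
    conjuncts.into_iter().partition(|conjunct| {
        !conjunct
            .columns
            .iter()
            .any(|column| virtual_columns.contains(column))
    })
}
```

   With `row_index` registered as virtual, a conjunct such as `a > 5` would 
still be pushed down, while `row_index < 100` and `a + row_index = 7` would 
both stay in the residual filter.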



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

