adriangb commented on code in PR #15561:
URL: https://github.com/apache/datafusion/pull/15561#discussion_r2027673271
##########
datafusion/sqllogictest/test_files/parquet.slt:
##########
@@ -625,7 +625,7 @@ physical_plan
01)CoalesceBatchesExec: target_batch_size=8192
02)--FilterExec: column1@0 LIKE f%
03)----RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
-04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/foo.parquet]]}, projection=[column1], file_type=parquet, predicate=column1@0 LIKE f%, pruning_predicate=column1_null_count@2 != row_count@3 AND column1_min@0 <= g AND f <= column1_max@1, required_guarantees=[]
+04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/foo.parquet]]}, projection=[column1], file_type=parquet, predicate=column1@0 LIKE f%
Review Comment:
Yes, I think we can do that. I feared it would be more confusing, because
the pruning predicate you see is not what you get in the end...
Is there any way we can inject this information at runtime? Metrics already
kind of do that. It'd be nice to record the per-file pruning predicates,
per-file schema mappings, and per-file filters once those exist.
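For reference, here is a minimal sketch of what the `pruning_predicate` removed from the plan output above checks for `column1 LIKE 'f%'`, assuming per-file min/max/null-count statistics are available. `ColumnStats` and `may_contain_prefix_f` are hypothetical names for illustration only, not DataFusion APIs, and the real per-file predicate may differ from this plan-level one.

```rust
/// Hypothetical per-file statistics for `column1` (illustrative only,
/// not a DataFusion type).
struct ColumnStats<'a> {
    min: &'a str,
    max: &'a str,
    null_count: u64,
    row_count: u64,
}

/// Mirrors the displayed predicate:
///   column1_null_count != row_count AND column1_min <= 'g' AND 'f' <= column1_max
/// Returns true when the file may contain values starting with "f" and so
/// must be scanned; false means the file can be skipped.
fn may_contain_prefix_f(stats: &ColumnStats<'_>) -> bool {
    // A file whose column is entirely NULL can never match LIKE 'f%'.
    let has_non_null = stats.null_count != stats.row_count;
    // Values with prefix "f" sort at or after "f" and before "g", so the
    // file's [min, max] range must overlap that interval.
    has_non_null && stats.min <= "g" && "f" <= stats.max
}
```

A file with `min = "apple"`, `max = "egg"` would be skipped under this check, while one with `min = "fig"`, `max = "fox"` would be scanned.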
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]