aditanase commented on code in PR #16049: URL: https://github.com/apache/datafusion/pull/16049#discussion_r2256824252
##########
datafusion/sqllogictest/test_files/push_down_filter.slt:
##########
@@ -288,3 +288,56 @@ physical_plan DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/
 statement ok
 drop table t;
+
+statement ok
+create table test_uppercase_cols (a int, "A" int, "B" int, "C" int);
+
+# statement ok
+# set datafusion.explain.physical_plan_only = false;
+
+# Turn off the optimizer to make the logical plan closer to the initial one
+# statement ok
+# set datafusion.optimizer.max_passes = 0;
+
+# test push down through aggregate for uppercase column name
+query TT
+explain
+select "A", total_salary
+from (
+  select "A", sum("B") as total_salary from test_uppercase_cols group by "A"
+)
+where "A" > 10;
+----
+physical_plan
+01)ProjectionExec: expr=[A@0 as A, sum(test_uppercase_cols.B)@1 as total_salary]
+02)--AggregateExec: mode=FinalPartitioned, gby=[A@0 as A], aggr=[sum(test_uppercase_cols.B)]
+03)----CoalesceBatchesExec: target_batch_size=8192
+04)------RepartitionExec: partitioning=Hash([A@0], 4), input_partitions=4
+05)--------AggregateExec: mode=Partial, gby=[A@0 as A], aggr=[sum(test_uppercase_cols.B)]
+06)----------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1
+07)------------CoalesceBatchesExec: target_batch_size=8192
+08)--------------FilterExec: A@0 > 10

Review Comment:
   This is the key fix: the FilterExec is now pushed down to sit right on top of the DataSourceExec, below both aggregation stages. Without the fix, the plan would look like this (with the filter executed AFTER the aggregation):
   ```
   + 02)--CoalesceBatchesExec: target_batch_size=8192
   + 03)----FilterExec: A@0 > 10
   + 04)------AggregateExec: mode=SinglePartitioned, gby=[A@0 as A], aggr=[sum(test_uppercase_cols.B)]
   + 05)--------DataSourceExec: partitions=1, partition_sizes=[0]
   ```
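   To illustrate why moving the filter below the aggregation is safe here, a minimal sketch in plain Python follows. It is a toy model, not DataFusion code: the function names and sample rows are made up for illustration, and SQL NULL handling is ignored. It only relies on the fact that the predicate `"A" > 10` references the group-by key alone, so filtering rows before grouping keeps exactly the groups that filtering after grouping would keep.

   ```python
   from collections import defaultdict


   def aggregate_sum(rows):
       """Group (a, b) pairs by a and sum b, like: select "A", sum("B") ... group by "A"."""
       totals = defaultdict(int)
       for a, b in rows:
           totals[a] += b
       return dict(totals)


   def filter_after_aggregate(rows, threshold):
       # Shape of the plan without the fix: aggregate every row, then drop groups.
       return {a: s for a, s in aggregate_sum(rows).items() if a > threshold}


   def filter_before_aggregate(rows, threshold):
       # Shape of the plan with the fix: drop rows first, then aggregate what is left.
       return aggregate_sum((a, b) for a, b in rows if a > threshold)


   # Hypothetical sample data: (A, B) pairs.
   rows = [(5, 100), (11, 20), (11, 30), (42, 7), (10, 1)]
   after = filter_after_aggregate(rows, 10)
   before = filter_before_aggregate(rows, 10)
   print(after == before)
   ```

   Both variants return the same groups, but the second one aggregates fewer rows, which is the benefit the pushed-down FilterExec is meant to give. A predicate on the aggregate output (for example on `total_salary`) could not be moved this way.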