asolimando commented on code in PR #21075: URL: https://github.com/apache/datafusion/pull/21075#discussion_r2975234847
##########
datafusion/sqllogictest/test_files/union.slt:
##########
@@ -273,6 +273,49 @@ physical_plan
 04)--ProjectionExec: expr=[name@0 || _new as name]
 05)----DataSourceExec: partitions=1, partition_sizes=[1]
 
+# unions_to_filter is disabled by default
+query TT
+EXPLAIN SELECT id, name FROM t1 WHERE id = 1 UNION SELECT id, name FROM t1 WHERE id = 2
+----
+logical_plan
+01)Aggregate: groupBy=[[id, name]], aggr=[[]]
+02)--Union
+03)----Filter: t1.id = Int32(1)
+04)------TableScan: t1 projection=[id, name]
+05)----Filter: t1.id = Int32(2)
+06)------TableScan: t1 projection=[id, name]
+physical_plan
+01)AggregateExec: mode=FinalPartitioned, gby=[id@0 as id, name@1 as name], aggr=[]
+02)--RepartitionExec: partitioning=Hash([id@0, name@1], 4), input_partitions=4
+03)----AggregateExec: mode=Partial, gby=[id@0 as id, name@1 as name], aggr=[]
+04)------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=2
+05)--------UnionExec
+06)----------FilterExec: id@0 = 1
+07)------------DataSourceExec: partitions=1, partition_sizes=[1]
+08)----------FilterExec: id@0 = 2
+09)------------DataSourceExec: partitions=1, partition_sizes=[1]
+
+statement ok
+set datafusion.optimizer.enable_unions_to_filter = true;
+
+query TT
+EXPLAIN SELECT id, name FROM t1 WHERE id = 1 UNION SELECT id, name FROM t1 WHERE id = 2
+----

Review Comment:
   I haven't looked at the proposed implementation, but the rewrite can surely help when costly union branches are repeated (especially when they come from potentially complex data sources backed by a `TableProvider`).

   It also seems particularly relevant until https://github.com/apache/datafusion/issues/8777 (broader scope, CTE materialization) is addressed, since AFAIK there is currently no other way to share repeated reads across branches.
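   To make the equivalence behind the rewrite concrete, here is a minimal sketch. It assumes the optimizer folds both branches into a single scan with a disjunctive filter under a distinct; the plan the PR actually produces is not shown in the hunk above. It uses Python's built-in `sqlite3` as a stand-in engine, not DataFusion, and the table contents are made up for illustration.

```python
import sqlite3

# In-memory stand-in for t1; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (3, "c")],
)

# Original shape: two filtered reads of the same table, deduplicated by UNION.
union_rows = conn.execute(
    "SELECT id, name FROM t1 WHERE id = 1 "
    "UNION SELECT id, name FROM t1 WHERE id = 2"
).fetchall()

# Assumed rewritten shape: one read, one OR filter, one DISTINCT.
filter_rows = conn.execute(
    "SELECT DISTINCT id, name FROM t1 WHERE id = 1 OR id = 2"
).fetchall()

# Row order is unspecified in both queries, so compare sorted results.
assert sorted(union_rows) == sorted(filter_rows)
```

   The second query touches `t1` once instead of twice, which is where the saving would come from when the scan itself is expensive.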
