liukun4515 commented on code in PR #7428: URL: https://github.com/apache/arrow-datafusion/pull/7428#discussion_r1307001137
##########
datafusion/sqllogictest/test_files/joins.slt:
##########
@@ -1500,15 +1500,16 @@ Projection: join_t1.t1_id, join_t2.t2_id, join_t1.t1_name
 physical_plan
 ProjectionExec: expr=[t1_id@0 as t1_id, t2_id@2 as t2_id, t1_name@1 as t1_name]
 --ProjectionExec: expr=[t1_id@0 as t1_id, t1_name@1 as t1_name, t2_id@3 as t2_id]
-----CoalesceBatchesExec: target_batch_size=4096
-------HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(join_t1.t1_id + UInt32(12)@2, join_t2.t2_id + UInt32(1)@1)]
---------CoalescePartitionsExec
+----ProjectionExec: expr=[t1_id@2 as t1_id, t1_name@3 as t1_name, join_t1.t1_id + UInt32(12)@4 as join_t1.t1_id + UInt32(12), t2_id@0 as t2_id, join_t2.t2_id + UInt32(1)@1 as join_t2.t2_id + UInt32(1)]

Review Comment:
   After this PR, both `t1` and `t2` have statistics available in the `ProjectionExec`. The `t2` side projects only `t2_id`, while the `t1` side projects `t1_id` and `t1_name`, so the estimated cost of `t1` is greater than that of `t2`. Because the join collects its left input (`mode=CollectLeft`), the inputs are swapped and `t2` is used as the build table.
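
The build-side reasoning in the review comment can be sketched roughly as follows. This is a simplified illustration, not DataFusion's actual join-selection code: `InputStats`, `should_swap`, and `choose_build_side` are hypothetical names, and the byte sizes in the usage note are made up. The assumption is that the optimizer compares the estimated size of each join input and puts the smaller one on the left (build) side.

```rust
/// Hypothetical, simplified stand-in for the statistics of one join input.
/// Not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct InputStats {
    name: &'static str,
    num_rows: Option<usize>,
    total_byte_size: Option<usize>,
}

/// Returns true when the left input is estimated to be larger than the right,
/// i.e. when swapping would put the smaller input on the build side.
/// Byte sizes are compared first; row counts are a fallback.
/// With no usable statistics the original order is kept.
fn should_swap(left: &InputStats, right: &InputStats) -> bool {
    match (left.total_byte_size, right.total_byte_size) {
        (Some(l), Some(r)) => l > r,
        _ => match (left.num_rows, right.num_rows) {
            (Some(l), Some(r)) => l > r,
            _ => false,
        },
    }
}

/// Returns `(build, probe)`: the input to collect on the left and the input
/// to stream on the right.
fn choose_build_side<'a>(
    left: &'a InputStats,
    right: &'a InputStats,
) -> (&'a InputStats, &'a InputStats) {
    if should_swap(left, right) {
        (right, left)
    } else {
        (left, right)
    }
}
```

With equal row counts, an input that projects two columns (`t1_id`, `t1_name`) has a larger estimated byte size than one that projects a single column (`t2_id`), so calling `choose_build_side(&t1, &t2)` with such statistics returns `t2` as the build side. That matches the swapped column order in the new `ProjectionExec` line of the plan.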
