[I] [Bug] Unnecessary Motion Nodes Added for Partitioned Tables Join with other tables [cloudberry]

via GitHub Fri, 08 Aug 2025 01:52:04 -0700


avamingli opened a new issue, #1301:
URL: https://github.com/apache/cloudberry/issues/1301


   ### Apache Cloudberry version
   
   main
   
   ### What happened
   
   While debug TPC-DS with partitioned tables, I found that when executing 
queries on partitioned tables, the planner sometimes generates redundant Motion 
nodes even when distribution keys are logically compatible. 
   
   The issue leads to suboptimal query plans with avoidable data redistribution 
overhead.
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   ```sql
   CREATE TABLE t1 (id varchar(32), date date) DISTRIBUTED BY (id)
   PARTITION BY RANGE (date)
   (START (date '2016-01-01') INCLUSIVE END (date '2016-01-04') EXCLUSIVE EVERY 
(INTERVAL '1 day'));
   CREATE TABLE t2 (id varchar(32)) DISTRIBUTED BY (id);
   analyze t1;
   analyze t2;
   \d+ t1;
                                         Partitioned table "public.t1"
    Column |         Type          | Collation | Nullable | Default | Storage  
| Stats target | Description
   
--------+-----------------------+-----------+----------+---------+----------+--------------+-------------
    id     | character varying(32) |           |          |         | extended 
|              |
    date   | date                  |           |          |         | plain    
|              |
   Partition key: RANGE (date)
   Partitions: t1_1_prt_1 FOR VALUES FROM ('01-01-2016') TO ('01-02-2016'),
               t1_1_prt_2 FOR VALUES FROM ('01-02-2016') TO ('01-03-2016'),
               t1_1_prt_3 FOR VALUES FROM ('01-03-2016') TO ('01-04-2016')
   Distributed by: (id)
   
   \d+ t2;
                                               Table "public.t2"
    Column |         Type          | Collation | Nullable | Default | Storage  
| Stats target | Description
   
--------+-----------------------+-----------+----------+---------+----------+--------------+-------------
    id     | character varying(32) |           |          |         | extended 
|              |
   Distributed by: (id)
   
   EXPLAIN(COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 USING(id);
                              QUERY PLAN
   ----------------------------------------------------------------
    Aggregate
      ->  Gather Motion 3:1  (slice1; segments: 3)
            ->  Hash Join
                  Hash Cond: ((t1.id)::text = (t2.id)::text)
                  ->  Dynamic Seq Scan on t1
                        Number of partitions to scan: 3 (out of 3)
                  ->  Hash
                        ->  Seq Scan on t2
    Optimizer: GPORCA
   (9 rows)
   
   set optimizer=off;
   SET
   
   EXPLAIN(COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 USING(id);
                               QUERY PLAN
   ------------------------------------------------------------------
    Aggregate
      ->  Gather Motion 3:1  (slice1; segments: 3)
            ->  Hash Join
                  Hash Cond: ((t1.id)::text = (t2.id)::text)
                  ->  Redistribute Motion 3:3  (slice2; segments: 3)
                        Hash Key: t1.id
                        ->  Append
                              ->  Seq Scan on t1_1_prt_1 t1_1
                              ->  Seq Scan on t1_1_prt_2 t1_2
                              ->  Seq Scan on t1_1_prt_3 t1_3
                  ->  Hash
                        ->  Seq Scan on t2
    Optimizer: Postgres query optimizer
   (13 rows)
   ```
   
   ### Operating System
   
   all
   
   ### Anything else
   
   ORCA works well, it's planner's bug.
   
   ### Are you willing to submit PR?
   
   - [x] Yes, I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[I] [Bug] Unnecessary Motion Nodes Added for Partitioned Tables Join with other tables [cloudberry]

Reply via email to