Hi, I have a question regarding the interpretation of total cost in PostgreSQL's EXPLAIN output.
In some cases, I noticed that a parent path node's total cost is less than that of one of its child path nodes. I initially expected that the total cost of a node should be at least as large as the sum of its child nodes’ total costs, or at least not smaller than any single child. Is this behaviour expected? Could it be due to estimation imprecision, selectivity, or parallelism effects? Here is a reproducible example: DROP TABLE IF EXISTS t_a, t_b, t_c, t_d; CREATE TABLE t_a(id INT, val TEXT); CREATE TABLE t_b(id INT, val TEXT); CREATE TABLE t_c(id INT, val TEXT); CREATE TABLE t_d(id INT, val TEXT); -- Small tables INSERT INTO t_a SELECT i, 'a_' || i FROM generate_series(1, 1000) i; INSERT INTO t_b SELECT i, 'b_' || i FROM generate_series(1, 1000) i; -- Large tables INSERT INTO t_c SELECT i, repeat('c', 1000) FROM generate_series(1, 100000) i; INSERT INTO t_d SELECT i, repeat('d', 1000) FROM generate_series(1, 1000000) i; ANALYZE t_a; ANALYZE t_b; ANALYZE t_c; ANALYZE t_d; EXPLAIN SELECT t_a.*, t_b.*, t_c.*, t_d.* FROM t_a JOIN t_b ON t_a.id = t_b.id JOIN t_c ON t_b.id = t_c.id JOIN t_d ON t_c.id = t_d.id; The resulting EXPLAIN result is: Merge Join (cost=1612692.31..1612848.97 rows=1000 width=2034) Merge Cond: (t_a.id = t_c.id) -> Merge Join (cost=131.66..151.66 rows=1000 width=18) Merge Cond: (t_a.id = t_b.id) -> Sort (cost=65.83..68.33 rows=1000 width=9) Sort Key: t_a.id -> Seq Scan on t_a (cost=0.00..16.00 rows=1000 width=9) -> Sort (cost=65.83..68.33 rows=1000 width=9) Sort Key: t_b.id -> Seq Scan on t_b (cost=0.00..16.00 rows=1000 width=9) -> Materialize (cost=1612560.52..1625748.24 rows=100000 width=2016) -> Gather Merge (cost=1612560.52..1625498.24 rows=100000 width=2016) Workers Planned: 2 -> Merge Join (cost=1611560.49..1612955.73 rows=41667 width=2016) Merge Cond: (t_c.id = t_d.id) -> Sort (cost=36274.90..36379.07 rows=41667 width=1008) Sort Key: t_c.id -> Parallel Seq Scan on t_c (cost=0.00..14702.67 rows=41667 width=1008) -> Materialize (cost=1575285.54..1580285.57 rows=1000006 width=1008) -> Sort (cost=1575285.54..1577785.56 rows=1000006 width=1008) Sort Key: t_d.id -> Seq Scan on t_d (cost=0.00..152858.06 rows=1000006 width=1008) (22 rows) which includes the following: Merge Join (cost=1612692.31..1612848.97 rows=1000 width=2034) -> ... -> Merge Join (cost=1611560.49..1612955.73 rows=41667 width=2016) In this case, the child node (Merge Join on t_c.id = t_d.id) has a total cost of 1612955.73, which is higher than the parent’s total cost of 1612848.97. Could someone clarify whether this is considered valid? Any explanation on how total cost is propagated or computed in such cases would be appreciated. Thank you very much! PostgreSQL major version: 16 Best regards, Xuan Chen