I noticed that UNION's output rowcount estimate can be very wrong, as
the planner ignores the duplicate removal and just uses the total
input size.  Here is an example:

create table t (x int);
insert into t select g % 50 from generate_series(1, 100000) g;
create table big (id int primary key, payload text);
insert into big select g, repeat('p', 20) from generate_series(1, 2000000) g;
analyze t, big;

set max_parallel_workers_per_gather = 0;

explain (analyze)
select b.id from big b
  join (select x from t union select x from t) s
  on s.x = b.id;

On master, the UNION is estimated at 200000 rows, while it actually
only has 50 rows.  As a result, the planner chooses hash-join and
seqscan all of big.

-- master

  ->  HashAggregate  (cost=4386.00..6386.00 rows=200000 width=4)
                (actual time=62.052..62.526 rows=50.00 loops=1)

explain (costs off)
select b.id from big b
  join (select x from t union select x from t) s
  on s.x = b.id;
                QUERY PLAN
-------------------------------------------
 Hash Join
   Hash Cond: (b.id = t.x)
   ->  Seq Scan on big b
   ->  Hash
         ->  HashAggregate
               Group Key: t.x
               ->  Append
                     ->  Seq Scan on t
                     ->  Seq Scan on t t_1
(9 rows)

We can improve this rowcount by estimating the output as the sum of
the per-child distinct-group counts, which build_setop_child_paths()
already computes for us.  Since

  distinct(A union B) <= distinct(A) + distinct(B)

this is a safe upper bound, and it never exceeds the old estimate, so
it only tightens the previous over-estimate.

On patched, the UNION is estimated at 100 rows; still over-estimate,
but much better than the old estimate.  And the planner ends up with
an index nested loop.

-- patched

  ->  HashAggregate  (cost=4386.00..4387.00 rows=100 width=4)
                (actual time=63.593..63.601 rows=50.00 loops=1)

explain (costs off)
select b.id from big b
  join (select x from t union select x from t) s
  on s.x = b.id;
                  QUERY PLAN
-----------------------------------------------
 Nested Loop
   ->  HashAggregate
         Group Key: t.x
         ->  Append
               ->  Seq Scan on t
               ->  Seq Scan on t t_1
   ->  Index Only Scan using big_pkey on big b
         Index Cond: (id = t.x)
(8 rows)

Running this two plans, and here are what I got:

-- on master
 Planning Time: 0.554 ms
 Execution Time: 357.520 ms

-- on patched
 Planning Time: 0.521 ms
 Execution Time: 42.313 ms

This is about 8x faster.

Thoughts?

- Richard

Attachment: v1-0001-Improve-UNION-s-output-row-count-estimate.patch
Description: Binary data

Reply via email to