Hi hackers,
While studying a regression reported[1] against my parallel hash join
patch, I noticed that we can also reach a good and a bad plan in
unpatched master. One of the causes seems to be the estimated
selectivity of a semi-join with an extra <> filter qual.
Here are some times I measured for TPCH Q21 at scale 10 and work_mem
of 1GB. That is a query with a large anti-join and a large semi-join.
8 workers = 8.3s
7 workers = 8.2s
6 workers = 8.5s
5 workers = 8.9s
4 workers = 9.5s
3 workers = 39.7s
2 workers = 36.9s
1 worker = 38.2s
0 workers = 47.9s
Please see the attached query plans showing the change in plan from
Hash Semi Join to Nested Loop Semi Join that happens only once we
reach 4 workers and the (partial) base relation size becomes smaller.
The interesting thing is that row estimate for the semi-join and
anti-join come out as 1 (I think this is 0 clamped to 1).
The same thing can be seen with a simple semi-join, if you happen to
have TPCH loaded. Compare these two queries:
SELECT *
FROM lineitem l1
WHERE EXISTS (SELECT *
FROM lineitem l2
WHERE l1.l_orderkey = l2.l_orderkey);
-> estimates 59986012 rows, actual rows 59,986,052 (scale 10 TPCH)
SELECT *
FROM lineitem l1
WHERE EXISTS (SELECT *
FROM lineitem l2
WHERE l1.l_orderkey = l2.l_orderkey
AND l1.l_suppkey <> l2.l_suppkey);
-> estimates 1 row, actual rows 57,842,090 (scale 10 TPCH)
Or for a standalone example:
CREATE TABLE foo AS
SELECT (generate_series(1, 1000000) / 4)::int AS a,
(generate_series(1, 1000000) % 100)::int AS b;
ANALYZE foo;
SELECT *
FROM foo f1
WHERE EXISTS (SELECT *
FROM foo f2
WHERE f1.a = f2.a);
-> estimates 1,000,000 rows
SELECT *
FROM foo f1
WHERE EXISTS (SELECT *
FROM foo f2
WHERE f1.a = f2.a
AND f1.b <> f2.b);
-> estimates 1 row
I'm trying to wrap my brain around the selectivity code, but am too
green to grok how this part of the planner that I haven't previously
focused on works so far, and I'd like to understand whether this is
expected behaviour so that I can figure out how to tackle the reported
regression with my patch. What is happening here?
Thanks for reading.
[1]
https://www.postgresql.org/message-id/CAEepm%3D3Og-7-b3WOkiT%3Dc%2B6y3eZ0VVSyb1K%2BSOvF17BO5KAt0A%40mail.gmail.com
--
Thomas Munro
http://www.enterprisedb.com
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4642943.81..4642943.82 rows=1 width=34) (actual
time=39806.068..39806.080 rows=100 loops=1)
-> Sort (cost=4642943.81..4642943.82 rows=1 width=34) (actual
time=39806.064..39806.070 rows=100 loops=1)
Sort Key: (count(*)) DESC, supplier.s_name
Sort Method: top-N heapsort Memory: 38kB
-> GroupAggregate (cost=4642943.78..4642943.80 rows=1 width=34)
(actual time=39796.916..39804.615 rows=3945 loops=1)
Group Key: supplier.s_name
-> Sort (cost=4642943.78..4642943.79 rows=1 width=26) (actual
time=39796.906..39800.147 rows=38905 loops=1)
Sort Key: supplier.s_name
Sort Method: quicksort Memory: 4576kB
-> Nested Loop Anti Join (cost=2861152.85..4642943.77
rows=1 width=26) (actual time=19851.808..39565.632 rows=38905 loops=1)
-> Nested Loop (cost=2861152.29..4642900.65 rows=1
width=42) (actual time=19850.550..35747.936 rows=696628 loops=1)
-> Gather (cost=2861151.85..4642893.24
rows=1 width=50) (actual time=19850.323..29654.121 rows=1441593 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Hash Semi Join
(cost=2860151.85..4641893.14 rows=1 width=50) (actual time=21036.398..30323.561
rows=360398 loops=4)
Hash Cond: (l1.l_orderkey =
l2.l_orderkey)
Join Filter: (l2.l_suppkey <>
l1.l_suppkey)
Rows Removed by Join Filter: 93610
-> Hash Join
(cost=2585.58..1486212.61 rows=258004 width=42) (actual time=17.681..3985.606
rows=373726 loops=4)
Hash Cond: (l1.l_suppkey =
supplier.s_suppkey)
-> Parallel Seq Scan on
lineitem l1 (cost=0.00..1456859.08 rows=6450109 width=16) (actual
time=0.031..2986.281 rows=9482337 loops=4)
Filter: (l_receiptdate
> l_commitdate)
Rows Removed by
Filter: 5514176
-> Hash
(cost=2535.58..2535.58 rows=4000 width=30) (actual time=17.410..17.410
rows=3945 loops=4)
Buckets: 4096
Batches: 1 Memory Usage: 279kB
-> Nested Loop
(cost=79.29..2535.58 rows=4000 width=30) (actual time=1.872..15.971 rows=3945
loops=4)
-> Seq Scan on
nation (cost=0.00..1.31 rows=1 width=4) (actual time=0.026..0.034 rows=1
loops=4)
Filter:
(n_name = 'ETHIOPIA'::bpchar)
Rows
Removed by Filter: 24
-> Bitmap Heap
Scan on supplier (cost=79.29..2494.27 rows=4000 width=38) (actual
time=1.838..14.950 rows=3945 loops=4)
Recheck
Cond: (s_nationkey = nation.n_nationkey)
Heap
Blocks: exact=1898
-> Bitmap
Index Scan on idx_supplier_nation_key (cost=0.00..78.29 rows=4000 width=0)
(actual time=1.309..1.309 rows=3945 loops=4)
Index Cond: (s_nationkey = nation.n_nationkey)
-> Hash
(cost=1814840.12..1814840.12 rows=59986012 width=16) (actual
time=20846.345..20846.345 rows=59986052 loops=4)
Buckets: 33554432 Batches:
4 Memory Usage: 965240kB
-> Seq Scan on lineitem l2
(cost=0.00..1814840.12 rows=59986012 width=16) (actual time=0.046..11007.730
rows=59986052 loops=4)
-> Index Scan using orders_pkey on orders
(cost=0.43..7.40 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=1441593)
Index Cond: (o_orderkey = l1.l_orderkey)
Filter: (o_orderstatus = 'F'::bpchar)
Rows Removed by Filter: 1
-> Index Scan using idx_lineitem_orderkey on
lineitem l3 (cost=0.56..21.58 rows=53 width=16) (actual time=0.005..0.005
rows=1 loops=696628)
Index Cond: (l_orderkey = l1.l_orderkey)
Filter: ((l_receiptdate > l_commitdate) AND
(l_suppkey <> l1.l_suppkey))
Rows Removed by Filter: 1
Planning time: 4.970 ms
Execution time: 39809.977 ms
(47 rows)
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4216591.95..4216591.95 rows=1 width=34) (actual
time=9489.945..9489.957 rows=100 loops=1)
-> Sort (cost=4216591.95..4216591.95 rows=1 width=34) (actual
time=9489.943..9489.949 rows=100 loops=1)
Sort Key: (count(*)) DESC, supplier.s_name
Sort Method: top-N heapsort Memory: 38kB
-> GroupAggregate (cost=4216591.92..4216591.94 rows=1 width=34)
(actual time=9480.290..9488.403 rows=3945 loops=1)
Group Key: supplier.s_name
-> Sort (cost=4216591.92..4216591.92 rows=1 width=26) (actual
time=9480.278..9483.654 rows=38905 loops=1)
Sort Key: supplier.s_name
Sort Method: quicksort Memory: 4576kB
-> Nested Loop Anti Join (cost=559157.24..4216591.91
rows=1 width=26) (actual time=4154.990..9198.379 rows=38905 loops=1)
-> Gather (cost=559156.67..4216548.78 rows=1
width=42) (actual time=4153.557..7185.268 rows=696628 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Nested Loop Semi Join
(cost=558156.67..4215548.68 rows=1 width=42) (actual time=4391.022..8656.693
rows=139326 loops=5)
Join Filter: (l2.l_suppkey <>
l1.l_suppkey)
Rows Removed by Join Filter: 36246
-> Hash Join
(cost=558156.11..1984566.52 rows=97950 width=46) (actual
time=4390.978..7970.236 rows=144548 loops=5)
Hash Cond: (l1.l_orderkey =
orders.o_orderkey)
-> Hash Join
(cost=2585.58..1425767.03 rows=199953 width=42) (actual time=20.784..3405.518
rows=298981 loops=5)
Hash Cond: (l1.l_suppkey =
supplier.s_suppkey)
-> Parallel Seq Scan on
lineitem l1 (cost=0.00..1402436.29 rows=4998834 width=16) (actual
time=0.040..2570.422 rows=7585870 loops=5)
Filter: (l_receiptdate
> l_commitdate)
Rows Removed by
Filter: 4411341
-> Hash
(cost=2535.58..2535.58 rows=4000 width=30) (actual time=20.458..20.458
rows=3945 loops=5)
Buckets: 4096
Batches: 1 Memory Usage: 279kB
-> Nested Loop
(cost=79.29..2535.58 rows=4000 width=30) (actual time=1.921..18.996 rows=3945
loops=5)
-> Seq Scan on
nation (cost=0.00..1.31 rows=1 width=4) (actual time=0.033..0.041 rows=1
loops=5)
Filter:
(n_name = 'ETHIOPIA'::bpchar)
Rows
Removed by Filter: 24
-> Bitmap Heap
Scan on supplier (cost=79.29..2494.27 rows=4000 width=38) (actual
time=1.881..17.968 rows=3945 loops=5)
Recheck
Cond: (s_nationkey = nation.n_nationkey)
Heap
Blocks: exact=1898
-> Bitmap
Index Scan on idx_supplier_nation_key (cost=0.00..78.29 rows=4000 width=0)
(actual time=1.363..1.363 rows=3945 loops=5)
Index Cond: (s_nationkey = nation.n_nationkey)
-> Hash
(cost=463721.01..463721.01 rows=7347961 width=4) (actual
time=4328.438..4328.438 rows=7309184 loops=5)
Buckets: 8388608 Batches: 1
Memory Usage: 322500kB
-> Seq Scan on orders
(cost=0.00..463721.01 rows=7347961 width=4) (actual time=0.040..2856.374
rows=7309184 loops=5)
Filter: (o_orderstatus
= 'F'::bpchar)
Rows Removed by
Filter: 7690816
-> Index Scan using
idx_lineitem_orderkey on lineitem l2 (cost=0.56..20.79 rows=159 width=16)
(actual time=0.004..0.004 rows=1 loops=722740)
Index Cond: (l_orderkey =
orders.o_orderkey)
-> Index Scan using idx_lineitem_orderkey on
lineitem l3 (cost=0.56..21.58 rows=53 width=16) (actual time=0.003..0.003
rows=1 loops=696628)
Index Cond: (l_orderkey = l1.l_orderkey)
Filter: ((l_receiptdate > l_commitdate) AND
(l_suppkey <> l1.l_suppkey))
Rows Removed by Filter: 1
Planning time: 4.495 ms
Execution time: 9493.733 ms
(47 rows)
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers