Hello all. I'm migrating a database from PG 9.6 to 10.3 and have noticed a particular query that performs very badly compared to its 9.6 counterpart.
The plans on 9.6 and 10.3 are effectively identical, except that in 9.6 the
planner decides to use an index only scan on the primary key and in 10.3 it
does a sequential scan. The problem is that the sequential scan is on a table
of 75M rows and 25 columns, so it's quite a lot of pages it has to traverse.
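For reference, a check along these lines could show whether the 10.3 database has the same indexes and column types on the tables involved as 9.6 does. It is only a sketch: pg_indexes and information_schema.columns are standard catalog views, but I am assuming that the index the 9.6 plan uses (uq_household_id) is one that should also exist on 10.3.

```sql
-- List every index defined on build.household.
-- Run on both the 9.6 and the 10.3 server and compare the output.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'build'
  AND tablename = 'household';

-- Compare the declared types of the household_id columns.
-- Both plans show a ::text cast applied to m.household_id.
SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE column_name = 'household_id'
  AND table_name IN ('household', 'match', 'property');
```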
This is the query:
explain verbose
WITH removed AS (
    DELETE FROM match m
    WHERE
        NOT EXISTS (
            SELECT 1
            FROM build.household h -- This is the table that has 70M rows and
                                   -- does a full table scan in 10.3
            WHERE h.household_id = m.household_id
        ) OR (
            m.property_id IS NOT NULL AND
            NOT EXISTS (
                SELECT 1
                FROM build.property p
                WHERE p.household_id = m.household_id
                  AND p.property_id = m.property_id
            )
        )
    RETURNING *
)
INSERT INTO orphaned_matches (household_id, account_id, candidate_id,
    matched_at, full_name, first_name, last_name, match_reason, property_id,
    owner_id)
SELECT
    removed.household_id,
    removed.account_id,
    removed.candidate_id,
    removed.created_at,
    removed.full_name,
    removed.first_name,
    removed.last_name,
    removed.match_reason,
    removed.property_id,
    removed.owner_id
FROM removed;
What's worse is that in 10.3 the number of rows in "match" is actually much
smaller than in 9.6, because I am running this query against a partitioned
table (table name "match") with a reduced data set.
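To see whether the 10.3 planner is able to use an index on build.household at all, a session-local test like the following might help. This is a sketch under assumptions: ANALYZE and enable_seqscan are standard PostgreSQL, but the literal 'some-household-id' is a hypothetical placeholder, not a real value from my data.

```sql
-- Refresh planner statistics in case they are stale or missing after the migration.
ANALYZE build.household;

BEGIN;
-- Discourage sequential scans for this transaction only.
SET LOCAL enable_seqscan = off;

-- Same lookup shape as SubPlan 1, with a constant in place of m.household_id.
EXPLAIN
SELECT 1
FROM build.household h
WHERE h.household_id = 'some-household-id';  -- hypothetical placeholder value
ROLLBACK;
```

If the plan still shows a Seq Scan with enable_seqscan off, that would suggest there is no index on 10.3 that is usable for this predicate.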
Query plans for both are attached, plus the query.
thanks
--Cory
Plan on 9.6:

Insert on public.orphaned_matches  (cost=204030825.83..204247350.03 rows=8660968 width=264)
  CTE removed
    ->  Delete on public.match m  (cost=0.00..204030825.83 rows=8660968 width=6)
          Output: m.id, m.created_at, m.modified_at, m.household_id, m.property_id, m.match_reason, m.full_name, m.first_name, m.middle_name, m.last_name, m.account_id, m.candidate_id, m.match_category, m.confidence, m.owner_id, m.match_resource
          ->  Seq Scan on public.match m  (cost=0.00..204030825.83 rows=8660968 width=6)
                Output: m.ctid
                Filter: ((NOT (SubPlan 1)) OR ((m.property_id IS NOT NULL) AND (NOT (SubPlan 2))))
                SubPlan 1
                  ->  Index Only Scan using uq_household_id on build.household h  (cost=0.57..8.59 rows=1 width=0)
                        Index Cond: (h.household_id = (m.household_id)::text)
                SubPlan 2
                  ->  Index Scan using property_property_id_idx on build.property p  (cost=0.57..8.59 rows=1 width=0)
                        Index Cond: (p.property_id = m.property_id)
                        Filter: (p.household_id = (m.household_id)::text)
  ->  CTE Scan on removed  (cost=0.00..216524.20 rows=8660968 width=264)
        Output: nextval('orphaned_matches_id_seq'::regclass), now(), removed.household_id, removed.account_id, removed.candidate_id, removed.created_at, removed.full_name, removed.first_name, removed.last_name, removed.match_reason, removed.property_id, removed.owner_id
(16 rows)

Plan on 10.3:

Insert on match.orphaned_matches  (cost=1823761513653.51..1823761525043.91 rows=455616 width=380)
  CTE removed
    ->  Delete on match m  (cost=0.00..1823761513653.51 rows=455616 width=6)
          Output: m.id, m.created_at, m.modified_at, m.household_id, m.account_id, m.candidate_id, m.match_reason, m.property_id, m.full_name, m.first_name, m.middle_name, m.last_name, m.match_category, m.confidence, m.owner_id, m.match_resource
          ->  Seq Scan on match m  (cost=0.00..1823761513653.51 rows=455616 width=6)
                Output: m.ctid
                Filter: ((NOT (SubPlan 1)) OR ((m.property_id IS NOT NULL) AND (NOT (SubPlan 2))))
                SubPlan 1
                  ->  Seq Scan on build.household h  (cost=0.00..2948996.80 rows=1 width=0)
                        Filter: (h.household_id = (m.household_id)::text)
                SubPlan 2
                  ->  Index Scan using uq_idx_property_id on build.property p  (cost=0.57..2.59 rows=1 width=0)
                        Index Cond: (p.property_id = m.property_id)
                        Filter: (p.household_id = (m.household_id)::text)
  ->  CTE Scan on removed  (cost=0.00..11390.40 rows=455616 width=380)
        Output: nextval('orphaned_matches_id_seq'::regclass), now(), removed.household_id, removed.account_id, removed.candidate_id, removed.created_at, removed.full_name, removed.first_name, removed.last_name, removed.match_reason, removed.property_id, removed.owner_id
(16 rows)
