Recently we encountered the following unhappy sequence of events:
1/ system running happily 2/ batch load into table begins3/ very quickly (some) preexisting queries on said table go orders of magnitude slower
4/ database instance becomes unresponsive 5/ application outageAfter looking down a few false leads, We've isolated the cause to the following:
The accumulating in-progress row changes are such that previously optimal plans are optimal no longer. Now this situation will fix itself when the next autoanalyze happens (and new plan will be chosen) - however that cannot occur until the batch load is completed and committed (approx 70 seconds). However during that time there is enough of a performance degradation for queries still using the old plan to cripple the server.
Now that we know what is happening we can work around it. But I'm wondering - is there any way (or if not should there be one) to let postgres handle this automatically? I experimented with a quick hack to src/backend/commands/analyze.c (attached) that lets another session's ANALYZE see in progress rows - which works but a) may cause other problems and b) does not help autoaanalyze which has to wait for COMMIT + stats message.
I've attached a (synthetic) test case that shows the issue, I'll reproduce the output below to hopefully make the point obvious:
Table "public.plan"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer | not null
typ | integer | not null
dat | timestamp without time zone |
val | text | not null
Indexes:
"plan_id" UNIQUE, btree (id)
"plan_dat" btree (dat)
"plan_typ" btree (typ)
[Session 1]
EXPLAIN ANALYZE
SELECT * FROM plan
WHERE typ = 3 AND dat IS NOT NULL;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Scan using plan_dat on plan (cost=0.00..265.47 rows=55
width=117) (actual time=0.130..4.409 rows=75 loops=1)
Index Cond: (dat IS NOT NULL)
Filter: (typ = 3)
Rows Removed by Filter: 5960
Total runtime: 4.440 ms
(5 rows)
[Session 2]
BEGIN;
INSERT INTO plan
SELECT id + 2000001,typ,current_date + id * '1 seconds'::interval ,val
FROM plan
;
[Session 1]
EXPLAIN ANALYZE
SELECT * FROM plan
WHERE typ = 3 AND dat IS NOT NULL;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using plan_dat on plan (cost=0.00..551.35 rows=91
width=117) (actual time=0.131..202.699 rows=75 loops=1)
Index Cond: (dat IS NOT NULL) Filter: (typ = 3) Rows Removed by Filter: 5960 Total runtime: 202.729 ms (5 rows) [Session 2] COMMIT; [Session 1...wait for autoanalyze to finish then] EXPLAIN ANALYZE SELECT * FROM plan WHERE typ = 3 AND dat IS NOT NULL; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on plan (cost=407.87..44991.95 rows=10116 width=117) (actual time=2.692..6.582 rows=75 loops=1)
Recheck Cond: (typ = 3) Filter: (dat IS NOT NULL) Rows Removed by Filter: 19925-> Bitmap Index Scan on plan_typ (cost=0.00..405.34 rows=20346 width=0) (actual time=2.573..2.573 rows=20000 loops=1)
Index Cond: (typ = 3)
Total runtime: 6.615 ms
Regards
Mark
plan.tar.gz
Description: application/gzip
*** analyze.c.orig 2013-04-26 10:40:06.634942283 +1200
--- analyze.c 2013-04-26 11:36:13.537404101 +1200
***************
*** 1173,1183 ****
* has to adjust the numbers we send to the stats
* collector to make this come out right.)
*/
! if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(targtuple.t_data)))
! {
! sample_it = true;
! liverows += 1;
! }
break;
case HEAPTUPLE_DELETE_IN_PROGRESS:
--- 1173,1181 ----
* has to adjust the numbers we send to the stats
* collector to make this come out right.)
*/
! /* Amend so insert in progress tuples are counted */
! sample_it = true;
! liverows += 1;
break;
case HEAPTUPLE_DELETE_IN_PROGRESS:
-- Sent via pgsql-performance mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
