[HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.
Hello, we found (and maybe fixed) two wrong behavior related to inheritance and FOR UPDATE. This report is about one of them. This behavior could be observed on master and back to 8.4 but 8.4 behaves a bit more funny. I haven't checked on 8.3. This issue is that some query returns duplicate rows after FOR UPDATE was blocked, in other words, after getting HeapTupleUpdated in ExecLockRows. - Reproducing the symptom This behavior is easy to reproduce as following. A=# DROP TABLE IF EXISTS p CASCADE; A=# CREATE TABLE p (a int, b int, c int); A=# CREATE TABLE c1 (LIKE p INCLUDING INDEXES) INHERITS (p); A=# CREATE TABLE c2 (LIKE p INCLUDING INDEXES) INHERITS (p); A=# CREATE TABLE c3 (LIKE p INCLUDING INDEXES) INHERITS (p); A=# INSERT INTO c1 (SELECT 0, a / 3, a % 3 FROM generate_series(0, 9) a); A=# INSERT INTO c2 (SELECT 1, a / 3, a % 3 FROM generate_series(0, 9) a); A=# INSERT INTO c3 (SELECT 2, a / 3, a % 3 FROM generate_series(0, 9) a); A=# ANALYZE; A=# BEGIN; A=# SELECT tableoid, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE; A=# UPDATE p SET b = -1 WHERE a = 1 AND b = 1 AND c = 0; B=# SELECT tableoid, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE; -- the above is blocked by session A until it commits. A=# COMMIT; Where A and B are individual sessions. You will get the follwing result on sesison B just after the COMMIT on session A. | tableoid | ctid | a | b | c | --+---+---+---+--- * 33834 | (0,1) | 0 | 0 | 0 | 33834 | (0,4) | 0 | 1 | 0 | 33838 | (0,1) | 1 | 0 | 0 * 33834 | (0,1) | 0 | 0 | 0 | 33842 | (0,1) | 2 | 0 | 0 | 33842 | (0,4) | 2 | 1 | 0 | (6 rows) The lines prefixed with '*' appear twice in the result. The plan for the query is as following, LockRows - Result - Append - Seq Scan on p Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0)) - Seq Scan on c1 Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0)) - Seq Scan on c2 Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0)) - Seq Scan on c3 Filter: ((b = ANY ('{0,1}'::integer[])) AND (c = 0)) - Analysys and solution This seems to be caused in ExecScanFetch under EvalPlanQualNext, when es_epqTuleSet[scanrelid - 1] is false (p and c1 in the above example). In that case, execution confluents onto non-EPQ route. As the result, the accessMtd (= SeqScan, IndexScan will behaves as the same) retunes null slot at the first iteration on p. Then the second iteration on c1, it reuturns the c1's first tuple (33834:(0,1)) and EvalPlanQualNext is satisfied with the wrong tuple. In the EPQ block in ExecScanFetch, it seems should return NULL if the relation does not has test tuple. The attached patch does so on the current master and the problem has disappeared. regards, -- Kyotaro Horiguchi NTT Open Source Software Center From 79c201539abdc59400af8339795c6c63c2980278 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Thu, 11 Dec 2014 09:06:59 +0900 Subject: [PATCH] Fix duplicate tuples for inheritance and FOR UPDATE --- src/backend/executor/execScan.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c index 1319519..d5c491c 100644 --- a/src/backend/executor/execScan.c +++ b/src/backend/executor/execScan.c @@ -74,6 +74,11 @@ ExecScanFetch(ScanState *node, return slot; } + else + { + /* This relation has no tuple to recheck.*/ + return NULL; + } } /* -- 2.1.0.GIT -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: This issue is that some query returns duplicate rows after FOR UPDATE was blocked, in other words, after getting HeapTupleUpdated in ExecLockRows. Good catch! In the EPQ block in ExecScanFetch, it seems should return NULL if the relation does not has test tuple. The attached patch does so on the current master and the problem has disappeared. ... but bad fix. This would break join cases, wherein it's important that other scan nodes still return all the required tuples. (It's unfortunate that the eval-plan-qual isolation test fails to cover joins :-(.) After digging around a bit, I realized that the problem is actually in nodeLockRows.c. It is supposed to do EvalPlanQualSetTuple(..., NULL) for each child table that's not supposed to return any row during the current EPQ test cycle. Unfortunately, it only does that reliably once the EPQ environment is already set up. If we discover we need an EPQ test while looking at a non-first child table, tables already passed over in the loop over node-lr_arowMarks don't get the word. So the correct fix (or a correct fix, anyway; this could also have been done with more-invasive loop logic changes) is as attached. I'm working on back-patching this. regards, tom lane diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c index 990240b..5dcb9b0 100644 *** a/src/backend/executor/nodeLockRows.c --- b/src/backend/executor/nodeLockRows.c *** lnext: *** 194,200 --- 194,222 */ if (!epq_started) { + ListCell *lc2; + EvalPlanQualBegin(node-lr_epqstate, estate); + + /* + * Ensure that rels with already-visited rowmarks are told + * not to return tuples during the first EPQ test. We can + * exit this loop once it reaches the current rowmark; + * rels appearing later in the list will be set up + * correctly by the EvalPlanQualSetTuple call at the top + * of the loop. + */ + foreach(lc2, node-lr_arowMarks) + { + ExecAuxRowMark *aerm2 = (ExecAuxRowMark *) lfirst(lc2); + + if (lc2 == lc) + break; + EvalPlanQualSetTuple(node-lr_epqstate, + aerm2-rowmark-rti, + NULL); + } + epq_started = true; } diff --git a/src/test/isolation/expected/eval-plan-qual.out b/src/test/isolation/expected/eval-plan-qual.out index ab778cb..0f6595f 100644 *** a/src/test/isolation/expected/eval-plan-qual.out --- b/src/test/isolation/expected/eval-plan-qual.out *** accountid balance *** 49,51 --- 49,74 checking 600 savings2334 + + starting permutation: readp1 writep1 readp2 c1 c2 + step readp1: SELECT tableoid::regclass, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE; + tableoid ctid a b c + + c1 (0,1) 0 0 0 + c1 (0,4) 0 1 0 + c2 (0,1) 1 0 0 + c2 (0,4) 1 1 0 + c3 (0,1) 2 0 0 + c3 (0,4) 2 1 0 + step writep1: UPDATE p SET b = -1 WHERE a = 1 AND b = 1 AND c = 0; + step readp2: SELECT tableoid::regclass, ctid, * FROM p WHERE b IN (0, 1) AND c = 0 FOR UPDATE; waiting ... + step c1: COMMIT; + step readp2: ... completed + tableoid ctid a b c + + c1 (0,1) 0 0 0 + c1 (0,4) 0 1 0 + c2 (0,1) 1 0 0 + c3 (0,1) 2 0 0 + c3 (0,4) 2 1 0 + step c2: COMMIT; diff --git a/src/test/isolation/specs/eval-plan-qual.spec b/src/test/isolation/specs/eval-plan-qual.spec index 786b791..876e547 100644 *** a/src/test/isolation/specs/eval-plan-qual.spec --- b/src/test/isolation/specs/eval-plan-qual.spec *** setup *** 8,18 --- 8,27 { CREATE TABLE accounts (accountid text PRIMARY KEY, balance numeric not null); INSERT INTO accounts VALUES ('checking', 600), ('savings', 600); + + CREATE TABLE p (a int, b int, c int); + CREATE TABLE c1 () INHERITS (p); + CREATE TABLE c2 () INHERITS (p); + CREATE TABLE c3 () INHERITS (p); + INSERT INTO c1 SELECT 0, a / 3, a % 3 FROM generate_series(0, 9) a; + INSERT INTO c2 SELECT 1, a / 3, a % 3 FROM generate_series(0, 9) a; + INSERT INTO c3 SELECT 2, a / 3, a % 3 FROM generate_series(0, 9) a; }
Re: [HACKERS] [Bug] Duplicate results for inheritance and FOR UPDATE.
Hello, This issue is that some query returns duplicate rows after FOR UPDATE was blocked, in other words, after getting HeapTupleUpdated in ExecLockRows. Good catch! Thank you. In the EPQ block in ExecScanFetch, it seems should return NULL if the relation does not has test tuple. The attached patch does so on the current master and the problem has disappeared. ... but bad fix. This would break join cases, wherein it's important that other scan nodes still return all the required tuples. (It's unfortunate that the eval-plan-qual isolation test fails to cover joins :-(.) Hmm. I regretfully forgot to do isolation check. The confluent didn't seem a bug but I couldn't see its intention. After digging around a bit, I realized that the problem is actually in nodeLockRows.c. It is supposed to do EvalPlanQualSetTuple(..., NULL) for each child table that's not supposed to return any row during the current EPQ test cycle. Unfortunately, it only does that reliably once the EPQ environment is already set up. If we discover we need an EPQ test while looking at a non-first child table, tables already passed over in the loop over node-lr_arowMarks don't get the word. Thank you for the detailed explanation. I was wondering during investigation about EvalPlanQualSetTuple(, NULL) was called only for children after the first one that had EPQ test tuple. Now I understand that it was the core of this bug. So the correct fix (or a correct fix, anyway; this could also have been done with more-invasive loop logic changes) is as attached. I'm working on back-patching this. That really helps. Thank you. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers