Pavan Deolasee wrote:
> Looking at the history and some past discussions, it seems Tomas reported
> somewhat similar problem and Andres proposed a patch here
> https://www.postgresql.org/message-id/[email protected]
> which got committed via b23b0f5588d2e2. Not exactly the same issue, but
> related to relcache flush happening in index_open().
>
> I wonder if we should just add a recheck logic
> in RelationGetIndexAttrBitmap() such that if rd_indexvalid is seen as 0 at
> the end, we go back and start again from RelationGetIndexList(). Or is
> there any code path that relies on finding the old information?
Good find on that old report. It occured to me to re-run Tomas' test
case with this second patch you proposed (the "ddl" test takes 11
minutes in my laptop), and lo and behold -- your "recheck" code enters
an infinite loop. Not good. I tried some more tricks that came to
mind, but I didn't get any good results. Pavan and I discussed it
further and he came up with the idea of returning the bitmapset we
compute, but if an invalidation occurs in index_open, simply do not
cache the bitmapsets so that next time this is called they are all
computed again. Patch attached.
This appears to have appeased both test_decoding's ddl test under
CACHE_CLOBBER_ALWAYS, as well as the CREATE INDEX CONCURRENTLY bug.
I intend to commit this soon to all branches, to ensure it gets into the
next set of minors.
Here is a detailed reproducer for the CREATE INDEX CONCURRENTLY bug.
Line numbers are as of today's master; comments are supposed to help
locating good spots in other branches. Given the use of gdb
breakpoints, there's no good way to integrate this with regression
tests, which is a pity because this stuff looks a little brittle to me,
and if it breaks it is hard to detect.
Prep:
DROP TABLE IF EXISTS testtab;
CREATE TABLE testtab (a int unique, b int, c int);
INSERT INTO testtab VALUES (1, 100, 10);
CREATE INDEX testindx ON testtab (b);
S1:
SELECT * FROM testtab;
UPDATE testtab SET b = b + 1 WHERE a = 1;
SELECT pg_backend_pid();
attach GDB to this session.
break relcache.c:4800
# right before entering the per-index loop
cont
S2:
SELECT pg_backend_pid();
DROP INDEX testindx;
attach GDB to this session
handle SIGUSR1 nostop
break indexcmds.c:666
# right after index_create
break indexcmds.c:790
# right before the second CommitTransactionCommand
cont
CREATE INDEX CONCURRENTLY testindx ON testtab (b);
-- this blocks in breakpoint 1. Leave it there.
S1:
UPDATE testtab SET b = b + 1 WHERE a = 1;
-- this blocks in breakpoint 1. Leave it there.
S2:
gdb -> cont
-- This blocks waiting for S1
S1:
gdb -> cont
-- S1 finishes the update. S2 awakens and blocks again in breakpoint 2.
S1:
-- Now run more UPDATEs:
UPDATE testtab SET b = b + 1 WHERE a = 1;
This produces a broken HOT chain.
This can be done several times; all these updates should be non-HOT
because of the index created by CIC, but they are marked HOT.
S2: "cont"
From this point onwards, update are correct.
SELECT * FROM testtab;
SET enable_seqscan TO 0;
SET enable_bitmapscan TO 0;
SELECT * FROM testtab WHERE b between 100 and 110;
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 7054c21883154f67058d42180606e0c5428d2745[m
Author: Alvaro Herrera <[email protected]>
AuthorDate: Fri Feb 3 15:40:37 2017 -0300
CommitDate: Fri Feb 3 15:41:59 2017 -0300
Fix CREATE INDEX CONCURRENTLY relcache bug
diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index 8a7c560..003ccfa 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -4745,9 +4745,11 @@ RelationGetIndexPredicate(Relation relation)
* Attribute numbers are offset by FirstLowInvalidHeapAttributeNumber so that
* we can include system attributes (e.g., OID) in the bitmap representation.
*
- * Caller had better hold at least RowExclusiveLock on the target relation
- * to ensure that it has a stable set of indexes. This also makes it safe
- * (deadlock-free) for us to take locks on the relation's indexes.
+ * While all callers should at least RowExclusiveLock on the target relation,
+ * we still can't guarantee a stable set of indexes because CREATE INDEX
+ * CONCURRENTLY and DROP INDEX CONCURRENTLY can change set of indexes, without
+ * taking any conflicting lock. So we must be prepared to handle changes in the
+ * set of indexes and recompute bitmaps, when necessary.
*
* The returned result is palloc'd in the caller's memory context and should
* be bms_free'd when not needed anymore.
@@ -4763,7 +4765,6 @@ RelationGetIndexAttrBitmap(Relation relation,
IndexAttrBitmapKind attrKind)
Oid relpkindex;
Oid relreplindex;
ListCell *l;
- MemoryContext oldcxt;
/* Quick exit if we already computed the result. */
if (relation->rd_indexattr != NULL)
@@ -4807,6 +4808,8 @@ RelationGetIndexAttrBitmap(Relation relation,
IndexAttrBitmapKind attrKind)
relpkindex = relation->rd_pkindex;
relreplindex = relation->rd_replidindex;
+ Assert(relation->rd_indexvalid != 0);
+
/*
* For each index, add referenced attributes to indexattrs.
*
@@ -4893,18 +4896,23 @@ RelationGetIndexAttrBitmap(Relation relation,
IndexAttrBitmapKind attrKind)
relation->rd_idattr = NULL;
/*
- * Now save copies of the bitmaps in the relcache entry. We
intentionally
- * set rd_indexattr last, because that's the one that signals validity
of
- * the values; if we run out of memory before making that copy, we won't
- * leave the relcache entry looking like the other ones are valid but
- * empty.
+ * Now save copies of the bitmaps in the relcache entry, but only if no
+ * invalidation occured in the meantime. We intentionally set
rd_indexattr
+ * last, because that's the one that signals validity of the values; if
we
+ * run out of memory before making that copy, we won't leave the
relcache
+ * entry looking like the other ones are valid but empty.
*/
- oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
- relation->rd_keyattr = bms_copy(uindexattrs);
- relation->rd_pkattr = bms_copy(pkindexattrs);
- relation->rd_idattr = bms_copy(idindexattrs);
- relation->rd_indexattr = bms_copy(indexattrs);
- MemoryContextSwitchTo(oldcxt);
+ if (relation->rd_indexvalid != 0)
+ {
+ MemoryContext oldcxt;
+
+ oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
+ relation->rd_keyattr = bms_copy(uindexattrs);
+ relation->rd_pkattr = bms_copy(pkindexattrs);
+ relation->rd_idattr = bms_copy(idindexattrs);
+ relation->rd_indexattr = bms_copy(indexattrs);
+ MemoryContextSwitchTo(oldcxt);
+ }
/* We return our original working copy for caller to play with */
switch (attrKind)
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers