On Tuesday, September 25, 2012 01:58:31 AM Andres Freund wrote:
> On Monday, September 24, 2012 01:37:59 PM Andres Freund wrote:
> > On Monday, September 24, 2012 01:27:54 PM Andres Freund wrote:
> > > Problem 2: undroppable indexes:
> > > 
> > > 
> > > Session 1:
> > > CREATE TABLE test_drop_concurrently(id serial primary key, data int);
> > > CREATE INDEX test_drop_concurrently_data ON
> > > test_drop_concurrently(data); BEGIN;
> > > EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data =
> > > 34343;
> > > 
> > > 
> > > 
> > > Session 2:
> > > DROP INDEX CONCURRENTLY test_drop_concurrently_data;
> > > <waiting>
> > > ^CCancel request sent
> > > ERROR:  canceling statement due to user request
> > > 
> > > 
> > > 
> > > Session 1:
> > > ROLLBACK;
> > > DROP TABLE test_drop_concurrently;
> > > SELECT indexrelid, indrelid, indexrelid::regclass, indrelid::regclass,
> > > indisvalid, indisready FROM pg_index WHERE indexrelid =
> > > 'test_drop_concurrently_data'::regclass;
> > > 
> > >  indexrelid | indrelid |         indexrelid          | indrelid |
> > > 
> > > indisvalid | indisready
> > > ------------+----------+-----------------------------+----------+------
> > > -- -- --+------------ 24703 |    24697 | test_drop_concurrently_data |
> > > 24697    | f          | f
> > > (1 row)
> > > 
> > > 
> > > 
> > > DROP INDEX test_drop_concurrently_data;
> > > ERROR:  could not open relation with OID 24697
> > > 
> > > 
> > > 
> > > Haven't looked at this one at all.
> > 
> > Thats because it has to commit transactions inbetween to make the catalog
> > changes visible. Unfortunately at that point it already deleted the
> > dependencies in deleteOneObject...
> 
> Seems to be solvable with some minor reshuffling in deleteOneObject. We can
> only perform the deletion of outgoing pg_depend records *after* we have
> dropped the object with doDeletion in the concurrent case. As the actual
> drop of the relation happens in the same transaction that will then go on
> to drop the dependency records that seems to be fine.
> I don't see any problems with that right now, will write a patch tomorrow.
> We will see if thats problematic...
Patch attached. Review appreciated, there might be consequences of moving the 
deletion of dependencies I didn't forsee.

Greetings,

Andres
-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 8891fcd59496483793aecc21a096fc0119369e73 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 25 Sep 2012 01:41:29 +0200
Subject: [PATCH] Fix concurrency issues in concurrent index drops

Previously a DROP INDEX CONCURRENTLY started with unsetting indisvalid *and*
indisready. Thats problematic if some transaction is still looking at the index
and another transction makes changes. See the example below.

Now we do the drop in three stages, just as a concurrent index build. First
unset indivalid, wait, unset indisready, wait, drop index.

Example:

Session 1:
CREATE TABLE test_drop_concurrently(id serial primary key, data int);
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1,
100000);
CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data);
BEGIN;
EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)

Session 2:
BEGIN;
SELECT * FROM test_drop_concurrently WHERE data = 34343;

Session 3:
DROP INDEX CONCURRENTLY test_drop_concurrently_data;
(in-progress)

Session 2:
INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 100000);
COMMIT;

Session 1:
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(1 row)
SET enable_bitmapscan = false;
SET enable_indexscan = false;
SELECT * FROM test_drop_concurrently WHERE data = 34343;
(2 rows)
---
 src/backend/catalog/index.c | 99 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 84 insertions(+), 15 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index a61b424..3e1794f 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1318,6 +1318,10 @@ index_drop(Oid indexId, bool concurrent)
 	 * table lock strong enough to prevent all queries on the table from
 	 * proceeding until we commit and send out a shared-cache-inval notice
 	 * that will make them update their index lists.
+	 *
+	 * In the concurrent case we make sure that nobody can be looking at the
+	 * indexes by dropping the index in multiple steps, so we don't need a full
+	 * fledged AccessExlusiveLock yet.
 	 */
 	heapId = IndexGetRelation(indexId, false);
 	if (concurrent)
@@ -1338,7 +1342,19 @@ index_drop(Oid indexId, bool concurrent)
 
 	/*
 	 * Drop Index concurrently is similar in many ways to creating an index
-	 * concurrently, so some actions are similar to DefineIndex()
+	 * concurrently, so some actions are similar to DefineIndex() just in the
+	 * reverse order.
+	 *
+	 * First we unset indisvalid so queries starting afterwards don't use the
+	 * index to answer queries anymore. We have to keep indisready = true
+	 * though so transactions that are still using the index can continue to
+	 * see valid index contents. E.g. when they are using READ COMMITTED mode,
+	 * and another transactions that started later commits makes changes and
+	 * commits, they need to see those new tuples in the index.
+	 *
+	 * After all transactions that could possibly have used it for queries
+	 * ended we can unset indisready and wait till nobody could be updating it
+	 * anymore.
 	 */
 	if (concurrent)
 	{
@@ -1357,21 +1373,14 @@ index_drop(Oid indexId, bool concurrent)
 			elog(ERROR, "cache lookup failed for index %u", indexId);
 		indexForm = (Form_pg_index) GETSTRUCT(tuple);
 
-		indexForm->indisvalid = false;	/* make unusable for queries */
-		indexForm->indisready = false;	/* make invisible to changes */
+		indexForm->indisvalid = false;	/* make unusable for new queries */
+		/* we keep indisready == true so it still gets updated */
 
 		simple_heap_update(indexRelation, &tuple->t_self, tuple);
 		CatalogUpdateIndexes(indexRelation, tuple);
 
 		heap_close(indexRelation, RowExclusiveLock);
 
-		/*
-		 * Invalidate the relcache for the table, so that after this
-		 * transaction we will refresh the index list. Forgetting just the
-		 * index is not enough.
-		 */
-		CacheInvalidateRelcache(userHeapRelation);
-
 		/* save lockrelid and locktag for below, then close but keep locks */
 		heaprelid = userHeapRelation->rd_lockInfo.lockRelId;
 		SET_LOCKTAG_RELATION(heaplocktag, heaprelid.dbId, heaprelid.relId);
@@ -1383,8 +1392,8 @@ index_drop(Oid indexId, bool concurrent)
 		/*
 		 * For a concurrent drop, it's important to make the catalog entries
 		 * visible to other transactions before we drop the index. The index
-		 * will be marked not indisvalid, so that no one else tries to either
-		 * insert into it or use it for queries.
+		 * will be marked not indisvalid, so that no one else tries to use it
+		 * for queries.
 		 *
 		 * We must commit our current transaction so that the index update
 		 * becomes visible; then start another.  Note that all the data
@@ -1399,6 +1408,7 @@ index_drop(Oid indexId, bool concurrent)
 		LockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
 		LockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock);
 
+		/* XXX: Why do we have to do that here, who pushed which snap? */
 		PopActiveSnapshot();
 		CommitTransactionCommand();
 		StartTransactionCommand();
@@ -1431,11 +1441,70 @@ index_drop(Oid indexId, bool concurrent)
 		}
 
 		/*
+		 * now we are sure that nobody uses the index for queries, they just
+		 * might have it opened for updating it. So now we can unset
+		 * ->indisready and wait till nobody could update the index anymore.
+		 */
+		indexRelation = heap_open(IndexRelationId, RowExclusiveLock);
+
+		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
+		userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
+
+		tuple = SearchSysCacheCopy1(INDEXRELID,
+									ObjectIdGetDatum(indexId));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for index %u", indexId);
+		indexForm = (Form_pg_index) GETSTRUCT(tuple);
+
+		/* we hold a lock, but make sure we haven't screwed anything up */
+		Assert(indexForm->indisvalid == false);
+		indexForm->indisready = false;	/* don't update index anymore */
+
+		simple_heap_update(indexRelation, &tuple->t_self, tuple);
+		CatalogUpdateIndexes(indexRelation, tuple);
+
+		heap_close(indexRelation, RowExclusiveLock);
+
+		/*
+		 * we close the relations again, were still holding the session lock
+		 * from above though!
+		 */
+		heap_close(userHeapRelation, NoLock);
+		index_close(userIndexRelation, NoLock);
+
+		/*
+		 * Invalidate the relcache for the table, so that after this
+		 * transaction we will refresh the index list. Forgetting just the
+		 * index is not enough.
+		 */
+		CacheInvalidateRelcache(userHeapRelation);
+
+		/*
+		 * just as with indisvalid = false we need to make sure indisready =
+		 * false is visible for everyone.
+		 */
+		CommitTransactionCommand();
+		StartTransactionCommand();
+
+		/*
+		 * Wait till everyone that saw indisready = true finished so we can
+		 * finally really remove the index. The logic here is the same as
+		 * above.
+		 */
+		old_lockholders = GetLockConflicts(&heaplocktag, AccessExclusiveLock);
+
+		while (VirtualTransactionIdIsValid(*old_lockholders))
+		{
+			VirtualXactLock(*old_lockholders, true);
+			old_lockholders++;
+		}
+
+		/*
 		 * Re-open relations to allow us to complete our actions.
 		 *
-		 * At this point, nothing should be accessing the index, but lets
-		 * leave nothing to chance and grab AccessExclusiveLock on the index
-		 * before the physical deletion.
+		 * At this point, nothing should be accessing the index, but lets leave
+		 * nothing to chance and grab AccessExclusiveLock on the index before
+		 * the physical deletion.
 		 */
 		userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
 		userIndexRelation = index_open(indexId, AccessExclusiveLock);
-- 
1.7.12.289.g0ce9864.dirty

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to