On Tue, Nov 11, 2014 at 3:24 AM, Jeff Janes <[email protected]> wrote:
>
> On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <[email protected]>
> wrote:
>>
>> On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
>> <[email protected]> wrote:
>> > Updated patch is attached.
>> Please find attached an updated patch with the following things changed:
>> - Addition of tab completion in psql for all new commands
>> - Addition of a call to WaitForLockers in index_concurrent_swap to
>> ensure that there are no running transactions on the parent table
>> running before exclusive locks are taken on the index and its
>> concurrent entry. Previous patch versions created deadlocks because of
>> that, issue spotted by the isolation tests integrated in the patch.
>> - Isolation tests for reindex concurrently are re-enabled by default.
>> Regards,
>
>
> It looks like this needs another rebase, I get failures on index.c,
> toasting.c, indexcmds.c, and index.h
Indeed. There are some conflicts created by the recent modification of
index_create. Here is a rebased patch.
--
Michael
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR: could not serialize access due to read/write dependencies among transact
<para>
Acquired by <command>VACUUM</command> (without <option>FULL</option>),
- <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
+ <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
+ <command>REINDEX CONCURRENTLY</>,
<command>ALTER TABLE VALIDATE</command> and other
<command>ALTER TABLE</command> variants (for full details see
<xref linkend="SQL-ALTERTABLE">).
@@ -1143,7 +1144,7 @@ ERROR: could not serialize access due to read/write dependencies among transact
<sect2 id="locking-pages">
<title>Page-level Locks</title>
-
+
<para>
In addition to table and row locks, page-level share/exclusive locks are
used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..285f3ff 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [ FORCE ]
</synopsis>
</refsynopsisdiv>
@@ -68,9 +68,12 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
An index build with the <literal>CONCURRENTLY</> option failed, leaving
an <quote>invalid</> index. Such indexes are useless but it can be
convenient to use <command>REINDEX</> to rebuild them. Note that
- <command>REINDEX</> will not perform a concurrent build. To build the
- index without interfering with production you should drop the index and
- reissue the <command>CREATE INDEX CONCURRENTLY</> command.
+ <command>REINDEX</> will perform a concurrent build if <literal>
+ CONCURRENTLY</> is specified. To build the index without interfering
+ with production you should drop the index and reissue either the
+ <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
+ command. Indexes of toast relations can be rebuilt with <command>REINDEX
+ CONCURRENTLY</>.
</para>
</listitem>
@@ -139,6 +142,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
</varlistentry>
<varlistentry>
+ <term><literal>CONCURRENTLY</literal></term>
+ <listitem>
+ <para>
+ When this option is used, <productname>PostgreSQL</> will rebuild the
+ index without taking any locks that prevent concurrent inserts,
+ updates, or deletes on the table; whereas a standard reindex build
+ locks out writes (but not reads) on the table until it's done.
+ There are several caveats to be aware of when using this option
+ — see <xref linkend="SQL-REINDEX-CONCURRENTLY"
+ endterm="SQL-REINDEX-CONCURRENTLY-title">.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><literal>FORCE</literal></term>
<listitem>
<para>
@@ -218,6 +236,194 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } <replaceable class="PARAMETER">nam
reindex anything.
</para>
+ <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+ <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+ <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+ <primary>index</primary>
+ <secondary>rebuilding concurrently</secondary>
+ </indexterm>
+
+ <para>
+ Rebuilding an index can interfere with regular operation of a database.
+ Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+ against writes and performs the entire index build with a single scan of the
+ table. Other transactions can still read the table, but if they try to
+ insert, update, or delete rows in the table they will block until the
+ index rebuild is finished. This could have a severe effect if the system is
+ a live production database. Very large tables can take many hours to be
+ indexed, and even for smaller tables, an index rebuild can lock out writers
+ for periods that are unacceptably long for a production system.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</> supports rebuilding indexes with minimum locking
+ of writes. This method is invoked by specifying the
+ <literal>CONCURRENTLY</> option of <command>REINDEX</>. When this option
+ is used, <productname>PostgreSQL</> must perform two scans of the table
+ for each index that needs to be rebuild and in addition it must wait for
+ all existing transactions that could potentially use the index to
+ terminate. This method requires more total work than a standard index
+ rebuild and takes significantly longer to complete as it needs to wait
+ for unfinished transactions that might modify the index. However, since
+ it allows normal operations to continue while the index is rebuilt, this
+ method is useful for rebuilding indexes in a production environment. Of
+ course, the extra CPU, memory and I/O load imposed by the index rebuild
+ may slow down other operations.
+ </para>
+
+ <para>
+ The following steps occur in a concurrent index build, each in a separate
+ transaction except when the new index definitions are created, where all
+ the concurrent entries are created using only one transaction. Note that
+ if there are multiple indexes to be rebuilt then each step loops through
+ all the indexes we're rebuilding, using a separate transaction for each one.
+ <command>REINDEX CONCURRENTLY</> proceeds as follows when rebuilding
+ indexes:
+
+ <orderedlist>
+ <listitem>
+ <para>
+ A new temporary index definition is added into the catalog
+ <literal>pg_index</literal>. This definition is only used to build the
+ new index, and will be removed at the completion of the process. This
+ step is done as a single transaction for all the indexes involved in
+ this process, meaning that if <command>REINDEX CONCURRENTLY</> is
+ run on a table with multiple indexes, all the catalog entries of the
+ temporary indexes are created within a single transaction. A
+ <literal>SHARE UPDATE EXCLUSIVE</literal> lock at session level is taken
+ on the indexes being reindexed as well as its parent table to prevent
+ any schema modification while processing.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A first pass to build the index is done for each temporary entry.
+ Once the index is built, its flag <literal>pg_class.isready</> is
+ switched to <quote>true</> to make ready for inserts, making it
+ visible to other sessions once the transaction that performed the
+ build is finished. This step is done within a single transaction
+ for each entry.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Then a second pass is performed to add tuples that were added while
+ the first pass build was running. One the validation of the index
+ related to the temporary entry is done, a cache invalidation is done
+ so as all the sessions that referenced this index in any cached plans
+ will refresh them. This step is performed within a single transaction
+ for each temporary entry.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>pg_class.relfilenode</> for the existing index definition
+ and the temporary definition are swapped. This means that the existing
+ index definition now uses the index data that we stored during the
+ build, and the temporary definition is using the old index data. Again
+ a cache invalidation is performed to refresh any sessions that may refer
+ to the previous index definition. Note that at this point
+ <literal>pg_class.indisvalid</> is not switched to <quote>true</>,
+ making the temporary index definition ignored by any read query, for
+ the sake of toast indexes that can only use one single index in ready
+ state at the same time. During the swap an exclusive lock is taken
+ on the index and its temporary entry.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Temporary entries have <literal>pg_class.isready</> switched to
+ <quote>false</> to prevent any new tuple insertions. This step
+ is done within a single transaction for each temporary entry.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The temporary index definition and its data (which is now the
+ data for the old index) are dropped. This step is done within
+ a single transaction for each temporary entry.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The <literal>SHARE UPDATE EXCLUSIVE</literal> session lock is released
+ for all the indexes processed as well as their parent tables.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ If a problem arises while rebuilding the indexes, such as a
+ uniqueness violation in a unique index, the <command>REINDEX</>
+ command will fail but leave behind an <quote>invalid</> new index on top
+ of the existing one. This index will be ignored for querying purposes
+ because it might be incomplete; however it will still consume update
+ overhead. The <application>psql</> <command>\d</> command will report
+ such an index as <literal>INVALID</>:
+
+<programlisting>
+postgres=# \d tab
+ Table "public.tab"
+ Column | Type | Modifiers
+--------+---------+-----------
+ col | integer |
+Indexes:
+ "idx" btree (col)
+ "idx_cct" btree (col) INVALID
+</programlisting>
+
+ The recommended recovery method in such cases is to drop the concurrent
+ index and try again to perform <command>REINDEX CONCURRENTLY</>.
+ The concurrent index created during the processing has a name finishing by
+ the suffix cct. If a concurrent index is based on a <literal>PRIMARY KEY</>
+ or an exclude constraint is marked as invalid. It can be dropped with
+ <literal>ALTER TABLE DROP CONSTRAINT</>. This is also the case of
+ <literal>UNIQUE</> indexes using constraints. Other indexes can be
+ dropped using <literal>DROP INDEX</> including invalid toast indexes.
+ </para>
+
+ <para>
+ Regular index builds permit other regular index builds on the same
+ table to occur in parallel, but only one concurrent index build can
+ occur on a table at a time. In both cases, no other types of schema
+ modification on the table are allowed meanwhile. Another difference
+ is that a regular <command>REINDEX TABLE</> or <command>REINDEX INDEX</>
+ command can be performed within a transaction block, but
+ <command>REINDEX CONCURRENTLY</> cannot. <command>REINDEX DATABASE</> is
+ by default not allowed to run inside a transaction block, so in this case
+ <command>CONCURRENTLY</> is not supported.
+ </para>
+
+ <para>
+ Invalid indexes of toast relations can be dropped if a failure occurred
+ during <command>REINDEX CONCURRENTLY</>. Valid indexes, being unique
+ for a given toast relation, cannot be dropped.
+ </para>
+
+ <para>
+ <command>REINDEX DATABASE</command> used with <command>CONCURRENTLY
+ </command> rebuilds concurrently only the non-system relations. System
+ relations are rebuilt with a non-concurrent context. Toast indexes are
+ rebuilt concurrently if the relation they depend on is a non-system
+ relation.
+ </para>
+
+ <para>
+ <command>REINDEX</command> uses <literal>ACCESS EXCLUSIVE</literal> lock
+ on all the relations involved during operation. When
+ <command>CONCURRENTLY</command> is specified, the operation is done with
+ <literal>SHARE UPDATE EXCLUSIVE</literal> except when an index and its
+ concurrent entry are swapped where a <literal>ACCESS EXCLUSIVE</literal>
+ lock is taken on the parent relation.
+ </para>
+
+ <para>
+ <command>REINDEX SYSTEM</command> does not support
+ <command>CONCURRENTLY</command>.
+ </para>
+ </refsect2>
</refsect1>
<refsect1>
@@ -249,7 +455,18 @@ $ <userinput>psql broken_db</userinput>
...
broken_db=> REINDEX DATABASE broken_db;
broken_db=> \q
-</programlisting></para>
+</programlisting>
+ </para>
+
+ <para>
+ Rebuild a table while authorizing read and write operations on involved
+ relations when performed:
+
+<programlisting>
+REINDEX TABLE CONCURRENTLY my_broken_table;
+</programlisting>
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index f3b3689..138be1c 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -259,6 +259,18 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
}
/*
+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+ int i;
+
+ for (i = 0; i < tupdesc->natts; i++)
+ tupdesc->attrs[i]->attcacheoff = -1;
+}
+
+/*
* Free a TupleDesc including all substructure
*/
void
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 912038a..66020ba 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -44,9 +44,11 @@
#include "catalog/pg_trigger.h"
#include "catalog/pg_type.h"
#include "catalog/storage.h"
+#include "commands/defrem.h"
#include "commands/tablecmds.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "mb/pg_wchar.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@@ -63,6 +65,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/ruleutils.h"
#include "utils/syscache.h"
#include "utils/tuplesort.h"
#include "utils/snapmgr.h"
@@ -663,6 +666,7 @@ UpdateIndexRelation(Oid indexoid,
* classObjectId: array of index opclass OIDs, one per index column
* coloptions: array of per-index-column indoption settings
* reloptions: AM-specific options
+ * tupdesc: Tuple descriptor used for the index if defined
* isprimary: index is a PRIMARY KEY
* isconstraint: index is owned by PRIMARY KEY, UNIQUE, or EXCLUSION constraint
* deferrable: constraint is DEFERRABLE
@@ -676,6 +680,10 @@ UpdateIndexRelation(Oid indexoid,
* is_internal: if true, post creation hook for new index
* if_not_exists: if true, do not throw an error if a relation with
* the same name already exists.
+ * is_reindex: if true, create an index that is used as a duplicate of an
+ * existing index created during a concurrent operation. This index can
+ * also be a toast relation. Sufficient locks are normally taken on
+ * the related relations once this is called during a concurrent operation.
*
* Returns the OID of the created index.
*/
@@ -692,6 +700,7 @@ index_create(Relation heapRelation,
Oid *classObjectId,
int16 *coloptions,
Datum reloptions,
+ TupleDesc tupdesc,
bool isprimary,
bool isconstraint,
bool deferrable,
@@ -700,7 +709,8 @@ index_create(Relation heapRelation,
bool skip_build,
bool concurrent,
bool is_internal,
- bool if_not_exists)
+ bool if_not_exists,
+ bool is_reindex)
{
Oid heapRelationId = RelationGetRelid(heapRelation);
Relation pg_class;
@@ -743,19 +753,24 @@ index_create(Relation heapRelation,
/*
* concurrent index build on a system catalog is unsafe because we tend to
- * release locks before committing in catalogs
+ * release locks before committing in catalogs. If the index is created during
+ * a REINDEX CONCURRENTLY operation, sufficient locks are already taken.
*/
if (concurrent &&
- IsSystemRelation(heapRelation))
+ IsSystemRelation(heapRelation) &&
+ !is_reindex)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("concurrent index creation on system catalog tables is not supported")));
/*
- * This case is currently not supported, but there's no way to ask for it
- * in the grammar anyway, so it can't happen.
+ * This case is currently only supported during a concurrent index
+ * rebuild, but there is no way to ask for it in the grammar otherwise
+ * anyway. If support for exclusion constraints is added in the future,
+ * the check similar to this one in check_exclusion_constraint should as
+ * well be changed accordingly.
*/
- if (concurrent && is_exclusion)
+ if (concurrent && is_exclusion && !is_reindex)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg_internal("concurrent index creation for exclusion constraints is not supported")));
@@ -794,14 +809,21 @@ index_create(Relation heapRelation,
}
/*
- * construct tuple descriptor for index tuples
+ * construct tuple descriptor for index tuples if nothing is passed
+ * by caller.
*/
- indexTupDesc = ConstructTupleDescriptor(heapRelation,
- indexInfo,
- indexColNames,
- accessMethodObjectId,
- collationObjectId,
- classObjectId);
+ if (tupdesc == NULL)
+ indexTupDesc = ConstructTupleDescriptor(heapRelation,
+ indexInfo,
+ indexColNames,
+ accessMethodObjectId,
+ collationObjectId,
+ classObjectId);
+ else
+ {
+ Assert(indexColNames == NIL);
+ indexTupDesc = tupdesc;
+ }
/*
* Allocate an OID for the index, unless we were told what to use.
@@ -1108,6 +1130,350 @@ index_create(Relation heapRelation,
return indexRelationId;
}
+
+/*
+ * index_concurrent_create
+ *
+ * Create a concurrent index based on the definition of the one provided by
+ * caller that will be used for concurrent operations. The index is inserted
+ * into catalogs and needs to be built later on. This is called during
+ * concurrent index processing. The heap relation on which is based the index
+ * needs to be closed by the caller.
+ */
+Oid
+index_concurrent_create(Relation heapRelation, Oid indOid, char *concurrentName)
+{
+ Relation indexRelation;
+ IndexInfo *indexInfo;
+ Oid concurrentOid = InvalidOid;
+ HeapTuple indexTuple, classTuple;
+ Datum indclassDatum, colOptionDatum, optionDatum;
+ TupleDesc indexTupDesc;
+ oidvector *indclass;
+ int2vector *indcoloptions;
+ bool isnull;
+ bool initdeferred = false;
+ Oid constraintOid = get_index_constraint(indOid);
+
+ indexRelation = index_open(indOid, RowExclusiveLock);
+
+ /* Concurrent index uses the same index information as former index */
+ indexInfo = BuildIndexInfo(indexRelation);
+
+ /*
+ * Determine if index is initdeferred, this depends on its dependent
+ * constraint.
+ */
+ if (OidIsValid(constraintOid))
+ {
+ /* Look for the correct value */
+ HeapTuple constraintTuple;
+ Form_pg_constraint constraintForm;
+
+ constraintTuple = SearchSysCache1(CONSTROID,
+ ObjectIdGetDatum(constraintOid));
+ if (!HeapTupleIsValid(constraintTuple))
+ elog(ERROR, "cache lookup failed for constraint %u",
+ constraintOid);
+ constraintForm = (Form_pg_constraint) GETSTRUCT(constraintTuple);
+ initdeferred = constraintForm->condeferred;
+
+ ReleaseSysCache(constraintTuple);
+ }
+
+ /*
+ * Create a copy of the tuple descriptor to be used for the concurrent
+ * entry and reset any cache counters on it to have a fresh version.
+ */
+ indexTupDesc = CreateTupleDescCopyConstr(RelationGetDescr(indexRelation));
+ ResetTupleDescCache(indexTupDesc);
+
+ /* Get the array of class and column options IDs from index info */
+ indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indOid));
+ if (!HeapTupleIsValid(indexTuple))
+ elog(ERROR, "cache lookup failed for index %u", indOid);
+ indclassDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+ Anum_pg_index_indclass, &isnull);
+ Assert(!isnull);
+ indclass = (oidvector *) DatumGetPointer(indclassDatum);
+
+ colOptionDatum = SysCacheGetAttr(INDEXRELID, indexTuple,
+ Anum_pg_index_indoption, &isnull);
+ Assert(!isnull);
+ indcoloptions = (int2vector *) DatumGetPointer(colOptionDatum);
+
+ /* Fetch options of index if any */
+ classTuple = SearchSysCache1(RELOID, indOid);
+ if (!HeapTupleIsValid(classTuple))
+ elog(ERROR, "cache lookup failed for relation %u", indOid);
+ optionDatum = SysCacheGetAttr(RELOID, classTuple,
+ Anum_pg_class_reloptions, &isnull);
+
+ /* Now create the concurrent index */
+ concurrentOid = index_create(heapRelation,
+ (const char *) concurrentName,
+ InvalidOid,
+ InvalidOid,
+ indexInfo,
+ NIL,
+ indexRelation->rd_rel->relam,
+ indexRelation->rd_rel->reltablespace,
+ indexRelation->rd_indcollation,
+ indclass->values,
+ indcoloptions->values,
+ optionDatum,
+ indexTupDesc,
+ indexRelation->rd_index->indisprimary,
+ OidIsValid(constraintOid), /* is constraint? */
+ !indexRelation->rd_index->indimmediate, /* is deferrable? */
+ initdeferred, /* is initially deferred? */
+ true, /* allow table to be a system catalog? */
+ true, /* skip build? */
+ true, /* concurrent? */
+ false, /* is_internal? */
+ false, /* if_not_exists? */
+ true); /* reindex? */
+
+ /* Close the relations used and clean up */
+ index_close(indexRelation, NoLock);
+ ReleaseSysCache(indexTuple);
+ ReleaseSysCache(classTuple);
+
+ return concurrentOid;
+}
+
+
+/*
+ * index_concurrent_build
+ *
+ * Build index for a concurrent operation. Low-level locks are taken when this
+ * operation is performed to prevent only schema changes but they need to be
+ * kept until the end of the transaction performing this operation.
+ */
+void
+index_concurrent_build(Oid heapOid,
+ Oid indexOid,
+ bool isprimary)
+{
+ Relation heapRel, indexRelation;
+ IndexInfo *indexInfo;
+
+ /* Open and lock the parent heap relation */
+ heapRel = heap_open(heapOid, ShareUpdateExclusiveLock);
+
+ /* And the target index relation */
+ indexRelation = index_open(indexOid, RowExclusiveLock);
+
+ /*
+ * We have to re-build the IndexInfo struct, since it was lost in
+ * commit of transaction where this concurrent index was created
+ * at the catalog level.
+ */
+ indexInfo = BuildIndexInfo(indexRelation);
+ Assert(!indexInfo->ii_ReadyForInserts);
+ indexInfo->ii_Concurrent = true;
+ indexInfo->ii_BrokenHotChain = false;
+
+ /* Now build the index */
+ index_build(heapRel, indexRelation, indexInfo, isprimary, false);
+
+ /* Close both relations, and keep the locks */
+ heap_close(heapRel, NoLock);
+ index_close(indexRelation, NoLock);
+}
+
+
+/*
+ * index_concurrent_swap
+ *
+ * Swap old index and new index in a concurrent context. An exclusive lock
+ * is taken on those two relations during the swap of their relfilenode.
+ */
+void
+index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, LOCKTAG locktag)
+{
+ Relation oldIndexRel, newIndexRel, pg_class;
+ HeapTuple oldIndexTuple, newIndexTuple;
+ Form_pg_class oldIndexForm, newIndexForm;
+ Oid tmpnode;
+
+ /*
+ * Before doing any operation, we need to wait until no running
+ * transaction could be using any index for a query as a deadlock
+ * could occur if another transaction running tries to take the same
+ * level of locking as this operation. Hence use AccessExclusiveLock
+ * to ensure that there is nothing nasty waiting.
+ */
+ WaitForLockers(locktag, AccessExclusiveLock);
+
+ /*
+ * Take a necessary lock on the old and new index before swapping them.
+ */
+ oldIndexRel = relation_open(oldIndexOid, AccessExclusiveLock);
+ newIndexRel = relation_open(newIndexOid, AccessExclusiveLock);
+
+ /* Now swap relfilenode of those indexes */
+ pg_class = heap_open(RelationRelationId, RowExclusiveLock);
+
+ oldIndexTuple = SearchSysCacheCopy1(RELOID,
+ ObjectIdGetDatum(oldIndexOid));
+ if (!HeapTupleIsValid(oldIndexTuple))
+ elog(ERROR, "could not find tuple for relation %u", oldIndexOid);
+ newIndexTuple = SearchSysCacheCopy1(RELOID,
+ ObjectIdGetDatum(newIndexOid));
+ if (!HeapTupleIsValid(newIndexTuple))
+ elog(ERROR, "could not find tuple for relation %u", newIndexOid);
+ oldIndexForm = (Form_pg_class) GETSTRUCT(oldIndexTuple);
+ newIndexForm = (Form_pg_class) GETSTRUCT(newIndexTuple);
+
+ /* Here is where the actual swap happens */
+ tmpnode = oldIndexForm->relfilenode;
+ oldIndexForm->relfilenode = newIndexForm->relfilenode;
+ newIndexForm->relfilenode = tmpnode;
+
+ /* Then update the tuples for each relation */
+ simple_heap_update(pg_class, &oldIndexTuple->t_self, oldIndexTuple);
+ simple_heap_update(pg_class, &newIndexTuple->t_self, newIndexTuple);
+ CatalogUpdateIndexes(pg_class, oldIndexTuple);
+ CatalogUpdateIndexes(pg_class, newIndexTuple);
+
+ /* Close relations and clean up */
+ heap_freetuple(oldIndexTuple);
+ heap_freetuple(newIndexTuple);
+ heap_close(pg_class, RowExclusiveLock);
+
+ /* The lock taken previously is not released until the end of transaction */
+ relation_close(oldIndexRel, NoLock);
+ relation_close(newIndexRel, NoLock);
+}
+
+/*
+ * index_concurrent_set_dead
+ *
+ * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
+ * CONCURRENTLY before actually dropping the index. After calling this
+ * function the index is seen by all the backends as dead. Low-level locks
+ * taken here are kept until the end of the transaction doing calling this
+ * function.
+ */
+void
+index_concurrent_set_dead(Oid heapOid, Oid indexOid, LOCKTAG locktag)
+{
+ Relation heapRelation, indexRelation;
+
+ /*
+ * Now we must wait until no running transaction could be using the
+ * index for a query. Use AccessExclusiveLock here to check for
+ * running transactions that hold locks of any kind on the table. Note
+ * we do not need to worry about xacts that open the table for reading
+ * after this point; they will see the index as invalid when they open
+ * the relation.
+ *
+ * Note: the reason we use actual lock acquisition here, rather than
+ * just checking the ProcArray and sleeping, is that deadlock is
+ * possible if one of the transactions in question is blocked trying
+ * to acquire an exclusive lock on our table. The lock code will
+ * detect deadlock and error out properly.
+ */
+ WaitForLockers(locktag, AccessExclusiveLock);
+
+ /*
+ * No more predicate locks will be acquired on this index, and we're
+ * about to stop doing inserts into the index which could show
+ * conflicts with existing predicate locks, so now is the time to move
+ * them to the heap relation.
+ */
+ heapRelation = heap_open(heapOid, ShareUpdateExclusiveLock);
+ indexRelation = index_open(indexOid, ShareUpdateExclusiveLock);
+ TransferPredicateLocksToHeapRelation(indexRelation);
+
+ /*
+ * Now we are sure that nobody uses the index for queries; they just
+ * might have it open for updating it. So now we can unset indisready
+ * and indislive, then wait till nobody could be using it at all
+ * anymore.
+ */
+ index_set_state_flags(indexOid, INDEX_DROP_SET_DEAD);
+
+ /*
+ * Invalidate the relcache for the table, so that after this commit
+ * all sessions will refresh the table's index list. Forgetting just
+ * the index's relcache entry is not enough.
+ */
+ CacheInvalidateRelcache(heapRelation);
+
+ /*
+ * Close the relations again, though still holding session lock.
+ */
+ heap_close(heapRelation, NoLock);
+ index_close(indexRelation, NoLock);
+}
+
+/*
+ * index_concurrent_drop
+ *
+ * Drop a single index concurrently as the last step of an index concurrent
+ * process. Deletion is done through performDeletion or dependencies of the
+ * index would not get dropped. At this point all the indexes are already
+ * considered as invalid and dead so they can be dropped without using any
+ * concurrent options as it is sure that they will not interact with other
+ * server sessions.
+ */
+void
+index_concurrent_drop(Oid indexOid)
+{
+ Oid constraintOid = get_index_constraint(indexOid);
+ ObjectAddress object;
+ Form_pg_index indexForm;
+ Relation pg_index;
+ HeapTuple indexTuple;
+
+ /*
+ * Check that the index dropped here is not alive, it might be used by
+ * other backends in this case.
+ */
+ pg_index = heap_open(IndexRelationId, RowExclusiveLock);
+
+ indexTuple = SearchSysCacheCopy1(INDEXRELID,
+ ObjectIdGetDatum(indexOid));
+ if (!HeapTupleIsValid(indexTuple))
+ elog(ERROR, "cache lookup failed for index %u", indexOid);
+ indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+ /*
+ * This is only a safety check, just to avoid live indexes from being
+ * dropped.
+ */
+ if (indexForm->indislive)
+ elog(ERROR, "cannot drop live index with OID %u", indexOid);
+
+ /* Clean up */
+ heap_close(pg_index, RowExclusiveLock);
+
+ /*
+ * We are sure to have a dead index, so begin the drop process.
+ * Register constraint or index for drop.
+ */
+ if (OidIsValid(constraintOid))
+ {
+ object.classId = ConstraintRelationId;
+ object.objectId = constraintOid;
+ }
+ else
+ {
+ object.classId = RelationRelationId;
+ object.objectId = indexOid;
+ }
+
+ object.objectSubId = 0;
+
+ /* Perform deletion for normal and toast indexes */
+ performDeletion(&object,
+ DROP_RESTRICT,
+ 0);
+}
+
+
/*
* index_constraint_create
*
@@ -1456,52 +1822,8 @@ index_drop(Oid indexId, bool concurrent)
CommitTransactionCommand();
StartTransactionCommand();
- /*
- * Now we must wait until no running transaction could be using the
- * index for a query. Use AccessExclusiveLock here to check for
- * running transactions that hold locks of any kind on the table. Note
- * we do not need to worry about xacts that open the table for reading
- * after this point; they will see the index as invalid when they open
- * the relation.
- *
- * Note: the reason we use actual lock acquisition here, rather than
- * just checking the ProcArray and sleeping, is that deadlock is
- * possible if one of the transactions in question is blocked trying
- * to acquire an exclusive lock on our table. The lock code will
- * detect deadlock and error out properly.
- */
- WaitForLockers(heaplocktag, AccessExclusiveLock);
-
- /*
- * No more predicate locks will be acquired on this index, and we're
- * about to stop doing inserts into the index which could show
- * conflicts with existing predicate locks, so now is the time to move
- * them to the heap relation.
- */
- userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
- userIndexRelation = index_open(indexId, ShareUpdateExclusiveLock);
- TransferPredicateLocksToHeapRelation(userIndexRelation);
-
- /*
- * Now we are sure that nobody uses the index for queries; they just
- * might have it open for updating it. So now we can unset indisready
- * and indislive, then wait till nobody could be using it at all
- * anymore.
- */
- index_set_state_flags(indexId, INDEX_DROP_SET_DEAD);
-
- /*
- * Invalidate the relcache for the table, so that after this commit
- * all sessions will refresh the table's index list. Forgetting just
- * the index's relcache entry is not enough.
- */
- CacheInvalidateRelcache(userHeapRelation);
-
- /*
- * Close the relations again, though still holding session lock.
- */
- heap_close(userHeapRelation, NoLock);
- index_close(userIndexRelation, NoLock);
+ /* Finish invalidation of index and mark it as dead */
+ index_concurrent_set_dead(heapId, indexId, heaplocktag);
/*
* Again, commit the transaction to make the pg_index update visible
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 5ef6dcc..e1992ad 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -341,8 +341,9 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
BTREE_AM_OID,
rel->rd_rel->reltablespace,
collationObjectId, classObjectId, coloptions, (Datum) 0,
+ NULL,
true, false, false, false,
- true, false, false, true, false);
+ true, false, false, true, false, false);
heap_close(toast_rel, NoLock);
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 0205595..b3c1db5 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -68,8 +68,9 @@ static void ComputeIndexAttrs(IndexInfo *indexInfo,
static Oid GetIndexOpClass(List *opclass, Oid attrType,
char *accessMethodName, Oid accessMethodId);
static char *ChooseIndexName(const char *tabname, Oid namespaceId,
- List *colnames, List *exclusionOpNames,
- bool primary, bool isconstraint);
+ List *colnames, List *exclusionOpNames,
+ bool primary, bool isconstraint,
+ bool concurrent);
static char *ChooseIndexNameAddition(List *colnames);
static List *ChooseIndexColumnNames(List *indexElems);
static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
@@ -276,6 +277,86 @@ CheckIndexCompatible(Oid oldId,
}
/*
+ * WaitForOlderSnapshots
+ *
+ * Wait for transactions that might have older snapshot than the given xmin
+ * limit, because it might not contain tuples deleted just before it has
+ * been taken. Obtain a list of VXIDs of such transactions, and wait for them
+ * individually.
+ *
+ * We can exclude any running transactions that have xmin > the xmin given;
+ * their oldest snapshot must be newer than our xmin limit.
+ * We can also exclude any transactions that have xmin = zero, since they
+ * evidently have no live snapshot at all (and any one they might be in
+ * process of taking is certainly newer than ours). Transactions in other
+ * DBs can be ignored too, since they'll never even be able to see this
+ * index.
+ *
+ * We can also exclude autovacuum processes and processes running manual
+ * lazy VACUUMs, because they won't be fazed by missing index entries
+ * either. (Manual ANALYZEs, however, can't be excluded because they
+ * might be within transactions that are going to do arbitrary operations
+ * later.)
+ *
+ * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
+ * check for that.
+ *
+ * If a process goes idle-in-transaction with xmin zero, we do not need to
+ * wait for it anymore, per the above argument. We do not have the
+ * infrastructure right now to stop waiting if that happens, but we can at
+ * least avoid the folly of waiting when it is idle at the time we would
+ * begin to wait. We do this by repeatedly rechecking the output of
+ * GetCurrentVirtualXIDs. If, during any iteration, a particular vxid
+ * doesn't show up in the output, we know we can forget about it.
+ */
+static void
+WaitForOlderSnapshots(TransactionId limitXmin)
+{
+ int i, n_old_snapshots;
+ VirtualTransactionId *old_snapshots;
+
+ old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
+ PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+ &n_old_snapshots);
+
+ for (i = 0; i < n_old_snapshots; i++)
+ {
+ if (!VirtualTransactionIdIsValid(old_snapshots[i]))
+ continue; /* found uninteresting in previous cycle */
+
+ if (i > 0)
+ {
+ /* see if anything's changed ... */
+ VirtualTransactionId *newer_snapshots;
+ int n_newer_snapshots, j, k;
+
+ newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
+ true, false,
+ PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
+ &n_newer_snapshots);
+ for (j = i; j < n_old_snapshots; j++)
+ {
+ if (!VirtualTransactionIdIsValid(old_snapshots[j]))
+ continue; /* found uninteresting in previous cycle */
+ for (k = 0; k < n_newer_snapshots; k++)
+ {
+ if (VirtualTransactionIdEquals(old_snapshots[j],
+ newer_snapshots[k]))
+ break;
+ }
+ if (k >= n_newer_snapshots) /* not there anymore */
+ SetInvalidVirtualTransactionId(old_snapshots[j]);
+ }
+ pfree(newer_snapshots);
+ }
+
+ if (VirtualTransactionIdIsValid(old_snapshots[i]))
+ VirtualXactLock(old_snapshots[i], true);
+ }
+}
+
+
+/*
* DefineIndex
* Creates a new index.
*
@@ -312,7 +393,6 @@ DefineIndex(Oid relationId,
Oid tablespaceId;
List *indexColNames;
Relation rel;
- Relation indexRelation;
HeapTuple tuple;
Form_pg_am accessMethodForm;
bool amcanorder;
@@ -322,13 +402,10 @@ DefineIndex(Oid relationId,
IndexInfo *indexInfo;
int numberOfAttributes;
TransactionId limitXmin;
- VirtualTransactionId *old_snapshots;
- int n_old_snapshots;
LockRelId heaprelid;
LOCKTAG heaplocktag;
LOCKMODE lockmode;
Snapshot snapshot;
- int i;
/*
* count attributes in index
@@ -459,7 +536,8 @@ DefineIndex(Oid relationId,
indexColNames,
stmt->excludeOpNames,
stmt->primary,
- stmt->isconstraint);
+ stmt->isconstraint,
+ false);
/*
* look up the access method, verify it can handle the requested features
@@ -606,12 +684,12 @@ DefineIndex(Oid relationId,
indexInfo, indexColNames,
accessMethodId, tablespaceId,
collationObjectId, classObjectId,
- coloptions, reloptions, stmt->primary,
+ coloptions, reloptions, NULL, stmt->primary,
stmt->isconstraint, stmt->deferrable, stmt->initdeferred,
allowSystemTableMods,
skip_build || stmt->concurrent,
stmt->concurrent, !check_rights,
- stmt->if_not_exists);
+ stmt->if_not_exists, false);
if (!OidIsValid(indexRelationId))
{
@@ -699,27 +777,15 @@ DefineIndex(Oid relationId,
* HOT-chain or the extension of the chain is HOT-safe for this index.
*/
- /* Open and lock the parent heap relation */
- rel = heap_openrv(stmt->relation, ShareUpdateExclusiveLock);
-
- /* And the target index relation */
- indexRelation = index_open(indexRelationId, RowExclusiveLock);
-
/* Set ActiveSnapshot since functions in the indexes may need it */
PushActiveSnapshot(GetTransactionSnapshot());
- /* We have to re-build the IndexInfo struct, since it was lost in commit */
- indexInfo = BuildIndexInfo(indexRelation);
- Assert(!indexInfo->ii_ReadyForInserts);
- indexInfo->ii_Concurrent = true;
- indexInfo->ii_BrokenHotChain = false;
-
- /* Now build the index */
- index_build(rel, indexRelation, indexInfo, stmt->primary, false);
-
- /* Close both the relations, but keep the locks */
- heap_close(rel, NoLock);
- index_close(indexRelation, NoLock);
+ /* Perform concurrent build of index */
+ index_concurrent_build(RangeVarGetRelid(stmt->relation,
+ ShareUpdateExclusiveLock,
+ false),
+ indexRelationId,
+ stmt->primary);
/*
* Update the pg_index row to mark the index as ready for inserts. Once we
@@ -784,74 +850,9 @@ DefineIndex(Oid relationId,
* The index is now valid in the sense that it contains all currently
* interesting tuples. But since it might not contain tuples deleted just
* before the reference snap was taken, we have to wait out any
- * transactions that might have older snapshots. Obtain a list of VXIDs
- * of such transactions, and wait for them individually.
- *
- * We can exclude any running transactions that have xmin > the xmin of
- * our reference snapshot; their oldest snapshot must be newer than ours.
- * We can also exclude any transactions that have xmin = zero, since they
- * evidently have no live snapshot at all (and any one they might be in
- * process of taking is certainly newer than ours). Transactions in other
- * DBs can be ignored too, since they'll never even be able to see this
- * index.
- *
- * We can also exclude autovacuum processes and processes running manual
- * lazy VACUUMs, because they won't be fazed by missing index entries
- * either. (Manual ANALYZEs, however, can't be excluded because they
- * might be within transactions that are going to do arbitrary operations
- * later.)
- *
- * Also, GetCurrentVirtualXIDs never reports our own vxid, so we need not
- * check for that.
- *
- * If a process goes idle-in-transaction with xmin zero, we do not need to
- * wait for it anymore, per the above argument. We do not have the
- * infrastructure right now to stop waiting if that happens, but we can at
- * least avoid the folly of waiting when it is idle at the time we would
- * begin to wait. We do this by repeatedly rechecking the output of
- * GetCurrentVirtualXIDs. If, during any iteration, a particular vxid
- * doesn't show up in the output, we know we can forget about it.
+ * transactions that might have older snapshots.
*/
- old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
- PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
- &n_old_snapshots);
-
- for (i = 0; i < n_old_snapshots; i++)
- {
- if (!VirtualTransactionIdIsValid(old_snapshots[i]))
- continue; /* found uninteresting in previous cycle */
-
- if (i > 0)
- {
- /* see if anything's changed ... */
- VirtualTransactionId *newer_snapshots;
- int n_newer_snapshots;
- int j;
- int k;
-
- newer_snapshots = GetCurrentVirtualXIDs(limitXmin,
- true, false,
- PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
- &n_newer_snapshots);
- for (j = i; j < n_old_snapshots; j++)
- {
- if (!VirtualTransactionIdIsValid(old_snapshots[j]))
- continue; /* found uninteresting in previous cycle */
- for (k = 0; k < n_newer_snapshots; k++)
- {
- if (VirtualTransactionIdEquals(old_snapshots[j],
- newer_snapshots[k]))
- break;
- }
- if (k >= n_newer_snapshots) /* not there anymore */
- SetInvalidVirtualTransactionId(old_snapshots[j]);
- }
- pfree(newer_snapshots);
- }
-
- if (VirtualTransactionIdIsValid(old_snapshots[i]))
- VirtualXactLock(old_snapshots[i], true);
- }
+ WaitForOlderSnapshots(limitXmin);
/*
* Index can now be marked valid -- update its pg_index entry
@@ -878,6 +879,559 @@ DefineIndex(Oid relationId,
/*
+ * ReindexRelationConcurrently
+ *
+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each phase is processed
+ * one by done for each table's indexes as well as its dependent toast indexes
+ * if this table has a toast relation defined.
+ */
+bool
+ReindexRelationConcurrently(Oid relationOid)
+{
+ List *concurrentIndexIds = NIL,
+ *indexIds = NIL,
+ *parentRelationIds = NIL,
+ *lockTags = NIL,
+ *relationLocks = NIL;
+ ListCell *lc, *lc2;
+ Snapshot snapshot;
+
+ /*
+ * Extract the list of indexes that are going to be rebuilt based on the
+ * list of relation Oids given by caller. For each element in given list,
+ * If the relkind of given relation Oid is a table, all its valid indexes
+ * will be rebuilt, including its associated toast table indexes. If
+ * relkind is an index, this index itself will be rebuilt. The locks taken
+ * on parent relations and involved indexes are kept until this transaction
+ * is committed to protect against schema changes that might occur until
+ * the session lock is taken on each relation, session lock used to
+ * similarly protect from any schema change that could happen within the
+ * multiple transactions that are used during this process.
+ */
+ switch (get_rel_relkind(relationOid))
+ {
+ case RELKIND_RELATION:
+ case RELKIND_MATVIEW:
+ case RELKIND_TOASTVALUE:
+ {
+ /*
+ * In the case of a relation, find all its indexes
+ * including toast indexes.
+ */
+ Relation heapRelation;
+
+ /* Track this relation for session locks */
+ parentRelationIds = lappend_oid(parentRelationIds, relationOid);
+
+ /* A shared relation cannot be reindexed concurrently */
+ if (IsSharedRelation(relationOid))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("concurrent reindex is not supported for shared relations")));
+
+ /* A system catalog cannot be reindexed concurrently */
+ if (IsSystemNamespace(get_rel_namespace(relationOid)))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("concurrent reindex is not supported for catalog relations")));
+
+ /* Open relation to get its indexes */
+ heapRelation = heap_open(relationOid, ShareUpdateExclusiveLock);
+
+ /* Add all the valid indexes of relation to list */
+ foreach(lc2, RelationGetIndexList(heapRelation))
+ {
+ Oid cellOid = lfirst_oid(lc2);
+ Relation indexRelation = index_open(cellOid,
+ ShareUpdateExclusiveLock);
+
+ if (!indexRelation->rd_index->indisvalid)
+ ereport(WARNING,
+ (errcode(ERRCODE_INDEX_CORRUPTED),
+ errmsg("cannot reindex concurrently invalid index \"%s.%s\", skipping",
+ get_namespace_name(get_rel_namespace(cellOid)),
+ get_rel_name(cellOid))));
+ else
+ indexIds = lappend_oid(indexIds, cellOid);
+
+ index_close(indexRelation, NoLock);
+ }
+
+ /* Also add the toast indexes */
+ if (OidIsValid(heapRelation->rd_rel->reltoastrelid))
+ {
+ Oid toastOid = heapRelation->rd_rel->reltoastrelid;
+ Relation toastRelation = heap_open(toastOid,
+ ShareUpdateExclusiveLock);
+
+ /* Track this relation for session locks */
+ parentRelationIds = lappend_oid(parentRelationIds, toastOid);
+
+ foreach(lc2, RelationGetIndexList(toastRelation))
+ {
+ Oid cellOid = lfirst_oid(lc2);
+ Relation indexRelation = index_open(cellOid,
+ ShareUpdateExclusiveLock);
+
+ if (!indexRelation->rd_index->indisvalid)
+ ereport(WARNING,
+ (errcode(ERRCODE_INDEX_CORRUPTED),
+ errmsg("cannot reindex concurrently invalid index \"%s.%s\", skipping",
+ get_namespace_name(get_rel_namespace(cellOid)),
+ get_rel_name(cellOid))));
+ else
+ indexIds = lappend_oid(indexIds, cellOid);
+
+ index_close(indexRelation, NoLock);
+ }
+
+ heap_close(toastRelation, NoLock);
+ }
+
+ heap_close(heapRelation, NoLock);
+ break;
+ }
+ case RELKIND_INDEX:
+ {
+ /*
+ * For an index simply add its Oid to list. Invalid indexes
+ * cannot be included in list.
+ */
+ Relation indexRelation = index_open(relationOid, ShareUpdateExclusiveLock);
+
+ /* Track the parent relation of this index for session locks */
+ parentRelationIds = list_make1_oid(IndexGetRelation(relationOid, false));
+
+ if (!indexRelation->rd_index->indisvalid)
+ ereport(WARNING,
+ (errcode(ERRCODE_INDEX_CORRUPTED),
+ errmsg("cannot reindex concurrently invalid index \"%s.%s\", skipping",
+ get_namespace_name(get_rel_namespace(relationOid)),
+ get_rel_name(relationOid))));
+ else
+ indexIds = list_make1_oid(relationOid);
+
+ index_close(indexRelation, NoLock);
+ break;
+ }
+ default:
+ /* Return error if type of relation is not supported */
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot reindex concurrently this type of relation")));
+ break;
+ }
+
+ /* Definetely no indexes, so leave */
+ if (indexIds == NIL)
+ return false;
+
+ Assert(parentRelationIds != NIL);
+
+ /*
+ * Phase 1 of REINDEX CONCURRENTLY
+ *
+ * Here begins the process for concurrently rebuilding the index entries.
+ * We need first to create an index which is based on the same data
+ * as the former index except that it will be only registered in catalogs
+ * and will be built later. It is possible to perform all the operations
+ * on all the indexes at the same time for a parent relation including
+ * indexes for its toast relation.
+ */
+
+ /* Do the concurrent index creation for each index */
+ foreach(lc, indexIds)
+ {
+ char *concurrentName;
+ Oid indOid = lfirst_oid(lc);
+ Oid concurrentOid = InvalidOid;
+ Relation indexRel,
+ indexParentRel,
+ indexConcurrentRel;
+ LockRelId lockrelid;
+
+ indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+ /* Open the index parent relation, might be a toast or parent relation */
+ indexParentRel = heap_open(indexRel->rd_index->indrelid,
+ ShareUpdateExclusiveLock);
+
+ /* Choose a relation name for concurrent index */
+ concurrentName = ChooseIndexName(get_rel_name(indOid),
+ get_rel_namespace(indexRel->rd_index->indrelid),
+ NULL,
+ NULL,
+ false,
+ false,
+ true);
+
+ /* Create concurrent index based on given index */
+ concurrentOid = index_concurrent_create(indexParentRel,
+ indOid,
+ concurrentName);
+
+ /*
+ * Now open the relation of concurrent index, a lock is also needed on
+ * it
+ */
+ indexConcurrentRel = index_open(concurrentOid, ShareUpdateExclusiveLock);
+
+ /* Save the concurrent index Oid */
+ concurrentIndexIds = lappend_oid(concurrentIndexIds, concurrentOid);
+
+ /*
+ * Save lockrelid to protect each concurrent relation from drop then
+ * close relations. The lockrelid on parent relation is not taken here
+ * to avoid multiple locks taken on the same relation, instead we rely
+ * on parentRelationIds built earlier.
+ */
+ lockrelid = indexRel->rd_lockInfo.lockRelId;
+ relationLocks = lappend(relationLocks, &lockrelid);
+ lockrelid = indexConcurrentRel->rd_lockInfo.lockRelId;
+ relationLocks = lappend(relationLocks, &lockrelid);
+
+ index_close(indexRel, NoLock);
+ index_close(indexConcurrentRel, NoLock);
+ heap_close(indexParentRel, NoLock);
+ }
+
+ /*
+ * Save the heap lock for following visibility checks with other backends
+ * might conflict with this session.
+ */
+ foreach(lc, parentRelationIds)
+ {
+ Relation heapRelation = heap_open(lfirst_oid(lc), ShareUpdateExclusiveLock);
+ LockRelId lockrelid = heapRelation->rd_lockInfo.lockRelId;
+ LOCKTAG *heaplocktag = (LOCKTAG *) palloc(sizeof(LOCKTAG));
+
+ /* Add lockrelid of parent relation to the list of locked relations */
+ relationLocks = lappend(relationLocks, &lockrelid);
+
+ /* Save the LOCKTAG for this parent relation for the wait phase */
+ SET_LOCKTAG_RELATION(*heaplocktag, lockrelid.dbId, lockrelid.relId);
+ lockTags = lappend(lockTags, heaplocktag);
+
+ /* Close heap relation */
+ heap_close(heapRelation, NoLock);
+ }
+
+ /*
+ * For a concurrent build, it is necessary to make the catalog entries
+ * visible to the other transactions before actually building the index.
+ * This will prevent them from making incompatible HOT updates. The index
+ * is marked as not ready and invalid so as no other transactions will try
+ * to use it for INSERT or SELECT.
+ *
+ * Before committing, get a session level lock on the relation, the
+ * concurrent index and its copy to insure that none of them are dropped
+ * until the operation is done.
+ */
+ foreach(lc, relationLocks)
+ {
+ LockRelId lockRel = *((LockRelId *) lfirst(lc));
+ LockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+ }
+
+ PopActiveSnapshot();
+ CommitTransactionCommand();
+
+ /*
+ * Phase 2 of REINDEX CONCURRENTLY
+ *
+ * Build concurrent indexes in a separate transaction for each index to
+ * avoid having open transactions for an unnecessary long time. A
+ * concurrent build is done for each concurrent index that will replace
+ * the old indexes. Before doing that, we need to wait on the parent
+ * relations until no running transactions could have the parent table
+ * of index open.
+ */
+
+ /* Perform a wait on all the session locks */
+ StartTransactionCommand();
+ WaitForLockersMultiple(lockTags, ShareLock);
+ CommitTransactionCommand();
+
+ forboth(lc, indexIds, lc2, concurrentIndexIds)
+ {
+ Relation indexRel;
+ Oid indOid = lfirst_oid(lc);
+ Oid concurrentOid = lfirst_oid(lc2);
+ bool primary;
+
+ /* Check for any process interruption */
+ CHECK_FOR_INTERRUPTS();
+
+ /* Start new transaction for this index concurrent build */
+ StartTransactionCommand();
+
+ /* Set ActiveSnapshot since functions in the indexes may need it */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ /*
+ * Index relation has been closed by previous commit, so reopen it
+ * to determine if it is used as a primary key.
+ */
+ indexRel = index_open(indOid, ShareUpdateExclusiveLock);
+ primary = indexRel->rd_index->indisprimary;
+ index_close(indexRel, NoLock);
+
+ /* Perform concurrent build of new index */
+ index_concurrent_build(indexRel->rd_index->indrelid,
+ concurrentOid,
+ primary);
+
+ /*
+ * Update the pg_index row of the concurrent index as ready for inserts.
+ * Once we commit this transaction, any new transactions that open the
+ * table must insert new entries into the index for insertions and
+ * non-HOT updates.
+ */
+ index_set_state_flags(concurrentOid, INDEX_CREATE_SET_READY);
+
+ /* we can do away with our snapshot */
+ PopActiveSnapshot();
+
+ /*
+ * Commit this transaction to make the indisready update visible for
+ * concurrent index.
+ */
+ CommitTransactionCommand();
+ }
+
+
+ /*
+ * Phase 3 of REINDEX CONCURRENTLY
+ *
+ * During this phase the concurrent indexes catch up with any new tuples
+ * that were created during the previous phase.
+ *
+ * We once again wait until no transaction can have the table open with
+ * the index marked as read-only for updates. Each index validation is
+ * done in a separate transaction to minimize how long we hold an open
+ * transaction.
+ */
+
+ /* Perform a wait on all the session locks */
+ StartTransactionCommand();
+ WaitForLockersMultiple(lockTags, ShareLock);
+ CommitTransactionCommand();
+
+ /*
+ * Perform a scan of each concurrent index with the heap, then insert
+ * any missing index entries.
+ */
+ foreach(lc, concurrentIndexIds)
+ {
+ Oid indOid = lfirst_oid(lc);
+ Oid relOid;
+ TransactionId limitXmin;
+
+ /* Check for any process interruption */
+ CHECK_FOR_INTERRUPTS();
+
+ /* Open separate transaction to validate index */
+ StartTransactionCommand();
+
+ /* Get the parent relation Oid */
+ relOid = IndexGetRelation(indOid, false);
+
+ /*
+ * Take the reference snapshot that will be used for the concurrent indexes
+ * validation.
+ */
+ snapshot = RegisterSnapshot(GetTransactionSnapshot());
+ PushActiveSnapshot(snapshot);
+
+ /* Validate index, which might be a toast */
+ validate_index(relOid, indOid, snapshot);
+
+ /*
+ * Invalidate the relcache for the table, so that after this commit
+ * all sessions will refresh any cached plans that might reference the
+ * index.
+ */
+ CacheInvalidateRelcacheByRelid(relOid);
+
+ /*
+ * We can now do away with our active snapshot, we still need to save the xmin
+ * limit to wait for older snapshots.
+ */
+ limitXmin = snapshot->xmin;
+ PopActiveSnapshot();
+
+ /* And we can remove the validating snapshot too */
+ UnregisterSnapshot(snapshot);
+
+ /*
+ * This concurrent index is now valid as they contain all the tuples
+ * necessary. However, it might not have taken into account deleted tuples
+ * before the reference snapshot was taken, so we need to wait for the
+ * transactions that might have older snapshots than ours.
+ */
+ WaitForOlderSnapshots(limitXmin);
+
+ /* Commit this transaction to make the concurrent index valid */
+ CommitTransactionCommand();
+ }
+
+ /*
+ * Phase 4 of REINDEX CONCURRENTLY
+ *
+ * Now that the concurrent indexes have been validated, it is necessary
+ * to swap each concurrent index with its corresponding old index. Note
+ * that the concurrent index used for swaping is not marked as valid
+ * because we need to keep the former index and the concurrent index with
+ * a different valid status to avoid an explosion in the number of indexes
+ * a parent relation could have if this operation step fails multiple times
+ * in a row due to a reason or another. Note that once this phase is done
+ * each concurrent index will be thrown away in the next process steps.
+ */
+ forboth(lc, indexIds, lc2, concurrentIndexIds)
+ {
+ Oid indOid = lfirst_oid(lc);
+ Oid concurrentOid = lfirst_oid(lc2);
+ LOCKTAG *heapLockTag = NULL;
+ ListCell *cell;
+ Oid relOid;
+
+ /* Check for any process interruption */
+ CHECK_FOR_INTERRUPTS();
+
+ /*
+ * Each index needs to be swapped in a separate transaction, so start
+ * a new one.
+ */
+ StartTransactionCommand();
+ relOid = IndexGetRelation(indOid, false);
+
+ /*
+ * Find the locktag of parent table for this index, we need to wait for
+ * locks on it before the swap.
+ */
+ foreach(cell, lockTags)
+ {
+ LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+ if (relOid == localTag->locktag_field2)
+ heapLockTag = localTag;
+ }
+ Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+ /* Swap old index and its concurrent entry */
+ index_concurrent_swap(concurrentOid, indOid, *heapLockTag);
+
+ /*
+ * Invalidate the relcache for the table, so that after this commit
+ * all sessions will refresh any cached plans that might reference the
+ * index.
+ */
+ relOid = IndexGetRelation(indOid, false);
+ CacheInvalidateRelcacheByRelid(relOid);
+
+ /* Commit this transaction and make old index invalidation visible */
+ CommitTransactionCommand();
+ }
+
+ /*
+ * Phase 5 of REINDEX CONCURRENTLY
+ *
+ * The indexes hold now a fresh relfilenode of their respective concurrent
+ * entries indexes. It is time to mark the now-useless concurrent entries
+ * as not ready so as they can be safely discarded from write operations
+ * that may occur on them. One transaction is used for each single index
+ * entry.
+ */
+ foreach(lc, concurrentIndexIds)
+ {
+ Oid indOid = lfirst_oid(lc);
+ Oid relOid;
+ LOCKTAG *heapLockTag = NULL;
+ ListCell *cell;
+
+ /* Check for any process interruption */
+ CHECK_FOR_INTERRUPTS();
+
+ StartTransactionCommand();
+ relOid = IndexGetRelation(indOid, false);
+
+ /*
+ * Find the locktag of parent table for this index, we need to wait for
+ * locks on it.
+ */
+ foreach(cell, lockTags)
+ {
+ LOCKTAG *localTag = (LOCKTAG *) lfirst(cell);
+ if (relOid == localTag->locktag_field2)
+ heapLockTag = localTag;
+ }
+ Assert(heapLockTag && heapLockTag->locktag_field2 != InvalidOid);
+
+ /*
+ * Finish the index invalidation and set it as dead. Note that it is
+ * necessary to wait for for virtual locks on the parent relation
+ * before setting the index as dead.
+ */
+ index_concurrent_set_dead(relOid, indOid, *heapLockTag);
+
+ /* Commit this transaction to make the update visible. */
+ CommitTransactionCommand();
+ }
+
+ /*
+ * Phase 6 of REINDEX CONCURRENTLY
+ *
+ * Drop the concurrent indexes, with actually the same code path as
+ * DROP INDEX CONCURRENTLY. This is safe as all the concurrent entries are
+ * already considered as invalid and not ready, so they will not be used
+ * by other backends for any read or write operations.
+ */
+ foreach(lc, concurrentIndexIds)
+ {
+ Oid indexOid = lfirst_oid(lc);
+
+ /* Check for any process interruption */
+ CHECK_FOR_INTERRUPTS();
+
+ /* Start transaction to drop this index */
+ StartTransactionCommand();
+
+ /* Get fresh snapshot for next step */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ /*
+ * Open transaction if necessary, for the first index treated its
+ * transaction has been already opened previously.
+ */
+ index_concurrent_drop(indexOid);
+
+ /* We can do away with our snapshot */
+ PopActiveSnapshot();
+
+ /* Commit this transaction to make the update visible. */
+ CommitTransactionCommand();
+ }
+
+ /*
+ * Last thing to do is to release the session-level lock on the parent table
+ * and the indexes of table.
+ */
+ foreach(lc, relationLocks)
+ {
+ LockRelId lockRel = *((LockRelId *) lfirst(lc));
+ UnlockRelationIdForSession(&lockRel, ShareUpdateExclusiveLock);
+ }
+
+ /* Start a new transaction to finish process properly */
+ StartTransactionCommand();
+
+ /* Get fresh snapshot for the end of process */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
+ return true;
+}
+
+
+/*
* CheckMutability
* Test whether given expression is mutable
*/
@@ -1540,7 +2094,8 @@ ChooseRelationName(const char *name1, const char *name2,
static char *
ChooseIndexName(const char *tabname, Oid namespaceId,
List *colnames, List *exclusionOpNames,
- bool primary, bool isconstraint)
+ bool primary, bool isconstraint,
+ bool concurrent)
{
char *indexname;
@@ -1566,6 +2121,13 @@ ChooseIndexName(const char *tabname, Oid namespaceId,
"key",
namespaceId);
}
+ else if (concurrent)
+ {
+ indexname = ChooseRelationName(tabname,
+ NULL,
+ "cct",
+ namespaceId);
+ }
else
{
indexname = ChooseRelationName(tabname,
@@ -1678,18 +2240,22 @@ ChooseIndexColumnNames(List *indexElems)
* Recreate a specific index.
*/
Oid
-ReindexIndex(RangeVar *indexRelation)
+ReindexIndex(RangeVar *indexRelation, bool concurrent)
{
Oid indOid;
Oid heapOid = InvalidOid;
- /* lock level used here should match index lock reindex_index() */
- indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
- false, false,
- RangeVarCallbackForReindexIndex,
- (void *) &heapOid);
+ indOid = RangeVarGetRelidExtended(indexRelation,
+ concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
+ concurrent, concurrent,
+ RangeVarCallbackForReindexIndex,
+ (void *) &heapOid);
- reindex_index(indOid, false);
+ /* Continue process for concurrent or non-concurrent case */
+ if (!concurrent)
+ reindex_index(indOid, false);
+ else
+ ReindexRelationConcurrently(indOid);
return indOid;
}
@@ -1758,17 +2324,27 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation,
* Recreate all indexes of a table (and of its toast table, if any)
*/
Oid
-ReindexTable(RangeVar *relation)
+ReindexTable(RangeVar *relation, bool concurrent)
{
Oid heapOid;
+ bool result;
/* The lock level used here should match reindex_relation(). */
- heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
- RangeVarCallbackOwnsTable, NULL);
-
- if (!reindex_relation(heapOid,
+ heapOid = RangeVarGetRelidExtended(relation,
+ concurrent ? ShareUpdateExclusiveLock : ShareLock,
+ concurrent, concurrent,
+ RangeVarCallbackOwnsTable, NULL);
+
+ /* Run the concurrent process if necessary */
+ if (concurrent)
+ result = ReindexRelationConcurrently(heapOid);
+ else
+ result = reindex_relation(heapOid,
REINDEX_REL_PROCESS_TOAST |
- REINDEX_REL_CHECK_CONSTRAINTS))
+ REINDEX_REL_CHECK_CONSTRAINTS);
+
+ /* Let user know if operation has been moot */
+ if (!result)
ereport(NOTICE,
(errmsg("table \"%s\" has no indexes",
relation->relname)));
@@ -1785,7 +2361,10 @@ ReindexTable(RangeVar *relation)
* That means this must not be called within a user transaction block!
*/
Oid
-ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
+ReindexDatabase(const char *databaseName,
+ bool do_system,
+ bool do_user,
+ bool concurrent)
{
Relation relationRelation;
HeapScanDesc scan;
@@ -1797,6 +2376,15 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
AssertArg(databaseName);
+ /*
+ * CONCURRENTLY operation is not allowed for a system, but it is for a
+ * database.
+ */
+ if (concurrent && !do_user)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot reindex system concurrently")));
+
if (strcmp(databaseName, get_database_name(MyDatabaseId)) != 0)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
@@ -1881,17 +2469,42 @@ ReindexDatabase(const char *databaseName, bool do_system, bool do_user)
foreach(l, relids)
{
Oid relid = lfirst_oid(l);
+ bool result = false;
+ bool process_concurrent;
StartTransactionCommand();
/* functions in indexes may want a snapshot set */
PushActiveSnapshot(GetTransactionSnapshot());
- if (reindex_relation(relid,
- REINDEX_REL_PROCESS_TOAST |
- REINDEX_REL_CHECK_CONSTRAINTS))
+
+ /* Determine if relation needs to be processed concurrently */
+ process_concurrent = concurrent &&
+ !IsSystemNamespace(get_rel_namespace(relid));
+
+ /*
+ * Reindex relation with a concurrent or non-concurrent process.
+ * System relations cannot be reindexed concurrently, but they
+ * need to be reindexed including pg_class with a normal process
+ * as they could be corrupted, and concurrent process might also
+ * use them. This does not include toast relations, which are
+ * reindexed when their parent relation is processed.
+ */
+ if (process_concurrent)
+ {
+ old = MemoryContextSwitchTo(private_context);
+ result = ReindexRelationConcurrently(relid);
+ MemoryContextSwitchTo(old);
+ }
+ else
+ result = reindex_relation(relid,
+ REINDEX_REL_PROCESS_TOAST |
+ REINDEX_REL_CHECK_CONSTRAINTS);
+
+ if (result)
ereport(NOTICE,
- (errmsg("table \"%s.%s\" was reindexed",
+ (errmsg("table \"%s.%s\" was reindexed%s",
get_namespace_name(get_rel_namespace(relid)),
- get_rel_name(relid))));
+ get_rel_name(relid),
+ process_concurrent ? " concurrently" : "")));
PopActiveSnapshot();
CommitTransactionCommand();
}
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 714a9f1..9875f1a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -908,6 +908,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
char relkind;
Form_pg_class classform;
LOCKMODE heap_lockmode;
+ bool invalid_system_index = false;
state = (struct DropRelationCallbackState *) arg;
relkind = state->relkind;
@@ -943,7 +944,37 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid,
aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
rel->relname);
- if (!allowSystemTableMods && IsSystemClass(relOid, classform))
+ /*
+ * Check the case of a system index that might have been invalidated by a
+ * failed concurrent process and allow its drop. For the time being, this
+ * only concerns indexes of toast relations that became invalid during a
+ * REINDEX CONCURRENTLY process.
+ */
+ if (IsSystemClass(relOid, classform) &&
+ relkind == RELKIND_INDEX)
+ {
+ HeapTuple locTuple;
+ Form_pg_index indexform;
+ bool indisvalid;
+
+ locTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(relOid));
+ if (!HeapTupleIsValid(locTuple))
+ {
+ ReleaseSysCache(tuple);
+ return;
+ }
+
+ indexform = (Form_pg_index) GETSTRUCT(locTuple);
+ indisvalid = indexform->indisvalid;
+ ReleaseSysCache(locTuple);
+
+ /* Mark object as being an invalid index of system catalogs */
+ if (!indisvalid)
+ invalid_system_index = true;
+ }
+
+ /* In the case of an invalid index, it is fine to bypass this check */
+ if (!invalid_system_index && !allowSystemTableMods && IsSystemClass(relOid, classform))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("permission denied: \"%s\" is a system catalog",
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index d5e1273..8690eeb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1201,6 +1201,20 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
}
/*
+ * As an invalid index only exists when created in a concurrent context,
+ * and that this code path cannot be taken by CREATE INDEX CONCURRENTLY
+ * as this feature is not available for exclusion constraints, this code
+ * path can only be taken by REINDEX CONCURRENTLY. In this case the same
+ * index exists in parallel to this one so we can bypass this check as
+ * it has already been done on the other index existing in parallel.
+ * If exclusion constraints are supported in the future for CREATE INDEX
+ * CONCURRENTLY, this should be removed or completed especially for this
+ * purpose.
+ */
+ if (!index->rd_index->indisvalid)
+ return true;
+
+ /*
* Search the tuples that are in the index for any violations, including
* tuples that aren't visible yet.
*/
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e76b5b3..6675d85 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3748,6 +3748,7 @@ _copyReindexStmt(const ReindexStmt *from)
COPY_STRING_FIELD(name);
COPY_SCALAR_FIELD(do_system);
COPY_SCALAR_FIELD(do_user);
+ COPY_SCALAR_FIELD(concurrent);
return newnode;
}
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index d5db71d..de35a08 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1902,6 +1902,7 @@ _equalReindexStmt(const ReindexStmt *a, const ReindexStmt *b)
COMPARE_STRING_FIELD(name);
COMPARE_SCALAR_FIELD(do_system);
COMPARE_SCALAR_FIELD(do_user);
+ COMPARE_SCALAR_FIELD(concurrent);
return true;
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index bd180e7..f31519d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -7183,35 +7183,38 @@ opt_if_exists: IF_P EXISTS { $$ = TRUE; }
*
* QUERY:
*
- * REINDEX type <name> [FORCE]
+ * REINDEX type [CONCURRENTLY] <name> [FORCE]
*
* FORCE no longer does anything, but we accept it for backwards compatibility
*****************************************************************************/
ReindexStmt:
- REINDEX reindex_type qualified_name opt_force
+ REINDEX reindex_type opt_concurrently qualified_name opt_force
{
ReindexStmt *n = makeNode(ReindexStmt);
n->kind = $2;
- n->relation = $3;
+ n->concurrent = $3;
+ n->relation = $4;
n->name = NULL;
$$ = (Node *)n;
}
- | REINDEX SYSTEM_P name opt_force
+ | REINDEX SYSTEM_P opt_concurrently name opt_force
{
ReindexStmt *n = makeNode(ReindexStmt);
n->kind = OBJECT_DATABASE;
- n->name = $3;
+ n->concurrent = $3;
+ n->name = $4;
n->relation = NULL;
n->do_system = true;
n->do_user = false;
$$ = (Node *)n;
}
- | REINDEX DATABASE name opt_force
+ | REINDEX DATABASE opt_concurrently name opt_force
{
ReindexStmt *n = makeNode(ReindexStmt);
n->kind = OBJECT_DATABASE;
- n->name = $3;
+ n->concurrent = $3;
+ n->name = $4;
n->relation = NULL;
n->do_system = true;
n->do_user = true;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 422911c..4da9191 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -745,16 +745,20 @@ standard_ProcessUtility(Node *parsetree,
{
ReindexStmt *stmt = (ReindexStmt *) parsetree;
+ if (stmt->concurrent)
+ PreventTransactionChain(isTopLevel,
+ "REINDEX CONCURRENTLY");
+
/* we choose to allow this during "read only" transactions */
PreventCommandDuringRecovery("REINDEX");
switch (stmt->kind)
{
case OBJECT_INDEX:
- ReindexIndex(stmt->relation);
+ ReindexIndex(stmt->relation, stmt->concurrent);
break;
case OBJECT_TABLE:
case OBJECT_MATVIEW:
- ReindexTable(stmt->relation);
+ ReindexTable(stmt->relation, stmt->concurrent);
break;
case OBJECT_DATABASE:
@@ -766,8 +770,8 @@ standard_ProcessUtility(Node *parsetree,
*/
PreventTransactionChain(isTopLevel,
"REINDEX DATABASE");
- ReindexDatabase(stmt->name,
- stmt->do_system, stmt->do_user);
+ ReindexDatabase(stmt->name, stmt->do_system,
+ stmt->do_user, stmt->concurrent);
break;
default:
elog(ERROR, "unrecognized object type: %d",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 66d80b5..4f2376f 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1643,6 +1643,23 @@ command_no_begin(const char *query)
return true;
if (wordlen == 10 && pg_strncasecmp(query, "tablespace", 10) == 0)
return true;
+ if (wordlen == 5 && (pg_strncasecmp(query, "index", 5) == 0 ||
+ pg_strncasecmp(query, "table", 5) == 0))
+ {
+ query += wordlen;
+ query = skip_white_space(query);
+ wordlen = 0;
+ while (isalpha((unsigned char) query[wordlen]))
+ wordlen += PQmblen(&query[wordlen], pset.encoding);
+
+ /*
+ * REINDEX [ TABLE | INDEX ] CONCURRENTLY are not allowed in
+ * xacts.
+ */
+ if (wordlen == 12 && pg_strncasecmp(query, "concurrently", 12) == 0)
+ return true;
+ }
+
return false;
}
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56dc688..7339bcd 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -144,7 +144,7 @@ static bool completion_case_sensitive; /* completion is case sensitive */
* 5) The list of attributes of the given table (possibly schema-qualified).
* 6/ The list of arguments to the given function (possibly schema-qualified).
*/
-#define COMPLETE_WITH_QUERY(query) \
+#define COMPLETE_WITH_QUERY(query) \
do { \
completion_charp = query; \
matches = completion_matches(text, complete_from_query); \
@@ -2261,7 +2261,9 @@ psql_completion(const char *text, int start, int end)
pg_strcasecmp(prev_wd, "ON") == 0)
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
/* If we have CREATE|UNIQUE INDEX <sth> CONCURRENTLY, then add "ON" */
- else if ((pg_strcasecmp(prev3_wd, "INDEX") == 0 ||
+ else if ((pg_strcasecmp(prev4_wd, "CREATE") == 0 ||
+ pg_strcasecmp(prev3_wd, "CREATE") == 0) &&
+ (pg_strcasecmp(prev3_wd, "INDEX") == 0 ||
pg_strcasecmp(prev2_wd, "INDEX") == 0) &&
pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
COMPLETE_WITH_CONST("ON");
@@ -3334,14 +3336,35 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH_LIST(list_REINDEX);
}
- else if (pg_strcasecmp(prev2_wd, "REINDEX") == 0)
+ else if (pg_strcasecmp(prev2_wd, "REINDEX") == 0 ||
+ pg_strcasecmp(prev3_wd, "REINDEX") == 0)
{
+ /* Complete REINDEX TABLE with a list of tables, and CONCURRENTLY */
if (pg_strcasecmp(prev_wd, "TABLE") == 0)
+ COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm,
+ " UNION SELECT 'CONCURRENTLY'");
+ /* Complete REINDEX TABLE CONCURRENTLY with a list of tables */
+ else if (pg_strcasecmp(prev2_wd, "TABLE") == 0 &&
+ pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tm, NULL);
+ /* Complete REINDEX TABLE with a list of indexes, and CONCURRENTLY */
else if (pg_strcasecmp(prev_wd, "INDEX") == 0)
+ COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes,
+ " UNION SELECT 'CONCURRENTLY'");
+ /* Complete REINDEX INDEX CONCCURRENTLY with a list if indexes */
+ else if (pg_strcasecmp(prev2_wd, "INDEX") == 0 &&
+ pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_indexes, NULL);
- else if (pg_strcasecmp(prev_wd, "SYSTEM") == 0 ||
- pg_strcasecmp(prev_wd, "DATABASE") == 0)
+ /* Complete REINDEX DATABASE with a list of databases, and CONCURRENTLY */
+ else if (pg_strcasecmp(prev_wd, "DATABASE") == 0)
+ COMPLETE_WITH_QUERY(Query_for_list_of_databases
+ " UNION SELECT 'CONCURRENTLY'");
+ /* Complete REINDEX DATABASE CONCURRENTLY with a list of databases */
+ else if (pg_strcasecmp(prev2_wd, "DATABASE") == 0 ||
+ pg_strcasecmp(prev_wd, "CONCURRENTLY") == 0)
+ COMPLETE_WITH_QUERY(Query_for_list_of_databases);
+ /* Complete REINDEX SYSTEM with a list of databases */
+ else if (pg_strcasecmp(prev_wd, "SYSTEM") == 0)
COMPLETE_WITH_QUERY(Query_for_list_of_databases);
}
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 083f4bd..6b3df50 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -95,6 +95,8 @@ extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
extern void FreeTupleDesc(TupleDesc tupdesc);
+extern void ResetTupleDescCache(TupleDesc tupdesc);
+
extern void IncrTupleDescRefCount(TupleDesc tupdesc);
extern void DecrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index c36a729..97f9d83 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -53,6 +53,7 @@ extern Oid index_create(Relation heapRelation,
Oid *classObjectId,
int16 *coloptions,
Datum reloptions,
+ TupleDesc tupdesc,
bool isprimary,
bool isconstraint,
bool deferrable,
@@ -61,7 +62,26 @@ extern Oid index_create(Relation heapRelation,
bool skip_build,
bool concurrent,
bool is_internal,
- bool if_not_exists);
+ bool if_not_exists,
+ bool is_reindex);
+
+extern Oid index_concurrent_create(Relation heapRelation,
+ Oid indOid,
+ char *concurrentName);
+
+extern void index_concurrent_build(Oid heapOid,
+ Oid indexOid,
+ bool isprimary);
+
+extern void index_concurrent_swap(Oid newIndexOid,
+ Oid oldIndexOid,
+ LOCKTAG locktag);
+
+extern void index_concurrent_set_dead(Oid heapOid,
+ Oid indexOid,
+ LOCKTAG locktag);
+
+extern void index_concurrent_drop(Oid indexOid);
extern void index_constraint_create(Relation heapRelation,
Oid indexRelationId,
diff --git a/src/include/commands/defrem.h b/src/include/commands/defrem.h
index 0ebdbc1..b988555 100644
--- a/src/include/commands/defrem.h
+++ b/src/include/commands/defrem.h
@@ -28,10 +28,11 @@ extern Oid DefineIndex(Oid relationId,
bool check_rights,
bool skip_build,
bool quiet);
-extern Oid ReindexIndex(RangeVar *indexRelation);
-extern Oid ReindexTable(RangeVar *relation);
+extern Oid ReindexIndex(RangeVar *indexRelation, bool concurrent);
+extern Oid ReindexTable(RangeVar *relation, bool concurrent);
extern Oid ReindexDatabase(const char *databaseName,
- bool do_system, bool do_user);
+ bool do_system, bool do_user, bool concurrent);
+extern bool ReindexRelationConcurrently(Oid relOid);
extern char *makeObjectName(const char *name1, const char *name2,
const char *label);
extern char *ChooseRelationName(const char *name1, const char *name2,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 3e4f815..b9484a0 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2729,6 +2729,7 @@ typedef struct ReindexStmt
const char *name; /* name of database to reindex */
bool do_system; /* include system tables in database case */
bool do_user; /* include user tables in database case */
+ bool concurrent; /* reindex concurrently? */
} ReindexStmt;
/* ----------------------
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
new file mode 100644
index 0000000..9e04169
--- /dev/null
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -0,0 +1,78 @@
+Parsed test spec with 3 sessions
+
+starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data
+
+aaaa
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+
+starting permutation: sel1 reindex upd2 ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data
+
+aaaa
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 reindex ins2 del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data
+
+aaaa
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 reindex del2 end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data
+
+aaaa
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 reindex end1 end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data
+
+aaaa
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end1: COMMIT;
+step end2: COMMIT;
+step reindex: <... completed>
+
+starting permutation: sel1 upd2 ins2 del2 end1 reindex end2
+step sel1: SELECT data FROM reind_con_tab WHERE id = 3;
+data
+
+aaaa
+step upd2: UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3;
+step ins2: INSERT INTO reind_con_tab(data) VALUES ('cccc');
+step del2: DELETE FROM reind_con_tab WHERE data = 'cccc';
+step end1: COMMIT;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step end2: COMMIT;
+step reindex: <... completed>
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 79a7956..451a415 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -21,6 +21,7 @@ test: delete-abort-savept-2
test: aborted-keyrevoke
test: multixact-no-deadlock
test: multixact-no-forget
+test: reindex-concurrently
test: propagate-lock-delete
test: nowait
test: nowait-2
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
new file mode 100644
index 0000000..eb59fe0
--- /dev/null
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -0,0 +1,40 @@
+# REINDEX CONCURRENTLY
+#
+# Ensure that concurrent operations work correctly when a REINDEX is performed
+# concurrently.
+
+setup
+{
+ CREATE TABLE reind_con_tab(id serial primary key, data text);
+ INSERT INTO reind_con_tab(data) VALUES ('aa');
+ INSERT INTO reind_con_tab(data) VALUES ('aaa');
+ INSERT INTO reind_con_tab(data) VALUES ('aaaa');
+ INSERT INTO reind_con_tab(data) VALUES ('aaaaa');
+}
+
+teardown
+{
+ DROP TABLE reind_con_tab;
+}
+
+session "s1"
+setup { BEGIN; }
+step "sel1" { SELECT data FROM reind_con_tab WHERE id = 3; }
+step "end1" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "upd2" { UPDATE reind_con_tab SET data = 'bbbb' WHERE id = 3; }
+step "ins2" { INSERT INTO reind_con_tab(data) VALUES ('cccc'); }
+step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
+step "end2" { COMMIT; }
+
+session "s3"
+step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+
+permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
+permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 26d883c..192eb14 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2818,3 +2818,60 @@ explain (costs off)
Index Cond: ((thousand = 1) AND (tenthous = 1001))
(2 rows)
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+NOTICE: table "concur_reindex_tab" has no indexes
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+NOTICE: table "concur_reindex_tab" has no indexes
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES (1, 'a');
+INSERT INTO concur_reindex_tab VALUES (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+ERROR: REINDEX CONCURRENTLY cannot run inside a transaction block
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+ERROR: concurrent reindex is not supported for shared relations
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+ERROR: concurrent reindex is not supported for catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+ERROR: cannot reindex system concurrently
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+Table "public.concur_reindex_tab"
+ Column | Type | Modifiers
+--------+---------+-----------
+ c1 | integer | not null
+ c2 | text |
+Indexes:
+ "concur_reindex_ind1" PRIMARY KEY, btree (c1)
+ "concur_reindex_ind3" UNIQUE, btree (abs(c1))
+ "concur_reindex_ind2" btree (c2)
+ "concur_reindex_ind4" btree (c1, c1, c2)
+Referenced by:
+ TABLE "concur_reindex_tab2" CONSTRAINT "concur_reindex_tab2_c1_fkey" FOREIGN KEY (c1) REFERENCES concur_reindex_tab(c1)
+
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index e08f35e..34e1c42 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -957,3 +957,45 @@ RESET enable_indexscan;
explain (costs off)
select * from tenk1 where (thousand, tenthous) in ((1,1001), (null,null));
+
+--
+-- Check behavior of REINDEX and REINDEX CONCURRENTLY
+--
+
+CREATE TABLE concur_reindex_tab (c1 int);
+-- REINDEX
+REINDEX TABLE concur_reindex_tab; -- notice
+REINDEX TABLE CONCURRENTLY concur_reindex_tab; -- notice
+ALTER TABLE concur_reindex_tab ADD COLUMN c2 text; -- add toast index
+-- Normal index with integer column
+CREATE UNIQUE INDEX concur_reindex_ind1 ON concur_reindex_tab(c1);
+-- Normal index with text column
+CREATE INDEX concur_reindex_ind2 ON concur_reindex_tab(c2);
+-- UNIQUE index with expression
+CREATE UNIQUE INDEX concur_reindex_ind3 ON concur_reindex_tab(abs(c1));
+-- Duplicate column names
+CREATE INDEX concur_reindex_ind4 ON concur_reindex_tab(c1, c1, c2);
+-- Create table for check on foreign key dependence switch with indexes swapped
+ALTER TABLE concur_reindex_tab ADD PRIMARY KEY USING INDEX concur_reindex_ind1;
+CREATE TABLE concur_reindex_tab2 (c1 int REFERENCES concur_reindex_tab);
+INSERT INTO concur_reindex_tab VALUES (1, 'a');
+INSERT INTO concur_reindex_tab VALUES (2, 'a');
+-- Check materialized views
+CREATE MATERIALIZED VIEW concur_reindex_matview AS SELECT * FROM concur_reindex_tab;
+REINDEX INDEX CONCURRENTLY concur_reindex_ind1;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+REINDEX TABLE CONCURRENTLY concur_reindex_matview;
+
+-- Check errors
+-- Cannot run inside a transaction block
+BEGIN;
+REINDEX TABLE CONCURRENTLY concur_reindex_tab;
+COMMIT;
+REINDEX TABLE CONCURRENTLY pg_database; -- no shared relation
+REINDEX TABLE CONCURRENTLY pg_class; -- no catalog relations
+REINDEX SYSTEM CONCURRENTLY postgres; -- not allowed for SYSTEM
+
+-- Check the relation status, there should not be invalid indexes
+\d concur_reindex_tab
+DROP MATERIALIZED VIEW concur_reindex_matview;
+DROP TABLE concur_reindex_tab, concur_reindex_tab2;
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers