Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-10-31 Thread Andreas Karlsson

Here is a rebased version of the patch.

Andreas
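As a quick sketch of the syntax this patch adds (relation names here are hypothetical; the option placement follows the synopsis in the diff below):

```sql
-- Hypothetical examples of the command form added by this patch:
REINDEX TABLE CONCURRENTLY my_table;
REINDEX INDEX CONCURRENTLY my_table_pkey;
REINDEX (VERBOSE) SCHEMA CONCURRENTLY public;
```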
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0ca2851e5..f8c59ea127 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,6 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
  Acquired by VACUUM (without FULL),
  ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  CREATE STATISTICS and
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 2e053c4c24..4019bad4c2 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
   An index build with the CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
-  convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  convenient to use REINDEX to rebuild them.
  
 
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 

 
+   
+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+

 VERBOSE
 
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is finished. This could have a severe effect if the system is
+a live production database. Very large tables can take many hours to be
+indexed, and even for smaller tables, an index rebuild can lock out writers
+for periods that are unacceptably long for a production system.
+   
+
+   
+PostgreSQL supports rebuilding indexes with minimum locking
+of writes.  This method is invoked by specifying the
+CONCURRENTLY option of REINDEX. When this option
+is used, PostgreSQL must perform two scans of the table
+for each index that needs to be rebuilt, and in addition it must wait for
+all existing transactions that could potentially use the index to
+terminate. This method requires more total work than a standard index
+rebuild and takes significantly longer to complete as it needs to wait
+for unfinished transactions that might modify the index. However, since
+it allows normal operations to continue while the index is rebuilt, this
+method is useful for rebuilding indexes in a production environment. Of
+course, the extra CPU, memory and I/O load imposed by the index rebuild
+may slow down other operations.
+   
+
+   
+The following steps occur in a concurrent index build, each in a separate
+transaction except when the new index definitions are created, where all
+the concurrent entries are created using only one transaction. Note that
+if there are multiple indexes to be rebuilt then each step loops through
+all the indexes we're rebuilding, using a separate transaction for each one.
+REINDEX CONCURRENTLY proceeds as follows when rebuilding
+indexes:
+
+
+ 
+  
+   A new temporary index definition is added into the catalog
+   pg_index. This definition will be used to replace the
+   old index. This step is done as a single transaction for all the indexes
+   involved in this process, meaning that if
+   REINDEX CONCURRENTLY is run on a table with multiple
+   indexes, all the catalog entries of the new indexes are created within a
+   single transaction. A SHARE UPDATE EXCLUSIVE lock at
+   session level is taken on the indexes being reindexed as well as their
+   parent table to prevent any schema modification while processing.
+  
+ 
+ 
+  
+   A first 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-08-31 Thread Andreas Karlsson
I have attached a new, rebased version of the patch with most of Banck's 
and some of your feedback incorporated. Thanks for the good feedback!


On 03/31/2017 08:27 AM, Michael Paquier wrote:

When running REINDEX SCHEMA CONCURRENTLY public on the regression
database I am bumping into a bunch of these warnings:
WARNING:  01000: snapshot 0x7fa5e640 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123
WARNING:  01000: snapshot 0x7fa5e640 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123


I failed to reproduce this. Do you have a reproducible test case?


+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+   int i;
+
+   for (i = 0; i < tupdesc->natts; i++)
+   tupdesc->attrs[i]->attcacheoff = -1;
+}
I think that it would be better to merge that with TupleDescInitEntry
to be sure that the initialization of a TupleDesc's attribute goes
through only one code path.


Sorry, but I am not sure I understand your suggestion. I do not like the 
ResetTupleDescCache function so all suggestions are welcome.

-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} [ CONCURRENTLY ] name
I am taking the war path with such a sentence... But what about adding
CONCURRENTLY to the list of options in parenthesis instead?


I have thought some about this myself and I do not care strongly either way.


- Explore the use of SQL-level interfaces to mark an index as inactive
at creation.
- Remove work done in changeDependencyForAll, and replace it by
something similar to what tablecmds.c does. There is I think here some
place for refactoring if that's not with CREATE TABLE LIKE. This
requires the same work of creating, renaming and dropping the old
triggers and constraints.


I am no fan of the current code duplication and how fragile it is, but I 
think these cases are sufficiently different to prevent meaningful code 
reuse. But it could just be me who is unfamiliar with that part of the code.



- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in many object locks being taken at each step.


I am still leaning towards my current tradeoff since waiting for all 
queries to stop using an index can take a lot of time and if you only 
have to do that once per table it would be a huge benefit under some 
workloads, and you can still reindex each index separately if you need to.


Andreas

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index dda0170886..c97944b2c9 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -926,7 +926,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
  Acquired by VACUUM (without FULL),
  ANALYZE, CREATE INDEX CONCURRENTLY,
- CREATE STATISTICS and
+ REINDEX CONCURRENTLY, CREATE STATISTICS and
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..4ef3a89a29 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
   An index build with the CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
-  convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  convenient to use REINDEX to rebuild them.
  
 
 
@@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 

 
+   
+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+

 VERBOSE
 
@@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-04-03 Thread Andreas Karlsson

On 04/03/2017 07:57 AM, Michael Paquier wrote:

On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson  wrote:

On 03/31/2017 08:27 AM, Michael Paquier wrote:

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in many object locks being taken at each step.


I am personally worried about the amount of time spent waiting for long running
transactions if you reindex per index rather than per relation, because when
you wait on long running transactions for one index, nothing prevents a new
long transaction from starting, which we will then have to wait for while
reindexing the next index. If your database has many long running
transactions, more time will be spent waiting than working.


Yup, I am not saying that one approach or the other is bad; both are
worth considering. It's a trade-off between waiting and potential manual
cleanup in the event of a failure.


Agreed, and which is worse probably depends heavily on your schema and 
workload.



I am marking this patch as returned with feedback, this won't get in
PG10. If I am freed from the SCRAM-related open items I'll try to give
another shot at implementing this feature before the first CF of PG11.


Thanks! I also think I will have time to work on this before the first CF.

Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-04-02 Thread Michael Paquier
On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson  wrote:
> Thanks for the feedback. I will look at it when I get the time.
>
> On 03/31/2017 08:27 AM, Michael Paquier wrote:
>>
>> - Do a per-index rebuild and not a per-relation rebuild for concurrent
>> indexing. Doing a per-relation reindex has the disadvantage that many
>> objects need to be created at the same time, and in the case of
>> REINDEX CONCURRENTLY time of the operation is not what matters, it is
>> how intrusive the operation is. Relations with many indexes would also
>> result in many object locks being taken at each step.
>
>
> I am personally worried about the amount of time spent waiting for long
> running transactions if you reindex per index rather than per relation,
> because when you wait on long running transactions for one index, nothing
> prevents a new long transaction from starting, which we will then have to
> wait for while reindexing the next index. If your database has many long
> running transactions, more time will be spent waiting than working.

Yup, I am not saying that one approach or the other is bad; both are
worth considering. It's a trade-off between waiting and potential manual
cleanup in the event of a failure.

> and doing the REINDEX per relation allows for flexibility
> since people can still explicitly reindex per index if they want to.

You have a point here.

I am marking this patch as returned with feedback, this won't get in
PG10. If I am freed from the SCRAM-related open items I'll try to give
another shot at implementing this feature before the first CF of PG11.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-31 Thread Andreas Karlsson

Thanks for the feedback. I will look at it when I get the time.

On 03/31/2017 08:27 AM, Michael Paquier wrote:

- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in many object locks being taken at each step.


I am personally worried about the amount of time spent waiting for long 
running transactions if you reindex per index rather than per relation, 
because when you wait on long running transactions for one index, nothing 
prevents a new long transaction from starting, which we will then have to 
wait for while reindexing the next index. If your database has many long 
running transactions, more time will be spent waiting than working.


Is the number of locks really a big deal compared to the other costs 
involved here? REINDEX does a lot of expensive things like starting 
transactions, taking snapshots, scanning large tables, building a new 
index, etc. The trade-off I see is between temporary disk usage and time 
spent waiting for transactions, and doing the REINDEX per relation 
allows for flexibility since people can still explicitly reindex per 
index if they want to.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-31 Thread Michael Paquier
On Thu, Mar 30, 2017 at 5:13 AM, Michael Banck
 wrote:
> On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote:
>> Spotted one of my TODO comments there so I have attached a patch where I
>> have cleaned up that function. I also fixed the code to properly support
>> triggers.
>
> I hope that Michael will post a full review as he worked on the code
> extensively, but here are some code comments, mostly on the
> comments (note that I'm not a native speaker, so I might be wrong on
> some of them as well):

Thanks, Michael. I have done a pass on it.

> [review comments]

Here are more comments:

+   
+REINDEX SYSTEM does not support
+CONCURRENTLY.
+   
It would be nice to mention that REINDEX SCHEMA pg_catalog is not
supported, or just say that concurrent reindexing of any system
catalog indexes is not supported.
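Assuming the patch rejects both forms, as this review asks the docs to state, the behavior would sketch as follows (database and schema names other than pg_catalog are illustrative):

```sql
REINDEX SYSTEM CONCURRENTLY postgres;    -- expected to fail: system catalogs
REINDEX SCHEMA CONCURRENTLY pg_catalog;  -- expected to fail for the same reason
REINDEX SCHEMA CONCURRENTLY public;      -- user schemas are allowed
```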

When running REINDEX SCHEMA CONCURRENTLY public on the regression
database I am bumping into a bunch of these warnings:
WARNING:  01000: snapshot 0x7fa5e640 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123
WARNING:  01000: snapshot 0x7fa5e640 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123

+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+   int i;
+
+   for (i = 0; i < tupdesc->natts; i++)
+   tupdesc->attrs[i]->attcacheoff = -1;
+}
I think that it would be better to merge that with TupleDescInitEntry
to be sure that the initialization of a TupleDesc's attribute goes
through only one code path.

+   /*
+* Copy contraint flags for old index. This is safe because the old index
+* guaranteed uniquness.
+*/
s/uniquness/uniqueness/ and s/contraint/constraint/.

+   /*
+* Move contstraints and triggers over to the new index
+*/
s/contstraints/constraints/.

-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM
} [ CONCURRENTLY ] name
I am taking the war path with such a sentence... But what about adding
CONCURRENTLY to the list of options in parenthesis instead?

With this patch, we are reaching the 9th boolean argument for
create_index(). Any opinions about refactoring that into a set of
bitwise flags? Fairly unrelated to this patch.

+   /*
+* Move all dependencies on the old index to the new
+*/
Sentence unfinished.

It has been years since I looked at this code (I wrote the majority of
it), but here is what I would explore if I were to work on it
for the next release cycle:
- Explore the use of SQL-level interfaces to mark an index as inactive
at creation.
- Remove work done in changeDependencyForAll, and replace it by
something similar to what tablecmds.c does. There is I think here some
place for refactoring if that's not with CREATE TABLE LIKE. This
requires the same work of creating, renaming and dropping the old
triggers and constraints.
- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in many object locks being taken at each step.
The first and second points require a bit of thought for sure, but in
the long term that would pay in maintenance if we don't reinvent the
wheel, or at least try not to.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-30 Thread Michael Paquier
On Mon, Mar 13, 2017 at 11:11 AM, Andreas Karlsson  wrote:
> On 03/02/2017 03:10 AM, Michael Paquier wrote:
>> There is a lot of mumbo-jumbo in the patch to create the exact same
>> index definition as the original one being reindexed, and that's a
>> huge maintenance burden for the future. You can blame me for that in
>> the current patch. I am wondering if it would not just be better to
>> generate a CREATE INDEX query string and then use the SPI to create
>> the index, and also do the following extensions at SQL level:
>> - Add a sort of WITH NO DATA clause where the index is created, so the
>> index is created empty, and is marked invalid and not ready.
>> - Extend pg_get_indexdef_string() with an optional parameter to
>> enforce the index name to something else, most likely it should be
>> extended with the WITH NO DATA/INVALID clause, which should just be a
>> storage parameter by the way.
>> By doing something like that what the REINDEX CONCURRENTLY code path
>> should just be careful about is that it chooses an index name that
>> avoids any conflicts.
>
>
> Hm, I am not sure how much that would help since a lot of the mumbo-jumbo is
> by necessity in the step where we move the constraints over from the old
> index to the new.

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-29 Thread Michael Banck
Hi,

I had a look at this.

On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote:
> Spotted one of my TODO comments there so I have attached a patch where I
> have cleaned up that function. I also fixed the code to properly support
> triggers.

The patch applies with quite a few offsets on top of current (2fd8685)
master; I have not verified that those are all ok.

Regression tests pass, also the included isolation tests.

I hope that Michael will post a full review as he worked on the code
extensively, but here are some code comments, mostly on the
comments (note that I'm not a native speaker, so I might be wrong on
some of them as well):

> diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
> index 3908ade37b..3449c0af73 100644
> --- a/doc/src/sgml/ref/reindex.sgml
> +++ b/doc/src/sgml/ref/reindex.sgml
> @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | 
> DATABASE | SYSTEM } An index build with the CONCURRENTLY option failed, leaving
>an invalid index. Such indexes are useless but it can be
>convenient to use REINDEX to rebuild them. Note that
> -  REINDEX will not perform a concurrent build. To build the
> -  index without interfering with production you should drop the index and
> -  reissue the CREATE INDEX CONCURRENTLY command.
> +  REINDEX will perform a concurrent build if 
> +  CONCURRENTLY is specified. To build the index without interfering
> +  with production you should drop the index and reissue either the
> +  CREATE INDEX CONCURRENTLY or REINDEX 
> CONCURRENTLY
> +  command. Indexes of toast relations can be rebuilt with 
> REINDEX
> +  CONCURRENTLY.

I think the "To build the index[...]" part should be rephrased, the
current diff makes it sound like you should drop the index first even if
you reindex concurrently. What about "Note that REINDEX will
only perform a concurrent build if  CONCURRENTLY is
specified"?

Anyway, this part is only about reindexing invalid indexes, so
mentioning that reindex is not concurrent, or the part about
create-index-concurrently-then-rename, only for this case is a bit
weird, but this is a pre-existing condition.
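The two recovery paths this doc paragraph contrasts, sketched with hypothetical names (the REINDEX form assumes the patch is applied):

```sql
-- Pre-patch: drop the invalid index and recreate it concurrently.
DROP INDEX accounts_email_idx;
CREATE INDEX CONCURRENTLY accounts_email_idx ON accounts (email);

-- With this patch: rebuild the invalid index in place.
REINDEX INDEX CONCURRENTLY accounts_email_idx;
```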

> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
> index 8d42a347ea..c40ac0b154 100644
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
>  /*
> + * index_concurrent_create_copy
> + *
> + * Create a concurrent index based on the definition of the one provided by
> + * caller that will be used for concurrent operations. The index is inserted
> + * into catalogs and needs to be built later on. This is called during
> + * concurrent reindex processing. The heap relation on which is based the 
> index
> + * needs to be closed by the caller.
> + */

That should be "The heap relation on which the index is based ..." I
think.

> + /*
> +  * We have to re-build the IndexInfo struct, since it was lost in
> +  * commit of transaction where this concurrent index was created
> +  * at the catalog level.
> +  */
> + indexInfo = BuildIndexInfo(indexRelation);

Looks like indexInfo starts with lowercase, but the comment above has
upper case `IndexInfo'.

> +/*
> + * index_concurrent_swap
> + *
> + * Swap name, dependencies and constraints of the old index over to the new
> + * index, while marking the old index as invalid and the new as valid.
> + */

The `while' looks slightly odd to me, ISTM this is just another
operation this function performs, whereas "while" makes it sound like
the marking happens concurrently; so maybe ". Also, mark the old index
as invalid[...]"?

> +void
> +index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
> +{

[...]

> + /*
> +  * Copy contraint flags for old index. This is safe because the old 
> index
> +  * guaranteed uniquness.
> +  */

"uniqueness".

> + /* Mark old index as valid and new is invalid as index_set_state_flags 
> */

"new as invalid". Also, this comment style is different to this one:

> + /*
> +  * Move contstraints and triggers over to the new index
> +  */

I guess the latter could be changed to a one-line comment as the former,
but maybe there is a deeper sense (locality of comment?) in this.

> + /* make a modifiable copy */

I think comments should start capitalized?


> +/*
> + * index_concurrent_drop
> + *
> + * Drop a single index concurrently as the last step of an index concurrent
> + * process. Deletion is done through performDeletion or dependencies of the
> + * index would not get dropped. At this point all the indexes are already
> + * considered as invalid and dead so they can be dropped without using any
> + * concurrent options as it is sure that they will not interact with other
> + * server sessions.
> + */

I'd write "as it is certain" instead of "as it is sure", but I can't
explain why. Maybe persons are sure, but situations 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-12 Thread Andreas Karlsson

On 03/13/2017 03:11 AM, Andreas Karlsson wrote:

I also fixed the code to properly support triggers.


And by "support triggers" I actually meant fixing the support for moving 
the foreign keys to the new index.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-12 Thread Andreas Karlsson

On 03/02/2017 03:10 AM, Michael Paquier wrote:

On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson  wrote:
+/*
+ * Copy contraint flags for old index. This is safe because the old index
+ * guaranteed uniquness.
+ */
+newIndexForm->indisprimary = oldIndexForm->indisprimary;
+oldIndexForm->indisprimary = false;
+newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+oldIndexForm->indisexclusion = false;
[...]
+deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+RelationRelationId, DEPENDENCY_AUTO);
+deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+ConstraintRelationId,
DEPENDENCY_INTERNAL);
+
+// TODO: pg_depend for old index?


Spotted one of my TODO comments there so I have attached a patch where I 
have cleaned up that function. I also fixed the code to properly 
support triggers.



There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to
enforce the index name to something else, most likely it should be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
By doing something like that what the REINDEX CONCURRENTLY code path
should just be careful about is that it chooses an index name that
avoids any conflicts.


Hm, I am not sure how much that would help since a lot of the mumbo-jumbo 
is by necessity in the step where we move the constraints over from the 
old index to the new.


Andreas
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 
  Acquired by VACUUM (without FULL),
- ANALYZE, CREATE INDEX CONCURRENTLY, and
+ ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
   convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  REINDEX will perform a concurrent build if 
+  CONCURRENTLY is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY
+  command. Indexes of toast relations can be rebuilt with REINDEX
+  CONCURRENTLY.
  
 
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 

+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+
+   
 VERBOSE
 
  
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-10 Thread Thomas Munro
On Fri, Mar 10, 2017 at 9:36 PM, Thomas Munro
 wrote:
> On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund  wrote:
>> Can you come up with a halfway realistic scenario why an index oid, not
>> a table, constraint, sequence oid, would be relied upon?
>
> Is there an implication for SIREAD locks?  Predicate locks on index
> pages include the index OID in the tag.

Ah, yes, but that is covered by a call to
TransferPredicateLocksToHeapRelation() in index_concurrent_set_dead().

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-10 Thread Thomas Munro
On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund  wrote:
> Can you come up with a halfway realistic scenario why an index oid, not
> a table, constraint, sequence oid, would be relied upon?

Is there an implication for SIREAD locks?  Predicate locks on index
pages include the index OID in the tag.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-09 Thread Jim Nasby

On 3/8/17 9:34 AM, Andreas Karlsson wrote:

Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.


Hm, I cannot think of any real life scenario where this will be an issue
based on my personal experience with PostgreSQL, but if you can think of
one please provide it. I will try to ponder some more on this myself.


The case I currently have is to allow tracking database objects similar 
to (but not the same) as how we track the objects that belong to an 
extension[1]. That currently depends on event triggers to keep names 
updated if they're changed, as well as making use of the reg* types. If 
an event trigger fired as part of the index rename (essentially treating 
it like an ALTER INDEX) then I should be able to work around that.


The ultimate reason for doing this is to provide something similar to 
extensions (create a bunch of database objects that are all bound 
together), but also similar to classes in OO languages (so you can have 
multiple instances).[2]


Admittedly, this is pretty off the beaten path and I certainly wouldn't 
hold up the patch because of it. I am hoping that it'd be fairly easy to 
fire an event trigger as if someone had just renamed the index.


1: https://github.com/decibel/object_reference
2: https://github.com/decibel/pg_classy
--
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-08 Thread Andreas Karlsson

On 03/08/2017 03:48 AM, Robert Haas wrote:

On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson  wrote:

And I would argue that this feature is useful to quite many people, based on my
experience running a semi-large database. Index bloat happens and without
REINDEX CONCURRENTLY it can be really annoying to solve, especially for
primary keys. Certainly more people have problems with index bloat than the
number of people who store index oids in their database.


Yeah, but that's not the only wart, I think.


The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated 
when adding new things which refer to an index oid.


Given how useful I find REINDEX CONCURRENTLY I think these warts are 
worth it given that the impact is quite limited. I am of course biased 
since if I did not believe this I would not pursue this solution in the 
first place.



For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level.  If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector.


This version of the patch no longer does that. For my use case 
escalating the lock would make this patch much less interesting. The 
highest lock level taken is the same one as the initial one (SHARE 
UPDATE EXCLUSIVE). At a high level, the current patch does (very 
simplified) this:


1. CREATE INDEX CONCURRENTLY ind_new;
2. Atomically move all dependencies from ind to ind_new, rename ind to 
ind_old, and rename ind_new to ind.

3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated, but no part 
escalates the lock level beyond what would be required by the steps 
for creating and dropping indexes concurrently.
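
The three steps map closely onto what users today hack together in 
userspace; here is a rough SQL sketch of that userspace equivalent 
(table/index names are hypothetical, and unlike the patch this version 
uses plain RENAME rather than swapping catalog entries atomically):

```sql
-- Userspace approximation of the three phases above.
-- Names (tab, col, ind, ...) are hypothetical; unlike the patch,
-- this cannot move constraint dependencies such as a primary key,
-- which is part of what REINDEX CONCURRENTLY is meant to solve.
CREATE INDEX CONCURRENTLY ind_new ON tab (col);

BEGIN;  -- the patch performs this swap atomically in step 2
ALTER INDEX ind RENAME TO ind_old;
ALTER INDEX ind_new RENAME TO ind;
COMMIT;

DROP INDEX CONCURRENTLY ind_old;
```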



Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.


Hm, I cannot think of any real life scenario where this will be an issue 
based on my personal experience with PostgreSQL, but if you can think of 
one please provide it. I will try to ponder some more on this myself.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-07 Thread Andres Freund
On 2017-03-07 21:48:23 -0500, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson  wrote:
> > And I would argue that this feature is useful to quite many people, based on my
> > experience running a semi-large database. Index bloat happens and without
> > REINDEX CONCURRENTLY it can be really annoying to solve, especially for
> > primary keys. Certainly more people have problems with index bloat than the
> > number of people who store index oids in their database.
> 
> Yeah, but that's not the only wart, I think.

I don't really see any other warts that don't correspond to CREATE/DROP
INDEX CONCURRENTLY.


> For example, I believe (haven't looked at this patch series in a
> while) that the patch takes a lock and later escalates the lock level.

It shouldn't* - that was required precisely because we had to switch the
relfilenodes when the oid stayed the same.  Otherwise in-progress index
lookups could end up using the wrong relfilenodes and/or switch in the
middle of a lookup.

* excepting the exclusive lock DROP INDEX CONCURRENTLY style dropping
  uses after marking the index as dead - but that shouldn't be much of a
  concern?


> Also, if by any chance you think (or use any software that thinks)
> that OIDs for system objects are a stable identifier, this will be the
> first case where that ceases to be true.

Can you come up with a halfway realistic scenario why an index oid, not
a table, constraint, sequence oid, would be relied upon?


> If the system is shut down or crashes or the session is killed, you'll
> be left with stray objects with names that you've never typed into the
> system.

Given how relatively few complaints we have about CIC's possibility of
ending up with invalid indexes - not that there are none - and it's
widespread usage, I'm not too concerned about this.

Greetings,

Andres Freund




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-07 Thread Robert Haas
On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson  wrote:
> And I would argue that this feature is useful to quite many people, based on my
> experience running a semi-large database. Index bloat happens and without
> REINDEX CONCURRENTLY it can be really annoying to solve, especially for
> primary keys. Certainly more people have problems with index bloat than the
> number of people who store index oids in their database.

Yeah, but that's not the only wart, I think.  For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level.  If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector.  Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-05 Thread Andreas Karlsson

On 03/05/2017 07:56 PM, Robert Haas wrote:

On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund  wrote:

I agree that'd it be nicer not to have this, but not having the feature at all 
is a lot worse than this wart.


I, again, give that a firm "maybe".  If the warts end up annoying 1%
of the users who try to use this feature, then you're right.  If they
end up making a substantial percentage of people who try to use this
feature give up on it, then we've added a bunch of complexity and
future code maintenance for little real gain.  I'm not ruling out the
possibility that you're 100% correct, but I'm not nearly as convinced
of that as you are.


I agree that these warts are annoying but I also think the limitations
can be explained pretty easily to users (e.g. by explaining it in the 
manual how REINDEX CONCURRENTLY creates a new index to replace the old 
one), and I think that is the important question when deciding whether a useful 
feature with warts should be merged or not. Does it make things 
substantially harder for the average user to grok?


And I would argue that this feature is useful to quite many people, based on my 
experience running a semi-large database. Index bloat happens and 
without REINDEX CONCURRENTLY it can be really annoying to solve, 
especially for primary keys. Certainly more people have problems with 
index bloat than the number of people who store index oids in their 
database.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-05 Thread Peter Geoghegan
On Sat, Mar 4, 2017 at 9:34 AM, Andres Freund  wrote:
> On March 4, 2017 1:16:56 AM PST, Robert Haas  wrote:
>>Maybe.  But it looks to me like this patch is going to have
>>considerably more than its share of user-visible warts, and I'm not
>>very excited about that.  I feel like what we ought to be doing is
>>keeping the index OID the same and changing the relfilenode to point
>>to a newly-created one, and I attribute our failure to make that
>>design work thus far to insufficiently aggressive hacking.
>
> We literally spent years and a lot of emails waiting for that to happen. 
> Users now hack up solutions like this in userspace, obviously a bad solution.
>
> I agree that'd it be nicer not to have this, but not having the feature at 
> all is a lot worse than this wart.

IMHO, REINDEX itself is implemented in a way that is conceptually
pure, and yet quite user hostile.

I tend to tell colleagues that ask about REINDEX something along the
lines of: Just assume that REINDEX is going to block out even SELECT
statements referencing the underlying table. It might not be that bad
for you in practice, but the details are arcane such that it might as
well be that simple most of the time. Even if you have time to listen
to me explain it all, which you clearly don't, you're still probably
not going to be able to apply what you've learned in a way that helps
you.

-- 
Peter Geoghegan




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-05 Thread Robert Haas
On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund  wrote:
> I agree that'd it be nicer not to have this, but not having the feature at 
> all is a lot worse than this wart.

I, again, give that a firm "maybe".  If the warts end up annoying 1%
of the users who try to use this feature, then you're right.  If they
end up making a substantial percentage of people who try to use this
feature give up on it, then we've added a bunch of complexity and
future code maintenance for little real gain.  I'm not ruling out the
possibility that you're 100% correct, but I'm not nearly as convinced
of that as you are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-04 Thread Andres Freund


On March 4, 2017 1:16:56 AM PST, Robert Haas  wrote:
>Maybe.  But it looks to me like this patch is going to have
>considerably more than its share of user-visible warts, and I'm not
>very excited about that.  I feel like what we ought to be doing is
>keeping the index OID the same and changing the relfilenode to point
>to a newly-created one, and I attribute our failure to make that
>design work thus far to insufficiently aggressive hacking.

We literally spent years and a lot of emails waiting for that to happen. Users 
now hack up solutions like this in userspace, obviously a bad solution.

I agree that'd it be nicer not to have this, but not having the feature at all 
is a lot worse than this wart.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-04 Thread Robert Haas
On Thu, Mar 2, 2017 at 11:48 AM, Andres Freund  wrote:
> On 2017-03-01 19:25:23 -0600, Jim Nasby wrote:
>> On 2/28/17 11:21 AM, Andreas Karlsson wrote:
>> > The only downside I can see to this approach is that we will no longer be
>> > able to reindex catalog tables concurrently, but in return it should be
>> > easier to confirm that this approach can be made to work.
>>
>> Another downside is any stored regclass fields will become invalid.
>> Admittedly that's a pretty unusual use case, but it'd be nice if there was
>> at least a way to let users fix things during the rename phase (perhaps via
>> an event trigger).
>
> I'm fairly confident that we don't want to invoke event triggers inside
> the CIC code...  I'm also fairly confident that between index oids
> stored somewhere being invalidated - what'd be a realistic use case of
> that - and not having reindex concurrently, just about everyone will
> choose the former.

Maybe.  But it looks to me like this patch is going to have
considerably more than its share of user-visible warts, and I'm not
very excited about that.  I feel like what we ought to be doing is
keeping the index OID the same and changing the relfilenode to point
to a newly-created one, and I attribute our failure to make that
design work thus far to insufficiently aggressive hacking.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-02 Thread Andreas Karlsson

On 03/02/2017 02:25 AM, Jim Nasby wrote:

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we will no longer be
able to reindex catalog tables concurrently, but in return it should be
easier to confirm that this approach can be made to work.


Another downside is any stored regclass fields will become invalid.
Admittedly that's a pretty unusual use case, but it'd be nice if there
was at least a way to let users fix things during the rename phase
(perhaps via an event trigger).


Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY 
issue event triggers seems strange to me. While it does create and drop 
indexes as part of its implementation, it is actually just an index 
maintenance job.


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-01 Thread Andres Freund
On 2017-03-01 19:25:23 -0600, Jim Nasby wrote:
> On 2/28/17 11:21 AM, Andreas Karlsson wrote:
> > The only downside I can see to this approach is that we will no longer be
> > able to reindex catalog tables concurrently, but in return it should be
> > easier to confirm that this approach can be made to work.
> 
> Another downside is any stored regclass fields will become invalid.
> Admittedly that's a pretty unusual use case, but it'd be nice if there was
> at least a way to let users fix things during the rename phase (perhaps via
> an event trigger).

I'm fairly confident that we don't want to invoke event triggers inside
the CIC code...  I'm also fairly confident that between index oids
stored somewhere being invalidated - what'd be a realistic use case of
that - and not having reindex concurrently, just about everyone will
choose the former.

Regards,

Andres




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-01 Thread Michael Paquier
On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson  wrote:
> For each table:
>
> 1. Create new indexes without populating them, and lock the tables and
> indexes for the session.

+/*
+ * Copy constraint flags from the old index. This is safe because the
+ * old index guaranteed uniqueness.
+ */
+newIndexForm->indisprimary = oldIndexForm->indisprimary;
+oldIndexForm->indisprimary = false;
+newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+oldIndexForm->indisexclusion = false;
[...]
+deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+RelationRelationId, DEPENDENCY_AUTO);
+deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+ConstraintRelationId,
DEPENDENCY_INTERNAL);
+
+// TODO: pg_depend for old index?
There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to
enforce the index name to something else, most likely it should be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
By doing something like that what the REINDEX CONCURRENTLY code path
should just be careful about is that it chooses an index name that
avoids any conflicts.
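
For reference, the SQL-visible pg_get_indexdef() already reconstructs a
complete CREATE INDEX statement, which is the starting point such an
SPI-based approach would build on. A sketch (the WITH NO DATA/INVALID
extensions described above do not exist yet, and the index name is
hypothetical):

```sql
-- Existing function: reconstructs the full index definition, e.g.
--   CREATE UNIQUE INDEX tab_pkey ON public.tab USING btree (id)
SELECT pg_get_indexdef('tab_pkey'::regclass);
-- The proposal is to generate such a string with a substituted index
-- name (plus a clause to create the index empty and invalid) and
-- execute it through the SPI, instead of rebuilding the definition
-- piece by piece in C.
```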
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-03-01 Thread Jim Nasby

On 2/28/17 11:21 AM, Andreas Karlsson wrote:

The only downside I can see to this approach is that we will no longer be
able to reindex catalog tables concurrently, but in return it should be
easier to confirm that this approach can be made to work.


Another downside is any stored regclass fields will become invalid. 
Admittedly that's a pretty unusual use case, but it'd be nice if there 
was at least a way to let users fix things during the rename phase 
(perhaps via an event trigger).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-28 Thread Andreas Karlsson

Hi,

Here is a third take on this feature, heavily based on Michael Paquier's 
2.0 patch. This time the patch does not attempt to preserve the index 
oids, but instead creates new indexes and moves all dependencies from 
the old indexes to the new before dropping the old ones. The only 
downside I can see to this approach is that we will no longer be able to 
reindex catalog tables concurrently, but in return it should be easier 
to confirm that this approach can be made to work.


This patch relies on that we can change the indisvalid flag of indexes 
transactionally, and as far as I can tell this is the case now that we 
have MVCC for the catalog updates.


The code does some extra intermediate commits when building the indexes 
to avoid long running transactions.


How REINDEX CONCURRENTLY operates:

For each table:

1. Create new indexes without populating them, and lock the tables and 
indexes for the session.


2. After waiting for all running transactions populate each index in a 
separate transaction and set them to ready.


3. After waiting again for all running transactions validate each index 
in a separate transaction (but not setting them to valid just yet).


4. Swap all dependencies over from each old index to the new index and 
rename the old and the new indexes (from the  to _ccold and 
_new to ), and set isprimary and isexclusion flags. Here we 
also mark the new indexes as valid and the old indexes as invalid.


5. After waiting for all running transactions we change each index from 
invalid to dead.


6. After waiting for all running transactions we drop each index.

7. Drop all session locks.
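
As I read the steps above, they drive the existing pg_index flags
through the following transitions (the column names are real catalog
columns; the step mapping is my interpretation of the description, and
"tab" is a hypothetical table name):

```sql
-- Inspect the flags the procedure manipulates:
--   indisready  - true once the index accepts writes (step 2)
--   indisvalid  - flipped in the step-4 swap (new true, old false)
--   indislive   - set false when the old index is marked dead (step 5)
SELECT c.relname, i.indisready, i.indisvalid, i.indislive
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'tab'::regclass;
```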

Andreas
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 
  Acquired by VACUUM (without FULL),
- ANALYZE, CREATE INDEX CONCURRENTLY, and
+ ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..3449c0af73 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
   convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  REINDEX will perform a concurrent build if 
+  CONCURRENTLY is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY
+  command. Indexes of toast relations can be rebuilt with REINDEX
+  CONCURRENTLY.
  
 
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 

+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+
+   
 VERBOSE
 
  
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is finished. This could have a severe effect if the system is
+a live production database. Very large tables can take many hours to be
+indexed, and even for smaller tables, an index rebuild can lock out writers
+for periods that are unacceptably long for a production system.
+   
+
+   
+PostgreSQL supports rebuilding indexes with minimum locking
+of writes.  This method is invoked by specifying the
+CONCURRENTLY option of 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-27 Thread Bruce Momjian
On Mon, Feb 27, 2017 at 05:31:21PM -0500, Tom Lane wrote:
> Michael Paquier  writes:
> > I don't object to the addition of this patch in next CF as this
> > presents no new concept. Still, given how complicated this patch is,
> > I was wondering if people are fine with including it in this last CF.
> 
> The March CF is already looking pretty daunting.  We can try to include
> this but I won't be too surprised if it gets punted to a future CF.

Yeah, that was my reaction too.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-27 Thread Tom Lane
Michael Paquier  writes:
> I don't object to the addition of this patch in next CF as this
> presents no new concept. Still, given how complicated this patch is,
> I was wondering if people are fine with including it in this last CF.

The March CF is already looking pretty daunting.  We can try to include
this but I won't be too surprised if it gets punted to a future CF.

regards, tom lane




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-27 Thread Michael Paquier
On Tue, Feb 28, 2017 at 5:29 AM, Bruce Momjian  wrote:
> On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote:
>> On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson  wrote:
>> > Thinking about this makes me wonder about why you decided to use a
>> > transaction per index in many of the steps rather than a transaction per
>> > step. Most steps should be quick. The only steps I think it makes sense to
>> > have a transaction per table are.
>>
>> I don't recall all the details to be honest :)
>>
>> > 1) When building indexes to avoid long running transactions.
>> > 2) When validating the new indexes, also to avoid long running 
>> > transactions.
>> >
>> > But when swapping the indexes or when dropping the old indexes I do not see
>> > any reason to not just use one transaction per step since we do not even
>> > have to wait for any locks (other than WaitForLockers which we just want to
>> > call once anyway since all indexes relate to the same table).
>>
>> Perhaps, this really needs a careful lookup.
>>
>> By the way, as this patch is showing up for the first time in this
>> development cycle, would it be allowed in the last commit fest? That's
>> not a patch in the easy category, far from that, but it does not
>> present a new concept.
>
> FYI, the thread started on 2013-11-15.

I don't object to the addition of this patch in next CF as this
presents no new concept. Still, given how complicated this patch is,
I was wondering if people are fine with including it in this last CF.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-27 Thread Bruce Momjian
On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote:
> On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson  wrote:
> > Thinking about this makes me wonder about why you decided to use a
> > transaction per index in many of the steps rather than a transaction per
> > step. Most steps should be quick. The only steps I think it makes sense to
> > have a transaction per table are.
> 
> I don't recall all the details to be honest :)
> 
> > 1) When building indexes to avoid long running transactions.
> > 2) When validating the new indexes, also to avoid long running transactions.
> >
> > But when swapping the indexes or when dropping the old indexes I do not see
> > any reason to not just use one transaction per step since we do not even
> > have to wait for any locks (other than WaitForLockers which we just want to
> > call once anyway since all indexes relate to the same table).
> 
> Perhaps, this really needs a careful lookup.
> 
> By the way, as this patch is showing up for the first time in this
> development cycle, would it be allowed in the last commit fest? That's
> not a patch in the easy category, far from that, but it does not
> present a new concept.

FYI, the thread started on 2013-11-15.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-17 Thread Michael Paquier
On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson  wrote:
> Thinking about this makes me wonder about why you decided to use a
> transaction per index in many of the steps rather than a transaction per
> step. Most steps should be quick. The only steps I think it makes sense to
> have a transaction per table are.

I don't recall all the details to be honest :)

> 1) When building indexes to avoid long running transactions.
> 2) When validating the new indexes, also to avoid long running transactions.
>
> But when swapping the indexes or when dropping the old indexes I do not see
> any reason to not just use one transaction per step since we do not even
> have to wait for any locks (other than WaitForLockers which we just want to
> call once anyway since all indexes relate to the same table).

Perhaps, this really needs a careful lookup.

By the way, as this patch is showing up for the first time in this
development cycle, would it be allowed in the last commit fest? That's
not a patch in the easy category, far from that, but it does not
present a new concept.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-17 Thread Andreas Karlsson

On 02/17/2017 01:53 PM, Andreas Karlsson wrote:

I am actually thinking about going the opposite direction (by reducing
the number of times we call WaitForLockers), because it is not just
about consuming transaction IDs, we also do not want to wait too many
times for transactions to commit. I am leaning towards only calling
WaitForLockersMultiple three times per table.

1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead
3. Between setting the old indexes to dead and dropping them

Right now my patch loops over the indexes in step 2 and 3 and waits for
lockers once per index. This seems rather wasteful.

I have thought about that the code might be cleaner if we just looped
over all indexes (and as a bonus the VERBOSE output would be more
obvious), but I do not think it is worth waiting for lockers all those
extra times.


Thinking about this makes me wonder why you decided to use a
transaction per index in many of the steps rather than a transaction per
step. Most steps should be quick. The only steps I think it makes sense
to have a transaction per table for are:


1) When building indexes to avoid long running transactions.

2) When validating the new indexes, also to avoid long running transactions.

But when swapping the indexes or when dropping the old indexes I do not 
see any reason to not just use one transaction per step since we do not 
even have to wait for any locks (other than WaitForLockers which we just 
want to call once anyway since all indexes relate to the same table).


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-17 Thread Andreas Karlsson

On 02/14/2017 04:56 AM, Michael Paquier wrote:

On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson  wrote:

On 02/13/2017 06:31 AM, Michael Paquier wrote:

Er, something like that as well, no?
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.


REINDEX (VERBOSE) currently prints one such line per index, which does not
really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
on a relation at the same time. It is not immediately obvious how this
should work. Maybe one such detail line per table?


Hard to recall this thing in detail after all this time, given that a
relation is reindexed by processing all the indexes once at each step.
Hm... What if ReindexRelationConcurrently() actually is refactored in
such a way that it processes all the steps for each index
individually? This way you can monitor the time it takes to build
completely each index, including its . This operation would consume
more transactions but in the event of a failure the amount of things
to clean up is really reduced particularly for relations with many
indexes. This would as well reduce VERBOSE to print one line per index
rebuilt.


I am actually thinking about going the opposite direction (by reducing 
the number of times we call WaitForLockers), because it is not just 
about consuming transaction IDs, we also do not want to wait too many 
times for transactions to commit. I am leaning towards only calling 
WaitForLockersMultiple three times per table.


1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead
3. Between setting the old indexes to dead and dropping them

Right now my patch loops over the indexes in step 2 and 3 and waits for 
lockers once per index. This seems rather wasteful.


I have thought that the code might be cleaner if we just looped
over all indexes (and as a bonus the VERBOSE output would be more 
obvious), but I do not think it is worth waiting for lockers all those 
extra times.
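
At the SQL level, the per-table work being discussed corresponds roughly
to what users script by hand today for a plain index (names illustrative;
REINDEX CONCURRENTLY would also keep constraints and dependencies intact,
which this manual sequence does not):

```sql
-- Manual equivalent for a plain, non-constraint index:
CREATE INDEX CONCURRENTLY aa_a_idx_new ON aa (a);
DROP INDEX CONCURRENTLY aa_a_idx;
ALTER INDEX aa_a_idx_new RENAME TO aa_a_idx;
```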


Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-13 Thread Michael Paquier
On Tue, Feb 14, 2017 at 12:56 PM, Michael Paquier
 wrote:
> This way you can monitor the time it takes to build
> completely each index, including its .

You can ignore the terms "including its" here.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-13 Thread Michael Paquier
On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson  wrote:
> On 02/13/2017 06:31 AM, Michael Paquier wrote:
>> Er, something like that as well, no?
>> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
>
> REINDEX (VERBOSE) currently prints one such line per index, which does not
> really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
> on a relation at the same time. It is not immediately obvious how this
> should work. Maybe one such detail line per table?

Hard to recall this thing in detail after all this time, given that a
relation is reindexed by processing all the indexes once at each step.
Hm... What if ReindexRelationConcurrently() actually is refactored in
such a way that it processes all the steps for each index
individually? This way you can monitor the time it takes to build
completely each index, including its . This operation would consume
more transactions but in the event of a failure the amount of things
to clean up is really reduced particularly for relations with many
indexes. This would as well reduce VERBOSE to print one line per index
rebuilt.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-13 Thread Andreas Karlsson

On 02/13/2017 06:31 AM, Michael Paquier wrote:

- What should we do with REINDEX DATABASE CONCURRENTLY and the system
catalog? I do not think we can reindex the system catalog concurrently
safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
them, reindex them while taking locks, or just error out?


System indexes cannot have their OIDs changed as they are used in
syscache lookups. So just logging a warning looks fine to me; that's the
price to pay to avoid taking an exclusive lock even for a short amount
of time.


Good idea, I think I will add one line of warning if it finds any system 
index in the schema.



- What level of information should be output in VERBOSE mode?


Er, something like that as well, no?
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.


REINDEX (VERBOSE) currently prints one such line per index, which does 
not really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all 
indexes on a relation at the same time. It is not immediately obvious 
how this should work. Maybe one such detail line per table?



This is a crasher:
create table aa (a int primary key);
reindex (verbose) schema concurrently public ;

For invalid indexes sometimes snapshots are still active (after
issuing the previous crash for example):
=# reindex (verbose) table concurrently aa;
WARNING:  XX002: cannot reindex concurrently invalid index
"public.aa_pkey_cct", skipping
LOCATION:  ReindexRelationConcurrently, indexcmds.c:2119
WARNING:  01000: snapshot 0x7fde12003038 still active


Thanks for testing the patch! The crash was caused by things being 
allocated in the wrong memory context when reindexing multiple tables 
and therefore freed on the first intermediate commit. I have created a 
new memory context to handle this in which I only allocate the lists 
which need to survive between transactions.


Hm, when writing the above I just realized why ReindexTable/ReindexIndex 
did not suffer from the same bug. It is because the first transaction 
there allocated in the PortalHeapMemory context which survives commit. I 
really need to look at if there is a clean way to handle memory contexts 
in my patch.


I also found the snapshot still active bug, it seems to have been caused 
by REINDEX TABLE CONCURRENTLY leaving an open snapshot which cannot be 
popped by PortalRunUtility().


Thanks again!
Andreas




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-12 Thread Michael Paquier
On Sun, Feb 12, 2017 at 6:44 AM, Andreas Karlsson  wrote:
> On 02/02/2015 03:10 PM, Andres Freund wrote:
>> I think we should instead just use the new index, repoint the
>> dependencies onto the new oid, and then afterwards, when dropping,
>> rename the new index onto the old one. That means the oid of the
>> index will change and some less than pretty grovelling around
>> dependencies, but it still seems preferable to what we're discussing
>> here otherwise.
>
> I think that sounds like a good plan. The oid change does not seem like
> too big a deal to me, especially since that is what users will get now too. Do
> you still think this is the right way to solve this?

That hurts mainly system indexes. Perhaps users with broken system
indexes are not going to care about concurrency anyway. Thinking now
about it I don't see how that would not work, but I did not think
deeply about this problem lately.

> I have attached my work in progress patch which implements and is very
> heavily based on Michael's previous work. There are some things left to do
> but I think I should have a patch ready for the next commitfest if people
> still like this type of solution.

Cool to see a rebase of this patch. It's been a long time...

> I also changed index_set_state_flags() to be transactional since I wanted
> the old index to become invalid at exactly the same time as the new becomes
> valid. From reviewing the code that seems like a safe change.
>
> A couple of bike shedding questions:
>
> - Is the syntax "REINDEX  CONCURRENTLY " ok?

Yeah, that's fine. At least that's what has been concluded in previous threads.

> - What should we do with REINDEX DATABASE CONCURRENTLY and the system
> catalog? I do not think we can reindex the system catalog concurrently
> safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
> them, reindex them while taking locks, or just error out?

System indexes cannot have their OIDs changed as they are used in
syscache lookups. So just logging a warning looks fine to me; that's the
price to pay to avoid taking an exclusive lock even for a short amount
of time.

> - What level of information should be output in VERBOSE mode?

Er, something like that as well, no?
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

> What remains to be implemented:
> - Support for exclusion constraints
> - Look more into how I handle constraints (currently the temporary index too
> seems to have the PRIMARY KEY flag)
> - Support for the VERBOSE flag
> - More testing to catch bugs

This is a crasher:
create table aa (a int primary key);
reindex (verbose) schema concurrently public ;

For invalid indexes sometimes snapshots are still active (after
issuing the previous crash for example):
=# reindex (verbose) table concurrently aa;
WARNING:  XX002: cannot reindex concurrently invalid index
"public.aa_pkey_cct", skipping
LOCATION:  ReindexRelationConcurrently, indexcmds.c:2119
WARNING:  01000: snapshot 0x7fde12003038 still active
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2017-02-11 Thread Andreas Karlsson

On 02/02/2015 03:10 PM, Andres Freund wrote:

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.


I think that sounds like a good plan. The oid change does not seem like
too big a deal to me, especially since that is what users will get now
too. Do you still think this is the right way to solve this?


I have attached my work in progress patch which implements and is very 
heavily based on Michael's previous work. There are some things left to 
do but I think I should have a patch ready for the next commitfest if 
people still like this type of solution.


I also changed index_set_state_flags() to be transactional since I 
wanted the old index to become invalid at exactly the same time as the 
new becomes valid. From reviewing the code that seems like a safe change.


A couple of bike shedding questions:

- Is the syntax "REINDEX  CONCURRENTLY " ok?

- What should we do with REINDEX DATABASE CONCURRENTLY and the system 
catalog? I do not think we can reindex the system catalog concurrently
safely, so what should REINDEX DATABASE do with the catalog indexes? 
Skip them, reindex them while taking locks, or just error out?


- What level of information should be output in VERBOSE mode?

What remains to be implemented:

- Support for exclusion constraints
- Look more into how I handle constraints (currently the temporary index 
too seems to have the PRIMARY KEY flag)

- Support for the VERBOSE flag
- More testing to catch bugs

Andreas

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 306def4a15..ca1aeca65f 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -923,7 +923,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 
  Acquired by VACUUM (without FULL),
- ANALYZE, CREATE INDEX CONCURRENTLY, and
+ ANALYZE, CREATE INDEX CONCURRENTLY,
+ REINDEX CONCURRENTLY,
  ALTER TABLE VALIDATE and other
  ALTER TABLE variants (for full details see
  ).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 3908ade37b..24464020cd 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
 
  
 
@@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving
   an invalid index. Such indexes are useless but it can be
   convenient to use REINDEX to rebuild them. Note that
-  REINDEX will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the CREATE INDEX CONCURRENTLY command.
+  REINDEX will perform a concurrent build if 
+  CONCURRENTLY is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY
+  command. Indexes of toast relations can be rebuilt with REINDEX
+  CONCURRENTLY.
  
 
 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 

+CONCURRENTLY
+
+ 
+  When this option is used, PostgreSQL will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+   see .
+ 
+
+   
+
+   
 VERBOSE
 
  
@@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } 
 
+  
+   Rebuilding Indexes Concurrently
+
+   
+   index
+   rebuilding concurrently
+   
+
+   
+Rebuilding an index can interfere with regular operation of a database.
+Normally PostgreSQL locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is finished. This could have a severe effect if the system is
+a live production database. Very large tables can take many hours to be
+indexed, and even for smaller tables, an index rebuild can lock out writers
+for periods that are unacceptably long for a production system.
+   
+
+   
+PostgreSQL supports rebuilding 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2015-02-03 Thread Robert Haas
On Mon, Feb 2, 2015 at 9:10 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
  On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
  If REINDEX cannot work without an exclusive lock, we should invent some
  other qualifier, like WITH FEWER LOCKS.
 
  What he said.

 But more to the point -- why, precisely, can't this work without an
 AccessExclusiveLock?  And can't we fix that instead of settling for
 something clearly inferior?

 So, here's an alternative approach of how to get rid of the AEL
 locks. They're required because we want to switch the relfilenodes
 around. I've pretty much no confidence in any of the schemes anybody has
 come up with to avoid that.

 So, let's not switch relfilenodes around.

 I think we should instead just use the new index, repoint the
 dependencies onto the new oid, and then afterwards, when dropping,
 rename the new index onto the old one. That means the oid of the
 index will change and some less than pretty grovelling around
 dependencies, but it still seems preferable to what we're discussing
 here otherwise.

 Does anybody see a fundamental problem with that approach?

I'm not sure whether that will work out, but it seems worth a try.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2015-02-02 Thread Andres Freund
On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
  On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
  If REINDEX cannot work without an exclusive lock, we should invent some
  other qualifier, like WITH FEWER LOCKS.
 
  What he said.
 
  But more to the point -- why, precisely, can't this work without an
  AccessExclusiveLock?  And can't we fix that instead of settling for
 something clearly inferior?

So, here's an alternative approach of how to get rid of the AEL
locks. They're required because we want to switch the relfilenodes
around. I've pretty much no confidence in any of the schemes anybody has
come up with to avoid that.

So, let's not switch relfilenodes around.

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.

Does anybody see a fundamental problem with that approach?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-12-30 Thread Michael Paquier
On Tue, Dec 23, 2014 at 5:54 PM, Oskari Saarenmaa o...@ohmu.fi wrote:

 If the short-lived lock is the only blocker for this feature at the
 moment could we just require an additional qualifier for CONCURRENTLY
 (FORCE?) until the lock can be removed, something like:
 =# [blah]

FWIW, I'd just keep only CONCURRENTLY with no fancy additional
keywords even if we cheat on it, as long as it is made clear in the
documentation that an exclusive lock is taken for a very short time,
far shorter than what a normal REINDEX would hold, btw.

 It's not optimal, but currently there's no way to reindex a primary key
 anywhere close to concurrently and a short lock would be a huge
 improvement over the current situation.
Yep.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-12-23 Thread Oskari Saarenmaa
13.11.2014, 23:50, Andres Freund wrote:
 On November 13, 2014 10:23:41 PM CET, Peter Eisentraut pete...@gmx.net 
 wrote:
 On 11/12/14 7:31 PM, Andres Freund wrote:
 Yes, it sucks. But it beats not being able to reindex a relation with a
 primary key (referenced by a fkey) without waiting several hours by a
 couple magnitudes. And that's the current situation.

 That's fine, but we have, for better or worse, defined CONCURRENTLY :=
 does not take exclusive locks.  Use a different adverb for an in-between
 facility.
 
 I think that's not actually a service to our users. They'll have to adapt 
 their scripts and knowledge when we get around to the more concurrent 
 version. What exactly CONCURRENTLY means is already not strictly defined and 
 differs between the actions.
 
 I'll note that DROP INDEX CONCURRENTLY actually already internally acquires
 an AEL lock. Although it's a bit harder to see the consequences of that.

If the short-lived lock is the only blocker for this feature at the
moment could we just require an additional qualifier for CONCURRENTLY
(FORCE?) until the lock can be removed, something like:

tmp# REINDEX INDEX CONCURRENTLY tmp_pkey;
ERROR:  REINDEX INDEX CONCURRENTLY is not fully concurrent; use REINDEX
INDEX CONCURRENTLY FORCE to perform reindex with a short-lived lock.

tmp=# REINDEX INDEX CONCURRENTLY FORCE tmp_pkey;
REINDEX

It's not optimal, but currently there's no way to reindex a primary key
anywhere close to concurrently and a short lock would be a huge
improvement over the current situation.
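
For reference, the closest manual approximation for a primary key today is
something like the following (table and constraint names illustrative; the
constraint swap still needs a brief exclusive lock, and the DROP CONSTRAINT
fails outright if foreign keys reference it):

```sql
-- Build a replacement unique index without blocking writers:
CREATE UNIQUE INDEX CONCURRENTLY tmp_pkey_new ON tmp (id);

-- Swap the constraint over to the new index (short exclusive lock):
BEGIN;
ALTER TABLE tmp DROP CONSTRAINT tmp_pkey;  -- errors if FKs reference it
ALTER TABLE tmp ADD CONSTRAINT tmp_pkey
  PRIMARY KEY USING INDEX tmp_pkey_new;    -- index is renamed to tmp_pkey
COMMIT;
```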

/ Oskari





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-18 Thread Michael Paquier
On Thu, Nov 13, 2014 at 10:25 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Andres Freund wrote:
 On 2014-11-12 18:23:38 -0500, Robert Haas wrote:

   The problem is that it's very hard to avoid the wrong index's
   relfilenode being used when swapping the relfilenodes between two
   indexes.
 
  How about storing both the old and new relfilenodes in the same pg_class 
  entry?

 That's quite a cool idea

 [think a bit]

 But I think it won't work realistically. We have a *lot* of
 infrastructure that refers to indexes using its primary key.

 Hmm, can we make the relmapper do this job instead of having another
 pg_class column?  Essentially the same sketch Robert proposed, instead
 we would initially set relfilenode=0 and have all onlookers use the
 relmapper to obtain the correct relfilenode; switching to the new
 relfilenode can be done atomically, and un-relmap the index once the
 process is complete.
 The difference from what Robert proposes is that the transient state is
 known to cause failures for anyone not prepared to deal with it, so it
 should be easy to spot what places need adjustment.

How would the failure handling actually work? Would we need some extra
process to remove the extra relfilenodes? Note that in the current
patch the temporary concurrent entry is kept as INVALID all the time,
giving the user a path to remove them with DROP INDEX even in the case
of invalid toast indexes in catalog pg_toast.
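
Such leftover invalid entries can be spotted from the catalogs with a query
along these lines (standard pg_index/pg_class columns, nothing patch-specific):

```sql
-- List all invalid indexes, e.g. *_cct leftovers from a failed rebuild:
SELECT n.nspname, c.relname
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;
```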

Note that I am on the side of using the exclusive lock when swapping
relfilenodes for now in any case, that's what pg_repack does by
renaming the indexes, and people use it.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-17 Thread Robert Haas
On Fri, Nov 14, 2014 at 11:47 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-11-13 11:41:18 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  But I think it won't work realistically. We have a *lot* of
  infrastructure that refers to indexes using its primary key. I don't
  think we want to touch all those places to also disambiguate on some
  other factor. All the relevant APIs are either just passing around oids
  or relcache entries.

 I'm not quite following this.  The whole point is to AVOID having two
 indexes.  You have one index which may at times have two sets of
 physical storage.

 Right. But how are we going to refer to these different relfilenodes?
 All the indexing infrastructure just uses oids and/or Relation pointers
 to refer to the index. How would you hand down the knowledge which of
 the relfilenodes is supposed to be used in some callchain?

If you've got a Relation, you don't need someone to tell you which
physical storage to use; you can figure that out for yourself by
looking at the Relation.  If you've got an OID, you're probably going
to go conjure up a Relation, and then you can do the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-14 Thread Andres Freund
On 2014-11-14 02:04:00 -0600, Jim Nasby wrote:
 On 11/13/14, 3:50 PM, Andres Freund wrote:
 Having been responsible for a site where downtime was a 6 figure
 dollar amount per hour, I've spent a LOT of time worrying about lock
 problems. The really big issue here isn't grabbing an exclusive lock;
 it's grabbing one at some random time when no one is there to actively
 monitor what's happening. (If you can't handle *any* exclusive locks,
 that also means you can never do an ALTER TABLE ADD COLUMN either.)

 With that in mind, would it be possible to set this up so that the
 time-consuming process of building the new index file happens first,
 and then (optionally) some sort of DBA action is required to actually
 do the relfilenode swap? I realize that's not the most elegant
 solution, but it's WAY better than this feature not hitting 9.5 and
 people having to hand-code a solution.

I don't think having a multi-step version of the feature and it not
making it into 9.5 are synonymous. And I really don't want to make it even
more complex before we have the basic version in.

I think a split like your:

 Possible syntax:
 REINDEX CONCURRENTLY -- Does what current patch does
 REINDEX CONCURRENT BUILD -- Builds new files
 REINDEX CONCURRENT SWAP -- Swaps new files in

could make sense, but it's really an additional feature ontop.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-14 Thread Andres Freund
On 2014-11-13 11:41:18 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund and...@2ndquadrant.com wrote:
  But I think it won't work realistically. We have a *lot* of
   infrastructure that refers to indexes using its primary key. I don't
  think we want to touch all those places to also disambiguate on some
  other factor. All the relevant APIs are either just passing around oids
  or relcache entries.
 
 I'm not quite following this.  The whole point is to AVOID having two
 indexes.  You have one index which may at times have two sets of
 physical storage.

Right. But how are we going to refer to these different relfilenodes?
All the indexing infrastructure just uses oids and/or Relation pointers
to refer to the index. How would you hand down the knowledge which of
the relfilenodes is supposed to be used in some callchain?

There's ugly solutions like having a flag like 'bool
rd_useotherfilenode' inside struct RelationData, but even ignoring the
ugliness I don't think that'd work well - what if some function called
inside the index code again starts an index lookup?

I think I might just not getting your idea here?

  And all the
  indexing infrastructure can't deal with that without having separate
  oids & relcache entries.

Hopefully explained above?

Greetings,

Andres Freund




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-13 Thread Robert Haas
On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund and...@2ndquadrant.com wrote:
 But I think it won't work realistically. We have a *lot* of
  infrastructure that refers to indexes using its primary key. I don't
 think we want to touch all those places to also disambiguate on some
 other factor. All the relevant APIs are either just passing around oids
 or relcache entries.

I'm not quite following this.  The whole point is to AVOID having two
indexes.  You have one index which may at times have two sets of
physical storage.

 There's also the problem that we'd really need two different pg_index
 rows to make things work. Alternatively we can duplicate the three
 relevant columns (indisvalid, indisready, indislive) in there for the
 different filenodes. But that's not entirely pretty.

I think what you would probably end up with is a single char or int2
column that defines the state of the index.  Certain states would be
valid only when relnewfilenode != 0.

 1. Take a snapshot.
 2. Index all the tuples in that snapshot.
 3. Publish the new relfilenode to an additional pg_class column,
 relnewfilenode or similar.
 4. Wait until everyone can see step #3.

 Here all backends need to update both indexes, right?

Yes.

 And all the
 indexing infrastructure can't deal with that without having separate
  oids & relcache entries.

Why not?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-13 Thread Peter Eisentraut
On 11/12/14 7:31 PM, Andres Freund wrote:
 Yes, it sucks. But it beats not being able to reindex a relation with a
 primary key (referenced by a fkey) without waiting several hours by a
 couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks.  Use a different adverb for an in-between
facility.





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-13 Thread Andres Freund
On November 13, 2014 10:23:41 PM CET, Peter Eisentraut pete...@gmx.net wrote:
On 11/12/14 7:31 PM, Andres Freund wrote:
 Yes, it sucks. But it beats not being able to reindex a relation with a
 primary key (referenced by a fkey) without waiting several hours by a
 couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks.  Use a different adverb for an in-between
facility.

I think that's not actually a service to our users. They'll have to adapt their 
scripts and knowledge when we get around to the more concurrent version. What 
exactly CONCURRENTLY means is already not strictly defined and differs between 
the actions.

I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an
AEL lock. Although it's a bit harder to see the consequences of that.

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Robert Haas
On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
 If REINDEX cannot work without an exclusive lock, we should invent some
 other qualifier, like WITH FEWER LOCKS.

What he said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Robert Haas
On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
 If REINDEX cannot work without an exclusive lock, we should invent some
 other qualifier, like WITH FEWER LOCKS.

 What he said.

But more to the point: why, precisely, can't this work without an
AccessExclusiveLock?  And can't we fix that instead of settling for
something clearly inferior?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Andres Freund
On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
  On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
  If REINDEX cannot work without an exclusive lock, we should invent some
  other qualifier, like WITH FEWER LOCKS.
 
  What he said.

I'm unconvinced. A *short* exclusive lock (just to update two pg_class
row), still gives most of the benefits of CONCURRENTLY. Also, I do think
we can get rid of that period in the not too far away future.

 But more to the point: why, precisely, can't this work without an
 AccessExclusiveLock?  And can't we fix that instead of settling for
 something clearly inferior?

It's nontrivial to fix, but I think we can fix it at some point. I just
think we should get the *major* part of the feature before investing
lots of time making it even better. There's *very* frequent questions
about having this. And people do really dangerous stuff (like manually
updating pg_class.relfilenode and such) to cope.

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Robert Haas
On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
  On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
  If REINDEX cannot work without an exclusive lock, we should invent some
  other qualifier, like WITH FEWER LOCKS.
 
  What he said.

 I'm unconvinced. A *short* exclusive lock (just to update two pg_class
 row), still gives most of the benefits of CONCURRENTLY.

I am pretty doubtful about that.  It's still going to require you to
wait for all transactions to drain out of the table while new ones are
blocked from entering.  Which sucks.  Unless all of your transactions
are very short, but that's not necessarily typical.

 The problem is that it's very hard to avoid the wrong index's
 relfilenode being used when swapping the relfilenodes between two
 indexes.

How about storing both the old and new relfilenodes in the same pg_class entry?

1. Take a snapshot.
2. Index all the tuples in that snapshot.
3. Publish the new relfilenode to an additional pg_class column,
relnewfilenode or similar.
4. Wait until everyone can see step #3.
5. Rescan the table and add any missing tuples to the index.
6. Set some flag in pg_class to mark the relnewfilenode as active and
relfilenode as not to be used for queries.
7. Wait until everyone can see step #6.
8. Set some flag in pg_class to mark relfilenode as not even to be opened.
9. Wait until everyone can see step #8.
10. Drop old relfilenode.
11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.

This is basically CREATE INDEX CONCURRENTLY (without the first step
where we out-wait people who might create now-invalid HOT chains,
because that can't arise with a REINDEX of an existing index) plus
DROP INDEX CONCURRENTLY.
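
A heavily hypothetical sketch of the catalog side of steps 3, 6, and 11, just
to make the proposal concrete. The relnewfilenode column and a state flag
(here called relswapstate) are what is being proposed in this message; neither
exists in pg_class today, and the oid value is made up:

```sql
-- Purely illustrative; relnewfilenode and relswapstate are hypothetical
-- pg_class columns sketched in this proposal, not real catalog columns.

-- Step 3: publish the freshly built filenode alongside the old one.
UPDATE pg_class SET relnewfilenode = 16390
 WHERE oid = 'my_index'::regclass;

-- Step 6: flag the new filenode as the one queries should now use.
UPDATE pg_class SET relswapstate = 'new-active'
 WHERE oid = 'my_index'::regclass;

-- Step 11: once nobody can still see the old filenode, fold it away.
UPDATE pg_class SET relfilenode = relnewfilenode, relnewfilenode = 0
 WHERE oid = 'my_index'::regclass;
```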

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Andres Freund
On 2014-11-12 18:23:38 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
  On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas robertmh...@gmail.com wrote:
   On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut pete...@gmx.net wrote:
   If REINDEX cannot work without an exclusive lock, we should invent some
   other qualifier, like WITH FEWER LOCKS.
  
   What he said.
 
  I'm unconvinced. A *short* exclusive lock (just to update two pg_class
  row), still gives most of the benefits of CONCURRENTLY.
 
 I am pretty doubtful about that.  It's still going to require you to
 wait for all transactions to drain out of the table while new ones are
 blocked from entering.  Which sucks.  Unless all of your transactions
 are very short, but that's not necessarily typical.

Yes, it sucks. But it beats not being able to reindex a relation with a
primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

  The problem is that it's very hard to avoid the wrong index's
  relfilenode being used when swapping the relfilenodes between two
  indexes.
 
 How about storing both the old and new relfilenodes in the same pg_class 
 entry?

That's quite a cool idea

[think a bit]

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using its primary key. I don't
think we want to touch all those places to also disambiguate on some
other factor. All the relevant APIs are either just passing around oids
or relcache entries.

There's also the problem that we'd really need two different pg_index
rows to make things work. Alternatively we can duplicate the three
relevant columns (indisvalid, indisready, indislive) in there for the
different filenodes. But that's not entirely pretty.

 1. Take a snapshot.
 2. Index all the tuples in that snapshot.
 3. Publish the new relfilenode to an additional pg_class column,
 relnewfilenode or similar.
 4. Wait until everyone can see step #3.

Here all backends need to update both indexes, right? And all the
indexing infrastructure can't deal with that without having separate
oids & relcache entries.

 5. Rescan the table and add any missing tuples to the index.
 6. Set some flag in pg_class to mark the relnewfilenode as active and
 relfilenode as not to be used for queries.
 7. Wait until everyone can see step #6.
 8. Set some flag in pg_class to mark relfilenode as not even to be opened.
 9. Wait until everyone can see step #8.
 10. Drop old relfilenode.
 11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.

Even that one isn't trivial - how do you deal with the fact that
somebody looking at updating newrelfilenode might, in the midst of
processing, see newrelfilenode = 0?


I've earlier come up with a couple possible solutions, but I
unfortunately found holes in all of them. And if I can find holes in
them, there surely are more :(.

I don't recall what the problem with just swapping the names was - but
I'm pretty sure there was one... Hm. The index relation oids are
referred to by constraints and dependencies. That's somewhat
solvable. But I think there was something else as well...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Alvaro Herrera
Andres Freund wrote:
 On 2014-11-12 18:23:38 -0500, Robert Haas wrote:

   The problem is that it's very hard to avoid the wrong index's
   relfilenode being used when swapping the relfilenodes between two
   indexes.
  
  How about storing both the old and new relfilenodes in the same pg_class 
  entry?
 
 That's quite a cool idea
 
 [think a bit]
 
 But I think it won't work realistically. We have a *lot* of
 infrastructure that refers to indexes using its primary key.

Hmm, can we make the relmapper do this job instead of having another
pg_class column?  Essentially the same sketch Robert proposed, instead
we would initially set relfilenode=0 and have all onlookers use the
relmapper to obtain the correct relfilenode; switching to the new
relfilenode can be done atomically, and un-relmap the index once the
process is complete.

The difference from what Robert proposes is that the transient state is
known to cause failures for anyone not prepared to deal with it, so it
should be easy to spot what places need adjustment.

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Michael Paquier
On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund and...@2ndquadrant.com wrote:
 I don't recall what the problem with just swapping the names was - but
 I'm pretty sure there was one... Hm. The index relation oids are
 referred to by constraints and dependencies. That's somewhat
 solvable. But I think there was something else as well...
The reason given 2 years ago for not using relname was the fact that
the oid of the index changes, and that it could be referred to by some
pg_depend entries:
http://www.postgresql.org/message-id/20121208133730.ga6...@awork2.anarazel.de
http://www.postgresql.org/message-id/12742.1354977...@sss.pgh.pa.us
Regards,
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-12 Thread Michael Paquier
On Thu, Nov 13, 2014 at 10:26 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund and...@2ndquadrant.com wrote:
 I don't recall what the problem with just swapping the names was - but
 I'm pretty sure there was one... Hm. The index relation oids are
 referred to by constraints and dependencies. That's somewhat
 solvable. But I think there was something else as well...
 The reason given 2 years ago for not using relname was the fast that
 the oid of the index changes, and to it be refered by some pg_depend
 entries:
Feel free to correct: and that it could be referred.
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-10 Thread Jeff Janes
On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Updated patch is attached.
 Please find attached an updated patch with the following things changed:
 - Addition of tab completion in psql for all new commands
 - Addition of a call to WaitForLockers in index_concurrent_swap to
 ensure that there are no running transactions on the parent table
 running before exclusive locks are taken on the index and its
 concurrent entry. Previous patch versions created deadlocks because of
 that, issue spotted by the isolation tests integrated in the patch.
 - Isolation tests for reindex concurrently are re-enabled by default.
 Regards,


It looks like this needs another rebase, I get failures
on index.c, toasting.c, indexcmds.c, and index.h

Thanks,

Jeff


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-10 Thread Michael Paquier
On Tue, Nov 11, 2014 at 3:24 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier michael.paqu...@gmail.com 
 wrote:

 On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Updated patch is attached.
 Please find attached an updated patch with the following things changed:
 - Addition of tab completion in psql for all new commands
 - Addition of a call to WaitForLockers in index_concurrent_swap to
 ensure that there are no running transactions on the parent table
 running before exclusive locks are taken on the index and its
 concurrent entry. Previous patch versions created deadlocks because of
 that, issue spotted by the isolation tests integrated in the patch.
 - Isolation tests for reindex concurrently are re-enabled by default.
 Regards,


 It looks like this needs another rebase, I get failures on index.c, 
 toasting.c, indexcmds.c, and index.h

Indeed. There are some conflicts created by the recent modification of
index_create. Here is a rebased patch.
-- 
Michael
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 para
  Acquired by commandVACUUM/command (without optionFULL/option),
- commandANALYZE/, commandCREATE INDEX CONCURRENTLY/, and
+ commandANALYZE/, commandCREATE INDEX CONCURRENTLY/,
+ commandREINDEX CONCURRENTLY/,
  commandALTER TABLE VALIDATE/command and other
  commandALTER TABLE/command variants (for full details see
  xref linkend=SQL-ALTERTABLE).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
sect2 id=locking-pages
 titlePage-level Locks/title
-  
+
 para
  In addition to table and row locks, page-level share/exclusive locks are
  used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..285f3ff 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  refsynopsisdiv
 synopsis
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] replaceable class=PARAMETERname/replaceable [ FORCE ]
 /synopsis
  /refsynopsisdiv
 
@@ -68,9 +68,12 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
   An index build with the literalCONCURRENTLY/ option failed, leaving
   an quoteinvalid/ index. Such indexes are useless but it can be
   convenient to use commandREINDEX/ to rebuild them. Note that
-  commandREINDEX/ will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the commandCREATE INDEX CONCURRENTLY/ command.
+  commandREINDEX/ will perform a concurrent build if literal
+  CONCURRENTLY/ is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  commandCREATE INDEX CONCURRENTLY/ or commandREINDEX CONCURRENTLY/
+  command. Indexes of toast relations can be rebuilt with commandREINDEX
+  CONCURRENTLY/.
  /para
 /listitem
 
@@ -139,6 +142,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
/varlistentry
 
varlistentry
+termliteralCONCURRENTLY/literal/term
+listitem
+ para
+  When this option is used, productnamePostgreSQL/ will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+  mdash; see xref linkend=SQL-REINDEX-CONCURRENTLY
+  endterm=SQL-REINDEX-CONCURRENTLY-title.
+ /para
+/listitem
+   /varlistentry
+
+   varlistentry
 termliteralFORCE/literal/term
 listitem
  para
@@ -218,6 +236,194 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
reindex anything.
   /para
 
+  refsect2 id=SQL-REINDEX-CONCURRENTLY
+   title id=SQL-REINDEX-CONCURRENTLY-titleRebuilding Indexes Concurrently/title
+
+   indexterm zone=SQL-REINDEX-CONCURRENTLY
+   primaryindex/primary
+   secondaryrebuilding concurrently/secondary
+   /indexterm
+
+   para
+Rebuilding an index can interfere with regular operation of a database.
+Normally productnamePostgreSQL/ locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+  

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-06 Thread Peter Eisentraut
On 10/1/14 3:00 AM, Michael Paquier wrote:
 - Use of AccessExclusiveLock when swapping relfilenodes of an index and
 its concurrent entry instead of ShareUpdateExclusiveLock for safety. At
 the limit of my understanding, that's the consensus reached until now.

I'm very curious about this point.  I looked through all the previous
discussions, and the only time I saw this mentioned was at the very
beginning when it was said that we could review the patch while ignoring
this issue and fix it later with MVCC catalog access.  Then it got very
technical, but it was never explicitly concluded whether it was possible
to fix this or not.

Also, in the thread "Concurrently option for reindexdb" you pointed out
that requiring an exclusive lock isn't really concurrent and proposed an
option like --minimum-locks.

I will point out again that we specifically invented DROP INDEX
CONCURRENTLY because holding an exclusive lock even briefly isn't good
enough.

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.  It's still useful, but we
shouldn't give people the idea that they can throw away their custom
CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY scripts.
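
(For reference, such a custom script is typically some variant of the
following. The table, index, and column names are made up, and note that this
dance cannot be used for an index backing a PRIMARY KEY or UNIQUE constraint
that is referenced by a foreign key, which is exactly the case REINDEX
CONCURRENTLY is meant to cover:)

```sql
-- Illustrative manual rebuild; my_table/my_idx/some_col are made-up names.
CREATE INDEX CONCURRENTLY my_idx_new ON my_table (some_col);
DROP INDEX CONCURRENTLY my_idx;          -- drop the bloated original
ALTER INDEX my_idx_new RENAME TO my_idx; -- rename needs only a brief lock
```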





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-11-05 Thread Michael Paquier
On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Updated patch is attached.
Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no running transactions on the parent table
running before exclusive locks are taken on the index and its
concurrent entry. Previous patch versions created deadlocks because of
that, issue spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,
-- 
Michael
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 para
  Acquired by commandVACUUM/command (without optionFULL/option),
- commandANALYZE/, commandCREATE INDEX CONCURRENTLY/, and
+ commandANALYZE/, commandCREATE INDEX CONCURRENTLY/,
+ commandREINDEX CONCURRENTLY/,
  commandALTER TABLE VALIDATE/command and other
  commandALTER TABLE/command variants (for full details see
  xref linkend=SQL-ALTERTABLE).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
sect2 id=locking-pages
 titlePage-level Locks/title
-  
+
 para
  In addition to table and row locks, page-level share/exclusive locks are
  used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..285f3ff 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  refsynopsisdiv
 synopsis
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] replaceable class=PARAMETERname/replaceable [ FORCE ]
 /synopsis
  /refsynopsisdiv
 
@@ -68,9 +68,12 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
   An index build with the literalCONCURRENTLY/ option failed, leaving
   an quoteinvalid/ index. Such indexes are useless but it can be
   convenient to use commandREINDEX/ to rebuild them. Note that
-  commandREINDEX/ will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the commandCREATE INDEX CONCURRENTLY/ command.
+  commandREINDEX/ will perform a concurrent build if literal
+  CONCURRENTLY/ is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  commandCREATE INDEX CONCURRENTLY/ or commandREINDEX CONCURRENTLY/
+  command. Indexes of toast relations can be rebuilt with commandREINDEX
+  CONCURRENTLY/.
  /para
 /listitem
 
@@ -139,6 +142,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
/varlistentry
 
varlistentry
+termliteralCONCURRENTLY/literal/term
+listitem
+ para
+  When this option is used, productnamePostgreSQL/ will rebuild the
+  index without taking any locks that prevent concurrent inserts,
+  updates, or deletes on the table; whereas a standard reindex build
+  locks out writes (but not reads) on the table until it's done.
+  There are several caveats to be aware of when using this option
+  mdash; see xref linkend=SQL-REINDEX-CONCURRENTLY
+  endterm=SQL-REINDEX-CONCURRENTLY-title.
+ /para
+/listitem
+   /varlistentry
+
+   varlistentry
 termliteralFORCE/literal/term
 listitem
  para
@@ -218,6 +236,194 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
reindex anything.
   /para
 
+  refsect2 id=SQL-REINDEX-CONCURRENTLY
+   title id=SQL-REINDEX-CONCURRENTLY-titleRebuilding Indexes Concurrently/title
+
+   indexterm zone=SQL-REINDEX-CONCURRENTLY
+   primaryindex/primary
+   secondaryrebuilding concurrently/secondary
+   /indexterm
+
+   para
+Rebuilding an index can interfere with regular operation of a database.
+Normally productnamePostgreSQL/ locks the table whose index is rebuilt
+against writes and performs the entire index build with a single scan of the
+table. Other transactions can still read the table, but if they try to
+insert, update, or delete rows in the table they will block until the
+index rebuild is finished. This could have a severe effect if the system is
+a live production database. Very large tables can take many hours to be
+indexed, and even for smaller tables, an index rebuild can lock out writers
+for periods that are unacceptably long for a production system.
+   /para

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-10-30 Thread Michael Paquier
Thanks for your input, Jim!

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 Patch applies against current HEAD and builds, but I'm getting 37 failed
 tests (mostly parallel, but also misc and WITH; results attached). Is that
 expected?
This is caused by the recent commit 7b1c2a0 (that I actually participated
in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

 The "mark the concurrent index" bit is rather confusing; it sounds like
it's
 referring to the new index instead of the old. Now that I've read the
code I
 understand what's going on here between the concurrent index *entry* and
the
 filenode swap, but I don't think the docs make this sufficiently clear to
 users.

 How about something like this:

 The following steps occur in a concurrent index build, each in a separate
 transaction. Note that if there are multiple indexes to be rebuilt then
each
 step loops through all the indexes we're rebuilding, using a separate
 transaction for each one.

 1. [blah]
Definitely a good idea! I took your text and made it more precise, listing
the actions done for each step, the pg_index flags switched, using
orderedlist to make the list of steps described in a separate paragraph
more exhaustive for the user. At the same time I reworked the docs removing
a part that was somewhat duplicated about dealing with the constraints
having invalid index entries and how to drop them.

 + * index_concurrent_create
 + *
 + * Create an index based on the given one that will be used for
concurrent
 + * operations. The index is inserted into catalogs and needs to be built
 later
 + * on. This is called during concurrent index processing. The heap
relation
 + * on which is based the index needs to be closed by the caller.

 Last bit presumably should be on which the index is based.
What about Create a concurrent index based on the definition of the one
provided by caller?

 +   /* Build the list of column names, necessary for index_create */
 Instead of all this work wouldn't it be easier to create a version of
 index_create/ConstructTupleDescriptor that will use the IndexInfo for the
 old index? ISTM index_concurrent_create() is doing a heck of a lot of work
 to marshal data into one form just to have it get marshaled yet again.
Worst
 case, if we do have to play this game, there should be a stand-alone
 function to get the columns/expressions for an existing index; you're
 duplicating a lot of code from pg_get_indexdef_worker().
Yes, this definitely sucks, and the approach of creating a function to get all
the column names is not productive either. Then let's define an additional
argument in index_create to pass a potential TupleDesc instead of this
whole wart. I noticed as well that we need to actually reset attcacheoff to
be able to use a fresh version of the tuple descriptor of the old index. I
added a small API for this purpose in tupdesc.h called ResetTupleDescCache.
Would it make sense instead to extend CreateTupleDescCopyConstr or
CreateTupleDescCopy with a boolean flag?

 index_concurrent_swap(): Perhaps it'd be better to create
 index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
 refactor the duplicated code out... the actual function would then become:
This sentence is not finished :) IMO, index_concurrent_swap looks good as
is, taking as arguments the index and its concurrent entry, and swapping
their relfilenode after taking AccessExclusiveLock that will be held until
the end of this transaction.

 ReindexRelationConcurrently()

 + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can
be
 + * either an index or a table. If a table is specified, each step of
 REINDEX
 + * CONCURRENTLY is done in parallel with all the table's indexes as well
as
 + * its dependent toast indexes.
 This comment is a bit misleading; we're not actually doing anything in
 parallel, right? AFAICT index_concurrent_build is going to block while
each
 index is built the first time.
Yes, parallel may be misleading. What is meant here is that each step of
the process is done one by one on all the valid indexes a table may have.

 +* relkind is an index, this index itself will be rebuilt. The
locks
 taken
 +* parent relations and involved indexes are kept until this
 transaction
 +* is committed to protect against schema changes that might occur
 until
 +* the session lock is taken on each relation.

 This comment is a bit unclear to me... at minimum I think it should be *
on
 parent relations instead of * parent relations, but I think it needs to
 elaborate on why/when we're also taking session level locks.
Hum, done as follows:
@@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
 * If the relkind of given relation Oid is a table, all its valid
indexes
 * will be rebuilt, including its associated toast table indexes. If
 * relkind is an index, this index itself will be rebuilt. The
locks 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-10-30 Thread Jim Nasby

On 10/30/14, 3:19 AM, Michael Paquier wrote:

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby jim.na...@bluetreble.com 
mailto:jim.na...@bluetreble.com wrote:
  Patch applies against current HEAD and builds, but I'm getting 37 failed
  tests (mostly parallel, but also misc and WITH; results attached). Is that
  expected?
This is caused by the recent commit 7b1c2a0 (that I actually participated in 
reviewing :p) because of a missing inclusion of ruleutils.h in index.c.


Sorry, forgot to mention the patch now passes make check cleanly.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-10-30 Thread Jim Nasby

On 10/30/14, 3:19 AM, Michael Paquier wrote:

Thanks for your input, Jim!

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby jim.na...@bluetreble.com 
mailto:jim.na...@bluetreble.com wrote:
  Patch applies against current HEAD and builds, but I'm getting 37 failed
  tests (mostly parallel, but also misc and WITH; results attached). Is that
  expected?
This is caused by the recent commit 7b1c2a0 (that I actually participated in 
reviewing :p) because of a missing inclusion of ruleutils.h in index.c.



  The "mark the concurrent index" bit is rather confusing; it sounds like it's
  referring to the new index instead of the old. Now that I've read the code I
  understand what's going on here between the concurrent index *entry* and the
  filenode swap, but I don't think the docs make this sufficiently clear to
  users.
 
  How about something like this:
 
  The following steps occur in a concurrent index build, each in a separate
  transaction. Note that if there are multiple indexes to be rebuilt then each
  step loops through all the indexes we're rebuilding, using a separate
  transaction for each one.
 
  1. [blah]
Definitely a good idea! I took your text and made it more precise, listing the 
actions done for each step, the pg_index flags switched, using orderedlist to 
make the list of steps described in a separate paragraph more exhaustive for the 
user. At the same time I reworked the docs removing a part that was somewhat 
duplicated about dealing with the constraints having invalid index entries and how to 
drop them.


Awesome! Just a few items here:

+   Then a second pass is performed to add tuples that were added while
+   the first pass build was running. One the validation of the index

s/One the/Once the/


  + * index_concurrent_create
  + *
  + * Create an index based on the given one that will be used for concurrent
  + * operations. The index is inserted into catalogs and needs to be built
  later
  + * on. This is called during concurrent index processing. The heap relation
  + * on which is based the index needs to be closed by the caller.
 
  Last bit presumably should be on which the index is based.
What about "Create a concurrent index based on the definition of the one provided by 
caller"?


That's good too, but my comment was on the last sentence, not the first.


  +   /* Build the list of column names, necessary for index_create */
  Instead of all this work wouldn't it be easier to create a version of
  index_create/ConstructTupleDescriptor that will use the IndexInfo for the
  old index? ISTM index_concurrent_create() is doing a heck of a lot of work
  to marshal data into one form just to have it get marshaled yet again. Worst
  case, if we do have to play this game, there should be a stand-alone
  function to get the columns/expressions for an existing index; you're
  duplicating a lot of code from pg_get_indexdef_worker().
Yes, this definitely sucks, and the approach of creating a function to get all the 
column names is not productive either. So let's define an additional 
argument in index_create to pass a potential TupleDesc instead of this whole 
wart. I noticed as well that we need to actually reset attcacheoff to be able 
to use a fresh version of the old index's tuple descriptor. I added a 
small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it 
make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy 
with a boolean flag?


Perhaps there'd be other places that would want to reset the stats, so I lean 
slightly in that direction.

The comment at the beginning of index_create needs to be modified to mention 
tupdesc, and especially that setting tupdesc overrides indexColNames.


  index_concurrent_swap(): Perhaps it'd be better to create
  index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
  refactor the duplicated code out... the actual function would then become:
This sentence is not finished :) IMO, index_concurrent_swap looks good as is: 
it takes as arguments the index and its concurrent entry, and swaps their 
relfilenodes after taking an AccessExclusiveLock that is held until the end of 
the transaction.
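A minimal sketch of that swap step, assuming a single lock stands in for AccessExclusiveLock and plain objects stand in for the catalog entries (all names here are invented for illustration, not the actual C structures):

```python
# Hedged sketch of index_concurrent_swap: exchange the relfilenodes of
# the old index and its concurrent twin while holding an exclusive lock,
# so no one who also takes the lock can observe a half-swapped state.
import threading

class Index:
    def __init__(self, name, relfilenode):
        self.name = name
        self.relfilenode = relfilenode  # which on-disk storage this entry uses

_access_exclusive = threading.Lock()  # stands in for AccessExclusiveLock

def index_concurrent_swap(old, new):
    """Atomically (w.r.t. other lock takers) exchange on-disk storage of
    two index entries; in the backend the lock is held to end of transaction."""
    with _access_exclusive:
        old.relfilenode, new.relfilenode = new.relfilenode, old.relfilenode

old = Index("idx", 16384)
new = Index("idx_cct", 16900)
index_concurrent_swap(old, new)
# The old entry now points at the freshly built storage, and vice versa.
```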


Fair enough.


  ReindexRelationConcurrently()
 
  + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
  + * either an index or a table. If a table is specified, each step of
  REINDEX
  + * CONCURRENTLY is done in parallel with all the table's indexes as well as
  + * its dependent toast indexes.
  This comment is a bit misleading; we're not actually doing anything in
  parallel, right? AFAICT index_concurrent_build is going to block while each
  index is built the first time.
Yes, parallel may be misleading. What is meant here is that each step of the 
process is done one by one on all the valid indexes a table may have.


New comment looks good.


  +* relkind is an index, this index itself will be rebuilt. The locks
  taken
  +* parent relations 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-10-01 Thread Michael Paquier
On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  Can you add it to the next CF? I'll try to look earlier, but can't
  promise anything.
 
  I very much would like this to get committed in some form or another.
 Added it here to keep track of it:
 https://commitfest.postgresql.org/action/patch_view?id=1563

Attached is a fairly refreshed patch that should be used as a base for the
next commit fest. The following changes should be noted:
- Use of AccessExclusiveLock, instead of ShareUpdateExclusiveLock, when swapping
the relfilenodes of an index and its concurrent entry, for safety. As far as I
understand, that's the consensus reached so far.
- Cleanup of many comments, and a refresh of the documentation that was
poorly worded or shaped in some places
- Addition of support for autocommit off in psql for REINDEX [ TABLE |
INDEX ] CONCURRENTLY
- Some more code cleanup.
I haven't been through the tab completion support for psql, but looking at
tab-complete.c this seems a bit tricky given the stuff already present for
CREATE INDEX CONCURRENTLY. Nothing huge though.
Regards,
-- 
Michael
From 5e9125da5cbeb2f6265b68ff14cc70e4cb10d502 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 27 Aug 2014 10:56:02 +0900
Subject: [PATCH] Support for REINDEX CONCURRENTLY

Fully supported patch, with regression and isolation tests, including
documentation and support for autocommit 'off' in psql. This version
uses an exclusive lock when swapping relations for safety. psql tab
completion has not been added yet in this version as this is rather
independent from the core version and actually psql makes things a
bit tricky with CREATE INDEX CONCURRENTLY using the same keywords.
---
 doc/src/sgml/mvcc.sgml |   5 +-
 doc/src/sgml/ref/reindex.sgml  | 160 +++-
 src/backend/catalog/index.c| 476 ++--
 src/backend/catalog/toasting.c |   2 +-
 src/backend/commands/indexcmds.c   | 821 ++---
 src/backend/commands/tablecmds.c   |  33 +-
 src/backend/executor/execUtils.c   |  14 +
 src/backend/nodes/copyfuncs.c  |   1 +
 src/backend/nodes/equalfuncs.c |   1 +
 src/backend/parser/gram.y  |  17 +-
 src/backend/tcop/utility.c |  12 +-
 src/bin/psql/common.c  |  17 +
 src/include/catalog/index.h|  19 +-
 src/include/commands/defrem.h  |   7 +-
 src/include/nodes/parsenodes.h |   1 +
 .../isolation/expected/reindex-concurrently.out|  78 ++
 src/test/isolation/isolation_schedule  |   1 +
 src/test/isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out |  57 ++
 src/test/regress/sql/create_index.sql  |  42 ++
 20 files changed, 1615 insertions(+), 189 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index cd55be8..653b120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 para
  Acquired by commandVACUUM/command (without optionFULL/option),
- commandANALYZE/, commandCREATE INDEX CONCURRENTLY/, and
+ commandANALYZE/, commandCREATE INDEX CONCURRENTLY/,
+ commandREINDEX CONCURRENTLY/,
  commandALTER TABLE VALIDATE/command and other
  commandALTER TABLE/command variants (for full details see
  xref linkend=SQL-ALTERTABLE).
@@ -1143,7 +1144,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
sect2 id=locking-pages
 titlePage-level Locks/title
-  
+
 para
  In addition to table and row locks, page-level share/exclusive locks are
  used to control read/write access to table pages in the shared buffer
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..0b7a93c 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  refsynopsisdiv
 synopsis
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] replaceable class=PARAMETERname/replaceable [ FORCE ]
 /synopsis
  /refsynopsisdiv
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
   An index build with the literalCONCURRENTLY/ option failed, leaving
   an quoteinvalid/ 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-08-27 Thread Andres Freund
On 2014-08-27 11:00:56 +0900, Michael Paquier wrote:
 On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  I have realigned this patch with latest head (d2458e3)... In case
  someone is interested at some point...
 Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD
 (d7938a4), as some people are showing interest in it by reading recent
 discussions. Patch compiles and passes regression as well as isolation
 tests..

Can you add it to the next CF? I'll try to look earlier, but can't
promise anything.

I very much would like this to get committed in some form or another.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-08-27 Thread Michael Paquier
On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund and...@2ndquadrant.com wrote:
 Can you add it to the next CF? I'll try to look earlier, but can't
 promise anything.

 I very much would like this to get committed in some form or another.
Added it here to keep track of it:
https://commitfest.postgresql.org/action/patch_view?id=1563
Regards,
-- 
Michael




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-08-26 Thread Michael Paquier
On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 I have realigned this patch with latest head (d2458e3)... In case
 someone is interested at some point...
Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD
(d7938a4), as some people are showing interest in it by reading recent
discussions. Patch compiles and passes regression as well as isolation
tests..
-- 
Michael
From 944bc6cc2998ebac1cab11a91f55120a3452f3ed Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 27 Aug 2014 10:56:02 +0900
Subject: [PATCH] Support for REINDEX CONCURRENTLY

Fully supported patch, with regression and isolation tests, including
documentation. This version uses low-level lock when swapping relations
and suffers from deadlock problems when checking for old snapshots as
well.
---
 doc/src/sgml/mvcc.sgml |   4 +-
 doc/src/sgml/ref/reindex.sgml  | 169 -
 src/backend/catalog/index.c| 478 ++--
 src/backend/catalog/toasting.c |   2 +-
 src/backend/commands/indexcmds.c   | 820 ++---
 src/backend/commands/tablecmds.c   |  33 +-
 src/backend/executor/execUtils.c   |  14 +
 src/backend/nodes/copyfuncs.c  |   1 +
 src/backend/nodes/equalfuncs.c |   1 +
 src/backend/parser/gram.y  |  15 +-
 src/backend/tcop/utility.c |  12 +-
 src/include/catalog/index.h|  19 +-
 src/include/commands/defrem.h  |   7 +-
 src/include/nodes/parsenodes.h |   1 +
 .../isolation/expected/reindex-concurrently.out|  78 ++
 src/test/isolation/isolation_schedule  |   1 +
 src/test/isolation/specs/reindex-concurrently.spec |  40 +
 src/test/regress/expected/create_index.out |  57 ++
 src/test/regress/sql/create_index.sql  |  42 ++
 19 files changed, 1610 insertions(+), 184 deletions(-)
 create mode 100644 src/test/isolation/expected/reindex-concurrently.out
 create mode 100644 src/test/isolation/specs/reindex-concurrently.spec

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 0bbbc71..365a6dd 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -864,7 +864,9 @@ ERROR:  could not serialize access due to read/write dependencies among transact
 
 para
  Acquired by commandVACUUM/command (without optionFULL/option),
- commandANALYZE/, commandCREATE INDEX CONCURRENTLY/, and
+ commandANALYZE/, commandCREATE INDEX CONCURRENTLY/,
+ commandREINDEX CONCURRENTLY/,
+ and
  commandALTER TABLE VALIDATE/command and other
  commandALTER TABLE/command variants (for full details see
  xref linkend=SQL-ALTERTABLE).
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index cabae19..ea3410f 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  refsynopsisdiv
 synopsis
-REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ]
+REINDEX { INDEX | TABLE | DATABASE | SYSTEM } [ CONCURRENTLY ] replaceable class=PARAMETERname/replaceable [ FORCE ]
 /synopsis
  /refsynopsisdiv
 
@@ -68,9 +68,22 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
   An index build with the literalCONCURRENTLY/ option failed, leaving
   an quoteinvalid/ index. Such indexes are useless but it can be
   convenient to use commandREINDEX/ to rebuild them. Note that
-  commandREINDEX/ will not perform a concurrent build. To build the
-  index without interfering with production you should drop the index and
-  reissue the commandCREATE INDEX CONCURRENTLY/ command.
+  commandREINDEX/ will perform a concurrent build if literal
+  CONCURRENTLY/ is specified. To build the index without interfering
+  with production you should drop the index and reissue either the
+  commandCREATE INDEX CONCURRENTLY/ or commandREINDEX CONCURRENTLY/
+  command. Indexes of toast relations can be rebuilt with commandREINDEX
+  CONCURRENTLY/.
+ /para
+/listitem
+
+listitem
+ para
+  Concurrent indexes based on a literalPRIMARY KEY/ or an literal
+  EXCLUDE/  constraint need to be dropped with literalALTER TABLE
+  DROP CONSTRAINT/. This is also the case of literalUNIQUE/ indexes
+  using constraints. Other indexes can be dropped using literalDROP INDEX/,
+  including invalid toast indexes.
  /para
 /listitem
 
@@ -139,6 +152,21 @@ REINDEX { INDEX | TABLE | DATABASE | SYSTEM } replaceable class=PARAMETERnam
/varlistentry
 
varlistentry
+termliteralCONCURRENTLY/literal/term
+listitem
+ para
+  When this option is used, productnamePostgreSQL/ will 

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-01-09 Thread Jim Nasby

Sorry for the lateness of this...

On 11/14/13, 8:40 PM, Michael Paquier wrote:

+   /*
+    * Phase 4 of REINDEX CONCURRENTLY
+    *
+    * Now that the concurrent indexes have been validated could be used,
+    * we need to swap each concurrent index with its corresponding old index.
+    * Note that the concurrent index used for swaping is not marked as valid
+    * because we need to keep the former index and the concurrent index with
+    * a different valid status to avoid an implosion in the number of indexes
+    * a parent relation could have if this operation fails multiple times in
+    * a row due to a reason or another. Note that we already know thanks to
+    * validation step that
+    */
+


Was there supposed to be more to that comment?

In the loop right below it...

+   /* Swap the indexes and mark the indexes that have the old data as invalid */
+   forboth(lc, indexIds, lc2, concurrentIndexIds)
...
+   CacheInvalidateRelcacheByRelid(relOid);

Do we actually need to invalidate the cache in each case? Is it because we're 
grabbing a new transaction each time through?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2014-01-03 Thread Alvaro Herrera
Michael Paquier wrote:
 Hi all,
 
 Please find attached updated patches for the support of REINDEX
 CONCURRENTLY, renamed 2.0 for the occasion:
 - 20131114_1_index_drop_comments.patch, patch that updates some
 comments in index_drop. This updates only a couple of comments in
 index_drop but has not been committed yet. It should be IMO...

Pushed this one, thanks.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-18 Thread Andres Freund
On 2013-11-18 19:52:29 +0900, Michael Paquier wrote:
 On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
  - 20131114_3_reindex_concurrently.patch, providing the core feature.
  Patch 3 needs to have patch 2 applied first. Regression tests,
  isolation tests and documentation are included with the patch.
 
  Have you addressed my concurrency concerns from the last version?
 I have added documentation in the patch with a better explanation
 about why those choices of implementation are made.

The dropping still isn't safe:
After phase 4 we are in the state:
old index: valid, live, !isdead
new index: !valid, live, !isdead
Then you do a index_concurrent_set_dead() from that state on in phase 5.
There you do WaitForLockers(locktag, AccessExclusiveLock) before
index_set_state_flags(INDEX_DROP_SET_DEAD).
That's not sufficient.

Consider what happens with the following sequence:
1) WaitForLockers(locktag, AccessExclusiveLock)
   - GetLockConflicts() = virtualxact 1
   - VirtualXactLock(1)
2) virtualxact 2 starts, opens the *old* index since it's currently the
   only valid one.
3) virtualxact 1 finishes
4) index_concurrent_set_dead() does index_set_state_flags(DROP_SET_DEAD)
5) another transaction (vxid 3) starts inserting data into the relation, updates
   only the new index, the old index is dead
6) vxid 2 inserts data, updates only the old index. Since it had the
   index already open the cache invalidations won't be processed.

Now the indexes are out of sync. There's entries only in the old index
and there's entries only in the new index. Not good.

I hate to repeat myself, but you really need to follow the current
protocol for concurrently dropping indexes. That involves *first*
marking the index as invalid so it won't be used for querying anymore,
then waiting for everyone possibly still seeing that entry to finish, and
only *after* that marking the index as dead. You cannot argue away
correctness concerns with potential deadlocks.

c.f. 
http://www.postgresql.org/message-id/20130926103400.ga2471...@alap2.anarazel.de
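The danger of skipping that ordering can be shown with a deliberately oversimplified model: assume each backend caches the set of indexes it must maintain when it opens the table and never reprocesses invalidations while the table stays open. Marking the old index dead while such a backend is still running then splits writers into two groups maintaining different index sets. (The Catalog/Backend classes below are illustrative, not PostgreSQL structures; real relcache behavior is far more involved.)

```python
# Toy model of the divergence hazard: a straggler backend with a stale
# cached index set keeps maintaining the old index after it was marked
# dead, while fresh backends maintain only the new one.

class Catalog:
    def __init__(self):
        # indexes every writer is supposed to update
        self.maintained = {"old", "new"}

class Backend:
    def __init__(self, catalog):
        # simplification: the maintained set is cached once, at open time
        self.targets = set(catalog.maintained)

    def insert(self, stores, row):
        for name in self.targets:
            stores[name].add(row)

catalog = Catalog()
stores = {"old": set(), "new": set()}

straggler = Backend(catalog)       # vxid 2: opens table, caches {old, new}
catalog.maintained.discard("old")  # old index marked dead without waiting
fresh = Backend(catalog)           # vxid 3: sees only {new}
fresh.insert(stores, "r1")         # goes to the new index only
straggler.insert(stores, "r2")     # stale cache: still writes the old index
# old = {r2}, new = {r1, r2}: the old entry silently misses r1.
```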


I am also still unconvinced that the logic in index_concurrent_swap() is
correct. It very much needs to explain why no backend can see values
that are inconsistent. E.g. what prevents a backend thinking the old and
new indexes have the same relfilenode? MVCC snapshots don't seem to
protect you against that.
I am not sure there's a problem, but there certainly needs to more
comments explaining why there are none.

Something like the following might be possible:

Backend 1: start reindex concurrently, till phase 4
Backend 2: ExecOpenIndices()
   - RelationGetIndexList (that list is consistent due to mvcc snapshots!)
Backend 2: - index_open(old_index) (old relfilenode)
Backend 1: index_concurrent_swap()
   - CommitTransaction()
   - ProcArrayEndTransaction() (changes visible to others henceforth!)
Backend 2: - index_open(new_index)
   = no cache invalidations yet, gets the old relfilenode
Backend 2: ExecInsertIndexTuples()
   = updates the same relation twice, corrupt!
Backend 1: still in CommitTransaction()
  - AtEOXact_Inval() sends out invalidations

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 9:40 PM, Michael Paquier wrote:
 Hi all,
 
 Please find attached updated patches for the support of REINDEX
 CONCURRENTLY, renamed 2.0 for the occasion:
 - 20131114_1_index_drop_comments.patch, patch that updates some
 comments in index_drop. This updates only a couple of comments in
 index_drop but has not been committed yet. It should be IMO...
 - 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
 providing a single API that can be used to wait for old snapshots
 - 20131114_3_reindex_concurrently.patch, providing the core feature.
 Patch 3 needs to have patch 2 applied first. Regression tests,
 isolation tests and documentation are included with the patch.

The third patch needs to be rebased, because of a conflict in index.c.





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-15 Thread Andres Freund
Hi,

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
 - 20131114_3_reindex_concurrently.patch, providing the core feature.
 Patch 3 needs to have patch 2 applied first. Regression tests,
 isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

