Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-05 Thread Michael Paquier
On Fri, Apr 03, 2020 at 03:54:38PM +0900, Michael Paquier wrote:
> Now, would it be better to apply the refactoring patch for HEAD before
> feature freeze, or are people fine if this is committed a bit after?
> Patch 0002 is neither a new feature, nor an actual bug, and just some
> code cleanup, but I am a bit worried about applying that cleanup on
> HEAD just after the freeze.

I have worked more on this one this morning and just applied the bug
fix down to 9.5, and the refactoring on HEAD.  Thanks!
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-03 Thread Michael Paquier
On Thu, Apr 02, 2020 at 04:38:36AM -0300, Alvaro Herrera wrote:
> I don't think we need to worry about that problem, because we already
> checked that the pg_class tuple for the index is there two lines above.
> The pg_index tuple cannot have gone away on its own; and the index can't
> be deleted either, because cluster_rel holds AEL on the table.  There
> isn't "probably" about the can't-happen condition, is there?

Yes, you are right here.  I was wondering about an interference with
the multi-relation cluster that would not lock the parent relation at
the upper level of cluster() but the check on the existence of the
index makes sure that we'll never see an invalid entry in pg_index, so
let's keep patch 0002 as originally presented.  As the commit tree is
going to be rather crowded until feature freeze on Sunday, I'll wait
until Monday or Tuesday to finalize this patch set.

Now, would it be better to apply the refactoring patch for HEAD before
feature freeze, or are people fine if this is committed a bit after?
Patch 0002 is neither a new feature, nor an actual bug, and just some
code cleanup, but I am a bit worried about applying that cleanup on
HEAD just after the freeze.
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-02 Thread Michael Paquier
On Thu, Apr 02, 2020 at 04:24:03PM +0900, Michael Paquier wrote:
> On Thu, Apr 02, 2020 at 01:52:09AM -0500, Justin Pryzby wrote:
>> Sounds right.  Or else get_index_isclustered() could be redefined to take a
>> boolean "do_elog" flag, and if syscache fails and that's false, then return
>> false instead of ERROR.
> 
> Not sure if that's completely right to do either.  For one, it is not
> consistent with the surroundings as of lsyscache.c.

Actually, we do have some missing_ok flags lying around already in
lsyscache.c, so it would be much more consistent to use that name that
instead of the do_elog you are suggesting.  Could you update the
patch to reflect that?
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-02 Thread Alvaro Herrera
On 2020-Apr-02, Michael Paquier wrote:

> Now, regarding patch 0002, note that you have a problem for this part:
> -tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
> -if (!HeapTupleIsValid(tuple))/* probably can't happen */
> -{
> -relation_close(OldHeap, AccessExclusiveLock);
> -pgstat_progress_end_command();
> -return;
> -}
> -indexForm = (Form_pg_index) GETSTRUCT(tuple);
> -if (!indexForm->indisclustered)
> +if (!get_index_isclustered(indexOid))
>  {
> -ReleaseSysCache(tuple);
>  relation_close(OldHeap, AccessExclusiveLock);
>  pgstat_progress_end_command();
>  return;
>  }
> -ReleaseSysCache(tuple);
> 
> On an invalid tuple for pg_index, the new code would issue an error,
> while the old code would just return.

I don't think we need to worry about that problem, because we already
checked that the pg_class tuple for the index is there two lines above.
The pg_index tuple cannot have gone away on its own; and the index can't
be deleted either, because cluster_rel holds AEL on the table.  There
isn't "probably" about the can't-happen condition, is there?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-02 Thread Michael Paquier
On Thu, Apr 02, 2020 at 01:52:09AM -0500, Justin Pryzby wrote:
> Sounds right.  Or else get_index_isclustered() could be redefined to take a
> boolean "do_elog" flag, and if syscache fails and that's false, then return
> false instead of ERROR.

Not sure if that's completely right to do either.  For one, it is not
consistent with the surroundings as of lsyscache.c.
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-02 Thread Justin Pryzby
On Thu, Apr 02, 2020 at 03:14:21PM +0900, Michael Paquier wrote:
> Now, regarding patch 0002, note that you have a problem for this part:
> -tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
> -if (!HeapTupleIsValid(tuple))/* probably can't happen */
> -{
> -relation_close(OldHeap, AccessExclusiveLock);
> -pgstat_progress_end_command();
> -return;
> -}
> -indexForm = (Form_pg_index) GETSTRUCT(tuple);
> -if (!indexForm->indisclustered)
> +if (!get_index_isclustered(indexOid))
>  {
> -ReleaseSysCache(tuple);
>  relation_close(OldHeap, AccessExclusiveLock);
>  pgstat_progress_end_command();
>  return;
>  }
> -ReleaseSysCache(tuple);
> 
> On an invalid tuple for pg_index, the new code would issue an error,
> while the old code would just return.  And it seems to me that this
> can lead to problems because the parent relation is processed and
> locked at the beginning of cluster_rel(), *after* we know the index
> OID to work on.

> The refactoring is fine for the other two areas
> though, so I think that there is still value to apply
> get_index_isclustered() within mark_index_clustered() and cluster(),
> and I would suggest to keep 0002 to that.
> 
> Justin, what do you think?

Sounds right.  Or else get_index_isclustered() could be redefined to take a
boolean "do_elog" flag, and if syscache fails and that's false, then return
false instead of ERROR.

-- 
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-04-02 Thread Michael Paquier
On Wed, Mar 18, 2020 at 11:48:37AM +0900, Michael Paquier wrote:
> Anyway, Tom, Alvaro, are you planning to look at what is proposed on
> this thread?  I don't want to step on your toes if that's the case and
> it seems to me that the approach taken by the patch is sound, using as
> basic fix the addition of an AT_ClusterOn sub-command to the list of
> commands to execute when rebuilding the table, ensuring that any
> follow-up CLUSTER command will use the correct index.

Hearing nothing, I have been looking at the patches sent upthread, and
did some modifications as per the attached for 0001.  The logic looked
fine to me and it is unchanged as you are taking care of normal
indexes as well as constraint indexes.  Please note that I have
tweaked some comments, and removed what was on top of
RememberConstraintForRebuilding() as that was just a duplicate.
Regarding the tests, I was annoyed by the fact that the logic was not
manipulating two indexes at the same time on the relation rewritten
with a normal and a constraint index, and cross-checking both at the
same time to see which one is clustered after each rewrite is good for
consistency.

Now, regarding patch 0002, note that you have a problem for this part:
-tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
-if (!HeapTupleIsValid(tuple))/* probably can't happen */
-{
-relation_close(OldHeap, AccessExclusiveLock);
-pgstat_progress_end_command();
-return;
-}
-indexForm = (Form_pg_index) GETSTRUCT(tuple);
-if (!indexForm->indisclustered)
+if (!get_index_isclustered(indexOid))
 {
-ReleaseSysCache(tuple);
 relation_close(OldHeap, AccessExclusiveLock);
 pgstat_progress_end_command();
 return;
 }
-ReleaseSysCache(tuple);

On an invalid tuple for pg_index, the new code would issue an error,
while the old code would just return.  And it seems to me that this
can lead to problems because the parent relation is processed and
locked at the beginning of cluster_rel(), *after* we know the index
OID to work on.  The refactoring is fine for the other two areas
though, so I think that there is still value to apply
get_index_isclustered() within mark_index_clustered() and cluster(),
and I would suggest to keep 0002 to that.

Justin, what do you think?
--
Michael
From 60a2e5a615a39b852985166531019ac1baa4146a Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 2 Apr 2020 14:52:15 +0900
Subject: [PATCH v8] Preserve clustered indexes after rewrites of ALTER TABLE

---
 src/include/utils/lsyscache.h |  1 +
 src/backend/commands/tablecmds.c  | 47 ++
 src/backend/utils/cache/lsyscache.c   | 23 +++
 src/test/regress/expected/alter_table.out | 48 +++
 src/test/regress/sql/alter_table.sql  | 25 
 5 files changed, 144 insertions(+)

diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 374f57fb43..c9c68e2f4f 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -185,6 +185,7 @@ extern Oid	get_range_collation(Oid rangeOid);
 extern Oid	get_index_column_opclass(Oid index_oid, int attno);
 extern bool	get_index_isreplident(Oid index_oid);
 extern bool get_index_isvalid(Oid index_oid);
+extern bool get_index_isclustered(Oid index_oid);
 
 #define type_is_array(typid)  (get_element_type(typid) != InvalidOid)
 /* type_is_array_domain accepts both plain arrays and domains over arrays */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c8c88be2c9..ef234075dc 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -177,6 +177,7 @@ typedef struct AlteredTableInfo
 	List	   *changedIndexOids;	/* OIDs of indexes to rebuild */
 	List	   *changedIndexDefs;	/* string definitions of same */
 	char	   *replicaIdentityIndex;	/* index to reset as REPLICA IDENTITY */
+	char	   *clusterOnIndex;		/* index to use for CLUSTER */
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
@@ -11581,6 +11582,21 @@ RememberReplicaIdentityForRebuilding(Oid indoid, AlteredTableInfo *tab)
 	tab->replicaIdentityIndex = get_rel_name(indoid);
 }
 
+/*
+ * Subroutine for ATExecAlterColumnType: remember any clustered index.
+ */
+static void
+RememberClusterOnForRebuilding(Oid indoid, AlteredTableInfo *tab)
+{
+	if (!get_index_isclustered(indoid))
+		return;
+
+	if (tab->clusterOnIndex)
+		elog(ERROR, "relation %u has multiple clustered indexes", tab->relid);
+
+	tab->clusterOnIndex = get_rel_name(indoid);
+}
+
 /*
  * Subroutine for ATExecAlterColumnType: remember that a constraint needs
  * to be rebuilt (which we might already know).
@@ -11606,9 +11622,18 @@ RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo 

Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-17 Thread Michael Paquier
On Tue, Mar 17, 2020 at 11:20:44AM -0500, Justin Pryzby wrote:
> On Tue, Mar 17, 2020 at 02:33:32PM +0900, Michael Paquier wrote:
>> Patch 0002 from Justin does that, I would keep this refactoring as
>> HEAD-only material though, and I don't spot any other code paths in
>> need of patching.
>> 
>> The commit message of patch 0001 is not what you wanted I guess.
> 
> That's what git-am does, and I didn't find any option to make it less
> unreadable.  I guess I should just delete the email body it inserts.

Strange...

Anyway, Tom, Alvaro, are you planning to look at what is proposed on
this thread?  I don't want to step on your toes if that's the case and
it seems to me that the approach taken by the patch is sound, using as
basic fix the addition of an AT_ClusterOn sub-command to the list of
commands to execute when rebuilding the table, ensuring that any
follow-up CLUSTER command will use the correct index.
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-17 Thread Justin Pryzby
On Tue, Mar 17, 2020 at 02:33:32PM +0900, Michael Paquier wrote:
> > Yeah, in cluster(), mark_index_clustered().
> 
> Patch 0002 from Justin does that, I would keep this refactoring as
> HEAD-only material though, and I don't spot any other code paths in
> need of patching.
> 
> The commit message of patch 0001 is not what you wanted I guess.

That's what git-am does, and I didn't find any option to make it less
unreadable.  I guess I should just delete the email body it inserts.

|   The commit message is formed by the title taken from the "Subject: ", a
|   blank line and the body of the message up to where the patch begins. 
Excess
|   whitespace at the end of each line is automatically stripped.

-- 
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-16 Thread Michael Paquier
On Mon, Mar 16, 2020 at 11:25:23AM -0300, Alvaro Herrera wrote:
> On 2020-Mar-16, Justin Pryzby wrote:
> 
> > Also, should we call it "is_index_clustered", since otherwise it sounds alot
> > like "+get_index_clustered" (without "is"), which sounds like it takes a 
> > table
> > and returns which index is clustered.  That might be just as useful for 
> > some of
> > these callers.
> 
> Amit's proposed name seems to match lsyscache.c usual conventions better.

There were no get_index_isvalid() (introduced by me) or
get_index_isreplident() (introduced by Peter) until last week, and
those names have been chosen to be consistent with the existing
get_index_column_opclass(), so using get_index_isclustered is in my
opinion the most consistent choice.

> Yeah, in cluster(), mark_index_clustered().

Patch 0002 from Justin does that, I would keep this refactoring as
HEAD-only material though, and I don't spot any other code paths in
need of patching.

The commit message of patch 0001 is not what you wanted I guess.

 if (OidIsValid(indexOid))
 {
-indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
-if (!HeapTupleIsValid(indexTuple))
-elog(ERROR, "cache lookup failed for index %u", indexOid);
-indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
-
-if (indexForm->indisclustered)
-{
-ReleaseSysCache(indexTuple);
+if (get_index_isclustered(indexOid))
 return;
-}
-
-ReleaseSysCache(indexTuple);
 }
No need for two layers of if(s) here.

+create index alttype_cluster_a on alttype_cluster (a);
+alter table alttype_cluster cluster on alttype_cluster_a;
+select indisclustered from pg_index where indrelid = 
'alttype_cluster'::regclass;

Would it make sense to add a second index not used for clustering to
check after the case where isclustered is false?  A second thing would
be to check if relfilenode values match before and after each DDL to
make sure that a rewrite happened or not, see check_ddl_rewrite() for
example in alter_table.sql.

Keeping both RememberClusterOnForRebuilding and
RememberReplicaIdentityForRebuilding as separate looks fine to me. 
The code could be a bit more organized though.  We have
RememberIndexForRebuilding which may go through
RememberConstraintForRebuilding if the relation OID changed is a
constraint, and both register the replindent or isclustered
information if present.  Not really something for this patch set to
care about, just a thought while reading this code.

While looking at this bug, I have spotted a behavior which is perhaps
not welcome.  Take this test case:
create table aa (a int);
insert into aa values (1), (1);
create unique index concurrently aai on aa(a); -- fails
alter table aa alter column a type bigint;

This generates the following error:
ERROR:  23505: could not create unique index "aai"
DETAIL:  Key (a)=(1) is duplicated.
SCHEMA NAME:  public
TABLE NAME:  aa
CONSTRAINT NAME:  aai
LOCATION:  comparetup_index_btree, tuplesort.c:4049

After a REINDEX CONCURRENTLY, we may leave behind an invalid index
on the relation's toast table or even normal indexes.  CREATE INDEX
CONCURRENTLY may also leave behind invalid indexes.  If triggering an
ALTER TABLE that causes a rewrite of the relation, we have the
following behavior:
- An invalid toast index is correctly discarded, keeping one valid
toast index.  No problem here.
- Invalid non-toast indexes are all rebuilt.  If the index relies on a
constraint then ALTER TABLE would fail, like the above.

I am wondering if there is an argument for not including invalid
indexes in the rebuilt version, even if the existing behavior may be
useful for some users.
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-16 Thread Alvaro Herrera
On 2020-Mar-16, Justin Pryzby wrote:

> Also, should we call it "is_index_clustered", since otherwise it sounds alot
> like "+get_index_clustered" (without "is"), which sounds like it takes a table
> and returns which index is clustered.  That might be just as useful for some 
> of
> these callers.

Amit's proposed name seems to match lsyscache.c usual conventions better.

> Should we use your get_index_isclustered more widely ?

Yeah, in cluster(), mark_index_clustered().

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-16 Thread Justin Pryzby
On Mon, Mar 16, 2020 at 04:01:42PM +0900, Amit Langote wrote:
> I came across a commit that recently went in:
> 
> commit 1cc9c2412cc9a2fbe6a381170097d315fd40ccca
> Author: Peter Eisentraut 
> Date:   Fri Mar 13 11:28:11 2020 +0100
> 
> Preserve replica identity index across ALTER TABLE rewrite
> 
> which fixes something very similar to what we are trying to with this
> patch.  The way it's done looks to me very close to what you are
> telling.  I have updated the patch to be similar to the above fix.

Yes, I noticed it too.

Should we use your get_index_isclustered more widely ?

Also, should we call it "is_index_clustered", since otherwise it sounds alot
like "+get_index_clustered" (without "is"), which sounds like it takes a table
and returns which index is clustered.  That might be just as useful for some of
these callers.

-- 
Justin
>From add5e8481c70b6b66342b264a243f26f4c634e53 Mon Sep 17 00:00:00 2001
From: Amit Langote 
Date: Mon, 16 Mar 2020 16:01:42 +0900
Subject: [PATCH v7 1/2] ALTER tbl rewrite loses CLUSTER ON index

On Fri, Mar 13, 2020 at 2:19 AM Tom Lane  wrote:
> Justin Pryzby  writes:
> > @cfbot: resending with only Amit's 0001, since Michael pushed a variation on
> > 0002.

Thank you for taking a look at it.

> Boy, I really dislike this patch.  ATPostAlterTypeParse is documented as
> using the supplied definition string, and nothing else, to reconstruct
> the index.  This breaks that without even the courtesy of documenting
> the breakage.  Moreover, the reason why it's designed like that is to
> avoid requiring the old index objects to still be accessible.  So I'm
> surprised that this hack works at all.  I don't think it would have
> worked at the time the code was first written, and I think it's imposing
> a constraint we'll have problems with (again?) in future.

Okay, so maybe in the middle of ATPostAlterTypeParse() is not a place
to do it, but don't these arguments apply to
RebuildConstraintComment(), which I based the patch on?

> The right way to fix this is to note the presence of the indisclustered
> flag when we're examining the index earlier, and include a suitable
> command in the definition string.  So probably pg_get_indexdef_string()
> is what needs to change.  It doesn't look like that's used anywhere
> else, so we can just redefine its behavior as needed.

I came across a commit that recently went in:

commit 1cc9c2412cc9a2fbe6a381170097d315fd40ccca
Author: Peter Eisentraut 
Date:   Fri Mar 13 11:28:11 2020 +0100

Preserve replica identity index across ALTER TABLE rewrite

which fixes something very similar to what we are trying to with this
patch.  The way it's done looks to me very close to what you are
telling.  I have updated the patch to be similar to the above fix.

--
Thank you,
Amit
---
 src/backend/commands/tablecmds.c  | 49 +++
 src/backend/utils/cache/lsyscache.c   | 23 +++
 src/include/utils/lsyscache.h |  1 +
 src/test/regress/expected/alter_table.out | 34 
 src/test/regress/sql/alter_table.sql  | 15 +++
 5 files changed, 122 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8c33b67c1b..73d9392878 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -177,6 +177,7 @@ typedef struct AlteredTableInfo
 	List	   *changedIndexOids;	/* OIDs of indexes to rebuild */
 	List	   *changedIndexDefs;	/* string definitions of same */
 	char	   *replicaIdentityIndex;	/* index to reset as REPLICA IDENTITY */
+	char	   *clusterOnIndex;		/* index to CLUSTER ON */
 } AlteredTableInfo;
 
 /* Struct describing one new constraint to check in Phase 3 scan */
@@ -11579,9 +11580,28 @@ RememberReplicaIdentityForRebuilding(Oid indoid, AlteredTableInfo *tab)
 	tab->replicaIdentityIndex = get_rel_name(indoid);
 }
 
+/*
+ * Subroutine for ATExecAlterColumnType: remember any clustered index.
+ */
+static void
+RememberClusterOnForRebuilding(Oid indoid, AlteredTableInfo *tab)
+{
+	if (!get_index_isclustered(indoid))
+		return;
+
+	if (tab->clusterOnIndex)
+		elog(ERROR, "relation %u has multiple clustered indexes", tab->relid);
+
+	tab->clusterOnIndex = get_rel_name(indoid);
+}
+
 /*
  * Subroutine for ATExecAlterColumnType: remember that a constraint needs
  * to be rebuilt (which we might already know).
+ *
+ * For constraint's index (if any), also remember if it's the table's replica
+ * identity or its clustered index, so that ATPostAlterTypeCleanup() can
+ * queue up commands necessary to restore that property.
  */
 static void
 RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab)
@@ -11604,9 +11624,18 @@ RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab)
 		tab->changedConstraintDefs = lappend(tab->changedConstraintDefs,
 			 defstring);
 
+		/*
+		 * For constraint's index (if any), remember if it's the table's
+		 * replica identity or its clustered index, so that
+		 * 

Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-16 Thread Amit Langote
On Fri, Mar 13, 2020 at 2:19 AM Tom Lane  wrote:
> Justin Pryzby  writes:
> > @cfbot: resending with only Amit's 0001, since Michael pushed a variation on
> > 0002.

Thank you for taking a look at it.

> Boy, I really dislike this patch.  ATPostAlterTypeParse is documented as
> using the supplied definition string, and nothing else, to reconstruct
> the index.  This breaks that without even the courtesy of documenting
> the breakage.  Moreover, the reason why it's designed like that is to
> avoid requiring the old index objects to still be accessible.  So I'm
> surprised that this hack works at all.  I don't think it would have
> worked at the time the code was first written, and I think it's imposing
> a constraint we'll have problems with (again?) in future.

Okay, so maybe in the middle of ATPostAlterTypeParse() is not a place
to do it, but don't these arguments apply to
RebuildConstraintComment(), which I based the patch on?

> The right way to fix this is to note the presence of the indisclustered
> flag when we're examining the index earlier, and include a suitable
> command in the definition string.  So probably pg_get_indexdef_string()
> is what needs to change.  It doesn't look like that's used anywhere
> else, so we can just redefine its behavior as needed.

I came across a commit that recently went in:

commit 1cc9c2412cc9a2fbe6a381170097d315fd40ccca
Author: Peter Eisentraut 
Date:   Fri Mar 13 11:28:11 2020 +0100

Preserve replica identity index across ALTER TABLE rewrite

which fixes something very similar to what we are trying to with this
patch.  The way it's done looks to me very close to what you are
telling.  I have updated the patch to be similar to the above fix.

--
Thank you,
Amit


v6-0001-ALTER-tbl-rewrite-loses-CLUSTER-ON-index.patch
Description: Binary data


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-12 Thread Tom Lane
Justin Pryzby  writes:
> @cfbot: resending with only Amit's 0001, since Michael pushed a variation on
> 0002.

Boy, I really dislike this patch.  ATPostAlterTypeParse is documented as
using the supplied definition string, and nothing else, to reconstruct
the index.  This breaks that without even the courtesy of documenting
the breakage.  Moreover, the reason why it's designed like that is to
avoid requiring the old index objects to still be accessible.  So I'm
surprised that this hack works at all.  I don't think it would have
worked at the time the code was first written, and I think it's imposing
a constraint we'll have problems with (again?) in future.

The right way to fix this is to note the presence of the indisclustered
flag when we're examining the index earlier, and include a suitable
command in the definition string.  So probably pg_get_indexdef_string()
is what needs to change.  It doesn't look like that's used anywhere
else, so we can just redefine its behavior as needed.

regards, tom lane




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-05 Thread Ibrar Ahmed
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

I tested the patch on the master branch 
(a77315fdf2a197a925e670be2d8b376c4ac02efc) and it works fine.

The new status of this patch is: Ready for Committer


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-02 Thread Justin Pryzby
@cfbot: resending with only Amit's 0001, since Michael pushed a variation on
0002.

-- 
Justin
>From 865fc2713ad29d0f8c0f63609a7c15c83cfa5cfe Mon Sep 17 00:00:00 2001
From: Amit Langote 
Date: Thu, 6 Feb 2020 18:14:16 +0900
Subject: [PATCH v5] ALTER tbl rewrite loses CLUSTER ON index

---
 src/backend/commands/tablecmds.c  | 39 +++
 src/test/regress/expected/alter_table.out | 34 
 src/test/regress/sql/alter_table.sql  | 16 +-
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 02a7c04fdb..6b2469bd09 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
 static void RebuildConstraintComment(AlteredTableInfo *tab, int pass,
 	 Oid objid, Relation rel, List *domname,
 	 const char *conname);
+static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid);
 static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
 static void TryReuseForeignKey(Oid oldId, Constraint *con);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
@@ -11838,6 +11839,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
 			newcmd->def = (Node *) stmt;
 			tab->subcmds[AT_PASS_OLD_INDEX] =
 lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd);
+
+			/* Preserve index's indisclustered property, if set. */
+			PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId);
 		}
 		else if (IsA(stm, AlterTableStmt))
 		{
@@ -11874,6 +11878,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
 			 rel,
 			 NIL,
 			 indstmt->idxname);
+
+	/* Preserve index's indisclustered property, if set. */
+	PreserveClusterOn(tab, AT_PASS_OLD_INDEX, indoid);
 }
 else if (cmd->subtype == AT_AddConstraint)
 {
@@ -11996,6 +12003,38 @@ RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid,
 	tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
 }
 
+/*
+ * For a table's index that is to be recreated due to PostAlterType
+ * processing, preserve its indisclustered property by issuing ALTER TABLE
+ * CLUSTER ON command on the table that will run after the command to recreate
+ * the index.
+ */
+static void
+PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid)
+{
+	HeapTuple	indexTuple;
+	Form_pg_index indexForm;
+
+	Assert(OidIsValid(indoid));
+	Assert(pass == AT_PASS_OLD_INDEX);
+
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indoid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	if (indexForm->indisclustered)
+	{
+		AlterTableCmd *newcmd = makeNode(AlterTableCmd);
+
+		newcmd->subtype = AT_ClusterOn;
+		newcmd->name = get_rel_name(indoid);
+		tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
+	}
+
+	ReleaseSysCache(indexTuple);
+}
+
 /*
  * Subroutine for ATPostAlterTypeParse().  Calls out to CheckIndexCompatible()
  * for the real analysis, then mutates the IndexStmt based on that verdict.
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index fb6d86a269..a01c6d6ec5 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -4296,3 +4296,37 @@ create trigger xtrig
 update bar1 set a = a + 1;
 INFO:  a=1, b=1
 /* End test case for bug #16242 */
+-- alter type rewrite/rebuild should preserve cluster marking on index
+create table alttype_cluster (a int);
+create index alttype_cluster_a on alttype_cluster (a);
+alter table alttype_cluster cluster on alttype_cluster_a;
+select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass;
+ indisclustered 
+
+ t
+(1 row)
+
+alter table alttype_cluster alter a type bigint;
+select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass;
+ indisclustered 
+
+ t
+(1 row)
+
+drop index alttype_cluster_a;
+alter table alttype_cluster add primary key (a);
+alter table alttype_cluster cluster on alttype_cluster_pkey;
+select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass;
+ indisclustered 
+
+ t
+(1 row)
+
+alter table alttype_cluster alter a type int;
+select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass;
+ indisclustered 
+
+ t
+(1 row)
+
+drop table alttype_cluster;
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 3801f19c58..6e9048bbec 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -2802,7 +2802,6 @@ drop table at_test_sql_partop;
 drop operator class at_test_sql_partop using btree;
 drop function at_test_sql_partop;
 

Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-02 Thread Justin Pryzby
On Mon, Mar 02, 2020 at 12:28:18PM +0900, Michael Paquier wrote:
> > +SELECT indexrelid::regclass FROM pg_index WHERE 
> > indrelid='concur_clustered'::regclass;
> 
> This test should check after indisclustered.  Except that, the patch
> is fine so I'll apply it if there are no objections.

Oops - I realized that, but didn't send a new patch before you noticed - thanks
for handling it.

-- 
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-02 Thread Michael Paquier
On Mon, Mar 02, 2020 at 12:28:18PM +0900, Michael Paquier wrote:
> This test should check after indisclustered.  Except that, the patch
> is fine so I'll apply it if there are no objections.

I got a second look at this one, and applied it down to 12 after some
small modifications in the new test and in the comments.
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-01 Thread Michael Paquier
On Sat, Feb 29, 2020 at 10:52:58AM -0600, Justin Pryzby wrote:
> Rebased Amit's patch and included my own 0002 to fix the REINDEX CONCURRENTLY
> issue.

I have looked at 0002 as that concerns me.

> +SELECT indexrelid::regclass FROM pg_index WHERE 
> indrelid='concur_clustered'::regclass;
> +   indexrelid   
> +
> + concur_clustered_i_idx
> +(1 row)

This test should check after indisclustered.  Except that, the patch
is fine so I'll apply it if there are no objections.
--
Michael


signature.asc
Description: PGP signature


Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-29 Thread Justin Pryzby
On Fri, Feb 28, 2020 at 08:42:02PM -0600, Justin Pryzby wrote:
> On Fri, Feb 28, 2020 at 06:26:04PM -0500, Tom Lane wrote:
> > Justin Pryzby  writes:
> > > I think the attached is 80% complete (I didn't touch pg_dump).
> > > One objection to this change would be that all relations (including 
> > > indices)
> > > end up with relclustered fields, and pg_index already has a number of 
> > > bools, so
> > > it's not like this one bool is wasting a byte.
> > > I think relisclustered was a's clever way of avoiding that overhead 
> > > (c0ad5953).
> > > So I would be -0.5 on moving it to pg_class..
> > > But I think 0001 and 0002 are worthy.  Maybe the test in 0002 should live
> > > somewhere else.
> > 
> > 0001 has been superseded by events (faade5d4c), so the cfbot is choking
> > on that one's failure to apply, and not testing any further.  Please
> > repost without 0001 so that we can get this testing again.
> 
> I've just noticed while working on (1) that this separately affects REINDEX
> CONCURRENTLY, which would be a new bug in v12.  Without CONCURRENTLY there's 
> no
> issue.  I guess we need a separate patch for that case.

Rebased Amit's patch and included my own 0002 to fix the REINDEX CONCURRENTLY
issue.

-- 
Justin
>From dccc6f08450eacafc3a08bc8b2836d2feb23a533 Mon Sep 17 00:00:00 2001
From: Amit Langote 
Date: Thu, 6 Feb 2020 18:14:16 +0900
Subject: [PATCH v4 1/2] ALTER tbl rewrite loses CLUSTER ON index

---
 src/backend/commands/tablecmds.c  | 39 +++
 src/test/regress/expected/alter_table.out | 34 
 src/test/regress/sql/alter_table.sql  | 16 +-
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 02a7c04fdb..6b2469bd09 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
 static void RebuildConstraintComment(AlteredTableInfo *tab, int pass,
 	 Oid objid, Relation rel, List *domname,
 	 const char *conname);
+static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid);
 static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
 static void TryReuseForeignKey(Oid oldId, Constraint *con);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
@@ -11838,6 +11839,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
 			newcmd->def = (Node *) stmt;
 			tab->subcmds[AT_PASS_OLD_INDEX] =
 lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd);
+
+			/* Preserve index's indisclustered property, if set. */
+			PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId);
 		}
 		else if (IsA(stm, AlterTableStmt))
 		{
@@ -11874,6 +11878,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
 			 rel,
 			 NIL,
 			 indstmt->idxname);
+
+	/* Preserve index's indisclustered property, if set. */
+	PreserveClusterOn(tab, AT_PASS_OLD_INDEX, indoid);
 }
 else if (cmd->subtype == AT_AddConstraint)
 {
@@ -11996,6 +12003,38 @@ RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid,
 	tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
 }
 
+/*
+ * For a table's index that is to be recreated due to PostAlterType
+ * processing, preserve its indisclustered property by issuing ALTER TABLE
+ * CLUSTER ON command on the table that will run after the command to recreate
+ * the index.
+ */
+static void
+PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid)
+{
+	HeapTuple	indexTuple;
+	Form_pg_index indexForm;
+
+	Assert(OidIsValid(indoid));
+	Assert(pass == AT_PASS_OLD_INDEX);
+
+	indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid));
+	if (!HeapTupleIsValid(indexTuple))
+		elog(ERROR, "cache lookup failed for index %u", indoid);
+	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+	if (indexForm->indisclustered)
+	{
+		AlterTableCmd *newcmd = makeNode(AlterTableCmd);
+
+		newcmd->subtype = AT_ClusterOn;
+		newcmd->name = get_rel_name(indoid);
+		tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
+	}
+
+	ReleaseSysCache(indexTuple);
+}
+
 /*
  * Subroutine for ATPostAlterTypeParse().  Calls out to CheckIndexCompatible()
  * for the real analysis, then mutates the IndexStmt based on that verdict.
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index fb6d86a269..a01c6d6ec5 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -4296,3 +4296,37 @@ create trigger xtrig
 update bar1 set a = a + 1;
 INFO:  a=1, b=1
 /* End test case for bug #16242 */
+-- alter type rewrite/rebuild should preserve cluster marking on index
+create table alttype_cluster (a int);
+create index alttype_cluster_a on alttype_cluster (a);
+alter table alttype_cluster cluster on alttype_cluster_a;
+select 

Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-28 Thread Justin Pryzby
On Fri, Feb 28, 2020 at 06:26:04PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> > I think the attached is 80% complete (I didn't touch pg_dump).
> > One objection to this change would be that all relations (including indices)
> > end up with relclustered fields, and pg_index already has a number of 
> > bools, so
> > it's not like this one bool is wasting a byte.
> > I think relisclustered was a's clever way of avoiding that overhead 
> > (c0ad5953).
> > So I would be -0.5 on moving it to pg_class..
> > But I think 0001 and 0002 are worthy.  Maybe the test in 0002 should live
> > somewhere else.
> 
> 0001 has been superseded by events (faade5d4c), so the cfbot is choking
> on that one's failure to apply, and not testing any further.  Please
> repost without 0001 so that we can get this testing again.

I've just noticed while working on (1) that this separately affects REINDEX
CONCURRENTLY, which would be a new bug in v12.  Without CONCURRENTLY there's no
issue.  I guess we need a separate patch for that case.

(1) https://commitfest.postgresql.org/27/2269/

The ALTER bug goes back further and its fix should be a kept separate.

postgres=# DROP TABLE tt; CREATE TABLE tt(i int unique); CLUSTER tt USING 
tt_i_key; CLUSTER tt; REINDEX INDEX tt_i_key; CLUSTER tt;
DROP TABLE
CREATE TABLE
CLUSTER
CLUSTER
REINDEX
CLUSTER

postgres=# DROP TABLE tt; CREATE TABLE tt(i int unique); CLUSTER tt USING 
tt_i_key; CLUSTER tt; REINDEX INDEX CONCURRENTLY tt_i_key; CLUSTER tt;
DROP TABLE
CREATE TABLE
CLUSTER
CLUSTER
REINDEX
ERROR:  there is no previously clustered index for table "tt"

-- 
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-28 Thread Tom Lane
Justin Pryzby  writes:
> I think the attached is 80% complete (I didn't touch pg_dump).
> One objection to this change would be that all relations (including indices)
> end up with relclustered fields, and pg_index already has a number of bools, 
> so
> it's not like this one bool is wasting a byte.
> I think relisclustered was a's clever way of avoiding that overhead 
> (c0ad5953).
> So I would be -0.5 on moving it to pg_class..
> But I think 0001 and 0002 are worthy.  Maybe the test in 0002 should live
> somewhere else.

0001 has been superseded by events (faade5d4c), so the cfbot is choking
on that one's failure to apply, and not testing any further.  Please
repost without 0001 so that we can get this testing again.

regards, tom lane




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-16 Thread Justin Pryzby
On Mon, Feb 17, 2020 at 02:31:42PM +0900, Amit Langote wrote:
> Hi Justin,
> 
> On Fri, Feb 7, 2020 at 11:39 PM Justin Pryzby  wrote:
> > On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote:
> > > On 2020-Feb-06, Justin Pryzby wrote:
> > >
> > > > I wondered if it wouldn't be better if CLUSTER ON was stored in 
> > > > pg_class as the
> > > > Oid of a clustered index, rather than a boolean in pg_index.
> > >
> > > Maybe.  Do you want to try a patch?
> >
> > I think the attached is 80% complete (I didn't touch pg_dump).
> >
> > One objection to this change would be that all relations (including indices)
> > end up with relclustered fields, and pg_index already has a number of 
> > bools, so
> > it's not like this one bool is wasting a byte.
> >
> > I think relisclustered was a's clever way of avoiding that overhead 
> > (c0ad5953).
> > So I would be -0.5 on moving it to pg_class..

In case there's any confusion: "a's" was probably me halfway changing
"someone's" to "a".

> Are you still for fixing ALTER TABLE losing relisclustered with the
> patch we were working on earlier [1], if not for moving relisclustered
> to pg_class anymore?

Thanks for remembering this one.

I think your patch is the correct fix.

I forgot to say it, but moving relisclustered to pg_class doesn't help to avoid
losting indisclustered: it still needs a fix just like this.  Anyway, I
withdrew my suggestion for moving to pg_class, since it has more overhead, even
for pg_class rows for relations which can't have indexes.

> I have read elsewhere [2] that forcing ALTER TABLE to rewrite in
> clustered order might not be a good option, but maybe that one is a
> more radical proposal than this.

Right; your fix seems uncontroversial.  I ran into this (indisclustered) bug
while starting to write that patch for "ALTER rewrite in clustered order".

-- 
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-16 Thread Amit Langote
Hi Justin,

On Fri, Feb 7, 2020 at 11:39 PM Justin Pryzby  wrote:
> On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote:
> > On 2020-Feb-06, Justin Pryzby wrote:
> >
> > > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class 
> > > as the
> > > Oid of a clustered index, rather than a boolean in pg_index.
> >
> > Maybe.  Do you want to try a patch?
>
> I think the attached is 80% complete (I didn't touch pg_dump).
>
> One objection to this change would be that all relations (including indices)
> end up with relclustered fields, and pg_index already has a number of bools, 
> so
> it's not like this one bool is wasting a byte.
>
> I think relisclustered was a's clever way of avoiding that overhead 
> (c0ad5953).
> So I would be -0.5 on moving it to pg_class..

Are you still for fixing ALTER TABLE losing relisclustered with the
patch we were working on earlier [1], if not for moving relisclustered
to pg_class anymore?

I have read elsewhere [2] that forcing ALTER TABLE to rewrite in
clustered order might not be a good option, but maybe that one is a
more radical proposal than this.

Thanks,
Amit

[1] 
https://postgr.es/m/CA%2BHiwqEt1HnXYckCdaO8%2BpOoFs7NNS5byoZ6Xg2B7epKbhS85w%40mail.gmail.com
[2] https://postgr.es/m/10984.1581181029%40sss.pgh.pa.us




Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)

2020-02-07 Thread Justin Pryzby
On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote:
> On 2020-Feb-06, Justin Pryzby wrote:
> 
> > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as 
> > the
> > Oid of a clustered index, rather than a boolean in pg_index.
> 
> Maybe.  Do you want to try a patch?

I think the attached is 80% complete (I didn't touch pg_dump).

One objection to this change would be that all relations (including indices)
end up with relclustered fields, and pg_index already has a number of bools, so
it's not like this one bool is wasting a byte.

I think relisclustered was a's clever way of avoiding that overhead (c0ad5953).
So I would be -0.5 on moving it to pg_class..

But I think 0001 and 0002 are worthy.  Maybe the test in 0002 should live
somewhere else.
>From 7eea0a17e495fe13379ffd589b551f2f145f5672 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 6 Feb 2020 21:48:13 -0600
Subject: [PATCH v1 1/3] Update comment obsolete since b9b8831a

---
 src/backend/commands/cluster.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index e9d7a7f..3adcbeb 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1539,9 +1539,9 @@ get_tables_to_cluster(MemoryContext cluster_context)
 
 	/*
 	 * Get all indexes that have indisclustered set and are owned by
-	 * appropriate user. System relations or nailed-in relations cannot ever
-	 * have indisclustered set, because CLUSTER will refuse to set it when
-	 * called with one of them as argument.
+	 * appropriate user. Shared relations cannot ever have indisclustered
+	 * set, because CLUSTER will refuse to set it when called with one as
+	 * an argument.
 	 */
 	indRelation = table_open(IndexRelationId, AccessShareLock);
 	ScanKeyInit(,
-- 
2.7.4

>From 4777be522a7aa8b8c77b13f765cbd02043438f2a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 7 Feb 2020 08:12:50 -0600
Subject: [PATCH v1 2/3] Give developer a helpful kick in the pants if they
 change natts in one place but not another

---
 src/backend/bootstrap/bootstrap.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index bfc629c..d5e1888 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -25,7 +25,9 @@
 #include "access/xlog_internal.h"
 #include "bootstrap/bootstrap.h"
 #include "catalog/index.h"
+#include "catalog/pg_class.h"
 #include "catalog/pg_collation.h"
+#include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "common/link-canary.h"
 #include "libpq/pqsignal.h"
@@ -49,6 +51,7 @@
 #include "utils/ps_status.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
+#include "utils/syscache.h"
 
 uint32		bootstrap_data_checksum_version = 0;	/* No checksum */
 
@@ -602,6 +605,26 @@ boot_openrel(char *relname)
 	TableScanDesc scan;
 	HeapTuple	tup;
 
+	/* Check that pg_class data is consistent now, rather than failing obscurely later */
+	struct { Oid oid; int natts; }
+		checknatts[] = {
+		{RelationRelationId, Natts_pg_class,},
+		{TypeRelationId, Natts_pg_type,},
+		{AttributeRelationId, Natts_pg_attribute,},
+		{ProcedureRelationId, Natts_pg_proc,},
+	};
+
+	for (int i=0; irelnatts);
+		ReleaseSysCache(tuple);
+	}
+
 	if (strlen(relname) >= NAMEDATALEN)
 		relname[NAMEDATALEN - 1] = '\0';
 
-- 
2.7.4

>From ed886f8202486dea8069b719d35a5d0db7f3277c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 6 Feb 2020 12:56:34 -0600
Subject: [PATCH v1 3/3] Make cluster a property of table in pg_index..

..rather than of indexes in pg_index.

The only issue with this is that it makes pg_class larger, and the new column
applies not only to tables, but to indices.
---
 doc/src/sgml/catalogs.sgml |  14 +--
 src/backend/catalog/heap.c |   1 +
 src/backend/catalog/index.c|   6 --
 src/backend/commands/cluster.c | 172 +
 src/backend/commands/tablecmds.c   |   5 +-
 src/backend/utils/cache/relcache.c |   1 -
 src/bin/psql/describe.c|   4 +-
 src/include/catalog/pg_class.dat   |   2 +-
 src/include/catalog/pg_class.h |   3 +
 src/include/catalog/pg_index.h |   1 -
 10 files changed, 93 insertions(+), 116 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a10b665..8efeaff 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1752,6 +1752,13 @@ SCRAM-SHA-256$iteration count:
  
 
  
+  relclustered
+  oid
+  
+  The OID of the index last clustered, or zero
+ 
+
+ 
   relpages
   int4
   
@@ -3808,13 +3815,6 @@ SCRAM-SHA-256$iteration count:
  
 
  
-  indisclustered
-  bool
-  
-  If true, the table was last clustered on this index
- 
-
- 
   indisvalid
   bool
   
diff --git 

Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-07 Thread Amit Langote
On Fri, Feb 7, 2020 at 2:24 AM Alvaro Herrera  wrote:
> On 2020-Feb-06, Justin Pryzby wrote:
> > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as 
> > the
> > Oid of a clustered index, rather than a boolean in pg_index.
>
> Maybe.  Do you want to try a patch?

+1

Thanksm
Amit




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-06 Thread Alvaro Herrera
On 2020-Feb-06, Justin Pryzby wrote:

> I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as 
> the
> Oid of a clustered index, rather than a boolean in pg_index.

Maybe.  Do you want to try a patch?

> That likely would've avoided (or at least exposed) this issue.
> And avoids the possibility of having two indices marked as "clustered".
> These would be more trivial:
> mark_index_clustered
> /* We need to find the index that has indisclustered set. */

You need to be careful when dropping the index ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-06 Thread Justin Pryzby
I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as the
Oid of a clustered index, rather than a boolean in pg_index.

That likely would've avoided (or at least exposed) this issue.
And avoids the possibility of having two indices marked as "clustered".
These would be more trivial:
mark_index_clustered
/* We need to find the index that has indisclustered set. */




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-06 Thread Amit Langote
On Thu, Feb 6, 2020 at 10:31 AM Amit Langote  wrote:
> On Thu, Feb 6, 2020 at 8:44 AM Justin Pryzby  wrote:
> > On Wed, Feb 05, 2020 at 03:53:45PM +0900, Amit Langote wrote:
> > > diff --git a/src/test/regress/sql/alter_table.sql 
> > > b/src/test/regress/sql/alter_table.sql
> > > +-- alter type shouldn't lose clustered index
> >
> > My only suggestion is to update the comment
> > +-- alter type rewrite/rebuild should preserve cluster marking on index
>
> Sure, done.

Deja vu.  Last two messages weren't sent to the list; updated patch attached.

Thanks,
Amit
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f599393473..bab3ddf67c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, 
Oid refRelId,
 static void RebuildConstraintComment(AlteredTableInfo *tab, int pass,
 Oid 
objid, Relation rel, List *domname,
 const 
char *conname);
+static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid);
 static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
 static void TryReuseForeignKey(Oid oldId, Constraint *con);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char 
*colName,
@@ -11832,6 +11833,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid 
refRelId, char *cmd,
newcmd->def = (Node *) stmt;
tab->subcmds[AT_PASS_OLD_INDEX] =
lappend(tab->subcmds[AT_PASS_OLD_INDEX], 
newcmd);
+
+   /* Preserve index's indisclustered property, if set. */
+   PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId);
}
else if (IsA(stm, AlterTableStmt))
{
@@ -11868,6 +11872,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid 
refRelId, char *cmd,

 rel,

 NIL,

 indstmt->idxname);
+
+   /* Preserve index's indisclustered 
property, if set. */
+   PreserveClusterOn(tab, 
AT_PASS_OLD_INDEX, indoid);
}
else if (cmd->subtype == AT_AddConstraint)
{
@@ -11990,6 +11997,38 @@ RebuildConstraintComment(AlteredTableInfo *tab, int 
pass, Oid objid,
tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
 }
 
+/*
+ * For a table's index that is to be recreated due to PostAlterType
+ * processing, preserve its indisclustered property by issuing ALTER TABLE
+ * CLUSTER ON command on the table that will run after the command to recreate
+ * the index.
+ */
+static void
+PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid)
+{
+   HeapTuple   indexTuple;
+   Form_pg_index indexForm;
+
+   Assert(OidIsValid(indoid));
+   Assert(pass == AT_PASS_OLD_INDEX);
+
+   indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid));
+   if (!HeapTupleIsValid(indexTuple))
+   elog(ERROR, "cache lookup failed for index %u", indoid);
+   indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+   if (indexForm->indisclustered)
+   {
+   AlterTableCmd *newcmd = makeNode(AlterTableCmd);
+
+   newcmd->subtype = AT_ClusterOn;
+   newcmd->name = get_rel_name(indoid);
+   tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
+   }
+
+   ReleaseSysCache(indexTuple);
+}
+
 /*
  * Subroutine for ATPostAlterTypeParse().  Calls out to CheckIndexCompatible()
  * for the real analysis, then mutates the IndexStmt based on that verdict.
diff --git a/src/test/regress/expected/alter_table.out 
b/src/test/regress/expected/alter_table.out
index e73ce4b6f3..09c00eef05 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -4258,3 +4258,37 @@ alter table at_test_sql_partop attach partition 
at_test_sql_partop_1 for values
 drop table at_test_sql_partop;
 drop operator class at_test_sql_partop using btree;
 drop function at_test_sql_partop;
+-- alter type rewrite/rebuild should preserve cluster marking on index
+create table alttype_cluster (a int);
+create index alttype_cluster_a on alttype_cluster (a);
+alter table alttype_cluster cluster on alttype_cluster_a;
+select indisclustered from pg_index where indrelid = 
'alttype_cluster'::regclass;
+ indisclustered 
+
+ t
+(1 row)
+
+alter table alttype_cluster alter a type bigint;
+select indisclustered from pg_index where 

Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-05 Thread Justin Pryzby
On Wed, Feb 05, 2020 at 03:53:45PM +0900, Amit Langote wrote:
> Hi Justin,
> 
> On Mon, Feb 3, 2020 at 1:17 AM Justin Pryzby  wrote:
> > Other options are preserved by ALTER (and CLUSTER ON is and most obviously
> > should be preserved by CLUSTER's rewrite), so I think (SET) CLUSTER should 
> > be
> > preserved by ALTER, too.
> 
> Yes.
> 
> create table foo (a int primary key);
> cluster foo;
> ERROR:  there is no previously clustered index for table "foo"
> cluster foo using foo_pkey;
> alter table foo alter a type bigint;
> cluster foo;
> ERROR:  there is no previously clustered index for table "foo"
> 
> With your patch, this last error doesn't occur.
> 
> Like you, I too suspect that losing indisclustered like this is
> unintentional, so should be fixed.

Thanks for checking.

It doesn't need to be said, but your patch is obviously superior.

I ran into this while looking into a suggestion from Alvaro that ALTER should
rewrite in order of a clustered index (if any) rather than in pre-existing heap
order (more on that another day).  So while this looks like a bug, and I can't
think how a backpatch would break something, my suggestion is that backpatching
a fix is of low value, so it's only worth +0.

Thanks
Justin




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-02-04 Thread Amit Langote
Hi Justin,

On Mon, Feb 3, 2020 at 1:17 AM Justin Pryzby  wrote:
> Other options are preserved by ALTER (and CLUSTER ON is and most obviously
> should be preserved by CLUSTER's rewrite), so I think (SET) CLUSTER should be
> preserved by ALTER, too.

Yes.

create table foo (a int primary key);
cluster foo;
ERROR:  there is no previously clustered index for table "foo"
cluster foo using foo_pkey;
alter table foo alter a type bigint;
cluster foo;
ERROR:  there is no previously clustered index for table "foo"

With your patch, this last error doesn't occur.

Like you, I too suspect that losing indisclustered like this is
unintentional, so should be fixed.

> As far as I can see, this should be the responsibility of something in the
> vicinity of ATPostAlterTypeParse/RememberIndexForRebuilding.
>
> Attach patch sketches a fix.

While your sketch hits pretty close, it could be done a bit
differently.  For one, I don't like the way it's misusing
changedIndexOids and changedIndexDefs.

Instead, we can do something similar to what
RebuildConstraintComments() does for constraint comments.  For
example, we can have a PreserveClusterOn() that adds a AT_ClusterOn
command into table's AT_PASS_OLD_INDEX pass commands.  Attached patch
shows what I'm thinking.  I also added representative tests.

Thanks,
Amit
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f599393473..ea9941278e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, 
Oid refRelId,
 static void RebuildConstraintComment(AlteredTableInfo *tab, int pass,
 Oid 
objid, Relation rel, List *domname,
 const 
char *conname);
+static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid);
 static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
 static void TryReuseForeignKey(Oid oldId, Constraint *con);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char 
*colName,
@@ -11832,6 +11833,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid 
refRelId, char *cmd,
newcmd->def = (Node *) stmt;
tab->subcmds[AT_PASS_OLD_INDEX] =
lappend(tab->subcmds[AT_PASS_OLD_INDEX], 
newcmd);
+
+   /* Preserve index's indisclustered property, if set. */
+   PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId);
}
else if (IsA(stm, AlterTableStmt))
{
@@ -11868,6 +11872,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid 
refRelId, char *cmd,

 rel,

 NIL,

 indstmt->idxname);
+
+   /* Preserve index's indisclustered 
property, if set. */
+   PreserveClusterOn(tab, 
AT_PASS_OLD_INDEX, indoid);
}
else if (cmd->subtype == AT_AddConstraint)
{
@@ -11990,6 +11997,37 @@ RebuildConstraintComment(AlteredTableInfo *tab, int 
pass, Oid objid,
tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
 }
 
+/*
+ * For an index on table that's being recreated due PostAlterType processing,
+ * preserve its indisclustered property by issuing ALTER TABLE CLUSTER ON on
+ * table to run after the command to re-create the index.
+ */
+static void
+PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid)
+{
+   HeapTuple   indexTuple;
+   Form_pg_index indexForm;
+
+   Assert(OidIsValid(indoid));
+   Assert(pass == AT_PASS_OLD_INDEX);
+
+   indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid));
+   if (!HeapTupleIsValid(indexTuple))
+   elog(ERROR, "cache lookup failed for index %u", indoid);
+   indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
+
+   if (indexForm->indisclustered)
+   {
+   AlterTableCmd *newcmd = makeNode(AlterTableCmd);
+
+   newcmd->subtype = AT_ClusterOn;
+   newcmd->name = get_rel_name(indoid);
+   tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd);
+   }
+
+   ReleaseSysCache(indexTuple);
+}
+
 /*
  * Subroutine for ATPostAlterTypeParse().  Calls out to CheckIndexCompatible()
  * for the real analysis, then mutates the IndexStmt based on that verdict.
diff --git a/src/test/regress/expected/alter_table.out 
b/src/test/regress/expected/alter_table.out
index e73ce4b6f3..010a7aa98b 100644
--- a/src/test/regress/expected/alter_table.out
+++