Re: ALTER tbl rewrite loses CLUSTER ON index
On Fri, Apr 03, 2020 at 03:54:38PM +0900, Michael Paquier wrote: > Now, would it be better to apply the refactoring patch for HEAD before > feature freeze, or are people fine if this is committed a bit after? > Patch 0002 is neither a new feature, nor an actual bug, and just some > code cleanup, but I am a bit worried about applying that cleanup on > HEAD just after the freeze. I have worked more on this one this morning and just applied the bug fix down to 9.5, and the refactoring on HEAD. Thanks! -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On Thu, Apr 02, 2020 at 04:38:36AM -0300, Alvaro Herrera wrote: > I don't think we need to worry about that problem, because we already > checked that the pg_class tuple for the index is there two lines above. > The pg_index tuple cannot have gone away on its own; and the index can't > be deleted either, because cluster_rel holds AEL on the table. There > isn't "probably" about the can't-happen condition, is there? Yes, you are right here. I was wondering about an interference with the multi-relation cluster that would not lock the parent relation at the upper level of cluster() but the check on the existence of the index makes sure that we'll never see an invalid entry in pg_index, so let's keep patch 0002 as originally presented. As the commit tree is going to be rather crowded until feature freeze on Sunday, I'll wait until Monday or Tuesday to finalize this patch set. Now, would it be better to apply the refactoring patch for HEAD before feature freeze, or are people fine if this is committed a bit after? Patch 0002 is neither a new feature, nor an actual bug, and just some code cleanup, but I am a bit worried about applying that cleanup on HEAD just after the freeze. -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On Thu, Apr 02, 2020 at 04:24:03PM +0900, Michael Paquier wrote: > On Thu, Apr 02, 2020 at 01:52:09AM -0500, Justin Pryzby wrote: >> Sounds right. Or else get_index_isclustered() could be redefined to take a >> boolean "do_elog" flag, and if syscache fails and that's false, then return >> false instead of ERROR. > > Not sure if that's completely right to do either. For one, it is not > consistent with the surroundings as of lsyscache.c. Actually, we do have some missing_ok flags lying around already in lsyscache.c, so it would be much more consistent to use that name that instead of the do_elog you are suggesting. Could you update the patch to reflect that? -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On 2020-Apr-02, Michael Paquier wrote: > Now, regarding patch 0002, note that you have a problem for this part: > -tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid)); > -if (!HeapTupleIsValid(tuple))/* probably can't happen */ > -{ > -relation_close(OldHeap, AccessExclusiveLock); > -pgstat_progress_end_command(); > -return; > -} > -indexForm = (Form_pg_index) GETSTRUCT(tuple); > -if (!indexForm->indisclustered) > +if (!get_index_isclustered(indexOid)) > { > -ReleaseSysCache(tuple); > relation_close(OldHeap, AccessExclusiveLock); > pgstat_progress_end_command(); > return; > } > -ReleaseSysCache(tuple); > > On an invalid tuple for pg_index, the new code would issue an error, > while the old code would just return. I don't think we need to worry about that problem, because we already checked that the pg_class tuple for the index is there two lines above. The pg_index tuple cannot have gone away on its own; and the index can't be deleted either, because cluster_rel holds AEL on the table. There isn't "probably" about the can't-happen condition, is there? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER tbl rewrite loses CLUSTER ON index
On Thu, Apr 02, 2020 at 01:52:09AM -0500, Justin Pryzby wrote: > Sounds right. Or else get_index_isclustered() could be redefined to take a > boolean "do_elog" flag, and if syscache fails and that's false, then return > false instead of ERROR. Not sure if that's completely right to do either. For one, it is not consistent with the surroundings as of lsyscache.c. -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On Thu, Apr 02, 2020 at 03:14:21PM +0900, Michael Paquier wrote: > Now, regarding patch 0002, note that you have a problem for this part: > -tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid)); > -if (!HeapTupleIsValid(tuple))/* probably can't happen */ > -{ > -relation_close(OldHeap, AccessExclusiveLock); > -pgstat_progress_end_command(); > -return; > -} > -indexForm = (Form_pg_index) GETSTRUCT(tuple); > -if (!indexForm->indisclustered) > +if (!get_index_isclustered(indexOid)) > { > -ReleaseSysCache(tuple); > relation_close(OldHeap, AccessExclusiveLock); > pgstat_progress_end_command(); > return; > } > -ReleaseSysCache(tuple); > > On an invalid tuple for pg_index, the new code would issue an error, > while the old code would just return. And it seems to me that this > can lead to problems because the parent relation is processed and > locked at the beginning of cluster_rel(), *after* we know the index > OID to work on. > The refactoring is fine for the other two areas > though, so I think that there is still value to apply > get_index_isclustered() within mark_index_clustered() and cluster(), > and I would suggest to keep 0002 to that. > > Justin, what do you think? Sounds right. Or else get_index_isclustered() could be redefined to take a boolean "do_elog" flag, and if syscache fails and that's false, then return false instead of ERROR. -- Justin
Re: ALTER tbl rewrite loses CLUSTER ON index
On Wed, Mar 18, 2020 at 11:48:37AM +0900, Michael Paquier wrote: > Anyway, Tom, Alvaro, are you planning to look at what is proposed on > this thread? I don't want to step on your toes if that's the case and > it seems to me that the approach taken by the patch is sound, using as > basic fix the addition of an AT_ClusterOn sub-command to the list of > commands to execute when rebuilding the table, ensuring that any > follow-up CLUSTER command will use the correct index. Hearing nothing, I have been looking at the patches sent upthread, and did some modifications as per the attached for 0001. The logic looked fine to me and it is unchanged as you are taking care of normal indexes as well as constraint indexes. Please note that I have tweaked some comments, and removed what was on top of RememberConstraintForRebuilding() as that was just a duplicate. Regarding the tests, I was annoyed by the fact that the logic was not manipulating two indexes at the same time on the relation rewritten with a normal and a constraint index, and cross-checking both at the same time to see which one is clustered after each rewrite is good for consistency. Now, regarding patch 0002, note that you have a problem for this part: -tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid)); -if (!HeapTupleIsValid(tuple))/* probably can't happen */ -{ -relation_close(OldHeap, AccessExclusiveLock); -pgstat_progress_end_command(); -return; -} -indexForm = (Form_pg_index) GETSTRUCT(tuple); -if (!indexForm->indisclustered) +if (!get_index_isclustered(indexOid)) { -ReleaseSysCache(tuple); relation_close(OldHeap, AccessExclusiveLock); pgstat_progress_end_command(); return; } -ReleaseSysCache(tuple); On an invalid tuple for pg_index, the new code would issue an error, while the old code would just return. And it seems to me that this can lead to problems because the parent relation is processed and locked at the beginning of cluster_rel(), *after* we know the index OID to work on. The refactoring is fine for the other two areas though, so I think that there is still value to apply get_index_isclustered() within mark_index_clustered() and cluster(), and I would suggest to keep 0002 to that. Justin, what do you think? -- Michael From 60a2e5a615a39b852985166531019ac1baa4146a Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 2 Apr 2020 14:52:15 +0900 Subject: [PATCH v8] Preserve clustered indexes after rewrites of ALTER TABLE --- src/include/utils/lsyscache.h | 1 + src/backend/commands/tablecmds.c | 47 ++ src/backend/utils/cache/lsyscache.c | 23 +++ src/test/regress/expected/alter_table.out | 48 +++ src/test/regress/sql/alter_table.sql | 25 5 files changed, 144 insertions(+) diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h index 374f57fb43..c9c68e2f4f 100644 --- a/src/include/utils/lsyscache.h +++ b/src/include/utils/lsyscache.h @@ -185,6 +185,7 @@ extern Oid get_range_collation(Oid rangeOid); extern Oid get_index_column_opclass(Oid index_oid, int attno); extern bool get_index_isreplident(Oid index_oid); extern bool get_index_isvalid(Oid index_oid); +extern bool get_index_isclustered(Oid index_oid); #define type_is_array(typid) (get_element_type(typid) != InvalidOid) /* type_is_array_domain accepts both plain arrays and domains over arrays */ diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index c8c88be2c9..ef234075dc 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -177,6 +177,7 @@ typedef struct AlteredTableInfo List *changedIndexOids; /* OIDs of indexes to rebuild */ List *changedIndexDefs; /* string definitions of same */ char *replicaIdentityIndex; /* index to reset as REPLICA IDENTITY */ + char *clusterOnIndex; /* index to use for CLUSTER */ } AlteredTableInfo; /* Struct describing one new constraint to check in Phase 3 scan */ @@ -11581,6 +11582,21 @@ RememberReplicaIdentityForRebuilding(Oid indoid, AlteredTableInfo *tab) tab->replicaIdentityIndex = get_rel_name(indoid); } +/* + * Subroutine for ATExecAlterColumnType: remember any clustered index. + */ +static void +RememberClusterOnForRebuilding(Oid indoid, AlteredTableInfo *tab) +{ + if (!get_index_isclustered(indoid)) + return; + + if (tab->clusterOnIndex) + elog(ERROR, "relation %u has multiple clustered indexes", tab->relid); + + tab->clusterOnIndex = get_rel_name(indoid); +} + /* * Subroutine for ATExecAlterColumnType: remember that a constraint needs * to be rebuilt (which we might already know). @@ -11606,9 +11622,18 @@ RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo
Re: ALTER tbl rewrite loses CLUSTER ON index
On Tue, Mar 17, 2020 at 11:20:44AM -0500, Justin Pryzby wrote: > On Tue, Mar 17, 2020 at 02:33:32PM +0900, Michael Paquier wrote: >> Patch 0002 from Justin does that, I would keep this refactoring as >> HEAD-only material though, and I don't spot any other code paths in >> need of patching. >> >> The commit message of patch 0001 is not what you wanted I guess. > > That's what git-am does, and I didn't find any option to make it less > unreadable. I guess I should just delete the email body it inserts. Strange... Anyway, Tom, Alvaro, are you planning to look at what is proposed on this thread? I don't want to step on your toes if that's the case and it seems to me that the approach taken by the patch is sound, using as basic fix the addition of an AT_ClusterOn sub-command to the list of commands to execute when rebuilding the table, ensuring that any follow-up CLUSTER command will use the correct index. -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On Tue, Mar 17, 2020 at 02:33:32PM +0900, Michael Paquier wrote: > > Yeah, in cluster(), mark_index_clustered(). > > Patch 0002 from Justin does that, I would keep this refactoring as > HEAD-only material though, and I don't spot any other code paths in > need of patching. > > The commit message of patch 0001 is not what you wanted I guess. That's what git-am does, and I didn't find any option to make it less unreadable. I guess I should just delete the email body it inserts. | The commit message is formed by the title taken from the "Subject: ", a | blank line and the body of the message up to where the patch begins. Excess | whitespace at the end of each line is automatically stripped. -- Justin
Re: ALTER tbl rewrite loses CLUSTER ON index
On Mon, Mar 16, 2020 at 11:25:23AM -0300, Alvaro Herrera wrote: > On 2020-Mar-16, Justin Pryzby wrote: > > > Also, should we call it "is_index_clustered", since otherwise it sounds alot > > like "+get_index_clustered" (without "is"), which sounds like it takes a > > table > > and returns which index is clustered. That might be just as useful for > > some of > > these callers. > > Amit's proposed name seems to match lsyscache.c usual conventions better. There were no get_index_isvalid() (introduced by me) or get_index_isreplident() (introduced by Peter) until last week, and those names have been chosen to be consistent with the existing get_index_column_opclass(), so using get_index_isclustered is in my opinion the most consistent choice. > Yeah, in cluster(), mark_index_clustered(). Patch 0002 from Justin does that, I would keep this refactoring as HEAD-only material though, and I don't spot any other code paths in need of patching. The commit message of patch 0001 is not what you wanted I guess. if (OidIsValid(indexOid)) { -indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid)); -if (!HeapTupleIsValid(indexTuple)) -elog(ERROR, "cache lookup failed for index %u", indexOid); -indexForm = (Form_pg_index) GETSTRUCT(indexTuple); - -if (indexForm->indisclustered) -{ -ReleaseSysCache(indexTuple); +if (get_index_isclustered(indexOid)) return; -} - -ReleaseSysCache(indexTuple); } No need for two layers of if(s) here. +create index alttype_cluster_a on alttype_cluster (a); +alter table alttype_cluster cluster on alttype_cluster_a; +select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass; Would it make sense to add a second index not used for clustering to check after the case where isclustered is false? A second thing would be to check if relfilenode values match before and after each DDL to make sure that a rewrite happened or not, see check_ddl_rewrite() for example in alter_table.sql. Keeping both RememberClusterOnForRebuilding and RememberReplicaIdentityForRebuilding as separate looks fine to me. The code could be a bit more organized though. We have RememberIndexForRebuilding which may go through RememberConstraintForRebuilding if the relation OID changed is a constraint, and both register the replindent or isclustered information if present. Not really something for this patch set to care about, just a thought while reading this code. While looking at this bug, I have spotted a behavior which is perhaps not welcome. Take this test case: create table aa (a int); insert into aa values (1), (1); create unique index concurrently aai on aa(a); -- fails alter table aa alter column a type bigint; This generates the following error: ERROR: 23505: could not create unique index "aai" DETAIL: Key (a)=(1) is duplicated. SCHEMA NAME: public TABLE NAME: aa CONSTRAINT NAME: aai LOCATION: comparetup_index_btree, tuplesort.c:4049 After a REINDEX CONCURRENTLY, we may leave behind an invalid index on the relation's toast table or even normal indexes. CREATE INDEX CONCURRENTLY may also leave behind invalid indexes. If triggering an ALTER TABLE that causes a rewrite of the relation, we have the following behavior: - An invalid toast index is correctly discarded, keeping one valid toast index. No problem here. - Invalid non-toast indexes are all rebuilt. If the index relies on a constraint then ALTER TABLE would fail, like the above. I am wondering if there is an argument for not including invalid indexes in the rebuilt version, even if the existing behavior may be useful for some users. -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On 2020-Mar-16, Justin Pryzby wrote: > Also, should we call it "is_index_clustered", since otherwise it sounds alot > like "+get_index_clustered" (without "is"), which sounds like it takes a table > and returns which index is clustered. That might be just as useful for some > of > these callers. Amit's proposed name seems to match lsyscache.c usual conventions better. > Should we use your get_index_isclustered more widely ? Yeah, in cluster(), mark_index_clustered(). -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER tbl rewrite loses CLUSTER ON index
On Mon, Mar 16, 2020 at 04:01:42PM +0900, Amit Langote wrote: > I came across a commit that recently went in: > > commit 1cc9c2412cc9a2fbe6a381170097d315fd40ccca > Author: Peter Eisentraut > Date: Fri Mar 13 11:28:11 2020 +0100 > > Preserve replica identity index across ALTER TABLE rewrite > > which fixes something very similar to what we are trying to with this > patch. The way it's done looks to me very close to what you are > telling. I have updated the patch to be similar to the above fix. Yes, I noticed it too. Should we use your get_index_isclustered more widely ? Also, should we call it "is_index_clustered", since otherwise it sounds alot like "+get_index_clustered" (without "is"), which sounds like it takes a table and returns which index is clustered. That might be just as useful for some of these callers. -- Justin >From add5e8481c70b6b66342b264a243f26f4c634e53 Mon Sep 17 00:00:00 2001 From: Amit Langote Date: Mon, 16 Mar 2020 16:01:42 +0900 Subject: [PATCH v7 1/2] ALTER tbl rewrite loses CLUSTER ON index On Fri, Mar 13, 2020 at 2:19 AM Tom Lane wrote: > Justin Pryzby writes: > > @cfbot: resending with only Amit's 0001, since Michael pushed a variation on > > 0002. Thank you for taking a look at it. > Boy, I really dislike this patch. ATPostAlterTypeParse is documented as > using the supplied definition string, and nothing else, to reconstruct > the index. This breaks that without even the courtesy of documenting > the breakage. Moreover, the reason why it's designed like that is to > avoid requiring the old index objects to still be accessible. So I'm > surprised that this hack works at all. I don't think it would have > worked at the time the code was first written, and I think it's imposing > a constraint we'll have problems with (again?) in future. Okay, so maybe in the middle of ATPostAlterTypeParse() is not a place to do it, but don't these arguments apply to RebuildConstraintComment(), which I based the patch on? > The right way to fix this is to note the presence of the indisclustered > flag when we're examining the index earlier, and include a suitable > command in the definition string. So probably pg_get_indexdef_string() > is what needs to change. It doesn't look like that's used anywhere > else, so we can just redefine its behavior as needed. I came across a commit that recently went in: commit 1cc9c2412cc9a2fbe6a381170097d315fd40ccca Author: Peter Eisentraut Date: Fri Mar 13 11:28:11 2020 +0100 Preserve replica identity index across ALTER TABLE rewrite which fixes something very similar to what we are trying to with this patch. The way it's done looks to me very close to what you are telling. I have updated the patch to be similar to the above fix. -- Thank you, Amit --- src/backend/commands/tablecmds.c | 49 +++ src/backend/utils/cache/lsyscache.c | 23 +++ src/include/utils/lsyscache.h | 1 + src/test/regress/expected/alter_table.out | 34 src/test/regress/sql/alter_table.sql | 15 +++ 5 files changed, 122 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 8c33b67c1b..73d9392878 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -177,6 +177,7 @@ typedef struct AlteredTableInfo List *changedIndexOids; /* OIDs of indexes to rebuild */ List *changedIndexDefs; /* string definitions of same */ char *replicaIdentityIndex; /* index to reset as REPLICA IDENTITY */ + char *clusterOnIndex; /* index to CLUSTER ON */ } AlteredTableInfo; /* Struct describing one new constraint to check in Phase 3 scan */ @@ -11579,9 +11580,28 @@ RememberReplicaIdentityForRebuilding(Oid indoid, AlteredTableInfo *tab) tab->replicaIdentityIndex = get_rel_name(indoid); } +/* + * Subroutine for ATExecAlterColumnType: remember any clustered index. + */ +static void +RememberClusterOnForRebuilding(Oid indoid, AlteredTableInfo *tab) +{ + if (!get_index_isclustered(indoid)) + return; + + if (tab->clusterOnIndex) + elog(ERROR, "relation %u has multiple clustered indexes", tab->relid); + + tab->clusterOnIndex = get_rel_name(indoid); +} + /* * Subroutine for ATExecAlterColumnType: remember that a constraint needs * to be rebuilt (which we might already know). + * + * For constraint's index (if any), also remember if it's the table's replica + * identity or its clustered index, so that ATPostAlterTypeCleanup() can + * queue up commands necessary to restore that property. */ static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab) @@ -11604,9 +11624,18 @@ RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab) tab->changedConstraintDefs = lappend(tab->changedConstraintDefs, defstring); + /* + * For constraint's index (if any), remember if it's the table's + * replica identity or its clustered index, so that + *
Re: ALTER tbl rewrite loses CLUSTER ON index
On Fri, Mar 13, 2020 at 2:19 AM Tom Lane wrote: > Justin Pryzby writes: > > @cfbot: resending with only Amit's 0001, since Michael pushed a variation on > > 0002. Thank you for taking a look at it. > Boy, I really dislike this patch. ATPostAlterTypeParse is documented as > using the supplied definition string, and nothing else, to reconstruct > the index. This breaks that without even the courtesy of documenting > the breakage. Moreover, the reason why it's designed like that is to > avoid requiring the old index objects to still be accessible. So I'm > surprised that this hack works at all. I don't think it would have > worked at the time the code was first written, and I think it's imposing > a constraint we'll have problems with (again?) in future. Okay, so maybe in the middle of ATPostAlterTypeParse() is not a place to do it, but don't these arguments apply to RebuildConstraintComment(), which I based the patch on? > The right way to fix this is to note the presence of the indisclustered > flag when we're examining the index earlier, and include a suitable > command in the definition string. So probably pg_get_indexdef_string() > is what needs to change. It doesn't look like that's used anywhere > else, so we can just redefine its behavior as needed. I came across a commit that recently went in: commit 1cc9c2412cc9a2fbe6a381170097d315fd40ccca Author: Peter Eisentraut Date: Fri Mar 13 11:28:11 2020 +0100 Preserve replica identity index across ALTER TABLE rewrite which fixes something very similar to what we are trying to with this patch. The way it's done looks to me very close to what you are telling. I have updated the patch to be similar to the above fix. -- Thank you, Amit v6-0001-ALTER-tbl-rewrite-loses-CLUSTER-ON-index.patch Description: Binary data
Re: ALTER tbl rewrite loses CLUSTER ON index
Justin Pryzby writes: > @cfbot: resending with only Amit's 0001, since Michael pushed a variation on > 0002. Boy, I really dislike this patch. ATPostAlterTypeParse is documented as using the supplied definition string, and nothing else, to reconstruct the index. This breaks that without even the courtesy of documenting the breakage. Moreover, the reason why it's designed like that is to avoid requiring the old index objects to still be accessible. So I'm surprised that this hack works at all. I don't think it would have worked at the time the code was first written, and I think it's imposing a constraint we'll have problems with (again?) in future. The right way to fix this is to note the presence of the indisclustered flag when we're examining the index earlier, and include a suitable command in the definition string. So probably pg_get_indexdef_string() is what needs to change. It doesn't look like that's used anywhere else, so we can just redefine its behavior as needed. regards, tom lane
Re: ALTER tbl rewrite loses CLUSTER ON index
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:not tested I tested the patch on the master branch (a77315fdf2a197a925e670be2d8b376c4ac02efc) and it works fine. The new status of this patch is: Ready for Committer
Re: ALTER tbl rewrite loses CLUSTER ON index
@cfbot: resending with only Amit's 0001, since Michael pushed a variation on 0002. -- Justin >From 865fc2713ad29d0f8c0f63609a7c15c83cfa5cfe Mon Sep 17 00:00:00 2001 From: Amit Langote Date: Thu, 6 Feb 2020 18:14:16 +0900 Subject: [PATCH v5] ALTER tbl rewrite loses CLUSTER ON index --- src/backend/commands/tablecmds.c | 39 +++ src/test/regress/expected/alter_table.out | 34 src/test/regress/sql/alter_table.sql | 16 +- 3 files changed, 88 insertions(+), 1 deletion(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 02a7c04fdb..6b2469bd09 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, static void RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, Relation rel, List *domname, const char *conname); +static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid); static void TryReuseIndex(Oid oldId, IndexStmt *stmt); static void TryReuseForeignKey(Oid oldId, Constraint *con); static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName, @@ -11838,6 +11839,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, newcmd->def = (Node *) stmt; tab->subcmds[AT_PASS_OLD_INDEX] = lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId); } else if (IsA(stm, AlterTableStmt)) { @@ -11874,6 +11878,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, rel, NIL, indstmt->idxname); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, indoid); } else if (cmd->subtype == AT_AddConstraint) { @@ -11996,6 +12003,38 @@ RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); } +/* + * For a table's index that is to be recreated due to PostAlterType + * processing, preserve its indisclustered property by issuing ALTER TABLE + * CLUSTER ON command on the table that will run after the command to recreate + * the index. + */ +static void +PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid) +{ + HeapTuple indexTuple; + Form_pg_index indexForm; + + Assert(OidIsValid(indoid)); + Assert(pass == AT_PASS_OLD_INDEX); + + indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid)); + if (!HeapTupleIsValid(indexTuple)) + elog(ERROR, "cache lookup failed for index %u", indoid); + indexForm = (Form_pg_index) GETSTRUCT(indexTuple); + + if (indexForm->indisclustered) + { + AlterTableCmd *newcmd = makeNode(AlterTableCmd); + + newcmd->subtype = AT_ClusterOn; + newcmd->name = get_rel_name(indoid); + tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); + } + + ReleaseSysCache(indexTuple); +} + /* * Subroutine for ATPostAlterTypeParse(). Calls out to CheckIndexCompatible() * for the real analysis, then mutates the IndexStmt based on that verdict. diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index fb6d86a269..a01c6d6ec5 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -4296,3 +4296,37 @@ create trigger xtrig update bar1 set a = a + 1; INFO: a=1, b=1 /* End test case for bug #16242 */ +-- alter type rewrite/rebuild should preserve cluster marking on index +create table alttype_cluster (a int); +create index alttype_cluster_a on alttype_cluster (a); +alter table alttype_cluster cluster on alttype_cluster_a; +select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass; + indisclustered + + t +(1 row) + +alter table alttype_cluster alter a type bigint; +select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass; + indisclustered + + t +(1 row) + +drop index alttype_cluster_a; +alter table alttype_cluster add primary key (a); +alter table alttype_cluster cluster on alttype_cluster_pkey; +select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass; + indisclustered + + t +(1 row) + +alter table alttype_cluster alter a type int; +select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass; + indisclustered + + t +(1 row) + +drop table alttype_cluster; diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql index 3801f19c58..6e9048bbec 100644 --- a/src/test/regress/sql/alter_table.sql +++ b/src/test/regress/sql/alter_table.sql @@ -2802,7 +2802,6 @@ drop table at_test_sql_partop; drop operator class at_test_sql_partop using btree; drop function at_test_sql_partop;
Re: ALTER tbl rewrite loses CLUSTER ON index
On Mon, Mar 02, 2020 at 12:28:18PM +0900, Michael Paquier wrote: > > +SELECT indexrelid::regclass FROM pg_index WHERE > > indrelid='concur_clustered'::regclass; > > This test should check after indisclustered. Except that, the patch > is fine so I'll apply it if there are no objections. Oops - I realized that, but didn't send a new patch before you noticed - thanks for handling it. -- Justin
Re: ALTER tbl rewrite loses CLUSTER ON index
On Mon, Mar 02, 2020 at 12:28:18PM +0900, Michael Paquier wrote: > This test should check after indisclustered. Except that, the patch > is fine so I'll apply it if there are no objections. I got a second look at this one, and applied it down to 12 after some small modifications in the new test and in the comments. -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On Sat, Feb 29, 2020 at 10:52:58AM -0600, Justin Pryzby wrote: > Rebased Amit's patch and included my own 0002 to fix the REINDEX CONCURRENTLY > issue. I have looked at 0002 as that concerns me. > +SELECT indexrelid::regclass FROM pg_index WHERE > indrelid='concur_clustered'::regclass; > + indexrelid > + > + concur_clustered_i_idx > +(1 row) This test should check after indisclustered. Except that, the patch is fine so I'll apply it if there are no objections. -- Michael signature.asc Description: PGP signature
Re: ALTER tbl rewrite loses CLUSTER ON index
On Fri, Feb 28, 2020 at 08:42:02PM -0600, Justin Pryzby wrote: > On Fri, Feb 28, 2020 at 06:26:04PM -0500, Tom Lane wrote: > > Justin Pryzby writes: > > > I think the attached is 80% complete (I didn't touch pg_dump). > > > One objection to this change would be that all relations (including > > > indices) > > > end up with relclustered fields, and pg_index already has a number of > > > bools, so > > > it's not like this one bool is wasting a byte. > > > I think relisclustered was a's clever way of avoiding that overhead > > > (c0ad5953). > > > So I would be -0.5 on moving it to pg_class.. > > > But I think 0001 and 0002 are worthy. Maybe the test in 0002 should live > > > somewhere else. > > > > 0001 has been superseded by events (faade5d4c), so the cfbot is choking > > on that one's failure to apply, and not testing any further. Please > > repost without 0001 so that we can get this testing again. > > I've just noticed while working on (1) that this separately affects REINDEX > CONCURRENTLY, which would be a new bug in v12. Without CONCURRENTLY there's > no > issue. I guess we need a separate patch for that case. Rebased Amit's patch and included my own 0002 to fix the REINDEX CONCURRENTLY issue. -- Justin >From dccc6f08450eacafc3a08bc8b2836d2feb23a533 Mon Sep 17 00:00:00 2001 From: Amit Langote Date: Thu, 6 Feb 2020 18:14:16 +0900 Subject: [PATCH v4 1/2] ALTER tbl rewrite loses CLUSTER ON index --- src/backend/commands/tablecmds.c | 39 +++ src/test/regress/expected/alter_table.out | 34 src/test/regress/sql/alter_table.sql | 16 +- 3 files changed, 88 insertions(+), 1 deletion(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 02a7c04fdb..6b2469bd09 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, static void RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, Relation rel, List *domname, const char *conname); +static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid); static void TryReuseIndex(Oid oldId, IndexStmt *stmt); static void TryReuseForeignKey(Oid oldId, Constraint *con); static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName, @@ -11838,6 +11839,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, newcmd->def = (Node *) stmt; tab->subcmds[AT_PASS_OLD_INDEX] = lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId); } else if (IsA(stm, AlterTableStmt)) { @@ -11874,6 +11878,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, rel, NIL, indstmt->idxname); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, indoid); } else if (cmd->subtype == AT_AddConstraint) { @@ -11996,6 +12003,38 @@ RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); } +/* + * For a table's index that is to be recreated due to PostAlterType + * processing, preserve its indisclustered property by issuing ALTER TABLE + * CLUSTER ON command on the table that will run after the command to recreate + * the index. + */ +static void +PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid) +{ + HeapTuple indexTuple; + Form_pg_index indexForm; + + Assert(OidIsValid(indoid)); + Assert(pass == AT_PASS_OLD_INDEX); + + indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid)); + if (!HeapTupleIsValid(indexTuple)) + elog(ERROR, "cache lookup failed for index %u", indoid); + indexForm = (Form_pg_index) GETSTRUCT(indexTuple); + + if (indexForm->indisclustered) + { + AlterTableCmd *newcmd = makeNode(AlterTableCmd); + + newcmd->subtype = AT_ClusterOn; + newcmd->name = get_rel_name(indoid); + tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); + } + + ReleaseSysCache(indexTuple); +} + /* * Subroutine for ATPostAlterTypeParse(). Calls out to CheckIndexCompatible() * for the real analysis, then mutates the IndexStmt based on that verdict. diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index fb6d86a269..a01c6d6ec5 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -4296,3 +4296,37 @@ create trigger xtrig update bar1 set a = a + 1; INFO: a=1, b=1 /* End test case for bug #16242 */ +-- alter type rewrite/rebuild should preserve cluster marking on index +create table alttype_cluster (a int); +create index alttype_cluster_a on alttype_cluster (a); +alter table alttype_cluster cluster on alttype_cluster_a; +select
Re: ALTER tbl rewrite loses CLUSTER ON index
On Fri, Feb 28, 2020 at 06:26:04PM -0500, Tom Lane wrote: > Justin Pryzby writes: > > I think the attached is 80% complete (I didn't touch pg_dump). > > One objection to this change would be that all relations (including indices) > > end up with relclustered fields, and pg_index already has a number of > > bools, so > > it's not like this one bool is wasting a byte. > > I think relisclustered was a's clever way of avoiding that overhead > > (c0ad5953). > > So I would be -0.5 on moving it to pg_class.. > > But I think 0001 and 0002 are worthy. Maybe the test in 0002 should live > > somewhere else. > > 0001 has been superseded by events (faade5d4c), so the cfbot is choking > on that one's failure to apply, and not testing any further. Please > repost without 0001 so that we can get this testing again. I've just noticed while working on (1) that this separately affects REINDEX CONCURRENTLY, which would be a new bug in v12. Without CONCURRENTLY there's no issue. I guess we need a separate patch for that case. (1) https://commitfest.postgresql.org/27/2269/ The ALTER bug goes back further and its fix should be a kept separate. postgres=# DROP TABLE tt; CREATE TABLE tt(i int unique); CLUSTER tt USING tt_i_key; CLUSTER tt; REINDEX INDEX tt_i_key; CLUSTER tt; DROP TABLE CREATE TABLE CLUSTER CLUSTER REINDEX CLUSTER postgres=# DROP TABLE tt; CREATE TABLE tt(i int unique); CLUSTER tt USING tt_i_key; CLUSTER tt; REINDEX INDEX CONCURRENTLY tt_i_key; CLUSTER tt; DROP TABLE CREATE TABLE CLUSTER CLUSTER REINDEX ERROR: there is no previously clustered index for table "tt" -- Justin
Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)
Justin Pryzby writes: > I think the attached is 80% complete (I didn't touch pg_dump). > One objection to this change would be that all relations (including indices) > end up with relclustered fields, and pg_index already has a number of bools, > so > it's not like this one bool is wasting a byte. > I think relisclustered was a's clever way of avoiding that overhead > (c0ad5953). > So I would be -0.5 on moving it to pg_class.. > But I think 0001 and 0002 are worthy. Maybe the test in 0002 should live > somewhere else. 0001 has been superseded by events (faade5d4c), so the cfbot is choking on that one's failure to apply, and not testing any further. Please repost without 0001 so that we can get this testing again. regards, tom lane
Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)
On Mon, Feb 17, 2020 at 02:31:42PM +0900, Amit Langote wrote: > Hi Justin, > > On Fri, Feb 7, 2020 at 11:39 PM Justin Pryzby wrote: > > On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote: > > > On 2020-Feb-06, Justin Pryzby wrote: > > > > > > > I wondered if it wouldn't be better if CLUSTER ON was stored in > > > > pg_class as the > > > > Oid of a clustered index, rather than a boolean in pg_index. > > > > > > Maybe. Do you want to try a patch? > > > > I think the attached is 80% complete (I didn't touch pg_dump). > > > > One objection to this change would be that all relations (including indices) > > end up with relclustered fields, and pg_index already has a number of > > bools, so > > it's not like this one bool is wasting a byte. > > > > I think relisclustered was a's clever way of avoiding that overhead > > (c0ad5953). > > So I would be -0.5 on moving it to pg_class.. In case there's any confusion: "a's" was probably me halfway changing "someone's" to "a". > Are you still for fixing ALTER TABLE losing relisclustered with the > patch we were working on earlier [1], if not for moving relisclustered > to pg_class anymore? Thanks for remembering this one. I think your patch is the correct fix. I forgot to say it, but moving relisclustered to pg_class doesn't help to avoid losting indisclustered: it still needs a fix just like this. Anyway, I withdrew my suggestion for moving to pg_class, since it has more overhead, even for pg_class rows for relations which can't have indexes. > I have read elsewhere [2] that forcing ALTER TABLE to rewrite in > clustered order might not be a good option, but maybe that one is a > more radical proposal than this. Right; your fix seems uncontroversial. I ran into this (indisclustered) bug while starting to write that patch for "ALTER rewrite in clustered order". -- Justin
Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)
Hi Justin, On Fri, Feb 7, 2020 at 11:39 PM Justin Pryzby wrote: > On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote: > > On 2020-Feb-06, Justin Pryzby wrote: > > > > > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class > > > as the > > > Oid of a clustered index, rather than a boolean in pg_index. > > > > Maybe. Do you want to try a patch? > > I think the attached is 80% complete (I didn't touch pg_dump). > > One objection to this change would be that all relations (including indices) > end up with relclustered fields, and pg_index already has a number of bools, > so > it's not like this one bool is wasting a byte. > > I think relisclustered was a's clever way of avoiding that overhead > (c0ad5953). > So I would be -0.5 on moving it to pg_class.. Are you still for fixing ALTER TABLE losing relisclustered with the patch we were working on earlier [1], if not for moving relisclustered to pg_class anymore? I have read elsewhere [2] that forcing ALTER TABLE to rewrite in clustered order might not be a good option, but maybe that one is a more radical proposal than this. Thanks, Amit [1] https://postgr.es/m/CA%2BHiwqEt1HnXYckCdaO8%2BpOoFs7NNS5byoZ6Xg2B7epKbhS85w%40mail.gmail.com [2] https://postgr.es/m/10984.1581181029%40sss.pgh.pa.us
Re: ALTER tbl rewrite loses CLUSTER ON index (consider moving indisclustered to pg_class)
On Thu, Feb 06, 2020 at 02:24:47PM -0300, Alvaro Herrera wrote: > On 2020-Feb-06, Justin Pryzby wrote: > > > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as > > the > > Oid of a clustered index, rather than a boolean in pg_index. > > Maybe. Do you want to try a patch? I think the attached is 80% complete (I didn't touch pg_dump). One objection to this change would be that all relations (including indices) end up with relclustered fields, and pg_index already has a number of bools, so it's not like this one bool is wasting a byte. I think relisclustered was a's clever way of avoiding that overhead (c0ad5953). So I would be -0.5 on moving it to pg_class.. But I think 0001 and 0002 are worthy. Maybe the test in 0002 should live somewhere else. >From 7eea0a17e495fe13379ffd589b551f2f145f5672 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 6 Feb 2020 21:48:13 -0600 Subject: [PATCH v1 1/3] Update comment obsolete since b9b8831a --- src/backend/commands/cluster.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index e9d7a7f..3adcbeb 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -1539,9 +1539,9 @@ get_tables_to_cluster(MemoryContext cluster_context) /* * Get all indexes that have indisclustered set and are owned by - * appropriate user. System relations or nailed-in relations cannot ever - * have indisclustered set, because CLUSTER will refuse to set it when - * called with one of them as argument. + * appropriate user. Shared relations cannot ever have indisclustered + * set, because CLUSTER will refuse to set it when called with one as + * an argument. */ indRelation = table_open(IndexRelationId, AccessShareLock); ScanKeyInit(, -- 2.7.4 >From 4777be522a7aa8b8c77b13f765cbd02043438f2a Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Fri, 7 Feb 2020 08:12:50 -0600 Subject: [PATCH v1 2/3] Give developer a helpful kick in the pants if they change natts in one place but not another --- src/backend/bootstrap/bootstrap.c | 23 +++ 1 file changed, 23 insertions(+) diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c index bfc629c..d5e1888 100644 --- a/src/backend/bootstrap/bootstrap.c +++ b/src/backend/bootstrap/bootstrap.c @@ -25,7 +25,9 @@ #include "access/xlog_internal.h" #include "bootstrap/bootstrap.h" #include "catalog/index.h" +#include "catalog/pg_class.h" #include "catalog/pg_collation.h" +#include "catalog/pg_proc.h" #include "catalog/pg_type.h" #include "common/link-canary.h" #include "libpq/pqsignal.h" @@ -49,6 +51,7 @@ #include "utils/ps_status.h" #include "utils/rel.h" #include "utils/relmapper.h" +#include "utils/syscache.h" uint32 bootstrap_data_checksum_version = 0; /* No checksum */ @@ -602,6 +605,26 @@ boot_openrel(char *relname) TableScanDesc scan; HeapTuple tup; + /* Check that pg_class data is consistent now, rather than failing obscurely later */ + struct { Oid oid; int natts; } + checknatts[] = { + {RelationRelationId, Natts_pg_class,}, + {TypeRelationId, Natts_pg_type,}, + {AttributeRelationId, Natts_pg_attribute,}, + {ProcedureRelationId, Natts_pg_proc,}, + }; + + for (int i=0; irelnatts); + ReleaseSysCache(tuple); + } + if (strlen(relname) >= NAMEDATALEN) relname[NAMEDATALEN - 1] = '\0'; -- 2.7.4 >From ed886f8202486dea8069b719d35a5d0db7f3277c Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Thu, 6 Feb 2020 12:56:34 -0600 Subject: [PATCH v1 3/3] Make cluster a property of table in pg_index.. ..rather than of indexes in pg_index. The only issue with this is that it makes pg_class larger, and the new column applies not only to tables, but to indices. --- doc/src/sgml/catalogs.sgml | 14 +-- src/backend/catalog/heap.c | 1 + src/backend/catalog/index.c| 6 -- src/backend/commands/cluster.c | 172 + src/backend/commands/tablecmds.c | 5 +- src/backend/utils/cache/relcache.c | 1 - src/bin/psql/describe.c| 4 +- src/include/catalog/pg_class.dat | 2 +- src/include/catalog/pg_class.h | 3 + src/include/catalog/pg_index.h | 1 - 10 files changed, 93 insertions(+), 116 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index a10b665..8efeaff 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1752,6 +1752,13 @@ SCRAM-SHA-256$iteration count: + relclustered + oid + + The OID of the index last clustered, or zero + + + relpages int4 @@ -3808,13 +3815,6 @@ SCRAM-SHA-256$iteration count: - indisclustered - bool - - If true, the table was last clustered on this index - - - indisvalid bool diff --git
Re: ALTER tbl rewrite loses CLUSTER ON index
On Fri, Feb 7, 2020 at 2:24 AM Alvaro Herrera wrote: > On 2020-Feb-06, Justin Pryzby wrote: > > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as > > the > > Oid of a clustered index, rather than a boolean in pg_index. > > Maybe. Do you want to try a patch? +1 Thanksm Amit
Re: ALTER tbl rewrite loses CLUSTER ON index
On 2020-Feb-06, Justin Pryzby wrote: > I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as > the > Oid of a clustered index, rather than a boolean in pg_index. Maybe. Do you want to try a patch? > That likely would've avoided (or at least exposed) this issue. > And avoids the possibility of having two indices marked as "clustered". > These would be more trivial: > mark_index_clustered > /* We need to find the index that has indisclustered set. */ You need to be careful when dropping the index ... -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ALTER tbl rewrite loses CLUSTER ON index
I wondered if it wouldn't be better if CLUSTER ON was stored in pg_class as the Oid of a clustered index, rather than a boolean in pg_index. That likely would've avoided (or at least exposed) this issue. And avoids the possibility of having two indices marked as "clustered". These would be more trivial: mark_index_clustered /* We need to find the index that has indisclustered set. */
Re: ALTER tbl rewrite loses CLUSTER ON index
On Thu, Feb 6, 2020 at 10:31 AM Amit Langote wrote: > On Thu, Feb 6, 2020 at 8:44 AM Justin Pryzby wrote: > > On Wed, Feb 05, 2020 at 03:53:45PM +0900, Amit Langote wrote: > > > diff --git a/src/test/regress/sql/alter_table.sql > > > b/src/test/regress/sql/alter_table.sql > > > +-- alter type shouldn't lose clustered index > > > > My only suggestion is to update the comment > > +-- alter type rewrite/rebuild should preserve cluster marking on index > > Sure, done. Deja vu. Last two messages weren't sent to the list; updated patch attached. Thanks, Amit diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index f599393473..bab3ddf67c 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, static void RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, Relation rel, List *domname, const char *conname); +static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid); static void TryReuseIndex(Oid oldId, IndexStmt *stmt); static void TryReuseForeignKey(Oid oldId, Constraint *con); static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName, @@ -11832,6 +11833,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, newcmd->def = (Node *) stmt; tab->subcmds[AT_PASS_OLD_INDEX] = lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId); } else if (IsA(stm, AlterTableStmt)) { @@ -11868,6 +11872,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, rel, NIL, indstmt->idxname); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, indoid); } else if (cmd->subtype == AT_AddConstraint) { @@ -11990,6 +11997,38 @@ RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); } +/* + * For a table's index that is to be recreated due to PostAlterType + * processing, preserve its indisclustered property by issuing ALTER TABLE + * CLUSTER ON command on the table that will run after the command to recreate + * the index. + */ +static void +PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid) +{ + HeapTuple indexTuple; + Form_pg_index indexForm; + + Assert(OidIsValid(indoid)); + Assert(pass == AT_PASS_OLD_INDEX); + + indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid)); + if (!HeapTupleIsValid(indexTuple)) + elog(ERROR, "cache lookup failed for index %u", indoid); + indexForm = (Form_pg_index) GETSTRUCT(indexTuple); + + if (indexForm->indisclustered) + { + AlterTableCmd *newcmd = makeNode(AlterTableCmd); + + newcmd->subtype = AT_ClusterOn; + newcmd->name = get_rel_name(indoid); + tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); + } + + ReleaseSysCache(indexTuple); +} + /* * Subroutine for ATPostAlterTypeParse(). Calls out to CheckIndexCompatible() * for the real analysis, then mutates the IndexStmt based on that verdict. diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index e73ce4b6f3..09c00eef05 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -4258,3 +4258,37 @@ alter table at_test_sql_partop attach partition at_test_sql_partop_1 for values drop table at_test_sql_partop; drop operator class at_test_sql_partop using btree; drop function at_test_sql_partop; +-- alter type rewrite/rebuild should preserve cluster marking on index +create table alttype_cluster (a int); +create index alttype_cluster_a on alttype_cluster (a); +alter table alttype_cluster cluster on alttype_cluster_a; +select indisclustered from pg_index where indrelid = 'alttype_cluster'::regclass; + indisclustered + + t +(1 row) + +alter table alttype_cluster alter a type bigint; +select indisclustered from pg_index where
Re: ALTER tbl rewrite loses CLUSTER ON index
On Wed, Feb 05, 2020 at 03:53:45PM +0900, Amit Langote wrote: > Hi Justin, > > On Mon, Feb 3, 2020 at 1:17 AM Justin Pryzby wrote: > > Other options are preserved by ALTER (and CLUSTER ON is and most obviously > > should be preserved by CLUSTER's rewrite), so I think (SET) CLUSTER should > > be > > preserved by ALTER, too. > > Yes. > > create table foo (a int primary key); > cluster foo; > ERROR: there is no previously clustered index for table "foo" > cluster foo using foo_pkey; > alter table foo alter a type bigint; > cluster foo; > ERROR: there is no previously clustered index for table "foo" > > With your patch, this last error doesn't occur. > > Like you, I too suspect that losing indisclustered like this is > unintentional, so should be fixed. Thanks for checking. It doesn't need to be said, but your patch is obviously superior. I ran into this while looking into a suggestion from Alvaro that ALTER should rewrite in order of a clustered index (if any) rather than in pre-existing heap order (more on that another day). So while this looks like a bug, and I can't think how a backpatch would break something, my suggestion is that backpatching a fix is of low value, so it's only worth +0. Thanks Justin
Re: ALTER tbl rewrite loses CLUSTER ON index
Hi Justin, On Mon, Feb 3, 2020 at 1:17 AM Justin Pryzby wrote: > Other options are preserved by ALTER (and CLUSTER ON is and most obviously > should be preserved by CLUSTER's rewrite), so I think (SET) CLUSTER should be > preserved by ALTER, too. Yes. create table foo (a int primary key); cluster foo; ERROR: there is no previously clustered index for table "foo" cluster foo using foo_pkey; alter table foo alter a type bigint; cluster foo; ERROR: there is no previously clustered index for table "foo" With your patch, this last error doesn't occur. Like you, I too suspect that losing indisclustered like this is unintentional, so should be fixed. > As far as I can see, this should be the responsibility of something in the > vicinity of ATPostAlterTypeParse/RememberIndexForRebuilding. > > Attach patch sketches a fix. While your sketch hits pretty close, it could be done a bit differently. For one, I don't like the way it's misusing changedIndexOids and changedIndexDefs. Instead, we can do something similar to what RebuildConstraintComments() does for constraint comments. For example, we can have a PreserveClusterOn() that adds a AT_ClusterOn command into table's AT_PASS_OLD_INDEX pass commands. Attached patch shows what I'm thinking. I also added representative tests. Thanks, Amit diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index f599393473..ea9941278e 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -490,6 +490,7 @@ static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, static void RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, Relation rel, List *domname, const char *conname); +static void PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid); static void TryReuseIndex(Oid oldId, IndexStmt *stmt); static void TryReuseForeignKey(Oid oldId, Constraint *con); static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName, @@ -11832,6 +11833,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, newcmd->def = (Node *) stmt; tab->subcmds[AT_PASS_OLD_INDEX] = lappend(tab->subcmds[AT_PASS_OLD_INDEX], newcmd); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, oldId); } else if (IsA(stm, AlterTableStmt)) { @@ -11868,6 +11872,9 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd, rel, NIL, indstmt->idxname); + + /* Preserve index's indisclustered property, if set. */ + PreserveClusterOn(tab, AT_PASS_OLD_INDEX, indoid); } else if (cmd->subtype == AT_AddConstraint) { @@ -11990,6 +11997,37 @@ RebuildConstraintComment(AlteredTableInfo *tab, int pass, Oid objid, tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); } +/* + * For an index on table that's being recreated due PostAlterType processing, + * preserve its indisclustered property by issuing ALTER TABLE CLUSTER ON on + * table to run after the command to re-create the index. + */ +static void +PreserveClusterOn(AlteredTableInfo *tab, int pass, Oid indoid) +{ + HeapTuple indexTuple; + Form_pg_index indexForm; + + Assert(OidIsValid(indoid)); + Assert(pass == AT_PASS_OLD_INDEX); + + indexTuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indoid)); + if (!HeapTupleIsValid(indexTuple)) + elog(ERROR, "cache lookup failed for index %u", indoid); + indexForm = (Form_pg_index) GETSTRUCT(indexTuple); + + if (indexForm->indisclustered) + { + AlterTableCmd *newcmd = makeNode(AlterTableCmd); + + newcmd->subtype = AT_ClusterOn; + newcmd->name = get_rel_name(indoid); + tab->subcmds[pass] = lappend(tab->subcmds[pass], newcmd); + } + + ReleaseSysCache(indexTuple); +} + /* * Subroutine for ATPostAlterTypeParse(). Calls out to CheckIndexCompatible() * for the real analysis, then mutates the IndexStmt based on that verdict. diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index e73ce4b6f3..010a7aa98b 100644 --- a/src/test/regress/expected/alter_table.out +++