On 2019/03/08 19:22, Amit Langote wrote:
> On 2019/03/07 20:36, Amit Langote wrote:
>> On Thu, Mar 7, 2019 at 11:17 AM Amit Langote
>> <langote_amit...@lab.ntt.co.jp> wrote:
>>> The problem start when ALTER TABLE users ALTER COLUMN is executed.
>>> create table users(user_id int, name varchar(64), unique (user_id, name))
>>> partition by list(user_id);
>>>
>>> create table users_000 partition of users for values in(0);
>>> create table users_001 partition of users for values in(1);
>>> select relname, relfilenode from pg_class where relname like 'users%';
>>>           relname           │ relfilenode
>>> ────────────────────────────┼─────────────
>>>  users                      │       16441
>>>  users_000                  │       16446
>>>  users_000_user_id_name_key │       16449
>>>  users_001                  │       16451
>>>  users_001_user_id_name_key │       16454
>>>  users_user_id_name_key     │       16444
>>> (6 rows)
>>>
>>> alter table users alter column name type varchar(127);
>>> select relname, relfilenode from pg_class where relname like 'users%';
>>>           relname           │ relfilenode
>>> ────────────────────────────┼─────────────
>>>  users                      │       16441
>>>  users_000                  │       16446
>>>  users_000_user_id_name_key │       16444    <===  duplicated
>>>  users_001                  │       16451
>>>  users_001_user_id_name_key │       16444    <===  duplicated
>>>  users_user_id_name_key     │       16444    <===  duplicated
>>> (6 rows)
>>
>> I checked why users_000's and user_0001's indexes end up reusing
>> users_user_id_name_key's relfilenode.  At the surface, it's because
>> DefineIndex(<parent's-index-to-be-recreated>) is carrying oldNode =
>> <old-parents-index's-relfilenode> in IndexStmt, which is recursively
>> passed down to DefineIndex(<child-indexes-to-be-recreated>).  This
>> DefineIndex() chain is running due to ATPostAlterTypeCleanup() on the
>> parent rel.  This surface problem may be solved in DefineIndex() by
>> just resetting oldNode in each child IndexStmt before recursing, but
>> that means child indexes are recreated with new relfilenodes.  That
>> solves the immediate problem of relfilenodes being wrongly duplicated,
>> that's leading to madness such as SMgrRelationHash corruption being
>> seen in the original bug report.
> 
> This doesn't happen in HEAD, because in HEAD we got 807ae415c5, which
> changed things so that partitioned relations always have their relfilenode
> set to 0.  So, there's no question of parent's relfilenode being passed to
> children and hence being duplicated.
> 
> But that also means child indexes are unnecessarily rewritten, that is,
> with new relfilenodes.
> 
>> But, the root problem seems to be that ATPostAlterTypeCleanup() on
>> child tables isn't setting up their own
>> DefineIndex(<child-index-to-be-rewritten>) step.  That's because the
>> parent's ATPostAlterTypeCleanup() dropped child copies of the UNIQUE
>> constraint due to dependencies (+ CCI).  So, ATExecAlterColumnType()
>> on child relations isn't able to find the constraint on the individual
>> child relations to turn into their own
>> DefineIndex(<child-index-to-be-rewritten>). If we manage to handle
>> each relation's ATPostAlterTypeCleanup() independently, child's
>> recreated indexes will be able to reuse their old relfilenodes and
>> everything will be fine.  But maybe that will require significant
>> overhaul of how this post-alter-type-cleanup occurs?
> 
> We could try to solve this dividing ATPostAlterTypeCleanup processing into
> two functions:
> 
> 1 The first one runs right after ATExecAlterColumnType() is finished for a
> given table (like it does today), which then runs ATPostAlterTypeParse to
> generate commands for constraints and/or indexes to re-add.  This part
> won't drop the old constraints/indexes just yet, so child
> constraints/indexes will remain for ATExecAlterColumnType to see when
> executed for the children.
> 
> 2. Dropping the old constraints/indexes is the responsibility of the 2nd
> part, which runs just before executing ATExecReAddIndex or
> ATExecAddConstraint (with is_readd = true), so that the new constraints
> don't collide with the existing ones.
> 
> This arrangement allows step 1 to generate the commands to recreate even
> the child indexes such that the old relfilenode can be be preserved by
> setting IndexStmt.oldNode.
> 
> Attached patch is a very rough sketch, which fails some regression tests,
> but I ran out of time today.

I'm thinking of adding this to open items under Older Bugs.  Attached the
patch that I had posted on -bugs, but it's only a rough sketch as
described above, not a full fix.

Link to the original bug report:
https://www.postgresql.org/message-id/flat/15672-b9fa7db32698269f%40postgresql.org

Thanks,
Amit
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 5dcedc337a..ca514919d1 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -872,7 +872,7 @@ DefineIndex(Oid relationId,
                CreateComments(indexRelationId, RelationRelationId, 0,
                                           stmt->idxcomment);
 
-       if (partitioned)
+       if (partitioned && !is_alter_table)
        {
                /*
                 * Unless caller specified to skip this step (via ONLY), 
process each
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 93f13a4778..fb10d640e5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -426,7 +426,9 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo 
*tab, Relation rel,
                                          AlterTableCmd *cmd, LOCKMODE 
lockmode);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char 
*colName,
                                                                List *options, 
LOCKMODE lockmode);
-static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
+static void ATPostAlterTypeCleanup1(List **wqueue, AlteredTableInfo *tab,
+                                          LOCKMODE lockmode);
+static void ATPostAlterTypeCleanup2(List **wqueue, AlteredTableInfo *tab,
                                           LOCKMODE lockmode);
 static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
                                         char *cmd, List **wqueue, LOCKMODE 
lockmode,
@@ -4068,6 +4070,10 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
                        if (subcmds == NIL)
                                continue;
 
+                       /* Drop the old indexes and constraints before 
creating*/
+                       if (pass == AT_PASS_OLD_INDEX || pass == 
AT_PASS_OLD_CONSTR)
+                               ATPostAlterTypeCleanup2(wqueue, tab, lockmode);
+
                        /*
                         * Appropriate lock was obtained by phase 1, needn't 
get it again
                         */
@@ -4084,7 +4090,7 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
                         * multiple columns of a table are altered).
                         */
                        if (pass == AT_PASS_ALTER_TYPE)
-                               ATPostAlterTypeCleanup(wqueue, tab, lockmode);
+                               ATPostAlterTypeCleanup1(wqueue, tab, lockmode);
 
                        relation_close(rel, NoLock);
                }
@@ -7233,6 +7239,9 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo 
*tab, Relation rel,
        if (constr->is_no_inherit)
                return address;
 
+       if (is_readd)
+               return address;
+
        /*
         * Propagate to children as appropriate.  Unlike most other ALTER
         * routines, we have to do this one level of recursion at a time; we 
can't
@@ -10307,21 +10316,12 @@ ATExecAlterColumnGenericOptions(Relation rel,
  * and constraints that depend on the altered columns.
  */
 static void
-ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
+ATPostAlterTypeCleanup1(List **wqueue, AlteredTableInfo *tab, LOCKMODE 
lockmode)
 {
-       ObjectAddress obj;
-       ObjectAddresses *objects;
        ListCell   *def_item;
        ListCell   *oid_item;
 
        /*
-        * Collect all the constraints and indexes to drop so we can process 
them
-        * in a single call.  That way we don't have to worry about dependencies
-        * among them.
-        */
-       objects = new_object_addresses();
-
-       /*
         * Re-parse the index and constraint definitions, and attach them to the
         * appropriate work queue entries.  We do this before dropping because 
in
         * the case of a FOREIGN KEY constraint, we might not yet have exclusive
@@ -10363,16 +10363,13 @@ ATPostAlterTypeCleanup(List **wqueue, 
AlteredTableInfo *tab, LOCKMODE lockmode)
                conislocal = con->conislocal;
                ReleaseSysCache(tup);
 
-               ObjectAddressSet(obj, ConstraintRelationId, oldId);
-               add_exact_object_address(&obj, objects);
-
                /*
                 * If the constraint is inherited (only), we don't want to 
inject a
                 * new definition here; it'll get recreated when 
ATAddCheckConstraint
                 * recurses from adding the parent table's constraint.  But we 
had to
                 * carry the info this far so that we can drop the constraint 
below.
                 */
-               if (!conislocal)
+               if (!conislocal && contype == CONSTR_CHECK)
                        continue;
 
                /*
@@ -10398,6 +10395,46 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo 
*tab, LOCKMODE lockmode)
                ATPostAlterTypeParse(oldId, relid, InvalidOid,
                                                         (char *) 
lfirst(def_item),
                                                         wqueue, lockmode, 
tab->rewrite);
+       }
+}
+
+/*
+ * Cleanup after we've finished all the ALTER TYPE operations for a
+ * particular relation.  We have to drop and recreate all the indexes
+ * and constraints that depend on the altered columns.
+ */
+static void
+ATPostAlterTypeCleanup2(List **wqueue, AlteredTableInfo *tab, LOCKMODE 
lockmode)
+{
+       ObjectAddress obj;
+       ObjectAddresses *objects;
+       ListCell   *def_item;
+       ListCell   *oid_item;
+
+       /*
+        * Collect all the constraints and indexes to drop so we can process 
them
+        * in a single call.  That way we don't have to worry about dependencies
+        * among them.
+        */
+       objects = new_object_addresses();
+       forboth(oid_item, tab->changedConstraintOids,
+                       def_item, tab->changedConstraintDefs)
+       {
+               Oid                     oldId = lfirst_oid(oid_item);
+
+               if (!SearchSysCacheExists1(CONSTROID, oldId))
+                       continue;
+
+               ObjectAddressSet(obj, ConstraintRelationId, oldId);
+               add_exact_object_address(&obj, objects);
+       }
+       forboth(oid_item, tab->changedIndexOids,
+                       def_item, tab->changedIndexDefs)
+       {
+               Oid                     oldId = lfirst_oid(oid_item);
+
+               if (!SearchSysCacheExists1(RELOID, oldId))
+                       continue;
 
                ObjectAddressSet(obj, RelationRelationId, oldId);
                add_exact_object_address(&obj, objects);
@@ -10411,6 +10448,7 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo 
*tab, LOCKMODE lockmode)
 
        free_object_addresses(objects);
 
+       CommandCounterIncrement();
        /*
         * The objects will get recreated during subsequent passes over the work
         * queue.

Reply via email to