On 2019/03/08 19:22, Amit Langote wrote:
> On 2019/03/07 20:36, Amit Langote wrote:
>> On Thu, Mar 7, 2019 at 11:17 AM Amit Langote
>> <[email protected]> wrote:
>>> The problem start when ALTER TABLE users ALTER COLUMN is executed.
>>> create table users(user_id int, name varchar(64), unique (user_id, name))
>>> partition by list(user_id);
>>>
>>> create table users_000 partition of users for values in(0);
>>> create table users_001 partition of users for values in(1);
>>> select relname, relfilenode from pg_class where relname like 'users%';
>>> relname │ relfilenode
>>> ────────────────────────────┼─────────────
>>> users │ 16441
>>> users_000 │ 16446
>>> users_000_user_id_name_key │ 16449
>>> users_001 │ 16451
>>> users_001_user_id_name_key │ 16454
>>> users_user_id_name_key │ 16444
>>> (6 rows)
>>>
>>> alter table users alter column name type varchar(127);
>>> select relname, relfilenode from pg_class where relname like 'users%';
>>> relname │ relfilenode
>>> ────────────────────────────┼─────────────
>>> users │ 16441
>>> users_000 │ 16446
>>> users_000_user_id_name_key │ 16444 <=== duplicated
>>> users_001 │ 16451
>>> users_001_user_id_name_key │ 16444 <=== duplicated
>>> users_user_id_name_key │ 16444 <=== duplicated
>>> (6 rows)
>>
>> I checked why users_000's and user_0001's indexes end up reusing
>> users_user_id_name_key's relfilenode. At the surface, it's because
>> DefineIndex(<parent's-index-to-be-recreated>) is carrying oldNode =
>> <old-parents-index's-relfilenode> in IndexStmt, which is recursively
>> passed down to DefineIndex(<child-indexes-to-be-recreated>). This
>> DefineIndex() chain is running due to ATPostAlterTypeCleanup() on the
>> parent rel. This surface problem may be solved in DefineIndex() by
>> just resetting oldNode in each child IndexStmt before recursing, but
>> that means child indexes are recreated with new relfilenodes. That
>> solves the immediate problem of relfilenodes being wrongly duplicated,
>> that's leading to madness such as SMgrRelationHash corruption being
>> seen in the original bug report.
>
> This doesn't happen in HEAD, because in HEAD we got 807ae415c5, which
> changed things so that partitioned relations always have their relfilenode
> set to 0. So, there's no question of parent's relfilenode being passed to
> children and hence being duplicated.
>
> But that also means child indexes are unnecessarily rewritten, that is,
> with new relfilenodes.
>
>> But, the root problem seems to be that ATPostAlterTypeCleanup() on
>> child tables isn't setting up their own
>> DefineIndex(<child-index-to-be-rewritten>) step. That's because the
>> parent's ATPostAlterTypeCleanup() dropped child copies of the UNIQUE
>> constraint due to dependencies (+ CCI). So, ATExecAlterColumnType()
>> on child relations isn't able to find the constraint on the individual
>> child relations to turn into their own
>> DefineIndex(<child-index-to-be-rewritten>). If we manage to handle
>> each relation's ATPostAlterTypeCleanup() independently, child's
>> recreated indexes will be able to reuse their old relfilenodes and
>> everything will be fine. But maybe that will require significant
>> overhaul of how this post-alter-type-cleanup occurs?
>
> We could try to solve this dividing ATPostAlterTypeCleanup processing into
> two functions:
>
> 1 The first one runs right after ATExecAlterColumnType() is finished for a
> given table (like it does today), which then runs ATPostAlterTypeParse to
> generate commands for constraints and/or indexes to re-add. This part
> won't drop the old constraints/indexes just yet, so child
> constraints/indexes will remain for ATExecAlterColumnType to see when
> executed for the children.
>
> 2. Dropping the old constraints/indexes is the responsibility of the 2nd
> part, which runs just before executing ATExecReAddIndex or
> ATExecAddConstraint (with is_readd = true), so that the new constraints
> don't collide with the existing ones.
>
> This arrangement allows step 1 to generate the commands to recreate even
> the child indexes such that the old relfilenode can be be preserved by
> setting IndexStmt.oldNode.
>
> Attached patch is a very rough sketch, which fails some regression tests,
> but I ran out of time today.
I'm thinking of adding this to open items under Older Bugs. Attached the
patch that I had posted on -bugs, but it's only a rough sketch as
described above, not a full fix.
Link to the original bug report:
https://www.postgresql.org/message-id/flat/15672-b9fa7db32698269f%40postgresql.org
Thanks,
Amit
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 5dcedc337a..ca514919d1 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -872,7 +872,7 @@ DefineIndex(Oid relationId,
CreateComments(indexRelationId, RelationRelationId, 0,
stmt->idxcomment);
- if (partitioned)
+ if (partitioned && !is_alter_table)
{
/*
* Unless caller specified to skip this step (via ONLY),
process each
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 93f13a4778..fb10d640e5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -426,7 +426,9 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo
*tab, Relation rel,
AlterTableCmd *cmd, LOCKMODE
lockmode);
static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char
*colName,
List *options,
LOCKMODE lockmode);
-static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
+static void ATPostAlterTypeCleanup1(List **wqueue, AlteredTableInfo *tab,
+ LOCKMODE lockmode);
+static void ATPostAlterTypeCleanup2(List **wqueue, AlteredTableInfo *tab,
LOCKMODE lockmode);
static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
char *cmd, List **wqueue, LOCKMODE
lockmode,
@@ -4068,6 +4070,10 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
if (subcmds == NIL)
continue;
+ /* Drop the old indexes and constraints before
creating*/
+ if (pass == AT_PASS_OLD_INDEX || pass ==
AT_PASS_OLD_CONSTR)
+ ATPostAlterTypeCleanup2(wqueue, tab, lockmode);
+
/*
* Appropriate lock was obtained by phase 1, needn't
get it again
*/
@@ -4084,7 +4090,7 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
* multiple columns of a table are altered).
*/
if (pass == AT_PASS_ALTER_TYPE)
- ATPostAlterTypeCleanup(wqueue, tab, lockmode);
+ ATPostAlterTypeCleanup1(wqueue, tab, lockmode);
relation_close(rel, NoLock);
}
@@ -7233,6 +7239,9 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo
*tab, Relation rel,
if (constr->is_no_inherit)
return address;
+ if (is_readd)
+ return address;
+
/*
* Propagate to children as appropriate. Unlike most other ALTER
* routines, we have to do this one level of recursion at a time; we
can't
@@ -10307,21 +10316,12 @@ ATExecAlterColumnGenericOptions(Relation rel,
* and constraints that depend on the altered columns.
*/
static void
-ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
+ATPostAlterTypeCleanup1(List **wqueue, AlteredTableInfo *tab, LOCKMODE
lockmode)
{
- ObjectAddress obj;
- ObjectAddresses *objects;
ListCell *def_item;
ListCell *oid_item;
/*
- * Collect all the constraints and indexes to drop so we can process
them
- * in a single call. That way we don't have to worry about dependencies
- * among them.
- */
- objects = new_object_addresses();
-
- /*
* Re-parse the index and constraint definitions, and attach them to the
* appropriate work queue entries. We do this before dropping because
in
* the case of a FOREIGN KEY constraint, we might not yet have exclusive
@@ -10363,16 +10363,13 @@ ATPostAlterTypeCleanup(List **wqueue,
AlteredTableInfo *tab, LOCKMODE lockmode)
conislocal = con->conislocal;
ReleaseSysCache(tup);
- ObjectAddressSet(obj, ConstraintRelationId, oldId);
- add_exact_object_address(&obj, objects);
-
/*
* If the constraint is inherited (only), we don't want to
inject a
* new definition here; it'll get recreated when
ATAddCheckConstraint
* recurses from adding the parent table's constraint. But we
had to
* carry the info this far so that we can drop the constraint
below.
*/
- if (!conislocal)
+ if (!conislocal && contype == CONSTR_CHECK)
continue;
/*
@@ -10398,6 +10395,46 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo
*tab, LOCKMODE lockmode)
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *)
lfirst(def_item),
wqueue, lockmode,
tab->rewrite);
+ }
+}
+
+/*
+ * Cleanup after we've finished all the ALTER TYPE operations for a
+ * particular relation. We have to drop and recreate all the indexes
+ * and constraints that depend on the altered columns.
+ */
+static void
+ATPostAlterTypeCleanup2(List **wqueue, AlteredTableInfo *tab, LOCKMODE
lockmode)
+{
+ ObjectAddress obj;
+ ObjectAddresses *objects;
+ ListCell *def_item;
+ ListCell *oid_item;
+
+ /*
+ * Collect all the constraints and indexes to drop so we can process
them
+ * in a single call. That way we don't have to worry about dependencies
+ * among them.
+ */
+ objects = new_object_addresses();
+ forboth(oid_item, tab->changedConstraintOids,
+ def_item, tab->changedConstraintDefs)
+ {
+ Oid oldId = lfirst_oid(oid_item);
+
+ if (!SearchSysCacheExists1(CONSTROID, oldId))
+ continue;
+
+ ObjectAddressSet(obj, ConstraintRelationId, oldId);
+ add_exact_object_address(&obj, objects);
+ }
+ forboth(oid_item, tab->changedIndexOids,
+ def_item, tab->changedIndexDefs)
+ {
+ Oid oldId = lfirst_oid(oid_item);
+
+ if (!SearchSysCacheExists1(RELOID, oldId))
+ continue;
ObjectAddressSet(obj, RelationRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -10411,6 +10448,7 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo
*tab, LOCKMODE lockmode)
free_object_addresses(objects);
+ CommandCounterIncrement();
/*
* The objects will get recreated during subsequent passes over the work
* queue.