Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
On Thu, Apr 12, 2018 at 3:59 PM, Alvaro Herrera wrote: > Robert Haas wrote: >> Please revert the part of this commit that changed the lock level. > > Done. Thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Robert Haas wrote: > Please revert the part of this commit that changed the lock level. Done. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Robert Haas wrote: > I don't think it was a good idea to change this without a lot more > discussion, as part of another commit that really was about something > else, and after feature freeze. > Please revert the part of this commit that changed the lock level. You're right, that was too hasty. Will revert. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
On Mon, Apr 2, 2018 at 3:11 PM, Alvaro Herrera wrote: > Why do we need AccessExclusiveLock on all children of a relation that we > want to scan to search for rows not satisfying the constraint? I think > it should be enough to get ShareLock, which prevents INSERT, no? I have > a feeling I'm missing something here, but I don't know what, and all > tests pass with that change. I don't think it was a good idea to change this without a lot more discussion, as part of another commit that really was about something else, and after feature freeze. As Kyotaro Horiguchi also mentioned, this introduces a deadlock hazard. With current master: Setup: create table foo (a int, b text) partition by range (a); create table foo1 partition of foo for values from (1) to (100); create table food (a int, b text) partition by range (a); create table food1 partition of food for values from (1) to (100); Session 1: begin; BEGIN rhaas=# select * from food1; a | b ---+--- (0 rows) rhaas=# insert into food1 values (1, 'thunk'); Session 2: rhaas=# alter table foo attach partition food default; At which point session 1 deadlocks, because the lock has to be upgraded to AccessExclusiveLock since we're changing the constraints. Now you might think about relaxing the lock level for the later acquisition, too, but I'm not sure that's safe. The issue is that scanning a relation for rows that don't match the new constraint isn't by itself sufficient: you also have to be sure that nobody can add one later. If they don't have the relation open, they'll certainly rebuild their relcache entry when they open it, so it'll be fine. But if they already have the relation open, I'm not sure we can be certain it will get rebuilt if, later in the same transaction, they try to insert data. This whole area needs more research -- there may very well be good opportunities to reduce lock levels in this area, but it needs careful study and analysis. Please revert the part of this commit that changed the lock level. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Thanks for the discussion. Per your suggestions, I changed the check for default partition OID to an assert instead of the 'if' condition, and removed the code that attempted vainly to verify the constraint when attaching a foreign table as a partition. And pushed. I think we're done here, so marked the Open Item as fixed. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Added to the open items list. https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Open_Issues On Tue, Apr 3, 2018 at 2:52 PM, Amit Langote wrote: > On 2018/04/03 14:45, Kyotaro HORIGUCHI wrote: > > Hello. > > > > At Mon, 2 Apr 2018 16:11:12 -0300, Alvaro Herrera < > alvhe...@alvh.no-ip.org> wrote: > >> Why do we need AccessExclusiveLock on all children of a relation that we > >> want to scan to search for rows not satisfying the constraint? I think > >> it should be enough to get ShareLock, which prevents INSERT, no? I have > >> a feeling I'm missing something here, but I don't know what, and all > >> tests pass with that change. > > Thinking on this a bit, I see no problem with locking the children with > just ShareLock. It was just a paranoia that if we're going to lock the > table itself being attached with AEL, we must its children (if any) with > AEL, too. > > > Mmm. I'm not sure if there's a lock-upgrade case but the > > following sentense is left at the last of one of the modified > > comments. ATREwriteTables is called with AEL after that (that has > > finally no effect in this case). > > > > | But we cannot risk a deadlock by > taking > > | * a weaker lock now and the stronger one only when needed. > > > > I don't actually find places where the children's lock can be > > raised but this seems just following the lock parent first > > principle. > > No lock upgrade happen as of now. The comment was added by the commit > 972b6ec20bf [1], which removed the code that could cause such a deadlock. > The comment fragment is simply trying to explain why we don't postpone the > locking of children to a later time, say, to the point where we actually > know that they need to be scanned. Previously the code next to the > comment used to lock the children using AccessShareLock, because at that > point we just needed to check if the table being attached is one of those > children and then later locked with AEL if it turned out that they need to > be scanned to check the partition constraint. > > > By the way check_default_allows_bound() (CREATE TABLE case) > > contains a similar usage of find_all_inheritors(default_rel, > > AEL). > > Good catch. Its job is more or less same as > ValidatePartitionConstraints(), except all the children (if any) are > scanned right away instead of queuing it like in the AT case. > > >> Also: the change proposed to remove the get_default_oid_from_partdesc() > >> call actually fixes the bug, but to me it was not at all obvious why. > > > > CommandCounterIncrement updates the content of a relcache entry > > via invalidation. It can be surprising for callers of a function > > like StorePartitionBound. > > > > CommandCounterIncrement > > > >RelationCacheInvalidateEntry > > RelationFlushRelation > >RelationClearRelation > > Because of the CCI() after storing the default partition OID into the > system catalog, RelationClearRelation() would changes what > rel->rd_partdesc points to where 'rel' is the ATExecAttachPartition()'s > reference to the relcache entry of the parent that it passed to > StorePartitionBound. > > So, whereas the rel->rd_partdesc wouldn't contain the default partition > before StorePartitionBound() was called, it would after. > > >> To figure out why, I first had to realize that > >> ValidatePartitionConstraints was lying to me, both in name and in > >> comments: it doesn't actually validate anything, it merely queues > >> entries so that alter table's phase 3 would do the validation. I found > >> this extremely confusing, so I fixed the comments to match reality, and > >> later decided to rename the function also. > > > > It is reasonable. Removing exccessive extension of lower-level > > partitions is also reasonable. The previous code extended > > inheritances in different manner for levels at odd and even > > depth. > > I like the new code including the naming, but I notice this changes the > order in which we do the locking now. There are still sites in the code > where the locking order is breadth-first, that is, as determined by > find_all_inheritors(), as this function would too previously. > > Also note that beside the breadth-first -> depth-first change, this also > changes the locking order of partitions for a given partitioned table. > The OIDs in partdesc->oids[] are canonically ordered (that is order of > their partition bounds), whereas find_inheritance_children() that's called > by find_all_inheritors() would lock them in the order in which the > individual OIDs were found in the system catalog. > > Not sure if there is anything to be alarmed of here, but in all previous > discussions, this has been a thing to avoid. > > >> At that point I was able to understand what the problem was: when > >> attaching the default partition, we were setting the constraint to be > >> validated for that partition to the correctly computed partition > >> constraint; and later in the same function we wo
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
On 2018/04/03 14:45, Kyotaro HORIGUCHI wrote: > Hello. > > At Mon, 2 Apr 2018 16:11:12 -0300, Alvaro Herrera > wrote: >> Why do we need AccessExclusiveLock on all children of a relation that we >> want to scan to search for rows not satisfying the constraint? I think >> it should be enough to get ShareLock, which prevents INSERT, no? I have >> a feeling I'm missing something here, but I don't know what, and all >> tests pass with that change. Thinking on this a bit, I see no problem with locking the children with just ShareLock. It was just a paranoia that if we're going to lock the table itself being attached with AEL, we must its children (if any) with AEL, too. > Mmm. I'm not sure if there's a lock-upgrade case but the > following sentense is left at the last of one of the modified > comments. ATREwriteTables is called with AEL after that (that has > finally no effect in this case). > > | But we cannot risk a deadlock by taking > | * a weaker lock now and the stronger one only when needed. > > I don't actually find places where the children's lock can be > raised but this seems just following the lock parent first > principle. No lock upgrade happen as of now. The comment was added by the commit 972b6ec20bf [1], which removed the code that could cause such a deadlock. The comment fragment is simply trying to explain why we don't postpone the locking of children to a later time, say, to the point where we actually know that they need to be scanned. Previously the code next to the comment used to lock the children using AccessShareLock, because at that point we just needed to check if the table being attached is one of those children and then later locked with AEL if it turned out that they need to be scanned to check the partition constraint. > By the way check_default_allows_bound() (CREATE TABLE case) > contains a similar usage of find_all_inheritors(default_rel, > AEL). Good catch. Its job is more or less same as ValidatePartitionConstraints(), except all the children (if any) are scanned right away instead of queuing it like in the AT case. >> Also: the change proposed to remove the get_default_oid_from_partdesc() >> call actually fixes the bug, but to me it was not at all obvious why. > > CommandCounterIncrement updates the content of a relcache entry > via invalidation. It can be surprising for callers of a function > like StorePartitionBound. > > CommandCounterIncrement > >RelationCacheInvalidateEntry > RelationFlushRelation >RelationClearRelation Because of the CCI() after storing the default partition OID into the system catalog, RelationClearRelation() would changes what rel->rd_partdesc points to where 'rel' is the ATExecAttachPartition()'s reference to the relcache entry of the parent that it passed to StorePartitionBound. So, whereas the rel->rd_partdesc wouldn't contain the default partition before StorePartitionBound() was called, it would after. >> To figure out why, I first had to realize that >> ValidatePartitionConstraints was lying to me, both in name and in >> comments: it doesn't actually validate anything, it merely queues >> entries so that alter table's phase 3 would do the validation. I found >> this extremely confusing, so I fixed the comments to match reality, and >> later decided to rename the function also. > > It is reasonable. Removing exccessive extension of lower-level > partitions is also reasonable. The previous code extended > inheritances in different manner for levels at odd and even > depth. I like the new code including the naming, but I notice this changes the order in which we do the locking now. There are still sites in the code where the locking order is breadth-first, that is, as determined by find_all_inheritors(), as this function would too previously. Also note that beside the breadth-first -> depth-first change, this also changes the locking order of partitions for a given partitioned table. The OIDs in partdesc->oids[] are canonically ordered (that is order of their partition bounds), whereas find_inheritance_children() that's called by find_all_inheritors() would lock them in the order in which the individual OIDs were found in the system catalog. Not sure if there is anything to be alarmed of here, but in all previous discussions, this has been a thing to avoid. >> At that point I was able to understand what the problem was: when >> attaching the default partition, we were setting the constraint to be >> validated for that partition to the correctly computed partition >> constraint; and later in the same function we would set the constraint >> to "the immature constraint" because we were now seeing that the default >> partition OID was not invalid. So it was rather a bug in >> ValidatePartitionConstraints, in that it was accepting to set the >> expression to validate when another expression had already been set! I >> added an assert to protect against this. And then ch
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Hello. At Mon, 2 Apr 2018 16:11:12 -0300, Alvaro Herrera wrote in <20180402191112.wneiyj4v5upnfjst@alvherre.pgsql> > Why do we need AccessExclusiveLock on all children of a relation that we > want to scan to search for rows not satisfying the constraint? I think > it should be enough to get ShareLock, which prevents INSERT, no? I have > a feeling I'm missing something here, but I don't know what, and all > tests pass with that change. Mmm. I'm not sure if there's a lock-upgrade case but the following sentense is left at the last of one of the modified comments. ATREwriteTables is called with AEL after that (that has finally no effect in this case). | But we cannot risk a deadlock by taking | * a weaker lock now and the stronger one only when needed. I don't actually find places where the children's lock can be raised but this seems just following the lock parent first principle. By the way check_default_allows_bound() (CREATE TABLE case) contains a similar usage of find_all_inheritors(default_rel, AEL). > Also: the change proposed to remove the get_default_oid_from_partdesc() > call actually fixes the bug, but to me it was not at all obvious why. CommandCounterIncrement updates the content of a relcache entry via invalidation. It can be surprising for callers of a function like StorePartitionBound. CommandCounterIncrement RelationCacheInvalidateEntry RelationFlushRelation RelationClearRelation > To figure out why, I first had to realize that > ValidatePartitionConstraints was lying to me, both in name and in > comments: it doesn't actually validate anything, it merely queues > entries so that alter table's phase 3 would do the validation. I found > this extremely confusing, so I fixed the comments to match reality, and > later decided to rename the function also. It is reasonable. Removing exccessive extension of lower-level partitions is also reasonable. The previous code extended inheritances in different manner for levels at odd and even depth. > At that point I was able to understand what the problem was: when > attaching the default partition, we were setting the constraint to be > validated for that partition to the correctly computed partition > constraint; and later in the same function we would set the constraint > to "the immature constraint" because we were now seeing that the default > partition OID was not invalid. So it was rather a bug in > ValidatePartitionConstraints, in that it was accepting to set the > expression to validate when another expression had already been set! I > added an assert to protect against this. And then changed the decision > of whether or not to run this block based on the attached partition > being the default one or not; because as the "if" test was, it was just > a recipe for confusion. (Now, if you look carefully you realize that > the new test for bound->is_default I added is kinda redundant: it can > only be set if there was a default partition OID at start of the > function, and therefore we know we're not adding a default partition. > And for the case where we *are* adding the default partition, it is > still Invalid because we don't re-read its own OID. But relying on that > seems brittle because it breaks if we happen to update the default > partition OID in the middle of that function, which is what we were > doing. Hence the new is_default test.) Seems reasonable. But even if we assume so, rereading defaultPartOid is still breaking the assumption that defaultPartOid is that at the time of entering to this function and the added condition just conceals the fact. So I think it should be an assertion. | if (OidIsValid(defaultPartOid)) | { | /* | * The command cannot be adding default partition if the | * defaultPartOid is valid. | */ | Assert(!cmd->bound->is_default); > I looked at the implementation of ValidatePartitionConstraints and > didn't like it, so I rewrote it also. > > This comment is mistaken, too: > - /* > -* Skip if the partition is itself a partitioned table. We can only > -* ever scan RELKIND_RELATION relations. > -*/ > ... because it ignores the possibility of a partition being a foreign > table. The code seems to work, but I think there is no test to cover > the case that a foreign table containing data that doesn't satisfy the > constraint is attached, because when I set that case to return doing > nothing (ie. ATGetQueueEntry is not called for a foreign partition), no > test failed. Foreign tables are intentionally not verified on attaching (in my understanding). ATRewriteTables ignores foreign tables so it contradicts to what ATRewriteTables thinks. As my understanding foreign tables are assumed to be unrestrictable by local constraint after attaching so the users are responsible to the consistency. > Generally speaking, I didn't like ATExecAttachPartition; it's doing too > much that should have b
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Why do we need AccessExclusiveLock on all children of a relation that we want to scan to search for rows not satisfying the constraint? I think it should be enough to get ShareLock, which prevents INSERT, no? I have a feeling I'm missing something here, but I don't know what, and all tests pass with that change. Also: the change proposed to remove the get_default_oid_from_partdesc() call actually fixes the bug, but to me it was not at all obvious why. To figure out why, I first had to realize that ValidatePartitionConstraints was lying to me, both in name and in comments: it doesn't actually validate anything, it merely queues entries so that alter table's phase 3 would do the validation. I found this extremely confusing, so I fixed the comments to match reality, and later decided to rename the function also. At that point I was able to understand what the problem was: when attaching the default partition, we were setting the constraint to be validated for that partition to the correctly computed partition constraint; and later in the same function we would set the constraint to "the immature constraint" because we were now seeing that the default partition OID was not invalid. So it was rather a bug in ValidatePartitionConstraints, in that it was accepting to set the expression to validate when another expression had already been set! I added an assert to protect against this. And then changed the decision of whether or not to run this block based on the attached partition being the default one or not; because as the "if" test was, it was just a recipe for confusion. (Now, if you look carefully you realize that the new test for bound->is_default I added is kinda redundant: it can only be set if there was a default partition OID at start of the function, and therefore we know we're not adding a default partition. And for the case where we *are* adding the default partition, it is still Invalid because we don't re-read its own OID. But relying on that seems brittle because it breaks if we happen to update the default partition OID in the middle of that function, which is what we were doing. Hence the new is_default test.) I looked at the implementation of ValidatePartitionConstraints and didn't like it, so I rewrote it also. This comment is mistaken, too: - /* -* Skip if the partition is itself a partitioned table. We can only -* ever scan RELKIND_RELATION relations. -*/ ... because it ignores the possibility of a partition being a foreign table. The code seems to work, but I think there is no test to cover the case that a foreign table containing data that doesn't satisfy the constraint is attached, because when I set that case to return doing nothing (ie. ATGetQueueEntry is not called for a foreign partition), no test failed. Generally speaking, I didn't like ATExecAttachPartition; it's doing too much that should have been done in ATPrepAttachPartition. The only reason that's not broken is because we don't allow ATTACH PARTITION to appear together with other commands in alter table, which seems disappointing ... for example, when attaching multiple tables and a default partition exist, you have to scan the default one multiple times, which is lame. (Speaking of lame: I noticed that RelationGetPartitionQual obtains lock on the parent relation ... I wonder if this can be used to cause a deadlock during InitResultRelationInfo.) BTW, I think this is already broken for the case where the default partition is partitioned and you attach a partition colliding with a row that's being concurrently inserted in a partition of the default partition, though I didn't bother to write a test for that. Anyway, I'm just an onlooker fixing a CommandCounterIncrement change. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index c8da82217d..357b1078f8 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -483,10 +483,9 @@ static void RemoveInheritance(Relation child_rel, Relation parent_rel); static ObjectAddress ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd); static void AttachPartitionEnsureIndexes(Relation rel, Relation attachrel); -static void ValidatePartitionConstraints(List **wqueue, Relation scanrel, -List *scanrel_children, -List *partConstraint, -bool validate_default); +static void QueuePartitionConstraintValidation(List **wqueue, Relation scanrel, + List *partConstraint, + bool validate_default); static vo
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
> /* > - * Check whether default partition has a row that would fit the > partition > - * being attached. > + * Check if the default partition contains a row that would belong in > the > + * partition being attached. >*/ > - defaultPartOid = > - get_default_oid_from_partdesc(RelationGetPartitionDesc(rel)); > if (OidIsValid(defaultPartOid)) Oh my. This code is terrible, and I think this patch is wrong. More later. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Thanks Jeevan for reviewing. On 2018/04/02 13:10, Jeevan Ladhe wrote: > Hi, > > I noticed that there were no tests covering this case causing 4dba331cb3 >> to not notice this failure in the first place. I updated your patch to >> add a few tests. Also, I revised the comment changed by your patch a bit. >> > > 1. A minor typo: > > +-- check that violating rows are correctly reported when attching as the > s/attching/attaching Oops, fixed. > 2. I think following part of the test is already covered: > > +-- trying to add a partition for 2 should fail because the default > +-- partition contains a row that would violate its new constraint which > +-- prevents rows containing 2 > +create table defpart_attach_test2 partition of defpart_attach_test for > values in (2); > +ERROR: updated partition constraint for default partition > "defpart_attach_test_d" would be violated by some row > +drop table defpart_attach_test; > > IIUC, the test in create_table covers the same scenario as of above: > > -- check default partition overlap > INSERT INTO list_parted2 VALUES('X'); > CREATE TABLE fail_part PARTITION OF list_parted2 FOR VALUES IN ('W', 'X', > 'Y'); > ERROR: updated partition constraint for default partition > "list_parted2_def" would be violated by some row Sorry, didn't realize that it was already covered in create_tabel.sql. Removed this one. Attached updated patch. Adding this to the v11 open items list. Thanks, Amit diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index c8da82217d..9d474ad5e2 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -14114,11 +14114,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd) } /* -* Check whether default partition has a row that would fit the partition -* being attached. +* Check if the default partition contains a row that would belong in the +* partition being attached. */ - defaultPartOid = - get_default_oid_from_partdesc(RelationGetPartitionDesc(rel)); if (OidIsValid(defaultPartOid)) { Relationdefaultrel; diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index a80d16a394..4712fab540 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -3891,3 +3891,19 @@ ALTER TABLE attmp ALTER COLUMN i RESET (n_distinct_inherited); ANALYZE attmp; DROP TABLE attmp; DROP USER regress_alter_table_user1; +-- check that violating rows are correctly reported when attaching as the +-- default partition +create table defpart_attach_test (a int) partition by list (a); +create table defpart_attach_test1 partition of defpart_attach_test for values in (1); +create table defpart_attach_test_d (like defpart_attach_test); +insert into defpart_attach_test_d values (1), (2); +-- error because its constraint as the default partition would be violated +-- by the row containing 1 +alter table defpart_attach_test attach partition defpart_attach_test_d default; +ERROR: partition constraint is violated by some row +delete from defpart_attach_test_d where a = 1; +alter table defpart_attach_test_d add check (a > 1); +-- should be attached successfully and without needing to be scanned +alter table defpart_attach_test attach partition defpart_attach_test_d default; +INFO: partition constraint for table "defpart_attach_test_d" is implied by existing constraints +drop table defpart_attach_test; diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql index 8198d1e930..c557b050af 100644 --- a/src/test/regress/sql/alter_table.sql +++ b/src/test/regress/sql/alter_table.sql @@ -2565,3 +2565,21 @@ ANALYZE attmp; DROP TABLE attmp; DROP USER regress_alter_table_user1; + +-- check that violating rows are correctly reported when attaching as the +-- default partition +create table defpart_attach_test (a int) partition by list (a); +create table defpart_attach_test1 partition of defpart_attach_test for values in (1); +create table defpart_attach_test_d (like defpart_attach_test); +insert into defpart_attach_test_d values (1), (2); + +-- error because its constraint as the default partition would be violated +-- by the row containing 1 +alter table defpart_attach_test attach partition defpart_attach_test_d default; +delete from defpart_attach_test_d where a = 1; +alter table defpart_attach_test_d add check (a > 1); + +-- should be attached successfully and without needing to be scanned +alter table defpart_attach_test attach partition defpart_attach_test_d default; + +drop table defpart_attach_test;
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Hi, I noticed that there were no tests covering this case causing 4dba331cb3 > to not notice this failure in the first place. I updated your patch to > add a few tests. Also, I revised the comment changed by your patch a bit. > 1. A minor typo: +-- check that violating rows are correctly reported when attching as the s/attching/attaching 2. I think following part of the test is already covered: +-- trying to add a partition for 2 should fail because the default +-- partition contains a row that would violate its new constraint which +-- prevents rows containing 2 +create table defpart_attach_test2 partition of defpart_attach_test for values in (2); +ERROR: updated partition constraint for default partition "defpart_attach_test_d" would be violated by some row +drop table defpart_attach_test; IIUC, the test in create_table covers the same scenario as of above: -- check default partition overlap INSERT INTO list_parted2 VALUES('X'); CREATE TABLE fail_part PARTITION OF list_parted2 FOR VALUES IN ('W', 'X', 'Y'); ERROR: updated partition constraint for default partition "list_parted2_def" would be violated by some row Regards, Jeevan Ladhe
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
On 2018/03/30 17:31, Kyotaro HORIGUCHI wrote: > At Thu, 29 Mar 2018 13:04:06 -0300, Alvaro Herrera wrote: >> Rushabh Lathia wrote: >>> On Thu, Mar 29, 2018 at 7:46 PM, Alvaro Herrera >>> wrote: >> Hmm, offhand I don't quite see why this error fails to be thrown. >>> >>> ATTACH PARTITION should throw an error, because partition table "foo" >>> already have two partition with key values (1, 2,3 4). And table "foo_d" >>> which we are attaching here has : >> >> Oh, I understand how it's supposed to work. I was just saying I don't >> understand how this bug occurs. Is it because we fail to determine the >> correct partition constraint for the default partition in time for its >> verification scan? > > The reason is that CommandCounterIncrement added in > StorePartitionBound reveals the just added default partition to > get_default_oid_from_partdesc too early. The revealed partition > has immature constraint and it overrites the right constraint > generated just above. > > ATExecAttachPartition checks for default partition oid twice but > the second is just needless before the commit and harms after it. Yes. What happens as of the commit mentioned in $subject is that the partition constraint that's set as tab->partition_constraint during the first call to ValidatePartitionConstraints (which is the correct one) is overwritten by a wrong one during the 2nd call, which wouldn't happen before the commit. In the wrongly occurring 2nd call, we'd end up setting tab->partition_constraint to the negation of the clause expression that would've been set by the first call (in this case). Thus tab->partition_constraint ends up returning true for all the values it contains. I noticed that there were no tests covering this case causing 4dba331cb3 to not notice this failure in the first place. I updated your patch to add a few tests. Also, I revised the comment changed by your patch a bit. Thanks, Amit diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 83a881eff3..8bba9a2e20 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -14114,11 +14114,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd) } /* -* Check whether default partition has a row that would fit the partition -* being attached. +* Check if the default partition contains a row that would belong in the +* partition being attached. */ - defaultPartOid = - get_default_oid_from_partdesc(RelationGetPartitionDesc(rel)); if (OidIsValid(defaultPartOid)) { Relationdefaultrel; diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index a80d16a394..82b195e034 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -3891,3 +3891,24 @@ ALTER TABLE attmp ALTER COLUMN i RESET (n_distinct_inherited); ANALYZE attmp; DROP TABLE attmp; DROP USER regress_alter_table_user1; +-- check that violating rows are correctly reported when attching as the +-- default partition +create table defpart_attach_test (a int) partition by list (a); +create table defpart_attach_test1 partition of defpart_attach_test for values in (1); +create table defpart_attach_test_d (like defpart_attach_test); +insert into defpart_attach_test_d values (1), (2); +-- error because its constraint as the default partition would be violated +-- by the row containing 1 +alter table defpart_attach_test attach partition defpart_attach_test_d default; +ERROR: partition constraint is violated by some row +delete from defpart_attach_test_d where a = 1; +alter table defpart_attach_test_d add check (a > 1); +-- should be attached successfully and without needing to be scanned +alter table defpart_attach_test attach partition defpart_attach_test_d default; +INFO: partition constraint for table "defpart_attach_test_d" is implied by existing constraints +-- trying to add a partition for 2 should fail because the default +-- partition contains a row that would violate its new constraint which +-- prevents rows containing 2 +create table defpart_attach_test2 partition of defpart_attach_test for values in (2); +ERROR: updated partition constraint for default partition "defpart_attach_test_d" would be violated by some row +drop table defpart_attach_test; diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql index 8198d1e930..e0f4009b19 100644 --- a/src/test/regress/sql/alter_table.sql +++ b/src/test/regress/sql/alter_table.sql @@ -2565,3 +2565,26 @@ ANALYZE attmp; DROP TABLE attmp; DROP USER regress_alter_table_user1; + +-- check that violating rows are correctly reported when attching as the +-- default partition +create table defpart_attach_test (a int) partition by list (a); +create table defpart_attach_test1 partition of defpart_attach_test for values in (1
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
At Thu, 29 Mar 2018 13:04:06 -0300, Alvaro Herrera wrote in <20180329160406.ii2wgbkmlnfxtwbt@alvherre.pgsql> > Rushabh Lathia wrote: > > On Thu, Mar 29, 2018 at 7:46 PM, Alvaro Herrera > > wrote: > > > > Hmm, offhand I don't quite see why this error fails to be thrown. > > > > ATTACH PARTITION should throw an error, because partition table "foo" > > already have two partition with key values (1, 2,3 4). And table "foo_d" > > which we are attaching here has : > > Oh, I understand how it's supposed to work. I was just saying I don't > understand how this bug occurs. Is it because we fail to determine the > correct partition constraint for the default partition in time for its > verification scan? The reason is that CommandCounterIncrement added in StorePartitionBound reveals the just added default partition to get_default_oid_from_partdesc too early. The revealed partition has immature constraint and it overrites the right constraint generated just above. ATExecAttachPartition checks for default partition oid twice but the second is just needless before the commit and harms after it. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 8f83aa4675..96b62bd4b6 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -14082,11 +14082,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, PartitionCmd *cmd) } /* - * Check whether default partition has a row that would fit the partition - * being attached. + * Check whether existing default partition has a row that would fit the + * partition being attached. */ - defaultPartOid = - get_default_oid_from_partdesc(RelationGetPartitionDesc(rel)); if (OidIsValid(defaultPartOid)) { Relation defaultrel;
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Rushabh Lathia wrote: > On Thu, Mar 29, 2018 at 7:46 PM, Alvaro Herrera > wrote: > > Hmm, offhand I don't quite see why this error fails to be thrown. > > ATTACH PARTITION should throw an error, because partition table "foo" > already have two partition with key values (1, 2,3 4). And table "foo_d" > which we are attaching here has : Oh, I understand how it's supposed to work. I was just saying I don't understand how this bug occurs. Is it because we fail to determine the correct partition constraint for the default partition in time for its verification scan? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
On Thu, Mar 29, 2018 at 7:46 PM, Alvaro Herrera wrote: > Rushabh Lathia wrote: > > > CREATE TABLE foo (a INT, b INT, c VARCHAR) PARTITION BY LIST(a); > > CREATE TABLE foo_p1 PARTITION OF foo FOR VALUES IN (1,2); > > CREATE TABLE foo_p2 PARTITION OF foo FOR VALUES IN (3,4); > > INSERT INTO foo select i,i,i from generate_series(1,4)i; > > > > CREATE TABLE foo_d (like foo); > > INSERT INTO foo_d select i,i,i from generate_series(1,9)i; > > > > ALTER TABLE foo ATTACH PARTITION foo_d DEFAULT; > > > > Above ATTACH PARTITION should fail with "partition constraint is > violated" > > error, but instead it's working on a master branch. > > Hmm, offhand I don't quite see why this error fails to be thrown. > > ATTACH PARTITION should throw an error, because partition table "foo" already have two partition with key values (1, 2,3 4). And table "foo_d" which we are attaching here has : postgres@76099=#select * from foo_d; a | b | c ---+---+--- 1 | 1 | 1 2 | 2 | 2 3 | 3 | 3 4 | 4 | 4 5 | 5 | 5 6 | 6 | 6 7 | 7 | 7 8 | 8 | 8 9 | 9 | 9 (9 rows) After ATTACH PARTITION, when we see the partition constraint for the newly attached partition: postgres@76099=#\d+ foo_d Table "public.foo_d" Column | Type| Collation | Nullable | Default | Storage | Stats target | Description +---+---+--+-+--+--+- a | integer | | | | plain| | b | integer | | | | plain| | c | character varying | | | | extended | | Partition of: foo DEFAULT Partition constraint: (NOT ((a IS NOT NULL) AND (a = ANY (ARRAY[1, 2, 3, 4] So, it says that this partition (table) should not include values (1,2, 3, 4). But it sill has those values. postgres@76099=#select tableoid::regclass, * from foo; tableoid | a | b | c --+---+---+--- foo_p1 | 1 | 1 | 1 foo_p1 | 2 | 2 | 2 foo_p2 | 3 | 3 | 3 foo_p2 | 4 | 4 | 4 foo_d| 1 | 1 | 1 foo_d| 2 | 2 | 2 foo_d| 3 | 3 | 3 foo_d| 4 | 4 | 4 foo_d| 5 | 5 | 5 foo_d| 6 | 6 | 6 foo_d| 7 | 7 | 7 foo_d| 8 | 8 | 8 foo_d| 9 | 9 | 9 (13 rows) So basically ATTACH PARTITION should throw an error when partition constraint is violated. -- Rushabh Lathia www.EnterpriseDB.com
Re: Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Rushabh Lathia wrote: > CREATE TABLE foo (a INT, b INT, c VARCHAR) PARTITION BY LIST(a); > CREATE TABLE foo_p1 PARTITION OF foo FOR VALUES IN (1,2); > CREATE TABLE foo_p2 PARTITION OF foo FOR VALUES IN (3,4); > INSERT INTO foo select i,i,i from generate_series(1,4)i; > > CREATE TABLE foo_d (like foo); > INSERT INTO foo_d select i,i,i from generate_series(1,9)i; > > ALTER TABLE foo ATTACH PARTITION foo_d DEFAULT; > > Above ATTACH PARTITION should fail with "partition constraint is violated" > error, but instead it's working on a master branch. Hmm, offhand I don't quite see why this error fails to be thrown. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Commit 4dba331cb3 broke ATTACH PARTITION behaviour.
Hi, Consider the below test: CREATE TABLE foo (a INT, b INT, c VARCHAR) PARTITION BY LIST(a); CREATE TABLE foo_p1 PARTITION OF foo FOR VALUES IN (1,2); CREATE TABLE foo_p2 PARTITION OF foo FOR VALUES IN (3,4); INSERT INTO foo select i,i,i from generate_series(1,4)i; CREATE TABLE foo_d (like foo); INSERT INTO foo_d select i,i,i from generate_series(1,9)i; ALTER TABLE foo ATTACH PARTITION foo_d DEFAULT; Above ATTACH PARTITION should fail with "partition constraint is violated" error, but instead it's working on a master branch. Looking further I found that problem introduced with below commit: commit 4dba331cb3dc1b5ffb0680ed8efae847de216796 Author: Alvaro Herrera Date: Tue Mar 20 11:19:41 2018 -0300 Fix CommandCounterIncrement in partition-related DDL It makes sense to do the CCIs in the places that do catalog updates, rather than before the places that error out because the former ones fail to do it. In particular, it looks like StorePartitionBound() and IndexSetParentIndex() ought to make their own CCIs. Per review comments from Peter Eisentraut for row-level triggers on partitioned tables. I noticed that further below commit tried to correct thing in similar area, but still I am noticing the broken behavior in case of attaching the partition as DEFAULT. commit 56163004b8b2151db279744b77138d4d90e2d5cb Author: Alvaro Herrera Date: Wed Mar 21 12:03:35 2018 -0300 Fix relcache handling of the 'default' partition Regards, -- Rushabh Lathia