Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-08 Thread Amit Langote
On 2018/10/08 0:09, Michael Paquier wrote:
> On Sun, Oct 07, 2018 at 11:13:19AM -0300, Alvaro Herrera wrote:
>> Good point.  I cannot do that today, but if you want to, please go
>> ahead.
> 
> Okay, done.

Thank you both.

Regards,
Amit





Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-07 Thread Michael Paquier
On Sun, Oct 07, 2018 at 11:13:19AM -0300, Alvaro Herrera wrote:
> Good point.  I cannot do that today, but if you want to, please go
> ahead.

Okay, done.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-07 Thread Alvaro Herrera
On 2018-Oct-07, Michael Paquier wrote:

> On Sat, Oct 06, 2018 at 10:15:32PM -0300, Alvaro Herrera wrote:
> > Pushed to 11 and master.  I'm not convinced that it's a good idea to
> > mess with 10 at this point -- the partitioning code is rather completely
> > different already ...
> 
> While I definitely agree with you to not mess up with this portion of
> the partitioning code for REL_10_STABLE, I think that we have a good set
> of arguments to get the regression test into this branch.  This
> particular case has been broken while developing v11, hence it would be
> nice to get the insurance that any future bug fix applying to
> REL_10_STABLE does not break it again.

Good point.  I cannot do that today, but if you want to, please go
ahead.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-07 Thread Michael Paquier
On Sat, Oct 06, 2018 at 10:15:32PM -0300, Alvaro Herrera wrote:
> Pushed to 11 and master.  I'm not convinced that it's a good idea to
> mess with 10 at this point -- the partitioning code is rather completely
> different already ...

While I definitely agree with you to not mess up with this portion of
the partitioning code for REL_10_STABLE, I think that we have a good set
of arguments to get the regression test into this branch.  This
particular case has been broken while developing v11, hence it would be
nice to get the insurance that any future bug fix applying to
REL_10_STABLE does not break it again.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-06 Thread Alvaro Herrera
On 2018-Oct-05, Amit Langote wrote:

> On 2018/10/05 16:05, Michael Paquier wrote:
> >> As of commit 2fbdf1b38bc [1], which has been applied in 11 and HEAD
> >> branches, RelationBuildPartitionDesc emits an error if we don't find
> >> relpartbound set for a child found by scanning pg_inherits, instead of
> >> skipping such children.  While that commit switched the order of creating
> >> pg_inherits entry and checking a new bound against existing bounds in
> >> DefineRelation in light of aforementioned change, it didn't in
> >> ATExecAttachPartition, hence this error.
> >>
> >> Attached patch fixes that.
> > 
> > Could you please add a minimal regression test in your patch?  That's
> > the second bug related to ATTACH PARTITION I am looking at today..
> 
> OK, done, although I hope that's not bloating the tests.

Pushed to 11 and master.  I'm not convinced that it's a good idea to
mess with 10 at this point -- the partitioning code is rather completely
different already ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-05 Thread Michael Paquier
On Fri, Oct 05, 2018 at 04:55:51PM +0900, Amit Langote wrote:
> On 2018/10/05 16:05, Michael Paquier wrote:
>> Hmm.  Indeed, v10 does not complain but HEAD does.  (I ran the attached
>> SQL file, which is the complete test case both of you have compiled).
> 
> Did you forget to attach some file?

Yep, here you go for the archive's sake.
--
Michael


upgrade_part.sql
Description: application/sql


signature.asc
Description: PGP signature


Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-05 Thread Amit Langote
On 2018/10/05 16:05, Michael Paquier wrote:
>> As of commit 2fbdf1b38bc [1], which has been applied in 11 and HEAD
>> branches, RelationBuildPartitionDesc emits an error if we don't find
>> relpartbound set for a child found by scanning pg_inherits, instead of
>> skipping such children.  While that commit switched the order of creating
>> pg_inherits entry and checking a new bound against existing bounds in
>> DefineRelation in light of aforementioned change, it didn't in
>> ATExecAttachPartition, hence this error.
>>
>> Attached patch fixes that.
> 
> Could you please add a minimal regression test in your patch?  That's
> the second bug related to ATTACH PARTITION I am looking at today..

OK, done, although I hope that's not bloating the tests.

>> I thought we'd need to apply this to 10, 11, HEAD, but I couldn't
>> reproduce this in 10.  That's because the above commit wasn't applied to
>> 10, so the child that causes this error is being skipped in 10's case.
> 
> Hmm.  Indeed, v10 does not complain but HEAD does.  (I ran the attached
> SQL file, which is the complete test case both of you have compiled).

Did you forget to attach some file?

Thanks,
Amit
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 8219d05d83..75e41ed21c 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -195,23 +195,12 @@ RelationBuildPartitionDesc(Relation rel)
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for relation %u", 
inhrelid);
 
-   /*
-* It is possible that the pg_class tuple of a partition has 
not been
-* updated yet to set its relpartbound field.  The only case 
where
-* this happens is when we open the parent relation to check 
using its
-* partition descriptor that a new partition's bound does not 
overlap
-* some existing partition.
-*/
-   if (!((Form_pg_class) GETSTRUCT(tuple))->relispartition)
-   {
-   ReleaseSysCache(tuple);
-   continue;
-   }
-
datum = SysCacheGetAttr(RELOID, tuple,

Anum_pg_class_relpartbound,
&isnull);
-   Assert(!isnull);
+   if(isnull)
+   elog(ERROR, "null relpartbound for relation %u", 
inhrelid);
+
boundspec = (Node *) stringToNode(TextDatumGetCString(datum));
boundspecs = lappend(boundspecs, boundspec);
partoids = lappend_oid(partoids, inhrelid);
@@ -1856,8 +1845,7 @@ generate_partition_qual(Relation rel)
 
Anum_pg_class_relpartbound,
 &isnull);
if (isnull) /* should not happen */
-   elog(ERROR, "relation \"%s\" has relpartbound = null",
-RelationGetRelationName(rel));
+   elog(ERROR, "null relpartbound for relation %u", 
RelationGetRelid(rel));
bound = castNode(PartitionBoundSpec,
 
stringToNode(TextDatumGetCString(boundDatum)));
ReleaseSysCache(tuple);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f143101b5d..43b76812d2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -745,9 +745,6 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,

  false,

  typaddress);
 
-   /* Store inheritance information for new rel. */
-   StoreCatalogInheritance(relationId, inheritOids, stmt->partbound != 
NULL);
-
/*
 * We must bump the command counter to make the newly-created relation
 * tuple visible for opening.
@@ -809,6 +806,9 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
CommandCounterIncrement();
}
 
+   /* Store inheritance information for new rel. */
+   StoreCatalogInheritance(relationId, inheritOids, stmt->partbound != 
NULL);
+
/*
 * Process the partitioning specification (if any) and store the 
partition
 * key information into the catalog.
@@ -13611,9 +13611,6 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
trigger_name, 
RelationGetRelationName(attachrel)),
 errdetail("ROW triggers with transition tables 
are not supported on partitions")));
 
-   /* OK to create inheritance.  Rest of the checks performed there */
-  

Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-05 Thread Michael Paquier
On Fri, Oct 05, 2018 at 03:06:52PM +0900, Amit Langote wrote:
> To reproduce, the following works too (after creating the objects as
> described above):
> 
> alter table partkey_t detach partition partkey_t_1;
> alter table partkey_t attach partition partkey_t_1 for values from (0) to
> (1000);
> ERROR:  null relpartbound for relation 16396
> CONTEXT:  SQL function "my_int4_sort"
> 
> The stack at the time of the error:
> 
> (gdb) bt
> #0  RelationBuildPartitionDesc
> #1  0x009bf04e in RelationBuildDesc
> #2  0x009c1784 in RelationClearRelation
> #3  0x009c1cc5 in RelationFlushRelation
> #4  0x009c1dd9 in RelationCacheInvalidateEntry
> #5  0x009b9496 in LocalExecuteInvalidationMessage
> #6  0x009b91ec in ProcessInvalidationMessages
> #7  0x009b9cdb in CommandEndInvalidationMessages
> #8  0x005346ef in AtCCI_LocalCache
> #9  0x00534124 in CommandCounterIncrement
> #10 0x006c579d in fmgr_sql
> #11 0x009de7c2 in FunctionCall2Coll
> #12 0x0058ac9f in partition_rbound_cmp
> #13 0x00588059 in check_new_partition_bound
> #14 0x0067f536 in ATExecAttachPartition
> 
> 
> So, the CommandCounterIncrement done in fmgr_sql causes partkey_t's
> PartitionDesc to be recomputed, which counts partkey_t_1 as its child
> because ATExecAttachPartition has already finished CreateInheritance which
> would've sent out an invalidation message for partkey_t.
> 
> As of commit 2fbdf1b38bc [1], which has been applied in 11 and HEAD
> branches, RelationBuildPartitionDesc emits an error if we don't find
> relpartbound set for a child found by scanning pg_inherits, instead of
> skipping such children.  While that commit switched the order of creating
> pg_inherits entry and checking a new bound against existing bounds in
> DefineRelation in light of aforementioned change, it didn't in
> ATExecAttachPartition, hence this error.
> 
> Attached patch fixes that.

Could you please add a minimal regression test in your patch?  That's
the second bug related to ATTACH PARTITION I am looking at today..

> I thought we'd need to apply this to 10, 11, HEAD, but I couldn't
> reproduce this in 10.  That's because the above commit wasn't applied to
> 10, so the child that causes this error is being skipped in 10's case.

Hmm.  Indeed, v10 does not complain but HEAD does.  (I ran the attached
SQL file, which is the complete test case both of you have compiled).
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-04 Thread Amit Langote
Thanks for the report.

On 2018/10/04 3:58, Rajkumar Raghuwanshi wrote:
> Hi,
> 
> I am getting ERROR:  null relpartbound for relation 18159 while doing
> pg_upgrade from v11 to v11/master.
> 
> -- user-defined operator class in partition key
> CREATE FUNCTION my_int4_sort(int4,int4) RETURNS int LANGUAGE sql
>   AS $$ SELECT CASE WHEN $1 = $2 THEN 0 WHEN $1 > $2 THEN 1 ELSE -1 END; $$;
> CREATE OPERATOR CLASS test_int4_ops FOR TYPE int4 USING btree AS
>   OPERATOR 1 < (int4,int4), OPERATOR 2 <= (int4,int4),
>   OPERATOR 3 = (int4,int4), OPERATOR 4 >= (int4,int4),
>   OPERATOR 5 > (int4,int4), FUNCTION 1 my_int4_sort(int4,int4);
> CREATE TABLE partkey_t (a int4) PARTITION BY RANGE (a test_int4_ops);
> CREATE TABLE partkey_t_1 PARTITION OF partkey_t FOR VALUES FROM (0) TO
> (1000);
> INSERT INTO partkey_t VALUES (100);
> INSERT INTO partkey_t VALUES (200);
> 
> --ran pg_upgrade failed with below error.
> pg_restore: [archiver (db)] could not execute query: ERROR:  null
> relpartbound for relation 18159
> CONTEXT:  SQL function "my_int4_sort"

Interesting test case.

To reproduce, the following works too (after creating the objects as
described above):

alter table partkey_t detach partition partkey_t_1;
alter table partkey_t attach partition partkey_t_1 for values from (0) to
(1000);
ERROR:  null relpartbound for relation 16396
CONTEXT:  SQL function "my_int4_sort"

The stack at the time of the error:

(gdb) bt
#0  RelationBuildPartitionDesc
#1  0x009bf04e in RelationBuildDesc
#2  0x009c1784 in RelationClearRelation
#3  0x009c1cc5 in RelationFlushRelation
#4  0x009c1dd9 in RelationCacheInvalidateEntry
#5  0x009b9496 in LocalExecuteInvalidationMessage
#6  0x009b91ec in ProcessInvalidationMessages
#7  0x009b9cdb in CommandEndInvalidationMessages
#8  0x005346ef in AtCCI_LocalCache
#9  0x00534124 in CommandCounterIncrement
#10 0x006c579d in fmgr_sql
#11 0x009de7c2 in FunctionCall2Coll
#12 0x0058ac9f in partition_rbound_cmp
#13 0x00588059 in check_new_partition_bound
#14 0x0067f536 in ATExecAttachPartition


So, the CommandCounterIncrement done in fmgr_sql causes partkey_t's
PartitionDesc to be recomputed, which counts partkey_t_1 as its child
because ATExecAttachPartition has already finished CreateInheritance which
would've sent out an invalidation message for partkey_t.

As of commit 2fbdf1b38bc [1], which has been applied in 11 and HEAD
branches, RelationBuildPartitionDesc emits an error if we don't find
relpartbound set for a child found by scanning pg_inherits, instead of
skipping such children.  While that commit switched the order of creating
pg_inherits entry and checking a new bound against existing bounds in
DefineRelation in light of aforementioned change, it didn't in
ATExecAttachPartition, hence this error.

Attached patch fixes that.

I thought we'd need to apply this to 10, 11, HEAD, but I couldn't
reproduce this in 10.  That's because the above commit wasn't applied to
10, so the child that causes this error is being skipped in 10's case.

Maybe, we should apply parts of the above commit that apply to 10 and then
this patch on top.  Attached for-10.patch file does that.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2fbdf1b38bc
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c145385f84..2d0043d1db 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14247,9 +14247,6 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
trigger_name, 
RelationGetRelationName(attachrel)),
 errdetail("ROW triggers with transition tables 
are not supported on partitions")));
 
-   /* OK to create inheritance.  Rest of the checks performed there */
-   CreateInheritance(attachrel, rel);
-
/*
 * Check that the new partition's bound is valid and does not overlap 
any
 * of existing partitions of the parent - note that it does not return 
on
@@ -14258,6 +14255,9 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
check_new_partition_bound(RelationGetRelationName(attachrel), rel,
  cmd->bound);
 
+   /* OK to create inheritance.  Rest of the checks performed there */
+   CreateInheritance(attachrel, rel);
+
/* Update the pg_class entry. */
StorePartitionBound(attachrel, rel, cmd->bound);
 
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 8219d05d83..75e41ed21c 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -195,23 +195,12 @@ RelationBuildPartitionDesc(Relation rel)
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for relation %u", 
inhrelid

pg_upgrade failed with ERROR: null relpartbound for relation 18159 error.

2018-10-04 Thread Rajkumar Raghuwanshi
Hi,

I am getting ERROR:  null relpartbound for relation 18159 while doing
pg_upgrade from v11 to v11/master.

-- user-defined operator class in partition key
CREATE FUNCTION my_int4_sort(int4,int4) RETURNS int LANGUAGE sql
  AS $$ SELECT CASE WHEN $1 = $2 THEN 0 WHEN $1 > $2 THEN 1 ELSE -1 END; $$;
CREATE OPERATOR CLASS test_int4_ops FOR TYPE int4 USING btree AS
  OPERATOR 1 < (int4,int4), OPERATOR 2 <= (int4,int4),
  OPERATOR 3 = (int4,int4), OPERATOR 4 >= (int4,int4),
  OPERATOR 5 > (int4,int4), FUNCTION 1 my_int4_sort(int4,int4);
CREATE TABLE partkey_t (a int4) PARTITION BY RANGE (a test_int4_ops);
CREATE TABLE partkey_t_1 PARTITION OF partkey_t FOR VALUES FROM (0) TO
(1000);
INSERT INTO partkey_t VALUES (100);
INSERT INTO partkey_t VALUES (200);

--ran pg_upgrade failed with below error.
pg_restore: creating TABLE "public.partitioned"
pg_restore: creating TABLE "public.partkey_t"
pg_restore: creating TABLE "public.partkey_t_1"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 485; 1259 18159 TABLE
partkey_t_1 edb
pg_restore: [archiver (db)] could not execute query: ERROR:  null
relpartbound for relation 18159
CONTEXT:  SQL function "my_int4_sort"
Command was:
-- For binary upgrade, must preserve pg_type oid
SELECT
pg_catalog.binary_upgrade_set_next_pg_type_oid('18161'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type array oid
SELECT
pg_catalog.binary_upgrade_set_next_array_pg_type_oid('18160'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_class oids
SELECT
pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('18159'::pg_catalog.oid);

CREATE TABLE "public"."partkey_t_1" (
"a" integer
);

-- For binary upgrade, recreate inherited column.
UPDATE pg_catalog.pg_attribute
SET attislocal = false
WHERE attname = 'a'
  AND attrelid = '"public"."partkey_t_1"'::pg_catalog.regclass;

-- For binary upgrade, set up inheritance and partitioning this way.
ALTER TABLE ONLY "public"."partkey_t" ATTACH PARTITION
"public"."partkey_t_1" FOR VALUES FROM (0) TO (1000);

-- For binary upgrade, set heap's relfrozenxid and relminmxid
UPDATE pg_catalog.pg_class
SET relfrozenxid = '1915', relminmxid = '1'
WHERE oid = '"public"."partkey_t_1"'::pg_catalog.regclass;

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation