wrong results due to qual pushdown
tender wang wrote (to pgsql-hackers):

Hi hackers. This query returns different results on 16devel and 15.2.

select
  sample_3.n_regionkey as c0,
  ref_7.l_linenumber as c3,
  sample_4.l_quantity as c6,
  sample_5.n_nationkey as c7,
  sample_3.n_name as c8
from
  public.nation as sample_3
  left join public.lineitem as ref_5
    on ((cast(null as text) ~>=~ cast(null as text))
        or (ref_5.l_discount is NULL))
  left join public.time_statistics as ref_6
    inner join public.lineitem as ref_7
      on (ref_7.l_returnflag = ref_7.l_linestatus)
    right join public.lineitem as sample_4
      left join public.nation as sample_5
        on (cast(null as tsquery) = cast(null as tsquery))
      on (cast(null as "time") <= cast(null as "time"))
    right join public.customer as ref_8
      on (sample_4.l_comment = ref_8.c_name)
    on (ref_5.l_quantity = ref_7.l_quantity)
where (ref_7.l_suppkey is not NULL)
  or ((case when cast(null as lseg) >= cast(null as lseg) then cast(null as inet) else cast(null as inet) end
       && cast(null as inet))
  or (pg_catalog.getdatabaseencoding() !~~ case when (cast(null as int2) <= cast(null as int8))
      or (EXISTS (
        select
          ref_9.ps_comment as c0,
          5 as c1,
          ref_8.c_address as c2,
          58 as c3,
          ref_8.c_acctbal as c4,
          ref_7.l_orderkey as c5,
          ref_7.l_shipmode as c6,
          ref_5.l_commitdate as c7,
          ref_8.c_custkey as c8,
          sample_3.n_nationkey as c9
        from public.partsupp as ref_9
        where cast(null as tsquery) @> cast(null as tsquery)
        order by c0, c1, c2, c3, c4, c5, c6, c7, c8, c9 limit 38))
      then cast(null as text) else cast(null as text) end))
order by c0, c3, c6, c7, c8 limit 137;

plan on 16devel:

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Limit
   InitPlan 1 (returns $0)
     ->  Result
           One-Time Filter: false
   ->  Sort
         Sort Key: sample_3.n_regionkey, l_linenumber, l_quantity, n_nationkey, sample_3.n_name
         ->  Nested Loop Left Join
               ->  Seq Scan on nation sample_3
               ->  Materialize
                     ->  Nested Loop Left Join
                           Join Filter: (ref_5.l_quantity = l_quantity)
                           Filter: ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))
                           ->  Seq Scan on lineitem ref_5
                                 Filter: (l_discount IS NULL)
                           ->  Result
                                 One-Time Filter: false
(16 rows)

plan on 15.2:

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Limit
   InitPlan 1 (returns $0)
     ->  Result
           One-Time Filter: false
   ->  Sort
         Sort Key: sample_3.n_regionkey, l_linenumber, l_quantity, n_nationkey, sample_3.n_name
         ->  Nested Loop Left Join
               Filter: ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))
               ->  Seq Scan on nation sample_3
               ->  Materialize
                     ->  Nested Loop Left Join
                           Join Filter: (ref_5.l_quantity = l_quantity)
                           ->  Seq Scan on lineitem ref_5
                                 Filter: (l_discount IS NULL)
                           ->  Result
                                 One-Time Filter: false
(16 rows)

It looks wrong that the qual ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END)) is pushed down.

regards, tender wang
Re: wrong results due to qual pushdown
Results on 16devel:

 c0 | c3 | c6 | c7 |       c8
----+----+----+----+----------------
  0 |    |    |    | ALGERIA
  0 |    |    |    | ETHIOPIA
  0 |    |    |    | KENYA
  0 |    |    |    | MOROCCO
  0 |    |    |    | MOZAMBIQUE
  1 |    |    |    | ARGENTINA
  1 |    |    |    | BRAZIL
  1 |    |    |    | CANADA
  1 |    |    |    | PERU
  1 |    |    |    | UNITED STATES
  2 |    |    |    | CHINA
  2 |    |    |    | INDIA
  2 |    |    |    | INDONESIA
  2 |    |    |    | JAPAN
  2 |    |    |    | VIETNAM
  3 |    |    |    | FRANCE
  3 |    |    |    | GERMANY
  3 |    |    |    | ROMANIA
  3 |    |    |    | RUSSIA
  3 |    |    |    | UNITED KINGDOM
  4 |    |    |    | EGYPT
  4 |    |    |    | IRAN
  4 |    |    |    | IRAQ
  4 |    |    |    | JORDAN
  4 |    |    |    | SAUDI ARABIA
(25 rows)

Results on 15.2:

 c0 | c3 | c6 | c7 | c8
----+----+----+----+----
(0 rows)

Ashutosh Bapat wrote on Mon, Mar 6, 2023 at 22:14:
> On Mon, Mar 6, 2023 at 3:00 PM tender wang wrote:
> > [...]
Can't find not null constraint, but \d+ shows that
Hi Alvaro,

I met an issue related to the catalog not-null commit on HEAD.

postgres=# CREATE TABLE t1(c0 int, c1 int);
CREATE TABLE
postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
ALTER TABLE
postgres=# \d+ t1
                                           Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c0     | integer |           | not null |         | plain   |             |              |
 c1     | integer |           | not null |         | plain   |             |              |
Indexes:
    "q" PRIMARY KEY, btree (c0, c1)
Access method: heap

postgres=# ALTER TABLE t1 DROP c1;
ALTER TABLE
postgres=# \d+ t1
                                           Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c0     | integer |           | not null |         | plain   |             |              |
Access method: heap

postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL;
ERROR:  could not find not-null constraint on column "c0", relation "t1"
postgres=# insert into t1 values (NULL);
ERROR:  null value in column "c0" of relation "t1" violates not-null constraint
DETAIL:  Failing row contains (null).

I couldn't reproduce this issue on older versions (12.12 ... 16.1); more precisely, it appears since commit b0e96f3119.

Without b0e96f3119, when we drop a not-null constraint, we just set pg_attribute.attnotnull to false in ATExecDropNotNull(). But now we first check whether pg_constraint has the corresponding tuple. If attnotnull is true but pg_constraint has no such tuple, the error above is reported. This is confusing for users: \d shows that column c0 is not null, and we cannot insert a NULL value, yet dropping the NOT NULL constraint reports an error.

The attached patch is my workaround. I look forward to your review.

--
Tender Wang
OpenPie: https://en.openpie.com/

0001-Fix-attnotnull-not-correct-reset-issue.patch
Description: Binary data
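For reference, the inconsistency described above is visible directly in the catalogs. This is a hypothetical illustration added for clarity, not from the original report: after the DROP COLUMN, queries like the following should show attnotnull still set while no constraint row remains in pg_constraint.

-- attnotnull is still true for c0 ...
SELECT attname, attnotnull
FROM pg_attribute
WHERE attrelid = 't1'::regclass AND attnum > 0 AND NOT attisdropped;

-- ... but neither the PK row ('p') nor any not-null row ('n') remains
SELECT conname, contype
FROM pg_constraint
WHERE conrelid = 't1'::regclass;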
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> On 2024-Mar-26, Tender Wang wrote:
>
> > postgres=# CREATE TABLE t1(c0 int, c1 int);
> > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> > postgres=# ALTER TABLE t1 DROP c1;
> >
> > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL;
> > ERROR: could not find not-null constraint on column "c0", relation "t1"
>
> Ooh, hah, what happens here is that we drop the PK constraint
> indirectly, so we only go via doDeletion rather than the tablecmds.c
> code, so we don't check the attnotnull flags that the PK was protecting.

Yeah, it is indeed as you said.

> > The attached patch is my workaround. I look forward to your review.
>
> Yeah, this is not a very good approach -- I think you're just guessing
> that the column is marked NOT NULL because a PK was dropped in the
> past -- but really what this catalog state is, is corrupted contents
> because the PK drop was mishandled. At least in theory there are other
> ways to drop a constraint other than dropping one of its columns (for
> example, maybe this could happen if you drop a collation that the PK
> depends on). The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

Agreed, that looks better. But it won't work to simply move some code from dropconstraint_internal to RemoveConstraintById. I tried that fix before the 0001 patch, and it failed. For example:

create table skip_wal_skip_rewrite_index (c varchar(10) primary key);
alter table skip_wal_skip_rewrite_index alter c type varchar(20);
ERROR:  primary key column "c" is not marked NOT NULL

index_check_primary_key() in index.c has these comments:
"We check for a pre-existing primary key, and that all columns of the index are simple column references (not expressions), and that all those columns are marked NOT NULL. If not, fail."

So in the example above, RemoveConstraintById() can't simply reset attnotnull. We could pass some information to RemoveConstraintById(), such as a bool variable indicating whether attnotnull should be reset.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> [...]
> The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

Thinking about it again, the following solution may be better:

i. Move some code from dropconstraint_internal to RemoveConstraintById without changing the RemoveConstraintById interface, ensuring that the PK drop always does the dance that ATExecDropConstraint does.

ii. After phase i, the attnotnull of some primary key columns may be reset to false, as in the example I gave in my last email. We can set attnotnull back to true in the appropriate place.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> [...]
> The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

I found that some kinds of DDL check that a column's attnotnull is true, for example AT_ReAddIndex and AT_ReplicaIdentity. So we should add an AT_SetAttNotNull sub-command to the wqueue. I added a new AT_PASS_OLD_COL_ATTRS pass to make sure AT_SetAttNotNull has been executed before AT_ReAddIndex or AT_ReplicaIdentity runs.

--
Tender Wang
OpenPie: https://en.openpie.com/

v2-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch
Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Thu, Mar 28, 2024 at 13:18:
> On Wed, Mar 27, 2024 at 10:26 PM Tender Wang wrote:
> >
> > Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> > > [...]
> > > The right approach is to ensure that the PK drop always
> > > does the dance that ATExecDropConstraint does. A good fix probably just
> > > moves some code from dropconstraint_internal to RemoveConstraintById.
>
> after applying
> v2-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch
> something is off, now i cannot drop a table.
> demo:
> CREATE TABLE t2(c0 int, c1 int);
> ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(c0, c1);
> ALTER TABLE t2 ALTER COLUMN c0 ADD GENERATED ALWAYS AS IDENTITY;
> DROP TABLE t2 cascade;
> Similarly, maybe there will be some issue with replica identity.

Thanks for reviewing this patch. Yeah, I can reproduce it. The error is reported in RemoveConstraintById(), where I moved some code from dropconstraint_internal(). But some of those checks seem unnecessary in RemoveConstraintById(). For example:

/*
 * It's not valid to drop the not-null constraint for a GENERATED
 * AS IDENTITY column.
 */
if (attForm->attidentity)
	ereport(ERROR,
			errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
			errmsg("column \"%s\" of relation \"%s\" is an identity column",
				   get_attname(RelationGetRelid(rel), attnum, false),
				   RelationGetRelationName(rel)));

/*
 * It's not valid to drop the not-null constraint for a column in
 * the replica identity index, either. (FULL is not affected.)
 */
if (bms_is_member(lfirst_int(lc) - FirstLowInvalidHeapAttributeNumber, ircols))
	ereport(ERROR,
			errcode(ERRCODE_INVALID_TABLE_DEFINITION),
			errmsg("column \"%s\" is in index used as replica identity",
				   get_attname(RelationGetRelid(rel), lfirst_int(lc), false)));

Can the above two checks be removed from RemoveConstraintById()? I need to test more.

> + /*
> + * If this was a NOT NULL or the primary key, the constrained columns must
> + * have had pg_attribute.attnotnull set. See if we need to reset it, and
> + * do so.
> + */
> + if (unconstrained_cols)
> it should be if (unconstrained_cols != NIL)?, given unconstrained_cols is a
> List; also "unconstrained_cols" naming seems not intuitive.
> maybe pk_attnums or pk_cols or pk_columns.

As I said above, the code was copied from dropconstraint_internal(). NOT NULL columns are not always PK columns, so I think "unconstrained_cols" is OK.

> + attrel = table_open(AttributeRelationId, RowExclusiveLock);
> + rel = table_open(con->conrelid, RowExclusiveLock);
> I am not sure why we are using RowExclusiveLock for con->conrelid, given we
> use AccessExclusiveLock at:
> /*
>  * If the constraint is for a relation, open and exclusive-lock the
>  * relation it's for.
>  */
> rel = table_open(con->conrelid, AccessExclusiveLock);

Yeah, you are right.

> + /*
> + * Since the above deletion has been made visible, we can now
> + * search for any remaining constraints on this column (or these
> + * columns, in the case we're dropping a multicol primary key.)
> + * Then, verify whether any further NOT NULL or primary key
> + * exists, and reset attnotnull if none.
> + *
> + * However, if this is a generated identity column, abort the
> + * whole thing with a specific error message, because the
> + * constraint is required in that case.
> + */
> + contup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum);
> + if (contup ||
> +     bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
> +                   pkcols))
> +     continue;
>
> I didn't get this part.
> if you delete a primary key,
> the "NOT NULL" constraint within pg_constraint should definitely be removed?
> therefore contup should be pretty surely NULL?

No. If the original definition of the column includes NOT NULL, we can't reset attnotnull to false when we drop the PK.

> /*
> - * The parser will create AT_AttSetNotNull subcommands for
> - * columns of PRIMARY KEY indexes/constraints, but we need
> - * not do anything with them here, because the columns'
> - * NOT NULL marks will already have been propagated into
> - * the new table definition.
> + * PK drop now will reset pg_attribute attnotnull to false.
> + * We should set attnotnull to true again.
> */
> Dropping a PK now resets pg_attribute.attnotnull to false,
> which is what we should be expecting.
> the comment didn't explain why we should set attnotnull to true again?

The v2 patch still needs more test cases and is probably not the right solution. Anyway, I will send a v3 version after I do more testing.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> [...]
> The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

Should RemoveConstraintById() handle recursion (e.g., for partitioned tables)? I'm not sure now. If RemoveConstraintById() has to handle recursion, the function will become more complicated than before.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Thu, Mar 28, 2024 at 17:18:
> On 2024-Mar-28, Tender Wang wrote:
>
> > Should RemoveConstraintById() handle recursion (e.g., for partitioned
> > tables)? I'm not sure now. If RemoveConstraintById() has to handle
> > recursion, the function will become more complicated than before.
>
> No -- this function handles just a single constraint, as identified by
> OID. The recursion is handled by upper layers, which can be either
> dependency.c or tablecmds.c. I think the problem you found is caused by
> the fact that I worked with the tablecmds.c recursion and neglected the
> one in dependency.c.

Indeed.

create table skip_wal_skip_rewrite_index (c varchar(10) primary key);
alter table skip_wal_skip_rewrite_index alter c type varchar(20);

The SQL above needs attnotnull to be true when the index is re-added, but as far as I can tell RemoveConstraintById() has no way to recognize this scenario. We should set attnotnull back to true before re-adding the index. I added a new AT_PASS in the attached patch. Any thoughts?

--
Tender Wang
OpenPie: https://en.openpie.com/

v4-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch
Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Fri, Mar 29, 2024 at 14:56:
> hi.
> about v4: I think I understand the changes you made.
> RemoveConstraintById(Oid conId) will drop a single constraint record.
> if the constraint is a primary key, then the attnotnull of the columns
> associated with the primary key should be set to false.
> but sometimes it shouldn't.

Yeah, indeed.

> for example:
> drop table if exists t2;
> CREATE TABLE t2(c0 int, c1 int);
> ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(c0, c1);
> ALTER TABLE t2 ALTER COLUMN c0 ADD GENERATED ALWAYS AS IDENTITY;
> ALTER TABLE t2 DROP c1;
>
> + * If this was a NOT NULL or the primary key, the constrained columns must
> + * have had pg_attribute.attnotnull set. See if we need to reset it, and
> + * do so.
> + */
> + if (unconstrained_cols != NIL)
>
> unconstrained_cols is not NIL, which means we have dropped a primary key.
> I am confused by "If this was a NOT NULL or the primary key".

"NOT NULL" means the column definition has a not-null constraint, for example:

create table t1(a int not null);

Here pg_attribute.attnotnull is set to true. The primary key case looks like this:

create table t2(a int primary key);

Again pg_attribute.attnotnull is set to true. I think these cases explain the meaning of the comment in dropconstraint_internal (see the catalog illustration after this message). But here, in RemoveConstraintById(), we only care about the primary key case, so it is better to remove NOT NULL from the comment.

> + /*
> + * Since the above deletion has been made visible, we can now
> + * search for any remaining constraints on this column (or these
> + * columns, in the case we're dropping a multicol primary key.)
> + * Then, verify whether any further NOT NULL exists, and reset
> + * attnotnull if none.
> + */
> + contup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum);
> + if (HeapTupleIsValid(contup))
> + {
> +     heap_freetuple(contup);
> +     heap_freetuple(atttup);
> +     continue;
> + }
>
> I am a little bit confused by the above comment.
> I think the comment should say:
> if contup is valid, that means we already have a "not null" constraint
> associated with the attnum; in that case we must not reset attnotnull,
> otherwise the consistency between attnotnull and the "not null" table
> constraint will be broken.
>
> other than that, the change in RemoveConstraintById looks sane.

The comment wants to say that after the PK constraint is dropped, if there are still matching tuples in pg_constraint, the column definition itself has a not-null constraint, so we can't set pg_attribute.attnotnull to false. For example:

create table t1(a int not null);
alter table t1 add constraint t1_pk primary key(a);
alter table t1 drop constraint t1_pk;

--
Tender Wang
OpenPie: https://en.openpie.com/
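To make the distinction concrete, here is a hypothetical catalog query (added for illustration, not part of the original message). With catalogued not-null constraints, t1's column has a contype 'n' row while t2's column is protected only by the contype 'p' row, even though both show attnotnull = true:

-- after: create table t1(a int not null);
--        create table t2(a int primary key);
SELECT conrelid::regclass AS rel, conname, contype
FROM pg_constraint
WHERE conrelid IN ('t1'::regclass, 't2'::regclass);

-- both columns show attnotnull = true, whatever the constraint's origin
SELECT attrelid::regclass AS rel, attname, attnotnull
FROM pg_attribute
WHERE attrelid IN ('t1'::regclass, 't2'::regclass) AND attnum > 0;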
Re: Can't find not null constraint, but \d+ shows that
It has been several days since the last email. Do you have any suggestions, please?

What I'm mainly concerned about is whether adding a new AT_PASS is a good fix, or too big a change. I'm also concerned about whether it covers all ALTER TABLE cases. Any thoughts?

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Hi all,

I went through the MERGE/SPLIT partition code today; thanks for the work. I found some grammar errors:
i. in error messages (users can see these grammar errors, which is not friendly);
ii. in code comments.

Alexander Korotkov wrote on Sun, Apr 7, 2024 at 06:23:
> Hi, Dmitry!
>
> On Fri, Apr 5, 2024 at 4:00 PM Dmitry Koval wrote:
> > > I've revised the patchset.
> >
> > Thanks for the corrections (especially ddl.sgml).
> > Could you also look at a small optimization for the MERGE PARTITIONS
> > command (in a separate file
> > v31-0003-Additional-patch-for-ALTER-TABLE-.-MERGE-PARTITI.patch, I wrote
> > about it in an email 2024-03-31 00:56:50)?
> >
> > Files v31-0001-*.patch, v31-0002-*.patch are the same as
> > v30-0001-*.patch, v30-0002-*.patch (after rebasing because patch stopped
> > applying due to changes in upstream).
>
> I've pushed 0001 and 0002. I didn't push 0003 for the following reasons.
> 1) This doesn't keep functionality equivalent to 0001. With 0003, the
> merged partition will inherit indexes, constraints, and so on from one
> of the merging partitions.
> 2) This is not necessarily an optimization. Without 0003 indexes on
> the merged partition are created after moving the rows in
> attachPartitionTable(). With 0003 we merge data into the existing
> partition which saves its indexes. That might cause a significant
> performance loss because mass inserts into indexes may be much slower
> than building indexes from scratch.
> I think both aspects need to be carefully considered. Even if we
> accept them, this needs to be documented. I think now it's too late
> for both of these. So, this should wait for v18.
>
> --
> Regards,
> Alexander Korotkov

--
Tender Wang
OpenPie: https://en.openpie.com/

0001-Fix-some-grammer-errors-from-error-messages-and-code.patch
Description: Binary data
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Andrey M. Borodin wrote on Mon, Apr 8, 2024 at 17:40:
>
> > On 27 Sep 2023, at 16:06, tender wang wrote:
> >
> > Do you have any comments or suggestions on this issue? Thanks.
> Hi Tender,
>
> there are some review comments in the thread that you might be interested in.
> I'll mark this [0] entry "Waiting on Author" and move it to the next CF.

Thank you for the reminder. I will update the patch later. I would also very much like more advice on this patch, even advice that it is not worth continuing to work on it. Thanks!

> Best regards, Andrey Borodin.
>
> [0] https://commitfest.postgresql.org/47/4549/

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Wed, Apr 10, 2024 at 14:10:
> On Wed, Apr 10, 2024 at 1:29 AM Alvaro Herrera wrote:
> >
> > On 2024-Mar-29, Tender Wang wrote:
> >
> > > I think the case above explains the meaning of the comment in
> > > dropconstraint_internal.
> > > But here, in RemoveConstraintById(), we only care about the primary key
> > > case, so it is better to remove NOT NULL from the comment.
> >
> > Actually, I think it's better if all the resets of attnotnull occur in
> > RemoveConstraintById, for both types of constraints; we would keep that
> > block in dropconstraint_internal only to raise errors in the cases where
> > the constraint is protecting a replica identity or a generated column.
> > Something like the attached, perhaps, may need more polish.
>
> DROP TABLE if exists notnull_tbl2;
> CREATE TABLE notnull_tbl2 (c0 int generated by default as IDENTITY, c1 int);
> ALTER TABLE notnull_tbl2 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null;
> ALTER TABLE notnull_tbl2 DROP c1;
> \d notnull_tbl2
>
> the last "\d notnull_tbl2" command, master output:
>             Table "public.notnull_tbl2"
>  Column |  Type   | Collation | Nullable |             Default
> --------+---------+-----------+----------+----------------------------------
>  c0     | integer |           | not null | generated by default as identity
>
> the last "\d notnull_tbl2" command, after applying
> 0001-Correctly-reset-attnotnull-when-constraints-dropped-.patch:
>             Table "public.notnull_tbl2"
>  Column |  Type   | Collation | Nullable |             Default
> --------+---------+-----------+----------+----------------------------------
>  c0     | integer |           |          | generated by default as identity

Hmm, ALTER TABLE notnull_tbl2 DROP c1; does not call dropconstraint_internal(). When the PK constraint is dropped indirectly, c0's attnotnull is reset to false in RemoveConstraintById().

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Wed, Apr 10, 2024 at 17:34:
>
> another related bug, in master.
>
> drop table if exists notnull_tbl1;
> CREATE TABLE notnull_tbl1 (c0 int not null, c1 int);
> ALTER TABLE notnull_tbl1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> \d+ notnull_tbl1
> ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;
> ALTER TABLE notnull_tbl1 ALTER c1 DROP NOT NULL;
>
> "ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;"
> should fail?

Yeah, it should fail as before, because c0 is part of the primary key. On master, although c0's pg_attribute.attnotnull is still true, its not-null constraint has been deleted in dropconstraint_internal().

If we drop column c1 after dropping c0's NOT NULL, the primary key is dropped indirectly. \d+ notnull_tbl1 still shows c0 as not null, but altering c0 to drop NOT NULL then reports a "could not find not-null constraint" error:

postgres=# ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;
ALTER TABLE
postgres=# \d+ notnull_tbl1
                                      Table "public.notnull_tbl1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c0     | integer |           | not null |         | plain   |             |              |
 c1     | integer |           | not null |         | plain   |             |              |
Indexes:
    "q" PRIMARY KEY, btree (c0, c1)
Access method: heap

postgres=# alter table notnull_tbl1 drop c1;
ALTER TABLE
postgres=# \d notnull_tbl1
            Table "public.notnull_tbl1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c0     | integer |           | not null |

postgres=# alter table notnull_tbl1 alter c0 drop not null;
ERROR:  could not find not-null constraint on column "c0", relation "notnull_tbl1"

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Revise some error messages in split partition code
Richard Guo wrote on Wed, Apr 10, 2024 at 19:32:
> I noticed some error messages in the split partition code that are not
> up to par. Such as:
>
> "new partitions not have value %s but split partition has"
>
> how about we revise it to:
>
> "new partitions do not have value %s but split partition does"
>
> Another one is:
>
> "any partition in the list should be DEFAULT because split partition is
> DEFAULT"
>
> how about we revise it to:
>
> "all partitions in the list should be DEFAULT because split partition is
> DEFAULT"
>
> Another problem I noticed is that in the test files partition_split.sql
> and partition_merge.sql, there are comments specifying the expected
> error messages for certain test queries. However, in some cases, the
> error message mentioned in the comment does not match the error message
> actually generated by the query. Such as:
>
> -- ERROR: invalid partitions order, partition "sales_mar2022" can not be
> -- merged (space between sections sales_jan2022 and sales_mar2022)
> ALTER TABLE sales_range MERGE PARTITIONS (sales_jan2022, sales_mar2022)
> INTO sales_jan_mar2022;
> ERROR: lower bound of partition "sales_mar2022" conflicts with upper
> bound of previous partition "sales_jan2022"
>
> I'm not sure if it's a good practice to specify the expected error
> message in the comment. But if we choose to do so, I think we at least
> need to ensure that the specified error message in the comment remains
> consistent with the error message produced by the query.
>
> Also there are some comments containing grammatical issues. Such as:
>
> -- no error: bounds of sales_noerror equals to lower and upper bounds of
> sales_dec2022 and sales_feb2022
>
> Attached is a patch to fix the issues I've observed. I suspect there
> may be more to be found.

Yeah. The day before yesterday I found some grammar errors in error messages and code comments [1]. Besides those issues, Alexander Lakhin has found some bugs [2]. I have some concerns about whether this patch is ready to be committed.

[1] https://www.postgresql.org/message-id/CAHewXNkGMPU50QG7V6Q60JGFORfo8LfYO1_GCkCa0VWbmB-fEw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/dbc8b96c-3cf0-d1ee-860d-0e491da20485%40gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/
"type with xxxx does not exist" when doing ExecMemoize()
Hi,

I hit a Memoize node failure while testing Postgres with SQLancer.

database0=# explain select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0);
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Nested Loop  (cost=0.17..21.20 rows=4 width=32)
   ->  Seq Scan on t5  (cost=0.00..1.04 rows=4 width=14)
   ->  Memoize  (cost=0.17..6.18 rows=1 width=32)
         Cache Key: (t5.c0 - t5.c0)
         Cache Mode: logical
         ->  Index Only Scan using t0_c0_key on t0  (cost=0.15..6.17 rows=1 width=32)
               Index Cond: (c0 = (t5.c0 - t5.c0))
(7 rows)

database0=# select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0);
ERROR: type with OID 2139062143 does not exist

How to reproduce: the attached database0.log (created by SQLancer) contains the statements needed to reproduce this issue. First, create database test; then:

psql postgres
\i /xxx/database0.log

I analyzed this issue over the weekend and found that after ResetExprContext() is called in MemoizeHash_hash(), the data in mstate->probeslot is corrupted.

In prepare_probe_slot, the data looks like this:
(gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0]))
$1 = {vl_len_ = 36, rangetypid = 3904}

After ResetExprContext() is called in MemoizeHash_hash:
(gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0]))
$3 = {vl_len_ = 264, rangetypid = 2139062143}

I think prepare_probe_slot() should call datumCopy, as the attached patch does (a rough sketch of the idea follows below). Any thoughts? Thanks.

--
Tender Wang
OpenPie: https://en.openpie.com/

database0.log
Description: Binary data
0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch
Description: Binary data
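For clarity, here is a rough sketch of the shape of that workaround, not the attached patch itself; the memory-context choice (mstate->tableContext) and the way the key type is looked up are assumptions:

/*
 * Sketch: when filling the probe slot, copy by-reference datums out of
 * the per-tuple context so that a later ResetExprContext() in the
 * hash/equal callbacks cannot leave the slot pointing at freed memory.
 */
static void
prepare_probe_slot_sketch(MemoizeState *mstate)
{
	TupleTableSlot *pslot = mstate->probeslot;
	ExprContext *econtext = mstate->ss.ps.ps_ExprContext;

	ExecClearTuple(pslot);
	for (int i = 0; i < mstate->nkeys; i++)
	{
		Datum		val = ExecEvalExpr(mstate->param_exprs[i], econtext,
									   &pslot->tts_isnull[i]);

		if (!pslot->tts_isnull[i])
		{
			int16		typlen;
			bool		typbyval;
			MemoryContext oldcxt;

			/* look up the key type's typlen/typbyval (assumed helper usage) */
			get_typlenbyval(exprType((Node *) mstate->param_exprs[i]->expr),
							&typlen, &typbyval);
			/* copy into longer-lived memory; the context is an assumption */
			oldcxt = MemoryContextSwitchTo(mstate->tableContext);
			val = datumCopy(val, typbyval, typlen);
			MemoryContextSwitchTo(oldcxt);
		}
		pslot->tts_values[i] = val;
	}
	ExecStoreVirtualTuple(pslot);
}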
Re: "type with xxxx does not exist" when doing ExecMemoize()
The attached patch is a new version based on v3 (not including Andrei's test case). There is no need to call datumCopy when isnull is true.

I have not added a new runtime memory context so far and continue to use mstate->tableContext. I'm not sure whether the memory used by the probeslot will count against mstate->mem_limit; maybe adding a new memory context would be better. I think I should spend a little more time studying nodeMemoize.c.

Andrei Lepikhov wrote on Mon, Feb 26, 2024 at 20:29:
> On 26/2/2024 18:34, Richard Guo wrote:
> >
> > On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov wrote:
> >
> >     On 26/2/2024 12:44, Tender Wang wrote:
> >     > Makes sense. I found MemoizeState already has a MemoryContext, so I used it.
> >     > I updated the patch.
> >     This approach is better for me. In the next version of this patch, I
> >     included a test case. I am still unsure about the context chosen and the
> >     stability of the test case. Richard, you recently fixed some Memoize
> >     issues, could you look at this problem and patch?
> >
> > I looked at this issue a bit. It seems to me what happens is that at
> > first the memory areas referenced by probeslot->tts_values[] are
> > allocated in the per tuple context (see prepare_probe_slot). And then
> > in MemoizeHash_hash, after we've calculated the hashkey, we will reset
> > the per tuple context. However, later in MemoizeHash_equal, we still
> > need to reference the values in probeslot->tts_values[], which have been
> > cleared.
> Agree
>
> > Actually the context would always be reset in MemoizeHash_equal, for
> > both binary and logical mode. So I kind of wonder if it's necessary to
> > reset the context in MemoizeHash_hash.
> I can only provide one thought against this solution: what if we have a
> lot of unique hash values, maybe all of them? In that case, we still
> have a kind of 'leak' David fixed by the commit 0b053e78b5.
> Also, I have a segfault report from one client. As I see it, it was caused
> by a too-long text column in the table slot. As I see it, the key value
> stored in the Memoize hash table was corrupted, and the most plausible
> reason is this bug. Should we add a test on this bug, and what do you
> think about the one proposed in v3?
>
> --
> regards,
> Andrei Lepikhov
> Postgres Professional

--
Tender Wang
OpenPie: https://en.openpie.com/

v4-0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch
Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
I recently read the Memoize code and how other nodes use ResetExprContext(). The comment about per_tuple_memory says:

 * ecxt_per_tuple_memory is a short-term context for expression results.
 * As the name suggests, it will typically be reset once per tuple,
 * before we begin to evaluate expressions for that tuple. Each
 * ExprContext normally has its very own per-tuple memory context.

So ResetExprContext() should be called once per tuple, not in the hash and equal functions, just as Richard said before. ExecResult() and ExecProjectSet() call ResetExprContext() once when entering those functions, so I think ExecMemoize() can do the same (a sketch of this shape follows below).

The attached patch includes the following modifications:
1. While reading the code in nodeMemoize.c, I found a typo: "outer" should be "inner", if I haven't misunderstood the intent of Memoize.
2. I found that almost every executor node calls CHECK_FOR_INTERRUPTS(), so I added it. Is it right to add it to ExecMemoize()?
3. I removed ResetExprContext() from the hash and equal functions and call it when entering ExecMemoize(), just like ExecProjectSet() does. ExecQualAndReset() is replaced with ExecQual().
4. This patch doesn't include a test case. I used Andrei's test case but could not reproduce the issue with it. I may need to spend some more time thinking about how to reproduce this issue easily.

So, what do you think about the one proposed in v5? @Andrei Lepikhov @Richard Guo @David Rowley. I don't want to continue working on top of the v3 patch; as Andrei Lepikhov said, using mstate->tableContext for the probeslot is not good. v5 looks simpler.

v5-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patch
Description: Binary data
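As a minimal illustration of that structure (an assumption-laden sketch, not the patch itself), the reset would move to the top of ExecMemoize():

/*
 * Sketch: reset the per-tuple context once per call, the way
 * ExecProjectSet() does, instead of inside the hash/equal callbacks.
 */
static TupleTableSlot *
ExecMemoize_sketch(PlanState *pstate)
{
	MemoizeState *node = castNode(MemoizeState, pstate);
	ExprContext *econtext = node->ss.ps.ps_ExprContext;

	CHECK_FOR_INTERRUPTS();

	/*
	 * Release whatever the previous cycle's hash-key evaluation
	 * allocated in per-tuple memory.
	 */
	ResetExprContext(econtext);

	/* ... the existing state machine (MEMO_CACHE_LOOKUP etc.) follows ... */
	return NULL;				/* placeholder for the sketch */
}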
Re: "type with xxxx does not exist" when doing ExecMemoize()
Hi,

While thinking about how to add a test case for the v5 patch, I wanted to check whether the v5 patch has a memory leak. The thread [1] provides a way to reproduce the memory leak, so I used it to test the v5 patch. I found no memory leak with v5.

But I found another interesting issue. After changing the whereClause in [1], the query reported the error below:
"ERROR could not find memoization table entry"

The query:
EXPLAIN analyze select sum(q.id_table1) from (SELECT t2.* FROM table1 t1 JOIN table2 t2 ON (t2.id_table1 + t2.id_table1) = t1.id) q;

With the v5 patch, it didn't report the error. I guess the reason is the same: the data in the probeslot was reset in the hash function.

I debugged the above query and got this:

before:
(gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0]))
$1 = {vl_len_ = 48, choice = {n_header = 32770, n_long = {n_sign_dscale = 32770, n_weight = 60, n_data = 0x564632ebd708}, n_short = {n_header = 32770, n_data = 0x564632ebd706}}}

after:
(gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0]))
$2 = {vl_len_ = 264, choice = {n_header = 32639, n_long = {n_sign_dscale = 32639, n_weight = 32639, n_data = 0x564632ebd6a8}, n_short = {n_header = 32639, n_data = 0x564632ebd6a6}}}

So after the call to ResetExprContext() in the hash function, the data in the probeslot is corrupted. It is hard to predict what error will occur when executing on corrupted data. During debugging, I learned that numeric_add doesn't have a type check like rangetype does, so the above query does not report "type with xxx does not exist".

I also realized that the test case added by Andrei Lepikhov in v3 is right, so the v6 patch includes Andrei Lepikhov's test case. Thanks a lot.

I think the v6 patch is now complete.

[1] https://www.postgresql.org/message-id/83281eed63c74e4f940317186372abfd%40cft.ru

--
Tender Wang
OpenPie: https://en.openpie.com/

v6-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patch
Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
Andrei Lepikhov wrote on Tue, Mar 5, 2024 at 17:36:
> On 1/3/2024 14:18, Tender Wang wrote:
> > During debugging, I learned that numeric_add doesn't have a type check
> > like rangetype does, so the above query will not report "type with xxx
> > does not exist".
> >
> > And I realized that the test case added by Andrei Lepikhov in v3 is
> > right, so the v6 patch includes Andrei Lepikhov's test case. Thanks a lot.
> >
> > I think the v6 patch is now complete.
> I've passed through the patch, and it looks okay. Although I am afraid
> of the same problems that future changes can cause and how to detect
> them, it works correctly.

Thanks for reviewing it. I have added it to commitfest 2024-07.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
Andrei Lepikhov wrote on Wed, Mar 6, 2024 at 11:37:
> I think it is a bug. Should it be fixed (and back-patched) earlier?

Agreed. We need David to review it, as he knows this area best.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
David Rowley wrote on Mon, Mar 11, 2024 at 13:25:
> On Thu, 7 Mar 2024 at 22:50, David Rowley wrote:
> >
> > On Thu, 7 Mar 2024 at 15:24, Tender Wang wrote:
> > >
> > > Andrei Lepikhov wrote on Wed, Mar 6, 2024 at 11:37:
> > >> I think it is a bug. Should it be fixed (and back-patched) earlier?
> > >
> > > Agreed. We need David to review it, as he knows this area best.
> >
> > This is on my list of things to do. Just not at the top yet.
>
> I've gone over this patch and I'm happy with the changes to
> nodeMemoize.c. The thing I did change was the newly added test. The
> problem there was the test was passing for me with and without the
> code fix. I ended up changing the test so the cache hits and misses
> are reported. That required moving the test to above where the
> work_mem is set to 64KB so we can be certain the values will all be
> cached and the cache hits are predictable.
>
> My other changes were just cosmetic.
>
> Thanks for working on this fix. I've pushed the patch.
>
> David

Thanks for pushing the patch.

--
Tender Wang
OpenPie: https://en.openpie.com/
same query but different result on pg16devel and pg15.2
Hi hackers,

I encountered a problem, shown below:

query:
select
  ref_0.ps_suppkey as c0,
  ref_1.c_acctbal as c1,
  ref_2.o_totalprice as c2,
  ref_2.o_orderpriority as c3,
  ref_2.o_clerk as c4
from
  public.partsupp as ref_0
  left join public.nation as sample_0
    inner join public.customer as sample_1
      on (false)
    on (true)
  left join public.customer as ref_1
    right join public.orders as ref_2
      on (false)
    left join public.supplier as ref_3
      on (false)
    on (sample_0.n_comment = ref_1.c_name)
where (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END,
                   CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END))
order by c0, c1, c2, c3, c4 limit 1;

on pg16devel:
 c0 | c1 | c2 | c3 | c4
----+----+----+----+----
  1 |    |    |    |
(1 row)

plan:
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Limit
   ->  Sort
         Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, o_orderpriority, o_clerk
         ->  Nested Loop Left Join
               ->  Seq Scan on partsupp ref_0
               ->  Result
                     One-Time Filter: false
(7 rows)

on pg15.2:
 c0 | c1 | c2 | c3 | c4
----+----+----+----+----
(0 rows)

plan:
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Limit
   ->  Sort
         Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, o_orderpriority, o_clerk
         ->  Hash Left Join
               Hash Cond: ((n_comment)::text = (c_name)::text)
               Filter: (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END, CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END))
               ->  Nested Loop Left Join
                     ->  Seq Scan on partsupp ref_0
                     ->  Result
                           One-Time Filter: false
               ->  Hash
                     ->  Result
                           One-Time Filter: false
(13 rows)

regards, tender wang
Re: same query but different result on pg16devel and pg15.2
The attached file includes the table schema, but no data.

tender wang wrote on Tue, Apr 4, 2023 at 10:53:
> Hi hackers,
> I encountered a problem, shown below:
> [full query and plans quoted in the previous message]

dbt3-s0.01-janm.sql
Description: Binary data
Re: Improve list manipulation in several places
Richard Guo wrote on Fri, Apr 21, 2023 at 15:35:
> There was discussion in [1] about improvements to list manipulation in
> several places. But since the discussion is not related to the topic in
> that thread, fork a new thread here and attach a patch to show my
> thoughts.
>
> Some are just cosmetic changes by using macros. The others should have
> performance gain from the avoidance of moving list entries. But I doubt
> the performance gain can be noticed or measured, as currently there are
> only a few places affected by the change. I still think the changes are
> worthwhile though, because it is very likely that future usage of the
> same scenario can benefit from these changes.

I doubt the performance gain from the lappend_copy function. new_tail_cell in lappend may not enter enlarge_list in most cases, because we may allocate extra cells in new_list (see the comment in new_list).

> (Copying in David and Ranier. Ranier provided a patch about the changes
> in list.c, but I'm not using that one.)
>
> [1] https://www.postgresql.org/message-id/CAMbWs49aakL%3DPP7NcTajCtDyaVUE-NMVMGpaLEKreYbQknkQWA%40mail.gmail.com
>
> Thanks
> Richard
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Xiaoran Wang wrote on Sat, Mar 18, 2023 at 15:04:
> Hi hackers,
>
> In heap_create_with_catalog, the Relation new_rel_desc is created
> by RelationBuildLocalRelation, not table_open. So it's better to
> call RelationClose to release it.

Why is it better to call RelationClose? Is there a problem with using table_close()?

> What's more, the comment for it seems useless, just delete it.
>
> Thanks!

regards, tender wang
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Bharath Rupireddy wrote on Wed, May 10, 2023 at 22:17:
> On Sat, Mar 18, 2023 at 12:34 PM Xiaoran Wang wrote:
> >
> > Hi hackers,
> >
> > In heap_create_with_catalog, the Relation new_rel_desc is created
> > by RelationBuildLocalRelation, not table_open. So it's better to
> > call RelationClose to release it.
> >
> > What's more, the comment for it seems useless, just delete it.
>
> Essentially, all the close functions are the same with NoLock, IOW,
> table_close(relation, NoLock) = relation_close(relation, NoLock) =
> RelationClose(relation). Therefore, table_close(new_rel_desc, NoLock)
> looks fine to me.

Agreed.

> And, the /* do not unlock till end of xact */ comment, it looks like it's
> been there from day 1. It may be indicating that the ref count of the new
> relation created in heap_create_with_catalog() will be decremented at
> the end of xact, but I'm not sure what it means.

Me too.

> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Tom Lane wrote on Thu, May 11, 2023 at 00:32:
> Bharath Rupireddy writes:
> > And, the /* do not unlock till end of xact */, it looks like it's been
> > there from day 1. It may be indicating that the ref count of the new
> > relation created in heap_create_with_catalog() will be decremented at
> > the end of xact, but I'm not sure what it means.
>
> Hmm, I think it's been copied-and-pasted from somewhere. It's quite
> common for us to not release locks on modified tables until end of
> transaction. However, that's not what's happening here, because we
> actually *don't have any such lock* at this point, as you can easily
> prove by stepping through this code and watching the contents of
> pg_locks from another session. We do acquire AccessExclusiveLock
> on the new table eventually, but not till control returns to
> DefineRelation.
>
> I'm not real sure that I like the proposed code change: it's unclear
> to me whether it's an unwise piercing of a couple of abstraction
> layers or an okay change because those abstraction layers haven't
> yet been applied to the new relation at all. However, I think the
> existing comment is actively misleading and needs to be changed.
> Maybe something like
>
> /*
>  * Close the relcache entry, since we return only an OID not a
>  * relcache reference. Note that we do not yet hold any lockmanager
>  * lock on the new rel, so there's nothing to release.
>  */
> table_close(new_rel_desc, NoLock);
>
> /*
>  * ok, the relation has been cataloged, so close catalogs and return
>  * the OID of the newly created relation.
>  */
> table_close(pg_class_desc, RowExclusiveLock);

+1. Personally, I prefer the code above.

> Given these comments, maybe changing the first call to RelationClose
> would be sensible, but I'm still not quite convinced.
>
> regards, tom lane
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Hi,
Is there any conclusion to this issue?

Jehan-Guillaume de Rorthais wrote on Thu, Aug 10, 2023 at 23:03:
> On Thu, 3 Aug 2023 11:02:43 +0200
> Alvaro Herrera wrote:
>
> > On 2023-Aug-03, tender wang wrote:
> >
> > > I think the old "sub-FK" should not be dropped; that would violate the
> > > foreign key constraint.
> >
> > Yeah, I've been playing more with the patch and it is definitely not
> > doing the right things. Just eyeballing the contents of pg_trigger and
> > pg_constraint for partitions added by ALTER...ATTACH shows that the
> > catalog contents are inconsistent with those added by CREATE TABLE
> > PARTITION OF.
>
> Well, as stated in my original message, the patch helps understand the
> problem and sketch a possible solution. It definitely is not complete.
>
> After DETACHing the table, we surely need to check everything again and
> recreate what is needed to keep the FK consistent.
>
> But should we keep the FK after DETACH? Did you check the two other
> discussions related to FK, self-FK & partition? Unfortunately, as Tender
> experienced, the more we dig the more we find bugs. Moreover, the second
> one might seem unsolvable and deserves a closer look. See:
>
> * FK broken after DETACHing referencing part
>   https://www.postgresql.org/message-id/20230420144344.40744130%40karst
> * Issue attaching a table to a partitioned table with an auto-referenced
>   foreign key
>   https://www.postgresql.org/message-id/20230707175859.17c91538%40karst
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
[...] into the v2 patch, so I didn't add a test case to the v2 patch. A patch without a test case is not a good patch; I'm just sharing my idea about this issue and hope to get your reply.

Alvaro Herrera wrote on Wed, Oct 25, 2023 at 20:13:
> On 2023-Oct-25, tender wang wrote:
>
> > Hi
> > Is there any conclusion to this issue?
>
> None yet. I intend to work on this at some point, hopefully soon.
>
> --
> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

From 24491419ad65871e54207d3ef481d8abe529e1e1 Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Fri, 27 Oct 2023 13:48:48 +0800
Subject: [PATCH v2] Fix partition detach issue.
---
 src/backend/commands/tablecmds.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 416a98e7ce..3b897b620a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19356,7 +19356,9 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 	{
 		ForeignKeyCacheInfo *fk = lfirst(cell);
 		HeapTuple	contup;
+		HeapTuple	parentConTup;
 		Form_pg_constraint conform;
+		Form_pg_constraint parentConForm;
 		Constraint *fkconstraint;
 		Oid			insertTriggerOid,
 					updateTriggerOid;
@@ -19374,6 +19376,24 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 			continue;
 		}
 
+		/* For the referenced side, if it is a partitioned table, each
+		 * partition has one row in pg_constraint. But it doesn't have an
+		 * INSERT CHECK trigger.
+		 */
+		Assert(OidIsValid(conform->conparentid));
+		parentConTup = SearchSysCache1(CONSTROID,
+									   ObjectIdGetDatum(conform->conparentid));
+		if (!HeapTupleIsValid(parentConTup))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 conform->conparentid);
+		parentConForm = (Form_pg_constraint) GETSTRUCT(parentConTup);
+		if (parentConForm->confrelid != conform->confrelid &&
+			parentConForm->conrelid == conform->conrelid)
+		{
+			ReleaseSysCache(contup);
+			ReleaseSysCache(parentConTup);
+			continue;
+		}
+
 		/* unset conparentid and adjust conislocal, coninhcount, etc. */
 		ConstraintSetParentConstraint(fk->conoid, InvalidOid, InvalidOid);
@@ -19421,6 +19441,7 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 									NULL, NULL);
 
 		ReleaseSysCache(contup);
+		ReleaseSysCache(parentConTup);
 	}
 	list_free_deep(fks);
 	if (trigrel)
-- 
2.25.1
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
The v8-0001 patch failed to apply in my local repo, as below:

git apply v8-0001-Make-all-SLRU-buffer-sizes-configurable.patch
error: patch failed: src/backend/access/transam/multixact.c:1851
error: src/backend/access/transam/multixact.c: patch does not apply
error: patch failed: src/backend/access/transam/subtrans.c:184
error: src/backend/access/transam/subtrans.c: patch does not apply
error: patch failed: src/backend/commands/async.c:117
error: src/backend/commands/async.c: patch does not apply
error: patch failed: src/backend/storage/lmgr/predicate.c:808
error: src/backend/storage/lmgr/predicate.c: patch does not apply
error: patch failed: src/include/commands/async.h:15
error: src/include/commands/async.h: patch does not apply

My local head commit is 15c9ac36299. Is there something I missed?

Dilip Kumar wrote on Fri, Nov 24, 2023 at 17:08:
> On Fri, Nov 24, 2023 at 10:17 AM Dilip Kumar wrote:
> >
> > On Thu, Nov 23, 2023 at 11:34 AM Dilip Kumar wrote:
> > >
> > > Note: With this testing, we have found a bug in the bank-wise
> > > approach: basically we are clearing procglobal->clogGroupFirst even
> > > before acquiring the bank lock, which means in most of the cases there
> > > will be a single process in each group as a group leader
> >
> > I realized that the bug fix I have done is not proper, so will send
> > the updated patch set with the proper fix soon.
>
> PFA, the updated patch set fixes the bug found during the testing of the
> group update using the injection point. Also attached is a patch to test
> the injection point, but for that we need to apply the injection point
> patches [1]
>
> [1] https://www.postgresql.org/message-id/ZWACtHPetBFIvP61%40paquier.xyz
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Andrey M. Borodin wrote on Thu, Dec 14, 2023 at 17:02:
>
> > On 14 Dec 2023, at 08:12, Amul Sul wrote:
> >
> > + int bankno = pageno & ctl->bank_mask;
> >
> > I am a bit uncomfortable seeing it as a mask, why can't it be simply a
> > number of banks (num_banks) and get the bank number through modulus op
> > (pageno % num_banks) instead of bitwise & operation (pageno &
> > ctl->bank_mask) which is a bit difficult to read compared to modulus op
> > which is quite simple, straightforward and much common practice in
> > hashing.
> >
> > Are there any advantages of using & over % ?

Using the Compiler Explorer [1] tool, '%' generates more assembly instructions than '&':

int GetBankno1(int pageno) {
    return pageno & 127;
}

int GetBankno2(int pageno) {
    return pageno % 127;
}

Under clang 13.0:

GetBankno1:                             # @GetBankno1
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], edi
        mov     eax, dword ptr [rbp - 4]
        and     eax, 127
        pop     rbp
        ret
GetBankno2:                             # @GetBankno2
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], edi
        mov     eax, dword ptr [rbp - 4]
        mov     ecx, 127
        cdq
        idiv    ecx
        mov     eax, edx
        pop     rbp
        ret

Under gcc 13.2:

GetBankno1:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     eax, DWORD PTR [rbp-4]
        and     eax, 127
        pop     rbp
        ret
GetBankno2:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     eax, DWORD PTR [rbp-4]
        movsx   rdx, eax
        imul    rdx, rdx, -2130574327
        shr     rdx, 32
        add     edx, eax
        mov     ecx, edx
        sar     ecx, 6
        cdq
        sub     ecx, edx
        mov     edx, ecx
        sal     edx, 7
        sub     edx, ecx
        sub     eax, edx
        mov     ecx, eax
        mov     eax, ecx
        pop     rbp
        ret

[1] https://godbolt.org/

> The instruction AND is ~20 times faster than IDIV [0]. This is a relatively
> hot function; it is worth sacrificing some readability to save ~ten
> nanoseconds on each check of the status of a transaction.

Given that AND is faster, can we also replace the '% SLRU_MAX_BANKLOCKS' operation in SimpleLruGetBankLock() with '&':

SimpleLruGetBankLock()
{
    int banklockno = (pageno & ctl->bank_mask) % SLRU_MAX_BANKLOCKS;    /* use '&' here */

    return &(ctl->shared->bank_locks[banklockno].lock);
}

Thoughts?

> [0] https://www.agner.org/optimize/instruction_tables.pdf
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Andrey M. Borodin 于2023年12月14日周四 17:35写道:

> > On 14 Dec 2023, at 14:28, tender wang wrote:
> >
> > Now that AND is faster, can we replace the '% SLRU_MAX_BANKLOCKS'
> > operation in SimpleLruGetBankLock() with '& 127'?
>
> unsigned int GetBankno1(unsigned int pageno) {
>     return pageno & 127;
> }
>
> unsigned int GetBankno2(unsigned int pageno) {
>     return pageno % 128;
> }
>
> Generated with -O2:
>
> GetBankno1(unsigned int):
>         mov     eax, edi
>         and     eax, 127
>         ret
> GetBankno2(unsigned int):
>         mov     eax, edi
>         and     eax, 127
>         ret
>
> The compiler is smart enough with constants.

Yeah, that's true.

int GetBankno(long pageno) {
    unsigned short bank_mask = 128;
    int bankno = (pageno & bank_mask) % 128;
    return bankno;
}

With -O2 enabled, this is only one instruction:

        xor     eax, eax

(Note that with bank_mask = 128, (pageno & 128) can only be 0 or 128, so the % 128 result is always 0; the compiler has folded the whole function to the constant zero.)

But if we use '%' in both places, things change as below:

int GetBankno(long pageno) {
    unsigned short bank_mask = 128;
    int bankno = (pageno % bank_mask) % 128;
    return bankno;
}

        mov     rdx, rdi
        sar     rdx, 63
        shr     rdx, 57
        lea     rax, [rdi+rdx]
        and     eax, 127
        sub     eax, edx

> Best regards, Andrey Borodin.
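The extra instructions in the signed case are not a compiler weakness but C semantics: signed % truncates toward zero, so it disagrees with the mask for negative inputs. A small hedged demo (mine, not from the thread):

#include <stdio.h>

/* Hedged illustration: for signed operands, C defines x % 128 to truncate
 * toward zero, so it differs from x & 127 on negative values.  The two are
 * interchangeable only for non-negative (or unsigned) operands, which is
 * why the unsigned version above compiles to a single AND while the signed
 * version needs the sign-fixup instructions seen in the gcc output. */
int main(void)
{
    long pageno = -129;

    printf("%ld %% 128 = %ld\n", pageno, pageno % 128);  /* prints -1  */
    printf("%ld &  127 = %ld\n", pageno, pageno & 127);  /* prints 127 */
    return 0;
}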
Re: Improve heapgetpage() performance, overhead from serializable
This thread [1] is also about improving the heapgetpage() function, and looks similar to this one.

[1] https://www.postgresql.org/message-id/a9f40066-3d25-a240-4229-ec2fbe94e7a5%40yeah.net

Muhammad Malik 于2023年9月1日周五 04:04写道:

> Hi,
>
> Is there a plan to merge this patch in PG16?
>
> Thanks,
> Muhammad
>
> ------------------------------
> *From:* Andres Freund
> *Sent:* Saturday, July 15, 2023 6:56 PM
> *To:* pgsql-hack...@postgresql.org
> *Cc:* Thomas Munro
> *Subject:* Improve heapgetpage() performance, overhead from serializable
>
> Hi,
>
> Several loops which are important for query performance, like heapgetpage()'s
> loop over all tuples, have to call functions like
> HeapCheckForSerializableConflictOut() and PredicateLockTID() in every
> iteration.
>
> When serializable is not in use, all those functions do is return. But
> being situated in a different translation unit, the compiler can't inline
> (without LTO at least) the check whether serializability is needed. It's not
> just the function call overhead that's noticeable, it's also that registers
> have to be spilled to the stack / reloaded from memory etc.
>
> On a freshly loaded pgbench scale 100, with turbo mode disabled, postgres
> pinned to one core. Parallel workers disabled to reduce noise. All times are
> the average of 15 executions with pgbench, in a newly started, but prewarmed
> postgres.
>
> SELECT * FROM pgbench_accounts OFFSET 1000;
>
> HEAD:
> 397.977
>
> removing the HeapCheckForSerializableConflictOut() from heapgetpage()
> (incorrect!), to establish the baseline of what serializable costs:
> 336.695
>
> pulling out CheckForSerializableConflictOutNeeded() from
> HeapCheckForSerializableConflictOut() in heapgetpage(), and avoiding calling
> HeapCheckForSerializableConflictOut() in the loop:
> 339.742
>
> moving the loop into a static inline function, marked as pg_always_inline,
> called with static arguments for always_visible, check_serializable:
> 326.546
>
> marking the always_visible, !check_serializable case likely():
> 322.249
>
> removing TestForOldSnapshot() calls, which we pretty much already decided on:
> 312.987
>
> FWIW, there's more we can do; with some hacky changes I got the time down to
> 273.261, but the tradeoffs start to be a bit more complicated. And 397->320ms
> for something as core as this is imo worth considering on its own.
>
> Now, this just affects the sequential scan case. heap_hot_search_buffer()
> shares many of the same pathologies. I find it a bit harder to improve,
> because the compiler's code generation seems to switch between good / bad
> with changes that seem unrelated...
>
> I wonder why we haven't used PageIsAllVisible() in heap_hot_search_buffer()
> so far?
>
> Greetings,
>
> Andres Freund
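For readers skimming the thread, the core trick Andres describes is loop-invariant hoisting: evaluate the cheap "is serializable in use?" predicate once per page instead of paying an out-of-line call per tuple. A self-contained sketch of the shape of the change (my simplification with stand-in types, not the actual heapam code):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the backend structures discussed above. */
typedef struct { bool serializable_in_use; } Snapshot;

static bool check_needed(const Snapshot *snap) { return snap->serializable_in_use; }
static void conflict_check(size_t tupno) { (void) tupno; /* expensive work */ }

/* Before: conceptually, an out-of-line call decides per tuple. */
static void scan_page_naive(const Snapshot *snap, size_t ntuples)
{
    for (size_t i = 0; i < ntuples; i++)
        if (check_needed(snap))
            conflict_check(i);
}

/* After: hoist the loop-invariant decision out of the hot loop, which is
 * what pulling CheckForSerializableConflictOutNeeded() out of
 * HeapCheckForSerializableConflictOut() achieves in the real patch. */
static void scan_page_hoisted(const Snapshot *snap, size_t ntuples)
{
    const bool check = check_needed(snap);

    for (size_t i = 0; i < ntuples; i++)
        if (check)
            conflict_check(i);
}

int main(void)
{
    Snapshot snap = { .serializable_in_use = false };
    scan_page_naive(&snap, 256);
    scan_page_hoisted(&snap, 256);
    return 0;
}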
Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi all,

I recently ran benchmark [1] on master and found a performance problem, as below:

explain analyze select
    subq_0.c0 as c0,
    subq_0.c1 as c1,
    subq_0.c2 as c2
from
    (select
        ref_0.l_shipmode as c0,
        sample_0.l_orderkey as c1,
        sample_0.l_quantity as c2,
        ref_0.l_orderkey as c3,
        sample_0.l_shipmode as c5,
        ref_0.l_shipinstruct as c6
    from
        public.lineitem as ref_0
        left join public.lineitem as sample_0
        on ((select p_partkey from public.part order by p_partkey limit 1)
            is not NULL)
    where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=78.00..45267050.75 rows=1 width=27) (actual time=299695.097..299695.099 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.651..0.652 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.650..0.651 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.013..0.428 rows=2000 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..45266972.75 rows=1 width=27) (actual time=299695.096..299695.096 rows=0 loops=1)
         Join Filter: ($0 IS NOT NULL)
         Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
         Rows Removed by Filter: 3621030625
         ->  Seq Scan on lineitem ref_0  (cost=0.00..1969.75 rows=60175 width=11) (actual time=0.026..6.225 rows=60175 loops=1)
         ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.554 rows=60175 loops=60175)
               ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
 Planning Time: 0.172 ms
 Execution Time: 299695.501 ms
(16 rows)

After I set enable_material to off, the same query runs faster, as below:

set enable_material = off;
explain analyze select
    subq_0.c0 as c0,
    subq_0.c1 as c1,
    subq_0.c2 as c2
from
    (select
        ref_0.l_shipmode as c0,
        sample_0.l_orderkey as c1,
        sample_0.l_quantity as c2,
        ref_0.l_orderkey as c3,
        sample_0.l_shipmode as c5,
        ref_0.l_shipinstruct as c6
    from
        public.lineitem as ref_0
        left join public.lineitem as sample_0
        on ((select p_partkey from public.part order by p_partkey limit 1)
            is not NULL)
    where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1078.00..91026185.57 rows=1 width=27) (actual time=192669.605..192670.425 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.662..0.663 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.661..0.662 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.017..0.430 rows=2000 loops=1)
   ->  Gather  (cost=1000.00..91026107.57 rows=1 width=27) (actual time=192669.604..192670.422 rows=0 loops=1)
         Workers Planned: 1
         Params Evaluated: $0
         Workers Launched: 1
         ->  Nested Loop Left Join  (cost=0.00..91025107.47 rows=1 width=27) (actual time=192588.143..192588.144 rows=0 loops=2)
               Join Filter: ($0 IS NOT NULL)
               Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
               Rows Removed by Filter: 1810515312
               ->  Parallel Seq Scan on lineitem ref_0  (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.007..3.797 rows=30088 loops=2)
               ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175)
 Planning Time: 0.174 ms
 Execution Time: 192670.458 ms
(19 rows)

I debugged the code and found that consider_parallel_nestloop() doesn't consider the materialized form of the cheapest inner path. With enable_material = true, we can see that a Material path won in the first plan, yet the Parallel Seq Scan node was not added as the outer path. That is because in try_partial_nestloop_path() the cost of the nestloop was computed using the seq scan path, not the material path.

[1] includes the test table schema and data, so you can reproduce the problem above. I tried to fix this problem in the attached patch, and I found that pg12.12 also has this issue.
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
After applying the patch, the result is as below:

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1078.00..26630101.20 rows=1 width=27) (actual time=160571.005..160571.105 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=1.065..1.066 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=1.064..1.065 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.046..0.830 rows=2000 loops=1)
   ->  Gather  (cost=1000.00..26630023.20 rows=1 width=27) (actual time=160571.003..160571.102 rows=0 loops=1)
         Workers Planned: 1
         Params Evaluated: $0
         Workers Launched: 1
         ->  Nested Loop Left Join  (cost=0.00..26629023.10 rows=1 width=27) (actual time=160549.257..160549.258 rows=0 loops=2)
               Join Filter: ($0 IS NOT NULL)
               Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
               Rows Removed by Filter: 1810515312
               ->  Parallel Seq Scan on lineitem ref_0  (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.010..3.393 rows=30088 loops=2)
               ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.839 rows=60175 loops=60175)
                     ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.008..11.381 rows=60175 loops=2)
 Planning Time: 0.174 ms
 Execution Time: 160571.476 ms
(20 rows)

tender wang 于2023年9月5日周二 16:52写道:

> Hi all,
>
> I recently ran benchmark [1] on master, but I found a performance problem
> as below:
>
> explain analyze select
>     subq_0.c0 as c0,
>     subq_0.c1 as c1,
>     subq_0.c2 as c2
> from
>     (select
>         ref_0.l_shipmode as c0,
>         sample_0.l_orderkey as c1,
>         sample_0.l_quantity as c2,
>         ref_0.l_orderkey as c3,
>         sample_0.l_shipmode as c5,
>         ref_0.l_shipinstruct as c6
>     from
>         public.lineitem as ref_0
>         left join public.lineitem as sample_0
>         on ((select p_partkey from public.part order by p_partkey limit 1)
>             is not NULL)
>     where sample_0.l_orderkey is NULL) as subq_0
> where subq_0.c5 is NULL
> limit 1;
>
>                                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=78.00..45267050.75 rows=1 width=27) (actual time=299695.097..299695.099 rows=0 loops=1)
>    InitPlan 1 (returns $0)
>      ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.651..0.652 rows=1 loops=1)
>            ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.650..0.651 rows=1 loops=1)
>                  Sort Key: part.p_partkey
>                  Sort Method: top-N heapsort  Memory: 25kB
>                  ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.013..0.428 rows=2000 loops=1)
>    ->  Nested Loop Left Join  (cost=0.00..45266972.75 rows=1 width=27) (actual time=299695.096..299695.096 rows=0 loops=1)
>          Join Filter: ($0 IS NOT NULL)
>          Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
>          Rows Removed by Filter: 3621030625
>          ->  Seq Scan on lineitem ref_0  (cost=0.00..1969.75 rows=60175 width=11) (actual time=0.026..6.225 rows=60175 loops=1)
>          ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.554 rows=60175 loops=60175)
>                ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
>  Planning Time: 0.172 ms
>  Execution Time: 299695.501 ms
> (16 rows)
>
> After I set enable_material to off, the same query runs faster, as below:
>
> set enable_material = off;
> explain analyze select
>     subq_0.c0 as c0,
>     subq_0.c1 as c1,
>     subq_0.c2 as c2
> from
>     (select
>         ref_0.l_shipmode as c0,
>         sample_0.l_orderkey as c1,
>         sample_0.l_quantity as c2,
>         ref_0.l_orderkey as c3,
>         sample_0.l_shipmode as c5,
>         ref_0.l_shipinstruct as c6
>     from
>         public.lineitem as ref_0
>         left join public.lineitem as sample_0
>         o
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2023年9月5日周二 18:51写道:

> On Tue, Sep 5, 2023 at 4:52 PM tender wang wrote:
>
>> I recently ran benchmark [1] on master, but I found a performance problem
>> as below:
>> ...
>>
>> I debugged the code and found that consider_parallel_nestloop() doesn't
>> consider the materialized form of the cheapest inner path.
>
> Yeah, this seems an omission in commit 45be99f8. I reviewed the patch
> and here are some comments.
>
> * I think we should not consider materializing the cheapest inner path
> if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> unique-ify the inner path.

That's right; this is fixed in the v2 patch.

> * I think we can check if it'd be parallel safe before creating the
> material path, thus avoiding the creation in unsafe cases.

Agreed.

> * I don't think the test case you added works for the code changes.
> Maybe a plan like below is better:

Agreed.

> explain (costs off)
> select * from tenk1, tenk2 where tenk1.two = tenk2.two;
>               QUERY PLAN
> ----------------------------------------------
>  Gather
>    Workers Planned: 4
>    ->  Nested Loop
>          Join Filter: (tenk1.two = tenk2.two)
>          ->  Parallel Seq Scan on tenk1
>          ->  Materialize
>                ->  Seq Scan on tenk2
> (7 rows)
>
> Thanks
> Richard

From 096c645d7c8d06df3a4e6a8aa7fc4edd1c0a3755 Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Thu, 7 Sep 2023 17:55:04 +0800
Subject: [PATCH v2] Parallel seq scan should consider material inner path in
 nestloop case.

---
 src/backend/optimizer/path/joinpath.c         | 19 +++++++++++
 src/test/regress/expected/select_parallel.out | 24 ++++++++++++++
 src/test/regress/sql/select_parallel.sql      |  9 ++++++
 3 files changed, 52 insertions(+)

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 821d282497..87111ad643 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -2004,10 +2004,25 @@ consider_parallel_nestloop(PlannerInfo *root,
 {
 	JoinType	save_jointype = jointype;
 	ListCell   *lc1;
+	Path	   *matpath = NULL;
+	Path	   *inner_cheapest_total = innerrel->cheapest_total_path;
 
 	if (jointype == JOIN_UNIQUE_INNER)
 		jointype = JOIN_INNER;
 
+	/*
+	 * Consider materializing the cheapest inner path, unless we're doing
+	 * JOIN_UNIQUE_INNER or enable_material is off or the subpath is parallel
+	 * unsafe or the path in question materializes its output anyway.
+	 */
+	if (save_jointype != JOIN_UNIQUE_INNER &&
+		enable_material &&
+		inner_cheapest_total != NULL &&
+		inner_cheapest_total->parallel_safe &&
+		!ExecMaterializesOutput(inner_cheapest_total->pathtype))
+		matpath = (Path *)
+			create_material_path(innerrel, inner_cheapest_total);
+
 	foreach(lc1, outerrel->partial_pathlist)
 	{
 		Path	   *outerpath = (Path *) lfirst(lc1);
@@ -2064,6 +2079,10 @@ consider_parallel_nestloop(PlannerInfo *root,
 			try_partial_nestloop_path(root, joinrel, outerpath, mpath,
 									  pathkeys, jointype, extra);
 		}
+		/* Also consider materialized form of the cheapest inner path */
+		if (matpath != NULL && matpath->parallel_safe)
+			try_partial_nestloop_path(root, joinrel, outerpath, matpath,
+									  pathkeys, jointype, extra);
 	}
 }
 
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index d88353d496..5b9f5c58cc 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -844,6 +844,30 @@ select * from
 (12 rows)
 
 reset enable_material;
+-- test materialized form of the cheapest inner path
+set min_parallel_table_scan_size = '512kB';
+explain(costs off)
+select count(*) from tenk1, int4_tbl where tenk1.two < int4_tbl.f1;
+                    QUERY PLAN
+---------------------------------------------------
+ Finalize Aggregate
+   ->  Gather
+         Workers Planned: 4
+         ->  Partial Aggregate
+               ->  Nested Loop
+                     Join Filter: (tenk1.two < int4_tbl.f1)
+                     ->  Parallel Seq Scan on tenk1
+                     ->  Materialize
+                           ->  Seq Scan on int4_tbl
+(9 ro
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi Tom,

Do you have any comments or suggestions on this issue? Thanks.

Richard Guo 于2023年9月8日周五 14:06写道:

> On Fri, Sep 8, 2023 at 3:15 AM Robert Haas wrote:
>
>> The example query provided here seems rather artificial. Surely few
>> people write a join clause that references neither of the tables being
>> joined. Is there a more realistic case where this makes a big
>> difference?
>
> Yes, the given example query is not that convincing. I tried a query
> with plans as below (after some GUC settings) which might be more
> realistic in the real world.
>
> unpatched:
>
> explain select * from partsupp join lineitem on l_partkey > ps_partkey;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Gather  (cost=0.00..5522666.44 rows=16047 width=301)
>    Workers Planned: 4
>    ->  Nested Loop  (cost=0.00..5522666.44 rows=40116667 width=301)
>          Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
>          ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
>          ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
> (6 rows)
>
> patched:
>
> explain select * from partsupp join lineitem on l_partkey > ps_partkey;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Gather  (cost=0.00..1807085.44 rows=16047 width=301)
>    Workers Planned: 4
>    ->  Nested Loop  (cost=0.00..1807085.44 rows=40116667 width=301)
>          Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
>          ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
>          ->  Materialize  (cost=0.00..307.00 rows=8000 width=157)
>                ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
> (7 rows)
>
> The execution times (ms) are (avg of 3 runs):
>
> unpatched: 71769.21
> patched:   65510.04
>
> So we can see some (~9%) performance gain in this case.
>
> Thanks
> Richard
Re: Problem, partition pruning for prepared statement with IS NULL clause.
The comment /* not needed for Consts */ might be better placed next to the if (!IsA(expr, Const)) check. Everything else looks good to me.

David Rowley 于2023年10月9日周一 07:28写道:

> On Sat, 7 Oct 2023 at 03:11, Sergei Glukhov wrote:
> > I noticed that the combination of a prepared statement with a generic
> > plan and an 'IS NULL' clause could lead partition pruning to crash.
> >
> > Test case:
> > --
> > set plan_cache_mode to force_generic_plan;
> > prepare stmt AS select * from hp where a is null and b = $1;
> > explain execute stmt('xxx');
>
> Thanks for the detailed report and proposed patch.
>
> I think your proposed fix isn't quite correct. I think the problem
> lies in InitPartitionPruneContext() where we assume that the list
> positions of step->exprs are in sync with the keyno. If you look at
> perform_pruning_base_step() the code there makes a special effort to
> skip over any keyno when a bit is set in opstep->nullkeys.
>
> It seems that your patch is adjusting the keyno that's given to the
> PruneCxtStateIdx() and it looks like (for your test case) it'll end up
> passing keyno==0 when it should be passing keyno==1. keyno is the
> index of the partition key, so you can't pass 0 when it's for key
> index 1.
>
> I wonder if it's worth expanding the tests further to cover more of
> the pruning cases to cover run-time pruning too.

I think it's worth doing that.

> David
Re: Problem, partition pruning for prepared statement with IS NULL clause.
For a hash-partitioned table, if a partition key is covered by an IS NULL clause, the condition in get_steps_using_prefix_recurse():

    if (cur_keyno < step_lastkeyno - 1)

is not sufficient. As in the reported crash case,

    explain select * from hp where a = 1 and b is null and c = 1;

the prefix list contains only the a = 1 clause. I try to fix this in the attached patch.

David Rowley 于2023年10月11日周三 10:50写道:

> On Tue, 10 Oct 2023 at 21:31, Sergei Glukhov wrote:
> > create table hp (a int, b text, c int, d int)
> >   partition by hash (a part_test_int4_ops, b part_test_text_ops, c
> > part_test_int4_ops);
> > create table hp0 partition of hp for values with (modulus 4, remainder 0);
> > create table hp3 partition of hp for values with (modulus 4, remainder 3);
> > create table hp1 partition of hp for values with (modulus 4, remainder 1);
> > create table hp2 partition of hp for values with (modulus 4, remainder 2);
> >
> > Another crash in a different place, even with the fix:
> > explain select * from hp where a = 1 and b is null and c = 1;
>
> Ouch. It looks like 13838740f tried to fix things in this area before
> and even added a regression test for it. Namely:
>
> -- Test that get_steps_using_prefix() handles non-NULL step_nullkeys
> explain (costs off) select * from hp_prefix_test where a = 1 and b is
> null and c = 1 and d = 1;
>
> I guess that one does not crash because the "d = 1" clause is in
> the "start" ListCell in get_steps_using_prefix_recurse(), whereas
> with your case start is NULL, which is an issue for cur_keyno =
> ((PartClauseInfo *) lfirst(start))->keyno;.
>
> It might have been better if PartClauseInfo could also describe IS
> NULL quals, but I feel if we do that now then it would require lots of
> careful surgery in partprune.c to account for that. Probably the fix
> should be localised to get_steps_using_prefix_recurse() to have it do
> something like pass the keyno to try and work on rather than trying to
> get that from the "prefix" list. That way if there's no item in that
> list for that keyno, we can check in step_nullkeys for the keyno.
>
> I'll continue looking.
>
> David

From 1640c0d05269c3368fb41fcffc66e38630ff522d Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Wed, 11 Oct 2023 11:32:04 +0800
Subject: [PATCH] Fix NULL partition key pruning for hash partition table.

---
 src/backend/partitioning/partprune.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7179b22a05..c2a388d454 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -2438,6 +2438,7 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context,
 	List	   *result = NIL;
 	ListCell   *lc;
 	int			cur_keyno;
+	int			prefix_lastkeyno;
 
 	/* Actually, recursion would be limited by PARTITION_MAX_KEYS. */
 	check_stack_depth();
@@ -2445,7 +2446,11 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context,
 	/* Check if we need to recurse. */
 	Assert(start != NULL);
 	cur_keyno = ((PartClauseInfo *) lfirst(start))->keyno;
-	if (cur_keyno < step_lastkeyno - 1)
+	/* Note that for hash partitioning, if a partition key is covered by an
+	 * IS NULL clause, that partition key will not be present in the prefix List.
+	 */
+	prefix_lastkeyno = ((PartClauseInfo *) llast(prefix))->keyno;
+	if (cur_keyno < step_lastkeyno - 1 && cur_keyno != prefix_lastkeyno)
 	{
 		PartClauseInfo *pc;
 		ListCell   *next_start;
-- 
2.25.1
Re: Problem, partition pruning for prepared statement with IS NULL clause.
David Rowley 于2023年10月11日周三 15:52写道:

> On Wed, 11 Oct 2023 at 15:49, David Rowley wrote:
> > It might have been better if PartClauseInfo could also describe IS
> > NULL quals, but I feel if we do that now then it would require lots of
> > careful surgery in partprune.c to account for that. Probably the fix
> > should be localised to get_steps_using_prefix_recurse() to have it do
> > something like pass the keyno to try and work on rather than trying to
> > get that from the "prefix" list. That way if there's no item in that
> > list for that keyno, we can check in step_nullkeys for the keyno.
> >
> > I'll continue looking.
>
> The fix seems to amount to the attached. The following condition
> assumes that by not recursively processing step_lastkeyno - 1 there
> will be at least one more PartClauseInfo in the prefix List to
> process. However, that didn't work when that partition key clause was
> covered by an IS NULL clause.
>
> If we adjust the following condition:
>
> if (cur_keyno < step_lastkeyno - 1)
>
> to become:
>
> final_keyno = ((PartClauseInfo *) llast(prefix))->keyno;
> if (cur_keyno < final_keyno)

Yeah, agreed.

> then that ensures that the else clause can pick up any clauses for the
> final column mentioned in the 'prefix' list, plus any nullkeys if
> there happen to be any of those too.
>
> For testing, given that 13838740f (from 2020) had a go at fixing this
> already, I'm kinda thinking that it's not overkill to test all
> possible 16 combinations of IS NULL and equality quals on the
> 4-partition-key-column partitioned table that commit added in
> partition_prune.sql.
>
> I added some tests there using \gexec to prevent having to write out
> each of the 16 queries by hand. I tested that pruning worked (i.e. 1
> matching partition in EXPLAIN), and that we get the correct results
> (i.e. we pruned the correct partition) by running the query and checking
> that we get the expected 1 row after having inserted 16 rows, one for
> each combination of quals to test.
>
> I wanted to come up with some tests that test for multiple quals
> matching the same partition key. This is tricky due to the
> equivalence class code being smart and removing any duplicates or
> marking the rel as dummy when it finds conflicting quals. With hash
> partitioning, we're limited to just equality quals, so maybe something
> could be done with range-partitioned tables instead. I see there are
> some tests just above the ones I modified which try to cover this.
>
> I also tried to outsmart the planner by using Params and prepared
> queries. Namely:
>
> set plan_cache_mode = 'force_generic_plan';
> prepare q1 (int, int, int, int, int, int, int, int) as select
> tableoid::regclass,* from hp_prefix_test where a = $1 and b = $2 and c
> = $3 and d = $4 and a = $5 and b = $6 and c = $7 and d = $8;
> explain (costs off) execute q1 (1,2,3,4,1,2,3,4);
>
> But I was outsmarted again with a gating qual which checked the pairs
> match before doing the scan :-(
>
> Append
>   Subplans Removed: 15
>   ->  Result
>         One-Time Filter: (($1 = $5) AND ($2 = $6) AND ($3 = $7) AND ($4 = $8))
>         ->  Seq Scan on hp_prefix_test_p14 hp_prefix_test_1
>               Filter: ((a = $5) AND (b = $6) AND (c = $7) AND (d = $8))
>
> I'm aiming to commit these as two separate fixes, so I'm going to go
> look again at the first one and wait to see if anyone wants to comment
> on this patch in the meantime.

+1, LGTM

> David
Why can't we simplify stable functions in the planning phase?
Hi hackers,

In evaluate_function(), I found the following code:

/*
 * Ordinarily we are only allowed to simplify immutable functions. But for
 * purposes of estimation, we consider it okay to simplify functions that
 * are merely stable; the risk that the result might change from planning
 * time to execution time is worth taking in preference to not being able
 * to estimate the value at all.
 */
if (funcform->provolatile == PROVOLATILE_IMMUTABLE)
    /* okay */ ;
else if (context->estimate && funcform->provolatile == PROVOLATILE_STABLE)
    /* okay */ ;
else
    return NULL;

This code says that stable functions cannot be simplified here (i.e., in the planning phase). I want to know why. Could someone show me an example of a query that would return incorrect results if a stable function were simplified in the planning phase?

With kindest regards, tender wang
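One hedged illustration of the risk (my own example; the table name "events" is hypothetical): now() is stable, meaning it returns the same value for every call within one execution, but a different value in a later execution. If the planner folded it into a constant, a cached generic plan would freeze that value forever:

-- If now() were simplified at planning time, the prepared statement below
-- would keep comparing against the timestamp of the moment the generic plan
-- was built, not the moment of each EXECUTE.
SET plan_cache_mode = force_generic_plan;
PREPARE recent AS
    SELECT * FROM events WHERE created_at > now() - interval '1 hour';
EXECUTE recent;  -- correct today only because now() is evaluated at
                 -- execution time rather than folded away during planning

This is also why the comment quoted above allows the simplification for estimation purposes only: a slightly stale constant is fine for a selectivity estimate, but not for the query's actual result.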
ERROR: permission info at index 1 ....
Hi hackers,

After commit a61b1f74823c, the query below reports an error:

create table perm_test1(a int);
create table perm_test2(b int);

select subq.c0
from (select (select a from perm_test1 order by a limit 1) as c0, b as c1
      from perm_test2
      where false
      order by c0, c1) as subq
where false;
ERROR:  permission info at index 1 (with relid=16457) does not match provided RTE (with relid=16460)

The following change fixes it:

--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -512,11 +512,16 @@ flatten_rtes_walker(Node *node, flatten_rtes_walker_context *cxt)
 		 * Recurse into subselects.  Must update cxt->query to this query so
 		 * that the rtable and rteperminfos correspond with each other.
 		 */
+		Query	   *current_query = cxt->query;
+		bool		result;
+
 		cxt->query = (Query *) node;
-		return query_tree_walker((Query *) node,
+		result = query_tree_walker((Query *) node,
 								 flatten_rtes_walker,
 								 (void *) cxt,
 								 QTW_EXAMINE_RTES_BEFORE);
+		cxt->query = current_query;
+		return result;
 	}

regards, tender wang
wrong query result due to wrong plan
Hi hackers,

I have this query, as shown below:

select
    ref_1.r_comment as c0,
    subq_0.c1 as c1
from
    public.region as sample_0
    right join public.partsupp as sample_1
        right join public.lineitem as sample_2
        on (cast(null as path) = cast(null as path))
    on (cast(null as "timestamp") < cast(null as "timestamp"))
    inner join public.lineitem as ref_0 on (true)
    left join (select
                   sample_3.ps_availqty as c1,
                   sample_3.ps_comment as c2
               from
                   public.partsupp as sample_3
               where false
               order by c1, c2) as subq_0
    on (sample_1.ps_supplycost = subq_0.c1)
    right join public.region as ref_1
    on (sample_1.ps_availqty = ref_1.r_regionkey)
where ref_1.r_comment is not NULL
order by c0, c1;

This query returns different results on pg12.12 and on HEAD. On pg12.12:

                               c0                                | c1
-----------------------------------------------------------------+----
 even, ironic theodolites according to the bold platelets wa     |
 furiously unusual packages use carefully above the unusual, exp |
 silent, bold requests sleep slyly across the quickly sly dependencies. furiously silent instructions alongside |
 special, bold deposits haggle foxes. platelet                   |
 special Tiresias about the furiously even dolphins are furi     |
(5 rows)

Its plan:

                  QUERY PLAN
-----------------------------------------------
 Sort
   Sort Key: ref_1.r_comment, c1
   ->  Hash Left Join
         Hash Cond: (ref_1.r_regionkey = ps_availqty)
         ->  Seq Scan on region ref_1
               Filter: (r_comment IS NOT NULL)
         ->  Hash
               ->  Result
                     One-Time Filter: false
(9 rows)

But on HEAD (pg16devel), the result is:

 c0 | c1
----+----
(0 rows)

Its plan:

              QUERY PLAN
---------------------------------------
 Sort
   Sort Key: ref_1.r_comment, subq_0.c1
   ->  Result
         One-Time Filter: false
(4 rows)

The attached file includes the table schema info.

regards, tender wang

dbt3-s0.01-janm.sql
Description: Binary data
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
fujii.y...@df.mitsubishielectric.co.jp <fujii.y...@df.mitsubishielectric.co.jp> 于2024年6月5日周三 09:26写道:

> Hi Tender.
>
> Thank you for the modification.
>
> > From: Tender Wang
> > Sent: Tuesday, June 4, 2024 7:51 PM
> > > Please add more tests. Especially please add some negative tests;
> > > the current patch checks that it is safe to apply materialization.
> > > It would be helpful to add tests checking that materialization is
> > > not applied in both checked cases:
> > > 1. when the inner join path is not parallel safe
> > > 2. when matpath is not parallel safe
> >
> > I added a test case in which the inner rel is not parallel safe.
> > Actually, matpath will not be created if the inner rel is not parallel
> > safe, so I didn't add a test case for the second scenario.
>
> Is there a case in which matpath is not parallel safe while the inner rel
> is parallel safe? If so, I think it would be suitable to add a negative
> test for such a case.

I looked through the create_xxx_path() functions and found that in most cases a path's parallel_safe flag is copied from RelOptInfo.consider_parallel. Some paths also take subpath->parallel_safe into account (e.g., the Material path). In most cases, a Material path is parallel safe if its rel is parallel safe. So far I haven't come up with a query plan in which the Material path is parallel-unsafe while the rel is parallel-safe.

> Sincerely yours,
> Yuuki Fujii
>
> --
> Yuuki Fujii
> Information Technology R&D Center, Mitsubishi Electric Corporation

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi Robert,

This patch was reviewed at the PgConf.dev Patch Review Workshop, and I have updated it according to the review advice. Since then there have been no further comments on it. The patch's status in the commitfest has stayed at "Needs review" for a long time; I'd like to know whether it is ready to be moved to "Ready for Committer".

Thanks.

--
Tender Wang
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2024年6月18日周二 17:24写道:

> On Tue, Jun 4, 2024 at 6:51 PM Tender Wang wrote:
> > Yeah, Richard commented on the v1 patch about JOIN_UNIQUE_INNER in [1]:
> >
> > * I think we should not consider materializing the cheapest inner path
> > if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> > unique-ify the inner path.
> >
> > match_unsorted_order() does not consider a material inner path if the
> > jointype is JOIN_UNIQUE_INNER, so the same logic is used here. I added
> > comments to explain why.
>
> I looked through the v4 patch and found an issue. For the plan diff:
>
> +         ->  Nested Loop
> +               ->  Parallel Seq Scan on prt1_p1 t1_1
> +               ->  Materialize
> +                     ->  Sample Scan on prt1_p1 t2_1
> +                           Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
> +                           Filter: (t1_1.a = a)
>
> This does not seem correct to me. The inner path is parameterized by
> the outer rel, in which case it does not make sense to add a Materialize
> node on top of it.

Yeah, you're right.

> I updated the patch to include a check in consider_parallel_nestloop
> ensuring that inner_cheapest_total is not parameterized by outerrel
> before materializing it. I also tweaked the comments, test cases and
> commit message.

Thanks for the work. It looks better now. I have changed the status from "Needs review" to "Ready for Committer" in the commitfest.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2024年7月12日周五 10:30写道: > On Sat, Jul 6, 2024 at 5:32 PM Richard Guo wrote: > > Here is a new rebase. > > > > I'm planning to push it next week, barring any objections. > > Pushed. > > Thanks > Richard > Thanks for pushing. -- Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I think old "sub-FK" should not be dropped, that will be violates foreign key constraint. For example : postgres=# insert into r values(1,1); INSERT 0 1 postgres=# ALTER TABLE r DETACH PARTITION r_1; ALTER TABLE postgres=# delete from p_1 where id = 1; DELETE 1 postgres=# select * from r_1; id | p_id +-- 1 |1 (1 row) If I run above SQLs on pg12.12, it will report error below: postgres=# delete from p_1 where id = 1; ERROR: update or delete on table "p_1" violates foreign key constraint "r_1_p_id_fkey1" on table "r_1" DETAIL: Key (id)=(1) is still referenced from table "r_1". Alvaro Herrera 于2023年7月31日周一 20:58写道: > On 2023-Jul-05, Jehan-Guillaume de Rorthais wrote: > > > ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); > > > > The old sub-FKs (below 18289) created in this table to enforce the action > > triggers on referenced partitions are not deleted when the table becomes > a > > partition. Because of this, we have additional and useless triggers on > the > > referenced partitions and we can not DETACH this partition on the > referencing > > side anymore: > > Oh, hm, interesting. Thanks for the report and patch. I found a couple > of minor issues with it (most serious one: nkeys should be 3, not 2; > also sysscan should use conrelid index), but I'll try and complete it so > that it's ready for 2023-08-10's releases. > > Regards > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > > >
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I think the code that determines whether a partition's FK is inherited or not is insufficient. For example, in this case, the foreign key r_1_p_id_fkey1 is not inherited from the parent.

If conform->conparentid (in DetachPartitionFinalize()) is valid, we should also recheck the confrelid (pg_constraint) field.

I try to fix this problem in the attached patch. Any thoughts?

Alvaro Herrera 于2023年8月3日周四 17:02写道:

> On 2023-Aug-03, tender wang wrote:
>
> > I think the old "sub-FK"s should not be dropped; that would violate the
> > foreign key constraint.
>
> Yeah, I've been playing more with the patch and it is definitely not
> doing the right things. Just eyeballing the contents of pg_trigger and
> pg_constraint for partitions added by ALTER...ATTACH shows that the
> catalog contents are inconsistent with those added by CREATE TABLE
> PARTITION OF.
>
> --
> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

From d84395c7321c201a78661de0b41b76e71ab10678 Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Thu, 3 Aug 2023 17:23:06 +0800
Subject: [PATCH] Recheck whether a partition's foreign key is inherited from
 its parent.

Previously, an FK was treated as inherited if conparentid (pg_constraint)
was valid. That is not enough; we should also compare the confrelid field
to determine whether the FK is inherited.
---
 src/backend/commands/tablecmds.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 727f151750..1447433109 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18556,6 +18556,8 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 		ForeignKeyCacheInfo *fk = lfirst(cell);
 		HeapTuple	contup;
 		Form_pg_constraint conform;
+		HeapTuple	parentTup;
+		Form_pg_constraint parentForm;
 		Constraint *fkconstraint;
 		Oid			insertTriggerOid,
 					updateTriggerOid;
@@ -18573,6 +18575,23 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 			continue;
 		}
 
+		/* recheck the confrelid field */
+		if (OidIsValid(conform->conparentid))
+		{
+			parentTup = SearchSysCache1(CONSTROID, ObjectIdGetDatum(conform->conparentid));
+			if (!HeapTupleIsValid(parentTup))
+				elog(ERROR, "cache lookup failed for constraint %u", conform->conparentid);
+			parentForm = (Form_pg_constraint) GETSTRUCT(parentTup);
+			/* It is not an inherited foreign key */
+			if (parentForm->confrelid != conform->confrelid)
+			{
+				ReleaseSysCache(contup);
+				ReleaseSysCache(parentTup);
+				continue;
+			}
+			ReleaseSysCache(parentTup);
+		}
+
 		/* unset conparentid and adjust conislocal, coninhcount, etc. */
 		ConstraintSetParentConstraint(fk->conoid, InvalidOid, InvalidOid);
-- 
2.25.1
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Looking over DetachPartitionFinalize(), I found another bug:

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE
postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE
postgres=# CREATE TABLE r_1 (
postgres(#   id bigint PRIMARY KEY,
postgres(#   p_id bigint NOT NULL
postgres(# );
CREATE TABLE
postgres=# CREATE TABLE r (
postgres(#   id bigint PRIMARY KEY,
postgres(#   p_id bigint NOT NULL,
postgres(#   FOREIGN KEY (p_id) REFERENCES p (id)
postgres(# ) PARTITION BY list (id);
CREATE TABLE
postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE
postgres=# ALTER TABLE r DETACH PARTITION r_1;
ALTER TABLE
postgres=# insert into r_1 values(1,1);
ERROR:  insert or update on table "r_1" violates foreign key constraint "r_p_id_fkey"
DETAIL:  Key (p_id)=(1) is not present in table "p".

After the detach operation, r_1 is a normal relation, and the inherited foreign key 'r_p_id_fkey' should have been removed.

tender wang 于2023年8月3日周四 17:34写道:

> I think the code that determines whether a partition's FK is inherited or
> not is insufficient. For example, in this case, the foreign key
> r_1_p_id_fkey1 is not inherited from the parent.
>
> If conform->conparentid (in DetachPartitionFinalize()) is valid, we
> should also recheck the confrelid (pg_constraint) field.
>
> I try to fix this problem in the attached patch. Any thoughts?
>
> Alvaro Herrera 于2023年8月3日周四 17:02写道:
>
>> On 2023-Aug-03, tender wang wrote:
>>
>> > I think the old "sub-FK"s should not be dropped; that would violate the
>> > foreign key constraint.
>>
>> Yeah, I've been playing more with the patch and it is definitely not
>> doing the right things. Just eyeballing the contents of pg_trigger and
>> pg_constraint for partitions added by ALTER...ATTACH shows that the
>> catalog contents are inconsistent with those added by CREATE TABLE
>> PARTITION OF.
>>
>> --
>> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Looking over DetachPartitionFinalize() again, I found the root cause of why 'r_p_id_fkey' was not removed.

DetachPartitionFinalize() calls GetParentedForeignKeyRefs() to fetch the pg_constraint tuples that should be deleted, but it fails to find them. According to the comments, GetParentedForeignKeyRefs() returns tuples for constraints that reference me, not constraints through which I reference others.

I would try to fix this bug as follows:
i. ConstraintSetParentConstraint() should not be called in DetachPartitionFinalize(), because after conparentid is set to 0, we can no longer find the inherited foreign keys.
ii. Create another function like GetParentedForeignKeyRefs(), but with the ScanKey on the conrelid field rather than confrelid (see the sketch at the end of this message).

I quickly tested this solution in my environment, and it solves the above issue.

tender wang 于2023年8月4日周五 17:04写道:

> Looking over DetachPartitionFinalize(), I found another bug:
>
> postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
> CREATE TABLE
> postgres=# CREATE TABLE r_1 (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL
> postgres(# );
> CREATE TABLE
> postgres=# CREATE TABLE r (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL,
> postgres(#   FOREIGN KEY (p_id) REFERENCES p (id)
> postgres(# ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
> ALTER TABLE
> postgres=# ALTER TABLE r DETACH PARTITION r_1;
> ALTER TABLE
> postgres=# insert into r_1 values(1,1);
> ERROR:  insert or update on table "r_1" violates foreign key constraint
> "r_p_id_fkey"
> DETAIL:  Key (p_id)=(1) is not present in table "p".
>
> After the detach operation, r_1 is a normal relation, and the inherited
> foreign key 'r_p_id_fkey' should have been removed.
>
> tender wang 于2023年8月3日周四 17:34写道:
>
>> I think the code that determines whether a partition's FK is inherited
>> or not is insufficient. For example, in this case, the foreign key
>> r_1_p_id_fkey1 is not inherited from the parent.
>>
>> If conform->conparentid (in DetachPartitionFinalize()) is valid, we
>> should also recheck the confrelid (pg_constraint) field.
>>
>> I try to fix this problem in the attached patch. Any thoughts?
>>
>> Alvaro Herrera 于2023年8月3日周四 17:02写道:
>>
>>> On 2023-Aug-03, tender wang wrote:
>>>
>>> > I think the old "sub-FK"s should not be dropped; that would violate
>>> > the foreign key constraint.
>>>
>>> Yeah, I've been playing more with the patch and it is definitely not
>>> doing the right things. Just eyeballing the contents of pg_trigger and
>>> pg_constraint for partitions added by ALTER...ATTACH shows that the
>>> catalog contents are inconsistent with those added by CREATE TABLE
>>> PARTITION OF.
>>>
>>> --
>>> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
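A hedged sketch of idea (ii) above (my own illustration, not the committed fix): scan pg_constraint by conrelid -- constraints owned by the partition -- instead of by confrelid as GetParentedForeignKeyRefs() does. The backend headers (access/genam.h, access/table.h, catalog/pg_constraint.h, utils/fmgroids.h) are assumed; treat the function name and details as illustrative.

static List *
get_partition_owned_fkeys(Relation partRel)
{
	List	   *result = NIL;
	Relation	pg_constraint;
	SysScanDesc scan;
	ScanKeyData key[1];
	HeapTuple	tuple;

	/* one ScanKey on conrelid: constraints this relation owns */
	ScanKeyInit(&key[0],
				Anum_pg_constraint_conrelid,
				BTEqualStrategyNumber, F_OIDEQ,
				ObjectIdGetDatum(RelationGetRelid(partRel)));

	pg_constraint = table_open(ConstraintRelationId, AccessShareLock);
	scan = systable_beginscan(pg_constraint, ConstraintRelidTypidNameIndexId,
							  true, NULL, 1, key);

	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
	{
		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);

		/* keep only inherited foreign keys: these are the ones to fix up */
		if (con->contype == CONSTRAINT_FOREIGN && OidIsValid(con->conparentid))
			result = lappend_oid(result, con->oid);
	}

	systable_endscan(scan);
	table_close(pg_constraint, AccessShareLock);
	return result;
}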
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
The foreign key still works even though the partition was detached. Is this behavior expected? I can't find the answer in the documentation. If it is expected behavior, please ignore the bug I reported a few days ago.

tender wang 于2023年8月4日周五 17:04写道:

> Looking over DetachPartitionFinalize(), I found another bug:
>
> postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
> CREATE TABLE
> postgres=# CREATE TABLE r_1 (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL
> postgres(# );
> CREATE TABLE
> postgres=# CREATE TABLE r (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL,
> postgres(#   FOREIGN KEY (p_id) REFERENCES p (id)
> postgres(# ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
> ALTER TABLE
> postgres=# ALTER TABLE r DETACH PARTITION r_1;
> ALTER TABLE
> postgres=# insert into r_1 values(1,1);
> ERROR:  insert or update on table "r_1" violates foreign key constraint
> "r_p_id_fkey"
> DETAIL:  Key (p_id)=(1) is not present in table "p".
>
> After the detach operation, r_1 is a normal relation, and the inherited
> foreign key 'r_p_id_fkey' should have been removed.
>
> tender wang 于2023年8月3日周四 17:34写道:
>
>> I think the code that determines whether a partition's FK is inherited
>> or not is insufficient. For example, in this case, the foreign key
>> r_1_p_id_fkey1 is not inherited from the parent.
>>
>> If conform->conparentid (in DetachPartitionFinalize()) is valid, we
>> should also recheck the confrelid (pg_constraint) field.
>>
>> I try to fix this problem in the attached patch. Any thoughts?
>>
>> Alvaro Herrera 于2023年8月3日周四 17:02写道:
>>
>>> On 2023-Aug-03, tender wang wrote:
>>>
>>> > I think the old "sub-FK"s should not be dropped; that would violate
>>> > the foreign key constraint.
>>>
>>> Yeah, I've been playing more with the patch and it is definitely not
>>> doing the right things. Just eyeballing the contents of pg_trigger and
>>> pg_constraint for partitions added by ALTER...ATTACH shows that the
>>> catalog contents are inconsistent with those added by CREATE TABLE
>>> PARTITION OF.
>>>
>>> --
>>> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
[question] multi-column range partition prune
I have a range-partitioned table and the query below:

create table p_range(a int, b int) partition by range (a, b);
create table p_range1 partition of p_range for values from (1, 1) to (3, 3);
create table p_range2 partition of p_range for values from (4, 4) to (6, 6);

explain select * from p_range where b = 2;
                              QUERY PLAN
------------------------------------------------------------------------
 Append  (cost=0.00..76.61 rows=22 width=8)
   ->  Seq Scan on p_range1 p_range_1  (cost=0.00..38.25 rows=11 width=8)
         Filter: (b = 2)
   ->  Seq Scan on p_range2 p_range_2  (cost=0.00..38.25 rows=11 width=8)
         Filter: (b = 2)
(5 rows)

The EXPLAIN output shows that no partition pruning happened. gen_prune_steps_from_opexps() has a comment that explains this result:

/*
 * For range partitioning, if we have no clauses for the current key,
 * we can't consider any later keys either, so we can stop here.
 */
if (part_scheme->strategy == PARTITION_STRATEGY_RANGE &&
    clauselist == NIL)
    break;

But I want to know why we can't prune when the WHERE clause only constrains a later partition key. Thanks.
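A hedged illustration of the reason (my own example rows, not from the original mail): multi-column range bounds compare rows lexicographically, so a later key taken alone does not bound which partition a row lands in:

-- (2, 100) >= (1, 1) and (2, 100) < (3, 3) under row-wise comparison,
-- because 2 < 3 decides the ordering before b is even examined.  So this
-- row is stored in p_range1 even though b = 100 lies far outside 1..3:
insert into p_range values (2, 100);
select tableoid::regclass, * from p_range;
--  tableoid | a |  b
-- ----------+---+-----
--  p_range1 | 2 | 100

Hence a predicate on b alone, such as b = 2, cannot exclude any partition: rows with any b value can appear in every partition, and pruning on b is only possible once a is constrained as well.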
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年4月10日周三 21:58写道:

> It turns out that trying to close all holes that lead to columns marked
> not-null without a pg_constraint row is not possible within the ALTER
> TABLE framework, because it can happen outside it also. Consider this:
>
> CREATE DOMAIN dom1 AS integer;
> CREATE TABLE notnull_tbl (a dom1, b int, PRIMARY KEY (a, b));
> DROP DOMAIN dom1 CASCADE;
>
> In this case you'll end up with b having attnotnull=true and no
> constraint; and no amount of messing with tablecmds.c will fix it.

I tried the above case with my v4 patch [1], and it does not seem to produce the result you describe. But anyway, I now don't like updating other catalogs in RemoveConstraintById(), because it is unfriendly to callers that only want to remove a pg_constraint tuple while the function stealthily does more work.

> So I propose to instead allow those constraints, and treat them as
> second-class citizens. We allow dropping them with ALTER TABLE DROP NOT
> NULL, and we allow to create a backing full-fledged constraint with SET
> NOT NULL or ADD CONSTRAINT. So here's a partial crude initial patch to
> do that.

Hmm, this patch looks like the one in my first email in this thread, but my v1 patch seems a PoC at most.

> One thing missing here is pg_dump support. If you just dump this table,
> it'll end up with no constraint at all. That's obviously bad, so I
> propose we have pg_dump add a regular NOT NULL constraint for those, to
> avoid perpetuating the weird situation further.
>
> Another thing I wonder is whether I should use the existing
> set_attnotnull() instead of adding drop_orphaned_notnull(). Or we could
> just inline the code in ATExecDropNotNull, since it's small and
> self-contained.

I'd prefer to just inline the code in ATExecDropNotNull(); as you said, it's small and self-contained. In ATExecDropNotNull() we have already opened the pg_attribute table with RowExclusiveLock and fetched the tuple; all we have to do is set attnotnull = false and call CatalogTupleUpdate.

> --
> Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
> "Postgres is bloatware by design: it was built to house
> PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)

[1] https://www.postgresql.org/message-id/CAHewXNn_So7LUCxxxyDNfdvCQp1TnD3gTVECBZX2bT_nbPgraQ%40mail.gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he 于2024年4月11日周四 14:40写道:

> On Wed, Apr 10, 2024 at 2:10 PM jian he wrote:
> >
> > DROP TABLE if exists notnull_tbl2;
> > CREATE TABLE notnull_tbl2 (c0 int generated by default as IDENTITY, c1 int);
> > ALTER TABLE notnull_tbl2 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> > ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null;
> > ALTER TABLE notnull_tbl2 DROP c1;
> > \d notnull_tbl2
>
> ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null;
> per the above execution order, this should error out?
>
> otherwise, which "not null" (attribute|constraint) anchors the "generated
> by default as identity" not-null property?
> "DROP c1" will drop the not-null property for "c0" and "c1".
> if "DROP CONSTRAINT notnull_tbl2_c0_not_null" does not error out, then
> "ALTER TABLE notnull_tbl2 DROP c1;"
> should either error out
> or transform "c0" from "c0 int generated by default as identity"
> to "c0 int"

I tried the above case on master and on master with Alvaro's v2 patch, and both work correctly; \d+ notnull_tbl2 shows the not-null for "c0".

> On Thu, Apr 11, 2024 at 1:23 AM Alvaro Herrera wrote:
> >
> > On 2024-Apr-10, Alvaro Herrera wrote:
> >
> > > One thing missing here is pg_dump support. If you just dump this table,
> > > it'll end up with no constraint at all. That's obviously bad, so I
> > > propose we have pg_dump add a regular NOT NULL constraint for those, to
> > > avoid perpetuating the weird situation further.
> >
> > Here's another crude patchset, this time including the pg_dump aspect.
>
> +DROP TABLE notnull_tbl1;
> +-- make sure attnotnull is reset correctly when a PK is dropped indirectly
> +CREATE TABLE notnull_tbl1 (c0 int, c1 int, PRIMARY KEY (c0, c1));
> +ALTER TABLE notnull_tbl1 DROP c1;
> +\d+ notnull_tbl1
> +                                   Table "public.notnull_tbl1"
> + Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
> +--------+---------+-----------+----------+---------+---------+--------------+-------------
> + c0     | integer |           | not null |         | plain   |              |
>
> is this not what we expected?
> "not null" for "c0" should now be false?
> am I missing something?

Yeah, this is now the expected behavior. Users can manually drop the not-null on "c0" if they want, and no error is reported.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he 于2024年4月12日周五 10:12写道:

> On Thu, Apr 11, 2024 at 10:48 PM Alvaro Herrera wrote:
> >
> > I'm still not ready with this -- still not convinced about the new AT
> > pass. Also, I want to add a test for the pg_dump behavior, and there's
> > an XXX comment.
>
> Now I am more confused...
>
> +-- make sure attnotnull is reset correctly when a PK is dropped indirectly,
> +-- or kept if there's a reason for that
> +CREATE TABLE notnull_tbl1 (c0 int, c1 int, PRIMARY KEY (c0, c1));
> +ALTER TABLE notnull_tbl1 DROP c1;
> +\d+ notnull_tbl1
> +                                   Table "public.notnull_tbl1"
> + Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
> +--------+---------+-----------+----------+---------+---------+--------------+-------------
> + c0     | integer |           |          |         | plain   |              |
> +DROP TABLE notnull_tbl1;
>
> For the same query, MySQL keeps "c0" not null:
> mysql https://dbfiddle.uk/_ltoU7PO
>
> For Postgres:
> https://dbfiddle.uk/ZHJXEqL1
> from 9.3 to 16 (click the link (https://dbfiddle.uk/ZHJXEqL1), then
> choose whichever version you like), all will keep the remaining column
> "c0" not null.
>
> The latest 0001-Better-handle-indirect-constraint-drops.patch makes c0
> attnotnull false.
>
> Previous patches:
> v2-0001-Handle-ALTER-.-DROP-NOT-NULL-when-no-pg_constrain.patch makes
> c0 attnotnull true.
> 0001-Correctly-reset-attnotnull-when-constraints-dropped-.patch makes
> c0 attnotnull false.

I'm not sure the SQL standard specifies what a database must do in this case. If the standard does not specify, then it depends on each database vendor's decision. Some people may like the not-null retained; others may like it removed. I think it is fine as long as people can drop the not-null, or add it back again, after dropping the PK.

On master, the not-null is reset when we drop a PK directly. I hope dropping a PK indirectly stays consistent with dropping it directly.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年4月11日周四 22:48写道:

> On 2024-Apr-11, Alvaro Herrera wrote:
>
> > Well, I think you were right that we should try to handle the situation
> > of unmarking attnotnull as much as possible, to decrease the chances
> > that the problematic situation occurs. That means, we can use the
> > earlier code to handle DROP COLUMN when it causes a PK to be dropped --
> > even though we still need to handle the situation of an attnotnull flag
> > set with no pg_constraint row. I mean, we still have to handle DROP
> > DOMAIN correctly (and there might be other cases that I haven't thought
> > about) ... but surely this is a much less common situation than the one
> > you reported. So I'll merge everything and post an updated patch.
>
> Here's roughly what I'm thinking. If we drop a constraint, we can still
> reset attnotnull in RemoveConstraintById(), but only after checking that
> it's not a generated column or a replica identity. If they are, we
> don't error out -- just skip the attnotnull update.
>
> Now, about the code to allow ALTER TABLE DROP NOT NULL in case there's
> no pg_constraint row, I think at this point it's mostly dead code,
> because it can only happen when you have a replica identity or generated
> column ... and the DROP NOT NULL should still prevent you from dropping
> the flag anyway. But the case can still arise, if you change the
> replica identity or ALTER TABLE ALTER COLUMN DROP DEFAULT, respectively.
>
> I'm still not ready with this -- still not convinced about the new AT
> pass.

Yeah, at first I was also hesitant. Two reasons convinced me.

In ATPostAlterTypeParse():

else if (cmd->subtype == AT_SetAttNotNull)
{
    /*
     * The parser will create AT_AttSetNotNull subcommands for
     * columns of PRIMARY KEY indexes/constraints, but we need
     * not do anything with them here, because the columns'
     * NOT NULL marks will already have been propagated into
     * the new table definition.
     */
}

The new table definition continues to use the old columns' not-null marks, so nothing is done here. If we reset the NOT NULL marks in RemoveConstraintById() when dropping a PK indirectly, we need to do something here or somewhere else.

Types other than AT_SetAttNotNull add an AT pass to tab->subcmds. Because the not-null must be added back before the index is re-added, there is no suitable AT pass in the current AlterTablePass, so a new AT pass ahead of AT_PASS_OLD_INDEX is needed.

Another reason is that this approach can reuse the ALTER TABLE framework to set the not-null; that looks simpler and better than hard-coding the re-installation of not-null in some function.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年4月19日周五 02:49写道: > On 2024-Apr-13, jian he wrote: > > > I wonder is there any incompatibility issue, or do we need to say > something > > about the new behavior when dropping a key column? > > Umm, yeah, maybe we should document it in ALTER TABLE DROP PRIMARY KEY > and in the release notes to note the different behavior. > > > only minor cosmetic issue: > > + if (unconstrained_cols) > > i would like change it to > > + if (unconstrained_cols != NIL) > > > > + foreach(lc, unconstrained_cols) > > we can change to > > + foreach_int(attnum, unconstrained_cols) > > per commit > > > https://git.postgresql.org/cgit/postgresql.git/commit/?id=14dd0f27d7cd56ffae9ecdbe324965073d01a9ff > > Ah, yeah. I did that, rewrote some comments and refined the tests a > little bit to ensure the pg_upgrade behavior is sane. I intend to get > this pushed tomorrow, if nothing ugly comes up. > The new patch looks good to me. > > CI run: https://cirrus-ci.com/build/5471117953990656 > > -- Tender Wang OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Andrey M. Borodin 于2024年4月8日周一 17:40写道:

> > On 27 Sep 2023, at 16:06, tender wang wrote:
> >
> > Do you have any comments or suggestions on this issue? Thanks.
>
> Hi Tender,
>
> there are some review comments in the thread that you might be interested in.
> I'll mark this [0] entry "Waiting on Author" and move it to the next CF.
>
> Thanks!
>
> Best regards, Andrey Borodin.
>
> [0] https://commitfest.postgresql.org/47/4549/

I have rebased onto master and fixed a plan diff case.

--
Tender Wang
OpenPie: https://en.openpie.com/

v3-0001-Support-materializing-inner-path-on-parallel-oute.patch
Description: Binary data
Re: First draft of PG 17 release notes
jian he 于2024年5月9日周四 18:00写道:

> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian wrote:
> >
> > I have committed the first draft of the PG 17 release notes; you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
>
> another potential incompatibility issue:
> ALTER TABLE DROP PRIMARY KEY
>
> see:
> https://www.postgresql.org/message-id/202404181849.6frtmajobe27%40alvherre.pgsql

Since Alvaro has reverted all changes to not-null constraints, there will be no potential incompatibility issue.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: struct RelOptInfo member relid comments
jian he 于2024年5月24日周五 10:58写道:

> hi
>
> typedef struct RelOptInfo
> {
>     ...
>     /*
>      * information about a base rel (not set for join rels!)
>      */
>     Index relid;
>     ...
> }
>
> imho, the above comment is not very helpful.
> we should say more about what kind of information relid says about a base rel?
>
> I don't know much about RelOptInfo, that's why I ask.

The fields in struct RelOptInfo between the comment "information about a base rel" and the comment "Information about foreign tables and foreign joins" are all about a base rel, and every field has its own comment. I think that's already helpful for understanding what information we need to optimize a base rel.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: why memoize is not used for correlated subquery
Pavel Stehule 于2024年5月28日周二 15:31写道:

> Hi
>
> My question is - does memoize support subqueries? And can be enhanced to
> support this exercise without LATERAL and optimization fences?

The commit message of the Memoize commit may answer your question:

"For now, the planner will only consider using a result cache for
parameterized nested loop joins. This works for both normal joins and
also for LATERAL type joins to subqueries. It is possible to use this
new node for other uses in the future. For example, to cache results
from correlated subqueries. However, that's not done here due to some
difficulties obtaining a distinct estimation on the outer plan to
calculate the estimated cache hit ratio. Currently we plan the inner
plan before planning the outer plan so there is no good way to know if
a result cache would be useful or not since we can't estimate the
number of times the subplan will be called until the outer plan is
generated."

git show b6002a796d

--
Tender Wang
OpenPie: https://en.openpie.com/
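To illustrate the distinction in the commit message: a correlated subquery can often be rewritten as a LATERAL join, which is planned as a parameterized nested loop and can therefore get a Memoize node. A sketch (the table and column names are made up, and whether Memoize is actually chosen depends on costs and statistics):

    -- hypothetical schema
    CREATE TABLE orders (id int, customer_id int);
    CREATE TABLE customers (id int PRIMARY KEY, name text);

    -- correlated subquery: Memoize is not considered for the subplan
    EXPLAIN SELECT o.id,
        (SELECT c.name FROM customers c WHERE c.id = o.customer_id)
    FROM orders o;

    -- LATERAL form: planned as a parameterized nested loop, which may
    -- get a Memoize node above the inner index scan
    EXPLAIN SELECT o.id, c.name
    FROM orders o,
        LATERAL (SELECT name FROM customers c WHERE c.id = o.customer_id) c;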
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Tomasz Rybak 于2024年5月31日周五 04:21写道:

> On Tue, 2024-04-23 at 16:59 +0800, Tender Wang wrote:
>
> [ cut ]
> >
> > I have rebased onto master and fixed a plan diff case.
>
> We (me, Paul Jungwirth, and Yuki Fujii) reviewed this patch
> at the PgConf.dev Patch Review Workshop.

Thanks for reviewing this patch.

> Here are our findings.
>
> Patch tries to allow for using materialization together
> with parallel subqueries.
> It applies cleanly on 8fea1bd5411b793697a4c9087c403887e050c4ac
> (current HEAD).
> Tests pass locally on macOS and Linux in VM under Windows.
> Tests are also green in cfbot (for last 2 weeks; they were
> red previously, probably because of need to rebase).
>
> Please add more tests. Especially please add some negative tests;
> current patch checks that it is safe to apply materialization. It would
> be helpful to add tests checking that materialization is not applied
> in both checked cases:
> 1. when inner join path is not parallel safe
> 2. when matpath is not parallel safe

I added a test case where the inner rel is not parallel safe. Actually, the matpath will not be created if the inner path is not parallel safe, so I didn't add a test case for the second scenario.

> This patch tries to apply materialization only when join type
> is not JOIN_UNIQUE_INNER. Comment mentions this, but does not
> explain why. So please either add comment describing reason for that
> or try enabling materialization in such a case.

Yeah, Richard commented on the v1 patch about JOIN_UNIQUE_INNER in [1]:

* I think we should not consider materializing the cheapest inner path
  if we're doing JOIN_UNIQUE_INNER, because in this case we have to
  unique-ify the inner path.

We don't consider a materialized inner path if the jointype is JOIN_UNIQUE_INNER in match_unsorted_outer(), so the logic here is the same as in match_unsorted_outer(). I added comments to explain why.

[1] https://www.postgresql.org/message-id/CAMbWs49LbQF_Z0iKPRPnTHfsRECT7M-4rF6ft5vpW1ARSpBkPA%40mail.gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/

v4-0001-Support-materializing-inner-path-on-parallel-oute.patch
Description: Binary data
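For readers who haven't opened the patch, the idea boils down to something like this (a sketch based on the discussion above, not the actual patch text):

    /*
     * In consider_parallel_nestloop(): besides the plain inner paths,
     * also try a Material node over the cheapest parallel-safe inner
     * path -- except for JOIN_UNIQUE_INNER, where the inner side must
     * be unique-ified first, so caching the raw inner rows is not valid.
     */
    if (jointype != JOIN_UNIQUE_INNER &&
        inner_cheapest_total != NULL &&
        inner_cheapest_total->parallel_safe &&
        enable_material)
    {
        Path *matpath = (Path *)
            create_material_path(innerrel, inner_cheapest_total);

        /* then, for each partial outer path: */
        try_partial_nestloop_path(root, joinrel, outerpath, matpath,
                                  pathkeys, jointype, extra);
    }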
ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3
Hi hackers,

I found $subject problem when using SQLancer.

How to repeat:
CREATE TEMP TABLE t0(c0 inet, c1 money, c2 TEXT);
CREATE TEMP TABLE IF NOT EXISTS t1(LIKE t0);
CREATE TEMP TABLE t2(c0 boolean , c1 DECIMAL NOT NULL UNIQUE);
CREATE TEMPORARY TABLE t3(LIKE t1);
CREATE VIEW v0(c0) AS (SELECT DISTINCT 0 FROM t3);
SELECT SUM(count) FROM (SELECT (((t1.c2)LIKE(t0.c2)||(((103556691)-(v0.c0)||(v0.c0)::INT as count FROM t0, ONLY t1 RIGHT OUTER JOIN ONLY t2 ON t2.c0 RIGHT OUTER JOIN v0 ON ((t2.c1)=(0.08182310538090898))) as res;

psql (16devel)
Type "help" for help.

postgres=# \d
Did not find any relations.
postgres=# CREATE TEMP TABLE t0(c0 inet, c1 money, c2 TEXT);
CREATE TABLE
postgres=# CREATE TEMP TABLE IF NOT EXISTS t1(LIKE t0);
CREATE TABLE
postgres=# CREATE TEMP TABLE t2(c0 boolean , c1 DECIMAL NOT NULL UNIQUE);
CREATE TABLE
postgres=# CREATE TEMPORARY TABLE t3(LIKE t1);
CREATE TABLE
postgres=# CREATE VIEW v0(c0) AS (SELECT DISTINCT 0 FROM t3);
NOTICE: view "v0" will be a temporary view
CREATE VIEW
postgres=# SELECT SUM(count) FROM (SELECT (((t1.c2)LIKE(t0.c2)||(((103556691)-(v0.c0)||(v0.c0)::INT as count FROM t0, ONLY t1 RIGHT OUTER JOIN ONLY t2 ON t2.c0 RIGHT OUTER JOIN v0 ON ((t2.c1)=(0.08182310538090898))) as res;
ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3

regards, tender wang
Re: Improve join_search_one_level readibilty (one line change)
謝東霖 于2023年6月3日周六 23:21写道:

> Hello hackers
>
> Attached is my first patch for PostgreSQL, which is a simple one-liner
> that I believe can improve the code.
>
> In the "join_search_one_level" function, I noticed that the variable
> "other_rels_list" always refers to "joinrels[1]" even when the (level
> == 2) condition is met. I propose changing:
>
> "other_rels_list = joinrels[level - 1]" to "other_rels_list = joinrels[1]"
>
> This modification aims to enhance clarity and avoid unnecessary
> instructions.

I guess the compiler can already make that code efficient, but from the point of view of code readability, I agree with you. As Julien Rouhaud said, it would be better to move the other_rels_list initialization out of the if statement and put it with the variable declaration.

> I would greatly appreciate any review and feedback on this patch as I
> am a newcomer to PostgreSQL contributions. Your input will help me
> improve and contribute effectively to the project.
>
> I have followed the excellent guide "How to submit a patch by email,
> 2023 edition" by Peter Eisentraut.
>
> Additionally, if anyone has any tips on ensuring that Gmail recognizes
> my attached patches as the "text/x-patch" MIME type when sending them
> from the Chrome client, I would be grateful for the advice.
>
> Or maybe the best practice is to use Git send-email ?
>
> Thank you for your time.
>
> Best regards
> Alex Hsieh
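For reference, the code shape under discussion looks roughly like this (simplified from join_search_one_level(); not the exact source, and `r` stands for the ListCell of the current outer loop):

    List     *other_rels_list = joinrels[1];   /* was joinrels[level - 1] in one branch */
    ListCell *other_rels;

    if (level == 2)             /* consider remaining initial rels */
        other_rels = lnext(other_rels_list, r);
    else
        other_rels = list_head(other_rels_list);

Since joinrels[level - 1] is exactly joinrels[1] when level == 2, both branches read from the same list, and hoisting the initialization to the declaration makes that obvious.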
Small code simplification
Hi,

Oversight in 7ff9afbbd; I think we can do it the same way for ATExecAddColumn().

--
Tender Wang

v1-0001-Small-code-simplification.patch
Description: Binary data
Re: Small code simplification
Richard Guo 于2024年8月22日周四 10:53写道:

> On Wed, Aug 21, 2024 at 6:25 PM Tender Wang wrote:
> > Oversight in 7ff9afbbd; I think we can do it the same way for ATExecAddColumn().
>
> LGTM. Pushed.

Thanks for pushing.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月22日周四 06:00写道:

> On 2024-Aug-19, Alvaro Herrera wrote:
>
> > I haven't pushed it yet, mostly because of being unsure about not doing
> > anything for the oldest branches (14 and back).
>
> Last night, after much mulling on this, it occurred to me that one easy
> way out of this problem for the old branches, without having to write
> more code, is to simply remove the constraint from the partition when
> it's detached (but only if they reference a partitioned relation). It's
> not a great solution, but at least we're no longer leaving bogus catalog
> entries around. That would be like the attached patch, which was cut
> from 14 and applies cleanly to 12 and 13. I'd throw in a couple of
> tests and call it a day.

I applied the v14 patch on branch REL_14_STABLE, ran this thread's scenario, and hit the error below.

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE r_1 (
  id bigint PRIMARY KEY,
  p_id bigint NOT NULL,
  FOREIGN KEY (p_id) REFERENCES p (id)
);
CREATE TABLE r (
  id bigint PRIMARY KEY,
  p_id bigint NOT NULL,
  FOREIGN KEY (p_id) REFERENCES p (id)
) PARTITION BY list (id);
ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE r DETACH PARTITION r_1;
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
ERROR: cache lookup failed for constraint 16400

I haven't looked into the details to find out what causes the above error.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Tender Wang 于2024年8月22日周四 11:19写道:

> Alvaro Herrera 于2024年8月22日周四 06:00写道:
>
>> On 2024-Aug-19, Alvaro Herrera wrote:
>>
>> > I haven't pushed it yet, mostly because of being unsure about not doing
>> > anything for the oldest branches (14 and back).
>>
>> Last night, after much mulling on this, it occurred to me that one easy
>> way out of this problem for the old branches, without having to write
>> more code, is to simply remove the constraint from the partition when
>> it's detached (but only if they reference a partitioned relation). It's
>> not a great solution, but at least we're no longer leaving bogus catalog
>> entries around. That would be like the attached patch, which was cut
>> from 14 and applies cleanly to 12 and 13. I'd throw in a couple of
>> tests and call it a day.
>
> I applied the v14 patch on branch REL_14_STABLE, ran this thread's scenario, and hit the error below.
> postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
> CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
> CREATE TABLE r_1 (
>   id bigint PRIMARY KEY,
>   p_id bigint NOT NULL,
>   FOREIGN KEY (p_id) REFERENCES p (id)
> );
> CREATE TABLE r (
>   id bigint PRIMARY KEY,
>   p_id bigint NOT NULL,
>   FOREIGN KEY (p_id) REFERENCES p (id)
> ) PARTITION BY list (id);
> ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
> ALTER TABLE r DETACH PARTITION r_1;
> CREATE TABLE
> CREATE TABLE
> CREATE TABLE
> CREATE TABLE
> ALTER TABLE
> ERROR: cache lookup failed for constraint 16400

I guess it is because of the cascaded drop: the constraint with oid 16400 has already been deleted. We could add a list that remembers the dropped constraints; if we find that the parent of a constraint is in that list, we skip it.

By the way, I ran the above SQL sequence on REL_14_STABLE without your patch. I didn't see any error reported, and running oidjoins.sql didn't report warnings. Am I missing something?

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月23日周五 02:41写道:

> On 2024-Aug-22, Tender Wang wrote:
>
> > I applied the v14 patch on branch REL_14_STABLE, ran this thread's
> > scenario, and hit the error below.
> > [...]
> > ERROR: cache lookup failed for constraint 16400
> >
> > I haven't looked into the details to find out what causes the above error.
>
> Right, we try to drop the constraint twice. We can dodge this by
> collecting all constraints to drop in the loop and process them in a
> single performMultipleDeletions, as in the attached v14-2.

Can we move the CommandCounterIncrement() in the "if (get_rel_relkind(fk->confrelid) == RELKIND_PARTITIONED_TABLE)" block closer to performMultipleDeletions()? The rest looks good to me.

> TBH I think it's a bit infuriating that we lose the constraint (which
> was explicitly declared) because of ATTACH/DETACH.

Agreed. Do you think it would be friendly to users if we added a hint informing them that the constraint was dropped?

--
Tender Wang
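For readers following along, the "collect, then delete once" approach looks roughly like this (a sketch of the idea, not the v14-2 patch itself):

    /*
     * Collect every constraint to drop, then delete them in one call,
     * so a constraint reachable twice is not dropped twice.
     */
    ObjectAddresses *objs = new_object_addresses();
    ListCell   *cell;

    foreach(cell, fks)
    {
        ForeignKeyCacheInfo *fk = (ForeignKeyCacheInfo *) lfirst(cell);
        ObjectAddress obj;

        if (get_rel_relkind(fk->confrelid) == RELKIND_PARTITIONED_TABLE)
        {
            ObjectAddressSet(obj, ConstraintRelationId, fk->conoid);
            add_exact_object_address(&obj, objs);
        }
    }
    performMultipleDeletions(objs, DROP_RESTRICT, 0);
    CommandCounterIncrement();   /* make the drops visible to later steps */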
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

Recently, I did some benchmark tests, mainly on TPC-H and TPC-DS. The TPC-H tests show no plan diffs, so I did not continue testing there. The TPC-DS (10GB) tests show plan diffs for 22 queries, listed below:
4.sql, 5.sql, 8.sql, 11.sql, 19.sql, 23.sql, 31.sql, 33.sql, 39.sql, 45.sql, 46.sql, 47.sql, 53.sql, 56.sql, 57.sql, 60.sql, 63.sql, 68.sql, 74.sql, 77.sql, 80.sql, 89.sql

I haven't looked at all of them; I just picked a few with simple plans (e.g. 19.sql, 45.sql). For example, on 19.sql eager aggregate pushdown doesn't yield a large gain, but shows a small performance regression.

I will continue to benchmark this feature.

[1] https://github.com/tenderwg/eager_agg

--
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月29日周四 10:46写道:

> On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > > I haven't looked at all of them; I just picked a few with simple plans (e.g. 19.sql, 45.sql).
> > > For example, on 19.sql eager aggregate pushdown doesn't yield a large gain, but shows a
> > > small performance regression.
> >
> > Yeah, this is one of the things I was worried about in my previous
> > reply to Richard. It would be worth Richard, or someone, probing into
> > exactly why that's happening. My fear is that we just don't have good
> > enough estimates to make good decisions, but there might well be
> > another explanation.
>
> It's great that we have a query to probe into. Your guess is likely
> correct: it may be caused by poor estimates.
>
> Tender, would you please help provide the outputs of
>
> EXPLAIN (COSTS ON, ANALYZE)
>
> on 19.sql with and without eager aggregation?

Yeah, in [1], 19_off.out and 19_on.out are the outputs of EXPLAIN (COSTS OFF, ANALYZE). I will run EXPLAIN (COSTS ON, ANALYZE) tests and upload the results later today.

[1] https://github.com/tenderwg/eager_agg

--
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月29日周四 10:46写道:

> On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > > I haven't looked at all of them; I just picked a few with simple plans (e.g. 19.sql, 45.sql).
> > > For example, on 19.sql eager aggregate pushdown doesn't yield a large gain, but shows a
> > > small performance regression.
> >
> > Yeah, this is one of the things I was worried about in my previous
> > reply to Richard. It would be worth Richard, or someone, probing into
> > exactly why that's happening. My fear is that we just don't have good
> > enough estimates to make good decisions, but there might well be
> > another explanation.
>
> It's great that we have a query to probe into. Your guess is likely
> correct: it may be caused by poor estimates.
>
> Tender, would you please help provide the outputs of
>
> EXPLAIN (COSTS ON, ANALYZE)
>
> on 19.sql with and without eager aggregation?

I uploaded the EXPLAIN (COSTS ON, ANALYZE) output to [1]. I ran the same query three times and chose the result of the third run. You can check 19_off_explain.out and 19_on_explain.out.

[1] https://github.com/tenderwg/eager_agg

--
Tender Wang
Re: not null constraints, again
Alvaro Herrera 于2024年8月31日周六 11:59写道:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued. This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists. Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow. So here I've made it omit any constraints that
> underlie the primary key. This should be OK since you can't do much
> with those constraints while the PK is still there. If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases. It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.

Thanks for working on this again.

As for AT_PASS_OLD_COL_ATTRS, I didn't see any other code that uses it, so I don't think it's necessary.

I will take the time to look over the attached patch.

--
Tender Wang
Re: not null constraints, again
Alvaro Herrera 于2024年8月31日周六 11:59写道:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued. This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists. Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow. So here I've made it omit any constraints that
> underlie the primary key. This should be OK since you can't do much
> with those constraints while the PK is still there. If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases. It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.

The attached patch adds List *nnconstraints, which stores the not-null definitions, to struct CreateStmt. This makes me a little confused about List *constraints in struct CreateStmt. Actually, the constraints list stores CHECK constraints, and it would be better if the comments reflected that. Renaming it to List *ckconstraints seems more reasonable, but a lot of code that uses stmt->constraints would have to change.

Since AddRelationNewConstraints() can now add not-null column constraints, its comments should be tweaked a little: "All entries in newColDefaults will be processed. Entries in newConstraints will be processed only if they are CONSTR_CHECK type." Now the type of a new constraint may also be not-null.

If the column already has a not-null constraint and we add the same not-null constraint again, the code will call AdjustNotNullInheritance1() in AddRelationNewConstraints(). The comments before the call to AdjustNotNullInheritance1() in AddRelationNewConstraints() look confusing to me, because the constraint is not inherited.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
> CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
> CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
> CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
> CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
> CREATE TABLE fk_r ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
>   FOREIGN KEY (p_id) REFERENCES fk_p (id)
> ) PARTITION BY list (id);
> SET search_path TO fkpart12;
>
> INSERT INTO fk_p VALUES (1);
>
> ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);
>
> ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
> \d fk_r_1
>
> INSERT INTO fk_r VALUES (1,1);
>
> ALTER TABLE fk_r DETACH PARTITION fk_r_1;
> \d fk_r_1
>
> INSERT INTO c_1 VALUES (2,2); -- fails as EXPECTED
> DELETE FROM p; -- should fail but was buggy
>
> ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
> \d fk_r_1
>
> 3. Self referencing FK between partitions [3]
>
> You added to your commit message:
>
> verify: 20230707175859.17c91538@karst
>
> I'm not sure what the "verify" flag means. Unfortunately, your patch doesn't
> help on this topic.
>
> This bug really needs more discussion and design consideration. I have
> thought about this problem and haven't found any solution that doesn't
> involve breaking the current core behavior. It really looks like an
> impossible bug to fix without dropping the constraint itself, on both
> sides. Maybe the only sane behavior would be to forbid detaching the
> partition if it would break the constraint.
>
> But let's discuss this on the related thread, should we?
>
> Thank you for reading me all the way down to the bottom!
>
> Regards,
>
> [1] https://www.postgresql.org/message-id/20230705233028.2f554f73%40karst
> [2] https://www.postgresql.org/message-id/20230420144344.40744130%40karst
> [3] https://www.postgresql.org/message-id/20230707175859.17c91538%40karst

--
Tender Wang
Re: not null constraints, again
Alvaro Herrera 于2024年8月31日周六 11:59写道:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued. This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists. Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow. So here I've made it omit any constraints that
> underlie the primary key. This should be OK since you can't do much
> with those constraints while the PK is still there. If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases. It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.

The test case in constraints.sql, as below:

CREATE TABLE notnull_tbl1 (a INTEGER NOT NULL NOT NULL);
                                     ^^
There are two not-null definitions; is the second one redundant?

When we drop a column's not-null constraint, we enter ATExecDropNotNull(), which calls findNotNullConstraint() to get the constraint tuple. We already have attnum before the call to findNotNullConstraint(); can we directly call findNotNullConstraintAttnum()?

In RemoveConstraintById(), I see the comments below:
"For not-null and primary key constraints, obtain the list of columns affected, so that we can reset their attnotnull flags below."
However, there is no related code that reflects the above comment.

--
Thanks,
Tender Wang
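In code form, the suggestion amounts to roughly this (a sketch only; whether the surrounding code permits the shortcut is for the patch author to judge):

    /* We already know the column number in ATExecDropNotNull(), so look
     * the constraint up by attnum directly instead of resolving the
     * column name again: */
    conTup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum);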
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Jehan-Guillaume de Rorthais 于2024年9月3日周二 17:26写道:

> Hi Tender,
>
> On Tue, 3 Sep 2024 10:16:44 +0800 Tender Wang wrote:
>
> > Jehan-Guillaume de Rorthais 于2024年9月3日周二 05:02写道:
> […]
> > > * Constraint & trigger catalog cleanup [1] (this thread)
> > > * FK broken after DETACH [2]
> > > * Maintenance consideration about self referencing FK between partitions [3]
> >
> > The third issue has been fixed, and codes have been pushed. Because of my
> > misunderstanding, it should not be here.
>
> I just retried the SQL scenario Guillaume gave on both master and master with
> Alvaro's patch. See:
>
> https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com
>
> It doesn't seem fixed at all. Maybe you are mixing up with another
> thread/issue?

Sorry, I mixed up the third issue with the issue Alexander reported. Please ignore the above noise.

> > > 0. Splitting in two commits
> > >
> > > […]
> > >
> > > Unfortunately, this discussion about the first bug slipped to the second
> > > one when Tender stumbled on this bug as well and reported it. But, both
> > > bugs can be triggered independently, and have distinct fixes.
> >
> > It's ok that these two issues are fixed together. It is because current
> > codes don't handle better when the referenced side is the partition table.
>
> I don't feel the same. Mixing two discussions and fixes together in the same
> thread and commit makes life harder.

Hmm, these two issues are closely related. Anyway, I think it's OK to fix the two issues together.

> Last year, when you found the other bug, I tried to point you to the
> right thread to avoid mixing subjects:
> https://www.postgresql.org/message-id/20230810170345.26e41b05%40karst
>
> If I wrote about the third (non fixed) issue yesterday, it's just because
> Alvaro included a reference to it in his commit message. But I think we should
> really keep up with this issue on its own, dedicated discussion:
>
> https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com

Thanks for the reminder. I haven't taken the time to look into the third issue yet. Please give me some time to analyze it.

--
Thanks,
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

The v11-0002 patch fails to apply on HEAD (6c2b5edecc) with git am:

tender@iZ2ze6la2dizi7df9q3xheZ:/workspace/postgres$ git am v11-0002-Implement-Eager-Aggregation.patch
Applying: Implement Eager Aggregation
error: patch failed: src/test/regress/parallel_schedule:119
error: src/test/regress/parallel_schedule: patch does not apply
Patch failed at 0001 Implement Eager Aggregation
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

--
Thanks,
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

I reviewed the v11 patch set, and here are a few of my thoughts:

1. In setup_eager_aggregation(), before calling create_agg_clause_infos(), it does some checks to see whether eager aggregation is applicable. Can we move those checks into a function, for example can_eager_agg(), like can_partial_agg() does?

2. I found that outside of joinrel.c we always use IS_DUMMY_REL, but in joinrel.c Tom always uses is_dummy_rel(); other committers use IS_DUMMY_REL.

3. The attached patch does not consider FDWs when creating a path for grouped_rel or grouped_join. Do we need to think about FDWs?

I haven't finished reviewing the patch set. I will continue to study this feature.

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

When I reviewed Jehan-Guillaume's v2 patch, I found the code below, which needs a little tweak. In DetachPartitionFinalize():

/*
 * If the referenced side is partitioned (which we know because our
 * parent's constraint points to a different relation than ours) then
 * we must, in addition to the above, create pg_constraint rows that
 * point to each partition, each with its own action triggers.
 */
if (parentConForm->conrelid != conform->conrelid)

I found that the above IF is always true, regardless of whether the referenced side is partitioned. Although find_all_inheritors() returns an empty children list when the referenced side is not partitioned, we could avoid a lot of useless work. How about this instead:

if (get_rel_relkind(conform->confrelid) == RELKIND_PARTITIONED_TABLE)

--
Thanks,
Tender Wang
Re: not null constraints, again
jian he 于2024年9月9日周一 16:31写道:

> On Mon, Sep 2, 2024 at 6:33 PM Tender Wang wrote:
> >
> > The attached patch adds List *nnconstraints, which stores the not-null
> > definitions, to struct CreateStmt. This makes me a little confused about
> > List *constraints in struct CreateStmt. Actually, the constraints list
> > stores CHECK constraints, and it would be better if the comments
> > reflected that. Renaming it to List *ckconstraints seems more reasonable,
> > but a lot of code that uses stmt->constraints would have to change.
>
> hi.
> seems you forgot to attach the patch?
> I also noticed this minor issue.
> I have no preference for renaming it to List *ckconstraints.
> +1 for better comments. maybe reword to
>
> List *constraints; /* CHECK constraints (list of Constraint nodes) */

I just gave advice; whether it is accepted or not is up to Alvaro. If Alvaro agrees with the advice, he will post a new patch, and we can continue to review it. If Alvaro disagrees, he doesn't need to change the current patch. I think this way is more straightforward for others who will review this feature.

> On Tue, Sep 3, 2024 at 3:17 PM Tender Wang wrote:
> >
> > The test case in constraints.sql, as below:
> > CREATE TABLE notnull_tbl1 (a INTEGER NOT NULL NOT NULL);
> >                                      ^^
> > There are two not-null definitions; is the second one redundant?
>
> hi.
> i think this is ok. please see
> function transformColumnDefinition and variable saw_nullable.

Yeah, it is OK.

> we need to make sure this:
> CREATE TABLE notnull_tbl3 (a INTEGER NULL NOT NULL);
> fails.
>
> of course, it's also OK to do this:
> CREATE TABLE notnull_tbl3 (a INTEGER NULL NULL);

--
Thanks,
Tender Wang
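Summarizing the accepted and rejected combinations as runnable SQL (behavior as described above, enforced by transformColumnDefinition's saw_nullable check):

    CREATE TABLE notnull_tbl1 (a INTEGER NOT NULL NOT NULL);  -- accepted: redundant but consistent
    CREATE TABLE notnull_tbl2 (a INTEGER NULL NULL);          -- accepted: redundant but consistent
    CREATE TABLE notnull_tbl3 (a INTEGER NULL NOT NULL);      -- ERROR: conflicting NULL/NOT NULL declarations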
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

I continue to review the v11 patches. Here are some of my thoughts.

1. In make_one_rel(), we have the code below:

/*
 * Build grouped base relations for each base rel if possible.
 */
setup_base_grouped_rels(root);

As far as I know, each base rel has at most one grouped base relation, if any. The comment could be changed to "Build a grouped base relation for each base rel if possible."

2. According to the comments of generate_grouped_paths(), we may generate paths for a grouped relation on top of paths of a join relation. So the "rel_plain" argument in generate_grouped_paths() may be confusing; "plain" usually means "base rel". How about renaming rel_plain to input_rel?

3. In create_partial_grouping_paths(), the partially_grouped_rel could have already been created due to eager aggregation. If partially_grouped_rel exists, its reltarget has already been created, so do we still need the logic below?

/*
 * Build target list for partial aggregate paths. These paths cannot just
 * emit the same tlist as regular aggregate paths, because (1) we must
 * include Vars and Aggrefs needed in HAVING, which might not appear in
 * the result tlist, and (2) the Aggrefs must be set in partial mode.
 */
partially_grouped_rel->reltarget =
    make_partial_grouping_target(root, grouped_rel->reltarget,
                                 extra->havingQual);

--
Thanks,
Tender Wang
Re: Eager aggregation, take 3
Tender Wang 于2024年9月4日周三 11:48写道:

> Richard Guo 于2024年8月21日周三 15:11写道:
>
>> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
>> > I had a self-review of this patchset and made some refactoring,
>> > especially to the function that creates the RelAggInfo structure for a
>> > given relation. While there were no major changes, the code should
>> > now be simpler.
>>
>> I found a bug in v10 patchset: when we generate the GROUP BY clauses
>> for the partial aggregation that is pushed down to a non-aggregated
>> relation, we may produce a clause with a tleSortGroupRef that
>> duplicates one already present in the query's groupClause, which would
>> cause problems.
>>
>> Attached is the updated version of the patchset that fixes this bug
>> and includes further code refactoring.
>
> The v11-0002 patch fails to apply on HEAD (6c2b5edecc) with git am:
>
> tender@iZ2ze6la2dizi7df9q3xheZ:/workspace/postgres$ git am v11-0002-Implement-Eager-Aggregation.patch
> Applying: Implement Eager Aggregation
> error: patch failed: src/test/regress/parallel_schedule:119
> error: src/test/regress/parallel_schedule: patch does not apply
> Patch failed at 0001 Implement Eager Aggregation
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".

Since the MERGE/SPLIT partition feature has been reverted, the tests *partition_merge* and *partition_split* should be removed from parallel_schedule. After doing that, the 0002 patch applies.

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Hi Alvaro,

Recently, Alexander reported the same issue in [1], and before that the same issue was reported in [2]. So I tried to rework those issues.

In my last email on this thread, I said:

"I slightly modified the previous patch, but I didn't add a test case, because I found another issue. After doing ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); I ran oidjoins.sql and got warnings as below:
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING: FK VIOLATION IN pg_trigger({tgparentid}): ("(0,3)",16401)
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING: FK VIOLATION IN pg_trigger({tgparentid}): ("(0,4)",16402)"

And I gave the explanation:

"The two trigger tuples are deleted in tryAttachPartitionForeignKey, called by CloneFkReferencing.
/*
 * Looks good! Attach this constraint. The action triggers in the new
 * partition become redundant -- the parent table already has equivalent
 * ones, and those will be able to reach the partition. Remove the ones
 * in the partition. We identify them because they have our constraint
 * OID, as well as being on the referenced rel.
 */"

I tried to fix the above FK violation. I have two ideas:

i. Do not remove the redundant triggers; then, when detaching the partition, the action triggers on the referenced side would be created again. I have considered this situation.
ii. Still remove the redundant triggers, and remove the child action triggers too. If we go this way, should we create action triggers recursively on the referenced side when detaching a partition?

I can't decide which one is better, and I'm not sure whether keeping this FK VIOLATION would cause problems. I rebased and am sending a v3 patch, which only fixes the NOT FOUND INSERT CHECK TRIGGER issue.

[1] https://www.postgresql.org/message-id/18541-628a61bc267cd2d3%40postgresql.org
[2] https://www.postgresql.org/message-id/GVAP278MB02787E7134FD691861635A8BC9032%40GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM

--
Tender Wang

v3-0001-Fix-partition-detach-issue.patch
Description: Binary data

v3-0002-Add-test-case.patch
Description: Binary data
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年7月19日周五 21:18写道:

> Hello,
>
> I think the fix for the check triggers should be as the attached. Very
> close to what you did, but you were skipping some operations that needed
> to be kept. AFAICS this patch works correctly for the posted cases.

After applying the attached patch, r_1_p_id_fkey1 has redundant action triggers, as below:

postgres=# select oid, conname, contype, conrelid, conindid, conparentid, confrelid, conislocal, coninhcount, connoinherit from pg_constraint where oid = 16402;
  oid  |    conname     | contype | conrelid | conindid | conparentid | confrelid | conislocal | coninhcount | connoinherit
-------+----------------+---------+----------+----------+-------------+-----------+------------+-------------+--------------
 16402 | r_1_p_id_fkey1 | f       |    16394 |    16392 |           0 |     16389 | t          |           0 | f
(1 row)

postgres=# select oid, tgrelid, tgparentid, tgconstrrelid, tgconstrindid, tgconstraint from pg_trigger where tgconstraint = 16402;
  oid  | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
-------+---------+------------+---------------+---------------+--------------
 16403 |   16389 |      16400 |         16394 |         16392 |        16402
 16404 |   16389 |      16401 |         16394 |         16392 |        16402
 16422 |   16389 |          0 |         16394 |         16392 |        16402
 16423 |   16389 |          0 |         16394 |         16392 |        16402
(4 rows)

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年7月19日周五 21:18写道:

> I find this pair of queries useful; they show which constraints exist
> and which triggers belong to each. We need to make the constraints and
> triggers match after a detach right as things would be if the
> just-detached partition were an individual table having the same foreign
> key.

I can't find the useful queries in your last email. Could you provide them? Thanks.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Junwang Zhao 于2024年7月26日周五 14:57写道:

> There is a bug report[0] Tender comments might be the same
> issue as this one, but I tried Alvaro's and my patch; neither
> could solve that problem. I did not try Tender's earlier patch
> though. I post the test script below in case you are interested.

My earlier patch should handle the case Alexander reported, but I did not do more testing. I'm not sure whether or not there is a mismatch between pg_constraint and pg_trigger.

I agree with what Alvaro said, that it "requires a much more invasive solution".

--
Tender Wang
Re: Fixed small typo in bufpage.h
Senglee Choi 于2024年8月5日周一 09:24写道: > Fixed a minor typo in src/include/storage/bufpage.h. > Please let me know if any further action is required. > Good catch. +1 -- Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

Thanks for your work. I will take some time to look at it in detail.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

I did a lot of testing; no errors were reported, and running oidjoins.sql produced no warnings.

+1

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.
>
> As I understand, this fix needs to be applied all the way back to 12,
> because the overall functionality is that old. However, in branches 14
> and back, the patch doesn't apply cleanly, because of the changes we
> made in commit f4566345cf40 :-( I'm tempted to fix it in branches 15,
> 16, 17, master now and potentially backpatch later, to avoid dragging
> things along further. It's taken long enough already.

I haven't seen this patch on master. Is there something that we forgot to handle?

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月20日周二 10:25写道:

> On 2024-Aug-20, Tender Wang wrote:
>
> > > As I understand, this fix needs to be applied all the way back to 12,
> > > because the overall functionality is that old. However, in branches 14
> > > and back, the patch doesn't apply cleanly, because of the changes we
> > > made in commit f4566345cf40 :-( I'm tempted to fix it in branches 15,
> > > 16, 17, master now and potentially backpatch later, to avoid dragging
> > > things along further. It's taken long enough already.
> >
> > I haven't seen this patch on master. Is there something that we forgot
> > to handle?
>
> I haven't pushed it yet, mostly because of being unsure about not doing
> anything for the oldest branches (14 and back).

I only wrote code and tests on master. I'm not sure how complicated it would be to fix this issue on branches 14 and back.

--
Tender Wang
Re: Improve node type forward reference
Peter Eisentraut 于2024年10月15日周二 15:02写道:

> On 14.10.24 23:28, Nathan Bossart wrote:
> > On Mon, Oct 14, 2024 at 09:47:59AM +0200, Peter Eisentraut wrote:
> >> But we can do this better by using an incomplete struct, like
> >>
> >> struct Query *viewQuery ...;
> >>
> >> That way, everything has the correct type and fewer casts are required. This
> >> technique is already used elsewhere in node type definitions.

Agreed. The attached patches LGTM.

> > I noticed that the examples in parsenodes.h are for structs defined within
> > the same file. If the struct is defined in a separate file, I guess you
> > might need to include another header file wherever it is used, but that
> > doesn't seem too bad.
>
> No, you can leave the struct incomplete. You only need to provide its
> full definition (= include the other header file) if you actually want
> to access the struct's fields.

Yeah. There are many examples; for example, in struct RelOptInfo:

/* use "struct FdwRoutine" to avoid including fdwapi.h here */
struct FdwRoutine *fdwroutine pg_node_attr(read_write_ignore);

--
Thanks,
Tender Wang
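To spell out the technique with a self-contained sketch (the node type name here is illustrative, not the actual patch hunk):

    /* header A: only needs a pointer, so an incomplete type suffices */
    struct Query;                     /* forward declaration; no header include */

    typedef struct ExampleNode
    {
        struct Query *viewQuery;      /* correctly typed; no casts at use sites */
    } ExampleNode;

    /* A .c file that dereferences node->viewQuery must include the header
     * that fully defines struct Query; every other consumer stays decoupled. */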
Remove an unnecessary check on semijoin_target_ok() on postgres_fdw.c
Hi,

While debugging the FDW join pushdown code, I found the code below in semijoin_target_ok():

if (bms_is_member(var->varno, innerrel->relids) &&
    !bms_is_member(var->varno, outerrel->relids))

As far as I know, if a Var belongs to the innerrel of a joinrel, it cannot also belong to the outerrel. So if bms_is_member(var->varno, innerrel->relids) returns TRUE, then !bms_is_member(var->varno, outerrel->relids) must also be TRUE. And if bms_is_member(var->varno, innerrel->relids) returns FALSE, !bms_is_member(var->varno, outerrel->relids) will not be evaluated because of short-circuit evaluation. So I think we can remove "!bms_is_member(var->varno, outerrel->relids)" from the if condition.

Any thoughts?

--
Thanks,
Tender Wang

0001-Remove-an-unnecessary-check-as-Var-can-only-belong-t.patch
Description: Binary data
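The simplification, side by side (based on the description above; not the exact patch hunk):

    /* before */
    if (bms_is_member(var->varno, innerrel->relids) &&
        !bms_is_member(var->varno, outerrel->relids))

    /* after: the inner and outer relids of a join are disjoint,
     * so the second test can never change the outcome */
    if (bms_is_member(var->varno, innerrel->relids))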
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年10月18日周五 22:52写道:

> On 2024-Sep-26, Jehan-Guillaume de Rorthais wrote:
>
> > REL_14_STABLE backport doesn't seem trivial, so I'll wait for some
> > feedback, review & decision before going further down in backpatching.
>
> Hi, thanks for these patches. I have made some edits of my own. In the
> end I decided that I quite liked the new structure of the
> addFkConstraint() function and friends. I added a bunch of comments and
> changed names somewhat. Also, I think the benefit of the refactoring is
> more obvious when all patches are squashed together, so I did that.
>
> For branch 14, I opted to make it delete the constraints on detach.
> This isn't ideal but unless somebody wants to spend a lot more time on
> this, it seems the best we can do. Leaving broken constraints around
> seems worse. The patch for 14 doesn't apply to 13 sadly. I didn't have
> time to verify why, but it seems there's some rather larger conflict in
> one spot.
>
> I hope to be able to get these pushed over the weekend. That would give
> us a couple of weeks before the November releases (but my availability
> in those two weeks might be spotty.)
>
> I spent some time going through your test additions and ended up with
> your functions in this form:
>
> -- displays constraints in schema fkpart12
> CREATE OR REPLACE FUNCTION
> fkpart12_constraints(OUT conrel regclass, OUT conname name,
>                      OUT confrelid regclass, OUT conparent text)
> RETURNS SETOF record AS $$
> WITH RECURSIVE arrh AS (
>    SELECT oid, conrelid, conname, confrelid, NULL::name AS conparent
>      FROM pg_constraint
>     WHERE connamespace = 'fkpart12'::regnamespace AND
>           contype = 'f' AND conparentid = 0
>    UNION ALL
>    SELECT c.oid, c.conrelid, c.conname, c.confrelid,
>           (pg_identify_object('pg_constraint'::regclass, arrh.oid, 0)).identity
>      FROM pg_constraint c
>      JOIN arrh ON c.conparentid = arrh.oid
> ) SELECT conrelid::regclass, conname, confrelid::regclass, conparent
>     FROM arrh
>    ORDER BY conrelid::regclass::text, conname;
> $$
> LANGUAGE SQL;
>
> -- displays triggers in tables in schema fkpart12
> CREATE OR REPLACE FUNCTION
> fkpart12_triggers(OUT tablename regclass, OUT constr text,
>                   OUT samefunc boolean, OUT parent text)
> RETURNS SETOF record AS $$
> WITH RECURSIVE arrh AS (
>    SELECT t.oid, t.tgrelid::regclass as tablename, tgname,
>           t.tgfoid::regproc as trigfn,
>           (pg_identify_object('pg_constraint'::regclass, c.oid, 0)).identity as constr,
>           NULL::bool as samefunc,
>           NULL::name AS parent
>      FROM pg_trigger t
>      LEFT JOIN pg_constraint c ON c.oid = t.tgconstraint
>     WHERE (SELECT relnamespace FROM pg_class WHERE oid = t.tgrelid) =
>           'fkpart12'::regnamespace
>           AND c.contype = 'f' AND t.tgparentid = 0
>    UNION ALL
>    SELECT t2.oid, t2.tgrelid::regclass as tablename, t2.tgname,
>           t2.tgfoid::regproc as trigfn,
>           (pg_identify_object('pg_constraint'::regclass, c2.oid, 0)).identity,
>           arrh.trigfn = t2.tgfoid as samefunc,
>           replace((pg_identify_object('pg_trigger'::regclass, t2.tgparentid, 0)).identity,
>                   t2.tgparentid::text, 'TGOID')
>      FROM pg_trigger t2
>      LEFT JOIN pg_constraint c2 ON c2.oid = t2.tgconstraint
>      JOIN arrh ON t2.tgparentid = arrh.oid
> ) SELECT tablename, constr, samefunc, parent
>     FROM arrh
>    ORDER BY tablename::text, constr;
> $$
> LANGUAGE SQL;
>
> However, in the end I think this is a very good technique to verify that
> the fix works correctly, but it's excessive to include these results in
> the tests forevermore. So I've left them out for now. Maybe we should
> reconsider on the older branches?
Hi,

I looked at the patch on master; I gave a small comment in [1].

I have reconsidered the code, and I suspect that we no longer need the if statement below:

/*
 * If the referenced side is partitioned (which we know because our
 * parent's constraint points to a different relation than ours) then
 * we must, in addition to the above, create pg_constraint rows that
 * point to each partition, each with its own action triggers.
 */
if (parentConForm->conrelid != conform->conrelid)
{
    ...
}

The above condition is always true according to my tests; I haven't been able to come up with a counter-example. Any thoughts?

[1] https://www.postgresql.org/message-id/CAHewXNkuU2V7GfgFkwd265RJ99%2BBfJueNEZhrHSk39o3thqxNA%40mail.gmail.com

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年10月22日周二 05:52写道:

> On 2024-Oct-21, Tender Wang wrote:
>
> > I suspect that we no longer need the if statement below.
> > /*
> >  * If the referenced side is partitioned (which we know because our
> >  * parent's constraint points to a different relation than ours) then
> >  * we must, in addition to the above, create pg_constraint rows that
> >  * point to each partition, each with its own action triggers.
> >  */
> > if (parentConForm->conrelid != conform->conrelid)
> > {
> > ...
> > }
> >
> > The above condition is always true according to my tests; I haven't
> > been able to come up with a counter-example.
>
> You're right, this is useless, we can remove the 'if' line. I kept the
> block though, to have a place for all those local variable declarations
> (otherwise the code looks messier than it needs to).

Agreed.

> I also noticed that addFkRecurseReferenced() is uselessly taking a List
> **wqueue argument but doesn't use it, so I removed it (as fallout, other
> routines don't need it either, especially DetachPartitionFinalize). I
> added some commentary on the creation of dependencies in
> addFkConstraint().

Yeah, I also noticed this before.

> I also include a few other cosmetic changes; just comment layout
> changes.
>
> This time I attach the patch for master only; the others have changed
> identically. 14 is unchanged from before. I figured that the conflict
> from 14 to 13 was trivial to resolve -- it was just because of DETACH
> CONCURRENTLY, so some code moves around, but that's all that happens.

I don't see any other problems with the newest patch.

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alexander Lakhin 于2024年10月24日周四 22:00写道:

> Hello Alvaro,
>
> 22.10.2024 17:32, Alvaro Herrera wrote:
> > Yeah. I pushed these patches finally, thanks!
>
> Please look at a new anomaly introduced with 53af9491a. When running the
> following script:
> CREATE TABLE t (a int, b int, PRIMARY KEY (a, b));
> CREATE TABLE pt (a int, b int, FOREIGN KEY (a, b) REFERENCES t(a, b))
> PARTITION BY LIST (a);
>
> CREATE TABLE tp1 (x int, a int, b int);
> ALTER TABLE tp1 DROP COLUMN x;
>
> ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);
>
> ALTER TABLE pt DETACH PARTITION tp1;
>
> I get a memory access error detected by Valgrind:
> 2024-10-24 12:05:04.645 UTC [1079077] LOG: statement: ALTER TABLE pt DETACH PARTITION tp1;
> ==00:00:00:07.887 1079077== Invalid read of size 2
> ==00:00:00:07.887 1079077==    at 0x4A61DD: DetachPartitionFinalize (tablecmds.c:19545)
> ==00:00:00:07.887 1079077==    by 0x4A5C11: ATExecDetachPartition (tablecmds.c:19386)
> ==00:00:00:07.887 1079077==    by 0x48561E: ATExecCmd (tablecmds.c:5540)
> ==00:00:00:07.887 1079077==    by 0x4845DE: ATRewriteCatalogs (tablecmds.c:5203)
> ==00:00:00:07.887 1079077==    by 0x4838EC: ATController (tablecmds.c:4758)
> ==00:00:00:07.887 1079077==    by 0x4834F1: AlterTable (tablecmds.c:4404)
> ==00:00:00:07.887 1079077==    by 0x7D6D52: ProcessUtilitySlow (utility.c:1318)
> ==00:00:00:07.887 1079077==    by 0x7D65F7: standard_ProcessUtility (utility.c:1067)
> ==00:00:00:07.887 1079077==    by 0x7D54F7: ProcessUtility (utility.c:523)
> ==00:00:00:07.887 1079077==    by 0x7D3D70: PortalRunUtility (pquery.c:1158)
> ==00:00:00:07.887 1079077==    by 0x7D3FE7: PortalRunMulti (pquery.c:1316)
> ==00:00:00:07.887 1079077==    by 0x7D3431: PortalRun (pquery.c:791)
>
> Reproduced on REL_15_STABLE .. master.

Sorry, I can't reproduce this memory access error on master.

--
Thanks,
Tender Wang