wrong results due to qual pushdown
tender wang wrote (to pgsql-hackers):

Hi hackers. This query returns different results on 16devel and 15.2.

select
  sample_3.n_regionkey as c0,
  ref_7.l_linenumber as c3,
  sample_4.l_quantity as c6,
  sample_5.n_nationkey as c7,
  sample_3.n_name as c8
from
  public.nation as sample_3
  left join public.lineitem as ref_5
    on ((cast(null as text) ~>=~ cast(null as text))
        or (ref_5.l_discount is NULL))
  left join public.time_statistics as ref_6
    inner join public.lineitem as ref_7
      on (ref_7.l_returnflag = ref_7.l_linestatus)
    right join public.lineitem as sample_4
      left join public.nation as sample_5
        on (cast(null as tsquery) = cast(null as tsquery))
      on (cast(null as "time") <= cast(null as "time"))
    right join public.customer as ref_8
      on (sample_4.l_comment = ref_8.c_name)
    on (ref_5.l_quantity = ref_7.l_quantity)
where (ref_7.l_suppkey is not NULL)
  or ((case when cast(null as lseg) >= cast(null as lseg) then cast(null as inet) else cast(null as inet) end
       && cast(null as inet))
  or (pg_catalog.getdatabaseencoding() !~~ case when (cast(null as int2) <= cast(null as int8))
      or (EXISTS (
        select
          ref_9.ps_comment as c0,
          5 as c1,
          ref_8.c_address as c2,
          58 as c3,
          ref_8.c_acctbal as c4,
          ref_7.l_orderkey as c5,
          ref_7.l_shipmode as c6,
          ref_5.l_commitdate as c7,
          ref_8.c_custkey as c8,
          sample_3.n_nationkey as c9
        from public.partsupp as ref_9
        where cast(null as tsquery) @> cast(null as tsquery)
        order by c0, c1, c2, c3, c4, c5, c6, c7, c8, c9 limit 38))
      then cast(null as text) else cast(null as text) end))
order by c0, c3, c6, c7, c8 limit 137;

plan on 16devel:

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Limit
   InitPlan 1 (returns $0)
     ->  Result
           One-Time Filter: false
   ->  Sort
         Sort Key: sample_3.n_regionkey, l_linenumber, l_quantity, n_nationkey, sample_3.n_name
         ->  Nested Loop Left Join
               ->  Seq Scan on nation sample_3
               ->  Materialize
                     ->  Nested Loop Left Join
                           Join Filter: (ref_5.l_quantity = l_quantity)
                           Filter: ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))
                           ->  Seq Scan on lineitem ref_5
                                 Filter: (l_discount IS NULL)
                           ->  Result
                                 One-Time Filter: false
(16 rows)

plan on 15.2:

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Limit
   InitPlan 1 (returns $0)
     ->  Result
           One-Time Filter: false
   ->  Sort
         Sort Key: sample_3.n_regionkey, l_linenumber, l_quantity, n_nationkey, sample_3.n_name
         ->  Nested Loop Left Join
               Filter: ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))
               ->  Seq Scan on nation sample_3
               ->  Materialize
                     ->  Nested Loop Left Join
                           Join Filter: (ref_5.l_quantity = l_quantity)
                           ->  Seq Scan on lineitem ref_5
                                 Filter: (l_discount IS NULL)
                           ->  Result
                                 One-Time Filter: false
(16 rows)

It looks wrong that the qual ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END)) is pushed down.

regards, tender wang
Re: wrong results due to qual pushdown
Results on 16devel:

 c0 | c3 | c6 | c7 |       c8
----+----+----+----+----------------
  0 |    |    |    | ALGERIA
  0 |    |    |    | ETHIOPIA
  0 |    |    |    | KENYA
  0 |    |    |    | MOROCCO
  0 |    |    |    | MOZAMBIQUE
  1 |    |    |    | ARGENTINA
  1 |    |    |    | BRAZIL
  1 |    |    |    | CANADA
  1 |    |    |    | PERU
  1 |    |    |    | UNITED STATES
  2 |    |    |    | CHINA
  2 |    |    |    | INDIA
  2 |    |    |    | INDONESIA
  2 |    |    |    | JAPAN
  2 |    |    |    | VIETNAM
  3 |    |    |    | FRANCE
  3 |    |    |    | GERMANY
  3 |    |    |    | ROMANIA
  3 |    |    |    | RUSSIA
  3 |    |    |    | UNITED KINGDOM
  4 |    |    |    | EGYPT
  4 |    |    |    | IRAN
  4 |    |    |    | IRAQ
  4 |    |    |    | JORDAN
  4 |    |    |    | SAUDI ARABIA
(25 rows)

Results on 15.2:

 c0 | c3 | c6 | c7 | c8
----+----+----+----+----
(0 rows)

Ashutosh Bapat wrote on Mon, Mar 6, 2023 at 22:14:
> On Mon, Mar 6, 2023 at 3:00 PM tender wang wrote:
> > [...]
Can't find not null constraint, but \d+ shows that
Hi Alvaro,

I met an issue related to the catalog not-null commit on HEAD.

postgres=# CREATE TABLE t1(c0 int, c1 int);
CREATE TABLE
postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
ALTER TABLE
postgres=# \d+ t1
                                           Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c0     | integer |           | not null |         | plain   |             |              |
 c1     | integer |           | not null |         | plain   |             |              |
Indexes:
    "q" PRIMARY KEY, btree (c0, c1)
Access method: heap

postgres=# ALTER TABLE t1 DROP c1;
ALTER TABLE
postgres=# \d+ t1
                                           Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c0     | integer |           | not null |         | plain   |             |              |
Access method: heap

postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL;
ERROR:  could not find not-null constraint on column "c0", relation "t1"
postgres=# insert into t1 values (NULL);
ERROR:  null value in column "c0" of relation "t1" violates not-null constraint
DETAIL:  Failing row contains (null).

I couldn't reproduce this issue on older versions (12.12 ... 16.1); more precisely, it appears since commit b0e96f3119.

Without b0e96f3119, when we drop a not-null constraint, we just set pg_attribute.attnotnull to false in ATExecDropNotNull(). But now we first check whether pg_constraint has the corresponding tuple. If attnotnull is true but pg_constraint has no such tuple, the error above is reported. This is confusing for users: \d shows that column c0 is not null, and we cannot insert a NULL value, yet dropping the NOT NULL constraint reports an error.

The attached patch is my workaround. I look forward to your review.

--
Tender Wang
OpenPie: https://en.openpie.com/

0001-Fix-attnotnull-not-correct-reset-issue.patch
Description: Binary data
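For reference, the inconsistency described above is visible directly in the catalogs. This is a hypothetical illustration added for clarity, not from the original report: after the DROP COLUMN, queries like the following should show attnotnull still set while no constraint row remains in pg_constraint.

-- attnotnull is still true for c0 ...
SELECT attname, attnotnull
FROM pg_attribute
WHERE attrelid = 't1'::regclass AND attnum > 0 AND NOT attisdropped;

-- ... but neither the PK row ('p') nor any not-null row ('n') remains
SELECT conname, contype
FROM pg_constraint
WHERE conrelid = 't1'::regclass;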
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> On 2024-Mar-26, Tender Wang wrote:
>
> > postgres=# CREATE TABLE t1(c0 int, c1 int);
> > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> > postgres=# ALTER TABLE t1 DROP c1;
> >
> > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL;
> > ERROR: could not find not-null constraint on column "c0", relation "t1"
>
> Ooh, hah, what happens here is that we drop the PK constraint
> indirectly, so we only go via doDeletion rather than the tablecmds.c
> code, so we don't check the attnotnull flags that the PK was protecting.

Yeah, it is indeed as you said.

> > The attached patch is my workaround. I look forward to your review.
>
> Yeah, this is not a very good approach -- I think you're just guessing
> that the column is marked NOT NULL because a PK was dropped in the
> past -- but really what this catalog state is, is corrupted contents
> because the PK drop was mishandled. At least in theory there are other
> ways to drop a constraint other than dropping one of its columns (for
> example, maybe this could happen if you drop a collation that the PK
> depends on). The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

Agreed, that looks better. But it won't work to simply move some code from dropconstraint_internal to RemoveConstraintById. I tried that fix before the 0001 patch, and it failed. For example:

create table skip_wal_skip_rewrite_index (c varchar(10) primary key);
alter table skip_wal_skip_rewrite_index alter c type varchar(20);
ERROR:  primary key column "c" is not marked NOT NULL

index_check_primary_key() in index.c has these comments:
"We check for a pre-existing primary key, and that all columns of the index are simple column references (not expressions), and that all those columns are marked NOT NULL. If not, fail."

So in the example above, RemoveConstraintById() can't simply reset attnotnull. We could pass some information to RemoveConstraintById(), such as a bool variable indicating whether attnotnull should be reset.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> [...]
> The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

Thinking about it again, the following solution may be better:

i. Move some code from dropconstraint_internal to RemoveConstraintById without changing the RemoveConstraintById interface, ensuring that the PK drop always does the dance that ATExecDropConstraint does.

ii. After phase i, the attnotnull of some primary key columns may be reset to false, as in the example I gave in my last email. We can set attnotnull back to true in the appropriate place.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> [...]
> The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

I found that some kinds of DDL check that a column's attnotnull is true, for example AT_ReAddIndex and AT_ReplicaIdentity. So we should add an AT_SetAttNotNull sub-command to the wqueue. I added a new AT_PASS_OLD_COL_ATTRS pass to make sure AT_SetAttNotNull has been executed before AT_ReAddIndex or AT_ReplicaIdentity runs.

--
Tender Wang
OpenPie: https://en.openpie.com/

v2-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch
Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Thu, Mar 28, 2024 at 13:18:
> On Wed, Mar 27, 2024 at 10:26 PM Tender Wang wrote:
> >
> > Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> > > [...]
> > > The right approach is to ensure that the PK drop always
> > > does the dance that ATExecDropConstraint does. A good fix probably just
> > > moves some code from dropconstraint_internal to RemoveConstraintById.
>
> after applying
> v2-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch
> something is off, now i cannot drop a table.
> demo:
> CREATE TABLE t2(c0 int, c1 int);
> ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(c0, c1);
> ALTER TABLE t2 ALTER COLUMN c0 ADD GENERATED ALWAYS AS IDENTITY;
> DROP TABLE t2 cascade;
> Similarly, maybe there will be some issue with replica identity.

Thanks for reviewing this patch. Yeah, I can reproduce it. The error is reported in RemoveConstraintById(), where I moved some code from dropconstraint_internal(). But some of those checks seem unnecessary in RemoveConstraintById(). For example:

/*
 * It's not valid to drop the not-null constraint for a GENERATED
 * AS IDENTITY column.
 */
if (attForm->attidentity)
	ereport(ERROR,
			errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
			errmsg("column \"%s\" of relation \"%s\" is an identity column",
				   get_attname(RelationGetRelid(rel), attnum, false),
				   RelationGetRelationName(rel)));

/*
 * It's not valid to drop the not-null constraint for a column in
 * the replica identity index, either. (FULL is not affected.)
 */
if (bms_is_member(lfirst_int(lc) - FirstLowInvalidHeapAttributeNumber, ircols))
	ereport(ERROR,
			errcode(ERRCODE_INVALID_TABLE_DEFINITION),
			errmsg("column \"%s\" is in index used as replica identity",
				   get_attname(RelationGetRelid(rel), lfirst_int(lc), false)));

Can the above two checks be removed from RemoveConstraintById()? I need to test more.

> + /*
> + * If this was a NOT NULL or the primary key, the constrained columns must
> + * have had pg_attribute.attnotnull set. See if we need to reset it, and
> + * do so.
> + */
> + if (unconstrained_cols)
> it should be if (unconstrained_cols != NIL)?, given unconstrained_cols is a
> List; also "unconstrained_cols" naming seems not intuitive.
> maybe pk_attnums or pk_cols or pk_columns.

As I said above, the code was copied from dropconstraint_internal(). NOT NULL columns are not always PK columns, so I think "unconstrained_cols" is OK.

> + attrel = table_open(AttributeRelationId, RowExclusiveLock);
> + rel = table_open(con->conrelid, RowExclusiveLock);
> I am not sure why we are using RowExclusiveLock for con->conrelid, given we
> use AccessExclusiveLock at:
> /*
>  * If the constraint is for a relation, open and exclusive-lock the
>  * relation it's for.
>  */
> rel = table_open(con->conrelid, AccessExclusiveLock);

Yeah, you are right.

> + /*
> + * Since the above deletion has been made visible, we can now
> + * search for any remaining constraints on this column (or these
> + * columns, in the case we're dropping a multicol primary key.)
> + * Then, verify whether any further NOT NULL or primary key
> + * exists, and reset attnotnull if none.
> + *
> + * However, if this is a generated identity column, abort the
> + * whole thing with a specific error message, because the
> + * constraint is required in that case.
> + */
> + contup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum);
> + if (contup ||
> +     bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber,
> +                   pkcols))
> +     continue;
>
> I didn't get this part.
> if you delete a primary key,
> the "NOT NULL" constraint within pg_constraint should definitely be removed?
> therefore contup should be pretty surely NULL?

No. If the original definition of the column includes NOT NULL, we can't reset attnotnull to false when we drop the PK.

> /*
> - * The parser will create AT_AttSetNotNull subcommands for
> - * columns of PRIMARY KEY indexes/constraints, but we need
> - * not do anything with them here, because the columns'
> - * NOT NULL marks will already have been propagated into
> - * the new table definition.
> + * PK drop now will reset pg_attribute attnotnull to false.
> + * We should set attnotnull to true again.
> */
> Dropping a PK now resets pg_attribute.attnotnull to false,
> which is what we should be expecting.
> the comment didn't explain why we should set attnotnull to true again?

The v2 patch still needs more test cases and is probably not the right solution. Anyway, I will send a v3 version after I do more testing.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25:
> [...]
> The right approach is to ensure that the PK drop always
> does the dance that ATExecDropConstraint does. A good fix probably just
> moves some code from dropconstraint_internal to RemoveConstraintById.

Should RemoveConstraintById() handle recursion (e.g., for partitioned tables)? I'm not sure now. If RemoveConstraintById() has to handle recursion, the function will become more complicated than before.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Thu, Mar 28, 2024 at 17:18:
> On 2024-Mar-28, Tender Wang wrote:
>
> > Should RemoveConstraintById() handle recursion (e.g., for partitioned
> > tables)? I'm not sure now. If RemoveConstraintById() has to handle
> > recursion, the function will become more complicated than before.
>
> No -- this function handles just a single constraint, as identified by
> OID. The recursion is handled by upper layers, which can be either
> dependency.c or tablecmds.c. I think the problem you found is caused by
> the fact that I worked with the tablecmds.c recursion and neglected the
> one in dependency.c.

Indeed.

create table skip_wal_skip_rewrite_index (c varchar(10) primary key);
alter table skip_wal_skip_rewrite_index alter c type varchar(20);

The SQL above needs attnotnull to be true when the index is re-added, but as far as I can tell RemoveConstraintById() has no way to recognize this scenario. We should set attnotnull back to true before re-adding the index. I added a new AT_PASS in the attached patch. Any thoughts?

--
Tender Wang
OpenPie: https://en.openpie.com/

v4-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch
Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Fri, Mar 29, 2024 at 14:56:
> hi.
> about v4: I think I understand the changes you made.
> RemoveConstraintById(Oid conId) will drop a single constraint record.
> if the constraint is a primary key, then the attnotnull of the columns
> associated with the primary key should be set to false.
> but sometimes it shouldn't.

Yeah, indeed.

> for example:
> drop table if exists t2;
> CREATE TABLE t2(c0 int, c1 int);
> ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(c0, c1);
> ALTER TABLE t2 ALTER COLUMN c0 ADD GENERATED ALWAYS AS IDENTITY;
> ALTER TABLE t2 DROP c1;
>
> + * If this was a NOT NULL or the primary key, the constrained columns must
> + * have had pg_attribute.attnotnull set. See if we need to reset it, and
> + * do so.
> + */
> + if (unconstrained_cols != NIL)
>
> unconstrained_cols is not NIL, which means we have dropped a primary key.
> I am confused by "If this was a NOT NULL or the primary key".

"NOT NULL" means the column definition has a not-null constraint, for example:

create table t1(a int not null);

Here pg_attribute.attnotnull is set to true. The primary key case looks like this:

create table t2(a int primary key);

Again pg_attribute.attnotnull is set to true. I think these cases explain the meaning of the comment in dropconstraint_internal (see the catalog illustration after this message). But here, in RemoveConstraintById(), we only care about the primary key case, so it is better to remove NOT NULL from the comment.

> + /*
> + * Since the above deletion has been made visible, we can now
> + * search for any remaining constraints on this column (or these
> + * columns, in the case we're dropping a multicol primary key.)
> + * Then, verify whether any further NOT NULL exists, and reset
> + * attnotnull if none.
> + */
> + contup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum);
> + if (HeapTupleIsValid(contup))
> + {
> +     heap_freetuple(contup);
> +     heap_freetuple(atttup);
> +     continue;
> + }
>
> I am a little bit confused by the above comment.
> I think the comment should say:
> if contup is valid, that means we already have a "not null" constraint
> associated with the attnum; in that case we must not reset attnotnull,
> otherwise the consistency between attnotnull and the "not null" table
> constraint will be broken.
>
> other than that, the change in RemoveConstraintById looks sane.

The comment wants to say that after the PK constraint is dropped, if there are still matching tuples in pg_constraint, the column definition itself has a not-null constraint, so we can't set pg_attribute.attnotnull to false. For example:

create table t1(a int not null);
alter table t1 add constraint t1_pk primary key(a);
alter table t1 drop constraint t1_pk;

--
Tender Wang
OpenPie: https://en.openpie.com/
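To make the distinction concrete, here is a hypothetical catalog query (added for illustration, not part of the original message). With catalogued not-null constraints, t1's column has a contype 'n' row while t2's column is protected only by the contype 'p' row, even though both show attnotnull = true:

-- after: create table t1(a int not null);
--        create table t2(a int primary key);
SELECT conrelid::regclass AS rel, conname, contype
FROM pg_constraint
WHERE conrelid IN ('t1'::regclass, 't2'::regclass);

-- both columns show attnotnull = true, whatever the constraint's origin
SELECT attrelid::regclass AS rel, attname, attnotnull
FROM pg_attribute
WHERE attrelid IN ('t1'::regclass, 't2'::regclass) AND attnum > 0;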
Re: Can't find not null constraint, but \d+ shows that
It has been several days since the last email. Do you have any suggestions, please?

What I'm mainly concerned about is whether adding a new AT_PASS is a good fix, or too big a change. I'm also concerned about whether it covers all ALTER TABLE cases. Any thoughts?

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Hi all,

I went through the MERGE/SPLIT partition code today; thanks for the work. I found some grammar errors:
i. in error messages (users can see these grammar errors, which is not friendly);
ii. in code comments.

Alexander Korotkov wrote on Sun, Apr 7, 2024 at 06:23:
> Hi, Dmitry!
>
> On Fri, Apr 5, 2024 at 4:00 PM Dmitry Koval wrote:
> > > I've revised the patchset.
> >
> > Thanks for the corrections (especially ddl.sgml).
> > Could you also look at a small optimization for the MERGE PARTITIONS
> > command (in a separate file
> > v31-0003-Additional-patch-for-ALTER-TABLE-.-MERGE-PARTITI.patch, I wrote
> > about it in an email 2024-03-31 00:56:50)?
> >
> > Files v31-0001-*.patch, v31-0002-*.patch are the same as
> > v30-0001-*.patch, v30-0002-*.patch (after rebasing because patch stopped
> > applying due to changes in upstream).
>
> I've pushed 0001 and 0002. I didn't push 0003 for the following reasons.
> 1) This doesn't keep functionality equivalent to 0001. With 0003, the
> merged partition will inherit indexes, constraints, and so on from one
> of the merging partitions.
> 2) This is not necessarily an optimization. Without 0003 indexes on
> the merged partition are created after moving the rows in
> attachPartitionTable(). With 0003 we merge data into the existing
> partition which saves its indexes. That might cause a significant
> performance loss because mass inserts into indexes may be much slower
> than building indexes from scratch.
> I think both aspects need to be carefully considered. Even if we
> accept them, this needs to be documented. I think now it's too late
> for both of these. So, this should wait for v18.
>
> --
> Regards,
> Alexander Korotkov

--
Tender Wang
OpenPie: https://en.openpie.com/

0001-Fix-some-grammer-errors-from-error-messages-and-code.patch
Description: Binary data
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Andrey M. Borodin wrote on Mon, Apr 8, 2024 at 17:40:
>
> > On 27 Sep 2023, at 16:06, tender wang wrote:
> >
> > Do you have any comments or suggestions on this issue? Thanks.
> Hi Tender,
>
> there are some review comments in the thread that you might be interested in.
> I'll mark this [0] entry "Waiting on Author" and move it to the next CF.

Thank you for the reminder. I will update the patch later. I would also very much like more advice on this patch, even advice that it is not worth continuing to work on it. Thanks!

> Best regards, Andrey Borodin.
>
> [0] https://commitfest.postgresql.org/47/4549/

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Wed, Apr 10, 2024 at 14:10:
> On Wed, Apr 10, 2024 at 1:29 AM Alvaro Herrera wrote:
> >
> > On 2024-Mar-29, Tender Wang wrote:
> >
> > > I think the case above explains the meaning of the comment in
> > > dropconstraint_internal.
> > > But here, in RemoveConstraintById(), we only care about the primary key
> > > case, so it is better to remove NOT NULL from the comment.
> >
> > Actually, I think it's better if all the resets of attnotnull occur in
> > RemoveConstraintById, for both types of constraints; we would keep that
> > block in dropconstraint_internal only to raise errors in the cases where
> > the constraint is protecting a replica identity or a generated column.
> > Something like the attached, perhaps, may need more polish.
>
> DROP TABLE if exists notnull_tbl2;
> CREATE TABLE notnull_tbl2 (c0 int generated by default as IDENTITY, c1 int);
> ALTER TABLE notnull_tbl2 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null;
> ALTER TABLE notnull_tbl2 DROP c1;
> \d notnull_tbl2
>
> the last "\d notnull_tbl2" command, master output:
>             Table "public.notnull_tbl2"
>  Column |  Type   | Collation | Nullable |             Default
> --------+---------+-----------+----------+----------------------------------
>  c0     | integer |           | not null | generated by default as identity
>
> the last "\d notnull_tbl2" command, after applying
> 0001-Correctly-reset-attnotnull-when-constraints-dropped-.patch:
>             Table "public.notnull_tbl2"
>  Column |  Type   | Collation | Nullable |             Default
> --------+---------+-----------+----------+----------------------------------
>  c0     | integer |           |          | generated by default as identity

Hmm, ALTER TABLE notnull_tbl2 DROP c1; does not call dropconstraint_internal(). When the PK constraint is dropped indirectly, c0's attnotnull is reset to false in RemoveConstraintById().

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Wed, Apr 10, 2024 at 17:34:
>
> another related bug, in master.
>
> drop table if exists notnull_tbl1;
> CREATE TABLE notnull_tbl1 (c0 int not null, c1 int);
> ALTER TABLE notnull_tbl1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> \d+ notnull_tbl1
> ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;
> ALTER TABLE notnull_tbl1 ALTER c1 DROP NOT NULL;
>
> "ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;"
> should fail?

Yeah, it should fail as before, because c0 is part of the primary key. On master, although c0's pg_attribute.attnotnull is still true, its not-null constraint has been deleted in dropconstraint_internal().

If we drop column c1 after dropping c0's NOT NULL, the primary key is dropped indirectly. \d+ notnull_tbl1 still shows c0 as not null, but altering c0 to drop NOT NULL then reports a "could not find not-null constraint" error:

postgres=# ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;
ALTER TABLE
postgres=# \d+ notnull_tbl1
                                      Table "public.notnull_tbl1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c0     | integer |           | not null |         | plain   |             |              |
 c1     | integer |           | not null |         | plain   |             |              |
Indexes:
    "q" PRIMARY KEY, btree (c0, c1)
Access method: heap

postgres=# alter table notnull_tbl1 drop c1;
ALTER TABLE
postgres=# \d notnull_tbl1
            Table "public.notnull_tbl1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c0     | integer |           | not null |

postgres=# alter table notnull_tbl1 alter c0 drop not null;
ERROR:  could not find not-null constraint on column "c0", relation "notnull_tbl1"

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Revise some error messages in split partition code
Richard Guo wrote on Wed, Apr 10, 2024 at 19:32:
> I noticed some error messages in the split partition code that are not
> up to par. Such as:
>
> "new partitions not have value %s but split partition has"
>
> how about we revise it to:
>
> "new partitions do not have value %s but split partition does"
>
> Another one is:
>
> "any partition in the list should be DEFAULT because split partition is
> DEFAULT"
>
> how about we revise it to:
>
> "all partitions in the list should be DEFAULT because split partition is
> DEFAULT"
>
> Another problem I noticed is that in the test files partition_split.sql
> and partition_merge.sql, there are comments specifying the expected
> error messages for certain test queries. However, in some cases, the
> error message mentioned in the comment does not match the error message
> actually generated by the query. Such as:
>
> -- ERROR: invalid partitions order, partition "sales_mar2022" can not be
> -- merged (space between sections sales_jan2022 and sales_mar2022)
> ALTER TABLE sales_range MERGE PARTITIONS (sales_jan2022, sales_mar2022)
> INTO sales_jan_mar2022;
> ERROR: lower bound of partition "sales_mar2022" conflicts with upper
> bound of previous partition "sales_jan2022"
>
> I'm not sure if it's a good practice to specify the expected error
> message in the comment. But if we choose to do so, I think we at least
> need to ensure that the specified error message in the comment remains
> consistent with the error message produced by the query.
>
> Also there are some comments containing grammatical issues. Such as:
>
> -- no error: bounds of sales_noerror equals to lower and upper bounds of
> sales_dec2022 and sales_feb2022
>
> Attached is a patch to fix the issues I've observed. I suspect there
> may be more to be found.

Yeah. The day before yesterday I found some grammar errors in error messages and code comments [1]. Besides those issues, Alexander Lakhin has found some bugs [2]. I have some concerns about whether this patch is ready to be committed.

[1] https://www.postgresql.org/message-id/CAHewXNkGMPU50QG7V6Q60JGFORfo8LfYO1_GCkCa0VWbmB-fEw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/dbc8b96c-3cf0-d1ee-860d-0e491da20485%40gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/
"type with xxxx does not exist" when doing ExecMemoize()
Hi,

I hit a Memoize node failure while testing Postgres with SQLancer.

database0=# explain select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0);
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Nested Loop  (cost=0.17..21.20 rows=4 width=32)
   ->  Seq Scan on t5  (cost=0.00..1.04 rows=4 width=14)
   ->  Memoize  (cost=0.17..6.18 rows=1 width=32)
         Cache Key: (t5.c0 - t5.c0)
         Cache Mode: logical
         ->  Index Only Scan using t0_c0_key on t0  (cost=0.15..6.17 rows=1 width=32)
               Index Cond: (c0 = (t5.c0 - t5.c0))
(7 rows)

database0=# select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0);
ERROR: type with OID 2139062143 does not exist

How to reproduce: the attached database0.log (created by SQLancer) contains the statements needed to reproduce this issue. First, create database test; then:

psql postgres
\i /xxx/database0.log

I analyzed this issue over the weekend and found that after ResetExprContext() is called in MemoizeHash_hash(), the data in mstate->probeslot is corrupted.

In prepare_probe_slot, the data looks like this:
(gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0]))
$1 = {vl_len_ = 36, rangetypid = 3904}

After ResetExprContext() is called in MemoizeHash_hash:
(gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0]))
$3 = {vl_len_ = 264, rangetypid = 2139062143}

I think prepare_probe_slot() should call datumCopy, as the attached patch does (a rough sketch of the idea follows below). Any thoughts? Thanks.

--
Tender Wang
OpenPie: https://en.openpie.com/

database0.log
Description: Binary data
0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch
Description: Binary data
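For clarity, here is a rough sketch of the shape of that workaround, not the attached patch itself; the memory-context choice (mstate->tableContext) and the way the key type is looked up are assumptions:

/*
 * Sketch: when filling the probe slot, copy by-reference datums out of
 * the per-tuple context so that a later ResetExprContext() in the
 * hash/equal callbacks cannot leave the slot pointing at freed memory.
 */
static void
prepare_probe_slot_sketch(MemoizeState *mstate)
{
	TupleTableSlot *pslot = mstate->probeslot;
	ExprContext *econtext = mstate->ss.ps.ps_ExprContext;

	ExecClearTuple(pslot);
	for (int i = 0; i < mstate->nkeys; i++)
	{
		Datum		val = ExecEvalExpr(mstate->param_exprs[i], econtext,
									   &pslot->tts_isnull[i]);

		if (!pslot->tts_isnull[i])
		{
			int16		typlen;
			bool		typbyval;
			MemoryContext oldcxt;

			/* look up the key type's typlen/typbyval (assumed helper usage) */
			get_typlenbyval(exprType((Node *) mstate->param_exprs[i]->expr),
							&typlen, &typbyval);
			/* copy into longer-lived memory; the context is an assumption */
			oldcxt = MemoryContextSwitchTo(mstate->tableContext);
			val = datumCopy(val, typbyval, typlen);
			MemoryContextSwitchTo(oldcxt);
		}
		pslot->tts_values[i] = val;
	}
	ExecStoreVirtualTuple(pslot);
}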
Re: "type with xxxx does not exist" when doing ExecMemoize()
The attached patch is a new version based on v3 (not including Andrei's test case). There is no need to call datumCopy when isnull is true.

I have not added a new runtime memory context so far and continue to use mstate->tableContext. I'm not sure whether the memory used by the probeslot will count against mstate->mem_limit; maybe adding a new memory context would be better. I think I should spend a little more time studying nodeMemoize.c.

Andrei Lepikhov wrote on Mon, Feb 26, 2024 at 20:29:
> On 26/2/2024 18:34, Richard Guo wrote:
> >
> > On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov wrote:
> >
> >     On 26/2/2024 12:44, Tender Wang wrote:
> >     > Makes sense. I found MemoizeState already has a MemoryContext, so I used it.
> >     > I updated the patch.
> >     This approach is better for me. In the next version of this patch, I
> >     included a test case. I am still unsure about the context chosen and the
> >     stability of the test case. Richard, you recently fixed some Memoize
> >     issues, could you look at this problem and patch?
> >
> > I looked at this issue a bit. It seems to me what happens is that at
> > first the memory areas referenced by probeslot->tts_values[] are
> > allocated in the per tuple context (see prepare_probe_slot). And then
> > in MemoizeHash_hash, after we've calculated the hashkey, we will reset
> > the per tuple context. However, later in MemoizeHash_equal, we still
> > need to reference the values in probeslot->tts_values[], which have been
> > cleared.
> Agree
>
> > Actually the context would always be reset in MemoizeHash_equal, for
> > both binary and logical mode. So I kind of wonder if it's necessary to
> > reset the context in MemoizeHash_hash.
> I can only provide one thought against this solution: what if we have a
> lot of unique hash values, maybe all of them? In that case, we still
> have a kind of 'leak' David fixed by the commit 0b053e78b5.
> Also, I have a segfault report from one client. As I see it, it was caused
> by a too-long text column in the table slot. As I see it, the key value
> stored in the Memoize hash table was corrupted, and the most plausible
> reason is this bug. Should we add a test on this bug, and what do you
> think about the one proposed in v3?
>
> --
> regards,
> Andrei Lepikhov
> Postgres Professional

--
Tender Wang
OpenPie: https://en.openpie.com/

v4-0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch
Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
I recently read the Memoize code and how other nodes use ResetExprContext(). The comment about per_tuple_memory says:

 * ecxt_per_tuple_memory is a short-term context for expression results.
 * As the name suggests, it will typically be reset once per tuple,
 * before we begin to evaluate expressions for that tuple. Each
 * ExprContext normally has its very own per-tuple memory context.

So ResetExprContext() should be called once per tuple, not in the hash and equal functions, just as Richard said before. ExecResult() and ExecProjectSet() call ResetExprContext() once when entering those functions, so I think ExecMemoize() can do the same (a sketch of this shape follows below).

The attached patch includes the following modifications:
1. While reading the code in nodeMemoize.c, I found a typo: "outer" should be "inner", if I haven't misunderstood the intent of Memoize.
2. I found that almost every executor node calls CHECK_FOR_INTERRUPTS(), so I added it. Is it right to add it to ExecMemoize()?
3. I removed ResetExprContext() from the hash and equal functions and call it when entering ExecMemoize(), just like ExecProjectSet() does. ExecQualAndReset() is replaced with ExecQual().
4. This patch doesn't include a test case. I used Andrei's test case but could not reproduce the issue with it. I may need to spend some more time thinking about how to reproduce this issue easily.

So, what do you think about the one proposed in v5? @Andrei Lepikhov @Richard Guo @David Rowley. I don't want to continue working on top of the v3 patch; as Andrei Lepikhov said, using mstate->tableContext for the probeslot is not good. v5 looks simpler.

v5-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patch
Description: Binary data
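As a minimal illustration of that structure (an assumption-laden sketch, not the patch itself), the reset would move to the top of ExecMemoize():

/*
 * Sketch: reset the per-tuple context once per call, the way
 * ExecProjectSet() does, instead of inside the hash/equal callbacks.
 */
static TupleTableSlot *
ExecMemoize_sketch(PlanState *pstate)
{
	MemoizeState *node = castNode(MemoizeState, pstate);
	ExprContext *econtext = node->ss.ps.ps_ExprContext;

	CHECK_FOR_INTERRUPTS();

	/*
	 * Release whatever the previous cycle's hash-key evaluation
	 * allocated in per-tuple memory.
	 */
	ResetExprContext(econtext);

	/* ... the existing state machine (MEMO_CACHE_LOOKUP etc.) follows ... */
	return NULL;				/* placeholder for the sketch */
}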
Re: "type with xxxx does not exist" when doing ExecMemoize()
Hi,

While thinking about how to add a test case for the v5 patch, I wanted to check whether the v5 patch has a memory leak. The thread [1] provides a way to reproduce the memory leak, so I used it to test the v5 patch. I found no memory leak with v5.

But I found another interesting issue. After changing the whereClause in [1], the query reported the error below:
"ERROR could not find memoization table entry"

The query:
EXPLAIN analyze select sum(q.id_table1) from (SELECT t2.* FROM table1 t1 JOIN table2 t2 ON (t2.id_table1 + t2.id_table1) = t1.id) q;

With the v5 patch, it didn't report the error. I guess the reason is the same: the data in the probeslot was reset in the hash function.

I debugged the above query and got this:

before:
(gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0]))
$1 = {vl_len_ = 48, choice = {n_header = 32770, n_long = {n_sign_dscale = 32770, n_weight = 60, n_data = 0x564632ebd708}, n_short = {n_header = 32770, n_data = 0x564632ebd706}}}

after:
(gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0]))
$2 = {vl_len_ = 264, choice = {n_header = 32639, n_long = {n_sign_dscale = 32639, n_weight = 32639, n_data = 0x564632ebd6a8}, n_short = {n_header = 32639, n_data = 0x564632ebd6a6}}}

So after the call to ResetExprContext() in the hash function, the data in the probeslot is corrupted. It is hard to predict what error will occur when executing on corrupted data. During debugging, I learned that numeric_add doesn't have a type check like rangetype does, so the above query does not report "type with xxx does not exist".

I also realized that the test case added by Andrei Lepikhov in v3 is right, so the v6 patch includes Andrei Lepikhov's test case. Thanks a lot.

I think the v6 patch is now complete.

[1] https://www.postgresql.org/message-id/83281eed63c74e4f940317186372abfd%40cft.ru

--
Tender Wang
OpenPie: https://en.openpie.com/

v6-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patch
Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
Andrei Lepikhov wrote on Tue, Mar 5, 2024 at 17:36:
> On 1/3/2024 14:18, Tender Wang wrote:
> > During debugging, I learned that numeric_add doesn't have a type check
> > like rangetype does, so the above query will not report "type with xxx
> > does not exist".
> >
> > And I realized that the test case added by Andrei Lepikhov in v3 is
> > right, so the v6 patch includes Andrei Lepikhov's test case. Thanks a lot.
> >
> > I think the v6 patch is now complete.
> I've passed through the patch, and it looks okay. Although I am afraid
> of the same problems that future changes can cause and how to detect
> them, it works correctly.

Thanks for reviewing it. I have added it to commitfest 2024-07.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
Andrei Lepikhov wrote on Wed, Mar 6, 2024 at 11:37:
> I think it is a bug. Should it be fixed (and back-patched) earlier?

Agreed. We need David to review it, as he knows this area best.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
David Rowley wrote on Mon, Mar 11, 2024 at 13:25:
> On Thu, 7 Mar 2024 at 22:50, David Rowley wrote:
> >
> > On Thu, 7 Mar 2024 at 15:24, Tender Wang wrote:
> > >
> > > Andrei Lepikhov wrote on Wed, Mar 6, 2024 at 11:37:
> > >> I think it is a bug. Should it be fixed (and back-patched) earlier?
> > >
> > > Agreed. We need David to review it, as he knows this area best.
> >
> > This is on my list of things to do. Just not at the top yet.
>
> I've gone over this patch and I'm happy with the changes to
> nodeMemoize.c. The thing I did change was the newly added test. The
> problem there was the test was passing for me with and without the
> code fix. I ended up changing the test so the cache hits and misses
> are reported. That required moving the test to above where the
> work_mem is set to 64KB so we can be certain the values will all be
> cached and the cache hits are predictable.
>
> My other changes were just cosmetic.
>
> Thanks for working on this fix. I've pushed the patch.
>
> David

Thanks for pushing the patch.

--
Tender Wang
OpenPie: https://en.openpie.com/
same query but different result on pg16devel and pg15.2
Hi hackers,

I encountered a problem, shown below:

query:
select
  ref_0.ps_suppkey as c0,
  ref_1.c_acctbal as c1,
  ref_2.o_totalprice as c2,
  ref_2.o_orderpriority as c3,
  ref_2.o_clerk as c4
from
  public.partsupp as ref_0
  left join public.nation as sample_0
    inner join public.customer as sample_1
      on (false)
    on (true)
  left join public.customer as ref_1
    right join public.orders as ref_2
      on (false)
    left join public.supplier as ref_3
      on (false)
    on (sample_0.n_comment = ref_1.c_name)
where (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END,
                   CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END))
order by c0, c1, c2, c3, c4 limit 1;

on pg16devel:
 c0 | c1 | c2 | c3 | c4
----+----+----+----+----
  1 |    |    |    |
(1 row)

plan:
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Limit
   ->  Sort
         Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, o_orderpriority, o_clerk
         ->  Nested Loop Left Join
               ->  Seq Scan on partsupp ref_0
               ->  Result
                     One-Time Filter: false
(7 rows)

on pg15.2:
 c0 | c1 | c2 | c3 | c4
----+----+----+----+----
(0 rows)

plan:
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Limit
   ->  Sort
         Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, o_orderpriority, o_clerk
         ->  Hash Left Join
               Hash Cond: ((n_comment)::text = (c_name)::text)
               Filter: (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END, CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END))
               ->  Nested Loop Left Join
                     ->  Seq Scan on partsupp ref_0
                     ->  Result
                           One-Time Filter: false
               ->  Hash
                     ->  Result
                           One-Time Filter: false
(13 rows)

regards, tender wang
Re: same query but different result on pg16devel and pg15.2
The attached file includes the table schema, but no data.

tender wang wrote on Tue, Apr 4, 2023 at 10:53:
> Hi hackers,
> I encountered a problem, shown below:
> [full query and plans quoted in the previous message]

dbt3-s0.01-janm.sql
Description: Binary data
Re: Improve list manipulation in several places
Richard Guo wrote on Fri, Apr 21, 2023 at 15:35:
> There was discussion in [1] about improvements to list manipulation in
> several places. But since the discussion is not related to the topic in
> that thread, fork a new thread here and attach a patch to show my
> thoughts.
>
> Some are just cosmetic changes by using macros. The others should have
> performance gain from the avoidance of moving list entries. But I doubt
> the performance gain can be noticed or measured, as currently there are
> only a few places affected by the change. I still think the changes are
> worthwhile though, because it is very likely that future usage of the
> same scenario can benefit from these changes.

I doubt the performance gain from the lappend_copy function. new_tail_cell in lappend may not enter enlarge_list in most cases, because we may allocate extra cells in new_list (see the comment in new_list).

> (Copying in David and Ranier. Ranier provided a patch about the changes
> in list.c, but I'm not using that one.)
>
> [1] https://www.postgresql.org/message-id/CAMbWs49aakL%3DPP7NcTajCtDyaVUE-NMVMGpaLEKreYbQknkQWA%40mail.gmail.com
>
> Thanks
> Richard
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Xiaoran Wang wrote on Sat, Mar 18, 2023 at 15:04:
> Hi hackers,
>
> In heap_create_with_catalog, the Relation new_rel_desc is created
> by RelationBuildLocalRelation, not table_open. So it's better to
> call RelationClose to release it.

Why is it better to call RelationClose? Is there a problem with using table_close()?

> What's more, the comment for it seems useless, just delete it.
>
> Thanks!

regards, tender wang
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Bharath Rupireddy wrote on Wed, May 10, 2023 at 22:17:
> On Sat, Mar 18, 2023 at 12:34 PM Xiaoran Wang wrote:
> >
> > Hi hackers,
> >
> > In heap_create_with_catalog, the Relation new_rel_desc is created
> > by RelationBuildLocalRelation, not table_open. So it's better to
> > call RelationClose to release it.
> >
> > What's more, the comment for it seems useless, just delete it.
>
> Essentially, all the close functions are the same with NoLock, IOW,
> table_close(relation, NoLock) = relation_close(relation, NoLock) =
> RelationClose(relation). Therefore, table_close(new_rel_desc, NoLock)
> looks fine to me.

Agreed.

> And, the /* do not unlock till end of xact */ comment, it looks like it's
> been there from day 1. It may be indicating that the ref count of the new
> relation created in heap_create_with_catalog() will be decremented at
> the end of xact, but I'm not sure what it means.

Me too.

> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Tom Lane wrote on Thu, May 11, 2023 at 00:32:
> Bharath Rupireddy writes:
> > And, the /* do not unlock till end of xact */, it looks like it's been
> > there from day 1. It may be indicating that the ref count of the new
> > relation created in heap_create_with_catalog() will be decremented at
> > the end of xact, but I'm not sure what it means.
>
> Hmm, I think it's been copied-and-pasted from somewhere. It's quite
> common for us to not release locks on modified tables until end of
> transaction. However, that's not what's happening here, because we
> actually *don't have any such lock* at this point, as you can easily
> prove by stepping through this code and watching the contents of
> pg_locks from another session. We do acquire AccessExclusiveLock
> on the new table eventually, but not till control returns to
> DefineRelation.
>
> I'm not real sure that I like the proposed code change: it's unclear
> to me whether it's an unwise piercing of a couple of abstraction
> layers or an okay change because those abstraction layers haven't
> yet been applied to the new relation at all. However, I think the
> existing comment is actively misleading and needs to be changed.
> Maybe something like
>
> /*
>  * Close the relcache entry, since we return only an OID not a
>  * relcache reference. Note that we do not yet hold any lockmanager
>  * lock on the new rel, so there's nothing to release.
>  */
> table_close(new_rel_desc, NoLock);
>
> /*
>  * ok, the relation has been cataloged, so close catalogs and return
>  * the OID of the newly created relation.
>  */
> table_close(pg_class_desc, RowExclusiveLock);

+1. Personally, I prefer the code above.

> Given these comments, maybe changing the first call to RelationClose
> would be sensible, but I'm still not quite convinced.
>
> regards, tom lane
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Hi,
Is there any conclusion to this issue?

Jehan-Guillaume de Rorthais wrote on Thu, Aug 10, 2023 at 23:03:
> On Thu, 3 Aug 2023 11:02:43 +0200
> Alvaro Herrera wrote:
>
> > On 2023-Aug-03, tender wang wrote:
> >
> > > I think the old "sub-FK" should not be dropped; that would violate the
> > > foreign key constraint.
> >
> > Yeah, I've been playing more with the patch and it is definitely not
> > doing the right things. Just eyeballing the contents of pg_trigger and
> > pg_constraint for partitions added by ALTER...ATTACH shows that the
> > catalog contents are inconsistent with those added by CREATE TABLE
> > PARTITION OF.
>
> Well, as stated in my original message, the patch helps understand the
> problem and sketch a possible solution. It definitely is not complete.
>
> After DETACHing the table, we surely need to check everything again and
> recreate what is needed to keep the FK consistent.
>
> But should we keep the FK after DETACH? Did you check the two other
> discussions related to FK, self-FK & partition? Unfortunately, as Tender
> experienced, the more we dig the more we find bugs. Moreover, the second
> one might seem unsolvable and deserves a closer look. See:
>
> * FK broken after DETACHing referencing part
>   https://www.postgresql.org/message-id/20230420144344.40744130%40karst
> * Issue attaching a table to a partitioned table with an auto-referenced
>   foreign key
>   https://www.postgresql.org/message-id/20230707175859.17c91538%40karst
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
[...] into the v2 patch, so I didn't add a test case to the v2 patch. A patch without a test case is not a good patch; I'm just sharing my idea about this issue and hope to get your reply.

Alvaro Herrera wrote on Wed, Oct 25, 2023 at 20:13:
> On 2023-Oct-25, tender wang wrote:
>
> > Hi
> > Is there any conclusion to this issue?
>
> None yet. I intend to work on this at some point, hopefully soon.
>
> --
> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

From 24491419ad65871e54207d3ef481d8abe529e1e1 Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Fri, 27 Oct 2023 13:48:48 +0800
Subject: [PATCH v2] Fix partition detach issue.
---
 src/backend/commands/tablecmds.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 416a98e7ce..3b897b620a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19356,7 +19356,9 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 	{
 		ForeignKeyCacheInfo *fk = lfirst(cell);
 		HeapTuple	contup;
+		HeapTuple	parentConTup;
 		Form_pg_constraint conform;
+		Form_pg_constraint parentConForm;
 		Constraint *fkconstraint;
 		Oid			insertTriggerOid,
 					updateTriggerOid;
@@ -19374,6 +19376,24 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 			continue;
 		}
 
+		/* For the referenced side, if it is a partitioned table, each
+		 * partition has one row in pg_constraint. But it doesn't have an
+		 * INSERT CHECK trigger.
+		 */
+		Assert(OidIsValid(conform->conparentid));
+		parentConTup = SearchSysCache1(CONSTROID,
+									   ObjectIdGetDatum(conform->conparentid));
+		if (!HeapTupleIsValid(parentConTup))
+			elog(ERROR, "cache lookup failed for constraint %u",
+				 conform->conparentid);
+		parentConForm = (Form_pg_constraint) GETSTRUCT(parentConTup);
+		if (parentConForm->confrelid != conform->confrelid &&
+			parentConForm->conrelid == conform->conrelid)
+		{
+			ReleaseSysCache(contup);
+			ReleaseSysCache(parentConTup);
+			continue;
+		}
+
 		/* unset conparentid and adjust conislocal, coninhcount, etc. */
 		ConstraintSetParentConstraint(fk->conoid, InvalidOid, InvalidOid);
@@ -19421,6 +19441,7 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 									NULL, NULL);
 
 		ReleaseSysCache(contup);
+		ReleaseSysCache(parentConTup);
 	}
 	list_free_deep(fks);
 	if (trigrel)
-- 
2.25.1
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
The v8-0001 patch failed to apply in my local repo, as below:

git apply v8-0001-Make-all-SLRU-buffer-sizes-configurable.patch
error: patch failed: src/backend/access/transam/multixact.c:1851
error: src/backend/access/transam/multixact.c: patch does not apply
error: patch failed: src/backend/access/transam/subtrans.c:184
error: src/backend/access/transam/subtrans.c: patch does not apply
error: patch failed: src/backend/commands/async.c:117
error: src/backend/commands/async.c: patch does not apply
error: patch failed: src/backend/storage/lmgr/predicate.c:808
error: src/backend/storage/lmgr/predicate.c: patch does not apply
error: patch failed: src/include/commands/async.h:15
error: src/include/commands/async.h: patch does not apply

My local head commit is 15c9ac36299. Is there something I missed?

Dilip Kumar wrote on Fri, Nov 24, 2023 at 17:08:
> On Fri, Nov 24, 2023 at 10:17 AM Dilip Kumar wrote:
> >
> > On Thu, Nov 23, 2023 at 11:34 AM Dilip Kumar wrote:
> > >
> > > Note: With this testing, we have found a bug in the bank-wise
> > > approach: basically we are clearing procglobal->clogGroupFirst even
> > > before acquiring the bank lock, which means in most of the cases there
> > > will be a single process in each group as a group leader
> >
> > I realized that the bug fix I have done is not proper, so will send
> > the updated patch set with the proper fix soon.
>
> PFA, the updated patch set fixes the bug found during the testing of the
> group update using the injection point. Also attached is a patch to test
> the injection point, but for that we need to apply the injection point
> patches [1]
>
> [1] https://www.postgresql.org/message-id/ZWACtHPetBFIvP61%40paquier.xyz
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Andrey M. Borodin wrote on Thu, Dec 14, 2023 at 17:02:
>
> > On 14 Dec 2023, at 08:12, Amul Sul wrote:
> >
> > + int bankno = pageno & ctl->bank_mask;
> >
> > I am a bit uncomfortable seeing it as a mask, why can't it be simply a
> > number of banks (num_banks) and get the bank number through modulus op
> > (pageno % num_banks) instead of bitwise & operation (pageno &
> > ctl->bank_mask) which is a bit difficult to read compared to modulus op
> > which is quite simple, straightforward and much common practice in
> > hashing.
> >
> > Are there any advantages of using & over % ?

Using the Compiler Explorer [1] tool, '%' generates more assembly instructions than '&':

int GetBankno1(int pageno) {
    return pageno & 127;
}

int GetBankno2(int pageno) {
    return pageno % 127;
}

Under clang 13.0:

GetBankno1:                             # @GetBankno1
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], edi
        mov     eax, dword ptr [rbp - 4]
        and     eax, 127
        pop     rbp
        ret
GetBankno2:                             # @GetBankno2
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], edi
        mov     eax, dword ptr [rbp - 4]
        mov     ecx, 127
        cdq
        idiv    ecx
        mov     eax, edx
        pop     rbp
        ret

Under gcc 13.2:

GetBankno1:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     eax, DWORD PTR [rbp-4]
        and     eax, 127
        pop     rbp
        ret
GetBankno2:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     eax, DWORD PTR [rbp-4]
        movsx   rdx, eax
        imul    rdx, rdx, -2130574327
        shr     rdx, 32
        add     edx, eax
        mov     ecx, edx
        sar     ecx, 6
        cdq
        sub     ecx, edx
        mov     edx, ecx
        sal     edx, 7
        sub     edx, ecx
        sub     eax, edx
        mov     ecx, eax
        mov     eax, ecx
        pop     rbp
        ret

[1] https://godbolt.org/

> The instruction AND is ~20 times faster than IDIV [0]. This is a relatively
> hot function; it is worth sacrificing some readability to save ~ten
> nanoseconds on each check of the status of a transaction.

Given that AND is faster, can we also replace the '% SLRU_MAX_BANKLOCKS' operation in SimpleLruGetBankLock() with '&':

SimpleLruGetBankLock()
{
    int banklockno = (pageno & ctl->bank_mask) % SLRU_MAX_BANKLOCKS;    /* use '&' here */

    return &(ctl->shared->bank_locks[banklockno].lock);
}

Thoughts?

> [0] https://www.agner.org/optimize/instruction_tables.pdf
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Andrey M. Borodin 于2023年12月14日周四 17:35写道:

> > On 14 Dec 2023, at 14:28, tender wang wrote:
> >
> > Now that AND is faster, can we replace the '% SLRU_MAX_BANKLOCKS'
> > operation in SimpleLruGetBankLock() with '& 127'?
>
> unsigned int GetBankno1(unsigned int pageno) {
>     return pageno & 127;
> }
>
> unsigned int GetBankno2(unsigned int pageno) {
>     return pageno % 128;
> }
>
> Generated with -O2:
>
> GetBankno1(unsigned int):
>         mov     eax, edi
>         and     eax, 127
>         ret
> GetBankno2(unsigned int):
>         mov     eax, edi
>         and     eax, 127
>         ret
>
> The compiler is smart enough with constants.

Yeah, that's true.

int GetBankno(long pageno) {
    unsigned short bank_mask = 128;
    int bankno = (pageno & bank_mask) % 128;
    return bankno;
}

With -O2 enabled, this is only one instruction:

        xor     eax, eax

(Note that with bank_mask = 128, (pageno & 128) can only be 0 or 128, so the % 128 result is always 0; the compiler has folded the whole function to the constant zero.)

But if we use '%' in both places, things change as below:

int GetBankno(long pageno) {
    unsigned short bank_mask = 128;
    int bankno = (pageno % bank_mask) % 128;
    return bankno;
}

        mov     rdx, rdi
        sar     rdx, 63
        shr     rdx, 57
        lea     rax, [rdi+rdx]
        and     eax, 127
        sub     eax, edx

> Best regards, Andrey Borodin.
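The extra instructions in the signed case are not a compiler weakness but C semantics: signed % truncates toward zero, so it disagrees with the mask for negative inputs. A small hedged demo (mine, not from the thread):

#include <stdio.h>

/* Hedged illustration: for signed operands, C defines x % 128 to truncate
 * toward zero, so it differs from x & 127 on negative values.  The two are
 * interchangeable only for non-negative (or unsigned) operands, which is
 * why the unsigned version above compiles to a single AND while the signed
 * version needs the sign-fixup instructions seen in the gcc output. */
int main(void)
{
    long pageno = -129;

    printf("%ld %% 128 = %ld\n", pageno, pageno % 128);  /* prints -1  */
    printf("%ld &  127 = %ld\n", pageno, pageno & 127);  /* prints 127 */
    return 0;
}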
Re: Improve heapgetpage() performance, overhead from serializable
This thread [1] is also about improving the heapgetpage() function, and looks similar to this one.

[1] https://www.postgresql.org/message-id/a9f40066-3d25-a240-4229-ec2fbe94e7a5%40yeah.net

Muhammad Malik 于2023年9月1日周五 04:04写道:

> Hi,
>
> Is there a plan to merge this patch in PG16?
>
> Thanks,
> Muhammad
>
> ------------------------------
> *From:* Andres Freund
> *Sent:* Saturday, July 15, 2023 6:56 PM
> *To:* pgsql-hack...@postgresql.org
> *Cc:* Thomas Munro
> *Subject:* Improve heapgetpage() performance, overhead from serializable
>
> Hi,
>
> Several loops which are important for query performance, like heapgetpage()'s
> loop over all tuples, have to call functions like
> HeapCheckForSerializableConflictOut() and PredicateLockTID() in every
> iteration.
>
> When serializable is not in use, all those functions do is return. But
> being situated in a different translation unit, the compiler can't inline
> (without LTO at least) the check whether serializability is needed. It's not
> just the function call overhead that's noticeable, it's also that registers
> have to be spilled to the stack / reloaded from memory etc.
>
> On a freshly loaded pgbench scale 100, with turbo mode disabled, postgres
> pinned to one core. Parallel workers disabled to reduce noise. All times are
> the average of 15 executions with pgbench, in a newly started, but prewarmed
> postgres.
>
> SELECT * FROM pgbench_accounts OFFSET 1000;
>
> HEAD:
> 397.977
>
> removing the HeapCheckForSerializableConflictOut() from heapgetpage()
> (incorrect!), to establish the baseline of what serializable costs:
> 336.695
>
> pulling out CheckForSerializableConflictOutNeeded() from
> HeapCheckForSerializableConflictOut() in heapgetpage(), and avoiding calling
> HeapCheckForSerializableConflictOut() in the loop:
> 339.742
>
> moving the loop into a static inline function, marked as pg_always_inline,
> called with static arguments for always_visible, check_serializable:
> 326.546
>
> marking the always_visible, !check_serializable case likely():
> 322.249
>
> removing TestForOldSnapshot() calls, which we pretty much already decided on:
> 312.987
>
> FWIW, there's more we can do; with some hacky changes I got the time down to
> 273.261, but the tradeoffs start to be a bit more complicated. And 397->320ms
> for something as core as this is imo worth considering on its own.
>
> Now, this just affects the sequential scan case. heap_hot_search_buffer()
> shares many of the same pathologies. I find it a bit harder to improve,
> because the compiler's code generation seems to switch between good / bad
> with changes that seem unrelated...
>
> I wonder why we haven't used PageIsAllVisible() in heap_hot_search_buffer()
> so far?
>
> Greetings,
>
> Andres Freund
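For readers skimming the thread, the core trick Andres describes is loop-invariant hoisting: evaluate the cheap "is serializable in use?" predicate once per page instead of paying an out-of-line call per tuple. A self-contained sketch of the shape of the change (my simplification with stand-in types, not the actual heapam code):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the backend structures discussed above. */
typedef struct { bool serializable_in_use; } Snapshot;

static bool check_needed(const Snapshot *snap) { return snap->serializable_in_use; }
static void conflict_check(size_t tupno) { (void) tupno; /* expensive work */ }

/* Before: conceptually, an out-of-line call decides per tuple. */
static void scan_page_naive(const Snapshot *snap, size_t ntuples)
{
    for (size_t i = 0; i < ntuples; i++)
        if (check_needed(snap))
            conflict_check(i);
}

/* After: hoist the loop-invariant decision out of the hot loop, which is
 * what pulling CheckForSerializableConflictOutNeeded() out of
 * HeapCheckForSerializableConflictOut() achieves in the real patch. */
static void scan_page_hoisted(const Snapshot *snap, size_t ntuples)
{
    const bool check = check_needed(snap);

    for (size_t i = 0; i < ntuples; i++)
        if (check)
            conflict_check(i);
}

int main(void)
{
    Snapshot snap = { .serializable_in_use = false };
    scan_page_naive(&snap, 256);
    scan_page_hoisted(&snap, 256);
    return 0;
}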
Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi all,

I recently ran benchmark [1] on master and found a performance problem, as below:

explain analyze select
    subq_0.c0 as c0,
    subq_0.c1 as c1,
    subq_0.c2 as c2
from
    (select
        ref_0.l_shipmode as c0,
        sample_0.l_orderkey as c1,
        sample_0.l_quantity as c2,
        ref_0.l_orderkey as c3,
        sample_0.l_shipmode as c5,
        ref_0.l_shipinstruct as c6
    from
        public.lineitem as ref_0
        left join public.lineitem as sample_0
        on ((select p_partkey from public.part order by p_partkey limit 1)
            is not NULL)
    where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=78.00..45267050.75 rows=1 width=27) (actual time=299695.097..299695.099 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.651..0.652 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.650..0.651 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.013..0.428 rows=2000 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..45266972.75 rows=1 width=27) (actual time=299695.096..299695.096 rows=0 loops=1)
         Join Filter: ($0 IS NOT NULL)
         Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
         Rows Removed by Filter: 3621030625
         ->  Seq Scan on lineitem ref_0  (cost=0.00..1969.75 rows=60175 width=11) (actual time=0.026..6.225 rows=60175 loops=1)
         ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.554 rows=60175 loops=60175)
               ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
 Planning Time: 0.172 ms
 Execution Time: 299695.501 ms
(16 rows)

After I set enable_material to off, the same query runs faster, as below:

set enable_material = off;
explain analyze select
    subq_0.c0 as c0,
    subq_0.c1 as c1,
    subq_0.c2 as c2
from
    (select
        ref_0.l_shipmode as c0,
        sample_0.l_orderkey as c1,
        sample_0.l_quantity as c2,
        ref_0.l_orderkey as c3,
        sample_0.l_shipmode as c5,
        ref_0.l_shipinstruct as c6
    from
        public.lineitem as ref_0
        left join public.lineitem as sample_0
        on ((select p_partkey from public.part order by p_partkey limit 1)
            is not NULL)
    where sample_0.l_orderkey is NULL) as subq_0
where subq_0.c5 is NULL
limit 1;

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1078.00..91026185.57 rows=1 width=27) (actual time=192669.605..192670.425 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.662..0.663 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.661..0.662 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.017..0.430 rows=2000 loops=1)
   ->  Gather  (cost=1000.00..91026107.57 rows=1 width=27) (actual time=192669.604..192670.422 rows=0 loops=1)
         Workers Planned: 1
         Params Evaluated: $0
         Workers Launched: 1
         ->  Nested Loop Left Join  (cost=0.00..91025107.47 rows=1 width=27) (actual time=192588.143..192588.144 rows=0 loops=2)
               Join Filter: ($0 IS NOT NULL)
               Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
               Rows Removed by Filter: 1810515312
               ->  Parallel Seq Scan on lineitem ref_0  (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.007..3.797 rows=30088 loops=2)
               ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175)
 Planning Time: 0.174 ms
 Execution Time: 192670.458 ms
(19 rows)

I debugged the code and found that consider_parallel_nestloop() doesn't consider the materialized form of the cheapest inner path. With enable_material = true, we can see that a Material path won in the first plan, yet the Parallel Seq Scan node was not added as the outer path. That is because in try_partial_nestloop_path() the cost of the nestloop was computed using the seq scan path, not the material path.

[1] includes the test table schema and data, so you can reproduce the problem above. I tried to fix this problem in the attached patch, and I found that pg12.12 also has this issue.
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
After applying the patch, the result is as below:

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1078.00..26630101.20 rows=1 width=27) (actual time=160571.005..160571.105 rows=0 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=1.065..1.066 rows=1 loops=1)
           ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=1.064..1.065 rows=1 loops=1)
                 Sort Key: part.p_partkey
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.046..0.830 rows=2000 loops=1)
   ->  Gather  (cost=1000.00..26630023.20 rows=1 width=27) (actual time=160571.003..160571.102 rows=0 loops=1)
         Workers Planned: 1
         Params Evaluated: $0
         Workers Launched: 1
         ->  Nested Loop Left Join  (cost=0.00..26629023.10 rows=1 width=27) (actual time=160549.257..160549.258 rows=0 loops=2)
               Join Filter: ($0 IS NOT NULL)
               Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
               Rows Removed by Filter: 1810515312
               ->  Parallel Seq Scan on lineitem ref_0  (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.010..3.393 rows=30088 loops=2)
               ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.839 rows=60175 loops=60175)
                     ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.008..11.381 rows=60175 loops=2)
 Planning Time: 0.174 ms
 Execution Time: 160571.476 ms
(20 rows)

tender wang 于2023年9月5日周二 16:52写道:

> Hi all,
>
> I recently ran benchmark [1] on master, but I found a performance problem
> as below:
>
> explain analyze select
>     subq_0.c0 as c0,
>     subq_0.c1 as c1,
>     subq_0.c2 as c2
> from
>     (select
>         ref_0.l_shipmode as c0,
>         sample_0.l_orderkey as c1,
>         sample_0.l_quantity as c2,
>         ref_0.l_orderkey as c3,
>         sample_0.l_shipmode as c5,
>         ref_0.l_shipinstruct as c6
>     from
>         public.lineitem as ref_0
>         left join public.lineitem as sample_0
>         on ((select p_partkey from public.part order by p_partkey limit 1)
>             is not NULL)
>     where sample_0.l_orderkey is NULL) as subq_0
> where subq_0.c5 is NULL
> limit 1;
>
>                                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=78.00..45267050.75 rows=1 width=27) (actual time=299695.097..299695.099 rows=0 loops=1)
>    InitPlan 1 (returns $0)
>      ->  Limit  (cost=78.00..78.00 rows=1 width=8) (actual time=0.651..0.652 rows=1 loops=1)
>            ->  Sort  (cost=78.00..83.00 rows=2000 width=8) (actual time=0.650..0.651 rows=1 loops=1)
>                  Sort Key: part.p_partkey
>                  Sort Method: top-N heapsort  Memory: 25kB
>                  ->  Seq Scan on part  (cost=0.00..68.00 rows=2000 width=8) (actual time=0.013..0.428 rows=2000 loops=1)
>    ->  Nested Loop Left Join  (cost=0.00..45266972.75 rows=1 width=27) (actual time=299695.096..299695.096 rows=0 loops=1)
>          Join Filter: ($0 IS NOT NULL)
>          Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL))
>          Rows Removed by Filter: 3621030625
>          ->  Seq Scan on lineitem ref_0  (cost=0.00..1969.75 rows=60175 width=11) (actual time=0.026..6.225 rows=60175 loops=1)
>          ->  Materialize  (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.554 rows=60175 loops=60175)
>                ->  Seq Scan on lineitem sample_0  (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1)
>  Planning Time: 0.172 ms
>  Execution Time: 299695.501 ms
> (16 rows)
>
> After I set enable_material to off, the same query runs faster, as below:
>
> set enable_material = off;
> explain analyze select
>     subq_0.c0 as c0,
>     subq_0.c1 as c1,
>     subq_0.c2 as c2
> from
>     (select
>         ref_0.l_shipmode as c0,
>         sample_0.l_orderkey as c1,
>         sample_0.l_quantity as c2,
>         ref_0.l_orderkey as c3,
>         sample_0.l_shipmode as c5,
>         ref_0.l_shipinstruct as c6
>     from
>         public.lineitem as ref_0
>         left join public.lineitem as sample_0
>         o
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2023年9月5日周二 18:51写道:

> On Tue, Sep 5, 2023 at 4:52 PM tender wang wrote:
>
>> I recently ran benchmark [1] on master, but I found a performance problem
>> as below:
>> ...
>>
>> I debugged the code and found that consider_parallel_nestloop() doesn't
>> consider the materialized form of the cheapest inner path.
>
> Yeah, this seems an omission in commit 45be99f8. I reviewed the patch
> and here are some comments.
>
> * I think we should not consider materializing the cheapest inner path
> if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> unique-ify the inner path.

That's right; this is fixed in the v2 patch.

> * I think we can check if it'd be parallel safe before creating the
> material path, thus avoiding the creation in unsafe cases.

Agreed.

> * I don't think the test case you added works for the code changes.
> Maybe a plan like below is better:

Agreed.

> explain (costs off)
> select * from tenk1, tenk2 where tenk1.two = tenk2.two;
>               QUERY PLAN
> ----------------------------------------------
>  Gather
>    Workers Planned: 4
>    ->  Nested Loop
>          Join Filter: (tenk1.two = tenk2.two)
>          ->  Parallel Seq Scan on tenk1
>          ->  Materialize
>                ->  Seq Scan on tenk2
> (7 rows)
>
> Thanks
> Richard

From 096c645d7c8d06df3a4e6a8aa7fc4edd1c0a3755 Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Thu, 7 Sep 2023 17:55:04 +0800
Subject: [PATCH v2] Parallel seq scan should consider material inner path in
 nestloop case.

---
 src/backend/optimizer/path/joinpath.c         | 19 +++++++++++
 src/test/regress/expected/select_parallel.out | 24 ++++++++++++++
 src/test/regress/sql/select_parallel.sql      |  9 ++++++
 3 files changed, 52 insertions(+)

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 821d282497..87111ad643 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -2004,10 +2004,25 @@ consider_parallel_nestloop(PlannerInfo *root,
 {
 	JoinType	save_jointype = jointype;
 	ListCell   *lc1;
+	Path	   *matpath = NULL;
+	Path	   *inner_cheapest_total = innerrel->cheapest_total_path;
 
 	if (jointype == JOIN_UNIQUE_INNER)
 		jointype = JOIN_INNER;
 
+	/*
+	 * Consider materializing the cheapest inner path, unless we're doing
+	 * JOIN_UNIQUE_INNER or enable_material is off or the subpath is parallel
+	 * unsafe or the path in question materializes its output anyway.
+	 */
+	if (save_jointype != JOIN_UNIQUE_INNER &&
+		enable_material &&
+		inner_cheapest_total != NULL &&
+		inner_cheapest_total->parallel_safe &&
+		!ExecMaterializesOutput(inner_cheapest_total->pathtype))
+		matpath = (Path *)
+			create_material_path(innerrel, inner_cheapest_total);
+
 	foreach(lc1, outerrel->partial_pathlist)
 	{
 		Path	   *outerpath = (Path *) lfirst(lc1);
@@ -2064,6 +2079,10 @@ consider_parallel_nestloop(PlannerInfo *root,
 			try_partial_nestloop_path(root, joinrel, outerpath, mpath,
 									  pathkeys, jointype, extra);
 		}
+		/* Also consider materialized form of the cheapest inner path */
+		if (matpath != NULL && matpath->parallel_safe)
+			try_partial_nestloop_path(root, joinrel, outerpath, matpath,
+									  pathkeys, jointype, extra);
 	}
 }
 
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index d88353d496..5b9f5c58cc 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -844,6 +844,30 @@ select * from
 (12 rows)
 
 reset enable_material;
+-- test materialized form of the cheapest inner path
+set min_parallel_table_scan_size = '512kB';
+explain(costs off)
+select count(*) from tenk1, int4_tbl where tenk1.two < int4_tbl.f1;
+                    QUERY PLAN
+---------------------------------------------------
+ Finalize Aggregate
+   ->  Gather
+         Workers Planned: 4
+         ->  Partial Aggregate
+               ->  Nested Loop
+                     Join Filter: (tenk1.two < int4_tbl.f1)
+                     ->  Parallel Seq Scan on tenk1
+                     ->  Materialize
+                           ->  Seq Scan on int4_tbl
+(9 ro
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi Tom,

Do you have any comments or suggestions on this issue? Thanks.

Richard Guo 于2023年9月8日周五 14:06写道:

> On Fri, Sep 8, 2023 at 3:15 AM Robert Haas wrote:
>
>> The example query provided here seems rather artificial. Surely few
>> people write a join clause that references neither of the tables being
>> joined. Is there a more realistic case where this makes a big
>> difference?
>
> Yes, the given example query is not that convincing. I tried a query
> with plans as below (after some GUC settings) which might be more
> realistic in the real world.
>
> unpatched:
>
> explain select * from partsupp join lineitem on l_partkey > ps_partkey;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Gather  (cost=0.00..5522666.44 rows=16047 width=301)
>    Workers Planned: 4
>    ->  Nested Loop  (cost=0.00..5522666.44 rows=40116667 width=301)
>          Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
>          ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
>          ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
> (6 rows)
>
> patched:
>
> explain select * from partsupp join lineitem on l_partkey > ps_partkey;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Gather  (cost=0.00..1807085.44 rows=16047 width=301)
>    Workers Planned: 4
>    ->  Nested Loop  (cost=0.00..1807085.44 rows=40116667 width=301)
>          Join Filter: (lineitem.l_partkey > partsupp.ps_partkey)
>          ->  Parallel Seq Scan on lineitem  (cost=0.00..1518.44 rows=15044 width=144)
>          ->  Materialize  (cost=0.00..307.00 rows=8000 width=157)
>                ->  Seq Scan on partsupp  (cost=0.00..267.00 rows=8000 width=157)
> (7 rows)
>
> The execution times (ms) are (avg of 3 runs):
>
> unpatched: 71769.21
> patched:   65510.04
>
> So we can see some (~9%) performance gain in this case.
>
> Thanks
> Richard
Re: Problem, partition pruning for prepared statement with IS NULL clause.
The comment /* not needed for Consts */ might be better placed next to the if (!IsA(expr, Const)) check. Everything else looks good to me.

David Rowley 于2023年10月9日周一 07:28写道:

> On Sat, 7 Oct 2023 at 03:11, Sergei Glukhov wrote:
> > I noticed that the combination of a prepared statement with a generic
> > plan and an 'IS NULL' clause could lead partition pruning to crash.
> >
> > Test case:
> > --
> > set plan_cache_mode to force_generic_plan;
> > prepare stmt AS select * from hp where a is null and b = $1;
> > explain execute stmt('xxx');
>
> Thanks for the detailed report and proposed patch.
>
> I think your proposed fix isn't quite correct. I think the problem
> lies in InitPartitionPruneContext() where we assume that the list
> positions of step->exprs are in sync with the keyno. If you look at
> perform_pruning_base_step() the code there makes a special effort to
> skip over any keyno when a bit is set in opstep->nullkeys.
>
> It seems that your patch is adjusting the keyno that's given to the
> PruneCxtStateIdx() and it looks like (for your test case) it'll end up
> passing keyno==0 when it should be passing keyno==1. keyno is the
> index of the partition key, so you can't pass 0 when it's for key
> index 1.
>
> I wonder if it's worth expanding the tests further to cover more of
> the pruning cases to cover run-time pruning too.

I think it's worth doing that.

> David
Re: Problem, partition pruning for prepared statement with IS NULL clause.
For a hash-partitioned table, if a partition key is covered by an IS NULL clause, the condition in get_steps_using_prefix_recurse():

    if (cur_keyno < step_lastkeyno - 1)

is not sufficient. As in the reported crash case,

    explain select * from hp where a = 1 and b is null and c = 1;

the prefix list contains only the a = 1 clause. I try to fix this in the attached patch.

David Rowley 于2023年10月11日周三 10:50写道:

> On Tue, 10 Oct 2023 at 21:31, Sergei Glukhov wrote:
> > create table hp (a int, b text, c int, d int)
> >   partition by hash (a part_test_int4_ops, b part_test_text_ops, c
> > part_test_int4_ops);
> > create table hp0 partition of hp for values with (modulus 4, remainder 0);
> > create table hp3 partition of hp for values with (modulus 4, remainder 3);
> > create table hp1 partition of hp for values with (modulus 4, remainder 1);
> > create table hp2 partition of hp for values with (modulus 4, remainder 2);
> >
> > Another crash in a different place, even with the fix:
> > explain select * from hp where a = 1 and b is null and c = 1;
>
> Ouch. It looks like 13838740f tried to fix things in this area before
> and even added a regression test for it. Namely:
>
> -- Test that get_steps_using_prefix() handles non-NULL step_nullkeys
> explain (costs off) select * from hp_prefix_test where a = 1 and b is
> null and c = 1 and d = 1;
>
> I guess that one does not crash because the "d = 1" clause is in
> the "start" ListCell in get_steps_using_prefix_recurse(), whereas
> with your case start is NULL, which is an issue for cur_keyno =
> ((PartClauseInfo *) lfirst(start))->keyno;.
>
> It might have been better if PartClauseInfo could also describe IS
> NULL quals, but I feel if we do that now then it would require lots of
> careful surgery in partprune.c to account for that. Probably the fix
> should be localised to get_steps_using_prefix_recurse() to have it do
> something like pass the keyno to try and work on rather than trying to
> get that from the "prefix" list. That way if there's no item in that
> list for that keyno, we can check in step_nullkeys for the keyno.
>
> I'll continue looking.
>
> David

From 1640c0d05269c3368fb41fcffc66e38630ff522d Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Wed, 11 Oct 2023 11:32:04 +0800
Subject: [PATCH] Fix NULL partition key pruning for hash partition table.

---
 src/backend/partitioning/partprune.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 7179b22a05..c2a388d454 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -2438,6 +2438,7 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context,
 	List	   *result = NIL;
 	ListCell   *lc;
 	int			cur_keyno;
+	int			prefix_lastkeyno;
 
 	/* Actually, recursion would be limited by PARTITION_MAX_KEYS. */
 	check_stack_depth();
@@ -2445,7 +2446,11 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context,
 	/* Check if we need to recurse. */
 	Assert(start != NULL);
 	cur_keyno = ((PartClauseInfo *) lfirst(start))->keyno;
-	if (cur_keyno < step_lastkeyno - 1)
+	/* Note that for hash partitioning, if a partition key is covered by an
+	 * IS NULL clause, that partition key will not be present in the prefix List.
+	 */
+	prefix_lastkeyno = ((PartClauseInfo *) llast(prefix))->keyno;
+	if (cur_keyno < step_lastkeyno - 1 && cur_keyno != prefix_lastkeyno)
 	{
 		PartClauseInfo *pc;
 		ListCell   *next_start;
-- 
2.25.1
Re: Problem, partition pruning for prepared statement with IS NULL clause.
David Rowley 于2023年10月11日周三 15:52写道:

> On Wed, 11 Oct 2023 at 15:49, David Rowley wrote:
> > It might have been better if PartClauseInfo could also describe IS
> > NULL quals, but I feel if we do that now then it would require lots of
> > careful surgery in partprune.c to account for that. Probably the fix
> > should be localised to get_steps_using_prefix_recurse() to have it do
> > something like pass the keyno to try and work on rather than trying to
> > get that from the "prefix" list. That way if there's no item in that
> > list for that keyno, we can check in step_nullkeys for the keyno.
> >
> > I'll continue looking.
>
> The fix seems to amount to the attached. The following condition
> assumes that by not recursively processing step_lastkeyno - 1 there
> will be at least one more PartClauseInfo in the prefix List to
> process. However, that didn't work when that partition key clause was
> covered by an IS NULL clause.
>
> If we adjust the following condition:
>
> if (cur_keyno < step_lastkeyno - 1)
>
> to become:
>
> final_keyno = ((PartClauseInfo *) llast(prefix))->keyno;
> if (cur_keyno < final_keyno)

Yeah, agreed.

> then that ensures that the else clause can pick up any clauses for the
> final column mentioned in the 'prefix' list, plus any nullkeys if
> there happen to be any of those too.
>
> For testing, given that 13838740f (from 2020) had a go at fixing this
> already, I'm kinda thinking that it's not overkill to test all
> possible 16 combinations of IS NULL and equality quals on the
> 4-partition-key-column partitioned table that commit added in
> partition_prune.sql.
>
> I added some tests there using \gexec to prevent having to write out
> each of the 16 queries by hand. I tested that pruning worked (i.e. 1
> matching partition in EXPLAIN), and that we get the correct results
> (i.e. we pruned the correct partition) by running the query and checking
> that we get the expected 1 row after having inserted 16 rows, one for
> each combination of quals to test.
>
> I wanted to come up with some tests that test for multiple quals
> matching the same partition key. This is tricky due to the
> equivalence class code being smart and removing any duplicates or
> marking the rel as dummy when it finds conflicting quals. With hash
> partitioning, we're limited to just equality quals, so maybe something
> could be done with range-partitioned tables instead. I see there are
> some tests just above the ones I modified which try to cover this.
>
> I also tried to outsmart the planner by using Params and prepared
> queries. Namely:
>
> set plan_cache_mode = 'force_generic_plan';
> prepare q1 (int, int, int, int, int, int, int, int) as select
> tableoid::regclass,* from hp_prefix_test where a = $1 and b = $2 and c
> = $3 and d = $4 and a = $5 and b = $6 and c = $7 and d = $8;
> explain (costs off) execute q1 (1,2,3,4,1,2,3,4);
>
> But I was outsmarted again with a gating qual which checked the pairs
> match before doing the scan :-(
>
> Append
>   Subplans Removed: 15
>   ->  Result
>         One-Time Filter: (($1 = $5) AND ($2 = $6) AND ($3 = $7) AND ($4 = $8))
>         ->  Seq Scan on hp_prefix_test_p14 hp_prefix_test_1
>               Filter: ((a = $5) AND (b = $6) AND (c = $7) AND (d = $8))
>
> I'm aiming to commit these as two separate fixes, so I'm going to go
> look again at the first one and wait to see if anyone wants to comment
> on this patch in the meantime.

+1, LGTM

> David
Why can't we simplify stable functions in the planning phase?
Hi hackers,

In evaluate_function(), I found the following code:

/*
 * Ordinarily we are only allowed to simplify immutable functions. But for
 * purposes of estimation, we consider it okay to simplify functions that
 * are merely stable; the risk that the result might change from planning
 * time to execution time is worth taking in preference to not being able
 * to estimate the value at all.
 */
if (funcform->provolatile == PROVOLATILE_IMMUTABLE)
    /* okay */ ;
else if (context->estimate && funcform->provolatile == PROVOLATILE_STABLE)
    /* okay */ ;
else
    return NULL;

This code says that stable functions cannot be simplified here (i.e., in the planning phase). I want to know why. Could someone show me an example of a query that would return incorrect results if a stable function were simplified in the planning phase?

With kindest regards, tender wang
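One hedged illustration of the risk (my own example; the table name "events" is hypothetical): now() is stable, meaning it returns the same value for every call within one execution, but a different value in a later execution. If the planner folded it into a constant, a cached generic plan would freeze that value forever:

-- If now() were simplified at planning time, the prepared statement below
-- would keep comparing against the timestamp of the moment the generic plan
-- was built, not the moment of each EXECUTE.
SET plan_cache_mode = force_generic_plan;
PREPARE recent AS
    SELECT * FROM events WHERE created_at > now() - interval '1 hour';
EXECUTE recent;  -- correct today only because now() is evaluated at
                 -- execution time rather than folded away during planning

This is also why the comment quoted above allows the simplification for estimation purposes only: a slightly stale constant is fine for a selectivity estimate, but not for the query's actual result.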
ERROR: permission info at index 1 ....
Hi hackers,

After commit a61b1f74823c, the query below reports an error:

create table perm_test1(a int);
create table perm_test2(b int);

select subq.c0
from (select (select a from perm_test1 order by a limit 1) as c0, b as c1
      from perm_test2
      where false
      order by c0, c1) as subq
where false;
ERROR:  permission info at index 1 (with relid=16457) does not match provided RTE (with relid=16460)

The following change fixes it:

--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -512,11 +512,16 @@ flatten_rtes_walker(Node *node, flatten_rtes_walker_context *cxt)
 		 * Recurse into subselects.  Must update cxt->query to this query so
 		 * that the rtable and rteperminfos correspond with each other.
 		 */
+		Query	   *current_query = cxt->query;
+		bool		result;
+
 		cxt->query = (Query *) node;
-		return query_tree_walker((Query *) node,
+		result = query_tree_walker((Query *) node,
 								 flatten_rtes_walker,
 								 (void *) cxt,
 								 QTW_EXAMINE_RTES_BEFORE);
+		cxt->query = current_query;
+		return result;
 	}

regards, tender wang
wrong query result due to wrong plan
Hi hackers,

I have this query, as shown below:

select
    ref_1.r_comment as c0,
    subq_0.c1 as c1
from
    public.region as sample_0
    right join public.partsupp as sample_1
        right join public.lineitem as sample_2
        on (cast(null as path) = cast(null as path))
    on (cast(null as "timestamp") < cast(null as "timestamp"))
    inner join public.lineitem as ref_0 on (true)
    left join (select
                   sample_3.ps_availqty as c1,
                   sample_3.ps_comment as c2
               from
                   public.partsupp as sample_3
               where false
               order by c1, c2) as subq_0
    on (sample_1.ps_supplycost = subq_0.c1)
    right join public.region as ref_1
    on (sample_1.ps_availqty = ref_1.r_regionkey)
where ref_1.r_comment is not NULL
order by c0, c1;

This query returns different results on pg12.12 and on HEAD. On pg12.12:

                               c0                                | c1
-----------------------------------------------------------------+----
 even, ironic theodolites according to the bold platelets wa     |
 furiously unusual packages use carefully above the unusual, exp |
 silent, bold requests sleep slyly across the quickly sly dependencies. furiously silent instructions alongside |
 special, bold deposits haggle foxes. platelet                   |
 special Tiresias about the furiously even dolphins are furi     |
(5 rows)

Its plan:

                  QUERY PLAN
-----------------------------------------------
 Sort
   Sort Key: ref_1.r_comment, c1
   ->  Hash Left Join
         Hash Cond: (ref_1.r_regionkey = ps_availqty)
         ->  Seq Scan on region ref_1
               Filter: (r_comment IS NOT NULL)
         ->  Hash
               ->  Result
                     One-Time Filter: false
(9 rows)

But on HEAD (pg16devel), the result is:

 c0 | c1
----+----
(0 rows)

Its plan:

              QUERY PLAN
---------------------------------------
 Sort
   Sort Key: ref_1.r_comment, subq_0.c1
   ->  Result
         One-Time Filter: false
(4 rows)

The attached file includes the table schema info.

regards, tender wang

dbt3-s0.01-janm.sql
Description: Binary data
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
fujii.y...@df.mitsubishielectric.co.jp <fujii.y...@df.mitsubishielectric.co.jp> 于2024年6月5日周三 09:26写道:

> Hi Tender.
>
> Thank you for the modification.
>
> > From: Tender Wang
> > Sent: Tuesday, June 4, 2024 7:51 PM
> > > Please add more tests. Especially please add some negative tests;
> > > the current patch checks that it is safe to apply materialization.
> > > It would be helpful to add tests checking that materialization is
> > > not applied in both checked cases:
> > > 1. when the inner join path is not parallel safe
> > > 2. when matpath is not parallel safe
> >
> > I added a test case in which the inner rel is not parallel safe.
> > Actually, matpath will not be created if the inner rel is not parallel
> > safe, so I didn't add a test case for the second scenario.
>
> Is there a case in which matpath is not parallel safe while the inner rel
> is parallel safe? If so, I think it would be suitable to add a negative
> test for such a case.

I looked through the create_xxx_path() functions and found that in most cases a path's parallel_safe flag is copied from RelOptInfo.consider_parallel. Some paths also take subpath->parallel_safe into account (e.g., the Material path). In most cases, a Material path is parallel safe if its rel is parallel safe. So far I haven't come up with a query plan in which the Material path is parallel-unsafe while the rel is parallel-safe.

> Sincerely yours,
> Yuuki Fujii
>
> --
> Yuuki Fujii
> Information Technology R&D Center, Mitsubishi Electric Corporation

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi Robert,

This patch was reviewed at the PgConf.dev Patch Review Workshop, and I have updated it according to the review advice. Since then there have been no further comments on it. The patch's status in the commitfest has stayed at "Needs review" for a long time; I'd like to know whether it is ready to be moved to "Ready for Committer".

Thanks.

--
Tender Wang
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2024年6月18日周二 17:24写道:

> On Tue, Jun 4, 2024 at 6:51 PM Tender Wang wrote:
> > Yeah, Richard commented on the v1 patch about JOIN_UNIQUE_INNER in [1]:
> >
> > * I think we should not consider materializing the cheapest inner path
> > if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> > unique-ify the inner path.
> >
> > match_unsorted_order() does not consider a material inner path if the
> > jointype is JOIN_UNIQUE_INNER, so the same logic is used here. I added
> > comments to explain why.
>
> I looked through the v4 patch and found an issue. For the plan diff:
>
> +         ->  Nested Loop
> +               ->  Parallel Seq Scan on prt1_p1 t1_1
> +               ->  Materialize
> +                     ->  Sample Scan on prt1_p1 t2_1
> +                           Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
> +                           Filter: (t1_1.a = a)
>
> This does not seem correct to me. The inner path is parameterized by
> the outer rel, in which case it does not make sense to add a Materialize
> node on top of it.

Yeah, you're right.

> I updated the patch to include a check in consider_parallel_nestloop
> ensuring that inner_cheapest_total is not parameterized by outerrel
> before materializing it. I also tweaked the comments, test cases and
> commit message.

Thanks for the work. It looks better now. I have changed the status from "Needs review" to "Ready for Committer" in the commitfest.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2024年7月12日周五 10:30写道: > On Sat, Jul 6, 2024 at 5:32 PM Richard Guo wrote: > > Here is a new rebase. > > > > I'm planning to push it next week, barring any objections. > > Pushed. > > Thanks > Richard > Thanks for pushing. -- Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I think old "sub-FK" should not be dropped, that will be violates foreign key constraint. For example : postgres=# insert into r values(1,1); INSERT 0 1 postgres=# ALTER TABLE r DETACH PARTITION r_1; ALTER TABLE postgres=# delete from p_1 where id = 1; DELETE 1 postgres=# select * from r_1; id | p_id +-- 1 |1 (1 row) If I run above SQLs on pg12.12, it will report error below: postgres=# delete from p_1 where id = 1; ERROR: update or delete on table "p_1" violates foreign key constraint "r_1_p_id_fkey1" on table "r_1" DETAIL: Key (id)=(1) is still referenced from table "r_1". Alvaro Herrera 于2023年7月31日周一 20:58写道: > On 2023-Jul-05, Jehan-Guillaume de Rorthais wrote: > > > ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); > > > > The old sub-FKs (below 18289) created in this table to enforce the action > > triggers on referenced partitions are not deleted when the table becomes > a > > partition. Because of this, we have additional and useless triggers on > the > > referenced partitions and we can not DETACH this partition on the > referencing > > side anymore: > > Oh, hm, interesting. Thanks for the report and patch. I found a couple > of minor issues with it (most serious one: nkeys should be 3, not 2; > also sysscan should use conrelid index), but I'll try and complete it so > that it's ready for 2023-08-10's releases. > > Regards > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > > >
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I think the code that determines whether a partition's FK is inherited or not is insufficient. For example, in this case, the foreign key r_1_p_id_fkey1 is not inherited from the parent.

If conform->conparentid (in DetachPartitionFinalize()) is valid, we should also recheck the confrelid (pg_constraint) field.

I try to fix this problem in the attached patch. Any thoughts?

Alvaro Herrera 于2023年8月3日周四 17:02写道:

> On 2023-Aug-03, tender wang wrote:
>
> > I think the old "sub-FK"s should not be dropped; that would violate the
> > foreign key constraint.
>
> Yeah, I've been playing more with the patch and it is definitely not
> doing the right things. Just eyeballing the contents of pg_trigger and
> pg_constraint for partitions added by ALTER...ATTACH shows that the
> catalog contents are inconsistent with those added by CREATE TABLE
> PARTITION OF.
>
> --
> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

From d84395c7321c201a78661de0b41b76e71ab10678 Mon Sep 17 00:00:00 2001
From: "tender.wang"
Date: Thu, 3 Aug 2023 17:23:06 +0800
Subject: [PATCH] Recheck whether a partition's foreign key is inherited from
 its parent.

Previously, an FK was treated as inherited if conparentid (pg_constraint)
was valid. That is not enough; we should also compare the confrelid field
to determine whether the FK is inherited.
---
 src/backend/commands/tablecmds.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 727f151750..1447433109 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -18556,6 +18556,8 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 		ForeignKeyCacheInfo *fk = lfirst(cell);
 		HeapTuple	contup;
 		Form_pg_constraint conform;
+		HeapTuple	parentTup;
+		Form_pg_constraint parentForm;
 		Constraint *fkconstraint;
 		Oid			insertTriggerOid,
 					updateTriggerOid;
@@ -18573,6 +18575,23 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent,
 			continue;
 		}
 
+		/* recheck the confrelid field */
+		if (OidIsValid(conform->conparentid))
+		{
+			parentTup = SearchSysCache1(CONSTROID, ObjectIdGetDatum(conform->conparentid));
+			if (!HeapTupleIsValid(parentTup))
+				elog(ERROR, "cache lookup failed for constraint %u", conform->conparentid);
+			parentForm = (Form_pg_constraint) GETSTRUCT(parentTup);
+			/* It is not an inherited foreign key */
+			if (parentForm->confrelid != conform->confrelid)
+			{
+				ReleaseSysCache(contup);
+				ReleaseSysCache(parentTup);
+				continue;
+			}
+			ReleaseSysCache(parentTup);
+		}
+
 		/* unset conparentid and adjust conislocal, coninhcount, etc. */
 		ConstraintSetParentConstraint(fk->conoid, InvalidOid, InvalidOid);
-- 
2.25.1
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Looking over DetachPartitionFinalize(), I found another bug:

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE
postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE
postgres=# CREATE TABLE r_1 (
postgres(#   id bigint PRIMARY KEY,
postgres(#   p_id bigint NOT NULL
postgres(# );
CREATE TABLE
postgres=# CREATE TABLE r (
postgres(#   id bigint PRIMARY KEY,
postgres(#   p_id bigint NOT NULL,
postgres(#   FOREIGN KEY (p_id) REFERENCES p (id)
postgres(# ) PARTITION BY list (id);
CREATE TABLE
postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE
postgres=# ALTER TABLE r DETACH PARTITION r_1;
ALTER TABLE
postgres=# insert into r_1 values(1,1);
ERROR:  insert or update on table "r_1" violates foreign key constraint "r_p_id_fkey"
DETAIL:  Key (p_id)=(1) is not present in table "p".

After the detach operation, r_1 is a normal relation, and the inherited foreign key 'r_p_id_fkey' should have been removed.

tender wang 于2023年8月3日周四 17:34写道:

> I think the code that determines whether a partition's FK is inherited or
> not is insufficient. For example, in this case, the foreign key
> r_1_p_id_fkey1 is not inherited from the parent.
>
> If conform->conparentid (in DetachPartitionFinalize()) is valid, we
> should also recheck the confrelid (pg_constraint) field.
>
> I try to fix this problem in the attached patch. Any thoughts?
>
> Alvaro Herrera 于2023年8月3日周四 17:02写道:
>
>> On 2023-Aug-03, tender wang wrote:
>>
>> > I think the old "sub-FK"s should not be dropped; that would violate the
>> > foreign key constraint.
>>
>> Yeah, I've been playing more with the patch and it is definitely not
>> doing the right things. Just eyeballing the contents of pg_trigger and
>> pg_constraint for partitions added by ALTER...ATTACH shows that the
>> catalog contents are inconsistent with those added by CREATE TABLE
>> PARTITION OF.
>>
>> --
>> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Looking over DetachPartitionFinalize() again, I found the root cause of why 'r_p_id_fkey' was not removed.

DetachPartitionFinalize() calls GetParentedForeignKeyRefs() to fetch the pg_constraint tuples that should be deleted, but it fails to find them. According to the comments, GetParentedForeignKeyRefs() returns tuples for constraints that reference me, not constraints through which I reference others.

I would try to fix this bug as follows:
i. ConstraintSetParentConstraint() should not be called in DetachPartitionFinalize(), because after conparentid is set to 0, we can no longer find the inherited foreign keys.
ii. Create another function like GetParentedForeignKeyRefs(), but with the ScanKey on the conrelid field rather than confrelid (see the sketch at the end of this message).

I quickly tested this solution in my environment, and it solves the above issue.

tender wang 于2023年8月4日周五 17:04写道:

> Looking over DetachPartitionFinalize(), I found another bug:
>
> postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
> CREATE TABLE
> postgres=# CREATE TABLE r_1 (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL
> postgres(# );
> CREATE TABLE
> postgres=# CREATE TABLE r (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL,
> postgres(#   FOREIGN KEY (p_id) REFERENCES p (id)
> postgres(# ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
> ALTER TABLE
> postgres=# ALTER TABLE r DETACH PARTITION r_1;
> ALTER TABLE
> postgres=# insert into r_1 values(1,1);
> ERROR:  insert or update on table "r_1" violates foreign key constraint
> "r_p_id_fkey"
> DETAIL:  Key (p_id)=(1) is not present in table "p".
>
> After the detach operation, r_1 is a normal relation, and the inherited
> foreign key 'r_p_id_fkey' should have been removed.
>
> tender wang 于2023年8月3日周四 17:34写道:
>
>> I think the code that determines whether a partition's FK is inherited
>> or not is insufficient. For example, in this case, the foreign key
>> r_1_p_id_fkey1 is not inherited from the parent.
>>
>> If conform->conparentid (in DetachPartitionFinalize()) is valid, we
>> should also recheck the confrelid (pg_constraint) field.
>>
>> I try to fix this problem in the attached patch. Any thoughts?
>>
>> Alvaro Herrera 于2023年8月3日周四 17:02写道:
>>
>>> On 2023-Aug-03, tender wang wrote:
>>>
>>> > I think the old "sub-FK"s should not be dropped; that would violate
>>> > the foreign key constraint.
>>>
>>> Yeah, I've been playing more with the patch and it is definitely not
>>> doing the right things. Just eyeballing the contents of pg_trigger and
>>> pg_constraint for partitions added by ALTER...ATTACH shows that the
>>> catalog contents are inconsistent with those added by CREATE TABLE
>>> PARTITION OF.
>>>
>>> --
>>> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
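A hedged sketch of idea (ii) above (my own illustration, not the committed fix): scan pg_constraint by conrelid -- constraints owned by the partition -- instead of by confrelid as GetParentedForeignKeyRefs() does. The backend headers (access/genam.h, access/table.h, catalog/pg_constraint.h, utils/fmgroids.h) are assumed; treat the function name and details as illustrative.

static List *
get_partition_owned_fkeys(Relation partRel)
{
	List	   *result = NIL;
	Relation	pg_constraint;
	SysScanDesc scan;
	ScanKeyData key[1];
	HeapTuple	tuple;

	/* one ScanKey on conrelid: constraints this relation owns */
	ScanKeyInit(&key[0],
				Anum_pg_constraint_conrelid,
				BTEqualStrategyNumber, F_OIDEQ,
				ObjectIdGetDatum(RelationGetRelid(partRel)));

	pg_constraint = table_open(ConstraintRelationId, AccessShareLock);
	scan = systable_beginscan(pg_constraint, ConstraintRelidTypidNameIndexId,
							  true, NULL, 1, key);

	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
	{
		Form_pg_constraint con = (Form_pg_constraint) GETSTRUCT(tuple);

		/* keep only inherited foreign keys: these are the ones to fix up */
		if (con->contype == CONSTRAINT_FOREIGN && OidIsValid(con->conparentid))
			result = lappend_oid(result, con->oid);
	}

	systable_endscan(scan);
	table_close(pg_constraint, AccessShareLock);
	return result;
}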
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
The foreign key still works even though the partition was detached. Is this behavior expected? I can't find the answer in the documentation. If it is expected behavior, please ignore the bug I reported a few days ago.

tender wang 于2023年8月4日周五 17:04写道:

> Looking over DetachPartitionFinalize(), I found another bug:
>
> postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
> CREATE TABLE
> postgres=# CREATE TABLE r_1 (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL
> postgres(# );
> CREATE TABLE
> postgres=# CREATE TABLE r (
> postgres(#   id bigint PRIMARY KEY,
> postgres(#   p_id bigint NOT NULL,
> postgres(#   FOREIGN KEY (p_id) REFERENCES p (id)
> postgres(# ) PARTITION BY list (id);
> CREATE TABLE
> postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
> ALTER TABLE
> postgres=# ALTER TABLE r DETACH PARTITION r_1;
> ALTER TABLE
> postgres=# insert into r_1 values(1,1);
> ERROR:  insert or update on table "r_1" violates foreign key constraint
> "r_p_id_fkey"
> DETAIL:  Key (p_id)=(1) is not present in table "p".
>
> After the detach operation, r_1 is a normal relation, and the inherited
> foreign key 'r_p_id_fkey' should have been removed.
>
> tender wang 于2023年8月3日周四 17:34写道:
>
>> I think the code that determines whether a partition's FK is inherited
>> or not is insufficient. For example, in this case, the foreign key
>> r_1_p_id_fkey1 is not inherited from the parent.
>>
>> If conform->conparentid (in DetachPartitionFinalize()) is valid, we
>> should also recheck the confrelid (pg_constraint) field.
>>
>> I try to fix this problem in the attached patch. Any thoughts?
>>
>> Alvaro Herrera 于2023年8月3日周四 17:02写道:
>>
>>> On 2023-Aug-03, tender wang wrote:
>>>
>>> > I think the old "sub-FK"s should not be dropped; that would violate
>>> > the foreign key constraint.
>>>
>>> Yeah, I've been playing more with the patch and it is definitely not
>>> doing the right things. Just eyeballing the contents of pg_trigger and
>>> pg_constraint for partitions added by ALTER...ATTACH shows that the
>>> catalog contents are inconsistent with those added by CREATE TABLE
>>> PARTITION OF.
>>>
>>> --
>>> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
[question] multi-column range partition prune
I have a range-partitioned table and the query below:

create table p_range(a int, b int) partition by range (a, b);
create table p_range1 partition of p_range for values from (1, 1) to (3, 3);
create table p_range2 partition of p_range for values from (4, 4) to (6, 6);

explain select * from p_range where b = 2;
                              QUERY PLAN
------------------------------------------------------------------------
 Append  (cost=0.00..76.61 rows=22 width=8)
   ->  Seq Scan on p_range1 p_range_1  (cost=0.00..38.25 rows=11 width=8)
         Filter: (b = 2)
   ->  Seq Scan on p_range2 p_range_2  (cost=0.00..38.25 rows=11 width=8)
         Filter: (b = 2)
(5 rows)

The EXPLAIN output shows that no partition pruning happened. gen_prune_steps_from_opexps() has a comment that explains this result:

/*
 * For range partitioning, if we have no clauses for the current key,
 * we can't consider any later keys either, so we can stop here.
 */
if (part_scheme->strategy == PARTITION_STRATEGY_RANGE &&
    clauselist == NIL)
    break;

But I want to know why we can't prune when the WHERE clause only constrains a later partition key. Thanks.
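A hedged illustration of the reason (my own example rows, not from the original mail): multi-column range bounds compare rows lexicographically, so a later key taken alone does not bound which partition a row lands in:

-- (2, 100) >= (1, 1) and (2, 100) < (3, 3) under row-wise comparison,
-- because 2 < 3 decides the ordering before b is even examined.  So this
-- row is stored in p_range1 even though b = 100 lies far outside 1..3:
insert into p_range values (2, 100);
select tableoid::regclass, * from p_range;
--  tableoid | a |  b
-- ----------+---+-----
--  p_range1 | 2 | 100

Hence a predicate on b alone, such as b = 2, cannot exclude any partition: rows with any b value can appear in every partition, and pruning on b is only possible once a is constrained as well.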
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年4月10日周三 21:58写道:

> It turns out that trying to close all holes that lead to columns marked
> not-null without a pg_constraint row is not possible within the ALTER
> TABLE framework, because it can happen outside it also. Consider this:
>
> CREATE DOMAIN dom1 AS integer;
> CREATE TABLE notnull_tbl (a dom1, b int, PRIMARY KEY (a, b));
> DROP DOMAIN dom1 CASCADE;
>
> In this case you'll end up with b having attnotnull=true and no
> constraint; and no amount of messing with tablecmds.c will fix it.

I tried the above case with my v4 patch [1], and it does not seem to produce the result you describe. But anyway, I now don't like updating other catalogs in RemoveConstraintById(), because it is unfriendly to callers that only want to remove a pg_constraint tuple while the function stealthily does more work.

> So I propose to instead allow those constraints, and treat them as
> second-class citizens. We allow dropping them with ALTER TABLE DROP NOT
> NULL, and we allow to create a backing full-fledged constraint with SET
> NOT NULL or ADD CONSTRAINT. So here's a partial crude initial patch to
> do that.

Hmm, this patch looks like the one in my first email in this thread, but my v1 patch seems a PoC at most.

> One thing missing here is pg_dump support. If you just dump this table,
> it'll end up with no constraint at all. That's obviously bad, so I
> propose we have pg_dump add a regular NOT NULL constraint for those, to
> avoid perpetuating the weird situation further.
>
> Another thing I wonder is whether I should use the existing
> set_attnotnull() instead of adding drop_orphaned_notnull(). Or we could
> just inline the code in ATExecDropNotNull, since it's small and
> self-contained.

I'd prefer to just inline the code in ATExecDropNotNull(); as you said, it's small and self-contained. In ATExecDropNotNull() we have already opened the pg_attribute table with RowExclusiveLock and fetched the tuple; all we have to do is set attnotnull = false and call CatalogTupleUpdate.

> --
> Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
> "Postgres is bloatware by design: it was built to house
> PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)

[1] https://www.postgresql.org/message-id/CAHewXNn_So7LUCxxxyDNfdvCQp1TnD3gTVECBZX2bT_nbPgraQ%40mail.gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he 于2024年4月11日周四 14:40写道:

> On Wed, Apr 10, 2024 at 2:10 PM jian he wrote:
> >
> > DROP TABLE if exists notnull_tbl2;
> > CREATE TABLE notnull_tbl2 (c0 int generated by default as IDENTITY, c1 int);
> > ALTER TABLE notnull_tbl2 ADD CONSTRAINT Q PRIMARY KEY(c0, c1);
> > ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null;
> > ALTER TABLE notnull_tbl2 DROP c1;
> > \d notnull_tbl2
>
> ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null;
> per the above execution order, this should error out?
>
> otherwise, which "not null" (attribute|constraint) anchors the "generated
> by default as identity" not-null property?
> "DROP c1" will drop the not-null property for "c0" and "c1".
> if "DROP CONSTRAINT notnull_tbl2_c0_not_null" does not error out, then
> "ALTER TABLE notnull_tbl2 DROP c1;"
> should either error out
> or transform "c0" from "c0 int generated by default as identity"
> to "c0 int"

I tried the above case on master and on master with Alvaro's v2 patch, and both work correctly; \d+ notnull_tbl2 shows the not-null for "c0".

> On Thu, Apr 11, 2024 at 1:23 AM Alvaro Herrera wrote:
> >
> > On 2024-Apr-10, Alvaro Herrera wrote:
> >
> > > One thing missing here is pg_dump support. If you just dump this table,
> > > it'll end up with no constraint at all. That's obviously bad, so I
> > > propose we have pg_dump add a regular NOT NULL constraint for those, to
> > > avoid perpetuating the weird situation further.
> >
> > Here's another crude patchset, this time including the pg_dump aspect.
>
> +DROP TABLE notnull_tbl1;
> +-- make sure attnotnull is reset correctly when a PK is dropped indirectly
> +CREATE TABLE notnull_tbl1 (c0 int, c1 int, PRIMARY KEY (c0, c1));
> +ALTER TABLE notnull_tbl1 DROP c1;
> +\d+ notnull_tbl1
> +                                   Table "public.notnull_tbl1"
> + Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
> +--------+---------+-----------+----------+---------+---------+--------------+-------------
> + c0     | integer |           | not null |         | plain   |              |
>
> is this not what we expected?
> "not null" for "c0" should now be false?
> am I missing something?

Yeah, this is now the expected behavior. Users can manually drop the not-null on "c0" if they want, and no error is reported.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he 于2024年4月12日周五 10:12写道:

> On Thu, Apr 11, 2024 at 10:48 PM Alvaro Herrera wrote:
> >
> > I'm still not ready with this -- still not convinced about the new AT
> > pass. Also, I want to add a test for the pg_dump behavior, and there's
> > an XXX comment.
>
> Now I am more confused...
>
> +-- make sure attnotnull is reset correctly when a PK is dropped indirectly,
> +-- or kept if there's a reason for that
> +CREATE TABLE notnull_tbl1 (c0 int, c1 int, PRIMARY KEY (c0, c1));
> +ALTER TABLE notnull_tbl1 DROP c1;
> +\d+ notnull_tbl1
> +                                   Table "public.notnull_tbl1"
> + Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
> +--------+---------+-----------+----------+---------+---------+--------------+-------------
> + c0     | integer |           |          |         | plain   |              |
> +DROP TABLE notnull_tbl1;
>
> For the same query, MySQL keeps "c0" not null:
> mysql https://dbfiddle.uk/_ltoU7PO
>
> For Postgres:
> https://dbfiddle.uk/ZHJXEqL1
> from 9.3 to 16 (click the link (https://dbfiddle.uk/ZHJXEqL1), then
> choose whichever version you like), all will keep the remaining column
> "c0" not null.
>
> The latest 0001-Better-handle-indirect-constraint-drops.patch makes c0
> attnotnull false.
>
> Previous patches:
> v2-0001-Handle-ALTER-.-DROP-NOT-NULL-when-no-pg_constrain.patch makes
> c0 attnotnull true.
> 0001-Correctly-reset-attnotnull-when-constraints-dropped-.patch makes
> c0 attnotnull false.

I'm not sure the SQL standard specifies what a database must do in this case. If the standard does not specify, then it depends on each database vendor's decision. Some people may like the not-null retained; others may like it removed. I think it is fine as long as people can drop the not-null, or add it back again, after dropping the PK.

On master, the not-null is reset when we drop a PK directly. I hope dropping a PK indirectly stays consistent with dropping it directly.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年4月11日周四 22:48写道:

> On 2024-Apr-11, Alvaro Herrera wrote:
>
> > Well, I think you were right that we should try to handle the situation
> > of unmarking attnotnull as much as possible, to decrease the chances
> > that the problematic situation occurs. That means, we can use the
> > earlier code to handle DROP COLUMN when it causes a PK to be dropped --
> > even though we still need to handle the situation of an attnotnull flag
> > set with no pg_constraint row. I mean, we still have to handle DROP
> > DOMAIN correctly (and there might be other cases that I haven't thought
> > about) ... but surely this is a much less common situation than the one
> > you reported. So I'll merge everything and post an updated patch.
>
> Here's roughly what I'm thinking. If we drop a constraint, we can still
> reset attnotnull in RemoveConstraintById(), but only after checking that
> it's not a generated column or a replica identity. If they are, we
> don't error out -- just skip the attnotnull update.
>
> Now, about the code to allow ALTER TABLE DROP NOT NULL in case there's
> no pg_constraint row, I think at this point it's mostly dead code,
> because it can only happen when you have a replica identity or generated
> column ... and the DROP NOT NULL should still prevent you from dropping
> the flag anyway. But the case can still arise, if you change the
> replica identity or ALTER TABLE ALTER COLUMN DROP DEFAULT, respectively.
>
> I'm still not ready with this -- still not convinced about the new AT
> pass.

Yeah, at first I was also hesitant. Two reasons convinced me.

In ATPostAlterTypeParse():

else if (cmd->subtype == AT_SetAttNotNull)
{
    /*
     * The parser will create AT_AttSetNotNull subcommands for
     * columns of PRIMARY KEY indexes/constraints, but we need
     * not do anything with them here, because the columns'
     * NOT NULL marks will already have been propagated into
     * the new table definition.
     */
}

The new table definition continues to use the old columns' not-null marks, so nothing is done here. If we reset the NOT NULL marks in RemoveConstraintById() when dropping a PK indirectly, we need to do something here or somewhere else.

Types other than AT_SetAttNotNull add an AT pass to tab->subcmds. Because the not-null must be added back before the index is re-added, there is no suitable AT pass in the current AlterTablePass, so a new AT pass ahead of AT_PASS_OLD_INDEX is needed.

Another reason is that this approach can reuse the ALTER TABLE framework to set the not-null; that looks simpler and better than hard-coding the re-installation of not-null in some function.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年4月19日周五 02:49写道: > On 2024-Apr-13, jian he wrote: > > > I wonder is there any incompatibility issue, or do we need to say > something > > about the new behavior when dropping a key column? > > Umm, yeah, maybe we should document it in ALTER TABLE DROP PRIMARY KEY > and in the release notes to note the different behavior. > > > only minor cosmetic issue: > > + if (unconstrained_cols) > > i would like change it to > > + if (unconstrained_cols != NIL) > > > > + foreach(lc, unconstrained_cols) > > we can change to > > + foreach_int(attnum, unconstrained_cols) > > per commit > > > https://git.postgresql.org/cgit/postgresql.git/commit/?id=14dd0f27d7cd56ffae9ecdbe324965073d01a9ff > > Ah, yeah. I did that, rewrote some comments and refined the tests a > little bit to ensure the pg_upgrade behavior is sane. I intend to get > this pushed tomorrow, if nothing ugly comes up. > The new patch looks good to me. > > CI run: https://cirrus-ci.com/build/5471117953990656 > > -- Tender Wang OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Andrey M. Borodin 于2024年4月8日周一 17:40写道:

> > On 27 Sep 2023, at 16:06, tender wang wrote:
> >
> > Do you have any comments or suggestions on this issue? Thanks.
>
> Hi Tender,
>
> there are some review comments in the thread that you might be interested in.
> I'll mark this [0] entry "Waiting on Author" and move it to the next CF.
>
> Thanks!
>
> Best regards, Andrey Borodin.
>
> [0] https://commitfest.postgresql.org/47/4549/

I have rebased onto master and fixed a plan diff case.

--
Tender Wang
OpenPie: https://en.openpie.com/

v3-0001-Support-materializing-inner-path-on-parallel-oute.patch
Description: Binary data
Re: First draft of PG 17 release notes
jian he 于2024年5月9日周四 18:00写道:

> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian wrote:
> >
> > I have committed the first draft of the PG 17 release notes; you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
>
> another potential incompatibility issue:
> ALTER TABLE DROP PRIMARY KEY
>
> see:
> https://www.postgresql.org/message-id/202404181849.6frtmajobe27%40alvherre.pgsql

Since Alvaro has reverted all changes to not-null constraints, there will be no potential incompatibility issue.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: struct RelOptInfo member relid comments
jian he 于2024年5月24日周五 10:58写道:

> hi
>
> typedef struct RelOptInfo
> {
>     ...
>     /*
>      * information about a base rel (not set for join rels!)
>      */
>     Index relid;
>     ...
> }
>
> imho, the above comment is not very helpful.
> we should say more about what kind of information relid says about a base rel?
>
> I don't know much about RelOptInfo, that's why I ask.

The fields in struct RelOptInfo between the comment "information about a base rel" and the comment "Information about foreign tables and foreign joins" are all about a base rel, and every field has its own comment. I think that's already helpful for understanding what information we need to optimize a base rel.

--
Tender Wang
OpenPie: https://en.openpie.com/
Re: why memoize is not used for correlated subquery
Pavel Stehule 于2024年5月28日周二 15:31写道:

> Hi
>
> My question is - does memoize support subqueries? And can be enhanced to
> support this exercise without LATERAL and optimization fences?

The commit message of the Memoize commit may answer your question:

"For now, the planner will only consider using a result cache for
parameterized nested loop joins. This works for both normal joins and
also for LATERAL type joins to subqueries. It is possible to use this
new node for other uses in the future. For example, to cache results
from correlated subqueries. However, that's not done here due to some
difficulties obtaining a distinct estimation on the outer plan to
calculate the estimated cache hit ratio. Currently we plan the inner
plan before planning the outer plan so there is no good way to know if
a result cache would be useful or not since we can't estimate the
number of times the subplan will be called until the outer plan is
generated."

git show b6002a796d

--
Tender Wang
OpenPie: https://en.openpie.com/
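To illustrate the distinction in the commit message: a correlated subquery can often be rewritten as a LATERAL join, which is planned as a parameterized nested loop and can therefore get a Memoize node. A sketch (the table and column names are made up, and whether Memoize is actually chosen depends on costs and statistics):

    -- hypothetical schema
    CREATE TABLE orders (id int, customer_id int);
    CREATE TABLE customers (id int PRIMARY KEY, name text);

    -- correlated subquery: Memoize is not considered for the subplan
    EXPLAIN SELECT o.id,
        (SELECT c.name FROM customers c WHERE c.id = o.customer_id)
    FROM orders o;

    -- LATERAL form: planned as a parameterized nested loop, which may
    -- get a Memoize node above the inner index scan
    EXPLAIN SELECT o.id, c.name
    FROM orders o,
        LATERAL (SELECT name FROM customers c WHERE c.id = o.customer_id) c;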
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Tomasz Rybak 于2024年5月31日周五 04:21写道:

> On Tue, 2024-04-23 at 16:59 +0800, Tender Wang wrote:
>
> [ cut ]
> >
> > I have rebased onto master and fixed a plan diff case.
>
> We (me, Paul Jungwirth, and Yuki Fujii) reviewed this patch
> at the PgConf.dev Patch Review Workshop.

Thanks for reviewing this patch.

> Here are our findings.
>
> Patch tries to allow for using materialization together
> with parallel subqueries.
> It applies cleanly on 8fea1bd5411b793697a4c9087c403887e050c4ac
> (current HEAD).
> Tests pass locally on macOS and Linux in VM under Windows.
> Tests are also green in cfbot (for last 2 weeks; they were
> red previously, probably because of need to rebase).
>
> Please add more tests. Especially please add some negative tests;
> current patch checks that it is safe to apply materialization. It would
> be helpful to add tests checking that materialization is not applied
> in both checked cases:
> 1. when inner join path is not parallel safe
> 2. when matpath is not parallel safe

I added a test case where the inner rel is not parallel safe. Actually, the matpath will not be created if the inner path is not parallel safe, so I didn't add a test case for the second scenario.

> This patch tries to apply materialization only when join type
> is not JOIN_UNIQUE_INNER. Comment mentions this, but does not
> explain why. So please either add comment describing reason for that
> or try enabling materialization in such a case.

Yeah, Richard commented on the v1 patch about JOIN_UNIQUE_INNER in [1]:

* I think we should not consider materializing the cheapest inner path
  if we're doing JOIN_UNIQUE_INNER, because in this case we have to
  unique-ify the inner path.

We don't consider a materialized inner path if the jointype is JOIN_UNIQUE_INNER in match_unsorted_outer(), so the logic here is the same as in match_unsorted_outer(). I added comments to explain why.

[1] https://www.postgresql.org/message-id/CAMbWs49LbQF_Z0iKPRPnTHfsRECT7M-4rF6ft5vpW1ARSpBkPA%40mail.gmail.com

--
Tender Wang
OpenPie: https://en.openpie.com/

v4-0001-Support-materializing-inner-path-on-parallel-oute.patch
Description: Binary data
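For readers who haven't opened the patch, the idea boils down to something like this (a sketch based on the discussion above, not the actual patch text):

    /*
     * In consider_parallel_nestloop(): besides the plain inner paths,
     * also try a Material node over the cheapest parallel-safe inner
     * path -- except for JOIN_UNIQUE_INNER, where the inner side must
     * be unique-ified first, so caching the raw inner rows is not valid.
     */
    if (jointype != JOIN_UNIQUE_INNER &&
        inner_cheapest_total != NULL &&
        inner_cheapest_total->parallel_safe &&
        enable_material)
    {
        Path *matpath = (Path *)
            create_material_path(innerrel, inner_cheapest_total);

        /* then, for each partial outer path: */
        try_partial_nestloop_path(root, joinrel, outerpath, matpath,
                                  pathkeys, jointype, extra);
    }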
ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3
Hi hackers,

I found $subject problem when using SQLancer.

How to repeat:
CREATE TEMP TABLE t0(c0 inet, c1 money, c2 TEXT);
CREATE TEMP TABLE IF NOT EXISTS t1(LIKE t0);
CREATE TEMP TABLE t2(c0 boolean , c1 DECIMAL NOT NULL UNIQUE);
CREATE TEMPORARY TABLE t3(LIKE t1);
CREATE VIEW v0(c0) AS (SELECT DISTINCT 0 FROM t3);
SELECT SUM(count) FROM (SELECT (((t1.c2)LIKE(t0.c2)||(((103556691)-(v0.c0)||(v0.c0)::INT as count FROM t0, ONLY t1 RIGHT OUTER JOIN ONLY t2 ON t2.c0 RIGHT OUTER JOIN v0 ON ((t2.c1)=(0.08182310538090898))) as res;

psql (16devel)
Type "help" for help.

postgres=# \d
Did not find any relations.
postgres=# CREATE TEMP TABLE t0(c0 inet, c1 money, c2 TEXT);
CREATE TABLE
postgres=# CREATE TEMP TABLE IF NOT EXISTS t1(LIKE t0);
CREATE TABLE
postgres=# CREATE TEMP TABLE t2(c0 boolean , c1 DECIMAL NOT NULL UNIQUE);
CREATE TABLE
postgres=# CREATE TEMPORARY TABLE t3(LIKE t1);
CREATE TABLE
postgres=# CREATE VIEW v0(c0) AS (SELECT DISTINCT 0 FROM t3);
NOTICE: view "v0" will be a temporary view
CREATE VIEW
postgres=# SELECT SUM(count) FROM (SELECT (((t1.c2)LIKE(t0.c2)||(((103556691)-(v0.c0)||(v0.c0)::INT as count FROM t0, ONLY t1 RIGHT OUTER JOIN ONLY t2 ON t2.c0 RIGHT OUTER JOIN v0 ON ((t2.c1)=(0.08182310538090898))) as res;
ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3

regards, tender wang
Re: Improve join_search_one_level readibilty (one line change)
謝東霖 于2023年6月3日周六 23:21写道:

> Hello hackers
>
> Attached is my first patch for PostgreSQL, which is a simple one-liner
> that I believe can improve the code.
>
> In the "join_search_one_level" function, I noticed that the variable
> "other_rels_list" always refers to "joinrels[1]" even when the (level
> == 2) condition is met. I propose changing:
>
> "other_rels_list = joinrels[level - 1]" to "other_rels_list = joinrels[1]"
>
> This modification aims to enhance clarity and avoid unnecessary
> instructions.

I guess the compiler can already make that code efficient, but from the point of view of code readability, I agree with you. As Julien Rouhaud said, it would be better to move the other_rels_list initialization out of the if statement and put it with the variable declaration.

> I would greatly appreciate any review and feedback on this patch as I
> am a newcomer to PostgreSQL contributions. Your input will help me
> improve and contribute effectively to the project.
>
> I have followed the excellent guide "How to submit a patch by email,
> 2023 edition" by Peter Eisentraut.
>
> Additionally, if anyone has any tips on ensuring that Gmail recognizes
> my attached patches as the "text/x-patch" MIME type when sending them
> from the Chrome client, I would be grateful for the advice.
>
> Or maybe the best practice is to use Git send-email ?
>
> Thank you for your time.
>
> Best regards
> Alex Hsieh
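For reference, the code shape under discussion looks roughly like this (simplified from join_search_one_level(); not the exact source, and `r` stands for the ListCell of the current outer loop):

    List     *other_rels_list = joinrels[1];   /* was joinrels[level - 1] in one branch */
    ListCell *other_rels;

    if (level == 2)             /* consider remaining initial rels */
        other_rels = lnext(other_rels_list, r);
    else
        other_rels = list_head(other_rels_list);

Since joinrels[level - 1] is exactly joinrels[1] when level == 2, both branches read from the same list, and hoisting the initialization to the declaration makes that obvious.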
Small code simplification
Hi,

Oversight in 7ff9afbbd; I think we can do it the same way for ATExecAddColumn().

--
Tender Wang

v1-0001-Small-code-simplification.patch
Description: Binary data
Re: Small code simplification
Richard Guo 于2024年8月22日周四 10:53写道:

> On Wed, Aug 21, 2024 at 6:25 PM Tender Wang wrote:
> > Oversight in 7ff9afbbd; I think we can do it the same way for ATExecAddColumn().
>
> LGTM. Pushed.

Thanks for pushing.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月22日周四 06:00写道:

> On 2024-Aug-19, Alvaro Herrera wrote:
>
> > I haven't pushed it yet, mostly because of being unsure about not doing
> > anything for the oldest branches (14 and back).
>
> Last night, after much mulling on this, it occurred to me that one easy
> way out of this problem for the old branches, without having to write
> more code, is to simply remove the constraint from the partition when
> it's detached (but only if they reference a partitioned relation). It's
> not a great solution, but at least we're no longer leaving bogus catalog
> entries around. That would be like the attached patch, which was cut
> from 14 and applies cleanly to 12 and 13. I'd throw in a couple of
> tests and call it a day.

I applied the v14 patch on branch REL_14_STABLE, ran this thread's scenario, and hit the error below.

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE r_1 (
  id bigint PRIMARY KEY,
  p_id bigint NOT NULL,
  FOREIGN KEY (p_id) REFERENCES p (id)
);
CREATE TABLE r (
  id bigint PRIMARY KEY,
  p_id bigint NOT NULL,
  FOREIGN KEY (p_id) REFERENCES p (id)
) PARTITION BY list (id);
ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE r DETACH PARTITION r_1;
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
ERROR: cache lookup failed for constraint 16400

I haven't looked into the details to find out what causes the above error.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Tender Wang 于2024年8月22日周四 11:19写道:

> Alvaro Herrera 于2024年8月22日周四 06:00写道:
>
>> On 2024-Aug-19, Alvaro Herrera wrote:
>>
>> > I haven't pushed it yet, mostly because of being unsure about not doing
>> > anything for the oldest branches (14 and back).
>>
>> Last night, after much mulling on this, it occurred to me that one easy
>> way out of this problem for the old branches, without having to write
>> more code, is to simply remove the constraint from the partition when
>> it's detached (but only if they reference a partitioned relation). It's
>> not a great solution, but at least we're no longer leaving bogus catalog
>> entries around. That would be like the attached patch, which was cut
>> from 14 and applies cleanly to 12 and 13. I'd throw in a couple of
>> tests and call it a day.
>
> I applied the v14 patch on branch REL_14_STABLE, ran this thread's scenario, and hit the error below.
> postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
> CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
> CREATE TABLE r_1 (
>   id bigint PRIMARY KEY,
>   p_id bigint NOT NULL,
>   FOREIGN KEY (p_id) REFERENCES p (id)
> );
> CREATE TABLE r (
>   id bigint PRIMARY KEY,
>   p_id bigint NOT NULL,
>   FOREIGN KEY (p_id) REFERENCES p (id)
> ) PARTITION BY list (id);
> ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
> ALTER TABLE r DETACH PARTITION r_1;
> CREATE TABLE
> CREATE TABLE
> CREATE TABLE
> CREATE TABLE
> ALTER TABLE
> ERROR: cache lookup failed for constraint 16400

I guess it is because of the cascaded drop: the constraint with oid 16400 has already been deleted. We could add a list that remembers the dropped constraints; if we find that the parent of a constraint is in that list, we skip it.

By the way, I ran the above SQL sequence on REL_14_STABLE without your patch. I didn't see any error reported, and running oidjoins.sql didn't report warnings. Am I missing something?

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月23日周五 02:41写道:

> On 2024-Aug-22, Tender Wang wrote:
>
> > I applied the v14 patch on branch REL_14_STABLE, ran this thread's
> > scenario, and hit the error below.
> > [...]
> > ERROR: cache lookup failed for constraint 16400
> >
> > I haven't looked into the details to find out what causes the above error.
>
> Right, we try to drop the constraint twice. We can dodge this by
> collecting all constraints to drop in the loop and process them in a
> single performMultipleDeletions, as in the attached v14-2.

Can we move the CommandCounterIncrement() in the "if (get_rel_relkind(fk->confrelid) == RELKIND_PARTITIONED_TABLE)" block closer to performMultipleDeletions()? The rest looks good to me.

> TBH I think it's a bit infuriating that we lose the constraint (which
> was explicitly declared) because of ATTACH/DETACH.

Agreed. Do you think it would be friendly to users if we added a hint informing them that the constraint was dropped?

--
Tender Wang
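For readers following along, the "collect, then delete once" approach looks roughly like this (a sketch of the idea, not the v14-2 patch itself):

    /*
     * Collect every constraint to drop, then delete them in one call,
     * so a constraint reachable twice is not dropped twice.
     */
    ObjectAddresses *objs = new_object_addresses();
    ListCell   *cell;

    foreach(cell, fks)
    {
        ForeignKeyCacheInfo *fk = (ForeignKeyCacheInfo *) lfirst(cell);
        ObjectAddress obj;

        if (get_rel_relkind(fk->confrelid) == RELKIND_PARTITIONED_TABLE)
        {
            ObjectAddressSet(obj, ConstraintRelationId, fk->conoid);
            add_exact_object_address(&obj, objs);
        }
    }
    performMultipleDeletions(objs, DROP_RESTRICT, 0);
    CommandCounterIncrement();   /* make the drops visible to later steps */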
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

Recently, I did some benchmark tests, mainly on TPC-H and TPC-DS. The TPC-H tests show no plan diffs, so I did not continue testing there. The TPC-DS (10GB) tests show plan diffs for 22 queries, listed below:
4.sql, 5.sql, 8.sql, 11.sql, 19.sql, 23.sql, 31.sql, 33.sql, 39.sql, 45.sql, 46.sql, 47.sql, 53.sql, 56.sql, 57.sql, 60.sql, 63.sql, 68.sql, 74.sql, 77.sql, 80.sql, 89.sql

I haven't looked at all of them; I just picked a few with simple plans (e.g. 19.sql, 45.sql). For example, on 19.sql eager aggregate pushdown doesn't yield a large gain, but shows a small performance regression.

I will continue to benchmark this feature.

[1] https://github.com/tenderwg/eager_agg

--
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月29日周四 10:46写道:

> On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > > I haven't looked at all of them; I just picked a few with simple plans (e.g. 19.sql, 45.sql).
> > > For example, on 19.sql eager aggregate pushdown doesn't yield a large gain, but shows a
> > > small performance regression.
> >
> > Yeah, this is one of the things I was worried about in my previous
> > reply to Richard. It would be worth Richard, or someone, probing into
> > exactly why that's happening. My fear is that we just don't have good
> > enough estimates to make good decisions, but there might well be
> > another explanation.
>
> It's great that we have a query to probe into. Your guess is likely
> correct: it may be caused by poor estimates.
>
> Tender, would you please help provide the outputs of
>
> EXPLAIN (COSTS ON, ANALYZE)
>
> on 19.sql with and without eager aggregation?

Yeah, in [1], 19_off.out and 19_on.out are the outputs of EXPLAIN (COSTS OFF, ANALYZE). I will run EXPLAIN (COSTS ON, ANALYZE) tests and upload the results later today.

[1] https://github.com/tenderwg/eager_agg

--
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月29日周四 10:46写道:

> On Wed, Aug 28, 2024 at 9:01 PM Robert Haas wrote:
> > On Tue, Aug 27, 2024 at 11:57 PM Tender Wang wrote:
> > > I haven't looked at all of them; I just picked a few with simple plans (e.g. 19.sql, 45.sql).
> > > For example, on 19.sql eager aggregate pushdown doesn't yield a large gain, but shows a
> > > small performance regression.
> >
> > Yeah, this is one of the things I was worried about in my previous
> > reply to Richard. It would be worth Richard, or someone, probing into
> > exactly why that's happening. My fear is that we just don't have good
> > enough estimates to make good decisions, but there might well be
> > another explanation.
>
> It's great that we have a query to probe into. Your guess is likely
> correct: it may be caused by poor estimates.
>
> Tender, would you please help provide the outputs of
>
> EXPLAIN (COSTS ON, ANALYZE)
>
> on 19.sql with and without eager aggregation?

I uploaded the EXPLAIN (COSTS ON, ANALYZE) output to [1]. I ran the same query three times and chose the result of the third run. You can check 19_off_explain.out and 19_on_explain.out.

[1] https://github.com/tenderwg/eager_agg

--
Tender Wang
Re: not null constraints, again
Alvaro Herrera 于2024年8月31日周六 11:59写道:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued. This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists. Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow. So here I've made it omit any constraints that
> underlie the primary key. This should be OK since you can't do much
> with those constraints while the PK is still there. If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases. It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.

Thanks for working on this again.

As for AT_PASS_OLD_COL_ATTRS, I didn't see any other code that uses it, so I don't think it's necessary.

I will take the time to look over the attached patch.

--
Tender Wang
Re: not null constraints, again
Alvaro Herrera 于2024年8月31日周六 11:59写道:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued. This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists. Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow. So here I've made it omit any constraints that
> underlie the primary key. This should be OK since you can't do much
> with those constraints while the PK is still there. If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases. It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.

The attached patch adds List *nnconstraints, which stores the not-null definitions, to struct CreateStmt. This makes me a little confused about List *constraints in struct CreateStmt. Actually, the constraints list stores CHECK constraints, and it would be better if the comments reflected that. Renaming it to List *ckconstraints seems more reasonable, but a lot of code that uses stmt->constraints would have to change.

Since AddRelationNewConstraints() can now add not-null column constraints, its comments should be tweaked a little: "All entries in newColDefaults will be processed. Entries in newConstraints will be processed only if they are CONSTR_CHECK type." Now the type of a new constraint may also be not-null.

If the column already has a not-null constraint and we add the same not-null constraint again, the code will call AdjustNotNullInheritance1() in AddRelationNewConstraints(). The comments before the call to AdjustNotNullInheritance1() in AddRelationNewConstraints() look confusing to me, because the constraint is not inherited.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
> CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
> CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
> CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
> CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
> CREATE TABLE fk_r ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
>   FOREIGN KEY (p_id) REFERENCES fk_p (id)
> ) PARTITION BY list (id);
> SET search_path TO fkpart12;
>
> INSERT INTO fk_p VALUES (1);
>
> ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);
>
> ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
> \d fk_r_1
>
> INSERT INTO fk_r VALUES (1,1);
>
> ALTER TABLE fk_r DETACH PARTITION fk_r_1;
> \d fk_r_1
>
> INSERT INTO c_1 VALUES (2,2); -- fails as EXPECTED
> DELETE FROM p; -- should fail but was buggy
>
> ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
> \d fk_r_1
>
> 3. Self referencing FK between partitions [3]
>
> You added to your commit message:
>
> verify: 20230707175859.17c91538@karst
>
> I'm not sure what the "verify" flag means. Unfortunately, your patch doesn't
> help on this topic.
>
> This bug really needs more discussion and design consideration. I have
> thought about this problem and haven't found any solution that doesn't
> involve breaking the current core behavior. It really looks like an
> impossible bug to fix without dropping the constraint itself, on both
> sides. Maybe the only sane behavior would be to forbid detaching the
> partition if it would break the constraint.
>
> But let's discuss this on the related thread, should we?
>
> Thank you for reading me all the way down to the bottom!
>
> Regards,
>
> [1] https://www.postgresql.org/message-id/20230705233028.2f554f73%40karst
> [2] https://www.postgresql.org/message-id/20230420144344.40744130%40karst
> [3] https://www.postgresql.org/message-id/20230707175859.17c91538%40karst

--
Tender Wang
Re: not null constraints, again
Alvaro Herrera 于2024年8月31日周六 11:59写道:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued. This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists. Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow. So here I've made it omit any constraints that
> underlie the primary key. This should be OK since you can't do much
> with those constraints while the PK is still there. If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases. It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.

The test case in constraints.sql, as below:

CREATE TABLE notnull_tbl1 (a INTEGER NOT NULL NOT NULL);
                                     ^^
There are two not-null definitions; is the second one redundant?

When we drop a column's not-null constraint, we enter ATExecDropNotNull(), which calls findNotNullConstraint() to get the constraint tuple. We already have attnum before the call to findNotNullConstraint(); can we directly call findNotNullConstraintAttnum()?

In RemoveConstraintById(), I see the comments below:
"For not-null and primary key constraints, obtain the list of columns affected, so that we can reset their attnotnull flags below."
However, there is no related code that reflects the above comment.

--
Thanks,
Tender Wang
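In code form, the suggestion amounts to roughly this (a sketch only; whether the surrounding code permits the shortcut is for the patch author to judge):

    /* We already know the column number in ATExecDropNotNull(), so look
     * the constraint up by attnum directly instead of resolving the
     * column name again: */
    conTup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum);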
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Jehan-Guillaume de Rorthais 于2024年9月3日周二 17:26写道:

> Hi Tender,
>
> On Tue, 3 Sep 2024 10:16:44 +0800 Tender Wang wrote:
>
> > Jehan-Guillaume de Rorthais 于2024年9月3日周二 05:02写道:
> […]
> > > * Constraint & trigger catalog cleanup [1] (this thread)
> > > * FK broken after DETACH [2]
> > > * Maintenance consideration about self referencing FK between partitions [3]
> >
> > The third issue has been fixed, and codes have been pushed. Because of my
> > misunderstanding, it should not be here.
>
> I just retried the SQL scenario Guillaume gave on both master and master with
> Alvaro's patch. See:
>
> https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com
>
> It doesn't seem fixed at all. Maybe you are mixing up with another
> thread/issue?

Sorry, I mixed up the third issue with the issue Alexander reported. Please ignore the above noise.

> > > 0. Splitting in two commits
> > >
> > > […]
> > >
> > > Unfortunately, this discussion about the first bug slipped to the second
> > > one when Tender stumbled on this bug as well and reported it. But, both
> > > bugs can be triggered independently, and have distinct fixes.
> >
> > It's ok that these two issues are fixed together. It is because current
> > codes don't handle better when the referenced side is the partition table.
>
> I don't feel the same. Mixing two discussions and fixes together in the same
> thread and commit makes life harder.

Hmm, these two issues are closely related. Anyway, I think it's OK to fix the two issues together.

> Last year, when you found the other bug, I tried to point you to the
> right thread to avoid mixing subjects:
> https://www.postgresql.org/message-id/20230810170345.26e41b05%40karst
>
> If I wrote about the third (non fixed) issue yesterday, it's just because
> Alvaro included a reference to it in his commit message. But I think we should
> really keep up with this issue on its own, dedicated discussion:
>
> https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com

Thanks for the reminder. I haven't taken the time to look into the third issue yet. Please give me some time to analyze it.

--
Thanks,
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

The v11-0002 patch fails to apply on HEAD (6c2b5edecc) with git am:

tender@iZ2ze6la2dizi7df9q3xheZ:/workspace/postgres$ git am v11-0002-Implement-Eager-Aggregation.patch
Applying: Implement Eager Aggregation
error: patch failed: src/test/regress/parallel_schedule:119
error: src/test/regress/parallel_schedule: patch does not apply
Patch failed at 0001 Implement Eager Aggregation
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

--
Thanks,
Tender Wang
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

I reviewed the v11 patch set, and here are a few of my thoughts:

1. In setup_eager_aggregation(), before calling create_agg_clause_infos(), it does some checks to see whether eager aggregation is applicable. Can we move those checks into a function, for example can_eager_agg(), like can_partial_agg() does?

2. I found that outside of joinrel.c we always use IS_DUMMY_REL, but in joinrel.c Tom always uses is_dummy_rel(); other committers use IS_DUMMY_REL.

3. The attached patch does not consider FDWs when creating a path for grouped_rel or grouped_join. Do we need to think about FDWs?

I haven't finished reviewing the patch set. I will continue to study this feature.

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

When I reviewed Jehan-Guillaume's v2 patch, I found the code below, which needs a little tweak. In DetachPartitionFinalize():

/*
 * If the referenced side is partitioned (which we know because our
 * parent's constraint points to a different relation than ours) then
 * we must, in addition to the above, create pg_constraint rows that
 * point to each partition, each with its own action triggers.
 */
if (parentConForm->conrelid != conform->conrelid)

I found that the above IF is always true, regardless of whether the referenced side is partitioned. Although find_all_inheritors() returns an empty children list when the referenced side is not partitioned, we could avoid a lot of useless work. How about this instead:

if (get_rel_relkind(conform->confrelid) == RELKIND_PARTITIONED_TABLE)

--
Thanks,
Tender Wang
Re: not null constraints, again
jian he 于2024年9月9日周一 16:31写道:

> On Mon, Sep 2, 2024 at 6:33 PM Tender Wang wrote:
> >
> > The attached patch adds List *nnconstraints, which stores the not-null
> > definitions, to struct CreateStmt. This makes me a little confused about
> > List *constraints in struct CreateStmt. Actually, the constraints list
> > stores CHECK constraints, and it would be better if the comments
> > reflected that. Renaming it to List *ckconstraints seems more reasonable,
> > but a lot of code that uses stmt->constraints would have to change.
>
> hi.
> seems you forgot to attach the patch?
> I also noticed this minor issue.
> I have no preference for renaming it to List *ckconstraints.
> +1 for better comments. maybe reword to
>
> List *constraints; /* CHECK constraints (list of Constraint nodes) */

I just gave advice; whether it is accepted or not is up to Alvaro. If Alvaro agrees with the advice, he will post a new patch, and we can continue to review it. If Alvaro disagrees, he doesn't need to change the current patch. I think this way is more straightforward for others who will review this feature.

> On Tue, Sep 3, 2024 at 3:17 PM Tender Wang wrote:
> >
> > The test case in constraints.sql, as below:
> > CREATE TABLE notnull_tbl1 (a INTEGER NOT NULL NOT NULL);
> >                                      ^^
> > There are two not-null definitions; is the second one redundant?
>
> hi.
> i think this is ok. please see
> function transformColumnDefinition and variable saw_nullable.

Yeah, it is OK.

> we need to make sure this:
> CREATE TABLE notnull_tbl3 (a INTEGER NULL NOT NULL);
> fails.
>
> of course, it's also OK to do this:
> CREATE TABLE notnull_tbl3 (a INTEGER NULL NULL);

--
Thanks,
Tender Wang
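Summarizing the accepted and rejected combinations as runnable SQL (behavior as described above, enforced by transformColumnDefinition's saw_nullable check):

    CREATE TABLE notnull_tbl1 (a INTEGER NOT NULL NOT NULL);  -- accepted: redundant but consistent
    CREATE TABLE notnull_tbl2 (a INTEGER NULL NULL);          -- accepted: redundant but consistent
    CREATE TABLE notnull_tbl3 (a INTEGER NULL NOT NULL);      -- ERROR: conflicting NULL/NOT NULL declarations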
Re: Eager aggregation, take 3
Richard Guo 于2024年8月21日周三 15:11写道:

> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
> > I had a self-review of this patchset and made some refactoring,
> > especially to the function that creates the RelAggInfo structure for a
> > given relation. While there were no major changes, the code should
> > now be simpler.
>
> I found a bug in v10 patchset: when we generate the GROUP BY clauses
> for the partial aggregation that is pushed down to a non-aggregated
> relation, we may produce a clause with a tleSortGroupRef that
> duplicates one already present in the query's groupClause, which would
> cause problems.
>
> Attached is the updated version of the patchset that fixes this bug
> and includes further code refactoring.

I continue to review the v11 patches. Here are some of my thoughts.

1. In make_one_rel(), we have the code below:

/*
 * Build grouped base relations for each base rel if possible.
 */
setup_base_grouped_rels(root);

As far as I know, each base rel has at most one grouped base relation, if any. The comment could be changed to "Build a grouped base relation for each base rel if possible."

2. According to the comments of generate_grouped_paths(), we may generate paths for a grouped relation on top of paths of a join relation. So the "rel_plain" argument in generate_grouped_paths() may be confusing; "plain" usually means "base rel". How about renaming rel_plain to input_rel?

3. In create_partial_grouping_paths(), the partially_grouped_rel could have already been created due to eager aggregation. If partially_grouped_rel exists, its reltarget has already been created, so do we still need the logic below?

/*
 * Build target list for partial aggregate paths. These paths cannot just
 * emit the same tlist as regular aggregate paths, because (1) we must
 * include Vars and Aggrefs needed in HAVING, which might not appear in
 * the result tlist, and (2) the Aggrefs must be set in partial mode.
 */
partially_grouped_rel->reltarget =
    make_partial_grouping_target(root, grouped_rel->reltarget,
                                 extra->havingQual);

--
Thanks,
Tender Wang
Re: Eager aggregation, take 3
Tender Wang 于2024年9月4日周三 11:48写道:

> Richard Guo 于2024年8月21日周三 15:11写道:
>
>> On Fri, Aug 16, 2024 at 4:14 PM Richard Guo wrote:
>> > I had a self-review of this patchset and made some refactoring,
>> > especially to the function that creates the RelAggInfo structure for a
>> > given relation. While there were no major changes, the code should
>> > now be simpler.
>>
>> I found a bug in v10 patchset: when we generate the GROUP BY clauses
>> for the partial aggregation that is pushed down to a non-aggregated
>> relation, we may produce a clause with a tleSortGroupRef that
>> duplicates one already present in the query's groupClause, which would
>> cause problems.
>>
>> Attached is the updated version of the patchset that fixes this bug
>> and includes further code refactoring.
>
> The v11-0002 patch fails to apply on HEAD (6c2b5edecc) with git am:
>
> tender@iZ2ze6la2dizi7df9q3xheZ:/workspace/postgres$ git am v11-0002-Implement-Eager-Aggregation.patch
> Applying: Implement Eager Aggregation
> error: patch failed: src/test/regress/parallel_schedule:119
> error: src/test/regress/parallel_schedule: patch does not apply
> Patch failed at 0001 Implement Eager Aggregation
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".

Since the MERGE/SPLIT partition feature has been reverted, the tests *partition_merge* and *partition_split* should be removed from parallel_schedule. After doing that, the 0002 patch applies.

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Hi Alvaro,

Recently, Alexander reported the same issue in [1], and before that the same issue was reported in [2]. So I tried to rework those issues.

In my last email on this thread, I said:

"I slightly modified the previous patch, but I didn't add a test case, because I found another issue. After doing ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); I ran oidjoins.sql and got warnings as below:
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING: FK VIOLATION IN pg_trigger({tgparentid}): ("(0,3)",16401)
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING: FK VIOLATION IN pg_trigger({tgparentid}): ("(0,4)",16402)"

And I gave the explanation:

"The two trigger tuples are deleted in tryAttachPartitionForeignKey, called by CloneFkReferencing.
/*
 * Looks good! Attach this constraint. The action triggers in the new
 * partition become redundant -- the parent table already has equivalent
 * ones, and those will be able to reach the partition. Remove the ones
 * in the partition. We identify them because they have our constraint
 * OID, as well as being on the referenced rel.
 */"

I tried to fix the above FK violation. I have two ideas:

i. Do not remove the redundant triggers; then, when detaching the partition, the action triggers on the referenced side would be created again. I have considered this situation.
ii. Still remove the redundant triggers, and remove the child action triggers too. If we go this way, should we create action triggers recursively on the referenced side when detaching a partition?

I can't decide which one is better, and I'm not sure whether keeping this FK VIOLATION would cause problems. I rebased and am sending a v3 patch, which only fixes the NOT FOUND INSERT CHECK TRIGGER issue.

[1] https://www.postgresql.org/message-id/18541-628a61bc267cd2d3%40postgresql.org
[2] https://www.postgresql.org/message-id/GVAP278MB02787E7134FD691861635A8BC9032%40GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM

--
Tender Wang

v3-0001-Fix-partition-detach-issue.patch
Description: Binary data

v3-0002-Add-test-case.patch
Description: Binary data
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年7月19日周五 21:18写道:

> Hello,
>
> I think the fix for the check triggers should be as the attached. Very
> close to what you did, but you were skipping some operations that needed
> to be kept. AFAICS this patch works correctly for the posted cases.

After applying the attached patch, r_1_p_id_fkey1 has redundant action triggers, as below:

postgres=# select oid, conname, contype, conrelid, conindid, conparentid, confrelid, conislocal, coninhcount, connoinherit from pg_constraint where oid = 16402;
  oid  |    conname     | contype | conrelid | conindid | conparentid | confrelid | conislocal | coninhcount | connoinherit
-------+----------------+---------+----------+----------+-------------+-----------+------------+-------------+--------------
 16402 | r_1_p_id_fkey1 | f       |    16394 |    16392 |           0 |     16389 | t          |           0 | f
(1 row)

postgres=# select oid, tgrelid, tgparentid, tgconstrrelid, tgconstrindid, tgconstraint from pg_trigger where tgconstraint = 16402;
  oid  | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
-------+---------+------------+---------------+---------------+--------------
 16403 |   16389 |      16400 |         16394 |         16392 |        16402
 16404 |   16389 |      16401 |         16394 |         16392 |        16402
 16422 |   16389 |          0 |         16394 |         16392 |        16402
 16423 |   16389 |          0 |         16394 |         16392 |        16402
(4 rows)

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年7月19日周五 21:18写道:

> I find this pair of queries useful; they show which constraints exist
> and which triggers belong to each. We need to make the constraints and
> triggers match after a detach right as things would be if the
> just-detached partition were an individual table having the same foreign
> key.

I can't find the useful queries in your last email. Could you provide them? Thanks.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Junwang Zhao 于2024年7月26日周五 14:57写道:

> There is a bug report[0] Tender comments might be the same
> issue as this one, but I tried Alvaro's and my patch; neither
> could solve that problem. I did not try Tender's earlier patch
> though. I post the test script below in case you are interested.

My earlier patch should handle the case Alexander reported, but I did not do more testing. I'm not sure whether or not there is a mismatch between pg_constraint and pg_trigger.

I agree with what Alvaro said, that it "requires a much more invasive solution".

--
Tender Wang
Re: Fixed small typo in bufpage.h
Senglee Choi 于2024年8月5日周一 09:24写道: > Fixed a minor typo in src/include/storage/bufpage.h. > Please let me know if any further action is required. > Good catch. +1 -- Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

Thanks for your work. I will take some time to look at it in detail.

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

I did a lot of testing; no errors were reported, and running oidjoins.sql produced no warnings.

+1

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and my patch; neither could solve
> > > that problem. I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agree with what Alvaro said, that it "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]). If you can double-check, I would very
> much appreciate that. Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.
>
> As I understand, this fix needs to be applied all the way back to 12,
> because the overall functionality is that old. However, in branches 14
> and back, the patch doesn't apply cleanly, because of the changes we
> made in commit f4566345cf40 :-( I'm tempted to fix it in branches 15,
> 16, 17, master now and potentially backpatch later, to avoid dragging
> things along further. It's taken long enough already.

I haven't seen this patch on master. Is there something that we forgot to handle?

--
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年8月20日周二 10:25写道:

> On 2024-Aug-20, Tender Wang wrote:
>
> > > As I understand, this fix needs to be applied all the way back to 12,
> > > because the overall functionality is that old. However, in branches 14
> > > and back, the patch doesn't apply cleanly, because of the changes we
> > > made in commit f4566345cf40 :-( I'm tempted to fix it in branches 15,
> > > 16, 17, master now and potentially backpatch later, to avoid dragging
> > > things along further. It's taken long enough already.
> >
> > I haven't seen this patch on master. Is there something that we forgot
> > to handle?
>
> I haven't pushed it yet, mostly because of being unsure about not doing
> anything for the oldest branches (14 and back).

I only wrote code and tests on master. I'm not sure how complicated it would be to fix this issue on branches 14 and back.

--
Tender Wang
Re: Improve node type forward reference
Peter Eisentraut 于2024年10月15日周二 15:02写道:

> On 14.10.24 23:28, Nathan Bossart wrote:
> > On Mon, Oct 14, 2024 at 09:47:59AM +0200, Peter Eisentraut wrote:
> >> But we can do this better by using an incomplete struct, like
> >>
> >> struct Query *viewQuery ...;
> >>
> >> That way, everything has the correct type and fewer casts are required. This
> >> technique is already used elsewhere in node type definitions.

Agreed. The attached patches LGTM.

> > I noticed that the examples in parsenodes.h are for structs defined within
> > the same file. If the struct is defined in a separate file, I guess you
> > might need to include another header file wherever it is used, but that
> > doesn't seem too bad.
>
> No, you can leave the struct incomplete. You only need to provide its
> full definition (= include the other header file) if you actually want
> to access the struct's fields.

Yeah. There are many examples; for example, in struct RelOptInfo:

/* use "struct FdwRoutine" to avoid including fdwapi.h here */
struct FdwRoutine *fdwroutine pg_node_attr(read_write_ignore);

--
Thanks,
Tender Wang
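To spell out the technique with a self-contained sketch (the node type name here is illustrative, not the actual patch hunk):

    /* header A: only needs a pointer, so an incomplete type suffices */
    struct Query;                     /* forward declaration; no header include */

    typedef struct ExampleNode
    {
        struct Query *viewQuery;      /* correctly typed; no casts at use sites */
    } ExampleNode;

    /* A .c file that dereferences node->viewQuery must include the header
     * that fully defines struct Query; every other consumer stays decoupled. */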
Remove an unnecessary check on semijoin_target_ok() on postgres_fdw.c
Hi,

While debugging the FDW join pushdown code, I found the code below in semijoin_target_ok():

if (bms_is_member(var->varno, innerrel->relids) &&
    !bms_is_member(var->varno, outerrel->relids))

As far as I know, if a Var belongs to the innerrel of a joinrel, it cannot also belong to the outerrel. So if bms_is_member(var->varno, innerrel->relids) returns TRUE, then !bms_is_member(var->varno, outerrel->relids) must also be TRUE. And if bms_is_member(var->varno, innerrel->relids) returns FALSE, !bms_is_member(var->varno, outerrel->relids) will not be evaluated because of short-circuit evaluation. So I think we can remove "!bms_is_member(var->varno, outerrel->relids)" from the if condition.

Any thoughts?

--
Thanks,
Tender Wang

0001-Remove-an-unnecessary-check-as-Var-can-only-belong-t.patch
Description: Binary data
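The simplification, side by side (based on the description above; not the exact patch hunk):

    /* before */
    if (bms_is_member(var->varno, innerrel->relids) &&
        !bms_is_member(var->varno, outerrel->relids))

    /* after: the inner and outer relids of a join are disjoint,
     * so the second test can never change the outcome */
    if (bms_is_member(var->varno, innerrel->relids))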
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年10月18日周五 22:52写道:

> On 2024-Sep-26, Jehan-Guillaume de Rorthais wrote:
>
> > REL_14_STABLE backport doesn't seem trivial, so I'll wait for some
> > feedback, review & decision before going further down in backpatching.
>
> Hi, thanks for these patches. I have made some edits of my own. In the
> end I decided that I quite liked the new structure of the
> addFkConstraint() function and friends. I added a bunch of comments and
> changed names somewhat. Also, I think the benefit of the refactoring is
> more obvious when all patches are squashed together, so I did that.
>
> For branch 14, I opted to make it delete the constraints on detach.
> This isn't ideal but unless somebody wants to spend a lot more time on
> this, it seems the best we can do. Leaving broken constraints around
> seems worse. The patch for 14 doesn't apply to 13 sadly. I didn't have
> time to verify why, but it seems there's some rather larger conflict in
> one spot.
>
> I hope to be able to get these pushed over the weekend. That would give
> us a couple of weeks before the November releases (but my availability
> in those two weeks might be spotty.)
>
> I spent some time going through your test additions and ended up with
> your functions in this form:
>
> -- displays constraints in schema fkpart12
> CREATE OR REPLACE FUNCTION
> fkpart12_constraints(OUT conrel regclass, OUT conname name,
>                      OUT confrelid regclass, OUT conparent text)
> RETURNS SETOF record AS $$
> WITH RECURSIVE arrh AS (
>    SELECT oid, conrelid, conname, confrelid, NULL::name AS conparent
>      FROM pg_constraint
>     WHERE connamespace = 'fkpart12'::regnamespace AND
>           contype = 'f' AND conparentid = 0
>    UNION ALL
>    SELECT c.oid, c.conrelid, c.conname, c.confrelid,
>           (pg_identify_object('pg_constraint'::regclass, arrh.oid, 0)).identity
>      FROM pg_constraint c
>      JOIN arrh ON c.conparentid = arrh.oid
> ) SELECT conrelid::regclass, conname, confrelid::regclass, conparent
>     FROM arrh
>    ORDER BY conrelid::regclass::text, conname;
> $$
> LANGUAGE SQL;
>
> -- displays triggers in tables in schema fkpart12
> CREATE OR REPLACE FUNCTION
> fkpart12_triggers(OUT tablename regclass, OUT constr text,
>                   OUT samefunc boolean, OUT parent text)
> RETURNS SETOF record AS $$
> WITH RECURSIVE arrh AS (
>    SELECT t.oid, t.tgrelid::regclass as tablename, tgname,
>           t.tgfoid::regproc as trigfn,
>           (pg_identify_object('pg_constraint'::regclass, c.oid, 0)).identity as constr,
>           NULL::bool as samefunc,
>           NULL::name AS parent
>      FROM pg_trigger t
>      LEFT JOIN pg_constraint c ON c.oid = t.tgconstraint
>     WHERE (SELECT relnamespace FROM pg_class WHERE oid = t.tgrelid) =
>           'fkpart12'::regnamespace
>           AND c.contype = 'f' AND t.tgparentid = 0
>    UNION ALL
>    SELECT t2.oid, t2.tgrelid::regclass as tablename, t2.tgname,
>           t2.tgfoid::regproc as trigfn,
>           (pg_identify_object('pg_constraint'::regclass, c2.oid, 0)).identity,
>           arrh.trigfn = t2.tgfoid as samefunc,
>           replace((pg_identify_object('pg_trigger'::regclass, t2.tgparentid, 0)).identity,
>                   t2.tgparentid::text, 'TGOID')
>      FROM pg_trigger t2
>      LEFT JOIN pg_constraint c2 ON c2.oid = t2.tgconstraint
>      JOIN arrh ON t2.tgparentid = arrh.oid
> ) SELECT tablename, constr, samefunc, parent
>     FROM arrh
>    ORDER BY tablename::text, constr;
> $$
> LANGUAGE SQL;
>
> However, in the end I think this is a very good technique to verify that
> the fix works correctly, but it's excessive to include these results in
> the tests forevermore. So I've left them out for now. Maybe we should
> reconsider on the older branches?
Hi,

I looked at the patch on master; I gave a small comment in [1].

I have reconsidered the code, and I suspect that we no longer need the if statement below:

/*
 * If the referenced side is partitioned (which we know because our
 * parent's constraint points to a different relation than ours) then
 * we must, in addition to the above, create pg_constraint rows that
 * point to each partition, each with its own action triggers.
 */
if (parentConForm->conrelid != conform->conrelid)
{
    ...
}

The above condition is always true according to my tests; I haven't been able to come up with a counter-example. Any thoughts?

[1] https://www.postgresql.org/message-id/CAHewXNkuU2V7GfgFkwd265RJ99%2BBfJueNEZhrHSk39o3thqxNA%40mail.gmail.com

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alvaro Herrera 于2024年10月22日周二 05:52写道:

> On 2024-Oct-21, Tender Wang wrote:
>
> > I suspect that we no longer need the if statement below.
> > /*
> >  * If the referenced side is partitioned (which we know because our
> >  * parent's constraint points to a different relation than ours) then
> >  * we must, in addition to the above, create pg_constraint rows that
> >  * point to each partition, each with its own action triggers.
> >  */
> > if (parentConForm->conrelid != conform->conrelid)
> > {
> > ...
> > }
> >
> > The above condition is always true according to my tests; I haven't
> > been able to come up with a counter-example.
>
> You're right, this is useless, we can remove the 'if' line. I kept the
> block though, to have a place for all those local variable declarations
> (otherwise the code looks messier than it needs to).

Agreed.

> I also noticed that addFkRecurseReferenced() is uselessly taking a List
> **wqueue argument but doesn't use it, so I removed it (as fallout, other
> routines don't need it either, especially DetachPartitionFinalize). I
> added some commentary on the creation of dependencies in
> addFkConstraint().

Yeah, I also noticed this before.

> I also include a few other cosmetic changes; just comment layout
> changes.
>
> This time I attach the patch for master only; the others have changed
> identically. 14 is unchanged from before. I figured that the conflict
> from 14 to 13 was trivial to resolve -- it was just because of DETACH
> CONCURRENTLY, so some code moves around, but that's all that happens.

I don't see any other problems with the newest patch.

--
Thanks,
Tender Wang
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Alexander Lakhin 于2024年10月24日周四 22:00写道:

> Hello Alvaro,
>
> 22.10.2024 17:32, Alvaro Herrera wrote:
> > Yeah. I pushed these patches finally, thanks!
>
> Please look at a new anomaly introduced with 53af9491a. When running the
> following script:
> CREATE TABLE t (a int, b int, PRIMARY KEY (a, b));
> CREATE TABLE pt (a int, b int, FOREIGN KEY (a, b) REFERENCES t(a, b))
> PARTITION BY LIST (a);
>
> CREATE TABLE tp1 (x int, a int, b int);
> ALTER TABLE tp1 DROP COLUMN x;
>
> ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);
>
> ALTER TABLE pt DETACH PARTITION tp1;
>
> I get a memory access error detected by Valgrind:
> 2024-10-24 12:05:04.645 UTC [1079077] LOG: statement: ALTER TABLE pt DETACH PARTITION tp1;
> ==00:00:00:07.887 1079077== Invalid read of size 2
> ==00:00:00:07.887 1079077==    at 0x4A61DD: DetachPartitionFinalize (tablecmds.c:19545)
> ==00:00:00:07.887 1079077==    by 0x4A5C11: ATExecDetachPartition (tablecmds.c:19386)
> ==00:00:00:07.887 1079077==    by 0x48561E: ATExecCmd (tablecmds.c:5540)
> ==00:00:00:07.887 1079077==    by 0x4845DE: ATRewriteCatalogs (tablecmds.c:5203)
> ==00:00:00:07.887 1079077==    by 0x4838EC: ATController (tablecmds.c:4758)
> ==00:00:00:07.887 1079077==    by 0x4834F1: AlterTable (tablecmds.c:4404)
> ==00:00:00:07.887 1079077==    by 0x7D6D52: ProcessUtilitySlow (utility.c:1318)
> ==00:00:00:07.887 1079077==    by 0x7D65F7: standard_ProcessUtility (utility.c:1067)
> ==00:00:00:07.887 1079077==    by 0x7D54F7: ProcessUtility (utility.c:523)
> ==00:00:00:07.887 1079077==    by 0x7D3D70: PortalRunUtility (pquery.c:1158)
> ==00:00:00:07.887 1079077==    by 0x7D3FE7: PortalRunMulti (pquery.c:1316)
> ==00:00:00:07.887 1079077==    by 0x7D3431: PortalRun (pquery.c:791)
>
> Reproduced on REL_15_STABLE .. master.

Sorry, I can't reproduce this memory access error on master.

--
Thanks,
Tender Wang