Re: "ERROR: cache lookup failed for function %d" while dropping function from another session

2021-08-08 Thread Mahendra Singh Thalor
On Mon, 9 Aug 2021 at 11:07, Mahendra Singh Thalor 
wrote:
>
> Hi,
> I am able to hit "ERROR:  cache lookup failed for function %d" while
> dropping a function from another session.
>
> Reproducible steps are as follows (I tried on current head d8a75b1308b133502ae3):
>
> Step 1) Fire below query from session-1 to create a function:
>
> create function fun1(int) returns record as '
> DECLARE retval RECORD;
> BEGIN
>   SELECT INTO retval 5, 10, 15;
>   RETURN retval;
> END;' language plpgsql;
>
> Step 2) Attach gdb to the PID of session-1, set a breakpoint on
> internal_get_result_type, and then press c.
>
> (gdb) b internal_get_result_type
> Breakpoint 1 at 0x5565dd4f3ce4: file funcapi.c, line 325.
> (gdb) c
> Continuing.
>
> Step 3) Fire below select command from session-1:
>
> SELECT * FROM fun1(1500) AS (a int, b int, c int);
>
> Step 4) Open session-2 and drop the function:
>
> psql (15devel)
> Type "help" for help.
> postgres=# drop function fun1;
> DROP FUNCTION
> postgres=#
>
> Step 5) Press 'c' in gdb twice and we will get an error in the 1st
> session.
>
> (gdb) c
> Continuing.
> Breakpoint 1, internal_get_result_type (funcid=16384,
> call_expr=0x5565de885af0, rsinfo=0x0, resultTypeId=0x7ffc94572608,
> resultTupleDesc=0x7ffc94572628) at funcapi.c:325
> (gdb) c
> Continuing.
>
> postgres=# SELECT * FROM fun1(1500) AS (a int, b int, c int);
> ERROR: cache lookup failed for function 16384
>
> Call stack of error:
>>
>> Breakpoint 2, errfinish (filename=0x55e0b8907c6f "funcapi.c",
lineno=336, funcname=0x55e0b89083a0 <__func__.23015>
"internal_get_result_type") at elog.c:515
>> 515     ErrorData  *edata = &errordata[errordata_stack_depth];
>> (gdb) bt
>> #0  errfinish (filename=0x55e0b8907c6f "funcapi.c", lineno=336,
funcname=0x55e0b89083a0 <__func__.23015> "internal_get_result_type") at
elog.c:515
>> #1  0x55e0b86abdaf in internal_get_result_type (funcid=16384,
call_expr=0x55e0ba9e2ad0, rsinfo=0x0, resultTypeId=0x7ffcb7ca4918,
resultTupleDesc=0x7ffcb7ca4938)
>> at funcapi.c:336
>> #2  0x55e0b86ab6d0 in get_expr_result_type (expr=0x55e0ba9e2ad0,
resultTypeId=0x7ffcb7ca4918, resultTupleDesc=0x7ffcb7ca4938) at
funcapi.c:230
>> #3  0x55e0b7fef027 in ExecInitFunctionScan (node=0x55e0ba9e3cd0,
estate=0x55e0ba9d2fe0, eflags=16) at nodeFunctionscan.c:370
>> #4  0x55e0b7fbb408 in ExecInitNode (node=0x55e0ba9e3cd0,
estate=0x55e0ba9d2fe0, eflags=16) at execProcnode.c:254
>> #5  0x55e0b7faa1d5 in InitPlan (queryDesc=0x55e0ba926f50, eflags=16)
at execMain.c:936
>> #6  0x55e0b7fa84b6 in standard_ExecutorStart
(queryDesc=0x55e0ba926f50, eflags=16) at execMain.c:263
>> #7  0x55e0b7fa7fc7 in ExecutorStart (queryDesc=0x55e0ba926f50,
eflags=0) at execMain.c:143
>> #8  0x55e0b83ba8de in PortalStart (portal=0x55e0ba968870,
params=0x0, eflags=0, snapshot=0x0) at pquery.c:512
>> #9  0x55e0b83ae112 in exec_simple_query (query_string=0x55e0ba906760
"SELECT * FROM fun1(1500) AS (a int, b int, c int);") at postgres.c:1175
>> #10 0x55e0b83b7a86 in PostgresMain (argc=1, argv=0x7ffcb7ca4f70,
dbname=0x55e0ba932770 "postgres", username=0x55e0ba932748 "mahendrathalor")
at postgres.c:4488
>> #11 0x55e0b825161b in BackendRun (port=0x55e0ba929730) at
postmaster.c:4519
>> #12 0x55e0b8250971 in BackendStartup (port=0x55e0ba929730) at
postmaster.c:4241
>> #13 0x55e0b8248a1b in ServerLoop () at postmaster.c:1758
>> #14 0x55e0b8247c18 in PostmasterMain (argc=3, argv=0x55e0ba8fee70)
at postmaster.c:1430
>> #15 0x55e0b808dbe9 in main (argc=3, argv=0x55e0ba8fee70) at
main.c:199
>
>
> "cache lookup failed" errors are never expected behavior (thanks Robert
> Haas for your opinion), so I think we should fix this error.
>
> I haven't debugged it yet; I will debug it and post my findings in the
> coming days.
>

I googled it and found that we already discussed this issue in another
thread.

Link:
https://www.postgresql.org/message-id/flat/NX26BD7O.SQ6NCQDL.Z4KK4FBY%40HXCHZXMY.EBBFSJQH.OUMOBSJN

> I reproduced this issue locally (to avoid race conditions, I just had the
> function drop itself ;-)) and traced the site of the failure to this
> bit in plpgsql function exit:
>
> /*
> * Need to look up the expected result type. XXX would be
> * better to cache the tupdesc instead of repeating
> * get_call_result_type(), but the only easy place to save it
> * is in the PLpgSQL_function struct, and that's too
> * long-lived: composite types could change during the
> * existence of a PLpgSQL_function.
> */
> switch (get_call_result_type(fcinfo, &resultTypeId, &tupdesc))
>
> The catalog access inside get_call_result_type is fairly essential,
> because this function has OUT parameters, so its pg_proc row is the
> ultimate source of truth about what it returns. We could imagine
> caching the info earlier during function execution, but as the comment
> says, that has its own failure modes ... and they're more important
> ones in practice. So I'm afraid there's not much to be done to
> improve this.
>
> regards, tom lane
>
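
For reference, the failure at funcapi.c:336 comes from the usual
syscache-miss check; paraphrased from memory rather than copied from the
source, it is essentially:

    HeapTuple   tp;

    tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
    if (!HeapTupleIsValid(tp))
        elog(ERROR, "cache lookup failed for function %u", funcid);

Once the concurrent DROP FUNCTION commits between planning and executor
startup, there is no pg_proc row left for this lookup to find, which
matches Tom's conclusion that there's not much to be done here.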

-- 
Thanks and Regards
Mahendra Singh Thalor

Re: Diagnostic comment in LogicalIncreaseXminForSlot

2021-08-08 Thread Ashutosh Bapat
On Sat, Aug 7, 2021 at 11:40 AM Andres Freund  wrote:

> Hi,
>
> On 2021-07-12 17:28:15 +0530, Ashutosh Bapat wrote:
> > On Mon, Jul 12, 2021 at 8:39 AM Amit Kapila 
> wrote:
> >
> > > On Mon, Jul 5, 2021 at 12:54 PM Masahiko Sawada  >
> > > Today, again looking at this patch, I don't think doing elog inside
> > > spinlock is a good idea. We can either release spinlock before it or
> > > use some variable to remember that we need to write such an elog and
> > > do it after releasing the lock. What do you think?
> >
> >
> > The elog will be effective only under DEBUG1; otherwise it will be
> > almost a NOOP. I am wondering whether it's worth adding a bool
> > assignment and moving the elog out only for DEBUG1. Anyway, will defer
> > it to you.
>
> It's *definitely* not ok to do an elog() while holding a spinlock. Consider
> what happens if the elog tries to format the message and runs out of
> memory. Or if elog detects the stack depth is too deep. An ERROR would be
> thrown, and we'd lose track of the held spinlock.
>

Thanks for the clarification.

Amit,
I will provide the patch changed accordingly.
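
Roughly, I expect the change to look like the sketch below (variable and
field names are from memory of logical.c, so treat this as an illustration
rather than the actual patch):

    bool        updated_xmin = false;

    SpinLockAcquire(&slot->mutex);
    if (current_lsn <= slot->data.confirmed_flush)
    {
        slot->candidate_catalog_xmin = xmin;
        slot->candidate_xmin_lsn = current_lsn;
        updated_xmin = true;
    }
    SpinLockRelease(&slot->mutex);

    /*
     * elog() can itself throw an ERROR (e.g. out of memory while
     * formatting the message, or the stack-depth check), which would
     * leave the spinlock held, so only log after releasing the lock.
     */
    if (updated_xmin)
        elog(DEBUG1, "got new catalog xmin %u at %X/%X",
             xmin, LSN_FORMAT_ARGS(current_lsn));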

--
Best Wishes,
Ashutosh


"ERROR: cache lookup failed for function %d" while dropping function from another session

2021-08-08 Thread Mahendra Singh Thalor
Hi,
I am able to hit "ERROR:  cache lookup failed for function %d" while
dropping a function from another session.

*Reproducible steps are as follows (I tried on current head d8a75b1308b133502ae3):*

*Step 1) Fire below query from session-1 to create a function:*
create function fun1(int) returns record as '
DECLARE retval RECORD;
BEGIN
  SELECT INTO retval 5, 10, 15;
  RETURN retval;
END;' language plpgsql;

*Step 2)* Attach gdb to the PID of session-1, set a breakpoint on
internal_get_result_type, and then press c.
(gdb) b internal_get_result_type
Breakpoint 1 at 0x5565dd4f3ce4: file funcapi.c, line 325.
(gdb) c
Continuing.

*Step 3)* Fire below select command from session-1:
SELECT * FROM fun1(1500) AS (a int, b int, c int);

*Step 4) Open session-2 and drop the function:*
psql (15devel)
Type "help" for help.
postgres=# drop function fun1;
DROP FUNCTION
postgres=#

*Step 5)* Press 'c' in gdb twice and we will get an error in the 1st
session.
(gdb) c
Continuing.
Breakpoint 1, internal_get_result_type (funcid=16384,
call_expr=0x5565de885af0, rsinfo=0x0, resultTypeId=0x7ffc94572608,
resultTupleDesc=0x7ffc94572628) at funcapi.c:325
(gdb) c
Continuing.

postgres=# SELECT * FROM fun1(1500) AS (a int, b int, c int);
ERROR: cache lookup failed for function 16384
*Call stack of error:*

> Breakpoint 2, errfinish (filename=0x55e0b8907c6f "funcapi.c", lineno=336,
> funcname=0x55e0b89083a0 <__func__.23015> "internal_get_result_type") at
> elog.c:515
> 515     ErrorData  *edata = &errordata[errordata_stack_depth];
> (gdb) bt
> #0  errfinish (filename=0x55e0b8907c6f "funcapi.c", lineno=336,
> funcname=0x55e0b89083a0 <__func__.23015> "internal_get_result_type") at
> elog.c:515
> #1  0x55e0b86abdaf in internal_get_result_type (funcid=16384,
> call_expr=0x55e0ba9e2ad0, rsinfo=0x0, resultTypeId=0x7ffcb7ca4918,
> resultTupleDesc=0x7ffcb7ca4938)
> at funcapi.c:336
> #2  0x55e0b86ab6d0 in get_expr_result_type (expr=0x55e0ba9e2ad0,
> resultTypeId=0x7ffcb7ca4918, resultTupleDesc=0x7ffcb7ca4938) at
> funcapi.c:230
> #3  0x55e0b7fef027 in ExecInitFunctionScan (node=0x55e0ba9e3cd0,
> estate=0x55e0ba9d2fe0, eflags=16) at nodeFunctionscan.c:370
> #4  0x55e0b7fbb408 in ExecInitNode (node=0x55e0ba9e3cd0,
> estate=0x55e0ba9d2fe0, eflags=16) at execProcnode.c:254
> #5  0x55e0b7faa1d5 in InitPlan (queryDesc=0x55e0ba926f50, eflags=16)
> at execMain.c:936
> #6  0x55e0b7fa84b6 in standard_ExecutorStart
> (queryDesc=0x55e0ba926f50, eflags=16) at execMain.c:263
> #7  0x55e0b7fa7fc7 in ExecutorStart (queryDesc=0x55e0ba926f50,
> eflags=0) at execMain.c:143
> #8  0x55e0b83ba8de in PortalStart (portal=0x55e0ba968870, params=0x0,
> eflags=0, snapshot=0x0) at pquery.c:512
> #9  0x55e0b83ae112 in exec_simple_query (query_string=0x55e0ba906760
> "SELECT * FROM fun1(1500) AS (a int, b int, c int);") at postgres.c:1175
> #10 0x55e0b83b7a86 in PostgresMain (argc=1, argv=0x7ffcb7ca4f70,
> dbname=0x55e0ba932770 "postgres", username=0x55e0ba932748 "mahendrathalor")
> at postgres.c:4488
> #11 0x55e0b825161b in BackendRun (port=0x55e0ba929730) at
> postmaster.c:4519
> #12 0x55e0b8250971 in BackendStartup (port=0x55e0ba929730) at
> postmaster.c:4241
> #13 0x55e0b8248a1b in ServerLoop () at postmaster.c:1758
> #14 0x55e0b8247c18 in PostmasterMain (argc=3, argv=0x55e0ba8fee70) at
> postmaster.c:1430
> #15 0x55e0b808dbe9 in main (argc=3, argv=0x55e0ba8fee70) at main.c:199
>

"cache lookup failed" errors are never expected behavior (thanks Robert
Haas for your opinion), so I think we should fix this error.

I haven't debugged it yet; I will debug it and post my findings in the
coming days.

-- 
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com


Re: Advanced Questions about PostgreSQL

2021-08-08 Thread David Rowley
On Mon, 9 Aug 2021 at 17:14, A Z  wrote:
>
> 1) Are there free scripts for CREATE TYPE (native type), more advanced, 
>  or  sorts of types out there, online, free for commercial 
> use? With function support, too? Can someone reply with a link or a 
> suggestion?
>
> 2) How may I get PostgreSQL to output the create table statement(s) for one 
> or more tables inside one database, without issuing instructions via the 
> command line, but only inside a database login, as a query or pl/sql?
>
> 3) I know that I can use COPY to import or export one database table between 
> it and a *.csv file. Can I use it to do this with multiple TABLES and *.csv 
> files specified in one COPY COMMAND, or not?
>
> 4) In the absence of OS command line instructions, is there an internal 
> postgresql way, via COPY or another function for example, to backup an entire 
> database, with all its create table statements and all insert statements, and 
> any other associated objects as well, in one hit? Or is this ill-advised?
>
> 5) When setting up communication to remote databases on remote machines, I 
> need to use the OPTIONS() function. It seems to require as its first function 
> parameter, the schema of the table (the second parameter) that it wants to 
> access. Can I supply a null schema, and still be able to reference the remote 
> table, or must I also make use of IMPORT FOREIGN SCHEMA?
>
> 6) How may I access, query, the log for the details of a normal table, or 
> similar?
>
> 7) I have found that the native trigonometry functions, namely the radians 
> versions, do produce error results around key trigonometry input values. I 
> have discovered that these errors persist, even if I cast the input parameter 
> away from DOUBLE PRECISION and into DECIMAL. I would like to know if there 
> are any freely available scripts out there that include Arbitrary Precision 
> mathematical functions support that work on DECIMAL and not on DOUBLE 
> PRECISION, that do not produce any error values around key inputs? Could 
> someone refer me to a website that has a script that is such?

This is the wrong mailing list for this type of thing.

I suggest you review the answers that you've already received to these
questions in the general mailing list.  They all look like general
questions to me.

If you're not happy with the answers you get there, then see if you
can get someone to expand on the existing answer or provide further
details that might help them help you.

Ignoring everyone that's tried to help you is likely not a good way
for you to obtain the help you need.

David




Advanced Questions about PostgreSQL

2021-08-08 Thread A Z
1) Are there free scripts for CREATE TYPE (native type), more advanced,  
or  sorts of types out there, online, free for commercial use? With 
function support, too? Can someone reply with a link or a suggestion?

2) How may I get PostgreSQL to output the create table statement(s) for one or 
more tables inside one database, without issuing instructions via the command 
line, but only inside a database login, as a query or pl/sql?

3) I know that I can use COPY to import or export one database table between it 
and a *.csv file. Can I use it to do this with multiple TABLES and *.csv files 
specified in one COPY COMMAND, or not?

4) In the absence of OS command line instructions, is there an internal 
postgresql way, via COPY or another function for example, to backup an entire 
database, with all its create table statements and all insert statements, and 
any other associated objects as well, in one hit? Or is this ill-advised?

5) When setting up communication to remote databases on remote machines, I need 
to use the OPTIONS() function. It seems to require as its first function 
parameter, the schema of the table (the second parameter) that it wants to 
access. Can I supply a null schema, and still be able to reference the remote 
table, or must I also make use of IMPORT FOREIGN SCHEMA?

6) How may I access, query, the log for the details of a normal table, or 
similar?

7) I have found that the native trigonometry functions, namely the radians 
versions, do produce error results around key trigonometry input values. I have 
discovered that these errors persist, even if I cast the input parameter away 
from DOUBLE PRECISION and into DECIMAL. I would like to know if there are any 
freely available scripts out there that include Arbitrary Precision 
mathematical functions support that work on DECIMAL and not on DOUBLE 
PRECISION, that do not produce any error values around key inputs? Could 
someone refer me to a website that has a script that is such?


Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM

2021-08-08 Thread David Rowley
On Mon, 9 Aug 2021 at 14:44, David Rowley  wrote:
> I plan to push this and backpatch to 9.6 shortly unless there are any
> better ideas.

I pushed this patch.  I've now marked the entry in the commitfest app
as committed too.

David




Re: Added schema level support for publication.

2021-08-08 Thread Amit Kapila
On Sun, Aug 8, 2021 at 2:52 PM vignesh C  wrote:
>
> On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila  wrote:
> >
> > On Fri, Aug 6, 2021 at 2:02 PM vignesh C  wrote:
> > >
> > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C  wrote:
> > >
> > > > 6.
> > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > + 2,
> > > > + {
> > > > + Anum_pg_publication_schema_psnspcid,
> > > > + Anum_pg_publication_schema_pspubid,
> > > > + 0,
> > > > + 0
> > > > + },
> > > >
> > > > Why don't we keep pubid as the first column in this index?
> > >
> > > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > > it is, thoughts?
> > >
> >
> > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > because it is searched using the only relid in
> > GetRelationPublications, so, similarly, in the patch, you are using
> > schema_oid in GetSchemaPublications, so probably that will help. I was
> > wondering why you haven't directly used the cache in
> > GetSchemaPublications similar to GetRelationPublications?
>
> Both of the approaches work; I was not sure which one is better. If
> the approach in GetRelationPublications is better, I will change it to
> something similar to GetRelationPublications. Thoughts?
>

I think it is better to use the cache because, if we don't find the entry
in the cache, we will anyway search for the required entry via a sys table
scan; see SearchCatCacheList. The point I wanted to ensure was that a
concurrent session won't blow up the entry while we are looking at it. How
is that ensured?
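
To illustrate what I mean by using the cache, here is a sketch following
the GetRelationPublications() pattern (the syscache ID and catalog struct
names are the ones added by your patch, so adjust as needed):

    List       *result = NIL;
    CatCList   *pubschlist;
    int         i;

    /*
     * As I understand the catcache code, the returned list is pinned
     * until ReleaseSysCacheList(), so concurrent invalidations cannot
     * free the entries while we scan them.
     */
    pubschlist = SearchSysCacheList1(PUBLICATIONSCHEMAMAP,
                                     ObjectIdGetDatum(schemaoid));
    for (i = 0; i < pubschlist->n_members; i++)
    {
        HeapTuple   tup = &pubschlist->members[i]->tuple;
        Oid         pubid = ((Form_pg_publication_schema) GETSTRUCT(tup))->pspubid;

        result = lappend_oid(result, pubid);
    }
    ReleaseSysCacheList(pubschlist);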


-- 
With Regards,
Amit Kapila.




Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

2021-08-08 Thread Greg Nancarrow
On Thu, Aug 5, 2021 at 12:03 PM Greg Nancarrow  wrote:
>

As the current v7 patch doesn't fix the coredump issue and also the
cfbot is now failing (as one of the regression tests fails) I'm
reinstating my v2/v5 patch (as v8) as the current best solution to
this issue.
So far I haven't found a test case (e.g. isolation level) that breaks
the patch. Pavel did encounter an error (but no crash) when testing
the patch with SERIALIZABLE, but he found that this error occurred
anyway without the patch.

Regards,
Greg Nancarrow
Fujitsu Australia


v8-0001-Fix-parallel-worker-failed-assertion-and-coredump.patch
Description: Binary data


Re: Small documentation improvement for ALTER SUBSCRIPTION

2021-08-08 Thread Peter Smith
On Mon, Aug 9, 2021 at 12:46 PM Amit Kapila  wrote:
>
> On Sun, Aug 8, 2021 at 10:21 AM Peter Smith  wrote:
> >
> > On Sat, Aug 7, 2021 at 4:33 PM Amit Kapila  wrote:
> > >
> > > On Thu, Jul 8, 2021 at 6:31 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > When reading the doc of ALTER SUBSCRIPTION I realized that 'refresh
> > > > options' in the following paragraph is not tagged:
> > > >
> > > > ---
> > > > Additionally, refresh options as described under REFRESH PUBLICATION
> > > > may be specified, except in the case of DROP PUBLICATION.
> > > > ---
> > > >
> > > > When I read it for the first time, I got confused because we actually
> > > > have the 'refresh' option and this description in the paragraph of the
> > > > 'refresh' option. I think we can improve it by changing to
> > > > 'refresh_option'. Thoughts?
> > > >
> > >
> > > I see that one can get confused but how about changing it to
> > > "Additionally, refresh options as described under REFRESH
> > > PUBLICATION (refresh_option) may
> > > be specified,.."? I think keeping "refresh options" in the text would
> > > be good because there could be multiple such options.
> > >
> >
> > I feel like it would be better to reword it in some way that avoids
> > using parentheses because they look like part of the syntax instead of
> > just part of the sentence.
> >
>
> Fair enough, feel free to propose if you find something better or if
> you think the current text in the docs is good.
>

IMO just the same as your suggestion but without the parens would be good. e.g.

"Additionally, refresh options as described under REFRESH
PUBLICATION refresh_option may be
specified,.."




Re: Strange code in EXPLAIN for queryid

2021-08-08 Thread David Rowley
On Mon, 9 Aug 2021 at 02:06, Julien Rouhaud  wrote:
> > Any objection to applying the attached to pg14 and master?
>
> No objection.

Thanks for looking.  I've now pushed this to master and pg14.

David




Re: elog.c query_id support vs shutdown

2021-08-08 Thread Julien Rouhaud
On Mon, Aug 9, 2021 at 3:06 AM Andres Freund  wrote:
>
> Hi,
>
> On 2021-08-08 11:53:39 -0700, Andres Freund wrote:
> > On 2021-08-08 13:46:39 +0800, Julien Rouhaud wrote:
> > > > I suspect that to make the elog.c usage safe, we'll have to clear 
> > > > MyBEEntry in
> > > > pgstat_beshutdown_hook().
> > >
> > > I agree, and a quick test indeed fixes your scenario.  It also seems like a
> > > good
> > > thing to do overall.
> >
> > Yea, it does seem like a good thing. But we should do a search for the
> > problems it could cause...

Agreed, and I'm also looking into it.

> Not a problem with unsetting MyBEEntry. But the search for problems made me
> reread the following comment:
>
> /*
>  * There's no need for a lock around pgstat_begin_read_activity /
>  * pgstat_end_read_activity here as it's only called from
>  * pg_stat_get_activity which is already protected, or from the same
>  * backend which means that there won't be concurrent writes.
>  */
>
> I don't understand the pg_stat_get_activity() part of this comment?
> pgstat_get_my_query_id() hardcodes MyBEEntry->st_query_id, so it can't be
> useful to pg_stat_get_activity(), nor is it used there?
>
> I assume it's just a remnant from an earlier iteration of the code?

Ah indeed!  This was quite a long thread so I didn't try to see when
that changed.  I also now realize that I made a typo in the patch
where I s/loop/look/ which was then changed to s/look/lock/.  The
comment should be something like:

/*
 * There's no need for a loop around pgstat_begin_read_activity /
 * pgstat_end_read_activity here as it's only called from the same backend
 * which means that there won't be concurrent writes.
 */
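
For completeness, a sketch of the shutdown-ordering fix itself (untested
here; pgstat_get_my_query_id() already has the NULL guard in
backend_status.c, so clearing the pointer is the only new part):

    uint64
    pgstat_get_my_query_id(void)
    {
        if (!MyBEEntry)
            return 0;

        return MyBEEntry->st_query_id;
    }

    static void
    pgstat_beshutdown_hook(int code, Datum arg)
    {
        /* ... existing cleanup of the shared PgBackendStatus entry ... */

        /*
         * Clear the pointer so that late elog.c calls (such as during the
         * shutdown checkpoint in single-user mode) take the !MyBEEntry
         * path above instead of reading a no-longer-valid entry.
         */
        MyBEEntry = NULL;
    }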




Re: Use POPCNT on MSVC

2021-08-08 Thread David Rowley
On Mon, 9 Aug 2021 at 12:58, John Naylor  wrote:
>
> On Sun, Aug 8, 2021 at 8:31 PM David Rowley  wrote:
> >
> > I've attached a v2 patch which I think is more along the lines of what
> > you had in mind.
>
> LGTM

Thanks for the review.

Pushed.

David




Re: Typo in subscription TAP test comment

2021-08-08 Thread Amit Kapila
On Sun, Aug 8, 2021 at 11:44 AM Peter Smith  wrote:
>
> PSA a patch to fix a trivial comment typo in a newly committed TAP test.
>
> "amd" -> "and"
>

I'll push this in some time.

-- 
With Regards,
Amit Kapila.




Re: [BUG] wrong refresh when ALTER SUBSCRIPTION ADD/DROP PUBLICATION

2021-08-08 Thread Amit Kapila
On Sat, Aug 7, 2021 at 6:53 PM houzj.f...@fujitsu.com
 wrote:
>
> On Sat, Aug 7, 2021 1:36 PM Amit Kapila  wrote:
> > On Fri, Aug 6, 2021 at 9:57 PM Japin Li  wrote:
> >
> > Do you mean to say we should do it for both Add and Drop, or just for
> > Drop? Actually, doing it for both will make the behavior consistent,
> > but doing it just for Drop might be preferable by some users. I think
> > it is better to be consistent here, but I am fine if you and others
> > feel it is better to refresh all publications only in the Drop case.
>
> Personally, I also think it will be better to make the behavior
> consistent. Attached is a new version of the patch that makes both ADD
> and DROP behave the same as SET PUBLICATION, which refreshes all the
> publications.
>

There is a bug reported on pgsql-bugs [1] which seems to be happening
due to the same reason. Can you please test that the other bug is also
fixed with your patch?

[1] - 
https://www.postgresql.org/message-id/17132-6a702189086c6243%40postgresql.org

-- 
With Regards,
Amit Kapila.




Re: [HACKERS] logical decoding of two-phase transactions

2021-08-08 Thread Amit Kapila
On Wed, Aug 4, 2021 at 4:14 PM Amit Kapila  wrote:
>
> I have pushed this last patch in the series.
>

I have closed this CF entry. Thanks to everyone involved in this work!

-- 
With Regards,
Amit Kapila.




Re: Small documentation improvement for ALTER SUBSCRIPTION

2021-08-08 Thread Amit Kapila
On Sun, Aug 8, 2021 at 10:21 AM Peter Smith  wrote:
>
> On Sat, Aug 7, 2021 at 4:33 PM Amit Kapila  wrote:
> >
> > On Thu, Jul 8, 2021 at 6:31 PM Masahiko Sawada  
> > wrote:
> > >
> > > Hi all,
> > >
> > > When reading the doc of ALTER SUBSCRIPTION I realized that 'refresh
> > > options' in the following paragraph is not tagged:
> > >
> > > ---
> > > Additionally, refresh options as described under REFRESH PUBLICATION
> > > may be specified, except in the case of DROP PUBLICATION.
> > > ---
> > >
> > > When I read it for the first time, I got confused because we actually
> > > have the 'refresh' option and this description in the paragraph of the
> > > 'refresh' option. I think we can improve it by changing to
> > > 'refresh_option'. Thoughts?
> > >
> >
> > I see that one can get confused but how about changing it to
> > "Additionally, refresh options as described under REFRESH
> > PUBLICATION (refresh_option) may
> > be specified,.."? I think keeping "refresh options" in the text would
> > be good because there could be multiple such options.
> >
>
> I feel like it would be better to reword it in some way that avoids
> using parentheses because they look like part of the syntax instead of
> just part of the sentence.
>

Fair enough, feel free to propose if you find something better or if
you think the current text in the docs is good.

-- 
With Regards,
Amit Kapila.




Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM

2021-08-08 Thread David Rowley
On Wed, 7 Jul 2021 at 23:44, David Rowley  wrote:
>
> On Sun, 4 Jul 2021 at 22:38, David Rowley  wrote:
> > I could do with a 2nd opinion about if we should just adjust the
> > maximum value for the autovacuum_work_mem GUC to 1GB in master.
> >
> > I'm also not sure if since we'd not backpatch the GUC max value
> > adjustment if we need to document the upper limit in the manual.
>
> I was just looking at this again and I see that GIN indexes are able
> to use more than 1GB of memory during VACUUM. That discovery makes me
> think having the docs say that vacuum cannot use more than 1GB of
> memory is at best misleading and more likely just incorrect.

The attached patch aims to correct where I went wrong with the
documentation about vacuum/autovacuum only using maintenance_work_mem
memory for dead tuple collection.

I plan to push this and backpatch to 9.6 shortly unless there are any
better ideas.

What's in there right now is wrong and I want that fixed before the
cut-off for the next set of bug fix releases.

David


fix_wrong_documentation_on_vacuum_mem_limits.patch
Description: Binary data


Re: PG Docs - CREATE SUBSCRIPTION option list order

2021-08-08 Thread Peter Smith
v1 -> v2

Rebased.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-create-subscription-options-list-order.patch
Description: Binary data


Re: Support reset of Shared objects statistics in "pg_stat_reset" function

2021-08-08 Thread Mahendra Singh Thalor
On Mon, 9 Aug 2021 at 06:56, Sadhuprasad Patro  wrote:
>
> Hi,
>
> > > 3)
> > >  pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int 
> > > len)
> > >  {
> > >  PgStat_StatDBEntry *dbentry;
> > > +    bool        found;
> > >
> > >  dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
> > >
> > > @@ -5168,13 +5192,41 @@
> > > pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int
> > > len)
> > >  /* Set the reset timestamp for the whole database */
> > >  dbentry->stat_reset_timestamp = GetCurrentTimestamp();
> > >
> > > -/* Remove object if it exists, ignore it if not */
> > > +/* Remove object if it exists, if not then may be it's a shared 
> > > table */
> > >  if (msg->m_resettype == RESET_TABLE)
> > > +{
> > >  (void) hash_search(dbentry->tables, (void *) &(msg->m_objectid),
> > > -   HASH_REMOVE, NULL);
> > > +   HASH_REMOVE, &found);
> > > +if (!found)
> > > +{
> > > +/* If we didn't find it, maybe it's a shared table */
> > > +dbentry = pgstat_get_db_entry(InvalidOid, false);
> > > +if (dbentry)
> > > +{
> > > +/* Set the reset timestamp for the whole database */
> > > +dbentry->stat_reset_timestamp = GetCurrentTimestamp();
> > > +(void) hash_search(dbentry->tables, (void *)
> > > &(msg->m_objectid),
> > > +   HASH_REMOVE, NULL);
> > > +}
> > > +}
> > > +}
> > >  else if (msg->m_resettype == RESET_FUNCTION)
> > > +{
> > >  (void) hash_search(dbentry->functions, (void *) 
> > > &(msg->m_objectid),
> > > -   HASH_REMOVE, NULL);
> > > +   HASH_REMOVE, &found);
> > > +if (!found)
> > > +{
> > > +/* If we didn't find it, maybe it's a shared table */
> > > +dbentry = pgstat_get_db_entry(InvalidOid, false);
> > > +if (dbentry)
> > > +{
> > > +/* Set the reset timestamp for the whole database */
> > > +dbentry->stat_reset_timestamp = GetCurrentTimestamp();
> > > +(void) hash_search(dbentry->functions, (void *)
> > > &(msg->m_objectid),
> > > +   HASH_REMOVE, NULL);
> > > +}
> > > +}
> > > +}
> > >  }
> > >
> > > Above code can be replaced with:
> > > @@ -5160,7 +5162,10 @@
> > > pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int
> > > len)
> > >  {
> > > PgStat_StatDBEntry *dbentry;
> > >
> > > -   dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
> > > +   if (IsSharedRelation(msg->m_objectid))
> > > +   dbentry = pgstat_get_db_entry(InvalidOid, false);
> > > +   else
> > > +   dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
> > >
> > > if (!dbentry)
> > > return;
> > >
>
> Will check the usage of the function "IsSharedRelation" further and
> will use it to fix the patch. But I have not seen this function used
> everywhere it is needed; it appears nowhere in the stat collector code.
>
>
> > > 5) pg_stat_reset_single_table_counters is not resetting all the
> > > columns for pg_database.
> > > Ex:
> > > postgres=# SELECT * FROM pg_statio_all_tables  where relid =
> > > 'pg_database'::regclass::oid;
> > >  relid | schemaname |   relname   | heap_blks_read | heap_blks_hit |
> > > idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
> > > tidx_blks_read | tidx_blks_hit
> > > ---++-++---+---+--+-+++---
> > >   1262 | pg_catalog | pg_database |  1 | 2 |
> > >   2 |0 |   0 |  0 |
> > >   0 | 0
> > > (1 row)
> > >
> > > postgres=# select
> > > pg_stat_reset_single_table_counters('pg_database'::regclass::oid);
> > >  pg_stat_reset_single_table_counters
> > > -
> > >
> > > (1 row)
> > >
> > > postgres=# SELECT * FROM pg_statio_all_tables  where relid =
> > > 'pg_database'::regclass::oid;
> > >  relid | schemaname |   relname   | heap_blks_read | heap_blks_hit |
> > > idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
> > > tidx_blks_read | tidx_blks_hit
> > > ---++-++---+---+--+-+++---
> > >   1262 | pg_catalog | pg_database |  0 | 0 |
> > >   2 |0 |   0 |  0 |
> > >   0 | 0
> > > (1 row)
> > >
>
> As we have given input as ObjectID of pg_database, it resets only the
> counters of heap_blk_read & heap_blk_hit.
> To reset other counters like index & toast table related, we need to
> provide respective objectIDs and call the function again. Then it
> resets everything.
> This behavior is kept same as user tables local to the database.

Re: Support reset of Shared objects statistics in "pg_stat_reset" function

2021-08-08 Thread Sadhuprasad Patro
Hi,

> > 3)
> >  pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
> >  {
> >  PgStat_StatDBEntry *dbentry;
> > +    bool        found;
> >
> >  dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
> >
> > @@ -5168,13 +5192,41 @@
> > pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int
> > len)
> >  /* Set the reset timestamp for the whole database */
> >  dbentry->stat_reset_timestamp = GetCurrentTimestamp();
> >
> > -/* Remove object if it exists, ignore it if not */
> > +/* Remove object if it exists, if not then may be it's a shared table 
> > */
> >  if (msg->m_resettype == RESET_TABLE)
> > +{
> >  (void) hash_search(dbentry->tables, (void *) &(msg->m_objectid),
> > -   HASH_REMOVE, NULL);
> > +   HASH_REMOVE, &found);
> > +if (!found)
> > +{
> > +/* If we didn't find it, maybe it's a shared table */
> > +dbentry = pgstat_get_db_entry(InvalidOid, false);
> > +if (dbentry)
> > +{
> > +/* Set the reset timestamp for the whole database */
> > +dbentry->stat_reset_timestamp = GetCurrentTimestamp();
> > +(void) hash_search(dbentry->tables, (void *)
> > &(msg->m_objectid),
> > +   HASH_REMOVE, NULL);
> > +}
> > +}
> > +}
> >  else if (msg->m_resettype == RESET_FUNCTION)
> > +{
> >  (void) hash_search(dbentry->functions, (void *) &(msg->m_objectid),
> > -   HASH_REMOVE, NULL);
> > +   HASH_REMOVE, &found);
> > +if (!found)
> > +{
> > +/* If we didn't find it, maybe it's a shared table */
> > +dbentry = pgstat_get_db_entry(InvalidOid, false);
> > +if (dbentry)
> > +{
> > +/* Set the reset timestamp for the whole database */
> > +dbentry->stat_reset_timestamp = GetCurrentTimestamp();
> > +(void) hash_search(dbentry->functions, (void *)
> > &(msg->m_objectid),
> > +   HASH_REMOVE, NULL);
> > +}
> > +}
> > +}
> >  }
> >
> > Above code can be replaced with:
> > @@ -5160,7 +5162,10 @@
> > pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int
> > len)
> >  {
> > PgStat_StatDBEntry *dbentry;
> >
> > -   dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
> > +   if (IsSharedRelation(msg->m_objectid))
> > +   dbentry = pgstat_get_db_entry(InvalidOid, false);
> > +   else
> > +   dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
> >
> > if (!dbentry)
> > return;
> >

Will check the usage of the function "IsSharedRelation" further and will
use it to fix the patch. But I have not seen this function used everywhere
it is needed; it appears nowhere in the stat collector code.


> > 5) pg_stat_reset_single_table_counters is not resetting all the
> > columns for pg_database.
> > Ex:
> > postgres=# SELECT * FROM pg_statio_all_tables  where relid =
> > 'pg_database'::regclass::oid;
> >  relid | schemaname |   relname   | heap_blks_read | heap_blks_hit |
> > idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
> > tidx_blks_read | tidx_blks_hit
> > ---++-++---+---+--+-+++---
> >   1262 | pg_catalog | pg_database |  1 | 2 |
> >   2 |0 |   0 |  0 |
> >   0 | 0
> > (1 row)
> >
> > postgres=# select
> > pg_stat_reset_single_table_counters('pg_database'::regclass::oid);
> >  pg_stat_reset_single_table_counters
> > -
> >
> > (1 row)
> >
> > postgres=# SELECT * FROM pg_statio_all_tables  where relid =
> > 'pg_database'::regclass::oid;
> >  relid | schemaname |   relname   | heap_blks_read | heap_blks_hit |
> > idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit |
> > tidx_blks_read | tidx_blks_hit
> > ---++-++---+---+--+-+++---
> >   1262 | pg_catalog | pg_database |  0 | 0 |
> >   2 |0 |   0 |  0 |
> >   0 | 0
> > (1 row)
> >

As we have given input as ObjectID of pg_database, it resets only the
counters of heap_blk_read & heap_blk_hit.
To reset other counters like index & toast table related, we need to
provide respective objectIDs and call the function again. Then it
resets everything.
This behavior is kept same as user tables local to the database.



> 2) I think, pg_stat_reset_single_table_counters is not working properly for 
> shared 

Re: Use POPCNT on MSVC

2021-08-08 Thread John Naylor
On Sun, Aug 8, 2021 at 8:31 PM David Rowley  wrote:
>
> I've attached a v2 patch which I think is more along the lines of what
> you had in mind.

LGTM

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Use POPCNT on MSVC

2021-08-08 Thread David Rowley
On Thu, 5 Aug 2021 at 07:02, John Naylor  wrote:
> > #if defined(_MSC_VER) && defined(_WIN64)
> > #define HAVE_X86_64_POPCNTQ
> > #endif
>
> That seems fine. I don't know if PG can build with Arm on Windows, but for the
> cpuid to work, it seems safer to also check for __x86_64?

That's a good point. I've adjusted it to do #if defined(_MSC_VER) &&
defined(_M_AMD64).
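
For the archives, the relevant part is roughly the following (quoted from
memory; see the attached patch for the real thing):

    #if defined(_MSC_VER) && defined(_M_AMD64)
    #define HAVE_X86_64_POPCNTQ
    #endif

    /* and the fast path uses the MSVC intrinsic instead of inline asm: */
    #ifdef _MSC_VER
    #include <intrin.h>

    static inline int
    pg_popcount64_fast(uint64 word)
    {
        return (int) __popcnt64(word);
    }
    #endif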

I've attached a v2 patch which I think is more along the lines of what
you had in mind.

David


use_popcnt_on_msvc_v2.patch
Description: Binary data


Re: Another regexp performance improvement: skip useless paren-captures

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 3:28 PM, Tom Lane  wrote:
> 
> Cool, thanks.  I also tried your millions-of-random-regexps script
> and didn't find any difference between the results from HEAD and
> those from the v3 patch.

The patch looks ready to commit.  I don't expect to test it any further unless 
you have something in particular you'd like me to focus on.

Thanks again for working on this.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Another regexp performance improvement: skip useless paren-captures

2021-08-08 Thread Tom Lane
Mark Dilger  writes:
> I have applied your latest patch and do not see any problems with it.  All my 
> tests pass with no asserts and with no differences in results vs. master.  
> This is a test suite of nearly 1.5 million separate regular expressions.

Cool, thanks.  I also tried your millions-of-random-regexps script
and didn't find any difference between the results from HEAD and
those from the v3 patch.

regards, tom lane




Re: Another regexp performance improvement: skip useless paren-captures

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 1:25 PM, Tom Lane  wrote:
> 
> Ugh.  The regex engine is finding the match correctly, but it's failing to
> tell the caller where it is :-(.  I was a little too cute in optimizing
> the regmatch_t result-vector copying in pg_regexec, and forgot to ensure
> that the overall match position would be reported.
> 
> Thanks for the testing!

Sure!  Thanks for improving the regular expression engine!

I have applied your latest patch and do not see any problems with it.  All my 
tests pass with no asserts and with no differences in results vs. master.  This 
is a test suite of nearly 1.5 million separate regular expressions.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Another regexp performance improvement: skip useless paren-captures

2021-08-08 Thread Tom Lane
Mark Dilger  writes:
> Hmm.  This changes the behavior when applied against master 
> (c1132aae336c41cf9d316222e525d8d593c2b5d2):

>  select regexp_split_to_array('uuuzkodphfbfbfb', '((.))(\1\2)', 'ntw');
>   regexp_split_to_array
>  ---
> - {"",zkodphfbfbfb}
> + {uuuzkodphfbfbfb}
>  (1 row)

Ugh.  The regex engine is finding the match correctly, but it's failing to
tell the caller where it is :-(.  I was a little too cute in optimizing
the regmatch_t result-vector copying in pg_regexec, and forgot to ensure
that the overall match position would be reported.

Thanks for the testing!

regards, tom lane

diff --git a/contrib/pg_trgm/trgm_regexp.c b/contrib/pg_trgm/trgm_regexp.c
index bf1dea6352..64816dd370 100644
--- a/contrib/pg_trgm/trgm_regexp.c
+++ b/contrib/pg_trgm/trgm_regexp.c
@@ -541,9 +541,11 @@ createTrgmNFA(text *text_re, Oid collation,
 	 * Stage 1: Compile the regexp into a NFA, using the regexp library.
 	 */
 #ifdef IGNORECASE
-	RE_compile(, text_re, REG_ADVANCED | REG_ICASE, collation);
+	RE_compile(, text_re,
+			   REG_ADVANCED | REG_NOSUB | REG_ICASE, collation);
 #else
-	RE_compile(, text_re, REG_ADVANCED, collation);
+	RE_compile(, text_re,
+			   REG_ADVANCED | REG_NOSUB, collation);
 #endif
 
 	/*
diff --git a/src/backend/regex/regc_lex.c b/src/backend/regex/regc_lex.c
index 7673dab76f..45727ffa01 100644
--- a/src/backend/regex/regc_lex.c
+++ b/src/backend/regex/regc_lex.c
@@ -528,10 +528,7 @@ next(struct vars *v)
 }
 assert(NOTREACHED);
 			}
-			if (v->cflags & REG_NOSUB)
-RETV('(', 0);	/* all parens non-capturing */
-			else
-RETV('(', 1);
+			RETV('(', 1);
 			break;
 		case CHR(')'):
 			if (LASTTYPE('('))
diff --git a/src/backend/regex/regcomp.c b/src/backend/regex/regcomp.c
index 60a220c57a..ae3a7b6a38 100644
--- a/src/backend/regex/regcomp.c
+++ b/src/backend/regex/regcomp.c
@@ -65,7 +65,7 @@ static struct subre *subre(struct vars *, int, int, struct state *, struct state
 static void freesubre(struct vars *, struct subre *);
 static void freesubreandsiblings(struct vars *, struct subre *);
 static void freesrnode(struct vars *, struct subre *);
-static void optst(struct vars *, struct subre *);
+static void removecaptures(struct vars *, struct subre *);
 static int	numst(struct subre *, int);
 static void markst(struct subre *);
 static void cleanst(struct vars *);
@@ -431,7 +431,8 @@ pg_regcomp(regex_t *re,
 		dumpst(v->tree, debug, 1);
 	}
 #endif
-	optst(v, v->tree);
+	if (v->cflags & REG_NOSUB)
+		removecaptures(v, v->tree);
 	v->ntree = numst(v->tree, 1);
 	markst(v->tree);
 	cleanst(v);
@@ -1013,14 +1014,16 @@ parseqatom(struct vars *v,
 			break;
 		case BACKREF:			/* the Feature From The Black Lagoon */
 			INSIST(type != LACON, REG_ESUBREG);
-			INSIST(v->nextvalue < v->nsubs, REG_ESUBREG);
-			INSIST(v->subs[v->nextvalue] != NULL, REG_ESUBREG);
+			subno = v->nextvalue;
+			assert(subno > 0);
+			INSIST(subno < v->nsubs, REG_ESUBREG);
+			NOERRN();
+			INSIST(v->subs[subno] != NULL, REG_ESUBREG);
 			NOERRN();
-			assert(v->nextvalue > 0);
 			atom = subre(v, 'b', BACKR, lp, rp);
 			NOERRN();
-			subno = v->nextvalue;
 			atom->backno = subno;
+			v->subs[subno]->flags |= BRUSE;
 			EMPTYARC(lp, rp);	/* temporarily, so there's something */
 			NEXT();
 			break;
@@ -2141,19 +2144,54 @@ freesrnode(struct vars *v,		/* might be NULL */
 }
 
 /*
- * optst - optimize a subRE subtree
+ * removecaptures - remove unnecessary capture subREs
+ *
+ * If the caller said that it doesn't care about subexpression match data,
+ * we may delete the "capture" markers on subREs that are not referenced
+ * by any backrefs, and then simplify anything that's become non-messy.
+ * Call this only if REG_NOSUB flag is set.
  */
 static void
-optst(struct vars *v,
-	  struct subre *t)
+removecaptures(struct vars *v,
+			   struct subre *t)
 {
+	struct subre *t2;
+
+	assert(t != NULL);
+
+	/*
+	 * If this isn't itself a backref target, clear capno and tentatively
+	 * clear CAP flag.
+	 */
+	if (!(t->flags & BRUSE))
+	{
+		t->capno = 0;
+		t->flags &= ~CAP;
+	}
+
+	/* Now recurse to children */
+	for (t2 = t->child; t2 != NULL; t2 = t2->sibling)
+	{
+		removecaptures(v, t2);
+		/* Propagate child CAP flag back up, if it's still set */
+		if (t2->flags & CAP)
+			t->flags |= CAP;
+	}
+
 	/*
-	 * DGP (2007-11-13): I assume it was the programmer's intent to eventually
-	 * come back and add code to optimize subRE trees, but the routine coded
-	 * just spends effort traversing the tree and doing nothing. We can do
-	 * nothing with less effort.
+	 * If t now contains neither captures nor backrefs, there's no longer any
+	 * need to care where its sub-match boundaries are, so we can reduce it to
+	 * a simple DFA node.  (Note in particular that MIXED child greediness is
+	 * not a hindrance here, so we don't use the MESSY() macro.)
 	 */
-	return;
+	if ((t->flags & (CAP | BACKR)) == 0)
+	{
+		if (t->child)
+			freesubreandsiblings(v, 

Re: Column Filtering in Logical Replication

2021-08-08 Thread Rahila Syed
Hi,



> >
>> > Currently, this capability is not included in the patch. If the table on
>> > the subscriber
>> > server has lesser attributes than that on the publisher server, it
>> > throws an error at the
>> > time of CREATE SUBSCRIPTION.
>> >
>>
>> That's a bit surprising, to be honest. I do understand the patch simply
>> treats the filtered columns as "unchanged" because that's the simplest
>> way to filter the *data* of the columns. But if someone told me we can
>> "filter columns" I'd expect this to work without the columns on the
>> subscriber.
>>
>> OK, I will look into adding this.
>

This has been added in the attached patch. Now, instead of treating the
filtered-out columns as unchanged and sending a byte with that information,
they are not sent to the subscriber server at all. Along with saving
network bandwidth, this allows logical replication to work even between
tables with different numbers of columns, i.e. with the table on the
subscriber server containing only the filtered columns. Currently, replica
identity columns are replicated irrespective of the presence of column
filters, hence the table on the subscriber side must contain the replica
identity columns.

> The patch adds a new parse node PublicationTable, but doesn't add
> copyfuncs.c, equalfuncs.c, readfuncs.c, outfuncs.c support for it.


Thanks, added this.


> Looking at OpenTableList(), I think you forgot to update the comment --
> it says "open relations specified by a RangeVar list",


Thank you for the review; modified this.

> To nitpick, I find "Bitmapset *att_list" a bit annoying, because it's
> not really a list ;-)


Changed this.

>
>  It's not super clear to me that strlist_to_textarray() and related
> processing will behave sanely when the column names contain weird
> characters such as commas or quotes, or just when used with uppercase
> column names.  Maybe it's worth having tests that try to break such
> cases.


Added a few test cases for this.

> In AlterPublicationTables() I was confused by some code that seemed
> commented a bit too verbosely


Modified this as per the suggestion.

> : REPLICA IDENTITY columns are always replicated
> : irrespective of column names specification.


> ... for which you don't have any tests


I have added these tests.

> Having said that, I'm not sure I agree with this design decision; what I
> think this is doing is hiding from the user the fact that they are
> publishing columns that they don't want to publish.  I think as a user I
> would rather get an error in that case:


>   ERROR:  invalid column list in published set
>   DETAIL:  The set of published commands does not include all the replica
> identity columns.


> or something like that.  Avoid possible nasty surprises of a security-
> leaking nature.


OK, thank you for your opinion. I agree that giving an explicit error in
this case will be safer. I will include this, in case there are no
counter-views.

Thank you for your review comments. Please find attached the rebased and
updated patch.


Thank you,
Rahila Syed


v3-0001-Add-column-filtering-to-logical-replication.patch
Description: Binary data


Re: ECPG bug fix: DECLARE STATEMENT and DEALLOCATE, DESCRIBE

2021-08-08 Thread Peter Geoghegan
On Sun, Aug 8, 2021 at 11:34 AM Michael Meskes  wrote:
> > https://postgr.es/m/CAH2-Wzk=qxtsp0h5ekv92eh0u22hvmqlhgwyp4_fa3ytiei...@mail.gmail.com
>
> This email said nothing about sending a status update or a deadline or
> any question at all, only that you'd revert the patch if I was unable
> to resolve the issue. So what's your point?

I think that it's crystal clear what I meant in the email of July 30.
I meant: it's not okay that you're simply ignoring the RMT. You hadn't
even made a token effort at that point. For example you didn't give
the proposed fix a cursory 15 minute review, just so we had some very
rough idea of where things stand. You still haven't.

My understanding of what you're taking issue with (perhaps a flawed
understanding) is that you think that you have been treated unfairly
or arbitrarily in general. That's why I brought up the email of July
30 yesterday. So my point was: no, you haven't been treated unfairly.
If you only take issue with the specific tone and tenor of my email
from Friday (the email that specified a deadline), and not the content
itself, then maybe the timeline and the wider context are not so
important.

I am still unsure about whether your concern is limited to the tone of
the email from Friday, or if you also take exception to the content of
that email (and the wider context).

> > I also talked about the RMT in the third person. My intent was to
> > make
>
> So? It's okay to disrespect a person if you mention the team that you
> are representing in the third person, too?

Perhaps the tone of my email from Friday was unhelpful. Even still, I
am surprised that you seem to think that it was totally outrageous --
especially given the context. It was the first email that you
responded to *at all* on this thread, with the exception of your
original terse response. I am not well practised in communicating with
a committer that just doesn't engage with the RMT at all. This is all
new to me. I admit that I found it awkward to write the email for my
own reasons.

> > You didn't say anything at all, which speaks for itself. And makes it
> > impossible for us to be flexible.
>
> Which flexibility did I ask for? It'd be nice if you only accused me of
> things I did.

I brought up flexibility to point out that this could have been
avoided. If you needed more time, why didn't you simply ask for it?

Again, the scope of what you're complaining about was (and still is)
unclear to me.

> Just for the record, of course I'm going to look into the issue before
> your deadline and will send a status update.

Thank you.

-- 
Peter Geoghegan




Re: elog.c query_id support vs shutdown

2021-08-08 Thread Andres Freund
Hi,

On 2021-08-08 11:53:39 -0700, Andres Freund wrote:
> On 2021-08-08 13:46:39 +0800, Julien Rouhaud wrote:
> > > I suspect that to make the elog.c usage safe, we'll have to clear 
> > > MyBEEntry in
> > > pgstat_beshutdown_hook().
> > 
> > I agree, and a quick test indeed fix your scenario.  It also seems like a 
> > good
> > thing to do overall.
> 
> Yea, it does seem like a good thing. But we should do a search for the
> problems it could cause...

Not a problem with unsetting MyBEEntry. But the search for problems made me
reread the following comment:

/*
 * There's no need for a lock around pgstat_begin_read_activity /
 * pgstat_end_read_activity here as it's only called from
 * pg_stat_get_activity which is already protected, or from the same
 * backend which means that there won't be concurrent writes.
 */

I don't understand the pg_stat_get_activity() part of this comment?
pgstat_get_my_query_id() hardcodes MyBEEntry->st_query_id, so it can't be
useful to pg_stat_get_activity(), nor is it used there?

I assume it's just a remnant from an earlier iteration of the code?

Greetings,

Andres Freund




Re: elog.c query_id support vs shutdown

2021-08-08 Thread Andres Freund
Hi,

On 2021-08-08 13:46:39 +0800, Julien Rouhaud wrote:
> On Sat, Aug 07, 2021 at 04:44:07PM -0700, Andres Freund wrote:
> > 
> > As currently implemented those pgstat_get_my_query_id() calls are not
> > safe. It's fine during backend startup because MyBEEntry is not set, but
> > during shutdown that's not ok, because we never unset MyBEEntry.
> > 
> > andres@awork3:~/src/postgresql$ 
> > /home/andres/build/postgres/dev-assert/vpath/src/backend/postgres --single 
> > postgres -D /srv/dev/pgdev-dev/ -c 'log_line_prefix=%Q' -c 
> > log_min_messages=debug1
> > [...]
> > PostgreSQL stand-alone backend 15devel
> > backend> 0NOTICE:  shutting down
> > 0DEBUG:  performing replication slot checkpoint
> > Segmentation fault
> 
> Ouch
> 
> > I suspect that to make the elog.c usage safe, we'll have to clear MyBEEntry 
> > in
> > pgstat_beshutdown_hook().
> 
> I agree, and a quick test indeed fixes your scenario.  It also seems like a good
> thing to do overall.

Yea, it does seem like a good thing. But we should do a search for the
problems it could cause...


> I didn't find any other problematic corner cases, but I'm not that familiar
> with pgstat, especially after the recent activity.

I don't think anything relevant to this issue has changed so far... And there
shouldn't be a meaningful amount of change to backend_status.c anyway - the
"what is currently happening" stuff that backend_status.c implements is mostly
independent from the "what has happened so far" that pgstats.c implements.

It probably would be a good idea to separate out the two namespaces more
clearly. Even with things like pgstat_report_activity() not being entirely
clear cut (because of the pgstat_count_conn_* calls) it still seems like it
would be an improvement. But I don't want to do that before the shared memory
stuff is in.

Greetings,

Andres Freund




Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Tom Lane
Mark Dilger  writes:
> There are a few changes which appear correct to me, but I don't know if you 
> expected them:

>  select regexp_match('hko', '? - regexp_match 
> ---
> - 
> +regexp_match
> +
> + {"","","",hk,"",k}
>  (1 row)

Yes, this one is a consequence of 4aea704a5 (Fix semantics of regular
expression back-references).  The lookbehind constraint should not be
applied within the back-ref.

>  select 'nigqvigqvee' !~ '(())((\2)\2)(?:([^\W]))';
> -ERROR:  invalid regular expression: invalid escape \ sequence
> + ?column? 
> +--
> + f
> +(1 row)
> +

and this one of 2a0af7fe4 (Allow complemented character class escapes
within regex brackets).

>  select regexp_replace('tl', '.{0}?(((?   regexp_replace
>  
> - tl
> + jxl
>  (1 row)

This looks like also a consequence of 4aea704a5.

regards, tom lane



Re: ECPG bug fix: DECLARE STATEMENT and DEALLOCATE, DESCRIBE

2021-08-08 Thread Michael Meskes
On Sat, 2021-08-07 at 15:31 -0700, Peter Geoghegan wrote:
> On Sat, Aug 7, 2021 at 12:43 PM Michael Meskes
>  wrote:
> > Again, I didn't know the RMT was expecting anything from me. Yes, I
> > knew I needed to spend some time on a technical issues, but that's
> > exactly the information I had at the time.
> 
> As Andrew mentioned, I sent you an email on the 30th -- a full week
> prior to the email that formally timeboxed this open item. That
> earlier email is here:
> 
> https://postgr.es/m/CAH2-Wzk=qxtsp0h5ekv92eh0u22hvmqlhgwyp4_fa3ytiei...@mail.gmail.com

This email said nothing about sending a status update or a deadline or
any question at all, only that you'd revert the patch if I was unable
to resolve the issue. So what's your point? 


> I also talked about the RMT in the third person. My intent was to
> make

So? It's okay to disrespect a person if you mention the team that you
are representing in the third person, too?

> the message legalistic and impersonal. That's what is driving our
> thinking on this -- the charter of the RMT.

Please show me where the charter says that disrespecting a person is
fine. And while we're at it, does the code of conduct say anything
about the way people should be treated? 

> We're all volunteers, just like you. I happen to be a big believer in
> our culture of personal ownership and personal responsibility. But
> you
> simply haven't engaged with us at all.

Which I tried to explain several times, but apparently not well enough.
Let me give you a short rundown from my perspective:

- A patch is sent that I mistakenly thought was a new feature and thus
did not apply time to it immediately.
- After a while I get an email from you as spokesperson of the RMT that
if this is not fixed it'll have to be reverted eventually.
- I learn that Andrew tried to reach me. Again, I believe you Andrew,
that you sent the email, but I never saw it. Whether it's some
filtering or a user error that made it disappear, I have no idea, but
I'm surely sorry about that.
- I receive that email we keep talking about, the one in which you
treat me like I'm not even worth being addressed.

> You didn't say anything at all, which speaks for itself. And makes it
> impossible for us to be flexible.

Which flexibility did I ask for? It'd be nice if you only accused me of
things I did.

Honestly I do not understand you at all. You keep treating me like I
was asking for anything unreasonable while I'm only trying to explain
why I didn't act earlier. The only issue I have is the rude treatment
you gave me. 

Just for the record, of course I'm going to look into the issue before
your deadline and will send a status update.

Michael

-- 
Michael Meskes
Michael at Fam-Meskes dot De
Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org





Re: [PATCH] Add tab-complete for backslash commands

2021-08-08 Thread Dagfinn Ilmari Mannsåker
"tanghy.f...@fujitsu.com"  writes:

> On Sunday, August 8, 2021 8:13 AM, Dagfinn Ilmari Mannsåker 
>  wrote:
>>> +   "\\r", "\\rset",
>>
>>There's a typo here, that should be "\\reset".  Also, I noticed that for
>>\connect, the situation is the opposite: it has the full form but not
>>the short form (\c).
>>
>>I've addressed both in the attached v2 patch.
>
> Thanks for you comments and fix. Your modified patch LGTM.

I've updated the commitfest entry to Ready for Committer.

> Regards,
> Tang

- ilmari




Re: Another regexp performance improvement: skip useless paren-captures

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 10:04 AM, Tom Lane  wrote:
> 
> I've also rebased over the bug fixes from the other thread,
> and added a couple more test cases.
> 
>   regards, tom lane

Hmm.  This changes the behavior when applied against master 
(c1132aae336c41cf9d316222e525d8d593c2b5d2):

 select regexp_split_to_array('uuuzkodphfbfbfb', '((.))(\1\2)', 'ntw');
  regexp_split_to_array
 ---
- {"",zkodphfbfbfb}
+ {uuuzkodphfbfbfb}
 (1 row)

The string starts with three "u" characters.  The first of them is 
doubly-matched, meaning \1 and \2 refer to the first "u" character.  The (\1\2) 
that follows matches the next two "u" characters.  When the extra "useless" 
capture group is skipped, apparently this doesn't work anymore. I haven't 
looked at your patch, so I'm not sure why, but I'm guessing that \2 doesn't 
refer to anything.

That analysis is consistent with the next change:

 select 
regexp_split_to_array('snfwbvxeesnzqabixqbixqiumpgxdemmxvnsemjxgqoqknrqessmcqmfslfspskqpqxe',
 '?:.\3');
-                            regexp_split_to_array
----------------------------------------------------------------------
- {snfwbvx,snzqabixqbixqiumpgxde,xvnsemjxgqoqknrqe,mcqmfslfspskqpqxe}
+                             regexp_split_to_array
+-----------------------------------------------------------------------
+ {snfwbvxeesnzqabixqbixqiumpgxdemmxvnsemjxgqoqknrqessmcqmfslfspskqpqxe}
 (1 row)

The pattern matches any double character.  I would expect it to match the "ee", 
the "mm" and the "ss" in the text.  With the patched code, it matches nothing.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 10:34 AM, Mark Dilger  wrote:
> 
> I'll rerun my tests with the new master.  I was still using the code that I 
> pulled yesterday.

I am now testing REL_13_STABLE (ba9f665a4413f855bbf4b544481db326f7b9bd73) vs 
master (c1132aae336c41cf9d316222e525d8d593c2b5d2).  The regular expressions we 
discussed upthread show no differences.

There are a few changes which appear correct to me, but I don't know if you 
expected them:

 select regexp_match('hko', '?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 10:32 AM, Mark Dilger  wrote:
> 
> I'm not sure what you mean by "as-committed".  Did you commit one of these 
> recently?

Nevermind, I see the commit now.

I'll rerun my tests with the new master.  I was still using the code that I 
pulled yesterday.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 10:29 AM, Tom Lane  wrote:
> 
> Mark Dilger  writes:
>> Applying your alternate-backref-corner-case-fix-1.patch to master changes
>> the outcome of some regular expression queries, but I *think* it changes 
>> them for the better.
> 
> [ squint... ] You sure you applied the patch properly?  All these examples
> give me the same results in HEAD (with said patches as-committed) and v13.
> 
>   regards, tom lane

I'm not sure what you mean by "as-committed".  Did you commit one of these 
recently?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Tom Lane
Mark Dilger  writes:
> Applying your alternate-backref-corner-case-fix-1.patch to master changes
> the outcome of some regular expression queries, but I *think* it changes them 
> for the better.

[ squint... ] You sure you applied the patch properly?  All these examples
give me the same results in HEAD (with said patches as-committed) and v13.

regards, tom lane




Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 10:15 AM, Mark Dilger  wrote:
> 
> But these next two look to me correct before the patch and wrong after:
> 
> select regexp_matches('ircecpbgyiggvtruqgxzigxzigxzisdbkuhbkuhrvl', 
> '(((.)))(?:(\3))[^\f]');
>  regexp_matches
> ----------------
> -(0 rows)
> + {g,g,g,g}
> +(1 row)
> 
> select regexp_matches('fhgxnvbvjaej', '(((.)).)((\3)((.)))', 'csx');
> - regexp_matches 
> -----------------
> -(0 rows)
> +  regexp_matches   
> +-------------------
> + {vb,v,v,vj,v,j,j}
> +(1 row)

Scratch that.  On further thought, these also look correct.  I wasn't doing the 
capturing in my head correctly.  So I think the patch is working!  I'll test a 
bit longer.

Is this patch (alternate-backref-corner-case-fix-1.patch) the current state of 
your patch set?  I'd like to be sure I'm testing your latest work.  Thanks.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Mark Dilger



> On Aug 7, 2021, at 6:03 PM, Tom Lane  wrote:
> 
> That requires tweaking the API of parseqatom,
> which is why I'd not done it like that to begin with --- but that's not
> a hard or complicated change, just a mite tedious.
> 
> Hence, the attached patch.

Applying your alternate-backref-corner-case-fix-1.patch to master changes the
outcome of some regular expression queries, but I *think* it changes them for 
the better.

These three look to me exactly correct after the patch, and wrong before:

 select regexp_matches('vnrungnajjjgkaaeaper', '((.))(((\1)))((?:\5..))', 'mx');
- regexp_matches 
-----------------
-(0 rows)
+ regexp_matches  
+-----------------
+ {j,j,j,j,j,jgk}
+(1 row)

 select 
regexp_match('kgkgeganlifykxhfspjtgluwluwluwdfdfbbdjvrxjvrxedndrkaxxvbtqdj', 
'((.))\2');
  regexp_match
 --------------
- 
+ {b,b}
 (1 row)

 select regexp_split_to_array('uuuzkodphfbfbfb', '((.))(\1\2)', 'ntw');
  regexp_split_to_array
 -----------------------
- {uuuzkodphfbfbfb}
+ {"",zkodphfbfbfb}
 (1 row)

But these next two look to me correct before the patch and wrong after:

 select regexp_matches('ircecpbgyiggvtruqgxzigxzigxzisdbkuhbkuhrvl', 
'(((.)))(?:(\3))[^\f]');
  regexp_matches
 ----------------
-(0 rows)
+ {g,g,g,g}
+(1 row)

 select regexp_matches('fhgxnvbvjaej', '(((.)).)((\3)((.)))', 'csx');
- regexp_matches 
-
-(0 rows)
+  regexp_matches   
+---
+ {vb,v,v,vj,v,j,j}
+(1 row)

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Another regexp performance improvement: skip useless paren-captures

2021-08-08 Thread Tom Lane
Mark Dilger  writes:
> The patch triggers an assertion that master does not:

> +select 'azrlfkjbjgidgryryiglcabkgqluflu' !~ '(.(.)((.)))((?:(\3)))';

On looking into this, it's pretty simple: regexec.c has an assertion
that a pure-capture subre node ought to be doing some capturing.

case '(':/* no-op capture node */
assert(t->child != NULL);
assert(t->capno > 0);

That's fine as of HEAD, but with the proposed patch, we may notice
that the node isn't actually referenced by any backref, and remove
its capture marker, allowing this assertion to fire.  Nothing's
really wrong though.

There seem to be three things we could do about that:

1. Extend removecaptures() so that it can actually remove no-op
capture nodes if it's removed their capture markings.  This would
substantially complicate that function, and I judge that it's not
worth the trouble.  We'll only have such nodes in cases of
capturing parentheses immediately surrounding capturing parentheses,
which doesn't seem like a case worth expending sweat for.

2. Just drop the "t->capno > 0" assertion in regexec.c.

3. Weaken said assertion, perhaps by also checking the BRUSE flag bit.

Not sure that I see any point to #3, so I just dropped the
assertion in the attached.
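
So the '(' case in regexec.c ends up looking roughly like this (my
paraphrase of the change, not the exact hunk in the attached):

			case '(':			/* no-op capture node */
				assert(t->child != NULL);
				/* no assertion on t->capno anymore: removecaptures() may
				 * have cleared it when REG_NOSUB was given */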

I've also rebased over the bug fixes from the other thread,
and added a couple more test cases.

regards, tom lane

diff --git a/contrib/pg_trgm/trgm_regexp.c b/contrib/pg_trgm/trgm_regexp.c
index bf1dea6352..64816dd370 100644
--- a/contrib/pg_trgm/trgm_regexp.c
+++ b/contrib/pg_trgm/trgm_regexp.c
@@ -541,9 +541,11 @@ createTrgmNFA(text *text_re, Oid collation,
 	 * Stage 1: Compile the regexp into a NFA, using the regexp library.
 	 */
 #ifdef IGNORECASE
-	RE_compile(&regex, text_re, REG_ADVANCED | REG_ICASE, collation);
+	RE_compile(&regex, text_re,
+			   REG_ADVANCED | REG_NOSUB | REG_ICASE, collation);
 #else
-	RE_compile(&regex, text_re, REG_ADVANCED, collation);
+	RE_compile(&regex, text_re,
+			   REG_ADVANCED | REG_NOSUB, collation);
 #endif
 
 	/*
diff --git a/src/backend/regex/regc_lex.c b/src/backend/regex/regc_lex.c
index 7673dab76f..45727ffa01 100644
--- a/src/backend/regex/regc_lex.c
+++ b/src/backend/regex/regc_lex.c
@@ -528,10 +528,7 @@ next(struct vars *v)
 }
 assert(NOTREACHED);
 			}
-			if (v->cflags & REG_NOSUB)
-				RETV('(', 0);	/* all parens non-capturing */
-			else
-				RETV('(', 1);
+			RETV('(', 1);
 			break;
 		case CHR(')'):
 			if (LASTTYPE('('))
diff --git a/src/backend/regex/regcomp.c b/src/backend/regex/regcomp.c
index 60a220c57a..ae3a7b6a38 100644
--- a/src/backend/regex/regcomp.c
+++ b/src/backend/regex/regcomp.c
@@ -65,7 +65,7 @@ static struct subre *subre(struct vars *, int, int, struct state *, struct state
 static void freesubre(struct vars *, struct subre *);
 static void freesubreandsiblings(struct vars *, struct subre *);
 static void freesrnode(struct vars *, struct subre *);
-static void optst(struct vars *, struct subre *);
+static void removecaptures(struct vars *, struct subre *);
 static int	numst(struct subre *, int);
 static void markst(struct subre *);
 static void cleanst(struct vars *);
@@ -431,7 +431,8 @@ pg_regcomp(regex_t *re,
 		dumpst(v->tree, debug, 1);
 	}
 #endif
-	optst(v, v->tree);
+	if (v->cflags & REG_NOSUB)
+		removecaptures(v, v->tree);
 	v->ntree = numst(v->tree, 1);
 	markst(v->tree);
 	cleanst(v);
@@ -1013,14 +1014,16 @@ parseqatom(struct vars *v,
 			break;
 		case BACKREF:			/* the Feature From The Black Lagoon */
 			INSIST(type != LACON, REG_ESUBREG);
-			INSIST(v->nextvalue < v->nsubs, REG_ESUBREG);
-			INSIST(v->subs[v->nextvalue] != NULL, REG_ESUBREG);
+			subno = v->nextvalue;
+			assert(subno > 0);
+			INSIST(subno < v->nsubs, REG_ESUBREG);
+			NOERRN();
+			INSIST(v->subs[subno] != NULL, REG_ESUBREG);
 			NOERRN();
-			assert(v->nextvalue > 0);
 			atom = subre(v, 'b', BACKR, lp, rp);
 			NOERRN();
-			subno = v->nextvalue;
 			atom->backno = subno;
+			v->subs[subno]->flags |= BRUSE;
 			EMPTYARC(lp, rp);	/* temporarily, so there's something */
 			NEXT();
 			break;
@@ -2141,19 +2144,54 @@ freesrnode(struct vars *v,		/* might be NULL */
 }
 
 /*
- * optst - optimize a subRE subtree
+ * removecaptures - remove unnecessary capture subREs
+ *
+ * If the caller said that it doesn't care about subexpression match data,
+ * we may delete the "capture" markers on subREs that are not referenced
+ * by any backrefs, and then simplify anything that's become non-messy.
+ * Call this only if REG_NOSUB flag is set.
  */
 static void
-optst(struct vars *v,
-	  struct subre *t)
+removecaptures(struct vars *v,
+			   struct subre *t)
 {
+	struct subre *t2;
+
+	assert(t != NULL);
+
+	/*
+	 * If this isn't itself a backref target, clear capno and tentatively
+	 * clear CAP flag.
+	 */
+	if (!(t->flags & BRUSE))
+	{
+		t->capno = 0;
+		t->flags &= ~CAP;
+	}
+
+	/* Now recurse to children */
+	for (t2 = t->child; t2 != NULL; t2 = t2->sibling)
+	{
+		removecaptures(v, t2);

Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Mark Dilger



> On Aug 8, 2021, at 9:31 AM, Tom Lane  wrote:
> 
> I realized that the earlier patch is actually a bad idea

Applying only your patch to master, the following query, which did not
crash before, begins to crash:

 select regexp_split_to_array('rjgy', '(?:(.))((\1\1.))');

It also changes the results for another one:

 select 
regexp_split_to_array('jdpveoarcnsarcnsarcnszieqbqbqbqbiufdlywphbnrxtdoboouuzcqiqmenj',
 '()((.)){1}?\3?', 'itx');
-                                             regexp_split_to_array
----------------------------------------------------------------------------------------------------
- {"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""}
+                      regexp_split_to_array
+------------------------------------------------------------------
+ {jdpveoarcnsarcnsarcnszieqbqbqbqbiufdlywphbnrxtdoboouuzcqiqmenj}

I'm not sure what *should* be returned here, only that it is a behavioral 
change.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Assert triggered during RE_compile_and_cache

2021-08-08 Thread Tom Lane
I wrote:
> While this is sufficient to make the problem go away, I'm
> inclined to apply both changesets.  Even if it accidentally
> works right now to have later backrefs consult the outer s/s2
> state pair rather than the original pair, that seems far too
> convoluted and bug-prone.  The outer states should be strictly
> the concern of the iteration setup logic in the outer invocation
> of parseqatom.

> (I'm sort of wondering now whether the outer s/s2 states are
> even really necessary anymore ... maybe Spencer put those in
> as a way of preventing some prehistoric version of this bug.
> But I'm not excited about messing with that right now.)

I realized that the earlier patch is actually a bad idea, because
it breaks the possibility of updating the subre to mark it as
being referenced by a later backref, as the REG_NOSUB patch needs
to do.  However, on closer study, the outer s/s2 states being
added by the "prepare a general-purpose state skeleton" stanza
really are duplicative of the ones we already made for a
parenthesized atom, so we can just get rid of them.  Done that
way.

regards, tom lane




Re: Strange code in EXPLAIN for queryid

2021-08-08 Thread Julien Rouhaud
On Sun, Aug 08, 2021 at 11:56:41PM +1200, David Rowley wrote:
> 
> On looking a little closer I also saw that plannedstmt->queryId is a
> uint64.  I guess that we're showing this as an int64 so that it
> matches the queryid column in the pg_stat_statements view?

The only reason we display it as an int64 rather than a uint64 is that
the SQL standard doesn't define such a type, and using numeric would add
useless overhead.

> Any objection to applying the attached to pg14 and master?

No objection.




Re: very long record lines in expanded psql output

2021-08-08 Thread Pavel Stehule
On Thu, 5 Aug 2021 at 17:13, Pavel Stehule  wrote:

>
>
> On Thu, 5 Aug 2021 at 16:50, Platon Pronko  wrote:
>
>> Hi!
>>
>> > pspg supports copy cell to clipboard - you can copy complete cell
>> >
>> > but I am not sure if it is working well in extended mode. This mode is
>> not
>> > extra tested in pspg
>>
>> Thanks for the tip!
>>
>> I tried it out: in non-extended mode copy indeed works well, in extended
>> mode the vertical cursor selects both columns (and copies both), and in
>> extended+wrapped mode the copy includes the wrapping symbol.
>>
>
> Yes, this is probably not finished yet. It was a surprise to me that I am
> able to use the clipboard from the terminal application at all, so support
> for copy is a relatively new feature. I'll check it when I have time.
>

I fixed this issue. Now, the export should work in wrapped mode too (master
branch).

Regards

Pavel


> Pavel
>
>
>
>
>
>
>
>> Best regards,
>> Platon Pronko
>>
>


Re: Use generation context to speed up tuplesorts

2021-08-08 Thread David Rowley
On Mon, 9 Aug 2021 at 00:38, Tomas Vondra  wrote:
> I'm not sure quadrupling the size is a good idea, though, because it
> increases the amount of memory we might be wasting. With the doubling,
> the amount of wasted /unused memory is limited to ~50%, because the next
> block is (roughly) equal to sum of already allocated blocks, so
> allocating just 1B on it leaves us with 50%. But quadrupling the size
> means we'll end up with ~75% free space. Of course, this is capped by
> the maximum block size etc. but still ...

Yeah, not sure what is best. It does however seem likely that the
majority of the performance improvement that I saw is due to either
malloc()/free() calls or just having fewer blocks in the context.

Maybe it's worth getting the planner on board with deciding how to do
the allocations.  It feels a bit overcautious to go allocating blocks
in each power of two starting at 8192 bytes when doing a 1GB sort.
Maybe we should be looking towards doing something more like making
the init allocation size more like pg_prevpower2_64(Min(work_mem *
1024L, sort_tuples * tuple_width)), or maybe half or quarter that.

It would certainly not be the only executor node to allocate memory
based on what the planner thought. Just look at ExecHashTableCreate().
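
To illustrate the shape of that sizing rule, here's a throwaway
standalone sketch (prevpower2 is reimplemented so it compiles outside
the tree, and the planner numbers are made up):

#include <stdint.h>
#include <stdio.h>

/* largest power of 2 <= n; pg_prevpower2_64 in the tree does the same */
static uint64_t
prevpower2(uint64_t n)
{
	while (n & (n - 1))
		n &= n - 1;			/* clear low bits until a power of 2 remains */
	return n;
}

int
main(void)
{
	uint64_t	work_mem_kB = 1024 * 1024;	/* pretend work_mem = 1GB */
	uint64_t	sort_tuples = 10000000;		/* planner row estimate */
	uint64_t	tuple_width = 64;			/* planner width estimate */
	uint64_t	cap = work_mem_kB * 1024;
	uint64_t	est = sort_tuples * tuple_width;
	uint64_t	init = prevpower2(est < cap ? est : cap);

	printf("initial block size: %llu bytes\n", (unsigned long long) init);
	return 0;
}

That would start the context at 512MB for this example rather than
walking up from 8192 bytes; halving or quartering it as suggested is a
one-line tweak.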

David




Re: Use generation context to speed up tuplesorts

2021-08-08 Thread David Rowley
On Wed, 4 Aug 2021 at 02:10, Tomas Vondra  wrote:
> I did run the same set of benchmarks as for Slab, measuring some usual
> allocation patterns. The results for i5-2500k machine are attached (for
> the xeon it's almost exactly the same behavior). While running those
> tests I realized the last patch is wrong and sets allocChunkLimit=1,
> which is bogus and causes a significant regression. So here's an updated
> version of the patch series too.

I know you're not done with these yet, but FWIW, I was getting an
Assert failure with these patches on:

Assert(total_allocated == context->mem_allocated);

It seems to be because you've forgotten to ignore keeper blocks when
adjusting context->mem_allocated in GenerationReset()
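
I'd guess the fix is something along these lines in the reset loop (just
a sketch against your patched generation.c; the "keeper" field comes
from your patches, not master):

	dlist_mutable_iter miter;

	dlist_foreach_modify(miter, &set->blocks)
	{
		GenerationBlock *block = dlist_container(GenerationBlock, node,
												 miter.cur);

		if (block == set->keeper)
			continue;		/* keeper block survives the reset, so its
							 * size must stay in context->mem_allocated */

		dlist_delete(miter.cur);
		context->mem_allocated -= block->blksize;
		free(block);
	}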

David




Re: Use generation context to speed up tuplesorts

2021-08-08 Thread Tomas Vondra

On 8/8/21 12:11 PM, David Rowley wrote:

> On Sat, 7 Aug 2021 at 12:10, Tomas Vondra  wrote:
>
>>> All of the tests show that the patches to improve the allocation
>>> efficiency of generation.c don't help to improve the results of the
>>> test cases. I wondered if it's maybe worth trying to see what happens
>>> if instead of doubling the allocations each time, quadruple them
>>> instead. I didn't try this.
>>
>> I doubt quadrupling the allocations will help very much, but I suspect
>> the problem might be in the 0004 patch - at least that's what shows
>> a regression in my results. Could you try with just 0001-0003 applied?
>
> I tried the quadrupling of the buffer instead of doubling it each time
> and got the attached.  Column E, or green in the graphs, shows the
> results of that. It's now much closer to the original patch which just
> made the block size 8MB.



Interesting, I wouldn't have expected that to make such a difference.

I'm not sure quadrupling the size is a good idea, though, because it 
increases the amount of memory we might be wasting. With the doubling, 
the amount of wasted/unused memory is limited to ~50%, because the next 
block is (roughly) equal to sum of already allocated blocks, so 
allocating just 1B on it leaves us with 50%. But quadrupling the size 
means we'll end up with ~75% free space. Of course, this is capped by 
the maximum block size etc. but still ...
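
To make those numbers concrete, here's a toy standalone calculation
(plain C, nothing PostgreSQL-specific) of the worst case, right after a
1B allocation has forced the next block to be created:

#include <stdio.h>

int
main(void)
{
	for (int g = 2; g <= 4; g *= 2)
	{
		double		total = 0.0;
		double		block = 8192.0;

		for (int i = 0; i < 6; i++)
		{
			total += block;	/* this block is filled completely */
			block *= g;		/* next block grows by factor g */
		}
		/* just 1 byte is used on the newest (largest) block */
		printf("growth factor %d: ~%.0f%% of the arena unused\n",
			   g, 100.0 * block / (total + block));
	}
	return 0;
}

That prints ~50% for doubling and ~75% for quadrupling, as described
above.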



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Use generation context to speed up tuplesorts

2021-08-08 Thread Tomas Vondra




On 8/8/21 9:02 AM, David Rowley wrote:

> On Sat, 7 Aug 2021 at 12:10, Tomas Vondra  wrote:
>>
>> On 8/6/21 3:07 PM, David Rowley wrote:
>>> All of the tests show that the patches to improve the allocation
>>> efficiency of generation.c don't help to improve the results of the
>>> test cases. I wondered if it's maybe worth trying to see what happens
>>> if instead of doubling the allocations each time, quadruple them
>>> instead. I didn't try this.
>>
>> I doubt quadrupling the allocations will help very much, but I suspect
>> the problem might be in the 0004 patch - at least that's what shows
>> a regression in my results. Could you try with just 0001-0003 applied?
>
> But 0004 only changes the logic which controls the threshold of when
> we allocate an oversized chunk.  It looks like the threshold is 512KB
> with the 0004 patch.  My test is only doing a maximum allocation of
> 296 bytes so will never allocate an oversized chunk.
>
> Can you explain why you think 0004 would cause performance regressions?



It's based solely on the results of my benchmarks, where this patch seems
to cause a performance regression. I agree it's a bit bizarre, considering
what the patch does.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Strange code in EXPLAIN for queryid

2021-08-08 Thread David Rowley
I was wondering about the following code in explain.c

char buf[MAXINT8LEN + 1];

pg_lltoa(plannedstmt->queryId, buf);
ExplainPropertyText("Query Identifier", buf, es);

I thought it was a bit strange that we don't just use
ExplainPropertyInteger() instead of going to the trouble of building
the string with pg_lltoa().

On looking a little closer I also saw that plannedstmt->queryId is a
uint64.  I guess that we're showing this as an int64 so that it
matches the queryid column in the pg_stat_statements view? If so, I
think it's worth a comment to mention why we're not using
ExplainPropertyUInteger.
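
To be concrete, I'm imagining a comment roughly like this (my wording
here; the attached patch may phrase it differently):

/*
 * The queryid is a uint64, but we print it as a signed int64 to match
 * the queryid column of pg_stat_statements, which exposes the value as
 * bigint since SQL has no unsigned 64-bit integer type.
 */
char buf[MAXINT8LEN + 1];

pg_lltoa(plannedstmt->queryId, buf);
ExplainPropertyText("Query Identifier", buf, es);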

Any objection to applying the attached to pg14 and master?

David


clarify_why_queryid_is_signed_rather_than_unsigned.patch
Description: Binary data


Re: Extra code in commit_ts.h

2021-08-08 Thread David Rowley
On Wed, 4 Aug 2021 at 00:02, David Rowley  wrote:
> Yeah, I think it can be safely removed.  I can do so unless someone
> thinks otherwise.

Pushed.

David




RE: [PATCH]Comment improvement in publication.sql

2021-08-08 Thread tanghy.f...@fujitsu.com
On Sunday, August 8, 2021 6:34 PM, vignesh C  wrote
>Thanks for the updated patch; your changes look good to me. You might
>want to include the commit message in the patch; that will be useful.

Thanks for your kind suggestion.
Updated patch attached. I added a commit message to the patch; no other
changes.

Regards,
Tang


v4-0001-Fix-comments-in-publication.sql-and-parsenodes.h.patch
Description:  v4-0001-Fix-comments-in-publication.sql-and-parsenodes.h.patch


RE: [PATCH] Add tab-complete for backslash commands

2021-08-08 Thread tanghy.f...@fujitsu.com
On Sunday, August 8, 2021 8:13 AM, Dagfinn Ilmari Mannsåker  
wrote:
>> +"\\r", "\\rset",
>
>There's a typo here, that should be "\\reset".  Also, I noticed that for
>\connect, the situation is the opposite: it has the full form but not
>the short form (\c).
>
>I've addressed both in the attached v2 patch.

Thanks for your comments and fix. Your modified patch LGTM.

Regards,
Tang


Re: Use generation context to speed up tuplesorts

2021-08-08 Thread David Rowley
On Sat, 7 Aug 2021 at 12:10, Tomas Vondra  wrote:
> > All of the tests show that the patches to improve the allocation
> > efficiency of generation.c don't help to improve the results of the
> > test cases. I wondered if it's maybe worth trying to see what happens
> > if instead of doubling the allocations each time, quadruple them
> > instead. I didn't try this.
> >
>
> I doubt quadrupling the allocations will help very much, but I suspect
> the problem might be in the 0004 patch - at least that's what shows
> a regression in my results. Could you try with just 0001-0003 applied?

I tried the quadrupling of the buffer instead of doubling it each time
and got the attached.  Column E, or green in the graphs, shows the
results of that. It's now much closer to the original patch which just
made the block size 8MB.

David


generation context tuplesort_v2.ods
Description: application/vnd.oasis.opendocument.spreadsheet


Re: [PATCH]Comment improvement in publication.sql

2021-08-08 Thread vignesh C
On Fri, Aug 6, 2021 at 3:33 PM tanghy.f...@fujitsu.com
 wrote:
>
> Hi
>
> I saw some inaccurate comments for the AlterPublicationStmt structure when
> reviewing patches related to publication[1].
>
> The 'tables' variable is used for 'ALTER PUBLICATION ... ADD/DROP/SET TABLE',
> but the comments only say 'ADD/DROP'. How about adding 'SET' to the comments?
>
> typedef struct AlterPublicationStmt
> {
> NodeTag type;
> char   *pubname;/* Name of the publication */
>
> /* parameters used for ALTER PUBLICATION ... WITH */
> List   *options;/* List of DefElem nodes */
>
> /* parameters used for ALTER PUBLICATION ... ADD/DROP TABLE */
> List   *tables; /* List of tables to add/drop 
> */
> boolfor_all_tables; /* Special publication for all tables 
> in db */
> DefElemAction tableAction;  /* What action to perform with the 
> tables */
> } AlterPublicationStmt;
>
> It's also a comment improvement, so I added this change to this patch.

Thanks for the updated patch; your changes look good to me. You might
want to include the commit message in the patch; that will be useful.

Regards,
Vignesh




Re: Added schema level support for publication.

2021-08-08 Thread vignesh C
On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila  wrote:
>
> On Fri, Aug 6, 2021 at 2:02 PM vignesh C  wrote:
> >
> > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila  wrote:
> > >
> > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C  wrote:
> >
> > > 6.
> > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > + 2,
> > > + {
> > > + Anum_pg_publication_schema_psnspcid,
> > > + Anum_pg_publication_schema_pspubid,
> > > + 0,
> > > + 0
> > > + },
> > >
> > > Why don't we keep pubid as the first column in this index?
> >
> > I wanted to keep it similar to PUBLICATIONRELMAP, should we keep it as
> > it is, thoughts?
> >
>
> Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> because it is searched using the only relid in
> GetRelationPublications, so, similarly, in the patch, you are using
> schema_oid in GetSchemaPublications, so probably that will help. I was
> wondering why you haven't directly used the cache in
> GetSchemaPublications similar to GetRelationPublications?

Both of the approaches work, and I was not sure which one is better. If
the approach in GetRelationPublications is better, I will change it to
something similar. Thoughts?

> It seems
> there is a danger for concurrent object drop. Can you please check how
> the safety is ensured say when either one wants to drop the
> corresponding relation/schema or publication?

If a table is dropped concurrently from another session during logical
replication of some operation on that table, the cache invalidations
(rel_sync_cache_relation_cb) happen while we are in get_rel_sync_entry.
The cache entry will be marked invalid, and schema_sent will be set to
false, so the schema will be resent using the relation that was prepared
while processing this transaction in ReorderBufferProcessTXN. I felt
this is OK since the relation is dropped after the operation on the
table. Similarly, if the publication is dropped concurrently from
another session during logical replication of some operation on that
table, the cache invalidations (publication_invalidation_cb) happen
while we are in get_rel_sync_entry. The publications will be reloaded
and validated again, and the data will be replicated to the server. I
felt this behavior is fine since the publication is dropped after the
operation on the table.

> Another point is why
> can't we use the other index (where the index is on relid or
> schema_oid (PublicationSchemaObjectIndexId))?

I felt this cannot be used because in this case the index is on the oid
column of pg_publication_schema and not on the psnspcid column.

Regards,
Vignesh




Re: Use generation context to speed up tuplesorts

2021-08-08 Thread David Rowley
On Sat, 7 Aug 2021 at 12:10, Tomas Vondra  wrote:
>
> On 8/6/21 3:07 PM, David Rowley wrote:
> > All of the tests show that the patches to improve the allocation
> > efficiency of generation.c don't help to improve the results of the
> > test cases. I wondered if it's maybe worth trying to see what happens
> > if instead of doubling the allocations each time, quadruple them
> > instead. I didn't try this.
> >
>
> I doubt quadrupling the allocations will help very much, but I suspect
> the problem might be in the 0004 patch - at least that's what shows
> a regression in my results. Could you try with just 0001-0003 applied?

But 0004 only changes the logic which controls the threshold of when
we allocate an oversized chunk.  It looks like the threshold is 512KB
with the 0004 patch.  My test is only doing a maximum allocation of
296 bytes so will never allocate an oversized chunk.

Can you explain why you think 0004 would cause performance regressions?

David




Re: parse_subscription_options - suggested improvements

2021-08-08 Thread Peter Smith
v11 -> v12

* A rebase was needed due to some recent pgindent changes on HEAD.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v12-0001-Simplify-recent-parse_subscription_option-change.patch
Description: Binary data


Typo in subscription TAP test comment

2021-08-08 Thread Peter Smith
PSA a patch to fix a trivial comment typo in a newly committed TAP test.

"amd" -> "and"

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Fix-typo-in-TAP-test-comment.patch
Description: Binary data


Re: row filtering for logical replication

2021-08-08 Thread Peter Smith
v22 --> v23

Changes:

* A rebase was needed (due to commit [1]) to keep the patch working with cfbot.

--
[1] 
https://github.com/postgres/postgres/commit/93d573d86571d148e2d14415166ec6981d34ea9d

Kind Regards,
Peter Smith.
Fujitsu Australia


v23-0001-Row-filter-for-logical-replication.patch
Description: Binary data


Re: elog.c query_id support vs shutdown

2021-08-08 Thread Julien Rouhaud
On Sun, Aug 08, 2021 at 01:46:39PM +0800, Julien Rouhaud wrote:
> On Sat, Aug 07, 2021 at 04:44:07PM -0700, Andres Freund wrote:
> > 
> > As currently implemented those pgstat_get_my_query_id() calls are not
> > safe. It's fine during backend startup because MyBEEntry is not set, but
> > during shutdown that's not ok, because we never unset MyBEEntry.
> > 
> > andres@awork3:~/src/postgresql$ 
> > /home/andres/build/postgres/dev-assert/vpath/src/backend/postgres --single 
> > postgres -D /srv/dev/pgdev-dev/ -c 'log_line_prefix=%Q' -c 
> > log_min_messages=debug1
> > [...]
> > PostgreSQL stand-alone backend 15devel
> > backend> 0NOTICE:  shutting down
> > 0DEBUG:  performing replication slot checkpoint
> > Segmentation fault
> 
FTR I just added an open item for that.
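
For reference, one conceivable direction (only a sketch, not a proposed
commit) would be to clear the pointer when the shared entry is released
in backend_status.c:

static void
pgstat_beshutdown_hook(int code, Datum arg)
{
	/* ... existing cleanup reporting the backend as gone ... */

	/*
	 * Hypothetical addition: forget the pointer so that a later
	 * pgstat_get_my_query_id() call from elog.c sees NULL and returns 0
	 * instead of reading a detached entry during shutdown.
	 */
	MyBEEntry = NULL;
}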