Commitfest 2023-11 is almost over

2023-12-01 Thread John Naylor
I'm currently spending some time looking for "Needs Review" threads
that have some combination of 1) stalled for a long time and 2) lacking
consensus, with the aim of clearing them out of the CF.

Since we got a late start, I will begin moving entries over to January
on Monday. I thought I remembered discussion on a "bulk move" button,
but if it's there, I can't find it...

--
John Naylor




Re: User functions for building SCRAM secrets

2023-12-01 Thread John Naylor
On Wed, Mar 22, 2023 at 9:06 PM Jonathan S. Katz  wrote:
>
> Maybe I'm not conveying the problem this is solving -- I'm happy to go
> one more round trying to make it clearer -- but if this is not clear,
> it'd be good to at least develop an alternative approach to this before
> withdrawing the patch.

This thread had some lively discussion, but it doesn't seem to have
converged towards consensus, and hasn't had activity since April. That
being the case, maybe it's time to withdraw it and reconsider the
approach later?




Re: Improving asan/ubsan support

2023-12-01 Thread Alexander Lakhin

Hello Tristan,

02.12.2023 00:48, Tristan Partin wrote:



So it looks like the asan feature detect_stack_use_after_return implemented
in gcc allows itself to add some data on the stack, which breaks our alignment
expectations. With all three such Asserts in md.c removed,
`make check-world` passes for me.


Decided to do some digging into this, and Google actually documents[0] how
it works. After reading the algorithm, it is obvious why this fails. What
happens if you throw an __attribute__((no_sanitize("address"))) on the
function? I assume the Asserts would then pass. The commit[1] which added
pg_attribute_aligned() provides insight as to why the Asserts exist.


Thank you for spending your time on this!

Yes, I understand what those Asserts were added for; I removed them just
to check what else is in the way.
And I can confirm that marking that function with the no_sanitize attribute
fixes that exact failure. Then the same attribute has to be added to
_hash_alloc_buckets(), to prevent:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: 
"md.c", Line: 471, PID: 1766976

#5  0x5594a6a0f0d0 in ExceptionalCondition (conditionName=0x5594a7454a60 "(uintptr_t) buffer == 
TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)", fileName=0x5594a7454880 "md.c", lineNumber=471) at assert.c:66
#6  0x5594a61ce133 in mdextend (reln=0x62537e48, forknum=MAIN_FORKNUM, blocknum=9, buffer=0x7fc3b3947020, 
skipFsync=false) at md.c:471
#7  0x5594a61d89ab in smgrextend (reln=0x62537e48, forknum=MAIN_FORKNUM, blocknum=9, buffer=0x7fc3b3947020, 
skipFsync=false) at smgr.c:501

#8  0x5594a4a0c43d in _hash_alloc_buckets (rel=0x7fc3a89714f8, 
firstblock=6, nblocks=4) at hashpage.c:1033

And to RelationCopyStorage(), to prevent:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: 
"md.c", Line: 752, PID: 1787855

#5  0x56081d5688bc in ExceptionalCondition (conditionName=0x56081dfaea40 "(uintptr_t) buffer == 
TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)", fileName=0x56081dfae860 "md.c", lineNumber=752) at assert.c:66

#6  0x56081cd29415 in mdread (reln=0x62943158, forknum=MAIN_FORKNUM, 
blocknum=0, buffer=0x7fe480633020) at md.c:752
#7  0x56081cd32cb3 in smgrread (reln=0x62943158, forknum=MAIN_FORKNUM, blocknum=0, buffer=0x7fe480633020) at 
smgr.c:565
#8  0x56081b9ed5f2 in RelationCopyStorage (src=0x62943158, dst=0x62941248, forkNum=MAIN_FORKNUM, 
relpersistence=112 'p') at storage.c:487


Probably, it has to be added to all the functions where a PGIOAlignedBlock
is located on the stack...

But I still wonder how this works with clang: why is that extra attribute
not required there?
In other words, such implementation specifics discourage me...



Possibly, but I think I would rather see upstream support for running with
all features, with instrumentation turned off in various sections of code.
Even some assistance from AddressSanitizer is better than none. Here[1][2]
are all the AddressSanitizer flags, for those curious.


Yeah, and you might also need to specify extra flags to successfully run
postgres with newer sanitizer versions. Say, for clang-18 you need to
specify -fno-sanitize=function (which is not recognized by gcc 13.2) to
avoid errors like this:
running bootstrap script ... dynahash.c:1120:4: runtime error: call to function strlcpy through pointer to incorrect 
function type 'void *(*)(void *, const void *, unsigned long)'

.../src/port/strlcpy.c:46: note: strlcpy defined here
    #0 0x556af5e0b0a9 in hash_search_with_hash_value 
.../src/backend/utils/hash/dynahash.c:1120:4
    #1 0x556af5e08f4f in hash_search .../src/backend/utils/hash/dynahash.c:958:9

I personally would like to see Postgres have support for AddressSanitizer.
I think it already supports UndefinedBehaviorSanitizer, if I am remembering
the buildfarm properly. AddressSanitizer has been so helpful in my past
experiences writing C.


Me too. I find it very valuable for my personal usage but I'm afraid it's
still not very stable/mature.
One more example. Just adding -fsanitize=undefined for gcc 12, 13 (I tried
12.1, 13.0, 13.2) produces new warnings like:
In function 'PageGetItemId',
    inlined from 'heap_xlog_update' at heapam.c:9569:9:
../../../../src/include/storage/bufpage.h:243:16: warning: array subscript -1 is below array bounds of 'ItemIdData[]' [-Warray-bounds=]

  243 | return &((PageHeader) page)->pd_linp[offsetNumber - 1];
      | ^~~

But I don't get such warnings when I use gcc 11.3 (though it generates
other ones) or clang (15, 18). They also aren't produced with -O0, -O1...
Maybe it's another gcc bug, I'm not sure how to deal with it.
(I can research this issue, if it makes any sense.)

So I would say that the cost of providing/maintaining full support for asan
(hwasan) and ubsan is not near zero, unfortunately. I would estimate it at
10-20 discussions/commits at the start and 5-10 per year later (not
including fixes for

Re: Synchronizing slots from primary to standby

2023-12-01 Thread Amit Kapila
On Fri, Dec 1, 2023 at 9:31 PM Nisha Moond  wrote:
>
> On Fri, Dec 1, 2023 at 5:40 PM Nisha Moond  wrote:
> >
> > Review for v41 patch.
> >
> > 1.
> > ==
> > src/backend/utils/misc/postgresql.conf.sample
> >
> > +#enable_syncslot = on # enables slot synchronization on the physical
> > standby from the primary
> >
> > enable_syncslot is disabled by default, so, it should be 'off' here.
> >
> > ~~~
> > 2.
> > IIUC, the slotsyncworker's connection to the primary is just to execute a
> > query; it is not meant to be a walsender-type connection, but when queried
> > on the primary, the 'backend_type' shows as 'walsender'.
> > Snippet from primary db-
> >
> > datname  |   usename   | application_name | wait_event_type | backend_type
> > -+-+--+-+--
> > postgres | replication | slotsyncworker   | Client  | walsender
> >
> > Is it okay?
> >
> > ~~~
> > 3.
> > As per the current logic, if there are slots on the primary with disabled
> > subscriptions, then, when a standby is created, it replicates these slots
> > but can't make them sync-ready until some activity happens on the
> > slots.
> > So, such slots stay in the 'i' sync-state and get dropped when failover
> > happens. Now, if the subscriber tries to enable their existing
> > subscription after failover, it gives an error that the slot does not
> > exist.
>
> Is this behavior expected? If yes, then is it worth documenting about
> disabled subscription slots not being synced?
>

This is expected behavior, because even if we retained such slots
(state 'i'), we wouldn't be able to bring them to the 'ready' state after
failover, because we can't get the required WAL from the primary.

-- 
With Regards,
Amit Kapila.




Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2023-12-01 Thread Amit Kapila
On Fri, Dec 1, 2023 at 1:55 PM li jie  wrote:
>
> > This is just an immature idea. I haven't started to implement it yet.
> > Maybe it was designed this way because there
> > are key factors that I didn't consider. So I want to hear everyone's
> > opinions, especially the designers of logic decoding.
>
> Attached is the patch I used to implement this optimization.
> The main designs are as follows:
> 1. Added a callback LogicalDecodeFilterByRelCB for the output plugin.
>
> 2. Added this callback function pgoutput_table_filter for the pgoutput plugin.
> Its main implementation is based on the table filter in the
> pgoutput_change function.
>
> 3. After constructing a change, and before queueing it into a transaction,
>    use RelidByRelfilenumber to obtain the relation associated with the change,
>    just like obtaining the relation in the ReorderBufferProcessTXN function.
>
> 4. The relation may be a toast table, and there is no good way to get its
>    real table relation based on the toast relation. Here, I get the real
>    table oid through the toast relname, and then get the real table relation.
>

This may be helpful for the case you have mentioned but how about
cases where there is nothing to filter by relation? It will add
overhead related to the transaction start/end and others for each
change. Currently, we do that just once for all the changes that need
to be processed. I wonder why the spilling can't be avoided with the
logical_decoding_work_mem GUC?

-- 
With Regards,
Amit Kapila.




Re: Emitting JSON to file using COPY TO

2023-12-01 Thread Joe Conway

On 12/1/23 22:00, Davin Shearer wrote:
I'm really glad to see this taken up as a possible new feature and will 
definitely use it if it gets released.  I'm impressed with how clean, 
understandable, and approachable the postgres codebase is in general and 
how easy it is to read and understand this patch.


I reviewed the patch (though I didn't build and test the code) and have
a concern with adding the '[' at the beginning and ']' at the end of the
json output.  Those are already added by `json_agg`
(https://www.postgresql.org/docs/current/functions-aggregate.html) as
you can see in my initial email.  Adding them in the COPY TO may be
redundant (e.g., [[{"key":"value"...}]]).


With this patch in place you don't use json_agg() at all. See the 
example output (this is real output with the patch applied):


(oops -- I meant to send this with the same email as the patch)

8<-
create table foo(id int8, f1 text, f2 timestamptz);
insert into foo
  select g.i,
 'line: ' || g.i::text,
 clock_timestamp()
  from generate_series(1,4) as g(i);

copy foo to stdout (format 'json');
[
 {"id":1,"f1":"line: 1","f2":"2023-12-01T12:58:16.776863-05:00"}
,{"id":2,"f1":"line: 2","f2":"2023-12-01T12:58:16.777084-05:00"}
,{"id":3,"f1":"line: 3","f2":"2023-12-01T12:58:16.777096-05:00"}
,{"id":4,"f1":"line: 4","f2":"2023-12-01T12:58:16.777103-05:00"}
]
8<-


I think COPY TO makes good sense to support, though COPY FROM maybe not 
so much as JSON isn't necessarily flat and rectangular like CSV.


Yeah -- definitely not as straightforward, but possibly we just support
the array-of-jsonobj-rows as input as well?


For my use-case, I'm emitting JSON files to Apache NiFi for processing, 
and NiFi has superior handling of JSON (via JOLT parsers) versus CSV 
where parsing is generally done with regex.  I want to be able to emit 
JSON using a postgres function and thus COPY TO.


Definitely +1 for COPY TO.

I don't think COPY FROM will work out well unless the JSON is required 
to be flat and rectangular.  I would vote -1 to leave it out due to the 
necessary restrictions making it not generally useful.



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Bug in pgbench prepared statements

2023-12-01 Thread Lev Kokotov
> The patch you have sent does not apply cleanly on the master branch,
> could you rebase please?

Attached. PR against master also here, just to make sure it's mergeable.

> Wouldn't it
> better to consume the errors from PQsendQueryPrepared() and
> PQsendQueryParams() when these fail?

The error is returned by PQprepare(), which is called only in QUERY_PREPARED
mode, so PQsendQueryParams() does not apply, and it happens before
PQsendQueryPrepared() is called, so catching the error from
PQsendQueryPrepared() is exactly what's causing the bug: ERROR:  prepared
statement "P_0" does not exist.

Best,

Lev
postgresml.org

On Thu, Nov 30, 2023 at 11:19 PM Michael Paquier 
wrote:

> On Thu, Nov 30, 2023 at 07:15:54PM -0800, Lev Kokotov wrote:
> >> I see prepareCommand() is called one more time in
> >> prepareCommandsInPipeline(). Should you also check the return value
> >> there?
> >
> > Yes, good catch. New patch attached.
>
> Agreed that this is not really helpful as it stands
>
> >> It may also be useful to throw this patch on the January commitfest if
> >> no one else comes along to review/commit it.
> >
> > First time contributing, not familiar with the process here, but happy to
> > learn.
>
> The patch you have sent does not apply cleanly on the master branch,
> could you rebase please?
>
> FWIW, I am a bit confused by the state of sendCommand().  Wouldn't it
> better to consume the errors from PQsendQueryPrepared() and
> PQsendQueryParams() when these fail?
> --
> Michael
>


pgbench2.patch
Description: Binary data


Re: Emitting JSON to file using COPY TO

2023-12-01 Thread Davin Shearer
I'm really glad to see this taken up as a possible new feature and will
definitely use it if it gets released.  I'm impressed with how clean,
understandable, and approachable the postgres codebase is in general and
how easy it is to read and understand this patch.

I reviewed the patch (though I didn't build and test the code) and have a
concern with adding the '[' at the beginning and ']' at the end of the json
output.  Those are already added by `json_agg` (
https://www.postgresql.org/docs/current/functions-aggregate.html) as you
can see in my initial email.  Adding them in the COPY TO may be redundant
(e.g., [[{"key":"value"...}]]).

I think COPY TO makes good sense to support, though COPY FROM maybe not so
much as JSON isn't necessarily flat and rectangular like CSV.

For my use-case, I'm emitting JSON files to Apache NiFi for processing, and
NiFi has superior handling of JSON (via JOLT parsers) versus CSV where
parsing is generally done with regex.  I want to be able to emit JSON using
a postgres function and thus COPY TO.

Definitely +1 for COPY TO.

I don't think COPY FROM will work out well unless the JSON is required to
be flat and rectangular.  I would vote -1 to leave it out due to the
necessary restrictions making it not generally useful.

Hope it helps,
Davin

On Fri, Dec 1, 2023 at 6:10 PM Nathan Bossart 
wrote:

> On Fri, Dec 01, 2023 at 02:28:55PM -0500, Joe Conway wrote:
> > I did a quick PoC patch (attached) -- if there is interest and no hard
> > objections I would like to get it up to speed for the January commitfest.
>
> Cool.  I would expect there to be interest, given all the other JSON
> support that has been added thus far.
>
> I noticed that, with the PoC patch, "json" is the only format that must be
> quoted.  Without quotes, I see a syntax error.  I'm assuming there's a
> conflict with another json-related rule somewhere in gram.y, but I haven't
> tracked down exactly which one is causing it.
>
> > 1. Is supporting JSON array format sufficient, or does it need to support
> > some other options? How flexible does the support scheme need to be?
>
> I don't presently have a strong opinion on this one.  My instinct would be
> start with something simple, though.  I don't think we offer any special
> options for log_destination...
>
> > 2. This only supports COPY TO and we would undoubtedly want to support
> COPY
> > FROM for JSON as well, but is that required from the start?
>
> I would vote for including COPY FROM support from the start.
>
> > ! if (!cstate->opts.json_mode)
>
> I think it's unfortunate that this further complicates the branching in
> CopyOneRowTo(), but after some quick glances at the code, I'm not sure it's
> worth refactoring a bunch of stuff to make this nicer.
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com
>


Re: Refactoring backend fork+exec code

2023-12-01 Thread Alexander Lakhin

Hello Heikki,

01.12.2023 23:44, Heikki Linnakangas wrote:



With memset(param, 0, sizeof(*param)); added at the beginning of
save_backend_variables(), server starts successfully, but then
`make check` fails with other Valgrind error.


That's actually a pre-existing issue; I'm seeing that even on unpatched
'master'.


Thank you for fixing that!

Yes, I had discovered it before, but yesterday I decided to check whether
your patches improve the situation...

What bothered me additionally is an error detected after server start. I
couldn't see it without the patches applied. I mean, on HEAD I now see
`make check` passing, but with the patches it fails:
...
# parallel group (20 tests):  interval date numerology polygon box macaddr8 macaddr multirangetypes line timestamp 
timetz timestamptz time circle strings lseg inet md5 path point

not ok 22    + strings  1048 ms
# (test process exited with exit code 2)
not ok 23    + md5  1052 ms
# (test process exited with exit code 2)
...
src/test/regress/log/postmaster.log contains:
==00:00:00:30.730 1713480== Syscall param write(buf) points to uninitialised 
byte(s)
==00:00:00:30.730 1713480==    at 0x5245A37: write (write.c:26)
==00:00:00:30.730 1713480==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 
(fileops.c:1180)
==00:00:00:30.730 1713480==    by 0x51BC84F: new_do_write (fileops.c:448)
==00:00:00:30.730 1713480==    by 0x51BC84F: _IO_new_file_xsputn 
(fileops.c:1254)
==00:00:00:30.730 1713480==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 
(fileops.c:1196)
==00:00:00:30.730 1713480==    by 0x51B1056: fwrite (iofwrite.c:39)
==00:00:00:30.730 1713480==    by 0x5540CF: internal_forkexec 
(postmaster.c:4526)
==00:00:00:30.730 1713480==    by 0x5543C0: bgworker_forkexec 
(postmaster.c:5624)
==00:00:00:30.730 1713480==    by 0x555477: do_start_bgworker 
(postmaster.c:5665)
==00:00:00:30.730 1713480==    by 0x555738: maybe_start_bgworkers 
(postmaster.c:5928)
==00:00:00:30.730 1713480==    by 0x556072: process_pm_pmsignal 
(postmaster.c:5080)
==00:00:00:30.730 1713480==    by 0x556610: ServerLoop (postmaster.c:1761)
==00:00:00:30.730 1713480==    by 0x557BE2: PostmasterMain (postmaster.c:1469)
==00:00:00:30.730 1713480==    by 0x47216B: main (main.c:198)
==00:00:00:30.730 1713480==  Address 0x1ffeffd8c0 is on thread 1's stack
==00:00:00:30.730 1713480==  in frame #4, created by internal_forkexec 
(postmaster.c:4482)
==00:00:00:30.730 1713480==
...
2023-12-02 05:14:30.751 MSK client backend[1713740] pg_regress/rangetypes FATAL:  terminating connection due to 
unexpected postmaster exit
2023-12-02 05:14:31.033 MSK client backend[1713734] pg_regress/numeric FATAL:  postmaster exited during a parallel 
transaction

TRAP: failed Assert("!IsTransactionOrTransactionBlock()"), File: "pgstat.c", 
Line: 591, PID: 1713734

I haven't looked deeper yet, but it seems that we see two issues here (and
the Assert is not directly caused by the patch set).

Best regards,
Alexander




Re: processes stuck in shutdown following OOM/recovery

2023-12-01 Thread Thomas Munro
On Sat, Dec 2, 2023 at 2:18 PM Thomas Munro  wrote:
> On Fri, Dec 1, 2023 at 6:13 PM Justin Pryzby  wrote:
> > $ kill -9 2524495; sleep 0.05; pg_ctl -D ./pgdev.dat1 stop -m fast # 
> > 2524495 is a child's pid
>
> > This affects v15, and fails at 7ff23c6d27 but not its parent.
>
> Repro'd here.  I had to make the sleep shorter on my system.  Looking...

The PostmasterStateMachine() case for PM_WAIT_BACKENDS doesn't tell
the checkpointer to shut down in this race case.  We have
CheckpointerPID != 0 (because 7ff23c6d27 starts it earlier than
before), and FatalError is true because a child recently crashed and
we haven't yet received the PMSIGNAL_RECOVERY_STARTED handler that
would clear it.  Hmm.




Re: [PATCH] plpython function causes server panic

2023-12-01 Thread Tom Lane
I wrote:
> The only readily-reachable error case in BeginInternalSubTransaction
> is this specific one about IsInParallelMode, which was added later
> than the original design and evidently with not a lot of thought or
> testing.  The comment for it speculates about whether we could get
> rid of it, so I wonder if our thoughts about this ought to go in that
> direction.

After thinking a bit more I wonder why we need that error check at all.
Why isn't it sufficient to rely on GetNewTransactionId()'s check that
throws an error if a parallelized subtransaction tries to obtain an XID?
I don't see why we'd need to "synchronize transaction state" about
anything that never acquires an XID.

regards, tom lane




Re: [PATCH] plpython function causes server panic

2023-12-01 Thread Tom Lane
Andres Freund  writes:
> On 2023-12-01 20:04:15 -0500, Tom Lane wrote:
>> Thanks for the report!  I see the problem is that we're not expecting
>> BeginInternalSubTransaction to fail.  However, I'm not sure I like
>> this solution, mainly because it's only covering a fraction of the
>> problem.  There are similarly unsafe usages in plperl, pltcl, and
>> very possibly a lot of third-party PLs.  I wonder if there's a way
>> to deal with this issue without changing these API assumptions.

> There are plenty other uses, but it's not clear to me that they are similarly
> affected by BeginInternalSubTransaction raising an error? It e.g. doesn't
> immediately look like plperl's usage would be affected in a similar way?

Why not?  We'd be longjmp'ing out from inside the Perl interpreter.
Maybe Perl is so robust it doesn't care, but I'd be surprised if this
can't break it.  The same for Tcl.

I think that plpgsql indeed doesn't care because it has no non-PG
interpreter state to worry about.  But it's in the minority I fear.

regards, tom lane




Re: [PATCH] plpython function causes server panic

2023-12-01 Thread Andres Freund
Hi,

On 2023-12-01 20:04:15 -0500, Tom Lane wrote:
> Hao Zhang  writes:
> > I found a problem when executing the plpython function:
> > After the plpython function returns an error, in the same session, if we
> > continue to execute
> > plpython function, the server panic will be caused.
> 
> Thanks for the report!  I see the problem is that we're not expecting
> BeginInternalSubTransaction to fail.  However, I'm not sure I like
> this solution, mainly because it's only covering a fraction of the
> problem.  There are similarly unsafe usages in plperl, pltcl, and
> very possibly a lot of third-party PLs.  I wonder if there's a way
> to deal with this issue without changing these API assumptions.

There are plenty other uses, but it's not clear to me that they are similarly
affected by BeginInternalSubTransaction raising an error? It e.g. doesn't
immediately look like plperl's usage would be affected in a similar way?

Greetings,

Andres Freund




Re: processes stuck in shutdown following OOM/recovery

2023-12-01 Thread Thomas Munro
On Fri, Dec 1, 2023 at 6:13 PM Justin Pryzby  wrote:
> $ kill -9 2524495; sleep 0.05; pg_ctl -D ./pgdev.dat1 stop -m fast # 2524495 
> is a child's pid

> This affects v15, and fails at 7ff23c6d27 but not its parent.

Repro'd here.  I had to make the sleep shorter on my system.  Looking...




Re: [PATCH] plpython function causes server panic

2023-12-01 Thread Tom Lane
Hao Zhang  writes:
> I found a problem when executing the plpython function:
> After the plpython function returns an error, in the same session, if we
> continue to execute
> plpython function, the server panic will be caused.

Thanks for the report!  I see the problem is that we're not expecting
BeginInternalSubTransaction to fail.  However, I'm not sure I like
this solution, mainly because it's only covering a fraction of the
problem.  There are similarly unsafe usages in plperl, pltcl, and
very possibly a lot of third-party PLs.  I wonder if there's a way
to deal with this issue without changing these API assumptions.

The only readily-reachable error case in BeginInternalSubTransaction
is this specific one about IsInParallelMode, which was added later
than the original design and evidently with not a lot of thought or
testing.  The comment for it speculates about whether we could get
rid of it, so I wonder if our thoughts about this ought to go in that
direction.

In any case, if we do proceed along the lines of catching errors
from BeginInternalSubTransaction, I think your patch is a bit shy
of a load because it doesn't do all the same things that other callers
of PLy_spi_exception_set do.  Looking at that made me wonder why
the PLy_spi_exceptions lookup business was being duplicated by every
caller rather than being done once in PLy_spi_exception_set.  So
0001 attached is a quick refactoring patch to remove that code
duplication, and then 0002 is your patch adapted to that.

I also attempted to include a test case in 0002, but I'm not very
satisfied with that.  Your original test case seemed pretty expensive
for the amount of code coverage it adds, so I tried to make it happen
with debug_parallel_query instead.  That does exercise the new code,
but it does not exhibit the crash if run against unpatched code.
That's because with this test case the error is only thrown in worker
processes not the leader, so we don't end up with corrupted Python
state in the leader.  That result also points up that the original
test case isn't very reliable for this either: you have to have
parallel_leader_participation on, and you have to have the leader
process at least one row, which makes it pretty timing-sensitive.
On top of all that, the test would become useless if we do eventually
get rid of the !IsInParallelMode restriction.  So I'm kind of inclined
to not bother with a test case if this gets to be committed in this
form.

Thoughts anyone?

regards, tom lane

diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c
index ff87b27de0..64ca47f218 100644
--- a/src/pl/plpython/plpy_spi.c
+++ b/src/pl/plpython/plpy_spi.c
@@ -28,7 +28,7 @@
 static PyObject *PLy_spi_execute_query(char *query, long limit);
 static PyObject *PLy_spi_execute_fetch_result(SPITupleTable *tuptable,
 			  uint64 rows, int status);
-static void PLy_spi_exception_set(PyObject *excclass, ErrorData *edata);
+static void PLy_spi_exception_set(ErrorData *edata);
 
 
 /* prepare(query="select * from foo")
@@ -470,8 +470,6 @@ PLy_commit(PyObject *self, PyObject *args)
 	PG_CATCH();
 	{
 		ErrorData  *edata;
-		PLyExceptionEntry *entry;
-		PyObject   *exc;
 
 		/* Save error info */
 		MemoryContextSwitchTo(oldcontext);
@@ -481,17 +479,8 @@ PLy_commit(PyObject *self, PyObject *args)
 		/* was cleared at transaction end, reset pointer */
 		exec_ctx->scratch_ctx = NULL;
 
-		/* Look up the correct exception */
-		entry = hash_search(PLy_spi_exceptions, &(edata->sqlerrcode),
-			HASH_FIND, NULL);
-
-		/*
-		 * This could be a custom error code, if that's the case fallback to
-		 * SPIError
-		 */
-		exc = entry ? entry->exc : PLy_exc_spi_error;
 		/* Make Python raise the exception */
-		PLy_spi_exception_set(exc, edata);
+		PLy_spi_exception_set(edata);
 		FreeErrorData(edata);
 
 		return NULL;
@@ -517,8 +506,6 @@ PLy_rollback(PyObject *self, PyObject *args)
 	PG_CATCH();
 	{
 		ErrorData  *edata;
-		PLyExceptionEntry *entry;
-		PyObject   *exc;
 
 		/* Save error info */
 		MemoryContextSwitchTo(oldcontext);
@@ -528,17 +515,8 @@ PLy_rollback(PyObject *self, PyObject *args)
 		/* was cleared at transaction end, reset pointer */
 		exec_ctx->scratch_ctx = NULL;
 
-		/* Look up the correct exception */
-		entry = hash_search(PLy_spi_exceptions, &(edata->sqlerrcode),
-			HASH_FIND, NULL);
-
-		/*
-		 * This could be a custom error code, if that's the case fallback to
-		 * SPIError
-		 */
-		exc = entry ? entry->exc : PLy_exc_spi_error;
 		/* Make Python raise the exception */
-		PLy_spi_exception_set(exc, edata);
+		PLy_spi_exception_set(edata);
 		FreeErrorData(edata);
 
 		return NULL;
@@ -594,8 +572,6 @@ void
 PLy_spi_subtransaction_abort(MemoryContext oldcontext, ResourceOwner oldowner)
 {
 	ErrorData  *edata;
-	PLyExceptionEntry *entry;
-	PyObject   *exc;
 
 	/* Save error info */
 	MemoryContextSwitchTo(oldcontext);
@@ -607,17 +583,8 @@ PLy_spi_subtransaction_abort(MemoryContext oldcontext, 

Re: logical decoding and replication of sequences, take 2

2023-12-01 Thread Tomas Vondra



On 12/1/23 12:08, Hayato Kuroda (Fujitsu) wrote:
> Dear Tomas,
> 
>> I did some micro-benchmarking today, trying to identify cases where this
>> would cause unexpected problems, either due to having to maintain all
>> the relfilenodes, or due to having to do hash lookups for every sequence
>> change. But I think it's fine, mostly ...
>>
> 
> I did also performance tests (especially case 3). First of all, there are some
> variants from yours.
> 
> 1. patch 0002 was reverted because it has an issue. So this test checks whether
>    refactoring around ReorderBufferSequenceIsTransactional seems really needed.

FWIW I also did the benchmarks without the 0002 patch, for the same
reason. I forgot to mention that.

> 2. per comments from Amit, I also measured the abort case. In this case,
>    alter_sequence() is called but the transaction is aborted.
> 3. I measured with a changing number of clients {8, 16, 32, 64, 128}. In all
>    cases, clients executed 1000 transactions. The performance machine has 128
>    cores, so the result for 128 clients might be saturated.
> 4. a short sleep (0.1s) was added in alter_sequence(), especially between
>    "alter sequence" and nextval(), because while testing I found that the
>    transaction is too short to execute in parallel. I think it is reasonable
>    because ReorderBufferSequenceIsTransactional() might be worse when the
>    parallelism is increased.
> 
> I attached one backend process via perf and executed
> pg_logical_slot_get_changes().
> The attached txt file shows which functions occupied CPU time, especially under
> pg_logical_slot_get_changes_guts() and ReorderBufferSequenceIsTransactional().
> Here are my observations about them.
> 
> * In case of commit, as you said, SnapBuildCommitTxn() seems dominant for
>   the 8-64 clients cases.
> * For the (commit, 128 clients) case, however, ReorderBufferRestoreChanges()
>   wastes much time. I think this is because changes exceed
>   logical_decoding_work_mem, so we do not have to analyze it anymore.
> * In case of abort, CPU time used by ReorderBufferSequenceIsTransactional()
>   grows linearly. This means that we need to think of some solution to avoid
>   the overhead of ReorderBufferSequenceIsTransactional().
> 
> ```
> 8 clients  3.73% occupied time
> 16 7.26%
> 32 15.82%
> 64 29.14%
> 128 46.27%
> ```

Interesting, so what exactly does the transaction do? Anyway, I don't
think this is very surprising - I believe it behaves like this because
of having to search in many hash tables (one in each toplevel xact). And
I think the solution I explained before (maintaining a single toplevel
hash instead of many per-top-level hashes) would address that.

FWIW I find this case interesting, but not very practical, because no
practical workload has that many aborts.

> 
> * In case of abort, I also checked the CPU time used by
>   ReorderBufferAddRelFileLocator(), but it does not seem to depend much on
>   the number of clients.
> 
> ```
> 8 clients 3.66% occupied time
> 16 6.94%
> 32 4.65%
> 64 5.39%
> 128 3.06%
> ```
> 
> As the next step, I've planned to run the case which uses the setval()
> function, because it generates more WAL than a normal nextval().
> What do you think?
> 

Sure, although I don't think it's much different from the test selecting
40 values from the sequence (in each transaction). That generates about
the same amount of WAL.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: logical decoding and replication of sequences, take 2

2023-12-01 Thread Tomas Vondra


On 11/30/23 12:56, Amit Kapila wrote:
> On Thu, Nov 30, 2023 at 5:28 AM Tomas Vondra
>  wrote:
>>
>> 3) "bad case" - small transactions that generate a lot of relfilenodes
>>
>>   select alter_sequence();
>>
>> where the function is defined like this (I did create 1000 sequences
>> before the test):
>>
>>   CREATE OR REPLACE FUNCTION alter_sequence() RETURNS void AS $$
>>   DECLARE
>>   v INT;
>>   BEGIN
>>   v := 1 + (random() * 999)::int;
>>   execute format('alter sequence s%s restart with 1000', v);
>>   perform nextval('s');
>>   END;
>>   $$ LANGUAGE plpgsql;
>>
>> This performs terribly, but it's entirely unrelated to sequences.
>> Current master has exactly the same problem, if transactions do DDL.
>> Like this, for example:
>>
>>   CREATE OR REPLACE FUNCTION create_table() RETURNS void AS $$
>>   DECLARE
>>   v INT;
>>   BEGIN
>>   v := 1 + (random() * 999)::int;
>>   execute format('create table t%s (a int)', v);
>>   execute format('drop table t%s', v);
>>   insert into t values (1);
>>   END;
>>   $$ LANGUAGE plpgsql;
>>
>> This has the same impact on master. The perf report shows this:
>>
>>   --98.06%--pg_logical_slot_get_changes_guts
>>|
>> --97.88%--LogicalDecodingProcessRecord
>>  |
>>  --97.56%--xact_decode
>>   |
>>--97.51%--DecodeCommit
>> |
>> |--91.92%--SnapBuildCommitTxn
>> | |
>> |  --91.65%--SnapBuildBuildSnapshot
>> |   |
>> |   --91.14%--pg_qsort
>>
>> The sequence decoding is maybe ~1%. The reason why SnapBuildSnapshot
>> takes so long is because:
>>
>> -
>>   Breakpoint 1, SnapBuildBuildSnapshot (builder=0x21f60f8)
>>   at snapbuild.c:498
>>   498+ sizeof(TransactionId) *   builder->committed.xcnt
>>   (gdb) p builder->committed.xcnt
>>   $4 = 11532
>> -
>>
>> And with each iteration it grows by 1.
>>
> 
> Can we somehow avoid this either by keeping DDL-related xacts open or
> aborting them?

I'm not sure why the snapshot builder does this, i.e. why we end up
accumulating that many xids, and I didn't have time to look closer. So I
don't know if this would be a solution or not.

> Also, will it make any difference to use setval as
> do_setval() seems to be logging each time?
> 

I think that's pretty much what case (2) does, as it calls nextval()
enough times for each transaction to generate WAL. But I don't think this
is a very sensible benchmark - it's an extreme case, but practical cases
are far closer to case (1) because sequences are intermixed with other
activity. No one really does just nextval() calls.

> If possible, can you share the scripts? Kuroda-San has access to the
> performance machine, he may be able to try it as well.
> 

Sure, attached. But it's a very primitive script, nothing fancy.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

scripts.tgz
Description: application/compressed-tar


Re: [Proposal] global sequence implemented by snowflake ID

2023-12-01 Thread Tomas Vondra
On 12/1/23 07:15, Amit Kapila wrote:
> On Thu, Nov 30, 2023 at 5:21 PM Tomas Vondra
>  wrote:
>>
>> On 11/30/23 11:56, Amit Kapila wrote:
>>
>>>
>>> This was the key point that I wanted to discuss or hear opinions
>>> about. So, if we wish to have some sort of global sequences then it is
>>> not clear to me what benefits will we get by having replication of
>>> non-global sequences. One thing that comes to mind is replication
>>> covers a subset of use cases (like help in case of failover or
>>> switchover to subscriber) and till the time we have some
>>> implementation of global sequences, it can help users.
>>>
>>
>> What are you going to do about use cases like using logical replication
>> for upgrade to the next major version?
> 
> 
> As per my understanding, they should work as-is when using a global
> sequence. Just for the sake of example, considering we have a
> same-name global sequence on both pub and sub, it should work
> during and after major version upgrades.
> 

Sequential IDs have significant benefits too; it's simply not the case that
these global sequences are universally superior. For example, with sequential
sequences you often get locality, because recent data have about the
same sequence values. With global sequences that's not really the case,
because they are often based on randomness, which massively limits the
access locality. (Yes, some variants may maintain the ordering, others
don't.)

>>
>> Or applications that prefer (or
>> have to) use traditional sequences?
>>
> 
> I think we have to suggest them to use global sequence for the use
> cases where they want those to work with logical replication use
> cases. Now, if still users want their existing sequences to work then
> we can probably see if there is a way to provide an option via Alter
> Sequence to change it to a global sequence.
> 

I really don't know how that would work, e.g. for existing applications
that designed their schema a long time ago. Or for systems that
use 32-bit sequences - I'm not aware of global sequences that narrow.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Refactoring backend fork+exec code

2023-12-01 Thread Tom Lane
"Tristan Partin"  writes:
> On Fri Dec 1, 2023 at 2:44 PM CST, Heikki Linnakangas wrote:
>> Committed a fix with memset(). I'm not sure what our policy with 
>> backpatching this kind of issues is. This goes back to all supported 
>> versions, but given the lack of complaints, I chose to not backpatch.

> Seems like a harmless thing to backpatch. It is conceivable that someone 
> might want to run Valgrind on something other than HEAD.

FWIW, I agree with Heikki's conclusion.  EXEC_BACKEND on non-Windows
is already a niche developer-only setup, and given the lack of complaints,
nobody's that interested in running Valgrind with it.  Fixing it on HEAD
seems like plenty.

regards, tom lane




Re: Emitting JSON to file using COPY TO

2023-12-01 Thread Nathan Bossart
On Fri, Dec 01, 2023 at 02:28:55PM -0500, Joe Conway wrote:
> I did a quick PoC patch (attached) -- if there is interest and no hard
> objections I would like to get it up to speed for the January commitfest.

Cool.  I would expect there to be interest, given all the other JSON
support that has been added thus far.

I noticed that, with the PoC patch, "json" is the only format that must be
quoted.  Without quotes, I see a syntax error.  I'm assuming there's a
conflict with another json-related rule somewhere in gram.y, but I haven't
tracked down exactly which one is causing it.

> 1. Is supporting JSON array format sufficient, or does it need to support
> some other options? How flexible does the support scheme need to be?

I don't presently have a strong opinion on this one.  My instinct would be
to start with something simple, though.  I don't think we offer any special
options for log_destination...

> 2. This only supports COPY TO and we would undoubtedly want to support COPY
> FROM for JSON as well, but is that required from the start?

I would vote for including COPY FROM support from the start.

> ! if (!cstate->opts.json_mode)

I think it's unfortunate that this further complicates the branching in
CopyOneRowTo(), but after some quick glances at the code, I'm not sure it's
worth refactoring a bunch of stuff to make this nicer.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: pgsql: meson: docs: Add {html,man} targets, rename install-doc-*

2023-12-01 Thread Andres Freund
Hi,

On 2023-12-01 15:55:29 -0600, Tristan Partin wrote:
> Commits look fine to me, but I hate the new target names...

You shouldn't ever need to use them anywhere - that's what the alias is for...

Happy to go another route if you have a suggestion.


> >  +for name, v in targets_info_byname.items():
> >  +if len(targets_info_byname[name]) > 1:
> 
> My only comment is that you could reverse the logic and save yourself an
> indentation.
> 
> - if len(targets_info_byname[name]) > 1:
> + if len(targets_info_byname[name]) <= 1:
> + continue
> 
> But whatever you want.

Makes sense.




Re: Clean up some signal usage mainly related to Windows

2023-12-01 Thread Tristan Partin

On Wed Jul 12, 2023 at 9:35 AM CDT, Tristan Partin wrote:

On Wed Jul 12, 2023 at 9:31 AM CDT, Peter Eisentraut wrote:
> On 12.07.23 16:23, Tristan Partin wrote:
> > It has come to my attention that STDOUT_FILENO might not be portable and
> > fileno(3) isn't marked as signal-safe, so I have just used the raw 1 for
> > stdout, which as far as I know is portable.
>
> We do use STDOUT_FILENO elsewhere in the code, and there are even 
> workaround definitions for Windows, so it appears it is meant to be used.


v3 is back to the original patch with newline being printed. Thanks.


Peter, did you have anything more to say about patch 1 in this series? 
Thinking about patch 2 more, I'm not sure it should be considered until 
I set up a Windows VM to do some testing, or unless some benevolent 
Windows user wants to look at it and test it.


--
Tristan Partin
Neon (https://neon.tech)




Re: [RFC] Clang plugin for catching suspicious typedef casting

2023-12-01 Thread Tristan Partin

On Fri Aug 4, 2023 at 5:47 AM CDT, Dmitry Dolgov wrote:

> On Thu, Aug 03, 2023 at 12:23:52PM -0500, Tristan Partin wrote:
>
> This is the first I am learning about clang plugins. Interesting concept.
> Did you give any thought to using libclang instead of a compiler plugin? I
> am kind of doing similar work, but I went with libclang instead of a plugin.

Nope, never thought about trying libclang. From the clang documentation
it seems a plugin is a suitable interface if one wants to:

special lint-style warnings or errors for your project

Which is what I was trying to achieve. Are there any advantages of
libclang that you see?


Only advantage I see to using libclang is that you can run programs 
built with libclang regardless of what your C compiler is. I typically 
use GCC.


I think your idea of automating this kind of thing is great no matter 
how it is implemented.


--
Tristan Partin
Neon (https://neon.tech)




Re: pgsql: meson: docs: Add {html,man} targets, rename install-doc-*

2023-12-01 Thread Tristan Partin
Commits look fine to me, but I hate the new target names... Luckily, 
I just use plain ninja, so I don't interact with that.



 +for name, v in targets_info_byname.items():
 +if len(targets_info_byname[name]) > 1:


My only comment is that you could reverse the logic and save yourself an 
indentation.


- if len(targets_info_byname[name]) > 1:
+ if len(targets_info_byname[name]) <= 1:
+ continue

But whatever you want.

--
Tristan Partin
Neon (https://neon.tech)




Re: [PATCH] pg_convert improvement

2023-12-01 Thread Nathan Bossart
On Mon, Nov 27, 2023 at 08:11:06AM +0100, Drouvot, Bertrand wrote:
> + PG_RETURN_BYTEA_P(string);

I looked around to see whether there was some sort of project policy about
returning arguments without copying them, but the only strict rule I see is
to avoid scribbling on argument data without first copying it.  However, I
do see functions that return unmodified arguments both with and without
copying.  For example, unaccent_dict() is careful to copy the argument
before returning it:

PG_RETURN_TEXT_P(PG_GETARG_TEXT_P_COPY(strArg));

But replace_text() is not:

/* Return unmodified source string if empty source or pattern */
if (src_text_len < 1 || from_sub_text_len < 1)
{
PG_RETURN_TEXT_P(src_text);
}

I don't have any specific concerns about doing this, though.  Otherwise,
the patch looks pretty good to me, so I will plan on committing it shortly.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Improving asan/ubsan support

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 3:00 AM CST, Alexander Lakhin wrote:

[ this thread separated from [1] as the discussion focus shifted ]

Hi Andres,

29.11.2023 22:39, Andres Freund wrote:
>> I use the following:
>> ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:print_stacktrace=1:\
>> disable_coredump=0:strict_string_checks=1:check_initialization_order=1:\
>> strict_init_order=1:detect_stack_use_after_return=0
> I wonder if we should add some of these options by default ourselves. We could
> e.g. add something like the __ubsan_default_options() in
> src/backend/main/main.c to src/port/... instead, and return a combination of
> "our" options (like detect_leaks=0) and the ones from the environment.

I think that such an explicit expression of the project policy regarding
sanitizer checks is a good thing, but I see some obstacles along the way.

First, I'm not sure what to do with new useful options/option values that
will appear in sanitizers eventually. Should only the options that are
supported by all sanitizer versions be specified, or may we expect that
unsupported options/values would be ignored by old versions?

Second, what to do with other binaries, that need detect_leaks=0, for
example, that same ecpg?

> ISTM that, if it actually works as I theorize it should, using
> __attribute__((no_sanitize("address"))) would be the easiest approach
> here. Something like
>
> #if defined(__has_feature) && __has_feature(address_sanitizer)
> #define pg_attribute_no_asan __attribute__((no_sanitize("address")))
> #else
> #define pg_attribute_no_asan
> #endif
>
> or such should work.

I've tried adding:
  bool
+__attribute__((no_sanitize("address")))
  stack_is_too_deep(void)

and it does work for me with clang 15 and 18: `make check-world` passes with
ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:print_stacktrace=1:\
disable_coredump=0:strict_string_checks=1:check_initialization_order=1:\
strict_init_order=1:detect_stack_use_after_return=1
UBSAN_OPTIONS=abort_on_error=1:print_stacktrace=1

(with a fix for pg_bsd_indent applied [2])

But with gcc 11, 12, 13 I get an assertion failure during `make check`:
#4  0x7fabadcd67f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x557f35260382 in ExceptionalCondition (conditionName=0x557f35ca51a0 "(uintptr_t) buffer == 
TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)", fileName=0x557f35ca4fc0 "md.c", lineNumber=471) at assert.c:66
#6  0x557f34a3b2bc in mdextend (reln=0x625375c8, forknum=MAIN_FORKNUM, blocknum=18, buffer=0x7fabaa800020, 
skipFsync=true) at md.c:471
#7  0x557f34a45a6f in smgrextend (reln=0x625375c8, forknum=MAIN_FORKNUM, blocknum=18, buffer=0x7fabaa800020, 
skipFsync=true) at smgr.c:501
#8  0x557f349139ed in RelationCopyStorageUsingBuffer (srclocator=..., dstlocator=..., forkNum=MAIN_FORKNUM, 
permanent=true) at bufmgr.c:4386


The buffer (buf) declared as follows:
static void
RelationCopyStorageUsingBuffer(RelFileLocator srclocator,
    RelFileLocator dstlocator,
    ForkNumber forkNum, bool permanent)
{
...
     PGIOAlignedBlock buf;
...

But as we can see, the buffer address is really not 4k-aligned, and that
offset 0x20 added in run-time only when the server started with
detect_stack_use_after_return=1.
So it looks like the asan feature detect_stack_use_after_return implemented
in gcc allows itself to add some data on stack, that breaks our alignment
expectations. With all three such Asserts in md.c removed,
`make check-world` passes for me.


Decided to do some digging into this, and Google actually documents[0] 
how it works. After reading the algorithm, it is obvious why this fails. 
What happens if you throw an __attribute__((no_sanitize("address"))) on 
the function? I assume the Asserts would then pass. The commit[1] which 
added pg_attribute_aligned() provides insight as to why the Asserts 
exist.



/* If this build supports direct I/O, the buffer must be I/O aligned. */


Disabling instrumentation in functions which use this specific type when 
the build supports direct IO seems like the best solution.



> One thing that's been holding me back on trying to do something around this is
> the basically non-existing documentation around all of this. I haven't even
> found documentation referencing the fact that there are headers like
> sanitizer/asan_interface.h, you just have to figure that out yourself. Compare
> that to something like valgrind, which has documented this at least somewhat.

Yes, so maybe it's reasonable to support only basic/common features (such
as detect_leaks), leaving advanced ones for ad-hoc usage till they prove
their worthiness.


Possibly, but I would rather see upstream support running with all features, 
with instrumentation turned off in the sections of code that need it. Even 
some assistance from AddressSanitizer is better than none. Here[1][2] are 
all the AddressSanitizer flags for those curious.



Best regards,
Alexander

[1] 

Re: Refactoring backend fork+exec code

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 2:44 PM CST, Heikki Linnakangas wrote:

On 01/12/2023 16:00, Alexander Lakhin wrote:
> Maybe you could look at issues with running `make check` under Valgrind
> when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND":
> # +++ regress check in src/test/regress +++
> # initializing database system by copying initdb template
> # postmaster failed, examine ".../src/test/regress/log/postmaster.log" for 
the reason
> Bail out!make[1]: ***
> 
> ...

> 2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG:  listening on Unix socket 
"/tmp/pg_regress-pPFNk0/.s.PGSQL.55312"
> ==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised 
byte(s)
> ==00:00:00:01.692 1259396==    at 0x5245A37: write (write.c:26)
> ==00:00:00:01.692 1259396==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 
(fileops.c:1180)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: new_do_write (fileops.c:448)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_new_file_xsputn 
(fileops.c:1254)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 
(fileops.c:1196)
> ==00:00:00:01.692 1259396==    by 0x51B1056: fwrite (iofwrite.c:39)
> ==00:00:00:01.692 1259396==    by 0x552E21: internal_forkexec 
(postmaster.c:4518)
> ==00:00:00:01.692 1259396==    by 0x5546A1: postmaster_forkexec 
(postmaster.c:)
> ==00:00:00:01.692 1259396==    by 0x55471C: StartChildProcess 
(postmaster.c:5275)
> ==00:00:00:01.692 1259396==    by 0x557B61: PostmasterMain (postmaster.c:1454)
> ==00:00:00:01.692 1259396==    by 0x472136: main (main.c:198)
> ==00:00:00:01.692 1259396==  Address 0x1ffeffdc11 is on thread 1's stack
> ==00:00:00:01.692 1259396==  in frame #4, created by internal_forkexec 
(postmaster.c:4482)
> ==00:00:00:01.692 1259396==
> 
> With memset(param, 0, sizeof(*param)); added at the beginning of

> save_backend_variables(), server starts successfully, but then
> `make check` fails with another Valgrind error.

That's actually a pre-existing issue, I'm seeing that even on unpatched 
'master'.


In a nutshell, the problem is that the uninitialized padding bytes in 
BackendParameters are written to the file. When we read the file back, 
we don't access the padding bytes, so that's harmless. But Valgrind 
doesn't know that.


On Windows, the file is created with 
CreateFileMapping(INVALID_HANDLE_VALUE, ...) and we write the variables 
directly to the mapping. If I understand the Windows API docs correctly, 
it is guaranteed to be initialized to zeros, so we don't have this 
problem on Windows, only on other platforms with EXEC_BACKEND. I think 
it makes sense to clear the memory on other platforms too, since that's 
what we do on Windows.


Committed a fix with memset(). I'm not sure what our policy with 
backpatching this kind of issues is. This goes back to all supported 
versions, but given the lack of complaints, I chose to not backpatch.


Seems like a harmless thing to backpatch. It is conceivable that someone 
might want to run Valgrind on something other than HEAD.


--
Tristan Partin
Neon (https://neon.tech)




Re: Refactoring backend fork+exec code

2023-12-01 Thread Heikki Linnakangas

On 01/12/2023 16:00, Alexander Lakhin wrote:

Maybe you could look at issues with running `make check` under Valgrind
when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND":
# +++ regress check in src/test/regress +++
# initializing database system by copying initdb template
# postmaster failed, examine ".../src/test/regress/log/postmaster.log" for the 
reason
Bail out!make[1]: ***

...
2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG:  listening on Unix socket 
"/tmp/pg_regress-pPFNk0/.s.PGSQL.55312"
==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised 
byte(s)
==00:00:00:01.692 1259396==    at 0x5245A37: write (write.c:26)
==00:00:00:01.692 1259396==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 
(fileops.c:1180)
==00:00:00:01.692 1259396==    by 0x51BC84F: new_do_write (fileops.c:448)
==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_new_file_xsputn 
(fileops.c:1254)
==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 
(fileops.c:1196)
==00:00:00:01.692 1259396==    by 0x51B1056: fwrite (iofwrite.c:39)
==00:00:00:01.692 1259396==    by 0x552E21: internal_forkexec 
(postmaster.c:4518)
==00:00:00:01.692 1259396==    by 0x5546A1: postmaster_forkexec 
(postmaster.c:)
==00:00:00:01.692 1259396==    by 0x55471C: StartChildProcess 
(postmaster.c:5275)
==00:00:00:01.692 1259396==    by 0x557B61: PostmasterMain (postmaster.c:1454)
==00:00:00:01.692 1259396==    by 0x472136: main (main.c:198)
==00:00:00:01.692 1259396==  Address 0x1ffeffdc11 is on thread 1's stack
==00:00:00:01.692 1259396==  in frame #4, created by internal_forkexec 
(postmaster.c:4482)
==00:00:00:01.692 1259396==

With memset(param, 0, sizeof(*param)); added at the beginning of
save_backend_variables(), server starts successfully, but then
`make check` fails with another Valgrind error.


That's actually a pre-existing issue, I'm seeing that even on unpatched 
'master'.


In a nutshell, the problem is that the uninitialized padding bytes in 
BackendParameters are written to the file. When we read the file back, 
we don't access the padding bytes, so that's harmless. But Valgrind 
doesn't know that.


On Windows, the file is created with 
CreateFileMapping(INVALID_HANDLE_VALUE, ...) and we write the variables 
directly to the mapping. If I understand the Windows API docs correctly, 
it is guaranteed to be initialized to zeros, so we don't have this 
problem on Windows, only on other platforms with EXEC_BACKEND. I think 
it makes sense to clear the memory on other platforms too, since that's 
what we do on Windows.


Committed a fix with memset(). I'm not sure what our policy with 
backpatching this kind of issues is. This goes back to all supported 
versions, but given the lack of complaints, I chose to not backpatch.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Dumped SQL failed to execute with ERROR "GROUP BY position -1 is not in select list"

2023-12-01 Thread Tom Lane
Haotian Chen  writes:
> postgres=# create view v1 as select * from t1 group by a,b,-1::int;
> CREATE VIEW

Hmm, surely that is a contrived case?

> After exploring the code, I suppose we should treat operator-plus-constant
> as -'nnn'::typename instead of a Const; my patch does this by special-casing
> OpExpr, but I am not sure whether it's the best way,

Yeah, after some time looking at alternatives, I agree that hacking up
get_rule_sortgroupclause some more is the least risky way to make this
work.  We could imagine changing the parser instead but people might
be depending on the current parsing behavior.

I don't like your patch too much though, particularly not the arbitrary
(and undocumented) change in get_const_expr; that seems way too likely
to have unexpected side-effects.  Also, I think that relying on
generate_operator_name to produce exactly '-' (and not, say,
'pg_catalog.-') is unwise as well as unduly expensive.

There are, I think, precisely two operators we need to worry about here,
namely int4um and numeric_uminus.  It'd be cheaper and more reliable to
identify those by OID.  (If the underlying Const is neither int4 nor
numeric, it'll end up getting labeled with a typecast, so that we don't
need to worry about anything else.)

As for getting the right thing to be printed, I think what we might
want is to effectively const-fold the expression into a negative
Const, and then we can just apply get_const_expr with showtype=1.
(So we'd end with output like '-1'::integer or similar.)

We could do worse than to implement that by actual const-folding,
ie call expression_planner.  Efficiency-wise that's not great, but
this is such a weird edge case that I don't think we need to sweat
about making it fast.  The alternative of hand-rolled constant
folding code isn't very appealing.

regards, tom lane




Re: Building PosgresSQL with LLVM fails on Solaris 11.4

2023-12-01 Thread Andres Freund
Hi,

On 2023-12-01 17:02:25 +, Sacha Hottinger wrote:
> Compiling PostgreSQL 13.13 with the option --with-llvm fails with Developer Studio 
> 12.6 as well as with gcc 13.2.0.
> I have installed the developer/llvm/clang" + "developer/llvm/clang-build pkgs 
> (13.0.1).

Uh, huh. I did not expect that anybody would ever really do that on
solaris. Not that the breakage was intentional, that's a separate issue.

Is this on x86-64 or sparc?


I'm somewhat confused that you report this to happen with gcc as well. We
don't use .s files there. Oh, I guess you see a different error
there:

> o With gcc (psql 13.13):
>
> #./configure CC='/usr/bin/gcc -m64' 
> --with-system-tzdata=/usr/share/lib/zoneinfo --with-llvm
>
> # time gmake all
> ...
> -Wl,--as-needed -Wl,-R'/usr/local/pgsql/lib'  -lLLVM-13
> Undefined   first referenced
> symbol in file
> TTSOpsHeapTuple llvmjit_deform.o
> pfree   llvmjit.o
> …
> MemoryContextAllocZero  llvmjit.o
> pkglib_path llvmjit.o
> ExecEvalStepOp  llvmjit_expr.o
> errhidestmt llvmjit.o
> ld: warning: symbol referencing errors

This is odd. I think this is when building llvmjit.so - unfortunately there's
not enough details to figure out what's wrong here.

Oh, one thing that might be going wrong is that you just set the C compiler to
be gcc, but not the C++ compiler - what happens if you additionally set CXX to g++?



I did not think about .o files generated from .s when writing the make
infrastructure for JITing.  At first I thought the easiest solution would be
to just add a rule to build .bc from .s - but that doesn't work in the
sunstudio case, because it relies on preprocessor logic that's specific to sun
studio - which clang can't parse. Gah.

Thus the attached hack - I think that should work.  It'd mostly be interesting
to see if this is the only roadblock or if there's more.


To be honest, the only case where .s files matter today is building with sun
studio, and that's a compiler we're planning to remove support for. So I'm not
sure it's worth fixing, if it adds complexity.

Greetings,

Andres Freund
diff --git i/src/backend/port/Makefile w/src/backend/port/Makefile
index 47338d99229..005921e8a0d 100644
--- i/src/backend/port/Makefile
+++ w/src/backend/port/Makefile
@@ -33,6 +33,14 @@ endif
 
 include $(top_srcdir)/src/backend/common.mk
 
+# Hacky rule to allow clang to generate tas.bc. One reasonably would think
+# that we could just compile tas.s, but on solaris tas.s can't be parsed with
+# clang. It's not interesting from a JIT perspective anyway, so just generate
+# an empty file.
+
+tas.bc: tas.s
+	echo | $(COMPILE.c.bc) -xc -o $@ $<
+
 tas.o: tas.s
 ifeq ($(SUN_STUDIO_CC), yes)
 # preprocess assembler file with cpp


Re: Emitting JSON to file using COPY TO

2023-12-01 Thread Joe Conway

On 11/29/23 10:32, Davin Shearer wrote:

Thanks for the responses everyone.

I worked around the issue using the `psql -tc` method as Filip described.

I think it would be great to support writing JSON using COPY TO at 
some point so I can emit JSON to files using a PostgreSQL function directly.


-Davin

On Tue, Nov 28, 2023 at 2:36 AM Filip Sedlák > wrote:


This would be a very special case for COPY. It applies only to a single
column of JSON values. The original problem can be solved with psql
--tuples-only as David wrote earlier.


$ psql -tc 'select json_agg(row_to_json(t))
               from (select * from public.tbl_json_test) t;'

   [{"id":1,"t_test":"here's a \"string\""}]


Special-casing any encoding/escaping scheme leads to bugs and harder
parsing.


(moved to hackers)

I did a quick PoC patch (attached) -- if there is interest and no hard 
objections I would like to get it up to speed for the January commitfest.


Currently the patch lacks documentation and regression test support.

Questions:
--
1. Is supporting JSON array format sufficient, or does it need to 
support some other options? How flexible does the support scheme need to be?


2. This only supports COPY TO and we would undoubtedly want to support 
COPY FROM for JSON as well, but is that required from the start?


Thanks for any feedback.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cfad47b..bc1f684 100644
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
*** ProcessCopyOptions(ParseState *pstate,
*** 443,448 
--- 443,450 
   /* default format */ ;
  			else if (strcmp(fmt, "csv") == 0)
  opts_out->csv_mode = true;
+ 			else if (strcmp(fmt, "json") == 0)
+ opts_out->json_mode = true;
  			else if (strcmp(fmt, "binary") == 0)
  opts_out->binary = true;
  			else
*** ProcessCopyOptions(ParseState *pstate,
*** 667,672 
--- 669,679 
  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
   errmsg("cannot specify HEADER in BINARY mode")));
  
+ 	if (opts_out->json_mode && opts_out->header_line)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+  errmsg("cannot specify HEADER in JSON mode")));
+ 
  	/* Check quote */
  	if (!opts_out->csv_mode && opts_out->quote != NULL)
  		ereport(ERROR,
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index c66a047..f6ee771 100644
*** a/src/backend/commands/copyto.c
--- b/src/backend/commands/copyto.c
***
*** 37,42 
--- 37,43 
  #include "rewrite/rewriteHandler.h"
  #include "storage/fd.h"
  #include "tcop/tcopprot.h"
+ #include "utils/json.h"
  #include "utils/lsyscache.h"
  #include "utils/memutils.h"
  #include "utils/partcache.h"
*** typedef struct
*** 112,117 
--- 113,120 
  /* NOTE: there's a copy of this in copyfromparse.c */
  static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0";
  
+ /* need delimiter to start next json array element */
+ static bool json_row_delim_needed = false;
  
  /* non-export function prototypes */
  static void EndCopy(CopyToState cstate);
*** DoCopyTo(CopyToState cstate)
*** 845,850 
--- 848,861 
  
  			CopySendEndOfRow(cstate);
  		}
+ 
+ 		/* if a JSON has been requested send the opening bracket */
+ 		if (cstate->opts.json_mode)
+ 		{
+ 			CopySendChar(cstate, '[');
+ 			CopySendEndOfRow(cstate);
+ 			json_row_delim_needed = false;
+ 		}
  	}
  
  	if (cstate->rel)
*** DoCopyTo(CopyToState cstate)
*** 892,897 
--- 903,915 
  		CopySendEndOfRow(cstate);
  	}
  
+ 	/* if a JSON has been requested send the closing bracket */
+ 	if (cstate->opts.json_mode)
+ 	{
+ 		CopySendChar(cstate, ']');
+ 		CopySendEndOfRow(cstate);
+ 	}
+ 
  	MemoryContextDelete(cstate->rowcontext);
  
  	if (fe_copy)
*** DoCopyTo(CopyToState cstate)
*** 906,916 
  static void
  CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot)
  {
- 	bool		need_delim = false;
- 	FmgrInfo   *out_functions = cstate->out_functions;
  	MemoryContext oldcontext;
- 	ListCell   *cur;
- 	char	   *string;
  
  	MemoryContextReset(cstate->rowcontext);
  	oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
--- 924,930 
*** CopyOneRowTo(CopyToState cstate, TupleTa
*** 921,974 
  		CopySendInt16(cstate, list_length(cstate->attnumlist));
  	}
  
! 	/* Make sure the tuple is fully deconstructed */
! 	slot_getallattrs(slot);
! 
! 	foreach(cur, cstate->attnumlist)
  	{
! 		int			attnum = lfirst_int(cur);
! 		Datum		value = slot->tts_values[attnum - 1];
! 		bool		isnull = slot->tts_isnull[attnum - 1];
  
! 		if (!cstate->opts.binary)
! 		{
! 			if (need_delim)
! CopySendChar(cstate, cstate->opts.delim[0]);
! 			need_delim = true;
! 		}
  
! 		if (isnull)
! 		{
! 			

Re: Fix bogus Asserts in calc_non_nestloop_required_outer

2023-12-01 Thread shihao zhong
I have reviewed the changes and they look fine.

The new status of this patch is: Ready for Committer


Re: A wrong comment about search_indexed_tlist_for_var

2023-12-01 Thread Tom Lane
Alvaro Herrera  writes:
> On 2023-Dec-01, Richard Guo wrote:
>> However, this cross-check will also be performed in non-debug builds
>> ever since commit 867be9c07, which converts this check from Asserts to
>> test-and-elog.  The commit message there also says:
>> Committed separately with the idea that eventually we'll revert
>> this.  It might be awhile though.
>> I wonder if now is the time to revert it, since there have been no
>> related bugs reported for quite a while.

> I don't know anything about this, but maybe it would be better to let
> these elogs there for longer, so that users have time to upgrade and
> test.

Yeah.  It's good that we've not had field reports against 16.0 or 16.1,
but we can't really expect that 16.x has seen widespread adoption yet.
I do think we should revert this eventually, but I'd wait perhaps
another year.

> OTOH keeping the elog there might impact performance.  Would that be
> significant?

Doubt it'd be anything measurable, in comparison to all the other
stuff the planner does.

regards, tom lane




Re: meson: Stop using deprecated way getting path of files

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 12:07 PM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> On the buildfarm page[0], it would be nice if more than just the 
> compiler versions were stated. It would be nice to have all 
> build/runtime dependencies listed.


By and large, we've attempted to address such concerns by extending
the configure script to emit info about versions of things it finds.
So you should look into the configure step's log to see what version
of bison, openssl, etc is in use.


Good point. For some reason that slipped my mind. Off into the weeds 
I go...


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-12-01 Thread Tom Lane
"Tristan Partin"  writes:
> On the buildfarm page[0], it would be nice if more than just the 
> compiler versions were stated. It would be nice to have all 
> build/runtime dependencies listed.

By and large, we've attempted to address such concerns by extending
the configure script to emit info about versions of things it finds.
So you should look into the configure step's log to see what version
of bison, openssl, etc is in use.

> I know the buildfarm seems to be a volunteer thing, so asking more of 
> them seems like a hard ask.

We certainly aren't going to ask owners to maintain such information
manually.  Even if they tried, it'd soon be impossibly out-of-date.
The logging method has the additional advantage that it's accurate
for historical runs as well as whatever the machine has installed
today.

regards, tom lane




Re: pg_upgrade and logical replication

2023-12-01 Thread vignesh C
On Fri, 1 Dec 2023 at 10:57, Peter Smith  wrote:
>
> Here are review comments for patch v21-0001
>
> ==
> src/bin/pg_upgrade/check.c
>
> 1. check_old_cluster_subscription_state
>
> +/*
> + * check_old_cluster_subscription_state()
> + *
> + * Verify that each of the subscriptions has all their corresponding tables 
> in
> + * i (initialize) or r (ready).
> + */
> +static void
> +check_old_cluster_subscription_state(void)
>
> Function comment should also mention it also validates the origin.

Modified

> ~~~
>
> 2.
> In this function there are a couple of errors written to the
> "subs_invalid.txt" file:
>
> + fprintf(script, "replication origin is missing for database:\"%s\"
> subscription:\"%s\"\n",
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1));
>
> and
>
> + fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
> relation:\"%s\" state:\"%s\" not in required state\n",
> + active_db->db_name,
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1),
> + PQgetvalue(res, i, 2),
> + PQgetvalue(res, i, 3));
>
> The format of those messages is not consistent. It could be improved
> in a number of ways to make them more similar. e.g. below.
>
> SUGGESTION #1
> the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
> the table sync state \"%s\" is not allowed for database:\"%s\"
> subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n
>
> SUGGESTION #2
> database:\"%s\" subscription:\"%s\" -- replication origin is missing\n
> database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" --
> upgrade when table sync state is \"%s\" is not supported\n
>
> etc.

Modified based on SUGGESTION#1

> ==
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 3.
> +# Initial setup
> +$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
> +$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
> +$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
> +$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
>
> IMO it is tidier to combine multiple DDLS whenever you can.

Modified

> ~~~
>
> 4.
> +# Create a subscription in enabled state before upgrade
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> regress_pub1"
> +);
> +$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
>
> That publication has an empty set of tables. Should there be some
> comment to explain why it is OK like this?

This test just verifies that enabled subscriptions remain enabled
after the upgrade; we don't need data for this. Data validation
happens with a different subscription. Modified the comments.

> ~~~
>
> 5.
> +# Wait till the table tab_upgraded1 reaches 'ready' state
> +my $synced_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
> +$old_sub->poll_query_until('postgres', $synced_query)
> +  or die "Timed out while waiting for the table to reach ready state";
> +
> +$publisher->safe_psql('postgres',
> + "INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))"
> +);
> +$publisher->wait_for_catchup('regress_sub2');
>
> IMO better without the blank line, so then everything more clearly
> belongs to this same comment.

Modified

> ~~~
>
> 6.
> +# Pre-setup for preparing subscription table in init state. Add tab_upgraded2
> +# to the publication.
> +$publisher->safe_psql('postgres',
> + "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
> +
> +$old_sub->safe_psql('postgres',
> + "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
>
> Ditto. IMO better without the blank line, so then everything more
> clearly belongs to this same comment.

Modified

> ~~~
>
> 7.
> +command_ok(
> + [
> + 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> + '-D', $new_sub->data_dir, '-b', $oldbindir,
> + '-B', $newbindir, '-s', $new_sub->host,
> + '-p', $old_sub->port, '-P', $new_sub->port,
> + $mode
> + ],
> + 'run of pg_upgrade for old instance when the subscription tables are
> in init/ready state'
> +);
>
> Maybe those 'command_ok' args can be formatted neatly (like you've
> done later for the 'command_checks_all').

This is based on the output of pgperltidy. Even if I format it,
pgperltidy reverts the formatting that I have done. I have seen the
same happen with other upgrade commands in a few places, so I am not
making any changes for this.

> ~~~
>
> 8.
> +# --
> +# Check that the data inserted to the publisher when the subscriber
> is down will
> +# be replicated to the new subscriber once the new subscriber is started.
> +# --
>
> 8a.
> SUGGESTION
> ...when the new subscriber is down will be replicated once it is started.
>

Modified

> ~
>
> 8b.
> I thought this main comment should also say something like "Also check
> that the old subscription states and 

Building PostgreSQL with LLVM fails on Solaris 11.4

2023-12-01 Thread Sacha Hottinger
Hi all

Compiling PostgreSQL 13.13 with option --with-llvm fails with Developer Studio 
12.6 as well as with gcc 13.2.0.
I have installed the developer/llvm/clang and developer/llvm/clang-build 
packages (13.0.1).

- It works without the llvm option
- I have also tried it with 16.1; no success either

o With Developer Studio (psql 13.13):

# ./configure CC='/opt/developerstudio12.6/bin/cc -m64 -xarch=native' 
--enable-dtrace DTRACEFLAGS='-64' --with-system-tzdata=/usr/share/lib/zoneinfo 
--with-llvm

# gmake all
...
/opt/developerstudio12.6/bin/cc -m64 -xarch=native -Xa -v -O 
-I../../../src/include -c -o pg_shmem.o pg_shmem.c
gmake[3]: *** No rule to make target 'tas.bc', needed by 'objfiles.txt'.  Stop.
gmake[3]: Leaving directory 
'/opt/cnd/opt24_13.13_gmake_all_llvm/src/backend/port'
gmake[2]: *** [common.mk:39: port-recursive] Error 2
gmake[2]: Leaving directory '/opt/cnd/opt24_13.13_gmake_all_llvm/src/backend'
gmake[1]: *** [Makefile:42: all-backend-recurse] Error 2
gmake[1]: Leaving directory '/opt/cnd/opt24_13.13_gmake_all_llvm/src'
gmake: *** [GNUmakefile:11: all-src-recurse] Error 2


o With gcc (psql 13.13):

#./configure CC='/usr/bin/gcc -m64' 
--with-system-tzdata=/usr/share/lib/zoneinfo --with-llvm

# time gmake all
...
-Wl,--as-needed -Wl,-R'/usr/local/pgsql/lib'  -lLLVM-13
Undefined   first referenced
symbol in file
TTSOpsHeapTuple llvmjit_deform.o
pfree   llvmjit.o
…
MemoryContextAllocZero  llvmjit.o
pkglib_path llvmjit.o
ExecEvalStepOp  llvmjit_expr.o
errhidestmt llvmjit.o
ld: warning: symbol referencing errors
/usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv 
-Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -O2  
-D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS 
-I/usr/include  -I../../../../src/include   -flto=thin -emit-llvm -c -o 
llvmjit_types.bc llvmjit_types.c
gmake[2]: Leaving directory 
'/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/src/backend/jit/llvm'
gmake[1]: Leaving directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/src'
gmake -C config all
gmake[1]: Entering directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/config'
gmake[1]: Nothing to be done for 'all'.
gmake[1]: Leaving directory '/opt/cnd/opt25_13.13_gcc_gmak_all_llvm/config'


Kind regards
Sasha





Re: Remove unnecessary includes of system headers in header files

2023-12-01 Thread Tom Lane
Peter Eisentraut  writes:
> I noticed that some header files included system header files for no 
> apparent reason, so I did some digging and found out that in a few cases 
> the original reason has disappeared.  So I propose the attached patches 
> to remove the unnecessary includes.

Seems generally reasonable.  Have you checked that headerscheck and
cpluspluscheck are happy?

regards, tom lane




Re: Refactoring backend fork+exec code

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 6:10 AM CST, Heikki Linnakangas wrote:

On 30/11/2023 20:44, Tristan Partin wrote:
> Patches 1-3 seem committable as-is.

Thanks for the review! I'm focusing on patches 1-3 now, and will come 
back to the rest after committing 1-3.


There was one test failure with EXEC_BACKEND from patch 2, in 
'test_shm_mq'. In restore_backend_variables() I checked if 'bgw_name' is 
empty to decide if the BackgroundWorker struct is filled in or not, but 
it turns out that 'test_shm_mq' doesn't fill in bgw_name. It probably 
should, I think that's an oversight in 'test_shm_mq', but that's a 
separate issue.


I did some more refactoring of patch 2, to fix that and to improve it in 
general. The BackgroundWorker struct is now passed through the 
fork-related functions similarly to the Port struct. That seems more 
consistent.


Attached is a new version of these patches. For easier review, I put the 
new refactorings, compared to the previous version, into a new commit 0003. 
I will squash that before pushing, but this makes it easier to see what changed.


Barring any new feedback or issues, I will commit these.


My only thought is that perhaps has_bg_worker is a better name than 
has_worker, but I agree that having a flag is better than checking 
bgw_name.


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-12-01 Thread Tristan Partin

On Thu Nov 30, 2023 at 3:46 PM CST, Andrew Dunstan wrote:


On 2023-11-30 Th 16:00, Tristan Partin wrote:
> On Wed Nov 29, 2023 at 1:42 PM CST, Andres Freund wrote:
>> Hi,
>>
>> On 2023-11-29 13:11:23 -0600, Tristan Partin wrote:
>> > What is our limiting factor on bumping the minimum Meson version?
>>
>> Old distro versions, particularly ones where the distro just has an 
>> older
>> python. It's one thing to require installing meson but quite another 
>> to also

>> require building python. There's some other ongoing discussion about
>> establishing the minimum baseline in a somewhat more, uh, formalized 
>> way:
>> https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com 
>>

>
> I'll take a look there. According to Meson, the following versions had 
> Python version bumps:

>
> 0.61.5: 3.6
> 0.56.2: 3.5
> 0.45.1: 3.4
>
> Taking a look at pkgs.org, Debian 10, Ubuntu 20.04, and Oracle Linux 7 
> (a RHEL re-spin), and CentOS 7, all have >= Python 3.6.8. Granted, 
> this isn't the whole picture of what Postgres supports from version 
> 16+. To put things in perspective, Python 3.6 was released on December 
> 23, 2016, which is coming up on 7 years. Python 3.6 reached end of 
> life on the same date in 2021.

>
> Is there a complete list somewhere that talks about what platforms 
> each version of Postgres supports?



You can look at animals in the buildfarm. For meson, only releases 16 and 
up matter.


On the buildfarm page[0], it would be nice if more than just the 
compiler versions were stated. It would be nice to have all 
build/runtime dependencies listed. For instance, it would be interesting 
if there were a JSON document for each build animal, and perhaps a root 
JSON document that was an amalgamation of the individual documents. 
Then I could use a tool like jq to query all the information rather 
easily. As-is, I don't know where to search for package versions for 
some of the archaic operating systems in the farm. Perhaps other people 
have had similar problems in the past. Having a full write-up of every 
build machine would also be good for debugging purposes. If I see 
openssl tests suddenly failing on one machine, then I can just check the 
openssl version, and try to reproduce locally.


I know the buildfarm seems to be a volunteer thing, so asking more of 
them seems like a hard ask. Just wanted to throw my thoughts into the 
void.


[0]: https://buildfarm.postgresql.org/cgi-bin/show_members.pl

--
Tristan Partin
Neon (https://neon.tech)




Re: Synchronizing slots from primary to standby

2023-12-01 Thread Nisha Moond
On Fri, Dec 1, 2023 at 5:40 PM Nisha Moond  wrote:
>
> Review for v41 patch.
>
> 1.
> ==
> src/backend/utils/misc/postgresql.conf.sample
>
> +#enable_syncslot = on # enables slot synchronization on the physical
> standby from the primary
>
> enable_syncslot is disabled by default, so it should be 'off' here.
>
> ~~~
> 2.
> IIUC, the slotsyncworker's connection to the primary is to execute a
> query. It is not meant to be a walsender-type connection, but when the
> primary is queried, the 'backend_type' is set to 'walsender'.
> Snippet from primary db-
>
> datname  |   usename   | application_name | wait_event_type | backend_type
> ----------+-------------+------------------+-----------------+--------------
> postgres | replication | slotsyncworker   | Client          | walsender
>
> Is it okay?
>
> ~~~
> 3.
> As per the current logic, if there are slots on the primary with disabled
> subscriptions, then when a standby is created it replicates these slots
> but can't make them sync-ready until some activity happens on the
> slots.
> So, such slots stay in the 'i' sync-state and get dropped when failover
> happens. Now, if the subscriber tries to enable their existing
> subscription after failover, it gives an error that the slot does not
> exist.

Is this behavior expected? If yes, then is it worth documenting about
disabled subscription slots not being synced?

--
Thanks,
Nisha




Re: Materialized view in Postgres from the variables rather than SQL query results

2023-12-01 Thread Nurul Karim Rafi
Hi David,
Thanks for replying.
I already did that but haven't received anything yet.

On Fri, Dec 1, 2023 at 7:40 PM David G. Johnston 
wrote:

> This mailing list is for discussing the development of patches to the
> PostgreSQL code base.  Please send your request for help to a more
> appropriate list - specifically the -general list.
>
> David J.
>
>
> On Thursday, November 30, 2023, Nurul Karim Rafi 
> wrote:
>
>> I have a stored procedure in Postgres. I have generated some variables in
>> that procedure. These variables are generated inside a loop over a query
>> result. Suppose I have 10 rows from my query; then I generate those
>> variables 10 times.
>>
>> Now I want to create a materialized view where these variables will be
>> the columns and every row will have values of those variables generated in
>> every iteration of that loop.
>>
>> What can be the better approach for this?
>>
>> I can create a temporary table with those variables and insert values in
>> every iteration of that loop but that will not serve my purpose. Because I
>> want to drop my existing table where all the values are available and
>> columns are those variables. My target is to make a materialized view with
>> those variables so that I can get rid of that parent table.
>>
>> Best Regards
>>
>> *Rafi*
>>
>


Re: EXCLUDE COLLATE in CREATE/ALTER TABLE document

2023-12-01 Thread Daniel Verite
shihao zhong wrote:

> Thanks for your comments, a new version is attached.

In this hunk:

@@ -1097,8 +1097,8 @@ WITH ( MODULUS numeric_literal, REM
   method index_method.
   The operators are required to be commutative.
   Each exclude_element
-  can optionally specify an operator class and/or ordering options;
-  these are described fully under
+  can optionally specify any of the following: a collation, a
+  operator class, or ordering options; these are described fully under
   .
  
 
"a" should be "an" as it's followed by "operator class".

Also the use of "and/or" in the previous version conveys the fact
that operator class and ordering options are not mutually
exclusive. But when using "any of the following" in the new text,
doesn't it lose that meaning?

In case it does, I would suggest the attached diff.

Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/alter_table.sgml 
b/doc/src/sgml/ref/alter_table.sgml
index 2c4138e4e9..43f5f0873e 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -135,7 +135,7 @@ WITH ( MODULUS numeric_literal, REM
 
 exclude_element in an 
EXCLUDE constraint is:
 
-{ column_name | ( expression ) } [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST 
} ]
+{ column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST 
} ]
 
 referential_action in a 
FOREIGN KEY/REFERENCES constraint 
is:
 
diff --git a/doc/src/sgml/ref/create_table.sgml 
b/doc/src/sgml/ref/create_table.sgml
index e04a0692c4..277d292aac 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -105,7 +105,7 @@ WITH ( MODULUS numeric_literal, REM
 
 exclude_element in an 
EXCLUDE constraint is:
 
-{ column_name | ( expression ) } [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST 
} ]
+{ column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST 
} ]
 
 referential_action in a 
FOREIGN KEY/REFERENCES constraint 
is:
 
@@ -1097,9 +1097,8 @@ WITH ( MODULUS numeric_literal, REM
   method index_method.
   The operators are required to be commutative.
   Each exclude_element
-  can optionally specify an operator class and/or ordering options;
-  these are described fully under
-  .
+  can optionally specify a collation, an operator class, and ordering 
options;
+  these are described fully under .
  
 
  


Re: Various small doc improvements; plpgsql, schemas, permissions, oidvector

2023-12-01 Thread Karl O. Pinc
Hello,

Thank you all for your help.  I won't be able to submit
new patches based on reviews for 2 weeks.

On Thu, 30 Nov 2023 16:02:28 +0530
Shubham Khanna  wrote:

> -v7-0012-Explain-role-management.patch
> 
> +   The managment of most database objects is by way of granting some
> role
> 
> Here 'managment' should be replaced with 'management'.



Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein




Re: Refactoring backend fork+exec code

2023-12-01 Thread Alexander Lakhin

Hello Heikki,

01.12.2023 15:10, Heikki Linnakangas wrote:
Attached is a new version of these patches. For easier review, I put the new refactorings, compared to the previous 
version, into a new commit 0003. I will squash that before pushing, but this makes it easier to see what changed.


Barring any new feedback or issues, I will commit these.



Maybe you could look at issues with running `make check` under Valgrind
when the server is built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND":
# +++ regress check in src/test/regress +++
# initializing database system by copying initdb template
# postmaster failed, examine ".../src/test/regress/log/postmaster.log" for the 
reason
Bail out!make[1]: ***

...
2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG:  listening on Unix socket 
"/tmp/pg_regress-pPFNk0/.s.PGSQL.55312"
==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised 
byte(s)
==00:00:00:01.692 1259396==    at 0x5245A37: write (write.c:26)
==00:00:00:01.692 1259396==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 
(fileops.c:1180)
==00:00:00:01.692 1259396==    by 0x51BC84F: new_do_write (fileops.c:448)
==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_new_file_xsputn 
(fileops.c:1254)
==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 
(fileops.c:1196)
==00:00:00:01.692 1259396==    by 0x51B1056: fwrite (iofwrite.c:39)
==00:00:00:01.692 1259396==    by 0x552E21: internal_forkexec 
(postmaster.c:4518)
==00:00:00:01.692 1259396==    by 0x5546A1: postmaster_forkexec 
(postmaster.c:)
==00:00:00:01.692 1259396==    by 0x55471C: StartChildProcess 
(postmaster.c:5275)
==00:00:00:01.692 1259396==    by 0x557B61: PostmasterMain (postmaster.c:1454)
==00:00:00:01.692 1259396==    by 0x472136: main (main.c:198)
==00:00:00:01.692 1259396==  Address 0x1ffeffdc11 is on thread 1's stack
==00:00:00:01.692 1259396==  in frame #4, created by internal_forkexec 
(postmaster.c:4482)
==00:00:00:01.692 1259396==

With memset(param, 0, sizeof(*param)); added at the beginning of
save_backend_variables(), the server starts successfully, but then
`make check` fails with another Valgrind error.

Best regards,
Alexander




Re: Materialized view in Postgres from the variables rather than SQL query results

2023-12-01 Thread David G. Johnston
This mailing list is for discussing the development of patches to the
PostgreSQL code base.  Please send your request for help to a more
appropriate list - specifically the -general list.

David J.


On Thursday, November 30, 2023, Nurul Karim Rafi 
wrote:

> I have a stored procedure in Postgres. I have generated some variables in
> that procedure. These variables are generated inside a loop over a query
> result. Suppose I have 10 rows from my query; then I generate those
> variables 10 times.
>
> Now I want to create a materialized view where these variables will be the
> columns and every row will have values of those variables generated in
> every iteration of that loop.
>
> What can be the better approach for this?
>
> I can create a temporary table with those variables and insert values in
> every iteration of that loop but that will not serve my purpose. Because I
> want to drop my existing table where all the values are available and
> columns are those variables. My target is to make a materialized view with
> those variables so that I can get rid of that parent table.
>
> Best Regards
>
> *Rafi*
>


Re: Postgres Partitions Limitations (5.11.2.3)

2023-12-01 Thread Ashutosh Bapat
On Thu, Nov 30, 2023 at 10:29 PM Laurenz Albe  wrote:
>
> On Thu, 2023-11-30 at 19:22 +0530, Ashutosh Bapat wrote:
> > May be attach the patch to hackers thread (this) as well?
>
> If you want, sure.  I thought it was good enough if the thread
> is accessible via the commitfest app.

The addition is long enough that it deserved to be outside of parentheses.

I think it's worth mentioning the exception but in a way that avoids
repeating what's mentioned in the last paragraph of just the previous
section. I don't have brilliant ideas about how to rephrase it.

Maybe "Using ONLY to add or drop a constraint, other than PRIMARY and
UNIQUE, on only the partitioned table is supported as long as there
are no partitions. ...".

-- 
Best Wishes,
Ashutosh Bapat




Re: pgsql: Clean up role created in new subscription test.

2023-12-01 Thread Daniel Gustafsson
> On 1 Dec 2023, at 13:19, Alvaro Herrera  wrote:
> 
> Isn't it simpler to use DROP OWNED BY in 0001?

I suppose it is. I kind of like the explicit drops, but we do use OWNED BY quite
a lot in the regression tests, so I changed to that in the attached v5.

--
Daniel Gustafsson



v5-0001-Drop-global-objects-after-completed-test.patch
Description: Binary data


v5-0002-pg_regress-Detect-global-objects-left-over-after-.patch
Description: Binary data


Re: A wrong comment about search_indexed_tlist_for_var

2023-12-01 Thread Alvaro Herrera
On 2023-Dec-01, Richard Guo wrote:

> However, this cross-check will also be performed in non-debug builds
> ever since commit 867be9c07, which converts this check from Asserts to
> test-and-elog.  The commit message there also says:
> 
> Committed separately with the idea that eventually we'll revert
> this.  It might be awhile though.
> 
> I wonder if now is the time to revert it, since there have been no
> related bugs reported for quite a while.

I don't know anything about this, but maybe it would be better to let
these elogs there for longer, so that users have time to upgrade and
test.  This new code has proven quite tricky, and if I understand
correctly, if we do run some query with wrong varnullingrels in
production code without elog and where Assert() does nothing, that might
silently lead to wrong results.

OTOH keeping the elog there might impact performance.  Would that be
significant?

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Debido a que la velocidad de la luz es mucho mayor que la del sonido,
 algunas personas nos parecen brillantes un minuto antes
 de escuchar las pelotudeces que dicen." (Roberto Fontanarrosa)




Re: pgsql: Clean up role created in new subscription test.

2023-12-01 Thread Alvaro Herrera
Isn't it simpler to use DROP OWNED BY in 0001?

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Hay que recordar que la existencia en el cosmos, y particularmente la
elaboración de civilizaciones dentro de él no son, por desgracia,
nada idílicas" (Ijon Tichy)




Re: Synchronizing slots from primary to standby

2023-12-01 Thread Nisha Moond
Review for v41 patch.

1.
==
src/backend/utils/misc/postgresql.conf.sample

+#enable_syncslot = on # enables slot synchronization on the physical
standby from the primary

enable_syncslot is disabled by default, so it should be 'off' here.

~~~
2.
IIUC, the slotsyncworker's connection to the primary is to execute a
query. It is not meant to be a walsender-type connection, but when the
primary is queried, the 'backend_type' is set to 'walsender'.
Snippet from primary db-

datname  |   usename   | application_name | wait_event_type | backend_type
----------+-------------+------------------+-----------------+--------------
postgres | replication | slotsyncworker   | Client          | walsender

Is it okay?

~~~
3.
As per the current logic, if there are slots on the primary with disabled
subscriptions, then when a standby is created it replicates these slots
but can't make them sync-ready until some activity happens on the
slots.
So, such slots stay in the 'i' sync-state and get dropped when failover
happens. Now, if the subscriber tries to enable their existing
subscription after failover, it gives an error that the slot does not
exist.

~~~
4. primary_slot_name GUC value test:

When the standby is started with a non-existent primary_slot_name, the
wal-receiver gives an error but the slot-sync worker does not raise
any error/warning. It is a no-op, though, as it has a check 'if
(XLogRecPtrIsInvalid(WalRcv->latestWalEnd)) do nothing'. Is this
okay, or should the slot-sync worker also raise an error and exit?

In another case, when the standby is started with a valid primary_slot_name
but it is changed to an invalid value at runtime, the walreceiver
starts giving errors but the slot-sync worker keeps running. In this
case, unlike the previous one, it does not even go into no-op mode (as
it sees a valid WalRcv->latestWalEnd from the earlier run) and keeps
pinging the primary repeatedly for slots. Should it error out here,
or at least be a no-op until a valid primary_slot_name is given?

--
Thanks,
Nisha




Re: Refactoring backend fork+exec code

2023-12-01 Thread Heikki Linnakangas

On 30/11/2023 20:44, Tristan Partin wrote:

Patches 1-3 seem committable as-is.


Thanks for the review! I'm focusing on patches 1-3 now, and will come 
back to the rest after committing 1-3.


There was one test failure with EXEC_BACKEND from patch 2, in 
'test_shm_mq'. In restore_backend_variables() I checked if 'bgw_name' is 
empty to decide if the BackgroundWorker struct is filled in or not, but 
it turns out that 'test_shm_mq' doesn't fill in bgw_name. It probably 
should, I think that's an oversight in 'test_shm_mq', but that's a 
separate issue.


I did some more refactoring of patch 2, to fix that and to improve it in 
general. The BackgroundWorker struct is now passed through the 
fork-related functions similarly to the Port struct. That seems more 
consistent.


Attached is a new version of these patches. For easier review, I put the 
new refactorings, compared to the previous version, into a new commit 0003. 
I will squash that before pushing, but this makes it easier to see what changed.


Barring any new feedback or issues, I will commit these.

--
Heikki Linnakangas
Neon (https://neon.tech)
From 6dd44d7057e535eb441c9ab8fda1bbdd8079f2fb Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Thu, 30 Nov 2023 00:01:25 +0200
Subject: [PATCH v4 1/4] Refactor CreateSharedMemoryAndSemaphores

For clarity, have separate functions for *creating* the shared memory
and semaphores at postmaster or single-user backend startup, and
for *attaching* to existing shared memory structures in EXEC_BACKEND
case. CreateSharedMemoryAndSemaphores() is now called only at
postmaster startup, and a new AttachSharedMemoryStructs() function is
called at backend startup in EXEC_BACKEND mode.

Reviewed-by: Tristan Partin, Andres Freund
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce...@iki.fi
---
 src/backend/postmaster/postmaster.c   |  20 +--
 src/backend/replication/walreceiver.c |   3 +-
 src/backend/storage/ipc/ipci.c| 178 +++---
 src/backend/storage/lmgr/proc.c   |   2 +-
 src/include/storage/ipc.h |   3 +
 5 files changed, 117 insertions(+), 89 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7a5cd06c5c9..b03e88e6702 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4909,11 +4909,11 @@ SubPostmasterMain(int argc, char *argv[])
 		/* Restore basic shared memory pointers */
 		InitShmemAccess(UsedShmemSegAddr);
 
-		/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
+		/* Need a PGPROC to run AttachSharedMemoryStructs */
 		InitProcess();
 
 		/* Attach process to shared data structures */
-		CreateSharedMemoryAndSemaphores();
+		AttachSharedMemoryStructs();
 
 		/* And run the backend */
 		BackendRun();		/* does not return */
@@ -4927,11 +4927,11 @@ SubPostmasterMain(int argc, char *argv[])
 		/* Restore basic shared memory pointers */
 		InitShmemAccess(UsedShmemSegAddr);
 
-		/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
+		/* Need a PGPROC to run AttachSharedMemoryStructs */
 		InitAuxiliaryProcess();
 
 		/* Attach process to shared data structures */
-		CreateSharedMemoryAndSemaphores();
+		AttachSharedMemoryStructs();
 
 		auxtype = atoi(argv[3]);
 		AuxiliaryProcessMain(auxtype);	/* does not return */
@@ -4941,11 +4941,11 @@ SubPostmasterMain(int argc, char *argv[])
 		/* Restore basic shared memory pointers */
 		InitShmemAccess(UsedShmemSegAddr);
 
-		/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
+		/* Need a PGPROC to run AttachSharedMemoryStructs */
 		InitProcess();
 
 		/* Attach process to shared data structures */
-		CreateSharedMemoryAndSemaphores();
+		AttachSharedMemoryStructs();
 
 		AutoVacLauncherMain(argc - 2, argv + 2);	/* does not return */
 	}
@@ -4954,11 +4954,11 @@ SubPostmasterMain(int argc, char *argv[])
 		/* Restore basic shared memory pointers */
 		InitShmemAccess(UsedShmemSegAddr);
 
-		/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
+		/* Need a PGPROC to run AttachSharedMemoryStructs */
 		InitProcess();
 
 		/* Attach process to shared data structures */
-		CreateSharedMemoryAndSemaphores();
+		AttachSharedMemoryStructs();
 
 		AutoVacWorkerMain(argc - 2, argv + 2);	/* does not return */
 	}
@@ -4972,11 +4972,11 @@ SubPostmasterMain(int argc, char *argv[])
 		/* Restore basic shared memory pointers */
 		InitShmemAccess(UsedShmemSegAddr);
 
-		/* Need a PGPROC to run CreateSharedMemoryAndSemaphores */
+		/* Need a PGPROC to run AttachSharedMemoryStructs */
 		InitProcess();
 
 		/* Attach process to shared data structures */
-		CreateSharedMemoryAndSemaphores();
+		AttachSharedMemoryStructs();
 
 		/* Fetch MyBgworkerEntry from shared memory */
 		shmem_slot = atoi(argv[1] + 15);
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 2398167f495..26ded928a71 100644
--- a/src/backend/replication/walreceiver.c
+++ 

Re: pgsql: Clean up role created in new subscription test.

2023-12-01 Thread Daniel Gustafsson
> On 1 Dec 2023, at 12:37, Heikki Linnakangas  wrote:
> 
> On 01/12/2023 13:22, Daniel Gustafsson wrote:
>>> On 8 Nov 2023, at 13:32, Alvaro Herrera  wrote:
>>> I suppose you're just thinking of using PQexec() or whatever, run one
>>> query with sufficient ORDER BY, save the result, and at the end of the
>>> test run just run the same query and compare that they are cell-by-cell
>>> identical?  This sounds a lot simpler than the patch you posted.
>> I found some spare cycles for this and came up with the attached.  The idea was
>> to keep it in-line with how pg_regress already today manipulates and traverses
>> _stringlists for various things.  With the addition of the 0001 patch to clean
>> up global objects left in test_pg_dump it passes check-world.
> 
> Do we want to impose this policy on all extensions too?

I don't think it would be bad, and as of today the policy holds for all of
check-world apart from this one test module.

>> +/*
>> + * Store the global objects before the test starts such that we can check
>> + * for any objects left behind after the tests finish.
>> + */
>> +query_to_stringlist("postgres",
>> +"(SELECT rolname AS obj FROM pg_catalog.pg_roles ORDER BY 1) "
>> +"UNION ALL "
>> +"(SELECT spcname AS obj FROM pg_catalog.pg_tablespace ORDER BY 1) "
>> +"UNION ALL "
>> +"(SELECT subname AS obj FROM pg_catalog.pg_subscription ORDER BY 1)",
>> +_before);
>> +
> 
> Strictly speaking, the order of this query isn't guaranteed to be stable, 
> although in practice it probably is. 

Of course, will fix.  I originally had three separate query_to_stringlist calls
and had a brainfade when combining.  It seemed like pointless use of cycles
when we can get everything in one connection.

> Is it OK to leave behind extra databases?

The test suite for pg_upgrade can make use of left behind databases to seed the
old cluster, so I think that's allowed by design.

--
Daniel Gustafsson





Re: pgsql: Clean up role created in new subscription test.

2023-12-01 Thread Heikki Linnakangas

On 01/12/2023 13:22, Daniel Gustafsson wrote:

On 8 Nov 2023, at 13:32, Alvaro Herrera  wrote:



I suppose you're just thinking of using PQexec() or whatever, run one
query with sufficient ORDER BY, save the result, and at the end of the
test run just run the same query and compare that they are cell-by-cell
identical?  This sounds a lot simpler than the patch you posted.


I found some spare cycles for this and came up with the attached.  The idea was
to keep it in-line with how pg_regress already today manipulates and traverses
_stringlists for various things.  With the addition of the 0001 patch to clean
up global objects left in test_pg_dump it passes check-world.


Do we want to impose this policy on all extensions too?


+   /*
+    * Store the global objects before the test starts such that we can check
+    * for any objects left behind after the tests finish.
+    */
+   query_to_stringlist("postgres",
+                       "(SELECT rolname AS obj FROM pg_catalog.pg_roles ORDER BY 1) "
+                       "UNION ALL "
+                       "(SELECT spcname AS obj FROM pg_catalog.pg_tablespace ORDER BY 1) "
+                       "UNION ALL "
+                       "(SELECT subname AS obj FROM pg_catalog.pg_subscription ORDER BY 1)",
+                       _before);
+


Strictly speaking, the order of this query isn't guaranteed to be 
stable, although in practice it probably is. Maybe something like this:


(SELECT 'role', rolname AS obj FROM pg_catalog.pg_roles
UNION ALL
SELECT 'tablespace', spcname AS obj FROM pg_catalog.pg_tablespace
UNION ALL
SELECT 'subscription', subname AS obj FROM pg_catalog.pg_subscription
) ORDER BY 1, 2

Is it OK to leave behind extra databases?

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: pgsql: Clean up role created in new subscription test.

2023-12-01 Thread Daniel Gustafsson
> On 8 Nov 2023, at 13:32, Alvaro Herrera  wrote:

> I suppose you're just thinking of using PQexec() or whatever, run one
> query with sufficient ORDER BY, save the result, and at the end of the
> test run just run the same query and compare that they are cell-by-cell
> identical?  This sounds a lot simpler than the patch you posted.

I found some spare cycles for this and came up with the attached.  The idea was
to keep it in-line with how pg_regress already today manipulates and traverses
_stringlists for various things.  With the addition of the 0001 patch to clean
up global objects left in test_pg_dump it passes check-world.

--
Daniel Gustafsson



v4-0002-pg_regress-Detect-global-objects-left-over-after-.patch
Description: Binary data


v4-0001-Drop-global-objects-after-completed-test.patch
Description: Binary data


Re: Fix a wrong comment in setrefs.c

2023-12-01 Thread Richard Guo
On Tue, Nov 28, 2023 at 8:19 PM Heikki Linnakangas  wrote:

> On 03/11/2023 08:10, Richard Guo wrote:
> > On Tue, Sep 26, 2023 at 9:51 AM Richard Guo  > > wrote:
> > On Tue, Sep 26, 2023 at 5:45 AM Tom Lane  > > wrote:
> >
> > I'm inclined to write the comment more like "Usually the equal()
> > check is redundant, but in setop plans it may not be, since
> > prepunion.c assigns ressortgroupref equal to the column resno
> > without regard to whether that matches the topmost level's
> > sortgrouprefs and without regard to whether any implicit
> coercions
> > are added in the setop tree.  We might have to clean that up
> > someday;
> > but for now, just ignore any false matches."
> >
> > +1.  It explains the situation much more clearly and accurately.
> >
> > To make it easier to review, I've updated the patch to be so.
>
> Committed, thanks!


Thanks for pushing!

Thanks
Richard


Re: Synchronizing slots from primary to standby

2023-12-01 Thread Drouvot, Bertrand

Hi,

On 12/1/23 12:06 PM, Amit Kapila wrote:

On Wed, Nov 29, 2023 at 3:24 PM Drouvot, Bertrand
 wrote:

I think I'm fine with documenting the fact that the user should not change the 
failover
value. But if he does change it (because at the end nothing prevents it to do 
so) then
I think the meaning of subfailoverstate should still make sense.



How user can change the slot's failover property? Do we provide any
command for it?


It's doable, using a replication connection:

"
$ psql replication=database
psql (17devel)
Type "help" for help.

postgres=# ALTER_REPLICATION_SLOT logical_slot6 (FAILOVER false);
ALTER_REPLICATION_SLOT
"

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




A wrong comment about search_indexed_tlist_for_var

2023-12-01 Thread Richard Guo
The comment of search_indexed_tlist_for_var says:

 * In debugging builds, we cross-check the varnullingrels of the subplan
 * output Var based on nrm_match.

However, this cross-check has also been performed in non-debug builds
ever since commit 867be9c07, which converted this check from Asserts to
test-and-elog.  The commit message there also says:

Committed separately with the idea that eventually we'll revert
this.  It might be awhile though.

I wonder if now is the time to revert it, since there have been no
related bugs reported for quite a while.  Otherwise I think we may need
to revise the comment of search_indexed_tlist_for_var to clarify that
the cross-check is not limited to debugging builds.

Please note that if we intend to revert commit 867be9c07, we need to
revert 69c430626 too.

Thanks
Richard


RE: logical decoding and replication of sequences, take 2

2023-12-01 Thread Hayato Kuroda (Fujitsu)
Dear Tomas,

> I did some micro-benchmarking today, trying to identify cases where this
> would cause unexpected problems, either due to having to maintain all
> the relfilenodes, or due to having to do hash lookups for every sequence
> change. But I think it's fine, mostly ...
>

I also did some performance tests (especially case 3). First of all, there are
some differences from yours.

1. Patch 0002 was reverted because it has an issue. So this test checks whether
   the refactoring around ReorderBufferSequenceIsTransactional is really needed.
2. Per comments from Amit, I also measured the abort case. In this case,
   alter_sequence() is called but the transaction is aborted.
3. I measured while varying the number of clients {8, 16, 32, 64, 128}. In all
   cases, clients executed 1000 transactions. The performance machine has 128
   cores, so the result for 128 clients might be saturated.
4. A short sleep (0.1s) was added in alter_sequence(), between "alter sequence"
   and nextval(), because while testing I found that the transaction was too
   short to execute in parallel. I think this is reasonable because
   ReorderBufferSequenceIsTransactional() might perform worse as parallelism
   increases.

I attached perf to one backend process and executed pg_logical_slot_get_changes().
The attached txt file shows which functions occupied CPU time, especially under
pg_logical_slot_get_changes_guts() and ReorderBufferSequenceIsTransactional().
Here are my observations about them.

* In case of commit, as you said, SnapBuildCommitTxn() seems dominant for the
  8-64 client cases.
* For the (commit, 128 clients) case, however, ReorderBufferRestoreChanges()
  wastes a lot of time. I think this is because the changes exceed
  logical_decoding_work_mem, so we do not have to analyze this further.
* In case of abort, the CPU time used by ReorderBufferSequenceIsTransactional()
  grows linearly. This means that we need to think of some solution to avoid
  the overhead of ReorderBufferSequenceIsTransactional().

```
8 clients  3.73% occupied time
16 7.26%
32 15.82%
64 29.14%
128 46.27%
```

* In case of abort, I also checked the CPU time used by
  ReorderBufferAddRelFileLocator(), but it does not seem to depend much on the
  number of clients.

```
8 clients 3.66% occupied time
16 6.94%
32 4.65%
64 5.39%
128 3.06%
```

As a next step, I've planned to run the case which uses the setval() function,
because it generates more WAL than a plain nextval().
What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


# commit case

## 128 clients

-   97.77% 0.00%  postgres  postgres[.] 
pg_logical_slot_get_changes_guts
   - 97.77% pg_logical_slot_get_changes_guts
  - 97.77% LogicalDecodingProcessRecord
 - 97.71% xact_decode
- 54.95% ReorderBufferProcessTXN
   - 32.64% ReorderBufferRestoreChanges
  + 23.37% FileRead
4.58% __memcpy_ssse3_back
  + 4.14% MemoryContextAllocZero
   + 7.73% ReorderBufferCopySnap.isra.21
   + 5.94% ReorderBufferSerializeTXN
   + 5.78% ReorderBufferCleanupTXN
   + 2.04% RelationIdGetRelation
- 42.75% SnapBuildCommitTxn
   - 34.56% ReorderBufferQueueChange
  - 34.36% ReorderBufferSerializeTXN
   27.64% __write_nocancel
   4.42% __memcpy_ssse3_back
 + 1.52% OpenTransientFilePerm
   + 7.97% SnapBuildBuildSnapshot

 0.04% 0.00%  postgres  postgres  [.] 
ReorderBufferSequenceIsTransactional

## 64 clients

-   86.49% 0.04%  postgres  postgres[.] 
pg_logical_slot_get_changes_guts
   - 86.45% pg_logical_slot_get_changes_guts
  - 86.37% LogicalDecodingProcessRecord
 - 84.79% xact_decode
- 51.05% SnapBuildCommitTxn
   + 49.77% SnapBuildBuildSnapshot
 0.53% ReorderBufferXidHasBaseSnapshot
- 33.45% ReorderBufferProcessTXN
   + 21.37% ReorderBufferCopySnap.isra.21
   + 4.86% RelationIdGetRelation
   + 2.31% RelidByRelfilenumber
   + 0.84% AbortCurrentTransaction
 0.81% ReorderBufferCleanupTXN
 - 1.10% seq_decode
  0.65% ReorderBufferSequenceIsTransactional

-1.04% 0.09%  postgres  postgres  [.] 
ReorderBufferSequenceIsTransactional

## 32 clients

-   82.20% 0.13%  postgres  postgres [.] 
pg_logical_slot_get_changes_guts
   - 82.09% pg_logical_slot_get_changes_guts
  - 81.86% LogicalDecodingProcessRecord
 - 80.01% xact_decode
- 49.38% SnapBuildCommitTxn
   + 48.10% SnapBuildBuildSnapshot
 0.50% ReorderBufferXidHasBaseSnapshot
- 29.98% ReorderBufferProcessTXN
   + 10.52% ReorderBufferCopySnap.isra.21
   + 7.64% RelationIdGetRelation
   + 4.11% RelidByRelfilenumber
   + 1.70% 

Re: Synchronizing slots from primary to standby

2023-12-01 Thread Amit Kapila
On Wed, Nov 29, 2023 at 3:24 PM Drouvot, Bertrand
 wrote:
>
> On 11/29/23 6:58 AM, Zhijie Hou (Fujitsu) wrote:
> > On Tuesday, November 28, 2023 8:07 PM Drouvot, Bertrand 
> >  wrote:
> >
> > Hi,
> >
> >> On 11/27/23 9:57 AM, Zhijie Hou (Fujitsu) wrote:
> >>> On Monday, November 27, 2023 4:51 PM shveta malik
> >>  wrote:
> >>>
> >>> Here is the updated version(v39_2) which include all the changes made in
> >> 0002.
> >>> Please use for review, and sorry for the confusion.
> >>>
> >>
> >> Thanks!
> >>
> >> As far v39_2-0001:
> >>
> >> "
> >>   Altering the failover option of the subscription is currently not
> >>   permitted. However, this restriction may be lifted in future 
> >> versions.
> >> "
> >>
> >> Should we mention that we can alter the related replication slot?
> >
> > Will add.
> >
> >>
> >> + 
> >> +  The implementation of failover requires that replication
> >> +  has successfully finished the initial table synchronization
> >> +  phase. So even when failover is enabled for a
> >> +  subscription, the internal failover state remains
> >> +  temporarily pending until the initialization
> >> phase
> >> +  completes. See column
> >> subfailoverstate
> >> +  of  >> linkend="catalog-pg-subscription">pg_subscription >> me>
> >> +  to know the actual failover state.
> >> + 
> >>
> >> I think we have a corner case here. If one alter the replication slot on 
> >> the
> >> primary then "subfailoverstate" is not updated accordingly on the 
> >> subscriber.
> >> Given the 2 remarks above would that make sense to prevent altering a
> >> replication slot associated to a subscription?
> >
> > Thanks for the review!
> >
> > I think we could not distinguish the user created logical slot or subscriber
> > created slot as there is no related info in slot's data.
>
> Yeah that would need extra work.
>
> > And user could change
> > the slot on subscription by "alter sub set (slot_name)", so maintaining 
> > this info
> > would need some efforts.
> >
>
> Yes.
>
> > Besides, I think this case overlaps the previous discussed "alter sub set
> > (slot_name)" issue[1]. Both the cases are because the slot's failover is
> > different from the subscription's failover setting.
>
> Yeah agree.
>
> > I think we could handle
> > them similarly that user need to take care of not changing the failover to
> > wrong value. Or do you prefer another approach that mentioned in that 
> > thread[1]
> > ? (always alter the slot at the startup of apply worker).
> >
>
> I think I'm fine with documenting the fact that the user should not change 
> the failover
> value. But if he does change it (because at the end nothing prevents it to do 
> so) then
> I think the meaning of subfailoverstate should still make sense.
>

How user can change the slot's failover property? Do we provide any
command for it?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2023-12-01 Thread Drouvot, Bertrand

Hi,

On 12/1/23 4:19 AM, shveta malik wrote:

On Thu, Nov 30, 2023 at 5:37 PM Ajin Cherian  wrote:



1. In my opinion, the second message "aborting the wait...moving to
the next slot" does not hold much value. There might not even be a
"next slot", there might be just one slot. I think the first LOG is
enough to indicate that the sync-slot is waiting as it repeats this
log till the slot catches up. I know these messages hold great value
for debugging but in production,  "waiting..", "aborting the wait.."
might not be as helpful, maybe change it to debug?

2023-11-30 05:13:49.811 EST [6115] LOG:  waiting for remote slot
"sub1" LSN (0/3047A90) and catalog xmin (745) to pass local slot LSN
(0/3047AC8) and catalog xmin (745)
2023-11-30 05:13:57.909 EST [6115] LOG:  aborting the wait for remote
slot "sub1" and moving to the next slot, will attempt creating it
again
2023-11-30 05:14:07.921 EST [6115] LOG:  waiting for remote slot
"sub1" LSN (0/3047A90) and catalog xmin (745) to pass local slot LSN
(0/3047AC8) and catalog xmin (745)



Sure, the message can be trimmed down. But I am not very sure if we
should convert it to DEBUG. It might be useful to know what exactly is
happening with this slot through the log file. Curious to know what
others think here?



I think LOG is fine for the "waiting" one but I'd be tempted to put part of the
message in errdetail().

I think we could get rid of the "aborting" message (or move it to DEBUG). I mean
if one does not see the "newly locally created slot" message then I think it's
enough to guess the wait has been aborted or that it is still waiting.

But that's probably just a matter of taste.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




RE: Remove unnecessary includes of system headers in header files

2023-12-01 Thread shubham.kha...@fujitsu.com
Hi Peter,

I have reviewed the patches and also ran 'make check-world'. It is 
working fine; all test cases passed successfully.

Thanks and Regards,
Shubham Khanna.

-Original Message-
From: Peter Eisentraut  
Sent: Friday, December 1, 2023 1:24 PM
To: pgsql-hackers 
Subject: Remove unnecessary includes of system headers in header files

I noticed that some header files included system header files for no apparent 
reason, so I did some digging and found out that in a few cases the original 
reason has disappeared.  So I propose the attached patches to remove the 
unnecessary includes.


Re: Synchronizing slots from primary to standby

2023-12-01 Thread shveta malik
On Fri, Dec 1, 2023 at 3:43 PM Drouvot, Bertrand
 wrote:
>
> Hi,
>
> On 11/30/23 1:06 PM, Ajin Cherian wrote:
> > On Wed, Nov 29, 2023 at 8:17 PM Zhijie Hou (Fujitsu)
> >
> > 3. If creation of a slot on the standby fails for one slot because a
> > slot of the same name exists, then thereafter no new sync slots are
> > created on standby. Is this expected? I do see that previously created
> > slots are kept up to date, just that no new slots are created after
> > that.
>
> Yes this is the expected behavior as per discussion in [1].
> Does this behavior make sense to you?
>
Not completely. The chances of slots getting synced in this case seem
order-based. Every time the worker restarts after the error (assuming
the user has not taken corrective action yet), it will successfully
sync the slots prior to the problematic one, while leaving the ones
after it un-synced.

I need a little more clarity on how the user is supposed to find out
that the slot-sync worker (or any background worker, for that matter)
has errored out. Is it only from the log file, or are there other
mechanisms for this? I mean, do ERRORs have a better chance of catching
the user's attention than WARNINGs in the context of a background
worker? I feel we can give this a second thought and see if it is more
appropriate to keep syncing the rest of the slots and skip the
duplicate-name one.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2023-12-01 Thread Drouvot, Bertrand

Hi,

On 11/30/23 1:06 PM, Ajin Cherian wrote:

On Wed, Nov 29, 2023 at 8:17 PM Zhijie Hou (Fujitsu)

3. If creation of a slot on the standby fails for one slot because a
slot of the same name exists, then thereafter no new sync slots are
created on standby. Is this expected? I do see that previously created
slots are kept up to date, just that no new slots are created after
that.


Yes this is the expected behavior as per discussion in [1].
Does this behavior make sense to you?

[1]: 
https://www.postgresql.org/message-id/dd9dbbaf-ca77-423a-8d62-bfc814626b47%40gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Is this a problem in GenericXLogFinish()?

2023-12-01 Thread Amit Kapila
On Thu, Nov 30, 2023 at 4:30 PM Alexander Lakhin  wrote:
>
> 30.11.2023 10:28, Hayato Kuroda (Fujitsu) wrote:
> > Again, good catch. Here is my analysis and fix patch.
> > I think it is sufficient to add an initialization for writebuf.
>
> I agree with the change.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: pg_upgrade and logical replication

2023-12-01 Thread Amit Kapila
On Fri, Dec 1, 2023 at 10:57 AM Peter Smith  wrote:
>
> Here are review comments for patch v21-0001
>
>
> 2.
> In this function there are a couple of errors written to the
> "subs_invalid.txt" file:
>
> + fprintf(script, "replication origin is missing for database:\"%s\"
> subscription:\"%s\"\n",
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1));
>
> and
>
> + fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
> relation:\"%s\" state:\"%s\" not in required state\n",
> + active_db->db_name,
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1),
> + PQgetvalue(res, i, 2),
> + PQgetvalue(res, i, 3));
>
> The format of those messages is not consistent. It could be improved
> in a number of ways to make them more similar. e.g. below.
>
> SUGGESTION #1
> the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
> the table sync state \"%s\" is not allowed for database:\"%s\"
> subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n
>

+1. Shall we keep 'the' as 'The' in the message? A few other messages in
the same file start with a capital letter.

>
> 4.
> +# Create a subscription in enabled state before upgrade
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> regress_pub1"
> +);
> +$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
>
> That publication has an empty set of tables. Should there be some
> comment to explain why it is OK like this?
>

I think we can add a comment to state the intention of overall test
which this is part of.

>
> 10.
> +# The subscription's running status should be preserved
> +$result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription ORDER BY subname");
> +is($result, qq(t
> +f),
> + "check that the subscription's running status are preserved"
> +);
>
> I felt this was a bit too tricky. It might be more readable to do 2
> separate SELECTs with explicit subnames. Alternatively, leave the code
> as-is but improve the comment to explicitly say something like:
>
> # old subscription regress_sub was enabled
> # old subscription regress_sub1 was disabled
>

I don't see the need to have separate queries though adding comments
is a good idea.

>
> 15.
>  extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
> - XLogRecPtr sublsn);
> + XLogRecPtr sublsn, bool upgrade);
>
> Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
> matches the comment you added in that function?
>

It is better to name this parameter retain_lock and then explain it
in the function header. The bigger problem with this change is that we
should release the other lock
(LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);)
taken in the function as well.

-- 
With Regards,
Amit Kapila.




Re: Show WAL write and fsync stats in pg_stat_io

2023-12-01 Thread Nazir Bilal Yavuz
Hi,

Thanks for all the feedback. I am sharing the new version of the patchset.

Current status of the patchset is:
- IOOBJECT_WAL / IOCONTEXT_NORMAL / read, write, fsync stats and their
tests are added.
- IOOBJECT_WAL / IOCONTEXT_INIT stats and their tests are added.
- Documentation is updated.
- pg_stat_io shows different op_bytes for the IOOBJECT_WAL operations.
- PendingWalStats.wal_sync and PendingWalStats.wal_write_time /
PendingWalStats.wal_sync_time are moved to pgstat_count_io_op_n() /
pgstat_count_io_op_time() respectively.

Updates & Discussion items:
- Try to set op_bytes for BackendType / IOContext: I think we don't
need this now, we will need this when we add streaming replication WAL
IOs.

- Decide which 'BackendType / IOContext / IOOp' should not be tracked:
-- IOOBJECT_WAL / IOCONTEXT_INIT + IOCONTEXT_NORMAL / write and fsync
IOs can be done on every backend that tracks IO statistics. Because of
that and since we have a pgstat_tracks_io_bktype(bktype) check, I
didn't add another check for this.
-- I found that only the standalone backend and startup backend do
IOOBJECT_WAL / IOCONTEXT_NORMAL / read IOs. So, I added a check for
that but I am not sure if there are more backends that do WAL reads on
WAL recovery.

- For the IOOBJECT_WAL / IOCONTEXT_INIT and IOOBJECT_WAL /
IOCONTEXT_NORMAL / read tests, I used initial WAL IOs to check these
stats. I am not sure if that is the correct way or enough to test
these stats.

- To avoid calculating WAL timings in both the pg_stat_wal and pg_stat_io
views, the pg_stat_wal view's WAL timings are fetched from pg_stat_io. Since
these timings come from pg_stat_io, pg_stat_reset_shared('io')
will reset pg_stat_wal's timings too.

- I didn't move 'PendingWalStats.wal_sync' out from the
'pgstat_count_io_op_n' function because they count the same thing
(block vs system calls) but I agree that this doesn't look good.

Any kind of feedback would be appreciated.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft
From 4ad85b11d418ae78237ed70eced6e3b46d086ef5 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 1 Dec 2023 10:03:21 +0300
Subject: [PATCH v5 3/4] Add IOOBJECT_WAL / IOCONTEXT_INIT / write and fsync
 tests

---
 src/test/regress/expected/stats.out | 19 +++
 src/test/regress/sql/stats.sql  | 10 ++
 2 files changed, 29 insertions(+)

diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 4d3a515bdd..4adda9e479 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -862,6 +862,25 @@ WHERE pg_stat_get_backend_pid(beid) = pg_backend_pid();
  t
 (1 row)
 
+-- Test pg_stat_io IOOBJECT_WAL / IOCONTEXT_INIT / write and fsync.
+-- When the servers starts, the initial WAL file must be created,
+-- so check these stats before stats get resetted.
+SELECT sum(writes) AS writes, sum(fsyncs) AS fsyncs
+  FROM pg_stat_io
+  WHERE context = 'init' and object = 'wal' \gset io_sum_wal_init_
+SELECT :io_sum_wal_init_writes > 0;
+ ?column? 
+--
+ t
+(1 row)
+
+SELECT current_setting('fsync') = 'off'
+  OR :io_sum_wal_init_fsyncs > 0;
+ ?column? 
+--
+ t
+(1 row)
+
 -
 -- Test that resetting stats works for reset timestamp
 -
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index aa48e65dc8..72e864a0d2 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -442,6 +442,16 @@ SELECT (current_schemas(true))[1] = ('pg_temp_' || beid::text) AS match
 FROM pg_stat_get_backend_idset() beid
 WHERE pg_stat_get_backend_pid(beid) = pg_backend_pid();
 
+-- Test pg_stat_io IOOBJECT_WAL / IOCONTEXT_INIT / write and fsync.
+-- When the servers starts, the initial WAL file must be created,
+-- so check these stats before stats get resetted.
+SELECT sum(writes) AS writes, sum(fsyncs) AS fsyncs
+  FROM pg_stat_io
+  WHERE context = 'init' and object = 'wal' \gset io_sum_wal_init_
+SELECT :io_sum_wal_init_writes > 0;
+SELECT current_setting('fsync') = 'off'
+  OR :io_sum_wal_init_fsyncs > 0;
+
 -
 -- Test that resetting stats works for reset timestamp
 -
-- 
2.43.0

From c7ae6c12cd02806d9d8201d738920179985cee7a Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 10 Nov 2023 14:52:22 +0300
Subject: [PATCH v5 2/4] Add IOOBJECT_WAL / IOCONTEXT_NORMAL / write and fsync
 tests

---
 src/test/regress/expected/stats.out | 26 ++
 src/test/regress/sql/stats.sql  | 11 +++
 2 files changed, 37 insertions(+)

diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 346e10a3d2..4d3a515bdd 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1255,6 +1255,7 @@ SELECT pg_stat_get_subscription_stats(NULL);
 -- - extends of relations using shared buffers
 -- - fsyncs done to ensure the durability of data dirtying shared buffers
 -- - shared buffer hits
+-- - WAL writes and fsyncs in IOContext IOCONTEXT_NORMAL
 

Improving asan/ubsan support

2023-12-01 Thread Alexander Lakhin

[ this thread separated from [1] as the discussion focus shifted ]

Hi Andres,

29.11.2023 22:39, Andres Freund wrote:

I use the following:
ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:print_stacktrace=1:\
disable_coredump=0:strict_string_checks=1:check_initialization_order=1:\
strict_init_order=1:detect_stack_use_after_return=0

I wonder if we should add some of these options by default ourselves. We could
e.g. add something like the __ubsan_default_options() in
src/backend/main/main.c to src/port/... instead, and return a combination of
"our" options (like detect_leaks=0) and the ones from the environment.


I think that such an explicit expression of the project policy regarding
sanitizer checks would be a good thing, but I see some obstacles along the way.

First, I'm not sure what to do with new useful options (or new option
values) that will appear in sanitizers eventually. Should only options
that are supported by all sanitizer versions be specified, or may we
expect that unsupported options/values will be ignored by old versions?

Second, what to do with other binaries that need detect_leaks=0, for
example, that same ecpg?


ISTM that, if it actually works as I theorize it should, using
__attribute__((no_sanitize("address"))) would be the easiest approach
here. Something like

#if defined(__has_feature) && __has_feature(address_sanitizer)
#define pg_attribute_no_asan __attribute__((no_sanitize("address")))
#else
#define pg_attribute_no_asan
#endif

or such should work.


I've tried adding:
 bool
+__attribute__((no_sanitize("address")))
 stack_is_too_deep(void)

and it does work for me with clang 15 and 18: `make check-world` passes with
ASAN_OPTIONS=detect_leaks=0:abort_on_error=1:print_stacktrace=1:\
disable_coredump=0:strict_string_checks=1:check_initialization_order=1:\
strict_init_order=1:detect_stack_use_after_return=1
UBSAN_OPTIONS=abort_on_error=1:print_stacktrace=1

(with a fix for pg_bsd_indent applied [2])

But with gcc 11, 12, 13 I get an assertion failure during `make check`:
#4  0x7fabadcd67f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x557f35260382 in ExceptionalCondition (conditionName=0x557f35ca51a0 "(uintptr_t) buffer == 
TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)", fileName=0x557f35ca4fc0 "md.c", lineNumber=471) at assert.c:66
#6  0x557f34a3b2bc in mdextend (reln=0x625375c8, forknum=MAIN_FORKNUM, blocknum=18, buffer=0x7fabaa800020, 
skipFsync=true) at md.c:471
#7  0x557f34a45a6f in smgrextend (reln=0x625375c8, forknum=MAIN_FORKNUM, blocknum=18, buffer=0x7fabaa800020, 
skipFsync=true) at smgr.c:501
#8  0x557f349139ed in RelationCopyStorageUsingBuffer (srclocator=..., dstlocator=..., forkNum=MAIN_FORKNUM, 
permanent=true) at bufmgr.c:4386


The buffer (buf) declared as follows:
static void
RelationCopyStorageUsingBuffer(RelFileLocator srclocator,
   RelFileLocator dstlocator,
   ForkNumber forkNum, bool permanent)
{
...
    PGIOAlignedBlock buf;
...

But as we can see, the buffer address is really not 4k-aligned, and that
offset 0x20 is added at run time only when the server is started with
detect_stack_use_after_return=1.
So it looks like the detect_stack_use_after_return feature as implemented
in gcc allows itself to add some data on the stack, which breaks our
alignment expectations. With all three such Asserts in md.c removed,
`make check-world` passes for me.
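To illustrate, the failing check boils down to something like this
(a self-contained sketch, not the actual md.c code; PG_IO_ALIGN_SIZE is
assumed to be 4096 here):

```c
#include <stdint.h>

/* A stand-in for PGIOAlignedBlock: an 8192-byte buffer that the compiler
 * is asked to place on a 4096-byte boundary (cf. pg_attribute_aligned). */
typedef union
{
	char		data[8192];
} __attribute__((aligned(4096))) IOAlignedBlock;

/* The same condition as the failing Assert in md.c:
 * (uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer) */
static inline int
is_io_aligned(const void *p)
{
	return ((uintptr_t) p & (4096 - 1)) == 0;
}

/* Normally gcc realigns the stack frame so this returns 1; with
 * detect_stack_use_after_return=1 the local lives in a heap-backed
 * "fake frame", and that fake-stack allocation does not honor the
 * over-aligned attribute, so the check can fail there. */
int
stack_block_is_aligned(void)
{
	IOAlignedBlock buf;

	return is_io_aligned(&buf);
}
```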


One thing that's been holding me back on trying to do something around this is
the basically non-existing documentation around all of this. I haven't even
found documentation referencing the fact that there are headers like
sanitizer/asan_interface.h, you just have to figure that out yourself. Compare
that to something like valgrind, which has documented this at least somewhat.


Yes, so maybe it's reasonable to support only basic/common features (such
as detect_leaks), leaving advanced ones for ad-hoc usage until they prove
their worth.

Best regards,
Alexander

[1] 
https://www.postgresql.org/message-id/flat/CWTLB2WWVJJ2.2YV6ERNOL1WVF%40neon.tech
[2] 
https://www.postgresql.org/message-id/591971ce-25c1-90f3-0526-5f54e3ebb32e%40gmail.com




Re: PostgreSql: Canceled on conflict out to old pivot

2023-12-01 Thread Wirch, Eduard
Thanks guys for the valuable info. The key takeaway for me is clear: keep
transactions short under all circumstances.

Cheers,
Eduard

Am Fr., 1. Dez. 2023 um 01:31 Uhr schrieb Andres Freund :

> Hi,
>
> On 2023-11-30 18:51:35 -0500, Tom Lane wrote:
> > On what grounds do you assert that?  Operations on shared catalogs
> > are visible across databases.  Admittedly they can't be written by
> > ordinary DML, and I'm not sure that we make any promises about DDL
> > writes honoring serializability.  But I'm unwilling to add
> > "optimizations" that assume that that will never happen.
>
> I'd say the issue is more that it's quite expensive to collect the
> information. I tried in the past to make the xmin computation in
> GetSnapshotData() be database specific, but it quickly shows in profiles,
> and
> GetSnapshotData() unfortunately is really performance / scalability
> critical.
>
> If that weren't the case, we could check a shared horizon for shared
> tables,
> and a non-shared horizon otherwise.
>
> In some cases we can compute a "narrower" horizon when it's worth the cost,
> but quite often we lack the necessary data, because various backends have
> stored the "global" xmin in the procarray.
>
> Greetings,
>
> Andres Freund
>


Re: Show WAL write and fsync stats in pg_stat_io

2023-12-01 Thread Nazir Bilal Yavuz
Hi,

On Wed, 8 Nov 2023 at 10:34, Bharath Rupireddy
 wrote:
>
> Is there any plan to account for WAL read stats in the WALRead()
> function which will cover walsenders i.e. WAL read by logical and
> streaming replication, WAL read by pg_walinspect and so on? I see the
> patch already covers WAL read stats by recovery in XLogPageRead(), but
> not other page_read callbacks which will end up in WALRead()
> eventually. If added, the feature at
> https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com
> can then extend it to cover WAL read from WAL buffer stats.

Yes, I am planning to create a patch for that after this patch is
done. Thanks for informing!

-- 
Regards,
Nazir Bilal Yavuz
Microsoft




Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding

2023-12-01 Thread li jie
> This is just an immature idea. I haven't started to implement it yet.
> Maybe it was designed this way because there
> are key factors that I didn't consider. So I want to hear everyone's
> opinions, especially the designers of logic decoding.

Attached is the patch I used to implement this optimization.
The main design points are as follows:
1. Added a callback LogicalDecodeFilterByRelCB for the output plugin.

2. Added this callback function pgoutput_table_filter for the pgoutput plugin.
Its main implementation is based on the table filter in the
pgoutput_change function.

3. After constructing a change, and before queueing it into a transaction,
use RelidByRelfilenumber to obtain the relation associated with the change,
just like the relation is obtained in the ReorderBufferProcessTXN function.

4. The relation may be a TOAST table, and there is no good way to get the
   real table relation directly from the TOAST relation. Here, I get the
   real table's OID from the TOAST relation's name, and then get the real
   table relation.

5. This filtering takes INSERT/UPDATE/DELETE into account. Other change
   types have not been considered yet and can be covered in the future.

6. The table filter in pgoutput_change and the relation lookup in
   ReorderBufferProcessTXN can then be deleted. This has not been done
   yet; it is the next step.
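To make the control flow concrete, here is a toy, self-contained sketch of
the idea (not the patch's actual code: real changes carry a relfilelocator
rather than a bare OID, and the filter would be the output plugin's new
filter-by-relation callback):

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Toy stand-ins for the reorder-buffer pieces involved. */
typedef struct Change
{
	Oid			relid;
	struct Change *next;
} Change;

typedef struct Txn
{
	Change	   *changes;
	int			nchanges;
} Txn;

/* Stand-in for the proposed LogicalDecodeFilterByRelCB: returns true if
 * the relation is irrelevant and the change can be dropped early. */
typedef bool (*FilterByRelCB) (Oid relid);

/* Example filter: pretend only relation 42 is published. */
bool
filter_unpublished(Oid relid)
{
	return relid != 42;
}

/* Queue a change only if the plugin's filter doesn't reject it,
 * mirroring "filter before queueing into the transaction". */
void
queue_change(Txn *txn, Change *change, FilterByRelCB filter)
{
	if (filter && filter(change->relid))
		return;					/* skip: not published, don't buffer it */
	change->next = txn->changes;
	txn->changes = change;
	txn->nchanges++;
}
```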

I sincerely look forward to your feedback.
Regards, lijie


v1-Filter-irrelevant-change-before-reassemble-transacti.patch
Description: Binary data


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-12-01 Thread Masahiko Sawada
On Thu, Nov 30, 2023 at 6:49 PM Amit Kapila  wrote:
>
> On Wed, Nov 29, 2023 at 7:33 AM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > > > Actually, we do not expect that it won't input NULL. IIUC all of slots 
> > > > have
> > > > slot_name, and subquery uses its name. But will it be kept forever? I 
> > > > think we
> > > > can avoid any risk.
> > > >
> > > > > I've not tested it yet but even if it returns NULL, perhaps
> > > > > get_old_cluster_logical_slot_infos() would still set curr->caught_up
> > > > > to false, no?
> > > >
> > > > Hmm. I checked the C99 specification [1] of strcmp, but it does not 
> > > > define the
> > > > case when the NULL is input. So it depends implementation.
> > >
> > > I think PQgetvalue() returns an empty string if the result value is null.
> > >
> >
> > Oh, you are right... I found below paragraph from [1].
> >
> > > An empty string is returned if the field value is null. See PQgetisnull 
> > > to distinguish
> > > null values from empty-string values.
> >
> > So I agree what you said - current code can accept NULL.
> > But still not sure the error message is really good or not.
> > If we regard an empty string as false, the slot which has empty name will 
> > be reported like:
> > "The slot \"\" has not consumed the WAL yet" in 
> > check_old_cluster_for_valid_slots().
> > Isn't it inappropriate?
> >
>
> I see your point that giving a better message (which would tell the
> actual problem) to the user in this case also has a value. OTOH, as
> you said, this case won't happen in practical scenarios, so I am fine
> either way with a slight tilt toward retaining a better error message
> (aka the current way). Sawada-San/Bharath, do you have any suggestions
> on this?

TBH I'm not sure the error message is much more helpful for users than
the message "The slot \"\" has not consumed the WAL yet" in practice.
In either case, the messages just tell the user the slot name passed
to the function was not appropriate. Rather, I'm a bit concerned that
we create a precedent where we make a function non-strict just to produce
an error message for unrealistic cases. Please point out if we
already have such precedents. Other functions in pg_upgrade_support.c
such as binary_upgrade_set_next_pg_tablespace_oid() are not called if
the argument is NULL, since they are strict functions. But if NULL were
passed in (which should not happen in practice), pg_upgrade would fail
with an error message, or would finish while leaving the cluster in an
inconsistent state; I've not tested which. Why do we want to care about
the argument being NULL only in
binary_upgrade_logical_slot_has_caught_up()?
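For reference, the PQgetvalue() behavior discussed upthread can be
sketched like this (a toy stand-in, not libpq itself: per the libpq
documentation, PQgetvalue() returns an empty string, never a NULL
pointer, for a SQL NULL, and PQgetisnull() is what distinguishes NULL
from a real empty string):

```c
#include <string.h>
#include <stdbool.h>

/* Toy model of reading the "caught up" column in pg_upgrade: the value
 * comes from PQgetvalue(), which yields "" for SQL NULL, so strcmp() is
 * always given a valid pointer, and a NULL simply compares as false. */
static bool
caught_up_from_result(const char *value)
{
	return strcmp(value, "t") == 0;
}
```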

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com