Re: SQL/JSON: functions

2022-03-28 Thread Nikola Ivanov
Hi Andrew,

Let me check what I can do with the access. I will get back to you in
an hour.

Regards

On Mon, Mar 28, 2022, 17:30 Andrew Dunstan  wrote:

>
> On 3/28/22 09:35, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> On 3/27/22 19:14, Tom Lane wrote:
> >>> What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
> >>> system here.  So there's something odd about jabiru's build
> >>> environment; but what?
> >> It's hard to see how this could be caused by the OS environment. Maybe a
> >> flaky bison/flex? I'm going to be pretty reluctant to revert this based
> >> on this error.
> > No, I wouldn't recommend reverting.  Perhaps if we dig down and find
> > something reproducible here, we could fix it --- but right now,
> > given my failure to reproduce, I think there's just something broken
> > on jabiru.
> >
> >
>
>
>
> Yeah. I have just duplicated your non-replication on a fresh instance.
>
>
> Nikola Ivanov, can you give us any assistance or give us access to the
> machine?
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

2022-03-28 Thread Julien Rouhaud
On Mon, Mar 28, 2022 at 11:20:42AM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > I'm attaching a v5 with hopefully a better comment for the function,
> > and only the "pretty" parameter.
> 
> Pushed with some minor cosmetic adjustments.

Thanks a lot!

I just published an extension using this for the use case I'm interested in.




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-28 Thread Robert Haas
On Mon, Mar 28, 2022 at 9:54 AM James Coleman  wrote:
> No, I've appreciated constructive feedback from both Tom and David on
> this thread. Your original email was so incredibly strongly worded
> (and contained no constructive recommendations about a better path
> forward, unlike Tom's and David's replies), and I had a hard time
> understanding what could possibly have made you that irritated with a
> proposal to document how to avoid long-running table scans while
> holding an exclusive lock.

I don't think I was particularly irritated then, but I admit I'm
getting irritated now. I clearly said that the documentation wasn't
perfect but that I didn't think these patches made it better, and I
explained why in some detail. It's not like I said "you suck and I
hate you and please go die in a fire" or something like that. So why
is that "incredibly strongly worded"? Especially when both David and
Tom agreed with my recommendation that we reject these patches as
proposed?

There are probably patches in this CommitFest that have gotten no
review from anyone, but it's pretty hard to find them, because the
CommitFest is full of patches like this one, which have been reviewed
fairly extensively yet which, for one reason or another, don't seem
likely to go anywhere any time soon. I think that's a much bigger
problem for the project than the lack of documentation on this
particular issue. Of course, you will likely disagree.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: SQL/JSON: functions

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 18:05:19 +0300, Nikola Ivanov wrote:
> Let me check what I can do with the access. I will get back to you in
> an hour.

Perhaps you can temporarily enable keep_error_builds, and send in
src/interfaces/ecpg/preproc/c_kwlist_d.h
src/interfaces/ecpg/preproc/pgc.c
src/interfaces/ecpg/preproc/preproc.h
src/interfaces/ecpg/preproc/ecpg_kwlist_d.h
src/interfaces/ecpg/preproc/preproc.c
from the failed build directory? It seems something there has to differ.
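
For reference, a hedged sketch of how that might look in the buildfarm
client's build-farm.conf (option name per the client's sample config;
exact semantics can differ between client versions):

    # keep failed build trees around so that generated files such as
    # preproc.c can be collected afterwards
    keep_error_builds => 1,  # the sample config ships with 0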

Regards,

Andres




Re: SQL/JSON: functions

2022-03-28 Thread Andrew Dunstan


On 3/28/22 11:05, Nikola Ivanov wrote:
> Hi Andrew,
>
> Let me check what I can do with the access. I will get back to
> you in an hour.


Thanks for your help and prompt response.

In the first instance we'd like to know what might be different about
jabiru from the openbsd7/clang11 instances Tom and I have just
successfully tested on. As a last resort we might need to run ecpg
under a debugger on jabiru to see why it's failing there and not
elsewhere. To set up for that, run the buildfarm script with --keepall.
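
For illustration, the invocation could look something like this (flag
name from the buildfarm client; adjust branch and options to your setup):

    # keep the build/install trees after the run for later inspection
    ./run_build.pl --keepall --verbose HEAD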


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)

2022-03-28 Thread Tom Lane
Julien Rouhaud  writes:
> I'm attaching a v5 with hopefully a better comment for the function, and only
> the "pretty" parameter.

Pushed with some minor cosmetic adjustments.

regards, tom lane




Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-03-28 Thread Matthias van de Meent
On Mon, 28 Mar 2022 at 06:33, Michael Paquier  wrote:
>
> On Wed, Feb 16, 2022 at 10:24:58PM +0100, Matthias van de Meent wrote:
> > A first good reason to do this is preventing further damage when a
> > page is corrupted: if I can somehow overwrite pd_special,
> > non-assert-enabled builds would start reading and writing at arbitrary
> > offsets from the page pointer, quite possibly in subsequent buffers
> > (or worse, on the stack, in case of stack-allocated blocks).
>
> Well, pageinspect has proved to be rather careless in this area for a
> couple of years, but this just came from the fact that we called
> PageGetSpecialPointer() before checking that PageGetSpecialSize() maps
> with the MAXALIGN'd size of the opaque area, if the related AM has
> one.
>
> I am not sure that we need to worry about corruptions, as data
> checksums would offer protection for most cases users usually need to
> worry about, at the exception of page corruptions because of memory
> for pages already read in the PG shared buffers from disk or even the
> OS cache.

Not all clusters have checksums enabled (boo!, but we can't
realistically fix that), so on-disk corruption could reasonably
propagate to the rest of such a system. Additionally, checksums are only
checked on read, and updated when the page is written to disk (in
PageSetChecksumCopy called from FlushBuffer), but this does not check
for signs of page corruption. As such, a memory bug resulting in
in-memory corruption of pd_special would persist and would currently
have the potential to further corrupt neighbouring buffers.

> > A second reason would be less indirection to get to the opaque
> > pointer. This should improve performance a bit in those cases where we
> > (initially) only want to access the [Index]PageOpaque struct.
>
> We don't have many cases that do that, do we?

Indeed not many, but I know that at least the nbtree-code has some
cases where it uses the opaque before other fields in the header are
accessed (sometimes even without accessing the header for other
reasons): in _bt_moveright and _bt_walk_left. There might be more; but
these are cases I know of.
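
For illustration, such a caller could then look roughly like this with
the proposed macro (quoted below); a hedged sketch, not code taken from
the patch:

    /* fetch the nbtree opaque without dereferencing pd_special first */
    BTPageOpaque opaque = PageGetSpecialOpaque(page, BTPageOpaqueData);
    bool         isleaf = P_ISLEAF(opaque); /* header cache line untouched
                                             * in non-assert builds */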

> > Lastly, 0002 and 0003 remove some repetitive tasks of dealing with the
> > [nbtree, hash] opaques by providing a typed accessor macro similar to
> > what is used in the GIN and (SP-)GIST index methods; improving the
> > legibility of the code and decreasing the churn.
>
> +#define PageGetSpecialOpaque(page, OpaqueDataTyp) \
> +( \
> +   AssertMacro(PageGetPageSize(page) == BLCKSZ && \
> +   PageGetSpecialSize(page) == MAXALIGN(sizeof(OpaqueDataTyp))), \
> +   (OpaqueDataTyp *) ((char *) (page) + (BLCKSZ - MAXALIGN(sizeof(OpaqueDataTyp)))) \
> +)
>
> Did you notice PageValidateSpecialPointer() that mentions MSVC?  Could
> this stuff be a problem if it is not an inline function?

I'm not sure; but the CFbot build (using whatever MSVC is used with
the latest Windows Server 2019, VS 19) failed to complain in this case.
Furthermore, the case mentioned in the comment of
PageValidateSpecialPointer() refers to problems that only happened
after the macro was updated from one AssertMacro to three separate
AssertMacro()s; this new code currently only has one, so we should be
fine for now.

> Perhaps you should do a s/OpaqueDataTyp/OpaqueDataType/.

This has been fixed in attached v2; and apart from tweaks in the
commit messages there have been no other changes.

> As a whole, this patch set looks like an improvement in terms of
> checks and consistency regarding the special area handling across the
> different AMs for the backend code, while reducing the MAXALIGN-ism on
> all those sizes, and this tends to be missed.

Thanks!

-Matthias
From c22128e0e63ab83f9009a1466ea50c46878d7650 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent 
Date: Mon, 28 Mar 2022 14:53:43 +0200
Subject: [PATCH v2 1/3] Add known-size pre-aligned special area pointer macro

This removes 1 layer of indirection for special areas of which we know the
type (=size) and location.

Special area access through the page header requires an extra cache line to
be fetched. If we only want to look at the special area, it is much cheaper
to calculate the offset to the special area directly instead of first
checking the page header - saving one cache line fetch.

Assertions are added to check that the page has a correctly sized special
area, and that the page is of the expected size. This detects data corruption,
instead of doing random reads/writes into the page or data allocated next to
the page being accessed.

Additionally, updates the GIN, GIST and SP-GIST [Index]PageGetOpaque macros
with the new pre-aligned special area accessor.
---
 src/include/access/ginblock.h   |  3 ++-
 src/include/access/gist.h   |  3 ++-
 src/include/access/spgist_private.h |  3 ++-
 src/include/storage/bufpage.h   | 14 ++
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/include/access/ginblock.h b/src/inc

Re: support for MERGE

2022-03-28 Thread Alvaro Herrera
On 2022-Mar-28, Alvaro Herrera wrote:

> I intend to get this pushed after lunch.

Pushed, with one more change: fetching the tuple ID junk attribute in
ExecMerge was not necessary, since we already had done that in
ExecModifyTable.  We just needed to pass that down to ExecMerge, and
make sure to handle the case where there isn't one.

Early buildfarm results: I need to fix role names in the new test.  On it
now.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)




Re: role self-revocation

2022-03-28 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Mar 24, 2022 at 1:10 PM Tom Lane  wrote:
> > > However, it might. And if it does, I think it would be best if
> > > removing that exception were the *only* change in this area made by
> > > that release.
> >
> > Good idea, especially since it's getting to be too late to consider
> > anything more invasive anyway.
> 
> I'd say it's definitely too late at this point.

Agreed.

> > > So I propose to commit something like what I posted here:
> > > http://postgr.es/m/ca+tgmobgek0jraowqvpqhsxcfbdfitxsomoebhmmmhmj4gl...@mail.gmail.com
> >
> > +1, although the comments might need some more work.  In particular,
> > I'm not sure that this bit is well stated:

Also +1 on this.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-28 Thread Robert Haas
On Fri, Mar 25, 2022 at 8:26 AM Alvaro Herrera  wrote:
> On 2022-Mar-21, Alvaro Herrera wrote:
> > I had a look at this latest version of the patch, and found some things
> > to tweak.  Attached is v21 with three main changes from Kyotaro's v20:
>
> Pushed this, backpatching to 14 and 13.  It would have been good to
> backpatch further, but there's an (textually trivial) merge conflict
> related to commit e6d8069522c8.  Because that commit conceptually
> touches the same area that this bugfix is about, I'm not sure that
> backpatching further without a lot more thought is wise -- particularly
> so when there's no way to automate the test in branches older than
> master.
>
> This is quite annoying, considering that the bug was reported shortly
> before 12 went into beta.

I think that the warnings this patch issues may cause some unnecessary
end-user alarm. It seems to me that they are basically warning about a
situation that is unusual but not scary. Isn't the appropriate level
for that DEBUG1, maybe without the errhint?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [WIP] Allow pg_upgrade to copy segments of the same relfilenode in parallel

2022-03-28 Thread Justin Pryzby
On Sun, Mar 27, 2022 at 11:07:27AM -0500, Jaime Casanova wrote:
> > > It lacks documentation and I need help with WIN32 part of it, I created
> > > this new mail to put the patch on the next commitfest.
> > 
> > The patch currently fails on cfbot due to warnings, likely related to
> > the win32 issue:
> > https://cirrus-ci.com/task/4566046517493760?logs=mingw_cross_warning#L388
> > 
> > As it's a new patch submitted to the last CF, hasn't gotten any review
> > yet and misses some platform support, it seems like there's no chance it
> > can make it into 15?
> 
> Because I have zero experience on the windows side of this, I will take
> some time to complete that part.
> 
> Should we move this to the next commitfest (and make 16 the target for
> this)?

Done.

src/tools/ci/README may help test this under windows, but that's probably
not enough to allow writing the win-specific parts.

I guess you'll need to write tests for this... unfortunately that requires
files >1GB in size, unless you recompile postgres :(

It may be good enough to write an 0002 patch meant for CI only, but not
intended to be merged.  That can create a 2300MB table in src/test/regress, and
change pg_upgrade to run with (or default to) multiple jobs per tablespace.
Make sure it fails if the loop around relfilenodes doesn't work.
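
For illustration only, the seed relation could be as simple as this
(table name and sizing are made up; ~1 kB per row times 2.5 million rows
comfortably exceeds the 1GB segment size):

    -- CI-only: force a relation spanning more than one segment file
    CREATE TABLE big_relfilenode_test AS
        SELECT g AS id, repeat('x', 1000) AS filler
        FROM generate_series(1, 2500000) g;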

I can't help with win32, but that would be enough to verify it if someone else
fills in the windows parts.

-- 
Justin




Re: SQL/JSON: functions

2022-03-28 Thread Andrew Dunstan


On 3/28/22 09:35, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 3/27/22 19:14, Tom Lane wrote:
>>> What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
>>> system here.  So there's something odd about jabiru's build
>>> environment; but what?
>> It's hard to see how this could be caused by the OS environment. Maybe a
>> flaky bison/flex? I'm going to be pretty reluctant to revert this based
>> on this error.
> No, I wouldn't recommend reverting.  Perhaps if we dig down and find
> something reproducible here, we could fix it --- but right now,
> given my failure to reproduce, I think there's just something broken
> on jabiru.
>
>   



Yeah. I have just duplicated your non-replication on a fresh instance.


Nikola Ivanov, can you give us any assistance or give us access to the
machine?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: identifying unrecognized node type errors

2022-03-28 Thread Ashutosh Bapat
On Fri, Mar 25, 2022 at 7:23 PM Tom Lane  wrote:
>
> Ashutosh Bapat  writes:
> > All these functions are too low level to be helpful to know. Knowing
> > the caller might actually give a hint as to where the unknown node
> > originated from. We may get that from the stack trace if that's
> > available. But if we could annotate the error with error_context that
> > will be super helpful.
>
> Is it really that interesting?  If function X lacks coverage for
> node type Y, then X is what needs to be fixed.  The exact call
> chain for any particular failure seems of only marginal interest,
> certainly not enough to be building vast new infrastructure for.
>

I don't think we have tests covering all possible combinations of
expression trees. Code coverage reports may not reveal this. I have
encountered flaky "unknown expression type" errors, and always ended up
changing code to get the stack trace. Having error context helps.

-- 
Best Wishes,
Ashutosh Bapat




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-28 Thread James Coleman
On Mon, Mar 28, 2022 at 9:30 AM Robert Haas  wrote:
>
> On Sun, Mar 27, 2022 at 1:00 PM James Coleman  wrote:
> > So "undocumented concept" is just not accurate, and so I don't see it
> > as a valid reason to reject the patch.
>
> I mean, I think it's pretty accurate. The fact that you can point to a
> few uses of the terms "table rewrite" and "table scan" in the ALTER
> TABLE documentation doesn't prove that those terms are defined there
> or systematically discussed and it seems pretty clear to me that they
> are not. And I don't even know what we're arguing about here, because
> elsewhere in the same email you agree that it is reasonable to
> critique the patch on the basis of how well it fits into the
> documentation and at least for me that is precisely this issue.
>
> I think the bottom line here is that you're not prepared to accept as
> valid any opinion to the effect that we shouldn't commit these
> patches. But that remains my opinion.

No, I've appreciated constructive feedback from both Tom and David on
this thread. Your original email was so incredibly strongly worded
(and contained no constructive recommendations about a better path
forward, unlike Tom's and David's replies), and I had a hard time
understanding what could possibly have made you that irritated with a
proposal to document how to avoid long-running table scans while
holding an exclusive lock.

The two patches you reviewed aren't the current state of this
proposal; I'll continue working on revising to reviewers replies, and
as either a replacement or follow-on for this I like Tom's idea of
having a comprehensive guide (which I think has been needed for quite
some time).

Thanks,
James Coleman




Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-03-28 Thread Peter Eisentraut

On 28.03.22 12:46, Aleksander Alekseev wrote:

> Personally I don't have a strong opinion here. Merging the patch sooner will
> allow us to move toward 64-bit XIDs faster (e.g. to gather the feedback from
> the early adopters, allow the translators to do their thing earlier, etc).
> Merging it later will make PG15 more stable (you can't break anything new if
> you don't change anything). As always, engineering is all about compromises.


At this point, I'm more concerned that code review is still finding 
bugs, and that we have no test coverage for any of this, so we are 
supposed to gain confidence in this patch by staring at it very hard. ;-)


AFAICT, this patch provides no actual functionality change, so holding
it until early in the PG16 cycle wouldn't lose anything.






Re: SQL/JSON: functions

2022-03-28 Thread Tom Lane
Andrew Dunstan  writes:
> On 3/27/22 19:14, Tom Lane wrote:
>> What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
>> system here.  So there's something odd about jabiru's build
>> environment; but what?

> It's hard to see how this could be caused by the OS environment. Maybe a
> flaky bison/flex? I'm going to be pretty reluctant to revert this based
> on this error.

No, I wouldn't recommend reverting.  Perhaps if we dig down and find
something reproducible here, we could fix it --- but right now,
given my failure to reproduce, I think there's just something broken
on jabiru.

regards, tom lane




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-28 Thread Tom Lane
Alvaro Herrera  writes:
>> After:
>> interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency
>> { failures | serialization_failures deadlock_failures } [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

> I think that the explanatory paragraph is way too long now, particularly
> since it explains --failures-detailed starting in the middle.  Also, the
> example output doesn't include the failures-detailed mode.

I think the problem is not merely one of documentation, but one of
bad design.  Up to now it was possible to tell what was what from
counting the number of columns in the output; but with this design,
that is impossible.  That should be fixed.  The first thing you have
got to do is drop the alternation { failures | serialization_failures
deadlock_failures }.  That doesn't make any sense in the first place:
counting serialization and deadlock failures doesn't make it impossible
for other errors to occur.  It'd probably make the most sense to have
three columns always, serialization, deadlock and total.  Now maybe
that change alone is sufficient, but I'm not convinced, because the
multiple options at the end of the line mean we will never again be
able to add any more columns without reintroducing ambiguity.  I
would be happier if the syntax diagram were such that columns could
only be dropped from right to left.

regards, tom lane




Re: Document atthasmissing default optimization avoids verification table scan

2022-03-28 Thread Robert Haas
On Sun, Mar 27, 2022 at 1:00 PM James Coleman  wrote:
> So "undocumented concept" is just not accurate, and so I don't see it
> as a valid reason to reject the patch.

I mean, I think it's pretty accurate. The fact that you can point to a
few uses of the terms "table rewrite" and "table scan" in the ALTER
TABLE documentation doesn't prove that those terms are defined there
or systematically discussed and it seems pretty clear to me that they
are not. And I don't even know what we're arguing about here, because
elsewhere in the same email you agree that it is reasonable to
critique the patch on the basis of how well it fits into the
documentation and at least for me that is precisely this issue.

I think the bottom line here is that you're not prepared to accept as
valid any opinion to the effect that we shouldn't commit these
patches. But that remains my opinion.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: document the need to analyze partitioned tables

2022-03-28 Thread Tomas Vondra
On 3/16/22 00:00, Justin Pryzby wrote:
> On Mon, Mar 14, 2022 at 05:23:54PM -0400, Robert Haas wrote:
>> On Fri, Jan 21, 2022 at 1:31 PM Tomas Vondra  
>> wrote:
>>> [ new patch ]
>>
>> This patch is originally by Justin. The latest version is by Tomas. I
>> think the next step is for Justin to say whether he's OK with the
>> latest version that Tomas posted. If he is, then I suggest that he
>> also mark it Ready for Committer, and that Tomas commit it. If he's
>> not, he should say what he wants changed and either post a new version
>> himself or wait for Tomas to do that.
> 
> Yes, I think it can be Ready.  Done.
> 
> I amended some of Tomas' changes (see 0003, attached as txt).
> 
> @cfbot: the *.patch file is for your consumption, and the others are only 
> there
> to show my changes.
> 
>> I think the fact that is classified as a "Bug Fix" in the CommitFest
>> application is not particularly good. I would prefer to see it
>> classified under "Documentation". I'm prepared to concede that
>> documentation can have bugs as a general matter, but nobody's data is
>> getting eaten because the documentation wasn't updated. In fact, this
>> is the fourth patch from the "bug fix" section I've studied this
>> afternoon, and, well, none of them have been back-patchable code
>> defects.
> 
> In fact, I consider this to be back-patchable back to v10.  IMO it's an
> omission that this isn't documented.  Not all bugs cause data to be eaten.  If
> someone reads the existing documentation, they might conclude that their
> partitioned tables don't need to be analyzed, and they would've been better
> served by not reading the docs.
> 

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: On login trigger: take three

2022-03-28 Thread a . sokolov

Daniel Gustafsson wrote on 2022-03-22 11:43:

On 22 Mar 2022, at 04:48, Andres Freund  wrote:


docs fail to build: 
https://cirrus-ci.com/task/5556234047717376?logs=docs_build#L349


Ugh, that one was on me and not the original author. Fixed.



+data initialization. It is vital that any event trigger using the
+login event checks whether or not the database is in
+recovery.

Does any trigger really have to contain a pg_is_in_recovery() call? In
this message
(https://www.postgresql.org/message-id/20220312024652.lvgehszwke4hhove%40alap3.anarazel.de)
it was only about triggers on hot standby, which run queries that are
not read-only.


--
Andrey Sokolov
Postgres Professional: http://www.postgrespro.com




Re: Add LZ4 compression in pg_dump

2022-03-28 Thread Robert Haas
On Sun, Mar 27, 2022 at 12:06 PM Justin Pryzby  wrote:
> Maybe it should take an argument which specifies the default algorithm to use
> for input of a numeric "level".  And reject such input if not specified, since
> wal_compression has never taken a "level", so it's not useful or desirable to
> have that default to some new algorithm.

That sounds odd to me. Wouldn't it be rather confusing if a bare
integer meant gzip for one case and lz4 for another?

> I could write this down if you want, although I'm not sure how/if you intend
> other people to use bc_algorithm and bc_algorithm.  I don't think it's
> important to do for v15, but it seems like it could be done after feature
> freeze.  pg_dump+lz4 is targeting v16, although there's a cleanup patch that
> could also go in before branching.

Well, I think the first thing we should do is get rid of enum
WalCompressionMethod and use enum WalCompression instead. They've got
the same elements and very similar names, but the WalCompressionMethod
ones just have names like COMPRESSION_NONE, which is too generic,
whereas WalCompressionMethod uses WAL_COMPRESSION_NONE, which is
better. Then I think we should also rename the COMPR_ALG_* constants
in pg_dump.h to names like DUMP_COMPRESSION_*. Once we do that we've
got rid of all the unprefixed things that purport to be a list of
compression algorithms.
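
To make the direction concrete, the surviving enum would be the prefixed
one, along these lines (a sketch based on the current sources; the member
list may of course change with whatever algorithms end up supported):

    typedef enum WalCompression
    {
        WAL_COMPRESSION_NONE = 0,
        WAL_COMPRESSION_PGLZ,
        WAL_COMPRESSION_LZ4,
        WAL_COMPRESSION_ZSTD
    } WalCompression;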

Then, if people are willing to adopt the syntax that the
backup_compression.c/h stuff supports as a project standard (+1 from
me) we can go the other way and rename that stuff to be more generic,
taking backup out of the name.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




RE: logical replication empty transactions

2022-03-28 Thread houzj.f...@fujitsu.com
On Monday, March 28, 2022 3:08 PM Amit Kapila  wrote:
> 
> On Fri, Mar 25, 2022 at 12:50 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > Attach the new version patch with this change.
> >
> 
> Few comments:

Thanks for the comments.

> =
> 1. I think we can move the keep_alive check after the tracklag record
> check to keep it consistent with another patch [1].

Changed.

> 2. Add the comment about the new parameter skipped_xact atop
> WalSndUpdateProgress.

Added.

> 3. I think we need to call pq_flush_if_writable after sending a
> keepalive message to avoid delaying sync transactions.

Agreed.
If we don’t flush the data, we might flush the keepalive later than before. And
we could get the reply later as well and then the release of syncwait could be
delayed.

Attached is the new version patch, which addresses the above comments.
The patch also adds a loop after the newly added keepalive message
to make sure the message is actually flushed to the client, as is done
in WalSndWriteData.

Best regards,
Hou zj



v32-0001-Skip-empty-transactions-for-logical-replication.patch
Description:  v32-0001-Skip-empty-transactions-for-logical-replication.patch


Re: SQL/JSON: functions

2022-03-28 Thread Andrew Dunstan


On 3/27/22 19:14, Tom Lane wrote:
> I wrote:
>>> Andres Freund  writes:
 There's also
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jabiru&dt=2022-03-22%2022%3A25%3A26
 that started failing with
 ../../preproc/ecpg --regression -I./../../include -I. -o test1.c test1.pgc
 test1.pgc:12: ERROR: syntax error at or near "int"
 with this commit.
>>> Yeah, I was just scratching my head about that.
> This problem came back as soon as we de-reverted that patch :-(.
> So much for my guess about unused rules.
>
> What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
> system here.  So there's something odd about jabiru's build
> environment; but what?
>
>   



It's hard to see how this could be caused by the OS environment. Maybe a
flaky bison/flex? I'm going to be pretty reluctant to revert this based
on this error.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-03-28 Thread Aleksander Alekseev
Hi hackers,

> Here is v30.

I took another look at the patchset. Personally I don't think it will get much
better than it is now. I'm inclined to change the status of the CF entry to
"Ready for Committer" unless anyone disagrees.

cfbot reports a problem with t/013_partition.pl but the test seems to be flaky
on `master` [1]. I couldn't find anything useful in the logs except for
"[postmaster] LOG:  received immediate shutdown request". Then I re-checked the
patchset on FreeBSD 13 locally. The patchset passed `make installcheck-world`.

> About v25-0002-Use-64-bit-format-to-output-XIDs.patch:
> I don't see the point of applying this now. It doesn't make PG15 any
> better. It's just a patch part of which we might need later.

> It was proposed in [1] that we'd better cut it into separate threads and
> commit by parts, some into v15, the other into v16. So we did. In view of
> this, I can not accept that 0002 patch doesn't make v15 better. I consider
> it is separate enough to be committed as a base to further 64xid parts.

I understand how disappointing this may be.

Personally I don't have a strong opinion here. Merging the patch sooner will
allow us to move toward 64-bit XIDs faster (e.g. to gather the feedback from
the early adopters, allow the translators to do their thing earlier, etc).
Merging it later will make PG15 more stable (you can't break anything new if
you don't change anything). As always, engineering is all about compromises.

It's up to the committer to decide.

[1]: 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogfish&dt=2022-03-25%2018%3A37%3A10

--
Best regards,
Aleksander Alekseev




Re: Summary Sort workers Stats in EXPLAIN ANALYZE

2022-03-28 Thread Jian Guo

I have updated the patch addressing the review comments, but I didn't move
this code block into VERBOSE mode, to keep consistency with
`show_incremental_sort_info`:

https://github.com/postgres/postgres/blob/d4ba8b51c76300f06cc23f4d8a41d9f7210c4866/src/backend/commands/explain.c#L2890

Please review, thanks.


From: Julien Rouhaud 
Sent: Friday, March 25, 2022 17:04
To: Jian Guo 
Cc: pgsql-hackers@lists.postgresql.org ; 
Zhenghua Lyu 
Subject: Re: Summary Sort workers Stats in EXPLAIN ANALYZE


Hi,

On Thu, Mar 24, 2022 at 07:50:11AM +, Jian Guo wrote:
> For a simple demo, with this explain statement:
>
> -- Test sort stats summary
> set force_parallel_mode=on;
> select explain_filter('explain (analyze, summary off, timing off, costs off, 
> format json) select * from tenk1 order by unique1');
>
> Before this patch, we got plan like this:
>
>
>"Node Type": "Sort",+
>"Parent Relationship": "Outer", +
>"Parallel Aware": false,+
>"Async Capable": false, +
>"Actual Rows": 1,   +
>"Actual Loops": 1,  +
>"Sort Key": ["unique1"],+
>"Workers": [+
>  { +
>"Worker Number": 0, +
>"Sort Method": "external merge",+
>"Sort Space Used": 2496,+
>"Sort Space Type": "Disk"   +
>  } +
>],  +

> After this patch, the effected plan is this:
>
>"Node Type": "Sort",   +
>"Parent Relationship": "Outer",+
>"Parallel Aware": false,   +
>"Async Capable": false,+
>"Actual Rows": N,  +
>"Actual Loops": N, +
>"Sort Key": ["unique1"],   +
>"Workers planned": N,  +
>"Sort Method": "external merge",   +
>"Average Sort Space Used": N,  +
>"Peak Sort Space Used": N, +
>"Sort Space Type": "Disk", +

I think the idea is interesting; however, there are a few problems in the patch.

First, I think that it should only be done in the VERBOSE OFF mode.  If you ask
for a VERBOSE output you don't need both the details and the summarized
version.

Other minor problems:

- why emit (only) the number of workers planned and not the number of
  workers launched?
- the textual format is missing details about what the numbers are, which is
  particularly obvious since avgSpaceUsed and peakSpaceUsed don't have any unit
  or even space between them:

+"Sort Method: %s  %s: " INT64_FORMAT INT64_FORMAT 
"kB\n",
+sortMethod, spaceType, avgSpaceUsed, peakSpaceUsed);




From 578b9ce26c7ea1a13d6cf78d0848fa6a950bd82b Mon Sep 17 00:00:00 2001
From: Aegeaner Guo 
Date: Mon, 21 Mar 2022 11:19:46 +0800
Subject: [PATCH] Summary Sort workers Stats.

Signed-off-by: Jian Guo 
---
 src/backend/commands/explain.c|  43 +--
 src/test/regress/expected/explain.out | 522 ++
 src/test/regress/sql/explain.sql  |  13 +-
 3 files changed, 390 insertions(+), 188 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9f632285b6..cdda0ac955 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2758,40 +2758,41 @@ show_sort_info(SortState *sortstate, ExplainState *es)
 	if (sortstate->shared_info != NULL)
 	{
 		int			n;
+		const char *sortMethod;
+		const char *spaceType;
+		int64		peakSpaceUsed = 0;
+		int64		totalSpaceUsed = 0;
 
 		for (n = 0; n < sortstate->shared_info->num_workers; n++)
 		{
 			TuplesortInstrumentation *sinstrument;
-			const char *sortMethod;
-			const char *spaceType;
-			int64		spaceUsed;
 
 			sinstrument = &sortstate->shared_info->sinstrument[n];
 			if (sinstrument->sortMethod == SORT_TYPE_STILL_IN_PROGRESS)
 continue;		/* ignore any unfilled slots */
 			sortMethod = tuplesort_method_name(sinstrument->sortMethod);
 			spaceType = tuplesort_space_type_name(sinstrument->spaceType);
-			spaceUsed = sinstrument->spaceUsed;
+			peakSpaceUsed = Max(peakSpaceUsed, sinstrument->spaceUsed);
+			totalSpaceUsed += sinstrument->spaceUsed;
+		}
 
-			if (es->workers_state)
-ExplainOpenWorker(n, es);
+		int64 avgSpaceUsed = sortstate->shared_info->num_workers > 0 ?
+totalSpaceUsed / sortstate->shared_info->num_workers : 0;
 
-			if (es->format == EXPLAIN_FORMAT_TEXT)
-			{
-ExplainIndentText(es);
-	

Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

2022-03-28 Thread Kyotaro Horiguchi
At Sun, 27 Mar 2022 20:32:45 +0200, Erik Rijkers  wrote in 
> On master I got a FailedAssertion("HaveRegisteredOrActiveSnapshot()"
> on an assert-enabled instance and with (I think) data over a certain
> length.
> 
> I whittled it down to the attached bash (careful - it drops stuff).
> It has 5 tsv-data lines (one long line) that COPY slurps into a table.
> The middle, third line causes the problem, later on.  Shortening the
> long line to somewhere below 2000 characters fixes it again.
> 
> More info in the attached .sh file.

It is reproducible for me. Thanks for the reproducer.

> If debug-assert is 'off', the problem does not occur. (REL_14_STABLE
> also does not have the problem, assertions or not)

It seems related to [1]?

Inserting EnsurePortalSnapshotExists() into RunFromStore fixes this, but
I'm not sure where the right place to do this is..
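
A minimal sketch of that tentative idea (the placement is exactly what
I'm unsure about; EnsurePortalSnapshotExists() takes no arguments):

    /* in RunFromStore(), before pumping tuples to the receiver: make sure
     * a snapshot is available for detoasting in the output functions */
    EnsurePortalSnapshotExists();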

[1] 
https://www.postgresql.org/message-id/flat/20210623035916.GL29179%40telsasoft.com#f802617a00cee4d013ad8fa69e1af048

For reference, here is a more readable stack trace.

#0  0x7f43aeed037f in raise () from /lib64/libc.so.6
#1  0x7f43aeebadb5 in abort () from /lib64/libc.so.6
#2  0x00b28747 in ExceptionalCondition (
conditionName=0xba2c48 "HaveRegisteredOrActiveSnapshot()", 
errorType=0xba2882 "FailedAssertion", 
fileName=0xba2870 "toast_internals.c", lineNumber=670) at assert.c:69
#3  0x004ac776 in init_toast_snapshot (toast_snapshot=0x7ffce64f7440)
at toast_internals.c:670
#4  0x005164ea in heap_fetch_toast_slice (toastrel=0x7f43b193cad0, 
valueid=16393, attrsize=1848, sliceoffset=0, slicelength=1848, 
result=0x1cbb948) at heaptoast.c:688
#5  0x0049fc86 in table_relation_fetch_toast_slice (
toastrel=0x7f43b193cad0, valueid=16393, attrsize=1848, sliceoffset=0, 
slicelength=1848, result=0x1cbb948)
at ../../../../src/include/access/tableam.h:1892
#6  0x004a0a0f in toast_fetch_datum (attr=0x1d6b171) at detoast.c:375
#7  0x0049fffb in detoast_attr (attr=0x1d6b171) at detoast.c:123
#8  0x00b345ba in pg_detoast_datum_packed (datum=0x1d6b171)
at fmgr.c:1757
#9  0x00aece72 in text_to_cstring (t=0x1d6b171) at varlena.c:225
#10 0x00aedda2 in textout (fcinfo=0x7ffce64f77a0) at varlena.c:574
#11 0x00b331bf in FunctionCall1Coll (flinfo=0x1d695e0, collation=0, 
arg1=30847345) at fmgr.c:1138
#12 0x00b3422b in OutputFunctionCall (flinfo=0x1d695e0, val=30847345)
at fmgr.c:1575
#13 0x004a6b6c in printtup (slot=0x1cb81f0, self=0x1c96e90)
at printtup.c:357
#14 0x0099499f in RunFromStore (portal=0x1cf9380, 
direction=ForwardScanDirection, count=0, dest=0x1c96e90) at pquery.c:1096
#15 0x009944e3 in PortalRunSelect (portal=0x1cf9380, forward=true, 
count=0, dest=0x1c96e90) at pquery.c:917
#16 0x009941d3 in PortalRun (portal=0x1cf9380, 
count=9223372036854775807, isTopLevel=true, run_once=true, 
dest=0x1c96e90, altdest=0x1c96e90, qc=0x7ffce64f7ac0) at pquery.c:765
#17 0x0098df4b in exec_simple_query (
query_string=0x1c96030 "fetch all in myportal;") at postgres.c:1250
#18 0x009923a3 in PostgresMain (dbname=0x1cc11b0 "postgres", 
username=0x1cc1188 "horiguti") at postgres.c:4520
#19 0x008c6caf in BackendRun (port=0x1cb74c0) at postmaster.c:4593
#20 0x008c6631 in BackendStartup (port=0x1cb74c0) at postmaster.c:4321
#21 0x008c29cb in ServerLoop () at postmaster.c:1801
#22 0x008c2298 in PostmasterMain (argc=1, argv=0x1c8e0e0)
at postmaster.c:1473
#23 0x007c14c3 in main (argc=1, argv=0x1c8e0e0) at main.c:202



regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-03-28 Thread Jelte Fennema
Thanks for all the feedback everyone. I'll try to send a new patch
later this week that includes user facing docs and a simplified API.
For now a few responses:

> Yeah.  We need to choose a name for these new function(s) that is
> sufficiently different from "PQcancel" that people won't expect them
> to behave exactly the same as that does.  I lack any good ideas about
> that, how about you?

So I guess the names I proposed were not great, since everyone seems to be 
falling over them. 
But I'd like to make my intention clear with the current naming. After this 
patch there would be 
four different APIs for starting a cancelation:
1. PQrequestCancel: deprecated+old, not signal-safe function for requesting 
query cancellation, only uses a specific set of connection options
2. PQcancel: Cancel queries in a signal safe way, to be signal-safe it only 
uses a limited set of connection options
3. PQcancelConnect: Cancel queries in a non-signal safe way that uses all 
connection options
4. PQcancelConnectStart: Cancel queries in a non-signal safe and non-blocking 
way that uses all connection options

So the idea was that you should not look at PQcancelConnectStart as the 
non-blocking
version of PQcancel, but as the non-blocking version of PQcancelConnect. I'll 
try to 
think of some different names too, but IMHO these names could be acceptable
when their differences are addressed sufficiently in the documentation. 

One other approach to naming that comes to mind now is repurposing 
PQrequestCancel:
1. PQrequestCancel: Cancel queries in a non-signal safe way that uses all 
connection options
2. PQrequestCancelStart: Cancel queries in a non-signal safe and non-blocking 
way that uses all connection options
3. PQcancel: Cancel queries in a signal safe way, to be signal-safe it only 
uses a limited set of connection options

> I think it's probably a
> bad sign that this function is tinkering with logic in this sort of
> low-level function anyway. pqReadData() is a really general function
> that manages to work with non-blocking I/O already, so why does
> non-blocking query cancellation need to change its return values, or
> whether or not it drops data in certain cases?

The reason for this low-level change is that the cancellation part of the
Postgres protocol follows a different, much simpler design
than all the other parts. The client does not expect a response message back
from the server after sending the cancellation request. The expectation 
is that the server signals completion by closing the connection, i.e. sending 
EOF. 
For all other parts of the protocol, connection termination should be initiated
client side by sending a Terminate message. So the server closing (sending
EOF) is always unexpected and is thus currently considered an error by 
pqReadData.

But since this is not the case for the cancellation protocol, the result is
changed to -2 in case of EOF to make it possible to distinguish between
an EOF and an actual error.
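
As a hedged illustration of the intended caller-side logic (the status
values are hypothetical and the control flow is simplified, not the exact
patch code):

    n = pqReadData(conn);
    if (n == -2)
        status = CANCEL_FINISHED;   /* EOF: the cancel request completed */
    else if (n < 0)
        status = CANCEL_FAILED;     /* a genuine error, as before */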

> And it updates that function to return -2 when the error is
> ECONNRESET, which seems to fly in the face of the comment's idea that
> this is the "clean connection closure" case. 

The diff sadly does not include the very relevant comment right above these
lines. Pasting the whole case statement here to clear up this confusion:

case SSL_ERROR_ZERO_RETURN:

/*
 * Per OpenSSL documentation, this error code is only returned for
 * a clean connection closure, so we should not report it as a
 * server crash.
 */
appendPQExpBufferStr(&conn->errorMessage,
 libpq_gettext("SSL connection 
has been closed unexpectedly\n"));
result_errno = ECONNRESET;
n = -2;
break;


> For example, it updates the
> comment for pqsecure_read() to say "Returns -1 in case of failures,
> except in the case of clean connection closure then it returns -2."
> But that function calls any of three different implementation
> functions depending on the situation and the patch only updates one of
> them. 

That comment is indeed not describing what is happening correctly, and I'll
try to make it clearer. The main reason it is incorrect is that receiving
EOFs is handled in different places depending on the encryption method:

1. Unencrypted TCP: EOF is not returned as an error by pqsecure_read, but 
detected by pqReadData (see comments related to definitelyEOF)
2. OpenSSL: EOF is returned as an error by pqsecure_read (see copied case 
statement above)
3. GSS: When writing the patch I was not sure how EOF handling worked here, but 
given that the tests passed for Jacob on GSS, I'm guessing it works the same as 
unencrypted TCP.




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-28 Thread Daniel Gustafsson
> On 28 Mar 2022, at 00:44, Daniel Gustafsson  wrote:

> I'll take a look at fixing up the test in this patch tomorrow.

Fixing up the switch_server_cert() calls and using default_ssl_connstr makes
the test pass for me.  The required fixes are in the supplied 0004 diff, I kept
them separate to allow the original author to incorporate them without having
to dig them out to see what changed (named to match the git format-patch output
since I think the CFBot just applies the patches in alphabetical order).

--
Daniel Gustafsson   https://vmware.com/



v10-0001-Move-inet_net_pton-to-src-port.patch
Description: Binary data


v10-0003-squash-libpq-allow-IP-address-SANs-in-server-cer.patch
Description: Binary data


v10-0003-squash-libpq-allow-IP-address-SANs-in-server-cer-1.patch
Description: Binary data


v10-0004-fix-tests.diff
Description: Binary data


Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-28 Thread Julien Rouhaud
On Mon, Mar 28, 2022 at 05:02:14PM +0900, Michael Paquier wrote:
> On Mon, Mar 28, 2022 at 03:43:41PM +0800, Julien Rouhaud wrote:
> >
> > Ok.  We could still keep the tests for the valid lines part though?
>
> With the SQLs modified as below, this part is less interesting.

Ok.

> > Do you mean something like
> >
> > SELECT count(*) > 0 AS ok,
> >count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
> > FROM pg_hba_file_rules ;
> >
> > and similar for pg_ident_rule_mappings?
>
> Something like that, yes.

Ok, v5 attached without the TAP tests and updated sysviews tests.
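
For the archives, the matching check for the new view is along the same
lines (view name as introduced by the 0001 patch below):

    SELECT count(*) > 0 AS ok,
           count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
    FROM pg_ident_file_mappings;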
From 681c16db9fcb4e27ebba28e6a0f2134de071 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 21 Feb 2022 17:38:34 +0800
Subject: [PATCH v5 1/3] Add a pg_ident_file_mappings view.

This view is similar to the pg_hba_file_rules view, and can also be helpful
for diagnosing configuration problems.

A following commit will add the possibility to include files in pg_hba and
pg_ident configuration files, which will then make this view even more useful.

Catversion is bumped.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 doc/src/sgml/catalogs.sgml | 108 
 doc/src/sgml/client-auth.sgml  |  10 ++
 doc/src/sgml/func.sgml |   5 +-
 src/backend/catalog/system_views.sql   |   6 ++
 src/backend/libpq/hba.c|  31 +++---
 src/backend/utils/adt/hbafuncs.c   | 136 +
 src/include/catalog/pg_proc.dat|   7 ++
 src/include/libpq/hba.h|   1 +
 src/test/regress/expected/rules.out|   6 ++
 src/test/regress/expected/sysviews.out |  16 ++-
 src/test/regress/sql/sysviews.sql  |   6 +-
 11 files changed, 311 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 94f01e4099..75fedfa07e 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9591,6 +9591,11 @@ SCRAM-SHA-256$:&l
   summary of client authentication configuration file 
contents
  
 
+ 
+  pg_ident_file_mappings
+  summary of client user name mapping configuration file 
contents
+ 
+
  
   pg_indexes
   indexes
@@ -10589,6 +10594,109 @@ SCRAM-SHA-256$:&l
   
  
 
+ 
+  pg_ident_file_mappings
+
+  
+   pg_ident_file_mappings
+  
+
+  
+   The view pg_ident_file_mappings provides a summary
+   of the contents of the client user name mapping configuration file,
+   pg_ident.conf.
+   A row appears in this view for each
+   non-empty, non-comment line in the file, with annotations indicating
+   whether the rule could be applied successfully.
+  
+
+  
+   This view can be helpful for checking whether planned changes in the
+   authentication configuration file will work, or for diagnosing a previous
+   failure.  Note that this view reports on the current
+   contents of the file, not on what was last loaded by the server.
+  
+
+  
+   By default, the pg_ident_file_mappings view can be
+   read only by superusers.
+  
+
+  
+   pg_ident_file_mappings Columns 

+
+ 
+  
+   Column Type
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   line_number int4
+  
+  
+   Line number of this rule in pg_ident.conf
+  
+ 
+
+ 
+  
+   map_name text
+  
+  
+   Name of the map
+  
+ 
+
+ 
+  
+   sys_name text
+  
+  
+   Detected user name of the client
+  
+ 
+
+ 
+  
+   pg_username text
+  
+  
+   Requested PostgreSQL user name
+  
+ 
+
+ 
+  
+   error text
+  
+  
+   If not null, an error message indicating why this line could not be
+   processed
+  
+ 
+
+   
+  
+
+  
+   Usually, a row reflecting an incorrect entry will have values for only
+   the line_number and 
error fields.
+  
+
+  
+   See  for more information about
+   client authentication configuration.
+  
+ 
+
  
   pg_indexes
 
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 02f0489112..142b0affcb 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -896,6 +896,16 @@ mymap   /^(.*)@otherdomain\.com$   guest
-HUP) to make it re-read the file.
   
 
+  
+   The system view
+   pg_ident_file_mappings
+   can be helpful for pre-testing changes to the
+   pg_ident.conf file, or for diagnosing problems if
+   loading of the file did not have the desired effects.  Rows in the view with
+   non-null error fields indicate problems in the
+   corresponding lines of the file.
+  
+
   
A pg_ident.conf file that could be used in
conjunction with the pg_hba.conf file in SIGHUP signal to the postmaster
 process, which in turn sends SIGHUP to each
 of its children.) You can use the
-pg_

Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-28 Thread Kyotaro Horiguchi
At Mon, 28 Mar 2022 10:01:05 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Fri, 25 Mar 2022 13:26:05 +0100, Alvaro Herrera  
> wrote in 
> > Pushed this, backpatching to 14 and 13.  It would have been good to
> > backpatch further, but there's an (textually trivial) merge conflict
> > related to commit e6d8069522c8.  Because that commit conceptually
> > touches the same area that this bugfix is about, I'm not sure that
> > backpatching further without a lot more thought is wise -- particularly
> > so when there's no way to automate the test in branches older than
> > master.
> 
> Thanks for committing.
> 
> > This is quite annoying, considering that the bug was reported shortly
> > before 12 went into beta.
> 
> Sure.  I'm going to look into that.

This is a preparatory patch and tentative (yes, it's just tentative)
test. This is made for 12 but applies with some warnings to 10-11.

(Hope the attachments are attached as "attachment", not  "inline".)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 3d5b24691517c1aac4b49728abb122c66a4e33be Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 28 Mar 2022 16:29:04 +0900
Subject: [PATCH 1/2] Tentative test for tsp replay fix

---
 src/test/perl/PostgresNode.pm | 342 +-
 src/test/recovery/t/011_crash_recovery.pl | 108 ++-
 2 files changed, 447 insertions(+), 3 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 7b2ec29bb7..88fa08b61d 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -104,6 +104,8 @@ use TestLib ();
 use Time::HiRes qw(usleep);
 use Scalar::Util qw(blessed);
 
+my $windows_os = 0;
+
 our @EXPORT = qw(
   get_new_node
   get_free_port
@@ -323,6 +325,64 @@ sub archive_dir
 
 =pod
 
+=item $node->tablespace_storage([, nocreate])
+
Directory to store tablespace directories.
+If nocreate is true, returns undef if not yet created.
+
+=cut
+
+sub tablespace_storage
+{
+   my ($self, $nocreate) = @_;
+
+   if (!defined $self->{_tsproot})
+   {
+   # tablespace is not used, return undef if nocreate is specified.
+   return undef if ($nocreate);
+
+   # create and remember the tablespace root directory.
+   $self->{_tsproot} = TestLib::tempdir_short();
+   }
+
+   return $self->{_tsproot};
+}
+
+=pod
+
+=item $node->tablespaces()
+
+Returns a hash from tablespace OID to tablespace directory name.  For
+example, an oid 16384 pointing to /tmp/jWAhkT_fs0/ts1 is stored as
+$hash{16384} = "ts1".
+
+=cut
+
+sub tablespaces
+{
+   my ($self) = @_;
+   my $pg_tblspc = $self->data_dir . '/pg_tblspc';
+   my %ret;
+
+   # return undef if no tablespace is used
+   return undef if (!defined $self->tablespace_storage(1));
+
+   # collect tablespace entries in pg_tblspc directory
+   opendir(my $dir, $pg_tblspc);
+   while (my $oid = readdir($dir))
+   {
+   next if ($oid !~ /^([0-9]+)$/);
+   my $linkpath = "$pg_tblspc/$oid";
+   my $tsppath = dir_readlink($linkpath);
+   $ret{$oid} = File::Basename::basename($tsppath);
+   }
+   closedir($dir);
+
+   return %ret;
+}
+
+=pod
+
 =item $node->backup_dir()
 
 The output path for backups taken with $node->backup()
@@ -338,6 +398,77 @@ sub backup_dir
 
 =pod
 
+=item $node->backup_tablespace_storage_path(backup_name)
+
+Returns tablespace location path for backup_name.
+Returns the parent directory if backup_name is not given.
+
+=cut
+
+sub backup_tablespace_storage_path
+{
+   my ($self, $backup_name) = @_;
+   my $dir = $self->backup_dir . '/__tsps';
+
+   $dir .= "/$backup_name" if (defined $backup_name);
+
+   return $dir;
+}
+
+=pod
+
+=item $node->backup_create_tablespace_storage(backup_name)
+
+Create the tablespace location directory for backup_name if it does not
+exist yet.  Create the parent tablespace storage that holds all location
+directories if backup_name is not supplied.
+
+=cut
+
+sub backup_create_tablespace_storage
+{
+   my ($self, $backup_name) = @_;
+   my $dir = $self->backup_tablespace_storage_path($backup_name);
+
+   File::Path::make_path $dir if (! -d $dir);
+}
+
+=pod
+
+=item $node->backup_tablespaces(backup_name)
+
+Returns a reference to hash from tablespace OID to tablespace
+directory name of tablespace directory that the specified backup has.
+For example, an oid 16384 pointing to ../tsps/backup1/ts1 is stored as
+$hash{16384} = "ts1".
+
+=cut
+
+sub backup_tablespaces
+{
+   my ($self, $backup_name) = @_;
+   my $pg_tblspc = $self->backup_dir . '/' . $backup_name . '/pg_tblspc';
+   my %ret;
+
+   #return undef if this backup holds no tablespaces
+   return undef if (! -d 
$self->backup_tablespace_storage_path($backup_name));
+
+   # scan pg_tblspc directory of the backup
+   opendir(my $dir, $pg_tblspc);
+   while (my $oid = readdir($dir))
+ 

Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-28 Thread Michael Paquier
On Mon, Mar 28, 2022 at 03:43:41PM +0800, Julien Rouhaud wrote:
> On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
>> We could use a failure path for each psql command rather than a SKIP
>> block, as you told me, if the psql fails and check that we get some
>> error strings related to the loading of auth files.  However, I am
>> scared of this design in the long-term as it could cause the tests to
>> pass with a failure triggered on platforms and/or configurations where
>> we should have a success.  So, I am tempted to drop the ball for now
>> with the TAP part.
> 
> Ok.  We could still keep the tests for the valid lines part though?

With the SQLs modified as below, this part is less interesting.

>> The patch still has value for the end-user.  I have checked the
>> backend part, and I did not notice any obvious issue.  There is one
>> thing that I am wondering though: should we change the two queries in
>> sysviews.sql so as we check that there are zero errors in the two
>> views when the files are parsed?  This simple change would avoid
>> mistakes for users running installcheck on a production installation.
> 
> Do you mean something like
> 
> SELECT count(*) > 0 AS ok,
>count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
> FROM pg_hba_file_rules ;
> 
> and similar for pg_ident_rule_mappings?

Something like that, yes.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-28 Thread Alvaro Herrera
Hello,

On 2022-Mar-27, Tatsuo Ishii wrote:

> After:
> interval_start num_transactions sum_latency sum_latency_2 min_latency 
> max_latency
>   { failures | serialization_failures deadlock_failures } [ sum_lag sum_lag_2 
> min_lag max_lag [ skipped ] ] [ retried retries ]

You're showing an indentation, but looking at the HTML output there is
no such.  Is the HTML processor eating leading whitespace or something
like that?

I think that the explanatory paragraph is way too long now, particularly
since it explains --failures-detailed starting in the middle.  Also, the
example output doesn't include the failures-detailed mode.  I suggest
that this should be broken down even more; first to explain the output
without failures-detailed, including an example, and then the output
with failures-detailed, and an example of that.  Something like this,
perhaps:

Aggregated Logging
With the --aggregate-interval option, a different format is used for the log 
files (note that the actual log line is not folded).

  interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency
  failures [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

where interval_start is the start of the interval (as a Unix epoch time stamp), 
num_transactions is the number of transactions within the interval, sum_latency 
is the sum of the transaction latencies within the interval, sum_latency_2 is 
the sum of squares of the transaction latencies within the interval, 
min_latency is the minimum latency within the interval, max_latency is the
maximum latency within the interval, and failures is the number of
transactions that ended with a failed SQL command within the interval.

The next fields, sum_lag, sum_lag_2, min_lag, and max_lag, are only present if 
the --rate option is used. They provide statistics about the time each 
transaction had to wait for the previous one to finish, i.e., the difference 
between each transaction's scheduled start time and the time it actually 
started. The next field, skipped, is only present if the --latency-limit option 
is used, too. It counts the number of transactions skipped because they would 
have started too late. The retried and retries fields are present only if the 
--max-tries option is not equal to 1. They report the number of retried 
transactions and the sum of all retries after serialization or deadlock errors 
within the interval. Each transaction is counted in the interval when it was 
committed.

Notice that while the plain (unaggregated) log file shows which script was used 
for each transaction, the aggregated log does not. Therefore if you need 
per-script data, you need to aggregate the data on your own.

Here is some example output:

1345828501 5601 1542744 483552416 61 2573 0
1345828503 7884 1979812 565806736 60 1479 0
1345828505 7208 1979422 567277552 59 1391 0
1345828507 7685 1980268 569784714 60 1398 0
1345828509 7073 1979779 573489941 236 1411 0

If you use the --failures-detailed option, instead of a single count of all
failed transactions you will get more detailed statistics for the failed
transactions:

  interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency
  serialization_failures deadlock_failures [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

This is similar to the above, but here the single 'failures' figure is replaced
by serialization_failures, the number of transactions that got a serialization
error and were not retried afterwards, and deadlock_failures, the number of
transactions that got a deadlock error and were not retried afterwards.  The
other fields are as above.  Here is some example output:

[example with detailed failures]
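
As an aside on the per-script remark above: if you do need per-script
statistics from the plain (unaggregated) log, here is a minimal SQL sketch,
assuming the default six-column per-transaction log format (no --rate or
--latency-limit, so no extra columns) and a hypothetical log file path:

  -- load the raw pgbench log: client_id transaction_no time script_no time_epoch time_us
  CREATE TABLE pgbench_log (client_id int, transaction_no int, time bigint,
                            script_no int, time_epoch bigint, time_us int);
  COPY pgbench_log FROM '/tmp/pgbench_log.12345' (FORMAT text, DELIMITER ' ');

  -- per-script transaction counts and latency statistics
  SELECT script_no, count(*) AS num_transactions, sum(time) AS sum_latency,
         min(time) AS min_latency, max(time) AS max_latency
    FROM pgbench_log
   GROUP BY script_no
   ORDER BY script_no;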

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it."   (New York Times, about Microsoft PowerPoint)




Re: DSA failed to allocate memory

2022-03-28 Thread Thomas Munro
On Mon, Mar 28, 2022 at 8:14 PM Dongming Liu  wrote:
> On Fri, Mar 18, 2022 at 3:30 PM Dongming Liu  wrote:
>> I'm trying to move segments into appropriate bins in dsa_free().
>> In 0001-Re-bin-segment-when-dsa-memory-is-freed.patch, I extracted
>> the re-bin logic into a separate function called rebin_segment, and
>> call it to move the segment to the appropriate bin when DSA memory is
>> freed.  Otherwise, when allocating memory, a suitable segment may not
>> be found because a segment with enough contiguous free pages is
>> sitting in a smaller bin.
>>
>> For testing, I ported the test_dsa patch from [1] and added an OOM
>> case: allocate memory until OOM, free it all, then allocate again,
>> comparing the amount of memory allocated before and after.

Hi Dongming,

Thanks for the report, and for working on the fix.  Can you please
create a commitfest entry (if you haven't already)?  I plan to look at
this soon, after the code freeze.

Are you proposing that the test_dsa module should be added to the
tree?  If so, some trivial observations: "#ifndef
HAVE_INT64_TIMESTAMP" isn't needed anymore (see commit b6aa17e0, which
is in all supported branches), the year should be updated, and we use
size_t instead of Size in new code.




Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-28 Thread Julien Rouhaud
Hi,

On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
>
> > Note that those tests fail on Windows (and I'm assuming on EXEC_BACKEND
> > builds), as they're testing invalid files which by definition prevent any
> > further connection attempt.  I'm not sure what would be best to do here,
> > apart from bypassing the invalid config tests on such platforms.  I don't
> > think that validating connection attempts on such platforms with an
> > invalid pg_hba/pg_ident file brings anything.
>
> Hmm, indeed.  I have been looking at that and that's annoying.  As you
> mentioned off-list, in order to know if a build has an emulation of
> fork() that would avoid failures with new connection attempts after
> including some dummy entries in pg_hba.conf or pg_ident.conf, we'd
> need to detect EXEC_BACKEND (well, mostly), through one of:
> - CFLAGS in pg_config, or a query on pg_config() (does not work with
> MSVC, as CFLAGS is not set there).
> - A potential check on pg_config{_manual}.h.
> - Perhaps an extra dependency on $windows_os, which would incorrectly
> discard cygwin, where this should be able to work.

Yeah, detecting !cygwin-Windows or EXEC_BACKEND in the TAP tests is quite hard.

> We could use a failure path for each psql command rather than a SKIP
> block, as you told me: make each psql fail and check that we get some
> error strings related to the loading of auth files.  However, I am
> scared of this design in the long term, as it could cause the tests to
> pass with a failure triggered on platforms and/or configurations where
> we should have a success.  So, I am tempted to drop the TAP part for
> now.

Ok.  We could still keep the tests for the valid lines part though?

> The patch still has value for the end-user.  I have checked the
> backend part, and I did not notice any obvious issue.  There is one
> thing that I am wondering though: should we change the two queries in
> sysviews.sql so that we check that there are zero errors in the two
> views when the files are parsed?  This simple change would avoid
> mistakes for users running installcheck on a production installation.

Do you mean something like

SELECT count(*) > 0 AS ok,
   count(*) FILTER (WHERE error IS NOT NULL) = 0 AS has_no_error
FROM pg_hba_file_rules ;

and similar for pg_ident_rule_mappings?




Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-28 Thread Michael Paquier
On Mon, Mar 28, 2022 at 04:20:07PM +0900, Michael Paquier wrote:
> See the attached, for reference, but it would fail with EXEC_BACKEND
> on WIN32.

Ditto.
--
Michael
From 69e02734fd0199ba02cc34bc468b04584bdf0efd Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 28 Mar 2022 16:20:40 +0900
Subject: [PATCH v5] Add a pg_ident_file_mappings view.

This view is similar to the pg_hba_file_rules view, and can also be helpful
for diagnosing configuration problems.

A follow-up commit will add the possibility to include files in pg_hba and
pg_ident configuration files, which will make this view even more useful.

Catversion is bumped.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 src/include/catalog/pg_proc.dat |   6 +
 src/include/libpq/hba.h |   1 +
 src/backend/catalog/system_views.sql|   6 +
 src/backend/libpq/hba.c |  31 +++--
 src/backend/utils/adt/hbafuncs.c| 136 
 src/test/authentication/t/003_auth_views.pl | 108 
 src/test/regress/expected/rules.out |   6 +
 src/test/regress/expected/sysviews.out  |   6 +
 src/test/regress/sql/sysviews.sql   |   2 +
 doc/src/sgml/catalogs.sgml  | 108 
 doc/src/sgml/client-auth.sgml   |  10 ++
 doc/src/sgml/func.sgml  |   5 +-
 12 files changed, 409 insertions(+), 16 deletions(-)
 create mode 100644 src/test/authentication/t/003_auth_views.pl

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5e612a6b67..915bc19176 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6115,6 +6115,12 @@
   proargmodes => '{o,o,o,o,o,o,o,o,o}',
   proargnames => '{line_number,type,database,user_name,address,netmask,auth_method,options,error}',
   prosrc => 'pg_hba_file_rules' },
+{ oid => '9556', descr => 'show pg_ident.conf mappings',
+  proname => 'pg_ident_file_mappings', prorows => '1000', proretset => 't',
+  provolatile => 'v', prorettype => 'record', proargtypes => '',
+  proallargtypes => '{int4,text,text,text,text}', proargmodes => '{o,o,o,o,o}',
+  proargnames => '{line_number,map_name,sys_name,pg_username,error}',
+  prosrc => 'pg_ident_file_mappings' },
 { oid => '1371', descr => 'view system lock information',
   proname => 'pg_lock_status', prorows => '1000', proretset => 't',
   provolatile => 'v', prorettype => 'record', proargtypes => '',
diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h
index 13ecb329f8..90036f7bcd 100644
--- a/src/include/libpq/hba.h
+++ b/src/include/libpq/hba.h
@@ -171,6 +171,7 @@ extern int	check_usermap(const char *usermap_name,
 		  const char *pg_role, const char *auth_user,
 		  bool case_sensitive);
 extern HbaLine *parse_hba_line(TokenizedAuthLine *tok_line, int elevel);
+extern IdentLine *parse_ident_line(TokenizedAuthLine *tok_line, int elevel);
 extern bool pg_isblank(const char c);
 extern MemoryContext tokenize_auth_file(const char *filename, FILE *file,
 		List **tok_lines, int elevel);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9570a53e7b..9eaa51df29 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -617,6 +617,12 @@ CREATE VIEW pg_hba_file_rules AS
 REVOKE ALL ON pg_hba_file_rules FROM PUBLIC;
 REVOKE EXECUTE ON FUNCTION pg_hba_file_rules() FROM PUBLIC;
 
+CREATE VIEW pg_ident_file_mappings AS
+   SELECT * FROM pg_ident_file_mappings() AS A;
+
+REVOKE ALL ON pg_ident_file_mappings FROM PUBLIC;
+REVOKE EXECUTE ON FUNCTION pg_ident_file_mappings() FROM PUBLIC;
+
 CREATE VIEW pg_timezone_abbrevs AS
 SELECT * FROM pg_timezone_abbrevs();
 
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 673135144d..f8393ca8ed 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -887,25 +887,22 @@ do { \
 } while (0)
 
 /*
- * Macros for handling pg_ident problems.
- * Much as above, but currently the message level is hardwired as LOG
- * and there is no provision for an err_msg string.
+ * Macros for handling pg_ident problems, similar as above.
  *
  * IDENT_FIELD_ABSENT:
- * Log a message and exit the function if the given ident field ListCell is
- * not populated.
+ * Reports when the given ident field ListCell is not populated.
  *
  * IDENT_MULTI_VALUE:
- * Log a message and exit the function if the given ident token List has more
- * than one element.
+ * Reports when the given ident token List has more than one element.
  */
 #define IDENT_FIELD_ABSENT(field) \
 do { \
 	if (!field) { \
-		ereport(LOG, \
+		ereport(elevel, \
 (errcode(ERRCODE_CONFIG_FILE_ERROR), \
  errmsg("missing entry in file \"%s\" at end of line %d", \
 		IdentFileName, line_num))); \
+		*err_msg = psprintf("missing entry at end of line"); \
 		return NULL; \
 

Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-28 Thread Michael Paquier
On Sun, Mar 27, 2022 at 05:52:22PM +0800, Julien Rouhaud wrote:
> I didn't like the various suggestions, as they would mean scattering the tests
> all over the place.  The whole point of those views is indeed to check the
> current content of a file without applying the configuration change (not on
> Windows or EXEC_BACKEND, but there's nothing we can do there), so let's go
> this way.  I added a naive src/test/authentication/003_hba_ident_views.pl test
> that validates that specific new valid and invalid lines in both files are
> correctly reported.  Note that I didn't update those tests for the file
> inclusion part.

I see.  The tests ought to be more extensive though, as hba.c checks
for multiple or missing fields for a varying number of expected
parameters.  Here are the patterns that would cover most of the
backend's failure cases.  For hba.conf:
+host
+host incorrect_db
+host incorrect_db incorrect_user
+host incorrect_db incorrect_user incorrect_host

For pg_ident.conf:
+# Error with incomplete lines.
+incorrect_map
+incorrect_map os_user
+# Errors with lines that have multiple values, for each field.
+incorrect_map_1,incorrect_map_2
+incorrect_map os_user_1,os_user_2
+incorrect_map os_user pg_role_1,pg_role_2

Then I was thinking about doing a full scan of each view and counting the
expected errors with some GROUP BY magic.  See the attached, for
reference, but it would fail with EXEC_BACKEND on WIN32.
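
The kind of query meant here would be something like this (a sketch only;
the query in the attached test may differ in detail):

SELECT error IS NULL AS valid, count(*)
FROM pg_hba_file_rules
GROUP BY error IS NULL
ORDER BY 1;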

> Note that those tests fail on Windows (and I'm assuming on EXEC_BACKEND
> builds), as they're testing invalid files which by definition prevent any
> further connection attempt.  I'm not sure what would be best to do here,
> apart from bypassing the invalid config tests on such platforms.  I don't
> think that validating connection attempts on such platforms with an
> invalid pg_hba/pg_ident file brings anything.

Hmm, indeed.  I have been looking at that and that's annoying.  As you
mentioned off-list, in order to know if a build has an emulation of
fork() that would avoid failures with new connection attempts after
including some dummy entries in pg_hba.conf or pg_ident.conf, we'd
need to detect EXEC_BACKEND (well, mostly), through one of:
- CFLAGS in pg_config, or a query on pg_config() (does not work with
MSVC, as CFLAGS is not set there).
- A potential check on pg_config{_manual}.h.
- Perhaps an extra dependency on $windows_os, which would incorrectly
discard cygwin, where this should be able to work.

We could use a failure path for each psql command rather than a SKIP
block, as you told me: make each psql fail and check that we get some
error strings related to the loading of auth files.  However, I am
scared of this design in the long term, as it could cause the tests to
pass with a failure triggered on platforms and/or configurations where
we should have a success.  So, I am tempted to drop the TAP part for
now.

The patch still has value for the end-user.  I have checked the
backend part, and I did not notice any obvious issue.  There is one
thing that I am wondering though: should we change the two queries in
sysviews.sql so that we check that there are zero errors in the two
views when the files are parsed?  This simple change would avoid
mistakes for users running installcheck on a production installation.
--
Michael




Re: DSA failed to allocate memory

2022-03-28 Thread Dongming Liu
On Fri, Mar 18, 2022 at 3:30 PM Dongming Liu  wrote:

>> So it's OK for a segment to be in a bin that suggests that it has more
>> consecutive free pages than it really does. But it's NOT ok for a
>> segment to be in a bin that suggests it has fewer consecutive pages
>> than it really does. If dsa_free() is putting things back into the
>> wrong place, that's what we need to fix.
>
>
> I'm trying to move segments into appropriate bins in dsa_free().
> In 0001-Re-bin-segment-when-dsa-memory-is-freed.patch, I extracted
> the re-bin logic into a separate function called rebin_segment, and
> call it to move the segment to the appropriate bin when DSA memory is
> freed.  Otherwise, when allocating memory, a suitable segment may not
> be found because a segment with enough contiguous free pages is
> sitting in a smaller bin.
>
> For testing, I ported the test_dsa patch from [1] and added an OOM
> case: allocate memory until OOM, free it all, then allocate again,
> comparing the amount of memory allocated before and after.
>
> Any thoughts?
>
> [1]
> https://www.postgresql.org/message-id/CAEepm%3D3U7%2BRo7%3DECeQuAZoeFXs8iDVX56NXGCV7z3%3D%2BH%2BWd0Sw%40mail.gmail.com
>
>
Fixed rebin_segment not working on in-place DSA.

-- 
Best Regards,
Dongming


0001-Re-bin-segment-when-dsa-memory-is-freed.patch
Description: Binary data


0002-port-test_dsa.patch
Description: Binary data


Re: logical replication empty transactions

2022-03-28 Thread Amit Kapila
On Fri, Mar 25, 2022 at 12:50 PM houzj.f...@fujitsu.com
 wrote:
>
> Attached is the new version patch with this change.
>

Few comments:
=
1. I think we can move the keep_alive check after the tracklag record
check to keep it consistent with another patch [1].
2. Add the comment about the new parameter skipped_xact atop
WalSndUpdateProgress.
3. I think we need to call pq_flush_if_writable after sending a
keepalive message to avoid delaying sync transactions.

[1]: 
https://www.postgresql.org/message-id/OS3PR01MB6275C64F264662E84D2FB7AE9E1D9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.



