Re: Improving scalability of Parallel Bitmap Heap/Index Scan

2022-07-15 Thread John Naylor
On Thu, Jul 14, 2022 at 5:13 PM David Geier  wrote:
> optimizing the PagetableEntry data structure for size and using a faster
> sorting algorithm like e.g. radix sort

On this note, there has been a proposed (but as far as I know untested)
patch to speed up this sort in a much simpler way, in this thread

https://www.postgresql.org/message-id/CA%2BhUKGKztHEWm676csTFjYzortziWmOcf8HDss2Zr0muZ2xfEg%40mail.gmail.com

where you may find this patch

https://www.postgresql.org/message-id/attachment/120560/0007-Specialize-pagetable-sort-routines-in-tidbitmap.c.patch

and see if it helps.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: EINTR in ftruncate()

2022-07-15 Thread Thomas Munro
On Sat, Jul 16, 2022 at 1:28 AM Tom Lane  wrote:
> Thomas Munro  writes:
> > On Fri, Jul 15, 2022 at 9:34 AM Tom Lane  wrote:
> >> (Someday we oughta go ahead and make our Windows signal API look more
> >> like POSIX, as I suggested back in 2015.  I'm still not taking
> >> point on that, though.)
>
> > For the sigprocmask() part, here's a patch that passes CI.  Only the
> > SIG_SETMASK case is actually exercised by our current code, though.
>
> Passes an eyeball check, but I can't actually test it.

Thanks.  Pushed.

I'm not brave enough to try to write a replacement sigaction() yet,
but it does appear that we could rip out more ugliness and inconsistencies
that way, e.g. sa_mask.




Re: Handle infinite recursion in logical replication setup

2022-07-15 Thread Dilip Kumar
On Sat, Jul 16, 2022 at 9:05 AM vignesh C  wrote:
>
> On Thu, Jul 14, 2022 at 6:42 PM Dilip Kumar  wrote:
> >
> > On Thu, 14 Jul 2022 at 6:34 PM, vignesh C  wrote:
> >>
> >> On Thu, Jul 14, 2022 at 11:26 AM Dilip Kumar  wrote:
> >> >
> >> > On Wed, Jul 13, 2022 at 4:49 PM Dilip Kumar  
> >> > wrote:
> >> > >
> >> > > On Tue, Jul 12, 2022 at 2:58 PM vignesh C  wrote:
> >> > > >
> >> > > > On Tue, Jul 12, 2022 at 9:51 AM Amit Kapila 
> >> > > >  wrote:
> >> > >
> >> > > I find one thing confusing about this patch.  Basically, this has two
> >> > > options, 'local' and 'any', so I would assume that all the local server
> >> > > changes should be covered under the 'local' but now if we set some
> >> > > origin using 'select pg_replication_origin_session_setup('aa');' then
> >> > > changes from that session will be ignored because it has an origin id.
> >> > > I think actually the name is creating confusion, because by local it
> >> > > seems like a change which originated locally and the document is also
> >> > > specifying the same.
> >> > >
> >> > > +   If local, the subscription will request the 
> >> > > publisher
> >> > > +   to only send changes that originated locally. If 
> >> > > any,
> >> > >
> >> > > I think if we want to keep the option local then we should look up all
> >> > > the origin in the replication origin catalog and identify whether it
> >> > > is a local origin id or remote origin id and based on that filter out
> >> > > the changes.
> >> >
> >> > On the other hand if we are interested in receiving the changes which
> >> > are generated without any origin then I think we should change 'local'
> >> > to 'none' and then in future we can provide a new option which can
> >> > send the changes generated by all the local origin?  I think other
> >> > than this the patch LGTM.
> >>
> >> Thanks for the comment. The attached v33 patch has the changes to
> >> specify origin as 'none' instead of 'local' which will not publish the
> >> data having any origin.
> >
> >
> > I think ‘none’ might have a problem from an extensibility point of view.
> > What if in the future we support actual origin names; then what would
> > ‘none’ mean, no origin or an origin named ‘none’?  Should we just give the
> > origin an empty name ‘’?  Or is there some other issue?
>
> Currently there is no restriction in the name we can specify for
> origin, ex any, none, local, etc all are allowed as origin name. How
> about extending it with another parameter "origin_name" when we
> support filtering of a particular origin like:
> 1) origin = name, origin_name = 'orig1' -- Means that the publisher
> will filter the changes having origin name as 'orig1' and send the
> other changes.
> 2) origin = any -- Means that the publisher sends all changes
> regardless of their origin.
> 3) origin = none -- Means that the subscription will request the
> publisher to only send changes that have no origin associated.
> When we need to specify filtering of a particular origin name we will
> have to use both origin and origin_name option, like origin = name,
> origin_name = 'orig1' as in the first example.
>
> I'm not sure if there is a simpler way to do this with only a single
> option as both 'none' and 'any' can be specified as the origin name.
> Thoughts?

I think giving two options would be really confusing from the
usability perspective.  What we should be doing here is to keep these
three names 'none', 'any' and 'local' as reserved names for replication
origins, so that they are not allowed to be set by the user and retain
their internal meaning.  I don't think this is going to create much
trouble for anyone, because those are not really the names someone
wants to use for their replication origin.  So in
pg_replication_origin_create() we can also check for the reserved
names.
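
Just to illustrate the idea, the check could look roughly like this (only a
sketch of the concept; the function name, its placement and the error details
are made up here, not taken from the patch):

    /* Hypothetical sketch: treat a few origin names as reserved. */
    static bool
    IsReservedOriginName(const char *name)
    {
        return strcmp(name, "none") == 0 ||
               strcmp(name, "any") == 0 ||
               strcmp(name, "local") == 0;
    }

    /* ... then, in the path behind pg_replication_origin_create() ... */
    if (IsReservedOriginName(originname))
        ereport(ERROR,
                (errcode(ERRCODE_RESERVED_NAME),
                 errmsg("replication origin name \"%s\" is reserved",
                        originname)));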

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Freeing sortgroupatts in use_physical_tlist

2022-07-15 Thread Zhihong Yu
On Fri, Jul 15, 2022 at 8:33 PM Tom Lane  wrote:

> Zhihong Yu  writes:
> > I was looking at the code in use_physical_tlist().
> > In the code block checking CP_LABEL_TLIST, I noticed that
> > the Bitmapset sortgroupatts is not freed before returning from the
> > method.
> > Looking at create_foreignscan_plan() (in the same file):
> > bms_free(attrs_used);
> > It seems the intermediate Bitmapset is freed before returning.
>
> TBH, I'd say that it's probably the former code not the latter
> that is good practice.  Retail pfree's in code that's not in a
> loop very possibly expend more cycles than they are worth, because
> the space will get cleaned up anyway when the active memory
> context is reset, and pfree is not as cheap as one might wish.
> It might be possible to make a case for one method over the other
> with some study of the particular situation, but you can't say
> a priori which way is better.
>
> On the whole, I would not bother changing either of these bits
> of code without some clear evidence that it matters.  It likely
> doesn't.  It's even more likely that it doesn't matter enough
> to be worth investigating.
>
> regards, tom lane
>
Hi, Tom:
Thanks for responding over the weekend.

I will try to remember what you said.

Cheers


Re: Handle infinite recursion in logical replication setup

2022-07-15 Thread vignesh C
On Thu, Jul 14, 2022 at 6:42 PM Dilip Kumar  wrote:
>
> On Thu, 14 Jul 2022 at 6:34 PM, vignesh C  wrote:
>>
>> On Thu, Jul 14, 2022 at 11:26 AM Dilip Kumar  wrote:
>> >
>> > On Wed, Jul 13, 2022 at 4:49 PM Dilip Kumar  wrote:
>> > >
>> > > On Tue, Jul 12, 2022 at 2:58 PM vignesh C  wrote:
>> > > >
>> > > > On Tue, Jul 12, 2022 at 9:51 AM Amit Kapila  
>> > > > wrote:
>> > >
>> > > I find one thing confusing about this patch.  Basically, this has two
>> > > options, 'local' and 'any', so I would assume that all the local server
>> > > changes should be covered under the 'local' but now if we set some
>> > > origin using 'select pg_replication_origin_session_setup('aa');' then
>> > > changes from that session will be ignored because it has an origin id.
>> > > I think actually the name is creating confusion, because by local it
>> > > seems like a change which originated locally and the document is also
>> > > specifying the same.
>> > >
>> > > +   If local, the subscription will request the 
>> > > publisher
>> > > +   to only send changes that originated locally. If 
>> > > any,
>> > >
>> > > I think if we want to keep the option local then we should look up all
>> > > the origin in the replication origin catalog and identify whether it
>> > > is a local origin id or remote origin id and based on that filter out
>> > > the changes.
>> >
>> > On the other hand if we are interested in receiving the changes which
>> > are generated without any origin then I think we should change 'local'
>> > to 'none' and then in future we can provide a new option which can
>> > send the changes generated by all the local origin?  I think other
>> > than this the patch LGTM.
>>
>> Thanks for the comment. The attached v33 patch has the changes to
>> specify origin as 'none' instead of 'local' which will not publish the
>> data having any origin.
>
>
> I think ‘none’ might have a problem from an extensibility point of view.
> What if in the future we support actual origin names; then what would
> ‘none’ mean, no origin or an origin named ‘none’?  Should we just give the
> origin an empty name ‘’?  Or is there some other issue?

Currently there is no restriction on the name we can specify for an
origin; e.g. any, none, local, etc. are all allowed as origin names.
How about extending it with another parameter "origin_name" when we
support filtering of a particular origin, like:
1) origin = name, origin_name = 'orig1' -- means that the publisher
will filter out the changes having origin name 'orig1' and send the
other changes.
2) origin = any -- means that the publisher sends all changes
regardless of their origin.
3) origin = none -- means that the subscription will request the
publisher to only send changes that have no origin associated.
When we need to specify filtering of a particular origin name, we will
have to use both the origin and origin_name options, like origin =
name, origin_name = 'orig1' as in the first example.

I'm not sure if there is a simpler way to do this with only a single
option, as both 'none' and 'any' can be specified as origin names.
Thoughts?

Regards,
Vignesh




Re: Freeing sortgroupatts in use_physical_tlist

2022-07-15 Thread Tom Lane
Zhihong Yu  writes:
> I was looking at the code in use_physical_tlist().
> In the code block checking CP_LABEL_TLIST, I noticed that
> the Bitmapset sortgroupatts is not freed before returning from the method.
> Looking at create_foreignscan_plan() (in the same file):
> bms_free(attrs_used);
> It seems the intermediate Bitmapset is freed before returning.

TBH, I'd say that it's probably the former code not the latter
that is good practice.  Retail pfree's in code that's not in a
loop very possibly expend more cycles than they are worth, because
the space will get cleaned up anyway when the active memory
context is reset, and pfree is not as cheap as one might wish.
It might be possible to make a case for one method over the other
with some study of the particular situation, but you can't say
a priori which way is better.
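
To illustrate, the usual pattern is roughly like this (a made-up example with
a throwaway context; the planner code simply relies on whatever context it is
already running in):

    MemoryContext tmpctx = AllocSetContextCreate(CurrentMemoryContext,
                                                 "demo context",
                                                 ALLOCSET_DEFAULT_SIZES);
    MemoryContext oldctx = MemoryContextSwitchTo(tmpctx);
    Bitmapset  *atts = NULL;

    atts = bms_add_member(atts, 1);
    atts = bms_add_member(atts, 3);

    /* ... use atts ... */

    MemoryContextSwitchTo(oldctx);

    /* No bms_free(atts) needed: deleting (or resetting) the context reclaims
     * everything allocated in it, including the Bitmapset. */
    MemoryContextDelete(tmpctx);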

On the whole, I would not bother changing either of these bits
of code without some clear evidence that it matters.  It likely
doesn't.  It's even more likely that it doesn't matter enough
to be worth investigating.

regards, tom lane




Re: Commitfest Update

2022-07-15 Thread Michael Paquier
On Fri, Jul 15, 2022 at 05:23:48PM -0700, Jacob Champion wrote:
> I agree in principle -- I think, ideally, WoA patches should be
> procedurally closed at the end of a commitfest, and carried forward when
> the author has actually responded. The problems I can imagine resulting
> from this are
> 
> - Some reviewers mark WoA _immediately_ upon sending an email. I think
> authors should have a small grace period to respond before having their
> patches automatically "muted" by the system, if the review happens right
> at the end of the CF.

On this point, I'd like to think that a window of two weeks is the right
balance.  That's half of the commit fest, so that leaves plenty of
time for one to answer.  There is always the case where one is on
vacation for a period longer than that, but it is also possible for
an author to add a new entry in a future CF for the same patch.  If I
recall correctly, it should be possible to move a patch that has been
closed even if the past CF has been marked as concluded, allowing
one to keep the same patch with its history and annotations.
--
Michael




Freeing sortgroupatts in use_physical_tlist

2022-07-15 Thread Zhihong Yu
Hi,
I was looking at the code in use_physical_tlist().

In the code block checking CP_LABEL_TLIST, I noticed that
the Bitmapset sortgroupatts is not freed before returning from the method.

Looking at create_foreignscan_plan() (in the same file):

bms_free(attrs_used);

It seems the intermediate Bitmapset is freed before returning.

I would appreciate review comments for the proposed patch.

Thanks


free-sort-group-atts.patch
Description: Binary data


Re: Commitfest Update

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 05:23:48PM -0700, Jacob Champion wrote:
> This is important stuff to discuss, for sure, but I also want to revisit
> the thing I put on pause, which is to clear out old Reviewer entries to
> make it easier for new reviewers to find work to do. If we're not using
> Reviewers as a historical record, is there any reason for me not to keep
> clearing that out?

> It undoes work that you and others have done to make the historical
> record more accurate, and I think that's understandably frustrating. But
> I thought we were trying to move away from that usage of it altogether.

I don't agree that I'm using it "for historical record".  See 3499.

There's certainly some value in updating the cfapp to be "more accurate" for
some definition.  By chance, I saw the "activity log".
https://commitfest.postgresql.org/activity/

Honestly, most of the changes seem to be for the worse (16 patches had the
reviewer field nullified).  Incomplete list of changes:

3609 - removed Nathan
3561 - removed Michael
3046 - removed PeterE
3396 - removed Tom
3396 - removed Robert and Bharath
2710 - removed Julien
3612 - removed Nathan (added by Greg)
3295 - removed Andrew
2573 - removed Daniel
3623 - removed Hou Zhijie
3260 - removed Fabien
3041 - removed Masahiko
2161 - removed Michael

I'm not suggesting to give the community regulars special treatment, but you
could reasonably assume that when they added themselves and then "didn't remove
themself", it was on purpose and not by omission.  I think most of those people
would be surprised to learn that they're no longer considered to be reviewing
the patch.

> If someone put a lot of review into a patchset a few months ago, they
> absolutely deserve credit. But if that entry has been sitting with no
> feedback this month, why is it useful to keep that Reviewer around?

I don't know what to say to this.

Why do you think it's useful to remove annotations that people added?  (And, if
it were useful, why shouldn't that be implemented in the cfapp, which already
has all the needed information?)

Can you give an example of a patch where you sent a significant review, and
added yourself as a reviewer, but wouldn't mind if someone summarily removed
you, in batch?  It seems impolite to remove someone who is, in fact, a
reviewer.

The stated goal was to avoid the scenario where a would-be reviewer decides not
to review a patch because the cfapp already shows someone else as a reviewer.
I'm sure that could happen, but I doubt it's something that happens frequently.

-- 
Justin




Re: Backup command and functions can cause assertion failure and segmentation fault

2022-07-15 Thread Michael Paquier
On Fri, Jul 15, 2022 at 04:46:32PM +0900, Fujii Masao wrote:
> On 2022/07/14 17:00, Michael Paquier wrote:
>> and it is possible to rely on
>> pg_stat_activity.wait_event to be BaseBackupThrottle, which would make
> 
> ISTM that you can also use pg_stat_progress_basebackup.phase.

Indeed, as of "streaming database files".  That should work.

> Thanks for the patch! But I'm still not sure if it's worth adding
> only this test for the corner case while we don't have basic tests
> for BASE_BACKUP, pg_backup_start and pg_backup_stop.
> 
> BTW, if we decide to add that test, are you planning to back-patch it?

I was thinking about doing that only on HEAD.  One interesting thing
about this patch is that it can also be used as a point of reference
for other future things.

> This sounds fine to me, too. On the other hand, it's also fine for
> me to push the changes separately so that we can easily identify
> each change later. So I separated the patch into two.
> 
> Since one of them failed to be applied to v14 or before cleanly, I
> also created the patch for those back branches. So I attached three
> patches. 

Fine by me.
--
Michael




Re: Commitfest Update

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 05:23:48PM -0700, Jacob Champion wrote:
> On 7/15/22 16:15, Justin Pryzby wrote:
> > On Fri, Jul 15, 2022 at 03:17:49PM -0700, Jacob Champion wrote:
> >> Also, I would like to see us fold cfbot output into the official CF,
> >> rather than do the opposite.
> > 
> > That's been the plan for years :)
> 
> Is there something other than lack of round tuits that's blocking
> progress? I'm happy to donate more time in this area, but I don't know
> if my first patch proposal was helpful (or even on the right list --
> pgsql-www, right?).

cfbot is Thomas's project, so moving it to run on a postgres VM was one step, but I
imagine the "integration with cfapp" part requires coordination with Magnus.

What patch?

> > Similarly, patches could be summarily set to "waiting on author" if they 
> > didn't
> > recently apply, compile, and pass tests.  That's the minimum standard.
> > However, I think it's better not to do this immediately after the patch 
> > stops
> > applying/compiling/failing tests, since it's usually easy enough to review 
> > it.
> 
> It's hard to argue with that, but without automation, this is plenty of
> busy work too.

I don't think that's busywork, since it's understood to require human
judgement, like 1) dealing with false-positive test failures; 2) checking if
there's actually anything left for the author to do; 3) checking if it passed
tests recently; and 4) evaluating existing opinions in the thread and making a
judgement call.

> > I didn't know until recently that when a CF entry is closed, that it's 
> > possible
> > (I think) for the author themselves to reopen it and "move it to the next 
> > CF".
> > I suggest to point this out to people; I suppose I'm not the only one who 
> > finds
> > it offputting when a patch is closed in batch at the end of the month after
> > getting only insignificant review.
> 
> I think this may have been the goal but I don't think it actually works
> in practice. The app refuses to let you carry a RwF patch to a new CF.

I was able to do what Peter said.  I don't know why the cfapp has that
restriction; it seems like an artificial constraint.

https://www.postgresql.org/message-id/8498f959-e7a5-b0ec-7761-26984e581a51%40enterprisedb.com
https://commitfest.postgresql.org/32/2888/

-- 
Justin




Re: [Commitfest 2022-07] Begins Now

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 17:28:06 -0700, Jacob Champion wrote:
> On 7/15/22 16:51, Andres Freund wrote:
> > I'd make it dependent on whether there have been previous rounds of feedback
> > or not. If somebody spent a good amount of time reviewing a patch 
> > previously,
> > but then didn't review the newest version in the last few weeks, it doesn't
> > seem useful to remove them from the CF entry. The situation is different if
> > somebody has signed up but not done much.
> 
> If someone put a lot of review into a patchset a few months ago, they
> absolutely deserve credit. But if that entry has been sitting with no
> feedback this month, why is it useful to keep that Reviewer around?

IDK, I've plenty of times given feedback and it took months till it all was
implemented. What's the point of doing further rounds of review until then?

Greetings,

Andres Freund




Re: Select Reference Page - Make Join Syntax More Prominent

2022-07-15 Thread David G. Johnston
On Thu, Jul 7, 2022 at 2:33 PM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > Looking again at the SELECT Reference page while helping a novice user I
> > was once again annoyed by how the most common query syntax form for the
> > FROM clause is buried within a bunch of "how to generate a table" detail.
>
> Hmm ... I'm good with the concept of making JOIN syntax more prominent,
> but I don't much like this patch.  I think it's fundamentally wrong to
> describe from_item as disjoint from join_expression, and you're
> going to make people more confused not less so by doing that.
>

I'm not so sure about this - but in any case, as you note below, some of
this probably would be better placed in Chapter 7.  My impression is that
this aspect is dense and confusing enough as-is that people are learning
the syntax elsewhere and not worrying about how we express it here.  My
desire is to bring some design theory aspects to the docs and have the
syntax expression align with the design.  I'll do another pass to try and
hopefully find some middle ground here; or be more convincing.


> IMO there's nothing really wrong with the synopsis.  The problem is
> in the "FROM Clause" section, which dives headfirst into the weedy
> details without any context.  What do you think of adding an
> introductory para or two in that section, saying that the FROM
> clause is built from base tables possibly joined into join expressions?
> You sort of have that here, but it's pretty terse still.  Maybe it
> would be helpful also to separate the subsequent list of syntax
> details into base-table options and join syntax.
>

I really don't think focusing on base tables is the best choice here.
Frankly, those end up being fairly trivial.  When you start adding LATERAL
(especially if introduced by a comma instead of a join), and subqueries more
generally, understanding how the pieces compose becomes more useful.
And by separating the from_item and join_item it becomes easier to give
each a purpose and describe how they relate to each other, rather than
trying to paper over their differences.  If anything, I think that having
from_item be used both within the join_item syntax and directly under FROM may
be detrimental.  Instead, have a term just for the items comma-separated in
the FROM clause, the ones that are CROSS JOINed and use the WHERE clause
for restrictions (or maybe not...).


>
> Not sure what I think about moving LATERAL up.  That's a sufficiently
> weird/obtuse thing that I think we're best off dealing with it last,
>

Strongly disagree given how useful set-returning functions can be.  Part of
the reason I'm doing this now is a recent spate of questions where the
advice that I gave was to write an SRF in a joined lateral, which I consider
to be the idiomatic query structure for that use case.



> There's also an argument that the reference page *should* be terse
> and the place to cater to novices is 7.2's "Table Expressions"
> discussion (which we could, but don't, link from the ref page).
> I'm not sure if there's any good way to rework that material to make
> it clearer, but there are definitely bits of it that I don't find
> very well-written.  There might be an argument for jettisoning
> some details (like the obsolete "table*" notation) from 7.2
> altogether and covering those only in the ref page.
>
>
Agreed, I will see about allocating material between the two sections
better on the next pass.

I probably need to understand ROWS FROM syntax better as well to make sure
it fits into the mental model I am trying to create.

David J.


Re: [Commitfest 2022-07] Begins Now

2022-07-15 Thread Jacob Champion
On 7/15/22 16:51, Andres Freund wrote:
> I'd make it dependent on whether there have been previous rounds of feedback
> or not. If somebody spent a good amount of time reviewing a patch previously,
> but then didn't review the newest version in the last few weeks, it doesn't
> seem useful to remove them from the CF entry. The situation is different if
> somebody has signed up but not done much.

If someone put a lot of review into a patchset a few months ago, they
absolutely deserve credit. But if that entry has been sitting with no
feedback this month, why is it useful to keep that Reviewer around?

(We may want to join this with the other thread.)

--Jacob




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 07:27:47PM -0400, Jonathan S. Katz wrote:
> > So I agree with Andres here. It seems weird to me to try to document
> > this new thing that I caused when we don't really make any attempt to
> > document all the other weird stuff with work_mem.
> 
> I can't argue with this.
> 
> My note on the documentation was primarily around to seeing countless user
> issues post-upgrade where queries that "once performed well no longer do
> so." I want to ensure that our users at least have a starting point to work
> on resolving the issues, even if they end up being very nuanced.
> 
> Perhaps a next step (and a separate step from this) is to assess the
> guidance we give on the upgrade page[1] about some common things they should
> check for. Then we can have the "boilerplate" there.

I assume that if you set a GUC, you should also review and maintain it into the
future.  Non-default settings should be re-evaluated (at least) during major
upgrades.  That's typically more important than additionally fiddling with any
newly-added GUCs.

For example, in v13, I specifically re-evaluated shared_buffers everywhere due
to de-duplication.

In v13, hash agg was updated to spill to disk, and hash_mem_multiplier was
added to mitigate any performance issues (and documented as such in the release
notes).

I've needed to disable JIT since it was enabled by default in v12, since it
1) doesn't help; and 2) leaks memory enough to cause some customers' DBs to be
killed every 1-2 days.  (The change in default was documented, so there's no
more documentation needed).

I'm sure some people should/have variously revisited the parallel and
"asynchronous" GUCs, if they changed the defaults.  (When parallel query was
enabled by default in v10, the change wasn't initially documented, which was a
problem).

I suppose checkpoint_* and *wal* should be/have been revisited at various
points.  Probably effective_cache_size too.  Those are just the most common
ones to change.

-- 
Justin




Re: Commitfest Update

2022-07-15 Thread Jacob Champion
On 7/15/22 16:15, Justin Pryzby wrote:
> On Fri, Jul 15, 2022 at 03:17:49PM -0700, Jacob Champion wrote:
>> Also, I would like to see us fold cfbot output into the official CF,
>> rather than do the opposite.
> 
> That's been the plan for years :)

Is there something other than lack of round tuits that's blocking
progress? I'm happy to donate more time in this area, but I don't know
if my first patch proposal was helpful (or even on the right list --
pgsql-www, right?).

>>> I'd prefer to see someone
>>> pick one of the patches that hasn't seen a review in 6 (or 16) months, and 
>>> send
>>> out their most critical review and recommend it be closed, or send an 
>>> updated
>>> patch with their own fixes as an 0099 patch.
>>
>> That would be cool, but who is "someone"? There have been many, many
>> statements about the amount of CF cruft that needs to be removed. Seems
>> like the CFM is in a decent position to actually do it.
> 
> I have hesitated to even try to begin the conversation.
> 
> Since a patch author initially creates the CF entry, why shouldn't they also 
> be
> responsible for moving them to the next cf.  This serves to indicate a
> continued interest.  Ideally they also set back to "needs review" after
> addressing feedback, but I imagine many would forget, and this seems like a
> reasonable task for a CFM to do - look at WOA patches that pass tests to see 
> if
> they're actually WOA.

I agree in principle -- I think, ideally, WoA patches should be
procedurally closed at the end of a commitfest, and carried forward when
the author has actually responded. The problems I can imagine resulting
from this are

- Some reviewers mark WoA _immediately_ upon sending an email. I think
authors should have a small grace period to respond before having their
patches automatically "muted" by the system, if the review happens right
at the end of the CF.

- Carrying forward a closed patch is not actually easy. See below.

> Similarly, patches could be summarily set to "waiting on author" if they 
> didn't
> recently apply, compile, and pass tests.  That's the minimum standard.
> However, I think it's better not to do this immediately after the patch stops
> applying/compiling/failing tests, since it's usually easy enough to review it.

It's hard to argue with that, but without automation, this is plenty of
busy work too.

> It should be the author's responsibility to handle that; I don't know why the
> accepted process seems to involve sending dozens of emails to say "needs
> rebase".  You're putting to good use of some cfapp email features I don't
> remember seeing used before; that seems much better.  Also, it's possible to
> subscribe to CF updates (possibly not a well-advertised, well-known or
> well-used feature).  

I don't think it should be reviewers' responsibility to say "needs
rebase" over and over again, and I also don't think a new author should
have to refresh the cfbot every single day to find out whether or not
their patch still applies. These things should be handled by the app.

(Small soapbox, hopefully relevant: I used to be in the camp that making
contributors jump through small procedural hoops would somehow weed out
people who were making low-effort patches. I've since come around to the
position that this just tends to select for people with more free time
and/or persistence. I don't like the idea of raising the busy-work bar
for authors, especially without first fixing the automation problem.)

> I didn't know until recently that when a CF entry is closed, that it's 
> possible
> (I think) for the author themselves to reopen it and "move it to the next CF".
> I suggest to point this out to people; I suppose I'm not the only one who 
> finds
> it offputting when a patch is closed in batch at the end of the month after
> getting only insignificant review.

I think this may have been the goal but I don't think it actually works
in practice. The app refuses to let you carry a RwF patch to a new CF.

--

This is important stuff to discuss, for sure, but I also want to revisit
the thing I put on pause, which is to clear out old Reviewer entries to
make it easier for new reviewers to find work to do. If we're not using
Reviewers as a historical record, is there any reason for me not to keep
clearing that out?

It undoes work that you and others have done to make the historical
record more accurate, and I think that's understandably frustrating. But
I thought we were trying to move away from that usage of it altogether.

--Jacob




Re: Use -fvisibility=hidden for shared libraries

2022-07-15 Thread Andres Freund
Hi,

On 2022-01-11 15:54:19 -0500, Tom Lane wrote:
> I still don't understand what are the conditions for MSVC to complain.
> The rule is evidently not that every extern must agree with the function
> definition, because for example you added
> 
> +extern PGDLLEXPORT void _PG_init(void);
> 
> in fmgr.h, but you didn't change any of the existing extern declarations
> or definitions for _PG_init functions, and yet everything seems to work.

I think I figured that part out now:
https://godbolt.org/z/qYqo95fYs

It works as long as the *first* declaration has the declspec; later ones don't
need it. If the first one does *not* have the declspec but a later one does,
you get "error C2375: 'msvc_fail': redefinition; different linkage".

That makes some sort of sense.
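
To spell it out (hypothetical names, along the lines of the godbolt test case):

    /* Accepted: the first declaration carries the declspec, and later
     * declarations and the definition inherit the dllexport linkage. */
    __declspec(dllexport) void msvc_ok(void);
    void msvc_ok(void);
    void msvc_ok(void) {}

    /* Rejected by MSVC: the declspec only appears on a later declaration,
     * giving "error C2375: 'msvc_fail': redefinition; different linkage". */
    void msvc_fail(void);
    __declspec(dllexport) void msvc_fail(void);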


> I had concluded that gcc/clang follow the rule "use an attribute if it
> appears on at least one extern for the function", and this seems like
> evidence that it works like that in MSVC too.

So it's not quite the same as with gcc / clang...

Greetings,

Andres Freund




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Jonathan S. Katz

On 7/15/22 6:52 PM, Andres Freund wrote:

Hi,

On 2022-07-15 18:40:11 -0400, Jonathan S. Katz wrote:

What I find interesting is the resistance to adding any documentation around
this feature to guide users in case they hit the regression. I understand it
can be difficult to provide guidance on issues related to adjusting
work_mem, but even just a hint in the release notes to say "if you see a
performance regression you may need to adjust work_mem" would be helpful.
This would help people who are planning upgrades to at least know what to
watch out for.


If we want to add that as boilerplate for every major release - ok, although
it's not really actionable.  What I'm against is adding it specifically for
this release - we have stuff like this *all the time*, we just often don't
bother to carefully find the specific point at which an optimization might
hurt.

Honestly, if this thread teaches anything, it's to hide relatively minor
corner case regressions. Which isn't good.


I think it's OK to discuss ways we can better help our users. It's also 
OK to be wrong; I have certainly been so plenty of times.


Jonathan




Re: [Commitfest 2022-07] Begins Now

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 16:42:03 -0700, Jacob Champion wrote:
> I started removing stale Reviewers fairly aggressively today, as
> discussed in [1], but there was some immediate feedback and I have
> paused that process for now. If you're wondering why you are no longer
> marked as reviewer on a patch, I followed the following rule: if you
> were signed up to review before June of this year, but you haven't
> interacted with the patch in this commitfest, I removed you from the
> list. If you have thoughts/comments on this approach, please share them!

I'd make it dependent on whether there have been previous rounds of feedback
or not. If somebody spent a good amount of time reviewing a patch previously,
but then didn't review the newest version in the last few weeks, it doesn't
seem useful to remove them from the CF entry. The situation is different if
somebody has signed up but not done much.

Greetings,

Andres Freund




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 14:51:38 -0700, Jacob Champion wrote:
> > We already have pg_clean_ascii() and use it for application_name, fwiw.
> 
> That seems much worse than escaping for this particular patch; if your
> cert's Common Name is in (non-ASCII) UTF-8 then all you'll see is
> "CN=?" in the log lines that were supposed to be helping you
> root-cause. Escaping would be much more helpful in this case.

I'm doubtful that's all that common. But either way, I suggest a separate
patch to deal with that...

Greetings,

Andres Freund




Re: [Commitfest 2022-07] Begins Now

2022-07-15 Thread Jacob Champion
On 7/8/22 16:42, Jacob Champion wrote:
> On 7/1/22 08:08, Jacob Champion wrote:
>> It's been July everywhere on Earth for a few hours, so the July
>> commitfest is now in progress:
>>
>> https://commitfest.postgresql.org/38/

Halfway through!

We are now at

Needs review: 175
Waiting on Author: 43
Ready for Committer:   20
Committed: 52
Moved to next CF:   2
Returned with Feedback: 4
Rejected:   3
Withdrawn:  6
--
Total:305

Since last week, that's fifteen more committed patchsets, with the Ready
for Committer queue holding fairly steady. Nice work, everyone.

I started removing stale Reviewers fairly aggressively today, as
discussed in [1], but there was some immediate feedback and I have
paused that process for now. If you're wondering why you are no longer
marked as reviewer on a patch, I followed the following rule: if you
were signed up to review before June of this year, but you haven't
interacted with the patch in this commitfest, I removed you from the
list. If you have thoughts/comments on this approach, please share them!

Thanks,
--Jacob

[1]
https://www.postgresql.org/message-id/flat/34b32cb2-a728-090a-00d5-067305874174%40timescale.com#3247e661b219f8736ae418c9b5452d63




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Jonathan S. Katz

Thank you for the very detailed analysis. Comments inline.

On 7/15/22 7:12 PM, David Rowley wrote:

On Sat, 16 Jul 2022 at 10:40, Jonathan S. Katz  wrote:

What I find interesting is the resistance to adding any documentation
around this feature to guide users in case they hit the regression. I
understand it can be difficult to provide guidance on issues related to
adjusting work_mem, but even just a hint in the release notes to say "if
you see a performance regression you may need to adjust work_mem" would
be helpful. This would help people who are planning upgrades to at least
know what to watch out for.


Looking back at the final graph in the blog [1], I see that work_mem
is a pretty surprising GUC.  I'm sure many people would expect that
setting work_mem to some size that allows the sort to be entirely done
in RAM would be the fastest way. And that does appear to be the case,
as 16GB was the only setting which allowed that.  However, I bet it
would surprise many people to see that 8GB wasn't 2nd fastest. Even
128MB was faster than 8GB!


Yeah that is interesting. And while some of those settings are less 
likely in the wild, I do think we are going to see larger and larger 
"work_mem" settings as instance sizes continue to grow. That said, your 
PG15 benchmarks are overall faster than the PG14 ones, and that is what I am 
looking at in the context of this release.



Most likely that's because the machine I tested that on has lots of
RAM spare for kernel buffers which would allow all that disk activity
for batching not actually to cause physical reads or writes.  I bet
that would have looked different if I'd run a few concurrent sorts
with 128MB of work_mem. They'd all be competing for kernel buffers in
that case.

So I agree with Andres here. It seems weird to me to try to document
this new thing that I caused when we don't really make any attempt to
document all the other weird stuff with work_mem.


I can't argue with this.

My note on the documentation was primarily around to seeing countless 
user issues post-upgrade where queries that "once performed well no 
longer do so." I want to ensure that our users at least have a starting 
point to work on resolving the issues, even if they end up being very 
nuanced.


Perhaps a next step (and a separate step from this) is to assess the 
guidance we give on the upgrade page[1] about some common things they 
should check for. Then we can have the "boilerplate" there.



I think the problem can actually be worse with work_mem sizes in
regards to hash tables.  The probing phase of a hash join causes
memory access patterns that the CPU cannot determine which can result
in poor performance when the hash table size is larger than the CPU's
L3 cache size.  If you have fast enough disks, it seems realistic that
given the right workload (most likely much more than 1 probe per
bucket) that you could also get better performance by having lower
values of work_mem.

If we're going to document the generic context anomaly then we should
go all out and document all of the above, plus all the other weird
stuff I've not thought of.  However, I think, short of having an
actual patch to review, it might be better to leave it until someone
can come up with some text that's comprehensive enough to be worthy of
reading.  I don't think I could do the topic justice.  I'm also not
sure any wisdom we write about this would be of much use in the real
world given that its likely concurrency has a larger effect, and we
don't have much ability to control that.


Understood. I don't think that is fair to ask for this release, but 
don't sell yourself short on explaining the work_mem nuances.



FWIW, I think it would be better for us just to solve these problems
in code instead.  Having memory gating to control the work_mem from a
pool and teaching sort about CPU caches might be better than
explaining to users that tuning work_mem is hard.


+1. Again thank you for taking the time for the thorough explanation and 
of course, working on the patch and fixes.


Jonathan

[1] https://www.postgresql.org/docs/current/upgrading.html




Moving RwF patches to a new CF

2022-07-15 Thread Jacob Champion
[changing the subject line, was "[Commitfest 2022-07] Begins Now"]

On Fri, Jul 15, 2022 at 1:37 AM Wenjing Zeng  wrote:
> Please move the Global Temporary Table entry to next month, that is, 202208.
> I need more time to address the existing issue.

Hi Wenjing,

My current understanding is that RwF patches can't be moved (even by
the CFM). You can just reattach the existing thread to a new CF entry
when you're ready, as discussed in [1]. (Reviewers can sign themselves
back up; don't carry them over.)

It does seem annoying to have to reapply annotations, though. I would
like to see the CF app support this and I'll try to put a patch
together sometime (there's a growing list).

Thanks,
--Jacob

[1] 
https://www.postgresql.org/message-id/CALT9ZEGoVd6%2B5FTJ3d6DXRO4z%3DrvqK%2BdjrmgSiFo7dkKeMkyfQ%40mail.gmail.com




Re: Commitfest Update

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 03:17:49PM -0700, Jacob Champion wrote:
> > Also, cfapp has a page for "patches where you are the author", but the cfbot
> > doesn't,
> 
> (I assume you mean "reviewer"?)

Yes

> Also, I would like to see us fold cfbot output into the official CF,
> rather than do the opposite.

That's been the plan for years :)

> > I'd prefer to see someone
> > pick one of the patches that hasn't seen a review in 6 (or 16) months, and 
> > send
> > out their most critical review and recommend it be closed, or send an 
> > updated
> > patch with their own fixes as an 0099 patch.
> 
> That would be cool, but who is "someone"? There have been many, many
> statements about the amount of CF cruft that needs to be removed. Seems
> like the CFM is in a decent position to actually do it.

I have hesitated to even try to begin the conversation.

Since a patch author initially creates the CF entry, why shouldn't they also be
responsible for moving it to the next CF?  This serves to indicate
continued interest.  Ideally they would also set it back to "needs review" after
addressing feedback, but I imagine many would forget, and this seems like a
reasonable task for a CFM to do - look at WOA patches that pass tests to see if
they're actually WOA.

Similarly, patches could be summarily set to "waiting on author" if they didn't
recently apply, compile, and pass tests.  That's the minimum standard.
However, I think it's better not to do this immediately after the patch stops
applying or compiling or starts failing tests, since it's usually easy enough
to review it.

It should be the author's responsibility to handle that; I don't know why the
accepted process seems to involve sending dozens of emails to say "needs
rebase".  You're putting to good use some cfapp email features I don't
remember seeing used before; that seems much better.  Also, it's possible to
subscribe to CF updates (possibly not a well-advertised, well-known or
well-used feature).

I didn't know until recently that when a CF entry is closed, it's possible
(I think) for the author themselves to reopen it and "move it to the next CF".
I suggest pointing this out to people; I suppose I'm not the only one who finds
it offputting when a patch is closed in batch at the end of the month after
getting only insignificant review.

Thanks for considering

-- 
Justin




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread David Rowley
On Sat, 16 Jul 2022 at 10:40, Jonathan S. Katz  wrote:
> What I find interesting is the resistance to adding any documentation
> around this feature to guide users in case they hit the regression. I
> understand it can be difficult to provide guidance on issues related to
> adjusting work_mem, but even just a hint in the release notes to say "if
> you see a performance regression you may need to adjust work_mem" would
> be helpful. This would help people who are planning upgrades to at least
> know what to watch out for.

Looking back at the final graph in the blog [1], I see that work_mem
is a pretty surprising GUC.  I'm sure many people would expect that
setting work_mem to some size that allows the sort to be entirely done
in RAM would be the fastest way. And that does appear to be the case,
as 16GB was the only setting which allowed that.  However, I bet it
would surprise many people to see that 8GB wasn't 2nd fastest. Even
128MB was faster than 8GB!

Most likely that's because the machine I tested that on has lots of
RAM spare for kernel buffers which would allow all that disk activity
for batching not actually to cause physical reads or writes.  I bet
that would have looked different if I'd run a few concurrent sorts
with 128MB of work_mem. They'd all be competing for kernel buffers in
that case.

So I agree with Andres here. It seems weird to me to try to document
this new thing that I caused when we don't really make any attempt to
document all the other weird stuff with work_mem.

I think the problem can actually be worse with work_mem sizes in
regard to hash tables.  The probing phase of a hash join causes
memory access patterns that the CPU cannot predict, which can result
in poor performance when the hash table size is larger than the CPU's
L3 cache size.  If you have fast enough disks, it seems realistic that
given the right workload (most likely much more than 1 probe per
bucket) that you could also get better performance by having lower
values of work_mem.

If we're going to document the generation context anomaly then we should
go all out and document all of the above, plus all the other weird
stuff I've not thought of.  However, I think, short of having an
actual patch to review, it might be better to leave it until someone
can come up with some text that's comprehensive enough to be worthy of
reading.  I don't think I could do the topic justice.  I'm also not
sure any wisdom we write about this would be of much use in the real
world given that its likely concurrency has a larger effect, and we
don't have much ability to control that.

FWIW, I think it would be better for us just to solve these problems
in code instead.  Having memory gating to control the work_mem from a
pool and teaching sort about CPU caches might be better than
explaining to users that tuning work_mem is hard.

David

[1] 
https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/speeding-up-sort-performance-in-postgres-15/ba-p/3396953




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 18:40:11 -0400, Jonathan S. Katz wrote:
> What I find interesting is the resistance to adding any documentation around
> this feature to guide users in case they hit the regression. I understand it
> can be difficult to provide guidance on issues related to adjusting
> work_mem, but even just a hint in the release notes to say "if you see a
> performance regression you may need to adjust work_mem" would be helpful.
> This would help people who are planning upgrades to at least know what to
> watch out for.

If we want to add that as boilerplate for every major release - ok, although
it's not really actionable.  What I'm against is adding it specifically for
this release - we have stuff like this *all the time*, we just often don't
bother to carefully find the specific point at which an optimization might
hurt.

Honestly, if this thread teaches anything, it's to hide relatively minor
corner case regressions. Which isn't good.

Greetings,

Andres Freund




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Jonathan S. Katz

On 7/15/22 6:40 PM, Jonathan S. Katz wrote:

On 7/15/22 4:54 PM, Tom Lane wrote:

Tomas Vondra  writes:

... My personal opinion is that it's a rare regression. Other
optimization patches have similar rare regressions, except that David
spent so much time investigating this one it seems more serious.


Yeah, this.  I fear we're making a mountain out of a molehill.  We have
committed many optimizations that win on average but possibly lose
in edge cases, and not worried too much about it.


I disagree with the notion of this being a "mountain out of a molehill." 
The RMT looked at the situation, asked if we should make one more pass. 
There were logical arguments as to why not (e.g. v16 efforts). I think 
that is reasonable, and we can move on from any additional code changes 
for v15.


What I find interesting is the resistance to adding any documentation 
around this feature to guide users in case they hit the regression. I 
understand it can be difficult to provide guidance on issues related to 
adjusting work_mem, but even just a hint in the release notes to say "if 
you see a performance regression you may need to adjust work_mem" would 
be helpful. This would help people who are planning upgrades to at least 
know what to watch out for.


If that still seems unreasonable, I'll agree to disagree so we can move 
on with other parts of the release.


For completeness, I marked the open item as closed.

Jonathan




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Jonathan S. Katz

On 7/15/22 4:54 PM, Tom Lane wrote:

Tomas Vondra  writes:

... My personal opinion is that it's a rare regression. Other
optimization patches have similar rare regressions, except that David
spent so much time investigating this one it seems more serious.


Yeah, this.  I fear we're making a mountain out of a molehill.  We have
committed many optimizations that win on average but possibly lose
in edge cases, and not worried too much about it.


I disagree with the notion of this being a "mountain out of a molehill." 
The RMT looked at the situation, asked if we should make one more pass. 
There were logical arguments as to why not (e.g. v16 efforts). I think 
that is reasonable, and we can move on from any additional code changes 
for v15.


What I find interesting is the resistance to adding any documentation 
around this feature to guide users in case they hit the regression. I 
understand it can be difficult to provide guidance on issues related to 
adjusting work_mem, but even just a hint in the release notes to say "if 
you see a performance regression you may need to adjust work_mem" would 
be helpful. This would help people who are planning upgrades to at least 
know what to watch out for.


If that still seems unreasonable, I'll agree to disagree so we can move 
on with other parts of the release.


Jonathan




Re: Commitfest Update

2022-07-15 Thread Jacob Champion
On 7/15/22 14:57, Justin Pryzby wrote:
> On Fri, Jul 08, 2022 at 02:41:52PM -0700, Jacob Champion wrote:
>
>> If there are no objections, I'll start doing that during next Friday's
>> patch sweep.
> 
> I think it's fine to update the cfapp fields to reflect reality...
> 
> ..but a couple updates that I just saw seem wrong.

Hm, okay. Let me hold off on continuing then; I'm only about 25% in. The
general rule I was applying was "if you were marked Reviewer prior to
June, and you haven't interacted with the patchset this commitfest, I've
removed you."

>  The reviewers field was
> nullified, even though the patches haven't been updated in a long time.
> There's nothing new to review.  All this has done is lost information that
> someone else (me, in this case) went to the bother of adding.

My understanding from upthread was that we wanted to get out of the
habit of using Reviewers as a historical record, and move towards using
it as a marker of current activity. As Tom said, "people see that the
patch already has a reviewer and look for something else to do."

I am sorry that I ended up reverting your work, though.

> Also, cfapp has a page for "patches where you are the author", but the cfbot
> doesn't,

(I assume you mean "reviewer"?)

> and I think people probably look at cfbot more than the cfapp itself.

I think some people do. But the number of dead/non-applicable patches
that need manual reminders suggests to me that maybe it's not an
overwhelming majority of people.

> So being marked as a reviewer is not very visible even to oneself.
> But, one of the cfbot patches I sent to Thomas would change that.  Each user's
> page would *also* show patches where they're a reviewer ("Needs review -
> Reviewer").  That maybe provides an incentive to 1) help maintain the patch; 
> or
> otherwise 2) remove oneself.

I didn't notice cfbot's user pages until this CF, so it wouldn't have
been an effective incentive for me, at least.

Also, I would like to see us fold cfbot output into the official CF,
rather than do the opposite.

> Also, TBH, this seems to create a lot of busywork.

Well, yes, but only because it's not automated. I don't think that's a
good reason not to do it, but it is a good reason not to make a person
do it.

>  I'd prefer to see someone
> pick one of the patches that hasn't seen a review in 6 (or 16) months, and 
> send
> out their most critical review and recommend it be closed, or send an updated
> patch with their own fixes as an 0099 patch.

That would be cool, but who is "someone"? There have been many, many
statements about the amount of CF cruft that needs to be removed. Seems
like the CFM is in a decent position to actually do it.

--Jacob




Re: Commitfest Update

2022-07-15 Thread Justin Pryzby
On Fri, Jul 08, 2022 at 02:41:52PM -0700, Jacob Champion wrote:
> On 3/31/22 07:37, Tom Lane wrote:
> > Robert Haas  writes:
> >> On Thu, Mar 31, 2022 at 10:11 AM Tom Lane  wrote:
> >>> ... Would it be feasible or reasonable
> >>> to drop reviewers if they've not commented in the thread in X amount
> >>> of time?
> > 
> >> In theory, this might cause someone who made a valuable contribution
> >> to the discussion to not get credited in the commit message. But it
> >> probably wouldn't in practice, because I at least always construct the
> >> list of reviewers from the thread, not the CF app, since that tends to
> >> be wildly inaccurate in both directions. So maybe it's fine? Not sure.
> > 
> > Hmm, I tend to believe what's in the CF app, so maybe I'm dropping the
> > ball on review credits :-(.  But there are various ways we could implement
> > this.  One way would be a nagbot that sends private email along the lines
> > of "you haven't commented on patch X in Y months.  Please remove your name
> > from the list of reviewers if you don't intend to review it any more."
> 
> It seems there wasn't a definitive decision here. Are there any
> objections to more aggressive pruning of the Reviewers entries? So
> committers would need to go through the thread for full attribution,
> moving forward.
> 
> If there are no objections, I'll start doing that during next Friday's
> patch sweep.

I think it's fine to update the cfapp fields to reflect reality...

..but a couple updates that I just saw seem wrong.  The reviewers field was
nullified, even though the patches haven't been updated in a long time.
There's nothing new to review.  All this has done is lose information that
someone else (me, in this case) went to the bother of adding.

Also, cfapp has a page for "patches where you are the author", but the cfbot
doesn't, and I think people probably look at cfbot more than the cfapp itself.
So being marked as a reviewer is not very visible even to oneself.
But, one of the cfbot patches I sent to Thomas would change that.  Each user's
page would *also* show patches where they're a reviewer ("Needs review -
Reviewer").  That maybe provides an incentive to 1) help maintain the patch; or
otherwise 2) remove oneself.

Also, TBH, this seems to create a lot of busywork.  I'd prefer to see someone
pick one of the patches that hasn't seen a review in 6 (or 16) months, and send
out their most critical review and recommend it be closed, or send an updated
patch with their own fixes as an 0099 patch.

-- 
Justin




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Jacob Champion
On 7/15/22 14:19, Andres Freund wrote:
> On 2022-07-15 14:01:53 -0700, Jacob Champion wrote:
>> On 7/15/22 13:35, Andres Freund wrote:
>>> I don't know, but I don't think not caring at all is a good
>>> option. Particularly for unauthenticated data I'd say that escaping 
>>> everything
>>> but printable ascii chars is a sensible approach.
>>
>> It'll also be painful for anyone whose infrastructure isn't in a Latin
>> character set... Maybe that's worth the tradeoff for a v1.
> 
> I don't think it's a huge issue, or really avoidable, pre-authentication.
> Don't we require all server-side encodings to be supersets of ascii?

Well, I was going to say that for this feature, where the goal is to
debug a failed certificate chain, having to manually unescape the logged
certificate names if your infrastructure already handled UTF-8 natively
would be a real pain.

But your later point about truncation makes that moot; I forgot that my
patch can truncate in the middle of a UTF-8 sequence, so you're probably
dealing with replacement glyphs anyway. I don't really have a leg to
stand on there.

> We already have pg_clean_ascii() and use it for application_name, fwiw.

That seems much worse than escaping for this particular patch; if your
cert's Common Name is in (non-ASCII) UTF-8 then all you'll see is
"CN=?" in the log lines that were supposed to be helping you
root-cause. Escaping would be much more helpful in this case.

>> Is there an acceptable approach that could centralize it, so we fix it
>> once and are done? E.g. a log_encoding GUC and either conversion or
>> escaping in send_message_to_server_log()?
> 
> Introducing escaping to ascii for all log messages seems like it'd be
> incredibly invasive, and would remove a lot of worthwhile information. Nor
> does it really address the whole scope - consider e.g. the truncation in this
> patch, that can't be done correctly by the time send_message_to_server_log()
> is reached - just chopping in the middle of a multi-byte string would have
> made the string invalidly encoded. And we can't perform encoding conversion
> from client data until we've gone further into the authentication process, I
> think.
> 
> Always escaping ANSI escape codes (or rather the non-printable ascii range) is
> more convincing. Then we'd just need to make sure that client controlled data
> is properly encoded before handing it over to other parts of the system.
> 
> I can see a point in a log_encoding GUC at some point, but it seems a bit
> separate from the discussion here.

Fair enough.

--Jacob




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Tom Lane
Andres Freund  writes:
> We already have pg_clean_ascii() and use it for application_name, fwiw.

Yeah, we should just use that.  If anyone wants to upgrade the situation
for non-ASCII data later, fixing it for all of these cases at once would
be appropriate.

regards, tom lane




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 14:01:53 -0700, Jacob Champion wrote:
> On 7/15/22 13:35, Andres Freund wrote:
> >> (And do we want to fix it now, regardless?)
> > 
> > Yes.
> 
> Cool. I can get on board with that.
> 
> >> What guarantees are we supposed to be making for log encoding?
> > 
> > I don't know, but I don't think not caring at all is a good
> > option. Particularly for unauthenticated data I'd say that escaping 
> > everything
> > but printable ascii chars is a sensible approach.
> 
> It'll also be painful for anyone whose infrastructure isn't in a Latin
> character set... Maybe that's worth the tradeoff for a v1.

I don't think it's a huge issue, or really avoidable, pre-authentication.
Don't we require all server-side encodings to be supersets of ascii?

We already have pg_clean_ascii() and use it for application_name, fwiw.


> Is there an acceptable approach that could centralize it, so we fix it
> once and are done? E.g. a log_encoding GUC and either conversion or
> escaping in send_message_to_server_log()?

Introducing escaping to ascii for all log messages seems like it'd be
incredibly invasive, and would remove a lot of worthwhile information. Nor
does it really address the whole scope - consider e.g. the truncation in this
patch, that can't be done correctly by the time send_message_to_server_log()
is reached - just chopping in the middle of a multi-byte string would have
made the string invalidly encoded. And we can't perform encoding conversion
from client data until we've gone further into the authentication process, I
think.

Always escaping ANSI escape codes (or rather the non-printable ascii range) is
more convincing. Then we'd just need to make sure that client controlled data
is properly encoded before handing it over to other parts of the system.

I can see a point in a log_encoding GUC at some point, but it seems a bit
separate from the discussion here.

Greetings,

Andres Freund




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-08 17:05:49 -0400, Andrew Dunstan wrote:
> On 2022-07-05 Tu 15:04, Andrew Dunstan wrote:
> > On 2022-07-05 Tu 14:36, Andres Freund wrote:
> >>
>  I think Andrew's beta 2 comment was more about my other architectural
>  complaints around the json expression eval stuff.
> >>> Right. That's being worked on but it's not going to be a mechanical fix.
> >> Any updates here?
> >
> > Not yet. A colleague and I are working on it. I'll post a status this
> > week if we can't post a fix.

> We're still working on it. We've made substantial progress but there are
> some tests failing that we need to fix.

I think we need to resolve this soon - or consider the alternatives. A lot of
the new json stuff doesn't seem fully baked, so I'm starting to wonder if we
have to consider pushing it a release further down.

Perhaps you could post your current state? I might be able to help resolve
some of the problems.

Greetings,

Andres Freund




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Jacob Champion
On 7/15/22 13:35, Andres Freund wrote:
>> (And do we want to fix it now, regardless?)
> 
> Yes.

Cool. I can get on board with that.

>> What guarantees are we supposed to be making for log encoding?
> 
> I don't know, but I don't think not caring at all is a good
> option. Particularly for unauthenticated data I'd say that escaping everything
> but printable ascii chars is a sensible approach.

It'll also be painful for anyone whose infrastructure isn't in a Latin
character set... Maybe that's worth the tradeoff for a v1.

Is there an acceptable approach that could centralize it, so we fix it
once and are done? E.g. a log_encoding GUC and either conversion or
escaping in send_message_to_server_log()?

--Jacob




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 16:42:14 -0400, Jonathan S. Katz wrote:
> On 7/15/22 4:36 PM, Tomas Vondra wrote:
> > > > If there does not appear to be a clear path forward, we should at least
> > > > document how a user can detect and resolve the issue.
> > > 
> > > To me that doesn't really make sense. We have lots of places where 
> > > performance
> > > changes once you cross some threshold, and lots of those are related to
> > > work_mem.
> 
> Yes, but in this case this is nonobvious to a user. A sort that may be
> performing just fine on a pre-PG15 version is suddenly degraded, and the
> user has no guidance as to why or how to remediate.

We make minor changes affecting the thresholds at which things spill to disk
etc. *all the time*. Setting the standard that all of those need to be
documented seems not wise to me, both because of the effort for us and
because it'll end up being a morass of documentation that nobody can make use
of, potentially preventing people from upgrading because some minor perf
changes in some edge cases sound scary.

I'm fairly certain there were numerous other changes with such effects. We
just don't know because there wasn't as careful benchmarking.


> > > We don't, e.g., provide tooling to detect when performance in aggregation
> > > regresses due to crossing work_mem and could be fixed by a tiny increase 
> > > in
> > > work_mem.
> > > 
> > 
> > Yeah. I find it entirely reasonable to tell people to increase work_mem
> > a bit to fix this. The problem is knowing you're affected :-(
> 
> This is the concern the RMT discussed. Yes it is reasonable to say "increase
> work_mem to XYZ", but how does a user know how to detect and apply that?

They won't. Nor will they with any reasonable documentation we can give. Nor
will they know in any of the other cases where some threshold is crossed.


> This is the part that is worrisome, especially because we don't have any
> sense of what the overall impact will be.
> 
> Maybe it's not much, but we should document that there is the potential for
> a regression.

-1

Greetings,

Andres Freund




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Tom Lane
Tomas Vondra  writes:
> ... My personal opinion is that it's a rare regression. Other
> optimization patches have similar rare regressions, except that David
> spent so much time investigating this one it seems more serious.

Yeah, this.  I fear we're making a mountain out of a molehill.  We have
committed many optimizations that win on average but possibly lose
in edge cases, and not worried too much about it.

regards, tom lane




Re: MERGE and parsing with prepared statements

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 12:59:34PM -0700, David G. Johnston wrote:
> On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby  wrote:
> 
> >
> | If the expression for any column is not of the correct data type, automatic 
> type conversion will be attempted.
> > That appears to be copied from the INSERT page.
> > What does that mean, if not that data types will be resolved as needed ?
>
> Yep, and the system needs to resolve the type at a point where there is no
> contextual information and so it chooses text.

I don't know if that's a viable interpretation.  The parser "resolved the type"
by assuming a default, which caused an error before finishing parsing.

In the case of INSERT, "conversion will be attempted" means it looks for a cast
from the source type to the target type, and its "automatic type conversion"
will fail if (for example) you try to insert a timestamp into an int.

In the case of MERGE, that same sentence evidently means that it assumes a
default source type (rather than looking for a cast) and then fails if it
doesn't exactly match the target type.

Should that sentence be removed from MERGE ?

-- 
Justin




Re: doc: Bring mention of unique index forced transaction wait behavior outside of the internal section

2022-07-15 Thread David G. Johnston
On Thu, Jul 14, 2022 at 12:18 PM Bruce Momjian  wrote:

> On Mon, Jul 11, 2022 at 05:22:41PM +0300, Aleksander Alekseev wrote:
> > Hi Bruce,
> >
> > > I was not happy with putting this in the Transaction Isolation section.
> > > I rewrote it and put it in the INSERT secion, right before ON CONFLICT;
> > > patch attached.
> >
> > Looks good.
>
> Applied to all supported PG versions.
>
>
Sorry for the delayed response on this but I'm not a fan.  A comment of
some form in transaction isolation seems to make sense (even if not my
original thought...that patch got messed up a bit anyhow), and while having
something in INSERT makes sense, this doesn't seem precise enough.

Comments about locking and modifying rows don't make sense (the issue
isn't limited to ON CONFLICT; simple inserts will wait if they happen to
choose the same key to insert).

I would also phrase it as simply "Tables with a unique index will..." and
not even mention tables that lack a unique index - those don't really exist
and inference of their behavior by contrast seems sufficient.

Sticking close to what you proposed then:

INSERT into tables with a unique index might block when concurrent sessions
are inserting conflicting rows (i.e., have identical values for the unique
index columns) or when there already exists a conflicting row which is in
the process of being deleted.  Details are covered in .
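
For illustration, here is a minimal two-session sketch of the behavior that
wording describes (table name and values are made up):

-- Session 1:
CREATE TABLE accounts (id int PRIMARY KEY);
BEGIN;
INSERT INTO accounts VALUES (1);   -- key is held until commit or abort

-- Session 2:
INSERT INTO accounts VALUES (1);   -- blocks; raises a unique violation if
                                   -- session 1 commits, succeeds if it rolls back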

I can modify my original patch to be shorter and more on-point for
inclusion in the MVCC chapter if there is interest in having a pointer from
there to index-unique-checks as well.  I think such a note regarding
concurrency on an index naturally fits into one of the main pages for
learning about concurrency in PostgreSQL.

David J.


Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Jonathan S. Katz

On 7/15/22 4:36 PM, Tomas Vondra wrote:
> On 7/13/22 17:32, Andres Freund wrote:
>> Hi,
>>
>> On 2022-07-13 09:23:00 -0400, Jonathan S. Katz wrote:
>>> On 7/13/22 12:13 AM, David Rowley wrote:
>>>> On Tue, 12 Jul 2022 at 17:15, David Rowley  wrote:
>>>>> So far only Robert has raised concerns with this regression for PG15
>>>>> (see [2]). Tom voted for leaving things as they are for PG15 in [3].
>>>>> John agrees, as quoted above. Does anyone else have any opinion?
>>>>
>>>> Let me handle this slightly differently.  I've moved the open item for
>>>> this into the "won't fix" section.  If nobody shouts at me for that
>>>> then I'll let that end the debate.  Otherwise, we can consider the
>>>> argument when it arrives.
>>>
>>> The RMT discussed this issue at its meeting today (and a few weeks back --
>>> apologies for not writing sooner). While we agree with your analysis that 1/
>>> this issue does appear to be a corner case and 2/ the benefits outweigh the
>>> risks, we still don't know how prevalent it may be in the wild and the
>>> general impact to user experience.
>>>
>>> The RMT suggests that you make one more pass at attempting to solve it.
>>
>> I think without a more concrete analysis from the RMT that's not really
>> actionable. What risks are we willing to accept to resolve this? This is
>> mostly a question of tradeoffs.
>>
>> Several "senior" postgres hackers looked at this and didn't find any solution
>> that makes sense to apply to 15. I don't think having David butt his head
>> further against this code is likely to achieve much besides a headache.  Note
>> that David already has a patch to address this for 16.
>
> I agree with this. It's not clear to me how we'd assess how prevalent it
> really is (reports on a mailing list surely are not a very great way to
> measure this). My personal opinion is that it's a rare regression. Other
> optimization patches have similar rare regressions, except that David
> spent so much time investigating this one it seems more serious.
>
> I think it's fine to leave this as is. If we feel we have to fix this
> for v15, it's probably best to just apply the v16. I doubt we'll find
> anything simpler.

I think the above is reasonable.

>>> If there does not appear to be a clear path forward, we should at least
>>> document how a user can detect and resolve the issue.
>>
>> To me that doesn't really make sense. We have lots of places where performance
>> changes once you cross some threshold, and lots of those are related to
>> work_mem.

Yes, but in this case this is nonobvious to a user. A sort that may be
performing just fine on a pre-PG15 version is suddenly degraded, and the
user has no guidance as to why or how to remediate.

>> We don't, e.g., provide tooling to detect when performance in aggregation
>> regresses due to crossing work_mem and could be fixed by a tiny increase in
>> work_mem.
>
> Yeah. I find it entirely reasonable to tell people to increase work_mem
> a bit to fix this. The problem is knowing you're affected :-(

This is the concern the RMT discussed. Yes it is reasonable to say
"increase work_mem to XYZ", but how does a user know how to detect and
apply that? This is the part that is worrisome, especially because we
don't have any sense of what the overall impact will be.

Maybe it's not much, but we should document that there is the potential
for a regression.

Jonathan




Re: PG15 beta1 sort performance regression due to Generation context change

2022-07-15 Thread Tomas Vondra



On 7/13/22 17:32, Andres Freund wrote:
> Hi,
> 
> On 2022-07-13 09:23:00 -0400, Jonathan S. Katz wrote:
>> On 7/13/22 12:13 AM, David Rowley wrote:
>>> On Tue, 12 Jul 2022 at 17:15, David Rowley  wrote:
 So far only Robert has raised concerns with this regression for PG15
 (see [2]). Tom voted for leaving things as they are for PG15 in [3].
 John agrees, as quoted above. Does anyone else have any opinion?
>>>
>>> Let me handle this slightly differently.  I've moved the open item for
>>> this into the "won't fix" section.  If nobody shouts at me for that
>>> then I'll let that end the debate.  Otherwise, we can consider the
>>> argument when it arrives.
>>
>> The RMT discussed this issue at its meeting today (and a few weeks back --
>> apologies for not writing sooner). While we agree with your analysis that 1/
>> this issue does appear to be a corner case and 2/ the benefits outweigh the
>> risks, we still don't know how prevalent it may be in the wild and the
>> general impact to user experience.
>>
>> The RMT suggests that you make one more pass at attempting to solve it.
> 
> I think without a more concrete analysis from the RMT that's not really
> actionable. What risks are we willing to accept to resolve this? This is
> mostly a question of tradeoffs.
> 
> Several "senior" postgres hackers looked at this and didn't find any solution
> that makes sense to apply to 15. I don't think having David butt his head
> further against this code is likely to achieve much besides a headache.  Note
> that David already has a patch to address this for 16.
> 

I agree with this. It's not clear to me how we'd assess how prevalent it
really is (reports on a mailing list surely are not a very great way to
measure this). My personal opinion is that it's a rare regression. Other
optimization patches have similar rare regressions, except that David
spent so much time investigating this one it seems more serious.

I think it's fine to leave this as is. If we feel we have to fix this
for v15, it's probably best to just apply the v16. I doubt we'll find
anything simpler.

> 
>> If there does not appear to be a clear path forward, we should at least
>> document how a user can detect and resolve the issue.
> 
> To me that doesn't really make sense. We have lots of places where performance
> changes once you cross some threshold, and lots of those are related to
> work_mem.
> 
> We don't, e.g., provide tooling to detect when performance in aggregation
> regresses due to crossing work_mem and could be fixed by a tiny increase in
> work_mem.
> 

Yeah. I find it entirely reasonable to tell people to increase work_mem
a bit to fix this. The problem is knowing you're affected :-(
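
For what it's worth, a low-tech way to check whether a particular sort is
affected is to compare what EXPLAIN ANALYZE reports before and after bumping
work_mem (table name and setting value are made up):

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM some_table ORDER BY some_col;
-- look for "Sort Method: external merge  Disk: ..." vs "quicksort  Memory: ..."
SET work_mem = '64MB';  -- arbitrary larger value; re-run and compare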


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 13:20:59 -0700, Jacob Champion wrote:
> On 7/15/22 12:11, Andres Freund wrote:
> > This might have been discussed somewhere, but I'm worried about emitting
> > unescaped data from pre-auth clients. What guarantees that subject / issuer
> > name only contain printable ascii-chars? Printing terminal control chars or
> such would not be great, nor would splitting a string at a multi-byte boundary.
> 
> Hm. The last time I asked about that, Magnus pointed out that we reflect
> port->user_name as-is [1], so I kind of stopped worrying about it.

I think we need to fix a number of these. But no, I don't think we should just
add more because we've not been careful in a bunch of other places.


> Is this more dangerous than that?

Hard to say.


> (And do we want to fix it now, regardless?)

Yes.


> What guarantees are we supposed to be making for log encoding?

I don't know, but I don't think not caring at all is a good
option. Particularly for unauthenticated data I'd say that escaping everything
but printable ascii chars is a sensible approach.

Greetings,

Andres Freund




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Jacob Champion
On 7/15/22 12:11, Andres Freund wrote:
> This might have been discussed somewhere, but I'm worried about emitting
> unescaped data from pre-auth clients. What guarantees that subject / issuer
> name only contain printable ascii-chars? Printing terminal control chars or
> such would not be great, nor would splitting a string at a multi-byte boundary.

Hm. The last time I asked about that, Magnus pointed out that we reflect
port->user_name as-is [1], so I kind of stopped worrying about it. Is
this more dangerous than that? (And do we want to fix it now,
regardless?) What guarantees are we supposed to be making for log encoding?

Thanks,
--Jacob

[1]
https://www.postgresql.org/message-id/CABUevExVHryTasKmtJW5RtU-dBesYj4bV7ggpeVMfiPCHCvLNA%40mail.gmail.com




Re: optimize lookups in snapshot [sub]xip arrays

2022-07-15 Thread Andres Freund
Hi,

Sounds worth pursuing.

On 2022-07-13 10:09:50 -0700, Nathan Bossart wrote:
> The attached patch has some key differences from the previous proposal.
> For example, the new patch uses simplehash instead of open-coding a new
> hash table.

+1

> The attached patch waits until a lookup of [sub]xip before generating the
> hash table, so we only need to allocate enough space for the current
> elements in the [sub]xip array, and we avoid allocating extra memory for
> workloads that do not need the hash tables.

Hm. Are there any contexts where we'd not want the potential for failing due
to OOM?

I wonder if we additionally / alternatively could use a faster method of
searching the array linearly, e.g. using SIMD.


Another thing that might be worth looking into is to sort the xip/subxip
arrays into a binary-search optimized layout. That'd make the binary search
faster, wouldn't require additional memory (a boolean indicating whether
sorted somewhere, I guess), and would easily persist across copies of the
snapshot.


> I'm slightly worried about increasing the number of memory
> allocations in this code path, but the results above seemed encouraging on
> that front.

ISTM that the test wouldn't be likely to show those issues.


> These hash tables are regarded as ephemeral; they only live in
> process-local memory and are never rewritten, copied, or
> serialized.

What does rewriting refer to here?

Not convinced that's the right idea in case of copying. I think we often end
up copying snapshots frequently, and building & allocating the hashed xids
separately every time seems not great.


> + snapshot->xiph = NULL;
> + snapshot->subxiph = NULL;

Do we need separate hashes for these? ISTM that if we're overflowed then we
don't need ->subxip[h], and if not, then the action for an xid being in ->xip
and ->subxiph is the same?

Greetings,

Andres Freund




Re: MERGE and parsing with prepared statements

2022-07-15 Thread David G. Johnston
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby  wrote:

>
> That appears to be copied from the INSERT page.
> What does that mean, if not that data types will be resolved as needed ?
>

Yep, and the system needs to resolve the type at a point where there is no
contextual information and so it chooses text.


> Note that if I add casts to the "ON" condition, MERGE complains about the
> INSERT VALUES.
>
> PREPARE p AS
> MERGE INTO CustomerAccount CA
> USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
> ON CA.CustomerId = T.CustomerId::int
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue)
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue;
>
> ERROR:  column "customerid" is of type integer but expression is of type
> text
> LINE 7:   VALUES (T.CustomerId, T.TransactionValue)
>
>
Noted.  Not surprised.  That error was always present, it's just that the
join happens first.  Since your fix narrowly targeted the join this error
remained to be discovered.

David J.


Re: MERGE and parsing with prepared statements

2022-07-15 Thread David G. Johnston
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby  wrote:

> On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote:
> > On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera 
> wrote:
> > > On 2022-Jul-15, Justin Pryzby wrote:
> > >
> > > > It seems a bit odd that it's impossible to use merge with prepared
> statements
> > > > without specifically casting the source types (which I did now to
> continue my
> > > > experiment).
> > >
> > > I have no comments on this.  Maybe it can be improved, but I don't know
> > > how.
> >
> > Not tested, but the example prepare command fails to make use of the
> > optional data types specification.  Using that should remove the need to
> > cast the parameter placeholder within the query itself.
>
> What optional data type specification ?
>

The one documented here:

https://www.postgresql.org/docs/current/sql-prepare.html

PREPARE name [ ( data_type [, ...] ) ] AS statement
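
Applied to the example upthread, that would look something like this (the
declared parameter types are a guess at whatever CustomerAccount actually
uses):

PREPARE p (int, numeric) AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId
WHEN NOT MATCHED THEN
  INSERT (CustomerId, Balance)
  VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
  UPDATE SET Balance = Balance + TransactionValue;

With the types declared up front, $1 and $2 are no longer unknown inside the
USING subselect, so nothing should get defaulted to text.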

David J.


Re: POC: GROUP BY optimization

2022-07-15 Thread Tomas Vondra



On 7/15/22 07:18, David Rowley wrote:
> On Thu, 31 Mar 2022 at 12:19, Tomas Vondra
>  wrote:
>> Pushed, after going through the patch once more, running check-world
>> under valgrind, and updating the commit message.
> 
> I'm still working in this area and I noticed that db0d67db2 updated
> some regression tests in partition_aggregate.out without any care as
> to what the test was testing.
> 
> The comment above the test reads:
> 
> -- Without ORDER BY clause, to test Gather at top-most path
> 
> and you've changed the expected plan from being a parallel plan with a
> Gather to being a serial plan. So it looks like the test might have
> become useless.
> 

I agree this is a mistake in db0d67db2 that makes the test useless.

> I see that the original plan appears to come back with some
> adjustments to parallel_setup_cost and parallel_tuple_cost. It seems a
> bit strange to me that the changes with this patch would cause a
> change of plan for this. There is only 1 GROUP BY column in the query
> in question. There's no rearrangement to do with a single column GROUP
> BY.
> 

It might seem a bit strange, but the patch tweaked the costing a bit, so
it's not entirely unexpected. I'd bet the plan cost changed just a teeny
bit, but enough to change the cheapest plan. The costing changed for all
group counts, including a single group.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: MERGE and parsing with prepared statements

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote:
> On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera  
> wrote:
> > On 2022-Jul-15, Justin Pryzby wrote:
> >
> > > It seems a bit odd that it's impossible to use merge with prepared 
> > > statements
> > > without specifically casting the source types (which I did now to 
> > > continue my
> > > experiment).
> >
> > I have no comments on this.  Maybe it can be improved, but I don't know
> > how.
>
> Not tested, but the example prepare command fails to make use of the
> optional data types specification.  Using that should remove the need to
> cast the parameter placeholder within the query itself.

What optional data type specification ?

> That said, in theory the INSERT specification of the MERGE could be used to
> either resolve unknowns or even forcibly convert the data types of the
> relation produced by the USING clause to match the actual types required
> for the INSERT (since that will happen at insert time anyway).

Yeah.  I hadn't looked before, but just noticed this:

https://www.postgresql.org/docs/devel/sql-merge.html
| If the expression for any column is not of the correct data type, automatic 
type conversion will be attempted.

That appears to be copied from the INSERT page.
What does that mean, if not that data types will be resolved as needed ?

Note that if I add casts to the "ON" condition, MERGE complains about the
INSERT VALUES.

PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId::int
WHEN NOT MATCHED THEN
  INSERT (CustomerId, Balance)
  VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
  UPDATE SET Balance = Balance + TransactionValue;

ERROR:  column "customerid" is of type integer but expression is of type text
LINE 7:   VALUES (T.CustomerId, T.TransactionValue)

postgres: pryzbyj postgres [local] PREPARE(transformAssignedExpr+0x3b6) [0x5605f9699e8e]
postgres: pryzbyj postgres [local] PREPARE(transformInsertRow+0x2ce) [0x5605f9653a47]
postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x7ec) [0x5605f968fe3b]
postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5605f9656071]
postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5605f9657350]
postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5605f9657385]
postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5605f96574f4]
postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5605f991edfe]
postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x5605f96f4155]
postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5605f99251a0]
postgres: pryzbyj postgres [local] PREPARE(ProcessUtility+0xdb) [0x5605f992587e]

-- 
Justin




Re: Move Section 9.27.7 (Data Object Management Functions) to System Information Chapter

2022-07-15 Thread David G. Johnston
On Thu, Jul 14, 2022 at 3:57 PM Tom Lane  wrote:

> Bruce Momjian  writes:
> > On Mon, Apr 25, 2022 at 08:33:47AM -0700, David G. Johnston wrote:
> >> Both the location and name of the linked to section make no sense to me:
> >> https://www.postgresql.org/docs/current/functions-admin.html#
> >> FUNCTIONS-ADMIN-DBOBJECT
> >> Neither of the tables listed there manage (cause to change) anything.
> They are
> >> pure informational functions - size and path of objects respectively.
> It
> >> belongs in the previous chapter "System Information Functions and
> Operators"
> >> with a different name.
>
> > So, the section title is:
> >   9.27.7. Database Object Management Functions
> > I think the idea is that they _help_ to manage database objects by
> > reporting their size or location.  I do think it is in the right
> > chapter, but maybe needs a better title?  I can't think of one.
>
> I'm hesitant to move functions to a different documentation page
> without a really solid reason.  Just a couple days ago I fielded a
> complaint from somebody who couldn't find string_to_array anymore
> because we'd moved it from "array functions" to "string functions".
>
> I'd be the first to say that the division between 9.26 and 9.27 is
> pretty arbitrary ... but without a clearer demarcation rule,
> moving functions between the two pages seems more likely to
> add confusion than subtract it.
>
>
I'm not going to fight the prevailing winds on this one, much...but I've
probably been sitting on this annoyance for years since I use the ToC to
find stuff fairly quickly in the docs.  This case seems much more clear-cut
to me than deciding whether a function that converts a string into an array
belongs in the string chapter or the array chapter.

On a related note, why itemize 9.27 in the table of contents but not 9.26?

I would ask that we at least rename it to:

Disk Usage Functions

Since this would show in the ToC, the names of the functions that allow one
to compute disk usage (a question I probably see once a year, and what
motivates this request) would be more likely to be found without skimming
the entire 9.26 chapter (since I cannot see those table headings in the ToC),
failing to find it, and then stumbling upon it in a table that only deals
with sizes but whose header says nothing about sizes.
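
For reference, these are the kind of calls I mean, which all live under that
heading today (table name is made up):

SELECT pg_size_pretty(pg_total_relation_size('some_table'));  -- heap + indexes + TOAST
SELECT pg_size_pretty(pg_database_size(current_database()));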

David J.


Re: MERGE and parsing with prepared statements

2022-07-15 Thread David G. Johnston
On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera 
wrote:

> On 2022-Jul-15, Justin Pryzby wrote:
>
> > It seems a bit odd that it's impossible to use merge with prepared
> statements
> > without specifically casting the source types (which I did now to
> continue my
> > experiment).
>
> I have no comments on this.  Maybe it can be improved, but I don't know
> how.
>
>
Not tested, but the example prepare command fails to make use of the
optional data types specification.  Using that should remove the need to
cast the parameter placeholder within the query itself.

That said, in theory the INSERT specification of the MERGE could be used to
either resolve unknowns or even forcibly convert the data types of the
relation produced by the USING clause to match the actual types required
for the INSERT (since that will happen at insert time anyway).  This would
make the UPDATE behavior slightly different than a top-level UPDATE command
though, since that does not have the same context information.

David J.


Re: MERGE and parsing with prepared statements

2022-07-15 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Jul-15, Justin Pryzby wrote:
>> I see now that the same thing can happen with "ON CONFLICT" if used with a
>> subselect.
>> 
>> PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
>> ON CONFLICT (i) DO UPDATE SET i=excluded.i;
>> ERROR:  column "i" is of type integer but expression is of type text

> Right, I didn't think that MERGE was doing anything peculiar in this
> respect.

Yeah.  The current theory about this is that if we haven't assigned a
type to an unknown-type parameter (or literal) that is an output
column of a sub-SELECT, we will as a rule force it to text.
That MERGE USING clause is a sub-SELECT, so that rule applies.

There is a hoary old exception to that rule, which is that if you
write INSERT INTO tab SELECT ..., $1, ...
we will figure out the type of the column of "tab" that $1 is going
into, and force $1 to that type instead of text.  It looks like this
also works in INSERT ... VALUES.  You could make a case that MERGE
should be equally smart, but it's not clear to me that the info is
available sufficiently close by to make it reasonable to do that.
It looks like the MERGE syntax has a couple of levels of indirection,
which'd probably be enough to put the kibosh on that idea -- in
particular, AFAICS there might not be a unique target column
corresponding to a given data_source column.
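
To make the contrast concrete, with a hypothetical table t(i int):

PREPARE p1 AS INSERT INTO t VALUES ($1);       -- $1 resolved as integer
PREPARE p2 AS INSERT INTO t SELECT $1;         -- likewise, via the INSERT exception
PREPARE p3 AS INSERT INTO t
  SELECT a FROM (SELECT $1::int AS a) s;       -- inside a sub-SELECT the cast
                                               -- has to be spelled out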

regards, tom lane




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 09:46:40 -0700, Jacob Champion wrote:
> On 7/15/22 09:34, Peter Eisentraut wrote:
> > Committed like that.
> 
> Thanks for all the reviews!

This might have been discussed somewhere, but I'm worried about emitting
unescaped data from pre-auth clients. What guarantees that subject / issuer
name only contain printable ascii-chars? Printing terminal control chars or
such would not be great, nor would splitting a string at a multi-byte boundary.

Greetings,

Andres Freund




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 11:59:41 -0400, Melanie Plageman wrote:
> I'm not sure about the idea of prefixing the IOOp and IOPath enums with
> Pg_Stat. I could see them being used outside of statistics (though they
> are defined in pgstat.h)

+1


> From Andres:
> 
> Quoting me (Melanie):
> > > Introduce "IOOp", an IO operation done by a backend, and "IOPath", the
> > > location or type of IO done by a backend. For example, the checkpointer
> > > may write a shared buffer out. This would be counted as an IOOp write on
> > > an IOPath IOPATH_SHARED by BackendType "checkpointer".
> 
> > I'm still not 100% happy with IOPath - seems a bit too easy to confuse
> with
> > the file path. What about 'origin'?
> 
> I can see the point about IOPATH.
> I'm not wild about origin mostly because of the number of O's given that
> IO Operation already has two O's. It gets kind of hard to read when
> using Pascal Case: IOOrigin and IOOp.
> Also, it doesn't totally make sense for alloc. I could be convinced,
> though.
> 
> IOSOURCE doesn't have the O problem but does still not make sense for
> alloc. I also thought of IOSITE and IOVENUE.

I like "source" - not too bothered by the alloc aspect. I can also see
"context" working.


> > Annoying question: pg_stat_io vs pg_statio? I'd not think of suggesting
> the
> > latter, except that we already have a bunch of views with that prefix.
> 
> As far as pg_stat_io vs pg_statio, they are the only stats views which
> don't have an underscore between stat and the rest of the view name, so
> perhaps we should move away from statio to stat_io going forward anyway.
> I am imagining adding to them with other iostat type metrics once direct
> IO is introduced, so they may well be changing soon anyway.

I don't think I have strong opinions on this one. I can see arguments for
either naming.

Greetings,

Andres Freund




Re: MERGE and parsing with prepared statements

2022-07-15 Thread Alvaro Herrera
On 2022-Jul-15, Justin Pryzby wrote:

> On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote:

> Thanks for looking into it.

Definitely!  Thanks, Matthias.

> I see now that the same thing can happen with "ON CONFLICT" if used with a
> subselect.
> 
> PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
>  ON CONFLICT (i) DO UPDATE SET i=excluded.i;
> ERROR:  column "i" is of type integer but expression is of type text

Right, I didn't think that MERGE was doing anything peculiar in this
respect.

> It seems a bit odd that it's impossible to use merge with prepared statements
> without specifically casting the source types (which I did now to continue my
> experiment).

I have no comments on this.  Maybe it can be improved, but I don't know
how.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?

2022-07-15 Thread Maciek Sakrejda
I ran that original test case with and without the patch. Here are the
numbers I'm seeing:

master (best of three):

postgres=# SELECT count(*) FROM lotsarows;
Time: 582.423 ms

postgres=# EXPLAIN (ANALYZE, TIMING OFF) SELECT count(*) FROM lotsarows;
Time: 616.102 ms

postgres=# EXPLAIN (ANALYZE, TIMING ON) SELECT count(*) FROM lotsarows;
Time: 1068.700 ms (00:01.069)

patched (best of three):

postgres=# SELECT count(*) FROM lotsarows;
Time: 550.822 ms

postgres=# EXPLAIN (ANALYZE, TIMING OFF) SELECT count(*) FROM lotsarows;
Time: 612.572 ms

postgres=# EXPLAIN (ANALYZE, TIMING ON) SELECT count(*) FROM lotsarows;
Time: 690.875 ms

On Fri, Jul 1, 2022 at 10:26 AM Andres Freund  wrote:
> On 2022-07-01 01:23:01 -0700, Lukas Fittl wrote:
>...
> > Known WIP problems with this patch version:
> >
> > * There appears to be a timing discrepancy I haven't yet worked out, where
> >   the \timing data reported by psql doesn't match what EXPLAIN ANALYZE is
> >   reporting. With Andres' earlier test case, I'm seeing a consistent ~700ms
> >   higher for \timing than for the EXPLAIN ANALYZE time reported on the
> > server
> >   side, only when rdtsc measurement is used -- its likely there is a problem
> >   somewhere with how we perform the cycles to time conversion
>
> Could you explain a bit more what you're seeing? I just tested your patches
> and didn't see that here.

I did not see this either, but I did see that the execution time
reported by \timing is (for this test case) consistently 0.5-1ms
*lower* than the Execution Time reported by EXPLAIN. I did not see
that on master. Is that expected?

Thanks,
Maciek




Re: MERGE and parsing with prepared statements

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote:
> On Thu, 14 Jul 2022, 18:26 Justin Pryzby,  wrote:
> >
> > Why is $1 construed to be of type text ?
> 
> The select statement that generates the row type of T `(select $1 CID,
> $2 TxV) AS T` does not put type bounds on the input parameters, so it
> remains `unknown` for the scope of that subselect. Once stored into
> the row type, the type of that column defaults to text, as a row type
> should not have 'unknown'-typed columns.

Thanks for looking into it.

I see now that the same thing can happen with "ON CONFLICT" if used with a
subselect.

PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
 ON CONFLICT (i) DO UPDATE SET i=excluded.i;
ERROR:  column "i" is of type integer but expression is of type text

It seems a bit odd that it's impossible to use merge with prepared statements
without specifically casting the source types (which I did now to continue my
experiment).
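
For completeness, the explicitly-cast variant that does parse looks like this
(still assuming t(i int) with a unique index on i):

PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1::int AS a)a
 ON CONFLICT (i) DO UPDATE SET i=excluded.i;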

-- 
Justin




Re: Transparent column encryption

2022-07-15 Thread Jacob Champion
On 7/12/22 11:29, Peter Eisentraut wrote:
> 
> Updated patch, to resolve some merge conflicts.

Thank you for working on this; it's an exciting feature.

> The CEK key
> material is in turn encrypted by an asymmetric key called the column
> master key (CMK).

I'm not yet understanding why the CMK is asymmetric. Maybe you could use
the public key to add ephemeral, single-use encryption keys that no one
but the private key holder could use (after you forget them on your
side, that is). But since the entire column is encrypted with a single
CEK, you would essentially only be able to do that if you created an
entirely new column or table; do I have that right?

I'm used to public keys being safe for... publication, but if I'm
understanding correctly, it's important that the server admin doesn't
get hold of the public key for your CMK, because then they could
substitute their own CEKs transparently and undermine future encrypted
writes. That seems surprising. Am I just missing something important
about RSAES-OAEP?

> +#define PG_CEK_AEAD_AES_128_CBC_HMAC_SHA_256   130
> +#define PG_CEK_AEAD_AES_192_CBC_HMAC_SHA_384   131
> +#define PG_CEK_AEAD_AES_256_CBC_HMAC_SHA_384   132
> +#define PG_CEK_AEAD_AES_256_CBC_HMAC_SHA_512   133

It looks like these ciphersuites were abandoned by the IETF. Are there
existing implementations of them that have been audited/analyzed? Are
they safe (and do we know that the claims made in the draft are
correct)? How do they compare to other constructions like AES-GCM-SIV
and XChacha20-Poly1305?

> +-- \gencr
> +-- (This just tests the parameter passing; there is no encryption here.)
> +CREATE TABLE test_gencr (a int, b text);
> +INSERT INTO test_gencr VALUES (1, 'one') \gencr
> +SELECT * FROM test_gencr WHERE a = 1 \gencr
> + a |  b
> +---+-
> + 1 | one
> +(1 row)
> +
> +INSERT INTO test_gencr VALUES ($1, $2) \gencr 2 'two'
> +SELECT * FROM test_gencr WHERE a IN ($1, $2) \gencr 2 3
> + a |  b
> +---+-
> + 2 | two
> +(1 row)
I'd expect \gencr to error out without sending plaintext. I know that
under the hood this is just setting up a prepared statement, but if I'm
using \gencr, presumably I really do want to be encrypting my data.
Would it be a problem to always set force-column-encryption for the
parameters we're given here? Any unencrypted columns could be provided
directly.

Another idle thought I had was that it'd be nice to have some syntax for
providing a null value to \gencr (assuming I didn't overlook it in the
patch). But that brings me to...

> +  
> +   Null values are not encrypted by transparent column encryption; null 
> values
> +   sent by the client are visible as null values in the database.  If the 
> fact
> +   that a value is null needs to be hidden from the server, this information
> +   needs to be encoded into a nonnull value in the client somehow.
> +  

This is a major gap, IMO. Especially with the switch to authenticated
ciphers, because it means you can't sign your NULL values. And having
each client or user that's out there solve this with a magic in-band
value seems like a recipe for pain.

Since we're requiring "canonical" use of text format, and the docs say
there are no embedded or trailing nulls allowed in text values, could we
steal the use of a single zero byte to mean NULL? One additional
complication would be that the client would have to double-check that
we're not writing a NULL into a NOT NULL column, and complain if it
reads one during decryption. Another complication would be that the
client would need to complain if it got a plaintext NULL.

(The need for robust client-side validation of encrypted columns might
be something to expand on in the docs more generally, since before this
feature, it could probably be assumed that the server was buggy if it
sent you unparsable junk in a column.)

> +   
> +The associated data in these algorithms consists of 4
> +bytes: The ASCII letters P and G
> +(byte values 80 and 71), followed by the algorithm ID as a 16-bit 
> unsigned
> +integer in network byte order.
> +   

Is this AD intended as a placeholder for the future, or does it serve a
particular purpose?

Thanks,
--Jacob




Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-15 11:25:54 +0200, Matthias van de Meent wrote:
> On Mon, 14 Mar 2022 at 18:14, Andres Freund  wrote:
> >
> > A random thought I had while thinking about the size limits: We could use 
> > the
> > low bits of the length and xl_prev to store XLR_SPECIAL_REL_UPDATE |
> > XLR_CHECK_CONSISTENCY and give rmgrs the full 8 bit of xl_info. Which would
> > allow us to e.g. get away from needing Heap2. Which would aesthetically be
> > pleasing.
> 
> I just remembered your comment while going through the xlog code and
> thought this about the same issue: We still have 2 bytes of padding in
> XLogRecord, between xl_rmid and xl_crc. Can't we instead use that
> space for rmgr-specific flags, as opposed to stealing bits from
> xl_info?

Sounds like a good idea to me. I'm not sure who is stealing bits from what
right now, but it clearly seems worthwhile to separate "flags" from "record
type within rmgr".

I think we should split it at least into three things:

1) generic per-record flags for xlog machinery (ie. XLR_SPECIAL_REL_UPDATE, 
XLR_CHECK_CONSISTENCY)
2) rmgr record type identifier (e.g. XLOG_HEAP_*)
3) rmgr specific flags (e.g. XLOG_HEAP_INIT_PAGE)

Greetings,

Andres Freund




interrupt processing during FATAL processing

2022-07-15 Thread Andres Freund
Hi,

One of the problems we found ([1]) when looking at spurious failures of the
recovery conflict test ([2]) is that a single backend can FATAL out multiple
times.  That seems independent enough from that thread that I thought it best
to raise separately.

The problem is that during the course of FATAL processing, we end up
processing interrupts, which then can process the next recovery conflict
interrupt. This happens because while we send the FATAL to the client
(errfinish()->EmitErrorReport()) we'll process interrupts in various
places. If e.g. a new recovery conflict interrupt has been raised (which
startup.c does at an absurd rate), that'll then trigger a new FATAL.

Sources for recursive processing of interrupts are e.g. < ERROR ereports like
the COMMERROR in internal_flush().

A similar problem does *not* exist for ERROR, because errfinish()
PG_RE_THROW()s before the EmitErrorReport() and PostgresMain() does
HOLD_INTERRUPTS() first thing after sigsetjmp().

One might reasonably think that the proc_exit_inprogress logic in die(),
RecoveryConflictInterrupt() etc.  protects us against this - but it doesn't,
because we've not yet set it when doing EmitErrorReport(), it gets set later
during proc_exit().

I'm not certain what the best way to address this is.

One approach would be to put a HOLD_INTERRUPTS() before EmitErrorReport() when
handling a FATAL error. I'm a bit worried this might make it harder to
interrupt FATAL errors when they're blocking sending the message to the client
- ProcessClientWriteInterrupt() won't do its thing. OTOH, it looks like we
already have that problem if there's a FATAL after the sigsetjmp() block did
HOLD_INTERRUPTS(), because erfinish() won't reset InterruptHoldoffCount like
it does for ERROR.

Another approach would be to set proc_exit_inprogress earlier, before the
EmitErrorReport(). That's a bit ugly, because it ties ipc.c more closely to
elog.c, but it also "feels" correct to me. OTOH, it's at best a partial
protection, because it doesn't prevent already pending interrupts from being
processed.

I guess we could instead add a dedicated variable indicating whether we're
currently processing a FATAL error? I was considering exposing a function
checking whether elog's recursion_depth is != 0, and short-circuiting
ProcessInterrupts() based on that, but I don't think that'd be good - we want
to be able to interrupt some super-noisy NOTICE or such.


I suspect that we should, even if it does not address this issue, reset
InterruptHoldoffCount in errfinish() for FATALs, similar to ERRORs, so that
FATALs can benefit from the logic in ProcessClientWriteInterrupt().

Greetings,

Andres Freund

[1] https://postgr.es/m/20220701231833.vh76xkf3udani3np%40alap3.anarazel.de
[2] clearly we needed that test urgently, but I can't deny regretting the
consequence of having to fix the plethora of bugs it's uncovering




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Simon Riggs
On Fri, 15 Jul 2022 at 12:25, Simon Riggs  wrote:
>
> On Mon, 4 Jul 2022 at 08:00, Michael Paquier  wrote:
> >
> > On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote:
> > > This is marked as Ready for Committer, but that seems unduly
> > > optimistic.
> >
> > Please note that patch authors should not switch a patch as RfC by
> > themselves.  This is something that a reviewer should do.
> >
> > > The cfbot shows that it's failing on all platforms ---
> > > and not in the same way on each, suggesting there are multiple
> > > problems.
> >
> > A wild guess is that this comes from the patch that manipulates
> > get_database_name(), something that there is no need for as long as
> > the routine is called once in ReindexMultipleTables().
>
> OK, let me repost the new patch and see if CFbot likes that better.
>
> This includes Michael's other requested changes for reindexdb.

That's fixed it on the CFbot. Over to you, Michael. Thanks.

-- 
Simon Riggs                http://www.EnterpriseDB.com/




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Simon Riggs
On Fri, 15 Jul 2022 at 15:03, Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > Ah, one theory is that the libpq_pipeline program is getting linked to
> > an installed version of libpq that doesn't contain the fixes.  Maybe you
> > can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it
> > is picking up?
>
> That's pronounced "otool -L" on macOS.  But in any case, it's going
> to point at the installation directory.  One of the moving parts here
> is that "make check" will try to override the rpath that otool tells
> you about to make test programs use the libpq.dylib from the build tree.
> I say "try" because if you've got SIP enabled (see what "csrutil status"
> tells you), it will fail to do so and the installed libpq will be used.
> Maybe that's old.
>
> Standard recommendation on macOS with SIP on is to always do "make
> install" before "make check".

Thanks, will investigate.

-- 
Simon Riggs                http://www.EnterpriseDB.com/




Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Jacob Champion
On 7/15/22 09:34, Peter Eisentraut wrote:
> Committed like that.

Thanks for all the reviews!

--Jacob





Re: [PATCH] Log details for client certificate failures

2022-07-15 Thread Peter Eisentraut

On 14.07.22 23:09, Jacob Champion wrote:
> On Thu, Jul 14, 2022 at 1:12 PM Peter Eisentraut
>  wrote:
>> Concretely, I was thinking like the attached top-up patch.
>>
>> The other way can surely be made to work somehow, but this seems much
>> simpler and with fewer questions about the details.
>
> Ah, seeing it side-by-side helps. That's much easier, I agree.

Committed like that.





Re: [RFC] building postgres with meson -v9

2022-07-15 Thread Andres Freund
Hi,

On 2022-07-13 08:39:45 +0200, Peter Eisentraut wrote:
> On 06.07.22 15:21, Andres Freund wrote:
> > > - This patch is for unifying the list of languages in NLS, as
> > >previously discussed:https://commitfest.postgresql.org/38/3737/
> > There seems little downside to doing so, so ...
> 
> This has been committed, so on the next rebase, the languages arguments can
> be removed from the i18n.gettext() calls.

That's done in v10 I posted yesterday.


On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> 3bf5b317d5 meson: prereq: Specify output directory for psql sql help script.
> 2e5ed807f8 meson: prereq: ecpg: add and use output directory argument for
> parse.pl.
> 4e7fab01c5 meson: prereq: msvc: explicit output file for pgflex.pl
> cdcd3da4c4 meson: prereq: add output path arg in generate-lwlocknames.pl
> 1f655486e4 meson: prereq: generate-errcodes.pl: accept output file
> e834c48758 meson: prereq: unicode: allow to specify output directory.
> 
> You said you are still finalizing these.  I think we can move ahead with
> these once that is done.

I tried to address the pending things in these patches.


> 9f4a9b1749 meson: prereq: move snowball_create.sql creation into perl file.
> 
> This looks ready, except I think it needs to be hooked into the
> distprep target, since it's now a Perl script running at build time.

Done. I vacillated between doing the distprep rule in
src/backend/snowball/Makefile (as most things outside of the backend do) and
src/backend/Makefile (most backend things). Ended up with doing it in
snowball/Makefile.

Greetings,

Andres Freund




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-07-15 Thread Melanie Plageman
I am consolidating the various naming points from this thread into one
email:

From Horiguchi-san:

> A bit different thing, but I felt a little uneasy about some uses of
> "pgstat_io_ops". IOOp looks like a neighbouring word of IOPath. On the
> other hand, actually iopath is used as an attribute of io_ops in many
> places.  Couldn't we be more consistent about the relationship between
> the names?
>
> IOOp   -> PgStat_IOOpType
> IOPath -> PgStat_IOPath
> PgStat_IOOpCounters -> PgStat_IOCounters
> PgStat_IOPathOps    -> PgStat_IO
> pgstat_count_io_op  -> pgstat_count_io

So, because of the way the data structures contain arrays of each other
the naming was meant to specify all the information contained in the
data structure:

PgStat_IOOpCounters are all IOOp (I could see removing the word
"counters" from the name for more consistency)

PgStat_IOPathOps are all IOOp for all IOPath

PgStat_BackendIOPathOps are all IOOp for all IOPath for all BackendType

The downside of this naming is that, when choosing a local variable name
for all of the IOOp for all IOPath for a single BackendType,
"backend_io_path_ops" seems accurate but is actually confusing if the
type name for all IOOp for all IOPath for all BackendType is
PgStat_BackendIOPathOps.

I would be open to changing PgStat_BackendIOPathOps to PgStat_IO, but I
don't see how I could omit Path or Op from PgStat_IOPathOps without
making its meaning unclear.

I'm not sure about the idea of prefixing the IOOp and IOPath enums with
Pg_Stat. I could see them being used outside of statistics (though they
are defined in pgstat.h) and could see myself using them in, for
example, calculations for the prefetcher.

From Andres:

Quoting me (Melanie):
> > Introduce "IOOp", an IO operation done by a backend, and "IOPath", the
> > location or type of IO done by a backend. For example, the checkpointer
> > may write a shared buffer out. This would be counted as an IOOp write on
> > an IOPath IOPATH_SHARED by BackendType "checkpointer".

> I'm still not 100% happy with IOPath - seems a bit too easy to confuse
with
> the file path. What about 'origin'?

I can see the point about IOPATH.
I'm not wild about origin mostly because of the number of O's given that
IO Operation already has two O's. It gets kind of hard to read when
using Pascal Case: IOOrigin and IOOp.
Also, it doesn't totally make sense for alloc. I could be convinced,
though.

IOSOURCE doesn't have the O problem but does still not make sense for
alloc. I also thought of IOSITE and IOVENUE.

> Annoying question: pg_stat_io vs pg_statio? I'd not think of suggesting
the
> latter, except that we already have a bunch of views with that prefix.

As far as pg_stat_io vs pg_statio, they are the only stats views which
don't have an underscore between stat and the rest of the view name, so
perhaps we should move away from statio to stat_io going forward anyway.
I am imagining adding to them with other iostat type metrics once direct
IO is introduced, so they may well be changing soon anyway.

- Melanie


Re: Adjust ndistinct for eqjoinsel

2022-07-15 Thread Tom Lane
Zhenghua Lyu  writes:
> I ran the TPC-DS benchmark for Postgres and found that the join size estimation
> has several problems.
> For example, ndistinct is key to the join selectivity estimation, but this
> value does not take into account the restrictions on the rel; I hit some cases
> in the function eqjoinsel where nd is much larger than vardata.rel->rows.

> Accurate estimation needs a good math model that considers the dependency
> between the join var and the vars in the restrictions.
> But at least, ndistinct should not be greater than the number of rows.

> See the attached patch to adjust nd in eqjoinsel.

We're very unlikely to accept this with no test case and no explanation
of why it's not an overcorrection.  get_variable_numdistinct already
clamps its result to rel->tuples, and I think that by using rel->rows
instead you are probably double-counting the selectivity of the rel's
restriction clauses.

See the sad history of commit 7f3eba30c, which did something
pretty close to this and eventually got almost entirely reverted
(97930cf57, 0d3b231ee).  I'd be the first to agree that better
estimates here would be great, but it's not as simple as it looks.

regards, tom lane




Re: Add function to return backup_label and tablespace_map

2022-07-15 Thread Stephen Frost
Greetings,

* Fujii Masao (masao.fu...@oss.nttdata.com) wrote:
> On 2022/07/08 23:11, David Steele wrote:
> >Looks like I made that more complicated than it needed to be:
> >
> >select * from pg_backup_stop(...) \gset
> >\pset tuples_only on
> >\pset format unaligned
> >\o /backup_path/backup_label
> >select :'labelfile';
> 
> Thanks! I had completely forgotten \gset command.

Seems like it might make sense to consider using a better format for
these files and also to allow us to more easily add things in the future
(ending LSN, ending time, etc) for backup tools to be able to leverage.
Perhaps we should change this to having just a single file returned,
instead of two, and use JSON for it, as a more general and extensible
format that we've already got code to work with..?

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Support load balancing in libpq

2022-07-15 Thread Jelte Fennema
> we can assume that one of the members is a primary and the others are secondaries.

With plain Postgres this assumption is probably correct. But the main reason
I'm interested in this patch is that I would like to be able to load
balance across the workers in a Citus cluster. These workers are all primaries.
Similar usage would likely be possible with BDR (bidirectional replication).

> In this case the user may add a primary host to the top of the list,
> so sorting may increase the time needed to establish a connection.

If the user takes such care when building their host list, they could simply
not add the load_balance_hosts=true option to the connection string.
If you know there's only one primary in the list and you're looking for
the primary, then there's no reason to use load_balance_hosts=true.
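
As a concrete illustration, the proposed option would be used in a connection
string roughly like this (host names are made up):

    postgresql://worker1.example.com,worker2.example.com,worker3.example.com/mydb?load_balance_hosts=true

Without the option, libpq keeps trying the listed hosts in order, which is what
you want when the first entry is known to be the primary.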


Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-15 Thread Masahiko Sawada
On Fri, Jul 15, 2022 at 10:43 PM osumi.takami...@fujitsu.com
 wrote:
>
> On Thursday, July 14, 2022 10:31 AM Masahiko Sawada  
> wrote:
> > I've attached an updated patch that incorporated comments from Amit and Shi.
> Hi,
>
>
> Minor comments for v4.

Thank you for the comments!

>
> (1) typo in the commit message
>
> "When decoding a COMMIT record, we check both the list and the ReorderBuffer 
> to see if
> if the transaction has modified catalogs."
>
> There are two 'if's in succession in the last sentence of the second 
> paragraph.
>
> (2) The header comment for the spec test
>
> +# Test that decoding only the commit record of the transaction that have
> +# catalog-changed.
>
> Rewording of this part looks required, because "test that ... " requires a 
> complete sentence
> after that, right ?
>
>
> (3) SnapBuildRestore
>
> snapshot_not_interesting:
> if (ondisk.builder.committed.xip != NULL)
> pfree(ondisk.builder.committed.xip);
> return false;
> }
>
> Do we need to add pfree for ondisk.builder.catchange.xip after the 
> 'snapshot_not_interesting' label ?
>
>
> (4) SnapBuildPurgeOlderTxn
>
> +   elog(DEBUG3, "purged catalog modifying transactions from %d to %d",
> +        (uint32) builder->catchange.xcnt, surviving_xids);
>
> To make this part more aligned with existing codes,
> probably we can have a look at another elog for debug in the same function.
>
> We should use %u for casted xcnt & surviving_xids,
> while adding a format for xmin if necessary ?

I agreed with all the above comments and incorporated them into the
updated patch.

This patch should have the fix for the issue that Shi yu reported. Shi
yu, could you please test it again with this patch?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v5-0001-Add-catalog-modifying-transactions-to-logical-dec.patch
Description: Binary data


Re: Add 64-bit XIDs into PostgreSQL 15

2022-07-15 Thread Aleksander Alekseev
Hi hackers,

> I can agree with you that sending rebased patches too often can be a little 
> annoying. On the other hand, otherwise, it's just red in Cfbot. I suppose 
> it's much easier and more comfortable to review the patches that at least 
> apply cleanly and pass all tests. So if Cfbot is red for a long time I feel 
> we need to send a rebased patchset anyway.
>
> I'll try not to do this too often, but frankly, I don't see a better
> alternative at the moment.

Considering the overall activity on the mailing list personally I
don't see a problem here. Several extra emails don't bother me at all,
but I would like to see a green cfbot report for an open item in the
CF application. Otherwise someone will complain that the patch doesn't
apply anymore and the result will be the same as for sending an
updated patch, except that we will receive at least two emails instead
of one.

-- 
Best regards,
Aleksander Alekseev




Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-07-15 Thread David G. Johnston
On Fri, Jul 15, 2022 at 6:27 AM Japin Li  wrote:

>
> >
> > +servers.  If setting max_wal_senders to
> > +0 consider also reducing the amount of WAL produced
> > +by changing wal_level to minimal.
> >
> > I don't think this is great advice.  It will encourage people to use
> > wal_level = minimal even if they have other requirements that weigh
> > against it.  If they feel that their system is producing too much
> > WAL, I doubt they'll have a hard time finding the wal_level knob.
> >
>
> Agreed. It isn't good advice.  We can remove the suggestion.
>
>
Yeah, I wrote that thinking that max_wal_senders being set to 0 implied
archive_mode = off, but that isn't the case.

David J.


Adjust ndistinct for eqjoinsel

2022-07-15 Thread Zhenghua Lyu
Hi,
I ran the TPC-DS benchmark for Postgres and found that the join size estimation
has several problems.
For example, ndistinct is key to the join selectivity estimation, but this value
does not take into account the restrictions on the rel; I hit some cases in the
function eqjoinsel where nd is much larger than vardata.rel->rows.

Accurate estimation needs a good math model that considers the dependency
between the join var and the vars in the restrictions.
But at least, ndistinct should not be greater than the number of rows.

See the attached patch to adjust nd in eqjoinsel.
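
Roughly, the adjustment is a clamp of this shape (a sketch of the idea only, not
the attached patch; variable names follow the existing eqjoinsel code):

    /* ndistinct should not exceed the row estimate of the rel */
    if (vardata1.rel && nd1 > vardata1.rel->rows)
        nd1 = vardata1.rel->rows;
    if (vardata2.rel && nd2 > vardata2.rel->rows)
        nd2 = vardata2.rel->rows;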

Best,
Zhenghua Lyu


0001-Adjust-ndistinct-with-nrows-in-the-rel-when-estimati.patch
Description:  0001-Adjust-ndistinct-with-nrows-in-the-rel-when-estimati.patch


Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Tom Lane
Alvaro Herrera  writes:
> Ah, one theory is that the libpq_pipeline program is getting linked to
> an installed version of libpq that doesn't contain the fixes.  Maybe you
> can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it
> is picking up?

That's pronounced "otool -L" on macOS.  But in any case, it's going
to point at the installation directory.  One of the moving parts here
is that "make check" will try to override the rpath that otool tells
you about to make test programs use the libpq.dylib from the build tree.
I say "try" because if you've got SIP enabled (see what "csrutil status"
tells you), it will fail to do so and the installed libpq will be used.
Maybe that's old.

Standard recommendation on macOS with SIP on is to always do "make
install" before "make check".
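
For reference, the two checks mentioned above look like this on macOS (the
binary path is just an example from an in-tree build):

    otool -L src/test/modules/libpq_pipeline/libpq_pipeline
    csrutil status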

regards, tom lane




Re: SQL/JSON documentation JSON_TABLE

2022-07-15 Thread Andrew Dunstan


On 2022-07-15 Fr 02:20, Erik Rijkers wrote:
> On 7/14/22 17:45, Andrew Dunstan wrote:
>>
>>
>> Here's a patch that deals with most of this. There's one change you
>> wanted that I don't think is correct, which I omitted.
>>
>> [json-docs-fix.patch]
>
> Thanks, much better. I also agree that the change I proposed (and you
> omitted) wasn't great (although it leaves the paragraph somewhat
> orphaned - but maybe it isn't too bad.).
>
> I've now compared our present document not only with the original doc
> as produced by Nikita Glukhov et al in 2018,  but also with the ISO
> draft from 2017 (ISO/IEC TR 19075-6 (JSON) for JavaScript Object).
>
> I think we can learn a few things from that ISO draft's JSON_TABLE
> text. Let me copy-paste its first explicatory paragraph on JSON_TABLE:
>
> -- [ ISO SQL/JSON draft 2017 ] -
> Like the other JSON querying operators, JSON_TABLE begins with <JSON API common syntax> to specify the context item, path expression and
> PASSING clause. The path expression in this case is more accurately
> called the row pattern path expression. This path expression is
> intended to produce an SQL/JSON sequence, with one SQL/JSON item for
> each row of the output table.
>
> The COLUMNS clause can define two kinds of columns: ordinality columns
> and regular columns.
>
> An ordinality column provides a sequential numbering of rows. Row
> numbering is 1-based.
>
> A regular column supports columns of scalar type. The column is
> produced using the semantics of JSON_VALUE. The column has an optional
> path expression, called the column pattern, which can be defaulted
> from the column name. The column pattern is used to search for the
> column within the current SQL/JSON item produced by the row pattern.
> The column also has optional ON EMPTY and ON ERROR clauses, with the
> same choices and semantics as JSON_VALUE.
> --
>
>
> So, where the ISO draft introduces the term 'row pattern' it /also/
> introduces the term 'column pattern' close by, in the next paragraph.
>
> I think our docs too should have both terms.  The presence of both
> 'row pattern' and 'column pattern' immediately makes their meanings
> obvious.  At the moment our docs only use the term 'row pattern', for
> all the JSON_TABLE json path expressions (also those in the COLUMN
> clause, it seems).
>
>
> At the moment, we say, in the JSON_TABLE doc:
> 
> To split the row pattern into columns, json_table provides the COLUMNS
> clause that defines the schema of the created view.
> 
>
> I think that to use 'row pattern' here is just wrong, or at least
> confusing.  The 'row pattern' is /not/ the data as produced from the
> json expression; the 'row pattern' /is/ the json path expression. 
> (ISO draft: 'The path expression in this case is more accurately
> called the row pattern path expression.' )
>
> If you agree with my reasoning I can try to rewrite these bits in our
> docs accordingly.
>
>
>

Yes, please do.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-07-15 Thread osumi.takami...@fujitsu.com
On Thursday, July 14, 2022 10:31 AM Masahiko Sawada  
wrote:
> I've attached an updated patch that incorporated comments from Amit and Shi.
Hi,


Minor comments for v4.

(1) typo in the commit message

"When decoding a COMMIT record, we check both the list and the ReorderBuffer to 
see if
if the transaction has modified catalogs."

There are two 'if's in succession in the last sentence of the second paragraph.

(2) The header comment for the spec test

+# Test that decoding only the commit record of the transaction that have
+# catalog-changed.

Rewording of this part looks required, because "test that ... " requires a 
complete sentence
after that, right ?


(3) SnapBuildRestore

snapshot_not_interesting:
if (ondisk.builder.committed.xip != NULL)
pfree(ondisk.builder.committed.xip);
return false;
}

Do we need to add pfree for ondisk.builder.catchange.xip after the 
'snapshot_not_interesting' label ?
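
Something along these lines, as a sketch of the suggested addition (mirroring
the existing pfree of committed.xip):

snapshot_not_interesting:
	if (ondisk.builder.committed.xip != NULL)
		pfree(ondisk.builder.committed.xip);
	if (ondisk.builder.catchange.xip != NULL)
		pfree(ondisk.builder.catchange.xip);
	return false;
}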


(4) SnapBuildPurgeOlderTxn

+   elog(DEBUG3, "purged catalog modifying transactions from %d to %d",
+        (uint32) builder->catchange.xcnt, surviving_xids);

To make this part more aligned with existing codes,
probably we can have a look at another elog for debug in the same function.

We should use %u for casted xcnt & surviving_xids,
while adding a format for xmin if necessary ?


Best Regards,
Takamichi Osumi



Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-07-15 Thread Japin Li


On Fri, 15 Jul 2022 at 08:49, Bruce Momjian  wrote:
> On Tue, Jul  5, 2022 at 08:02:33PM -0400, Tom Lane wrote:
>> "Precondition" is an overly fancy word that makes things less clear
>> not more so.  Does it mean that setting wal_level = minimal will fail
>> if you don't do these other things, or does it just mean that you
>> won't be getting the absolute minimum WAL volume?  If the former,
>> I think it'd be better to say something like "To set wal_level to minimal,
>> you must also set [these variables], which has the effect of disabling
>> both WAL archiving and streaming replication."
>
> I have created the attached patch to try to improve this text.

IMO we can add the following sentence to the wal_level description, since
if wal_level = minimal and max_wal_senders > 0, the database cannot start.

To set wal_level to minimal, you must also set max_wal_senders to 0,
which has the effect of disabling both WAL archiving and streaming
replication.
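
For illustration, the combination being described is (an illustrative
postgresql.conf fragment, not part of the patch):

    wal_level = minimal
    max_wal_senders = 0     # required, otherwise the server refuses to start
    archive_mode = off      # archive_mode = on requires wal_level above minimal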

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: EINTR in ftruncate()

2022-07-15 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Jul 15, 2022 at 9:34 AM Tom Lane  wrote:
>> (Someday we oughta go ahead and make our Windows signal API look more
>> like POSIX, as I suggested back in 2015.  I'm still not taking
>> point on that, though.)

> For the sigprocmask() part, here's a patch that passes CI.  Only the
> SIG_SETMASK case is actually exercised by our current code, though.

Passes an eyeball check, but I can't actually test it.

regards, tom lane




Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-07-15 Thread Japin Li


Sorry for the late reply.

On Wed, 06 Jul 2022 at 08:02, Tom Lane  wrote:
> Japin Li  writes:
>> [ v4-wal-level-documentation.patch ]
>
> Hm, I don't care for the wording here:
>
> +A precondition for using minimal WAL is to disable WAL archiving and
> +streaming replication by setting archive_mode to
> +off, and max_wal_senders to
> +0.
>
> "Precondition" is an overly fancy word that makes things less clear
> not more so.  Does it mean that setting wal_level = minimal will fail
> if you don't do these other things, or does it just mean that you
> won't be getting the absolute minimum WAL volume?  If the former,
> I think it'd be better to say something like "To set wal_level to minimal,
> you must also set [these variables], which has the effect of disabling
> both WAL archiving and streaming replication."

Yeah, it's the former case.

>
> +servers.  If setting max_wal_senders to
> +0 consider also reducing the amount of WAL produced
> +by changing wal_level to minimal.
>
> I don't think this is great advice.  It will encourage people to use
> wal_level = minimal even if they have other requirements that weigh
> against it.  If they feel that their system is producing too much
> WAL, I doubt they'll have a hard time finding the wal_level knob.
>

Agreed. It isn't good advice.  We can remove the suggestion.


--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Alvaro Herrera
On 2022-Jul-15, Alvaro Herrera wrote:

> Not sure what to make of this.  Maybe the fix in 054325c5eeb3 is not
> right, but I don't understand why it doesn't fail in the macOS machines
> in the buildfarm.

Ah, one theory is that the libpq_pipeline program is getting linked to
an installed version of libpq that doesn't contain the fixes.  Maybe you
can do `ldd /path/to/libpq_pipeline` and see which copy of libpq.so it
is picking up?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)




Re: remove reset_shared()

2022-07-15 Thread Pavel Borisov
On Fri, 15 Jul 2022 at 16:41, Maxim Orlov  wrote:

> Hi!
>
> In general I'm for this patch. Some time ago I was working on a patch
> related to shared memory and saw no reason to have the reset_shared() function.
>

Hi, hackers!
I see the proposed patch as uncontroversial and good enough to be
committed. It will make the code a little clearer. Personally, I don't like
leaving functions that are just wrappers around another function and are called
only once. But if there's a question of code readability, it's not bad to
restore the comments on the purpose of the call that were originally in the
code.

PFA v2 of the patch (only the comment removed in v1 is restored in v2).

Overall I'd like to move it to RfC if no one objects.


v2-0001-Remove-a-wrapper-for-CreateSharedMemoryAndSemapho.patch
Description: Binary data


Re: remove reset_shared()

2022-07-15 Thread Maxim Orlov
Hi!

In general I'm for this patch. Some time ago I was working on a patch
related to shared memory and saw no reason to have the reset_shared() function.

-- 
Best regards,
Maxim Orlov.


Re: Add 64-bit XIDs into PostgreSQL 15

2022-07-15 Thread Pavel Borisov
On Fri, 15 Jul 2022 at 16:17, Justin Pryzby  wrote:

> On Fri, Jul 15, 2022 at 03:23:29PM +0400, Pavel Borisov wrote:
> > On Fri, 15 Jul 2022 at 15:20, Pavel Borisov 
> wrote:
> > > Hi, hackers!
> > > v42 stopped applying, so we rebased it to v43. Attached is a GitHub
> link,
> > > but I see Cfbot hasn't become green. Apparently, it hasn't seen
> changes in
> > > GitHub link relative to v42 attached as a patch.
> >
> > Github link is as follows:
> > https://github.com/ziva777/postgres/tree/64xid-cf
> > Maybe this will enable CFbot to see it.
>
> My suggestion was a bit ambiguous, sorry for the confusion.
>
> The "Git link" is just an annotation - cfbot doesn't try to do anything
> with
> it[0] (but cirrusci will run the same checks under your github account).
> What I meant was that it doesn't seem important to send rebases multiple
> times
> per week if there have been no other changes.  Anyone is still able to review
> the
> patch.  It's possible to build it by applying the patch to a checkout of
> the
> postgres tree at the time the patch was sent.  Or by using the git link.
> Or by
> checking out cfbot's branch here, from the last time it *did* apply.
> https://github.com/postgresql-cfbot/postgresql/tree/commitfest/38/3594
>
> [0] (I think cfbot *could* try to read the git link, and apply patches
> that it
> finds there, but that's an idea that hasn't been proposed or discussed.
> It'd
> need to know which branch to use, and it'd need to know when to use the git
> link and when to use the most-recent email attachments).
>

Hi, Justin!

I can agree with you that sending rebased patches too often can be a little
annoying. On the other hand, otherwise, it's just red in Cfbot. I suppose
it's much easier and more comfortable to review the patches that at least
apply cleanly and pass all tests. So if Cfbot is red for a long time I feel
we need to send a rebased patchset anyway.

I'll try not to do this too often, but frankly, I don't see a better
alternative at the moment.

Anyway, big thanks for your advice and attention to this thread!

-- 
Best regards,
Pavel Borisov


Re: Add 64-bit XIDs into PostgreSQL 15

2022-07-15 Thread Justin Pryzby
On Fri, Jul 15, 2022 at 03:23:29PM +0400, Pavel Borisov wrote:
> On Fri, 15 Jul 2022 at 15:20, Pavel Borisov  wrote:
> > Hi, hackers!
> > v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
> > but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
> > GitHub link relative to v42 attached as a patch.
> 
> Github link is as follows:
> https://github.com/ziva777/postgres/tree/64xid-cf
> Maybe this will enable CFbot to see it.

My suggestion was a bit ambiguous, sorry for the confusion.

The "Git link" is just an annotation - cfbot doesn't try to do anything with
it[0] (but cirrusci will run the same checks under your github account).
What I meant was that it doesn't seem important to send rebases multiple times
per week if there have been no other changes.  Anyone is still able to review the
patch.  It's possible to build it by applying the patch to a checkout of the
postgres tree at the time the patch was sent.  Or by using the git link.  Or by
checking out cfbot's branch here, from the last time it *did* apply.
https://github.com/postgresql-cfbot/postgresql/tree/commitfest/38/3594

[0] (I think cfbot *could* try to read the git link, and apply patches that it
finds there, but that's an idea that hasn't been proposed or discussed.  It'd
need to know which branch to use, and it'd need to know when to use the git
link and when to use the most-recent email attachments).

-- 
Justin




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-15 Thread Alvaro Herrera
On 2022-Jul-15, Alvaro Herrera wrote:

> However, looking closer I noticed that on Windows we use our own
> readdir() implementation, which AFAICT includes everything to handle
> reparse points as symlinks correctly in get_dirent_type.  Which means
> that do_pg_start_backup is wasting its time with the "#ifdef WIN32" bits
> to handle junction points separately.  We could just do this
> 
> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index b809a2152c..4966213fde 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -8302,13 +8302,8 @@ do_pg_backup_start(const char *backupidstr, bool fast, TimeLineID *starttli_p,
>  		 * we sometimes use allow_in_place_tablespaces to create
>  		 * directories directly under pg_tblspc, which would fail below.
>  		 */
> -#ifdef WIN32
> -		if (!pgwin32_is_junction(fullpath))
> -			continue;
> -#else
>  		if (get_dirent_type(fullpath, de, false, ERROR) != PGFILETYPE_LNK)
>  			continue;
> -#endif
>  
>  #if defined(HAVE_READLINK) || defined(WIN32)
>  		rllen = readlink(fullpath, linkpath, sizeof(linkpath));
> 
> And everything should continue to work.

Hmm, but it does not:
https://cirrus-ci.com/build/4824963784900608

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Alvaro Herrera
On 2022-Jul-15, Alvaro Herrera wrote:

> > Yeh, repeated failures on MacOS Catalina with HEAD.
> 
> Can you share the contents of src/test/modules/libpq_pipeline/tmp_check?
> Since these failures are not visible in the buildfarm (where we do have
> 11.0 as well as some 10.X versions of macOS), this is surprising.

So I got the log files, and the error is clear:

: # Running: libpq_pipeline -r 700 -t 
/Users/sriggs/pg/pg-git/postgresql/src/test/modules/libpq_pipeline/tmp_check/traces/pipeline_idle.trace
 pipeline_idle port=57444 
host=/var/folders/rd/kxv86w7567z9jk5qt7lhxfwrgn/T/Od6QFSH7TB 
dbname='postgres'
: 
: pipeline idle...
: NOTICE 1: message type 0x33 arrived from server while idle
: 
: libpq_pipeline:1037: got 1 notice(s)
: [12:17:00.181](0.016s) not ok 9 - libpq_pipeline pipeline_idle

then the trace test also fails, but only because it is truncated at the
point where the notice is reported; up to that point, the trace matches
correctly.

Not sure what to make of this.  Maybe the fix in 054325c5eeb3 is not
right, but I don't understand why it doesn't fail in the macOS machines
in the buildfarm.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Escucha y olvidarás; ve y recordarás; haz y entenderás" (Confucio)




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Alvaro Herrera
On 2022-Jul-15, Simon Riggs wrote:

> > Test Summary Report
> > ---
> > t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2)
> >   Failed tests:  9-10
> >   Non-zero exit status: 2
> > Files=1, Tests=20,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.66 cusr
> >  0.64 csys =  1.32 CPU)
> > Result: FAIL
> 
> Yeh, repeated failures on MacOS Catalina with HEAD.

Can you share the contents of src/test/modules/libpq_pipeline/tmp_check?
Since these failures are not visible in the buildfarm (where we do have
11.0 as well as some 10.X versions of macOS), this is surprising.

Anyway, the errors shown for your patch in the cfbot are in the
pg_upgrade test, so I suggest to have a look at the logs for those
anyway.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Y una voz del caos me habló y me dijo
"Sonríe y sé feliz, podría ser peor".
Y sonreí. Y fui feliz.
Y fue peor.




Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-07-15 Thread Aleksander Alekseev
Pavel, Justin,

>> Is there any reason to continue with two separate threads and CF entries ?
>> The original reason was to have a smaller patch for considerate late in v15.

> I see the main goal of this split is to make discussion of this (easier) 
> thread separate from the discussion of the whole patchset, which is expected to be 
> more thorough.

+1. This was done per explicit request in the first thread.

>> But right now, it just seems to cause every update to imply two email 
>> messages
>> rather than one.
>>
>> Also, since this patch series is large, and expects a lot of conflicts, it
>> seems better to update the cfapp with a "git link" [0] where you can 
>> maintain a
>> rebased branch.  It avoids the need to mail new patches to the list several
>> times more often than it's being reviewed.

Yep, this is a bit inconvenient. Personally I didn't expect that
merging the patches in this thread would take this long. They have been in
"Ready for Committer" state for a long time now, and there are no known
issues with them other than that the unit tests for SLRU [1] should be merged
first.

I suggest we use "git link" for the larger patchset in the other
thread since I'm not contributing to it right now and all in all that
thread is waiting for this one. For this thread we continue using
patches since several people contribute to them.

[1]: https://commitfest.postgresql.org/38/3608/

-- 
Best regards,
Aleksander Alekseev




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Simon Riggs
On Mon, 4 Jul 2022 at 08:00, Michael Paquier  wrote:
>
> On Sun, Jul 03, 2022 at 05:41:31PM -0400, Tom Lane wrote:
> > This is marked as Ready for Committer, but that seems unduly
> > optimistic.
>
> Please note that patch authors should not switch a patch as RfC by
> themselves.  This is something that a reviewer should do.
>
> > The cfbot shows that it's failing on all platforms ---
> > and not in the same way on each, suggesting there are multiple
> > problems.
>
> A wild guess is that this comes from the patch that manipulates
> get_database_name(), something that there is no need for as long as
> the routine is called once in ReindexMultipleTables().

OK, let me repost the new patch and see if CFbot likes that better.

This includes Michael's other requested changes for reindexdb.

-- 
Simon Riggshttp://www.EnterpriseDB.com/


reindex_not_require_database_name.v5.patch
Description: Binary data


Re: Add 64-bit XIDs into PostgreSQL 15

2022-07-15 Thread Pavel Borisov
On Fri, 15 Jul 2022 at 15:20, Pavel Borisov  wrote:

> Hi, hackers!
> v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
> but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
> GitHub link relative to v42 attached as a patch.
>

Github link is as follows:
https://github.com/ziva777/postgres/tree/64xid-cf
Maybe this will enable CFbot to see it.

-- 
Best regards,
Pavel Borisov


Re: Add 64-bit XIDs into PostgreSQL 15

2022-07-15 Thread Pavel Borisov
Hi, hackers!
v42 stopped applying, so we rebased it to v43. Attached is a GitHub link,
but I see Cfbot hasn't become green. Apparently, it hasn't seen changes in
GitHub link relative to v42 attached as a patch.

-- 
Best regards,
Pavel Borisov


Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Simon Riggs
On Fri, 15 Jul 2022 at 12:15, Simon Riggs  wrote:
>
> On Fri, 15 Jul 2022 at 09:12, Alvaro Herrera  wrote:
> >
> > On 2022-Jul-15, Simon Riggs wrote:
> >
> > > On Fri, 15 Jul 2022 at 04:44, Michael Paquier  wrote:
> >
> > > > This patch has been marked as waiting for a review, however the CF bot
> > > > is completely red:
> > >
> > > Yes, it is failing, but so is current HEAD, with some kind of libpq
> > > pipelining error.
> >
> > Hmm, is it?  Where can I see that?  The buildfarm looks OK ...
>
> I got this... so cleaning up and retesting.
>
> # Looks like you failed 2 tests of 20.
> t/001_libpq_pipeline.pl .. Dubious, test returned 2 (wstat 512, 0x200)
>
> Failed 2/20 subtests
>
> Test Summary Report
> ---
> t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2)
>   Failed tests:  9-10
>   Non-zero exit status: 2
> Files=1, Tests=20,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.66 cusr
>  0.64 csys =  1.32 CPU)
> Result: FAIL

Yeh, repeated failures on MacOS Catalina with HEAD.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Allowing REINDEX to have an optional name

2022-07-15 Thread Simon Riggs
On Fri, 15 Jul 2022 at 09:12, Alvaro Herrera  wrote:
>
> On 2022-Jul-15, Simon Riggs wrote:
>
> > On Fri, 15 Jul 2022 at 04:44, Michael Paquier  wrote:
>
> > > This patch has been marked as waiting for a review, however the CF bot
> > > is completely red:
> >
> > Yes, it is failing, but so is current HEAD, with some kind of libpq
> > pipelining error.
>
> Hmm, is it?  Where can I see that?  The buildfarm looks OK ...

I got this... so cleaning up and retesting.

# Looks like you failed 2 tests of 20.
t/001_libpq_pipeline.pl .. Dubious, test returned 2 (wstat 512, 0x200)

Failed 2/20 subtests

Test Summary Report
---
t/001_libpq_pipeline.pl (Wstat: 512 Tests: 20 Failed: 2)
  Failed tests:  9-10
  Non-zero exit status: 2
Files=1, Tests=20,  4 wallclock secs ( 0.02 usr  0.00 sys +  0.66 cusr
 0.64 csys =  1.32 CPU)
Result: FAIL

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2022-07-15 Thread Dmitry Koval

This is not a review, but I think the isolation tests should be
expanded.  At least, include the case of serializable transactions being
involved.


Thanks!
I will expand the tests for the next commitfest.

--
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-15 Thread Alvaro Herrera
On 2022-Jul-15, Kyotaro Horiguchi wrote:

> 0002:
> +is_path_tslink(const char *path)
> 
> What does the "ts" of tslink stand for? If it stands for tablespace, the
> function is not specific to tablespaces. We already have 
> 
> + errmsg("could not stat file \"%s\": 
> %m", path));
> 
> I'm not sure we need such correctness, but what is failing there is
> lstat.  I found similar code in two places in the backend and one place
> in the frontend. So couldn't it be moved to /common and have a more
> generic name?

I wondered whether it'd be better to check whether get_dirent_type
returns PGFILETYPE_LNK.  However, that doesn't deal with junction points
at all, which seems pretty odd ... I mean, isn't it rather useful as an
abstraction if it doesn't abstract away the one platform-dependent point
we have in the area?

However, looking closer I noticed that on Windows we use our own
readdir() implementation, which AFAICT includes everything to handle
reparse points as symlinks correctly in get_dirent_type.  Which means
that do_pg_start_backup is wasting its time with the "#ifdef WIN32" bits
to handle junction points separately.  We could just do this

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b809a2152c..4966213fde 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8302,13 +8302,8 @@ do_pg_backup_start(const char *backupidstr, bool fast, TimeLineID *starttli_p,
 		 * we sometimes use allow_in_place_tablespaces to create
 		 * directories directly under pg_tblspc, which would fail below.
 		 */
-#ifdef WIN32
-		if (!pgwin32_is_junction(fullpath))
-			continue;
-#else
 		if (get_dirent_type(fullpath, de, false, ERROR) != PGFILETYPE_LNK)
 			continue;
-#endif
 
 #if defined(HAVE_READLINK) || defined(WIN32)
 		rllen = readlink(fullpath, linkpath, sizeof(linkpath));


And everything should continue to work.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




RE: Collect ObjectAddress for ATTACH DETACH PARTITION to use in event trigger

2022-07-15 Thread kuroda.hay...@fujitsu.com
Dear Hou-san,

> Thanks for having a look. It was a bit difficult to add a test for this,
> because we currently don't have a user function which can return these
> collected ObjectAddresses for ALTER TABLE. And it seems we don't have tests
> for
> already collected ObjectAddresses as well :(
> The collected ObjectAddresses is in
> "currentEventTriggerState->currentCommand->d.alterTable.subcmds.address" while
> the public function pg_event_trigger_ddl_commands doesn't return these
> information. It can only be used in user defined event trigger function (C
> code).

Thanks for explaining. I did not know the reason why the test is not in 
event_trigger.sql.

> If we want to add some tests for both already existed and newly added
> ObjectAddresses, we might need to add some test codes in test_ddl_deparse.c.
> What do you think ?

I thought tests for ObjectAddresses should be added to test_ddl_deparse.c, but
the change might become large because there are many ATExecXXX() functions.
I thought they could be added separately in another thread or patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-07-15 Thread Matthias van de Meent
On Mon, 14 Mar 2022 at 18:14, Andres Freund  wrote:
>
> A random thought I had while thinking about the size limits: We could use the
> low bits of the length and xl_prev to store XLR_SPECIAL_REL_UPDATE |
> XLR_CHECK_CONSISTENCY and give rmgrs the full 8 bit of xl_info. Which would
> allow us to e.g. get away from needing Heap2. Which would aesthetically be
> pleasing.

I just remembered your comment while going through the xlog code and
thought this about the same issue: We still have 2 bytes of padding in
XLogRecord, between xl_rmid and xl_crc. Can't we instead use that
space for rmgr-specific flags, as opposed to stealing bits from
xl_info?
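
For reference, the layout in question is roughly this (abridged from
xlogrecord.h):

typedef struct XLogRecord
{
	uint32		xl_tot_len;		/* total len of entire record */
	TransactionId xl_xid;		/* xact id */
	XLogRecPtr	xl_prev;		/* ptr to previous record in log */
	uint8		xl_info;		/* flag bits */
	RmgrId		xl_rmid;		/* resource manager for this record */
	/* 2 bytes of padding here, initialized to zero */
	pg_crc32c	xl_crc;			/* CRC for this record */
} XLogRecord;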


Kind regards,

Matthias van de Meent




Re: MERGE and parsing with prepared statements

2022-07-15 Thread Matthias van de Meent
On Thu, 14 Jul 2022, 18:26 Justin Pryzby,  wrote:
>
> We've used INSERT ON CONFLICT for a few years (with partitions as the target).
> That's also combined with prepared statements, for bulk loading.
>
> I was looking to see if we should use MERGE (probably not, but looking 
> anyway).
> And came across this behavior.  I'm not sure if it's any issue.
>
> CREATE TABLE CustomerAccount (CustomerId int, Balance float);
>
> PREPARE p AS
> MERGE INTO CustomerAccount CA
> USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
> ON CA.CustomerId = T.CustomerId
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue)
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue;
>
> ERROR:  operator does not exist: integer = text
> LINE 3: ON CA.CustomerId = T.CustomerId
>
> Why is $1 construed to be of type text ?

The select statement that generates the row type of T `(select $1 CID,
$2 TxV) AS T` does not put type bounds on the input parameters, so they
remain `unknown` for the scope of that subselect. Once stored into
the row type, the type of each such column defaults to text, as a row type
should not have 'unknown'-typed columns.

You'll see the same issue with other subselects that select input
parameters without casts, such as `select a from (select $1 a) A where
A.a = 1;`. It's a pre-existing issue that has popped up earlier, and I
think it's not something we've planned to fix in backbranches.
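
A common workaround is to cast the parameters inside the subselect so they never
fall back to text, e.g. (a sketch based on the statement above):

PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1::int AS CustomerId, $2::float8 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId
WHEN NOT MATCHED THEN
  INSERT (CustomerId, Balance)
  VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
  UPDATE SET Balance = Balance + TransactionValue;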

Kind regards,

Matthias van de Meent




Re: Problem about postponing gathering partial paths for topmost scan/join rel

2022-07-15 Thread Richard Guo
On Fri, Jul 15, 2022 at 4:03 PM Richard Guo  wrote:

>
> On Thu, Jul 14, 2022 at 10:02 PM Antonin Houska  wrote:
>
>> I'd prefer a test that demonstrates that the Gather node at the top of the
>> "subproblem plan" is useful purely from the *cost* perspective, rather
>> than
>> due to executor limitation.
>
>
> This patch provides an additional path (Gather atop of subproblem) which
> was not available before. But your concern makes sense that we need to
> show this new path is valuable from competing on cost with other paths.
>
> How about we change to Nested Loop at the topmost? Something like:
>

Maybe a better example is to use a small table 'c' to avoid a Gather node
above the scan of 'c', so that a parallel nestloop path can be generated.

set join_collapse_limit to 2;

# explain (costs off) select * from a join b on a.i = b.i join c on b.i > c.i;
           QUERY PLAN
-----------------------------------------
 Nested Loop
   Join Filter: (b.i > c.i)
   ->  Seq Scan on c
   ->  Gather
 Workers Planned: 4
 ->  Parallel Hash Join
   Hash Cond: (a.i = b.i)
   ->  Parallel Seq Scan on a
   ->  Parallel Hash
 ->  Parallel Seq Scan on b
(10 rows)

Thanks
Richard


Re: Add last_vacuum_index_scans in pg_stat_all_tables

2022-07-15 Thread Ken Kato

On 2022-07-09 03:18, Peter Geoghegan wrote:
On Fri, Jul 8, 2022 at 10:47 AM Alvaro Herrera 
 wrote:

Saving some sort of history would be much more useful, but of course a
lot more work.


Thank you for the comments!
Yes, having some sort of history would be ideal in this case.
However, I am not sure how to implement those features at this moment, 
so I will take some time to consider.


At the same time, I think having this metric exposed in
pg_stat_all_tables comes in handy when tuning
maintenance_work_mem/autovacuum_work_mem, even though it shows the value
of only the last vacuum/autovacuum.


Regards,

--
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [Commitfest 2022-07] Begins Now

2022-07-15 Thread Wenjing Zeng
Hi Jacob

About the Global temporary table patch:
https://commitfest.postgresql.org/36/2349/#

Please move the Global Temporary Table entry to the next commitfest (2022-08).
I need more time to work through the existing issues.

Thanks
Wenjing


> 2022年7月9日 07:42,Jacob Champion  写道:
> 
> On 7/1/22 08:08, Jacob Champion wrote:
>> It's been July everywhere on Earth for a few hours, so the July
>> commitfest is now in progress:
>> 
>>https://commitfest.postgresql.org/38/
> One week down, three to go.
> 
> I forgot to put the overall status in the last email. We started the
> month with the following stats:
> 
>Needs review:           214
>Waiting on Author:        36
>Ready for Committer:      23
>Committed:                21
>Moved to next CF:          1
>Withdrawn:                 5
>Rejected:                  2
>Returned with Feedback:    3
>                         ---
>Total:                   305
> 
> And as of this email, we're now at
> 
>Needs review:           193
>Waiting on Author:        38
>Ready for Committer:      24
>Committed:                37
>Moved to next CF:          2
>Withdrawn:                 6
>Rejected:                  2
>Returned with Feedback:    3
>                         ---
>Total:                   305
> 
> That's sixteen patchsets committed in the first week.
> 
> Have a good weekend,
> --Jacob
> 
> 


