src/test/modules/test_regex/test_regex.c typo issue

2023-02-03 Thread jian he
Hi,
in src/test/modules/test_regex/test_regex.c

/* Report individual info bit states */
for (inf = infonames; inf->bit != 0; inf++)
{
if (cpattern->re_info & inf->bit)
{
if (flags->info & inf->bit)
elems[nresults++] = 
PointerGetDatum(cstring_to_text(inf->text));
else
{
snprintf(buf, sizeof(buf), "unexpected %s!", 
inf->text);
elems[nresults++] = 
PointerGetDatum(cstring_to_text(buf));
}
}
else
{
if (flags->info & inf->bit)
{
snprintf(buf, sizeof(buf), "missing %s!", 
inf->text);
elems[nresults++] = 
PointerGetDatum(cstring_to_text(buf));
}
}
}

I think "expected" should be replaced with "missing".
the "missing" should be replaced with "expected".


Re: Remove unused code related to unknown type

2023-02-03 Thread Peter Eisentraut

On 03.02.23 00:12, Tom Lane wrote:

Peter Eisentraut  writes:

Here is a patch that removes some unused leftovers from commit
cfd9be939e9c516243c5b6a49ad1e1a9a38f1052 (old).


Ugh.  Those are outright wrong now, aren't they?  Better nuke 'em.


done





RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-03 Thread Takamichi Osumi (Fujitsu)
Hi,


On wangw.f...@fujitsu.com Amit Kapila  wrote:
> On Fri, Feb 3, 2023 at 3:12 PM wangw.f...@fujitsu.com
>  wrote:
> >
> > Here is a comment:
> >
> > 1. The checks in function AlterSubscription
> > +   /*
> > +* The combination of parallel
> streaming mode and
> > +* min_apply_delay is not
> allowed. See
> > +* parse_subscription_options
> for details of the reason.
> > +*/
> > +   if (opts.streaming ==
> LOGICALREP_STREAM_PARALLEL)
> > +   if
> ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
> opts.min_apply_delay > 0) ||
> > +
> > + (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
> > + sub->minapplydelay > 0))
> > and
> > +   /*
> > +* The combination of parallel
> streaming mode and
> > +* min_apply_delay is not
> allowed.
> > +*/
> > +   if (opts.min_apply_delay > 0)
> > +   if
> ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming ==
> LOGICALREP_STREAM_PARALLEL) ||
> > +
> > + (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream ==
> > + LOGICALREP_STREAM_PARALLEL))
> >
> > I think the case where the options "min_apply_delay>0" and
> "streaming=parallel"
> > are set at the same time seems to have been checked in the function
> > parse_subscription_options, how about simplifying these two
> > if-statements here to the following:
> > ```
> > if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
> > !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
> > sub->minapplydelay > 0)
> >
> > and
> >
> > if (opts.min_apply_delay > 0 &&
> > !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
> > sub->stream == LOGICALREP_STREAM_PARALLEL) ```
> >
> 
> Won't just checking if ((opts.streaming ==
> LOGICALREP_STREAM_PARALLEL && sub->minapplydelay > 0) ||
> (opts.min_apply_delay > 0 && sub->stream ==
> LOGICALREP_STREAM_PARALLEL)) be sufficient in that case?
We need checks for !IsSet(). If we don't have those,
we error out when executing the alter subscription with min_apply_delay = 0
and streaming = parallel, at the same time for a subscription whose 
min_apply_delay
setting is bigger than 0, for instance. In this case, we pass (don't error out)
parse_subscription_options()'s test for the combination of mutual exclusive 
options
and then, error out the condition by matching the first condition
opts.streaming == parallel and sub->minapplydelay > 0 above.

Also, the Wang-san's refactoring proposal makes sense. Adopted.
Regarding the style how to write min_apply_delay > 0
(or just putting min_apply_delay in 'if' conditions) for checking parameters,
I agreed with Amit-san so I keep them as it is in the latest patch v27.


Kindly have a look at v27 posted in [1]

[1] - 
https://www.postgresql.org/message-id/TYCPR01MB83738F2BEF83DE525410E3ACEDD49%40TYCPR01MB8373.jpnprd01.prod.outlook.com



Best Regards,
Takamichi Osumi





RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-03 Thread Takamichi Osumi (Fujitsu)
Hi,


On Friday, February 3, 2023 3:35 PM I wrote:
> On Friday, February 3, 2023 2:21 PM Amit Kapila 
> wrote:
> > On Fri, Feb 3, 2023 at 6:41 AM Peter Smith 
> wrote:
> > > On Thu, Feb 2, 2023 at 7:21 PM Takamichi Osumi (Fujitsu)
> > >  wrote:
> > > ...
> > > > > Besides, I am not sure it's a stable test to check the log. Is
> > > > > it possible that there's no such log on a slow machine? I
> > > > > modified the code to sleep 1s at the beginning of
> > > > > apply_dispatch(), then the new added test failed because the server
> log cannot match.
> > > > To get the log by itself is necessary to ensure that the delay is
> > > > conducted by the apply worker, because we emit the diffms only if
> > > > it's bigger than 0 in maybe_apply_delay(). If we omit the step, we
> > > > are not sure the delay is caused by other reasons or the
> > > > time-delayed
> > feature.
> > > >
> > > > As you mentioned, it's possible that no log is emitted on slow
> > > > machine. Then, the idea to make the test safer for such machines
> > > > should
> > be to make the delayed time longer.
> > > > But we shortened the delay time to 1 second to mitigate the long
> > > > test
> > execution time of this TAP test.
> > > > So, I'm not sure if it's a good idea to make it longer again.
> > >
> > > I think there are a couple of things that can be done about this problem:
> > >
> > > 1. If you need the code/test to remain as-is then at least the test
> > > message could include some comforting text like "(this can fail on
> > > slow machines when the delay time is already exceeded)" so then a
> > > test failure will not cause undue alarm.
> > >
> > > 2. Try moving the DEBUG2 elog (in function maybe_apply_delay) so
> > > that it will *always* log the remaining wait time even if that wait
> > > time becomes negative. Then I think the test cases can be made
> > > deterministic instead of relying on good luck. This seems like the
> > > better option.
> > >
> >
> > I don't understand why we have to do any of this instead of using 3s
> > as min_apply_delay similar to what we are doing in
> > src/test/recovery/t/005_replay_delay. Also, I think we should use
> > exactly the same way to verify the test even though we want to keep
> > the log level as
> > DEBUG2 to check logs in case of any failures.
> OK, will try to make our tests similar to the tests in 005_replay_delay as 
> much
> as possible.
I've updated the TAP test and made it aligned with 005_reply_delay.pl.

For coverage, I have the stream of in-progress transaction test case
and ALTER SUBSCRIPTION DISABLE behavior, which is unique to logical replication.
Also, conducted pgindent and pgperltidy. Note that the latter half of the
005_reply_delay.pl doesn't seem to match with the test for time-delayed logical 
replication
(e.g. promotion). So, I don't have those points.

Kindly have a look at the attached v27.


Best Regards,
Takamichi Osumi



v27-0001-Time-delayed-logical-replication-subscriber.patch
Description:  v27-0001-Time-delayed-logical-replication-subscriber.patch


Re: run pgindent on a regular basis / scripted manner

2023-02-03 Thread Noah Misch
On Fri, Feb 03, 2023 at 12:52:50PM -0500, Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 2023-01-22 Su 17:47, Tom Lane wrote:
> >> Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
> >> another assignment in a series of assignments, it is very likely to
> >> reformat all the adjacent assignments because it thinks it's cool to
> >> make all the equal signs line up.  That's just awful.
> 
> > Modern versions of perltidy give you much more control over this, so 
> > maybe we need to investigate the possibility of updating.
> 
> I have no objection to updating perltidy from time to time. I think the
> idea is just to make sure that we have an agreed-on version for everyone
> to use.

Agreed.  If we're changing the indentation of assignments, that's a
considerable diff already.  It would be a good time to absorb other diffs
we'll want eventually, including diffs from a perltidy version upgrade.




Re: Add progress reporting to pg_verifybackup

2023-02-03 Thread Michael Paquier
On Thu, Feb 02, 2023 at 05:56:47PM +0900, Masahiko Sawada wrote:
> Seems a good idea. Please find an attached patch.

That seems rather OK seen from here.  I'll see about getting that
applied except if there is an objection of any kind.
--
Michael


signature.asc
Description: PGP signature


Re: recovery modules

2023-02-03 Thread Michael Paquier
On Thu, Feb 02, 2023 at 11:37:00AM -0800, Nathan Bossart wrote:
> On Thu, Feb 02, 2023 at 02:34:17PM +0900, Michael Paquier wrote:
>> Okay, the changes done here look straight-forward seen from here.  I
>> got one small-ish comment.
>> 
>> +basic_archive_startup(ArchiveModuleState *state)
>> +{
>> +   BasicArchiveData *data = palloc0(sizeof(BasicArchiveData));
>> 
>> Perhaps this should use MemoryContextAlloc() rather than a plain
>> palloc().  This should not matter based on the position where the
>> startup callback is called, still that may be a pattern worth
>> encouraging.
> 
> Good call.

+   ArchiveModuleCallbacks struct filled with the callback function pointers for
This needs a structname markup.

+   can use state->private_data to store it.
And here it would be structfield.

As far as I can see, all the points raised about this redesign seem to
have been addressed.  Andres, any comments?
--
Michael


signature.asc
Description: PGP signature


Re: Amcheck verification of GiST and GIN

2023-02-03 Thread Peter Geoghegan
On Thu, Feb 2, 2023 at 12:15 PM Peter Geoghegan  wrote:
> * Why are there only WARNINGs, never ERRORs here?

Attached revision v22 switches all of the WARNINGs over to ERRORs. It
has also been re-indented, and now uses a non-generic version of
PageGetItemIdCareful() in both verify_gin.c and verify_gist.c.
Obviously this isn't a big set of revisions, but I thought that Andrey
would appreciate it if I posted this much now. I haven't thought much
more about the locking stuff, which is my main concern for now.

Who are the authors of the patch, in full? At some point we'll need to
get the attribution right if this is going to be committed.

I think that it would be good to add some comments explaining the high
level control flow. Is the verification process driven by a
breadth-first search, or a depth-first search, or something else?

I think that we should focus on getting the GiST patch into shape for
commit first, since that seems easier.

-- 
Peter Geoghegan


v22-0002-Add-gist_index_parent_check-function-to-verify-G.patch
Description: Binary data


v22-0001-Refactor-amcheck-to-extract-common-locking-routi.patch
Description: Binary data


v22-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch
Description: Binary data


Re: Weird failure with latches in curculio on v15

2023-02-03 Thread Michael Paquier
On Fri, Feb 03, 2023 at 10:54:17AM -0800, Nathan Bossart wrote:
> 0001 is just v1-0001 from upthread.  This moves Pre/PostRestoreCommand to
> surround only the call to system().  I think this should get us closer to
> pre-v15 behavior.

+   if (exitOnSigterm)
+   PreRestoreCommand();
+
rc = system(command);
+
+   if (exitOnSigterm)
+   PostRestoreCommand();

I don't really want to let that hanging around on HEAD much longer, so
I'm OK to do that for HEAD, then figure out what needs to be done for
the older issue at hand.

+   /*
+* PreRestoreCommand() is used to tell the SIGTERM handler for the startup
+* process that it is okay to proc_exit() right away on SIGTERM.  This is
+* done for the duration of the system() call because there isn't a good
+* way to break out while it is executing.  Since we might call proc_exit()
+* in a signal handler here, it is extremely important that nothing but the
+* system() call happens between the calls to PreRestoreCommand() and
+* PostRestoreCommand().  Any additional code must go before or after this
+* section.
+*/

Still, it seems to me that the large comment block in shell_restore()
ought to be moved to ExecuteRecoveryCommand(), no?  The assumptions
under which one can use exitOnSigterm and failOnSignal could be
completed in the header of the function based on that.
--
Michael


signature.asc
Description: PGP signature


Re: run pgindent on a regular basis / scripted manner

2023-02-03 Thread Michael Paquier
On Thu, Feb 02, 2023 at 11:34:37AM +, Dean Rasheed wrote:
> I didn't reply until now, but I'm solidly in the camp of committers
> who care about keeping the tree properly indented, and I wouldn't have
> any problem with such a check being imposed.

So do I.  pgindent is part of my routine when it comes to all the
patches I merge on HEAD, and having to clean up unrelated diffs in
the files touched after an indentation is always annoying.

FWIW, I just use a script that does pgindent, pgperltidy, pgperlcritic
and `make reformat-dat-files` in src/include/catalog.

> I regularly run pgindent locally, and if I ever commit without
> indenting, it's either intentional, or because I forgot, so the
> reminder would be useful.
> 
> And as someone who runs pgindent regularly, I think this will be a net
> time saver, since I won't have to skip over other unrelated indent
> changes all the time.

+1.
--
Michael


signature.asc
Description: PGP signature


Re: About PostgreSQL Core Team

2023-02-03 Thread Andrew Dunstan


On 2023-02-01 We 07:24, adherent postgres wrote:

Hi hackers:
    I came across a blog that I was very impressed with, especially 
the views mentioned in it about PostgreSQL Core Team





Perhaps people might take more notice if you didn't hide behind an 
anonymous hotmail account.



I've heard these sort of criticisms before, in one case very recently, 
but almost always from people who aren't contributors or potential 
contributors. I've never had someone say to me "Well I would contribute 
lots of code to Postgres but I won't as you don't do PRs."




cheers



andrew (Not a core team member)

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: proposal: psql: show current user in prompt

2023-02-03 Thread Pavel Stehule
pá 3. 2. 2023 v 21:21 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > Both patches are very simple - and they use almost already prepared
> > infrastructure.
>
> It's not simple at all to make the psql feature depend on marking
> "role" as GUC_REPORT when it never has been before.  That will
> cause the feature to misbehave when using older servers.  I'm
> even less impressed by having it fall back on PQuser(), which
> would be misleading at exactly the times when it matters.
>

It is a good note. This can be disabled for older servers, and maybe it
can  introduce its own GUC (and again - it can be disallowed for older
servers).

My goal at this moment is to get some prototype. We can talk if this
feature request is valid or not, and we can talk about implementation.

There is another possibility to directly execute "select current_user()"
instead of looking at status parameters inside prompt processing. It can
work too.

Regards

Pavel





> regards, tom lane
>


Re: Transparent column encryption

2023-02-03 Thread Jacob Champion
On Tue, Jan 31, 2023 at 5:26 AM Peter Eisentraut
 wrote:
> See here for example:
> https://learn.microsoft.com/en-us/sql/relational-databases/security/encryption/always-encrypted-database-engine?view=sql-server-ver15

Hm. I notice they haven't implemented more than one algorithm, so I
wonder if they're going to be happy with their decision to
mix-and-match when number two arrives.

> For pg_dump, I'd like a mode that makes all values parameters of an
> INSERT statement.  But obviously not all of those will be encrypted.  So
> I think we'd want a per-parameter syntax.

Makes sense. Maybe something that defaults to encrypted with opt-out
per parameter?

UPDATE t SET name = $1 WHERE id = $2
\encbind "Jacob" plaintext(24)

> > More concretely: should psql allow you to push arbitrary text into an
> > encrypted \bind parameter, like it does now?
>
> We don't have any data type awareness like that now in psql or libpq.
> It would be quite a change to start now.

I agree. It just feels weird that a misbehaving client can "attack"
the other client implementations using it, and we don't make any
attempt to mitigate it.

--Jacob




Re: Schema variables - new implementation for Postgres 15 (typo)

2023-02-03 Thread Pavel Stehule
Hi

I read notes from the FOSDEM developer meeting, and I would like to repeat
notice about motivation for introduction of session variables, and one
reason why session_variables are not transactional, and why they should not
be replaced by temp tables is performance.

There are more use cases where session variables can be used. One scenario
for session variables is to use them like static variables. They can be
used from some rows triggers, .. where local variable is not enough
(like
https://www.cybertec-postgresql.com/en/why-are-my-postgresql-updates-getting-slower/
)

create variable xx as int;

do $$
begin
  let xx = 1;
  for i in 1..1 loop
let xx = xx + 1;
  end loop;
  raise notice '%', xx;
end;
$$;
NOTICE:  10001
DO
Time: 4,079 ms

create temp table xx01(a int);
delete from xx01; vacuum full xx01; vacuum;

do $$
begin
  insert into xx01 values(1);
  for i in 1..1 loop
update xx01 set a = a + 1;
  end loop;
  raise notice '%', (select a from xx01);
end;
$$;
NOTICE:  10001
DO
Time: 1678,949 ms (00:01,679)

postgres=# \dt+ xx01
List of relations
┌───┬──┬───┬───┬─┬───┬┬─┐
│  Schema   │ Name │ Type  │ Owner │ Persistence │ Access method │  Size  │
Description │
╞═══╪══╪═══╪═══╪═╪═══╪╪═╡
│ pg_temp_3 │ xx01 │ table │ pavel │ temporary   │ heap  │ 384 kB │
│
└───┴──┴───┴───┴─┴───┴┴─┘
(1 row)

Originally, I tested 100K iterations, but it was too slow, and I cancelled
it after 5 minutes. Vacuum can be done after the end of transaction.

And there can be another negative impact related to bloating of
pg_attribute, pg_class, pg_depend tables.

Workaround based on custom GUC is not too bad, but there is not any
possibility of  security protection (and there is not any possibility of
static check in plpgsql_check) - and still it is 20x slower than session
variables

do $$
begin
  perform set_config('cust.xx', '1', false);
  for i in 1..1 loop
perform set_config('cust.xx', (current_setting('cust.xx')::int +
1)::text, true);
  end loop;
  raise notice '%', current_setting('cust.xx');
end;
$$;
NOTICE:  10001
DO
Time: 80,201 ms

Session variables don't try to replace temp tables, and temp tables can be
a very bad replacement of session's variables.

Regards

Pavel


Re: proposal: psql: show current user in prompt

2023-02-03 Thread Tom Lane
Pavel Stehule  writes:
> Both patches are very simple - and they use almost already prepared
> infrastructure.

It's not simple at all to make the psql feature depend on marking
"role" as GUC_REPORT when it never has been before.  That will
cause the feature to misbehave when using older servers.  I'm
even less impressed by having it fall back on PQuser(), which
would be misleading at exactly the times when it matters.

regards, tom lane




Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-02-03 Thread Tom Lane
Laurenz Albe  writes:
> Thanks for the pointer.  Perhaps something like the attached?
> The changes in "CreatePartitionPruneState" make my test case work,
> but they cause a difference in the regression tests, which would be
> intended if we have no partition pruning with plain EXPLAIN.

Hmm, so I see (and the cfbot entry for this patch is now all red,
because you didn't include that diff in the patch).

I'm not sure if we can get away with that behavioral change.
People are probably expecting the current behavior for existing
cases.

I can think of a couple of possible ways forward:

* Fix things so that the generic parameters appear to have NULL
values when inspected during executor startup.  I'm not sure
how useful that'd be though.  In partition-pruning cases that'd
lead to EXPLAIN (GENERIC_PLAN) showing the plan with all
partitions pruned, other than the one for NULL values if there
is one.  Doesn't sound too helpful.

* Invent another executor flag that's a "stronger" version of
EXEC_FLAG_EXPLAIN_ONLY, and pass that when any generic parameters
exist, and check it in CreatePartitionPruneState to decide whether
to do startup pruning.  This avoids changing behavior for existing
cases, but I'm not sure how much it has to recommend it otherwise.
Skipping the startup prune step seems like it'd produce misleading
results in another way, ie showing too many partitions not too few.

This whole business of partition pruning dependent on the
generic parameters is making me uncomfortable.  It seems like
we *can't* show a plan that is either representative of real
execution or comparable to what you get from more-traditional
EXPLAIN usage.  Maybe we need to step back and think more.

regards, tom lane




Re: proposal: psql: show current user in prompt

2023-02-03 Thread Pavel Stehule
pá 3. 2. 2023 v 20:42 odesílatel Corey Huinker 
napsal:

>
>
> On Fri, Feb 3, 2023 at 9:56 AM Pavel Stehule 
> wrote:
>
>> Hi
>>
>> one visitor of p2d2 (Prague PostgreSQL Developer Day) asked if it is
>> possible to show the current role in psql's prompt. I think it is not
>> possible, but fortunately (with some limits) almost all necessary work is
>> done, and the patch is short.
>>
>> In the assigned patch I implemented a new prompt placeholder %N, that
>> shows the current role name.
>>
>> (2023-02-03 15:52:28) postgres=# \set PROMPT1 '%n as %N at '%/%=%#
>> pavel as pavel at postgres=#set role to admin;
>> SET
>> pavel as admin at postgres=>set role to default;
>> SET
>> pavel as pavel at postgres=#
>>
>> Comments, notes are welcome.
>>
>> Regards
>>
>> Pavel
>>
>
> This patch is cluttered with the BACKEND_PID patch and some guc_tables.c
> stuff that I don't think is related.
>

I was little bit lazy, I am sorry. I did it in one my experimental branch.
Both patches are PoC, and there are not documentation yet. I will separate
it.


> We'd have to document the %N.
>
> I think there is some value here for people who have to connect as several
> different users (tech support), and need a reminder-at-a-glance whether
> they are operating in the desired role. It may be helpful in audit
> documentation as well.
>

 yes, I agree so it can be useful - it is not my idea - and it is maybe
partially deduced from some other databases.

Both patches are very simple - and they use almost already prepared
infrastructure.

Regards

Pavel


Re: proposal: psql: show current user in prompt

2023-02-03 Thread Corey Huinker
On Fri, Feb 3, 2023 at 9:56 AM Pavel Stehule 
wrote:

> Hi
>
> one visitor of p2d2 (Prague PostgreSQL Developer Day) asked if it is
> possible to show the current role in psql's prompt. I think it is not
> possible, but fortunately (with some limits) almost all necessary work is
> done, and the patch is short.
>
> In the assigned patch I implemented a new prompt placeholder %N, that
> shows the current role name.
>
> (2023-02-03 15:52:28) postgres=# \set PROMPT1 '%n as %N at '%/%=%#
> pavel as pavel at postgres=#set role to admin;
> SET
> pavel as admin at postgres=>set role to default;
> SET
> pavel as pavel at postgres=#
>
> Comments, notes are welcome.
>
> Regards
>
> Pavel
>

This patch is cluttered with the BACKEND_PID patch and some guc_tables.c
stuff that I don't think is related.

We'd have to document the %N.

I think there is some value here for people who have to connect as several
different users (tech support), and need a reminder-at-a-glance whether
they are operating in the desired role. It may be helpful in audit
documentation as well.


Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl

2023-02-03 Thread Tom Lane
Nitin Jadhav  writes:
> My concern is if we do this, then we will end up having some policies
> (which can be read from pg_show_all_settings()) in guc.sql and some in
> guc.c. I feel all these should be at one place either at guc.c or
> guc.sql.

I don't particularly see why that needs to be the case.  Notably,
if we're interested in enforcing a policy even for extension GUCs,
guc.sql can't really do that since who knows whether the extension's
author will bother to run that test with the extension loaded.
On the other hand, moving *all* those checks into guc.c is probably
impractical and certainly will add undesirable startup overhead.

regards, tom lane




First draft of back-branch release notes is done

2023-02-03 Thread Tom Lane
... at

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f282b026787da69d88a35404cf62f1cc21cfbb7c

As usual, please send corrections/comments by Sunday.

regards, tom lane




Re: proposal: psql: psql variable BACKEND_PID

2023-02-03 Thread Corey Huinker
On Fri, Feb 3, 2023 at 5:42 AM Pavel Stehule 
wrote:

> Hi
>
> We can simply allow an access to backend process id thru psql variable. I
> propose the name "BACKEND_PID". The advantages of usage are simple
> accessibility by command \set, and less typing then using function
> pg_backend_pid, because psql variables are supported by tab complete
> routine. Implementation is very simple, because we can use the function
> PQbackendPID.
>
> Comments, notes?
>
> Regards
>
> Pavel
>

Interesting, and probably useful.

It needs a corresponding line in UnsyncVariables():

SetVariable(pset.vars, "BACKEND_PID", NULL);

That will set the variable back to null when the connection goes away.


Re: Weird failure with latches in curculio on v15

2023-02-03 Thread Nathan Bossart
On Thu, Feb 02, 2023 at 11:44:22PM -0800, Andres Freund wrote:
> On 2023-02-03 02:24:03 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>> > A workaround for the back branches could be to have a test in
>> > StartupProcShutdownHandler() that tests if MyProcPid == getpid(), and
>> > not do the proc_exit() if they don't match. We probably should just do
>> > an _exit() in that case.
>> 
>> Might work.
> 
> I wonder if we should add code complaining loudly about such a mismatch
> to proc_exit(), in addition to handling it more silently in
> StartupProcShutdownHandler().   Also, an assertion in
> [Auxiliary]ProcKill that proc->xid == MyProcPid == getpid() seems like a
> good idea.

>From the discussion, it sounds like we don't want to depend on the child
process receiving/handling the signal, so we can't get rid of the
break-out-of-system() behavior (at least not in back-branches).  I've put
together some work-in-progress patches for the stopgap/back-branch fix.

0001 is just v1-0001 from upthread.  This moves Pre/PostRestoreCommand to
surround only the call to system().  I think this should get us closer to
pre-v15 behavior.

0002 adds the getpid() check mentioned above to
StartupProcShutdownHandler(), and it adds assertions to proc_exit() and
[Auxiliary]ProcKill().

0003 adds checks for shutdown requests before and after the call to
shell_restore().  IMO the Pre/PostRestoreCommand stuff is an implementation
detail for restore_command, so I think it behooves us to have some
additional shutdown checks that apply even for recovery modules.  This
patch could probably be moved to the recovery modules thread.

Is this somewhat close to what folks had in mind?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 773cf2aa72d21d05b01c6c5f9308eab287826168 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 1 Feb 2023 14:32:02 -0800
Subject: [PATCH v4 1/3] stopgap fix for restore_command

---
 src/backend/access/transam/shell_restore.c | 22 --
 src/backend/access/transam/xlogarchive.c   |  7 ---
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/transam/shell_restore.c b/src/backend/access/transam/shell_restore.c
index 8458209f49..abec023c1a 100644
--- a/src/backend/access/transam/shell_restore.c
+++ b/src/backend/access/transam/shell_restore.c
@@ -21,6 +21,7 @@
 #include "access/xlogarchive.h"
 #include "access/xlogrecovery.h"
 #include "common/percentrepl.h"
+#include "postmaster/startup.h"
 #include "storage/ipc.h"
 #include "utils/wait_event.h"
 
@@ -124,8 +125,7 @@ shell_recovery_end(const char *lastRestartPointFileName)
  * human-readable name describing the command emitted in the logs. If
  * 'failOnSignal' is true and the command is killed by a signal, a FATAL
  * error is thrown. Otherwise, 'fail_elevel' is used for the log message.
- * If 'exitOnSigterm' is true and the command is killed by SIGTERM, we exit
- * immediately.
+ * If 'exitOnSigterm' is true and SIGTERM is received, we exit immediately.
  *
  * Returns whether the command succeeded.
  */
@@ -146,7 +146,25 @@ ExecuteRecoveryCommand(const char *command, const char *commandName,
 	 */
 	fflush(NULL);
 	pgstat_report_wait_start(wait_event_info);
+
+	/*
+	 * PreRestoreCommand() is used to tell the SIGTERM handler for the startup
+	 * process that it is okay to proc_exit() right away on SIGTERM.  This is
+	 * done for the duration of the system() call because there isn't a good
+	 * way to break out while it is executing.  Since we might call proc_exit()
+	 * in a signal handler here, it is extremely important that nothing but the
+	 * system() call happens between the calls to PreRestoreCommand() and
+	 * PostRestoreCommand().  Any additional code must go before or after this
+	 * section.
+	 */
+	if (exitOnSigterm)
+		PreRestoreCommand();
+
 	rc = system(command);
+
+	if (exitOnSigterm)
+		PostRestoreCommand();
+
 	pgstat_report_wait_end();
 
 	if (rc != 0)
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 4b89addf97..66312c816b 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -147,18 +147,11 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 	else
 		XLogFileName(lastRestartPointFname, 0, 0L, wal_segment_size);
 
-	/*
-	 * Check signals before restore command and reset afterwards.
-	 */
-	PreRestoreCommand();
-
 	/*
 	 * Copy xlog from archival storage to XLOGDIR
 	 */
 	ret = shell_restore(xlogfname, xlogpath, lastRestartPointFname);
 
-	PostRestoreCommand();
-
 	if (ret)
 	{
 		/*
-- 
2.25.1

>From c16c46aac4af029d807d7060d3fb09a7b707ee4d Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 3 Feb 2023 10:24:14 -0800
Subject: [PATCH v4 2/3] do not call proc_exit in process forked for
 restore_command

---
 src/backend/postmaster/startup.c | 14 +-
 src/backend/storage/ipc/ipc.c|  3 +++
 src/backend/storage/lmgr/proc.c  |  2 ++
 3 

Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl

2023-02-03 Thread Nitin Jadhav
> > Thanks Michael for identifying a new mistake. I am a little confused
> > here. I don't understand why GUC_NO_SHOW_ALL depends on other GUCs
> > like GUC_NOT_IN_SAMPLE or GUC_NO_RESET_ALL. Looks like the dependency
> > between GUC_NO_RESET_ALL and GUC_NO_SHOW_ALL is removed in the above
> > patch since we have these combinations now.
>
> pg_settings would be unable to show something marked as NO_SHOW_ALL,
> so the SQL check that looked after (NO_SHOW_ALL && !NO_RESET_ALL) is
> a no-op.  Postgres will likely gain more parameters that are kept
> around for compability reasons, and forcing a NO_RESET_ALL in such
> cases could impact applications using RESET on such GUCs, meaning
> potential compatibility breakages.
>
> > Similarly why can't we
> > have a GUC marked as GUC_NO_SHOW_ALL but not GUC_NOT_IN_CONFIG. For me
> > it makes sense if a GUC is marked as NO_SHOW_ALL and it can be present
> > in a sample file. It's up to the author/developer to choose which all
> > flags are applicable to the newly introduced GUCs. Please share your
> > thoughts.
>
> As also mentioned upthread by Tom, I am not sure that this combination
> makes much sense, actually, because I don't see why one would never
> want to know what is the effective value loaded for a parameter stored
> in a file when he/she has the permission to do so.  This could be
> changed as of ALTER SYSTEM, postgresql.conf or even an included file,
> and the value can only be read if permission to see it is given to the
> role querying SHOW or pg_settings.  This combination of flags is not a
> practice to encourage.

Got it. Makes sense.


> For the second part to prevent GUCs to be marked as NO_SHOW_ALL &&
> !NOT_IN_SAMPLE, check_GUC_init() looks like the right place to me,
> because this routine has been designed exactly for this purpose.
>
> So, what do you think about the attached?

My concern is if we do this, then we will end up having some policies
(which can be read from pg_show_all_settings()) in guc.sql and some in
guc.c. I feel all these should be at one place either at guc.c or
guc.sql. It is better to move all other policies from guc.sql to
guc.c. Otherwise, how about modifying the function
pg_show_all_settings as done in v1 patch and using this (as true)
while creating the table tab_settings_flags in guc.sq and just remove
(NO_SHOW_ALL && !NO_RESET_ALL) check from guc.sql. I don't think doing
this is a problem as we can retain the support of existing signatures
of the pg_show_all_settings function as suggested by Justin upthread
so that it will not cause any compatibility issues. Please share your
thoughts.

Thanks & Regards,
Nitin Jadhav

On Wed, Feb 1, 2023 at 10:59 AM Michael Paquier  wrote:
>
> On Mon, Jan 30, 2023 at 05:12:27PM +0530, Nitin Jadhav wrote:
> > Thanks Michael for identifying a new mistake. I am a little confused
> > here. I don't understand why GUC_NO_SHOW_ALL depends on other GUCs
> > like GUC_NOT_IN_SAMPLE or GUC_NO_RESET_ALL. Looks like the dependency
> > between GUC_NO_RESET_ALL and GUC_NO_SHOW_ALL is removed in the above
> > patch since we have these combinations now.
>
> pg_settings would be unable to show something marked as NO_SHOW_ALL,
> so the SQL check that looked after (NO_SHOW_ALL && !NO_RESET_ALL) is
> a no-op.  Postgres will likely gain more parameters that are kept
> around for compability reasons, and forcing a NO_RESET_ALL in such
> cases could impact applications using RESET on such GUCs, meaning
> potential compatibility breakages.
>
> > Similarly why can't we
> > have a GUC marked as GUC_NO_SHOW_ALL but not GUC_NOT_IN_CONFIG. For me
> > it makes sense if a GUC is marked as NO_SHOW_ALL and it can be present
> > in a sample file. It's up to the author/developer to choose which all
> > flags are applicable to the newly introduced GUCs. Please share your
> > thoughts.
>
> As also mentioned upthread by Tom, I am not sure that this combination
> makes much sense, actually, because I don't see why one would never
> want to know what is the effective value loaded for a parameter stored
> in a file when he/she has the permission to do so.  This could be
> changed as of ALTER SYSTEM, postgresql.conf or even an included file,
> and the value can only be read if permission to see it is given to the
> role querying SHOW or pg_settings.  This combination of flags is not a
> practice to encourage.
> --
> Michael




Index-only scan and random_page_cost

2023-02-03 Thread Konstantin Knizhnik

Hi hackers,

Right now cost of index-only scan is using `random_page_cost`.
Certainly for point selects we really have random access pattern, but 
queries like "select count(*) from hits"  access pattern is more or less 
sequential:
we are iterating through subsequent leaf B-Tree pages.  As far as 
default value of `random_page_cost`  is 4 times larger than `seq_page_cost`
it may force Postgres optimizer to choose sequential scan, while 
index-only scan is usually much faster in this case.

Can we do something here to provide more accurate cost estimation?






Re: run pgindent on a regular basis / scripted manner

2023-02-03 Thread Tom Lane
Andrew Dunstan  writes:
> On 2023-01-22 Su 17:47, Tom Lane wrote:
>> Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
>> another assignment in a series of assignments, it is very likely to
>> reformat all the adjacent assignments because it thinks it's cool to
>> make all the equal signs line up.  That's just awful.

> Modern versions of perltidy give you much more control over this, so 
> maybe we need to investigate the possibility of updating.

I have no objection to updating perltidy from time to time. I think the
idea is just to make sure that we have an agreed-on version for everyone
to use.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-02-03 Thread Andrew Dunstan


On 2023-01-22 Su 17:47, Tom Lane wrote:

Andres Freund  writes:

I strongly dislike it, I rarely get it right by hand - but it does have some
benefit over aligning variable names based on the length of the type names as
uncrustify/clang-format: In their approach an added local variable can cause
all the other variables to be re-indented (and their initial value possibly
wrapped). The fixed alignment doesn't have that issue.

Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up.  That's just awful.  You can either
run pgperltidy on new code before committing, and accept that the feature
patch will touch a lot of lines it's not making real changes to (thereby
dirtying the "git blame" history) or not do so and thereby commit code
that's not passing tidiness checks.  Let's *not* adopt any style that
causes similar things to start happening in our C code.



Modern versions of perltidy give you much more control over this, so 
maybe we need to investigate the possibility of updating. See the latest 
docco at 
<*https://metacpan.org/dist/Perl-Tidy/view/bin/perltidy#Completely-turning-off-vertical-alignment-with-novalign>

*

Probably we'd want to use something like

|--valign-exclusion-list=||'= => ,'|
||
||
|cheers|
||
|andrew|

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-02-03 Thread Laurenz Albe
On Fri, 2023-02-03 at 09:44 -0500, Tom Lane wrote:
> Laurenz Albe  writes:
> > I played around with it, and I ran into a problem with partitions that
> > are foreign tables:
> > ...
> >   EXPLAIN (GENERIC_PLAN) SELECT * FROM looppart WHERE key = $1;
> >   ERROR:  no value found for parameter 1
> 
> Hmm, offhand I'd say that something is doing something it has no
> business doing when EXEC_FLAG_EXPLAIN_ONLY is set (that is, premature
> evaluation of an expression).  I wonder whether this failure is
> reachable without this patch.

Thanks for the pointer.  Perhaps something like the attached?
The changes in "CreatePartitionPruneState" make my test case work,
but they cause a difference in the regression tests, which would be
intended if we have no partition pruning with plain EXPLAIN.

Yours,
Laurenz Albe
From ff16316aab8e5f5de84ae925e965a210d4368b75 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 3 Feb 2023 17:08:53 +0100
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized
statements (that have parameter placeholders like $1 in the
statement text).

Author: Laurenz Albe
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 doc/src/sgml/ref/explain.sgml | 15 ++
 src/backend/commands/explain.c|  9 +
 src/backend/executor/execPartition.c  |  6 --
 src/backend/parser/analyze.c  | 29 +++
 src/include/commands/explain.h|  1 +
 src/test/regress/expected/explain.out | 14 +
 src/test/regress/sql/explain.sql  |  5 +
 7 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..58350624e7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statementboolean ]
 COSTS [ boolean ]
 SETTINGS [ boolean ]
+GENERIC_PLAN [ boolean ]
 BUFFERS [ boolean ]
 WAL [ boolean ]
 TIMING [ boolean ]
@@ -167,6 +168,20 @@ ROLLBACK;
 

 
+   
+GENERIC_PLAN
+
+ 
+  Generate a generic plan for the statement (see 
+  for details about generic plans).  The statement can contain parameter
+  placeholders like $1, if it is a statement that can
+  use parameters.  This option cannot be used together with
+  ANALYZE, since a statement with unknown parameters
+  cannot be executed.
+ 
+
+   
+

 BUFFERS
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 35c23bd27d..ab21a67862 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -190,6 +190,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 			es->wal = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "settings") == 0)
 			es->settings = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "generic_plan") == 0)
+			es->generic = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "timing") == 0)
 		{
 			timing_set = true;
@@ -227,6 +229,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
+	if (es->generic && es->analyze)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN ANALYZE cannot be used with GENERIC_PLAN")));
+
+	/* check that WAL is used with EXPLAIN ANALYZE */
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 651ad24fc1..38d14433a6 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -2043,7 +2043,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 			 * Initialize pruning contexts as needed.
 			 */
 			pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
-			if (pinfo->initial_pruning_steps)
+			if (pinfo->initial_pruning_steps &&
+!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY))
 			{
 InitPartitionPruneContext(>initial_context,
 		  pinfo->initial_pruning_steps,
@@ -2053,7 +2054,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
 prunestate->do_initial_prune = true;
 			}
 			pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
-			if (pinfo->exec_pruning_steps)
+			if (pinfo->exec_pruning_steps &&
+!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY))
 			{
 InitPartitionPruneContext(>exec_context,
 		  pinfo->exec_pruning_steps,
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index e892df9819..9143964e78 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -27,6 +27,7 @@
 #include "access/sysattr.h"
 #include 

proposal: psql: show current user in prompt

2023-02-03 Thread Pavel Stehule
Hi

one visitor of p2d2 (Prague PostgreSQL Developer Day) asked if it is
possible to show the current role in psql's prompt. I think it is not
possible, but fortunately (with some limits) almost all necessary work is
done, and the patch is short.

In the assigned patch I implemented a new prompt placeholder %N, that shows
the current role name.

(2023-02-03 15:52:28) postgres=# \set PROMPT1 '%n as %N at '%/%=%#
pavel as pavel at postgres=#set role to admin;
SET
pavel as admin at postgres=>set role to default;
SET
pavel as pavel at postgres=#

Comments, notes are welcome.

Regards

Pavel
From d45b620515387c531ea1d663b87dac6144b0b41e Mon Sep 17 00:00:00 2001
From: "ok...@github.com" 
Date: Fri, 3 Feb 2023 15:53:56 +0100
Subject: [PATCH 2/2] implementation of psql prompt placeholder %N

---
 src/backend/utils/misc/guc_tables.c |  2 +-
 src/bin/psql/common.c   | 25 +
 src/bin/psql/common.h   |  1 +
 src/bin/psql/prompt.c   |  5 +
 4 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index b46e3b8c55..3188fd015d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -4144,7 +4144,7 @@ struct config_string ConfigureNamesString[] =
 		{"role", PGC_USERSET, UNGROUPED,
 			gettext_noop("Sets the current role."),
 			NULL,
-			GUC_IS_NAME | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_NOT_WHILE_SEC_REST
+			GUC_IS_NAME | GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_NOT_WHILE_SEC_REST
 		},
 		_string,
 		"none",
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f907f5d4e8..9c82b8253b 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -2310,6 +2310,31 @@ session_username(void)
 }
 
 
+/*
+ * Return the current user of the current connection.
+ * Replace "none" by session user.
+ */
+const char *
+current_role(void)
+{
+	const char *val;
+
+	if (!pset.db)
+		return NULL;
+
+	val = PQparameterStatus(pset.db, "role");
+	if (val)
+	{
+		if (strncmp(val, "none", 4) == 0 && strlen(val) == 4)
+			return session_username();
+		else
+			return val;
+	}
+	else
+		return PQuser(pset.db);
+}
+
+
 /* expand_tilde
  *
  * substitute '~' with HOME or '~username' with username's home dir
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index cc4c73d097..b7a8182dd2 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -37,6 +37,7 @@ extern bool SendQuery(const char *query);
 extern bool is_superuser(void);
 extern bool standard_strings(void);
 extern const char *session_username(void);
+extern const char *current_role(void);
 
 extern void expand_tilde(char **filename);
 
diff --git a/src/bin/psql/prompt.c b/src/bin/psql/prompt.c
index 969cd9908e..91813e1356 100644
--- a/src/bin/psql/prompt.c
+++ b/src/bin/psql/prompt.c
@@ -165,6 +165,11 @@ get_prompt(promptStatus_t status, ConditionalStack cstack)
 	if (pset.db)
 		strlcpy(buf, session_username(), sizeof(buf));
 	break;
+	/* DB server current user name */
+case 'N':
+	if (pset.db)
+		strlcpy(buf, current_role(), sizeof(buf));
+	break;
 	/* backend pid */
 case 'p':
 	if (pset.db)
-- 
2.39.1

From 153994fd93571964766ca054b0f7fe342ac72a6f Mon Sep 17 00:00:00 2001
From: "ok...@github.com" 
Date: Fri, 3 Feb 2023 11:40:41 +0100
Subject: [PATCH 1/2] implementation of BACKEND_PID psql's variable

---
 src/bin/psql/command.c | 3 +++
 src/bin/psql/help.c| 2 ++
 2 files changed, 5 insertions(+)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index b5201edf55..934dd26c61 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3783,6 +3783,9 @@ SyncVariables(void)
 	SetVariable(pset.vars, "PORT", PQport(pset.db));
 	SetVariable(pset.vars, "ENCODING", pg_encoding_to_char(pset.encoding));
 
+	snprintf(vbuf, sizeof(vbuf), "%d", PQbackendPID(pset.db));
+	SetVariable(pset.vars, "BACKEND_PID", vbuf);
+
 	/* this bit should match connection_warnings(): */
 	/* Try to get full text form of version, might include "devel" etc */
 	server_version = PQparameterStatus(pset.db, "server_version");
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index e45c4aaca5..61c6edd0ba 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -396,6 +396,8 @@ helpVariables(unsigned short int pager)
 
 	HELP0("  AUTOCOMMIT\n"
 		  "if set, successful SQL commands are automatically committed\n");
+	HELP0("  BACKEND_PID\n"
+		  "id of server process of the current connection\n");
 	HELP0("  COMP_KEYWORD_CASE\n"
 		  "determines the case used to complete SQL key words\n"
 		  "[lower, upper, preserve-lower, preserve-upper]\n");
-- 
2.39.1



Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-02-03 Thread Tom Lane
Laurenz Albe  writes:
> I played around with it, and I ran into a problem with partitions that
> are foreign tables:
> ...
>   EXPLAIN (GENERIC_PLAN) SELECT * FROM looppart WHERE key = $1;
>   ERROR:  no value found for parameter 1

Hmm, offhand I'd say that something is doing something it has no
business doing when EXEC_FLAG_EXPLAIN_ONLY is set (that is, premature
evaluation of an expression).  I wonder whether this failure is
reachable without this patch.

regards, tom lane




Re: CI and test improvements

2023-02-03 Thread Justin Pryzby
rebased, and re-including a patch to show code coverage of changed
files.

a5b3e50d922 cirrus/windows: add compiler_warnings_script
4c98dcb0e03 cirrus/freebsd: run with more CPUs+RAM and do not repartition
aaeef938ed4 cirrus/freebsd: define ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
9baf41674ad pg_upgrade: tap test: exercise --link and --clone
7e09035f588 WIP: ci/meson: allow showing only failed tests ..
e4534821ef5 cirrus/ccache: use G rather than GB suffix..
185d1c3ed13 cirrus: code coverage
5dace84a038 cirrus: upload changed html docs as artifacts
852360330ef +html index file

>From a5b3e50d922a6dcff22feabbcae741b23b2347c6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 21:53:22 -0500
Subject: [PATCH 1/9] cirrus/windows: add compiler_warnings_script

I'm not sure how to write this test in windows shell; it's also easy to
write something that doesn't work in posix sh, since windows shell is
interpretting && and ||...

https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de

See also:
8a1ce5e54f6d144e4f8e19af7c767b026ee0c956
https://cirrus-ci.com/task/6241060062494720
https://cirrus-ci.com/task/6496366607204352

ci-os-only: windows
---
 .cirrus.yml | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 1204824d2eb..eefc5c21fe6 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -566,12 +566,21 @@ task:
 
   build_script: |
 vcvarsall x64
-ninja -C build
+ninja -C build |tee build.txt
+REM Since pipes lose the exit status of the preceding command, rerun the compilation
+REM without the pipe, exiting now if it fails, to avoid trying to run checks
+ninja -C build > nul
 
   check_world_script: |
 vcvarsall x64
 meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
 
+  # This should be last, so check_world is run even if there are warnings
+  always:
+compiler_warnings_script:
+  # this avoids using metachars which would be interpretted by the windows shell
+  - sh -c 'if grep ": warning " build.txt; then exit 1; fi; exit 0'
+
   on_failure:
 <<: *on_failure_meson
 crashlog_artifacts:
-- 
2.25.1

>From 4c98dcb0e033ca12c6a6093c5e94e4231756fa4d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 24 Jun 2022 00:09:12 -0500
Subject: [PATCH 2/9] cirrus/freebsd: run with more CPUs+RAM and do not
 repartition

There was some historic problem where tests under freebsd took 8+ minutes (and
before 4a288a37f took 15 minutes).

This reduces test time from 10min to 3min.
4 CPUs 4 tests https://cirrus-ci.com/task/4880240739614720
4 CPUs 6 tests https://cirrus-ci.com/task/4664440120410112 https://cirrus-ci.com/task/4586784884523008
4 CPUs 8 tests https://cirrus-ci.com/task/5001995491737600

6 CPUs https://cirrus-ci.com/task/6678321684545536
8 CPUs https://cirrus-ci.com/task/6264854121021440

See also:
https://www.postgresql.org/message-id/flat/20220310033347.hgxk4pyarzq4h...@alap3.anarazel.de#f36c0b17e33e31e7925e7e5812998686
8 jobs 7min https://cirrus-ci.com/task/6186376667332608

//-os-only: freebsd
---
 .cirrus.yml | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index eefc5c21fe6..0c12ca04fd9 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -131,11 +131,9 @@ task:
   name: FreeBSD - 13 - Meson
 
   env:
-# FreeBSD on GCP is slow when running with larger number of CPUS /
-# jobs. Using one more job than cpus seems to work best.
-CPUS: 2
-BUILD_JOBS: 3
-TEST_JOBS: 3
+CPUS: 4
+BUILD_JOBS: 4
+TEST_JOBS: 6
 
 CCACHE_DIR: /tmp/ccache_dir
 CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST
@@ -160,8 +158,6 @@ task:
 
   ccache_cache:
 folder: $CCACHE_DIR
-  # Work around performance issues due to 32KB block size
-  repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
   create_user_script: |
 pw useradd postgres
 chown -R postgres:postgres .
@@ -175,8 +171,7 @@ task:
 #pkg install -y ...
 
   # NB: Intentionally build without -Dllvm. The freebsd image size is already
-  # large enough to make VM startup slow, and even without llvm freebsd
-  # already takes longer than other platforms except for windows.
+  # large enough to make VM startup slow
   configure_script: |
 su postgres <<-EOF
   meson setup \
-- 
2.25.1

>From aaeef938ed4a1b604c41d3973a49d790ad58af20 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 9 Dec 2022 15:32:50 -0600
Subject: [PATCH 3/9] cirrus/freebsd: define
 ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS

See also: 54100f5c6052404f68de9ce7310ceb61f1c291f8

ci-os-only: freebsd
---
 .cirrus.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 0c12ca04fd9..6186505ccd7 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -136,7 +136,7 @@ task:
 TEST_JOBS: 6
 
 CCACHE_DIR: /tmp/ccache_dir
-CPPFLAGS: 

Re: Make mesage at end-of-recovery less scary.

2023-02-03 Thread Alvaro Herrera
So this patch is now failing because it applies new tests to
011_crash_recovery.pl, which was removed recently.  Can you please move
them elsewhere?

I think the comment for ValidXLogRecordLength should explain what the
return value is.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-02-03 Thread Laurenz Albe
On Tue, 2023-01-31 at 13:49 -0500, Tom Lane wrote:
> Laurenz Albe  writes:
> > [ 0001-Add-EXPLAIN-option-GENERIC_PLAN.v4.patch ]
> 
> I took a closer look at this patch, and didn't like the implementation
> much.  You're not matching the behavior of PREPARE at all: for example,
> this patch is content to let $1 be resolved with different types in
> different places.  We should be using the existing infrastructure that
> parse_analyze_varparams uses.
> 
> Also, I believe that in contexts such as plpgsql, it is possible that
> there's an external source of $N definitions, which we should probably
> continue to honor even with GENERIC_PLAN.
> 
> So that leads me to think the code should be more like this.  I'm not
> sure if it's worth spending documentation and testing effort on the
> case where we don't override an existing p_paramref_hook.

Thanks, that looks way cleaner.

I played around with it, and I ran into a problem with partitions that
are foreign tables:

  CREATE TABLE loc1 (id integer NOT NULL, key integer NOT NULL CHECK (key = 1), 
value text);

  CREATE TABLE loc2 (id integer NOT NULL, key integer NOT NULL CHECK (key = 2), 
value text);

  CREATE TABLE looppart (id integer GENERATED ALWAYS AS IDENTITY, key integer 
NOT NULL, value text) PARTITION BY LIST (key);

  CREATE FOREIGN TABLE looppart1 PARTITION OF looppart FOR VALUES IN (1) SERVER 
loopback OPTIONS (table_name 'loc1');

  CREATE FOREIGN TABLE looppart2 PARTITION OF looppart FOR VALUES IN (2) SERVER 
loopback OPTIONS (table_name 'loc2');

  EXPLAIN (GENERIC_PLAN) SELECT * FROM looppart WHERE key = $1;
  ERROR:  no value found for parameter 1

The solution could be to set up a dynamic parameter hook in the
ExprContext in ecxt_param_list_info->paramFetch so that
ExecEvalParamExtern doesn't complain about the missing parameter.

Does that make sense?  How do I best hook into the executor to set that up?

Yours,
Laurenz Albe




Re: generic plans and "initial" pruning

2023-02-03 Thread Amit Langote
On Thu, Feb 2, 2023 at 11:49 PM Amit Langote  wrote:
> On Fri, Jan 27, 2023 at 4:01 PM Amit Langote  wrote:
> > I didn't actually go with calling the plancache on every lock taken on
> > a relation, that is, in ExecGetRangeTableRelation().  One thing about
> > doing it that way that I didn't quite like (or didn't see a clean
> > enough way to code) is the need to complicate the ExecInitNode()
> > traversal for handling the abrupt suspension of the ongoing setup of
> > the PlanState tree.
>
> OK, I gave this one more try and attached is what I came up with.
>
> This adds a ExecPlanStillValid(), which is called right after anything
> that may in turn call ExecGetRangeTableRelation() which has been
> taught to lock a relation if EXEC_FLAG_GET_LOCKS has been passed in
> EState.es_top_eflags.  That includes all ExecInitNode() calls, and a
> few other functions that call ExecGetRangeTableRelation() directly,
> such as ExecOpenScanRelation().  If ExecPlanStillValid() returns
> false, that is, if EState.es_cachedplan is found to have been
> invalidated after a lock being taken by ExecGetRangeTableRelation(),
> whatever funcion called it must return immediately and so must its
> caller and so on.  ExecEndPlan() seems to be able to clean up after a
> partially finished attempt of initializing a PlanState tree in this
> way.  Maybe my preliminary testing didn't catch cases where pointers
> to resources that are normally put into the nodes of a PlanState tree
> are now left dangling, because a partially built PlanState tree is not
> accessible to ExecEndPlan; QueryDesc.planstate would remain NULL in
> such cases.  Maybe there's only es_tupleTable and es_relations that
> needs to be explicitly released and the rest is taken care of by
> resetting the ExecutorState context.

In the attached updated patch, I've made the functions that check
ExecPlanStillValid() to return NULL (if returning something) instead
of returning partially initialized structs.  Those partially
initialized structs were not being subsequently looked at anyway.

> On testing, I'm afraid we're going to need something like
> src/test/modules/delay_execution to test that concurrent changes to
> relation(s) in PlannedStmt.relationOids that occur somewhere between
> RevalidateCachedQuery() and InitPlan() result in the latter to be
> aborted and that it is handled correctly.  It seems like it is only
> the locking of partitions (that are not present in an unplanned Query
> and thus not protected by AcquirePlannerLocks()) that can trigger
> replanning of a CachedPlan, so any tests we write should involve
> partitions.  Should this try to test as many plan shapes as possible
> though given the uncertainty around ExecEndPlan() robustness or should
> manual auditing suffice to be sure that nothing's broken?

I've added a test case under src/modules/delay_execution by adding a
new ExecutorStart_hook that works similarly as
delay_execution_planner().  The test works by allowing a concurrent
session to drop an object being referenced in a cached plan being
initialized while the ExecutorStart_hook waits to get an advisory
lock.  The concurrent drop of the referenced object is detected during
ExecInitNode() and thus triggers replanning of the cached plan.

I also fixed a bug in the ExplainExecuteQuery() while testing and some comments.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v33-0001-Move-AcquireExecutorLocks-s-responsibility-into-.patch
Description: Binary data


RE: Exit walsender before confirming remote flush in logical replication

2023-02-03 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Sawada-san,

> > IIUC there is no difference between smart shutdown and fast shutdown
> > in logical replication walsender, but reading the doc[1], it seems to
> > me that in the smart shutdown mode, the server stops existing sessions
> > normally. For example, If the client is psql that gets stuck for some
> > reason and the network buffer gets full, the smart shutdown waits for
> > a backend process to send all results to the client. I think the
> > logical replication walsender should follow this behavior for
> > consistency. One idea is to distinguish smart shutdown and fast
> > shutdown also in logical replication walsender so that we disconnect
> > even without the done message in fast shutdown mode, but I'm not sure
> > it's worthwhile.
> >
> 
> The main problem we want to solve here is to avoid shutdown failing in
> case walreceiver/applyworker is busy waiting for some lock or for some
> other reason as shown in the email [1]. I haven't tested it but if
> such a problem doesn't exist in smart shutdown mode then probably we
> can allow walsender to wait till all the data is sent.

Based on the idea, I made a PoC patch to introduce the smart shutdown to 
walsenders.
PSA 0002 patch. 0001 is not changed from v5.
When logical walsenders got shutdown request but their send buffer is full due 
to
the delay, they will:

* wait to complete to send data to subscriber if we are in smart shutdown mode
* exit immediately if we are in fast shutdown mode

Note that in both case, walsender does not wait the remote flush of WALs.

For implementing that, I added new attribute to WalSndCtlData that indicates the
shutdown status. Basically it is zero, but it will be changed by postmaster when
it gets request.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v6-0001-Exit-walsender-before-confirming-remote-flush-in-.patch
Description:  v6-0001-Exit-walsender-before-confirming-remote-flush-in-.patch


v6-0002-Introduce-smart-shutdown-for-logical-walsender.patch
Description:  v6-0002-Introduce-smart-shutdown-for-logical-walsender.patch


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-03 Thread Amit Kapila
On Fri, Feb 3, 2023 at 3:12 PM wangw.f...@fujitsu.com
 wrote:
>
> Here is a comment:
>
> 1. The checks in function AlterSubscription
> +   /*
> +* The combination of parallel 
> streaming mode and
> +* min_apply_delay is not allowed. See
> +* parse_subscription_options for 
> details of the reason.
> +*/
> +   if (opts.streaming == 
> LOGICALREP_STREAM_PARALLEL)
> +   if 
> ((IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay 
> > 0) ||
> +   
> (!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 
> 0))
> and
> +   /*
> +* The combination of parallel 
> streaming mode and
> +* min_apply_delay is not allowed.
> +*/
> +   if (opts.min_apply_delay > 0)
> +   if 
> ((IsSet(opts.specified_opts, SUBOPT_STREAMING) && opts.streaming == 
> LOGICALREP_STREAM_PARALLEL) ||
> +   
> (!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == 
> LOGICALREP_STREAM_PARALLEL))
>
> I think the case where the options "min_apply_delay>0" and 
> "streaming=parallel"
> are set at the same time seems to have been checked in the function
> parse_subscription_options, how about simplifying these two if-statements here
> to the following:
> ```
> if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
> !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
> sub->minapplydelay > 0)
>
> and
>
> if (opts.min_apply_delay > 0 &&
> !IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
> sub->stream == LOGICALREP_STREAM_PARALLEL)
> ```
>

Won't just checking if ((opts.streaming == LOGICALREP_STREAM_PARALLEL
&& sub->minapplydelay > 0) || (opts.min_apply_delay > 0 && sub->stream
== LOGICALREP_STREAM_PARALLEL)) be sufficient in that case?

-- 
With Regards,
Amit Kapila.




proposal: psql: psql variable BACKEND_PID

2023-02-03 Thread Pavel Stehule
Hi

We can simply allow an access to backend process id thru psql variable. I
propose the name "BACKEND_PID". The advantages of usage are simple
accessibility by command \set, and less typing then using function
pg_backend_pid, because psql variables are supported by tab complete
routine. Implementation is very simple, because we can use the function
PQbackendPID.

Comments, notes?

Regards

Pavel
From 153994fd93571964766ca054b0f7fe342ac72a6f Mon Sep 17 00:00:00 2001
From: "ok...@github.com" 
Date: Fri, 3 Feb 2023 11:40:41 +0100
Subject: [PATCH] implementation of BACKEND_PID psql's variable

---
 src/bin/psql/command.c | 3 +++
 src/bin/psql/help.c| 2 ++
 2 files changed, 5 insertions(+)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index b5201edf55..934dd26c61 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3783,6 +3783,9 @@ SyncVariables(void)
 	SetVariable(pset.vars, "PORT", PQport(pset.db));
 	SetVariable(pset.vars, "ENCODING", pg_encoding_to_char(pset.encoding));
 
+	snprintf(vbuf, sizeof(vbuf), "%d", PQbackendPID(pset.db));
+	SetVariable(pset.vars, "BACKEND_PID", vbuf);
+
 	/* this bit should match connection_warnings(): */
 	/* Try to get full text form of version, might include "devel" etc */
 	server_version = PQparameterStatus(pset.db, "server_version");
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index e45c4aaca5..61c6edd0ba 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -396,6 +396,8 @@ helpVariables(unsigned short int pager)
 
 	HELP0("  AUTOCOMMIT\n"
 		  "if set, successful SQL commands are automatically committed\n");
+	HELP0("  BACKEND_PID\n"
+		  "id of server process of the current connection\n");
 	HELP0("  COMP_KEYWORD_CASE\n"
 		  "determines the case used to complete SQL key words\n"
 		  "[lower, upper, preserve-lower, preserve-upper]\n");
-- 
2.39.1



Re: [PATCH] Compression dictionaries for JSONB

2023-02-03 Thread Pavel Borisov
On Fri, 3 Feb 2023 at 14:04, Alvaro Herrera  wrote:
>
> This patch came up at the developer meeting in Brussels yesterday.
> https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage
>
> First, as far as I can tell, there is a large overlap between this patch
> and "Pluggable toaster" patch.  The approaches are completely different,
> but they seem to be trying to fix the same problem: the fact that the
> default TOAST stuff isn't good enough for JSONB.  I think before asking
> developers of both patches to rebase over and over, we should take a
> step back and decide which one we dislike the less, and how to fix that
> one into a shape that we no longer dislike.
>
> (Don't get me wrong.  I'm all for having better JSONB compression.
> However, for one thing, both patches require action from the user to set
> up a compression mechanism by hand.  Perhaps it would be even better if
> the system determines that a JSONB column uses a different compression
> implementation, without the user doing anything explicitly; or maybe we
> want to give the user *some* agency for specific columns if they want,
> but not force them into it for every single jsonb column.)
>
> Now, I don't think either of these patches can get to a committable
> shape in time for v16 -- even assuming we had an agreed design, which
> AFAICS we don't.  But I encourage people to continue discussion and try
> to find consensus.
>
Hi, Alvaro!

I'd like to give my +1 in favor of implementing a pluggable toaster
interface first. Then we can work on custom toast engines for
different scenarios, not limited to JSON(b).

For example, I find it useful to decrease WAL overhead on the
replication of TOAST updates. It is quite a pain now that we need to
rewrite all toast chunks at any TOAST update. Also, it could be good
for implementing undo access methods etc., etc. Now, these kinds of
activities in extensions face the fact that core has only one TOAST
which is quite inefficient in many scenarios.

So overall I value the extensibility part of this activity as the most
important one and will be happy to see it completed first.

Kind regards,
Pavel Borisov,
Supabase.




Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2023-02-03 Thread Damir Belyalov
Hi, Danil and Nikita!
Thank you for reviewing.

Why is there no static keyword in the definition of the SafeCopying()
> function, but it is in the function declaration.
>
Correct this.

675: MemoryContext cxt =
> MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
> 676:
> 677: valid_row = NextCopyFrom(cstate, econtext, myslot->tts_values,
> myslot->tts_isnull);
> 678: tuple_is_valid = valid_row;
> 679:
> 680: if (valid_row)
> 681: sfcstate->safeBufferBytes += cstate->line_buf.len;
> 682:
> 683: CurrentMemoryContext = cxt;
>
> Why are you using a direct assignment to CurrentMemoryContext instead of
> using the MemoryContextSwitchTo function in the SafeCopying() routine?
>

"CurrentMemoryContext = cxt" is the same as "MemoryContextSwitchTo(cxt)",
you can see it in the implementation of MemoryContextSwitchTo(). Also
correct this.


When you initialize the cstate->sfcstate structure, you create a
> cstate->sfcstate->safe_cxt memory context that inherits from oldcontext.
> Was it intended to use cstate->copycontext as the parent context here?
>

Good remark, correct this.



Thanks Nikita Malakhov for advice to create file with errors. But I decided
to to log errors in the system logfile and don't print them to the
terminal. The error's output in logfile is rather simple - only error
context logs (maybe it's better to log all error information?).
There are 2 points why logging errors in logfile is better than logging
errors in another file (e.g. PGDATA/copy_ignore_errors.txt). The user is
used to looking for errors in logfile. Creating another file entails
problems like: 'what file name to create?', 'do we need to make file
rotation?', 'where does this file create?' (we can't create it in PGDATA
cause of memory constraints)



Regards,
Damir Belyalov
Postgres Professional
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index c25b52d0cb..50151aec54 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -34,6 +34,7 @@ COPY { table_name [ ( format_name
 FREEZE [ boolean ]
+IGNORE_ERRORS [ boolean ]
 DELIMITER 'delimiter_character'
 NULL 'null_string'
 HEADER [ boolean | MATCH ]
@@ -233,6 +234,18 @@ COPY { table_name [ ( 

 
+   
+IGNORE_ERRORS
+
+ 
+  Drops rows that contain malformed data while copying. These are rows
+  containing syntax errors in data, rows with too many or too few columns,
+  rows containing columns where the data type's input function raises an error.
+  Logs errors to system logfile and outputs the total number of errors.
+ 
+
+   
+

 DELIMITER
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index e34f583ea7..e741ce3e5a 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -407,6 +407,7 @@ ProcessCopyOptions(ParseState *pstate,
    bool is_from,
    List *options)
 {
+	bool		ignore_errors_specified = false;
 	bool		format_specified = false;
 	bool		freeze_specified = false;
 	bool		header_specified = false;
@@ -449,6 +450,13 @@ ProcessCopyOptions(ParseState *pstate,
 			freeze_specified = true;
 			opts_out->freeze = defGetBoolean(defel);
 		}
+		else if (strcmp(defel->defname, "ignore_errors") == 0)
+		{
+			if (ignore_errors_specified)
+errorConflictingDefElem(defel, pstate);
+			ignore_errors_specified = true;
+			opts_out->ignore_errors = defGetBoolean(defel);
+		}
 		else if (strcmp(defel->defname, "delimiter") == 0)
 		{
 			if (opts_out->delim)
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index af52faca6d..657fa44e0b 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -107,6 +107,9 @@ static char *limit_printout_length(const char *str);
 
 static void ClosePipeFromProgram(CopyFromState cstate);
 
+static bool SafeCopying(CopyFromState cstate, ExprContext *econtext,
+		TupleTableSlot *myslot);
+
 /*
  * error context callback for COPY FROM
  *
@@ -625,6 +628,173 @@ CopyMultiInsertInfoStore(CopyMultiInsertInfo *miinfo, ResultRelInfo *rri,
 	miinfo->bufferedBytes += tuplen;
 }
 
+/*
+ * Safely reads source data, converts to a tuple and fills tuple buffer.
+ * Skips some data in the case of failed conversion if data source for
+ * a next tuple can be surely read without a danger.
+ */
+static bool
+SafeCopying(CopyFromState cstate, ExprContext *econtext, TupleTableSlot *myslot)
+{
+	SafeCopyFromState  *sfcstate = cstate->sfcstate;
+	bool valid_row = true;
+
+	/* Standard COPY if IGNORE_ERRORS is disabled */
+	if (!cstate->sfcstate)
+		/* Directly stores the values/nulls array in the slot */
+		return NextCopyFrom(cstate, econtext, myslot->tts_values, myslot->tts_isnull);
+
+	if (sfcstate->replayed_tuples < sfcstate->saved_tuples)
+	{
+		Assert(sfcstate->saved_tuples > 0);
+
+		/* Prepare to replay the tuple */
+		heap_deform_tuple(sfcstate->safe_buffer[sfcstate->replayed_tuples++], RelationGetDescr(cstate->rel),
+		  

Re: Support logical replication of DDLs

2023-02-03 Thread Alvaro Herrera
On 2023-Feb-03, Peter Smith wrote:

> 1.
> (This is not really a review comment - more just an observation...)
> 
> This patch seemed mostly like an assortment of random changes that
> don't seem to have anything in common except that some *later* patches
> of this set are apparently going to want them.

That's true, but from a submitter perspective it is 1000x easier to do
it like this, and for a reviewer these changes are not really very
interesting.  By now, given the amount of review effort that needs to go
into this patch (just because it's 800kb of diff), it seems fairly clear
that we cannot get this patch in time for v16, so it doesn't seem
priority to get this point sorted out.  Personally, from a review point
of view, I would still prefer to have it this way rather than each
change scattered in each individual patch that needs it, so let's not
get too worked out about it at this point.  Maybe if we can find some
use for some of these helpers in existing code that allow refactoring
while introducing these new functions, we can add them ahead of
everything else.

> 3. ExecuteGrantStmt
> 
> + /* Copy the grantor id needed for DDL deparsing of Grant */
> + istmt.grantor_uid = grantor;
> +
> 
> SUGGESTION (comment)
> Copy the grantor id to the parsetree, needed for DDL deparsing of Grant

Is istmt really "the parse tree" actually?  As I recall, it's a derived
struct that's created during execution of the grant/revoke command, so
modifying the comment like this would be a mistake.

> @@ -5922,7 +5922,7 @@ getObjectIdentityParts(const ObjectAddress *object,
>   transformType = format_type_be_qualified(transform->trftype);
>   transformLang = get_language_name(transform->trflang, false);
> 
> - appendStringInfo(, "for %s on language %s",
> + appendStringInfo(, "for %s language %s",
>   transformType,
>   transformLang);
> 
> There is no clue anywhere what this change was for.

We should get the objectIdentity changes ahead of everything else; I
think these can be qualified as bugs (though I would recommend not
backpatching them.)  I think there were two of these.

> 8.
> +/*
> + * Return the given object type as a string.
> + */
> +const char *
> +stringify_objtype(ObjectType objtype, bool isgrant)
> +{

> That 'is_grant' param seemed a bit hacky.
> 
> At least some comment should be given (maybe in the function header?)
> to explain why this boolean is modifying the return string.
> 
> Or maybe it is better to have another stringify_objtype_for_grant that
> just wraps this?

... I don't remember writing this code, but it's probably my fault (was
it 7 years ago now?).  Maybe we can find a different approach that
doesn't need yet another list of object types?  (If I did write it,) we
have a lot more infrastructure now that we had it back then, I think.
In any case it doesn't seem like a function called "stringify_objtype"
with this signature makes sense as an exported function, much less in
utility.c.


-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!"
http://smylers.hates-software.com/2007/08/15/fe244d0c.html




Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-02-03 Thread shveta malik
On Thu, Feb 2, 2023 at 5:01 PM shveta malik  wrote:
>
> Reviewing further
>

Hi Melih,

int64 rep_slot_id;
int64 lastusedid;
int64 sublastusedid

--Should all of the above be unsigned integers?

thanks
Shveta




Re: Pluggable toaster

2023-02-03 Thread Alvaro Herrera
On 2023-Jan-14, Nikita Malakhov wrote:

> Hi!
> Fails due to recent changes. Working on it.

Please see my response here
https://postgr.es/m/20230203095540.zutul5vmsbmantbm@alvherre.pgsql

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)




Re: [PATCH] Compression dictionaries for JSONB

2023-02-03 Thread Alvaro Herrera
This patch came up at the developer meeting in Brussels yesterday.
https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2023_Developer_Meeting#v16_Patch_Triage

First, as far as I can tell, there is a large overlap between this patch
and "Pluggable toaster" patch.  The approaches are completely different,
but they seem to be trying to fix the same problem: the fact that the
default TOAST stuff isn't good enough for JSONB.  I think before asking
developers of both patches to rebase over and over, we should take a
step back and decide which one we dislike the less, and how to fix that
one into a shape that we no longer dislike.

(Don't get me wrong.  I'm all for having better JSONB compression.
However, for one thing, both patches require action from the user to set
up a compression mechanism by hand.  Perhaps it would be even better if
the system determines that a JSONB column uses a different compression
implementation, without the user doing anything explicitly; or maybe we
want to give the user *some* agency for specific columns if they want,
but not force them into it for every single jsonb column.)

Now, I don't think either of these patches can get to a committable
shape in time for v16 -- even assuming we had an agreed design, which
AFAICS we don't.  But I encourage people to continue discussion and try
to find consensus.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Re: Use windows VMs instead of windows containers on the CI

2023-02-03 Thread Thomas Munro
On Fri, Feb 3, 2023 at 6:57 PM Andres Freund  wrote:
> And pushed!  I think an improvement in CI times of this degree is pretty
> awesome.

+1

A lot of CI compute time is saved.  The Cirrus account[1] was
previously hitting the 4 job limit all day long, and now it's often
running 1 or 2 jobs when I look, and it has space capacity to start a
new job immediately if someone posts a new patch.  I'll monitor it
over the next few days but it looks great.

[1] https://cirrus-ci.com/github/postgresql-cfbot/postgresql




Re: Where is the logig to create a table file?

2023-02-03 Thread Pavel Borisov
Hi, Jack

On Fri, 3 Feb 2023 at 13:19, jack...@gmail.com  wrote:
>
> When I use 'create table t(a int);'; suppose that this table t's oid is 1200,
> then postgres will create a file named 1200 in the $PGDATA/base, So where
> is the logic code in the internal?
>
heapam_relation_set_new_filenode()->RelationCreateStorage()

Kind regards,
Pavel Borisov,
Supabase




Re: Perform streaming logical transactions by background workers and parallel apply

2023-02-03 Thread Amit Kapila
On Fri, Feb 3, 2023 at 1:28 PM Masahiko Sawada  wrote:
>
> On Fri, Feb 3, 2023 at 12:29 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Friday, February 3, 2023 11:04 AM Amit Kapila  
> > wrote:
> > >
> > > On Thu, Feb 2, 2023 at 4:52 AM Peter Smith 
> > > wrote:
> > > >
> > > > Some minor review comments for v91-0001
> > > >
> > >
> > > Pushed this yesterday after addressing your comments!
> >
> > Thanks for pushing.
> >
> > Currently, we have two remaining patches which we are not sure whether it's 
> > worth
> > committing for now. Just share them here for reference.
> >
> > 0001:
> >
> > Based on our discussion[1] on -hackers, it's not clear that if it's 
> > necessary
> > to add the sub-feature to stop extra worker when
> > max_apply_workers_per_suibscription is reduced. Because:
> >
> > - it's not clear whether reducing the 'max_apply_workers_per_suibscription' 
> > is very
> >   common.
>
> A use case I'm concerned about is a temporarily intensive data load,
> for example, a data loading batch job in a maintenance window. In this
> case, the user might want to temporarily increase
> max_parallel_workers_per_subscription in order to avoid a large
> replication lag, and revert the change back to normal after the job.
> If it's unlikely to stream the changes in the regular workload as
> logical_decoding_work_mem is big enough to handle the regular
> transaction data, the excess parallel workers won't exit.
>

Won't in such a case, it would be better to just switch off the
parallel option for a subscription? We need to think of a predictable
way to test this path which may not be difficult. But I guess it would
be better to wait for some feedback from the field about this feature
before adding more to it and anyway it shouldn't be a big deal to add
this later as well.

-- 
With Regards,
Amit Kapila.




RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-03 Thread wangw.f...@fujitsu.com
On Thurs, Feb 2, 2023 16:04 PM Takamichi Osumi (Fujitsu) 
 wrote:
> Attached the updated patch v26 accordingly.

Thanks for your patch.

Here is a comment:

1. The checks in function AlterSubscription
+   /*
+* The combination of parallel 
streaming mode and
+* min_apply_delay is not allowed. See
+* parse_subscription_options for 
details of the reason.
+*/
+   if (opts.streaming == 
LOGICALREP_STREAM_PARALLEL)
+   if ((IsSet(opts.specified_opts, 
SUBOPT_MIN_APPLY_DELAY) && opts.min_apply_delay > 0) ||
+   
(!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0))
and
+   /*
+* The combination of parallel 
streaming mode and
+* min_apply_delay is not allowed.
+*/
+   if (opts.min_apply_delay > 0)
+   if ((IsSet(opts.specified_opts, 
SUBOPT_STREAMING) && opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
+   
(!IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream == 
LOGICALREP_STREAM_PARALLEL))

I think the case where the options "min_apply_delay>0" and "streaming=parallel"
are set at the same time seems to have been checked in the function
parse_subscription_options, how about simplifying these two if-statements here
to the following:
```
if (opts.streaming == LOGICALREP_STREAM_PARALLEL &&
!IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
sub->minapplydelay > 0)

and

if (opts.min_apply_delay > 0 &&
!IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
sub->stream == LOGICALREP_STREAM_PARALLEL)
```

Regards,
Wang Wei


Re: Weird failure with latches in curculio on v15

2023-02-03 Thread Andres Freund
Hi, 

On February 3, 2023 9:19:23 AM GMT+01:00, Thomas Munro  
wrote:
>On Fri, Feb 3, 2023 at 9:09 PM Andres Freund  wrote:
>> Thinking about popen() suggests that we have a similar problem with COPY
>> FROM PROGRAM as we have in pgarch (i.e. not as bad as the startup
>> process issue, but still not great, due to
>> procsignal_sigusr1_handler()).
>
>A small mercy: while we promote some kinds of fatal-ish signals to
>group level with kill(-PID, ...), we don't do that for SIGUSR1 for
>latches or procsignals.

Not as bad, but we still do SetLatch() from a bunch of places that would be 
reached... 

Andres 
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Where is the logig to create a table file?

2023-02-03 Thread jack...@gmail.com
When I use 'create table t(a int);'; suppose that this table t's oid is 1200,
then postgres will create a file named 1200 in the $PGDATA/base, So where
is the logic code in the internal?


--



jack...@gmail.com




Re: Non-superuser subscription owners

2023-02-03 Thread Andres Freund
Hi,

On 2023-02-02 09:28:03 -0500, Robert Haas wrote:
> I don't know what you mean by this. DML doesn't confer privileges. If
> code gets executed and runs with the replication user's credentials,
> that could lead to privilege escalation, but just moving rows around
> doesn't, at least not in the database sense.

Executing DML ends up executing code. Think predicated/expression
indexes, triggers, default expressions etc. If a badly written trigger
etc can be tricked to do arbitrary code exec, an attack will be able to
run with the privs of the run-as user.  How bad that is is influenced to
some degree by the amount of privileges that user has.

Greetings,

Andres Freund




Re: Weird failure with latches in curculio on v15

2023-02-03 Thread Thomas Munro
On Fri, Feb 3, 2023 at 9:09 PM Andres Freund  wrote:
> Thinking about popen() suggests that we have a similar problem with COPY
> FROM PROGRAM as we have in pgarch (i.e. not as bad as the startup
> process issue, but still not great, due to
> procsignal_sigusr1_handler()).

A small mercy: while we promote some kinds of fatal-ish signals to
group level with kill(-PID, ...), we don't do that for SIGUSR1 for
latches or procsignals.




Re: Weird failure with latches in curculio on v15

2023-02-03 Thread Andres Freund
Hi,

On 2023-02-03 02:50:38 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-02-03 02:24:03 -0500, Tom Lane wrote:
> >> setsid(2) is required since SUSv2, so I'm not sure which systems
> >> are of concern here ... other than Redmond's of course.
>
> > I was thinking of windows, yes.
>
> But given the lack of fork(2), Windows requires a completely
> different solution anyway, no?

Not sure it needs to be that different. I think what we basically want
is:

1) Something vaguely popen() shaped that starts a subprocess, while
   being careful about signal handlers, returning the pid of the child
   process. Not sure if we want to redirect stdout/stderr or
   not. Probably not?

2) A blocking wrapper around 1) that takes care to forward fatal signals
   to the subprocess, including in the SIGQUIT case and probably being
   interruptible with query cancels etc in the relevant process types.


Thinking about popen() suggests that we have a similar problem with COPY
FROM PROGRAM as we have in pgarch (i.e. not as bad as the startup
process issue, but still not great, due to
procsignal_sigusr1_handler()).


What's worse, the problem exists for untrusted PLs as well, and
obviously we can't ensure that signals are correctly masked there.

This seems to suggest that we ought to install a MyProcPid != getpid()
like defense in all our signal handlers...

Greetings,

Andres Freund