date:20231126

Re: Add recovery to pg_control and remove backup_label

2023-11-26 Thread Stephen Frost

Greetings,

* David Steele (da...@pgmasters.net) wrote:
> On 11/21/23 12:41, Andres Freund wrote:
> > Sure. They also receive a backup_label today. If an external solution 
> > forgets
> > to replace pg_control copied as part of the filesystem copy, they won't get 
> > an
> > error after the remove of backup_label, just like they don't get one today 
> > if
> > they don't put backup_label in the data directory.  Given that users don't 
> > do
> > the right thing with backup_label today, why can we rely on them doing the
> > right thing with pg_control?
> 
> I think reliable backup software does the right thing with backup_label, but
> if the user starts getting errors on recovery they the decide to remove
> backup_label. I know we can't do much about bad backup software, but we can
> at least make this a bit more resistant to user error after the fact.
> 
> It doesn't help that one of our hints suggests removing backup_label. In
> highly automated systems, the user might not even know they just restored
> from a backup. They are only in the loop because the restore failed and they
> are trying to figure out what is going wrong. When they remove backup_label
> the cluster comes up just fine. Victory!

Yup, this is exactly the issue.

> This is the scenario I've seen most often -- not the backup/restore process
> getting it wrong but the user removing backup_label on their own initiative.
> And because it yields such a positive result, at least initially, they
> remember in the future that the thing to do is to remove backup_label
> whenever they see the error.
> 
> If they only have pg_control, then their only choice is to get it right or
> run pg_resetwal. Most users have no knowledge of pg_resetwal so it will take
> them longer to get there. Also, I think that tool make it pretty clear that
> corruption will result and the only thing to do is a logical dump and
> restore after using it.

Agreed.

> There are plenty of ways a user can mess things up. What I'd like to prevent
> is the appearance of everything being OK when in fact they have corrupted
> their cluster. That's the situation we have now with backup_label. Is this
> new solution perfect? No, but I do think it checks several boxes, and is a
> worthwhile improvement.

+1.

As for the complaint about 'operators' having issue with the changes
we've been making in this area- where are those people complaining,
exactly?  Who are they?  I feel like we keep getting this kind of
push-back in this area from folks on this list but not from actual
backup software authors; all the complaints seem to either be 
speculative or unattributed pass-through from someone else.

What would really be helpful would be hearing from these individuals
directly as to what the issues are with the changes, such that perhaps
we can do things better in the future to avoid whatever the issue is
they're having with the changes.  Simply saying we shouldn't make
changes in this area isn't workable and the constant push-back is
actively discouraging to folks trying to make improvements.  Obviously
it's a biased view, but we've not had issues making the necessary
adjustments in pgbackrest with each release and I feel like if the
authors of wal-g or barman did that they would have spoken up.

Making a change as suggested which only helps pg_basebackup (and tools
like pgbackrest, since it's in C and can also make this particular
change) ends up leaving tools like wal-g and barman potentially still
with an easy way for users of those tools to corrupt their databases-
even though we've not heard anything from the authors of those tools
about issues with the proposed change, nor have there been a lot of
complaints from them about the prior changes to indicate that they'd
even have an issue with the more involved change.  Given the lack of
complaint about past changes, I'd certainly rather err on the side of
improved safety for users than on the side of the authors of these tools
possibly complaining.

What these changes have done is finally break things like omnipitr
completely, which hasn't been maintained in a very long time.  The
changes in v12 broke recovery with omnipitr but not backup, and folks
were trying to use omnipitr as recently as with v13[1].  Certainly
having a backup tool that only works for backup (fsvo works, anyway, as
it still used exclusive backup mode meaning that a crash during a backup
would cause the system to not come back up after...) but doesn't work
for recovery isn't exactly great and I'm glad that, now, an attempt to
use omnipitr to perform a backup will fail.  As with lots of other areas
of PG, folks need to read the release notes and potentially update their
code for new major versions.  If anything, the backup area is less of an
issue for this because the authors of the backup tools are able to make
the change (and who are often the ones pushing for these changes) and
the end-user isn't impacted at all.

Much the same can be said for wal-e, with users st

Missing docs on AT TIME ZONE precedence?

2023-11-26 Thread Shay Rojansky

Greeting hackers,

In the operator precedence table[1] table, AT TIME ZONE isn't explicitly
listed out; that means it's to be interpreted in the "any other operator
category".

However, it seems that the precedence of AT TIME ZONE is actually higher
than that of the addition operator:

-- Fails with "function pg_catalog.timezone(unknown, interval) does not
exist
SELECT now() + INTERVAL '14 days' AT TIME ZONE 'UTC';

-- Works:
SELECT (now() + INTERVAL '14 days') AT TIME ZONE 'UTC';

Note that missing parentheses for this were discussed in the context
of pg_catalog.pg_get_viewdef[2].

Is there a missing line in the operator precedence table in the docs?

Thanks,

Shay

[1]
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-PRECEDENCE
[2]
https://www.postgresql.org/message-id/flat/f41566aa-a057-6628-4b7c-b48770ecb...@deepbluecap.com

Re: Missing docs on AT TIME ZONE precedence?

2023-11-26 Thread Bruce Momjian

On Sun, Nov 26, 2023 at 11:13:39AM +0100, Shay Rojansky wrote:
> Greeting hackers,
> 
> In the operator precedence table[1] table, AT TIME ZONE isn't explicitly 
> listed
> out; that means it's to be interpreted in the "any other operator category".
> 
> However, it seems that the precedence of AT TIME ZONE is actually higher than
> that of the addition operator:
> 
> -- Fails with "function pg_catalog.timezone(unknown, interval) does not exist
> SELECT now() + INTERVAL '14 days' AT TIME ZONE 'UTC';
> 
> -- Works:
> SELECT (now() + INTERVAL '14 days') AT TIME ZONE 'UTC';
> 
> Note that missing parentheses for this were discussed in the context
> of pg_catalog.pg_get_viewdef[2].
> 
> Is there a missing line in the operator precedence table in the docs?

I think the big question is whether AT TIME ZONE is significant enough
to list there because there are many other clauses we could potentially
add there.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.

Should timezone be inherited from template database?

2023-11-26 Thread Anton A. Melnikov


Hello!

Found that if i set a specific time zone for a template database,
it will not be inherited in the database created from that template.

psql (17devel)
Type "help" for help.

postgres=# select now();
  now
---
 2023-11-26 17:24:58.242086+03
(1 row)

postgres=# ALTER DATABASE template1 SET TimeZone = 'UTC';
ALTER DATABASE
postgres=# \c template1
You are now connected to database "template1" as user "postgres".
template1=# select now();
  now
---
 2023-11-26 14:26:09.291082+00
(1 row)

template1=# CREATE DATABASE test;
CREATE DATABASE
template1=# \c test
You are now connected to database "test" as user "postgres".
test=# select now();
  now
---
 2023-11-26 17:29:05.487984+03
(1 row)

Could you clarify please. Is this normal, predictable behavior?

Would be very grateful!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Should timezone be inherited from template database?

2023-11-26 Thread David G. Johnston

On Sun, Nov 26, 2023 at 7:47 AM Anton A. Melnikov 
wrote:

>
> postgres=# ALTER DATABASE template1 SET TimeZone = 'UTC';
>
> Could you clarify please. Is this normal, predictable behavior?
>
>
https://www.postgresql.org/docs/current/sql-createdatabase.html

 Database-level configuration parameters (set via ALTER DATABASE) and
database-level permissions (set via GRANT) are not copied from the template
database.

David J.

Re: Should timezone be inherited from template database?

2023-11-26 Thread Anton A. Melnikov




On 26.11.2023 18:53, David G. Johnston wrote:


https://www.postgresql.org/docs/current/sql-createdatabase.html 


  Database-level configuration parameters (set via ALTER DATABASE) and 
database-level permissions (set via GRANT) are not copied from the template 
database.



Clear. Thank you very much!

With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Missing docs on AT TIME ZONE precedence?

2023-11-26 Thread Tom Lane

Bruce Momjian  writes:
> On Sun, Nov 26, 2023 at 11:13:39AM +0100, Shay Rojansky wrote:
>> Is there a missing line in the operator precedence table in the docs?

> I think the big question is whether AT TIME ZONE is significant enough
> to list there because there are many other clauses we could potentially
> add there.

Comparing the precedence list in the grammar with the doc table,
the only omissions I feel bad about are AT and COLLATE.  There's
a group of keywords that have "almost the same precedence as IDENT"
which probably don't need documentation; but these are not in that
group.

I am, however, feeling a little bit on the warpath about the
grammar comments for the SQL/JSON keyword precedences:

/* SQL/JSON related keywords */
%nonassoc   UNIQUE JSON
%nonassoc   KEYS OBJECT_P SCALAR VALUE_P
%nonassoc   WITH WITHOUT

Every other case where we're doing this has a para of explanation
in the block comment just below here.  These not only have no
meaningful explanation, they are in the wrong place --- it looks
like they are unrelated to the block comment, whereas actually
(I think) they are another instance of it.  I consider this
well below project standard.

regards, tom lane

Re: WIP: libpq: add a possibility to not send D(escribe) when executing a prepared statement

2023-11-26 Thread Ivan Trofimov


In a presumably very common case of repeatedly executing the same statement, 
this leads to
both client and server parsing/sending exactly the same RowDescritpion data 
over and over again.
Instead, library user could acquire a statement result RowDescription once (via 
PQdescribePrepared),
and reuse it in subsequent calls to PQexecPrepared and/or its async friends.

But what if query result structure changes? Will we detect this error 
gracefully and return correct error?


Afaik changing prepared statement result structure is prohibited by
Postgres server-side, and should always lead to "ERROR: cached plan
must not change result type", see src/test/regress/sql/plancache.sql.

So yes, from the libpq point of view this is just an server error, which
would be given to the user, the patch shouldn't change any behavior here.

The claim about this always being a server-side error better be
reassured from someone from the Postgres team, of course.

Re: WIP: libpq: add a possibility to not send D(escribe) when executing a prepared statement

2023-11-26 Thread Tom Lane

Ivan Trofimov  writes:
> Afaik changing prepared statement result structure is prohibited by
> Postgres server-side, and should always lead to "ERROR: cached plan
> must not change result type", see src/test/regress/sql/plancache.sql.

Independently of whether we're willing to guarantee that that will
never change, I think this patch is basically a bad idea as presented.
It adds a whole new set of programmer-error possibilities, and I doubt
it saves enough in typical cases to justify creating such a foot-gun.

Moreover, it will force us to devote attention to the problem of
keeping libpq itself from misbehaving badly in the inevitable
situation that somebody passes the wrong tuple descriptor.
That is developer effort we could better spend elsewhere.

I say this as somebody who deliberately designed the v3 protocol
to allow clients to skip Describe steps if they were confident
they knew the query result type.  I am not disavowing that choice;
I just think that successful use of that option requires a client-
side coding structure that allows tying a previously-obtained
tuple descriptor to the current query with confidence.  The proposed
API fails badly at that, or at least leaves it up to the end-user
programmer while providing no tools to help her get it right.

Instead, I'm tempted to suggest having PQprepare/PQexecPrepared
maintain a cache that maps statement name to result tupdesc, so that
this is all handled internally to libpq.  The main hole in that idea
is that it's possible to issue PREPARE, DEALLOCATE, etc via PQexec, so
that a user could possibly redefine a prepared statement without libpq
noticing it.  Maybe that's not a big issue.  For a little more safety,
we could add some extra step that the library user has to take to
enable caching of result tupdescs, whereupon it's definitely caller
error if she does that and then changes the statement behind our back.

BTW, the submitted patch lacks both documentation and tests.
For a feature like this, there is much to be said for writing
the documentation *first*.  Having to explain how to use something
often helps you figure out weak spots in your initial design.

regards, tom lane

Re: Missing docs on AT TIME ZONE precedence?

2023-11-26 Thread Shay Rojansky

>> Is there a missing line in the operator precedence table in the docs?
>
> I think the big question is whether AT TIME ZONE is significant enough
> to list there because there are many other clauses we could potentially
> add there.

Just to give more context, I'm a maintainer on Entity Framework Core (the
.NET ORM), and this caused the provider to generate incorrect SQL etc.

If you decide to not have a comprehensive operator precedence table (though
I do hope you do), I'd at least amend the "any other operator" and "all
other native and user-defined operators" to clearly indicate that some
operators aren't listed and have undocumented precedences, so implementers
can at least be aware and test the unlisted ones etc.

Re: Schema variables - new implementation for Postgres 15

2023-11-26 Thread Dmitry Dolgov

> On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote:
> so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
> > As a side note, I'm intended to go one more time through the first few
> > patches introducing the basic functionality, and then mark it as ready
> > in CF. I can't break the patch in testing since quite long time, and for
> > most parts the changes make sense to me.
>
> I marked pg_session_variables function as PARALLEL RESTRICTED, and did
> rebase

So, after one week of uninterrupted evening reviews I've made it through
the first four patches :)

It's a decent job -- more than once, looking at the code, I thought I
could construct a case when it's going to blow up, but everything was
working just fine. Yet, I think the patch still has to be reshaped a bit
before moving forward. I've got a couple proposals of different nature:
high level changes (you probably won't like some of them, but I'm sure
they're going to be useful), technical code-level improvements/comments,
and few language changes. With those changes in mind I would be
satisfied with the patch, and hopefully they would also make it easier
for a potential committer to pick it up.

# High level proposals

* I would suggest reducing the scope of the patch as much as possible,
  and not just by trimming on the edges, but rather following Phileas
  Fogg's example with the steamboat Henrietta -- get rid of all
  non-essential parts. This will make this rather large patch more
  approachable for others.

  For that one can concentrate only on the first two patches plus the
  fourth one (memory cleanup after dropping variables), leaving DISCARD,
  ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the
  future.

  Another thing in this context would be to evaluate plpgsql support for
  this feature. You know the use case better than me, how important it
  is? Is it an intrinsic part of the feature, or session variables could
  be still valuable enough even without plpgsql? From what I see
  postponing plgpsql will make everything about ~800 lines lighter (most
  likely more), and also allow to ignore couple of concerns about the
  implementation (about this later).

* The new GUC session_variables_ambiguity_warning is definitely going to
  cause many objections, it's another knob to manage very subtle
  behaviour detail very few people will ever notice. I see the point
  behind warning about ambiguity, so probably it makes sense to bite the
  bullet and decide one way or another. The proposal is to warn always
  in potentially ambiguous situations, and if concerns are high about
  logging too much, maybe do the warning on lower logging levels.

# Code-level observations

* It feels a bit awkward to have varid assignment logic in a separate
  function, what about adding an argument with varid to
  CreateVariableDestReceiver? SetVariableDestReceiverVarid still could
  be used for CreateDestReceiver.

/*
 * Initially create a DestReceiver object.
 */
DestReceiver *
CreateVariableDestReceiver(void)

/*
 * Set parameters for a VariableDestReceiver.
 * Should be called right after creating the DestReceiver.
 */
void
SetVariableDestReceiverVarid(DestReceiver *self, Oid varid)

* It's worth it to add a commentary here explaining why it's fine to use
  InvalidOid here:

 if (pstmt->commandType != CMD_UTILITY)
-   ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+   ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, paramLI, 
queryEnv,
   &planduration, (es->buffers ? &bufusage : NULL));

  My understanding is that since LetStmt is CMD_UTILITY, this branch
  will never be visited for a session variable.

* IIUC this one is introduced to exclude session variables from the normal
  path with EXPR_KIND_UPDATE_TARGET:

+   EXPR_KIND_ASSIGN_VARIABLE,  /* PL/pgSQL assignment target - disallow
+* session 
variables */

  But the name doesn't sound right, maybe longer
  EXPR_KIND_UPDATE_TARGET_NO_VARS is better?

* I'm curious about this one, which exactly part does this change cover?

@@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node,
-   if (param->paramkind != PARAM_EXTERN)
+   if (param->paramkind != PARAM_EXTERN &&
+   param->paramkind != PARAM_VARIABLE)
elog(ERROR, "unexpected paramkind: %d", (int) 
param->paramkind);

  I've commented it out, but no tests were affected.

* Does it mean there could be theoretically two LET statements at the
  same time with different command type, one CMD_UTILITY, one
  CMD_SELECT? Can it cause any issues?

+   /*
+* Inside PL/pgSQL we don't want to execute LET statement as utility
+* command, because it disallow to execute expression as simple
+* expression. So for PL/pgSQL we have

Re: Schema variables - new implementation for Postgres 15

2023-11-26 Thread Pavel Stehule

Hi

st 22. 11. 2023 v 7:20 odesílatel Julien Rouhaud 
napsal:

> Hi,
>
> On Tue, Oct 17, 2023 at 08:52:13AM +0200, Pavel Stehule wrote:
> >
> > When I thought about global temporary tables, I got one maybe interesting
> > idea. The one significant problem of global temporary tables is place for
> > storing info about size or column statistics.
> >
> > I think so these data can be stored simply in session variables. Any
> global
> > temporary table can get assigned one session variable, that can hold
> these
> > data.
>
> I don't know how realistic this would be.  For instance it will require to
> properly link the global temporary table life cycle with the session
> variable
> and I'm afraid it would require to add some hacks to make it work as
> needed.
>
> But this still raises the question of whether this feature could be used
> internally for the need of another feature.  If we think it's likely,
> should we
> try to act right now and reserve the "pg_" prefix for internal use rather
> than
> do that a few years down the line and probably break some user code as it
> was
> done recently for the role names?
>

I don't think it is necessary. Session variables (in this design) are
joined with schemas. If we use some session variables for system purposes,
we can use some dedicated schema. But when I think about it in detail,
probably my own dedicated storage (hash table in session memory) can be
much better than session variables. What can be shared (maybe) is probably
sinval message processing.

Re: Missing docs on AT TIME ZONE precedence?

2023-11-26 Thread Tom Lane

I wrote:
> Comparing the precedence list in the grammar with the doc table,
> the only omissions I feel bad about are AT and COLLATE.

Concretely, as attached.

regards, tom lane

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 37817d0638..4dfbbd0862 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1065,6 +1065,18 @@ CAST ( 'string' AS type )
unary plus, unary minus
   
 
+  
+   COLLATE
+   left
+   collation selection
+  
+
+  
+   AT
+   left
+   AT TIME ZONE, AT LOCAL
+  
+
   
^
left
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c224df4ecc..8c00b119ec 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -858,7 +858,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %left		'*' '/' '%'
 %left		'^'
 /* Unary Operators */
-%left		AT/* sets precedence for AT TIME ZONE */
+%left		AT/* sets precedence for AT TIME ZONE, AT LOCAL */
 %left		COLLATE
 %right		UMINUS
 %left		'[' ']'

Re: Improve rowcount estimate for UNNEST(column)

2023-11-26 Thread Laurenz Albe

On Sat, 2023-11-25 at 09:19 -0800, Paul A Jungwirth wrote:
> Here is a patch to improve rowcount estimates for
> `UNNEST(some_array_column)`. Today we hard code this to 10, but we
> have statistics about array size, so it's easy to use them.
> 
> I've seen plans where this would make a difference. If the array has
> only 1 or 2 elements, then overestimating the rowcount by 10 leads to
> unnecessary seqscans downstream. I can see how an underestimate would
> cause issues too.

The idea sounds good to me.
I didn't test or scrutinize the code, but I noticed that you use
EXPLAIN in the regression tests.  I think that makes the tests vulnerable
to changes in the parameters or in the block size.
Perhaps you can write a function that runs EXPLAIN and extracts just the
row count.  That should be stable enough.

Yours,
Laurenz Albe

Re: Improve rowcount estimate for UNNEST(column)

2023-11-26 Thread Tom Lane

Laurenz Albe  writes:
> On Sat, 2023-11-25 at 09:19 -0800, Paul A Jungwirth wrote:
>> Here is a patch to improve rowcount estimates for
>> `UNNEST(some_array_column)`. Today we hard code this to 10, but we
>> have statistics about array size, so it's easy to use them.

> The idea sounds good to me.

I didn't read the patch either yet, but it seems like a reasonable idea.

> I didn't test or scrutinize the code, but I noticed that you use
> EXPLAIN in the regression tests.  I think that makes the tests vulnerable
> to changes in the parameters or in the block size.

Yes, this regression test is entirely unacceptable; the numbers will
not be stable enough.  Even aside from the different-settings issue,
you can't rely on ANALYZE deriving exactly the same stats every time.
Usually what we try to do is devise a query where the plan shape
changes because of the better estimate.  That typically will provide
some insulation against small changes in the numerical estimates.

regards, tom lane

Re: GUC names in messages

2023-11-26 Thread Peter Smith

On Fri, Nov 24, 2023 at 2:11 PM Michael Paquier  wrote:
>
> On Thu, Nov 23, 2023 at 06:27:04PM +1100, Peter Smith wrote:
> > There may be some changes I've missed, but hopefully, this is a nudge
> > in the right direction.
>
> Thanks for spending some time on that.
>
> 
> +In messages containing configuration variable names, do not include 
> quotes
> +when the names are visibly not English natural words, such as when they
> +have underscores or are all-uppercase or have mixed case. Otherwise, 
> quotes
> +must be added.  Do include quotes in a message where an arbitrary 
> variable
> +name is to be expanded.
> +   
>
> That seems to describe clearly the consensus reached on the thread
> (quotes for GUCs that are single terms, no quotes for names that are
> obviously parameters).
>
> In terms of messages that have predictible names, 0002 moves in the
> needle in the right direction.  There seem to be more:
> src/backend/postmaster/bgworker.c:  errhint("Consider increasing the
> configuration parameter \"max_worker_processes\".")));
> contrib/pg_prewarm/autoprewarm.c:  errhint("Consider increasing
> configuration parameter \"max_worker_processes\".")));

Done in patch 0002

>
> Things like parse_and_validate_value() and set_config_option_ext()
> include log strings about GUC and these use quotes.  Could these areas
> be made smarter with a routine to check if quotes are applied
> automatically when we have a "simple" GUC name, aka I guess made of
> only lower-case characters?  This could be done with a islower() on
> the string name, for instance.

See what you think of patch 0003

~~

PSA v2 patches.

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-GUC-names-docs.patch
Description: Binary data


v2-0002-GUC-names-fix-quotes.patch
Description: Binary data


v2-0003-GUC-names-maybe-add-quotes.patch
Description: Binary data

Re: GUC names in messages

2023-11-26 Thread Peter Smith

On Fri, Nov 24, 2023 at 8:53 PM Alvaro Herrera  wrote:
>
> On 2023-Nov-24, Michael Paquier wrote:
>
> > On Thu, Nov 23, 2023 at 06:27:04PM +1100, Peter Smith wrote:
> > > There may be some changes I've missed, but hopefully, this is a nudge
> > > in the right direction.
> >
> > Thanks for spending some time on that.
>
> +1
>
> > 
> > +In messages containing configuration variable names, do not include 
> > quotes
> > +when the names are visibly not English natural words, such as when they
> > +have underscores or are all-uppercase or have mixed case. Otherwise, 
> > quotes
> > +must be added.  Do include quotes in a message where an arbitrary 
> > variable
> > +name is to be expanded.
> > +   
> >
> > That seems to describe clearly the consensus reached on the thread
> > (quotes for GUCs that are single terms, no quotes for names that are
> > obviously parameters).
>
> Yeah, this is pretty much the patch I proposed earlier.
>
> > In terms of messages that have predictible names, 0002 moves in the
> > needle in the right direction.  There seem to be more:
> > src/backend/postmaster/bgworker.c:  errhint("Consider increasing the
> > configuration parameter \"max_worker_processes\".")));
> > contrib/pg_prewarm/autoprewarm.c:  errhint("Consider increasing
> > configuration parameter \"max_worker_processes\".")));
>
> Yeah.  Also, these could be changed to have the GUC name outside the
> message proper, which would reduce the total number of messages.  (But
> care must be given to the word "the" there.)
>

 I had posted something similar a few posts back [1], but it just
caused more questions unrelated to GUC name quotes so I abandoned that
temporarily.

So for now, I hope this thread can be only about quotes on GUC names,
otherwise, I thought it may become stuck debating dozens of individual
messages. Certainly later, or in another thread, we can revisit all
messages again to try to identify/extract any "common" ones.

> > Things like parse_and_validate_value() and set_config_option_ext()
> > include log strings about GUC and these use quotes.  Could these areas
> > be made smarter with a routine to check if quotes are applied
> > automatically when we have a "simple" GUC name, aka I guess made of
> > only lower-case characters?  This could be done with a islower() on
> > the string name, for instance.
>
> I think we could leave these improvements for a second round.  They
> don't need to hold back the improvement we already have.
>

I tried something for this already but kept it in a separate patch. See v2-0003

==
[1] 
https://www.postgresql.org/message-id/CAHut%2BPv8VG7fvXzg5PNeQuUhJG17xwCWNpZSUUkN11ArV%3D%3DCdg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-26 Thread Alexander Korotkov

Alexander, Maxim,

Thank you for revisions.

On Thu, Nov 9, 2023 at 6:22 PM Maxim Orlov  wrote:
> Aleksander Alekseev,
>
> > Maxim,
> > I see both of us accounted for Alexanders feedback and submitted v59.
> > Your newer version seems to have issues on cfbot, so resubmitting the
> > previous patchset that passes the tests. Please feel free to add
> > changes.
>
> For unknown reasons, I do not receive any of your emails from after 
> 2023-11-07 11:57:12 (Message-ID: 
> CAJ7c6TN1hKqNPGrNcq96SUyD=z61rakgxf8iqq36qr90oud...@mail.gmail.com).
> Even after resend.
>
> Anyway, PFA patch set of version 61.  I've made some minor changes in the 
> 0001 and add 004 in order to test actual 64-bit SLRU pages.
>
> As for CF bot had failed on my v59 patch set, this is because of the bug.  
> It's manifested because of added 64-bit pages tests.
> The problem was in segno calculation, since we convert it from file name 
> using strtol call.  But strtol return long,
> which is 4 byte long in x86.
>
> -   segno = (int) strtol(clde->d_name, NULL, 16);
> +   segno = strtoi64(clde->d_name, NULL, 16);

v61 looks good to me.  I'm going to push it as long as there are no objections.

--
Regards,
Alexander Korotkov

Re: PATCH: Add REINDEX tag to event triggers

2023-11-26 Thread jian he

On Fri, Nov 24, 2023 at 10:44 AM Michael Paquier  wrote:
>
> As far as I can see, this patch is doing too much as presented.  Could
> you split the patch into more pieces, please?  Based on v4 you have
> sent, there are refactoring and basic piece parts like:
> - Patch to make event triggers ddl_command_start and ddl_command_stop
> react on ReindexStmt, so as a command is reported in
> pg_event_trigger_ddl_commands().
> - Patch for GetCommandLogLevel() to switch ReindexStmt from
> LOGSTMT_ALL to LOGSTMT_DDL.  True that this could be switched.
> - Patch to refactor the routines of indexcmds.c and index.c to use the
> ReindexStmt as argument, as a preparation of the next patch...
> - Patch to add a new event type with a new SQL function to return a
> list of the indexes rebuilt, with the ReindexStmt involved when the
> index OID was trapped by the collection.
>
> 1) and 2) are a minimal feature in themselves, which may be enough to
> satisfy your original case, and where you'd know when a REINDEX has
> been run in event triggers.  3) and 4) are what you are trying to
> achieve, to get the list of the indexes rebuilt knowing the context of
> a given command.
> --
> Michael

hi.
v5-0001. changed the REINDEX command tag from event_trigger_ok: false
to event_trigger_ok: true.
Move ReindexStmt moves from standard_ProcessUtility to ProcessUtilitySlow.
By default ProcessUtilitySlow will call trigger related functions.
So the event trigger facility can support reindex statements.

v5-0002. In GetCommandLogLevel, change T_ReindexStmt from lev =
LOGSTMT_ALL to lev = LOGSTMT_DDL. so log_statement (enum) >= 'ddl'
will make the reindex statement be logged.

v5-0003. Refactor the following functions: {ReindexIndex,
ReindexTable, ReindexMultipleTables,
ReindexPartitions,ReindexMultipleInternal
,ReindexRelationConcurrently, reindex_relation,reindex_index} by
adding `const ReindexStmt *stmt` as their first argument.
This is for event trigger support reindex. We need to pass both the
newly generated indexId and the ReindexStmt to
EventTriggerCollectSimpleCommand right after the newly index gets
their lock. To do that, we have to refactor related functions.

v5-0004. Event trigger support reindex command implementation,
documentation, regress test, helper function pass reindex info to
function EventTriggerCollectSimpleCommand.
From a72aaf4dcc2ae70766f5380855e7013035c63d9c Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 24 Nov 2023 20:44:44 +0800
Subject: [PATCH v5 2/4] tag ReindexStmt as ddl in GetCommandLogLevel.

In GetCommandLogLevel, change T_ReindexStmt from lev = LOGSTMT_ALL to lev = LOGSTMT_DDL.
so log_statement (enum) >= 'ddl' will make the reindex statement be logged.
---
 src/backend/tcop/utility.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 5a7f63cb..a269b5b6 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -3628,7 +3628,7 @@ GetCommandLogLevel(Node *parsetree)
 			break;

 		case T_ReindexStmt:
-			lev = LOGSTMT_ALL;	/* should this be DDL? */
+			lev = LOGSTMT_DDL;
 			break;

 		case T_CreateConversionStmt:
-- 
2.34.1

From 864f53c932cd2b692b5042bbfddc021d5e49ef0c Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Sat, 25 Nov 2023 09:28:07 +0800
Subject: [PATCH v5 4/4] Event trigger support reindex command. this add
 documentation, regress test, static function pass reindex info to function
 EventTriggerCollectSimpleCommand.

---
 doc/src/sgml/event-trigger.sgml |   8 ++
 src/backend/catalog/index.c |   5 +-
 src/backend/commands/indexcmds.c|  17 
 src/test/regress/expected/event_trigger.out | 103 
 src/test/regress/sql/event_trigger.sql  |  93 ++
 5 files changed, 225 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml
index 10b20f03..234b4ffd 100644
--- a/doc/src/sgml/event-trigger.sgml
+++ b/doc/src/sgml/event-trigger.sgml
@@ -1032,6 +1032,14 @@
 -

+   
+REINDEX
+X
+X
+-
+-
+
+   

 REVOKE
 X
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index fc09191c..2afabdeb 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -136,7 +136,7 @@ static void SetReindexProcessing(Oid heapOid, Oid indexOid);
 static void ResetReindexProcessing(void);
 static void SetReindexPending(List *indexes);
 static void RemoveReindexPending(Oid indexOid);
-
+static void reindex_event_trigger_collect(Oid oid, const ReindexStmt *stmt);

 /*
  * relationHasPrimaryKey
@@ -3642,6 +3642,9 @@ reindex_index(const ReindexStmt *stmt, Oid indexId, bool skip_constraint_checks,
 	if (progress)
 		pgstat_progress_update_param(PROGRESS_CREATEIDX_ACCESS_METHOD_OID,
 	 iRel->rd_rel->relam);
+	/* event trigger tracking REINDEX primary pointer

Re: [HACKERS] make async slave to wait for lsn to be replayed

2023-11-26 Thread Alexander Korotkov

On Thu, Nov 23, 2023 at 5:52 AM Bowen Shi  wrote:
> I used the latest code and found some conflicts while applying. Which PG 
> version did you rebase?

I've successfully applied the patch on bc3c8db8ae.  But I've used
"patch -p1 < wait_proc_v6.patch", git am doesn't work.

--
Regards,
Alexander Korotkov

Re: [HACKERS] make async slave to wait for lsn to be replayed

2023-11-26 Thread Alexander Korotkov

On Mon, Nov 20, 2023 at 1:10 PM Картышов Иван
 wrote:
> Alexander, thank you for your review and pointing this issues. According to
> them I made some fixes and rebase all patch.
>
> But I can`t repeat your ERROR. Not with hot_standby_feedback = on nor
> hot_standby_feedback = off.
>
> master: create table test as (select i from generate_series(1,1) i);
> slave conn1: select pg_wal_replay_pause();
> master: delete from test;
> master: vacuum test;
> master: select pg_current_wal_lsn();
> slave conn2: select pg_wait_lsn('the value from previous query'::pg_lsn, 0);
> slave conn1: select pg_wal_replay_resume();
> slave conn2: ERROR: canceling statement due to conflict with recovery
> DETAIL: User query might have needed to see row versions that must be removed.
>
> Also I use little hack to work out of snapshot similar to SnapshotResetXmin.
>
> Patch rebased and ready for review.

I've retried my case with v6 and it doesn't fail anymore.  But I
wonder how safe it is to reset xmin within the user-visible function?
We have no guarantee that the function is not called inside the
complex query.  Then how will the rest of the query work with xmin
reset?  Separate utility statement still looks like more safe option
for me.

--
Regards,
Alexander Korotkov

Re: Add semi-join pushdown to postgres_fdw

2023-11-26 Thread Alexander Korotkov

Hi, Alexander!

On Tue, Oct 31, 2023 at 1:07 PM Alexander Pyhalov
 wrote:
> There are several cases when we can't push down semi-join in current
> patch.
>
> 1) When target list has attributes from inner relation, which are
> equivalent to some attributes of outer
> relation, we fail to notice this.
>
> 2) When we examine A join B and decide that we can't push it down, this
> decision is final - we state it in fdw_private of joinrel,
> and so if we consider joining these relations in another order, we don't
> reconsider.
> This means that if later examine B join A, we don't try to push it down.
> As semi-join can be executed as JOIN_UNIQUE_INNER or JOIN_UNIQUE_OUTER,
> this can be a problem - we look at some of these paths and remember that
> we can't push down such join.

Thank you for the revision.

I've revised the patch myself.  I've replaced StringInfo with
additional conds into a list of strings as I proposed before.  I think
the code became much clearer.  Also, it gets rid of some unnecessary
allocations.

I think the code itself is not in bad shape.  But patch lacks some
high-level description of semi-joins processing as well as comments on
each manipulation with additional conds.  Could you please add this?

--
Regards,
Alexander Korotkov

0001-postgres_fdw-add-support-for-deparsing-semi-joins-v6.patch
Description: Binary data

Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-11-26 Thread Alexander Korotkov

On Fri, Nov 24, 2023 at 3:54 PM Alexander Korotkov  wrote:
>
> On Thu, Nov 23, 2023 at 4:33 AM Richard Guo  wrote:
> >
> > On Sun, Nov 19, 2023 at 9:17 AM Alexander Korotkov  
> > wrote:
> >>
> >> It's here.  New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
> >> each modification.
> >
> >
> > +1 to the idea of introducing a reallocation mode to Bitmapset.
> >
> >>
> >> I had the feeling of falling into a rabbit hole while debugging all
> >> the cases of failure with this new option.  With the second patch
> >> regressions tests pass.
> >
> >
> > It seems to me that we have always had situations where we share the
> > same pointer to a Bitmapset structure across different places.  I do not
> > think this is a problem as long as we do not modify the Bitmapsets in a
> > way that requires reallocation or impact the locations sharing the same
> > pointer.
> >
> > So I'm wondering, instead of attempting to avoid sharing pointer to
> > Bitmapset in all locations that have problems, can we simply bms_copy
> > the original Bitmapset within replace_relid() before making any
> > modifications, as I proposed previously?  Of course, as Andres pointed
> > out, we need to do so also for the "Delete relid without substitution"
> > path.  Please see the attached.
>
>
> Yes, this makes sense.  Thank you for the patch.  My initial point was
> that replace_relid() should either do in-place in all cases or make a
> copy in all cases.  Now I see that it should make a copy in all cases.
> Note, that without making a copy in delete case, regression tests fail
> with REALLOCATE_BITMAPSETS on.
>
> Please, find the revised patchset.  As Ashutosh Bapat asked, asserts
> are split into separate patch.

Any objections to pushing this?

--
Regards,
Alexander Korotkov

Re: pg_upgrade and logical replication

2023-11-26 Thread Peter Smith

Here are some review comments for patch set v19*

//

v19-0001.

No comments

///

v19-0002.

(I saw that both changes below seemed cut/paste from similar
functions, but I will ask the questions anyway).

==
src/backend/commands/subscriptioncmds.c

1.
+/* Potentially set by pg_upgrade_support functions */
+Oid binary_upgrade_next_pg_subscription_oid = InvalidOid;
+

The comment "by pg_upgrade_support functions" seemed a bit vague. IMO
you might as well tell the name of the function that sets this.

SUGGESTION
Potentially set by the pg_upgrade_support function --
binary_upgrade_set_next_pg_subscription_oid().

~~~

2. CreateSubscription

+ if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("pg_subscription OID value not set when in binary upgrade mode")));

Doesn't this condition mean some kind of impossible internal error
occurred -- i.e. should this be elog instead of ereport?

==
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: GUC names in messages

2023-11-26 Thread Michael Paquier

On Mon, Nov 27, 2023 at 10:04:35AM +1100, Peter Smith wrote:
> On Fri, Nov 24, 2023 at 8:53 PM Alvaro Herrera  
> wrote:
>> Yeah.  Also, these could be changed to have the GUC name outside the
>> message proper, which would reduce the total number of messages.  (But
>> care must be given to the word "the" there.)
> 
>  I had posted something similar a few posts back [1], but it just
> caused more questions unrelated to GUC name quotes so I abandoned that
> temporarily.

Yes, I kind of agree to let that out of the picture for the moment.
It would be good to reduce the translation chunks.

> So for now, I hope this thread can be only about quotes on GUC names,
> otherwise, I thought it may become stuck debating dozens of individual
> messages. Certainly later, or in another thread, we can revisit all
> messages again to try to identify/extract any "common" ones.

-HINT:  Perhaps you need a different "datestyle" setting.
+HINT:  Perhaps you need a different DateStyle setting. 

Is the change for "datestyle" really required?  It does not betray the
GUC quoting policy added by 0001.

>> I think we could leave these improvements for a second round.  They
>> don't need to hold back the improvement we already have.
> 
> I tried something for this already but kept it in a separate patch. See 
> v2-0003

+   if (*p == '_')
+   underscore = true;

Is there a reason why we don't just use islower() or is that just to
get something entirely local independent?  I am not sure that it needs
to be that complicated.  We should just check that all the characters
are lower-case and apply quotes.
--
Michael


signature.asc
Description: PGP signature

Re: GUC names in messages

2023-11-26 Thread Tom Lane

Michael Paquier  writes:
> Is there a reason why we don't just use islower() or is that just to
> get something entirely local independent?

islower() and related functions are not to be trusted for this
purpose.  They will certainly give locale-dependent results,
and they might give entirely wrong ones if there's any inconsistency
between the database encoding and what libc thinks the locale is.

regards, tom lane

Re: GUC names in messages

2023-11-26 Thread Peter Smith

On Mon, Nov 27, 2023 at 12:44 PM Michael Paquier  wrote:
>
> On Mon, Nov 27, 2023 at 10:04:35AM +1100, Peter Smith wrote:
> > On Fri, Nov 24, 2023 at 8:53 PM Alvaro Herrera  
> > wrote:
> >> Yeah.  Also, these could be changed to have the GUC name outside the
> >> message proper, which would reduce the total number of messages.  (But
> >> care must be given to the word "the" there.)
> >
> >  I had posted something similar a few posts back [1], but it just
> > caused more questions unrelated to GUC name quotes so I abandoned that
> > temporarily.
>
> Yes, I kind of agree to let that out of the picture for the moment.
> It would be good to reduce the translation chunks.
>
> > So for now, I hope this thread can be only about quotes on GUC names,
> > otherwise, I thought it may become stuck debating dozens of individual
> > messages. Certainly later, or in another thread, we can revisit all
> > messages again to try to identify/extract any "common" ones.
>
> -HINT:  Perhaps you need a different "datestyle" setting.
> +HINT:  Perhaps you need a different DateStyle setting.
>
> Is the change for "datestyle" really required?  It does not betray the
> GUC quoting policy added by 0001.
>

TBH, I suspect something fishy about these mixed-case GUCs.

In the documentation and in the guc_tables.c they are all described in
MixedCase (e.g. "DateStyle" instead of "datestyle"), so I felt the
messages should use the same case the documentation, which is why I
changed all the ones you are referring to.

I know the code is doing a case-insensitive hashtable lookup but I
suspect some of the string passing still in the code for those
particular GUCs ought to be using the same mixed case string literal
as in the guc_tables.c. Currently, I have seen a few quirks where the
case is inconsistent with the MixedCase docs. It needs some more
investigation to understand the reason. For example,

2023-11-27 11:03:48.565 AEDT [15303] STATEMENT:  set intervalstyle=123;
ERROR:  invalid value for parameter "intervalstyle": "123"

versus

2023-11-27 11:13:56.018 AEDT [15303] STATEMENT:  set datestyle=123;
ERROR:  invalid value for parameter DateStyle: "123"

> >> I think we could leave these improvements for a second round.  They
> >> don't need to hold back the improvement we already have.
> >
> > I tried something for this already but kept it in a separate patch. See 
> > v2-0003
>
> +   if (*p == '_')
> +   underscore = true;
>
> Is there a reason why we don't just use islower() or is that just to
> get something entirely local independent?  I am not sure that it needs
> to be that complicated.  We should just check that all the characters
> are lower-case and apply quotes.

Thanks for the feedback. Probably I have overcomplicated it. I'll revisit it.

==
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: New instability in stats regression test

2023-11-26 Thread Michael Paquier

On Sat, Nov 25, 2023 at 02:34:40PM -0500, Tom Lane wrote:
> -- Test that reset_shared with no argument resets all the stats types
> -- supported (providing NULL as argument has the same effect).
> SELECT pg_stat_reset_shared();

Right, this has switched pg_stat_reset_shared() from doing nothing to
do a full reset.  Removing this reset switches the results of
io_stats_pre_reset from a 7-digit number to a 4-digit number, thanks
to all the previous I/O activity generated by all the tests.

> Nonetheless, it seems like a really bad idea that this test
> of I/O stats reset happens after the newly-added test.  It
> is clearly now dependent on timing and the amount of concurrent
> activity whether it will pass or not.  We should probably
> re-order the tests to do the old test first; or else abandon
> this test methodology and just test I/O reset the same way
> we test the other cases (checking only for timestamp advance).
> Or maybe we don't really need the pg_stat_reset_shared() test?

I was ready to argue that we'd better keep this test and keep it close
to the end of stats.sql while documenting why things are kept in this
order, but two resets done on the same shared stats type would still
be prone to race conditions without all the previous activity done in
the tests (like pg_stat_wal).

With all that in mind and because we have checks for the individual
targets with pg_stat_reset_shared(), I would agree to just remove it
entirely.  Say as of the attached?
--
Michael
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 7869c598f9..87d881c3f9 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -990,50 +990,6 @@ SELECT stats_reset > :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
 (1 row)
 
 SELECT stats_reset AS wal_reset_ts FROM pg_stat_wal \gset
--- Test that reset_shared with no argument resets all the stats types
--- supported (providing NULL as argument has the same effect).
-SELECT pg_stat_reset_shared();
- pg_stat_reset_shared 
---
- 
-(1 row)
-
-SELECT stats_reset > :'archiver_reset_ts'::timestamptz FROM pg_stat_archiver;
- ?column? 
---
- t
-(1 row)
-
-SELECT stats_reset > :'bgwriter_reset_ts'::timestamptz FROM pg_stat_bgwriter;
- ?column? 
---
- t
-(1 row)
-
-SELECT stats_reset > :'checkpointer_reset_ts'::timestamptz FROM pg_stat_checkpointer;
- ?column? 
---
- t
-(1 row)
-
-SELECT stats_reset > :'recovery_prefetch_reset_ts'::timestamptz FROM pg_stat_recovery_prefetch;
- ?column? 
---
- t
-(1 row)
-
-SELECT max(stats_reset) > :'slru_reset_ts'::timestamptz FROM pg_stat_slru;
- ?column? 
---
- t
-(1 row)
-
-SELECT stats_reset > :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
- ?column? 
---
- t
-(1 row)
-
 -- Test error case for reset_shared with unknown stats type
 SELECT pg_stat_reset_shared('unknown');
 ERROR:  unrecognized reset target: "unknown"
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 5b4e451a2e..d867fb406f 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -494,16 +494,6 @@ SELECT pg_stat_reset_shared('wal');
 SELECT stats_reset > :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
 SELECT stats_reset AS wal_reset_ts FROM pg_stat_wal \gset
 
--- Test that reset_shared with no argument resets all the stats types
--- supported (providing NULL as argument has the same effect).
-SELECT pg_stat_reset_shared();
-SELECT stats_reset > :'archiver_reset_ts'::timestamptz FROM pg_stat_archiver;
-SELECT stats_reset > :'bgwriter_reset_ts'::timestamptz FROM pg_stat_bgwriter;
-SELECT stats_reset > :'checkpointer_reset_ts'::timestamptz FROM pg_stat_checkpointer;
-SELECT stats_reset > :'recovery_prefetch_reset_ts'::timestamptz FROM pg_stat_recovery_prefetch;
-SELECT max(stats_reset) > :'slru_reset_ts'::timestamptz FROM pg_stat_slru;
-SELECT stats_reset > :'wal_reset_ts'::timestamptz FROM pg_stat_wal;
-
 -- Test error case for reset_shared with unknown stats type
 SELECT pg_stat_reset_shared('unknown');
 


signature.asc
Description: PGP signature

Re: initdb --no-locale=C doesn't work as specified when the environment is not C

2023-11-26 Thread Kyotaro Horiguchi

At Wed, 22 Nov 2023 11:04:01 -0500, Tom Lane  wrote in 
> Kyotaro Horiguchi  writes:
> > Commit 3e51b278db leaves lc_* conf lines as commented-out when
> > their value is "C". This leads to the following behavior.
> 
> Hmm ... I see a contributing factor here: this bit in
> postgresql.conf.sample is a lie:
> 
> #lc_messages = 'C'# locale for system error message
>   # strings
> 
> A look in guc_tables.c shows that the actual default is '' (empty
> string), which means "use the environment", and that matches how the
> variable is documented in config.sgml.  Somebody --- quite possibly me
> --- was misled by the contents of postgresql.conf.sample into thinking
> that the lc_xxx GUCs all default to C, when that's only true for the
> others.

It seems somewhat intentional that only lc_messages references the
environment at boot time. On the other hand, previously, in the
absence of a specified locale, initdb would embed the environmental
value in the configuration file, as it seems to be documented. Given
that initdb is always used for cluster creation, it's unlikey that
systems depend on this boot-time default for their operation.

> I think that a more correct fix for this would treat lc_messages
> differently from the other lc_xxx GUCs.  Maybe just eliminate the
> hack about not substituting "C" for that one?

For example, the --no-locale option for initdb is supposed to set all
categories to 'C'. That approach would lead to the postgres
referencing the runtime environment for all categories except
lc_messages, which I believe contradicts the documentation. In my
biew, if lc_messages is exempted from that hack, then all other
categories should be similarly excluded as in the second approach
among the attached in the previous mail.

> In any case, we need to fix this mistake in postgresql.conf.sample.

If you are not particularly concerned about the presence of quotation
marks, I think it would be fine to go with the second approach and
make the necessary modification to the configuration file accordingly.

With the attached patch, initdb --no-locale generates the following
lines in the configuration file.

> lc_messages = C   # locale for system error 
> message
>   # strings
> lc_monetary = C   # locale for monetary formatting
> lc_numeric = C# locale for number formatting
> lc_time = C   # locale for time formatting

By the way, the lines around lc_* in the sample file seem to have
somewhat inconsistent indentations. Wouldnt' it be preferable to fix
this? (The attached doesn't that.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/misc/postgresql.conf.sample 
b/src/backend/utils/misc/postgresql.conf.sample
index e48c066a5b..133dd3da7d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -726,7 +726,7 @@
# encoding

 # These settings are initialized by initdb, but they can be changed.
-#lc_messages = 'C' # locale for system error message
+#lc_messages = ''  # locale for system error message
# strings
 #lc_monetary = 'C' # locale for monetary formatting
 #lc_numeric = 'C'  # locale for number formatting
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 0c6f5ceb0a..646e8f29f6 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1226,25 +1226,17 @@ setup_config(void)
conflines = replace_guc_value(conflines, "shared_buffers",
  repltok, 
false);

-   /*
-* Hack: don't replace the LC_XXX GUCs when their value is 'C', because
-* replace_guc_value will decide not to quote that, which looks strange.
-*/
-   if (strcmp(lc_messages, "C") != 0)
-   conflines = replace_guc_value(conflines, "lc_messages",
- 
lc_messages, false);
+   conflines = replace_guc_value(conflines, "lc_messages",
+ lc_messages, 
false);

-   if (strcmp(lc_monetary, "C") != 0)
-   conflines = replace_guc_value(conflines, "lc_monetary",
- 
lc_monetary, false);
+   conflines = replace_guc_value(conflines, "lc_monetary",
+ lc_monetary, 
false);

-   if (strcmp(lc_numeric, "C") != 0)
-   conflines = replace_guc_value(conflines, "lc_numeric",
- 
lc_numeri

Re: [PATCH] fix race condition in libpq (related to ssl connections)

2023-11-26 Thread Michael Paquier

On Fri, Nov 24, 2023 at 04:48:58PM +0900, Michael Paquier wrote:
> I've looked at this idea, and finished by being unhappy with the error
> handling that we are currently assuming in my_SSL_set_fd() in the
> event of an error in the bio method setup, which would be most likely
> an OOM, so let's use ssl_config_mutex in my_BIO_s_socket().  Another
> thing is that I have minimized the manipulation of my_bio_methods in
> the setup routine.

I've spent more time on that today, and the patch I've posted on
Friday had a small mistake in the non-HAVE_BIO_METH_NEW path when
saving the BIO_METHODs causing the SSL tests to fail with older
OpenSSL versions.  I've fixed that and the patch was straight-forward,
so applied it down to v12.  I didn't use Willi's patch at the end,
still credited him as author as his original patch is rather close to
the result committed and it feels that he has spent a good deal of
time on this issue.
--
Michael


signature.asc
Description: PGP signature

Re: New instability in stats regression test

2023-11-26 Thread Tom Lane

Michael Paquier  writes:
> With all that in mind and because we have checks for the individual
> targets with pg_stat_reset_shared(), I would agree to just remove it
> entirely.  Say as of the attached?

I'm good with that answer --- I doubt that this test sequence is
proving anything that's worth the cycles it takes.  If it'd catch
oversights like failing to add new stats types to the "reset all"
code path, then I'd be for keeping it; but I don't see how the
test could notice that.

regards, tom lane

Re: [PATCH] Add CHECK_FOR_INTERRUPTS in scram_SaltedPassword loop.

2023-11-26 Thread Bowen Shi

> I think that we should backpatch that down to v16 at least where the
> GUC has been introduced as it's more like a nuisance if one sets the
> GUC to an incorrect value, and I'd like to apply the patch this way.

Agreed.

The patch has been submitted in https://commitfest.postgresql.org/46/4671/

Regards
Bowen Shi

Re: GUC names in messages[

2023-11-26 Thread Michael Paquier

On Mon, Nov 27, 2023 at 01:41:18PM +1100, Peter Smith wrote:
> TBH, I suspect something fishy about these mixed-case GUCs.
> 
> In the documentation and in the guc_tables.c they are all described in
> MixedCase (e.g. "DateStyle" instead of "datestyle"), so I felt the
> messages should use the same case the documentation, which is why I
> changed all the ones you are referring to.

FWIW, I've been tempted for a few years to propose that we should keep
the parsers as they behave now, but format the name of these
parameters in the code and the docs to just be lower-case all the
time.

>> Is there a reason why we don't just use islower() or is that just to
>> get something entirely local independent?  I am not sure that it needs
>> to be that complicated.  We should just check that all the characters
>> are lower-case and apply quotes.
> 
> Thanks for the feedback. Probably I have overcomplicated it. I'll revisit it.

The use of a static variable with a fixed size was itching me a bit as
well..  I was wondering if it would be cleaner to use %s%s%s in the
strings adding a note that these are GUC names that may be optionally
quoted, then hide what gets assigned in a macro with a result rather
similar to LSN_FORMAT_ARGS (GUC_FORMAT?).  The routine checking if
quotes should be applied would only need to return a boolean to tell
what to do, and could be hidden in the macro.
--
Michael


signature.asc
Description: PGP signature

Re: Postgres picks suboptimal index after building of an extended statistics

2023-11-26 Thread Andrei Lepikhov

Second version of the patch - resolve non-symmetrical decision, thanks 
to Teodor Sigaev's review.



--
regards,
Andrei Lepikhov
Postgres Professional
From 604899b6afe70eccbbdbf69ce254f37808c598db Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Mon, 27 Nov 2023 11:23:48 +0700
Subject: [PATCH] Choose an index path with the best selectivity estimation.

In the case when optimizer predicts only one row prefer choosing UNIQUE indexes
In other cases, if optimizer treats indexes as equal, make a last attempt
selecting the index with less selectivity - this decision takes away dependency
on the order of indexes in an index list (good for reproduction of some issues)
and proposes one more objective argument to choose specific index.
---
 src/backend/optimizer/util/pathnode.c | 42 ++
 .../expected/drop-index-concurrently-1.out| 16 ---
 src/test/regress/expected/functional_deps.out | 43 +++
 src/test/regress/expected/join.out| 40 +
 src/test/regress/sql/functional_deps.sql  | 36 
 5 files changed, 151 insertions(+), 26 deletions(-)

diff --git a/src/backend/optimizer/util/pathnode.c 
b/src/backend/optimizer/util/pathnode.c
index 0b1d17b9d3..4b5aedd579 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -454,6 +454,48 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
costcmp = compare_path_costs_fuzzily(new_path, old_path,

 STD_FUZZ_FACTOR);
 
+   /*
+* Apply some heuristics on index paths.
+*/
+   if (IsA(new_path, IndexPath) && IsA(old_path, IndexPath))
+   {
+   IndexPath *inp = (IndexPath *) new_path;
+   IndexPath *iop = (IndexPath *) old_path;
+
+   if (new_path->rows <= 1.0 && old_path->rows <= 1.0)
+   {
+   /*
+* When both paths are predicted to produce 
only one tuple,
+* the optimiser should prefer choosing a 
unique index scan
+* in all cases.
+*/
+   if (inp->indexinfo->unique && 
!iop->indexinfo->unique)
+   costcmp = COSTS_BETTER1;
+   else if (!inp->indexinfo->unique && 
iop->indexinfo->unique)
+   costcmp = COSTS_BETTER2;
+   else if (costcmp != COSTS_DIFFERENT)
+   /*
+* If the optimiser doesn't have an 
obviously stable choice
+* of unique index, increase the chance 
of avoiding mistakes
+* by choosing an index with smaller 
selectivity.
+* This option makes decision more 
conservative and looks
+* debatable.
+*/
+   costcmp = (inp->indexselectivity < 
iop->indexselectivity) ?
+   
COSTS_BETTER1 : COSTS_BETTER2;
+   }
+   else if (costcmp == COSTS_EQUAL)
+   /*
+* The optimizer can't differ the value of two 
index paths.
+* To avoid making a decision that is based on 
only an index
+* order in the list, use some rational 
strategy based on
+* selectivity: prefer touching fewer tuples on 
the disk to
+* filtering them after.
+*/
+   costcmp = (inp->indexselectivity < 
iop->indexselectivity) ?
+   
COSTS_BETTER1 : COSTS_BETTER2;
+   }
+
/*
 * If the two paths compare differently for startup and total 
cost,
 * then we want to keep both, and we can skip comparing 
pathkeys and
diff --git a/src/test/isolation/expected/drop-index-concurrently-1.out 
b/src/test/isolation/expected/drop-index-concurrently-1.out
index 1cb2250891..2392cdb033 100644
--- a/src/test/isolation/expected/drop-index-concurrently-1.out
+++ b/src/test/isolation/expected/drop-index-concurrently-1.out
@@ -12,13 +12,15 @@ step preps: PREPARE getrow_seqscan AS SELECT * FROM test_dc 
WHERE data = 34 ORDE
 step begin: BEGIN;
 step disableseq: SET enable_seqscan = false;
 step explaini: EXPLAIN (COSTS OFF)

Re: brininsert optimization opportunity

2023-11-26 Thread Richard Guo

On Sun, Nov 26, 2023 at 4:06 AM Tomas Vondra 
wrote:

> I've done a bit more cleanup on the last version of the patch (renamed
> the fields to start with bis_ as agreed, rephrased the comments / docs /
> commit message a bit) and pushed.

It seems that we have an oversight in this commit.  If there is no tuple
that has been inserted, we wouldn't have an available insert state in
the clean up phase.  So the Assert in brininsertcleanup() is not always
right.  For example:

regression=# update brin_summarize set value = brin_summarize.value;
server closed the connection unexpectedly

So I wonder if we should check 'bistate' and do the clean up only if
there is an available one, something like below.

--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -359,7 +359,9 @@ brininsertcleanup(IndexInfo *indexInfo)
 {
BrinInsertState *bistate = (BrinInsertState *) indexInfo->ii_AmCache;

-   Assert(bistate);
+   /* We don't have an available insert state, nothing to do */
+   if (!bistate)
+   return;

Thanks
Richard

Incorrect comment in tableam.h regarding GetHeapamTableAmRoutine()

2023-11-26 Thread Michael Paquier

Hi all,

I have noticed that GetHeapamTableAmRoutine() is listed as being a
member of tableamapi.c but it is a convenience routine located in
heapam_handler.c.  Shouldn't the header be fixed with something like
the attached?

Thoughts or comments?
--
Michael
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index dbb709b56c..9ab7201f4d 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -2095,6 +2095,12 @@ extern void table_block_relation_estimate_size(Relation rel,
  */
 
 extern const TableAmRoutine *GetTableAmRoutine(Oid amhandler);
+
+/* 
+ * Functions in heapam_handler.c
+ * 
+ */
+
 extern const TableAmRoutine *GetHeapamTableAmRoutine(void);
 
 #endif			/* TABLEAM_H */


signature.asc
Description: PGP signature

Re: brininsert optimization opportunity

2023-11-26 Thread Soumyadeep Chakraborty

On Sun, Nov 26, 2023 at 9:28 PM Richard Guo  wrote:
>
>
> On Sun, Nov 26, 2023 at 4:06 AM Tomas Vondra  
> wrote:
>>
>> I've done a bit more cleanup on the last version of the patch (renamed
>> the fields to start with bis_ as agreed, rephrased the comments / docs /
>> commit message a bit) and pushed.
>
>
> It seems that we have an oversight in this commit.  If there is no tuple
> that has been inserted, we wouldn't have an available insert state in
> the clean up phase.  So the Assert in brininsertcleanup() is not always
> right.  For example:
>
> regression=# update brin_summarize set value = brin_summarize.value;
> server closed the connection unexpectedly
>

I wasn't able to repro the issue on 86b64bafc19c4c60136a4038d2a8d1e6eecc59f2.
with UPDATE/INSERT:

postgres=# drop table a;
DROP TABLE
postgres=# create table a(i int);
CREATE TABLE
postgres=# create index on a using brin(i);
CREATE INDEX
postgres=# insert into a select 1 where 1!=1;
INSERT 0 0
postgres=# update a set i = 2 where i = 0;
UPDATE 0
postgres=# update a set i = a.i;
UPDATE 0

This could be because since c5b7ba4e67aeb5d6f824b74f94114d99ed6e42b7,
we have moved ExecOpenIndices()
from ExecInitModifyTable() to ExecInsert(). Since we never open the
indices if nothing is
inserted, we would never attempt to close them with ExecCloseIndices()
while the ii_AmCache
is NULL (which is what causes this assertion failure).

However, it is possible to repro the issue with:
# create empty file
touch /tmp/a.csv
postgres=# create table a(i int);
CREATE TABLE
postgres=# create index on a using brin(i);
CREATE INDEX
postgres=# copy a from '/tmp/a.csv';
TRAP: failed Assert("bistate"), File: "brin.c", Line: 362, PID: 995511

> So I wonder if we should check 'bistate' and do the clean up only if
there is an available one, something like below.

Yes, this is the right way to go. Thanks for reporting!

Regards,
Soumyadeep (VMware)

Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-11-26 Thread Ashutosh Bapat

On Mon, Nov 27, 2023 at 6:35 AM Alexander Korotkov  wrote:
>
> On Fri, Nov 24, 2023 at 3:54 PM Alexander Korotkov  
> wrote:
> >
> > On Thu, Nov 23, 2023 at 4:33 AM Richard Guo  wrote:
> > >
> > > On Sun, Nov 19, 2023 at 9:17 AM Alexander Korotkov  
> > > wrote:
> > >>
> > >> It's here.  New REALLOCATE_BITMAPSETS forces bitmapset reallocation on
> > >> each modification.
> > >
> > >
> > > +1 to the idea of introducing a reallocation mode to Bitmapset.
> > >
> > >>
> > >> I had the feeling of falling into a rabbit hole while debugging all
> > >> the cases of failure with this new option.  With the second patch
> > >> regressions tests pass.
> > >
> > >
> > > It seems to me that we have always had situations where we share the
> > > same pointer to a Bitmapset structure across different places.  I do not
> > > think this is a problem as long as we do not modify the Bitmapsets in a
> > > way that requires reallocation or impact the locations sharing the same
> > > pointer.
> > >
> > > So I'm wondering, instead of attempting to avoid sharing pointer to
> > > Bitmapset in all locations that have problems, can we simply bms_copy
> > > the original Bitmapset within replace_relid() before making any
> > > modifications, as I proposed previously?  Of course, as Andres pointed
> > > out, we need to do so also for the "Delete relid without substitution"
> > > path.  Please see the attached.
> >
> >
> > Yes, this makes sense.  Thank you for the patch.  My initial point was
> > that replace_relid() should either do in-place in all cases or make a
> > copy in all cases.  Now I see that it should make a copy in all cases.
> > Note, that without making a copy in delete case, regression tests fail
> > with REALLOCATE_BITMAPSETS on.
> >
> > Please, find the revised patchset.  As Ashutosh Bapat asked, asserts
> > are split into separate patch.
>
> Any objections to pushing this?
>

Did we at least measure the memory impact?

How do we ensure that we are not making unnecessary copies of Bitmapsets?

-- 
Best Wishes,
Ashutosh Bapat

Re: logical decoding and replication of sequences, take 2

2023-11-26 Thread Amit Kapila

On Mon, Nov 27, 2023 at 6:41 AM Tomas Vondra
 wrote:
>
> I've been cleaning up the first two patches to get them committed soon
> (adding the decoding infrastructure + test_decoding), cleaning up stale
> comments, updating commit messages etc. And I think it's ready to go,
> but it's too late over, so I plan going over once more tomorrow and then
> likely push. But if someone wants to take a look, I'd welcome that.
>
> The one issue I found during this cleanup is that the patch was missing
> the changes introduced by 29d0a77fa660 for decoding of other stuff.
>
>   commit 29d0a77fa6606f9c01ba17311fc452dabd3f793d
>   Author: Amit Kapila 
>   Date:   Thu Oct 26 06:54:16 2023 +0530
>
>   Migrate logical slots to the new node during an upgrade.
>   ...
>
> I fixed that, but perhaps someone might want to double check ...
>
>
> 0003 is here just for completeness - that's the part adding sequences to
> built-in replication. I haven't done much with it, it needs some cleanup
> too to get it committable. I don't intend to push that right after
> 0001+0002, though.
>
>
> While going over 0001, I realized there might be an optimization for
> ReorderBufferSequenceIsTransactional. As coded in 0001, it always
> searches through all top-level transactions, and if there's many of them
> that might be expensive, even if very few of them have any relfilenodes
> in the hash table. It's still linear search, and it needs to happen for
> each sequence change.
>
> But can the relfilenode even be in some other top-level transaction? How
> could it be - our transaction would not see it, and wouldn't be able to
> generate the sequence change. So we should be able to simply check *our*
> transaction (or if it's a subxact, the top-level transaction). Either
> it's there (and it's transactional change), or not (and then it's
> non-transactional change).
>

I also think the relfilenode should be part of either the current
top-level xact or one of its subxact, so looking at all the top-level
transactions for each change doesn't seem advisable.

> The 0004 does this.
>
> This of course hinges on when exactly the transactions get created, and
> assignments processed. For example if this would fire before the txn
> gets assigned to the top-level one, this would break. I don't think this
> can happen thanks to the immediate logging of assignments, but I'm too
> tired to think about it now.
>

This needs some thought because I think we can't guarantee the
association till we reach the point where we can actually decode the
xact. See comments in AssertTXNLsnOrder() [1].

I noticed few minor comments while reading the patch:
1.
+ * turned on here because the non-transactional logical message is
+ * decoded without waiting for these records.

Instead of '.. logical message', shouldn't we say sequence change message?

2.
+ /*
+ * If we found an entry with matchine relfilenode,

typo (matchine)

3.
+  Note that this may not the value obtained by the process updating the
+  process, but the future sequence value written to WAL (typically about
+  32 values ahead).

/may not the value/may not be the value

[1] -
/*
* Skip the verification if we don't reach the LSN at which we start
* decoding the contents of transactions yet because until we reach the
* LSN, we could have transactions that don't have the association between
* the top-level transaction and subtransaction yet and consequently have
* the same LSN.  We don't guarantee this association until we try to
* decode the actual contents of transaction.

-- 
With Regards,
Amit Kapila.

Re: Incorrect comment in tableam.h regarding GetHeapamTableAmRoutine()

2023-11-26 Thread Richard Guo

On Mon, Nov 27, 2023 at 1:50 PM Michael Paquier  wrote:

> I have noticed that GetHeapamTableAmRoutine() is listed as being a
> member of tableamapi.c but it is a convenience routine located in
> heapam_handler.c.  Shouldn't the header be fixed with something like
> the attached?


+1. Nice catch.

Thanks
Richard

Re: GUC names in messages

2023-11-26 Thread Laurenz Albe

On Mon, 2023-11-27 at 13:41 +1100, Peter Smith wrote:
> TBH, I suspect something fishy about these mixed-case GUCs.
> 
> In the documentation and in the guc_tables.c they are all described in
> MixedCase (e.g. "DateStyle" instead of "datestyle"), so I felt the
> messages should use the same case the documentation, which is why I
> changed all the ones you are referring to.

I agree with that decision; we should use mixed case for these parameters.

Otherwise we might get complaints that the following query does not return
any results:

  SELECT * FROM pg_settings WHERE name = 'timezone';

Yours,
Laurenz Albe

Re: GUC names in messages

2023-11-26 Thread Tom Lane

Laurenz Albe  writes:
> On Mon, 2023-11-27 at 13:41 +1100, Peter Smith wrote:
>> In the documentation and in the guc_tables.c they are all described in
>> MixedCase (e.g. "DateStyle" instead of "datestyle"), so I felt the
>> messages should use the same case the documentation, which is why I
>> changed all the ones you are referring to.

> I agree with that decision; we should use mixed case for these parameters.
> Otherwise we might get complaints that the following query does not return
> any results:
>   SELECT * FROM pg_settings WHERE name = 'timezone';

Yeah.  Like Michael upthread, I've wondered occasionally about changing
these names to all-lower-case.  It'd surely be nicer if we'd done it
like that to begin with.  But I can't convince myself that the ensuing
user pain would be justified.

regards, tom lane

Re: pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15

2023-11-26 Thread Amit Kapila

On Fri, Nov 24, 2023 at 7:23 PM Nikhil Benesch  wrote:
>
> Thank you both for reviewing. The updated patch set LGTM.
>

Pushed!

-- 
With Regards,
Amit Kapila.

Re: Adding facility for injection points (or probe points?) for more advanced tests

2023-11-26 Thread Ashutosh Bapat

On Fri, Nov 24, 2023 at 7:37 PM Michael Paquier  wrote:
>
> On Fri, Nov 24, 2023 at 04:37:58PM +0530, Ashutosh Bapat wrote:
> > Interesting idea. For that the callback needs to know the injection
> > point name. At least we should pass that to the callback. It's trivial
> > thing to do.
>
> This is what's done from the beginning, as well as of 0001 in the v5
> series:
> +INJECTION_POINT(name);
> [...]
> +   injection_callback(name);

In my first look I missed the actual call to the injection callback in
InjectionPointRun()
injection_callback(name);

Sorry for that.

The way I see it is that an extension using this functionality will
create an auxiliary lookup table keyed by the injection point name to
obtain the injection point specific arguments (sleep time, count etc.)
in the shared memory or local memory. Every time an injection callback
is called it will consult this look up table to get the arguments.
That looks ok to me. There might be other ways to achieve the same
effect. We will learn and absorb whatever benefits core and the users.
I like that.

>
> > That might work, but in order to run tests in that directory one has
> > to also install the extension. Do we have precedence for such kind of
> > dependency?
>
> Yes, please see EXTRA_INSTALL in some of the Makefiles.  This can
> install stuff from paths different than the location where the tests
> are run.

WFM then.

>
> >> and that there are no string objections, so feel free
> >> to comment.
> >
> > Let's get some more opinions on the design. I will review the detailed
> > code then.
>
> Sure.  Thanks.
>
> >> I don't want to propose 0003 in the tree, just an improved version of
> >> 0004 for the test coverage (still need to improve that).
> >
> > Are you working on v6 already?
>
> No, what would be the point at this stage?  I dont have much more to
> add to 0001 and 0002 at the moment, which focus on the core of the
> problem.

Since you wroten "(still need to improve ...) I thought you are
working on v6. No problem. Sorry for the confusion.

-- 
Best Wishes,
Ashutosh Bapat

Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-11-26 Thread Masahiko Sawada

On Sat, Oct 28, 2023 at 5:56 PM John Naylor  wrote:
>
> I wrote:
>
> > Seems fine at a glance, thanks. I will build on this to implement 
> > variable-length values. I have already finished one prerequisite which is: 
> > public APIs passing pointers to values.
>
> Since my publishing schedule has not kept up, I'm just going to share
> something similar to what I mentioned earlier, just to get things
> moving again.

Thanks for sharing the updates. I've returned to work today and will
resume working on this feature.

>
> 0001-0009 are from earlier versions, except for 0007 which makes a
> bunch of superficial naming updates, similar to those done in a recent
> other version. Somewhere along the way I fixed long-standing git
> whitespace warnings, but I don't remember if that's new here. In any
> case, let's try to preserve that.
>
> 0010 is some minor refactoring to reduce duplication
>
> 0011-0014 add public functions that give the caller more control over
> the input and responsibility for locking. They are not named well, but
> I plan these to be temporary: They are currently used for the tidstore
> only, since that has much simpler tests than the standard radix tree
> tests. One thing to note: since the tidstore has always done it's own
> locking within a larger structure, these patches don't bother to do
> locking at the radix tree level. Locking twice seems...not great.
> These patches are the main prerequisite for variable-length values.
> Once that is working well, we can switch the standard tests to the new
> APIs.

Since the variable-length values support is a big deal and would be
related to API design I'd like to discuss the API design first.
Currently, we have the following APIs:

---
RT_VALUE_TYPE
RT_GET(RT_RADIX_TREE *tree, uint64 key, bool *found);
or for variable-length value support,
RT_GET(RT_RADIX_TREE *tree, uint64 key, size_t sz, bool *found);

If an entry already exists, return its pointer and set "found" to
true. Otherwize, insert an empty value with sz bytes, return its
pointer, and set "found" to false.

---
RT_VALUE_TYPE
RT_FIND(RT_RADIX_TREE *tree, uint64 key);

If an entry exists, return the pointer to the value, otherwise return NULL.

(I omitted RT_SEARCH() as it's essentially the same as RT_FIND() and
will probably get removed.)

---
bool
RT_SET(RT_RADIX_TREE *tree, uint64 key, RT_VALUE_TYPE *value_p);
or for variable-length value support,
RT_SET(RT_RADIX_TREE *tree, uint64 key, RT_VALUE_TYPE *value_p, size_t sz);

If an entry already exists, update its value to 'value_p' and return
true. Otherwise set the value and return false.

Given variable-length value support, RT_GET() would have to do
repalloc() if the existing value size is not big enough for the new
value, but it cannot as the radix tree doesn't know the size of each
stored value. Another idea is that the radix tree returns the pointer
to the slot and the caller updates the value accordingly. But it means
that the caller has to update the slot properly while considering the
value size (embedded vs. single-leave value), which seems not a good
idea.

To deal with this problem, I think we can somewhat change RT_GET() API
as follow:

RT_VALUE_TYPE
RT_INSERT(RT_RADIX_TREE *tree, uint64 key, size_t sz, bool *found);

If the entry already exists, replace the value with a new empty value
with sz bytes and set "found" to true. Otherwise, insert an empty
value, return its pointer, and set "found" to false.

We probably will find a better name but I use RT_INSERT() for
discussion. RT_INSERT() returns an empty slot regardless of existing
values. It can be used to insert a new value or to replace the value
with a larger value.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2023-11-26 Thread Alexander Lakhin


Hello Alexander,

27.11.2023 02:43, Alexander Korotkov wrote:


v61 looks good to me.  I'm going to push it as long as there are no objections.



I've looked at the patch set and found a typo:
occured

And a warning:
$ CC=gcc-12 CFLAGS="-Wall -Wextra -Wno-unused-parameter -Wno-sign-compare -Wno-clobbered 
-Wno-missing-field-initializers" ./configure -q && make -s

slru.c:63:1: warning: ‘inline’ is not at beginning of declaration 
[-Wold-style-declaration]
   63 | static int  inline
  | ^~

Maybe it's worth fixing before committing...

Best regards,
Alexander

RE: Partial aggregates pushdown

2023-11-26 Thread fujii.y...@df.mitsubishielectric.co.jp

Hi Mr.Momjian, Mr.Haas, hackers.

> From: Bruce Momjian 
> Sent: Thursday, November 23, 2023 6:16 AM
> On Wed, Nov 22, 2023 at 10:16:16AM +, 
> fujii.y...@df.mitsubishielectric.co.jp wrote:
> > 2. Approach 2
> > (1) Advantages
> > (a) No need to add partial aggregate functions to the catalogs for
> > each aggregation
> > (2) Disadvantages
> > (a) Need to add non-standard keywords to the SQL syntax.
> >
> > I did not choose Approach2 because I was not confident that the
> > disadvantage mentioned in 2.(2)(a) would be accepted by the PostgreSQL 
> > development community.
> > If it is accepted, I think Approach 2 is smarter.
> > Could you please provide your opinion on which approach is preferable
> > after comparing these two approaches?
> 
> I didn't know #2 was possible, but given the great number of catalog entries, 
> doing it in the SQL grammar seems cleaner
> to me.
Thank you for comments. Yes, I understand.

> From: Bruce Momjian 
> Sent: Wednesday, November 22, 2023 5:34 AM
> On Tue, Nov 21, 2023 at 12:16:41PM -0500, Robert Haas wrote:
> > On Mon, Nov 20, 2023 at 5:48 PM Bruce Momjian  wrote:
> > > > I do have a concern about this, though. It adds a lot of bloat. It
> > > > adds a whole lot of additional entries to pg_aggregate, and every
> > > > new aggregate we add in the future will require a bonus entry for
> > > > this, and it needs a bunch of new pg_proc entries as well. One
> > > > idea that I've had in the past is to instead introduce syntax that
> > > > just does this, without requiring a separate aggregate definition in 
> > > > each case.
> > > > For example, maybe instead of changing string_agg(whatever) to
> > > > string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
> > > > string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
> > > > something. Then all aggregates could be treated in a generic way.
> > > > I'm not completely sure that's better, but I think it's worth 
> > > > considering.
> > >
> > > So use an SQL keyword to indicates a pushdown call?  We could then
> > > automate the behavior rather than requiring special catalog functions?
> >
> > Right. It would require more infrastructure in the parser, planner,
> > and executor, but it would be infinitely reusable instead of needing a
> > new thing for every aggregate. I think that might be better, but to be
> > honest I'm not totally sure.
> 
> It would make it automatic.  I guess we need to look at how big the patch is 
> to do it.
I will investigate specifically which parts of the PostgreSQL source code need 
to be modified and how big the patch will be if you take this approach.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Improve rowcount estimate for UNNEST(column)

2023-11-26 Thread jian he

Hi.
Since both array_op_test, arrest both are not dropped at the end of
src/test/regress/sql/arrays.sql.
I found using table array_op_test test more convincing.

select
reltuples * 10 as original,
reltuples * (select
floor(elem_count_histogram[array_length(elem_count_histogram,1)])
frompg_stats
where   tablename = 'array_op_test' and attname = 'i')
as with_patch
,(select (elem_count_histogram[array_length(elem_count_histogram,1)])
frompg_stats
where   tablename = 'array_op_test' and attname = 'i')
as elem_count_histogram_last_element
from pg_class where relname = 'array_op_test';
 original | with_patch | elem_count_histogram_last_element
--++---
 1030 |412 | 4.7843137
(1 row)

without patch:
explain select unnest(i)  from array_op_test;
  QUERY PLAN
--
 ProjectSet  (cost=0.00..9.95 rows=1030 width=4)
   ->  Seq Scan on array_op_test  (cost=0.00..4.03 rows=103 width=40)
(2 rows)

with patch:
 explain select unnest(i)  from array_op_test;
  QUERY PLAN
--
 ProjectSet  (cost=0.00..6.86 rows=412 width=4)
   ->  Seq Scan on array_op_test  (cost=0.00..4.03 rows=103 width=40)
(2 rows)

because, in the estimate_array_length function, `nelem =
sslot.numbers[sslot.nnumbers - 1];` will round  4.7843137 to 4.
so with patch estimated row 412 = 103 *4. without patch estimated rows
= 103 * 10.

Re: Catalog domain not-null constraints

2023-11-26 Thread Peter Eisentraut


On 23.11.23 17:38, Alvaro Herrera wrote:

If you create a table with column of domain that has a NOT NULL
constraint, what happens?  I mean, is the table column marked
attnotnull, and how does it behave?


No, the domain does not affect the catalog entry for the column.  This 
is the same way it behaves now.



Is there a separate pg_constraint
row for the constraint in the table?  What happens if you do
ALTER TABLE ... DROP NOT NULL for that column?


Those are separate.  After dropping the NOT NULL for a column, null 
values for the column could still be rejected by a domain.  (This is the 
same way CHECK constraints work.)

Re: [PATCH] pg_convert improvement

2023-11-26 Thread Drouvot, Bertrand


Hi,

On 11/24/23 3:32 PM, Yurii Rashkovskii wrote:

Hi Bertrand,

On Fri, Nov 24, 2023 at 6:26 AM Drouvot, Bertrand mailto:bertranddrouvot...@gmail.com>> wrote:


The patch is pretty straightforward, I just have one remark:

+       /* if no actual conversion happened, return the original string */
+       /* (we are checking pointers to strings instead of encodings because
+          `pg_do_encoding_conversion` above covers more cases than just
+          encoding equality) */

I think this could be done in one single comment and follow the preferred 
style
for multi-line comment, see [1].


Thank you for your feedback. I've attached a revised patch.


Did some minor changes in the attached:

- Started the multi-line comment with an upper case and finished
it with a "." and re-worded a bit.
- Ran pgindent

What do you think about the attached?

Also, might be good to create a CF entry [1] so that the patch proposal does 
not get lost
and gets visibility.

[1]: https://commitfest.postgresql.org/46/

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.comFrom 5eb1f5d9323b8d270411bf156ec065fe7c69b7b8 Mon Sep 17 00:00:00 2001
From: Yurii Rashkovskii 
Date: Fri, 24 Nov 2023 06:32:05 -0800
Subject: [PATCH v3] Don't copy bytea/text in convert_from/to unless converted.

`pg_convert` allocates a new bytea whether actual conversion occurred or
not, followed by copying the result which (by definition) is the same as
the original value.

In the case where conversion has not occurred, it will now simply return
the original value, this avoiding unnecessary allocation and copying.
---
 src/backend/utils/mb/mbutils.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)
 100.0% src/backend/utils/mb/

diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 67a1ab2ab2..7666d166a5 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -585,19 +585,26 @@ pg_convert(PG_FUNCTION_ARGS)

  src_encoding,

  dest_encoding);
 
-   /* update len if conversion actually happened */
-   if (dest_str != src_str)
-   len = strlen(dest_str);
 
/*
-* build bytea data type structure.
+* If no actual conversion happened, return the original string (we are
+* checking pointers to strings instead of encodings because
+* pg_do_encoding_conversion() above covers more cases than just 
encoding
+* equality).
 */
+   if (dest_str == src_str)
+   {
+   PG_RETURN_BYTEA_P(string);
+   }
+
+   /* if conversion happened, build a new bytea of a correct length */
+   len = strlen(dest_str);
retval = (bytea *) palloc(len + VARHDRSZ);
SET_VARSIZE(retval, len + VARHDRSZ);
memcpy(VARDATA(retval), dest_str, len);
 
-   if (dest_str != src_str)
-   pfree(dest_str);
+   /* free the original result of conversion */
+   pfree(dest_str);
 
/* free memory if allocated by the toaster */
PG_FREE_IF_COPY(string, 0);
-- 
2.34.1

Re: brininsert optimization opportunity

2023-11-26 Thread Richard Guo

On Mon, Nov 27, 2023 at 1:53 PM Soumyadeep Chakraborty <
soumyadeep2...@gmail.com> wrote:

> On Sun, Nov 26, 2023 at 9:28 PM Richard Guo 
> wrote:
> > It seems that we have an oversight in this commit.  If there is no tuple
> > that has been inserted, we wouldn't have an available insert state in
> > the clean up phase.  So the Assert in brininsertcleanup() is not always
> > right.  For example:
> >
> > regression=# update brin_summarize set value = brin_summarize.value;
> > server closed the connection unexpectedly
>
> I wasn't able to repro the issue on
> 86b64bafc19c4c60136a4038d2a8d1e6eecc59f2.
> with UPDATE/INSERT:
>
> This could be because since c5b7ba4e67aeb5d6f824b74f94114d99ed6e42b7,
> we have moved ExecOpenIndices()
> from ExecInitModifyTable() to ExecInsert(). Since we never open the
> indices if nothing is
> inserted, we would never attempt to close them with ExecCloseIndices()
> while the ii_AmCache
> is NULL (which is what causes this assertion failure).


AFAICS we would also open the indices from ExecUpdate().  So if we
update the table in a way that no new tuples are inserted, we will have
this issue.  As I showed previously, the query below crashes for me on
latest master (dc9f8a7983).

regression=# update brin_summarize set value = brin_summarize.value;
server closed the connection unexpectedly

There are other code paths that call ExecOpenIndices(), such as
ExecMerge().  I believe it's not hard to create queries that trigger this
Assert for those cases.

Thanks
Richard

51 matches

Mail list logo