Re: [PATCH]Feature improvement for MERGE tab completion

2022-09-17 Thread Fujii Masao



On 2022/09/16 11:46, bt22kawamotok wrote:

Thanks for updating.

+    COMPLETE_WITH("UPDATE", "DELETE", "DO NOTHING");

"UPDATE" is always followed by "SET",  so why not complement it with
"UPDATE SET"?


Thanks for reviewing.
That's a good idea!
I create new patch v7.


Thanks for updating the patch!

I applied the following changes to the patch. Attached is the updated version 
of the patch.

The tab-completion code for MERGE was added in the middle of that for LOCK 
TABLE.
This would be an oversight of the commit that originally supported 
tab-completion
for MERGE. I fixed this issue.

+   else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny) ||
+TailMatches("MERGE", "INTO", MatchAny, 
MatchAnyExcept("AS")))
COMPLETE_WITH("USING");

This can cause to complete "MERGE INTO  USING" with "USING" unexpectedly.
I fixed this issue by replacing MatchAnyExcept("AS") with 
MatchAnyExcept("USING|AS").

I added some comments.

"MERGE" was tab-completed with just after "EXPLAIN" or "EXPLAIN ANALYZE", etc.
Since "INTO" always follows "MERGE", it's better to complete with "MERGE INTO"
there. I replaced "MERGE" with "MERGE INTO" in those tab-completions.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONdiff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index f3465adb85..23f2a87991 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1669,7 +1669,7 @@ psql_completion(const char *text, int start, int end)
"COMMENT", "COMMIT", "COPY", "CREATE", "DEALLOCATE", "DECLARE",
"DELETE FROM", "DISCARD", "DO", "DROP", "END", "EXECUTE", 
"EXPLAIN",
"FETCH", "GRANT", "IMPORT FOREIGN SCHEMA", "INSERT INTO", 
"LISTEN", "LOAD", "LOCK",
-   "MERGE", "MOVE", "NOTIFY", "PREPARE",
+   "MERGE INTO", "MOVE", "NOTIFY", "PREPARE",
"REASSIGN", "REFRESH MATERIALIZED VIEW", "REINDEX", "RELEASE",
"RESET", "REVOKE", "ROLLBACK",
"SAVEPOINT", "SECURITY LABEL", "SELECT", "SET", "SHOW", "START",
@@ -3641,7 +3641,7 @@ psql_completion(const char *text, int start, int end)
  */
else if (Matches("EXPLAIN"))
COMPLETE_WITH("SELECT", "INSERT INTO", "DELETE FROM", "UPDATE", 
"DECLARE",
- "MERGE", "EXECUTE", "ANALYZE", 
"VERBOSE");
+ "MERGE INTO", "EXECUTE", "ANALYZE", 
"VERBOSE");
else if (HeadMatches("EXPLAIN", "(*") &&
 !HeadMatches("EXPLAIN", "(*)"))
{
@@ -3660,12 +3660,12 @@ psql_completion(const char *text, int start, int end)
}
else if (Matches("EXPLAIN", "ANALYZE"))
COMPLETE_WITH("SELECT", "INSERT INTO", "DELETE FROM", "UPDATE", 
"DECLARE",
- "MERGE", "EXECUTE", "VERBOSE");
+ "MERGE INTO", "EXECUTE", "VERBOSE");
else if (Matches("EXPLAIN", "(*)") ||
 Matches("EXPLAIN", "VERBOSE") ||
 Matches("EXPLAIN", "ANALYZE", "VERBOSE"))
COMPLETE_WITH("SELECT", "INSERT INTO", "DELETE FROM", "UPDATE", 
"DECLARE",
- "MERGE", "EXECUTE");
+ "MERGE INTO", "EXECUTE");
 
 /* FETCH && MOVE */
 
@@ -4065,58 +4065,90 @@ psql_completion(const char *text, int start, int end)
else if (HeadMatches("LOCK") && TailMatches("IN", "SHARE"))
COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE",
  "UPDATE EXCLUSIVE MODE");
+
+   /* Complete LOCK [TABLE] [ONLY]  [IN lockmode MODE] with 
"NOWAIT" */
+   else if (HeadMatches("LOCK") && TailMatches("MODE"))
+   COMPLETE_WITH("NOWAIT");
+
 /* MERGE --- can be inside EXPLAIN */
else if (TailMatches("MERGE"))
COMPLETE_WITH("INTO");
else if (TailMatches("MERGE", "INTO"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_mergetargets);
+
+   /* Complete MERGE INTO  [[AS] ] with USING */
else if (TailMatches("MERGE", "INTO", MatchAny))
COMPLETE_WITH("USING", "AS");
-   else if (TailMatches("MERGE", "INTO", MatchAny, "USING"))
-   COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
-   /* with [AS] alias */
-   else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny))
+   else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny) ||
+TailMatches("MERGE", "INTO", MatchAny, 
MatchAnyExcept("USING|AS")))
COMPLETE_WITH("USING");
-   else if (TailMatches("MERGE", "INTO", MatchAny, MatchAny))
-   COMPLETE_WITH("USING");
-   else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, 
"USING"))
-   

Re: [EXTERNAL] Re: Support load balancing in libpq

2022-09-17 Thread Michael Banck
Hi,

On Mon, Sep 12, 2022 at 02:16:56PM +, Jelte Fennema wrote:
> Attached is an updated patch with the following changes:
> 1. rebased (including solved merge conflict)
> 2. fixed failing tests in CI
> 3. changed the commit message a little bit
> 4. addressed the two remarks from Micheal
> 5. changed the prng_state from a global to a connection level value for 
> thread-safety
> 6. use pg_prng_uint64_range

Thanks!
 
I tested this some more, and found it somewhat surprising that at least
when looking at it on a microscopic level, some hosts are chosen more
often than the others for a while.

I basically ran 

while true; do psql -At "host=pg1,pg2,pg3 load_balance_hosts=1" -c
"SELECT inet_server_addr()"; sleep 1; done

and the initial output was:

10.0.3.109
10.0.3.109
10.0.3.240
10.0.3.109
10.0.3.109
10.0.3.240
10.0.3.109
10.0.3.240
10.0.3.240
10.0.3.240
10.0.3.240
10.0.3.109
10.0.3.240
10.0.3.109
10.0.3.109
10.0.3.240
10.0.3.240
10.0.3.109
10.0.3.60

I.e. the second host (pg2/10.0.3.60) was only hit after 19 iterations.

Once significantly more than a hundred iterations are run, the hosts
somewhat even out, but it is maybe suprising to users:

   50   100   250   500   1000   1
10.0.3.60   92477   1653283317
10.0.3.109 254288   1783533372
10.0.3.240 163485   1573193311

Or maybe my test setup is skewed? When I choose a two seconds timeout
between psql calls, I get a more even distribution initially, but it
then diverges after 100 iterations:

   50   100   250500   1000
10.0.3.60  193698199374 
10.0.3.109 133380150285 
10.0.3.240 183172151341 

Could just be bad luck...

I also switch one host to have two IP addresses in /etc/hosts:

10.0.3.109 pg1
10.0.3.60  pg1
10.0.3.240 pg3

And this resulted in this (one second timeout again):

First run:

   50   100   250500   1000
10.0.3.60  101856120255
10.0.3.109 143067139278
10.0.3.240 2652   127241467

Second run:

   50   100   250500   1000
10.0.3.60  203177138265 
10.0.3.109  92052116245 
10.0.3.240 2149   121246490 

So it looks like it load-balances between pg1 and pg3, and not between
the three IPs -  is this expected?

If I switch from "host=pg1,pg3" to "host=pg1,pg1,pg3", each IP adress is
hit roughly equally.

So I guess this is how it should work, but in that case I think the
documentation should be more explicit about what is to be expected if a
host has multiple IP addresses or hosts are specified multiple times in
the connection string.

> > Maybe my imagination is not so great, but what else than hosts could we
> > possibly load-balance? I don't mind calling it load_balance, but I also
> > don't feel very strongly one way or the other and this is clearly
> > bikeshed territory.
> 
> I agree, which is why I called it load_balance in my original patch.
> But I also think it's useful to match the naming for the already
> existing implementations in the PG ecosystem around this. 
> But like you I don't really feel strongly either way. It's a tradeoff
> between short name and consistency in the ecosystem.

I don't think consistency is an extremely valid concern. As a
counterpoint, pgJDBC had targetServerType some time before Postgres, and
the libpq parameter was then named somewhat differently when it was
introduced, namely target_session_attrs.

> > If I understand correctly, you've added DNS-based load balancing on top
> > of just shuffling the provided hostnames.  This makes sense if a
> > hostname is backed by more than one IP address in the context of load
> > balancing, but it also complicates the patch. So I'm wondering how much
> > shorter the patch would be if you leave that out for now?
> 
> Yes, that's correct and indeed the patch would be simpler without, i.e. all 
> the
> addrinfo changes would become unnecessary. But IMHO the behaviour of 
> the added option would be very unexpected if it didn't load balance across
> multiple IPs in a DNS record. libpq currently makes no real distinction in 
> handling of provided hosts and handling of their resolved IPs. If load 
> balancing
> would only apply to the host list that would start making a distinction
> between the two.

Fair enough, I agree.
 
> Apart from that the load balancing across IPs is one of the main reasons
> for my interest in this patch. The reason is that it allows expanding or 
> reducing
> the number of nodes that are being load balanced across transparently to the
> application. Which means that there's no need to re-deploy applications with 
> new connection strings when changing the number hosts.

That's a good point as well.
 
> > On the other hand, I believe pgJDBC keeps track of which hosts are up or
> > down and only load balances among the ones 

Re: [EXTERNAL] Re: Support load balancing in libpq

2022-09-17 Thread Michael Banck
Hi,

On Wed, Sep 14, 2022 at 05:53:48PM +0300, Maxim Orlov wrote:
> > Also, IMO, the solution must have a fallback mechanism if the
> > standby/chosen host isn't reachable.
> 
> Yeah, I think it should. I'm not insisting on a particular name of options
> here, but in my view, the overall idea may be next:
> - we have two libpq's options: "load_balance_hosts" and "failover_timeout";
> - the "load_balance_hosts" should be "sequential" or "random";
> - the "failover_timeout" is a time period, within which, if connection to
> the server is not established, we switch to the next address or host.

Isn't this exactly what connect_timeout is providing? In my tests, it
worked exactly as I would expect it, i.e. after connect_timeout seconds,
libpq was re-shuffling and going for another host.

If you specify only one host (or all are down), you get an error.

In any case, I am not sure what failover has to do with it if we are
talking about initiating connections - usually failover is for already
established connections that suddendly go away for one reason or
another.

Or maybe I'm just not understanding where you're getting at?

> While writing this text, I start thinking that load balancing is a
> combination of two parameters above.

So I guess what you are saying is that if load_balance_hosts is set,
not setting connect_timeout would be a hazard, cause it would stall the
connection attempt even though other hosts would be available.

That's right, but I guess it's already a hazard if you put multiple
hosts in there, and the connection is not immediately failed (because
the host doesn't exist or it rejects the connection) but stalled by a
firewall for one host, while other hosts later on in the list would be
happy to accept connections.

So maybe this is something to think about, but just changing the
defaul of connect_timeout to something else when load balancing is on
would be very surprising. In any case, I don't think this absolutely
needs to be addressed by the initial feature, it could be expanded upon
later on if needed, the feature is useful on its own, along with
connect_timeout.


Michael




Re: fix stats_fetch_consistency value in postgresql.conf.sample

2022-09-17 Thread Justin Pryzby
This is an alternative implementation, which still relies on adding the
GUC_DYNAMIC, flag but doesn't require adding a new, sql-accessible
function to convert the GUC to a pretty/human display value.
>From 25ee6d6ed23ff273e622551fd033c8d086953fe5 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 19 Jul 2022 08:31:56 -0500
Subject: [PATCH 1/2] add DYNAMIC_DEFAULT for settings which vary by
 ./configure or platform or initdb

---
 doc/src/sgml/func.sgml  |  8 ++
 src/backend/utils/misc/guc_funcs.c  |  4 ++-
 src/backend/utils/misc/guc_tables.c | 39 -
 src/include/utils/guc.h |  2 ++
 4 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 67eb3806326..63063d054c5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24341,6 +24341,14 @@ SELECT collation for ('foo' COLLATE "de_DE");
  FlagDescription
 
 
+
+ 
+  DYNAMIC_DEFAULT
+  Parameters with this flag have default values which vary by
+  platform, or compile-time options, or are set dynamically during initdb.
+  
+ 
+
  
   EXPLAIN
   Parameters with this flag are included in
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 3d2df18659b..3376f23e076 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -539,7 +539,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 Datum
 pg_settings_get_flags(PG_FUNCTION_ARGS)
 {
-#define MAX_GUC_FLAGS	5
+#define MAX_GUC_FLAGS	6
 	char	   *varname = TextDatumGetCString(PG_GETARG_DATUM(0));
 	struct config_generic *record;
 	int			cnt = 0;
@@ -552,6 +552,8 @@ pg_settings_get_flags(PG_FUNCTION_ARGS)
 	if (record == NULL)
 		PG_RETURN_NULL();
 
+	if (record->flags & GUC_DYNAMIC_DEFAULT)
+		flags[cnt++] = CStringGetTextDatum("DYNAMIC_DEFAULT");
 	if (record->flags & GUC_EXPLAIN)
 		flags[cnt++] = CStringGetTextDatum("EXPLAIN");
 	if (record->flags & GUC_NO_RESET_ALL)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 7bff45e10d6..bd3c84bc7b8 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1201,7 +1201,7 @@ struct config_bool ConfigureNamesBool[] =
 		{"debug_assertions", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether the running server has assertion checks enabled."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_DYNAMIC_DEFAULT
 		},
 		_enabled,
 #ifdef USE_ASSERT_CHECKING
@@ -1377,7 +1377,8 @@ struct config_bool ConfigureNamesBool[] =
 	{
 		{"update_process_title", PGC_SUSET, PROCESS_TITLE,
 			gettext_noop("Updates the process title to show the active SQL command."),
-			gettext_noop("Enables updating of the process title every time a new SQL command is received by the server.")
+			gettext_noop("Enables updating of the process title every time a new SQL command is received by the server."),
+			GUC_DYNAMIC_DEFAULT
 		},
 		_process_title,
 #ifdef WIN32
@@ -2126,7 +2127,8 @@ struct config_int ConfigureNamesInt[] =
 	{
 		{"max_connections", PGC_POSTMASTER, CONN_AUTH_SETTINGS,
 			gettext_noop("Sets the maximum number of concurrent connections."),
-			NULL
+			NULL,
+			GUC_DYNAMIC_DEFAULT
 		},
 		,
 		100, 1, MAX_BACKENDS,
@@ -2163,7 +2165,7 @@ struct config_int ConfigureNamesInt[] =
 		{"shared_buffers", PGC_POSTMASTER, RESOURCES_MEM,
 			gettext_noop("Sets the number of shared memory buffers used by the server."),
 			NULL,
-			GUC_UNIT_BLOCKS
+			GUC_UNIT_BLOCKS | GUC_DYNAMIC_DEFAULT
 		},
 		,
 		16384, 16, INT_MAX / 2,
@@ -2677,7 +2679,7 @@ struct config_int ConfigureNamesInt[] =
 		{"checkpoint_flush_after", PGC_SIGHUP, WAL_CHECKPOINTS,
 			gettext_noop("Number of pages after which previously performed writes are flushed to disk."),
 			NULL,
-			GUC_UNIT_BLOCKS
+			GUC_UNIT_BLOCKS | GUC_DYNAMIC_DEFAULT
 		},
 		_flush_after,
 		DEFAULT_CHECKPOINT_FLUSH_AFTER, 0, WRITEBACK_MAX_PENDING_FLUSHES,
@@ -2895,7 +2897,7 @@ struct config_int ConfigureNamesInt[] =
 		{"bgwriter_flush_after", PGC_SIGHUP, RESOURCES_BGWRITER,
 			gettext_noop("Number of pages after which previously performed writes are flushed to disk."),
 			NULL,
-			GUC_UNIT_BLOCKS
+			GUC_UNIT_BLOCKS | GUC_DYNAMIC_DEFAULT
 		},
 		_flush_after,
 		DEFAULT_BGWRITER_FLUSH_AFTER, 0, WRITEBACK_MAX_PENDING_FLUSHES,
@@ -2908,7 +2910,7 @@ struct config_int ConfigureNamesInt[] =
 			RESOURCES_ASYNCHRONOUS,
 			gettext_noop("Number of simultaneous requests that can be handled efficiently by the disk subsystem."),
 			NULL,
-			GUC_EXPLAIN
+			GUC_EXPLAIN | GUC_DYNAMIC_DEFAULT
 		},
 		_io_concurrency,
 #ifdef USE_PREFETCH
@@ -2926,7 +2928,7 @@ struct config_int ConfigureNamesInt[] =
 			RESOURCES_ASYNCHRONOUS,
 			gettext_noop("A variant of effective_io_concurrency that is used for maintenance work."),
 			NULL,
-			GUC_EXPLAIN
+			GUC_EXPLAIN | 

Re: [Proposal] Add foreign-server health checks infrastructure

2022-09-17 Thread Fujii Masao




On 2022/03/04 15:17, kuroda.hay...@fujitsu.com wrote:

Hi Hackers,


It's not happy, but I'm not sure about a good solution. I made a timer 
reschedule
if connection lost had detected. But if queries in the transaction are quite 
short,
catching SIGINT may be fail.


Attached uses another way: sets pending flags again if DoingCommandRead is true.
If a remote server goes down while it is in idle_in_transaction,
next query will fail because of ereport(ERROR).

How do you think?


Sounds ok to me.

Thanks for updating the patches!

These failed to be applied to the master branch cleanly. Could you update them?

+  this option relies on kernel events exposed by Linux, macOS,

s/this/This

+   GUC_check_errdetail("pgfdw_health_check_interval must be set to 0 on 
this platform");

The actual parameter name "postgres_fdw.health_check_interval"
should be used for the message instead of internal variable name.

+   /* Register a timeout for checking remote servers */
+   pgfdw_health_check_timeout = RegisterTimeout(USER_TIMEOUT, 
pgfdw_connection_check);

This registered signal handler does lots of things. But that's not acceptable
and they should be performed outside signal handler. No?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

2022-09-17 Thread Fujii Masao




On 2022/09/17 16:18, Bharath Rupireddy wrote:

Good idea. It makes a lot more sense to me, because xlog.c is already
a file of 9000 LOC. I've created xlogbackup.c/.h files and added the
new code there. Once this patch gets in, I can offer my hand to move
do_pg_backup_start() and do_pg_backup_stop() from xlog.c and if okay,
pg_backup_start() and pg_backup_stop() from xlogfuncs.c to
xlogbackup.c/.h. Then, we might have to create new get/set APIs for
XLogCtl fields that do_pg_backup_start() and do_pg_backup_stop()
access.


The definition of SessionBackupState enum type also should be in xlogbackup.h?


We need to carry tablespace_map contents from do_pg_backup_start()
till pg_backup_stop(), backup_label and history_file too are easy to
carry across. Hence it will be good to have all of them i.e.
tablespace_map, backup_label and history_file in the BackupState
structure. IMO, this way is good.


backup_label and history_file are not carried between pg_backup_start()
and _stop(), so don't need to be saved in BackupState. Their contents
can be easily created from other saved fields in BackupState,
if necessary. So at least for me it's better to get rid of them from
BackupState and don't allocate TopMemoryContext memory for them.


Something unrelated to your patch that I am noticing while scanning
the area is that we have been always lazy in freeing the label file
data allocated in TopMemoryContext when using the SQL functions if the
backup is aborted.  We are not talking about this much amount of
memory each time a backup is performed, but that could be a cause for
memory bloat in a server if the same session is used and backups keep
failing, as the data is freed only on a successful pg_backup_stop().
Base backups through the replication protocol don't care about that as
the code keeps around the same pointer for the whole duration of
perform_base_backup().  Trying to tie the cleanup of the label file
with the abort phase would be the cause of more complications with
do_pg_backup_stop(), and xlog.c has no need to know about that now.
Just a remark for later.


Yeah, I think that can be solved by passing in backup_state to
do_pg_abort_backup(). If okay, I can work on this too as 0002 patch in
this thread or we can discuss this separately to get more attention
after this refactoring patch gets in.


Or, to avoid such memory bloat, how about allocating the memory for
backup_state only when it's NULL?


I'm attaching v5 patch with the above review comments addressed,
please review it further.


Thanks for updating the patch!

+   charstartxlogfile[MAXFNAMELEN_BACKUP]; /* backup start WAL 
file */

+   charstopxlogfile[MAXFNAMELEN_BACKUP];   /* backup stop 
WAL file */

These file names seem not necessary in BackupState because they can be
calculated from other fields like startpoint and starttli, etc when
making backup_label and history file contents. If we remove them from
BackupState, we can also remove the definition of MAXFNAMELEN_BACKUP
macro from xlogbackup.h.

+   /* construct backup_label contents */
+   build_backup_content(state, false);

In basebackup case, build_backup_content() is called unnecessary twice
because do_pg_stop_backup() and its caller, perform_base_backup() call
that. This makes me think that it's better to get rid of the call to
build_backup_content() from do_pg_backup_stop(). Instead its callers,
perform_base_backup() and pg_backup_stop() should call that.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-17 Thread Peter Geoghegan
On Fri, Sep 16, 2022 at 11:59 PM Michael Paquier  wrote:
> If check_usermap() is used in a bugfix, that could be a risk, so this
> bit warrants a backpatch in my opinion.

Makes sense. Committed and backpatched a fix for check_usermap() just now

Thanks
-- 
Peter Geoghegan




Re: remove more archiving overhead

2022-09-17 Thread Nathan Bossart
On Sat, Sep 17, 2022 at 11:46:39AM +0200, Peter Eisentraut wrote:
>> > --- a/doc/src/sgml/backup.sgml
>> > +++ b/doc/src/sgml/backup.sgml
>> > @@ -691,11 +691,9 @@ test ! -f 
>> > /mnt/server/archivedir/000100A90065  cp pg_wal/0
>> >   system crashes before the server makes a durable record of archival 
>> > success,
>> >   the server will attempt to archive the file again after restarting 
>> > (provided
>> >   archiving is still enabled).  When an archive library encounters a
>> > -pre-existing file, it may return true if the WAL 
>> > file has
>> > +pre-existing file, it should return true if the 
>> > WAL file has
>> >   identical contents to the pre-existing archive and the pre-existing 
>> > archive
>> > -is fully persisted to storage.  Alternatively, the archive library may
>> > -return false anytime a pre-existing file is 
>> > encountered,
>> > -but this will require manual action by an administrator to resolve.  
>> > If a
>> > +is fully persisted to storage.  If a
>> >   pre-existing file contains different contents than the WAL file being
>> >   archived, the archive library must return
>> >   false.
>> 
>> Works for me.  Thanks.
> 
> This documentation change only covers archive_library.  How are users of
> archive_command supposed to handle this?

I believe users of archive_command need to do something similar to what is
described here.  However, it might be more reasonable to expect
archive_command users to simply return false when there is a pre-existing
file, as the deleted text notes.  IIRC that is why I added that sentence
originally.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-17 Thread Nathan Bossart
On Fri, Sep 16, 2022 at 02:54:14PM +0700, John Naylor wrote:
> Here again, I'd rather put this off and focus on getting the "large
> details" in good enough shape so we can got towards integrating with
> vacuum.

I started a new thread for the SIMD patch [0] so that this thread can
remain focused on the radix tree stuff.

[0] https://www.postgresql.org/message-id/20220917052903.GA3172400%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: missing indexes in indexlist with partitioned tables

2022-09-17 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Sep-16, David Rowley wrote:
>> I kinda disagree with Alvaro's fix in 05fb5d661. I think indexlist is
>> the place to store these details.  That commit added the following
>> comment:
>> 
>> /*
>> * Ignore partitioned indexes, since they are not usable for
>> * queries.
>> */
>> 
>> But neither are hypothetical indexes either, yet they're added to
>> RelOptInfo.indexlist.
>> 
>> I think the patch should be changed so that the existing list is used
>> and we find another fix for the problems Alvaro fixed in 05fb5d661.
>> Unfortunately, there was no discussion marked on that commit message,
>> so it's not quite clear what the problem was. I'm unsure if there was
>> anything other than CLUSTER that was broken.

> After a bit of trawling through the archives, I found it here:
> https://www.postgresql.org/message-id/20180124162006.pmapfiznhgngwtjf%40alvherre.pgsql
> I think there was insufficient discussion and you're probably right that
> it wasn't the best fix.  I don't object to finding another fix.

FWIW, I don't see any big problem with what you did.  We'd need to
do something more like what David suggests if the planner ever has
a reason to consider partitioned indexes.  But as long as it does
not, why expend the time to build data structures representing them?
And we'd have to add code in quite a few places to ignore them,
once they're in indexlist.

regards, tom lane




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-17 Thread Peter Geoghegan
On Sat, Sep 17, 2022 at 11:26 AM Tom Lane  wrote:
> Yeah, bringing the regex code into line with our standards is fine.
> I don't really see a reason not to do it with the timezone code
> either, as long as there aren't too many changes there.  We are
> carrying a pretty large number of diffs from upstream already.

I'd be surprised if this created more than 3 minutes of extra work for
you when updating the timezone code.

There are a few places where I had to apply a certain amount of
subjective judgement (rather than just mechanically normalizing the
declarations), but the timezone code wasn't one of those places. Plus
there just isn't that many affected timezone related function
declarations, and they're concentrated in only 3 distinct areas.

-- 
Peter Geoghegan




Re: missing indexes in indexlist with partitioned tables

2022-09-17 Thread Alvaro Herrera
On 2022-Sep-16, David Rowley wrote:

> I kinda disagree with Alvaro's fix in 05fb5d661. I think indexlist is
> the place to store these details.  That commit added the following
> comment:
> 
> /*
> * Ignore partitioned indexes, since they are not usable for
> * queries.
> */
> 
> But neither are hypothetical indexes either, yet they're added to
> RelOptInfo.indexlist.
> 
> I think the patch should be changed so that the existing list is used
> and we find another fix for the problems Alvaro fixed in 05fb5d661.
> Unfortunately, there was no discussion marked on that commit message,
> so it's not quite clear what the problem was. I'm unsure if there was
> anything other than CLUSTER that was broken.

After a bit of trawling through the archives, I found it here:
https://www.postgresql.org/message-id/20180124162006.pmapfiznhgngwtjf%40alvherre.pgsql
I think there was insufficient discussion and you're probably right that
it wasn't the best fix.  I don't object to finding another fix.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-17 Thread Tom Lane
Peter Geoghegan  writes:
> Several files from src/timezone and from src/backend/regex make use of
> unnamed parameters in function declarations. It wouldn't be difficult
> to fix everything and call it a day, but I wonder if there are any
> special considerations here. I don't think that Henry Spencer's regex
> code is considered vendored code these days (if it ever was), so that
> seems clear cut. I'm less sure about the timezone code.

Yeah, bringing the regex code into line with our standards is fine.
I don't really see a reason not to do it with the timezone code
either, as long as there aren't too many changes there.  We are
carrying a pretty large number of diffs from upstream already.

(Which reminds me that I need to do another update pass on that
code soon.  Not right now, though.)

regards, tom lane




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-17 Thread Peter Geoghegan
On Fri, Sep 16, 2022 at 6:48 PM Peter Geoghegan  wrote:
> On Fri, Sep 16, 2022 at 6:20 PM Tom Lane  wrote:
> > I think they're easily Stroustrup's worst idea ever.  You're basically
> > throwing away an opportunity for documentation, and that documentation
> > is often sorely needed.
>
> He could at least point to C++ pure virtual functions, where omitting
> a parameter name in the base class supposedly conveys useful
> information. I don't find that argument particularly convincing
> myself, even in a C++ context, but at least it's an argument. Doesn't
> apply here in any case.

Several files from src/timezone and from src/backend/regex make use of
unnamed parameters in function declarations. It wouldn't be difficult
to fix everything and call it a day, but I wonder if there are any
special considerations here. I don't think that Henry Spencer's regex
code is considered vendored code these days (if it ever was), so that
seems clear cut. I'm less sure about the timezone code.

Note that regcomp.c has a relatively large number of function
declarations that need to be fixed (regexec.c has some too), since the
regex code was written in a style that makes unnamed parameters in
declarations the standard -- we're talking about changing every static
function declaration. The timezone code is just inconsistent about its
use of unnamed parameters, kind of like reorderbuffer.h.

I don't see any reason to treat this quasi-vendored code as special,
but I don't really know anything about your workflow with the timezone
files.

-- 
Peter Geoghegan




Re: Allow file inclusion in pg_hba and pg_ident files

2022-09-17 Thread Julien Rouhaud
On Tue, Aug 16, 2022 at 02:10:30PM +0800, Julien Rouhaud wrote:
> Hi,
> 
> On Sat, Jul 30, 2022 at 04:09:36PM +0800, Julien Rouhaud wrote:
> 
> >
> > - 0001: the rule_number / mapping_number addition in the views in a separate
> >   commit
> > - 0002: the main file inclusion patch.  Only a few minor bugfix since
> >   previous version discovered thanks to the tests (a bit more about it 
> > after),
> >   and documentation tweaks based on previous discussions
> > - 0003: the pg_hba_matches() POC, no changes

v10 attached to fix a few conflict with recent refactoring commits, no other
change.
>From d81715bf9848f1aa310ea199a9774eebcbe372ab Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 30 May 2022 10:59:51 +0800
Subject: [PATCH v10 1/3] Add rule_number / mapping_number to the
 pg_hba/pg_ident views.

Author: Julien Rouhaud
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 doc/src/sgml/system-views.sgml  | 22 +
 src/backend/utils/adt/hbafuncs.c| 50 ++---
 src/include/catalog/pg_proc.dat | 11 ---
 src/test/regress/expected/rules.out | 10 +++---
 4 files changed, 72 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 44aa70a031..1d619427c1 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -991,6 +991,18 @@
 
 
 
+ 
+  
+   rule_number int4
+  
+  
+   Rule number of this rule among all rules if the rule is valid, otherwise
+   null. This indicates the order in which each rule will be considered
+   until the first matching one, if any, is used to perform authentication
+   with the client.
+  
+ 
+
  
   
line_number int4
@@ -1131,6 +1143,16 @@
 
 
 
+ 
+  
+   mapping_number int4
+  
+  
+   Mapping number, in priority order, of this mapping if the mapping is
+   valid, otherwise null
+  
+ 
+
  
   
line_number int4
diff --git a/src/backend/utils/adt/hbafuncs.c b/src/backend/utils/adt/hbafuncs.c
index 9e5794071c..c9be4bff1f 100644
--- a/src/backend/utils/adt/hbafuncs.c
+++ b/src/backend/utils/adt/hbafuncs.c
@@ -26,10 +26,12 @@
 
 static ArrayType *get_hba_options(HbaLine *hba);
 static void fill_hba_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
- int lineno, HbaLine *hba, 
const char *err_msg);
+ int rule_number, int lineno, 
HbaLine *hba,
+ const char *err_msg);
 static void fill_hba_view(Tuplestorestate *tuple_store, TupleDesc tupdesc);
 static void fill_ident_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
-   int lineno, IdentLine 
*ident, const char *err_msg);
+   int mapping_number, int 
lineno, IdentLine *ident,
+   const char *err_msg);
 static void fill_ident_view(Tuplestorestate *tuple_store, TupleDesc tupdesc);
 
 
@@ -157,7 +159,7 @@ get_hba_options(HbaLine *hba)
 }
 
 /* Number of columns in pg_hba_file_rules view */
-#define NUM_PG_HBA_FILE_RULES_ATTS  9
+#define NUM_PG_HBA_FILE_RULES_ATTS  10
 
 /*
  * fill_hba_line
@@ -165,6 +167,7 @@ get_hba_options(HbaLine *hba)
  *
  * tuple_store: where to store data
  * tupdesc: tuple descriptor for the view
+ * rule_number: unique rule identifier among all valid rules
  * lineno: pg_hba.conf line number (must always be valid)
  * hba: parsed line data (can be NULL, in which case err_msg should be set)
  * err_msg: error message (NULL if none)
@@ -174,7 +177,8 @@ get_hba_options(HbaLine *hba)
  */
 static void
 fill_hba_line(Tuplestorestate *tuple_store, TupleDesc tupdesc,
- int lineno, HbaLine *hba, const char *err_msg)
+ int rule_number, int lineno, HbaLine *hba,
+ const char *err_msg)
 {
Datum   values[NUM_PG_HBA_FILE_RULES_ATTS];
boolnulls[NUM_PG_HBA_FILE_RULES_ATTS];
@@ -193,6 +197,11 @@ fill_hba_line(Tuplestorestate *tuple_store, TupleDesc 
tupdesc,
memset(nulls, 0, sizeof(nulls));
index = 0;
 
+   /* rule_number */
+   if (err_msg)
+   nulls[index++] = true;
+   else
+   values[index++] = Int32GetDatum(rule_number);
/* line_number */
values[index++] = Int32GetDatum(lineno);
 
@@ -336,7 +345,7 @@ fill_hba_line(Tuplestorestate *tuple_store, TupleDesc 
tupdesc,
else
{
/* no parsing result, so set relevant fields to nulls */
-   memset([1], true, (NUM_PG_HBA_FILE_RULES_ATTS - 2) * 
sizeof(bool));
+   memset([2], true, (NUM_PG_HBA_FILE_RULES_ATTS - 3) * 
sizeof(bool));
}
 
/* error 

Re: [RFC] building postgres with meson - v13

2022-09-17 Thread Andres Freund
Hi,

On 2022-09-16 16:22:35 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-09-16 09:14:20 +1200, Thomas Munro wrote:
> >> On Thu, Sep 15, 2022 at 2:26 PM Andres Freund  wrote:
> >>> - noticed that libpgport.a had and needed a dependency on errcodes.h - 
> >>> that
> >>> seemed wrong. The dependency is due to src/port/*p{read,write}v?.c 
> >>> including
> >>> postgres.h - which seems wrong. So I added a patch changing them to 
> >>> include
> >>> c.h.
> 
> >> Oops.  +1
> 
> > Looks like this has been the case since
> > 0d56acfbaa799553c0c6ea350fd6e68d81025994 in 14. Any opinions on whether we
> > should backpatch the "fix"?
> 
> +1, those files have no business including all of postgres.h

Done.

I've been wondering whether we should protect against this kind of issue on
the buildsystem level. Whenever building frontend code, add something like
-DBUILDING_FRONTEND, and error out if postgres.h is included without going
through postgres_fe.h.

Greetings,

Andres Freund




Re: Patch to address creation of PgStat* contexts with null parent context

2022-09-17 Thread Andres Freund
Hi,

On 2022-09-07 11:11:11 +0200, Drouvot, Bertrand wrote:
> On 9/6/22 7:53 AM, Kyotaro Horiguchi wrote:
> > At Mon, 5 Sep 2022 14:46:55 +0200, "Drouvot, Bertrand" 
> >  wrote in
> > > Looks like that both approaches have their pros and cons. I'm tempted
> > > to vote +1 on "just changing" the parent context to TopMemoryContext
> > > and not changing the allocations locations.
> > Yeah. It is safe more than anything and we don't have a problem there.
> > 
> > So, I'm fine with just replacing the parent context at the three places.
> 
> Attached a patch proposal to do so.

Pushed. Thanks for the report and the fix!

Greetings,

Andres Freund




Re: archive modules

2022-09-17 Thread Peter Eisentraut

On 14.09.22 23:09, Nathan Bossart wrote:

On Wed, Sep 14, 2022 at 09:31:04PM +0200, Peter Eisentraut wrote:

Here is a patch that addresses this.


My intent was to present archive_command as the built-in archive library,
but I can see how this might cause confusion, so this change seems
reasonable to me.


While working on this, I noticed that in master this conflicts with 
commit 3cabe45a819f8a2a282d9d57e45f259c84e97c3f.  I have posted a 
message in that thread looking for a resolution.






Re: remove more archiving overhead

2022-09-17 Thread Peter Eisentraut

On 03.08.22 09:16, Noah Misch wrote:

On Mon, Aug 01, 2022 at 10:02:19PM -0700, Nathan Bossart wrote:

On Sat, Jul 30, 2022 at 11:51:56PM -0700, Noah Misch wrote:

Inviting the administrator to resolve things is more dangerous than just
returning true.  I recommend making this text more opinionated and simpler:
libraries must return true.  Alternately, if some library has found a good
reason to return false, this paragraph could give the reason.  I don't know of
such a reason, though.


Your suggestion seems reasonable to me.  I've attached a small patch.



--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -691,11 +691,9 @@ test ! -f /mnt/server/archivedir/000100A90065 
 cp pg_wal/0
  system crashes before the server makes a durable record of archival 
success,
  the server will attempt to archive the file again after restarting 
(provided
  archiving is still enabled).  When an archive library encounters a
-pre-existing file, it may return true if the WAL file 
has
+pre-existing file, it should return true if the WAL 
file has
  identical contents to the pre-existing archive and the pre-existing 
archive
-is fully persisted to storage.  Alternatively, the archive library may
-return false anytime a pre-existing file is encountered,
-but this will require manual action by an administrator to resolve.  If a
+is fully persisted to storage.  If a
  pre-existing file contains different contents than the WAL file being
  archived, the archive library must return
  false.


Works for me.  Thanks.


This documentation change only covers archive_library.  How are users of 
archive_command supposed to handle this?






Re: ICU for global collation

2022-09-17 Thread Marina Polyakova

On 2022-09-16 10:56, Peter Eisentraut wrote:

On 15.09.22 17:41, Marina Polyakova wrote:
I agree with you. Here's another version of the patch. The 
locale/encoding checks and reports in initdb have been reordered, 
because now the encoding is set first and only then the ICU locale is 
checked.


I committed something based on the first version of your patch.  This
reordering of the messages here was a little too much surgery for me
at this point.  For instance, there are also messages in #ifdef WIN32
code that would need to be reordered as well.  I kept the overall
structure of the code the same and just inserted the additional
proposed checks.

If you want to pursue the reordering of the checks and messages
overall, a patch for the master branch could be considered.


Thank you! I already wrote about the order of the ICU checks in 
initdb/create database, they were the only reason to propose such 
changes...


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: ICU for global collation

2022-09-17 Thread Marina Polyakova
Thanks to Kyotaro Horiguchi review we found out that there're 
interesting cases due to the order of some ICU checks:


1. ICU locale vs supported encoding:

1.1.

On 2022-09-15 09:52, Kyotaro Horiguchi wrote:

If I executed initdb as follows, I would be told to specify
--icu-locale option.


$ initdb --encoding sql-ascii --locale-provider icu hoge
...
initdb: error: ICU locale must be specified


However, when I reran the command, it complains about incompatible
encoding this time.  I think it's more user-friendly to check for the
encoding compatibility before the check for missing --icu-locale
option.


1.2. (ok?)

$ initdb --encoding sql-ascii --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider 
"icu" is chosen


$ initdb --encoding sql-ascii --icu-locale en-US --locale-provider icu 
hoge

...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (SQL_ASCII) is not supported 
with the ICU provider.
initdb: hint: Rerun initdb and either do not specify an encoding 
explicitly, or choose a matching combination.


$ createdb --encoding sql-ascii --icu-locale en-US hoge
createdb: error: database creation failed: ERROR:  ICU locale cannot be 
specified unless locale provider is ICU
$ createdb --encoding sql-ascii --icu-locale en-US --locale-provider icu 
hoge
createdb: error: database creation failed: ERROR:  encoding "SQL_ASCII" 
is not supported with ICU provider


2. For builds without ICU:

2.1.

$ initdb --locale-provider icu hoge
...
initdb: error: ICU locale must be specified

$ initdb --locale-provider icu --icu-locale en-US hoge
...
initdb: error: ICU is not supported in this build

$ createdb --locale-provider icu hoge
createdb: error: database creation failed: ERROR:  ICU locale must be 
specified


$ createdb --locale-provider icu --icu-locale en-US hoge
createdb: error: database creation failed: ERROR:  ICU is not supported 
in this build


IMO, it would be more user-friendly to inform an unsupported build in 
the first runs too..


2.2. (ok?)

$ initdb --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider 
"icu" is chosen

$ initdb --icu-locale en-US --locale-provider icu hoge
...
initdb: error: ICU is not supported in this build

$ createdb --icu-locale en-US hoge
createdb: error: database creation failed: ERROR:  ICU locale cannot be 
specified unless locale provider is ICU

$ createdb --icu-locale en-US --locale-provider icu hoge
createdb: error: database creation failed: ERROR:  ICU is not supported 
in this build


2.3.

$ createdb --locale-provider icu --icu-locale en-US --encoding sql-ascii 
hoge
createdb: error: database creation failed: ERROR:  encoding "SQL_ASCII" 
is not supported with ICU provider

$ createdb --locale-provider icu --icu-locale en-US --encoding utf8 hoge
createdb: error: database creation failed: ERROR:  ICU is not supported 
in this build


IMO, it would be more user-friendly to inform an unsupported build in 
the first run too..


3.

The locale provider is ICU, but it has not yet been set from the 
template database:



$ initdb --locale-provider icu --icu-locale en-US -D data &&
pg_ctl -D data -l logfile start &&
createdb --icu-locale ru-RU --template template0 mydb
...
createdb: error: database creation failed: ERROR:  ICU locale cannot be
specified unless locale provider is ICU


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

2022-09-17 Thread Bharath Rupireddy
On Fri, Sep 16, 2022 at 12:01 PM Michael Paquier  wrote:
>
> On Wed, Sep 14, 2022 at 02:24:12PM +0530, Bharath Rupireddy wrote:
> > I'm attaching the v4 patch that's rebased on to the latest HEAD.
> > Please consider this for review.
>
> I have been looking at this patch.

Thanks for reviewing it.

> -   StringInfo  labelfile;
> -   StringInfo  tblspc_map_file;
> backup_manifest_info manifest;
> +   BackupState backup_state;
> You could use initialize the state here with a {0}.  That's simpler.

BackupState is a pointer  to BackupStateData, we can't initialize that
way. However, I got rid of BackupStateData and used BackupState for
the structure directly, whenever pointer to the structure is required,
I'm using BackupState * to be more clear.

> --- a/src/include/access/xlog_internal.h
> +++ b/src/include/access/xlog_internal.h
> @@ -380,6 +380,31 @@ GetRmgr(RmgrId rmid)
>  }
>  #endif
>
> +/* Structure to hold backup state. */
> +typedef struct BackupStateData
> +{
> Why is that in xlog_internal.h?  This header includes a lot of
> declarations about the internals of WAL, but the backup state is not
> that internal.  I'd like to think that we should begin separating the
> backup-related routines into their own file, as of a set of
> xlogbackup.c/h in this case.  That's a split I have been wondering
> about for some time now.  The internals of xlog.c for the start/stop
> backups are tied to XLogCtlData which map such a switch more
> complicated than it looks, so we can begin small and have the routines
> to create, free and build the label file and the tablespace map in
> this new file.

Good idea. It makes a lot more sense to me, because xlog.c is already
a file of 9000 LOC. I've created xlogbackup.c/.h files and added the
new code there. Once this patch gets in, I can offer my hand to move
do_pg_backup_start() and do_pg_backup_stop() from xlog.c and if okay,
pg_backup_start() and pg_backup_stop() from xlogfuncs.c to
xlogbackup.c/.h. Then, we might have to create new get/set APIs for
XLogCtl fields that do_pg_backup_start() and do_pg_backup_stop()
access.

> +   state->name = (char *) palloc0(len + 1);
> +   memcpy(state->name, name, len);
> Or just pstrdup()?

Done.

> +BackupState
> +get_backup_state(const char *name)
> +{
> I would name this one create_backup_state() instead.
>
> +void
> +create_backup_content_str(BackupState state, bool forhistoryfile)
> +{
> This could be a build_backup_content().

I came up with more meaningful names - allocate_backup_state(),
deallocate_backup_state(), build_backup_content().

> It seems to me that there is no point in having the list of
> tablespaces in BackupStateData? This actually makes the code harder
> to follow, see for example the changes with do_pg_backup_start(), we
> the list of tablespace may or may be not passed down as a pointer of
> BackupStateData while BackupStateData is itself the first argument of
> this routine.  These are independent from the label and backup history
> file, as well.

I haven't stored the list of tablespaces in BackupState, it's the
string that do_pg_backup_start() creates is stored in there for
carrying it till pg_backup_stop(). Adding the tablespace_map,
backup_label, history_file in BackupState makes it easy to carry them
across various backup related functions.

> We need to be careful about the file format (see read_backup_label()),
> and create_backup_content_str() is careful about that which is good.
> Core does not care about the format of the backup history file, though
> some community tools may.

Are you suggesting that we need something like check_history_file()
similar to what read_backup_label() does by parsing each line of the
label file and erroring out if not in the required format?

> I agree that what you are proposing here
> makes the generation of these files easier to follow, but let's
> document what forhistoryfile is here for, at least.  Saving the
> the backup label and history file strings in BackupState is a
> confusing interface IMO.  It would be more intuitive to have the
> backup state in input, and provide the string generated in output
> depending on what we want from the backup state.

We need to carry tablespace_map contents from do_pg_backup_start()
till pg_backup_stop(), backup_label and history_file too are easy to
carry across. Hence it will be good to have all of them i.e.
tablespace_map, backup_label and history_file in the BackupState
structure. IMO, this way is good.

> -   backup_started_in_recovery = RecoveryInProgress();
> +   Assert(state != NULL);
> +
> +   in_recovery = RecoveryInProgress();
> [...]
> -   if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
> +   if (state->started_in_recovery == true && in_recovery == false)
>
> I would have kept the naming to backup_started_in_recovery here.  What
> you are doing is logically right by relying on started_in_recovery to
> check if recovery was running when the backup started, but this just
> creates useless 

Re: ICU for global collation

2022-09-17 Thread Marina Polyakova

On 2022-09-16 11:11, Kyotaro Horiguchi wrote:

At Fri, 16 Sep 2022 09:49:28 +0300, Marina Polyakova
 wrote in

In continuation of options check: AFAICS the following checks in
initdb

if (locale_provider == COLLPROVIDER_ICU)
{
if (!icu_locale)
pg_fatal("ICU locale must be specified");

/*
 * In supported builds, the ICU locale ID will be checked by the
 * backend during post-bootstrap initialization.
 */
#ifndef USE_ICU
pg_fatal("ICU is not supported in this build");
#endif
}

are executed approximately when they are executed in create database
after getting all the necessary data from the template database:


initdb doesn't work that way, but anyway, I realized that I am
proposing to move that code in setlocales() to the caller function as
the result. I don't think setlocales() is the place for the code
because icu locale has no business with what the function does.  That
being said there's no obvious reason we *need* to move the code out to
its caller.


Excuse me, but could you explain your last sentence in more detail? I 
read that this code is not for setlocales and then - that it should not 
moved from here, so I'm confused...



+errmsg("encoding \"%s\" is not supported 
with ICU provider",

+   pg_log_error("encoding \"%s\" is not supported with ICU 
provider",
+pg_encoding_to_char(encodingid));

I might be wrong, but the messages look wrong to me.  The alternatives
below might work.

"encoding \"%s\" is not supported by ICU"
"encoding \"%s\" cannot be used for/with ICU locales"


The message indicates that the selected encoding cannot be used with the 
ICU provider because it does not support it. But if the text of the 
message becomes better and clearer, I will only be glad.



+   pg_log_error_hint("Rerun %s and choose a matching combination.",
+ progname);

This doesn't seem to provide users with useful information.


It was commited in more verbose form:

pg_log_error_hint("Rerun %s and either do not specify an encoding 
explicitly, "

  "or choose a matching combination.",

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-17 Thread Michael Paquier
On Fri, Sep 16, 2022 at 06:48:36PM -0700, Peter Geoghegan wrote:
> On Fri, Sep 16, 2022 at 6:20 PM Tom Lane  wrote:
>> I'd view the current state of reorderbuffer.h as pretty unacceptable on
>> stylistic grounds no matter which position you take.  Having successive
>> declarations randomly using named or unnamed parameters is seriously
>> ugly and distracting, at least to my eye.  We don't need such blatant
>> reminders of how many cooks have stirred this broth.
> 
> I'll come up with a revision that deals with that too, then. Shouldn't
> be too much more work.

Being able to catch unnamed paramaters in function declarations would
be really nice.  Thanks for looking at that.

> I suppose that I ought to backpatch a fix for the really egregious
> issue in hba.h, and leave it at that on stable branches.

If check_usermap() is used in a bugfix, that could be a risk, so this
bit warrants a backpatch in my opinion.
--
Michael


signature.asc
Description: PGP signature


Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf

2022-09-17 Thread Michael Paquier
On Fri, Sep 16, 2022 at 06:24:07PM +0200, Drouvot, Bertrand wrote:
> The CF bot is failing for Windows (all other tests are green) and only for
> the new tap test related to the regular expression on the host name (the
> ones on database and role are fine).
> 
> The issue is not related to the patch. The issue is that the Windows Cirrus
> test does not like when a host name is provided for a "host" entry in
> pg_hba.conf (while it works fine when a CIDR is provided).
> 
> You can see an example in [1] where the only change is to replace the CIDR
> by "localhost" in 002_scram.pl. As you can see the Cirrus tests are failing
> on Windows only (its log file is here [2]).
> 
> I'll look at this "Windows" related issue but would appreciate any
> guidance/help if someone has experience in this area on windows.

I recall that being able to do a reverse lookup of a hostname on
Windows for localhost requires a few extra setup steps as that's not
guaranteed to be set in all environments by default, which is why we
go at great length to use 127.0.0.1 in the TAP test setup for example
(see Cluster.pm).  Looking at your patch, the goal is to test the
mapping of regular expression for host names, user names and database
names.  If the first case is not guaranteed, my guess is that it is
fine to skip this portion of the tests on Windows.

While reading the patch, I am a bit confused about token_regcomp() and
token_regexec().  It would help the review a lot if these were
documented with proper comments, even if these act roughly as wrappers
for pg_regexec() and pg_regcomp().
--
Michael


signature.asc
Description: PGP signature


Re: Pruning never visible changes

2022-09-17 Thread Simon Riggs
On Fri, 16 Sept 2022 at 18:37, Tom Lane  wrote:
>
> Simon Riggs  writes:
> > On Fri, 16 Sept 2022 at 15:26, Tom Lane  wrote:
> >> You cannot
> >> do that, at least not without checking that the originating
> >> transaction has no snapshots that could see the older row version.
>
> > I'm saying this is possible only AFTER the end of the originating
> > xact, so there are no issues with additional snapshots.
>
> I see, so the point is just that we can prune even if the originating
> xact hasn't yet passed the global xmin horizon.  I agree that's safe,
> but will it fire often enough to be worth the trouble?

It is an edge case with limited utility, I agree, which is why I
looked for a cheap test (xmin == xmax only).

This additional test is also valid, but too expensive to apply:
TransactionIdGetTopmostTranactionId(xmax) ==
TransactionIdGetTopmostTranactionId(xmin)

> Also, why
> does it need to be restricted to certain shapes of HOT chains ---
> that is, why can't we do exactly what we'd do if the xact *were*
> past the xmin horizon?

I thought it important to maintain the integrity of the HOT chain, in
that the xmax of one tuple version matches the xmin of the next. So
pruning only from the root of the chain allows us to maintain that
validity check.

I'm on the fence myself, which is why I didn't post it immediately I
had written it.

If not, it would be useful to add a note in comments to the code to
explain why we don't do this.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Pruning never visible changes

2022-09-17 Thread Simon Riggs
On Fri, 16 Sept 2022 at 21:07, Laurenz Albe  wrote:
>
> On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote:
> > Simon Riggs  writes:
> > > A user asked me whether we prune never visible changes, such as
> > > BEGIN;
> > > INSERT...
> > > UPDATE.. (same row)
> > > COMMIT;
> >
> > Didn't we just have this discussion in another thread?
>
> For reference: that was
> https://postgr.es/m/f6a491b32cb44bb5daaafec835364f7149348fa1.ca...@cybertec.at

Thanks. I confirm I hadn't seen that, and indeed, I wrote the patch on
5 Sept before you posted.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Slow standby snapshot

2022-09-17 Thread Simon Riggs
On Fri, 16 Sept 2022 at 17:08, Michail Nikolaev
 wrote:
>
> Hello everyone.
>
> To find the best frequency for calling KnownAssignedXidsCompress in
> Simon's patch, I made a set of benchmarks. It looks like each 8th xid
> is a little bit often.
>
> Setup and method is the same as previous (1). 16-core machines,
> max_connections = 5000. Tests were running for about a day, 220 runs
> in total (each version for 20 times, evenly distributed throughout the
> day).
>
> Staring from 60th second, 30 seconds-long transaction was started on primary.
>
> Graphs in attachment. So, looks like 64 is the best value here. It
> gives even a little bit more TPS than smaller values.
>
> Yes, such benchmark does not cover all possible cases, but it is
> better to measure at least something when selecting constants :)

This is very good and clear, thank you.


> If someone has an idea of different benchmark scenarios - please share them.

> So, updated version (with 64 and some commit message) in attachment too.
>
> [1]: 
> https://www.postgresql.org/message-id/flat/CANtu0ohzBFTYwdLtcanWo4%2B794WWUi7LY2rnbHyorJdE8_ZnGg%40mail.gmail.com#379c1be7b8134ada5a574078d51b64c6

I've cleaned up the comments and used a #define for the constant.

IMHO this is ready for commit. I will add it to the next CF.

-- 
Simon Riggshttp://www.EnterpriseDB.com/


v8c-new-heuristic-to-compress-KnownAssignedXids.patch
Description: Binary data


Re: ICU for global collation

2022-09-17 Thread Marina Polyakova

On 2022-09-16 10:57, Peter Eisentraut wrote:

On 16.09.22 09:31, Marina Polyakova wrote:
IMO it is hardly understantable from the program output either - it 
looks like I manually chose the encoding UTF8. Maybe first inform 
about selected encoding?..


Yes, I included something like that in the patch just committed.


diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 
6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 
100644

--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2310,7 +2310,11 @@ setup_locale_encoding(void)
  }

  if (!encoding && locale_provider == COLLPROVIDER_ICU)
+    {
  encodingid = PG_UTF8;
+    printf(_("The default database encoding has been set to 
\"%s\" for a better experience with the ICU provider.\n"),

+   pg_encoding_to_char(encodingid));
+    }
  else if (!encoding)
  {
  int    ctype_enc;


Thank you!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company