Re: [PATCH] Expose port->authn_id to extensions and triggers

2022-03-02 Thread Michael Paquier
On Wed, Mar 02, 2022 at 01:27:40PM -0800, Andres Freund wrote:
> I don't think we should commit this without synchronizing the authn between
> worker / leader (in a separate commit). Too likely that some function that's
> marked parallel ok queries the authn_id, opening up a security/monitoring hole
> or such because of a bogus return value.

Hmm, OK.  Using the same authn ID for the leader and the workers still
looks a bit strange to me, as the worker is not the one that does the
authentication; only the leader does that.  Anyway, FixedParallelState
includes some authentication data passed down by the leader when
spawning a worker.  So, if we were to pass down the authn ID, we would
need a new PARALLEL_KEY_* to serialize and restore the data through a
DSM segment, like the other state handled in parallel.c.  Jacob, what
do you think?
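
For illustration, a sketch of what that could look like, following the
existing PARALLEL_KEY_* conventions in parallel.c.  This is only an
assumption of mine, not the patch: the key value, the function names,
and the worker-side storage are hypothetical, and real code would also
need to account for the chunk during the shm_toc estimation phase
(shm_toc_estimate_chunk()/shm_toc_estimate_keys()):

/* Hypothetical key; actual keys are defined in parallel.c */
#define PARALLEL_KEY_AUTHN_ID		UINT64CONST(0xFFFFFFFFFFFF000E)

/* Leader side, while filling the DSM segment. */
static void
SerializeAuthnId(ParallelContext *pcxt)
{
	const char *authn_id = MyProcPort ? MyProcPort->authn_id : NULL;
	Size		len = authn_id ? strlen(authn_id) + 1 : 1;
	char	   *space = shm_toc_allocate(pcxt->toc, len);

	if (authn_id)
		memcpy(space, authn_id, len);
	else
		space[0] = '\0';
	shm_toc_insert(pcxt->toc, PARALLEL_KEY_AUTHN_ID, space);
}

/*
 * Worker side, e.g. from ParallelWorkerMain().  Workers have no Port of
 * their own, so the restored value is kept in a static variable here
 * (hypothetical storage).
 */
static char *parallel_authn_id = NULL;

static void
RestoreAuthnId(shm_toc *toc)
{
	char	   *authn_id = shm_toc_lookup(toc, PARALLEL_KEY_AUTHN_ID, false);

	if (authn_id[0] != '\0')
		parallel_authn_id = MemoryContextStrdup(TopMemoryContext, authn_id);
}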
--
Michael




Re: Changing "Hot Standby" to "hot standby"

2022-03-02 Thread Daniel Westermann (DWE)
>> Thanks for having a look. Are you suggesting to change it like this?
>> -    Hot Standby is the term used to describe the ability to connect to
>> +    Hot standby is the term used to describe the ability to connect to

>Yes.  Isn't it the right form of a sentence?

Done like that.

Regards
Daniel

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index b5b6042104..ea10fe2be8 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -548,7 +548,7 @@ protocol to make nodes agree on a serializable transactional order.
rollforward will take considerably longer, so that technique only
offers a solution for disaster recovery, not high availability.
A standby server can also be used for read-only queries, in which case
-   it is called a Hot Standby server. See  for
+   it is called a hot standby server. See  for
more information.
   
 
@@ -1032,7 +1032,7 @@ primary_slot_name = 'node_a_slot'

 

-Hot Standby feedback propagates upstream, whatever the cascaded arrangement.
+Hot standby feedback propagates upstream, whatever the cascaded arrangement.

 

@@ -1499,16 +1499,16 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
   Hot Standby
 
   
-   Hot Standby
+   hot standby
   
 

-Hot Standby is the term used to describe the ability to connect to
+Hot standby is the term used to describe the ability to connect to
 the server and run read-only queries while the server is in archive
 recovery or standby mode. This
 is useful both for replication purposes and for restoring a backup
 to a desired state with great precision.
-The term Hot Standby also refers to the ability of the server to move
+The term hot standby also refers to the ability of the server to move
 from recovery through to normal operation while users continue running
 queries and/or keep their connections open.

@@ -1623,7 +1623,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
being executed during recovery.  This restriction applies even to
temporary tables, because table rows cannot be read or written without
assigning a transaction ID, which is currently not possible in a
-   Hot Standby environment.
+   hot standby environment.
   
  
  
@@ -1703,7 +1703,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'

 In normal operation, read-only transactions are allowed to
 use LISTEN and NOTIFY,
-so Hot Standby sessions operate under slightly tighter
+so hot standby sessions operate under slightly tighter
 restrictions than ordinary read-only sessions.  It is possible that some
 of these restrictions might be loosened in a future release.

@@ -1746,7 +1746,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'

 

-There are also additional types of conflict that can occur with Hot Standby.
+There are also additional types of conflict that can occur with hot standby.
 These conflicts are hard conflicts in the sense that queries
 might need to be canceled and, in some cases, sessions disconnected to resolve them.
 The user is provided with several ways to handle these
@@ -1947,8 +1947,8 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 If hot_standby is on in postgresql.conf
 (the default value) and there is a
 standby.signalstandby.signalfor hot standby
-file present, the server will run in Hot Standby mode.
-However, it may take some time for Hot Standby connections to be allowed,
+file present, the server will run in hot standby mode.
+However, it may take some time for hot standby connections to be allowed,
 because the server will not accept connections until it has completed
 sufficient recovery to provide a consistent state against which queries
 can run.  During this period,
@@ -2282,7 +2282,7 @@ HINT:  You can then restart the server after making the necessary configuration
Caveats
 

-There are several limitations of Hot Standby.
+There are several limitations of hot standby.
 These can and probably will be fixed in future releases:
 
   
@@ -2299,7 +2299,7 @@ HINT:  You can then restart the server after making the necessary configuration
 
  Valid starting points for standby queries are generated at each
  checkpoint on the primary. If the standby is shut down while the primary
- is in a shutdown state, it might not be possible to re-enter Hot Standby
+ is in a shutdown state, it might not be possible to re-enter hot standby
  until the primary is started up, so that it generates further starting
  points in the WAL logs.  This situation isn't a problem in the most
  common situations where it might happen. Generally, if the primary is


Re: Commitfest 2022-03 Patch Triage Part 1b

2022-03-02 Thread Fabien COELHO


Hello Greg,


Peter posted an updated version of Fabien's patch about a month ago (which at
this point no longer applies)


Attached is a v15, which is a rebase after some minor changes in the source,
with some new test cases added (good!).


fixing a few issues, but also pointing at old review comments still
unaddressed.


ISTM that all comments have been addressed. However, the latest patch
raises issues about working around libpq corner-case behaviors, which are
really just that: corner cases.


Since this was pushed, but had to be reverted, I assume there is a 
desire for the feature but it seems to need more work still.




It looks like Peter and Fabien were debating the merits of a libpq
change and probably that won't happen this release cycle.



ISTM these are really very minor issues that could be resolved in this 
cycle.


Is there a kernel of psql functionality that can be extracted from this 
without the libpq change in this release cycle or should it wait until 
we add the functionality to libpq?


The patch can wait for the issues to be resolved one way or another
before proceeding, *or* it can be applied, maybe with a small tweak, and
the libpq issues fixed separately.


As a reminder, there are two actual "issues" (features? bugs?) with
libpq, which are made visible by the patch but are pre-existing:


(1) under some circumstances an infinite stream of empty results is
returned, which has to be filtered out manually.


(2) under some special circumstances some error messages may be output
twice, depending on when libpq decides to reset the error buffer.


(1) has been there forever, and (2) is under investigation to possibly
improve the situation, so as to remove a hack in the code that avoids it.
The alternative, which IMO would be acceptable, is to admit that under
some very special conditions the same error message may be output twice;
if that is resolved later on, then fine.
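
To make (1) concrete, the manual filtering amounts to something like the
following caller-side sketch (illustrative only, not code from the
patch; since the corner case can yield an unbounded stream of empty
results, the loop also stops at the first one rather than relying on the
usual NULL terminator alone):

#include <libpq-fe.h>

static void
drain_results(PGconn *conn)
{
	PGresult   *res;

	while ((res = PQgetResult(conn)) != NULL)
	{
		if (PQresultStatus(res) == PGRES_EMPTY_QUERY)
		{
			/* filter out the empty result, and stop: see (1) */
			PQclear(res);
			break;
		}
		/* ... consume the result normally ... */
		PQclear(res);
	}
}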



If it's the latter then perhaps we should move this to 16?


I'm not that pessimistic! I may be proven wrong:-)

--
Fabien.

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index e0abe34bb6..8f7f93172a 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -50,8 +50,28 @@ BEGIN \;
 SELECT 2.0 AS "float" \;
 SELECT 'world' AS "text" \;
 COMMIT;
+ float 
+---
+   2.0
+(1 row)
+
+ text  
+---
+ world
+(1 row)
+
 -- compound with empty statements and spurious leading spacing
 \;\;   SELECT 3 + 3 \;\;\;   SELECT ' ' || ' !' \;\;   SELECT 1 + 4 \;;
+ ?column? 
+--
+6
+(1 row)
+
+ ?column? 
+--
+   !
+(1 row)
+
  ?column? 
 --
 5
@@ -61,6 +81,11 @@ COMMIT;
 SELECT 1 + 1 + 1 AS "add" \gset
 SELECT :add + 1 + 1 AS "add" \;
 SELECT :add + 1 + 1 AS "add" \gset
+ add 
+-
+   5
+(1 row)
+
 -- set operator
 SELECT 1 AS i UNION SELECT 2 ORDER BY i;
  i 
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index caabb06c53..f01adb1fd2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql
commands included in the string to divide it into multiple
transactions.  (See 
for more details about how the server handles multi-query strings.)
-   Also, psql only prints the
-   result of the last SQL command in the string.
-   This is different from the behavior when the same string is read from
-   a file or fed to psql's standard input,
-   because then psql sends
-   each SQL command separately.
   
   
-   Because of this behavior, putting more than one SQL command in a
-   single -c string often has unexpected results.
-   It's better to use repeated -c commands or feed
-   multiple commands to psql's standard input,
+   If having several commands executed in one transaction is not desired, 
+   use repeated -c commands or feed multiple commands to
+   psql's standard input,
either using echo as illustrated above, or
via a shell here-document, for example:
 
@@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3;
 commands included in the string to divide it into multiple
 transactions.  (See 
 for more details about how the server handles multi-query strings.)
-psql prints only the last query result
-it receives for each request; in this example, although all
-three SELECTs are indeed executed, psql
-only prints the 3.
 
 
   
@@ -4160,6 +4149,18 @@ bar
   
 
   
+SHOW_ALL_RESULTS
+
+
+When this variable is set to off, only the last
+result of a combined query (\;) is shown instead of
+all of them.  The default is on.  The off behavior
is for compatibility with older versions of psql.
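
For illustration, usage of the new variable would presumably look like
this (a sketch based on the documentation above, not verified output):

\set SHOW_ALL_RESULTS off
SELECT 1 \; SELECT 2 \; SELECT 3;  -- prints only the last result, 3
\set SHOW_ALL_RESULTS on
SELECT 1 \; SELECT 2 \; SELECT 3;  -- prints all three results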

Re: Changing "Hot Standby" to "hot standby"

2022-03-02 Thread Kyotaro Horiguchi
At Thu, 3 Mar 2022 06:55:43 +, "Daniel Westermann (DWE)" 
 wrote in 
> Hi Kyotaro,
> 
> >>    
> >>-    Hot Standby is the term used to describe the ability to connect to
> >>+    hot standby is the term used to describe the ability to connect to
> 
>They look like decapitalizing the first word in a sentence.
> 
> Thanks for having a look. Are you suggesting to change it like this?
> -Hot Standby is the term used to describe the ability to connect to
> +Hot standby is the term used to describe the ability to connect to

Yes.  Isn't it the right form of a sentence?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Schema variables - new implementation for Postgres 15

2022-03-02 Thread Julien Rouhaud
On Thu, Mar 03, 2022 at 03:06:52PM +0800, Julien Rouhaud wrote:
> Hi,
>
> On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> >
> > I lost commit with this change. I am sending updated patch.

Also, another thing is the size of the patch.  It's probably the minimum to
have a consistent working implementation, but maybe we can still split it to
make review easier?

For instance, maybe having:

- the pg_variable part on its own, without a way to use them, maybe with
  syscache helpers
- the main session variable implementation and test coverage
- plpgsql support and test coverage
- pg_dump support and test coverage

It wouldn't make the main patch that small but could still help quite a bit.

Any better suggestion?




Re: Schema variables - new implementation for Postgres 15

2022-03-02 Thread Julien Rouhaud
Hi,

On Wed, Mar 02, 2022 at 06:03:06AM +0100, Pavel Stehule wrote:
> 
> I lost commit with this change. I am sending updated patch.

Thanks a lot Pavel!

I did a more thorough review of the patch.  I'm attaching a diff (in .txt
extension) for comment improvement suggestions.  I may have misunderstood
things so feel free to discard some of it.  I will mention the comment I didn't
understand in this mail.

First, I spotted a problem in the invalidation logic.

+ * Assign sinval mark to session variable. This mark probably
+ * signalized, so the session variable was dropped. But this
+ * should be rechecked later against system catalog.
+ */
+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)

You mention that hashvalue can only be zero for commands that can't
affect session variables (like VACUUM or ANALYZE), but that's not true.  It can
also happen in case of sinval queue overflow (see InvalidateSystemCaches()).
So in that case we should trigger a full recheck, with some heuristics on how
to detect whether a cached variable is still valid.  Unfortunately OIDs can
wrap around, so some other check is needed to make it safe.

Also, even if we get a non-zero hashvalue in the inval callback, we can't
assume that there wasn't a collision in the hash.  So the additional check
should be used there too.
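
To make that concrete, the callback shape being discussed might look
like the following sketch (the hashvalue and needs_recheck fields are my
assumptions, not the patch's actual code): entries are only marked here,
and the later recheck against the catalog decides their fate.

static void
pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
{
	HASH_SEQ_STATUS status;
	SVariable	svar;

	if (sessionvars == NULL)
		return;

	hash_seq_init(&status, sessionvars);
	while ((svar = (SVariable) hash_seq_search(&status)) != NULL)
	{
		/*
		 * hashvalue == 0 means the sinval queue overflowed: recheck
		 * everything.  A matching non-zero hashvalue may still be a
		 * collision, so only mark the entry; the recheck against
		 * pg_variable decides whether the value can be discarded.
		 */
		if (hashvalue == 0 || svar->hashvalue == hashvalue)
			svar->needs_recheck = true;
	}
}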

We had a long off-line discussion about this with Pavel yesterday on what
heuristic to use there.  Unlike other caches, where spuriously discarding
an entry is not really problematic, the cache here contains the real
variable value, so we can't discard it unless the variable was really dropped.
It should be possible to make it work, so I will let Pavel comment on which
approach he wants to use and what the drawbacks are.  I guess that this will be
the most critical part of this patch when deciding whether the approach is
acceptable or not.


The rest is only minor stylistic comments.

Using -DRAW_EXPRESSION_COVERAGE_TEST I see that T_LetStmt is missing in
raw_expression_tree_walker.

ALTER and DROP both suggest "IMMUTABLE VARIABLE" as valid completion, while
it should only be usable in the CREATE [ IMMUTABLE ] VARIABLE form.

+initVariable(Variable *var, Oid varid, bool fast_only)
+{
+   var->collation = varform->varcollation;
+   var->eoxaction = varform->vareoxaction;
+   var->is_not_null = varform->varisnotnull;
+   var->is_immutable = varform->varisimmutable;

nit: eoxaction is defined after is_not_null and is_immutable, it would be
better to keep the initialization order consistent (same in VariableCreate).

+   values[Anum_pg_variable_varcollation - 1] = ObjectIdGetDatum((char) 
varCollation);
+   values[Anum_pg_variable_vareoxaction - 1] = CharGetDatum(eoxaction);

seems like the char cast is on the wrong variable?

+ * [...] We have to hold two separate action lists:
+ * one for dropping the session variable from system catalog, and
+ * another one for resetting its value. Both are necessary, since
+ * dropping a session variable also needs to enforce a reset of
+ * the value.

I don't fully understand that comment.  Maybe you meant that the opposite isn't
true, i.e. highlighting that a reset should *not* drop the variable, hence the
two lists?

+typedef enum SVariableXActAction
+{
+   SVAR_ON_COMMIT_DROP,/* used for ON COMMIT DROP */
+   SVAR_ON_COMMIT_RESET,   /* used for DROP VARIABLE */
+   SVAR_RESET, /* used for ON TRANSACTION END RESET */
+   SVAR_RECHECK/* verify if session variable still exists */
+} SVariableXActAction;
+
+typedef struct SVariableXActActionItem
+{
+   Oid varid;  /* varid of session variable */
+   SVariableXActAction action; /* reset or drop */

the stored action isn't simply "reset or drop", even though the resulting
action will be a reset or a drop (or a no-op), right?  Since it's storing an
enum defined just before, I'd just drop the comment on action, and maybe
specify that SVAR_RECHECK will do appropriate cleanup if the session variable
doesn't exist.


+ * Release the variable defined by varid from sessionvars
+ * hashtab.
+ */
+static void
+free_session_variable(SVariable svar)

The function name is a bit confusing given the previous function.  Maybe this
one should be called forget_session_variable() instead, or something like that?

I think the function comment should also mention that the caller is responsible
for making sure that the sessionvars htab exists before calling it, for extra
clarity, or just add an assert for that.

+static void
+free_session_variable_varid(Oid varid)

Similarly, maybe rename this function to forget_session_variable_by_id()?

+static void
+create_sessionvars_hashtable(void)
+{
+   HASHCTL ctl;
+
+   /* set callbacks */
+   if (first_time)
+   {
+   /* Read sinval messages */
+   CacheRegisterSyscacheCallback(VARIABLEOID,
+ pg_variable_cache_callback,
+ (Datum) 0);
+
+  

Re: Changing "Hot Standby" to "hot standby"

2022-03-02 Thread Daniel Westermann (DWE)
Hi Kyotaro,

>>    
>>-    Hot Standby is the term used to describe the ability to connect to
>>+    hot standby is the term used to describe the ability to connect to

>They look like decapitalizing the first word in a sentence.

Thanks for having a look. Are you suggesting to change it like this?
-Hot Standby is the term used to describe the ability to connect to
+Hot standby is the term used to describe the ability to connect to

Regards
Daniel



Re: Add the replication origin name and commit-LSN to logical replication worker errcontext

2022-03-02 Thread Amit Kapila
On Wed, Mar 2, 2022 at 1:05 PM Masahiko Sawada  wrote:
>
> On Wed, Mar 2, 2022 at 4:14 PM Amit Kapila  wrote:
>
> I've attached updated patches.
>

The first patch LGTM. Some comments on the second patch:

1.
@@ -3499,6 +3503,17 @@ ApplyWorkerMain(Datum main_arg)
  myslotname = MemoryContextStrdup(ApplyContext, syncslotname);

  pfree(syncslotname);
+
+ /*
+ * Allocate the origin name in long-lived context for error context
+ * message
+ */
+ ReplicationOriginNameForTablesync(MySubscription->oid,
+   MyLogicalRepWorker->relid,
+   originname,
+   sizeof(originname));
+ apply_error_callback_arg.origin_name = MemoryContextStrdup(ApplyContext,
+originname);

Can we assign this in LogicalRepSyncTableStart(), where we are already
forming the origin name? That would avoid this extra call to
ReplicationOriginNameForTablesync().

2.
@@ -3657,30 +3679,37 @@ apply_error_callback(void *arg)
  errcontext("processing remote data during \"%s\"",
 logicalrep_message_type(errarg->command));
  else
- errcontext("processing remote data during \"%s\" in transaction %u",
+ errcontext("processing remote data during \"%s\" in transaction %u
committed at LSN %X/%X from replication origin \"%s\"",
 logicalrep_message_type(errarg->command),
-errarg->remote_xid);
+errarg->remote_xid,
+LSN_FORMAT_ARGS(errarg->commit_lsn),
+errarg->origin_name);

Won't we set the origin name before the command? If so, it can be used
in the first message as well and we can change the condition in the
beginning such that if the origin or command is not set then we can
return without adding additional context information.

Isn't this error message used for rollback prepared failure as well?
If so, do we need to say "... finished at LSN ..." instead of "...
committed at LSN ..."?

The other point about this message is that saying " from
replication origin ..." sounds more like remote information similar to
publication but the origin is of the local node. How about slightly
changing it to "processing remote data for replication origin \"%s\"
during \"%s\" in transaction ..."?

3.
+
+   The LSN of the transaction that contains the change violating the
constraint and
+   the replication origin name can be found from those outputs (LSN
0/14C0378 and
+   replication origin pg_16395 in the above case).
The transaction
+   can be skipped by calling the 
pg_replication_origin_advance() function with
-   a node_name corresponding to the subscription name,
-   and a position.  The current position of origins can be seen in the
-   
+   the node_name and the next LSN of the commit LSN
+   (i.e., 0/14C0379) from those outputs.

After node_name, can we specify origin_name similarly to what we do
for the LSN, as that will make things clearer? Also, shall we mention
that users need to disable subscriptions to perform replication origin
advance?

I think for prepared transactions, the user might need to use it twice
because after skipping prepared xact, it will get an error again
during the processing of commit prepared (no such prepare exists). I
thought of mentioning it but felt that might lead to specifying too
many details which can confuse users as well. What do you think?
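
For reference, the documented sequence would presumably look like this
(a sketch using the values quoted above; the subscription name mysub is
hypothetical):

ALTER SUBSCRIPTION mysub DISABLE;
SELECT pg_replication_origin_advance('pg_16395', '0/14C0379');
ALTER SUBSCRIPTION mysub ENABLE;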

4. There are places in the patch like apply_handle_stream_start()
which sets commit_lsn in callback arg as InvalidXLogRecPtr but the
callback function seems to be assuming that it is always a valid
value. Shouldn't we need to avoid appending LSN for such cases?

-- 
With Regards,
Amit Kapila.




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Michael Paquier
On Wed, Mar 02, 2022 at 11:15:28AM -0500, Stephen Frost wrote:
> With... which?  We removed recovery.conf without any warning between
> major releases, yet it was used by every single PG file-based backup and
> restore solution out there and by every single organization that had
> ever done a restore of PG since it was introduced in 8.0.

There was much more to the changes around recovery.conf, with the
integration of its parameters as GUCs so it had a lot of benefits,
and that's why it has baked for 3~4 years as far as I recall.  There
is SCRAM as a replacement of MD5 already in core, but as Jonathan said
upthread the upgrade from an instance that still has MD5 hashes may
finish by being tricky for some users because this does not concern
only pg_upgrade (this could fail if it detects MD5 hashes in
pg_authid), and we don't really know how to solve the pg_dump bit, do
we?  And it seems to me that this is not a matter of just blocking the
load of MD5 hashes in the backend in the case of logical dumps.

> Passing
> around cleartext passwords with the LDAP authentication method is
> clearly bad from a security perspective and it's a bunch of code to
> support that, along with it being quite complicated to configure and get
> set up (arguably harder than Kerberos, if you want my 2c).  If you want
> to say that it's valuable for us to continue to maintain that code
> because it's good and useful and might even be the only option for some
> people, fine, though I disagree, but I don't think my argument that we
> shouldn't keep it just because *someone* is using it is any different
> from our general project policy about features.

MD5 is still used as an auth method by many people, as is LDAP.  They
have their design flaws (backend LDAP, MD5 hashes in old backups), but
those code areas are pretty mature as well, so a simple removal could
hurt the PG user base quite a lot, IMO.
--
Michael




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Tatsuo Ishii
> Yes, really, it's a known-broken system which suffers from such an old
> and well known attack that it's been given a name: pass-the-hash.  As
> was discussed on this thread even, just the fact that it's not trivial
> to break on the wire doesn't make it not-broken, particularly when we
> use the username (which is rather commonly the same one used across
> multiple systems..) as the salt.  Worse, md5 isn't exactly the pinnacle

I am not a big fan of md5 auth, but saying that md5 auth uses the username
as the salt is oversimplified. The md5-hashed password stored in
pg_shadow is created as md5(password + username).  But the md5-hashed
password flying over the wire uses a random salt, as md5(md5(password
+ username) + random_salt).
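
A short runnable illustration of the two derivations (Python's hashlib
stands in for the server's MD5 code, and the values are made up):

import hashlib

def md5hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

password, username, random_salt = b"secret", b"alice", b"\x01\x02\x03\x04"

stored = md5hex(password + username)                    # kept in pg_shadow
on_wire = md5hex(stored.encode("ascii") + random_salt)  # sent over the wire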

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




RE: row filtering for logical replication

2022-03-02 Thread shiy.f...@fujitsu.com
On Thu, Mar 3, 2022 10:40 AM Amit Kapila  wrote:
> 
> On Wed, Mar 2, 2022 at 5:42 PM Euler Taveira  wrote:
> >
> > On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
> >
> > While working on the column filtering patch, which touches about the
> > same places, I noticed two minor gaps in testing:
> >
> > 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> > tweaking the row filter. But there are no checks the row filter was
> > actually modified / stored in the catalog. It might be just thrown away
> > and no one would notice.
> >
> > The test that row filter was modified is available in a previous section. 
> > The
> > one that you modified (0001) is testing the supported objects.
> >
> 
> Right. But if Tomas thinks it is good to add for these ones as well
> then I don't mind.
> 
> > 153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000
> AND e < 2000);
> > 154 \dRp+ testpub5
> > 155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
> > 156 \dRp+ testpub5
> > 157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE
> expression)
> > 158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300
> AND e < 500);
> > 159 \dRp+ testpub5
> >
> > IIRC this test was written before adding the row filter information into the
> > psql. We could add \d+ testpub_rf_tbl3 before and after the modification.
> >
> 
> 
> Agreed. We can use \d instead of \d+ as row filter is available with \d.
> 
> > 2) There are no pg_dump tests.
> >
> > WFM.
> >
> 
> This is a miss. I feel we can add a few more.
> 

Agreed that we can add some tests; attached is a patch which addresses these two points.

Regards,
Shi yu 


0001-Add-some-tests-for-row-filter-in-logical-replication.patch
Description:  0001-Add-some-tests-for-row-filter-in-logical-replication.patch


Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-03-02 Thread Michael Paquier
On Wed, Mar 02, 2022 at 12:07:29AM -0800, Andres Freund wrote:
>> +++ b/src/bin/pg_upgrade/t/001_basic.pl
>> @@ -0,0 +1,9 @@
>> +use strict;
>> +use warnings;
>> +
>> +use PostgreSQL::Test::Utils;
>> +use Test::More tests => 8;
> 
> Outdated.

Fixed.

>> +program_help_ok('pg_upgrade');
>> +program_version_ok('pg_upgrade');
>> +program_options_handling_ok('pg_upgrade');
> 
> Unrelated. But I kinda wish we'd do this in a saner manner than copying this
> test into every binary. E.g. by ensuring that all tools installed in the temp
> install are tested or such.

Perhaps.  I am sticking with the existing style for now.

>> +# The test of pg_upgrade consists in setting up an instance.  This is the
>> +# source instance used for the upgrade. Then a new and fresh instance is
>> +# created, and is used as the target instance for the upgrade.
> 
> This seems a bit repetitive. Lots of "instance".

Indeed.  I have reworked the whole, rather than just those three
sentences.

>> +if (   (defined($ENV{olddump}) && !defined($ENV{oldinstall}))
>> +|| (!defined($ENV{olddump}) && defined($ENV{oldinstall})))
> 
> Odd indentation. Spaces between parens?

Well, perltidy tells me that this is right.

>> +$newnode->init(extra => [ '--wal-segsize', '1', '--allow-group-access' ]);
> 
> I'd copy the comments from test.sh wrt --wal-segsize,
> --allow-group-access.

Done.
--
Michael
From ba2cf7a7adf9cf50406d8bbcaa12167137ba29c3 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Mar 2022 14:01:56 +0900
Subject: [PATCH v11] Switch tests of pg_upgrade to use TAP

---
 src/bin/pg_upgrade/Makefile|  21 +-
 src/bin/pg_upgrade/TESTING |  85 ++--
 src/bin/pg_upgrade/t/001_basic.pl  |  11 +
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 248 ++
 src/bin/pg_upgrade/test.sh | 279 -
 src/tools/msvc/vcregress.pl|  92 +---
 6 files changed, 289 insertions(+), 447 deletions(-)
 create mode 100644 src/bin/pg_upgrade/t/001_basic.pl
 create mode 100644 src/bin/pg_upgrade/t/002_pg_upgrade.pl
 delete mode 100644 src/bin/pg_upgrade/test.sh

diff --git a/src/bin/pg_upgrade/Makefile b/src/bin/pg_upgrade/Makefile
index 49b94f0ac7..1f5d757548 100644
--- a/src/bin/pg_upgrade/Makefile
+++ b/src/bin/pg_upgrade/Makefile
@@ -28,6 +28,10 @@ OBJS = \
 override CPPFLAGS := -DDLSUFFIX=\"$(DLSUFFIX)\" -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
 
+# required for 002_pg_upgrade.pl
+REGRESS_SHLIB=$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)
+export REGRESS_SHLIB
+
 all: pg_upgrade
 
 pg_upgrade: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
@@ -47,17 +51,8 @@ clean distclean maintainer-clean:
 	rm -rf delete_old_cluster.sh log/ tmp_check/ \
 	   reindex_hash.sql
 
-# When $(MAKE) is present, make automatically infers that this is a
-# recursive make. which is not actually what we want here, as that
-# e.g. prevents output synchronization from working (as make thinks
-# that the subsidiary make knows how to deal with that itself, but
-# we're invoking a shell script that doesn't know). Referencing
-# $(MAKE) indirectly avoids that behaviour.
-# See https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html#MAKE-Variable
-NOTSUBMAKEMAKE=$(MAKE)
+check:
+	$(prove_check)
 
-check: test.sh all temp-install
-	MAKE=$(NOTSUBMAKEMAKE) $(with_temp_install) bindir=$(abs_top_builddir)/tmp_install/$(bindir) EXTRA_REGRESS_OPTS="$(EXTRA_REGRESS_OPTS)" $(SHELL) $<
-
-# installcheck is not supported because there's no meaningful way to test
-# pg_upgrade against a single already-running server
+installcheck:
+	$(prove_installcheck)
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
index 78b9747908..3718483a1c 100644
--- a/src/bin/pg_upgrade/TESTING
+++ b/src/bin/pg_upgrade/TESTING
@@ -2,78 +2,23 @@ THE SHORT VERSION
 -
 
 On non-Windows machines, you can execute the testing process
-described below by running
+described below by running the following command in this directory:
 	make check
-in this directory.  This will run the shell script test.sh, performing
-an upgrade from the version in this source tree to a new instance of
-the same version.
 
-To test an upgrade from a different version, you must have a built
-source tree for the old version as well as this version, and you
-must have done "make install" for both versions.  Then do:
+This will run the TAP tests to run pg_upgrade, performing an upgrade
+from the version in this source tree to a new instance of the same
+version.
 
-export oldsrc=...somewhere/postgresql	(old version's source tree)
-export oldbindir=...otherversion/bin	(old version's installed bin dir)
-export bindir=...thisversion/bin	(this version's installed bin dir)
-export libdir=...thisversion/lib	(this version's installed lib dir)
-sh test.sh
+Testing an upgrade from a different version requires a

Re: PG DOCS - logical replication filtering

2022-03-02 Thread Peter Smith
On Thu, Mar 3, 2022 at 2:15 PM Amit Kapila  wrote:
>
> On Wed, Mar 2, 2022 at 8:00 PM Peter Eisentraut
>  wrote:
> >
> > On 02.03.22 05:47, Peter Smith wrote:
> > > This patch introduces a new "Filtering" page to give a common place
> > > where all kinds of logical replication filtering can be described.
> > > (e.g. It is envisaged that a "Column Filters" section can be added
> > > sometime in the future).
> >
> > The pending feature to select a subset of table columns to replicate is
> > not "column filtering".  The thread might still be still called that,
> > but we have changed the patch to not use that terminology.
> >
> > Filtering is a dynamic action based on actual values.  The row filtering
> > feature does that.  The column list feature is a static DDL-time
> > configuration.  It is no more filtering than specifying a list of tables
> > in a publication is table filtering.
> >
> > So please consider organizing the documentation differently to not
> > create this confusion.
> >
>
> +1. I think Row Filters can directly be a section just before
> Conflicts on the logical replication page [1].
>

OK. I will reorganize the page as suggested, and also attend to the
other comments below.

> Some comments on the patch:
> 1. I think we can extend/add the example to have filters on more than
> one table. It has been noticed multiple times during development
> that people are not very clear on it.
> 2. I think we can add an example or two for row filters actions (like
> Insert, Update).
> 3.
>  Publications can choose to limit the changes they produce to
> any combination of INSERT, UPDATE,
> -   DELETE, and TRUNCATE,
> similar to how triggers are fired by
> -   particular event types.  By default, all operation types are replicated.
> +   DELETE, and TRUNCATE by using
> +   operation filters.
>
> From this, one could infer that row filters are used for TRUNCATE as well,
> but that is not true. I know that the patch later specifies that "Row
> filters have no effect for TRUNCATE commands.", but
> the above modification is not very clear.
>
> [1] - https://www.postgresql.org/docs/devel/logical-replication.html
>

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-03-02 Thread Japin Li

On Thu, 03 Mar 2022 at 11:25, David G. Johnston  
wrote:
> I would suggest a wording more like:
>
> "A precondition for using minimal WAL is to disable WAL archiving and
> streaming replication by setting max_wal_senders to 0, and archive_mode to
> off."
>
> While accurate, the phrase "you must set" just doesn't feel right to me.  I
> also don't like how the proposed sentence (either one) is added separately
> as opposed to being included in the immediately preceding paragraph.  While
> this limited patch is probably sufficient I would suggest trying to work
> out a slightly larger patch that improves the wording on the entire existing
> paragraph while incorporating the reference to max_wal_senders.
>

Thanks for your review.  Modified as you suggested.

> Note, we seem to be missing the documentation of the default setting for
> archive_mode.
>

Added the default value for archive_mode.

> In addition, max_wal_senders could also be changed, adding a sentence like:
>
> "If setting max_wal_senders to 0 consider also reducing the amount of WAL
> produced by changing wal_level to minimal."
>

Modified.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7ed8c82a9d..4fd36fa53a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2775,6 +2775,10 @@ include_dir 'conf.d'
 minimal makes any base backups taken before
 unavailable for archive recovery and standby server, which may
 lead to data loss.
+A precondition for using minimal WAL is to disable WAL archiving and
+streaming replication by setting 
+to 0, and archive_mode
+to off.


 In logical level, the same information is logged as
@@ -3535,6 +3539,7 @@ include_dir 'conf.d'
 mode. In always mode, all files restored from the archive
 or streamed with streaming replication will be archived (again). See
  for details.
+The default value is off.


 This parameter can only be set at server start.
@@ -4096,7 +4101,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
 reconnect.  This parameter can only be set at server start.  Also,
 wal_level must be set to
 replica or higher to allow connections from standby
-servers.
+servers.  If setting max_wal_senders to
+0 consider also reducing the amount of WAL produced
+by changing wal_level to minimal.

 



Re: logical replication empty transactions

2022-03-02 Thread Ajin Cherian
On Wed, Mar 2, 2022 at 1:01 PM shiy.f...@fujitsu.com
 wrote:
>
> Hi,
>
> Here are some comments on the v21 patch.
>
> 1.
> +   WalSndKeepalive(false, 0);
>
> Maybe we can use InvalidXLogRecPtr here, instead of 0.
>

Fixed.

> 2.
> +   pq_sendint64(&output_message, writePtr ? writePtr : sentPtr);
>
> Similarly, should we use XLogRecPtrIsInvalid()?

Fixed
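
That is, the suggested style for both spots is (a sketch, with the
surrounding code elided):

/* 1: spell out the invalid LSN instead of 0 */
WalSndKeepalive(false, InvalidXLogRecPtr);

/* 2: test validity with the macro rather than truthiness */
pq_sendint64(&output_message,
			 XLogRecPtrIsInvalid(writePtr) ? sentPtr : writePtr);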

>
> 3.
> @@ -1183,6 +1269,20 @@ pgoutput_change(LogicalDecodingContext *ctx, 
> ReorderBufferTXN *txn,
> Assert(false);
> }
>
> +   if (in_streaming)
> +   {
> +   /* If streaming, send STREAM START if we haven't yet */
> +   if (txndata && !txndata->sent_stream_start)
> +   pgoutput_send_stream_start(ctx, txn);
> +   }
> +   else
> +   {
> +   /* If not streaming, send BEGIN if we haven't yet */
> +   if (txndata && !txndata->sent_begin_txn)
> +   pgoutput_send_begin(ctx, txn);
> +   }
> +
> +
> /* Avoid leaking memory by using and resetting our own context */
> old = MemoryContextSwitchTo(data->context);
>
>
> I am not sure if it is suitable to send begin or stream_start here, because 
> the
> row filter is not checked yet. That means, empty transactions caused by row
> filter are not skipped.
>

Moved the check down, so that row_filters are taken into account.

regards,
Ajin Cherian
Fujitsu Australia


v22-0001-Skip-empty-transactions-for-logical-replication.patch
Description: Binary data


RE: Failed transaction statistics to measure the logical replication progress

2022-03-02 Thread osumi.takami...@fujitsu.com
On Wednesday, March 2, 2022 5:29 PM Shi, Yu/侍 雨  wrote:
> A comment on the v26 patch.
Thank you for checking the patch !

> 
> The following document about pg_stat_subscription_stats view only says that
> "showing statistics about errors", should we add something about transactions
> here?
> 
>  
> 
> pg_stat_subscription_stats >pg_stat_subscription_stats
>   One row per subscription, showing statistics about errors.
>   See 
>   pg_stat_subscription_stats for
> details.
>   
>  
> 
> 
> I noticed that the v24 patch has some changes about the description of this
> view. Maybe we can modify to "showing statistics about errors and
> transactions".
You are right. Fixed.

New patch v27 that incorporated your comments is shared in [1].
Kindly have a look at it.


[1] - 
https://www.postgresql.org/message-id/TYCPR01MB837345FB72E3CE9C827AF42CED049%40TYCPR01MB8373.jpnprd01.prod.outlook.com


Best Regards,
Takamichi Osumi



RE: Failed transaction statistics to measure the logical replication progress

2022-03-02 Thread osumi.takami...@fujitsu.com
On Wednesday, March 2, 2022 2:18 PM Masahiko Sawada  
wrote:
> On Wed, Mar 2, 2022 at 10:21 AM osumi.takami...@fujitsu.com
>  wrote:
> > Also, I quickly checked other similar views(pg_stat_slru,
> > pg_stat_wal_receiver) commit logs, especially when they introduce columns.
> > But, I couldn't find column name validations.
> > So, I feel this is aligned.
> >
> 
> I've looked at v26 patch and here are some random comments:
Hi, thank you for reviewing !


> +/* determine the subscription entry */
> +Oidm_subid;
> +
> +PgStat_Counter apply_commit_count;
> +PgStat_Counter apply_rollback_count;
> 
> I think it's better to add the prefix "m_" to apply_commit/rollback_count for
> consistency.
Fixed.

 
> ---
> +/*
> + * Increment the counter of commit for subscription statistics.
> + */
> +static void
> +subscription_stats_incr_commit(void)
> +{
> +Assert(OidIsValid(subStats.subid));
> +
> +subStats.apply_commit_count++;
> +}
> +
> 
> I think we don't need the Assert() here since it should not be a problem even 
> if
> subStats.subid is InvalidOid at least in this function.
> 
> If we remove it, we can remove both subscription_stats_incr_commit() and
> +subscription_stats_incr_rollback() as well.
Removed the Assert() from both functions.


> ---
> +void
> +pgstat_report_subscription_xact(bool force) {
> +static TimestampTz last_report = 0;
> +PgStat_MsgSubscriptionXact msg;
> +
> +/* Bailout early if nothing to do */
> +if (!OidIsValid(subStats.subid) ||
> +(subStats.apply_commit_count == 0 &&
> subStats.apply_rollback_count == 0))
> +return;
> +
> 
> +LogicalRepSubscriptionStats subStats =
> +{
> +.subid = InvalidOid,
> +.apply_commit_count = 0,
> +.apply_rollback_count = 0,
> +};
> 
> Do we need subStats.subid? I think we can pass MySubscription->oid (or
> MyLogicalRepWorker->subid) to pgstat_report_subscription_xact() along
> with the pointer of the statistics (subStats). That way, we don't need to 
> expose
> subStats.
Removed subStats.subid. Also, I now pass the OID to
pgstat_report_subscription_xact() along with the pointer to the statistics.

> Also, I think it's better to add "Xact" or something to the struct name. For
> example, SubscriptionXactStats.
Renamed.

> ---
> +
> +typedef struct LogicalRepSubscriptionStats {
> +Oidsubid;
> +
> +int64  apply_commit_count;
> +int64  apply_rollback_count;
> +} LogicalRepSubscriptionStats;
> 
> We need a description for this struct.
> 
> Probably it is better to declare it in logicalworker.h instead so that 
> pgstat.c
> includes it instead of worker_internal.h? worker_internal.h is the header file
> shared by logical replication workers such as apply worker,  tablesync worker,
> and launcher. So it might not be advisable to include it in pgstat.c.
Moved the definition to logicalworker.h
and added some explanations for it.

Attached the updated v27.


Best Regards,
Takamichi Osumi



v27-0001-Extend-pg_stat_subscription_stats-to-include-gen.patch
Description:  v27-0001-Extend-pg_stat_subscription_stats-to-include-gen.patch


Re: Doc about how to set max_wal_senders when setting minimal wal_level

2022-03-02 Thread David G. Johnston
On Wed, Mar 2, 2022 at 7:44 PM Japin Li  wrote:

>
> Hi, hackers
>
> When I try to change wal_level to minimal and restart the database, it
> complains
> max_wal_senders > 0.
>
> 2022-03-03 10:10:16.938 CST [6389] FATAL:  WAL streaming (max_wal_senders
> > 0) requires wal_level "replica" or "logical"
>
> However, the documentation about wal_level [1] doesn't mention this.
> How about adding a sentence to describe how to set max_wal_senders when
> setting wal_level to minimal?
>
> [1] https://www.postgresql.org/docs/devel/runtime-config-wal.html
>
>
I would suggest a wording more like:

"A precondition for using minimal WAL is to disable WAL archiving and
streaming replication by setting max_wal_senders to 0, and archive_mode to
off."

While accurate, the phrase "you must set" just doesn't feel right to me.  I
also don't like how the proposed sentence (either one) is added separately
as opposed to being included in the immediately preceding paragraph.  While
this limited patch is probably sufficient I would suggest trying to work
out a slightly larger patch that improves the wording on the entire existing
paragraph while incorporating the reference to max_wal_senders.

Note, we seem to be missing the documentation of the default setting for
archive_mode.

In addition, max_wal_senders could also be changed, adding a sentence like:

"If setting max_wal_senders to 0 consider also reducing the amount of WAL
produced by changing wal_level to minimal."

At least insofar as core is concerned, without a WAL sender the additional
WAL of replica cannot actually be leveraged, as pg_basebackup will
not work (as noted in its own description).

David J.


Re: PG DOCS - logical replication filtering

2022-03-02 Thread Amit Kapila
On Wed, Mar 2, 2022 at 8:00 PM Peter Eisentraut
 wrote:
>
> On 02.03.22 05:47, Peter Smith wrote:
> > This patch introduces a new "Filtering" page to give a common place
> > where all kinds of logical replication filtering can be described.
> > (e.g. It is envisaged that a "Column Filters" section can be added
> > sometime in the future).
>
> The pending feature to select a subset of table columns to replicate is
> not "column filtering".  The thread might still be still called that,
> but we have changed the patch to not use that terminology.
>
> Filtering is a dynamic action based on actual values.  The row filtering
> feature does that.  The column list feature is a static DDL-time
> configuration.  It is no more filtering than specifying a list of tables
> in a publication is table filtering.
>
> So please consider organizing the documentation differently to not
> create this confusion.
>

+1. I think Row Filters can directly be a section just before
Conflicts on the logical replication page [1].

Some comments on the patch:
1. I think we can extend/add the example to have filters on more than
one table. It has been noticed multiple times during development
that people are not very clear on it.
2. I think we can add an example or two for row filters actions (like
Insert, Update).
3.
 Publications can choose to limit the changes they produce to
any combination of INSERT, UPDATE,
-   DELETE, and TRUNCATE,
similar to how triggers are fired by
-   particular event types.  By default, all operation types are replicated.
+   DELETE, and TRUNCATE by using
+   operation filters.

From this, one could infer that row filters are used for TRUNCATE as well,
but that is not true. I know that the patch later specifies that "Row
filters have no effect for TRUNCATE commands.", but
the above modification is not very clear.

[1] - https://www.postgresql.org/docs/devel/logical-replication.html

-- 
With Regards,
Amit Kapila.




Re: Add CHECKPOINT_REQUESTED flag to the log message in LogCheckpointStart()

2022-03-02 Thread Kyotaro Horiguchi
At Thu, 3 Mar 2022 10:27:10 +0900, Michael Paquier  wrote 
in 
> On Thu, Mar 03, 2022 at 09:39:37AM +0900, Kyotaro Horiguchi wrote:
> > At Wed, 2 Mar 2022 18:18:10 +0530, Bharath Rupireddy 
> >  wrote in 
> >> I don't think that's useful. Being in LogCheckpointStart
> >> (CreateCheckPoint or CreateRestartPoint) itself means that somebody
> >> has requested a checkpoint. Having CHECKPOINT_REQUESTED doesn't add
> >> any value.
> > 
> > Agreed.
> 
> Exactly my impression.  This would apply now to the WAL shutdown code
> paths, and I'd suspect that the callers of CreateCheckPoint() are not
> going to increase soon.  The point is: the logs already provide some
> contexts for any of those callers so I see no need for this additional
> information.
> 
> > Actually no one does but RequestCheckpoint() accepts 0 as flags.
> > Checkpointer would be a bit more complex without CHECKPOINT_REQUESTED.
> > I don't think it does us any good to get rid of the flag value.
> 
> I'd rather keep this code as-is.

I fail to identify the nuance of the phrase, so just for
clarification: in short, I think we should keep the existing code
as-is.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Why do spgbuildempty(), btbuildempty(), and blbuildempty() use smgrwrite()?

2022-03-02 Thread Kyotaro Horiguchi
At Wed, 2 Mar 2022 20:07:14 -0500, Melanie Plageman  
wrote in 
> If you enable the CHECK_WRITE_VS_EXTEND-protected assert in mdwrite(),
> you'll trip it anytime you exercise btbuildempty(), blbuildempty(), or
> spgbuildempty().
> 
> In this case, it may not make any meaningful difference if smgrwrite()
> or smgrextend() is called (_mdfd_getseg() behavior won't differ even
> with the different flags, so really only the FileWrite() wait event will
> be different).
> However, it seems like it should still be changed to call smgrextend().
> Or, since they only write a few pages, why not use shared buffers like
> gistbuildempty() and brinbuildempty() do?
> 
> I've attached a patch to move these three into shared buffers.
> 
> And, speaking of spgbuildempty(), there is no test exercising it in
> check-world, so I've attached a patch to do so. I wasn't sure if it
> belonged in spgist.sql or create_index_spgist.sql (on name alone, seems
> like the latter but based on content, seems like the former).
> 
> - Melanie

I didn't dig into your specific issue, but I'm mildly opposed to
moving them into shared buffers.  They are cold images of the init
fork, which are actually of no use to active servers.  Rather, I'd like
to move brinbuildempty out of shared buffers, considering one of the
patches I am proposing.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-02 Thread Kyotaro Horiguchi
Hi.

+#ifdef FRONTEND
+/*
+ * Functions that are currently not needed in the backend, but are better
+ * implemented inside xlogreader.c because of the internal facilities available
+ * here.
+ */
+
 #endif /* FRONTEND */

Why didn't you remove the emptied #ifdef section?

+int
+read_local_xlog_page_2(XLogReaderState *state, XLogRecPtr targetPagePtr,
+  int reqLen, XLogRecPtr targetRecPtr, 
char *cur_page)

The only difference from the original function is that this function
has one additional if-block in the middle. I think we can insert that
code directly into the original function.

+   /*
+* We are trying to read future WAL. Let's not wait for 
WAL to be
+* available if indicated.
+*/
+   if (state->private_data != NULL)

However, in the first place it seems to me there's no need for the
function to take care of no_wait affairs.

If, for example, pg_get_wal_record_info() is called with no_wait = true,
it is enough for the function to identify the bleeding edge of WAL and
then loop up to that LSN.  So I think there is no need for the new
function, nor for any modification of the original function.

The changes will reduce the footprint of the patch largely, I think.

At Wed, 2 Mar 2022 22:37:43 +0530, Bharath Rupireddy 
 wrote in 
> On Wed, Mar 2, 2022 at 8:12 PM Ashutosh Sharma  wrote:
> >
> > Some review comments on v5 patch (v5-0001-pg_walinspect.patch)
> 
> Thanks for reviewing.
> 
> > +--
> > +-- pg_get_wal_records_info()
> > +--
> > +CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn,
> > +IN end_lsn pg_lsn,
> > +IN wait_for_wal boolean DEFAULT false,
> > +OUT lsn pg_lsn,
> >
> > What does the wait_for_wal flag mean here when one has already
> > specified the start and end lsn? AFAIU, If a user has specified a
> > start and stop LSN, it means that the user knows the extent to which
> > he/she wants to display the WAL records in which case we need to stop
> > once the end lsn has reached . So what is the meaning of wait_for_wal
> > flag? Does it look sensible to have the wait_for_wal flag here? To me
> > it doesn't.
> 
> Users can always specify a future end_lsn and set wait_for_wal to
> true, then the pg_get_wal_records_info/pg_get_wal_stats functions can
> wait for the WAL. IMO, this is useful. If you remember you were okay
> with wait/nowait versions for some of the functions upthread [1]. I'm
> not going to retain this behaviour for both
> pg_get_wal_records_info/pg_get_wal_stats as it is similar to
> pg_waldump's --follow option.

I agree with this for now. However, I would prefer that a NULL or
invalid end_lsn be treated as equivalent to pg_current_wal_lsn().

> > ==
> >
> > +--
> > +-- pg_get_wal_records_info_till_end_of_wal()
> > +--
> > +CREATE FUNCTION pg_get_wal_records_info_till_end_of_wal(IN start_lsn 
> > pg_lsn,
> > +OUT lsn pg_lsn,
> > +OUT prev_lsn pg_lsn,
> > +OUT xid xid,
> >
> > Why is this function required? Is pg_get_wal_records_info() alone not
> > enough? I think it is. See if we can make end_lsn optional in
> > pg_get_wal_records_info() and lets just have it alone. I think it can
> > do the job of pg_get_wal_records_info_till_end_of_wal function.

I rather agree with Ashutosh. This feature can be covered by
pg_get_wal_records_info(start_lsn, NULL, false).
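
For example (a sketch of the proposed behavior, not the patch's current
one; the start LSN is made up):

-- dump everything from start_lsn up to the current end of WAL,
-- without waiting for more WAL to be generated
SELECT * FROM pg_get_wal_records_info('0/14C0000', NULL, false);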

> > ==
> >
> > +--
> > +-- pg_get_wal_stats_till_end_of_wal()
> > +--
> > +CREATE FUNCTION pg_get_wal_stats_till_end_of_wal(IN start_lsn pg_lsn,
> > +OUT resource_manager text,
> > +OUT count int8,
> >
> > Above comment applies to this function as well. Isn't pg_get_wal_stats() 
> > enough?
> 
> I'm doing the following input validations for these functions to not
> cause any issues with invalid LSN. If I were to have the default value
> for end_lsn as 0/0, I can't perform input validations right? That is
> the reason I'm having separate functions {pg_get_wal_records_info,
> pg_get_wal_stats}_till_end_of_wal() versions.
> 
> /* Validate input. */
> if (XLogRecPtrIsInvalid(start_lsn))
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>  errmsg("invalid WAL record start LSN")));
> 
> if (XLogRecPtrIsInvalid(end_lsn))
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>  errmsg("invalid WAL record end LSN")));
> 
> if (start_lsn >= end_lsn)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>  errmsg("WAL record start LSN must be less than end LSN")));

I don't think those validations are worth doing, at least for the
first two, as long as the value has a special meaning. I would see it
as useful if pg_get_wal_records_info() meant "dump all the records
available at the moment", or the records of the last segment or page,
or something similar.

> > ==
> >
> >
> > +   if (loc <= read_upto)
> > +   

Doc about how to set max_wal_senders when setting minimal wal_level

2022-03-02 Thread Japin Li

Hi, hackers

When I try to change wal_level to minimal and restart the database, it
complains about max_wal_senders > 0:

2022-03-03 10:10:16.938 CST [6389] FATAL:  WAL streaming (max_wal_senders > 0) 
requires wal_level "replica" or "logical"

However, the documentation about wal_level [1] doesn't mention this.
How about adding a sentence to describe how to set max_wal_senders when
setting wal_level to minimal?

[1] https://www.postgresql.org/docs/devel/runtime-config-wal.html

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7ed8c82a9d..3f5ecc4a04 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2776,6 +2776,10 @@ include_dir 'conf.d'
 unavailable for archive recovery and standby server, which may
 lead to data loss.

+   
+When changing wal_level to minimal,
+you must set the  parameter to 0.
+   

 In logical level, the same information is logged as
 with replica, plus information needed to allow


Re: row filtering for logical replication

2022-03-02 Thread Amit Kapila
On Wed, Mar 2, 2022 at 5:42 PM Euler Taveira  wrote:
>
> On Wed, Mar 2, 2022, at 8:45 AM, Tomas Vondra wrote:
>
> While working on the column filtering patch, which touches about the
> same places, I noticed two minor gaps in testing:
>
> 1) The regression tests do perform multiple ALTER PUBLICATION commands,
> tweaking the row filter. But there are no checks the row filter was
> actually modified / stored in the catalog. It might be just thrown away
> and no one would notice.
>
> The test that row filter was modified is available in a previous section. The
> one that you modified (0001) is testing the supported objects.
>

Right. But if Tomas thinks it is good to add for these ones as well
then I don't mind.

> 153 ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000 AND 
> e < 2000);
> 154 \dRp+ testpub5
> 155 ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;
> 156 \dRp+ testpub5
> 157 -- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (another WHERE 
> expression)
> 158 ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e 
> < 500);
> 159 \dRp+ testpub5
>
> IIRC this test was written before adding the row filter information into the
> psql. We could add \d+ testpub_rf_tbl3 before and after the modification.
>


Agreed. We can use \d instead of \d+ as row filter is available with \d.
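
A sketch of the suggested check, reusing the statements quoted above:

\d testpub_rf_tbl3
ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300 AND e < 500);
-- \d output should now show the new row filter
\d testpub_rf_tbl3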

> 2) There are no pg_dump tests.
>
> WFM.
>

This is a miss. I feel we can add a few more.

-- 
With Regards,
Amit Kapila.




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-02 Thread Ashutosh Sharma
On Wed, Mar 2, 2022 at 10:37 PM Bharath Rupireddy
 wrote:
>
> On Wed, Mar 2, 2022 at 8:12 PM Ashutosh Sharma  wrote:
> >
> > Some review comments on v5 patch (v5-0001-pg_walinspect.patch)
>
> Thanks for reviewing.
>
> > +--
> > +-- pg_get_wal_records_info()
> > +--
> > +CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn,
> > +IN end_lsn pg_lsn,
> > +IN wait_for_wal boolean DEFAULT false,
> > +OUT lsn pg_lsn,
> >
> > What does the wait_for_wal flag mean here when one has already
> > specified the start and end lsn? AFAIU, If a user has specified a
> > start and stop LSN, it means that the user knows the extent to which
> > he/she wants to display the WAL records in which case we need to stop
> > once the end lsn has reached . So what is the meaning of wait_for_wal
> > flag? Does it look sensible to have the wait_for_wal flag here? To me
> > it doesn't.
>
> Users can always specify a future end_lsn and set wait_for_wal to
> true, then the pg_get_wal_records_info/pg_get_wal_stats functions can
> wait for the WAL. IMO, this is useful. If you remember you were okay
> with wait/nowait versions for some of the functions upthread [1]. I'm
> not going to retain this behaviour for both
> pg_get_wal_records_info/pg_get_wal_stats as it is similar to
> pg_waldump's --follow option.
>

It is not at all similar to pg_waldump's behaviour. Please check the
behaviour of pg_waldump properly. Does it wait for any WAL records
when a user has specified a stop pointer? It doesn't, and it shouldn't.
I mean, does it even make sense to wait for WAL when a stop pointer
is specified? And it's quite understandable that if a user has asked
pg_walinspect to stop at a certain point, it must. Also, what if there
are already WAL records after the stop pointer? In that case, does it
even make sense to have a wait flag? What would be the meaning of the
wait flag in that case?

Further, have you checked wait_for_wal flag behaviour, is it even working?

> >
> > +--
> > +-- pg_get_wal_records_info_till_end_of_wal()
> > +--
> > +CREATE FUNCTION pg_get_wal_records_info_till_end_of_wal(IN start_lsn 
> > pg_lsn,
> > +OUT lsn pg_lsn,
> > +OUT prev_lsn pg_lsn,
> > +OUT xid xid,
> >
> > Why is this function required? Is pg_get_wal_records_info() alone not
> > enough? I think it is. See if we can make end_lsn optional in
> > pg_get_wal_records_info() and lets just have it alone. I think it can
> > do the job of pg_get_wal_records_info_till_end_of_wal function.
> >
> > ==
> >
> > +--
> > +-- pg_get_wal_stats_till_end_of_wal()
> > +--
> > +CREATE FUNCTION pg_get_wal_stats_till_end_of_wal(IN start_lsn pg_lsn,
> > +OUT resource_manager text,
> > +OUT count int8,
> >
> > Above comment applies to this function as well. Isn't pg_get_wal_stats() 
> > enough?
>
> I'm doing the following input validations for these functions to not
> cause any issues with invalid LSN. If I were to have the default value
> for end_lsn as 0/0, I can't perform input validations right? That is
> the reason I'm having separate functions {pg_get_wal_records_info,
> pg_get_wal_stats}_till_end_of_wal() versions.
>

You can do it. Please check pg_waldump to understand how it is done
there. You cannot have multiple functions doing different things when
a single function can do the whole job.
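
To illustrate the suggestion (a hypothetical interface, not what the patch
currently implements): with end_lsn made optional, a single function would
cover both cases:

    -- bounded range: stop at end_lsn
    SELECT * FROM pg_get_wal_records_info('0/1000000', '0/2000000');
    -- end_lsn omitted: read up to the current end of WAL
    SELECT * FROM pg_get_wal_records_info('0/1000000');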

> > ==
> >
> >
> > +   if (loc <= read_upto)
> > +   break;
> > +
> > +   /* Let's not wait for WAL to be available if
> > indicated */
> > +   if (loc > read_upto &&
> > +   state->private_data != NULL)
> > +   {
> >
> > Why loc > read_upto? The first if condition is (loc <= read_upto)
> > followed by the second if condition - (loc > read_upto). Is the first
> > if condition (loc <= read_upto) not enough to indicate that loc >
> > read_upto?
>
> Yeah, that's unnecessary. I improved the comment there and removed the
> "loc > read_upto" check.
>
> > ==
> >
> > +#define IsEndOfWALReached(state) \
> > +   (state->private_data != NULL && \
> > +   (((ReadLocalXLOGPage2Private *)
> > xlogreader->private_data)->no_wait == true) && \
> > +   (((ReadLocalXLOGPage2Private *)
> > xlogreader->private_data)->reached_end_of_wal == true))
> >
> > I think we should either use state or xlogreader. First line says
> > state->private_data and second line xlogreader->private_data.
>
> I've changed it to use state instead of xlogreader.
>
> > ==
> >
> > +   (((ReadLocalXLOGPage2Private *)
> > xlogreader->private_data)->reached_end_of_wal == true))
> > +
> >
> > There is a new patch coming to make the end-of-WAL messages less
> > scary. It introduces an EOW flag in xlogreaderstate; maybe we can use
> > that instead of introducing new flags in the private area to represent
> > the end of WAL.
>
> Yeah, that would be great. But we never know which one gets committed
> first. Until then it's not good to have dependencies.

Re: Design of pg_stat_subscription_workers vs pgstats

2022-03-02 Thread Andres Freund
Hi,

On 2022-02-25 11:32:24 -0800, Andres Freund wrote:
> On 2022-02-25 16:25:01 -0300, Euler Taveira wrote:
> > On Fri, Feb 25, 2022, at 11:52 AM, Greg Stark wrote:
> > > On Tue, 25 Jan 2022 at 01:32, Andres Freund  wrote:
> > > >
> > > > Hi,
> > > >
> > > > I was looking the shared memory stats patch again.
> > > 
> > > Can you point me to this thread? I looked for it but couldn't find it.
> 
> > https://postgr.es/m/20180629.173418.190173462.horiguchi.kyot...@lab.ntt.co.jp
> 
> I'll post a rebased version as soon as this is resolved... I have a local one,
> but it just works by nuking a bunch of tests / #ifdefing out code related to
> this.

Now that the pg_stat_subscription_workers changes have been committed, I've
posted a rebased version to the above thread.

Greetings,

Andres Freund




Re: [Proposal] Global temporary tables

2022-03-02 Thread Pavel Stehule
Hi


> I think you'd basically have to come up with a generic design for
> partitioning
> catalog tables into local / non-local storage, without needing explicit
> code
> for each catalog. That could also be used to store the default catalog
> contents separately from user defined ones (e.g. pg_proc is pretty large).
>

There is still a risk of bloat in local storage, but, mainly, you would
probably have to modify a lot of code because the system cache doesn't
support partitioning.

Regards

Pavel


>
> Greetings,
>
> Andres Freund
>


Re: Add CHECKPOINT_REQUESTED flag to the log message in LogCheckpointStart()

2022-03-02 Thread Michael Paquier
On Thu, Mar 03, 2022 at 09:39:37AM +0900, Kyotaro Horiguchi wrote:
> At Wed, 2 Mar 2022 18:18:10 +0530, Bharath Rupireddy 
>  wrote in 
>> I don't think that's useful. Being in LogCheckpointStart
>> (CreateCheckPoint or CreateRestartPoint) itself means that somebody
>> has requested a checkpoint. Having CHECKPOINT_REQUESTED doesn't add
>> any value.
> 
> Agreed.

Exactly my impression.  This would apply now to the WAL shutdown code
paths, and I suspect that the callers of CreateCheckPoint() are not
going to increase soon.  The point is: the logs already provide some
context for any of those callers, so I see no need for this additional
information.
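
For illustration (not from the thread): with log_checkpoints = on, even a
manual checkpoint already reports its origin through the logged flags:

    CHECKPOINT;
    -- the server log then shows something like:
    -- LOG:  checkpoint starting: immediate force wait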

> Actually, no one does, but RequestCheckpoint() accepts 0 as flags.
> Checkpointer would be a bit more complex without CHECKPOINT_REQUESTED.
> I don't think it does us any good to get rid of the flag value.

I'd rather keep this code as-is.
--
Michael


signature.asc
Description: PGP signature


Re: Changing "Hot Standby" to "hot standby"

2022-03-02 Thread Kyotaro Horiguchi
At Wed, 2 Mar 2022 15:22:44 +, "Daniel Westermann (DWE)" 
 wrote in 
> > Pretty sure that for titles we should keep English capitalization rules.
> 
> Done like that. Thanks for taking a look.


-Hot Standby feedback propagates upstream, whatever the cascaded 
arrangement.
+hot standby feedback propagates upstream, whatever the cascaded arrangement


-Hot Standby is the term used to describe the ability to connect to
+hot standby is the term used to describe the ability to connect to


These look like they decapitalize the first word of a sentence.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Michael Paquier
On Wed, Mar 02, 2022 at 12:04:59PM -0500, Chapman Flack wrote:
> I had just recently noticed that while reviewing [0], but shrugged,
> as I didn't know what the history was.

Okay.  I did not see you mention it on the thread, but the discussion
is long so it is easy to miss some of its details.

> Is this best handled as a separate patch, or folded into [0], which is
> going to be altering and renaming that function anyway?

No idea where this is leading, but I'd rather fix what is at hand now
than assume that something may or may not happen.  If, as you
say, this code gets removed, rebasing over this conflict is just a matter
of removing the existing code again, so that's trivial.
--
Michael


signature.asc
Description: PGP signature


Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Michael Paquier
On Thu, Mar 03, 2022 at 12:36:32AM +0800, Julien Rouhaud wrote:
> I don't see strong evidence for that pattern being widely used with some naive
> grepping:

Yes, I don't recall seeing the style with an undef used much either when
it came to system functions.  I'll move on and apply the fix in a
minute using this style.
--
Michael


signature.asc
Description: PGP signature


Re: Avoiding smgrimmedsync() during nbtree index builds

2022-03-02 Thread Justin Pryzby
Rebased to appease cfbot.

I ran these patches under a branch which shows code coverage in cirrus.  It
looks good to my eyes.
https://api.cirrus-ci.com/v1/artifact/task/5212346552418304/coverage/coverage/00-index.html

Are these patches being considered for v15?
>From 30707da3e5eb68d1efbc5594696da47ad7f72bab Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Tue, 8 Feb 2022 19:01:18 -0500
Subject: [PATCH 1/4] Add unbuffered IO API

Wrap unbuffered extends and writes in a new API, directmgr.

When writing data outside of shared buffers, the backend must do a
series of steps to ensure the data is both durable and recoverable.

When writing or extending a page of data for a WAL-logged table fork,
the backend must log, checksum (if page is not empty), and write out the
page before moving on.

Additionally, the backend must fsync the page data to ensure it reaches
permanent storage since checkpointer is unaware of the buffer and could
move the Redo pointer past the associated WAL for this write/extend
before it fsyncs the data.

This API is also used for non-WAL-logged and non-self-fsync'd table
forks but with the appropriate exceptions to the above steps.

This commit introduces no functional change. It replaces all current
callers of smgrimmedsync(), smgrextend(), and smgrwrite() with the
equivalent directmgr functions. Consolidating these steps makes IO
outside of shared buffers less error-prone.
---
 src/backend/access/gist/gistbuild.c   | 36 +++
 src/backend/access/hash/hashpage.c| 18 +++---
 src/backend/access/heap/heapam_handler.c  | 15 +++--
 src/backend/access/heap/rewriteheap.c | 53 +--
 src/backend/access/heap/visibilitymap.c   | 10 ++-
 src/backend/access/nbtree/nbtree.c| 18 ++
 src/backend/access/nbtree/nbtsort.c   | 56 ++--
 src/backend/access/spgist/spginsert.c | 39 ---
 src/backend/catalog/storage.c | 30 +++--
 src/backend/storage/Makefile  |  2 +-
 src/backend/storage/direct/Makefile   | 17 +
 src/backend/storage/direct/directmgr.c| 79 +++
 src/backend/storage/freespace/freespace.c | 14 ++--
 src/include/catalog/storage.h |  1 +
 src/include/storage/directmgr.h   | 57 
 15 files changed, 276 insertions(+), 169 deletions(-)
 create mode 100644 src/backend/storage/direct/Makefile
 create mode 100644 src/backend/storage/direct/directmgr.c
 create mode 100644 src/include/storage/directmgr.h

diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index e081e6571a4..8fabc2a42d7 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -43,6 +43,7 @@
 #include "miscadmin.h"
 #include "optimizer/optimizer.h"
 #include "storage/bufmgr.h"
+#include "storage/directmgr.h"
 #include "storage/smgr.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
@@ -91,6 +92,7 @@ typedef struct
 
 	int64		indtuples;		/* number of tuples indexed */
 
+	UnBufferedWriteState ub_wstate;
 	/*
 	 * Extra data structures used during a buffering build. 'gfbb' contains
 	 * information related to managing the build buffers. 'parentMap' is a
@@ -409,14 +411,16 @@ gist_indexsortbuild(GISTBuildState *state)
 	state->pages_allocated = 0;
 	state->pages_written = 0;
 	state->ready_num_pages = 0;
+	unbuffered_prep(&state->ub_wstate, false, false);
 
 	/*
 	 * Write an empty page as a placeholder for the root page. It will be
 	 * replaced with the real root page at the end.
 	 */
 	page = palloc0(BLCKSZ);
-	smgrextend(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
-			   page, true);
+	unbuffered_extend(&state->ub_wstate, false,
+			RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
+			page, true);
 	state->pages_allocated++;
 	state->pages_written++;
 
@@ -458,12 +462,13 @@ gist_indexsortbuild(GISTBuildState *state)
 
 	/* Write out the root */
 	PageSetLSN(levelstate->pages[0], GistBuildLSN);
-	PageSetChecksumInplace(levelstate->pages[0], GIST_ROOT_BLKNO);
-	smgrwrite(RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
-			  levelstate->pages[0], true);
-	if (RelationNeedsWAL(state->indexrel))
-		log_newpage(&state->indexrel->rd_node, MAIN_FORKNUM, GIST_ROOT_BLKNO,
-	levelstate->pages[0], true);
+
+	unbuffered_write(&state->ub_wstate, RelationNeedsWAL(state->indexrel),
+			RelationGetSmgr(state->indexrel), MAIN_FORKNUM, GIST_ROOT_BLKNO,
+			  levelstate->pages[0]);
+
+	unbuffered_finish(&state->ub_wstate, RelationGetSmgr(state->indexrel),
+			MAIN_FORKNUM);
 
 	pfree(levelstate->pages[0]);
 	pfree(levelstate);
@@ -645,6 +650,8 @@ gist_indexsortbuild_flush_ready_pages(GISTBuildState *state)
 	if (state->ready_num_pages == 0)
 		return;
 
+	unbuffered_prep(&state->ub_wstate, false, false);
+
 	for (int i = 0; i < state->ready_num_pages; i++)
 	{
 		Page		page = state->ready_pages[i];
@@ -655,9 +662,13 @@ gist_indexsortbuild_flush_ready_pages(GI

Why do spgbuildempty(), btbuildempty(), and blbuildempty() use smgrwrite()?

2022-03-02 Thread Melanie Plageman
If you enable the CHECK_WRITE_VS_EXTEND-protected assert in mdwrite(),
you'll trip it anytime you exercise btbuildempty(), blbuildempty(), or
spgbuildempty().

In this case, it may not make any meaningful difference whether smgrwrite()
or smgrextend() is called (_mdfd_getseg() behavior won't differ even
with the different flags, so really only the FileWrite() wait event will
be different).
However, it seems like it should still be changed to call smgrextend().
Or, since they only write a few pages, why not use shared buffers like
gistbuildempty() and brinbuildempty() do?

I've attached a patch to move these three into shared buffers.

And, speaking of spgbuildempty(), there is no test exercising it in
check-world, so I've attached a patch to do so. I wasn't sure if it
belonged in spgist.sql or create_index_spgist.sql (on name alone, seems
like the latter but based on content, seems like the former).
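
A minimal sketch of such a test (hypothetical names; placement as discussed
above): creating an index on an unlogged table goes through spgbuildempty()
to build the init fork.

    CREATE UNLOGGED TABLE spgist_unlogged_tbl (t text);
    CREATE INDEX spgist_unlogged_idx ON spgist_unlogged_tbl USING spgist (t);
    DROP TABLE spgist_unlogged_tbl;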

- Melanie
From f4f594ed06d5cf54135e4f9974fd2158ead8a7bb Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 2 Mar 2022 15:42:48 -0500
Subject: [PATCH v1 1/2] Build unlogged index init fork in shared buffers

Move empty index build in the init fork for unlogged bloom, btree, and
spgist indexes into shared buffers. Previously, the initial pages of
these indexes were extended with smgrwrite(). smgrwrite() is not
meant to be used for relation extension. Instead of using smgrextend(),
though, move the build to shared buffers (currently brinbuildempty() and
gistbuildempty() build in shared buffers).
---
 contrib/bloom/blinsert.c  | 32 
 src/backend/access/nbtree/nbtree.c| 32 
 src/backend/access/spgist/spginsert.c | 74 ---
 3 files changed, 53 insertions(+), 85 deletions(-)

diff --git a/contrib/bloom/blinsert.c b/contrib/bloom/blinsert.c
index c94cf34e69..47f4f1cf62 100644
--- a/contrib/bloom/blinsert.c
+++ b/contrib/bloom/blinsert.c
@@ -163,31 +163,17 @@ blbuild(Relation heap, Relation index, IndexInfo *indexInfo)
 void
 blbuildempty(Relation index)
 {
-	Page		metapage;
+	Buffer metabuf =
+		ReadBufferExtended(index, INIT_FORKNUM, P_NEW, RBM_NORMAL, NULL);
+	LockBuffer(metabuf, BUFFER_LOCK_EXCLUSIVE);
+	BloomFillMetapage(index, BufferGetPage(metabuf));
 
-	/* Construct metapage. */
-	metapage = (Page) palloc(BLCKSZ);
-	BloomFillMetapage(index, metapage);
+	START_CRIT_SECTION();
+	MarkBufferDirty(metabuf);
+	log_newpage_buffer(metabuf, true);
+	END_CRIT_SECTION();
 
-	/*
-	 * Write the page and log it.  It might seem that an immediate sync would
-	 * be sufficient to guarantee that the file exists on disk, but recovery
-	 * itself might remove it while replaying, for example, an
-	 * XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record.  Therefore, we need
-	 * this even when wal_level=minimal.
-	 */
-	PageSetChecksumInplace(metapage, BLOOM_METAPAGE_BLKNO);
-	smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BLOOM_METAPAGE_BLKNO,
-			  (char *) metapage, true);
-	log_newpage(&(RelationGetSmgr(index))->smgr_rnode.node, INIT_FORKNUM,
-BLOOM_METAPAGE_BLKNO, metapage, true);
-
-	/*
-	 * An immediate sync is required even if we xlog'd the page, because the
-	 * write did not go through shared_buffers and therefore a concurrent
-	 * checkpoint may have moved the redo pointer past our xlog record.
-	 */
-	smgrimmedsync(RelationGetSmgr(index), INIT_FORKNUM);
+	UnlockReleaseBuffer(metabuf);
 }
 
 /*
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index c9b4964c1e..bac531609d 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -151,31 +151,19 @@ bthandler(PG_FUNCTION_ARGS)
 void
 btbuildempty(Relation index)
 {
-	Page		metapage;
+	Buffer metabuf =
+		ReadBufferExtended(index, INIT_FORKNUM, P_NEW, RBM_NORMAL, NULL);
+	LockBuffer(metabuf, BUFFER_LOCK_EXCLUSIVE);
 
-	/* Construct metapage. */
-	metapage = (Page) palloc(BLCKSZ);
-	_bt_initmetapage(metapage, P_NONE, 0, _bt_allequalimage(index, false));
+	_bt_initmetapage(BufferGetPage(metabuf), P_NONE, 0,
+			_bt_allequalimage(index, false));
 
-	/*
-	 * Write the page and log it.  It might seem that an immediate sync would
-	 * be sufficient to guarantee that the file exists on disk, but recovery
-	 * itself might remove it while replaying, for example, an
-	 * XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record.  Therefore, we need
-	 * this even when wal_level=minimal.
-	 */
-	PageSetChecksumInplace(metapage, BTREE_METAPAGE);
-	smgrwrite(RelationGetSmgr(index), INIT_FORKNUM, BTREE_METAPAGE,
-			  (char *) metapage, true);
-	log_newpage(&RelationGetSmgr(index)->smgr_rnode.node, INIT_FORKNUM,
-BTREE_METAPAGE, metapage, true);
+	START_CRIT_SECTION();
+	MarkBufferDirty(metabuf);
+	log_newpage_buffer(metabuf, true);
+	END_CRIT_SECTION();
 
-	/*
-	 * An immediate sync is required even if we xlog'd the page, because the
-	 * write did not go through shared_buffers and therefore a concurrent
-	 * checkpoint may have moved the redo pointer past our 

Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Tatsuo Ishii
> On 2/25/22 12:39 PM, Tom Lane wrote:
>> Jeff Davis  writes:
>>> On Thu, 2022-02-24 at 20:47 -0500, Tom Lane wrote:
 ... and, since we can't readily enforce that the client only sends
 those cleartext passwords over suitably-encrypted connections, this
 could easily be a net negative for security.  Not sure that I think
 it's a good idea.
>> 
>>> I don't understand your point. Can't you just use "hostssl" rather
>>> than
>>> "host"?
>> My point is that sending cleartext passwords over the wire is an
>> insecure-by-definition protocol that we shouldn't be encouraging
>> more use of.
> 
> This is my general feeling as well. We just spent a bunch of effort
> adding, refining, and making SCRAM the default method. I think doing
> anything that would drive more use of sending plaintext passwords,
> even over TLS, is counter to that.

There's at least one use case for plaintext passwords. Pgpool-II
accepts plaintext passwords from the frontend (from the frontend's point of
view, it looks as if it were speaking with a PostgreSQL server that
requests plaintext password authentication), then negotiates with the
backends regarding the authentication method they demand. Suppose we have
2 PostgreSQL clusters and they require md5 auth. They send different
password encryption salts, and Pgpool-II deals with each server using its
salt and the password. So Pgpool-II needs the plaintext password.  The same
can be said of SCRAM-SHA-256 authentication, because it is a kind of
challenge/response based authentication.

Actually, it is possible for Pgpool-II to avoid using plaintext passwords
read from the frontend. In that case passwords are stored in a file and
Pgpool-II reads them from the file. But this is annoying for
users because they have to keep the passwords stored in PostgreSQL in
sync with the passwords stored in the file.

So, dropping plaintext password authentication support from libpq would
make it impossible for users to use the former method.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: PG DOCS - logical replication filtering

2022-03-02 Thread Peter Smith
On Wed, Mar 2, 2022 at 8:26 PM Amit Kapila  wrote:
>
> On Wed, Mar 2, 2022 at 2:37 PM Aleksander Alekseev
>  wrote:
> >
> >
> > I see that a large part of the documentation is commented and marked as TBA 
> > (Column Filters, Combining Different Kinds of Filters). Could you please 
> > clarify if it's a work-in-progress patch? If it's not, I believe the 
> > commented part should be removed before committing.
> >
>
> I think we can remove any Column Filters related information
> (placeholders), if that patch gets committed, we can always extend the
> existing docs.
>

+1

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: PG DOCS - logical replication filtering

2022-03-02 Thread Peter Smith
On Wed, Mar 2, 2022 at 8:43 PM Aleksander Alekseev
 wrote:
...
> Here is an updated version of the patch.

Thanks for your review comments and fixes. The updated v2 patch looks
good to me.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Add CHECKPOINT_REQUESTED flag to the log message in LogCheckpointStart()

2022-03-02 Thread Kyotaro Horiguchi
At Wed, 2 Mar 2022 18:18:10 +0530, Bharath Rupireddy 
 wrote in 
> On Wed, Mar 2, 2022 at 5:41 PM Nitin Jadhav
>  wrote:
> >
> > Hi,
> >
> > I have noticed that the CHECKPOINT_REQUESTED flag information is not
> > present in the log message of LogCheckpointStart() function. I would
> > like to understand if it was missed or left intentionally. The log
> > message describes all the possible checkpoint flags except
> > CHECKPOINT_REQUESTED flag. I feel we should support this. Thoughts?
> 
> I don't think that's useful. Being in LogCheckpointStart
> (CreateCheckPoint or CreateRestartPoint) itself means that somebody
> has requested a checkpoint. Having CHECKPOINT_REQUESTED doesn't add
> any value.

Agreed.

> I would suggest removing the CHECKPOINT_REQUESTED flag as it's not
> being used anywhere; instead, CheckpointerShmem->ckpt_flags is used as
> an indication of the checkpoint requested in CheckpointerMain [1]. If

Actually, no one does, but RequestCheckpoint() accepts 0 as flags.
Checkpointer would be a bit more complex without CHECKPOINT_REQUESTED.
I don't think it does us any good to get rid of the flag value.

> others don't agree to remove it as it doesn't cause any harm, then I
> would add something like this for more readability:

 if (((volatile CheckpointerShmemStruct *)
-  CheckpointerShmem)->ckpt_flags)
+  CheckpointerShmem)->ckpt_flags) & CHECKPOINT_REQUESTED))

I don't particularly object to this, but I don't think that change
makes the code significantly easier to read either.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Column Filtering in Logical Replication

2022-03-02 Thread Justin Pryzby
I applied this patch in my branch with CI hacks to show code coverage on
cirrus.
https://api.cirrus-ci.com/v1/artifact/task/6186186539532288/coverage/coverage/00-index.html

Eyeballing it looks good.  But GetActionsInPublication() isn't being hit at
all?

I think the queries in pg_dump should be written with the common portions of
the query outside the conditional.

-- 
Justin




Re: Logical replication timeout problem

2022-03-02 Thread Peter Smith
On Wed, Mar 2, 2022 at 1:06 PM wangw.f...@fujitsu.com
 wrote:
>
...
> Attach the new patch. [suggestion by Kuroda-San]

It is difficult to read the thread and to keep track of who reviewed
what, which patch is the latest, etc., when every patch has the same name.

Can you please introduce a version number for future patch attachments?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: BufferAlloc: don't take two simultaneous locks

2022-03-02 Thread Yura Sokolov
On Tue, 01/03/2022 at 10:24 +0300, Yura Sokolov wrote:
> Ok, here is v4.

And here is v5.

First, there was a compilation error in an Assert in dynahash.c.
Excuse me for not checking before sending the previous version.

Second, I added a third commit that reduces the HASHHDR allocation
size for non-partitioned dynahash:
- moved freeList to the last position
- allocate and memset offsetof(HASHHDR, freeList[1]) bytes for
  non-partitioned hash tables.
I didn't benchmark it, but I will be surprised if it matters
much performance-wise.

Third, I put all three commits into a single file so as not to
confuse the commitfest application.

 


regards

Yura Sokolov
Postgres Professional
y.soko...@postgrespro.ru
funny.fal...@gmail.com
From c1b8e6d60030d5d02287ae731ab604feeafa7486 Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Mon, 21 Feb 2022 08:49:03 +0300
Subject: [PATCH 1/3] bufmgr: do not acquire two partition locks.

Acquiring two partition locks leads to a complex dependency chain that hurts
at high concurrency levels.

There is no need to hold both locks simultaneously. The buffer is pinned, so
other processes cannot select it for eviction. If the tag is cleared and the
buffer is removed from the old partition, other processes will not find it.
Therefore it is safe to release the old partition lock before acquiring
the new partition lock.
---
 src/backend/storage/buffer/bufmgr.c | 189 +---
 1 file changed, 89 insertions(+), 100 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index b4532948d3f..5d2781f4813 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1288,93 +1288,16 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			oldHash = BufTableHashCode(&oldTag);
 			oldPartitionLock = BufMappingPartitionLock(oldHash);
 
-			/*
-			 * Must lock the lower-numbered partition first to avoid
-			 * deadlocks.
-			 */
-			if (oldPartitionLock < newPartitionLock)
-			{
-LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
-LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
-			}
-			else if (oldPartitionLock > newPartitionLock)
-			{
-LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
-LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
-			}
-			else
-			{
-/* only one partition, only one lock */
-LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
-			}
+			LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
 		}
 		else
 		{
-			/* if it wasn't valid, we need only the new partition */
-			LWLockAcquire(newPartitionLock, LW_EXCLUSIVE);
 			/* remember we have no old-partition lock or tag */
 			oldPartitionLock = NULL;
 			/* keep the compiler quiet about uninitialized variables */
 			oldHash = 0;
 		}
 
-		/*
-		 * Try to make a hashtable entry for the buffer under its new tag.
-		 * This could fail because while we were writing someone else
-		 * allocated another buffer for the same block we want to read in.
-		 * Note that we have not yet removed the hashtable entry for the old
-		 * tag.
-		 */
-		buf_id = BufTableInsert(&newTag, newHash, buf->buf_id);
-
-		if (buf_id >= 0)
-		{
-			/*
-			 * Got a collision. Someone has already done what we were about to
-			 * do. We'll just handle this as if it were found in the buffer
-			 * pool in the first place.  First, give up the buffer we were
-			 * planning to use.
-			 */
-			UnpinBuffer(buf, true);
-
-			/* Can give up that buffer's mapping partition lock now */
-			if (oldPartitionLock != NULL &&
-oldPartitionLock != newPartitionLock)
-LWLockRelease(oldPartitionLock);
-
-			/* remaining code should match code at top of routine */
-
-			buf = GetBufferDescriptor(buf_id);
-
-			valid = PinBuffer(buf, strategy);
-
-			/* Can release the mapping lock as soon as we've pinned it */
-			LWLockRelease(newPartitionLock);
-
-			*foundPtr = true;
-
-			if (!valid)
-			{
-/*
- * We can only get here if (a) someone else is still reading
- * in the page, or (b) a previous read attempt failed.  We
- * have to wait for any active read attempt to finish, and
- * then set up our own read attempt if the page is still not
- * BM_VALID.  StartBufferIO does it all.
- */
-if (StartBufferIO(buf, true))
-{
-	/*
-	 * If we get here, previous attempts to read the buffer
-	 * must have failed ... but we shall bravely try again.
-	 */
-	*foundPtr = false;
-}
-			}
-
-			return buf;
-		}
-
 		/*
 		 * Need to lock the buffer header too in order to change its tag.
 		 */
@@ -1382,40 +1305,113 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 
 		/*
 		 * Somebody could have pinned or re-dirtied the buffer while we were
-		 * doing the I/O and making the new hashtable entry.  If so, we can't
-		 * recycle this buffer; we must undo everything we've done and start
-		 * over with a new victim buffer.
+		 * doing the I/O.  If so, we can't recycle this buffer; we must undo
+		 * everything we've done and start over with 

Re: libpq compression (part 2)

2022-03-02 Thread Justin Pryzby
If there's no objection, I'd like to move this to the next CF for consideration
in PG16.

On Mon, Jan 17, 2022 at 10:39:19PM -0600, Justin Pryzby wrote:
> On Tue, Jan 18, 2022 at 02:06:32AM +0500, Daniil Zakhlystov wrote:
> > > => Since March, errmsg doesn't need extra parentheses around it (e3a87b4).
> 
> > I’ve resolved the stuck tests and added zlib support for CI Windows builds 
> > to patch 0003-*.  Thanks
> > for the suggestion, all tests seem to be OK now, except the macOS one.  It 
> > won't schedule in Cirrus
> > CI for some reason, but I guess it happens because of my GitHub account 
> > limitation.
> 
> I don't know about your github account, but it works for cfbot, which is now
> green.
> 
> Thanks for implementing zlib for windows.  Did you try this with default
> compressions set to lz4 and zstd ?
> 
> The thread from 2019 is very long, and starts off with the guidance that
> compression had been implemented at the wrong layer.  It looks like this 
> hasn't
> changed since then.  secure_read/write are passed as function pointers to the
> ZPQ interface, which then calls back to them to read and flush its compression
> buffers.  As I understand, the suggestion was to leave the socket reads and
> writes alone.  And then conditionally de/compress buffers after reading /
> before writing from the socket if compression was negotiated.
> 
> It's currently done like this
> pq_recvbuf() => secure_read() - when compression is disabled 
> pq_recvbuf() => ZPQ => secure_read() - when compression is enabled 
> 
> Dmitri sent a partial, POC patch which changes the de/compression to happen in
> secure_read/write, which is changed to call ZPQ:  
> https://www.postgresql.org/message-id/ca+q6zcuprssnars+fyobsd-f2stk1roa-4sahfofajowlzi...@mail.gmail.com
> pq_recvbuf() => secure_read() => ZPQ
> 
> The same thing is true of the frontend: function pointers to
> pqsecure_read/write are being passed to zpq_create, and then the ZPQ interface
> called instead of the original functions.  Those are the functions which read
> using SSL, so they should also handle compression.
> 
> That's where SSL is handled, and it seems like the right place to handle
> compression.  Have you evaluated that way to do things ?
> 
> Konstantin said he put ZPQ at that layer to 1) avoid code duplication
> between client/server; and 2) allow compression to happen before SSL, to
> allow both (if the admin decides it's okay). But I don't see why compression
> can't happen before sending to SSL, or after reading from it?




Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes

2022-03-02 Thread Robert Haas
On Wed, Mar 2, 2022 at 3:00 PM Andres Freund  wrote:
> What I am stuck on is what we can do for the released branches. Data
> corruption after two consecutive ALTER DATABASE SET TABLESPACEs seems like
> something we need to address.

I think we should consider back-porting the ProcSignalBarrier stuff
eventually. I realize that it's not really battle-tested yet and I am
not saying we should do it right now, but I think that if we get these
changes into v15, back-porting it in let's say May of next year could
be a reasonable thing to do. Sure, there is some risk there, but on
the other hand, coming up with completely different fixes for the
back-branches is not risk-free either, nor is it clear that there is
any alternative fix that is nearly as good. In the long run, I am
fairly convinced that ProcSignalBarrier is the way forward not only
for this purpose but for other things as well, and everybody's got to
get on the train or be left behind.

Also, I am aware of multiple instances where the project waited a
long, long time to fix bugs because we didn't have a back-patchable
fix. I disagree with that on principle. A master-only fix now is
better than a back-patchable fix two or three years from now. Of
course a back-patchable fix now is better still, but we have to pick
from the options we have, not the ones we'd like to have.



It seems to me that if we were going to try to construct an
alternative fix for the back-branches, it would have to be something
that didn't involve a new invalidation mechanism -- because the
ProcSignalBarrier stuff is an invalidation mechanism in effect, and I
feel that it can't be better to invent two new invalidation mechanisms
rather than one. And the only idea I have is trying to detect a
dangerous sequence of operations and just outright block it. We have
some cases sort of like that already - e.g. you can't prepare a
transaction if it's done certain things. But, the existing precedents
that occur to me are, I think, all cases where all of the related
actions are being performed in the same backend. It doesn't sound
crazy to me to have some rule like "you can't ALTER TABLESPACE on the
same tablespace in the same backend twice in a row without an
intervening checkpoint", or whatever, and install the book-keeping to
enforce that. But I don't think anything like that can work, both
because the two ALTER TABLESPACE commands could be performed in
different sessions, and also because an intervening checkpoint is no
guarantee of safety anyway, IIUC. So I'm just not really seeing a
reasonable strategy that isn't basically the barrier stuff.
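
For reference, the dangerous sequence under discussion is, schematically
(database and tablespace names are placeholders):

    ALTER DATABASE mydb SET TABLESPACE ts1;
    ALTER DATABASE mydb SET TABLESPACE ts2;
    -- per the discussion above, stale file descriptors cached by other
    -- backends can then end up writing to the wrong relation file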

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

2022-03-02 Thread Andres Freund
Hi,

On 2022-03-01 08:35:27 -0500, Stephen Frost wrote:
> I'm not really sure why we're arguing about this, but clearly the authn
> ID of the leader process is what should be used because that's the
> authentication under which the parallel worker is running, just as much
> as the effective role is the authorization.  Having this be available in
> worker processes would certainly be good as it would allow more query
> plans to be considered when these functions are used.  At this time, I
> don't think that outweighs the complications around having it and I'm
> not suggesting that Jacob needs to go do that, but surely it would be
> better.

I don't think we should commit this without synchronizing the authn between
worker / leader (in a separate commit). Too likely that some function that's
marked parallel ok queries the authn_id, opening up a security/monitoring hole
or such because of a bogus return value.

Greetings,

Andres Freund




Re: Add 64-bit XIDs into PostgreSQL 15

2022-03-02 Thread Andres Freund
Hi,

On 2022-03-02 16:25:33 +0300, Aleksander Alekseev wrote:
> I agree with Bruce it would be great to deliver this in PG15.

> Please let me know if you believe it's unrealistic for any reason so I will
> focus on testing and reviewing other patches.

I don't see 15 as a realistic target for this patch. There's huge amounts of
work left, it has gotten very little review.

I encourage trying to break down the patch into smaller incrementally useful
pieces. E.g. making all the SLRUs 64bit would be a substantial and
independently committable piece.

Greetings,

Andres Freund




Re: [Proposal] Global temporary tables

2022-03-02 Thread Andres Freund
Hi,

On 2022-02-27 06:09:54 +0100, Pavel Stehule wrote:
> ne 27. 2. 2022 v 5:13 odesílatel Andres Freund  napsal:
> > On 2022-02-27 04:17:52 +0100, Pavel Stehule wrote:
> > > Without this, the GTT will be terribly slow like current temporary tables
> > > with a lot of problems with bloating of pg_class, pg_attribute and
> > > pg_depend tables.
> >
> > I think it's not a great idea to solve multiple complicated problems at
> > once...

> I thought about this issue for a very long time, and I didn't find anything
> better (without more significant rewriting of pg storage). In a lot of
> projects that I know of, temporary tables are strictly prohibited due to
> their possible devastating impact on system catalog bloat.  It is a serious
> problem. So any implementation of GTT should solve these questions: a) how
> to reduce catalog bloat, b) how to allow session-related statistics for
> GTT. I agree that an implementation of GTT as template-based LTT (local
> temporary tables) can be very simple (it is possible as an extension), but
> it would have the same unhappy performance impacts.

> I don't say the current design should be accepted without any discussion
> and without changes. Maybe GTT based on LTT can be better than nothing
> (what we have now), and can be good enough for a lot of projects where the
> load is not too high (and almost all projects have low load).

I think there's just no way that it can be merged with anything close to the
current design - it's unmaintainable. The need for the feature doesn't change
that.

That's not to say it's impossible to come up with a workable design. But it's
definitely not easy. If I were to work on this - which I am not planning to -
I'd try to solve the problems of "LTT" first, with an eye towards using the
infrastructure for GTT.

I think you'd basically have to come up with a generic design for partitioning
catalog tables into local / non-local storage, without needing explicit code
for each catalog. That could also be used to store the default catalog
contents separately from user defined ones (e.g. pg_proc is pretty large).
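
For a sense of scale, two illustrative queries (exact numbers vary by
version and installation):

    SELECT count(*) FROM pg_proc;  -- several thousand built-in rows
    SELECT pg_size_pretty(pg_total_relation_size('pg_proc'));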

Greetings,

Andres Freund




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Andres Freund
Hi,

On 2022-03-02 15:26:32 -0500, Stephen Frost wrote:
> Part of the point, for my part anyway, of dropping support for plaintext
> transmission would be to remove support for that from libpq; otherwise a
> compromised server could still potentially convince a client to send it
> a plaintext password.

IMO that's an argument for an opt-in option to permit plaintext, not an
argument for removal of the code altogether. Even that will need a long
transition time, because it's effectively a form of an ABI break. Upgrading
libpq will suddenly cause applications to stop working. So adding an opt-out
option to disable plaintext is the next step...

I don't think it makes sense to discuss this topic as part of this thread
really. It seems wholly independent of making authentication pluggable.


> I also just generally disagree with the idea that it makes sense for
> these things to be in contrib.  We should be dropping them because
> they're insecure- moving them to contrib doesn't change the issue that
> we're distributing authentication solutions that send (either through an
> encrypted tunnel, or not, neither is good) that pass plaintext passwords
> around.

Shrug. I don't think it's a good idea to just leave people hanging without a
replacement. It's OK to make it a bit harder and require explicit
configuration, but dropping support for reasonable configurations is IMO
something we should be very hesitant to do.

Greetings,

Andres Freund




Re: Adding CI to our tree

2022-03-02 Thread Justin Pryzby
On Mon, Feb 28, 2022 at 02:58:02PM -0600, Justin Pryzby wrote:
> I still think that if "Build Docs" is a separate cirrus task, it should 
> rebuild
> docs on every CI run, even if they haven't changed, for any patch that touches
> docs/.  It'll be confusing if cfbot shows 5 green circles and 4 of them were
> built 1 day ago, and 1 was built 3 weeks ago.  Docs are the task that runs
> quickest, so I don't think it's worth doing anything special there (especially
> without understanding the behavior of changesInclude()).
> 
> Also, to allow users to view the built HTML docs, cfbot would need to 1) keep
> track of previous CI runs; and 2) logic to handle "skipped" CI runs, to allow
> showing artifacts from the previous run.  If it's not already done, I think 
> the
> first half is a good idea on its own.  But the 2nd part doesn't seem 
> desirable.

Maybe changesInclude() could work if we use this URL (from cirrus'
documentation), which uses the artifacts from the last successful build.
https://api.cirrus-ci.com/v1/artifact/github/justinpryzby/postgres/Documentation/html_docs/html_docs/00-doc.html?branch=citest-cirrus2

That requires knowing the file being modified, so we'd have to generate an
index of changed files - which I've started doing here.

> However, I realized that we can filter on cfbot with either of these:
> | $CIRRUS_CHANGE_TITLE =~ '^\[CF...'
> | git log -1 |grep '^Author: Commitfest Bot '
> If we can assume that cfbot will continue submitting branches as a single
> patch, this resolves the question of a "base branch", for cfbot.

I don't know what you think of that idea, but I think I want to amend my
proposal: show HTML and coverage artifacts for HEAD~1, unless set otherwise by
an environment var.  Today, that'd do the right thing for cfbot, and also for
any 1-patch commits.

> These patches implement that idea, and make "code coverage" and "HTML diffs"
> stuff only run for cfbot commits.  This still needs another round of testing,
> though.

The patch was missing a file due to an issue while rebasing - oops.

BTW (regarding the last patch), I just noticed that -Og optimization can cause
warnings with gcc-4.8.5-39.el7.x86_64.

be-fsstubs.c: In function 'be_lo_export':
be-fsstubs.c:522:24: warning: 'fd' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
  if (CloseTransientFile(fd) != 0)
^
trigger.c: In function 'ExecCallTriggerFunc':
trigger.c:2400:2: warning: 'result' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
  return (HeapTuple) DatumGetPointer(result);
  ^
xml.c: In function 'xml_pstrdup_and_free':
xml.c:1205:2: warning: 'result' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
  return result;

-- 
Justin
>From 883edaa653bcf7f1a2369d8edf46eaaac1ba0ba2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/7] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index d10b0a82f9f..1b7c36283e9 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,11 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+#pkg install -y ...
 
   # NB: Intentionally build without --with-llvm. The freebsd image size is
   # already large enough to make VM startup slow, and even without llvm
@@ -180,10 +181,12 @@ task:
 chown -R postgres:postgres ${CCACHE_DIR}
 echo '* - memlock 134217728' > /etc/security/limits.d/postgres.conf
 su postgres -c "ulimit -l -H && ulimit -l -S"
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kernel.core_pattern='/tmp/cores/%e-%s-%p.core'
+#apt-get update
+#apt-get -y install ...
 
   configure_script: |
 su postgres <<-EOF
@@ -237,7 +240,7 @@ task:
 ulimit -a -H && ulimit -a -S
 export
 
-  setup_cores_script:
+  setup_os_script:
 - mkdir ${HOME}/cores
 - sudo sysctl kern.corefile="${HOME}/cores/core.%P"
 
@@ -384,6 +387,9 @@ task:
 powershell -Command get-psdrive -psprovider filesystem
 set
 
+  setup_os_script: |
+REM choco install -y ...
+
   configure_script:
 # copy errors out when using forward slashes
 - copy src\tools\ci\windows_build_config.pl src\tools\msvc\config.pl
@@ -479,6 +485,10 @@ task:
   ccache_cache:
 folder: $CCACHE_DIR
 
+  setup_os_script: |
+#apt-get update
+#apt-get -y install ...
+
   ###
  # Test that code can be built with gcc/clang

Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread samay sharma
Hi,

On Wed, Mar 2, 2022 at 12:32 AM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 01.03.22 22:34, Andres Freund wrote:
> > The cases I've heard about are about centralizing auth across multiple
> cloud
> > services. Including secret management in some form. E.g. allowing an
> > application to auth to postgres, redis and having the secret provided by
> > infrastructure, rather than having to hardcode it in config or such.
> >
> > I can't see application developers configuring kerberos and I don't think
> > LDAP, PAM, Radius are a great answer either, due to the plaintext
> requirement
> > alone? LDAP is pretty clearly dying technology, PAM is fragile
> complicated C
> > stuff that's not portable across OSs. Radius is probably the most
> realistic,
> > but at least as implemented doesn't seem flexible enough (e.g. no access
> to
> > group memberships etc).
> >
> > Nor does baking stuff like that in seem realistic to me, it'll
> presumably be
> > too cloud provider specific.
>
> Let's gather some more information on this.  PostgreSQL should support
> the authentication that many people want to use out of the box.  I don't
> think it would be good to be at a point where all the built-in methods
> are outdated and if you want to use the good stuff you have to hunt for
> plugins.  The number of different cloud APIs is effectively small.  I
> expect that there are a lot of similarities, like they probably all need
> support for http calls, they might need support for caching lookups,
> etc.  OIDC was mentioned elsewhere.  That's a standard.  Is that
> compatible with any cloud providers?  Would that be enough for many users?
>

I think we are discussing two topics in this thread which in my opinion are
orthogonal.


(a) Should we make authentication methods pluggable by exposing these
hooks? - These will allow users to add plugins of their own to support
whatever auth method they like. One immediate use case (and what prompted
me to start looking at this) is Azure Active Directory integration which is
a common request from Azure customers. We could also, over time, move some
of our existing auth methods into extensions if we don’t want to maintain
them in core.


Please note that these hooks do not restrict auth providers to use only
plaintext auth methods. We could do SCRAM with secrets which are stored
outside of Postgres using this auth plugin too. I can modify the
test_auth_provider sample extension to do something like that as Andres
suggested.


I don't see much point in that unless we deprecate *all* the
> auth methods that transmit a cleartext password.


(b) Should we allow plaintext auth methods (and should we deprecate a few
currently supported auth methods which use plaintext exchange)? - I think
this is also a very important discussion but has many factors to consider
independent of the hooks. Whatever decision is made here, we can impose
that in auth.c later for plugins. For eg. allow plugins to only do
plaintext stuff with SSL enabled (by checking if Port->ssl_in_use), or if
we remove AUTH_REQ_PASSWORD, then plugins any way can’t use plaintext
exchange with the client.


So, overall, if we are open to allowing plugins for auth methods, I can
proceed to modify the test_auth_provider extension to use SCRAM and allow
registering multiple providers for this specific change.


Regards,

Samay


Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2022-03-02 09:32:26 +0100, Peter Eisentraut wrote:
> > On 01.03.22 22:34, Andres Freund wrote:
> > > The cases I've heard about are about centralizing auth across multiple 
> > > cloud
> > > services. Including secret management in some form. E.g. allowing an
> > > application to auth to postgres, redis and having the secret provided by
> > > infrastructure, rather than having to hardcode it in config or such.
> > > 
> > > I can't see application developers configuring kerberos and I don't think
> > > LDAP, PAM, Radius are a great answer either, due to the plaintext 
> > > requirement
> > > alone? LDAP is pretty clearly dying technology, PAM is fragile 
> > > complicated C
> > > stuff that's not portable across OSs. Radius is probably the most 
> > > realistic,
> > > but at least as implemented doesn't seem flexible enough (e.g. no access 
> > > to
> > > group memberships etc).
> > > 
> > > Nor does baking stuff like that in seem realistic to me, it'll presumably 
> > > be
> > > too cloud provider specific.
> > 
> > Let's gather some more information on this.  PostgreSQL should support the
> > authentication that many people want to use out of the box.  I don't think
> > it would be good to be at a point where all the built-in methods are
> > outdated and if you want to use the good stuff you have to hunt for plugins.
> > The number of different cloud APIs is effectively small.  I expect that
> > there are a lot of similarities, like they probably all need support for
> > http calls, they might need support for caching lookups, etc.  OIDC was
> > mentioned elsewhere.  That's a standard.  Is that compatible with any cloud
> > providers?  Would that be enough for many users?
> 
> I'm not opposed to putting something into the source tree eventually, if we
> can figure out an overlapping set of capabilities that's useful enough. But I
> don't see that as a reason to not make auth extensible? If anything, the
> contrary - it's going to be much easier to figure out what this should look
> like if it can be iteratively worked on with unmodified postgres.
> 
> Even if we were to put something in core, it seems that contrib/ would be a
> better place than auth.c for something like it.
> 
> I think we should consider moving pam, ldap, radius to contrib
> eventually. That way we'd not need to entirely remove them, but a "default"
> postgres wouldn't have support for auth methods requiring plaintext secret
> transmission.

Part of the point, for my part anyway, of dropping support for plaintext
transmission would be to remove support for that from libpq; otherwise a
compromised server could still potentially convince a client to send it
a plaintext password.

I also just generally disagree with the idea that it makes sense for
these things to be in contrib.  We should be dropping them because
they're insecure; moving them to contrib doesn't change the fact that
we're distributing authentication solutions that pass plaintext passwords
around (whether through an encrypted tunnel or not; neither is good).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: trigger example for plsample

2022-03-02 Thread Mark Wong
On Fri, Feb 25, 2022 at 06:39:39PM +, Chapman Flack wrote:
> This patch is straightforward, does what it says, and passes the tests.
> 
> Regarding the duplication of code between plsample_func_handler and
> plsample_trigger_handler, perhaps that's for the best for now, as 3554 in
> the same commitfest also touches plsample, so merge conflicts may be
> minimized by not doing more invasive refactoring.
> 
> That would leave low-hanging fruit for a later patch that could refactor
> plsample to reduce the duplication (maybe adding a validator at the same
> time? That would also duplicate some of the checks in the existing handlers.)
> 
> I am not sure that structuring the trigger handler with separate compile and
> execute steps is worth the effort for a simple example like plsample. The main
> plsample_func_handler is not so structured.
> 
> It's likely that many real PLs will have some notion of compilation separate 
> from
> execution. But those will also have logic to do the compilation only once, and
> somewhere to cache the result of that for reuse across calls, and those kinds 
> of
> details might make plsample's basic skeleton more complex than needed.
> 
> I know that in just looking at expected/plsample.out, I was a little 
> distracted by
> seeing multiple "compile" messages for the same trigger function in the same
> session and wondering why that was.
> 
> So maybe it would be simpler and less distracting to assume that the PL 
> targeted
> by plsample is one that just has a simple interpreter that works from the 
> source text.

I've attached v2, which reduces the output:

* Removed the notices for the text body and the "compile" message.
* Replaced the notice for the "compile" message with a comment as a
  placeholder for where compiling code or checking a cache may go.
* Reduced the number of rows inserted into the table, thus reducing
  the number of notice messages about which code path is taken.


I think that reduces the repetitiveness of the output...

Regards,
Mark
diff --git a/src/test/modules/plsample/expected/plsample.out 
b/src/test/modules/plsample/expected/plsample.out
index a0c318b6df..a7912c7897 100644
--- a/src/test/modules/plsample/expected/plsample.out
+++ b/src/test/modules/plsample/expected/plsample.out
@@ -6,9 +6,6 @@ AS $$
   Example of source with text result.
 $$ LANGUAGE plsample;
 SELECT plsample_result_text(1.23, 'abc', '{4, 5, 6}');
-NOTICE:  source text of function "plsample_result_text": 
-  Example of source with text result.
-
 NOTICE:  argument: 0; name: a1; value: 1.23
 NOTICE:  argument: 1; name: a2; value: abc
 NOTICE:  argument: 2; name: a3; value: {4,5,6}
@@ -25,12 +22,57 @@ AS $$
   Example of source with void result.
 $$ LANGUAGE plsample;
 SELECT plsample_result_void('{foo, bar, hoge}');
-NOTICE:  source text of function "plsample_result_void": 
-  Example of source with void result.
-
 NOTICE:  argument: 0; name: a1; value: {foo,bar,hoge}
  plsample_result_void 
 --
  
 (1 row)
 
+CREATE FUNCTION my_trigger_func() RETURNS trigger AS $$
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+$$ language plsample;
+CREATE TABLE my_table (num integer, description text);
+CREATE TRIGGER my_trigger_func BEFORE INSERT OR UPDATE ON my_table
+   FOR EACH ROW EXECUTE FUNCTION my_trigger_func();
+CREATE TRIGGER my_trigger_func2 AFTER INSERT OR UPDATE ON my_table
+   FOR EACH ROW EXECUTE FUNCTION my_trigger_func(8);
+INSERT INTO my_table (num, description)
+VALUES (1, 'first');
+NOTICE:  trigger name: my_trigger_func
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered BEFORE
+NOTICE:  triggered per row
+NOTICE:  trigger name: my_trigger_func2
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered AFTER
+NOTICE:  triggered per row
+NOTICE:  trigger arg[0]: 8
+UPDATE my_table
+SET description = 'first, modified once'
+WHERE num = 1;
+NOTICE:  trigger name: my_trigger_func
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by UPDATE
+NOTICE:  triggered BEFORE
+NOTICE:  triggered per row
+NOTICE:  trigger name: my_trigger_func2
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by UPDATE
+NOTICE:  triggered AFTER
+NOTICE:  triggered per row
+NOTICE:  trigger arg[0]: 8
+DROP TRIGGER my_trigger_func ON my_table;
+DROP TRIGGER my_trigger_func2 ON my_table;
+DROP FUNCTION my_trigger_func;
diff --git a/src/test/modules/plsample/plsample.c 
b/src/test/modules/plsample/plsample.c
index 6fc33c728c..fea266f522 100644
--- a/src/test/modules/plsample/plsample.c
+++ b/src/test/modules/plsample/plsample.c
@@ -19,6 +19,7 @@
 #include "catalog/pg_type.h"
 #include "commands/event_trigger.h"
 #include "commands/trigger.h"
+#include "executor/sp

Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Andres Freund
Hi,

On 2022-03-02 09:32:26 +0100, Peter Eisentraut wrote:
> On 01.03.22 22:34, Andres Freund wrote:
> > The cases I've heard about are about centralizing auth across multiple cloud
> > services. Including secret management in some form. E.g. allowing an
> > application to auth to postgres, redis and having the secret provided by
> > infrastructure, rather than having to hardcode it in config or such.
> > 
> > I can't see application developers configuring kerberos and I don't think
> > LDAP, PAM, Radius are a great answer either, due to the plaintext 
> > requirement
> > alone? LDAP is pretty clearly dying technology, PAM is fragile complicated C
> > stuff that's not portable across OSs. Radius is probably the most realistic,
> > but at least as implemented doesn't seem flexible enough (e.g. no access to
> > group memberships etc).
> > 
> > Nor does baking stuff like that in seem realistic to me, it'll presumably be
> > too cloud provider specific.
> 
> Let's gather some more information on this.  PostgreSQL should support the
> authentication that many people want to use out of the box.  I don't think
> it would be good to be at a point where all the built-in methods are
> outdated and if you want to use the good stuff you have to hunt for plugins.
> The number of different cloud APIs is effectively small.  I expect that
> there are a lot of similarities, like they probably all need support for
> http calls, they might need support for caching lookups, etc.  OIDC was
> mentioned elsewhere.  That's a standard.  Is that compatible with any cloud
> providers?  Would that be enough for many users?

I'm not opposed to putting something into the source tree eventually, if we
can figure out an overlapping set of capabilities that's useful enough. But I
don't see that as a reason to not make auth extensible? If anything, the
contrary - it's going to be much easier to figure out what this should look
like if it can be iteratively worked on with unmodified postgres.

Even if we were to put something in core, it seems that contrib/ would be a
better place than auth.c for something like it.

I think we should consider moving pam, ldap, radius to contrib
eventually. That way we'd not need to entirely remove them, but a "default"
postgres wouldn't have support for auth methods requiring plaintext secret
transmission.

Greetings,

Andres Freund




Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes

2022-03-02 Thread Andres Freund
Hi,

On 2022-03-02 14:52:01 -0500, Robert Haas wrote:
> - I am having some trouble understanding clearly what 0001 is doing.
> I'll try to study it further.

It tests for the various scenarios I could think of that could lead to FD
reuse, to state the obvious ;). Anything particularly unclear?


> - In general, 0003 makes a lot of sense to me.
> 
> +   /*
> +* Finally tell the kernel to write the data to
> storage. Don't smgr if
> +* previously closed, otherwise we could end up evading 
> fd-reuse
> +* protection.
> +*/
> 
> - I think the above hunk is missing a word, because it uses smgr as a
> verb. I also think that it's easy to imagine this explanation being
> insufficient for some future hacker to understand the issue.

Yeah, it's definitely not sufficient or grammatically correct. I think I wanted
to send something out, but was very tired by that point...


> - While 0004 seems useful for testing, it's an awfully big hammer. I'm
> not sure we should be thinking about committing something like that,
> or at least not as a default part of the build. But ... maybe?

Agreed. I think it's racy anyway, because further files could get deleted
(e.g. segments > 0) after the barrier has been processed.


What I am stuck on is what we can do for the released branches. Data
corruption after two consecutive ALTER DATABASE SET TABLESPACEs seems like
something we need to address.

Greetings,

Andres Freund




Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes

2022-03-02 Thread Robert Haas
On Tue, Feb 22, 2022 at 4:40 AM Andres Freund  wrote:
> On 2022-02-22 01:11:21 -0800, Andres Freund wrote:
> > I've started to work on a few debugging aids to find problem like
> > these. Attached are two WIP patches:
>
Forgot to attach. Also, importantly, it includes a TAP test for several of
these issues.

Hi,

Just a few very preliminary comments:

- I am having some trouble understanding clearly what 0001 is doing.
I'll try to study it further.

- 0002 seems like it's generally a good idea. I haven't yet dug into
why the call sites for AssertFileNotDeleted() are where they are, or
whether that's a complete set of places to check.

- In general, 0003 makes a lot of sense to me.

+   /*
+* Finally tell the kernel to write the data to
storage. Don't smgr if
+* previously closed, otherwise we could end up evading fd-reuse
+* protection.
+*/

- I think the above hunk is missing a word, because it uses smgr as a
verb. I also think that it's easy to imagine this explanation being
insufficient for some future hacker to understand the issue.

- While 0004 seems useful for testing, it's an awfully big hammer. I'm
not sure we should be thinking about committing something like that,
or at least not as a default part of the build. But ... maybe?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-03-02 Thread Chapman Flack
On 03/01/22 20:03, Nathan Bossart wrote:
> Here is a new version of the patch with the following changes:

I did not notice this earlier (sorry), but there seems to remain in
backup.sgml a programlisting example that shows a psql invocation
for pg_backup_start, then a tar command, then another psql invocation
for pg_backup_stop.

I think that was only workable for the exclusive mode, and now it is
necessary to issue pg_backup_start and pg_backup_stop in the same session.

(The 'touch backup_in_progress' business seems a bit bogus now too,
suggesting an exclusivity remembered from bygone days.)

I am not sure what a workable, simple example ought to look like.
Maybe a single psql script issuing the pg_backup_start and the
pg_backup_stop, with a tar command in between with \! ?
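
For instance, a rough sketch along those lines, assuming the patch's
pg_backup_start()/pg_backup_stop() signatures (the label, paths, and tar
options here are placeholders):

SELECT pg_backup_start('nightly', fast => false);
\! tar -cf /backups/base.tar /usr/local/pgsql/data
SELECT * FROM pg_backup_stop(wait_for_archive => true);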

Several bricks shy of production-ready, but it would give the idea.

Regards,
-Chap




Re: Support for grabbing multiple consecutive values with nextval()

2022-03-02 Thread Jille Timmermans

On 2022-02-28 11:13, Peter Eisentraut wrote:

On 27.02.22 10:42, Jille Timmermans wrote:
I wanted to be able to allocate a bunch of numbers from a sequence at 
once. Multiple people seem to be struggling with this 
(https://stackoverflow.com/questions/896274/select-multiple-ids-from-a-postgresql-sequence, 
https://www.depesz.com/2008/03/20/getting-multiple-values-from-sequences/).


I propose to add an extra argument to nextval() that specifies how 
many numbers you want to allocate (default 1).


What is the use of this?

I note that the stackoverflow question wanted to return multiple
sequence values as a result set, whereas your implementation just
skips a number of values and returns the last one.  At least we should
be clear about what we are trying to achieve.
Both would work for me actually. I'm using COPY FROM to insert many rows 
and need to know their ids, and COPY FROM doesn't support RETURNING.


I implemented this approach because:
- smaller diff
- maybe someone benefits from them being consecutive
- less data to send between client/server

The obvious downside is that people can make mistakes in whether the 
returned number is the first or last number of the series.
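
For illustration, a hypothetical session under the proposed syntax (the
second argument exists only in the patch; the sequence name is a
placeholder):

CREATE SEQUENCE order_id_seq;
-- Reserve 100 consecutive values; the call returns the LAST value,
-- so this session owns (result - 99) .. result.
SELECT nextval('order_id_seq', 100);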





Re: [PATCH] Add reloption for views to enable RLS

2022-03-02 Thread Laurenz Albe
On Wed, 2022-03-02 at 10:10 +, Dean Rasheed wrote:
> > I kept "check_permissions_owner" for now. Constantly changing it around
> > with each iteration doesn't really bring any value IMHO, I'd rather have
> > a final consensus on how to name the option and *then* change it for good.
> 
> Yes indeed, it's annoying to keep changing the name between patch
> versions, so let's try to get a consensus now.
> 
> For my part, I find myself more and more convinced that
> "security_invoker" is the right name [...]
> 
> What are other people's opinions?

I am fine with "security_invoker".  If there are other databases that use the
same term for the same thing, that is a strong argument.

I also agree that having "off" for the default setting is nicer.

My main worry is that other people misunderstand it in the same way that
Walter did, namely that this behaves just like security invoker functions.
But if the behavior is well documented, I think that is ok.
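
For reference, under the patch the option would be set like any other view
reloption; a minimal sketch, assuming the "security_invoker" spelling wins
(table and view names are placeholders):

CREATE TABLE phone_numbers (owner text, number text);
CREATE VIEW phone_numbers_v WITH (security_invoker = on) AS
    SELECT * FROM phone_numbers;
-- RLS policies on phone_numbers are then checked against the invoking
-- user, not the view owner.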

Yours,
Laurenz Albe





Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

2022-03-02 Thread Bharath Rupireddy
On Wed, Mar 2, 2022 at 4:45 PM Nitin Jadhav  wrote:
> > Also, how about special phases for SyncPostCheckpoint(),
> > SyncPreCheckpoint(), InvalidateObsoleteReplicationSlots(),
> > PreallocXlogFiles() (it currently pre-allocates only 1 WAL file, but
> > it might be increase in future (?)), TruncateSUBTRANS()?
>
> SyncPreCheckpoint() is just incrementing a counter and
> PreallocXlogFiles() currently pre-allocates only 1 WAL file. I feel
> there is no need to add any phases for these as of now. We can add in
> the future if necessary. Added phases for SyncPostCheckpoint(),
> InvalidateObsoleteReplicationSlots() and TruncateSUBTRANS().

Okay.

> > 10) I'm not sure if it's discussed, how about adding the number of
> > snapshot/mapping files so far the checkpoint has processed in file
> > processing while loops of
> > CheckPointSnapBuild/CheckPointLogicalRewriteHeap? Sometimes, there can
> > be many logical snapshot or mapping files and users may be interested
> > in knowing the so-far-processed-file-count.
>
> I had thought about this while sharing the v1 patch and mentioned my
> views upthread. I feel it won't give meaningful progress information
> (It can be treated as statistics). Hence not included. Thoughts?

Okay. If there are any complaints about it we can always add them later.

> > > > As mentioned upthread, there can be multiple backends that request a
> > > > checkpoint, so unless we want to store an array of pid we should store 
> > > > a number
> > > > of backend that are waiting for a new checkpoint.
> > >
> > > Yeah, you are right. Let's not go that path and store an array of
> > > pids. I don't see a strong use-case with the pid of the process
> > > requesting checkpoint. If required, we can add it later once the
> > > pg_stat_progress_checkpoint view gets in.
> >
> > I don't think that's really necessary to give the pid list.
> >
> > If you requested a new checkpoint, it doesn't matter if it's only your 
> > backend
> > that triggered it, another backend or a few other dozen, the result will be 
> > the
> > same and you have the information that the request has been seen.  We could
> > store just a bool for that but having a number instead also gives a bit more
> > information and may allow you to detect some broken logic on your client 
> > code
> > if it keeps increasing.
>
> It's a good metric to show in the view but the information is not
> readily available. Additional code is required to calculate the number
> of requests. Is it worth doing that? I feel this can be added later if
> required.

Yes, we can always add it later if required.

> Please find the v4 patch attached and share your thoughts.

I reviewed v4 patch, here are my comments:

1) Can we convert below into pgstat_progress_update_multi_param, just
to avoid function calls?
pgstat_progress_update_param(PROGRESS_CHECKPOINT_LSN, checkPoint.redo);
pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,

2) Why don't we have a special phase for CheckPointReplicationOrigin,
as it does a good bunch of work (writing to disk, XLogFlush,
durable_rename), especially when max_replication_slots is large?

3) I don't think "requested" is necessary here, as it doesn't add any
value and it's not a checkpoint kind or such; you can remove it.

4) s:/'recycling old XLOG files'/'recycling old WAL files'
+  WHEN 16 THEN 'recycling old XLOG files'

5) Can we place CREATE VIEW pg_stat_progress_checkpoint AS definition
next to pg_stat_progress_copy in system_view.sql? It looks like all
the progress reporting views are next to each other.

6) How about shutdown and end-of-recovery checkpoints? Are you planning
to have an ereport_startup_progress mechanism as in 0002?

7) I think you don't need to call checkpoint_progress_start,
pgstat_progress_update_param, or any other progress reporting function
for shutdown and end-of-recovery checkpoints, right?

8) Not for all kinds of checkpoints, right? pg_stat_progress_checkpoint
can't show a progress report for shutdown and end-of-recovery
checkpoints; I think you need to specify that here in wal.sgml and
checkpoint.sgml.
+   command CHECKPOINT. The checkpointer process running the
+   checkpoint will report its progress in the
+   pg_stat_progress_checkpoint view. See
+for details.

9) Can you add a test case for pg_stat_progress_checkpoint view? I
think it's good to add one. See, below for reference:
-- Add a trigger to catch and print the contents of the catalog view
-- pg_stat_progress_copy during data insertion.  This allows to test
-- the validation of some progress reports for COPY FROM where the trigger
-- would fire.
create function notice_after_tab_progress_reporting() returns trigger AS
$$
declare report record;

10) Typo: it's not "is happens"
+   The checkpoint is happens without delays.

11) Can you be more specific about the "some operations" that forced a
checkpoint? Maybe basebackup, createdb or something like that?
+   The checkpoint is started because some operation forced a 

Re: [Proposal] Global temporary tables

2022-03-02 Thread Pavel Stehule
On Wed, Mar 2, 2022 at 19:02, Adam Brusselback wrote:

> >In my observation, very few users require an accurate query plan for
> temporary tables to
> perform manual analyze.
>
> Absolutely not true in my observations or personal experience. It's one of
> the main reasons I have needed to use (local) temporary tables rather than
> just materializing a CTE when decomposing queries that are too complex for
> Postgres to handle.
>
> I wish I could use GTT to avoid the catalog bloat in those instances, but
> that will only be possible if the query plans are accurate.
>

This strongly depends on usage.  Very common patterns from MSSQL don't need
statistics. But on second thought, sometimes the query should be divided
and temp tables used for storing some middle results. In that case, you
cannot do without statistics. In the first case, the temp tables can be
replaced by arrays. In the second case, the temp tables are not replaceable.
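
For illustration, the array replacement in the first case can be as simple
as this minimal sketch (the values are placeholders):

-- Values that would otherwise go into a small temp table can be carried
-- in an array: no catalog entries and no statistics needed.
SELECT t.id
FROM unnest(ARRAY[101, 205, 307]) AS t(id);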

Regards

Pavel


Re: [Proposal] Global temporary tables

2022-03-02 Thread Adam Brusselback
>In my observation, very few users require an accurate query plan for
temporary tables to
perform manual analyze.

Absolutely not true in my observations or personal experience. It's one of
the main reasons I have needed to use (local) temporary tables rather than
just materializing a CTE when decomposing queries that are too complex for
Postgres to handle.

I wish I could use GTT to avoid the catalog bloat in those instances, but
that will only be possible if the query plans are accurate.


Re: Add id's to various elements in protocol.sgml

2022-03-02 Thread Chapman Flack
On 03/02/22 12:46, Brar Piening wrote:
> With regard to varlistentry I'd suggest to decide whether to add ids or
> not on a case by case base. I already offered to add ids to long lists
> upon request but I wouldn't want to blindly add ~4k ids that nobody

Perhaps there are a bunch of variablelists where no one cares about
linking to any of the entries.

So maybe a useful non-terminating message to add eventually would
be one that identifies any varlistentry lacking an id within a
variablelist where at least one other entry has an id.

Regards,
-Chap




Re: Add id's to various elements in protocol.sgml

2022-03-02 Thread Brar Piening

On 02.03.2022 at 10:37, Peter Eisentraut wrote:

I have applied the part of your patch that adds the id's.  The
discussion about the formatting aspect can continue.


Thank you!

I've generated some data by outputting the element name whenever a
section or varlistentry lacks an id. This is how the situation in the
docs currently looks:

   element    | count
--------------+-------
 sect2        |   275
 sect3        |    94
 sect4        |    20
 simplesect   |    20
 varlistentry |  3976

Looking at this, I think that manually assigning an id to all ~400
sections currently lacking one to make them referable in a consistent
way is a bit of work but feasible.

Once we consistently have stable ids on section headers, IMHO it makes
sense to also expose them as links. I'd probably also make the
stylesheet emit a non-terminating message/comment whenever it finds a
section without an id, in order to help keep the layout consistent over time.

With regard to varlistentry I'd suggest deciding whether to add ids or
not on a case-by-case basis. I already offered to add ids to long lists
upon request but I wouldn't want to blindly add ~4k ids that nobody
cares about. That part can also grow over time by people adding ids as
they deem them useful.

Any objections/thoughts?





Re: using extended statistics to improve join estimates

2022-03-02 Thread Justin Pryzby
On Wed, Mar 02, 2022 at 11:38:21AM -0600, Justin Pryzby wrote:
> Rebased over 269b532ae and muted compiler warnings.

And attached.
>From 587a5e9fe87c26cdcd9602fc349f092da95cc580 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Mon, 13 Dec 2021 14:05:17 +0100
Subject: [PATCH] Estimate joins using extended statistics

Use extended statistics (MCV) to improve join estimates. In general this
is similar to how we use regular statistics - we search for extended
statistics (with MCV) covering all join clauses, and if we find such MCV
on both sides of the join, we combine those two MCVs.

Extended statistics allow a couple of additional improvements - e.g. if
there are baserel conditions, we can use them to restrict the part of
the MCVs being combined. This means we're building a conditional
probability distribution and calculating the conditional probability

P(join clauses | baserel conditions)

instead of just P(join clauses).

The patch also allows combining regular and extended MCV - we don't need
extended MCVs on both sides. This helps when one of the tables does not
have extended statistics (e.g. because there are no correlations).
---
 src/backend/optimizer/path/clausesel.c|  63 +-
 src/backend/statistics/extended_stats.c   | 805 ++
 src/backend/statistics/mcv.c  | 757 
 .../statistics/extended_stats_internal.h  |  20 +
 src/include/statistics/statistics.h   |  12 +
 src/test/regress/expected/stats_ext.out   | 167 
 src/test/regress/sql/stats_ext.sql|  66 ++
 7 files changed, 1889 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 06f836308d0..1b2227321a2 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -50,6 +50,9 @@ static Selectivity clauselist_selectivity_or(PlannerInfo *root,
 			 JoinType jointype,
 			 SpecialJoinInfo *sjinfo,
 			 bool use_extended_stats);
+static inline bool treat_as_join_clause(PlannerInfo *root,
+		Node *clause, RestrictInfo *rinfo,
+		int varRelid, SpecialJoinInfo *sjinfo);
 
 /
  *		ROUTINES TO COMPUTE SELECTIVITIES
@@ -129,12 +132,53 @@ clauselist_selectivity_ext(PlannerInfo *root,
 	RangeQueryClause *rqlist = NULL;
 	ListCell   *l;
 	int			listidx;
+	bool		single_clause_optimization = true;
+
+	/*
+	 * The optimization of skipping to clause_selectivity_ext for single
+	 * clauses means we can't improve join estimates with a single join
+	 * clause but additional baserel restrictions. So we disable it when
+	 * estimating joins.
+	 *
+	 * XXX Not sure if this is the right way to do it, but more elaborate
+	 * checks would mostly negate the whole point of the optimization.
+	 * The (Var op Var) patch has the same issue.
+	 *
+	 * XXX An alternative might be making clause_selectivity_ext smarter
+	 * and make it use the join extended stats there. But that seems kinda
+	 * against the whole point of the optimization (skipping expensive
+	 * stuff) and it's making other parts more complex.
+	 *
+	 * XXX Maybe this should check if there are at least some restrictions
+	 * on some base relations, which seems important. But then again, that
+	 * seems to go against the idea of this check to be cheap. Moreover, it
+	 * won't work for OR clauses, which may have multiple parts but we still
+	 * see them as a single BoolExpr clause (it doesn't work later, though).
+	 */
+	if (list_length(clauses) == 1)
+	{
+		Node *clause = linitial(clauses);
+		RestrictInfo *rinfo = NULL;
+
+		if (IsA(clause, RestrictInfo))
+		{
+			rinfo = (RestrictInfo *) clause;
+			clause = (Node *) rinfo->clause;
+		}
+
+		single_clause_optimization
+			= !treat_as_join_clause(root, clause, rinfo, varRelid, sjinfo);
+	}
 
 	/*
 	 * If there's exactly one clause, just go directly to
 	 * clause_selectivity_ext(). None of what we might do below is relevant.
+	 *
+	 * XXX This means we won't try using extended stats on OR-clauses (which
+	 * are a single BoolExpr clause at this point), although we'll do that
+	 * later (once we look at the arguments).
 	 */
-	if (list_length(clauses) == 1)
+	if ((list_length(clauses) == 1) && single_clause_optimization)
 		return clause_selectivity_ext(root, (Node *) linitial(clauses),
 	  varRelid, jointype, sjinfo,
 	  use_extended_stats);
@@ -157,6 +201,23 @@ clauselist_selectivity_ext(PlannerInfo *root,
 			&estimatedclauses, false);
 	}
 
+	/*
+	 * Try applying extended statistics to joins. There's not much we can
+	 * do to detect when this makes sense, but we can check that there are
+	 * join clauses, and that at least some of the rels have stats.
+	 *
+	 * XXX Isn't this mutually exclusive with the preceding block which
+	 * calculates estimates for a single relation?
+	 */
+	if (use_extended_stats &&
+		statext_try_join_estimat

Re: using extended statistics to improve join estimates

2022-03-02 Thread Justin Pryzby
On Wed, Jan 19, 2022 at 06:18:09PM +0800, Julien Rouhaud wrote:
> On Tue, Jan 04, 2022 at 03:55:50PM -0800, Andres Freund wrote:
> > On 2022-01-01 18:21:06 +0100, Tomas Vondra wrote:
> > > Here's an updated patch, rebased and fixing a couple typos reported by
> > > Justin Pryzby directly.
> > 
> > FWIW, cfbot reports a few compiler warnings:
> 
> Also the patch doesn't apply anymore:
> 
> http://cfbot.cputube.org/patch_36_3055.log
> === Applying patches on top of PostgreSQL commit ID 
> 74527c3e022d3ace648340b79a6ddec3419f6732 ===
> === applying patch 
> ./0001-Estimate-joins-using-extended-statistics-20220101.patch
> patching file src/backend/optimizer/path/clausesel.c
> patching file src/backend/statistics/extended_stats.c
> Hunk #1 FAILED at 30.
> Hunk #2 succeeded at 102 (offset 1 line).
> Hunk #3 succeeded at 2619 (offset 9 lines).
> 1 out of 3 hunks FAILED -- saving rejects to file 
> src/backend/statistics/extended_stats.c.rej

Rebased over 269b532ae and muted compiler warnings.

Tomas - is this patch viable for pg15, or should it move to the next CF?

In case it's useful, I ran this on cirrus with my branch for code coverage.
https://cirrus-ci.com/task/5816731397521408
https://api.cirrus-ci.com/v1/artifact/task/5816731397521408/coverage/coverage/00-index.html

statext_find_matching_mcv() has poor coverage.
statext_clauselist_join_selectivity() has poor coverage for the "stats2" case.

In mcv.c: mcv_combine_extended() and mcv_combine_simple() have poor coverage
for the "else if" cases (does it matter?)

Not related to this patch:
build_attnums_array() isn't being hit.

Same at statext_is_compatible_clause_internal()
1538   0 : *exprs = lappend(*exprs, clause);

statext_mcv_[de]serialize() aren't being hit for cstrings.

-- 
Justin




Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-02 Thread Robert Haas
On Wed, Mar 2, 2022 at 11:09 AM Tom Lane  wrote:
> Robert Haas  writes:
> > So the questions in my mind here are all
> > about whether we can detect this stuff cheaply and whether anybody
> > wants to do the work to make it happen, not whether we'd get a benefit
> > in the cases where it kicks in.
>
> Right, my worries are mostly about the first point.

OK, cool.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Commitfest 2022-03 Patch Triage Part 1a.i

2022-03-02 Thread Julien Rouhaud
On Wed, Mar 02, 2022 at 11:58:28AM -0500, Greg Stark wrote:
>
> But I'm unclear exactly what the consequences in the commitfest app
> are of specific state changes. As I understand it there are basically
> two alternatives:
>
> 1) Returned with feedback -- does this make it harder for an author to
resume work next release? Can they simply reactivate the CF entry or do
> they need to start a new one and then lose history in the app?

As far as I know they would need to create a new entry, and thus lose the
history.

> 2) Moved to next commitfest -- this seems to just drag the pain on.
> Then it has to get triaged again next commitfest and if it's actually
> stalled (or progressing fine without needing feedback) that's just
> extra work for nothing.
>
> Do I have this right? What is the right state to put a patch in that
> means "this patch doesn't need to be triaged again unless the author
> actually feels progress has been made and needs new feedback or thinks
> its committable"?

I don't think that 2) means having to triage again.  If a patch gets moved to
the next commitfest now, then clearly it's not ready and should also be
switched to Waiting on Author.

In the next commitfest, if the author doesn't address the problems raised
during review, the patch will still be in Waiting on Author, and the only
needed triaging would be to close as Returned with Feedback the patches that
look abandoned.  For now the arbitrary "abandoned" definition is usually "patch in
Waiting on Author for at least 2 weeks at the end of the commitfest with no
sign of activity from the author".




Re: Commitfest 2022-03 Patch Triage Part 1a.i

2022-03-02 Thread Tom Lane
Greg Stark  writes:
> Do I have this right? What is the right state to put a patch in that
> means "this patch doesn't need to be triaged again unless the author
> actually feels progress has been made and needs new feedback or thinks
> its committable"?

But that's not really the goal, is it?  ISTM what you want to do is
identify patches that we're not going to try to get into v15, and
then push them out to the next CF so that we don't spend more time
on them this month.  But that determination should not preclude them
from being looked at on the normal basis once the next CF arrives.
So I'd say just push them forward with status "Needs review" or
"Waiting on author", whichever seems more appropriate.

If a patch seems to have stalled to the point where neither of
those statuses is appropriate, then closing it RWF would be the
thing to do; but that's not special to the last-CF situation.

regards, tom lane




Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread David Steele

On 3/2/22 11:04, Chapman Flack wrote:

On 03/02/22 02:46, Michael Paquier wrote:

system function marked as proretset while it builds and returns only
one record.  And this is a popular one: pg_stop_backup(), labelled
v2.


I had just recently noticed that while reviewing [0], but shrugged,
as I didn't know what the history was.

Is this best handled as a separate patch, or folded into [0], which is
going to be altering and renaming that function anyway?


On 03/02/22 09:31, Robert Haas wrote:

On Wed, Mar 2, 2022 at 5:25 AM Aleksander Alekseev

Since it doesn't seem to be used for anything except these two array
declarations I suggest keeping simply "3" here.


I think we do this kind of thing in various places in similar
situations, and I think it is good style. It makes it easier to catch
everything if you ever need to update the code.



I've been known (in other projects) to sometimes accomplish the same
thing with, e.g.,

Datum  values[3];
bool   nulls[sizeof values / sizeof *values];


I also use this pattern, though I would generally write it as:

bool nulls[sizeof(values) / sizeof(Datum)];

Chap's way makes it possible to use a macro, though, so that's a plus.

Regards,
-David




Re: Allow root ownership of client certificate key

2022-03-02 Thread David Steele

On 3/2/22 08:40, Tom Lane wrote:

Chris Bandy  writes:

On 3/1/22 3:15 AM, Tom Lane wrote:

Anyway, I'd be happier about back-patching if we could document
actual requests to make it work like the server side does.



PGO runs PostgreSQL 10 through 14 in Kubernetes, and we have to work
around this issue when using certificates for system accounts.


Sold then, I'll make it so in a bit.


Thank you! I think the containers community is really going to 
appreciate this.


Regards,
-David




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-02 Thread Bharath Rupireddy
On Wed, Mar 2, 2022 at 8:12 PM Ashutosh Sharma  wrote:
>
> Some review comments on v5 patch (v5-0001-pg_walinspect.patch)

Thanks for reviewing.

> +--
> +-- pg_get_wal_records_info()
> +--
> +CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn,
> +IN end_lsn pg_lsn,
> +IN wait_for_wal boolean DEFAULT false,
> +OUT lsn pg_lsn,
>
> What does the wait_for_wal flag mean here when one has already
> specified the start and end lsn? AFAIU, if a user has specified a
> start and stop LSN, it means that the user knows the extent to which
> he/she wants to display the WAL records, in which case we need to stop
> once the end lsn has been reached. So what is the meaning of the wait_for_wal
> flag? Does it look sensible to have the wait_for_wal flag here? To me
> it doesn't.

Users can always specify a future end_lsn and set wait_for_wal to
true, then the pg_get_wal_records_info/pg_get_wal_stats functions can
wait for the WAL. IMO, this is useful. If you remember, you were okay
with wait/nowait versions for some of the functions upthread [1]. I'm
now going to retain this behaviour for both
pg_get_wal_records_info/pg_get_wal_stats as it is similar to
pg_waldump's --follow option.
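
As a hypothetical example of that usage, assuming the v5 signatures (the
LSNs here are placeholders):

-- start_lsn in the past, end_lsn in the future: with wait_for_wal the
-- function waits until WAL up to end_lsn has been generated.
SELECT * FROM pg_get_wal_records_info('0/1000000', '0/5000000',
                                      wait_for_wal => true);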

> ==
>
> +--
> +-- pg_get_wal_records_info_till_end_of_wal()
> +--
> +CREATE FUNCTION pg_get_wal_records_info_till_end_of_wal(IN start_lsn pg_lsn,
> +OUT lsn pg_lsn,
> +OUT prev_lsn pg_lsn,
> +OUT xid xid,
>
> Why is this function required? Is pg_get_wal_records_info() alone not
> enough? I think it is. See if we can make end_lsn optional in
> pg_get_wal_records_info() and let's just have it alone. I think it can
> do the job of pg_get_wal_records_info_till_end_of_wal function.
>
> ==
>
> +--
> +-- pg_get_wal_stats_till_end_of_wal()
> +--
> +CREATE FUNCTION pg_get_wal_stats_till_end_of_wal(IN start_lsn pg_lsn,
> +OUT resource_manager text,
> +OUT count int8,
>
> Above comment applies to this function as well. Isn't pg_get_wal_stats() 
> enough?

I'm doing the following input validation for these functions to avoid
issues with invalid LSNs. If I were to have the default value
for end_lsn be 0/0, I couldn't perform this input validation, right? That is
the reason I'm having separate functions {pg_get_wal_records_info,
pg_get_wal_stats}_till_end_of_wal() versions.

/* Validate input. */
if (XLogRecPtrIsInvalid(start_lsn))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg("invalid WAL record start LSN")));

if (XLogRecPtrIsInvalid(end_lsn))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg("invalid WAL record end LSN")));

if (start_lsn >= end_lsn)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg("WAL record start LSN must be less than end LSN")));

> ==
>
>
> +   if (loc <= read_upto)
> +   break;
> +
> +   /* Let's not wait for WAL to be available if
> indicated */
> +   if (loc > read_upto &&
> +   state->private_data != NULL)
> +   {
>
> Why loc > read_upto? The first if condition is (loc <= read_upto)
> followed by the second if condition - (loc > read_upto). Is the first
> if condition (loc <= read_upto) not enough to indicate that loc >
> read_upto?

Yeah, that's unnecessary; I improved the comment there and removed the
"loc > read_upto" check.

> ==
>
> +#define IsEndOfWALReached(state) \
> +   (state->private_data != NULL && \
> +   (((ReadLocalXLOGPage2Private *)
> xlogreader->private_data)->no_wait == true) && \
> +   (((ReadLocalXLOGPage2Private *)
> xlogreader->private_data)->reached_end_of_wal == true))
>
> I think we should either use state or xlogreader. First line says
> state->private_data and second line xlogreader->private_data.

I've changed it to use state instead of xlogreader.

> ==
>
> +   (((ReadLocalXLOGPage2Private *)
> xlogreader->private_data)->reached_end_of_wal == true))
> +
>
> There is a new patch coming to make the end-of-WAL messages less
> scary. It introduces an EOW flag in XLogReaderState; maybe we can use
> that instead of introducing new flags in the private area to represent
> the end of WAL.

Yeah, that would be great. But we never know which one gets committed
first. Until then it's not good to have dependencies on two ongoing
patches. Later, we can change it.

> ==
>
> +/*
> + * XLogReaderRoutine->page_read callback for reading local xlog files
> + *
> + * This function is the same as read_local_xlog_page except that it works in
> + * both wait and no-wait modes. The callers can specify whether to wait in the
> + * private_data of XLogReaderState.
> + */
> +int
> +read_local_xlog_page_2(XLogReaderState *state, XLogRecPtr targetPagePtr,
> +  int reqLen, XLogRecPtr
> targetRecPtr, char *cur_page)
> 

Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Chapman Flack
On 03/02/22 02:46, Michael Paquier wrote:
> system function marked as proretset while it builds and returns only
> one record.  And this is a popular one: pg_stop_backup(), labelled
> v2.

I had just recently noticed that while reviewing [0], but shrugged,
as I didn't know what the history was.

Is this best handled as a separate patch, or folded into [0], which is
going to be altering and renaming that function anyway?


On 03/02/22 09:31, Robert Haas wrote:
> On Wed, Mar 2, 2022 at 5:25 AM Aleksander Alekseev
>> Since it doesn't seem to be used for anything except these two array
>> declarations I suggest keeping simply "3" here.
>
> I think we do this kind of thing in various places in similar
> situations, and I think it is good style. It makes it easier to catch
> everything if you ever need to update the code.


I've been known (in other projects) to sometimes accomplish the same
thing with, e.g.,

Datum  values[3];
bool   nulls[sizeof values / sizeof *values];


Doesn't win any beauty contests, but just one place to change the length
if it needs changing. I see we define a lengthof in c.h, so we could use:

Datum  values[3];
bool   nulls[lengthof(values)];

Regards,
-Chap


[0] https://commitfest.postgresql.org/37/3436/




Re: Commitfest 2022-03 Patch Triage Part 1a.i

2022-03-02 Thread Greg Stark
On Wed, 2 Mar 2022 at 07:12, Daniel Gustafsson  wrote:

> Thanks for picking it up and continuing with recent developments.  Let me know
> if you want a hand in triaging patchsets.

While I have the time, there may be patches where I need help coming to
the right conclusions about what actions to take.

I think the main thing I can do to help is to take patches that have
no chance of making this release and take them off our collective
plates -- hopefully after they've received feedback, but as this is
the last commitfest of the release that's a secondary concern.

But I'm unclear exactly what the consequences in the commitfest app
are of specific state changes. As I understand it there are basically
two alternatives:

1) Returned with feedback -- does this make it harder for an author to
resume work next release? Can they simply reactivate the CF entry or do
they need to start a new one and then lose history in the app?

2) Moved to next commitfest -- this seems to just drag the pain on.
Then it has to get triaged again next commitfest and if it's actually
stalled (or progressing fine without needing feedback) that's just
extra work for nothing.

Do I have this right? What is the right state to put a patch in that
means "this patch doesn't need to be triaged again unless the author
actually feels progress has been made and needs new feedback or thinks
its committable"?

-- 
greg




Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Julien Rouhaud
On Wed, Mar 02, 2022 at 05:40:00PM +0300, Aleksander Alekseev wrote:
> Hi Tom.
>
> Yeah, there's plenty of precedent for that coding if you look around.
> > I've not read the whole patch, but this snippet seems fine to me
> > if there's also an #undef at the end of the function.
>
> No, there is no #undef. With #undef I don't mind it either.

I don't see strong evidence for that pattern being widely used, based on some
naive grepping:

#define for such use without undef:
POSTGRES_FDW_GET_CONNECTIONS_COLS
HEAP_TUPLE_INFOMASK_COLS
CONNECTBY_NCOLS
DBLINK_NOTIFY_COLS
PG_STAT_STATEMENTS_COLS
PG_STAT_STATEMENTS_INFO_COLS
HEAPCHECK_RELATION_COLS
PG_PARTITION_TREE_COLS
PG_STAT_GET_ACTIVITY_COLS
PG_STAT_GET_WAL_COLS
PG_STAT_GET_SLRU_COLS
PG_STAT_GET_REPLICATION_SLOT_COLS
PG_STAT_GET_SUBSCRIPTION_STATS_COLS
PG_GET_BACKEND_MEMORY_CONTEXTS_COLS
PG_GET_SHMEM_SIZES_COLS
PG_GET_REPLICATION_SLOTS_COLS
READ_REPLICATION_SLOT_COLS
PG_STAT_GET_WAL_SENDERS_COLS
PG_STAT_GET_SUBSCRIPTION_COLS

With an undef:
REPLICATION_ORIGIN_PROGRESS_COLS




Re: Add 64-bit XIDs into PostgreSQL 15

2022-03-02 Thread Justin Pryzby
On Wed, Mar 02, 2022 at 06:43:11PM +0400, Pavel Borisov wrote:
> Hi hackers!
> 
> Hi! Here is the rebased version.

The patch doesn't apply - I suppose the patch is relative to a forked postgres
which already has other patches.

http://cfbot.cputube.org/pavel-borisov.html

Note also that I mentioned an issue with pg_upgrade.  Handling that well
is probably the most important part of the patch.

-- 
Justin




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Wed, Mar  2, 2022 at 10:54:27AM -0500, Stephen Frost wrote:
> > It's our decision what we want to support and maintain in the code base
> > and what we don't.  Folks often ask for things that we don't or won't
> > support and this isn't any different from that.  We also remove things
> > on a rather regular basis even when they're being used- generally
> > because we have something better, as we do here.  I disagree that an
> > argument of 'some people use it so we can't remove it' holds any weight
> > here.
> 
> I disagree.

With... which?  We removed recovery.conf without any warning between
major releases, yet it was used by every single PG file-based backup and
restore solution out there and by every single organization that had
ever done a restore of PG since it was introduced in 8.0.  Passing
around cleartext passwords with the LDAP authentication method is
clearly bad from a security perspective and it's a bunch of code to
support that, along with it being quite complicated to configure and get
set up (arguably harder than Kerberos, if you want my 2c).  If you want
to say that it's valuable for us to continue to maintain that code
because it's good and useful and might even be the only option for some
people, fine, though I disagree, but I don't think my argument that we
shouldn't keep it just because *someone* is using it is any different
from our general project policy about features.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Handle infinite recursion in logical replication setup

2022-03-02 Thread vignesh C
On Tue, Mar 1, 2022 at 4:12 PM kuroda.hay...@fujitsu.com  wrote:
>
> Hi Vignesh,
>
> > In logical replication, currently Walsender sends the data that is
> > generated locally and the data that are replicated from other
> > instances. This results in infinite recursion in circular logical
> > replication setup.
>
> Thank you for the good explanation. I understand that this fix can be used
> for bidirectional replication.
>
> > Here there are two problems for the user: a) incremental
> > synchronization of table sending both local data and replicated data
> > by walsender b) Table synchronization of table using copy command
> > sending both local data and replicated data
>
> So you wanted to solve these two problems and are currently focused on
> the first one, right? We can check them one by one.
>
> > For the first problem "Incremental synchronization of table by
> > Walsender" can be solved by:
> > Currently the locally generated data does not have replication origin
> > associated and the data that has originated from another instance will
> > have a replication origin associated. We could use this information to
> > differentiate locally generated data and replicated data and send only
> > the locally generated data. This "only_local" could be provided as an
> > option while subscription is created:
> > ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
> > PUBLICATION pub1 with (only_local = on);
>
> Sounds good, but I cannot tell whether the assumption will hold.
>
> I played with your patch, but it could not be applied to current master.
> I tested from bd74c40 and confirmed that the infinite loop did not appear.
Rebased the patch on top of head

> local_only could not be set from ALTER SUBSCRIPTION command.
> Is it expected?
Modified

Thanks for the comments, the attached patch has the changes for the same.

Regards,
Vignesh
From 7c67cc23584e1106fbf2011c8c6658442125e48f Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Wed, 2 Mar 2022 20:40:34 +0530
Subject: [PATCH v2] Skip replication of non local data.

Add an option only_local which will subscribe only to the locally
generated data in the publisher node. If a subscriber is created with this
option, the publisher will skip publishing the data that was replicated
from other nodes. It can be created using the following syntax:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=' PUBLICATION pub1 with (only_local = on);
---
 contrib/test_decoding/test_decoding.c |  13 +++
 doc/src/sgml/ref/alter_subscription.sgml  |   3 +-
 doc/src/sgml/ref/create_subscription.sgml |  12 ++
 src/backend/catalog/pg_subscription.c |   1 +
 src/backend/catalog/system_views.sql  |   3 +-
 src/backend/commands/subscriptioncmds.c   |  29 -
 .../libpqwalreceiver/libpqwalreceiver.c   |  18 ++-
 src/backend/replication/logical/decode.c  |  36 --
 src/backend/replication/logical/logical.c |  35 ++
 src/backend/replication/logical/tablesync.c   |   2 +-
 src/backend/replication/logical/worker.c  |   2 +
 src/backend/replication/pgoutput/pgoutput.c   |  25 
 src/backend/replication/slot.c|   4 +-
 src/backend/replication/slotfuncs.c   |  18 ++-
 src/backend/replication/walreceiver.c |   2 +-
 src/backend/replication/walsender.c   |  21 +++-
 src/bin/psql/tab-complete.c   |   4 +-
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/catalog/pg_subscription.h |   3 +
 src/include/replication/logical.h |   4 +
 src/include/replication/output_plugin.h   |   7 ++
 src/include/replication/pgoutput.h|   1 +
 src/include/replication/slot.h|   5 +-
 src/include/replication/walreceiver.h |   8 +-
 src/test/regress/expected/rules.out   |   5 +-
 src/test/regress/expected/subscription.out|   4 +
 src/test/regress/sql/subscription.sql |   4 +
 src/test/subscription/t/029_circular.pl   | 108 ++
 src/tools/pgindent/typedefs.list  |   1 +
 29 files changed, 345 insertions(+), 39 deletions(-)
 create mode 100644 src/test/subscription/t/029_circular.pl

diff --git a/contrib/test_decoding/test_decoding.c b/contrib/test_decoding/test_decoding.c
index ea22649e41..58bc5dbc1c 100644
--- a/contrib/test_decoding/test_decoding.c
+++ b/contrib/test_decoding/test_decoding.c
@@ -73,6 +73,8 @@ static void pg_decode_truncate(LogicalDecodingContext *ctx,
 			   ReorderBufferChange *change);
 static bool pg_decode_filter(LogicalDecodingContext *ctx,
 			 RepOriginId origin_id);
+static bool pg_decode_filter_remotedata(LogicalDecodingContext *ctx,
+		RepOriginId origin_id);
 static void pg_decode_message(LogicalDecodingContext *ctx,
 			  ReorderBufferTXN *txn, XLogRecPtr message_lsn,
 			  bool transactional, const char *prefix,
@@ -148,6 +150,7 @@ _PG_output_plugin_init(OutputPluginCallbacks *cb)
 	cb->truncate_cb = pg_de

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-02 Thread Tom Lane
Robert Haas  writes:
> So the questions in my mind here are all
> about whether we can detect this stuff cheaply and whether anybody
> wants to do the work to make it happen, not whether we'd get a benefit
> in the cases where it kicks in.

Right, my worries are mostly about the first point.

regards, tom lane




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Bruce Momjian
On Wed, Mar  2, 2022 at 10:54:27AM -0500, Stephen Frost wrote:
> It's our decision what we want to support and maintain in the code base
> and what we don't.  Folks often ask for things that we don't or won't
> support and this isn't any different from that.  We also remove things
> on a rather regular basis even when they're being used- generally
> because we have something better, as we do here.  I disagree that an
> argument of 'some people use it so we can't remove it' holds any weight
> here.

I disagree.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Wed, Mar  2, 2022 at 10:29:45AM -0500, Stephen Frost wrote:
> > We don't require SSL to be used with them..?  Further, as already
> > discussed on this thread, SSL only helps with on-the-wire, doesn't
> > address the risk of a compromised server.  LDAP, in particular, is
> > terrible in this regard because it's a centralized password system,
> > meaning that one compromised server will lead to an attacker gaining
> > full access to the victim's account throughout the enterprise.
> 
> Yes, but the site chose LDAP, and I don't think it is our place to tell
> them what to use.

It's our decision what we want to support and maintain in the code base
and what we don't.  Folks often ask for things that we don't or won't
support and this isn't any different from that.  We also remove things
on a rather regular basis even when they're being used- generally
because we have something better, as we do here.  I disagree that an
argument of 'some people use it so we can't remove it' holds any weight
here.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-02 Thread Robert Haas
On Tue, Mar 1, 2022 at 9:05 PM Tom Lane  wrote:
> Robert Haas  writes:
> > I agree. My question is: why shouldn't every case where we can deduce
> > an implied inequality be reasonably likely to show a benefit?
>
> Maybe it will be, if we can deal with the issue you already mentioned
> about not misestimating the resulting partially-redundant conditions.

OK.

> > Second, it looks to me like the patch takes the rather naive strategy
> > of enforcing the derived clauses everywhere that they can legally be
> > put, which seems certain not to be optimal.
>
> I'm not sure about that ... it's basically what we do with derived
> equalities.  However, there's enough structure in the equivalence-class
> case that we don't end up enforcing redundant quals.  It's not clear
> to me whether the same can be said here.

I mean, to go back to the example of a.x < 42 and a.x = b.x, there are
three possible choices as to where to enforce the qual (a, b, both).
That's a meaningful choice, independent of any estimation issue. I
think it is reasonably common to have cases where a.x < 42 is very
selective and b.x < 42 hardly filters out anything at all, or the
other way around. Certainly, that kind of situation came up a lot in
PostgreSQL-based applications that I wrote myself back in the day. If
we're just talking about btree operators, *maybe* we can say it's
cheap enough that we don't care, but color me a tad skeptical.
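
Concretely (hypothetical tables, a minimal sketch):

CREATE TABLE a (x int);
CREATE TABLE b (x int);
-- As written, the inequality constrains only a; via a.x = b.x the
-- planner could derive b.x < 42 and enforce it on a, on b, or on both.
SELECT * FROM a JOIN b ON a.x = b.x WHERE a.x < 42;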

> > I don't know whether attaching something to the equivalence class data
> > structure is the right idea or not. Presumably, we don't want to make
> > an extra pass over the query tree to gather the information needed for
> > this kind of optimization, and it feels like we need to know which
> > vars are EMs before we try to derive alternate/additional quals.
>
> Yeah, we don't want to make an additional pass over the tree, and
> we also would rather not add an additional set of per-operator
> catalog lookups.  We might be able to generalize the code that looks
> for equality operators so that it looks for "any btree operator"
> with the same number of lookups, and then have it feed the results
> down either the EquivalenceClass path or the inequality path
> as appropriate.  At the end, after we've formed all the ECs, we
> could have a go at matching up the inequality structures with the
> ECs.

Interesting idea.

> But I don't agree that ECs are a necessary prerequisite.
> Here are a couple of other patterns that might be worth looking for:
>
> * "a > b AND b > c" allows deducing "a > c", whether or not any
> of those values appears in an EC.
>
> * "a > const1 AND a > const2" can be simplified to either "a > const1"
> or "a > const2" depending on which constant is larger.  (The predicate
> proof mechanism already has a form of this, but we don't typically
> apply it in a way that would result in dropping the redundant qual.)
>
> It's entirely possible that one or both of these patterns is not
> worth looking for.  But I would say that it's equally unproven
> that deriving "a > c" from "a = b AND b > c" is worth the cycles.
> I'll grant that it's most likely going to be a win if we can use
> any of these patterns to generate a restriction clause from what
> had been join clauses.  Beyond that it's much less clear.

Pretty much all of the cases that I've run across involve an equijoin
plus an inequality, so if somebody asked me which problem we ought to
put most effort into solving, I'd say that one. Cases like "a>1 and
a>2" or a same-table case like "a=b and b>3" haven't been as common in
my experience, and haven't caused as much trouble when they do happen.
Part of that is because if you have something like "a>1 and a>2" in
your query, it may be easy for you to just tweak the query generation
to avoid it, and if "a=b and b>3" is coming up a lot, you may choose
to adjust your data model (e.g. choose to store NULL in b to indicate
same-as-a), whereas if you have something like
"orders.orderno=order_lines.orderno and order_lines.orderno<1,"
what are you going to do to avoid that exactly? If you normalize your
order data and then want to find the old orders, this problem arises
ineluctably.

But having said that, I'm not *against* doing something about those
cases if it's cheap or falls out naturally. If we could detect for
free that the user had written a>1 and a>2, it'd certainly be
beneficial to drop the former qual and keep only the latter. If the
user writes a>b and b>c and all those columns are in one table I don't
see how it helps to derive a>c, because we're still going to need to
check the other two quals anyway so we've just created more work. But
if those columns are not all in the same table then I'd say chances
are really pretty good. Like, suppose it's x.a>y.b and y.b>x.c. Well,
like I say, I don't really see people writing queries like that
myself, but if they do, it seems pretty obvious that deriving x.a>x.c
has the potential to save a LOT of trouble. If it's x.a>y.b and
y.b>z.c I don't f

Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Bruce Momjian
On Wed, Mar  2, 2022 at 10:29:45AM -0500, Stephen Frost wrote:
> We don't require SSL to be used with them..?  Further, as already
> discussed on this thread, SSL only helps with on-the-wire, doesn't
> address the risk of a compromised server.  LDAP, in particular, is
> terrible in this regard because it's a centralized password system,
> meaning that one compromised server will lead to an attacker gaining
> full access to the victim's account throughout the enterprise.

Yes, but the site chose LDAP, and I don't think it is our place to tell
them what to use.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Jonathan S. Katz

On 3/2/22 10:30 AM, Stephen Frost wrote:

Greetings,

* Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:

On 02.03.22 15:16, Jonathan S. Katz wrote:

I find that a lot of people are still purposely using md5.  Removing it
now or in a year would be quite a disruption.


What are the reasons they are still purposely using it? The ones I have
seen/heard are:

- Using an older driver
- On a pre-v10 PG
- Unaware of SCRAM


I'm not really sure, but it seems like they are content with what they have
and don't want to bother with the new fancy stuff.


By that argument, we should have kept "password" (plain) as an 
authentication method.


The specific use-cases I've presented are all solvable issues. The 
biggest challenge with existing users is the upgrade process, which is
why I'd rather we begin a deprecation process and see if there are any 
ways we can make the md5 => SCRAM transition easier.



There were lots and lots of folks who were comfortable with
recovery.conf, yet we removed that without any qualms from one major
version to the next.  md5 will have had 5 years of overlap with scram.


I do agree with Stephen in principle here. I encountered upgrade 
challenges with this, and challenges with updating automation to handle
the change.



What I'm proposing above is to start the process of deprecating it as an
auth method, which also allows us to continue the education efforts to
upgrade. Does that make sense?


I'm not in favor of starting a process that will result in removal of the
md5 method at this time.


I am.


+1 for starting this process. It may still take a few more years, but we 
should help our users to move away from an auth method with known issues.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Joshua Brindle
On Wed, Mar 2, 2022 at 10:29 AM Stephen Frost  wrote:
>
> Greetings,
>
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Wed, Mar  2, 2022 at 10:09:31AM -0500, Stephen Frost wrote:
> > > I'm not sure that it's quite so simple.  Perhaps we should also drop
> > > LDAP and I don't really think PAM was ever terribly good for us to have,
> > > but at least PAM and RADIUS could possibly be used with OTP solutions
> > > (and maybe LDAP?  Not sure, don't think I've seen that but perhaps..),
> > > rendering sniffing of what's transmitted less valuable.  We don't
> > > support that for 'password' itself or for 'md5' in any serious way
> > > though.
> >
> > I thought all the plain-password methods were already using SSL
> > (hopefully with certificate authentication) and they were therefore
> > safe.  Why would we remove something like LDAP if that is what the site
> > is already using?
>
> We don't require SSL to be used with them..?  Further, as already
> discussed on this thread, SSL only helps with on-the-wire, doesn't
> address the risk of a compromised server.  LDAP, in particular, is
> terrible in this regard because it's a centralized password system,
> meaning that one compromised server will lead to an attacker gaining
> full access to the victim's account throughout the enterprise.

I want to add my support for the deprecation move, and I think/hope that
my multi-password patchset can help make the transition less painful.

I also want to throw out another existing issue with MD5, namely that
using the role name as the salt is fundamentally problematic. Even outside
of trivial pass-the-hash attacks, one could build a rainbow table today
that uses 'postgres' as the salt and recover admin credentials that can
be used for password-stuffing attacks against other enterprise assets.
This means PG is actively enabling lateral movement through enterprise
systems if MD5 passwords are used.
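
For reference, the stored md5 verifier is just md5(password || role name),
so the effective salt for any role is known in advance ('secret' below is
a placeholder password):

-- What PostgreSQL stores for role "postgres" with password "secret";
-- a rainbow table salted with 'postgres' inverts this directly.
SELECT 'md5' || md5('secret' || 'postgres');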




Re: [PoC/RFC] Multiple passwords, interval expirations

2022-03-02 Thread Joshua Brindle
On Wed, Mar 2, 2022 at 9:58 AM Joshua Brindle  wrote:
>
> This is not intended for PG15.
>
> Attached is a proof-of-concept patchset to implement multiple valid
> passwords, which have independent expirations, set by a GUC or SQL
> using an interval.
>



> postgres=# select * from pg_auth_password ;
>  roleid |  name   | password | expiration
> --------+---------+----------+------------
>      10 | __def__ | SCRAM-SHA-256$4096:yGiHIYPwc2az7xj/7TIyTA==$OQL/AEcEY1yOCNbrZEj4zDvNnOLpIqltOW1uQvosLvc=:9VRRppuIkSrwhiBN5ePy8wB1yzDa/2uX0WUx6gXi93E= |
>   16384 | __def__ | SCRAM-SHA-256$4096:AA==$1Ivp4d+vAWxowpuGEn05KR9lxyGOms3yy85k3D7XpBg=:k8xUjU6xrJG17PMGa/Zya6pAE/M7pEDaoIFmWvNIEUg= | 2022-03-02 06:52:31.217193-08
>   16384 | newone  | SCRAM-SHA-256$4096:AA==$WK3+41CCGDognSnZrtpHhv00z9LuVUjHR1hWq8T1+iE=:w2C5GuhgiEB7wXqPxYfxBKB+ehm4h6Oeif1uzpPIFVk= | 2022-03-03 06:47:53.728159-08
> (3 rows)

There's obviously a salt problem here, apparently introduced at the
last rebase, that I'll need to fix, but it brings up one aspect of the
patchset I didn't mention in the original email:

For the SCRAM protocol to work as-is with existing clients, the salt
for each password must be the same. Right now ALTER USER will find and
reuse the salt, but a user passing in a pre-computed SCRAM secret
currently has no way to get the salt.

For \password (we'll need a new variant that takes a password name), I
was thinking libpq could hold onto the salt that was used to log in,
but for outside computation we'll need some way for the client to
request it.

None of that is done yet.
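
For context, this is how a pre-computed secret is supplied today
(a hedged sketch for the single-password case; the angle-bracket
fields are placeholders for real base64 values, with the salt sitting
between the iteration count and the first key):

ALTER USER app_user PASSWORD
  'SCRAM-SHA-256$4096:<salt-base64>$<StoredKey-base64>:<ServerKey-base64>';

For several such verifiers to work with unmodified clients, each one
would have to reuse the same <salt-base64>, which is exactly what the
client currently has no way to ask the server for.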




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> On 02.03.22 15:16, Jonathan S. Katz wrote:
> >>I find that a lot of people are still purposely using md5.  Removing it
> >>now or in a year would be quite a disruption.
> >
> >What are the reasons they are still purposely using it? The ones I have
> >seen/heard are:
> >
> >- Using an older driver
> >- On a pre-v10 PG
> >- Unaware of SCRAM
> 
> I'm not really sure, but it seems like they are content with what they have
> and don't want to bother with the new fancy stuff.

There were lots and lots of folks who were comfortable with
recovery.conf, yet we removed that without any qualms from one major
version to the next.  md5 will have had 5 years of overlap with scram.

> >What I'm proposing above is to start the process of deprecating it as an
> >auth method, which also allows us to continue the education efforts to
> >upgrade. Does that make sense?
> 
> I'm not in favor of starting a process that will result in removal of the
> md5 method at this time.

I am.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Wed, Mar  2, 2022 at 10:09:31AM -0500, Stephen Frost wrote:
> > I'm not sure that it's quite so simple.  Perhaps we should also drop
> > LDAP and I don't really think PAM was ever terribly good for us to have,
> > but at least PAM and RADIUS could possibly be used with OTP solutions
> > (and maybe LDAP?  Not sure, don't think I've seen that but perhaps..),
> > rendering sniffing of what's transmitted less valuable.  We don't
> > support that for 'password' itself or for 'md5' in any serious way
> > though.
> 
> I thought all the plain-password methods were already using SSL
> (hopefully with certificate authentication) and they were therefore
> safe.  Why would we remove something like LDAP if that is what the site
> is already using?

We don't require SSL to be used with them..?  Further, as already
discussed on this thread, SSL only helps with on-the-wire, doesn't
address the risk of a compromised server.  LDAP, in particular, is
terrible in this regard because it's a centralized password system,
meaning that one compromised server will lead to an attacker gaining
full access to the victim's account throughout the enterprise.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Peter Eisentraut



On 02.03.22 15:16, Jonathan S. Katz wrote:
I find that a lot of people are still purposely using md5.  Removing 
it now or in a year would be quite a disruption.


What are the reasons they are still purposely using it? The ones I have 
seen/heard are:


- Using an older driver
- On a pre-v10 PG
- Unaware of SCRAM


I'm not really sure, but it seems like they are content with what they 
have and don't want to bother with the new fancy stuff.


What I'm proposing above is to start the process of deprecating it as an 
auth method, which also allows us to continue the education efforts to 
upgrade. Does that make sense?


I'm not in favor of starting a process that will result in removal of 
the md5 method at this time.





Re: Changing "Hot Standby" to "hot standby"

2022-03-02 Thread Daniel Westermann (DWE)
Hi Aleksander,

> Pretty sure that for titles we should keep English capitalization rules.

Done like that. Thanks for taking a look.

Regards
Daniel

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index b5b6042104..08eb1ad946 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -548,7 +548,7 @@ protocol to make nodes agree on a serializable transactional order.
rollforward will take considerably longer, so that technique only
offers a solution for disaster recovery, not high availability.
A standby server can also be used for read-only queries, in which case
-   it is called a Hot Standby server. See  for
+   it is called a hot standby server. See  for
more information.
   
 
@@ -1032,7 +1032,7 @@ primary_slot_name = 'node_a_slot'

 

-Hot Standby feedback propagates upstream, whatever the cascaded arrangement.
+hot standby feedback propagates upstream, whatever the cascaded arrangement.

 

@@ -1499,16 +1499,16 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
   Hot Standby
 
   
-   Hot Standby
+   hot standby
   
 

-Hot Standby is the term used to describe the ability to connect to
+hot standby is the term used to describe the ability to connect to
 the server and run read-only queries while the server is in archive
 recovery or standby mode. This
 is useful both for replication purposes and for restoring a backup
 to a desired state with great precision.
-The term Hot Standby also refers to the ability of the server to move
+The term hot standby also refers to the ability of the server to move
 from recovery through to normal operation while users continue running
 queries and/or keep their connections open.

@@ -1623,7 +1623,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
being executed during recovery.  This restriction applies even to
temporary tables, because table rows cannot be read or written without
assigning a transaction ID, which is currently not possible in a
-   Hot Standby environment.
+   hot standby environment.
   
  
  
@@ -1703,7 +1703,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'

 In normal operation, read-only transactions are allowed to
 use LISTEN and NOTIFY,
-so Hot Standby sessions operate under slightly tighter
+so hot standby sessions operate under slightly tighter
 restrictions than ordinary read-only sessions.  It is possible that some
 of these restrictions might be loosened in a future release.

@@ -1746,7 +1746,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'

 

-There are also additional types of conflict that can occur with Hot Standby.
+There are also additional types of conflict that can occur with hot standby.
 These conflicts are hard conflicts in the sense that queries
 might need to be canceled and, in some cases, sessions disconnected to resolve them.
 The user is provided with several ways to handle these
@@ -1947,8 +1947,8 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 If hot_standby is on in postgresql.conf
 (the default value) and there is a
standby.signal
-file present, the server will run in Hot Standby mode.
-However, it may take some time for Hot Standby connections to be allowed,
+file present, the server will run in hot standby mode.
+However, it may take some time for hot standby connections to be allowed,
 because the server will not accept connections until it has completed
 sufficient recovery to provide a consistent state against which queries
 can run.  During this period,
@@ -2282,7 +2282,7 @@ HINT:  You can then restart the server after making the necessary configuration
Caveats
 

-There are several limitations of Hot Standby.
+There are several limitations of hot standby.
 These can and probably will be fixed in future releases:
 
   
@@ -2299,7 +2299,7 @@ HINT:  You can then restart the server after making the necessary configuration
 
  Valid starting points for standby queries are generated at each
  checkpoint on the primary. If the standby is shut down while the primary
- is in a shutdown state, it might not be possible to re-enter Hot Standby
+ is in a shutdown state, it might not be possible to re-enter hot standby
  until the primary is started up, so that it generates further starting
  points in the WAL logs.  This situation isn't a problem in the most
  common situations where it might happen. Generally, if the primary is


Re: SQL/JSON: functions

2022-03-02 Thread Andrew Dunstan


On 3/1/22 16:41, Andrew Dunstan wrote:
> On 2/1/22 14:11, I wrote:
>> 2. The new GUC "sql_json" is a bit of a worry. I understand what it's
>> trying to do, but I'm trying to convince myself it's not going to be a
>> fruitful source of error reports, especially if people switch it in the
>> middle of a session. Maybe it should be an initdb option instead of a GUC?
>>
>>
>
> So far my efforts have not borne fruit. Here's why:
>
>
> andrew=# set sql_json = jsonb;
> SET
> andrew=# create table abc (x text, y json);
> CREATE TABLE
> andrew=# \d abc
>        Table "public.abc"
>  Column | Type | Collation | Nullable | Default
> --------+------+-----------+----------+---------
>  x      | text |           |          |
>  y      | json |           |          |
>
> andrew=# insert into abc values ('a','{"q":1}');
> INSERT 0 1
> andrew=# select json_each(y) from abc;
> ERROR:  function json_each(json) does not exist
> LINE 1: select json_each(y) from abc;
>    ^
> HINT:  No function matches the given name and argument types. You might
> need to add explicit type casts.
> andrew=# select jsonb_each(y) from abc;
>  jsonb_each
> 
>  (q,1)
> (1 row)
>
>
> The description tells them the column is json, but the json_* functions
> don't work on the column and you need to use the jsonb functions. That
> seems to me a recipe for major confusion. It might be better if we set
> it at initdb time so it couldn't be changed, but even so it could be
> horribly confusing.
>
> This is certainly severable from the rest of these patches. I'm not sure
> how severable it is from the SQL/JSON Table patches.
>
>

I have confirmed that this is not required at all for the JSON_TABLE
patch set.


I'll submit new patch sets omitting it shortly. The GUC patch can be
considered separately, probably as release 16 material, but I think
as-is it's at best quite incomplete.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Bruce Momjian
On Wed, Mar  2, 2022 at 10:09:31AM -0500, Stephen Frost wrote:
> I'm not sure that it's quite so simple.  Perhaps we should also drop
> LDAP and I don't really think PAM was ever terribly good for us to have,
> but at least PAM and RADIUS could possibly be used with OTP solutions
> (and maybe LDAP?  Not sure, don't think I've seen that but perhaps..),
> rendering sniffing of what's transmitted less valuable.  We don't
> support that for 'password' itself or for 'md5' in any serious way
> though.

I thought all the plain-password methods were already using SSL
(hopefully with certificate authentication) and they were therefore
safe.  Why would we remove something like LDAP if that is what the site
is already using?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> What is the logic to removing md5 but keeping 'password'?
> 
> > I don't think we should keep 'password'.
> 
> I don't see much point in that unless we deprecate *all* the
> auth methods that transmit a cleartext password.

I'm not sure that it's quite so simple.  Perhaps we should also drop
LDAP and I don't really think PAM was ever terribly good for us to have,
but at least PAM and RADIUS could possibly be used with OTP solutions
(and maybe LDAP?  Not sure, don't think I've seen that but perhaps..),
rendering sniffing of what's transmitted less valuable.  We don't
support that for 'password' itself or for 'md5' in any serious way
though.

We really should drop ident already though.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Tom Lane
Stephen Frost  writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> What is the logic to removing md5 but keeping 'password'?

> I don't think we should keep 'password'.

I don't see much point in that unless we deprecate *all* the
auth methods that transmit a cleartext password.

regards, tom lane




Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Mar  1, 2022 at 08:31:19AM -0500, Stephen Frost wrote:
> > > The last time I played with this area is the recent error handling
> > > improvement with cryptohashes but MD5 has actually helped here in
> > > detecting the problem as a patched OpenSSL would complain if trying to
> > > use MD5 as hash function when FIPS is enabled.
> > 
> > Having to continue to deal with md5 as an algorithm when it's known to
> > be notably less secure and so much so that organizations essentially ban
> > its use for exactly what we're using it for, in fact, another reason to
> 
> Really?  I thought it was publicly-visible MD5 hashes that were the
> biggest problem.  Our 32-bit salt during the connection is a problem, of
> course.

Neither are good.  Not sure that we really need to spend a lot of effort
trying to figure out which issue is the biggest problem.

> > remove it, not a reason to keep it.  Better code coverage testing of
> > error paths is the answer to making sure that our error handling behaves
> > properly.
> 
> What is the logic to removing md5 but keeping 'password'?

I don't think we should keep 'password'.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposal: Support custom authentication methods using hooks

2022-03-02 Thread Bruce Momjian
On Tue, Mar  1, 2022 at 08:31:19AM -0500, Stephen Frost wrote:
> > The last time I played with this area is the recent error handling
> > improvement with cryptohashes but MD5 has actually helped here in
> > detecting the problem as a patched OpenSSL would complain if trying to
> > use MD5 as hash function when FIPS is enabled.
> 
> Having to continue to deal with md5 as an algorithm when it's known to
> be notably less secure and so much so that organizations essentially ban
> its use for exactly what we're using it for, in fact, another reason to

Really?  I thought it was publicly-visible MD5 hashes that were the
biggest problem.  Our 32-bit salt during the connection is a problem, of
course.

> remove it, not a reason to keep it.  Better code coverage testing of
> error paths is the answer to making sure that our error handling behaves
> properly.

What is the logic to removing md5 but keeping 'password'?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Add 64-bit XIDs into PostgreSQL 15

2022-03-02 Thread Pavel Borisov
Hi hackers!

> Hi! Here is the rebased version.
I'd like to add a description of what was done in v9:
- The patch is rebased on the current master branch.
- The in-memory tuple storage format was refactored as promised to have
pre-calculated 64-bit xmin and xmax, not just copies of pd_xid_base and
pd_multi_base.
- Fixed the bug reported by Andres Freund with lazy conversion of pages
upgraded from 32-bit to 64-bit xids, when the first tuple read (and
therefore lazy conversion) happens in a read-only state (a read-only
xact or on a replica). In this case the in-memory buffer descriptor
will now be marked with the REGBUF_CONVERTED flag. When the cluster
comes back to a read-write state, this will lead to emitting a
full-page-write xlog instruction for this page.

Relevant changes in README are also done.

We'd very much appreciate enthusiasm for having 64-bit xids in PG15 and
any effort to review and test this feature.

Alexander, thanks for your attention to the patchset. Your questions and
review are very much welcome!
The participation of other hackers is highly appreciated as always!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-02 Thread Ashutosh Sharma
Some review comments on v5 patch (v5-0001-pg_walinspect.patch)

+--
+-- pg_get_wal_records_info()
+--
+CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn,
+IN end_lsn pg_lsn,
+IN wait_for_wal boolean DEFAULT false,
+OUT lsn pg_lsn,

What does the wait_for_wal flag mean here when one has already
specified the start and end lsn? AFAIU, if a user has specified a
start and stop LSN, it means that the user knows the extent to which
he/she wants to display the WAL records, in which case we need to stop
once the end lsn has been reached. So what is the meaning of the
wait_for_wal flag? Does it look sensible to have the wait_for_wal flag
here? To me it doesn't.

==

+--
+-- pg_get_wal_records_info_till_end_of_wal()
+--
+CREATE FUNCTION pg_get_wal_records_info_till_end_of_wal(IN start_lsn pg_lsn,
+OUT lsn pg_lsn,
+OUT prev_lsn pg_lsn,
+OUT xid xid,

Why is this function required? Is pg_get_wal_records_info() alone not
enough? I think it is. See if we can make end_lsn optional in
pg_get_wal_records_info() and let's just have it alone. I think it can
do the job of the pg_get_wal_records_info_till_end_of_wal function.
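
A sketch of what the merged signature could look like (column list
abbreviated, untested; MODULE_PATHNAME assumes the extension script
context):

CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn,
    IN end_lsn pg_lsn DEFAULT NULL,  -- NULL means: up to end of WAL
    OUT lsn pg_lsn,
    OUT prev_lsn pg_lsn,
    OUT xid xid)
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'pg_get_wal_records_info'
LANGUAGE C PARALLEL SAFE;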

==

+--
+-- pg_get_wal_stats_till_end_of_wal()
+--
+CREATE FUNCTION pg_get_wal_stats_till_end_of_wal(IN start_lsn pg_lsn,
+OUT resource_manager text,
+OUT count int8,

Above comment applies to this function as well. Isn't pg_get_wal_stats() enough?

==


+   if (loc <= read_upto)
+   break;
+
+   /* Let's not wait for WAL to be available if indicated */
+   if (loc > read_upto &&
+   state->private_data != NULL)
+   {

Why loc > read_upto? The first if condition is (loc <= read_upto),
followed by the second if condition (loc > read_upto). Is falling
through the first if condition (loc <= read_upto) not enough to
indicate that loc > read_upto?

==

+#define IsEndOfWALReached(state) \
+   (state->private_data != NULL && \
+   (((ReadLocalXLOGPage2Private *) xlogreader->private_data)->no_wait == true) && \
+   (((ReadLocalXLOGPage2Private *) xlogreader->private_data)->reached_end_of_wal == true))


I think we should use either state or xlogreader consistently. The
first line says state->private_data and the second line says
xlogreader->private_data.

==

+   (((ReadLocalXLOGPage2Private *) xlogreader->private_data)->reached_end_of_wal == true))
+

There is a new patch coming to make the end of WAL messages less
scary. It introduces the EOW flag in xlogreaderstate maybe we can use
that instead of introducing new flags in private area to represent the
end of WAL.

==

+/*
+ * XLogReaderRoutine->page_read callback for reading local xlog files
+ *
+ * This function is the same as read_local_xlog_page except that it works in
+ * both wait and no-wait mode. Callers can specify whether to wait via the
+ * private_data of XLogReaderState.
+ */
+int
+read_local_xlog_page_2(XLogReaderState *state, XLogRecPtr targetPagePtr,
+  int reqLen, XLogRecPtr
targetRecPtr, char *cur_page)
+{
+   XLogRecPtr  read_upto,

Do we really need this function? Can't we make use of an existing WAL
reader function - read_local_xlog_page()?

--
With Regards,
Ashutosh Sharma.

On Fri, Feb 25, 2022 at 4:33 PM Bharath Rupireddy
 wrote:
>
> On Wed, Feb 16, 2022 at 9:04 AM Ashutosh Sharma  wrote:
> > I don't think that's the use case of this patch. Unless there is some
> > other valid reason, I would suggest you remove it.
>
> Removed the function pg_verify_raw_wal_record. Robert and Greg also
> voted for removal upthread.
>
> > > > Should we add a function that returns the pointer to the first and
> > > > probably the last WAL record in the WAL segment? This would help users
> > > > to inspect the wal records in the entire wal segment if they wish to
> > > > do so.
> > >
> > > Good point. One can do this already with pg_get_wal_records_info and
> > > pg_walfile_name_offset. Usually, the LSN format itself can give an
> > > idea about the WAL file it is in.
> > >
> > > postgres=# select lsn, pg_walfile_name_offset(lsn) from
> > > pg_get_wal_records_info('0/500', '0/5FF') order by lsn asc
> > > limit 1;
> > > lsn|pg_walfile_name_offset
> > > ---+---
> > >  0/538 | (00010005,56)
> > > (1 row)
> > >
> > > postgres=# select lsn, pg_walfile_name_offset(lsn) from
> > > pg_get_wal_records_info('0/500', '0/5FF') order by lsn desc
> > > limit 1;
> > > lsn|   pg_walfile_name_offset
> > > ---+-
> > >  0/5C0 | (00010005,16777152)
> > > (1 row)
> > >
> >
> > The workaround you are suggesting is not very user-friendly, and FYI
> > pg_get_wal_records_info simply hangs at times when we specify the upper
> > and lower LSN limits of a WAL file.
> >
> > To make things easier for the end users I would

Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Aleksander Alekseev
Hi Tom.

> Yeah, there's plenty of precedent for that coding if you look around.
> I've not read the whole patch, but this snippet seems fine to me
> if there's also an #undef at the end of the function.
>

No, there is no #undef. With an #undef I don't mind it either.

-- 
Best regards,
Aleksander Alekseev


Re: Allow root ownership of client certificate key

2022-03-02 Thread Tom Lane
Chris Bandy  writes:
> On 3/1/22 3:15 AM, Tom Lane wrote:
>> Anyway, I'd be happier about back-patching if we could document
>> actual requests to make it work like the server side does.

> PGO runs PostgreSQL 10 through 14 in Kubernetes, and we have to work
> around this issue when using certificates for system accounts.

Sold then, I'll make it so in a bit.

regards, tom lane




Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 2, 2022 at 5:25 AM Aleksander Alekseev
>  wrote:
>> Declaring a macro inside the procedure body is a bit unconventional.
>> Since it doesn't seem to be used for anything except these two array
>> declarations I suggest keeping simply "3" here.

> I think we do this kind of thing in various places in similar
> situations, and I think it is good style. It makes it easier to catch
> everything if you ever need to update the code.

Yeah, there's plenty of precedent for that coding if you look around.
I've not read the whole patch, but this snippet seems fine to me
if there's also an #undef at the end of the function.

regards, tom lane




Re: pg_stop_backup() v2 incorrectly marked as proretset

2022-03-02 Thread Robert Haas
On Wed, Mar 2, 2022 at 5:25 AM Aleksander Alekseev
 wrote:
> ```
>  Datum
>  pg_stop_backup_v2(PG_FUNCTION_ARGS)
>  {
> -    ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> +#define PG_STOP_BACKUP_V2_COLS 3
>      TupleDesc       tupdesc;
> -    Tuplestorestate *tupstore;
> -    MemoryContext   per_query_ctx;
> -    MemoryContext   oldcontext;
> -    Datum           values[3];
> -    bool            nulls[3];
> +    Datum           values[PG_STOP_BACKUP_V2_COLS];
> +    bool            nulls[PG_STOP_BACKUP_V2_COLS];
> ```
>
> Declaring a macro inside the procedure body is a bit unconventional.
> Since it doesn't seem to be used for anything except these two array
> declarations I suggest keeping simply "3" here.

I think we do this kind of thing in various places in similar
situations, and I think it is good style. It makes it easier to catch
everything if you ever need to update the code.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: PG DOCS - logical replication filtering

2022-03-02 Thread Peter Eisentraut

On 02.03.22 05:47, Peter Smith wrote:

This patch introduces a new "Filtering" page to give a common place
where all kinds of logical replication filtering can be described.
(e.g. It is envisaged that a "Column Filters" section can be added
sometime in the future).


The pending feature to select a subset of table columns to replicate is 
not "column filtering".  The thread might still be called that, but we 
have changed the patch to not use that terminology.


Filtering is a dynamic action based on actual values.  The row filtering 
feature does that.  The column list feature is a static DDL-time 
configuration.  It is no more filtering than specifying a list of tables 
in a publication is table filtering.
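
The distinction is visible directly in the syntax; a hedged sketch
using the in-progress PG15 grammar, with a hypothetical table
accounts(id, balance, active):

-- Row filtering: a dynamic predicate evaluated against actual values.
CREATE PUBLICATION pub_active FOR TABLE accounts WHERE (active IS TRUE);

-- Column list: a static DDL-time choice of which columns are published
-- at all.
CREATE PUBLICATION pub_columns FOR TABLE accounts (id, balance);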


So please consider organizing the documentation differently to not 
create this confusion.





