Re: pg_tablespace_location() failure with allow_in_place_tablespaces

2022-07-21 Thread Thomas Munro
On Thu, Mar 31, 2022 at 5:01 PM Michael Paquier  wrote:
> On Wed, Mar 30, 2022 at 08:23:25PM +1300, Thomas Munro wrote:
> > That leads to the attached patches, the first of which I'd want to 
> > back-patch.
>
> Makes sense.
> ...
> This should also work for plain files, so that looks fine to me.

Thanks.  Pushed.  Also CC'ing Alvaro who expressed an interest in this
problem[1].

> > Unfortunately while testing this I realised there is something else
> > wrong here: if you take a basebackup using tar format, in-place
> > tablespaces are skipped (they should get their own OID.tar file, but
> > they don't, also no error).  While it wasn't one of my original goals
> > for in-place tablespaces to work in every way (and I'm certain some
> > external tools would be confused by them), it seems we're pretty close
> > so we should probably figure out that piece of the puzzle.  It may be
> > obvious why but I didn't have time to dig into that today... perhaps
> > instead of just skipping the readlink() we should be writing something
> > different into the mapfile and then restoring as appropriate...
>
> Yeah, I saw that in-place tablespaces were part of the main tarball in
> base backups as we rely on the existence of a link to decide if the
> contents of a path should be separated in a different tarball or not.
> This does not strike me as a huge problem in itself, TBH, as the
> improvement would be limited to make sure that the base backups could
> be correctly restored with multiple tablespaces.  And you can get
> pretty much the same amount of coverage to make sure that the backup
> contents are correct without fully restoring them.

Are they in the main tar file, or are they just missing?

[1] https://postgr.es/m/2022072751.x7hod2xgrd76xr5c%40alvherre.pgsql




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Amit Langote
On Fri, Jul 22, 2022 at 1:13 PM David Rowley  wrote:
> On Fri, 22 Jul 2022 at 15:22, Amit Langote  wrote:
> > BTW, the only way I found to *forcefully* exercise llvm_compile_expr()
> > is to add `set jit_above_cost to 0` at the top of the test file, or
> > are we missing a force_jit_mode, like there is force_parallel_mode?
>
> I don't think we'd need any setting to hide the JIT counters from
> EXPLAIN ANALYZE since those only show with COSTS ON, which we tend not
> to do.

Ah, makes sense.

> I think for testing, you could just zero all the jit*above_cost GUCs.
>
> If you look at the config_extra in [1], you'll see that animal runs
> the tests with modified JIT parameters.
>
> BTW, I was working on code inside llvm_compile_expr() a few days ago
> and I thought I'd gotten the new evaluation steps I was adding correct
> as it worked fine with jit_above_cost=0, but on further testing, it
> crashed with jit_inline_above_cost=0. Might be worth doing both to see
> if everything works as intended.

Thanks for the pointer.

I didn't see anything going bust on re-testing with all
jit_*_above_cost parameters set to 0, so maybe the
llvm_compile_expr() additions are alright.
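
For reference, pinning the JIT cost GUCs to zero at the top of a test file
looks roughly like this (a sketch; jit itself must also be on for the cost
settings to have any effect):

```sql
-- Force JIT compilation, inlining, and optimization for every query,
-- regardless of estimated plan cost.
SET jit = on;
SET jit_above_cost = 0;
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;
```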

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-21 Thread Amit Kapila
On Fri, Jul 22, 2022 at 3:47 AM Peter Smith  wrote:
>
> I was in favour of enum mostly because I thought the bitmask of an
> earlier patch was mis-used; IMO each bit should only be for
> representing something as "on/set". So a bit for
> SUBSCRIPTION_REL_STATE_READY makes sense, but a bit for
> SUBSCRIPTION_REL_STATE_NOT_READY seemed strange/backwards to me. YMMV.
>
> So using a bitmask is fine, except I thought it should be implemented
> so that one of the bits is for a "NOT" modifier (IIUC this is kind of
> similar to what Michael [1] suggested above?). So "Not READY" would be
> (SUBSCRIPTION_REL_STATE_MOD_NOT | SUBSCRIPTION_REL_STATE_READY)
>

Hmm, I think that sounds more complicated than what I expected. I
suggest we go with the simple idea of using a boolean not_ready, which
will decide whether to use the additional key to search. I feel we can
extend it with a bitmask or enum when we have a clear need for
more states.

> Also, it may be better to add the bit constants for every one of the
> current states, even if you are not needing to use all of them just
> yet. In fact, I thought this patch probably can implement the fully
> capable common function (i.e. capable of multiple keys etc) right now,
> so there will be no need to revisit it again in the future.
>

I don't know whether we need to go that far. If no such use case
arises for a year or so, which appears quite likely, then one can
question the need for the additional defines.

-- 
With Regards,
Amit Kapila.




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Amit Kapila
On Fri, Jul 22, 2022 at 7:39 AM Kyotaro Horiguchi
 wrote:
>
> +errmsg("cannot operate on relation \"%s\"",
>
> Other callers of errdetail_relkind_not_supported() describe their
> operations concretely.  In that sense, I think we should say "cannot
> open relation \"%s\"" here.
>

Sounds reasonable to me. This will give more precise information to the user.

-- 
With Regards,
Amit Kapila.




Re: Handle infinite recursion in logical replication setup

2022-07-21 Thread Amit Kapila
On Fri, Jul 22, 2022 at 1:39 AM Jonathan S. Katz  wrote:
>
> Thanks for the work on this feature -- this is definitely very helpful
> towards supporting more types of use cases with logical replication!
>
> I've read through the proposed documentation and did some light testing
> of the patch. I have two general comments about the docs as they
> currently read:
>
> 1. I'm concerned by calling this "Bidirectional replication" in the docs
> that we are overstating the current capabilities. I think this is
> accentuated in the opening paragraph:
>
> ==snip==
>   Bidirectional replication is useful for creating a multi-master database
>   environment for replicating read/write operations performed by any of the
>   member nodes.
> ==snip==
>
> For one, we're not replicating reads, we're replicating writes. Amongst
> the writes, at this point we're only replicating DML. A reader could
> think that deploying can work for a full bidirectional solution.
>
> (Even if we're aspirationally calling this section "Bidirectional
> replication", that does make it sound like we're limited to two nodes,
> when we can support more than two).
>

Right, I think the system can support N-Way replication.

> Perhaps "Logical replication between writers" or "Logical replication
> between primaries" or "Replicating changes between primaries", or
> something better.
>

Among the above "Replicating changes between primaries" sounds good to
me or simply "Replication between primaries". As this is a sub-section
on the Logical Replication page, I feel it is okay to not use Logical
in the title.

> 2. There is no mention of conflicts in the documentation, e.g.
> referencing the "Conflicts" section of the documentation. It's very easy
> to create a conflicting transaction that causes a subscriber to be
> unable to continue to apply transactions:
>
>-- DB 1
>CREATE TABLE abc (id int);
>CREATE PUBLICATION node1 FOR ALL TABLES ;
>
>-- DB2
>CREATE TABLE abc (id int);
>CREATE PUBLICATION node2 FOR ALL TABLES ;
>CREATE SUBSCRIPTION node2_node1
>  CONNECTION 'dbname=logi port=5433'
>  PUBLICATION node1
>  WITH (copy_data = off, origin = none);
>
>-- DB1
>CREATE SUBSCRIPTION node1_node2
>  CONNECTION 'dbname=logi port=5434'
>  PUBLICATION node2
>  WITH (copy_data = off, origin = none);
>INSERT INTO abc VALUES (1);
>
>-- DB2
>INSERT INTO abc VALUES (2);
>
>-- DB1
>ALTER TABLE abc ADD PRIMARY KEY (id);
>INSERT INTO abc VALUES (3);
>
>-- DB2
>INSERT INTO abc VALUES (3);
>
>-- DB1 cannot apply the transactions
>
> At a minimum, I think we should reference the documentation we have in
> the logical replication section on conflicts. We may also want to advise
> that a user is responsible for designing their schemas in a way to
> minimize the risk of conflicts.
>

This sounds reasonable to me.

One more point about docs, it appears to be added as the last
sub-section on the Logical Replication page. Is there a reason for
doing so? I feel this should be third sub-section after describing
Publication and Subscription.

BTW, do you have any opinion on the idea of the first remaining patch
where we accomplish two things: a) Checks and throws an error if
'copy_data = on' and 'origin = none' but the publication tables were
also replicated from other publishers. b) Adds 'force' value for
copy_data parameter to allow copying in such a case. The primary
reason for this patch is to avoid loops or duplicate data in the
initial phase. We can't skip copying based on origin as we can do
while replicating changes from WAL. So, we detect that the publisher
already has data from some other node and doesn't allow replication
unless the user uses the 'force' option for copy_data.
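
A sketch of what that proposed interface would look like on the subscriber
(the 'force' spelling is as discussed in the thread, not committed syntax;
connection details reuse the example above):

```sql
-- With copy_data = on this would error out, because the publisher
-- already holds rows replicated from another origin; 'force' is the
-- proposed override acknowledging possible duplicate initial data.
CREATE SUBSCRIPTION node1_node2
  CONNECTION 'dbname=logi port=5434'
  PUBLICATION node2
  WITH (copy_data = force, origin = none);
```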

-- 
With Regards,
Amit Kapila.




Re: Strange failures on chipmunk

2022-07-21 Thread Thomas Munro
On Thu, Jun 30, 2022 at 8:21 PM Heikki Linnakangas  wrote:
> I ran sparc64-ext4-zeros on chipmunk for 10 minutes, and it didn't print
> anything.

Thanks for checking.

> It's possible that the SD card on chipmunk is simply wearing out and
> flipping bits. I can try to replace it. Anyone have suggestions on a
> test program I could run on the SD card, after replacing it, to verify
> if it was indeed worn out?

BTW its disk is full.

FWIW I run RPi4 build bots on higher end USB3.x sticks (SanDisk
Extreme Pro, I'm sure there are others), and the performance is orders
of magnitude higher and more consistent than the micro SD and
cheap/random USB sticks I tried.  Admittedly they cost more than the
RPi4 board itself (back when you could get them).

I noticed another (presumed) Raspberry Pi apparently behaving
strangely at the storage level (guessing it's a Pi by the armv7l
architecture): dangomushi appears to get files mixed up.  Here it is
trying to compile a log file last week:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2022-07-14%2017%3A58%3A38

And the week before it tried to compile some Perl:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2022-07-09%2015%3A30%3A07




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread David Rowley
On Fri, 22 Jul 2022 at 15:22, Amit Langote  wrote:
> BTW, the only way I found to *forcefully* exercise llvm_compile_expr()
> is to add `set jit_above_cost to 0` at the top of the test file, or
> are we missing a force_jit_mode, like there is force_parallel_mode?

I don't think we'd need any setting to hide the JIT counters from
EXPLAIN ANALYZE since those only show with COSTS ON, which we tend not
to do.

I think for testing, you could just zero all the jit*above_cost GUCs.

If you look at the config_extra in [1], you'll see that animal runs
the tests with modified JIT parameters.

BTW, I was working on code inside llvm_compile_expr() a few days ago
and I thought I'd gotten the new evaluation steps I was adding correct
as it worked fine with jit_above_cost=0, but on further testing, it
crashed with jit_inline_above_cost=0. Might be worth doing both to see
if everything works as intended.

David

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2022-07-22%2003%3A04%3A03




Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-07-21 Thread David Rowley
On Thu, 4 Nov 2021 at 20:59, Ronan Dunklau  wrote:
> I took some time to toy with this again.
>
> At first I thought that charging a discount in foreign grouping paths for
> Aggref targets (since they are computed remotely) would be a good idea,
> similar to what is done for the grouping keys.
>
> But in the end, it's probably not something we would like to do. Yes, the
> group planning will be more accurate on the remote side generally (better
> statistics than locally for estimating the number of groups) but executing the
> grouping locally when the number of groups is close to the input's cardinality
> (ex: group by unique_key)  gives us a form of parallelism which can actually
> perform better.
>
> For the other cases where there is fewer output than input tuples, that is,
> when an actual grouping takes place, adjusting fdw_tuple_cost might be enough
> to tune the behavior to what is desirable.

I've now looked into this issue. With the patched code, the remote
aggregate path loses out in add_path() due to the fact that the local
aggregate path compares fuzzily the same as the remote aggregate path.
Since the local aggregate path is now fetching the rows from the
foreign server with a SQL query containing an ORDER BY clause (per my
change to query_pathkeys being picked up in
get_useful_pathkeys_for_relation()), add_path now prefers that path
due to it having pathkeys and the remote aggregate query not having
any (PATHKEYS_BETTER2).

It seems what's going on is that quite simply the default
fdw_tuple_cost is unrealistically low. Let's look.

#define DEFAULT_FDW_TUPLE_COST 0.01

Which is even lower than DEFAULT_PARALLEL_TUPLE_COST (0.1) and the
same as cpu_tuple_cost!

After some debugging, I see add_path() switches to the, seemingly
better, remote aggregate plan again if I multiply fdw_tuple_cost by
28. Anything below that sticks to the (inferior) local aggregate plan.

There's also another problem going on that would make that situation
better. The query planner expects the following query to produce 6
rows:

SELECT array_agg("C 1" ORDER BY "C 1" USING OPERATOR(public.<^) NULLS
LAST), c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6)) GROUP
BY c2;

You might expect the planner to think there'd just be 1 row due to the
"c2 = 6" and "GROUP BY c2", but it thinks there will be more than
that. If estimate_num_groups() knew about EquivalenceClasses and
checked ec_has_const, then it might be able to do better, but it
doesn't, so:

GroupAggregate  (cost=11.67..11.82 rows=6 width=36)

If I force that estimate to be 1 row instead of 6, then I only need a
fdw_tuple_cost to be 12 times the default to get it to switch to the
remote aggregate plan.

I think we should likely just patch master and change
DEFAULT_FDW_TUPLE_COST to at the very least 0.2, which is 20x higher
than today's setting. I'd be open to a much higher setting such as 0.5
(50x).
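
For anyone experimenting with this, the cost can also be raised per foreign
server without recompiling (a sketch; the server name 'foreign_server' is a
placeholder):

```sql
-- DEFAULT_FDW_TUPLE_COST is 0.01; try the 20x value suggested above.
ALTER SERVER foreign_server OPTIONS (ADD fdw_tuple_cost '0.2');
```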

David




RE: Hash index build performance tweak from sorting

2022-07-21 Thread houzj.f...@fujitsu.com
On Monday, May 30, 2022 4:13 PM shiy.f...@fujitsu.com wrote:
> 
> On Tue, May 10, 2022 5:43 PM Simon Riggs 
> wrote:
> >
> > On Sat, 30 Apr 2022 at 12:12, Amit Kapila 
> > wrote:
> > >
> > > Few comments on the patch:
> > > 1. I think it is better to use DatumGetUInt32 to fetch the hash key
> > > as the nearby code is using.
> > > 2. You may want to change the below comment in HSpool
> > > /*
> > > * We sort the hash keys based on the buckets they belong to. Below
> > masks
> > > * are used in _hash_hashkey2bucket to determine the bucket of given
> > hash
> > > * key.
> > > */
> >
> > Addressed in new patch, v2.
> >
> 
> I think your changes look reasonable.

+1, the changes look good to me as well.

Best regards,
Hou zj




Re: slot_creation_error failures

2022-07-21 Thread Thomas Munro
On Fri, Jul 22, 2022 at 3:11 PM Masahiko Sawada  wrote:
> We can see regression.diffs[1]:

Ahh, right, thanks.  We see it when it fails in test-decoding-check on
Windows, but not when it fails in MiscCheck, and I didn't check enough
of them.




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Amit Langote
On Fri, Jul 22, 2022 at 2:12 AM Alvaro Herrera  wrote:
> On 2022-Jul-21, Amit Langote wrote:
>
> > Because I wrote all of it while not really understanding how the LLVM
> > constructs like blocks and branches work, the only reason I think
> > those llvm_compile_expr() additions may be correct is that all the
> > tests in jsonb_sqljson.sql pass even if I add the following line at
> > the top:
>
> I suggest to build with --enable-coverage, then run the regression tests
> and do "make coverage-html" and see if your code appears covered in the
> report.

Thanks for the suggestion.  I just did and it seems that both the
additions to ExecInterpExpr() and to llvm_compile_expr() are well
covered.

BTW, the only way I found to *forcefully* exercise llvm_compile_expr()
is to add `set jit_above_cost to 0` at the top of the test file, or
are we missing a force_jit_mode, like there is force_parallel_mode?

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: slot_creation_error failures

2022-07-21 Thread Masahiko Sawada
On Fri, Jul 22, 2022 at 10:59 AM Thomas Munro  wrote:
>
> Hi,
>
> Here are some recent $SUBJECT on HEAD.  Unfortunately we don't see the
> regression.diffs file :-(

We can see regression.diffs[1]:

== output_iso/regression.diffs 
diff -w -U3 C:/tools/nmsys64/home/pgrunner/bf/root/HEAD/pgsql/contrib/test_decoding/expected/slot_creation_error.out C:/tools/nmsys64/home/pgrunner/bf/root/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/slot_creation_error.out
--- C:/tools/nmsys64/home/pgrunner/bf/root/HEAD/pgsql/contrib/test_decoding/expected/slot_creation_error.out	2022-03-19 06:04:11.806604000 +
+++ C:/tools/nmsys64/home/pgrunner/bf/root/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/slot_creation_error.out	2022-04-11 21:15:58.342700300 +
@@ -92,23 +92,7 @@
 FROM pg_stat_activity
 WHERE application_name = 'isolation/slot_creation_error/s2';
  
-step s2_init: <... completed>
-FATAL:  terminating connection due to administrator command
-server closed the connection unexpectedly
+PQconsumeInput failed: server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.

-step s1_terminate_s2: <... completed>
-pg_terminate_backend
-
-t
-(1 row)
-
-step s1_c: COMMIT;
-step s1_view_slot:
-SELECT slot_name, slot_type, active FROM pg_replication_slots
WHERE slot_name = 'slot_creation_error'
-
-slot_name|slot_type|active
--+-+--
-(0 rows)
-

Regards,

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=fairywren&dt=2022-04-11%2021%3A04%3A15&stg=test-decoding-check

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Remove useless arguments in ReadCheckpointRecord().

2022-07-21 Thread Tom Lane
Fujii Masao  writes:
> On 2022/07/21 14:54, Kyotaro Horiguchi wrote:
>> At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao 
>>  wrote in
>>> -   (errmsg("could not locate required checkpoint record"),
>>> +   (errmsg("could not locate a valid checkpoint record in backup_label 
>>> file"),

>> "in backup_label" there looks *to me* need some verb..  

> Sorry, I failed to understand your point. Could you clarify your point?

FWIW, the proposed change looks like perfectly good English to me.
"locate" is the verb.  It's been way too many years since high
school grammar for me to remember the exact term for auxiliary
clauses like "in backup_label file", but that doesn't need its
own verb.  Possibly Kyotaro-san is feeling that it should be
like "... checkpoint record in the backup_label file".  That'd
be more formal, but in the telegraphic style that we prefer for
primary error messages, omitting the "the" is fine.

regards, tom lane




Re: question about `static inline` functions in header files

2022-07-21 Thread Junwang Zhao
Ok, thanks for the clarification.

On Fri, Jul 22, 2022 at 11:03 AM Tom Lane  wrote:
>
> Junwang Zhao  writes:
> > I notice that there are lots of *static inline functions* in header files,
> > the header file's content will go into each translation unit at preprocess
> > phase, that means all the c file including the header will have a copy
> > of the static inline function.
>
> We are assuming that the compiler will not emit unused static functions.
> This has been default behavior in gcc for ages.  If you're unfortunate
> enough to have a compiler that won't do it, yes you're going to have a
> bloated binary.
>
> > IMHO, the header files should only include the inline function's 
> > declaration,
> > and the definition should be in c files.
>
> Then it couldn't be inlined, defeating the purpose.
>
> regards, tom lane



-- 
Regards
Junwang Zhao




Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Fujii Masao




On 2022/07/21 23:41, Japin Li wrote:
> On Thu, 21 Jul 2022 at 22:22, Fujii Masao  wrote:
>> Hi,
>>
>> I found that fetch_more_data_begin() in postgres_fdw reports an error when
>> PQsendQuery() returns a value less than 0 as follows, though PQsendQuery()
>> can return only 1 or 0. I think this is a bug. Attached is the patch that
>> fixes this bug. This needs to be back-ported to v14 where async execution
>> was supported in postgres_fdw.
>>
>> if (PQsendQuery(fsstate->conn, sql) < 0)
>> pgfdw_report_error(ERROR, NULL, fsstate->conn, false,
>> fsstate->query);
>>
>> Regards,
>
> +1.  However, I think checking whether the result equals 0 or 1 might be better.

Maybe. I just used the "if (!PQsendQuery())" style because it's used in
postgres_fdw elsewhere.

> Anyway, the patch works correctly.

Thanks for the review! Pushed.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: question about `static inline` functions in header files

2022-07-21 Thread Tom Lane
Junwang Zhao  writes:
> I notice that there are lots of *static inline functions* in header files,
> the header file's content will go into each translation unit at preprocess
> phase, that means all the c file including the header will have a copy
> of the static inline function.

We are assuming that the compiler will not emit unused static functions.
This has been default behavior in gcc for ages.  If you're unfortunate
enough to have a compiler that won't do it, yes you're going to have a
bloated binary.

> IMHO, the header files should only include the inline function's declaration,
> and the definition should be in c files.

Then it couldn't be inlined, defeating the purpose.

regards, tom lane




Re: Remove useless arguments in ReadCheckpointRecord().

2022-07-21 Thread Fujii Masao



On 2022/07/21 14:54, Kyotaro Horiguchi wrote:
> I agree to removing the two parameters. And agree to let
> ReadCheckpointRecord not conscious of the location source.

Thanks for the review!

> At Thu, 21 Jul 2022 11:45:23 +0900, Fujii Masao  wrote in
>> Agreed. Attached is the updated version of the patch.
>> Thanks for the review!
>
>> -   (errmsg("could not locate required checkpoint record"),
>> +   (errmsg("could not locate a valid checkpoint record in backup_label file"),
>
> "in backup_label" there looks *to me* need some verb..

Sorry, I failed to understand your point. Could you clarify your point?

> By the way,
> this looks like a good chance to remove the (now) extra parens around
> errmsg() and friends.
>
> For example:
>
> -   (errmsg("could not locate a valid checkpoint record in backup_label file"),
> +   errmsg("could not locate checkpoint record specified in backup_label file"),
>
> -   (errmsg("could not locate a valid checkpoint record in control file")));
> +   errmsg("could not locate checkpoint record recorded in control file")));

Because it's recommended not to put parentheses just before errmsg(), you mean?
I'm OK with removing such parentheses, but I'd like to understand why before
doing that. I was thinking that either having or not having parentheses in
front of errmsg() is fine, so there are many calls to errmsg() with
parentheses in xlogrecovery.c.

> +   (errmsg("invalid checkpoint record")));
>
> Is it useful to show the specified LSN there?

Yes, LSN info would be helpful also for debugging.

I separated the patch into two; one is to remove useless arguments in
ReadCheckpointRecord(), another is to improve log messages. I added LSN info
to the log messages in the second patch.

> +   (errmsg("invalid resource manager ID in checkpoint record")));
>
> We have a similar message "invalid resource manager ID %u at
> %X/%X". Since the caller explains that it is a checkpoint record, we
> can share the message here.

+1

> +   (errmsg("invalid xl_info in checkpoint record")));
>
> (It is not an issue of this patch, though) I don't think this is
> appropriate for a user-facing message. Couldn't we say "unexpected
> record type: %x" or something like that?

The proposed log message doesn't look like an improvement. But I have no
better one. So I left the message as it is, in the patch, for now.

> +   (errmsg("invalid length of checkpoint record")));
>
> We have "record with invalid length at %X/%X" or "invalid record
> length at %X/%X: wanted %u, got %u". Could we reuse any of them?

+1

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

From 1a7f5b66fed158306ed802cb08b328c2f8d47f20 Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Wed, 20 Jul 2022 23:06:44 +0900
Subject: [PATCH 1/2] Remove useless arguments in ReadCheckpointRecord().

---
 src/backend/access/transam/xlogrecovery.c | 84 ---
 1 file changed, 15 insertions(+), 69 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..e383c2123a 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -422,8 +422,8 @@ static XLogPageReadResult WaitForWALToBecomeAvailable(XLogRecPtr RecPtr,
 							  XLogRecPtr replayLSN,
 							  bool nonblocking);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
-static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher, XLogRecPtr RecPtr,
-					int whichChkpt, bool report, TimeLineID replayTLI);
+static XLogRecord *ReadCheckpointRecord(XLogPrefetcher *xlogprefetcher,
+					XLogRecPtr RecPtr, TimeLineID replayTLI);
 static bool rescanLatestTimeLine(TimeLineID replayTLI, XLogRecPtr replayLSN);
 static int	XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 			XLogSource source, bool notfoundOk);
@@ -605,7 +605,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	 * When a backup_label file is present, we want to roll forward from
 	 * the checkpoint it identifies, rather than using pg_control.
 	 */
-	record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc, 0, true,
+	record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
 								  CheckPointTLI);
 	if (record != NULL)
 	{
@@ 
question about `static inline` functions in header files

2022-07-21 Thread Junwang Zhao
I notice that there are lots of *static inline functions* in header files,
the header file's content will go into each translation unit at preprocess
phase, that means all the c file including the header will have a copy
of the static inline function.

The inline keyword is a hint for compiler to inline the function, if the
compiler does inline the function, the definition could be optimized out
by the compiler, but if the *inline function*  can not be inlined, the function
will reside in each of the translation units that include the header file, which
means the same static function compiled multiple times and may waste
some space?

IMHO, the header files should only include the inline function's declaration,
and the definition should be in c files.

I am not sure why this kind of coding style came along, appreciate if
some one can give me some clue, thanks :)

-- 
Regards
Junwang Zhao




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Kyotaro Horiguchi
Hi.

+errmsg("cannot operate on relation \"%s\"",

Other callers of errdetail_relkind_not_supported() describe their
operations concretely.  In that sense, I think we should say "cannot
open relation \"%s\"" here.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

 




slot_creation_error failures

2022-07-21 Thread Thomas Munro
Hi,

Here are some recent $SUBJECT on HEAD.  Unfortunately we don't see the
regression.diffs file :-(

bfbot=> select make_snapshot_url(animal, snapshot) from run where
'slot_creation_error' = any(fail_tests) order by snapshot desc;
   make_snapshot_url
----------------------------------------------------------------------------------------------
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-07-21%2023:33:50
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-06-15%2003:12:54
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-05-10%2021:03:37
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-04-11%2021:04:15
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-04-08%2018:04:15
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-04-01%2018:27:29
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2022-03-08%2001:14:51
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2022-02-24%2015:17:30
 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2022-02-15%2009:29:06
(9 rows)




Re: warn if GUC set to an invalid shared library

2022-07-21 Thread Justin Pryzby
Finally returning to this .. rebased and updated per feedback.

I'm not sure of a good place to put test cases for this..
>From 13e8c8b96d1a5313fd3edde515a5278cf8c6b23e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 13 Dec 2021 08:42:38 -0600
Subject: [PATCH v5 1/3] warn when setting GUC to a nonextant library

---
 src/backend/utils/misc/guc.c  | 106 +-
 .../unsafe_tests/expected/rolenames.out   |  19 
 .../modules/unsafe_tests/sql/rolenames.sql|  13 +++
 .../worker_spi/expected/worker_spi.out|   2 +
 .../modules/worker_spi/sql/worker_spi.sql |   2 +
 src/test/regress/expected/rules.out   |   8 ++
 6 files changed, 147 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index af4a1c30689..4e38e73a277 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -176,6 +176,12 @@ static bool call_enum_check_hook(struct config_enum *conf, int *newval,
  void **extra, GucSource source, int elevel);
 
 static bool check_log_destination(char **newval, void **extra, GucSource source);
+static bool check_local_preload_libraries(char **newval, void **extra,
+		GucSource source);
+static bool check_session_preload_libraries(char **newval, void **extra,
+		GucSource source);
+static bool check_shared_preload_libraries(char **newval, void **extra,
+		GucSource source);
 static void assign_log_destination(const char *newval, void *extra);
 
 static bool check_wal_consistency_checking(char **newval, void **extra,
@@ -4272,7 +4278,7 @@ static struct config_string ConfigureNamesString[] =
 		},
		&session_preload_libraries_string,
 		"",
-		NULL, NULL, NULL
+		check_session_preload_libraries, NULL, NULL
 	},
 
 	{
@@ -4283,7 +4289,7 @@ static struct config_string ConfigureNamesString[] =
 		},
		&shared_preload_libraries_string,
 		"",
-		NULL, NULL, NULL
+		check_shared_preload_libraries, NULL, NULL
 	},
 
 	{
@@ -4294,7 +4300,7 @@ static struct config_string ConfigureNamesString[] =
 		},
		&local_preload_libraries_string,
 		"",
-		NULL, NULL, NULL
+		check_local_preload_libraries, NULL, NULL
 	},
 
 	{
@@ -12506,6 +12512,100 @@ check_max_worker_processes(int *newval, void **extra, GucSource source)
 	return true;
 }
 
+/*
+ * See also load_libraries() and internal_load_library().
+ */
+static bool
+check_preload_libraries(char **newval, void **extra, GucSource source,
+	bool restricted)
+{
+	char		*rawstring;
+	List		*elemlist;
+	ListCell	*l;
+
+	/* nothing to do if empty */
+	if (newval == NULL || *newval[0] == '\0')
+		return true;
+
+	/*
+	 * issue warnings only during an interactive SET, from ALTER
+	 * ROLE/DATABASE/SYSTEM.
+	 */
+	if (source != PGC_S_TEST && source != PGC_S_FILE)
+		return true;
+
+	/* Need a modifiable copy of string */
+	rawstring = pstrdup(*newval);
+
+	/* Parse string into list of filename paths */
+	if (!SplitDirectoriesString(rawstring, ',', &elemlist))
+	{
+		/* Should not happen ? */
+		return false;
+	}
+
+	foreach(l, elemlist)
+	{
+		/* Note that filename was already canonicalized */
+		char   *filename = (char *) lfirst(l);
+		char   *expanded = NULL;
+
+		/* If restricting, insert $libdir/plugins if not mentioned already */
+		if (restricted && first_dir_separator(filename) == NULL)
+		{
+			expanded = psprintf("$libdir/plugins/%s", filename);
+			filename = expanded;
+		}
+
+		/*
+		 * stat()/access() only check that the library exists, not that it has
+		 * the correct magic number or even a library.  But error messages from
+		 * dlopen() are not portable, so it'd be hard to report any problem
+		 * other than "does not exist".
+		 */
+		if (access(filename, R_OK) == 0)
+			continue;
+
+
+		if (source == PGC_S_FILE)
+			/* ALTER SYSTEM */
+			ereport(WARNING,
+	errcode_for_file_access(),
+	errmsg("could not access file \"%s\"", filename),
+	errdetail("The server will currently fail to start with this setting."),
+	errhint("If the server is shut down, it will be necessary to manually edit the %s file to allow it to start again.",
+		"postgresql.auto.conf"));
+		else
+			/* ALTER ROLE/DATABASE */
+			ereport(WARNING,
+	errcode_for_file_access(),
+	errmsg("could not access file \"%s\"", filename),
+	errdetail("New sessions will currently fail to connect with the new setting."));
+	}
+
+	list_free_deep(elemlist);
+	pfree(rawstring);
+	return true;
+}
+
+static bool
+check_shared_preload_libraries(char **newval, void **extra, GucSource source)
+{
+	return check_preload_libraries(newval, extra, source, true);
+}
+
+static bool
+check_local_preload_libraries(char **newval, void **extra, GucSource source)
+{
+	return check_preload_libraries(newval, extra, source, false);
+}
+
+static bool
+check_session_preload_libraries(char **newval, void **extra, GucSource source)
+{
+	return check_preload_libraries(newval, extra, source, true);
+}
+
 static bool
 check_effective_io_concurrency(int *newval, void **extra, GucSource 

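The expansion rule applied in check_preload_libraries() above — bare names on the restricted lists are resolved under $libdir/plugins — can be sketched stand-alone. This is a hedged illustration only: expand_preload_name() is an invented name, and the server itself uses first_dir_separator() and psprintf() rather than the plain libc calls shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of the expansion rule for the restricted preload lists:
 * a bare library name has "$libdir/plugins/" prepended, while any
 * name containing a directory separator is left as-is.  The caller
 * owns the returned string.
 */
static char *
expand_preload_name(const char *filename, int restricted)
{
	char	   *result;

	if (restricted && strchr(filename, '/') == NULL)
	{
		result = malloc(strlen("$libdir/plugins/") + strlen(filename) + 1);
		sprintf(result, "$libdir/plugins/%s", filename);
	}
	else
	{
		result = malloc(strlen(filename) + 1);
		strcpy(result, filename);
	}
	return result;
}
```

With this rule, a bare name like auto_explain on a restricted list resolves under the plugins directory, while an explicit path is accepted verbatim, which matches what load_libraries() does at actual load time.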
Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-07-21 Thread Masahiko Sawada
On Tue, Jul 19, 2022 at 1:30 PM John Naylor
 wrote:
>
>
>
> On Tue, Jul 19, 2022 at 9:11 AM Masahiko Sawada  wrote:
>
> > I’d like to keep the first version simple. We can improve it and add
> > more optimizations later. Using radix tree for vacuum TID storage
> > would still be a big win compared to using a flat array, even without
> > all these optimizations. In terms of single-value leaves method, I'm
> > also concerned about an extra pointer traversal and extra memory
> > allocation. It's most flexible but multi-value leaves method is also
> > flexible enough for many use cases. Using the single-value method
> > seems to be too much as the first step for me.
> >
> > Overall, using 64-bit keys and 64-bit values would be a reasonable
> > choice for me as the first step.  It can cover wider use cases
> > including vacuum TID use cases. And possibly it can cover use cases by
> > combining a hash table or using tree of tree, for example.
>
> These two aspects would also bring it closer to Andres' prototype, which 1)
> makes review easier and 2) makes it easier to preserve optimization work
> already done, so +1 from me.

Thanks.

I've updated the patch. It now implements 64-bit keys, 64-bit values,
and the multi-value leaves method. I've tried to remove duplicated
code, but we might find a better way to do that.
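For readers following along, the overall shape under discussion — a 64-bit key chopped into fixed-size chunks, with 64-bit values held in multi-value leaves — can be sketched as below. This is a naive fixed-fanout sketch with invented names; it has none of the adaptive node sizes or memory management that the attached patch is actually about, and it never frees memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_BITS	8
#define FANOUT		(1 << CHUNK_BITS)	/* 256 children per node */
#define LEVELS		(64 / CHUNK_BITS)	/* 8 chunks cover a 64-bit key */

/* Internal node: an array of child pointers; the last level holds leaves. */
typedef struct rt_node
{
	void	   *child[FANOUT];
} rt_node;

/* Multi-value leaf: the low byte of the key indexes directly into values[]. */
typedef struct rt_leaf
{
	bool		present[FANOUT];
	uint64_t	values[FANOUT];
} rt_leaf;

/* Extract the level'th 8-bit chunk, most significant first. */
static int
rt_chunk(uint64_t key, int level)
{
	return (key >> (CHUNK_BITS * (LEVELS - 1 - level))) & (FANOUT - 1);
}

static void
rt_set(rt_node **root, uint64_t key, uint64_t value)
{
	void	  **slot = (void **) root;

	for (int level = 0; level < LEVELS - 1; level++)
	{
		if (*slot == NULL)
			*slot = calloc(1, sizeof(rt_node));
		slot = &((rt_node *) *slot)->child[rt_chunk(key, level)];
	}
	if (*slot == NULL)
		*slot = calloc(1, sizeof(rt_leaf));
	((rt_leaf *) *slot)->present[rt_chunk(key, LEVELS - 1)] = true;
	((rt_leaf *) *slot)->values[rt_chunk(key, LEVELS - 1)] = value;
}

static bool
rt_get(rt_node *root, uint64_t key, uint64_t *value)
{
	void	   *node = root;

	for (int level = 0; level < LEVELS - 1; level++)
	{
		if (node == NULL)
			return false;
		node = ((rt_node *) node)->child[rt_chunk(key, level)];
	}
	if (node == NULL || !((rt_leaf *) node)->present[rt_chunk(key, LEVELS - 1)])
		return false;
	*value = ((rt_leaf *) node)->values[rt_chunk(key, LEVELS - 1)];
	return true;
}
```

The multi-value leaf is what distinguishes this layout from the single-value-leaf alternative discussed upthread: keys sharing their upper 56 bits land in one leaf, avoiding a pointer traversal and an allocation per key.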

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


radixtree_v5.patch
Description: Binary data


Re: Windows now has fdatasync()

2022-07-21 Thread Thomas Munro
Hearing no objection, I committed the patch to remove O_FSYNC.  The
next cleanup patch I'll just leave here for now.
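For context, the replacement mentioned in the commit message below boils down to a familiar portability pattern: where the platform lacks fdatasync(), supply one that falls back on a stronger primitive. The sketch below is illustrative only; the function name is invented, and the real src/port/fdatasync.c may do things differently on Windows.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * A hedged sketch of the replacement idea: on a platform with no
 * fdatasync(), flushing data with fsync() is a correct, if stronger
 * than strictly necessary, stand-in: fdatasync() is allowed to skip
 * flushing metadata (e.g. mtime) not needed to retrieve the data,
 * while fsync() flushes everything.
 */
static int
fdatasync_replacement(int fd)
{
	return fsync(fd);
}
```

This is why removing the configure probe is safe: every target either has the POSIX function, or builds the replacement from src/port/.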
From a50957d4c079a3703c9f8074287c7edb5654022c Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Wed, 20 Jul 2022 14:10:37 +1200
Subject: [PATCH v3] Remove fdatasync configure probe.

Every system we target has POSIX fdatasync, except Windows where we
provide a replacement in src/port/fdatasync.c.  We can remove the
configure probe and associated #ifdefs.

We retain the probe for the function declaration, which allows us to
supply the missing declaration for macOS and Windows.

Discussion: https://postgr.es/m/CA%2BhUKGJZJVO%3DiX%2Beb-PXi2_XS9ZRqnn_4URh0NUQOwt6-_51xQ%40mail.gmail.com
---
 configure | 59 +--
 configure.ac  |  3 --
 src/backend/access/transam/xlog.c |  4 --
 src/backend/storage/file/fd.c |  8 
 src/bin/pg_test_fsync/pg_test_fsync.c |  4 --
 src/include/access/xlogdefs.h |  7 +---
 src/include/pg_config.h.in|  3 --
 src/include/port/freebsd.h|  2 -
 src/include/port/win32_port.h |  8 
 src/tools/msvc/Solution.pm|  1 -
 10 files changed, 3 insertions(+), 96 deletions(-)

diff --git a/configure b/configure
index 33a425562f..f13822fb3d 100755
--- a/configure
+++ b/configure
@@ -12315,63 +12315,6 @@ if test "$ac_res" != no; then :
 
 fi
 
-# Solaris:
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing fdatasync" >&5
-$as_echo_n "checking for library containing fdatasync... " >&6; }
-if ${ac_cv_search_fdatasync+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  ac_func_search_save_LIBS=$LIBS
-cat confdefs.h - <<_ACEOF >conftest.$ac_ext
-/* end confdefs.h.  */
-
-/* Override any GCC internal prototype to avoid an error.
-   Use char because int might match the return type of a GCC
-   builtin and then its argument prototype would still apply.  */
-#ifdef __cplusplus
-extern "C"
-#endif
-char fdatasync ();
-int
-main ()
-{
-return fdatasync ();
-  ;
-  return 0;
-}
-_ACEOF
-for ac_lib in '' rt posix4; do
-  if test -z "$ac_lib"; then
-ac_res="none required"
-  else
-ac_res=-l$ac_lib
-LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
-  fi
-  if ac_fn_c_try_link "$LINENO"; then :
-  ac_cv_search_fdatasync=$ac_res
-fi
-rm -f core conftest.err conftest.$ac_objext \
-conftest$ac_exeext
-  if ${ac_cv_search_fdatasync+:} false; then :
-  break
-fi
-done
-if ${ac_cv_search_fdatasync+:} false; then :
-
-else
-  ac_cv_search_fdatasync=no
-fi
-rm conftest.$ac_ext
-LIBS=$ac_func_search_save_LIBS
-fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_fdatasync" >&5
-$as_echo "$ac_cv_search_fdatasync" >&6; }
-ac_res=$ac_cv_search_fdatasync
-if test "$ac_res" != no; then :
-  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
-
-fi
-
 # Cygwin:
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing shmget" >&5
 $as_echo_n "checking for library containing shmget... " >&6; }
@@ -16039,7 +15982,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in backtrace_symbols clock_gettime copyfile fdatasync getifaddrs getpeerucred getrlimit inet_pton kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pthread_is_threaded_np readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink syncfs sync_file_range uselocale wcstombs_l writev
+for ac_func in backtrace_symbols clock_gettime copyfile getifaddrs getpeerucred getrlimit inet_pton kqueue mbstowcs_l memset_s poll posix_fallocate ppoll pthread_is_threaded_np readlink readv setproctitle setproctitle_fast setsid shm_open strchrnul strsignal symlink syncfs sync_file_range uselocale wcstombs_l writev
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.ac b/configure.ac
index 5295cb5806..0b64b35b39 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1253,8 +1253,6 @@ AC_SEARCH_LIBS(getopt_long, [getopt gnugetopt])
 AC_SEARCH_LIBS(shm_open, rt)
 AC_SEARCH_LIBS(shm_unlink, rt)
 AC_SEARCH_LIBS(clock_gettime, [rt posix4])
-# Solaris:
-AC_SEARCH_LIBS(fdatasync, [rt posix4])
 # Cygwin:
 AC_SEARCH_LIBS(shmget, cygipc)
 # *BSD:
@@ -1796,7 +1794,6 @@ AC_CHECK_FUNCS(m4_normalize([
 	backtrace_symbols
 	clock_gettime
 	copyfile
-	fdatasync
 	getifaddrs
 	getpeerucred
 	getrlimit
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1da3b8eb2e..fc972873c8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -168,9 +168,7 @@ const struct config_enum_entry sync_method_options[] = {
 #ifdef HAVE_FSYNC_WRITETHROUGH
 	{"fsync_writethrough", SYNC_METHOD_FSYNC_WRITETHROUGH, false},
 #endif
-#ifdef HAVE_FDATASYNC
 	{"fdatasync", SYNC_METHOD_FDATASYNC, false},
-#endif
 #ifdef O_SYNC
 	{"open_sync", SYNC_METHOD_OPEN, 

Re: Reducing logs produced by TAP tests running pg_regress on crash

2022-07-21 Thread Thomas Munro
On Fri, Jul 22, 2022 at 1:09 PM Michael Paquier  wrote:
> On Thu, Jun 23, 2022 at 02:30:13PM +0900, Michael Paquier wrote:
> > One idea I got to limit the useless output generated is to check the
> > status of the cluster after running the regression test suite as
> > restart_on_crash is disabled by default in Cluster.pm, and avoid any
> > follow-up logic in these tests if the cluster is not running anymore,
> > as of the attached.
>
> So, this is still an open item..
>
> Thomas, any objections about this one?  Checking for the status of the
> node after completing pg_regress still sounds like a good idea to me,
> because as restart_after_crash is disabled we would generate a ton of
> logs coming from regression.diffs for nothing.  On top of that the
> parallel connections make it harder to find which query failed, and the
> logs of the backend provide enough context already on a hard crash.

What if the clue we need to understand why it crashed was in the
regression diffs that we didn't dump?

I wonder if we should move the noise suppression check closer to
pg_regress, so that it works also for the "main" pg_regress run, not
only the one in this new TAP test.  As discussed in this thread,
inconclusively:

https://www.postgresql.org/message-id/flat/CA%2BhUKGL7hxqbadkto7e1FCOLQhuHg%3DwVn_PDZd6fDMbQrrZisA%40mail.gmail.com




Re: Reducing logs produced by TAP tests running pg_regress on crash

2022-07-21 Thread Michael Paquier
On Thu, Jun 23, 2022 at 02:30:13PM +0900, Michael Paquier wrote:
> One idea I got to limit the useless output generated is to check the
> status of the cluster after running the regression test suite as
> restart_after_crash is disabled by default in Cluster.pm, and avoid any
> follow-up logic in these tests if the cluster is not running anymore,
> as of the attached.

So, this is still an open item..

Thomas, any objections about this one?  Checking for the status of the
node after completing pg_regress still sounds like a good idea to me,
because as restart_after_crash is disabled we would generate a ton of
logs coming from regression.diffs for nothing.  On top of that the
parallel connections make it harder to find which query failed, and the
logs of the backend provide enough context already on a hard crash.
--
Michael


signature.asc
Description: PGP signature


Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 13:25:05 +0200, Alvaro Herrera  
wrote in 
> On 2022-Jul-21, Alvaro Herrera wrote:
> 
> > Yeah, I think that would reduce cruft.  I'm not sure this is more
> > against backpatching policy or less, compared to adding a separate
> > GUC just for this bugfix.
> 
> cruft:
> 
> {
> 	{"allow_recovery_tablespaces", PG_POSTMASTER, WAL_RECOVERY,
> 		gettext_noop("Continues recovery after finding invalid database directories."),
> 		gettext_noop("It is possible for tablespace drop to interfere with database creation "
> 					 "so that WAL replay is forced to create fake database directories. "
> 					 "These should have been dropped by the time recovery ends; "
> 					 "but in case they aren't, this option lets recovery continue if they "
> 					 "are present.  Note that these directories must be removed manually afterwards."),
> 		GUC_NOT_IN_SAMPLE
> 	},
> 	&allow_recovery_tablespaces,
> 	false,
> 	NULL, NULL, NULL
> },
> 
> This is not a very good explanation, but I don't know how to make it
> better.

It looks a bit too detailed. I crafted the following..

Recovery can create tentative in-place tablespace directories under
pg_tblspc/. They are expected to be removed by the time recovery
reaches consistency; if any remain, PostgreSQL raises a PANIC-level
error and aborts recovery. Setting allow_recovery_tablespaces to true
causes the system to tolerate such directories during normal
operation. If those directories are still present after consistency is
reached, that implies data loss and metadata inconsistency, and may
cause future tablespace creation to fail.

Though, after writing this, I have come to think that piggy-backing on
allow_in_place_tablespaces might be a bit of a mismatch..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Memory leak fix in psql

2022-07-21 Thread Michael Paquier
On Thu, Jul 21, 2022 at 02:43:03PM +0800, Japin Li wrote:
> I attached a patch for v14 [1] based on master, if you want to apply it,
> please consider reviewing it.

We are talking about a few hundred bytes leaked each time, so this
does not worry me much in the older branches, honestly.
--
Michael


signature.asc
Description: PGP signature


Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Michael Paquier
On Thu, Jul 21, 2022 at 05:39:35PM -0700, Gurjeet Singh wrote:
> One notable side effect of this change is that
> process_session_preload_libraries() is now called _before_ we
> SetProcessingMode(NormalProcessing). Which means any database access
> performed by _PG_init() of an extension will be doing it in
> InitProcessing mode. I'm not sure if that's problematic.

I cannot see a reason why off the top of my head.  The restrictions of
InitProcessing apply to two code paths of bgworkers connecting to a
database, and normal processing is used as a barrier to prevent the
creation of some objects.

> The patch also adds an assertion in pg_parameter_aclcheck() to ensure
> that there's a transaction is in progress before it's called.

+   /* It's pointless to call this function, unless we're in a transaction. */
+   Assert(IsTransactionState());

This can involve extension code, I think that this should be at least
an elog(ERROR) so as we have higher chances of knowing if something
still goes wrong in the wild.

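The Assert-versus-elog trade-off argued for here can be illustrated in isolation. Everything in this sketch is invented (require_transaction(), the boolean stand-in for IsTransactionState(), the fprintf in place of elog); it only shows why a runtime error protects production builds where an assertion would be compiled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool in_transaction = false; /* stand-in for IsTransactionState() */

/*
 * Invented sketch of the trade-off: an assert() catches misuse in
 * development builds but vanishes under NDEBUG, whereas the runtime
 * check below (elog(ERROR) in the server) also guards production
 * builds when extension code reaches this path.
 */
static int
require_transaction(const char *where)
{
	assert(in_transaction);		/* developer builds stop here */
	if (!in_transaction)
	{
		fprintf(stderr, "ERROR: %s called outside a transaction\n", where);
		return -1;				/* the server would elog(ERROR) instead */
	}
	return 0;
}
```

In a cassert build both checks fire; in a production build only the runtime branch remains, which is the point being made about extension code "in the wild".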
> The patch now lets the user connect, throws a warning, and does not crash.
> 
> $ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
> WARNING:  permission denied to set parameter "plperl.on_plperl_init"
> Expanded display is used automatically.
> psql (15beta1)
> Type "help" for help.

I am wondering whether we'd better have a test on this one with a
non-superuser.  Except for a few tests in the unsafe section,
session_preload_libraries has a limited amount of coverage.

+   /*
+    * process any libraries that should be preloaded at backend start (this
+    * can't be done until GUC settings are complete). Note that these libraries
+    * can declare new GUC variables.
+    */
+   process_session_preload_libraries();
There is no point in doing that during bootstrap anyway, no?
--
Michael


signature.asc
Description: PGP signature


Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Michael Paquier
On Thu, Jul 21, 2022 at 07:42:12PM +0200, Alvaro Herrera wrote:
> Thanks.  I was looking at the recently modified REINDEX syntax and
> noticed there another spot for taking an optional name.  I ended up
> reusing OptSchemaName for that, as in the attached patch.  I think
> adding production-specific additional productions is pointless and
> probably bloats the grammar.  So let me +1 your push of the patch you
> posted, just to keep things moving forward, but in addition I propose to
> later rename OptSchemaName to something more generic and use it in these
> three places.

This slightly changes the behavior of the grammar, as CONCURRENTLY
was working on DATABASE as follows:
* On HEAD:
=# reindex database concurrently postgres;
REINDEX
=# reindex (concurrently) database concurrently postgres;
REINDEX
=# reindex (concurrently) database ;
REINDEX
=# reindex (concurrently) database postgres;
REINDEX
=# reindex database concurrently postgres;
REINDEX
=# reindex database concurrently;
ERROR:  42601: syntax error at or near ";"

And actually, even on HEAD, the last case is marked as supported by
the docs but we don't allow it in the parser.  My mistake on this
one.

Now, with the patch, I get:
=# reindex (concurrently) database concurrently;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex (concurrently) database concurrently postgres ;
=# reindex (concurrently) database concurrently postgres;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex (concurrently) database concurrently postgres;
=# reindex (concurrently) database ;
REINDEX
=# reindex (concurrently) database postgres;
REINDEX
=# reindex database concurrently postgres;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex database concurrently postgres;
=# reindex database concurrently;
ERROR:  42601: syntax error at or near "concurrently"

So this indeed has as effect to make possible the use of CONCURRENTLY
for DATABASE and SYSTEM only within the parenthesized grammar.  Seeing
the simplifications this creates, I'd agree with dropping this part of
the grammar.  I think that I would add the following queries to
create_index.sql to test this grammar, as we can rely on this code
path generating an error:
REINDEX (CONCURRENTLY) SYSTEM postgres;
REINDEX (CONCURRENTLY) SYSTEM;
At least it would validate the parsing for DATABASE.

This breaks reindexdb for the database case, because the query
generated in run_reindex_command() adds CONCURRENTLY only *after* the
database name, and we should be careful to generate something 
backward-compatible in this tool, as well.  The fix is simple: you
can add CONCURRENTLY within the section with TABLESPACE and VERBOSE
for a connection >= 14, and use it after the object name for <= 13.
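Concretely, that fix amounts to placing CONCURRENTLY according to the server version when building the query string. The sketch below is a hedged illustration: the function name and plain snprintf() are invented for this example, and reindexdb itself builds the string with PQExpBuffer and quotes identifiers, none of which is shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: pick the CONCURRENTLY placement that each server
 * generation can parse.  v14+ accepts it in the parenthesized option
 * list before the object name; v12-v13 only accept it after the name.
 */
static void
build_reindex_db_query(char *buf, size_t len, int server_version_num,
					   const char *dbname, bool concurrently)
{
	if (!concurrently)
		snprintf(buf, len, "REINDEX DATABASE %s;", dbname);
	else if (server_version_num >= 140000)
		snprintf(buf, len, "REINDEX (CONCURRENTLY) DATABASE %s;", dbname);
	else
		snprintf(buf, len, "REINDEX DATABASE %s CONCURRENTLY;", dbname);
}
```

Emitting the old spelling for pre-14 servers is what keeps the tool backward-compatible once the grammar change lands.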
--
Michael


signature.asc
Description: PGP signature


Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Gurjeet Singh
On Thu, Jul 21, 2022 at 4:35 PM Gurjeet Singh  wrote:
> I like the idea of performing library initialization in
> InitPostgres(), as it performs the first transaction of the
> connection, and because of the libraries' ability to gin up new GUC
> variables that might need special handling, and also if it allows them
> to do database access.
>
> I think anywhere after the 'PostAuthDelay' check in InitPostgres()
> would be a good place to perform process_session_preload_libraries().
> I'm inclined to invoke it as late as possible, before we commit the
> transaction.
>
> As for making set_config_option() throw an error if not in
> transaction, I'm not a big fan of checks  that break the flow, and of
> unrelated code showing up when reading a function. For a casual
> reader, such a check for transaction would make for a jarring
> experience; "why are we checking for active transaction in  the guts
> of guc.c?", they might think. If anything, such an error should be
> thrown from or below pg_parameter_aclcheck().
>
> But I am not sure if it should be exposed as an error. A user
> encountering that error is not at fault. Hence I believe an assertion
> check is more suitable for catching code that invokes
> set_config_option() outside a transaction.

Please see attached the patch that implements the above proposal.

The process_session_preload_libraries() call has been moved to the end
of InitPostgres(), just before we report the backend to
PgBackendStatus and commit the first transaction.

One notable side effect of this change is that
process_session_preload_libraries() is now called _before_ we
SetProcessingMode(NormalProcessing). Which means any database access
performed by _PG_init() of an extension will be doing it in
InitProcessing mode. I'm not sure if that's problematic.

The patch also adds an assertion in pg_parameter_aclcheck() to ensure
that there's a transaction is in progress before it's called.

The patch now lets the user connect, throws a warning, and does not crash.

$ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
WARNING:  permission denied to set parameter "plperl.on_plperl_init"
Expanded display is used automatically.
psql (15beta1)
Type "help" for help.

postgres@B:694512=>

Best regards,
Gurjeet
http://Gurje.et


move_process_session_preload_libraries.patch
Description: Binary data


Re: Expose Parallelism counters planned/execute in pg_stat_statements

2022-07-21 Thread Justin Pryzby
On Thu, Jul 21, 2022 at 06:26:58PM -0400, Anthony Sotolongo wrote:
> Hi all:
> Here's a patch to add planned/executed parallelism counters to
> pg_stat_statements, as a way to follow up on whether queries are being
> planned/executed with parallelism; this can help to understand whether your
> configuration is good or bad, or whether your hardware is sufficient.

+1, I was missing something like this before, but it didn't occur to me to use
PSS:

https://www.postgresql.org/message-id/20200310190142.gb29...@telsasoft.com
> My hope is to answer to questions like these:
>
> . is query (ever? usually?) using parallel paths?
> . is query usefully using parallel paths?
> . what queries are my max_parallel_workers(_per_process) being used for ?
> . Are certain longrunning or frequently running queries which are using
>   parallel paths using all max_parallel_workers and precluding other queries
>   from using parallel query ?  Or, are semi-short queries sometimes precluding
>   longrunning queries from using parallelism, when the long queries would
>   better benefit ?

This patch is storing the number of times the query was planned/executed using
parallelism, but not the number of workers.  Would it make sense to instead
store the *number* of workers launched/planned?  Otherwise, it might be
that a query is consistently planned to use a large number of workers, but then
runs with few.  I'm referring to the fields shown in "explain/analyze".  (Then,
the 2nd field should be renamed to "launched").

 Workers Planned: 2
 Workers Launched: 2
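Tracking totals rather than a per-execution flag, as suggested, might look like the sketch below. All names here are invented for illustration; pg_stat_statements would fold such fields into its existing Counters struct and update them from the executor's instrumentation.

```c
#include <assert.h>
#include <stdint.h>

/* Invented names: a per-query accumulator for parallel-worker stats. */
typedef struct ParallelWorkerCounters
{
	int64_t		workers_planned;	/* sum of "Workers Planned" */
	int64_t		workers_launched;	/* sum of "Workers Launched" */
	int64_t		parallel_execs;		/* executions that got any workers */
} ParallelWorkerCounters;

static void
record_parallel_exec(ParallelWorkerCounters *c, int planned, int launched)
{
	c->workers_planned += planned;
	c->workers_launched += launched;
	if (launched > 0)
		c->parallel_execs++;
}
```

With totals like these, a query that is consistently planned for many workers but runs with few shows up as a large gap between the two sums, which a boolean counter cannot reveal.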

I don't think this is doing the right thing for prepared statements, like
PQprepare()/PQexecPrepared(), or SQL: PREPARE p AS SELECT; EXECUTE p;

Right now, the docs say that it shows the "number of times the statement was
planned to use parallelism", but the planning counter is incremented during
each execution.  PSS already shows "calls" and "plans" separately.  The
documentation doesn't mention prepared statements as a reason why they wouldn't
match, which seems like a deficiency.

This currently doesn't count parallel workers used by utility statements, such
as CREATE INDEX and VACUUM (see max_parallel_maintenance_workers).  If that's
not easy to do, mention that in the docs as a limitation.

You should try to add some test to contrib/pg_stat_statements/sql, or add
parallelism test to an existing test.  Note that the number of parallel workers
launched isn't stable, so you can't test that part..

You modified pgss_store() to take two booleans, but pass "NULL" instead of
"false".  Curiously, of all the compilers in CirrusCI, only MSVC complained.

"planed" is actually spelled "planned", with two enns.

The patch has some leading/trailing whitespace (maybe shown by git log
depending on your configuration).

Please add this patch to the next commitfest.
https://commitfest.postgresql.org/39/

-- 
Justin




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Kyotaro Horiguchi
At Thu, 21 Jul 2022 23:14:57 +1200, Thomas Munro  wrote 
in 
> On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera  
> wrote:
> > On 2022-Jul-20, Alvaro Herrera wrote:
> > > I see the following alternatives:
> > >
> > > 1. not backpatch this fix to 14 and older
> > > 2. use a different GUC; either allow_invalid_pages as previously
> > >suggested, or create a new one just for this purpose
> > > 3. not provide any overriding mechanism in versions 14 and older
> >
> > I've got no opinions on this.  I don't like either 1 or 3, so I'm going
> > to add and backpatch a new GUC allow_recovery_tablespaces as the
> > override mechanism.
> >
> > If others disagree with this choice, please speak up.
> 
> Would it help if we back-patched the allow_in_place_tablespaces stuff?
>  I'm not sure how hard/destabilising that would be, but I could take a
> look tomorrow.

+1.  An additional reason for me is that it is a developer option.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Nathan Bossart
On Thu, Jul 21, 2022 at 07:30:20PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> +StartTransactionCommand();
>>  process_session_preload_libraries();
>> +CommitTransactionCommand();
> 
> Yeah, that way would avoid any questions about changing the order of
> operations, but it seems like a mighty expensive solution: it's
> adding a transaction to each backend start on the off chance that
> (a) session_preload_libraries/local_preload_libraries is nonempty and
> (b) the loaded libraries are going to do anything where it'd matter.
> So that's why I thought of moving the call inside a pre-existing
> transaction.
> 
> If we had to back-patch this into any released versions, I'd agree with
> taking the performance hit in order to reduce the chance of side-effects.
> But I think as long as we only have to do it in v15, it's not too late to
> possibly cause some compatibility issues for extensions.

Yeah, fair point.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Gurjeet Singh
On Thu, Jul 21, 2022 at 3:29 PM Nathan Bossart  wrote:
>
> On Thu, Jul 21, 2022 at 05:44:11PM -0400, Tom Lane wrote:
> > Right.  So there are basically two things we could do about this:
> >
> > 1. set_config_option could decline to call pg_parameter_aclcheck
> > if not IsTransactionState(), instead failing the assignment.
> > This isn't a great answer because it would result in disallowing
> > GUC assignments that users might expect to work.
> >
> > 2. We could move process_session_preload_libraries() to someplace
> > where a transaction is in progress -- probably, relocate it to
> > inside InitPostgres().
> >
> > I'm inclined to think that #2 is a better long-term solution,
> > because it'd allow you to session-preload libraries that expect
> > to be able to do database access during _PG_init.  (Right now
> > that'd fail with the same sort of symptoms seen here.)  But
> > there's no denying that this might have surprising side-effects
> > for extensions that weren't expecting such a change.
> >
> > It could also be reasonable to do both #1 and #2, with the idea
> > that #1 might save us from crashing if there are any other
> > code paths where we can reach that pg_parameter_aclcheck call
> > outside a transaction.
> >
> > Thoughts?
>
> I wrote up a small patch along the same lines as #2 before seeing this
> message.  It simply ensures that process_session_preload_libraries() is
> called within a transaction.  I don't have a strong opinion about doing it
> this way versus moving this call somewhere else as you proposed, but I'd
> agree that #2 is a better long-term solution than #1.  AFAICT
> shared_preload_libraries, even with EXEC_BACKEND, should not have the same
> problem.
>
> I'm not sure whether we should be worried about libraries that are already
> creating transactions in their _PG_init() functions.  Off the top of my
> head, I don't recall seeing anything like that.  Even if it does impact
> some extensions, it doesn't seem like it'd be too much trouble to fix.
>
> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index 8ba1c170f0..fd471d74a3 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -4115,8 +4115,15 @@ PostgresMain(const char *dbname, const char *username)
>  /*
>   * process any libraries that should be preloaded at backend start (this
>   * likewise can't be done until GUC settings are complete)
> + *
> + * If the user provided a setting at session startup for a custom GUC
> + * defined by one of these libraries, we might need syscache access when
> + * evaluating whether she has permission to set it, so do this step within
> + * a transaction.
>   */
> +StartTransactionCommand();
>  process_session_preload_libraries();
> +CommitTransactionCommand();
>
>  /*
>   * Send this backend's cancellation info to the frontend.

(none of the following is your patch's fault)

I don't think that is a good call-site for
process_session_preload_libraries(), because a library being loaded
can declare its own GUCs, hence I believe this should be called at
least before the call to BeginReportingGUCOptions().

If an extension creates a GUC with GUC_REPORT flag, it is violating
expectations. But since the DefineCustomXVariable() stack does not
prevent the callers from doing so, we must still honor the protocol
followed for all params with GUC_REPORT.  And hence the call must happen
before BeginReportingGUCOptions().

I think it'd be a good idea to ban the callers of
DefineCustomXVariable() from declaring their variable GUC_REPORT, to
ensure that only core code can define such variables.
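The proposed ban could be a simple flag check at custom-variable definition time. This sketch is entirely invented: the function name and the numeric flag value are placeholders, not guc.h's actual definitions.

```c
#include <assert.h>

#define GUC_REPORT	0x0010		/* illustrative bit value, not guc.h's */

/*
 * Invented sketch of the proposed check: the DefineCustomXVariable()
 * entry points could reject reserved flags so that only core GUCs may
 * be marked GUC_REPORT.
 */
static int
validate_custom_variable_flags(int flags)
{
	if (flags & GUC_REPORT)
		return -1;				/* the server would ereport(ERROR) here */
	return 0;
}
```

An extension passing GUC_REPORT would then fail loudly at _PG_init() time instead of silently violating the protocol expectations described above.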

Best regards,
Gurjeet
http://Gurje.et




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Gurjeet Singh
On Thu, Jul 21, 2022 at 2:44 PM Tom Lane  wrote:
>
> Gurjeet Singh  writes:
> > While poking at plperl's GUC in an internal discussion, I was able to
> > induce a crash (or an assertion failure in assert-enabled builds) as
> > an unprivileged user.
> > My investigation so far has revealed that the code path for the
> > following condition has never been tested, and because of this, when a
> > user tries to override an SUSET param via PGOPTIONS, Postgres tries to
> > perform a table lookup during process initialization. Because there's
> > no transaction in progress, and because this table is not in the
> > primed caches, we end up with code trying to dereference an
> > uninitialized  CurrentResourceOwner.
>
> Right.  So there are basically two things we could do about this:
>
> 1. set_config_option could decline to call pg_parameter_aclcheck
> if not IsTransactionState(), instead failing the assignment.
> This isn't a great answer because it would result in disallowing
> GUC assignments that users might expect to work.
>
> 2. We could move process_session_preload_libraries() to someplace
> where a transaction is in progress -- probably, relocate it to
> inside InitPostgres().
>
> I'm inclined to think that #2 is a better long-term solution,
> because it'd allow you to session-preload libraries that expect
> to be able to do database access during _PG_init.  (Right now
> that'd fail with the same sort of symptoms seen here.)  But
> there's no denying that this might have surprising side-effects
> for extensions that weren't expecting such a change.
>
> It could also be reasonable to do both #1 and #2, with the idea
> that #1 might save us from crashing if there are any other
> code paths where we can reach that pg_parameter_aclcheck call
> outside a transaction.
>
> Thoughts?

I had debated just wrapping the process_session_preload_libraries()
call with a transaction, like Nathan's patch posted earlier on this
thread does. But I hesitated because of the sensitivity around the
order of operations/calls during process initialization.

I like the idea of performing library initialization in
InitPostgres(), as it performs the first transaction of the
connection, and because of the libraries' ability to gin up new GUC
variables that might need special handling, and also if it allows them
to do database access.

I think anywhere after the 'PostAuthDelay' check in InitPostgres()
would be a good place to perform process_session_preload_libraries().
I'm inclined to invoke it as late as possible, before we commit the
transaction.

As for making set_config_option() throw an error if not in a
transaction, I'm not a big fan of checks that break the flow, or of
unrelated code showing up when reading a function. For a casual
reader, such a check for a transaction would make for a jarring
experience; "why are we checking for an active transaction in the guts
of guc.c?", they might think. If anything, such an error should be
thrown from or below pg_parameter_aclcheck().

But I am not sure if it should be exposed as an error. A user
encountering that error is not at fault. Hence I believe an assertion
check is more suitable for catching code that invokes
set_config_option() outside a transaction.

Best regards,
Gurjeet
http://Gurje.et




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Tom Lane
Nathan Bossart  writes:
> +StartTransactionCommand();
>  process_session_preload_libraries();
> +CommitTransactionCommand();

Yeah, that way would avoid any questions about changing the order of
operations, but it seems like a mighty expensive solution: it's
adding a transaction to each backend start on the off chance that
(a) session_preload_libraries/local_preload_libraries is nonempty and
(b) the loaded libraries are going to do anything where it'd matter.
So that's why I thought of moving the call inside a pre-existing
transaction.

If we had to back-patch this into any released versions, I'd agree with
taking the performance hit in order to reduce the chance of side-effects.
But I think as long as we only have to do it in v15, it's not too late to
possibly cause some compatibility issues for extensions.

regards, tom lane




Re: [PATCH] Log details for client certificate failures

2022-07-21 Thread Jacob Champion
On Wed, Jul 20, 2022 at 3:42 PM Tom Lane  wrote:
> Jacob Champion  writes:
> > I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s,
> > because that seems to be a common case for the check hooks.
>
> Really?  That's almost certainly NOT okay.  As an example, if you
> have a problem with a new value loaded from postgresql.conf during
> SIGHUP processing, throwing ERROR will cause the postmaster to exit.

v4 attempts to fix this by letting the check hooks pass
MCXT_ALLOC_NO_OOM to pg_clean_ascii(). (It's ignored in the frontend,
which just mallocs.)

> I wouldn't be too surprised if there are isolated cases where people
> didn't understand what they were doing and wrote that, but that
> needs to be fixed not emulated.

I might be missing something, but in guc.c at least it appears to be
the rule and not the exception.

Thanks,
--Jacob
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 5318719acb..1394ecaa2b 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -1116,7 +1116,7 @@ prepare_cert_name(char *name)
 
 #undef MAXLEN
 
-   return pg_clean_ascii(truncated);
+   return pg_clean_ascii(truncated, 0);
 }
 
 /*
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 5e8cd770c0..52fb2e52f1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2284,7 +2284,7 @@ retry1:
 */
if (strcmp(nameptr, "application_name") == 0)
{
-   port->application_name = pg_clean_ascii(valptr);
+   port->application_name = pg_clean_ascii(valptr, 0);
}
}
offset = valoffset + strlen(valptr) + 1;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 60400752e5..2f99fe9a6d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -12479,9 +12479,18 @@ assign_maintenance_io_concurrency(int newval, void *extra)
 static bool
 check_application_name(char **newval, void **extra, GucSource source)
 {
+   char   *clean;
+
/* Only allow clean ASCII chars in the application name */
-   *newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
+   clean = pg_clean_ascii(*newval, MCXT_ALLOC_NO_OOM);
+   if (!clean)
+   return false;
+
+   clean = guc_strdup(WARNING, clean);
+   if (!clean)
+   return false;
 
+   *newval = clean;
return true;
 }
 
@@ -12495,9 +12504,18 @@ assign_application_name(const char *newval, void *extra)
 static bool
 check_cluster_name(char **newval, void **extra, GucSource source)
 {
+   char   *clean;
+
/* Only allow clean ASCII chars in the cluster name */
-   *newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
+   clean = pg_clean_ascii(*newval, MCXT_ALLOC_NO_OOM);
+   if (!clean)
+   return false;
+
+   clean = guc_strdup(WARNING, clean);
+   if (!clean)
+   return false;
 
+   *newval = clean;
return true;
 }
 
diff --git a/src/common/string.c b/src/common/string.c
index db15324c62..97b3d45d36 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -62,7 +62,10 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a "\xXX" string
  *
- * Makes a palloc'd copy of the string passed in, which must be '\0'-terminated.
+ * Makes a newly allocated copy of the string passed in, which must be
+ * '\0'-terminated. In the backend, additional alloc_flags may be provided and
+ * will be passed as-is to palloc_extended(); in the frontend, alloc_flags is
+ * ignored and the copy is malloc'd.
  *
  * This function exists specifically to deal with filtering out
  * non-ASCII characters in a few places where the client can provide an almost
@@ -80,23 +83,46 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
  * for now.
  */
 char *
-pg_clean_ascii(const char *str)
+pg_clean_ascii(const char *str, int alloc_flags)
 {
-   StringInfoData buf;
-   const char   *p;
+   size_t  dstlen;
+   char   *dst;
+   const char *p;
+   size_t  i = 0;
 
-   initStringInfo();
+   /* Worst case, each byte can become four bytes, plus a null terminator. */
+   dstlen = strlen(str) * 4 + 1;
+
+#ifdef FRONTEND
+   dst = malloc(dstlen);
+#else
+   dst = palloc_extended(dstlen, alloc_flags);
+#endif
+
+   if (!dst)
+   return NULL;
 
for (p = str; *p != '\0'; p++)
{
+
/* Only allow clean ASCII chars in the string */
if (*p < 32 || *p > 126)
-   appendStringInfo(, 

Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Nathan Bossart
On Thu, Jul 21, 2022 at 05:44:11PM -0400, Tom Lane wrote:
> Right.  So there are basically two things we could do about this:
> 
> 1. set_config_option could decline to call pg_parameter_aclcheck
> if not IsTransactionState(), instead failing the assignment.
> This isn't a great answer because it would result in disallowing
> GUC assignments that users might expect to work.
> 
> 2. We could move process_session_preload_libraries() to someplace
> where a transaction is in progress -- probably, relocate it to
> inside InitPostgres().
> 
> I'm inclined to think that #2 is a better long-term solution,
> because it'd allow you to session-preload libraries that expect
> to be able to do database access during _PG_init.  (Right now
> that'd fail with the same sort of symptoms seen here.)  But
> there's no denying that this might have surprising side-effects
> for extensions that weren't expecting such a change.
> 
> It could also be reasonable to do both #1 and #2, with the idea
> that #1 might save us from crashing if there are any other
> code paths where we can reach that pg_parameter_aclcheck call
> outside a transaction.
> 
> Thoughts?

I wrote up a small patch along the same lines as #2 before seeing this
message.  It simply ensures that process_session_preload_libraries() is
called within a transaction.  I don't have a strong opinion about doing it
this way versus moving this call somewhere else as you proposed, but I'd
agree that #2 is a better long-term solution than #1.  AFAICT
shared_preload_libraries, even with EXEC_BACKEND, should not have the same
problem.

I'm not sure whether we should be worried about libraries that are already
creating transactions in their _PG_init() functions.  Off the top of my
head, I don't recall seeing anything like that.  Even if it does impact
some extensions, it doesn't seem like it'd be too much trouble to fix.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8ba1c170f0..fd471d74a3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4115,8 +4115,15 @@ PostgresMain(const char *dbname, const char *username)
 /*
  * process any libraries that should be preloaded at backend start (this
  * likewise can't be done until GUC settings are complete)
+ *
+ * If the user provided a setting at session startup for a custom GUC
+ * defined by one of these libraries, we might need syscache access when
+ * evaluating whether she has permission to set it, so do this step within
+ * a transaction.
  */
+StartTransactionCommand();
 process_session_preload_libraries();
+CommitTransactionCommand();
 
 /*
  * Send this backend's cancellation info to the frontend.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Expose Parallelism counters planned/execute in pg_stat_statements

2022-07-21 Thread Anthony Sotolongo

Hi all:
Here's a patch to add planned/executed parallelism counters to
pg_stat_statements, as a way to track whether queries are being planned
and executed with parallelism. This can help you understand whether you
have a good or bad configuration, or whether your hardware is sufficient.





We decided to store the number of times parallelism was planned and the
number of times it was executed for each query.



Regards

Anthony
From 99e02378b04d496698147c858c80477659fb34ad Mon Sep 17 00:00:00 2001
From: asotolongo 
Date: Thu, 21 Jul 2022 17:56:59 -0400
Subject: [PATCH v1] Here's a patch to add the counters of parallelism
 planned/executed by statements to pg_stat_statements

---
 contrib/pg_stat_statements/Makefile   |  2 +-
 .../expected/oldextversions.out   | 58 
 .../pg_stat_statements--1.10--1.11.sql| 69 +++
 .../pg_stat_statements/pg_stat_statements.c   | 58 ++--
 .../pg_stat_statements.control|  2 +-
 .../pg_stat_statements/sql/oldextversions.sql |  5 ++
 doc/src/sgml/pgstatstatements.sgml| 18 +
 7 files changed, 203 insertions(+), 9 deletions(-)
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index edc40c8bbf..0afb9060fa 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,7 +6,7 @@ OBJS = \
 	pg_stat_statements.o
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql \
+DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.10--1.11.sql \
 	pg_stat_statements--1.9--1.10.sql pg_stat_statements--1.8--1.9.sql \
 	pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
 	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out
index efb2049ecf..f847127347 100644
--- a/contrib/pg_stat_statements/expected/oldextversions.out
+++ b/contrib/pg_stat_statements/expected/oldextversions.out
@@ -250,4 +250,62 @@ SELECT count(*) > 0 AS has_data FROM pg_stat_statements;
  t
 (1 row)
 
+-- New functions and views for pg_stat_statements in 1.11
+AlTER EXTENSION pg_stat_statements UPDATE TO '1.11';
+\d pg_stat_statements
+  View "public.pg_stat_statements"
+ Column |   Type   | Collation | Nullable | Default 
++--+---+--+-
+ userid | oid  |   |  | 
+ dbid   | oid  |   |  | 
+ toplevel   | boolean  |   |  | 
+ queryid| bigint   |   |  | 
+ query  | text |   |  | 
+ plans  | bigint   |   |  | 
+ total_plan_time| double precision |   |  | 
+ min_plan_time  | double precision |   |  | 
+ max_plan_time  | double precision |   |  | 
+ mean_plan_time | double precision |   |  | 
+ stddev_plan_time   | double precision |   |  | 
+ calls  | bigint   |   |  | 
+ total_exec_time| double precision |   |  | 
+ min_exec_time  | double precision |   |  | 
+ max_exec_time  | double precision |   |  | 
+ mean_exec_time | double precision |   |  | 
+ stddev_exec_time   | double precision |   |  | 
+ rows   | bigint   |   |  | 
+ shared_blks_hit| bigint   |   |  | 
+ shared_blks_read   | bigint   |   |  | 
+ shared_blks_dirtied| bigint   |   |  | 
+ shared_blks_written| bigint   |   |  | 
+ local_blks_hit | bigint   |   |  | 
+ local_blks_read| bigint   |   |  | 
+ local_blks_dirtied | bigint   |   |  | 
+ local_blks_written | bigint   |   |  | 
+ temp_blks_read | bigint   |   |  | 
+ temp_blks_written  | bigint   |   |  | 
+ blk_read_time  | double precision |   |  | 
+ blk_write_time | double precision |   |  | 
+ temp_blk_read_time | double precision |   |  | 
+ temp_blk_write_time| double precision |   |  | 
+ wal_records| bigint   |   |  | 
+ wal_fpi| bigint   |   |  | 
+ wal_bytes   

Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.

2022-07-21 Thread Peter Smith
On Thu, Jul 21, 2022 at 10:11 PM Amit Kapila  wrote:
>
> On Thu, Jul 21, 2022 at 10:03 AM Michael Paquier  wrote:
> >
> > On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote:
> > > Yeah, it is not very clear to me either. I think this won't be
> > > difficult to change one or another way depending on future needs. At
> > > this stage, it appeared to me that bitmask is a better way to
> > > represent this information but if you and other feels using enum is a
> > > better idea then I am fine with that as well.
> >
> > Please don't get me wrong :)
> >
> > I favor a bitmask over an enum here, as you do, with a clean
> > layer for those flags.
> >
>
> Okay, let's see what Peter Smith has to say about this as he was in
> favor of using enum here?
>

I was in favour of enum mostly because I thought the bitmask of an
earlier patch was mis-used; IMO each bit should only be for
representing something as "on/set". So a bit for
SUBSCRIPTION_REL_STATE_READY makes sense, but a bit for
SUBSCRIPTION_REL_STATE_NOT_READY seemed strange/backwards to me. YMMV.

So using a bitmask is fine, except I thought it should be implemented
so that one of the bits is for a "NOT" modifier (IIUC this is kind of
similar to what Michael [1] suggested above?). So "Not READY" would be
(SUBSCRIPTION_REL_STATE_MOD_NOT | SUBSCRIPTION_REL_STATE_READY)

Also, it may be better to add the bit constants for every one of the
current states, even if you are not needing to use all of them just
yet. In fact, I thought this patch probably can implement the fully
capable common function (i.e. capable of multiple keys etc) right now,
so there will be no need to revisit it again in the future.

--
[1] https://www.postgresql.org/message-id/Ytiuj4hLykTvBF46%40paquier.xyz

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS

2022-07-21 Thread Tom Lane
Gurjeet Singh  writes:
> While poking at plperl's GUC in an internal discussion, I was able to
> induce a crash (or an assertion failure in assert-enabled builds) as
> an unprivileged user.
> My investigation so far has revealed that the code path for the
> following condition has never been tested, and because of this, when a
> user tries to override an SUSET param via PGOPTIONS, Postgres tries to
> perform a table lookup during process initialization. Because there's
> no transaction in progress, and because this table is not in the
> primed caches, we end up with code trying to dereference an
> uninitialized CurrentResourceOwner.

Right.  So there are basically two things we could do about this:

1. set_config_option could decline to call pg_parameter_aclcheck
if not IsTransactionState(), instead failing the assignment.
This isn't a great answer because it would result in disallowing
GUC assignments that users might expect to work.

2. We could move process_session_preload_libraries() to someplace
where a transaction is in progress -- probably, relocate it to
inside InitPostgres().

I'm inclined to think that #2 is a better long-term solution,
because it'd allow you to session-preload libraries that expect
to be able to do database access during _PG_init.  (Right now
that'd fail with the same sort of symptoms seen here.)  But
there's no denying that this might have surprising side-effects
for extensions that weren't expecting such a change.

It could also be reasonable to do both #1 and #2, with the idea
that #1 might save us from crashing if there are any other
code paths where we can reach that pg_parameter_aclcheck call
outside a transaction.

Thoughts?

regards, tom lane




Re: System column support for partitioned tables using heap

2022-07-21 Thread Robert Haas
On Tue, Jul 19, 2022 at 11:22 PM Morris de Oryx  wrote:
> It might help if I show a sample insert handling function. The issue is with 
> the line at the end of the top CTE, insert_rows:
>
> returning xmax as inserted_transaction_id),
>
> That's what fails on partitions. Is there an alternative way to test what 
> happened to the row(s)? Here's the full function. I wrote a code generator, 
> so I don't have to hand-code all of these bits for each table+version:

Oh I see. I didn't realize you were using INSERT .. ON CONFLICT
UPDATE, but that makes tons of sense, and I don't see an obvious
alternative to the way you wrote this.

Hmm.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Undocumented Order By vs Target List Volatile Function Behavior

2022-07-21 Thread David G. Johnston
Hey,

This came up today on twitter as a claimed POLA violation:

postgres=# select random(), random() order by random();
   random|   random
-+-
 0.08176638503720679 | 0.08176638503720679
(1 row)

Which was explained long ago by Tom as:

https://www.postgresql.org/message-id/9570.1193941378%40sss.pgh.pa.us

The parser makes it behave equivalent to:

SELECT random() AS foo ORDER BY foo;

Which apparently extends to any column, even aliased ones, that use the
same expression:

postgres=# select random() as foo, random() as foo2 order by foo;
foo |foo2
+
 0.7334292196943459 | 0.7334292196943459
(1 row)

The documentation does say:

"A query using a volatile function will re-evaluate the function at every
row where its value is needed."

https://www.postgresql.org/docs/current/xfunc-volatility.html

That sentence is insufficient to explain why, without the order by, the
system chooses to evaluate random() twice, while with order by it does so
only once.

I propose extending the existing ORDER BY paragraph in the SELECT Command
Reference as follows:

"A limitation of this feature is that an ORDER BY clause applying to the
result of a UNION, INTERSECT, or EXCEPT clause can only specify an output
column name or number, not an expression."

Add:

A side-effect of this feature is that ORDER BY expressions containing
volatile functions will execute the volatile function only once for the
entire row; thus any column expressions using the same function will reuse
the same function result.  By way of example, note the output differences
for the following two queries:

postgres=# select random() as foo, random()*1 as foo2 from
generate_series(1,2) order by foo;
foo |foo2
+
 0.2631492904302788 | 0.2631492904302788
 0.9019166692448664 | 0.9019166692448664
(2 rows)

postgres=# select random() as foo, random() as foo2 from
generate_series(1,2);
foo |foo2
+
 0.7763978178239725 | 0.3569212477832773
 0.7360531822096732 | 0.7028952103643864
(2 rows)

David J.


Re: Handle infinite recursion in logical replication setup

2022-07-21 Thread Jonathan S. Katz

Hi,

On 7/21/22 6:34 AM, vignesh C wrote:

On Thu, Jul 21, 2022 at 2:06 PM Amit Kapila  wrote:


On Wed, Jul 20, 2022 at 2:33 PM vignesh C  wrote:


Modified. Apart from this I have run pgperltidy on the perl file and
renamed 032_origin.pl to 030_origin.pl as currently there is
029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file.
Thanks for the comment, the attached patch has the changes for the same.



Pushed. Kindly rebase the remaining patches.


Thanks for pushing the patch.
The attached v37 version contains the rebased patch for the remaining patches.


Thanks for the work on this feature -- this is definitely very helpful 
towards supporting more types of use cases with logical replication!


I've read through the proposed documentation and did some light testing 
of the patch. I have two general comments about the docs as they 
currently read:


1. I'm concerned that by calling this "Bidirectional replication" in the 
docs we are overstating the current capabilities. I think this is 
accentuated in the opening paragraph:


==snip==
 Bidirectional replication is useful for creating a multi-master database
 environment for replicating read/write operations performed by any of the
 member nodes.
==snip==

For one, we're not replicating reads, we're replicating writes. Amongst 
the writes, at this point we're only replicating DML. A reader could 
think that deploying this can serve as a full bidirectional solution.


(Even if we're aspirationally calling this section "Bidirectional 
replication", that does make it sound like we're limited to two nodes, 
when we can support more than two).


Perhaps "Logical replication between writers" or "Logical replication 
between primaries" or "Replicating changes between primaries", or 
something better.


2. There is no mention of conflicts in the documentation, e.g. 
referencing the "Conflicts" section of the documentation. It's very easy 
to create a conflicting transaction that causes a subscriber to be 
unable to continue to apply transactions:


  -- DB 1
  CREATE TABLE abc (id int);
  CREATE PUBLICATION node1 FOR ALL TABLES ;

  -- DB2
  CREATE TABLE abc (id int);
  CREATE PUBLICATION node2 FOR ALL TABLES ;
  CREATE SUBSCRIPTION node2_node1
CONNECTION 'dbname=logi port=5433'
PUBLICATION node1
WITH (copy_data = off, origin = none);

  -- DB1
  CREATE SUBSCRIPTION node1_node2
CONNECTION 'dbname=logi port=5434'
PUBLICATION node2
WITH (copy_data = off, origin = none);
  INSERT INTO abc VALUES (1);

  -- DB2
  INSERT INTO abc VALUES (2);

  -- DB1
  ALTER TABLE abc ADD PRIMARY KEY (id);
  INSERT INTO abc VALUES (3);

  -- DB2
  INSERT INTO abc VALUES (3);

  -- DB1 cannot apply the transactions

At a minimum, I think we should reference the documentation we have in 
the logical replication section on conflicts. We may also want to advise 
that a user is responsible for designing their schemas in a way to 
minimize the risk of conflicts.


Thanks,

Jonathan




Re: make -C libpq check fails obscurely if tap tests are disabled

2022-07-21 Thread Andrew Dunstan


On 2022-07-21 Th 04:53, Alvaro Herrera wrote:
> On 2022-Jul-20, Andrew Dunstan wrote:
>
>> On 2022-07-20 We 13:23, Justin Pryzby wrote:
>>> PATH=... && @echo "TAP tests not enabled. Try configuring with 
>>> --enable-tap-tests"
>>> /bin/sh: 1: @echo: not found
>>>
>>> make is telling the shell to run "@echo", rather than running "echo" 
>>> silently.
>> Yeah. It's a bit ugly but I think the attached would fix it.
> Here's a different take.  Just assign the variable separately.


Nice, I didn't know you could do that.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Transparent column encryption

2022-07-21 Thread Bruce Momjian
On Mon, Jul 18, 2022 at 12:53:23PM +0200, Peter Eisentraut wrote:
> Asymmetric keys gives you some more options for how you set up the keys at
> the beginning.  For example, you create the asymmetric key pair on the host
> where your client program that wants access to the encrypted data will run.
> You put the private key in an appropriate location for run time.  You send
> the public key to another host.  On that other host, you create the CEK,
> encrypt it with the CMK, and then upload it into the server (CREATE COLUMN
> ENCRYPTION KEY).  Then you can wipe that second host.  That way, you can be
> even more sure that the unencrypted CEK isn't left anywhere.  I'm not sure
> whether this method is very useful in practice, but it's interesting.
> 
> In any case, as I mentioned above, this particular aspect is up for
> discussion.

I caution against adding complexity without a good reason, because
historically complexity often leads to exploits and bugs, especially
with crypto.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Dean Rasheed
On Thu, 21 Jul 2022 at 18:42, Alvaro Herrera  wrote:
>
> Thanks.  I was looking at the recently modified REINDEX syntax and
> noticed there another spot for taking an optional name.  I ended up
> reusing OptSchemaName for that, as in the attached patch.  I think
> adding production-specific additional productions is pointless and
> probably bloats the grammar.  So let me +1 your push of the patch you
> posted, just to keep things moving forward, but in addition I propose to
> later rename OptSchemaName to something more generic and use it in these
> three places.
>

OK, pushed.

Before writing opt_stats_name, I went looking for anything else I
could use, but didn't see anything. The difference between this and
the index case is that statistics objects can be schema-qualified, so
it might be tricky to get something that'll work for all 3 places.

Regards,
Dean




Re: Transparent column encryption

2022-07-21 Thread Jacob Champion
On Mon, Jul 18, 2022 at 9:07 AM Robert Haas  wrote:
> Even there, what can be accomplished with a feature that only encrypts
> individual column values is by nature somewhat limited. If you have a
> text column that, for one row, stores the value 'a', and for some
> other row, stores the entire text of Don Quixote in the original
> Spanish, it is going to be really difficult to keep an adversary who
> can read from the disk from distinguishing those rows. If you want to
> fix that, you're going to need to do block-level encryption or
> something of that sort.

A minimum padding option would fix the leak here, right? If every
entry is the same length then there's no information to be gained, at
least in an offline analysis.

I think some work around that is probably going to be needed for
serious use of this encryption, in part because of the use of text
format as the canonical input. If the encrypted values of 1, 10, 100,
and 1000 hypothetically leaked their exact lengths, then an encrypted
int wouldn't be very useful. So I'd want to quantify (and possibly
configure) exactly how much data you can encrypt in a single message
before the length starts being leaked, and then make sure that my
encrypted values stay inside that bound.

--Jacob




Re: Transparent column encryption

2022-07-21 Thread Jacob Champion
On Mon, Jul 18, 2022 at 3:53 AM Peter Eisentraut
 wrote:
> Some other products make use of secure enclaves to do computations on
> (otherwise) encrypted values on the server.  I don't fully know how that
> works, but I suspect that asymmetric keys can play a role in that.  (I
> don't have any immediate plans for that in my patch.  It seems to be a
> dying technology at the moment.)
>
> Asymmetric keys gives you some more options for how you set up the keys
> at the beginning.  For example, you create the asymmetric key pair on
> the host where your client program that wants access to the encrypted
> data will run.  You put the private key in an appropriate location for
> run time.  You send the public key to another host.  On that other host,
> you create the CEK, encrypt it with the CMK, and then upload it into the
> server (CREATE COLUMN ENCRYPTION KEY).  Then you can wipe that second
> host.  That way, you can be even more sure that the unencrypted CEK
> isn't left anywhere.  I'm not sure whether this method is very useful in
> practice, but it's interesting.

As long as it's clear to people trying this that the "public" key
cannot actually be made public, I suppose. That needs to be documented
IMO. I like your idea of providing a symmetric option as well.

> In any case, as I mentioned above, this particular aspect is up for
> discussion.
>
> Also note that if you use a KMS (cmklookup "run" method), the actual
> algorithm doesn't even matter (depending on details of the KMS setup),
> since you just tell the KMS "decrypt this", and the KMS knows by itself
> what algorithm to use.  Maybe there should be a way to specify "unknown"
> in the ckdcmkalg field.

+1, an officially client-defined method would probably be useful.

> The short answer is, these same algorithms are used in equivalent
> products (see MS SQL Server, MongoDB).  They even reference the same
> exact draft document.
>
> Besides that, here is my analysis for why these are good choices: You
> can't use any of the counter modes, because since the encryption happens
> on the client, there is no way to coordinate to avoid nonce reuse.  So
> among mainstream modes, you are basically left with AES-CBC with a
> random IV.  In that case, even if you happen to reuse an IV, the
> possible damage is very contained.)

I think both AES-GCM-SIV and XChaCha20-Poly1305 are designed to handle
the nonce problem as well. In any case, if I were deploying this, I'd
want to know the characteristics/limits of our chosen suites (e.g. how
much data can be encrypted per key) so that I could plan rotations
correctly. Something like the table in [1]?

> > Since we're requiring "canonical" use of text format, and the docs say
> > there are no embedded or trailing nulls allowed in text values, could we
> > steal the use of a single zero byte to mean NULL? One additional
> > complication would be that the client would have to double-check that
> > we're not writing a NULL into a NOT NULL column, and complain if it
> > reads one during decryption. Another complication would be that the
> > client would need to complain if it got a plaintext NULL.
>
> You're already alluding to some of the complications.  Also consider
> that null values could arise from, say, outer joins.  So you could be in
> a situation where encrypted and unencrypted null values coexist.

(I realize I'm about to wade into the pool of what NULL means in SQL,
the subject of which I've stayed mostly, gleefully, ignorant.)

To be honest that sounds pretty useful. Any unencrypted null must have
come from the server computation; it's a clear claim by the server
that no such rows exist. (If the encrypted column is itself NOT NULL
then there's no ambiguity to begin with, I think.) That wouldn't be
transparent behavior anymore, so it may (understandably) be a non-goal
for the patch, but it really does sound useful.

And it might be a little extreme, but if I as a user decided that I
wanted in-band encrypted null, it wouldn't be particularly surprising
to me if such a column couldn't be included in an outer join. Just
like I can't join on a randomized encrypted column, or add two
encrypted NUMERICs to each other. In fact I might even want the server
to enforce NOT NULL transparently on the underlying pg_encrypted_*
column, to make sure that I didn't accidentally push an unencrypted
NULL by mistake.

> And of
> course the server doesn't know about the encrypted null values.  So how
> do you maintain semantics, like for aggregate functions, primary keys,
> anything that treats null values specially?

Could you elaborate? Any special cases seem like they'd be important
to document regardless of whether or not we support in-band null
encryption. For example, do you plan to support encrypted primary
keys, null or not? That seems like it'd be particularly difficult
during CEK rotation.

> How do clients deal with a
> mix of encrypted and unencrypted null values, how do they know which one
> is real.

That one seems 

Re: Fix gcc warning in sync.c (usr/src/backend/storage/sync/sync.c)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-15, Thomas Munro wrote:

> On Mon, Jul 11, 2022 at 9:45 PM Kyotaro Horiguchi
>  wrote:

> > So, for starters, I compiled the whole tree with -Wshadow=local. and I
> > saw many warnings with it.  At a glance all of them are reasonably
> > "fixed" but I don't think it is what we want...
> 
> Wow, yeah.

Previous threads on this topic:

https://postgr.es/m/mn2pr18mb2927f7b5f690065e1194b258e3...@mn2pr18mb2927.namprd18.prod.outlook.com
https://postgr.es/m/CAApHDvpqBR7u9yzW4yggjG=QfN=fzsc8wo2ckokpqtif-+i...@mail.gmail.com
https://postgr.es/m/877k1psmpf@mailbox.samurai.com

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)




Re: Allow placeholders in ALTER ROLE w/o superuser

2022-07-21 Thread Nathan Bossart
On Wed, Jul 20, 2022 at 11:50:10AM -0400, Tom Lane wrote:
> I think that 13d838815 has completely changed the terms that this
> discussion needs to be conducted under.  It seems clear to me now
> that if you want to relax this only-superusers restriction, what
> you have to do is store the OID of the role that issued ALTER ROLE/DB SET,
> and then apply the same checks that would be used in the ordinary case
> where a placeholder is being filled in after being set intra-session.
> That is, we'd no longer assume that a value coming from pg_db_role_setting
> was set with superuser privileges, but we'd know exactly who did set it.

I was imagining that the permissions checks would apply at ALTER ROLE/DB
SET time, not at login time.  Otherwise, changing a role's privileges might
impact other roles' parameters, and it's not clear (at least to me) what
should happen when the role is dropped.  Another reason I imagined it this
way is because that's basically how it works today.  We assume that the
pg_db_role_setting entry was added by a superuser, but we don't check that
the user that ran ALTER ROLE/DB SET is still superuser every time you log
in.

> This might also tie into Nathan's question in another thread about
> exactly what permissions should be required to issue ALTER ROLE/DB SET.
> In particular I'm wondering if different permissions should be needed to
> override an existing entry than if there is no existing entry.  If not,
> we could find ourselves downgrading a superuser-set entry to a
> non-superuser-set entry, which might have bad consequences later
> (eg, by rendering the entry nonfunctional because when we actually
> load the extension we find out the GUC is SUSET).

Yeah, this is why I suggested storing something that equates to PGC_SUSET
any time a role is superuser or has grantable GUC permissions.

> Possibly related to this: I felt while working on 13d838815 that
> PGC_SUSET and PGC_SU_BACKEND should be usable as GucContext
> values for GUC variables, indicating that the GUC requires special
> privileges to be set, but we should no longer use them as passed-in
> GucContext values.  That is, we should remove privilege tests from
> the call sites, like this:
> 
> (void) set_config_option(stmt->name,
>  ExtractSetVariableArgs(stmt),
> -(superuser() ? PGC_SUSET : PGC_USERSET),
> +PGC_USERSET,
>  PGC_S_SESSION,
>  action, true, 0, false);
> 
> and instead put that behavior inside set_config_option_ext, which
> would want to look at superuser_arg(srole) instead, and indeed might
> not need to do anything because pg_parameter_aclcheck would subsume
> the test.  I didn't pursue this further because it wasn't essential
> to fixing the bug.  But it seems relevant here, because that line of
> thought leads to the conclusion that storing PGC_SUSET vs PGC_USERSET
> is entirely the wrong approach.

Couldn't ProcessGUCArray() use set_config_option_ext() with the context
indicated by pg_db_role_setting?  Also, instead of using PGC_USERSET in all
the set_config_option() call sites, shouldn't we remove the "context"
argument altogether?  I am likely misunderstanding your proposal, but while
I think simplifying set_config_option() is worthwhile, I don't see why it
would preclude storing the context in pg_db_role_setting.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: First draft of the PG 15 release notes

2022-07-21 Thread Bruce Momjian
On Tue, Jul 12, 2022 at 02:47:07PM -0400, Bruce Momjian wrote:
> On Mon, Jul 11, 2022 at 11:31:32PM -0700, Noah Misch wrote:
> > On Mon, Jul 11, 2022 at 12:39:57PM -0400, Bruce Momjian wrote:
> > > I had trouble reading the sentences in the order you used so I
> > > restructured it:
> > > 
> > >   The new default is one of the secure schema usage patterns that <xref
> > >   linkend="ddl-schemas-patterns"/> has recommended since the security
> > >   release for CVE-2018-1058.  The change applies to newly-created
> > >   databases in existing clusters and for new clusters.  Upgrading a
> > >   cluster or restoring a database dump will preserve existing permissions.
> > 
> > I agree with the sentence order change.
> 
> Great.
> 
> > >   For existing databases, especially those having multiple users, consider
> > >   issuing REVOKE to adopt this new default.  For new
> > >   databases having zero need to defend against insider threats, granting
> > >   USAGE permission on their public
> > >   schemas will yield the behavior of prior releases.
> > 
> > s/USAGE/CREATE/ in the last sentence.  Looks good with that change.
> 
> Ah, yes, of course.

Patch applied.  I also adjusted the second paragraph to be more
symmetric.  You can see the results here:

https://momjian.us/pgsql_docs/release-15.html

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Dean Rasheed wrote:

> I tend to agree with Matthias' earlier point about avoiding code
> duplication in the grammar. Without going off and refactoring other
> parts of the grammar not related to this patch, it's still a slightly
> smaller, simpler change, and less code duplication, to do this using a
> new opt_stats_name production in the grammar, as in the attached.
> 
> I also noticed a comment in CreateStatistics() that needed updating.
> 
> Barring any further comments, I'll push this shortly.

Thanks.  I was looking at the recently modified REINDEX syntax and
noticed there another spot for taking an optional name.  I ended up
reusing OptSchemaName for that, as in the attached patch.  I think
adding a separate single-use production for each statement is pointless and
probably bloats the grammar.  So let me +1 your push of the patch you
posted, just to keep things moving forward, but in addition I propose to
later rename OptSchemaName to something more generic and use it in these
three places.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
From 143467419c19aea6a79db46da461aae4d9afabac Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 21 Jul 2022 16:48:51 +0200
Subject: [PATCH] Rework grammar for REINDEX

---
 src/backend/parser/gram.y  | 80 +++---
 src/test/regress/expected/create_index.out |  4 +-
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d649a1b8d1..85ab17ca5a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -560,7 +560,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type 	generic_option_elem alter_generic_option_elem
 %type 	generic_option_list alter_generic_option_list
 
-%type 	reindex_target_type reindex_target_multitable reindex_name_optional
+%type 	reindex_target_type reindex_target_type_multi
 
 %type 	copy_generic_opt_arg copy_generic_opt_arg_list_item
 %type 	copy_generic_opt_elem
@@ -9085,7 +9085,9 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d
  *
  *		QUERY:
  *
- *		REINDEX [ (options) ] type [CONCURRENTLY] 
+ *		REINDEX [ (options) ] {TABLE | INDEX} [CONCURRENTLY] 
+ *		REINDEX [ (options) ] SCHEMA [CONCURRENTLY] 
+ *		REINDEX [ (options) ] {SYSTEM | DATABASE} []
  */
 
 ReindexStmt:
@@ -9102,37 +9104,6 @@ ReindexStmt:
 			makeDefElem("concurrently", NULL, @3));
 	$$ = (Node *) n;
 }
-			| REINDEX reindex_target_multitable opt_concurrently name
-{
-	ReindexStmt *n = makeNode(ReindexStmt);
-
-	n->kind = $2;
-	n->name = $4;
-	n->relation = NULL;
-	n->params = NIL;
-	if ($3)
-		n->params = lappend(n->params,
-			makeDefElem("concurrently", NULL, @3));
-	$$ = (Node *) n;
-}
-			| REINDEX reindex_name_optional
-{
-	ReindexStmt *n = makeNode(ReindexStmt);
-	n->kind = $2;
-	n->name = NULL;
-	n->relation = NULL;
-	n->params = NIL;
-	$$ = (Node *)n;
-}
-			| REINDEX '(' utility_option_list ')' reindex_name_optional
-{
-	ReindexStmt *n = makeNode(ReindexStmt);
-	n->kind = $5;
-	n->name = NULL;
-	n->relation = NULL;
-	n->params = $3;
-	$$ = (Node *)n;
-}
 			| REINDEX '(' utility_option_list ')' reindex_target_type opt_concurrently qualified_name
 {
 	ReindexStmt *n = makeNode(ReindexStmt);
@@ -9146,11 +9117,25 @@ ReindexStmt:
 			makeDefElem("concurrently", NULL, @6));
 	$$ = (Node *) n;
 }
-			| REINDEX '(' utility_option_list ')' reindex_target_multitable opt_concurrently name
+
+			| REINDEX SCHEMA opt_concurrently name
 {
 	ReindexStmt *n = makeNode(ReindexStmt);
 
-	n->kind = $5;
+	n->kind = REINDEX_OBJECT_SCHEMA;
+	n->name = $4;
+	n->relation = NULL;
+	n->params = NIL;
+	if ($3)
+		n->params = lappend(n->params,
+			makeDefElem("concurrently", NULL, @3));
+	$$ = (Node *) n;
+}
+			| REINDEX '(' utility_option_list ')' SCHEMA opt_concurrently name
+{
+	ReindexStmt *n = makeNode(ReindexStmt);
+
+	n->kind = REINDEX_OBJECT_SCHEMA;
 	n->name = $7;
 	n->relation = NULL;
 	n->params = $3;
@@ -9159,18 +9144,31 @@ ReindexStmt:
 			makeDefElem("concurrently", NULL, @6));
 	$$ = (Node *) n;
 }
+			| REINDEX reindex_target_type_multi OptSchemaName
+{
+	ReindexStmt *n = makeNode(ReindexStmt);
+	n->kind = $2;
+	n->name = NULL;
+	n->relation = NULL;
+	n->params = NIL;
+	$$ = (Node *)n;
+}
+			| REINDEX '(' utility_option_list ')' reindex_target_type_multi OptSchemaName
+{
+	ReindexStmt *n = makeNode(ReindexStmt);
+	n->kind = $5;
+	n->name = $6;
+	n->relation = NULL;
+	n->params = $3;
+	$$ = (Node *)n;
+}
 		;
 reindex_target_type:
 			INDEX	{ $$ = 

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher

Am 21.07.22 um 10:41 schrieb Dean Rasheed:


It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html

It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.

Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.
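To make "unbiased across the range" concrete, here is a small self-contained sketch — a toy xorshift generator plus rejection sampling, not PostgreSQL's actual pg_prng code — of how a bounded random function avoids the modulo bias that a naive `rand() % n` suffers from:

```c
#include <assert.h>
#include <stdint.h>

/* Toy xorshift64 generator -- a stand-in for pg_prng, not its real
 * implementation.  The state must be seeded nonzero. */
static uint64_t
toy_next(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *state = x;
}

/*
 * Uniform value in [rmin, rmax] (assumes rmin <= rmax and a span below
 * 2^64).  A plain "toy_next() % span" would favor small values whenever
 * span does not divide 2^64, so raw draws at or above the largest
 * multiple of span are rejected and redrawn first.
 */
static uint64_t
toy_uint64_range(uint64_t *state, uint64_t rmin, uint64_t rmax)
{
	uint64_t span = rmax - rmin + 1;
	uint64_t cutoff = UINT64_MAX - (UINT64_MAX % span);	/* multiple of span */
	uint64_t raw;

	do
		raw = toy_next(state);
	while (raw >= cutoff);

	return rmin + raw % span;
}
```

Because the generator is seeded explicitly, two callers starting from the same state produce the same sequence — the same property that makes setseed() interoperation useful for repeatable regression tests.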



Here is an updated patch that marks the functions VOLATILE PARALLEL 
RESTRICTED and uses pg_prng_uint64_range() rather than rand().

From 26676802f05d00c31e0b2d5997f61734aa421fca Mon Sep 17 00:00:00 2001
From: Martin Kalcher 
Date: Sun, 17 Jul 2022 18:06:04 +0200
Subject: [PATCH] Introduce array_shuffle() and array_sample()

* array_shuffle() shuffles the elements of an array.
* array_sample() chooses n elements from an array at random.

Shuffling of arrays can already be accomplished with SQL
using unnest() and array_agg(order by random()). But that is
very slow compared to the new functions. In addition the new functions
are dimension aware.
---
 doc/src/sgml/func.sgml   |  35 +
 src/backend/utils/adt/arrayfuncs.c   | 189 ++-
 src/include/catalog/pg_proc.dat  |   6 +
 src/test/regress/expected/arrays.out |  60 +
 src/test/regress/sql/arrays.sql  |  17 +++
 5 files changed, 306 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b6783b7ad0..6b96897244 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19395,6 +19395,41 @@ SELECT NULLIF(value, '(none)') ...

   
 
+  
+   
+
+ array_sample
+
+array_sample ( array anyarray, n integer )
+anyarray
+   
+   
+Returns n randomly chosen elements from array.
+The order of the elements in the resulting array is unspecified.
+   
+   
+array_sample(ARRAY[[1,2],[3,4],[5,6]], 2)
+{{5,6},{3,4}}
+   
+  
+
+  
+   
+
+ array_shuffle
+
+array_shuffle ( anyarray )
+anyarray
+   
+   
+Shuffles the first dimension of the array.
+   
+   
+array_shuffle(ARRAY[[1,2],[3,4],[5,6]])
+{{5,6},{3,4},{1,2}}
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index fb167f226a..64da980348 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -34,7 +34,7 @@
 #include "utils/memutils.h"
 #include "utils/selfuncs.h"
 #include "utils/typcache.h"
-
+#include "common/pg_prng.h"
 
 /*
  * GUC parameter
@@ -6872,3 +6872,190 @@ trim_array(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(result);
 }
+
+/*
+ * Produce array with max n random items from given array in random order.
+ *
+ * array: array object to examine (must not be NULL)
+ * n: max number of items
+ * elmtyp, elmlen, elmbyval, elmalign: info for the datatype of the items
+ *
+ * NOTE: it would be cleaner to look up the elmlen/elmbval/elmalign info
+ * from the system catalogs, given the elmtype.  However, the caller is
+ * in a better position to cache this info across multiple uses, or even
+ * to hard-wire values if the element type is hard-wired.
+ */
+static ArrayType *
+array_shuffle_n(ArrayType *array, int n,
+Oid elmtyp, int elmlen, bool elmbyval, char elmalign)
+{
+	ArrayType  *result;
+	int			ndim,
+			   *dims,
+			   *lbs,
+rdims[MAXDIM],
+nelm,
+nitem,
+i,
+j,
+k;
+	Datum		elm,
+			   *elms,
+			   *relms;
+	bool		nul,
+			   *nuls,
+			   *rnuls;
+
+	ndim = ARR_NDIM(array);
+	dims = ARR_DIMS(array);
+	lbs = ARR_LBOUND(array);
+
+	if (ndim < 1 || dims[0] < 1 || n < 1)
+		return construct_empty_array(elmtyp);
+
+	deconstruct_array(array, elmtyp, elmlen, elmbyval, elmalign,
+	  &relms, &rnuls, &nitem);
+
+	/* Calculate number of elements per item. */
+	nelm = (ndim > 1) ? ArrayGetNItems(ndim - 1, dims + 1) : 1;
+	elms = relms;
+	nuls = rnuls;
+	nitem = dims[0];
+	n = Min(n, nitem);
+
+	/*
+	 * Shuffle array using Fisher-Yates algorithm. Swap head with a randomly
+	 * chosen item and increment head.
+	 */
+	for (i = 0; i < n; i++)
+	{
+		k = (int) pg_prng_uint64_range(&pg_global_prng_state, 0, nitem - i - 1) * nelm;
+
+		/* Swap all elements in head with elements in item k. */
+		for (j = 0; j < nelm; j++)
+		{
+			elm = *elms;
+			
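For readers who haven't seen it, the Fisher–Yates scheme the patch's comment names is easy to state outside PostgreSQL. This is a generic sketch over a plain int array; the tiny LCG is only there to keep the example deterministic and dependency-free, it is not what the patch uses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny LCG, only to make the sketch deterministic and self-contained. */
static uint64_t
sketch_rand(uint64_t *state)
{
	*state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
	return *state >> 33;
}

/*
 * Fisher-Yates: for each position i from the front, swap a[i] with a
 * uniformly chosen a[j], j in [i, n-1].  Given a uniform source, every
 * permutation of the input is equally likely (the "% span" here carries
 * negligible bias for a sketch of this size).
 */
static void
sketch_shuffle(int *a, size_t n, uint64_t *state)
{
	for (size_t i = 0; i + 1 < n; i++)
	{
		size_t	j = i + (size_t) (sketch_rand(state) % (n - i));
		int		tmp = a[i];

		a[i] = a[j];
		a[j] = tmp;
	}
}
```

The patch applies the same loop one level up: it swaps whole "items" (slices of nelm consecutive Datums) rather than single elements, which is what makes array_shuffle() dimension aware.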

Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Nathan Bossart
On Thu, Jul 21, 2022 at 01:02:50PM -0400, Tom Lane wrote:
> Robert Haas  writes:
>> ... if
>> we want to regard no-superusers as a supported configuration, we
>> probably need to tighten that up. I think it's kind of hopeless,
> 
> Yeah, I agree.  At least, I'm uninterested in spending any of my
> own time trying to make that usefully-more-secure than it is today.
> If somebody else is interested enough to do the legwork, we can
> look at what they come up with.

Given the current assumptions the code makes about the bootstrap superuser,
I think it makes sense to disallow removing its superuser attribute (at
least via ALTER ROLE NOSUPERUSER).  It seems like there is much work to do
before a no-superuser configuration could be formally supported.  If/when
such support materializes, it might be possible to remove the restriction
that Robert is proposing.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Amit Langote wrote:

> Because I wrote all of it while not really understanding how the LLVM
> constructs like blocks and branches work, the only reason I think
> those llvm_compile_expr() additions may be correct is that all the
> tests in jsonb_sqljson.sql pass even if I add the following line at
> the top:

I suggest to build with --enable-coverage, then run the regression tests
and do "make coverage-html" and see if your code appears covered in the
report.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"How can you trust something that you pay for and cannot see,
yet not trust something that is given to you and shown to you?" (Germán Poo)




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Tom Lane
Robert Haas  writes:
> ... if
> we want to regard no-superusers as a supported configuration, we
> probably need to tighten that up. I think it's kind of hopeless,

Yeah, I agree.  At least, I'm uninterested in spending any of my
own time trying to make that usefully-more-secure than it is today.
If somebody else is interested enough to do the legwork, we can
look at what they come up with.

regards, tom lane




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Robert Haas
On Thu, Jul 21, 2022 at 12:28 PM Tom Lane  wrote:
> True, but what if the idea is to have *no* superusers?  I seem
> to recall people being interested in setups like that.

Hmm, right. There's nothing that stops you from de-super-ing all of
your superusers today, and then if you ever need to do anything as
superuser again, you have to start up in single-user mode, which will
treat your session as super regardless. But considering how much power
the bootstrap user still has, I'm not sure that's really buying you
very much. In particular, the new GRANT ALTER SYSTEM stuff looks
sufficient to allow the bootstrap user to break out to the OS, so if
we want to regard no-superusers as a supported configuration, we
probably need to tighten that up. I think it's kind of hopeless,
though, because of the fact that you can also freely Trojan functions
and operators in pg_catalog. Maybe that's insufficient to break out to
the OS or assume superuser privileges, but you should be able to at
least Trojan every other user on the system.

> On the whole I don't have any objection to your proposal, I just
> worry that somebody else will.

OK, good to know. Thanks.

> Of course there's always "UPDATE pg_authid SET rolsuper = false",
> which makes it absolutely clear that you're breaking the glass cover.

Right.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Tom Lane
"David G. Johnston"  writes:
> On Thu, Jul 21, 2022 at 9:28 AM Tom Lane  wrote:
>> True, but what if the idea is to have *no* superusers?  I seem
>> to recall people being interested in setups like that.

> I would expect an initdb option (once this is possible) to specify this
> desire and we just never set one up in the first place.  It seems
> impractical to remove one after it already exists.

There has to be a role that owns the built-in objects.  Robert's point
is that pretending that that role isn't high-privilege is silly.

regards, tom lane




Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread David G. Johnston
On Thu, Jul 21, 2022 at 9:28 AM Tom Lane  wrote:

> Robert Haas  writes:
> > Currently, it's possible to remove the rolissuper bit from the
> > bootstrap superuser, but this leaves that user - and the system in
> > general - in an odd state. The bootstrap user continues to own all of
> > the objects it owned before, e.g. all of the system catalogs. Direct
> > DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's
> > possible to do things like rename a system function out of the way and
> > create a new function with the same signature. Therefore, creating a
> > new superuser and making the original one a non-superuser is probably
> > not viable from a security perspective, because anyone who gained
> > access to that role would likely have little difficulty mounting a
> > Trojan horse attack against the current superusers.
>
> True, but what if the idea is to have *no* superusers?  I seem
> to recall people being interested in setups like that.
>


> On the whole I don't have any objection to your proposal, I just
> worry that somebody else will.
>
> Of course there's always "UPDATE pg_authid SET rolsuper = false",
> which makes it absolutely clear that you're breaking the glass cover.
>
>
I would expect an initdb option (once this is possible) to specify this
desire and we just never set one up in the first place.  It seems
impractical to remove one after it already exists.  Though we could enable
the option (or a function) tied to the specific predefined role that, say,
permits catalog changes, when that day comes.

David J.


Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Tom Lane
Robert Haas  writes:
> Currently, it's possible to remove the rolissuper bit from the
> bootstrap superuser, but this leaves that user - and the system in
> general - in an odd state. The bootstrap user continues to own all of
> the objects it owned before, e.g. all of the system catalogs. Direct
> DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's
> possible to do things like rename a system function out of the way and
> create a new function with the same signature. Therefore, creating a
> new superuser and making the original one a non-superuser is probably
> not viable from a security perspective, because anyone who gained
> access to that role would likely have little difficulty mounting a
> Trojan horse attack against the current superusers.

True, but what if the idea is to have *no* superusers?  I seem
to recall people being interested in setups like that.

On the whole I don't have any objection to your proposal, I just
worry that somebody else will.

Of course there's always "UPDATE pg_authid SET rolsuper = false",
which makes it absolutely clear that you're breaking the glass cover.

regards, tom lane




Re: Add connection active, idle time to pg_stat_activity

2022-07-21 Thread Sergey Dudoladov
Hello,

I have addressed the reviews.

@Aleksander Alekseev thanks for reporting the issue. I have altered
the patch to respect the behavior of pg_stat_activity, specifically
[1]

> Another important point is that when a server process is asked to display any 
> of these statistics,
> it first fetches the most recent report emitted by the collector process and 
> then continues to use this snapshot
> for all statistical views and functions until the end of its current 
> transaction.
> So the statistics will show static information as long as you continue the 
> current transaction.

For the patch, this means the total_*_time values are not computed in
real time. Here is an example to illustrate the new behavior:

postgres=# begin;

postgres=*# select total_active_time, total_idle_in_transaction_time from
pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time
-------------------+--------------------------------
             0.124 |                      10505.098

postgres=*# select pg_sleep(10);

postgres=*# select total_active_time, total_idle_in_transaction_time
from pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time
-------------------+--------------------------------
             0.124 |                      10505.098

postgres=*# commit;

postgres=# select total_active_time, total_idle_in_transaction_time
from pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time
-------------------+--------------------------------
         10015.796 |                      29322.831
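The arithmetic behind those numbers is simple to sketch in isolation: whenever the backend reports a state change, the elapsed interval is credited to the counter for the state being left, and the clock restarts. The names below are illustrative, not the patch's actual identifiers:

```c
#include <assert.h>
#include <stdint.h>

typedef enum { STATE_ACTIVE, STATE_IDLE_IN_XACT, STATE_OTHER } SketchState;

typedef struct
{
	SketchState	state;
	int64_t		state_start_us;		/* when the current state began */
	int64_t		total_active_us;
	int64_t		total_idle_in_xact_us;
} SketchEntry;

/* Credit the interval spent in the state being left to its counter,
 * then restart the clock for the new state. */
static void
sketch_report_state(SketchEntry *e, SketchState new_state, int64_t now_us)
{
	int64_t	elapsed = now_us - e->state_start_us;

	if (e->state == STATE_ACTIVE)
		e->total_active_us += elapsed;
	else if (e->state == STATE_IDLE_IN_XACT)
		e->total_idle_in_xact_us += elapsed;

	e->state = new_state;
	e->state_start_us = now_us;
}
```

The counters only advance at state transitions, which is consistent with the snapshot behavior shown above: while a transaction is open, the view keeps reporting the totals as of the last snapshot rather than a live figure.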


[1] 
https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-STATS-VIEWS

Regards,
Sergey
From b5298301a3f5223bd78c519ddcddbd1bec9cf000 Mon Sep 17 00:00:00 2001
From: Sergey Dudoladov 
Date: Wed, 20 Apr 2022 23:47:37 +0200
Subject: [PATCH] pg_stat_activity: add 'total_active_time' and
 'total_idle_in_transaction_time'

catversion bump because of the change in the contents of pg_stat_activity

Author: Sergey Dudoladov, based on the initial version by Rafia Sabih

Reviewed-by: Aleksander Alekseev, Bertrand Drouvot, and Atsushi Torikoshi

Discussion: https://www.postgresql.org/message-id/flat/CA%2BFpmFcJF0vwi-SWW0wYO-c-FbhyawLq4tCpRDCJJ8Bq%3Dja-gA%40mail.gmail.com
---
 doc/src/sgml/monitoring.sgml| 20 +
 src/backend/catalog/system_views.sql|  4 ++-
 src/backend/utils/activity/backend_status.c | 33 -
 src/backend/utils/adt/pgstatfuncs.c |  8 -
 src/include/catalog/pg_proc.dat |  6 ++--
 src/include/pgstat.h| 10 ---
 src/include/utils/backend_status.h  | 10 +++
 src/test/regress/expected/rules.out | 12 
 8 files changed, 75 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 7dbbab6f5c..943927fe34 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -979,6 +979,26 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
additional types.
   
  
+
+ 
+  
+   total_active_time double precision
+  
+  
+   Time in milliseconds this backend spent in active and
+   fastpath function call states.
+  
+ 
+
+ 
+  
+   total_idle_in_transaction_time double precision
+  
+  
+   Time in milliseconds this backend spent in idle in transaction
+   and idle in transaction (aborted) states.
+  
+ 
 

   
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f369b1fc14..2ec6ea2304 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -864,7 +864,9 @@ CREATE VIEW pg_stat_activity AS
S.backend_xmin,
 S.query_id,
 S.query,
-S.backend_type
+S.backend_type,
+S.total_active_time,
+S.total_idle_in_transaction_time
 FROM pg_stat_get_activity(NULL) AS S
 LEFT JOIN pg_database AS D ON (S.datid = D.oid)
 LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index c7ed1e6d7a..8fe2929fba 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -336,6 +336,8 @@ pgstat_bestart(void)
 	lbeentry.st_activity_start_timestamp = 0;
 	lbeentry.st_state_start_timestamp = 0;
 	lbeentry.st_xact_start_timestamp = 0;
+	lbeentry.st_total_active_time = 0;
+	lbeentry.st_total_transaction_idle_time = 0;
 	lbeentry.st_databaseid = MyDatabaseId;
 
 	/* We have userid for client-backends, wal-sender and bgworker processes */
@@ -524,6 +526,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	TimestampTz start_timestamp;
 	TimestampTz current_timestamp;
 	int			len = 0;
+	int64		

Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Tomas Vondra
On 7/21/22 16:12, Dean Rasheed wrote:
> On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent
>  wrote:
>>
>> On Wed, 13 Jul 2022 at 08:07, Simon Riggs  
>> wrote:
>>>
 + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
>>
>> I think this is ready for a committer, so I've marked it as such.
>>
> 
> Picking this up...
> 
> I tend to agree with Matthias' earlier point about avoiding code
> duplication in the grammar. Without going off and refactoring other
> parts of the grammar not related to this patch, it's still a slightly
> smaller, simpler change, and less code duplication, to do this using a
> new opt_stats_name production in the grammar, as in the attached.
> 
> I also noticed a comment in CreateStatistics() that needed updating.
> 
> Barring any further comments, I'll push this shortly.
> 

+1


-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER

2022-07-21 Thread Robert Haas
Hi,

Currently, it's possible to remove the rolissuper bit from the
bootstrap superuser, but this leaves that user - and the system in
general - in an odd state. The bootstrap user continues to own all of
the objects it owned before, e.g. all of the system catalogs. Direct
DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's
possible to do things like rename a system function out of the way and
create a new function with the same signature. Therefore, creating a
new superuser and making the original one a non-superuser is probably
not viable from a security perspective, because anyone who gained
access to that role would likely have little difficulty mounting a
Trojan horse attack against the current superusers.

There are other problems, too. (1) pg_parameter_acl entries are
considered to be owned by the bootstrap superuser, so while the
bootstrap user loses the ability to directly ALTER SYSTEM SET
archive_command, they can still grant that ability to some other user
(possibly one they've just created, if they still have CREATEROLE)
which pretty much gives the whole show away. (2) When a trusted
extension is created, the extension objects are documented as ending
up owned by the bootstrap superuser, and the bootstrap user will end
up owning them even if they are no longer super. (3) Range
constructors end up getting owned by the bootstrap user, too. I
haven't really tried to verify whether ownership of trusted extension
objects or range constructors would allow the bootstrap
not-a-superuser to escalate back to superuser, but it seems fairly
likely. I believe these object ownership assignments were made with
the idea that the bootstrap user would always be a superuser.

pg_upgrade refers to the "install user" rather than the bootstrap
superuser, but it's talking about the same thing. If you've made the
bootstrap user non-super, pg_upgrade will fail. It is only able to
connect as the bootstrap user, and it must connect as superuser or it
can't do the things it needs to do.

All in all, it seems to me that various parts of the system are built
around the assumption that you will not try to execute ALTER ROLE
bootstrap_superuser NOSUPERUSER. I suggest that we formally prohibit
that, as per the attached patch. Otherwise, I suppose we need to
prevent privilege escalation attacks from a bootstrap ex-superuser,
which seems fairly impractical and a poor use of engineering
resources. Or I suppose we could continue with the present state of
affairs where our code and documentation assume you won't do that but
nothing actually stops you from doing it, but that doesn't seem to
have much to recommend it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v1-0001-Do-not-allow-removal-of-superuser-privileges-from.patch
Description: Binary data


Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Simon Riggs
On Thu, 21 Jul 2022 at 15:12, Dean Rasheed  wrote:
>
> On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent
>  wrote:
> >
> > On Wed, 13 Jul 2022 at 08:07, Simon Riggs  
> > wrote:
> > >
> > > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
> >
> > I think this is ready for a committer, so I've marked it as such.
> >
>
> Picking this up...
>
> I tend to agree with Matthias' earlier point about avoiding code
> duplication in the grammar. Without going off and refactoring other
> parts of the grammar not related to this patch, it's still a slightly
> smaller, simpler change, and less code duplication, to do this using a
> new opt_stats_name production in the grammar, as in the attached.
>
> I also noticed a comment in CreateStatistics() that needed updating.
>
> Barring any further comments, I'll push this shortly.

Nice change, please proceed.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
LGTM

On Thu, Jul 21, 2022 at 11:52 PM Aleksander Alekseev
 wrote:
>
> Hi Junwang,
>
> > btw, there are some typos in Patch v5, %s/ralation/relation/g
>
> D'oh!
>
> > yeah, IMHO validate_relation_kind() is better ;)
>
> Cool. Here is the corrected patch. Thanks!
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Junwang,

> btw, there are some typos in Patch v5, %s/ralation/relation/g

D'oh!

> yeah, IMHO validate_relation_kind() is better ;)

Cool. Here is the corrected patch. Thanks!

-- 
Best regards,
Aleksander Alekseev


v6-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher

Am 21.07.22 um 14:25 schrieb Dean Rasheed:


I'm inclined to say that we want a new pg_global_prng_user_state that
is updated by setseed(), and used by random(), array_shuffle(),
array_sample(), and any other user-facing random functions we add
later.



I like the idea. How would you organize the code? I imagine some sort of 
user prng that is encapsulated in something like utils/adt/random.c and 
is guaranteed to always be seeded.


Martin




Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Tom Lane
Fujii Masao  writes:
> I found that fetch_more_data_begin() in postgres_fdw reports an error when 
> PQsendQuery() returns the value less than 0 as follows though PQsendQuery() 
> can return only 1 or 0. I think this is  a bug. Attached is the patch that 
> fixes this bug. This needs to be back-ported to v14 where async execution was 
> supported in postgres_fdw.

Yup, clearly a thinko.

regards, tom lane




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Amit Langote
On Thu, Jul 21, 2022 at 11:55 PM Amit Langote wrote:
> On Wed, Jul 20, 2022 at 11:09 PM Amit Langote wrote:
> > On Wed, Jul 20, 2022 at 12:37 AM Andres Freund wrote:
> > > On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
> > > > About that, I was wondering if the blocks in llvm_compile_expr() need
> > > > to be hand-coded to match what's added in ExecInterpExpr() or if I've
> > > > missed some tool that can be used instead?
> > >
> > > The easiest way is to just call an external function for the implementation of
> > > the step. But yes, otherwise you need to handcraft it.
> >
> > Ok, thanks.
> >
> > So I started updating llvm_compile_expr() for handling the new
> > ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
> > realized that code could have been consolidated into less code, or
> > IOW, into fewer new ExprEvalSteps.  So, I refactored things that way
> > and am now retrying adding the code to llvm_compile_expr() based on
> > new, better consolidated, code.
> >
> > Here's the updated version, without the llvm pieces, in case you'd
> > like to look at it even in this state.  I'll post a version with llvm
> > pieces filled in tomorrow.   (I have merged the different patches into
> > one for convenience.)
>
> And here's a version with llvm pieces filled in.
>
> Because I wrote all of it while not really understanding how the LLVM
> constructs like blocks and branches work, the only reason I think
> those llvm_compile_expr() additions may be correct is that all the
> tests in jsonb_sqljson.sql pass even if I add the following line at
> the top:
>
> set jit_above_cost to 0;

Oh and I did build --with-llvm. :-)


-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
btw, there are some typos in Patch v5, %s/ralation/relation/g

On Thu, Jul 21, 2022 at 10:05 PM Aleksander Alekseev
 wrote:
>
> Hi Amit,
>
> > Yeah, that's better. On again thinking about the function name, I
> > wonder if validate_relation_type() suits here as there is no generic
> > object being passed?
>
> Yep, validate_relation_type() sounds better.
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




Re: shared-memory based stats collector - v70

2022-07-21 Thread Greg Stark
On Wed, 20 Jul 2022 at 15:09, Andres Freund  wrote:
>
> Each backend only had stats for things it touched. But the stats collector 
> read all files at startup into hash tables and absorbed all generated stats 
> into those as well.

Fascinating. I'm surprised this didn't raise issues previously for
people with millions of tables. I wonder if it wasn't causing issues
and we just didn't hear about them because there were other bigger
issues :)


> >All that said -- having all objects loaded in shared memory makes my
> >work way easier.
>
> What are your trying to do?

I'm trying to implement an exporter for prometheus/openmetrics/etc
that dumps directly from shared memory without going through the SQL
backend layer. I believe this will be much more reliable, lower
overhead, safer, and consistent than writing SQL queries.

Ideally I would want to dump out the stats without connecting to each
database. I suspect that would run into problems where the schema
really adds a lot of information (such as which table each index is on,
or which table a toast relation is for). There are also some things
people think of as stats that are maintained in the catalog, such as
reltuples and relpages. So I'm imagining this won't strictly stay true
in the end.

It seems like just having an interface to iterate over the shared hash
table and return entries one by one without filtering by database
would be fairly straightforward, and I would be able to do most of what
I want just with that. There's actually enough meta information in the
stats entries to be able to handle them as they come, instead of trying
to look up specific stats one by one.


-- 
greg




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Junwang Zhao
yeah, IMHO validate_relation_kind() is better ;)

On Thu, Jul 21, 2022 at 10:21 PM Aleksander Alekseev
 wrote:
>
> Hi Amit,
>
> > Yep, validate_relation_type() sounds better.
>
> Or maybe validate_relation_kind() after all?
>
> --
> Best regards,
> Aleksander Alekseev



-- 
Regards
Junwang Zhao




Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size

2022-07-21 Thread Amit Langote
On Wed, Jul 20, 2022 at 11:09 PM Amit Langote  wrote:
> On Wed, Jul 20, 2022 at 12:37 AM Andres Freund  wrote:
> > On 2022-07-19 20:40:11 +0900, Amit Langote wrote:
> > > About that, I was wondering if the blocks in llvm_compile_expr() need
> > > to be hand-coded to match what's added in ExecInterpExpr() or if I've
> > > missed some tool that can be used instead?
> >
> > The easiest way is to just call an external function for the implementation 
> > of
> > the step. But yes, otherwise you need to handcraft it.
>
> Ok, thanks.
>
> So I started updating llvm_compile_expr() for handling the new
> ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly
> realized that code could have been consolidated into less code, or
> IOW, into fewer new ExprEvalSteps.  So, I refactored things that way
> and am now retrying adding the code to llvm_compile_expr() based on
> new, better consolidated, code.
>
> Here's the updated version, without the llvm pieces, in case you'd
> like to look at it even in this state.  I'll post a version with llvm
> pieces filled in tomorrow.   (I have merged the different patches into
> one for convenience.)

And here's a version with llvm pieces filled in.

Because I wrote all of it while not really understanding how the LLVM
constructs like blocks and branches work, the only reason I think
those llvm_compile_expr() additions may be correct is that all the
tests in jsonb_sqljson.sql pass even if I add the following line at
the top:

set jit_above_cost to 0;

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v4-0001-Some-improvements-to-JsonExpr-evaluation.patch
Description: Binary data


Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Japin Li


On Thu, 21 Jul 2022 at 22:22, Fujii Masao  wrote:
> Hi,
>
> I found that fetch_more_data_begin() in postgres_fdw reports an error when 
> PQsendQuery() returns the value less than 0 as follows though PQsendQuery() 
> can return only 1 or 0. I think this is  a bug. Attached is the patch that 
> fixes this bug. This needs to be back-ported to v14 where async execution was 
> supported in postgres_fdw.
>
>   if (PQsendQuery(fsstate->conn, sql) < 0)
>   pgfdw_report_error(ERROR, NULL, fsstate->conn, false, 
> fsstate->query);
>
> Regards,

+1.  However, I think checking whether the result equals 0 or 1 might be better.
Anyway, the patch works correctly.


$ grep 'PQsendQuery(' -rn . --include '*.c'
./contrib/postgres_fdw/postgres_fdw.c:7073:  if (PQsendQuery(fsstate->conn, sql) < 0)
./contrib/postgres_fdw/connection.c:647:if (!PQsendQuery(conn, sql))
./contrib/postgres_fdw/connection.c:782:if (!PQsendQuery(conn, query))
./contrib/postgres_fdw/connection.c:1347:   if (!PQsendQuery(conn, query))
./contrib/postgres_fdw/connection.c:1575:   if (PQsendQuery(entry->conn, "DEALLOCATE ALL"))
./contrib/dblink/dblink.c:720:  retval = PQsendQuery(conn, sql);
./contrib/dblink/dblink.c:1146: if (!PQsendQuery(conn, sql))
./src/test/isolation/isolationtester.c:669: if (!PQsendQuery(conn, step->sql))
./src/test/modules/libpq_pipeline/libpq_pipeline.c:500: if (PQsendQuery(conn, "SELECT 1; SELECT 2") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:532: if (PQsendQuery(conn, "SELECT 1.0/g FROM generate_series(3, -1, -1) g") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1000:if (PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1046:if (PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1084:if (PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1094:if (PQsendQuery(conn, "SELECT 2") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1118:if (PQsendQuery(conn, "SELECT 1") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1132:if (PQsendQuery(conn, "SELECT 2") != 1)
./src/test/modules/libpq_pipeline/libpq_pipeline.c:1159:if (PQsendQuery(conn, "SELECT pg_catalog.pg_advisory_unlock(1,1)") != 1)
./src/bin/pg_basebackup/pg_basebackup.c:1921:   if (PQsendQuery(conn, basebkp) == 0)
./src/bin/pg_amcheck/pg_amcheck.c:891:  if (PQsendQuery(slot->connection, sql) == 0)
./src/bin/psql/common.c:1451:   success = PQsendQuery(pset.db, query);
./src/bin/scripts/reindexdb.c:551:  status = PQsendQuery(conn, sql.data) == 1;
./src/bin/scripts/vacuumdb.c:947:   status = PQsendQuery(conn, sql) == 1;
./src/bin/pgbench/pgbench.c:3089:   r = PQsendQuery(st->con, sql);
./src/bin/pgbench/pgbench.c:4012:   if (!PQsendQuery(st->con, "ROLLBACK"))
./src/backend/replication/libpqwalreceiver/libpqwalreceiver.c:663:  if (!PQsendQuery(streamConn, query))
./src/interfaces/libpq/fe-exec.c:1421:PQsendQuery(PGconn *conn, const char *query)
./src/interfaces/libpq/fe-exec.c:2319:  if (!PQsendQuery(conn, query))

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Custom tuplesorts for extensions

2022-07-21 Thread Alexander Korotkov
Hi, John!

On Thu, Jul 21, 2022 at 6:44 AM John Naylor
 wrote:
> On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov  
> wrote:
> > There are some places, which potentially could cause a slowdown.  I'm
> > going to make some experiments with that.
>
> I haven't looked at the patches, so I don't know of a specific place to look 
> for a slowdown, but I thought it might help to perform the same query tests 
> as my most recent test for evaluating qsort variants (some description in 
> [1]), and here is the spreadsheet. Overall, the differences look like noise. 
> A few cases with unabbreviatable text look a bit faster with the patch. I'm 
> not sure if that's a real difference, but in any case I don't see a slowdown 
> anywhere.
>
> [1] 
> https://www.postgresql.org/message-id/CAFBsxsHeTACMP1JVQ%2Bm35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ%40mail.gmail.com

Great, thank you very much for the feedback!

--
Regards,
Alexander Korotkov




postgres_fdw: Fix bug in checking of return value of PQsendQuery().

2022-07-21 Thread Fujii Masao

Hi,

I found that fetch_more_data_begin() in postgres_fdw reports an error when
PQsendQuery() returns a value less than 0, as follows, though PQsendQuery() can
return only 1 or 0. I think this is a bug. Attached is the patch that fixes
this bug. This needs to be back-ported to v14, where async execution was
supported in postgres_fdw.

	if (PQsendQuery(fsstate->conn, sql) < 0)
		pgfdw_report_error(ERROR, NULL, fsstate->conn, false, fsstate->query);

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

From 538f33a17f3623cec54768a7325d64f8e97abe06 Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Thu, 21 Jul 2022 22:52:50 +0900
Subject: [PATCH] postgres_fdw: Fix bug in checking of return value of
 PQsendQuery().

When postgres_fdw begins an asynchronous data fetch, it submits FETCH query
by using PQsendQuery(). If PQsendQuery() fails and returns 0, postgres_fdw
should report an error. But, previously, postgres_fdw reported an error
only when the return value is less than 0, though PQsendQuery() never return
the values other than 0 and 1. Therefore postgres_fdw could not handle
the failure to send FETCH query in an asynchronous data fetch.

This commit fixes postgres_fdw so that it reports an error
when PQsendQuery() returns 0.

Back-patch to v14 where asynchronous execution was supported in postgres_fdw.

Author: Fujii Masao
Reviewed-by:
Discussion: https://postgr.es/m/
---
 contrib/postgres_fdw/postgres_fdw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c 
b/contrib/postgres_fdw/postgres_fdw.c
index f3b93954ee..048db542d3 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -7070,7 +7070,7 @@ fetch_more_data_begin(AsyncRequest *areq)
snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
 fsstate->fetch_size, fsstate->cursor_number);
 
-   if (PQsendQuery(fsstate->conn, sql) < 0)
+   if (!PQsendQuery(fsstate->conn, sql))
pgfdw_report_error(ERROR, NULL, fsstate->conn, false, 
fsstate->query);
 
/* Remember that the request is in process */
-- 
2.36.0



Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Amit,

> Yep, validate_relation_type() sounds better.

Or maybe validate_relation_kind() after all?

-- 
Best regards,
Aleksander Alekseev




Re: Make name optional in CREATE STATISTICS

2022-07-21 Thread Dean Rasheed
On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent
 wrote:
>
> On Wed, 13 Jul 2022 at 08:07, Simon Riggs  
> wrote:
> >
> > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ]
>
> I think this is ready for a committer, so I've marked it as such.
>

Picking this up...

I tend to agree with Matthias' earlier point about avoiding code
duplication in the grammar. Without going off and refactoring other
parts of the grammar not related to this patch, it's still a slightly
smaller, simpler change, and less code duplication, to do this using a
new opt_stats_name production in the grammar, as in the attached.

I also noticed a comment in CreateStatistics() that needed updating.

Barring any further comments, I'll push this shortly.

Regards,
Dean
From 8963355b2d8451be8f71a3bd2890e99e31f7d3ff Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Thu, 21 Jul 2022 14:48:28 +0100
Subject: [PATCH] Make the name optional in CREATE STATISTICS.

This allows users to omit the statistics name in a CREATE STATISTICS
command, letting the system auto-generate a sensible, unique name,
putting the statistics object in the same schema as the table.

Simon Riggs, reviewed by Matthias van de Meent.

Discussion: https://postgr.es/m/canbhv-fgd2d_c3zftft2arfx_tapsgoekes58rlzx5xzqp5...@mail.gmail.com
---
 doc/src/sgml/ref/create_statistics.sgml | 12 ++--
 src/backend/commands/statscmds.c|  7 +-
 src/backend/parser/gram.y   | 13 +++-
 src/test/regress/expected/stats_ext.out | 87 +++--
 src/test/regress/sql/stats_ext.sql  | 20 --
 5 files changed, 86 insertions(+), 53 deletions(-)

diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 9a8c904c08..b847944f37 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -21,11 +21,11 @@ PostgreSQL documentation
 
  
 
-CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
+CREATE STATISTICS [ [ IF NOT EXISTS ] statistics_name ]
 ON ( expression )
 FROM table_name
 
-CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
+CREATE STATISTICS [ [ IF NOT EXISTS ] statistics_name ]
 [ ( statistics_kind [, ... ] ) ]
 ON { column_name | ( expression ) }, { column_name | ( expression ) } [, ...]
 FROM table_name
@@ -60,8 +60,8 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
If a schema name is given (for example, CREATE STATISTICS
myschema.mystat ...) then the statistics object is created in the
specified schema.  Otherwise it is created in the current schema.
-   The name of the statistics object must be distinct from the name of any
-   other statistics object in the same schema.
+   If given, the name of the statistics object must be distinct from the name
+   of any other statistics object in the same schema.
   
  
 
@@ -78,6 +78,7 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
   exists.  A notice is issued in this case.  Note that only the name of
   the statistics object is considered here, not the details of its
   definition.
+  Statistics name is required when IF NOT EXISTS is specified.
  
 

@@ -88,6 +89,9 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na
  
   The name (optionally schema-qualified) of the statistics object to be
   created.
+  If the name is omitted, PostgreSQL chooses a
+  suitable name based on the parent table's name and the defined column
+  name(s) and/or expression(s).
  
 

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index cd5e2f2b6b..415016969d 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -155,10 +155,9 @@ CreateStatistics(CreateStatsStmt *stmt)
 
 	/*
 	 * If the node has a name, split it up and determine creation namespace.
-	 * If not (a possibility not considered by the grammar, but one which can
-	 * occur via the "CREATE TABLE ... (LIKE)" command), then we put the
-	 * object in the same namespace as the relation, and cons up a name for
-	 * it.
+	 * If not, put the object in the same namespace as the relation, and cons
+	 * up a name for it.  (This can happen either via "CREATE STATISTICS ..."
+	 * or via "CREATE TABLE ... (LIKE)".)
 	 */
 	if (stmt->defnames)
 		namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames,
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d649a1b8d1..0a874a04aa 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -434,7 +434,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 old_aggr_definition old_aggr_list
 oper_argtypes RuleActionList RuleActionMulti
 opt_column_list columnList opt_name_list
-sort_clause opt_sort_clause sortby_list index_params stats_params
+sort_clause opt_sort_clause sortby_list index_params
+opt_stats_name stats_params
 opt_include opt_c_include 

Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Amit,

> Yeah, that's better. On again thinking about the function name, I
> wonder if validate_relation_type() suits here as there is no generic
> object being passed?

Yep, validate_relation_type() sounds better.

-- 
Best regards,
Aleksander Alekseev


v5-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: System catalog documentation chapter

2022-07-21 Thread Bruce Momjian
On Wed, Jul 20, 2022 at 09:19:17PM -0400, Bruce Momjian wrote:
> On Wed, Jul 20, 2022 at 04:32:46PM -0400, Bruce Momjian wrote:
> > On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote:
> > > Will there be a comprehensive list somewhere? Even if it just lists the 
> > > views,
> > > gives maybe a one-sentence description, and links to the relevant 
> > > chapter, I
> > > would find that helpful sometimes.
> > 
> > I was not planning on that since we don't do that in any other cases I
> > can think of.
> > 
> > > I ask because I occasionally find myself wanting a comprehensive list of
> > > functions, and as far as I can tell it doesn't exist. I'm hoping to avoid 
> > > that
> > > situation for views.
> > 
> > Well, then we just leave them where the are and link to them from other
> > parts of the documentation, which I assume/hope we already do.
> 
> People have mentioned the view documentation doesn't belong in the
> Internals part.  Maybe we can just move it to the Server
> Administration part and leave it together.

Thinking some more about this, I wonder if we should distinguish system
views that are needed for a task vs those used for reporting.  For
example, pg_stat_activity is a dynamic view and is needed for
monitoring.  pg_prepared_statements just reports the prepared
statements.

Could it be that over time, we have moved the "needed for a task" views
into the relevant sections, and the reporting views have just stayed as
a group, and that is okay --- maybe they just need to be moved to Server
Administration?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 6:12 PM Aleksander Alekseev
 wrote:
>
> Hi Alvaro,
>
> > Hmm, but see commit 2ed532ee8c47 about this kind of check.  Perhaps we
> > should change these error messages to conform to the same message style.
>
> Good point! Done.
>

Yeah, that's better. Thinking again about the function name, I
wonder if validate_relation_type() suits here as there is no generic
object being passed?

-- 
With Regards,
Amit Kapila.




Re: Remove fls(), use pg_bitutils.h facilities instead?

2022-07-21 Thread Tom Lane
Thomas Munro  writes:
> On Thu, Jul 21, 2022 at 1:34 AM Tom Lane  wrote:
>> How is it sane to ask for a segment bin for zero pages?  Seems like
>> something should have short-circuited such a case well before here.

> It's intended.  There are two ways you can arrive here with n == 0:

OK.

>> We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t
>> as the appropriate one of pg_leftmost_one_pos32/64, perhaps.

> Yeah.

Patches look good to me.

regards, tom lane




Re: SLRUs in the main buffer pool, redux

2022-07-21 Thread Yura Sokolov
Good day, Thomas

В Пт, 27/05/2022 в 23:24 +1200, Thomas Munro пишет:
> Rebased, debugged and fleshed out a tiny bit more, but still with
> plenty of TODO notes and questions.  I will talk about this idea at
> PGCon, so I figured it'd help to have a patch that actually applies,
> even if it doesn't work quite right yet.  It's quite a large patch but
> that's partly because it removes a lot of lines...

Looks like it has to be rebased again.






Re: Improve description of XLOG_RUNNING_XACTS

2022-07-21 Thread Ashutosh Bapat
Hi

On Thu, Jul 21, 2022 at 6:44 AM Masahiko Sawada  wrote:
>
> Hi,
>
> I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> describe subtransaction XIDs. I've attached the patch to improve the
> description. Here is an example from pg_waldump:
>
> * HEAD
> rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
> 0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
> latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048
>
> * w/ patch
> rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
> 0/01D0C608, prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
> latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048; 1
> subxacts: 1049
>

I think this is a good addition to debugging info. +1

If we are going to add 64 subxid numbers then it would help if we
could be more verbose and print "subxid overflowed" instead of "subxid
ovf".

-- 
Best Wishes,
Ashutosh Bapat




Re: MultiXact\SLRU buffers configuration

2022-07-21 Thread Yura Sokolov
Good day, all.

I did a benchmark of the patch on a 2-socket Xeon 5220 CPU @ 2.20GHz.
I used the "benchmark" we use to reproduce problems with SLRU on our
customers' setups.
In contrast to Shawn's tests, I concentrated on the bad case: a lot
of contention.
slru-funcs.sql - function definitions
  - functions create a lot of subtransactions to stress subtrans
  - and select random rows for share to stress multixacts

slru-call.sql - function call for benchmark

slru-ballast.sql - randomly select 1000 consequent rows
"for update skip locked" to stress multixacts

patch1 - make SLRU buffers configurable
patch2 - make "8-associative banks"

Benchmark done by pgbench.
Inited with scale 1 to induce contention.
pgbench -i -s 1 testdb

Benchmark 1:
- low number of connections (50), 60% slru-call, 40% slru-ballast
pgbench -f slru-call.sql@60 -f slru-ballast.sql@40 -c 50 -j 75 -P 1 -T 30 
testdb

version | subtrans | multixact | tps
        | buffers  | offs/memb | func+ballast
--------+----------+-----------+--------------
master  | 32       | 8/16      | 184+119
patch1  | 32       | 8/16      | 184+119
patch1  | 1024     | 8/16      | 121+77
patch1  | 1024     | 512/1024  | 118+75
patch2  | 32       | 8/16      | 190+122
patch2  | 1024     | 8/16      | 190+125
patch2  | 1024     | 512/1024  | 190+127

As you can see, this test case degrades with a simple increase of
SLRU buffers. But use of a "hash table" in the form of "associative
buckets" keeps performance stable.

Benchmark 2:
- high connection number (600), 98% slru-call, 2% slru-ballast
pgbench -f slru-call.sql@98 -f slru-ballast.sql@2 -c 600 -j 75 -P 1 -T 30 
testdb

I don't paste "ballast" tps here, since 2% makes them too small,
and they're very noisy.

version | subtrans | multixact | tps
        | buffers  | offs/memb | func
--------+----------+-----------+------
master  | 32       | 8/16      | 13
patch1  | 32       | 8/16      | 13
patch1  | 1024     | 8/16      | 31
patch1  | 1024     | 512/1024  | 53
patch2  | 32       | 8/16      | 12
patch2  | 1024     | 8/16      | 34
patch2  | 1024     | 512/1024  | 67

In this case a simple buffer increase does help, but the "buckets"
increase the performance gain further.

I didn't paste here the results of the third part of the patch ("Pack SLRU...")
because I didn't see any major performance gain from it, and
it consumes a large part of the patch diff.

Rebased versions of first two patch parts are attached.

regards,

Yura Sokolov


slru-ballast.sql
Description: application/sql


slru-call.sql
Description: application/sql


slru-func.sql
Description: application/sql
From 41ec9d1c54184c515d53ecc8021c4a998813f2a9 Mon Sep 17 00:00:00 2001
From: Andrey Borodin 
Date: Sun, 11 Apr 2021 21:18:10 +0300
Subject: [PATCH v21 2/2] Divide SLRU buffers into 8-associative banks

We want to eliminate linear search within SLRU buffers.
To do so we divide SLRU buffers into banks. Each bank holds
approximately 8 buffers. Each SLRU pageno may reside only in one bank.
Adjacent pagenos reside in different banks.
---
 src/backend/access/transam/slru.c | 43 ---
 src/include/access/slru.h |  2 ++
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index b65cb49d7ff..abc534bbd06 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -134,7 +134,7 @@ typedef enum
 static SlruErrorCause slru_errcause;
 static int	slru_errno;
 
-
+static void SlruAdjustNSlots(int* nslots, int* banksize, int* bankoffset);
 static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
 static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
 static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
@@ -148,6 +148,30 @@ static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename,
 	  int segpage, void *data);
 static void SlruInternalDeleteSegment(SlruCtl ctl, int segno);
 
+/*
+ * Pick bank size optimal for N-assiciative SLRU buffers.
+ * We expect bank number to be picked from lowest bits of requested pageno.
+ * Thus we want number of banks to be power of 2. This routine computes number
+ * of banks aiming to make each bank of size 8. So we can pack page number and
+ * statuses of each bank on one cacheline.
+ */
+static void SlruAdjustNSlots(int* nslots, int* banksize, int* bankoffset)
+{
+	int nbanks = 1;
+	*banksize = *nslots;
+	*bankoffset = 0;
+	while (*banksize > 15)
+	{
+		if ((*banksize & 1) != 0)
+			*banksize +=1;
+		*banksize /= 2;
+		nbanks *= 2;
+		*bankoffset += 1;
+	}
+	elog(DEBUG5, "nslots %d banksize %d nbanks %d ", *nslots, *banksize, nbanks);
+	*nslots = *banksize * nbanks;
+}
+
 /*
  * Initialization of shared memory
  */
@@ -156,6 +180,8 @@ Size
 SimpleLruShmemSize(int nslots, int nlsns)
 {
 	Size		sz;
+	int bankoffset, banksize;
	SlruAdjustNSlots(&nslots, &banksize, &bankoffset);
 
 	/* we assume nslots isn't so large as to risk overflow */
 	sz = MAXALIGN(sizeof(SlruSharedData));
@@ -190,6 +216,8 @@ SimpleLruInit(SlruCtl ctl, const char 

Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Alvaro,

> Hmm, but see commit 2ed532ee8c47 about this kind of check.  Perhaps we
> should change these error messages to conform to the same message style.

Good point! Done.

-- 
Best regards,
Aleksander Alekseev


v4-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [RFC] building postgres with meson - v10

2022-07-21 Thread Bilal Yavuz
Hi,

Sorry for the first email.

On Mon, 18 Jul 2022 at 23:23, Andres Freund  wrote:
>
> In
https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify director containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.

While testing ECPG, the C and exe files are generated by Meson, so these
files are in Meson's build directory, but the expected files are in the
source directory. However, there was no way to set different paths for
the inputs (the C and exe files) and the expected files' directory. So,
I added `--expecteddir` to separately set the expected files' directory.

Greetings,

Nazir Bilal Yavuz


Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Dean Rasheed
On Thu, 21 Jul 2022 at 12:15, Martin Kalcher
 wrote:
>
> I agree that we should use pg_prng_uint64_range(). However, in order to
> achieve interoperability with setseed() we would have to use
> drandom_seed (rather than pg_global_prng_state) as rng state, which is
> declared statically in float.c and exclusively used by random(). Do we
> want to expose drandom_seed to other functions?
>

Ah, I didn't realise that setseed() and random() were bound up so
tightly. It does feel as though, if we're adding more user-facing
functions that return random sequences, there ought to be a way to
seed them, and I wouldn't want to have separate setseed functions for
each one.

I'm inclined to say that we want a new pg_global_prng_user_state that
is updated by setseed(), and used by random(), array_shuffle(),
array_sample(), and any other user-facing random functions we add
later.

I can also see that others might not like expanding the scope of
setseed() in this way.

Regards,
Dean




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Junwang Zhao wrote:

> There are some duplicate code in table.c, add a static inline function
> to eliminate the duplicates.

Hmm, but see commit 2ed532ee8c47 about this kind of check.  Perhaps we
should change these error messages to conform to the same message style.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)




Re: [RFC] building postgres with meson - v10

2022-07-21 Thread Bilal Yavuz
Hi,

On 2022-07-18 23:23:27 +0300, Andres Freund wrote:

> Bilal, Peter previously commented on the pg_regress change for ecpg,
> perhaps
> you can comment on that?
>
> In
> https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify directory containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.


While testing ECPG, the C and executable files are generated by meson, so
they end up in meson's build directory, while the expected files live in the
source directory. However, there was no way to specify different paths for
the inputs (the C and executable files) and the expected files. So I added
`--expecteddir` to set the expected files' directory separately.
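As a toy illustration of the directory split (all paths made up; this is not
the actual pg_regress layout), the check has to compare a result produced
under the build tree against an expected file kept in the source tree:

```shell
# Simulate a generated result in the build tree and a checked-in
# expected file in the source tree, then compare across the two.
tmp=$(mktemp -d)
mkdir -p "$tmp/build/results" "$tmp/source/expected"
printf 'array test output\n' > "$tmp/build/results/array.out"    # produced by the build
printf 'array test output\n' > "$tmp/source/expected/array.out"  # lives in the source tree
if diff -u "$tmp/source/expected/array.out" "$tmp/build/results/array.out" >/dev/null
then
    result="array ... ok"
else
    result="array ... FAILED"
fi
echo "$result"
rm -rf "$tmp"
```

Without a separate `--expecteddir`, both paths would have to be derived from
a single input directory, which is what breaks in an out-of-tree build.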

Greetings,

Nazir Bilal Yavuz

On Mon, 18 Jul 2022 at 23:23, Andres Freund  wrote:

> Hi,
>
> On 2022-07-18 11:33:09 +0200, Peter Eisentraut wrote:
> > The following patches are ok to commit IMO:
> >
> > a1c5542929 prereq: Deal with paths containing \ and spaces in
> basebackup_to_shell tests
> > e37951875d meson: prereq: psql: Output dir and dependency generation for
> sql_help
> > 18cc9fbd02 meson: prereq: ecpg: Add and use output directory argument
> for preproc/*.pl
> > 58a32694e9 meson: prereq: Move snowball_create.sql creation into perl
> file
> > 59b8bffdaf meson: prereq: Add output path arg in generate-lwlocknames.pl
> > 2db97b00d5 meson: prereq: generate-errcodes.pl: Accept output file
> > fb8f52f21d meson: prereq: unicode: Allow to specify output directory
> > 8f1e4410d6 meson: prereq: Refactor dtrace postprocessing make rules
> > 3d18a20b11 meson: prereq: Add --outdir option to gen_node_support.pl
>
> I pushed these. Thanks for the reviews and patches!
>
> The symbol export stuff has also been pushed (discussed in a separate
> thread).
>
> It's nice to see the meson patchset length reduced by this much.
>
> I pushed a rebased version of the remaining branches to git. I'll be on
> vacation for a bit, so I'm not sure I can get a new version with further
> cleanups out before then.
>
>
> Given that we can't use src/tools/gen_versioning_script.pl for the make
> build, due to not depending on perl for tarball builds, I'm inclined to
> rewrite it in Python (which we depend on via meson anyway) and consider it
> a meson-specific wrapper?
>
>
> Bilal, Peter previously commented on the pg_regress change for ecpg,
> perhaps
> you can comment on that?
>
> In
> https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify directory containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.
>
> Greetings,
>
> Andres Freund
>


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi Amit,

> I don't think this change should be part of this patch. Do you see a
> reason for doing this?

My bad. I thought this was done by pgindent.


-- 
Best regards,
Aleksander Alekseev


v3-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Amit Kapila
On Thu, Jul 21, 2022 at 5:09 PM Aleksander Alekseev
 wrote:
>
> > > There is some duplicate code in table.c; add a static inline function
> > > to eliminate the duplicates.
> > >
> >
> > Can we name function as validate_object_type, or check_object_type?
> >
> > Otherwise, the patch looks fine to me. Let's see if others have
> > something to say.
>
> LGTM
>

@@ -161,10 +121,32 @@ table_openrv_extended(const RangeVar *relation,
LOCKMODE lockmode,
  *
  * Note that it is often sensible to hold a lock beyond relation_close;
  * in that case, the lock is released automatically at xact end.
- * 
+ * 
  */
 void
 table_close(Relation relation, LOCKMODE lockmode)

I don't think this change should be part of this patch. Do you see a
reason for doing this?


-- 
With Regards,
Amit Kapila.




Re: [PATCH v1] eliminate duplicate code in table.c

2022-07-21 Thread Aleksander Alekseev
Hi hackers,

> > There is some duplicate code in table.c; add a static inline function
> > to eliminate the duplicates.
> >
>
> Can we name function as validate_object_type, or check_object_type?
>
> Otherwise, the patch looks fine to me. Let's see if others have
> something to say.

LGTM

-- 
Best regards,
Aleksander Alekseev


v2-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data


Re: [PoC] Reducing planning time when tables have many partitions

2022-07-21 Thread Andrey Lepikhov

On 7/5/22 13:57, Yuya Watari wrote:

On Mon, Jul 4, 2022 at 6:28 AM Tom Lane  wrote:

Perhaps the bms_is_subset class could be handled in a similar
way, ie do a one-time pass to make a List of all EquivalenceMembers
that use a RelOptInfo.


Thank you for giving your idea. I will try to polish up my algorithm
based on your suggestion.
This work is of significant interest for highly partitioned
configurations. Are you still working on this patch? Given the current
state of the thread, it may be better to change the status to 'Waiting on
author' until the next version.

Feel free to reverse the status if you need more feedback.

--
Regards
Andrey Lepikhov
Postgres Professional




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Alvaro Herrera wrote:

> Yeah, I think that would reduce cruft.  I'm not sure this is more
> against backpatching policy or less, compared to adding a separate
> GUC just for this bugfix.

cruft:

{
	{"allow_recovery_tablespaces", PG_POSTMASTER, WAL_RECOVERY,
		gettext_noop("Continues recovery after finding invalid database directories."),
		gettext_noop("It is possible for tablespace drop to interfere with database creation "
					 "so that WAL replay is forced to create fake database directories. "
					 "These should have been dropped by the time recovery ends; "
					 "but in case they aren't, this option lets recovery continue if they "
					 "are present.  Note that these directories must be removed manually afterwards."),
		GUC_NOT_IN_SAMPLE
	},
	&allow_recovery_tablespaces,
	false,
	NULL, NULL, NULL
},

This is not a very good explanation, but I don't know how to make it
better.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I think my standards have lowered enough that now I think 'good design'
is when the page doesn't irritate the living f*ck out of me." (JWZ)




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Thomas Munro wrote:

> On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera  
> wrote:

> > I've got no opinions on this.  I don't like either 1 or 3, so I'm going
> > to add and backpatch a new GUC allow_recovery_tablespaces as the
> > override mechanism.
> >
> > If others disagree with this choice, please speak up.
> 
> Would it help if we back-patched the allow_in_place_tablespaces stuff?
>  I'm not sure how hard/destabilising that would be, but I could take a
> look tomorrow.

Yeah, I think that would reduce cruft.  I'm not sure this is more
against backpatching policy or less, compared to adding a separate
GUC just for this bugfix.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
   (Paul Graham)




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Alvaro Herrera
On 2022-Jul-21, Thomas Munro wrote:

> On Wed, Jul 20, 2022 at 10:51 PM Alvaro Herrera  
> wrote:
> > v26 here.  I spent some time fighting the readdir() stuff for
> > Windows (so that get_dirent_type returns LNK for junction points)
> > but couldn't make it to work and was unable to figure out why.
> 
> Was it because of this?
> 
> https://www.postgresql.org/message-id/CA%2BhUKGKv%2B736Pc8kSj3%3DDijDGd1eC79-uT3Vi16n7jYkcc_raw%40mail.gmail.com

Oh, that sounds very likely, yeah.  I didn't think of testing the
FILE_ATTRIBUTE_DIRECTORY bit for junction points.

I +1 pushing both of these patches to 14.  Then this patch becomes a
couple of lines shorter.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers




Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-07-21 Thread Martin Kalcher

Am 21.07.22 um 10:41 schrieb Dean Rasheed:


A couple of quick comments on the current patch:


Thank you for your feedback!


It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html


CREATE FUNCTION marks functions as VOLATILE by default unless explicitly 
marked otherwise. I assumed function definitions in pg_proc.dat behave the 
same way. I will fix that.

https://www.postgresql.org/docs/current/sql-createfunction.html


It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.


I agree that we should use pg_prng_uint64_range(). However, in order to 
achieve interoperability with setseed() we would have to use 
drandom_seed (rather than pg_global_prng_state) as rng state, which is 
declared statically in float.c and exclusively used by random(). Do we 
want to expose drandom_seed to other functions?



Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.


As mentioned above, I assumed the default here is PARALLEL UNSAFE. I'll 
fix that.



In my experience, the requirement for sampling with replacement is
about as common as the requirement for sampling without replacement,
so it seems odd to provide one but not the other. Of course, we can
always add a with-replacement function later, and give it a different
name. But maybe array_sample() could be used for both, with a
"with_replacement" boolean parameter?


We could also later add a "with_replacement" boolean parameter to 
array_sample(), false by default. I do not have a strong opinion about 
that and will implement it, if desired. Any opinions on default vs. 
no default?


Martin




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-07-21 Thread Thomas Munro
On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera  wrote:
> On 2022-Jul-20, Alvaro Herrera wrote:
> > I see the following alternatives:
> >
> > 1. not backpatch this fix to 14 and older
> > 2. use a different GUC; either allow_invalid_pages as previously
> >suggested, or create a new one just for this purpose
> > 3. not provide any overriding mechanism in versions 14 and older
>
> I've got no opinions on this.  I don't like either 1 or 3, so I'm going
> to add and backpatch a new GUC allow_recovery_tablespaces as the
> override mechanism.
>
> If others disagree with this choice, please speak up.

Would it help if we back-patched the allow_in_place_tablespaces stuff?
 I'm not sure how hard/destabilising that would be, but I could take a
look tomorrow.




  1   2   >