date:20231227

On Tue, Dec 26, 2023 at 7:35 PM Isaac Morland  wrote:
>
> On Thu, 21 Dec 2023 at 09:26, Amit Kapila  wrote:
>
>>
>> A conflicting column where NULL indicates no conflict, and other
>> > values indicate the reason for the conflict, doesn't seem too bad.
>> >
>>
>> This is fine too.
>
>
> I prefer this option. There is precedent for doing it this way, for example 
> in pg_stat_activity.wait_event_type.
>
> The most common test of this field is likely to be "is there a conflict" and 
> it's better to write this as "[fieldname] IS NOT NULL" than to introduce a 
> magic constant. Also, it makes clear to future maintainers that this field 
> has one purpose: saying what type of conflict there is, if any. If we find 
> ourselves wanting to record a new non-conflict status (no idea what that 
> could be: "almost conflict"? "probably conflict soon"?) there would be less 
> temptation to break existing tests for "is there a conflict".

+1 on using "[fieldname] IS NOT NULL"  to  find  "is there a conflict"

PFA the patch which attempts to implement this.

This patch changes the existing 'conflicting' field to
'conflicting_cause' in pg_replication_slots. This new field is always
NULL for physical slots (like the previous field conflicting), as well
as for those logical slots which are not invalidated.

thanks
Shveta


v1-0001-Track-conflicting_cause-in-pg_replication_slots.patch
Description: Binary data

Re: Function to get invalidation cause of a replication slot.

On Thu, Dec 21, 2023 at 2:59 PM Amit Kapila  wrote:
>
> On Thu, Dec 21, 2023 at 12:07 PM Michael Paquier  wrote:
> >
> > On Thu, Dec 21, 2023 at 11:53:04AM +0530, Amit Kapila wrote:
> > > On Thu, Dec 21, 2023 at 11:18 AM Michael Paquier  
> > > wrote:
> > > Yeah, if one uses them independently then there is no such guarantee.
> >
> > This could be possible in the same query as well, still less likely,
> > as the contents are volatile.
> >
>
> True, this is quite obvious but that was not a recommended way to use
> the function. Anyway, now that we agree to expose it via an existing
> function, there is no point in further argument on this.
>
> > >>  A lot could happen between both function calls while the
> > >> repslot LWLock is not hold.
> > >>
> > >> Yeah, you could keep the reason text as NULL when there is no
> > >> conflict, replacing the boolean by the text in the function, and keep
> > >> the view definition compatible with v16 while adding an extra column.
> > >
> > > But as mentioned we also want the enum value to be exposed in some way
> > > so that it can be used by the sync slot feature [1] as well,
> > > otherwise, we may need some mappings to convert the text back to an
> > > enum. I guess if we want to expose via view, then we can return an
> > > enum value by pg_get_replication_slots() and the view can replace it
> > > with text based on the value.
> >
> > Sure.  Something like is OK by me as long as the data is retrieved
> > from a single scan of the slot data while holding the slot data's
> > LWLock.
> >
>
> Okay, so let's go this way unless someone feels otherwise.

Please track the progress in another thread [1] where the patch is posted today.

[1]: https://www.postgresql.org/message-id/ZYOE8IguqTbp-seF%40paquier.xyz

thanks
Shveta

Re: Synchronizing slots from primary to standby

On Wed, Dec 27, 2023 at 11:36 AM Masahiko Sawada  wrote:
>
> Hi,
>
> Thank you for working on this.
>
> On Tue, Dec 26, 2023 at 9:27 PM shveta malik  wrote:
> >
> > On Tue, Dec 26, 2023 at 4:41 PM Zhijie Hou (Fujitsu)
> >  wrote:
> > >
> > > On Wednesday, December 20, 2023 7:37 PM Amit Kapila 
> > >  wrote:
> > > >
> > > > On Wed, Dec 20, 2023 at 3:29 PM shveta malik 
> > > > wrote:
> > > > >
> > > > > On Wed, Dec 20, 2023 at 9:12 AM Amit Kapila 
> > > > wrote:
> > > > > >
> > > > > > On Tue, Dec 19, 2023 at 5:30 PM shveta malik 
> > > > > > 
> > > > wrote:
> > > > > > >
> > > > > > > Thanks for reviewing. I have addressed these in v50.
> > > > > > >
> > > > > >
> > > > > > I was looking at this patch to see if something smaller could be
> > > > > > independently committable. I think we can extract
> > > > > > pg_get_slot_invalidation_cause() and commit it as that function
> > > > > > could be independently useful as well. What do you think?
> > > > > >
> > > > >
> > > > > Sure, forked another thread [1]
> > > > > [1]:
> > > > >
> > > > https://www.postgresql.org/message-id/CAJpy0uBpr0ym12%2B0mXpjcRFA6
> > > > N%3D
> > > > > anX%2BYk9aGU4EJhHNu%3DfWykQ%40mail.gmail.com
> > > > >
> > > >
> > > > Thanks, thinking more, we can split the patch into the following three 
> > > > patches
> > > > which can be committed separately (a) Allowing the failover property to 
> > > > be set
> > > > for a slot via SQL API and subscription commands
> > > > (b) sync slot worker infrastructure (c) GUC standby_slot_names and the 
> > > > the
> > > > corresponding wait logic in server-side.
> > > >
> > > > Thoughts?
> > >
> > > I agree. Here is the V54 patch set which was split based on the 
> > > suggestion.
> > > The commit message in each patch is also improved.
> > >
> >
> > I would like to revisit the current dependency of slotsync worker on
> > dbname used in 002 patch. Currently we accept dbname in
> > primary_conninfo and thus the user has to make sure to provide one (by
> > manually altering it) even in case of a conf file auto-generated by
> > "pg_basebackup -R".
> > Thus I would like to discuss if there are better ways to do it.
> > Complete background is as follow:
> >
> > We need dbname for 2 purposes:
> >
> > 1) to connect to remote db in order to run SELECT queries to fetch the
> > info needed by slotsync worker.
> > 2) to make connection in slot-sync worker itself in order to be able
> > to use libpq APIs for 1)
> >
> > We run 3 kind of select queries in slot-sync worker currently:
> >
> > a) To fetch all failover slots (logical slots) info at once in
> > synchronize_slots().
> > b) To fetch a particular slot info during
> > wait_for_primary_slot_catchup() logic (logical slot).
> > c) To validate primary slot (physical one) and also to distinguish
> > between standby and cascading standby by running pg_is_in_recovery().
> >
> >  1) One approach to avoid dependency on dbname is using commands
> > instead of SELECT.  This will need implementing LIST_SLOTS command for
> > a), and for b) we can use LIST_SLOTS and fetch everything (even though
> > it is not needed) or have LIST_SLOTS with a filter on slot-name or
> > extend READ_REPLICATION_SLOT,  and for c) we can have some other
> > command to get pg_is_in_recovery() info. But, I feel by relying on
> > commands we will be making the extension of the slot-sync feature
> > difficult. In future, if there is some more requirement to fetch any
> > other info,
> > then there too we have to implement a command. I am not sure if it is
> > good and extensible approach.
> >
> > 2) Another way to avoid asking for a dbname in primary_conninfo is to
> > use the default dbname internally. This brings us to two questions:
> > 'How' and 'Which default db'?
> >
> > 2.1) To answer 'How':
> > Using default dbname is simpler for the purpose of slot-sync worker
> > having its own db-connection, but is a little tricky for the purpose
> > of connection to remote_db. This is because we have to inject this
> > dbname internally in our connection-info.
> >
> > 2.1.1) Say we use primary_conninfo (i.e. original one w/o dbname),
> > then currently it could have 2 formats:
> >
> > a) The simple "=" format for key-value pairs, example:
> > 'user=replication host=127.0.0.1 port=5433 dbname=postgres'.
> > b) URI format, example:
> > postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp
> >
> > We can distinguish between the 2 formats using 'uri_prefix_length' but
> > injecting the dbname part will be messy specially for URI format.  If
> > we want to do it w/o injecting and only by changing libpq interfaces
> > to accept dbname separately apart from conninfo, then there is no
> > current simpler way available. It will need a good amount of changes
> > in libpq.
> >
> > 2.1.2) Another way is to not rely on primary_conninfo directly but
> > rely on 'WalRcv->conninfo' in order to connect to remote_db. This is
> > because the latter is never URI format, it is some parsed forma

Re: Synchronizing slots from primary to standby

2023-12-27 Thread Amit Kapila

On Wed, Dec 27, 2023 at 11:36 AM Masahiko Sawada  wrote:
>
> On Tue, Dec 26, 2023 at 9:27 PM shveta malik  wrote:
> >
> > I would like to revisit the current dependency of slotsync worker on
> > dbname used in 002 patch. Currently we accept dbname in
> > primary_conninfo and thus the user has to make sure to provide one (by
> > manually altering it) even in case of a conf file auto-generated by
> > "pg_basebackup -R".
> > Thus I would like to discuss if there are better ways to do it.
> > Complete background is as follow:
> >
> > We need dbname for 2 purposes:
> >
> > 1) to connect to remote db in order to run SELECT queries to fetch the
> > info needed by slotsync worker.
> > 2) to make connection in slot-sync worker itself in order to be able
> > to use libpq APIs for 1)
> >
> > We run 3 kind of select queries in slot-sync worker currently:
> >
> > a) To fetch all failover slots (logical slots) info at once in
> > synchronize_slots().
> > b) To fetch a particular slot info during
> > wait_for_primary_slot_catchup() logic (logical slot).
> > c) To validate primary slot (physical one) and also to distinguish
> > between standby and cascading standby by running pg_is_in_recovery().
> >
> >  1) One approach to avoid dependency on dbname is using commands
> > instead of SELECT.  This will need implementing LIST_SLOTS command for
> > a), and for b) we can use LIST_SLOTS and fetch everything (even though
> > it is not needed) or have LIST_SLOTS with a filter on slot-name or
> > extend READ_REPLICATION_SLOT,  and for c) we can have some other
> > command to get pg_is_in_recovery() info. But, I feel by relying on
> > commands we will be making the extension of the slot-sync feature
> > difficult. In future, if there is some more requirement to fetch any
> > other info,
> > then there too we have to implement a command. I am not sure if it is
> > good and extensible approach.
> >
> > 2) Another way to avoid asking for a dbname in primary_conninfo is to
> > use the default dbname internally. This brings us to two questions:
> > 'How' and 'Which default db'?
> >
> > 2.1) To answer 'How':
> > Using default dbname is simpler for the purpose of slot-sync worker
> > having its own db-connection, but is a little tricky for the purpose
> > of connection to remote_db. This is because we have to inject this
> > dbname internally in our connection-info.
> >
> > 2.1.1) Say we use primary_conninfo (i.e. original one w/o dbname),
> > then currently it could have 2 formats:
> >
> > a) The simple "=" format for key-value pairs, example:
> > 'user=replication host=127.0.0.1 port=5433 dbname=postgres'.
> > b) URI format, example:
> > postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp
> >
> > We can distinguish between the 2 formats using 'uri_prefix_length' but
> > injecting the dbname part will be messy specially for URI format.  If
> > we want to do it w/o injecting and only by changing libpq interfaces
> > to accept dbname separately apart from conninfo, then there is no
> > current simpler way available. It will need a good amount of changes
> > in libpq.
> >
> > 2.1.2) Another way is to not rely on primary_conninfo directly but
> > rely on 'WalRcv->conninfo' in order to connect to remote_db. This is
> > because the latter is never URI format, it is some parsed format and
> > appending may work. As an example, primary_conninfo =
> > 'postgresql://replication@localhost:5433', WalRcv->conninfo loaded
> > internally is:
> > "user=replication passfile=/home/shveta/.pgpass channel_binding=prefer
> > dbname=replication host=localhost port=5433
> > fallback_application_name=walreceiver sslmode=prefer sslcompression=0
> > sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2
> > gssencmode=disable krbsrvname=postgres gssdelegation=0
> > target_session_attrs=any load_balance_hosts=disable", '\000'
> >
> > So we can try appending our default dbname to this. But all the
> > defaults loaded in WalRcv->conninfo need some careful analysis to
> > figure out if they work for slot-sync worker case.
> >
> > 2.2) Now coming to 'Which default db':
> >
> > 2.2.1) If we use 'template1' as default db, it may block 'create db'
> > operations on primary for the time when the slot-sync worker is
> > connected to remote using this dbname. Example:
> >
> > postgres=# create database newdb1;
> > ERROR:  source database "template1" is being accessed by other users
> > DETAIL:  There is 1 other session using the database.
> >
> > 2.2.2) If we use 'postgres' as default db, there are chances that it
> > can be dropped as unlike 'template1', it is allowed to be dropped by
> > user, and if slotsync worker is connected to it, user may see:
> > newdb1=# drop database postgres;
> > ERROR:  database "postgres" is being accessed by other users
> > DETAIL:  There is 1 other session using the database.
> >
> > But once the slot-sync worker or standby goes down, user can always
> > drop this and next time slot-sync worker may not be abl

Re: Track in pg_replication_slots the reason why slots conflict?

2023-12-27 Thread Amit Kapila

On Wed, Dec 27, 2023 at 3:08 PM shveta malik  wrote:
>
> PFA the patch which attempts to implement this.
>
> This patch changes the existing 'conflicting' field to
> 'conflicting_cause' in pg_replication_slots.
>

This name sounds a bit odd to me, would it be better to name it as
conflict_cause?

A few other minor comments:
=
*
+   conflicting_cause text
+  
+  
+   Cause if this logical slot conflicted with recovery (and so is now
+   invalidated). It is always NULL for physical slots, as well as for
+   those logical slots which are not invalidated. Possible values are:

Would it better to use description as follows:" Cause of logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible
values are:
.."

*
  $res = $node_standby->safe_psql(
  'postgres', qq(
- select bool_and(conflicting) from pg_replication_slots;));
+ select bool_and(conflicting) from
+ (select conflicting_cause is not NULL as conflicting from
pg_replication_slots);));

Won't the query "select conflicting_cause is not NULL as conflicting
from pg_replication_slots" can return false even for physical slots
and then impact the result of the main query whereas the original
query would seem to be simply ignoring physical slots? If this
observation is correct then you might want to add a 'slot_type'
condition in the new query.

* After introducing check_slots_conflicting_cause(), do we need to
have check_slots_conflicting_status()? Aren't both checking the same
thing?

-- 
With Regards,
Amit Kapila.

Shared detoast Datum proposal

2023-12-27 Thread Andy Fan


Problem:

Toast works well for its claimed purposes. However the current detoast
infrastructure doesn't shared the detoast datum in the query's
lifespan, which will cause some inefficiency like this:

SELECT big_jsonb_col->'a',  big_jsonb_col->'b',  big_jsonb_col->'c'
FROM t;

In the above case, the big_jsonb_col will be detoast 3 times for every
record.

a more common case maybe:

CREATE TABLE t(a numeric);
insert into t select i FROM generate_series(1, 10)i;
SELECT * FROM t WHERE a > 3;

Here we needs detoasted because of VARATT_CAN_MAKE_SHORT, and it needs to
be detoasted twice, one is in a > 3,  the other one is in the targetlist,
where we need to call numeric_out. 

Proposal


When we access some desired toast column in EEOP_{SCAN|INNER_OUTER}_VAR
steps, we can detoast it immediately and save it back to
slot->tts_values.  With this way, when any other expressions seeks for
the Datum, it get the detoast version. Planner decides which Vars
should use this feature, executor manages it detoast action and memory
management. 

Planner design
---

1. This feature only happen at the Plan Node where the detoast would
   happen in the previous topology, for example:

   for example:
   
   SELECT t1.toast_col, t2.toast_col FROM t1 join t2 USING (toast_col);

   toast_col just happens at the Join node's slot even if we have a
   projection on t1 or t2 at the scan node (except the Parameterized path).

   However if

   SELECT t1.toast_col, t2.toast_col
   FROM t1 join t2
   USING (toast_col)
   WHERE t1.toast_col > 'a';

   the detoast *may* happen at the scan of level t1 since "t1.toast_col >
   'a'" accesses the Var within a FuncCall ('>' operator), which will
   cause a detoast. (However it should not happen if it is under a Sort
   node, for details see Planner Design section 2).

   At the implementation side, I added "Bitmapset  *reference_attrs;" to
   Scan node which show if the Var should be accessed with the
   pre-detoast way in expression execution engine. the value is
   maintained at the create_plan/set_plan_refs stage. 

   Two similar fields are added in Join node.

Bitmapset  *outer_reference_attrs;
Bitmapset  *inner_reference_attrs;

   In the both case, I tracked the level of walker/mutator, if the level
   greater than 1 when we access a Var, the 'var->varattno - 1' is added
   to the bitmapset. Some special node should be ignored, see
   increase_level_for_pre_detoast for details.

2. We also need to make sure the detoast datum will not increase the
   work_mem usage for the nodes like Sort/Hash etc, all of such nodes
   can be found with search 'CP_SMALL_TLIST' flags. 

   If the a node under a Sort-Hash-like nodes, we have some extra
   checking to see if a Var is a *directly input* of such nodes. If yes,
   we can't detoast it in advance, or else, we know the Var has been
   discarded before goes to these nodes, we still can use the shared
   detoast feature.

   The simplest cases to show this is:

   For example:

   2.1
   Query-1
   explain (verbose) select * from t1 where b > 'a';
   -- b can be detoast in advance.

   Query-2
   explain (verbose) select * from t1 where b > 'a' order by c;
   -- b can't be detoast since it will makes the Sort use more work_mem.

   Query-3
   explain (verbose) select a, c from t1 where b > 'a' order by c;
   -- b can be pre-detoasted, since it is discarded before goes to Sort
   node. In this case it doesn't do anything good, but for some complex
   case like Query-4, it does. 

Query-4   
explain (costs off, verbose)
select t3.*
from t1, t2, t3
where t2.c > '999'
and t2.c = t1.c
and t3.b = t1.b;

   QUERY PLAN  
--
 Hash Join
   Output: t3.a, t3.b, t3.c
   Hash Cond: (t3.b = t1.b)
   ->  Seq Scan on public.t3
 Output: t3.a, t3.b, t3.c
   ->  Hash
 Output: t1.b
 ->  Nested Loop
   Output: t1.b  <-- Note here 3
   ->  Seq Scan on public.t2
 Output: t2.a, t2.b, t2.c
 Filter: (t2.c > '...'::text) <--- Note Here 1
   ->  Index Scan using t1_c_idx on public.t1
 Output: t1.a, t1.b, t1.c
 Index Cond: (t1.c = t2.c)  <--- Note Here 2 
(15 rows) 
 
In this case, detoast datum for t2.c can be shared and it benefits for
t2.c = t1.c and no harm for the Hash node.


Execution side
--

Once we decide a Var should be pre-detoasted for a given plan node, a
special step named as EEOP_{INNER/OUTER/SCAN}_VAR_TOAST will be created
during ExecInitExpr stage. The special steps are introduced to avoid its
impacts on the non-related EEOP_{INNER/OUTER/SCAN}_VAR code path.

slot->tts_values is used to store the detoast datum so that any other
expressions can access it pretty easily.

Because of the above design, the detoast datum should have a same
lifespan as a

Re: Bug in nbtree optimization to skip > operator comparisons (or < comparisons in backwards scans)

2023-12-27 Thread Pavel Borisov

Alexander,

On Tue, 26 Dec 2023 at 23:35, Alexander Korotkov 
wrote:

> Pavel,
>
> On Mon, Dec 25, 2023 at 8:32 PM Pavel Borisov 
> wrote:
> > I've reviewed both patches:
> > 0001 - is a pure refactoring replacing argument transfer from via struct
> member to transfer explicitly as a function argument. It's justified by the
> fact firstPage is localized only to several places. The patch looks simple
> and good enough.
> >
> > 0002:
> > continuescanPrechecked is semantically much better than previous
> requiredMatchedByPrecheck which confused me earlier. Thanks!
> >
> > From the new comments, it looks a little bit hard to understand who does
> what. Semantics "if caller told" in comments looks more clear to me. Could
> you especially give attention to the comments:
> >
> > "If they wouldn't be matched, then the *continuescan flag would be set
> for the current item and the last item on the page accordingly."
> > "If the key is required for the opposite direction scan, we need to know
> there was already at least one matching item on the page.  For those keys."
> >
> > > Prechecking the value of the continuescan flag for the last item on the
> > >+ * page (according to the scan direction).
> > Maybe, in this case, it would be more clear like: "...(for backwards
> scan it will be the first item on a page)"
> >
> > Otherwise the patch 0002 looks like a good fix for the bug to be pushed.
>
> Thank you for your review.  I've revised comments to meet your suggestions.
>
Thank you for revised comments! I think they are good enough.

Regards,
Pavel

Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)

2023-12-27 Thread Ranier Vilela

Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra <
tomas.von...@enterprisedb.com> escreveu:

>
>
> On 12/26/23 19:10, Ranier Vilela wrote:
> > Hi,
> >
> > The commit b437571  I
> > think has an oversight.
> > When allocate memory and initialize private spool in function:
> > _brin_leader_participate_as_worker
> >
> > The behavior is the bs_spool (heap and index fields)
> > are left empty.
> >
> > The code affected is:
> >   buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
> > - buildstate->bs_spool->heap = buildstate->bs_spool->heap;
> > - buildstate->bs_spool->index = buildstate->bs_spool->index;
> > + buildstate->bs_spool->heap = heap;
> > + buildstate->bs_spool->index = index;
> >
> > Is the fix correct?
> >
>
> Thanks for noticing this.

You're welcome.


> Yes, I believe this is a bug - the assignments
> are certainly wrong, it leaves the fields set to NULL.
>
> I wonder how come this didn't fail during testing. Surely, if the leader
> participates as a worker, the tuplesort_begin_index_brin shall be called
> with heap/index being NULL, leading to some failure during the sort. But
> maybe this means we don't actually need the heap/index fields, it's just
> a copy of TuplesortIndexArg, but BRIN does not need that because we sort
> the tuples by blkno, and we don't need the descriptors for that.
>
Unfortunately I can't test on Windows, since I can't build with meson on
Windows.


> In any case, the _brin_parallel_scan_and_build does not actually need
> the separate heap/index arguments, those are already in the spool.
>
Yeah, for sure.


> I'll try to figure out if we want to simplify the tuplesort or remove
> the arguments from _brin_parallel_scan_and_build.
>
Thank you for your work.

best regards,
Ranier Vilela

Re: Removing const-false IS NULL quals and redundant IS NOT NULL quals

2023-12-27 Thread Andy Fan

Hi,

David Rowley  writes:

>
> Happy to hear other people's thoughts on this patch.  Otherwise, I
> currently don't think the missed optimisation is a reason to block
> what we've ended up with so far.
>
> David
>
> [1] https://postgr.es/m/flat/17540-7aa1855ad5ec18b4%40postgresql.org
>
> [2. application/x-patch; 
> v10-0001-Reduce-NullTest-quals-to-constant-TRUE-or-FALSE.patch]...

Thanks for working on this, an I just get a complaint about this missed
optimisation 7 hours ago..

I also want to add notnullattnums for the UniqueKey stuff as well, by
comparing your implementation with mine,  I found you didn't consider
the NOT NULL generated by filter. After apply your patch:

create table a(a int);
explain (costs off) select * from a where a > 3 and a is null;
 QUERY PLAN  
-
 Seq Scan on a
   Filter: ((a IS NULL) AND (a > 3))
(2 rows)

This is acutally needed by UniqueKey stuff, do you think it should be
added? To save some of your time, you can check what I did in UniqueKey

[1]
https://www.postgresql.org/message-id/attachment/151254/v1-0001-uniquekey-on-base-relation-and-used-it-for-mark-d.patch

-- 
Best Regards
Andy Fan

Should we remove -Wdeclaration-after-statement?

2023-12-27 Thread Jelte Fennema-Nio

Postgres currently requires all variables to be declared at the top of
the function, because it specifies -Wdeclaration-after-statement. One
of the reasons that we had this warning was because C89 required this
style of declaration. Requiring it everywhere made backporting easier,
since some of our older supported PG versions needed to compile on
C89. Now that we have dropped support for PG11 that reason goes away,
since now all supported Postgres versions require C99. So, I think
it's worth reconsidering if we want this warning to be enabled or not.

Personally I would very much prefer the warning to be disabled for the
following reasons:
1. It allows Asserts, ereports and other checks at the top of a
function, making it clear to the reader if there are any requirements
on the arguments that the caller should uphold.
2. By declaring variables at their first usage you limit the scope of
the variable. This reduces the amount of code that a reader of the
code has to look at to see if the variable was changed between its
declaration and the usage location that they are interested in.
3. By declaring variables at their first usage, often you can
immediately see the type of the variable that you are looking at.
Since most variables are not used in the whole function, their first
usage is pretty much their only usage.
4. By declaring variables at their first usage, you can often
initialize them with a useful value right away. That way you don't
have to worry about using it uninitialized. It also reduces the lines
of code, since initialization and declaration can be done in the same
line.
5. By declaring variables at their first usage, it is made clear to
the reader why the variable needs to exist. Oftentimes when I read a
postgres function from top to bottom, it's unclear to me what purpose
some of the declared variables at the top have.
6. I run into this warning over and over again when writing code in
postgres. This is because all of the other programming languages I
write code in don't have this restriction. Even many C projects I work
in don't have this restriction on where variables can be declared.

Failed assertion in joininfo.c, remove_join_clause_from_rels

2023-12-27 Thread Andreas Seltenreich

Hi,

SQLsmith found a failing Assertion in joininfo.c on master.  I can
reproduce it on an assertion-enabled build like this:

create table t1(a int primary key);
create table t2(a int);

select * from t2 right join
 (t1 as t1a inner join t1 as t1b on t1a.a = t1b.a)
  on t1a.a is not null and exists (select);

-- TRAP: failed Assert("list_member_ptr(rel->joininfo, restrictinfo)"), File: 
"joininfo.c", Line: 144, PID: 777839

regards,
Andreas

Re: Removing const-false IS NULL quals and redundant IS NOT NULL quals

2023-12-27 Thread Richard Guo

On Wed, Dec 27, 2023 at 7:38 PM Andy Fan  wrote:

> I also want to add notnullattnums for the UniqueKey stuff as well, by
> comparing your implementation with mine,  I found you didn't consider
> the NOT NULL generated by filter. After apply your patch:
>
> create table a(a int);
> explain (costs off) select * from a where a > 3 and a is null;
>  QUERY PLAN
> -
>  Seq Scan on a
>Filter: ((a IS NULL) AND (a > 3))
> (2 rows)


The detection of self-inconsistent restrictions already exists in
planner.

# set constraint_exclusion to on;
SET
# explain (costs off) select * from a where a > 3 and a is null;
QUERY PLAN
--
 Result
   One-Time Filter: false
(2 rows)

Thanks
Richard

Re: Failed assertion in joininfo.c, remove_join_clause_from_rels

2023-12-27 Thread Alexander Korotkov

On Wed, Dec 27, 2023 at 1:54 PM Andreas Seltenreich  wrote:
> SQLsmith found a failing Assertion in joininfo.c on master.  I can
> reproduce it on an assertion-enabled build like this:
>
> create table t1(a int primary key);
> create table t2(a int);
>
> select * from t2 right join
>  (t1 as t1a inner join t1 as t1b on t1a.a = t1b.a)
>   on t1a.a is not null and exists (select);
>
> -- TRAP: failed Assert("list_member_ptr(rel->joininfo, restrictinfo)"), File: 
> "joininfo.c", Line: 144, PID: 777839

Thank you for pointing this out.  I'm investigating.

--
Regards,
Alexander Korotkov

Re: Failed assertion in joininfo.c, remove_join_clause_from_rels

2023-12-27 Thread Richard Guo

On Wed, Dec 27, 2023 at 8:00 PM Alexander Korotkov 
wrote:

> On Wed, Dec 27, 2023 at 1:54 PM Andreas Seltenreich 
> wrote:
> > SQLsmith found a failing Assertion in joininfo.c on master.  I can
> > reproduce it on an assertion-enabled build like this:
> >
> > create table t1(a int primary key);
> > create table t2(a int);
> >
> > select * from t2 right join
> >  (t1 as t1a inner join t1 as t1b on t1a.a = t1b.a)
> >   on t1a.a is not null and exists (select);
> >
> > -- TRAP: failed Assert("list_member_ptr(rel->joininfo, restrictinfo)"),
> File: "joininfo.c", Line: 144, PID: 777839
>
> Thank you for pointing this out.  I'm investigating.


This is the same issue with [1] and has been just fixed by e0477837ce.

[1]
https://www.postgresql.org/message-id/flat/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com

Thanks
Richard

Re: Statistics Import and Export

2023-12-27 Thread Tomas Vondra

On 12/26/23 20:19, Tom Lane wrote:
> Bruce Momjian  writes:
>> I think we need a robust API to handle two cases:
> 
>> *  changes in how we store statistics
>> *  changes in how how data type values are represented in the statistics
> 
>> We have had such changes in the past, and I think these two issues are
>> what have prevented import/export of statistics up to this point.
>> Developing an API that doesn't cleanly handle these will cause long-term
>> pain.
> 
> Agreed.
> 

I agree the format is important - we don't want to end up with a format
that's cumbersome or inconvenient to use. But I don't think the proposed
format is somewhat bad in those respects - it mostly reflects how we
store statistics and if I was designing a format for humans, it might
look a bit differently. But that's not the goal here, IMHO.

I don't quite understand the two cases above. Why should this affect how
we store statistics? Surely, making the statistics easy to use for the
optimizer is much more important than occasional export/import.

>> In summary, I think we need an SQL-level command for this.
> 
> I think a SQL command is an actively bad idea.  It'll just add development
> and maintenance overhead that we don't need.  When I worked on this topic
> years ago at Salesforce, I had things set up with simple functions, which
> pg_dump would invoke by writing more or less
> 
>   SELECT pg_catalog.load_statistics();
> 
> This has a number of advantages, not least of which is that an extension
> could plausibly add compatible functions to older versions.  The trick,
> as you say, is to figure out what the argument lists ought to be.
> Unfortunately I recall few details of what I wrote for Salesforce,
> but I think I had it broken down in a way where there was a separate
> function call occurring for each pg_statistic "slot", thus roughly
> 
> load_statistics(table regclass, attname text, stakind int, stavalue ...);
> 
> I might have had a separate load_statistics_xxx function for each
> stakind, which would ease the issue of deciding what the datatype
> of "stavalue" is.  As mentioned already, we'd also need some sort of
> version identifier, and we'd expect the load_statistics() functions
> to be able to transform the data if the old version used a different
> representation.  I agree with the idea that an explicit representation
> of the source table attribute's type would be wise, too.
> 

Yeah, this is pretty much what I meant by "functional" interface. But if
I said maybe the format implemented by the patch is maybe too close to
how we store the statistics, then this has exactly the same issue. And
it has other issues too, I think - it breaks down the stats into
multiple function calls, so ensuring the sanity/correctness of whole
sets of statistics gets much harder, I think.

I'm not sure about the extension idea. Yes, we could have an extension
providing such functions, but do we have any precedent of making
pg_upgrade dependent on an external extension? I'd much rather have
something built-in that just works, especially if we intend to make it
the default behavior (which I think should be our aim here).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Failed assertion in joininfo.c, remove_join_clause_from_rels

2023-12-27 Thread Alexander Korotkov

On Wed, Dec 27, 2023 at 2:04 PM Richard Guo  wrote:
> On Wed, Dec 27, 2023 at 8:00 PM Alexander Korotkov  
> wrote:
>>
>> On Wed, Dec 27, 2023 at 1:54 PM Andreas Seltenreich  
>> wrote:
>> > SQLsmith found a failing Assertion in joininfo.c on master.  I can
>> > reproduce it on an assertion-enabled build like this:
>> >
>> > create table t1(a int primary key);
>> > create table t2(a int);
>> >
>> > select * from t2 right join
>> >  (t1 as t1a inner join t1 as t1b on t1a.a = t1b.a)
>> >   on t1a.a is not null and exists (select);
>> >
>> > -- TRAP: failed Assert("list_member_ptr(rel->joininfo, restrictinfo)"), 
>> > File: "joininfo.c", Line: 144, PID: 777839
>>
>> Thank you for pointing this out.  I'm investigating.
>
>
> This is the same issue with [1] and has been just fixed by e0477837ce.
>
> [1] 
> https://www.postgresql.org/message-id/flat/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com

I just came to the same conclusion.  Thank you, Richard.

--
Regards,
Alexander Korotkov

Re: Bug in nbtree optimization to skip > operator comparisons (or < comparisons in backwards scans)

2023-12-27 Thread Alexander Korotkov

On Wed, Dec 27, 2023 at 1:18 PM Pavel Borisov  wrote:
> Thank you for revised comments! I think they are good enough.

Pushed, thank you!

--
Regards,
Alexander Korotkov

Re: Add support for attribute((returns_nonnull))


On 19.12.23 21:43, Tristan Partin wrote:
Here is a patch which adds support for the returns_nonnull attribute 
alongside all the other attributes we optionally support.


I recently wound up in a situation where I was checking for NULL return 
values of a function that couldn't ever return NULL because the 
inability to allocate memory was always elog(ERROR)ed (aborted).


I didn't go through and mark anything, but I feel like it could be 
useful for people going forward, including myself.


I think it would be useful if this patch series contained a patch that 
added some initial uses of this.  That way we can check that the 
proposed definition actually works, and we can observe what it does, 
with respect to warnings, static analysis, etc.

Re: pg_stat_statements: more test coverage


On 27.12.23 09:08, Julien Rouhaud wrote:

Hi,

On Tue, Dec 26, 2023 at 10:03 PM Peter Eisentraut  wrote:


On 24.12.23 03:03, Michael Paquier wrote:

- Use a DO block of a PL function, say with something like that to
ensure an amount of N queries?  Say with something like that after
tweaking pg_stat_statements.track:
CREATE OR REPLACE FUNCTION create_tables(num_tables int)
RETURNS VOID AS
$func$
BEGIN
FOR i IN 1..num_tables LOOP
  EXECUTE format('
CREATE TABLE IF NOT EXISTS %I (id int)', 't_' || i);
END LOOP;
END
$func$ LANGUAGE plpgsql;


I tried it like this first, but this doesn't register as separately
executed commands for pg_stat_statements.


I was a bit surprised by that so I checked locally.  It does work as
expected provided that you set pg_stat_statements.track to all:


Ok, here is an updated patch set that does it that way.

I have committed the patches 0002 and 0003 from the previous patch set 
already.


I have also enhanced the TAP test a bit to check the actual content of 
the output across restarts.


I'm not too hung up on the 0001 patch if others don't like that approach.From 8938017869025598024f0aba950c0aeccbcbe651 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 23 Dec 2023 14:25:26 +0100
Subject: [PATCH v2 1/3] pg_stat_statements: Exclude from lcov functions that
 are impossible to test

The current code only supports installing version 1.4 of
pg_stat_statements and upgrading from there.  So the C code entry
points for older versions are not reachable from the tests.  To
prevent this from ruining the test coverage figures, mark those
functions are excluded from lcov.

Discussion: 
https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75b...@eisentraut.org
---
 contrib/pg_stat_statements/pg_stat_statements.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 6f62a531f7..8ac73bf55e 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -311,13 +311,15 @@ static bool pgss_save = true; /* whether to save 
stats across shutdown */
 PG_FUNCTION_INFO_V1(pg_stat_statements_reset);
 PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_7);
 PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_11);
+/* LCOV_EXCL_START */
+PG_FUNCTION_INFO_V1(pg_stat_statements);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_2);
+/* LCOV_EXCL_STOP */
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_3);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_8);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_9);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_10);
 PG_FUNCTION_INFO_V1(pg_stat_statements_1_11);
-PG_FUNCTION_INFO_V1(pg_stat_statements);
 PG_FUNCTION_INFO_V1(pg_stat_statements_info);
 
 static void pgss_shmem_request(void);
@@ -1610,6 +1612,8 @@ pg_stat_statements_1_3(PG_FUNCTION_ARGS)
return (Datum) 0;
 }
 
+/* LCOV_EXCL_START */
+
 Datum
 pg_stat_statements_1_2(PG_FUNCTION_ARGS)
 {
@@ -1633,6 +1637,8 @@ pg_stat_statements(PG_FUNCTION_ARGS)
return (Datum) 0;
 }
 
+/* LCOV_EXCL_STOP */
+
 /* Common code for all versions of pg_stat_statements() */
 static void
 pg_stat_statements_internal(FunctionCallInfo fcinfo,

base-commit: 7e6fb5da41d8ee1bddcd5058b7204018ef68d193
-- 
2.43.0

From 6fa54b21ae6060fd459f5330130491662361f2d5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 23 Dec 2023 14:25:26 +0100
Subject: [PATCH v2 2/3] pg_stat_statements: Add coverage for entry_dealloc()

This involves creating pg_stat_statements.max entries, then creating
even more entries, and checking that the limit is kept and the least
used entries are kicked out.

Discussion: 
https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75b...@eisentraut.org
---
 contrib/pg_stat_statements/Makefile   |  2 +-
 contrib/pg_stat_statements/expected/max.out   | 66 +++
 contrib/pg_stat_statements/meson.build|  1 +
 .../pg_stat_statements.conf   |  2 +
 contrib/pg_stat_statements/sql/max.sql| 44 +
 5 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 contrib/pg_stat_statements/expected/max.out
 create mode 100644 contrib/pg_stat_statements/sql/max.sql

diff --git a/contrib/pg_stat_statements/Makefile 
b/contrib/pg_stat_statements/Makefile
index aecd1d6a2a..7ee16e8350 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -19,7 +19,7 @@ LDFLAGS_SL += $(filter -lm, $(LIBS))
 
 REGRESS_OPTS = --temp-config 
$(top_srcdir)/contrib/pg_stat_statements/pg_stat_statements.conf
 REGRESS = select dml cursors utility level_tracking planning \
-   user_activity wal entry_timestamp cleanup oldextversions
+   user_activity wal entry_timestamp max cleanup oldextversions
 # Disabled because these tests require 
"shared_preload_libraries=pg_stat_statements",
 # which typical

Re: trying again to get incremental backup

2023-12-27 Thread Robert Haas

On Sat, Dec 23, 2023 at 4:51 PM Nathan Bossart  wrote:
> My compiler has the following complaint:
>
> ../postgresql/src/backend/postmaster/walsummarizer.c: In function 
> ‘GetOldestUnsummarizedLSN’:
> ../postgresql/src/backend/postmaster/walsummarizer.c:540:32: error: 
> ‘unsummarized_lsn’ may be used uninitialized in this function 
> [-Werror=maybe-uninitialized]
>   540 |  WalSummarizerCtl->pending_lsn = unsummarized_lsn;
>   |  ~~^~

Thanks. I don't think there's a real bug, but I pushed a fix, same as
what you had.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

cannot abort transaction 2737414167, it was already committed

2023-12-27 Thread Justin Pryzby

We had this:

< 2023-12-25 04:06:20.062 MST telsasoft >ERROR:  could not open file 
"pg_tblspc/16395/PG_16_202307071/16384/121010871": Input/output error
< 2023-12-25 04:06:20.062 MST telsasoft >STATEMENT:  commit
< 2023-12-25 04:06:20.062 MST telsasoft >WARNING:  AbortTransaction while in 
COMMIT state
< 2023-12-25 04:06:20.062 MST telsasoft >PANIC:  cannot abort transaction 
2737414167, it was already committed
< 2023-12-25 04:06:20.473 MST  >LOG:  server process (PID 14678) was terminated 
by signal 6: Aborted

The application is a daily cronjob which would've just done:

begin;
lo_unlink(); -- the client-side function called from pygresql;
DELETE FROM tbl WHERE col=%s;
commit;

The table being removed would've been a transient (but not "temporary")
table created ~1 day prior.

It's possible that the filesystem had an IO error, but I can't find any
evidence of that.  Postgres is running entirely on zfs, which says:

scan: scrub repaired 0B in 00:07:03 with 0 errors on Mon Dec 25 04:49:07 2023
errors: No known data errors

My main question is why an IO error would cause the DB to abort, rather
than raising an ERROR.

This is pg16 compiled at efa8f6064, runing under centos7.  ZFS is 2.2.2,
but the pool hasn't been upgraded to use the features new since 2.1.

(gdb) bt
#0  0x7fc961089387 in raise () from /lib64/libc.so.6
#1  0x7fc96108aa78 in abort () from /lib64/libc.so.6
#2  0x009438b7 in errfinish (filename=filename@entry=0xac8e20 "xact.c", 
lineno=lineno@entry=1742, funcname=funcname@entry=0x9a6600 <__func__.32495> 
"RecordTransactionAbort") at elog.c:604
#3  0x0054d6ab in RecordTransactionAbort 
(isSubXact=isSubXact@entry=false) at xact.c:1741
#4  0x0054d7bd in AbortTransaction () at xact.c:2814
#5  0x0054e015 in AbortCurrentTransaction () at xact.c:3415
#6  0x00804e4e in PostgresMain (dbname=0x12ea840 "ts", 
username=0x12ea828 "telsasoft") at postgres.c:4354
#7  0x0077bdd6 in BackendRun (port=, port=) at postmaster.c:4465
#8  BackendStartup (port=0x12e44c0) at postmaster.c:4193
#9  ServerLoop () at postmaster.c:1783
#10 0x0077ce9a in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x12ad280) at postmaster.c:1467
#11 0x004ba8b8 in main (argc=3, argv=0x12ad280) at main.c:198

#3  0x0054d6ab in RecordTransactionAbort 
(isSubXact=isSubXact@entry=false) at xact.c:1741
xid = 2737414167
rels = 0x94f549 
ndroppedstats = 0
droppedstats = 0x0

#4  0x0054d7bd in AbortTransaction () at xact.c:2814
is_parallel_worker = false

-- 
Justin

Re: Should we remove -Wdeclaration-after-statement?

Jelte Fennema-Nio  writes:
> Postgres currently requires all variables to be declared at the top of
> the function, because it specifies -Wdeclaration-after-statement. One
> of the reasons that we had this warning was because C89 required this
> style of declaration. Requiring it everywhere made backporting easier,
> since some of our older supported PG versions needed to compile on
> C89. Now that we have dropped support for PG11 that reason goes away,
> since now all supported Postgres versions require C99. So, I think
> it's worth reconsidering if we want this warning to be enabled or not.

This has already been debated, and the conclusion was that we would
stick to the existing style for consistency reasons.  The fact that
back-portable patches required it was only one of the arguments, and
not the decisive one.

regards, tom lane

Re: pg_upgrade failing for 200+ million Large Objects

Robins Tharakan  writes:
> Applying all 4 patches, I also see good performance improvement.
> With more Large Objects, although pg_dump improved significantly,
> pg_restore is now comfortably an order of magnitude faster.

Yeah.  The key thing here is that pg_dump can only parallelize
the data transfer, while (with 0004) pg_restore can parallelize
large object creation and owner-setting as well as data transfer.
I don't see any simple way to improve that on the dump side,
but I'm not sure we need to.  Zillions of empty objects is not
really the use case to worry about.  I suspect that a more realistic
case with moderate amounts of data in the blobs would make pg_dump
look better.

regards, tom lane

Re: Statistics Import and Export

2023-12-27 Thread Bruce Momjian

On Wed, Dec 27, 2023 at 01:08:47PM +0100, Tomas Vondra wrote:
> On 12/26/23 20:19, Tom Lane wrote:
> > Bruce Momjian  writes:
> >> I think we need a robust API to handle two cases:
> > 
> >> *  changes in how we store statistics
> >> *  changes in how how data type values are represented in the statistics
> > 
> >> We have had such changes in the past, and I think these two issues are
> >> what have prevented import/export of statistics up to this point.
> >> Developing an API that doesn't cleanly handle these will cause long-term
> >> pain.
> > 
> > Agreed.
> > 
> 
> I agree the format is important - we don't want to end up with a format
> that's cumbersome or inconvenient to use. But I don't think the proposed
> format is somewhat bad in those respects - it mostly reflects how we
> store statistics and if I was designing a format for humans, it might
> look a bit differently. But that's not the goal here, IMHO.
> 
> I don't quite understand the two cases above. Why should this affect how
> we store statistics? Surely, making the statistics easy to use for the
> optimizer is much more important than occasional export/import.

The two items above were to focus on getting a solution that can easily
handle future statistics storage changes.  I figured we would want to
manipulate the data as a table internally so I am confused why we would
export JSON instead of a COPY format.  I didn't think we were changing
how we internall store or use the statistics.

> >> In summary, I think we need an SQL-level command for this.
> > 
> > I think a SQL command is an actively bad idea.  It'll just add development
> > and maintenance overhead that we don't need.  When I worked on this topic
> > years ago at Salesforce, I had things set up with simple functions, which
> > pg_dump would invoke by writing more or less
> > 
> > SELECT pg_catalog.load_statistics();
> > 
> > This has a number of advantages, not least of which is that an extension
> > could plausibly add compatible functions to older versions.  The trick,
> > as you say, is to figure out what the argument lists ought to be.
> > Unfortunately I recall few details of what I wrote for Salesforce,
> > but I think I had it broken down in a way where there was a separate
> > function call occurring for each pg_statistic "slot", thus roughly
> > 
> > load_statistics(table regclass, attname text, stakind int, stavalue ...);
> > 
> > I might have had a separate load_statistics_xxx function for each
> > stakind, which would ease the issue of deciding what the datatype
> > of "stavalue" is.  As mentioned already, we'd also need some sort of
> > version identifier, and we'd expect the load_statistics() functions
> > to be able to transform the data if the old version used a different
> > representation.  I agree with the idea that an explicit representation
> > of the source table attribute's type would be wise, too.
> > 
> 
> Yeah, this is pretty much what I meant by "functional" interface. But if
> I said maybe the format implemented by the patch is maybe too close to
> how we store the statistics, then this has exactly the same issue. And
> it has other issues too, I think - it breaks down the stats into
> multiple function calls, so ensuring the sanity/correctness of whole
> sets of statistics gets much harder, I think.

I was suggesting an SQL command because this feature is going to need a
lot of options and do a lot of different things, I am afraid, and a
single function might be too complex to manage.

> I'm not sure about the extension idea. Yes, we could have an extension
> providing such functions, but do we have any precedent of making
> pg_upgrade dependent on an external extension? I'd much rather have
> something built-in that just works, especially if we intend to make it
> the default behavior (which I think should be our aim here).

Uh, an extension seems nice to allow people in back branches to install
it, but not for normal usage.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.

Re: trying again to get incremental backup

2023-12-27 Thread Nathan Bossart

On Wed, Dec 27, 2023 at 09:11:02AM -0500, Robert Haas wrote:
> Thanks. I don't think there's a real bug, but I pushed a fix, same as
> what you had.

Thanks!  I also noticed that WALSummarizerLock probably needs a mention in
wait_event_names.txt.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

change regexp_substr first argument make tests more easier to understand.

hi.
https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/expected/strings.out#n928

SELECT regexp_substr('abcabcabc', 'a.c');
SELECT regexp_substr('abcabcabc', 'a.c', 2);
SELECT regexp_substr('abcabcabc', 'a.c', 1, 3);
SELECT regexp_substr('abcabcabc', 'a.c', 1, 4) IS NULL AS t;
SELECT regexp_substr('abcabcabc', 'A.C', 1, 2, 'i');

they all return 'abc', there are 3 'abc ' in  string 'abcabcabc'
except IS NULL query.
maybe we can change regexp_substr first argument from "abcabcabc" to
"abcaXcaYc".
so the result would be more easier to understand.

[PATCH] Exponential backoff for auth_delay

2023-12-27 Thread Michael Banck

Hi,

we had a conversation with a customer about security compliance a while
ago and one thing they were concerned about was avoiding brute-force
attemps for remote password guessing. This is should not be a big
concern if reasonably secure passwords are used and increasing SCRAM
iteration count can also help, but generally auth_delay is recommended
for this if there are concerns.

This patch adds exponential backoff so that one can choose a small
initial value which gets doubled for each failed authentication attempt
until a maximum wait time (which is 10s by default, but can be disabled
if so desired).

Currently, this patch tracks remote hosts but not users, the idea being
that a remote attacker likely tries several users from a particular
host, but this could in theory be extended to users if there are
concerns.

The patch is partly based on an earlier, more ambitious attempt at
extending auth_delay by 成之焕 from a year ago:
https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com


Michael
>From 4c964c866010bbdbeee9f0b2a755d97c91c5c091 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Wed, 27 Dec 2023 15:55:39 +0100
Subject: [PATCH v1] Add optional exponential backoff to auth_delay contrib
 module.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This adds two new GUCs for auth_delay, exp_backoff and max_seconds. The former
controls whether exponential backoff should be used or not, the latter sets an
maximum delay (default is 10s) in case exponential backoff is active.

The exponential backoff is tracked per remote host and doubled for every failed
login attempt (i.e., wrong password, not just missing pg_hba line or database)
and reset to auth_delay.milliseconds after a successful authentication from
that host.

This patch is partly based on a larger (but ultimately rejected) patch by
成之焕.

Authors: Michael Banck, 成之焕
Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com
---
 contrib/auth_delay/auth_delay.c  | 202 ++-
 doc/src/sgml/auth-delay.sgml |  41 +++
 src/tools/pgindent/typedefs.list |   1 +
 3 files changed, 243 insertions(+), 1 deletion(-)

diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c
index 8d6e4d2778..95e56db6ec 100644
--- a/contrib/auth_delay/auth_delay.c
+++ b/contrib/auth_delay/auth_delay.c
@@ -14,24 +14,50 @@
 #include 
 
 #include "libpq/auth.h"
+#include "miscadmin.h"
 #include "port.h"
+#include "storage/ipc.h"
+#include "storage/shmem.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
 PG_MODULE_MAGIC;
 
+#define MAX_CONN_RECORDS 50
+
 /* GUC Variables */
 static int	auth_delay_milliseconds = 0;
+static bool auth_delay_exp_backoff = false;
+static int	auth_delay_max_seconds = 0;
 
 /* Original Hook */
 static ClientAuthentication_hook_type original_client_auth_hook = NULL;
 
+typedef struct AuthConnRecord
+{
+	char		remote_host[NI_MAXHOST];
+	bool		used;
+	double		sleep_time;		/* in milliseconds */
+} AuthConnRecord;
+
+static shmem_startup_hook_type shmem_startup_next = NULL;
+static shmem_request_hook_type shmem_request_next = NULL;
+static AuthConnRecord *acr_array = NULL;
+
+static AuthConnRecord *find_conn_record(char *remote_host, int *free_index);
+static double record_failed_conn_auth(Port *port);
+static double find_conn_max_delay(void);
+static void record_conn_failure(AuthConnRecord *acr);
+static void cleanup_conn_record(Port *port);
+
 /*
  * Check authentication
  */
 static void
 auth_delay_checks(Port *port, int status)
 {
+	double		delay;
+
 	/*
 	 * Any other plugins which use ClientAuthentication_hook.
 	 */
@@ -43,8 +69,150 @@ auth_delay_checks(Port *port, int status)
 	 */
 	if (status != STATUS_OK)
 	{
-		pg_usleep(1000L * auth_delay_milliseconds);
+		if (auth_delay_exp_backoff)
+		{
+			/*
+			 * Exponential backoff per remote host.
+			 */
+			delay = record_failed_conn_auth(port);
+			if (auth_delay_max_seconds > 0)
+delay = Min(delay, 1000L * auth_delay_max_seconds);
+		}
+		else
+			delay = auth_delay_milliseconds;
+		if (delay > 0)
+		{
+			elog(DEBUG1, "Authentication delayed for %g seconds", delay / 1000.0);
+			pg_usleep(1000L * (long) delay);
+		}
+	}
+	else
+	{
+		cleanup_conn_record(port);
+	}
+}
+
+static double
+record_failed_conn_auth(Port *port)
+{
+	AuthConnRecord *acr = NULL;
+	int			j = -1;
+
+	acr = find_conn_record(port->remote_host, &j);
+
+	if (!acr)
+	{
+		if (j == -1)
+
+			/*
+			 * No free space, MAX_CONN_RECORDS reached. Wait as long as the
+			 * largest delay for any remote host.
+			 */
+			return find_conn_max_delay();
+		acr = &acr_array[j];
+		strcpy(acr->remote_host, port->remote_host);
+		acr->used = true;
+		elog(DEBUG1, "new connection: %s, index: %d", acr->remote_host, j);
+	}
+
+	record_conn_failure(acr);
+	return acr->sleep_time;
+}
+
+static AuthConnRecord *
+find_conn_record(char *remote_host, int *free_index)
+{
+	int

Re: Emit fewer vacuum records by reaping removable tuples during pruning

2023-12-27 Thread Melanie Plageman

On Sat, Dec 23, 2023 at 9:14 PM Michael Paquier  wrote:
>
> On Mon, Nov 13, 2023 at 07:06:15PM -0500, Melanie Plageman wrote:
> > As best I can tell, our best case scenario is Thomas' streaming read API
> > goes in, we add vacuum as a user, and we can likely remove the skip
> > range logic.
>
> This does not prevent the work you've been doing in 0001 and 0002
> posted upthread, right? Some progress is always better than no
> progress

Correct. Peter and I were mainly discussing next refactoring steps as
we move toward combining the prune, freeze, and VM records. This
thread's patches stand alone.

> I can see the appeal behind doing 0001 actually to keep
> the updates of the block numbers closer to where we determine if
> relation truncation is safe of not rather than use an intermediate
> state in LVPagePruneState.

Exactly.

> 0002 is much, much, much trickier..

Do you have specific concerns about its correctness? I understand it
is an area where we have to be sure we are correct. But, to be fair,
that is true of all the pruning and vacuuming code.

- Melanie

postgres_fdw fails to see that array type belongs to extension

2023-12-27 Thread David Geier


Hi hackers,

We use postgres_fdw to connect two databases. Both DBs have an extension 
installed which provides a custom string data type. Our extension is 
known to the FDW as we created the foreign server with our extension 
listed in the "extensions" option.


The filter clause of the query SELECT * FROM test WHERE col = 'foo' OR 
col = 'bar' is pushed down to the remote, while the filter clause of the 
semantically equivalent query SELECT * FROM test WHERE col IN ('foo', 
'bar') is not.


I traced this down to getExtensionOfObject() called from 
lookup_shippable(). getExtensionOfObject() doesn't recurse but only 
checks first level dependencies and only checks for extension 
dependencies. However, the IN operator takes an array of our custom data 
type as argument (type is typically prefixed with _ in pg_type). This 
array type is only dependent on our extension via the custom data type 
in two steps which postgres_fdw doesn't see. Therefore, postgres_fdw 
doesn't allow for push-down of the IN.


Thoughts?

--
David Geier
(ServiceNow)

add function argument names to regex* functions.

Hi.
similar to [1], add function argument names to the following functions:
regexp_like, regexp_match,regexp_matches,regexp_replace,
regexp_substr,regexp_split_to_array,regexp_split_to_table,regexp_count

so I call these function in a different notation[2], like:

SELECT regexp_like(string=>'a'||CHR(10)||'d', pattern=>'a.d', flags:='n');
select regexp_match(string=>'abc',n pattern=>'(B)(c)', flags=>'i');
select regexp_matches(string=>'Programmer', pattern=>'(\w)(.*?\1)',
flags=>'ig');
SELECT regexp_replace(source=>'A PostgreSQL function',
pattern=>'a|e|i|o|u', replacement=>'X', start=>1, n=>4, flags=>'i');
SELECT regexp_substr(string=>'1234567890',
pattern=>'(123)(4(56)(78))', start=>1, n=>1, flags=>'i', subexpr=>4);
SELECT regexp_split_to_array(string=>'thE QUick bROWn FOx jUMPs ovEr
The lazy dOG', pattern=>'e', flags=>'i');

SELECT foo, length(foo)
FROM regexp_split_to_table(string=>'thE QUick bROWn FOx jUMPs ovEr The
lazy dOG', pattern=>'e',flags=>'i') AS foo;
SELECT regexp_count(string=>'ABCABCABCABC', pattern=>'Abc', start=>1,
flags=>'i');

In [3], except the above mentioned function, there is a "substring"
function.
I want to refactor substring function argument names. it looks like:
   Schema   |   Name| Result data type |Argument data
types | Type
+---+--++--
 pg_catalog | substring | bit  | bits bit, "from" integer
 | func
 pg_catalog | substring | bit  | bits bit, "from" integer,
"for" integer| func
 pg_catalog | substring | bytea| bytes bytea, "from"
integer| func
 pg_catalog | substring | bytea| bytes bytea, "from"
integer, "for" integer | func
 pg_catalog | substring | text | string text, "from"
integer| func
 pg_catalog | substring | text | string text, "from"
integer, "for" integer | func
 pg_catalog | substring | text | string text, pattern text
 | func
 pg_catalog | substring | text | text, text, text
 | func
(8 rows)

As you can see, the substring function argument names need an explicit
double quote,
which doesn't look good, so I gave up.

[1]
https://www.postgresql.org/message-id/flat/877cw3jl8y@wibble.ilmari.org
[2]
https://www.postgresql.org/docs/current/sql-syntax-calling-funcs.html#SQL-SYNTAX-CALLING-FUNCS
[3] https://www.postgresql.org/docs/current/functions-matching.html
From f86ac65a78847062d68bb7e299a1e45e2a3d9477 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Wed, 27 Dec 2023 19:59:34 +0800
Subject: [PATCH v1 1/1] add function argument names to regex.* functions.

 Specifically add function argument names to the following funtions:
 regexp_like, regexp_match, regexp_matches, regexp_replace,
 regexp_substr, regexp_split_to_array,
 regexp_split_to_table, regexp_count.

 So these functions can be easiler to understand in psql via \df. also more important make these functions
 can be called in different notaions.

---
 src/include/catalog/pg_proc.dat | 71 ++---
 1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f526..e3977a59 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3611,105 +3611,148 @@
   prosrc => 'replace_text' },
 { oid => '2284', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
-  proargtypes => 'text text text', prosrc => 'textregexreplace_noopt' },
+  proargtypes => 'text text text',
+  proargnames => '{source, pattern, replacement}',
+  prosrc => 'textregexreplace_noopt' },
 { oid => '2285', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
-  proargtypes => 'text text text text', prosrc => 'textregexreplace' },
+  proargtypes => 'text text text text',
+  proargnames => '{source, pattern, replacement, flags}',
+  prosrc => 'textregexreplace' },
 { oid => '6251', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text int4 int4 text',
+  proargnames => '{source, pattern, replacement, start, n, flags}',
   prosrc => 'textregexreplace_extended' },
 { oid => '6252', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text int4 int4',
+  proargnames => '{source, pattern, replacement, start, n}',
   prosrc => 'textregexreplace_extended_no_flags' },
 { oid => '6253', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text int4',
+  proargnames => '{source, pattern, replacement, start}',
   prosrc => 'textregexreplace_extended_no_n' },
 { oid => '3396', descr => 'find first match for regexp',
   proname => 'regexp_match', prorettype => '_tex

Re: [PATCH] pg_dump: Do not dump statistics for excluded tables

Rian McGuire  writes:
> I've attached a patch against master that addresses a small bug in pg_dump.
> Previously, pg_dump would include CREATE STATISTICS statements for
> tables that were excluded from the dump, causing reload to fail if any
> excluded tables had extended statistics.

I agree that's a bug ...

> The patch skips the creation of the StatsExtInfo if the associated
> table does not have the DUMP_COMPONENT_DEFINITION flag set. This is
> similar to how getPublicationTables behaves if a table is excluded.

... but I don't like the details of this patch (and I'm not too
thrilled with the implementation of getPublicationTables, either).
The style in pg_dump is to put such decisions into separate
policy-setting subroutines.  Also, skipping creation of the
DumpableObject altogether is the wrong thing because it'd prevent
pg_dump from tracing or reasoning about dependencies involving the
stats object, which can be relevant even if the object itself isn't
dumped --- this is why all the other data-collection subroutines
operate as they do.  getPublicationTables can probably get away
with its low-rent approach given that publication membership isn't
represented by pg_depend entries, but it's far from clear that it'll
never be an issue for stats.

So I think it needs to be more like the attached.
(I did use your test case verbatim.)

regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8c0b5486b9..050a831226 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2046,6 +2046,26 @@ selectDumpablePublicationObject(DumpableObject *dobj, Archive *fout)
 		DUMP_COMPONENT_ALL : DUMP_COMPONENT_NONE;
 }
 
+/*
+ * selectDumpableStatisticsObject: policy-setting subroutine
+ *		Mark an extended statistics object as to be dumped or not
+ *
+ * We dump an extended statistics object if the schema it's in and the table
+ * it's for are being dumped.  (This'll need more thought if statistics
+ * objects ever support cross-table stats.)
+ */
+static void
+selectDumpableStatisticsObject(StatsExtInfo *sobj, Archive *fout)
+{
+	if (checkExtensionMembership(&sobj->dobj, fout))
+		return;	/* extension membership overrides all else */
+
+	sobj->dobj.dump = sobj->dobj.namespace->dobj.dump_contains;
+	if (sobj->stattable == NULL ||
+		!(sobj->stattable->dobj.dump & DUMP_COMPONENT_DEFINITION))
+		sobj->dobj.dump = DUMP_COMPONENT_NONE;
+}
+
 /*
  * selectDumpableObject: policy-setting subroutine
  *		Mark a generic dumpable object as to be dumped or not
@@ -7291,6 +7311,7 @@ getExtendedStatistics(Archive *fout)
 	int			i_stxname;
 	int			i_stxnamespace;
 	int			i_stxowner;
+	int			i_stxrelid;
 	int			i_stattarget;
 	int			i;
 
@@ -7302,11 +7323,11 @@ getExtendedStatistics(Archive *fout)
 
 	if (fout->remoteVersion < 13)
 		appendPQExpBufferStr(query, "SELECT tableoid, oid, stxname, "
-			 "stxnamespace, stxowner, (-1) AS stxstattarget "
+			 "stxnamespace, stxowner, stxrelid, (-1) AS stxstattarget "
 			 "FROM pg_catalog.pg_statistic_ext");
 	else
 		appendPQExpBufferStr(query, "SELECT tableoid, oid, stxname, "
-			 "stxnamespace, stxowner, stxstattarget "
+			 "stxnamespace, stxowner, stxrelid, stxstattarget "
 			 "FROM pg_catalog.pg_statistic_ext");
 
 	res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
@@ -7318,6 +7339,7 @@ getExtendedStatistics(Archive *fout)
 	i_stxname = PQfnumber(res, "stxname");
 	i_stxnamespace = PQfnumber(res, "stxnamespace");
 	i_stxowner = PQfnumber(res, "stxowner");
+	i_stxrelid = PQfnumber(res, "stxrelid");
 	i_stattarget = PQfnumber(res, "stxstattarget");
 
 	statsextinfo = (StatsExtInfo *) pg_malloc(ntups * sizeof(StatsExtInfo));
@@ -7332,10 +7354,12 @@ getExtendedStatistics(Archive *fout)
 		statsextinfo[i].dobj.namespace =
 			findNamespace(atooid(PQgetvalue(res, i, i_stxnamespace)));
 		statsextinfo[i].rolname = getRoleName(PQgetvalue(res, i, i_stxowner));
+		statsextinfo[i].stattable =
+			findTableByOid(atooid(PQgetvalue(res, i, i_stxrelid)));
 		statsextinfo[i].stattarget = atoi(PQgetvalue(res, i, i_stattarget));
 
 		/* Decide whether we want to dump it */
-		selectDumpableObject(&(statsextinfo[i].dobj), fout);
+		selectDumpableStatisticsObject(&(statsextinfo[i]), fout);
 	}
 
 	PQclear(res);
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2fe3cbed9a..673ca5c92d 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -423,7 +423,8 @@ typedef struct _indexAttachInfo
 typedef struct _statsExtInfo
 {
 	DumpableObject dobj;
-	const char *rolname;
+	const char *rolname;		/* owner */
+	TableInfo  *stattable;		/* link to table the stats are for */
 	int			stattarget;		/* statistics target */
 } StatsExtInfo;
 
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index eb3ec534b4..a671603cd2 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -3742,14 +3742,15 @@ my %tests

Re: postgres_fdw fails to see that array type belongs to extension

David Geier  writes:
> The filter clause of the query SELECT * FROM test WHERE col = 'foo' OR 
> col = 'bar' is pushed down to the remote, while the filter clause of the 
> semantically equivalent query SELECT * FROM test WHERE col IN ('foo', 
> 'bar') is not.

> I traced this down to getExtensionOfObject() called from 
> lookup_shippable(). getExtensionOfObject() doesn't recurse but only 
> checks first level dependencies and only checks for extension 
> dependencies. However, the IN operator takes an array of our custom data 
> type as argument (type is typically prefixed with _ in pg_type). This 
> array type is only dependent on our extension via the custom data type 
> in two steps which postgres_fdw doesn't see. Therefore, postgres_fdw 
> doesn't allow for push-down of the IN.

Hmm.  It seems odd that if an extension defines a type, the type is
listed as a member of the extension but the array type is not.
That makes it look like the array type is an externally-created
thing that happens to depend on the extension, when it's actually
part of the extension.  I'm surprised we've not run across other
misbehaviors traceable to that.

Of course, fixing it like that leads to needing to change the
contents of pg_depend, so it wouldn't be back-patchable.  But it
seems like the best way in the long run.

regards, tom lane

Re: Should we remove -Wdeclaration-after-statement?

2023-12-27 Thread Jelte Fennema-Nio

On Wed, 27 Dec 2023 at 16:05, Tom Lane wrote:
> This has already been debated, and the conclusion was that we would
> stick to the existing style for consistency reasons.

I looked through the archives quite a bit, but I couldn't find any
conclusive debate about the current declaration style. Definitely not
one with "consistency reasons" as being the conclusion. Could you
point me to the place where that conclusion was reached? Or could you
at least clarify what consistency you believe is lost by removing the
warning? The threads discussing this warning that I did find were the
following:

The initial addition of the warning flag[1], which has very little discussion.

Introducing the C99 requirement[2]. Robert and you both preferred the
current declaration style. Andrew and Andres both would want to accept
the new declaration style.

Another where removal of this warning was suggested[3], and where
Andres said he was in favor of removing the warning. But he didn't
think fighting for it was worth the effort at the time to fight you
and Robert, when he was trying to get the general C99 requirement in.

And finally, one that was started by me, where I suggest an automated
refactor[4]. This change got shot down because it would cause lots of
backpatching problems (and because it was using perl regexes instead
of an AST parser to do the automated refactor). Ranier and you were
proponents of the current declaration style. Chapman was in favor of
the new declaration style. Andrew seems neutral.

P.S. Note, that I'm not suggesting a complete refactor this time. I'm
only proposing to relax the rules, and disable the warning, so newly
written code can benefit. But if the only reason not to remove the
warning is that then there would be two styles of declaration in the
codebase, then I'm happy to create another refactoring script that
moves declarations down to their first usage. (Which could then be run
on all backbranches to make sure there is no backpatching pain)

[1]: https://www.postgresql.org/message-id/flat/417263F8.4060102%40samurai.com
[2]:
https://www.postgresql.org/message-id/flat/CA%2BTgmoYvHzFkwChsamwbBrLNJRcRq%2BfyTwveFaN_YOWUsRnfpw%40mail.gmail.com#931f4c68237caf4c60b4dc298236aef1
[3]:
https://www.postgresql.org/message-id/flat/20181213210012.i7iihamlbi7vfdiw%40alap3.anarazel.de#00304f9dfc039da87383fed30be62cff
[4]:
https://www.postgresql.org/message-id/flat/AM5PR83MB0178E68E4FF1BAF9C66DF0D1F7C09%40AM5PR83MB0178.EURPRD83.prod.outlook.com

Re: Add support for attribute((returns_nonnull))

2023-12-27 Thread Tristan Partin


On Wed Dec 27, 2023 at 6:42 AM CST, Peter Eisentraut wrote:

On 19.12.23 21:43, Tristan Partin wrote:
> Here is a patch which adds support for the returns_nonnull attribute
> alongside all the other attributes we optionally support.
>
> I recently wound up in a situation where I was checking for NULL return
> values of a function that couldn't ever return NULL because the
> inability to allocate memory was always elog(ERROR)ed (aborted).
>
> I didn't go through and mark anything, but I feel like it could be
> useful for people going forward, including myself.

I think it would be useful if this patch series contained a patch that
added some initial uses of this.  That way we can check that the
proposed definition actually works, and we can observe what it does,
with respect to warnings, static analysis, etc.


Good point. Patch attached.

I tried to find some ways to prove some value, but I couldn't. Take this
example for instance:

static const char word[] = { 'h', 'e', 'l', 'l', 'o' };

const char * __attribute__((returns_nonnull))
hello()
{
return word;
}

int
main(void)
{
const char *r;

r = hello();
if (r == NULL)
return 1;

return 0;
}

I would have thought I could get gcc or clang to warn on a wasteful NULL
check, but alas. I also didn't see any code generation improvements, but
I am assuming that the example is too contrived. I couldn't find any
good things online that had examples of when such an annotation forced
the compiler to warn or create more optimized code.

If you return NULL from the hello() function, clang will warn that the
attribute doesn't match reality.

--
Tristan Partin
Neon (https://neon.tech)
From fe4916093d0ce036e7b70595b39351b4ecf93798 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 19 Dec 2023 14:39:03 -0600
Subject: [PATCH v2 1/2] Add support for __attribute__((returns_nonnull))

Allows for marking functions that can't possibly return NULL, like those
that always elog(ERROR) for instance in the case of failures.

While not only being good documentation, annotating functions which
can't return NULL can lead to better code generation since the optimizer
can more easily analyze a function. Quoting the LLVM documentation:

> The returns_nonnull attribute implies that returning a null pointer is
> undefined behavior, which the optimizer may take advantage of.

The more we mark, the better the analysis can become.
---
 src/include/c.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/include/c.h b/src/include/c.h
index 26bf7ec16e..e3a127f954 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -285,6 +285,18 @@
 #define pg_unreachable() abort()
 #endif
 
+/*
+ * Place on functions which return a pointer but can't return NULL. When used,
+ * it can allow the compiler to warn if a NULL check occurs in the parent
+ * function because that NULL check would always fail. It is also an opportunity
+ * to help the compiler with optimizations.
+ */
+#if __has_attribute (returns_nonnull)
+#define pg_returns_nonnull __attribute__((returns_nonnull))
+#else
+#define pg_returns_nonnull
+#endif
+
 /*
  * Hints to the compiler about the likelihood of a branch. Both likely() and
  * unlikely() return the boolean value of the contained expression.
-- 
Tristan Partin
Neon (https://neon.tech)

From 1dc3544f887c857f55cc5579df240179496f72b6 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Wed, 27 Dec 2023 11:21:11 -0600
Subject: [PATCH v2 2/2] Mark some initial candidates as returns_nonnull

---
 src/include/common/fe_memutils.h | 22 --
 src/include/utils/palloc.h   | 28 ++--
 src/include/utils/pg_locale.h|  2 +-
 3 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/src/include/common/fe_memutils.h b/src/include/common/fe_memutils.h
index 89601cc778..0f026dbd18 100644
--- a/src/include/common/fe_memutils.h
+++ b/src/include/common/fe_memutils.h
@@ -9,6 +9,8 @@
 #ifndef FE_MEMUTILS_H
 #define FE_MEMUTILS_H
 
+#include "c.h"
+
 /*
  * Flags for pg_malloc_extended and palloc_extended, deliberately named
  * the same as the backend flags.
@@ -22,11 +24,11 @@
  * "Safe" memory allocation functions --- these exit(1) on failure
  * (except pg_malloc_extended with MCXT_ALLOC_NO_OOM)
  */
-extern char *pg_strdup(const char *in);
-extern void *pg_malloc(size_t size);
-extern void *pg_malloc0(size_t size);
+extern pg_returns_nonnull char *pg_strdup(const char *in);
+extern pg_returns_nonnull void *pg_malloc(size_t size);
+extern pg_returns_nonnull void *pg_malloc0(size_t size);
 extern void *pg_malloc_extended(size_t size, int flags);
-extern void *pg_realloc(void *ptr, size_t size);
+extern pg_returns_nonnull void *pg_realloc(void *ptr, size_t size);
 extern void pg_free(void *ptr);
 
 /*
@@ -52,12 +54,12 @@ extern void pg_free(void *ptr);

Re: Two small bugs in guc.c

2023-12-27 Thread Tristan Partin


On Tue Dec 26, 2023 at 1:02 PM CST, Tom Lane wrote:

I investigated the report at [1] about pg_file_settings not reporting
invalid values of "log_connections".  It turns out it's broken for
PGC_BACKEND and PGC_SU_BACKEND parameters, but not other ones.
The cause is a bit of premature optimization in this logic:

 * If a PGC_BACKEND or PGC_SU_BACKEND parameter is changed in
 * the config file, we want to accept the new value in the
 * postmaster (whence it will propagate to
 * subsequently-started backends), but ignore it in existing
 * backends. ...

Upon detecting that case, set_config_option just returns -1 immediately
without bothering to validate the value.  It should check for invalid
input before returning -1, which we can mechanize with a one-line fix:

-return -1;
+changeVal = false;

While studying this, I also noted that the bit to prevent changes in
parallel workers seems seriously broken:

if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE)
ereport(elevel,
(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 errmsg("cannot set parameters during a parallel operation")));

This is evidently assuming that ereport() won't return, which seems
like a very dubious assumption given the various values that elevel
can have.  Maybe it's accidentally true -- I don't recall any
reports of trouble here -- but it sure looks fragile.

Hence, proposed patch attached.


Looks good to me.

--
Tristan Partin
Neon (https://neon.tech)

Re: introduce dynamic shared memory registry

2023-12-27 Thread Nathan Bossart

Here is a new version of the patch.  In addition to various small changes,
I've rewritten the test suite to use an integer and a lock, added a
dsm_registry_find() function, and adjusted autoprewarm to use the registry.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f8d48285e8117ff0bbf35f5022f3096df59385ae Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 11 Oct 2023 22:07:26 -0500
Subject: [PATCH v2 1/2] add dsm registry

---
 src/backend/storage/ipc/Makefile  |   1 +
 src/backend/storage/ipc/dsm_registry.c| 209 ++
 src/backend/storage/ipc/ipci.c|   3 +
 src/backend/storage/ipc/meson.build   |   1 +
 src/backend/storage/lmgr/lwlock.c |   4 +
 src/backend/storage/lmgr/lwlocknames.txt  |   1 +
 .../utils/activity/wait_event_names.txt   |   3 +
 src/include/storage/dsm_registry.h|  24 ++
 src/include/storage/lwlock.h  |   4 +-
 src/test/modules/Makefile |   1 +
 src/test/modules/meson.build  |   1 +
 src/test/modules/test_dsm_registry/.gitignore |   4 +
 src/test/modules/test_dsm_registry/Makefile   |  23 ++
 .../expected/test_dsm_registry.out|  16 ++
 .../modules/test_dsm_registry/meson.build |  33 +++
 .../sql/test_dsm_registry.sql |   7 +
 .../test_dsm_registry--1.0.sql|  10 +
 .../test_dsm_registry/test_dsm_registry.c |  75 +++
 .../test_dsm_registry.control |   4 +
 src/tools/pgindent/typedefs.list  |   3 +
 20 files changed, 426 insertions(+), 1 deletion(-)
 create mode 100644 src/backend/storage/ipc/dsm_registry.c
 create mode 100644 src/include/storage/dsm_registry.h
 create mode 100644 src/test/modules/test_dsm_registry/.gitignore
 create mode 100644 src/test/modules/test_dsm_registry/Makefile
 create mode 100644 src/test/modules/test_dsm_registry/expected/test_dsm_registry.out
 create mode 100644 src/test/modules/test_dsm_registry/meson.build
 create mode 100644 src/test/modules/test_dsm_registry/sql/test_dsm_registry.sql
 create mode 100644 src/test/modules/test_dsm_registry/test_dsm_registry--1.0.sql
 create mode 100644 src/test/modules/test_dsm_registry/test_dsm_registry.c
 create mode 100644 src/test/modules/test_dsm_registry/test_dsm_registry.control

diff --git a/src/backend/storage/ipc/Makefile b/src/backend/storage/ipc/Makefile
index 6d5b921038..d8a1653eb6 100644
--- a/src/backend/storage/ipc/Makefile
+++ b/src/backend/storage/ipc/Makefile
@@ -12,6 +12,7 @@ OBJS = \
 	barrier.o \
 	dsm.o \
 	dsm_impl.o \
+	dsm_registry.o \
 	ipc.o \
 	ipci.o \
 	latch.o \
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
new file mode 100644
index 00..5fc970001e
--- /dev/null
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -0,0 +1,209 @@
+/*-
+ *
+ * dsm_registry.c
+ *
+ * Functions for interfacing with the dynamic shared memory registry.  This
+ * provides a way for libraries to use shared memory without needing to
+ * request it at startup time via a shmem_request_hook.
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/backend/storage/ipc/dsm_registry.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+
+#include "lib/dshash.h"
+#include "storage/dsm_registry.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "utils/memutils.h"
+
+typedef struct DSMRegistryCtxStruct
+{
+	dsa_handle	dsah;
+	dshash_table_handle dshh;
+} DSMRegistryCtxStruct;
+
+static DSMRegistryCtxStruct *DSMRegistryCtx;
+
+typedef struct DSMRegistryEntry
+{
+	char		key[256];
+	dsm_handle	handle;
+} DSMRegistryEntry;
+
+static const dshash_parameters dsh_params = {
+	offsetof(DSMRegistryEntry, handle),
+	sizeof(DSMRegistryEntry),
+	dshash_memcmp,
+	dshash_memhash,
+	LWTRANCHE_DSM_REGISTRY_HASH
+};
+
+static dsa_area *dsm_registry_dsa;
+static dshash_table *dsm_registry_table;
+
+static void init_dsm_registry(void);
+
+Size
+DSMRegistryShmemSize(void)
+{
+	return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+}
+
+void
+DSMRegistryShmemInit(void)
+{
+	bool		found;
+
+	DSMRegistryCtx = (DSMRegistryCtxStruct *)
+		ShmemInitStruct("DSM Registry Data",
+		DSMRegistryShmemSize(),
+		&found);
+
+	if (!found)
+	{
+		DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+		DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
+	}
+}
+
+/*
+ * Initialize or attach to the dynamic shared hash table that stores the DSM
+ * registry entries, if not already done.  This must be called before accessing
+ * the table.
+ */
+static void
+init_dsm_registry(void)
+{
+	/* Quick exit if we already did this. */
+	if (dsm_registry_table)
+		return;
+
+	/* Otherwise, use a lock to ensure only one

Re: Password leakage avoidance


On 23.12.23 16:13, Joe Conway wrote:
I have recently, once again for the umpteenth time, been involved in 
discussions around (paraphrasing) "why does Postgres leak the passwords 
into the logs when they are changed". I know well that the canonical 
advice is something like "use psql with \password if you care about that".


And while that works, it is a deeply unsatisfying answer for me to give 
and for the OP to receive.


The alternative is something like "...well if you don't like that, use 
PQencryptPasswordConn() to roll your own solution that meets your 
security needs".


Again, not a spectacular answer IMHO. It amounts to "here is a 
do-it-yourself kit, go put it together". It occurred to me that we can, 
and really should, do better.


The attached patch set moves the guts of \password from psql into the 
libpq client side -- PQchangePassword() (patch 0001).


The usage in psql serves as a ready built-in test for the libpq function 
(patch 0002). Docs included too (patch 0003).


I don't follow how you get from the problem statement to this solution. 
This proposal doesn't avoid password leakage, does it?  It just provides 
a different way to phrase the existing solution.  Who is a potential 
user of this solution?  Right now it just saves a dozen lines in psql, 
but it's not clear how it improves anything else.

Re: Password leakage avoidance

2023-12-27 Thread Joe Conway


On 12/27/23 15:39, Peter Eisentraut wrote:

On 23.12.23 16:13, Joe Conway wrote:
I have recently, once again for the umpteenth time, been involved in 
discussions around (paraphrasing) "why does Postgres leak the passwords 
into the logs when they are changed". I know well that the canonical 
advice is something like "use psql with \password if you care about that".


And while that works, it is a deeply unsatisfying answer for me to give 
and for the OP to receive.


The alternative is something like "...well if you don't like that, use 
PQencryptPasswordConn() to roll your own solution that meets your 
security needs".


Again, not a spectacular answer IMHO. It amounts to "here is a 
do-it-yourself kit, go put it together". It occurred to me that we can, 
and really should, do better.


The attached patch set moves the guts of \password from psql into the 
libpq client side -- PQchangePassword() (patch 0001).


The usage in psql serves as a ready built-in test for the libpq function 
(patch 0002). Docs included too (patch 0003).


I don't follow how you get from the problem statement to this solution.
This proposal doesn't avoid password leakage, does it?


Yes, it most certainly does. The plaintext password would never be seen 
by the server and therefore never logged. This is exactly why the 
feature already existed in psql.



 It just provides a different way to phrase the existing solution.


Yes, a fully built one that is convenient to use, and does not ask 
everyone to roll their own.


Who is a potential user of this solution? 


Literally every company that has complained that Postgres pollutes their 
logs with plaintext passwords. I have heard the request to provide a 
better solution many times, over many years, while working for three 
different companies.



Right now it just saves a dozen lines in psql, but it's not clear how
it improves anything else.


It is to me, and so far no one else has complained about that. More 
opinions would be welcomed of course.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Password leakage avoidance

Joe Conway  writes:
> On 12/27/23 15:39, Peter Eisentraut wrote:
>> On 23.12.23 16:13, Joe Conway wrote:
>>> The attached patch set moves the guts of \password from psql into the 
>>> libpq client side -- PQchangePassword() (patch 0001).

>> I don't follow how you get from the problem statement to this solution.
>> This proposal doesn't avoid password leakage, does it?
>> It just provides a different way to phrase the existing solution.

> Yes, a fully built one that is convenient to use, and does not ask 
> everyone to roll their own.

It's convenient for users of libpq, I guess, but it doesn't help
anyone not writing C code directly atop libpq.  If this is the
way forward then we need to also press JDBC and other client
libraries to implement comparable functionality.  That's within
the realm of sanity surely, and having a well-thought-through
reference implementation in libpq would help those authors.
So I don't think this is a strike against the patch; but the answer
to Peter's question has to be that this is just part of the solution.

regards, tom lane

Re: Password leakage avoidance

2023-12-27 Thread Dave Cramer

On Wed, 27 Dec 2023 at 16:10, Tom Lane  wrote:

> Joe Conway  writes:
> > On 12/27/23 15:39, Peter Eisentraut wrote:
> >> On 23.12.23 16:13, Joe Conway wrote:
> >>> The attached patch set moves the guts of \password from psql into the
> >>> libpq client side -- PQchangePassword() (patch 0001).
>
> >> I don't follow how you get from the problem statement to this solution.
> >> This proposal doesn't avoid password leakage, does it?
> >> It just provides a different way to phrase the existing solution.
>
> > Yes, a fully built one that is convenient to use, and does not ask
> > everyone to roll their own.
>
> It's convenient for users of libpq, I guess, but it doesn't help
> anyone not writing C code directly atop libpq.  If this is the
> way forward then we need to also press JDBC and other client
> libraries to implement comparable functionality.  That's within
> the realm of sanity surely, and having a well-thought-through
> reference implementation in libpq would help those authors.
> So I don't think this is a strike against the patch; but the answer
> to Peter's question has to be that this is just part of the solution.
>

Already have one in the works for JDBC, actually predates this.
https://github.com/pgjdbc/pgjdbc/pull/3067

Dave

Re: Multidimensional Histograms

2023-12-27 Thread Tomas Vondra

Hello Alexander,

We actually had histograms in the early patches adding multivariate
statistics [1]. We however ended up removing histograms and only kept
the simpler types, for a couple reasons.

It might be worth going through the discussion - I'm sure one of the
reasons was we needed to limit the scope of the patch, and histograms
were much more complex and possibly less useful compared to the other
statistics types.

Another reason was that the algorithm described in the two papers you
reference (1988 paper by DeWitt and the evaluation by Carlson and
Sutherland from ~2010) is simple but has pretty obvious weaknesses. It
processes the columns one by one - first build bucket on column "a",
then splits each bucket into buckets on "b". So it's not symmetrical,
and it results in lower accuracy compared to an "ideal" histogram with
the same number of buckets (especially for the dimensions split early).

This does indeed produce an equi-depth histogram, which seems nice, but
the mesh is regular in such a way that any value of the first dimension
intersects all buckets on the second dimension. So for example with a
histogram of 100x100 buckets on [a,b], any value "a=X" intersects with
100 buckets on "b", each representing 1/1 of tuples. But we don't
know how the tuples are distributed in the tuple - so we end up using
0.5 of the bucket as unbiased estimate, but that can easily add-up in
the wrong dimension.

However, this is not the only way to build an equi-depth histogram,
there are ways to make it more symmetrical. Also, it's not clear
equi-depth histograms are ideal with multiple columns. There are papers
proposing various other types of histograms (using different criteria to
build buckets optimizing a different metric). The most interesting one
seems to be V-Optimal histograms - see for example papers [1], [2], [3],
[4], [5] and [6]. I'm sure there are more. The drawback of course is
that it's more expensive to build such histograms.

IIRC the patch tried to do something like V-optimal histograms, but
using a greedy algorithm and not the exhaustive stuff described in the
various papers.

[1] https://www.vldb.org/conf/1998/p275.pdf
[2]
https://cs-people.bu.edu/mathan/reading-groups/papers-classics/histograms.pdf
[3] https://dl.acm.org/doi/pdf/10.1145/304182.304200
[4] http://www.cs.columbia.edu/~gravano/Papers/2001/sigmod01b.pdf
[5] https://cs.gmu.edu/~carlotta/publications/vldb090.pdf
[6] https://core.ac.uk/download/pdf/82158702.pdf


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Password leakage avoidance

2023-12-27 Thread Joe Conway


On 12/27/23 16:09, Tom Lane wrote:

Joe Conway  writes:

On 12/27/23 15:39, Peter Eisentraut wrote:

On 23.12.23 16:13, Joe Conway wrote:
The attached patch set moves the guts of \password from psql into the 
libpq client side -- PQchangePassword() (patch 0001).



I don't follow how you get from the problem statement to this solution.
This proposal doesn't avoid password leakage, does it?
It just provides a different way to phrase the existing solution.


Yes, a fully built one that is convenient to use, and does not ask 
everyone to roll their own.


It's convenient for users of libpq, I guess, but it doesn't help
anyone not writing C code directly atop libpq.  If this is the
way forward then we need to also press JDBC and other client
libraries to implement comparable functionality.  That's within
the realm of sanity surely, and having a well-thought-through
reference implementation in libpq would help those authors.
So I don't think this is a strike against the patch; but the answer
to Peter's question has to be that this is just part of the solution.


As mentioned downthread by Dave Cramer, JDBC is already onboard.

And as Jonathan said in an adjacent part of the thread:

This should generally helpful, as it allows libpq-based drivers to
adopt this method and provide a safer mechanism for setting/changing
passwords! (which we should also promote once availbale).


Which is definitely something I have had in mind all along.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Password leakage avoidance

2023-12-27 Thread Jonathan S. Katz


On 12/24/23 12:15 PM, Tom Lane wrote:


Maybe we need a PQcreaterole that provide the mechanism to set passwords
safely. It'd likely need to take all the options need for creating a
role, but that would at least give the underlying mechanism to ensure
we're always sending a hashed password to the server.


I'm kind of down on that, because it seems like it'd be quite hard to
design an easy-to-use C API that doesn't break the next time somebody
adds another option to CREATE USER.  What's so wrong with suggesting
that the password be set in a separate step?  (For comparison, typical
Unix utilities like useradd(8) also tell you to set the password
separately.)


Modern development frameworks tend to reduce things down to one-step, 
even fairly complex operations. Additionally, a lot of these frameworks 
don't even require a developer to build backend applications that 
involve doing actually work on the backend (e.g. UNIX), so the approach 
of useradd(8) et al. are not familiar. Adding the additional step would 
lead to errors, e.g. the developer not calling the "change password" 
function to create the obfuscated password. Granted, we can push the 
problem down to driver authors to "be better" and simplify the process 
for their end users, but that still can be error prone, having seen this 
with driver authors implementing PostgreSQL SCRAM and having made 
(correctable) mistakes that could have lead to security issues.


That said, I see why trying to keep track of all of the "CREATE ROLE" 
attributes from a C API can be cumbersome, so perhaps we could end up 
adding an API that just does "create-user-with-password" and applies a 
similar method to Joe's patch. That would also align with the developer 
experience above, as in those cases users tend to just be created with a 
password w/o any of the additional role options.


Also open to punting this to a different thread as we can at least make 
things better with the "change password" approach.


Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature

Assorted typo fixes

2023-12-27 Thread Dagfinn Ilmari Mannsåker

Hi folks,

I was playing around with the `typos` tool
(https://github.com/crate-ci/typos), and thought I'd run it on the
posgres repo for fun.  After a bit of tweaking to get rid of most false
positives (see separately attached .typos.toml file), it came up with a
useful set of suggestions, some of which I applied verbatim, others
which needed a bit more rewording.

Attached is a series of patches.  The first one are what I consider
obvious, unambiguous fixes to code comments.  The subsequent ones are
fixes for actual code (variable, function, type names) and docs, one
patch per class of typo.  As far as I can tell, none of the code changes
(except the ECPG one, see below) affect anything exported, so this
should not cause any compatibility issues for extensions.

The ECPG change affects the generated C code, but from my reading of the
callers in descriptor.c and ecpg.trailer, any code that would have
caused it to encounter the affected enum value would fail to compile, so
either the case is not possible, or nobody actually uses whatever syntax
is affected (I don't know enough about ECPG to tell without spending far
too much time digging in the code).

- ilmari

>From d5abde91b33bbdaa99c6c1e7a6334affa370e9c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 22 Dec 2023 01:46:27 +
Subject: [PATCH 01/13] Fix typos in comments

---
 contrib/bloom/bloom.h  |  2 +-
 contrib/pgcrypto/expected/pgp-compression.out  |  2 +-
 contrib/pgcrypto/openssl.c |  2 +-
 contrib/pgcrypto/pgp-encrypt.c |  2 +-
 contrib/pgcrypto/sql/pgp-compression.sql   |  2 +-
 contrib/postgres_fdw/expected/postgres_fdw.out |  2 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql  |  2 +-
 src/backend/access/brin/brin.c |  6 +++---
 src/backend/access/common/heaptuple.c  |  2 +-
 src/backend/access/gist/gistbuild.c|  2 +-
 src/backend/access/heap/heapam.c   |  4 ++--
 src/backend/access/nbtree/nbtree.c |  2 +-
 src/backend/catalog/namespace.c|  2 +-
 src/backend/catalog/pg_constraint.c|  2 +-
 src/backend/commands/event_trigger.c   |  2 +-
 src/backend/executor/execMain.c|  2 +-
 src/backend/optimizer/plan/initsplan.c |  4 ++--
 src/backend/utils/adt/rangetypes.c |  2 +-
 src/backend/utils/cache/catcache.c |  2 +-
 src/backend/utils/sort/tuplesortvariants.c | 12 ++--
 src/backend/utils/time/combocid.c  |  2 +-
 src/bin/pg_rewind/t/001_basic.pl   |  2 +-
 src/bin/pg_rewind/t/004_pg_xlog_symlink.pl |  2 +-
 src/bin/pg_rewind/t/007_standby_source.pl  |  2 +-
 src/bin/pg_rewind/t/009_growing_files.pl   |  2 +-
 src/include/pg_config_manual.h |  2 +-
 src/test/isolation/specs/stats.spec| 16 
 src/test/recovery/t/029_stats_restart.pl   |  2 +-
 src/test/regress/expected/boolean.out  |  2 +-
 src/test/regress/expected/brin_multi.out   |  4 ++--
 src/test/regress/expected/join.out |  2 +-
 src/test/regress/sql/boolean.sql   |  2 +-
 src/test/regress/sql/brin_multi.sql|  4 ++--
 src/test/regress/sql/join.sql  |  2 +-
 34 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index 330811ec60..7c4407b9ec 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -127,7 +127,7 @@ typedef struct BloomMetaPageData
 	FreeBlockNumberArray notFullPage;
 } BloomMetaPageData;
 
-/* Magic number to distinguish bloom pages among anothers */
+/* Magic number to distinguish bloom pages from others */
 #define BLOOM_MAGICK_NUMBER (0xDBAC0DED)
 
 /* Number of blocks numbers fit in BloomMetaPageData */
diff --git a/contrib/pgcrypto/expected/pgp-compression.out b/contrib/pgcrypto/expected/pgp-compression.out
index d4c57feba3..67e2dce897 100644
--- a/contrib/pgcrypto/expected/pgp-compression.out
+++ b/contrib/pgcrypto/expected/pgp-compression.out
@@ -60,7 +60,7 @@ WITH random_string AS
   -- This generates a random string of 16366 bytes.  This is chosen
   -- as random so that it does not get compressed, and the decompression
   -- would work on a string with the same length as the origin, making the
-  -- test behavior more predictible.  lpad() ensures that the generated
+  -- test behavior more predictable.  lpad() ensures that the generated
   -- hexadecimal value is completed by extra zero characters if random()
   -- has generated a value strictly lower than 16.
   SELECT string_agg(decode(lpad(to_hex((random()*256)::int), 2, '0'), 'hex'), '') as bytes
diff --git a/contrib/pgcrypto/openssl.c b/contrib/pgcrypto/openssl.c
index 4a913bd04f..8259de5e39 100644
--- a/contrib/pgcrypto/openssl.c
+++ b/contrib/pgcrypto/openssl.c
@@ -460,7 +460,7 @@ bf_init(PX_Cipher *c, const uint8 *key, unsigned klen, const uin

Re: [HACKERS] Changing references of password encryption to hashing


On 27.12.23 02:04, Bruce Momjian wrote:

I did_not_  change the user API, so CREATE/ALTER ROLE still uses
[ENCRYPTED] PASSWORD, the GUC is still called password_encryption, and
the libpq function is still called PQencryptPasswordConn().  This makes
the user interface confusing since the API uses "encryption" but the
text calls it "hashing".  Is there support for renaming the API to use
"hash" and keeping "encrypt" for backward compatiblity.


Yeah, this is clearly confusing.  I think we should just leave it alone. 
 Some gentle rewording here and there to clarify things might be 
appropriate (but the text already uses hashing on some occasions), but 
the blanket replacement does not make things better, I think.

Add LSN <-> time conversion functionality

2023-12-27 Thread Melanie Plageman

Hi,

Elsewhere [1] I required a way to estimate the time corresponding to a
particular LSN in the past. I devised the attached LSNTimeline, a data
structure mapping LSNs <-> timestamps with decreasing precision for
older time, LSN pairs. This can be used to locate and translate a
particular time to LSN or vice versa using linear interpolation.

I've added an instance of the LSNTimeline to PgStat_WalStats and insert
new values to it in background writer's main loop. This patch set also
introduces some new pageinspect functions exposing LSN <-> time
translations.

Outside of being useful to users wondering about the last modification
time of a particular block in a relation, the LSNTimeline can be put to
use in other Postgres sub-systems to govern behavior based on resource
consumption -- using the LSN consumption rate as a proxy.

As mentioned in [1], the LSNTimeline is a prerequisite for my
implementation of a new freeze heuristic which seeks to freeze only
pages which will remain unmodified for a certain amount of wall clock
time. But one can imagine other uses for such translation capabilities.

The pageinspect additions need a bit more work. I didn't bump the
pageinspect version (didn't add the new functions to a new pageinspect
version file). I also didn't exercise the new pageinspect functions in a
test. I was unsure how to write a test which would be guaranteed not to
flake. Because the background writer updates the timeline, it seemed a
remote possibility that the time or LSN returned by the functions would
be 0 and as such, I'm not sure even a test that SELECT time/lsn > 0
would always pass.

I also noticed the pageinspect functions don't have XML id attributes
for link discoverability. I planned to add that in a separate commit.

- Melanie

[1] 
https://www.postgresql.org/message-id/CAAKRu_b3tpbdRPUPh1Q5h35gXhY%3DspH2ssNsEsJ9sDfw6%3DPEAg%40mail.gmail.com
From 75a48fec0e9f0909dd11a676bb51e494fa4ca61c Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Tue, 5 Dec 2023 07:29:39 -0500
Subject: [PATCH v1 1/5] Record LSN at postmaster startup

The insert_lsn at postmaster startup can be used along with PgStartTime
as seed values for a timeline mapping LSNs to time. Future commits will
add such a structure for LSN <-> time conversions. A start LSN allows
for such conversions before even inserting a value into the timeline.
The current time and current insert LSN can be used along with
PgStartTime and PgStartLSN.

This is WIP, as I'm not sure if I did this in the right place.
---
 src/backend/access/transam/xlog.c   | 2 ++
 src/backend/postmaster/postmaster.c | 1 +
 src/include/utils/builtins.h| 3 +++
 3 files changed, 6 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1264849883..aa71e502e4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -146,6 +146,8 @@ bool		XLOG_DEBUG = false;
 
 int			wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
 
+XLogRecPtr	PgStartLSN = InvalidXLogRecPtr;
+
 /*
  * Number of WAL insertion locks to use. A higher value allows more insertions
  * to happen concurrently, but adds some CPU overhead to flushing the WAL,
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index b163e89cbb..d858e04454 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1448,6 +1448,7 @@ PostmasterMain(int argc, char *argv[])
 	 * Remember postmaster startup time
 	 */
 	PgStartTime = GetCurrentTimestamp();
+	PgStartLSN = GetXLogInsertRecPtr();
 
 	/*
 	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 2f8b46d6da..0cb24e10e6 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -17,6 +17,7 @@
 #include "fmgr.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
+#include "access/xlogdefs.h"
 
 /* Sign + the most decimal digits an 8-byte number could have */
 #define MAXINT8LEN 20
@@ -82,6 +83,8 @@ extern void generate_operator_clause(fmStringInfo buf,
 	 Oid opoid,
 	 const char *rightop, Oid rightoptype);
 
+extern PGDLLIMPORT XLogRecPtr PgStartLSN;
+
 /* varchar.c */
 extern int	bpchartruelen(char *s, int len);
 
-- 
2.37.2

From aedbc5058e370705b4041732f671263a2050b997 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 27 Dec 2023 16:40:27 -0500
Subject: [PATCH v1 2/5] Add LSNTimeline for converting LSN <-> time

Add a new structure, LSNTimeline, consisting of LSNTimes -- each an LSN,
time pair. Each LSNTime can represent multiple logical LSN, time pairs,
referred to as members. LSN <-> time conversions can be done using
linear interpolation with two LSNTimes on the LSNTimeline.

This commit does not add a global instance of LSNTimeline. It adds the
structures and functions needed to maintain and access such a timeline.
---
 src/backend/utils/activity/pgstat_wal.c |

Re: add function argument names to regex* functions.


On 27.12.23 17:53, jian he wrote:

similar to [1], add function argument names to the following functions:
regexp_like, regexp_match,regexp_matches,regexp_replace,
regexp_substr,regexp_split_to_array,regexp_split_to_table,regexp_count


Note that these functions are a quasi-standard that is shared with other 
SQL implementations.  It might be worth looking around if there are 
already other implementations of this idea.

Re: cannot abort transaction 2737414167, it was already committed

2023-12-27 Thread Thomas Munro

On Thu, Dec 28, 2023 at 4:02 AM Justin Pryzby  wrote:
> My main question is why an IO error would cause the DB to abort, rather
> than raising an ERROR.

In CommitTransaction() there is a stretch of code beginning s->state =
TRANS_COMMIT and ending s->state = TRANS_DEFAULT, from which we call
out to various subsystems' AtEOXact_XXX() functions.  There is no way
to roll back in that state, so anything that throws ERROR from those
routines is going to get something much like $SUBJECT.  Hmm, we'd know
which exact code path got that EIO from your smoldering core if we'd
put an explicit critical section there (if we're going to PANIC
anyway, it might as well not be from a different stack after
longjmp()...).

I guess the large object usage isn't directly relevant (that module's
EOXact stuff seems to be finished before TRANS_COMMIT, but I don't
know that code well).  Everything later is supposed to be about
closing/releasing/cleaning up, and for example smgrDoPendingDeletes()
reaches code with this relevant comment:

 * Note: smgr_unlink must treat deletion failure as a WARNING, not an
 * ERROR, because we've already decided to commit or abort the current
 * xact.

We don't really have a general ban on ereporting on system call
failure, though.  We've just singled unlink() out.  Only a few lines
above that we call DropRelationsAllBuffers(rels, nrels), and that
calls smgrnblocks(), and that might need to need to re-open() the
relation file to do lseek(SEEK_END), because PostgreSQL itself has no
tracking of relation size.  Hard to say but my best guess is that's
where you might have got your EIO, assuming you dropped the relation
in this transaction?

> This is pg16 compiled at efa8f6064, runing under centos7.  ZFS is 2.2.2,
> but the pool hasn't been upgraded to use the features new since 2.1.

I've been following recent ZFS stuff from a safe distance as a user.
AFAIK the extremely hard to hit bug fixed in that very recent release
didn't technically require the interesting new feature (namely block
cloning, though I think that helped people find the root cause after a
phase of false blame?).  Anyway, it had for symptom some bogus zero
bytes on read, not a spurious EIO.

Re: cannot abort transaction 2737414167, it was already committed

Thomas Munro  writes:
> In CommitTransaction() there is a stretch of code beginning s->state =
> TRANS_COMMIT and ending s->state = TRANS_DEFAULT, from which we call
> out to various subsystems' AtEOXact_XXX() functions.  There is no way
> to roll back in that state, so anything that throws ERROR from those
> routines is going to get something much like $SUBJECT.  Hmm, we'd know
> which exact code path got that EIO from your smoldering core if we'd
> put an explicit critical section there (if we're going to PANIC
> anyway, it might as well not be from a different stack after
> longjmp()...).

+1, there's basically no hope of debugging this sort of problem
as things stand.

regards, tom lane

Re: cannot abort transaction 2737414167, it was already committed

2023-12-27 Thread Justin Pryzby

On Thu, Dec 28, 2023 at 11:33:16AM +1300, Thomas Munro wrote:
> I guess the large object usage isn't directly relevant (that module's
> EOXact stuff seems to be finished before TRANS_COMMIT, but I don't
> know that code well).  Everything later is supposed to be about
> closing/releasing/cleaning up, and for example smgrDoPendingDeletes()
> reaches code with this relevant comment:
> 
>  * Note: smgr_unlink must treat deletion failure as a WARNING, not an
>  * ERROR, because we've already decided to commit or abort the current
>  * xact.
> 
> We don't really have a general ban on ereporting on system call
> failure, though.  We've just singled unlink() out.  Only a few lines
> above that we call DropRelationsAllBuffers(rels, nrels), and that
> calls smgrnblocks(), and that might need to need to re-open() the
> relation file to do lseek(SEEK_END), because PostgreSQL itself has no
> tracking of relation size.  Hard to say but my best guess is that's
> where you might have got your EIO, assuming you dropped the relation
> in this transaction?

Yeah.  In fact I was confused - this was not lo_unlink().
This uses normal tables, so would've done:

"begin;"
"DROP TABLE IF EXISTS %s", tablename
"DELETE FROM cached_objects WHERE cache_name=%s", tablename
"commit;"

> > This is pg16 compiled at efa8f6064, runing under centos7.  ZFS is 2.2.2,
> > but the pool hasn't been upgraded to use the features new since 2.1.
> 
> I've been following recent ZFS stuff from a safe distance as a user.
> AFAIK the extremely hard to hit bug fixed in that very recent release
> didn't technically require the interesting new feature (namely block
> cloning, though I think that helped people find the root cause after a
> phase of false blame?).  Anyway, it had for symptom some bogus zero
> bytes on read, not a spurious EIO.

The ZFS bug had to do with bogus bytes which may-or-may-not-be-zero, as
I understand.  The understanding is that the bug was pre-existing but
became more easy to hit in 2.2, and is fixed in 2.2.2 and 2.1.14.

-- 
Justin

Re: Statistics Import and Export

2023-12-27 Thread Corey Huinker

On Mon, Dec 25, 2023 at 8:18 PM Tomas Vondra 
wrote:

> Hi,
>
> I finally had time to look at the last version of the patch, so here's a
> couple thoughts and questions in somewhat random order. Please take this
> as a bit of a brainstorming and push back if you disagree some of my
> comments.
>
> In general, I like the goal of this patch - not having statistics is a
> common issue after an upgrade, and people sometimes don't even realize
> they need to run analyze. So, it's definitely worth improving.
>
> I'm not entirely sure about the other use case - allowing people to
> tweak optimizer statistics on a running cluster, to see what would be
> the plan in that case. Or more precisely - I agree that would be an
> interesting and useful feature, but maybe the interface should not be
> the same as for the binary upgrade use case?
>
>
> interfaces
> --
>
> When I thought about the ability to dump/load statistics in the past, I
> usually envisioned some sort of DDL that would do the export and import.
> So for example we'd have EXPORT STATISTICS / IMPORT STATISTICS commands,
> or something like that, and that'd do all the work. This would mean
> stats are "first-class citizens" and it'd be fairly straightforward to
> add this into pg_dump, for example. Or at least I think so ...
>
> Alternatively we could have the usual "functional" interface, with a
> functions to export/import statistics, replacing the DDL commands.
>
> Unfortunately, none of this works for the pg_upgrade use case, because
> existing cluster versions would not support this new interface, of
> course. That's a significant flaw, as it'd make this useful only for
> upgrades of future versions.
>

This was the reason I settled on the interface that I did: while we can
create whatever interface we want for importing the statistics, we would
need to be able to extract stats from databases using only the facilities
available in those same databases, and then store that in a medium that
could be conveyed across databases, either by text files or by saving them
off in a side table prior to upgrade. JSONB met the criteria.

>
> So I think for the pg_upgrade use case, we don't have much choice other
> than using "custom" export through a view, which is what the patch does.
>
> However, for the other use case (tweaking optimizer stats) this is not
> really an issue - that always happens on the same instance, so no issue
> with not having the "export" function and so on. I'd bet there are more
> convenient ways to do this than using the export view. I'm sure it could
> share a lot of the infrastructure, ofc.
>

So, there is a third use case - foreign data wrappers. When analyzing a
foreign table, at least one in the postgresql_fdw family of foreign
servers, we should be able to send a query specific to the version and
dialect of that server, get back the JSONB, and import those results. That
use case may be more tangible to you than the tweak/tuning case.

>
>
>
> JSON format
> ---
>
> As for the JSON format, I wonder if we need that at all? Isn't that an
> unnecessary layer of indirection? Couldn't we simply dump pg_statistic
> and pg_statistic_ext_data in CSV, or something like that? The amount of
> new JSONB code seems to be very small, so it's OK I guess.
>

I see a few problems with dumping pg_statistic[_ext_data]. The first is
that the importer now has to understand all of the past formats of those
two tables. The next is that the tables are chock full of Oids that don't
necessarily carry forward. I could see us having a text-ified version of
those two tables, but we'd need that for all previous iterations of those
table formats. Instead, I put the burden on the stats export to de-oid the
data and make it *_in() function friendly.

> That's pretty much why I envisioned a format "grouping" the arrays for a
> particular type of statistics (MCV, histogram) into the same object, as
> for example in
>
>  {
>"mcv" : {"values" : [...], "frequencies" : [...]}
>"histogram" : {"bounds" : [...]}
>  }
>

I agree that would be a lot more readable, and probably a lot more
debuggable. But I went into this unsure if there could be more than one
stats slot of a given kind per table. Knowing that they must be unique
helps.

> But that's probably much harder to generate from plain SQL (at least I
> think so, I haven't tried).
>

I think it would be harder, but far from impossible.

> data missing in the export
> --
>
> I think the data needs to include more information. Maybe not for the
> pg_upgrade use case, where it's mostly guaranteed not to change, but for
> the "manual tweak" use case it can change. And I don't think we want two
> different formats - we want one, working for everything.
>

I"m not against this at all, and I started out doing that, but the
qualified names of operators got _ugly_, and I quickly realized that what I
was generating wouldn't matter, either the input data would make sense for
the

Re: Statistics Import and Export

2023-12-27 Thread Corey Huinker

>
>
>   As mentioned already, we'd also need some sort of
> version identifier, and we'd expect the load_statistics() functions
> to be able to transform the data if the old version used a different
> representation.  I agree with the idea that an explicit representation
> of the source table attribute's type would be wise, too.


There is a version identifier currently (its own column not embedded in the
JSON), but I discovered that I was able to put the burden on the export
queries to spackle-over the changes in the table structures over time.
Still, I knew that we'd need the version number in there eventually.

Re: Statistics Import and Export

2023-12-27 Thread Corey Huinker

>
> Yeah, this is pretty much what I meant by "functional" interface. But if
> I said maybe the format implemented by the patch is maybe too close to
> how we store the statistics, then this has exactly the same issue. And
> it has other issues too, I think - it breaks down the stats into
> multiple function calls, so ensuring the sanity/correctness of whole
> sets of statistics gets much harder, I think.
>

Export functions was my original plan, for simplicity, maintenance, etc,
but it seemed like I'd be adding quite a few functions, so the one view
made more sense for an initial version. Also, I knew that pg_dump or some
other stats exporter would have to inline the guts of those functions into
queries for older versions, and adapting a view definition seemed more
straightforward for the reader than function definitions.

Re: Support "Right Semi Join" plan shapes

2023-12-27 Thread wenhui qiu

Hi Richard Guo
 I did a simple test ,Subqueries of type (in) can be supported, There
is a test sql that doesn't support it, and I think that's because it can't
pull up  the subqueries.
```
test=# explain (costs off) SELECT  t1.* FROM prt1_adv t1 WHERE EXISTS
(SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: (t2.b = t1.a)
 ->  Append
   ->  Seq Scan on prt2_adv_p1 t2_1
   ->  Seq Scan on prt2_adv_p2 t2_2
   ->  Seq Scan on prt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on prt1_adv_p1 t1_1
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p2 t1_2
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p3 t1_3
   Filter: (b = 0)
(16 rows)

test=# explain (costs off) SELECT t1.* FROM prt1_adv t1 WHERE t1.a IN
(SELECT t2.b FROM prt2_adv t2) AND t1.b = 0 ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: (t2.b = t1.a)
 ->  Append
   ->  Seq Scan on prt2_adv_p1 t2_1
   ->  Seq Scan on prt2_adv_p2 t2_2
   ->  Seq Scan on prt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on prt1_adv_p1 t1_1
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p2 t1_2
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p3 t1_3
   Filter: (b = 0)
(16 rows)

test=#

test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE EXISTS
(SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10
ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
 ->  Append
   ->  Seq Scan on plt2_adv_p1 t2_1
   ->  Seq Scan on plt2_adv_p2 t2_2
   ->  Seq Scan on plt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on plt1_adv_p1 t1_1
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p2 t1_2
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p3 t1_3
   Filter: (b < 10)
(16 rows)

test=#
test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE (t1.a, t1.c)
IN (SELECT t2.a, t2.c FROM plt2_adv t2) AND t1.b < 10 ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
 ->  Append
   ->  Seq Scan on plt2_adv_p1 t2_1
   ->  Seq Scan on plt2_adv_p2 t2_2
   ->  Seq Scan on plt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on plt1_adv_p1 t1_1
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p2 t1_2
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p3 t1_3
   Filter: (b < 10)
(16 rows)


```

```
test=# explain (costs off) select * from int4_tbl i4, tenk1 a
 where exists(select * from tenk1 b
  where a.twothousand = b.twothousand and a.fivethous <>
b.fivethous)
   and i4.f1 = a.tenthous;
   QUERY PLAN
-
 Hash Right Semi Join
   Hash Cond: (b.twothousand = a.twothousand)
   Join Filter: (a.fivethous <> b.fivethous)
   ->  Seq Scan on tenk1 b
   ->  Hash
 ->  Hash Join
   Hash Cond: (a.tenthous = i4.f1)
   ->  Seq Scan on tenk1 a
   ->  Hash
 ->  Seq Scan on int4_tbl i4
(10 rows)

test=# explain (costs off ) SELECT *
FROM int4_tbl i4, tenk1 a
WHERE (a.twothousand, a.fivethous) IN (
SELECT b.twothousand, b.fivethous
FROM tenk1 b
WHERE a.twothousand = b.twothousand  and  a.fivethous <> b.fivethous
)
AND i4.f1 = a.tenthous;
   QUERY PLAN

 Nested Loop
   Join Filter: (i4.f1 = a.tenthous)
   ->  Seq Scan on tenk1 a
 Filter: (SubPlan 1)
 SubPlan 1
   ->  Seq Scan on tenk1 b
 Filter: ((a.fivethous <> fivethous) AND (a.twothousand =
twothousand))
   ->  Materialize
 ->  Seq Scan on int4_tbl i4
(9 rows)
test=# set enable_nestloop =off;
SET
test=# explain (costs off ) SELECT *
FRO

RE: Random pg_upgrade test failure on drongo

2023-12-27 Thread Hayato Kuroda (Fujitsu)

Dear Alexander, 

> I agree with your analysis and would like to propose a PoC fix (see
> attached). With this patch applied, 20 iterations succeeded for me.

There are no reviewers so that I will review again. Let's move the PoC
to the concrete patch. Note that I only focused on fixes of random failure,
other parts are out-of-scope.

Basically, code comments can be updated accordingly.

01.

```
/*
 * This function might be called for a regular file or for a junction
 * point (which we use to emulate symlinks).  The latter must be 
unlinked
 * with rmdir() on Windows.  Before we worry about any of that, let's 
see
 * if we can unlink directly, since that's expected to be the most 
common
 * case.
 */
snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
if (pgrename(path, tmppath) == 0)
{
if (unlink(tmppath) == 0)
return 0;
curpath = tmppath;
}
```

You can modify comments atop changes because it is not trivial.
Below is my draft:

```
 * XXX: we rename the target file to ".tmp" before calling unlink. The
 * removal may fail with STATUS_DELETE_PENDING status on Windows, so
 * creating the same file would fail. This assumes that renaming is a
 * synchronous operation.
```

02.

```
loops = 0;
while (lstat(curpath, &st) < 0 && 
lstat_error_was_status_delete_pending())
{
if (++loops > 100)  /* time out after 10 sec */
return -1;
pg_usleep(10);  /* us */
}
```

Comments can be added atop the part. Below one is my draft.

```
/*
 * Wait until the removal is really finished to avoid ERRORs for 
creating a
 * same file in other functions.
 */
```


Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Re: Statistics Import and Export

Corey Huinker  writes:
> Export functions was my original plan, for simplicity, maintenance, etc,
> but it seemed like I'd be adding quite a few functions, so the one view
> made more sense for an initial version. Also, I knew that pg_dump or some
> other stats exporter would have to inline the guts of those functions into
> queries for older versions, and adapting a view definition seemed more
> straightforward for the reader than function definitions.

Hmm, I'm not sure we are talking about the same thing at all.

What I am proposing is *import* functions.  I didn't say anything about
how pg_dump obtains the data it prints; however, I would advocate that
we keep that part as simple as possible.  You cannot expect export
functionality to know the requirements of future server versions,
so I don't think it's useful to put much intelligence there.

So I think pg_dump should produce a pretty literal representation of
what it finds in the source server's catalog, and then rely on the
import functions in the destination server to make sense of that
and do whatever slicing-n-dicing is required.

That being the case, I don't see a lot of value in a view -- especially
not given the requirement to dump from older server versions.
(Conceivably we could just say that we won't dump stats from server
versions predating the introduction of this feature, but that's hardly
a restriction that supports doing this via a view.)

regards, tom lane

Re: Statistics Import and Export

2023-12-27 Thread Bruce Momjian

On Wed, Dec 27, 2023 at 09:41:31PM -0500, Corey Huinker wrote:
> When I thought about the ability to dump/load statistics in the past, I
> usually envisioned some sort of DDL that would do the export and import.
> So for example we'd have EXPORT STATISTICS / IMPORT STATISTICS commands,
> or something like that, and that'd do all the work. This would mean
> stats are "first-class citizens" and it'd be fairly straightforward to
> add this into pg_dump, for example. Or at least I think so ...
> 
> Alternatively we could have the usual "functional" interface, with a
> functions to export/import statistics, replacing the DDL commands.
> 
> Unfortunately, none of this works for the pg_upgrade use case, because
> existing cluster versions would not support this new interface, of
> course. That's a significant flaw, as it'd make this useful only for
> upgrades of future versions.
> 
> 
> This was the reason I settled on the interface that I did: while we can create
> whatever interface we want for importing the statistics, we would need to be
> able to extract stats from databases using only the facilities available in
> those same databases, and then store that in a medium that could be conveyed
> across databases, either by text files or by saving them off in a side table
> prior to upgrade. JSONB met the criteria.

Uh, it wouldn't be crazy to add this capability to pg_upgrade/pg_dump in
a minor version upgrade if it wasn't enabled by default, and if we were
very careful.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.

Re: add function argument names to regex* functions.

On Thu, Dec 28, 2023 at 6:25 AM Peter Eisentraut  wrote:
>
> On 27.12.23 17:53, jian he wrote:
> > similar to [1], add function argument names to the following functions:
> > regexp_like, regexp_match,regexp_matches,regexp_replace,
> > regexp_substr,regexp_split_to_array,regexp_split_to_table,regexp_count
>
> Note that these functions are a quasi-standard that is shared with other
> SQL implementations.  It might be worth looking around if there are
> already other implementations of this idea.
>

turns out people do like calling functions via explicitly mentioning
function argument names, example: [0]
There are no provisions for the argument names.

I looked around the oracle implementation in [1], and the oracle regex
related function argumentation name in [2]
I use the doc [3] syntax explanation and add the related function names.

Current, regex.* function syntax seems fine. but only parameter `N`
seems a little bit weird.
If we change the function's argument name, we also need to change
function syntax explanation in the doc; vise versa.

QUOTE:
The regexp_instr function returns the starting or ending position of
the N'th match of a POSIX regular expression pattern to a string, or
zero if there is no such match. It has the syntax regexp_instr(string,
pattern [, start [, N [, endoption [, flags [, subexpr ]). pattern
is searched for in string, normally from the beginning of the string,
but if the start parameter is provided then beginning from that
character index. If N is specified then the N'th match of the pattern
is located, otherwise the first match is located.
END OF QUOTE.

maybe we can change `N` to occurrence. but `occurrence` is kind of verbose.

[0] 
https://stackoverflow.com/questions/33387348/oracle-named-parameters-in-regular-functions
[1] 
https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/REGEXP_SUBSTR.html#GUID-2903904D-455F-4839-A8B2-1731EF4BD099
[2] https://dbfiddle.uk/h_SBDEKi
[3] 
https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP

Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

On Wed, Dec 20, 2023 at 8:27 PM Masahiko Sawada  wrote:
>
>
> Why do we need to use SPI? I think we can form heap tuples and insert
> them to the error table. Creating the error table also doesn't need to
> use SPI.
>
Thanks for pointing it out. I figured out how to form heap tuples and
insert them to the error table.
but I don't know how to create the error table without using SPI.
Please pointer it out.

> >
> > copy_errors one per schema.
> > foo.copy_errors will be owned by the schema: foo owner.
>
> It seems that the error table is created when the SAVE_ERROR is used
> for the first time. It probably blocks concurrent COPY FROM commands
> with SAVE_ERROR option to different tables if the error table is not
> created yet.
>
I don't know how to solve this problem Maybe we can document this.
but it will block the COPY FROM immediately.

> >
> > if you can insert to a table in that specific schema let's say foo,
> > then you will get privilege to INSERT/DELETE/SELECT
> > to foo.copy_errors.
> > If you are not a superuser, you are only allowed to do
> > INSERT/DELETE/SELECT on foo.copy_errors rows where USERID =
> > current_user::regrole::oid.
> > This is done via row level security.
>
> I don't think it works. If the user is dropped, the user's oid could
> be reused for a different user.
>

You are right.
so I changed, now the schema owner will be the error table owner.
every error table tuple inserts,
I switch to schema owner, do the insert, then switch back to the
COPY_FROM operation user.
now everyone (except superuser) will need explicit grant to access the
error table.
From 8c8c266f1dc809ffa0ec9f4262bdd912ed6b758a Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Wed, 27 Dec 2023 20:15:24 +0800
Subject: [PATCH v13 1/1] Make COPY FROM more error tolerant
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently COPY FROM has 3 types of error while processing the source file.
* extra data after last expected column
* missing data for column \"%s\"
* data type conversion error.

Instead of throwing errors while copying, save_error specifier will
save errors to table copy_errors for all the copy from operation in the same schema.

We check the existing copy_errors table definition by column name and column data type.
if table already exists and meets the criteria then errors metadata will save to copy_errors.
if the table does not exist, then create one.

table copy_errors is per schema-wise, it's owned by the copy from
operation destination schema's owner.
The table owner has full privilege on copy_errors,
other non-superuser need gain privilege to access it.

Only works for COPY FROM, non-BINARY mode.
---
 doc/src/sgml/ref/copy.sgml   | 121 -
 src/backend/commands/copy.c  |  12 ++
 src/backend/commands/copyfrom.c  | 133 +-
 src/backend/commands/copyfromparse.c | 217 +--
 src/backend/parser/gram.y|   8 +-
 src/bin/psql/tab-complete.c  |   3 +-
 src/include/commands/copy.h  |   1 +
 src/include/commands/copyfrom_internal.h |   6 +
 src/include/parser/kwlist.h  |   1 +
 src/test/regress/expected/copy2.out  | 137 ++
 src/test/regress/sql/copy2.sql   | 123 +
 11 files changed, 746 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 18ecc69c..1d0ff0b6 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -44,6 +44,7 @@ COPY { table_name [ ( column_name [, ...] ) | * }
 FORCE_NULL { ( column_name [, ...] ) | * }
 ENCODING 'encoding_name'
+SAVE_ERROR [ boolean ]
 
  
 
@@ -411,6 +412,18 @@ WHERE condition
 

 
+   
+SAVE_ERROR
+
+ 
+  Specifies that any data conversion errors while copying will automatically saved in table COPY_ERRORS and the COPY FROM operation will not be interrupted by conversion errors.
+  This option is not allowed when using binary format. Note that this
+  is only supported in current COPY FROM syntax.
+  If this option is omitted, any data type conversion errors will be raised immediately.
+ 
+
+   
+
   
  
 
@@ -564,6 +577,7 @@ COPY count
 amount to a considerable amount of wasted disk space if the failure
 happened well into a large copy operation. You might wish to invoke
 VACUUM to recover the wasted space.
+To continue copying while skip conversion errors in a COPY FROM, you might wish to specify SAVE_ERROR.

 

@@ -572,6 +586,18 @@ COPY count
 null strings to null values and unquoted null strings to empty strings.

 
+   
+If the SAVE_ERROR option is specified and conversion errors occur while copying,
+PostgreSQL will first check the table COPY_ERRORS existence, then save the conversion error related information to it.
+If it does exist, but the actual table definition cannot use it to save the e

Re: Track in pg_replication_slots the reason why slots conflict?

On Wed, Dec 27, 2023 at 4:16 PM Amit Kapila  wrote:
>
> On Wed, Dec 27, 2023 at 3:08 PM shveta malik  wrote:
> >
> > PFA the patch which attempts to implement this.
> >
> > This patch changes the existing 'conflicting' field to
> > 'conflicting_cause' in pg_replication_slots.
> >
>
> This name sounds a bit odd to me, would it be better to name it as
> conflict_cause?
>
> A few other minor comments:
> =
> *
> +   conflicting_cause text
> +  
> +  
> +   Cause if this logical slot conflicted with recovery (and so is now
> +   invalidated). It is always NULL for physical slots, as well as for
> +   those logical slots which are not invalidated. Possible values are:
>
> Would it better to use description as follows:" Cause of logical
> slot's conflict with recovery. It is always NULL for physical slots,
> as well as for logical slots which are not invalidated. The non-NULL
> values indicate that the slot is marked as invalidated. Possible
> values are:
> .."
>
> *
>   $res = $node_standby->safe_psql(
>   'postgres', qq(
> - select bool_and(conflicting) from pg_replication_slots;));
> + select bool_and(conflicting) from
> + (select conflicting_cause is not NULL as conflicting from
> pg_replication_slots);));
>
> Won't the query "select conflicting_cause is not NULL as conflicting
> from pg_replication_slots" can return false even for physical slots
> and then impact the result of the main query whereas the original
> query would seem to be simply ignoring physical slots? If this
> observation is correct then you might want to add a 'slot_type'
> condition in the new query.
>
> * After introducing check_slots_conflicting_cause(), do we need to
> have check_slots_conflicting_status()? Aren't both checking the same
> thing?

I think it is needed for the case where we want to check that there is
no conflict.

# Verify slots are reported as non conflicting in pg_replication_slots
check_slots_conflicting_status(0);

For the cases where there is conflict, I think
check_slots_conflicting_cause() can replace
check_slots_conflicting_status().

thanks
Shveta

Re: doc patch: note AttributeRelationId passed to FDW validator function

2023-12-27 Thread Ian Lawrence Barwick

2023年6月7日(水) 9:08 Ian Lawrence Barwick :
>
> Hi
>
> Here:
>
>   https://www.postgresql.org/docs/current/fdw-functions.html
>
> the enumeration of OIDs which might be passed as the FDW validator function's
> second argument omits "AttributeRelationId" (as passed when altering
> a foreign table's column options).
>
> Attached v1 patch adds this to this list of OIDs.
>
> The alternative v2 patch adds this to this list of OIDs, and also
> formats it as an
> SGML list, which IMHO is easier to read.
>
> Looks like this has been missing since 9.3.

Forgot to add this to a CF; done:  https://commitfest.postgresql.org/46/4730/

Regards

Ian Barwick

Re: warn if GUC set to an invalid shared library

2023-12-27 Thread Shubham Khanna

On Thu, Dec 28, 2023 at 10:54 AM Justin Pryzby  wrote:
>
> On Sat, Oct 29, 2022 at 10:40 AM Justin Pryzby  wrote:
> > > > > On Fri, Sep 02, 2022 at 05:24:58PM -0500, Justin Pryzby wrote:
> > > > > > It caused no issue when I changed:
> > > > > >
> > > > > > /* Check that it's acceptable for the 
> > > > > > indicated parameter */
> > > > > > if (!parse_and_validate_value(record, name, 
> > > > > > value,
> > > > > > - PGC_S_FILE, 
> > > > > > ERROR,
> > > > > > + PGC_S_TEST, 
> > > > > > ERROR,
> > > > > >   &newval, 
> > > > > > &newextra))
> > > > > >
> > > > > > I'm not sure where to go from here.
> > > > >
> > > > > I'm hoping for some guidance ; this simple change may be naive, but 
> > > > > I'm not
> > > > > sure what a wider change would look like.
>
> I'm still hoping.
>
> > > PGC_S_TEST is a better fit, so my question is whether it's really that
> > > simple ?
> >
> > I've added the trivial change as 0001 and re-opened the patch (which ended
> > up in January's CF)
> >
> > If for some reason it's not really as simple as that, then 001 will
> > serve as a "straw-man patch" hoping to elicit discussion on that point.
>
> > From defdb57fe0ec373c1eea8df42f0e1831b3f9c3cc Mon Sep 17 00:00:00 2001
> > From: Justin Pryzby 
> > Date: Fri, 22 Jul 2022 15:52:11 -0500
> > Subject: [PATCH v6 1/4] WIP: test GUCs from ALTER SYSTEM as PGC_S_TEST not
> >  FILE
> >
> > WIP: ALTER SYSTEM should use PGC_S_TEST rather than PGC_S_FILE
> >
> > Since the value didn't come from a file.  Or maybe we should have
> > another PGC_S_ value for this, or a flag for 'is a test'.
> > ---
> >  src/backend/utils/misc/guc.c | 2 +-
> >  src/include/utils/guc.h  | 1 +
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
> > index 6f21752b844..ae8810591d6 100644
> > --- a/src/backend/utils/misc/guc.c
> > +++ b/src/backend/utils/misc/guc.c
> > @@ -4435,7 +4435,7 @@ AlterSystemSetConfigFile(AlterSystemStmt 
> > *altersysstmt)
> >
> >   /* Check that it's acceptable for the indicated 
> > parameter */
> >   if (!parse_and_validate_value(record, name, value,
> > -   
> > PGC_S_FILE, ERROR,
> > +   
> > PGC_S_TEST, ERROR,
> > 
> > &newval, &newextra))
> >   ereport(ERROR,
> >   
> > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>
> This is rebased over my own patch to enable checks for
> REGRESSION_TEST_NAME_RESTRICTIONS.
>
I was reviewing the Patch and came across a minor issue that the Patch
does not apply on the current Head. Please provide the updated version
of the patch.

Thanks and Regards,
Shubham Khanna.

Re: doc patch: note AttributeRelationId passed to FDW validator function

2023-12-27 Thread Michael Paquier

On Thu, Dec 28, 2023 at 01:55:27PM +0900, Ian Lawrence Barwick wrote:
> 2023年6月7日(水) 9:08 Ian Lawrence Barwick :
>> The alternative v2 patch adds this to this list of OIDs, and also
>> formats it as an
>> SGML list, which IMHO is easier to read.
>>
>> Looks like this has been missing since 9.3.
> 
> Forgot to add this to a CF; done:  https://commitfest.postgresql.org/46/4730/

Agreed that a list is cleaner.  Looking around I can see that the
catalogs going through the validator functions are limited to the five
you are listing in your patch.  Will apply in a bit, thanks!
--
Michael


signature.asc
Description: PGP signature

Re: change regexp_substr first argument make tests more easier to understand.

On Thu, Dec 28, 2023 at 12:13 AM jian he  wrote:
>
> hi.
> https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/expected/strings.out#n928
>
> SELECT regexp_substr('abcabcabc', 'a.c');
> SELECT regexp_substr('abcabcabc', 'a.c', 2);
> SELECT regexp_substr('abcabcabc', 'a.c', 1, 3);
> SELECT regexp_substr('abcabcabc', 'a.c', 1, 4) IS NULL AS t;
> SELECT regexp_substr('abcabcabc', 'A.C', 1, 2, 'i');
>
> they all return 'abc', there are 3 'abc ' in  string 'abcabcabc'
> except IS NULL query.
> maybe we can change regexp_substr first argument from "abcabcabc" to
> "abcaXcaYc".
> so the result would be more easier to understand.

anyway here is the minor patch to  change string from "abcabcabc" to
"abcaXcaYc".
From ffe63d9607824e6aa89d590761fcb05219b799d8 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Thu, 28 Dec 2023 14:11:31 +0800
Subject: [PATCH v1 1/1] make regexp_substr result more easier to understand.

  regexp_substr using "abcabcabc" is not good for test,
  since in many cases, it will return "abc".
  there are 3  "abc" in "abcabcabc", change the test string to something else
  make the results more clear.

---
 src/test/regress/expected/strings.out | 14 +++---
 src/test/regress/sql/strings.sql  |  8 
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b7500d9c..7098934a 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -925,22 +925,22 @@ SELECT regexp_substr('abcdefghi', 'd.q') IS NULL AS t;
  t
 (1 row)
 
-SELECT regexp_substr('abcabcabc', 'a.c');
+SELECT regexp_substr('abcaXcaYc', 'a.c');
  regexp_substr 
 ---
  abc
 (1 row)
 
-SELECT regexp_substr('abcabcabc', 'a.c', 2);
+SELECT regexp_substr('abcaXcaYc', 'a.c', 2);
  regexp_substr 
 ---
- abc
+ aXc
 (1 row)
 
-SELECT regexp_substr('abcabcabc', 'a.c', 1, 3);
+SELECT regexp_substr('abcaXcaYc', 'a.c', 1, 3);
  regexp_substr 
 ---
- abc
+ aYc
 (1 row)
 
 SELECT regexp_substr('abcabcabc', 'a.c', 1, 4) IS NULL AS t;
@@ -949,10 +949,10 @@ SELECT regexp_substr('abcabcabc', 'a.c', 1, 4) IS NULL AS t;
  t
 (1 row)
 
-SELECT regexp_substr('abcabcabc', 'A.C', 1, 2, 'i');
+SELECT regexp_substr('abcaXcaYc', 'A.C', 1, 2, 'i');
  regexp_substr 
 ---
- abc
+ aXc
 (1 row)
 
 SELECT regexp_substr('1234567890', '(123)(4(56)(78))', 1, 1, 'i', 0);
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 39596789..4ed86516 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -274,11 +274,11 @@ SELECT regexp_instr('abcabcabc', 'a.c', 1, 1, 0, '', -1);
 -- regexp_substr tests
 SELECT regexp_substr('abcdefghi', 'd.f');
 SELECT regexp_substr('abcdefghi', 'd.q') IS NULL AS t;
-SELECT regexp_substr('abcabcabc', 'a.c');
-SELECT regexp_substr('abcabcabc', 'a.c', 2);
-SELECT regexp_substr('abcabcabc', 'a.c', 1, 3);
+SELECT regexp_substr('abcaXcaYc', 'a.c');
+SELECT regexp_substr('abcaXcaYc', 'a.c', 2);
+SELECT regexp_substr('abcaXcaYc', 'a.c', 1, 3);
 SELECT regexp_substr('abcabcabc', 'a.c', 1, 4) IS NULL AS t;
-SELECT regexp_substr('abcabcabc', 'A.C', 1, 2, 'i');
+SELECT regexp_substr('abcaXcaYc', 'A.C', 1, 2, 'i');
 SELECT regexp_substr('1234567890', '(123)(4(56)(78))', 1, 1, 'i', 0);
 SELECT regexp_substr('1234567890', '(123)(4(56)(78))', 1, 1, 'i', 1);
 SELECT regexp_substr('1234567890', '(123)(4(56)(78))', 1, 1, 'i', 2);
-- 
2.34.1

Re: Functions to return random numbers in a given range