Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-12 Thread Alexander Kukushkin
Hi Ashutosh,

Apologies for any confusion, but I'm not entirely following your
> explanation. Could you kindly provide further clarification?
> Additionally, would you mind reviewing the problem description
> outlined in the initial email?
>

I know about the problem and have seen the original email.
What confused me is that your email didn't specify that SET SEARCH_PATH in
CREATE EXTENSION is a boolean flag, so I assumed it was TEXT (similar to the
GUC with the same name). Now, after looking at your code, it makes more sense.
Sorry about the confusion.

But, I also agree with Jelte, it should be a property of a control file,
rather than a user controlled parameter, so that an attacker can't opt out.

Regards,
--
Alexander Kukushkin


Re: Partial aggregates pushdown

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 07:27, fujii.y...@df.mitsubishielectric.co.jp
 wrote:
> Could you please clarify what you mean?
> Are you referring to:
>   Option 1: Modifying existing aggregate functions to minimize the use of 
> internal state values.
>   Option 2: Not supporting the push down of partial aggregates for functions 
> with internal state values.

Basically I mean both Option 1 and Option 2 together. i.e. once we do
option 1, supporting partial aggregate pushdown for all important
aggregates with internal state values, then supporting pushdown of
internal state values becomes unnecessary.

> There are many aggregate functions with internal state values, so if we go 
> with Option 1,
> we might need to change a lot of existing code, like transition functions and 
> finalize functions.
> Also, I'm not sure how many existing aggregate functions can be modified this 
> way.

There are indeed 57 aggregate functions with internal state values:

> SELECT count(*) from pg_aggregate where aggtranstype = 'internal'::regtype;
 count
───
57
(1 row)

But there are only 26 different aggtransfns. And the most used 8 of
those cover 39 of those 57 aggregates.

> SELECT
sum (count) OVER(ORDER BY count desc, aggtransfn ROWS BETWEEN
unbounded preceding and current row) AS cumulative_count
, *
FROM (
SELECT
count(*),
aggtransfn
from pg_aggregate
where aggtranstype = 'internal'::regtype
group by aggtransfn
order by count(*) desc, aggtransfn
);
 cumulative_count │ count │               aggtransfn
──────────────────┼───────┼────────────────────────────────────────
                7 │     7 │ ordered_set_transition
               13 │     6 │ numeric_accum
               19 │     6 │ int2_accum
               25 │     6 │ int4_accum
               31 │     6 │ int8_accum
               35 │     4 │ ordered_set_transition_multi
               37 │     2 │ int8_avg_accum
               39 │     2 │ numeric_avg_accum
               40 │     1 │ array_agg_transfn
               41 │     1 │ json_agg_transfn
               42 │     1 │ json_object_agg_transfn
               43 │     1 │ jsonb_agg_transfn
               44 │     1 │ jsonb_object_agg_transfn
               45 │     1 │ string_agg_transfn
               46 │     1 │ bytea_string_agg_transfn
               47 │     1 │ array_agg_array_transfn
               48 │     1 │ range_agg_transfn
               49 │     1 │ multirange_agg_transfn
               50 │     1 │ json_agg_strict_transfn
               51 │     1 │ json_object_agg_strict_transfn
               52 │     1 │ json_object_agg_unique_transfn
               53 │     1 │ json_object_agg_unique_strict_transfn
               54 │     1 │ jsonb_agg_strict_transfn
               55 │     1 │ jsonb_object_agg_strict_transfn
               56 │     1 │ jsonb_object_agg_unique_transfn
               57 │     1 │ jsonb_object_agg_unique_strict_transfn
(26 rows)


And actually most of those don't have a serialfn, so they wouldn't be
supported by your suggested approach either. Looking at the
distribution of aggserialfns instead we see the following:

> SELECT
sum (count) OVER(ORDER BY count desc, aggserialfn ROWS BETWEEN
unbounded preceding and current row) AS cumulative_count
, *
FROM (
SELECT
count(*),
aggserialfn
from pg_aggregate
where
aggtranstype = 'internal'::regtype
AND aggserialfn != 0
group by aggserialfn
order by count(*) desc, aggserialfn
);
 cumulative_count │ count │aggserialfn
──┼───┼───
   12 │12 │ numeric_serialize
   24 │12 │ numeric_poly_serialize
   26 │ 2 │ numeric_avg_serialize
   28 │ 2 │ int8_avg_serialize
   30 │ 2 │ string_agg_serialize
   31 │ 1 │ array_agg_serialize
   32 │ 1 │ array_agg_array_serialize
(7 rows)

So there are only 7 aggserialfns, and thus at most 7 new postgres
types that you would need to create to support the same aggregates as
in your current proposal. But looking at the implementations of these
serialize functions even that is an over-estimation: numeric_serialize
and numeric_avg_serialize both serialize a NumericAggState, and
numeric_poly_serialize and int8_avg_serialize both serialize a
PolyNumAggState. So we could probably even make do with only 5 types. And
to be clear: only converting PolyNumAggState and NumericAggState to
actual postgres types would already cover 28 out of the 32 aggregates.
That seems quite feasible to do.
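
As a quick sanity check of that claim, the same catalog can be queried
directly (a sketch in the style of the queries above, using the four
serialize functions from the previous table); it should return 28, matching
the cumulative counts above:

> SELECT count(*)
  FROM pg_aggregate
  WHERE aggserialfn IN ('numeric_serialize'::regproc,
                        'numeric_avg_serialize'::regproc,
                        'numeric_poly_serialize'::regproc,
                        'int8_avg_serialize'::regproc);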

So I agree it's probably more code than your current approach. At the
very least because you would need to implement in/out text
serialization functions for these internal types that currently don't
have them. But I do think it would be quite a feasible amount. And to
clarify, I see a few benefits of using the approach that I'm
proposing:

1. So far aggserialfn and aggdese

Re: [HACKERS] make async slave to wait for lsn to be replayed

2024-06-12 Thread Kartyshov Ivan
Hi Alexander, here are some improvements I made according to your
discussion with Heikki.


On 2024-04-11 18:09, Alexander Korotkov wrote:
On Thu, Apr 11, 2024 at 1:46 AM Heikki Linnakangas  
wrote:
In a nutshell, it's possible for the loop in WaitForLSN to exit without
cleaning up the process from the heap. I was able to hit that by adding
a delay after the addLSNWaiter() call:

> TRAP: failed Assert("!procInfo->inHeap"), File: 
"../src/backend/commands/waitlsn.c", Line: 114, PID: 1936152
> postgres: heikki postgres [local] 
CALL(ExceptionalCondition+0xab)[0x55da1f68787b]
> postgres: heikki postgres [local] CALL(+0x331ec8)[0x55da1f204ec8]
> postgres: heikki postgres [local] CALL(WaitForLSN+0x139)[0x55da1f2052cc]
> postgres: heikki postgres [local] 
CALL(pg_wal_replay_wait+0x18b)[0x55da1f2056e5]
> postgres: heikki postgres [local] CALL(ExecuteCallStmt+0x46e)[0x55da1f18031a]
> postgres: heikki postgres [local] 
CALL(standard_ProcessUtility+0x8cf)[0x55da1f4b26c9]

I think there's a similar race condition if the timeout is reached at
the same time that the startup process wakes up the process.


Thank you for catching this.  I think WaitForLSN() just needs to call
deleteLSNWaiter() unconditionally after exit from the loop.


Fix and add injection point test on this race condition.

On Thu, Apr 11, 2024 at 1:46 AM Heikki Linnakangas  
wrote:

The docs could use some copy-editing, but just to point out one issue:

> There are also procedures to control the progress of recovery.

That's copy-pasted from an earlier sentence at the table that lists
functions like pg_promote(), pg_wal_replay_pause(), and
pg_is_wal_replay_paused(). The pg_wal_replay_wait() doesn't control the
progress of recovery like those functions do, it only causes the calling
backend to wait.


Fix documentation and add extra tests on multi-standby replication
and cascade replication.
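
To illustrate the intended usage (a rough sketch; the exact argument types
and defaults are in the attached patch):

-- on the primary, after committing the changes of interest:
SELECT pg_current_wal_insert_lsn();

-- on the standby, wait for that LSN to be replayed before querying
-- (a procedure in a non-atomic context, so it is invoked with CALL
-- outside of a transaction block; the LSN below is just an example):
CALL pg_wal_replay_wait('0/3002AE8');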

--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com

commit ddbee0e9fcd1f0a2a770d476acef677486f62367
Author: i.kartyshov 
Date:   Wed Jun 12 01:29:00 2024 +0300

Subject: [PATCH v18] Implement pg_wal_replay_wait() stored procedure

pg_wal_replay_wait() is to be used on standby and specifies waiting for
the specific WAL location to be replayed before starting the transaction.
This option is useful when the user makes some data changes on primary and
needs a guarantee to see these changes on standby.

The queue of waiters is stored in the shared memory array sorted by LSN.
During replay of WAL waiters whose LSNs are already replayed are deleted from
the shared memory array and woken up by setting of their latches.

pg_wal_replay_wait() needs to wait without any snapshot held.  Otherwise,
the snapshot could prevent the replay of WAL records implying a kind of
self-deadlock.  This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working in a non-atomic context,
not a function.

Catversion is bumped.

Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
---
 doc/src/sgml/func.sgml| 112 
 src/backend/access/transam/xact.c |   6 +
 src/backend/access/transam/xlog.c |   7 +
 src/backend/access/transam/xlogrecovery.c |  11 ++
 src/backend/catalog/system_functions.sql  |   3 +
 src/backend/commands/Makefile |   3 +-
 src/backend/commands/meson.build  |   1 +
 src/backend/commands/waitlsn.c| 344 +
 src/backend/lib/pairingheap.c |  18 +-
 src/backend/storage/ipc/ipci.c|   3 +
 src/backend/storage/lmgr/proc.c   |   6 +
 src/backend/utils/activity/wait_event_names.txt   |   2 +
 src/include/catalog/pg_proc.dat   |   5 +
 src/include/commands/waitlsn.h|  77 +
 src/include/lib/pairingheap.h |   3 +
 src/include/storage/lwlocklist.h  |   1 +
 src/test/recovery/meson.build |   2 +
 src/test/recovery/t/043_wal_replay_wait.pl| 176 +++
 src/test/recovery/t/044_wal_replay_wait_injection_test.pl |  98 +++
 src/tools/pgindent/typedefs.list  |   2 +
 20 files changed, 877 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 17c44bc338..b36c5b2ad8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sg

Re: Add support to TLS 1.3 cipher suites and curves lists

2024-06-12 Thread Jelte Fennema-Nio
On Mon, 10 Jun 2024 at 12:31, Daniel Gustafsson  wrote:
> Regarding the ciphersuites portion of the patch.  I'm not particularly 
> thrilled
> about having a GUC for TLSv1.2 ciphers and one for TLSv1.3 ciphersuites, users
> not all that familiar with TLS will likely find it confusing to figure out 
> what
> to do.

I don't think it's easy to create a single GUC because OpenSSL has
different APIs for both. So we'd have to add some custom parsing for
the combined string, which is likely to cause some problems imho. I
think separating them is the best option from the options we have and
I don't think it matters much in practice for users. Users not familiar
with TLS might indeed be confused, but those users shouldn't touch
these settings anyway, and just use the defaults. The users that care
about this probably already get two cipher strings from their
compliance teams, because many other applications also have two
separate options for specifying both.




Re: Re: Re: Add support to TLS 1.3 cipher suites and curves lists

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 04:32, Erica Zhang  wrote:
> There are certain government, financial and other enterprise organizations 
> that have very strict requirements about the encrypted communication and more 
> specifically about fine grained params like the TLS ciphers and curves that 
> they use. The default ones for those customers are not acceptable. Any 
> products that integrate Postgres and requires encrypted communication with 
> the Postgres would have to fulfil those requirements.

Yeah, I ran into such requirements before too. So I do think it makes
sense to have such a feature in Postgres.

> So if we can have this patch in the upcoming new major version, that means 
> Postgres users who have similar requirements can upgrade to PG17.

As Daniel mentioned, you can already achieve the same using the
"Ciphersuites" directive in openssl.conf. Also you could of course
always disable TLSv1.3 support.
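
For reference, the knobs that already exist today (sketch; both GUCs are in
current releases and only need a reload):

ALTER SYSTEM SET ssl_ciphers = 'HIGH:!aNULL:!MD5';      -- affects TLSv1.2 and older only
ALTER SYSTEM SET ssl_max_protocol_version = 'TLSv1.2';  -- effectively disables TLSv1.3
SELECT pg_reload_conf();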




Re: Conflict Detection and Resolution

2024-06-12 Thread shveta malik
On Tue, Jun 11, 2024 at 7:44 PM Tomas Vondra
 wrote:
>
>
>
> On 6/11/24 10:35, shveta malik wrote:
> > On Mon, Jun 10, 2024 at 5:24 PM Tomas Vondra
> >  wrote:
> >>
> >>
> >>
> >> On 6/10/24 12:56, shveta malik wrote:
> >>> On Fri, Jun 7, 2024 at 6:08 PM Tomas Vondra
> >>>  wrote:
> 
> >>>
> >>>  UPDATE
> >>> 
> >>>
> >>> Conflict Detection Method:
> >>> 
> >>> Origin conflict detection: The ‘origin’ info is used to detect
> >>> conflict which can be obtained from commit-timestamp generated for
> >>> incoming txn at the source node. To compare remote’s origin with the
> >>> local’s origin, we must have origin information for local txns as well
> >>> which can be obtained from commit-timestamp after enabling
> >>> ‘track_commit_timestamp’ locally.
> >>> The one drawback here is the ‘origin’ information cannot be obtained
> >>> once the row is frozen and the commit-timestamp info is removed by
> >>> vacuum. For a frozen row, conflicts cannot be raised, and thus the
> >>> incoming changes will be applied in all the cases.
> >>>
> >>> Conflict Types:
> >>> 
> >>> a) update_differ: The origin of an incoming update's key row differs
> >>> from the local row i.e.; the row has already been updated locally or
> >>> by different nodes.
> >>> b) update_missing: The row with the same value as that incoming
> >>> update's key does not exist. Remote is trying to update a row which
> >>> does not exist locally.
> >>> c) update_deleted: The row with the same value as that incoming
> >>> update's key does not exist. The row is already deleted. This conflict
> >>> type is generated only if the deleted row is still detectable i.e., it
> >>> is not removed by VACUUM yet. If the row is removed by VACUUM already,
> >>> it cannot detect this conflict. It will detect it as update_missing
> >>> and will follow the default or configured resolver of update_missing
> >>> itself.
> >>>
> >>
> >> I don't understand the why should update_missing or update_deleted be
> >> different, especially considering it's not detected reliably. And also
> >> that even if we happen to find the row the associated TOAST data may
> >> have already been removed. So why would this matter?
> >
> > Here, we are trying to tackle the case where the row is 'recently'
> > deleted i.e. concurrent UPDATE and DELETE on pub and sub. User may
> > want to opt for a different resolution in such a case as against the
> > one where the corresponding row was not even present in the first
> > place. The case where the row was deleted long back may not fall into
> > this category as there are higher chances that they have been removed
> > by vacuum and can be considered equivalent to the update_ missing
> > case.
> >
> 
>  My point is that if we can't detect the difference reliably, it's not
>  very useful. Consider this example:
> 
>  Node A:
> 
>    T1: INSERT INTO t (id, value) VALUES (1,1);
> 
>    T2: DELETE FROM t WHERE id = 1;
> 
>  Node B:
> 
>    T3: UPDATE t SET value = 2 WHERE id = 1;
> 
>  The "correct" order of received messages on a third node is T1-T3-T2.
>  But we may also see T1-T2-T3 and T3-T1-T2, e.g. due to network issues
>  and so on. For T1-T2-T3 the right decision is to discard the update,
>  while for T3-T1-T2 it's to either wait for the INSERT or wait for the
>  insert to arrive.
> 
>  But if we misdetect the situation, we either end up with a row that
>  shouldn't be there, or losing an update.
> >>>
> >>> Doesn't the above example indicate that 'update_deleted' should also
> >>> be considered a necessary conflict type?  Please see the possibilities
> >>> of conflicts in all three cases:
> >>>
> >>>
> >>> The "correct" order of receiving messages on node C (as suggested
> >>> above) is T1-T3-T2   (case1)
> >>> --
> >>> T1 will insert the row.
> >>> T3 will have update_differ conflict; latest_timestamp wins or apply
> >>> will apply it. earliest_timestamp_wins or skip will skip it.
> >>> T2 will delete the row (irrespective of whether the update happened or 
> >>> not).
> >>> End Result: No Data.
> >>>
> >>> T1-T2-T3
> >>> --
> >>> T1 will insert the row.
> >>> T2 will delete the row.
> >>> T3 will have conflict update_deleted. If it is 'update_deleted', the
> >>> chances are that the resolver set here is to 'skip' (default is also
> >>> 'skip' in this case).
> >>>
> >>> If vacuum has deleted that row (or if we don't support
> >>> 'update_deleted' conflict), it will be 'update_missing' conflict. In
> >>> that case, the user may end up inserting the row if resolver chosen is
> >>> in favor of apply (which seems an obvious choice for 'update_missing'
> >>> conflict; default is also 'apply_or_s

Re: Logical Replication of sequences

2024-06-12 Thread Amit Kapila
On Wed, Jun 12, 2024 at 10:44 AM Masahiko Sawada  wrote:
>
> On Tue, Jun 11, 2024 at 7:36 PM vignesh C  wrote:
> >
> > 1) CREATE PUBLICATION syntax enhancement:
> > CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > The addition of a new column titled "all sequences" in the
> > pg_publication system table will signify whether the publication is
> > designated as all sequences publication or not.
> >
>
> The first approach sounds like we don't create entries for sequences
> in pg_subscription_rel. In this case, how do we know all sequences
> that we need to refresh when executing the REFRESH PUBLICATION
> SEQUENCES command you mentioned below?
>

As per my understanding, we should be creating entries for sequences
in pg_subscription_rel similar to tables. The difference would be that
we won't need all the sync_states (i = initialize, d = data is being
copied, f = finished table copy, s = synchronized, r = ready) as we
don't need any synchronization with apply workers.
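
For reference, this is the catalog that would get the extra rows; under this
design sequences would simply show up here with a reduced set of states
(sketch, nothing implemented yet):

SELECT srrelid::regclass, srsubstate
FROM pg_subscription_rel
ORDER BY 1;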

> > 2)  CREATE SUBSCRIPTION -- no syntax change.
> > Upon creation of a subscription, the following additional steps will
> > be managed by the subscriber:
> > i) The subscriber will retrieve the list of sequences associated with
> > the subscription's publications.
> > ii) For each sequence: a) Retrieve the sequence value from the
> > publisher by invoking the pg_sequence_state function. b) Set the
> > sequence with the value obtained from the publisher. iv) Once the
> > subscription creation is completed, all sequence values will become
> > visible at the subscriber's end.
>
> Sequence values are always copied from the publisher? or does it
> happen only when copy_data = true?
>

It is better to do it when "copy_data = true" to keep it compatible
with the table's behavior.

> >
> > An alternative design approach could involve retrieving the sequence
> > list from the publisher during subscription creation and inserting the
> > sequences with an "init" state into the pg_subscription_rel system
> > table. These tasks could be executed by a single sequence sync worker,
> > which would:
> > i) Retrieve the list of sequences in the "init" state from the
> > pg_subscription_rel system table.
> > ii) Initiate a transaction.
> > iii) For each sequence: a) Obtain the sequence value from the
> > publisher by utilizing the pg_sequence_state function. b) Update the
> > sequence with the value obtained from the publisher.
> > iv) Commit the transaction.
> >
> > The benefit with the second approach is that if there are large number
> > of sequences, the sequence sync can be enhanced to happen in parallel
> > and also if there are any locks held on the sequences in the
> > publisher, the sequence worker can wait to acquire the lock instead of
> > blocking the whole create subscription command which will delay the
> > initial copy of the tables too.
>
> I prefer to have separate workers to sync sequences.
>

+1.

> Probably we can
> start with a single worker and extend it to have multiple workers.

Yeah, starting with a single worker sounds good for now. Do you think
we should sync all the sequences in a single transaction or have some
threshold value above which a different transaction would be required
or maybe a different sequence sync worker altogether? Now, having
multiple sequence-sync workers requires some synchronization so that
only a single worker is allocated for one sequence.

The simplest thing is to use a single sequence sync worker that syncs
all sequences in one transaction, but with a large number of sequences,
it could be inefficient. OTOH, I am not sure if it would be a problem
in reality.

>
> BTW
> the sequence-sync worker will be taken from
> max_sync_workers_per_subscription pool?
>

I think so.

> Or yet another idea I came up with is that a tablesync worker will
> synchronize both the table and sequences owned by the table. That is,
> after the tablesync worker caught up with the apply worker, the
> tablesync worker synchronizes sequences associated with the target
> table as well. One benefit would be that at the time of initial table
> sync being completed, the table and its sequence data are consistent.
> As soon as new changes come to the table, it would become inconsistent
> so it might not be helpful much, though. Also, sequences that are not
> owned by any table will still need to be synchronized by someone.
>

The other thing to consider in this idea is that we somehow need to
distinguish the sequences owned by the table.

> >
> > 3) Refreshing the sequence can be achieved through the existing
> > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > here).
> > The subscriber identifies stale sequences, meaning sequences present
> > in pg_subscription_rel but absent from the publication, and removes
> > them from the pg_subscription_rel system table. The subscriber also
> > checks for newly added sequences in the publisher and synchronizes
> > their values from the publisher using the steps outlined in the
> >

Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Andrew Dunstan



On 2024-06-11 Tu 19:48, Noah Misch wrote:

On Mon, Jun 10, 2024 at 06:49:11PM -0700, Andres Freund wrote:

On 2024-06-10 16:46:56 -0400, Andrew Dunstan wrote:

On 2024-06-10 Mo 16:04, Andres Freund wrote:

Just for context for the rest the email: I think we desperately need to move
off perl for tests. The infrastructure around our testing is basically
unmaintained and just about nobody that started doing dev stuff in the last 10
years learned perl.

As for what up and coming developers learn, they mostly don't learn C
either, and that's far more critical to what we do.

C is a lot more useful to them than perl. And it's actually far more
widely known these days than perl.

If we're going to test in a non-Perl language, I'd pick C over Python.  There
would be several other unlikely-community-choice languages I'd pick over
Python (C#, Java, C++).  We'd need a library like today's Perl
PostgreSQL::Test to make C-language tests nice, but the same would apply to
any new language.



Indeed. We've invested quite a lot of effort on that infrastructure. I 
guess people can learn from what we've done so a second language might 
be easier to support.


(Java would be my pick from your unlikely set, but I can see the 
attraction of Python.)





I also want the initial scope to be the new language coexisting with the
existing Perl tests.  If a bulk translation ever happens, it should happen
long after the debut of the new framework.  That said, I don't much trust a
human-written bulk language translation to go through without some tests
accidentally ceasing to test what they test in Perl today.



+1


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Logical Replication of sequences

2024-06-12 Thread vignesh C
On Wed, 12 Jun 2024 at 10:51, Dilip Kumar  wrote:
>
> On Tue, Jun 11, 2024 at 4:06 PM vignesh C  wrote:
> >
> > Amit and I engaged in an offline discussion regarding the design and
> > contemplated that it could be like below:
>
> If I understand correctly, does this require the sequences to already
> exist on the subscribing node before creating the subscription, or
> will it also copy any non-existing sequences?

Sequences must exist in the subscriber; we'll synchronize only their
values. Any sequences that are not present in the subscriber will
trigger an error.
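
So the expected workflow would roughly be the following (a sketch using the
proposed FOR ALL SEQUENCES syntax; the connection string is a placeholder):

-- on both publisher and subscriber (sequences are not created by replication):
CREATE SEQUENCE order_id_seq;

-- publisher (proposed syntax):
CREATE PUBLICATION seq_pub FOR ALL SEQUENCES;

-- subscriber: only the sequence values are synchronized at CREATE/REFRESH time
CREATE SUBSCRIPTION seq_sub CONNECTION 'host=pub dbname=postgres' PUBLICATION seq_pub;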

> > 1) CREATE PUBLICATION syntax enhancement:
> > CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > The addition of a new column titled "all sequences" in the
> > pg_publication system table will signify whether the publication is
> > designated as all sequences publication or not.
> >
> > 2)  CREATE SUBSCRIPTION -- no syntax change.
> > Upon creation of a subscription, the following additional steps will
> > be managed by the subscriber:
> > i) The subscriber will retrieve the list of sequences associated with
> > the subscription's publications.
> > ii) For each sequence: a) Retrieve the sequence value from the
> > publisher by invoking the pg_sequence_state function. b) Set the
> > sequence with the value obtained from the publisher. iv) Once the
> > subscription creation is completed, all sequence values will become
> > visible at the subscriber's end.
> >
> > An alternative design approach could involve retrieving the sequence
> > list from the publisher during subscription creation and inserting the
> > sequences with an "init" state into the pg_subscription_rel system
> > table. These tasks could be executed by a single sequence sync worker,
> > which would:
> > i) Retrieve the list of sequences in the "init" state from the
> > pg_subscription_rel system table.
> > ii) Initiate a transaction.
> > iii) For each sequence: a) Obtain the sequence value from the
> > publisher by utilizing the pg_sequence_state function. b) Update the
> > sequence with the value obtained from the publisher.
> > iv) Commit the transaction.
> >
> > The benefit with the second approach is that if there are large number
> > of sequences, the sequence sync can be enhanced to happen in parallel
> > and also if there are any locks held on the sequences in the
> > publisher, the sequence worker can wait to acquire the lock instead of
> > blocking the whole create subscription command which will delay the
> > initial copy of the tables too.
>
> Yeah w.r.t. this point second approach seems better.

ok

> > 3) Refreshing the sequence can be achieved through the existing
> > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > here).
> > The subscriber identifies stale sequences, meaning sequences present
> > in pg_subscription_rel but absent from the publication, and removes
> > them from the pg_subscription_rel system table. The subscriber also
> > checks for newly added sequences in the publisher and synchronizes
> > their values from the publisher using the steps outlined in the
> > subscription creation process. It's worth noting that previously
> > synchronized sequences won't be synchronized again; the sequence sync
> > will occur solely for the newly added sequences.
>
> > 4) Introducing a new command for refreshing all sequences: ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > The subscriber will remove stale sequences and add newly added
> > sequences from the publisher. Following this, it will re-synchronize
> > the sequence values for all sequences in the updated list from the
> > publisher, following the steps outlined in the subscription creation
> > process.
>
> Okay, this answers my first question: we will remove the sequences
> that are removed from the publisher and add the new sequences. I don't
> see any problem with this, but doesn't it seem like we are effectively
> doing DDL replication only for sequences without having a
> comprehensive plan for overall DDL replication?

What I intended to convey is that we'll eliminate the sequences from
pg_subscription_rel. We won't facilitate the DDL replication of
sequences; instead, we anticipate users to create the sequences
themselves.

> > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > value from the publisher, along with the page LSN. Incorporate
> > SetSequence function, which will procure a new relfilenode for the
> > sequence and set the new relfilenode with the specified value. This
> > will facilitate rollback in case of any failures.
>
> I do not understand this point, you mean whenever we are fetching the
> sequence value from the publisher we need to create a new relfilenode
> on the subscriber?  Why not just update the catalog tuple is
> sufficient?  Or this is for handling the ALTER SEQUENCE case?

Sequences operate distinctively from tables. Alterations to sequences
reflect instantly in another session, even before committing the
transaction

Re: Track the amount of time waiting due to cost_delay

2024-06-12 Thread Bertrand Drouvot
Hi,

On Tue, Jun 11, 2024 at 11:40:36AM -0500, Nathan Bossart wrote:
> On Tue, Jun 11, 2024 at 07:25:11AM +, Bertrand Drouvot wrote:
> > So I think that in v2 we could: 1) measure the actual wait time instead, 2)
> > count the number of times the vacuum slept. We could also 3) report the
> > effective cost limit (as proposed by Nathan up-thread) (I think that 3) 
> > could
> > be misleading but I'll yield to majority opinion if people think it's not).
> 
> I still think the effective cost limit would be useful, if for no other
> reason than to help reinforce that it is distributed among the autovacuum
> workers.

I also think it can be useful, my concern is more about putting this information
in pg_stat_progress_vacuum. What about Sawada-san's proposal in [1]? (we could
create a new view that would contain those data: per-worker dobalance,
cost_limit, cost_delay, active, and failsafe).

[1]: 
https://www.postgresql.org/message-id/CAD21AoDOu%3DDZcC%2BPemYmCNGSwbgL1s-5OZkZ1Spd5pSxofWNCw%40mail.gmail.com
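
To make that concrete, such a view could expose roughly the columns listed
above, e.g. (entirely hypothetical; the view name is invented and nothing
like this exists today):

-- one row per (auto)vacuum worker:
SELECT pid, dobalance, cost_limit, cost_delay, active, failsafe
FROM pg_stat_vacuum_cost;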

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Showing applied extended statistics in explain Part 2

2024-06-12 Thread Tomas Vondra
Hello Yamada-san,

I finally got to look at this patch again - apologies for taking so
long, I'm well aware it's rather frustrating to wait for feedback. I'll
try to pay more attention to this patch, and don't hesitate to ping me
off-list if immediate input is needed.

I looked at the patch from March 1 [1], which applies mostly with some
minor bitrot, but no major conflicts. A couple review comments:


1) The patch is not added to the CF app, which I think is a mistake. Can
you please add it to the 2024-07 commitfest? Otherwise people may not be
aware of it, won't do reviews etc. It'll require posting a rebased
patch, but should not be a big deal.


2) Not having the patch in a CF also means cfbot is not running tests on
it. Which is unfortunate, because the patch actually has a bug cfbot
would find - I've noticed it after running the tests through the github
CI, see [2].

FWIW I very much recommend setting up this CI and using it during
development, it turned out to be very valuable for me as it tests on a
range of systems, and more convenient than the rpi5 machines I used for
that purpose before. See src/tools/ci/README for details.


3) The bug has this symptom:

  ERROR:  unrecognized node type: 268
  CONTEXT:  PL/pgSQL function check_estimated_rows(text) line 7 ...
  STATEMENT:  SELECT * FROM check_estimated_rows('SELECT a, b FROM ...

but it only shows on the FreeBSD machine (in CI). But that's simply
because that's running tests with "-DCOPY_PARSE_PLAN_TREES", which
always copies the query plan, to make sure all the nodes can be copied.
And we don't have a copy function for the StatisticExtInfo node (that's
the node 268), so it fails.

FWIW you can have this failure even on a regular build, you just need to
do explain on a prepared statement (with an extended statistics):

  CREATE TABLE t (a int, b int);

  INSERT INTO t SELECT (i/100), (i/100)
FROM generate_series(1,1000) s(i);

  CREATE STATISTICS ON a,b FROM t;

  ANALYZE t;

  PREPARE ps (INT, INT) AS SELECT * FROM t WHERE a = $1 AND b = $2;

  EXPLAIN EXECUTE ps(5,5);

  ERROR:  unrecognized node type: 268


4) I can think of two basic ways to fix this issue - either allow
copying of the StatisticExtInto node, or represent the information in a
different way (e.g. add a new node for that purpose, or use existing
nodes to do that).

I don't think we should do the first thing - the StatisticExtInfo is in
pathnodes.h, is designed to be used during path construction, has a lot
of fields that we likely don't need to show stuff in explain - but we'd
need to copy those too, and that seems useless / expensive.

So I suggest we invent a new (much simpler) node, tracking only the bits
we actually need for the explain part. Or alternatively, if you think
adding a separate node is an overkill, maybe we could keep just an OID
of the statistics we applied, and the explain would lookup the name?

But I think having a new node might also make the patch simpler,
as the new struct could combine the information the patch keeps in three
separate lists. Instead, there'd be just one new list in Plan, members
would be the new node type, and each element would be

  (statistics OID, list of clauses, flag is_or)

or something like that.


5) In [3] Tom raised two possible issues with doing this - cost of
copying the information, and locking. For the performance concerns, I
think the first thing we should do is measuring how expensive it is. I
suggest measuring the overhead for about three basic cases:

 - table with no extended stats
 - table with a couple (1-10?) extended stats
 - table with a lot of (100?) extended stats

And see the impact of the patch. That is, measure the planning time with
master and with the patch applied. The table size does not matter much,
I think - this should measure just the planning, not execute the query.
In practice the extra costs will get negligible as the execution time
grows. But we're measuring the worst case, so that's fine.
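
A simple way to get that number is the planning time that EXPLAIN already
reports (sketch, reusing the table from the repro above; repeat and average):

-- SUMMARY prints "Planning Time" without executing the query:
EXPLAIN (SUMMARY ON) SELECT * FROM t WHERE a = 5 AND b = 5;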

For the locking, I agree with Robert [4] that this probably should not
be an issue - I don't see why this would be different from indexes etc.
But I haven't thought about that too much, so maybe investigate and test
this a bit more (that plans get invalidated if the statistics changes,
and so on).


6) I'm not sure we want to have this under EXPLAIN (VERBOSE). It's what
I did in the initial PoC patch, but maybe we should invent a new flag
for this purpose, otherwise VERBOSE will cover too much stuff? I'm
thinking about "STATS" for example.

This would probably mean the patch should also add a new auto_explain
"log_" flag to enable/disable this.


7) The patch really needs some docs - I'd mention this in the EXPLAIN
docs, probably. There's also a chapter about estimates, maybe that
should mention this too? Try searching for places in the SGML docs
mentioning extended stats and/or explain, I guess.

For tests, I guess stats_ext is the right place to test this. I'm not
sure what's the best way 

Re: Track the amount of time waiting due to cost_delay

2024-06-12 Thread Bertrand Drouvot
Hi,

On Tue, Jun 11, 2024 at 02:48:30PM -0400, Robert Haas wrote:
> On Tue, Jun 11, 2024 at 2:47 PM Nathan Bossart  
> wrote:
> > I'm struggling to think of a scenario in which the number of waits would be
> > useful, assuming you already know the amount of time spent waiting.

If we provide the actual time spent waiting, then also providing the number of
waits would let us see whether there is a diff between the actual time and the
intended time (i.e. number of waits * cost_delay, assuming cost_delay stays the
same for the whole vacuum duration). For example, 30000 waits with a 2ms
cost_delay means 60s of intended waiting. A large enough diff should trigger
some thoughts.

I think that what we are doing here is to somehow add instrumentation around the
"WAIT_EVENT_VACUUM_DELAY" wait event. If we were to add instrumentation for wait
events (generally speaking) we'd probably also expose the number of waits per
wait event (in addition to the time waited).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Postgresql OOM

2024-06-12 Thread Radu Radutiu
Hi,


> That's helpful, thanks!
>
> One thing to note is that postgres' work_mem is confusing - it applies to
> individual query execution nodes, not the whole query. Additionally, when
> you
> use parallel workers, each of the parallel workers can use the "full"
> work_mem, rather than work_mem being evenly distributed across workers.
>
> Within that, the memory usage in the EXPLAIN ANALYZE isn't entirely
> unexpected
> (I'd say it's unreasonable if you're not a postgres dev, but that's a
> different issue).
>
> We can see each of the Hash nodes use ~1GB, which is due to
> (1 leader + 4 workers) * work_mem = 5 * 200MB = 1GB.
>
> In this specific query we probably could free the memory in the "lower"
> hash
> join nodes once the node directly above has finished building, but we don't
> have the logic for that today.
>

I would understand 1 GB, even 2 GB (considering hash_mem_multiplier), but the
server ran out of memory with more than 64 GB.
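
For reference, the per-hash-node ceiling implied by the explanation above
works out to (rough arithmetic, assuming work_mem = 200MB,
hash_mem_multiplier = 2 and 1 leader + 4 workers):

SELECT 5 * 200 * 2 AS per_hash_node_limit_mb;   -- 2000 MB, nowhere near 64 GB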

>
> Of course that doesn't explain why the memory usage / temp file creation
> is so
> extreme with a lower limit / fewer workers.  There aren't any bad
> statistics
> afaics, nor can I really see a potential for a significant skew, it looks
> to
> me that the hashtables are all built on a single text field and that nearly
> all the input rows are distinct.
>
> Could you post the table definition (\d+) and the database definition
> (\l). One random guess I have is that you ended up with a
> "non-deterministic"
> collation for that column and that we end up with a bad hash due to that.
>

I was  able to eliminate the columns not involved in the query while still
retaining the problematic behavior (that's the reason for the new table
names):
postgres=# \d inreq
                                   Table "public.inreq"
          Column          |            Type             | Collation | Nullable | Default
--------------------------+-----------------------------+-----------+----------+---------
 input_sequence           | bigint                      |           | not null |
 msg_type                 | character varying(8)        |           |          |
 originalrequest_id       | bigint                      |           |          |
 receive_time             | timestamp without time zone |           |          |
 related_output_sequence  | bigint                      |           |          |
 msg_status               | character varying(15)       |           |          |
Indexes:
    "inreq_pkey" PRIMARY KEY, btree (input_sequence)
    "inreq_originalrequest_id_idx" btree (originalrequest_id)

postgres=# \d outreq
                      Table "public.outreq"
         Column         |  Type  | Collation | Nullable | Default
------------------------+--------+-----------+----------+---------
 output_sequence        | bigint |           | not null |
 input_sequence         | bigint |           |          |
 reply_input_sequence   | bigint |           |          |
 related_input_sequence | bigint |           |          |
Indexes:
    "outreq_pkey" PRIMARY KEY, btree (output_sequence)
    "outreq_input_sequence_idx" btree (input_sequence)
    "outreq_reply_input_sequence_idx" btree (reply_input_sequence)
postgres=# SELECT datname, datcollate FROM pg_database WHERE datname =
current_database();
 datname  | datcollate
----------+-------------
 postgres | en_US.UTF-8
(1 row)

A dump of the two tables above can be found at
https://drive.google.com/file/d/1ep1MYjNzlFaICL3GlPZaMdpOxRG6WCGz/view?usp=sharing
(compressed size 1GB; size of database after import 14 GB ).

# prepare my_query (timestamp, bigint) as
SELECT t.input_sequence, rec_tro.output_sequence, r.input_sequence,
       rpl_rec_tro.output_sequence, rpl_snd_tro.output_sequence,
       snd_tro.output_sequence, t.msg_type
FROM inreq t
LEFT JOIN outreq rec_tro     ON rec_tro.input_sequence = t.input_sequence
LEFT JOIN inreq  r           ON r.originalRequest_id = t.input_sequence
LEFT JOIN outreq rpl_rec_tro ON rpl_rec_tro.input_sequence = r.input_sequence
LEFT JOIN outreq rpl_snd_tro ON rpl_snd_tro.reply_input_sequence = r.input_sequence
LEFT JOIN outreq snd_tro     ON snd_tro.reply_input_sequence = t.input_sequence
WHERE t.receive_time < $1 AND t.input_sequence < $2
  AND t.msg_status IN ('COMPLETED', 'REJECTED')
ORDER BY t.msg_status DESC, t.input_sequence;
# EXPLAIN (ANALYZE,BUFFERS)  EXECUTE my_query('2024-05-18 00:00:00',
202406020168279904);

Best Regards,
Radu


Re: Logical Replication of sequences

2024-06-12 Thread Dilip Kumar
On Wed, Jun 12, 2024 at 4:08 PM vignesh C  wrote:
>
> On Wed, 12 Jun 2024 at 10:51, Dilip Kumar  wrote:
> >
> > On Tue, Jun 11, 2024 at 4:06 PM vignesh C  wrote:
> > >
> > > Amit and I engaged in an offline discussion regarding the design and
> > > contemplated that it could be like below:
> >
> > If I understand correctly, does this require the sequences to already
> > exist on the subscribing node before creating the subscription, or
> > will it also copy any non-existing sequences?
>
> Sequences must exist in the subscriber; we'll synchronize only their
> values. Any sequences that are not present in the subscriber will
> trigger an error.

Okay, that makes sense.

>
> > > 3) Refreshing the sequence can be achieved through the existing
> > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > here).
> > > The subscriber identifies stale sequences, meaning sequences present
> > > in pg_subscription_rel but absent from the publication, and removes
> > > them from the pg_subscription_rel system table. The subscriber also
> > > checks for newly added sequences in the publisher and synchronizes
> > > their values from the publisher using the steps outlined in the
> > > subscription creation process. It's worth noting that previously
> > > synchronized sequences won't be synchronized again; the sequence sync
> > > will occur solely for the newly added sequences.
> >
> > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > The subscriber will remove stale sequences and add newly added
> > > sequences from the publisher. Following this, it will re-synchronize
> > > the sequence values for all sequences in the updated list from the
> > > publisher, following the steps outlined in the subscription creation
> > > process.
> >
> > Okay, this answers my first question: we will remove the sequences
> > that are removed from the publisher and add the new sequences. I don't
> > see any problem with this, but doesn't it seem like we are effectively
> > doing DDL replication only for sequences without having a
> > comprehensive plan for overall DDL replication?
>
> What I intended to convey is that we'll eliminate the sequences from
> pg_subscription_rel. We won't facilitate the DDL replication of
> sequences; instead, we anticipate users to create the sequences
> themselves.

hmm okay.

> > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > value from the publisher, along with the page LSN. Incorporate
> > > SetSequence function, which will procure a new relfilenode for the
> > > sequence and set the new relfilenode with the specified value. This
> > > will facilitate rollback in case of any failures.
> >
> > I do not understand this point, you mean whenever we are fetching the
> > sequence value from the publisher we need to create a new relfilenode
> > on the subscriber?  Why not just update the catalog tuple is
> > sufficient?  Or this is for handling the ALTER SEQUENCE case?
>
> Sequences operate distinctively from tables. Alterations to sequences
> reflect instantly in another session, even before committing the
> transaction. To ensure the synchronization of sequence value and state
> updates in pg_subscription_rel, we assign it a new relfilenode. This
> strategy ensures that any potential errors allow for the rollback of
> both the sequence state in pg_subscription_rel and the sequence values
> simultaneously.

So, you're saying that when we synchronize the sequence values on the
subscriber side, we will create a new relfilenode to allow reverting
to the old state of the sequence in case of an error or transaction
rollback? But why would we want to do that? Generally, even if you
call nextval() on a sequence and then roll back the transaction, the
sequence value doesn't revert to the old value. So, what specific
problem on the subscriber side are we trying to avoid by operating on
a new relfilenode?
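
For reference, the existing non-transactional behavior being referred to
(quick sketch):

CREATE SEQUENCE s;
BEGIN;
SELECT nextval('s');   -- returns 1
ROLLBACK;
SELECT nextval('s');   -- returns 2, the increment was not rolled back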

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 01:48, Noah Misch  wrote:
> If we're going to test in a non-Perl language, I'd pick C over Python.  There
> would be several other unlikely-community-choice languages I'd pick over
> Python (C#, Java, C++).

My main goals of this thread are:
1. Allowing people to quickly write tests
2. Have those tests do what the writer intended them to do
3. Have good error reporting by default

Those goals indeed don't necessitate python.

But I think those are really hard to achieve with any C based
framework, and probably with C++ too. Also manual memory management in
tests seems to add tons of complexity for basically no benefit.

I think C#, Java, Go, Rust, Kotlin, and Swift would be acceptable
choices for me (and possibly some more). They allow some type of
introspection, they have a garbage collector, and their general
tooling is quite good.

But I think a dynamically typed scripting language is much more
fitting for writing tests like this. I love static typing for
production code, but imho it really doesn't have much benefit for
tests.

As scripting languages go, the ones that are still fairly heavily in
use are Javascript, Python, Ruby, and PHP. I think all of those could
probably work, but my personal order of preference would be Python,
Ruby, Javascript, PHP.

Finally, I'm definitely biased towards using Python myself. But I
think there's good reasons for that:
1. In the data space that Postgres is in, Python is very heavily used for analysis
2. Everyone coming out of university these days knows it to some extent
3. Multiple people in the community have been writing Postgres related
tests in python and have enjoyed doing so (me[1], Jacob[2],
Alexander[3])

What language people want to write tests in is obviously very
subjective. And obviously we cannot allow every language for writing
tests. But I think if ~25% of the community prefers to write their
tests in Python. Then that should be enough of a reason to allow them
to do so.

TO CLARIFY: This thread is not a proposal to replace Perl with Python.
It's a proposal to allow people to also write tests in Python.

> I also want the initial scope to be the new language coexisting with the
> existing Perl tests.  If a bulk translation ever happens, it should happen
> long after the debut of the new framework.  That said, I don't much trust a
> human-written bulk language translation to go through without some tests
> accidentally ceasing to test what they test in Perl today.

I definitely don't think we should rewrite all the tests that we have
in Perl today into some other language. But I do think that whatever
language we choose, that language should make it at least as easy to
write tests, as easy to read them and as easy to see that they are
testing the intended thing, as is currently the case for Perl.
Rewriting a few Perl tests into the new language, even if not merging
the rewrite, is a good way of validating that imho.

PS. For PgBouncer I actually hand-rewrote all the tests that we had in
bash (which is the worst testing language ever) in Python and doing so
actually found more bugs in PgBouncer code that our bash tests
wouldn't catch. So it's not necessarily the case that you lose
coverage by rewriting tests.

[1]: https://github.com/pgbouncer/pgbouncer/tree/master/test
[2]: https://github.com/jchampio/pg-pytest-suite
[3]: https://github.com/postgrespro/testgres




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Alexander Korotkov
On Tue, Jun 11, 2024 at 5:31 PM Jacob Champion
 wrote:
> On Mon, Jun 10, 2024 at 12:26 PM Alexander Korotkov
>  wrote:
> > Thank you for working on this.
> > Do you think you could re-use something from testgres[1] package?
>
> Possibly? I think we're all coming at this with our own bags of tricks
> and will need to carve off pieces to port, contribute, or reimplement.
> Does testgres have something in particular you'd like to see the
> Postgres tests support?

Generally, testgres was initially designed as a Python analogue of what
we have in src/test/perl/PostgreSQL/Test.  In particular, its
testgres.PostgresNode is an analogue of PostgreSQL::Test::Cluster.  It
comes under the PostgreSQL License.  So, I wonder if we could revise it
and fetch most of it into our source tree.

--
Regards,
Alexander Korotkov
Supabase




Re: Conflict Detection and Resolution

2024-06-12 Thread Tomas Vondra



On 6/12/24 06:32, Dilip Kumar wrote:
> On Tue, Jun 11, 2024 at 7:44 PM Tomas Vondra
>  wrote:
> 
>>> Yes, that's correct. However, many cases could benefit from the
>>> update_deleted conflict type if it can be implemented reliably. That's
>>> why we wanted to give it a try. But if we can't achieve predictable
>>> results with it, I'm fine to drop this approach and conflict_type. We
>>> can consider a better design in the future that doesn't depend on
>>> non-vacuumed entries and provides a more robust method for identifying
>>> deleted rows.
>>>
>>
>> I agree having a separate update_deleted conflict would be beneficial,
>> I'm not arguing against that - my point is actually that I think this
>> conflict type is required, and that it needs to be detected reliably.
>>
> 
> When working with a distributed system, we must accept some form of
> eventual consistency model.

I'm not sure this is necessarily true. There are distributed databases
implementing (or aiming to implement) regular consistency models, without eventual
consistency. I'm not saying it's easy, but it shows eventual consistency
is not the only option.

> However, it's essential to design a
> predictable and acceptable behavior. For example, if a change is a
> result of a previous operation (such as an update on node B triggered
> after observing an operation on node A), we can say that the operation
> on node A happened before the operation on node B. Conversely, if
> operations on nodes A and B are independent, we consider them
> concurrent.
> 

Right. And this is precisely the focus of my questions - understanding
what behavior we aim for / expect in the end. Or said differently, what
anomalies / weird behavior would be considered expected.

Because that's important both for discussions about feasibility, etc.
And also for evaluation / reviews of the patch.

> In distributed systems, clock skew is a known issue. To establish a
> consistency model, we need to ensure it guarantees the
> "happens-before" relationship. Consider a scenario with three nodes:
> NodeA, NodeB, and NodeC. If NodeA sends changes to NodeB, and
> subsequently NodeB makes changes, and then both NodeA's and NodeB's
> changes are sent to NodeC, the clock skew might make NodeB's changes
> appear to have occurred before NodeA's changes. However, we should
> maintain data that indicates NodeB's changes were triggered after
> NodeA's changes arrived at NodeB. This implies that logically, NodeB's
> changes happened after NodeA's changes, despite what the timestamps
> suggest.
> 
> A common method to handle such cases is using vector clocks for
> conflict resolution. "Vector clocks" allow us to track the causal
> relationships between changes across nodes, ensuring that we can
> correctly order events and resolve conflicts in a manner that respects
> the "happens-before" relationship. This method helps maintain
> consistency and predictability in the system despite issues like clock
> skew.
> 

I'm familiar with the concept of vector clock (or logical clock in
general), but it's not clear to me how you plan to use this in the
context of the conflict handling. Can you elaborate/explain?

The way I see it, conflict handling is pretty tightly coupled with
regular commit timestamps and MVCC in general. How would you use vector
clock to change that?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Alexander Korotkov
On Wed, Jun 12, 2024 at 2:48 PM Alexander Korotkov  wrote:
> On Tue, Jun 11, 2024 at 5:31 PM Jacob Champion
>  wrote:
> > On Mon, Jun 10, 2024 at 12:26 PM Alexander Korotkov
> >  wrote:
> > > Thank you for working on this.
> > > Do you think you could re-use something from testgres[1] package?
> >
> > Possibly? I think we're all coming at this with our own bags of tricks
> > and will need to carve off pieces to port, contribute, or reimplement.
> > Does testgres have something in particular you'd like to see the
> > Postgres tests support?
>
> Generally, testgres was initially designed as a Python analogue of what
> we have in src/test/perl/PostgreSQL/Test.  In particular, its
> testgres.PostgresNode is an analogue of PostgreSQL::Test::Cluster.  It
> comes under the PostgreSQL License.  So, I wonder if we could revise it
> and fetch most of it into our source tree.

Plus testgres has existed since 2016 and already has quite an amount of use
cases.  This is what I quickly found on github.

https://github.com/adjust/pg_querylog
https://github.com/postgrespro/pg_pathman
https://github.com/lanterndata/lantern
https://github.com/orioledb/orioledb
https://github.com/cbandy/pgtwixt
https://github.com/OpenNTI/nti.testing
https://github.com/postgrespro/pg_probackup
https://github.com/postgrespro/rum

--
Regards,
Alexander Korotkov
Supabase




Re: Track the amount of time waiting due to cost_delay

2024-06-12 Thread Bertrand Drouvot
Hi,

On Tue, Jun 11, 2024 at 01:13:48PM -0400, Robert Haas wrote:
> On Tue, Jun 11, 2024 at 5:49 AM Bertrand Drouvot
>  wrote:
> > As we can see the actual wait time is 30ms less than the intended wait time 
> > with
> > this simple test. So I still think we should go with 1) actual wait time 
> > and 2)
> > report the number of waits (as mentioned in [1]). Does that make sense to 
> > you?
> 
> I like the idea of reporting the actual wait time better,

+1

> provided
> that we verify that doing so isn't too expensive. I think it probably
> isn't, because in a long-running VACUUM there is likely to be disk
> I/O, so the CPU overhead of a few extra gettimeofday() calls should be
> fairly low by comparison.

Agree.

> I wonder if there's a noticeable hit when
> everything is in-memory. I guess probably not, because with any sort
> of normal configuration, we shouldn't be delaying after every block we
> process, so the cost of those gettimeofday() calls should still be
> getting spread across quite a bit of real work.

I did some testing, with:

shared_buffers = 12GB
vacuum_cost_delay = 1
autovacuum_vacuum_cost_delay = 1
max_parallel_maintenance_workers = 0
max_parallel_workers = 0

added to a default config file.

A table and all its indexes were fully in memory, the numbers are:

postgres=# SELECT n.nspname, c.relname, count(*) AS buffers
 FROM pg_buffercache b JOIN pg_class c
 ON b.relfilenode = pg_relation_filenode(c.oid) AND
b.reldatabase IN (0, (SELECT oid FROM pg_database
  WHERE datname = current_database()))
 JOIN pg_namespace n ON n.oid = c.relnamespace
 GROUP BY n.nspname, c.relname
 ORDER BY 3 DESC
 LIMIT 11;

 nspname |  relname  | buffers
-+---+-
 public  | large_tbl |  80
 public  | large_tbl_pkey|5486
 public  | large_tbl_filler7 |1859
 public  | large_tbl_filler4 |1859
 public  | large_tbl_filler1 |1859
 public  | large_tbl_filler6 |1859
 public  | large_tbl_filler3 |1859
 public  | large_tbl_filler2 |1859
 public  | large_tbl_filler5 |1859
 public  | large_tbl_filler8 |1859
 public  | large_tbl_version |1576
(11 rows)


The observed timings when vacuuming this table are:

On master:

vacuum phase: cumulative duration
-

scanning heap: 00:00:37.808184
vacuuming indexes: 00:00:41.808176
vacuuming heap: 00:00:54.808156

On master patched with actual time delayed:

vacuum phase: cumulative duration
-

scanning heap: 00:00:36.502104 (time_delayed: 22202)
vacuuming indexes: 00:00:41.002103 (time_delayed: 23769)
vacuuming heap: 00:00:54.302096 (time_delayed: 34886)

As we can see, there is no noticeable degradation, even though the vacuum entered
this instrumentation code path about 34886 times (cost_delay was set to 1).

> That said, I'm not sure this experiment shows a real problem with the
> idea of showing intended wait time. It does establish the concept that
> repeated signals can throw our numbers off, but 30ms isn't much of a
> discrepancy.

Yeah, the idea was just to show how easy it is to create a 30ms discrepancy.

> I'm worried about being off by a factor of two, or an
> order of magnitude. I think we still don't know if that can happen,
> but if we're going to show actual wait time anyway, then we don't need
> to explore the problems with other hypothetical systems too much.

Agree.

> I'm not convinced that reporting the number of waits is useful. If we
> were going to report a possibly-inaccurate amount of actual waiting,
> then also reporting the number of waits might make it easier to figure
> out when the possibly-inaccurate number was in fact inaccurate. But I
> think it's way better to report an accurate amount of actual waiting,
> and then I'm not sure what we gain by also reporting the number of
> waits.

Sami shared his thoughts in [1] and [2] and so did I in [3]. If some of us still
don't think that reporting the number of waits is useful then we can probably
start without it.

[1]: 
https://www.postgresql.org/message-id/0EA474B6-BF88-49AE-82CA-C1A9A3C17727%40amazon.com
[2]: 
https://www.postgresql.org/message-id/E12435E2-5FCA-49B0-9ADB-0E7153F95E2D%40amazon.com
[3]: 
https://www.postgresql.org/message-id/ZmmGG4e%2BqTBD2kfn%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 01:48, Noah Misch  wrote:
> If we're going to test in a non-Perl language, I'd pick C over Python.
> 
> We'd need a library like today's Perl
> PostgreSQL::Test to make C-language tests nice, but the same would apply to
> any new language.

P.P.S. We already write tests in C; we use it for testing libpq[1].
I'd personally definitely welcome a C library to make those tests
nicer to write, because I've written a fair bit of them in the past
and currently it's not very fun to do.

[1]: 
https://github.com/postgres/postgres/blob/master/src/test/modules/libpq_pipeline/libpq_pipeline.c
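
For readers who have not looked at that file: such a test is plain C against
libpq. A minimal, self-contained sketch follows; the connection string and the
failure handling are illustrative only, not how libpq_pipeline.c is structured:

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

int
main(void)
{
	/* the real tests get the connection string from the test harness */
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		exit(1);
	}

	res = PQexec(conn, "SELECT 1");
	if (PQresultStatus(res) != PGRES_TUPLES_OK)
	{
		fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
		exit(1);
	}

	PQclear(res);
	PQfinish(conn);
	return 0;
}

Most of the bulk in the real tests is exactly this kind of connect/execute/check
boilerplate, which is what a helper library would absorb.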




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Peter Eisentraut

On 11.06.24 16:55, David E. Wheeler wrote:

On Jun 10, 2024, at 15:39, Andres Freund  wrote:


That's 6 years ago, not sure we can really learn that much from that.

And it's not like it's actually impossible, #ifdefs aren't great, but they are
better than nothing.


Right, it’s just that extension authors could use some notification that such a 
change is coming so they can update their code, if necessary.


I think since around 6 years ago we have been much more vigilant about 
avoiding ABI breaks.  So if there aren't any more recent examples of 
breakage, then maybe that was ultimately successful, and the upshot is, 
continue to be vigilant at about the same level?






Re: ON ERROR in json_query and the like

2024-06-12 Thread Markus Winand



> On 11.06.2024, at 03:58, jian he  wrote:
> 
> On Tue, May 28, 2024 at 5:29 PM Markus Winand  wrote:
>> 
>> Hi!
>> 
>> I’ve noticed two “surprising” (to me) behaviors related to
>> the “ON ERROR” clause of the new JSON query functions in 17beta1.
>> 
>> 1. JSON parsing errors are not subject to ON ERROR
>>   Apparently, the functions expect JSONB so that a cast is implied
>>   when providing TEXT. However, the errors during that cast are
>>   not subject to the ON ERROR clause.
>> 
>>   17beta1=# SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);
>>   ERROR:  invalid input syntax for type json
>>   DETAIL:  Token "invalid" is invalid.
>>   CONTEXT:  JSON data, line 1: invalid
>> 
>>   Oracle DB and Db2 (LUW) both return NULL in that case.
>> 
>>   I had a look on the list archive to see if that is intentional but
>>   frankly speaking these functions came a long way. In case it is
>>   intentional it might be worth adding a note to the docs.
>> 
> 
> json_query ( context_item, path_expression);
> 
> `SELECT JSON_QUERY('invalid', '$' NULL ON ERROR);`
> to make this return NULL, that means to catch all the errors that
> happened while context_item evaluation.
> otherwise, it would not be consistent?
> 
> Currently context_item expressions can be quite arbitrary.
> considering the following examples.
> 
> create or replace function test(jsonb) returns jsonb as $$ begin raise
> exception 'abort'; end $$ language plpgsql;
> create or replace function test1(jsonb) returns jsonb as $$ begin
> return $1; end $$ language plpgsql;
> SELECT JSON_VALUE(test('1'), '$');
> SELECT JSON_VALUE(test1('1'), '$');
> SELECT JSON_VALUE((select '1'::jsonb), '$');
> SELECT JSON_VALUE((with cte(s) as (select '1') select s::jsonb from cte), 
> '$');
> SELECT JSON_VALUE((with cte(s) as (select '1') select s::jsonb from
> cte union all select s::jsonb from cte limit 1), '$');
> 
> Currently, I don't think we can make
> SELECT JSON_VALUE(test('1'), '$' null on error);
> return NULL.

This is not how it is meant. Your example is not subject to the ON ERROR
clause because the error happens in a sub-expression. My point is that
ON ERROR includes the string-to-JSON conversion (the JSON parsing) that,
in the way the standard describes these functions, happens inside of them.

In the standard, JSON_VALUE & co accept string types as well as the type JSON:

10.14 SR 1: The declared type of the  simply contained in the 
 immediately contained in the  shall 
be a string type or a JSON type. 

It might be best to think of it as two separate functions, overloaded:

JSON_VALUE(context_item JSONB, path_expression …)
JSON_VALUE(context_item TEXT, path_expression …)

Now if you do this:
create function test2(text) returns text as $$ begin
return $1; end $$ language plpgsql;
create function test3(text) returns jsonb as $$ begin
return $1::jsonb; end $$ language plpgsql;

SELECT JSON_VALUE(test2('invalid'), '$' null on error);
SELECT JSON_VALUE(test3('invalid'), '$' null on error);

The first query should return NULL, while the second should (and does) fail.

This is how I understand it.

-markus



Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 14:44, Peter Eisentraut  wrote:
> I think since around 6 years ago we have been much more vigilant about
> avoiding ABI breaks.  So if there aren't any more recent examples of
> breakage, then maybe that was ultimately successful, and the upshot is,
> continue to be vigilant at about the same level?

While not strictly an ABI break I guess, the backport of 32d5a4974c81
broke building Citus against 13.10 and 14.7[1].

[1]: https://github.com/citusdata/citus/pull/6711




Re: race condition in pg_class

2024-06-12 Thread Michail Nikolaev
Hello, everyone.

I am not sure, but I think that issue may be related to the issue described
in
https://www.postgresql.org/message-id/CANtu0ojXmqjmEzp-%3DaJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg%40mail.gmail.com

It looks like REINDEX CONCURRENTLY could interfere with ON CONFLICT UPDATE
in some strange way.

Best regards,
Mikhail.


Re: CI and test improvements

2024-06-12 Thread Justin Pryzby
On Fri, Jun 24, 2022 at 08:38:50AM +1200, Thomas Munro wrote:
> > I've also sent some patches to Thomas for cfbot to help progress some of 
> > these
> > patches (code coverage and documentation upload as artifacts).
> > https://github.com/justinpryzby/cfbot/commits/master
> 
> Thanks, sorry for lack of action, will get to these soon.

On Tue, Feb 13, 2024 at 01:10:21PM -0600, Justin Pryzby wrote:
> I sent various patches to cfbot but haven't heard back.

> https://www.postgresql.org/message-id/flat/20220409021853.gp24...@telsasoft.com
> https://www.postgresql.org/message-id/flat/20220623193125.gb22...@telsasoft.com
> https://github.com/justinpryzby/cfbot/commits/master
> https://github.com/macdice/cfbot/pulls

@Thomas: ping

I reintroduced the patch for ccache/windows -- v4.10 supports PCH, which
can make the builds 2x faster.
From 8cebe912127ee5c5d03b62b231811c2670698da0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 21:53:22 -0500
Subject: [PATCH 1/6] cirrus/windows: add compiler_warnings_script

The goal is to fail due to warnings only after running tests.
(At least historically, it's too slow to run a separate windows VM to
compile with -Werror.)

https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de

I'm not sure how to write this test in windows shell; it's also easy to
write something that doesn't work in posix sh, since windows shell is
interpreting && and ||...

See also:
8a1ce5e54f6d144e4f8e19af7c767b026ee0c956
https://cirrus-ci.com/task/6241060062494720
https://cirrus-ci.com/task/6496366607204352

20221104161934.gb16...@telsasoft.com
20221031223744.gt16...@telsasoft.com
20221104161934.gb16...@telsasoft.com
20221123054329.gg11...@telsasoft.com
20230224002029.gq1...@telsasoft.com

ci-os-only: windows
---
 .cirrus.tasks.yml | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 33646faeadf..5a2b64f64c2 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -566,12 +566,21 @@ task:
 
   build_script: |
 vcvarsall x64
-ninja -C build
+ninja -C build |tee build.txt
+REM Since pipes lose the exit status of the preceding command, rerun the compilation
+REM without the pipe, exiting now if it fails, to avoid trying to run checks
+ninja -C build > nul
 
   check_world_script: |
 vcvarsall x64
 meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
 
+  # This should be last, so check_world is run even if there are warnings
+  always:
+compiler_warnings_script:
+  # this avoids using metachars which would be interpretted by the windows shell
+  - sh -c 'if grep ": warning " build.txt; then exit 1; fi; exit 0'
+
   on_failure:
 <<: *on_failure_meson
 crashlog_artifacts:
-- 
2.42.0

From 7f8e1c244b24f04a28d5a1a03da85f00419717e1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 22:05:13 -0500
Subject: [PATCH 2/6] cirrus/windows: ccache

https://www.postgresql.org/message-id/flat/20220522232606.GZ19626%40telsasoft.com

https://cirrus-ci.com/task/5213936495099904 build 29sec

, linux
linux-meson
ci-os-only: windows, windows-msvc
---
 .cirrus.tasks.yml | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 5a2b64f64c2..7a3fcc36fae 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -536,6 +536,11 @@ task:
   env:
 TEST_JOBS: 8 # wild guess, data based value welcome
 
+CCACHE_DIR: $CIRRUS_WORKING_DIR/.ccache
+CCACHE_MAXSIZE: "500M"
+CCACHE_SLOPPINESS: pch_defines,time_macros
+CCACHE_DEPEND: 1
+
 # Cirrus defaults to SetErrorMode(SEM_NOGPFAULTERRORBOX | ...). That
 # prevents crash reporting from working unless binaries do SetErrorMode()
 # themselves. Furthermore, it appears that either python or, more likely,
@@ -550,8 +555,11 @@ task:
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'
 
-  setup_additional_packages_script: |
-REM choco install -y --no-progress ...
+  setup_additional_packages_script:
+- choco install -y --no-progress ccache --version 4.10.0
+
+  ccache_cache:
+folder: $CCACHE_DIR
 
   setup_hosts_file_script: |
 echo 127.0.0.1 pg-loadbalancetest >> c:\Windows\System32\Drivers\etc\hosts
@@ -562,7 +570,8 @@ task:
   # Use /DEBUG:FASTLINK to avoid high memory usage during linking
   configure_script: |
 vcvarsall x64
-meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" build
+set CC=c:\ProgramData\chocolatey\lib\ccache\tools\ccache-4.10-windows-x86_64\ccache.exe cl.exe
+meson setup build --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcasser

Re: ON ERROR in json_query and the like

2024-06-12 Thread David G. Johnston
On Wednesday, June 12, 2024, Markus Winand  wrote:
>
>
> 10.14 SR 1: The declared type of the  simply contained
> in the  immediately contained in the  item> shall be a string type or a JSON type.


> It might be best to think of it as two separate functions, overloaded:
>
> JSON_VALUE(context_item JSONB, path_expression …)
> JSON_VALUE(context_item TEXT, path_expression …)


Yes, we need to document that we deviate from (fail to fully implement) the
standard here in that we only provide jsonb parameter functions, not text
ones.

David J.


Re: ON ERROR in json_query and the like

2024-06-12 Thread Markus Winand


> On 04.06.2024, at 07:00, jian he  wrote:
> 
> On Tue, May 28, 2024 at 5:29 PM Markus Winand  wrote:
> 
>> 2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
>> 
>>   17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
>>a
>>   
>>[]
>>   (1 row)
>> 
>>   As NULL ON EMPTY is implied, it should give the same result as
>>   explicitly adding NULL ON EMPTY:
>> 
> 
> I vaguely remember, we stumbled on ON ERROR, ON EMPTY several times.
> i don't have a standard, 

My understanding of the standard is that there is no distinction
between an explicit and an implicit ON EMPTY clause.

E.g. clause 6.35 (json_query) Syntax Rule 4:

•  If <JSON query empty behavior> is not specified, then NULL ON EMPTY is 
implicit.

General Rule 5ai then covers the NULL ON EMPTY case:

   • i)  If the length of SEQ2 is 0 (zero) and ONEMPTY is NULL, then 
let JV be the null value.

Neither of these makes the ON EMPTY handling dependent on the presence of ON 
ERROR.

-markus



Re: relfilenode statistics

2024-06-12 Thread Bertrand Drouvot
Hi,

On Tue, Jun 11, 2024 at 03:35:23PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 10 Jun 2024 08:09:56 +, Bertrand Drouvot 
>  wrote in 
> > 
> > My idea was to move all that is in pg_statio_all_tables to relfilenode stats
> > and 1) add new stats to pg_statio_all_tables (like heap_blks_written), 2) 
> > ensure
> > the user can still retrieve the stats from pg_statio_all_tables in such a 
> > way
> > that it survives a rewrite, 3) provide dedicated APIs to retrieve
> > relfilenode stats but only for the current relfilenode, 4) document this
> > behavior. This is what the POC patch is doing for heap_blks_written (would
> > need to do the same for heap_blks_read and friends) except for the 
> > documentation
> > part. What do you think?
> 
> In my opinion,

Thanks for looking at it!

> it is certainly strange that bufmgr is aware of
> relation kinds, but introducing relfilenode stats to avoid this skew
> doesn't seem to be the best way, as it invites inconclusive arguments
> like the one raised above. The fact that we transfer counters from old
> relfilenodes to new ones indicates that we are not really interested
> in counts by relfilenode. If that's the case, wouldn't it be simpler
> to call pgstat_count_relation_buffer_read() from bufmgr.c and then
> branch according to relkind within that function? If you're concerned
> about the additional branch, some ingenuity may be needed.

That may be doable for "read" activities but what about write activities?
Do you mean not relying on relfilenode stats for reads but relying on 
relfilenode
stats for writes?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: ON ERROR in json_query and the like

2024-06-12 Thread David G. Johnston
On Tuesday, May 28, 2024, Markus Winand  wrote:

>
> 2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
>
>17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
> a
>
> []
>(1 row)
>
>As NULL ON EMPTY is implied, it should give the same result as
>explicitly adding NULL ON EMPTY:
>
>17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON
> ERROR) a;
> a
>---
>
>(1 row)
>
>Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
>on the other hand returns NULL for both queries.
>
>I don’t think that PostgreSQL should follow Oracle DB's suit here
>but again, in case this is intentional it should be made explicit
>in the docs.
>

The docs here don’t seem to cover the on empty clause at all nor fully
cover all options.

Where do you find the claim that the one implies the other?  Is it a typo
that your example says “implies null on empty” but the subject line says
“implies error on empty”?

Without those clauses a result is either empty or an error - they are
mutually exclusive (ignoring matches).  I would not expect one clause to
imply or affect the behavior of the other.  There is no chaining.  The
original result is transformed to the new result specified by the clause.

I’d need to figure out whether the example you show is actually producing
empty or error; but it seems correct if the result is empty.  The first
query ignores the error clause - the empty array row seems to be the
representation of empty here; the second one matches the empty clause and
outputs null instead of the empty array.

David J.


Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread FWS Neil


> On Jun 12, 2024, at 6:40 AM, Jelte Fennema-Nio  wrote:
> 
> I think C#, Java, Go, Rust, Kotlin, and Swift would be acceptable
> choices for me (and possibly some more). They allow some type of
> introspection, they have a garbage collector, and their general
> tooling is quite good.
> 

Having used Python for 15+ years and then abandoned it for all projects, I would
say the most important points for a long-term project like Postgres are,
not necessarily in order, package stability, package depth, semantic versioning,
available resources, and multiprocessor support.

The reason I abandoned Python was the constant API breaks in packages. Yes,
python is a great language to teach in school for a one-time class project, but
that is not Postgres’s use-case.  Remember that Fortran and Pascal were the 
darlings for teaching in school prior to Python and no-one uses them any more.

Yes Python innovates fast and is fashionable.  But again, not Postgres’s 
use-case.

I believe that anyone coming out of school these days would have a relatively
easy transition to any of Go, Rust, Kotlin, Swift, etc.  In other words, any of
the modern languages.  In addition, the language should scale well to
multiprocessors, because parallel testing is becoming more important every day.

If the Postgres project is going to pick a new language for testing, it should
pick a language for the next 50 years based on the projects needs.

Python is good for package depth and resource availability, but IMO fails in the
other categories. In my experience, a language where the program flow can change
because of non-visible characters is a terrible way to write robust, long-term
maintainable code.  Because of this, most of the modern languages are going to be
closer in style to Postgres’s C code base than Python.



Re: ON ERROR in json_query and the like

2024-06-12 Thread Markus Winand



> On 12.06.2024, at 15:31, David G. Johnston  wrote:
> 
> On Tuesday, May 28, 2024, Markus Winand  wrote:
> 
> 2. EMPTY [ARRAY|OBJECT] ON ERROR implies ERROR ON EMPTY
> 
>17beta1=# SELECT JSON_QUERY('[]', '$[*]' EMPTY ARRAY ON ERROR) a;
> a
>
> []
>(1 row) 
> 
>As NULL ON EMPTY is implied, it should give the same result as
>explicitly adding NULL ON EMPTY:
> 
>17beta1=# SELECT JSON_QUERY('[]', '$[*]' NULL ON EMPTY EMPTY ARRAY ON 
> ERROR) a;
> a
>---
> 
>(1 row)
> 
>Interestingly, Oracle DB gives the same (wrong) results. Db2 (LUW)
>on the other hand returns NULL for both queries.
> 
>I don’t think that PostgreSQL should follow Oracle DB's suit here
>but again, in case this is intentional it should be made explicit
>in the docs.
> 
> The docs here don’t seem to cover the on empty clause at all nor fully cover 
> all options.
> 
> Where do you find the claim that the one implies the other?  Is it a typo 
> that your examples says “implies null on empty” but the subject line says 
> “implies error on empty”?

I see the confusion caused — sorry. The headline was meant to describe the 
observed behaviour in 17beta1, while the content refers to how the standard 
defines it.

> Without those clauses a result is either empty or an error - they are 
> mutually exclusive (ignoring matches).  I would not expect one clause to 
> imply or affect the behavior of the other.  There is no chaining.  The 
> original result is transformed to the new result specified by the clause.

Agreed, that’s why I found the 17beta1 behaviour surprising. 

> I’d need to figure out whether the example you show is actually producing 
> empty or error; but it seems correct if the result is empty.

As I understand the standard, an empty result is not an error.

>   The first query ignores the error clause - the empty array row seems to be 
> the representation of empty here; the second one matches the empty clause and 
> outputs null instead of the empty array.

But the first should behave the same, as the standard implies NULL ON EMPTY if 
there is no explicit ON EMPTY clause. Oracle DB behaving differently here makes 
me wonder if there is something in the standard that I haven’t noticed yet...

-markus





Re: Add support to TLS 1.3 cipher suites and curves lists

2024-06-12 Thread Peter Eisentraut

On 12.06.24 10:51, Jelte Fennema-Nio wrote:

On Mon, 10 Jun 2024 at 12:31, Daniel Gustafsson  wrote:

Regarding the ciphersuites portion of the patch.  I'm not particularly thrilled
about having a GUC for TLSv1.2 ciphers and one for TLSv1.3 ciphersuites, users
not all that familiar with TLS will likely find it confusing to figure out what
to do.


I don't think it's easy to create a single GUC because OpenSSL has
different APIs for both. So we'd have to add some custom parsing for
the combined string, which is likely to cause some problems imho. I
think separating them is the best option from the options we have and
I don't think it matters much practice for users. Users not familiar
with TLS might indeed be confused, but those users shouldn't touch
these settings anyway, and just use the defaults. The users that care
about this probably already get two cipher strings from their
compliance teams, because many other applications also have two
separate options for specifying both.


Maybe some comparisons with other SSL-enabled server products would be 
useful.


Here is the Apache httpd setting:

https://httpd.apache.org/docs/current/mod/mod_ssl.html#sslciphersuite

They use a complex syntax to be able to set both via one setting.

Here is the nginx setting:

https://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_ciphers

This doesn't appear to support TLS 1.3?





Re: Proposal: Document ABI Compatibility

2024-06-12 Thread David E. Wheeler
On Jun 12, 2024, at 8:43 AM, Peter Eisentraut  wrote:

>> Right, it’s just that extension authors could use some notification that 
>> such a change is coming so they can update their code, if necessary.
> 
> I think since around 6 years ago we have been much more vigilant about 
> avoiding ABI breaks.  So if there aren't any more recent examples of 
> breakage, then maybe that was ultimately successful, and the upshot is, 
> continue to be vigilant at about the same level?

That sounds great to me. I’d like to get it documented, though, so that 
extension and other third party developers are aware of it, and not just making 
wild guesses and scaring each other over (perhaps) misconceptions.

D





Re: SQL:2011 application time

2024-06-12 Thread Robert Haas
On Wed, Jun 5, 2024 at 4:56 PM Paul Jungwirth
 wrote:
> **Option 2**: Add a new operator, called &&&, that works like && except an 
> empty range *does*
> overlap another empty range. Empty ranges should still not overlap anything 
> else. This would fix the
> exclusion constraint. You could add `(5, 'empty')` once but not twice. This 
> would allow empties to
> people who want to use them. (We would still forbid them if you define a 
> PERIOD, because those come
> with the CHECK constraint mentioned above.)
> And there is almost nothing to code. But it is mathematically suspect to say 
> an empty range overlaps
> something small (something with zero width) but not something big. Surely if 
> a && b and b <@ c, then
> a && c? So this feels like the kind of elegant hack that you eventually 
> regret.

I think this might be fine.

> **Option 3**: Forbid empties, not as a reified CHECK constraint, but just 
> with some code in the
> executor. Again we could do just PKs or PKs and UNIQUEs. Let's do both, for 
> all the reasons above.
> Not creating a CHECK constraint is much less clunky. There is no catalog 
> entry to create/drop. Users
> don't wonder where it came from when they say `\d t`. It can't conflict with 
> constraints of their
> own. We would enforce this in ExecConstraints, where we enforce NOT NULL and 
> CHECK constraints, for
> any table with constraints where conperiod is true. We'd also need to do this 
> check on existing rows
> when you create a temporal PK/UQ. This option also requires a new field in 
> pg_class: just as we have
> relchecks, relhasrules, relhastriggers, etc. to let us skip work in the 
> relcache, I assume we'd want
> relperiods.

I don't really like the existing relhasWHATEVER fields and am not very
keen about adding more of them. Maybe it will turn out to be the best
way, but finding the right times to set and unset such fields has been
challenging over the years, and we've had to fix some bugs. So, if you
go this route, I recommend looking carefully at whether there's a
reasonable way to avoid the need for such a field. Other than that,
this idea seems reasonable.

> **Option 4**: Teach GiST indexes to enforce uniqueness. We didn't discuss 
> this at pgconf, at least
> not in reference to the empties problem. But I was thinking about this 
> request from Matthias for
> temporal PKs & UQs to support `USING INDEX idx`.[2] It is confusing that a 
> temporal index has
> indisunique, but if you try to create a unique GiST index directly we say 
> they don't support UNIQUE
> indexes! Similarly `pg_indexam_has_property(783, 'can_unique')` returns 
> false. There is something
> muddled about all that. So how about we give the GiST AM handler amcanunique?
>
> As I understand it, GiST indexes are capable of uniqueness,[3] and indeed 
> today you can create an
> exclusion constraint with the same effect, but in the past the core had no 
> way of asking an opclass
> which operator gave equality. With the stratnum support proc from 6db4598fcb 
> (part of this patch
> series, but reverted from v17), we could get a known operator for "equals". 
> If the index's opclasses
> had that sproc and it gave non-zero for RTEqualStrategyNumber, then CREATE 
> UNIQUE INDEX would
> succeed. We would just ("just") need to make GiST raise an error if it found 
> a duplicate. And if
> *that* was happening, the empty ranges wouldn't cause a problem.

Isn't this just a more hacky version of option (2)?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread David E. Wheeler
On Jun 12, 2024, at 8:58 AM, Jelte Fennema-Nio  wrote:

> While not strictly an ABI break I guess, the backport of 32d5a4974c81
> broke building Citus against 13.10 and 14.7[1].
> 
> [1]: https://github.com/citusdata/citus/pull/6711

Interesting one. We might want to advise projects to use different names if they 
copy code from the core, using an extension-specific prefix perhaps. That way, if 
it gets backported by the core, as in this example, it won’t break anything, 
and the extension can choose to switch.
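
A hypothetical illustration of that advice (the names are invented here, not
taken from the Citus case):

#include <stdint.h>

/*
 * An extension that copies a small helper from the core code gives it an
 * extension-specific prefix.  If the core later backports a function with
 * the original name into a minor release, the two definitions cannot
 * collide at build time.
 */
static inline int
myext_cmp_u32(uint32_t a, uint32_t b)
{
	if (a < b)
		return -1;
	if (a > b)
		return 1;
	return 0;
}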

D





Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade

2024-06-12 Thread Nathan Bossart
On Tue, Jun 11, 2024 at 10:39:51AM +0200, Matthias van de Meent wrote:
> On Tue, 11 Jun 2024 at 04:01, Nathan Bossart  wrote:
>> It looks like pg_dump always uses template0, so AFAICT we don't even need
>> the suggested manual checkpoint after restoring template1.
> 
> Thanks for reminding me. It seems I misunderstood the reason why we
> first process template1 in create_new_objects, as I didn't read the
> comments thoroughly enough.

Actually, I think you are right that we need a manual checkpoint, except I
think we need it to be after prepare_new_globals().  set_frozenxids()
connects to each database (including template0) and updates a bunch of
pg_class rows, and we probably want those on disk before we start copying
the files to create all the user's databases.

-- 
nathan




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Robert Haas
On Mon, Jun 3, 2024 at 3:39 PM Tom Lane  wrote:
> Me either.  There are degrees of ABI compatibility

Exactly this!

What I think would be useful to document is our usual practices e.g.
adding new struct members at the end of structs, trying to avoid
changing public function signatures. If we document promises to
extension authors, I don't know how much difference that will make:
we'll probably end up needing to violate them at some point for one
reason or another. But if we document what committers should do, then
we might do better than we are doing now, because committers will be more
likely to do it right, and extension authors can also read those
instructions to understand what our practices are.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread walther

Jelte Fennema-Nio:

As scripting languages go, the ones that are still fairly heavily in
use are Javascript, Python, Ruby, and PHP. I think all of those could
probably work, but my personal order of preference would be Python,
Ruby, Javascript, PHP.

Finally, I'm definitely biased towards using Python myself. But I
think there's good reasons for that:
1. In the data space, that Postgres in, Python is very heavily used for analysis
2. Everyone coming out of university these days knows it to some extent
3. Multiple people in the community have been writing Postgres related
tests in python and have enjoyed doing so (me[1], Jacob[2],
Alexander[3])


PostgREST also uses pytest for integration tests - and that was a very 
good decision compared to the bash based tests we had before.


One more argument for Python compared to the other mentioned scripting 
languages: Python is already a development dependency via meson. None of 
the other 3 are. In a future where meson will be the only build system, 
we will have python "for free" already.


Best,

Wolfgang




Re: Remove dependence on integer wrapping

2024-06-12 Thread Nathan Bossart
On Tue, Jun 11, 2024 at 09:10:44PM -0400, Joseph Koshakow wrote:
> The attached patch has updated this file too. For some reason I was
> under the impression that I should leave the ecpg/ files alone, though
> I can't remember why.

Thanks.  This looks pretty good to me after a skim, so I'll see about
committing/back-patching it in the near future.  IIUC there is likely more
to come in this area, but I see no need to wait.

(I'm chuckling that we are adding Y2K tests in 2024...)

-- 
nathan




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread David E. Wheeler
On Jun 12, 2024, at 10:47, Robert Haas  wrote:

> What I think would be useful to document is our usual practices e.g.
> adding new struct members at the end of structs, trying to avoid
> changing public function signatures. If we document promises to
> extension authors, I don't know how much difference that will make:
> we'll probably end up needing to violate them at some point for one
> reason or another.

I think that’s fine if there is some sort of notification process. The policy I 
drafted upthread starts with making sure that such a break is mentioned in the 
release notes.

> But if we document what committers should do, then
> we might do better than we're now, because committers will be more
> likely to do it right, and extension authors can also read those
> instructions to understand what our practices are.

Yes, this, thank you!

D





Re: Windows: openssl & gssapi dislike each other

2024-06-12 Thread Imran Zaheer
Hi

I removed the macro from sslinfo.c as suggested by Andrew. Then I
was thinking maybe we can undo some other similar code.

I rearranged the headers to their previous position in
be-secure-openssl.c and in fe-secure-openssl.c. I was able to compile
with gssapi and openssl enabled. You can look into the original commits. [1,
2]
Is it ok if we undo the changes from these commits?

I am attaching two new patches.
The first has the macro guard removed from sslinfo.c.
The second additionally rearranges the headers in
be-secure-openssl.c and fe-secure-openssl.c back to their previous
position.

Thanks
Imran Zaheer
Bitnine

[1]: 
https://github.com/postgres/postgres/commit/1241fcbd7e649414f09f9858ba73e63975dcff64
[2]: 
https://github.com/postgres/postgres/commit/568620dfd6912351b4127435eca5309f823abde8

On Wed, Jun 12, 2024 at 12:34 AM Dave Page  wrote:
>
>
>
> On Tue, 11 Jun 2024 at 12:22, Andrew Dunstan  wrote:
>>
>>
>> On 2024-06-11 Tu 05:19, Dave Page wrote:
>>
>> Hi
>>
>> On Sun, 9 Jun 2024 at 08:29, Imran Zaheer  wrote:
>>>
>>> Hi
>>>
>>> I am submitting two new patches. We can undefine the macro at two locations
>>>
>>> 1). As be-secure-openssl.c [1] was the actual
>>> file where the conflict happened so I undefined the macro here before
>>> the ssl includes. I changed the comment a little to make it understandable.
>>> I am also attaching the error generated with ninja build.
>>>
>>> OR
>>>
>>> 2). Right after the gssapi includes in libpq-be.h
>>
>>
>> Thank you for working on this. I can confirm the undef version compiles and 
>> passes tests with 16.3.
>>
>>
>>
>> Thanks for testing.
>>
>> I think I prefer approach 2, which should also allow us to remove the #undef 
>> in sslinfo.c so we only need to do this in one place.
>
> OK, well that version compiles and passes tests as well :-)
>
> Thanks!
>
> --
> Dave Page
> pgAdmin: https://www.pgadmin.org
> PostgreSQL: https://www.postgresql.org
> EDB: https://www.enterprisedb.com
>


v02-0002-approach02-rearrange-the-headers.patch
Description: Binary data


v02-0001-approach02-remove-macro-guard-from-sslinfo.patch
Description: Binary data


Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Andres Freund
On 2024-06-11 10:55:38 -0400, David E. Wheeler wrote:
> ABI Policy
> ==
> 
> The PostgreSQL core team maintains two application binary interface (ABI) 
> guarantees: one for major releases and one for minor releases.

I.e. for major versions it's "there is none"?

> Major Releases
> --
> 
> Applications that use the PostgreSQL APIs must be compiled for each major 
> release supported by the application. The inclusion of `PG_MODULE_MAGIC` 
> ensures that code compiled for one major version will be rejected by other major 
> versions.
> 
> Furthermore, new releases may make API changes that require code changes. Use 
> the `PG_VERSION_NUM` constant to adjust code in a backwards compatible way:
> 
> ``` c
> #if PG_VERSION_NUM >= 160000
> #include "varatt.h"
> #endif
> ```
> 
> PostgreSQL avoids unnecessary API changes in major releases, but usually
> ships a few necessary API changes, including deprecation, renaming, and
> argument variation.


> In such cases the incompatible changes will be listed in the Release Notes.

I don't think we actually exhaustively list all of them.


> Minor Releases
> --
> 
> PostgreSQL makes every effort to avoid both API and ABI breaks in minor 
> releases. In general, an application compiled against any minor release will 
> work with any other minor release, past or future.

s/every/a reasonable/ or just s/every/an/


> When a change *is* required, PostgreSQL will choose the least invasive way
> possible, for example by squeezing a new field into padding space or
> appending it to the end of a struct. This sort of change should not impact
> dependent applications unless they use `sizeof(the struct)` or create their
> own instances of such structs --- patterns best avoided.

The padding case doesn't affect sizeof() fwiw.

I think there's too often no alternative to using sizeof(), potentially
indirectly (via makeNode() or such). So this sounds a bit too general.
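
To make the two cases concrete, here is an illustrative pair of changes on
made-up structs (not actual PostgreSQL types), assuming typical 64-bit
alignment rules:

#include <stdint.h>

typedef struct Widget
{
	int32_t		id;			/* 4 bytes, followed by 4 bytes of padding */
	double		weight;		/* 8-byte aligned */
} Widget;					/* sizeof(Widget) == 16 */

/*
 * Squeezing the new field into the existing padding: member offsets and
 * sizeof() are unchanged, so even code that allocates the struct itself
 * keeps working.
 */
typedef struct WidgetV2
{
	int32_t		id;
	int32_t		flags;		/* lives in the former padding */
	double		weight;
} WidgetV2;					/* still 16 bytes */

/*
 * Appending at the end: existing offsets stay valid, but sizeof() grows,
 * which is where code doing palloc(sizeof(Widget)) can get into trouble.
 */
typedef struct WidgetV3
{
	int32_t		id;
	double		weight;
	int64_t		counter;	/* appended member */
} WidgetV3;					/* 24 bytes */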

Greetings,

Andres Freund




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Andres Freund
Hi,

On 2024-06-12 14:58:04 +0200, Jelte Fennema-Nio wrote:
> On Wed, 12 Jun 2024 at 14:44, Peter Eisentraut  wrote:
> > I think since around 6 years ago we have been much more vigilant about
> > avoiding ABI breaks.  So if there aren't any more recent examples of
> > breakage, then maybe that was ultimately successful, and the upshot is,
> > continue to be vigilant at about the same level?
> 
> While not strictly an ABI break I guess, the backport of 32d5a4974c81
> broke building Citus against 13.10 and 14.7[1].

I think that kind of thing is not something we (PG devs) really can do
anything about. It's also
a) fairly easy thing to fix
b) fails during compilation
c) doesn't break ABI afaict

Greetings,

Andres Freund




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Andres Freund
Hi,

On 2024-06-11 08:04:57 -0400, Andrew Dunstan wrote:
> Some time ago I did some work on wrapping libpq using the perl FFI module.
> It worked pretty well, and would mean we could probably avoid many uses of
> IPC::Run, and would probably be substantially more efficient (no fork
> required). It wouldn't avoid all uses of IPC::Run, though.

FWIW, I'd *love* to see work on this continue. The reduction in test runtime
on windows is substantial and would shorten the hack->CI->fail->hack loop a
good bit shorter. And save money.


> But my point was mainly that while a new framework might have value, I don't
> think we need to run out and immediately rewrite several hundred TAP tests.

Oh, yea. That's not at all feasible to just do in one go.

Greetings,

Andres Freund




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Peter Geoghegan
On Mon, Jun 3, 2024 at 3:38 PM Tom Lane  wrote:
> > Thus I am not really on board with this statement as-is.
>
> Me either.  There are degrees of ABI compatibility, and we'll choose
> the least invasive way, but it's seldom the case that no conceivable
> extension will be broken.  For example, if we can't squeeze a new
> field into padding space, we'll typically put it at the end of the
> struct in existing branches.  That's okay unless some extension has
> a dependency on sizeof(the struct), for instance because it's
> allocating such structs itself.

Right. While there certainly are code bases (mostly C libraries
without a huge surface area) where taking the hardest possible line on
ABI breakage makes sense, and where ABI breakage can be detected by a
mechanical process, that isn't us. In fact I'd say that Postgres is
just about the furthest possible thing from that.

I'm a little surprised that we don't seem to have all that many
problems with ABI breakage, though. Although we theoretically have a
huge number of APIs that extension authors might choose to use, that
isn't really true in practical terms. The universe of theoretically
possible problems is vastly larger than the areas where we see
problems in practice. You have to be pragmatic about it.

-- 
Peter Geoghegan




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 15:49, FWS Neil  wrote:
> I believe that anyone coming out of school these days would have a relatively
> easy transition to any of Go, Rust, Kotlin, Swift, etc.  In other words, any 
> of
> the modern languages.

Agreed, which is why I said they'd be acceptable to me. But I think
one important advantage of Python is that it's clear that many people
in the community are willing to write tests in it. At PGConf.dev there
were a lot of people in the unconference session about this. Also many
people already wrote a Postgres testing framework for python, and are
using it (see list of projects that Alexander shared). I haven't seen
that level of willingness to write tests for any of those other
languages (yet).

> In addition, the language should scale well to
> multiprocessors, because parallel testing is becoming more important every 
> day.
> 
> Python is good for package depth and resource availability, but fails IMO in 
> the
> other categories.

You can easily pin packages or call a different function based on the
version of the package, so I'm not sure what the problem is with
package stability. Also, chances are we'll pull in very few external
packages and rely mostly on the stdlib (which is quite stable).

Regarding parallelised running of tests, I agree that's very
important. And indeed normally parallelism in python can be a pain
(although async/await makes I/O parallelism a lot better at least).
But running pytest tests in parallel is extremely easy by using
pytest-xdist[1], so I don't really think there's an issue for this
specific Python usecase.

> My experience with python where the program flow can change
> because of non-visible characters is a terrible way to write robust long term
> maintainable code.  Because of this most of the modern languages are going to 
> be
> closer in style to Postgres’s C code base than Python.

I'm assuming this is about spaces vs curly braces for blocks? Now that
we have auto formatters for every modern programming language I indeed
prefer curly braces myself too. But honestly that's pretty much a tabs
vs spaces discussion.

[1]: https://pypi.org/project/pytest-xdist/





Re: Remove dependence on integer wrapping

2024-06-12 Thread Tom Lane
Nathan Bossart  writes:
> Thanks.  This looks pretty good to me after a skim, so I'll see about
> committing/back-patching it in the near future.  IIUC there is likely more
> to come in this area, but I see no need to wait.

Uh ... what of this would we back-patch?  It seems like v18
material to me.

regards, tom lane




Re: SQL:2011 application time

2024-06-12 Thread Matthias van de Meent
On Wed, 5 Jun 2024 at 22:57, Paul Jungwirth  
wrote:
>
> On Thu, May 9, 2024 at 5:44 PM Matthias van de Meent 
>  wrote:
>  > Additionally, because I can't create my own non-constraint-backing
>  > unique GIST indexes, I can't pre-create my unique constraints
>  > CONCURRENTLY as one could do for the non-temporal case
>
> We talked about this a bit at pgconf.dev. I would like to implement it, since 
> I agree it is an
> important workflow to support. Here are some thoughts about what would need 
> to be done.
>
> First we could take a small step: allow non-temporal UNIQUE GiST indexes. 
> This is possible according
> to [1], but in the past we had no way of knowing which strategy number an 
> opclass was using for
> equality. With the stratnum support proc introduced by 6db4598fcb (reverted 
> for v17), we could
> change amcanunique to true for the GiST AM handler. If the index's opclasses 
> had that sproc and it
> gave non-zero for RTEqualStrategyNumber, we would have a reliable "definition 
> of uniqueness". UNIQUE
> GiST indexes would raise an error if they detected a duplicate record.

Cool.

> But that is just regular non-temporal indexes. To avoid a long table lock 
> you'd need a way to build
> the index that is not just unique, but also does exclusion based on &&.  We 
> could borrow syntax from
> SQL:2011 and allow `CREATE INDEX idx ON t (id, valid_at WITHOUT OVERLAPS)`. 
> But since CREATE INDEX
> is a lower-level concept than a constraint, it'd be better to do something 
> more general. You can
> already give opclasses for each indexed column. How about allowing operators 
> as well? For instance
> `CREATE UNIQUE INDEX idx ON t (id WITH =, valid_at WITH &&)`? Then the index 
> would know to enforce
> those rules.

I think this looks fine. I'd like it even better if we could default
to the equality operator that's used by the type's default btree
opclass in this syntax; that'd make CREATE UNIQUE INDEX much less
awkward for e.g. hash indexes.

> This is the same data we store today in pg_constraint.conexclops. So that 
> would get
> moved/copied to pg_index (probably moved).

I'd keep the pg_constraint.conexclops around: People are inevitably
going to want to keep the current exclusion constraints' handling of
duplicate empty ranges, which is different from expectations we see
for UNIQUE INDEX's handling.

> Then when you add the constraint, what is the syntax? Today when you say 
> PRIMARY KEY/UNIQUE USING
> INDEX, you don't give the column names. So how do we know it's WITHOUT 
> OVERLAPS? I guess if the
> underlying index has (foo WITH = [, bar WITH =], baz WITH &&) we just assume 
> the user wants WITHOUT
> OVERLAPS, and otherwise they want a regular PK/UQ constraint?

Presumably you would know this based on the pg_index.indisunique flag?

> In addition this workflow only works if you can CREATE INDEX CONCURRENTLY. 
> I'm not sure yet if we'll
> have problems there. I noticed that for REINDEX at least, there were plans in 
> 2012 to support
> exclusion-constraint indexes,[2] but when the patch was committed in 2019 
> they had been dropped,
> with plans to add support eventually.[3] Today they are still not supported. 
> Maybe whatever caused
> problems for REINDEX isn't an issue for just INDEX, but it would take more 
> research to find out.

I don't quite see where exclusion constraints get into the picture?
Isn't this about unique indexes, not exclusion constraints? I
understand exclusion constraints are backed by indexes, but that
doesn't have to make it a unique index, right? I mean, currently, you
can write an exclusion constraint that makes sure that all rows with a
certain prefix have the same suffix columns (given a btree-esque index
type with <> -operator support), which seems exactly opposite of what
unique indexes should do.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Andres Freund
Hi,

On 2024-06-11 07:28:23 -0700, Jacob Champion wrote:
> On Mon, Jun 10, 2024 at 1:04 PM Andres Freund  wrote:
> > Just for context for the rest the email: I think we desperately need to move
> > off perl for tests. The infrastructure around our testing is basically
> > unmaintained and just about nobody that started doing dev stuff in the last 
> > 10
> > years learned perl.
> 
> Okay. Personally, I'm going to try to stay out of discussions around
> subtracting Perl and focus on adding Python, for a bunch of different
> reasons:

I think I might have formulated my paragraph above badly - I didn't mean that
we should move away from perl tests tomorrow, but that we need a path forward
that allows folks to write tests without perl.


> - Tests aren't cheap, but in my experience, the maintenance-cost math
> for tests is a lot different than the math for implementations.

At the moment they often tend to be *more* expensive, due to spurious
failures. That's mostly not perl's fault, don't get me wrong, but us not
having better infrastructure for testing complicated behaviour and/or testing
things on a more narrow basis.


> > > Problem 1 (rerun failing tests): One architectural roadblock to this
> > > in our Test::More suite is that tests depend on setup that's done by
> > > previous tests. pytest allows you to declare each test's setup
> > > requirements via pytest fixtures, letting the test runner build up the
> > > world exactly as it needs to be for a single isolated test. These
> > > fixtures may be given a "scope" so that multiple tests may share the
> > > same setup for performance or other reasons.
> >
> > OTOH, that's quite likely to increase overall test times very
> > significantly. Yes, sometimes that can be avoided with careful use of 
> > various
> > features, but often that's hard, and IME is rarely done rigiorously.
> 
> Well, scopes are pretty front and center when you start building
> pytest fixtures, and the complicated longer setups will hopefully
> converge correctly early on and be reused everywhere else. I imagine
> no one wants to build cluster setup from scratch.

One (the?) prime source of state in our tap tests is the
database. Realistically we can't just tear that one down and reset it between
tests without causing the test times to explode. So we'll have to live with
some persistent state.


> On a slight tangent, is this not a problem today?

It is, but that doesn't mean making it even bigger is unproblematic :)




> > I think part of the problem is that the information about what precisely
> > failed is often much harder to collect when testing multiple servers
> > interacting than when doing localized unit tests.
> >
> > I think we ought to invest a bunch in improving that, I'd hope that a lot of
> > that work would be largely independent of the language the tests are written
> > in.
> 
> We do a lot more acceptance testing than internal testing, which came
> up as a major complaint from me and others during the unconference.
> One of the reasons people avoid writing internal tests in Perl is
> because it's very painful to find a rhythm with Test::More.

What definition of internal tests are you using here?

I think a lot of our tests are complicated, fragile and slow because we almost
exclusively do end-to-end tests, because with a few exceptions we don't have a
way to exercise code in a more granular way.


> > > When it comes to third-party packages, which I think we're
> > > probably going to want in moderation, we would still need to discuss
> > > supply chain safety. Python is not as mature here as, say, Go.
> >
> > What external dependencies are you imagining?
> 
> The OAuth pytest suite makes extensive use of
> - psycopg, to easily drive libpq;

That's probably not going to fly. It introduces painful circular dependencies
between building postgres (for libpq), building psycopg (requiring libpq) and
testing postgres (requiring psycopg).

You *can* solve such issues, but we've debated that in the past, and I doubt
we'll find agreement on the awkwardness it introduces.


> - construct, for on-the-wire packet representations and manipulation; and

That seems fairly minimal.


> - pyca/cryptography, for easy generation of certificates and manual
> crypto testing.

That's a bit more painful, but I guess maybe not unrealistic?


> I'd imagine each would need considerable discussion, if there is
> interest in doing the same things that I do with them.

One thing worth thinking about is that such dependencies have to work on a
relatively large number of platforms / architectures. A lot of projects
don't...

Greetings,

Andres Freund




Re: Proposal: Document ABI Compatibility

2024-06-12 Thread Peter Geoghegan
On Tue, Jun 11, 2024 at 10:55 AM David E. Wheeler  wrote:
> Right, it’s just that extension authors could use some notification that such 
> a change is coming so they can update their code, if necessary.

In general our strategy around ABI breaks is to avoid them whenever
possible. We also make the most conservative assumptions about what a
true ABI break is -- strictly speaking we can never be fully sure
about the impact of a theoretical/mechanical ABI break (we can only
make well educated guesses). My sense is that we're approaching having
the fewest possible real ABI breaks already -- we're already doing the
best we can. That doesn't seem like a useful area to focus on.

As an example, my bugfix commit 714780dc was apparently discussed by
Yurii Rashkovskii during his pgConf.dev talk. I was (and still am)
approaching 100% certainty that that wasn't a true ABI break.
Documenting this somewhere seems rather unappealing. Strictly speaking
I'm not 100% certain that this is a non-issue, but who benefits from
hearing my hand-wavy characterisation of why I believe it's a
non-issue? You might as well just look at an ABI change report
yourself.

Approximately 0% of all extensions actually use the struct in
question, and so obviously aren't affected. If anybody is using the
struct then it's merely very very likely that they aren't affected.
But why trust me here? After all, I can't even imagine why anybody
would want to use the struct in question. My hand-wavy speculation
about what it would look like if I was wrong about that is inherently
suspect, and probably just useless. Is it not?

That having been said, it would be useful if there was a community web
resource for this -- something akin to coverage.postgresql.org, but
with differential ABI breakage reports. You can see an example report
here:

https://postgr.es/m/CAH2-Wzm-W6hSn71sUkz0Rem=qdeu7tnfmc7_jg2djrlfef_...@mail.gmail.com

Theoretically anybody can do this themselves. In practice they don't.
So something as simple as providing automated reports about ABI
changes might well move the needle here.

--
Peter Geoghegan




Re: Remove dependence on integer wrapping

2024-06-12 Thread Nathan Bossart
On Wed, Jun 12, 2024 at 11:45:20AM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> Thanks.  This looks pretty good to me after a skim, so I'll see about
>> committing/back-patching it in the near future.  IIUC there is likely more
>> to come in this area, but I see no need to wait.
> 
> Uh ... what of this would we back-patch?  It seems like v18
> material to me.

D'oh.  I was under the impression that the numutils.c changes were arguably
bug fixes.  Even in that case, we should probably split out the other stuff
for v18.  But you're probably right that we could just wait for all of it.

-- 
nathan




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jelte Fennema-Nio
On Wed, 12 Jun 2024 at 17:50, Andres Freund  wrote:
> > The OAuth pytest suite makes extensive use of
> > - psycopg, to easily drive libpq;
>
> That's probably not going to fly. It introduces painful circular dependencies
> between building postgres (for libpq), building psycopg (requiring libpq) and
> testing postgres (requiring psycopg).
>
> You *can* solve such issues, but we've debated that in the past, and I doubt
> we'll find agreement on the awkwardness it introduces.

psycopg has a few implementations: binary, C, and pure Python. The pure
Python one can be linked to a specific libpq.so file at runtime[1]. As
long as we don't break the libpq API (which we shouldn't), we can just
point it to the libpq compiled by meson/make. We wouldn't be able to
use the newest libpq features that way (because psycopg wouldn't know
about them), but that seems totally fine for most usages (i.e. sending
a query over a connection). If we really want to use those from the
python tests we could write our own tiny CFFI layer specifically for
those.

> One thing worth thinking about is that such dependencies have to work on a
> relatively large number of platforms / architectures. A lot of projects
> don't...

Do they really? A bunch of the Perl tests we just skip on windows or
uncommon platforms. I think we could do the same for these.




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Alexander Korotkov
On Wed, Jun 12, 2024 at 7:08 PM Jelte Fennema-Nio  wrote:
> On Wed, 12 Jun 2024 at 17:50, Andres Freund  wrote:
> > > The OAuth pytest suite makes extensive use of
> > > - psycopg, to easily drive libpq;
> >
> > That's probably not going to fly. It introduces painful circular 
> > dependencies
> > between building postgres (for libpq), building psycopg (requiring libpq) 
> > and
> > testing postgres (requiring psycopg).
> >
> > You *can* solve such issues, but we've debated that in the past, and I doubt
> > we'll find agreement on the awkwardness it introduces.
>
> psycopg has a few implementations binary, c, & pure python. The pure
> python one can be linked to a specific libpq.so file at runtime[1]. As
> long as we don't break the libpq API (which we shouldn't), we can just
> point it to the libpq compiled by meson/make. We wouldn't be able to
> use the newest libpq features that way (because psycopg wouldn't know
> about them), but that seems totally fine for most usages (i.e. sending
> a query over a connection). If we really want to use those from the
> python tests we could write our own tiny CFFI layer specifically for
> those.

I guess you mean pg8000. Note that pg8000 and psycopg2 have some
differences in interpretation of datatypes (AFAIR arrays, jsonb...).
So, it would be easier to choose one particular driver.  However, with
a bit of effort it's possible to make all the code driver-agnostic.

--
Regards,
Alexander Korotkov
Supabase




Re: Use WALReadFromBuffers in more places

2024-06-12 Thread Bharath Rupireddy
Hi,

On Sat, Jun 8, 2024 at 5:24 PM Nitin Jadhav 
wrote:
>
> I spent some time examining the patch. Here are my observations from the
review.

Thanks.

> - I believe there’s no need for an extra variable ‘nbytes’ in this
> context. We can repurpose the ‘count’ variable for the same function.
> If necessary, we might think about renaming ‘count’ to ‘nbytes’.

The 'count' variable can't be altered once determined, as the page_read
callbacks need to return the total number of bytes read. However, I ended
up removing 'nbytes', as in the attached v2 patch.

> - The operations performed by logical_read_xlog_page() and
> read_local_xlog_page_guts() are identical. It might be beneficial to
> create a shared function to minimize code repetition.

IMO, creating another function just to wrap two other functions doesn't
seem worthwhile.

I attached v2 patches for further review. No changes in 0002 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 6902b207cf7497396493aef369a9e275900a86e7 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 12 Jun 2024 14:46:16 +
Subject: [PATCH v2 1/2] Use WALReadFromBuffers in more places

Commit 91f2cae introduced WALReadFromBuffers, and used it only for
physical replication walsenders. There are a couple of other callers
that use the read_local_xlog_page page_read callback, and logical
replication walsenders can also benefit from reading WAL from WAL
buffers using the new function. This commit uses the new function
for these callers.

Author: Bharath Rupireddy
Reviewed-by: Jingtang Zhang, Nitin Jadhav
Discussion: https://www.postgresql.org/message-id/CALj2ACVfF2Uj9NoFy-5m98HNtjHpuD17EDE9twVeJng-jTAe7A%40mail.gmail.com
---
 src/backend/access/transam/xlogutils.c | 10 +-
 src/backend/replication/walsender.c| 13 ++---
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 5295b85fe0..24a7ef0479 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -892,6 +892,7 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
 	int			count;
 	WALReadError errinfo;
 	TimeLineID	currTLI;
+	Size		rbytes;
 
 	loc = targetPagePtr + reqLen;
 
@@ -1004,7 +1005,14 @@ read_local_xlog_page_guts(XLogReaderState *state, XLogRecPtr targetPagePtr,
 		count = read_upto - targetPagePtr;
 	}
 
-	if (!WALRead(state, cur_page, targetPagePtr, count, tli,
+	/* attempt to read WAL from WAL buffers first */
+	rbytes = WALReadFromBuffers(cur_page, targetPagePtr, count, currTLI);
+	cur_page += rbytes;
+	targetPagePtr += rbytes;
+
+	/* now read the remaining WAL from WAL file */
+	if ((count - rbytes) > 0 &&
+		!WALRead(state, cur_page, targetPagePtr, (count - rbytes), tli,
  &errinfo))
 		WALReadRaiseError(&errinfo);
 
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c623b07cf0..bd0decef3d 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1056,6 +1056,7 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req
 	WALReadError errinfo;
 	XLogSegNo	segno;
 	TimeLineID	currTLI;
+	Size		rbytes;
 
 	/*
 	 * Make sure we have enough WAL available before retrieving the current
@@ -1092,11 +1093,17 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req
 	else
 		count = flushptr - targetPagePtr;	/* part of the page available */
 
-	/* now actually read the data, we know it's there */
-	if (!WALRead(state,
+	/* attempt to read WAL from WAL buffers first */
+	rbytes = WALReadFromBuffers(cur_page, targetPagePtr, count, currTLI);
+	cur_page += rbytes;
+	targetPagePtr += rbytes;
+
+	/* now read the remaining WAL from WAL file */
+	if ((count - rbytes) > 0 &&
+		!WALRead(state,
  cur_page,
  targetPagePtr,
- count,
+ (count - rbytes),
  currTLI,		/* Pass the current TLI because only
  * WalSndSegmentOpen controls whether new TLI
  * is needed. */
-- 
2.34.1

From 6aca7856208907d2fbde58d507e4a48319a614ad Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 12 Jun 2024 14:55:31 +
Subject: [PATCH v2 2/2] Add test module to demonstrate reading from WAL
 buffers patterns.

This commit adds a test module to demonstrate a few patterns for
reading from WAL buffers using WALReadFromBuffers, added by commit
91f2cae7a4e.

1. This module contains a test function to read the WAL that's
fully copied to WAL buffers. Whether or not the WAL is fully
copied to WAL buffers is ensured by WaitXLogInsertionsToFinish
before WALReadFromBuffers.

2. This module contains an implementation of xlogreader page_read
callback to read unflushed/not-yet-flushed WAL directly from WAL
buffers.

Author: Bharath Rupireddy
Discussion: https://www.postgresql.org/message-id/CAFiTN-sE7CJn-ZFj%2B-0Wv6TNytv_fp4n%2BeCszspxJ3mt77

RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-12 Thread Amonson, Paul D
> The project is currently in feature-freeze in preparation for the next major
> release so new development and ideas are not the top priority right now.
> Additionally there is a large developer meeting shortly which many are busy
> preparing for.  Exercise some patience, and I'm sure there will be follow-ups
> to this once development of postgres v18 picks up.

Thanks, understood.

I had our OSS internal team, who are experts in OSS licensing, review possible 
conflicts between the PostgreSQL license and the BSD 3-Clause-like license for 
the CRC32C AVX-512 code, and they found no issues. Therefore, including the new 
license in the PostgreSQL codebase should be acceptable.

I am attaching the first official patches. The second patch is a simple test 
function in PostgreSQL SQL, which I used for testing and benchmarking. It will 
not be merged.

Code Structure Question: While working on this code, I noticed overlaps with 
runtime CPU checks done in the previous POPCNT merged code. I was considering 
that these checks should perhaps be formalized and consolidated into a single 
source/header file pair. If this is desirable, where should I place these 
files? Should it be in "src/port" where they are used, or in "src/common" where 
they are available to all (not just the "src/port" tree)?

Thanks,
Paul



0001-v2-Feat-Add-AVX512-crc32c-algorithm-to-postgres.patch
Description: 0001-v2-Feat-Add-AVX512-crc32c-algorithm-to-postgres.patch


0002-Test-Add-a-Postgres-SQL-function-for-crc32c-testing.patch
Description: 0002-Test-Add-a-Postgres-SQL-function-for-crc32c-testing.patch


Contributing test cases to improve coverage

2024-06-12 Thread J F
Hello All,

I am working on a project that aims to produce test cases that
improve mutation coverage of a dbms's test suite.

The rough workflow of the project goes as follows:
(a) apply a mutation at the source code level
(b) compile and check whether the mutated installation passes the existing test suite
(c) if not, fuzz the mutated installation with an SQL fuzzer
(d) if the fuzzer successfully produces a test case that crashes or triggers bugs
in the mutated installation, use a reduction tool to reduce the test case
(e) add the reduced test case to the existing test suite

For postgres, I am looking at adding test cases to test suite in
test/src/regress/. I have gone through (a)-(e), and managed to produce
some test cases. As an example, I claim the test case
```
CREATE RECURSIVE VIEW a(b) AS SELECT'' ;
SELECT FROM a WHERE NULL;
```
could kill the following mutation at optimizer/plan/setrefs.c, 502:5--502:33
Original binary operator expression:
```
rte->rtekind == RTE_SUBQUERY
```

Replacement expression:
```
(rte->rtekind) >= RTE_SUBQUERY
```

I have a few questions about adding these test cases:

(a) The regression test suite is run by a parallel scheduler, with some
test cases dependent on previous test cases. If I just add my test case as
part of the parallel scheduler’s tests, it might not work, since previous
test cases in the scheduler might already create the same table, for
instance.

(b) How do I get my test cases reviewed and ultimately included in a future
release of PostgreSQL?

Thank you for your time.

Regards,
Jon


Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Daniel Gustafsson
> On 12 Jun 2024, at 18:08, Jelte Fennema-Nio  wrote:
> On Wed, 12 Jun 2024 at 17:50, Andres Freund  wrote:

>> One thing worth thinking about is that such dependencies have to work on a
>> relatively large number of platforms / architectures. A lot of projects
>> don't...
> 
> Do they really? A bunch of the Perl tests we just skip on windows or
> uncommon platforms. I think we could do the same for these.

For a project intended to improve on the status quo, it seems like too low a bar 
to just port over the deficiencies of the thing we're trying to improve on.

./daniel



Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Daniele Varrazzo
On Wed, 12 Jun 2024 at 18:08, Jelte Fennema-Nio  wrote:
>
> On Wed, 12 Jun 2024 at 17:50, Andres Freund  wrote:
> > > The OAuth pytest suite makes extensive use of
> > > - psycopg, to easily drive libpq;
> >
> > That's probably not going to fly. It introduces painful circular 
> > dependencies
> > between building postgres (for libpq), building psycopg (requiring libpq) 
> > and
> > testing postgres (requiring psycopg).
>
> psycopg has a few implementations binary, c, & pure python. The pure
> python one can be linked to a specific libpq.so file at runtime[1]. As

This is true, but [citation needed] :D I assume the pointer wanted to
be https://www.psycopg.org/psycopg3/docs/api/pq.html#pq-impl

I see the following use cases and how I would use psycopg to implement them:

- by installing 'psycopg[binary]' you would get a binary bundle
shipping with a stable version of the libpq, so you can test the
database server regardless of libpq instabilities in the same
codebase.
- using the pure Python psycopg (enforced by exporting
'PSYCOPG_IMPL=python') you would use the libpq found on the
LD_LIBRARY_PATH, which can be useful to test regressions to the libpq
itself.
- if you want to test new libpq functions you can reach them in Python
by dynamic lookup. See [2] for an example of a function only available
from libpq v17.

[2]: 
https://github.com/psycopg/psycopg/blob/2bf7783d66ab239a2fa330a842fd461c4bb17c48/psycopg/psycopg/pq/_pq_ctypes.py#L564-L569

-- Daniele




Columnar format export in Postgres

2024-06-12 Thread Sushrut Shivaswamy
Hey Postgres team,

I have been working on adding support for columnar format export to
Postgres to speed up analytics queries.
I've created an extension that achieves this functionality here
.

I'm looking to improve the performance of this extension to enable drop-in
analytics support for Postgres. Some immediate improvements I have in mind
are:
 - Reduce memory consumption when exporting table data to columnar format
 - Create a native planner / execution hook that can read columnar data
with vectorised operations.

It would be very helpful if you could take a look and suggest improvements
to the extension.
Hopefully, this extension can be shipped by default with postgres at some
point in the future.

Thanks,
Sushrut


Re: BF mamba failure

2024-06-12 Thread Alexander Lakhin

Hello hackers,

20.03.2023 09:10, Peter Smith wrote:


Using this I was also able to reproduce the problem. But test failures
were rare. The make check-world seemed OK, and indeed the
test_decoding tests would also appear to PASS around 14 out of 15
times.


I've stumbled upon this assertion failure again during testing following 
cd312adc5.

This time I've simplified the reproducer to the attached modification.
With this patch applied, `make -s check -C contrib/test_decoding` fails on 
master as below:
ok 1 - pgstat_rc_1    14 ms
not ok 2 - pgstat_rc_2  1351 ms


contrib/test_decoding/output_iso/log/postmaster.log contains:
TRAP: failed Assert("pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0"), File: "pgstat_shmem.c", Line: 562, 
PID: 1130928


With extra logging added, I see the following events happening:
1) pgstat_rc_1.setup calls pgstat_create_replslot(), gets
  ReplicationSlotIndex(slot) = 0 and calls
  pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid, 0, 0).

2) pgstat_rc_1.s0_get_changes executes pg_logical_slot_get_changes(...)
  and then calls pgstat_gc_entry_refs on shmem_exit() ->
  pgstat_shutdown_hook() ...;
  with the sleep added inside pgstat_release_entry_ref, this backend waits
  after decreasing entry_ref->shared_entry->refcount to 0.

3) pgstat_rc_1.stop removes the replication slot.

4) pgstat_rc_2.setup calls pgstat_create_replslot(), gets
  ReplicationSlotIndex(slot) = 0 and calls
  pgstat_get_entry_ref_locked(PGSTAT_KIND_REPLSLOT, InvalidOid, 0, 0),
  which leads to the call pgstat_reinit_entry(), which increases refcount
  for the same shared_entry as in (1) and (2), and then to the call
  pgstat_acquire_entry_ref(), which increases refcount once more.

5) the backend 2 reaches
Assert(pg_atomic_read_u32(&entry_ref->shared_entry->refcount) == 0),
  which fails due to refcount = 2.

Best regards,
Alexander

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index c7ce603706..660eea87eb 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -6,10 +6,12 @@ PGFILEDESC = "test_decoding - example of a logical decoding output plugin"
 REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	decoding_into_rel binary prepared replorigin time messages \
 	spill slot truncate stream stats twophase twophase_stream
-ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
+XISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
 	twophase_snapshot slot_creation_error catalog_change_snapshot
 
+ISOLATION = pgstat_rc_1 pgstat_rc_2
+
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 
diff --git a/contrib/test_decoding/expected/pgstat_rc_1.out b/contrib/test_decoding/expected/pgstat_rc_1.out
new file mode 100644
index 00..850adf9930
--- /dev/null
+++ b/contrib/test_decoding/expected/pgstat_rc_1.out
@@ -0,0 +1,18 @@
+Parsed test spec with 1 sessions
+
+starting permutation: s0_get_changes
+?column?
+
+init
+(1 row)
+
+step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
+data
+
+(0 rows)
+
+?column?
+
+stop
+(1 row)
+
diff --git a/contrib/test_decoding/expected/pgstat_rc_2.out b/contrib/test_decoding/expected/pgstat_rc_2.out
new file mode 100644
index 00..f8d4ce83a6
--- /dev/null
+++ b/contrib/test_decoding/expected/pgstat_rc_2.out
@@ -0,0 +1,19 @@
+Parsed test spec with 1 sessions
+
+starting permutation: s0_sleep
+?column?
+
+init
+(1 row)
+
+step s0_sleep: SELECT pg_sleep(1);
+pg_sleep
+
+
+(1 row)
+
+?column?
+
+stop
+(1 row)
+
diff --git a/contrib/test_decoding/specs/pgstat_rc_1.spec b/contrib/test_decoding/specs/pgstat_rc_1.spec
new file mode 100644
index 00..7b08704a55
--- /dev/null
+++ b/contrib/test_decoding/specs/pgstat_rc_1.spec
@@ -0,0 +1,14 @@
+setup
+{
+SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); -- must be first write in xact
+}
+
+teardown
+{
+SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot');
+}
+
+session "s0"
+step "s0_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); }
+
+permutation "s0_get_changes"
diff --git a/contrib/test_decoding/specs/pgstat_rc_2.spec b/contrib/test_decoding/specs/pgstat_rc_2.spec
new file mode 100644
index 00..4e454ae8e8
--- /dev/null
+++ b/contrib/test_decoding/specs/pgstat_rc_2.spec
@@ -0,0 +1,14 @@
+setup
+{
+SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); -- must be first write in xact
+}
+
+teardown
+{
+	SELECT 

Re: Remove dependence on integer wrapping

2024-06-12 Thread Peter Geoghegan
On Mon, Jun 10, 2024 at 2:30 PM Andres Freund  wrote:
> On 2024-06-09 21:57:54 -0400, Tom Lane wrote:
> > BTW, while I approve of trying to get rid of our need for -fwrapv,
> > I'm quite scared of actually doing it.

IMV it's perfectly fine to defensively assume that we need -fwrapv,
even given a total lack of evidence that removing it would cause harm.
Doing it to "be more in line with the standard" is a terrible argument.

> I think that's a quite fair concern. One potentially relevant datapoint is
> that we actually don't have -fwrapv equivalent on all platforms, and I don't
> recall a lot of complaints from windows users.

That might just be because MSVC inherently doesn't do optimizations
that rely on signed wrap being undefined behavior. Last I checked MSVC
just doesn't rely on the standard's strict aliasing rules, even as an
opt-in thing.

> Of course it's quite possible
> that they'd never notice...

Removing -fwrapv is something that I see as a potential optimization.
It should be justified in about the same way as any other
optimization.

I suspect that it doesn't actually have any clear benefits for us. But
if I'm wrong about that then the benefit might still be achievable
through other means (something far short of just removing -fwrapv
globally).

> I think this is a good argument for enabling -ftrapv in development
> builds. That gives us at least a *chance* of seeing these issues.

+1. I'm definitely prepared to say that code that actively relies on
-fwrapv is broken code.
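
For illustration, here is a minimal sketch (hypothetical code, not taken
from the tree) of the kind of overflow check that only works under -fwrapv
semantics, next to a UB-free formulation:

```
#include <limits.h>
#include <stdbool.h>

/*
 * Wrapping-style check: with -fwrapv the addition wraps and the
 * comparison catches it, but without -fwrapv signed overflow is
 * undefined behavior and the compiler may delete the whole test.
 */
static bool
add_overflows_wrapping_style(int a, int b)
{
	return (b > 0 && a + b < a) || (b < 0 && a + b > a);
}

/* UB-free alternative: compare against the limits before adding. */
static bool
add_overflows_safe(int a, int b)
{
	return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```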


--
Peter Geoghegan




Re: Revive num_dead_tuples column of pg_stat_progress_vacuum

2024-06-12 Thread Peter Geoghegan
On Thu, Jun 6, 2024 at 9:23 PM Masahiko Sawada  wrote:
> num_dead_item_ids seems good to me. I've updated the patch that
> incorporated the comment from Álvaro[1].

Great, thank you.

-- 
Peter Geoghegan




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Alexander Korotkov
On Wed, Jun 12, 2024 at 7:34 PM Alexander Korotkov  wrote:
> On Wed, Jun 12, 2024 at 7:08 PM Jelte Fennema-Nio  wrote:
> > On Wed, 12 Jun 2024 at 17:50, Andres Freund  wrote:
> > > > The OAuth pytest suite makes extensive use of
> > > > - psycopg, to easily drive libpq;
> > >
> > > That's probably not going to fly. It introduces painful circular 
> > > dependencies
> > > between building postgres (for libpq), building psycopg (requiring libpq) 
> > > and
> > > testing postgres (requiring psycopg).
> > >
> > > You *can* solve such issues, but we've debated that in the past, and I 
> > > doubt
> > > we'll find agreement on the awkwardness it introduces.
> >
> > psycopg has a few implementations binary, c, & pure python. The pure
> > python one can be linked to a specific libpq.so file at runtime[1]. As
> > long as we don't break the libpq API (which we shouldn't), we can just
> > point it to the libpq compiled by meson/make. We wouldn't be able to
> > use the newest libpq features that way (because psycopg wouldn't know
> > about them), but that seems totally fine for most usages (i.e. sending
> > a query over a connection). If we really want to use those from the
> > python tests we could write our own tiny CFFI layer specifically for
> > those.
>
> I guess you mean pg8000. Note that pg8000 and psycopg2 have some
> differences in interpretation of datatypes (AFAIR arrays, jsonb...).
> So, it would be easier to chose one particular driver.  However, with
> a bit efforts it's possible to make all the code driver agnostic.

Oops, this is probably outdated due to the presence of psycopg 3, as pointed
out by Daniele Varrazzo [1].

Links.
1. 
https://www.postgresql.org/message-id/CA%2Bmi_8Zj0gpzPKUEcEx2mPOAsm0zPvznhbcnQDA_eeHVnVqg9Q%40mail.gmail.com

--
Regards,
Alexander Korotkov
Supabase




Re: Columnar format export in Postgres

2024-06-12 Thread Ranier Vilela
On Wed, Jun 12, 2024 at 13:56, Sushrut Shivaswamy <
sushrut.shivasw...@gmail.com> wrote:

> Hey Postgres team,
>
> I have been working on adding support for columnar format export to
> Postgres to speed up analytics queries.
> I've created an extension that achieves this functionality here
> .
>
> I"m looking to improve the performance of this extension to enable drop-in
> analytics support for Postgres. Some immediate improvements I have in mind
> are:
>  - Reduce memory consumption when exporting table data to columnar format
>  - Create a native planner / execution hook that can read columnar data
> with vectorised operations.
>
> It would be very helpful if you could take a look and suggest improvements
> to the extension.
> Hopefully, this extension can be shipped by default with postgres at some
> point in the future.
>
If you want this to have any hope of being included, the license must be BSD.
GPL is incompatible.

best regards,
Ranier Vilela


Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Daniel Gustafsson
> On 12 Jun 2024, at 17:50, Andres Freund  wrote:
> On 2024-06-11 07:28:23 -0700, Jacob Champion wrote:

>> The OAuth pytest suite makes extensive use of
>> - psycopg, to easily drive libpq;
> 
> That's probably not going to fly. It introduces painful circular dependencies
> between building postgres (for libpq), building psycopg (requiring libpq) and
> testing postgres (requiring psycopg).

I might be missing something obvious, but if we use a third-party libpq driver
in the testsuite doesn't that imply that a patch adding net new functionality
to libpq also needs to add it to the driver in order to write the tests?  I'm
thinking about the SCRAM situation a few years back when drivers weren't up to
date.

--
Daniel Gustafsson





Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Daniele Varrazzo
On Wed, 12 Jun 2024 at 19:30, Daniel Gustafsson  wrote:

> I might be missing something obvious, but if we use a third-party libpq driver
> in the testsuite doesn't that imply that a patch adding net new functionality
> to libpq also need to add it to the driver in order to write the tests?  I'm
> thinking about the SCRAM situation a few years back when drivers weren't up to
> date.

As Jelte pointed out, new libpq functions can be tested via CFFI. I
posted a practical example in a link upthread (pure Python Psycopg is
entirely implemented on FFI).

-- Daniele




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Tom Lane
Robert Haas  writes:
> I agree this is not great. I guess I didn't think about it very hard
> because, after all, we were just exposing an API that we'd already
> been using internally. But I think it's reasonable to adjust the API
> to allow for better resolution, as you propose. A second is a very
> long amount of time, and it's entirely reasonable for someone to want
> better granularity.

Here's a v2 responding to some of the comments.

* People pushed back against not using "int64", but the difficulty
with that is that we'd have to #include c.h or at least pg_config.h
in libpq-fe.h, and that would be a totally disastrous invasion of
application namespace.  However, I'd forgotten that libpq-fe.h
does include postgres_ext.h, and there's just enough configure
infrastructure behind that to allow defining pg_int64, which indeed
libpq-fe.h is already relying on.  So we can use that.

* I decided to invent a typedef 

typedef pg_int64 PGusec_time_t;

instead of writing "pg_int64" explicitly everywhere.  This is perhaps
not as useful as it was when I was thinking the definition would be
"long long int", but it still seems to add some readability.  In my
eyes anyway ... anyone think differently?

* I also undid changes like s/end_time/end_time_us/.  I'd done that
mostly to ensure I looked at/fixed every reference to those variables,
but on reflection I don't think it's doing anything for readability.

* I took Ranier's suggestion to make fast paths for end_time == 0.
I'm not sure this will make any visible performance difference, but
it's simple and shouldn't hurt.  We do have some code paths that use
that behavior.

* Ranier also suggested using clock_gettime instead of gettimeofday,
but I see no reason to do that.  libpq already relies on gettimeofday,
but not on clock_gettime, and anyway post-beta1 isn't a great time to
start experimenting with portability-relevant changes.
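
For concreteness, a caller of the revised API might look roughly like
this (just a sketch against the attached v2 patch, using the
PGusec_time_t and PQgetCurrentTimeUSec names it introduces):

```
#include "libpq-fe.h"

/*
 * Sketch only: wait up to half a second for the connection's socket to
 * become readable, using the proposed microsecond-granularity timeout.
 * Returns >0 if readable, 0 on timeout, -1 on error (check errno).
 */
static int
wait_readable_500ms(PGconn *conn)
{
	PGusec_time_t end_time = PQgetCurrentTimeUSec() + 500 * 1000;

	return PQsocketPoll(PQsocket(conn), 1 /* forRead */ , 0 /* forWrite */ ,
						end_time);
}
```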

regards, tom lane

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 80179f9978..990c43ec36 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -268,30 +268,52 @@ PGconn *PQsetdb(char *pghost,
   
nonblocking connection
Poll a connection's underlying socket descriptor retrieved with .
+   The primary use of this function is iterating through the connection
+   sequence described in the documentation of
+   .
 
-int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+typedef pg_int64 PGusec_time_t;
+
+int PQsocketPoll(int sock, int forRead, int forWrite,
+ PGusec_time_t end_time);
 
   
 
   
-   This function sets up polling of a file descriptor. The underlying function is either
+   This function performs polling of a file descriptor, optionally with
+   a timeout. The underlying function is either
poll(2) or select(2), depending on platform
-   support. The primary use of this function is iterating through the connection sequence
-   described in the documentation of . If
-   forRead is specified, the function waits for the socket to be ready
-   for reading. If forWrite is specified, the function waits for the
-   socket to be ready for write. See POLLIN and POLLOUT
+   support.
+   If forRead is nonzero, the
+   function will terminate when the socket is ready for
+   reading. If forWrite is nonzero,
+   the function will terminate when the
+   socket is ready for writing.
+   See POLLIN and POLLOUT
from poll(2), or readfds and
-   writefds from select(2) for more information. If
-   end_time is not -1, it specifies the time at which
-   this function should stop waiting for the condition to be met.
+   writefds from select(2) for more information.
+  
+
+  
+   The timeout is specified by end_time, which
+   is the time to stop waiting expressed as a number of microseconds since
+   the Unix epoch (that is, time_t times 1 million).
+   Timeout is infinite if end_time
+   is -1.  Timeout is immediate (no blocking) if
+   end_time is 0 (or indeed, any time before now).
+   Timeout values can be calculated conveniently by adding the desired
+   number of microseconds to the result of
+   .
+   Note that the underlying system calls may have less than microsecond
+   precision, so that the actual delay may be imprecise.
   
 
   
The function returns a value greater than 0 if the specified condition
is met, 0 if a timeout occurred, or -1 if an error
occurred. The error can be retrieved by checking the errno(3) value. In
-   the event forRead and forWrite are not set, the
+   the event both forRead
+   and forWrite are zero, the
function immediately returns a timeout condition.
   
  
@@ -8039,6 +8061,25 @@ int PQlibVersion(void);
 

 
+   
+PQgetCurrentTimeUSecPQgetCurrentTimeUSec
+
+ 

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-06-12 Thread Robert Haas
On Wed, Jun 5, 2024 at 10:06 AM Jelte Fennema-Nio  wrote:
> Patch 1 & 2: Minor code changes with zero effect until we actually
> bump the protocol version or add protocol parameters. I hope these can
> be merged rather soon to reduce the number of patches in the patchset.

0001 looks like a bug fix that can (and probably should) be committed
and back-patched. What would reduce work for me is if the commit
message explained why these changes are necessary and to which stable
branches they should be applied and, if that's not all of them, the
reason why back-patching should stop at some particular release. The
change to pqTraceOutput_NegotiateProtocolVersion is easy to
understand: the current code doesn't dump the message format as
documented. It probably doesn't help that the documentation is missing
a word -- it should say "Then, for each protocol option...". It's less
obvious why the change to fe-connect.c is needed. Since most cases
seem to be handled in a few common places, it would be good to mention
in the commit message (or maybe a comment) why this one needs separate
handling.

I agree with 0002 except for the change from PG_PROTOCOL_MINOR(proto)
> PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST) to proto > PG_PROTOCOL_LATEST.
I prefer that test the way it is; I think the intent is clearer with
the existing code.

> Patch 3: Similar to 1 & 2 in that it has no actual effect yet. But
> after bumping the version this would be a user visible API change, so
> I expect it requires a bit more discussion.

I don't know if this is the right idea or not. An alternative would be
to leave this alone and add PQprotocolMinorVersion().

> Patch 4: Adds a new connection option, but initially all parameters
> that it takes have the same effect.

Generally seems OK, but:
- The commit message needs spelling and grammar checking.
- dispsize 3 isn't long enough for 3.10, or more immediately, "latest".
- This is storing the major protocol version in a variable called
"minor" and the minor protocol version in a variable called "major".
- I think PGMAXPROTOCOLVERSION needs to be added to the docs.

> Patch 5: Starts using the new connection option from Patch 4

Not sure yet whether I like this approach.

> Patch 6: Libpq changes to start handling NegotiateProtocolVersion in a
> more complex way. (nothing changes in practice yet)

+ * NegotiateProtocolVersion message. So we only want to send a

only->don't

+ * protocol version by default. Since either of those would cause a

"default. Since" => "default, since"

+ char*conn_string_value = *(char **) ((char *) conn +
param->conn_connection_string_value_offset);
+ char*server_value = *(char **) ((char *) conn +
param->conn_connection_string_value_offset);
+ char*supported_value = *(char **) ((char *) conn +
param->conn_connection_string_value_offset);

I have some difficulty understanding how these calculations would
produce different answers.

+ libpq_append_conn_error(conn, "invalid NegotiateProtocolVersion
message, server version is newer than client version");
+ libpq_append_conn_error(conn, "invalid NegotiateProtocolVersion
message, negative protocol parameter count");
+ libpq_append_conn_error(conn, "unexpected NegotiateProtocolVersion
message, server supports requested features");

These messages don't seem good. First, I don't think that telling the
user that there's a problem with a specific wire protocol message is
very user-friendly. Second, the use of a comma to glue two
semi-related thoughts together is not a great practice in English in
general and is certainly something we should try to avoid in our error
messages. Third, the first and last of these aren't very clear about
what the problem actually is. I can only understand it from reading
the code.

Maybe these messages could be rephrased as "unable to negotiate
protocol version: blah". Like "unable to negotiate protocol version:
server requests downgrade to higher-numbered version" or "unable to
negotiate protocol version: server negotiates but asks for no
changes".

I don't think I completely understand what's going on in this patch
yet. I'm not sure that it can be committed on its own, and I think it
might need more polishing, including on comments and the commit
message.

> Patch 7: Bump the protocol version to 3.2 (see commit message for why
> not bumping to 3.1)

Good commit message. The point is arguable, so putting forth your best
argument is important.

> Patch 8: The main change: Infrastructure and protocol support to be
> able to add protocol parameters
> Patch 9: Adds a report_parameters protocol extension as a POC for the
> changes in the previous patch.

My general impression on first looking at these patches is that a lot
of the ideas make sense but that they don't seem very close to being
committable.

It's not very clear how these new messages integrate into the overall
protocol flow. The documentation makes the negative statement that
they can't be used as part of the extended query protocol.

Re: race condition in pg_class

2024-06-12 Thread Noah Misch
On Tue, Jun 11, 2024 at 01:37:21PM +0900, Michael Paquier wrote:
> On Mon, Jun 10, 2024 at 07:19:27PM -0700, Noah Misch wrote:
> > On Fri, Jun 07, 2024 at 09:08:03AM -0400, Robert Haas wrote:
> >> I think the core code should provide an "Injection Point" wait event
> >> type and let extensions add specific wait events there, just like you
> >> did for "Extension".
> > 
> > Michael, could you accept the core code offering that, or not?  If so, I am
> > content to implement that.  If not, for injection point wait events, I have
> > just one priority.  The isolation tester already detects lmgr locks without
> > the test writer teaching it about each lock individually.  I want it to have
> > that same capability for injection points. Do you think we can find 
> > something
> > everyone can accept, having that property?  These wait events show up in 
> > tests
> > only, and I'm happy to make the cosmetics be anything compatible with that
> > detection ability.
> 
> Adding a wait event class for injection point is an interesting
> suggestion that would simplify the detection in the isolation function
> quite a bit.  Are you sure that this is something that would be fit
> for v17 material?  TBH, I am not sure.

If I were making a list of changes always welcome post-beta, it wouldn't
include adding wait event types.  But I don't hesitate to add one if it
unblocks a necessary test for a bug present in all versions.

> At the end, the test coverage has the highest priority and the bugs
> you are addressing are complex enough that isolation tests of this
> level are a necessity, so I don't object to what
> inplace050-tests-inj-v2.patch introduces with the naming dependency
> for the time being on HEAD.  I'll just adapt and live with that
> depending on what I deal with, while trying to improve HEAD later on.

Here's what I'm reading for each person's willingness to tolerate each option:

STRATEGY| Paquier | Misch | Haas

new "Injection Point" wait type | maybe   | yes   | yes
INJECTION_POINT(...) naming | yes | yes   | unknown
isolation spec says event names | yes | no| unknown

Corrections and additional strategy lines welcome.  Robert, how do you judge
the lines where I've listed you as "unknown"?

> I'm still wondering if there is something that could be more elegant
> than a dedicated class for injection points, but I cannot think about
> something that would be better for isolation tests on top of my head.
> If there is something I can think of, I'll just go and implement it :)

I once considered changing them to use advisory lock waits instead of
ConditionVariableSleep(), but I recall that was worse from the perspective of
injection points in critical sections.




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 1:53 PM Tom Lane  wrote:
> * I decided to invent a typedef
>
> typedef pg_int64 PGusec_time_t;
>
> instead of writing "pg_int64" explicitly everywhere.  This is perhaps
> not as useful as it was when I was thinking the definition would be
> "long long int", but it still seems to add some readability.  In my
> eyes anyway ... anyone think differently?

I don't think it's a bad idea to have a typedef, but that particular
one is pretty unreadable. Mmm, let's separate some things with
underscores and others by a change in the capitalization convention!

I assume you're following an existing convention and therefore this is
the Right Thing To Do, but if there's some other approach that is less
like line noise, that would be great.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-12 Thread John H
Hi,

> I know about the problem and have seen the original email.

I'm sympathetic to the problem of potential privilege escalation when
executing functions. At the same time I'm not sure if there's that
much of a difference between 'CREATE EXTENSION' vs  superusers copying
a series of 'CREATE FUNCTION' where they don't understand these same
nuances.

CREATE FUNCTION already provides the ability to set the search_path. So
I'm wondering what the problem we want to solve here is. Is it that
there's too much friction for extension authors to have to set
search_path as part of the function definition and we want to make it
easier for them to "set once and forget"?


> But, I also agree with Jelte, it should be a property of a control file, 
> rather than a user controlled parameter, so that an attacker can't opt out.

+1. Also curious what happens if an extension author has search_path
already set in proconfig for a function that doesn't match what's in
the control file. I'm guessing the function one should take
precedence.


--
John Hsu - Amazon Web Services




Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-12 Thread Tom Lane
"Amonson, Paul D"  writes:
> I had our OSS internal team, who are experts in OSS licensing, review 
> possible conflicts between the PostgreSQL license and the BSD-Clause 3-like 
> license for the CRC32C AVX-512 code, and they found no issues. Therefore, 
> including the new license into the PostgreSQL codebase should be acceptable.

Maybe you should get some actual lawyers to answer this type of
question.  The Chromium license this code cites is 3-clause-BSD
style, which is NOT compatible: the "advertising" clause is
significant.

In any case, writing copyright notices that are pointers to
external web pages is not how it's done around here.  We generally
operate on the assumption that the Postgres source code will
outlive any specific web site.  Dead links to incidental material
might be okay, but legally relevant stuff not so much.

regards, tom lane




Re: race condition in pg_class

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 1:54 PM Noah Misch  wrote:
> If I were making a list of changes always welcome post-beta, it wouldn't
> include adding wait event types.  But I don't hesitate to add one if it
> unblocks a necessary test for a bug present in all versions.

However, injection points themselves are not present in all versions,
so even if we invent a new wait-event type, we'll have difficulty
testing older versions, unless we're planning to back-patch all of
that infrastructure, which I assume we aren't.

Personally, I think the fact that injection point wait events were put
under Extension is a design mistake that should be corrected before 17
is out of beta.

> Here's what I'm reading for each person's willingness to tolerate each option:
>
> STRATEGY| Paquier | Misch | Haas
> 
> new "Injection Point" wait type | maybe   | yes   | yes
> INJECTION_POINT(...) naming | yes | yes   | unknown
> isolation spec says event names | yes | no| unknown
>
> Corrections and additional strategy lines welcome.  Robert, how do you judge
> the lines where I've listed you as "unknown"?

I'd tolerate INJECTION_POINT() if we had no other option but I think
it's clearly inferior. Does the last line refer to putting the
specific wait event names in the isolation spec file? If so, I'd also
be fine with that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: On disable_cost

2024-06-12 Thread Andres Freund
Hi,

On 2024-06-12 11:35:48 -0400, Robert Haas wrote:
> Subject: [PATCH v2 3/4] Treat the # of disabled nodes in a path as a separate
>  cost metric.
> 
> Previously, when a path type was disabled by e.g. enable_seqscan=false,
> we either avoided generating that path type in the first place, or
> more commonly, we added a large constant, called disable_cost, to the
> estimated startup cost of that path. This latter approach can distort
> planning. For instance, an extremely expensive non-disabled path
> could seem to be worse than a disabled path, especially if the full
> cost of that path node need not be paid (e.g. due to a Limit).
> Or, as in the regression test whose expected output changes with this
> commit, the addition of disable_cost can make two paths that would
> normally be distinguishable by cost seem to have fuzzily the same cost.
> 
> To fix that, we now count the number of disabled path nodes and
> consider that a high-order component of both the cost. Hence, the
> path list is now sorted by disabled_nodes and then by total_cost,
> instead of just by the latter, and likewise for the partial path list.
> It is important that this number is a count and not simply a Boolean;
> else, as soon as we're unable to respect disabled path types in all
> portions of the path, we stop trying to avoid them where we can.


>   if (criterion == STARTUP_COST)
>   {
>   if (path1->startup_cost < path2->startup_cost)
> @@ -118,6 +127,15 @@ compare_fractional_path_costs(Path *path1, Path *path2,
>   Costcost1,
>   cost2;
>  
> + /* Number of disabled nodes, if different, trumps all else. */
> + if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
> + {
> + if (path1->disabled_nodes < path2->disabled_nodes)
> + return -1;
> + else
> + return +1;
> + }

I suspect it's going to be ok, because the branch is going to be very
predictable in normal workloads, but I still worry a bit about making
compare_path_costs_fuzzily() more expensive. For more join-heavy queries it
can really show up, and there are plenty of ORM-generated join-heavy query
workloads.

If costs were 32 bit integers, I'd have suggested just stashing the disabled
counts in the upper 32 bits of a 64bit integer. But ...
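
(Purely to illustrate that packing idea -- costs are doubles, so it
doesn't apply as-is -- a single 64-bit comparison would then order paths
by disabled count first and cost second:)

```
#include <stdint.h>

/*
 * Hypothetical: if cost fit in 32 bits, one 64-bit comparison would
 * order paths by disabled_nodes first and by cost second.
 */
static inline uint64_t
path_sort_key(uint32_t disabled_nodes, uint32_t cost)
{
	return ((uint64_t) disabled_nodes << 32) | cost;
}
```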




In an extreme case I can see a tiny bit of overhead, but not enough to be
worth worrying about. Mostly because we're so profligate in doing
bms_overlap() that cost comparisons don't end up mattering as much - I seem to
recall that being different in the not distant past though.


Aside: I'm somewhat confused by add_paths_to_joinrel()'s handling of
mergejoins_allowed. If mergejoins are disabled we end up reaching
match_unsorted_outer() in more cases than with mergejoins enabled. E.g. we
only set mergejoin_enabled for right joins inside select_mergejoin_clauses(),
but we don't call select_mergejoin_clauses() if !enable_mergejoin and jointype
!= FULL.  I, what?


Greetings,

Andres Freund




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Ranier Vilela
On Wed, Jun 12, 2024 at 14:53, Tom Lane  wrote:

> Robert Haas  writes:
> > I agree this is not great. I guess I didn't think about it very hard
> > because, after all, we were just exposing an API that we'd already
> > been using internally. But I think it's reasonable to adjust the API
> > to allow for better resolution, as you propose. A second is a very
> > long amount of time, and it's entirely reasonable for someone to want
> > better granularity.
>
> Here's a v2 responding to some of the comments.
>
> * People pushed back against not using "int64", but the difficulty
> with that is that we'd have to #include c.h or at least pg_config.h
> in libpq-fe.h, and that would be a totally disastrous invasion of
> application namespace.  However, I'd forgotten that libpq-fe.h
> does include postgres_ext.h, and there's just enough configure
> infrastructure behind that to allow defining pg_int64, which indeed
> libpq-fe.h is already relying on.  So we can use that.
>
> * I decided to invent a typedef
>
> typedef pg_int64 PGusec_time_t;
>
Perhaps pg_timeusec?
I think it combines with PQgetCurrentTimeUSec


>
> instead of writing "pg_int64" explicitly everywhere.  This is perhaps
> not as useful as it was when I was thinking the definition would be
> "long long int", but it still seems to add some readability.  In my
> eyes anyway ... anyone think differently?
>
> * I also undid changes like s/end_time/end_time_us/.  I'd done that
> mostly to ensure I looked at/fixed every reference to those variables,
> but on reflection I don't think it's doing anything for readability.
>
end_time seems much better to me.


>
> * I took Ranier's suggestion to make fast paths for end_time == 0.
> I'm not sure this will make any visible performance difference, but
> it's simple and shouldn't hurt.  We do have some code paths that use
> that behavior.
>
Thanks.


>
> * Ranier also suggested using clock_gettime instead of gettimeofday,
> but I see no reason to do that.  libpq already relies on gettimeofday,
> but not on clock_gettime, and anyway post-beta1 isn't a great time to
> start experimenting with portability-relevant changes.
>
I agree.
For v18, it would be worth thinking about no longer using
gettimeofday, as it appears to be deprecated.

best regards,
Ranier Vilela


Re: Contributing test cases to improve coverage

2024-06-12 Thread Tom Lane
J F  writes:
> For postgres, I am looking at adding test cases to test suite in
> test/src/regress/. I have gone through (a)-(e), and managed to produced
> some test cases. As an example, I claim the test case
> ```
> CREATE RECURSIVE VIEW a(b) AS SELECT'' ;
> SELECT FROM a WHERE NULL;
> ```
> could kill the following mutation at optimizer/plan/setrefs.c, 502:5--502:33
> Original binary operator expression:
> ```
> rte->rtekind == RTE_SUBQUERY
> 
> Replacement expression:
> ```
> (rte->rtekind) >= RTE_SUBQUERY
> ```

I am quite confused about what is the point of this.  You have not
found any actual bug, nor have you demonstrated that this test case
could discover a likely future bug that wouldn't be detected another
way.  Moreover, it seems like the process would lead to some very
large number of equally marginal test cases.  We aren't likely to
accept such a patch, because we are concerned about keeping down the
runtime of the test suite.

regards, tom lane




Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-12 Thread Bruce Momjian
On Wed, Jun 12, 2024 at 02:08:02PM -0400, Tom Lane wrote:
> "Amonson, Paul D"  writes:
> > I had our OSS internal team, who are experts in OSS licensing, review 
> > possible conflicts between the PostgreSQL license and the BSD-Clause 3-like 
> > license for the CRC32C AVX-512 code, and they found no issues. Therefore, 
> > including the new license into the PostgreSQL codebase should be acceptable.
> 
> Maybe you should get some actual lawyers to answer this type of
> question.  The Chromium license this code cites is 3-clause-BSD
> style, which is NOT compatible: the "advertising" clause is
> significant.
> 
> In any case, writing copyright notices that are pointers to
> external web pages is not how it's done around here.  We generally
> operate on the assumption that the Postgres source code will
> outlive any specific web site.  Dead links to incidental material
> might be okay, but legally relevant stuff not so much.

Agreed.  The licenses are compatible in the sense that they can be
combined to create a unified work, but they cannot be combined without
modifying the license of the combined work.  You would need to combine
the Postgres and Chrome licenses for this, and I highly doubt we are
going to be modifying the Postgres license for this.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jun 12, 2024 at 1:53 PM Tom Lane  wrote:
>> * I decided to invent a typedef
>> typedef pg_int64 PGusec_time_t;

> I don't think it's a bad idea to have a typedef, but that particular
> one is pretty unreadable. Mmm, let's separate some things with
> underscores and others by a change in the capitalization conventIon!

"PG" as a prefix for typedefs in libpq-fe.h is a pretty ancient
precedent.  I'm not wedded to any of the rest of it --- do you
have a better idea?

regards, tom lane




Re: On disable_cost

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 2:11 PM Andres Freund  wrote:
> 
>
> In an extreme case i can see a tiny bit of overhead, but not enough to be
> worth worrying about. Mostly because we're so profligate in doing
> bms_overlap() that cost comparisons don't end up mattering as much - I seem to
> recall that being different in the not distant past though.

There are very few things I love more than when you can't resist
trying to break my patches and yet fail to find a problem. Granted the
latter part only happens once a century or so, but I'll take it.

> Aside: I'm somewhat confused by add_paths_to_joinrel()'s handling of
> mergejoins_allowed. If mergejoins are disabled we end up reaching
> match_unsorted_outer() in more cases than with mergejoins enabled. E.g. we
> only set mergejoin_enabled for right joins inside select_mergejoin_clauses(),
> but we don't call select_mergejoin_clauses() if !enable_mergejoin and jointype
> != FULL.  I, what?

I agree this logic is extremely confusing, but "we only set
mergejoin_enabled for right joins inside select_mergejoin_clauses()"
doesn't seem to be true. It starts out true, and always stays true
except for right, right-anti, and full joins, where
select_mergejoin_clauses() can set it to false. Since the call to
match_unsorted_outer() is gated by mergejoin_enabled, you might think
that we'd skip considering nested loops on the strength of not being
able to do a merge join, but comment "2." in add_paths_to_joinrel
explains that the join types for which mergejoin_enabled can end up
false aren't supported by nested loops anyway. Still, this logic is
really tortured.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 2:25 PM Tom Lane  wrote:
> Robert Haas  writes:
> > On Wed, Jun 12, 2024 at 1:53 PM Tom Lane  wrote:
> >> * I decided to invent a typedef
> >> typedef pg_int64 PGusec_time_t;
>
> > I don't think it's a bad idea to have a typedef, but that particular
> > one is pretty unreadable. Mmm, let's separate some things with
> > underscores and others by a change in the capitalization conventIon!
>
> "PG" as a prefix for typedefs in libpq-fe.h is a pretty ancient
> precedent.  I'm not wedded to any of the rest of it --- do you
> have a better idea?

Hmm, well, one thing I notice is that most of the other typedefs in
src/interfaces/libpq seem to do PGWordsLikeThis or PGwordsLikeThis
rather than PGwords_like_this. There are a few that randomly do
pg_words_like_this, too. But I know of no specific precedent for how a
microsecond type should be named.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: On disable_cost

2024-06-12 Thread Andres Freund
Hi,

On 2024-06-12 14:33:31 -0400, Robert Haas wrote:
> On Wed, Jun 12, 2024 at 2:11 PM Andres Freund  wrote:
> > 
> >
> > In an extreme case i can see a tiny bit of overhead, but not enough to be
> > worth worrying about. Mostly because we're so profligate in doing
> > bms_overlap() that cost comparisons don't end up mattering as much - I seem 
> > to
> > recall that being different in the not distant past though.
> 
> There are very few things I love more than when you can't resist
> trying to break my patches and yet fail to find a problem. Granted the
> latter part only happens once a century or so, but I'll take it.

:)


Too high cost in path cost comparison is what made me look at the PG code for
the first time, IIRC :)



> > Aside: I'm somewhat confused by add_paths_to_joinrel()'s handling of
> > mergejoins_allowed. If mergejoins are disabled we end up reaching
> > match_unsorted_outer() in more cases than with mergejoins enabled. E.g. we
> > only set mergejoin_enabled for right joins inside 
> > select_mergejoin_clauses(),
> > but we don't call select_mergejoin_clauses() if !enable_mergejoin and 
> > jointype
> > != FULL.  I, what?
> 
> I agree this logic is extremely confusing, but "we only set
> mergejoin_enabled for right joins inside select_mergejoin_clauses()"
> doesn't seem to be true.

Sorry, should have been more precise. With "set" I didn't mean set to true,
but that it's only modified within select_mergejoin_clauses().


> It starts out true, and always stays true except for right, right-anti, and
> full joins, where select_mergejoin_clauses() can set it to false. Since the
> call to match_unsorted_outer() is gated by mergejoin_enabled, you might
> think that we'd skip considering nested loops on the strength of not being
> able to do a merge join, but comment "2." in add_paths_to_joinrel explains
> that the join types for which mergejoin_enabled can end up false aren't
> supported by nested loops anyway. Still, this logic is really tortured.

Agree that that's the logic - but doesn't that mean we'll consider nestloops
for e.g. right joins iff enable_mergejoin=false?


Greetings,

Andres Freund




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jun 12, 2024 at 2:25 PM Tom Lane  wrote:
>> "PG" as a prefix for typedefs in libpq-fe.h is a pretty ancient
>> precedent.  I'm not wedded to any of the rest of it --- do you
>> have a better idea?

> Hmm, well, one thing I notice is that most of the other typedefs in
> src/interfaces/libpq seem to do PGWordsLikeThis or PGwordsLikeThis
> rather than PGwords_like_this. There are a few that randomly do
> pg_words_like_this, too. But I know of no specific precedent for how a
> microsecond type should be named.

Hmm ... pg_int64 is the only such typedef I'm seeing in that file.
But okay, it's a precedent.  The thing I'm having difficulty with
is that I'd like the typedef name to allude to time_t, and I don't
think fooling with the casing of that will be helpful in making
the allusion stick.  So how about one of

pg_usec_time_t
pg_time_t_usec

?

regards, tom lane




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 1:30 PM Daniel Gustafsson  wrote:
> > On 12 Jun 2024, at 17:50, Andres Freund  wrote:
> > On 2024-06-11 07:28:23 -0700, Jacob Champion wrote:
>
> >> The OAuth pytest suite makes extensive use of
> >> - psycopg, to easily drive libpq;
> >
> > That's probably not going to fly. It introduces painful circular 
> > dependencies
> > between building postgres (for libpq), building psycopg (requiring libpq) 
> > and
> > testing postgres (requiring psycopg).
>
> I might be missing something obvious, but if we use a third-party libpq driver
> in the testsuite doesn't that imply that a patch adding net new functionality
> to libpq also need to add it to the driver in order to write the tests?  I'm
> thinking about the SCRAM situation a few years back when drivers weren't up to
> date.

Yeah, I don't think depending on psycopg2 is practical at all. We can
either shell out to psql like we do now, or we can use something like
CFFI.

On the overall topic of this thread, I personally find most of the
rationale articulated in the original message unconvincing. Many of
those improvements could be made on top of the Perl framework we have
today, and some of them have been discussed, but nobody's done the
work. I also don't understand the argument that assert a == b is some
new and wonderful thing; I mean, you can already do is($a,$b,"test
name") which *also* shows you the values when they don't match, and
includes a test name, too! I personally think that most of the
frustration associated with writing TAP tests has to do with (1)
Windows behavior being randomly different than on other platforms in
ways that are severely under-documented, (2)
PostgreSQL::Test::whatever being somewhat clunky and under-designed,
and (3) the general difficulty of producing race-free test cases. A
new test framework isn't going to solve (3), and (1) and (2) could be
fixed anyway.

However, I understand that a lot of people would prefer to code in
Python than in Perl. I am not one of them: I learned Perl in the early
nineties, and I haven't learned Python yet. Nonetheless, Python being
more popular than Perl is a reasonable reason to consider allowing its
use in PostgreSQL. But if that's the reason, let's be up front about
it.

I do think we want a scripting language here, i.e. not C.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Multi-transactional statements and statistics for autovacuum

2024-06-12 Thread Igor V.Gnatyuk

Hello.

Before the advent of procedures in PostgreSQL 11, which can manage
transactions, there could be only one transaction within one statement.
Hence the end of the transaction also meant the end of the statement.
Apparently, this is why the corresponding restriction is described
differently in different places of the documentation:


https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-STATS-VIEWS
"...so a query or transaction still in progress does not affect the 
displayed totals..."

"...counts actions taken so far within the current transaction..."

But now it is possible for several transactions to be performed within a
single SQL statement call. At the same time, the current implementation
transfers the accumulated statistics to shared memory only at the end of
the statement, and it is these statistics that autovacuum relies on. Thus,
when a data-modifying procedure runs for a long time (e.g. an infinite
event-processing loop, such as a queue implementation), the changes made
and committed inside it will not affect the statistics in shared memory
until the CALL statement finishes. This prevents autovacuum from making a
timely cleanup decision. To illustrate this behavior, consider the example
below.


Example.

We process the data in the 'test' table. The 'changes' column will show 
the number of row updates:


  CREATE TABLE test (changes int);

Let's insert a row into the table:

  INSERT INTO test VALUES (0);

At each processing step, the value of the 'changes' column will be 
incremented. The processing will be performed
in a long-running loop within the 'process' procedure (see below). The 
actions of each loop step are committed.


  CREATE PROCEDURE process() AS $$
  DECLARE
    l_chs int;
  BEGIN
    LOOP
      UPDATE test SET changes = changes + 1 RETURNING changes INTO l_chs;

      COMMIT;

      RAISE NOTICE 'changes % -- upd_shared = %, upd_local = %', l_chs,
        (SELECT n_tup_upd FROM pg_stat_all_tables
          WHERE relname = 'test'),   -- statistics in shared memory
                                     -- (considered by autovacuum)
        (SELECT n_tup_upd FROM pg_stat_xact_all_tables
          WHERE relname = 'test');   -- statistics within the transaction
    END LOOP;
  END
  $$ LANGUAGE plpgsql;

Let's call the procedure:

  CALL process();

NOTICE:  changes 1 -- upd_shared = 0, upd_local = 1
NOTICE:  changes 2 -- upd_shared = 0, upd_local = 2
NOTICE:  changes 3 -- upd_shared = 0, upd_local = 3
NOTICE:  changes 4 -- upd_shared = 0, upd_local = 4
NOTICE:  changes 5 -- upd_shared = 0, upd_local = 5
NOTICE:  changes 6 -- upd_shared = 0, upd_local = 6
NOTICE:  changes 7 -- upd_shared = 0, upd_local = 7
NOTICE:  changes 8 -- upd_shared = 0, upd_local = 8
...

If we now observe the cumulative statistics on the 'test' table from
another session, we will see that although updates are happening and dead
tuples are accumulating, this information does not reach shared memory:


SELECT n_tup_upd, n_dead_tup, n_ins_since_vacuum, vacuum_count, autovacuum_count
  FROM pg_stat_all_tables WHERE relname = 'test';

|  n_tup_upd  | 0
|  n_dead_tup | 0
|  n_ins_since_vacuum | 1
|  vacuum_count   | 0
|  autovacuum_count   | 0

It would be logical to remove the existing restriction, that is, to update
the statistics data right at transaction completion, even if the statement
is still running.

--
Regards, Igor Gnatyuk
Postgres Professional https://postgrespro.com




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 3:00 PM Tom Lane  wrote:
> Hmm ... pg_int64 is the only such typedef I'm seeing in that file.

I grepped the whole directory for '^} '.

> But okay, it's a precedent.  The thing I'm having difficulty with
> is that I'd like the typedef name to allude to time_t, and I don't
> think fooling with the casing of that will be helpful in making
> the allusion stick.  So how about one of
>
> pg_usec_time_t
> pg_time_t_usec
>
> ?

The former seems better to me, since having _t not at the end does not
seem too intuitive.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-12 Thread Jeff Davis
On Wed, 2024-06-12 at 12:13 +0530, Ashutosh Bapat wrote:
> > Alternatively, we could leverage the extension dependency
> > information
> > to determine whether the function is created by an extension or
> > not.
> 
> That will be simpler. We do that sort of thing for identity
> sequences. So there's a precedent to do that kind of stuff. 

I did not look at the details, but +1 for using information we already
have. There's a little bit of extra work to resolve it, but thanks to
the search_path cache it should only need to be done once per unique
search_path setting per session.

Regards,
Jeff Davis





Re: On disable_cost

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 2:48 PM Andres Freund  wrote:
> Sorry, should have been more precise. With "set" I didn't mean set to true,
> but that it's only modified within select_mergejoin_clauses().

Oh. "set" has more than one relevant meaning here.

> > It starts out true, and always stays true except for right, right-anti, and
> > full joins, where select_mergejoin_clauses() can set it to false. Since the
> > call to match_unsorted_outer() is gated by mergejoin_enabled, you might
> > think that we'd skip considering nested loops on the strength of not being
> > able to do a merge join, but comment "2." in add_paths_to_joinrel explains
> > that the join types for which mergejoin_enabled can end up false aren't
> > supported by nested loops anyway. Still, this logic is really tortured.
>
> Agree that that's the logic - but doesn't that mean we'll consider nestloops
> for e.g. right joins iff enable_mergejoin=false?

No, because that function has its own internal guards. See nestjoinOK.

But don't misunderstand me: I'm not defending the status quo. The
whole thing seems like a Rube Goldberg machine to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jun 12, 2024 at 3:00 PM Tom Lane  wrote:
>> So how about one of
>>  pg_usec_time_t
>>  pg_time_t_usec
>> ?

> The former seems better to me, since having _t not at the end does not
> seem too intuitive.

True.  We can guess about how POSIX might spell this if they ever
invent the concept, but one choice they certainly would not make
is time_t_usec.

v3 attached uses pg_usec_time_t, and fixes one brown-paper-bag
bug the cfbot noticed in v2.

regards, tom lane

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 80179f9978..814c49bdd1 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -268,30 +268,52 @@ PGconn *PQsetdb(char *pghost,
   
nonblocking connection
Poll a connection's underlying socket descriptor retrieved with .
+   The primary use of this function is iterating through the connection
+   sequence described in the documentation of
+   .
 
-int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+typedef pg_int64 pg_usec_time_t;
+
+int PQsocketPoll(int sock, int forRead, int forWrite,
+ pg_usec_time_t end_time);
 
   
 
   
-   This function sets up polling of a file descriptor. The underlying function is either
+   This function performs polling of a file descriptor, optionally with
+   a timeout. The underlying function is either
poll(2) or select(2), depending on platform
-   support. The primary use of this function is iterating through the connection sequence
-   described in the documentation of . If
-   forRead is specified, the function waits for the socket to be ready
-   for reading. If forWrite is specified, the function waits for the
-   socket to be ready for write. See POLLIN and POLLOUT
+   support.
+   If forRead is nonzero, the
+   function will terminate when the socket is ready for
+   reading. If forWrite is nonzero,
+   the function will terminate when the
+   socket is ready for writing.
+   See POLLIN and POLLOUT
from poll(2), or readfds and
-   writefds from select(2) for more information. If
-   end_time is not -1, it specifies the time at which
-   this function should stop waiting for the condition to be met.
+   writefds from select(2) for more information.
+  
+
+  
+   The timeout is specified by end_time, which
+   is the time to stop waiting expressed as a number of microseconds since
+   the Unix epoch (that is, time_t times 1 million).
+   Timeout is infinite if end_time
+   is -1.  Timeout is immediate (no blocking) if
+   end_time is 0 (or indeed, any time before now).
+   Timeout values can be calculated conveniently by adding the desired
+   number of microseconds to the result of
+   .
+   Note that the underlying system calls may have less than microsecond
+   precision, so that the actual delay may be imprecise.
   
 
   
The function returns a value greater than 0 if the specified condition
is met, 0 if a timeout occurred, or -1 if an error
occurred. The error can be retrieved by checking the errno(3) value. In
-   the event forRead and forWrite are not set, the
+   the event both forRead
+   and forWrite are zero, the
function immediately returns a timeout condition.
   
  
@@ -8039,6 +8061,25 @@ int PQlibVersion(void);
 

 
+   
+PQgetCurrentTimeUSecPQgetCurrentTimeUSec
+
+
+ 
+  Retrieves the current time, expressed as the number of microseconds
+  since the Unix epoch (that is, time_t times 1 million).
+
+pg_usec_time_t PQgetCurrentTimeUSec(void);
+
+ 
+
+ 
+  This is primarily useful for calculating timeout values to use with
+  .
+ 
+
+   
+
   
 
  
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index fae5940b54..180781ecd0 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3764,7 +3764,7 @@ wait_until_connected(PGconn *conn)
 	{
 		int			rc;
 		int			sock;
-		time_t		end_time;
+		pg_usec_time_t end_time;
 
 		/*
 		 * On every iteration of the connection sequence, let's check if the
@@ -3795,7 +3795,7 @@ wait_until_connected(PGconn *conn)
 		 * solution happens to just be adding a timeout, so let's wait for 1
 		 * second and check cancel_pressed again.
 		 */
-		end_time = time(NULL) + 1;
+		end_time = PQgetCurrentTimeUSec() + 1000000;
 		rc = PQsocketPoll(sock, forRead, !forRead, end_time);
 		if (rc == -1)
 			return;
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 8ee0811510..5d8213e0b5 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -204,3 +204,4 @@ PQcancelReset 201
 PQcancelFinish202
 PQsocketPoll  203
 PQsetChunkedRowsMode  204
+PQgetCurrentTimeUSec  205
diff --git a/src

Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jacob Champion
On Tue, Jun 11, 2024 at 4:48 PM Noah Misch  wrote:
> If we're going to test in a non-Perl language, I'd pick C over Python.

We already test in C, though? If the complaint is that those tests are
driven by Perl, I agree that something like libcheck or GTest or
whatever people are using nowadays would be nicer. But that can be
added at any point; the activation energy for a new C-based test
runner seems pretty small. IMO, there's no reason to pick it _over_
another language, when we already support C tests and agree that
developers need to be fluent in C.

> We'd need a library like today's Perl
> PostgreSQL::Test to make C-language tests nice, but the same would apply to
> any new language.

I think the effort required in rebuilding end-to-end tests in C is
going to be a lot different than in pretty much any other modern
high-level language, so I don't agree that "the same would apply".

For the five problem statements I put forward, I think C moves the
needle for zero of them. It neither solves the problems we have nor
gives us stronger tools to solve them ourselves. And for my personally
motivating use case of OAuth, where I need to manipulate HTTP and JSON
and TLS and so on and so forth, implementing all of that in C would be
an absolute nightmare. Given that choice, I would rather use Perl --
and that's saying something, because I like C a lot more than I like
Perl -- because it's the difference between being given a rusty but
still functional table saw, and being given a box of Legos to build a
"better" table saw, when all I want to do is cut a 2x4 in half and
move on with my work.

I will use the rusty saw if I have to. But I want to get a better saw
-- that somebody else with experience in making saws has constructed,
and other people are pretty happy with -- as opposed to building my
own.

> I also want the initial scope to be the new language coexisting with the
> existing Perl tests.

Strongly agreed.

Thanks,
--Jacob




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jacob Champion
On Wed, Jun 12, 2024 at 4:40 AM Jelte Fennema-Nio  wrote:
> I think C#, Java, Go, Rust, Kotlin, and Swift would be acceptable
> choices for me (and possibly some more). They allow some type of
> introspection, they have a garbage collector, and their general
> tooling is quite good.
>
> But I think a dynamically typed scripting language is much more
> fitting for writing tests like this. I love static typing for
> production code, but imho it really doesn't have much benefit for
> tests.

+1. I write mostly protocol mocks and glue code in my authn testing,
to try to set up the system into some initial state and then break it.
Of the languages mentioned here, I've only used C#, Java, and Go. If I
had to reimplement my tests, I'd probably reach for Go out of all of
those, but the glue would still be more painful than it probably needs
to be.

> As scripting languages go, the ones that are still fairly heavily in
> use are Javascript, Python, Ruby, and PHP. I think all of those could
> probably work, but my personal order of preference would be Python,
> Ruby, Javascript, PHP.

- Python is the easiest language I've personally used to glue things
together, bar none.
- I like Ruby as a language but have no experience using it for
testing. (RSpec did come up during the unconference session and
subsequent hallway conversations.)
- Javascript is a completely different mental model from what we're
used to, IMO. I think we're likely to spend a lot of time fighting the
engine unless everyone is very clear on how it works.
- I don't see a use case for PHP here.

> TO CLARIFY: This thread is not a proposal to replace Perl with Python.
> It's a proposal to allow people to also write tests in Python.

+1. It doesn't need to replace anything. It just needs to help us do
more things than we're currently doing.

--Jacob




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jacob Champion
On Wed, Jun 12, 2024 at 4:48 AM Alexander Korotkov  wrote:
> Generally, testgres was initially designed as a Python analogue of what
> we have in src/test/perl/PostgreSQL/Test.  In particular, its
> testgres.PostgresNode is an analogue of PostgreSQL::Test::Cluster.  It
> comes under the PostgreSQL License.  So, I wonder if we could revise it
> and fetch most of it into our source tree.

Okay. If there's wide interest in a port of PostgreSQL::Test::Cluster,
that might be something to take a look at. (Since I'm focused on
testing things that the current Perl suite can't do at all, I would
probably not be the first to volunteer.)

--Jacob




Re: race condition in pg_class

2024-06-12 Thread Noah Misch
On Wed, Jun 12, 2024 at 02:08:31PM -0400, Robert Haas wrote:
> On Wed, Jun 12, 2024 at 1:54 PM Noah Misch  wrote:
> > If I were making a list of changes always welcome post-beta, it wouldn't
> > include adding wait event types.  But I don't hesitate to add one if it
> > unblocks a necessary test for a bug present in all versions.
> 
> However, injection points themselves are not present in all versions,
> so even if we invent a new wait-event type, we'll have difficulty
> testing older versions, unless we're planning to back-patch all of
> that infrastructure, which I assume we aren't.

Right.  We could put the injection point tests in v18 only instead of v17+v18.
I feel that would be an overreaction to a dispute about names that show up
only in tests.  Even so, I could accept that.

> Personally, I think the fact that injection point wait events were put
> under Extension is a design mistake that should be corrected before 17
> is out of beta.

Works for me.  I don't personally have a problem with the use of Extension,
since it is a src/test/modules extension creating them.

> > Here's what I'm reading for each person's willingness to tolerate each 
> > option:
> >
> > STRATEGY                        | Paquier | Misch | Haas
> > --------------------------------+---------+-------+---------
> > new "Injection Point" wait type | maybe   | yes   | yes
> > INJECTION_POINT(...) naming     | yes     | yes   | unknown
> > isolation spec says event names | yes     | no    | unknown
> >
> > Corrections and additional strategy lines welcome.  Robert, how do you judge
> > the lines where I've listed you as "unknown"?
> 
> I'd tolerate INJECTION_POINT() if we had no other option but I think
> it's clearly inferior. Does the last line refer to putting the
> specific wait event names in the isolation spec file? If so, I'd also
> be fine with that.

Yes, the last line does refer to that.  Updated table:

STRATEGY                        | Paquier | Misch | Haas
--------------------------------+---------+-------+-----
new "Injection Point" wait type | maybe   | yes   | yes
INJECTION_POINT(...) naming     | yes     | yes   | no
isolation spec says event names | yes     | no    | yes

I find that's adequate support for the first line.  If there are no objections
in the next 24hr, I will implement that.




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jacob Champion
On Wed, Jun 12, 2024 at 8:50 AM Andres Freund  wrote:
> I think I might have formulated my paragraph above badly - I didn't mean that
> we should move away from perl tests tomorrow, but that we need a path forward
> that allows folks to write tests without perl.

Okay, agreed.

> > - Tests aren't cheap, but in my experience, the maintenance-cost math
> > for tests is a lot different than the math for implementations.
>
> At the moment they tend to be *more* expensive often, due to spurious
> failures. That's mostly not perl's fault, don't get me wrong, but us not
> having better infrastructure for testing complicated behaviour and/or testing
> things on a more narrow basis.

Well, okay, but I'm not sure how to respond to this in the frame of
this discussion. Bad tests will continue to exist. I am trying to add
a tool that, in my view, has made it easier for me to test complicated
behavior than what we currently have. I can't prove that it will solve
other issues too.

> > Well, scopes are pretty front and center when you start building
> > pytest fixtures, and the complicated longer setups will hopefully
> > converge correctly early on and be reused everywhere else. I imagine
> > no one wants to build cluster setup from scratch.
>
> One (the?) prime source of state in our tap tests is the
> database. Realistically we can't just tear that one down and reset it between
> tests without causing the test times to explode. So we'll have to live with
> some persistent state.

Yes? If I've given the impression that I disagree, sorry; I agree.

> > On a slight tangent, is this not a problem today?
>
> It is, but that doesn't mean making it even bigger is unproblematic :)

Given all that's been said, I don't understand why you think the
problem would get bigger. We would cache expensive state that we need,
including the cluster, and pytest lets us do that, and my test suite
does that. I've never written a suite that spun up a separate cluster
for every single test and then immediately threw it away.

(If you want to _enable_ that behavior, to test in extreme isolation,
then pytest lets you do that too. But it's not something we should do
by default.)

> > We do a lot more acceptance testing than internal testing, which came
> > up as a major complaint from me and others during the unconference.
> > One of the reasons people avoid writing internal tests in Perl is
> > because it's very painful to find a rhythm with Test::More.
>
> What definition of internal tests are you using here?

There's a spectrum from unit-testing unexported functions all the way
to end-to-end acceptance, and personally I believe that anything
finer-grained than end-to-end acceptance is unnecessarily painful. My
OAuth suite sits somewhere in the middle, where it mocks the protocol
layer and can test the client and server as independent pieces. Super
useful for OAuth, which is asymmetrical.

I'd like to additionally see better support for unit tests of backend
internals, but I don't know those seams as well as all of you do and I
should not be driving that. I don't think Python will necessarily help
you with it. But it sure helped me break apart the client and the
server while enjoying the testing process, and other people want to do
that too, so that's what I'm pushing for.

> I think a lot of our tests are complicated, fragile and slow because we almost
> exclusively do end-to-end tests, because with a few exceptions we don't have a
> way to exercise code in a more granular way.

Yep.

> That's probably not going to fly. It introduces painful circular dependencies
> between building postgres (for libpq), building psycopg (requiring libpq) and
> testing postgres (requiring psycopg).

I am trying very hard not to drag that, which I understand is
controversial and is in no way a linchpin of my proposal, into the
discussion of whether or not we should try supporting pytest.

I get it; I understand that the circular dependency is weird; there
are alternatives if it's unacceptable; none of that has anything to do
with Python+pytest.

> One thing worth thinking about is that such dependencies have to work on a
> relatively large number of platforms / architectures. A lot of projects
> don't...

Agreed.

Thanks,
--Jacob




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread Jacob Champion
On Wed, Jun 12, 2024 at 10:30 AM Daniel Gustafsson  wrote:
> I might be missing something obvious, but if we use a third-party libpq driver
> in the testsuite doesn't that imply that a patch adding net new functionality
> to libpq also need to add it to the driver in order to write the tests?

I use the third-party driver to perform the "basics" at a high level
-- connections, queries during cluster setup, things that don't
involve ABI changes. For new ABI I use ctypes, or as other people have
mentioned CFFI would work.

--Jacob




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-12 Thread Robert Haas
On Wed, Jun 12, 2024 at 2:05 PM John H  wrote:>
> I'm sympathetic to the problem of potential privilege escalation when
> executing functions. At the same time I'm not sure if there's that
> much of a difference between 'CREATE EXTENSION' vs  superusers copying
> a series of 'CREATE FUNCTION' where they don't understand these same
> nuances.

+1.

> CREATE FUNCTION already provides an ability to set the search_path. So
> I'm wondering what the problem we want to solve here is. Is it that
> there's too much friction for extension authors to have to set
> search_path as part of the function definition and we want to make it
> easier for them to "set once and forget"?

If that's the problem we want to solve, I'm unconvinced that it's
worth doing anything. But I think there's another problem, which is
that if the extension is relocatable, how do you set a secure
search_path? You could say SET search_path = foo, pg_catalog if you
know the extension will be installed in schema foo, but if you don't
know in what schema the extension will be installed, then what are you
supposed to do? The proposal of adding $extension_schema could help
with that ...

...except I'm not sure that's really a full solution either, because
what if the extension is installed into a schema that's writable by
others, like public? If $extension_schema resolves to public and
public allows CREATE access to normal users, you're in a lot of
trouble again already, because now an attacker can try to capture
operator references within your function, or function calls that are
approximate matches to some existing function by introducing perfect
matches that take priority. Perhaps we want to consider the case where
the attacker can write to the schema containing the extension as an
unsupported scenario, but if so, we'd better be explicit about that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-12 Thread Andres Freund
Hi,

I'm wondering if this isn't going in the wrong direction. We're using CRCs for
something they're not well suited for in my understanding - and are paying a
reasonably high price for it, given that even hardware accelerated CRCs aren't
blazingly fast.

CRCs are used for things like ethernet and iSCSI because they are good at
detecting the kinds of errors encountered there, namely short bursts of
bitflips, and because the amount of covered data is limited to a fairly
small size.

Which imo makes CRCs a bad choice for WAL. For one, we don't actually expect a
short burst of bitflips, the most likely case is all bits after some point
changing (because only one part of the record made it to disk). For another,
WAL records are *not* limited to a small size, and if anything error detection
becomes more important with longer records (they're likely to span more
pages / segments).


It's hard to understand, but a nonetheless helpful page is
https://users.ece.cmu.edu/~koopman/crc/crc32.html which lists properties for
crc32c:
https://users.ece.cmu.edu/~koopman/crc/c32/0x8f6e37a0_len.txt
which lists
(0x8f6e37a0; 0x11edc6f41) <=> (0x82f63b78; 0x105ec76f1) 
{2147483615,2147483615,5243,5243,177,177,47,47,20,20,8,8,6,6,1,1} | gold | 
(*op) iSCSI; CRC-32C; CRC-32/4

This cryptic notation AFAIU indicates that for our polynomial we can detect 2-bit
errors up to a length of 2147483615 bytes, 3 bit errors up to 2147483615, 3
and 4 bit errors up to 5243, 5 and 6 bit errors up to 177, 7/8 bit errors up
to 47.

IMO for our purposes just about all errors are going to be at least at sector
boundaries, i.e. 512 bytes, and thus at least 8 bits large. At that point we
are only guaranteed to find a single-byte error (it'll be common to have
much more) up to a length of 47 bits. Which isn't a useful guarantee.


With that I perhaps have established that CRC guarantees aren't useful for us.
But not yet why we should use something else: Given that we already aren't
relying on hard guarantees, we could instead just use a fast hash like xxh3.
https://github.com/Cyan4973/xxHash which is fast both for large and small
amounts of data.
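
As a concrete point of reference, here is a minimal sketch of hashing a buffer
with xxh3's one-shot 64-bit API (assuming xxHash >= 0.8 is installed as a
system library; the buffer is just a stand-in, not any particular WAL record
layout):

    /* Build with: cc xxh3_demo.c -lxxhash */
    #include <stdio.h>
    #include <string.h>
    #include <xxhash.h>

    int
    main(void)
    {
        unsigned char buf[8192];

        /* Stand-in for a WAL record body. */
        memset(buf, 0xAB, sizeof(buf));

        /* One-shot 64-bit xxh3 hash over the whole buffer. */
        XXH64_hash_t h = XXH3_64bits(buf, sizeof(buf));

        printf("xxh3: %016llx\n", (unsigned long long) h);
        return 0;
    }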


Greetings,

Andres Freund




Re: Improve the granularity of PQsocketPoll's timeout parameter?

2024-06-12 Thread Tom Lane
I wrote:
> v3 attached uses pg_usec_time_t, and fixes one brown-paper-bag
> bug the cfbot noticed in v2.

Oh, I just remembered that there's a different bit of
pqConnectDBComplete that we could simplify now:

if (timeout > 0)
{
/*
 * Rounding could cause connection to fail unexpectedly quickly;
 * to prevent possibly waiting hardly-at-all, insist on at least
 * two seconds.
 */
if (timeout < 2)
timeout = 2;
}
else/* negative means 0 */
timeout = 0;

With this infrastructure, there's no longer any need to discriminate
against timeout == 1 second, so we might as well reduce this to

if (timeout < 0)
timeout = 0;

as it's done elsewhere.

regards, tom lane




Re: race condition in pg_class

2024-06-12 Thread Noah Misch
On Wed, Jun 12, 2024 at 03:02:43PM +0200, Michail Nikolaev wrote:
> I am not sure, but I think that issue may be related to the issue described
> in
> https://www.postgresql.org/message-id/CANtu0ojXmqjmEzp-%3DaJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg%40mail.gmail.com
> 
> It looks like REINDEX CONCURRENTLY could interfere with ON CONFLICT UPDATE
> in some strange way.

Can you say more about the connection you see between $SUBJECT and that?  That
looks like a valid report of an important bug, but I'm not following the
potential relationship to $SUBJECT.

On your other thread, it would be useful to see stack traces from the high-CPU
processes once the live lock has ended all query completion.




Re: Contributing test cases to improve coverage

2024-06-12 Thread J F
> I am quite confused about what is the point of this.  You have not
> found any actual bug, nor have you demonstrated that this test case
> could discover a likely future bug that wouldn't be detected another
> way.  Moreover, it seems like the process would lead to some very
> large number of equally marginal test cases.  We aren't likely to
> accept such a patch, because we are concerned about keeping down the
> runtime of the test suite.
>
>regards, tom lane


The point of this project is to improve the coverage of PostgreSQL’s
preexisting test suite. Writing a test suite to achieve close to 100%
coverage is challenging, but I have proposed a workflow to automate this
process.

I assert that no test case in the regression test suite currently covers
the comparator in the expression rte->rtekind == RTE_SUBQUERY. I propose
adding a new test case that addresses exactly this. In the future, if
someone accidentally modifies the operator to become >=, it will trigger
incorrect behavior when certain queries are executed. This test case will
catch that issue.

I get that the test cases in /regress are likely reserved for actual bugs
found and are designed to run quickly. Would it be a good idea to have a
separate, more rigorous test suite that runs longer but provides better
code coverage?

Regards,
Jon


Re: race condition in pg_class

2024-06-12 Thread Michail Nikolaev
Hello!

> Can you say more about the connection you see between $SUBJECT and that?
> That looks like a valid report of an important bug, but I'm not following
> the potential relationship to $SUBJECT.

I was guided by the following logic:
* A pg_class race condition can cause table indexes to look stale.
* REINDEX updates indexes
* errors can be explained by different backends using different arbiter
indexes

> On your other thread, it would be useful to see stack traces from the
> high-CPU processes once the live lock has ended all query completion.

I'll do that.

Best regards,
Mikhail.

