Re: Possibility to disable `ALTER SYSTEM`

2024-02-12 Thread Joel Jacobson
On Sun, Feb 11, 2024, at 14:58, Robert Haas wrote:
> It's not entirely clear to me what our wider vision is here. Some
> people seem to want a whole series of flags that can disable various
> things that the superuser might otherwise be able to do,

Yes, that's what bothers me a little with the idea of a special fix for this 
special case.

On Thu, Sep 7, 2023, at 22:27, Tom Lane wrote:
> If you nonetheless feel that that's a good idea for your use case,
> you can implement the restriction with an event trigger or the like.

On Fri, Sep 15, 2023, at 11:18, Daniel Gustafsson wrote:
>> On 11 Sep 2023, at 15:50, Magnus Hagander  wrote:
>> 
>> On Sat, Sep 9, 2023 at 5:14 PM Alvaro Herrera  
>> wrote:
>>> 
>>> On 2023-Sep-08, Magnus Hagander wrote:
>>> 
 Now, it might be that you don't care at all about the *security* side
 of the feature, and only care about the convenience side. But in that
 case, the original suggestion from Tom of using an event trigger seems
 like a fine enough solution?
>>> 
>>> ALTER SYSTEM, like all system-wide commands, does not trigger event
>>> triggers.  These are per-database only.
>>> 
>>> https://www.postgresql.org/docs/16/event-trigger-matrix.html
>> 
>> Hah, didn't think of that. And yes, that's a very good point. But one
>> way to fix that would be to actually make event triggers for system
>> wide commands, which would then be useful for other things as well...
>
> Wouldn't having system wide EVTs be a generic solution which could be the
> infrastructure for this requested change as well as others in the same area?

+1

I like the wider vision of providing the infrastructure needed to
solve the general case.
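
Until such infrastructure exists, the behavior can at least be prototyped
with a ProcessUtility hook, which, unlike event triggers, does see ALTER
SYSTEM. A minimal sketch of such an extension (hypothetical code, not a
proposal):

/* Minimal sketch: block ALTER SYSTEM via ProcessUtility_hook.
 * Hypothetical extension code, not a proposal. */
#include "postgres.h"
#include "fmgr.h"
#include "nodes/parsenodes.h"
#include "tcop/utility.h"

PG_MODULE_MAGIC;

static ProcessUtility_hook_type prev_ProcessUtility = NULL;

static void
deny_alter_system(PlannedStmt *pstmt, const char *queryString,
                  bool readOnlyTree, ProcessUtilityContext context,
                  ParamListInfo params, QueryEnvironment *queryEnv,
                  DestReceiver *dest, QueryCompletion *qc)
{
    /* Reject ALTER SYSTEM before it reaches standard processing. */
    if (IsA(pstmt->utilityStmt, AlterSystemStmt))
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("ALTER SYSTEM is disabled on this server")));

    if (prev_ProcessUtility)
        prev_ProcessUtility(pstmt, queryString, readOnlyTree, context,
                            params, queryEnv, dest, qc);
    else
        standard_ProcessUtility(pstmt, queryString, readOnlyTree, context,
                                params, queryEnv, dest, qc);
}

void
_PG_init(void)
{
    prev_ProcessUtility = ProcessUtility_hook;
    ProcessUtility_hook = deny_alter_system;
}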

/Joel




Re: make dist using git archive

2024-02-12 Thread Peter Eisentraut

On 12.02.24 18:26, Tristan Partin wrote:
> On Sun Feb 11, 2024 at 5:09 PM CST, Peter Eisentraut wrote:
>> Small update: I noticed that on Windows (at least the one that is
>> running the CI job), I need to use git -c core.autocrlf=false,
>> otherwise git archive does line-ending conversion for the files it
>> puts into the archive.  With this fix, all the archives produced by
>> all the CI jobs across the different platforms match, except the
>> .tar.gz archive from the Linux job, which I suspect suffers from an
>> old git version.  We should get the Linux images updated to a newer
>> Debian version soon anyway, so I think that issue will go away.
>
> I think with this change, it is unlikely I will be able to upstream
> anything to Meson that would benefit Postgres here, since setting this
> option seems project-dependent.

Meson is vulnerable to the same problem: if the person who makes the
release has some CRLF-related git setting activated in their
environment, then that would affect the tarball.  And such a tarball
would be genuinely broken for non-Windows users, because at least some
parts of Unix systems can't process such CRLF files correctly.

(This is easy to test: run meson dist with core.autocrlf=true on the
postgresql tree on a non-Windows system.  It will fail during dist check.)






Re: Synchronizing slots from primary to standby

2024-02-12 Thread Amit Kapila
On Tue, Feb 13, 2024 at 9:38 AM Zhijie Hou (Fujitsu)
 wrote:
>
> Here is the V85_2 patch set that added the test and fixed one typo,
> there are no other code changes.
>

Few comments on the latest changes:
==
1.
+# Confirm that the invalidated slot has been dropped.
+$standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid 5/,
+ $log_offset);

Is it okay to hardcode dbid 5? I am a bit worried that it can lead to
instability in the test.

2.
+check_primary_info(WalReceiverConn *wrconn, int elevel)
+{
..
+ bool primary_info_valid;

I don't think for 0001, we need an elevel as an argument, so let's
remove it. Additionally, can we change the variable name
primary_info_valid to primary_slot_valid? Also, can we change the
function name to validate_remote_info() as the remote can be both
primary or standby?

3.
+SyncReplicationSlots(WalReceiverConn *wrconn)
+{
+ PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(wrconn));
+ {
+ check_primary_info(wrconn, ERROR);
+
+ synchronize_slots(wrconn);
+ }
+ PG_END_ENSURE_ERROR_CLEANUP(slotsync_failure_callback,
PointerGetDatum(wrconn));
+
+ walrcv_disconnect(wrconn);

It is better to disconnect in the caller where we have made the connection.

-- 
With Regards,
Amit Kapila.
diff --git a/doc/src/sgml/logicaldecoding.sgml 
b/doc/src/sgml/logicaldecoding.sgml
index bfa2a13377..a9ff6acd31 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -365,19 +365,18 @@ postgres=# select * from 
pg_logical_slot_get_changes('regression_slot', NULL, NU
 Replication Slot Synchronization
 
  The logical replication slots on the primary can be synchronized to
- the hot standby by enabling failover during slot
- creation (e.g., using the failover parameter of
+ the hot standby by using the failover parameter of
  
  pg_create_logical_replication_slot, or
  using the 
  failover option of
- CREATE SUBSCRIPTION), and then calling
+ CREATE SUBSCRIPTION during slot creation, and then 
calling
  
  pg_sync_replication_slots
  on the standby. For the synchronization to work, it is mandatory to
- have a physical replication slot between the primary and the standby 
(e.g.,
+ have a physical replication slot between the primary and the standby aka
  primary_slot_name
- should be configured on the standby), and
+ should be configured on the standby, and
  hot_standby_feedback
  must be enabled on the standby. It is also necessary to specify a valid
  dbname in the
diff --git a/src/backend/replication/logical/slotsync.c 
b/src/backend/replication/logical/slotsync.c
index feb04e1451..eb238cef92 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -304,6 +304,12 @@ reserve_wal_for_local_slot(XLogRecPtr restart_lsn)
 * number. However, if no WAL segment files have been removed 
by a
 * checkpoint since startup, we need to search for the oldest 
segment
 * file from the current timeline existing in XLOGDIR.
+*
+* XXX: Currently, we are searching for the oldest segment in 
the
+* current timeline as there is less chance of the slot's 
restart_lsn
+* from being some prior timeline, and even if it happens, in 
the worst
+* case, we will wait to sync till the slot's restart_lsn moved 
to the
+* current timeline.
 */
oldest_segno = XLogGetLastRemovedSegno() + 1;
 
@@ -685,11 +691,10 @@ synchronize_slots(WalReceiverConn *wrconn)
 }
 
 /*
- * Checks the primary server info.
+ * Checks the remote server info.
  *
- * Using the specified primary server connection, check whether we are a
- * cascading standby. It also validates primary_slot_name for non-cascading
- * standbys.
+ * We ensure that the 'primary_slot_name' exists on the remote server and
+ * the remote server is not a standby node.
  */
 static void
 check_primary_info(WalReceiverConn *wrconn, int elevel)


Re: Built-in CTYPE provider

2024-02-12 Thread Peter Eisentraut

On 13.02.24 03:01, Jeff Davis wrote:
> 1. The SQL spec mentions the capitalization of "ß" as "SS"
> specifically. Should UCS_BASIC use the unconditional mappings in
> SpecialCasing.txt? I already have some code to do that (not posted
> yet).

It is my understanding that "correct" Unicode case conversion needs to
use at least some parts of SpecialCasing.txt.  The header of the file says:

"For compatibility, the UnicodeData.txt file only contains simple case
mappings for characters where they are one-to-one and independent of
context and language. The data in this file, combined with the simple
case mappings in UnicodeData.txt, defines the full case mappings [...]"

I read this as: just using UnicodeData.txt by itself is incomplete.

I think we need to use the "Unconditional" mappings and the "Conditional
Language-Insensitive" mappings (which is just Greek sigma).  Obviously,
skip the "Language-Sensitive" mappings.
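
To illustrate what I mean (a sketch with hypothetical table and helper
names, not actual code): the full mapping would consult the unconditional
SpecialCasing entries first and fall back to the simple UnicodeData.txt
mapping:

/* Sketch only; table and helper names are hypothetical. */
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint32_t    cp;          /* source code point */
    uint32_t    upper[3];    /* full uppercase expansion, 0-terminated */
} SpecialUpper;

static const SpecialUpper special_upper[] = {
    {0x00DF, {0x0053, 0x0053, 0}},  /* U+00DF ß -> "SS" (unconditional) */
    /* ... further entries generated from SpecialCasing.txt ... */
};

extern uint32_t simple_toupper(uint32_t cp);  /* UnicodeData.txt mapping */

/* Returns the number of code points written to out[] (at most 3). */
static int
full_toupper(uint32_t cp, uint32_t *out)
{
    for (size_t i = 0; i < sizeof(special_upper) / sizeof(special_upper[0]); i++)
    {
        if (special_upper[i].cp == cp)
        {
            int         n = 0;

            while (special_upper[i].upper[n] != 0)
            {
                out[n] = special_upper[i].upper[n];
                n++;
            }
            return n;
        }
    }
    out[0] = simple_toupper(cp);
    return 1;
}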





Do away with zero-padding assumption before WALRead()

2024-02-12 Thread Bharath Rupireddy
Hi,

I noticed an assumption [1] at WALRead() call sites expecting the
flushed WAL page to be zero-padded after the flush LSN. I think this
can't always be true as the WAL can get flushed after determining the
flush LSN before reading it from the WAL file using WALRead(). I've
hacked the code up a bit to check if that's true -
https://github.com/BRupireddy2/postgres/tree/ensure_extra_read_WAL_page_is_zero_padded_at_the_end_WIP,
the tests hit the added Assert(false), which means the zero-padding
comment around WALRead() call sites isn't quite right.

I'm wondering why the WALRead() callers are always reading XLOG_BLCKSZ
despite knowing exactly how much to read. Is it to tell the OS to
explicitly fetch the whole page from the disk? If yes, the OS will do
that anyway because the page transfers from disk to OS page cache are
always in terms of disk block sizes, no?

Although there's no immediate problem with it right now, the
assumption is going to be incorrect when reading WAL from WAL buffers
using WALReadFromBuffers -
https://www.postgresql.org/message-id/CALj2ACV=C1GZT9XQRm4iN1NV1T=hla_hsgwnx2y5-g+mswd...@mail.gmail.com.

If we have no reason, can the WALRead() callers just read how much
they want like walsender for physical replication? Attached a patch
for the change.

Thoughts?

[1]
/*
 * Even though we just determined how much of the page can be validly read
 * as 'count', read the whole page anyway. It's guaranteed to be
 * zero-padded up to the page boundary if it's incomplete.
 */
if (!WALRead(state, cur_page, targetPagePtr, XLOG_BLCKSZ, tli,
 &errinfo))

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Do-away-with-zero-padding-assumption-before-WALRe.patch
Description: Binary data


Re: Collation version tracking for macOS

2024-02-12 Thread Robert Haas
On Tue, Feb 13, 2024 at 1:55 AM Jeff Davis  wrote:
> Postgres can and does latch on to the version of ICU it was compiled
> against. It's a normal shared library dependency.
>
> The problem is that databases -- and the file structures -- outlive a
> particular version of Postgres. So if Postgres 16 is compiled against
> ICU X and Postgres 17 is compiled against ICU Y, how do you upgrade
> from 16 to 17? Postgres 17 will try to access the old file structures
> using ICU Y, and they'll be corrupt.
>
> What we want is the file structures that depend on ICU X to continue to
> find ICU X even after you upgrade to Postgres 17, yet allow new
> structures to be created using ICU Y. In other words, "multi-lib",
> meaning that the same Postgres binary is linking to multiple versions
> of ICU and the different versions for different structures. That would
> allow users to recreate one index at a time to use ICU Y, until nothing
> depends on ICU X any longer.

Ah, I see. At least, I think I do. I think some of this material could
very usefully be included in the first section of the doc you're
trying to write. What you say here makes it a lot easier to grasp the
motivation and use case for this code, at least for me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Statistics Import and Export

2024-02-12 Thread Corey Huinker
>
> Also, it says "statistics are replaced" but it's not quite clear if that
> applies only to matching statistics or if all stats are deleted first
> and then the new stuff is inserted. (FWIW remove_pg_statistics clearly
> deletes all pre-existing stats).
>

All are now deleted first, both in the pg_statistic and
pg_statistic_ext_data tables. The previous version was taking a more
"replace it if we find a new value" approach, but that's overly
complicated, so following the example set by extended statistics seemed
best.



> - import_pg_statistics: I somewhat dislike that we're passing arguments
> as datum[] array - it's hard to say what the elements are expected to
> be, etc. Maybe we should expand this, to make it clear. How do we even
> know the array is large enough?
>

Completely fair. Initially that was done with the expectation that the
array would be the same for both regular stats and extended stats, but that
is no longer the case.


> - I don't quite understand why we need examine_rel_attribute. It sets a
> lot of fields in the VacAttrStats struct, but then we only use attrtypid
> and attrtypmod from it - so why bother and not simply load just these
> two fields? Or maybe I miss something.
>

I think you're right, we don't need it anymore for regular statistics. We
still need it in extended stats because statext_store() takes a subset of
the vacattrstats rows as an input.

Which leads to a side issue. We currently have 3 functions:
examine_rel_attribute and the two varieties of examine_attribute (one in
analyze.c and the other in extended stats). These are highly similar
but just different enough that I didn't feel comfortable refactoring them
into a one-size-fits-all function, and I was particularly reluctant to
modify existing code for the ANALYZE path.


>
> - examine_rel_attribute can return NULL, but get_attrinfo does not check
> for NULL and just dereferences the pointer. Surely that can lead to
> segfaults?
>

Good catch, and it highlights how little we need VacAttrStats for regular
statistics.


>
> - validate_no_duplicates and the other validate functions would deserve
> a better docs, explaining what exactly is checked (it took me a while to
> realize we check just for duplicates), what the parameters do etc.
>

Those functions are in a fairly formative phase - I expect a conversation
about what sort of validations we want to do to ensure that the statistics
being imported make sense, and under what circumstances we would forego
some of those checks.


>
> - Do we want to make the validate_ functions part of the public API? I
> realize we want to use them from multiple places (regular and extended
> stats), but maybe it'd be better to have an "internal" header file, just
> like we have extended_stats_internal?
>

I see no need to have them be a part of the public API. Will move.


>
> - I'm not sure we do "\set debug f" elsewhere. It took me a while to
> realize why the query outputs are empty ...
>

That was an experiment that arose out of the difficulty in determining
_where_ a difference was when the set-difference checks failed. So far I
like it, and I'm hoping it catches on.



>
>
> 0002
>
> - I'd rename create_stat_ext_entry to statext_create_entry.
>
> - Do we even want to include OIDs from the source server? Why not to
> just have object names and resolve those? Seems safer - if the target
> server has the OID allocated to a different object, that could lead to
> confusing / hard to detect issues.
>

The import functions would obviously never use the imported oids to look up
objects on the destination system. Rather, they're there to verify that the
local object oid matches the exported object oid, which is true in the case
of a binary upgrade.

The export format is an attempt to export the pg_statistic[_ext_data] for
that object as-is, and, as Tom suggested, let the import function do the
transformations. We can of course remove them if they truly have no purpose
for validation.


>
> - What happens if we import statistics which includes data for extended
> statistics object which does not exist on the target machine?
>

The import function takes an oid of the object (relation or extstat
object), and the json payload is supposed to be the stats for ONE
corresponding object. Data for multiple objects really doesn't fit into the
json format, and statistics exported for an object that does not exist on
the destination system would have no meaningful invocation. I envision the
dump file looking like this

CREATE TABLE public.foo ();

SELECT pg_import_rel_stats('public.foo'::regclass, <json>, option
flag, option flag);

So a call against a nonexistent object would fail on the regclass cast.


>
> - pg_import_ext_stats seems to not use require_match_oids - bug?
>

I haven't yet seen a good way to make use of matching oids in extended
stats. Checking matching operator/collation oids would make sense, but
little else.


>
>
> 0003
>
> - no SGML docs for the new tools?
>

Correct. I 

Fix incorrect PG_GETARG in pgcrypto

2024-02-12 Thread shihao zhong
Hi hackers,

I'd like to bring to your attention that I recently identified some
functions in pgcrypto that use PG_GETARG macros in a way that
doesn't match the declared SQL signatures of the functions. This patch
addresses these inconsistencies so that each argument is fetched with
the macro matching its declared type.
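
To illustrate the class of problem (a made-up example, not the actual
pgcrypto code): if the SQL declaration says an argument is text, the C
function must fetch it with the matching macro:

/* Made-up example of the bug class; not the actual pgcrypto code. */
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(example_fn);

/* SQL: CREATE FUNCTION example_fn(text) RETURNS text ... */
Datum
example_fn(PG_FUNCTION_ARGS)
{
    /* correct: the macro matches the declared argument type */
    text       *arg = PG_GETARG_TEXT_PP(0);

    /* wrong: would read the same Datum as if it were an integer */
    /* int32 arg = PG_GETARG_INT32(0); */

    PG_RETURN_TEXT_P(arg);
}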

Thanks,
Shihao


fix_getarg.patch
Description: Binary data


Re: Encoding protection for pgcrypto

2024-02-12 Thread shihao zhong
On Fri, Feb 9, 2024 at 5:34 PM cary huang  wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
>
> Hello
>
> I had a look at your patch, which applies fine to PostgreSQL master. I 
> noticed that the new regression tests you have added to test utf-8 encoding 
> fails on my setup (make check) with the following diffs:
>
> ---
> @@ -13,6 +13,7 @@
>  =ivrD
>  -END PGP MESSAGE-
>  '), '0123456789abcdefghij'), 'sha1');
> +ERROR:  invalid byte sequence for encoding "UTF8": 0x91
>  -- Database encoding protection.  Ciphertext source:
>  -- printf '\xe0\xe0\xbff' | gpg --batch --passphrase mykey --textmode 
> --armor --symmetric
>  select pgp_sym_decrypt(dearmor('
> @@ -23,5 +24,5 @@
>  =QKy4
>  -END PGP MESSAGE-
>  '), 'mykey', 'debug=1');
> +ERROR:  invalid byte sequence for encoding "UTF8": 0xe0 0xe0 0xbf
>  \quit
> -\endif
> ---
>
> I am not sure, but it seems that you intentionally provide a text input that
> would produce a non-UTF-8-compliant decrypted output, which triggers the
> error from within the "pg_verifymbstr()" call that you have added in pgp-pgsql.c?
> Are the errors expected in your new test case? If so, then the tests should
> pass instead, because it has caught an invalid encoding in the decrypted output.

Thanks for sharing that. I have updated pgp-decrypt_utf8.out in the
v2 patch, which now passes `make -C contrib/pgcrypto check`.

> Generally, I am ok with the extra encoding check after text decryption, but I do
> not think it is a good idea to just error out and abort the transaction
> when it detects an invalid encoding character. Text decryption routines may be
> used quite frequently, and users generally do not expect them to abort the
> transaction. It may be ok to just give them a warning about invalid character
> encoding.

Thanks for pointing that out. The goal of this patch is to fix the
encoding of the TEXT return value, because a PostgreSQL
TEXT value must be valid in the database encoding. So I
only added the mb-verification for the pgp_sym_decrypt_text and
pgp_pub_decrypt_text functions. If users want to decrypt
without encoding validation, they should use pgp_pub_decrypt_bytea and
pgp_sym_decrypt_bytea, because BYTEA is a binary string
in PostgreSQL.
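
The shape of the check is roughly this (a sketch of the idea, not the
exact patch code); the bytea variants simply skip the verification:

/* Sketch of the idea, not the exact patch code. */
#include "postgres.h"
#include "mb/pg_wchar.h"
#include "utils/builtins.h"

/* Validate decrypted bytes before handing them back as text. */
static text *
decrypted_bytes_to_text(const char *buf, int len)
{
    /* ereports ERROR on a byte sequence invalid in the DB encoding */
    pg_verifymbstr(buf, len, false);

    return cstring_to_text_with_len(buf, len);
}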

Please let me know if you have more questions or concerns. Thanks!

> thanks
> 
> Cary Huang
> Highgo Software - Canada
> www.highgo.ca


fix_pycrypto_v2.patch
Description: Binary data


RE: Synchronizing slots from primary to standby

2024-02-12 Thread Zhijie Hou (Fujitsu)
On Tuesday, February 13, 2024 9:16 AM Zhijie Hou (Fujitsu) 
 wrote:
> 
> Here is the new version patch which addressed above and most of Bertrand's
> comments.
> 
> TODO: trying to add one test for the case the slot is valid on primary while 
> the
> synced slots is invalidated on the standby.

Here is the V85_2 patch set that added the test and fixed one typo,
there are no other code changes.

Best Regards,
Hou zj


v85_2-0001-Add-a-slot-synchronization-function.patch
Description: v85_2-0001-Add-a-slot-synchronization-function.patch


Re: Synchronizing slots from primary to standby

2024-02-12 Thread shveta malik
On Tue, Feb 13, 2024 at 6:45 AM Zhijie Hou (Fujitsu)
 wrote:
>
> On Monday, February 12, 2024 5:40 PM Amit Kapila  
> wrote:
>
> Thanks for the comments, I have addressed them.
>
> Here is the new version patch which addressed above and
> most of Bertrand's comments.

Thanks for the patch.

I am trying to run valgrind on patch001. I followed the steps given in
[1]. It ended up generating 850 log files. Is there a way to figure
out that we have a memory-related problem without going through each log
file manually?

I also tried running the steps with '--leak-check=summary' (in the
first run, it was '--leak-check=no' as suggested in the wiki) with and
without the patch and tried comparing the outputs manually for a few of
them. I found them more or less the same. But this is a mammoth task if
we have to do it manually for 850 files. So any pointers here?

For reference:

Sample log file with '--leak-check=no'
==00:00:08:44.321 250746== HEAP SUMMARY:
==00:00:08:44.321 250746== in use at exit: 1,298,274 bytes in 290 blocks
==00:00:08:44.321 250746==   total heap usage: 11,958 allocs, 7,005
frees, 8,175,630 bytes allocated
==00:00:08:44.321 250746==
==00:00:08:44.321 250746== For a detailed leak analysis, rerun with:
--leak-check=full
==00:00:08:44.321 250746==
==00:00:08:44.321 250746== For lists of detected and suppressed
errors, rerun with: -s
==00:00:08:44.321 250746== ERROR SUMMARY: 0 errors from 0 contexts
(suppressed: 0 from 0)


Sample log file with '--leak-check=summary'
==00:00:00:27.300 265785== HEAP SUMMARY:
==00:00:00:27.300 265785== in use at exit: 1,929,907 bytes in 310 blocks
==00:00:00:27.300 265785==   total heap usage: 71,677 allocs, 7,754
frees, 95,750,897 bytes allocated
==00:00:00:27.300 265785==
==00:00:00:27.394 265785== LEAK SUMMARY:
==00:00:00:27.394 265785==definitely lost: 20,507 bytes in 171 blocks
==00:00:00:27.394 265785==indirectly lost: 16,419 bytes in 61 blocks
==00:00:00:27.394 265785==  possibly lost: 354,670 bytes in 905 blocks
==00:00:00:27.394 265785==still reachable: 592,586 bytes in 1,473 blocks
==00:00:00:27.394 265785== suppressed: 0 bytes in 0 blocks
==00:00:00:27.394 265785== Rerun with --leak-check=full to see details
of leaked memory
==00:00:00:27.394 265785==
==00:00:00:27.394 265785== For lists of detected and suppressed
errors, rerun with: -s
==00:00:00:27.394 265785== ERROR SUMMARY: 0 errors from 0 contexts
(suppressed: 0 from 0)


[1]: https://wiki.postgresql.org/wiki/Valgrind

thanks
Shveta




Re: Documentation to upgrade logical replication cluster

2024-02-12 Thread vignesh C
On Mon, 12 Feb 2024 at 14:33, vignesh C  wrote:
>
> On Fri, 9 Feb 2024 at 12:30, Peter Smith  wrote:
> >
> > Here are some review comments for patch v7-0001.
> >
> > ==
> > doc/src/sgml/glossary.sgml
> >
> > 1.
> > +  
> > +   Logical replication cluster
> > +   
> > +
> > + A set of publisher and subscriber instance with publisher instance
> > + replicating changes to the subscriber instance.
> > +
> > +   
> > +  
> >
> > 1a.
> > /instance with/instances with/
>
> Modified
>
> > ~~~
> >
> > 1b.
> > The description then made me want to look up the glossary definition
> > of a "publisher instance" and "subscriber instance", but then I was
> > quite surprised that even "Publisher" and "Subscriber" terms are not
> > described in the glossary. Should this patch add those, or should we
> > start another thread for adding them?
>
>  I felt it is better to start a new thread for this

A new patch has been posted at [1] to address this.
[1] - 
https://www.postgresql.org/message-id/CANhcyEXa%3D%2BshzbdS2iW9%3DY%3D_Eh7aRWZbQKJjDHVYiCmuiE1Okw%40mail.gmail.com

Regards,
Vignesh




Re: RFC: Logging plan of the running query

2024-02-12 Thread torikoshia

On 2024-02-12 09:00, jian he wrote:

Thanks for your comments.

> On Mon, Jan 29, 2024 at 9:02 PM torikoshia  wrote:
>>
>> Hi,
>>
>> Updated the patch to fix typos and move
>> ProcessLogQueryPlanInterruptActive from errfinish() to
>> AbortTransaction.




> +  
> +   
> +
> + pg_log_query_plan
> +
> +pg_log_query_plan ( pid integer )
> +boolean
> +   
> +   
> +Requests to log the plan of the query currently running on the
> +backend with specified process ID.
> +It will be logged at LOG message level and
> +will appear in the server log based on the log
> +configuration set (See <xref linkend="runtime-config-logging"/>
> +for more information), but will not be sent to the client
> +regardless of <xref linkend="guc-client-min-messages"/>.
> +
> +  
> it would be better to explain the meaning of return value TRUE/FALSE?


Yeah, but I've noticed that this should be located in 'Table Server
Signaling Functions', not 'Table Control Data Functions'.
Since 'Table Server Signaling Functions' describes the return code as
below, just relocating it seems fine.

  Each of these functions returns true if the signal was successfully
  sent and false if sending the signal failed.



> +# logging plan of the running query on the specified backend
> +{ oid => '8000', descr => 'log plan of the running query on the
> specified backend',
> +  proname => 'pg_log_query_plan',
> +  provolatile => 'v', prorettype => 'bool',
> you can add
> `proargnames => '{pid}'`


Hmm, pg_log_query_plan() takes only one argument, so I'm not sure how much
sense that makes.
Other functions which take one argument, such as pg_cancel_backend(), do
not have proargnames.




> + if (proc == NULL)
> + {
> + /*
> + * This is just a warning so a loop-through-resultset will not abort
> + * if one backend terminated on its own during the run.
> + */
> + ereport(WARNING,
> + (errmsg("PID %d is not a PostgreSQL backend process", pid)));
> + PG_RETURN_BOOL(false);
> + }
> +
> + be_status = pgstat_get_beentry_by_backend_id(proc->backendId);
> + if (be_status->st_backendType != B_BACKEND)
> + {
> + ereport(WARNING,
> + (errmsg("PID %d is not a PostgreSQL client backend process", pid)));
> + PG_RETURN_BOOL(false);
> + }
> +
> + if (SendProcSignal(pid, PROCSIG_LOG_QUERY_PLAN, proc->backendId) < 0)
> + {
> + /* Again, just a warning to allow loops */
> + ereport(WARNING,
> + (errmsg("could not send signal to process %d: %m", pid)));
> + PG_RETURN_BOOL(false);
> + }
>
> I found out that pg_log_query_plan's comments look like
> pg_log_backend_memory_contexts.
> pg_log_backend_memory_contexts will iterate through many memory
> contexts.
>
> but pg_log_query_plan for one specific pid will only output one plan?
> so I am a little bit confused by the comments.


These "loop" mean backend can run pg_log_query_plan() repeatedly even 
when failing sending signals.

pg_signal_backend() also have such comments.


> + /*
> + * Ensure no page lock is held on this process.
> + *
> + * If page lock is held at the time of the interrupt, we can't acquire any
> + * other heavyweight lock, which might be necessary for explaining the plan
> + * when retrieving column names.
> + *
> + * This may be overkill, but since page locks are held for a short duration
> + * we check all the LocalLock entries and when finding even one, give up
> + * logging the plan.
> + */
> + hash_seq_init(&status, GetLockMethodLocalHash());
> + while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
> + {
> + if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
> maybe it's not that self-evident; the above comments still don't explain
> why we need to ensure no page lock is held on this process?


This is for preventing an assertion error, and it no longer seems
necessary, as described in [1].

I'm going to remove it.



> In the commit message, can you add all the discussion links?
> My gmail account doesn't have a previous discussion history.


Sure.


> I am not sure this
> (https://www.postgresql.org/message-id/flat/d68c3ae31672664876b22d2dcbb526d2%40postgrespro.ru)
> is the only discussion link?


This is the original one:
https://www.postgresql.org/message-id/cf8501bcd95ba4d727cbba886ba9eea8%40oss.nttdata.com


> I found a bug:
> src8=# select *, pg_sleep(10)  from tenk1 for update;
> 2024-02-11 15:54:17.944 CST [48602] LOG:  query plan running on
> backend with PID 48602 is:
> Query Text: select *, pg_sleep(10)  from tenk1 for update;
> LockRows  (cost=0.00..570.00 rows=10000 width=254)
>   Output: unique1, unique2, two, four, ten, twenty, hundred,
> thousand, twothousand, fivethous, tenthous, odd, even, stringu1,
> stringu2, string4, (pg_sleep('10'::double precision)), ctid
>   ->  Seq Scan on public.tenk1  (cost=0.00..470.00 rows=10000
> width=254)
> Output: unique1, unique2, two, four, ten, twenty,
> hundred, thousand, twothousand, fivethous, tenthous, odd, even,
> stringu1, stringu2, string4, pg_sleep('10'::double precision), ctid
>
> another session (PID) executes `SELECT pg_log_query_plan(48602);` in
> the meantime.
> pg_log_query_plan returns true successfully, but PID 48602 was 

Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Thomas Munro
On Sat, Feb 10, 2024 at 8:36 AM Andres Freund  wrote:
> Also, yikes, that's an ugly way of doing hardware detection. Jumping out of a
> signal handler into normal code. Brrr.

Maybe it's a little baroque but what's actually wrong with it?
OpenSSL does something similar during initialisation as a fallback
(see armcap.c), and I borrowed that particular idea because I didn't
want the rat's nest of unportable strategies required to detect the
feature otherwise.  As people who ran OpenSSL-linked stuff under gdb
on 32 bit ARM discover (on e.g. early 32 bit RPis and early iOS etc
systems the instruction was missing, and the signal dropped you into
the debugger, so there are lots of reports of that if you google this
topic).  FWIW psql also long-jumps out of signal handlers, arguably
more questionably because it's an async signal and the program counter
may be inside readline...

As for Windows, let's see... SIGILL doesn't even work (the identifier
is defined for compatibility, but illegal instruction exception is not
connected up to it), not to mention that in the backend we redirect
sigaction() to our own fake co-operative signal system so this
technique wouldn't work even if SIGILL did.  But I would be surprised
if anyone actually has a platform that can run Windows that doesn't
have this instruction (if they did I think we'd see a crash with
untrapped illegal instruction exception 0xC01D, let me put that
here to help google to find this message if it ever happens...).
CRC32C is required by ARMv8.1[1], and was optional but AFAIK present
in general purpose ARMv8.0 cores including lowly RPis.  In other words
anything < 8 years old[2] has it certainly, and older stuff probably
does too if it is a 64 bit server/PC/laptop/RPi?  If you look for
reports of systems without it they seem to be pre-ARMv8, or early
phones/tablets (?)?

It looks like Windows 11 requires ARMv8.1[3] ("the Windows kernel
dropped ARMv8.0 support shortly after builds 22621 [= 2022], now
requiring and enforcing a minimum of ARMv8.1 and thus the new atomics
instructions", that atomic/lock stuff being a pretty solid reason to
want to do so as discussed on this list before...).  What if we just
assumed that the same must be effectively true in practice for Windows
10 until we hear otherwise?  If a reasonable non-hypothetical system
shows up that rejects this instruction in the short time before we
disclaim support for 10 (whose EOL is next year?), we can decide
whether to just turn off the CRC32 fast path for that OS version, or
figure out the runtime feature-probe patch.  Clearly we need to
minimise Windows magic in this project due to lack of hackers, so
writing code proactively for imaginary users seems like a bad plan...
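
For reference, the armcap.c-style probe looks roughly like this (a
POSIX-only sketch, not the actual OpenSSL or PostgreSQL code):

/* POSIX-only sketch of a SIGILL-based feature probe; not the actual
 * OpenSSL or PostgreSQL code. */
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>

static sigjmp_buf ill_jmp;

static void
ill_handler(int sig)
{
    (void) sig;
    siglongjmp(ill_jmp, 1);
}

static bool
have_crc32c(void)
{
    bool        ok = false;
    struct sigaction sa;
    struct sigaction old;

    sa.sa_handler = ill_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGILL, &sa, &old);

    if (sigsetjmp(ill_jmp, 1) == 0)
    {
        unsigned int crc = 0;

        /* Traps with SIGILL if the CPU lacks the instruction. */
        __asm__ volatile("crc32cb %w0, %w0, %w1" : "+r" (crc) : "r" (0u));
        ok = true;
    }
    sigaction(SIGILL, &old, NULL);

    return ok;
}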

[1] 
https://developer.arm.com/documentation/ddi0597/2023-12/Base-Instructions/CRC32C--CRC32C-
[2] https://en.wikipedia.org/wiki/Comparison_of_ARM_processors#ARMv8-A
[3] http://www.emulators.com/docs/abc_history_of_woa.htm




Re: Built-in CTYPE provider

2024-02-12 Thread Jeff Davis
On Wed, 2024-02-07 at 10:53 +0100, Peter Eisentraut wrote:
> Various comments are updated to include the term "character class".
> I don't recognize that as an official Unicode term.  There are
> categories and properties.  Let's check this.

It's based on
https://www.unicode.org/reports/tr18/#Compatibility_Properties

so I suppose the right name is "properties".

> Is it potentially confusing that only some pg_u_prop_* have a posix
> variant?  Would it be better for a consistent interface to have a
> "posix" argument for each and just ignore it if not used?  Not sure.

I thought about it but didn't see a clear advantage one way or another.

> About this initdb --builtin-locale option and analogous options
> elsewhere:  Maybe we should flip this around and provide a
> --libc-locale option, and have all the other providers just use the
> --locale option.  This would be more consistent with the fact that
> it's libc that is special in this context.

Would --libc-locale affect all the environment variables or just
LC_CTYPE/LC_COLLATE? How do we avoid breakage?

I like the general direction here but we might need to phase in the
option or come up with a new name. Suggestions welcome.

> Do we even need the "C" locale?  We have established that "C.UTF-8"
> is 
> useful, but if that is easily available, who would need "C"?

I don't think we should encourage its use generally but I also don't
think it will disappear any time soon. Some people will want it on
simplicity grounds. I hope fewer people will use "C" when we have a
better builtin option.

> Some changes in this patch appear to be just straight renamings, like
> in src/backend/utils/init/postinit.c and
> src/bin/pg_upgrade/t/002_pg_upgrade.pl.  Maybe those should be put
> into the previous patch instead.
> 
> On the collation naming: My expectation would have been that the 
> "C.UTF-8" locale would be exposed as the UCS_BASIC collation.

I'd like that. We have to sort out a couple things first, though:

1. The SQL spec mentions the capitalization of "ß" as "SS"
specifically. Should UCS_BASIC use the unconditional mappings in
SpecialCasing.txt? I already have some code to do that (not posted
yet).

2. Should UCS_BASIC use the "POSIX" or "Standard" variant of regex
properties? (The main difference seems to be whether symbols get
treated as punctuation or not.)

3. What do we do about potential breakage for existing users of
UCS_BASIC who might be expecting C-like behavior?

Regards,
Jeff Davis





Re: LogwrtResult contended spinlock

2024-02-12 Thread Jeff Davis
On Fri, 2022-09-23 at 10:49 +0200, Alvaro Herrera wrote:
> On 2022-Jul-28, Alvaro Herrera wrote:
> 
> > v10 is just a trivial rebase.  No changes.  Moved to next
> > commitfest.
> 
> I realized that because of commit e369f3708636 this change is no
> longer
> as critical as it used to be, so I'm withdrawing this patch from the
> commitfest.

It looks like there's some renewed interest in this patch:

https://www.postgresql.org/message-id/20240213001150.4uqzh7tinuhvo...@awork3.anarazel.de

even if there's not a pressing performance concern now, it could clean
up the complexity of having a non-shared copy of LogwrtResult.

Regards,
Jeff Davis





Re: Where can I find the doxyfile?

2024-02-12 Thread John Morris
Update:  another patch with 1) the suggested changes, 2) deletion of old html
before generating new, and 3) flex names added for the more complex regular
expressions.

Exercising the “ninja doxygen” command brought up some issues.

  1.  The generated html directory contained old as well as new pages.
The build script now deletes old pages before building new ones.
  2.  Using a line of dashes to suppress text formatting is not implemented.
Some dashed comments are harder to read, especially around code samples.
  3.  Comments at the end of “#define” are considered part of the define
and not annotated as doxygen comments.

The first issue was very confusing, so it has been fixed.
My preference is to make the other two issues part of a future enhancement. 
Thoughts?

I’ve also been scanning the generated pages looking for anomalies.

  1.  About 1/3 of pages examined.
  2.  In most cases, the filter works as expected.
  3.  Many places are missing information, so the output has blank fields (as 
expected).
  4.  A few places have obvious nonsense. (eg. a group comment applied to one 
element)
  5.  No situations where the output would be misleading.
  6.  In all cases, the source code is “a click away”.

While I had planned to look at *every* page, I’ll probably stop at the 1/3 
sample – unless someone wants to help scan through the pages with me.

I also heard back from Jetbrains about incorporating custom Doxyfiles. They do 
their own rendering and do not invoke the doxygen command. Custom Doxyfiles are 
not going to happen. (It’s probably the same for VS.)

On my Mac M1, generating doxygen takes about 20 seconds. When graphs are added, 
it takes 5 minutes.




doxygen_v5.patch
Description: doxygen_v5.patch


Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Jeff Davis
On Mon, 2024-02-12 at 15:36 -0800, Andres Freund wrote:
> 
> It doesn't really seem like a necessary, or even particularly useful,
> part. You couldn't just call WALRead() for that, since the caller
> would need
> to know the range up to which WAL is valid but not yet flushed as
> well. Thus
> the caller would need to first use WaitXLogInsertionsToFinish() or
> something
> like it anyway - and then there's no point in doing the WALRead()
> anymore.

I follow until the last part. Did you mean "and then there's no point
in doing the WaitXLogInsertionsToFinish() in WALReadFromBuffers()
anymore"?

For now, should I assert that the requested WAL data is before the
Flush pointer or assert that it's before the Write pointer?

> Note that for replicating unflushed data, we *still* might need to
> fall back
> to reading WAL data from disk. In which case not asserting in
> WALRead() would
> just make it hard to find bugs, because not using
> WaitXLogInsertionsToFinish()
> would appear to work as long as data is in wal buffers, but as soon
> as we'd
> fall back to on-disk (but unflushed) data, we'd send bogus WAL.

That makes me wonder whether my previous idea[1] might matter: when
some buffers have been evicted, should WALReadFromBuffers() keep going
through the loop and return the end portion of the requested data
rather than the beginning?

We can sort that out when we get closer to replicating unflushed WAL.

Regards,
Jeff Davis

[1] 
https://www.postgresql.org/message-id/2b36bf99e762e65db0dafbf8d338756cf5fa6ece.ca...@j-davis.com




RE: Synchronizing slots from primary to standby

2024-02-12 Thread Zhijie Hou (Fujitsu)
On Monday, February 12, 2024 5:40 PM Amit Kapila  
wrote:
> 
> On Sun, Feb 11, 2024 at 6:53 PM Zhijie Hou (Fujitsu) 
> wrote:
> >
> > Agreed. Here is the V84 patch which addressed this.
> >
> 
> Few comments:
> =
> 1. Isn't the new function (pg_sync_replication_slots()) allowed to sync the 
> slots
> from physical standby to another cascading standby?
> Won't it be better to simply disallow syncing slots on cascading standby to 
> keep
> it consistent with slotsync worker behavior?
> 
> 2.
> Previously, I commented to keep the declaration and definition of functions in
> the same order but I see that it still doesn't match in the below case:
> 
> @@ -44,6 +46,7 @@ extern void WalSndWakeup(bool physical, bool logical);
> extern void WalSndInitStopping(void); extern void WalSndWaitStopping(void);
> extern void HandleWalSndInitStopping(void);
> +extern XLogRecPtr GetStandbyFlushRecPtr(TimeLineID *tli);
> extern void WalSndRqstFileReload(void);
> 
> I think we can keep the new declaration just before WalSndSignals().
> That would be more consistent.
> 
> 3.
> +  
> +  True if this is a logical slot that was synced from a primary server.
> +  
> +  
> +   On a hot standby, the slots with the synced column marked as true can
> +   neither be used for logical decoding nor dropped by the user.
> + The value
> 
> I don't think we need a separate para here.
> 
> Apart from this, I have made several cosmetic changes in the attached.
> Please include these in the next version unless you see any problems.

Thanks for the comments, I have addressed them.

Here is the new version patch which addressed above and
most of Bertrand's comments.

TODO: trying to add one test for the case the slot is valid on
primary while the synced slots is invalidated on the standby.

Best Regards,
Houzj


v85-0001-Add-a-slot-synchronization-function.patch
Description: v85-0001-Add-a-slot-synchronization-function.patch


RE: Synchronizing slots from primary to standby

2024-02-12 Thread Zhijie Hou (Fujitsu)
On Monday, February 12, 2024 6:03 PM Bertrand Drouvot 
 wrote:
> 
> Hi,
> 
> On Sun, Feb 11, 2024 at 01:23:19PM +, Zhijie Hou (Fujitsu) wrote:
> > On Saturday, February 10, 2024 9:10 PM Amit Kapila
>  wrote:
> > >
> > > On Sat, Feb 10, 2024 at 5:31 PM Masahiko Sawada
> > > 
> > > wrote:
> > > >
> > > > On Fri, Feb 9, 2024 at 4:08 PM Zhijie Hou (Fujitsu)
> > > >  wrote:
> > > >
> > > > > Another alternative is to register the callback when calling
> > > > > slotsync functions and unregister it after the function call.
> > > > > And register the callback in
> > > > > slotsyncworkmain() for the slotsync worker patch, although this
> > > > > may adds a few more codes.
> > > >
> > > > Another idea is that SyncReplicationSlots() calls
> > > > synchronize_slots() in PG_ENSURE_ERROR_CLEANUP() block instead of
> > > > PG_TRY(), to make sure to clear the flag in case of ERROR or
> > > > FATAL. And the slotsync worker uses the before_shmem_callback to clear
> the flag.
> > > >
> > >
> > > +1. This sounds like a better way to clear the flag.
> >
> > Agreed. Here is the V84 patch which addressed this.
> >
> > Apart from above, I removed the txn start/end codes from 0001 as they
> > are used in the slotsync worker patch. And I also ran pgindent and
> > pgperltidy for the patch.
> >
> 
> Thanks!
> 
> A few random comments:

Thanks for the comments.

> 
> 001 ===
> 
> "
>  For
>  the synchronization to work, it is mandatory to have a physical  replication 
> slot
> between the primary and the standby, "
> 
> Maybe mention "primary_slot_name" here?

Added.

> 
> 002 ===
> 
> +   
> +Synchronize the logical failover slots from the primary server to the
> standby server.
> 
> should we say "logical failover replication slots" instead?

Changed.

> 
> 003 ===
> 
> +  If, after executing the function,
> +  
> +  hot_standby_feedback is disabled
> on
> +  the standby or the physical slot configured in
> +  
> +  primary_slot_name is
> +  removed,
> 
> I think another option that could lead to slot invalidation is if 
> primary_slot_name
> is NULL or miss-configured. Indeed hot_standby_feedback would be working
> (for the catalog_xmin) but only as long as the standby is up and running.

I didn't change this based on the discussion.

> 
> 004 ===
> 
> + on the standby. For the synchronization to work, it is mandatory to
> + have a physical replication slot between the primary and the
> + standby,
> 
> should we mention primary_slot_name here?

Added.

> 
> 005 ===
> 
> + To resume logical replication after failover from the synced logical
> + slots, the subscription's 'conninfo' must be altered
> 
> Only in a pub/sub context but not for other ways of using the logical 
> replication
> slot(s).

I am not very sure about this, because third-party logical replication
consumers can also have their own replication origin, so I didn't change
it for now, but will think this over.

> 
> 006 ===
> 
> +   neither be used for logical decoding nor dropped by the user
> 
> what about "nor dropped manually"?

Changed.

> 
> 007 ===
> 
> +typedef struct SlotSyncCtxStruct
> +{
> 
> Should we remove "Struct" from the struct name?

The name was chosen based on another comment, to be consistent
with LogicalReplCtxStruct, so I didn't change this.
If others also prefer it without "Struct", we can change it later.

> 008 ===
> 
> +   ereport(LOG,
> +   errmsg("dropped replication slot
> \"%s\" of dbid %d",
> +
> NameStr(local_slot->data.name),
> +
> + local_slot->data.database));
> 
> We emit a message when an "invalidated" slot is dropped but not when we
> create a slot. Shouldn't we emit a message when we create a synced slot on the
> standby?
> 
> I think that could be confusing to see "a drop" message not followed by "a
> create"
> one when it's expected (slot valid on the primary for example).

I think we will report "sync-ready" for newly synced slots. For newly
created temporary slots, I am not sure we need to log anything, because
they will be dropped on promotion anyway. But if others also prefer to
log them, I am fine with that.

> 
> 009 ===
> 
> Regarding 040_standby_failover_slots_sync.pl what about adding tests for?
> 
> - synced slot invalidation (and ensure it's recreated once
> pg_sync_replication_slots() is called and when the slot in primary is valid)

Will try this in next version.

> - cannot enable failover for a temporary replication slot

Added.

> - replication slots can only be synchronized from a standby server

Added.

Best Regards,
Hou zj



Re: RFC: Logging plan of the running query

2024-02-12 Thread torikoshia

On 2024-02-07 19:14, torikoshia wrote:

> On 2024-02-07 13:58, Ashutosh Bapat wrote:
>>
>> The prologue refers to a very populated
>> lock hash table. I think that will happen if thousands of tables are
>> queried in a single query OR a query runs on a partitioned table with
>> thousands of partitions. May be we want to try that scenario.
>
> OK, I'll try such cases.

I measured this using partitioned pgbench_accounts with some
modification to v36[1].
The results[2] show that CPU time increases in proportion to the number
of partitions, and the increase is not that large.

However, I've noticed that this ensuring-no-page-lock logic is not
necessary anymore, since cc32ec24fdf3b98 removed the assertion which
caused an error[3].


  $ git show cc32ec24fdf3b98
  ..
  diff --git a/src/backend/storage/lmgr/lock.c 
b/src/backend/storage/lmgr/lock.c

  index 0a692ee0a6..f595bce31b 100644
  --- a/src/backend/storage/lmgr/lock.c
  +++ b/src/backend/storage/lmgr/lock.c
  @@ -186,18 +186,6 @@ static int FastPathLocalUseCount = 0;
*/
 static bool IsRelationExtensionLockHeld PG_USED_FOR_ASSERTS_ONLY = 
false;


  -   /*
  -* We don't acquire any other heavyweight lock while holding 
the page lock

  -* except for relation extension.
  -*/
  -   Assert(!IsPageLockHeld ||
  -  (locktag->locktag_type == 
LOCKTAG_RELATION_EXTEND));


I'm going to remove ensuring no page lock logic after some testings.


[1]
$ git diff _submission/log_running_query-v36
+#include "time.h"
+
 bool ProcessLogQueryPlanInterruptActive = false;

 /* Hook for plugins to get control in ExplainOneQuery() */
@@ -5258,6 +5260,10 @@ ProcessLogQueryPlanInterrupt(void)
MemoryContext old_cxt;
LogQueryPlanPending = false;

+   clock_t start, end;
+   double cpu_time_used;
+   int num_hash_entry = 0;
+
/* Cannot re-enter. */
if (ProcessLogQueryPlanInterruptActive)
return;
@@ -5287,9 +5293,11 @@ ProcessLogQueryPlanInterrupt(void)
 * we check all the LocalLock entries and when finding even one, 
give up

 * logging the plan.
 */
+   start = clock();
hash_seq_init(&status, GetLockMethodLocalHash());
while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
{
+   num_hash_entry++;
if (LOCALLOCK_LOCKTAG(*locallock) == LOCKTAG_PAGE)
{
ereport(LOG_SERVER_ONLY,
@@ -5301,6 +5309,12 @@ ProcessLogQueryPlanInterrupt(void)
return;
}
}
+   end = clock();
+   cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
+
+   ereport(LOG,
+   errmsg("locallock entry search took: %f for %d entries", 
cpu_time_used, num_hash_entry));


[2]
# partition number: 512
locallock entry search took: 0.000029 for 1026 entries
locallock entry search took: 0.000030 for 1026 entries
locallock entry search took: 0.000036 for 1026 entries

# partition number: 1024
locallock entry search took: 0.000070 for 2050 entries
locallock entry search took: 0.000059 for 2050 entries
locallock entry search took: 0.000049 for 2050 entries

# partition number: 2048
locallock entry search took: 0.000100 for 4098 entries
locallock entry search took: 0.000103 for 4098 entries
locallock entry search took: 0.000101 for 4098 entries

# partition number: 4096
locallock entry search took: 0.000197 for 8194 entries
locallock entry search took: 0.000193 for 8194 entries
locallock entry search took: 0.000192 for 8194 entries

[3] 
https://www.postgresql.org/message-id/0642712f-1298-960a-a3ba-e256d56040ac%40oss.nttdata.com



--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 15:56:19 -0800, Jeff Davis wrote:
> On Mon, 2024-02-12 at 11:33 -0800, Jeff Davis wrote:
> > For 0002 & 0003, I'd like more clarity on how they will actually be
> > used by an extension.
>
> In patch 0002, I'm concerned about calling
> WaitXLogInsertionsToFinish(). It loops through all the locks, but
> doesn't have any early return path or advance any state.

I doubt it'd be too bad - we call that at a much, much higher frequency during
write-heavy OLTP workloads (c.f. XLogFlush()).  It can be a performance issue
there, but only after increasing NUM_XLOGINSERT_LOCKS - before that the
limited number of writers is the limit.  Compared to that walsender shouldn't
be a significant factor.

However, I think it's a very bad idea to call WaitXLogInsertionsToFinish()
from within WALReadFromBuffers(). This needs to be at the caller, not down in
WALReadFromBuffers().

I don't see why we would want to weaken the error condition in
WaitXLogInsertionsToFinish() - I suspect it'd not work correctly to wait for
insertions that aren't yet in progress and it just seems like an API misuse.


> So if it's repeatedly called with the same or similar values it seems like
> it would be doing a lot of extra work.
>
> I'm not sure of the best fix. We could add something to LogwrtResult to
> track a new LSN that represents the highest known point where all
> inserters are finished (in other words, the latest return value of
> WaitXLogInsertionsToFinish()). That seems invasive, though.

FWIW, I think LogwrtResult is an anti-pattern, perhaps introduced due to
misunderstanding how cache coherency works. It's not fundamentally faster to
access non-shared memory. It'd make far more sense to allow lock-free access
to the shared LogwrtResult and

Greetings,

Andres Freund




Re: POC, WIP: OR-clause support for indexes

2024-02-12 Thread jian he
On Thu, Feb 8, 2024 at 1:34 PM Andrei Lepikhov
 wrote:
> A couple of questions:
> 1. As I see, transformAExprIn uses the same logic as we invented but
> allows composite and domain types. Could you add a comment explaining
> why we forbid row types in general, in contrast to the transformAExprIn
> routine?
> 2. Could you provide the tests to check issues covered by the recent (in
> v.15) changes?
>
> Patch 0001-* in the attachment incorporates changes induced by Jian's
> notes from [1].
> Patch 0002-* contains a transformation of the SAOP clause, which allows
> the optimizer to utilize partial indexes if they cover all values in
> this array. Also, it is an answer to Alexander's note [2] on performance
> degradation. This first version may be a bit raw, but I need your
> opinion: Does it resolve the issue?
>

+ newa = makeNode(ArrayExpr);
+ /* array_collid will be set by parse_collate.c */
+ newa->element_typeid = scalar_type;
+ newa->array_typeid = array_type;
+ newa->multidims = false;
+ newa->elements = aexprs;
+ newa->location = -1;

I am confused by the comment `array_collid will be set by
parse_collate.c`; can you explain it further?
If the OR expression's right arm is not a plain Const but has a collation
specification, e.g.
`where a  = 'a' collate "C" or a = 'b' collate "C";`

then the rightop is not a Const but a CollateExpr, and it will not be
used in the transformation.
-
Maybe the previous thread mentioned it, but this thread is very long.
After applying
v16-0001-Transform-OR-clause-to-ANY-expressions.patch
and 0002-Teach-generate_bitmap_or_paths-to-build-BitmapOr-pat-20240212.patch,
I found a performance degradation case:

drop table if exists test;
create table test as (select (random()*100)::int x, (random()*1000) y
from generate_series(1,1000000) i);
vacuum analyze test;

set enable_or_transformation to off;
explain(timing off, analyze, costs off)
select * from test where (x = 1 or x = 2 or x = 3 or x = 4 or x = 5 or
x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
50.887 ms

set enable_or_transformation to on;
explain(timing off, analyze, costs off)
select * from test where (x = 1 or x = 2 or x = 3 or x = 4 or x = 5 or
x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
92.001 ms

-
but for aggregate count(*), it indeed increased the performance:

set enable_or_transformation to off;
explain(timing off, analyze, costs off)
select count(*) from test where (x = 1 or x = 2 or x = 3 or x = 4 or x
= 5 or x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
46.818 ms

set enable_or_transformation to on;
explain(timing off, analyze, costs off)
select count(*) from test where (x = 1 or x = 2 or x = 3 or x = 4 or x
= 5 or x = 6 or x = 7 or x = 8 or x = 9 ) \watch i=0.1 c=10
35.376 ms

The time is the last result of the 10 iterations.




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Jeff Davis
On Mon, 2024-02-12 at 11:33 -0800, Jeff Davis wrote:
> For 0002 & 0003, I'd like more clarity on how they will actually be
> used by an extension.

In patch 0002, I'm concerned about calling
WaitXLogInsertionsToFinish(). It loops through all the locks, but
doesn't have any early return path or advance any state.

So if it's repeatedly called with the same or similar values it seems
like it would be doing a lot of extra work.

I'm not sure of the best fix. We could add something to LogwrtResult to
track a new LSN that represents the highest known point where all
inserters are finished (in other words, the latest return value of
WaitXLogInsertionsToFinish()). That seems invasive, though.

Regards,
Jeff Davis





Re: glibc qsort() vulnerability

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 17:04:23 -0600, Nathan Bossart wrote:
> On Mon, Feb 12, 2024 at 01:31:30PM -0800, Andres Freund wrote:
> > One thing that's worth checking is if this ends up with *worse* code when 
> > the
> > comparators are inlined. I think none of the changed comparators will end up
> > getting used with an inlined sort, but ...
> 
> Yeah, AFAICT the only inlined sorts are in tuplesort.c and bufmgr.c, and
> the patches don't touch those files.
> 
> > The reason we could end up with worse code is that when inlining the
> > comparisons would make less sense for the compiler. Consider e.g.
> > return DO_COMPARE(a, b) < 0 ?
> > (DO_COMPARE(b, c) < 0 ? b : (DO_COMPARE(a, c) < 0 ? c : a))
> > : (DO_COMPARE(b, c) > 0 ? b : (DO_COMPARE(a, c) < 0 ? a : c));
> > 
> > With a naive implementation the compiler will understand it only cares about
> > a < b, not about the other possibilities. I'm not sure that's still true 
> > with
> > the more complicated optimized version.
> 
> You aren't kidding [0].  Besides perhaps adding a comment in
> sort_template.h, is there anything else you think we should do about this
> now?

I'd also add a comment to the new functions. I think it's fine otherwise. I
wish there were a formulation that'd be optimal for both cases, but this way we
can at least adapt all places at once if we either find a better formulation or
change all our sorts to happen via an inline implementation of qsort or such.

Greetings,

Andres




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 12:46:00 -0800, Jeff Davis wrote:
> On Mon, 2024-02-12 at 12:18 -0800, Andres Freund wrote:
> > +    upto = Min(startptr + count, LogwrtResult.Write);
> > +    nbytes = upto - startptr;
> >
> > Shouldn't it pretty much be a bug to ever encounter this?
>
> In the current code it's impossible, though Bharath hinted at an
> extension which could reach that path.
>
> What I committed was a bit of a compromise -- earlier versions of the
> patch supported reading right up to the Insert pointer (which requires
> a call to WaitXLogInsertionsToFinish()). I wasn't ready to commit that
> code without seeing a more about how that would be used, but I thought
> it was reasonable to have some simple code in there to allow reading up
> to the Write pointer.

I doubt there's a sane way to use WALRead() without *first* ensuring that the
range of data is valid. I think we're better of moving that responsibility
explicitly to the caller and adding an assertion verifying that.


> It seems closer to the structure that we will ultimately need to
> replicate unflushed data, right?

It doesn't really seem like a necessary, or even particularly useful,
part. You couldn't just call WALRead() for that, since the caller would need
to know the range up to which WAL is valid but not yet flushed as well. Thus
the caller would need to first use WaitXLogInsertionsToFinish() or something
like it anyway - and then there's no point in doing the WALRead() anymore.

Note that for replicating unflushed data, we *still* might need to fall back
to reading WAL data from disk. In which case not asserting in WALRead() would
just make it hard to find bugs, because not using WaitXLogInsertionsToFinish()
would appear to work as long as data is in wal buffers, but as soon as we'd
fall back to on-disk (but unflushed) data, we'd send bogus WAL.

Greetings,

Andres Freund




Re: Reducing output size of nodeToString

2024-02-12 Thread Matthias van de Meent
On Mon, 12 Feb 2024 at 20:32, Matthias van de Meent
 wrote:
>
> On Mon, 12 Feb 2024 at 19:03, Matthias van de Meent
>  wrote:
> > Attached is patchset v2, which contains the improvements from these patches:
>
> Attached v3, which fixes an out-of-bounds read in pg_strtoken_next,
> detected by asan, that was a likely cause of the problems in CFBot's
> FreeBSD regression tests.

Apparently that was caused by issues in my updated bitmapset
serializer, where I used bms_next_member(..., x=0) for the first
iteration, thus skipping the first bit. This didn't show up earlier
because that bit is not exercised in PG's builtin views, but it is
exercised when WRITE_READ_PARSE_PLAN_TREES is defined (as on the
FreeBSD CI job).

Trivial fix in the attached v4 of the patchset, with some fixes for
other assertions that'd get some exercise in non-pg_node_tree paths in
the WRITE_READ configuration.
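For reference, the correct iteration idiom starts at -1, since
bms_next_member() returns members strictly greater than its second
argument:

	int			x = -1;

	while ((x = bms_next_member(bms, x)) >= 0)
	{
		/* starting from -1 visits bit 0 as well */
	}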


v4-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch
Description: Binary data


v4-0002-pg_node_tree-Don-t-store-query-text-locations-in-.patch
Description: Binary data


v4-0005-nodeToString-omit-serializing-NULL-datums-in-Cons.patch
Description: Binary data


v4-0004-gen_node_support.pl-Add-a-TypMod-type-for-signall.patch
Description: Binary data


v4-0003-gen_node_support.pl-Mark-location-fields-as-type-.patch
Description: Binary data


v4-0007-gen_node_support.pl-Optimize-serialization-of-fie.patch
Description: Binary data


v4-0006-nodeToString-Apply-RLE-on-Bitmapset-and-numeric-L.patch
Description: Binary data


Re: glibc qsort() vulnerability

2024-02-12 Thread Nathan Bossart
On Mon, Feb 12, 2024 at 01:31:30PM -0800, Andres Freund wrote:
> One thing that's worth checking is if this ends up with *worse* code when the
> comparators are inlined. I think none of the changed comparators will end up
> getting used with an inlined sort, but ...

Yeah, AFAICT the only inlined sorts are in tuplesort.c and bufmgr.c, and
the patches don't touch those files.

> The reason we could end up with worse code is that when inlining the
> comparisons would make less sense for the compiler. Consider e.g.
>   return DO_COMPARE(a, b) < 0 ?
>   (DO_COMPARE(b, c) < 0 ? b : (DO_COMPARE(a, c) < 0 ? c : a))
>   : (DO_COMPARE(b, c) > 0 ? b : (DO_COMPARE(a, c) < 0 ? a : c));
> 
> With a naive implementation the compiler will understand it only cares about
> a < b, not about the other possibilities. I'm not sure that's still true with
> the more complicated optimized version.

You aren't kidding [0].  Besides perhaps adding a comment in
sort_template.h, is there anything else you think we should do about this
now?

[0] https://godbolt.org/z/bbTqK54zK

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: ALTER TYPE OWNER fails to recurse to multirange

2024-02-12 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jan 16, 2024 at 11:46 AM Tom Lane  wrote:
>> Also, we already
>> treat the multirange as dependent for some things:

> But this seems like an entirely valid point.

Yeah, it's a bit of a muddle.  But there is no syntax for making
a standalone multirange type, so it seems to me that we've mostly
determined that multiranges are dependent types.  There are just
a few places that didn't get the word.

Attached is a proposed patch to enforce that ownership and permissions
of a multirange are those of the underlying range type, in ways
parallel to how we treat array types.  This is all that I found by
looking for calls to IsTrueArrayType().  It's possible that there's
some dependent-type behavior somewhere that isn't conditioned on that,
but I can't think of a good way to search.

If we don't do this, then we need some hacking in pg_dump to get it
to save and restore these properties of multiranges, so it's not like
the status quo is acceptable.

I'd initially thought that perhaps we could back-patch parts of this,
but now I'm not sure; it seems like these behaviors are a bit
intertwined.  Given the lack of field complaints I'm inclined to leave
things alone in the back branches.

regards, tom lane

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 590affb79a..591e767cce 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -2447,11 +2447,17 @@ ExecGrant_Type_check(InternalGrant *istmt, HeapTuple tuple)
 
 	pg_type_tuple = (Form_pg_type) GETSTRUCT(tuple);
 
+	/* Disallow GRANT on dependent types */
 	if (IsTrueArrayType(pg_type_tuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_GRANT_OPERATION),
 				 errmsg("cannot set privileges of array types"),
 				 errhint("Set the privileges of the element type instead.")));
+	if (pg_type_tuple->typtype == TYPTYPE_MULTIRANGE)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_GRANT_OPERATION),
+				 errmsg("cannot set privileges of multirange types"),
+				 errhint("Set the privileges of the range type instead.")));
 
 	/* Used GRANT DOMAIN on a non-domain? */
 	if (istmt->objtype == OBJECT_DOMAIN &&
@@ -3806,6 +3812,35 @@ pg_type_aclmask_ext(Oid type_oid, Oid roleid, AclMode mask, AclMaskHow how,
 		typeForm = (Form_pg_type) GETSTRUCT(tuple);
 	}
 
+	/*
+	 * Likewise, multirange types don't manage their own permissions; consult
+	 * the associated range type.  (Note we must do this after the array step
+	 * to get the right answer for arrays of multiranges.)
+	 */
+	if (typeForm->typtype == TYPTYPE_MULTIRANGE)
+	{
+		Oid			rangetype = get_multirange_range(type_oid);
+
+		ReleaseSysCache(tuple);
+
+		tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(rangetype));
+		if (!HeapTupleIsValid(tuple))
+		{
+			if (is_missing != NULL)
+			{
+				/* return "no privileges" instead of throwing an error */
+				*is_missing = true;
+				return 0;
+			}
+			else
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_OBJECT),
+						 errmsg("type with OID %u does not exist",
+								rangetype)));
+		}
+		typeForm = (Form_pg_type) GETSTRUCT(tuple);
+	}
+
 	/*
 	 * Now get the type's owner and ACL from the tuple
 	 */
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
index d7167108fb..fe47be38d0 100644
--- a/src/backend/catalog/pg_type.c
+++ b/src/backend/catalog/pg_type.c
@@ -326,14 +326,15 @@ TypeCreate(Oid newTypeOid,
  errmsg("fixed-size types must have storage PLAIN")));
 
 	/*
-	 * This is a dependent type if it's an implicitly-created array type, or
-	 * if it's a relation rowtype that's not a composite type.  For such types
-	 * we'll leave the ACL empty, and we'll skip creating some dependency
-	 * records because there will be a dependency already through the
-	 * depended-on type or relation.  (Caution: this is closely intertwined
-	 * with some behavior in GenerateTypeDependencies.)
+	 * This is a dependent type if it's an implicitly-created array type or
+	 * multirange type, or if it's a relation rowtype that's not a composite
+	 * type.  For such types we'll leave the ACL empty, and we'll skip
+	 * creating some dependency records because there will be a dependency
+	 * already through the depended-on type or relation.  (Caution: this is
+	 * closely intertwined with some behavior in GenerateTypeDependencies.)
 	 */
 	isDependentType = isImplicitArray ||
+		typeType == TYPTYPE_MULTIRANGE ||
 		(OidIsValid(relationOid) && relationKind != RELKIND_COMPOSITE_TYPE);
 
 	/*
@@ -534,8 +535,9 @@ TypeCreate(Oid newTypeOid,
  * relationKind and isImplicitArray are likewise somewhat expensive to deduce
  * from the tuple, so we make callers pass those (they're not optional).
  *
- * isDependentType is true if this is an implicit array or relation rowtype;
- * that means it doesn't need its own dependencies on owner etc.
+ * isDependentType is true if this is an implicit array, multirange, or
+ * relation rowtype; that means it doesn't need its own dependencies on
+ * owner etc.

Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 16:46:55 -0500, Tom Lane wrote:
> Peter Eisentraut  writes:
> > Approaches like that as well as the in-tree pgrminclude work by "I
> > removed the #include and it still compiled fine", which can be
> > unreliable.  IWYU on the other hand has the compiler tracking where a
> > symbol actually came from, and so if it says that an #include is not
> > used, then it's pretty much correct by definition.
>
> Well, it might be correct by definition for the version of the code
> that the compiler processed.  But it sounds to me like it's just as
> vulnerable as pgrminclude to taking out #includes that are needed
> only by #ifdef'd code sections that you didn't compile.

I think pgrminclude is far worse than IWYU - it *maximizes* reliance on
indirect includes, the opposite of what iwyu does. I share concerns about
removing includes for platform/config option specific code, but I think that
it'd not take too many annotations to address that.

I think the proposed patch shows some good changes that are painful to do by
hand, but easy with iwyu, like replacing builtins.h with fmgrprotos.h,
replacing includes of heapam.h/heap.h with table.h etc where appropriate.

While I can see applying some targeted changes without more work, I don't
really see much point in applying a lot of the other removals without actually
committing to adding the necessary IWYU annotations to our code to make iwyu
actually usable.

Greetings,

Andres Freund




[Patch] add multiple client certificate selection feature

2024-02-12 Thread Cary Huang
Hello 



I would like to share a patch that adds a feature to libpq to automatically 
select the best client certificate to send to the server (if it requests one). 
This feature is inspired by this email discussion years ago: 
https://www.postgresql.org/message-id/200905081539.n48Fdl2Y003286%40no.baka.org.
This feature is useful if a libpq client needs to communicate with multiple 
TLS-enabled PostgreSQL servers with different TLS certificate setups. Instead 
of letting the application figure out the right certificate for the right 
server, the patch allows the libpq library itself to pick the most ideal client 
certificate to send to the server.



Currently, we rely on options “sslcert” and “sslkey” parameters on the client 
side to select a client certificate + private key to send to the server, the 
patch adds 2 new options. “sslcertdir” and “sslkeydir” to specify directories 
where all possible certificate and private key files are stored. The new 
options cannot be used with “sslcert” and “sslkey” at the same time.
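For illustration, a connection using the new options might look like this
(a sketch assuming the patch is applied; the host name and paths are made
up):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/* sslcertdir/sslkeydir are the options proposed in this patch */
	PGconn	   *conn = PQconnectdb(
		"host=db.example.com dbname=postgres sslmode=verify-full "
		"sslcertdir=/etc/postgresql/certs sslkeydir=/etc/postgresql/keys");

	if (PQstatus(conn) != CONNECTION_OK)
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
	PQfinish(conn);
	return 0;
}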



The most ideal certificate selection is based on the trusted CA names sent by 
the server in “Certificate Request” handshake message; obtained by the client 
making a call to “SSL_get0_peer_CA_list()” function. This list of trusted CA 
names tells the client the list of “issuers” that this server can trust. Inside 
“sslcertdir”, if a client certificate candidate’s issuer name equals one of 
the trusted CA names, then that is the certificate to use. Once a candidate 
certificate is identified, the patch will then look for a matching private key 
in “sslkeydir”. These actions are performed in certificate callback function 
(cert_cb), which gets called when server requests a client certificate during 
TLS handshake.



This patch requires OpenSSL version 1.1.1 or later to work. The feature will be 
disabled with older OpenSSL versions. Attached is a POC patch containing the 
described feature.



Limitations:



One limitation of this feature is that it does not quite support the case where 
multiple private key files inside “sslkeydir” are encrypted with different 
passwords. When the client wants to find a matching private key from 
“sslkeydir”, it will always use the same password supplied by the client (via the 
“sslpassword” option) to decrypt the private key it tries to access.





Also, no TAP tests have been added to the patch to test this feature yet. So, 
to test this feature, we will need to prepare the environment manually:



1. generate 2 root CA certificates (ca1 and ca2), which sign 2 sets of client 
and server certificates.

2. configure the server to use a server certificate signed by either ca1 or ca2.

3. put all client certificates and private keys (signed by both ca1 and ca2) 
into a directory (we will point "sslcertdir" and "sslkeydir" to this directory)

4. based on the root CA certificate configured at the server side, the client 
will pick the certificate that the server can trust from the specified "sslcertdir" 
and "sslkeydir" directories



Please let me know what you think. Any comments / feedback are greatly 
appreciated.



Best regards




Cary Huang

Highgo Software (Canada)

www.highgo.ca

v1-0001-multiple_client_certificate_selection_support.patch
Description: Binary data


Re: Psql meta-command conninfo+

2024-02-12 Thread Pavel Luzanov

Hi,

On 12.02.2024 17:16, Maiquel Grassi wrote:


The "if (db == NULL)" has been removed, as well as the message inside 
it. To always maintain the same error messages, I changed the 
validation of "\conninfo" to match the validation used for 
"\conninfo+". Therefore, now "\conninfo" and "\conninfo+" return the 
same error when "pg_terminate_backend()" terminates this session 
through another session.




This makes sense to me. Thank you for this work.

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com


Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Tom Lane
Peter Eisentraut  writes:
> Approaches like that as well as the in-tree pgrminclude work by "I 
> removed the #include and it still compiled fine", which can be 
> unreliable.  IWYU on the other hand has the compiler tracking where a 
> symbol actually came from, and so if it says that an #include is not 
> used, then it's pretty much correct by definition.

Well, it might be correct by definition for the version of the code
that the compiler processed.  But it sounds to me like it's just as
vulnerable as pgrminclude to taking out #includes that are needed
only by #ifdef'd code sections that you didn't compile.

On the whole, our experience with automated #include removal is
pretty awful.  I'm not sure I want to go there again.

regards, tom lane




Re: Fix a typo in pg_rotate_logfile

2024-02-12 Thread Bharath Rupireddy
On Tue, Feb 13, 2024 at 2:29 AM Daniel Gustafsson  wrote:
>
> On that note though, we might want to consider just dropping it altogether in
> v17 (while fixing the incorrect hint in backbranches)?  I can't imagine
> adminpack 1.0 being in heavy use today, and skimming pgAdmin code it seems 
> it's
> only used in pgAdmin3 and not 4. Maybe it's time to simply drop old code?

https://codesearch.debian.net/search?q=pg_logfile_rotate&literal=1
shows no users for it though. There's pgadmin3 using it
https://github.com/search?q=repo%3Apgadmin-org%2Fpgadmin3%20pg_logfile_rotate&type=code,
however the repo is archived. Surprisingly, core has to maintain the
old code needed for adminpack 1.0 - the pg_rotate_logfile_old SQL function
and the pg_rotate_logfile function in signalfuncs.c. These things could
have been moved to adminpack.c back then, with CREATE FUNCTION
pg_catalog.pg_logfile_rotate() pointed at the adminpack.c code. If we
decide to remove the adminpack 1.0 version completely, the 1.0 functions
pg_file_read, pg_file_length and pg_logfile_rotate will also go away,
making the adminpack code simpler.

Having said that, it's good to hear from others, preferably from
pgadmin developers - added Dave Page (dp...@pgadmin.org) in here for
inputs.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: glibc qsort() vulnerability

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 14:51:38 -0600, Nathan Bossart wrote:
> On Mon, Feb 12, 2024 at 06:09:06PM +0100, Mats Kindahl wrote:
> > Here are the two fixed patches.
> 
> These patches look pretty good to me.  Barring additional feedback, I'll
> plan on committing them in the next few days.

One thing that's worth checking is if this ends up with *worse* code when the
comparators are inlined. I think none of the changed comparators will end up
getting used with an inlined sort, but ...

The reason we could end up with worse code is that when inlining the
comparisons would make less sense for the compiler. Consider e.g.
	return DO_COMPARE(a, b) < 0 ?
		(DO_COMPARE(b, c) < 0 ? b : (DO_COMPARE(a, c) < 0 ? c : a))
		: (DO_COMPARE(b, c) > 0 ? b : (DO_COMPARE(a, c) < 0 ? a : c));

With a naive implementation the compiler will understand it only cares about
a < b, not about the other possibilities. I'm not sure that's still true with
the more complicated optimized version.

Greetings,

Andres Freund




Re: Things I don't like about \du's "Attributes" column

2024-02-12 Thread Pavel Luzanov

On 28.01.2024 22:51, Pavel Luzanov wrote:
I'll think about it and try to implement in the next patch version 
within a few days.


Sorry for the delay.

Please look at v4.
I tried to implement all of David's suggestions.
The only addition is a "Login" column. I still think this is important
information to highlight, especially considering that the Attributes
column stays small enough with a newline separator.

The changes are split into two patches.
0001 - pg_roles view. I plan to organize a new thread for discussion.
0002 - \du command. It depends on 0001 for "Password?" and "Valid until" 
columns.

Output for superuser:

postgres@postgres(17.0)=# \du+
                                                                        List of roles
    Role name     | Login | Attributes  | Password? |           Valid until           | Connection limit |                    Description
------------------+-------+-------------+-----------+---------------------------------+------------------+----------------------------------------------------
 postgres         | yes   | Superuser  +| no        |                                 |                  |
                  |       | Create DB  +|           |                                 |                  |
                  |       | Create role+|           |                                 |                  |
                  |       | Inherit    +|           |                                 |                  |
                  |       | Replication+|           |                                 |                  |
                  |       | Bypass RLS  |           |                                 |                  |
 regress_du_admin | yes   | Create role+| yes       | infinity                        |                  | User createrole attribute
                  |       | Inherit     |           |                                 |                  |
 regress_du_role0 | yes   | Create DB  +| yes       | 2024-12-31 00:00:00+03          |                  |
                  |       | Inherit    +|           |                                 |                  |
                  |       | Replication+|           |                                 |                  |
                  |       | Bypass RLS  |           |                                 |                  |
 regress_du_role1 | no    | Inherit     | no        | 2024-12-31 00:00:00+03(invalid) | 50               | Group role without password but with valid until
 regress_du_role2 | yes   | Inherit     | yes       |                                 | Not allowed      | No connections allowed
 regress_du_role3 | yes   |             | yes       |                                 | 10               | User without attributes
 regress_du_su    | yes   | Superuser  +| yes       |                                 | 3(ignored)       | Superuser but with connection limit
                  |       | Create DB  +|           |                                 |                  |
                  |       | Create role+|           |                                 |                  |
                  |       | Inherit    +|           |                                 |                  |
                  |       | Replication+|           |                                 |                  |
                  |       | Bypass RLS  |           |                                 |                  |
(7 rows)

Output for regress_du_admin (can see password for regress_du_role[0,1,2]
but not for regress_du_role3):

regress_du_admin@postgres(17.0)=> \du regress_du_role*
                                             List of roles
    Role name     | Login | Attributes  | Password? |           Valid until           | Connection limit
------------------+-------+-------------+-----------+---------------------------------+------------------
 regress_du_role0 | yes   | Create DB  +| yes       | 2024-12-31 00:00:00+03          |
                  |       | Inherit    +|           |                                 |
                  |       | Replication+|           |                                 |
                  |       | Bypass RLS  |           |                                 |
 regress_du_role1 | no    | Inherit     | no        | 2024-12-31 00:00:00+03(invalid) | 50
 regress_du_role2 | yes   | Inherit     | yes       |                                 | Not allowed
 regress_du_role3 | yes   |             |           |                                 | 10
(4 rows)

Output for regress_du_role3 (no password information):

regress_du_role3@postgres(17.0)=> \du regress_du_role*
                                       List of roles
    Role name     | Login | Attributes  | Password? |      Valid until       | Connection limit
------------------+-------+-------------+-----------+------------------------+------------------
 regress_du_role0 | yes   | Create DB  +|           | 2024-12-31 00:00:00+03 |

Re: glibc qsort() vulnerability

2024-02-12 Thread Fabrízio de Royes Mello
On Mon, Feb 12, 2024 at 5:51 PM Nathan Bossart 
wrote:
>
> On Mon, Feb 12, 2024 at 06:09:06PM +0100, Mats Kindahl wrote:
> > Here are the two fixed patches.
>
> These patches look pretty good to me.  Barring additional feedback, I'll
> plan on committing them in the next few days.
>

Also did some tests locally and everything went well. Patches apply to the
main branch without issues and all regression tests (including check-world) pass!

Regards
-- 
Fabrízio de Royes Mello


Re: [PoC] Reducing planning time when tables have many partitions

2024-02-12 Thread Alena Rybakina

Hi! Sorry for my delayed reply too.

On 17.01.2024 12:33, Yuya Watari wrote:

Hello Alena,

Thank you for your quick response, and I'm sorry for my delayed reply.

On Sun, Dec 17, 2023 at 12:41 AM Alena Rybakina
 wrote:

I thought about this earlier and was worried that the index links of the 
equivalence classes might not be referenced correctly for outer joins,
so I decided to just overwrite them and reset the previous ones.

Thank you for pointing this out. I have investigated this problem and
found a potential bug place. The code quoted below modifies
RestrictInfo's clause_relids. Here, our indexes, namely
eclass_source_indexes and eclass_derive_indexes, are based on
clause_relids, so they should be adjusted after the modification.
However, my patch didn't do that, so it may have missed some
references. The same problem occurs in places other than the quoted
one.

=
/*
 * Walker function for replace_varno()
 */
static bool
replace_varno_walker(Node *node, ReplaceVarnoContext *ctx)
{
	...
	else if (IsA(node, RestrictInfo))
	{
		RestrictInfo *rinfo = (RestrictInfo *) node;
		...

		if (bms_is_member(ctx->from, rinfo->clause_relids))
		{
			replace_varno((Node *) rinfo->clause, ctx->from, ctx->to);
			replace_varno((Node *) rinfo->orclause, ctx->from, ctx->to);
			rinfo->clause_relids = replace_relid(rinfo->clause_relids,
												 ctx->from, ctx->to);
			...
		}
		...
	}
	...
}
=

I have attached a new version of the patch, v23, to fix this problem.
v23-0006 adds a helper function called update_clause_relids(). This
function modifies RestrictInfo->clause_relids while adjusting its
related indexes. I have also attached a sanity check patch
(sanity-check.txt) to this email. This sanity check patch verifies
that there are no missing references between RestrictInfos and our
indexes. I don't intend to commit this patch, but it helps find
potential bugs. v23 passes this sanity check, but the v21 you
submitted before does not. This means that the adjustment by
update_clause_relids() is needed to prevent missing references after
modifying clause_relids. I'd appreciate your letting me know if v23
doesn't solve your concern.

One of the things I don't think is good about my approach is that it
adds some complexity to the code. In my approach, all modifications to
clause_relids must be done through the update_clause_relids()
function, but enforcing this rule is not so easy. In this sense, my
patch may need to be simplified more.


This is due to what I explained before: we zeroed the values indicated by
the indexes, so this check is not correct either, since the zeroed value
indicated by the index is correct. That's why I removed this check.

Thank you for letting me know. I fixed this in v23-0005 to adjust the
indexes in update_eclasses(). With this change, the assertion check
will be correct.

Yes, it is working correctly now with the assertion check. I suppose
it would be better to add this code with an additional comment
recommending that other developers use it as a check whenever they
manipulate the list of equivalence classes.
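For anyone following along, I imagine the helper looks roughly like this
(a hypothetical sketch with assumed helper names, not the actual v23
code):

static void
update_clause_relids(PlannerInfo *root, RestrictInfo *rinfo,
					 Relids new_clause_relids)
{
	/*
	 * Any change to clause_relids must also move the RestrictInfo between
	 * index buckets, because eclass_source_indexes and
	 * eclass_derive_indexes are keyed on clause_relids.
	 */
	remove_rinfo_from_eclass_indexes(root, rinfo);	/* assumed helper */
	rinfo->clause_relids = new_clause_relids;
	add_rinfo_to_eclass_indexes(root, rinfo);	/* assumed helper */
}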


--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 12:50:12 -0800, Andres Freund wrote:
> On 2024-02-12 13:28:40 -0500, Andrew Dunstan wrote:
> I wonder if this indicates that we are either missing memory barriers
> somewhere or that the memory barriers we end up with on msvc + arm aren't
> correct?  Either could explain why the problem doesn't occur when building
> with optimizations.

I think I might have been on to something - if my human emulation of a
preprocessor isn't wrong, we'd end up with

#define S_UNLOCK(lock)  \
	do { _ReadWriteBarrier(); (*(lock)) = 0; } while (0)

on msvc + arm. And that's entirely insufficient - _ReadWriteBarrier() just
limits *compiler* level reordering, not CPU level reordering.  I think it's
even insufficient on x86[-64], but it's definitely insufficient on arm.
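Concretely, a sufficient version would presumably need a real hardware
barrier before the store, along these lines (a sketch using MSVC's __dmb
intrinsic; the exact constant spelling is an assumption on my part, and
this is untested):

/* hypothetical fix sketch for msvc + arm64 */
#include <intrin.h>

#define S_UNLOCK(lock)	\
	do { __dmb(_ARM64_BARRIER_ISH); (*(lock)) = 0; } while (0)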



Another issue I see is that we have a bunch of code that's dependent on
__aarch64__ being set - but it doesn't look to me that it is set on msvc. One
obvious consequence of that is that 64bit atomics won't be used (see
src/include/port/atomics/arch-arm.h) - but that shouldn't cause this issue.


Greetings,

Andres Freund




Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Dave Cramer
On Mon, 12 Feb 2024 at 15:50, Andres Freund  wrote:

> Hi,
>
> On 2024-02-12 13:28:40 -0500, Andrew Dunstan wrote:
> > On 2024-02-12 Mo 11:44, Dave Cramer wrote:
> > > OK, so I have managed to get a debugger attached to postgres.exe when
> it
> > > faults and the fault occurs at
> > >
> https://github.com/postgres/postgres/blob/09eb633e1baa3b7cd7929f3cc77f9c46f63c20b1/src/backend/utils/mmgr/dsa.c#L869
> > > span is pointing to 0x0
> >
> > Further data point. If I select a build type of 'debug' instead of
> > 'debugoptimized' the error disappears.
>
> Oh, this is quite interesting.  Dave, could you post the backtrace?
>

Andres,

I am using Visual Studio as the debugger. Here is what I have.

> postgres.exe!dsa_free(dsa_area * area, unsigned __int64 dp) Line 869 C
  postgres.exe!resize(dshash_table * hash_table, unsigned __int64
new_size_log2) Line 879 C
  postgres.exe!dshash_find_or_insert(dshash_table * hash_table, const void
* key, bool * found) Line 480 C
  postgres.exe!pgstat_get_entry_ref(PgStat_Kind kind, unsigned int dboid,
unsigned int objoid, bool create, bool * created_entry) Line 455 C
  postgres.exe!pgstat_prep_pending_entry(PgStat_Kind kind, unsigned int
dboid, unsigned int objoid, bool * created_entry) Line 1123 C
  [Inline Frame] postgres.exe!pgstat_prep_relation_pending(unsigned int)
Line 904 C
  postgres.exe!pgstat_assoc_relation(RelationData * rel) Line 139 C
  [Inline Frame] postgres.exe!ReadBufferExtended(RelationData *) Line 802 C
  postgres.exe!ReadBuffer(RelationData * reln, unsigned int blockNum) Line
737 C
  postgres.exe!read_seq_tuple(RelationData * rel, int * buf, HeapTupleData
* seqdatatuple) Line 1196 C
  postgres.exe!AlterSequence(ParseState * pstate, AlterSeqStmt * stmt) Line
481 C
  postgres.exe!ProcessUtilitySlow(ParseState * pstate, PlannedStmt * pstmt,
const char * queryString, ProcessUtilityContext context, ParamListInfoData
* params, QueryEnvironment * queryEnv, _DestReceiver * dest,
QueryCompletion * qc) Line 1679 C
  postgres.exe!standard_ProcessUtility(PlannedStmt * pstmt, const char *
queryString, bool readOnlyTree, ProcessUtilityContext context,
ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver *
dest, QueryCompletion * qc) Line 1080 C
  postgres.exe!ProcessUtility(PlannedStmt * pstmt, const char *
queryString, bool readOnlyTree, ProcessUtilityContext context,
ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver *
dest, QueryCompletion * qc) Line 530 C
  postgres.exe!ProcessUtilitySlow(ParseState * pstate, PlannedStmt * pstmt,
const char * queryString, ProcessUtilityContext context, ParamListInfoData
* params, QueryEnvironment * queryEnv, _DestReceiver * dest,
QueryCompletion * qc) Line 1263 C
  postgres.exe!standard_ProcessUtility(PlannedStmt * pstmt, const char *
queryString, bool readOnlyTree, ProcessUtilityContext context,
ParamListInfoData * params, QueryEnvironment * queryEnv, _DestReceiver *
dest, QueryCompletion * qc) Line 1080 C


if there is a better debugger to use, please let me know

Dave



>
> I wonder if this indicates that we are either missing memory barriers
> somewhere or that the memory barriers we end up with on msvc + arm aren't
> correct?  Either could explain why the problem doesn't occur when building
> with optimizations.
>
> Greetings,
>
> Andres Freund
>


Re: Fix a typo in pg_rotate_logfile

2024-02-12 Thread Daniel Gustafsson
> On 12 Feb 2024, at 21:46, Nathan Bossart  wrote:
> 
> On Mon, Feb 12, 2024 at 09:39:06PM +0100, Daniel Gustafsson wrote:
>>> On 12 Feb 2024, at 21:32, Bharath Rupireddy 
>>>  wrote:
>>> I happened to notice a typo in pg_rotate_logfile in ipc/signalfuncs.c
>>> - the hint message wrongly mentions that pg_logfile_rotate is part of
>>> the core; which is actually not. pg_logfile_rotate is an adminpack's
>>> 1.0 SQL function dropped in 2.0. The core defines pg_rotate_logfile
>>> SQL function instead, so use that. Here's a patch to fix the typo.
>> 
>> Nice catch!  This needs to be backpatched all the way down to 12 as that
>> function wen't away a long time ago (it was marked as deprecated all the way
>> back in 9.1).
> 
> This is a bit strange because, with this patch, the HINT suggests using a
> function with the same name as the one it lives in.  IIUC this is because
> adminpack's pg_logfile_rotate() uses pg_rotate_logfile(), while core's
> pg_rotate_logfile() uses pg_rotate_logfile_v2().  I suppose trying to
> rename these might be more trouble than it's worth at this point, though...

Yeah, I doubt that's worth the churn.

On that note though, we might want to consider just dropping it altogether in
v17 (while fixing the incorrect hint in backbranches)?  I can't imagine
adminpack 1.0 being in heavy use today, and skimming pgAdmin code it seems it's
only used in pgAdmin3 and not 4. Maybe it's time to simply drop old code?

--
Daniel Gustafsson





Re: Patch: Add parse_type Function

2024-02-12 Thread David E. Wheeler
On Feb 12, 2024, at 12:53 PM, Tom Lane  wrote:

> "David E. Wheeler"  writes:
>> [ v4-0001-Add-parse_type-SQL-function.patch ]
> 
> It strikes me that this is basically to_regtype() with the additional
> option to return the typmod.  That leads to some questions:

Huh. I saw it more on a par with format_type(), at least in the output I expect.

> * Should we choose a name related to to_regtype?  I don't have any
> immediate suggestions, but since you didn't seem entirely set on
> parse_type, maybe it's worth thinking along those lines.  OTOH,
> to the extent that people would use this with format_type, maybe
> parse_type is fine.

Yeah, and the `to_*` functions return a single OID, not two fields.

> * Perhaps the type OID output argument should be declared regtype
> not plain OID?  It wouldn't make any difference for passing it to
> format_type, but in other use-cases perhaps regtype would be more
> readable.  It's not a deal-breaker either way of course, since
> you can cast oid to regtype or vice versa.

Sure. I used `oid` because that’s exactly what the argument to format_type() is 
called, so I thought the parity there was more appropriate. Maybe its argument should 
be renamed to regtype?

> * Maybe the code should be in adt/regproc.c not format_type.c.

Happy to move it wherever makes the most sense.

> * Experience with to_regtype suggests strongly that people would
> prefer "return NULL" to failure for an unrecognized type name.

Could do, but as with to_regtype() there’s an issue with error handling when it 
descends into the SQL parser.

> Skimming the patch, I notice that the manual addition to
> builtins.h should be unnecessary: the pg_proc.dat entry
> should be enough to create an extern in fmgrprotos.h.

Confirmed, will remove in next patch.

> Also, I'm pretty sure that reformat_dat_file.pl will
> think your pg_proc.dat entry is overly verbose.  See
> https://www.postgresql.org/docs/devel/system-catalog-initial-data.html

There are a bunch of things it reformats:

❯ git diff --name-only   
src/include/catalog/pg_aggregate.dat
src/include/catalog/pg_database.dat
src/include/catalog/pg_proc.dat
src/include/catalog/pg_type.dat
src/include/utils/builtins.h

And even in pg_proc.dat there are 13 blocks it reformats. I can submit with 
just my block formatted if you’d prefer.

> BTW, another way that this problem could be approached is to use
> to_regtype() as-is, with a separate function to obtain the typmod:
> 
> select format_type(to_regtype('timestamp(4)'), to_regtypmod('timestamp(4)'));

Oh interesting.

> This is intellectually ugly, since it implies parsing the same
> typename string twice.  But on the other hand it avoids the notational
> pain and runtime overhead involved in using a record-returning
> function.  So I think it might be roughly a wash for performance.
> Question to think about is which way is easier to use.  I don't
> have an opinion particularly; just throwing the idea out there.

Honestly I haven’t cared for the fact that parse_type() returns a record; it 
makes it a bit clunky. But yeah, so does this to pass the same value to two 
function calls.

Perhaps it makes sense to add to_regtypmod() as you suggest, but then also 
something like:

CREATE FUNCTION format_type_string(text) RETURNS TEXT AS $$
SELECT format_type(to_regtype($1), to_regtypmod($1))
$$ LANGUAGE SQL;

Best,

David





Re: Popcount optimization using AVX512

2024-02-12 Thread Nathan Bossart
On Sat, Feb 10, 2024 at 03:52:38PM -0800, Noah Misch wrote:
> On Fri, Feb 09, 2024 at 08:33:23PM -0800, Andres Freund wrote:
>> My understanding is that the ifunc mechanism just avoid the need for repeated
>> indirect calls/jumps to implement a single function call, not the use of
>> indirect function calls at all. Calls into shared libraries, like libc, are
>> indirected via the GOT / PLT, i.e. an indirect function call/jump.  Without
>> ifuncs, the target of the function call would then have to dispatch to the
>> resolved function. Ifuncs allow to avoid this repeated dispatch by moving the
>> dispatch to the dynamic linker stage, modifying the contents of the GOT/PLT 
>> to
>> point to the right function. Thus ifuncs are an optimization when calling a
>> function in a shared library that's then dispatched depending on the cpu
>> capabilities.
>> 
>> However, in our case, where the code is in the same binary, function calls
>> implemented in the main binary directly (possibly via a static library) don't
>> go through GOT/PLT. In such a case, use of ifuncs turns a normal direct
>> function call into one going through the GOT/PLT, i.e. makes it indirect. The
>> same is true for calls within a shared library if either explicit symbol
>> visibility is used, or -symbolic, -Wl,-Bsymbolic or such is used. Therefore
>> there's no efficiency gain of ifuncs over a call via function pointer.
>> 
>> 
>> This isn't because ifunc is implemented badly or something - the reason for
>> this is that dynamic relocations aren't typically implemented by patching all
>> callsites (".text relocations"), which is what you would need to avoid the
>> need for an indirect call to something that fundamentally cannot be a 
>> constant
>> address at link time. The reason text relocations are disfavored is that
>> they can make program startup quite slow, that they require allowing
>> modifications to executable pages which are disliked due to the security
>> implications, and that they make the code non-shareable, as the in-memory
>> executable code has to differ from the on-disk code.
>> 
>> 
>> I actually think ifuncs within the same binary are a tad *slower* than plain
>> function pointer calls, unless -fno-plt is used. Without -fno-plt, an ifunc 
>> is
>> called by 1) a direct call into the PLT, 2) loading the target address from
>> the GOT, 3) making an an indirect jump to that address.  Whereas a "plain
>> indirect function call" is just 1) load target address from variable 2) 
>> making
>> an indirect jump to that address. With -fno-plt the callsites themselves load
>> the address from the GOT.
> 
> That sounds more accurate than what I wrote.  Thanks.

+1, thanks for the detailed explanation, Andres.
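For anyone following along, a minimal GCC-style ifunc looks roughly like
this (illustrative names only, not PostgreSQL code):

static int
popcount_portable(unsigned long x)
{
	int			n = 0;

	for (; x != 0; x &= x - 1)
		n++;
	return n;
}

static int
popcount_builtin(unsigned long x)
{
	return __builtin_popcountl(x);
}

/* The resolver runs once, during relocation, and picks the target. */
static int (*resolve_popcount(void)) (unsigned long)
{
	/* a real resolver would inspect CPU capabilities here */
	return (sizeof(unsigned long) == 8) ? popcount_builtin : popcount_portable;
}

int			my_popcount(unsigned long x) __attribute__((ifunc("resolve_popcount")));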

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-09 15:32:10 -0500, Dave Cramer wrote:
> On Fri, 9 Feb 2024 at 14:36, Andres Freund  wrote:
> > That's something like a segfault.
> >
> > One suspicion I have is that src/port/pg_crc32c_armv8_choose.c possibly
> > doesn't properly support msvc.  It seems to assume that SIGILL can be
> > trapped,
> > but that IIRC doesn't work on windows.
> >
> > I'd check if the problem persists if you change
> > cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 1)
> > to
> > cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 0)
> >
> 
> This results in
> 
> FAILED: src/bin/pg_checksums/pg_checksums.exe
> src/bin/pg_checksums/pg_checksums.pdb
> "link"  /MACHINE:ARM64 /OUT:src/bin/pg_checksums/pg_checksums.exe
> src/bin/pg_checksums/pg_checksums.exe.p/win32ver.res
> src/bin/pg_checksums/pg_checksums.exe.p/pg_checksums.c.obj "/release"
> "/nologo" "/DEBUG" "/PDB:src\bin\pg_checksums\pg_checksums.pdb"
> "/INCREMENTAL:NO" "/STACK:4194304" "/NOEXP" "src/fe_utils/libpgfeutils.a"
> "src/common/libpgcommon.a" "src/port/libpgport.a" "ws2_32.lib" "ws2_32.lib"
> "ws2_32.lib" "ws2_32.lib" "/SUBSYSTEM:CONSOLE" "kernel32.lib" "user32.lib"
> "gdi32.lib" "winspool.lib" "shell32.lib" "ole32.lib" "oleaut32.lib"
> "uuid.lib" "comdlg32.lib" "advapi32.lib"
> libpgcommon.a(controldata_utils.c.obj) : error LNK2001: unresolved external
> symbol pg_comp_crc32c

That's because have_optimized_crc ends up being set.


I'm somewhat unhappy about the msvc specific path in meson.build. Shouldn't at
the very least the "Use ARM CRC Extension unconditionally" path be first. I
guess it's not really this patch's fault that the runtime detection for armv8
crc is implemented this shoddily :(.

Greetings,

Andres Freund




Re: glibc qsort() vulnerability

2024-02-12 Thread Nathan Bossart
On Mon, Feb 12, 2024 at 06:09:06PM +0100, Mats Kindahl wrote:
> Here are the two fixed patches.

These patches look pretty good to me.  Barring additional feedback, I'll
plan on committing them in the next few days.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 13:28:40 -0500, Andrew Dunstan wrote:
> On 2024-02-12 Mo 11:44, Dave Cramer wrote:
> > OK, so I have managed to get a debugger attached to postgres.exe when it
> > faults and the fault occurs at
> > https://github.com/postgres/postgres/blob/09eb633e1baa3b7cd7929f3cc77f9c46f63c20b1/src/backend/utils/mmgr/dsa.c#L869
> > span is pointing to 0x0
> 
> Further data point. If I select a build type of 'debug' instead of
> 'debugoptimized' the error disappears.

Oh, this is quite interesting.  Dave, could you post the backtrace?

I wonder if this indicates that we are either missing memory barriers
somewhere or that the memory barriers we end up with on msvc + arm aren't
correct?  Either could explain why the problem doesn't occur when building
with optimizations.

Greetings,

Andres Freund




Re: Fix a typo in pg_rotate_logfile

2024-02-12 Thread Nathan Bossart
On Mon, Feb 12, 2024 at 09:39:06PM +0100, Daniel Gustafsson wrote:
>> On 12 Feb 2024, at 21:32, Bharath Rupireddy 
>>  wrote:
>> I happened to notice a typo in pg_rotate_logfile in ipc/signalfuncs.c
>> - the hint message wrongly mentions that pg_logfile_rotate is part of
>> the core; which is actually not. pg_logfile_rotate is an adminpack's
>> 1.0 SQL function dropped in 2.0. The core defines pg_rotate_logfile
>> SQL function instead, so use that. Here's a patch to fix the typo.
> 
> Nice catch!  This needs to be backpatched all the way down to 12 as that
> function wen't away a long time ago (it was marked as deprecated all the way
> back in 9.1).

This is a bit strange because, with this patch, the HINT suggests using a
function with the same name as the one it lives in.  IIUC this is because
adminpack's pg_logfile_rotate() uses pg_rotate_logfile(), while core's
pg_rotate_logfile() uses pg_rotate_logfile_v2().  I suppose trying to
rename these might be more trouble than it's worth at this point, though...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Jeff Davis
On Mon, 2024-02-12 at 12:18 -0800, Andres Freund wrote:
> +    upto = Min(startptr + count, LogwrtResult.Write);
> +    nbytes = upto - startptr;
> 
> Shouldn't it pretty much be a bug to ever encounter this?

In the current code it's impossible, though Bharath hinted at an
extension which could reach that path [1].

What I committed was a bit of a compromise -- earlier versions of the
patch supported reading right up to the Insert pointer (which requires
a call to WaitXLogInsertionsToFinish()). I wasn't ready to commit that
code without seeing a more about how that would be used, but I thought
it was reasonable to have some simple code in there to allow reading up
to the Write pointer.

It seems closer to the structure that we will ultimately need to
replicate unflushed data, right?

Regards,
Jeff Davis

[1]
https://www.postgresql.org/message-id/calj2acw65mqn6ukv57sqdtmzajgd1n_adqtdgy+gmdqu6v6...@mail.gmail.com




Add test module for verifying backtrace functionality

2024-02-12 Thread Bharath Rupireddy
Hi,

Postgres has a good amount of code for dealing with backtraces - two
GUCs (backtrace_functions and backtrace_on_internal_error) and
errbacktrace; all of which use the core function set_backtrace() from
elog.c. I've not seen this code being tested at all; see the code coverage
report - 
https://coverage.postgresql.org/src/backend/utils/error/elog.c.gcov.html.

I think adding a simple test module (containing no .c files) with only
TAP tests will help cover this code. I ended up having it as a
separate module under src/test/modules/test_backtrace as I was not
able to find an existing TAP file in src/test to add these tests.  I'm
able to verify the backtrace related code with the attached patch
consistently. The TAP tests rely on the fact that the server emits
text "BACKTRACE: " to server logs before logging the backtrace, and
the backtrace contains the function name in which the error occurs.
I've turned off query statement logging (set log_statement = none,
log_min_error_statement = fatal) so that the tests get to see the
functions only in the backtrace. Although the CF bot is happy with the
attached patch 
https://github.com/BRupireddy2/postgres/tree/add_test_module_for_bcktrace_functionality_v1,
there might be some more flakiness to it.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 35d716f87611439d56f233a26f29a3b582ed3fc0 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 12 Feb 2024 15:22:05 +
Subject: [PATCH v1] Add test module for backtrace functionality

---
 src/test/modules/Makefile |   1 +
 src/test/modules/meson.build  |   1 +
 src/test/modules/test_backtrace/.gitignore|   4 +
 src/test/modules/test_backtrace/Makefile  |  14 +++
 src/test/modules/test_backtrace/README|   4 +
 src/test/modules/test_backtrace/meson.build   |  12 ++
 .../modules/test_backtrace/t/001_backtrace.pl | 103 ++
 7 files changed, 139 insertions(+)
 create mode 100644 src/test/modules/test_backtrace/.gitignore
 create mode 100644 src/test/modules/test_backtrace/Makefile
 create mode 100644 src/test/modules/test_backtrace/README
 create mode 100644 src/test/modules/test_backtrace/meson.build
 create mode 100644 src/test/modules/test_backtrace/t/001_backtrace.pl

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 89aa41b5e3..3967311048 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -13,6 +13,7 @@ SUBDIRS = \
 		  libpq_pipeline \
 		  plsample \
 		  spgist_name_ops \
+		  test_backtrace \
 		  test_bloomfilter \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 8fbe742d38..3b21b389af 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -12,6 +12,7 @@ subdir('libpq_pipeline')
 subdir('plsample')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
+subdir('test_backtrace')
 subdir('test_bloomfilter')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
diff --git a/src/test/modules/test_backtrace/.gitignore b/src/test/modules/test_backtrace/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/src/test/modules/test_backtrace/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/test_backtrace/Makefile b/src/test/modules/test_backtrace/Makefile
new file mode 100644
index 0000000000..b908864e89
--- /dev/null
+++ b/src/test/modules/test_backtrace/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/test_backtrace/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_backtrace
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_backtrace/README b/src/test/modules/test_backtrace/README
new file mode 100644
index 0000000000..ae1366bd58
--- /dev/null
+++ b/src/test/modules/test_backtrace/README
@@ -0,0 +1,4 @@
+This directory doesn't actually contain any extension module.
+
+Instead it is a home for backtrace tests that we want to run using TAP
+infrastructure.
diff --git a/src/test/modules/test_backtrace/meson.build b/src/test/modules/test_backtrace/meson.build
new file mode 100644
index 0000000000..4adffcc914
--- /dev/null
+++ b/src/test/modules/test_backtrace/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'test_backtrace',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+'tests': [
+  't/001_backtrace.pl',
+],
+  },
+}
diff --git a/src/test/modules/test_backtrace/t/001_backtrace.pl b/src/test/modules/test_backtrace/t/001_backtrace.pl
new file mode 100644
index 0000000000..ecd53dbe7b
--- 

Re: Fix a typo in pg_rotate_logfile

2024-02-12 Thread Daniel Gustafsson
> On 12 Feb 2024, at 21:32, Bharath Rupireddy 
>  wrote:

> I happened to notice a typo in pg_rotate_logfile in ipc/signalfuncs.c
> - the hint message wrongly mentions that pg_logfile_rotate is part of
> the core; which is actually not. pg_logfile_rotate is an adminpack's
> 1.0 SQL function dropped in 2.0. The core defines pg_rotate_logfile
> SQL function instead, so use that. Here's a patch to fix the typo.

Nice catch!  This needs to be backpatched all the way down to 12 as that
function went away a long time ago (it was marked as deprecated all the way
back in 9.1).

--
Daniel Gustafsson





Re: Popcount optimization using AVX512

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 20:14:06 +, Amonson, Paul D wrote:
> > > +# Check for header immintrin.h
> > ...
> > Do these all actually have to link?  Invoking the linker is slow.
> > I think you might be able to just use cc.has_header_symbol().
>
> I took this to mean the last of the 3 new blocks.

Yep.


> > Does this work with msvc?
>
> I think it will work but I have no way to validate it. I propose we remove 
> the AVX-512 popcount feature from MSVC builds. Sound ok?

CI [1] would be able to test at least building. Including via cfbot,
automatically run for each commitfest entry - you can see prior runs at
[2].  They run on Zen 3 epyc instances, so unfortunately runtime won't be
tested.  If you look at [3], you can see that currently it doesn't seem to be
considered supported at configure time:

...
[00:23:48.480] Checking if "__get_cpuid" : links: NO
[00:23:48.480] Checking if "__cpuid" : links: YES
...
[00:23:48.492] Checking if "x86_64: popcntq instruction" compiles: NO
...

Unfortunately CI currently is configured to not upload the build logs if the
build succeeds, so we don't have enough details to see why.


> > This will build all of pgport with the avx flags, which wouldn't be 
> > correct, I think? The compiler might inject automatic uses of avx512 in 
> > places, which would cause problems, no?
>
> This will take me some time to learn how to do this in meson. Any pointers
> here would be helpful.

Should be fairly simple, add it to the replace_funcs_pos and add the relevant
cflags to pgport_cflags, similar to how it's done for crc.


> > While you don't do the same for make, isn't even just using the avx512 for 
> > all of pg_bitutils.c broken for exactly that reson? That's why the existing 
> > code builds the files for various crc variants as their own file.
>
> I don't think its broken, nothing else in pg_bitutils.c will make use of
> AVX-512

You can't really guarantee that compiler auto-vectorization won't decide to do
so, no? I wouldn't call it likely, but it's also hard to be sure it won't
happen at some point.


> If splitting still makes sense, I propose splitting into 3 files:  
> pg_bitutils.c (entry point +sw popcnt implementation), pg_popcnt_choose.c 
> (CPUID and xgetbv check) and pg_popcnt_x86_64_accel.c (64/512bit x86 
> implementations).
> I'm not an expert in meson, but splitting might add complexity to meson.build.
>
> Could you elaborate if there are other benefits to the split file approach?

It won't lead to SIGILLs ;)

Greetings,

Andres Freund


[1] https://github.com/postgres/postgres/blob/master/src/tools/ci/README
[2] 
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F47%2F4675
[3] https://cirrus-ci.com/task/5645112189911040




Fix a typo in pg_rotate_logfile

2024-02-12 Thread Bharath Rupireddy
Hi,

I happened to notice a typo in pg_rotate_logfile in ipc/signalfuncs.c
- the hint message wrongly mentions that pg_logfile_rotate is part of
the core, which it actually is not. pg_logfile_rotate is an adminpack
1.0 SQL function dropped in 2.0. The core defines pg_rotate_logfile
SQL function instead, so use that. Here's a patch to fix the typo.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 18854ba341d892750fc75b9988ba107fe44e7c63 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 12 Feb 2024 13:30:31 +
Subject: [PATCH v1] Fix a typo in pg_rotate_logfile

---
 src/backend/storage/ipc/signalfuncs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/signalfuncs.c b/src/backend/storage/ipc/signalfuncs.c
index 81d1a59659..c05a2af48b 100644
--- a/src/backend/storage/ipc/signalfuncs.c
+++ b/src/backend/storage/ipc/signalfuncs.c
@@ -284,7 +284,7 @@ pg_rotate_logfile(PG_FUNCTION_ARGS)
  errmsg("must be superuser to rotate log files with adminpack 1.0"),
 		/* translator: %s is a SQL function name */
  errhint("Consider using %s, which is part of core, instead.",
-		 "pg_logfile_rotate()")));
+		 "pg_rotate_logfile()")));
 
 	if (!Logging_collector)
 	{
-- 
2.34.1



Re: Collation version tracking for macOS

2024-02-12 Thread Jeff Davis
On Sun, 2024-02-11 at 22:04 +0530, Robert Haas wrote:
> 1. Here's what we think your OS package manager is probably going to
> do.
> 2. That's going to interact with PostgreSQL in this way that I will
> now describe.
> 3. See, that sucks, because of the stuff I said above about needing
> stable collations!
> 4. But if you installed this module instead, then you could prevent
> the things I said under #2 from happening.
> 5. Instead, you'd get this other behavior, which would make you
> happy.

I like that framing, thank you. I'll try to come up with something
there.

> I feel like I can almost piece together in my head how this is
> supposed to work -- I think it's like "we expect the OS package
> manager to drop all the ICU versions in the same directory via side
> by
> side installs, and that works well for other programs because ... for
> some mysterious reason they can latch onto the specific version they
> were linked against ... but we can't or don't do that because ... I
> guess we're dumber than those other pieces of software or
> something ... 

Postgres can and does latch on to the version of ICU it was compiled
against. It's a normal shared library dependency.

The problem is that databases -- and the file structures -- outlive a
particular version of Postgres. So if Postgres 16 is compiled against
ICU X and Postgres 17 is compiled against ICU Y, how do you upgrade
from 16 to 17? Postgres 17 will try to access the old file structures
using ICU Y, and they'll be corrupt.

What we want is the file structures that depend on ICU X to continue to
find ICU X even after you upgrade to Postgres 17, yet allow new
structures to be created using ICU Y. In other words, "multi-lib",
meaning that the same Postgres binary is linking to multiple versions
of ICU and the different versions for different structures. That would
allow users to recreate one index at a time to use ICU Y, until nothing
depends on ICU X any longer.

I should say this is not an easy process even if something like
icu_multilib is available. We don't have all of the information needed
in the catalog to track which structures depend on which versions of a
collation library, collation library version is itself not easy to
define, and it still involves rebuilding (or at least re-validating) a
lot of structures. This is a "make hard things possible" tool, and I
suspect only a handful of users would use it successfully to migrate to
new ICU versions.

More simply, some users might just want to lock down the version of ICU
to X, and just use that forever until they have a reason to change it.
icu_multilib can also facilitate that, though it's still not trivial.

> "icu_multilib must be loaded via shared_preload_libraries.
> icu_multilib ignores any ICU library with a major version greater
> than
> that with which PostgreSQL was built."
> 
> It's not clear from reading this whether the second sentence here is
> a
> regrettable implementation restriction or design behavior. If it's
> design behavior, what's the point of it?

That restriction came from Thomas's (uncommitted) work on the same
problem. I believe the reasoning was that we don't know whether future
versions of ICU might break something that we're doing, though perhaps
there's a better way.

Regards,
Jeff Davis






Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-12 11:33:24 -0800, Jeff Davis wrote:
> On Wed, 2024-01-31 at 14:30 +0530, Bharath Rupireddy wrote:
> > Please see the attached v22 patch set.
>
> Committed 0001.

Yay, I think this is very cool. There are plenty other improvements than can
be based on this...


One thing I'm a bit confused in the code is the following:

+	/*
+	 * Don't read past the available WAL data.
+	 *
+	 * Check using local copy of LogwrtResult. Ordinarily it's been updated by
+	 * the caller when determining how far to read; but if not, it just means
+	 * we'll read less data.
+	 *
+	 * XXX: the available WAL could be extended to the WAL insert pointer by
+	 * calling WaitXLogInsertionsToFinish().
+	 */
+	upto = Min(startptr + count, LogwrtResult.Write);
+	nbytes = upto - startptr;

Shouldn't it pretty much be a bug to ever encounter this? There aren't
equivalent checks in WALRead(), so any user of WALReadFromBuffers() that then
falls back to WALRead() is just going to send unwritten data.

ISTM that this should be an assertion or error.

Greetings,

Andres Freund




Why does BitmapPrefetch() skip fetch based on current block recheck flag

2024-02-12 Thread Melanie Plageman
Hi,

I noticed that in the following code in BitmapPrefetch(), we use the
current block's TBMIterateResult->recheck when deciding whether or not
to prefetch the block returned by tbm_iterate() (which is very likely
a different block). Why not use TBMIterateResult->recheck for the
block we are considering prefetching?

I see a note that the skip fetch logic here depends on the assumption
that the index AM will report the same recheck flag for this future
heap page as it did for the current heap page. But why would we depend
on that assumption instead of just using the TBMIterateResult just
populated by tbm_iterate() for the block we are about to prefetch?
That is the block we pass to PrefetchBuffer() afterward.

/*
 * If we expect not to have to actually read this heap page,
 * skip this prefetch call, but continue to run the prefetch
 * logic normally.  (Would it be better not to increment
 * prefetch_pages?)
 *
 * This depends on the assumption that the index AM will
 * report the same recheck flag for this future heap page as
 * it did for the current heap page; which is not a certainty
 * but is true in many cases.
 */
skip_fetch = (node->can_skip_fetch &&
  (node->tbmres ? !node->tbmres->recheck : false) &&
  VM_ALL_VISIBLE(node->ss.ss_currentRelation,
 tbmpre->blockno,
  &node->pvmbuffer));

if (!skip_fetch)
PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, tbmpre->blockno);
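
In other words, something like this sketch, where tbmpre is the
TBMIterateResult for the block we are about to prefetch:

    skip_fetch = (node->can_skip_fetch &&
                  !tbmpre->recheck &&
                  VM_ALL_VISIBLE(node->ss.ss_currentRelation,
                                 tbmpre->blockno,
                                 &node->pvmbuffer));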


- Melanie




RE: Popcount optimization using AVX512

2024-02-12 Thread Amonson, Paul D
My responses with questions,

> > +# XXX: The configure.ac check for __cpuidex() is broken, we don't 
> > +copy that # here. To prevent problems due to two detection methods 
> > +working, stop # checking after one.
>
> This seems like a bogus copy-paste.

My bad. Will remove the offending comment.  :)

> > +# Check for header immintrin.h
> ...
> Do these all actually have to link?  Invoking the linker is slow.
> I think you might be able to just use cc.has_header_symbol().

I took this to mean the last of the 3 new blocks. I changed this one to the 
cc_has_header method. I think I do want the first 2 checking the link as well. 
If they don't link here they won't link in the actual build.

> Does this work with msvc?

I think it will work but I have no way to validate it. I propose we remove the 
AVX-512 popcount feature from MSVC builds. Sound ok?

> That's a very long line in the output, how about using the avx feature name 
> or something?

Agree, will fix.

> This will build all of pgport with the avx flags, which wouldn't be correct, 
> I think? The compiler might inject automatic uses of avx512 in places, which 
> would cause problems, no?

This will take me some time to learn how to do this in meson. Any pointers here 
would be helpful. 

> While you don't do the same for make, isn't even just using the avx512 for 
> all of pg_bitutils.c broken for exactly that reson? That's why the existing 
> code builds the files for various crc variants as their own file.

I don't think it's broken: nothing else in pg_bitutils.c will make use of
AVX-512, so I am not sure what benefits dividing this up into multiple files
would yield beyond code readability, as they will all be needed at compile
time. I prefer not to split it, if the community agrees.
 
If splitting still makes sense, I propose splitting into 3 files:  
pg_bitutils.c (entry point +sw popcnt implementation), pg_popcnt_choose.c 
(CPUID and xgetbv check) and pg_popcnt_x86_64_accel.c (64/512bit x86 
implementations). 
I'm not an expert in meson, but splitting might add complexity to meson.build. 

Could you elaborate if there are other benefits to the split file approach?

Paul


-Original Message-
From: Andres Freund  
Sent: Friday, February 9, 2024 10:35 AM
To: Amonson, Paul D 
Cc: Alvaro Herrera ; Shankaran, Akash 
; Nathan Bossart ; Noah 
Misch ; Tom Lane ; Matthias van de Meent 
; pgsql-hackers@lists.postgresql.org
Subject: Re: Popcount optimization using AVX512

Hi,

On 2024-02-09 17:39:46 +, Amonson, Paul D wrote:

> diff --git a/meson.build b/meson.build index 8ed51b6aae..1e7a4dc942 
> 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1773,6 +1773,45 @@ elif cc.links('''
>  endif
>  
>  
> +# XXX: The configure.ac check for __cpuidex() is broken, we don't 
> +copy that # here. To prevent problems due to two detection methods 
> +working, stop # checking after one.

This seems like a bogus copy-paste.


> +if cc.links('''
> +#include <cpuid.h>
> +int main(int arg, char **argv)
> +{
> +unsigned int exx[4] = {0, 0, 0, 0};
> +__get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
> +}
> +''', name: '__get_cpuid_count',
> +args: test_c_args)
> +  cdata.set('HAVE__GET_CPUID_COUNT', 1)
> +elif cc.links('''
> +#include <intrin.h>
> +int main(int arg, char **argv)
> +{
> +unsigned int exx[4] = {0, 0, 0, 0};
> +__cpuidex(exx, 7, 0);
> +}
> +''', name: '__cpuidex',
> +args: test_c_args)
> +  cdata.set('HAVE__CPUIDEX', 1)
> +endif
> +
> +
> +# Check for header immintrin.h
> +if cc.links('''
> +#include <immintrin.h>
> +int main(int arg, char **argv)
> +{
> +  return 1701;
> +}
> +''', name: '__immintrin',
> +args: test_c_args)
> +  cdata.set('HAVE__IMMINTRIN', 1)
> +endif

Do these all actually have to link?  Invoking the linker is slow.

I think you might be able to just use cc.has_header_symbol().



> +###
> +# AVX 512 POPCNT Intrinsic check
> +###
> +have_avx512_popcnt = false
> +cflags_avx512_popcnt = []
> +if host_cpu == 'x86_64'
> +  prog = '''
> +  #include <immintrin.h>
> +  #include <stdint.h>
> +  void main(void)
> +  {
> +__m512i tmp __attribute__((aligned(64)));
> +__m512i input = _mm512_setzero_si512();
> +__m512i output = _mm512_popcnt_epi64(input);
> +uint64_t cnt = 999;
> +_mm512_store_si512(&tmp, output);
> +cnt = _mm512_reduce_add_epi64(tmp);
> +/* return computed value, to prevent the above being optimized away 
> */
> +return cnt == 0;
> +  }'''

Does this work with msvc?


> +if cc.links(prog, name: '_mm512_setzero_si512, 
> + _mm512_popcnt_epi64, _mm512_store_si512, and _mm512_reduce_add_epi64 
> + with -mavx512vpopcntdq -mavx512f',

That's a very long line in the output, how about using the avx feature name or 
something?



> diff --git a/src/port/Makefile 

Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Alvaro Herrera
On 2024-Feb-10, Peter Eisentraut wrote:

> So as a test, I ran IWYU over the backend *.c files and removed all the
> #includes it suggested.  (Note that IWYU also suggests to *add* a bunch of
> #includes, in fact that is its main purpose; I didn't do this here.) In some
> cases, a more specific #include replaces another less specific one.

Sounds reasonable in principle.  I looked at the access/brin changes and
they seem OK.  Then I noticed the execdebug.h -> executor.h changes and
was surprised, until I looked at execdebug.h and thought ... maybe we
should just get rid of that file altogether.

> Also, as a small example, in src/backend/access/transam/rmgr.c you'll see
> some IWYU pragma annotations to handle a special case there.

Looks pretty acceptable.

> The purpose of this patch is twofold: One, it's of course a nice cleanup.
> Two, this is a test how well IWYU might work for us.  If we find either by
> human interpretation that a change doesn't make sense, or something breaks
> on some platform, then that would be useful feedback (perhaps to addressed
> by more pragma annotations or more test coverage).

Apparently the tool has long standing, so since the results appear to be
good, I'm not opposed to adding it to our workflow.  Not as much as
pgindent for sure, though.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"All rings of power are equal,
But some rings of power are more equal than others."
 (George Orwell's The Lord of the Rings)




Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Andres Freund
Hi,

On 2024-02-10 08:40:43 +0100, Peter Eisentraut wrote:
> So as a test, I ran IWYU over the backend *.c files and removed all the
> #includes it suggested.  (Note that IWYU also suggests to *add* a bunch of
> #includes, in fact that is its main purpose; I didn't do this here.) In some
> cases, a more specific #include replaces another less specific one.  (To
> keep the patch from exploding in size, I ignored for now all the suggestions
> to replace catalog/pg_somecatalog.h with catalog/pg_somecatalog_d.h.)  This
> ended up with the attached patch, which has

I think this kind of suggested change is why I have been wary of IWYU (and the
changes that clangd suggest): By moving indirect includes to .c files you end
up spreading implementation details more widely, which can increase the pain of
refactoring.

I'd be hesitant to commit to this without a) a policy about adding annotations
about when indirect includes shouldn't directly be included b) annotations
ensuring that.


What were the parameters you used for IWYU? I'm e.g. a bit surprised about the
changes to main.c, adding an include for s_lock.h. For one I don't think
s_lock.h should ever be included directly, but more importantly it seems there
isn't a reason to include spin.h either, but it's not removed here?

There are other changes I don't understand. E.g. why is
catalog/binary_upgrade.h removed from pg_enum.c? It's actually defining
binary_upgrade_next_pg_enum_oid, declared in catalog/binary_upgrade.h?


I think some of the changes here could be applied independently of more iwyu
infrastructure, e.g. replacing includes of builtins.h with includes of
fmgrprotos.h. But the larger set of changes seems somewhat hard to judge
as-is.


FWIW, for me the tree with the patch applied doesn't build anymore, presumably
due to 8be93177c46.
../../../../../home/andres/src/postgresql/src/backend/commands/event_trigger.c: 
In function 'EventTriggerOnLogin':
../../../../../home/andres/src/postgresql/src/backend/commands/event_trigger.c:955:72:
 error: 'F_OIDEQ' undeclared (first use in this function)
  955 | BTEqualStrategyNumber, 
F_OIDEQ,
  |
^~~
../../../../../home/andres/src/postgresql/src/backend/commands/event_trigger.c:955:72:
 note: each undeclared identifier is reported only once for each function it 
appears in


Greetings,

Andres Freund




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-02-12 Thread Jeff Davis
On Wed, 2024-01-31 at 14:30 +0530, Bharath Rupireddy wrote:
> Please see the attached v22 patch set.

Committed 0001.

For 0002 & 0003, I'd like more clarity on how they will actually be
used by an extension.

For 0004, we need to resolve why callers are using XLOG_BLCKSZ and we
can fix that independently, as discussed here:

https://www.postgresql.org/message-id/CALj2ACV=C1GZT9XQRm4iN1NV1T=hla_hsgwnx2y5-g+mswd...@mail.gmail.com

Regards,
Jeff Davis





Re: Reducing output size of nodeToString

2024-02-12 Thread Matthias van de Meent
On Mon, 12 Feb 2024 at 19:03, Matthias van de Meent
 wrote:
> Attached is patchset v2, which contains the improvements from these patches:

Attached v3, which fixes an out-of-bounds read in pg_strtoken_next,
detected by asan, that was a likely cause of the problems in CFBot's
FreeBSD regression tests.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)


v3-0002-pg_node_tree-Don-t-store-query-text-locations-in-.patch
Description: Binary data


v3-0004-gen_node_support.pl-Add-a-TypMod-type-for-signall.patch
Description: Binary data


v3-0005-nodeToString-omit-serializing-NULL-datums-in-Cons.patch
Description: Binary data


v3-0003-gen_node_support.pl-Mark-location-fields-as-type-.patch
Description: Binary data


v3-0006-nodeToString-Apply-RLE-on-Bitmapset-and-numeric-L.patch
Description: Binary data


v3-0007-gen_node_support.pl-Optimize-serialization-of-fie.patch
Description: Binary data


v3-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch
Description: Binary data


Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs

2024-02-12 Thread Daniel Verite
Jakub Wartak wrote:

> when I run with default pager (more or less):
> \set FETCH_COUNT 1000
> WITH data AS (SELECT  generate_series(1, 2000) as Total) select
> repeat('a',100) || data.Total || repeat('b', 800) as total_pat from
> data;
> -- it enters pager, a skip couple of pages and then "q"
> 
> .. then - both backend and psql - go into 100% CPU as it were still
> receiving

Thanks for looking into this patch!

What's happening after the pager has quit is that psql continues
to pump results from the server until there are no more results.

If the user wants to interrupt that, they should hit Ctrl+C to
cancel the query. I think psql should not cancel it implicitly
on their behalf, as it also cancels the transaction.
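
("Pumping" here amounts to the usual libpq drain loop -- a sketch:

    PGresult *res;

    /* Discard the remaining results; the query keeps running. */
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);

which only returns once the server has sent everything.)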

The behavior differs from the cursor implementation, because in
the cursor case, when the pager is displaying results, no query is
running. The previous FETCH results have been entirely
read, and the next FETCH has not been sent to the server yet.
This is why quitting the pager in the middle of this can
be dealt with instantly.

> (that doesn't happen e.g. with export PAGER=cat).  So I'm
> not sure, maybe ExecQueryAndProcessResults() should somewhat
> faster abort when the $PAGER is exiting normally(?).

I assume that when using PAGER=cat, you cancel the display
with Ctrl+C, which propagates to psql and has the effect
of also canceling the query. In that case it displays
"Cancel request sent",
and then shortly after it gets back from the server:
"ERROR:  canceling statement due to user request".
That case corresponds to the generic query canceling flow.

OTOH if I kill the "cat" process with kill -TERM, I see the same
behavior as with "more" or "less": postgres runs the query to
completion and psql pumps the results.


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Andrew Dunstan


On 2024-02-12 Mo 11:44, Dave Cramer wrote:


Dave Cramer
www.postgres.rocks


On Mon, 12 Feb 2024 at 09:19, Andrew Dunstan  wrote:


On 2024-02-12 Mo 08:51, Dave Cramer wrote:



On Sat, 10 Feb 2024 at 13:28, Andrew Dunstan
 wrote:


On 2024-02-10 Sa 12:20, Dave Cramer wrote:



On Sat, 10 Feb 2024 at 11:19, Andrew Dunstan
 wrote:


On 2024-02-09 Fr 14:23, Dave Cramer wrote:


Dave Cramer
www.postgres.rocks 


On Fri, 9 Feb 2024 at 07:18, Dave Cramer

 wrote:





On Fri, 9 Feb 2024 at 00:26, Michael Paquier
 wrote:

On Tue, Feb 06, 2024 at 07:01:49AM -0500, Dave
Cramer wrote:
> Thanks, this patch works and
> testing with meson passes.

Only with the version posted at [1]? 
Interesting, that's the same
contents as v8 posted upthread, minus
src/tools/ because we don't need
to care about them anymore.

Andrew, what's happening on the test side?  It
does not seem you've
mentioned any details about what is going
wrong, or I have just missed
them.

> I'll try the buildfarm next.

[1]:

https://www.postgresql.org/message-id/ea42654a-3dc4-98b0-335b-56b7ec5e5...@dunslane.net


interestingly meson test does not produce any error
The buildfarm produces the following error for me:

-SELECT relname, attname, coltypes,
get_columns_length(coltypes)
- FROM check_columns
- WHERE get_columns_length(coltypes) % 8 != 0 OR
-  'name'::regtype::oid = ANY(coltypes);
- relname | attname | coltypes | get_columns_length
--+-+--+
-(0 rows)
-
+server closed the connection unexpectedly
+This probably means the server terminated abnormally
+before or while processing the request.
+connection to server was lost


Actually digging some more, here is the actual error

2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:
 server process (PID 11204) was terminated by exception
0xC005
2024-02-09 13:31:11.008 -05 postmaster[10672] DETAIL:
 Failed process was running: VACUUM;
2024-02-09 13:31:11.008 -05 postmaster[10672] HINT:
 See C include file "ntstatus.h" for a description of
the hexadecimal value.
2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:
 terminating any other active server processes
2024-02-09 13:31:11.013 -05 postmaster[10672] LOG:  all
server processes terminated; reinitializing
2024-02-09 13:31:11.034 -05 startup[6152] LOG:
 database system was interrupted; last known up at
2024-02-09 13:31:01 -05






So how does one debug this ?

Also if I run `meson test` I don't see this error. What does
the buildfarm do differently?



First it does this:


meson test -C $pgsql --no-rebuild --suite setup


Then it does this (jflag is for the number of jobs):


meson test -t $meson_test_timeout $jflag -C $pgsql
--logbase checklog --print-errorlogs --no-rebuild --suite
regress --test-args=--no-locale



running the above manually produces no errors ??




Not for me. I get the error I previously reported, It's an access
violation error.




OK, so I have managed to get a debugger attached to postgres.exe when 
it faults and the fault occurs at

https://github.com/postgres/postgres/blob/09eb633e1baa3b7cd7929f3cc77f9c46f63c20b1/src/backend/utils/mmgr/dsa.c#L869
span is pointing to 0x0




Further data point. If I select a build type of 'debug' instead of 
'debugoptimized' the error disappears.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Patch: Add parse_type Function

2024-02-12 Thread Pavel Stehule
po 12. 2. 2024 v 19:20 odesílatel Tom Lane  napsal:

> I wrote:
> > It strikes me that this is basically to_regtype() with the additional
> > option to return the typmod.  That leads to some questions:
>
> BTW, another way that this problem could be approached is to use
> to_regtype() as-is, with a separate function to obtain the typmod:
>
> select format_type(to_regtype('timestamp(4)'),
> to_regtypmod('timestamp(4)'));
>
> This is intellectually ugly, since it implies parsing the same
> typename string twice.  But on the other hand it avoids the notational
> pain and runtime overhead involved in using a record-returning
> function.  So I think it might be roughly a wash for performance.
> Question to think about is which way is easier to use.  I don't
> have an opinion particularly; just throwing the idea out there.
>

+1

Pavel

>
> regards, tom lane
>


Re: Patch: Add parse_type Function

2024-02-12 Thread Tom Lane
I wrote:
> It strikes me that this is basically to_regtype() with the additional
> option to return the typmod.  That leads to some questions:

BTW, another way that this problem could be approached is to use
to_regtype() as-is, with a separate function to obtain the typmod:

select format_type(to_regtype('timestamp(4)'), to_regtypmod('timestamp(4)'));

This is intellectually ugly, since it implies parsing the same
typename string twice.  But on the other hand it avoids the notational
pain and runtime overhead involved in using a record-returning
function.  So I think it might be roughly a wash for performance.
Question to think about is which way is easier to use.  I don't
have an opinion particularly; just throwing the idea out there.
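
For concreteness, a hypothetical sketch of such a function, reusing
parseTypeString() much as the patch under discussion does (not actual
code):

    Datum
    to_regtypmod(PG_FUNCTION_ARGS)
    {
        char       *typname = text_to_cstring(PG_GETARG_TEXT_PP(0));
        Oid         typid;
        int32       typmod;

        /* a soft-error context would be needed for return-NULL behavior */
        parseTypeString(typname, &typid, &typmod, NULL);
        PG_RETURN_INT32(typmod);
    }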

regards, tom lane




Re: Reducing output size of nodeToString

2024-02-12 Thread Matthias van de Meent
On Wed, 31 Jan 2024 at 18:47, Robert Haas  wrote:
>
> On Wed, Jan 31, 2024 at 11:17 AM Matthias van de Meent
>  wrote:
> > I was also thinking about smaller per-attribute expression storage, for 
> > index attribute expressions, table default expressions, and functions. 
> > Other than that, less memory overhead for the serialized form of these 
> > constructs also helps for catalog cache sizes, etc.
> > People complained about the size of a fresh initdb, and I agreed with them, 
> > so I started looking at low-hanging fruits, and this is one.
> >
> > I've not done any tests yet on whether it's more performant in general. I'd 
> > expect the new code to do a bit better given the extremely verbose nature 
> > of the data and the rather complex byte-at-a-time token read method used, 
> > but this is currently hypothesis.
> > I do think that serialization itself may be slightly slower, but given that 
> > this generally happens only in DDL, and that we have to grow the output 
> > buffer less often, this too may still be a net win (but, again, this is an 
> > untested hypothesis).
>
> I think we're going to have to have separate formats for debugging and
> storage if we want to get very far here. The current format sucks for
> readability because it's so verbose, and tightening that up where we
> can makes sense to me. For me, that can include things like emitting
> unset location fields for sure, but delta-encoding of bitmap sets is
> more questionable. Turning 1 2 3 4 5 6 7 8 9 10 into 1-10 would be
> fine with me because that is both shorter and more readable, but
> turning 2 4 6 8 10 into 2 2 2 2 2 is way worse for a human reader.
> Such optimizations might make sense in a format that is designed for
> computer processing only but not one that has to serve multiple
> purposes.

I suppose so, yes. I've removed the delta-encoding from the
serialization of bitmapsets in the attached patchset.

Peter E. and I spoke about this patchset at FOSDEM PGDay, too. I said
to him that I wouldn't mind if this patchset was only partly applied:
The gains for most of the changes are definitely worth it even if some
others don't get in.

I think it'd be a nice QoL and storage improvement if even only (say)
the first two patches were committed, though the typmod default
markings (or alternatively, using a typedef-ed TypMod and one more
type-specific serialization handler) would also be a good improvement
without introducing too many "common value = default = omitted"
considerations that would reduce debuggability.

Attached is patchset v2, which contains the improvements from these patches:

0001 has the "omit defaults" for the current types.
-23.5%pt / -35.1%pt (toasted / raw)
0002+0003 has new #defined type "Location" for those fields in Nodes
that point into (or have sizes of) query texts, and adds
infrastructure to conditionally omit them at all (see previous
discussions)
-3.5%pt / -6.3%pt
0004 has new #defined type TypeMod as alias for int32, that uses a
default value of -1 for (de)serialization purposes.
-3.0%pt / -6.1%pt
0005 updates Const node serialization to omit `:constvalue` if the
value is null.
+0.1%pt / -0.1%pt [^0]
0006 does run-length encoding for bitmaps and the various typed
integer lists, using "+int" as indicators of a run of a certain
length, excluding the start value.
 Bitmaps, IntLists and XidLists are based on runs with increments
of 1 (so, a notation (i 1 +3) means (i 1 2 3 4)), while OidLists are
based on runs with no increments (so, (o 1 +3) means (o 1 1 1 1)).
-2.5%pt / -0.6%pt
0007 does add some select custom 'default' values, in that the
varnosyn and varattnosyn fields now treat the value of varno and
varattno as their default values.
This reduces the size of lists of Vars significantly and has a
very meaningful impact on the size of the compressed data (the default
pg_rewrite dataset contains some 10.8k Var nodes).
-10.4%pt / -9.7%pt

Total for the full applied patchset:
55.5% smaller data in pg_rewrite.ev_action before TOAST
45.7% smaller data in pg_rewrite.ev_action after applying TOAST

Toast relation size, as fraction of the main pg_rewrite table:
select pg_relation_size(2838) *1.0 / pg_relation_size('pg_rewrite');
  master: 4.7
  0007: 1.3

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[^0]: The small difference in size for patch 0005 is presumably due to
low occurrence of NULL-valued Const nodes. Additionally, the inline vs
out-of-line TOASTed data and data (not) fitting on the last blocks of
each relation are likely to cause the change in total compression
ratio. If we had more null-valued Const nodes in pg_rewrite, the ratio
would presumably have been better than this.

PS: query I used for my data collection, + combined data:

select 'master' as "version"
 , pg_database_size('template0') as "template0"
 , pg_total_relation_size('pg_rewrite') as "pg_rewrite"
 , sum(pg_column_size(ev_action)) as "toasted"
 , 

Re: Patch: Add parse_type Function

2024-02-12 Thread Tom Lane
"David E. Wheeler"  writes:
> [ v4-0001-Add-parse_type-SQL-function.patch ]

It strikes me that this is basically to_regtype() with the additional
option to return the typmod.  That leads to some questions:

* Should we choose a name related to to_regtype?  I don't have any
immediate suggestions, but since you didn't seem entirely set on
parse_type, maybe it's worth thinking along those lines.  OTOH,
to the extent that people would use this with format_type, maybe
parse_type is fine.

* Perhaps the type OID output argument should be declared regtype
not plain OID?  It wouldn't make any difference for passing it to
format_type, but in other use-cases perhaps regtype would be more
readable.  It's not a deal-breaker either way of course, since
you can cast oid to regtype or vice versa.

* Maybe the code should be in adt/regproc.c not format_type.c.

* Experience with to_regtype suggests strongly that people would
prefer "return NULL" to failure for an unrecognized type name.


Skimming the patch, I notice that the manual addition to
builtins.h should be unnecessary: the pg_proc.dat entry
should be enough to create an extern in fmgrprotos.h.
Also, I'm pretty sure that reformat_dat_file.pl will
think your pg_proc.dat entry is overly verbose.  See
https://www.postgresql.org/docs/devel/system-catalog-initial-data.html

regards, tom lane




Re: make dist using git archive

2024-02-12 Thread Tristan Partin

On Sun Feb 11, 2024 at 5:09 PM CST, Peter Eisentraut wrote:
Small update: I noticed that on Windows (at least the one that is 
running the CI job), I need to use git -c core.autocrlf=false, otherwise 
git archive does line-ending conversion for the files it puts into the 
archive.  With this fix, all the archives produced by all the CI jobs 
across the different platforms match, except the .tar.gz archive from 
the Linux job, which I suspect suffers from an old git version.  We 
should get the Linux images updated to a newer Debian version soon 
anyway, so I think that issue will go away.


I think with this change, it is unlikely I will be able to upstream 
anything to Meson that would benefit Postgres here since setting this 
option seems project dependent.


--
Tristan Partin
Neon (https://neon.tech)




Re: Document efficient self-joins / UPDATE LIMIT techniques.

2024-02-12 Thread Corey Huinker
>
> Do you plan to add it to the commitfest?  If yes, I'd set it "ready for
> committer".
>
> Commitfest entry reanimated.


Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Tristan Partin

On Sat Feb 10, 2024 at 1:40 AM CST, Peter Eisentraut wrote:
I played with include-what-you-use (IWYU), "a tool for use with clang to 
analyze #includes in C and C++ source files".[0]  I came across this via 
clangd (the language server), because clangd (via the editor) kept 
suggesting a bunch of #includes to remove.  And I suppose it was right.


So as a test, I ran IWYU over the backend *.c files and removed all the 
#includes it suggested.  (Note that IWYU also suggests to *add* a bunch 
of #includes, in fact that is its main purpose; I didn't do this here.) 
In some cases, a more specific #include replaces another less specific 
one.  (To keep the patch from exploding in size, I ignored for now all 
the suggestions to replace catalog/pg_somecatalog.h with 
catalog/pg_somecatalog_d.h.)  This ended up with the attached patch, 
which has


  432 files changed, 233 insertions(+), 1023 deletions(-)

I tested this with various compilation options (assert, WAL_DEBUG, 
LOCK_DEBUG, different geqo variants, etc.) to make sure a header wasn't 
just used for some conditional code section.  Also, again, this patch 
touches just *.c files, so nothing declared from header files changes in 
hidden ways.


Also, as a small example, in src/backend/access/transam/rmgr.c you'll 
see some IWYU pragma annotations to handle a special case there.


The purpose of this patch is twofold: One, it's of course a nice 
cleanup.  Two, this is a test how well IWYU might work for us.  If we 
find either by human interpretation that a change doesn't make sense, or 
something breaks on some platform, then that would be useful feedback 
(perhaps to addressed by more pragma annotations or more test coverage).


(Interestingly, IWYU has been mentioned in src/tools/pginclude/README 
since 2012.  Has anyone else played with it?  Was it not mature enough 
back then?)


[0]: https://include-what-you-use.org/


I like this idea. This was something I tried to do but never finished in 
my last project. I have also been noticing the same issues from clangd. 
Having more automation around include patterns would be great! I think 
it would be good if there were a Meson run_target()/Make .PHONY target that 
people could use to test too. Then, eventually the cfbot could pick this 
up.


--
Tristan Partin
Neon (https://neon.tech)




Re: glibc qsort() vulnerability

2024-02-12 Thread Mats Kindahl
On Mon, Feb 12, 2024 at 4:57 PM Nathan Bossart 
wrote:

> On Sun, Feb 11, 2024 at 03:44:42PM +0100, Mats Kindahl wrote:
> > On Sat, Feb 10, 2024 at 9:53 PM Nathan Bossart  >
> > wrote:
> >> and I think we should expand on some of the commentary in int.h.
> >> For example, the comment at the top of int.h seems very tailored to the
> >> existing functions and should probably be adjusted.
> >
> >
> > I rewrote the beginning to the following, does that look good?
> >
> >  * int.h
> >  *  Routines to perform signed and unsigned integer arithmetics,
> including
> >  *  comparisons, in an overflow-safe way.
> >
> >
> >
> >> And the "comparison
> >> routines for integers" comment might benefit from some additional
> details
> >> about the purpose and guarantees of the new functions.
> >>
> >
> > I expanded that into the following. WDYT?
> >
> >
> /*
> >  * Comparison routines for integers.
> >  *
> >  * These routines are used to implement comparison functions for, e.g.,
> >  * qsort(). They are designed to be efficient and not risk overflows in
> >  * internal computations that could cause strange results, such as
> INT_MIN >
> >  * INT_MAX if you just return "lhs - rhs".
> >
> *
>
> LGTM.  I might editorialize a bit before committing, but I think your
> proposed wording illustrates the thrust of the change.
>

Thanks Nathan,

Here are the two fixed patches.
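
For reference, the comparison helpers boil down to the usual overflow-safe
idiom -- a sketch, with the real definitions living in common/int.h:

    static inline int
    pg_cmp_s32(int32 a, int32 b)
    {
        return (a > b) - (a < b);
    }

Since the operands are never subtracted, no intermediate result can
overflow.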

Best wishes,
Mats Kindahl


>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com
>
From b68f08b6afd62b75755e88f11a3ca31cfdce2645 Mon Sep 17 00:00:00 2001
From: Mats Kindahl 
Date: Fri, 9 Feb 2024 21:48:27 +0100
Subject: Use integer comparison functions

Use overflow-safe integer functions when implementing
qsort() comparison functions.
---
 contrib/hstore/hstore_gist.c|  3 ++-
 contrib/intarray/_intbig_gist.c |  3 ++-
 contrib/pg_stat_statements/pg_stat_statements.c |  8 ++--
 contrib/pg_trgm/trgm_op.c   |  8 ++--
 src/backend/access/nbtree/nbtinsert.c   |  8 ++--
 src/backend/access/nbtree/nbtpage.c | 10 +++---
 src/backend/access/nbtree/nbtsplitloc.c |  8 ++--
 src/backend/access/spgist/spgdoinsert.c |  5 ++---
 src/backend/access/spgist/spgtextproc.c |  3 ++-
 src/backend/backup/basebackup_incremental.c |  8 ++--
 src/backend/backup/walsummary.c |  7 ++-
 src/backend/catalog/heap.c  |  8 ++--
 src/backend/nodes/list.c| 13 +++--
 src/backend/nodes/tidbitmap.c   |  7 ++-
 src/backend/parser/parse_agg.c  |  7 ---
 src/backend/postmaster/autovacuum.c |  6 ++
 src/backend/replication/logical/reorderbuffer.c |  7 ++-
 src/backend/replication/syncrep.c   |  8 ++--
 src/backend/utils/adt/oid.c |  7 ++-
 src/backend/utils/adt/tsgistidx.c   | 10 +++---
 src/backend/utils/adt/tsquery_gist.c|  6 ++
 src/backend/utils/adt/tsvector.c|  5 ++---
 src/backend/utils/adt/tsvector_op.c |  5 ++---
 src/backend/utils/adt/xid.c |  7 ++-
 src/backend/utils/cache/relcache.c  |  3 ++-
 src/backend/utils/cache/syscache.c  |  5 ++---
 src/backend/utils/cache/typcache.c  |  8 ++--
 src/backend/utils/resowner/resowner.c   | 10 ++
 src/bin/pg_dump/pg_dump_sort.c  |  7 ++-
 src/bin/pg_upgrade/function.c   | 13 +++--
 src/bin/pg_walsummary/pg_walsummary.c   |  8 ++--
 src/bin/psql/crosstabview.c |  3 ++-
 src/include/access/gin_private.h|  8 ++--
 33 files changed, 76 insertions(+), 156 deletions(-)

diff --git a/contrib/hstore/hstore_gist.c b/contrib/hstore/hstore_gist.c
index fe343739eb..1079f25bf7 100644
--- a/contrib/hstore/hstore_gist.c
+++ b/contrib/hstore/hstore_gist.c
@@ -7,6 +7,7 @@
 #include "access/reloptions.h"
 #include "access/stratnum.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "hstore.h"
 #include "utils/pg_crc.h"
 
@@ -356,7 +357,7 @@ typedef struct
 static int
 comparecost(const void *a, const void *b)
 {
-	return ((const SPLITCOST *) a)->cost - ((const SPLITCOST *) b)->cost;
+	return pg_cmp_s32(((const SPLITCOST *) a)->cost, ((const SPLITCOST *) b)->cost);
 }
 
 
diff --git a/contrib/intarray/_intbig_gist.c b/contrib/intarray/_intbig_gist.c
index 8c6c4b5d05..aef9e13dcc 100644
--- a/contrib/intarray/_intbig_gist.c
+++ b/contrib/intarray/_intbig_gist.c
@@ -9,6 +9,7 @@
 #include "access/gist.h"
 #include "access/reloptions.h"
 #include "access/stratnum.h"
+#include "common/int.h"
 #include "port/pg_bitutils.h"
 
 #define GETENTRY(vec,pos) ((GISTTYPE *) 

Re: make BuiltinTrancheNames less ugly

2024-02-12 Thread Tristan Partin

On Wed Jan 24, 2024 at 8:09 AM CST, Alvaro Herrera wrote:

From 3d24b89855888a6650ec1aafb3579d810bfec5ac Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 23 Jan 2024 10:36:14 +0100
Subject: [PATCH] Remove IndividualLWLockNames

We can just merge the lwlock names into the BuiltinTrancheNames array.
This requires that Meson compiles the file with -I. in CPPFLAGS, which
in turn requires some additional contortions for DTrace support in
FreeBSD.
---
 src/backend/meson.build  |  4 +++-
 src/backend/storage/lmgr/Makefile|  3 ++-
 src/backend/storage/lmgr/generate-lwlocknames.pl | 10 ++
 src/backend/storage/lmgr/lwlock.c| 13 -
 src/backend/storage/lmgr/meson.build | 12 ++--
 5 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/backend/meson.build b/src/backend/meson.build
index 8767aaba67..57a52c37e0 100644
--- a/src/backend/meson.build
+++ b/src/backend/meson.build
@@ -127,7 +127,9 @@ backend_objs = [postgres_lib.extract_all_objects(recursive: 
false)]
 if dtrace.found() and host_system != 'darwin'
   backend_input += custom_target(
 'probes.o',
-input: ['utils/probes.d', postgres_lib.extract_objects(backend_sources, 
timezone_sources)],
+input: ['utils/probes.d',
+  postgres_lib.extract_objects(backend_sources, timezone_sources),
+  lwlock_lib.extract_objects(lwlock_source)],
 output: 'probes.o',
 command: [dtrace, '-C', '-G', '-o', '@OUTPUT@', '-s', '@INPUT@'],
 install: false,
diff --git a/src/backend/storage/lmgr/Makefile 
b/src/backend/storage/lmgr/Makefile
index 504480e847..81da6ee13a 100644
--- a/src/backend/storage/lmgr/Makefile
+++ b/src/backend/storage/lmgr/Makefile
@@ -12,13 +12,14 @@ subdir = src/backend/storage/lmgr
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
+override CPPFLAGS := -I. $(CPPFLAGS)

+
 OBJS = \
condition_variable.o \
deadlock.o \
lmgr.o \
lock.o \
lwlock.o \
-   lwlocknames.o \
predicate.o \
proc.o \
s_lock.o \
diff --git a/src/backend/storage/lmgr/generate-lwlocknames.pl 
b/src/backend/storage/lmgr/generate-lwlocknames.pl
index 7b93ecf6c1..a679a4ff54 100644
--- a/src/backend/storage/lmgr/generate-lwlocknames.pl
+++ b/src/backend/storage/lmgr/generate-lwlocknames.pl
@@ -10,7 +10,6 @@ use Getopt::Long;
 my $output_path = '.';
 
 my $lastlockidx = -1;

-my $continue = "\n";
 
 GetOptions('outdir:s' => \$output_path);
 
@@ -29,8 +28,6 @@ print $h $autogen;

 print $h "/* there is deliberately not an #ifndef LWLOCKNAMES_H here */\n\n";
 print $c $autogen, "\n";
 
-print $c "const char *const IndividualLWLockNames[] = {";

-
 #
 # First, record the predefined LWLocks listed in wait_event_names.txt.  We'll
 # cross-check those with the ones in lwlocknames.txt.
@@ -97,12 +94,10 @@ while (<$lwlocknames>)
while ($lastlockidx < $lockidx - 1)
{
++$lastlockidx;
-   printf $c "%s  \"\"", $continue, 
$lastlockidx;
-   $continue = ",\n";
+   printf $c "[%s] = \"\",\n", $lastlockidx, 
$lastlockidx;
}
-   printf $c "%s  \"%s\"", $continue, $trimmedlockname;
+   printf $c "[%s] = \"%s\",\n", $lockidx, $trimmedlockname;
$lastlockidx = $lockidx;
-   $continue = ",\n";
 
 	print $h "#define $lockname ([$lockidx].lock)\n";

 }
@@ -112,7 +107,6 @@ die
   . "lwlocknames.txt"
   if $i < scalar @wait_event_lwlocks;
 
-printf $c "\n};\n";

 print $h "\n";
 printf $h "#define NUM_INDIVIDUAL_LWLOCKS %s\n", $lastlockidx + 1;
 
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c

index 98fa6035cc..8aad9c6690 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -115,8 +115,8 @@ StaticAssertDecl(LW_VAL_EXCLUSIVE > (uint32) MAX_BACKENDS,
  * There are three sorts of LWLock "tranches":
  *
  * 1. The individually-named locks defined in lwlocknames.h each have their
- * own tranche.  The names of these tranches appear in IndividualLWLockNames[]
- * in lwlocknames.c.
+ * own tranche.  The names of these tranches come from lwlocknames.c into
+ * BuiltinTrancheNames[] below.
  *
  * 2. There are some predefined tranches for built-in groups of locks.
  * These are listed in enum BuiltinTrancheIds in lwlock.h, and their names
@@ -129,9 +129,8 @@ StaticAssertDecl(LW_VAL_EXCLUSIVE > (uint32) MAX_BACKENDS,
  * All these names are user-visible as wait event names, so choose with care
  * ... and do not forget to update the documentation's list of wait events.
  */
-extern const char *const IndividualLWLockNames[];  /* in lwlocknames.c */
-
 static const char *const BuiltinTrancheNames[] = {
+#include "lwlocknames.c"
[LWTRANCHE_XACT_BUFFER] = "XactBuffer",
[LWTRANCHE_COMMITTS_BUFFER] = "CommitTsBuffer",
[LWTRANCHE_SUBTRANS_BUFFER] = "SubtransBuffer",
@@ -738,11 +737,7 @@ 

Re: Document efficient self-joins / UPDATE LIMIT techniques.

2024-02-12 Thread Laurenz Albe
On Mon, 2024-02-12 at 11:45 -0500, Corey Huinker wrote:
> 
> > - I am not sure if it is necessary to have the  at all.
> >   I'd say that it is just a trivial variation of the UPDATE example.
> >   On the other hand, a beginner might find the example useful.
> >   Not sure.
> 
> I think a beginner would find it useful. The join syntax for DELETE is 
> different from
> UPDATE in a way that has never made sense to me, and a person with only the 
> UPDATE
> example might try just replacing UPDATE WITH DELETE and eliminating the SET 
> clause,
> and frustration would follow. We have an opportunity to show the equivalent 
> join in
> both cases, let's use it.

I think we can leave the decision to the committer.

> > I think the "in" before between is unnecessary and had better be removed, 
> > but
> > I'll defer to the native speaker.
> 
> The "in" is more common when spoken. Removed.

The "in" is appropriate for intransitive use:
"I've been here and I've been there and I've been in between."
But: "I have been between here and there."

Do you plan to add it to the commitfest?  If yes, I'd set it "ready for 
committer".

Yours,
Laurenz Albe




Re: Document efficient self-joins / UPDATE LIMIT techniques.

2024-02-12 Thread Corey Huinker
>
>
> - About the style: there is usually an empty line between an ending 
>   and the next starting .  It does not matter for correctness, but I
>   think it makes the source easier to read.
>

Done. I've seen them with spaces and without, and have no preference.


>
> - I would rather have only "here" as link text rather than "in greater
> details
>   here".  Even better would be something that gives the reader a clue where
>   the link will take her, like
>   the documentation of
> UPDATE.
>

Done.

>
> - I am not sure if it is necessary to have the  at all.
>   I'd say that it is just a trivial variation of the UPDATE example.
>   On the other hand, a beginner might find the example useful.
>   Not sure.
>

I think a beginner would find it useful. The join syntax for DELETE is
different from UPDATE in a way that has never made sense to me, and a
person with only the UPDATE example might try just replacing UPDATE WITH
DELETE and eliminating the SET clause, and frustration would follow. We
have an opportunity to show the equivalent join in both cases, let's use it.



> I think the "in" before between is unnecessary and had better be removed,
> but
> I'll defer to the native speaker.
>

The "in" is more common when spoken. Removed.
From a6b57bf3a88c5df614b5dede99af3e99fe8e8089 Mon Sep 17 00:00:00 2001
From: Corey Huinker 
Date: Mon, 12 Feb 2024 11:32:49 -0500
Subject: [PATCH v3] Documentation: Show alternatives to LIMIT on UPDATE and
 DELETE

Show examples of how to simulate UPDATE or DELETE with a LIMIT clause.

These examples also serve to show the existence and utility of ctid self-joins.
---
 doc/src/sgml/ref/delete.sgml | 18 ++
 doc/src/sgml/ref/update.sgml | 37 
 2 files changed, 55 insertions(+)

diff --git a/doc/src/sgml/ref/delete.sgml b/doc/src/sgml/ref/delete.sgml
index 1b81b4e7d7..1544a28e18 100644
--- a/doc/src/sgml/ref/delete.sgml
+++ b/doc/src/sgml/ref/delete.sgml
@@ -234,6 +234,24 @@ DELETE FROM films
In some cases the join style is easier to write or faster to
execute than the sub-select style.
   
+  
+   While there is no LIMIT clause for
+   DELETE, it is possible to get a similar effect
+   using the method for UPDATE operations described
+   in the documentation of UPDATE.
+
+WITH delete_batch AS (
+  SELECT l.ctid
+  FROM user_logs AS l
+  WHERE l.status = 'archived'
+  ORDER BY l.creation_date
+  LIMIT 1
+  FOR UPDATE
+)
+DELETE FROM user_logs AS ul
+USING delete_batch AS del
+WHERE ul.ctid = del.ctid;
+
  
 
  
diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 2ab24b0523..ed3dd029c7 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -442,6 +442,43 @@ COMMIT;
 
 UPDATE films SET kind = 'Dramatic' WHERE CURRENT OF c_films;
 
+  
+   Updates affecting many rows can have negative effects on system performance,
+   such as table bloat, increased replica lag, increased lock contention,
+   and possible failure of the operation due to a deadlock. In such situations
+   it can make sense to perform the operation in smaller batches. Performing a
+   VACUUM operation on the table between batches can help
+   reduce table bloat. The
+   SQL standard does
+   not define a LIMIT clause for UPDATE
+   operations, but it is possible to get a similar effect through the use of a
+   Common Table Expression and an
+   efficient self-join via the system column
+   ctid:
+
+WITH exceeded_max_retries AS (
+  SELECT w.ctid
+  FROM work_item AS w
+  WHERE w.status = 'active'
+  AND w.num_retries > 10
+  ORDER BY w.retry_timestamp
+  FOR UPDATE
+  LIMIT 5000
+)
+UPDATE work_item
+SET status = 'failed'
+FROM exceeded_max_retries AS emr
+WHERE work_item.ctid = emr.ctid
+
+If lock contention is a concern, then SKIP LOCKED can
+be added to the CTE. However, one final
+UPDATE without SKIP LOCKED or
+LIMIT will be needed to ensure that no matching rows
+were overlooked. The use of an ORDER BY clause allows
+the command to prioritize which rows will be locked and updated. This can
+also reduce contention with other update operations if they use the same
+ordering.
+  
  
 
  
-- 
2.43.0



Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Dave Cramer
Dave Cramer
www.postgres.rocks


On Mon, 12 Feb 2024 at 09:19, Andrew Dunstan  wrote:

>
> On 2024-02-12 Mo 08:51, Dave Cramer wrote:
>
>
>
> On Sat, 10 Feb 2024 at 13:28, Andrew Dunstan  wrote:
>
>>
>> On 2024-02-10 Sa 12:20, Dave Cramer wrote:
>>
>>
>>
>> On Sat, 10 Feb 2024 at 11:19, Andrew Dunstan  wrote:
>>
>>>
>>> On 2024-02-09 Fr 14:23, Dave Cramer wrote:
>>>
>>>
>>> Dave Cramer
>>> www.postgres.rocks
>>>
>>>
>>> On Fri, 9 Feb 2024 at 07:18, Dave Cramer 
>>>  wrote:
>>>




 On Fri, 9 Feb 2024 at 00:26, Michael Paquier 
 wrote:

> On Tue, Feb 06, 2024 at 07:01:49AM -0500, Dave Cramer wrote:
> > Thanks, this patch works and
> > testing with meson passes.
>
> Only with the version posted at [1]?  Interesting, that's the same
> contents as v8 posted upthread, minus src/tools/ because we don't need
> to care about them anymore.
>
> Andrew, what's happening on the test side?  It does not seem you've
> mentioned any details about what is going wrong, or I have just missed
> them.
>
> > I'll try the buildfarm next.
>
> [1]:
> https://www.postgresql.org/message-id/ea42654a-3dc4-98b0-335b-56b7ec5e5...@dunslane.net


 interestingly meson test does not produce any error
 The buildfarm produces the following error for me:

 -SELECT relname, attname, coltypes, get_columns_length(coltypes)
 - FROM check_columns
 - WHERE get_columns_length(coltypes) % 8 != 0 OR
 -   'name'::regtype::oid = ANY(coltypes);
 - relname | attname | coltypes | get_columns_length
 --+-+--+
 -(0 rows)
 -
 +server closed the connection unexpectedly
 + This probably means the server terminated abnormally
 + before or while processing the request.
 +connection to server was lost

>>>
>>> Actually digging some more, here is the actual error
>>>
>>> 2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:  server process (PID
>>> 11204) was terminated by exception 0xC005
>>> 2024-02-09 13:31:11.008 -05 postmaster[10672] DETAIL:  Failed process
>>> was running: VACUUM;
>>> 2024-02-09 13:31:11.008 -05 postmaster[10672] HINT:  See C include file
>>> "ntstatus.h" for a description of the hexadecimal value.
>>> 2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:  terminating any
>>> other active server processes
>>> 2024-02-09 13:31:11.013 -05 postmaster[10672] LOG:  all server
>>> processes terminated; reinitializing
>>> 2024-02-09 13:31:11.034 -05 startup[6152] LOG:  database system was
>>> interrupted; last known up at 2024-02-09 13:31:01 -05
>>>
>>>
>>>
>>>
>>>
>> So how does one debug this ?
>>
>> Also if I run `meson test` I don't see this error. What does the
>> buildfarm do differently?
>>
>>
>> First it does this:
>>
>>
>> meson test -C $pgsql --no-rebuild --suite setup
>>
>>
>> Then it does this (jflag is for the number of jobs):
>>
>>
>> meson test -t $meson_test_timeout $jflag -C $pgsql --logbase checklog
>> --print-errorlogs --no-rebuild --suite regress --test-args=--no-locale
>>
>>
>
> running the above manually produces no errors ??
>
>
>
> Not for me. I get the error I previously reported, It's an access
> violation error.
>
>
> cheers
>
>
> andrew
>

OK, so I have managed to get a debugger attached to postgres.exe when it
faults and the fault occurs at
https://github.com/postgres/postgres/blob/09eb633e1baa3b7cd7929f3cc77f9c46f63c20b1/src/backend/utils/mmgr/dsa.c#L869
span is pointing to 0x0

Dave


Re: Patch: Add parse_type Function

2024-02-12 Thread David E. Wheeler
On Feb 10, 2024, at 20:52, Erik Wienhold  wrote:
> 
> Let me comment on some issues since I wrote the very first version of
> parse_type() on which David's patch is based.

Thanks Erik.

>> On 2024-02-01 01:00 +0100, jian he  wrote:
>> if you are wondering around other code deal with NULL, ErrorSaveContext.
>> NULL would be more correct?
>> `(void) parseTypeString(type, , , NULL);`

Fixed.

>> also parseTypeString already explained the third argument, our
>> comments can be simplified as:
>> /*
>> * Parse type-name argument to obtain type OID and encoded typmod. We don't
>> * need to handle parseTypeString failure, just let the error be
>> * raised.
>> */


Thanks, adopted that language.

>> cosmetic issue. Looking around other error handling code, the format
>> should be something like:
>> `
>> if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE)
>>ereport(ERROR,
>>  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>errmsg("function returning record called in"
>> "context that cannot accept type record")));
>> `
> 
> +1

I poked around and found some other examples, yes. I went with a single long
line for errmsg, following the example in adminpack.c [1].

>> `PG_FUNCTION_INFO_V1(parse_type);`
>> is unnecessary?
>> based on the error information:  https://cirrus-ci.com/task/5899457502380032
>> not sure why it only fails on windows.
> 
> Yeah, it's not necessary for internal functions per [1].  It's a
> leftover from when this function was part of the pgtap extension.

Removed.

>> +#define PARSE_TYPE_STRING_COLS 2 /* Returns two columns. */
>> +#undef PARSE_TYPE_STRING_COLS
>> Are these necessary?
>> given that the comments already say return the OID and type modifier.
> 
> Not sure what the convention here is but I found this in a couple of
> places, maybe even a tutorial on writing C functions.  See
> `git grep '_COLS\s\+[1-9]'` for those instances.  It's in line with my
> habit of avoiding magic constants.

Left in place for now.

> 
>> +( typid oid,
>> +  typmod int4 )
>> here `int4` should be `integer`?
> 
> +1

Fixed.

> Yes, the sentence is rather handwavy.  What is meant here is that this
> function also resolves types whose typmod is determined by the SQL
> parser or some step after that.  There are types whose typmod is
> whatever value is found inside the parenthesis, e.g. bit(13) has typmod
> 13.  Our first idea before coming up with that function was to do some
> string manipulation and extract the typmod from inside the parenthesis.
> But we soon found out that other typmods don't translate one to one,
> e.g.  varchar(13) has typmod 17.  The interval type is also special
> because the typmod is some bitmask encoding of e.g. 'second(0)'.  Hence
> the mention of the SQL grammar.

I adopted some of your language here --- and fixed the spelling errors :-)

>> can be simplified:
>> +   
>> +Parses a string representing an SQL data type, optionally
>> schema-qualified.
>> +Returns a record with two fields, typid and
>> +typmod, representing the OID and
>> modifier for the
>> +type. These correspond to the parameters to pass to the
>> +format_type
>> function.
>> +   
>> (I don't have a strong opinion though)
> 
> Sounds better.  The CREATE TABLE reference is superfluous.

Done.

Best,

David
[1] 
https://github.com/postgres/postgres/blob/09eb633e1baa3b7cd7929f3cc77f9c46f63c20b1/contrib/adminpack/adminpack.c#L508-L511



v4-0001-Add-parse_type-SQL-function.patch
Description: Binary data




Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Nathan Bossart
On Mon, Feb 12, 2024 at 05:08:40PM +0100, Peter Eisentraut wrote:
> On 10.02.24 21:13, Nathan Bossart wrote:
>> I haven't played with it at all, but the topic reminds me of this thread:
>> 
>>  
>> https://postgr.es/m/flat/CALDaNm1B9naPDTm3ox1m_yZvOm3KA5S4kZQSWWAeLHAQ%3D3gV1Q%40mail.gmail.com
> 
> Approaches like that as well as the in-tree pgrminclude work by "I removed
> the #include and it still compiled fine", which can be unreliable.  IWYU on
> the other hand has the compiler tracking where a symbol actually came from,
> and so if it says that an #include is not used, then it's pretty much
> correct by definition.

Okay.  FWIW I tend to agree that this is nice cleanup.  I'd personally run
this before every commit on HEAD if there was an easy way to do so and it
didn't cause changes to a bunch of unrelated files, but I understand that
the pgindent stuff in the buildfarm isn't super popular.  I'd even like to
have a tool that automatically adjusted #include ordering to match project
policy, assuming one does not already exist.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Small fix on query_id_enabled

2024-02-12 Thread Yugo NAGATA
On Sat, 10 Feb 2024 10:19:15 +0900
Michael Paquier  wrote:

> On Fri, Feb 09, 2024 at 04:37:23PM +0800, Julien Rouhaud wrote:
> > On Fri, Feb 09, 2024 at 03:38:23PM +0900, Yugo NAGATA wrote:
> >> Also, I think the name is a bit confusing for the same reason, that is,
> >> query_id_enabled may be false even when query id is computed in fact.
> >>
> >> Actually, this does not matter because we use IsQueryIdEnabled to check
> >> if query id is enabled,  instead of referring to a global variable
> >> (query_id_enabled or compute_query_id). But, just for making a code a bit
> >> more readable, how about renaming this to query_id_required which seems to
> >> stand for the meaning more correctly?
> > 
> > -1 for renaming to avoid breaking extensions that might access it.  We 
> > should
> > simply document for compute_query_id and query_id_enabled declaration that 
> > one
> > should instead use IsQueryIdEnabled() if they're interested in whether the 
> > core
> > queryid are computed or not.
> 
> Agreed.  A renaming would involve more pain than gain.  Improving the
> comments around how to all that would be good enough, my 2c.

Thank you both for your comments.

I agreed with not renaming it.

I attached an updated patch that adds comments noting to use IsQueryIdEnabled()
instead of accessing the variables directly.
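
The helper combines both variables, roughly like this:

    static inline bool
    IsQueryIdEnabled(void)
    {
        if (compute_query_id == COMPUTE_QUERY_ID_OFF)
            return false;
        if (compute_query_id == COMPUTE_QUERY_ID_ON)
            return true;
        return query_id_enabled;
    }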

Regards,
Yugo Nagata
-- 
Yugo NAGATA 
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index e489bfceb5..82f725baaa 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -42,7 +42,13 @@
 /* GUC parameters */
 int			compute_query_id = COMPUTE_QUERY_ID_AUTO;
 
-/* True when compute_query_id is ON, or AUTO and a module requests them */
+/*
+ * True when compute_query_id is ON or AUTO, and a module requests them.
+ *
+ * Note that IsQueryIdEnabled() should be used instead of checking
+ * query_id_enabled or compute_query_id directly when we want to know
+ * whether query identifiers are computed in the core or not.
+ */
 bool		query_id_enabled = false;
 
 static void AppendJumble(JumbleState *jstate,


Re: cpluspluscheck complains about use of register

2024-02-12 Thread Tom Lane
Christoph Berg  writes:
> Should the removal of "register" be backported to support that better?

Perhaps.  It's early days yet, but nobody has complained that that
broke anything in v16, so I'm guessing it'd be fine.

regards, tom lane




Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Peter Eisentraut

On 10.02.24 21:13, Nathan Bossart wrote:

(Interestingly, IWYU has been mentioned in src/tools/pginclude/README since
2012.  Has anyone else played with it?  Was it not mature enough back then?)

I haven't played with it at all, but the topic reminds me of this thread:


https://postgr.es/m/flat/CALDaNm1B9naPDTm3ox1m_yZvOm3KA5S4kZQSWWAeLHAQ%3D3gV1Q%40mail.gmail.com


Approaches like that as well as the in-tree pgrminclude work by "I 
removed the #include and it still compiled fine", which can be 
unreliable.  IWYU on the other hand has the compiler tracking where a 
symbol actually came from, and so if it says that an #include is not 
used, then it's pretty much correct by definition.






Re: clarify equalTupleDescs()

2024-02-12 Thread Tom Lane
Peter Eisentraut  writes:
> On 06.02.24 16:14, Tom Lane wrote:
>> +1 for the general idea, but it seems like "row type equality"
>> might still be a slightly fuzzy concept.

> I did another pass across the callers to check what pg_attribute fields 
> might be relevant.

> Collation definitely needs to be added, certainly for plancache.c, maybe 
> for typcache.c, the other callers don't care.

+1

> Record types can have attisdropped fields, so it's probably good to 
> check those.

Yeah, good idea.  (In most cases the attname comparison would catch
that, but we shouldn't rely on it.)  In a perfect world maybe a
dropped column should be invisible to this comparison, but we're
a very long way from being able to treat it that way.
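
So the per-attribute core of "row type equality" would be roughly this
sketch, using the fields discussed so far:

    if (attr1->atttypid != attr2->atttypid ||
        attr1->atttypmod != attr2->atttypmod ||
        attr1->attcollation != attr2->attcollation ||
        attr1->attisdropped != attr2->attisdropped ||
        strcmp(NameStr(attr1->attname), NameStr(attr2->attname)) != 0)
        return false;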

> I'm suspicious about attndims.  Maybe one could create a test case where 
> record types differ only in that.  Support for attndims throughout the 
> system is weak, but maybe there is something to check there.

There was a discussion last year[1] about removing attndims
altogether, which still seems to me like possibly a good idea.
So I doubt we want to consider it as a core semantic field.

> On a conceptual level, I figured pg_attribute rows can be divided up 
> into three categories:

> 1. "row type" stuff: attname, atttypid, atttypmod, attndims, 
> attisdropped, attcollation

> 2. physical layout stuff: attlen, attcacheoff, attbyval, attalign

I recall some discussion about taking attcacheoff out of this data
structure too ...

> 3. table metadata stuff (everything else)

> It's not perfect, and sometimes it's not clear whether these categories 
> inform the implementation or the other way around, but I think it helps 
> conceptualize it.

Sure.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/ZD%2B14YZ4IUue8Rhi%40gendo.asyd.net




Re: glibc qsort() vulnerability

2024-02-12 Thread Nathan Bossart
On Sun, Feb 11, 2024 at 03:44:42PM +0100, Mats Kindahl wrote:
> On Sat, Feb 10, 2024 at 9:53 PM Nathan Bossart 
> wrote:
>> and I think we should expand on some of the commentary in int.h.
>> For example, the comment at the top of int.h seems very tailored to the
>> existing functions and should probably be adjusted.
> 
> 
> I rewrote the beginning to the following, does that look good?
> 
>  * int.h
>  *  Routines to perform signed and unsigned integer arithmetics, including
>  *  comparisons, in an overflow-safe way.
> 
> 
> 
>> And the "comparison
>> routines for integers" comment might benefit from some additional details
>> about the purpose and guarantees of the new functions.
>>
> 
> I expanded that into the following. WDYT?
> 
> /*
>  * Comparison routines for integers.
>  *
>  * These routines are used to implement comparison functions for, e.g.,
>  * qsort(). They are designed to be efficient and not risk overflows in
>  * internal computations that could cause strange results, such as INT_MIN >
>  * INT_MAX if you just return "lhs - rhs".
>  *

LGTM.  I might editorialize a bit before committing, but I think your
proposed wording illustrates the thrust of the change.
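
For readers skimming the thread, the pattern being documented boils down to
something like this (a sketch, not necessarily the exact code that will be
committed):

    #include <stdint.h>

    static int
    cmp_int32(const void *p1, const void *p2)
    {
        const int32_t a = *(const int32_t *) p1;
        const int32_t b = *(const int32_t *) p2;

        /*
         * (a > b) - (a < b) yields -1/0/+1 without ever computing a - b,
         * which can overflow (e.g. INT32_MIN - 1) and misorder the result.
         */
        return (a > b) - (a < b);
    }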

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] allow pg_current_logfile() execution under pg_monitor role

2024-02-12 Thread Nathan Bossart
On Mon, Feb 12, 2024 at 12:27:54PM +, Pavlo Golub wrote:
>>  Are there any other
>> functions that pg_monitor ought to have privileges for?
>> 
> Not that I'm aware of at the moment. This one was found by chance.

Okay.  I'll plan on committing this in the next few days.
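
In the meantime, the effect can be approximated manually with a couple of
grants (a sketch of what the patch wires into the default ACLs; both
signatures of the function are covered):

    GRANT EXECUTE ON FUNCTION pg_current_logfile() TO pg_monitor;
    GRANT EXECUTE ON FUNCTION pg_current_logfile(text) TO pg_monitor;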

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Support a wildcard in backtrace_functions

2024-02-12 Thread Peter Eisentraut

On 12.02.24 14:27, Jelte Fennema-Nio wrote:

And honestly wanting
to get backtraces for non-ERROR log entries seems quite niche to me,
which to me makes it a weird default.


I think one reason for this is that non-ERRORs are fairly unique in 
their wording, so you don't have to isolate them by function name.





Re: table inheritance versus column compression and storage settings

2024-02-12 Thread Peter Eisentraut

On 08.02.24 08:20, Ashutosh Bapat wrote:

On Wed, Feb 7, 2024 at 12:47 PM Ashutosh Bapat
 wrote:


0001 fixes compression inheritance
0002 fixes storage inheritance



The first patch does not update compression_1.out which makes CI
unhappy. Here's patchset fixing that.


The changed behavior looks good to me.  The tests are good, the code 
changes are pretty straightforward.


Did you by any chance check that pg_dump dumps the resulting structures 
correctly?  I notice in tablecmds.c that ALTER COLUMN SET STORAGE 
recurses but ALTER COLUMN SET COMPRESSION does not.  I don't understand 
why that is, and I wonder whether it affects pg_dump.
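
For concreteness, the asymmetry I mean looks like this (a sketch against the
current tree; table names are made up):

    CREATE TABLE parent (a text);
    CREATE TABLE child () INHERITS (parent);

    -- recurses: child.a also becomes EXTERNAL
    ALTER TABLE parent ALTER COLUMN a SET STORAGE EXTERNAL;

    -- does not recurse: child.a keeps its previous compression method
    ALTER TABLE parent ALTER COLUMN a SET COMPRESSION lz4;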






Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

2024-02-12 Thread Daniel Gustafsson
> On 11 Feb 2024, at 19:19, David Benjamin  wrote:
> 
> Hi all,
> 
> I've attached a patch for the master branch to fix up the custom BIOs used by 
> PostgreSQL, in light of the issues with the OpenSSL update recently. While 
> c82207a548db47623a2bfa2447babdaa630302b9 (switching from BIO_get_data to 
> BIO_get_app_data) resolved the immediate conflict, I don't think it addressed 
> the root cause, which is that PostgreSQL was mixing up two BIO 
> implementations, each of which expected to be driving the BIO.

Thanks for your patch, I'm still digesting it so will provide more of a review
later.

> It turns out the parts that came from the OpenSSL socket BIO were a no-op, 
> and in fact PostgreSQL is relying on it being a no-op. Instead, it's cleaner 
> to just define a custom BIO the normal way, which then leaves the more 
> standard BIO_get_data mechanism usable. This also avoids the risk that a 
> future OpenSSL will add a new BIO_ctrl to the socket type, with libssl 
> calling into it, and then break some assumptions made by PostgreSQL.

+   case BIO_CTRL_FLUSH:
+   /* libssl expects all BIOs to support BIO_flush. */
+   res = 1;
+   break;

Will this always be true?  Returning 1 implies that we have flushed all data on
the socket, but what if we have just called one of the BIO_set_retry_XX()
functions?

> I've attached a patch which does that. The existing SSL tests pass with it, 
> tested on Debian stable. (Though it took me a few iterations to figure out 
> how to run the SSL tests, so it's possible I've missed something.)

We've done a fair bit of work on making them easier to run, so I'm curious if
you saw any room for improvements there as someone coming to them for the first
time?
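
For readers following along, "defining a custom BIO the normal way" means
roughly the following with the modern API (a sketch using the public
BIO_meth_* functions; only the ctrl hook is shown, and the names are
illustrative):

    #include <openssl/bio.h>

    static long
    my_bio_ctrl(BIO *b, int cmd, long num, void *ptr)
    {
        switch (cmd)
        {
            case BIO_CTRL_FLUSH:
                /* nothing is buffered inside this BIO itself */
                return 1;
            default:
                return 0;
        }
    }

    static BIO_METHOD *
    my_bio_method(void)
    {
        BIO_METHOD *m = BIO_meth_new(BIO_TYPE_SOURCE_SINK, "my socket");

        if (m == NULL)
            return NULL;
        /* read/write/destroy callbacks would be registered the same way */
        BIO_meth_set_ctrl(m, my_bio_ctrl);
        return m;
    }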

--
Daniel Gustafsson





Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Andrew Dunstan


On 2024-02-12 Mo 08:51, Dave Cramer wrote:



On Sat, 10 Feb 2024 at 13:28, Andrew Dunstan  wrote:


On 2024-02-10 Sa 12:20, Dave Cramer wrote:



On Sat, 10 Feb 2024 at 11:19, Andrew Dunstan
 wrote:


On 2024-02-09 Fr 14:23, Dave Cramer wrote:


Dave Cramer
www.postgres.rocks 


On Fri, 9 Feb 2024 at 07:18, Dave Cramer

 wrote:





On Fri, 9 Feb 2024 at 00:26, Michael Paquier
 wrote:

On Tue, Feb 06, 2024 at 07:01:49AM -0500, Dave
Cramer wrote:
> Thanks, this patch works and
> testing with meson passes.

Only with the version posted at [1]?  Interesting,
that's the same
contents as v8 posted upthread, minus src/tools/
because we don't need
to care about them anymore.

Andrew, what's happening on the test side?  It does
not seem you've
mentioned any details about what is going wrong, or
I have just missed
them.

> I'll try the buildfarm next.

[1]:

https://www.postgresql.org/message-id/ea42654a-3dc4-98b0-335b-56b7ec5e5...@dunslane.net


interestingly meson test does not produce any error
The buildfarm produces the following error for me:

-SELECT relname, attname, coltypes,
get_columns_length(coltypes)
- FROM check_columns
- WHERE get_columns_length(coltypes) % 8 != 0 OR
-  'name'::regtype::oid = ANY(coltypes);
- relname | attname | coltypes | get_columns_length
--+-+--+
-(0 rows)
-
+server closed the connection unexpectedly
+This probably means the server terminated abnormally
+before or while processing the request.
+connection to server was lost


Actually digging some more, here is the actual error

2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:  server
process (PID 11204) was terminated by exception 0xC0000005
2024-02-09 13:31:11.008 -05 postmaster[10672] DETAIL:
 Failed process was running: VACUUM;
2024-02-09 13:31:11.008 -05 postmaster[10672] HINT:  See C
include file "ntstatus.h" for a description of the
hexadecimal value.
2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:
 terminating any other active server processes
2024-02-09 13:31:11.013 -05 postmaster[10672] LOG:  all
server processes terminated; reinitializing
2024-02-09 13:31:11.034 -05 startup[6152] LOG:  database
system was interrupted; last known up at 2024-02-09 13:31:01 -05






So how does one debug this?

Also, if I run `meson test` I don't see this error. What does the
buildfarm do differently?



First it does this:


meson test -C $pgsql --no-rebuild --suite setup


Then it does this (jflag is for the number of jobs):


meson test -t $meson_test_timeout $jflag -C $pgsql --logbase
checklog --print-errorlogs --no-rebuild --suite regress
--test-args=--no-locale



Running the above manually produces no errors??




Not for me. I get the error I previously reported; it's an access
violation error.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com


RE: Psql meta-command conninfo+

2024-02-12 Thread Maiquel Grassi

>I found that \conninfo and \conninfo+ act differently when the connection is 
>broken.
>I used pg_terminate_backend function from another session to terminate an open 
>psql session.
>After that, \conninfo does not see the connection break (surprisingly!), and 
>\conninfo+ returns an error:
>
>postgres@postgres(17.0)=# \conninfo+
>FATAL:  terminating connection due to administrator command
>server closed the connection unexpectedly
>   This probably means the server terminated abnormally
>   before or while processing the request.
>The connection to the server was lost. Attempting reset: Succeeded.
>
>postgres@postgres(17.0)=# \conninfo
>You are connected to database "postgres" as user "postgres" via socket in 
>"/tmp" at port "5401".
>
>Another surprise is that this check: if (db == NULL) did not work in both 
>cases.

--//--

Hi Pavel!

(v14)

The "if (db == NULL)" has been removed, as well
as the message inside it. To always maintain the
same error messages, I changed the validation of
"\conninfo" to match the validation used for "\conninfo+".
Therefore, now "\conninfo" and "\conninfo+" return
the same error when "pg_terminate_backend()" terminates
this session through another session. The result of
the adjustment is as follows:


Case 1 ("\conninfo+"):

[postgres@localhost bin]$ ./psql -x
psql (17devel)
Type "help" for help.

postgres=# \conninfo
You are connected to database "postgres" as user "postgres" via socket in 
"/tmp" at port "5432".
postgres=# \conninfo+
Current Connection Information
-[ RECORD 1 ]--+-
Database   | postgres
Authenticated User | postgres
System User|
Current User   | postgres
Session User   | postgres
Backend PID| 24381
Server Address |
Server Port| 5432
Client Address |
Client Port|
Socket Directory   | /tmp
Host   |

postgres=# \conninfo+
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.


Case 2 ("\conninfo"):

[postgres@localhost bin]$ ./psql -x
psql (17devel)
Type "help" for help.

postgres=# \conninfo
You are connected to database "postgres" as user "postgres" via socket in 
"/tmp" at port "5432".
postgres=# \conninfo+
Current Connection Information
-[ RECORD 1 ]--+-
Database   | postgres
Authenticated User | postgres
System User|
Current User   | postgres
Session User   | postgres
Backend PID| 24539
Server Address |
Server Port| 5432
Client Address |
Client Port|
Socket Directory   | /tmp
Host   |

postgres=# \conninfo
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

The connection to the server was lost. Attempting reset: Succeeded.

In both cases, the sessions were terminated by another session.

Regards,
Maiquel Grassi.


v14-0001-psql-meta-command-conninfo-plus.patch


Re: Synchronizing slots from primary to standby

2024-02-12 Thread Bertrand Drouvot
Hi,

On Mon, Feb 12, 2024 at 04:19:33PM +0530, Amit Kapila wrote:
> On Mon, Feb 12, 2024 at 3:33 PM Bertrand Drouvot
>  wrote:
> >
> > A few random comments:
> >
> >
> > 003 ===
> >
> > +  If, after executing the function,
> > +  
> > +  hot_standby_feedback is disabled on
> > +  the standby or the physical slot configured in
> > +  
> > +  primary_slot_name is
> > +  removed,
> >
> > I think another option that could lead to slot invalidation is if 
> > primary_slot_name
> > is NULL or miss-configured.
> >
> 
> If the primary_slot_name is NULL then the function will error out.

Yeah, right: it had to be non-NULL initially, so we know there is a physical
slot (if not dropped) that should prevent conflicts in the first place
(assuming hot_standby_feedback is on).
Please forget about comment 003 then.

> >
> > 005 ===
> >
> > + To resume logical replication after failover from the synced logical
> > + slots, the subscription's 'conninfo' must be altered
> >
> > Only in a pub/sub context but not for other ways of using the logical 
> > replication
> > slot(s).
> >
> 
> Right, but what additional information do you want here? I thought we
> were speaking about built-in logical replication here, so this is
> okay.

The "Logical Decoding Concepts" sub-chapter also mentions "Logical decoding 
clients"
so I was not sure the part added in the patch was for in-build logical 
replication
only.

Or maybe just reword that way "In case of in-build logical replication, to 
resume
after failover from the synced.."?

> 
> >
> > 008 ===
> >
> > +   ereport(LOG,
> > +   errmsg("dropped replication slot 
> > \"%s\" of dbid %d",
> > +  
> > NameStr(local_slot->data.name),
> > +  
> > local_slot->data.database));
> >
> > We emit a message when an "invalidated" slot is dropped but not when we 
> > create
> > a slot. Shouldn't we emit a message when we create a synced slot on the 
> > standby?
> >
> > I think that could be confusing to see "a drop" message not followed by "a 
> > create"
> > one when it's expected (slot valid on the primary for example).
> >
> 
> Isn't the below message for sync-ready slot sufficient? Otherwise, in
> most cases, we will LOG multiple similar messages.
> 
> + ereport(LOG,
> + errmsg("newly created slot \"%s\" is sync-ready now",
> +remote_slot->name));

Yes, it is sufficient if we reach it. For example, during one test I was able to
go through this code path:

Breakpoint 2, update_and_persist_local_synced_slot (remote_slot=0x56450e7c49c0, 
remote_dbid=5) at slotsync.c:340
340 ReplicationSlot *slot = MyReplicationSlot;
(gdb) n
346 if (remote_slot->restart_lsn < slot->data.restart_lsn ||
(gdb)
347 TransactionIdPrecedes(remote_slot->catalog_xmin,
(gdb)
346 if (remote_slot->restart_lsn < slot->data.restart_lsn ||
(gdb)
358 return;

means exiting from update_and_persist_local_synced_slot() without reaching the
"newly created slot" message (the slot on the primary was "inactive").

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add native windows on arm64 support

2024-02-12 Thread Dave Cramer
On Sat, 10 Feb 2024 at 13:28, Andrew Dunstan  wrote:

>
> On 2024-02-10 Sa 12:20, Dave Cramer wrote:
>
>
>
> On Sat, 10 Feb 2024 at 11:19, Andrew Dunstan  wrote:
>
>>
>> On 2024-02-09 Fr 14:23, Dave Cramer wrote:
>>
>>
>> Dave Cramer
>> www.postgres.rocks
>>
>>
>> On Fri, 9 Feb 2024 at 07:18, Dave Cramer 
>>  wrote:
>>
>>>
>>>
>>>
>>>
>>> On Fri, 9 Feb 2024 at 00:26, Michael Paquier 
>>> wrote:
>>>
 On Tue, Feb 06, 2024 at 07:01:49AM -0500, Dave Cramer wrote:
 > Thanks, this patch works and
 > testing with meson passes.

 Only with the version posted at [1]?  Interesting, that's the same
 contents as v8 posted upthread, minus src/tools/ because we don't need
 to care about them anymore.

 Andrew, what's happening on the test side?  It does not seem you've
 mentioned any details about what is going wrong, or I have just missed
 them.

 > I'll try the buildfarm next.

 [1]:
 https://www.postgresql.org/message-id/ea42654a-3dc4-98b0-335b-56b7ec5e5...@dunslane.net
>>>
>>>
>>> interestingly meson test does not produce any error
>>> The buildfarm produces the following error for me:
>>>
>>> -SELECT relname, attname, coltypes, get_columns_length(coltypes)
>>> - FROM check_columns
>>> - WHERE get_columns_length(coltypes) % 8 != 0 OR
>>> -   'name'::regtype::oid = ANY(coltypes);
>>> - relname | attname | coltypes | get_columns_length
>>> --+-+--+
>>> -(0 rows)
>>> -
>>> +server closed the connection unexpectedly
>>> + This probably means the server terminated abnormally
>>> + before or while processing the request.
>>> +connection to server was lost
>>>
>>
>> Actually digging some more, here is the actual error
>>
>> 2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:  server process (PID
>> 11204) was terminated by exception 0xC0000005
>> 2024-02-09 13:31:11.008 -05 postmaster[10672] DETAIL:  Failed process
>> was running: VACUUM;
>> 2024-02-09 13:31:11.008 -05 postmaster[10672] HINT:  See C include file
>> "ntstatus.h" for a description of the hexadecimal value.
>> 2024-02-09 13:31:11.008 -05 postmaster[10672] LOG:  terminating any
>> other active server processes
>> 2024-02-09 13:31:11.013 -05 postmaster[10672] LOG:  all server processes
>> terminated; reinitializing
>> 2024-02-09 13:31:11.034 -05 startup[6152] LOG:  database system was
>> interrupted; last known up at 2024-02-09 13:31:01 -05
>>
>>
>>
>>
>>
> So how does one debug this?
>
> Also, if I run `meson test` I don't see this error. What does the buildfarm
> do differently?
>
>
> First it does this:
>
>
> meson test -C $pgsql --no-rebuild --suite setup
>
>
> Then it does this (jflag is for the number of jobs):
>
>
> meson test -t $meson_test_timeout $jflag -C $pgsql --logbase checklog
> --print-errorlogs --no-rebuild --suite regress --test-args=--no-locale
>
>

Running the above manually produces no errors??

Dave


serial not accepted as datatype in ALTER TABLE ... ALTER COLUMN

2024-02-12 Thread Ashutosh Bapat
Hi All,
alter table t1 add column c serial;
ALTER TABLE

This works, but this does not:
#alter table t1 add column c int;
ALTER TABLE
#alter table t1 alter column c type serial;
ERROR:  type "serial" does not exist

Looking at the documentation [1], the grammar for both mentions data_type

ADD [ COLUMN ] [ IF NOT EXISTS ] column_name data_type and

ALTER [ COLUMN ] column_name [ SET DATA ] TYPE data_type

data_type is described on that page as "Data type of the new column,
or new data type for an existing column.", but the CREATE TABLE
documentation [2] redirects data_type to [3], which mentions serial.
The impression created by the documentation is that the second statement
above is valid and should not throw an error; instead it should change
the data type of the column (and create the required sequence).

In the code, ATPrepAlterColumnType() calls typenameTypeIdAndMod(), whereas
transformColumnDefinition() (called for ALTER TABLE ... ADD COLUMN and
CREATE TABLE) handles the "serial" data type separately. It looks like we are
missing a call to transformColumnDefinition() in
transformAlterTableStmt() under case AT_AlterColumnType.
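
In the meantime, the manual equivalent of what serial would have set up looks
like this (a sketch, given that t1.c is already of type int):

    CREATE SEQUENCE t1_c_seq OWNED BY t1.c;
    ALTER TABLE t1 ALTER COLUMN c SET DEFAULT nextval('t1_c_seq');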

[1] https://www.postgresql.org/docs/current/sql-altertable.html
[2] https://www.postgresql.org/docs/16/sql-createtable.html
[3] https://www.postgresql.org/docs/16/datatype.html

-- 
Best Wishes,
Ashutosh Bapat




Re: Support a wildcard in backtrace_functions

2024-02-12 Thread Jelte Fennema-Nio
On Mon, 12 Feb 2024 at 14:14, Daniel Gustafsson  wrote:
> > The main problem it currently has is that it adds backtraces to all
> > LOG level logs too. So probably we want to make backtrace_functions
> > only log backtraces for ERROR and up (or maybe WARNING/NOTICE and up),
> > or add a backtrace_functions_level GUC too control this behaviour.
>
> A wildcard should IMO only apply for ERROR (and higher) so I've hacked that up
> in the attached v2. I was thinking about WARNING as well but opted against it.

Fine by me, the patch looks good. Although I think I'd slightly prefer
having a backtrace_functions_level GUC, so that we can get this same
benefit for non-wildcard backtrace_functions and so we keep the
behaviour between the two consistent.

> I think we should keep the current functionality and instead adjust the docs.
> This has already been shipped like this, and restricting it now without a
> clear use case for doing so seems invasive (and someone might very well be
> using this). 0001 in the attached adjusts this.

Would a backtrace_functions_level GUC that defaults to ERROR be
acceptable in this case? It's a slight behaviour break, but you would be
able to get the previous behaviour very easily. And honestly, wanting
to get backtraces for non-ERROR log entries seems quite niche to me,
which to me makes it a weird default.
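
To make that concrete, what I have in mind would look like this in
postgresql.conf (hypothetical on both counts: the wildcard is what this
thread's patch adds, and backtrace_functions_level is only proposed here):

    # log a backtrace whenever an error message is emitted...
    backtrace_functions = '*'
    # ...but only for severities of ERROR and above (proposed GUC)
    backtrace_functions_level = error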

> +If an log entry is raised and the name of the internal C function 
> where

s/an log entry/a log entry




Re: Support a wildcard in backtrace_functions

2024-02-12 Thread Daniel Gustafsson
> On 20 Dec 2023, at 12:23, Jelte Fennema-Nio  wrote:

> Attached is a trivial patch that starts supporting
> backtrace_functions='*'. By setting that in postgresql.conf for my dev
> environment it starts logging backtraces always.

I happened to implement pretty much the same diff today during a debugging
session, and then stumbled across this when searching the archives, so count me
in for +1 on the concept.

> The main problem it currently has is that it adds backtraces to all
> LOG level logs too. So probably we want to make backtrace_functions
> only log backtraces for ERROR and up (or maybe WARNING/NOTICE and up),
> or add a backtrace_functions_level GUC too control this behaviour.

A wildcard should IMO only apply for ERROR (and higher) so I've hacked that up
in the attached v2. I was thinking about WARNING as well but opted against it.

> The docs of backtrace_functions currently heavily suggest that it should
> only be logging backtraces for errors, so either we actually start
> doing that or we should clarify the docs

I think we should keep the current functionality and instead adjust the docs.
This has already been shipped like this, and restricting it now without a clear
use case for doing so seems invasive (and someone might very well be using
this). 0001 in the attached adjusts this.

--
Daniel Gustafsson



v2-0002-Support-wildcard-in-backtrace_functions-to-handle.patch


v2-0001-doc-Clarify-when-backtrace_functions-is-invoked.patch


Re: [PATCH] Add sortsupport for range types and btree_gist

2024-02-12 Thread jian he
On Fri, Feb 9, 2024 at 2:14 AM Bernd Helmle  wrote:
>
> Am Mittwoch, dem 10.01.2024 um 22:18 +0800 schrieb jian he:
> >
> > I split the original author's patch into 2.
> > 1. Add GiST sortsupport function for all the btree-gist module data
> > types except the anyrange data type (which is actually not in this
> > module)
> > 2. Add GiST sortsupport function for anyrange data type.
> >
>
> > what I am confused:
> > In fmgr.h
> >
> > /*
> >  * Support for cleaning up detoasted copies of inputs.  This must
> > only
> >  * be used for pass-by-ref datatypes, and normally would only be used
> >  * for toastable types.  If the given pointer is different from the
> >  * original argument, assume it's a palloc'd detoasted copy, and
> > pfree it.
> >  * NOTE: most functions on toastable types do not have to worry about
> > this,
> >  * but we currently require that support functions for indexes not
> > leak
> >  * memory.
> >  */
> > #define PG_FREE_IF_COPY(ptr,n) \
> > do { \
> > if ((Pointer) (ptr) != PG_GETARG_POINTER(n)) \
> > pfree(ptr); \
> > } while (0)
> >
> > but the doc
> > (https://www.postgresql.org/docs/current/gist-extensibility.html)
> >  says:
> > All the GiST support methods are normally called in short-lived
> > memory
> > contexts; that is, CurrentMemoryContext will get reset after each
> > tuple is processed. It is therefore not very important to worry about
> > pfree'ing everything you palloc.
> > ENDOF_QUOTE
> >
> > so I am not sure in range_gist_cmp, we need the following:
> > `
> > if ((Datum) range_a != a)
> > pfree(range_a);
> > if ((Datum) range_b != b)
> > pfree(range_b);
> > `
>
> Turns out this is not true for sortsupport: the comparison function is
> called for each tuple during sorting, which will leak the detoasted
> (and probably copied) datum in the sort memory context. See the same
> for e.g. numeric and text, which I needed to change to pass the key
> values correctly to the text_cmp() or numeric_cmp() function in these
> cases.
>
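
In other words, a sortsupport comparator has to clean up after itself,
roughly like this (a sketch with hypothetical names and deliberately
simplified comparison logic):

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/sortsupport.h"

    static int
    my_varlena_ssup_cmp(Datum a, Datum b, SortSupport ssup)
    {
        struct varlena *va = PG_DETOAST_DATUM(a);
        struct varlena *vb = PG_DETOAST_DATUM(b);
        Size        lena = VARSIZE_ANY_EXHDR(va);
        Size        lenb = VARSIZE_ANY_EXHDR(vb);
        int         cmp;

        cmp = memcmp(VARDATA_ANY(va), VARDATA_ANY(vb), Min(lena, lenb));
        if (cmp == 0 && lena != lenb)
            cmp = (lena < lenb) ? -1 : 1;

        /*
         * Manual PG_FREE_IF_COPY: this runs once per tuple in the sort's
         * long-lived memory context, so detoasted copies must be freed.
         */
        if ((Pointer) va != DatumGetPointer(a))
            pfree(va);
        if ((Pointer) vb != DatumGetPointer(b))
            pfree(vb);
        return cmp;
    }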

+ 
+  Per default btree_gist builts
GiST indexe with
+  sortsupport in sorted
mode. This usually results in a
+  much better index quality and smaller index sizes by much faster
index built speed. It is still
+  possible to revert to buffered built strategy by using the
buffering parameter
+  when creating the index.
+ 
+
I believe `built`/`builts` should be `build`.
Also, maybe we can simply copy some text from
https://www.postgresql.org/docs/current/gist-implementation.html.
How about the following:
  
   The sorted method is only available if each of the opclasses used by the
   index provides a sortsupport function, as described
   in .  If they do, this method is
   usually the best, so it is used by default.
  It is still possible to change to a buffered build strategy by using
the buffering parameter
  to the CREATE INDEX command.
  

You've changed contrib/btree_gist/meson.build; it seems we also need to
change contrib/btree_gist/Makefile.

gist_point_sortsupport has `if (ssup->abbreviate)`; does
range_gist_sortsupport also need this part?
I think the `if (ssup->abbreviate)` part is optional.
Can we add some comments on it?




Re: clarify equalTupleDescs()

2024-02-12 Thread Peter Eisentraut

On 06.02.24 16:14, Tom Lane wrote:

Peter Eisentraut  writes:

The first want to compare what I call row-type equality, which means
they want to check specifically for equal number of attributes, and the
same attribute names, types, and typmods for each attribute.  Most
callers actually want that behavior.


Should compare attcollation too, no?

+1 for the general idea, but it seems like "row type equality"
might still be a slightly fuzzy concept.


I did another pass across the callers to check what pg_attribute fields 
might be relevant.


Collation definitely needs to be added, certainly for plancache.c, maybe 
for typcache.c, the other callers don't care.


Record types can have attisdropped fields, so it's probably good to 
check those.


I'm suspicious about attndims.  Maybe one could create a test case where 
record types differ only in that.  Support for attndims throughout the 
system is weak, but maybe there is something to check there.


On a conceptual level, I figured pg_attribute rows can be divided up 
into three categories:


1. "row type" stuff: attname, atttypid, atttypmod, attndims, 
attisdropped, attcollation


2. physical layout stuff: attlen, attcacheoff, attbyval, attalign

3. table metadata stuff (everything else)

It's not perfect, and sometimes it's not clear whether these categories 
inform the implementation or the other way around, but I think it helps 
conceptualize it.
From 7798fb0c9b57e94f8edc9c8c808395097e541954 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 12 Feb 2024 12:42:11 +0100
Subject: [PATCH v1] Separate equalRowTypes() from equalTupleDescs()

This introduces a new function equalRowTypes() that is effectively a
subset of equalTupleDescs() but only compares the number of attributes
and attribute name, type, and typmod.  This is enough for most
existing uses of equalTupleDescs(), which are changed to use the new
function.  The only remaining callers of equalTupleDescs() are those
that really want to check the full tuple descriptor as such, without
concern about record or row or record type semantics.

The existing function hashTupleDesc() is renamed to hashRowType(),
because it now corresponds more to equalRowTypes().

The purpose of this change is to be clearer about the semantics of
equality asked for by each caller.  (At least one caller had a comment
that questioned whether equalTupleDescs() was too restrictive.)  For
example, 4f622503d6d removed attstattarget from the tuple descriptor
structure.  It was not fully clear at the time how this should affect
equalTupleDescs().  Now the answer is clear: By their own definitions,
equalRowTypes() does not care, and equalTupleDescs() just compares
whatever is in the tuple descriptor but does not care why it is in
there.

Discussion: 
https://www.postgresql.org/message-id/flat/f656d6d9-6660-4518-a006-2f65cafbebd1%40eisentraut.org
---
 src/backend/access/common/tupdesc.c | 71 -
 src/backend/catalog/pg_proc.c   |  2 +-
 src/backend/commands/analyze.c  |  4 +-
 src/backend/commands/view.c | 13 +++---
 src/backend/utils/cache/plancache.c |  5 +-
 src/backend/utils/cache/typcache.c  | 10 ++--
 src/include/access/tupdesc.h|  4 +-
 7 files changed, 77 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c 
b/src/backend/access/common/tupdesc.c
index bc80caee117..64761f050f5 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -414,11 +414,6 @@ DecrTupleDescRefCount(TupleDesc tupdesc)
 
 /*
  * Compare two TupleDesc structures for logical equality
- *
- * Note: we deliberately do not check the attrelid and tdtypmod fields.
- * This allows typcache.c to use this routine to see if a cached record type
- * matches a requested type, and is harmless for relcache.c's uses.
- * We don't compare tdrefcount, either.
  */
 bool
 equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
@@ -431,6 +426,8 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
if (tupdesc1->tdtypeid != tupdesc2->tdtypeid)
return false;
 
+   /* tdtypmod and tdrefcount are not checked */
+
for (i = 0; i < tupdesc1->natts; i++)
{
Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
@@ -561,17 +558,67 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
 }
 
 /*
- * hashTupleDesc
- * Compute a hash value for a tuple descriptor.
+ * equalRowTypes
  *
- * If two tuple descriptors would be considered equal by equalTupleDescs()
- * then their hash value will be equal according to this function.
+ * This determines whether two tuple descriptors have equal row types.  This
+ * only checks those fields in pg_attribute that are applicable for row types,
+ * while ignoring those fields that define the physical row storage or those
+ * that define table column metadata.
+ *
+ * Specifically, this checks:
+ *
+ * - same number of attributes
+ * - same composite 

Re: Add connection active, idle time to pg_stat_activity

2024-02-12 Thread Andrei Zubkov
Hi Sergei,

> I still would like to maintain the focus on committing the "idle in
> transactions" part of pg_stat_session first.

Agreed.

I've done a review of version 0004. This version applies successfully
over ce571434ae7, and installcheck passes. The behavior of the pg_stat_session
view and the corresponding function looks correct. I didn't find any
issues in the code.

Notes about the current state of a patch:

Naming
The view and function name 'pg_stat_session' seems correct for this
particular scope of the patch. However, possible future
resource-consumption statistics would be valid for all backends (vacuum,
for example). Right now it is not clear to me whether we can get resource
statistics from those backends while they are listed in the
pg_stat_activity view, but renaming to something like 'pg_stat_backend'
seems reasonable to me.

Docs
1. Session states referenced in monitoring.sgml are not uniform with
those of the pg_stat_activity view.
monitoring.sgml:4635
monitoring.sgml:4644
+   Time in milliseconds this backend spent in the
running or fastpath state.
I think those states should be referenced uniformly with
pg_stat_activity.

2. Description of the 'pg_stat_get_session()' function might be as
follows:

  Returns a row, showing statistics about the client backend with the
  specified process ID, or one row per client backend if
  NULL is specified. The fields returned are the
  same as those of pg_stat_session view.

The main thought here is to get rid of 'each active backend' because
'active backend' looks like a backend in the 'active' state.

Tests
Calling a non-existent function depends on the non-existence of that
function, which can't be guaranteed absolutely. How about raising some
kind of obvious error here instead? A couple of examples follow:

SELECT 0/0;

- or -

DO $$
BEGIN
RAISE 'test error';
END;
$$ LANGUAGE plpgsql;

My personal choice would be the last one.

-- 
regards, Andrei Zubkov
Postgres Professional





Re[2]: [PATCH] allow pg_current_logfile() execution under pg_monitor role

2024-02-12 Thread Pavlo Golub

 Are there any other
functions that pg_monitor ought to have privileges for?


Not that I'm aware of at the moment. This one was found by chance.

Kind regards,
Pavlo Golub

Re: clarify equalTupleDescs()

2024-02-12 Thread Peter Eisentraut

On 07.02.24 04:06, jian he wrote:

/*
  * hashRowType
  *
  * If two tuple descriptors would be considered equal by equalRowTypes()
  * then their hash value will be equal according to this function.
  */
uint32
hashRowType(TupleDesc desc)
{
uint32 s;
int i;

s = hash_combine(0, hash_uint32(desc->natts));
s = hash_combine(s, hash_uint32(desc->tdtypeid));
for (i = 0; i < desc->natts; ++i)
s = hash_combine(s, hash_uint32(TupleDescAttr(desc, i)->atttypid));

return s;
}

From the hashRowType comment, should we also hash attname and atttypmod?


In principle, hashRowType() could process all the fields that 
equalRowTypes() does.  But since it's only a hash function, it doesn't 
have to be perfect.  (This is also the case for the current 
hashTupleDesc().)  I'm not sure where the best tradeoff is.
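
If we did decide to make it tighter, the extension would be mechanical, along
these lines (a sketch; hash_bytes and hash_combine are the existing helpers
from hashfn.h):

    uint32
    hashRowType(TupleDesc desc)
    {
        uint32      s;
        int         i;

        s = hash_combine(0, hash_uint32(desc->natts));
        s = hash_combine(s, hash_uint32(desc->tdtypeid));
        for (i = 0; i < desc->natts; ++i)
        {
            Form_pg_attribute attr = TupleDescAttr(desc, i);

            s = hash_combine(s, hash_uint32(attr->atttypid));
            s = hash_combine(s, hash_uint32((uint32) attr->atttypmod));
            s = hash_combine(s, hash_bytes((const unsigned char *) NameStr(attr->attname),
                                           strlen(NameStr(attr->attname))));
        }

        return s;
    }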





Re: Add publisher and subscriber to glossary documentation.

2024-02-12 Thread Alvaro Herrera
Hello

On 2024-Feb-12, Shlok Kyal wrote:

> There are several places where the terms publisher and subscriber are used
> across the documentation, but they were missing from the
> glossary. I felt they should be added there.

I agree, but let's wordsmith those definitions first.

Should these be "publisher node" and "subscriber node" instead?  Do we
want to define the term "node"?  I think in everyday conversations we
use "node" quite a lot, so maybe we do need an entry for it.  (Maybe
just a cross-reference suffices, plus adding under instance
"also called a node".)

+   Publisher
+   
+
+  A node where publication is defined.
+  It replicates the publication changes to the subscriber node.

Apart from deciding what to do with "node", what are "changes"?  This
doesn't seem very specific.

+   Subscriber
+   
+
+ A node where subscription is defined.
+ It subscribe to one or more publications on a publisher node and pull the 
data
+ from the publications they subscribe to.

Same issues as above, plus there are some grammar issues.


I think these definitions should use the term "logical replication",
which we don't currently define.  We do have "replication" where we
provide an overview of "logical replication".  Maybe that's enough, but
we should consider whether we want a separate definition of logical
replication (I'm leaning towards not having one, but it's worth asking.)

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!"
http://smylers.hates-software.com/2007/08/15/fe244d0c.html




Re: cpluspluscheck complains about use of register

2024-02-12 Thread Christoph Berg
Re: Tom Lane
> > I hit this again while porting cplupluscheck to be invoked by meson as
> > well. ISTM that we should just remove the uses of register.
> 
> OK by me.
> 
> > I tried to use -Wregister to keep us honest going forward, but unfortunately
> > it only works with a C++ compiler...
> 
> I think we only really care about stuff that cpluspluscheck would spot,
> so I don't feel a need to mess with the standard compilation flags.

This has started to hurt: postgresql-debversion (a Debian version number
data type written in C++) failed to build against PostgreSQL <= 15 on 
Ubuntu's next LTS release (24.04):

In file included from /usr/include/postgresql/15/server/port/atomics.h:70:
/usr/include/postgresql/15/server/port/atomics/arch-x86.h:143:2: error: ISO 
C++17 does not allow 'register' storage class specifier [-Wregister]
  143 | register char _res = 1;

I managed to work around it by putting `#define register` before
including the PG headers.
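
Concretely, the workaround looks like this in the extension's translation
units (a sketch; defining away a keyword is formally dubious, but it works
with the compilers in question):

    /* neutralize 'register' in the pre-16 server headers */
    #define register
    extern "C"
    {
    #include <postgres.h>
    #include <port/atomics.h>
    }
    #undef register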

Should the removal of "register" be backported to support that better?

Christoph




Add publisher and subscriber to glossary documentation.

2024-02-12 Thread Shlok Kyal
Hi,

There are several places where the terms publisher and subscriber are used
across the documentation, but they were missing from the
glossary. I felt they should be added there.
I have created a patch for the same.

Thanks and Regards
Shlok Kyal


v1-0001-Add-publisher-and-subscriber-to-glossary-document.patch

