Re: Add support to TLS 1.3 cipher suites and curves lists

2024-06-07 Thread Peter Eisentraut

On 07.06.24 08:10, Erica Zhang wrote:
I'm a Postgres user and I'm looking into restricting the set of allowed 
ciphers on Postgres and configuring a concrete set of curves on our 
Postgres instances.


Out of curiosity, why is this needed in practice?

Could you please help review this, to see if you are interested in having 
this change in the upcoming Postgres major release? (It should be PG17.)


It would be targeting PG18 now.





Re: meson: Specify -Wformat as a common warning flag for extensions

2024-06-07 Thread Peter Eisentraut

On 29.05.24 08:47, Sutou Kouhei wrote:

In <4707d4ed-f268-43c0-b4dd-cdbc7520f...@eisentraut.org>
   "Re: meson: Specify -Wformat as a common warning flag for extensions" on 
Tue, 28 May 2024 23:31:05 -0700,
   Peter Eisentraut  wrote:


On 07.04.24 18:01, Sutou Kouhei wrote:

+# We don't have "warning_level == 3" and "warning_level ==
+# 'everything'" here because we don't use these warning levels.
+if warning_level == '1'
+  common_builtin_flags += ['-Wall']
+elif warning_level == '2'
+  common_builtin_flags += ['-Wall', '-Wextra']
+endif


I would trim this even further and always export just '-Wall'.  The
other options aren't really something we support.


OK. How about the v6 patch? It always uses '-Wall'.


I have committed this.  Thanks.





Re: small fix for llvm build

2024-06-06 Thread Peter Eisentraut

On 28.05.24 17:17, Peter Eisentraut wrote:
I'm getting build failures when building with meson and llvm enabled, 
like this:


[1/112] Generating src/backend/jit/llvm/llvmjit_types.bc with a custom 
command

FAILED: src/backend/jit/llvm/llvmjit_types.bc
/usr/local/bin/ccache /usr/local/Cellar/llvm/18.1.6/bin/clang -c -o 
src/backend/jit/llvm/llvmjit_types.bc 
../src/backend/jit/llvm/llvmjit_types.c -flto=thin -emit-llvm -MD -MQ 
src/backend/jit/llvm/llvmjit_types.bc -MF 
src/backend/jit/llvm/llvmjit_types.c.bc.d -O2 -Wno-ignored-attributes 
-Wno-empty-body -fno-strict-aliasing -fwrapv -I./src/include 
-I./src/backend/utils/misc -I../src/include

In file included from ../src/backend/jit/llvm/llvmjit_types.c:27:
In file included from ../src/include/postgres.h:45:
../src/include/c.h:75:10: fatal error: 'libintl.h' file not found
    75 | #include <libintl.h>
       |          ^~~~~~~~~~~
1 error generated.


The reason is that libintl.h is at /usr/local/include/libintl.h, but 
that is not in the include path for this command.  I have 
-I/usr/local/include in CPPFLAGS in the environment, which is why the 
normal compilation commands pick it up, but this is not used by this 
custom command.


With this small patch I can make it work:

diff --git a/src/backend/jit/llvm/meson.build 
b/src/backend/jit/llvm/meson.build

index 41c759f73c5..4a4232661ba 100644
--- a/src/backend/jit/llvm/meson.build
+++ b/src/backend/jit/llvm/meson.build
@@ -63,6 +63,7 @@ bitcode_cflags = ['-fno-strict-aliasing', '-fwrapv']
  if llvm.version().version_compare('>=15.0')
    bitcode_cflags += ['-Xclang', '-no-opaque-pointers']
  endif
+bitcode_cflags += get_option('c_args')
  bitcode_cflags += cppflags

  # XXX: Worth improving on the logic to find directories here


I have committed this change.





report on not thread-safe functions

2024-06-06 Thread Peter Eisentraut

In the context of the multithreaded-server project, I looked into
potentially not thread-safe functions.

(See proposed next steps at the end of this message.)

Here is a list of functions in POSIX that are possibly not thread-safe:

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_01

I checked those against the PostgreSQL server source code (backend +
common + timezone), and found that the following from those are in
use:

- dlerror()
- getenv()
- getgrnam()
- getopt()
- getpwuid()
- localeconv()
- localtime()
- nl_langinfo()
- readdir()
- setenv()
- setlocale()
- strerror()
- strsignal()
- strtok()
- system()
- unsetenv()

Additionally, there are non-standard functions that are not
thread-safe, such as getopt_long().

Also, there are replacement functions such as pg_gmtime() and
pg_localtime() that mirror standard thread-unsafe functions and that
appear to be equally unsafe.  (Note to those looking into annotating
global variables: You also need to check static local variables.)

Conversely, some of the above might actually be thread-safe in
some/many/all implementations.  For example, strerror() and system()
are thread-safe in glibc.  So you might actually get a multithreaded
server running in that environment with fewer source code changes but
have it fail in others.  Just something to keep in mind.

I also tried the concurrency-mt-unsafe check from clang-tidy
(https://clang.llvm.org/extra/clang-tidy/checks/concurrency/mt-unsafe.html). 
 Run it for example like this:


clang-tidy -p build --quiet --checks='-*,concurrency-mt-unsafe' 
src/backend/main/*.c


(needs a compilation database in the build directory)

(You can't just run it like src/backend/**/*.c because some .c files
don't compile standalone, and then the whole thing aborts with too
many errors.  Maybe with a judicious exclusion list, this can be
achieved.  However, it's also not good dealing with compilation
options like FRONTEND.  So it can't easily serve as an automated
checker, but it's okay as a manual exploration tool.)

In addition to the POSIX list above, this also flagged:

- exit()
- sigprocmask()

Allegedly, you can toggle it between posix and glibc modes, but I
haven't succeeded with that.  So for example, it does not actually
flag strerror() out of the box, presumably because that is not in its
glibc list.


Now some more detailed comments on these functions:

- dlerror()

dlerror() gets the error from the last dlopen() call, which is
obviously not thread-safe.  This might require some deeper
investigation of the whole dfmgr.c mechanism.  (Which might be
appropriate in any case, since in multithreaded environments, you
don't need to load a library into each session separately.)

- exit()

Most of the exit() calls happen where there are not multiple threads
active.  But some emergency exit calls like in elog.c might more
correctly use _exit()?

- getenv()
- setenv()
- unsetenv()

getenv() is unsafe if there are concurrent setenv() or unsetenv()
calls.  We should try to move all those to early in the program
startup.  This seems doable.  Some calls are related to locale stuff,
which is a separate subproject to clean up.  There are some calls to
setenv("KRB5*"), which would need to be fixed.  The source code
comments nearby already contain ideas how to.

- getgrnam()
- getpwuid()
- localtime()

These have _r replacements.

- getopt()

This needs a custom replacement.  (There is no getopt_r() because
programs usually don't call getopt() after startup.)

(Note: This is also called during session startup, not only during
initial postmaster start.  So we definitely need something here, if we
want to, like, start more than one session concurrently.)

- localeconv()
- nl_langinfo()
- setlocale()

The locale business needs to be reworked to use locale_t and _l
functions.  This is already being discussed for other reasons.

- readdir()

This is listed as possibly thread-unsafe, but I think it is
thread-safe in practice.  You just can't work on the same DIR handle
from multiple threads.  There is a readdir_r(), but that's already
deprecated.  I think we can ignore this one.

- sigprocmask()

It looks like this is safe in practice.  Also, there is
pthread_sigmask() if needed.

- strerror()

Use strerror_r().  There are very few calls of this, actually, since
most potential users use printf %m.

- strsignal()

Use strsignal_r().  These calls are already wrapped in pg_strsignal()
for Windows portability, so it's easy to change.

But this also led me to think that it is potentially dangerous to have
different standards of thread-safety across the tree.  pg_strsignal()
is used by wait_result_to_str() which is used by pclose_check()
... and at that point you have completely lost track of what you are
dealing with underneath.  So if someone were to put, say,
pclose_check() into pgbench, it could be broken.

- strtok()

Use strtok_r() or maybe even strsep() (but there are small semantic
differences with the 

Re: Logical Replication of sequences

2024-06-05 Thread Peter Eisentraut

On 04.06.24 12:57, Amit Kapila wrote:

2. Provide a command say Alter Subscription ...  Replicate Sequences
(or something like that) which users can perform before shutdown of
the publisher node during upgrade. This will allow copying all the
sequences from the publisher node to the subscriber node directly.
Similar to previous approach, this could also be inconvenient for
users.


I would start with this.  In any case, you're going to need to write 
code to collect all the sequence values, send them over some protocol, 
apply them on the subscriber.  The easiest way to start is to trigger 
that manually.  Then later you can add other ways to trigger it, either 
by timer or around shutdown, or whatever other ideas there might be.





Re: Build with LTO / -flto on macOS

2024-06-05 Thread Peter Eisentraut

On 04.06.24 18:41, Tom Lane wrote:

Relevant to this: I wonder what we think the supported macOS versions
are, anyway.  AFAICS, the buildfarm only covers current (Sonoma)
and current-1 (Ventura) major versions, and only the latest minor
versions in those OS branches.


For other OS lines I think we are settling on supporting what the OS 
vendor supports.  So for macOS at the moment this would be current, 
current-1, and current-2, per 
.






Re: Volatile write caches on macOS and Windows, redux

2024-06-05 Thread Peter Eisentraut

On 03.06.24 17:28, Nathan Bossart wrote:

I agree, two states should be enough.  It could basically just be

pg_fsync(int fd)
{
#if macos
 fcntl(fd, F_FULLFSYNC);
#else
 fsync(fd);
#endif
}

IIUC with this approach, anyone who is using a file system that fails
fcntl(F_FULLFSYNC) with ENOTSUP would have to turn fsync off.  That might be
the right thing to do since having a third option that sends the data to
the disk cache but doesn't provide any real guarantees if you lose power
may not be worth much.  However, if such a file system _did_ provide such
guarantees with just fsync(), then it would be unfortunate to force people
to turn fsync off.  But this could very well all be hypothetical, for all I
know...  In any case, I agree that we should probably use F_FULLFSYNC by
default on macOS.


Yeah, my example code above says "#if macos", not "#ifdef F_FULLFSYNC". 
The latter might be a problem along the lines you describe if other 
systems use that symbol in a slightly different manner.






Re: meson "experimental"?

2024-06-04 Thread Peter Eisentraut

On 04.06.24 17:24, Robert Haas wrote:

On Wed, May 29, 2024 at 2:41 AM Peter Eisentraut  wrote:

"Alternatively, PostgreSQL can be built using Meson.  This is the only
option for building PostgreSQL in Windows using Visual Something[*].
For other platforms, using Meson is currently experimental."


Is it, though? I feel like we're beyond the experimental stage now.


Experimental is probably too harsh a word now.

But it doesn't have feature parity with configure/make yet (for example, 
no .bc files), so I wouldn't recommend it for production or packaging 
builds.


Then, there are issues like [0].  If it's experimental, then this is 
like, meh, we'll fix it later.  If not, then it's a bug.


More generally, I don't think we've really done a comprehensive check of 
how popular extensions build against pgxs-from-meson.  Packagers that 
make their production builds using meson now might be signing up for a 
long tail of the unknown.



[0]: 
https://www.postgresql.org/message-id/49e97fd0-c17e-4cbc-aeee-80ac51400...@eisentraut.org





Re: Build with LTO / -flto on macOS

2024-06-04 Thread Peter Eisentraut

On 03.06.24 17:07, Wolfgang Walther wrote:
I don't mind addressing this in PG18, but I would hesitate with 
backpatching.  With macOS, it's always hard to figure out whether 
these kinds of options work the same way going versions back.


All the versions for ld64 are in [1]. It seems this was introduced in 
ld64-224.1 [2] the first time. It was not there in ld64-136 [3]. Finally 
the man page has **exactly** the same wording in the latest version 
ld64-609 [4].


We could go further and compare the source, but I think it's safe to 
assume that this flag hasn't changed much and should not affect non-LTO 
builds. And for even older versions it would just not be supported, so 
configure would not use it.


With the native compiler tooling on macOS, it is not safe to assume 
anything, including that the man pages are accurate or that the 
documented options actually work correctly and don't break anything 
else.  Unless we have actual testing on all the supported macOS 
versions, I don't believe it.


Given that LTO apparently never worked on macOS, this is not a 
regression, so I wouldn't backpatch it.  I'm not objecting, but I don't 
want to touch it.






Re: Proposal: Document ABI Compatibility

2024-06-04 Thread Peter Eisentraut

On 04.06.24 02:11, Laurenz Albe wrote:

On Mon, 2024-06-03 at 15:38 -0400, Tom Lane wrote:

Andres Freund  writes:

On 2024-06-03 14:43:17 -0400, David E. Wheeler wrote:

* The ABI is guaranteed to change only in backward compatible ways in minor
releases. If for some reason it doesn’t it’s a bug that will need to be
fixed.



Thus I am not really on board with this statement as-is.


Me either.  There are degrees of ABI compatibility, and we'll choose
the least invasive way, but it's seldom the case that no conceivable
extension will be broken.


oracle_fdw has been broken by minor releases several times in the past.
This may well be because of weird things that I am doing; still, my
experience is that minor releases are not always binary compatible.


It'd be interesting to see a few examples of actual minor-version-upgrade
extension breakages, so we can judge what caused them.


Yes, that could be a fruitful discussion.


Digging through my commits brought up 6214e2b2280462cbc3aa1986e350e167651b3905,
for one.


I'm not sure I can see how that would have broken oracle_fdw, but in any 
case it's an interesting example.  This patch did not change any structs 
incompatibly, but it changed the semantics of a function without 
changing the name:


 extern void InitResultRelInfo(ResultRelInfo *resultRelInfo,
  Relation resultRelationDesc,
  Index resultRelationIndex,
- Relation partition_root,
+ ResultRelInfo *partition_root_rri,
  int instrument_options);

If an extension calls this function, something would possibly crash if 
it's on the wrong side of the update fence.


This could possibly be avoided by renaming the symbol in backbranches. 
Maybe something like


#define InitResultRelInfo InitResultRelInfo2

Then you'd get a specific error message when loading the module, rather 
than a crash.


This might be something to consider:

no ABI break is better than an explicit ABI break is better than a 
silent ABI break


(Although this is actually an API break, isn't it?)





Re: meson: Specify -Wformat as a common warning flag for extensions

2024-06-04 Thread Peter Eisentraut

On 29.05.24 08:47, Sutou Kouhei wrote:

In <4707d4ed-f268-43c0-b4dd-cdbc7520f...@eisentraut.org>
   "Re: meson: Specify -Wformat as a common warning flag for extensions" on 
Tue, 28 May 2024 23:31:05 -0700,
   Peter Eisentraut  wrote:


On 07.04.24 18:01, Sutou Kouhei wrote:

+# We don't have "warning_level == 3" and "warning_level ==
+# 'everything'" here because we don't use these warning levels.
+if warning_level == '1'
+  common_builtin_flags += ['-Wall']
+elif warning_level == '2'
+  common_builtin_flags += ['-Wall', '-Wextra']
+endif


I would trim this even further and always export just '-Wall'.  The
other options aren't really something we support.


OK. How about the v6 patch? It always uses '-Wall'.


Yes, this looks good to me.

All: I think we should backpatch this.  Otherwise, meson-based installs 
will get suboptimal behavior for extension builds via pgxs.






Re: Build with LTO / -flto on macOS

2024-06-03 Thread Peter Eisentraut

On 03.06.24 16:22, Wolfgang Walther wrote:
Building with clang and -flto on macOS currently fails with errors 
similar to [1]. This is because the --export-dynamic flag is called 
-export_dynamic [2] instead and we have not been passing this variant to 
the linker, so far.


It's probably worth clarifying that this option is needed on macOS only 
if LTO is also enabled.  For standard (non-LTO) builds, the 
export-dynamic behavior is already the default on macOS (otherwise 
nothing in PostgreSQL would work).


I don't think we explicitly offer LTO builds as part of the make build 
system, so anyone trying this would do it sort of self-service, by 
passing additional options to configure or make.  In which case they 
might as well pass the -export_dynamic option along in the same way?


I don't mind addressing this in PG18, but I would hesitate with 
backpatching.  With macOS, it's always hard to figure out whether these 
kinds of options work the same way going versions back.






Re: pgsql: Add more SQL/JSON constructor functions

2024-06-03 Thread Peter Eisentraut

On 02.06.24 21:46, Tom Lane wrote:

If you don't
like our current behavior, then either you have to say that RETURNING
with a length-limited target type is illegal (which is problematic
for the spec, since they have no such type) or that the cast behaves
like an implicit cast, with errors for overlength input (which I find
to be an unintuitive definition for a construct that names the target
type explicitly).


It asks for the latter behavior, essentially (but it's not defined in 
terms of casts).  It says:


"""
ii) Let JV be an implementation-dependent (UV097) value of type TT and 
encoding ENC such that these two conditions hold:


1) JV is a JSON text.

2) When the General Rules of Subclause 9.42, “Parsing JSON text”, are 
applied with JV as JSON TEXT, FO as FORMAT OPTION, and WITHOUT UNIQUE 
KEYS as UNIQUENESS CONSTRAINT; let CST be the STATUS and let CSJI be the 
SQL/JSON ITEM returned from the application of those General Rules, CST 
is successful completion (0) and CSJI is an SQL/JSON item that is 
equivalent to SJI.


If there is no such JV, then let ST be the exception condition: data 
exception — invalid JSON text (22032).


iii) If JV is longer than the length or maximum length of TT, then an 
exception condition is raised: data exception — string data, right 
truncation (22001).

"""

Oracle also behaves accordingly:

SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(20)) from 
dual;


JSON_SERIALIZE('{"A"
--------------------
{"a":1,"a":2}

SQL> select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from 
dual;

select json_serialize('{"a":1, "a":2}' returning varchar2(5)) from dual
   *
ERROR at line 1:
ORA-40478: output value too large (maximum: 5)
JZN-00018: Input to serializer is too large
Help: https://docs.oracle.com/error-help/db/ora-40478/


As opposed to:

SQL> select cast(json_serialize('{"a":1, "a":2}') as varchar2(5)) from dual;

CAST(
-
{"a":





Re: Schema variables - new implementation for Postgres 15

2024-06-02 Thread Peter Eisentraut

On 25.05.24 12:50, Pavel Stehule wrote:
It looks odd - It is not intuitive, it introduces new inconsistency 
inside Postgres, or with solutions in other databases. No other database 
has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird 
will be confused. Users that use PL/pgSQL will be confused.


Do you have a description of what those other systems do?  Maybe you 
posted it already earlier?






Re: Switch background worker on/off in runtime.

2024-06-02 Thread Peter Eisentraut

On 31.05.24 11:28, ISHAN CHHANGANI . wrote:

Is it possible to switch on/off a background worker in runtime?

worker.bgw_flags = BGWORKER_SHMEM_ACCESS;

worker.bgw_start_time = BgWorkerStart_PostmasterStart;

I want to switch off the worker based on some flag value, etc, either 
from the main process or the worker itself.


Are there any already existing examples?


I think this depends more on your exact use case.  For example, the 
logical replication background workers have sophisticated logic to do 
this.  There is a launcher, which is itself a background worker, which 
launches other per-subscription workers.  And there are commands to 
disable subscriptions, which would among other things stop their 
corresponding background workers.  That logic is specific to the needs 
of the logical replication system.  You might get some ideas from that. 
But in general it will need a bit of custom logic.






Re: pgsql: Add more SQL/JSON constructor functions

2024-06-02 Thread Peter Eisentraut

On 29.05.24 18:44, Tom Lane wrote:

Amit Langote  writes:

On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera  wrote:

On 2024-May-27, Alvaro Herrera wrote:
I just noticed this behavior, which looks like a bug to me:

select json_serialize('{"a":1, "a":2}' returning varchar(5));
json_serialize

{"a":

I think this function should throw an error if the destination type
doesn't have room for the output json.  Otherwise, what good is the
serialization function?



This behavior comes from using COERCE_EXPLICIT_CAST when creating the
coercion expression to convert json_*() functions' argument to the
RETURNING type.


Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit.  I see
little reason that this should work differently from

select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
  json_serialize

  {"a":
(1 row)


The SQL standard says essentially that the output of json_serialize() is 
some string that when parsed back in gives you an equivalent JSON value 
as the input.  That doesn't seem compatible with truncating the output.


If you want output truncation, you can of course use an actual cast. 
But it makes sense that the RETURNING clause is separate from that.






Re: Volatile write caches on macOS and Windows, redux

2024-05-29 Thread Peter Eisentraut

On 25.05.24 04:01, Jelte Fennema-Nio wrote:

Is this the only reason why you're suggesting adding fsync=full,
instead of simply always setting F_FULLFSYNC when fsync=true on MacOS.
If so, I'm not sure we really gain anything by this tri-state. I think
people either care about data loss on power loss, or they don't. I
doubt many people want his third intermediate option, which afaict
basically means lose data on powerloss less often than fsync=false but
still lose data most of the time.


I agree, two states should be enough.  It could basically just be

pg_fsync(int fd)
{
#if macos
fcntl(fd, F_FULLFSYNC);
#else
fsync(fd);
#endif
}





meson "experimental"?

2024-05-29 Thread Peter Eisentraut

In installation.sgml it says

"""
Alternatively, PostgreSQL can be built using Meson.  This is currently 
experimental.

"""

Do we want to alter this statement for PG17, considering that this is 
now the only way to build for Windows using MSVC?


(A joke response is that the Windows port itself is experimental, so it 
doesn't matter much that the build system for it is as well.)


I would still call Meson use on non-Windows platforms experimental at 
this time.


So maybe something like

"Alternatively, PostgreSQL can be built using Meson.  This is the only 
option for building PostgreSQL in Windows using Visual Something[*]. 
For other platforms, using Meson is currently experimental."


[*] What is the correct name for this?




Re: meson: Specify -Wformat as a common warning flag for extensions

2024-05-29 Thread Peter Eisentraut

On 07.04.24 18:01, Sutou Kouhei wrote:

+# We don't have "warning_level == 3" and "warning_level ==
+# 'everything'" here because we don't use these warning levels.
+if warning_level == '1'
+  common_builtin_flags += ['-Wall']
+elif warning_level == '2'
+  common_builtin_flags += ['-Wall', '-Wextra']
+endif


I would trim this even further and always export just '-Wall'.  The 
other options aren't really something we support.


The other stanzas, on '-g' and '-O*', look good to me.





small fix for llvm build

2024-05-28 Thread Peter Eisentraut
I'm getting build failures when building with meson and llvm enabled, 
like this:


[1/112] Generating src/backend/jit/llvm/llvmjit_types.bc with a custom 
command

FAILED: src/backend/jit/llvm/llvmjit_types.bc
/usr/local/bin/ccache /usr/local/Cellar/llvm/18.1.6/bin/clang -c -o 
src/backend/jit/llvm/llvmjit_types.bc 
../src/backend/jit/llvm/llvmjit_types.c -flto=thin -emit-llvm -MD -MQ 
src/backend/jit/llvm/llvmjit_types.bc -MF 
src/backend/jit/llvm/llvmjit_types.c.bc.d -O2 -Wno-ignored-attributes 
-Wno-empty-body -fno-strict-aliasing -fwrapv -I./src/include 
-I./src/backend/utils/misc -I../src/include

In file included from ../src/backend/jit/llvm/llvmjit_types.c:27:
In file included from ../src/include/postgres.h:45:
../src/include/c.h:75:10: fatal error: 'libintl.h' file not found
   75 | #include <libintl.h>
      |          ^~~~~~~~~~~
1 error generated.


The reason is that libintl.h is at /usr/local/include/libintl.h, but 
that is not in the include path for this command.  I have 
-I/usr/local/include in CPPFLAGS in the environment, which is why the 
normal compilation commands pick it up, but this is not used by this 
custom command.


With this small patch I can make it work:

diff --git a/src/backend/jit/llvm/meson.build 
b/src/backend/jit/llvm/meson.build

index 41c759f73c5..4a4232661ba 100644
--- a/src/backend/jit/llvm/meson.build
+++ b/src/backend/jit/llvm/meson.build
@@ -63,6 +63,7 @@ bitcode_cflags = ['-fno-strict-aliasing', '-fwrapv']
 if llvm.version().version_compare('>=15.0')
   bitcode_cflags += ['-Xclang', '-no-opaque-pointers']
 endif
+bitcode_cflags += get_option('c_args')
 bitcode_cflags += cppflags

 # XXX: Worth improving on the logic to find directories here


Is that correct?




Re: Improve conditional compilation for direct I/O alignment checks

2024-05-26 Thread Peter Eisentraut

On 26.05.24 09:16, Junwang Zhao wrote:

This patch refactors the alignment checks for direct I/O to preprocess phase,
thereby reducing some CPU cycles.


This patch replaces for example

if (PG_O_DIRECT != 0 && PG_IO_ALIGN_SIZE <= BLCKSZ)
Assert((uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer));

with

#if PG_O_DIRECT != 0 && PG_IO_ALIGN_SIZE <= BLCKSZ
Assert((uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer));
#endif

You appear to be assuming that this saves some CPU cycles.  But this is 
not the case.  The compiler will remove the unused code in the first 
case because all the constants in the if expression are known at 
compile-time.





Re: Upgrade Debian CI images to Bookworm

2024-05-24 Thread Peter Eisentraut

On 13.05.24 12:57, Nazir Bilal Yavuz wrote:

Bookworm versions of the Debian CI images are available now [0]. The
patches to use these images are attached.

'v1-0001-Upgrade-Debian-CI-images-to-Bookworm_REL_16+.patch' patch can
be applied to both upstream and REL_16 and all of the tasks finish
successfully.

'v1-0001-Upgrade-Debian-CI-images-to-Bookworm_REL_15.patch' patch can
be applied to REL_15 but it gives a compiler warning. The fix for this
warning is proposed here [1]. After the fix is applied, all of the
tasks finish successfully.

Any kind of feedback would be appreciated.


These updates are very welcome and look straightforward enough.

I'm not sure what the backpatching expectations of this kind of thing 
is.  The history of this CI setup is relatively short, so this hasn't 
been stressed too much.  I see that we once backpatched the macOS 
update, but that might have been all.


If we start backpatching this kind of thing, then this will grow as a 
job over time.  We'll have 5 or 6 branches to keep up to date, with 
several operating systems.  And once in a while we'll have to make 
additional changes like this warning fix you mention here.  I'm not sure 
how much we want to take this on.  Is there ongoing value in the CI 
setup in backbranches?


With these patches, we could do either of the following:

1) We update only master and only after it branches for PG18.  (The 
update is a "new feature".)


2) We update only master but do it now.  (This gives us the most amount 
of buffer time before the next release.)


3) We update master and PG16 now.  We ignore PG15.

4) We update master and PG16 now.  We update PG15 whenever that warning 
is fixed.


5) We update master, PG16, and PG15, but we hold all of them until the 
warning in PG15 is fixed.


6) We update all of them now and let the warning in PG15 be fixed 
independently.






Re: Convert node test compile-time settings into run-time parameters

2024-05-24 Thread Peter Eisentraut

On 21.05.24 20:48, Andres Freund wrote:

Where I'd be more concerned about peformance is the added branch in
READ_LOCATION_FIELD. There are a lot of calls to that, addding runtime
branches to each, with external function calls inside, is somewhat likely to
be measurable.


Ok, I have an improved plan.  I'm wrapping all the code related to this 
in #ifdef DEBUG_NODE_TESTS_ENABLED.  This in turn is defined in 
assert-enabled builds, or if you define it explicitly, or if you define 
one of the legacy individual symbols.  That way you get the run-time 
settings in a normal development build, but there is no new run-time 
overhead.  This is the same scheme that we use for debug_discard_caches.


(An argument could be made to enable this code if and only if assertions 
are enabled, since these tests are themselves kind of assertions.  But I 
think having a separate symbol documents the purpose of the various code 
sections better.)



Another thought:  Do we really need three separate settings?


Maybe not three settings, but a single setting, with multiple values, like
debug_io_direct?


Yeah, good idea.  Let's get some more feedback on this before I code up 
a complicated list parser.


Another approach might be levels.  My testing showed that the overhead 
of the copy_parse_plan_trees and raw_expression_coverage_tests flags is 
hardly noticeable, but write_read_parse_plan_trees has some noticeable 
impact.  So you could do 0=off, 1=only the cheap ones, 2=all tests.


In fact, if we could make "only the cheap ones" the default for 
assert-enabled builds, then most people won't even need to worry about 
this setting: The only way to mess up the write_read_parse_plan_trees is 
if you write custom node support, which is rare.  But the raw expression 
coverage still needs to be maintained by hand, so it's more often 
valuable to have it checked automatically.
From 80f35c90e3414dabd879e8832ce1b89c685e5de5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 24 May 2024 11:42:02 +0200
Subject: [PATCH v2] Convert node test compile-time settings into run-time
 parameters

This converts

COPY_PARSE_PLAN_TREES
WRITE_READ_PARSE_PLAN_TREES
RAW_EXPRESSION_COVERAGE_TEST

into run-time parameters

debug_copy_parse_plan_trees
debug_write_read_parse_plan_trees
debug_raw_expression_coverage_test

They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS.

The compile-time symbols are kept for build farm compatibility, but
they now just determine the default value of the run-time settings.

Furthermore, support for these settings is not compiled in at all
unless assertions are enabled, or the new symbol
DEBUG_NODE_TESTS_ENABLED is defined at compile time, or any of the
legacy compile-time setting symbols are defined.  So there is no
run-time overhead in production builds.  (This is similar to the
handling of DISCARD_CACHES_ENABLED.)

Discussion: 
https://www.postgresql.org/message-id/flat/30747bd8-f51e-4e0c-a310-a6e2c37ec8aa%40eisentraut.org
---
 .cirrus.tasks.yml   |  3 +-
 doc/src/sgml/config.sgml| 71 +
 src/backend/nodes/README|  9 ++--
 src/backend/nodes/read.c| 12 ++---
 src/backend/nodes/readfuncs.c   |  4 +-
 src/backend/parser/analyze.c| 44 ++
 src/backend/tcop/postgres.c | 30 +++-
 src/backend/utils/misc/guc_tables.c | 59 
 src/include/nodes/nodes.h   |  2 +-
 src/include/nodes/readfuncs.h   |  2 +-
 src/include/pg_config_manual.h  | 34 +++---
 src/include/utils/guc.h |  6 +++
 12 files changed, 212 insertions(+), 64 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index a2388cd5036..6a9ff066391 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -133,9 +133,10 @@ task:
 DISK_SIZE: 50
 
 CCACHE_DIR: /tmp/ccache_dir
-CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
+CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
 CFLAGS: -Og -ggdb
 
+PG_TEST_INITDB_EXTRA_OPTS: -c debug_copy_parse_plan_trees=on -c debug_write_read_parse_plan_trees=on -c debug_raw_expression_coverage_test=on
 PG_TEST_PG_UPGRADE_MODE: --link
 
   <<: *freebsd_task_template
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb6..c121a13b4c3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11372,6 +11372,29 @@ Developer Options
   
  
 
+ 
+  debug_copy_parse_plan_trees (boolean)
+  
+   debug_copy_parse_plan_trees configuration parameter
+  
+  
+  
+   
+Enabling this forces all parse and plan trees to be passed through
+copyObject(), to facilitate catching errors and
+omissions in copyObject().  Th

Re: AIX support

2024-05-23 Thread Peter Eisentraut

On 22.05.24 18:15, Sriram RK wrote:

Please find the attached patch.

Apart from the AIX specific changes, there is a minor change in this 
file wrt to XLC, below is the error for which we removed inline.


Later, the build and tests passed for both XLC(16.1.0.18) and gcc(12) as 
well.


I think what you should do next is aggressively trim anything that does 
not apply to current versions of AIX or the current compiler.


For example,

+  # Old xlc versions (<13.1) don't have support for -qvisibility. Use expfull to force


+   
+AIX versions before 7.1 are no longer
+tested nor supported by the PostgreSQL
+community.
+   

(Probably most of that section needs to be retested and rewritten.)

+  # Native memset() is faster, tested on:
+  # - AIX 5.1 and 5.2, XLC 6.0 (IBM's cc)
+  # - AIX 5.3 ML3, gcc 4.0.1
+  memset_loop_limit = 0

+   # for the base executable (AIX 4.2 and up)

+ * "IBM XL C/C++ for AIX, V12.1" miscompiles, for 32-bit, some inline


One of the reasons that the AIX port ultimately became unmaintainable 
was that so many hacks and caveats were accumulated over the years.  A 
new port should set a more recent baseline and trim all those hacks.






Re: Virtual generated columns

2024-05-22 Thread Peter Eisentraut

On 29.04.24 20:54, Corey Huinker wrote:

  -- generation expression must be immutable
-CREATE TABLE gtest_err_4 (a int PRIMARY KEY, b double precision
GENERATED ALWAYS AS (random()) STORED);
+CREATE TABLE gtest_err_4 (a int PRIMARY KEY, b double precision
GENERATED ALWAYS AS (random()) VIRTUAL);

Does a VIRTUAL generated column have to be immutable? I can see where 
the STORED one has to be, but consider the following:


CREATE TABLE foo (
created_at timestamptz DEFAULT CURRENT_TIMESTAMP,
row_age interval GENERATED ALWAYS AS (CURRENT_TIMESTAMP - created_at)
);


I have been hesitant about this, but I'm now leaning toward that we 
could allow this.



  -- can't have generated column that is a child of normal column
  CREATE TABLE gtest_normal (a int, b int);
-CREATE TABLE gtest_normal_child (a int, b int GENERATED ALWAYS AS
(a * 2) STORED) INHERITS (gtest_normal);  -- error
+CREATE TABLE gtest_normal_child (a int, b int GENERATED ALWAYS AS
(a * 2) VIRTUAL) INHERITS (gtest_normal);  -- error

This is the barrier to the partitioning reorganization scheme I 
described above. Is there any hard rule why a child table couldn't have 
a generated column matching the parent's regular column? I can see where 
it might prevent indexing that column on the parent table, but is there 
some other dealbreaker or is this just a "it doesn't work yet" situation?


We had a quite a difficult time getting the inheritance business of 
stored generated columns working correctly.  I'm sticking to the 
well-trodden path here.  We can possibly expand this if someone wants to 
work out the details.


One last thing to keep in mind is that there are two special case 
expressions in the spec:


GENERATED ALWAYS AS ROW START
GENERATED ALWAYS AS ROW END

and we'll need to be able to fit those into the catalog. I'll start 
another thread for that unless you prefer I keep it here.


I think this is a separate feature.





Re: Schema variables - new implementation for Postgres 15

2024-05-22 Thread Peter Eisentraut

On 18.05.24 13:29, Alvaro Herrera wrote:

I want to note that when we discussed this patch series at the dev
meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
schema variables at all because of the fact that creating a variable
would potentially change the meaning of queries by shadowing table
columns.  But this turns out to be incorrect: it's_variables_  that are
shadowed by table columns, not the other way around.


But that's still bad, because seemingly unrelated schema changes can 
make variables appear and disappear.  For example, if you have


SELECT a, b FROM table1

and then you drop column b, maybe the above query continues to work 
because there is also a variable b.  Or maybe it now does different 
things because b is of a different type.  This all has the potential to 
be very confusing.






Re: Pgoutput not capturing the generated columns

2024-05-22 Thread Peter Eisentraut

On 08.05.24 09:13, Shubham Khanna wrote:

The attached patch has the changes to support capturing generated
column data using ‘pgoutput’ and’ test_decoding’ plugin. Now if the
‘include_generated_columns’ option is specified, the generated column
information and generated column data also will be sent.


It might be worth keeping half an eye on the development of virtual 
generated columns [0].  I think it won't be possible to include those 
into the replication output stream.


I think having an option for including stored generated columns is in 
general ok.



[0]: 
https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b...@eisentraut.org





Re: typedef extraction - back to the future

2024-05-22 Thread Peter Eisentraut

On 20.05.24 23:11, Andrew Dunstan wrote:
Attached is an attempt to thread this needle. The core is a new perl 
module that imports the current buildfarm client logic. The intention is 
that once we have this, the buildfarm client will switch to using the 
module (if found) rather than its own built-in logic. There is precedent 
for this sort of arrangement (AdjustUpgrade.pm). Accompanying the new 
module is a standalone perl script that uses the new module, and 
replaces the current shell script (thus making it more portable).


It looks like this code could use a bit of work to modernize and clean 
up cruft, such as


+   my $sep = $using_msvc ? ';' : ':';

This can be done with File::Spec.

+   next if $bin =~ m!bin/(ipcclean|pltcl_)!;

Those are long gone.

+   next if $bin =~ m!/postmaster.exe$!;# sometimes a 
copy not a link


Also gone.

+   elsif ($using_osx)
+   {
+   # On OS X, we need to examine the .o files

Update the name.

+   # exclude ecpg/test, which pgindent does too
+   my $obj_wanted = sub {
+   /^.*\.o\z/s
+ && !($File::Find::name =~ m!/ecpg/test/!s)
+ && push(@testfiles, $File::Find::name);
+   };
+
+   File::Find::find($obj_wanted, $binloc);
+   }

Not clear why this is specific to that platform.

Also, some instructions should be provided.  It looks like this is meant 
to be run on the installation tree?  A README and/or a build target 
would be good.


The code distinguishes between srcdir and bindir, but it's not clear 
what the latter is.  It rather looks like the installation prefix.  Does 
this code support out of tree builds?  This should be cleared up.






Re: in parentesis is not usual on DOCs

2024-05-22 Thread Peter Eisentraut

On 20.05.24 02:00, jian he wrote:

Removing the parentheses means we need to rephrase this sentence.
So I came up with the following rephrasing:

The context_item specifies the input data to query; the
path_expression is a JSON path expression defining the query;
json_path_name is an optional name for the path_expression. The
optional PASSING clause can provide data values to the
path_expression.


Based on this, I wrote a simple patch.


Your patch kind of messes up the indentation of the text you are 
changing.  Please check that.





Re: Convert node test compile-time settings into run-time parameters

2024-05-21 Thread Peter Eisentraut

On 20.05.24 15:59, Tom Lane wrote:

Peter Eisentraut  writes:

This patch converts the compile-time settings
  COPY_PARSE_PLAN_TREES
  WRITE_READ_PARSE_PLAN_TREES
  RAW_EXPRESSION_COVERAGE_TEST



into run-time parameters



  debug_copy_parse_plan_trees
  debug_write_read_parse_plan_trees
  debug_raw_expression_coverage_test


I'm kind of down on this.  It seems like forcing a bunch of
useless-in-production debug support into the standard build.
What of this would be of any use to any non-developer?


We have a bunch of other debug_* settings that are available in 
production builds, such as


debug_print_parse
debug_print_rewritten
debug_print_plan
debug_pretty_print
debug_discard_caches
debug_io_direct
debug_parallel_query
debug_logical_replication_streaming

Maybe we could hide all of them behind some #ifdef DEBUG_OPTIONS, but in 
any case, I don't think the ones being proposed here are substantially 
different from those existing ones that they would require a separate 
treatment.


My goal is to make these facilities easier to use, avoiding hand-editing 
pg_config_manual.h and having to recompile.






Re: Speed up clean meson builds by ~25%

2024-05-20 Thread Peter Eisentraut

On 19.05.24 00:09, Andres Freund wrote:

On 2024-05-18 00:35:08 +0200, Peter Eisentraut wrote:

I retested the patch from 2024-04-07 (I think that's the one that "fixed
that"?  There are multiple "v1" patches in this thread.) using gcc-14 and
clang-18, with ccache disabled of course.  The measured effects of the patch
are:

gcc-14:   1% slower
clang-18: 3% faster

So with that, it doesn't seem very interesting.

I wonder whether the reason you're seeing less benefit than Jelte is that - I'm
guessing - you now used ninja 1.12 and Jelte something older.  Ninja 1.12
prioritizes building edges using a "critical path" heuristic, leading to
scheduling nodes with more incoming dependencies, and deeper in the dependency
graph earlier.


Yes!  Very interesting!

With ninja 1.11 and gcc-14, I see the patch gives about a 17% speedup.
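The scheduling change described here can be illustrated with a toy model: each task's priority is the length of the longest chain of build steps that depend on it, so prerequisites deep in the dependency graph get scheduled first (task names are made up; this is not ninja's actual code):

```python
# Toy "critical path" scheduler: rank each task by the longest chain of
# tasks that depend on it, so prerequisites deep in the dependency
# graph are scheduled early.
def critical_path_lengths(dependents):
    # dependents: task -> list of tasks that directly depend on it
    memo = {}

    def length(task):
        if task not in memo:
            memo[task] = 1 + max((length(d) for d in dependents.get(task, [])),
                                 default=0)
        return memo[task]

    return {task: length(task) for task in dependents}


# Example: a .c file feeding a long build chain outranks an independent leaf.
deps = {
    "llvmjit_types.c": ["llvmjit_types.bc"],
    "llvmjit_types.bc": ["postgres"],
    "postgres": [],
    "README": [],
}
build_order = sorted(deps, key=lambda t: -critical_path_lengths(deps)[t])
```

A pre-1.12 ninja without this heuristic can leave a long chain like the clang/LLVM bitcode steps to start late, which matches the measurements above.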




Convert node test compile-time settings into run-time parameters

2024-05-20 Thread Peter Eisentraut

This patch converts the compile-time settings

COPY_PARSE_PLAN_TREES
WRITE_READ_PARSE_PLAN_TREES
RAW_EXPRESSION_COVERAGE_TEST

into run-time parameters

debug_copy_parse_plan_trees
debug_write_read_parse_plan_trees
debug_raw_expression_coverage_test

They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS.

The effect is the same, but now you don't need to recompile in order to 
use these checks.


The compile-time symbols are kept for build farm compatibility, but they 
now just determine the default value of the run-time settings.


Possible concerns:

- Performance?  Looking for example at pg_parse_query() and its 
siblings, they also check for other debugging settings like 
log_parser_stats in the main code path, so it doesn't seem to be a concern.


- Access control?  I have these settings as PGC_USERSET for now. Maybe 
they should be PGC_SUSET?


Another thought:  Do we really need three separate settings?
From 568c620eb97f82aa8eab850cb3ce703c5c94a912 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 20 May 2024 09:13:23 +0200
Subject: [PATCH v1] Convert node test compile-time settings into run-time
 parameters

This converts

COPY_PARSE_PLAN_TREES
WRITE_READ_PARSE_PLAN_TREES
RAW_EXPRESSION_COVERAGE_TEST

into run-time parameters

debug_copy_parse_plan_trees
debug_write_read_parse_plan_trees
debug_raw_expression_coverage_test

They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS.

The compile-time symbols are kept for build farm compatibility, but
they now just determine the default value of the run-time settings.

TODO: config.sgml documentation
---
 .cirrus.tasks.yml   |  3 +-
 src/backend/nodes/README|  9 ++---
 src/backend/nodes/read.c| 15 +---
 src/backend/nodes/readfuncs.c   | 10 +-
 src/backend/parser/analyze.c| 14 +++-
 src/backend/tcop/postgres.c | 18 --
 src/backend/utils/misc/guc_tables.c | 55 +
 src/include/nodes/nodes.h   |  2 --
 src/include/nodes/readfuncs.h   |  2 --
 src/include/pg_config_manual.h  | 21 ---
 src/include/utils/guc.h |  4 +++
 11 files changed, 79 insertions(+), 74 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index a2388cd5036..6a9ff066391 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -133,9 +133,10 @@ task:
 DISK_SIZE: 50
 
 CCACHE_DIR: /tmp/ccache_dir
-CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DCOPY_PARSE_PLAN_TREES -DWRITE_READ_PARSE_PLAN_TREES -DRAW_EXPRESSION_COVERAGE_TEST -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
+CPPFLAGS: -DRELCACHE_FORCE_RELEASE -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
 CFLAGS: -Og -ggdb
 
+PG_TEST_INITDB_EXTRA_OPTS: -c debug_copy_parse_plan_trees=on -c debug_write_read_parse_plan_trees=on -c debug_raw_expression_coverage_test=on
 PG_TEST_PG_UPGRADE_MODE: --link
 
   <<: *freebsd_task_template
diff --git a/src/backend/nodes/README b/src/backend/nodes/README
index 52364470205..f8bbd605386 100644
--- a/src/backend/nodes/README
+++ b/src/backend/nodes/README
@@ -98,10 +98,11 @@ Suppose you want to define a node Foo:
node types to find all the places to touch.
(Except for frequently-created nodes, don't bother writing a creator
function in makefuncs.c.)
-4. Consider testing your new code with COPY_PARSE_PLAN_TREES,
-   WRITE_READ_PARSE_PLAN_TREES, and RAW_EXPRESSION_COVERAGE_TEST to ensure
-   support has been added everywhere that it's necessary; see
-   pg_config_manual.h about these.
+4. Consider testing your new code with debug_copy_parse_plan_trees,
+   debug_write_read_parse_plan_trees, and
+   debug_raw_expression_coverage_test to ensure support has been added
+   everywhere that it's necessary (e.g., run the tests with
+   PG_TEST_INITDB_EXTRA_OPTS='-c debug_...=on').
 
 Adding a new node type moves the numbers associated with existing
 tags, so you'll need to recompile the whole tree after doing this.
diff --git a/src/backend/nodes/read.c b/src/backend/nodes/read.c
index 4eb42445c52..e2d2ce7374d 100644
--- a/src/backend/nodes/read.c
+++ b/src/backend/nodes/read.c
@@ -32,9 +32,7 @@
 static const char *pg_strtok_ptr = NULL;
 
 /* State flag that determines how readfuncs.c should treat location fields */
-#ifdef WRITE_READ_PARSE_PLAN_TREES
 bool   restore_location_fields = false;
-#endif
 
 
 /*
@@ -42,17 +40,14 @@ boolrestore_location_fields = false;
  *   builds a Node tree from its string representation (assumed valid)
  *
  * restore_loc_fields instructs readfuncs.c whether to restore location
- * fields rather than set them to -1.  This is currently only supported
- * in builds with the WRITE_READ_PARSE_PLAN_TREES debugging flag set.
+ * fields rather than set them to -1.
  */
 static void *
 stringToNodeInternal(const char *str, bool restore_loc_fields)
 {
void   *retval;
const char *save_

Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

2024-05-19 Thread Peter Eisentraut

On 19.05.24 20:07, David Benjamin wrote:
On Sun, May 19, 2024 at 12:21 PM Tom Lane > wrote:


Per the cfbot [1], this patch needs a rebase over the ALPN-related
commits.  It still isn't likely to get human attention before the
July commitfest, but you can save some time by making sure it's
in a reviewable state before that.


Rebased version attached. (The conflict was pretty trivial. Both patches 
add a field to some struct.)


Note that there is a concurrent plan to drop support for OpenSSL older 
than 1.1.1 [0], which would obsolete your configure test for 
BIO_get_data.  Whoever commits these should be sure to coordinate this.



[0]: https://www.postgresql.org/message-id/flat/zg3jnursg69dz...@paquier.xyz





Re: Speed up clean meson builds by ~25%

2024-05-17 Thread Peter Eisentraut

On 17.05.24 23:01, Robert Haas wrote:

On Fri, May 17, 2024 at 4:59 PM Tom Lane  wrote:

As I mentioned upthread, I'm more worried about confusing error
reports than the machine time.


Well, Jelte fixed that.


I grant that there are people who are more affected, but still, I'd
just as soon not contort the build rules for a temporary problem.


Arguably by doing this, but I don't think it's enough of a contortion
to get excited about.

Anyone else want to vote?


I retested the patch from 2024-04-07 (I think that's the one that "fixed 
that"?  There are multiple "v1" patches in this thread.) using gcc-14 
and clang-18, with ccache disabled of course.  The measured effects of 
the patch are:


gcc-14:   1% slower
clang-18: 3% faster

So with that, it doesn't seem very interesting.





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Peter Eisentraut

On 17.05.24 14:42, Joe Conway wrote:
Namely, the week before commitfest I don't actually know if I will 
have the time during that month, but I will make sure my patch is in 
the commitfest just in case I get a few clear days to work on it. 
Because if it isn't there, I can't take advantage of those "found" hours.


A solution to both of these issues (yours and mine) would be to allow 
things to be added *during* the CF month. What is the point of having a 
"freeze" before every CF anyway? Especially if they start out clean. If 
something is ready for review on day 8 of the CF, why not let it be 
added for review?


Maybe this all indicates that the idea of bimonthly commitfests has run 
its course.  The original idea might have been, if we have like 50 
patches, we can process all of them within a month.  We're clearly not 
doing that anymore.  What would the development process look like if we 
just had one commitfest per dev cycle that runs from July 1st to March 31st?






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Peter Eisentraut

On 17.05.24 09:32, Heikki Linnakangas wrote:
Dunno about having to click a link or sparkly gold borders, but +1 on 
having a free-form text box for notes like that. Things like "cfbot says 
this has bitrotted" or "Will review this after other patch this depends 
on". On the mailing list, notes like that are both noisy and easily lost 
in the threads. But as a little free-form text box on the commitfest, it 
would be handy.


One risk is that if we start to rely too much on that, or on the other 
fields in the commitfest app for that matter, we de-value the mailing 
list archives. I'm not too worried about it, the idea is that the 
summary box just summarizes what's already been said on the mailing 
list, or is transient information like "I'll get to this tomorrow" 
that's not interesting to archive.


We already have the annotations feature, which is kind of this.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Peter Eisentraut

On 17.05.24 13:36, Daniel Gustafsson wrote:

On 17 May 2024, at 13:13, Robert Haas  wrote:



But if we are to guess what those reasons might be, Tom has already
admitted he does that for CI, and I do the same, so probably other
people also do it. I also suspect that some people are essentially
using the CF app as a personal todo list. By sticking patches in there
that they intend to commit next cycle, they both (1) feel virtuous,
because they give at least the appearance of following the community
process and inviting review before they commit and (2) avoid losing
track of the stuff they plan to commit.

There may be other reasons, too.


I think there is one more which is important: 3) Giving visibility into "this
is what I intend to commit".  Few can follow -hackers to the level where they
can have an overview of ongoing and/or finished work which will go in.  The CF
app does however provide that overview.  This is essentially the TODO list
aspect, but sharing one's TODO isn't all bad, especially for maintainers.


Ok, but these cases shouldn't use a separate "parking lot".  They should 
use the same statuses and flow diagram as the rest.  (Maybe with more 
dotted lines, sure.)






Re: GUC names in messages

2024-05-17 Thread Peter Eisentraut

On 17.05.24 05:31, Peter Smith wrote:

I think we should accept your two patches

v6-0001-GUC-names-docs.patch
v6-0002-GUC-names-add-quotes.patch

which effectively everyone was in favor of and which seem to be the most
robust and sustainable solution.

(The remaining three patches from the v6 set would be PG18 material at
this point.)

Thanks very much for taking an interest in resurrecting this thread.

It was always my intention to come back to this when the dust had
settled on PG17. But it would be even better if the docs for the rule
"just quote everything", and anything else you deem acceptable, can be
pushed sooner.

Of course, there will still be plenty more to do for PG18, including
locating examples in newly pushed code for messages that have slipped
through the cracks during the last few months using different formats,
and other improvements, but those tasks should become easier if we can
get some of these v6 patches out of the way first.


I committed your 0001 and 0002 now, with some small fixes.

There has also been quite a bit of new code, of course, since you posted 
your patches, so we'll probably find a few more things that could use 
adjustment.


I'd be happy to consider the rest of your patch set after beta1 and/or 
for PG18.






Re: Ignore Visual Studio's Temp Files While Working with PG on Windows

2024-05-17 Thread Peter Eisentraut

On 17.05.24 08:09, Yasir wrote:
I have been playing with PG on the Windows platform recently. An 
annoying thing I faced is that a lot of Visual Studio's temp files kept 
appearing in git changed files. Therefore, I am submitting this very 
trivial patch to ignore these temp files.


Our general recommendation is that you put such things into your 
personal global git ignore file.


For example, I have in ~/.gitconfig

[core]
excludesFile = ~/.gitexcludes

and then in ~/.gitexcludes I have various ignores that are specific to 
my local tooling.


That way we don't have to maintain ignore lists for all the tools in the 
world in the PostgreSQL source tree.






Re: Minor cleanups in the SSL tests

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:27, Daniel Gustafsson wrote:

On 16 May 2024, at 11:43, Peter Eisentraut  wrote:



You might want to run your patch through pgperltidy.  The result doesn't look 
bad, but a bit different than what you had crafted.


Ugh, I thought I had but clearly had forgotten. Fixed now.


append_conf() opens and closes the file for each call.  It might be nice if it 
could accept a list.  Or you can just pass the whole block as one string, like 
it was done for pg_ident.conf before.


The attached v2 pass the whole block as a here-doc which seemed like the best
option to retain readability of the config.


Works for me.
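
For illustration, the pass-everything-at-once idea looks roughly like this in Python (a sketch only; the real append_conf lives in the Perl module PostgreSQL::Test::Cluster, whose API this does not reproduce):

```python
import os
import tempfile


def append_conf(path, *settings):
    # Open the file once and append all settings in a single write,
    # instead of reopening the file for every individual setting.
    with open(path, "a") as f:
        f.write("\n".join(settings) + "\n")


# Usage: write a whole block of SSL-related settings in one call.
conf = os.path.join(tempfile.mkdtemp(), "postgresql.conf")
append_conf(conf,
            "ssl = on",
            "ssl_ca_file = 'root+client_ca.crt'")
```

Passing a here-doc style block, as done in the patch, achieves the same single-open behavior without changing the function's signature.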





Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Peter Eisentraut

On 16.05.24 16:45, Tom Lane wrote:

Peter Eisentraut  writes:

In these cases, I think for
NotificationHash
ResourceOwnerData
WalSyncMethod
we can just get rid of the typedef.


I have no objection to dealing with NotificationHash as you have here.


ReadBuffersFlags shouldn't be an enum at all, because its values are
used as flag bits.


Yeah, that was bothering me too, but I went for the minimum delta.
I did think that a couple of integer macros would be a better idea,
so +1 for what you did here.


I committed this, and Michael took care of WaitEventExtension, so we 
should be all clear here.






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 17.05.24 04:26, Robert Haas wrote:

I do*emphatically*  think we need a parking lot.


People seem to like this idea; I'm not quite sure I follow it.  If you 
just want the automated patch testing, you can do that on your own by 
setting up github/cirrus for your own account.  If you keep emailing 
your patches to the public list just because you don't want to set up 
your private testing tooling, that seems a bit abusive?


Are there other reasons why developers might want their patches 
registered in a parking lot?


We also need to consider that the cfbot/cirrus resources are limited. 
Whatever changes we make, we should make sure that they are prioritized 
well.






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 17.05.24 00:13, Tom Lane wrote:

So a third status that encompasses the various other situations like
maybe forgotten by author, disagreements between author and reviewer,
process difficulties, needs some senior developer intervention, etc.
could be helpful.


Hmm, "forgotten by author" seems to generally turn into "this has been
in WOA state a long time".  Not sure we have a problem representing
that, only with a process for eventually retiring such entries.
Your other three examples all sound like "needs senior developer
attention", which could be a helpful state that's distinct from "ready
for committer".  It's definitely not the same as "Unclear".


Yeah, some fine-tuning might be required.  But I would be wary of 
over-designing too many new states at this point.  I think the key idea 
is that there ought to be a state that communicates "needs attention 
from someone other than author, reviewer, or committer".






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:46, Tom Lane wrote:

Peter Eisentraut  writes:

The problem is if we have 180 patches in Needs Review, and only 20 are
really actually ready to be reviewed.  And a second-order problem is
that if you already know that this will be the case, you give up before
even looking.


Right, so what can we do about that?  Does Needs Review state need to
be subdivided, and if so how?


Maybe a new state "Unclear".  Then a commitfest manager, or someone like 
Robert just now, can more easily power through the list and set 
everything that's doubtful to Unclear, with the understanding that the 
author can set it back to Needs Review to signal that they actually 
think it's ready.  Or, as a variant of what someone else was saying, 
just set all patches that carry over from March to July as Unclear.  Or 
something like that.


I think, if we consider the core mission of the commitfest app, we need 
to be more protective of the Needs Review state.


I have been, from time to time, when I'm in commitfest management mode, 
aggressive in setting entries back to Waiting on Author, but that's not 
always optimal.


So a third status that encompasses the various other situations like 
maybe forgotten by author, disagreements between author and reviewer, 
process difficulties, needs some senior developer intervention, etc. 
could be helpful.


This could also help scale the commitfest manager role, because then the 
head CFM could have like three helpers and just tell them, look at all 
the "Unclear" ones and help resolve them.





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:06, Joe Conway wrote:

On 5/16/24 16:57, Jacob Champion wrote:

On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:

Maybe we should just make it a policy that *nothing* gets moved forward
from commitfest-to-commitfest and therefore the author needs to care
enough to register for the next one?


I think that's going to severely disadvantage anyone who doesn't do
this as their day job. Maybe I'm bristling a bit too much at the
wording, but not having time to shepherd a patch is not the same as
not caring.


Maybe the word "care" was a poor choice, but forcing authors to think 
about and decide if they have the "time to shepherd a patch" for the 
*next CF* is exactly the point. If they don't, why clutter the CF with it.


Objectively, I think this could be quite effective.  You need to prove 
your continued interest in your project by pressing a button every two 
months.


But there is a high risk that this will double the annoyance for 
contributors whose patches aren't getting reviews.  Now, not only are 
you being ignored, but you need to prove that you're still there every 
two months.






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:04, Tom Lane wrote:

I think it'd be great if we could separate "I'm actively reviewing
this" from "I'm interested in this".  As a bonus, adding yourself
to the "interested" list would be a fine proxy for the thumbs-up
or star markers mentioned upthread.

If those were separate columns, we could implement some sort of
aging scheme whereby somebody who'd not commented for (say)
a week or two would get quasi-automatically moved from the "active
reviewer" column to the "interested" column, whereupon it wouldn't
be impolite for someone else to sign up for active review.


Yes, I think some variant of this could be quite useful.
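
A minimal sketch of the aging idea (the two-week threshold and field names are illustrative, not a proposal for the actual commitfest app schema):

```python
from datetime import datetime, timedelta

AGE_LIMIT = timedelta(weeks=2)  # illustrative threshold


def demote_stale_reviewers(reviewers, interested, last_comment, now):
    """Move reviewers with no comment within AGE_LIMIT to 'interested'."""
    still_active = []
    for name in reviewers:
        if now - last_comment.get(name, datetime.min) > AGE_LIMIT:
            interested.append(name)  # no longer blocks others from reviewing
        else:
            still_active.append(name)
    return still_active, interested
```

Run periodically, this keeps the "active reviewer" column honest while preserving the signal of interest.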




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 22:46, Melanie Plageman wrote:

I was reflecting on why a general purpose patch tracker sounded
appealing to me, and I realized that, at least at this time of year, I
have a few patches that really count as "waiting on author" that I
know I need to do additional work on before they need more review but
which aren't currently my top priority. I should probably simply
withdraw and re-register them. My justification was that I'll lose
them if I don't keep them in the commitfest app. But, I could just,
you know, save them somewhere myself instead of polluting the
commitfest app with them. I don't know if others are in this
situation. Anyway, I'm definitely currently guilty of parking.


I don't have a problem with that at all.  It's pretty understandable 
that many patches are parked between say April and July.


The key is the keep the status up to date.

If we have 890 patches in Waiting for Author and 100 patches in Ready 
for Committer and 10 patches in Needs Review, and those 10 patches are 
actually reviewable, then that's good.  There might need to be a 
"background process" to make sure those 890 waiting patches are not 
parked forever and those 100 patches actually get committed sometime, 
but in principle this is the system working.


The problem is if we have 180 patches in Needs Review, and only 20 are 
really actually ready to be reviewed.  And a second-order problem is 
that if you already know that this will be the case, you give up before 
even looking.






avoid MERGE_ACTION keyword?

2024-05-16 Thread Peter Eisentraut

I wonder if we can avoid making MERGE_ACTION a keyword.

I think we could parse it initially as a function and then transform it 
to a more special node later.  In the attached patch, I'm doing this in 
parse analysis.  We could try to do it even later and actually execute 
it as a function, if we could get the proper context passed into it somehow.


I'm thinking about this with half an eye on future features.  For 
example, row pattern recognition might introduce similar magic functions 
match_number() and classifier() (somewhat the inspiration for the 
merge_action() syntax), which could use similar treatment.
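
In outline, the approach is: let the grammar see merge_action() as an ordinary function call, then have parse analysis replace calls to known magic functions with a dedicated node.  A toy Python sketch of that flow (the node classes are illustrative stand-ins, not the actual C structures):

```python
# Parse-analysis pass that rewrites known "magic" zero-argument
# function calls into dedicated expression nodes.
class FuncCall:
    def __init__(self, name, args):
        self.name, self.args = name, args


class MergeSupportFunc:
    """Stand-in for the special node produced for merge_action()."""


MAGIC_FUNCS = {"merge_action": MergeSupportFunc}


def transform_func_call(node):
    if isinstance(node, FuncCall) and node.name in MAGIC_FUNCS and not node.args:
        return MAGIC_FUNCS[node.name]()
    return node  # ordinary function calls pass through unchanged
```

The same table-driven rewrite would extend naturally to hypothetical future magic functions such as match_number() or classifier().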


Thoughts?

From 6e0132ca3f3472fb6c5f8c5b2ec7296de1f83811 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 16 May 2024 16:09:57 +0200
Subject: [PATCH] WIP: Avoid MERGE_ACTION keyword

---
 src/backend/parser/gram.y   | 12 +-
 src/backend/parser/parse_expr.c | 73 ++---
 src/include/catalog/pg_proc.dat |  2 +
 src/include/parser/kwlist.h |  1 -
 4 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 4d582950b72..b717b704972 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -751,7 +751,7 @@ static Node *makeRecursiveViewSelect(char *relname, List 
*aliases, Node *query);
LEADING LEAKPROOF LEAST LEFT LEVEL LIKE LIMIT LISTEN LOAD LOCAL
LOCALTIME LOCALTIMESTAMP LOCATION LOCK_P LOCKED LOGGED
 
-   MAPPING MATCH MATCHED MATERIALIZED MAXVALUE MERGE MERGE_ACTION METHOD
+   MAPPING MATCH MATCHED MATERIALIZED MAXVALUE MERGE METHOD
MINUTE_P MINVALUE MODE MONTH_P MOVE
 
NAME_P NAMES NATIONAL NATURAL NCHAR NESTED NEW NEXT NFC NFD NFKC NFKD NO
@@ -16077,14 +16077,6 @@ func_expr_common_subexpr:
n->location = @1;
$$ = (Node *) n;
}
-   | MERGE_ACTION '(' ')'
-   {
-   MergeSupportFunc *m = 
makeNode(MergeSupportFunc);
-
-   m->msftype = TEXTOID;
-   m->location = @1;
-   $$ = (Node *) m;
-   }
| JSON_QUERY '('
json_value_expr ',' a_expr 
json_passing_clause_opt
json_returning_clause_opt
@@ -17936,7 +17928,6 @@ col_name_keyword:
| JSON_TABLE
| JSON_VALUE
| LEAST
-   | MERGE_ACTION
| NATIONAL
| NCHAR
| NONE
@@ -18334,7 +18325,6 @@ bare_label_keyword:
| MATERIALIZED
| MAXVALUE
| MERGE
-   | MERGE_ACTION
| METHOD
| MINVALUE
| MODE
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index aba3546ed1a..12fc6829fc0 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -55,7 +55,6 @@ static Node *transformAExprDistinct(ParseState *pstate, 
A_Expr *a);
 static Node *transformAExprNullIf(ParseState *pstate, A_Expr *a);
 static Node *transformAExprIn(ParseState *pstate, A_Expr *a);
 static Node *transformAExprBetween(ParseState *pstate, A_Expr *a);
-static Node *transformMergeSupportFunc(ParseState *pstate, MergeSupportFunc 
*f);
 static Node *transformBoolExpr(ParseState *pstate, BoolExpr *a);
 static Node *transformFuncCall(ParseState *pstate, FuncCall *fn);
 static Node *transformMultiAssignRef(ParseState *pstate, MultiAssignRef 
*maref);
@@ -238,11 +237,6 @@ transformExprRecurse(ParseState *pstate, Node *expr)
result = transformGroupingFunc(pstate, (GroupingFunc *) 
expr);
break;
 
-   case T_MergeSupportFunc:
-   result = transformMergeSupportFunc(pstate,
-   
   (MergeSupportFunc *) expr);
-   break;
-
case T_NamedArgExpr:
{
NamedArgExpr *na = (NamedArgExpr *) expr;
@@ -1374,31 +1368,6 @@ transformAExprBetween(ParseState *pstate, A_Expr *a)
return transformExprRecurse(pstate, result);
 }
 
-static Node *
-transformMergeSupportFunc(ParseState *pstate, MergeSupportFunc *f)
-{
-   /*
-* All we need to do is check that we're in the RETURNING list of a 
MERGE
-* command.  If so, we just return the node as-is.
-*/
-   if (pstate->p_expr_kind != EXPR_KIND_MERGE_RETURNING)
-   {
-   ParseState *parent_pstate = pstate->p

Re: Adding the extension name to EData / log_line_prefix

2024-05-16 Thread Peter Eisentraut

On 15.05.24 17:50, Tom Lane wrote:

We kind of already have something like this, for NLS.  If you look for
pg_bindtextdomain(TEXTDOMAIN) and ereport_domain(), this information
already trickles into the vicinity of the error data.  Maybe the same
thing could just be used for this, by wiring up the macros a bit
differently.

Hmm, cute idea, but it'd only help for extensions that are
NLS-enabled.  Which I bet is a tiny fraction of the population.
So far as I can find, we don't even document how to set up
TEXTDOMAIN for an extension --- you have to cargo-cult the
macro definition from some in-core extension.


Yeah, the whole thing is a bit mysterious, and we don't need to use the 
exact mechanism we have now.


But abstractly, we should only have to specify the, uh, domain of the 
log messages once.  Whether that is used for building a message catalog 
or tagging the server log, those are just two different downstream uses 
of the same piece of information.






Re: GUC names in messages

2024-05-16 Thread Peter Eisentraut

On 04.01.24 07:53, Peter Smith wrote:

Now that I read this again, I think this is wrong.

We should decide the quoting for a category, not the actual content.
Like, quote all file names; do not quote keywords.

This led to the attempted patch to decide the quoting of GUC parameter
names dynamically based on the actual content, which no one really
liked.  But then, to preserve consistency, we also need to be uniform in
quoting GUC parameter names where the name is hardcoded.



I agree. By attempting to define when to and when not to use quotes it
has become overcomplicated.

Earlier in the thread, I counted how quotes were used in the existing
messages [5]; there were ~39 quoted and 164 not quoted. Based on that
we chose to stay with the majority, and leave all the unquoted ones so
only adding quotes "when necessary". In hindsight, that was probably
the wrong choice because it opened a can of worms about what "when
necessary" even means (e.g. what about underscores, mixed case etc).

Certainly one simple rule "just quote everything" is easiest to follow.


I've been going through the translation updates for PG17 these days and 
was led back around to this issue.  It seems we left it in an 
intermediate state that no one was really happy with and which is 
arguably as inconsistent or more so than before.


I think we should accept your two patches

v6-0001-GUC-names-docs.patch
v6-0002-GUC-names-add-quotes.patch

which effectively everyone was in favor of and which seem to be the most 
robust and sustainable solution.


(The remaining three patches from the v6 set would be PG18 material at 
this point.)






Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-16 Thread Peter Eisentraut

On 15.05.24 21:05, Robert Haas wrote:

I don't think it's at all obvious that this belongs on the server side
rather than the client side. I think it could be done in either place,
or both. I just think we don't have the infrastructure to do it
cleanly, at present.


I think if you're going to do a syntax-check-with-catalog-lookup mode, 
you need authentication and access control.  The mode without catalog 
lookup doesn't need that.  But it might be better to offer both modes 
through a similar interface (or at least plan ahead for that).






Re: An improved README experience for PostgreSQL

2024-05-16 Thread Peter Eisentraut

On 15.05.24 21:34, Nathan Bossart wrote:

On Wed, May 15, 2024 at 07:23:19AM +0200, Peter Eisentraut wrote:

I think for CONTRIBUTING.md, a better link would be
<https://www.postgresql.org/developer/>.


WFM


This patch version looks good to me.





Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Peter Eisentraut

On 16.05.24 01:32, Tom Lane wrote:

As for the remainder, they aren't showing up because no variable
or field is declared using them, which means no debug symbol
table entry is made for them.  This means we could just drop those
typedefs and be little the worse off notationally.  I experimented
with a patch for that, as attached.  (In the case of NotificationHash,
I thought it better to arrange for there to be a suitable variable;
but it could certainly be done the other way too.)  Is this too anal?


I agree we should get rid of these.

Over the last release cycle, I've been leaning a bit more toward not 
typedef'ing enums and structs that are only in local use, in part because 
of the implied need to keep the typedefs list up to date.


In these cases, I think for

NotificationHash
ResourceOwnerData
WalSyncMethod

we can just get rid of the typedef.

ReadBuffersFlags shouldn't be an enum at all, because its values are 
used as flag bits.


WaitEventExtension, I'm not sure, it's like, an extensible enum?  I 
guess let's remove the typedef there, too.


Attached is a variant patch.
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index d0891e3f0e0..ab4c72762d8 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -396,10 +396,10 @@ typedef struct NotificationList
 
 #define MIN_HASHABLE_NOTIFIES 16   /* threshold to build hashtab */
 
-typedef struct NotificationHash
+struct NotificationHash
 {
Notification *event;/* => the actual Notification struct */
-} NotificationHash;
+};
 
 static NotificationList *pendingNotifies = NULL;
 
@@ -2299,7 +2299,7 @@ AddEventToPendingNotifies(Notification *n)
 
/* Create the hash table */
hash_ctl.keysize = sizeof(Notification *);
-   hash_ctl.entrysize = sizeof(NotificationHash);
+   hash_ctl.entrysize = sizeof(struct NotificationHash);
hash_ctl.hash = notification_hash;
hash_ctl.match = notification_match;
hash_ctl.hcxt = CurTransactionContext;
diff --git a/src/backend/utils/resowner/resowner.c 
b/src/backend/utils/resowner/resowner.c
index ab9343bc5cf..505534ee8d3 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -107,7 +107,7 @@ 
StaticAssertDecl(RESOWNER_HASH_MAX_ITEMS(RESOWNER_HASH_INIT_SIZE) >= RESOWNER_AR
 /*
  * ResourceOwner objects look like this
  */
-typedef struct ResourceOwnerData
+struct ResourceOwnerData
 {
ResourceOwner parent;   /* NULL if no parent (toplevel owner) */
ResourceOwner firstchild;   /* head of linked list of children */
@@ -155,7 +155,7 @@ typedef struct ResourceOwnerData
 
/* The local locks cache. */
LOCALLOCK  *locks[MAX_RESOWNER_LOCKS];  /* list of owned locks */
-} ResourceOwnerData;
+};
 
 
 /*
@@ -415,7 +415,7 @@ ResourceOwnerCreate(ResourceOwner parent, const char *name)
ResourceOwner owner;
 
owner = (ResourceOwner) MemoryContextAllocZero(TopMemoryContext,
-   
   sizeof(ResourceOwnerData));
+   
   sizeof(struct ResourceOwnerData));
owner->name = name;
 
if (parent)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 76787a82673..1a1f11a943f 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -19,14 +19,14 @@
 
 
 /* Sync methods */
-typedef enum WalSyncMethod
+enum WalSyncMethod
 {
WAL_SYNC_METHOD_FSYNC = 0,
WAL_SYNC_METHOD_FDATASYNC,
WAL_SYNC_METHOD_OPEN,   /* for O_SYNC */
WAL_SYNC_METHOD_FSYNC_WRITETHROUGH,
WAL_SYNC_METHOD_OPEN_DSYNC  /* for O_DSYNC */
-} WalSyncMethod;
+};
 extern PGDLLIMPORT int wal_sync_method;
 
 extern PGDLLIMPORT XLogRecPtr ProcLastRecPtr;
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 42211bfec4f..08364447c74 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -107,14 +107,10 @@ typedef struct BufferManagerRelation
 #define BMR_REL(p_rel) ((BufferManagerRelation){.rel = p_rel})
 #define BMR_SMGR(p_smgr, p_relpersistence) ((BufferManagerRelation){.smgr = 
p_smgr, .relpersistence = p_relpersistence})
 
-typedef enum ReadBuffersFlags
-{
-   /* Zero out page if reading fails. */
-   READ_BUFFERS_ZERO_ON_ERROR = (1 << 0),
-
-   /* Call smgrprefetch() if I/O necessary. */
-   READ_BUFFERS_ISSUE_ADVICE = (1 << 1),
-} ReadBuffersFlags;
+/* Zero out page if reading fails. */
+#define READ_BUFFERS_ZERO_ON_ERROR (1 << 0)
+/* Call smgrprefetch() if I/O necessary. */
+#define READ_BUFFERS_ISSUE_ADVICE (1 << 1)
 
 struct ReadBuffersOperation
 {


Re: Minor cleanups in the SSL tests

2024-05-16 Thread Peter Eisentraut

On 16.05.24 09:24, Daniel Gustafsson wrote:

When writing a new SSL test for another patch it struck me that the SSL tests
are doing configuration management without using the test framework APIs.  The
attached patches clean this up; no testcases are altered as part of this.

0001 makes the test for PG_TEST_EXTRA a top-level if statement not attached to
any other conditional.  There is no change in functionality, it's mainly for
readability (PG_TEST_EXTRA is its own concept, not tied to library presence).


Makes sense to me.


0002 ports over editing configfiles to using append_conf() instead of opening
and writing to them directly.


Yes, it's probably preferable to use append_conf() here.  You might want 
to run your patch through pgperltidy.  The result doesn't look bad, but 
a bit different than what you had crafted.


append_conf() opens and closes the file for each call.  It might be nice 
if it could accept a list.  Or you can just pass the whole block as one 
string, like it was done for pg_ident.conf before.






Re: SQL:2011 application time

2024-05-16 Thread Peter Eisentraut

On 15.05.24 11:39, Peter Eisentraut wrote:
Attached are the individual revert patches.  I'm supplying these here 
mainly so that future efforts can use those instead of the original 
patches, since that would have to redo all the conflict resolution and 
also miss various typo fixes etc. that were applied in the meantime.  I 
will commit this as one squashed patch.


This has been done.





Re: Underscore in positional parameters?

2024-05-16 Thread Peter Eisentraut

On 16.05.24 01:11, Michael Paquier wrote:

On Wed, May 15, 2024 at 01:59:36PM +0200, Peter Eisentraut wrote:

On 14.05.24 18:07, Erik Wienhold wrote:

Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
and strtoint in ECPG.  This fixes overflows like:


Seems like a good idea, but as was said, this is an older issue, so let's
look at that separately.


Hmm, yeah.  I would be really tempted to fix that now.

Now, it has been this way for ages, and with my RMT hat on (aka I need
to show the example), I'd suggest to wait for when the v18 branch
opens as there is no urgency.  I'm OK to apply it myself at the end,
the patch is a good idea.


On this specific patch, maybe reword "parameter too large" to "parameter 
number too large".


Also, I was bemused by the use of atol(), which is notoriously 
unportable (sizeof(long)).  So I poked around and found more places that 
might need fixing.  I'm attaching a patch here with annotations to look 
at later.
From 2d3ad223b1f9b7bb5bebc6b6ef8cbba0d1c0022b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 16 May 2024 08:36:21 +0200
Subject: [PATCH] WIP: atol() investigation

---
 src/backend/parser/scan.l| 2 +-
 src/bin/pg_basebackup/pg_basebackup.c| 2 +-
 src/bin/pg_basebackup/streamutil.c   | 2 +-
 src/bin/pg_rewind/libpq_source.c | 2 +-
 src/interfaces/ecpg/ecpglib/execute.c| 2 +-
 src/interfaces/ecpg/preproc/ecpg.trailer | 2 +-
 src/interfaces/ecpg/preproc/pgc.l| 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index b499975e9c4..383d3f2d39a 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -993,7 +993,7 @@ other   .
 
 {param}{
SET_YYLLOC();
-   yylval->ival = atol(yytext + 1);
+   yylval->ival = atol(yytext + 1); // FIXME with overflow check
return PARAM;
}
 {param_junk}   {
diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index 8f3dd04fd22..4ebc5e3b2b8 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2056,7 +2056,7 @@ BaseBackup(char *compression_algorithm, char 
*compression_detail,
tablespacecount = PQntuples(res);
for (i = 0; i < PQntuples(res); i++)
{
-   totalsize_kb += atol(PQgetvalue(res, i, 2));
+   totalsize_kb += atoll(PQgetvalue(res, i, 2)); // FIXME: atol() might truncate if sizeof(long)==4
 
/*
 * Verify tablespace directories are empty. Don't bother with 
the
diff --git a/src/bin/pg_basebackup/streamutil.c 
b/src/bin/pg_basebackup/streamutil.c
index d0efd8600ca..1d303d16ef5 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -631,7 +631,7 @@ GetSlotInformation(PGconn *conn, const char *slot_name,
 
/* current TLI */
if (!PQgetisnull(res, 0, 2))
-   tli_loc = (TimeLineID) atol(PQgetvalue(res, 0, 2));
+   tli_loc = (TimeLineID) atol(PQgetvalue(res, 0, 2)); // FIXME: use strtoul()
 
PQclear(res);
 
diff --git a/src/bin/pg_rewind/libpq_source.c b/src/bin/pg_rewind/libpq_source.c
index 7d898c3b501..618b175dcc4 100644
--- a/src/bin/pg_rewind/libpq_source.c
+++ b/src/bin/pg_rewind/libpq_source.c
@@ -294,7 +294,7 @@ libpq_traverse_files(rewind_source *source, 
process_file_callback_t callback)
}
 
path = PQgetvalue(res, i, 0);
-   filesize = atol(PQgetvalue(res, i, 1));
+   filesize = atoll(PQgetvalue(res, i, 1)); // FIXME: atol() might truncate
isdir = (strcmp(PQgetvalue(res, i, 2), "t") == 0);
link_target = PQgetvalue(res, i, 3);
 
diff --git a/src/interfaces/ecpg/ecpglib/execute.c 
b/src/interfaces/ecpg/ecpglib/execute.c
index 04d0b40c537..c578c21cf66 100644
--- a/src/interfaces/ecpg/ecpglib/execute.c
+++ b/src/interfaces/ecpg/ecpglib/execute.c
@@ -278,7 +278,7 @@ ecpg_is_type_an_array(int type, const struct statement 
*stmt, const struct varia
isarray = ECPG_ARRAY_NONE;
else
{
-   isarray = (atol((char *) PQgetvalue(query, 0, 0)) == -1) ? ECPG_ARRAY_ARRAY : ECPG_ARRAY_VECTOR;
+   isarray = (atoi((char *) PQgetvalue(query, 0, 0)) == -1) ? ECPG_ARRAY_ARRAY : ECPG_ARRAY_VECTOR;
if (ecpg_dynamic_type(type) == SQL3_CHARACTER ||
ecpg_dynamic_type(type) == 
SQL3_CHARACTER_VARYING)
{
diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer 
b/src/interfaces/ecpg/preproc/ecpg.trailer
index b2aa44f36dd..8ac1c

Re: Adding the extension name to EData / log_line_prefix

2024-05-15 Thread Peter Eisentraut

On 14.05.24 01:11, Tom Lane wrote:

The mechanism that Andres describes for sourcing the name seems a bit
overcomplex though.  Why not just allow/require each extension to
specify its name as a constant string?  We could force the matter by
redefining PG_MODULE_MAGIC as taking an argument:

PG_MODULE_MAGIC("hstore");


We kind of already have something like this, for NLS.  If you look for 
pg_bindtextdomain(TEXTDOMAIN) and ereport_domain(), this information 
already trickles into the vicinity of the error data.  Maybe the same 
thing could just be used for this, by wiring up the macros a bit 
differently.






Re: cataloguing NOT NULL constraints

2024-05-15 Thread Peter Eisentraut

On 15.05.24 09:50, Alvaro Herrera wrote:

On 2024-May-14, Bruce Momjian wrote:


Turns out these commits generated a single release note item, which I
have now removed with the attached committed patch.


Hmm, but the commits about not-null constraints for domains were not
reverted, only the ones for constraints on relations.  I think the
release notes don't properly address the ones on domains.  I think it's
at least these two commits:


-Author: Peter Eisentraut 
-2024-03-20 [e5da0fe3c] Catalog domain not-null constraints
-Author: Peter Eisentraut 
-2024-04-15 [9895b35cb] Fix ALTER DOMAIN NOT NULL syntax


I'm confused that these were kept.  The first one was specifically to 
make the catalog representation of domain not-null constraints 
consistent with table not-null constraints.  But the table part was 
reverted, so now the domain constraints are inconsistent again.


The second one refers to the first one, but it might also fix some 
additional older issue, so it would need more investigation.






Re: Underscore in positional parameters?

2024-05-15 Thread Peter Eisentraut

On 14.05.24 18:07, Erik Wienhold wrote:

Patch 0001 changes rules param and param_junk to only accept digits 0-9.


I have committed this patch to PG16 and master.

I was a little bit on the fence about what the behavior should be, but I 
checked Perl for comparison:


print 1000;   # ok
print 1_000;  # ok
print $1000;  # ok
print $1_000; # error

So this seems alright.


Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
and strtoint in ECPG.  This fixes overflows like:


Seems like a good idea, but as was said, this is an older issue, so 
let's look at that separately.






Re: doc: some fixes for environment sections in ref pages

2024-05-15 Thread Peter Eisentraut

On 13.05.24 13:02, Daniel Gustafsson wrote:

On 13 May 2024, at 10:48, Peter Eisentraut  wrote:



Patches attached.


All patches look good.


I think the first one is a bug fix.


Agreed.


Committed, thanks.





Re: Postgres and --config-file option

2024-05-15 Thread Peter Eisentraut

On 15.05.24 04:07, Michael Paquier wrote:

Not sure that these additions in --help or the docs are necessary.
The rest looks OK.

-"You must specify the --config-file or -D invocation "
+"You must specify the --config-file (or equivalent -c) or -D invocation "

How about "You must specify the --config-file, -c
\"config_file=VALUE\" or -D invocation"?  There is some practice for
--opt=VALUE in .po files.


Yeah, some of this is becoming quite unwieldy, if we document and 
mention each spelling variant of each option everywhere.


Maybe if the original problem is that the option --config-file is not 
explicitly in the --help output, let's add it to the --help output?






Re: pgsql: Fix overread in JSON parsing errors for incomplete byte sequence

2024-05-15 Thread Peter Eisentraut

On 15.05.24 02:00, Michael Paquier wrote:

On Tue, May 14, 2024 at 10:39:36AM +0200, Peter Eisentraut wrote:

I saw the same thing.  The problem is that there is incomplete dependency
information in the makefiles (not meson) between src/common/ and what is
using it.  So whenever anything changes in src/common/, you pretty much have
to do a forced rebuild of everything.


Is that a recent regression?  I have some blurry memories from
working on these areas that changing src/common/ reflected on the
compiled pieces effectively at some point.


One instance of this problem that I can reproduce at least back to PG12 is

1. touch src/common/exec.c
2. make -C src/bin/pg_dump

This will rebuild libpgcommon, but it will not relink pg_dump.




Re: An improved README experience for PostgreSQL

2024-05-14 Thread Peter Eisentraut

On 14.05.24 19:33, Nathan Bossart wrote:

On Tue, May 14, 2024 at 06:12:26PM +0200, Alvaro Herrera wrote:

On 2024-May-14, Tom Lane wrote:


I don't have a position on whether we want
these additional files or not; but if we do, I think the best answer
is to stick 'em under .github/ where they are out of the way but yet
updatable by any committer.


+1 for .github/, that was my first reaction as well after reading the
link Peter posted.


Here's an updated patch that uses .github/.


I'm fine with putting them under .github/.

I think for CONTRIBUTING.md, a better link would be 
<https://www.postgresql.org/developer/>.






Re: Requiring LLVM 14+ in PostgreSQL 18

2024-05-14 Thread Peter Eisentraut

On 15.05.24 06:21, Thomas Munro wrote:

And as I'm looking up how this was previously handled, I notice that
this list of clang-NN versions was last updated equally sneakily as part
of your patch to trim off LLVM <10 (820b5af73dc).  I wonder if the
original intention of that configure code was that maintaining the
versioned list above clang-7/llvm-config-7 was not needed, because the
unversioning programs could be used, or maybe because pkg-config could
be used.  It would be nice if we could get rid of having to update that.

I probably misunderstood why we were doing that, perhaps something to
do with the way some distro (Debian?) was doing things with older
versions, and yeah I see that we went a long time after 7 without
touching it and nobody cared.  Yeah, it would be nice to get rid of
it.  Here's a patch.  Meson didn't have that.


Yes, let's get that v3-0001 patch into PG17.





Re: pgsql: Fix overread in JSON parsing errors for incomplete byte sequence

2024-05-14 Thread Peter Eisentraut

On 10.05.24 14:23, Alvaro Herrera wrote:

On 2024-May-10, Alvaro Herrera wrote:


Not sure what's going on here, or why it fails for me while the
buildfarm is all happy.


Ah, I ran 'git clean -dfx' and now it works correctly.  I must have had
an incomplete rebuild.


I saw the same thing.  The problem is that there is incomplete 
dependency information in the makefiles (not meson) between src/common/ 
and what is using it.  So whenever anything changes in src/common/, you 
pretty much have to do a forced rebuild of everything.






Re: An improved README experience for PostgreSQL

2024-05-14 Thread Peter Eisentraut

On 13.05.24 17:26, Nathan Bossart wrote:

On Sun, May 12, 2024 at 05:17:42PM +0200, Peter Eisentraut wrote:

I don't know, I find these files kind of "yelling".  It's fine to have a
couple, but now it's getting a bit much, and there are more that could be
added.


I'm not sure what you mean by this.  Do you mean that the contents are too
blunt?  That there are too many files?  Something else?


I mean the all-caps file names, cluttering up the top-level directory.


If we want to enhance the GitHub experience, we can also add these files to
the organization instead: 
https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/creating-a-default-community-health-file


This was the intent of my patch.  There might be a few others that we could
use, but I figured we could start with the low-hanging fruit that would
have the most impact on the GitHub experience.


My point is, in order to get that enhanced GitHub experience, you don't 
actually have to commit these files into the individual source code 
repository.  You can add them to the organization and they will apply to 
all repositories under the organization.  This is explained at the above 
link.


However, I don't think these files are actually that useful.  People can 
go to the web site to find out things about the PostgreSQL 
community.  We don't need to add a bunch of $X.md files that just say, 
essentially, go to postgresql.org/$X.





Re: An improved README experience for PostgreSQL

2024-05-14 Thread Peter Eisentraut

On 13.05.24 17:43, Alvaro Herrera wrote:

On 2024-May-13, Nathan Bossart wrote:


If we want to enhance the GitHub experience, we can also add these files to
the organization instead: 
https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/creating-a-default-community-health-file


This was the intent of my patch.  There might be a few others that we could
use, but I figured we could start with the low-hanging fruit that would
have the most impact on the GitHub experience.


Can't we add these two lines per topic to the README.md?


The point of these special file names is that GitHub will produce 
special links to them.  If you look at Nathan's tree


https://github.com/nathan-bossart/postgres/tree/special-files

and scroll down to the README display, you will see links for "Code of 
Conduct", "License", and "Security" across the top.


Whether it's worth having these files just to produce these links is the 
debate.






Re: consider -Wmissing-variable-declarations

2024-05-14 Thread Peter Eisentraut

On 10.05.24 11:53, Heikki Linnakangas wrote:

On 09/05/2024 12:23, Peter Eisentraut wrote:

In [0] I had noticed that we have no automated verification that global
variables are declared in header files.  (For global functions, we have
this through -Wmissing-prototypes.)  As I mentioned there, I discovered
the Clang compiler option -Wmissing-variable-declarations, which does
exactly that.  Clang has supported this for quite some time, and GCC 14,
which was released a few days ago, now also supports it.  I went and
installed this option into the standard build flags and cleaned up the
warnings it found, which revealed a number of interesting things.


Nice! More checks like this is good in general.


Attached are patches organized by sub-topic.  The most dubious stuff is
in patches 0006 and 0007.  A bunch of GUC-related variables are not in
header files but are pulled in via ad-hoc extern declarations.  I can't
recognize an intentional scheme there, probably just done for
convenience or copied from previous practice.  These should be organized
into appropriate header files.


+1 for moving all these to header files. Also all the "other stuff" in 
patch 0007.


I have found a partial explanation for the "other stuff".  We have in 
launch_backend.c:


/*
 * The following need to be available to the save/restore_backend_variables
 * functions.  They are marked NON_EXEC_STATIC in their home modules.
 */
extern slock_t *ShmemLock;
extern slock_t *ProcStructLock;
extern PGPROC *AuxiliaryProcs;
extern PMSignalData *PMSignalState;
extern pg_time_t first_syslogger_file_time;
extern struct bkend *ShmemBackendArray;
extern bool redirection_done;

So these are notionally static variables that had to be sneakily 
exported for the purposes of EXEC_BACKEND.


(This probably also means my current patch set won't work cleanly on 
EXEC_BACKEND builds.  I'll need to check that further.)


However, it turns out that that comment is not completely true. 
ShmemLock, ShmemBackendArray, and redirection_done are not in fact 
NON_EXEC_STATIC.  I think they probably once were, but then they were 
needed elsewhere and people thought, if launch_backend.c (formerly 
postmaster.c) gets at them via its own extern declaration, then I will 
do that too.


ShmemLock has been like that for a longer time, but ShmemBackendArray 
and redirection_done are new like that in PG17, probably from all the 
postmaster.c refactoring.


ShmemLock and redirection_done have now escaped for wider use and should 
be in header files, as my patches are proposing.


ShmemBackendArray only exists if EXEC_BACKEND, so it's fine, but the 
comment is slightly misleading.  Maybe sticking a NON_EXEC_STATIC onto 
ShmemBackendArray, even though it's useless, would make this more 
consistent.






Re: Large files for relations

2024-05-13 Thread Peter Eisentraut

On 06.03.24 22:54, Thomas Munro wrote:

Rebased.  I had intended to try to get this into v17, but a couple of
unresolved problems came up while rebasing over the new incremental
backup stuff.  You snooze, you lose.  Hopefully we can sort these out
in time for the next commitfest:

* should pg_combinebasebackup read the control file to fetch the segment size?
* hunt for other segment-size related problems that may be lurking in
new incremental backup stuff
* basebackup_incremental.c wants to use memory in proportion to
segment size, which looks like a problem, and I wrote about that in a
new thread[1]


Overall, I like this idea, and the patch seems to have many bases covered.

The patch will need a rebase.  I was able to test it on 
master@{2024-03-13}, but after that there are conflicts.


In .cirrus.tasks.yml, one of the test tasks uses 
--with-segsize-blocks=6, but you are removing that option.  You could 
replace that with something like


PG_TEST_INITDB_EXTRA_OPTS='--rel-segsize=48kB'

But that won't work exactly because

initdb: error: argument of --rel-segsize must be a power of two

I suppose that's ok as a change, since it makes the arithmetic more 
efficient.  But maybe it should be called out explicitly in the commit 
message.


If I run it with 64kB, the test pgbench/001_pgbench_with_server fails 
consistently, so it seems there is still a gap somewhere.


A minor point, the initdb error message

initdb: error: argument of --rel-segsize must be a multiple of BLCKSZ

would be friendlier if actually showed the value of the block size 
instead of just the symbol.  Similarly for the nearby error message 
about the off_t size.


In the control file, all the other fields use unsigned types.  Should 
relseg_size be uint64?


PG_CONTROL_VERSION needs to be changed.





Re: SQL:2011 application time

2024-05-13 Thread Peter Eisentraut

On 03.04.24 07:30, Paul Jungwirth wrote:
But is it *literally* unique? Well two identical keys, e.g. (5, 
'[Jan24,Mar24)') and (5, '[Jan24,Mar24)'), do have overlapping ranges, 
so the second is excluded. Normally a temporal unique index is *more* 
restrictive than a standard one, since it forbids other values too (e.g. 
(5, '[Jan24,Feb24)')). But sadly there is one exception: the ranges in 
these keys do not overlap: (5, 'empty'), (5, 'empty'). With 
ranges/multiranges, `'empty' && x` is false for all x. You can add that 
key as many times as you like, despite a PK/UQ constraint:


     postgres=# insert into t values
     ('[1,2)', 'empty', 'foo'),
     ('[1,2)', 'empty', 'bar');
     INSERT 0 2
     postgres=# select * from t;
       id   | valid_at | name
     -------+----------+------
      [1,2) | empty    | foo
      [1,2) | empty    | bar
     (2 rows)

Cases like this shouldn't actually happen for temporal tables, since 
empty is not a meaningful value. An UPDATE/DELETE FOR PORTION OF would 
never produce an empty range. But we should still make sure they don't cause 
problems.


We should give temporal primary keys an internal CHECK constraint saying 
`NOT isempty(valid_at)`. The problem is analogous to NULLs in parts of a 
primary key. NULLs prevent two identical keys from ever comparing as 
equal. And just as a regular primary key cannot contain NULLs, so a 
temporal primary key should not contain empties.


The standard effectively prevents this with PERIODs, because a PERIOD 
adds a constraint saying start < end. But our ranges enforce only start 
<= end. If you say `int4range(4,4)` you get `empty`. If we constrain 
primary keys as I'm suggesting, then they are literally unique, and 
indisunique seems safer.


Should we add the same CHECK constraint to temporal UNIQUE indexes? I'm 
inclined toward no, just as we don't forbid NULLs in parts of a UNIQUE 
key. We should try to pick what gives users more options, when possible. 
Even if it is questionably meaningful, I can see use cases for allowing 
empty ranges in a temporal table. For example it lets you "disable" a 
row, preserving its values but marking it as never true.


It looks like we missed some of these fundamental design questions early 
on, and it might be too late now to fix them for PG17.


For example, the discussion on unique constraints misses that the 
question of null values in unique constraints itself is controversial 
and that there is now a way to change the behavior.  So I imagine there 
is also a selection of possible behaviors you might want for empty 
ranges.  Intuitively, I don't think empty ranges are sensible for 
temporal unique constraints.  But anyway, it's a bit late now to be 
discussing this.


I'm also concerned that if ranges have this fundamental incompatibility 
with periods, then the plan to eventually evolve this patch set to 
support standard periods will also have as-yet-unknown problems.


Some of these issues might be design flaws in the underlying mechanisms, 
like range types and exclusion constraints.  Like, if you're supposed to 
use this for scheduling but you can use empty ranges to bypass exclusion 
constraints, how is one supposed to use this?  Yes, a check constraint 
using isempty() might be the right answer.  But I don't see this 
documented anywhere.
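To spell out the bypass, here is a minimal model of the && (overlap) semantics involved — a deliberately simplified sketch, not PostgreSQL's implementation:

```python
class Range:
    """Half-open integer range [lo, hi); collapses to 'empty' when lo >= hi."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @property
    def empty(self):
        return self.lo >= self.hi

    def overlaps(self, other):
        # models SQL's && operator: an empty range overlaps nothing,
        # so an exclusion constraint on && can never reject it
        if self.empty or other.empty:
            return False
        return self.lo < other.hi and other.lo < self.hi

jan_mar = Range(1, 3)
assert jan_mar.overlaps(Range(2, 4))      # a normal conflict is detected
empty = Range(4, 4)                       # like int4range(4, 4) -> 'empty'
assert not empty.overlaps(empty)          # identical empty keys don't conflict
assert not empty.overlaps(Range(0, 10))   # ...and neither does anything else
```

This is why only a separate not-empty check can make such a key literally unique.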


On the technical side, adding an implicit check constraint as part of a 
primary key constraint is quite a difficult implementation task, as I 
think you are discovering.  I'm just reminded about how the patch for 
catalogued not-null constraints struggled with linking these not-null 
constraints to primary keys correctly.  This sounds a bit similar.


I'm afraid that these issues cannot be resolved in good time for this 
release, so we should revert this patch set for now.






doc: some fixes for environment sections in ref pages

2024-05-13 Thread Peter Eisentraut
I noticed that the reference pages for initdb and pg_ctl claim in the 
Environment section that libpq variables are used, which does not seem 
correct to me.  I think this was accidentally copied when this blurb was 
added to other pages.


While I was checking around that, I also noticed that pg_amcheck and 
pg_upgrade don't have Environment sections on their reference pages, so 
I added them.  For pg_amcheck I copied the standard text for client 
programs.  pg_upgrade has its own specific list of environment variables.


Patches attached.  I think the first one is a bug fix.

From 3f573c5935d46b20de7e7129cd0bf69abed1df6c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 13 May 2024 10:10:21 +0200
Subject: [PATCH 1/3] doc: Remove claims that initdb and pg_ctl use libpq
 environment variables

Erroneously introduced by 571df93cff8.
---
 doc/src/sgml/ref/initdb.sgml | 6 --
 doc/src/sgml/ref/pg_ctl-ref.sgml | 7 ---
 2 files changed, 13 deletions(-)

diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index 6c96c0c0681..74008a9a82f 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -632,12 +632,6 @@ <title>Environment</title>
 
   </variablelist>
 
-  <para>
-   This utility, like most other <productname>PostgreSQL</productname> utilities,
-   also uses the environment variables supported by <application>libpq</application>
-   (see <xref linkend="libpq-envars"/>).
-  </para>
-
 </refsect1>
 
 <refsect1>
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 46906966eb9..a0287bb81d6 100644
--- a/doc/src/sgml/ref/pg_ctl-ref.sgml
+++ b/doc/src/sgml/ref/pg_ctl-ref.sgml
@@ -578,13 +578,6 @@ <title>Environment</title>
    unless <envar>PGDATA</envar> is set.
   </para>
 
-  <para>
-   <application>pg_ctl</application>, like most other <productname>PostgreSQL</productname>
-   utilities,
-   also uses the environment variables supported by <application>libpq</application>
-   (see <xref linkend="libpq-envars"/>).
-  </para>
-
  <para>
   For additional variables that affect the server,
   see <xref linkend="app-postgres"/>.

base-commit: 3ca43dbbb67fbfb96dec8de2e268b96790555148
-- 
2.44.0

From 362f6ed36bebc2d1f0bdaf00f54d9212b344d98f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 13 May 2024 10:12:02 +0200
Subject: [PATCH 2/3] doc: Add standard Environment section to pg_amcheck ref
 page

---
 doc/src/sgml/ref/pg_amcheck.sgml | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/doc/src/sgml/ref/pg_amcheck.sgml b/doc/src/sgml/ref/pg_amcheck.sgml
index 067c806b46d..2b9634b3ac2 100644
--- a/doc/src/sgml/ref/pg_amcheck.sgml
+++ b/doc/src/sgml/ref/pg_amcheck.sgml
@@ -644,6 +644,24 @@ <title>Options</title>
   </variablelist>
  </refsect1>
 
+ <refsect1>
+  <title>Environment</title>
+
+  <para>
+   <application>pg_amcheck</application>, like most other <productname>PostgreSQL</productname>
+   utilities,
+   also uses the environment variables supported by <application>libpq</application>
+   (see <xref linkend="libpq-envars"/>).
+  </para>
+
+  <para>
+   The environment variable <envar>PG_COLOR</envar> specifies whether to use
+   color in diagnostic messages. Possible values are
+   <literal>always</literal>, <literal>auto</literal> and
+   <literal>never</literal>.
+  </para>
+ </refsect1>
+
 <refsect1>
  <title>Notes</title>
 
-- 
2.44.0

From f276cf62d1ebb6dc4c501a5d1b397aeea6630fcb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 13 May 2024 10:25:16 +0200
Subject: [PATCH 3/3] doc: Add standard Environment section to pg_upgrade ref
 page

---
 doc/src/sgml/ref/pgupgrade.sgml | 98 +
 1 file changed, 98 insertions(+)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 10c842adb14..9877f2f01c6 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -939,6 +939,104 @@ <title>Reverting to old cluster</title>
 
  </refsect1>
 
+ <refsect1>
+  <title>Environment</title>
+
+  <para>
+   Some environment variables can be used to provide defaults for command-line options:
+
+   <variablelist>
+    <varlistentry>
+     <term><envar>PGBINOLD</envar></term>
+
+     <listitem>
+      <para>
+       The old <productname>PostgreSQL</productname> executable directory; option
+       <option>-b</option>/<option>--old-bindir</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGBINNEW</envar></term>
+
+     <listitem>
+      <para>
+       The new <productname>PostgreSQL</productname> executable directory; option
+       <option>-B</option>/<option>--new-bindir</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGDATAOLD</envar></term>
+
+     <listitem>
+      <para>
+       The old database cluster configuration directory; option
+       <option>-d</option>/<option>--old-datadir</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGDATANEW</envar></term>
+
+     <listitem>
+      <para>
+       The new database cluster configuration directory; option
+       <option>-D</option>/<option>--new-datadir</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGPORTOLD</envar></term>
+
+     <listitem>
+      <para>
+       The old cluster port number; option
+       <option>-p</option>/<option>--old-port</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGPORTNEW</envar></term>
+
+     <listitem>
+      <para>
+       The new cluster port number; option
+       <option>-P</option>/<option>--new-port</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGSOCKETDIR</envar></term>
+
+     <listitem>
+      <para>
+       Directory to use for postmaster sockets during upgrade; option
+       <option>-s</option>/<option>--socketdir</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><envar>PGUSER</envar></term>
+
+     <listitem>
+      <para>
+       Cluster's install user name; option
+       <option>-U</option>/<option>--username</option>.
+      </para>
+     </listitem>
+    </varlistentry>
+   </variablelist>
+  </para>
+ </refsect1>
+
 <refsect1>
  <title>Notes</title>
 
-- 
2.44.0



Re: Converting README documentation to Markdown

2024-05-13 Thread Peter Eisentraut

On 08.04.24 21:29, Daniel Gustafsson wrote:

Over in [0] I asked whether it would be worthwhile converting all our README
files to Markdown, and since it wasn't met with pitchforks I figured it would
be an interesting exercise to see what it would take (my honest gut feeling
was that it would be way too intrusive).  Markdown does bring a few key
features however, so IMHO it's worth attempting to see:

* New developers are very used to reading/writing it
* Using a defined format ensures some level of consistency
* Many users and contributors, new *as well as* old, like reading documentation
   nicely formatted in a browser
* The documentation now prints really well
* pandoc et al. can be used to render nice-looking PDFs
* All the same benefits as discussed in [0]

The plan was to follow Grubers original motivation for Markdown closely:

"The idea is that a Markdown-formatted document should be publishable
as-is, as plain text, without looking like it’s been marked up with
tags or formatting instructions."

This translates to making the least amount of changes to achieve a) retained
plain text readability at today's level, b) proper Markdown rendering, not
looking like text files in a HTML window, and c) absolutely no reflows and
minimal impact on git blame.


I started looking through this and immediately found a bunch of tiny 
problems.  (This is probably in part because the READMEs under 
src/backend/access/ are some of the more complicated ones, but then they 
are also the ones that might benefit most from better rendering.)


One general problem is that original Markdown and GitHub-flavored 
Markdown (GFM) are incompatible in some interesting aspects.  For 
example, the line


A split initially marks the left page with the F_FOLLOW_RIGHT flag.

is rendered by GFM as you'd expect.  But original Markdown converts it to

A split initially marks the left page with the FFOLLOWRIGHT
flag.

This kind of problem is pervasive, as you'd expect.
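The underscore rule difference can be mimicked with two toy regexes — a deliberately simplified sketch, not how any real Markdown implementation works:

```python
import re

def emphasize_original(text):
    # original Markdown: underscores pair up anywhere, even inside a word
    return re.sub(r'_(\S+?)_', r'<em>\1</em>', text)

def emphasize_gfm(text):
    # GFM-style: underscores only open/close emphasis at word boundaries
    return re.sub(r'(?<![\w_])_(\S+?)_(?![\w_])', r'<em>\1</em>', text)

line = "A split initially marks the left page with the F_FOLLOW_RIGHT flag."
assert "F<em>FOLLOW</em>RIGHT" in emphasize_original(line)  # identifier mangled
assert emphasize_gfm(line) == line                          # left intact
assert emphasize_gfm("a _word_ here") == "a <em>word</em> here"
```

Any README full of C identifiers with underscores will hit the first behavior constantly.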

Another incompatibility is that GFM accepts "1)" as a list marker (which 
appears to be used often in the READMEs), but original Markdown does 
not.  This then also affects surrounding formatting.


Also, the READMEs often do not indent lists in a non-ambiguous way.  For 
example, if you look into src/backend/optimizer/README, section "Join 
Tree Construction", there are two list items, but it's not immediately 
clear which paragraphs belong to the list and which ones follow the 
list.  This also interacts with the previous point.  The resulting 
formatting in GFM is quite misleading.


src/port/README.md is a similar case.

There are also various places where whitespace is used for ad-hoc 
formatting.  Consider for example in src/backend/access/gin/README


  the "category" of the null entry.  These are the possible categories:

1 = ordinary null key value extracted from an indexable item
2 = placeholder for zero-key indexable item
3 = placeholder for null indexable item

  Placeholder null entries are inserted into the index because otherwise

But this does not preserve the list-like formatting, it just flows it 
together.


There is a similar case with the authors list at the end of 
src/backend/access/gist/README.md.


src/test/README.md wasn't touched by your patch, but it also needs 
adjustments for list formatting.



In summary, I think before we could accept this, we'd need to go through 
this with a fine-toothed comb line by line and page by page to make sure 
the formatting is still sound.  And we'd need to figure out which 
Markdown flavor to target.






Convert sepgsql tests to TAP

2024-05-13 Thread Peter Eisentraut
The sepgsql tests have not been integrated into the Meson build system 
yet.  I propose to fix that here.


One problem there was that the tests use a very custom construction 
where a top-level shell script internally calls make.  I have converted 
this to a TAP script that does the preliminary checks and then calls 
pg_regress directly, without make.  This seems to get the job done. 
Also, once you have your SELinux environment set up as required, the 
test now works fully automatically; you don't have to do any manual prep 
work.  The whole thing is guarded by PG_TEST_EXTRA=sepgsql now.


Some comments and questions:

- Do we want to keep the old way to run the test?  I don't know all the 
testing scenarios that people might be interested in, but of course it 
would also be good to cut down on the duplication in the test files.


- Strangely, there was apparently so far no way to get to the build 
directory from a TAP script.  They only ever want to read files from the 
source directory.  So I had to add that.


- If you go through the pre-test checks in contrib/sepgsql/test_sepgsql, 
I have converted most of these checks to the Perl script.  Some of the 
checks are obsolete, because they check whether the database has been 
correctly initialized, which is now done by the TAP script anyway.  One 
check that I wasn't sure about is the


# 'psql' command must be executable from test domain

The old test was checking the installation tree, which I guess could be 
set up in random ways.  But do we need this kind of check if we are 
using a temporary installation?


As mentioned in the patch, the documentation needs to be updated.  This 
depends on the outcome of the question above whether we want to keep the 
old tests in some way.
From 2f8e73932b1068caf696582487de9e100fcd46be Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 13 May 2024 07:55:55 +0200
Subject: [PATCH v1] Convert sepgsql tests to TAP

Add a TAP test for sepgsql.  This automates the previously required
manual setup before the test.  The actual tests are still run by
pg_regress, but are now called from within the TAP Perl script.

TODO: remove the old test scripts?
---
 contrib/sepgsql/.gitignore   |   3 +
 contrib/sepgsql/Makefile |   3 +
 contrib/sepgsql/meson.build  |  11 +-
 contrib/sepgsql/t/001_sepgsql.pl | 248 +++
 doc/src/sgml/regress.sgml|  11 ++
 doc/src/sgml/sepgsql.sgml|   2 +
 meson.build  |   1 +
 src/Makefile.global.in   |   1 +
 8 files changed, 279 insertions(+), 1 deletion(-)
 create mode 100644 contrib/sepgsql/t/001_sepgsql.pl

diff --git a/contrib/sepgsql/.gitignore b/contrib/sepgsql/.gitignore
index 31613e011f5..7cadb9419e9 100644
--- a/contrib/sepgsql/.gitignore
+++ b/contrib/sepgsql/.gitignore
@@ -1,7 +1,10 @@
 /sepgsql.sql
+# FIXME
 /sepgsql-regtest.fc
 /sepgsql-regtest.if
 /sepgsql-regtest.pp
 /tmp
 # Generated subdirectories
+/log/
 /results/
+/tmp_check/
diff --git a/contrib/sepgsql/Makefile b/contrib/sepgsql/Makefile
index afca75b693f..5cc9697736c 100644
--- a/contrib/sepgsql/Makefile
+++ b/contrib/sepgsql/Makefile
@@ -15,6 +15,9 @@ OBJS = \
 DATA_built = sepgsql.sql
 PGFILEDESC = "sepgsql - SELinux integration"
 
+TAP_TESTS = 1
+
+# FIXME
 # Note: because we don't tell the Makefile there are any regression tests,
 # we have to clean those result files explicitly
EXTRA_CLEAN = -r $(pg_regress_clean_files) tmp/ *.pp sepgsql-regtest.if sepgsql-regtest.fc
diff --git a/contrib/sepgsql/meson.build b/contrib/sepgsql/meson.build
index 9544efe0287..5817ba30a58 100644
--- a/contrib/sepgsql/meson.build
+++ b/contrib/sepgsql/meson.build
@@ -40,4 +40,13 @@ contrib_targets += custom_target('sepgsql.sql',
   install_dir: contrib_data_args['install_dir'],
 )
 
-# TODO: implement sepgsql tests
+tests += {
+  'name': 'sepgsql',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_sepgsql.pl',
+    ],
+  },
+}
diff --git a/contrib/sepgsql/t/001_sepgsql.pl b/contrib/sepgsql/t/001_sepgsql.pl
new file mode 100644
index 000..82bba5774ce
--- /dev/null
+++ b/contrib/sepgsql/t/001_sepgsql.pl
@@ -0,0 +1,248 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bsepgsql\b/)
+{
+   plan skip_all =>
+ 'Potentially unsafe test sepgsql not enabled in PG_TEST_EXTRA';
+}
+
+note "checking selinux environment";
+
+# matchpathcon must be present to assess whether the installation environment
+# is OK.
+note "checking for matchpathcon";
+if (system('matchpathcon -n . >/dev/null 2>&1') != 0)
+{
+	diag <<EOS;
[remainder of attachment truncated]

Re: elog/ereport VS misleading backtrace_function function address

2024-05-12 Thread Peter Eisentraut

On 07.05.24 09:43, Jakub Wartak wrote:

NOTE: in case one will be testing this: one cannot ./configure with
--enable-debug as it prevents the compiler optimizations that actually
end up with the ".cold" branch optimizations that cause backtrace() to
return the wrong address.


Is that configuration useful?  If you're interested in backtraces, 
wouldn't you also want debug symbols?  Don't production builds use debug 
symbols nowadays as well?



I recall speculating about whether we could somehow invoke gdb
to get a more comprehensive and accurate backtrace, but I don't
really have a concrete idea how that could be made to work.

Oh no, I'm more about micro-fix rather than doing some bigger
revolution. The patch simply adds this one instruction in macro, so
that now backtrace_functions behaves better:

0x00773d28 <+105>:   call   0x79243f 
0x00773d2d <+110>:   movzbl -0x12(%rbp),%eax  << this ends up being added by the patch
0x00773d31 <+114>:   call   0xdc1a0 

I'll put that as for PG18 in CF, but maybe it could be backpatched too
- no hard opinion on that (I don't see how that ERROR/FATAL path could
cause any performance regressions)


I'm missing a principled explanation of what this does.  I just see that 
it sprinkles some no-op code to make this particular thing work a bit 
differently, but this might just behave differently with the next 
compiler next year.  I'd like to see a bit more of an abstract 
explanation of what is happening here.






Re: An improved README experience for PostgreSQL

2024-05-12 Thread Peter Eisentraut

On 17.04.24 04:36, Nathan Bossart wrote:

On Wed, Feb 28, 2024 at 02:21:49PM -0600, Nathan Bossart wrote:

I see many projects have files like SECURITY.md, CODE_OF_CONDUCT.md, and
CONTRIBUTING.md, and I think it would be relatively easy to add content to
each of those for PostgreSQL, even if they just link elsewhere.

Here's a first attempt at this.  You can see some of the effects of these
files at [0].  More information about these files is available at [1] [2]
[3].

I figured we'd want to keep these pretty bare-bones to avoid duplicating
information that's already available at postgresql.org, but perhaps it's
worth filling them out a bit more.  Anyway, I just wanted to gauge interest
in this stuff.


I don't know, I find these files kind of "yelling".  It's fine to have a 
couple, but now it's getting a bit much, and there are more that could 
be added.


If we want to enhance the GitHub experience, we can also add these files 
to the organization instead: 
https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/creating-a-default-community-health-file






Re: Requiring LLVM 14+ in PostgreSQL 18

2024-05-12 Thread Peter Eisentraut

On 24.04.24 01:43, Thomas Munro wrote:

Rebased over ca89db5f.


These patches look fine to me.  The new cut-off makes sense, and it does 
save quite a bit of code.  We do need to get the Cirrus CI Debian images 
updated first, as you had already written.


As part of this patch, you also sneak in support for LLVM 18 
(llvm-config-18, clang-18 in configure).  Should this be a separate patch?


And as I'm looking up how this was previously handled, I notice that 
this list of clang-NN versions was last updated equally sneakily as part 
of your patch to trim off LLVM <10 (820b5af73dc).  I wonder if the 
original intention of that configure code was that maintaining the 
versioned list above clang-7/llvm-config-7 was not needed, because the 
unversioned programs could be used, or maybe because pkg-config could 
be used.  It would be nice if we could get rid of having to update that.






Re: Logging which interface was connected to in log_line_prefix

2024-05-12 Thread Peter Eisentraut

On 01.05.24 19:04, Greg Sabino Mullane wrote:
Thank you for taking the time to review this. I've attached a new 
rebased version, which has no significant changes.


There is a comment in the patch that states:

/* We do not need clean_ipv6_addr here: just report verbatim */

I am not quite sure what it means, but I am guessing it means that
the patch does not need to format the IPv6 addresses in any specific
way.


Yes, basically correct. There is a kluge (their word, not mine) in 
utils/adt/network.c to strip the zone - see the comment for the  
clean_ipv6_addr() function in that file. I added the patch comment in 
case some future person wonders why we don't "clean up" the ipv6 
address, like other places in the code base do. We don't need to pass it 
back to anything else, so we can simply output the correct version, zone 
and all.


clean_ipv6_addr() needs to be called before trying to convert a string 
representation into inet/cidr types.  This is not what is happening 
here.  So the comment is not applicable.






Re: Logging which interface was connected to in log_line_prefix

2024-05-12 Thread Peter Eisentraut

On 06.03.24 16:59, Greg Sabino Mullane wrote:
Someone on -general was asking about this, as they are listening on 
multiple IPs and would like to know which exact one clients were 
hitting. I took a quick look and we already have that information, so I 
grabbed some stuff from inet_server_addr and added it as part of a "%L" 
(for 'local interface'). Quick patch / POC attached.


I was confused by this patch title.  This feature does not log the 
interface (like "eth0" or "lo"), but the local address.  Please adjust 
the terminology.


I noticed that for Unix-domain socket connections, %r and %h write 
"[local]".  I think that should be done for this new placeholder as well.






Re: [PATCH] Replace magic constant 3 with NUM_MERGE_MATCH_KINDS

2024-05-12 Thread Peter Eisentraut

On 19.04.24 11:47, Aleksander Alekseev wrote:

Thanks. I see a few pieces of code that use special FOO_NUMBER enum
values instead of a macro. Should we refactor these pieces
accordingly? PFA another patch.


I think this is a sensible improvement.

But please keep the trailing commas on the last enum items.





Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs

2024-05-12 Thread Peter Eisentraut

On 14.12.23 14:40, Nazir Bilal Yavuz wrote:

On Fri, 6 Oct 2023 at 17:07, Tom Lane  wrote:


As a quick cross-check, I searched our commit log to see how many
README-only commits there were so far this year.  I found 11 since
January.  (Several were triggered by the latest round of pgindent
code and process changes, so maybe this is more than typical.)

Not sure what that tells us about the value of changing the CI
logic, but it does seem like it could be worth the one-liner
change needed to teach buildfarm animals to ignore READMEs.


I agree that it could be worth implementing this logic on buildfarm animals.

In case we want to implement the same logic on the CI, I added a new
version of the patch; it skips CI completely if the changes are only
in the README files.


I don't see how this could be applicable widely enough to be useful:

- While there are some patches that touch on README files, very few of 
those end up in a commit fest.


- If someone manually pushes a change to their own CI environment, I 
don't see why we need to second-guess that.


- Buildfarm runs generally batch several commits together, so it is very 
unlikely that this would be hit.


I think unless some concrete reason for this change can be shown, we 
should drop it.






Re: Add TAP tests for backtrace functionality (was Re: Add test module for verifying backtrace functionality)

2024-05-12 Thread Peter Eisentraut

On 16.03.24 05:25, Bharath Rupireddy wrote:

Postgres has a good amount of code for dealing with backtraces - two
GUCs backtrace_functions and backtrace_on_internal_error,
errbacktrace; all of which use core function set_backtrace from
elog.c. I've not seen this code being tested at all, see code coverage
report - 
https://coverage.postgresql.org/src/backend/utils/error/elog.c.gcov.html.

I think adding a simple test module (containing no .c files) with only
TAP tests will help cover this code. I ended up having it as a
separate module under src/test/modules/test_backtrace as I was not
able to find an existing TAP file in src/test to add these tests.  I'm
able to verify the backtrace related code with the attached patch
consistently. The TAP tests rely on the fact that the server emits
text "BACKTRACE: " to server logs before logging the backtrace, and
the backtrace contains the function name in which the error occurs.
I've turned off query statement logging (set log_statement = none,
log_min_error_statement = fatal) so that the tests get to see the
functions only in the backtrace. Although the CF bot is happy with the
attached patch 
https://github.com/BRupireddy2/postgres/tree/add_test_module_for_bcktrace_functionality_v1,
there might be some more flakiness to it.
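In outline, the log-scraping technique the test relies on could look like this (hypothetical log excerpt; the real lines depend on log_line_prefix and the exact backtrace format):

```python
import re

# Hypothetical server-log excerpt -- the real format depends on log_line_prefix.
log = (
    "2024-03-16 10:00:01 ERROR:  division by zero\n"
    "2024-03-16 10:00:01 BACKTRACE:\n"
    "\tpg_popcount32+0x12 [0x55aa]\n"
    "\terrfinish+0x9a [0x55bb]\n"
    "2024-03-16 10:00:02 STATEMENT:  SELECT 1/0\n"
)

# Frames are assumed to be the indented lines following the BACKTRACE: marker.
m = re.search(r"BACKTRACE:\n((?:\t[^\n]*\n)+)", log)
frames = [line.strip().split("+")[0] for line in m.group(1).splitlines()]

# The TAP test then only needs to check that the expected function appears.
assert "errfinish" in frames
```

Suppressing statement logging, as described above, keeps the function names of interest from appearing anywhere except in the backtrace itself.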

Thoughts?


Ran pgperltidy on the new TAP test file added. Please see the attached v2 patch.


I've now moved the new TAP test file to src/test/modules/test_misc/t
as opposed to a new test module to keep it simple. I was not sure why
I hadn't done that in the first place.


Note that backtrace_on_internal_error has been removed, so this patch 
will need to be adjusted for that.


I suggest you consider joining forces with thread [0] where a 
replacement for backtrace_on_internal_error would be discussed.  Having 
some test coverage for whatever is being developed there might be useful.



[0]: 
https://www.postgresql.org/message-id/flat/CAGECzQTpdujCEt2SH4DBwRLoDq4HJArGDaxJSsWX0G=tnnz...@mail.gmail.com






Re: SQL:2011 application time

2024-05-10 Thread Peter Eisentraut
I have committed the 
v2-0001-Fix-ON-CONFLICT-DO-NOTHING-UPDATE-for-temporal-in.patch from 
this (confusingly, there was also a v2 earlier in this thread), and I'll 
continue working on the remaining items.



On 09.05.24 06:24, Paul Jungwirth wrote:
Here are a couple new patches, rebased to e305f715, addressing Peter's 
feedback. I'm still working on integrating jian he's suggestions for the 
last patch, so I've omitted that one here.


On 5/8/24 06:51, Peter Eisentraut wrote:
About v2-0001-Fix-ON-CONFLICT-DO-NOTHING-UPDATE-for-temporal-in.patch, 
I think the
ideas are right, but I wonder if we can fine-tune the new conditionals 
a bit.


--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -210,7 +210,7 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
 		 * If the indexes are to be used for speculative insertion, add extra
 		 * information required by unique index entries.
 		 */
-		if (speculative && ii->ii_Unique)
+		if (speculative && ii->ii_Unique && !ii->ii_HasWithoutOverlaps)
 			BuildSpeculativeIndexInfo(indexDesc, ii);

Here, I think we could check !indexDesc->rd_index->indisexclusion 
instead.  So we

wouldn't need ii_HasWithoutOverlaps.


Okay.

Or we could push this into BuildSpeculativeIndexInfo(); it could just 
skip the rest
if an exclusion constraint is passed, on the theory that all the 
speculative index

info is already present in that case.


I like how BuildSpeculativeIndexInfo starts with an Assert that it's 
given a unique index, so I've left the check outside the function. This 
seems cleaner anyway: the function stays more focused.



--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -815,7 +815,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 			 */
 			if (indexOidFromConstraint == idxForm->indexrelid)
 			{
-				if (!idxForm->indisunique && onconflict->action == ONCONFLICT_UPDATE)
+				if ((!idxForm->indisunique || idxForm->indisexclusion) && onconflict->action == ONCONFLICT_UPDATE)
 					ereport(ERROR,
 							(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 							 errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));


Shouldn't this use only idxForm->indisexclusion anyway?  Like

+		if (idxForm->indisexclusion && onconflict->action == ONCONFLICT_UPDATE)


That matches what the error message is reporting afterwards.


Agreed.


  * constraints), so index under consideration can be immediately
  * skipped if it's not unique
  */
-   if (!idxForm->indisunique)
+   if (!idxForm->indisunique || idxForm->indisexclusion)
 goto next;

Maybe here we need a comment.  Or make that a separate statement, like


Yes, that is nice. Done.

Yours,







consider -Wmissing-variable-declarations

2024-05-09 Thread Peter Eisentraut
In [0] I had noticed that we have no automated verification that global 
variables are declared in header files.  (For global functions, we have 
this through -Wmissing-prototypes.)  As I mentioned there, I discovered 
the Clang compiler option -Wmissing-variable-declarations, which does 
exactly that.  Clang has supported this for quite some time, and GCC 14, 
which was released a few days ago, now also supports it.  I went and 
installed this option into the standard build flags and cleaned up the 
warnings it found, which revealed a number of interesting things.


I think these checks are important.  We have been trying to mark global 
variables as PGDLLIMPORT consistently, but that only catches variables 
declared in header files.  Also, a threading project would surely 
benefit from global variables (thread-local variables?) having 
consistent declarations.


Attached are patches organized by sub-topic.  The most dubious stuff is 
in patches 0006 and 0007.  A bunch of GUC-related variables are not in 
header files but are pulled in via ad-hoc extern declarations.  I can't 
recognize an intentional scheme there, probably just done for 
convenience or copied from previous practice.  These should be organized 
into appropriate header files.



[0]: 
https://www.postgresql.org/message-id/c4ac402f-9d7b-45fa-bbc1-4a5bf0a9f...@eisentraut.orgFrom 296a1c465de6cfea333bf7bb7950f02b3ef80b12 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 8 May 2024 13:49:37 +0200
Subject: [PATCH v1 1/7] Add -Wmissing-variable-declarations to the standard
 compilation flags

This warning flag detects global variables not declared in header
files.  This is similar to what -Wmissing-prototypes does for
functions.  (More correctly, it is similar to what
-Wmissing-declarations does for functions, but -Wmissing-prototypes is
a superset of that in C.)

This flag is new in GCC 14.  Clang has supported it for a while.

XXX many warnings to fix first
---
 configure | 99 +++
 configure.ac  |  9 +++
 meson.build   | 10 +++
 src/Makefile.global.in|  1 +
 src/interfaces/ecpg/test/Makefile.regress |  2 +-
 src/interfaces/ecpg/test/meson.build  |  1 +
 src/makefiles/meson.build |  2 +
 src/tools/pg_bsd_indent/Makefile  |  2 +
 src/tools/pg_bsd_indent/meson.build   |  1 +
 9 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 89644f2249e..70fd1de88b5 100755
--- a/configure
+++ b/configure
@@ -741,6 +741,7 @@ CXXFLAGS_SL_MODULE
 CFLAGS_SL_MODULE
 CFLAGS_VECTORIZE
 CFLAGS_UNROLL_LOOPS
+PERMIT_MISSING_VARIABLE_DECLARATIONS
 PERMIT_DECLARATION_AFTER_STATEMENT
 LLVM_BINPATH
 LLVM_CXXFLAGS
if test x"$pgac_cv_prog_CXX_cxxflags__Wformat_security" = x"yes"; then
 fi
 
 
+  # gcc 14+, clang for a while
+  save_CFLAGS=$CFLAGS
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -Wmissing-variable-declarations, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports -Wmissing-variable-declarations, for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Wmissing_variable_declarations+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Wmissing-variable-declarations"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=yes
+else
+  pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" = x"yes"; then
+  CFLAGS="${CFLAGS} -Wmissing-variable-declarations"
+fi
+
+
+  PERMIT_MISSING_VARIABLE_DECLARATIONS=
+  if test x"$save_CFLAGS" != x"$CFLAGS"; then
+PERMIT_MISSING_VARIABLE_DECLARATIONS=-Wno-missing-variable-declarations
+  fi
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports -Wmissing-variable-declarations, for CXXFLAGS" >&5
+$as_echo_n "checking whether ${CXX} supports -Wmissing-variable-declarations, for CXXFLAGS... " >&6; }
+if ${pgac_cv_prog_CXX_cxxflags__Wmissing_variable_declarations+:} false; then :
+  $as_echo_n &

Re: SQL:2011 application time

2024-05-08 Thread Peter Eisentraut

On 30.04.24 18:39, Paul Jungwirth wrote:

On 4/30/24 09:24, Robert Haas wrote:

Peter, could you have a look at
http://postgr.es/m/47550967-260b-4180-9791-b224859fe...@illuminatedcomputing.com
and express an opinion about whether each of those proposals are (a)
good or bad ideas and (b) whether they need to be fixed for the
current release?


Here are the same patches but rebased.


I have committed 
v2-0002-Add-test-for-REPLICA-IDENTITY-with-a-temporal-key.patch.

About v2-0001-Fix-ON-CONFLICT-DO-NOTHING-UPDATE-for-temporal-in.patch, I think
the ideas are right, but I wonder if we can fine-tune the new conditionals a bit.

--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -210,7 +210,7 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
 * If the indexes are to be used for speculative insertion, add extra
 * information required by unique index entries.
 */
-   if (speculative && ii->ii_Unique)
+   if (speculative && ii->ii_Unique && !ii->ii_HasWithoutOverlaps)
BuildSpeculativeIndexInfo(indexDesc, ii);

Here, I think we could check !indexDesc->rd_index->indisexclusion instead.  So
we wouldn't need ii_HasWithoutOverlaps.

Or we could push this into BuildSpeculativeIndexInfo(); it could just skip the
rest if an exclusion constraint is passed, on the theory that all the
speculative index info is already present in that case.

--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -815,7 +815,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 */
if (indexOidFromConstraint == idxForm->indexrelid)
{
-   if (!idxForm->indisunique && onconflict->action == ONCONFLICT_UPDATE)
+   if ((!idxForm->indisunique || idxForm->indisexclusion) && onconflict->action == ONCONFLICT_UPDATE)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));

Shouldn't this use only idxForm->indisexclusion anyway?  Like

+   if (idxForm->indisexclusion && onconflict->action == ONCONFLICT_UPDATE)

That matches what the error message is reporting afterwards.

 * constraints), so index under consideration can be immediately
 * skipped if it's not unique
 */
-   if (!idxForm->indisunique)
+   if (!idxForm->indisunique || idxForm->indisexclusion)
goto next;

Maybe here we need a comment.  Or make that a separate statement, like

/* not supported yet etc. */
if (idxForm->indisexclusion)
goto next;





Re: AIX support

2024-05-08 Thread Peter Eisentraut

On 08.05.24 13:39, Sriram RK wrote:
We would like to understand your inputs/plans on reverting the changes 
for AIX.


I think the ship has sailed for PG17.  The way forward would be that you 
submit a patch for new, modernized AIX support for PG18.






Re: PERIOD foreign key feature

2024-05-08 Thread Peter Eisentraut

On 07.05.24 18:43, Paul Jungwirth wrote:

On 5/7/24 08:23, David G. Johnston wrote:
On Tue, May 7, 2024 at 7:54 AM Bruce Momjian wrote:

    In the two marked lines, it says "if one side of the foreign key uses
    PERIOD, the other side must too."  However, looking at the example
    queries, it seems like if the foreign side has PERIOD, the primary side
    must have WITHOUT OVERLAPS, not PERIOD.

    Does this doc text need correcting?


The text is factually correct, though a bit hard to parse.

"the other side" refers to the part after "REFERENCES":

FOREIGN KEY ( column_name [, ... ] [, PERIOD column_name ] ) 
REFERENCES reftable [ ( refcolumn [, ... ] [, PERIOD column_name ] ) ]


***(shouldn't the second occurrence be [, PERIOD refcolumn] ?)

The text is pointing out that since the refcolumn specification is 
optional you may very well not see a second PERIOD keyword in the 
clause.  Instead it will be inferred from the PK.


Maybe:

Finally, if the foreign key has a PERIOD column_name specification the 
corresponding refcolumn, if present, must also be marked PERIOD.  If 
the refcolumn clause is omitted, and thus the reftable's primary key 
constraint chosen, the primary key must have its final column marked 
WITHOUT OVERLAPS.


Yes, David is correct here on all points. I like his suggestion to 
clarify the language here also. If you need a patch from me let me know, 
but I assume it's something a committer can just make happen?


In principle yes, but it's also very helpful if someone produces an 
actual patch file, with complete commit message, credits, mailing list 
link, etc.






Re: add --no-sync to pg_upgrade's calls to pg_dump and pg_dumpall

2024-05-08 Thread Peter Eisentraut

On 03.05.24 19:13, Nathan Bossart wrote:

This is likely small potatoes compared to some of the other
pg_upgrade-related improvements I've proposed [0] [1] or plan to propose,
but this is easy enough, and I already wrote the patch, so here it is.
AFAICT there's no reason to bother syncing these dump files to disk.  If
someone pulls the plug during pg_upgrade, it's not like you can resume
pg_upgrade from where it left off.  Also, I think we skipped syncing before
v10, anyway, as the --no-sync flag was only added in commit 96a7128, which
added the code to sync dump files, too.


Looks good to me.





Re: partitioning and identity column

2024-05-07 Thread Peter Eisentraut

On 30.04.24 12:59, Ashutosh Bapat wrote:

PFA patch which fixes all the three problems.


I have committed this patch.  Thanks all.




Re: wrong comment in libpq.h

2024-05-06 Thread Peter Eisentraut

On 04.05.24 00:29, David Zhang wrote:

On 2024-05-03 4:53 a.m., Daniel Gustafsson wrote:

On 3 May 2024, at 13:48, Peter Eisentraut  wrote:
Maybe it's easier if we just replaced

    prototypes for functions in xxx.c

with

    declarations for xxx.c

throughout src/include/libpq/libpq.h.

+1

+1


It looks like this wording "prototypes for functions in" is used many 
times in src/include/, in many cases equally inaccurately, so I would 
suggest creating a more comprehensive patch for this.






Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-05-03 Thread Peter Eisentraut

On 03.05.24 10:39, Daniel Gustafsson wrote:

  I would
recommend to update the documentation of PQinitSSL and PQinitOpenSSL
to tell that these become useless and are deprecated.

They are no-ops when linking against v18, but writing an extension which
targets all supported versions of postgres along with their respective
supported OpenSSL versions make them still required, or am I missing something?


I don't think extensions come into play here, since this is libpq, so 
only the shared library interface compatibility matters.






Re: Support LIKE with nondeterministic collations

2024-05-03 Thread Peter Eisentraut

On 03.05.24 17:47, Daniel Verite wrote:

Peter Eisentraut wrote:


  However, off the top of my head, this definition has three flaws: (1)
It would make the single-character wildcard effectively an
any-number-of-characters wildcard, but only in some circumstances, which
could be confusing, (2) it would be difficult to compute, because you'd
have to check equality against all possible single-character strings,
and (3) it is not what the SQL standard says.


For #1 we're currently using the definition of a "character" as
being any single point of code,


That is the definition that is used throughout SQL and PostgreSQL.  We 
can't change that without redefining everything.  To pick just one 
example, the various trim function also behave in seemingly inconsistent 
ways when you apply then to strings in different normalization forms. 
The better fix there is to enforce the normalization form somehow.



Intuitively I think that our interpretation of "character" here should
be whatever sequence of code points are between character
boundaries [1], and that the equality of such characters would be the
equality of their sequences of code points, with the string equality
check of the collation, whatever the length of these sequences.

[1]:
https://unicode-org.github.io/icu/userguide/boundaryanalysis/#character-boundary


Even that page says, what we are calling character here is really called 
a grapheme cluster.


In a different world, pattern matching, character trimming, etc. would 
work by grapheme, but it does not.
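The code point vs. grapheme distinction discussed above can be illustrated with a small sketch (a toy model using Python's built-in Unicode normalization, not PostgreSQL code):

```python
import unicodedata

# "é" written two ways: one code point, or base letter plus combining accent.
composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# By raw code points the strings differ, even though they render identically:
assert composed != decomposed

# Unicode normalization (the "enforce the normalization form" suggestion)
# makes them compare equal again:
assert unicodedata.normalize("NFC", decomposed) == composed

# A wildcard or trim function that steps one code point at a time sees one
# "character" in the composed form but two in the decomposed form:
assert len(composed) == 1
assert len(decomposed) == 2
```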






Re: Support LIKE with nondeterministic collations

2024-05-03 Thread Peter Eisentraut

On 03.05.24 16:58, Daniel Verite wrote:

* Generating bounds for a sort key (prefix matching)

Having sort keys for strings allows for easy creation of bounds -
sort keys that are guaranteed to be smaller or larger than any sort
key from a give range. For example, if bounds are produced for a
sortkey of string “smith”, strings between upper and lower bounds
with one level would include “Smith”, “SMITH”, “sMiTh”. Two kinds
of upper bounds can be generated - the first one will match only
strings of equal length, while the second one will match all the
strings with the same initial prefix.

CLDR 1.9/ICU 4.6 and later map U+FFFF to a collation element with
the maximum primary weight, so that for example the string
“smith\uFFFF” can be used as the upper bound rather than modifying
the sort key for “smith”.

In other words it says that

   col LIKE 'smith%' collate "nd"

is equivalent to:

   col >= 'smith' collate "nd" AND col < U&'smith\FFFF' collate "nd"

which could be obtained from an index scan, assuming a btree
index on "col" collate "nd".

U+FFFF is a valid code point but a "non-character" [1] so it's
not supposed to be present in normal strings.


Thanks, this could be very useful!
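The range rewrite quoted above can be mimicked in a few lines, using case-folding as a crude stand-in for a nondeterministic collation's sort key (illustration only; a real "nd" collation would use ICU sort keys):

```python
def key(s: str) -> str:
    # stand-in for a collation sort key; casefold plays the role of the
    # case-insensitive "nd" collation from the example
    return s.casefold()

prefix = "smith"
lower = key(prefix)
upper = key(prefix + "\uffff")   # U+FFFF acts as a sentinel above any real character

words = ["smith", "Smithers", "smithy", "smite", "snake"]
# col LIKE 'smith%'  ==>  col >= lower AND col < upper, satisfiable by a
# range scan instead of a full scan:
matches = [w for w in words if lower <= key(w) < upper]
print(matches)   # ['smith', 'Smithers', 'smithy']
```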





Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

2024-05-03 Thread Peter Eisentraut

On 03.05.24 16:13, Tom Lane wrote:

Peter Eisentraut  writes:

On 30.04.24 19:29, Tom Lane wrote:

Also, the bigger picture here is the seeming assumption that "if
we change pg_trgm then it will be safe to replicate from x86 to
arm".  I don't believe that that's a good idea and I'm unwilling
to promise that it will work, regardless of what we do about
char signedness.  That being the case, I don't want to invest a
lot of effort in the signedness issue.  Option (1) is clearly
a small change with little if any risk of future breakage.



But note that option 1 would prevent some replication that is currently
working.


The point of this thread though is that it's working only for small
values of "work".  People are rightfully unhappy if it seems to work
and then later they get bitten by compatibility problems.

Treating char signedness as a machine property in pg_control would
signal that we don't intend to make it work, and would ensure that
even the most minimal testing would find out that it doesn't work.

If we do not do that, it seems to me we have to buy into making
it work.  That would mean dealing with the consequences of an
incompatible change in pg_trgm indexes, and then going through
the same dance again the next time(s) similar problems are found.


Yes, that is understood.  But anecdotally, replicating between x86-64 and
arm64 is occasionally used for upgrades or migrations.  In practice, 
this appears to have mostly worked.  If we now discover that it won't 
work with certain index extension modules, it's usable for most users. 
Even if we say, you have to reindex everything afterwards, it's probably 
still useful for these scenarios.


The way I understand the original report, the issue has to do 
specifically with how signed and unsigned chars compare differently.  I 
don't imagine this is used anywhere in the table/heap code.  So it's 
plausible that this issue is really contained to indexes.
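The comparison hazard in question is easy to reproduce outside of C; a sketch (not PostgreSQL code) simulating the two interpretations of the same byte:

```python
def as_signed_char(b: int) -> int:
    """Reinterpret a byte value as a signed char (two's complement)."""
    return b - 256 if b >= 128 else b

hi, lo = 0x80, 0x40   # high-bit-set bytes occur in multibyte text and trgm keys

# Where plain char is unsigned (e.g. ARM ABIs), 0x80 compares above 0x40:
assert hi > lo

# Where plain char is signed (e.g. x86 ABIs), the same byte compares as -128,
# i.e. below 0x40 -- so memcmp-style orderings built on plain char diverge:
assert as_signed_char(hi) < as_signed_char(lo)
```

This is why index orderings derived from plain `char` comparisons can differ between the two architecture families while the heap data itself is identical.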


On the other hand, if we put in a check against this, then at least we 
can answer any user questions about this with more certainty: No, won't 
work, here is why.






Re: Support LIKE with nondeterministic collations

2024-05-03 Thread Peter Eisentraut

On 03.05.24 15:20, Robert Haas wrote:

On Fri, May 3, 2024 at 4:52 AM Peter Eisentraut  wrote:

What the implementation does is, it walks through the pattern.  It sees
'_', so it steps over one character in the input string, which is '.'
here.  Then we have 'foo.' left to match in the input string.  Then it
takes from the pattern the next substring up to but not including either
a wildcard character or the end of the string, which is 'oo', and then
it checks if a prefix of the remaining input string can be found that is
"equal to" 'oo'.  So here it would try in turn

  '' = 'oo' collate ign_punct ?
  'f'= 'oo' collate ign_punct ?
  'fo'   = 'oo' collate ign_punct ?
  'foo'  = 'oo' collate ign_punct ?
  'foo.' = 'oo' collate ign_punct ?

and they all fail, so the match fails.


Interesting. Does that imply that these matches are slower than normal ones?


Yes, certainly, and there is also no indexing support (other than for 
exact matches).






Re: Document NULL

2024-05-03 Thread Peter Eisentraut

On 02.05.24 17:23, David G. Johnston wrote:
Version 2 attached.  Still a draft, focused on topic picking and overall 
structure.  Examples and links planned plus the usual semantic markup stuff.


I chose to add a new sect1 in the user guide (The SQL Language) chapter, 
"Data".


Please, let's not.

A stylistic note: "null" is an adjective.  You can talk about a "null 
value" or a value "is null".  These are lower-cased (or maybe 
title-cased).  You can use upper-case when referring to SQL syntax 
elements (in which case also tag it with something like ), and 
also to the C-language symbol (tagged with ).  We had recently 
cleaned this up, so I think the rest of the documentation should be 
pretty consistent about this.





Re: Tarball builds in the new world order

2024-05-03 Thread Peter Eisentraut

On 29.04.24 18:14, Tom Lane wrote:

Peter Eisentraut  writes:

On 26.04.24 21:24, Tom Lane wrote:

Concretely, I'm proposing the attached.  Peter didn't like
PG_COMMIT_HASH, so I have PG_COMMIT_REFSPEC below, but I'm not
wedded to that if a better name is proposed.



This seems ok to me, but note that we do have an equivalent
implementation in meson.  If we don't want to update that in a similar
way, maybe we should disable it.


OK.  After poking at that for awhile, it seemed like "default to
HEAD" fits into meson a lot better than "throw an error if the
variable isn't set", so I switched to doing it like that.
One reason is that AFAICT you can only set the variable during
"meson setup" not during "ninja".  This won't matter to the
tarball build script, which does a one-off configuration run
anyway.  But for manual use, a movable target like HEAD might be
more convenient given that behavior.


This patch looks good to me.





Re: A failure in prepared_xacts test

2024-05-03 Thread Peter Eisentraut

On 29.04.24 07:11, Tom Lane wrote:

Up to now, we've only worried about whether tests running in parallel
within a single test suite can interact.  It's quite scary to think
that the meson setup has expanded the possibility of interactions
to our entire source tree.  Maybe that was a bad idea and we should
fix the meson infrastructure to not do that.  I fear that otherwise,
we'll get bit regularly by very-low-probability bugs of this kind.


I don't think there is anything fundamentally different in the 
parallelism setups of the make-based and the meson-based tests.  There 
are just different implementation details that might affect the likely 
orderings and groupings.






Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

2024-05-03 Thread Peter Eisentraut

On 30.04.24 19:29, Tom Lane wrote:

1) Assume that char signedness is somehow a property of bits-on-disk
even though it's weird.  Then pg_trgm indexes are correct, but we need
to store char signedness in pg_control.
2) Assume that char signedness is not a property of bits-on-disk.
Then pg_trgm indexes are buggy and need to be fixed.
What do you think?

Also, the bigger picture here is the seeming assumption that "if
we change pg_trgm then it will be safe to replicate from x86 to
arm".  I don't believe that that's a good idea and I'm unwilling
to promise that it will work, regardless of what we do about
char signedness.  That being the case, I don't want to invest a
lot of effort in the signedness issue.  Option (1) is clearly
a small change with little if any risk of future breakage.


But note that option 1 would prevent some replication that is currently 
working.






Re: [PATCH] json_lex_string: don't overread on bad UTF8

2024-05-03 Thread Peter Eisentraut

On 30.04.24 19:39, Jacob Champion wrote:

Tangentially: Should we maybe rethink pieces of the json_lex_string
error handling? For example, do we really want to echo an incomplete
multibyte sequence once we know it's bad?


I can't quite find the place you might be looking at in 
json_lex_string(), but for the general encoding conversion we have what 
would appear to be the same behavior in report_invalid_encoding(), and 
we go out of our way there to produce a verbose error message including 
the invalid data.






Re: wrong comment in libpq.h

2024-05-03 Thread Peter Eisentraut

On 03.05.24 00:37, David Zhang wrote:

Hi Hackers,

There is a comment like below in src/include/libpq/libpq.h,

  /*
   * prototypes for functions in be-secure.c
   */
extern PGDLLIMPORT char *ssl_library;
extern PGDLLIMPORT char *ssl_cert_file;

...

However, 'ssl_library', 'ssl_cert_file' and the rest are global 
parameter settings, not functions. To address this confusion, it would 
be better to move all global configuration settings to the proper 
section, such as /* GUCs */, to maintain consistency.


Maybe it's easier if we just replaced

prototypes for functions in xxx.c

with

declarations for xxx.c

throughout src/include/libpq/libpq.h.




