Re: Synchronizing slots from primary to standby

2023-10-23 Thread Ajin Cherian
On Mon, Oct 23, 2023 at 11:22 PM Drouvot, Bertrand
 wrote:
>
> Hi,
>
> On 10/20/23 5:27 AM, shveta malik wrote:
> > On Wed, Oct 18, 2023 at 4:24 PM Amit Kapila  wrote:
> >
> > PFA v25 patch set. The changes are:
> >
> > 1) 'enable_failover' is changed to 'failover'
> > 2) Alter subscription changes to support 'failover'
> > 3) Fixes a bug in patch001 wherein any change in standby_slot_names
> > was not considered in the flow where logical walsenders wait for
> > standby's confirmation. Now during the wait, if standby_slot_names is
> > changed, wait is restarted using new standby_slot_names.
> > 4) Addresses comments by Bertrand and Amit in [1],[2],[3]
> >
> > The changes are mostly in patch001 and a very few in patch002.
> >
> > Thank You Ajin for working on alter-subscription changes and adding
> > more TAP-tests for 'failover'
> >
>
> Thanks for updating the patch!
>
> Looking at 0001 and doing some experiment:
>
> Creating a logical slot with failover = true and then launching
> pg_logical_slot_get_changes() or pg_recvlogical() on it results
> in setting failover back to false.
>
> It occurs while creating the decoding context here:
>
> @@ -602,6 +602,9 @@ CreateDecodingContext(XLogRecPtr start_lsn,
>  SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
>  }
>
> +   /* set failover in the slot, as requested */
> +   slot->data.failover = ctx->failover;
> +
>
> I think we can get rid of this change in CreateDecodingContext().
>
Yes, I too noticed this in my testing. However, just removing this from
CreateDecodingContext will not allow us to change the slot's failover flag
using ALTER SUBSCRIPTION: currently, ALTER SUBSCRIPTION re-establishes the
connection using START REPLICATION, and failover is one of the options
passed in along with START REPLICATION. I am thinking of moving this change
to StartLogicalReplication, prior to calling CreateDecodingContext, by
parsing the command options in StartReplicationCmd without adding failover
to the LogicalDecodingContext.
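
To sketch the idea concretely (illustrative only, not committed code; the
helper name and the exact hook point are assumptions on my part):

/*
 * Hypothetical sketch: pull "failover" out of the START_REPLICATION
 * option list and apply it to the already-acquired slot, so that
 * CreateDecodingContext() never has to touch slot->data.failover.
 */
static void
ApplySlotFailoverOption(StartReplicationCmd *cmd)
{
	ListCell   *lc;

	foreach(lc, cmd->options)
	{
		DefElem    *defel = (DefElem *) lfirst(lc);

		if (strcmp(defel->defname, "failover") == 0)
			MyReplicationSlot->data.failover = defGetBoolean(defel);
	}
}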

regards,
Ajin Cherian
Fujitsu Australia




RE: Synchronizing slots from primary to standby

2023-10-23 Thread Zhijie Hou (Fujitsu)
On Friday, October 20, 2023 11:27 AM shveta malik  
wrote:
> 
> The changes are mostly in patch001 and a very few in patch002.
> 
> Thank You Ajin for working on alter-subscription changes and adding more
> TAP-tests for 'failover'

Thanks for updating the patch. Here are few things I noticed when testing the 
patch.
1)
+++ b/src/backend/replication/logical/logical.c
@@ -602,6 +602,9 @@ CreateDecodingContext(XLogRecPtr start_lsn,
SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
}
 
+   /* set failover in the slot, as requested */
+   slot->data.failover = ctx->failover;
+

I noticed others also commented on this change. I found that it can
overwrite the slot's failover value with a wrong one when we have acquired
the slot and call CreateDecodingContext after that (e.g., it is a problem
if a user calls pg_logical_slot_get_changes() on a logical slot).

2)

WalSndWaitForStandbyConfirmation
...
+void
+WalSndWaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
+{
+   List   *standby_slot_cpy;
+
+   if (!MyReplicationSlot->data.failover)
+   return;
+
+   standby_slot_cpy = list_copy(standby_slot_names_list);
+

The standby slot list could be uninitialized when this is called from a
non-walsender backend (e.g., via pg_logical_slot_get_changes()). So, we
need to call SlotSyncInitConfig somewhere in this case.
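
To illustrate, one possible shape of that fix (a sketch only;
SlotSyncInitConfig and standby_slot_names_list are the names from the
patch, the emptiness test is an assumption):

void
WalSndWaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
{
	List	   *standby_slot_cpy;

	if (!MyReplicationSlot->data.failover)
		return;

	/*
	 * Non-walsender backends (e.g. pg_logical_slot_get_changes()) never
	 * ran the walsender startup path, so parse standby_slot_names here
	 * if that has not already happened.
	 */
	if (standby_slot_names_list == NIL)
		SlotSyncInitConfig();

	standby_slot_cpy = list_copy(standby_slot_names_list);
	/* ... wait loop as in the patch ... */
}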

Best Regards,
Hou zj


Re: psql: Could we get "-- " prefixing on the **** QUERY **** outputs? (ECHO_HIDDEN)

2023-10-23 Thread Pavel Stehule
Hi

On Tue, 24. 10. 2023 at 7:10, Kirk Wolak  wrote:

> On Wed, Jul 26, 2023 at 5:39 PM Nathan Bossart 
> wrote:
>
>> On Wed, Jul 26, 2023 at 08:06:37AM +0200, Pavel Stehule wrote:
>> > On Wed, 26. 7. 2023 at 6:22, Nathan Bossart <
>> nathandboss...@gmail.com>
>> > wrote:
>> >> Barring additional feedback, I think this is ready for commit.
>> >>
>> >>
>> > +1
>>
>> Great.  I spent some time on the commit message in v4.  I plan to commit
>> this shortly.
>>
>> --
>> Nathan Bossart
>> Amazon Web Services: https://aws.amazon.com
>
>
> Curious about this.  I expected to see the comments?  (is there a chance
> that the translation piece is kicking in and reverting them?)
> (expecting /******** QUERY *********/)
>
> 01:05:47 devuser@nctest= > \echo :VERSION_NAME  :VERSION_NUM
> 16.0 (Ubuntu 16.0-1.pgdg22.04+1) 16
> 01:05:57 devuser@nctest= > \dn public
> ********* QUERY **********
> SELECT n.nspname AS "Name",
>   pg_catalog.pg_get_userbyid(n.nspowner) AS "Owner"
> FROM pg_catalog.pg_namespace n
> WHERE n.nspname OPERATOR(pg_catalog.~) '^(public)$' COLLATE
> pg_catalog.default
> ORDER BY 1;
> **************************
>
> ********* QUERY **********
> SELECT pubname
> FROM pg_catalog.pg_publication p
>  JOIN pg_catalog.pg_publication_namespace pn ON p.oid = pn.pnpubid
>  JOIN pg_catalog.pg_namespace n ON n.oid = pn.pnnspid
> WHERE n.nspname = 'public'
> ORDER BY 1
> **************************
>
>   List of schemas
>   Name  |   Owner
> +---
>  public | pg_database_owner
> (1 row)
>
>
It is working in psql 17, not in psql 16:

(2023-10-24 07:14:35) postgres=# \echo :VERSION_NAME  :VERSION_NUM
17devel 17
(2023-10-24 07:14:37) postgres=# \l+
/******** QUERY *********/
SELECT
  d.datname as "Name",
  pg_catalog.pg_get_userbyid(d.datdba) as "Owner",
  pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding",
  CASE d.datlocprovider WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS
"Locale Provider",
...






Replace references to malloc() in libpq documentation with generic language

2023-10-23 Thread Gurjeet Singh
The commit message in the attached patch provides the reasoning, as follows:

The user does not benefit from knowing that libpq allocates some/all memory
using malloc(). Mentioning malloc() here has a few downsides, and almost no
benefits.

All the relevant mentions of malloc() are followed by an explicit
instruction to use PQfreemem() to free the resulting objects. So the
docs already do a sufficient job of educating the user on how to
properly free the memory.  But these mentions of malloc() may still lead
an inexperienced or careless user to (wrongly) believe that they may use
free() to free the resulting memory. They will be in a lot of pain until
they learn that (when linked statically) libpq's malloc()/free() cannot
be mixed with the malloc()/free() of whichever malloc() library the
client application is linked with.

Another downside of explicitly mentioning that objects returned by libpq
functions are allocated with malloc() is that it exposes the implementation
details of libpq to the users. Such details are best left unmentioned so that
they can be freely changed in the future without having to worry about their
effect on client applications.

Whenever necessary, it is sufficient to tell the user that the objects/memory
returned by libpq functions are allocated on the heap. That is just enough
detail for the user to realize that the relevant object/memory needs to be
freed; and the instructions that follow tell the user to use PQfreemem() to
free such memory.
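
For example, the call pattern the docs steer users toward looks like this
(a minimal sketch using PQescapeLiteral(), one of the documented functions
whose result must be released with PQfreemem()):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static void
show_escaped(PGconn *conn, const char *raw)
{
	/* libpq allocates the result on the heap ... */
	char	   *lit = PQescapeLiteral(conn, raw, strlen(raw));

	if (lit == NULL)
		return;					/* out of memory or encoding error */
	printf("escaped: %s\n", lit);

	/* ... so release it with PQfreemem(), never with plain free() */
	PQfreemem(lit);
}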

One mention of malloc is left intact, because that mention is unrelated to how
the memory is allocated, or how to free it.

In passing, slightly improve the language of PQencryptPasswordConn()
documentation.

Best regards,
Gurjeet
http://Gurje.et


v1-0001-Replace-references-to-malloc-in-libpq-documentati.patch
Description: Binary data


Re: Incorrect handling of OOM in WAL replay leading to data loss

2023-10-23 Thread Michael Paquier
On Tue, Oct 03, 2023 at 04:20:45PM +0900, Michael Paquier wrote:
> If there's no interest in this patch set after the next CF, I'm OK to
> drop it.  The state of HEAD is at least correct in the OOM cases now.

I have been thinking about this patch for the last few days, and in
light of 6b18b3fe2c2f I am going to withdraw it for now as this makes
the error layers of the xlogreader more complicated.
--
Michael


signature.asc
Description: PGP signature


Re: run pgindent on a regular basis / scripted manner

2023-10-23 Thread Michael Paquier
On Tue, Oct 24, 2023 at 10:16:55AM +0900, Amit Langote wrote:
> +1.  While I’ve made it part of routine to keep my local work pgindented
> since breaking Joel once, an option like this would still be useful.

I'd be OK with an option like that.  It is one of these things to type
once in a script, then forget about it.
--
Michael


signature.asc
Description: PGP signature


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-23 Thread Amit Kapila
On Sat, Oct 21, 2023 at 5:41 AM Bharath Rupireddy
 wrote:
>
> On Fri, Oct 20, 2023 at 8:51 PM Zhijie Hou (Fujitsu)
>  wrote:

>
> 9. IMO, binary_upgrade_logical_replication_slot_has_caught_up seems
> better, meaningful and consistent despite a bit long than just
> binary_upgrade_slot_has_caught_up.
>

I think logical_replication is specific to our pub-sub model but we
can have manually created slots as well. So, it would be better to
name it as binary_upgrade_logical_slot_has_caught_up().

-- 
With Regards,
Amit Kapila.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-23 Thread Amit Kapila
On Mon, Oct 23, 2023 at 2:00 PM Bharath Rupireddy
 wrote:
>
> On Mon, Oct 23, 2023 at 11:10 AM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > Thank you for reviewing! PSA new version.
>
> > > 6. A nit: how about is_decodable_txn or is_decodable_change or some
> > > other instead of just a plain name processing_required?
> > > +/* Do we need to process any change in 'fast_forward' mode? */
> > > +boolprocessing_required;
> >
> > I preferred the current one, because not only decodable txns but also
> > non-txn changes and empty transactions get processed.
>
> Right. It's not the txn, but the change. processing_required seems too
> generic IMV. A nit: is_change_decodable or something?
>

If we don't want to keep it generic then we should use something like
'contains_decodable_change'. 'is_change_decodable' could have suited
here if we were checking a particular change.

> Thanks for the patch. Here are few comments on v56 patch:
>
> 1.
> + *
> + * Although this function is currently used only during pg_upgrade, there are
> + * no reasons to restrict it, so IsBinaryUpgrade is not checked here.
>
> This comment isn't required IMV, because anyone looking at the code
> and callsites can understand it.
>
> 2. A nit: IMV "This is a special purpose ..." statement seems redundant.
> + *
> + * This is a special purpose function to ensure that the given slot can be
> + * upgraded without data loss.
>
> How about
>
> Verify that the given replication slot has consumed all the WAL changes.
> If there's any decodable WAL record after the slot's
> confirmed_flush_lsn, the slot's consumer will lose that data after the
> slot is upgraded.
> Returns true if there are no decodable WAL records after the
> confirmed_flush_lsn. Otherwise false.
>

Personally, I find the current comment succinct and clear.

> 3.
> +if (PG_ARGISNULL(0))
> +elog(ERROR, "null argument to
> binary_upgrade_validate_wal_records is not allowed");
>
> I can see the above style is referenced from
> binary_upgrade_create_empty_extension, but IMV the following looks
> better and latest (ereport is new style than elog)
>
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>  errmsg("replication slot name cannot be null")));
>

Do you have any rationale for changing elog to ereport? I am not completely
sure, but as this and the related function are used internally, using
elog seems reasonable. Also, I find keeping it consistent with the
existing error message reasonable. We can change both later together
if we get broader agreement.

> 4. The following comment seems frivolous, the code tells it all.
> Please remove the comment.
> +
> +/* No need to check this slot, seek to new one */
> +continue;
>
> 5. A typo - s/gets/Gets
> + * gets the LogicalSlotInfos for all the logical replication slots of the
>
> 6. An optimization in count_old_cluster_logical_slots(void): Turn
> slot_count to a function static variable so that the for loop isn't
> required every time because the slot count is prepared in
> get_old_cluster_logical_slot_infos only once and won't change later
> on. Do you see any problem with the following? This saves a few CPU
> cycles when there are large number of replication slots.
> {
> static int slot_count = 0;
> static bool first_time = true;
>
> if (first_time)
> {
> for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
> slot_count += old_cluster.dbarr.dbs[dbnum].slot_arr.nslots;
>
> first_time = false;
> }
>
> return slot_count;
> }
>

This may not be a problem, but this is also not a function that will be
called frequently. I am not sure adding such a code optimization is
worth it.

> 7. A typo: s/slotname/slot name. "slot name" looks better in user
> visible messages.
> +pg_log(PG_VERBOSE, "slotname: \"%s\", plugin: \"%s\", two_phase: %s",
>

If we want to follow other parameters then we can even use slot_name.

-- 
With Regards,
Amit Kapila.




Re: More new SQL/JSON item methods

2023-10-23 Thread jian he
On Mon, Oct 23, 2023 at 3:29 PM Jeevan Chalke
 wrote:
>
> Attached are all three patches fixing the above comments.
>

minor issue:
/src/backend/utils/adt/jsonpath_exec.c, around line 2531:

Timestamp	result;
ErrorSaveContext escontext = {T_ErrorSaveContext};

/* Get a warning when precision is reduced */
time_precision = anytimestamp_typmod_check(false,
										   time_precision);
result = DatumGetTimestamp(value);
AdjustTimestampForTypmod(&result, time_precision,
						 (Node *) &escontext);
if (escontext.error_occurred)
	RETURN_ERROR(ereport(ERROR,
						 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
						  errmsg("numeric argument of jsonpath item method .%s() is out of range for type integer",
								 jspOperationName(jsp->type)))));

You already did anytimestamp_typmod_check(), so is this "if
(escontext.error_occurred)" check unnecessary?
A similar case applies to the related function anytime_typmod_check().

/src/backend/utils/adt/jsonpath_exec.c, around line 1493:

/* Convert numstr to Numeric with typmod */
Assert(numstr != NULL);
noerr = DirectInputFunctionCallSafe(numeric_in, numstr,
									InvalidOid, dtypmod,
									(Node *) &escontext,
									&numdatum);

if (!noerr || escontext.error_occurred)
	RETURN_ERROR(ereport(ERROR,
						 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
						  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or number",
								 jspOperationName(jsp->type)))));

DirectInputFunctionCallSafe() already does "if (SOFT_ERROR_OCCURRED(escontext))"
internally, so changing "if (!noerr || escontext.error_occurred)" to
"if (!noerr)" should be fine?




Re: Fix output of zero privileges in psql

2023-10-23 Thread David G. Johnston
On Monday, October 23, 2023, Tom Lane  wrote:

> Laurenz Albe  writes:
> > On Mon, 2023-10-23 at 11:37 -0700, David G. Johnston wrote:
> >> I do believe that we should be against exposing, like in this case, any
> internal
> >> implementation detail that encodes something (e.g., default privileges)
> as NULL
> >> in the catalogs, to the user of the psql meta-commands.
>
> > Sure, it would be best to hide this implementation detail from the user.
> > The correct way to do that would be to fake an ACL entry like
> "laurenz=arwdDxt/laurenz"
> > if there is a NULL in the catalog, but that would add a ton of
> special-case
> > code to psql, which does not look appealing at all.
>
> For better or worse, that *is* the backend's catalog representation,
> and I don't think that psql would be doing our users a service by
> trying to obscure the fact.  They'd run into it anyway the moment
> they look at the catalogs with anything but a \d-something command.
>

Which many may never do; and those few that do will see immediately that
the catalog uses null where they expected to see “(default)”, realize we
made a presentational choice in the interests of readability, and see that
their query will need to make a choice regarding the null and empty arrays
as well.

David J.


Re: Fix output of zero privileges in psql

2023-10-23 Thread David G. Johnston
On Monday, October 23, 2023, Laurenz Albe  wrote:

> On Mon, 2023-10-23 at 11:37 -0700, David G. Johnston wrote:
> > > I didn't understand this completely.  You want default privileges
> displayed as
> > > "(default)", but are you for or against "\pset null" to have its
> normal effect on
> > > the output of backslash commands in all other cases?
> >
> > I haven’t inspected other cases but to my knowledge we don’t typically
> represent
> > non-unknown things using NULL so I’m not expecting other places to have
> this
> > representation problem.
>
> The first example that comes to my mind is the "ICU Locale" and the "ICU
> Rules"
> in the output of \l.  There are many others.


Both of those fall into “this null means there is no value for these
(because we aren’t using icu)”.  I have no qualms with leaving true nulls
represented as themselves.  Given a clean slate, maybe I’d print “(not
using icu)” there instead of null, but it isn’t worth the effort to change.

>
> > I won’t argue that exposing such NULLS is wrong, just it would
> preferable IME
> > to avoid doing so.  NULL means unknown or not applicable and default
> privileges
> > are neither of those things.  I get why our catalogs choose such an
> encoding and
> > agree with it, and users that find the need to consult the catalogs will
> need to
> > learn such details.  But we should strive for them to be able to survive
> with
> > psql meta-commands.
>
> Sure, it would be best to hide this implementation detail from the user.
> The correct way to do that would be to fake an ACL entry like
> "laurenz=arwdDxt/laurenz"
> if there is a NULL in the catalog, but that would add a ton of special-case
> code to psql, which does not look appealing at all.


More generically it would be “[PUBLIC=]/???/postgres” and
{OWNER}=???/postgres

It would ideally be a function call for psql and a system info function
usable for anyone.

David J.


Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Tom Lane
Andres Freund  writes:
> FC 29 is well out of support, so I don't think it makes sense to invest any
> further time in this. Personally, I don't think it's useful to have years old
> fedora in the buildfarm...

+1.  It's good to test old LTS distros, but Fedora releases have a
short shelf life by design.

regards, tom lane




Re: Fix output of zero privileges in psql

2023-10-23 Thread Tom Lane
Laurenz Albe  writes:
> On Mon, 2023-10-23 at 11:37 -0700, David G. Johnston wrote:
>> I do believe that we should be against exposing, like in this case, any 
>> internal
>> implementation detail that encodes something (e.g., default privileges) as 
>> NULL
>> in the catalogs, to the user of the psql meta-commands.

> Sure, it would be best to hide this implementation detail from the user.
> The correct way to do that would be to fake an ACL entry like 
> "laurenz=arwdDxt/laurenz"
> if there is a NULL in the catalog, but that would add a ton of special-case
> code to psql, which does not look appealing at all.

For better or worse, that *is* the backend's catalog representation,
and I don't think that psql would be doing our users a service by
trying to obscure the fact.  They'd run into it anyway the moment
they look at the catalogs with anything but a \d-something command.

regards, tom lane




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2023-10-23 Thread Andres Freund
Hi,

On 2023-10-24 09:39:42 +0700, Andrei Lepikhov wrote:
> On 20/10/2023 19:39, Stephen Frost wrote:
> Greetings,
> > * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote:
> > > The only issue I worry about is the uncertainty and clutter that can be
> > > created by this feature. In the worst case, when we have a complex error
> > > stack (including the extension's CATCH sections, exceptions in stored
> > > procedures, etc.), the backend will throw the memory limit error 
> > > repeatedly.
> > 
> > I'm not seeing what additional uncertainty or clutter there is- this is,
> > again, exactly the same as what happens today on a system with
> > overcommit disabled and I don't feel like we get a lot of complaints
> > about this today.
> 
> Maybe I missed something or see this feature from an alternate point of view
> (as an extension developer), but overcommit is more useful so far: it kills
> a process.

In the case of postgres it doesn't just kill one postgres process, it
leads to *all* connections being terminated.


> It means that after restart, the backend/background worker will have an
> initial internal state. With this limit enabled, we need to remember that
> each function call can cause an error, and we have to remember it using
> static PG_CATCH sections where we must rearrange local variables to the
> initial (?) state. So, it complicates development.

You need to be aware of errors being thrown regardless this feature, as
out-of-memory errors can be encountered today already. There also are many
other kinds of errors that can be thrown.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Andres Freund
Hi,

On 2023-10-24 10:17:22 +1300, Thomas Munro wrote:
> This POWER animal fails, unexpectedly to me:
> 
> > bonito: 7.0.1 Fedora 29
> 
> We could try to chase that down, or we could rejoice that at least it
> works on current release.  It must begin working somewhere between 7
> and 11, but when I checked which LLVM releases I could easily install
> on eg cascabel (if I could get access) using the repo at apt.llvm.org,
> I saw that they don't even have anything older than 11.  So someone
> with access who wants to figure this out might have many days or weeks
> of compiling ahead of them.

I could reproduce the failure on bonito. The stack trace is:
#0  0x7fffb83541e8 in raise () from /lib64/libc.so.6
#1  0x7fffb833448c in abort () from /lib64/libc.so.6
#2  0x7fff9c68dd78 in std::__replacement_assert (_file=<optimized out>, 
_line=<optimized out>, _function=<optimized out>, _condition=<optimized out>)
at /usr/include/c++/8/ppc64le-redhat-linux/bits/c++config.h:447
#3  0x7fff9df90838 in std::unique_ptr<llvm::orc::JITCompileCallbackManager, 
std::default_delete<llvm::orc::JITCompileCallbackManager> >::operator* (
this=0x1b946cb8) at ../include/llvm/Support/MemAlloc.h:29
#4  llvm::OrcCBindingsStack::OrcCBindingsStack(llvm::TargetMachine&, 
std::function<std::unique_ptr<llvm::orc::IndirectStubsManager> ()>) 
(this=0x1b946be0, TM=..., IndirectStubsMgrBuilder=...) at 
../lib/ExecutionEngine/Orc/OrcCBindingsStack.h:242
#5  0x7fff9df90940 in LLVMOrcCreateInstance (TM=0x1b933ae0) at 
/usr/include/c++/8/bits/move.h:182
#6  0x7fffa0618f8c in llvm_session_initialize () at 
/home/andres/src/postgres/src/backend/jit/llvm/llvmjit.c:981
#7  0x7fffa06179a8 in llvm_create_context (jitFlags=25) at 
/home/andres/src/postgres/src/backend/jit/llvm/llvmjit.c:219
#8  0x7fffa0626cbc in llvm_compile_expr (state=0x1b8ef390) at 
/home/andres/src/postgres/src/backend/jit/llvm/llvmjit_expr.c:142
#9  0x10a76fc8 in jit_compile_expr (state=0x1b8ef390) at 
/home/andres/src/postgres/src/backend/jit/jit.c:177
#10 0x10404550 in ExecReadyExpr (state=0x1b8ef390) at 
/home/andres/src/postgres/src/backend/executor/execExpr.c:875

with this assertion message printed:
/usr/include/c++/8/bits/unique_ptr.h:328: typename 
std::add_lvalue_reference<_Tp>::type std::unique_ptr<_Tp, _Dp>::operator*() 
const [with _Tp = llvm::orc::JITCompileCallbackManager; _Dp = 
std::default_delete<llvm::orc::JITCompileCallbackManager>; typename 
std::add_lvalue_reference<_Tp>::type = llvm::orc::JITCompileCallbackManager&]: 
Assertion 'get() != pointer()' failed.


I wanted to use a debug build to investigate in more detail, but bonito is a
small VM. Thus I built llvm 7 on a more powerful gcc cfarm machine, running on
AlmaLinux 9.2.  The problem doesn't reproduce there.

Given the crash in some c++ standard library code, that the fc29 patches to
llvm look harmless, that building/using llvm 7 on a newer distro does not show
issues on PPC, it seems likely that this is a compiler / standard library
issue.

FC 29 is well out of support, so I don't think it makes sense to invest any
further time in this. Personally, I don't think it's useful to have years old
fedora in the buildfarm...

Greetings,

Andres Freund




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2023-10-23 Thread Andrei Lepikhov

On 20/10/2023 19:39, Stephen Frost wrote:
Greetings,

* Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote:

The only issue I worry about is the uncertainty and clutter that can be
created by this feature. In the worst case, when we have a complex error
stack (including the extension's CATCH sections, exceptions in stored
procedures, etc.), the backend will throw the memory limit error repeatedly.


I'm not seeing what additional uncertainty or clutter there is- this is,
again, exactly the same as what happens today on a system with
overcommit disabled and I don't feel like we get a lot of complaints
about this today.


Maybe I missed something or see this feature from an alternate point of 
view (as an extension developer), but overcommit is more useful so far: 
it kills a process.
It means that after a restart, the backend/background worker will have a 
clean initial internal state. With this limit enabled, we need to 
remember that each function call can cause an error, and we have to 
handle that using static PG_CATCH sections where we must rearrange local 
variables back to the initial (?) state. So, it complicates development.
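
For illustration, this is the kind of defensive pattern it forces on
extension code (a minimal sketch; the state type and the worker function
are hypothetical):

typedef struct MyModuleState { int counter; } MyModuleState;	/* hypothetical */
static MyModuleState state;

extern void do_fallible_work(void);	/* hypothetical; may ERROR out */

static void
my_callback(void)
{
	MyModuleState saved = state;	/* snapshot before fallible work */

	PG_TRY();
	{
		do_fallible_work();			/* any palloc() may now throw */
	}
	PG_CATCH();
	{
		state = saved;				/* restore invariants before rethrowing */
		PG_RE_THROW();
	}
	PG_END_TRY();
}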
Of course, this limit is a good feature, but from my point of view, it 
would be better to kill a memory-consuming backend instead of throwing 
an error. At least for now, we don't have a technique to repeat query 
planning with chances to build a more effective plan.


--
regards,
Andrei Lepikhov
Postgres Professional





Re: Fix output of zero privileges in psql

2023-10-23 Thread Laurenz Albe
On Mon, 2023-10-23 at 11:37 -0700, David G. Johnston wrote:
> > I didn't understand this completely.  You want default privileges displayed 
> > as
> > "(default)", but are you for or against "\pset null" to have its normal 
> > effect on
> > the output of backslash commands in all other cases?
> 
> I haven’t inspected other cases but to my knowledge we don’t typically 
> represent
> non-unknown things using NULL so I’m not expecting other places to have this
> representation problem.

The first example that comes to my mind is the "ICU Locale" and the "ICU Rules"
in the output of \l.  There are many others.

> I don’t think any of our meta-command outputs should modify pset null.
> Left join cases should be considered unknown, represented as NULL, and obey 
> the
> user’s setting.

That's what I think too.  psql output should respect "\pset null".
So it looks like we agree on that.

> I do believe that we should be against exposing, like in this case, any 
> internal
> implementation detail that encodes something (e.g., default privileges) as 
> NULL
> in the catalogs, to the user of the psql meta-commands.
> 
> I won’t argue that exposing such NULLS is wrong, just it would preferable IME
> to avoid doing so.  NULL means unknown or not applicable and default 
> privileges
> are neither of those things.  I get why our catalogs choose such an encoding 
> and
> agree with it, and users that find the need to consult the catalogs will need 
> to
> learn such details.  But we should strive for them to be able to survive with
> psql meta-commands.

Sure, it would be best to hide this implementation detail from the user.
The correct way to do that would be to fake an ACL entry like 
"laurenz=arwdDxt/laurenz"
if there is a NULL in the catalog, but that would add a ton of special-case
code to psql, which does not look appealing at all.

So we cannot completely hide the implementation, but perhaps "(default)" would
be less confusing than a NULL value.

If everybody agrees, I can modify the patch to do that.

Yours,
Laurenz Albe




Re: post-recovery amcheck expectations

2023-10-23 Thread Noah Misch
On Mon, Oct 23, 2023 at 04:46:23PM -0700, Peter Geoghegan wrote:
> On Fri, Oct 20, 2023 at 8:55 PM Noah Misch  wrote:
> > > > I lean toward fixing this by
> > > > having amcheck scan left; if left links reach only half-dead or deleted 
> > > > pages,
> > > > that's as good as the present child block being P_LEFTMOST.
> > >
> > > Also my preference.
> >
> > Done mostly that way, except I didn't accept deleted pages.  Making this 
> > work
> > on !readonly would take more than that, and readonly shouldn't need that.
> 
> That makes sense to me. I believe that it's not possible to have a
> string of consecutive sibling pages that are all half-dead (regardless
> of the BlockNumber order of sibling pages, even). But I'd probably
> have written the fix in roughly the same way. Although...maybe you
> should try to detect a string of half-dead pages? Hard to say if it's
> worth the trouble.

I imagined a string of half-dead siblings could arise in a structure like this:

 *   1
 *  /|\
 * 4 <-> 2 <-> 3
With events like this:

- DELETE renders blk 4 deletable.
- Crash with concurrent VACUUM, leaving 4 half-dead after having visited 1-4.
- DELETE renders blk 2 deletable.
- Crash with concurrent VACUUM, leaving 2 half-dead after having visited 1-2.

I didn't try to reproduce that, and something may well prevent it.

> Suggest adding a CHECK_FOR_INTERRUPTS() call to the loop, too, just
> for good luck.

Added.  That gave me the idea to check for circular links, like other parts of
amcheck do.  Net diff:

--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -949,11 +949,16 @@ bt_leftmost_ignoring_half_dead(BtreeCheckState *state,
Pagepage = palloc_btree_page(state, reached);
BTPageOpaque reached_opaque = BTPageGetOpaque(page);
 
+   CHECK_FOR_INTERRUPTS();
+
/*
-* _bt_unlink_halfdead_page() writes that side-links will continue to
-* point to the siblings.  We can easily check btpo_next.
+* Try to detect btpo_prev circular links.  _bt_unlink_halfdead_page()
+* writes that side-links will continue to point to the siblings.
+* Check btpo_next for that property.
	 */
-	all_half_dead = P_ISHALFDEAD(reached_opaque) &&
+	all_half_dead = P_ISHALFDEAD(reached_opaque) &&
+	reached != start &&
+	reached != reached_from &&
	reached_opaque->btpo_next == reached_from;
if (all_half_dead)
{




Re: run pgindent on a regular basis / scripted manner

2023-10-23 Thread Amit Langote
On Tue, Oct 24, 2023 at 9:51 Jeff Davis  wrote:

> On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:
> > It would be good to learn how many of the committers out of the ones
> > you listed that --enable-indent-checks would have saved from breaking
> > koel.
>
> I'd find that a useful option.


+1.  While I’ve made it part of my routine to keep my local work pgindented
since breaking koel once, an option like this would still be useful.



Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Tom Lane
Matthias van de Meent  writes:
> On Mon, 23 Oct 2023, 19:34 Tom Lane,  wrote:
>> After ruminating on this for awhile, here's a straw-man proposal:
>> ...

> How does this work w.r.t. concurrently created tables that contain the
> domain?

It wouldn't change that at all I think.  I had noticed that we'd
probably need to tweak validateDomainConstraint() to ensure it applies
the same semantics that INSERT/UPDATE do --- although with Isaac's
idea to enable better tracking of which constraints will fail on NULL,
maybe just a blind application of the constraint expression will still
be close enough.

I agree that concurrent transactions can create violations of the new
constraint, but (a) that's true now, (b) I have no good ideas about
how to improve it, and (c) it seems like an independent problem.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-10-23 Thread Jeff Davis
On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:
> It would be good to learn how many of the committers out of the ones
> you listed that --enable-indent-checks would have saved from breaking
> koel.

I'd find that a useful option.

Regards,
Jeff Davis





Re: Is this a problem in GenericXLogFinish()?

2023-10-23 Thread Jeff Davis
On Mon, 2023-10-23 at 10:14 -0400, Robert Haas wrote:
> I think that would be a bug. I think it would be OK to just change
> the
> page LSN and nothing else, but that requires calling
> MarkBufferDirty()
> at some point. If you only call PageSetLSN() and never
> MarkBufferDirty(), that sounds messed up: either the former should be
> skipped, or the latter should be added. We shouldn't modify a shared
> buffer without marking it dirty.
> 

In that case, I think REGBUF_NO_CHANGE is the right name for the flag.

Committed.
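
For reference, the invariant Robert describes is the usual WAL-logging
sequence (a condensed sketch of the pattern documented in
src/backend/access/transam/README; buf, rmid and info stand in for the
caller's values):

	START_CRIT_SECTION();

	/* ... modify the page in shared buffer "buf" ... */
	MarkBufferDirty(buf);		/* a changed shared buffer must be dirtied */

	XLogBeginInsert();
	XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
	recptr = XLogInsert(rmid, info);
	PageSetLSN(BufferGetPage(buf), recptr);

	END_CRIT_SECTION();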

Regards,
Jeff Davis





Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Tom Lane
Michael Paquier  writes:
> Please find attached a patch to move the version call close to the
> existing PGAC_PATH_PROGS.  And of course, I'd like to do a backpatch.
> Is that OK?

OK by me.

regards, tom lane




Re: Various bugs if segment containing redo pointer does not exist

2023-10-23 Thread Andres Freund
Hi,

On 2023-10-23 16:21:45 -0700, Andres Freund wrote:
> 1) For some reason I haven't yet debugged, the ReadRecord(PANIC) in
>PerformWalRecovery() doesn't PANIC, but instead just returns NULL
>
>We *do* output a DEBUG message, but well, that's insufficient.

The debug is from this backtrace:

#0  XLogFileReadAnyTLI (segno=6476, emode=13, source=XLOG_FROM_PG_WAL) at 
/home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:4291
#1  0x55d7b3949db0 in WaitForWALToBecomeAvailable (RecPtr=108649259008, 
randAccess=true, fetching_ckpt=false, tliRecPtr=108665421680, replayTLI=1,
replayLSN=108665421680, nonblocking=false) at 
/home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:3697
#2  0x55d7b39494ff in XLogPageRead (xlogreader=0x55d7b472c470, 
targetPagePtr=108649250816, reqLen=8192, targetRecPtr=108665421680,
readBuf=0x55d7b47ba5d8 "\024\321\005") at 
/home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:3278
#3  0x55d7b3941bb1 in ReadPageInternal (state=0x55d7b472c470, 
pageptr=108665413632, reqLen=8072)
at /home/andres/src/postgresql/src/backend/access/transam/xlogreader.c:1014
#4  0x55d7b3940f43 in XLogDecodeNextRecord (state=0x55d7b472c470, 
nonblocking=false)
at /home/andres/src/postgresql/src/backend/access/transam/xlogreader.c:571
#5  0x55d7b3941a41 in XLogReadAhead (state=0x55d7b472c470, 
nonblocking=false) at 
/home/andres/src/postgresql/src/backend/access/transam/xlogreader.c:947
#6  0x55d7b393f5fa in XLogPrefetcherNextBlock (pgsr_private=94384934340072, 
lsn=0x55d7b47cfeb8)
at 
/home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:496
#7  0x55d7b393efcd in lrq_prefetch (lrq=0x55d7b47cfe88) at 
/home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:256
#8  0x55d7b393f190 in lrq_complete_lsn (lrq=0x55d7b47cfe88, lsn=0) at 
/home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:294
#9  0x55d7b39401ba in XLogPrefetcherReadRecord (prefetcher=0x55d7b47bc5e8, 
errmsg=0x7ffc23505920)
at 
/home/andres/src/postgresql/src/backend/access/transam/xlogprefetcher.c:1041
#10 0x55d7b3948ff8 in ReadRecord (xlogprefetcher=0x55d7b47bc5e8, emode=23, 
fetching_ckpt=false, replayTLI=1)
at 
/home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:3078
#11 0x55d7b3946749 in PerformWalRecovery () at 
/home/andres/src/postgresql/src/backend/access/transam/xlogrecovery.c:1640


The source of the emode=13=DEBUG2 is that that's hardcoded in
WaitForWALToBecomeAvailable(). I guess the error ought to come from
XLogPageRead(), but all that happens is this:

case XLREAD_FAIL:
if (readFile >= 0)
close(readFile);
readFile = -1;
readLen = 0;
readSource = XLOG_FROM_ANY;
return XLREAD_FAIL;

which *does* error out for some other failures:
errno = save_errno;
ereport(emode_for_corrupt_record(emode, targetPagePtr + 
reqLen),
(errcode_for_file_access(),
 errmsg("could not read from WAL 
segment %s, LSN %X/%X, offset %u: %m",
fname, 
LSN_FORMAT_ARGS(targetPagePtr),
readOff)));

but not for a file that couldn't be opened. Which wouldn't have to be due to
ENOENT, could also be EACCESS...


xlogreader has undergone a fair bit of changes in the last few releases. As of
now, I can't really understand who is responsible for reporting what kind of
error.

/*
 * Data input callback
 *
 * This callback shall read at least reqLen valid bytes of the xlog page
 * starting at targetPagePtr, and store them in readBuf.  The callback
 * shall return the number of bytes read (never more than XLOG_BLCKSZ), 
or
 * -1 on failure.  The callback shall sleep, if necessary, to wait for 
the
 * requested bytes to become available.  The callback will not be 
invoked
 * again for the same page unless more than the returned number of bytes
 * are needed.
 *
 * targetRecPtr is the position of the WAL record we're reading.  
Usually
 * it is equal to targetPagePtr + reqLen, but sometimes xlogreader needs
 * to read and verify the page or segment header, before it reads the
 * actual WAL record it's interested in.  In that case, targetRecPtr can
 * be used to determine which timeline to read the page from.
 *
 * The callback shall set ->seg.ws_tli to the TLI of the file the page 
was
 * read from.
 */
XLogPageReadCB page_read;

/*
 * Callback to 

Re: Patch: Improve Boolean Predicate JSON Path Docs

2023-10-23 Thread Erik Wienhold
On 2023-10-24 00:58 +0200, David E. Wheeler wrote:
> On Oct 22, 2023, at 20:36, Erik Wienhold  wrote:
> 
> > That's an AppleSingle file according to [1][2].  It only contains the
> > resource fork and file name but no data fork.
> 
> Ah, I had “Send large attachments with Mail Drop” enabled. To me 20K
> is not big but whatever. Let’s see if turning it off fixes the issue.

I suspected it had something to do with iCloud.  Glad you solved it!

> > Please change to "in strict mode" (without "the").
> 
> Hrm, I prefer it without the article, too, but it is consistently used
> that way elsewhere, like here:
> 
>  
> https://github.com/postgres/postgres/blob/5b36e8f/doc/src/sgml/func.sgml#L17401
> 
> I’d be happy to change them all, but was keeping it consistent for now.

Right.  I haven't really noticed that the article case is more common.
I thought that you may have missed that one because I saw this change
that removes the article:

> -In the strict mode, the specified path must exactly match the structure 
> of
> +In strict mode, the specified path must exactly match the structure of

> Updated patch attached, thank you!

LGTM.  Would you create a commitfest entry?  I'll set the status to RfC.

-- 
Erik




RE: Partial aggregates pushdown

2023-10-23 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr. Momjian.

> Fujii-san, to get this patch closer to finished, can I modify this version of 
> the patch to improve some wording and post an
> updated version you can use for future changes?
Yes, I greatly appreciate your offer.
I would very much appreciate your modifications.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation




Re: Add trailing commas to enum definitions

2023-10-23 Thread David Steele

On 10/23/23 17:04, Tom Lane wrote:

Nathan Bossart  writes:

 From a long-term perspective, I
think standardizing on the trailing comma style will actually improve
git-blame because patches won't need to add a comma to the previous line
when adding a value.


Yeah, that's a good point.  I had been leaning towards "this is
unnecessary churn", but with that idea I'm now +1.


+1 from me.

-David
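
For illustration, the style being standardized on (enum and values
hypothetical):

typedef enum WidgetKind
{
	WIDGET_PLAIN,
	WIDGET_FANCY,				/* trailing comma: adding the next value
								 * touches only its own line in git blame */
} WidgetKind;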





Re: trying again to get incremental backup

2023-10-23 Thread David Steele

On 10/23/23 11:44, Robert Haas wrote:

On Fri, Oct 20, 2023 at 11:30 AM David Steele  wrote:


I don't plan to stand in your way on this feature. I'm reviewing what
patches I can out of courtesy and to be sure that nothing adjacent to
your work is being affected. My apologies if my reviews are not meeting
your expectations, but I am contributing as my time constraints allow.


Sorry, I realize reading this response that I probably didn't do a
very good job writing that email and came across sounding like a jerk.
Possibly, I actually am a jerk. Whether it just sounded like it or I
actually am, I apologize. 


That was the way it came across, though I prefer to think it was 
unintentional. I certainly understand how frustrating dealing with a 
large and uncertain patch can be. Either way, I appreciate the apology.


Now onward...


But your last paragraph here gets at my real
question, which is whether you were going to try to block the feature.
I recognize that we have different priorities when it comes to what
would make most sense to implement in PostgreSQL, and that's OK, or at
least, it's OK with me. 


This seem perfectly natural to me.


I also don't have any particular expectation
about how much you should review the patch or in what level of detail,
and I have sincerely appreciated your feedback thus far. If you are
able to continue to provide more, that's great, and if that's not,
well, you're not obligated. What I was concerned about was whether
that review was a precursor to a vigorous attempt to keep the main
patch from getting committed, because if that was going to be the
case, then I'd like to surface that conflict sooner rather than later.
It sounds like that's not an issue, which is great.


Overall I would say I'm not strongly for or against the patch. I think 
it will be very difficult to use in a manual fashion, but automation is 
the way to go in general so that's not necessarily an argument against.


However, this is an area of great interest to me so I do want to at 
least make sure nothing is being impacted adjacent to the main goal of 
this patch. So far I have seen no sign of that, but that has been a 
primary goal of my reviews.



At the risk of drifting into the fraught question of what I *should*
be implementing rather than the hopefully-less-fraught question of
whether what I am actually implementing is any good, I see incremental
backup as a way of removing some of the use cases for the low-level
backup API. If you said "but people still will have lots of reasons to
use it," I would agree; and if you said "people can still screw things
up with pg_basebackup," I would also agree. Nonetheless, most of the
disasters I've personally seen have stemmed from the use of the
low-level API rather than from the use of pg_basebackup, though there
are exceptions. 


This all makes sense to me.


I also think a lot of the use of the low-level API is
driven by it being just too darn slow to copy the whole database, and
incremental backup can help with that in some circumstances. 


I would argue that restore performance is *more* important than backup 
performance and this patch is a step backward in that regard. Backups 
will be faster and less space will be used in the repository, but 
restore performance is going to suffer. If the deltas are very small the 
difference will probably be negligible, but as the deltas get large (and 
especially if there are a lot of them) the penalty will be more noticeable.



Also, I
have worked fairly hard to try to make sure that if you misuse
pg_combinebackup, or fail to use it altogether, you'll get an error
rather than silent data corruption. I would be interested to hear
about scenarios where the checks that I've implemented can be defeated
by something that is plausibly described as stupidity rather than
malice. I'm not sure we can fix all such cases, but I'm very alert to
the horror that will befall me if user error looks exactly like a bug
in the code. For my own sanity, we have to be able to distinguish
those cases. 


I was concerned with the difficulty of trying to stage the correct 
backups for pg_combinebackup, not whether it would recognize that the 
needed data was not available and then error appropriately. The latter 
is surmountable within pg_combinebackup but the former is left up to the 
user.



Moreover, we also need to be able to distinguish
backup-time bugs from reassembly-time bugs, which is why I've got the
pg_walsummary tool, and why pg_combinebackup has the ability to emit
fairly detailed debugging output. I anticipate those things being
useful in investigating bug reports when they show up. I won't be too
surprised if it turns out that more work on sanity-checking and/or
debugging tools is needed, but I think your concern about people
misusing stuff is bang on target and I really want to do whatever we
can to avoid that when possible and detect it when it happens.


The ability of users to misuse tools is, of course, 

Re: post-recovery amcheck expectations

2023-10-23 Thread Peter Geoghegan
On Fri, Oct 20, 2023 at 8:55 PM Noah Misch  wrote:
> > > I lean toward fixing this by
> > > having amcheck scan left; if left links reach only half-dead or deleted 
> > > pages,
> > > that's as good as the present child block being P_LEFTMOST.
> >
> > Also my preference.
>
> Done mostly that way, except I didn't accept deleted pages.  Making this work
> on !readonly would take more than that, and readonly shouldn't need that.

That makes sense to me. I believe that it's not possible to have a
string of consecutive sibling pages that are all half-dead (regardless
of the BlockNumber order of sibling pages, even). But I'd probably
have written the fix in roughly the same way. Although...maybe you
should try to detect a string of half-dead pages? Hard to say if it's
worth the trouble.

Suggest adding a CHECK_FOR_INTERRUPTS() call to the loop, too, just
for good luck.

> After I fixed the original error, the "block %u is not leftmost" surfaced
> next.  The attached patch fixes that, too.  I didn't investigate the others.
> The original test was flaky in response to WAL flush timing, but this one
> survives thousands of runs.

Hmm. Can't argue with that. Your fix seems sound.

-- 
Peter Geoghegan




Re: WIP: new system catalog pg_wait_event

2023-10-23 Thread Michael Paquier
On Mon, Oct 23, 2023 at 01:17:42PM +0300, Pavel Luzanov wrote:
> I propose to add a table alias for the wait_event column in the
> WHERE clause for consistency.

Okay by me; it looks like an improvement to make clear where this
attribute comes from, so applied on HEAD.
--
Michael


signature.asc
Description: PGP signature


Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Michael Paquier
On Mon, Oct 23, 2023 at 06:06:02PM +0200, Peter Eisentraut wrote:
> On 23.10.23 16:26, Tom Lane wrote:
>> Also, since "PGAC_PATH_PROGS(OPENSSL, openssl)" prints the full path to
>> what it found, you can at least tell after the fact that you are being
>> misled, because you can cross-check that path against the -L switches
>> being used for libraries.
>
> Yeah, that seems ok.

FWIW, I was also contemplating this one yesterday:
+PKG_CHECK_MODULES(OPENSSL, openssl)

Still, when I link my builds to a custom OpenSSL one, I force PATH to
point to a command of openssl related to the libs used so
PGAC_PATH_PROGS is more useful.  I guess that everybody here does the
same.  It could be of course possible to show both the command from
PATH and from pkg-config, but that's just confusing IMO.

There may be a point in doing the same for other commands like LZ4 and
Zstandard but these have been less of a pain in the buildfarm, even if
we don't use them for that long, so I cannot get excited about
spending more ./configure cycles for these.

Please find attached a patch to move the version call close to the
existing PGAC_PATH_PROGS.  And of course, I'd like to do a backpatch.
Is that OK?
--
Michael
diff --git a/configure b/configure
index c2cb1b1b24..cfd968235f 100755
--- a/configure
+++ b/configure
@@ -14077,6 +14077,9 @@ $as_echo_n "checking for OPENSSL... " >&6; }
 $as_echo "$OPENSSL" >&6; }
 fi
 
+pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo openssl not found)"
+{ $as_echo "$as_me:${as_lineno-$LINENO}: using openssl: $pgac_openssl_version" >&5
+$as_echo "$as_me: using openssl: $pgac_openssl_version" >&6;}
 if test "$with_ssl" = openssl ; then
   ac_fn_c_check_header_mongrel "$LINENO" "openssl/ssl.h" "ac_cv_header_openssl_ssl_h" "$ac_includes_default"
 if test "x$ac_cv_header_openssl_ssl_h" = xyes; then :
diff --git a/configure.ac b/configure.ac
index 440b08d113..f220b379b3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1553,6 +1553,8 @@ if test "$with_gssapi" = yes ; then
 fi
 
 PGAC_PATH_PROGS(OPENSSL, openssl)
+pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo openssl not found)"
+AC_MSG_NOTICE([using openssl: $pgac_openssl_version])
 if test "$with_ssl" = openssl ; then
  AC_CHECK_HEADER(openssl/ssl.h, [], [AC_MSG_ERROR([header file <openssl/ssl.h> is required for OpenSSL])])
  AC_CHECK_HEADER(openssl/err.h, [], [AC_MSG_ERROR([header file <openssl/err.h> is required for OpenSSL])])


signature.asc
Description: PGP signature


Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Matthias van de Meent
On Mon, 23 Oct 2023, 19:34 Tom Lane,  wrote:
>
> I wrote:
> > Given the exception the spec makes for CAST, I wonder if we shouldn't
> > just say "NULL is a valid value of every domain type, as well as every
> > base type.  If you don't like it, too bad; write a separate NOT NULL
> > constraint for your table column."
>
> After ruminating on this for awhile, here's a straw-man proposal:
>
> 1. Domains are data types, with the proviso that NULL is always
> a valid value no matter what the domain constraints might say.
> Implementation-wise, this'd just require that CoerceToDomain
> immediately return any null input without checking the constraints.
> This has two big attractions:

Agreed.

> 2. In INSERT and UPDATE queries, thumb through the constraints of
> any domain-typed target columns to see if any of them are NOT NULL
> or CHECK(VALUE IS NOT NULL).  If so, act as though there's a table
> NOT NULL constraint on that column.

How does this work w.r.t. concurrently created tables that contain the
domain? Right now, you can do something along the lines of the
following due to a lack of locking on domains for new columns/tables
that use said domain, and I believe that this is the main source of
domain constraint violations:

CREATE DOMAIN mydomain text;
CREATE TABLE c (d mydomain);

S1: BEGIN; INSERT INTO c VALUES (''); CREATE TABLE t (d mydomain);
INSERT INTO t VALUES (NULL);

S2: BEGIN; ALTER DOMAIN mydomain SET NOT NULL;
-- waits for S1 to release lock on c

S1: COMMIT;
-- S2's ALTER DOMAIN gets unblocked and succeeds, despite the NULL
value in "t" because that table is invisible to the transaction of
ALTER DOMAIN.

So my base question is, should we then require e.g. SHARE locks on
types that depend on domains when we do DDL that depends on the type,
and SHARE UPDATE EXCLUSIVE when we modify the type?

> The idea of point #2 is to have a cheap check that 99% satisfies
> what the spec says about not-null constraints on domains.  If we
> don't do #2, I think we have to fully recheck all the domain's
> constraints during column assignment.  I find that ugly as well
> as expensive performance-wise.  It does mean that if you have
> some domain constraint that would act to reject NULLs, but it's
> spelled in some weird way, it won't reject NULLs.  I don't find
> that possibility compelling enough to justify the performance hit
> of recomputing every constraint just in case it acts like that.

Makes sense.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Various bugs if segment containing redo pointer does not exist

2023-10-23 Thread Andres Freund
Hi,

I investigated a crashing postgres instance. Turns out the original issue was
operator error. But in the process I found a few postgres issues. The scenario
is basically that the redo LSN and checkpoint LSN are in separate segments, and
that, for whatever reason, the file containing the redo LSN is missing.

I'd expect a PANIC in that situation. But, as it turns out, no such luck.

I've looked at 15 and HEAD so far.

1) For some reason I haven't yet debugged, the ReadRecord(PANIC) in
   PerformWalRecovery() doesn't PANIC, but instead just returns NULL

   We *do* output a DEBUG message, but well, that's insufficient.


2) On HEAD, we then segfault, because the cross check for XLOG_CHECKPOINT_REDO
   causes null pointer dereference. Which I guess is good, we shouldn't have
   gotten there without a record.


3) On 15, with or without assertions, we decide that "redo is not
   required". Gulp.


4) On 15, with assertions enabled, we fail with an assertion in the startup
   process, in FinishWalRecovery()->XLogPrefetcherBeginRead()->XLogBeginRead()
   Assert(!XLogRecPtrIsInvalid(RecPtr))


5) On 15, with optimizations enabled, we don't just crash, it gets scarier.

   First, the startup process actually creates a bogus WAL segment:

#1  0x55f53b2725ff in XLogFileInitInternal (path=0x7ffccb7b3360 
"pg_wal/0001", added=0x7ffccb7b335f, logtli=1, 
logsegno=1099511627776)
at /home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:2936
#2  PreallocXlogFiles (endptr=endptr@entry=0, tli=tli@entry=1) at 
/home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:3433
#3  0x55f53b277e00 in PreallocXlogFiles (tli=1, endptr=0) at 
/home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:3425
#4  StartupXLOG () at 
/home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:5517
#5  0x55f53b4a385e in StartupProcessMain () at 
/home/andres/src/postgresql-15/src/backend/postmaster/startup.c:282
#6  0x55f53b49860b in AuxiliaryProcessMain 
(auxtype=auxtype@entry=StartupProcess)
at /home/andres/src/postgresql-15/src/backend/postmaster/auxprocess.c:141
#7  0x55f53b4a2eed in StartChildProcess (type=StartupProcess) at 
/home/andres/src/postgresql-15/src/backend/postmaster/postmaster.c:5432
#8  PostmasterMain (argc=argc@entry=39, argv=argv@entry=0x55f53d5095d0) at 
/home/andres/src/postgresql-15/src/backend/postmaster/postmaster.c:1473
#9  0x55f53b1d1bff in main (argc=39, argv=0x55f53d5095d0) at 
/home/andres/src/postgresql-15/src/backend/main/main.c:202

Note the endptr=0 and pg_wal/0001 path.

With normal log level, one wouldn't learn anything about this.


Then the *checkpointer* segfaults, trying to write the end-of-recovery
checkpoint:

#0  __memcpy_evex_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:344
#1  0x56102ebe84a2 in memcpy (__len=26, __src=0x56102fbe7b48, 
__dest=) at 
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
#2  CopyXLogRecordToWAL (tli=1, EndPos=160, StartPos=40, rdata=0x56102f31ffb0 
, isLogSwitch=false, write_len=114)
at /home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:1229
#3  XLogInsertRecord (rdata=, fpw_lsn=, 
flags=, num_fpi=0, topxid_included=)
at /home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:861
#4  0x56102ebf12cb in XLogInsert (rmid=rmid@entry=0 '\000', 
info=info@entry=0 '\000')
at 
/home/andres/src/postgresql-15/src/backend/access/transam/xloginsert.c:492
#5  0x56102ebea92e in CreateCheckPoint (flags=102) at 
/home/andres/src/postgresql-15/src/backend/access/transam/xlog.c:6583
#6  0x56102ee0e552 in CheckpointerMain () at 
/home/andres/src/postgresql-15/src/backend/postmaster/checkpointer.c:455
#7  0x56102ee0c5f9 in AuxiliaryProcessMain 
(auxtype=auxtype@entry=CheckpointerProcess)
at /home/andres/src/postgresql-15/src/backend/postmaster/auxprocess.c:153
#8  0x56102ee12c38 in StartChildProcess (type=CheckpointerProcess) at 
/home/andres/src/postgresql-15/src/backend/postmaster/postmaster.c:5432
#9  0x56102ee16d54 in PostmasterMain (argc=argc@entry=35, 
argv=argv@entry=0x56102fb6e5d0)
at /home/andres/src/postgresql-15/src/backend/postmaster/postmaster.c:1466
#10 0x56102eb45bff in main (argc=35, argv=0x56102fb6e5d0) at 
/home/andres/src/postgresql-15/src/backend/main/main.c:202

The immediate cause of the crash is that GetXLogBuffer() is called with
ptr=40, which makes GetXLogBuffer() think it can use the cached path,
because cachedPage is still zero.

Which in turn is because because the startup process happily initialized
XLogCtl->Insert.{CurrBytePos,PrevBytePos} with 0s.  Even though it
initialized RedoRecPtr with the valid redo pointer.

The checkpointer actually ends up resetting the valid RedoRecPtr with
bogus content as part of CreateCheckPoint(), due to the bogus CurrBytePos.


6) On 15, with optimizations enabled, we 

Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Mark Wong
On Tue, Oct 24, 2023 at 10:17:22AM +1300, Thomas Munro wrote:
> On Tue, Oct 24, 2023 at 4:27 AM Mark Wong  wrote:
> > I haven't gotten around to disabling llvm on any of my animals with llvm
> > < 7 yet.  Do you still want to hold on that?
> 
> Yes, please disable --with-llvm on s390x and POWER animals with LLVM <
> 7 (see below).  Also, you have a bunch of machines with LLVM 16 that
> are failing to compile on REL_11_STABLE.  That is expected, because we
> agreed not to back-patch the LLVM 16 API changes into REL_11_STABLE:
> 
> > kingsnake: 16.0.6 Fedora Linux 38
> > krait: CentOS 16.0.6 Stream 8
> > lancehead: CentOS 16.0.6 Stream 8

I should have updated these to not use --with-llvm for REL_11_STABLE.

> These POWER machines fail as expected, and it's unfixable:
> 
> > elasmobranch: 5.0.1 openSUSE Leap 15.0
> > demoiselle: 5.0.1 openSUSE Leap 15.0
> > cavefish: 6.0.0 Ubuntu 18.04.6 LTS

These should now be updated to not use --with-llvm at all.

> These s390x animals are failing, but don't show the layout complaint.
> I can see that LLVM 6 also lacked a case for s390x in
> llvm::orc::createLocalIndirectStubsManagerBuilder(), the thing that
> was fixed in 7 with the addition of a default case.  Therefore these
> presumably fail just like old LLVM on POWER, and it's unfixable.  So I
> suggest turning off --with-llvm on these two:
> 
> > cotinga: 6.0.0 Ubuntu 18.04.6 LTS
> > perch: 6.0.0 Ubuntu 18.04.6 LTS

Ok, I should have removed --with-llvm here too.

> This s390x animal doesn't actually have --with-llvm enabled so it
> passes, but surely it'd be just like lora:
> 
> > mamushi: 15.0.7 Red Hat Enterprise Linux 9.2

Oops, I think I added it now.


I think I made all the recommended changes, and trimmed out the lines
where I didn't need to do anything. :)

Andres pointed out to me that my animals aren't set up to collect core
files, so I'm trying to update that too...

Regards,
Mark




Re: Use virtual tuple slot for Unique node

2023-10-23 Thread David Rowley
On Fri, 20 Oct 2023 at 22:30, Ashutosh Bapat
 wrote:
> I ran my experiments again. It seems on my machine the execution times
> do vary a bit. I ran EXPLAIN ANALYZE on the query 5 times and took
> average of execution times. I did this three times. For each run the
> standard deviation was within 2%. Here are the numbers
> master: 13548.33, 13878.88, 14572.52
> master + 0001: 13734.58, 14193.83, 14574.73
>
> So for me, I would say, this particular query performs the same with
> or without patch.

I'm not really sure which of the 7 queries you're referring to here.
The times you're quoting seem to align best to Q7 from your previous
results, so I'll assume you mean Q7.

I'm not really concerned with Q7 as both patched and unpatched use
TTSOpsMinimalTuple.

I also think you need to shrink the size of your benchmark down.  With
1 million tuples, you're more likely to be also measuring the time it
takes to get cache lines from memory into the CPU. A smaller scale
test will make this less likely. Also, you'd be better off timing SELECT
rather than the time it takes to EXPLAIN ANALYZE. They're not the same
thing. EXPLAIN ANALYZE has additional timing going on and we may end
up not de-toasting toasted Datums.
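
For example, a minimal way to do that in psql (illustrative; the table and
query are stand-ins for the actual benchmark's):

\timing on
SELECT DISTINCT a, b FROM t;  -- the query under test, without EXPLAIN ANALYZE
\timing off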

> On Thu, Oct 19, 2023 at 4:26 PM David Rowley  wrote:
> > We can see that Q2 and Q3 become a bit slower.  This makes sense as
> > tts_virtual_materialize() is quite a bit more complex than
> > heap_copy_minimal_tuple() which is a simple palloc/memcpy.
> >
>
> If the source slot is a materialized virtual slot,
> tts_virtual_copyslot() could perform a memcpy of the materialized data
> itself rather than materialising from datums. That might be more
> efficient.

I think you're talking about just performing a memcpy() of the
VirtualTupleTableSlot->data field.  Unfortunately, you'd not be able
to do just that as you'd also need to repoint the non-byval Datums in
tts_values at the newly memcpy'd memory. If you skipped that part,
those would remain pointing to the original memory. If that memory
goes away, then bad things will happen. I think you'd still need to do
the 2nd loop in tts_virtual_materialize().

> > We'd likely see Q2 and Q3 do better with the patched version if there
> > were more duplicates as there'd be less tuple deforming going on
> > because of the virtual slots.
> >
> > Overall, the patched version is 5.55% faster than master.  However,
> > it's pretty hard to say if we should do this or not. Q3 has a mix of
> > varlena and byval types and that came out slower with the patched
> > version.
>
> Theoretically using the same slot type is supposed to be faster. We
> use same slot types for input and output in other places where as
> well.

Which theory?

> Maybe we should fix the above-said inefficiency in
> tts_virtual_copyslot()?

I don't have any bright ideas on how to make tts_virtual_materialize()
itself faster.  If there were some way to remember !attbyval
attributes for the 2nd loop, that might be good, but creating
somewhere to store that might result in further overheads.

tts_virtual_copyslot() perhaps could be sped up a little by doing a
memcpy of the values/isnull arrays when the src and dst descriptors
have the same number of attributes. aka, something like:

if (srcdesc->natts == dstslot->tts_tupleDescriptor->natts)
    memcpy(dstslot->tts_values, srcslot->tts_values,
           MAXALIGN(srcdesc->natts * sizeof(Datum)) +
           MAXALIGN(srcdesc->natts * sizeof(bool)));
else
{
    for (int natt = 0; natt < srcdesc->natts; natt++)
    {
        dstslot->tts_values[natt] = srcslot->tts_values[natt];
        dstslot->tts_isnull[natt] = srcslot->tts_isnull[natt];
    }
}

I imagine we'd only start to notice gains by doing that for larger
natts values.  Each of the 7 queries here, I imagine, wouldn't have
enough columns for it to make much of a difference.

That seems valid enough to do based on how MakeTupleTableSlot()
allocates those arrays. ExecSetSlotDescriptor() does not seem on board
with the single allocation method, however. (Those pfree's in
ExecSetSlotDescriptor() do look a bit fragile.
https://coverage.postgresql.org/src/backend/executor/execTuples.c.gcov.html
says they're never called)

David




Re: Patch: Improve Boolean Predicate JSON Path Docs

2023-10-23 Thread David E. Wheeler
On Oct 22, 2023, at 20:36, Erik Wienhold  wrote:

> That's an AppleSingle file according to [1][2].  It only contains the
> resource fork and file name but no data fork.

Ah, I had “Send large attachments with Mail Drop” enabled. To me 20K is not big 
but whatever. Let’s see if turning it off fixes the issue.

> Any reason for calling it "predicate check expressions" (e.g. the link
> text) and sometimes "predicate path expressions" (e.g. the linked
> section title)?  I think it should be named consistently to avoid
> confusion and also to simplify searching.

I think "predicate path expressions” is more descriptive, but "predicate check 
expressions” is what was in the docs before, so let’s stick with that.
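
For reference, the distinction being documented is roughly this (illustrative;
output abbreviated):

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*] > 2');
 jsonb_path_query
------------------
 true

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ > 2)');
-- a filter expression instead returns the matching items 3, 4 and 5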

> Linking the same section twice in the same paragraph seems excessive.

Fair. Will link the second one.

>> += select jsonb_path_query(:'json', 
>> '$.track.segments');
>> +select jsonb_path_query(:'json', '$.track.segments');
> 
> Please remove the second SELECT.

Done.

>> += select jsonb_path_query(:'json', 'strict 
>> $.track.segments[0].location');
>> + jsonb_path_query
>> +------------------
>> + [47.763, 13.4034]
> 
> Strict mode is unnecessary to get that result and I'd omit it because
> the different modes are not introduced yet at this point.

Yep, pasto.

> Strict mode is unnecessary here as well.

Fixed.

>> + using the lax mode. To avoid surprising results, we recommend using
>> + the .** accessor only in the strict mode. The
> 
> Please change to "in strict mode" (without "the").

Hrm, I prefer it without the article, too, but it is consistently used that way 
elsewhere, like here:

 https://github.com/postgres/postgres/blob/5b36e8f/doc/src/sgml/func.sgml#L17401

I’d be happy to change them all, but was keeping it consistent for now.

Updated patch attached, thank you!

David



v6-0001-Improve-boolean-predicate-JSON-Path-docs.patch
Description: Binary data


Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-23 Thread Peter Geoghegan
On Mon, Oct 16, 2023 at 11:21 PM Donghang Lin  wrote:
> It seems to me that as long as a data type has a deterministic sort but is 
> not necessarily equalimage,
> it should be able to support deduplication.  e.g. for the interval type, we 
> add a byte-wise tie-breaker
> after '24h' and '1day' are compared equally. In the btree, '24h' and '1day' 
> are still adjacent;
> '1day' is always sorted before '24h' in a btree page. Can we do dedup for 
> each value
> without problem?
> The assertion will still be triggered as it's testing btequalimage, but I'll 
> defer it for now.
> Wanted to know if the above idea sounds sane first...

It's hard to give any one reason why this won't work. I'm sure that
with enough effort some scheme like this could work. It's just that
there are significant practical problems that at least make it seem
unappealing as a project. This has been discussed before:

https://www.postgresql.org/message-id/flat/CAH2-WzkZkSC7G%2Bv1WwXGo0emh8E-rByw%3DxSpBUoavk7PTjwF2Q%40mail.gmail.com#4e98cba0d76c8c8c0bf67ae0d4652903

--
Peter Geoghegan




Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Tom Lane
I wrote:
> Isaac Morland  writes:
>> If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if
>> we could pass NULL to the constraint at CREATE DOMAIN time, and if it
>> returns FALSE, do exactly what we would have done (set pg_type.typnotnull)
>> if an actual NOT NULL clause had been specified?

> Maybe, but then ALTER DOMAIN would have to be prepared to update that
> flag when adding or dropping constraints.  Perhaps that's better than
> checking on-the-fly during DML commands, though.

After further thought I like that idea a lot, but we can't simply
overwrite pg_type.typnotnull without losing track of whether the user
had given a bare NOT NULL constraint.  Instead I think the details
should be like this:

1. Add a bool column "connotnull" (or some such name) to pg_constraint.
Set this to true when the constraint is a domain CHECK constraint that
returns FALSE for NULL input.  (In future we could maintain the flag
for table CHECK constraints too, perhaps, but I don't see value in
that right now.)  This requires assuming that the constraint is
immutable (which we assume already) and that it's okay to evaluate it
on a NULL immediately during CREATE DOMAIN or ALTER DOMAIN ADD
CONSTRAINT.  It seems possible that that could fail, but only with
rather questionable choices of constraints.
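
For example (a sketch of evaluating each new CHECK against a NULL at DDL
time; "connotnull" is the column proposed above):

CREATE DOMAIN d1 AS integer CHECK (VALUE IS NOT NULL);  -- FALSE for NULL:
                                                        -- connotnull = true
CREATE DOMAIN d2 AS integer CHECK (VALUE > 0);          -- NULL, not FALSE,
                                                        -- for NULL: left false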

2. INSERT/UPDATE enforce not-nullness if pg_type.typnotnull is set
or there is any domain constraint with pg_constraint.connotnull
set.  This still requires thumbing through the constraints at
query start, but the check is cheaper and a good deal more bulletproof
than my previous suggestion of a purely-syntactic check.

We could make query start still cheaper by adding another pg_type
column that is the OR of the associated constraints' connotnull
flags, but I suspect it's not worth the trouble.  The typcache
can probably maintain that info with epsilon extra cost.

A variant approach could be to omit the catalog changes and have
this state be tracked entirely by the typcache.  That'd result in
rather more trial evaluations of the domain constraints on NULLs,
but it would have the advantage of not requiring any constraint
evaluations to occur during CREATE/ALTER DOMAIN, only during startup
of a query that's likely to evaluate them anyway.  That'd reduce
the odds of breaking things thanks to search_path dependencies
and suchlike.

regards, tom lane




Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Tom Lane
Isaac Morland  writes:
> Then domain CHECK constraints are checked anytime a non-NULL value is
> turned into a domain value, and NOT NULL ones are checked only when storing
> to a table. CHECK constraints would be like STRICT functions; if the input
> is NULL, the implementation is not run and the result is NULL (which for a
> CHECK means accept the input).

Right.

> Whether I actually think the above is a good idea would require me to read
> carefully the relevant section of the SQL spec. If it agrees that CHECK ()
> is for testing non-NULL values and NOT NULL is for saying that columns of
> actual tables can't be NULL, then I would probably agree with my own idea,
> otherwise perhaps not depending on exactly what it said.

The spec doesn't actually allow bare NOT NULL as a domain constraint;
it only has CHECK constraints.  Of course you can write CHECK(VALUE
IS NOT NULL), or more-complicated things that will reject a NULL,
but they're effectively ignored during CAST and applied only when
storing to a table column.

I think we decided to implement NOT NULL because it seemed like an
odd wart not to have it if you could do the CHECK equivalent.
In the light of this new understanding, though, I bet they omitted
it deliberately because it'd be too-obviously-inconsistent behavior.

In any case, we can't drop the NOT NULL option now without breaking
apps.  I think it should continue to behave exactly the same as
"CHECK(VALUE IS NOT NULL)".

> If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if
> we could pass NULL to the constraint at CREATE DOMAIN time, and if it
> returns FALSE, do exactly what we would have done (set pg_type.typnotnull)
> if an actual NOT NULL clause had been specified?

Maybe, but then ALTER DOMAIN would have to be prepared to update that
flag when adding or dropping constraints.  Perhaps that's better than
checking on-the-fly during DML commands, though.

regards, tom lane




Re: Creating foreign key on partitioned table is too slow

2023-10-23 Thread Alec Lazarescu
I ran into a situation that echoed this original one by Kato in the
start of this thread:
https://www.postgresql.org/message-id/OSAPR01MB374809E8DE169C8BF2B82CBD9F6B0%40OSAPR01MB3748.jpnprd01.prod.outlook.com

More below.

Tom Lane  writes:
>
> I tried running the original test case under HEAD.  I do not see
> any visible memory leak, which I think indicates that 5b9312378 or
> some other fix has taken care of the leak since the original report.
> However, after waiting awhile and noting that the ADD FOREIGN KEY
> wasn't finishing, I poked into its progress with a debugger and
> observed that each iteration of RI_Initial_Check() was taking about
> 15 seconds.  I presume we have to do that for each partition,
> which leads to the estimate that it'll take 34 hours to finish this
> ... and that's with no data in the partitions, god help me if
> there'd been a lot.
>
> Some quick "perf" work says that most of the time seems to be
> getting spent in the planner, in get_eclass_for_sort_expr().
> So this is likely just a variant of performance issues we've
> seen before.  (I do wonder why we're not able to prune the
> join to just the matching PK partition, though.)

I found that the original example from Kato finishes in a little over
a minute now to create the FK constraint in Postgres 14-16.

However, in my case, I'm using composite partitions and that is taking
60x as long for an equivalent number of partitions. I must emphasize
this is with ZERO rows of data.

I'm using 1,200 partitions in my example so it finishes a bit faster,
because that's my actual use case, and because it's less extreme than
8,000 partitions.

If I reach my 1,200 with 80 top-level and 15 leaf-level partitions in
a composite hierarchy (80x15 = 1,200 still), the speed is very slow.
I'm using composite partitions primarily because in my real code I
need LIST partitions which don't support multiple keys so I worked
around that using composite partitions with one LIST key in each
level. Doing some other workaround like a concatenated single key was
messy for my use case.
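
The shape of that workaround, sketched (names are illustrative):

-- one LIST key per level instead of a single two-column LIST key
CREATE TABLE t (k1 text, k2 text, v integer) PARTITION BY LIST (k1);
CREATE TABLE t_a PARTITION OF t FOR VALUES IN ('a') PARTITION BY LIST (k2);
CREATE TABLE t_a_x PARTITION OF t_a FOR VALUES IN ('x');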

I modified Kato's test case to repeat the issue I'm having and saw
some very odd query plan behavior which is likely part of the issue.

The flat version with 1,200 partitions at a single level and no
composite partitions finishes in a little over a second while the 80 x
15 version with composite partitions takes over a minute (60x longer).
In my actual database with many such partitions with FK, the time
compounds and FK creation takes >30 minutes per FK leading to hours
just making FKs.

== COMPOSITE PARTITION INTERNAL SELECT PLAN ==

If I cancel the composite FK creation, I see where it stopped and that
gives a clue about the difference in speed. For the composite, it's
this statement, with a plan (linked on dalibo) showing a massive number
of sequential scans and Postgres assuming that only 1 row
exists.

ERROR: canceling statement due to user request
CONTEXT: SQL statement SELECT fk."aid" FROM ONLY
"public"."xhistory_25_12" fk LEFT OUTER JOIN "public"."xaccounts" pk
ON ( pk."aid" OPERATOR(pg_catalog.=) fk."aid") WHERE pk."aid" IS NULL
AND (fk."aid" IS NOT NULL)
SQL state: 57014

PLAN DETAILS: https://explain.dalibo.com/plan/fad72gdacb6727b4#plan

== FLAT PARTITION INTERNAL SELECT PLAN ==

This gives a direct result node and has no complexity at all:

SELECT fk."aid" FROM ONLY "public"."history_23" fk LEFT OUTER JOIN
"public"."accounts" pk ON ( pk."aid" OPERATOR(pg_catalog.=) fk."aid")
WHERE pk."aid" IS NULL AND (fk."aid" IS NOT NULL)

PLAN DETAILS: https://explain.dalibo.com/plan/a83dae9b9569ebcd

Test cases to repeat easily below:

== FLAT PARTITION FAST FK DDL ==

CREATE DATABASE fastflatfk;

CREATE TABLE accounts (aid INTEGER, bid INTEGER, abalance INTEGER,
filler CHAR(84)) PARTITION BY HASH(aid);
CREATE TABLE history (tid INTEGER, bid INTEGER, aid INTEGER, delta
INTEGER, mtime TIMESTAMP, filler CHAR(22)) PARTITION BY HASH(aid);

DO $$
DECLARE
   p INTEGER;
BEGIN
  FOR p IN 0..1023 LOOP
EXECUTE 'CREATE TABLE accounts_' || p || ' PARTITION OF accounts
FOR VALUES WITH (modulus 1024, remainder ' || p || ') PARTITION BY
HASH(aid);';
EXECUTE 'CREATE TABLE history_' || p || ' PARTITION OF history FOR
VALUES WITH (modulus 1024, remainder ' || p || ') PARTITION BY
HASH(aid);';
  END LOOP;
END $$;

ALTER TABLE accounts ADD CONSTRAINT accounts_pk PRIMARY KEY (aid);

-- Query returned successfully in 1 secs 547 msec.
ALTER TABLE history ADD CONSTRAINT history_fk FOREIGN KEY (aid)
REFERENCES accounts (aid) ON DELETE CASCADE;

--run to drop FK before you recreate it
--ALTER TABLE history DROP CONSTRAINT history_fk

== COMPOSITE PARTITION SLOW FK DDL ==

Now the composite partition version with 80 x 15 partitions which
finishes in a bit over a minute (60x the time)

CREATE DATABASE slowcompfk;

-- Create the parent tables for xaccounts and xhistory
CREATE TABLE xaccounts (aid INTEGER, bid INTEGER, abalance INTEGER,
filler CHAR(84)) PARTITION BY HASH(aid);
CREATE TABLE xhistory 

Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Tom Lane
Vik Fearing  writes:
> On 10/23/23 18:53, Tom Lane wrote:
>> (1A) It satisfies the plain language of the SQL spec about 
>> how CAST to a domain type behaves.

> I agree with all of your proposal, except for this part.  I think the 
> shortcut in the General Rules of <cast specification> is an oversight 
> and I plan on submitting a paper to fix it.

Yeah, it might be a bug in the spec, but if so the bug has been there
since SQL92 without anyone noticing.  SQL92 has GR2 as

 2) Case:

a) If the <cast operand> specifies NULL or if SV is the null
  value, then the result of the <cast specification> is the
  null value.

SQL99 revised the text some, but without changing that outcome.
Then in SQL:2003 they doubled down on the point:

a) If the <cast operand> specifies NULL, then TV is the null value and
no further General Rules of this Subclause are applied.

b) If the <cast operand> specifies an <empty specification>, then TV
is an empty collection of declared type TD and no further General
Rules of this Subclause are applied.

c) If SV is the null value, then TV is the null value and no further
General Rules of this Subclause are applied.
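
Read literally, those rules make a domain's constraints unreachable for NULL
input.  For example (illustrative):

CREATE DOMAIN positive_int AS integer CHECK (VALUE > 0);
SELECT CAST(NULL AS positive_int);  -- per GR c), SV is null, so TV is null;
                                    -- the CHECK is never evaluated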

You're suggesting that nobody noticed that this wording requires NULLs
to skip the domain checks?  Maybe, but I think it must be intentional.
I'll await the committee's reaction with interest.

regards, tom lane




Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Thomas Munro
On Tue, Oct 24, 2023 at 4:27 AM Mark Wong  wrote:
> I haven't gotten around to disabling llvm on any of my animals with llvm
> < 7 yet.  Do you still want to hold on that?

Yes, please disable --with-llvm on s390x and POWER animals with LLVM <
7 (see below).  Also, you have a bunch of machines with LLVM 16 that
are failing to compile on REL_11_STABLE.  That is expected, because we
agreed not to back-patch the LLVM 16 API changes into REL_11_STABLE:

> kingsnake: 16.0.6 Fedora Linux 38
> krait: CentOS 16.0.6 Stream 8
> lancehead: CentOS 16.0.6 Stream 8

These POWER machines fail as expected, and it's unfixable:

> elasmobranch: 5.0.1 openSUSE Leap 15.0
> demoiselle: 5.0.1 openSUSE Leap 15.0
> cavefish: 6.0.0 Ubuntu 18.04.6 LTS

(Well, we could in theory supply our own fixed
llvm::orc::createLocalIndirectStubsManagerBuilder() function to hide
the broken one in LLVM <= 6, but that way lies madness IMHO.  An LTS
distro that cares could look into back-patching LLVM's fixes, but for
us, let us focus on current software.)

This POWER animal fails, unexpectedly to me:

> bonito: 7.0.1 Fedora 29

We could try to chase that down, or we could rejoice that at least it
works on current release.  It must begin working somewhere between 7
and 11, but when I checked which LLVM releases I could easily install
on eg cascabel (if I could get access) using the repo at apt.llvm.org,
I saw that they don't even have anything older than 11.  So someone
with access who wants to figure this out might have many days or weeks
of compiling ahead of them.

These POWER animals are passing, as expected:

> cascabel: 11.0.1 Debian GNU/Linux 11
> babbler: 15.0.7 AlmaLinux 8.8
> pytilia: 15.0.7 AlmaLinux 8.8
> nicator: 15.0.7 AlmaLinux 9.2
> twinspot: 15.0.7 AlmaLinux 9.2
> habu: 16.0.6 Fedora Linux 38
> kingsnake: 16.0.6 Fedora Linux 38
> krait: CentOS 16.0.6 Stream 8
> lancehead: CentOS 16.0.6 Stream 8

These s390x animals are passing:

> branta: 10.0.0 Ubuntu 20.04.4 LTS
> pike: 11.0.1 Debian GNU/Linux 11
> rinkhals: 11.0.1 Debian GNU/Linux 11
> sarus: 14.0.0 Ubuntu 22.04.1 LTS

These s390x animals are failing, but don't show the layout complaint.
I can see that LLVM 6 also lacked a case for s390x in
llvm::orc::createLocalIndirectStubsManagerBuilder(), the thing that
was fixed in 7 with the addition of a default case.  Therefore these
presumably fail just like old LLVM on POWER, and it's unfixable.  So I
suggest turning off --with-llvm on these two:

> cotinga: 6.0.0 Ubuntu 18.04.6 LTS
> perch: 6.0.0 Ubuntu 18.04.6 LTS

These s390x animals are failing with the mismatched layout problem,
which should be fixed by Andres's patch to tolerate the changing
z12/z13 ABI thing by falling back to whatever clang picked (at a cost
of not using all the features of your newer CPU, unless you explicitly
tell clang to target it):

> aracari: 15.0.7 Red Hat Enterprise Linux 8.6
> pipit: 15.0.7 Red Hat Enterprise Linux 8.6
> lora: 15.0.7 Red Hat Enterprise Linux 9.2

This s390x animal doesn't actually have --with-llvm enabled so it
passes, but surely it'd be just like lora:

> mamushi: 15.0.7 Red Hat Enterprise Linux 9.2




Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Vik Fearing

On 10/23/23 18:53, Tom Lane wrote:
> 1. Domains are data types, with the proviso that NULL is always a valid
> value no matter what the domain constraints might say.
> Implementation-wise, this'd just require that CoerceToDomain immediately
> return any null input without checking the constraints.  This has two big
> attractions:
>
> (1A) It satisfies the plain language of the SQL spec about
> how CAST to a domain type behaves.



I agree with all of your proposal, except for this part.  I think the 
shortcut in the General Rules of <cast specification> is an oversight 
and I plan on submitting a paper to fix it.  The intention is, in my 
view, clearly to check the constraints upon casting.  What other 
explanation is there since the result type is still the domain's base 
type[*]?



[*] In the standard, not in our superior implementation of it.
--
Vik Fearing





Re: Add trailing commas to enum definitions

2023-10-23 Thread Tom Lane
Nathan Bossart  writes:
> From a long-term perspective, I
> think standardizing on the trailing comma style will actually improve
> git-blame because patches won't need to add a comma to the previous line
> when adding a value.

Yeah, that's a good point.  I had been leaning towards "this is
unnecessary churn", but with that idea I'm now +1.

regards, tom lane




Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Vik Fearing

On 10/23/23 20:36, Isaac Morland wrote:
> Also, where it says "Expressions evaluating to TRUE or UNKNOWN succeed":
> Do we really mean "Expressions evaluating to TRUE or NULL succeed"?


No, UNKNOWN is the correct nomenclature for booleans.
--
Vik Fearing





Re: Why is hot_standby_feedback off by default?

2023-10-23 Thread Nathan Bossart
On Sun, Oct 22, 2023 at 12:07:59PM -0700, Andres Freund wrote:
> Medium term, I think we need an approximate xid->"time of assignment" mapping 
> that's continually maintained on the primary. One of the things that'd allow 
> us to do is introduce a GUC to control the maximum effect of hs_feedback on 
> the primary, in a useful unit. Numbers of xids are not a useful unit (100k 
> xids is forever on some systems, a few minutes at best on others, the rate is 
> not necessarily that steady when plpgsql exception handlers are used, ...)
> 
> It'd be useful to have such a mapping for other features too. E.g.
> 
>  - making it visible in pg_stat_activity how problematic a long-running xact 
> is - a 3 day old xact that doesn't have an xid assigned and has a recent xmin 
> is fine, it won't prevent vacuum from doing things. But a somewhat recent 
> xact that still has a snapshot from before an old xact was cancelled could be 
> problematic.
> 
> - turn pg_class.relfrozenxid into an understandable timeframe. It's a fair 
> bit of mental effort to classify "370M xids old" into problem/fine (it's e.g. 
> not a problem on a system with a high xid rate, on a big table that takes a 
> bit to vacuum).
> 
> - using the mapping to compute an xid consumption rate IMO would be one 
> building block for smarter AV scheduling. Together with historical vacuum 
> runtimes it'd allow us to start vacuuming early enough to prevent hitting 
> thresholds, adapt pacing, prioritize between tables etc. 

Big +1 to all of this.
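
(For the relfrozenxid point, the raw-xid view one has to interpret today is
something like the following; illustrative query:)

SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 5;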

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add trailing commas to enum definitions

2023-10-23 Thread Nathan Bossart
On Mon, Oct 23, 2023 at 05:55:32PM +0800, Junwang Zhao wrote:
> On Mon, Oct 23, 2023 at 2:37 PM Peter Eisentraut  wrote:
>> Since C99, there can be a trailing comma after the last value in an enum
> 
> C99 allows us to do this doesn't mean we must do this, this is not
> inconsistent IMHO, and this will pollute the git log messages, people
> may *git blame* the file and see the reason for the introduction of the
> line.

I suspect that your concerns about git-blame could be resolved by adding
this commit to .git-blame-ignore-revs.  From a long-term perspective, I
think standardizing on the trailing comma style will actually improve
git-blame because patches won't need to add a comma to the previous line
when adding a value.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: vacuumdb/clusterdb/reindexdb: allow specifying objects to process in all databases

2023-10-23 Thread Nathan Bossart
rebased

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 9618c243cbd3056006acda0136036b432af37830 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 28 Jun 2023 15:12:18 -0700
Subject: [PATCH v2 1/3] vacuumdb: allow specifying tables or schemas to
 process in all databases

---
 src/bin/scripts/t/100_vacuumdb.pl | 24 
 src/bin/scripts/vacuumdb.c| 19 +--
 2 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 925079bbed..52926d53a5 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -188,18 +188,18 @@ $node->command_fails_like(
 	[ 'vacuumdb', '-n', 'pg_catalog', '-N', '"Foo"', 'postgres' ],
 	qr/cannot vacuum all tables in schema\(s\) and exclude schema\(s\) at the same time/,
 	'cannot use options -n and -N at the same time');
-$node->command_fails_like(
-	[ 'vacuumdb', '-a', '-N', '"Foo"' ],
-	qr/cannot exclude specific schema\(s\) in all databases/,
-	'cannot use options -a and -N at the same time');
-$node->command_fails_like(
-	[ 'vacuumdb', '-a', '-n', '"Foo"' ],
-	qr/cannot vacuum specific schema\(s\) in all databases/,
-	'cannot use options -a and -n at the same time');
-$node->command_fails_like(
-	[ 'vacuumdb', '-a', '-t', '"Foo".bar' ],
-	qr/cannot vacuum specific table\(s\) in all databases/,
-	'cannot use options -a and -t at the same time');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-a', '-N', 'pg_catalog' ],
+	qr/(?:(?!VACUUM \(SKIP_DATABASE_STATS\) pg_catalog.pg_class).)*/,
+	'vacuumdb -a -N');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-a', '-n', 'pg_catalog' ],
+	qr/VACUUM \(SKIP_DATABASE_STATS\) pg_catalog.pg_class/,
+	'vacuumdb -a -n');
+$node->issues_sql_like(
+	[ 'vacuumdb', '-a', '-t', 'pg_class' ],
+	qr/VACUUM \(SKIP_DATABASE_STATS\) pg_catalog.pg_class/,
+	'vacuumdb -a -t');
 $node->command_fails_like(
 	[ 'vacuumdb', '-a', '-d', 'postgres' ],
 	qr/cannot vacuum all databases and a specific one at the same time/,
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index d682573dc1..0eddcaa047 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -72,6 +72,7 @@ static void vacuum_one_database(ConnParams *cparams,
 static void vacuum_all_databases(ConnParams *cparams,
  vacuumingOptions *vacopts,
  bool analyze_in_stages,
+ SimpleStringList *objects,
  int concurrentCons,
  const char *progname, bool echo, bool quiet);
 
@@ -378,6 +379,7 @@ main(int argc, char *argv[])
 
 		vacuum_all_databases(&cparams, &vacopts,
 			 analyze_in_stages,
+			 &objects,
 			 concurrentCons,
 			 progname, echo, quiet);
 	}
@@ -429,18 +431,6 @@ check_objfilter(void)
 		(objfilter & OBJFILTER_DATABASE))
 		pg_fatal("cannot vacuum all databases and a specific one at the same time");
 
-	if ((objfilter & OBJFILTER_ALL_DBS) &&
-		(objfilter & OBJFILTER_TABLE))
-		pg_fatal("cannot vacuum specific table(s) in all databases");
-
-	if ((objfilter & OBJFILTER_ALL_DBS) &&
-		(objfilter & OBJFILTER_SCHEMA))
-		pg_fatal("cannot vacuum specific schema(s) in all databases");
-
-	if ((objfilter & OBJFILTER_ALL_DBS) &&
-		(objfilter & OBJFILTER_SCHEMA_EXCLUDE))
-		pg_fatal("cannot exclude specific schema(s) in all databases");
-
 	if ((objfilter & OBJFILTER_TABLE) &&
 		(objfilter & OBJFILTER_SCHEMA))
 		pg_fatal("cannot vacuum all tables in schema(s) and specific table(s) at the same time");
@@ -895,6 +885,7 @@ static void
 vacuum_all_databases(ConnParams *cparams,
 	 vacuumingOptions *vacopts,
 	 bool analyze_in_stages,
+	 SimpleStringList *objects,
 	 int concurrentCons,
 	 const char *progname, bool echo, bool quiet)
 {
@@ -927,7 +918,7 @@ vacuum_all_databases(ConnParams *cparams,
 
 vacuum_one_database(cparams, vacopts,
 	stage,
-	NULL,
+	objects,
 	concurrentCons,
 	progname, echo, quiet);
 			}
@@ -941,7 +932,7 @@ vacuum_all_databases(ConnParams *cparams,
 
 			vacuum_one_database(cparams, vacopts,
 ANALYZE_NO_STAGE,
-NULL,
+objects,
 concurrentCons,
 progname, echo, quiet);
 		}
-- 
2.25.1

>From de22c8c0060512d1a9add03b6eb4265767fb061b Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 28 Jun 2023 15:12:58 -0700
Subject: [PATCH v2 2/3] clusterdb: allow specifying tables to process in all
 databases

---
 src/bin/scripts/clusterdb.c| 28 +-
 src/bin/scripts/t/011_clusterdb_all.pl | 11 ++
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/src/bin/scripts/clusterdb.c b/src/bin/scripts/clusterdb.c
index 65428031c7..89f1e733fe 100644
--- a/src/bin/scripts/clusterdb.c
+++ b/src/bin/scripts/clusterdb.c
@@ -21,8 +21,9 @@
 
 static void cluster_one_database(const ConnParams *cparams, const char *table,
  const char *progname, bool verbose, bool echo);
-static void 

Re: Evaluate arguments of correlated SubPlans in the referencing ExprState

2023-10-23 Thread Alena Rybakina

Hi!

I looked through your patch and noticed that it no longer applied to the
current version of master. I rebased it and attached the result. I
didn't see any problems and, honestly, no big changes were needed; all
regression tests passed.


I think it's better to add a test, but to be honest, I haven't been able 
to come up with something yet.
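
One candidate shape for such a test might be a correlated SubPlan whose
arguments come from the outer query, e.g. (table names made up):

SELECT t1.a,
       (SELECT max(t2.b) FROM t2 WHERE t2.a = t1.a) AS max_b
FROM t1;
-- here t1.a is the argument passed into the SubPlan on each outer row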


--
Regards,
Alena Rybakina
From f7a8ca7f3263fa5f82056f90231cf937133622c9 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 23 Oct 2023 22:54:04 +0300
Subject: [PATCH] WIP: Evaluate arguments of correlated SubPlans in the
 referencing ExprState

---
 src/backend/executor/execExpr.c   | 93 +--
 src/backend/executor/execExprInterp.c | 23 +++
 src/backend/executor/execProcnode.c   |  5 ++
 src/backend/executor/nodeSubplan.c| 30 -
 src/backend/jit/llvm/llvmjit_expr.c   |  6 ++
 src/backend/jit/llvm/llvmjit_types.c  |  1 +
 src/include/executor/execExpr.h   |  6 +-
 src/include/nodes/execnodes.h |  1 -
 8 files changed, 113 insertions(+), 52 deletions(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 2c62b0c9c84..14cd56a2db3 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -88,6 +88,9 @@ static void ExecBuildAggTransCall(ExprState *state, AggState *aggstate,
   FunctionCallInfo fcinfo, AggStatePerTrans pertrans,
   int transno, int setno, int setoff, bool ishash,
   bool nullcheck);
+static void ExecInitSubPlanExpr(SubPlan *subplan,
+ExprState *state,
+Datum *resv, bool *resnull);
 
 
 /*
@@ -1389,7 +1392,6 @@ ExecInitExprRec(Expr *node, ExprState *state,
 		case T_SubPlan:
 			{
 SubPlan*subplan = (SubPlan *) node;
-SubPlanState *sstate;
 
 /*
  * Real execution of a MULTIEXPR SubPlan has already been
@@ -1406,19 +1408,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
 	break;
 }
 
-if (!state->parent)
-	elog(ERROR, "SubPlan found with no parent plan");
-
-sstate = ExecInitSubPlan(subplan, state->parent);
-
-/* add SubPlanState nodes to state->parent->subPlan */
-state->parent->subPlan = lappend(state->parent->subPlan,
- sstate);
-
-scratch.opcode = EEOP_SUBPLAN;
-scratch.d.subplan.sstate = sstate;
-
-ExprEvalPushStep(state, &scratch);
+ExecInitSubPlanExpr(subplan, state, resv, resnull);
 break;
 			}
 
@@ -2750,29 +2740,12 @@ ExecPushExprSetupSteps(ExprState *state, ExprSetupInfo *info)
 	foreach(lc, info->multiexpr_subplans)
 	{
 		SubPlan*subplan = (SubPlan *) lfirst(lc);
-		SubPlanState *sstate;
 
 		Assert(subplan->subLinkType == MULTIEXPR_SUBLINK);
 
-		/* This should match what ExecInitExprRec does for other SubPlans: */
-
-		if (!state->parent)
-			elog(ERROR, "SubPlan found with no parent plan");
-
-		sstate = ExecInitSubPlan(subplan, state->parent);
-
-		/* add SubPlanState nodes to state->parent->subPlan */
-		state->parent->subPlan = lappend(state->parent->subPlan,
-		 sstate);
-
-		scratch.opcode = EEOP_SUBPLAN;
-		scratch.d.subplan.sstate = sstate;
-
 		/* The result can be ignored, but we better put it somewhere */
-		scratch.resvalue = &state->resvalue;
-		scratch.resnull = &state->resnull;
-
-		ExprEvalPushStep(state, &scratch);
+		ExecInitSubPlanExpr(subplan, state,
+			&state->resvalue, &state->resnull);
 	}
 }
 
@@ -4178,3 +4151,57 @@ ExecBuildParamSetEqual(TupleDesc desc,
 
 	return state;
 }
+
+static void
+ExecInitSubPlanExpr(SubPlan *subplan,
+	ExprState *state,
+	Datum *resv, bool *resnull)
+{
+	ExprEvalStep scratch = {0};
+	SubPlanState *sstate;
+	ListCell   *pvar;
+	ListCell   *l;
+
+	if (!state->parent)
+		elog(ERROR, "SubPlan found with no parent plan");
+
+	/*
+	 * Generate steps to evaluate input arguments for the subplan.
+	 *
+	 * We evaluate the argument expressions into ExprState's resvalue/resnull,
+	 * and then use PARAM_SET to update the parameter. We do that, instead of
+	 * evaluating directly into the param, to avoid depending on the pointer
+	 * value remaining stable / being included in the generated expression. No
+	 * danger of conflicts with other uses of resvalue/resnull as storing and
+	 * using the value always is in subsequent steps.
+	 *
+	 * Any calculation we have to do can be done in the parent econtext, since
+	 * the Param values don't need to have per-query lifetime.
+	 */
+	forboth(l, subplan->parParam, pvar, subplan->args)
+	{
+		int			paramid = lfirst_int(l);
+
+		ExecInitExprRec(lfirst(pvar), state,
+		&state->resvalue, &state->resnull);
+
+		scratch.opcode = EEOP_PARAM_SET;
+		scratch.d.param.paramid = paramid;
+		/* type isn't needed, but an old value could be confusing */
+		scratch.d.param.paramtype = InvalidOid;
+		ExprEvalPushStep(state, &scratch);
+	}
+
+	sstate = ExecInitSubPlan(subplan, state->parent);
+
+	/* add SubPlanState nodes to state->parent->subPlan */
+	state->parent->subPlan = lappend(state->parent->subPlan,
+	 sstate);
+
+	scratch.opcode = 

Re: recovery modules

2023-10-23 Thread Nathan Bossart
rebased

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 0934f23773a612051c36070ed1f3b8ab100c4c65 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 15 Feb 2023 14:28:53 -0800
Subject: [PATCH v16 1/5] introduce routine for checking mutually exclusive
 string GUCs

---
 src/backend/postmaster/pgarch.c |  8 +++-
 src/backend/utils/misc/guc.c| 22 ++
 src/include/utils/guc.h |  3 +++
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 46af349564..d94730c50c 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -824,11 +824,9 @@ LoadArchiveLibrary(void)
 {
 	ArchiveModuleInit archive_init;
 
-	if (XLogArchiveLibrary[0] != '\0' && XLogArchiveCommand[0] != '\0')
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("both archive_command and archive_library set"),
- errdetail("Only one of archive_command, archive_library may be set.")));
+	(void) CheckMutuallyExclusiveStringGUCs(XLogArchiveLibrary, "archive_library",
+			XLogArchiveCommand, "archive_command",
+			ERROR);
 
 	/*
 	 * If shell archiving is enabled, use our special initialization function.
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 39d3775e80..847b62e161 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2638,6 +2638,28 @@ ReportGUCOption(struct config_generic *record)
 	pfree(val);
 }
 
+/*
+ * If both parameters are set, emits a log message at 'elevel' and returns
+ * false.  Otherwise, returns true.
+ */
+bool
+CheckMutuallyExclusiveStringGUCs(const char *p1val, const char *p1name,
+ const char *p2val, const char *p2name,
+ int elevel)
+{
+	if (p1val[0] != '\0' && p2val[0] != '\0')
+	{
+		ereport(elevel,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot set both %s and %s", p1name, p2name),
+ errdetail("Only one of %s or %s may be set.",
+		   p1name, p2name)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * Convert a value from one of the human-friendly units ("kB", "min" etc.)
  * to the given base unit.  'value' and 'unit' are the input value and unit
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 902e57c0ca..e4eef0da81 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -372,6 +372,9 @@ extern int	NewGUCNestLevel(void);
 extern void AtEOXact_GUC(bool isCommit, int nestLevel);
 extern void BeginReportingGUCOptions(void);
 extern void ReportChangedGUCOptions(void);
+extern bool CheckMutuallyExclusiveStringGUCs(const char *p1val, const char *p1name,
+			 const char *p2val, const char *p2name,
+			 int elevel);
 extern void ParseLongOption(const char *string, char **name, char **value);
 extern const char *get_config_unit_name(int flags);
 extern bool parse_int(const char *value, int *result, int flags,
-- 
2.25.1

>From 7f6a8d792620b4d5ae097eb43f25be69f89f17c6 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 15 Feb 2023 10:36:00 -0800
Subject: [PATCH v16 2/5] refactor code for restoring via shell

---
 src/backend/Makefile  |   2 +-
 src/backend/access/transam/timeline.c |  12 +-
 src/backend/access/transam/xlog.c |  50 -
 src/backend/access/transam/xlogarchive.c  | 167 ---
 src/backend/access/transam/xlogrecovery.c |   3 +-
 src/backend/meson.build   |   1 +
 src/backend/postmaster/startup.c  |  16 +-
 src/backend/restore/Makefile  |  18 ++
 src/backend/restore/meson.build   |   5 +
 src/backend/restore/shell_restore.c   | 245 ++
 src/include/Makefile  |   2 +-
 src/include/access/xlogarchive.h  |   9 +-
 src/include/meson.build   |   1 +
 src/include/postmaster/startup.h  |   1 +
 src/include/restore/shell_restore.h   |  26 +++
 src/tools/pgindent/typedefs.list  |   1 +
 16 files changed, 407 insertions(+), 152 deletions(-)
 create mode 100644 src/backend/restore/Makefile
 create mode 100644 src/backend/restore/meson.build
 create mode 100644 src/backend/restore/shell_restore.c
 create mode 100644 src/include/restore/shell_restore.h

diff --git a/src/backend/Makefile b/src/backend/Makefile
index 3e275ac759..6913601c36 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -19,7 +19,7 @@ include $(top_builddir)/src/Makefile.global
 SUBDIRS = access archive backup bootstrap catalog parser commands executor \
 	foreign lib libpq \
 	main nodes optimizer partitioning port postmaster \
-	regex replication rewrite \
+	regex replication restore rewrite \
 	statistics storage tcop tsearch utils $(top_builddir)/src/timezone \
 	jit
 
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index 94e152694e..b3e1c18c65 100644
--- 

Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Pavel Stehule
po 23. 10. 2023 v 19:34 odesílatel Tom Lane  napsal:

> I wrote:
> > Given the exception the spec makes for CAST, I wonder if we shouldn't
> > just say "NULL is a valid value of every domain type, as well as every
> > base type.  If you don't like it, too bad; write a separate NOT NULL
> > constraint for your table column."
>
> After ruminating on this for awhile, here's a straw-man proposal:
>
> 1. Domains are data types, with the proviso that NULL is always
> a valid value no matter what the domain constraints might say.
> Implementation-wise, this'd just require that CoerceToDomain
> immediately return any null input without checking the constraints.
> This has two big attractions:
>
> (1A) It satisfies the plain language of the SQL spec about how
> CAST to a domain type behaves.
>
> (1B) It legitimizes our behavior of allowing nullable outer join
> columns, sub-SELECT outputs, etc to be considered to be of the
> source column's domain type and not just the base type.
>
> 2. In INSERT and UPDATE queries, thumb through the constraints of
> any domain-typed target columns to see if any of them are NOT NULL
> or CHECK(VALUE IS NOT NULL).  If so, act as though there's a table
> NOT NULL constraint on that column.
>

+1

I think only this interpretation makes sense.


> The idea of point #2 is to have a cheap check that 99% satisfies
> what the spec says about not-null constraints on domains.  If we
> don't do #2, I think we have to fully recheck all the domain's
> constraints during column assignment.  I find that ugly as well
> as expensive performance-wise.  It does mean that if you have
> some domain constraint that would act to reject NULLs, but it's
> spelled in some weird way, it won't reject NULLs.  I don't find
> that possibility compelling enough to justify the performance hit
> of recomputing every constraint just in case it acts like that.
>
> 3. Left unsaid here is whether we should treat assignments to,
> e.g., plpgsql variables as acting like assignments to table
> columns.  I'm inclined not to, because
>
> (3A) I'm lazy, and I'm also worried that we'd miss places where
> this arguably should happen.
>
> (3B) I don't think the SQL spec contemplates any such thing
> happening.
>
> (3C) Not doing that means we have a pretty consistent view of
> what the semantics are for "values in flight" within a query.
> Anything that's not stored in a table is "in flight" and so
> can be NULL.
>
> (3D) Again, if you don't like it, there's already ways to attach
> a separate NOT NULL constraint to plpgsql variables.
>

Although I don't fully like it, I think ignoring the NOT NULL constraint
for plpgsql's variables is better than applying it.  Otherwise there can
be issues related to variable initialization.
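
A sketch of the initialization hazard (illustrative):

CREATE DOMAIN dnn AS integer NOT NULL;

DO $$
DECLARE
    v dnn;    -- no initializer, so v starts out NULL
BEGIN
    v := 42;  -- if the domain's NOT NULL were enforced on the implicit
              -- initialization above, we would never get here
END;
$$;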

Regards

Pavel






>
>
> Documenting this in an intelligible fashion might be tricky,
> but explaining the exact spec-mandated behavior wouldn't be
> much fun either.
>
> Thoughts?
>
> regards, tom lane
>
>
>


Re: pg_dump needs SELECT privileges on irrelevant extension table

2023-10-23 Thread Tom Lane
Jacob Champion  writes:
> Is this approach backportable?

The code fix would surely work in the back branches.  Whether the
behavioral change is too big to be acceptable in minor releases
is something I don't have a strong opinion on.

regards, tom lane




Re: Fix output of zero privileges in psql

2023-10-23 Thread David G. Johnston
On Monday, October 23, 2023, Laurenz Albe  wrote:

> On Mon, 2023-10-23 at 08:35 -0700, David G. Johnston wrote:
>

> > along with not translating (none) and (default) and thus making the data
> > contents of these views environment independent.  But minimizing the
> > variance of these commands' output across systems doesn't seem like a
> > design goal that is likely to gain consensus and is excessive when viewed
> > within the framework of these being only for human consumption.
>
> I didn't understand this completely.  You want default privileges displayed
> as "(default)", but are you for or against "\pset null" to have its normal
> effect on the output of backslash commands in all other cases?


I haven’t inspected other cases but to my knowledge we don’t typically
represent non-unknown things using NULL so I’m not expecting other places
to have this representation problem.

I don’t think any of our meta-command outputs should modify pset null.
Left join cases should be considered unknown, represented as NULL, and obey
the user’s setting.

I do believe that we should be against exposing, like in this case, any
internal implementation detail that encodes something (e.g., default
privileges) as NULL in the catalogs, to the user of the psql meta-commands.

I won’t argue that exposing such NULLs is wrong, just that it would be
preferable IME to avoid doing so.  NULL means unknown or not applicable, and
default privileges are neither of those things.  I get why our catalogs choose
such an encoding and agree with it, and users that find the need to consult
the catalogs will need to learn such details.  But we should strive for them
to be able to get by with the psql meta-commands.

David J.


Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Isaac Morland
On Mon, 23 Oct 2023 at 13:40, Tom Lane  wrote:

> I wrote:
> > Given the exception the spec makes for CAST, I wonder if we shouldn't
> > just say "NULL is a valid value of every domain type, as well as every
> > base type.  If you don't like it, too bad; write a separate NOT NULL
> > constraint for your table column."
>
> After ruminating on this for awhile, here's a straw-man proposal:
>

[]


> 3. Left unsaid here is whether we should treat assignments to,
> e.g., plpgsql variables as acting like assignments to table
> columns.  I'm inclined not to, because
>
> (3A) I'm lazy, and I'm also worried that we'd miss places where
> this arguably should happen.
>
> (3B) I don't think the SQL spec contemplates any such thing
> happening.
>
> (3C) Not doing that means we have a pretty consistent view of
> what the semantics are for "values in flight" within a query.
> Anything that's not stored in a table is "in flight" and so
> can be NULL.
>
> (3D) Again, if you don't like it, there's already ways to attach
> a separate NOT NULL constraint to plpgsql variables.
>
>
> Documenting this in an intelligible fashion might be tricky,
> but explaining the exact spec-mandated behavior wouldn't be
> much fun either.


This sounds pretty good.

I'd be OK with only running the CHECK clause on non-NULL values. This would
imply that "CHECK (VALUE NOT NULL)" would have exactly the same effect as
"CHECK (TRUE)" (i.e., no effect). This might seem insane but it avoids a
special case and in any event if somebody wants the NOT NULL behaviour,
they can get it by specifying NOT NULL in the CREATE DOMAIN command.

Then domain CHECK constraints are checked anytime a non-NULL value is
turned into a domain value, and NOT NULL ones are checked only when storing
to a table. CHECK constraints would be like STRICT functions; if the input
is NULL, the implementation is not run and the result is NULL (which for a
CHECK means accept the input).
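
The STRICT-function analogy, spelled out (illustrative):

CREATE FUNCTION is_positive(integer) RETURNS boolean
    STRICT LANGUAGE sql
    AS 'SELECT $1 > 0';

SELECT is_positive(NULL);  -- NULL: the body is never run, just as a domain
                           -- CHECK would silently accept a NULL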

Whether I actually think the above is a good idea would require me to read
carefully the relevant section of the SQL spec. If it agrees that CHECK ()
is for testing non-NULL values and NOT NULL is for saying that columns of
actual tables can't be NULL, then I would probably agree with my own idea,
otherwise perhaps not depending on exactly what it said.

Some possible documentation wording to consider for the CREATE DOMAIN page:

Under "NOT NULL": "Table columns whose data type is this domain may not be
NULL, exactly as if NOT NULL had been given in the column specification."

Under "NULL": "This is a noise word indicating the default, which is that
the domain does not restrict NULL from occurring in table columns whose
data type is this domain."

Under "CHECK (expression)", replacing the first sentence: "CHECK clauses
specify integrity constraints or tests which non-NULL values of the domain
must satisfy; NULLs are never checked by domain CHECK clauses. To use a
domain to prevent a NULL from occurring in a table column, use the NOT NULL
clause."

Also, where it says "Expressions evaluating to TRUE or UNKNOWN succeed": Do
we really mean "Expressions evaluating to TRUE or NULL succeed"?

It would be nice if we had universally agreed terminology so that we would
have one word for the non-NULL things of various data types, and another
word for the possibly NULL things that might occur in variable or column.

If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if
we could pass NULL to the constraint at CREATE DOMAIN time, and if it
returns FALSE, do exactly what we would have done (set pg_type.typnotnull)
if an actual NOT NULL clause had been specified? Then when actually
processing domain constraints during a query, we could use the above
procedure. I'm thinking about more complicated constraints that evaluate to
FALSE for NULL but which are not simply "CHECK (VALUE NOT NULL)".

Is it an error to specify both NULL and NOT NULL? What about CHECK (VALUE
NOT NULL) and NULL?


Re: pg_dump needs SELECT privileges on irrelevant extension table

2023-10-23 Thread Jacob Champion
On Wed, Oct 18, 2023 at 1:25 PM Tom Lane  wrote:
> Stephen Frost  writes:
> > This change would mean that policies added by a user after the extension
> > is created would just be lost by a pg_dump/reload, doesn't it?
>
> Yes.  But I'd say that's unsupported, just like making other ad-hoc
> changes to extension objects is unsupported (and the effects will be
> lost on dump/reload).  We specifically have support for user-added
> ACLs, and that's good, but don't claim that we have support for
> doing the same with policies.

Is this approach backportable?

(Adding Aleks to CC -- Timescale may want to double-check that the new
proposal still works for them.)

Thanks,
--Jacob




Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Daniel Gustafsson
> On 23 Oct 2023, at 18:06, Peter Eisentraut  wrote:
> 
> On 23.10.23 16:26, Tom Lane wrote:
>> Also, since "PGAC_PATH_PROGS(OPENSSL, openssl)" prints the full path to what 
>> it found, you can at least tell after the fact that you are being misled, 
>> because you can cross-check that path against the -L switches being used for 
>> libraries.
> 
> Yeah, that seems ok.

+1, all good points raised, thanks!

--
Daniel Gustafsson





Re: Fix output of zero privileges in psql

2023-10-23 Thread Laurenz Albe
On Mon, 2023-10-23 at 08:35 -0700, David G. Johnston wrote:
> I tend to prefer the argument that these views are for human consumption
> and should be designed with that in mind.

True, although given the shape of ACLs, it takes a somewhat trained human to
consume the string representation.  But we won't be able to hide the fact that
default ACLs are NULL in the catalogs.  We can leave them empty, we can show
them as "(default)" or we can let the user choose with "\pset null".


> I would suggest that we make the expected presence of NULL as an input 
> explicit:
> I would offer up:
> 
> when spcacl is null then '(default)'

Noted.

> along with not translating (none) and (default) and thus making the data
> contents of these views environment independent.  But minimizing the
> variance of these commands' output across systems doesn't seem like a design
> goal that is likely to gain consensus and is excessive when viewed within
> the framework of these being only for human consumption.

I didn't understand this completely.  You want default privileges displayed as
"(default)", but are you for or against "\pset null" to have its normal effect
on the output of backslash commands in all other cases?

Speaking of consensus, it seems to me that Tom, Erik and me are in consensus.

Yours,
Laurenz Albe




Re: Partial aggregates pushdown

2023-10-23 Thread Bruce Momjian
On Wed, Oct 18, 2023 at 05:22:34AM +, 
fujii.y...@df.mitsubishielectric.co.jp wrote:
> Hi hackers.
> 
> Because there is a degrade in pg_dump.c, I fixed it.

Fujii-san, to get this patch closer to finished, can I modify this
version of the patch to improve some wording and post an updated version
you can use for future changes?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: PostgreSQL domains and NOT NULL constraint

2023-10-23 Thread Tom Lane
I wrote:
> Given the exception the spec makes for CAST, I wonder if we shouldn't
> just say "NULL is a valid value of every domain type, as well as every
> base type.  If you don't like it, too bad; write a separate NOT NULL
> constraint for your table column."

After ruminating on this for awhile, here's a straw-man proposal:

1. Domains are data types, with the proviso that NULL is always
a valid value no matter what the domain constraints might say.
Implementation-wise, this'd just require that CoerceToDomain
immediately return any null input without checking the constraints.
This has two big attractions:

(1A) It satisfies the plain language of the SQL spec about how
CAST to a domain type behaves.

(1B) It legitimizes our behavior of allowing nullable outer join
columns, sub-SELECT outputs, etc to be considered to be of the
source column's domain type and not just the base type.

2. In INSERT and UPDATE queries, thumb through the constraints of
any domain-typed target columns to see if any of them are NOT NULL
or CHECK(VALUE IS NOT NULL).  If so, act as though there's a table
NOT NULL constraint on that column.

The idea of point #2 is to have a cheap check that 99% satisfies
what the spec says about not-null constraints on domains.  If we
don't do #2, I think we have to fully recheck all the domain's
constraints during column assignment.  I find that ugly as well
as expensive performance-wise.  It does mean that if you have
some domain constraint that would act to reject NULLs, but it's
spelled in some weird way, it won't reject NULLs.  I don't find
that possibility compelling enough to justify the performance hit
of recomputing every constraint just in case it acts like that.
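
For instance (a sketch of the proposed behavior):

CREATE DOMAIN dpos AS integer
    CHECK (VALUE IS NOT NULL AND VALUE > 0);
-- this rejects NULL, but is spelled "weirdly": it is neither NOT NULL nor
-- CHECK(VALUE IS NOT NULL), so point #2's cheap check would let an
-- INSERT of NULL into a dpos column through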

3. Left unsaid here is whether we should treat assignments to,
e.g., plpgsql variables as acting like assignments to table
columns.  I'm inclined not to, because

(3A) I'm lazy, and I'm also worried that we'd miss places where
this arguably should happen.

(3B) I don't think the SQL spec contemplates any such thing
happening.

(3C) Not doing that means we have a pretty consistent view of
what the semantics are for "values in flight" within a query.
Anything that's not stored in a table is "in flight" and so
can be NULL.

(3D) Again, if you don't like it, there's already ways to attach
a separate NOT NULL constraint to plpgsql variables.


Documenting this in an intelligible fashion might be tricky,
but explaining the exact spec-mandated behavior wouldn't be
much fun either.

Thoughts?

regards, tom lane




Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Peter Eisentraut

On 23.10.23 16:26, Tom Lane wrote:
Also, since "PGAC_PATH_PROGS(OPENSSL, openssl)" prints the full path to 
what it found, you can at least tell after the fact that you are being 
misled, because you can cross-check that path against the -L switches 
being used for libraries.


Yeah, that seems ok.





Re: trying again to get incremental backup

2023-10-23 Thread Robert Haas
On Fri, Oct 20, 2023 at 11:30 AM David Steele  wrote:
> Then I'm fine with it as is.

OK, thanks.

> In my view this feature puts the cart way before the horse. I'd think
> higher priority features might be parallelism, a backup repository,
> expiration management, archiving, or maybe even a restore command.
>
> It seems the only goal here is to make pg_basebackup a tool for external
> backup software to use, which might be OK, but I don't believe this
> feature really advances pg_basebackup as a usable piece of stand-alone
> software. If people really think that start/stop backup is too
> complicated an interface how are they supposed to track page
> incrementals and get them to a place where pg_combinebackup can put them
> backup together? If automation is required to use this feature,
> shouldn't pg_basebackup implement that automation?
>
> I have plenty of thoughts about the implementation as well, but I have a
> lot on my plate right now and I don't have time to get into it.
>
> I don't plan to stand in your way on this feature. I'm reviewing what
> patches I can out of courtesy and to be sure that nothing adjacent to
> your work is being affected. My apologies if my reviews are not meeting
> your expectations, but I am contributing as my time constraints allow.

Sorry, I realize reading this response that I probably didn't do a
very good job writing that email and came across sounding like a jerk.
Possibly, I actually am a jerk. Whether it just sounded like it or I
actually am, I apologize. But your last paragraph here gets at my real
question, which is whether you were going to try to block the feature.
I recognize that we have different priorities when it comes to what
would make most sense to implement in PostgreSQL, and that's OK, or at
least, it's OK with me. I also don't have any particular expectation
about how much you should review the patch or in what level of detail,
and I have sincerely appreciated your feedback thus far. If you are
able to continue to provide more, that's great, and if that's not,
well, you're not obligated. What I was concerned about was whether
that review was a precursor to a vigorous attempt to keep the main
patch from getting committed, because if that was going to be the
case, then I'd like to surface that conflict sooner rather than later.
It sounds like that's not an issue, which is great.

At the risk of drifting into the fraught question of what I *should*
be implementing rather than the hopefully-less-fraught question of
whether what I am actually implementing is any good, I see incremental
backup as a way of removing some of the use cases for the low-level
backup API. If you said "but people still will have lots of reasons to
use it," I would agree; and if you said "people can still screw things
up with pg_basebackup," I would also agree. Nonetheless, most of the
disasters I've personally seen have stemmed from the use of the
low-level API rather than from the use of pg_basebackup, though there
are exceptions. I also think a lot of the use of the low-level API is
driven by it being just too darn slow to copy the whole database, and
incremental backup can help with that in some circumstances. Also, I
have worked fairly hard to try to make sure that if you misuse
pg_combinebackup, or fail to use it altogether, you'll get an error
rather than silent data corruption. I would be interested to hear
about scenarios where the checks that I've implemented can be defeated
by something that is plausibly described as stupidity rather than
malice. I'm not sure we can fix all such cases, but I'm very alert to
the horror that will befall me if user error looks exactly like a bug
in the code. For my own sanity, we have to be able to distinguish
those cases. Moreover, we also need to be able to distinguish
backup-time bugs from reassembly-time bugs, which is why I've got the
pg_walsummary tool, and why pg_combinebackup has the ability to emit
fairly detailed debugging output. I anticipate those things being
useful in investigating bug reports when they show up. I won't be too
surprised if it turns out that more work on sanity-checking and/or
debugging tools is needed, but I think your concern about people
misusing stuff is bang on target and I really want to do whatever we
can to avoid that when possible and detect it when it happens.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix output of zero privileges in psql

2023-10-23 Thread David G. Johnston
On Mon, Oct 23, 2023 at 7:57 AM Tom Lane  wrote:

>
> IOW, the current definition is "NULL privileges print as an empty
> string no matter what", and I don't think that serves to reduce
> confusion about whether an ACL is NULL or not.  We ought to be doing
> what we can to make clear that such an ACL *is* NULL, because
> otherwise people won't understand how it differs from an empty ACL.
>
>
I tend to prefer the argument that these views are for human consumption
and should be designed with that in mind.  Allowing the user to decide
whether they prefer NULL to print as the empty string or something else
works within that framework.  And the change of behavior for everyone with
a non-default \pset gets accepted under that framework as well.

I would suggest that we make the expected presence of NULL as an input
explicit:

case when spcacl is null then null
     -- '(none)', so as not to confuse an empty ACL with null, which is
     -- also printed as an empty string
     when cardinality(spcacl) = 0 then '(none)'
     else array_to_string(spcacl, E'\\n')
end as "Access privileges"

I would offer up:

when spcacl is null then '(default)'

along with not translating (none) and (default) and thus making the data
contents of these views environment independent.  But minimizing the
variance of these commands' output across systems doesn't seem like a
design goal that is likely to gain consensus, and is excessive when viewed
within the framework of these being only for human consumption.

David J.


Re: LLVM 16 (opaque pointers)

2023-10-23 Thread Mark Wong
On Mon, Oct 23, 2023 at 01:15:04PM +1300, Thomas Munro wrote:
> On Sun, Oct 22, 2023 at 2:44 PM Thomas Munro  wrote:
> > On Sat, Oct 21, 2023 at 2:45 PM Tom Lane  wrote:
> > > Thomas Munro  writes:
> > > > (It'd be nice if the
> > > > build farm logged "$LLVM_CONFIG --version" somewhere.)
> > >
> > > It's not really the buildfarm script's responsibility to do that,
> > > but feel free to make configure do so.
> >
> > Done, copying the example of how we do it for perl and various other things.
> 
> Build farm measles update:
> 
> With that we can see that nicator (LLVM 15 on POWER) is green.  We can
> see that cavefish (LLVM 6 on POWER) is red as expected.  We can also
> see that bonito (LLVM 7 on POWER) is red, so my earlier theory that
> this might be due to the known bug we got fixed in LLVM 7 is not
> enough.  Maybe there are other things fixed on POWER somewhere between
> those LLVM versions?  I suspect it'll be hard to figure out without
> debug builds and backtraces.

I haven't gotten around to disabling llvm on any of my animals with llvm
< 7 yet.  Do you still want me to hold off on that?

> One thing is definitely our fault, though.  xenodermus shows failures
> on REL_12_STABLE and REL_13_STABLE, using debug LLVM 6 on x86.  I
> couldn't reproduce this locally on (newer) debug LLVM, so I bugged
> Andres for access to the host/libraries and chased it down.  There is
> some type punning for a function parameter in REL_13_STABLE and earlier,
> removed by Andres in REL_14_STABLE, and when I back-patched my
> refactoring I effectively back-patched a few changes from his commit
> df99ddc70b97 that removed the type punning, but I should have brought
> one more line from that commit to remove another trace of it.  See
> attached.

Here is my list of llvm-config versions and distros for s390x and POWER
(I didn't see any issues on aarch64, but I grabbed all the info at the
same time).

s390x:

branta: 10.0.0 Ubuntu 20.04.4 LTS
cotinga: 6.0.0 Ubuntu 18.04.6 LTS
perch: 6.0.0 Ubuntu 18.04.6 LTS
sarus: 14.0.0 Ubuntu 22.04.1 LTS
aracari: 15.0.7 Red Hat Enterprise Linux 8.6
pipit: 15.0.7 Red Hat Enterprise Linux 8.6
lora: 15.0.7 Red Hat Enterprise Linux 9.2
mamushi: 15.0.7 Red Hat Enterprise Linux 9.2
pike: 11.0.1 Debian GNU/Linux 11
rinkhals: 11.0.1 Debian GNU/Linux 11


POWER:

bonito: 7.0.1 Fedora 29
cavefish: 6.0.0 Ubuntu 18.04.6 LTS
demoiselle: 5.0.1 openSUSE Leap 15.0
elasmobranch: 5.0.1 openSUSE Leap 15.0
babbler: 15.0.7 AlmaLinux 8.8
pytilia: 15.0.7 AlmaLinux 8.8
nicator: 15.0.7 AlmaLinux 9.2
twinspot: 15.0.7 AlmaLinux 9.2
cascabel: 11.0.1 Debian GNU/Linux 11
habu: 16.0.6 Fedora Linux 38
kingsnake: 16.0.6 Fedora Linux 38
krait: 16.0.6 CentOS Stream 8
lancehead: 16.0.6 CentOS Stream 8


aarch64:

boiga: 14.0.5 Fedora Linux 36
corzo: 14.0.5 Fedora Linux 36
desman: 16.0.6 Fedora Linux 38
motmot: 16.0.6 Fedora Linux 38
whinchat: 11.0.1 Debian GNU/Linux 11
jackdaw: 11.0.1 Debian GNU/Linux 11
blackneck: 7.0.1 Debian GNU/Linux 10
alimoche: 7.0.1 Debian GNU/Linux 10
bulbul: 15.0.7 AlmaLinux 8.8
broadbill: 15.0.7 AlmaLinux 8.8
oystercatcher: 15.0.7 AlmaLinux 9.2
potoo: 15.0.7 AlmaLinux 9.2
whiting: 6.0.0 Ubuntu 18.04.5 LTS
vimba: 6.0.0 Ubuntu 18.04.5 LTS
splitfin: 10.0.0 Ubuntu 20.04.6 LTS
rudd: 10.0.0 Ubuntu 20.04.6 LTS
turbot: 14.0.0 Ubuntu 22.04.3 LTS
shiner: 14.0.0 Ubuntu 22.04.3 LTS
ziege: 16.0.6 CentOS Stream 8
chevrotain: 11.0.1 Debian GNU/Linux 11

Regards,
Mark




Re: Fix output of zero privileges in psql

2023-10-23 Thread Tom Lane
"David G. Johnston"  writes:
> We document default privileges print as an empty string - I do not think we
> should change the definition to "default privileges print null which can be
> controlled via \pset null", and regardless of preference doing so is not a
> bug.

Well, "if it's documented it's not a bug" is a defensible argument
for calling something not a bug, but it doesn't address the question
of whether changing it would be an improvement.  I think it would be,
and I object to your claim that we have a consensus to not do that.

The core of the problem here, IMO, is exactly that there is confusion
between whether a visibly empty string represents NULL (default
privileges) or an empty ACL (no privileges).  I believe we have agreed
that printing "(none)" or some other clearly-not-an-ACL-entry string
for the second case is an improvement, even though that's a change
in behavior.  That doesn't mean that changing the other case can't
also be an improvement.  I think it'd be less confusing all around
if this instance of NULL prints like other instances of NULL.

IOW, the current definition is "NULL privileges print as an empty
string no matter what", and I don't think that serves to reduce
confusion about whether an ACL is NULL or not.  We ought to be doing
what we can to make clear that such an ACL *is* NULL, because
otherwise people won't understand how it differs from an empty ACL.
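
For anyone who wants to see the difference, the two states are easy to
produce at the catalog level:

    CREATE TABLE acl_demo (c int);

    -- default privileges: the ACL column is NULL
    SELECT relacl FROM pg_class WHERE relname = 'acl_demo';

    -- empty privileges: revoking everything leaves a zero-length
    -- aclitem[], which is a different thing than NULL
    REVOKE ALL ON acl_demo FROM CURRENT_USER, PUBLIC;
    SELECT relacl, cardinality(relacl)
    FROM pg_class WHERE relname = 'acl_demo';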

regards, tom lane




Re: Casual Meson fixups

2023-10-23 Thread Andrew Dunstan



On 2023-10-22 Su 20:59, Andres Freund wrote:

Hi,

On 2023-09-01 11:31:07 -0500, Tristan Partin wrote:

Muon development has slowed quite a bit this year. Postgres is probably the
largest project which tries its best to support Muon. It seems like if we
want to keep supporting Muon, we should get a buildfarm machine to use it
instead of Meson to catch regressions. OR we should contemplate removing
support for it.

I found it to be quite useful to find bugs in the meson.build files...



I agree with Tristan that if we are going to use it then we should have 
a buildfarm animal that does too.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Fix output of zero privileges in psql

2023-10-23 Thread David G. Johnston
On Monday, October 23, 2023, Laurenz Albe  wrote:

> On Mon, 2023-10-23 at 07:03 -0700, David G. Johnston wrote:
> > On Monday, October 23, 2023, Laurenz Albe  wrote:
> > >
> > >   --- a/src/bin/psql/describe.c
> > >   +++ b/src/bin/psql/describe.c
> > >   @@ -6718,7 +6680,13 @@ static void
> > >    printACLColumn(PQExpBuffer buf, const char *colname)
> > >    {
> > >       appendPQExpBuffer(buf,
> > >   -                     "pg_catalog.array_to_string(%s, E'\\n') AS \"%s\"",
> > >   +                     "CASE\n"
> > >   +                     "  WHEN %s IS NULL THEN ''\n"
> > >   +                     "  WHEN pg_catalog.cardinality(%s) = 0 THEN '%s'\n"
> > >   +                     "  ELSE pg_catalog.array_to_string(%s, E'\\n')\n"
> > >   +                     "END AS \"%s\"",
> > >   +                     colname,
> > >   +                     colname, gettext_noop("(none)"),
> > >                         colname, gettext_noop("Access privileges"));
> > >    }
> > >
> > > This erroneously displays NULL as empty string and subverts my changes.
> > > I have removed the first branch of the CASE expression.
> >
> > There is no error here, the current consensus which this patch
> > implements is to not change the documented “default privileges display
> > as blank”.  We are solving the empty privileges are not distinguishable
> > complaint by printing (none) for them.
>
> Erik's latest patch included my changes to display NULL as NULL in psql,
> so that "\pset null" works as expected.
>

No, per the commit message, issuing an explicit \pset null is a kludge and
it gets rid of the hack in favor of making the query itself produce an
empty string.  i.e., we choose a poor implementation to get the documented
behavior and we are cleaning that up as an aside to the main fix.

Changing the behavior so that default privileges print in correspondence to
the setting of \pset null is, IME, off the table for this patch.  Its one
and only goal is to reliably distinguish empty and default privileges.
That is our extant bug.

We document default privileges print as an empty string - I do not think we
should change the definition to "default privileges print null which can be
controlled via \pset null", and regardless of preference doing so is not a
bug.

David J.


Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Tom Lane
Daniel Gustafsson  writes:
> On 23 Oct 2023, at 08:22, Peter Eisentraut  wrote:
>> The problem is that the binary might not match the library, so this could be 
>> very misleading.  Also, meson gets the version via pkg-config, so the result 
>> would also be inconsistent with meson.  I am afraid this approach would be 
>> unreliable in the really interesting cases.

> I tend to agree with this, it would be preferable to be consistent with meson
> if possible/feasible.

The configure script doesn't use pkg-config to find OpenSSL, so that
would also risk a misleading result.  I'm inclined to guess that
believing "openssl version" would be less likely to give a wrong
answer than believing "pkg-config --modversion openssl".  The former
amounts to assuming that your PATH is consistent with whatever you
set as the include and library search paths.  The latter, well,
hasn't got any principle at all, because we aren't consulting
pkg-config for this.

Also, since "PGAC_PATH_PROGS(OPENSSL, openssl)" prints the full path
to what it found, you can at least tell after the fact that you
are being misled, because you can cross-check that path against
the -L switches being used for libraries.  I don't think there's
any equivalent sanity check available for what pkg-config tells us,
if we're not consulting that for the library location.

meson.build may be fine here --- I suppose the reason that gets
the version via pkg-config is that it also gets other build details
from there.  It's slightly worrisome that the autoconf and meson
build systems might choose different openssl installations, but
that's possibly true of lots of our dependencies.

Another angle worth considering is that "openssl version" provides
more information.  On my RHEL8 box:

$ pkg-config --modversion openssl
1.1.1k
$ openssl version
OpenSSL 1.1.1k  FIPS 25 Mar 2021

On my laptop using MacPorts:

$ pkg-config --modversion openssl
3.1.3
$ openssl version
OpenSSL 3.1.3 19 Sep 2023 (Library: OpenSSL 3.1.3 19 Sep 2023)

regards, tom lane




Re: Allow ALTER SYSTEM SET on unrecognized custom GUCs

2023-10-23 Thread Andrew Dunstan



On 2023-10-16 Mo 20:19, Tom Lane wrote:

Currently we have this odd behavior (for a superuser):

regression=# ALTER SYSTEM SET foo.bar TO 'baz';
ERROR:  unrecognized configuration parameter "foo.bar"
regression=# SET foo.bar TO 'baz';
SET
regression=# ALTER SYSTEM SET foo.bar TO 'baz';
ALTER SYSTEM

That is, you can't ALTER SYSTEM SET a random custom GUC unless there
is already a placeholder GUC for it, because the find_option call in
AlterSystemSetConfigFile fails.  This is surely pretty inconsistent.
Either the first ALTER SYSTEM SET ought to succeed, or the second one
ought to fail too, because we don't have any more knowledge about the
custom GUC than we did before.

In the original discussion about this [1], I initially leaned towards
"they should both fail", but I reconsidered: there doesn't seem to be
any harm in allowing ALTER SYSTEM SET to succeed for any custom GUC
name, as long as you're superuser.

Hence, attached is a patch for that.  Much of it is refactoring to
avoid duplicating the code that checks for a reserved GUC name, which
I think should still be done here --- otherwise, we're losing a lot of
the typo detection that that check was intended to provide.  (That is,
if you have loaded an extension that defines "foo" as a prefix, we
should honor the extension's opinion about whether "foo.bar" is
valid.)  I also fixed the code for GRANT ON PARAMETER so that it
follows the same rules and throws the same errors for invalid cases.

There's a chunk of AlterSystemSetConfigFile that now needs indenting
one more tab stop, but I didn't do that yet for ease of review.

Thoughts?





Haven't read the patch but in principle I agree.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Is this a problem in GenericXLogFinish()?

2023-10-23 Thread Robert Haas
On Fri, Oct 20, 2023 at 12:28 PM Jeff Davis  wrote:
> I still have a question though: if a buffer is exclusive-locked,
> unmodified and clean, and then the caller registers it and later does
> PageSetLSN (just as if it were dirty), is that a definite bug?
>
> There are a couple callsites where the control flow is complex enough
> that it's hard to be sure the buffer is always marked dirty before
> being registered (like in log_heap_visible(), as I mentioned upthread).
> But those callsites are all doing PageSetLSN, unlike the hash index
> case.

I think that would be a bug. I think it would be OK to just change the
page LSN and nothing else, but that requires calling MarkBufferDirty()
at some point. If you only call PageSetLSN() and never
MarkBufferDirty(), that sounds messed up: either the former should be
skipped, or the latter should be added. We shouldn't modify a shared
buffer without marking it dirty.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: controlling meson's parallelism (and some whining)

2023-10-23 Thread Robert Haas
On Fri, Oct 20, 2023 at 1:08 PM Tristan Partin  wrote:
> You will see this in the 1.3.0 release which will be happening soon™️.

Cool, thanks!

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: controlling meson's parallelism (and some whining)

2023-10-23 Thread Robert Haas
On Thu, Oct 19, 2023 at 6:09 PM Andres Freund  wrote:
> Hm. Did you not run into similar issues with make check-world? I found the
> concurrency of that to be even more variable over a run.

I did not, but I didn't generally run that in parallel, either, mostly
for fear of being unable to see failures properly in the output. I
used -j8 when building, though.

> But perhaps there's something else wrong here? Perhaps we should deal with
> this in Cluster.pm to some degree? Controlling this from the level of meson
> (or make/prove for that matter) doesn't really work well, because different
> tests start differently many postgres instances.

I'm not sure, but I'm open to however anybody would like to improve things.

> How many cores does your machine have? I've run the tests in a loop on my m1
> mac mini in the past without running into this issue. It has "only" 8 cores
> though, whereas I infer, from you mentioning -j8, that you have more cores?

System Information shows "Total Number of Cores: 8" but sysctl hw.ncpu
returns 16. No real idea what is fastest, I haven't been real
scientific about choosing values for -j.

> > My next thought was that there ought to be some environmental variable
> > that I could set to control this behavior. But I can't find a list of
> > environment variables that affect meson behavior anywhere. I guess the
> > authors don't believe in environment variable as a control mechanism.
>
> They indeed do not like them - but there is one in this
> case: MESON_TESTTHREADS
>
> There's even documentation for it: 
> https://mesonbuild.com/Unit-tests.html#parallelism

I mean, I probably glanced at that page at some point, but it's hardly
obvious that there's a mention of an environment variable buried
somewhere in the middle of the page. Most of the code you see looking
at the page is Python, and the other environment variables mentioned
seem to be ones that it sets, rather than ones that you can set. They
really ought to have a documentation page somewhere that lists all of
the environment variables that the user can set, and maybe another one
that lists all the ones that the tool itself sets before running
subprocesses. You can't expect people to navigate through every page
of the documentation and read every word on the page carefully to find
stuff like this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix output of zero privileges in psql

2023-10-23 Thread Laurenz Albe
On Mon, 2023-10-23 at 07:03 -0700, David G. Johnston wrote:
> On Monday, October 23, 2023, Laurenz Albe  wrote:
> > 
> >   --- a/src/bin/psql/describe.c
> >   +++ b/src/bin/psql/describe.c
> >   @@ -6718,7 +6680,13 @@ static void
> >    printACLColumn(PQExpBuffer buf, const char *colname)
> >    {
> >       appendPQExpBuffer(buf,
> >   -                     "pg_catalog.array_to_string(%s, E'\\n') AS \"%s\"",
> >   +                     "CASE\n"
> >   +                     "  WHEN %s IS NULL THEN ''\n"
> >   +                     "  WHEN pg_catalog.cardinality(%s) = 0 THEN '%s'\n"
> >   +                     "  ELSE pg_catalog.array_to_string(%s, E'\\n')\n"
> >   +                     "END AS \"%s\"",
> >   +                     colname,
> >   +                     colname, gettext_noop("(none)"),
> >                         colname, gettext_noop("Access privileges"));
> >    }
> > 
> > This erroneously displays NULL as empty string and subverts my changes.
> > I have removed the first branch of the CASE expression.
> 
> There is no error here, the current consensus which this patch implements
> is to not change the documented “default privileges display as blank”.
> We are solving the empty privileges are not distinguishable complaint by
> printing (none) for them.

Erik's latest patch included my changes to display NULL as NULL in psql,
so that "\pset null" works as expected.
But he left the above hunk from his original patch, and that hunk replaces
NULL with an empty string, so "\pset null" wouldn't work for the display
of default privileges.

He didn't notice it because he didn't have a regression test that displays
default privileges with non-empty "\pset null".

Yours,
Laurenz Albe




Re: Fix output of zero privileges in psql

2023-10-23 Thread David G. Johnston
On Monday, October 23, 2023, Laurenz Albe  wrote:

>
>   --- a/src/bin/psql/describe.c
>   +++ b/src/bin/psql/describe.c
>   @@ -6718,7 +6680,13 @@ static void
>    printACLColumn(PQExpBuffer buf, const char *colname)
>    {
>       appendPQExpBuffer(buf,
>   -                     "pg_catalog.array_to_string(%s, E'\\n') AS \"%s\"",
>   +                     "CASE\n"
>   +                     "  WHEN %s IS NULL THEN ''\n"
>   +                     "  WHEN pg_catalog.cardinality(%s) = 0 THEN '%s'\n"
>   +                     "  ELSE pg_catalog.array_to_string(%s, E'\\n')\n"
>   +                     "END AS \"%s\"",
>   +                     colname,
>   +                     colname, gettext_noop("(none)"),
>                         colname, gettext_noop("Access privileges"));
>    }
>
> This erroneously displays NULL as empty string and subverts my changes.
> I have removed the first branch of the CASE expression.
>

There is no error here, the current consensus which this patch implements
is to not change the documented “default privileges display as blank”.  We
are solving the empty privileges are not distinguishable complaint by
printing (none) for them.

David J.


Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-23 Thread Noah Misch
On Mon, Oct 16, 2023 at 11:21:20PM -0700, Donghang Lin wrote:
> > I've also caught btree posting lists where one TID refers to a '1d' heap
> > tuple, while another TID refers to a '24h' heap tuple.  amcheck complains.
> > Index-only scans can return the '1d' bits where the actual tuple had
> > the '24h' bits.
>
> I have a build without the patch.  I can't reproduce amcheck complaints
> in release mode, where all these statements succeed.

The queries I shared don't create the problematic structure, just an assertion
failure.

> >  * Generic "equalimage" support function.
> >  *
> >  * B-Tree operator classes whose equality function could safely be
> >  * replaced by datum_image_eq() in all cases can use this as their
> >  * "equalimage" support function.
> It seems to me that as long as a data type has a deterministic sort,
> but is not necessarily equalimage, it should be able to support
> deduplication.  E.g. for the interval type, we could add a byte-wise
> tiebreaker after '24h' and '1day' compare equal.  In the btree, '24h'
> and '1day' are still adjacent, and '1day' always sorts before '24h' in
> a btree page; can we do dedup for each value without problem?

Yes.  I'm not aware of correctness obstacles arising if one did that.
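
The hazard is easy to see in psql: the two values compare equal, but
they are not image-equal, so replacing the equality function with
datum_image_eq() would be unsafe:

    SELECT '1 day'::interval = '24 hours'::interval;  -- true

    SELECT EXTRACT(day FROM '1 day'::interval) AS one_day,
           EXTRACT(day FROM '24 hours'::interval) AS twenty_four_hours;
    -- returns 1 and 0: the internal representations differ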




Re: Synchronizing slots from primary to standby

2023-10-23 Thread shveta malik
On Mon, Oct 23, 2023 at 5:52 PM Drouvot, Bertrand
 wrote:
>
> Hi,
>
> On 10/20/23 5:27 AM, shveta malik wrote:
> > On Wed, Oct 18, 2023 at 4:24 PM Amit Kapila  wrote:
> >
> > PFA v25 patch set. The changes are:
> >
> > 1) 'enable_failover' is changed to 'failover'
> > 2) Alter subscription changes to support 'failover'
> > 3) Fixes a bug in patch001 wherein any change in standby_slot_names
> > was not considered in the flow where logical walsenders wait for
> > standby's confirmation. Now during the wait, if standby_slot_names is
> > changed, wait is restarted using new standby_slot_names.
> > 4) Addresses comments by Bertrand and Amit in [1],[2],[3]
> >
> > The changes are mostly in patch001 and a very few in patch002.
> >
> > Thank You Ajin for working on alter-subscription changes and adding
> > more TAP-tests for 'failover'
> >
>
> Thanks for updating the patch!
>
> Looking at 0001 and doing some experiment:
>
> Creating a logical slot with failover = true and then launching
> pg_logical_slot_get_changes() or pg_recvlogical() on it results
> to setting failover back to false.
>
> It occurs while creating the decoding context here:
>
> @@ -602,6 +602,9 @@ CreateDecodingContext(XLogRecPtr start_lsn,
>  SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
>  }
>
> +   /* set failover in the slot, as requested */
> +   slot->data.failover = ctx->failover;
> +
>
> I think we can get rid of this change in CreateDecodingContext().
>

Thanks for pointing it out. I will correct it in the next patch.

> Looking at 0002:
>
>  /* Enter main loop */
>  for (;;)
>  {
>  int rc;
>  long        wait_time = DEFAULT_NAPTIME_PER_CYCLE;
>
>  CHECK_FOR_INTERRUPTS();
>
>  /*
>   * If it is Hot standby, then try to launch slot-sync workers else
>   * launch apply workers.
>   */
>  if (RecoveryInProgress())
>  {
>  /* Launch only if we have successfully made the connection */
>  if (wrconn)
>  LaunchSlotSyncWorkers(&wait_time, wrconn);
>  }
>
> We are waiting for DEFAULT_NAPTIME_PER_CYCLE (3 minutes) before checking
> if there are new synced slot(s) to be created on the standby. Do we want
> to keep this behavior for V1?
>

I think for the slotsync workers case, we should reduce the naptime in
the launcher to say 30sec and retain the default one of 3mins for
subscription apply workers. Thoughts?

> Regards,
>
> --
> Bertrand Drouvot
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2023-10-23 Thread Drouvot, Bertrand

Hi,

On 10/20/23 5:27 AM, shveta malik wrote:

On Wed, Oct 18, 2023 at 4:24 PM Amit Kapila  wrote:

PFA v25 patch set. The changes are:

1) 'enable_failover' is changed to 'failover'
2) Alter subscription changes to support 'failover'
3) Fixes a bug in patch001 wherein any change in standby_slot_names
was not considered in the flow where logical walsenders wait for
standby's confirmation. Now during the wait, if standby_slot_names is
changed, wait is restarted using new standby_slot_names.
4) Addresses comments by Bertrand and Amit in [1],[2],[3]

The changes are mostly in patch001 and a very few in patch002.

Thank You Ajin for working on alter-subscription changes and adding
more TAP-tests for 'failover'



Thanks for updating the patch!

Looking at 0001 and doing some experiment:

Creating a logical slot with failover = true and then launching
pg_logical_slot_get_changes() or pg_recvlogical() on it results
to setting failover back to false.

It occurs while creating the decoding context here:

@@ -602,6 +602,9 @@ CreateDecodingContext(XLogRecPtr start_lsn,
SnapBuildSetTwoPhaseAt(ctx->snapshot_builder, start_lsn);
}

+   /* set failover in the slot, as requested */
+   slot->data.failover = ctx->failover;
+

I think we can get rid of this change in CreateDecodingContext().
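
For the archives, this is the kind of session that shows it
(hypothetical names: assuming the patch exposes failover as an argument
to pg_create_logical_replication_slot() and as a column in
pg_replication_slots):

    SELECT pg_create_logical_replication_slot('s1', 'test_decoding',
                                              false,  -- temporary
                                              false,  -- two_phase
                                              true);  -- failover
    SELECT failover FROM pg_replication_slots WHERE slot_name = 's1';  -- t
    SELECT * FROM pg_logical_slot_get_changes('s1', NULL, NULL);
    SELECT failover FROM pg_replication_slots WHERE slot_name = 's1';  -- now f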

Looking at 0002:

/* Enter main loop */
for (;;)
{
int rc;
long        wait_time = DEFAULT_NAPTIME_PER_CYCLE;

CHECK_FOR_INTERRUPTS();

/*
 * If it is Hot standby, then try to launch slot-sync workers else
 * launch apply workers.
 */
if (RecoveryInProgress())
{
/* Launch only if we have successfully made the connection */
if (wrconn)
LaunchSlotSyncWorkers(&wait_time, wrconn);
}

We are waiting for DEFAULT_NAPTIME_PER_CYCLE (3 minutes) before checking
if there are new synced slot(s) to be created on the standby. Do we want
to keep this behavior for V1?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Parent/child context relation in pg_get_backend_memory_contexts()

2023-10-23 Thread Melih Mutlu
Hi,

Thanks for reviewing.
Attached the updated patch v3.

Andres Freund , 12 Eki 2023 Per, 19:23 tarihinde şunu
yazdı:

> > Here is how pg_backend_memory_contexts would look with this patch:
> >
> > postgres=# SELECT name, id, parent, parent_id, path
> > FROM pg_backend_memory_contexts
> > ORDER BY total_bytes DESC LIMIT 10;
> >           name           | id  |      parent      | parent_id |     path
> > -------------------------+-----+------------------+-----------+--------------
> >  CacheMemoryContext      |  27 | TopMemoryContext |         0 | {0}
> >  Timezones               | 124 | TopMemoryContext |         0 | {0}
> >  TopMemoryContext        |   0 |                  |           |
> >  MessageContext          |   8 | TopMemoryContext |         0 | {0}
> >  WAL record construction | 118 | TopMemoryContext |         0 | {0}
> >  ExecutorState           |  18 | PortalContext    |        17 | {0,16,17}
> >  TupleSort main          |  19 | ExecutorState    |        18 | {0,16,17,18}
> >  TransactionAbortContext |  14 | TopMemoryContext |         0 | {0}
> >  smgr relation table     |  10 | TopMemoryContext |         0 | {0}
> >  GUC hash table          | 123 | GUCMemoryContext |       122 | {0,122}
> > (10 rows)
>
> Would we still need the parent_id column?
>

I guess not. Assuming the path column is sorted from TopMemoryContext to
the parent one level above, parent_id can be found using the path column if
needed.
Removed parent_id.
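
For reference, something like this would recover it from the patched
view (assuming, as in the example above, that the last path element is
the parent's id and that TopMemoryContext has a NULL path):

    SELECT name, path[cardinality(path)] AS parent_id
    FROM pg_backend_memory_contexts
    WHERE path IS NOT NULL;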


> > +
> > + 
> > +  
> > +   context_id int4
> > +  
> > +  
> > +   Current context id
> > +  
> > + 
>
> I think the docs here need to warn that the id is ephemeral and will likely
> differ in the next invocation.
>

Done.

> + 
> > +  
> > +   parent_id int4
> > +  
> > +  
> > +   Parent context id
> > +  
> > + 
> > +
> > + 
> > +  
> > +   path int4
> > +  
> > +  
> > +   Path to reach the current context from TopMemoryContext
> > +  
> > + 
>
> Perhaps we should include some hint here how it could be used?
>

I added more explanation but not sure if that is what you asked for. Do you
want a hint that is related to a more specific use case?

> > + length = list_length(path);
> > + datum_array = (Datum *) palloc(length * sizeof(Datum));
> > + length = 0;
> > + foreach(lc, path)
> > + {
> > +     datum_array[length++] = Int32GetDatum((int) lfirst_int(lc));
>
> The "(int)" in front of lfirst_int() seems redundant?
>

Removed.

I think it'd be good to have some minimal test for this. E.g. checking that
> there's multiple contexts below cache memory context or such.
>

Added new tests in sysview.sql.


Stephen Frost , 18 Eki 2023 Çar, 22:53 tarihinde şunu
yazdı:

> I wonder if we should perhaps just include
> "total_bytes_including_children" as another column?  Certainly seems
> like a very useful thing that folks would like to see.  We could do that
> either with C, or even something as simple as changing the view to do
> something like:
>
> WITH contexts AS MATERIALIZED (
>   SELECT * FROM pg_get_backend_memory_contexts()
> )
> SELECT
>   *,
>   coalesce
>   (
> (
>   (SELECT sum(total_bytes) FROM contexts WHERE ARRAY[a.id] <@ path)
>   + total_bytes
> ),
> total_bytes
>   ) AS total_bytes_including_children
> FROM contexts a;
>

I added a "total_bytes_including_children" column as you suggested. Did
that with C since it seemed faster than doing it by changing the view.

-- Calculating total_bytes_including_children by modifying the view
postgres=# select * from pg_backend_memory_contexts ;
Time: 30.462 ms

-- Calculating total_bytes_including_children with C
postgres=# select * from pg_backend_memory_contexts ;
Time: 1.511 ms
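
For example, the patched view now makes "which contexts dominate memory
use, children included" a one-liner (column names as above):

    SELECT name, path, total_bytes_including_children
    FROM pg_backend_memory_contexts
    ORDER BY total_bytes_including_children DESC
    LIMIT 5;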


Thanks,
-- 
Melih Mutlu
Microsoft


v3-0001-Adding-id-parent_id-into-pg_backend_memory_contex.patch
Description: Binary data


Re: Removing unneeded self joins

2023-10-23 Thread Alexander Korotkov
On Mon, Oct 23, 2023 at 2:00 PM Alexander Lakhin  wrote:
> 23.10.2023 12:47, Alexander Korotkov wrote:
> > I think this patch makes a substantial improvement to query planning.
> > It has received plenty of reviews.  The code is currently in quite
> > good shape.  I didn't manage to find cases where this optimization
> > causes significant overhead to planning time.  Even if such cases are
> > spotted, there is a GUC option to disable this feature.  So, I'll
> > push this if there are no objections.
>
> On a quick glance, I've noticed following typos/inconsistencies in the
> patch, which may be worth fixing:
> s/cadidates/candidates/
> s/uniquiness/uniqueness/
> s/selfjoin/self-join/
> s/seperate/separate/
>
> Also, shouldn't the reference "see generate_implied_equalities" be
> "see generate_implied_equalities_for_column"?

Fixed all of the above.  Thank you for catching this!

--
Regards,
Alexander Korotkov


0001-Remove-useless-self-joins-v48.patch
Description: Binary data


Re: Removing unneeded self joins

2023-10-23 Thread Alexander Lakhin

Hi Alexander,

23.10.2023 12:47, Alexander Korotkov wrote:

I think this patch makes a substantial improvement to query planning.
It has received plenty of reviews.  The code is currently in quite
good shape.  I didn't manage to find cases where this optimization
causes significant overhead to planning time.  Even if such cases are
spotted, there is a GUC option to disable this feature.  So, I'll
push this if there are no objections.


On a quick glance, I've noticed following typos/inconsistencies in the
patch, which may be worth fixing:
s/cadidates/candidates/
s/uniquiness/uniqueness/
s/selfjoin/self-join/
s/seperate/separate/

Also, shouldn't the reference "see generate_implied_equalities" be
"see generate_implied_equalities_for_column"?

Best regards,
Alexander




Re: WIP: new system catalog pg_wait_event

2023-10-23 Thread Pavel Luzanov

Please consider this small fix for the query example in the docs [1]:

SELECT a.pid, a.wait_event, w.description
  FROM pg_stat_activity a JOIN
       pg_wait_events w ON (a.wait_event_type = w.type AND
                            a.wait_event = w.name)
  WHERE wait_event is NOT NULL and a.state = 'active';

I propose to add a table alias for the wait_event column in the WHERE clause 
for consistency.

1. https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com
From 64e9aa8ddde392b97bff66d53a5d09e0a820380c Mon Sep 17 00:00:00 2001
From: Pavel Luzanov 
Date: Mon, 23 Oct 2023 13:00:16 +0300
Subject: [PATCH] Fix pg_wait_events usage example

This makes the use of column aliases consistent throughout the query.
---
 doc/src/sgml/monitoring.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1149093a8a..3f49ff79f3 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1119,7 +1119,7 @@ SELECT a.pid, a.wait_event, w.description
   FROM pg_stat_activity a JOIN
pg_wait_events w ON (a.wait_event_type = w.type AND
 a.wait_event = w.name)
-  WHERE wait_event is NOT NULL and a.state = 'active';
+  WHERE a.wait_event is NOT NULL and a.state = 'active';
 -[ RECORD 1 ]--
 pid | 686674
 wait_event  | WALInitSync
-- 
2.34.1



Re: Add trailing commas to enum definitions

2023-10-23 Thread Junwang Zhao
On Mon, Oct 23, 2023 at 2:37 PM Peter Eisentraut  wrote:
>
> Since C99, there can be a trailing comma after the last value in an enum

That C99 allows us to do this doesn't mean we must do it. This is not
inconsistent IMHO, and it will pollute the git log: people may *git blame*
the file to see the reason a line was introduced.

There are a lot of 'typedef struct' as well as 'struct' declarations,
which are not inconsistent either, just like the *enum* case.

> definition.  A lot of new code has been introducing this style on the
> fly.  I have noticed that some new patches are now taking an
> inconsistent approach to this.  Some add the last comma on the fly if
> they add a new last value, some are trying to preserve the existing
> style in each place, some are even dropping the last comma if there was
> one.  I figured we could nudge this all in a consistent direction if we
> just add the trailing commas everywhere once.  See attached patch; it
> wasn't actually that much.
>
> I omitted a few places where there was a fixed "last" value that will
> always stay last.  I also skipped the header files of libpq and ecpg, in
> case people want to use those with older compilers.  There were also a
> small number of cases where the enum type wasn't used anywhere (but the
> enum values were), which ended up confusing pgindent a bit.



-- 
Regards
Junwang Zhao




Re: Removing unneeded self joins

2023-10-23 Thread Alexander Korotkov
On Mon, Oct 23, 2023 at 6:43 AM Andrei Lepikhov
 wrote:
> On 22/10/2023 05:01, Alexander Korotkov wrote:
> > On Thu, Oct 19, 2023 at 6:16 AM Andrei Lepikhov
> >  wrote:
> >> On 19/10/2023 01:50, Alexander Korotkov wrote:
> >>> This query took 3778.432 ms with self-join removal disabled, and
> >>> 3756.009 ms with self-join removal enabled.  So, no measurable
> >>> overhead.  Similar to the higher number of joins.  Can you imagine
> >>> some extreme case when self-join removal could introduce significant
> >>> overhead in comparison with other optimizer parts?  If not, should we
> >>> remove self_join_search_limit GUC?
> >> Thanks,
> >> It was Zhihong Yu who worried about that case [1]. And my purpose was to
> >> show a method to avoid such a problem if it would be needed.
> >> I guess the main idea here is that we have a lot of self-joins, but only
> >> few of them (or no one) can be removed.
> >> I can't imagine a practical situation when we can be stuck in the
> >> problems here. So, I vote to remove this GUC.
> >
> > I've removed the self_join_search_limit.  Anyway there is
> > enable_self_join_removal if the self join removal algorithm causes any
> > problems.  I also did some grammar corrections for the comments.  I
> > think the patch is getting to the committable shape.  I noticed some
> > failures on commitfest.cputube.org.  I'd like to check how this
> > version will pass it.
>
> I have observed the final patch. A couple of minor changes can be made
> (see attachment).

Thank you, Andrei!  I've integrated your changes into the patch.

> Also, I see room for improvement, but it can be done later. For example,
> we limit the optimization to only ordinary tables in this patch. It can
> be extended at least with partitioned and foreign tables soon.

Yes, I think it's reasonable to postpone some improvements.  It's
important to get the basic feature in, make sure it's safe and stable.
Then we can make improvements incrementally.

I think this patch makes a substantial improvement to query planning.
It has received plenty of reviews.  The code is currently in quite
good shape.  I didn't manage to find cases where this optimization
causes significant overhead to planning time.  Even if such cases are
spotted, there is a GUC option to disable this feature.  So, I'll
push this if there are no objections.
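
For readers of the thread, a minimal sketch of the query shape this
optimization targets (assuming a table with a unique key; the GUC is
the one mentioned above):

    CREATE TABLE t (id int PRIMARY KEY, payload text);

    SET enable_self_join_removal = on;
    EXPLAIN (COSTS OFF)
    SELECT a.* FROM t a JOIN t b ON a.id = b.id;
    -- the redundant join to b can be removed, leaving a single scan of t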

--
Regards,
Alexander Korotkov


0001-Remove-useless-self-joins-v47.patch
Description: Binary data


Re: Fix output of zero privileges in psql

2023-10-23 Thread Laurenz Albe
On Sat, 2023-10-21 at 04:29 +0200, Erik Wienhold wrote:
> The attached v3 of my initial patch
> does that.  It also includes Laurenz' fix to no longer ignore \pset null
> (minus the doc changes that suggest using \pset null to distinguish
> between default and empty privileges because that's no longer needed).

Thanks!

I went over the patch, fixed some problems and added some more stuff from
my patch.

In particular:

  --- a/doc/src/sgml/ddl.sgml
  +++ b/doc/src/sgml/ddl.sgml
  @@ -2353,7 +2353,9 @@ GRANT SELECT (col1), UPDATE (col1) ON mytable TO miriam_rw;
 
  If the Access privileges column is empty for a given
  object, it means the object has default privileges (that is, its
  -   privileges entry in the relevant system catalog is null).  Default
  +   privileges entry in the relevant system catalog is null).  The column shows
  +   (none) for empty privileges (that is, no privileges at
  +   all, even for the object owner &mdash; a rare occurrence).  Default
  privileges always include all privileges for the owner, and can include
  some privileges for PUBLIC depending on the object
  type, as explained above.  The first GRANT

This description of empty privileges is smack in the middle of describing
default privileges.  I thought that was confusing and moved it to its
own paragraph.

  --- a/src/bin/psql/describe.c
  +++ b/src/bin/psql/describe.c
  @@ -6718,7 +6680,13 @@ static void
   printACLColumn(PQExpBuffer buf, const char *colname)
   {
  appendPQExpBuffer(buf,
  - "pg_catalog.array_to_string(%s, E'\\n') AS \"%s\"",
  + "CASE\n"
  + "  WHEN %s IS NULL THEN ''\n"
  + "  WHEN pg_catalog.cardinality(%s) = 0 THEN '%s'\n"
  + "  ELSE pg_catalog.array_to_string(%s, E'\\n')\n"
  + "END AS \"%s\"",
  + colname,
  + colname, gettext_noop("(none)"),
colname, gettext_noop("Access privileges"));
   }

This erroneously displays NULL as empty string and subverts my changes.
I have removed the first branch of the CASE expression.

  --- a/src/test/regress/expected/psql.out
  +++ b/src/test/regress/expected/psql.out
  @@ -6663,3 +6663,97 @@ DROP ROLE regress_du_role0;
   DROP ROLE regress_du_role1;
   DROP ROLE regress_du_role2;
   DROP ROLE regress_du_admin;
  +-- Test empty privileges.
  +BEGIN;
  +WARNING:  there is already a transaction in progress

This warning is caused by a pre-existing error in the regression test, which
forgot to close the transaction.  I have added a COMMIT at the appropriate 
place.

  +ALTER TABLESPACE regress_tblspace OWNER TO CURRENT_USER;
  +REVOKE ALL ON TABLESPACE regress_tblspace FROM CURRENT_USER;
+\db+ regress_tblspace
+                                              List of tablespaces
+       Name       |         Owner          |    Location     | Access privileges | Options |  Size   | Description
+------------------+------------------------+-----------------+-------------------+---------+---------+-------------
+ regress_tblspace | regress_zeropriv_owner | pg_tblspc/16385 | (none)            |         | 0 bytes |
+(1 row)

This test is not stable, since it contains the OID of the tablespace, which
is different every time.

  +ALTER DATABASE :"DBNAME" OWNER TO CURRENT_USER;
  +REVOKE ALL ON DATABASE :"DBNAME" FROM CURRENT_USER, PUBLIC;
+\l :"DBNAME"
+                                                         List of databases
+    Name    |         Owner          | Encoding  | Locale Provider | Collate | Ctype | ICU Locale | ICU Rules | Access privileges
+------------+------------------------+-----------+-----------------+---------+-------+------------+-----------+-------------------
+ regression | regress_zeropriv_owner | SQL_ASCII | libc            | C       | C     |            |           | (none)
+(1 row)

This test is also not stable, since it depends on the locale definition
of the regression test database.  If you use "make installcheck", that could
be a different locale.

I think that these tests are not absolutely necessary, and the other tests
are sufficient.  Consequently, I took the simple road of removing them.

I also tried to improve the commit message.

Patch attached.

Yours,
Laurenz Albe
From df5137f537cc0bef68c126a1e20bda831689b8aa Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 23 Oct 2023 11:24:01 +0200
Subject: [PATCH] Fix default and empty privilege output in psql

Default privileges start as NULL::aclitem[] in various catalog columns,
but revoking all privileges leaves an empty aclitem[].  These two
cases used to produce the same output with psql meta-commands like \dp.
Using "\pset null '(default)'" as a workaround for spotting default
privileges did not work, because null values were always displayed
as empty strings by psql meta-commands.

This patch improves that with two changes:

1. Print 

RE: pg_ctl start may return 0 even if the postmaster has been already started on Windows

2023-10-23 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san, Shlok,

> 
> At Fri, 6 Oct 2023 12:28:32 +0530, Shlok Kyal  wrote in
> > D:\project\pg_dev\bin>pg_ctl -D ../data -l data2.log start
> > pg_ctl: another server might be running; trying to start server anyway
> > waiting for server to start....pg_ctl: launcher shell died
> >
> > The output message after the patch is different from HEAD. I felt that
> > with the patch as well we should get the message "pg_ctl: could not
> > start server".
> > Is this message change intentional?
> 
> Partly no, partly yes. My focus was on verifying the accuracy of
> identifying the actual postmaster PID on Windows. The current patch
> provides a detailed description of the events, primarily because I
> lack a comprehensive understanding of both the behavior of Windows
> APIs and the associated processes.  Given that context, the messages
> essentially serve debugging purposes.
> 
> I agree with your suggestion.  Ultimately, if there's a possibility
> for this to be committed, the message will be consolidated to "could
> not start server".

Based on the suggestion, I tried to update the patch.
A new argument is_valid is added so that the callee can report the result.
Also, the reporting formats are adjusted to match other functions. What do
you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


v3-0001-Disable-autoruns-for-cmd.exe-on-Windows.patch
Description: v3-0001-Disable-autoruns-for-cmd.exe-on-Windows.patch


v3-0002-Improve-pg_ctl-postmaster-process-check-on-Window.patch
Description:  v3-0002-Improve-pg_ctl-postmaster-process-check-on-Window.patch


v3-0003-Remove-short-sleep-from-001_start_stop.pl.patch
Description: v3-0003-Remove-short-sleep-from-001_start_stop.pl.patch


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-23 Thread Bharath Rupireddy
On Mon, Oct 23, 2023 at 11:10 AM Hayato Kuroda (Fujitsu)
 wrote:
>
> Thank you for reviewing! PSA new version.

> > 6. A nit: how about is_decodable_txn or is_decodable_change or some
> > other instead of just a plain name processing_required?
> > +/* Do we need to process any change in 'fast_forward' mode? */
> > +boolprocessing_required;
>
> I preferred the current one, because not only decodable txns but also
> non-txn changes and empty transactions are processed.

Right. It's not the txn, but the change. processing_required seems too
generic IMV. A nit: is_change_decodable or something?

Thanks for the patch. Here are few comments on v56 patch:

1.
+ *
+ * Although this function is currently used only during pg_upgrade, there are
+ * no reasons to restrict it, so IsBinaryUpgrade is not checked here.

This comment isn't required IMV, because anyone looking at the code
and callsites can understand it.

2. A nit: IMV "This is a special purpose ..." statement seems redundant.
+ *
+ * This is a special purpose function to ensure that the given slot can be
+ * upgraded without data loss.

How about

Verify that the given replication slot has consumed all the WAL changes.
If there's any decodable WAL record after the slot's
confirmed_flush_lsn, the slot's consumer will lose that data after the
slot is upgraded.
Returns true if there are no decodable WAL records after the
confirmed_flush_lsn. Otherwise false.

3.
+if (PG_ARGISNULL(0))
+    elog(ERROR, "null argument to binary_upgrade_validate_wal_records is not allowed");

I can see the above style is borrowed from
binary_upgrade_create_empty_extension, but IMV the following looks
better and more current (ereport is newer style than elog):

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg("replication slot name cannot be null")));

4. The following comment seems frivolous, the code tells it all.
Please remove the comment.
+
+/* No need to check this slot, seek to new one */
+continue;

5. A typo - s/gets/Gets
+ * gets the LogicalSlotInfos for all the logical replication slots of the

6. An optimization in count_old_cluster_logical_slots(void): turn
slot_count into a function-static variable so that the for loop isn't
required every time, because the slot count is prepared in
get_old_cluster_logical_slot_infos only once and won't change later
on. Do you see any problem with the following? This saves a few CPU
cycles when there is a large number of replication slots.
{
    static int  slot_count = 0;
    static bool first_time = true;

    if (first_time)
    {
        for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
            slot_count += old_cluster.dbarr.dbs[dbnum].slot_arr.nslots;

        first_time = false;
    }

    return slot_count;
}

7. A typo: s/slotname/slot name. "slot name" looks better in
user-visible messages.
+pg_log(PG_VERBOSE, "slotname: \"%s\", plugin: \"%s\", two_phase: %s",

8.
+else
+{
+test_upgrade_from_pre_PG17($old_publisher, $new_publisher,
+@pg_upgrade_cmd);
+}
Will this ever be tested in current TAP test framework? I mean, will
the TAP test framework allow testing upgrades from one PG version to
another PG version?

9. A nit: Can single quotes around variable names in the comments be
removed just to be consistent?
+ * We also skip decoding in 'fast_forward' mode. This check must be last
+/* Do we need to process any change in 'fast_forward' mode? */

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Daniel Gustafsson
> On 23 Oct 2023, at 08:22, Peter Eisentraut  wrote:
> 
> On 23.10.23 02:26, Michael Paquier wrote:
>> 5e4dacb9878c has reminded me that we don't show the version of OpenSSL
>> in the output of ./configure.  This would be useful to know when
>> looking at issues within the buildfarm, and I've wanted that a few
>> times.

Many +1's, this has been on my TODO for some time but has never bubbled to the
top. Thanks for working on this.

>> How about the attached to use the openssl command, if available, to
>> display this information?  Libraries may be installed while the
>> command is not available, but in most cases I'd like to think that it
>> is around, and it is less complex than using something like
>> SSLeay_version() from libcrypto.
>> meson already shows this information, so no additions are required
>> there.  Also, LibreSSL uses `openssl`, right?
> 
> The problem is that the binary might not match the library, so this could be 
> very misleading.  Also, meson gets the version via pkg-config, so the result 
> would also be inconsistent with meson.  I am afraid this approach would be 
> unreliable in the really interesting cases.

I tend to agree with this, it would be preferable to be consistent with
meson if possible/feasible.

--
Daniel Gustafsson





Re: Synchronizing slots from primary to standby

2023-10-23 Thread Drouvot, Bertrand

Hi,

On 10/18/23 6:43 AM, shveta malik wrote:

On Tue, Oct 17, 2023 at 9:06 PM Drouvot, Bertrand

+static void
+ProcessRepliesAndTimeOut(void)
+{
+   CHECK_FOR_INTERRUPTS();
+
+   /* Process any requests or signals received recently */
+   if (ConfigReloadPending)
+   {
+   ConfigReloadPending = false;
+   ProcessConfigFile(PGC_SIGHUP);
+   SyncRepInitConfig();
+   SlotSyncInitConfig();
+   }

Do we want to do this at each place ProcessRepliesAndTimeOut() is being
called? I mean before this change it was not done in ProcessPendingWrites().



Are you referring to ConfigReload stuff ? I see that even in
ProcessPendingWrites(), we do it after WalSndWait(). Now only the
order is changed, it is before  WalSndWait() now.


Yeah and the CFI.

With the patch, the CFI and the check on ConfigReloadPending are done in all
cases, as the break (if !pq_is_send_pending()) is now done after. That seems
ok, just wanted to mention it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Show version of OpenSSL in ./configure output

2023-10-23 Thread Peter Eisentraut

On 23.10.23 02:26, Michael Paquier wrote:

5e4dacb9878c has reminded me that we don't show the version of OpenSSL
in the output of ./configure.  This would be useful to know when
looking at issues within the buildfarm, and I've wanted that a few
times.

How about the attached to use the openssl command, if available, to
display this information?  Libraries may be installed while the
command is not available, but in most cases I'd like to think that it
is around, and it is less complex than using something like
SSLeay_version() from libcrypto.

meson already shows this information, so no additions are required
there.  Also, LibreSSL uses `openssl`, right?


The problem is that the binary might not match the library, so this 
could be very misleading.  Also, meson gets the version via pkg-config, 
so the result would also be inconsistent with meson.  I am afraid this 
approach would be unreliable in the really interesting cases.


> +  # Print version of OpenSSL, if command is available.
> +  AC_ARG_VAR(OPENSSL, [path to openssl command])
> +  PGAC_PATH_PROGS(OPENSSL, openssl)

There is already a call like this in configure.ac, so (if this approach 
is taken) you should rearrange things to make use of that one.


> +  pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"
> +  AC_MSG_NOTICE([using openssl $pgac_openssl_version])