Re: Index range search optimization

2023-09-28 Thread Alexander Korotkov
Hi, Peter.

On Fri, Sep 29, 2023 at 4:57 AM Peter Geoghegan  wrote:
> On Fri, Sep 22, 2023 at 7:24 AM Alexander Korotkov  
> wrote:
> > The thing is that NULLs could appear in the middle of matching values.
> >
> > # WITH t (a, b) AS (VALUES ('a', 'b'), ('a', NULL), ('b', 'a'))
> > SELECT a, b, (a, b) > ('a', 'a') FROM t ORDER BY (a, b);
> >  a |  b   | ?column?
> > ---+------+----------
> >  a | b    | t
> >  a | NULL | NULL
> >  b | a    | t
> > (3 rows)
> >
> > So we can't just skip the row comparison operator, because we can meet
> > NULL at any place.
>
> But why would SK_ROW_HEADER be any different? Is it related to this
> existing case inside _bt_check_rowcompare()?:
>
> if (subkey->sk_flags & SK_ISNULL)
> {
> /*
>  * Unlike the simple-scankey case, this isn't a disallowed case.
>  * But it can never match.  If all the earlier row comparison
>  * columns are required for the scan direction, we can stop the
>  * scan, because there can't be another tuple that will succeed.
>  */
> if (subkey != (ScanKey) DatumGetPointer(skey->sk_argument))
> subkey--;
> if ((subkey->sk_flags & SK_BT_REQFWD) &&
> ScanDirectionIsForward(dir))
> *continuescan = false;
> else if ((subkey->sk_flags & SK_BT_REQBKWD) &&
>  ScanDirectionIsBackward(dir))
> *continuescan = false;
> return false;
> }

Yes, exactly.  Our row comparison operators don't match if there is any
null inside the row, but such rows can still be found within the
matching range.

> I noticed that you're not initializing so->firstPage correctly for the
> _bt_endpoint() path, which is used when the initial position of the
> scan is either the leftmost or rightmost page. That is, it's possible
> to reach _bt_readpage() without having reached the point in
> _bt_first() where you initialize so->firstPage to "true".

Good catch, thank you!

> It would probably make sense if the flag was initialized to "false" in
> the same way as most other scan state is already, somewhere in
> nbtree.c. Probably in btrescan().

Makes sense; the initialisation has been added.
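
For the archives, the shape of it is roughly the following (a sketch;
so->firstPage is the flag added by the patch, and its exact placement
within btrescan() is an assumption):

    /* in btrescan(), together with the other per-scan state resets */
    BTScanOpaque so = (BTScanOpaque) scan->opaque;

    so->firstPage = false;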

--
Regards,
Alexander Korotkov


0001-Skip-checking-of-scan-keys-required-for-direction-v8.patch
Description: Binary data


Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 02:37:02PM +0200, Drouvot, Bertrand wrote:
> This patch allows the role provided in BackgroundWorkerInitializeConnection()
> and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.

Interesting.  Yes, there would be use cases for that, I suppose.

> +  uint32 flags,
>char *out_dbname)
>  {

This may be better suited to a bits32 for the flags.
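
That is, something like this for the declaration (a sketch; bits32 is
the typedef PostgreSQL uses for bitmask parameters, and the surrounding
parameters are inferred from the quoted hunk):

    void
    InitPostgres(const char *in_dbname, Oid dboid,
                 const char *username, Oid useroid,
                 bits32 flags,
                 char *out_dbname);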

> +# Ask the background workers to connect with this role with the flag in place.
> +$node->append_conf(
> +'postgresql.conf', q{
> +worker_spi.role = 'nologrole'
> +worker_spi.bypass_login_check = true
> +});
> +$node->restart;
> +
> +# An error message should not be issued.
> +ok( !$node->log_contains(
> +"role \"nologrole\" is not permitted to log in", $log_start),
> +"nologrole allowed to connect if BGWORKER_BYPASS_ROLELOGINCHECK is set");
> +
>  done_testing();

It would be cheaper to use a dynamic background worker for such tests.
Something that I've been tempted to do in this module is to extend the
amount of data that's given to bgw_main_arg when launching a worker
with worker_spi_launch().  How about extending the SQL function so that
it is possible to give as input a role name (or a regrole), a database
name (or a database OID) and a text[] for the flags?  This would
require a bit more refactoring, but it would be beneficial to show how
one can pass down a full structure from the registration to the
main() routine.  On top of that, it would make the addition of the new
GUCs worker_spi.bypass_login_check and worker_spi.role unnecessary.

> +# return the size of logfile of $node in bytes
> +sub get_log_size
> +{
> +my ($node) = @_;
> +
> +return (stat $node->logfile)[7];
> +}

Just use -s here.  See other tests that want to check the contents of
the logs from an offset.

> - * Allow bypassing datallowconn restrictions when connecting to database
> + * Allow bypassing datallowconn restrictions and login check when connecting
> + * to database
>   */
> -#define BGWORKER_BYPASS_ALLOWCONN 1
> +#define BGWORKER_BYPASS_ALLOWCONN 0x0001
> +#define BGWORKER_BYPASS_ROLELOGINCHECK 0x0002

The structure of the patch is inconsistent.  These flags are in
bgworker.h, but they are also used by InitPostgres().  Perhaps a
second boolean flag for InitPostgres() would be OK rather than a
second set of flags mapping to the bgworker set.
--
Michael


signature.asc
Description: PGP signature


Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-28 Thread vignesh C
On Fri, 29 Sept 2023 at 04:55, Peter Smith  wrote:
>
> Some minor review comments for v4-0001:
>
> ==
> src/backend/replication/logical/worker.c
>
> 1.
> + /*
> + * Exit if the owner of the subscription has changed from superuser to a
> + * non-superuser.
> + */
> + if (!newsub->isownersuperuser && MySubscription->isownersuperuser)
> + {
> + if (am_parallel_apply_worker())
> + ereport(LOG,
> + errmsg("logical replication parallel apply worker for subscription
> \"%s\" will stop because subscription owner has become non-superuser",
> +MySubscription->name));
> + else
> + ereport(LOG,
> + errmsg("logical replication worker for subscription \"%s\" will
> restart because subscription owner has become non-superuser",
> +MySubscription->name));
> +
> + apply_worker_exit();
> + }
> +
>
> /because subscription owner has become non-superuser/because the
> subscription owner has become a non-superuser/  (in 2 places)

Modified

> ==
> src/include/catalog/pg_subscription.h
>
> 2.
>   char*origin; /* Only publish data originating from the
>   * specified origin */
> + bool isownersuperuser; /* Is subscription owner superuser? */
>  } Subscription;
>
> ~
>
> 2a.
> Would it be better to put this new field adjacent to the existing
> 'owner' field, since they kind of belong together?

Modified

> ~
>
> 2b.
> None of the other bool fields here has an 'is' prefix, so you could
> consider a shorter field name, like 'ownersuperuser' or
> 'superuserowner', etc.

Modified

The attached v5 patch includes these changes.

Regards,
Vignesh
From da332a17f531597df2ded354e934e58fab679418 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 22 Sep 2023 15:12:23 +0530
Subject: [PATCH v5] Restart the apply worker if the subscription owner has
 changed from superuser to non-superuser.

Restart the apply worker if the subscription owner has changed from
superuser to non-superuser. This is required so that the subscription
connection string gets revalidated to identify cases where the
password option is not specified as part of the connection string for
non-superuser.
---
 src/backend/catalog/pg_subscription.c       |  3 +++
 src/backend/commands/subscriptioncmds.c     |  4 +--
 src/backend/replication/logical/tablesync.c |  3 +--
 src/backend/replication/logical/worker.c    | 30 ++---
 src/include/catalog/pg_subscription.h       |  1 +
 src/test/subscription/t/027_nosuperuser.pl  | 24 +
 6 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..47a341431c 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -108,6 +108,9 @@ GetSubscription(Oid subid, bool missing_ok)
    Anum_pg_subscription_suborigin);
 	sub->origin = TextDatumGetCString(datum);
 
+	/* Get superuser for subscription owner */
+	sub->ownersuperuser = superuser_arg(sub->owner);
+
 	ReleaseSysCache(tup);
 
 	return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 6fe111e98d..608459504a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -869,7 +869,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 	load_file("libpqwalreceiver", false);
 
 	/* Try to connect to the publisher. */
-	must_use_password = !superuser_arg(sub->owner) && sub->passwordrequired;
+	must_use_password = !sub->ownersuperuser && sub->passwordrequired;
 	wrconn = walrcv_connect(sub->conninfo, true, must_use_password,
 			sub->name, &err);
 	if (!wrconn)
@@ -1249,7 +1249,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 			load_file("libpqwalreceiver", false);
 			/* Check the connection info string. */
 			walrcv_check_conninfo(stmt->conninfo,
-  sub->passwordrequired && !superuser_arg(sub->owner));
+  sub->passwordrequired && !sub->ownersuperuser);
 
 			values[Anum_pg_subscription_subconninfo - 1] =
 CStringGetTextDatum(stmt->conninfo);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e2cee92cf2..d87ac474b1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1278,9 +1278,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 
 	/* Is the use of a password mandatory? */
 	must_use_password = MySubscription->passwordrequired &&
-		!superuser_arg(MySubscription->owner);
+		!MySubscription->ownersuperuser;
 
-	/* Note that the superuser_arg call can access the DB */
 	CommitTransactionCommand();
 
 	SpinLockAcquire(&MyLogicalRepWorker->relmutex);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 597947410f..d056407419 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3966,6 +3966,24 @@ maybe_reread_

[PGDOCS] change function linkend to refer to a more relevant target

2023-09-28 Thread Peter Smith
Hi,

I found a link in the docs that referred to the stats "views" section,
instead of the more relevant (IMO)  stats "functions" section.

PSA the 1 line patch -- it explains what I mean better than I can
describe in words...

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Change-the-xref-linkend.patch
Description: Binary data


Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-28 Thread David Rowley
On Fri, 29 Sept 2023 at 01:00, Ranier Vilela  wrote:
> Perhaps, and using your own words, the leaders on this project seem
> to be against reviewers armed with blunderbuss, too.

I don't have any ideas on what you're talking about here, but if this
is a valid concern that you think is unfair then maybe the code of
conduct team is a more relevant set of people to raise it with.

My mention of the scattergun approach to writing patches was aimed at
helping you have more success here.  I'd recommend you resist the urge
to change unrelated code in your patches. Your experience here might
improve if you're able to aim your patches at resolving specific
issues.  I know you'd love to see commit messages that include "In
passing, adjust a random assortment of changes contributed by Ranier
Vilela".  That's just not going to happen. You might think the
committer's job is just to commit patches contributed by contributors,
but what you might not realise is that we're also trying to maintain a
codebase that's over 3 decades old and will probably outlive most of
its current contributors.  Making things easy both for ourselves in
several years time and for our future counterparts is something that
we do need to consider when discussing and making changes.  The mail
archives and commit messages are an audit trail for our future selves
and future counterparts.  That's one of the main reasons why we tend
to not like you trying to sneak in assortments of random changes along
with your change. If you can understand this and adapt to this way of
thinking then you're more likely to find yourself feeling like you're
working with committers and other contributors rather than against
them.

I say this hoping you take it constructively, but I find that you're
often very defensive about your patches and often appear to take
change requests in a very negative way.  On this thread you've
implied that, because I requested you remove an unrelated change from
the patch, I must not care about code quality in PostgreSQL.  I
personally find these sorts of responses quite draining
and it makes me not want to touch your work. I would imagine I'm not
the only person that feels this. So there's a real danger here that if
you continue to have too many negative responses in emails, you'll
find yourself ignored and you're likely to find that frustrating and
that will lead to you having a worse experience here.   This does not
mean you have to become passive to all requests for changes to your
patches, but if you're going to argue or resist, then it pays to
politely explain your reasoning so that the topic can be discussed
constructively.

> I confess that some "in passing", "while there", and some desire to enrich
> the patch clouded my judgment.

I'm glad to see you're keen to make improvements to PostgreSQL;
however, I'd like to suggest that you channel that energy into
producing targeted patches that attempt to resolve a series of, or
perhaps all, problems of a particular category in
isolation.  If it's a large patch then it's likely best to get
consensus first as having lots of work rejected is always more
painful.

> So it seems that we have some consensus.  I propose version 2 of the patch,
> where perhaps only the wording of the comments still needs correction.

I've pushed this patch after making an adjustment to the comment to
shorten it to a single line.

Thank you for working on this.

David




Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning

2023-09-28 Thread Amit Langote
On Wed, Sep 27, 2023 at 8:07 PM Ashutosh Bapat
 wrote:
> On Wed, Sep 27, 2023 at 2:30 PM Amit Langote  wrote:
> > +   /*
> > +* But the list of operator OIDs and the list of expressions may be
> > +* referenced somewhere else. Do not free those.
> > +*/
> >
> > I don't see build_child_join_sjinfo() copying any list of OIDs, so I'm
> > not sure what the comment is referring to.  Also, instead of "the list
> > of expressions", it might be more useful to mention semi_rhs_expr,
> > because that's the only one being copied.
>
> The list of OIDs is semi_operators. It's copied as-is from the parent
> SpecialJoinInfo. But the way it's mentioned in the comment is
> confusing. I have modified the prologue of the function to provide
> general guidance on what can be freed and what should not be, and then
> specific examples. Let me know if this is clearer.

Thanks.  I would've kept the notes about specific members inside the
function.  Also, no need to have a note for pointers that are not
deep-copied to begin with, such as semi_operators.  IOW, something
like the following would have sufficed:

@@ -1735,6 +1735,10 @@ build_child_join_sjinfo(PlannerInfo *root,
SpecialJoinInfo *parent_sjinfo,
 /*
  * free_child_sjinfo_members
  * Free memory consumed by members of a child SpecialJoinInfo.
+ *
+ * Only members that are translated copies of their counterpart in the parent
+ * SpecialJoinInfo are freed here.  However, members that could be referenced
+ * elsewhere are not freed.
  */
 static void
 free_child_sjinfo_members(SpecialJoinInfo *child_sjinfo)
@@ -1755,10 +1759,7 @@ free_child_sjinfo_members(SpecialJoinInfo *child_sjinfo)
bms_free(child_sjinfo->syn_lefthand);
bms_free(child_sjinfo->syn_righthand);

-   /*
-* But the list of operator OIDs and the list of expressions may be
-* referenced somewhere else. Do not free those.
-*/
+   /* semi_rhs_exprs may be referenced, so don't free. */
 }

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Tom Lane
Andres Freund  writes:
> On 2023-09-28 19:20:27 -0700, Andres Freund wrote:
>> Thus the easiest fix looks to be to use this:
>> -  export_fmt = '-exported_symbols_list=@0@'
>> +  export_fmt = '-Wl,-exported_symbols_list,@0@'
>> I don't have anything older than Ventura to check though.

I don't have meson installed on my surviving Catalina box, but
I tried the equivalent thing in the Makefile universe:

diff --git a/src/Makefile.shlib b/src/Makefile.shlib
index f94d59d1c5..f2ed222cc7 100644
--- a/src/Makefile.shlib
+++ b/src/Makefile.shlib
@@ -136,7 +136,7 @@ ifeq ($(PORTNAME), darwin)
   BUILD.exports= $(AWK) '/^[^\#]/ {printf "_%s\n",$$1}' $< >$@
   exports_file = $(SHLIB_EXPORTS:%.txt=%.list)
   ifneq (,$(exports_file))
-exported_symbols_list = -exported_symbols_list $(exports_file)
+exported_symbols_list = -Wl,-exported_symbols_list,$(exports_file)
   endif
 endif

That builds and produces correctly-symbol-trimmed shlibs, so I'd
say it's fine.  (Perhaps we should apply the above to HEAD
alongside the meson.build fix, to get more test coverage?)

> Attached is the above change and a commit to change CI over to Sonoma. Not
> sure when we should switch, but it seems useful to include for testing
> purposes at the very least.

No opinion on when to switch CI.  Sonoma is surely pretty bleeding edge
yet.

regards, tom lane




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2023-09-28 Thread Andrei Lepikhov

On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote:

Attached are the patches updated to master.
Pulled a change that was also pertinent to patch 1 from patch 2 back into patch 1.

+1 to the idea; I have doubts about the implementation.

I have a question. I see the feature raises an ERROR when the memory
limit is exceeded. The enclosing PG_CATCH() section will handle the
error, and as I see it, many such sections allocate memory themselves.
What if some routine, like CopyErrorData(), exceeds the limit too? In
this case, we could keep repeating the error all the way up to the top
PG_CATCH(). Is this the intended behaviour? Maybe
exceeds_max_total_bkend_mem() should check for recursion and allow
error handlers to slightly exceed this hard limit?
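
To illustrate the recursion check I have in mind (a hypothetical
sketch; exceeds_max_total_bkend_mem() is the function from the patch,
while the guard variable and the accounting shown here are assumptions
made purely for illustration):

    static bool handling_limit_error = false;
    static uint64 total_bkend_mem;      /* assumed accounting counter */
    static uint64 max_total_bkend_mem;  /* assumed limit in bytes, 0 = off */

    static bool
    exceeds_max_total_bkend_mem(uint64 allocation_request)
    {
        if (max_total_bkend_mem == 0)
            return false;

        /*
         * Let error handlers (e.g. CopyErrorData()) slightly exceed the
         * hard limit instead of re-raising the same error from inside
         * PG_CATCH().
         */
        if (handling_limit_error)
            return false;

        if (total_bkend_mem + allocation_request > max_total_bkend_mem)
        {
            handling_limit_error = true;    /* reset once the error is handled */
            return true;
        }

        return false;
    }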


Also, the patch needs to be rebased.

--
regards,
Andrey Lepikhov
Postgres Professional





Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Andres Freund
Hi,

On 2023-09-28 19:20:27 -0700, Andres Freund wrote:
> Thus the easiest fix looks to be to use this:
> 
> diff --git a/meson.build b/meson.build
> index 5422885b0a2..16a2b0f801e 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -224,7 +224,7 @@ elif host_system == 'darwin'
>library_path_var = 'DYLD_LIBRARY_PATH'
>  
>export_file_format = 'darwin'
> -  export_fmt = '-exported_symbols_list=@0@'
> +  export_fmt = '-Wl,-exported_symbols_list,@0@'
>  
>mod_link_args_fmt = ['-bundle_loader', '@0@']
>mod_link_with_dir = 'bindir'
> 
> 
> I don't have anything older than Ventura to check though.

Attached is the above change and a commit to change CI over to Sonoma. Not
sure when we should switch, but it seems useful to include for testing
purposes at the very least.

Greetings,

Andres Freund
From 9733a689e2e7280197e7ff188eb1ae306e199c82 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 28 Sep 2023 19:27:03 -0700
Subject: [PATCH v1 1/2] meson: macos: Correct -exported_symbols_list syntax
 for Sonoma compat

-exported_symbols_list=... works on Ventura and a few earlier releases, but
not on Sonoma. The easiest way to fix it is to -Wl,-exported_symbols_list,@0@
which actually seems more appropriate anyway, it's obviously a linker
argument.

Discussion: https://postgr.es/m/2023092848.jw6s7yktpfsfc...@alap3.anarazel.de
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 5422885b0a2..862c955453f 100644
--- a/meson.build
+++ b/meson.build
@@ -224,7 +224,7 @@ elif host_system == 'darwin'
   library_path_var = 'DYLD_LIBRARY_PATH'
 
   export_file_format = 'darwin'
-  export_fmt = '-exported_symbols_list=@0@'
+  export_fmt = '-Wl,-exported_symbols_list,@0@'
 
   mod_link_args_fmt = ['-bundle_loader', '@0@']
   mod_link_with_dir = 'bindir'
-- 
2.41.0.rc2

From 6948fec81a41a666117552dc157b72f236335c88 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 28 Sep 2023 19:29:44 -0700
Subject: [PATCH v1 2/2] ci: macos: Switch to Sonoma

Discussion: https://postgr.es/m/2023092848.jw6s7yktpfsfc...@alap3.anarazel.de
---
 .cirrus.tasks.yml                    | 4 ++--
 src/tools/ci/ci_macports_packages.sh | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e137769850d..97088a7728a 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -410,7 +410,7 @@ task:
 
 
 task:
-  name: macOS - Ventura - Meson
+  name: macOS - Sonoma - Meson
 
   env:
 CPUS: 4 # always get that much for cirrusci macOS instances
@@ -419,7 +419,7 @@ task:
 # work OK. See
 # https://postgr.es/m/20220927040208.l3shfcidovpzqxfh%40awork3.anarazel.de
 TEST_JOBS: 8
-IMAGE: ghcr.io/cirruslabs/macos-ventura-base:latest
+IMAGE: ghcr.io/cirruslabs/macos-sonoma-base:latest
 
 CIRRUS_WORKING_DIR: ${HOME}/pgsql/
 CCACHE_DIR: ${HOME}/ccache
diff --git a/src/tools/ci/ci_macports_packages.sh b/src/tools/ci/ci_macports_packages.sh
index 4bc594a31d1..3f5f272c9ff 100755
--- a/src/tools/ci/ci_macports_packages.sh
+++ b/src/tools/ci/ci_macports_packages.sh
@@ -13,7 +13,7 @@ set -e
 
 packages="$@"
 
-macports_url="https://github.com/macports/macports-base/releases/download/v2.8.1/MacPorts-2.8.1-13-Ventura.pkg";
+macports_url="https://github.com/macports/macports-base/releases/download/v2.8.1/MacPorts-2.8.1-14-Sonoma.pkg";
 cache_dmg="macports.hfs.dmg"
 
 if [ "$CIRRUS_CI" != "true" ]; then
-- 
2.41.0.rc2



Re: The danger of deleting backup_label

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 05:14:22PM -0400, David Steele wrote:
> While reading through [1] I saw there were two instances where backup_label
> was removed to achieve a "successful" restore. This might work on trivial
> test restores but is an invitation to (silent) disaster in a production
> environment where the checkpoint stored in backup_label is almost certain to
> be earlier than the one stored in pg_control.

Definitely successful.

> Recovery worked perfectly as long as backup_label was present and failed
> hard when it was not:
> 
> LOG:  invalid primary checkpoint record
> PANIC:  could not locate a valid checkpoint record
> 
> It's not a very good message, but at least the foot gun has been removed. We
> could use this as a special value to give a better message, and maybe use
> something a bit more unique like 0xFADEFADE (or whatever) as the
> value.

Why not just InvalidXLogRecPtr?

> This is all easy enough for pg_basebackup to do, but will certainly be
> non-trivial for most backup software to implement. In [2] we have discussed
> perhaps returning pg_control from pg_backup_stop() for the backup software
> to save, or it could become part of the backup_label (encoded as hex or
> base64, presumably). I prefer the latter as this means less work for the
> backup software (except for the need to exclude pg_control from the backup).
> 
> I don't have a patch for this yet because I did not test this idea using
> pg_basebackup, but I'll be happy to work up a patch if there is interest.

If the contents of the control file are tweaked before sending it
through a BASE_BACKUP, it would cover more than just pg_basebackup.
Switching the control file to be sent with new contents via
sendFileWithContent() rather than sendFile() would be one way to do
that, for instance.
--
Michael


signature.asc
Description: PGP signature


Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Andres Freund
Hi,

On 2023-09-28 19:17:37 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund  writes:
> >> I think right now it doesn't work as-is on sonoma, because apple decided to
> >> change the option syntax, which is what causes the -e warning below, so the
> >> relevant option is just ignored.
> 
> > Hmm, we'd better fix that then.  Or is it their bug?  It looks to me like
> > clang's argument is -exported_symbols_list=/path/to/exports.list, so
> > it must be translating that to "-e".
> 
> Looking closer, the documented syntax is
> 
>   -exported_symbols_list filename
> 
> (two arguments, not one with an "=").  That is what our Makefiles
> use, and it still works fine with latest Xcode.  However, meson.build
> thinks it can get away with one argument containing "=", and evidently
> that doesn't work now (or maybe it never did?).

It does still work on Ventura.


> I tried
> 
>   export_fmt = '-exported_symbols_list @0@'

That would expand to a single argument with a space in between.


> and
> 
>   export_fmt = ['-exported_symbols_list', '@0@']

That would work in many places, but not here: export_fmt is used as a format
string... We could make the callsites do that for each array element, but
there's an easier solution that seems to work for both Ventura and Sonoma -
however I don't have anything older to test with.

TBH, I find it hard to understand what arguments go to the linker and which to
the compiler on macos. The argument is documented for the linker and not the
compiler, but so far we'd been passing it to the compiler, so there must be
some logic forwarding it.

Looking through the clang code, I see various llvm libraries using
-Wl,-exported_symbols_list and there are tests
(clang/test/Driver/darwin-ld.c) ensuring both syntaxes work.

Thus the easiest fix looks to be to use this:

diff --git a/meson.build b/meson.build
index 5422885b0a2..16a2b0f801e 100644
--- a/meson.build
+++ b/meson.build
@@ -224,7 +224,7 @@ elif host_system == 'darwin'
   library_path_var = 'DYLD_LIBRARY_PATH'
 
   export_file_format = 'darwin'
-  export_fmt = '-exported_symbols_list=@0@'
+  export_fmt = '-Wl,-exported_symbols_list,@0@'
 
   mod_link_args_fmt = ['-bundle_loader', '@0@']
   mod_link_with_dir = 'bindir'


I don't have anything older than Ventura to check though.

Greetings,

Andres Freund




RE: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-28 Thread Hayato Kuroda (Fujitsu)
Dear Michael,

I confirmed your commit. Thanks!
CF entry was closed as "Committed".

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote:
> v3 attached. I had a problem coming out with a better error message,
> so suggestions are welcome.  The cast still needs to be present as per
> above suggestion as 3GB is still valid buf size and still was causing
> integer overflow. We just throw an error on >= 4GB with v3.

+/* Safety net to prevent requesting huge memory by each query to pg_stat_activity */
+#define PGSTAT_MAX_ACTIVITY_BUF_SIZE 4 * 1024 * 1024 * 1024L
 
-   size = add_size(size,
-                   mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+   pgstat_track_size = mul_size(pgstat_track_activity_query_size,
+                                NumBackendStatSlots);
+   if(pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
+       elog(FATAL, "too big Backend Activity Buffer allocation of %zu bytes", pgstat_track_size);
+   size = add_size(size, pgstat_track_size);

That should be enough to put in a hardcoded 4GB safety limit, while
mul_size() detects it at a higher range.  Note, however, that elog()
is only used for internal errors that users should never face, but
this one can come from a misconfiguration.  This would be better as an
ereport(), with ERRCODE_CONFIG_FILE_ERROR as errcode, I guess.

"Backend Activity Buffer" is the name of the internal struct.  Sure,
it shows up on the system views for shmem, but I wouldn't use this
term in a user-facing error message.  Perhaps something like "size
requested for backend status is out of range" would be cleaner.  Other
ideas are welcome.
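
Putting the two suggestions together, that would look roughly like
this (a sketch, reusing the names from the quoted patch):

    if (pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
        ereport(FATAL,
                (errcode(ERRCODE_CONFIG_FILE_ERROR),
                 errmsg("size requested for backend status is out of range")));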
--
Michael


signature.asc
Description: PGP signature


Re: Index range search optimization

2023-09-28 Thread Peter Geoghegan
On Fri, Sep 22, 2023 at 7:24 AM Alexander Korotkov  wrote:
> The thing is that NULLs could appear in the middle of matching values.
>
> # WITH t (a, b) AS (VALUES ('a', 'b'), ('a', NULL), ('b', 'a'))
> SELECT a, b, (a, b) > ('a', 'a') FROM t ORDER BY (a, b);
>  a |  b   | ?column?
> ---+--+--
>  a | b| t
>  a | NULL | NULL
>  b | a| t
> (3 rows)
>
> So we can't just skip the row comparison operator, because we can meet
> NULL at any place.

But why would SK_ROW_HEADER be any different? Is it related to this
existing case inside _bt_check_rowcompare()?:

if (subkey->sk_flags & SK_ISNULL)
{
/*
 * Unlike the simple-scankey case, this isn't a disallowed case.
 * But it can never match.  If all the earlier row comparison
 * columns are required for the scan direction, we can stop the
 * scan, because there can't be another tuple that will succeed.
 */
if (subkey != (ScanKey) DatumGetPointer(skey->sk_argument))
subkey--;
if ((subkey->sk_flags & SK_BT_REQFWD) &&
ScanDirectionIsForward(dir))
*continuescan = false;
else if ((subkey->sk_flags & SK_BT_REQBKWD) &&
 ScanDirectionIsBackward(dir))
*continuescan = false;
return false;
}

I noticed that you're not initializing so->firstPage correctly for the
_bt_endpoint() path, which is used when the initial position of the
scan is either the leftmost or rightmost page. That is, it's possible
to reach _bt_readpage() without having reached the point in
_bt_first() where you initialize so->firstPage to "true".

It would probably make sense if the flag was initialized to "false" in
the same way as most other scan state is already, somewhere in
nbtree.c. Probably in btrescan().

-- 
Peter Geoghegan




Re: Move global variables of pgoutput to plugin private scope.

2023-09-28 Thread Michael Paquier
On Wed, Sep 27, 2023 at 04:57:06PM +0900, Michael Paquier wrote:
> Sure.  I found the concept behind 0002 sound.  Feel free to go ahead
> with 0001, and I can always look at the second.  Always happy to help.

For the sake of the archives:
- Amit has applied 0001 down to 16 as of 54ccfd65868c.
- I've applied 0002 after on HEAD as of 9210afd3bcd6.
--
Michael


signature.asc
Description: PGP signature


Re: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 02:23:59PM +1000, Peter Smith wrote:
> v4 LGTM.

Applied v4 down to 16, then.
--
Michael


signature.asc
Description: PGP signature


Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Tom Lane
Vik Fearing  writes:
> On 9/28/23 20:46, Tom Lane wrote:
>> We went through all these points years ago when the enum feature
>> was first developed, as I recall.  Nobody thought that the ability
>> to remove an enum value was worth the amount of complexity it'd
>> entail.

> This issue comes up regularly (although far from often).  Do we want to 
> put some comments right where would-be implementors would be sure to see it?

Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
implemented" in the error message, as that wording sounds like we just
haven't got round to it.

regards, tom lane




Re: Try adding type cast with tablespace

2023-09-28 Thread Tom Lane
Kenichiro Tanaka  writes:
> Therefore I think it is good to add a regtablespace alias, but I'm also
> a newbie among pgsql-hackers.
> We need some senior hackers' opinions.

Well ... for my two cents, I'm kind of down on this, mainly because
I don't understand where we'd stop.  I don't want to end up in a
scenario where every system catalog is expected to have a reg*
type to go with it, because that'd create a lot of make-work.

The original idea of the reg* types was to create an easy way to do
OID lookups in catalogs where the correct lookup rule is more
complicated than
(SELECT oid FROM some_catalog WHERE name = 'foo')
So that motivates making reg* types for objects with
schema-qualified names, and even more so for functions and
types which have specialized syntax.  There was also some
consideration of which object types frequently need lookups.
IIRC, regrole got in partly because unprivileged users can't
select from pg_authid.

I don't really see that tablespaces meet the bar of any of these
past criteria: they don't have complex lookup rules nor are they
all that commonly used (IME anyway).  So if we accept this patch,
we're essentially saying that every catalog should have a reg*
type, and that's not a conclusion I want to reach.  We have 11
reg* types at the moment (only 9 if you discount the legacy
regproc and regoper ones), whereas there are about 30 catalogs
that have name columns.  Do we really want to open those
floodgates?

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Vik Fearing

On 9/28/23 20:46, Tom Lane wrote:

Andrew Dunstan  writes:

I wonder if we could have a boolean flag in pg_enum, indicating that
setting an enum to that value was forbidden.


Yeah, but that still offers no coherent solution to the problem of
what happens if there's a table that already contains such a value.
It doesn't seem terribly useful to forbid new entries if you can't
get rid of old ones.

Admittedly, a DISABLE flag would at least offer a chance at a
race-condition-free scan to verify that no such values remain
in tables.  But as somebody already mentioned upthread, that
wouldn't guarantee that the value doesn't appear in non-leaf
index pages.  So basically you could never get rid of the pg_enum
row, short of a full dump and restore.

We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.


This issue comes up regularly (although far from often).  Do we want to 
put some comments right where would-be implementors would be sure to see it?


Attached is an example of what I mean.  Documentation is intentionally 
omitted.

--
Vik Fearing
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7d2032885e..f8d70cdaa0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -6404,6 +6404,29 @@ AlterEnumStmt:
 n->skipIfNewValExists = false;
 $$ = (Node *) n;
 			}
+		 | ALTER TYPE_P any_name DROP VALUE_P Sconst
+			{
+/*
+ * The following problems must be solved before this can be
+ * implemented:
+ *
+ * - There can be no data of this type using this value in
+ *   any table
+ *
+ * - The value may not appear in any non-leaf page of a
+ *   btree (and similar issues with other index types)
+ *
+ * - Concurrent sessions must not be able to insert this
+ *   value while the preceding conditions are being checked
+ *
+ * - Possibly more...
+ */
+
+ereport(ERROR,
+		(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		 errmsg("dropping an enum value is not yet implemented"),
+		 parser_errposition(@4)));
+			}
 		 ;
 
 opt_if_not_exists: IF_P NOT EXISTS  { $$ = true; }
diff --git a/src/test/regress/expected/enum.out b/src/test/regress/expected/enum.out
index 01159688e5..105ae7a2f9 100644
--- a/src/test/regress/expected/enum.out
+++ b/src/test/regress/expected/enum.out
@@ -605,6 +605,11 @@ ERROR:  "red" is not an existing enum label
 -- check that renaming to an existent value fails
 ALTER TYPE rainbow RENAME VALUE 'blue' TO 'green';
 ERROR:  enum label "green" already exists
+-- We still can't drop an enum value
+ALTER TYPE rainbow DROP VALUE 'green';
+ERROR:  dropping an enum value is not yet implemented
+LINE 1: ALTER TYPE rainbow DROP VALUE 'green';
+   ^
 --
 -- check transactional behaviour of ALTER TYPE ... ADD VALUE
 --
diff --git a/src/test/regress/sql/enum.sql b/src/test/regress/sql/enum.sql
index 93171379f2..a98df40ed8 100644
--- a/src/test/regress/sql/enum.sql
+++ b/src/test/regress/sql/enum.sql
@@ -276,6 +276,9 @@ ALTER TYPE rainbow RENAME VALUE 'red' TO 'crimson';
 -- check that renaming to an existent value fails
 ALTER TYPE rainbow RENAME VALUE 'blue' TO 'green';
 
+-- We still can't drop an enum value
+ALTER TYPE rainbow DROP VALUE 'green';
+
 --
 -- check transactional behaviour of ALTER TYPE ... ADD VALUE
 --


Re: pg_upgrade and logical replication

2023-09-28 Thread Michael Paquier
On Wed, Sep 27, 2023 at 07:31:41PM +0530, Amit Kapila wrote:
> On Wed, Sep 27, 2023 at 3:37 PM vignesh C  wrote:
>> Once the table is in SUBREL_STATE_SYNCDONE state, the apply worker
>> will check whether it has some LSN records that need to be applied
>> to reach the LSN of the table. Once the required WAL is applied, the
>> table state will be changed from SUBREL_STATE_SYNCDONE to
>> SUBREL_STATE_READY state. Since there is a chance that in this case
>> the apply worker has to apply some transactions to get all the tables
>> in READY state, I felt the minimum requirement should be that at least
>> all the tables should be in READY state for upgrading the
>> subscriber.
> 
> I don't think this theory is completely correct because the pending
> WAL can be applied even after an upgrade.

Yeah, agreed that putting a pre-check about the state of the relations
stored in pg_subscription_rel when handling the upgrade of a
subscriber is not necessary.
--
Michael


signature.asc
Description: PGP signature


Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

2023-09-28 Thread Peter Geoghegan
On Sun, Sep 17, 2023 at 4:47 PM Peter Geoghegan  wrote:
> Attached is v2, which makes all array key advancement take place using
> the "next index tuple" approach (using binary searches to find array
> keys using index tuple values).

Attached is v3, which fixes bitrot caused by today's bugfix commit 714780dc.

No notable changes here compared to v2.
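
To restate the approach quoted above: advancing the array keys is a
lower-bound binary search over the (sorted) array of scan key values,
keyed on the value taken from the next index tuple in line to be
scanned.  In generic terms it looks like the following (an illustration
only, not code lifted from the patch):

    static int
    first_elem_not_behind(const int *elems, int nelems, int tupval)
    {
        int     low = 0,
                high = nelems;

        while (low < high)
        {
            int     mid = low + (high - low) / 2;

            if (elems[mid] < tupval)
                low = mid + 1;
            else
                high = mid;
        }

        return low;             /* nelems means the arrays are exhausted */
    }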

--
Peter Geoghegan
From 2cff1dadb7903d49a2338b64b27178fa0bc51456 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Sat, 17 Jun 2023 17:03:36 -0700
Subject: [PATCH v3] Enhance nbtree ScalarArrayOp execution.

Commit 9e8da0f7 taught nbtree to handle ScalarArrayOpExpr quals
natively.  This works by pushing additional context about the arrays
down into the nbtree index AM, as index quals.  This information enabled
nbtree to execute multiple primitive index scans as part of an index
scan executor node that was treated as one continuous index scan.

The motivation behind this earlier work was enabling index-only scans
with ScalarArrayOpExpr clauses (SAOP quals are traditionally executed
via BitmapOr nodes, which is largely index-AM-agnostic, but always
requires heap access).  The general idea of giving the index AM this
additional context can be pushed a lot further, though.

Teach nbtree SAOP index scans to dynamically advance array scan keys
using information about the characteristics of the index, determined at
runtime.  The array key state machine advances the current array keys
using the next index tuple in line to be scanned, at the point where the
scan reaches the end of the last set of array keys.  This approach is
far more flexible, and can be far more efficient.  Cases that previously
required hundreds (even thousands) of primitive index scans now require
as few as one single primitive index scan.

Also remove all restrictions on generating path keys for nbtree index
scans that happen to have ScalarArrayOpExpr quals.  Bugfix commit
807a40c5 taught the planner to avoid generating unsafe path keys: path
keys on a multicolumn index path, with a SAOP clause on any attribute
beyond the first/most significant attribute.  These cases are now safe.
Now nbtree index scans with an inequality clause on a high order column
and a SAOP clause on a lower order column are executed as one single
primitive index scan, since that is the most efficient way to do it.
Non-required equality type SAOP quals are executed by nbtree using
almost the same approach used for required equality type SAOP quals.

nbtree is now strictly guaranteed to avoid all repeat accesses to any
individual leaf page, even in cases with inequalities on high order
columns (except when the scan direction changes, or the scan restarts).
We now have strong guarantees about the worst case, which is very useful
when costing index scans with SAOP clauses.  The cost profile of index
paths with multiple SAOP clauses is now a lot closer to other cases;
more selective index scans will now generally have lower costs than less
selective index scans.  The added cost from repeatedly descending the
index still matters, but it can never dominate.

An important goal of this work is to remove all ScalarArrayOpExpr clause
special cases from the planner -- ScalarArrayOpExpr clauses can now be
thought of as a generalization of simple equality clauses (except when
costing index scans, perhaps).  The planner no longer needs to generate
alternative index paths with filter quals/qpquals.  We assume that true
SAOP index quals are strictly better than filter/qpquals, since the work
in nbtree guarantees that they'll be at least slightly faster.

Many of the queries sped up by the work from this commit don't directly
benefit from the nbtree/executor enhancements.  They benefit indirectly.
The planner no longer shows any restraint around making SAOP clauses
into true nbtree index quals, which tends to result in significant
savings on heap page accesses.  In general we never need visibility
checks to evaluate true index quals, whereas filter quals often need to
perform extra heap accesses, just to eliminate non-matching tuples
(expression evaluation is only safe with known visible tuples).

Author: Peter Geoghegan 
Discussion: https://postgr.es/m/CAH2-Wz=ksvN_sjcnD1+Bt-WtifRA5ok48aDYnq3pkKhxgMQpcw@mail.gmail.com
---
 src/include/access/nbtree.h                |   39 +-
 src/backend/access/nbtree/nbtree.c         |   65 +-
 src/backend/access/nbtree/nbtsearch.c      |   62 +-
 src/backend/access/nbtree/nbtutils.c       | 1367 ++--
 src/backend/optimizer/path/indxpath.c      |   64 +-
 src/backend/utils/adt/selfuncs.c           |  123 +-
 doc/src/sgml/monitoring.sgml               |   13 +
 src/test/regress/expected/create_index.out |   61 +-
 src/test/regress/expected/join.out         |    5 +-
 src/test/regress/sql/create_index.sql      |   20 +-
 10 files changed, 1506 insertions(+), 313 deletions(-)

diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index 6345e16d7..33db9b648 100644
--- a/src/i

Re: Try adding type cast with tablespace

2023-09-28 Thread Kenichiro Tanaka
Hello,Yuki.

My understanding is that your patch aims to add an alias type for OID.
There are already some alias types for OID, so I think it is good to
add regtablespace for convenience.

https://www.postgresql.org/docs/16/datatype-oid.html#DATATYPE-OID-TABLE

Actually, I have also felt it is a bit of a hassle to join tables to
find the tablespace name from pg_database, so it would be convenient
to be able to use a regtablespace alias.

Therefore I think it is good to add a regtablespace alias, but I'm also
a newbie among pgsql-hackers.
We need some senior hackers' opinions.

Kind Regards,
Kenichiro Tanaka
>
> Hi all,
>
>
> Good day!
>
>
> I am a newbie to PostgreSQL and recently came across an idea about
> type-casting tablespace OIDs.
>
> The motivation is that when I have to upgrade a PostgreSQL database, we need
> to join other tables to
>
> track the tablespace name. I have just created a simple patch to resolve this.
>
>
> Hope you can take a look at this.
>
>
> My Execution Sample:
>
> # After Patch:
>
> 
>
> postgres=# SELECT oid,oid::regtablespace,spcname from pg_tablespace ;
>
>  oid  |    oid     |  spcname
> ------+------------+------------
>  1663 | pg_default | pg_default
>  1664 | pg_global  | pg_global
> (2 rows)
>
> 
>
>
> # Before Patch
>
> 
>
> postgres-# SELECT oid,oid::regtablespace,spcname from pg_tablespace ;
> ERROR:  syntax error at or near "oid"
> LINE 1: oid  |oid |  spcname
> ^
>
> 
>
>
> I added the "::regtablespace" part to source.
>
> Note: While developing, I also had to add several rows to pg_catalog tables.
>
>   Please point out if any OID newly assigned is not appropriate.
>
>
> Kind Regards,
>
> Yuki Tei




Re: [HACKERS] pg_upgrade failed with error - ERROR: column "a" in child table must be marked NOT NULL

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 01:29:21PM -0500, Justin Pryzby wrote:
> On Thu, Dec 14, 2017 at 07:18:59PM +0700, Ali Akbar wrote:
> > By the way, should i add this patch to the current commitfest?
> 
> The patch for pg_upgrade --check got forgotten 6 years ago, but it's a
> continuing problem (we hit it again which cost an hour during
> pg_upgrade) and ought to be (have been) backpatched.

You mean when upgrading from an instance of 9.6 or older as c30f177 is
not there, right?  With the new project policy to support pg_upgrade
for 10 years, there's still a very good argument for backpatching a
pre-check down to v11.

Anyway, it seems like the patch from [1] has no need to run this check
when the old cluster's version is 10 or newer.  And perhaps it should
mention that this check could be removed from pg_upgrade once v10
support is out of scope, in the shape of a comment.

> I didn't dig into it, but maybe it'd even be possible to fix the issue
> with ALTER..DROP NOT NULL ...

Interesting point.  I haven't checked either.

[1]: 
https://postgr.es/m/cacqjqlqoxtzcv4+-s1xot62ty8ggkzkh4ky03gm2aqyomxe...@mail.gmail.com
--
Michael


signature.asc
Description: PGP signature


Re: Does anyone ever use OPTIMIZER_DEBUG?

2023-09-28 Thread Michael Paquier
On Fri, Sep 29, 2023 at 10:20:39AM +1300, David Rowley wrote:
> 4 years is quite a long time for such a bug. Maybe that's because
> nobody uses OPTIMIZER_DEBUG. I certainly don't, and Tom mentions [1]
> he doesn't either.

I've used it perhaps once in the last 10 years, so removing it is OK
by me.
--
Michael


signature.asc
Description: PGP signature


Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

2023-09-28 Thread Michael Paquier
On Thu, Sep 28, 2023 at 04:23:42PM -0400, David Steele wrote:
> After some playing around, I find I agree with Michael on this, i.e. require
> at least standby.signal when a backup_label is present.
> 
> According to my testing, you can preserve the "independent server"
> functionality by setting archive_command = /bin/false. In this case the
> timeline is not advanced and recovery proceeds from whatever is available in
> pg_wal.

I've seen folks depend on such setups in the past, actually, letting a
process outside Postgres just "push" WAL segments to pg_wal instead of
Postgres pulling it with a restore_command or a primary_conninfo for a
standby.

> I think this type of recovery from a backup label without a timeline change
> should absolutely be the exception, not the default as it seems to be now.

This can also mess up the archives pretty easily, so it's not
something to encourage.

> If the server is truly independent, then the timeline change is not
> important. If the server is not independent, then the timeline change is
> critical.
> 
> So overall, +1 for Michael's patch, though I have only read through it and
> not tested it yet.

Reviews, thoughts and opinions are welcome.

> One comment, though, if we are going to require recovery.signal when
> backup_label is present, should it just be implied? Why error and force the
> user to create it?

That's one thing I was considering, but I also cannot convince myself
that this is the best option, because the presence of recovery.signal
or standby.signal (if both are present, standby.signal takes priority)
makes it clear what type of recovery is wanted at the disk level.  I'd
be OK if folks think that this is a sensible consensus, as well, even
if I don't really agree with it.

Another idea I had was to force the creation of recovery.signal by
pg_basebackup even if -R is not used.  All the reports we've seen with
people getting confused came from pg_basebackup that enforces no
configuration.

A last thing, that had better be covered in a separate thread and
patch, is about validateRecoveryParameters().  These days, I'd like to
think that it may be OK to lift at least the restriction on
restore_command being required if we are doing recovery to ease the
case of self-contained backups (aka the case where all the WAL needed
to reach a consistent point is in pg_wal/ or its tarball).
--
Michael


signature.asc
Description: PGP signature


Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-28 Thread Peter Smith
Some minor review comments for v4-0001:

==
src/backend/replication/logical/worker.c

1.
+ /*
+ * Exit if the owner of the subscription has changed from superuser to a
+ * non-superuser.
+ */
+ if (!newsub->isownersuperuser && MySubscription->isownersuperuser)
+ {
+ if (am_parallel_apply_worker())
+ ereport(LOG,
+ errmsg("logical replication parallel apply worker for subscription
\"%s\" will stop because subscription owner has become non-superuser",
+MySubscription->name));
+ else
+ ereport(LOG,
+ errmsg("logical replication worker for subscription \"%s\" will
restart because subscription owner has become non-superuser",
+MySubscription->name));
+
+ apply_worker_exit();
+ }
+

/because subscription owner has become non-superuser/because the
subscription owner has become a non-superuser/  (in 2 places)

==
src/include/catalog/pg_subscription.h

2.
  char*origin; /* Only publish data originating from the
  * specified origin */
+ bool isownersuperuser; /* Is subscription owner superuser? */
 } Subscription;

~

2a.
Would it be better to put this new field adjacent to the existing
'owner' field, since they kind of belong together?

~

2b.
None of the other bool fields here has an 'is' prefix, so you could
consider a shorter field name, like 'ownersuperuser' or
'superuserowner', etc.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> I think right now it doesn't work as-is on sonoma, because apple decided to
>> change the option syntax, which is what causes the -e warning below, so the
>> relevant option is just ignored.

> Hmm, we'd better fix that then.  Or is it their bug?  It looks to me like
> clang's argument is -exported_symbols_list=/path/to/exports.list, so
> it must be translating that to "-e".

Looking closer, the documented syntax is

-exported_symbols_list filename

(two arguments, not one with an "=").  That is what our Makefiles
use, and it still works fine with latest Xcode.  However, meson.build
thinks it can get away with one argument containing "=", and evidently
that doesn't work now (or maybe it never did?).

I tried

  export_fmt = '-exported_symbols_list @0@'

and

  export_fmt = ['-exported_symbols_list', '@0@']

and neither of those did what I wanted, so maybe I will have to
study meson's command language sometime soon.  In the meantime,
I suppose this might be an easy fix for somebody who knows their
way around meson.

regards, tom lane




Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Tom Lane
Andres Freund  writes:
> On 2023-09-28 16:46:08 -0400, Tom Lane wrote:
>> Well, it's only important on platforms where we can't restrict
>> libpq.so from exporting all symbols.  I don't know how close we are
>> to deciding that such cases are no longer interesting to worry about.
>> Makefile.shlib seems to know how to do it everywhere except Windows,
>> and I imagine we know how to do it over in the MSVC scripts.

> Hm, then I'd argue that we don't need to care about it anymore. The meson
> build does the necessary magic on windows, as do the current msvc scripts.

If we take that argument seriously, then I'm inclined to adjust my
upthread patch for Makefile.global.in so that it removes the extra
inclusions of libpgport/libpgcommon everywhere, not only macOS.
The rationale would be that it's not worth worrying about ABI
stability details on any straggler platforms.

> I think right now it doesn't work as-is on sonoma, because apple decided to
> change the option syntax, which is what causes the -e warning below, so the
> relevant option is just ignored.

Hmm, we'd better fix that then.  Or is it their bug?  It looks to me like
clang's argument is -exported_symbols_list=/path/to/exports.list, so
it must be translating that to "-e".

regards, tom lane




Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Andres Freund
Hi,

On 2023-09-28 16:46:08 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-09-27 16:52:44 -0400, Tom Lane wrote:
> >> I think it doesn't, as long as all the relevant build targets
> >> write their dependencies with "frontend_code" before "libpq".
> 
> > Hm, that's not great. I don't think that should be required. I'll try to 
> > take
> > a look at why that's needed.
> 
> Well, it's only important on platforms where we can't restrict
> libpq.so from exporting all symbols.  I don't know how close we are
> to deciding that such cases are no longer interesting to worry about.
> Makefile.shlib seems to know how to do it everywhere except Windows,
> and I imagine we know how to do it over in the MSVC scripts.

Hm, then I'd argue that we don't need to care about it anymore. The meson
build does the necessary magic on windows, as do the current msvc scripts.

I think right now it doesn't work as-is on sonoma, because apple decided to
change the option syntax, which is what causes the -e warning below, so the
relevant option is just ignored.


> There's still one duplicate warning
> from the backend link:
> 
> ld: warning: ignoring duplicate libraries: '-lpam'
> 
> I'm a bit baffled why that's showing up; there's no obvious
> double reference to pam.

I think it's because multiple libraries/binaries depend on it. Meson knows how
to deduplicate libraries found via pkg-config (presumably because that has
enough information for a topological sort), but apparently not when they're
found as "raw" libraries.  Until now that was also just pretty harmless, so I
understand not doing anything about it.

I see a way to avoid the warnings, but perhaps it's better to ask the meson
folks to put in a generic way of dealing with this.

Greetings,

Andres Freund




Re: Does anyone ever use OPTIMIZER_DEBUG?

2023-09-28 Thread Tom Lane
David Rowley  writes:
> In c4a1933b4 I pushed a fix for a 4 year old bug in print_path() where
> I'd forgotten to add handling for TidRangePaths while working on
> bb437f995.
> 4 years is quite a long time for such a bug. Maybe that's because
> nobody uses OPTIMIZER_DEBUG. I certainly don't, and Tom mentions [1]
> he doesn't either.
> Is there anyone out there who uses it?
> If not, it's about 320 lines of uselessness.

We could also discuss keeping the "tracing" aspect of it, but
replacing debug_print_rel with pprint(rel), which'd still allow
removal of all the "DEBUG SUPPORT" stuff at the bottom of allpaths.c.
That's pretty much all of the maintenance-requiring stuff in it.
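
That is, the OPTIMIZER_DEBUG call site in allpaths.c would reduce to
roughly this (a sketch; pprint() is the existing helper declared in
nodes/print.h):

    #ifdef OPTIMIZER_DEBUG
        pprint(rel);
    #endif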

regards, tom lane




Re: On login trigger: take three

2023-09-28 Thread Alexander Korotkov
Hi, Daniel!

On Mon, Sep 25, 2023 at 3:42 PM Daniel Gustafsson  wrote:
> > On 25 Sep 2023, at 11:13, Alexander Korotkov  wrote:
>
> > I'd like to do a short summary of
> > design issues on this thread.
>
> Thanks for summarizing this long thread!
>
> > the patch for the GUC option to disable
> > all event triggers resides in a separate thread [4]. Apparently that
> > patch should be committed first [5].
>
> I have committed the prerequisite patch for temporarily disabling EVTs via a
> GUC in 7750fefdb2.  We should absolutely avoid creating any more dependencies
> on single-user mode (yes, I have changed my mind since the beginning of the
> thread).

Thank you for committing 7750fefdb2.  I've revised this patch
accordingly: I've resolved the conflicts, made use of the
event_triggers GUC, and adjusted some comments.

> > 3. Yet another question is connection-time overhead introduced by this
> > patch. The overhead estimate varies from no measurable overhead [6] to
> > 5% overhead [7]. In order to overcome that, [8] has introduced a
> > database-level flag indicating whether there are connection triggers.
> > Later this flag was removed [9] in a hope that the possible overhead
> > is acceptable.
>
> While I disliked the flag, I do think the overhead is problematic.  Last
> time I profiled it I found it noticeable, and it seems expensive for such
> a niche feature to impact such a hot path.  Maybe you can think of other
> ways to reduce the cost here (if it indeed is an issue in the latest
> version of the patch, which is not one I've benchmarked)?

I don't think I can reproduce the performance regression pointed out
by Pavel Stehule [1].

I ran a simple ";" SQL script (this script doesn't even take a
snapshot) on my laptop, running it multiple times with event_triggers =
on and event_triggers = off.

pgbench -c 10 -j 10 -M prepared -f 1.sql -P 1 -T 60 -C postgres
event_triggers = on
run1: 2261
run2: 2301
run3: 2281
event_triggers = off
run1: 2321
run2: 2277
run3: 2267

pgbench -c 10 -j 10 -M prepared -f 1.sql -P 1 -T 60 -C postgres
event_triggers = on
run1: 731
run2: 740
run3: 733
event_triggers = off
run1: 739
run2: 734
run3: 731

I can't confirm any measurable overhead.

> > 5. It was also pointed out [11] that ^C in psql doesn't cancel
> > long-running client connection triggers. That might be considered a
> > psql problem though.
>
> While it is a psql problem, it's exacerbated by a slow login EVT and it breaks
> what I would guess is the mental model of many who press ^C in a stalled 
> login.
> At the very least we should probably document the risk?

Done in the attached patch.

Links
1. 
https://www.postgresql.org/message-id/CAFj8pRChwu01VLx76nKBVyScHCsd1YnBGiKfDJ6h17g4CSnUBg%40mail.gmail.com

--
Regards,
Alexander Korotkov


0001-Add-support-of-event-triggers-on-authenticated-l-v41.patch
Description: Binary data


Does anyone ever use OPTIMIZER_DEBUG?

2023-09-28 Thread David Rowley
In c4a1933b4 I pushed a fix for a 4 year old bug in print_path() where
I'd forgotten to add handling for TidRangePaths while working on
bb437f995.

4 years is quite a long time for such a bug. Maybe that's because
nobody uses OPTIMIZER_DEBUG. I certainly don't, and Tom mentions [1]
he doesn't either.

Is there anyone out there who uses it?

If not, it's about 320 lines of uselessness.

David

[1] https://www.postgresql.org/message-id/951421.1695911...@sss.pgh.pa.us




The danger of deleting backup_label

2023-09-28 Thread David Steele

Hackers,

While reading through [1] I saw there were two instances where 
backup_label was removed to achieve a "successful" restore. This might 
work on trivial test restores but is an invitation to (silent) disaster 
in a production environment where the checkpoint stored in backup_label 
is almost certain to be earlier than the one stored in pg_control.


A while back I had an idea on how to prevent this so I decided to give 
it a try. Basically, before writing pg_control to the backup I set 
checkpoint to 0xFFFFFFFFFFFFFFFF.
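
For illustration, a minimal sketch of the idea, assuming "control" points 
at the copy of ControlFileData about to be written into the backup (names 
from pg_control.h and pg_crc32c.h); only a sketch, not the eventual 
pg_basebackup change:

    /*
     * Poison the checkpoint LSN in the copy of pg_control destined for
     * the backup, then recompute the CRC so the copy still validates.
     * Recovery without backup_label then cannot locate a checkpoint and
     * fails hard instead of silently using a too-recent one.
     */
    control->checkPoint = (XLogRecPtr) 0xFFFFFFFFFFFFFFFF;
    INIT_CRC32C(control->crc);
    COMP_CRC32C(control->crc, control, offsetof(ControlFileData, crc));
    FIN_CRC32C(control->crc);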


Recovery worked perfectly as long as backup_label was present and failed 
hard when it was not:


LOG:  invalid primary checkpoint record
PANIC:  could not locate a valid checkpoint record

It's not a very good message, but at least the foot gun has been 
removed. We could use this as a special value to give a better message, 
and maybe use something a bit more unique like 0xFADEFADE (or 
whatever) as the value.


This is all easy enough for pg_basebackup to do, but will certainly be 
non-trivial for most backup software to implement. In [2] we have 
discussed perhaps returning pg_control from pg_backup_stop() for the 
backup software to save, or it could become part of the backup_label 
(encoded as hex or base64, presumably). I prefer the latter as this 
means less work for the backup software (except for the need to exclude 
pg_control from the backup).


I don't have a patch for this yet because I did not test this idea using 
pg_basebackup, but I'll be happy to work up a patch if there is interest.


I feel like we should do *something* here. If even advanced users are 
making this mistake, then we should take it pretty seriously.


Regards,
-David

[1] 
https://www.postgresql.org/message-id/flat/CAM_vCudkSjr7NsNKSdjwtfAm9dbzepY6beZ5DP177POKy8%3D2aw%40mail.gmail.com#746e492bfcd2667635634f1477a61288
[2] 
https://www.postgresql.org/message-id/CA%2BhUKGKiZJcfZSA5G5Rm8oC78SNOQ4c8az5Ku%3D4wMTjw1FZ40g%40mail.gmail.com





Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Tom Lane
Andres Freund  writes:
> On 2023-09-27 16:52:44 -0400, Tom Lane wrote:
>> I think it doesn't, as long as all the relevant build targets
>> write their dependencies with "frontend_code" before "libpq".

> Hm, that's not great. I don't think that should be required. I'll try to take
> a look at why that's needed.

Well, it's only important on platforms where we can't restrict
libpq.so from exporting all symbols.  I don't know how close we are
to deciding that such cases are no longer interesting to worry about.
Makefile.shlib seems to know how to do it everywhere except Windows,
and I imagine we know how to do it over in the MSVC scripts.

>> However, it's hard to test this, because the meson build
>> seems completely broken on current macOS:

> Looks like you need 1.2 for the new clang / ld output...  Apparently apple's
> linker changed the format of its version output :/.

Ah, yeah, updating MacPorts again brought in meson 1.2.1 which seems
to work.  I now see a bunch of

ld: warning: ignoring -e, not used for output type
ld: warning: -undefined error is deprecated

which are unrelated.  There's still one duplicate warning
from the backend link:

ld: warning: ignoring duplicate libraries: '-lpam'

I'm a bit baffled why that's showing up; there's no obvious
double reference to pam.

regards, tom lane




Re: Annoying build warnings from latest Apple toolchain

2023-09-28 Thread Andres Freund
Hi,

On 2023-09-27 16:52:44 -0400, Tom Lane wrote:
> I wrote:
> > I've not yet looked at the meson build infrastructure to
> > see if it needs a corresponding change.
> 
> I think it doesn't, as long as all the relevant build targets
> write their dependencies with "frontend_code" before "libpq".

Hm, that's not great. I don't think that should be required. I'll try to take
a look at why that's needed.


> However, it's hard to test this, because the meson build
> seems completely broken on current macOS:

I am travelling and I don't quite dare to upgrade my mac mini remotely. So I
can't try Sonoma directly.

But CI worked after switching to sonoma - although installing packages from
macports took forever, due to macports building all packages locally.

https://cirrus-ci.com/task/5133869171605504

There's some weird warnings about hashlib/blake2, but it looks like that's a
python installation issue. Looks like this is with python from macports in
PATH.

[00:59:14.442] ERROR:root:code for hash blake2b was not found.
[00:59:14.442] Traceback (most recent call last):
[00:59:14.442]   File 
"/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/hashlib.py",
 line 307, in <module>
[00:59:14.442] globals()[__func_name] = __get_hash(__func_name)
[00:59:14.442]  ^^^
[00:59:14.442]   File 
"/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/hashlib.py",
 line 129, in __get_openssl_constructor
[00:59:14.442] return __get_builtin_constructor(name)
[00:59:14.442]^^^
[00:59:14.442]   File 
"/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/hashlib.py",
 line 123, in __get_builtin_constructor
[00:59:14.442] raise ValueError('unsupported hash type ' + name)
[00:59:14.442] ValueError: unsupported hash type blake2b

This just happens whenever python's hashlib - supposedly in the standard
library - is imported.


There *are* some buildsystem warnings:
[00:59:27.289] [260/2328] Linking target src/interfaces/libpq/libpq.5.dylib
[00:59:27.289] ld: warning: -undefined error is deprecated
[00:59:27.289] ld: warning: ignoring -e, not used for output type

Full command:
[1/1] cc  -o src/interfaces/libpq/libpq.5.dylib 
src/interfaces/libpq/libpq.5.dylib.p/fe-auth-scram.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-auth.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-connect.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-exec.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-lobj.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-misc.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-print.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-protocol3.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-secure.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-trace.c.o 
src/interfaces/libpq/libpq.5.dylib.p/legacy-pqsignal.c.o 
src/interfaces/libpq/libpq.5.dylib.p/libpq-events.c.o 
src/interfaces/libpq/libpq.5.dylib.p/pqexpbuffer.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-secure-common.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-secure-openssl.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-gssapi-common.c.o 
src/interfaces/libpq/libpq.5.dylib.p/fe-secure-gssapi.c.o 
-Wl,-dead_strip_dylibs -Wl,-headerpad_max_install_names -Wl,-undefined,error 
-shared -install_name @rpath/libpq.5.dylib -compatibility_version 5 
-current_version 5.17 -isysroot 
/Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk -Og -ggdb 
-Wl,-rpath,/opt/local/lib -Wl,-rpath,/opt/local/libexec/openssl3/lib 
src/common/libpgcommon_shlib.a src/port/libpgport_shlib.a 
-exported_symbols_list=/Users/admin/pgsql/build/src/interfaces/libpq/exports.list
 -lm /opt/local/lib/libintl.dylib /opt/local/lib/libgssapi_krb5.dylib 
/opt/local/lib/libldap.dylib /opt/local/lib/liblber.dylib 
/opt/local/libexec/openssl3/lib/libssl.dylib 
/opt/local/libexec/openssl3/lib/libcrypto.dylib /opt/local/lib/libz.dylib 
/opt/local/lib/libzstd.dylib
ld: warning: -undefined error is deprecated
ld: warning: ignoring -e, not used for output type

So we need to make the addition of -Wl,-undefined,error conditional; that
should be easy enough. Although I'm a bit confused about this being
deprecated.

For the -e bit, this seems to do the trick:
@@ -224,7 +224,7 @@ elif host_system == 'darwin'
   library_path_var = 'DYLD_LIBRARY_PATH'
 
   export_file_format = 'darwin'
-  export_fmt = '-exported_symbols_list=@0@'
+  export_fmt = '-Wl,-exported_symbols_list,@0@'
 
   mod_link_args_fmt = ['-bundle_loader', '@0@']
   mod_link_with_dir = 'bindir'

It's quite annoying that Apple keeps changing the option syntax.


> (I also tried with a more recent meson version, 1.1.1, with
> the same result.)

Looks like you need 1.2 for the new clang / ld output...  Apparently apple's
linker changed the format of its version output :/.

Greetings,

Andres Freund




Re: Requiring recovery.signal or standby.signal when recovering with a backup_label

2023-09-28 Thread David Steele

On 9/27/23 23:58, Kyotaro Horiguchi wrote:

At Fri, 10 Mar 2023 15:59:04 +0900, Michael Paquier  wrote 
in

My apologies for the long message, but this deserves some attention,
IMHO.

So, any thoughts?


Sorry for being late. However, I agree with David's concern regarding
the unnecessary inconvenience it introduces.  I'd like to maintain the
functionality.


After some playing around, I find I agree with Michael on this, i.e. 
require at least standby.signal when a backup_label is present.


According to my testing, you can preserve the "independent server" 
functionality by setting archive_command = /bin/false. In this case the 
timeline is not advanced and recovery proceeds from whatever is 
available in pg_wal.


I think this type of recovery from a backup label without a timeline 
change should absolutely be the exception, not the default as it seems 
to be now. If the server is truly independent, then the timeline change 
is not important. If the server is not independent, then the timeline 
change is critical.


So overall, +1 for Michael's patch, though I have only read through it 
and not tested it yet.


One comment, though: if we are going to require recovery.signal when 
backup_label is present, should it just be implied? Why error out and force 
the user to create it?


Regards,
-David




Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

2023-09-28 Thread Tom Lane
Tommy Pavlicek  writes:
> I've attached a new version of the ALTER OPERATOR patch that allows
> no-ops. It should be ready to review now.

I got around to looking through this finally (sorry about the delay).
I'm mostly on board with the functionality, with the exception that
I don't see why we should allow ALTER OPERATOR to cause new shell
operators to be created.  The argument for allowing that in CREATE
OPERATOR was mainly to allow a linked pair of operators to be created
without a lot of complexity (specifically, being careful to specify
the commutator or negator linkage only in the second CREATE, which
is a rule that requires another exception for a self-commutator).
However, if you're using ALTER OPERATOR then you might as well create
both operators first and then link them with an ALTER command.
In fact, I don't really see a use-case where the operators wouldn't
both exist; isn't this feature mainly to allow retrospective
correction of omitted linkages?  So I think allowing ALTER to create a
second operator is more likely to allow mistakes to sneak by than to
do anything useful --- and they will be mistakes you can't correct
except by starting over.  I'd even question whether we want to let
ALTER establish a linkage to an existing shell operator, rather than
insisting you turn it into a valid operator via CREATE first.

If we implement it with that restriction then I don't think the
refactorization done in 0001 is correct, or at least not ideal.

(In any case, it seems like a bad idea that the command reference
pages make no mention of this stuff about shell operators.  It's
explained in 38.15. Operator Optimization Information, but it'd
be worth at least alluding to that section here.  Or maybe we
should move that info to CREATE OPERATOR?)

More generally, you muttered something about 0001 fixing some
existing bugs, but if so I can't see those trees for the forest of
refactorization.  I'd suggest splitting any bug fixes apart from
the pure-refactorization step.

regards, tom lane




Re: Is this a problem in GenericXLogFinish()?

2023-09-28 Thread Jeff Davis
On Wed, 2023-09-27 at 18:57 +0300, Heikki Linnakangas wrote:
> We could define a new REGBUF_NO_CHANGE flag to signal that the buffer
> is 
> registered just for locking purposes, and not actually modified by
> the 
> WAL record.

Out of curiosity I also added a trial assert (not intending to actually
add this):

  Assert(!(flags & REGBUF_NO_CHANGE) || !BufferIsExclusiveLocked());

I expected that to fail, but it didn't -- perhaps that buffer is only
locked during replay or something? I don't think we want that specific
Assert; I was just experimenting with some cross-checks we could do to
verify that REGBUF_NO_CHANGE is used properly.
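
For illustration, a minimal usage sketch of the proposed flag (the call
site and buffer name here are made up):

    /*
     * Register a buffer that this record only locks and does not modify,
     * so redo takes the lock without expecting any page change.
     */
    XLogRegisterBuffer(1, metabuf, REGBUF_STANDARD | REGBUF_NO_CHANGE);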

Also, I ran into some problems with GIN that might require a bit more
refactoring in ginPlaceToPage(). Perhaps we could consider
REGBUF_NO_CHANGE a general bypass of the Assert(BufferIsDirty()), and
use it temporarily until we have a chance to analyze/refactor each of
the callers that need it.

Regards,
Jeff Davis





Re: Index range search optimization

2023-09-28 Thread Alexander Korotkov
On Thu, Sep 28, 2023 at 5:21 AM Peter Geoghegan  wrote:
> On Wed, Sep 27, 2023 at 9:41 AM Alexander Korotkov  
> wrote:
> > Fixed typo inficating => indicating as pointed by Pavel.
> > Peter, what do you think about the current shape of the patch?
>
> I'll try to get to this tomorrow. I'm rather busy with moving home at
> the moment, unfortunately.

No problem, thank you!

--
Regards,
Alexander Korotkov




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Tom Lane
Andrew Dunstan  writes:
> I wonder if we could have a boolean flag in pg_enum, indicating that 
> setting an enum to that value was forbidden.

Yeah, but that still offers no coherent solution to the problem of
what happens if there's a table that already contains such a value.
It doesn't seem terribly useful to forbid new entries if you can't
get rid of old ones.

Admittedly, a DISABLE flag would at least offer a chance at a
race-condition-free scan to verify that no such values remain
in tables.  But as somebody already mentioned upthread, that
wouldn't guarantee that the value doesn't appear in non-leaf
index pages.  So basically you could never get rid of the pg_enum
row, short of a full dump and restore.

We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Andrew Dunstan



On 2023-09-28 Th 10:28, Tom Lane wrote:

Данил Столповских writes:

I would like to offer my patch on the problem of removing values from enums
It adds support for the expression ALTER TYPE <name> DROP VALUE


This does not fix any of the hard problems that caused us not to
have such a feature to begin with.  Notably, what happens to
stored data of the enum type if it is a now-deleted value?





I wonder if we could have a boolean flag in pg_enum, indicating that 
setting an enum to that value was forbidden. That wouldn't delete the 
value but it wouldn't show up in enum_range and friends. We'd have to 
teach pg_dump and pg_upgrade to deal with it, but that shouldn't be too 
hard.



Perhaps the command could be something like


ALTER TYPE enum_name DISABLE value;


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [HACKERS] pg_upgrade failed with error - ERROR: column "a" in child table must be marked NOT NULL

2023-09-28 Thread Justin Pryzby
On Thu, Dec 14, 2017 at 07:18:59PM +0700, Ali Akbar wrote:
> By the way, should i add this patch to the current commitfest?

The patch for pg_upgrade --check got forgotten 6 years ago, but it's a
continuing problem (we hit it again, which cost an hour during a
pg_upgrade) and ought to be (or to have been) backpatched.

I didn't dig into it, but maybe it'd even be possible to fix the issue
with ALTER..DROP NOT NULL ...

-- 
Justin




Re: wal recycling problem

2023-09-28 Thread Christoph Moench-Tegeder
## Fabrice Chapuis (fabrice636...@gmail.com):

> We have a cluster of 2 members (1 primary and 1 standby) with Postgres
> version 14.9 and 2 barman server, slots are only configured for barman,
> barman is version 3.7.

The obvious question here is: can both of those barmans keep up with
your database, or are you seeing WAL retention due to exactly these
replication slots? (Check pg_replication_slots).

Regards,
Christoph

-- 
Spare Space




Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-28 Thread Jeff Davis
On Thu, 2023-09-28 at 11:34 -0400, Robert Haas wrote:
> I guess it depends on whether we think this is a bug. I think you
> could argue it either way.

I'd suggest backporting to 16 unless there's some kind of difficulty.
Otherwise we have a minor difference in behavior for no reason.

Regards,
Jeff Davis





Re: bug: ANALYZE progress report with inheritance tables

2023-09-28 Thread Heikki Linnakangas

On 22/01/2023 18:23, Justin Pryzby wrote:

pg_stat_progress_analyze was added in v13 (a166d408e).

For tables with inheritance children, do_analyze_rel() and
acquire_sample_rows() are called twice.  The first time through,
pgstat_progress_start_command() has memset() the progress array to zero.

But the 2nd time, ANALYZE_BLOCKS_DONE is already set from the previous
call, and BLOCKS_TOTAL can be set to some lower value (and in any case a
value unrelated to the pre-existing value of BLOCKS_DONE).  So the
progress report briefly shows a bogus combination of values and, with
these assertions, fails regression tests in master and v13, unless
BLOCKS_DONE is first zeroed.


Good catch!

I think the counts need to be reset even earlier, in 
acquire_inherited_sample_rows(), at the same time that we update 
PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID. See attached patch. 
Otherwise, there's a brief moment where we have already updated the 
child table ID, but PROGRESS_ANALYZE_BLOCKS_TOTAL and 
PROGRESS_ANALYZE_BLOCKS_DONE still show the counts from the previous 
child table. And if it's a foreign table, the FDW's sampling function 
might not update the progress report at all, in which case the old 
values will be displayed until the table is fully processed.


I appreciate the assertions you added; they made it easy to reproduce 
the problem. I'm inclined not to commit them, though. It seems like a 
modularity violation for the code in backend_progress.c to have such 
intimate knowledge of what the different counters mean.


--
Heikki Linnakangas
Neon (https://neon.tech)
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index bfd981aa3f..206d1689ef 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1531,8 +1531,25 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
 		AcquireSampleRowsFunc acquirefunc = acquirefuncs[i];
 		double		childblocks = relblocks[i];
 
-		pgstat_progress_update_param(PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID,
-	 RelationGetRelid(childrel));
+		/*
+		 * Report progress.  The sampling function will normally report blocks
+		 * done/total, but we need to reset them to 0 here, so that they don't
+		 * show stale values until then.
+		 */
+		{
+			const int	progress_index[] = {
+PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID,
+PROGRESS_ANALYZE_BLOCKS_DONE,
+PROGRESS_ANALYZE_BLOCKS_TOTAL
+			};
+			const int64 progress_vals[] = {
+RelationGetRelid(childrel),
+0,
+0,
+			};
+
+			pgstat_progress_update_multi_param(3, progress_index, progress_vals);
+		}
 
 		if (childblocks > 0)
 		{


Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-28 Thread Robert Haas
On Wed, Sep 27, 2023 at 2:58 AM Amit Kapila  wrote:
> But OTOH, the owner of the subscription can be changed by the Alter
> Subscription command whereas superuser status can't be changed. I
> think we should consider changing the message for this case.

The superuser status of the subscription owner is definitely *not* a
parameter of the subscription, so it doesn't seem like the same
message is appropriate.

> Adding Jeff and Robert to see what is their opinion on whether we
> should backpatch this or not.

I guess it depends on whether we think this is a bug. I think you
could argue it either way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Eliminate redundant tuple visibility check in vacuum

2023-09-28 Thread Robert Haas
On Thu, Sep 28, 2023 at 8:46 AM Melanie Plageman
 wrote:
> Once I started writing the function comment, however, I felt a bit
> awkward. In order to make the function available to both pruneheap.c
> and vacuumlazy.c, I had to put it in a header file. Writing a
> function, available to anyone including heapam.h, which takes an int
> and returns an HTSV_Result feels a bit odd. Do we want it to be common
> practice to use an int value outside the valid enum range to store
> "status not yet computed" for HTSV_Results?

I noticed the awkwardness of that return convention when I reviewed
the first version of this patch, but I decided it wasn't worth
spending time discussing. To avoid it, we'd either need to add a new
HTSV_Result that is only used here, or add a new type
HTSV_Result_With_An_Extra_Value and translate between the two, or pass
back a boolean + an enum instead of an array of int8. And all of those
seem to me to suck -- the first two are messy and the third would make
the return value much wider. So, no, I don't really like this, but
also, what would actually be any better? Also, IMV at least, it's more
of an issue of it being sort of ugly than of anything becoming common
practice, because how many callers of heap_page_prune() are there ever
going to be? AFAIK, we've only ever had two since forever, and even if
we grow one or two more at some point, that's still not that many.

I went ahead and committed 0001. If Andres still wants to push for
more renaming there, that can be a follow-up patch. And we can see if
he or anyone else has any comments on this new version of 0002. To me
we're down into the level of details that probably don't matter very
much one way or the other, but others may disagree.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Is it worth adding Assert(false) for unknown paths in print_path()?

2023-09-28 Thread David Rowley
On Fri, 29 Sept 2023 at 03:23, Tom Lane  wrote:
> FWIW, I'd argue for dropping print_path rather than continuing to
> maintain it.  I never use it, finding pprint() to serve the need
> better and more reliably.

Then perhaps we just need to open a thread with an appropriate subject
to check if anyone finds it useful and if we don't get any response
after some number of weeks, just remove it from master.

David




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Tom Lane
Данил Столповских writes:
> I would like to offer my patch on the problem of removing values from enums
> It adds support for the expression ALTER TYPE <name> DROP VALUE
> 

This does not fix any of the hard problems that caused us not to
have such a feature to begin with.  Notably, what happens to
stored data of the enum type if it is a now-deleted value?

regards, tom lane




Re: Is it worth adding Assert(false) for unknown paths in print_path()?

2023-09-28 Thread Tom Lane
David Rowley  writes:
> In [1] Andrey highlighted that I'd forgotten to add print_path()
> handling for TidRangePaths in bb437f995.

> I know the OPTIMIZER_DEBUG code isn't exactly well used.  I never
> personally use it and I work quite a bit in the planner, however, if
> we're keeping it, I thought maybe we might get the memo of missing
> paths a bit sooner if we add an Assert(false) in the default cases.

FWIW, I'd argue for dropping print_path rather than continuing to
maintain it.  I never use it, finding pprint() to serve the need
better and more reliably.  However, assuming that we keep it ...

> Is the attached worthwhile?

... I think this is actually counterproductive.  It will certainly
not help draw the notice of anyone who wouldn't otherwise pay
attention to print_path.  Also, observe the extremely longstanding
policy decision in outNode's default: case:

/*
 * This should be an ERROR, but it's too useful to be able to
 * dump structures that outNode only understands part of.
 */
elog(WARNING, "could not dump unrecognized node type: %d",
 (int) nodeTag(obj));
break;

The same argument applies to print_path, I should think.

regards, tom lane




Re: Clarify where the severity level is defined

2023-09-28 Thread Kuwamura Masaki
>
> Committed, with some minor wordsmithing. Thanks!
>

Thanks for tweaking and pushing, Daniel-san!

Masaki Kuwamura


Re: Set enable_seqscan doesn't take effect?

2023-09-28 Thread Bruce Momjian
On Thu, Sep 28, 2023 at 03:38:28PM +0800, jacktby jacktby wrote:
> > You start several threads a week, often clearly not having done much, if 
> > any,
> > prior research. Often even sending the same question to multiple lists. It
> > should not be hard to find an explanation for the behaviour you see here.
> > 
> > pgsql-hackers isn't a "do my work for me service". We're hacking on
> > postgres. It's fine to occasionally ask for direction, but you're very 
> > clearly
> > exceeding that.
> > 
> > Greetings,
> > 
> > Andres Freund
> I’m so sorry for that. I think I’m not very familiar with pg, so I ask many 
> naive questions. And I apologize for my behavior.

I think you might find our IRC channel a more natural fit for getting
your questions answered:

https://www.postgresql.org/community/irc/

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Vik Fearing

On 9/28/23 14:13, Данил Столповских wrote:

Greetings, everyone!
I would like to offer my patch on the problem of removing values from enums

It adds support for the expression ALTER TYPE <name> DROP VALUE


Added:
1. expression in grammar
2. function to drop enum values
3. regression tests
4. documentation


Thanks for this patch that a lot of people want.

However, it does not seem to address the issue of how to handle the 
dropped value being in the high key of an index.  Until we solve that 
problem (and maybe others), this kind of patch is insufficient to add 
the feature.

--
Vik Fearing





pg_resetwal: Corrections around -c option

2023-09-28 Thread Peter Eisentraut
Branching off from [0], here is a for-discussion patch about some 
corrections for the pg_resetwal -c (--commit-timestamp-ids) option.


First, in the documentation for finding a manual value for the -c option
based on the files present in the data directory, it was missing a 
multiplier, like for the other SLRU-based values, and also missing the 
mention of adding one for the upper value.  The value I came up with is 
computed as


SLRU_PAGES_PER_SEGMENT * COMMIT_TS_XACTS_PER_PAGE = 13088 = 0x3320

Second, the present pg_resetwal code hardcodes the minimum value as 2, 
which is FrozenTransactionId, but it's not clear why that is allowed. 
Maybe we should change that to FirstNormalTransactionId, which matches 
other xid-related options in pg_resetwal.


Thoughts?


[0]: 
https://www.postgresql.org/message-id/flat/0f3ab4a1-ae80-56e8-3426-6b4a02507...@eisentraut.org

From c092e8767cad0618feee6e6a76071aeb9df3efad Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 28 Sep 2023 15:43:01 +0200
Subject: [PATCH] WIP: pg_resetwal: Corrections around -c option

In the documentation for finding a manual value for the -c option
based on the files present in the data directory, it was missing a
multiplier, like for the other SLRU-based values, and also missing the
mention of adding one for the upper value.  The value mentioned here
is computed as

SLRU_PAGES_PER_SEGMENT * COMMIT_TS_XACTS_PER_PAGE = 13088 = 0x3320

Also, the present pg_resetwal code hardcodes the minimum value as 2,
which is FrozenTransactionId, but it's not clear why that is allowed.
Change that to FirstNormalTransactionId, which matches other
xid-related options in pg_resetwal.
---
 doc/src/sgml/ref/pg_resetwal.sgml |  7 ---
 src/bin/pg_resetwal/pg_resetwal.c | 12 ++--
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/ref/pg_resetwal.sgml 
b/doc/src/sgml/ref/pg_resetwal.sgml
index 08cd3ce5fc..40ea1ff724 100644
--- a/doc/src/sgml/ref/pg_resetwal.sgml
+++ b/doc/src/sgml/ref/pg_resetwal.sgml
@@ -183,11 +183,12 @@ Options
   A safe value for the oldest transaction ID for which the commit time can
   be retrieved (first part) can be determined by looking
   for the numerically smallest file name in the directory
-  pg_commit_ts under the data directory.  Conversely, 
a safe
+  pg_commit_ts under the data directory and then
+  multiplying by 13088 (0x3320).  Conversely, a safe
   value for the newest transaction ID for which the commit time can be
   retrieved (second part) can be determined by looking for the numerically
-  greatest file name in the same directory.  The file names are in
-  hexadecimal.
+  greatest file name in the same directory, adding one, and then
+  multiplying by 13088 (0x3320).  The file names are in hexadecimal.
  
 

diff --git a/src/bin/pg_resetwal/pg_resetwal.c 
b/src/bin/pg_resetwal/pg_resetwal.c
index 47e05bd2c9..b3fb94dfbe 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -211,13 +211,13 @@ main(int argc, char *argv[])
exit(1);
}
 
-   if (set_oldest_commit_ts_xid < 2 &&
-   set_oldest_commit_ts_xid != 0)
-   pg_fatal("transaction ID (-c) must be 
either 0 or greater than or equal to 2");
+   if (set_oldest_commit_ts_xid < 
FirstNormalTransactionId &&
+   set_oldest_commit_ts_xid != 
InvalidTransactionId)
+   pg_fatal("transaction ID (-c) must be 
either %u or greater than or equal to %u", InvalidTransactionId, 
FirstNormalTransactionId);
 
-   if (set_newest_commit_ts_xid < 2 &&
-   set_newest_commit_ts_xid != 0)
-   pg_fatal("transaction ID (-c) must be 
either 0 or greater than or equal to 2");
+   if (set_newest_commit_ts_xid < 
FirstNormalTransactionId &&
+   set_newest_commit_ts_xid != 
InvalidTransactionId)
+   pg_fatal("transaction ID (-c) must be 
either %u or greater than or equal to %u", InvalidTransactionId, 
FirstNormalTransactionId);
break;
 
case 'o':
-- 
2.42.0



Re: Is it worth adding Assert(false) for unknown paths in print_path()?

2023-09-28 Thread Alvaro Herrera
On 2023-Sep-29, David Rowley wrote:

> In [1] Andrey highlighted that I'd forgotten to add print_path()
> handling for TidRangePaths in bb437f995.
> 
> I know the OPTIMIZER_DEBUG code isn't exactly well used.  I never
> personally use it and I work quite a bit in the planner, however, if
> we're keeping it, I thought maybe we might get the memo of missing
> paths a bit sooner if we add an Assert(false) in the default cases.
> 
> Is the attached worthwhile?

Hmm, if we had a buildfarm animal with OPTIMIZER_DEBUG turned on, then I
agree it would catch the omission quickly.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"¿Cómo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran?" (Germán Poo)




Re: Clarify where the severity level is defined

2023-09-28 Thread Daniel Gustafsson
> On 25 Sep 2023, at 08:37, Daniel Gustafsson  wrote:
> 
>> On 25 Sep 2023, at 08:22, Kuwamura Masaki  
>> wrote:
> 
>> Recently I read the document about ereport()[1].
>> Then, I felt that there is little information about severity level.
>> So I guess it can be kind to clarify where severity level is defined(see 
>> attached patch please).
> 
> That makes sense; we already point to other related files on that page, so
> this is in line with that.

Committed, with some minor wordsmithing. Thanks!

--
Daniel Gustafsson





Re: Synchronizing slots from primary to standby

2023-09-28 Thread Drouvot, Bertrand

Hi,

On 9/25/23 6:10 AM, shveta malik wrote:

On Fri, Sep 22, 2023 at 3:48 PM Amit Kapila  wrote:


On Thu, Sep 21, 2023 at 9:16 AM shveta malik  wrote:


On Tue, Sep 19, 2023 at 10:29 AM shveta malik  wrote:

Currently in patch001, synchronize_slot_names is a GUC on both primary
and physical standby. This GUC tells which all logical slots need to
be synced on physical standbys from the primary. Ideally it should be
a GUC on physical standby alone and each physical standby should be
able to communicate the value to the primary (considering the value
may vary for different physical replicas of the same primary). The
primary on the other hand should be able to take UNION of these values
and let the logical walsenders (belonging to the slots in UNION
synchronize_slots_names) wait for physical standbys for confirmation
before sending those changes to logical subscribers. The intent is
logical subscribers should never be ahead of physical standbys.



Before getting into the details of 'synchronize_slot_names', I would
like to know whether we really need the second GUC
'standby_slot_names'. Can't we simply allow all the logical wal
senders corresponding to 'synchronize_slot_names' to wait for just the
physical standby(s) (physical slot corresponding to such physical
standby) that have sent the 'synchronize_slot_names' list? We should have
one physical standby slot corresponding to one physical standby.



yes, with the new approach (to be implemented next) where we plan to
send synchronize_slot_names from each physical standby to primary, the
standby_slot_names GUC should no longer be needed on primary. The
physical standbys sending requests should automatically become the
ones to be waited for confirmation on the primary.



I think that standby_slot_names could be used to do some filtering (meaning:
for which standby(s) we don't want the logical replication on the primary to go
ahead, and for which standby(s) one would allow it).

I think that removing the GUC would:

- remove this flexibility
- probably open corner cases like: what if a standby is down? Would that mean
that synchronize_slot_names not being sent to the primary would allow the
decoding on the primary to go ahead?

So, I'm not sure we should remove this GUC.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Eliminate redundant tuple visibility check in vacuum

2023-09-28 Thread Melanie Plageman
On Thu, Sep 21, 2023 at 3:53 PM Robert Haas  wrote:
>
> On Wed, Sep 13, 2023 at 1:29 PM Andres Freund  wrote:
>
> > >   /*
> > >* The criteria for counting a tuple as live in this block 
> > > need to
> > > @@ -1682,7 +1664,7 @@ retry:
> > >* (Cases where we bypass index vacuuming will violate this 
> > > optimistic
> > >* assumption, but the overall impact of that should be 
> > > negligible.)
> > >*/
> > > - switch (res)
> > > + switch ((HTSV_Result) presult.htsv[offnum])
> > >   {
> > >   case HEAPTUPLE_LIVE:
> >
> > I think we should assert that we have a valid HTSV_Result here, i.e. not
> > -1. You could wrap the cast and Assert into an inline funciton as well.
>
> This isn't a bad idea, although I don't find it completely necessary either.

Attached v5 does this. Even though a value of -1 would hit the default
switch case and error out, I can see the argument for this validation
-- as all other places switching on an HTSV_Result are doing so on a
value which was always an HTSV_Result.

Once I started writing the function comment, however, I felt a bit
awkward. In order to make the function available to both pruneheap.c
and vacuumlazy.c, I had to put it in a header file. Writing a
function, available to anyone including heapam.h, which takes an int
and returns an HTSV_Result feels a bit odd. Do we want it to be common
practice to use an int value outside the valid enum range to store
"status not yet computed" for HTSV_Results?

Anyway, on a tactical note, I added the inline function to heapam.h
below the PruneResult definition since it is fairly tightly coupled to
the htsv array in PruneResult. All of the function prototypes are
under a comment that says "function prototypes for heap access method"
-- which didn't feel like an accurate description of this function. I
wonder if it makes sense to have pruneheap.c include vacuum.h and move
pruning specific stuff like this helper and PruneResult over there? I
can't remember why I didn't do this before, but maybe there is a
reason not to? I also wasn't sure if I needed to forward declare the
inline function or not.

Oh, and, one more note. I've dropped the former patch 0001 which
changed the function comment about off_loc above heap_page_prune(). I
have plans to write a separate patch adding an error context callback
for HOT pruning with the offset number and would include such a change
in that patch.

- Melanie
From aa1ec4fa1a689f9c6328d02fd13af1a42ecb0396 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 6 Sep 2023 16:54:41 -0400
Subject: [PATCH v5 2/2] Reuse heap_page_prune() tuple visibility statuses

heap_page_prune() obtains the HTSV_Result (tuple visibility status)
returned from HeapTupleSatisfiesVacuum() for every tuple on the page and
stores them in an array. By making this array available to
heap_page_prune()'s caller lazy_scan_prune(), we can avoid an additional
call to HeapTupleSatisfiesVacuum() when freezing the tuples and
recording LP_DEAD items for vacuum. This saves resources and eliminates
the possibility that vacuuming corrupted data results in a hang due to
endless retry looping.

This replaces the retry mechanism introduced in 8523492d4 to handle cases in
which a tuple's inserting transaction aborted between the visibility check in
heap_page_prune() and lazy_scan_prune()'s call to HeapTupleSatisfiesVacuum() --
rendering it dead but without a dead line pointer. We can instead reuse the
tuple's original visibility status, circumventing any disagreements.

Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com
---
 src/backend/access/heap/pruneheap.c  | 36 +++-
 src/backend/access/heap/vacuumlazy.c | 42 
 src/include/access/heapam.h  | 25 +
 3 files changed, 54 insertions(+), 49 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index d5892a2db4..c5f1abd95a 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -53,16 +53,6 @@ typedef struct
 	 * 1. Otherwise every access would need to subtract 1.
 	 */
 	bool		marked[MaxHeapTuplesPerPage + 1];
-
-	/*
-	 * Tuple visibility is only computed once for each tuple, for correctness
-	 * and efficiency reasons; see comment in heap_page_prune() for details.
-	 * This is of type int8[], instead of HTSV_Result[], so we can use -1 to
-	 * indicate no visibility has been computed, e.g. for LP_DEAD items.
-	 *
-	 * Same indexing as ->marked.
-	 */
-	int8		htsv[MaxHeapTuplesPerPage + 1];
 } PruneState;
 
 /* Local functions */
@@ -71,6 +61,7 @@ static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
 			   Buffer buffer);
 static int	heap_prune_chain(Buffer buffer,
 			 OffsetNumber rootoffnum,
+			 int8 *htsv,
 			 PruneState *p

RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Zhijie Hou (Fujitsu)
On Thursday, September 28, 2023 5:32 PM Bharath Rupireddy 
 wrote:

Hi,

> 
> On Mon, Sep 25, 2023 at 4:31 PM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > > 4.
> > > +/*
> > > + * There is a possibility that following records may be generated
> > > + * during the upgrade.
> > > + */
> > > +is_valid = is_xlog_record_type(rmid, info, RM_XLOG_ID,
> > > XLOG_CHECKPOINT_SHUTDOWN) ||
> > > +is_xlog_record_type(rmid, info, RM_XLOG_ID,
> > > XLOG_CHECKPOINT_ONLINE) ||
...
> > >
> > > What if we missed to capture the WAL records that may be generated
> > > during upgrade?
> >
> > If such records are generated before calling
> > binary_upgrade_validate_wal_logical_end(),
> > the upgrading would fail. Otherwise it would be succeeded. Anyway, we
> > don't care such records because those aren't required to be
> > replicated. The main thing we want to detect is that we don't miss any 
> > record
> generated before server shutdown.
> 
> I read this
> https://www.postgresql.org/message-id/20230725170319.h423jbthfohwgnf7@a
> work3.anarazel.de
> and understand that the current patch implements the approach suggested
> there - "scan the end of the WAL for records that should have been streamed
> out". I think the WAL records that should have been streamed out are all WAL
> record types in _decode functions except the ones that have a no-op or an
> op unrelated to logical decoding. For instance,
> - for xlog_decode, if the records of type {XLOG_CHECKPOINT_ONLINE,
> XLOG_PARAMETER_CHANGE, XLOG_NOOP, XLOG_NEXTOID, XLOG_SWITCH,
> XLOG_BACKUP_END, XLOG_RESTORE_POINT, XLOG_FPW_CHANGE,
> XLOG_FPI_FOR_HINT, XLOG_FPI, XLOG_OVERWRITE_CONTRECORD} are found
> after confirmed_flush LSN, it is fine.
> - for xact_decode, if the records of type {XLOG_XACT_ASSIGNMENT} are found
> after confirmed_flush LSN, it is fine.
> - for standby_decode, if the records of type {XLOG_STANDBY_LOCK,
> XLOG_INVALIDATIONS} are found after confirmed_flush LSN, it is fine.
> - for heap2_decode, if the records of type {XLOG_HEAP2_REWRITE,
> XLOG_HEAP2_FREEZE_PAGE, XLOG_HEAP2_PRUNE, XLOG_HEAP2_VACUUM,
> XLOG_HEAP2_VISIBLE, XLOG_HEAP2_LOCK_UPDATED} are found after
> confirmed_flush LSN, it is fine.
> - for heap_decode, if the records of type {XLOG_HEAP_LOCK} are found after
> confirmed_flush LSN, it is fine.
> 
> I think all of the above WAL records are okay to be present after 
> cofirmed_flush
> LSN. If any WAL records other than the above are found after confirmed_flush
> LSN, those are the one that should have been streamed out and the pg_upgrade
> must complain with "The slot "foo" has not consumed the WAL yet" for all such
> slots, right? But, the function binary_upgrade_validate_wal_logical_end checks
> for only a handful of the above record types. I know that the list is arrived 
> at
> based on testing, but it may happen that any of the above WAL records may be
> generated and present before/during/after pg_upgrade for which pg_upgrade
> failure isn't wanted.
> 
> Perhaps, a function in logical/decode.c returning the WAL record as valid if 
> the
> record type is any of the above. A note in replication/decode.h and/or
> access/rmgrlist.h asking rmgr adders to categorize the WAL record type in the
> new function based on its decoding operation might help with future new WAL
> record type additions.
> 
> Thoughts?

I think this approach can work, but I am not sure if it's better than the
other approaches, mainly because it has almost the same maintenance burden as
the current approach, i.e. we need to verify and update the check function each
time we add a new WAL record type.

Apart from the WAL scan approach, we also considered an alternative approach
that does not impose an additional maintenance burden and could potentially be
less complex.  For example, we can add a new field in pg_controldata to record
the last checkpoint that happens in non-upgrade mode, so that we can compare
the slot's confirmed_flush_lsn with this value.  If they are the same, the WAL
should have been consumed; otherwise we disallow upgrading this slot.  I would
appreciate it if you could share your thoughts about this approach.
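
For illustration, a minimal backend-side sketch of that alternative; the
control file field name is hypothetical (it does not exist today):

    /*
     * Hypothetical check: the slot must have confirmed everything up to
     * the last checkpoint taken outside binary-upgrade mode.
     */
    if (slot->data.confirmed_flush != ControlFile->lastNonUpgradeCheckPoint)
        ereport(ERROR,
                (errmsg("replication slot \"%s\" has not consumed all WAL",
                        NameStr(slot->data.name))));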

And if we decided to use the WAL scan approach, instead of checking each
record, we could directly check if the WAL records can be decoded into
meaningful results by using test_decoding to decode them. This approach also
doesn't add a new maintenance burden as we anyway need to update test_decoding
if the decode logic for any record changes. This was also mentioned [1].

What do you think ?

[1] 
https://www.postgresql.org/message-id/OS0PR01MB5716FC0F814D78E82E4CC3B894C3A%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best Regards,
Hou zj


Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-09-28 Thread Drouvot, Bertrand

Hi hackers,

Please find attached a patch proposal to $SUBJECT.

This patch allows the role provided in BackgroundWorkerInitializeConnection()
and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.

In InitPostgres(), in case of a background worker, authentication is not
performed (PerformAuthentication() is not called), so allowing the role used
to connect to the database to lack login authorization seems to make sense.

With this new flag in place, one could give "high" privileges to the role used
to initialize the background workers' connections without any risk of seeing
this role being used by a "normal user" to log in.

The attached patch:

- adds the new flag
- adds documentation
- adds testing

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 82ea53a81a5e9f5e8b680ec6de849a48a7222463 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Wed, 20 Sep 2023 08:28:59 +
Subject: [PATCH v1] Allow background workers to bypass login check

This adds a new flag value (BGWORKER_BYPASS_ROLELOGINCHECK) to the flags being
used in BackgroundWorkerInitializeConnection() and 
BackgroundWorkerInitializeConnectionByOid().

This flag allows the background workers to bypass the login check done in 
InitializeSessionUserId().
---
 doc/src/sgml/bgworker.sgml|  3 +-
 src/backend/bootstrap/bootstrap.c |  2 +-
 src/backend/postmaster/autovacuum.c   |  4 +-
 src/backend/postmaster/postmaster.c   |  4 +-
 src/backend/tcop/postgres.c   |  2 +-
 src/backend/utils/init/miscinit.c |  4 +-
 src/backend/utils/init/postinit.c |  9 ++--
 src/include/miscadmin.h   |  5 +-
 src/include/postmaster/bgworker.h |  6 ++-
 .../modules/worker_spi/t/001_worker_spi.pl| 48 +++
 src/test/modules/worker_spi/worker_spi.c  | 25 +-
 11 files changed, 95 insertions(+), 17 deletions(-)
   6.9% doc/src/sgml/
   3.4% src/backend/bootstrap/
  10.6% src/backend/postmaster/
  16.8% src/backend/utils/init/
   7.2% src/include/postmaster/
  29.7% src/test/modules/worker_spi/t/
  21.1% src/test/modules/worker_spi/
   4.0% src/

diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index 9ad1146ba0..7335c02ed8 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -200,7 +200,8 @@ typedef struct BackgroundWorker
InvalidOid, the process will run as the superuser created
during initdb. If 
BGWORKER_BYPASS_ALLOWCONN
is specified as flags it is possible to bypass the 
restriction
-   to connect to databases not allowing user connections.
+   to connect to databases not allowing user connections. If 
BGWORKER_BYPASS_ROLELOGINCHECK
+   is specified as flags it is possible to bypass the login 
check to connect to databases.
A background worker can only call one of these two functions, and only
once.  It is not possible to switch databases.
   
diff --git a/src/backend/bootstrap/bootstrap.c 
b/src/backend/bootstrap/bootstrap.c
index 5810f8825e..d4b6425585 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -345,7 +345,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
if (pg_link_canary_is_frontend())
elog(ERROR, "backend is incorrectly linked to frontend 
functions");
 
-   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, false, NULL);
+   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, 0, NULL);
 
/* Initialize stuff for bootstrap-file processing */
for (i = 0; i < MAXATTR; i++)
diff --git a/src/backend/postmaster/autovacuum.c 
b/src/backend/postmaster/autovacuum.c
index ae9be9b911..dde83e76fb 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -488,7 +488,7 @@ AutoVacLauncherMain(int argc, char *argv[])
/* Early initialization */
BaseInit();
 
-   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, false, NULL);
+   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, 0, NULL);
 
SetProcessingMode(NormalProcessing);
 
@@ -1706,7 +1706,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 * Note: if we have selected a just-deleted database (due to 
using
 * stale stats info), we'll fail and exit here.
 */
-   InitPostgres(NULL, dbid, NULL, InvalidOid, false, false,
+   InitPostgres(NULL, dbid, NULL, InvalidOid, false, 0,
 dbname);
SetProcessingMode(NormalProcessing);
set_ps_display(dbname);
diff --git a/src/backend/postmaster/postmaster.c 
b/src/backend/postmaster/postmaster.c
index 54e9bfb8c4..443b1c4725 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.

Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Данил Столповских
Greetings, everyone!
I would like to offer my patch on the problem of removing values from enums

It adds support for the expression ALTER TYPE <name> DROP VALUE


Added:
1. expression in grammar
2. function to drop enum values
3. regression tests
4. documentation
Subject: [PATCH] Add DROP VALUE for ALTER TYPE with enum
---
Index: doc/src/sgml/datatype.sgml
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
--- a/doc/src/sgml/datatype.sgml	(revision 1f9e3a9be539f912babd3ad58d01a4ce6aa0b85b)
+++ b/doc/src/sgml/datatype.sgml	(revision 2984db4be6d2a4afc37ba15049c6365b880b6609)
@@ -3294,10 +3294,10 @@
 
 
  Although enum types are primarily intended for static sets of values,
- there is support for adding new values to an existing enum type, and for
- renaming values (see ).  Existing values
- cannot be removed from an enum type, nor can the sort ordering of such
- values be changed, short of dropping and re-creating the enum type.
+ there is support for adding new values to an existing enum type, and
+ also for dropping and renaming values (see ). 
+ The sort ordering of such values cannot be changed, short of dropping
+ and re-creating the enum type.
 
 
 
Index: doc/src/sgml/ref/alter_type.sgml
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/doc/src/sgml/ref/alter_type.sgml b/doc/src/sgml/ref/alter_type.sgml
--- a/doc/src/sgml/ref/alter_type.sgml	(revision 1f9e3a9be539f912babd3ad58d01a4ce6aa0b85b)
+++ b/doc/src/sgml/ref/alter_type.sgml	(revision 2984db4be6d2a4afc37ba15049c6365b880b6609)
@@ -30,6 +30,7 @@
 ALTER TYPE name action [, ... ]
 ALTER TYPE name ADD VALUE [ IF NOT EXISTS ] new_enum_value [ { BEFORE | AFTER } neighbor_enum_value ]
 ALTER TYPE name RENAME VALUE existing_enum_value TO new_enum_value
+ALTER TYPE name DROP VALUE existing_enum_value
 ALTER TYPE name SET ( property = value [, ... ] )
 
 where action is one of:
@@ -145,6 +146,15 @@
  
 

+   
+   
+   DROP VALUE
+   
+
+ This form drops a value of an enum type.
+
+   
+  
 

 
@@ -462,6 +472,13 @@
 ALTER TYPE colors RENAME VALUE 'purple' TO 'mauve';
 
   
+  
+  
+   To drop an enum value:
+
+ALTER TYPE colors DROP VALUE 'red';
+
+  
 
   
To create binary I/O functions for an existing base type:
Index: src/backend/catalog/pg_enum.c
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
--- a/src/backend/catalog/pg_enum.c	(revision 1f9e3a9be539f912babd3ad58d01a4ce6aa0b85b)
+++ b/src/backend/catalog/pg_enum.c	(revision 2984db4be6d2a4afc37ba15049c6365b880b6609)
@@ -612,6 +612,62 @@
 	table_close(pg_enum, RowExclusiveLock);
 }
 
+void
+DropEnumLabel(Oid enumTypeOid, const char *oldVal)
+{
+Relation	pg_enum;
+HeapTuple	enum_tup;
+Form_pg_enum en;
+CatCList   *list;
+int			nelems;
+HeapTuple	old_tup;
+int			i;
+
+
+/*
+ * Acquire a lock on the enum type, which we won't release until commit.
+ * This ensures that two backends aren't concurrently modifying the same
+ * enum type.  Since we are not changing the type's sort order, this is
+ * probably not really necessary, but there seems no reason not to take
+ * the lock to be sure.
+ */
+LockDatabaseObject(TypeRelationId, enumTypeOid, 0, ExclusiveLock);
+
+pg_enum = table_open(EnumRelationId, RowExclusiveLock);
+
+/* Get the list of existing members of the enum */
+list = SearchSysCacheList1(ENUMTYPOIDNAME,
+   ObjectIdGetDatum(enumTypeOid));
+nelems = list->n_members;
+
+/*
+ * Locate the element to drop.
+ */
+old_tup = NULL;
+for (i = 0; i < nelems; i++)
+{
+enum_tup = &(list->members[i]->tuple);
+en = (Form_pg_enum) GETSTRUCT(enum_tup);
+if (strcmp(NameStr(en->enumlabel), oldVal) == 0)
+{
+old_tup = enum_tup;
+}
+}
+
+ReleaseCatCacheList(list);
+
+if (!old_tup)
+ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("\"%s\" is not an existing enum label",
+   oldVal)));
+
+CatalogTupleDelete(pg_enum, &old_tup->t_self);
+table_close(pg_enum, RowExclusiveLock);
+}
+
 
 /*
  * Test if the given enum value is in the table of uncommitted enums.
Index: src/backend/commands/typecmds.c
IDEA a

Re: Out of memory error handling in frontend code

2023-09-28 Thread Daniel Gustafsson
> On 28 Sep 2023, at 10:14, Frédéric Yhuel  wrote:

> After some time, we understood that the 20 million large objects were 
> responsible for the huge memory usage (more than 10 GB) by pg_dump.

This sounds like a known issue [0] which has been reported several times, and
one we should get around to fixing sometime.

> I think a more useful error message would help for such cases.

Knowing that this is case that pops up, I agree that we could do better around
the messaging here.

> I haven't tried to get the patch ready for review; I know that the format of 
> the messages isn't right. I'd like to know what you think of the idea 
> first.

I don't think adding more details is a bad idea, but it shouldn't require any
knowledge about internals, so I think messages like the one below need to be
reworded to be more helpful.

+   if (loinfo == NULL)
+   {
+   pg_fatal("getLOs: out of memory");
+   }
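
For illustration, a minimal sketch of a more user-facing wording at that call
site (the exact message text is only a suggestion):

    if (loinfo == NULL)
        pg_fatal("out of memory while reading large object metadata; "
                 "the database contains a very large number of large objects");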

--
Daniel Gustafsson

[0] 
https://www.postgresql.org/message-id/7da8823d83a2b66bdd917aa6cb2c5c2619d86011.ca...@credativ.de





Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-28 Thread Ranier Vilela
On Wed, Sep 27, 2023 at 22:28, David Rowley wrote:

> On Thu, 28 Sept 2023 at 02:37, Ranier Vilela  wrote:
> >> Please check [1] for the mention of:
> >>
> >> "The fastest way to get your patch rejected is to make unrelated
> >> changes. Reformatting lines that haven't changed, changing unrelated
> >> comments you felt were poorly worded, touching code not necessary to
> >> your change, etc. Each patch should have the minimum set of changes
> >> required to work robustly. If you do not follow the code formatting
> >> suggestions above, expect your patch to be returned to you with the
> >> feedback of "follow the code conventions", quite likely without any
> >> other review."
> >
> > Forgive my impulsiveness, anyone who loves perfect, well written code,
> would understand.
>
> Perhaps, but the committers on this project seem to be against the
> blunderbuss approach to committing patches.  You might meet less
> resistance around here if you assume all of their weapons have a scope
> and that they like to aim for something before pulling the trigger.
>
Perhaps, and to use your own words, the leaders on this project seem
to be against reviewers armed with blunderbusses, too.


>
> Personally, this seems like a good idea to me and I'd like to follow
> it too.  If you'd like to hold out for a committer whose weapon of
> choice is the blunderbuss then I can back off and let you wait for one
> to come along. Just let me know.
>
Please, no, I love good combat too.
You are welcome in my threads.
But I hope that you will be strong like me, and won't wait for weak
comebacks when you find another strong elephant.


> > Do you have an objection to fixing the function
> find_base_rel_ignore_join?
> > Or is it included in unrelated changes?
>
> Well, the topic seems to be adding additional safety to prevent
> accessing negative entries for simple_rel_array.  I can't think why
> fixing the same theoretical hazard in find_base_rel_ignore_join()
> would be unrelated.

Good to know.


>   I hope you can see the difference here. Randomly
> adjusting function signatures because you happen to be changing some
> code within that function does not, in my book, seem related.

I confess that some "in passing" and "while there" changes, and some desire
to enrich the patch, clouded my judgment.

So it seems that we have some consensus. I propose version 2 of the patch,
which perhaps will still need corrections to the wording of the comments.

best regards,
Ranier Vilela


v2-0001-Avoid-possible-out-of-bounds-access.patch
Description: Binary data


Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-28 Thread vignesh C
On Wed, 27 Sept 2023 at 12:28, Amit Kapila  wrote:
>
> On Wed, Sep 27, 2023 at 6:58 AM Peter Smith  wrote:
> >
> > On Tue, Sep 26, 2023 at 11:57 PM vignesh C  wrote:
> > >
> > > On Tue, 26 Sept 2023 at 13:03, Peter Smith  wrote:
> > > >
> > > > Here are some comments for patch v2-0001.
> > > >
> > > > ==
> > > > src/backend/replication/logical/worker.c
> > > >
> > > > 1. maybe_reread_subscription
> > > >
> > > > ereport(LOG,
> > > > (errmsg("logical replication worker for subscription \"%s\"
> > > > will restart because of a parameter change",
> > > > MySubscription->name)));
> > > >
> > > > Is this really a "parameter change" though? It might be a stretch to
> > > > call the user role change a subscription parameter change. Perhaps
> > > > this case needs its own LOG message?
> > >
> > > When I was doing this change the same thought had come to my mind too
> > > but later I noticed that in case of owner change there was no separate
> > > log message. Since superuser check is somewhat similar to owner
> > > change, I felt no need to make any change for this.
> > >
> >
> > Yeah, I had seen the same already before my comment. Anyway, YMMV.
> >
>
> But OTOH, the owner of the subscription can be changed by the Alter
> Subscription command whereas superuser status can't be changed. I
> think we should consider changing the message for this case.

Modified

> BTW, do we want to backpatch this patch? I think we should backpatch to
> PG16 as it impacts password_required functionality. Before this patch
> even if the subscription owner's superuser status is lost, it won't
> use a password for connection till the server gets restarted or the
> apply worker gets restarted due to some other reason. What do you
> think?

I felt that since the password_required functionality is there in PG16, we
should fix this in PG16 too. I have checked that the password_required
functionality is not there in PG15, so no change is needed in PG15.

The updated patch has the changes for the same.

Regards,
Vignesh
From eceb5873ede3058e041c4f28be964ddae9cc1487 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Thu, 28 Sep 2023 14:46:56 +0530
Subject: [PATCH v4] Restart the apply worker if the subscription owner has
 changed from superuser to non-superuser.

Restart the apply worker if the subscription owner has changed from
superuser to non-superuser. This is required so that the subscription
connection string gets revalidated to identify cases where the
password option is not specified as part of the connection string for
non-superuser.
---
 src/backend/catalog/pg_subscription.c   |  3 +++
 src/backend/commands/subscriptioncmds.c |  4 +--
 src/backend/replication/logical/tablesync.c |  3 +--
 src/backend/replication/logical/worker.c| 30 ++---
 src/include/catalog/pg_subscription.h   |  1 +
 src/test/subscription/t/027_nosuperuser.pl  | 24 +
 6 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..bc74b695c6 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -108,6 +108,9 @@ GetSubscription(Oid subid, bool missing_ok)
    Anum_pg_subscription_suborigin);
 	sub->origin = TextDatumGetCString(datum);
 
+	/* Remember whether the subscription owner is a superuser */
+	sub->isownersuperuser = superuser_arg(sub->owner);
+
 	ReleaseSysCache(tup);
 
 	return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 6fe111e98d..ac57b150d7 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -869,7 +869,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 	load_file("libpqwalreceiver", false);
 
 	/* Try to connect to the publisher. */
-	must_use_password = !superuser_arg(sub->owner) && sub->passwordrequired;
+	must_use_password = !sub->isownersuperuser && sub->passwordrequired;
 	wrconn = walrcv_connect(sub->conninfo, true, must_use_password,
 			sub->name, &err);
 	if (!wrconn)
@@ -1249,7 +1249,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 			load_file("libpqwalreceiver", false);
 			/* Check the connection info string. */
 			walrcv_check_conninfo(stmt->conninfo,
-  sub->passwordrequired && !superuser_arg(sub->owner));
+  sub->passwordrequired && !sub->isownersuperuser);
 
 			values[Anum_pg_subscription_subconninfo - 1] =
 CStringGetTextDatum(stmt->conninfo);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 6d461654ab..8ce92c2919 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1265,9 +1265,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 
 	/* Is the use of a password mandatory? */
 	must_use_password = MySubscription->passwordrequired &&
-		!superuser

Is it worth adding Assert(false) for unknown paths in print_path()?

2023-09-28 Thread David Rowley
In [1] Andrey highlighted that I'd forgotten to add print_path()
handling for TidRangePaths in bb437f995.

I know the OPTIMIZER_DEBUG code isn't exactly well used.  I never
personally use it, and I work quite a bit in the planner.  However, if
we're keeping it, I thought we might get the memo about missing
paths a bit sooner if we add an Assert(false) in the default cases.

Is the attached worthwhile?
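
For concreteness, the shape of the change inside print_path()'s switch over
the path type would be roughly as below (an abbreviated sketch, not the
attached patch itself):

    switch (nodeTag(path))
    {
            /* ... existing cases ... */
        default:
            ptype = "???Path";
            Assert(false);      /* new: make unhandled path types fail
                                 * loudly in assert-enabled builds */
            break;
    }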

David

[1] 
https://www.postgresql.org/message-id/379082d6-1b6a-4cd6-9ecf-7157d8c08...@postgrespro.ru


assert_fail_unknown_paths_in_print_path.patch
Description: Binary data


wal recycling problem

2023-09-28 Thread Fabrice Chapuis
Hello,

I have a question about the automatic removal of unused WAL files. When
loading data with pg_restore (200Gb), we noticed that a lot of WAL files
are generated, and they are neither purged automatically nor recycled
despite frequent checkpoints; the pg_wal folder (150Gb) then fills up and
runs out of space.
We have a cluster of 2 members (1 primary and 1 standby) with Postgres
version 14.9 and 2 barman servers; slots are only configured for barman,
and barman is version 3.7.
The archive command is deactivated (archive_command=':').
I use pg_archivecleanup (passing the WAL file generated by the last
checkpoint as parameter) to remove files manually before the 150Gb limit
is reached, so that the restore can terminate.

Why does postgres not do this cleanup automatically, and which part of the
code is responsible for removing or recycling the WALs?

Thanks for your help

Fabrice


Re: remaining sql/json patches

2023-09-28 Thread Alvaro Herrera
On 2023-Sep-27, Amit Langote wrote:

> Maybe the following is better:
> 
> +   /*
> +* For expression nodes that support soft errors.  Should be set to NULL
> +* before calling ExecInitExprRec() if the caller wants errors thrown.
> +*/
> 
> ...as in the attached.

That's good.

> Alvaro, do you think your concern regarding escontext not being in the
> right spot in the ExprState struct is addressed?  It doesn't seem very
> critical to me to place it in the struct's 1st cacheline, because
> escontext is not accessed in performance critical paths such as during
> expression evaluation, especially with the latest version.  (It would
> get accessed during evaluation with previous versions.)
> 
> If so, I'd like to move ahead with committing it.

Yeah, looks OK to me in v21.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: trying again to get incremental backup

2023-09-28 Thread Jakub Wartak
On Wed, Aug 30, 2023 at 4:50 PM Robert Haas  wrote:
[..]

I've played a little bit more with this second batch of patches on
e8d74ad625f7344f6b715254d3869663c1569a51 @ 31Aug (days before the wait
events refactor):

test_across_wallevelminimal.sh
test_many_incrementals_dbcreate.sh
test_many_incrementals.sh
test_multixact.sh
test_pending_2pc.sh
test_reindex_and_vacuum_full.sh
test_truncaterollback.sh
test_unlogged_table.sh

All those basic tests had GOOD results. Please find them attached. I'll
try to schedule some more realistic (in terms of workload and sizes) tests
in a couple of days, plus maybe have some fun with cross-backup-and-restore
across standbys. As for the earlier doubt: the raw wal_level = minimal
situation shouldn't be a concern, if only because it requires
max_wal_senders==0, while pg_basebackup requires it above 0 (due to
"FATAL:  number of requested standby connections exceeds
max_wal_senders (currently 0)").

I also wanted to introduce corruption into the pg_walsummaries files, but
later saw in the code that they are already covered by CRC32, cool.

In v07:
> +#define MINIMUM_VERSION_FOR_WAL_SUMMARIES 16

17 ?

> A related design question is whether we should really be sending the
> whole backup manifest to the server at all. If it turns out that we
> don't really need anything except for the LSN of the previous backup,
> we could send that one piece of information instead of everything. On
> the other hand, if we need the list of files from the previous backup,
> then sending the whole manifest makes sense.

If that is still an area open for discussion: wouldn't it be better to
just specify an LSN, as it would allow resyncing a standby across a major
lag where the WAL to replay would be enormous? Given that we had
primary->standby where the standby would be stuck on some LSN, right now
it would be:
1) calculate the backup manifest of the desynced 10TB standby (how? using
which tool?) - even if possible, that means reading 10TB of data
instead of just supplying a number, doesn't it?
2) backup primary with such incremental backup >= LSN
3) copy the incremental backup to standby
4) apply it to the impaired standby
5) restart the WAL replay

> - We only know how to operate on directories, not tar files. I thought
> about that when working on pg_verifybackup as well, but I didn't do
> anything about it. It would be nice to go back and make that tool work
> on tar-format backups, and this one, too. I don't think there would be
> a whole lot of point trying to operate on compressed tar files because
> you need random access and that seems hard on a compressed file, but
> on uncompressed files it seems at least theoretically doable. I'm not
> sure whether anyone would care that much about this, though, even
> though it does sound pretty cool.

Also, maybe it's too early to ask, but wouldn't it be nice if we could
have a future option in pg_combinebackup to avoid double writes when
used from restore hosts (right now we first need to reconstruct the
original datadir from the full and incremental backups on the host
hosting the backups, and then TRANSFER it again to the target host)?
So something like this could work well from the restore host:
pg_combinebackup /tmp/backup1 /tmp/incbackup2 /tmp/incbackup3 -O tar -o - |
ssh dbserver 'tar xvf - -C /path/to/restored/cluster' . The bad thing is
that such a pipe prevents parallelism from day 1, and I'm afraid I do
not have a better easy idea on how to have both at the same time in
the long term.

-J.


incrbackuptests-0.1.tgz
Description: application/compressed


Re: Latches vs lwlock contention

2023-09-28 Thread Heikki Linnakangas

On 28/10/2022 06:56, Thomas Munro wrote:

One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE
t; ... and then N other sessions wait in SELECT * FROM t;, and then
you run ... COMMIT;, you'll see the first session wake all the others
while it still holds the partition lock itself.  They'll all wake up
and begin to re-acquire the same partition lock in exclusive mode,
immediately go back to sleep on *that* wait list, and then wake each
other up one at a time in a chain.  We could avoid the first
double-bounce by not setting the latches until after we've released
the partition lock.  We could avoid the rest of them by not
re-acquiring the partition lock at all, which ... if I'm reading right
... shouldn't actually be necessary in modern PostgreSQL?  Or if there
is another reason to re-acquire then maybe the comment should be
updated.


ISTM that the change to not re-acquire the lock in ProcSleep is 
independent of the other changes. Let's split that off to a separate 
patch.


I agree it should be safe. Acquiring a lock just to hold off interrupts 
is overkill anyway; HOLD_INTERRUPTS() would be enough. 
LockErrorCleanup() uses HOLD_INTERRUPTS() already.


There are no CHECK_FOR_INTERRUPTS() in GrantAwaitedLock(), so cancel/die 
interrupts can't happen here. But could we add HOLD_INTERRUPTS(), just 
pro forma, to document the assumption? It's a little awkward: you really 
should hold interrupts until the caller has done "awaitedLock = NULL;". 
So it's not quite enough to add a pair of HOLD_ and RESUME_INTERRUPTS() 
at the end of ProcSleep(). You'd need to do the HOLD_INTERRUPTS() in 
ProcSleep() and require the caller to do RESUME_INTERRUPTS(). In a 
sense, ProcSleep downgrades the lock on the partition to just holding 
off interrupts.
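
To make the proposed convention concrete, a rough sketch (the exact
placement, and WaitOnLock() as the caller, are my assumptions rather
than a finished patch):

    /* In ProcSleep(), where the partition lock would have been re-acquired: */
    HOLD_INTERRUPTS();       /* takes over the lock's interrupt-holdoff role */
    ...
    return myWaitStatus;     /* interrupts still held off on return */

    /* In the caller, after ProcSleep() returns: */
    awaitedLock = NULL;
    RESUME_INTERRUPTS();     /* only now allow cancel/die to be serviced */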


Overall +1 on this change to not re-acquire the partition lock.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: [PATCH] Add inline comments to the pg_hba_file_rules view

2023-09-28 Thread Jim Jones

Hi Daniel

On 27.09.23 10:21, Daniel Gustafsson wrote:
> An annotation syntax specifically for this would address my concern, but the
> argument that pg_hba (and related code) is border-line too complicated as it
> is does hold some water.  Complexity in code can lead to bugs, but complexity
> in syntax can lead to misconfigurations or unintentional infosec leaks which
> is usually more problematic.

Yeah, that's why the possibility of using the normal comments for this 
feature seemed so appealing at first :)

> I would propose to not worry about code and instead just discuss a potential
> new format for annotations, and only implement parsing and handling once
> something has been agreed upon.  This should be in a new thread however to
> ensure visibility, since it's beyond the subject of this thread.


Sounds good! I will open a new thread as soon as I get back home, so 
that we can collect some ideas.


Thanks

Jim





Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Bharath Rupireddy
On Mon, Sep 25, 2023 at 4:31 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> > 4.
> > +/*
> > + * There is a possibility that following records may be generated
> > + * during the upgrade.
> > + */
> > +is_valid = is_xlog_record_type(rmid, info, RM_XLOG_ID,
> > XLOG_CHECKPOINT_SHUTDOWN) ||
> > +is_xlog_record_type(rmid, info, RM_XLOG_ID,
> > XLOG_CHECKPOINT_ONLINE) ||
> > +is_xlog_record_type(rmid, info, RM_XLOG_ID, XLOG_SWITCH) ||
> > +is_xlog_record_type(rmid, info, RM_XLOG_ID,
> > XLOG_FPI_FOR_HINT) ||
> > +is_xlog_record_type(rmid, info, RM_XLOG_ID,
> > XLOG_PARAMETER_CHANGE) ||
> > +is_xlog_record_type(rmid, info, RM_STANDBY_ID,
> > XLOG_RUNNING_XACTS) ||
> > +is_xlog_record_type(rmid, info, RM_HEAP2_ID,
> > XLOG_HEAP2_PRUNE);
> >
> > What if we missed to capture the WAL records that may be generated
> > during upgrade?
>
> If such records are generated before calling
> binary_upgrade_validate_wal_logical_end(), the upgrade would fail.
> Otherwise it would succeed. Anyway, we don't care about such records
> because those aren't required to be replicated. The main thing we want
> to detect is that we don't miss any record generated before server
> shutdown.

I read this 
https://www.postgresql.org/message-id/20230725170319.h423jbthfohwg...@awork3.anarazel.de
and understand that the current patch implements the approach
suggested there - "scan the end of the WAL for records that should
have been streamed out". I think the WAL records that should have been
streamed out are all WAL record types in the *_decode functions except
the ones that have a no-op or an op unrelated to logical decoding. For
instance,
- for xlog_decode, if the records of type {XLOG_CHECKPOINT_ONLINE,
XLOG_PARAMETER_CHANGE, XLOG_NOOP, XLOG_NEXTOID, XLOG_SWITCH,
XLOG_BACKUP_END, XLOG_RESTORE_POINT, XLOG_FPW_CHANGE,
XLOG_FPI_FOR_HINT, XLOG_FPI, XLOG_OVERWRITE_CONTRECORD} are found
after confirmed_flush LSN, it is fine.
- for xact_decode, if the records of type {XLOG_XACT_ASSIGNMENT} are
found after confirmed_flush LSN, it is fine.
- for standby_decode, if the records of type {XLOG_STANDBY_LOCK,
XLOG_INVALIDATIONS} are found after confirmed_flush LSN, it is fine.
- for heap2_decode, if the records of type {XLOG_HEAP2_REWRITE,
XLOG_HEAP2_FREEZE_PAGE, XLOG_HEAP2_PRUNE, XLOG_HEAP2_VACUUM,
XLOG_HEAP2_VISIBLE, XLOG_HEAP2_LOCK_UPDATED} are found after
confirmed_flush LSN, it is fine.
- for heap_decode, if the records of type {XLOG_HEAP_LOCK} are found
after confirmed_flush LSN, it is fine.

I think all of the above WAL records are okay to be present after
confirmed_flush LSN. If any WAL records other than the above are found
after confirmed_flush LSN, those are the one that should have been
streamed out and the pg_upgrade must complain with "The slot "foo" has
not consumed the WAL yet" for all such slots, right? But, the function
binary_upgrade_validate_wal_logical_end checks for only a handful of
the above record types. I know that the list is arrived at based on
testing, but it may happen that any of the above WAL records may be
generated and present before/during/after pg_upgrade for which
pg_upgrade failure isn't wanted.

Perhaps a function in logical/decode.c could report a WAL record as valid
if its record type is any of the above. A note in replication/decode.h
and/or access/rmgrlist.h asking rmgr authors to categorize new WAL record
types in that function based on their decoding operation might help with
future WAL record type additions.
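
Sketched for the xlog rmgr only, it could look something like the
following; the function name and shape are hypothetical, and info is
assumed to be already masked by the caller as in the current patch:

    /*
     * Return true if a record of this type appearing after confirmed_flush
     * needs no streaming to subscribers (a no-op for logical decoding).
     */
    static bool
    is_record_skippable_for_decoding(RmgrId rmid, uint8 info)
    {
        if (rmid == RM_XLOG_ID)
        {
            switch (info)
            {
                case XLOG_CHECKPOINT_SHUTDOWN:
                case XLOG_CHECKPOINT_ONLINE:
                case XLOG_PARAMETER_CHANGE:
                case XLOG_NOOP:
                case XLOG_SWITCH:
                case XLOG_FPI_FOR_HINT:
                    return true;
            }
        }
        /* ... similar arms for RM_XACT_ID, RM_STANDBY_ID, RM_HEAP2_ID ... */
        return false;
    }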

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Questions about the new subscription parameter: password_required

2023-09-28 Thread Benoit Lobréau

On 9/26/23 19:00, Jeff Davis wrote:

   +   If the ownership of a subscription with
password_required=true
   +   is transferred to a non-superuser, they will gain full control
over the subscription
   +   but will not be able to modify its connection string.

I think you mean false, right?


No, but I was wrong. At the beginning of the thread, I was surprised it 
was even possible to change the ownership to a non-superuser, because it 
shouldn't work, and commands like ENABLE didn't complain in the terminal.
Then Robert Haas explained to me that it's ok because the superuser can 
do whatever he wants. I came back to it later and somehow convinced 
myself it was working. Sorry.



   +   If the ownership of a subscription with
password_required=true
   +   has been transferred to a non-superuser, it must be reverted to a
superuser for
   +   the DROP operation to succeed.

That's only needed if the superuser transfers a subscription with
password_required=true to a non-superuser and the connection string
does not contain a password. In that case, the subscription is already
in a failing state, not just for DROP. Ideally we'd have some other
warning in the docs not to do that -- maybe in CREATE and ALTER.


Yes, I forgot the connection string bit.


Also, if the subscription is in that kind of failing state, there are
other ways to get out of it as well, like disabling it and setting
connection=none, then dropping it.

The code for DropSubscription in
src/backend/commands/subscriptioncmds.c tries to connect before testing
whether the slot is NONE / NULL. So it doesn't work to DISABLE the
subscription and set the slot to NONE.

Robert Haas proposed something in the following message but I am a 
little out of my depth here ...


https://www.postgresql.org/message-id/af9435ae-18df-6a9e-2374-2de23009518c%40dalibo.com


The whole thing is fairly delicate. As soon as you work slightly
outside of the intended use, password_required starts causing
unexpected things to happen.

As I said earlier, I think the best thing to do is to just have a
section that describes when to use password_required, what specific
things you should do to satisfy that case, and what caveats you should
avoid. Something like:

   "If you want to have a subscription using a connection string without
a password managed by a non-superuser, then: [ insert SQL steps here ].
Warning: if the connection string doesn't contain a password, make sure
to set password_required=false before transferring ownership, otherwise
it will start failing."


Ok, I will do it that way. Would you prefer this section to be in the 
ALTER SUBSCRIPTION or the CREATE SUBSCRIPTION doc?


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Amit Kapila
On Thu, Sep 28, 2023 at 2:22 PM Bharath Rupireddy
 wrote:
>
> On Fri, Sep 22, 2023 at 9:40 AM Michael Paquier  wrote:
> >
> > On Thu, Sep 21, 2023 at 01:50:28PM +0530, Amit Kapila wrote:
> > > We have discussed this point. Normally, we don't have such options in
> > > upgrade, so we were hesitant to add a new one for this but there is a
> > > discussion to add an --exclude-logical-slots option. We are planning
> > > to add that as a separate patch after getting some more consensus on
> > > it. Right now, the idea is to get the main patch ready.
> >
> > Okay.  I am wondering if the subscriber part is OK now without an
> > option, but that could also be considered separately, as well.  At
> > least I hope so.
>
> +1 for an option to skip upgrade logical replication slots for the
> following reasons:
> - one may not want the logical replication slots on the upgraded
> instance immediately - unless the upgraded instance is tested and
> determined to be performant.
> - one may not want the logical replication slots on the upgraded
> instance immediately - no logical replication setup is wanted on the
> new instance perhaps because of an architectural/organizational
> decision.
> - one may take a backup of the postgres instance with logical
> replication slots using any of the file system/snapshot based backup
> mechanisms (not pg_basebackup), essentially getting the on-disk
> replication slots data as well; the pg_upgrade may fail on the
> backed-up instance.
>
> I agree to have it as a 0002 patch once the design and things are
> finalized for the main patch.
>

Thanks for understanding that it can be done as a 0002 patch, because
we don't have an agreement on this. Jonathan feels exactly the
opposite about having an option that doesn't migrate slots by default,
as users would then always need to use the option, and they may want
slots migrated by default. So, we may consider having an --exclude-*
option.

-- 
With Regards,
Amit Kapila.




Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()

2023-09-28 Thread Jakub Wartak
On Thu, Sep 28, 2023 at 12:53 AM Michael Paquier  wrote:
>
> On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote:
> > I don't think going for size_t is a viable path for fixing this. I'm pretty
> > sure the initial patch would trigger a type mismatch from guc_tables.c - we
> > don't have infrastructure for size_t GUCs.
>
> Nothing marked as PGDLLIMPORT uses size_t in the tree currently, FWIW.
>
> > Perhaps we ought to error out (in BackendStatusShmemSize() or such) if
> > pgstat_track_activity_query_size * MaxBackends >= 4GB?
>
> Yeah, agreed that putting a check like that could catch errors more
> quickly.

Hi,

v3 attached. I had a problem coming up with a better error message,
so suggestions are welcome.  The cast still needs to be present as per
the above suggestion, as 3GB is still a valid buffer size and was still
causing integer overflow. With v3 we just throw an error on >= 4GB.
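
To illustrate the guard being discussed (the message wording, and
BackendStatusShmemSize() as the location, are assumptions on my side):

    uint64      sz;

    sz = (uint64) pgstat_track_activity_query_size * (uint64) MaxBackends;
    if (sz >= UINT64CONST(4) * 1024 * 1024 * 1024)
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("track_activity_query_size times max_connections exceeds 4GB")));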

-J.


v3-0001-Introduce-memory-limit-for-BackendActivityBuffer-.patch
Description: Binary data


Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Amit Kapila
On Thu, Sep 28, 2023 at 1:24 PM Bharath Rupireddy
 wrote:
>
> On Thu, Sep 28, 2023 at 1:06 PM Amit Kapila  wrote:
> >
> > On Thu, Sep 28, 2023 at 10:44 AM Bharath Rupireddy
> >  wrote:
> > >
> > > > No, without that commit, there is a very high possibility that even if
> > > > we have sent the WAL to the subscriber and got the acknowledgment of
> > > > the same, we would miss updating it before shutdown. This would lead
> > > > to upgrade failures because upgrades have no way to later identify
> > > > whether the remaining WAL records are sent to the subscriber.
> > >
> > > Thanks for clarifying. I'm trying understand what happens without
> > > commit e0b2eed0 with an illustration:
> > >
> > > step 1: publisher - confirmed_flush LSN  in replication slot on disk
> > > structure is 80
> > > step 2: publisher - sends WAL at LSN 100
> > > step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 
> > > 100
> > > step 4: publisher - shuts down without writing the new confirmed_flush
> > > LSN as 100 to disk, note that commit e0b2eed0 is not in place
> > > step 5: publisher - restarts
> > > step 6: subscriber - upon publisher restart, the subscriber requests
> > > WAL from publisher from LSN 100 as it tracks the last applied LSN in
> > > replication origin
> > >
> > > Now, if the pg_upgrade with the patch in this thread is run on
> > > publisher after step 4, it complains with "The slot \"%s\" has not
> > > consumed the WAL yet".
> > >
> > > Is my above understanding right?
> > >
> >
> > Yes.
>
> Thanks. Trying things with replication lag - when there's a lag, the
> pg_upgrade can't proceed further and it complains "The slot "mysub"
> has not consumed the WAL yet".
>
> I think the best way to upgrade a postgres instance with logical
> replication slots is: 1) ensure no replication lag for the logical
> slots; 2) perform pg_upgrade --check first; 3) perform pg_upgrade if
> there are no complaints.
>
> With the above understanding, it looks to me that the commit e0b2eed0
> isn't necessary for back branches. Because, without it the pg_upgrade
> complains "The slot "mysub" has not consumed the WAL yet", and then
> the user has to restart the instance to ensure the WAL is consumed
> (IOW, to get the correct confirmed_flush LSN to the disk).
>

The point is that it will be difficult for users to ensure that all the
WAL is consumed: it may have already been sent even after a restart and
shutdown, and yet the check will still fail. I think the argument to
support upgrades from branches where we don't have commit e0b2eed0 has
some merit, and we can change the checks if there is broader agreement
on it. Let's try to agree on whether the core patch is good as is,
especially on what we want to achieve via validate_wal_records. Once we
agree on the main patch and commit it, the other work, including
considering an option to upgrade slots, can be done as top-up
patches.

-- 
With Regards,
Amit Kapila.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Bharath Rupireddy
On Fri, Sep 22, 2023 at 9:40 AM Michael Paquier  wrote:
>
> On Thu, Sep 21, 2023 at 01:50:28PM +0530, Amit Kapila wrote:
> > We have discussed this point. Normally, we don't have such options in
> > upgrade, so we were hesitent to add a new one for this but there is a
> > discussion to add an --exclude-logical-slots option. We are planning
> > to add that as a separate patch after getting some more consensus on
> > it. Right now, the idea is to get the main patch ready.
>
> Okay.  I am wondering if the subscriber part is OK now without an
> option, but that could also be considered separately, as well.  At
> least I hope so.

+1 for an option to skip upgrade logical replication slots for the
following reasons:
- one may not want the logical replication slots on the upgraded
instance immediately - unless the upgraded instance is tested and
determined to be performant.
- one may not want the logical replication slots on the upgraded
instance immediately - no logical replication setup is wanted on the
new instance perhaps because of an architectural/organizational
decision.
- one may take a backup of the postgres instance with logical
replication slots using any of the file system/snapshot based backup
mechanisms (not pg_basebackup), essentially getting the on-disk
replication slots data as well; the pg_upgrade may fail on the
backed-up instance.

I agree to have it as a 0002 patch once the design and things are
finalized for the main patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Out of memory error handling in frontend code

2023-09-28 Thread Frédéric Yhuel

Hello,

One of our customers recently complained that his pg_dump stopped 
abruptly with the message "out of memory".


After some time, we understood that the 20 million large objects were 
responsible for the huge memory usage (more than 10 GB) by pg_dump.


I think a more useful error message would help in such cases. Indeed, 
it's not always possible to ask the client to run pg_dump with 
"valgrind --tool=massif" on their server.


Now, I understand that we don't want to add too much to the frontend 
code, it would be a lot of pain for not much gain.


But I wonder if we could add some checks in a few strategic places, as 
in the attached patch.


The idea would be to print a more useful error message in most of the 
cases, and keep the basic "out of memory" for the remaining ones.


I haven't tried to get the patch ready for review; I know that the format 
of the messages isn't right. I'd like to know what you think of the 
idea first.


Best regards,
Frédéric

From 81aa4ae59778f1193d6e1a8c81931502c941e997 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= 
Date: Mon, 18 Sep 2023 08:18:19 +0200
Subject: [PATCH] pg_dump: fix OOM handling

Exit with better error messages when there's not enough memory
to process large objects.
---
 src/bin/pg_dump/pg_backup_archiver.c | 10 +++---
 src/bin/pg_dump/pg_backup_archiver.h |  6 --
 src/bin/pg_dump/pg_dump.c| 13 +++--
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 39ebcfec32..8370e26075 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -1112,13 +1112,17 @@ WriteData(Archive *AHX, const void *data, size_t dLen)
 
 /* Public */
 TocEntry *
-ArchiveEntry(Archive *AHX, CatalogId catalogId, DumpId dumpId,
-			 ArchiveOpts *opts)
+ArchiveEntry2(Archive *AHX, CatalogId catalogId, DumpId dumpId,
+			 ArchiveOpts *opts, const char* caller)
 {
 	ArchiveHandle *AH = (ArchiveHandle *) AHX;
 	TocEntry   *newToc;
 
-	newToc = (TocEntry *) pg_malloc0(sizeof(TocEntry));
+	newToc = (TocEntry *) pg_malloc_extended(sizeof(TocEntry), MCXT_ALLOC_NO_OOM|MCXT_ALLOC_ZERO);
+	if (newToc == NULL)
+	{
+		pg_fatal("%s: could not add a new archive entry: %s", caller, strerror(errno));
+	}
 
 	AH->tocCount++;
 	if (dumpId > AH->maxDumpId)
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 18b38c17ab..60872744a6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -400,8 +400,10 @@ typedef struct _archiveOpts
 } ArchiveOpts;
 #define ARCHIVE_OPTS(...) &(ArchiveOpts){__VA_ARGS__}
 /* Called to add a TOC entry */
-extern TocEntry *ArchiveEntry(Archive *AHX, CatalogId catalogId,
-			  DumpId dumpId, ArchiveOpts *opts);
+extern TocEntry *ArchiveEntry2(Archive *AHX, CatalogId catalogId,
+	  DumpId dumpId, ArchiveOpts *opts, const char* caller);
+
+#define ArchiveEntry(a, b, c, d) ArchiveEntry2(a, b, c, d, __func__)
 
 extern void WriteHead(ArchiveHandle *AH);
 extern void ReadHead(ArchiveHandle *AH);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 7977d6a9c0..8413d4f115 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3566,7 +3566,11 @@ getLOs(Archive *fout)
 	/*
 	 * Each large object has its own "BLOB" archive entry.
 	 */
-	loinfo = (LoInfo *) pg_malloc(ntups * sizeof(LoInfo));
+	loinfo = (LoInfo *) pg_malloc_extended(ntups * sizeof(LoInfo), MCXT_ALLOC_NO_OOM);
+	if (loinfo == NULL)
+	{
+		pg_fatal("getLOs: out of memory");
+	}
 
 	for (i = 0; i < ntups; i++)
 	{
@@ -3606,7 +3610,12 @@ getLOs(Archive *fout)
 	 */
 	if (ntups > 0)
 	{
-		lodata = (DumpableObject *) pg_malloc(sizeof(DumpableObject));
+		lodata = (DumpableObject *) pg_malloc_extended(sizeof(DumpableObject), MCXT_ALLOC_NO_OOM);
+		if (lodata == NULL)
+		{
+			pg_fatal("getLOs: out of memory");
+		}
+
 		lodata->objType = DO_LARGE_OBJECT_DATA;
 		lodata->catId = nilCatalogId;
 		AssignDumpId(lodata);
-- 
2.39.2



Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Bharath Rupireddy
On Thu, Sep 28, 2023 at 1:06 PM Amit Kapila  wrote:
>
> On Thu, Sep 28, 2023 at 10:44 AM Bharath Rupireddy
>  wrote:
> >
> > > No, without that commit, there is a very high possibility that even if
> > > we have sent the WAL to the subscriber and got the acknowledgment of
> > > the same, we would miss updating it before shutdown. This would lead
> > > to upgrade failures because upgrades have no way to later identify
> > > whether the remaining WAL records are sent to the subscriber.
> >
> > Thanks for clarifying. I'm trying understand what happens without
> > commit e0b2eed0 with an illustration:
> >
> > step 1: publisher - confirmed_flush LSN  in replication slot on disk
> > structure is 80
> > step 2: publisher - sends WAL at LSN 100
> > step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 
> > 100
> > step 4: publisher - shuts down without writing the new confirmed_flush
> > LSN as 100 to disk, note that commit e0b2eed0 is not in place
> > step 5: publisher - restarts
> > step 6: subscriber - upon publisher restart, the subscriber requests
> > WAL from publisher from LSN 100 as it tracks the last applied LSN in
> > replication origin
> >
> > Now, if the pg_upgrade with the patch in this thread is run on
> > publisher after step 4, it complains with "The slot \"%s\" has not
> > consumed the WAL yet".
> >
> > Is my above understanding right?
> >
>
> Yes.

Thanks. Trying things with replication lag - when there's a lag, the
pg_upgrade can't proceed further and it complains "The slot "mysub"
has not consumed the WAL yet".

I think the best way to upgrade a postgres instance with logical
replication slots is: 1) ensure no replication lag for the logical
slots; 2) perform pg_upgrade --check first; 3) perform pg_upgrade if
there are no complaints.

With the above understanding, it looks to me that the commit e0b2eed0
isn't necessary for back branches. Because, without it the pg_upgrade
complains "The slot "mysub" has not consumed the WAL yet", and then
the user has to restart the instance to ensure the WAL is consumed
(IOW, to get the correct confirmed_flush LSN to the disk).

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




initdb's -c option behaves wrong way?

2023-09-28 Thread Kyotaro Horiguchi
Hello.

I noticed that the -c option of initdb behaves in an unexpected
manner. Identical variable names with variations in letter casing are
treated as distinct variables.

$ initdb -cwork_mem=100 -cWORK_MEM=1000 -cWork_mem=2000
...
$ grep -i 'work_mem ' $PGDATA/postgresql.conf
work_mem = 100  # min 64kB
#maintenance_work_mem = 64MB# min 1MB
#autovacuum_work_mem = -1   # min 1MB, or -1 to use 
maintenance_work_mem
#logical_decoding_work_mem = 64MB   # min 64kB
WORK_MEM = 1000
Work_mem = 2000


The original intention was apparently to overwrite the existing
line. Furthermore, I surmise that preserving the original letter
casing is preferable.

Attached is a patch to address this issue.  To retrieve the variable
name from the existing line, the code is slightly restructured.
Alternatively, should we just down-case the provided variable names?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 0c6f5ceb0a..04419840f4 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -470,20 +470,14 @@ replace_guc_value(char **lines, const char *guc_name, const char *guc_value,
 	int			namelen = strlen(guc_name);
 	PQExpBuffer newline = createPQExpBuffer();
 	int			i;
+	const char *where;
+	const char *pname;
 
-	/* prepare the replacement line, except for possible comment and newline */
 	if (mark_as_comment)
 		appendPQExpBufferChar(newline, '#');
-	appendPQExpBuffer(newline, "%s = ", guc_name);
-	if (guc_value_requires_quotes(guc_value))
-		appendPQExpBuffer(newline, "'%s'", escape_quotes(guc_value));
-	else
-		appendPQExpBufferStr(newline, guc_value);
 
 	for (i = 0; lines[i]; i++)
 	{
-		const char *where;
-
 		/*
 		 * Look for a line assigning to guc_name.  Typically it will be
 		 * preceded by '#', but that might not be the case if a -c switch
@@ -493,15 +487,32 @@ replace_guc_value(char **lines, const char *guc_name, const char *guc_value,
 		where = lines[i];
 		while (*where == '#' || isspace((unsigned char) *where))
 			where++;
-		if (strncmp(where, guc_name, namelen) != 0)
+		if (strncasecmp(where, guc_name, namelen) != 0)
 			continue;
+
+		pname = where;
 		where += namelen;
 		while (isspace((unsigned char) *where))
 			where++;
-		if (*where != '=')
-			continue;
 
-		/* found it -- append the original comment if any */
+		/* assume there's only one match */
+		if (*where == '=')
+			break;
+	}
+
+	if (lines[i])
+	{
+		/* found it, rewrite the line preserving the original comment if any */
+		appendPQExpBuffer(newline, "%.*s = ", namelen, pname);
+		if (guc_value_requires_quotes(guc_value))
+			appendPQExpBuffer(newline, "'%s'", escape_quotes(guc_value));
+		else
+			appendPQExpBufferStr(newline, guc_value);
+
+		/*
+		 * completed body of line, now continue with potential indentation and
+		 * comment
+		 */
 		where = strrchr(where, '#');
 		if (where)
 		{
@@ -548,16 +559,19 @@ replace_guc_value(char **lines, const char *guc_name, const char *guc_value,
 
 		free(lines[i]);
 		lines[i] = newline->data;
-
-		break;	/* assume there's only one match */
 	}
-
-	if (lines[i] == NULL)
+	else
 	{
 		/*
 		 * No match, so append a new entry.  (We rely on the bootstrap server
 		 * to complain if it's not a valid GUC name.)
 		 */
+		appendPQExpBuffer(newline, "%s = ", guc_name);
+		if (guc_value_requires_quotes(guc_value))
+			appendPQExpBuffer(newline, "'%s'", escape_quotes(guc_value));
+		else
+			appendPQExpBufferStr(newline, guc_value);
+
 		appendPQExpBufferChar(newline, '\n');
 		lines = pg_realloc_array(lines, char *, i + 2);
 		lines[i++] = newline->data;


Re: Set enable_seqscan doesn't take effect?

2023-09-28 Thread jacktby jacktby


> On 28 Sep 2023, at 12:26, David G. Johnston  wrote:
> 
> On Wednesday, September 27, 2023, jacktby jacktby  wrote:
>> postgres=# SET enable_seqscan = off;
>> SET
>> postgres=# explain select * from t;
>>QUERY PLAN
>> ------------------------------------------------------------------
>>  Seq Scan on t  (cost=10000000000.00..10000000023.60 rows=1360 width=32)
> 
> It wouldn’t cost 10 billion to return the first tuple if that setting wasn’t 
> working.
> 
> That is the “discouragement” the documentation is referring to.
> 
> I do agree the wording in the docs could be improved since it is a bit 
> self-contradictory and unspecific, but it is explicitly clear that a plan with 
> a sequential scan can still be chosen even with this set to off.
> 
> David J.
> 
Yes, I think that’s it. Thanks.
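
(For the archives: the "discouragement" comes from cost_seqscan() in
src/backend/optimizer/path/costsize.c, which still builds the path but
starts it out with a prohibitively high cost, roughly:

    if (!enable_seqscan)
        startup_cost += disable_cost;   /* disable_cost is 1.0e10 */

which is where the ~10-billion cost in the EXPLAIN output above comes from.)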

Re: Set enable_seqscan doesn't take effect?

2023-09-28 Thread jacktby jacktby


> On 28 Sep 2023, at 01:07, Andres Freund  wrote:
> 
> Hi,
> 
> On 2023-09-28 00:37:41 +0800, jacktby jacktby wrote:
>> postgres=# SET enable_seqscan = off;
>> SET
>> postgres=# explain select * from t;
>>  QUERY PLAN
>> ------------------------------------------------------------------
>> Seq Scan on t  (cost=10000000000.00..10000000023.60 rows=1360 width=32)
>> (1 row)
>> 
>> postgres=#  select * from t;
>>  a   
>> ---
>> [1,2]
>> (1 row)
> 
> Sorry to be the grump here:
> 
> You start several threads a week, often clearly not having done much, if any,
> prior research. Often even sending the same question to multiple lists. It
> should not be hard to find an explanation for the behaviour you see here.
> 
> pgsql-hackers isn't a "do my work for me service". We're hacking on
> postgres. It's fine to occasionally ask for direction, but you're very clearly
> exceeding that.
> 
> Greetings,
> 
> Andres Freund
I’m so sorry for that. I think I’m not very familiar with pg, so I ask many 
naive questions. And I apologize for my behavior.



Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-28 Thread Amit Kapila
On Thu, Sep 28, 2023 at 10:44 AM Bharath Rupireddy
 wrote:
>
> On Mon, Sep 25, 2023 at 2:06 PM Amit Kapila  wrote:
> >
> > > > [1] 
> > > > https://www.postgresql.org/message-id/CAA4eK1%2BLtWDKXvxS7gnJ562VX%2Bs3C6%2B0uQWamqu%3DUuD8hMfORg%40mail.gmail.com
> > >
> > > I see. IIUC, without that commit e0b2eed [1], it may happen that the
> > > slot's on-disk confirmed_flush LSN value can be higher than the WAL
> > > LSN that's flushed to disk, no?
> > >
> >
> > No, without that commit, there is a very high possibility that even if
> > we have sent the WAL to the subscriber and got the acknowledgment of
> > the same, we would miss updating it before shutdown. This would lead
> > to upgrade failures because upgrades have no way to later identify
> > whether the remaining WAL records are sent to the subscriber.
>
> Thanks for clarifying. I'm trying understand what happens without
> commit e0b2eed0 with an illustration:
>
> step 1: publisher - confirmed_flush LSN  in replication slot on disk
> structure is 80
> step 2: publisher - sends WAL at LSN 100
> step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 100
> step 4: publisher - shuts down without writing the new confirmed_flush
> LSN as 100 to disk, note that commit e0b2eed0 is not in place
> step 5: publisher - restarts
> step 6: subscriber - upon publisher restart, the subscriber requests
> WAL from publisher from LSN 100 as it tracks the last applied LSN in
> replication origin
>
> Now, if the pg_upgrade with the patch in this thread is run on
> publisher after step 4, it complains with "The slot \"%s\" has not
> consumed the WAL yet".
>
> Is my above understanding right?
>

Yes.


-- 
With Regards,
Amit Kapila.