Re: Do we want a hashset type?
On Mon, Jun 19, 2023, at 02:00, jian he wrote: > select hashset_contains('{1,2}'::int4hashset,NULL::int); > should return null? Hmm, that's a good philosophical question. I notice Tomas Vondra in the initial commit opted for allowing NULL inputs, treating them as empty sets; e.g. in int4hashset_add() we create a new hashset if the first argument is NULL. I guess the easiest and perhaps most consistent NULL-handling strategy would be to just mark all relevant functions STRICT, except for the agg ones, since we probably want to allow skipping over rows with NULL values without the entire result becoming NULL. But if we're not simply going the STRICT route, then I think it's a bit more tricky, since you could argue the hashset_contains() example should return FALSE since the set doesn't contain the NULL value; but OTOH, since we don't store NULL values, we don't know if it has ever been added, hence a NULL result would perhaps make more sense. I lean towards thinking that if we want to be "NULL-friendly", like we currently are in hashset_add(), it would probably be most user-friendly to be consistent and let all functions return non-NULL values in all cases where that is not unreasonable. Since we're essentially designing a set-theoretic system, I think we should aim for its logical "soundness" property and think about how we can verify that it holds. Thoughts? /Joel
Re: Synchronizing slots from primary to standby
Hi, On 6/16/23 11:56 AM, Amit Kapila wrote: On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand wrote: Please find attached V5 (a rebase of V4 posted up-thread). In addition to the "rebasing" work, the TAP test adds a test about conflict handling (logical slot invalidation) relying on the work done in the "Minimal logical decoding on standby" patch series. I did not look more at the patch (than what was needed for the rebase) but plan to do so. Are you still planning to continue working on this? Yes, I think it would be great to have such a feature in core. Some miscellaneous comments while going through this patch are as follows. Thanks! I'll look at them and will try to come back to you by the middle of next week. Also I think we need to handle the case of invalidated replication slot(s): should we drop/recreate it/them? (as the main goal is to have sync slot(s) on the standby). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: [BUG] recovery of prepared transactions during promotion can fail
At Mon, 19 Jun 2023 14:24:44 +0900, Michael Paquier wrote in > On Fri, Jun 16, 2023 at 04:27:40PM +0200, Julian Markwort wrote: > > Note that it is important that the PREPARE entry is in the WAL file > > that PostgreSQL is writing to prior to the inital crash. > > This has happened repeatedly in production already with a customer > > that uses prepared transactions quite frequently. I assume that > > this has happened for others too, but the circumstances of the crash > > and the cause are very dubious, and troubleshooting it is pretty > > difficult. > > I guess that this is a possibility yes. I have not heard directly > about such a report, but perhaps that's just because few people use > 2PC. +1 > > This behaviour has - apparently unintentionally - been fixed in PG > > 15 and upwards (see commit 811051c ), as part of a general > > restructure and reorganization of this portion of xlog.c (see commit > > 6df1543 ). > > > > Furthermore, it seems this behaviour does not appear in PG 12 and > > older, due to another possible bug: In PG 13 and newer, the > > XLogReaderState is reset in XLogBeginRead() before reading WAL in > > XlogReadTwoPhaseData() in twophase.c . > > In the older releases (PG <= 12), this reset is not done, so the > > requested LSN containing the prepared transaction can (by happy > > coincidence) be read from in-memory buffers, and PostgreSQL > > consequently manages to come up just fine (as the WAL has already > > been read into buffers prior to the .partial rename). If the older > > releases also where to properly reset the XLogReaderState, they > > would also fail to find the LSN on disk, and hence PostgreSQL would > > crash again. > > That's debatable, but I think that I would let v12 and v11 be as they > are. v11 is going to be end-of-life soon and we did not have any > complains on this matter as far as I know, so there is a risk of > breaking something upon its last release. (Got some, Err.. > experiences with that in the past). 
> On REL_11_STABLE, note for > example the slight difference with the handling of > recovery_end_command, where we rely on InRecovery rather than > ArchiveRecoveryRequested. REL_12_STABLE is in a more consistent shape > than v11 regarding that. Agreed about 11; it's no use patching it. About 12, I slightly prefer applying this, but I'm fine without it since no actual problems are seen. > > I've attached patches for PG 14 and PG 13 that mimic the change in > > PG15 (commit 811051c ) and reorder the crucial events, placing the > > recovery of prepared transactions *before* renaming the file. > > Yes, I think that's OK. I would like to add two things to your > proposal for all the existing branches. > - Addition of a comment where RecoverPreparedTransactions() is called > at the end of recovery to tell that we'd better do that before working > on the last partial segment of the old timeline. > - Enforce the use of archiving in 009_twophase.pl. Both look good to me. > > My humble opinion is that this fix should be backpatched to PG 14 > > and PG 13. It's debatable whether the fix needs to be brought back > > to 12 and older also, as those do not exhibit this issue, but the > > order of renaming is still wrong. > > Yeah, I'd rather wait for somebody to complain about that. And v11 is > not worth taking risks with at this time of the year, IMHO. I don't have a complaint on the whole. > With your fix included, the patch for REL_14_STABLE would be like the > attached. Is that OK for you? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Allow pg_archivecleanup to remove backup history files
On Mon, Jun 19, 2023 at 11:24:29AM +0900, torikoshia wrote: > Thanks, now I understand what you meant. If I may ask, why is the refactoring of 0003 done after the feature in 0002? Shouldn't the order be reversed? That would make for a cleaner git history. -- Michael
Re: [BUG] recovery of prepared transactions during promotion can fail
Thanks for the report, reproducer and the patches. At Fri, 16 Jun 2023 16:27:40 +0200, Julian Markwort wrote in > - prepare a transaction > - crash postgresql > - create standby.signal file > - start postgresql, wait for recovery to finish > - promote .. > The promotion will fail with a FATAL error, stating that "requested WAL > segment .* has already been removed". > The FATAL error causes the startup process to exit, so postmaster shuts down > again. > > Here's an exemplary log output, maybe this helps people to find this issue > when they search for it online: > LOG: redo done at 0/15D89B8 > LOG: last completed transaction was at log time 2023-06-16 13:09:53.71118+02 > LOG: selected new timeline ID: 2 > LOG: archive recovery complete > FATAL: requested WAL segment pg_wal/00010001 has already > been removed > LOG: startup process (PID 1650358) exited with exit code 1 Reproduced here. > The cause of this failure is an oversight (rather obvious in hindsight): > The renaming of the WAL file (that was last written to before the crash > happened) to .partial is done *before* PostgreSQL > might have to read this very file to recover prepared transactions from it. > The relevant function calls here are durable_rename() and > RecoverPreparedTransactions() in xlog.c . > This behaviour has - apparently unintentionally - been fixed in PG 15 and > upwards (see commit 811051c ), as part of a > general restructure and reorganization of this portion of xlog.c (see commit > 6df1543 ). I think so; the reordering might have been done for some other reasons, though. > Furthermore, it seems this behaviour does not appear in PG 12 and older, due > to another possible bug: ... > In PG 13 and newer, the XLogReaderState is reset in XLogBeginRead() > before reading WAL in XlogReadTwoPhaseData() in twophase.c . I arrived at the same conclusion.
> In the older releases (PG <= 12), this reset is not done, so the requested > LSN containing the prepared transaction can > (by happy coincidence) be read from in-memory buffers, and PostgreSQL > consequently manages to come up just fine (as the > WAL has already been read into buffers prior to the .partial rename). > If the older releases also were to properly reset the XLogReaderState, they > would also fail to find the LSN on disk, and > hence PostgreSQL would crash again. From the perspective of loading WAL for prepared transactions, the current code in those versions seems fine. Although I suspect Windows may not like renaming currently-open segments, it's likely acceptable as the current test set operates without issue. (I haven't tested this.) > I've attached patches for PG 14 and PG 13 that mimic the change in PG15 > (commit 811051c ) and reorder the crucial events, > placing the recovery of prepared transactions *before* renaming the file. It appears to move the correct part of the code to the proper location, modifying the steps to align with later versions. > I've also attached recovery test scripts for PG >= 12 and PG <= 11 that can > be used to verify that promote after recovery > with prepared transactions works. It effectively detects the bug, though it can't be directly used in the tree as-is. I'm unsure whether we need this in the tree, though. > My humble opinion is that this fix should be backpatched to PG 14 and PG 13. I agree with you. > It's debatable whether the fix needs to be brought back to 12 and older also, > as those do not exhibit this issue, but the > order of renaming is still wrong. > I'm not sure if there could be cases where the in-memory buffers of the > walreader are too small to cover a whole WAL > file. > There could also be other issues from operations that require reading WAL > that happen after the .partial rename, I > haven't checked in depth what else happens in the affected codepath.
> Please let me know if you think this should also be fixed in PG 12 and > earlier, so I can produce the patches for those > versions as well. There's no immediate need to change the versions. However, I would prefer to backpatch them to the older versions for the following reasons. 1. Applying this eases future backpatching in this area, if any. 2. I have reservations about renaming possibly-open WAL segments. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] recovery of prepared transactions during promotion can fail
On Fri, Jun 16, 2023 at 04:27:40PM +0200, Julian Markwort wrote: > I've discovered a serious bug that leads to a server crash upon > promoting an instance that crashed previously and did recovery in > standby mode. Reproduced here, for the versions mentioned. > The bug is present in PostgreSQL versions 13 and 14 (and in earlier > versions, though it doesn't manifest itself so catastrophically). > The circumstances to trigger the bug are as follows: > - postgresql is configured for hot_standby, archiving, and prepared > transactions > - prepare a transaction > - crash postgresql > - create standby.signal file > - start postgresql, wait for recovery to finish > - promote hot_standby allows one to run queries on a standby running recovery, so it seems to me that it does not really matter. Enabling archiving is the critical piece. The nodes set in the TAP test 009_twophase.pl don't use any kind of archiving. But once it is enabled on London the first promotion command of the test fails the same way as you report. # Setup london node my $node_london = get_new_node("london"); -$node_london->init(allows_streaming => 1); +# Archiving is used to force tests with .partial segment creations +# done at the end of recovery. +$node_london->init(allows_streaming => 1, has_archiving => 1); Enabling the archiving does not impact any of the tests, as we don't use restore_command during recovery and only rely on streaming. > The cause of this failure is an oversight (rather obvious in > hindsight): The renaming of the WAL file (that was last written to > before the crash happened) to .partial is done *before* PostgreSQL > might have to read this very file to recover prepared transactions > from it. The relevant function calls here are durable_rename() and > RecoverPreparedTransactions() in xlog.c. > > Note that it is important that the PREPARE entry is in the WAL file > that PostgreSQL is writing to prior to the inital crash. 
> This has happened repeatedly in production already with a customer > that uses prepared transactions quite frequently. I assume that > this has happened for others too, but the circumstances of the crash > and the cause are very dubious, and troubleshooting it is pretty > difficult. I guess that this is a possibility, yes. I have not heard directly about such a report, but perhaps that's just because few people use 2PC. > This behaviour has - apparently unintentionally - been fixed in PG > 15 and upwards (see commit 811051c ), as part of a general > restructure and reorganization of this portion of xlog.c (see commit > 6df1543 ). > > Furthermore, it seems this behaviour does not appear in PG 12 and > older, due to another possible bug: In PG 13 and newer, the > XLogReaderState is reset in XLogBeginRead() before reading WAL in > XlogReadTwoPhaseData() in twophase.c . > In the older releases (PG <= 12), this reset is not done, so the > requested LSN containing the prepared transaction can (by happy > coincidence) be read from in-memory buffers, and PostgreSQL > consequently manages to come up just fine (as the WAL has already > been read into buffers prior to the .partial rename). If the older > releases also were to properly reset the XLogReaderState, they > would also fail to find the LSN on disk, and hence PostgreSQL would > crash again. That's debatable, but I think that I would let v12 and v11 be as they are. v11 is going to be end-of-life soon and we did not have any complaints on this matter as far as I know, so there is a risk of breaking something upon its last release. (Got some, Err.. experiences with that in the past). On REL_11_STABLE, note for example the slight difference with the handling of recovery_end_command, where we rely on InRecovery rather than ArchiveRecoveryRequested. REL_12_STABLE is in a more consistent shape than v11 regarding that.
> I've attached patches for PG 14 and PG 13 that mimic the change in > PG15 (commit 811051c ) and reorder the crucial events, placing the > recovery of prepared transactions *before* renaming the file. Yes, I think that's OK. I would like to add two things to your proposal for all the existing branches. - Addition of a comment where RecoverPreparedTransactions() is called at the end of recovery to tell that we'd better do that before working on the last partial segment of the old timeline. - Enforce the use of archiving in 009_twophase.pl. > My humble opinion is that this fix should be backpatched to PG 14 > and PG 13. It's debatable whether the fix needs to be brought back > to 12 and older also, as those do not exhibit this issue, but the > order of renaming is still wrong. Yeah, I'd rather wait for somebody to complain about that. And v11 is not worth taking risks with at this time of the year, IMHO. With your fix included, the patch for REL_14_STABLE would be like the attached. Is that OK for you? -- Michael
Re: Support logical replication of DDLs
As per a suggestion by Amit, I reviewed two more formats that could be used for DDL WAL-logging; analysis below: NodeToString: I do not think it is a good idea to use NodeToString in DDL replication, for the reasons below: 1) It consists of too much internal and unneeded information. 2) It is too large to be logged in WAL. Optimizing this would be a mammoth task because a) we need to filter out information, since not all of it will be useful to the subscriber; b) it is not a simple key-based search and replace/remove. Modifying strings is always difficult. 3) It's not compatible across major versions. If we want to use Node information in any format, we need to ensure that the output can be read in a different major version of PostgreSQL. Sql-ddl-to-json-schema given in [1]: This was suggested by Sawada-san in one of the discussions. Since it is JSON format, it essentially consists of key:value pairs like the current implementation. The difference here is that there is no "format string" maintained with each JSON object, and thus the awareness of how to construct the DDL (i.e. the exact DDL syntax) needs to be present on the receiver side. In our case, we maintain the "fmt" string, using which the receiver can re-construct DDL statements without knowing PostgreSQL's DDL syntax. The "fmt" string tells us the order of elements/keys and also the representation of values (string, identifier, etc.), while the JSON data created by sql-ddl-to-json-schema looks more generic; the receiver can construct a DDL statement in any form, which would be more useful, for example, when the receiver is not a PostgreSQL server. It thus does not seem a suitable choice for the logical replication use case at hand. [1]: https://www.npmjs.com/package/sql-ddl-to-json-schema thanks Shveta
Re: Subscription statistics are not dropped at DROP SUBSCRIPTION in some cases
On Mon, Jun 19, 2023 at 6:50 AM Masahiko Sawada wrote: > > On Sat, Jun 17, 2023 at 6:45 PM Amit Kapila wrote: > > > > On Tue, May 16, 2023 at 8:00 PM Masahiko Sawada > > wrote: > > > > > > On Thu, May 11, 2023 at 5:12 PM Masahiko Sawada > > > wrote: > > > > > > > > > > After thinking more about it, I realized that this is not a problem > > > specific to HEAD. ISTM the problem is that by commit 7b64e4b3, we drop > > > the stats entry of subscription that is not associated with a > > > replication slot for apply worker, but we missed the case where the > > > subscription is not associated with both replication slots for apply > > > and tablesync. So IIUC we should backpatch it down to 15. > > > > > > > I agree that it should be backpatched to 15. > > > > > Since in pg15, since we don't create the subscription stats at CREATE > > > SUBSCRIPTION time but do when the first error is reported, > > > > > > > AFAICS, the call to pgstat_create_subscription() is present in > > CreateSubscription() in 15 as well, so, I don't get your point. > > IIUC in 15, pgstat_create_subscription() doesn't create the stats > entry. See commit e0b01429590. > Thanks for the clarification. Your changes look good to me, though I haven't tested them. -- With Regards, Amit Kapila.
Re: path->param_info only set for lateral?
James Coleman writes: > Over in "Parallelize correlated subqueries that execute within each > worker" [1} Richard Guo found a bug in the current version of my patch > in that thread. While debugging that issue I've been wondering why > Path's param_info field seems to be NULL unless there is a LATERAL > reference even though there may be non-lateral outer params > referenced. Per pathnodes.h: * "param_info", if not NULL, links to a ParamPathInfo that identifies outer * relation(s) that provide parameter values to each scan of this path. * That means this path can only be joined to those rels by means of nestloop * joins with this path on the inside. ... We're only interested in this for params that are coming from other relations of the same query level, so that they affect join order and join algorithm choices. Params coming down from outer query levels are much like EXTERN params to the planner: they are pseudoconstants for any one execution of the current query level. This isn't just LATERAL stuff; it's also intentionally-generated nestloop-with-inner-indexscan-cases. But it's not outer-level Params. Even though those are also PARAM_EXEC Params, they are fundamentally different animals for the planner's purposes. regards, tom lane
Re: Deleting prepared statements from libpq.
now it works. /src/test/modules/libpq_pipeline/libpq_pipeline.c > > /* Now that it's closed we should get an error when describing */ > res = PQdescribePortal(conn, "cursor_one"); > if (PQresultStatus(res) != PGRES_FATAL_ERROR) > pg_fatal("expected COMMAND_OK, got %s", PQresStatus(PQresultStatus(res))); should it be "if (PQresultStatus(res) == PGRES_FATAL_ERROR)" ? Similarly the following line should also change? > res = PQdescribePrepared(conn, "select_one"); > if (PQresultStatus(res) != PGRES_FATAL_ERROR) > pg_fatal("expected FATAL_ERROR, got %s", PQresStatus(PQresultStatus(res))); typo, unnecessary "portal." in the following sentence? "portalName can be "" or NULL to reference the unnamed portal, it is fine if no portal exists with this name. portal. On success, a PGresult with status PGRES_COMMAND_OK is returned." "Also, although there is no libpq function for deleting a prepared statement, the SQL DEALLOCATE statement can be used for that purpose." Now the PQclosePrepared has the same use as DEALLOCATE, maybe the above sentence should be changed?
path->param_info only set for lateral?
Hello, Over in "Parallelize correlated subqueries that execute within each worker" [1} Richard Guo found a bug in the current version of my patch in that thread. While debugging that issue I've been wondering why Path's param_info field seems to be NULL unless there is a LATERAL reference even though there may be non-lateral outer params referenced. Consider the query: select * from pg_description t1 where objoid in (select objoid from pg_description t2 where t2.description = t1.description); The subquery's rel has a baserestrictinfo containing an OpExpr comparing a Var (t2.description) to a Param of type PARAM_EXEC (t1.description). But the generated SeqScan path doesn't have its param_info field set, which means PATH_REQ_OUTER returns NULL also despite there being an obvious param referencing a required outer relid. Looking at create_seqscan_path we see that param_info is initialized with: get_baserel_parampathinfo(root, rel, required_outer) where required_outer is passed in from set_plain_rel_pathlist as rel->lateral_relids. And get_baserel_parampathinfo always returns NULL if required_outer is empty, so obviously with this query (no lateral reference) we're not going to get any ParamPathInfo added to the path or the rel. Is there a reason why we don't track the required relids providing the PARAM_EXEC params in this case? Thanks, James Coleman 1: https://www.postgresql.org/message-id/CAMbWs4_evjcMzN8Gw78bHfhfo2FKJThqhEjRJRmoMZx%3DNXcJ7w%40mail.gmail.com
Re: Allow pg_archivecleanup to remove backup history files
On 2023-06-16 11:22, Kyotaro Horiguchi wrote: At Thu, 15 Jun 2023 21:38:28 +0900, torikoshia wrote in On 2023-06-15 15:20, Kyotaro Horiguchi wrote: Thanks for your review! > + printf(_(" -x, --strip-extension=EXT strip this extention before > identifying files fo clean up\n")); > + printf(_(" -?, --help show this help, then exit\n")); > After this change, some of these lines corss the boundary of the 80 > columns width. (is that policy viable noadays? I am usually working > using terminal windows with such a width..) It's somewhat unrelated to > this patch, but a help line a few lines further down also exceeds the > width. We could shorten it by removing the "/mnt/server" portion, but > I'm not sure if it's worth doing. I also highlight the 80th column according to the wiki[1]. Since usage() in other files like pg_rewind.c and initdb.c also exceeded the 80th column, I thought that was something like a guide. I think the page is suggesting about program code, not the messages that binaries print. Thanks, now I understand what you meant. AFAICS the main section of the "pg_rewind --help" fits within 80 columns. However, "initdb --help" does output a few lines exceeding the 80-column limit. Judging by the surrounding lines, I believe we're still aiming to maintain the given width. I think we need to fix initdb in that regard.
Hmm, it seems some other commands also exceed 80 columns:
pg_amcheck:
 --skip=OPTION do NOT check "all-frozen" or "all-visible" blocks
 --startblock=BLOCK begin checking table(s) at the given block number
 --endblock=BLOCK check table(s) only up to the given block number
 --no-synchronized-snapshots do not use synchronized snapshots in parallel jobs
pg_isready:
 -t, --timeout=SECS seconds to wait when attempting connection, 0 disables (default: 3)
pg_receivewal:
 --create-slot create a new replication slot (for the slot's name see --slot)
 --drop-slot drop the replication slot (for the slot's name see --slot)
If you don't mind, I'm going to create another thread about this point. I'll also discuss the below line separately since it's unrelated to the current thread, as you pointed out: | pg_archivecleanup /mnt/server/archiverdir 00010010.0020.backup The attached patch fixes the lines exceeding 80 columns by changing to use getopt_long. On 2023-06-16 11:30, Kyotaro Horiguchi wrote: At Fri, 16 Jun 2023 11:22:31 +0900 (JST), Kyotaro Horiguchi wrote in AFAICS the main section of the "pg_rewind --help" fits within 80 columns. However, "initdb --help" does output a few lines exceeding the 80-column limit. Judging by the surrounding lines, I believe we're still aiming to maintain the given width. I think we need to fix initdb in that regard. Mmm, the message was introduced in 2012 by 8a02339e9b. I haven't noticed this until now... regards. -- Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
From 220bbd866158fd69a3a4affe73136f4699353ecd Mon Sep 17 00:00:00 2001 From: Atsushi Torikoshi Date: Mon, 19 Jun 2023 09:46:41 +0900 Subject: [PATCH v8 1/3] Introduce pg_archivecleanup into getopt_long This patch is a preliminary step to add an easy-to-understand option to delete backup history files, but it also adds long options to the existing options.
--- doc/src/sgml/ref/pgarchivecleanup.sgml| 5 - src/bin/pg_archivecleanup/pg_archivecleanup.c | 22 +-- 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/doc/src/sgml/ref/pgarchivecleanup.sgml b/doc/src/sgml/ref/pgarchivecleanup.sgml index 635e7c7685..09991c2fcd 100644 --- a/doc/src/sgml/ref/pgarchivecleanup.sgml +++ b/doc/src/sgml/ref/pgarchivecleanup.sgml @@ -95,6 +95,7 @@ pg_archivecleanup: removing file "archive/00010037000E" -d + --debug Print lots of debug logging output on stderr. @@ -104,6 +105,7 @@ pg_archivecleanup: removing file "archive/00010037000E" -n + --dry-run Print the names of the files that would have been removed on stdout (performs a dry run). @@ -122,7 +124,8 @@ pg_archivecleanup: removing file "archive/00010037000E" - -x extension + -x extension + --strip-extension=extension Provide an extension diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c index 7726d05149..fc0dca9856 100644 --- a/src/bin/pg_archivecleanup/pg_archivecleanup.c +++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c @@ -17,7 +17,7 @@ #include "access/xlog_internal.h" #include "common/logging.h" -#include "pg_getopt.h" +#include "getopt_long.h" const char *progname; @@ -252,11 +252,13 @@ usage(void) printf(_("Usage:\n")); printf(_(" %s [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE\n"), progname); printf(_("\nOptions:\n
Re: Subscription statistics are not dropped at DROP SUBSCRIPTION in some cases
On Sat, Jun 17, 2023 at 6:45 PM Amit Kapila wrote: > > On Tue, May 16, 2023 at 8:00 PM Masahiko Sawada wrote: > > > > On Thu, May 11, 2023 at 5:12 PM Masahiko Sawada > > wrote: > > > > > > > After thinking more about it, I realized that this is not a problem > > specific to HEAD. ISTM the problem is that by commit 7b64e4b3, we drop > > the stats entry of subscription that is not associated with a > > replication slot for apply worker, but we missed the case where the > > subscription is not associated with both replication slots for apply > > and tablesync. So IIUC we should backpatch it down to 15. > > > > I agree that it should be backpatched to 15. > > > Since in pg15, since we don't create the subscription stats at CREATE > > SUBSCRIPTION time but do when the first error is reported, > > > > AFAICS, the call to pgstat_create_subscription() is present in > CreateSubscription() in 15 as well, so, I don't get your point. IIUC in 15, pgstat_create_subscription() doesn't create the stats entry. See commit e0b01429590. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific
Michael Paquier writes: > Another thing that I was wondering, though.. Do you think that there > would be an argument in being stricter in the hstore code regarding > the handling of multi-byte characters with some checks based on > IS_HIGHBIT_SET() when parsing the keys and values? What have you got in mind? We should already have validated encoding correctness before the text ever gets to hstore_in, and I'm not clear what additional checks would be useful. regards, tom lane
Re: Do we want a hashset type?
On Sat, Jun 17, 2023 at 8:38 AM Joel Jacobson wrote: > > On Fri, Jun 16, 2023, at 17:42, Joel Jacobson wrote: > > I realise int4hashset_hash() is broken, > > since two int4hashset's that are considered equal, > > can by coincidence get different hashes: > ... > > Do we have any ideas on how to fix this without sacrificing performance? > > The problem was due to hashset_hash() function accumulating the hashes > of individual elements in a non-commutative manner. As a consequence, the > final hash value was sensitive to the order in which elements were inserted > into the hashset. This behavior led to inconsistencies, as logically > equivalent sets (i.e., sets with the same elements but in different orders) > produced different hash values. > > Solved by modifying the hashset_hash() function to use a commutative operation > when combining the hashes of individual elements. This change ensures that the > final hash value is independent of the element insertion order, and logically > equivalent sets produce the same hash. > > An somewhat unfortunate side-effect of this fix, is that we can no longer > visually sort the hashset output format, since it's not lexicographically > sorted. > I think this is an acceptable trade-off for a hashset type, > since the only alternative I see would be to sort the elements, > but then it wouldn't be a hashset, but a treeset, which has different > Big-O complexity. > > New patch is attached, which will henceforth always be a complete patch, > to avoid the hassle of having to assemble incremental patches. > > /Joel select hashset_contains('{1,2}'::int4hashset,NULL::int); should return null?
-
SELECT attname, pc.relname,
       CASE attstorage
            WHEN 'p' THEN 'plain'
            WHEN 'e' THEN 'external'
            WHEN 'm' THEN 'main'
            WHEN 'x' THEN 'extended'
       END AS storage
FROM pg_attribute pa
JOIN pg_class pc ON pc.oid = pa.attrelid
WHERE attnum > 0 AND pa.attstorage = 'e';
In my system catalog, it seems only the hashset type has storage = 'external'; most are 'extended'.
I am not sure of the consequences of switching from external to extended.
select hashset_hash('{-1,1}') as a1
      ,hashset_hash('{1,-2}') as a2
      ,hashset_hash('{-3,1}') as a3
      ,hashset_hash('{4,1}') as a4;
returns:
     a1      |    a2     |     a3     |     a4
-------------+-----------+------------+------------
 -1735582196 | 998516167 | 1337000903 | 1305426029
(1 row)
The values {a1,a2,a3,a4} should be monotonically increasing, based on the function int4hashset_cmp, but now they are not, so the following queries fail.
--should return only one row.
select hashset_cmp('{2,1}','{3,1}')
union
select hashset_cmp('{3,1}','{4,1}')
union
select hashset_cmp('{1,3}','{4,1}');
select hashset_cmp('{9,10,11}','{10,9,-11}') = hashset_cmp('{9,10,11}','{10,9,-1}'); --should be true
select '{2,1}'::int4hashset > '{7}'::int4hashset; --should be false, based on int array comparison.
-
I commented out the following lines in hashset-api.c, somewhere between {810,829}:
// if (a->hash < b->hash)
//     PG_RETURN_INT32(-1);
// else if (a->hash > b->hash)
//     PG_RETURN_INT32(1);
// if (a->nelements < b->nelements)
//     PG_RETURN_INT32(-1);
// else if (a->nelements > b->nelements)
//     PG_RETURN_INT32(1);
// Assert(a->nelements == b->nelements);
So hashset_cmp will directly compare the int arrays, and the above queries work.
int4hashset_equals and int4hashset_neq are two special cases of hashset_cmp; maybe we can just wrap them like int4hashset_le?
Currently, storing a 10-element int4hashset needs 99 bytes, while a similar one-dimensional bigint array with length 10 occupies 101 bytes.
In int4hashset_send, do the newly added struct members {load_factor, growth_factor, ncollisions, hash} also need to be sent to the buf?
Re: Deleting prepared statements from libpq.
On Sun, Jun 18, 2023 at 01:03:57PM +0200, Jelte Fennema wrote:
> Sorry about that. I attached a new patch that allows linking to the
> new functions (I forgot to add the functions to exports.txt). This new
> patch also adds some basic tests for these new functions.

I am okay with the arguments about pgbouncer and psycopg2.  The
symmetry with the portal description routines makes this proposal easy
to think about.

-	PGQUERY_CLOSE
+	PGQUERY_CLOSE	/* Close Statoment or Portal */

s/Statoment/Statement/.

+ * Available options for close_type are
+ * 'S' to close a prepared statement; or
+ * 'P' to close a portal.
+ * Returns 1 on success and 0 on failure.
+ */
+static int
+PQsendClose(PGconn *conn, char close_type, const char *close_target)

Could it be better for this code path to issue an error if using a
non-supported close_type rather than sending it?  Okay, you are
consistent with desc_type and PQsendDescribe(), just asking if it is
worth doing something about.

+Submits a request to obtain information about the specified
+prepared statement, and waits for completion.

PQclosePrepared() does not request for a description.

+Submits a request to close the the specified
+portal, and waits for completion.

s/the the/the/.

+ allows an application to release
+resources related to a portal previously created portal.  If it was a

The end of this sentence looks a bit weird.

Perhaps there should be some tests for the two async routines, as well?
--
Michael
Re: Deleting prepared statements from libpq.
On Sun, Jun 18, 2023 at 09:23:22PM +0800, jian he wrote:
> previously I cannot link it. with
> v2-0001-Support-sending-Close-messages-from-libpq.patch. now I can
> compile it, link it, but then run time error.
> same c program in the first email.
> when I run it ./a.out, then error:
> ./a.out: symbol lookup error: ./a.out: undefined symbol: PQsendClosePrepared

If you still have problems, it seems to me that one mistake is in not
updating LD_LIBRARY_PATH.  It should point to a version of libpq
compiled with the patch, or -lpq will not be able to resolve correctly
when compiling your test program.
--
Michael
Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific
On Sun, Jun 18, 2023 at 12:38:12PM -0400, Tom Lane wrote:
> FWIW, I think the status quo is fine.  Having hstore do something that
> is neither its historical behavior nor aligned with the core parser
> doesn't seem like a great idea.

Okay.  Fine by me.

> I don't buy this argument that
> somebody might be depending on the handling of \v in particular.  It's
> not any stronger than the argument that they might be depending on,
> say, recognizing no-break space (0xA0) in LATIN1, which the old code
> did (probably, depending on platform) and scanner_isspace will not.

Another thing that I was wondering, though..  Do you think that there
would be an argument in being stricter in the hstore code regarding
the handling of multi-byte characters, with some checks based on
IS_HIGHBIT_SET() when parsing the keys and values?
--
Michael
Re: Do we want a hashset type?
On Sun, Jun 18, 2023, at 18:45, Andrew Dunstan wrote:
> . It might be worth sending a version number with the send function
> (c.f. jsonb_send / jsonb_recv). That way we would not be tied forever
> to some wire representation.

Great idea; implemented.

> . I think there are some important set operations missing: most notably
> intersection, slightly less importantly asymmetric and symmetric
> difference. I have no idea how easy these would be to add, but even for
> your stated use I should have thought set intersection would be useful
> ("Who is a member of both this set of friends and that set of friends?").

Another great idea; implemented.

> . While supporting int4 only is OK for now, I think we would at least
> want to support int8, and probably UUID since a number of systems I know
> of use that as an object identifier.

I agree that's probably the most logical thing to focus on next.
I'm on it.

New patch attached.

/Joel

hashset-0.0.1-75bf3ab.patch
Description: Binary data
Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG
On 6/12/23 17:28, Joe Conway wrote:
> On 6/12/23 10:44, Joe Conway wrote:
> > 1/ how do we fix the misbehavior reported due to libperl in existing
> > stable branches
>
> I was mostly trying to concentrate on #1, but 2 & 3 are worthy of
> discussion.
>
> Hmm, browsing through the perl source I came across a reference to
> this (from https://perldoc.perl.org/perllocale):
>
> ---
> PERL_SKIP_LOCALE_INIT
>
> This environment variable, available starting in Perl v5.20, if set
> (to any value), tells Perl to not use the rest of the environment
> variables to initialize with. Instead, Perl uses whatever the current
> locale settings are. This is particularly useful in embedded
> environments, see "Using embedded Perl with POSIX locales" in
> perlembed.
> ---
>
> Seems we ought to be using that.

Turns out that that does nothing useful as far as I can tell. So I am
back to proposing the attached against pg16beta1, to be backpatched to
pg11.

Since much of the discussion happened on pgsql-bugs, the background
summary for hackers is this:

When plperl is first loaded, the init function eventually works its way
to calling Perl_init_i18nl10n(). In versions of perl >= 5.20, that ends
up at S_emulate_setlocale() which does a series of uselocale() calls.
For reference, RHEL 7 is perl 5.16.3 while RHEL 9 is perl 5.32.1. Older
versions of perl do not have this behavior.

The problem with uselocale() is that it changes the active locale away
from the default global locale. Subsequent uses of setlocale() affect
the global locale, but if that is not the active locale, it does not
control the results of locale dependent functions such as localeconv(),
which is what we depend on in PGLC_localeconv(). The result is
illustrated in this example:

8<----------------
psql test
psql (16beta1)
Type "help" for help.

test=# show lc_monetary;
 lc_monetary
-------------
 en_GB.UTF-8
(1 row)

test=# SELECT 12.34::money AS price;
 price
--------
 £12.34
(1 row)

test=# \q
8<----------------
psql test
psql (16beta1)
Type "help" for help.

test=# load 'plperl';
LOAD
test=# show lc_monetary;
 lc_monetary
-------------
 en_GB.UTF-8
(1 row)

test=# SELECT 12.34::money AS price;
 price
--------
 $12.34
(1 row)
8<----------------

Notice that merely loading plperl makes the currency symbol wrong.

I have proposed a targeted fix that I believe is safe to backpatch --
attached. IIUC, Tom was +1, but Heikki was looking for a more general
solution. My issue with the more general solution is that it will
likely be too invasive to backpatch, and at the moment at least, there
are no other confirmed bugs related to all of this (even if the current
code is more fragile than we would prefer).

I would like to commit this to all supported branches in the next few
days, unless there are other suggestions or objections.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Ensure the global locale gets used by localeconv()

A loaded library, such as libperl, may call uselocale() underneath us.
This can result in localeconv() grabbing the wrong locale for numeric
and monetary symbols and formatting. Fix that by resetting to the
global locale determined by setlocale(). Backpatch to all supported
versions.

Author: Joe Conway
Reviewed-By: Tom Lane
Reported by: Guido Brugnara
Discussion: https://postgr.es/m/flat/17946-3e84cb577e9551c3%40postgresql.org
Backpatch-through: 11

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 31e3b16..9dba161 100644
*** a/src/backend/utils/adt/pg_locale.c
--- b/src/backend/utils/adt/pg_locale.c
*** PGLC_localeconv(void)
*** 505,510 ****
--- 505,517 ----
  	}

  	/*
+ 	 * Ensure the global locale will be used by localeconv().
+ 	 * This is necessary, for example, if another loaded library
+ 	 * such as libperl has done uselocale() underneath us.
+ 	 */
+ 	uselocale(LC_GLOBAL_LOCALE);
+
+ 	/*
  	 * This is tricky because we really don't want to risk throwing error
  	 * while the locale is set to other than our usual settings.  Therefore,
  	 * the process is: collect the usual settings, set locale to special

diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 7af1ccb..5d80e77 100644
*** a/src/fe_utils/print.c
--- b/src/fe_utils/print.c
*** setDecimalLocale(void)
*** 3628,3633 ****
--- 3628,3639 ----
  {
  	struct lconv *extlconv;

+ 	/*
+ 	 * Ensure the global locale will be used by localeconv().
+ 	 * This is necessary, for example, if another loaded library
+ 	 * has done uselocale() underneath us.
+ 	 */
+ 	uselocale(LC_GLOBAL_LOCALE);

  	extlconv = localeconv();

  	/* Don't accept an empty decimal_point string */
Re: Use generation context to speed up tuplesorts
Hi Ronan,

We briefly chatted about the glibc-tuning part of this thread at pgcon,
so I wonder if you're still planning to pursue that. If you do, I
suggest we start a fresh thread, so that it's not mixed with the
already committed improvements of generation context.

I wonder what's the situation with the generation context improvements
already pushed - does setting the glibc thresholds still help, and if
yes how much?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Do we want a hashset type?
On 2023-06-16 Fr 20:38, Joel Jacobson wrote:
> New patch is attached, which will henceforth always be a complete patch,
> to avoid the hassle of having to assemble incremental patches.

Cool, thanks. A couple of random thoughts:

. It might be worth sending a version number with the send function
(c.f. jsonb_send / jsonb_recv). That way we would not be tied forever
to some wire representation.

. I think there are some important set operations missing: most notably
intersection, slightly less importantly asymmetric and symmetric
difference. I have no idea how easy these would be to add, but even for
your stated use I should have thought set intersection would be useful
("Who is a member of both this set of friends and that set of friends?").

. While supporting int4 only is OK for now, I think we would at least
want to support int8, and probably UUID since a number of systems I know
of use that as an object identifier.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific
Michael Paquier writes:
> At the end, no need to do that.  I have been able to hack the
> attached, that shows the difference of treatment for \v when running
> in macOS.  Evan, what do you think?

FWIW, I think the status quo is fine.  Having hstore do something that
is neither its historical behavior nor aligned with the core parser
doesn't seem like a great idea.  I don't buy this argument that
somebody might be depending on the handling of \v in particular.  It's
not any stronger than the argument that they might be depending on,
say, recognizing no-break space (0xA0) in LATIN1, which the old code
did (probably, depending on platform) and scanner_isspace will not.

If anything, the answer for these concerns is that d522b05c8 should
not have been back-patched.  But I'm okay with where we are.

			regards, tom lane
Re: Deleting prepared statements from libpq.
On Sun, Jun 18, 2023 at 7:04 PM Jelte Fennema wrote:
>
> On Sat, 17 Jun 2023 at 15:34, jian he wrote:
> > I failed to link it. I don't know why.
>
> Sorry about that. I attached a new patch that allows linking to the
> new functions (I forgot to add the functions to exports.txt). This new
> patch also adds some basic tests for these new functions.

Previously I could not link it. With
v2-0001-Support-sending-Close-messages-from-libpq.patch I can now
compile and link it, but then I get a runtime error with the same C
program from the first email. When I run ./a.out, the error is:

./a.out: symbol lookup error: ./a.out: undefined symbol: PQsendClosePrepared
Re: Deleting prepared statements from libpq.
On Sat, 17 Jun 2023 at 15:34, jian he wrote:
> I failed to link it. I don't know why.

Sorry about that. I attached a new patch that allows linking to the
new functions (I forgot to add the functions to exports.txt). This new
patch also adds some basic tests for these new functions.

v2-0001-Support-sending-Close-messages-from-libpq.patch
Description: Binary data
Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific
On Sun, Jun 18, 2023 at 10:50:16AM +0900, Michael Paquier wrote:
> The difference between scanner_isspace() and array_isspace() is that
> the former matches with what scan.l stores as rules for whitespace
> characters, but the latter works on values.  For hstore, we want the
> latter, with something that works on values.  To keep the change
> local to hstore, I think that we should just introduce an
> hstore_isspace() which is a copy of array_isspace.  That's a
> duplication, sure, but I think that we may want to think harder about
> \v in the flex scanner, and that's just a few extra lines for
> something that has not changed in 13 years for arrays.  That's also
> easier to think about for stable branches.  If you can send a patch,
> that helps a lot, for sure!

At the end, no need to do that.  I have been able to hack the
attached, that shows the difference of treatment for \v when running
in macOS.  Evan, what do you think?
--
Michael

From 7ede07940011d08f8c0cf05d57b3782f367c0adf Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Sun, 18 Jun 2023 17:29:37 +0900
Subject: [PATCH] More fixes for hstore and locales

This is not really complete without the treatment of \v like array
values.
---
 contrib/hstore/expected/hstore_utf8.out | 31 ++++++++++++++++++++++++
 contrib/hstore/hstore_io.c              | 30 ++++++++++++++++++-----
 contrib/hstore/sql/hstore_utf8.sql      |  7 ++++++
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/contrib/hstore/expected/hstore_utf8.out b/contrib/hstore/expected/hstore_utf8.out
index 4405824413..bbc885a181 100644
--- a/contrib/hstore/expected/hstore_utf8.out
+++ b/contrib/hstore/expected/hstore_utf8.out
@@ -34,3 +34,34 @@ SELECT 'keyąfoo=>valueą'::hstore;
  "keyąfoo"=>"valueą"
 (1 row)
 
+-- More patterns that may depend on isspace() and locales, all discarded.
+SELECT E'key\u000A=>value\u000A'::hstore; -- \n
+     hstore
+----------------
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u0009=>value\u0009'::hstore; -- \t
+     hstore
+----------------
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u000D=>value\u000D'::hstore; -- \r
+     hstore
+----------------
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u000B=>value\u000B'::hstore; -- \v
+     hstore
+----------------
+ "key"=>"value"
+(1 row)
+
+SELECT E'key\u000C=>value\u000C'::hstore; -- \f
+     hstore
+----------------
+ "key"=>"value"
+(1 row)
+
diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
index 999ddad76d..f33b241d54 100644
--- a/contrib/hstore/hstore_io.c
+++ b/contrib/hstore/hstore_io.c
@@ -13,7 +13,6 @@
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "nodes/miscnodes.h"
-#include "parser/scansup.h"
 #include "utils/builtins.h"
 #include "utils/json.h"
 #include "utils/jsonb.h"
@@ -41,6 +40,7 @@ typedef struct
 	int			plen;
 } HSParser;
 
+static bool hstore_isspace(char ch);
 static bool hstoreCheckKeyLength(size_t len, HSParser *state);
 static bool hstoreCheckValLength(size_t len, HSParser *state);
 
@@ -82,6 +82,26 @@ prseof(HSParser *state)
 	return false;
 }
 
+/*
+ * hstore_isspace() --- a non-locale-dependent isspace()
+ *
+ * We used to use isspace() for parsing hstore values, but that has
+ * undesirable results: a hstore value might be silently interpreted
+ * differently depending on the locale setting.  Now we just hard-wire
+ * the traditional ASCII definition of isspace().
+ */
+static bool
+hstore_isspace(char ch)
+{
+	if (ch == ' ' ||
+		ch == '\t' ||
+		ch == '\n' ||
+		ch == '\r' ||
+		ch == '\v' ||
+		ch == '\f')
+		return true;
+	return false;
+}
 
 #define GV_WAITVAL 0
 #define GV_INVAL 1
@@ -119,7 +139,7 @@ get_val(HSParser *state, bool ignoreeq, bool *escaped)
 		{
 			st = GV_WAITESCIN;
 		}
-		else if (!scanner_isspace((unsigned char) *(state->ptr)))
+		else if (!hstore_isspace((unsigned char) *(state->ptr)))
 		{
 			*(state->cur) = *(state->ptr);
 			state->cur++;
@@ -142,7 +162,7 @@ get_val(HSParser *state, bool ignoreeq, bool *escaped)
 			state->ptr--;
 			return true;
 		}
-		else if (scanner_isspace((unsigned char) *(state->ptr)))
+		else if (hstore_isspace((unsigned char) *(state->ptr)))
 		{
 			return true;
 		}
@@ -256,7 +276,7 @@ parse_hstore(HSParser *state)
 		{
 			PRSEOF;
 		}
-		else if (!scanner_isspace((unsigned char) *(state->ptr)))
+		else if (!hstore_isspace((unsigned char) *(state->ptr)))
 		{
 			PRSSYNTAXERROR;
 		}
@@ -310,7 +330,7 @@ parse_hstore(HSParser *state)
 		{
 			return true;
 		}
-		else if (!scanner_isspace((unsigned char) *(state->ptr)))
+		else if (!hstore_isspace((unsigned char) *(state->ptr)))
 		{
 			PRSSYNTAXERROR;
 		}
diff --git a/contrib/hstore/sql/hstore_utf8.sql b/contrib/hstore/sql/hstore_utf8.sql
index face878324..38c9481ee6 100644
--- a/contrib/hstore/sql/hstore_utf8.sql
+++ b/contrib/hstore/sql/hstore_utf8.sql
@@ -17,3 +17,10 @@ SELECT E'key\u0105=>value\u0105'::hstore;
 SELECT 'keyą=>valueą'::hstore;
 SELECT 'ą=>ą'::hstore;