Re: SQL/JSON query functions context_item doc entry and type requirement
On Fri, Jun 21, 2024 at 9:18 PM Amit Langote wrote:
> On Fri, Jun 21, 2024 at 9:47 AM David G. Johnston wrote:
> > On Thu, Jun 20, 2024 at 9:01 AM jian he wrote:
> >> playing around with it.
> >> found some minor issues:
> >>
> >> json_exists allows DEFAULT expression ON ERROR, which is not
> >> mentioned in the doc.
> >> for example:
> >> select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default true ON ERROR);
> >> select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 0 ON ERROR);
> >> select JSON_EXISTS(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' default 11 ON ERROR);
> >
> > Yeah, surprised it works; the documented behavior seems logical. Being
> > able to return a non-boolean here seems odd, especially since it is cast
> > to boolean on output.
> >
> >> JSON_VALUE on error, on empty semantics should be the same as json_query,
> >> like:
> >> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON EMPTY ]
> >> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON ERROR ])
> >>
> >> examples:
> >> select JSON_value(jsonb '[]' , '$' empty array on error);
> >> select JSON_value(jsonb '[]' , '$' empty object on error);
> >
> > Again the documented behavior seems to make sense, and the ability to
> > specify empty in the value function seems like a bug. If you really want
> > an empty array or object you do have access to default. The reason
> > json_query provides for an empty array/object is that it is already
> > expecting to produce an array (object seems a little odd).
> >
> > I agree our docs and code do not match, which needs to be fixed, ideally in
> > the direction of the standard, which I'm guessing our documentation is based
> > off of. But let's not go off of my guess.
>
> Oops, that is indeed not great and, yes, the problem is code not
> matching the documentation, the latter of which is "correct".
>
> Basically, the grammar allows specifying all possible ON ERROR/EMPTY
> behavior values irrespective of the function, so parse analysis should
> be catching and flagging an attempt to specify an incompatible value
> for a given function, which it does not.
>
> I've attached a patch to fix that, which I'd like to push before
> anything else we've been discussing.

While there are still a few hours to go before Saturday noon UTC when
the beta2 freeze goes into effect, I'm thinking to just push this after
beta2 is stamped. Adding an open item for now.

--
Thanks, Amit Langote
Re: ON ERROR in json_query and the like
Hi,

Thanks all for chiming in.

On Fri, Jun 21, 2024 at 8:00 PM Markus Winand wrote:
> So updating the three options:
>
> > 1. Disallow anything but jsonb for context_item (the patch I posted
> > yesterday)
>
> * Non-conforming
> * patch available
>
> > 2. Continue allowing context_item to be non-json character or utf-8
> > encoded bytea strings, but document that any parsing errors do not
> > respect the ON ERROR clause.
>
> * Conforming by choosing IA050 to implement GR4: raise errors independent of
>   the ON ERROR clause.
> * currently committed.
>
> > 3. Go ahead and fix implicit casts to jsonb so that any parsing errors
> > respect ON ERROR (no patch written yet).
>
> * Conforming by choosing IA050 not to implement GR4: parsing happens later,
>   considering the ON ERROR clause.
> * no patch available, not trivial
>
> I guess I'm the only one in favour of 3 ;) My remaining arguments are that
> Oracle and Db2 (LUW) do it that way and also that it is IMHO what users would
> expect. However, as 2 is also conforming (how could I miss that?), proper
> documentation is a very tempting option.

So, we should go with 2 for v17, because while 3 may be very appealing,
there's a risk that it might not get done in the time remaining for v17.

I'll post the documentation patch on Monday.

--
Thanks, Amit Langote
recoveryCheck/008_fsm_truncation is failing on dodo in v14- (due to slow fsync?)
Hello hackers,

This week dodo started failing on the 008_fsm_truncation test
sporadically due to pg_ctl timeout. For example, [1] (REL_14_STABLE):

### Starting node "standby"
# Running: pg_ctl -D /media/pi/250gb/proj/bf2/v17/buildroot/REL_14_STABLE/pgsql.build/src/test/recovery/tmp_check/t_008_fsm_truncation_standby_data/pgdata -l /media/pi/250gb/proj/bf2/v17/buildroot/REL_14_STABLE/pgsql.build/src/test/recovery/tmp_check/log/008_fsm_truncation_standby.log -o --cluster-name=standby start
waiting for server to start... stopped waiting
pg_ctl: server did not start in time
# pg_ctl start failed; logfile:
2024-06-19 21:27:30.843 ACST [13244:1] LOG:  starting PostgreSQL 14.12 on armv7l-unknown-linux-gnueabihf, compiled by gcc (GCC) 15.0.0 20240617 (experimental), 32-bit
2024-06-19 21:27:31.768 ACST [13244:2] LOG:  listening on Unix socket "/media/pi/250gb/proj/bf2/v17/buildroot/tmp/vLgcHgvc7O/.s.PGSQL.50013"
2024-06-19 21:27:35.055 ACST [13246:1] LOG:  database system was interrupted; last known up at 2024-06-19 21:26:55 ACST

A successful run between two failures [2]:

2024-06-21 05:17:43.102 ACST [18033:1] LOG:  database system was interrupted; last known up at 2024-06-21 05:17:31 ACST
2024-06-21 05:18:06.718 ACST [18033:2] LOG:  entering standby mode

(That is, that start took around 20 seconds.)

We can also find a longer "successful" duration at [3]:

008_fsm_truncation_standby.log:
2024-06-19 23:11:13.854 ACST [26202:1] LOG:  database system was interrupted; last known up at 2024-06-19 23:11:02 ACST
2024-06-19 23:12:07.115 ACST [26202:2] LOG:  entering standby mode

(That start took almost a minute.)

So it doesn't seem impossible for this operation to last for more than
two minutes.

The facts that SyncDataDirectory() is executed between these two messages
logged, that 008_fsm_truncation is the only test which turns fsync on,
and that we see no such failures in newer branches (because of
a7f417107), make me suspect that dodo is slow on fsync.
I managed to reproduce similar fsync degradation (and reached a
40-second duration of this start operation) on a slow armv7 device with
an SD card, which gets significantly slower after many test runs without
executing fstrim periodically. So maybe fstrim could help dodo too...

On the other hand, backporting a7f417107 could fix the issue too, but
I'm afraid we'll still see other tests (027_stream_regress) failing like
[4]. When similar failures were observed on Andres Freund's animals,
Andres came to the conclusion that they were caused by fsync too ([5]).

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2024-06-19%2010%3A15%3A08
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2024-06-20%2018%3A30%3A53
[3] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=dodo&dt=2024-06-19%2012%3A30%3A51&stg=recovery-check
[4] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2024-06-21%2018%3A31%3A11
[5] https://www.postgresql.org/message-id/20240321063953.oyfojyq3wbc77xxb%40awork3.anarazel.de

Best regards,
Alexander
Re: New standby_slot_names GUC in PG 17
On Sat, Jun 22, 2024 at 1:49 AM Nathan Bossart wrote:
>
> On Fri, Jun 21, 2024 at 03:50:00PM -0400, Tom Lane wrote:
> > Allow specification of physical standbys that must be synchronized
> > before they are visible to subscribers (Hou Zhijie, Shveta Malik)
> >
> > it seems like the name ought to have some connection to
> > synchronization. Perhaps something like "synchronized_standby_slots"?
>
> IMHO that might be a bit too close to synchronous_standby_names. But the
> name might not be the only issue, as there is a separate proposal [0] to
> add _another_ GUC to tie standby_slot_names to synchronous replication. I
> wonder if this could just be a Boolean parameter or if folks really have
> use-cases for both a list of synchronous standbys and a separate list of
> synchronous standbys for failover slots.

Both have separate functionalities. We need to wait for the standbys in
synchronous_standby_names to be synced at commit time, whereas the
standbys in standby_slot_names don't have such a requirement. The
standbys in standby_slot_names are used by logical WAL senders such that
they will send decoded changes to plugins only after the specified
replication slots confirm receiving WAL. So, combining them doesn't
sound advisable.

--
With Regards,
Amit Kapila.
Re: SQL/JSON query functions context_item doc entry and type requirement
On Fri, Jun 21, 2024 at 8:18 PM Amit Langote wrote:
> > >> JSON_VALUE on error, on empty semantics should be the same as json_query,
> > >> like:
> > >> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON EMPTY ]
> > >> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON ERROR ])
> > >>
> > >> examples:
> > >> select JSON_value(jsonb '[]' , '$' empty array on error);
> > >> select JSON_value(jsonb '[]' , '$' empty object on error);
> >
> > Again the documented behavior seems to make sense, and the ability to
> > specify empty in the value function seems like a bug. If you really want
> > an empty array or object you do have access to default. The reason
> > json_query provides for an empty array/object is that it is already
> > expecting to produce an array (object seems a little odd).
> >
> > I agree our docs and code do not match, which needs to be fixed, ideally in
> > the direction of the standard, which I'm guessing our documentation is based
> > off of. But let's not go off of my guess.
>
> Oops, that is indeed not great and, yes, the problem is code not
> matching the documentation, the latter of which is "correct".
>
> Basically, the grammar allows specifying all possible ON ERROR/EMPTY
> behavior values irrespective of the function, so parse analysis should
> be catching and flagging an attempt to specify an incompatible value
> for a given function, which it does not.
>
> I've attached a patch to fix that, which I'd like to push before
> anything else we've been discussing.

+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid ON ERROR behavior"),
+ errdetail("Only ERROR, NULL, EMPTY [ ARRAY | OBJECT }, or DEFAULT is allowed in ON ERROR for JSON_QUERY()."),
+ parser_errposition(pstate, func->on_error->location));

`EMPTY [ ARRAY | OBJECT }` seems not correct; maybe just EMPTY, EMPTY
ARRAY, EMPTY OBJECT. (apply to other places)

`DEFAULT` or just `DEFAULT expression` -- would the latter be more correct?
(apply to other places)

I think we should make json_query, json_value on empty, on error behave
the same way; otherwise, it will have consistency issues for scalar
jsonb. For example, should we expect the following two queries to return
the same result?

SELECT * FROM JSON_query(jsonb '1', '$.a' returning jsonb empty on empty);
SELECT * FROM JSON_value(jsonb '1', '$.a' returning jsonb empty on empty);

Also the json_table function will call json_value or json_query; making
these two functions' on error, on empty behavior the same can reduce
unintended complexity.

So based on your patch
(v1-0001-SQL-JSON-Disallow-incompatible-values-in-ON-ERROR.patch) and
the above points, I have made some changes, attached. It will make
json_value, json_query not allow {true | false | unknown } on error,
{true | false | unknown } on empty. json_table error messages are dealt
with the same way as b4fad46b6bc8a9bf46ff689bcb1bd4edf8f267af.

BTW, I found one JSON_TABLE document deficiency:

[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT expression } ON ERROR ]

it should be

[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ARRAY] | OBJECT } | DEFAULT expression } ON ERROR ]

From 0a8b756a56f3750ad07f09d7933c9cae0218ec93 Mon Sep 17 00:00:00 2001
From: jian he
Date: Sat, 22 Jun 2024 17:28:53 +0800
Subject: [PATCH v2 2/2] Disallow incompatible values in ON ERROR/EMPTY

make json_exists only allow
  { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR
make json_query, json_value only allow
  { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON EMPTY
  { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON ERROR

make json_value, json_query on empty, on error behave the same way,
because for scalar jsonb, json_value, json_query behave the same, so
sometimes they may need to return the same value.
also the json_table function may call json_value or json_query; making
these two functions' on error, on empty behavior the same can reduce
unintended complexity.
---
 src/backend/parser/parse_expr.c               | 143 ++
 .../regress/expected/sqljson_jsontable.out    |  30
 .../regress/expected/sqljson_queryfuncs.out   |  11 +-
 src/test/regress/sql/sqljson_jsontable.sql    |   7 +
 src/test/regress/sql/sqljson_queryfuncs.sql   |   2 +-
 5 files changed, 126 insertions(+), 67 deletions(-)

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 3d79d171..f3ffa88f 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4299,74 +4299,95 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 			parser_errposition(pstate, format->location));
 	}
 
-	/* OMIT QUOTES is meaningless when strings are wrapped. */
+	/* check json_query, json_value, on error, on empty behavior */
+	if (func->op == JSON_Q
Re: Add pg_get_acl() function get the ACL for a database object
On Sat, Jun 22, 2024, at 02:54, Joel Jacobson wrote:
> Attachments:
> * v4-0001-Add-pg_get_acl.patch
> * 0002-Add-pg_get_acl-overloads.patch

Rebased, and reduced the diff for src/test/regress/sql/privileges.sql
between the patches.

/Joel

v5-0001-Add-pg_get_acl.patch
Description: Binary data

v2-0002-Add-pg_get_acl-overloads.patch
Description: Binary data
Re: New standby_slot_names GUC in PG 17
On Sat, Jun 22, 2024 at 1:49 AM Nathan Bossart wrote:
>
> On Fri, Jun 21, 2024 at 03:50:00PM -0400, Tom Lane wrote:
> > Allow specification of physical standbys that must be synchronized
> > before they are visible to subscribers (Hou Zhijie, Shveta Malik)
> >
> > it seems like the name ought to have some connection to
> > synchronization. Perhaps something like "synchronized_standby_slots"?
>
> IMHO that might be a bit too close to synchronous_standby_names.

Right, but better than the current one. The other possibility could be
wait_for_standby_slots.

--
With Regards,
Amit Kapila.
Re: recoveryCheck/008_fsm_truncation is failing on dodo in v14- (due to slow fsync?)
On Sat, 22 Jun 2024 at 18:30, Alexander Lakhin wrote:
> So it doesn't seem impossible for this operation to last for more than two
> minutes.
>
> The facts that SyncDataDirectory() is executed between these two messages
> logged, 008_fsm_truncation is the only test which turns fsync on, and we
> see no such failures in newer branches (because of a7f417107), make me
> suspect that dodo is slow on fsync.

Not sure if it helps, but I can confirm that dodo is used for multiple
tasks and that it is using a (slow) external USB3 disk. Also, while
using dodo last week (for something unrelated), I noticed ~30MB/s usage
in iotop and a 1-min CPU load around ~7.

Right now (while dodo is idle), via dd I see that ~30MB/s is pretty much
the max:

pi@pi4:/media/pi/250gb $ dd if=/dev/zero of=./test count=1024 oflag=direct bs=128k
1024+0 records in
1024+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 4.51225 s, 29.7 MB/s

pi@pi4:/media/pi/250gb $ dd if=/dev/zero of=./test count=1024 oflag=dsync bs=128k
1024+0 records in
1024+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 24.4916 s, 5.5 MB/s

- robins
Re: Track the amount of time waiting due to cost_delay
Hi,

On Thu, Jun 13, 2024 at 11:56:26AM +0000, Bertrand Drouvot wrote:
> === Observations ===
>
> As compared to v2:
>
> 1. scanning heap time is about the same
> 2. vacuuming indexes time is about 3 minutes longer on master
> 3. vacuuming heap time is about the same

I had a closer look to understand why the vacuuming indexes time has
been about 3 minutes longer on master.

During the vacuuming indexes phase, the leader is helping vacuum the
indexes until it reaches WaitForParallelWorkersToFinish() (meaning that
once all the remaining indexes are currently handled by the parallel
workers, the leader has nothing more to do, so it waits for the parallel
workers to finish).

During the time the leader process is involved in index vacuuming, it is
also subject to waits due to cost delay. But with v2 applied, the leader
may be interrupted by the parallel workers while it is waiting (due to
the new pgstat_progress_parallel_incr_param() calls that the patch is
adding). So, with v2 applied, the leader waits less (as it is
interrupted while waiting) when involved in index vacuuming, and that's
why v2 is "faster" than master.

To put some numbers on it, I counted the number of times the leader's
pg_usleep() has been interrupted (by counting the number of times
nanosleep() returned a value < 0 in pg_usleep()). Here they are:

v2: the leader has been interrupted about 342605 times
master: the leader has been interrupted about 36 times

The ones on master are mainly coming from the
pgstat_progress_parallel_incr_param() calls in
parallel_vacuum_process_one_index(). The additional ones on v2 are
coming from the pgstat_progress_parallel_incr_param() calls added in
vacuum_delay_point().

Conclusion:

1. vacuuming indexes time has been longer on master because with v2 the
leader has been interrupted 342605 times while waiting, making v2
"faster".

2.
the leader being interrupted while waiting is also already happening on
master due to the pgstat_progress_parallel_incr_param() calls in
parallel_vacuum_process_one_index() (that have been added in
46ebdfe164). It has been the case "only" 36 times during my test case.

I think that 2. is less of a concern, but I think that 1. is something
that needs to be addressed, because the leader process is not honouring
its cost delay wait time in a noticeable way (at least during my test
case).

I did not think of a proposal yet, just sharing my investigation as to
why v2 has been faster than master during the vacuuming indexes phase.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
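The effect described here -- sleeps that are interrupted and never resumed under-honor the configured delay -- can be modeled with a small simulation. The numbers below are purely illustrative (not measurements from the test above):

```python
# Model of a cost-delay loop whose sleeps can be cut short and are not
# resumed afterwards (like pg_usleep() returning early on interruption).
# All numbers are illustrative, not measurements.

def total_waited(iterations, delay_ms, interrupted_after_ms=None):
    """Sum the time actually slept across all delay points.

    If interrupted_after_ms is set, every sleep is cut short at that
    point and never resumed, mirroring an un-resumed pg_usleep().
    """
    if interrupted_after_ms is None:
        per_sleep = delay_ms
    else:
        per_sleep = min(delay_ms, interrupted_after_ms)
    return iterations * per_sleep

intended = total_waited(1000, 20)                        # uninterrupted: 20 s of delay
actual = total_waited(1000, 20, interrupted_after_ms=5)  # each sleep cut at 5 ms

print(intended, actual)  # 20000 5000 (ms): 75% of the cost delay is lost
```

Under these hypothetical parameters, frequent interruptions cost the leader three quarters of its intended cost-delay time, which is the kind of discrepancy being reported.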
Re: New standby_slot_names GUC in PG 17
On Sat, Jun 22, 2024 at 03:17:03PM +0530, Amit Kapila wrote:
> On Sat, Jun 22, 2024 at 1:49 AM Nathan Bossart wrote:
> >
> > On Fri, Jun 21, 2024 at 03:50:00PM -0400, Tom Lane wrote:
> > > Allow specification of physical standbys that must be synchronized
> > > before they are visible to subscribers (Hou Zhijie, Shveta Malik)
> > >
> > > it seems like the name ought to have some connection to
> > > synchronization. Perhaps something like "synchronized_standby_slots"?
> >
> > IMHO that might be a bit too close to synchronous_standby_names.
>
> Right, but better than the current one. The other possibility could be
> wait_for_standby_slots.

FYI, changing this GUC name could force an initdb because
postgresql.conf would have the old name, and removing the comment to
change it would cause an error. Therefore, we should change it ASAP.

--
Bruce Momjian  https://momjian.us
EDB  https://enterprisedb.com

Only you can decide what is important to you.
Re: Table AM Interface Enhancements
On Fri, Jun 21, 2024 at 7:37 PM Matthias van de Meent wrote:
> On Tue, 16 Apr 2024 at 12:34, Alexander Korotkov wrote:
> >
> > You're right. No sense trying to fix this. Reverted.
>
> I just noticed that this revert (commit 6377e12a) seems to have
> introduced two comment blocks atop TableAmRoutine's
> scan_analyze_next_block, and I can't find a clear reason why these are
> two separate comment blocks.
> Furthermore, both comment blocks seemingly talk about different
> implementations of a block-based analyze functionality, and I don't
> have the time to analyze which of these comments is authoritative and
> which are misplaced or obsolete.

Thank you, I've just removed the first comment. It contained
heap-specific information and had been copied here from
heapam_scan_analyze_next_block().

--
Regards,
Alexander Korotkov
Supabase
Re: FreezeLimit underflows in pg14 and 15 causing incorrect behavior in heap_prepare_freeze_tuple
On Fri, Jun 21, 2024 at 4:22 PM Melanie Plageman wrote:
>
> While investigating a bug over on [1], I found that
> vacuum_set_xid_limits() is calculating freezeLimit in an unsafe way on
> at least Postgres 14 and 15.
>
>     limit = *oldestXmin - freezemin;
>     safeLimit = ReadNextTransactionId() - autovacuum_freeze_max_age;
>     if (TransactionIdPrecedes(limit, safeLimit))
>         limit = *oldestXmin;
>     *freezeLimit = limit;
>
> All of these are unsigned, so it doesn't work very nicely when
> freezemin (based on autovacuum_freeze_min_age) is larger than
> oldestXmin and autovacuum_freeze_max_age is bigger than the next
> transaction ID -- which is pretty common right after initdb, for
> example.
>
> I noticed the effect of this because FreezeLimit is saved in the
> LVRelState and passed to heap_prepare_freeze_tuple() as cutoff_xid,
> which is used to guard against freezing tuples that shouldn't be
> frozen.
>
> I didn't propose a fix because I just want to make sure I'm not
> missing something first.

Hmm. So perhaps this subtraction results in the desired behavior for the
freeze limit -- but by using FreezeLimit as the cutoff_xid for
heap_prepare_freeze_tuple(), you can still end up considering freezing
tuples with xmax older than OldestXmin. This results in erroring out
with "cannot freeze committed xmax" on 16 and master, but not erroring
out like this on 14 and 15 for the same tuple and cutoff values.

- Melanie
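The unsigned underflow described in the quoted computation can be reproduced outside the backend with 32-bit modular arithmetic. The values below are made-up but typical for a cluster shortly after initdb (tiny oldestXmin, default vacuum_freeze_min_age of 50 million):

```python
# Illustration of the unsigned wraparound in `limit = *oldestXmin - freezemin`.
# XIDs are 32-bit unsigned integers, so subtraction wraps modulo 2**32.
MASK = 0xFFFFFFFF  # 2**32 - 1

def xid_sub(a, b):
    """Unsigned 32-bit subtraction, as the C code effectively performs."""
    return (a - b) & MASK

# Hypothetical values: oldestXmin is tiny right after initdb, while
# freezemin is the default 50 million.
oldest_xmin = 1000
freeze_min = 50_000_000

limit = xid_sub(oldest_xmin, freeze_min)
print(limit)  # 4244968296 -- wrapped around into a numerically huge XID
```

The result is numerically enormous rather than a small "older" XID, which is why downstream comparisons against tuple xmax values can behave unexpectedly.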
Re: New standby_slot_names GUC in PG 17
Bruce Momjian writes:
> FYI, changing this GUC name could force an initdb because
> postgresql.conf would have the old name and removing the comment to
> change it would cause an error. Therefore, we should change it ASAP.

That's not a reason for a forced initdb IMO. It's easily fixed by hand.

At this point we're into the release freeze for beta2, so even if we had
consensus on a new name it should wait till after. So I see no
particular urgency to make a decision.

regards, tom lane
Re: FreezeLimit underflows in pg14 and 15 causing incorrect behavior in heap_prepare_freeze_tuple
On Sat, Jun 22, 2024 at 10:43 AM Melanie Plageman wrote:
> Hmm. So perhaps this subtraction results in the desired behavior for
> freeze limit -- but by using FreezeLimit as the cutoff_xid for
> heap_prepare_freeze_tuple(), you can still end up considering freezing
> tuples with xmax older than OldestXmin.

Using a FreezeLimit > OldestXmin is just wrong. I don't think that that
even needs to be discussed.

> This results in erroring out with "cannot freeze committed xmax" on 16
> and master but not erroring out like this in 14 and 15 for the same
> tuple and cutoff values.

I don't follow. I thought that 16 and master don't have this particular
problem? Later versions don't use safeLimit as FreezeLimit like this.

--
Peter Geoghegan
Re: [EXTERNAL] Re: Add non-blocking version of PQcancel
Hello hackers,

30.03.2024 01:17, Noah Misch wrote:
> On Fri, Mar 29, 2024 at 09:17:55AM +0100, Jelte Fennema-Nio wrote:
> > On Thu, 28 Mar 2024 at 19:03, Tom Lane wrote:
> > > Could we make this test bulletproof by using an injection point?
> > > If not, I remain of the opinion that we're better off without it.
> >
> > Possibly, and if so, I agree that would be better than the currently
> > added test. But I honestly don't feel like spending the time on
> > creating such a test. The SQL test is more representative of real
> > applications, and it's way simpler to understand.
>
> In general, I prefer 6-line SQL tests that catch a problem 10% of the
> time over injection point tests that catch it 100% of the time. For a
> low detection rate to be exciting, it needs to be low enough to have a
> serious chance of all buildfarm members reporting green for the bad
> commit. With ~115 buildfarm members running in the last day, 0.1%
> detection rate would have been low enough to bother improving, but 4%
> would be high enough to call it good.

As a recent buildfarm failure on olingo (which tests asan-enabled
builds) [1] shows, that test can still fail:

70/70 postgresql:postgres_fdw-running / postgres_fdw-running/regress ERROR 278.67s exit status 1

@@ -2775,6 +2775,7 @@
 SET LOCAL statement_timeout = '10ms';
 select count(*) from ft1 CROSS JOIN ft2 CROSS JOIN ft4 CROSS JOIN ft5; -- this takes very long
 ERROR:  canceling statement due to statement timeout
+WARNING:  could not get result of cancel request due to timeout
 COMMIT;

(From the next run we can see the normal duration:
"postgres_fdw-running/regress OK 6.30s".)

I reproduced the failure with an asan-enabled build on a slowed-down VM,
and as far as I can see, it's caused by the following condition in
ProcessInterrupts():

	/*
	 * If we are reading a command from the client, just ignore the cancel
	 * request --- sending an extra error message won't accomplish
	 * anything. Otherwise, go ahead and throw the error.
	 */
	if (!DoingCommandRead)
	{
		LockErrorCleanup();
		ereport(ERROR,
				(errcode(ERRCODE_QUERY_CANCELED),
				 errmsg("canceling statement due to user request")));
	}

I think this failure can be reproduced easily (without asan/slowing
down) with this modification:

@@ -4630,6 +4630,7 @@ PostgresMain(const char *dbname, const char *username)
 			idle_session_timeout_enabled = false;
 		}
 
+if (rand() % 10 == 0) pg_usleep(1);
 		/*
 		 * (5) disable async signal conditions again.

Running this test in a loop:

for ((i=1;i<=100;i++)); do
  echo "iteration $i"; make -s check -C contrib/postgres_fdw/ || break;
done

I get:
...
iteration 56
# +++ regress check in contrib/postgres_fdw +++
# initializing database system by copying initdb template
# using temp instance on port 55312 with PID 991332
ok 1 - postgres_fdw 20093 ms
1..1
# All 1 tests passed.
iteration 57
# +++ regress check in contrib/postgres_fdw +++
# initializing database system by copying initdb template
# using temp instance on port 55312 with PID 992152
not ok 1 - postgres_fdw 62064 ms
1..1
...
--- .../contrib/postgres_fdw/expected/postgres_fdw.out	2024-06-22 02:52:42.991574907 +
+++ .../contrib/postgres_fdw/results/postgres_fdw.out	2024-06-22 14:43:43.949552927 +
@@ -2775,6 +2775,7 @@
 SET LOCAL statement_timeout = '10ms';
 select count(*) from ft1 CROSS JOIN ft2 CROSS JOIN ft4 CROSS JOIN ft5; -- this takes very long
 ERROR:  canceling statement due to statement timeout
+WARNING:  could not get result of cancel request due to timeout
 COMMIT;

I also came across another failure of the test:

@@ -2774,7 +2774,7 @@
 BEGIN;
 SET LOCAL statement_timeout = '10ms';
 select count(*) from ft1 CROSS JOIN ft2 CROSS JOIN ft4 CROSS JOIN ft5; -- this takes very long
-ERROR:  canceling statement due to statement timeout
+ERROR:  canceling statement due to user request
 COMMIT;

which is reproduced with a sleep added here:

@@ -1065,6 +1065,7 @@ exec_simple_query(const char *query_string)
 	 */
 	parsetree_list = pg_parse_query(query_string);
 
+pg_usleep(11000);

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=olingo&dt=2024-06-20%2009%3A52%3A04

Best regards,
Alexander
Re: replace strtok()
On 18.06.24 13:43, Ranier Vilela wrote:
> But I would like to see more const char * where this is possible.
> For example, in pg_locale.c
> IMO, the token variable can be const char *.
> At least strchr expects a const char * as the first parameter.

This would not be future-proof. In C23, if you pass a const char * into
strchr(), you also get a const char * as a result. And in this case, we
do write into the area pointed to by the result. So with a const char
*token, this whole thing would not compile cleanly under C23.

> I found another implementation of strsep, it seems lighter to me.
> I will attach it for consideration, however, I have not done any
> testing.

Yeah, surely there are many possible implementations. I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal with.
Re: replace strtok()
Peter Eisentraut writes:
> On 18.06.24 13:43, Ranier Vilela wrote:
> > I found another implementation of strsep, it seems lighter to me.
> > I will attach it for consideration, however, I have not done any
> > testing.

> Yeah, surely there are many possible implementations. I'm thinking,
> since we already took other str*() functions from OpenBSD, it makes
> sense to do this here as well, so we have only one source to deal with.

Why not use strpbrk? That's equally thread-safe, it's been there since
C89, and it doesn't have the problem that you can't find out which of
the delimiter characters was found.

regards, tom lane
Re: FreezeLimit underflows in pg14 and 15 causing incorrect behavior in heap_prepare_freeze_tuple
On Sat, Jun 22, 2024 at 10:53 AM Peter Geoghegan wrote:
>
> On Sat, Jun 22, 2024 at 10:43 AM Melanie Plageman wrote:
> > Hmm. So perhaps this subtraction results in the desired behavior for
> > freeze limit -- but by using FreezeLimit as the cutoff_xid for
> > heap_prepare_freeze_tuple(), you can still end up considering freezing
> > tuples with xmax older than OldestXmin.
>
> Using a FreezeLimit > OldestXmin is just wrong. I don't think that
> that even needs to be discussed.

Because it is an unsigned int that wraps around, FreezeLimit can precede
OldestXmin, but we can have a tuple whose xmax precedes OldestXmin but
does not precede FreezeLimit. So, the question is whether it is okay to
freeze tuples whose xmax precedes OldestXmin but follows FreezeLimit.

> > This results in erroring out with "cannot freeze committed xmax" on 16
> > and master but not erroring out like this in 14 and 15 for the same
> > tuple and cutoff values.
>
> I don't follow. I thought that 16 and master don't have this
> particular problem? Later versions don't use safeLimit as FreezeLimit
> like this.

Yes, 16 and master don't consider freezing a tuple with an xmax older
than OldestXmin because they use OldestXmin for cutoff_xid, and that
errors out in heap_prepare_freeze_tuple(). 14 and 15 (and maybe earlier,
but I didn't check) use FreezeLimit, so they do consider freezing tuples
with xmax older than OldestXmin.

- Melanie
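For readers unfamiliar with XID comparison: "precedes" here is circular, decided by the sign of the wrapped 32-bit difference. The following is a rough model of that comparison (not the actual TransactionIdPrecedes() code, which also special-cases permanent XIDs), using made-up values to reproduce the scenario described above:

```python
import ctypes

def xid_precedes(a, b):
    """Circular comparison of 32-bit XIDs: a logically precedes b when
    the signed interpretation of (a - b) mod 2**32 is negative. Rough
    model of TransactionIdPrecedes(); the real function also
    special-cases permanent XIDs."""
    diff = ctypes.c_int32((a - b) & 0xFFFFFFFF).value
    return diff < 0

# Hypothetical post-initdb values: the underflowed FreezeLimit still
# circularly precedes OldestXmin...
oldest_xmin = 1000
freeze_limit = (oldest_xmin - 50_000_000) & 0xFFFFFFFF  # 4244968296

print(xid_precedes(freeze_limit, oldest_xmin))  # True

# ...but a tuple xmax can precede OldestXmin without preceding
# FreezeLimit, which is exactly the situation being discussed:
xmax = 500
print(xid_precedes(xmax, oldest_xmin))   # True
print(xid_precedes(xmax, freeze_limit))  # False
```

This shows how, with a wrapped FreezeLimit as cutoff_xid, a tuple whose xmax is older than OldestXmin can still fall on the "not yet freezable" side of the cutoff.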
Inconsistent Parsing of Offsets with Seconds
Hackers,

The treatment of timestamptz (and timetz) values with offsets that
include seconds seems a bit inconsistent.

One can create such timestamps through the input function:

david=# select '2024-06-22T12:35:00+02:30:15'::timestamptz;
      timestamptz
------------------------
 2024-06-22 10:04:45+00

But the offset seconds are dropped (or rounded away?) by
to_timestamp()'s `OF` and `TZ` formats[2]:

david=# select to_timestamp('2024-06-03 12:35:00+02:30:15', 'YYYY-MM-DD HH24:MI:SSOF');
      to_timestamp
------------------------
 2024-06-03 10:05:00+00

david=# select to_timestamp('2024-06-03 12:35:00+02:30:15', 'YYYY-MM-DD HH24:MI:SSTZ');
      to_timestamp
------------------------
 2024-06-03 02:05:00-08

The corresponding jsonpath methods don't like offsets with seconds *at all*:

david=# select jsonb_path_query('"2024-06-03 12:35:00+02:30:15"', '$.datetime("YYYY-MM-DD HH24:MI:SSOF")');
ERROR:  trailing characters remain in input string after datetime format

david=# select jsonb_path_query('"2024-06-03 12:35:00+02:30:15"', '$.timestamp_tz()');
ERROR:  timestamp_tz format is not recognized: "2024-06-03 12:35:00+02:30:15"

I see from the source[1] that offsets between plus or minus 15:59:59 are
allowed; should the `OF` and `TZ` formats be able to parse them? Or
perhaps there should be a `TZS` format to complement `TZH` and `TZM`?

Best,

David

[1] https://github.com/postgres/postgres/blob/70a845c/src/include/datatype/timestamp.h#L136-L142
[2] https://www.postgresql.org/docs/16/functions-formatting.html
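For comparison, other datetime implementations do handle seconds in an offset; e.g., Python's datetime accepts them and applies the full offset. This is shown only to illustrate the expected arithmetic -- subtracting +02:30:15 from 12:35:00 yields 10:04:45 UTC, matching the timestamptz input-function result quoted above:

```python
from datetime import datetime, timezone

# An offset with a seconds component (+02:30:15) is accepted by
# fromisoformat(); normalizing to UTC applies all three components.
ts = datetime.fromisoformat("2024-06-22T12:35:00+02:30:15")
print(ts.astimezone(timezone.utc).isoformat())  # 2024-06-22T10:04:45+00:00
```

So the question is only about the parsing side of to_timestamp() and the jsonpath methods; the underlying arithmetic for seconds-bearing offsets is well defined.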
Re: Meson far from ready on Windows
Hi,

On 2024-06-21 12:20:49 +0100, Dave Page wrote:
> On Thu, 20 Jun 2024 at 21:58, Andres Freund wrote:
> That is precisely what https://github.com/dpage/winpgbuild/ was intended
> for - and it works well for PG <= 16.

If we develop it into that - I'd be happy. I mostly want to be able to
do automated tests on windows with all reasonable dependencies, and
occasionally do some interactive investigation, without a lot of setup
time.

One small advantage of something outside of PG is that it's easy to add
additional dependencies when developing additional features. Except of
course all the windows packaging systems seem ... suboptimal.

> I don't think it's unreasonable to not support static linking, but I take
> your point.

Separately from this thread: ISTM that on windows it'd be quite
beneficial to link a few things statically, given how annoying dealing
with dlls can be? There's also the perf issue addressed further down.

> My assumption all along was that Meson would replace autoconf etc. before
> anything happened with MSVC, precisely because that's the type of
> environment all the Postgres devs work in primarily. Instead we seem to
> have taken what I think is a flawed approach of entirely replacing the
> build system on the platform none of the devs use, whilst leaving the new
> system as an experimental option on the platforms it will have had orders
> of magnitude more testing.

The old system was a major bottleneck. For one, there was no way to run
all tests. And even the tests that one could run would run serially,
leading to exceedingly long test times. While that could partially be
addressed by having both build systems in parallel, the old one would
frequently break in a way that one couldn't reproduce on other systems.
And resource-wise it wasn't feasible to test both the old and new
systems for cfbot/CI.

> What makes it worse, is that I don't believe anyone was warned about such a
> drastic change.
> Packagers were told about the git archive changes to the
> tarball generation, but that's it (and we've said before, we can't expect
> packagers to follow all the activity on -hackers).

I'm sure we could have dealt better with it. There certainly was some lack of cohesion because I wasn't able to drive the src/tools/msvc removal and Michael took up the slack. But I also don't think it's really fair to say that there was no heads up. Several people at EDB participated in the removal and buildfarm maintainers at EDB were repeatedly pinged to move their buildfarm animals over. And of course the meson stuff came out a year earlier and it wouldn't have been exactly unreasonable

> Well as I noted, that is the point of my Github repo above. You can just go
> download the binaries from the all_deps action - you can even download the
> config.pl and buildenv.pl that will work with those dependencies (those
> files are artefacts of the postgresql action).

For the purpose of CI we'd really need debug builds of most of the libraries - there are compat issues when mixing debug/non-debug runtimes (we've hit them occasionally) and not using the debug runtime hides a lot of issues. Of course also not optimal for CI / BF usage.

> > - Linking the backend dynamically against lz4, icu, ssl, xml, xslt, zstd,
> > zlib
> > slows down the tests noticeably (~20%). So I ended up building those
> > statically.

> Curious. I wonder if that translates into a general 20% performance hit.
> Presumably it would for anything that looks similar to whatever test/tests
> are affected.

FWIW, dynamic linking has a noticeable overhead on other platforms too. A non-dependencies-enabled postgres can do about 2x the connections-per-second than a fully kitted out postgres can (basically due to more memory mapping metadata being copied). But on windows the overhead is larger because so much more happens for every new connection, including loading all dlls from scratch.
I suspect linking a few libraries statically would be quite worth it on windows. On other platforms it'd be quite inadvisable to statically link libraries, due to security updates, but for stuff like the EDB windows installer dynamic linking doesn't really help with that afaict?

> > I ran into some issue with using a static libintl. I made it work, but for
> > now reverted to a dynamic one.
> >
> > - Enabling nls slows down the tests by about 15%, somewhat painful. This is
> > when statically linking, it's a bit worse when linked dynamically :(.
>
> That one I can imagine being in psql, so maybe not a big issue for most
> real world use cases.

I think it's both psql and backend. I think partially it's just due to the additional libraries being linked in everywhere (intl and iconv) and partially it's the additional indirection that happens in a bunch more places. We have a bunch of places where we do gettext lookups but never use the result unless you use DEBUG3 or such, and that's not free. It also triggers additional filesystem
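The lookup-before-filtering pattern described here can be sketched in miniature. This is a hypothetical, much-simplified analogue of the elog/errmsg situation, not PostgreSQL's actual code; the level names and threshold are assumptions for illustration:

```python
import gettext

DEBUG3 = 3
log_min_level = 1  # messages above this level are discarded

def elog(level, translated_msg):
    # By the time we get here, the translation lookup has already happened.
    if level <= log_min_level:
        print(translated_msg)

# gettext() is evaluated unconditionally at the call site, so the catalog
# lookup cost is paid even though this DEBUG3 message is never emitted.
elog(DEBUG3, gettext.gettext("connection received"))
```

With no message catalog installed, gettext() simply returns its argument, but the point stands: the lookup work happens whether or not the message is ever printed.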
Re: Meson far from ready on Windows
Hi,

On 2024-06-21 15:36:56 +0100, Dave Page wrote:
> For giggles, I took a crack at doing that, manually creating .pc files for
> everything I've been working with so far.

Cool!

> It seems to work as expected, except that unlike everything else libintl is
> detected entirely based on whether the header and library can be found. I
> had to pass extra lib and include dirs:

Yea, right now libintl isn't using dependency detection because I didn't see any platform where it's distributed with a .pc file or such. It'd be just a line or two to make it use one...

Greetings,

Andres Freund
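For anyone trying the same experiment, a hand-written .pc file for libintl might look like the following. All paths and the version number are hypothetical placeholders; adjust them to the actual dependency tree:

```
# libintl.pc -- hand-written, since gettext typically ships without one
# (prefix and Version below are assumptions, not real values)
prefix=C:/deps/gettext
libdir=${prefix}/lib
includedir=${prefix}/include

Name: libintl
Description: GNU gettext runtime library
Version: 0.22
Libs: -L${libdir} -lintl
Cflags: -I${includedir}
```

Dropping such a file into a directory on PKG_CONFIG_PATH is enough for pkg-config-based dependency detection to find the library without extra lib/include dirs.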
Re: Inconsistent Parsing of Offsets with Seconds
"David E. Wheeler" writes:
> The treatment of timestamptz (and timetz) values with offsets that include
> seconds seems a bit inconsistent.

It's hard to get excited about this. Per the IANA TZ data, nowhere in the world has used fractional-minute UT offsets since 1972:

# In 1972 Liberia was the last country to switch from a UT offset
# that was not a multiple of 15 or 20 minutes.

and they were twenty years later than the next-to-last place (although IANA will steadfastly deny reliability for their TZ data before 1970). So timestamps like this simply don't exist in the wild.

> The corresponding jsonpath methods don’t like offsets with seconds *at all*:

Perhaps that should be fixed, but it's pretty low-priority IMO. I doubt there is any standard saying that JSON timestamps need to be able to include that.

> I see from the source[1] that offsets between plus or minus 15:59:59
> are allowed; should the `OF` and `TZ` formats be able to parse them?

I'd vote no. to_date/to_char already have enough trouble with format strings being squishier than one might expect.

regards, tom lane
Re: Meson far from ready on Windows
Andres Freund:
> FWIW, dynamic linking has a noticeable overhead on other platforms too. A
> non-dependencies-enabled postgres can do about 2x the connections-per-second
> than a fully kitted out postgres can (basically due to more memory mapping
> metadata being copied). But on windows the overhead is larger because so much
> more happens for every new connection, including loading all dlls from
> scratch.
>
> I suspect linking a few libraries statically would be quite worth it on
> windows. On other platforms it'd be quite inadvisable to statically link
> libraries, due to security updates, [...]

That's not necessarily true. The nix package manager and thus NixOS track all dependencies for a piece of software. If any of the dependencies are updated, all dependents are rebuilt, too. So the security concern doesn't apply here. There is a "static overlay", which builds everything linked fully statically. Unfortunately, PostgreSQL doesn't build in that, so far.

Lately, I have been looking into building at least libpq in that static overlay, via Meson. There are two related config options:

-Ddefault_library=shared|static|both
-Dprefer_static

The first controls which libraries (libpq, ...) to build ourselves. The second controls linking, IIUC also against external dependencies.

Maybe it would be a first step to support -Dprefer_static? Then this could be set on Windows.

Best,

Wolfgang
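Concretely, a build along those lines might be configured as below. This is only a sketch; the two `-D` options are the Meson built-ins named above (prefer_static requires Meson >= 0.63), while the build directory name and prefix are arbitrary:

```
meson setup build --prefix=/usr/local/pgsql \
    -Ddefault_library=static \
    -Dprefer_static=true
meson compile -C build
```

With prefer_static enabled, Meson's dependency lookups try static external libraries before shared ones, which is the half of the problem that matters for linking against the system's dependencies.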
Re: Pgoutput not capturing the generated columns
On Thu, Jun 20, 2024 at 9:03 AM Peter Smith wrote:
>
> Hi, here are my review comments for v8-0001.
>
> ==
> Commit message.
>
> 1.
> It seems like the patch name was accidentally omitted, so it became a
> mess when it defaulted to the 1st paragraph of the commit message.
>
> ==
> contrib/test_decoding/test_decoding.c
>
> 2.
> + data->include_generated_columns = true;
>
> I previously posted a comment [1, #4] that this should default to
> false; IMO it is unintuitive for the test_decoding to have an
> *opposite* default behaviour compared to CREATE SUBSCRIPTION.
>
> ==
> doc/src/sgml/ref/create_subscription.sgml
>
> NITPICK - remove the inconsistent blank line in SGML
>
> ==
> src/backend/commands/subscriptioncmds.c
>
> 3.
> +#define SUBOPT_include_generated_columns 0x0001
>
> I previously posted a comment [1, #6] that this should be UPPERCASE,
> but it is not yet fixed.
>
> ==
> src/bin/psql/describe.c
>
> NITPICK - move and reword the bogus comment
>
> ~
>
> 4.
> + if (pset.sversion >= 17)
> + appendPQExpBuffer(&buf,
> + ", subincludegencols AS \"%s\"\n",
> + gettext_noop("include_generated_columns"));
>
> 4a.
> For consistency with every other parameter, that column title should
> be written in words "Include generated columns" (not
> "include_generated_columns").
>
> ~
>
> 4b.
> IMO this new column belongs with the other subscription parameter
> columns (e.g. put it ahead of the "Conninfo" column).
>
> ==
> src/test/subscription/t/011_generated.pl
>
> NITPICK - fixed a comment
>
> 5.
> IMO, it would be better for readability if all the matching CREATE
> TABLE for publisher and subscriber are kept together, instead of the
> current code which is creating all publisher tables and then creating
> all subscriber tables.
>
> ~~~
>
> 6.
> +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab2");
> +is( $result, qq(4|8
> +5|10), 'confirm generated columns ARE replicated when the
> subscriber-side column is not generated');
> +
> ...
> +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab3");
> +is( $result, qq(4|24
> +5|25), 'confirm generated columns are NOT replicated when the
> subscriber-side column is also generated');
> +
>
> 6a.
> These SELECTs all need ORDER BY to protect against the SELECT *
> returning rows in some unexpected order.
>
> ~
>
> 6b.
> IMO there should be more comments here to explain how you can tell the
> column was NOT replicated. E.g. it is because the result value of 'b'
> is the subscriber-side computed value (which you made deliberately
> different to the publisher-side computed value).
>
> ==
>
> 99.
> Please also refer to the attached nitpicks top-up patch for minor
> cosmetic stuff.

All the comments are handled. The attached patch contains all the suggested changes.

Thanks and Regards,
Shubham Khanna.

v9-0001-Currently-generated-column-values-are-not-replica.patch
Description: Binary data
Re: Pgoutput not capturing the generated columns
On Fri, Jun 21, 2024 at 8:23 AM Peter Smith wrote:
>
> Hi Shubham, here are some more patch v8-0001 comments that I missed yesterday.
>
> ==
> src/test/subscription/t/011_generated.pl
>
> 1.
> Are the PRIMARY KEY qualifiers needed for the new tab2, tab3 tables? I
> don't think so.
>
> ~~~
>
> 2.
> +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab2");
> +is( $result, qq(4|8
> +5|10), 'confirm generated columns ARE replicated when the
> subscriber-side column is not generated');
> +
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab3 VALUES (4), (5)");
> +
> +$node_publisher->wait_for_catchup('sub3');
> +
> +$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab3");
> +is( $result, qq(4|24
> +5|25), 'confirm generated columns are NOT replicated when the
> subscriber-side column is also generated');
> +
>
> It would be prudent to do explicit "SELECT a,b" instead of "SELECT *",
> for readability and to avoid any surprises.

Both comments are handled. Patch v9-0001 contains all the changes required. See [1] for the changes added.

[1] https://www.postgresql.org/message-id/CAHv8Rj%2B6kwOGmn5MsRaTmciJDxtvNsyszMoPXV62OGPGzkxrCg%40mail.gmail.com

Thanks and Regards,
Shubham Khanna.
Re: Inconsistent Parsing of Offsets with Seconds
On Jun 22, 2024, at 13:15, Tom Lane wrote:
> It's hard to get excited about this.

I freely admit I’m getting into the weeds here. :-)

>> The corresponding jsonpath methods don’t like offsets with seconds *at all*:
>
> Perhaps that should be fixed, but it's pretty low-priority IMO.
> I doubt there is any standard saying that JSON timestamps need
> to be able to include that.
>
>> I see from the source[1] that offsets between plus or minus 15:59:59
>> are allowed; should the `OF` and `TZ` formats be able to parse them?
>
> I'd vote no. to_date/to_char already have enough trouble with format
> strings being squishier than one might expect.

I believe the former issue is caused by the latter: The jsonpath implementation uses the formatting strings to parse the timestamps[1], and since no format supports offsets with seconds, such offsets don’t work at all in JSON timestamp parsing.

[1]: https://github.com/postgres/postgres/blob/70a845c/src/backend/utils/adt/jsonpath_exec.c#L2420-L2442

So if we were to fix the parsing of offsets in jsonpath, we’d either have to change the parsing code there or augment the to_timestamp() formats and use them. Totally agree it’s not a priority; happy to just pretend offsets with seconds don’t exist in any practical sense.

Best,

David
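The try-each-format approach in jsonpath_exec.c is analogous to looping over strptime() formats in Python — which, incidentally, does accept sub-minute offsets in %z. A loose sketch of the pattern, not a model of the actual C code (the format list here is a made-up stand-in):

```python
from datetime import datetime

# Hypothetical format list standing in for the table in jsonpath_exec.c.
FORMATS = ["%Y-%m-%d %H:%M:%S%z", "%Y-%m-%d %H:%M:%S"]

def parse_datetime(s):
    # Try each format until one matches; raise if none does.
    for fmt in FORMATS:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    raise ValueError(f"datetime format is not recognized: {s!r}")

ts = parse_datetime("2024-06-03 12:35:00+02:30:15")
print(ts.utcoffset())  # 2:30:15
```

Since Python 3.7, %z matches ±HH:MM[:SS], so the offset seconds survive here — exactly the capability the `OF`/`TZH`/`TZM` patterns lack.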
Re: allow changing autovacuum_max_workers without restarting
On Fri, Jun 21, 2024 at 03:44:07PM -0500, Nathan Bossart wrote: > I'm still not sure about this approach. At the moment, I'm leaning towards > something more like v2 [2] where the upper limit is a PGC_POSTMASTER GUC > (that we would set very low for TAP tests). Like so. -- nathan >From c1c33c6c157a7cec81180714369b2978b09e402f Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Sat, 22 Jun 2024 15:05:44 -0500 Subject: [PATCH v6 1/1] allow changing autovacuum_max_workers without restarting --- doc/src/sgml/config.sgml | 28 +++- doc/src/sgml/runtime.sgml | 12 ++--- src/backend/access/transam/xlog.c | 2 +- src/backend/postmaster/autovacuum.c | 44 --- src/backend/postmaster/postmaster.c | 2 +- src/backend/storage/lmgr/proc.c | 6 +-- src/backend/utils/init/postinit.c | 12 ++--- src/backend/utils/misc/guc_tables.c | 13 +- src/backend/utils/misc/postgresql.conf.sample | 3 +- src/include/postmaster/autovacuum.h | 1 + src/include/utils/guc_hooks.h | 4 +- src/test/perl/PostgreSQL/Test/Cluster.pm | 1 + 12 files changed, 89 insertions(+), 39 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 0c7a9082c5..64411918a4 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -8544,6 +8544,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + autovacuum_worker_slots (integer) + + autovacuum_worker_slots configuration parameter + + + + +Specifies the number of backend slots to reserve for autovacuum worker +processes. The default is 16. This parameter can only be set at server +start. + + +When changing this value, consider also adjusting +. + + + + autovacuum_max_workers (integer) @@ -8554,7 +8573,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; Specifies the maximum number of autovacuum processes (other than the autovacuum launcher) that may be running at any one time. The default -is three. This parameter can only be set at server start. +is three. 
This parameter can only be set in the +postgresql.conf file or on the server command line. + + +Note that a setting for this value which is higher than + will have no effect, +since autovacuum workers are taken from the pool of slots established +by that setting. diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 2f7c618886..4bb37faffe 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such SEMMNI Maximum number of semaphore identifiers (i.e., sets) -at least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) plus room for other applications +at least ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16) plus room for other applications SEMMNS Maximum number of semaphores system-wide -ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17 plus room for other applications +ceil((max_connections + autovacuum_worker_slots + max_wal_senders + max_worker_processes + 7) / 16) * 17 plus room for other applications @@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such When using System V semaphores, PostgreSQL uses one semaphore per allowed connection (), allowed autovacuum worker process -(), allowed WAL sender process +(), allowed WAL sender process (), and allowed background process (), in sets of 16. Each such set will @@ -847,13 +847,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such other applications. 
The maximum number of semaphores in the system is set by SEMMNS, which consequently must be at least as high as max_connections plus -autovacuum_max_workers plus max_wal_senders, +autovacuum_worker_slots plus max_wal_senders, plus max_worker_processes, plus one extra for each 16 allowed connections plus workers (see the formula in ). The parameter SEMMNI determines the limit on the number of semaphore sets that can exist on the system at one time. Hence this parameter must be at -least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16). +least ceil((max_connections + autovacuum_worker_slots + max_wal
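The SEMMNI formula from the runtime.sgml hunk above is easy to sanity-check. A small sketch with hypothetical settings (the specific values are illustrative, not recommendations):

```python
import math

def semmni_floor(max_connections, autovacuum_worker_slots,
                 max_wal_senders, max_worker_processes):
    # ceil((max_connections + autovacuum_worker_slots + max_wal_senders
    #       + max_worker_processes + 7) / 16), per the docs change above;
    # semaphores are allocated in sets of 16.
    return math.ceil((max_connections + autovacuum_worker_slots
                      + max_wal_senders + max_worker_processes + 7) / 16)

# e.g. 100 connections, 16 autovacuum worker slots, 10 senders, 8 workers
print(semmni_floor(100, 16, 10, 8))  # 9
```

Note that with the patch, the term contributed by autovacuum is the (boot-time) autovacuum_worker_slots, not the now-reloadable autovacuum_max_workers.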
Re: Meson far from ready on Windows
Hi,

On June 22, 2024 7:32:01 PM GMT+02:00, walt...@technowledgy.de wrote:
>Andres Freund:
>> FWIW, dynamic linking has a noticeable overhead on other platforms too. A
>> non-dependencies-enabled postgres can do about 2x the connections-per-second
>> than a fully kitted out postgres can (basically due to more memory mapping
>> metadata being copied). But on windows the overhead is larger because so much
>> more happens for every new connection, including loading all dlls from
>> scratch.
>>
>> I suspect linking a few libraries statically would be quite worth it on
>> windows. On other platforms it'd be quite inadvisable to statically link
>> libraries, due to security updates, [...]
>That's not necessarily true. The nix package manager and thus NixOS track all
>dependencies for a piece of software. If any of the dependencies are updated,
>all dependents are rebuilt, too. So the security concern doesn't apply here.
>There is a "static overlay", which builds everything linked fully statically.

Right. There's definitely some scenario where it's ok, I was simplifying a bit.

> Unfortunately, PostgreSQL doesn't build in that, so far.

I've built mostly statically linked pg without much of a problem, what trouble did you encounter? Think there were some issues with linking Kerberos and openldap statically, but not on postgres' side.

Building the postgres backend without support for dynamic linking doesn't make sense though. Extensions are just too ingrained a part of pg.

>Lately, I have been looking into building at least libpq in that static
>overlay, via Meson. There are two related config options:
>-Ddefault_library=shared|static|both
>-Dprefer_static
>
>The first controls which libraries (libpq, ...) to build ourselves. The second
>controls linking, IIUC also against external dependencies.

Pg by default builds a static libpq on nearly all platforms (not aix I think and maybe not Windows when building with autoconf, not sure about the old msvc system) today?
>Maybe it would be a first step to support -Dprefer_static?

That should work for nearly all dependencies today. Except for libintl, I think. I found that there are a lot of buglets in static link dependencies of various libraries though.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: Wrong security context for deferred triggers?
On Mon, Jun 10, 2024 at 1:00 PM Laurenz Albe wrote:
> Like you, I was surprised by the current behavior. There is a design
> principle that PostgreSQL tries to follow, called the "Principle of
> least astonishment". Things should behave like a moderately skilled
> user would expect them to. In my opinion, the current behavior violates
> that principle. Tomas seems to agree with that point of view.

I worry that both approaches violate this principle in different ways. For example, consider the following sequence of events:

SET ROLE r1;
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
INSERT INTO ...;
SET ROLE r2;
SET search_path = '...';
COMMIT;

I think that it would be reasonable to expect that the triggers execute with r2 and not r1, since the triggers were explicitly deferred and the role was explicitly set. It would likely be surprising that the search path was updated for the trigger but not the role. With your proposed approach it would be impossible for someone to trigger a trigger with one role and execute it with another, if that's a desirable feature.

> I didn't find this strange behavior myself: it was one of our customers
> who uses security definer functions for data modifications and has
> problems with the current behavior, and I am trying to improve the
> situation on their behalf.

Would it be possible to share more details about this use case? For example, what are their current problems? Are they not able to set constraints to immediate? Or can they update the trigger function itself to be a security definer function? That might help illuminate why the current behavior is wrong.

> But I feel that the database user that runs the trigger should be the
> same user that ran the triggering SQL statement. Even though I cannot
> put my hand on a case where changing this user would constitute a real
> security problem, it feels wrong.
>
> I am aware that that is rather squishy argumentation, but I have no
> better one.
> Both my and Tomas' gut reaction seems to have been "the
> current behavior is wrong".

I understand the gut reaction, and I even have the same gut reaction, but since we would be treating roles exceptionally compared to the rest of the execution context, I would feel better if we had a more concrete reason.

I also took a look at the code. It doesn't apply cleanly to master, so I took the liberty of rebasing and attaching it.

> + /*
> + * The role could have been dropped since the trigger was queued.
> + * In that case, give up and error out.
> + */
> + pfree(GetUserNameFromId(evtshared->ats_rolid, false));

It feels a bit wasteful to allocate and copy the role name when we never actually use it. Is it possible to check that the role exists without copying the name?

Everything else looked good, and the code does what it says it will.

Thanks,
Joe Koshakow

From f5de4ea29d0f78549618c23db5951120218af203 Mon Sep 17 00:00:00 2001
From: Laurenz Albe
Date: Wed, 6 Mar 2024 14:09:43 +0100
Subject: [PATCH] Make AFTER triggers run with the correct user

With deferred triggers, it is possible that the current role changes between the time when the trigger is queued and the time it is executed (for example, the triggering data modification could have been executed in a SECURITY DEFINER function).

Up to now, deferred trigger functions would run with the current role set to whatever was active at commit time. That does not matter for regular constraints, whose correctness doesn't depend on the current role. But for user-written constraint triggers, the current role certainly matters.

Security considerations:

- The trigger function could be modified between the time the trigger is queued and the time it runs. If the trigger was executed by a privileged user, the new behavior could be used for privilege escalation.
But if a privileged user executes DML on a table owned by an untrusted user, all bets are off anyway --- the malicious code could as well be in the trigger function from the beginning. So we don't consider this a security hazard.

- The previous behavior could lead to code inadvertently running with elevated privileges if a privileged user temporarily assumes lower privileges while executing DML on an untrusted table, but the deferred trigger runs with the user's original privileges. However, that only applies if the privileged user commits *after* resuming the original role.

Should this be backpatched as a security bug?

Author: Laurenz Albe
Discussion: https://postgr.es/m/77ee784cf248e842f74588418f55c2931e47bd78.camel%40cybertec.at
---
 src/backend/commands/trigger.c | 23
 src/test/regress/expected/triggers.out | 81 ++
 src/test/regress/sql/triggers.sql | 75
 3 files changed, 179 insertions(+)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 58b7fc5bbd..69d583751a 100644
--- a/s
Re: Wrong security context for deferred triggers?
On Sat, Jun 8, 2024 at 2:37 PM Joseph Koshakow wrote:
> Something like
> `SECURITY INVOKER | SECURITY TRIGGERER` (modeled after the modifiers in
> `CREATE FUNCTION`) that control which role is used.

I'm inclined toward this option (except invoker and triggerer are the same entity; we need owner|definer). I'm having trouble accepting changing the existing behavior here, but agree that it would be worth having a mode whereby the owner of the trigger/table executes the trigger function in an initially clean environment (server/database defaults; the owner role isn't considered as having logged in, so their personalized configurations do not take effect) (maybe add a SET clause to CREATE TRIGGER too). Security invoker would be the default, retaining current behavior for upgrade/dump+restore. Security definer on the function would take precedence, as would its SET clause.

David J.
Re: Wrong security context for deferred triggers?
On Sat, Jun 22, 2024 at 6:23 PM David G. Johnston <david.g.johns...@gmail.com> wrote:

> except invoker and triggerer are the same entity

Maybe "executor" would have been a better term than "invoker". In this specific example they are not the same entity. The trigger is triggered and queued by one role and executed by a different role, hence the confusion. Though I agree with Laurenz, special SQL syntax for this exotic corner case is a little too much.

> Security definer on the function would take precedence as would its set clause.

These trigger options seem a bit redundant with the equivalent options on the function that is executed by the trigger. What would be the advantages or differences of setting these options on the trigger versus the function?

Thanks,
Joe Koshakow
Re: Wrong security context for deferred triggers?
On Sat, Jun 22, 2024 at 7:21 PM Joseph Koshakow wrote:
> On Sat, Jun 22, 2024 at 6:23 PM David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
> > except invoker and triggerer are the same entity
>
> Maybe "executor" would have been a better term than "invoker". In this
> specific example they are not the same entity. The trigger is
> triggered and queued by one role and executed by a different role,
> hence the confusion.

No matter what we label the keyword, it would represent the default and existing behavior whereby the environment at trigger resolution time, not trigger enqueue time, is used. I suppose there is an argument for capturing and reapplying the trigger enqueue time environment and giving that a keyword as well. But fewer options has value, and security definer seems like the strictly superior option.

> Though I agree with Laurenz, special SQL syntax
> for this exotic corner case is a little too much.

It doesn't seem like a corner case if we want to institute a new recommended practice that all triggers should be created with security definer. We seem to be discussing that without giving the dba a choice in the matter - but IMO we do want to provide the choice and leave the default.

> > Security definer on the function would take precedence as would its set
> > clause.
>
> These trigger options seem a bit redundant with the equivalent options
> on the function that is executed by the trigger. What would be the
> advantages or differences of setting these options on the trigger
> versus the function?

At least security definer needs to take precedence, as the function owner is fully expecting their role to be the one executing the function, not whomever the trigger owner might be. If putting a SET clause on the trigger is a thing, then the same thing goes - the function author, if they also did that, expects their settings to be in place.
Whether it really makes sense to have the trigger owner set configuration when they attach the function is arguable, but it is also the most flexible option.

David J.
Unable parse a comment in gram.y
I was unable to parse a comment in src/backend/parser/gram.y around line 11364:

/*
 * As func_expr but does not accept WINDOW functions directly (they
 * can still be contained in arguments for functions etc.)
 * Use this when window expressions are not allowed, so to disambiguate
 * the grammar. (e.g. in CREATE INDEX)
 */

Maybe "but" is unnecessary in the first sentence in the comment?

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Re: Unable parse a comment in gram.y
On Sat, Jun 22, 2024 at 9:02 PM Tatsuo Ishii wrote:

> I was unable to parse a comment in src/backend/parser/gram.y around line
> 11364:
>
> /*
> * As func_expr but does not accept WINDOW functions directly (they
> * can still be contained in arguments for functions etc.)
> * Use this when window expressions are not allowed, so to disambiguate
> * the grammar. (e.g. in CREATE INDEX)
> */
>
> Maybe "but" is unnecessary in the first sentence in the comment?

The "but" is required; add a comma before it. It could also be written a bit more verbosely:

func_expr_windowless is the same as func_expr aside from not accepting window functions directly ...

David J.
Re: Unable parse a comment in gram.y
> On Sat, Jun 22, 2024 at 9:02 PM Tatsuo Ishii wrote:
>
>> I was unable to parse a comment in src/backend/parser/gram.y around line
>> 11364:
>>
>> /*
>> * As func_expr but does not accept WINDOW functions directly (they
>> * can still be contained in arguments for functions etc.)
>> * Use this when window expressions are not allowed, so to disambiguate
>> * the grammar. (e.g. in CREATE INDEX)
>> */
>>
>> Maybe "but" is unnecessary in the first sentence in the comment?
>
> The "but" is required, add a comma before it. It could also be written a
> bit more verbosely:
>
> func_expr_windowless is the same as func_expr aside from not accepting
> window functions directly ...

Oh, I see.
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Re: Unable parse a comment in gram.y
"David G. Johnston" writes:
> On Sat, Jun 22, 2024 at 9:02 PM Tatsuo Ishii wrote:
>> I was unable to parse a comment in src/backend/parser/gram.y around line
>> 11364:
>>
>> * As func_expr but does not accept WINDOW functions directly (they
>> * can still be contained in arguments for functions etc.)

> The "but" is required, add a comma before it. It could also be written a
> bit more verbosely:

Perhaps s/As func_expr/Like func_expr/ would be less confusing?

regards, tom lane
RE: Pgoutput not capturing the generated columns
Hi Shubham,

Thanks for sharing the new patch! You shared it as v9, but it should be v10, right? Also, since there was no commitfest entry, I registered one [1]. You can rename the title based on your needs. Currently CFbot says OK. Anyway, below are my comments.

01. General

Your patch contains unnecessary changes. Please remove all of them. E.g.,

```
 " s.subpublications,\n");
-
```

And

```
 appendPQExpBufferStr(query,
 " o.remote_lsn AS suboriginremotelsn,\n"
-" s.subenabled,\n");
+ " s.subenabled,\n");
```

02. General

Again, please run pgindent/pgperltidy.

03. test_decoding

Previously I suggested that the default value of include_generated_columns should be true [2], so you modified it like that. However, Peter suggested the opposite [3] and you just revised accordingly. I think either way might be okay, but at least you must clarify why you now prefer the default to be false and have changed it accordingly.

04. decoding_into_rel.sql

According to the comment atop this file, this test should insert results into a table. But the added case does not - we should put it elsewhere, i.e., create another file.

05. decoding_into_rel.sql

```
+-- when 'include-generated-columns' is not set
```

Can you clarify the expected behavior as a comment?

06. getSubscriptions

```
+ else
+ appendPQExpBufferStr(query,
+ " false AS subincludegencols,\n");
```

I think the comma is not needed. Also, this error means that you did not test pg_dump against instances prior to PG16. Please verify whether we can dump subscriptions and restore them accordingly.

[1]: https://commitfest.postgresql.org/48/5068/
[2]: https://www.postgresql.org/message-id/OSBPR01MB25529997E012DEABA8E15A02F5E52%40OSBPR01MB2552.jpnprd01.prod.outlook.com
[3]: https://www.postgresql.org/message-id/CAHut%2BPujrRQ63ju8P41tBkdjkQb4X9uEdLK_Wkauxum1MVUdfA%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/
Re: Unable parse a comment in gram.y
>>> * As func_expr but does not accept WINDOW functions directly (they
>>> * can still be contained in arguments for functions etc.)
>
>> The "but" is required, add a comma before it. It could also be written a
>> bit more verbosely:
>
> Perhaps s/As func_expr/Like func_expr/ would be less confusing?

+1. It's easier to understand at least for me.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Re: Unable parse a comment in gram.y
On Saturday, June 22, 2024, Tatsuo Ishii wrote:

> >>> * As func_expr but does not accept WINDOW functions directly (they
> >>> * can still be contained in arguments for functions etc.)
> >
> >> The "but" is required, add a comma before it. It could also be written a
> >> bit more verbosely:
> >
> > Perhaps s/As func_expr/Like func_expr/ would be less confusing?
>
> +1. It's easier to understand at least for me.

+1

David J.
Re: speed up a logical replica setup
On Mon, Mar 25, 2024 at 12:55:39PM +0100, Peter Eisentraut wrote:
> I have committed your version v33.

> commit d44032d
> --- /dev/null
> +++ b/src/bin/pg_basebackup/pg_createsubscriber.c

> +static char *
> +concat_conninfo_dbname(const char *conninfo, const char *dbname)
> +{
> +	PQExpBuffer buf = createPQExpBuffer();
> +	char *ret;
> +
> +	Assert(conninfo != NULL);
> +
> +	appendPQExpBufferStr(buf, conninfo);
> +	appendPQExpBuffer(buf, " dbname=%s", dbname);

pg_createsubscriber fails on a dbname containing a space. Use appendConnStrVal() here and for the other params in get_sub_conninfo(). See the CVE-2016-5424 commits for more background. For one way to test this scenario, see generate_db() in the pg_upgrade test suite.

> +static char *
> +create_logical_replication_slot(PGconn *conn, struct LogicalRepInfo *dbinfo)
> +{
> +	PQExpBuffer str = createPQExpBuffer();
> +	PGresult *res = NULL;
> +	const char *slot_name = dbinfo->replslotname;
> +	char *slot_name_esc;
> +	char *lsn = NULL;
> +
> +	Assert(conn != NULL);
> +
> +	pg_log_info("creating the replication slot \"%s\" on database \"%s\"",
> +				slot_name, dbinfo->dbname);
> +
> +	slot_name_esc = PQescapeLiteral(conn, slot_name, strlen(slot_name));
> +
> +	appendPQExpBuffer(str,
> +					  "SELECT lsn FROM pg_catalog.pg_create_logical_replication_slot(%s, 'pgoutput', false, false, false)",

This is passing twophase=false, but the patch does not mention prepared transactions. Is the intent to not support workloads containing prepared transactions? If so, the documentation should say that, and the tool likely should warn on startup if max_prepared_transactions != 0.

> +static void
> +create_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
> +	appendPQExpBuffer(str, "CREATE PUBLICATION %s FOR ALL TABLES",
> +					  ipubname_esc);

This tool's documentation says it "guarantees that no transaction will be lost."
I tried to determine whether achieving that will require something like the fix from https://postgr.es/m/flat/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com. (Not exactly the fix from that thread, since that thread has not discussed the FOR ALL TABLES version of its race condition.) I don't know. On the one hand, pg_createsubscriber benefits from creating a logical slot after creating the publication. That snapbuild.c process will wait for running XIDs. On the other hand, an INSERT/UPDATE/DELETE acquires its RowExclusiveLock and builds its relcache entry before assigning an XID, so perhaps the snapbuild.c process isn't enough to prevent that thread's race condition. What do you think?
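On the dbname-with-a-space point above: it comes down to conninfo value quoting. As a rough illustration of the rule libpq expects (this is a hypothetical helper, not the actual appendConnStrVal() code): single-quote any value containing spaces, and backslash-escape embedded quotes and backslashes.

```shell
# Hedged sketch of conninfo value quoting; quote_conninfo_val is a
# stand-in for what appendConnStrVal() does on the C side.
quote_conninfo_val() {
  # Escape backslashes first, then single quotes, then wrap in quotes.
  printf "'%s'" "$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e "s/'/\\\\'/g")"
}

dbname="my db"
conninfo="host=localhost dbname=$(quote_conninfo_val "$dbname")"
echo "$conninfo"     # host=localhost dbname='my db'
```

Without the quoting, `dbname=my db` parses as a `dbname` of `my` followed by a stray token, which is exactly the failure mode described for pg_createsubscriber.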
Re: Add pg_get_acl() function get the ACL for a database object
On Sat, Jun 22, 2024, at 11:44, Joel Jacobson wrote:
> * v5-0001-Add-pg_get_acl.patch
> * v2-0002-Add-pg_get_acl-overloads.patch

Renamed the files to ensure the cfbot applies them in order; both need to have the same version prefix.

v6-0001-Add-pg_get_acl.patch
Description: Binary data

v6-0002-Add-pg_get_acl-overloads.patch
Description: Binary data
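For what it's worth, the rename is easy to script; this is just an illustrative sketch (the `touch` lines stand in for the real attachments):

```shell
# Simulate the two attachments from the earlier message.
touch v5-0001-Add-pg_get_acl.patch v2-0002-Add-pg_get_acl-overloads.patch

# Give both patches the same version prefix (v6) so the cfbot
# applies them in lexical order.
for f in v5-0001-*.patch v2-0002-*.patch; do
  mv "$f" "v6-${f#*-}"
done
ls v6-*.patch
```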