Re: Wrong security context for deferred triggers?
On Sat, 8 Jun 2024 at 17:37, Joseph Koshakow wrote:

> However, I do worry that this is too much of a breaking change and
> fundamentally changes how triggers work. Another draw back is that if
> the trigger owner loses the required privileges, then no one can modify
> to the table.

I worry about breaking changes too. The more I think about it, though, the
more I think the existing behaviour doesn’t make sense.

Speaking as a table owner, when I set a trigger on it, I expect that when the
specified actions occur my trigger will fire and will do what I specify,
without regard to the execution environment of the caller (search_path in
particular); and my trigger should be able to do anything that I can do. For
the canonical case of a logging table the trigger has to be able to do stuff
the caller can't do. I don't expect to be able to do stuff that the caller
can do.

Speaking as someone making an update on a table, I don't expect to have it
fail because my execution environment (search_path in particular) is wrong
for the trigger implementation, and I consider it a security violation if the
table owner is able to do stuff as me as a result, especially if I am an
administrator making an update as superuser.

In effect, I want the action which fires the trigger to be like a system
call, and the trigger, plus the underlying action, to be like what the OS
does in response to the system call.

I'm not sure how to evaluate what problems with existing implementations
would be caused by changing what role executes triggers, but I think it's
pretty clear the existing behaviour is the wrong choice in every other way
than backward compatibility. I welcome examples to the contrary, where the
existing behaviour is not just OK but actually wanted.
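To make the canonical logging case concrete, here is a minimal sketch (all
names invented for illustration) of the pattern I have in mind. Today it only
works if the trigger function is marked SECURITY DEFINER, or if writers are
granted INSERT on the log table directly, which defeats the purpose; if
triggers ran as the trigger owner, no special marking would be needed:

-- Owned by the table owner; ordinary writers cannot touch audit_log directly.
CREATE TABLE audit_log (who text, at timestamptz, op text);
CREATE TABLE data (i integer);

CREATE FUNCTION log_change() RETURNS trigger
LANGUAGE plpgsql SECURITY DEFINER SET search_path = public AS
$$BEGIN
    -- Record who caused the change and what kind of change it was.
    INSERT INTO audit_log VALUES (current_user, now(), TG_OP);
    RETURN NEW;
END;$$;

CREATE TRIGGER data_log AFTER INSERT OR UPDATE OR DELETE ON data
    FOR EACH ROW EXECUTE FUNCTION log_change();

REVOKE ALL ON audit_log FROM PUBLIC;
GRANT INSERT, UPDATE, DELETE ON data TO PUBLIC;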
Re: Wrong security context for deferred triggers?
On Sat, Jun 8, 2024 at 5:36 PM Joseph Koshakow wrote:

> Additionally, I applied your patch to master and re-ran the example and
> didn't notice any behavior change.
>
> test=# CREATE TABLE tab (i integer);
> CREATE TABLE
> test=# CREATE FUNCTION trig() RETURNS trigger
>        LANGUAGE plpgsql AS
>        $$BEGIN
>           RAISE NOTICE 'current_user = %', current_user;
>           RETURN NEW;
>        END;$$;
> CREATE FUNCTION
> test=# CREATE CONSTRAINT TRIGGER trig AFTER INSERT ON tab
>        DEFERRABLE INITIALLY IMMEDIATE
>        FOR EACH ROW EXECUTE FUNCTION trig();
> CREATE TRIGGER
> test=# CREATE ROLE duff;
> CREATE ROLE
> test=# GRANT INSERT ON tab TO duff;
> GRANT
> test=# SET ROLE duff;
> SET
> test=> BEGIN;
> BEGIN
> test=*> INSERT INTO tab VALUES (1);
> NOTICE: current_user = duff
> INSERT 0 1
> test=*> SET CONSTRAINTS ALL DEFERRED;
> SET CONSTRAINTS
> test=*> INSERT INTO tab VALUES (2);
> INSERT 0 1
> test=*> RESET ROLE;
> RESET
> test=*# COMMIT;
> NOTICE: current_user = joe
> COMMIT
>
> Though maybe I'm just doing something wrong.

Sorry, there's definitely something wrong with my environment. You can ignore
this.

Thanks,
Joe Koshakow
Shouldn't jsonpath .string() Unwrap?
Hackers,

Most of the jsonpath methods auto-unwrap in lax mode:

david=# select jsonb_path_query('[-2,5]', '$.abs()');
 jsonb_path_query
------------------
 2
 5
(2 rows)

The obvious exceptions are size() and type(), which apply directly to arrays,
so no need to unwrap:

david=# select jsonb_path_query('[-2,5]', '$.size()');
 jsonb_path_query
------------------
 2
(1 row)

david=# select jsonb_path_query('[-2,5]', '$.type()');
 jsonb_path_query
------------------
 "array"
(1 row)

But what about string()? Is there some reason it doesn’t unwrap?

david=# select jsonb_path_query('[-2,5]', '$.string()');
ERROR:  jsonpath item method .string() can only be applied to a bool, string, numeric, or datetime value

What I expect:

david=# select jsonb_path_query('[-2,5]', '$.string()');
 jsonb_path_query
------------------
 "2"
 "5"
(2 rows)

However, I do see a test[1] for this behavior, so maybe there’s a reason for
it?

Best,

David

[1]: https://github.com/postgres/postgres/blob/REL_17_BETA1/src/test/regress/expected/jsonb_jsonpath.out#L2527-L2530
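For what it's worth, explicit unwrapping with the [*] accessor ought to be a
workable stopgap. I haven't verified this output; it is just what I'd expect
lax mode to produce:

david=# select jsonb_path_query('[-2,5]', '$[*].string()');
 jsonb_path_query
------------------
 "-2"
 "5"
(2 rows)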
Re: Windows: openssl & gssapi dislike each other
Andrew Dunstan writes:
> On 2024-06-08 Sa 06:22, Imran Zaheer wrote:
>> Now this can either be solved by just undefining the macro defined
>> by wincrypt.h as done here [3]
>> Or we should rearrange our headers. Openssl header should be at the
>> bottom (after the gssapi includes).

> Let's be consistent and use the #undef from [3].

+1.  Depending on header order is not great, especially when you have to
make it depend on an order that is directly contradictory to project
conventions [0].

			regards, tom lane

[0] https://wiki.postgresql.org/wiki/Committing_checklist#Policies
Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
On Fri, Jun 7, 2024 at 8:05 PM Alvaro Herrera wrote:
> In passing, I noticed that WaitReadBuffers has zero comments, which
> seems an insufficient number of them.

Ack. Here is a patch for that.

I guess I hadn't put a comment there because it's hard to write anything
without sort of duplicating what is already said by the StartReadBuffers()
comment and doubling down on descriptions of future plans... but, in for a
penny, in for a pound as they say.

From db5e56825e4b4c595885373c413011c50fedd3e8 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Sun, 9 Jun 2024 09:51:02 +1200
Subject: [PATCH] Add comment for WaitReadBuffers().

Per complaint from Alvaro while reviewing something else.

Reported-by: Alvaro Herrera
Discussion: https://postgr.es/m/202406070805.icofqromovvn%40alvherre.pgsql
---
 src/backend/storage/buffer/bufmgr.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index a8e3377263f..409210a6ed2 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1371,6 +1371,17 @@ WaitReadBuffersCanStartIO(Buffer buffer, bool nowait)
 	return StartBufferIO(GetBufferDescriptor(buffer - 1), true, nowait);
 }
 
+/*
+ * The second step of a StartReadBuffers(), WaitReadBuffers() sequence.  It is
+ * only necessary to call this if StartReadBuffers() returned true, indicating
+ * that I/O was necessary.
+ *
+ * Currently, this function performs synchronous I/O system calls.  The earlier
+ * StartReadBuffers() call might have issued advice to give the OS a head
+ * start so that it completes quickly.  In future work, a true asynchronous I/O
+ * could be started by StartReadBuffers(), and this function would wait for it
+ * to complete.
+ */
 void
 WaitReadBuffers(ReadBuffersOperation *operation)
 {
-- 
2.45.1
Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
New version. Same code as v2, but comments improved to explain the reasoning,
with reference to README's buffer access rules. I'm planning to push this soon
if there are no objections.

From 1fa26f407622cd69d82f3b4ea68fddf2c357cf06 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Fri, 7 Jun 2024 17:49:19 +1200
Subject: [PATCH v3] Fix RBM_ZERO_AND_LOCK.

Commit 210622c6 accidentally zeroed out pages even if they were found in the
buffer pool.  It should always lock the page, but it should only zero pages
that were not already found as an optimization to avoid I/O.  Otherwise,
concurrent readers that hold only a pin might see corrupted page contents
changing under their feet.

This is done with the standard BM_IO_IN_PROGRESS infrastructure to test for
BM_VALID and wake concurrent waiters without races (as it was in earlier
releases).

Reported-by: Noah Misch
Reported-by: Alexander Lakhin
Reviewed-by: Alvaro Herrera (earlier version)
Reviewed-by: Robert Haas (earlier version)
Discussion: https://postgr.es/m/20240512171658.7e.nmi...@google.com
Discussion: https://postgr.es/m/7ed10231-ce47-03d5-d3f9-4aea0dc7d5a4%40gmail.com
---
 src/backend/storage/buffer/bufmgr.c | 64 ++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 19 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 49637284f91..a8e3377263f 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1010,43 +1010,69 @@ ExtendBufferedRelTo(BufferManagerRelation bmr,
 }
 
 /*
- * Zero a buffer and lock it, as part of the implementation of
+ * Lock and optionally zero a buffer, as part of the implementation of
  * RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK.  The buffer must be already
- * pinned.  It does not have to be valid, but it is valid and locked on
- * return.
+ * pinned.  If the buffer is not already valid, it is zeroed and made valid.
  */
 static void
-ZeroBuffer(Buffer buffer, ReadBufferMode mode)
+ZeroAndLockBuffer(Buffer buffer, ReadBufferMode mode, bool already_valid)
 {
 	BufferDesc *bufHdr;
-	uint32		buf_state;
+	bool		need_to_zero;
 
 	Assert(mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK);
 
-	if (BufferIsLocal(buffer))
+	if (already_valid)
+	{
+		/*
+		 * If the caller already knew the buffer was valid, we can skip some
+		 * header interaction.  The caller just wants to lock the buffer.
+		 */
+		need_to_zero = false;
+	}
+	else if (BufferIsLocal(buffer))
+	{
+		/* Simple case for non-shared buffers. */
 		bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+		need_to_zero = (pg_atomic_read_u32(&bufHdr->state) & BM_VALID) == 0;
+	}
 	else
 	{
+		/*
+		 * Take BM_IO_IN_PROGRESS, or discover that BM_VALID has been set
+		 * concurrently.  Even though we aren't doing I/O, that protocol
+		 * ensures that we don't zero a page that someone else has pinned.  An
+		 * exclusive content lock wouldn't be enough, because readers are
+		 * allowed to drop the content lock after determining that a tuple is
+		 * visible (see buffer access rules in README).
+		 */
 		bufHdr = GetBufferDescriptor(buffer - 1);
+		need_to_zero = StartBufferIO(bufHdr, true, false);
+	}
+
+	if (need_to_zero)
+		memset(BufferGetPage(buffer), 0, BLCKSZ);
+
+	/* Acquire the requested lock before setting BM_VALID. */
+	if (!BufferIsLocal(buffer))
+	{
 		if (mode == RBM_ZERO_AND_LOCK)
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 		else
 			LockBufferForCleanup(buffer);
 	}
 
-	memset(BufferGetPage(buffer), 0, BLCKSZ);
-
-	if (BufferIsLocal(buffer))
-	{
-		buf_state = pg_atomic_read_u32(&bufHdr->state);
-		buf_state |= BM_VALID;
-		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
-	}
-	else
+	/*
+	 * Set BM_VALID, and wake anyone waiting for BM_IO_IN_PROGRESS to be
+	 * cleared.
+	 */
+	if (need_to_zero)
 	{
-		buf_state = LockBufHdr(bufHdr);
-		buf_state |= BM_VALID;
-		UnlockBufHdr(bufHdr, buf_state);
+		if (BufferIsLocal(buffer))
+			pg_atomic_write_u32(&bufHdr->state,
+								pg_atomic_read_u32(&bufHdr->state) | BM_VALID);
+		else
+			TerminateBufferIO(bufHdr, false, BM_VALID, true);
 	}
 }
 
@@ -1185,7 +1211,7 @@ ReadBuffer_common(Relation rel, SMgrRelation smgr, char smgr_persistence,
 		buffer = PinBufferForBlock(rel, smgr, smgr_persistence,
 								   forkNum, blockNum, strategy, &found);
-		ZeroBuffer(buffer, mode);
+		ZeroAndLockBuffer(buffer, mode, found);
 		return buffer;
 	}
-- 
2.45.1
Re: Wrong security context for deferred triggers?
Hi,

I see that this patch is marked as ready for review, so I thought I would
attempt to review it. This is my first review, so please take it with a grain
of salt.

> So a deferred constraint trigger does not run with the same security context
> as an immediate trigger.

It sounds like the crux of your argument is that the current behavior is that
triggers are executed with the role and security context of the session at
the time of execution. Instead, the trigger should be executed with the role
and security context of the session at the time of queuing (i.e. the same
context as the action that triggered the trigger). While I understand that
the current behavior can be surprising in some scenarios, it's not clear to
me why this behavior is wrong. It seems that the whole point of deferring a
trigger to commit time is that the context that the trigger is executed in is
different than the context that it was triggered in. Tables may have changed,
permissions may have changed, session configuration variables may have
changed, roles may have changed, etc. So why should the executing role be
treated differently and restored to the value at the time of triggering?
Perhaps you can expand on why you feel that the current behavior is wrong?

> This is somewhat nasty in combination with
> SECURITY DEFINER functions: if that function performs an operation, and that
> operation triggers a deferred trigger, that trigger will run in the wrong
> security context.
...
> The more serious concern is that the old code constitutes
> a narrow security hazard: a superuser could temporarily
> assume an unprivileged role to avoid risks while performing
> DML on a table controlled by an untrusted user, but for
> some reason resume being a superuser *before* COMMIT.
> Then a deferred trigger would inadvertently run with
> superuser privileges.

I find these examples to be surprising, but not necessarily wrong (as per my
comment above). If someone wants the triggers to be executed as the
triggering role, then they can run `SET CONSTRAINTS ALL IMMEDIATE`. If
deferring a trigger to commit time and executing it as the triggering role is
desirable, then maybe we should add a modifier to triggers that can control
this behavior. Something like `SECURITY INVOKER | SECURITY TRIGGERER`
(modeled after the modifiers in `CREATE FUNCTION`) that control which role is
used.

> This looks to me like another reason that triggers should run as the
> trigger owner. Which role owns the trigger won’t change as a result of
> constraints being deferred or not, or any role setting done during the
> transaction, including that relating to security definer functions.
>
> Right now triggers can’t do anything that those who can
> INSERT/UPDATE/DELETE (i.e., cause the trigger to fire) can’t do, which in
> particular breaks the almost canonical example of using triggers to log
> changes — I can’t do it without also allowing users to make spurious log
> entries.
>
> Also if I cause a trigger to fire, I’ve just given the trigger owner the
> opportunity to run arbitrary code as me.
>
>> I just realized one problem with running a deferred constraint trigger as
>> the triggering role: that role might have been dropped by the time the
>> trigger executes. But then we could still error out.
>
> This problem is also fixed by running triggers as their owners: there
> should be a dependency between an object and its owner. So the
> trigger-executing role can’t be dropped without dropping the trigger.
+1, this approach would remove all of the surprising/wrong behavior and in my
opinion is more obvious. I'd like to add some more reasons why this behavior
makes sense:

- The documentation [0] indicates that to create a trigger, the creating role
  must have the `EXECUTE` privilege on the trigger function. In fact this
  check is skipped for the role that triggers the trigger.

-- Create trig_owner role and function. Grant execute on function
-- to role.
test=# CREATE ROLE trig_owner;
CREATE ROLE
test=# GRANT CREATE ON SCHEMA public TO trig_owner;
GRANT
test=# CREATE OR REPLACE FUNCTION f() RETURNS trigger
       LANGUAGE plpgsql AS
       $$BEGIN
          RAISE NOTICE 'current_user = %', current_user;
          RETURN NEW;
       END;$$;
CREATE FUNCTION
test=# REVOKE EXECUTE ON FUNCTION f FROM PUBLIC;
REVOKE
test=# GRANT EXECUTE ON FUNCTION f TO trig_owner;
GRANT

-- Create the trigger as trig_owner.
test=# SET ROLE trig_owner;
SET
test=> CREATE TABLE t (a INT);
CREATE TABLE
test=> CREATE CONSTRAINT TRIGGER trig AFTER INSERT ON t
       DEFERRABLE INITIALLY IMMEDIATE
       FOR EACH ROW EXECUTE FUNCTION f();
CREATE TRIGGER

-- Trigger the trigger with a role that doesn't have execute
-- privileges on the trigger function and also call the function
-- directly. The trigger succeeds but the function call fails.
test=> RESET ROLE;
RESET
test=# CREATE ROLE r1;
Re: Windows: openssl & gssapi dislike each other
On 2024-06-08 Sa 06:22, Imran Zaheer wrote:

> I was able to reproduce the gssapi & openssl error on windows. I tried on
> PG16 with msvc build system and on PG17 with meson build system. The error
> was reproducible when enabling both openssl and gssapi from the
> configurations.
>
> Turns out that it was due to the conflicting macros. "be-secure-openssl.c"
> tries to prevent this conflict here [1]. But the error again appears when
> gssapi is enabled. The file "be-secure-openssl.c" fails to compile because
> it has a similar scenario as explained here [2]. The header libpq.h is
> indirectly including libpq-be.h which has a wrong order of including openssl
> headers. Header "gssapi.h" indirectly includes "wincrypt.h" and openssl
> header should be defined after the gssapi includes.
>
> Now this can either be solved by just undefining the macro defined by
> wincrypt.h as done here [3]
>
> ```
> #ifdef X509_NAME
> #undef X509_NAME
> #endif
> ```
>
> Or we should rearrange our headers. Openssl header should be at the bottom
> (after the gssapi includes). I am attaching the patch here in which I
> rearranged the openssl header in libpq-be.h
>
> [1]: https://github.com/postgres/postgres/blob/8ba34c698d19450ccae9a5aea59a6d0bc8b75c0e/src/backend/libpq/be-secure-openssl.c#L46
> [2]: https://github.com/openssl/openssl/issues/10307#issuecomment-964155382
> [3]: https://github.com/postgres/postgres/blob/00ac25a3c365004821e819653c3307acd3294818/contrib/sslinfo/sslinfo.c#L29

Let's be consistent and use the #undef from [3]. I did find the comment in
sslinfo.c slightly confusing until I understood that this was a #define
clashing with a typedef.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: The xversion-upgrade test fails to stop server
On 2024-06-08 Sa 10:00, Alexander Lakhin wrote:

> Hello,
>
> 30.05.2024 18:00, Alexander Lakhin wrote:
>> While reviewing recent buildfarm failures, I came across this one:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake=2024-05-23%2004%3A11%3A03
>> upgrade.crake/REL_16_STABLE/REL9_5_STABLE-ctl4.log
>> waiting for server to shut down... failed
>> pg_ctl: server does not shut down
>
> I've grepped through logs of the last 167
> xversion-upgrade-REL9_5_STABLE-REL_16_STABLE/*ctl4.log on crake and got
> the following results:
> waiting for server to shut down done
> [...]
> So maybe it would make sense to increase default PGCTLTIMEOUT for
> buildfarm clients, say, to 180 seconds?

Sure. For now I have added it to the config on crake, but we can make it a
default.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: cannot drop intarray extension
Michael Paquier writes:
> On Fri, Jun 07, 2024 at 11:32:14AM +0800, jian he wrote:
>> in deleteObjectsInList, under certain conditions trying to sort the to
>> be deleted object list by just using sort_object_addresses seems to work,
>> but it looks like a hack.
>> maybe the proper fix would be in findDependentObjects.

> I have not studied the patch in details, but this looks
> overcomplicated to me.

I dunno about overcomplicated, but it's fundamentally the wrong thing: it
won't do much except move the problem from this example to other example(s).
The difficulty is precisely that we cannot simply delete objects in reverse
OID order and expect that to be OK.  It appears to work in simple cases
because reverse OID order usually means deleting newest objects first, and
that usually means dropping depender objects before dependees --- but
dependencies added as a result of later ALTER commands may not be honored
correctly.  Not to mention that you can lose if an OID wraparound happened
during the sequence of object creations.

In the case at hand, the reason we're having trouble with g_intbig_options()
is that the sequence of extension scripts run by CREATE EXTENSION creates the
gist__intbig_ops opfamily first, and then creates g_intbig_options() and
attaches it to the opfamily later (in intarray--1.2--1.3.sql).  So
g_intbig_options() has a larger OID than the opclass that the index depends
on.  In DROP EXTENSION, the first level of findDependentObjects() finds all
the direct dependencies (members) of the extension, and then sorts them by
OID descending, concluding that g_intbig_options() can be dropped before the
opclass.  Subsequent recursive levels will find the index and recognize that
it must be dropped before the opclass --- but this fails to account for the
fact that we'd better drop the opclass before any of the functions it depends
on.  At some point along the line, we will come across the dependency that
says so; but we don't do anything in response, because if
findDependentObjects() sees that the current object is already in
targetObjects then it thinks it's done.

I think when I wrote this code I was assuming that the dependency-order
traversal performed by findDependentObjects() was sufficient to guarantee
producing a safe deletion order, but it's now obvious that that's not so.
At minimum, when findDependentObjects() finds that a dependency exists on an
object that's already in targetObjects, it'd need to do something about
moving that object to after the one it's working on.  But I doubt we can fix
it with just that, because that won't be enough to handle indirect
dependencies.

It looks to me that the only real fix will require performing a topological
sort, similar to what pg_dump does, to produce a safe deletion order that
honors all the direct and indirect dependencies found by
findDependentObjects().  An open question is whether we will need
dependency-loop-breaking logic, or whether the hackery done in
findDependentObjects() is sufficient to ensure that we can assume there are
no loops in the dependencies it chooses to output.  It might be a better idea
to get rid of that logic and instead have explicit loop-breaking like the way
pg_dump does it.  It's also tempting to wonder if we can share code for this
with pg_dump.  The TopoSort function alone is not that big, but if we have to
duplicate the loop-breaking logic it'd get annoying.

Anyway, this is a very long-standing problem and I don't think we should try
to solve it in a rush.

			regards, tom lane
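Just to illustrate the kind of ordering I mean, here is a toy sketch over a
made-up dependency table (not pg_depend, and nothing like the real
findDependentObjects()/TopoSort machinery): ranking each object by its
longest chain of dependers and deleting in ascending rank order is one way a
topological sort puts the index before the opclass and the opclass before
g_intbig_options().

-- Toy dependency graph: each row says "child must be dropped before parent".
CREATE TEMP TABLE toy_dep (child text, parent text);
INSERT INTO toy_dep VALUES
    ('index', 'opclass gist__intbig_ops'),
    ('opclass gist__intbig_ops', 'g_intbig_options()'),
    ('index', 'table');

-- Longest-path rank: for every edge the parent ends up with a larger rank
-- than the child, so ascending rank is a safe deletion order for a DAG.
WITH RECURSIVE reach (obj, depth) AS (
    SELECT child, 0 FROM toy_dep
    UNION ALL
    SELECT d.parent, r.depth + 1
    FROM toy_dep d JOIN reach r ON d.child = r.obj
)
SELECT obj, max(depth) AS drop_rank
FROM reach
GROUP BY obj
ORDER BY drop_rank;
-- index (0), then the opclass and table (1), then g_intbig_options() (2)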
Re: Proposal to include --exclude-extension Flag in pg_dump
> Attached is a patch for the --filter docs, covering the omissions I can see.

Thanks Dean for working on this. I have reviewed the changes and they look
good to me.

Regards,
Ayush Vatsa
Amazon Web services (AWS)

On Fri, 7 Jun 2024 at 15:50, Dean Rasheed wrote:
> On Tue, 19 Mar 2024 at 11:53, Daniel Gustafsson wrote:
> >
> > I did notice a few mistakes in the --filter
> > documentation portion for other keywords but that's unrelated to this patch,
> > will fix them once this is in to avoid conflicts.
>
> Attached is a patch for the --filter docs, covering the omissions I can see.
>
> Regards,
> Dean
Re: Things I don't like about \du's "Attributes" column
On Sat, Jun 8, 2024 at 10:02 AM Pavel Luzanov wrote:
> Therefore, I think the current patch offers a better version of the \du command.
> However, I admit that these improvements are not enough to accept the patch.
> I would like to hear other opinions.

Hmm, I don't think I quite agree with this. If people like this version
better than what we have now, that's all we need to accept the patch. I just
don't really want to be the one to decide all by myself whether this is, in
fact, better. So, like you, I would like to hear other opinions.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Things I don't like about \du's "Attributes" column
On 07.06.2024 15:35, Robert Haas wrote:
> This seems unobjectionable to me. I am not sure whether it is better than
> the current version, or whether it is what we want. But it seems reasonable.

I consider this patch a continuation of the work on the \drg command, when it
was decided to remove the "Member of" column from the \du command.

Without the "Member of" column, the output of the \du command looks very
short. Only two columns: "Role name" and "Attributes". All the information
about the role is collected in just one "Attributes" column, and it is not
presented in the most convenient and obvious way.

Tom described what exactly is wrong with the "Attributes" column in the first
message of this thread, and I agree with those arguments. The current
implementation offers solutions for 3 of the 4 issues mentioned in Tom's
initial message. The issue about the display of rolvaliduntil can't be
resolved without changing pg_roles (or executing different queries for
different users).

Therefore, I think the current patch offers a better version of the \du
command. However, I admit that these improvements are not enough to accept
the patch. I would like to hear other opinions.

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com
Re: The xversion-upgrade test fails to stop server
Hello, 30.05.2024 18:00, Alexander Lakhin wrote: While reviewing recent buildfarm failures, I came across this one: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake=2024-05-23%2004%3A11%3A03 upgrade.crake/REL_16_STABLE/REL9_5_STABLE-ctl4.log waiting for server to shut down... failed pg_ctl: server does not shut down I've grepped through logs of the last 167 xversion-upgrade-REL9_5_STABLE-REL_16_STABLE/*ctl4.log on crake and got the following results: waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down.. done waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down... done waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down... done waiting for server to shut down done waiting for server to shut down. done waiting for server to shut down... done waiting for server to shut down done waiting for server to shut down. done waiting for server to shut down... done waiting for server to shut down.. done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down.. done waiting for server to shut down... done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down. done waiting for server to shut down... done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down. done waiting for server to shut down. done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down.. done waiting for server to shut down... done waiting for server to shut down... done waiting for server to shut down.. done waiting for server to shut down done waiting for server to shut down done waiting for server to shut down. done waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down... done waiting for server to shut down... done waiting for server to shut down.. done waiting for server to shut down done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down... done waiting for server to shut down done waiting for server to shut down.. done waiting for server to shut down. done waiting for server to shut down. done waiting for server to shut down done waiting for server to shut down. done waiting for server to shut down... done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down. done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down. done waiting for server to shut down. done waiting for server to shut down.. done waiting for server to shut down. done waiting for server to shut down. done waiting for server to shut down done waiting for server to shut down... done waiting for server to shut down
Re: Use WALReadFromBuffers in more places
Hi Bharath,

I spent some time examining the patch. Here are my observations from the
review.

- I believe there’s no need for an extra variable ‘nbytes’ in this context.
  We can repurpose the ‘count’ variable for the same function. If necessary,
  we might think about renaming ‘count’ to ‘nbytes’.

- The operations performed by logical_read_xlog_page() and
  read_local_xlog_page_guts() are identical. It might be beneficial to create
  a shared function to minimize code repetition.

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft

On Mon, May 13, 2024 at 12:17 PM Bharath Rupireddy wrote:
>
> On Wed, May 8, 2024 at 9:51 AM Jingtang Zhang wrote:
> >
> > Hi, Bharath. I've been testing this. It's cool. Is there any way we could
> > monitor the hit rate about directly reading from WAL buffers by exporting
> > to some views?
>
> Thanks for looking into this. I used purpose-built patches for
> verifying the WAL buffers hit ratio, please check
> USE-ON-HEAD-Collect-WAL-read-from-file-stats.txt and
> USE-ON-PATCH-Collect-WAL-read-from-buffers-and-file-stats.txt at
> https://www.postgresql.org/message-id/CALj2ACU9cfAcfVsGwUqXMace_7rfSBJ7%2BhXVJfVV1jnspTDGHQ%40mail.gmail.com.
> In the long run, it's better to extend what's proposed in the thread
> https://www.postgresql.org/message-id/CALj2ACU_f5_c8F%2BxyNR4HURjG%3DJziiz07wCpQc%3DAqAJUFh7%2B8w%40mail.gmail.com.
> I haven't had a chance to dive deep into that thread yet, but a quick
> glance over v8 patch tells me that it hasn't yet implemented the idea
> of collecting WAL read stats for both walsenders and the backends
> reading WAL. If that's done, we can extend it for WAL read from WAL
> buffers.
>
> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
Re: Injection points: preloading and runtime arguments
> On 7 Jun 2024, at 04:38, Michael Paquier wrote:

Thanks Michael! Tests of injection points with injection points are neat :)

Alvaro, here’s the test for multixact CV sleep that I was talking about at
PGConf. It is needed to test [0]. It is based on loaded injection points.
This technique is not committed yet, but the patch looks good. When all
prerequisites are ready I will post it to the corresponding thread and create
a CF item.

Thanks!

Best regards, Andrey Borodin.

[0] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a0e0fb1ba

vAB1-0001-Support-loading-of-injection-points.patch
Description: Binary data

vAB1-0002-Add-multixact-CV-sleep-test.patch
Description: Binary data
Re: Windows: openssl & gssapi dislike each other
I was able to reproduce the gssapi & openssl error on windows. I tried on
PG16 with msvc build system and on PG17 with meson build system. The error
was reproducible when enabling both openssl and gssapi from the
configurations.

Turns out that it was due to the conflicting macros. "be-secure-openssl.c"
tries to prevent this conflict here [1]. But the error again appears when
gssapi is enabled. The file "be-secure-openssl.c" fails to compile because it
has a similar scenario as explained here [2]. The header libpq.h is
indirectly including libpq-be.h which has a wrong order of including openssl
headers. Header "gssapi.h" indirectly includes "wincrypt.h" and openssl
header should be defined after the gssapi includes.

Now this can either be solved by just undefining the macro defined by
wincrypt.h as done here [3]

```
#ifdef X509_NAME
#undef X509_NAME
#endif
```

Or we should rearrange our headers. Openssl header should be at the bottom
(after the gssapi includes). I am attaching the patch here in which I
rearranged the openssl header in libpq-be.h

[1]: https://github.com/postgres/postgres/blob/8ba34c698d19450ccae9a5aea59a6d0bc8b75c0e/src/backend/libpq/be-secure-openssl.c#L46
[2]: https://github.com/openssl/openssl/issues/10307#issuecomment-964155382
[3]: https://github.com/postgres/postgres/blob/00ac25a3c365004821e819653c3307acd3294818/contrib/sslinfo/sslinfo.c#L29

Thanks
Imran Zaheer
Bitnine

v01-0001-approach-01-Reorder-openssl-header.patch
Description: Binary data
Re: Conflict Detection and Resolution
On Fri, Jun 7, 2024 at 5:39 PM Ashutosh Bapat wrote:
>
> On Thu, Jun 6, 2024 at 5:16 PM Nisha Moond wrote:
>>
>> Here are more use cases of the "earliest_timestamp_wins" resolution method:
>> 1) Applications where the record of first occurrence of an event is
>> important. For example, sensor based applications like earthquake
>> detection systems, capturing the first seismic wave's time is crucial.
>> 2) Scheduling systems, like appointment booking, prioritize the
>> earliest request when handling concurrent ones.
>> 3) In contexts where maintaining chronological order is important -
>>   a) Social media platforms display comments ensuring that the
>> earliest ones are visible first.
>>   b) Finance transaction processing systems rely on timestamps to
>> prioritize the processing of transactions, ensuring that the earliest
>> transaction is handled first
>
> Thanks for sharing examples. However, these scenarios would be handled by the
> application and not during replication. What we are discussing here is the
> timestamp when a row was updated/inserted/deleted (or rather when the
> transaction that updated row committed/became visible) and not a DML on
> column which is of type timestamp. Some implementations use a hidden
> timestamp column but that's different from a user column which captures
> timestamp of (say) an event. The conflict resolution will be based on the
> timestamp when that column's value was recorded in the database which may be
> different from the value of the column itself.

It depends on how these operations are performed. For example, the
appointment booking system could be prioritized via a transaction updating a
row with columns emp_name, emp_id, reserved, time_slot. Now, if two employees
at different geographical locations try to book the calendar, the earlier
transaction will win.

> If we use the transaction commit timestamp as basis for resolution, a
> transaction where multiple rows conflict may end up with different rows
> affected by that transaction being resolved differently. Say three
> transactions T1, T2 and T3 on separate origins with timestamps t1, t2, and t3
> respectively changed rows r1, r2 and r2, r3 and r1, r4 respectively. Changes
> to r1 and r2 will conflict. Let's say T2 and T3 are applied first and then T1
> is applied. If t2 < t1 < t3, r1 will end up with version of T3 and r2 will
> end up with version of T1 after applying all the three transactions.

Are you saying these results are based on latest_timestamp_wins? If so, then
they are correct. OTOH, if the user has configured the
"earliest_timestamp_wins" resolution method, then we should end up with the
version of r1 from T1 because t1 < t3. Also, for the same reason, we should
have the version of r2 from T2.

> Would that introduce an inconsistency between r1 and r2?

As per my understanding, this shouldn't be an inconsistency. Won't it be true
even when the transactions are performed on a single node with the same
timing?

--
With Regards,
Amit Kapila.
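To spell out the per-row resolution in that example, here is a toy sketch
(table and timestamp values made up purely for illustration, not the proposed
implementation) of "earliest_timestamp_wins" applied per conflicting row,
with t2 < t1 < t3:

CREATE TEMP TABLE changes (txn text, commit_ts timestamptz, row_id text);
INSERT INTO changes VALUES
    ('T1', '2024-06-08 10:02', 'r1'), ('T1', '2024-06-08 10:02', 'r2'),
    ('T2', '2024-06-08 10:01', 'r2'), ('T2', '2024-06-08 10:01', 'r3'),
    ('T3', '2024-06-08 10:03', 'r1'), ('T3', '2024-06-08 10:03', 'r4');

-- Earliest commit timestamp wins for each row.
SELECT DISTINCT ON (row_id) row_id, txn AS winner, commit_ts
FROM changes
ORDER BY row_id, commit_ts;
-- r1 -> T1, r2 -> T2, r3 -> T2, r4 -> T3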
New function normal_rand_array function to contrib/tablefunc.
Here is a new function which could produce an array of numbers with a
controllable array length and duplicated elements in these arrays. I used it
when working with gin index, and I think it would be helpful for others as
well, so here it is.

select * from normal_rand_array(5, 10, 1.8::numeric, 3.5::numeric);
               normal_rand_array
-----------------------------------------------
 {3.3,2.3,2.7,3.2,2.0,2.7,3.4,2.7,2.3,2.9}
 {3.3,1.8,2.9,3.4,2.0,1.8,2.0,3.5,2.8,2.5}
 {2.1,1.9,2.3,1.9,2.5,2.7,2.4,2.9,1.8}
 {2.3,2.5,2.4,2.7,2.7,2.3,2.9,3.3,3.3,1.9,3.5}
 {2.8,3.4,2.7,1.8,3.3,2.3,2.2,3.5,2.6,2.5}
(5 rows)

select * from normal_rand_array(5, 10, 1.8::int4, 3.5::int4);
          normal_rand_array
-------------------------------------
 {3,2,2,3,4,2}
 {2,4,2,3,3,3,3,2,2,3,3,2,3,2}
 {2,4,3}
 {4,2,3,4,2,4,2,2,3,4,3,3,2,4,4,2,3}
 {4,3,3,4,3,3,4,2,4}
(5 rows)

the 5 means it needs to produce 5 rows in total and the 10 is the average
array length, and 1.8 is the minvalue for the random function and 3.5 is the
maxvalue.

--
Best Regards
Andy Fan

From 397dcaf67f29057b80aebbb6116b49ac8344547c Mon Sep 17 00:00:00 2001
From: Andy Fan
Date: Sat, 8 Jun 2024 13:21:08 +0800
Subject: [PATCH v20240608 1/1] Add function normal_rand_array function to
 contrib/tablefunc.

It can produce an array of numbers with n controllable array length and
duplicated elements in these arrays.
---
 contrib/tablefunc/Makefile                |   2 +-
 contrib/tablefunc/expected/tablefunc.out  |  26 ++++++
 contrib/tablefunc/sql/tablefunc.sql       |  10 ++
 contrib/tablefunc/tablefunc--1.0--1.1.sql |   7 ++
 contrib/tablefunc/tablefunc.c             | 140 ++++++++++++++++++++++
 contrib/tablefunc/tablefunc.control       |   2 +-
 doc/src/sgml/tablefunc.sgml               |  10 ++
 src/backend/utils/adt/arrayfuncs.c        |   7 ++
 8 files changed, 202 insertions(+), 2 deletions(-)
 create mode 100644 contrib/tablefunc/tablefunc--1.0--1.1.sql

diff --git a/contrib/tablefunc/Makefile b/contrib/tablefunc/Makefile
index 191a3a1d38..f0c67308fd 100644
--- a/contrib/tablefunc/Makefile
+++ b/contrib/tablefunc/Makefile
@@ -3,7 +3,7 @@
 MODULES = tablefunc
 
 EXTENSION = tablefunc
-DATA = tablefunc--1.0.sql
+DATA = tablefunc--1.0.sql tablefunc--1.0--1.1.sql
 PGFILEDESC = "tablefunc - various functions that return tables"
 
 REGRESS = tablefunc
diff --git a/contrib/tablefunc/expected/tablefunc.out b/contrib/tablefunc/expected/tablefunc.out
index ddece79029..9f0cbbfbbe 100644
--- a/contrib/tablefunc/expected/tablefunc.out
+++ b/contrib/tablefunc/expected/tablefunc.out
@@ -12,6 +12,32 @@ SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
 -- negative number of tuples
 SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
 ERROR:  number of rows cannot be negative
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::numeric, 8::numeric) as i;
+ count |        avg
+-------+--------------------
+    10 | 3.
+(1 row)
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::int4, 8::int4) as i;
+ count |        avg
+-------+--------------------
+    10 | 3.
+(1 row)
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::int8, 8::int8) as i;
+ count |        avg
+-------+--------------------
+    10 | 3.
+(1 row)
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::float8, 8::float8) as i;
+ count |        avg
+-------+--------------------
+    10 | 3.
+(1 row)
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 'abc'::text, 'def'::text) as i;
+ERROR:  unsupported type 25 in normal_rand_array.
 --
 -- crosstab()
 --
diff --git a/contrib/tablefunc/sql/tablefunc.sql b/contrib/tablefunc/sql/tablefunc.sql
index 0fb8e40de2..dec57cfc66 100644
--- a/contrib/tablefunc/sql/tablefunc.sql
+++ b/contrib/tablefunc/sql/tablefunc.sql
@@ -8,6 +8,16 @@ SELECT avg(normal_rand)::int, count(*) FROM normal_rand(100, 250, 0.2);
 -- negative number of tuples
 SELECT avg(normal_rand)::int, count(*) FROM normal_rand(-1, 250, 0.2);
 
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::numeric, 8::numeric) as i;
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::int4, 8::int4) as i;
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::int8, 8::int8) as i;
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 1.23::float8, 8::float8) as i;
+
+SELECT count(*), avg(COALESCE(array_length(i, 1), 0)) FROM normal_rand_array(10, 3, 'abc'::text, 'def'::text) as i;
+
 --
 -- crosstab()
 --
diff --git a/contrib/tablefunc/tablefunc--1.0--1.1.sql b/contrib/tablefunc/tablefunc--1.0--1.1.sql
new file