On 09.12.24 02:25, Noah Misch wrote:
Ok, we should probably put that comment back in slightly altered form, like

"XXX This function intentionally takes only an AccessShareLock ... $REASON.
In the face of concurrent DDL, we might easily latch
onto an old version of an object, causing the GRANT or REVOKE statement
to fail."

Yep.

There is an open item for PG18 for this. So here is a patch that adds a comment back, mostly from your descriptions in this thread.

The change to AccessShareLock at least prevents confusing "cache lookup
failed" messages, and might alleviate some security concerns about swapping
in a completely different object concurrently (even if we currently think
this is not an actual problem).

Perhaps.  To me, the v17 behavior smells mildly superior to the v18 behavior.

Hmm. I think there has been a general effort to get rid of internal errors like "cache lookup failed ..." and replace them with proper user-facing errors. This change seems in line with that.

An alternative, if we wanted to go back to the old behavior (other than reverting altogether, since I think the refactoring is still valuable), would be to allow get_object_address() to work with lockmode == NoLock. That would require a bit of work, but nothing magical.

--- a/src/test/isolation/expected/intra-grant-inplace.out
+++ b/src/test/isolation/expected/intra-grant-inplace.out
@@ -248,6 +248,6 @@ relhasindex
   -----------
   (0 rows)
-s4: WARNING:  got: cache lookup failed for relation REDACTED
+s4: WARNING:  got: relation "intra_grant_inplace" does not exist

The affected permutation existed to cover the first LockRelease() in
SearchSysCacheLocked1().  Since this commit, that line no longer has coverage.

Do you have an idea how such a test case could be constructed now?

A rough idea.  The test worked because REVOKE used only LOCKTAG_TUPLE, which
didn't mind the LOCKTAG_RELATION from DROP TABLE.

One route might be to find another SearchSysCacheLocked1() caller that takes
no locks beyond the LOCKTAG_TUPLE taken right in SearchSysCacheLocked1().  I'd
add a temporary elog to report if that's happening.
check_lock_if_inplace_updateable_rel() is an example of reporting the absence
of a lock.  If check-world w/ that elog finds some operation reaching that
circumstance, this test could replace REVOKE with that operation.

Another route would be to remove the catalog row without locking the
underlying object, e.g. by replacing DROP TABLE with DELETE FROM pg_class.
That's more artificial.

I have attached here a patch that shows that the last idea does work.

I don't know what the best solution here is. It seems weird to leave some higher-level code faulty in order to be able to test lower-level code that attempts to cope with the faults of the higher-level code. I know that backstops have value, though.
From 1cac7fecf80a7ed3dc4a714f7c984fd8934fc0ea Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Wed, 11 Jun 2025 17:08:23 +0200
Subject: [PATCH 1/2] Improve objectNamesToOids() comment

Commit d31bbfb6590 removed the comment at objectNamesToOids() that
there is no locking, because that commit added locking.  But to fix
all the problems, we'd still need a stronger lock.  So put the comment
back with more a detailed explanation.
---
 src/backend/catalog/aclchk.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 9ca8a88dc91..24948c1f05e 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -659,6 +659,20 @@ ExecGrantStmt_oids(InternalGrant *istmt)
  * objectNamesToOids
  *
  * Turn a list of object names of a given type into an Oid list.
+ *
+ * XXX This function intentionally takes only an AccessShareLock.  In the face
+ * of concurrent DDL, we might easily latch onto an old version of an object,
+ * causing the GRANT or REVOKE statement to fail.  But it does prevent the
+ * object from disappearing altogether.  To do better, we would need to use a
+ * self-exclusive lock, perhaps ShareUpdateExclusiveLock, here and before
+ * *every* CatalogTupleUpdate() of a row that GRANT/REVOKE can affect.
+ * Besides that additional work, this could have operational costs.  For
+ * example, it would make GRANT ALL TABLES IN SCHEMA terminate every
+ * autovacuum running in the schema and consume a shared lock table entry per
+ * table in the schema.  The user-visible benefit of that additional work is
+ * just changing "ERROR: tuple concurrently updated" to blocking.  That's not
+ * nothing, but it might not outweigh autovacuum termination and lock table
+ * consumption spikes.
  */
 static List *
 objectNamesToOids(ObjectType objtype, List *objnames, bool is_grant)

base-commit: 361499538c9d3640e1ed5522e08fdf81b08e76ae
-- 
2.49.0

From e1ff7134fca04b9909f30eb8d71a126de92bc2a1 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Wed, 11 Jun 2025 17:11:19 +0200
Subject: [PATCH 2/2] WIP: Attempt to put back intra-grant-inplace.spec
 coverage

---
 src/test/isolation/expected/intra-grant-inplace.out | 4 ++--
 src/test/isolation/specs/intra-grant-inplace.spec   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/test/isolation/expected/intra-grant-inplace.out 
b/src/test/isolation/expected/intra-grant-inplace.out
index 1aa9da622da..23c34d0ca09 100644
--- a/src/test/isolation/expected/intra-grant-inplace.out
+++ b/src/test/isolation/expected/intra-grant-inplace.out
@@ -226,7 +226,7 @@ step revoke4: <... completed>
 starting permutation: b1 drop1 b3 sfu3 revoke4 c1 r3
 step b1: BEGIN;
 step drop1: 
-       DROP TABLE intra_grant_inplace;
+       DELETE FROM pg_class WHERE relname = 'intra_grant_inplace';
 
 step b3: BEGIN ISOLATION LEVEL READ COMMITTED;
 step sfu3: 
@@ -248,6 +248,6 @@ relhasindex
 -----------
 (0 rows)
 
-s4: WARNING:  got: relation "intra_grant_inplace" does not exist
+s4: WARNING:  got: cache lookup failed for relation REDACTED
 step revoke4: <... completed>
 step r3: ROLLBACK;
diff --git a/src/test/isolation/specs/intra-grant-inplace.spec 
b/src/test/isolation/specs/intra-grant-inplace.spec
index 9936d389359..e9c7848624c 100644
--- a/src/test/isolation/specs/intra-grant-inplace.spec
+++ b/src/test/isolation/specs/intra-grant-inplace.spec
@@ -20,7 +20,7 @@ step grant1   {
        GRANT SELECT ON intra_grant_inplace TO PUBLIC;
 }
 step drop1     {
-       DROP TABLE intra_grant_inplace;
+       DELETE FROM pg_class WHERE relname = 'intra_grant_inplace';
 }
 step c1        { COMMIT; }
 
-- 
2.49.0

Reply via email to