Re: DROP OWNED CASCADE vs Temp tables
On 2020-Apr-06, Tom Lane wrote: > Alvaro Herrera writes: > > On 2020-Jan-14, Alvaro Herrera wrote: > >> Hmm, it seems to be doing it differently. Maybe it should be acquiring > >> locks on all objects in that nested loop and verified them for > >> existence, so that when it calls performMultipleDeletions the objects > >> are already locked, as you say. > > > Yeah, this solves the reported bug. > > I looked this over and think it should be fine. There will be cases > where we get a deadlock error, but such risks existed anyway, since > we'd have acquired all the same locks later in the process. Thanks for looking again. I have pushed this to all branches, with these changes: > Hmmm ... there is an argument for doing ReleaseDeletionLock in the code > paths where you discover that the object's been deleted. Added this. This of course required also exporting ReleaseDeletionLock, which closes my concern about exporting only half of that API. > Also, if we're exporting these, it's worth expending a bit more > effort on their header comments. In particular AcquireDeletionLock > should describe its flags argument; perhaps along the lines of > "Accepts the same flags as performDeletion (though currently only > PERFORM_DELETION_CONCURRENTLY does anything)". Did this too. I also changed the comment to indicate that, since they're now exported APIs, they might grow the ability to lock shared objects in the future. In fact, we have some places where we're using LockSharedObject directly to lock objects to drop; it seems reasonable to think that we should augment AcquireDeletionLock to handle those objects and make those places use the new API. Lastly: right now, only performMultipleDeletions passes the flags down to AcquireDeletionLock -- there are a couple places that drop objects and call AcquireDeletionLock with flags=0. There's no bug AFAICS because those cannot be called while running concurrent object drop. But for correctness, those should pass flags too. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: DROP OWNED CASCADE vs Temp tables
Alvaro Herrera writes: > On 2020-Jan-14, Alvaro Herrera wrote: >> Hmm, it seems to be doing it differently. Maybe it should be acquiring >> locks on all objects in that nested loop and verified them for >> existence, so that when it calls performMultipleDeletions the objects >> are already locked, as you say. > Yeah, this solves the reported bug. I looked this over and think it should be fine. There will be cases where we get a deadlock error, but such risks existed anyway, since we'd have acquired all the same locks later in the process. > This is not a 100% solution: there's the cases when the user is removed > from an ACL and from a policy, and those deletions are done directly > instead of accumulating to the end for a mass deletion. Let's worry about that when and if we get field complaints. > I had to export AcquireDeletionLock (previously a static in > dependency.c). I wonder if I should export ReleaseDeletionLock too, for > symmetry. Hmmm ... there is an argument for doing ReleaseDeletionLock in the code paths where you discover that the object's been deleted. Holding a lock on a no-longer-existent object OID should be harmless from a deadlock standpoint; but it does represent useless consumption of a shared lock table entry, and that's a resource this operation could already burn with abandon. Also, if we're exporting these, it's worth expending a bit more effort on their header comments. In particular AcquireDeletionLock should describe its flags argument; perhaps along the lines of "Accepts the same flags as performDeletion (though currently only PERFORM_DELETION_CONCURRENTLY does anything)". regards, tom lane
Re: DROP OWNED CASCADE vs Temp tables
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:not tested I have tested the patch with REL_12_STABLE for the given error scenario by running the "CREATE TEMPORARY TABLE..." and "DROP OWNED BY..." commands concurrently using parallel background workers. I have also tested a few related scenarios and the patch does seem to fix the reported bug. Ran make installcheck-world, no difference with and without patch.
Re: DROP OWNED CASCADE vs Temp tables
On 2020-Jan-23, Alvaro Herrera wrote: > This is not a 100% solution: there's the cases when the user is removed > from an ACL and from a policy, and those deletions are done directly > instead of accumulating to the end for a mass deletion. > > I had to export AcquireDeletionLock (previously a static in > dependency.c). I wonder if I should export ReleaseDeletionLock too, for > symmetry. FWIW I'm going to withhold this bugfix until after the next set of minors are out. I'd rather not find out later that I have no way to fix 9.4 if I break it, for a bug that has existed forever ... -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: DROP OWNED CASCADE vs Temp tables
On 2020-Jan-14, Alvaro Herrera wrote: > On 2020-Jan-13, Tom Lane wrote: > > > That seems fundamentally wrong. By the time we've queued an object for > > deletion in dependency.c, we have a lock on it, and we've verified that > > the object is still there (cf. systable_recheck_tuple calls). > > If shdepDropOwned is doing it differently, I'd say shdepDropOwned is > > doing it wrong. > > Hmm, it seems to be doing it differently. Maybe it should be acquiring > locks on all objects in that nested loop and verified them for > existence, so that when it calls performMultipleDeletions the objects > are already locked, as you say. Yeah, this solves the reported bug. This is not a 100% solution: there's the cases when the user is removed from an ACL and from a policy, and those deletions are done directly instead of accumulating to the end for a mass deletion. I had to export AcquireDeletionLock (previously a static in dependency.c). I wonder if I should export ReleaseDeletionLock too, for symmetry. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index c4a4df25b8..2ec3e39694 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -199,7 +199,6 @@ static void reportDependentObjects(const ObjectAddresses *targetObjects, static void deleteOneObject(const ObjectAddress *object, Relation *depRel, int32 flags); static void doDeletion(const ObjectAddress *object, int flags); -static void AcquireDeletionLock(const ObjectAddress *object, int flags); static void ReleaseDeletionLock(const ObjectAddress *object); static bool find_expr_references_walker(Node *node, find_expr_references_context *context); @@ -1530,7 +1529,7 @@ doDeletion(const ObjectAddress *object, int flags) * else. Note that dependency.c is not concerned with deleting any kind of * shared-across-databases object, so we have no need for LockSharedObject. */ -static void +void AcquireDeletionLock(const ObjectAddress *object, int flags) { if (object->classId == RelationRelationId) diff --git a/src/backend/catalog/pg_shdepend.c b/src/backend/catalog/pg_shdepend.c index 2ef792dbd7..20d7ee7d87 100644 --- a/src/backend/catalog/pg_shdepend.c +++ b/src/backend/catalog/pg_shdepend.c @@ -1319,12 +1319,16 @@ shdepDropOwned(List *roleids, DropBehavior behavior) elog(ERROR, "unexpected dependency type"); break; case SHARED_DEPENDENCY_ACL: + /* XXX need to lock+accumulate everything and drop later? */ RemoveRoleFromObjectACL(roleid, sdepForm->classid, sdepForm->objid); break; case SHARED_DEPENDENCY_POLICY: - /* If unable to remove role from policy, remove policy. */ + /* + * Try to remove role from policy; if unable to, remove + * policy. + */ if (!RemoveRoleFromObjectPolicy(roleid, sdepForm->classid, sdepForm->objid)) @@ -1332,6 +1336,15 @@ shdepDropOwned(List *roleids, DropBehavior behavior) obj.classId = sdepForm->classid; obj.objectId = sdepForm->objid; obj.objectSubId = sdepForm->objsubid; + /* + * Acquire lock on object, then verify this dependency + * is still relevant. If not, the object might have + * been dropped or the policy modified. Ignore the + * object in that case. + */ + AcquireDeletionLock(, 0); + if (!systable_recheck_tuple(scan, tuple)) + break; add_exact_object_address(, deleteobjs); } break; @@ -1342,6 +1355,10 @@ shdepDropOwned(List *roleids, DropBehavior behavior) obj.classId = sdepForm->classid; obj.objectId = sdepForm->objid; obj.objectSubId = sdepForm->objsubid; + /* as above */ + AcquireDeletionLock(, 0); + if (!systable_recheck_tuple(scan, tuple)) + break; add_exact_object_address(, deleteobjs); } break; diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h index 0cd6fcf027..fa9252b07c 100644 --- a/src/include/catalog/dependency.h +++ b/src/include/catalog/dependency.h @@ -142,6 +142,8 @@ typedef enum ObjectClass /* in dependency.c */ +extern void AcquireDeletionLock(const ObjectAddress *object, int flags); + extern void performDeletion(const ObjectAddress *object, DropBehavior behavior, int flags);
Re: DROP OWNED CASCADE vs Temp tables
On 2020-Jan-13, Tom Lane wrote: > That seems fundamentally wrong. By the time we've queued an object for > deletion in dependency.c, we have a lock on it, and we've verified that > the object is still there (cf. systable_recheck_tuple calls). > If shdepDropOwned is doing it differently, I'd say shdepDropOwned is > doing it wrong. Hmm, it seems to be doing it differently. Maybe it should be acquiring locks on all objects in that nested loop and verified them for existence, so that when it calls performMultipleDeletions the objects are already locked, as you say. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: DROP OWNED CASCADE vs Temp tables
Alvaro Herrera writes: > On 2020-Jan-07, Mithun Cy wrote: >> I have a test where a user creates a temp table and then disconnect, >> concurrently we try to do DROP OWNED BY CASCADE on the same user. Seems >> this causes race condition between temptable deletion during disconnection >> (@RemoveTempRelations(myTempNamespace)) and DROP OWNED BY CASCADE operation >> which will try to remove same temp table when they find them as part of >> pg_shdepend. > Cute. Is this really any worse than any other attempt to issue two DROPs against the same object concurrently? Maybe we can just call it pilot error. > This seems fiddly to handle better; maybe you'd have to have a new > PERFORM_DELETION_* flag that says to ignore "missing" objects; so when > you go from shdepDropOwned, you pass that flag all the way down to > doDeletion(), so the objtype-specific function is called with > "missing_ok", and ignore if the object has already gone away. That's > tedious because none of the Remove* functions have the concept of > missing_ok. That seems fundamentally wrong. By the time we've queued an object for deletion in dependency.c, we have a lock on it, and we've verified that the object is still there (cf. systable_recheck_tuple calls). If shdepDropOwned is doing it differently, I'd say shdepDropOwned is doing it wrong. regards, tom lane
Re: DROP OWNED CASCADE vs Temp tables
On Mon, Jan 13, 2020 at 07:45:06PM -0300, Alvaro Herrera wrote: > This seems fiddly to handle better; maybe you'd have to have a new > PERFORM_DELETION_* flag that says to ignore "missing" objects; so when > you go from shdepDropOwned, you pass that flag all the way down to > doDeletion(), so the objtype-specific function is called with > "missing_ok", and ignore if the object has already gone away. That's > tedious because none of the Remove* functions have the concept of > missing_ok. Yes, that would be invasive and I'd rather not backpatch such a change but I don't see a better or cleaner way to handle that correctly either than the way you are describing. Looking at all the subroutines removing the objects by OID, a patch among those lines is repetitive, though not complicated to do. -- Michael signature.asc Description: PGP signature
Re: DROP OWNED CASCADE vs Temp tables
On 2020-Jan-07, Mithun Cy wrote: > I have a test where a user creates a temp table and then disconnect, > concurrently we try to do DROP OWNED BY CASCADE on the same user. Seems > this causes race condition between temptable deletion during disconnection > (@RemoveTempRelations(myTempNamespace)) and DROP OWNED BY CASCADE operation > which will try to remove same temp table when they find them as part of > pg_shdepend. Cute. This seems fiddly to handle better; maybe you'd have to have a new PERFORM_DELETION_* flag that says to ignore "missing" objects; so when you go from shdepDropOwned, you pass that flag all the way down to doDeletion(), so the objtype-specific function is called with "missing_ok", and ignore if the object has already gone away. That's tedious because none of the Remove* functions have the concept of missing_ok. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
DROP OWNED CASCADE vs Temp tables
I have a test where a user creates a temp table and then disconnect, concurrently we try to do DROP OWNED BY CASCADE on the same user. Seems this causes race condition between temptable deletion during disconnection (@RemoveTempRelations(myTempNamespace)) and DROP OWNED BY CASCADE operation which will try to remove same temp table when they find them as part of pg_shdepend. Which will result in internal error cache lookup failed as below. DROP OWNED BY test_role CASCADE; 2020-01-07 12:35:06.524 IST [26064] ERROR: cache lookup failed for relation 41019 2020-01-07 12:35:06.524 IST [26064] STATEMENT: DROP OWNED BY test_role CASCADE; reproduce.sql:8: ERROR: cache lookup failed for relation 41019 TEST = create database test_db; create user test_superuser superuser; \c test_db test_superuser CREATE ROLE test_role nosuperuser login password 'test_pwd' ; \c test_db test_role CREATE TEMPORARY TABLE tmp_table(col1 int); \c test_db test_superuser DROP OWNED BY test_role CASCADE; -- Thanks and Regards Mithun Chicklore Yogendra EnterpriseDB: http://www.enterprisedb.com