Hi, While recently looking into partition maintenance, I found a case in which DETACH PARTITION FINALIZE could case deadlocks. This occurs when a ALTER TABLE DETACH CONCURRENTLY, followed by a cancel, the followed by an ALTER TABLE DETACH FINALIZE, and this sequence of steps occur in the middle of other REPEATABLE READ/SERIALIZABLE transactions. These RR/SERIALIZABLE should not accessed the partition until the FINALIZE step starts. See the attached repro.
This seems to occur as the FINALIZE is calling WaitForOlderSnapshot [1] to make sure that all older snapshots are completed before the finalize is completed and the detached partition is removed from the parent table association. WaitForOlderSnapshots is used here to ensure that snapshots older than the start of the ALTER TABLE DETACH CONCURRENTLY are completely removed to guarantee consistency, however it does seem to cause deadlocks for at least RR/SERIALIZABLE transactions. There are cases [2] in which certain operations are accepted as not being MVCC-safe, and now I am wondering if this is another case? Deadlocks are not a good scenario, and WaitForOlderSnapshot does not appear to do anything for READ COMMITTED transactions. So do we actually need WaitForOlderSnapshot in the FINALIZE code? and Could be acceptable that this operation is marked as not MVCC-safe like the other aforementioned operations? Perhaps I am missing some important point here, so any feedback will be appreciated. Regards, Sami Imseih Amazon Web Services (AWS) [1] https://github.com/postgres/postgres/blob/master/src/backend/commands/tablecmds.c#L18757-L18764 [2] https://www.postgresql.org/docs/current/mvcc-caveats.html
partition_detach_finalize_deadlock_repro.sql
Description: partition_detach_finalize_deadlock_repro.sql