Here's a much simpler example. This one works when I turn off
enable_tidscan, so I'm very hopeful Sophie's patch applies. It's very
similar to the example bug, but loses an update instead of gaining
one. It MSTM that both are because the ctid value wasn't being
rechecked in EvalPlanQual.

--session 1
CREATE TABLE t(p BOOL, q BOOL);
INSERT INTO t DEFAULT VALUES;
SELECT pg_advisory_lock(1);

--session 2
SELECT pg_advisory_lock_shared(1);
SELECT pg_advisory_unlock_shared(1);
WITH lock_t AS (SELECT ctid FROM t FOR UPDATE)
UPDATE t SET p = TRUE FROM lock_t l WHERE t.ctid = l.ctid
RETURNING p, q, t.ctid;

--session 3
SELECT pg_advisory_lock_shared(1);
SELECT pg_advisory_unlock_shared(1);
WITH lock_t AS (SELECT ctid FROM t FOR UPDATE)
UPDATE t SET q = TRUE FROM lock_t l WHERE t.ctid = l.ctid
RETURNING p, q, t.ctid;

--session 1
SELECT pg_advisory_unlock(1);

Only one of the updates succeeds.

This leads me to conclude that for concurrency and ctid:
safe:
    - deleting a record, because one of the deletes will succeed
    - updating a record with skip lock, because the lock won't ever be rechecked
unsafe (currently):
  - updating a record without skip lock

The documentation for UPDATE and DELETE both include a FOR UPDATE with
ctid example, so I think this isn't expected behaviour, but will be
fixed in the next release. Thanks Sophie!

Thanks,
Bernice

On Wed, Nov 12, 2025 at 5:12 PM Bernice Southey
<[email protected]> wrote:
>
> Hi all,
>
> I've run into a concurrency issue and am hoping someone here with more
> internal PostgreSQL knowledge than me can explain why.
>
> My scenario is keeping a sync table up to date with the latest message
> by recipient key. If I use ctid, all the updated records are dropped
> by the concurrent query, after the first query releases the locks. If
> I use a standard column it works as expected.
>
> I'm hopeful that the recently accepted patch [1] from Sophie is the
> same issue, but I'm not knowledgeable enough to tell. I tried setting
> enable_tidscan to 0 based on the discussion, but it made no
> difference, so I thought I best ask. If this behaviour is as expected,
> I'd really like to understand why, so I can better reason about ctid
> in concurrent updates.
>
> To test, I created 10 sync records and 200 messages for each of them.
> I processed 1000 messages at a time concurrently in two different
> sessions. I duplicated the test for the ctid case and my key column
> (k) case. The only difference is the final where clause for the
> update. The output should be each key synced to 200.
>
> --setup
> CREATE TABLE msg(k INT, m INT);
> CREATE TABLE msg2(k INT, m INT);
> CREATE TABLE sync(k INT PRIMARY KEY, m INT DEFAULT 0);
> CREATE TABLE sync2(k INT PRIMARY KEY, m INT DEFAULT 0);
> INSERT INTO sync(k) SELECT i FROM generate_series(1, 10) i;
> INSERT INTO sync2(k) SELECT i FROM generate_series(1, 10) i;
> INSERT INTO msg SELECT k, i FROM sync, generate_series(1, 200) i
>     ORDER BY random();
> INSERT INTO msg2 SELECT k, i FROM sync, generate_series(1, 200) i
>     ORDER BY random();
> SELECT pg_advisory_lock(1);
>
> --run in two new sessions
> SELECT pg_advisory_lock_shared(1);
> SELECT pg_advisory_unlock_shared(1);
> WITH
> --by ctid
> lock_msg AS (
>     SELECT ctid FROM msg FOR UPDATE SKIP LOCKED LIMIT 1000),
> delete_msg AS (
>     DELETE FROM msg m USING lock_msg l WHERE m.ctid = l.ctid
>     RETURNING k, m),
> msg_max AS (
>     SELECT DISTINCT ON (k) k, m FROM delete_msg ORDER BY k, m DESC),
> lock_sync AS (
>     SELECT s.ctid, k, m.m from msg_max m join sync s USING (k)
>     ORDER BY k FOR UPDATE OF s),
> update_sync AS (
>     UPDATE sync s set m = l.m FROM lock_sync l
>     WHERE s.ctid = l.ctid AND l.m > s.m  --by ctid
>     RETURNING s.k, s.m, s.ctid),
> --by k
> lock_msg2 AS (
>     SELECT ctid FROM msg2 FOR UPDATE SKIP LOCKED LIMIT 1000),
> delete_msg2 AS (
>     DELETE FROM msg2 m USING lock_msg2 l WHERE m.ctid = l.ctid
>     RETURNING k, m),
> msg_max2 AS (
>     SELECT DISTINCT ON (k) k, m FROM delete_msg2 ORDER BY k, m DESC),
> lock_sync2 AS (
>     SELECT s.ctid, k, m.m FROM msg_max2 m join sync2 s USING (k)
>     ORDER BY k FOR UPDATE OF s),
> update_sync2 AS (
>     UPDATE sync2 s SET m = l.m FROM lock_sync2 l
>     WHERE s.k = l.k AND l.m > s.m --by k
>     RETURNING s.k, s.m, s.ctid)
> --results
> SELECT 'ctid', * FROM update_sync UNION ALL
> SELECT 'k', * FROM update_sync2;
>
> --run last in original session
> SELECT pg_advisory_unlock(1);
>
> All the records in both msg tables are deleted. In the sync tables,
> 'k' is updated as expected across the two sessions, but 'ctid' dropped
> all the records in the second session.
>
> Session 1
> ctid 1 199 (0,11)
> ctid 2 200 (0,12)
> ctid 3 200 (0,13)
> ctid 4 199 (0,14)
> ctid 5 200 (0,15)
> ctid 6 200 (0,16)
> ctid 7 199 (0,17)
> ctid 8 200 (0,18)
> ctid 9 200 (0,19)
> ctid 10 200 (0,20)
> k 1 200 (0,11)
> k 2 195 (0,12)
> k 3 200 (0,13)
> k 4 200 (0,14)
> k 5 199 (0,15)
> k 6 199 (0,16)
> k 7 196 (0,17)
> k 8 194 (0,18)
> k 9 200 (0,19)
> k 10 196 (0,20)
>
> Session 2
> k 2 200 (0,21)
> k 5 200 (0,22)
> k 6 200 (0,23)
> k 7 200 (0,24)
> k 8 200 (0,25)
> k 10 200 (0,26)
>
> Thanks,
> Bernice
>
> 1 
> [https://www.postgresql.org/message-id/flat/4a6268ff-3340-453a-9bf5-c98d51a6f729%40app.fastmail.com]


Reply via email to