[BUGS] BUG #4925: "select ... for update" doesn't affect rows from sub-query

Steve Caligo Thu, 16 Jul 2009 09:46:31 -0700

The following bug has been logged online:

Bug reference:      4925
Logged by:          Steve Caligo
Email address:      steve.cal...@ctie.etat.lu
PostgreSQL version: 8.3.7 and 8.4.0
Operating system:   Archlinux and Gentoo 8.3.7, Gentoo 8.4.0
Description:        "select ... for update" doesn't affect rows from
sub-query
Details:


While trying to guarantee data consistency when doing concurrent processing,
I stumbled upon your cautions mentionned in your documentation (especially
"limit" ... "for update"):
http://www.postgresql.org/docs/8.4/static/sql-select.html

I tried working around this limitation and the statement on the same page
seemed promising to me:
"If FOR UPDATE or FOR SHARE is applied to a view or sub-query, it affects
all tables used in the view or sub-query."

But unfortunately the latter statement doesn't seem to be true and the
subquery isn't protected by row locks, as the following examples show.


1) INITIAL SITUATION

Create a simple table with some data. No constraints, no indexes, just the
bare minimum:

create table test (
        id integer,
        name varchar(10),
        c integer
);

insert into test values
        (1, 'test1', 0),
        (2, 'test2', 0),
        (3, 'test3', 0),
        (4, 'test4', 0),
        (5, 'test5', 0)
;


2) FIRST TRY, USING "UPDATE WHERE ID = ( SELECT )"
The goal is to have two clients set their unique ID to a single/different
row from the table. First, using "limit" in a slightly different way:

1=> begin transaction;
2=> begin transaction;

1=> update test set c = 1 where id = (
        select id from test where c = 0 order by id limit 1
)
; -- updates row id=1
2=> update test set c = 2 where id = (
        select id from test where c = 0 order by id limit 1
)
; -- forced to wait on lock

1=> commit; -- client #2 continues
1=> select * from test where id = 1;
 id | name  | c 
----+-------+---
  1 | test1 | 1
(1 row)

2=> commit;
2=> select * from test where id = 1;
 id | name  | c 
----+-------+---
  1 | test1 | 2
(1 row)

1=> select * from test where id = 1;
 id | name  | c 
----+-------+---
  1 | test1 | 2
(1 row)


Conclusion: didn't work.
Probably because the subquery is executed before the update and not affected
by row locking.

Expected behaviour: one client update one row to c=1 and the other client
updates a different row to c=2.


3) SECOND TRY, PROTECTING THE UPDATE BY AN ADDITIONAL "SELECT ... FOR
UPDATE" AND AVOIDING THE "LIMIT"

1=> begin transaction;
2=> begin transaction;
1=> select id from test where id = (
        select min(id) from test where c = 0
) for update;
 id 
----
  2
(1 row)

2=> select id from test where id = (
        select min(id) from test where c = 0
) for update; -- forced to wait on lock

1=> update test set c = 1 where id = 2;
1=> commit; -- client #2 continues:

2=> -- client #2 outputs:
 id 
----
  2
(1 row)

2=> select * from test where id = 2;
 id | name  | c 
----+-------+---
  2 | test2 | 1
(1 row)
2=> -- now this isn't what we initially asked for, let's just repeat the
query once more:

2=> select id from test where id = ( select min(id) from test where c = 0 )
for update;
 id 
----
  3
(1 row)


Conclusion: didn't work.
The situation one ends up in is one that contradicts your above statement,
but also seemingly violates the "I" in ACID. In 3), client #2 is clearly
affected by the actions of client #1. While serialized transactions or full
table locks would avoid this race condition, it either requires large
changes in the application or impacts performance during contention.

Adding an additional "c = 0" to the main query of 3) of course suppresses
the row from the concurrent update, but it shouldn't have been returned with
a value of "c = 1" because the transaction #2 started prior to the update
statement of #1.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

[BUGS] BUG #4925: "select ... for update" doesn't affect rows from sub-query

Reply via email to