Michael Fuhr wrote:
> On Sun, Mar 27, 2005 at 12:54:28AM -0600, Guy Rouillier wrote:
>> I'm getting the following in the server log:
>> 
>> 2005-03-27 06:04:21 GMT estat DETAIL:  Process 20928 waits for
>> ShareLock on transaction 7751823; blocked by process 20929.
>>      Process 20929 waits for ShareLock on transaction 7768115;
blocked by
>> process 20928. 2005-03-27 06:04:21 GMT estat CONTEXT:  SQL statement
>> "SELECT 1 FROM ONLY "rumba"."service_plane" x WHERE
>> "service_plane_id" = $1 FOR UPDATE OF x"
> ...
>> The service_plane table is a reference table, i.e., a fixed set of
>> values used only to validate foreign keys.  So the code doesn't have
>> any update statements on that table.  I'm assuming PostgreSQL is
>> generating that SQL to validate the foreign key.  But why is it
>> selecting for update?
> 
> To make sure the referenced key can't change until the transaction
> completes and the referencing row becomes visible to other
> transactions (or is rolled back) -- otherwise other transactions
> could change or delete the referenced key and not know they'd be
> breaking your referential integrity.  The current implementation
> supports only exclusive row-level locks (SELECT FOR UPDATE), but I
> think Alvaro might be working on shared row-level locks for a future
> release.       

Michael, thanks for the reply.  The current approach seems pretty...
fragile.  I encountered this deadlock because I have two tables (call
them T1 and T2), each with a foreign key to the same reference table.
I'm inserting about 2 million rows a day into each of these two tables.
Rows arrive in logical batches, so to help out performance, I only
commit after inserting every N rows. The deadlock arises because thread1
inserts some set of rows into T1 while thread2 inserts a set of rows
into T2.  The cardinality of the reference table is very small (only
about 10 rows) so the inserts into T1 and T2 are virtually guaranteed to
both reference the same values.

I understand this is easy for me to say since I'm not writing the PG
code, but multiple clients should be able to read reference tables
simultaneously.  Seems like a "prevent write" lock should be placed on
the rows in the reference table rather than an "exclusive" lock.  That
way as many clients as necessary can read the values, but if someone
comes along and tries to change a row, just that one client gets an
error.

With the current implementation, it appears I need to either (1) always
commit after every inserted row, or (2) single thread my entire insert
logic.  Neither of these two alternatives is very desirable.  And it is
only a partial fix (which will work in my case since I'm the only one
updating this database.)  In the general case though, where other
programmers may be writing code that I may not even know about to update
parts of this database, avoiding this type of deadlock becomes very
difficult.  It pretty much requires that everyone know what everyone
else is doing.

-- 
Guy Rouillier


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

Reply via email to