On Thu, Feb 11, 2016 at 9:36 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Feb 11, 2016 at 9:29 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Haas <rh...@postgresql.org> writes:
>>> Add some isolation tests for deadlock detection and resolution.
>>
>> Buildfarm says this needs work ...
>>
>> dromedary is one of mine, do you need me to look into what is
>> happening?
>
> That would be great. Taking a look at what happened, I have a feeling
> this may be a race condition of some kind in the isolation tester. It
> seems to have failed to recognize that a1 started waiting, and that
> caused the "deadlock detected" message to be reported differently. I'm
> not immediately sure what to do about that.

Yeah, so: try_complete_step() waits 10ms, and if it still hasn't gotten any data back from the server, then it uses a separate query to see whether the step in question is waiting on a lock. So what must've happened here is that it took more than 10ms for the process to show up as waiting in pg_stat_activity.

It might be possible to fix this by not passing STEP_NONBLOCK if there's only one connection that isn't waiting. I think I had it like that at one point, and then took it out because it caused some other problem. Another option is to lengthen the timeout. It doesn't seem great to be dependent on a fixed timeout, but the server doesn't send any protocol traffic to indicate a lock wait. If we declared which steps are supposed to wait, then there'd be no ambiguity, but that seems like a drag.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
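
To make the suspected race concrete, here is a minimal sketch of one wait-then-check cycle as described in the message above. It is a toy model, not the isolation tester's code: the function name classify_step, its parameters, and the single-shot check are all assumptions made for illustration, and time is simulated rather than measured.

```python
from typing import Optional


def classify_step(
    result_after_ms: Optional[int],
    waiting_visible_after_ms: Optional[int],
    timeout_ms: int = 10,
) -> str:
    """Model one wait-then-check cycle (hypothetical, simplified).

    result_after_ms: when the server returns data for the step, or None
        if the step is blocked and returns nothing.
    waiting_visible_after_ms: when the lock wait becomes visible to the
        separate "is it waiting?" query, or None if it never does.
    timeout_ms: how long the tester waits before running that query.
    """
    # Data arrived within the timeout: the step simply completed.
    if result_after_ms is not None and result_after_ms <= timeout_ms:
        return "completed"
    # No data yet: ask whether the session shows up as waiting on a lock.
    if (
        waiting_visible_after_ms is not None
        and waiting_visible_after_ms <= timeout_ms
    ):
        return "waiting"
    # Neither data nor a visible lock wait: the tester cannot tell yet.
    return "unknown"


# A blocked step whose wait is visible after 5ms is recognized.
print(classify_step(None, 5))
# The same step, visible only after 15ms, is missed with a 10ms timeout.
print(classify_step(None, 15))
# Lengthening the timeout hides the race but does not remove it.
print(classify_step(None, 15, timeout_ms=50))
```

In this model the outcome depends only on whether the wait becomes visible before the fixed timeout expires, which is why a longer timeout narrows the window without closing it.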