On 14.02.2011 20:10, Kevin Grittner wrote:
> Promotion of the lock granularity on the prior tuple is where we
> have problems. If the two tuple versions are in separate pages then
> the second UPDATE could miss the conflict.  My first thought was to
> fix that by requiring promotion of a predicate lock on a tuple to
> jump straight to the relation level if nextVersionOfRow is set for
> the lock target and it points to a tuple in a different page.  But
> that doesn't cover a situation where we have a heap tuple predicate
> lock which gets promoted to page granularity before the tuple is
> updated.  To handle that we would need to say that an UPDATE to a
> tuple on a page which is predicate locked by the transaction would
> need to be promoted to relation granularity if the new version of
> the tuple wasn't on the same page as the old version.

Yeah, promoting the original lock on the UPDATE was my first thought too.

Another idea is to duplicate the original predicate lock on the first update, so that the original reader holds a lock on both row versions. I think that would ultimately be simpler, as we wouldn't need the next-prior chains anymore.

For example, suppose that transaction X is holding a predicate lock on tuple A. Transaction Y updates tuple A, creating a new tuple B. Transaction Y sees that X holds a lock on tuple A (or the page containing A), so it acquires a new predicate lock on tuple B on behalf of X.
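To make that concrete, here's a minimal sketch of the duplicate-on-update idea. All of the names and structures below (PredicateLock, AcquireTupleLock, DuplicateLocksForNewVersion, the flat lock array) are hypothetical simplifications for illustration, not the actual predicate lock manager data structures:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LOCKS 16

typedef struct
{
    uint32_t page;      /* heap page number */
    uint16_t offset;    /* line pointer within the page */
} TupleTarget;

typedef struct
{
    TupleTarget target; /* tuple version the lock covers */
    int         holder; /* id of the reading transaction */
    uint32_t    xmin;   /* xmin of the tuple when the lock was taken */
    int         valid;
} PredicateLock;

static PredicateLock locks[MAX_LOCKS];
static int nlocks = 0;

static void
AcquireTupleLock(TupleTarget t, int holder, uint32_t xmin)
{
    locks[nlocks].target = t;
    locks[nlocks].holder = holder;
    locks[nlocks].xmin = xmin;
    locks[nlocks].valid = 1;
    nlocks++;
}

/*
 * On UPDATE: for every lock held on the old tuple version, acquire an
 * equivalent lock on the new version on behalf of the same reader, so
 * the reader covers both versions and no next/prior chain is needed.
 */
static void
DuplicateLocksForNewVersion(TupleTarget oldt, TupleTarget newt,
                            uint32_t new_xmin)
{
    int i;
    int n = nlocks;     /* snapshot: don't iterate over locks we add */

    for (i = 0; i < n; i++)
    {
        if (locks[i].valid &&
            locks[i].target.page == oldt.page &&
            locks[i].target.offset == oldt.offset)
            AcquireTupleLock(newt, locks[i].holder, new_xmin);
    }
}
```

In this sketch, transaction Y updating tuple A would call DuplicateLocksForNewVersion(A, B, xmin_of_B), giving X's lock on A a twin on B, regardless of which page B lands on.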

If the updater aborts, the lock on the new tuple needs to be cleaned up so that it doesn't get confused with a later tuple stored in the same physical location. We could store the xmin of the tuple in the predicate lock to check for that. Whenever you check for a conflict, if the xmin of the lock doesn't match the xmin of the tuple, you know that the lock belonged to an old dead tuple stored in the same location, and it can simply be removed since the tuple doesn't exist anymore.
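The xmin check itself could look something like the sketch below; again, the structure and function names are made up for illustration rather than taken from the real lock manager:

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint32_t lock_xmin; /* xmin of the tuple when the lock was taken */
    int      valid;
} PredicateLock;

/*
 * At conflict-check time, compare the xmin recorded in the lock with
 * the xmin of whatever tuple now occupies that physical location.  A
 * mismatch means the originally locked tuple is gone (e.g. the updater
 * aborted and the slot was reused), so the stale lock can be dropped
 * rather than reported as a conflict.
 */
static int
PredicateLockStillValid(PredicateLock *lock, uint32_t tuple_xmin)
{
    if (lock->lock_xmin != tuple_xmin)
    {
        lock->valid = 0;    /* stale: location was reused by a new tuple */
        return 0;
    }
    return 1;
}
```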

> That said, the above is about eliminating false negatives from some
> corner cases which escaped notice until now.  I don't think the
> changes described above will do anything to prevent the problems
> reported by YAMAMOTO Takashi.

Agreed, it's a separate issue. Although if we change the way we handle the read-update-update problem, the other issue might go away too.

> Unless I'm missing something, it
> sounds like tuple IDs are being changed or reused while predicate
> locks are held on the tuples.  That's probably not going to be
> overwhelmingly hard to fix if we can identify how that can happen.
> I tried to cover HOT issues, but it seems likely I missed something.

Storing the xmin of the original tuple would probably help with that too. But it would be nice to understand and be able to reproduce the issue first.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
