I think there's a race condition in mminsert if two backends insert a tuple into the same heap page range concurrently. mminsert does this:

1. Fetch the MMtuple for the page range
2. Check if any of the stored datums need updating
3. Unlock the page.
4. Lock the page again in exclusive mode.
5. Update the tuple.

It's possible that two backends arrive at step 3 at the same time, with different values. For example, backend A wants to update the minimum to 10, and backend B wants to update it to 5. Now, if backend B gets the exclusive lock first and updates the tuple to 5, backend A will overwrite it with 10 when it gets the lock in turn, which is wrong: the minimum should stay at 5.
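To make the interleaving concrete, here is a small single-threaded replay of it. This is only a sketch, not the actual mminsert code: RangeSummary, needs_update, blind_update and replay_race are made-up names, the summary is reduced to a single integer minimum, and the locks are represented only by the order of the calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the summary tuple of one page range. */
typedef struct
{
    int min;
} RangeSummary;

/* Step 2: decide whether the new value would lower the stored minimum. */
static bool
needs_update(const RangeSummary *s, int newval)
{
    assert(s != NULL);
    return newval < s->min;
}

/* Step 5 as described: overwrite without looking at the tuple again. */
static void
blind_update(RangeSummary *s, int newval)
{
    assert(s != NULL);
    s->min = newval;
}

/*
 * Replays the problematic schedule: both backends finish step 2 before
 * either reaches step 5, then B updates first and A second.
 * Returns the minimum that ends up stored.
 */
static int
replay_race(int initial_min, int a_val, int b_val)
{
    RangeSummary s = {initial_min};

    /* Steps 1-2 for both backends, against the same old tuple. */
    bool a_needs = needs_update(&s, a_val);
    bool b_needs = needs_update(&s, b_val);

    /* Steps 3-5: B wins the exclusive lock, then A. */
    if (b_needs)
        blind_update(&s, b_val);
    if (a_needs)
        blind_update(&s, a_val);

    return s.min;
}
```

With an assumed starting minimum of 100, replay_race(100, 10, 5) leaves 10 in the tuple even though a value of 5 was inserted into the range.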

The simplest solution would be to get the buffer lock in exclusive mode to begin with, so that you don't need to release it between steps 2 and 5. That might be a significant hit on concurrency, though, when most of the insertions don't in fact have to update the value. Another idea is to re-check the stored values after acquiring the lock in exclusive mode, to see if they still match the values seen in step 2, and to redo the comparison if they don't.
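The second idea could look roughly like the following. Again this is just a sketch under my own assumptions: RangeSummary, update_with_recheck and replay_with_recheck are hypothetical names, the summary is a single integer minimum, and the locking is implied by the call order rather than done for real.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the summary tuple of one page range. */
typedef struct
{
    int min;
} RangeSummary;

/*
 * Step 5 with a re-check, meant to run with the exclusive lock held.
 * seen_min is the minimum this backend read in step 2.  If the tuple
 * changed in the meantime, the step-2 decision is redone against the
 * current contents before anything is written.
 */
static void
update_with_recheck(RangeSummary *s, int seen_min, int newval)
{
    assert(s != NULL);

    if (s->min != seen_min && !(newval < s->min))
        return;             /* someone else already stored a lower value */

    s->min = newval;
}

/*
 * Same schedule as the failing case: both backends decide against the
 * old tuple, then B updates first and A second.
 */
static int
replay_with_recheck(int initial_min, int a_val, int b_val)
{
    RangeSummary s = {initial_min};
    int seen = s.min;       /* what both backends read in step 2 */
    bool a_needs = a_val < seen;
    bool b_needs = b_val < seen;

    if (b_needs)
        update_with_recheck(&s, seen, b_val);
    if (a_needs)
        update_with_recheck(&s, seen, a_val);

    return s.min;
}
```

Here replay_with_recheck(100, 10, 5) keeps 5, because backend A notices the tuple no longer matches what it saw and finds its value is not lower than the current minimum. The common case, where step 2 finds nothing to update, still never takes the exclusive lock.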

- Heikki


