On 07/30/2015 09:14 PM, Andres Freund wrote:
On 2015-07-30 17:36:52 +0300, Heikki Linnakangas wrote:
In 9.4, LWLockAcquire holds the spinlock when it marks the lock as held,
until it has updated the variable. And LWLockWaitForVar() holds the spinlock
when it checks that the lock is held and that the variable's value matches.
So it cannot happen on 9.4.

The first paragraph talked about "the same value", but that was just
referring to it not yet having been cleared i think...

To reiterate, with 9.5, it's possible that a backend is sleeping in
LWLockWaitForVar(oldvar=123), even though the lock is currently held by
another backend with value 124. That seems wrong, or surprising at the very
least.

With my patch that can't really happen that way though? The value is
re-checked after queuing. If it has changed by then we're done. And if
it hasn't yet changed we're guaranteed to be woken up once it's being
changed?

Ok, let me try explaining it again. With your patch, LWLockAcquireWithVar looks like this (simplified, assuming the lock is free):

1. Set LW_VAL_EXCLUSIVE
3. (Acquire spinlock) Set *valptr = val (Release spinlock)

LWLockWaitForVar looks like this:

1. Check if lock is free
2. (Acquire Spinlock) Read *valptr, compare with oldval (Release spinlock)
3. LWLockQueueSelf
4. Re-check that *valptr is still equal to oldval
5. sleep.

This is the race condition:

Backend A                       Backend B
---------                       ---------
LWLockAcquireWithVar(123)
  Set LW_VAL_EXCLUSIVE
  Set *valptr = 123
                                LWLockWaitForVar(123)
                                  See that the lock is held
                                  Read *valptr, it matches
                                  LWLockQueueSelf
LWLockRelease()
LWLockAcquireWithVar(124)
  Set LW_VAL_EXCLUSIVE
                                  wakes up
                                  See that the lock is still/again held
                                  Read *valptr, it's still 123
                                  LWLockQueueSelf
                                  Re-check that *valptr is still 123.
                                  go to sleep.
  Set *valptr = 124


Now, Backend B's LWLockWaitForVar(123) call is sleeping, even though the lock was free'd and reacquired with different value, 124. It won't wake up until A updates the value or releases the lock again.

This was not possible in 9.4. It was possible in 9.4 too when the value was same in both LWLockAcquireWithVar() calls, and I think that's acceptable, but not when the values differ.

I generaly don't mind adding some sort of flag clearing or such, but I'd
rather not have it in the retry loop in the general LWLockAttemptLock
path - I found that very small amounts of instructions in there have a
measurable impact.

I doubt clearing a bit would matter there, although you might have a better instinct on that...

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to