On Feb 14, 2014, at 13:36, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2014-02-14 13:28:47 +0100, Florian Pflug wrote:
>>> I don't think that can actually happen because the head of the wait list
>>> isn't the lock holder's lwWaitLink, but LWLock->head. I thought the same
>>> for a while...
>> 
>> Hm, true, but does that protect us under all circumstances? If another
>> backend grabs the lock before B gets a chance to do so, B will become the
>> wait queue's head, and anyone who enqueues behind B will do so by updating
>> B's lwWaitLink. That is then undone by the delayed reset of B's lwWaitLink
>> by the original lock holder.
>> 
>> The specific sequence I have in mind is:
>> 
>> A: Take lock
>> B: Enqueue
>> A: Release lock, set B's lwWaiting to false
>> D: Acquire lock
>> B: Enqueue after spurious wakeup
>>   (lock->head := B)
>> C: Enqueue behind B
>>   (B->lwWaitLink := C, lock->tail := C)
>> A: Set B's lwWaitLink back to NULL, thereby corrupting the queue
>>   (B->lwWaitLink := NULL)
>> D: Release lock, dequeue and wakeup B
>>   (lock->head := B->lwWaitLink, i.e. lock->head := NULL)
>> B: Take and release lock, queue appears empty!
>>   C blocks indefinitely.
> 
> That's assuming either reordering by the compiler or an out-of-order
> store architecture, right?

Yes, it requires that a backend exit the PGSemaphoreLock loop in
LWLockAcquire before the previous lock holder's LWLockRelease has reset its
lwWaitLink to NULL. Only if that happens is there a risk of the PGPROC
already sitting on another wait queue by the time lwWaitLink is finally
reset to NULL.
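
For reference, here's roughly what the two code paths look like; this is
paraphrased from lwlock.c and simplified, so take the details with a grain
of salt. The release side walks the already-dequeued chain *after* dropping
the spinlock:

    /* LWLockRelease: wake up the backends removed from the queue */
    while (head != NULL)
    {
        proc = head;
        head = proc->lwWaitLink;
        proc->lwWaitLink = NULL;    /* (1) this store can be delayed...  */
        proc->lwWaiting = false;    /* (2) ...past this one by the
                                     * compiler or an out-of-order CPU  */
        PGSemaphoreUnlock(&proc->sem);
    }

and the acquire side checks only lwWaiting before deciding the wakeup was
genuine:

    /* LWLockAcquire: sleep until removed from the queue */
    for (;;)
    {
        PGSemaphoreLock(&proc->sem, false);
        if (!proc->lwWaiting)
            break;      /* B can get here, re-enqueue, and later have
                         * its lwWaitLink clobbered by the store at (1) */
        extraWaits++;
    }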

>>>> I wonder whether LWLockRelease really needs to update lwWaitLink at all.
>>>> We take the backends we awake off the queue by updating the queue's head
>>>> and tail, so the contents of lwWaitLink should only matter once the
>>>> backend is re-inserted into some wait queue. But when doing that, we
>>>> reset lwWaitLink back to NULL anyway.
>>> 
>>> I don't like that, this stuff is hard to debug already, having stale
>>> pointers around doesn't help. I think we should just add the barrier in
>>> the releases with barrier.h and locally use a volatile var in the
>>> branches before that.
>> 
>> I like the idea of updating lwWaiting and lwWaitLink while still holding
>> the spinlock better. The costs are probably negligible, and it'd make the
>> code much easier to reason about.
> 
> I agree we should do that, but imo not in the backbranches. Anything
> more than the minimal fix in that code should be avoided in the
> stable branches, this stuff is friggin performance sensitive, and the
> spinlock already is a *major* contention point in many workloads.

No argument there. But unless I missed something, there actually is a bug
there, and using volatile won't fix it. A barrier would, but what about the
back branches that don't have barrier.h? AFAICS the only other choices we
have on these branches are to either ignore the bug - it's probably
*extremely* unlikely to be hit in practice - or to use a spinlock
acquire/release cycle to create a barrier. The former leaves me with a bit
of an uneasy feeling, and the latter quite certainly has a larger
performance impact than moving the PGPROC updates into the section
protected by the spinlock and using an array to remember the backends to
wake up.
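
To be concrete, here's a rough and entirely untested sketch of the latter.
The wakeup list is collected into a local array while the spinlock is still
held; the array size is a placeholder, real code would need to size or
chunk it properly:

    /*
     * In LWLockRelease, while still holding lock->mutex. "head" points
     * at the chain of waiters already detached from lock->head.
     */
    PGPROC     *wakeup[64];     /* placeholder size, see above */
    int         nwakeup = 0;

    while (head != NULL)
    {
        PGPROC *proc = head;

        head = proc->lwWaitLink;
        proc->lwWaitLink = NULL;    /* reset the links while the   */
        proc->lwWaiting = false;    /* spinlock still protects us  */
        wakeup[nwakeup++] = proc;
    }

    SpinLockRelease(&lock->mutex);

    /* only the semaphore kicks happen outside the spinlock */
    while (nwakeup > 0)
        PGSemaphoreUnlock(&wakeup[--nwakeup]->sem);

Since re-enqueueing requires taking the spinlock, a woken backend cannot
get back onto the queue until after we've done both stores, so - assuming
the spinlock primitives provide the usual acquire/release ordering - the
reordering problem goes away without an explicit barrier.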

best regards,
Florian Pflug


