When blocking, we incur multiple barriers when setting the task's uninterruptible state. This is particularly bad when the lock keeps getting stolen from the task trying to acquire the sem. This patch proposes delaying the setting of the task's new state until we are sure that calling schedule() is inevitable.
This implies that we do the trylock and active check (both basically ->counter checks) as TASK_RUNNING. For the trylock we hold the wait lock with interrupts disabled, so no risk there. And for the active check, the window for which we could get interrupted is quite small and makes no tangible difference.

This patch increases Unixbench's 'execl' throughput by 25% on a 40 core machine.

Signed-off-by: Davidlohr Bueso <dbu...@suse.de>
---
 kernel/locking/rwsem-xadd.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 18a50da..88b3468 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -459,17 +459,27 @@ struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
 		count = rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
 
 	/* wait until we successfully acquire the lock */
-	set_current_state(TASK_UNINTERRUPTIBLE);
 	while (true) {
 		if (rwsem_try_write_lock(count, sem))
 			break;
+
+		__set_current_state(TASK_UNINTERRUPTIBLE);
 		raw_spin_unlock_irq(&sem->wait_lock);
 
-		/* Block until there are no active lockers. */
-		do {
+		/*
+		 * When there are active locks after we wake up,
+		 * the lock was probably stolen from us. Thus,
+		 * go immediately back to sleep and avoid taking
+		 * the wait_lock.
+		 */
+		while (true) {
 			schedule();
-			set_current_state(TASK_UNINTERRUPTIBLE);
-		} while ((count = sem->count) & RWSEM_ACTIVE_MASK);
+
+			count = READ_ONCE(sem->count);
+			if (!(count & RWSEM_ACTIVE_MASK))
+				break;
+			__set_current_state(TASK_UNINTERRUPTIBLE);
+		}
 
 		raw_spin_lock_irq(&sem->wait_lock);
 	}
-- 
2.1.2
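
For illustration only, the barrier accounting described in the changelog can be modelled in plain userspace C. This is a rough sketch, not kernel code: `struct model`, the `model_*()` helpers, `run_old_loop()` and `run_new_loop()` are all hypothetical names made up for this example, and the model assumes that the very first trylock fails, that the lock is then stolen `steals` times in a row, and that a wakeup leaves the task in TASK_RUNNING. It only counts how often each loop shape would perform a barrier-implying state store (set_current_state()) versus a plain one (__set_current_state()).

```c
#include <assert.h>
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

/* Hypothetical bookkeeping for the model; none of this is kernel API. */
struct model {
	enum task_state state;
	enum task_state state_at_trylock; /* task state seen by the last trylock */
	int barrier_stores;               /* stands in for set_current_state() */
	int plain_stores;                 /* stands in for __set_current_state() */
	int steals_left;                  /* wakeups that still find active lockers */
	bool first_try;                   /* assumption: the first trylock fails */
};

static void model_set_state_barrier(struct model *m, enum task_state s)
{
	m->state = s;
	m->barrier_stores++;
}

static void model_set_state_plain(struct model *m, enum task_state s)
{
	m->state = s;
	m->plain_stores++;
}

/* Assumption: whoever wakes us puts the task back in TASK_RUNNING. */
static void model_schedule(struct model *m)
{
	m->state = TASK_RUNNING;
}

/* Stands in for the (count & RWSEM_ACTIVE_MASK) check. */
static bool model_active(struct model *m)
{
	if (m->steals_left > 0) {
		m->steals_left--;
		return true;
	}
	return false;
}

static bool model_trylock(struct model *m)
{
	m->state_at_trylock = m->state;
	if (m->first_try) {
		m->first_try = false;
		return false;
	}
	return true;
}

/* Loop shape before the patch: barrier store up front and after every wakeup. */
struct model run_old_loop(int steals)
{
	struct model m = { .state = TASK_RUNNING, .steals_left = steals, .first_try = true };

	model_set_state_barrier(&m, TASK_UNINTERRUPTIBLE);
	while (true) {
		if (model_trylock(&m))
			break;
		do {
			model_schedule(&m);
			model_set_state_barrier(&m, TASK_UNINTERRUPTIBLE);
		} while (model_active(&m));
	}
	return m;
}

/* Loop shape after the patch: plain store, and only when we will sleep. */
struct model run_new_loop(int steals)
{
	struct model m = { .state = TASK_RUNNING, .steals_left = steals, .first_try = true };

	while (true) {
		if (model_trylock(&m))
			break;
		model_set_state_plain(&m, TASK_UNINTERRUPTIBLE);
		while (true) {
			model_schedule(&m);
			if (!model_active(&m))
				break;
			model_set_state_plain(&m, TASK_UNINTERRUPTIBLE);
		}
	}
	return m;
}
```

Under these assumptions, with the lock stolen three times, the old loop shape performs five barrier-implying stores and does its final trylock as TASK_UNINTERRUPTIBLE, while the new shape performs none (four plain stores) and does every trylock as TASK_RUNNING. The model says nothing about the actual cost of a barrier or about real scheduling races.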