v2->v3:
  - Use the abbreviation TP for the new futexes instead of TO.
  - Make a number of changes according to review comments from
    ThomasG, PeterZ and MikeG.
  - Break the main futex patch into smaller pieces to make them
    easier to review.
  - Integrate the microbenchmark into the "perf bench futex" tool so
    that others can use it too.

v1->v2:
  - Adds an explicit lock hand-off mechanism.
  - Adds timeout support.
  - Simplifies the required userspace code.
  - Fixes a number of problems in the v1 code.

This patchset introduces a new futex implementation called
throughput-optimized (TP) futexes. It is similar to PI futexes in its
calling convention, but provides better throughput than the wait-wake
futexes by encouraging more lock stealing and optimistic spinning,
especially when the lock holders don't sleep and the critical sections
are not very short.

A manpage patch that documents changes to the futex(2) system call
will be sent out separately.

Patch 1 consolidates the duplicated timer setup code.

Patch 2 renames the futex_pi_state structure to futex_state as it
will no longer be specific to the PI futexes. Some futex_pi_state
related functions are also renamed.

Patch 3 adds some helper functions to be used in later patches.

Patch 4 consolidates the code that adds futex_state objects to, or
deletes them from, pi_state_list.

Patch 5 adds a new type field to the futex_state structure.

Patch 6 changes the futex_hash_bucket to include another list for
futex state objects and a spinlock to guard the new list.

Patch 7 is the core of this patchset and introduces the new TP
futexes.

Patch 8 enables more robust handling when futex owners die
unexpectedly.

Patch 9 implements the lock handoff mechanism to prevent lock
starvation.

Patch 10 enables the FUTEX_LOCK call to return a status code to show
how the lock is acquired.

Patch 11 adds code to support the specification of a timeout value in
the FUTEX_LOCK call.

Patch 12 adds a new documentation file about the TP futexes.

Patch 13 adds the futex microbenchmark that was used to produce the
performance data to the "perf bench futex" tool. Others can then use
it to do their own evaluation.

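For readers unfamiliar with the PI futex calling convention that the
TP futexes are said to resemble, here is a minimal userspace sketch.
It is not code from this series. It uses the existing FUTEX_LOCK_PI
and FUTEX_UNLOCK_PI operations as stand-ins, because the TP opcodes
are only added by this patchset and are not in released headers. The
names demo_lock(), demo_unlock() and sys_futex() are hypothetical, and
it is an assumption that the TP userspace fast path looks the same.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Futex word: 0 means unlocked, otherwise it holds the owner's TID. */
static _Atomic uint32_t lock_word;

/* Thin wrapper; glibc does not export a futex() function. */
static long sys_futex(_Atomic uint32_t *uaddr, int op)
{
	return syscall(SYS_futex, uaddr, op, 0, NULL, NULL, 0);
}

/* Hypothetical helper: returns 0 on success, -1 with errno set. */
static int demo_lock(_Atomic uint32_t *w)
{
	uint32_t expected = 0;
	uint32_t tid = (uint32_t)syscall(SYS_gettid);

	/* Fast path: uncontended acquire entirely in userspace. */
	if (atomic_compare_exchange_strong(w, &expected, tid))
		return 0;

	/*
	 * Slow path: let the kernel block or acquire on our behalf.
	 * Stand-in opcode; a TP futex would pass the series' own
	 * FUTEX_LOCK operation here instead (assumption).
	 */
	return (int)sys_futex(w, FUTEX_LOCK_PI);
}

/* Hypothetical helper: returns 0 on success, -1 with errno set. */
static int demo_unlock(_Atomic uint32_t *w)
{
	uint32_t expected = (uint32_t)syscall(SYS_gettid);

	/* Fast path: no waiter bits set, so just clear the TID. */
	if (atomic_compare_exchange_strong(w, &expected, 0))
		return 0;

	/* Slow path: waiters exist, the kernel must hand the lock over. */
	return (int)sys_futex(w, FUTEX_UNLOCK_PI);
}
```

In the uncontended case neither helper enters the kernel. Only a
failed compare-and-exchange falls back to the futex(2) system call,
which is where a TP futex would apply its lock stealing and optimistic
spinning.
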
Waiman Long (13):
  futex: Consolidate duplicated timer setup code
  futex: Rename futex_pi_state to futex_state
  futex: Add helpers to get & cmpxchg futex value without lock
  futex: Consolidate pure pi_state_list add & delete codes to helpers
  futex: Add a new futex type field into futex_state
  futex: Allow direct attachment of futex_state objects to hash bucket
  futex: Throughput-optimized (TP) futexes
  futex: Enable robust handling of TP futexes
  futex: Implement lock handoff for TP futexes to prevent lock starvation
  futex: Inform FUTEX_LOCK callers on how the lock is acquired
  futex: Add timeout support to TP futexes
  futex, doc: TP futexes document
  perf bench: New microbenchmark for userspace mutex performance

 Documentation/00-INDEX         |   2 +
 Documentation/tp-futex.txt     | 147 +++++++
 include/linux/sched.h          |   4 +-
 include/uapi/linux/futex.h     |   4 +
 kernel/futex.c                 | 943 +++++++++++++++++++++++++++++++++++-----
 tools/perf/bench/Build         |   1 +
 tools/perf/bench/bench.h       |   1 +
 tools/perf/bench/futex-mutex.c | 558 ++++++++++++++++++++++++
 tools/perf/bench/futex.h       |  25 +
 tools/perf/builtin-bench.c     |   4 +
 10 files changed, 1575 insertions(+), 114 deletions(-)
 create mode 100644 Documentation/tp-futex.txt
 create mode 100644 tools/perf/bench/futex-mutex.c