On Wed, 10 Jun 2026 10:11:45 +0100
Eimear Morrissey <[email protected]> wrote:
> Add new downgrade option for pflock. Add stress tests for this &
> by extension the rest of the pflock/rwlock libraries.
>
> Eimear Morrissey (1):
> app/test: add stress tests for rwlock and pflock
>
> Konstantin Ananyev (1):
> eal/pflock: add API to downgrade from wr to rd lock
>
> app/test/meson.build | 2 +
> app/test/test_pflock_stress.c | 76 ++++++
> app/test/test_rwlock_stress.c | 59 +++++
> app/test/test_rwlock_stress_impl.h | 393 +++++++++++++++++++++++++++++
> lib/eal/include/rte_pflock.h | 21 ++
> 5 files changed, 551 insertions(+)
> create mode 100644 app/test/test_pflock_stress.c
> create mode 100644 app/test/test_rwlock_stress.c
> create mode 100644 app/test/test_rwlock_stress_impl.h
>
Interesting idea, lots of feedback from AI. Mostly about the test.
Patch 1/2 (eal/pflock: add API to downgrade from wr to rd lock)
Warning: new public API is not marked __rte_experimental.
New APIs (including static inline) should carry the experimental tag
for at least one release per the ABI policy:
__rte_experimental
static inline void
rte_pflock_write_downgrade(rte_pflock_t *pf)
Warning: new EAL API added without a release notes entry.
Please add a note to doc/guides/rel_notes/release_26_07.rst.
Info: the hunk adds a double blank line before the #ifdef __cplusplus,
and the two atomic calls use different continuation indentation
(one tab vs two). checkpatch will complain about the blank lines.
Patch 2/2 (app/test: add stress tests for rwlock and pflock)
Error: read lock is leaked on reader error paths, hanging the test.
handle_error() with write_lock=false does not unlock, and its comment
claims the lock is "already unlocked by the calling function" -- but
handle_reader_work() calls it while still holding the read lock (both
the array-mismatch path and the counter-changed path), and the
DOWNGRADE_TEST failure path in handle_writer_work() likewise calls it
while holding the downgraded read lock. The leaked reader keeps rd.out
from ever matching, so any writer blocks forever in write_lock() and
rte_eal_wait_lcore() never returns: a detected failure becomes a hang
instead of a test failure. Simplest fix is to have callers unlock
before calling handle_error() and drop the unlock from it entirely;
that also fixes the downgrade path incrementing reader_errors for a
writer thread.
Error: stop flag uses volatile instead of atomics.
"volatile bool stop" is written by workers (handle_error) and the main
lcore and polled by all workers. volatile provides no ordering or
atomicity guarantee; use RTE_ATOMIC(bool) with
rte_atomic_load_explicit/rte_atomic_store_explicit as
test_ring_stress_impl.h does for wrk_cmd.
The volatile on counter and counter_array is unnecessary -- they are
only accessed under the lock, which already provides ordering -- and
it defeats compiler optimization of the 1024-element verify loops.
Warning: DYNAMIC_ROLES does not switch roles.
should_be_writer() is called once in lcore_function() before the loop,
so each thread's role is fixed for the whole run; the flag's stated
purpose ("Threads can switch between reader/writer roles") never
happens. Move the role decision inside the while loop when
DYNAMIC_ROLES is set.
Warning: should_be_writer() assumes the main lcore has index 0.
unsigned int idx = rte_lcore_index(rte_lcore_id()) - 1;
With --main-lcore set to a non-lowest core, a worker can have index 0,
so idx underflows to UINT_MAX and the reader/writer split no longer
matches num_readers/num_writers. Compute the worker's position by
iterating RTE_LCORE_FOREACH_WORKER or skip the main lcore's index
explicitly.
Warning: trailing alignment attribute placement.
} __rte_cache_aligned;
on struct lcore_stats and struct rwlock_test_shared must be written as
struct __rte_cache_aligned lcore_stats {
checkpatch enforces this (required for MSVC).
Info: "max wait" statistic does not measure lock wait time.
acquire_time spans the entire iteration including the verify loops and
the configured reader/writer delays, so for long_hold it reports the
delay, not contention. Either time only the lock call or rename it.
Info: missing space in the summary printf string concatenation:
"%"PRIu64" writer ops," "total time: ..."
prints "writer ops,total time". Also the first element of
pflock_specific_tests is mis-indented (opening brace at column 0).