On 2025-06-24 23:10, Boqun Feng wrote:
Hi,

This is the official first version of simple hazard pointers following
the RFC:

        
https://lore.kernel.org/lkml/20250414060055.341516-1-boqun.f...@gmail.com/

I rebased it onto v6.16-rc3 and hope to get more feedback this time.

Thanks a lot to Breno Leitao for trying the RFC out and sharing the numbers.

I did an extra comparison this time, between the shazptr solution and
the synchronize_rcu_expedited() solution. In my test, during a run of 100
"tc qdisc replace" invocations:

* IPI rate with the shazptr solution: ~14 per second per core.
* IPI rate with synchronize_rcu_expedited(): ~140 per second per core.

(IPI results were from the 'CAL' line in /proc/interrupts)
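
A minimal sketch of how such a per-core IPI rate could be derived from two
snapshots of /proc/interrupts. The helper names cal_counts() and
ipi_rate_per_core() are hypothetical, and the sketch assumes the usual
layout in which the 'CAL:' label is followed by one counter per CPU and
then a text description; it is not necessarily how the numbers above were
collected.

```python
def cal_counts(interrupts_text):
    """Return the per-CPU counters from the 'CAL' line of /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "CAL:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    raise ValueError("no CAL line found")


def ipi_rate_per_core(before_text, after_text, elapsed_seconds):
    """Average CAL interrupts per second per core between two snapshots."""
    before = cal_counts(before_text)
    after = cal_counts(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    return sum(deltas) / len(deltas) / elapsed_seconds
```

To use it, read /proc/interrupts once before and once after the workload,
and pass both strings together with the elapsed wall-clock time in seconds.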

This shows that while both solutions give a similar speedup, the shazptr
solution avoids the high IPI rate that synchronize_rcu_expedited()
introduces.

Feedback is welcome, and please let me know if there are any concerns or
suggestions. Thanks!

Hi Boqun,

What is unclear to me is the delta wrt:

https://lore.kernel.org/lkml/20241008135034.1982519-4-mathieu.desnoy...@efficios.com/

and whether this helper against compiler optimizations would still be needed 
here:

https://lore.kernel.org/lkml/20241008135034.1982519-2-mathieu.desnoy...@efficios.com/

Thanks,

Mathieu


Regards,
Boqun

--------------------------------------
Please find the old performance below:

On my system (a 96-CPU VM), the results of:

        time /usr/sbin/tc qdisc replace dev eth0 root handle 0x1: mq

are (with lockdep enabled):

        (without the patchset)
        real    0m1.039s
        user    0m0.001s
        sys     0m0.069s

        (with the patchset)
        real    0m0.053s
        user    0m0.000s
        sys     0m0.051s

i.e. almost 20x speed-up.
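
As a quick check of that figure, the ratio of the two "real" times above
can be computed directly. The variable names are illustrative only; the
values are copied from the timings quoted above.

```python
# Wall-clock ("real") times in seconds, taken from the two runs above.
real_without_patchset = 1.039
real_with_patchset = 0.053

# Speed-up is the ratio of the old elapsed time to the new one.
speedup = real_without_patchset / real_with_patchset
print(f"speed-up: {speedup:.1f}x")
```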

Other comparisons between RCU and shazptr: the rcuscale results (using
the default configuration from
tools/testing/selftests/rcutorture/bin/kvm.sh):

RCU:

        Average grace-period duration: 7470.02 microseconds
        Minimum grace-period duration: 3981.6
        50th percentile grace-period duration: 6002.73
        90th percentile grace-period duration: 7008.93
        99th percentile grace-period duration: 10015
        Maximum grace-period duration: 142228

shazptr:

        Average grace-period duration: 0.845825 microseconds
        Minimum grace-period duration: 0.199
        50th percentile grace-period duration: 0.585
        90th percentile grace-period duration: 1.656
        99th percentile grace-period duration: 3.872
        Maximum grace-period duration: 3049.05

shazptr (skip_synchronize_self_scan=1, i.e. always wake up the scan
kthread):

        Average grace-period duration: 467.861 microseconds
        Minimum grace-period duration: 92.913
        50th percentile grace-period duration: 440.691
        90th percentile grace-period duration: 460.623
        99th percentile grace-period duration: 650.068
        Maximum grace-period duration: 5775.46

shazptr_wildcard (i.e. readers always use SHAZPTR_WILDCARD):

        Average grace-period duration: 599.569 microseconds
        Minimum grace-period duration: 1.432
        50th percentile grace-period duration: 582.631
        90th percentile grace-period duration: 781.704
        99th percentile grace-period duration: 1160.26
        Maximum grace-period duration: 6727.53

shazptr_wildcard (skip_synchronize_self_scan=1):

        Average grace-period duration: 460.466 microseconds
        Minimum grace-period duration: 303.546
        50th percentile grace-period duration: 424.334
        90th percentile grace-period duration: 482.637
        99th percentile grace-period duration: 600.214
        Maximum grace-period duration: 4126.94

Boqun Feng (8):
   Introduce simple hazard pointers
   shazptr: Add refscale test
   shazptr: Add refscale test for wildcard
   shazptr: Avoid synchronize_shaptr() busy waiting
   shazptr: Allow skip self scan in synchronize_shaptr()
   rcuscale: Allow rcu_scale_ops::get_gp_seq to be NULL
   rcuscale: Add tests for simple hazard pointers
   locking/lockdep: Use shazptr to protect the key hashlist

  include/linux/shazptr.h  |  73 +++++++++
  kernel/locking/Makefile  |   2 +-
  kernel/locking/lockdep.c |  11 +-
  kernel/locking/shazptr.c | 318 +++++++++++++++++++++++++++++++++++++++
  kernel/rcu/rcuscale.c    |  60 +++++++-
  kernel/rcu/refscale.c    |  77 ++++++++++
  6 files changed, 534 insertions(+), 7 deletions(-)
  create mode 100644 include/linux/shazptr.h
  create mode 100644 kernel/locking/shazptr.c



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
