On 08/10, Peter Zijlstra wrote:
>
> On Tue, Aug 09, 2016 at 04:47:38PM -0700, John Stultz wrote:
> > On Tue, Aug 9, 2016 at 2:51 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> > >
> > > Currently the percpu-rwsem switches to (global) atomic ops while a
> > > writer is waiting, which could be quite a while and slows down
> > > releasing the readers.
> > >
> > > This patch cures this problem by ordering the reader-state vs
> > > reader-count (see the comments in __percpu_down_read() and
> > > percpu_down_write()). This changes a global atomic op into a full
> > > memory barrier, which doesn't have the global cacheline contention.
> > >
> > > This also enables using the percpu-rwsem with rcu_sync disabled in order
> > > to bias the implementation differently, reducing the writer latency by
> > > adding some cost to readers.
> >
> > So this by itself doesn't help us much, but including the following
> > from Oleg does help quite a bit:
>
> Correct, Oleg was going to send his rcu_sync rework on top of this. But
> since it's holiday season things might be a tad delayed.
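To make the ordering the quoted changelog describes concrete, here is a
minimal userspace sketch using C11 atomics. It is illustrative only: the
names read_count and readers_block stand in for the kernel's per-cpu
machinery, and this is not the actual __percpu_down_read() /
percpu_down_write() code.

#include <stdatomic.h>
#include <stdbool.h>

static _Atomic int read_count;      /* stand-in for the per-cpu reader count */
static _Atomic int readers_block;   /* the "reader-state" a writer sets */

static bool reader_fast_path(void)
{
	/* Publish the reader count first... */
	atomic_fetch_add_explicit(&read_count, 1, memory_order_relaxed);
	/* ...then a full barrier orders count vs. state; this is the
	 * "full memory barrier" that replaces the global atomic op. */
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&readers_block, memory_order_relaxed))
		return true;	/* no writer pending: fast path succeeded */
	/* Writer pending: undo the count and fall back to the slow path. */
	atomic_fetch_sub_explicit(&read_count, 1, memory_order_relaxed);
	return false;
}

static void writer_block_readers(void)
{
	/* Set the state first... */
	atomic_store_explicit(&readers_block, 1, memory_order_relaxed);
	/* ...then a full barrier pairing with the reader's fence, so
	 * either the reader sees readers_block or we see its count. */
	atomic_thread_fence(memory_order_seq_cst);
	while (atomic_load_explicit(&read_count, memory_order_relaxed))
		;	/* the kernel sums per-cpu counters and sleeps here */
}

Because both fences are full barriers, at least one side must observe the
other: a reader either sees the writer's readers_block and backs off, or
the writer sees that reader's count and waits, with no global atomic op on
the reader side.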
Yeees. The patch is ready and even seems to work, but I still can't
(re)write the comments. Will try to finish tomorrow.

But: we need something simple and backportable, so I am going to send the
patch below first. As you can see, this is your "sabotage" change; only
the new helper was renamed, plus the s/!GP_IDLE/GP_PASSED/ fix.

And the only reason I can't send it today is that somehow I can't write
the changelog ;) So I would be really happy if you sent this change
instead of me; I am going to keep "From: peterz" anyway.

Oleg.

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index a63a33e..ece7ed9 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -59,6 +59,7 @@ static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
 }
 
 extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
+extern void rcu_sync_enter_start(struct rcu_sync *);
 extern void rcu_sync_enter(struct rcu_sync *);
 extern void rcu_sync_exit(struct rcu_sync *);
 extern void rcu_sync_dtor(struct rcu_sync *);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 75c0ff0..614f9cd 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5609,6 +5609,8 @@ int __init cgroup_init(void)
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_dfl_base_files));
 	BUG_ON(cgroup_init_cftypes(NULL, cgroup_legacy_base_files));
 
+	rcu_sync_enter_start(&cgroup_threadgroup_rwsem.rss);
+
 	get_user_ns(init_cgroup_ns.user_ns);
 
 	mutex_lock(&cgroup_mutex);
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index be922c9..c9b7bc8 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -83,6 +83,18 @@ void rcu_sync_init(struct rcu_sync *rsp, enum rcu_sync_type type)
 }
 
 /**
+ * Must be called after rcu_sync_init() and before first use.
+ *
+ * Ensures rcu_sync_is_idle() returns false and rcu_sync_{enter,exit}() pairs
+ * turn into NO-OPs.
+ */
+void rcu_sync_enter_start(struct rcu_sync *rsp)
+{
+	rsp->gp_count++;
+	rsp->gp_state = GP_PASSED;
+}
+
+/**
  * rcu_sync_enter() - Force readers onto slowpath
  * @rsp: Pointer to rcu_sync structure to use for synchronization
  *
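For readers of the thread, a minimal userspace model of the state change
the new helper performs may help. Everything below is an illustrative
sketch: the enum is assumed to mirror the gp_state values in
kernel/rcu/sync.c, and rcu_sync_model / is_idle / enter_start are
hypothetical names, not kernel API.

#include <assert.h>

/* Assumed to mirror the gp_state encoding in kernel/rcu/sync.c. */
enum { GP_IDLE = 0, GP_PENDING, GP_PASSED };

struct rcu_sync_model {
	int gp_state;
	int gp_count;
};

/* idle => percpu-rwsem readers may use the per-cpu fast path */
static int is_idle(struct rcu_sync_model *rsp)
{
	return rsp->gp_state == GP_IDLE;
}

/* What rcu_sync_enter_start() does: behave as if one rcu_sync_enter()
 * already ran and its grace period already passed, without ever
 * waiting for synchronize_sched(). */
static void enter_start(struct rcu_sync_model *rsp)
{
	rsp->gp_count++;
	rsp->gp_state = GP_PASSED;
}

int main(void)
{
	struct rcu_sync_model rss = { GP_IDLE, 0 };

	enter_start(&rss);
	assert(!is_idle(&rss));	/* readers take the slow path from boot on */
	return 0;
}

Since the reference taken by the boot-time call is never dropped, gp_count
never reaches zero and the structure never returns to GP_IDLE: readers of
cgroup_threadgroup_rwsem permanently take the slow path, later
rcu_sync_{enter,exit}() pairs reduce to counter updates, and the writer in
__cgroup_procs_write() no longer has to wait for a grace period.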