On Thu, Oct 03, 2024 at 10:01:57PM +0800, Z qiang wrote:
> When the rcuoc kthreads process RCU callbacks, before they invoke
> rcu_segcblist_add_len(&rdp->cblist, -count),
> rcu_barrier() can insert the rcu_barrier_callback() func into the offline
> CPU's rdp->cblist.
Excellent analysis! Indeed we can have:
CPU 0                             CPU 1                           CPU 2
-----                             -----                           -------
// deoffload                      // nocb_cb_wait                 // rcutorture
rcu_barrier()
    rcu_segcblist_entrain()
        rcu_segcblist_add_len(1);
                                  rcu_do_batch()
                                      rcu_barrier_callback()
                                                                  rcu_barrier()
                                                                      // still see
                                                                      // len == 1
                                                                      rcu_segcblist_entrain()
                                                                          rcu_segcblist_add_len(1);
                                      // decrement len
                                      rcu_segcblist_add_len(-1);
                                  kthread_parkme()
// Warn because there is
// still a pending barrier
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
And worst of all, the second rcu_barrier() is ignored.
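To make the ordering concrete, the interleaving above can be replayed in a small userspace model. This is only a sketch and not kernel code: struct model_cblist, model_add_len() and model_replay_race() are made-up names, and only the length counter of ->cblist is modelled, not the segmented list itself.

```c
#include <assert.h>

/* Hypothetical userspace model: only the ->cblist length is tracked. */
struct model_cblist {
	long len;	/* stands in for rcu_segcblist_n_cbs() */
};

/* Models rcu_segcblist_add_len(): adjust the length by v. */
static void model_add_len(struct model_cblist *cl, long v)
{
	cl->len += v;
	assert(cl->len >= 0);
}

/*
 * Replay the three-CPU interleaving sequentially and return the
 * length that is left when the rcuoc kthread parks.
 */
static long model_replay_race(void)
{
	struct model_cblist cl = { .len = 0 };
	long count;

	/* CPU 0: rcu_barrier() entrains its callback. */
	model_add_len(&cl, 1);

	/*
	 * CPU 1: rcu_do_batch() invokes rcu_barrier_callback(),
	 * but has not yet subtracted it from the length.
	 */
	count = 1;

	/* CPU 2: second rcu_barrier() still sees len == 1 and entrains. */
	if (cl.len > 0)
		model_add_len(&cl, 1);

	/* CPU 1: only now account for the callback already invoked. */
	model_add_len(&cl, -count);

	/* CPU 1 parks here: whatever remains is never invoked. */
	return cl.len;
}
```

In this model, model_replay_race() returns 1: the callback queued by the second rcu_barrier() is still on the list when the kthread parks, which is the condition the WARN_ON_ONCE() catches.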
>
> 6,5408,150692937,-,caller=T453;rcu: rcu_callback func: rcu_barrier_callback
>
> Maybe we can wait until rcu_segcblist_n_cbs(&rdp->cblist) returns zero
> and then invoke kthread_parkme() in rcuoc kthreads.
> Any thoughts?
Sounds good. Or simply make sure that rdp->nocb_cb_sleep == false before
parking? kthread_park() should only be called after rcu_barrier(), and
rdp->nocb_cb_sleep shouldn't be set to true as long as there is a pending
one, right? Well, we can also add a WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist))
before calling kthread_park().
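As a rough illustration of the first option (park only once the callback count has dropped to zero), here is a userspace sketch. It is not a patch: struct model_rdp, model_can_park() and model_cb_kthread_step() are hypothetical names, and the real nocb_cb_wait() loop has locking and wakeups that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few rdp fields this sketch needs. */
struct model_rdp {
	long n_cbs;		/* models rcu_segcblist_n_cbs(&rdp->cblist) */
	bool should_park;	/* models kthread_should_park() */
};

/* Parking is allowed only when requested and no callback is pending. */
static bool model_can_park(const struct model_rdp *rdp)
{
	return rdp->should_park && rdp->n_cbs == 0;
}

/*
 * One pass of a modelled callback kthread loop.
 * Returns true once the kthread has parked.
 */
static bool model_cb_kthread_step(struct model_rdp *rdp)
{
	assert(rdp->n_cbs >= 0);

	if (model_can_park(rdp))
		return true;

	/* Otherwise keep invoking callbacks, one per pass in this model. */
	if (rdp->n_cbs > 0)
		rdp->n_cbs--;

	return false;
}
```

With one callback still queued and a park request pending, the first pass invokes the callback instead of parking, and only the second pass parks.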
Would you like to send the fix?
Thanks.