On 1/13/2026 11:00 PM, Paul E. McKenney wrote:
> On Tue, Jan 13, 2026 at 08:57:20AM +0000, Joel Fernandes wrote:
>>
>>
>>> On Jan 12, 2026, at 11:55 PM, Shrikanth Hegde <[email protected]> wrote:
>>>
>>> Hi.
>>>
>>> On 1/13/26 8:16 AM, Joel Fernandes wrote:
>>>
>>>
>>>>>>>> Another way to make it in-kernel would be to make the RCU normal wake
>>>>>>>> from GP optimization enabled for > 16 CPUs by default.
>>>>>>>>
>>>>>>>> I was considering this, but I did not bring it up because I did not
>>>>>>>> know that there are large systems that might benefit from it until
>>>>>>>> now.
>>>>>>>
>>>>>>> This would require increasing the scalability of this optimization,
>>>>>>> right?  Or am I thinking of the wrong optimization?  ;-)
>>>>>>>
>>>>>> Yes I think you are considering the correct one, the concern you have is
>>>>>> regarding large number of wake ups initiated from the GP thread, correct?
>>>>>>
>>>>>> I was suggesting on the thread, a more dynamic approach where using
>>>>>> synchronize_rcu_normal() until it gets overloaded with requests. One
>>>>>> approach might be to measure the length of the rcu_state.srs_next to
>>>>>> detect an overload condition, similar to qhimark? Or perhaps qhimark
>>>>>> itself can be used. And under lightly loaded conditions, default to
>>>>>> synchronize_rcu_normal() without checking for the 16 CPU count.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>> Or maintain multiple lists.  Systems with 1000+ CPUs can be a bit
>>>>> unforgiving of pretty much any form of contention.
>>>> Makes sense. We could also just have a single list but a much smaller 
>>>> threshold for switching synchronize_rcu_normal off.
>>>> That would address the conveyor belt pattern Vishal expressed.
>>>> thanks,
>>>>  - Joel
>>>
>>> Wouldn't that make most of the sync_rcu calls on large systems
>>> with synchronize_rcu_normal off?
>>
>> It would and that is expected.
>>
>>>
>>> What's the cost of doing this?
>>
>> There is no cost, that is the point, right? The scalability issue Paul is
>> referring to is the large number of wake ups. You won't have that if the
>> number of synchronous callers is small.
> 
> Also the contention involved in the list management, if there is still
> only the one list.
> 
Even if the number of synchronize_rcu() calls in flight is small, say fewer
than 10? To clarify, I meant keeping the threshold that small precisely to
address the list contention issue you're raising.
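
To make the small-threshold idea concrete, here is a rough userspace sketch,
not kernel code and not a proposed patch. The names srs_inflight,
SRS_INFLIGHT_LIMIT, srs_try_enter() and srs_exit() are made up for
illustration: the counter stands in for the length of rcu_state.srs_next, and
the limit of 10 is just the number mentioned above. Callers that are refused
would use whatever the non-optimized path is today.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical limit on in-flight callers; not an existing kernel tunable. */
#define SRS_INFLIGHT_LIMIT 10

/* Hypothetical counter modelling the length of rcu_state.srs_next. */
atomic_int srs_inflight;

/*
 * Ask for admission to the wake-from-GP style path.
 *
 * Returns true if fewer than SRS_INFLIGHT_LIMIT callers were already in
 * flight; the caller must then call srs_exit() once its grace period ends.
 * Returns false if the path is considered overloaded; the caller should
 * fall back to the non-optimized path and must not call srs_exit().
 */
bool srs_try_enter(void)
{
	/* Optimistically count ourselves, then check what was there before. */
	int old = atomic_fetch_add(&srs_inflight, 1);

	if (old >= SRS_INFLIGHT_LIMIT) {
		/* Over the limit: undo our increment and report overload. */
		atomic_fetch_sub(&srs_inflight, 1);
		return false;
	}
	return true;
}

/* Leave the optimized path, making room for another caller. */
void srs_exit(void)
{
	atomic_fetch_sub(&srs_inflight, 1);
}
```

With a limit this small, the single list never holds more than a handful of
entries, so both the wake-up count and the list contention stay bounded. The
shared counter is itself a point of contention in this sketch, which is one
reason the multiple-lists idea may still be needed on 1000+ CPU systems.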

Thanks!

 - Joel




