On 2017-12-21 16:59, Michal Hocko wrote:
> On Thu 21-12-17 16:23:23, kemi wrote:
>>
>>
>> On 2017-12-21 16:17, Michal Hocko wrote:
> [...]
>>> Can you see any difference with a more generic workload?
>>>
>>
>> I didn't see an obvious improvement for will-it-scale.page_fault1.
>> Two reasons for that:
>> 1) the code path is too long
>> 2) severe zone lock and lru lock contention (the buddy system is
>> accessed frequently)
> 
> OK. So does the patch help for anything other than a microbenchmark?
> 
>>>> Some thoughts on that:
>>>> a) the overhead from cache bouncing caused by NUMA counter updates in
>>>> the fast path increases severely as the number of CPU cores grows
>>>
>>> What is the effect on a smaller system with fewer CPUs?
>>>
>>
>> Several CPU cycles can be saved using a single thread for that.
>>
>>>> b) AFAIK, the typical usage scenario (or at least a similar one) that
>>>> can benefit from this optimization is a 10/40G NIC used in the
>>>> high-speed data center networks of cloud service providers.
>>>
>>> I would expect those would disable the numa accounting altogether.
>>>
>>
>> Yes, but it is still worth doing some optimization, isn't it?
> 
> Ohh, I am not opposing optimizations but you should make sure that they
> are worth the additional code and special casing. As I've said I am not
> convinced special casing numa counters is good. You can play with the
> threshold scaling for larger CPU count but let's make sure that the
> benefit is really measurable for normal workloads. Special ones will
> disable the numa accounting anyway.
> 

Understood. Could you suggest some normal workloads to test? Thanks.
I will give them a try and post the data ASAP.
