On 2/19/21 10:59 AM, Tim Chen wrote:
>
>
> On 2/19/21 1:11 AM, Michal Hocko wrote:
>>
>> The soft limit is evaluated only once every THRESHOLDS_EVENTS_TARGET *
>> SOFTLIMIT_EVENTS_TARGET events.
>> If all events correspond to newly charged memory and the last event
>> was just at the soft limit boundary, then the update could lag by as
>> much as 128k pages (512MB, and much more if these were huge pages),
>> which is a lot!
>> I hadn't realized it was that much. Now I see the problem. This would
>> be useful information for the changelog.
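
(Just to make those numbers concrete, here is a trivial user space
calculation that takes the "once every THRESHOLDS_EVENTS_TARGET *
SOFTLIMIT_EVENTS_TARGET events" figure above at face value; the two
constants are the ones defined in mm/memcontrol.c:)

#include <stdio.h>

/* constants as defined in mm/memcontrol.c */
#define THRESHOLDS_EVENTS_TARGET 128
#define SOFTLIMIT_EVENTS_TARGET  1024

int main(void)
{
	/* worst-case charge events between two soft limit tree updates,
	 * taking the quoted figure at face value */
	unsigned long events = THRESHOLDS_EVENTS_TARGET * SOFTLIMIT_EVENTS_TARGET;

	printf("events between tree updates: %lu\n", events);
	printf("with 4K pages: %lu MB charged unnoticed\n", events * 4 / 1024);
	printf("with 2M pages: %lu MB charged unnoticed\n", events * 2048 / 1024);
	return 0;
}
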
>>
>> Your fix focuses on the over-the-limit boundary, which will solve the
>> problem, but wouldn't that lead to updates happening too often in a
>> pathological situation where a memcg gets reclaimed immediately?
>
> Not really, and not immediately. The memcg with the largest soft limit
> excess will be chosen for page reclaim, which is the way it should be.
> It is unlikely that a memcg that has just exceeded the soft limit
> becomes the worst offender right away. With the fix, we make sure it is
> on the bad guys list, so it will not be ignored and will eventually be
> chosen for reclaim. It can no longer slowly and sneakily grow its
> memory usage.
>
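To make that concrete, here is a toy user space model of the selection
(this is not the kernel's per-node rb-tree code; the memcg names and
page counts are made up for illustration). A memcg that is over its
soft limit but was never added to the tree is simply invisible to soft
limit reclaim, and the forced update only makes it a candidate, with
the largest excess still reclaimed first:

#include <stdbool.h>
#include <stdio.h>

struct toy_memcg {
	const char *name;
	long usage;		/* pages charged */
	long soft_limit;	/* pages */
	bool on_tree;		/* visible to soft limit reclaim? */
};

static long excess(const struct toy_memcg *m)
{
	return m->usage > m->soft_limit ? m->usage - m->soft_limit : 0;
}

/* pick the worst offender, but only among memcgs on the tree */
static struct toy_memcg *pick_victim(struct toy_memcg *v, int n)
{
	struct toy_memcg *victim = NULL;
	int i;

	for (i = 0; i < n; i++)
		if (v[i].on_tree && excess(&v[i]) > 0 &&
		    (!victim || excess(&v[i]) > excess(victim)))
			victim = &v[i];
	return victim;
}

int main(void)
{
	struct toy_memcg cgs[] = {
		{ "quiet_cg",   90000, 100000, true  },	/* under its limit */
		{ "sneaky_cg", 120000, 100000, false },	/* tree update missed */
	};
	struct toy_memcg *victim;

	victim = pick_victim(cgs, 2);
	printf("before forced update: victim = %s\n",
	       victim ? victim->name : "none, sneaky_cg goes unnoticed");

	/* the forced update puts sneaky_cg on the tree; with several
	 * offenders the largest excess would still be reclaimed first */
	cgs[1].on_tree = true;
	victim = pick_victim(cgs, 2);
	printf("after forced update:  victim = %s\n",
	       victim ? victim->name : "none");
	return 0;
}
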
I should also mention that the forced update is only performed once,
when the memcg first exceeds the soft limit, because the !mz->on_tree
check below tells us whether the memcg is already in the soft limit tree:
+	if (mz && !mz->on_tree && soft_limit_excess(mz->memcg) > 0)
+		force_update = true;
So the update overhead is very low.
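
In the kernel, mz->on_tree is set when the per-node entry is inserted
into the soft limit tree and cleared only when it is removed again, for
example by soft limit reclaim, so the check above can force at most one
extra tree update per insertion. A toy model of that claim (user space
again, not memcontrol.c; the 100k charge loop is arbitrary):

#include <stdbool.h>
#include <stdio.h>

struct toy_mz {
	bool on_tree;
	long usage;		/* pages charged */
	long soft_limit;	/* pages */
};

static long soft_limit_excess(const struct toy_mz *mz)
{
	return mz->usage > mz->soft_limit ? mz->usage - mz->soft_limit : 0;
}

int main(void)
{
	struct toy_mz mz = { .on_tree = false, .usage = 100000,
			     .soft_limit = 100000 };
	int forced = 0;
	long i;

	for (i = 0; i < 100000; i++) {	/* charge 100k pages, one by one */
		bool force_update = false;

		mz.usage++;
		if (!mz.on_tree && soft_limit_excess(&mz) > 0)
			force_update = true;	/* the check in the patch */
		if (force_update) {
			mz.on_tree = true;	/* tree update inserts the node */
			forced++;
		}
	}
	printf("forced tree updates for 100000 charges: %d\n", forced);
	return 0;
}

This prints 1: every later charge sees mz.on_tree already set and skips
the forced update.
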
Tim