On 13.04.2018 14:38, Michal Hocko wrote:
> On Fri 13-04-18 14:29:11, Kirill Tkhai wrote:
>> On 13.04.2018 14:20, Michal Hocko wrote:
>>> On Fri 13-04-18 14:06:40, Kirill Tkhai wrote:
>>>> On 13.04.2018 14:02, Michal Hocko wrote:
>>>>> On Fri 13-04-18 12:35:22, Kirill Tkhai wrote:
>>>>>> On 13.04.2018 11:55, Michal Hocko wrote:
>>>>>>> On Thu 12-04-18 17:52:04, Kirill Tkhai wrote:
>>>>>>> [...]
>>>>>>>> @@ -4471,6 +4477,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state 
>>>>>>>> *parent_css)
>>>>>>>>  
>>>>>>>>        return &memcg->css;
>>>>>>>>  fail:
>>>>>>>> +      mem_cgroup_id_remove(memcg);
>>>>>>>>        mem_cgroup_free(memcg);
>>>>>>>>        return ERR_PTR(-ENOMEM);
>>>>>>>>  }
>>>>>>>
>>>>>>> The only path which jumps to fail: here (in the current mmotm tree) is 
>>>>>>>         error = memcg_online_kmem(memcg);
>>>>>>>         if (error)
>>>>>>>                 goto fail;
>>>>>>>
>>>>>>> AFAICS and the only failure path in memcg_online_kmem
>>>>>>>         memcg_id = memcg_alloc_cache_id();
>>>>>>>         if (memcg_id < 0)
>>>>>>>                 return memcg_id;
>>>>>>>
>>>>>>> I am not entirely clear on memcg_alloc_cache_id but it seems we do clean
>>>>>>> up properly. Or am I missing something?
>>>>>>
>>>>>> memcg_alloc_cache_id() may allocate a lot of memory, in case of the 
>>>>>> system reached
>>>>>> memcg_nr_cache_ids cgroups. In this case it iterates over all LRU lists, 
>>>>>> and double
>>>>>> size of every of them. In case of memory pressure it can fail. If this 
>>>>>> occurs,
>>>>>> mem_cgroup::id is not unhashed from IDR and we leak this id.
>>>>>
>>>>> OK, my bad I was looking at the bad code path. So you want to clean up
>>>>> after mem_cgroup_alloc not memcg_online_kmem. Now it makes much more
>>>>> sense. Sorry for the confusion on my end.
>>>>>
>>>>> Anyway, shouldn't we do the thing in mem_cgroup_free() to be symmetric
>>>>> to mem_cgroup_alloc?
>>>>
>>>> We can't, since it's called from mem_cgroup_css_free(), which doesn't have 
>>>> a deal
>>>> with idr freeing. All the asymmetry, we see, is because of the trick to 
>>>> unhash ID
>>>> earlier, then from mem_cgroup_css_free().
>>>
>>> Are you sure. It's been some time since I've looked at the quite complex
>>> cgroup tear down code but from what I remember, css_free is called on
>>> the css release (aka when the reference count drops to zero). 
>>> mem_cgroup_id_put_many
>>> seems to unpin the css reference so we should have idr_remove by the
>>> time when css_free is called. Or am I still wrong and should go over the
>>> brain hurting cgroup removal code again?
>>
>> mem_cgroup_id_put_many() unpins css, but this may be not the last reference 
>> to the css.
>> Thus, we release ID earlier, then all references to css are freed.
> 
> Right and so what. If we have released the idr then we are not going to
> do that again in css_free. That is why we have that memcg->id.id > 0
> check before idr_remove and memcg->id.id = 0 for the last memcg ref.
> count. So again, why cannot we do the clean up in mem_cgroup_free and
> have a less confusing code? Or am I just not getting your point and
> being dense here?

We can, but mem_cgroup_free() called from mem_cgroup_css_alloc() is unlikely 
case.
The likely case is mem_cgroup_free() is called from mem_cgroup_css_free(), where
this idr manipulations will be a noop. Noop in likely case looks more confusing
for me.

Less confusing will be to move

        memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL,
                                 1, MEM_CGROUP_ID_MAX,
                                 GFP_KERNEL);

into mem_cgroup_css_alloc(). How are you think about this?

Kirill

Reply via email to