On Thu, Feb 11, 2021 at 12:59 AM Сергей Мамонов <[email protected]> wrote:
>
> And after migrating all containers to another node, it still shows 63745 cgroups
> -
>
> cat /proc/cgroups
> #subsys_name hierarchy num_cgroups enabled
> cpuset 7 2 1
> cpu 10 2 1
> cpuacct 10 2 1
> memory 2 63745 1

Looks like a leak (or a bug in memory accounting which prevents
cgroups from being released).
You can check the number of memory cgroups with something like:

find /sys/fs/cgroup/memory -type d | wc -l

If you see a large number, go explore those cgroups (check
cgroup.procs, memory.usage_in_bytes).
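
For example, something like this (only a rough sketch, assuming the standard
cgroup v1 memory mount) lists cgroups that hold no tasks but still have memory
charged to them - the usual sign of "zombie" cgroups kept alive by leftover
page cache or kernel memory:

for d in $(find /sys/fs/cgroup/memory -type d); do
    # no tasks attached, but usage is still non-zero
    if [ -z "$(cat "$d/cgroup.procs")" ] && \
       [ "$(cat "$d/memory.usage_in_bytes")" -gt 0 ]; then
        echo "$(cat "$d/memory.usage_in_bytes") $d"
    fi
done | sort -rn | head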

> devices 11 2 1
> freezer 17 2 1
> net_cls 12 2 1
> blkio 1 4 1
> perf_event 13 2 1
> hugetlb 14 2 1
> pids 3 68 1
> ve 6 1 1
> beancounter 4 3 1
> net_prio 12 2 1
>
> On Wed, 10 Feb 2021 at 18:47, Сергей Мамонов <[email protected]> wrote:
>>
>> And this is definitely it -
>> grep -E "memory|num_cgroups" /proc/cgroups
>> #subsys_name hierarchy num_cgroups enabled
>> memory 2 65534 1
>>
>> After migrating some of the containers to another node, num_cgroups went
>> down to 65365, and that allowed a stopped container to start without the
>> `Can't create directory /sys/fs/cgroup/memory/machine.slice/1000133882:
>> Cannot allocate memory` error.
>>
>> But I still don't understand why num_cgroups for memory is so big.
>>
>> Something like ~460 per container instead of 60 or fewer per container on
>> other nodes (with the same kernel version too).
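>>
>> A rough way to see which containers hold the extra cgroups (just a sketch,
>> assuming the machine.slice/<CTID> layout from the error message above):
>>
>> for ct in /sys/fs/cgroup/memory/machine.slice/*/; do
>>     # count memory cgroup directories under each container slice
>>     echo "$(find "$ct" -type d | wc -l) $ct"
>> done | sort -rn | head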
>>
>> On Wed, 10 Feb 2021 at 17:48, Сергей Мамонов <[email protected]> wrote:
>>>
>>> Hello!
>>>
>>> Looks like we reproduced this problem too.
>>>
>>> kernel - 3.10.0-1127.18.2.vz7.163.46
>>>
>>> Same error -
>>> Can't create directory /sys/fs/cgroup/memory/machine.slice/1000133882: 
>>> Cannot allocate memory
>>>
>>> Same OK output for
>>> /sys/fs/cgroup/memory/*limit_in_bytes
>>> /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes
>>>
>>> There is a lot of free memory on the node (per NUMA node too).
>>>
>>> Only this looks really strange -
>>> grep -E "memory|num_cgroups" /proc/cgroups
>>> #subsys_name hierarchy num_cgroups enabled
>>> memory 2 65534 1
>>>
>>> A huge num_cgroups only on this node:
>>>
>>> cat /proc/cgroups
>>> #subsys_name hierarchy num_cgroups enabled
>>> cpuset 7 144 1
>>> cpu 10 263 1
>>> cpuacct 10 263 1
>>> memory 2 65534 1
>>> devices 11 1787 1
>>> freezer 17 144 1
>>> net_cls 12 144 1
>>> blkio 1 257 1
>>> perf_event 13 144 1
>>> hugetlb 14 144 1
>>> pids 3 2955 1
>>> ve 6 143 1
>>> beancounter 4 143 1
>>> net_prio 12 144 1
>>>
>>> On Thu, 28 Jan 2021 at 14:22, Konstantin Khorenko <[email protected]> 
>>> wrote:
>>>>
>>>> Maybe you hit a memory shortage in a particular NUMA node only, for example.
>>>>
>>>> # numactl --hardware
>>>> # numastat -m
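>>>>
>>>> For example (just a sketch), per-node free memory can also be read directly
>>>> from sysfs:
>>>>
>>>> grep MemFree /sys/devices/system/node/node*/meminfo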
>>>>
>>>>
>>>> Or go the hard way - trace the kernel to see where exactly we get -ENOMEM:
>>>>
>>>> trace the kernel function cgroup_mkdir() using /sys/kernel/debug/tracing/
>>>> with the function_graph tracer.
>>>>
>>>>
>>>> https://lwn.net/Articles/370423/
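>>>>
>>>> Roughly like this (only a sketch; check available_filter_functions first to
>>>> confirm the exact symbol name on this kernel):
>>>>
>>>> cd /sys/kernel/debug/tracing
>>>> grep cgroup_mkdir available_filter_functions
>>>> echo cgroup_mkdir > set_graph_function
>>>> echo function_graph > current_tracer
>>>> echo 1 > tracing_on
>>>> # reproduce the failing "vzctl start" in another terminal
>>>> echo 0 > tracing_on
>>>> less trace     # inspect the call graph of the failing mkdir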
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> Konstantin Khorenko,
>>>> Virtuozzo Linux Kernel Team
>>>>
>>>> On 01/28/2021 12:43 PM, Joe Dougherty wrote:
>>>>
>>>> I checked that; it doesn't appear to be the case.
>>>>
>>>> # pwd
>>>> /sys/fs/cgroup/memory
>>>> # cat *limit_in_bytes
>>>> 9223372036854771712
>>>> 9223372036854767616
>>>> 2251799813685247
>>>> 2251799813685247
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> # cat *failcnt
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>>
>>>> # pwd
>>>> /sys/fs/cgroup/memory/machine.slice
>>>> # cat *limit_in_bytes
>>>> 9223372036854771712
>>>> 9223372036854767616
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> # cat *failcnt
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>>
>>>>
>>>>
>>>> On Thu, Jan 28, 2021 at 2:47 AM Konstantin Khorenko 
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi Joe,
>>>>>
>>>>> I'd suggest checking the memory limits for the root and "machine.slice" memory
>>>>> cgroups:
>>>>>
>>>>> /sys/fs/cgroup/memory/*limit_in_bytes
>>>>> /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes
>>>>>
>>>>> All of them should be unlimited.
>>>>>
>>>>> If not, find out what is limiting them.
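>>>>>
>>>>> For instance (just a sketch), a quick scan that sorts the limits so any
>>>>> finite value stands out at the top:
>>>>>
>>>>> for f in /sys/fs/cgroup/memory/*limit_in_bytes \
>>>>>          /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes; do
>>>>>     echo "$(cat "$f") $f"
>>>>> done | sort -n | head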
>>>>>
>>>>> --
>>>>> Best regards,
>>>>>
>>>>> Konstantin Khorenko,
>>>>> Virtuozzo Linux Kernel Team
>>>>>
>>>>> On 01/27/2021 10:28 PM, Joe Dougherty wrote:
>>>>>
>>>>> I'm running into an issue on only 1 of my OpenVZ 7 nodes where it's
>>>>> unable to create a directory under /sys/fs/cgroup/memory/machine.slice due
>>>>> to "Cannot allocate memory" whenever I try to start a new container or
>>>>> restart an existing one. I've been trying to research this but I'm
>>>>> unable to find any concrete info on what could cause this. It appears to
>>>>> be memory related because sometimes if I issue
>>>>> "echo 1 > /proc/sys/vm/drop_caches" it allows me to start a container
>>>>> (this only works sometimes), but my RAM usage is extremely low with no
>>>>> swapping (swappiness even set to 0 for testing). Thank you in advance for
>>>>> your help.
>>>>>
>>>>>
>>>>> Example:
>>>>> # vzctl start 9499
>>>>> Starting Container ...
>>>>> Mount image: /vz/private/9499/root.hdd
>>>>> Container is mounted
>>>>> Can't create directory /sys/fs/cgroup/memory/machine.slice/9499: Cannot 
>>>>> allocate memory
>>>>> Unmount image: /vz/private/9499/root.hdd (190)
>>>>> Container is unmounted
>>>>> Failed to start the Container
>>>>>
>>>>>
>>>>> Node Info:
>>>>> Uptime:      10 days
>>>>> OS:          Virtuozzo 7.0.15
>>>>> Kernel:      3.10.0-1127.18.2.vz7.163.46 GNU/Linux
>>>>> System Load: 3.1
>>>>> /vz Usage:   56% of 37T
>>>>> Swap Usage:  0%
>>>>> RAM Free:    84% of 94.2GB
>>>>>
>>>>> # free -m
>>>>>               total        used        free      shared  buff/cache   available
>>>>> Mem:          96502       14259       49940         413       32303       80990
>>>>> Swap:         32767          93       32674
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Joe Dougherty
>>>> Chief Operating Officer
>>>> Secure Dragon LLC
>>>> www.SecureDragon.net
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Sergei Mamonov
>>
>>
>>
>> --
>> Best Regards,
>> Sergei Mamonov
>
>
>
> --
> Best Regards,
> Sergei Mamonov

_______________________________________________
Users mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/users
