On 1/22/20 10:03 AM, R. Diez wrote:
Hi all:

I am using the libvirt version that comes with Ubuntu 18.04.3 LTS.

I'm sorry, I don't have Ubuntu installed anywhere to look the version up. Can you run 'virsh version' to find it out for me please?


I have written a script that backs up my virtual machines every night. I want to limit the amount of memory that this backup operation consumes, mainly to prevent page cache thrashing. I have described the Linux page cache thrashing issue in detail here:

http://rdiez.shoutwiki.com/wiki/Today%27s_Operating_Systems_are_still_incredibly_brittle#The_Linux_Filesystem_Cache_is_Braindead

The VM's virtual disk is 140 GB at the moment. I thought 500 MiB of RAM should be more than enough to back it up, so I added the following option to the systemd service file associated with the systemd timer I am using:

   MemoryLimit=500M

However, the OOM killer is killing "virsh vol-download":

Jan 21 23:40:00 GS-CEL-L kernel: [55535.913525] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Jan 21 23:40:00 GS-CEL-L kernel: [55535.913527] [  13232]  1000 13232     5030      786    77824      103             0 BackupWindows10
Jan 21 23:40:00 GS-CEL-L kernel: [55535.913528] [  13267]  1000 13267     5063      567    73728      132             0 BackupWindows10
Jan 21 23:40:00 GS-CEL-L kernel: [55535.913529] [  13421]  1000 13421     5063      458    73728      132             0 BackupWindows10
Jan 21 23:40:00 GS-CEL-L kernel: [55535.913530] [  13428]  1000 13428 712847   124686  5586944   523997             0 virsh
Jan 21 23:40:00 GS-CEL-L kernel: [55535.913532] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/system.slice/VmBackup.service,task_memcg=/system.slice/VmBackup.service,task=virsh,pid=13428,uid=1000
Jan 21 23:40:00 GS-CEL-L kernel: [55535.913538] Memory cgroup out of memory: Killed process 13428 (virsh) total-vm:2851388kB, anon-rss:486180kB, file-rss:12564kB, shmem-rss:0kB
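
As a rough cross-check of the log above: the total_vm, rss and swapents columns of the OOM task table are page counts, not bytes. The sketch below converts the virsh row to MiB, assuming 4 KiB pages (an assumption, although it agrees with the final log line: 712847 pages x 4 KiB = 2851388 kB, the reported total-vm, and 124686 pages x 4 KiB = 498744 kB, the sum of anon-rss and file-rss). The helper name pages_to_mib is made up for this illustration.

```shell
#!/bin/sh
# Assumption: 4 KiB pages (typical on x86-64; verify with `getconf PAGESIZE`).
PAGE_KIB=4

# Hypothetical helper: convert a page count from the OOM task table
# to whole MiB, rounded down.
pages_to_mib() {
    echo $(( $1 * PAGE_KIB / 1024 ))
}

# Values copied from the virsh row (pid 13428) of the log.
echo "virsh rss:      $(pages_to_mib 124686) MiB"
echo "virsh total_vm: $(pages_to_mib 712847) MiB"
echo "virsh swapents: $(pages_to_mib 523997) MiB"
```

Under that assumption, the resident set of virsh was about 487 MiB when it was killed, which is right at the 500M limit configured for the service.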

I wonder why "virsh vol-download" needs so much RAM. It does not get killed straight away; it takes a few minutes. It starts with a VMSIZE of around 295 MiB, which is not really frugal for a file download operation, and then it grows and grows.

This is very likely a memory leak somewhere. Can you try to run virsh under valgrind and download a small disk? valgrind could help us identify the leak. For instance:

valgrind --leak-check=full virsh vol-download /path/to/small/volume /tmp/blah; rm /tmp/blah

However, I am unable to reproduce this with the current git master, so it looks like the leak has already been fixed. The question is which commit fixed it, so that your distro maintainers can backport it.

Michal
