Excessive virtio_balloon inflation can cause the OOM-killer to be invoked when Linux is under severe memory pressure. Various mechanisms are responsible for correct virtio_balloon memory management. Nevertheless, these control tools often do not have enough time to react to a fast-changing memory load. As a result, the OS runs out of memory and invokes the OOM-killer. Balancing memory with the virtio balloon should not cause processes to be terminated while there are still pages in the balloon. Currently the virtio balloon driver has no way to free memory at the last moment, before some process gets killed by the OOM-killer.
This does not introduce a security breach, as the balloon itself runs inside the guest OS and works in cooperation with the host. Thus some improvements on the guest side should be considered normal.

To solve the problem, introduce a virtio_balloon callback which is expected to be called from the oom notifier call chain in the out_of_memory() function. If the virtio balloon can release some memory, the system will return and retry the allocation that forced the out-of-memory killer to run.

This behavior should be enabled if and only if the appropriate feature bit is set on the device. It is off by default.

This functionality was recently merged into vanilla Linux (actually in linux-next at the moment):

  commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5
  Author: Raushaniya Maksudova <rmaksud...@parallels.com>
  Date:   Mon Nov 10 09:36:29 2014 +1030

This patch adds the respective control bits to QEMU. It introduces a deflate-on-oom option for the balloon device which does the trick.

Changes from v5:
- ported to current QEMU

Changes from v4:
- spelling corrected according to suggestions from Eric Blake

Changes from v3:
- ported to git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream_rebased

Changes from v2:
- fixed mistake with bit number in virtio_balloon_get_features

Changes from v1:
- From: in patch 1 according to the original ownership
- feature processing in patch 2 as suggested by Michael. It could be done without an additional field, but this would require moving the property a level up, i.e. to the PCI & CCW level.

Signed-off-by: Raushaniya Maksudova <rmaksud...@virtuozzo.com>
Signed-off-by: Denis V. Lunev <d...@openvz.org>
CC: Anthony Liguori <aligu...@amazon.com>
CC: Michael S. Tsirkin <m...@redhat.com>
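
For illustration only, the guest-side behavior described above can be modeled in a few lines of userspace C. This is a toy sketch, not the kernel driver or the QEMU code from this series: struct toy_balloon, the toy_* functions and the 256-page chunk size are hypothetical names and values chosen for the example. The only thing taken from the virtio balloon interface is the feature bit number of VIRTIO_BALLOON_F_DEFLATE_ON_OOM (2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit number from the virtio balloon interface. */
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2

/* Hypothetical: how many pages one OOM callback may release. */
#define TOY_OOM_DEFLATE_PAGES 256

/* Hypothetical toy model, not the kernel's struct virtio_balloon. */
struct toy_balloon {
    uint64_t features;        /* negotiated feature bits */
    size_t pages_in_balloon;  /* pages currently taken away from the guest */
    size_t guest_free_pages;  /* pages the guest can allocate from */
};

bool toy_has_feature(const struct toy_balloon *b, unsigned bit)
{
    return (b->features >> bit) & 1u;
}

/*
 * Stand-in for the callback on the oom notifier chain.
 * Returns the number of pages given back to the guest;
 * 0 means the balloon could not help.
 */
size_t toy_balloon_oom_notify(struct toy_balloon *b)
{
    size_t n;

    assert(b != NULL);

    /* Off by default: do nothing unless the feature bit was negotiated. */
    if (!toy_has_feature(b, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
        return 0;

    n = b->pages_in_balloon < TOY_OOM_DEFLATE_PAGES
            ? b->pages_in_balloon
            : TOY_OOM_DEFLATE_PAGES;
    b->pages_in_balloon -= n;
    b->guest_free_pages += n;
    return n;
}

/*
 * Stand-in for the decision made in out_of_memory():
 * true  -> some memory was freed, retry the allocation;
 * false -> nothing was freed, a process would be killed.
 */
bool toy_out_of_memory(struct toy_balloon *b)
{
    return toy_balloon_oom_notify(b) > 0;
}
```

In this model, a balloon holding 1000 pages with the bit negotiated gives 256 pages back and toy_out_of_memory() returns true, so the allocation would be retried. With the bit clear, or with an empty balloon, it returns false and the OOM-killer would proceed as before.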