Hello everyone,

We're experiencing issues running large instances (~60GB RAM) on fairly large NUMA nodes (4 CPUs, 256GB RAM) with CPU pinning. In some extreme cases qemu/KVM seems to have significant memory overhead (10-15%?), which the nova-compute service doesn't take into account when launching VMs.

Using our configuration as an example: imagine running two VMs with 30GB RAM each on one NUMA node (because we use CPU pinning), therefore using 60GB out of the 64GB in that NUMA domain. If both VMs consume their entire memory (plus roughly 10% KVM overhead), the OOM killer steps in, despite there being plenty of free RAM in the other NUMA nodes. (The numbers are arbitrary; the point is that nova-scheduler places the instance on the node because the memory seems 'free enough', while the specific NUMA node may lack the memory reserve.)
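To make the arithmetic concrete, here is a rough sketch in Python. The flat 10% overhead and the helper name numa_cell_headroom_mib are assumptions for illustration only; they are not measured values and not part of nova.

```python
def numa_cell_headroom_mib(cell_ram_mib, guest_ram_mib, overhead_percent):
    """Return the MiB left on one NUMA cell once every guest uses all its RAM.

    overhead_percent is a hypothetical flat per-guest qemu/KVM overhead
    (e.g. 10 for 10%); real overhead varies and is not reported by nova.
    """
    used = sum(ram + ram * overhead_percent // 100 for ram in guest_ram_mib)
    return cell_ram_mib - used


CELL_RAM_MIB = 64 * 1024        # one 64GB NUMA cell
GUESTS_MIB = [30 * 1024] * 2    # two pinned 30GB guests

# What the placement check effectively sees: guest RAM only.
print(numa_cell_headroom_mib(CELL_RAM_MIB, GUESTS_MIB, 0))   # 4096

# What the cell faces with an assumed 10% overhead per guest.
print(numa_cell_headroom_mib(CELL_RAM_MIB, GUESTS_MIB, 10))  # -2048
```

A negative result means the cell is oversubscribed by that many MiB, which is the situation where we see the OOM killer, even though the host as a whole still has free memory.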
Our initial solution was to set ram_allocation_ratio < 1 to keep some memory in reserve, but this didn't work. After studying the nova source, it turns out that ram_allocation_ratio is ignored when using CPU pinning (see https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L859 and https://github.com/openstack/nova/blob/mitaka-eol/nova/virt/hardware.py#L821 ). We're running Mitaka, but this piece of code is implemented the same way in Ocata.

We're considering creating a patch that takes ram_allocation_ratio into account. My question is: is ram_allocation_ratio ignored on purpose when using CPU pinning? If so, what is the reasoning behind it? And what would be the right way to ensure reserved RAM on the NUMA nodes?

Thanks.

Regards,
Jakub Jursa
__________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev