We've run into this a few times as well. As I understand it, the "cut off threshold" applies to the cluster, not to an individual hypervisor, so a KVM host can still get filled up regardless (an overcommit ratio of "1" still means "100%", and a single hypervisor can be completely full while the cluster as a whole is still under 85%).
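To make that concrete, here is a rough back-of-the-envelope sketch (plain Python, hypothetical numbers; this is not CloudStack's actual allocator code) showing how a cluster can sit comfortably under an 85% threshold while one of its hosts is already at 100%:

```python
# Hypothetical 3-host cluster, 128 GiB per host, cluster-level cutoff at 85%.
HOSTS_GIB = [128, 128, 128]
CLUSTER_CUTOFF = 0.85

cluster_total = sum(HOSTS_GIB)                  # 384 GiB
cluster_limit = cluster_total * CLUSTER_CUTOFF  # ~326 GiB usable cluster-wide

# Suppose the allocator has packed host 1 completely full while
# hosts 2 and 3 are only half used.
allocated_per_host = [128, 64, 64]              # GiB actually handed to guests
cluster_allocated = sum(allocated_per_host)     # 256 GiB

print(f"cluster usage: {cluster_allocated / cluster_total:.0%}")    # ~67%, still under the 85% cutoff
print(f"host 1 usage:  {allocated_per_host[0] / HOSTS_GIB[0]:.0%}") # 100%, nothing left for the hypervisor
```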
Right now CloudStack doesn't always show an accurate memory count on the host (e.g., a host with 128GB shows up as ~126GB). The difference isn't because of "reserved" memory; it's due to how Java does the byte -> gigabyte math. KVM (as far as I've found) doesn't have "reserved" memory the way Xen Dom0s do, so currently you have to keep tabs on memory usage outside of CloudStack.

This becomes a bigger issue when you take guest disk caching into account, since even with plenty of actual overhead the hypervisor will quickly eat into swap because of caching. When you deploy a new VM the hypervisor (or Linux, really) is SUPPOSED to immediately reclaim any spare memory from caching and allocate it to the new VM process, but in practice it doesn't always work as well as it should. It's also much more common when deploying just a few VMs with a LOT of memory per host (e.g., a host with 128GB of memory running 3 VMs with 32GB each plus 1 VM with 24GB, for a total of 120GB).

The solution as I see it is to implement one of the following:

- A) Allow setting host "overprovisioning" ratios of less than "1" (e.g., 0.85). This would effectively act as a "cap" on hypervisor resources (a rough sketch of the arithmetic is at the bottom of this mail).
- B) Add a new setting for a "host" level "cut off threshold" in addition to the existing "cluster" level threshold.

Thoughts?

Thank You,

Logan Barfield
Tranquil Hosting


On Wed, Jul 8, 2015 at 1:49 AM, ilya <ilya.mailing.li...@gmail.com> wrote:
> Perhaps memory overcommit is set to greater than 1? What is your cut off
> threshold? It is usually set to 85%, which means you always leave 15% of
> memory in reserve.
>
> Also, is the memory balloon driver in your guest VMs disabled?
>
> As a test, create some VMs that are slightly under your cut off threshold,
> i.e. if your total is 256GB of RAM on the host and usable is 217GB (assuming
> 85%), create 4 VMs with 50GB RAM each.
>
> Next, install mprime or another memory load generator to allocate all of the
> guests' memory and see if the OOM re-occurs on the hypervisor.
>
> Regards
> ilya
>
> On 7/7/15 12:53 PM, Tony Fica wrote:
>>
>> We have several KVM nodes running CloudStack 4.4.2. Sometimes an instance
>> with X amount of RAM provisioned will be started on a host that has X plus
>> a small amount of RAM free. The kernel OOM killer will eventually kill off
>> the instance. Has anyone else seen this behavior, and is there a way to
>> reserve RAM for use by the host instead of by CloudStack? Looking at the
>> numbers in the database and the logs, CloudStack is trying to use 100% of
>> the RAM on the host.
>>
>> Any thoughts would be appreciated.
>>
>> Thanks,
>>
>> Tony Fica
>>
>
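For completeness, here is a minimal sketch of the arithmetic behind option A above (plain Python, hypothetical values; this is not how CloudStack's allocator is actually implemented):

```python
# Treat a host "overprovisioning" ratio below 1 as a hard cap on guest memory,
# leaving the remainder for the hypervisor, page cache, etc.
HOST_TOTAL_GIB = 128
HOST_OVERPROVISION = 0.85           # proposed: allow values < 1 to reserve headroom

usable_gib = HOST_TOTAL_GIB * HOST_OVERPROVISION   # ~108.8 GiB available to guests

requested_vms_gib = [32, 32, 32, 24]               # the 120 GiB example from above

placed = []
for vm in requested_vms_gib:
    if sum(placed) + vm <= usable_gib:
        placed.append(vm)
    else:
        print(f"{vm} GiB VM rejected: only {usable_gib - sum(placed):.1f} GiB of capped capacity left")

print(f"placed {sum(placed)} GiB of guests, "
      f"{HOST_TOTAL_GIB - usable_gib:.1f} GiB kept back for the hypervisor and caching")
```

With these numbers the fourth VM (24 GiB) would be refused instead of pushing the host to 100% and inviting the OOM killer.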