Re: KVM Node running out of RAM

2015-07-09 Thread Logan Barfield
I'll look into the overcommit and OOM tweaks, thanks. We've unfortunately had to disable KSM because it was causing consistent network issues on our VMs. There was a known kernel bug that was patched but later reintroduced as a regression. I still think adding the "host" overcommit tunable in Clo

Re: KVM Node running out of RAM

2015-07-09 Thread Nux!
> This is where I was going with this as well. Keep over-provisioning
> level under 1 and cut off at 85%.
>
>> - B) Add a new setting for a "host" level "cut off threshold" in
>> addition to the existing "cluster" level threshold.
> This would be a better safeguard. Also, KVM has a setting that woul

Re: KVM Node running out of RAM

2015-07-08 Thread ilya
Please see response in-line. On 7/8/15 7:51 AM, Logan Barfield wrote: We've run into this a few times as well. As I understand it the "cut off threshold" applies to a cluster, not an individual hypervisor. So a KVM hypervisor can get filled up regardless (overcommit "1" still means "100%", and

Re: KVM Node running out of RAM

2015-07-08 Thread Logan Barfield
We've run into this a few times as well. As I understand it the "cut off threshold" applies to a cluster, not an individual hypervisor. So a KVM hypervisor can get filled up regardless (overcommit "1" still means "100%", and a hypervisor can fill up while a cluster is still < 85% full). Right no
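To illustrate the point above (a cluster can sit below the cut off threshold while one hypervisor is nearly full), here is a minimal sketch. It is not CloudStack's allocator code; the `Host` class, the host names, and the memory figures are all made up for illustration, and the 85% value is simply the threshold mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical record for one hypervisor; not a CloudStack type.
    name: str
    total_mb: int
    used_mb: int


def cluster_usage(hosts):
    """Fraction of memory used across the whole cluster."""
    return sum(h.used_mb for h in hosts) / sum(h.total_mb for h in hosts)


def hosts_over_threshold(hosts, threshold=0.85):
    """Names of hosts whose own usage exceeds the threshold."""
    return [h.name for h in hosts if h.used_mb / h.total_mb > threshold]


# Illustrative numbers only: one host nearly full, two half full.
hosts = [
    Host("kvm1", total_mb=65536, used_mb=64512),
    Host("kvm2", total_mb=65536, used_mb=32768),
    Host("kvm3", total_mb=65536, used_mb=32768),
]

print(f"cluster usage: {cluster_usage(hosts):.0%}")
print("hosts over 85%:", hosts_over_threshold(hosts))
```

With these made-up figures the cluster-wide check passes while kvm1 is individually over the limit, which is the case a per-host threshold would catch.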

Re: KVM Node running out of RAM

2015-07-07 Thread ilya
Perhaps memory overcommit is set to greater than 1? What is your cut off threshold? It is usually set to 85%, which means you always leave 15% of memory in reserve. Also, is the memory balloon driver in your guest VMs disabled? As a test, create some VMs that are slightly under your cut off threshold,
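As a rough sketch of the arithmetic behind an 85% cut off (assuming an overcommit factor of 1, and assuming the threshold is applied to a single pool of memory), the check below shows how much memory stays in reserve and whether a new VM would still fit. The function names and the sample sizes are hypothetical and are not taken from CloudStack.

```python
def allocatable_mb(total_mb, cutoff_percent=85):
    """Memory that may be allocated before the cut off is reached."""
    return total_mb * cutoff_percent // 100


def reserve_mb(total_mb, cutoff_percent=85):
    """Memory kept back by the cut off threshold."""
    return total_mb - allocatable_mb(total_mb, cutoff_percent)


def vm_fits(total_mb, used_mb, vm_mb, cutoff_percent=85):
    """True if starting the VM keeps usage at or below the cut off."""
    return used_mb + vm_mb <= allocatable_mb(total_mb, cutoff_percent)


total = 65536  # illustrative pool size in MB
print("reserve:", reserve_mb(total), "MB")
print("4 GB VM fits at 50000 MB used:", vm_fits(total, 50000, 4096))
print("4 GB VM fits at 52000 MB used:", vm_fits(total, 52000, 4096))
```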

KVM Node running out of RAM

2015-07-07 Thread Tony Fica
We have several KVM nodes running CloudStack 4.4.2. Sometimes an instance with X amount of RAM provisioned will be started on a host that has X plus a small amount of RAM free. The kernel OOM killer will eventually kill off the instance. Has anyone else seen this behavior? Is there a way to reserv