I'll look into the overcommit and OOM tweaks, thanks. Unfortunately we've had to disable KSM because it was causing consistent network issues on our VMs. It was a known kernel bug that was patched, but then reintroduced as a regression.
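For anyone following along, this is roughly how I'd check where a host currently stands on those knobs before changing anything. It's only a sketch: the paths are the standard Linux procfs/sysfs locations, but the helper names (`read_knob`, `host_memory_knobs`) are made up for illustration, and a file that doesn't exist on a given kernel is just reported as unavailable.

```python
from pathlib import Path

# Standard Linux locations for the memory knobs discussed in this thread.
KNOBS = {
    "vm.overcommit_memory": "/proc/sys/vm/overcommit_memory",
    "vm.overcommit_ratio": "/proc/sys/vm/overcommit_ratio",
    "ksm.run": "/sys/kernel/mm/ksm/run",  # 0 = KSM stopped, 1 = running
}


def read_knob(path):
    """Return the stripped contents of a procfs/sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        # Missing file (e.g. KSM not built in) or insufficient permissions.
        return None


def host_memory_knobs(knobs=KNOBS):
    """Map each knob name to its current value (None when unavailable)."""
    return {name: read_knob(path) for name, path in knobs.items()}


if __name__ == "__main__":
    for name, value in host_memory_knobs().items():
        print(f"{name} = {value if value is not None else 'unavailable'}")
```

Run it on the hypervisor itself; it only reads values and changes nothing.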
I still think adding the "host" overcommit tunable in CloudStack would be a good idea though. I don't think there's any reason CloudStack has to be naive about this type of thing. Being able to set "virtual" limits in one place makes it easier for admins to maintain headroom, and I don't see there being many downsides (other than time spent implementing it).

Thank You,
Logan Barfield
Tranquil Hosting

On Thu, Jul 9, 2015 at 5:51 AM, Nux! <n...@li.nux.ro> wrote:
>> This is where i was going with this as well. Keep over-provisioning
>> level under 1 and cut of @ 85%.
>>
>>> - B) Add a new setting for a "host" level "cut off threshold" in
>>> addition to the existing "cluster" level threshold.
>> This would be a better safeguard. Also, KVM has a setting that would
>> restrict how much total memory it can use. Look for kernel argument
>> "mem_overcommit".
>>
>> You can stay under specific threshold to avoid OOMs..
>
> You can also tell the OOMK not to touch the KVM processes, like in this post
> I've just written:
> http://www.nux.ro/archive/2015/07/Protect_KVM_processes_from_OOM_killer.html
>
> In addition KSM (aka "memory deduplication" to simplify it) may help as well
> at the cost of extra CPU usage required for page scans/merges.
>
> HTH
> Lucian
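
To make the host-level idea a bit more concrete, here is a rough sketch of the kind of check I have in mind. This is not CloudStack code or an existing CloudStack setting: the function name and parameters are hypothetical, and the defaults (over-provisioning factor of 1.0, cut-off at 85%) are just the numbers mentioned in this thread.

```python
def host_has_headroom(total_mb, allocated_mb, requested_mb,
                      overcommit_ratio=1.0, cutoff=0.85):
    """Return True if placing a VM keeps this host under its own cut-off.

    total_mb         -- physical memory on the host
    allocated_mb     -- memory already promised to VMs on the host
    requested_mb     -- memory the new VM asks for
    overcommit_ratio -- hypothetical per-host over-provisioning factor
    cutoff           -- hypothetical per-host threshold (fraction of capacity)
    """
    if total_mb <= 0 or overcommit_ratio <= 0 or not 0 < cutoff <= 1:
        raise ValueError("invalid host capacity settings")
    # "Virtual" capacity the admin is willing to hand out on this host.
    virtual_capacity_mb = total_mb * overcommit_ratio
    return allocated_mb + requested_mb <= virtual_capacity_mb * cutoff


if __name__ == "__main__":
    # 64 GB host with 48 GB already allocated and a 4 GB request.
    print(host_has_headroom(65536, 49152, 4096))
```

The point is that the check runs per host, so a single full host gets skipped even when the cluster as a whole is still under its threshold.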