We've run into this a few times as well.

As I understand it, the "cut off threshold" applies to a cluster, not
to an individual hypervisor.  So a KVM hypervisor can get filled up
regardless (an overcommit ratio of "1" still means "100%", and a
single hypervisor can fill up while the cluster is still < 85% full).
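
To make that concrete, here's a toy calculation (the numbers are made
up) showing how one host can sit at 100% while the cluster as a whole
stays comfortably under an 85% cut-off:

    # Hypothetical 4-host cluster: the cluster stays under an 85% cut-off
    # even though host 1 is completely full.
    hosts_gb = [128, 128, 128, 128]   # per-host memory
    used_gb  = [128, 64, 48, 60]      # host 1 is at 100%

    cluster_util = sum(used_gb) / sum(hosts_gb)
    print(f"cluster utilization: {cluster_util:.0%}")             # ~59%
    print(f"host 1 utilization: {used_gb[0] / hosts_gb[0]:.0%}")  # 100%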

Right now CloudStack doesn't always show an accurate memory count on
the host (e.g., a host with 128GB shows up as ~126GB).  The difference
isn't because of "reserved" memory; it's due to how Java does the byte
-> gigabyte math.
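
For illustration (the byte count below is hypothetical, not what
CloudStack actually reports), dividing a raw byte total by 1024^3 and
truncating to a whole number is enough to shave a couple of GB off the
nameplate figure, and the result gets labeled "GB" even though it is
really GiB:

    # Hypothetical byte total reported for a "128GB" host
    reported_bytes = 135291469824
    print(reported_bytes / 1024**3)    # 126.0 (GiB, displayed as "GB")
    print(reported_bytes // 1024**3)   # 126 after integer truncation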

KVM (as far as I've found) doesn't have "reserved" memory the way
Xen's Dom0 does, so currently you have to keep tabs on memory usage
outside of CloudStack.
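
One way to keep tabs on it outside of CloudStack (a rough sketch using
the libvirt Python bindings; assumes python-libvirt is installed on
the host) is to compare the memory committed to running guests against
the host's total:

    import libvirt

    # Compare memory committed to running guests against the host total,
    # independent of what CloudStack thinks is free.
    conn = libvirt.open('qemu:///system')
    total_mib = conn.getInfo()[1]   # host memory, in MiB
    committed_kib = sum(
        dom.maxMemory()             # configured guest memory, in KiB
        for dom in conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE))
    committed_mib = committed_kib // 1024
    print(f"committed {committed_mib} MiB of {total_mib} MiB "
          f"({committed_mib / total_mib:.0%})")
    conn.close()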

This becomes a bigger issue when you take guest disk caching into
account, since even with plenty of actual headroom the hypervisor will
quickly eat into swap with caching.  When you deploy a new VM the
hypervisor (or Linux, really) is SUPPOSED to immediately reclaim any
spare memory from the page cache and hand it to the new VM process,
but in practice it doesn't always work as well as it should.  It's
also much more common when deploying just a few VMs with a LOT of
memory per host (e.g., a host with 128GB of memory running 3 VMs with
32GB each plus 1 VM with 24GB, for a total of 120GB).
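
A quick way to spot that pattern on the host (rough sketch; it just
reads /proc/meminfo directly) is to compare how much page cache is
being held against how much swap is in use:

    # Rough sketch: flag a host that is dipping into swap while still
    # holding a large page cache.
    def meminfo():
        info = {}
        with open('/proc/meminfo') as f:
            for line in f:
                key, value = line.split(':')
                info[key] = int(value.split()[0])   # values are in KiB
        return info

    m = meminfo()
    cached_gib = m['Cached'] / 1024.0**2
    swap_used_gib = (m['SwapTotal'] - m['SwapFree']) / 1024.0**2
    print(f"page cache: {cached_gib:.1f} GiB, "
          f"swap in use: {swap_used_gib:.1f} GiB")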

The solution as I see it is to implement one of the following (a rough
sketch of the resulting capacity check is below):
- A) Allow setting host "overprovisioning" ratios of less than "1"
(e.g., 0.85).  This would effectively act as a "cap" on hypervisor
resources.
- B) Add a new setting for a "host" level "cut off threshold" in
addition to the existing "cluster" level threshold.
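
Either way, the check on the management server would boil down to
something like this (hypothetical names, not actual CloudStack code):

    def vm_fits_on_host(host_total_gb, host_allocated_gb, vm_gb,
                        host_ratio=0.85):
        # Hypothetical host-level check: cap usable memory at
        # host_total_gb * host_ratio instead of the full 100%.
        usable_gb = host_total_gb * host_ratio
        return host_allocated_gb + vm_gb <= usable_gb

    # 128GB host with 96GB already allocated:
    print(vm_fits_on_host(128, 96, 24))   # False -- 120 > 108.8
    print(vm_fits_on_host(128, 96, 12))   # True  -- 108 <= 108.8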


Thoughts?

Thank You,

Logan Barfield
Tranquil Hosting


On Wed, Jul 8, 2015 at 1:49 AM, ilya <ilya.mailing.li...@gmail.com> wrote:
> Perhaps memory overcommit is set to greater than 1? What is your cut off
> threshold? It is usually set to 85%, which means you always leave 15% of
> memory in reserve.
>
> Also, is the memory balloon driver in your guest VMs disabled?
>
> As a test, create some VMs that are slightly under your cut off threshold,
> i.e. if your total is 256GB of RAM on the host and usable is 217GB (assuming
> 85%), create 4 VMs with 50GB RAM.
>
> Next, install mprime or another memory load generator to allocate all of its
> memory and see if the OOM re-occurs on the hypervisor.
>
> Regards
> ilya
>
> On 7/7/15 12:53 PM, Tony Fica wrote:
>>
>> We have several KVM nodes running CloudStack 4.4.2. Sometimes an instance
>> with X amount of RAM provisioned will be started on a host that only has X
>> plus a small amount of RAM free. The kernel OOM killer will eventually kill
>> off the instance.  Has anyone else seen this behavior, and is there a way to
>> reserve RAM for use by the host instead of by CloudStack? Looking at the
>> numbers in the database and the logs, CloudStack is trying to use 100% of
>> the RAM on the host.
>>
>> Any thoughts would be appreciated.
>>
>> Thanks,
>>
>> Tony Fica
>
>
