Re: [Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-27 Thread Jay Pipes

On 03/23/2017 01:01 PM, Jean-Philippe Methot wrote:

Hi,

Lately, on my production openstack Newton setup, I've run into a
situation that defies my assumptions regarding memory management on
Openstack compute nodes and I've been looking for explanations.
Basically, we had a VM with a flavor that limited it to 96 GB of ram,
which, to be quite honest, we never thought we could ever reach. This is
a very important VM where we wanted to avoid running out of memory at
all cost. The VM itself generally uses about 12 GB of ram.

We were surprised when we noticed yesterday that this VM, which has been
running for several months, was using all its 96 GB on the compute host.
Despite that, in the guest, the OS was indicating a memory usage of
about 12 GB. The only explanation I see to this is that at some point in
time, the host had to allocate all the 96GB of ram to the VM process and
it never took back the allocated ram. This prevented the creation of
more guests on the node as it was showing it didn't have enough memory
left.

Now, I was under the assumption that memory ballooning was integrated
into nova and that the amount of allocated memory to a specific guest
would deflate once that guest did not need the memory. After
verification, I've found blueprints for it, but I see no trace of any
implementation anywhere.

I also notice that on most of our compute nodes, the amount of ram used
is much lower than the amount of ram allocated to VMs, which I do
believe is normal.

So basically, my question is, how does openstack actually manage ram
allocation? Will it ever take back the unused ram of a guest process?
Can I force it to take back that ram?


Basically, you are using a hammer as a screwdriver.

The tool that Nova gives you to prevent other VMs from consuming memory 
allocated to another VM is called the ram_allocation_ratio. By default, 
this is set to 1.5, meaning that if you have 100GB of RAM on a compute 
host, you can allocate VMs that would consume up to 150GB of RAM.


For your VM that has 12GB of RAM used but 96GB allocated, you do not 
want to do that. Instead, give that VM around 16GB of memory, set your 
compute host's ram_allocation_ratio (in nova.conf) to 1.0 and then 
instances on that compute host will not be able to consume more RAM than 
is available on the host.
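
For example, a minimal nova.conf sketch for that compute host (values are 
illustrative only; reserved_host_memory_mb is optional but worth setting):

  [DEFAULT]
  # do not overcommit RAM: schedulable RAM = physical RAM x 1.0
  ram_allocation_ratio = 1.0
  # hold back some RAM for the host OS and services (tune to taste)
  reserved_host_memory_mb = 4096

Restart nova-compute after changing these so the new values are picked up.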


Best,
-jay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-24 Thread Steve Gordon


- Original Message -
> From: "Jean-Philippe Methot" 
> To: "Edmund Rhudy" 
> Cc: openstack-operators@lists.openstack.org
> Sent: Thursday, March 23, 2017 3:49:26 PM
> Subject: Re: [Openstack-operators] Memory usage of guest vms, ballooning and 
> nova
> 
> 
> On 2017-03-23 15:15, Edmund Rhudy (BLOOMBERG/ 120 PARK) wrote:
> > What sort of memory overcommit value are you running Nova with? The
> > scheduler looks at an instance's reservation rather than how much
> > memory is actually being used by QEMU when making a decision, as far
> > as I'm aware (but please correct me if I am wrong on this point). If
> > the HV has 128GB of memory, the instance has a reservation of 96GB,
> > you have 16GB reserved via reserved_host_memory_mb,
> > ram_allocation_ratio is set to 1.0, and you try to launch an instance
> > from a flavor with 32GB of memory, it will fail to pass RamFilter in
> > the scheduler and the scheduler will not consider it a valid host for
> > placement. (I am assuming you are using FilterScheduler still, as I
> > know nothing about the new placement API or what parts of it do and
> > don't work in Newton.)
> The overcommit value is set to 1.5 in the scheduler. It's not the
> scheduler that was preventing the instance from being provisioned, it
> was qemu returning that there was not enough ram when libvirt was trying
> to provision the instance (that error was not handled well by openstack,
> btw, but that's something else). So the instance does pass every filter.
> It just ends up in error when getting provisioned in the compute node
> because of a lack of ram, with the actual full error message only
> visible in the QEMU logs.
> > As far as why the memory didn't automatically get reclaimed, maybe KVM
> > will only reclaim empty pages and memory fragmentation in the guest
> > prevented it from doing so? It might also not actively try to reclaim
> > memory unless it comes under pressure to do so, because finding empty
> > pages and returning them to the host may be a somewhat time-consuming
> > operation.
> 
> That's entirely possible, but according to the doc, libvirt is supposed
> to have a memory balloon function that does the operation of reclaiming
> empty pages from guest processes, or so I understand. Now, how this
> function works is not exactly clear to me, or even if nova uses it or
> not. Another user suggested it might not be automatic, which is in
> accordance with what you're conjecturing.

As a general rule Libvirt provides an interface to facilitate various actions 
on the guest, but does not perform them without intervention - that is, it 
generally needs to be triggered to do something, either by a management layer 
(OpenStack, oVirt, virt-manager, Boxes, etc.) or by an explicit call from the 
operator (e.g. via virsh).
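
For example (purely a sketch - the domain name below is made up, use whatever 
"virsh list" shows for the instance), an operator could deflate a guest by hand:

  # ask the balloon driver to shrink the guest to 16 GiB (argument is in KiB)
  virsh setmem instance-0000abcd 16777216 --live

The guest needs a working virtio balloon driver for this to have any effect.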

In this case, as Chris noted, while the memory stats are exposed by default 
and Libvirt exposes an API for interacting with the balloon, there is no 
process in Nova currently - or commonly deployed with it - that will actually 
exercise the ballooning mechanism to expand/contract the memory balloon. In 
oVirt/RHEV the traditional way to do it was using Memory Overcommit Manager 
(MOM) [1] to define and apply policies for managing it - the guest also needs 
to have a driver for the virtio balloon device IIRC.
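
A quick sanity check on a given guest (again a sketch, with an illustrative 
domain name) is to look at what the balloon reports and confirm the device is 
actually present in the domain XML:

  # memory/balloon statistics as libvirt sees them
  virsh dommemstat instance-0000abcd

  # look for a <memballoon> element, typically model='virtio'
  virsh dumpxml instance-0000abcd | grep -A2 memballoon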

Such things have been proposed in OpenStack in the past [2] but, to my 
knowledge, never made it to implementation - though, as you've discovered, it 
still seems like something that is generally desirable.

Thanks,

Steve

[1] http://www.ovirt.org/develop/projects/mom/
[2] https://blueprints.launchpad.net/nova/+spec/libvirt-memory-ballooning



Re: [Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-23 Thread Jean-Philippe Methot


On 2017-03-23 15:15, Edmund Rhudy (BLOOMBERG/ 120 PARK) wrote:
What sort of memory overcommit value are you running Nova with? The 
scheduler looks at an instance's reservation rather than how much 
memory is actually being used by QEMU when making a decision, as far 
as I'm aware (but please correct me if I am wrong on this point). If 
the HV has 128GB of memory, the instance has a reservation of 96GB, 
you have 16GB reserved via reserved_host_memory_mb, 
ram_allocation_ratio is set to 1.0, and you try to launch an instance 
from a flavor with 32GB of memory, it will fail to pass RamFilter in 
the scheduler and the scheduler will not consider it a valid host for 
placement. (I am assuming you are using FilterScheduler still, as I 
know nothing about the new placement API or what parts of it do and 
don't work in Newton.)
The overcommit value is set to 1.5 in the scheduler. It's not the 
scheduler that was preventing the instance from being provisioned, it 
was qemu returning that there was not enough ram when libvirt was trying 
to provision the instance (that error was not handled well by openstack, 
btw, but that's something else). So the instance does pass every filter. 
It just ends up in error when getting provisioned in the compute node 
because of a lack of ram, with the actual full error message only 
visible in the QEMU logs.
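
For anyone hitting the same thing: the full error lands in the per-instance 
QEMU log on the compute node, which on our CentOS 7 nodes means something like 
(the instance name below is just an example):

  less /var/log/libvirt/qemu/instance-0000abcd.log
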
As far as why the memory didn't automatically get reclaimed, maybe KVM 
will only reclaim empty pages and memory fragmentation in the guest 
prevented it from doing so? It might also not actively try to reclaim 
memory unless it comes under pressure to do so, because finding empty 
pages and returning them to the host may be a somewhat time-consuming 
operation.


That's entirely possible, but according to the doc, libvirt is supposed 
to have a memory balloon function that does the operation of reclaiming 
empty pages from guest processes, or so I understand. Now, how this 
function works is not exactly clear to me, or even if nova uses it or 
not. Another user suggested it might not be automatic, which is in 
accordance with what you're conjecturing.


Re: [Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-23 Thread Edmund Rhudy (BLOOMBERG/ 120 PARK)
What sort of memory overcommit value are you running Nova with? The scheduler 
looks at an instance's reservation rather than how much memory is actually 
being used by QEMU when making a decision, as far as I'm aware (but please 
correct me if I am wrong on this point). If the HV has 128GB of memory, the 
instance has a reservation of 96GB, you have 16GB reserved via 
reserved_host_memory_mb, ram_allocation_ratio is set to 1.0, and you try to 
launch an instance from a flavor with 32GB of memory, it will fail to pass 
RamFilter in the scheduler and the scheduler will not consider it a valid host 
for placement. (I am assuming you are using FilterScheduler still, as I know 
nothing about the new placement API or what parts of it do and don't work in 
Newton.)
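
Back-of-the-envelope, the check RamFilter does in that scenario works out 
roughly like this:

  usable  = total_host_ram * ram_allocation_ratio - reserved_host_memory_mb
          = 128GB * 1.0 - 16GB = 112GB
  free    = usable - RAM already allocated to instances
          = 112GB - 96GB = 16GB
  request = 32GB > 16GB free  =>  host rejected by RamFilter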

As far as why the memory didn't automatically get reclaimed, maybe KVM will 
only reclaim empty pages and memory fragmentation in the guest prevented it 
from doing so? It might also not actively try to reclaim memory unless it comes 
under pressure to do so, because finding empty pages and returning them to the 
host may be a somewhat time-consuming operation.

From: jp.met...@planethoster.info 
Subject: Re: [Openstack-operators] Memory usage of guest vms, ballooning and 
nova

Hi,

This is indeed linux, CentOS 7 to be more precise, using qemu-kvm as 
hypervisor. The used ram was in the used column. While we have made 
adjustments by moving and resizing the specific guest that was using 96 
GB (verified in top), the ram usage is still fairly high for the amount 
of allocated ram.

Currently the ram usage looks like this:

              total        used        free      shared  buff/cache   available
Mem:           251G        190G         60G         42M        670M         60G
Swap:          952M        707M        245M


I have 188.5GB of ram allocated to 22 instances on this node. I believe 
it's unrealistic to think that all these 22 instances have cached/are 
using up all their ram at this time.

On 2017-03-23 13:07, Kris G. Lindgren wrote:
> Sorry for the super stupid question.
>
> But if this is linux are you sure that the memory is not actually being 
> consumed via buffers/cache?
>
> free -m
>               total        used        free      shared  buff/cache   available
> Mem:         128751       27708        2796        4099       98246       96156
> Swap:          8191           0        8191
>
> Shows that of 128GB 27GB is used, but buffers/cache consumes 98GB of ram.
>
> ___
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy
>
> On 3/23/17, 11:01 AM, "Jean-Philippe Methot"  
> wrote:
>
>  Hi,
>  
>  Lately, on my production openstack Newton setup, I've run into a
>  situation that defies my assumptions regarding memory management on
>  Openstack compute nodes and I've been looking for explanations.
>  Basically, we had a VM with a flavor that limited it to 96 GB of ram,
>  which, to be quite honest, we never thought we could ever reach. This is
>  a very important VM where we wanted to avoid running out of memory at
>  all cost. The VM itself generally uses about 12 GB of ram.
>  
>  We were surprised when we noticed yesterday that this VM, which has been
>  running for several months, was using all its 96 GB on the compute host.
>  Despite that, in the guest, the OS was indicating a memory usage of
>  about 12 GB. The only explanation I see to this is that at some point in
>  time, the host had to allocate all the 96GB of ram to the VM process and
>  it never took back the allocated ram. This prevented the creation of
>  more guests on the node as it was showing it didn't have enough memory 
> left.
>  
>  Now, I was under the assumption that memory ballooning was integrated
>  into nova and that the amount of allocated memory to a specific guest
>  would deflate once that guest did not need the memory. After
>  verification, I've found blueprints for it, but I see no trace of any
>  implementation anywhere.
>  
>  I also notice that on most of our compute nodes, the amount of ram used
>  is much lower than the amount of ram allocated to VMs, which I do
>  believe is normal.
>  
>  So basically, my question is, how does openstack actually manage ram
>  allocation? Will it ever take back the unused ram of a guest process?
>  Can I force it to take back that ram?
>  
>  --
>  Jean-Philippe Méthot
>  Openstack system administrator
>  PlanetHoster inc.
>  www.planethoster.net
>  
>  
>  ___
>

Re: [Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-23 Thread Jean-Philippe Methot

Hi,

This is indeed linux, CentOS 7 to be more precise, using qemu-kvm as 
hypervisor. The used ram was in the used column. While we have made 
adjustments by moving and resizing the specific guest that was using 96 
GB (verified in top), the ram usage is still fairly high for the amount 
of allocated ram.


Currently the ram usage looks like this:

              total        used        free      shared  buff/cache   available
Mem:           251G        190G         60G         42M        670M         60G
Swap:          952M        707M        245M


I have 188.5GB of ram allocated to 22 instances on this node. I believe 
it's unrealistic to think that all these 22 instances have cached/are 
using up all their ram at this time.


On 2017-03-23 13:07, Kris G. Lindgren wrote:

Sorry for the super stupid question.

But if this is linux are you sure that the memory is not actually being 
consumed via buffers/cache?

free -m
              total        used        free      shared  buff/cache   available
Mem:         128751       27708        2796        4099       98246       96156
Swap:          8191           0        8191

Shows that of 128GB 27GB is used, but buffers/cache consumes 98GB of ram.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 3/23/17, 11:01 AM, "Jean-Philippe Methot"  
wrote:

 Hi,
 
 Lately, on my production openstack Newton setup, I've run into a

 situation that defies my assumptions regarding memory management on
 Openstack compute nodes and I've been looking for explanations.
 Basically, we had a VM with a flavor that limited it to 96 GB of ram,
 which, to be quite honest, we never thought we could ever reach. This is
 a very important VM where we wanted to avoid running out of memory at
 all cost. The VM itself generally uses about 12 GB of ram.
 
 We were surprised when we noticed yesterday that this VM, which has been

 running for several months, was using all its 96 GB on the compute host.
 Despite that, in the guest, the OS was indicating a memory usage of
 about 12 GB. The only explanation I see to this is that at some point in
 time, the host had to allocate all the 96GB of ram to the VM process and
 it never took back the allocated ram. This prevented the creation of
 more guests on the node as it was showing it didn't have enough memory 
left.
 
 Now, I was under the assumption that memory ballooning was integrated

 into nova and that the amount of allocated memory to a specific guest
 would deflate once that guest did not need the memory. After
 verification, I've found blueprints for it, but I see no trace of any
 implementation anywhere.
 
 I also notice that on most of our compute nodes, the amount of ram used

 is much lower than the amount of ram allocated to VMs, which I do
 believe is normal.
 
 So basically, my question is, how does openstack actually manage ram

 allocation? Will it ever take back the unused ram of a guest process?
 Can I force it to take back that ram?
 
 --

 Jean-Philippe Méthot
 Openstack system administrator
 PlanetHoster inc.
 www.planethoster.net
 
 
 ___

 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 



--
Jean-Philippe Méthot
Openstack system administrator
PlanetHoster inc.
www.planethoster.net


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-23 Thread Chris Friesen

On 03/23/2017 11:01 AM, Jean-Philippe Methot wrote:


So basically, my question is, how does openstack actually manage ram allocation?
Will it ever take back the unused ram of a guest process? Can I force it to take
back that ram?


I don't think nova will automatically reclaim memory.

I'm pretty sure that if you have CONF.libvirt.mem_stats_period_seconds set 
(which it is by default) then you can manually tell libvirt to reclaim some 
memory via the "virsh setmem" command.
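
Something like the following, as a sketch - the option lives in the [libvirt] 
section of nova.conf on the compute node, 10 seconds is the default, and 0 
disables the stats polling:

  [libvirt]
  mem_stats_period_seconds = 10

  # then, on the compute node, deflate a guest by hand (size is in KiB)
  virsh setmem <domain-name> 16777216 --live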


Chris

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Memory usage of guest vms, ballooning and nova

2017-03-23 Thread Jean-Philippe Methot

Hi,

Lately, on my production openstack Newton setup, I've run into a 
situation that defies my assumptions regarding memory management on 
Openstack compute nodes and I've been looking for explanations. 
Basically, we had a VM with a flavor that limited it to 96 GB of ram, 
which, to be quite honest, we never thought we could ever reach. This is 
a very important VM where we wanted to avoid running out of memory at 
all cost. The VM itself generally uses about 12 GB of ram.


We were surprised when we noticed yesterday that this VM, which has been 
running for several months, was using all its 96 GB on the compute host. 
Despite that, in the guest, the OS was indicating a memory usage of 
about 12 GB. The only explanation I see to this is that at some point in 
time, the host had to allocate all the 96GB of ram to the VM process and 
it never took back the allocated ram. This prevented the creation of 
more guests on the node as it was showing it didn't have enough memory left.


Now, I was under the assumption that memory ballooning was integrated 
into nova and that the amount of allocated memory to a specific guest 
would deflate once that guest did not need the memory. After 
verification, I've found blueprints for it, but I see no trace of any 
implementation anywhere.


I also notice that on most of our compute nodes, the amount of ram used 
is much lower than the amount of ram allocated to VMs, which I do 
believe is normal.


So basically, my question is, how does openstack actually manage ram 
allocation? Will it ever take back the unused ram of a guest process? 
Can I force it to take back that ram?


--
Jean-Philippe Méthot
Openstack system administrator
PlanetHoster inc.
www.planethoster.net


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators