Status: New
Owner: ----
New issue 127 by miguel.filho: Wrong calculation of free memory when using
KVM (ganeti and hbal)
http://code.google.com/p/ganeti/issues/detail?id=127
Ganeti miscalculates the amount of free memory when using KVM.
# gnt-node list|grep quake
Node                DTotal DFree  MTotal MNode MFree Pinst Sinst
quake.ic.unicamp.br 232.7G 148.2G   3.8G  1.4G  2.2G     2     4
Reality:
r...@quake:~# free -m
             total       used       free     shared    buffers     cached
Mem:          3886       3312        573          0        466       1258
-/+ buffers/cache:       1588       2297
Swap:          487          1        486
gnt-node reports 1.4G of memory actually in use (MNode), and `free` reports
about the same once buffers and cache are excluded:
real_used = total - free - buffers - cached
real_used = 3886 - 573 - 466 - 1258
real_used = 1589M
Sounds about right.
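As a quick cross-check of that arithmetic, here is a minimal Python sketch that recomputes the figure from the `free -m` output above. The function and variable names are made up for illustration; they are not part of Ganeti or procps.

```python
# Hypothetical helper: memory in use once buffers and page cache are excluded.
def used_excluding_cache(total, free, buffers, cached):
    return total - free - buffers - cached

# Values copied from the "Mem:" row of the `free -m` output (in MB).
mem = {"total": 3886, "free": 573, "buffers": 466, "cached": 1258}

real_used = used_excluding_cache(**mem)
print(real_used)  # 1589
```

The result is within 1 MB of the 1588 that `free` prints on its "-/+ buffers/cache" row, which is presumably a rounding difference.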
Now, running hscan on the master and copying the relevant parts of
LOCAL.data:
quake.ic.unicamp.br|3886|1473|2285|238284|151748|4|N
gandalf.ic.unicamp.br|1024|10368|1|running|quake.ic.unicamp.br|node03.ic.unicamp.br|
rio.ic.unicamp.br|1024|10368|1|running|quake.ic.unicamp.br|node03.ic.unicamp.br|
Total: 3886, used: 1473, free: 2285.
Two instances, each configured with 1G.
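To make the numbers easy to pull out of the excerpt, here is a small Python sketch that reads the memory fields from those lines. It assumes only what is stated here: the node line carries total, used and free memory in fields 2 to 4, and each instance line carries its configured memory in field 2. The meaning of the remaining fields is not assumed, and the helper names are hypothetical.

```python
# Lines copied from the LOCAL.data excerpt (wrapped instance lines rejoined).
NODE_LINE = "quake.ic.unicamp.br|3886|1473|2285|238284|151748|4|N"
INSTANCE_LINES = [
    "gandalf.ic.unicamp.br|1024|10368|1|running|quake.ic.unicamp.br|node03.ic.unicamp.br|",
    "rio.ic.unicamp.br|1024|10368|1|running|quake.ic.unicamp.br|node03.ic.unicamp.br|",
]

def parse_node_memory(line):
    """Return name and the three memory fields (MB) of a node line."""
    fields = line.split("|")
    return {
        "name": fields[0],
        "total": int(fields[1]),
        "used": int(fields[2]),
        "free": int(fields[3]),
    }

def parse_instance_memory(line):
    """Return (name, configured memory in MB) of an instance line."""
    fields = line.split("|")
    return fields[0], int(fields[1])

node = parse_node_memory(NODE_LINE)
instances_ram = sum(mem for _, mem in map(parse_instance_memory, INSTANCE_LINES))
print(node["total"], node["used"], node["free"], instances_ram)
```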
Now running hbal:
# hbal -t LOCAL.data
Warning: cluster has inconsistent data:
- node quake.ic.unicamp.br is missing -1920 MB ram and 23 GB disk
So the calculation seems to be:
available_ram: total_ram - node_used_ram - node_free_ram - instances_ram
available_ram: 3886 - 1473 - 2285 - 2048
available_ram: -1920
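The guess above can be checked with a few lines of Python. This is only a sketch of the formula as inferred in this report, not hbal's actual code; the function name is hypothetical.

```python
# Hypothetical reconstruction of the consistency check, as guessed in this report.
def missing_ram(total_ram, node_used_ram, node_free_ram, instance_rams):
    return total_ram - node_used_ram - node_free_ram - sum(instance_rams)

# Node figures from the hscan output, plus two instances of 1024 MB each.
result = missing_ram(3886, 1473, 2285, [1024, 1024])
print(result)  # -1920

# Without the instances term, used + free nearly accounts for the whole node.
unaccounted = 3886 - 1473 - 2285
print(unaccounted)  # 128
```

The -1920 matches the hbal warning, which supports the idea that the configured instance memory is subtracted on top of a node "used" figure that already contains the KVM processes.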
The problem seems to be that hscan reports instances_ram as fully used when
it actually isn't, and that the instances' memory is already "merged" into
the node memory. The real size of the VMs on node quake is:
ps axfu | grep kvm | awk 'BEGIN {total_vsz=0; total_rss=0}
{total_vsz=total_vsz+$5; total_rss=total_rss+$6}
END {printf("VSZ: %d, RSS: %d\n\n", total_vsz/1024, total_rss/1024)}'
VSZ: 2473, RSS: 206
More details at this thread:
http://groups.google.com/group/ganeti/browse_thread/thread/4020459f1c45c4ef