On 3/26/2014 11:55 PM, Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC)
wrote:
Hi Martin & Adam,
If MOM is using the VDSM HypervisorInterface:
1. Use the API function Global.getCapabilities to get the host NUMA topology data:
'autoNumaBalancing': true/false
'numaNodes': {'': {'memTotal': 'str'}, …}
2. Use the API function Global.getStats to get the host NUMA statistics data:
'numaNodeMemFree': {'': {'memFree': 'str'}, …}
Assume MOM already gets the per-CPU usage info, etc.
You can also use the libvirt APIs getCapabilities and getMemoryStats to merge
these data.
I am not sure:
1. Are these data enough for the MOM feature?
2. Do you need the VM NUMA topology data?
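As a sketch of how a consumer might combine the two payloads described above: the function below merges per-node total memory (from getCapabilities) with per-node free memory (from getStats) into one host view. The function name and the exact response layout are assumptions for illustration, not the real MOM or VDSM code.

```python
def host_numa_view(caps, stats):
    """Merge host NUMA topology (getCapabilities) with NUMA statistics
    (getStats) into a single per-node dictionary.

    caps  -- dict with 'autoNumaBalancing' and 'numaNodes' as described above
    stats -- dict with 'numaNodeMemFree' as described above
    """
    nodes = {}
    # Topology side: per-node total memory (values arrive as strings).
    for node_id, info in caps.get('numaNodes', {}).items():
        nodes[node_id] = {'memTotal': int(info['memTotal'])}
    # Statistics side: per-node free memory, merged onto the same node ids.
    for node_id, info in stats.get('numaNodeMemFree', {}).items():
        nodes.setdefault(node_id, {})['memFree'] = int(info['memFree'])
    return {'autoNumaBalancing': caps.get('autoNumaBalancing', False),
            'nodes': nodes}
```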
If it's a matter of just defining/implementing an API on the VDSM side
that returns a specified VM's virtual NUMA topology, let us consider
doing so. If not MOM, perhaps someone else will eventually find a use
for it later on :)
Vinod
Best Regards,
Jason Liao
-----Original Message-----
From: Vinod, Chegu
Sent: March 26, 2014 21:35
To: Adam Litke
Cc: Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC); Martin Sivak; Gilad
Chaplik; Liang, Shang-Chun (David Liang, HPservers-Core-OE-PSC); Shi, Xiao-Lei
(Bruce, HP Servers-PSC-CQ); Doron Fediuck; vdsm-devel
Subject: Re: FW: Fwd: Question about MOM
On 3/26/2014 5:35 AM, Adam Litke wrote:
On 26/03/14 03:50 -0700, Chegu Vinod wrote:
Restoring the email alias. Please keep discussions as public as
possible to allow others to contribute to the design and planning.
Fine
Jason.
Please see below...
On 3/26/2014 1:38 AM, Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC)
wrote:
Hi All,
Following the discussion below, I got these points:
1. MOM's gathering of NUMA information (topology, statistics, ...) will
change in the future (one side using the VDSM API, the other side using
libvirt and system APIs).
I didn't follow your sentence..
Please work with Adam/Martin and provide the needed APIs on the VDSM
side... so that the MOM entity thread can use the API and extract what
it needs about NUMA topology and cpu/memory usage info. As I see
it... this is probably the only piece that would be relevant to make
available at the earliest (preferably in oVirt 3.5), and that
would enable MOM to pursue next steps as they see fit.
Beyond that... at this point (for oVirt 3.5) let us not spend more
time on MOM internals, please. Let us leave that to Adam and Martin to
pursue as/when they see fit.
2. Martin and Adam will take a look at the MOM policy in the oVirt
scheduler when the NUMA feature is turned on.
Yes please.
3. The oVirt engine will have a NUMA-aware placement algorithm to make the
VM run within NUMA nodes in the best way.
The "algorithm" here is driven by user-specified pinning requests
(and/or) by the oVirt scheduler. In the case of a user request (upon
approval from the oVirt scheduler), VDSM -> libvirt will be explicitly
told what to do via numatune/cputune etc. In the absence of a
user-specified pinning request, I don't know if the oVirt scheduler
intends to convey the numatune/cputune type of requests to
libvirt...
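For reference, the user-specified pinning mentioned above ends up as cputune/numatune elements in the libvirt domain XML. The fragment below is purely illustrative; the vcpu, cpuset, and nodeset values are made up.

```xml
<!-- Illustrative fragment only: pin vcpu 0 to host CPUs 0-3 and
     restrict the VM's memory allocation to host NUMA node 0. -->
<cputune>
  <vcpupin vcpu="0" cpuset="0-3"/>
</cputune>
<numatune>
  <memory mode="strict" nodeset="0"/>
</numatune>
```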
4. The oVirt engine will have some algorithm to automatically configure
virtual NUMA when creating a big VM (big memory or many vcpus).
This is a good suggestion, but in my view it should be taken up after
oVirt 3.5.
For now, just accept and process the user-specified requests...
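To make point 4 concrete, here is one possible shape such a heuristic could take: split a big VM's memory and vcpus across enough virtual NUMA nodes that each fits within a host node. The function name and the policy are hypothetical; this is not the oVirt engine's actual algorithm.

```python
def auto_vnuma(vm_mem_mb, vm_vcpus, host_node_mem_mb):
    """Sketch of an auto-sizing heuristic for virtual NUMA.

    Choose the number of virtual nodes as the smallest count whose
    per-node memory fits in a host NUMA node, then spread memory
    evenly and assign vcpus round-robin.
    """
    # Ceiling division: how many host-node-sized chunks the VM needs.
    nodes = max(1, -(-vm_mem_mb // host_node_mem_mb))
    return [{'cpus': list(range(i, vm_vcpus, nodes)),
             'mem_mb': vm_mem_mb // nodes}
            for i in range(nodes)]
```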
5. Investigate whether KSM and memory ballooning have the right tuning
parameters when the NUMA feature is turned on.
That is for Adam/Martin et al. ... not for your specific project.
We just need to ensure that they have the basic NUMA info they need
(via the VDSM API I mentioned above)... so that it enables them to
work on their part independently as/when they see fit.
6. Investigate whether Automatic NUMA balancing is keeping the processes
reasonably balanced, and notify the oVirt engine.
Not sure I follow what you are saying...
Here is what I have in my mind :
Check if the target host has Automatic NUMA balancing enabled (you
can use sysctl -a | grep numa_balancing or a similar underlying
mechanism for determining this). If it's present, then check whether it's
enabled or not (a value of 1 is enabled and 0 is disabled)... and
convey this information to the oVirt engine GUI for display (this is
a hint for a user (if they wish) to skip manual pinning). This in
my view is the minimum... at this point (and it would be great if we
can make it happen for oVirt 3.5).
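The sysctl check described above can be sketched as follows. On Linux the numa_balancing sysctl is backed by /proc/sys/kernel/numa_balancing (1 = enabled, 0 = disabled, file absent = feature not in the kernel); the function name here is an assumption for illustration.

```python
import os

NUMA_BALANCING = '/proc/sys/kernel/numa_balancing'

def auto_numa_status(path=NUMA_BALANCING):
    """Return 'enabled', 'disabled', or 'unavailable' for Automatic
    NUMA balancing, based on the kernel.numa_balancing sysctl file."""
    if not os.path.exists(path):
        # The kernel does not expose the feature at all.
        return 'unavailable'
    with open(path) as f:
        return 'enabled' if f.read().strip() == '1' else 'disabled'
```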
I think since we have vdsm, you can choose to always enable autonuma
(when it is present).
I don't speak for the various Linux distros out there... but I suspect most may
choose to have the default set to enabled (if the feature is present in the
OS).
Again... there should be some indication on the oVirt engine side (and in my
opinion it might be useful to display it to the user too) of whether a given
host has the feature currently enabled or not (either because it was disabled
or because the feature is not present in the OS).
Are there any drawbacks to enabling it always?
Can't speak for every possible use case...but based on what I know at this