Re: [libvirt] [RFC PATCH auto partition NUMA guest domains v1 0/2] auto partition guests providing the host NUMA topology

2018-10-15 Thread Wim ten Have
On Tue, 25 Sep 2018 14:37:15 +0200
Jiri Denemark  wrote:

> On Tue, Sep 25, 2018 at 12:02:40 +0200, Wim Ten Have wrote:
> > From: Wim ten Have 
> > 
> > This patch extends the guest domain administration, adding support
> > to automatically advertise the host NUMA architecture, obtained from
> > the host capabilities, under a guest by creating a vNUMA copy.
> 
> I'm pretty sure someone would find this useful and such a configuration
> is perfectly valid in libvirt. But I don't think there is a compelling
> reason to add some magic into the domain XML which would automatically
> expand to such a configuration. It's basically a NUMA placement policy,
> and libvirt generally tries to avoid including any kind of policy,
> preferring to just provide all the mechanisms and knobs which can be
> used by applications to implement any policy they like.
> 
> > The mechanism is enabled by setting the check='numa' attribute under
> > the CPU 'host-passthrough' topology:
> >   <cpu mode='host-passthrough' check='numa'/>
> 
> Anyway, this is definitely not the right place for such an option. The
> 'check' attribute is described as
> 
> "Since 3.2.0, an optional check attribute can be used to request a
> specific way of checking whether the virtual CPU matches the
> specification."
> 
> and the new 'numa' value does not fit in there in any way.
> 
> Moreover, the code does the automatic NUMA placement at the moment
> libvirt parses the domain XML, which is not the right place since it
> would break migration, snapshots, and save/restore features.

  Howdy, thanks for your fast response.  I was Out Of Office for a
  while, unable to reply earlier.  The bulk of this code does indeed
  not belong under the domain code and should rather move into the
  NUMA-specific code; check='numa' was simply a badly chosen attribute.

  Also, whilst OOO it occurred to me that besides auto-partitioning the
  host into a vNUMA replica, there is probably another configuration
  target we may introduce: reserving a single NUMA node out of the
  nodes reserved for a guest to configure.

  So my plan is to come back asap with reworked code.

> We have existing placement attributes for the vcpu and numatune/memory
> elements, which would have been a much better place for implementing
> such a feature. And even the cpu/numa element could have been enhanced
> to support similar configuration.

  Going over the libvirt documentation, I am most drawn to the vcpu
  area.  As said, let me rework this and return with a better
  approach/RFC asap.

Rgds,
- Wim10H.


> Jirka



Re: [libvirt] [RFC PATCH auto partition NUMA guest domains v1 0/2] auto partition guests providing the host NUMA topology

2018-09-25 Thread Jiri Denemark
On Tue, Sep 25, 2018 at 12:02:40 +0200, Wim Ten Have wrote:
> From: Wim ten Have 
> 
> This patch extends the guest domain administration, adding support
> to automatically advertise the host NUMA architecture, obtained from
> the host capabilities, under a guest by creating a vNUMA copy.

I'm pretty sure someone would find this useful and such a configuration
is perfectly valid in libvirt. But I don't think there is a compelling
reason to add some magic into the domain XML which would automatically
expand to such a configuration. It's basically a NUMA placement policy,
and libvirt generally tries to avoid including any kind of policy,
preferring to just provide all the mechanisms and knobs which can be
used by applications to implement any policy they like.

> The mechanism is enabled by setting the check='numa' attribute under
> the CPU 'host-passthrough' topology:
>   <cpu mode='host-passthrough' check='numa'/>

Anyway, this is definitely not the right place for such an option. The
'check' attribute is described as

"Since 3.2.0, an optional check attribute can be used to request a
specific way of checking whether the virtual CPU matches the
specification."

and the new 'numa' value does not fit in there in any way.
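
For reference, a minimal illustration of the existing semantics (an
assumed example, not taken from the patch; the values libvirt documents
for 'check' are 'none', 'partial' and 'full'):

  <cpu mode='host-model' check='partial'/>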

Moreover, the code does the automatic NUMA placement at the moment
libvirt parses the domain XML, which is not the right place since it
would break migration, snapshots, and save/restore features.

We have existing placement attributes for the vcpu and numatune/memory
elements, which would have been a much better place for implementing
such a feature. And even the cpu/numa element could have been enhanced
to support similar configuration.
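
For reference, a minimal sketch of those existing knobs (an assumed
example; placement='auto' delegates the placement decision to numad):

  <vcpu placement='auto'>16</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>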

Jirka



[libvirt] [RFC PATCH auto partition NUMA guest domains v1 0/2] auto partition guests providing the host NUMA topology

2018-09-25 Thread Wim Ten Have
From: Wim ten Have 

This patch extends the guest domain administration, adding support
to automatically advertise the host NUMA architecture, obtained from
the host capabilities, under a guest by creating a vNUMA copy.

The mechanism is enabled by setting the check='numa' attribute under
the CPU 'host-passthrough' topology:
   <cpu mode='host-passthrough' check='numa'/>

When enabled, the mechanism automatically renders the NUMA architecture
provided by the host capabilities, evenly balances the guest's reserved
vcpus and memory amongst its composed vNUMA cells, and pins each cell's
allocated vcpus to the physical cpusets of the matching host NUMA node.
This is done in such a way that the host NUMA topology is still in
effect under the partitioned guest domain.
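
As a quick sanity check of that even balance (simple arithmetic over
this cover letter's figures, not tool output): with 16 vcpus and
67108864 KiB spread over 8 host NUMA nodes, each vNUMA cell receives
16 / 8 = 2 vcpus and 67108864 / 8 = 8388608 KiB of memory.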

The example below auto-partitions the physical NUMA detail listed by
the host's 'lscpu' into a guest domain vNUMA description.

[root@host ]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                240
On-line CPU(s) list:   0-239
Thread(s) per core:    2
Core(s) per socket:    15
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Xeon(R) CPU E7-8895 v2 @ 2.80GHz
Stepping:              7
CPU MHz:               3449.555
CPU max MHz:           3600.0000
CPU min MHz:           1200.0000
BogoMIPS:              5586.28
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              38400K
NUMA node0 CPU(s):     0-14,120-134
NUMA node1 CPU(s):     15-29,135-149
NUMA node2 CPU(s):     30-44,150-164
NUMA node3 CPU(s):     45-59,165-179
NUMA node4 CPU(s):     60-74,180-194
NUMA node5 CPU(s):     75-89,195-209
NUMA node6 CPU(s):     90-104,210-224
NUMA node7 CPU(s):     105-119,225-239
Flags:                 ...
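
The host-side NUMA detail above can also be cross-checked against what
libvirt itself reports; a sketch, assuming xmllint (libxml2) is
installed ('virsh capabilities' alone prints the full XML):

[root@host ]# virsh capabilities | xmllint --xpath '//host/topology' -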

The guest 'anuma' without the auto-partition rendering enabled
reads:  "<cpu mode='host-passthrough'/>"

<domain type='kvm'>
  <name>anuma</name>
  <uuid>3f439f5f-1156-4d48-9491-945a2c0abc6d</uuid>
  <memory unit='KiB'>67108864</memory>
  <currentMemory unit='KiB'>67108864</currentMemory>
  <vcpu placement='static'>16</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
    ...
  </os>
  ...
  <cpu mode='host-passthrough'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    ...
  </devices>
</domain>

Enabling the auto partitioning, the guest 'anuma' XML is rewritten
as listed below:  "<cpu mode='host-passthrough' check='numa'>"

<domain type='kvm'>
  <name>anuma</name>
  <uuid>3f439f5f-1156-4d48-9491-945a2c0abc6d</uuid>
  <memory unit='KiB'>67108864</memory>
  <currentMemory unit='KiB'>67108864</currentMemory>
  <vcpu placement='static'>16</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0-14,120-134'/>
    <vcpupin vcpu='1' cpuset='0-14,120-134'/>
    ...
    <vcpupin vcpu='14' cpuset='105-119,225-239'/>
    <vcpupin vcpu='15' cpuset='105-119,225-239'/>
  </cputune>
  <os>
    <type arch='x86_64'>hvm</type>
    ...
  </os>
  ...
  <cpu mode='host-passthrough' check='numa'>
    <topology sockets='8' cores='1' threads='2'/>
    <numa>
      <cell id='0' cpus='0-1' memory='8388608' unit='KiB'>
        <distances>
          <sibling id='0' value='...'/>
          ...
          <sibling id='7' value='...'/>
        </distances>
      </cell>
      ...
      <cell id='7' cpus='14-15' memory='8388608' unit='KiB'>
        <distances>
          <sibling id='0' value='...'/>
          ...
          <sibling id='7' value='...'/>
        </distances>
      </cell>
    </numa>
  </cpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    ...
  </devices>
</domain>
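
Once such a guest is defined, the rendered pinning and memory placement
can be sanity-checked from the host with standard virsh subcommands,
e.g.:

[root@host ]# virsh vcpupin anuma
[root@host ]# virsh numatune anuma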
Finally, the auto-partitioned guest 'anuma' shows the virtual vNUMA
detail in its 'lscpu' listing.

[root@anuma ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    1
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Xeon(R) CPU E7-8895 v2 @ 2.80GHz
Stepping: