Re: [libvirt] [PATCH 1/3] XML definitions for guest NUMA

Eric Blake Mon, 07 Nov 2011 10:43:01 -0800

On 11/06/2011 06:57 AM, Bharata B Rao wrote:

XML definitions for guest NUMA and parsing routines.


From: Bharata B Rao<bhar...@linux.vnet.ibm.com>

This patch adds XML definitions for guest NUMA specification and contains
routines to parse the same. The guest NUMA specification looks like this:

<cpu>
         ...
         <topology sockets='2' cores='4' threads='2'/>
         <numa>
                 <cell cpus='0-7' mems='512000'/>
                 <cell cpus='8-15' mems='512000'/>
         </numa>
         ...
</cpu>

Signed-off-by: Bharata B Rao<bhar...@linux.vnet.ibm.com>
---

+<p>
+      Guest NUMA topology can be specifed using<code>numa</code>  element.
+<span class="since">Since X.X.X</span>

Let's just put 0.9.8 here. It's easier at feature freeze time to grepand replace a concrete 0.9.8 into a different numbering scheme (0.10.0,1.0.0, ?) if we decide on something different than 0.9.8, than it is toremember to also search for X.X.X.

+</p>
+
+<pre>
+  ...
+&lt;cpu&gt;
+    ...
+&lt;numa&gt;
+&lt;cell cpus='0-3' mems='512000'/&gt;
+&lt;cell cpus='4-7' mems='512000'/&gt;

I understand 'cpus' (a valid word meaning multiple cpu units), but'mems' seems odd; I think it would be better naming this attribute'memory' to match our <memory> element at the top level. Just becauseqemu's command line names the option mems= doesn't mean we should bestuck making our XML inconsistent.

+&lt;/numa&gt;
+    ...
+&lt;/cpu&gt;
+  ...</pre>
+
+<p>
+      Each<code>cell</code>  element specifies a NUMA cell or a NUMA node.
+<code>cpus</code>  specifies the CPU or range of CPUs that are part of
+      the node.<code>mems</code>  specifies the node memory in kilobytes
+      (i.e. blocks of 1024 bytes). Each cell or node is assigned cellid
+      or nodeid in the increasing order starting from 0.

I agree with doing things in 1024-byte blocks [1], since <memory> and<currentMemory> are also in that unit.

[1] 1024-byte blocks is technically kibibytes, not kilobytes; but you'recopying from existing text, so at least we're consistent, not to mentionfitting right in with the wikipedia complaint that KiB has had slowadoption by the computer industry:https://secure.wikimedia.org/wikipedia/en/wiki/Kibibyte :)

+</p>
+
+<p>
+      This guest NUMA specification translates to<code>-numa</code>  command
+      line option for QEMU/KVM. For the above example, the following QEMU
+      command line option is generated:
+<code>-numa node,nodeid=0,cpus=0-3,mems=512000 -numa 
node,nodeid=1,cpus=4-7,mems=512000</code>

This paragraph is not necessary. We don't need to give onehypervisor-specific example of how the XML is translated; it issufficient to simply document the XML semantics in a way that can beimplemented by any number of hypervisors.

+
+<define name="numaCell">
+<element name="cell">
+<attribute name="cpus">
+<ref name="Cellcpus"/>


Typically, ref names start with a lower case letter.

+</attribute>
+<attribute name="mems">
+<ref name="Cellmems"/>
+</attribute>

Is it possible for these attributes to be optional? That is, on theqemu line, can I specify cpus= but not mems=, or mems= but not cpus=?If so, then the attributes need to be optional in the schema, and thecode behave with sane defaults when one of the two attributes is notpresent.

@@ -2745,4 +2767,14 @@
        <param name="pattern">[a-zA-Z0-9_\.:]+</param>
      </data>
    </define>
+<define name="Cellcpus">
+<data type="string">
+<param 
name="pattern">([0-9]+(-[0-9]+)?|\^[0-9]+)(,([0-9]+(-[0-9]+)?|\^[0-9]+))*</param>

This looks like a repeat of <define name="cpuset">; if so, let's reusethat define instead of making a new one (and if not, why are weintroducing yet another syntax?).

+</data>
+</define>
+<define name="Cellmems">
+<data type="unsignedInt">
+<param name="pattern">[0-9]+</param>

Likewise, this looks like a repeat of <define name="memoryKB">, so letsreuse that.

@@ -109,6 +114,19 @@ no_memory:
      return NULL;
  }

+static int
+virCPUDefNumaCPUs(virCPUDefPtr def)
+{
+    int i, j, ncpus = 0;
+
+    for (i = 0; i<  def->ncells; i++) {
+        for (j = 0; j<  VIR_DOMAIN_CPUMASK_LEN; j++) {
+            if (def->cells[i].cpumask[j])
+                ncpus++;
+        }
+    }


Can this loop be made any faster by using count_one_bits?

+    return ncpus;
+}

  virCPUDefPtr
  virCPUDefParseXML(const xmlNodePtr node,
@@ -289,6 +307,50 @@ virCPUDefParseXML(const xmlNodePtr node,
          def->features[i].policy = policy;
      }

+    if (virXPathNode("./numa[1]", ctxt)) {
+        VIR_FREE(nodes);
+        n = virXPathNodeSet("./numa[1]/cell", ctxt,&nodes);
+        if (n<  0 || n == 0) {


This looks a bit funny, compared to if (n <= 0).

+        for (i = 0 ; i<  n ; i++) {
+            char *cpus;
+            int cpumasklen = VIR_DOMAIN_CPUMASK_LEN;
+            unsigned long ul;
+            int ret;
+
+            def->cells[i].cellid = i;
+            cpus = virXMLPropString(nodes[i], "cpus");
+
+            if (VIR_ALLOC_N(def->cells[i].cpumask, cpumasklen)<  0)
+                goto no_memory;
+
+            if (virDomainCpuSetParse((const char **)&cpus,
+                                 0, def->cells[i].cpumask,
+                                 cpumasklen)<  0)


Does this behave properly if the cpus=... attribute was missing?

+                goto error;
+
+            ret = virXPathULong("string(./numa[1]/cell/@mems)",
+                            ctxt,&ul);
+            if (ret<  0) {
+                virCPUReportError(VIR_ERR_INTERNAL_ERROR,
+                    "%s", _("Missing 'mems' attribute in NUMA topology"));
+                goto error;

Especially since you checked for a missing mems= counterpart? And backto my earlier question of whether both attributes are really mandatory,or whether just one in isolation can make sense.

+++ b/src/conf/cpu_conf.h
@@ -67,6 +67,14 @@ struct _virCPUFeatureDef {
      int policy;         /* enum virCPUFeaturePolicy */
  };

+typedef struct _virCellDef virCellDef;
+typedef virCellDef *virCellDefPtr;
+struct _virCellDef {
+   int cellid;
+   char *cpumask;      /* CPUs that are part of this node */
+   unsigned int mem;   /* Node memory */

I'd write this as /* Node memory in kB */, to make it clear what scaleis in use.

I saw the code for parsing the XML, but what about the code forgenerating the xml during 'virsh dumpxml' (aka virDomainDefFormat)? Thepatch is incomplete without that.


--
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 1/3] XML definitions for guest NUMA

Reply via email to