Re: [libvirt] [PATCH v1 11/32] util: netdevip: define cleanup function using VIR_DEFINE_AUTOPTR_FUNC
On Sat, Jul 28, 2018 at 11:31:26PM +0530, Sukrit Bhatnagar wrote: > Using the new VIR_DEFINE_AUTOPTR_FUNC macro defined in > src/util/viralloc.h, define a new wrapper around an existing > cleanup function which will be called when a variable declared > with VIR_AUTOPTR macro goes out of scope. Also, drop the redundant > viralloc.h include, since that has moved from the source module into > the header. > > When variables of type virNetDevIPAddrPtr and virNetDevIPRoutePtr > are declared using VIR_AUTOPTR, the functions virNetDevIPAddrFree > and virNetDevIPRouteFree, respectively, will be run > automatically on them when they go out of scope. > > Signed-off-by: Sukrit Bhatnagar > --- > src/util/virnetdevip.c | 7 ++- > src/util/virnetdevip.h | 4 > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/src/util/virnetdevip.c b/src/util/virnetdevip.c > index bf98ed8..fdb0b74 100644 > --- a/src/util/virnetdevip.c > +++ b/src/util/virnetdevip.c > @@ -27,7 +27,6 @@ > #include "virnetlink.h" > #include "virfile.h" > #include "virerror.h" > -#include "viralloc.h" > #include "virlog.h" > #include "virstring.h" > #include "virutil.h" > @@ -1129,3 +1128,9 @@ virNetDevIPInfoAddToDev(const char *ifname, > cleanup: > return ret; > } > + > +void > +virNetDevIPAddrFree(virNetDevIPAddrPtr ip) > +{ > +VIR_FREE(ip); > +} > diff --git a/src/util/virnetdevip.h b/src/util/virnetdevip.h > index 6b509ea..5608c37 100644 > --- a/src/util/virnetdevip.h > +++ b/src/util/virnetdevip.h > @@ -84,6 +84,7 @@ int virNetDevIPAddrGet(const char *ifname, virSocketAddrPtr > addr) > int virNetDevIPWaitDadFinish(virSocketAddrPtr *addrs, size_t count) > ATTRIBUTE_NONNULL(1); > bool virNetDevIPCheckIPv6Forwarding(void); > +void virNetDevIPAddrFree(virNetDevIPAddrPtr ip); These two hunks should be in a separate patch. To the rest: Reviewed-by: Erik Skultety -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 16/17] conf: Add return value check to virResctrlAllocForeachCache
From: Bing Niu Add return value check to virResctrlAllocForeachCache in virDomainCachetuneDefFormat. The virResctrlAllocForeachCache dose have return value, so need check return value to make sure function executed without error. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d8333ad..095c903 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -27199,10 +27199,10 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); -virResctrlAllocForeachCache(resctrl->alloc, -virDomainCachetuneDefFormatHelper, -&childrenBuf); - +if (virResctrlAllocForeachCache(resctrl->alloc, +virDomainCachetuneDefFormatHelper, +&childrenBuf) < 0) +goto cleanup; if (virBufferCheckError(&childrenBuf) < 0) goto cleanup; -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 13/17] conf: Factor out vcpus overlapping from virDomainCachetuneDefParse
From: Bing Niu Factor out vcpus overlapping detecting part from virDomainCachetuneDefParse and introduce virDomainResctrlVcpuMatch. Instead of allocating virResctrlAllocPtr by default, allocating virResctrlAllocPtr after confirm vcpus not overlap with existing ones. And virDomainResctrlVcpuMatch can be reused by other resource control technologies. virDomainResctrlVcpuMatch can clarify old vcpus overlap error whether an overlap or a redefinition. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 51 ++ 1 file changed, 39 insertions(+), 12 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d6314de..da8681d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18983,6 +18983,31 @@ virDomainResctrlParseVcpus(virDomainDefPtr def, static int +virDomainResctrlVcpuMatch(virDomainDefPtr def, + virBitmapPtr vcpus, + virResctrlAllocPtr *alloc) +{ +ssize_t i = 0; + +for (i = 0; i < def->nresctrls; i++) { +/* vcpus group has been created, directly use the existing one. + * Just updating memory allocation information of that group + */ +if (virBitmapEqual(def->resctrls[i]->vcpus, vcpus)) { +*alloc = def->resctrls[i]->alloc; +break; +} +if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) { +virReportError(VIR_ERR_XML_ERROR, "%s", + _("Overlapping vcpus in resctrls")); +return -1; +} +} +return 0; +} + + +static int virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, xmlNodePtr node, virResctrlAllocPtr alloc) @@ -19065,7 +19090,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr oldnode = ctxt->node; xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; -virResctrlAllocPtr alloc = virResctrlAllocNew(); +virResctrlAllocPtr alloc = NULL; virDomainResctrlDefPtr tmp_resctrl = NULL; char *tmp = NULL; char *vcpus_str = NULL; @@ -19076,9 +19101,6 @@ virDomainCachetuneDefParse(virDomainDefPtr def, ctxt->node = node; -if (!alloc) -goto cleanup; - if (VIR_ALLOC(tmp_resctrl) < 0) goto cleanup; @@ -19096,6 +19118,19 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } +if (virDomainResctrlVcpuMatch(def, vcpus, &alloc) < 0) +goto cleanup; + +if (!alloc) { +alloc = virResctrlAllocNew(); +if (!alloc) +goto cleanup; +} else { +virReportError(VIR_ERR_XML_ERROR, "%s", + _("Identical vcpus in cachetunes found")); +goto cleanup; +} + for (i = 0; i < n; i++) { if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0) goto cleanup; @@ -19106,14 +19141,6 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } -for (i = 0; i < def->nresctrls; i++) { -if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) { -virReportError(VIR_ERR_XML_ERROR, "%s", - _("Overlapping vcpus in cachetunes")); -goto cleanup; -} -} - /* We need to format it back because we need to be consistent in the naming * even when users specify some "sub-optimal" string there. */ VIR_FREE(vcpus_str); -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 15/17] conf: Add support for memorytune XML processing for resctrl MBA
From: Bing Niu Introduce a new section memorytune to support memory bandwidth allocation. This is consistent with existing cachetune. As the example: below: .. vpus --- vpus subjected to this memory bandwidth. id--- on which node memory bandwidth to be set. bandwidth --- the memory bandwidth percent to set. Signed-off-by: Bing Niu --- docs/formatdomain.html.in | 39 +++- docs/schemas/domaincommon.rng | 17 ++ src/conf/domain_conf.c | 200 + .../memorytune-colliding-allocs.xml| 30 .../memorytune-colliding-cachetune.xml | 32 tests/genericxml2xmlindata/memorytune.xml | 33 tests/genericxml2xmltest.c | 5 + 7 files changed, 355 insertions(+), 1 deletion(-) create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml create mode 100644 tests/genericxml2xmlindata/memorytune.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 19b7312..1ae6e98 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -759,6 +759,10 @@+ + + ... @@ -932,7 +936,9 @@ size and required granularity are reported as well. The required attribute vcpus specifies to which vCPUs this allocation applies. A vCPU can only be member of one cachetune element -allocations. Supported subelements are: +allocations. The vCPUs specified by cachetune can be identical with those +in memorytune, however they are not allowed to overlap. +Supported subelements are: cache @@ -972,7 +978,38 @@ + + memorytuneSince 4.7.0 + +Optional memorytune element can control allocations for +memory bandwidth using the resctrl on the host. Whether or not is this +supported can be gathered from capabilities where some limitations like +minimum bandwidth and required granularity are reported as well. The +required attribute vcpus specifies to which vCPUs this +allocation applies. A vCPU can only be member of one +memorytune element allocations. The vcpus specified +by memorytune can be identical to those specified by +cachetune. However they are not allowed to overlap each other. +Supported subelements are: + + node + +This element controls the allocation of CPU memory bandwidth and has the +following attributes: + + id + +Host node id from which to allocate memory bandwidth. + + bandwidth + +The memory bandwidth to allocate from this node. The value by default +is in percentage. + + + + diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index ac04af5..48f0637 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -983,6 +983,23 @@ + + + + + + + + + + + + + + + + + diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f45def8..d8333ad 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19290,6 +19290,129 @@ virDomainDefParseCaps(virDomainDefPtr def, } +static int +virDomainMemorytuneDefParseMemory(xmlXPathContextPtr ctxt, + xmlNodePtr node, + virResctrlAllocPtr alloc) +{ +xmlNodePtr oldnode = ctxt->node; +unsigned int id; +unsigned int bandwidth; +char *tmp = NULL; +int ret = -1; + +ctxt->node = node; + +tmp = virXMLPropString(node, "id"); +if (!tmp) { +virReportError(VIR_ERR_XML_ERROR, "%s", + _("Missing memorytune attribute 'id'")); +goto cleanup; +} +if (virStrToLong_uip(tmp, NULL, 10, &id) < 0) { +virReportError(VIR_ERR_XML_ERROR, + _("Invalid memorytune attribute 'id' value '%s'"), + tmp); +goto cleanup; +} +VIR_FREE(tmp); + +tmp = virXMLPropString(node, "bandwidth"); +if (!tmp) { +virReportError(VIR_ERR_X+
[libvirt] [PATCH v2 RESEND 08/17] util: Add support to calculate MBA utilization
From: Bing Niu Introduce virResctrlMemoryBandwidthSubtract and virResctrlAllocMemoryBandwidth to be used as part of the virResctrlAllocAssign processing to configure the available memory bandwidth. Signed-off-by: Bing Niu --- src/util/virresctrl.c | 105 -- 1 file changed, 93 insertions(+), 12 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 1cbf9b3..c22b30f 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -1409,6 +1409,22 @@ virResctrlAllocSubtract(virResctrlAllocPtr dst, } +static void +virResctrlMemoryBandwidthSubtract(virResctrlAllocPtr free, + virResctrlAllocPtr used) +{ +size_t i; + +if (!used->mem_bw) +return; + +for (i = 0; i < used->mem_bw->nbandwidths; i++) { +if (used->mem_bw->bandwidths[i]) +*(free->mem_bw->bandwidths[i]) -= *(used->mem_bw->bandwidths[i]); +} +} + + static virResctrlAllocPtr virResctrlAllocNewFromInfo(virResctrlInfoPtr info) { @@ -1472,14 +1488,15 @@ virResctrlAllocNewFromInfo(virResctrlInfoPtr info) } /* - * This function creates an allocation that represents all unused parts of all - * caches in the system. It uses virResctrlInfo for creating a new full - * allocation with all bits set (using virResctrlAllocNewFromInfo()) and then - * scans for all allocations under /sys/fs/resctrl and subtracts each one of - * them from it. That way it can then return an allocation with only bit set - * being those that are not mentioned in any other allocation. It is used for - * two things, a) calculating the masks when creating allocations and b) from - * tests. + * This function creates an allocation that represents all unused parts of + * all caches and memory bandwidth in the system. It uses virResctrlInfo + * for creating a new full allocation with all bits set (using the + * virResctrlAllocNewFromInfo()), sets memory bandwidth 100%, and then scans + * for all allocations under /sys/fs/resctrl and subtracts each one of them + * from it. That way it can then return an allocation with only bit set + * being those that are not mentioned in any other allocation for CAT and + * available memory bandwidth for MBA. It is used for two things, calculating + * the masks and bandwidth available when creating allocations and from tests. */ virResctrlAllocPtr virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) @@ -1525,6 +1542,7 @@ virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) goto error; } +virResctrlMemoryBandwidthSubtract(ret, alloc); virResctrlAllocSubtract(ret, alloc); virObjectUnref(alloc); alloc = NULL; @@ -1676,6 +1694,66 @@ virResctrlAllocFindUnused(virResctrlAllocPtr alloc, static int +virResctrlAllocMemoryBandwidth(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + virResctrlAllocPtr free) +{ +size_t i; +virResctrlAllocMemBWPtr mem_bw_alloc = alloc->mem_bw; +virResctrlAllocMemBWPtr mem_bw_free = free->mem_bw; +virResctrlInfoMemBWPtr mem_bw_info = resctrl->membw_info; + +if (!mem_bw_alloc) +return 0; + +if (mem_bw_alloc && !mem_bw_info) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("RDT Memory Bandwidth allocation unsupported")); +return -1; +} + +for (i = 0; i < mem_bw_alloc->nbandwidths; i++) { +if (!mem_bw_alloc->bandwidths[i]) +continue; + +if (*(mem_bw_alloc->bandwidths[i]) % mem_bw_info->bandwidth_granularity) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Memory Bandwidth allocation of size " + "%u is not divisible by granularity %u"), + *(mem_bw_alloc->bandwidths[i]), + mem_bw_info->bandwidth_granularity); +return -1; +} +if (*(mem_bw_alloc->bandwidths[i]) < mem_bw_info->min_bandwidth) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Memory Bandwidth allocation of size " + "%u is smaller than the minimum " + "allowed allocation %u"), + *(mem_bw_alloc->bandwidths[i]), + mem_bw_info->min_bandwidth); +return -1; +} +if (i > mem_bw_info->max_id) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("bandwidth controller id %zd does not " + "exist, max controller id %u"), + i, mem_bw_info->max_id); +return -1; +} +if (*(mem_bw_alloc->bandwidths[i]) > *(mem_bw_free->bandwidths[i])) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Not enough room for allocation of %u%% " +
[libvirt] [PATCH v2 RESEND 05/17] util: Add MBA check to virResctrlInfoGetCache
From: Bing Niu If we have some membw_info data, then we need to calculate the number of MBA controllers on the system. The value cannot be obtained from a direct query to the RDT kernel module, but it is the same as the last level cache value which is calculated by traversing the cache hierarchy of host(/sys/bus/cpu/devices/cpuX/cache/). Signed-off-by: Bing Niu Reviewed-by: John Ferlan --- src/util/virresctrl.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index b12a05c..f454868 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -608,6 +608,20 @@ virResctrlInfoGetCache(virResctrlInfoPtr resctrl, if (virResctrlInfoIsEmpty(resctrl)) return 0; +/* Let's take the opportunity to update the number of last level + * cache. This number of memory bandwidth controller is same with + * last level cache */ +if (resctrl->membw_info) { +virResctrlInfoMemBWPtr membw_info = resctrl->membw_info; + +if (level > membw_info->last_level_cache) { +membw_info->last_level_cache = level; +membw_info->max_id = 0; +} else if (membw_info->last_level_cache == level) { +membw_info->max_id++; +} +} + if (level >= resctrl->nlevels) return 0; -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 07/17] util: Add MBA schemata parse and format methods
From: Bing Niu Introduce virResctrlAllocMemoryBandwidthFormat and virResctrlAllocParseMemoryBandwidthLine which will format and parse an entry in the schemata file for MBA. Signed-off-by: Bing Niu --- src/util/virresctrl.c | 141 ++ 1 file changed, 141 insertions(+) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 8a25798..1cbf9b3 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -986,6 +986,139 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc) } +/* Format the Memory Bandwidth Allocation line that will be found in + * the schemata files. The line should be start with "MB:" and be + * followed by "id=value" pairs separated by a colon such as: + * + * MB:0=100;1=100 + * + * which indicates node id 0 has 100 percent bandwith and node id 1 + * has 100 percent bandwidth + */ +static int +virResctrlAllocMemoryBandwidthFormat(virResctrlAllocPtr alloc, + virBufferPtr buf) +{ +size_t i; + +if (!alloc->mem_bw) +return 0; + +virBufferAddLit(buf, "MB:"); + +for (i = 0; i < alloc->mem_bw->nbandwidths; i++) { +if (alloc->mem_bw->bandwidths[i]) { +virBufferAsprintf(buf, "%zd=%u;", i, + *(alloc->mem_bw->bandwidths[i])); +} +} + +virBufferTrim(buf, ";", 1); +virBufferAddChar(buf, '\n'); +return virBufferCheckError(buf); +} + + +static int +virResctrlAllocParseProcessMemoryBandwidth(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *mem_bw) +{ +unsigned int bandwidth; +unsigned int id; +char *tmp = NULL; + +tmp = strchr(mem_bw, '='); +if (!tmp) +return 0; +*tmp = '\0'; +tmp++; + +if (virStrToLong_uip(mem_bw, NULL, 10, &id) < 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid node id %u "), id); +return -1; +} +if (virStrToLong_uip(tmp, NULL, 10, &bandwidth) < 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid bandwidth %u"), bandwidth); +return -1; +} +if (bandwidth < resctrl->membw_info->min_bandwidth || +id > resctrl->membw_info->max_id) { +virReportError(VIR_ERR_INTERNAL_ERROR, + _("Missing or inconsistent resctrl info for " + "memory bandwidth node '%u'"), id); +return -1; +} +if (alloc->mem_bw->nbandwidths <= id && +VIR_EXPAND_N(alloc->mem_bw->bandwidths, alloc->mem_bw->nbandwidths, + id - alloc->mem_bw->nbandwidths + 1) < 0) { +return -1; +} +if (!alloc->mem_bw->bandwidths[id]) { +if (VIR_ALLOC(alloc->mem_bw->bandwidths[id]) < 0) +return -1; +} + +*(alloc->mem_bw->bandwidths[id]) = bandwidth; +return 0; +} + + +/* Parse a schemata formatted MB: entry. Format details are described in + * virResctrlAllocMemoryBandwidthFormat. + */ +static int +virResctrlAllocParseMemoryBandwidthLine(virResctrlInfoPtr resctrl, +virResctrlAllocPtr alloc, +char *line) +{ +char **mbs = NULL; +char *tmp = NULL; +size_t nmbs = 0; +size_t i; +int ret = -1; + +/* For no reason there can be spaces */ +virSkipSpaces((const char **) &line); + +if (STRNEQLEN(line, "MB", 2)) +return 0; + +if (!resctrl || !resctrl->membw_info || +!resctrl->membw_info->min_bandwidth || +!resctrl->membw_info->bandwidth_granularity) { +virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or inconsistent resctrl info for " + "memory bandwidth allocation")); +} + +if (!alloc->mem_bw) { +if (VIR_ALLOC(alloc->mem_bw) < 0) +return -1; +} + +tmp = strchr(line, ':'); +if (!tmp) +return 0; +tmp++; + +mbs = virStringSplitCount(tmp, ";", 0, &nmbs); +if (nmbs == 0) +return 0; + +for (i = 0; i < nmbs; i++) { +if (virResctrlAllocParseProcessMemoryBandwidth(resctrl, alloc, mbs[i]) < 0) +goto cleanup; +} +ret = 0; + cleanup: +virStringListFree(mbs); +return ret; +} + + static int virResctrlAllocFormatCache(virResctrlAllocPtr alloc, virBufferPtr buf) @@ -1045,6 +1178,11 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) return NULL; } +if (virResctrlAllocMemoryBandwidthFormat(alloc, &buf) < 0) { +virBufferFreeAndReset(&buf); +return NULL; +} + return virBufferContentAndReset(&buf); } @@ -1173,6 +1311,9 @@ virResctrlAllocParse(virResctrlInfoPtr resctrl, for (i = 0; i < nlines; i++) { if (virResctrlAllocParseCacheLine(resctrl, alloc, lines[i]) < 0) goto cleanup; +
[libvirt] [PATCH v2 RESEND 10/17] util: Introduce virResctrlAllocSetMemoryBandwidth
From: Bing Niu Introduce an API to allow setting of the MBA from domain XML. Signed-off-by: Bing Niu --- src/libvirt_private.syms | 1 + src/util/virresctrl.c| 48 src/util/virresctrl.h| 5 + 3 files changed, 54 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index cd598f9..0d2fc5c 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2657,6 +2657,7 @@ virResctrlAllocNew; virResctrlAllocRemove; virResctrlAllocSetCacheSize; virResctrlAllocSetID; +virResctrlAllocSetMemoryBandwidth; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index dc57c89..f25e7bc 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -965,6 +965,54 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, } +/* virResctrlAllocSetMemoryBandwidth + * @alloc: Pointer to an active allocation + * @id: node id of MBA to be set + * @memory_bandwidth: new memory bandwidth value + * + * Set the @memory_bandwidth for the node @id entry in the @alloc. + * + * Returns 0 on success, -1 on failure with error message set. + */ +int +virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc, + unsigned int id, + unsigned int memory_bandwidth) +{ +virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw; + +if (memory_bandwidth > 100) { +virReportError(VIR_ERR_XML_ERROR, "%s", + _("Memory Bandwidth value exceeding 100 is invalid.")); +return -1; +} + +if (!mem_bw) { +if (VIR_ALLOC(mem_bw) < 0) +return -1; +alloc->mem_bw = mem_bw; +} + +if (mem_bw->nbandwidths <= id && +VIR_EXPAND_N(mem_bw->bandwidths, mem_bw->nbandwidths, + id - mem_bw->nbandwidths + 1) < 0) +return -1; + +if (mem_bw->bandwidths[id]) { +virReportError(VIR_ERR_XML_ERROR, + _("Memory Bandwidth already defined for node %u"), + id); +return -1; +} + +if (VIR_ALLOC(mem_bw->bandwidths[id]) < 0) +return -1; + +*(mem_bw->bandwidths[id]) = memory_bandwidth; +return 0; +} + + /* virResctrlAllocForeachMemory * @alloc: Pointer to an active allocation * @cb: Callback function diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 5ea5b27..8d62517 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -96,6 +96,11 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, void *opaque); int +virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr alloc, + unsigned int id, + unsigned int memory_bandwidth); + +int virResctrlAllocForeachMemory(virResctrlAllocPtr resctrl, virResctrlAllocForeachMemoryCallback cb, void *opaque); -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 14/17] conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse
From: Bing Niu Factor out vcpus virDomainResctrlDef update from virDomainCachetuneDefParse and introduce virDomainResctrlAppend. virDomainResctrlAppend will format vcpus string and append a new virDomainResctrlDef to virDomainDefPtr. So that this logic can be reusable. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 93 +- 1 file changed, 55 insertions(+), 38 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index da8681d..f45def8 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19082,6 +19082,58 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, static int +virDomainResctrlAppend(virDomainDefPtr def, + xmlNodePtr node, + virResctrlAllocPtr alloc, + virBitmapPtr vcpus, + unsigned int flags) +{ +char *vcpus_str = NULL; +char *alloc_id = NULL; +virDomainResctrlDefPtr tmp_resctrl = NULL; +int ret = -1; + +if (VIR_ALLOC(tmp_resctrl) < 0) +goto cleanup; + +/* We need to format it back because we need to be consistent in the naming + * even when users specify some "sub-optimal" string there. */ +vcpus_str = virBitmapFormat(vcpus); +if (!vcpus_str) +goto cleanup; + +if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) +alloc_id = virXMLPropString(node, "id"); + +if (!alloc_id) { +/* The number of allocations is limited and the directory structure is flat, + * not hierarchical, so we need to have all same allocations in one + * directory, so it's nice to have it named appropriately. For now it's + * 'vcpus_...' but it's designed in order for it to be changeable in the + * future (it's part of the status XML). */ +if (virAsprintf(&alloc_id, "vcpus_%s", vcpus_str) < 0) +goto cleanup; +} + +if (virResctrlAllocSetID(alloc, alloc_id) < 0) +goto cleanup; + +tmp_resctrl->vcpus = vcpus; +tmp_resctrl->alloc = alloc; + +if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) +goto cleanup; + +ret = 0; + cleanup: +virDomainResctrlDefFree(tmp_resctrl); +VIR_FREE(alloc_id); +VIR_FREE(vcpus_str); +return ret; +} + + +static int virDomainCachetuneDefParse(virDomainDefPtr def, xmlXPathContextPtr ctxt, xmlNodePtr node, @@ -19091,19 +19143,12 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = NULL; -virDomainResctrlDefPtr tmp_resctrl = NULL; -char *tmp = NULL; -char *vcpus_str = NULL; -char *alloc_id = NULL; ssize_t i = 0; int n; int ret = -1; ctxt->node = node; -if (VIR_ALLOC(tmp_resctrl) < 0) -goto cleanup; - if (virDomainResctrlParseVcpus(def, node, &vcpus) < 0) goto cleanup; @@ -19141,45 +19186,17 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } -/* We need to format it back because we need to be consistent in the naming - * even when users specify some "sub-optimal" string there. */ -VIR_FREE(vcpus_str); -vcpus_str = virBitmapFormat(vcpus); -if (!vcpus_str) -goto cleanup; - -if (!(flags & VIR_DOMAIN_DEF_PARSE_INACTIVE)) -alloc_id = virXMLPropString(node, "id"); - -if (!alloc_id) { -/* The number of allocations is limited and the directory structure is flat, - * not hierarchical, so we need to have all same allocations in one - * directory, so it's nice to have it named appropriately. For now it's - * 'vcpus_...' but it's designed in order for it to be changeable in the - * future (it's part of the status XML). */ -if (virAsprintf(&alloc_id, "vcpus_%s", vcpus_str) < 0) -goto cleanup; -} - -if (virResctrlAllocSetID(alloc, alloc_id) < 0) -goto cleanup; - -VIR_STEAL_PTR(tmp_resctrl->vcpus, vcpus); -VIR_STEAL_PTR(tmp_resctrl->alloc, alloc); - -if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) +if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0) goto cleanup; +vcpus = NULL; +alloc = NULL; ret = 0; cleanup: ctxt->node = oldnode; -virDomainResctrlDefFree(tmp_resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); -VIR_FREE(alloc_id); -VIR_FREE(vcpus_str); VIR_FREE(nodes); -VIR_FREE(tmp); return ret; } -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 17/17] conf: Add memory bandwidth allocation capability of host
From: Bing Niu Add new XML section to report host's memory bandwidth allocation capability. The format as below example: . granularity granularity of memory bandwidth, unit percentage. min minimum memory bandwidth allowed, unit percentage. maxAllocs maximum memory bandwidth allocation group supported. Signed-off-by: Bing Niu --- docs/schemas/capability.rng| 33 +++ src/conf/capabilities.c| 107 + src/conf/capabilities.h| 11 +++ src/util/virresctrl.c | 20 src/util/virresctrl.h | 15 +++ .../linux-resctrl/resctrl/info/MB/bandwidth_gran | 1 + .../linux-resctrl/resctrl/info/MB/min_bandwidth| 1 + .../linux-resctrl/resctrl/info/MB/num_closids | 1 + tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml | 8 ++ tests/virresctrldata/resctrl.schemata | 1 + 10 files changed, 198 insertions(+) create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids diff --git a/docs/schemas/capability.rng b/docs/schemas/capability.rng index 52164d5..d61515c 100644 --- a/docs/schemas/capability.rng +++ b/docs/schemas/capability.rng @@ -51,6 +51,9 @@ + + + @@ -326,6 +329,36 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index 0f96500..ef2ca81 100644 --- a/src/conf/capabilities.c +++ b/src/conf/capabilities.c @@ -198,6 +198,16 @@ virCapabilitiesFreeNUMAInfo(virCapsPtr caps) } static void +virCapsHostMemBWNodeFree(virCapsHostMemBWNodePtr ptr) +{ +if (!ptr) +return; + +virBitmapFree(ptr->cpus); +VIR_FREE(ptr); +} + +static void virCapabilitiesClearSecModel(virCapsHostSecModelPtr secmodel) { size_t i; @@ -239,6 +249,10 @@ virCapsDispose(void *object) virCapsHostCacheBankFree(caps->host.caches[i]); VIR_FREE(caps->host.caches); +for (i = 0; i < caps->host.nnodes; i++) +virCapsHostMemBWNodeFree(caps->host.nodes[i]); +VIR_FREE(caps->host.nodes); + VIR_FREE(caps->host.netprefix); VIR_FREE(caps->host.pagesSize); virCPUDefFree(caps->host.cpu); @@ -957,6 +971,58 @@ virCapabilitiesFormatCaches(virBufferPtr buf, return 0; } +static int +virCapabilitiesFormatMemoryBandwidth(virBufferPtr buf, + size_t nnodes, + virCapsHostMemBWNodePtr *nodes) +{ +size_t i = 0; +virBuffer controlBuf = VIR_BUFFER_INITIALIZER; + +if (!nnodes) +return 0; + +virBufferAddLit(buf, "\n"); +virBufferAdjustIndent(buf, 2); + +for (i = 0; i < nnodes; i++) { +virCapsHostMemBWNodePtr node = nodes[i]; +virResctrlInfoMemBWPerNodePtr control = &node->control; +char *cpus_str = virBitmapFormat(node->cpus); + +if (!cpus_str) +return -1; + +virBufferAsprintf(buf, + "id, cpus_str); +VIR_FREE(cpus_str); + +virBufferSetChildIndent(&controlBuf, buf); +virBufferAsprintf(&controlBuf, + "\n", + control->granularity, control->min, + control->max_allocation); + +if (virBufferCheckError(&controlBuf) < 0) +return -1; + +if (virBufferUse(&controlBuf)) { +virBufferAddLit(buf, ">\n"); +virBufferAddBuffer(buf, &controlBuf); +virBufferAddLit(buf, "\n"); +} else { +virBufferAddLit(buf, "/>\n"); +} +} + +virBufferAdjustIndent(buf, -2); +virBufferAddLit(buf, "\n"); + +return 0; +} + /** * virCapabilitiesFormatXML: * @caps: capabilities to format @@ -1060,6 +1126,10 @@ virCapabilitiesFormatXML(virCapsPtr caps) caps->host.caches) < 0) goto error; +if (virCapabilitiesFormatMemoryBandwidth(&buf, caps->host.nnodes, + caps->host.nodes) < 0) +goto error; + for (i = 0; i < caps->host.nsecModels; i++) { virBufferAddLit(&buf, "\n"); virBufferAdjustIndent(&buf, 2); @@ -1602,6 +1672,40 @@ virCapabilitiesInitResctrl(virCapsPtr caps) } +static int +virCapabilitiesInitRes
[libvirt] [PATCH v2 RESEND 03/17] util: Refactor virResctrlAllocFormat of virresctrl
From: Bing Niu Refactor virResctrlAllocFormat so that it is easy to support other resource allocation technologies. Signed-off-by: Bing Niu Reviewed-by: John Ferlan --- src/util/virresctrl.c | 41 ++--- 1 file changed, 26 insertions(+), 15 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 313f964..a38c926 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -849,17 +849,14 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc) } -char * -virResctrlAllocFormat(virResctrlAllocPtr alloc) +static int +virResctrlAllocFormatCache(virResctrlAllocPtr alloc, + virBufferPtr buf) { -virBuffer buf = VIR_BUFFER_INITIALIZER; unsigned int level = 0; unsigned int type = 0; unsigned int cache = 0; -if (!alloc) -return NULL; - for (level = 0; level < alloc->nlevels; level++) { virResctrlAllocPerLevelPtr a_level = alloc->levels[level]; @@ -872,7 +869,7 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) if (!a_type) continue; -virBufferAsprintf(&buf, "L%u%s:", level, virResctrlTypeToString(type)); +virBufferAsprintf(buf, "L%u%s:", level, virResctrlTypeToString(type)); for (cache = 0; cache < a_type->nmasks; cache++) { virBitmapPtr mask = a_type->masks[cache]; @@ -882,21 +879,35 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) continue; mask_str = virBitmapToString(mask, false, true); -if (!mask_str) { -virBufferFreeAndReset(&buf); -return NULL; -} +if (!mask_str) +return -1; -virBufferAsprintf(&buf, "%u=%s;", cache, mask_str); +virBufferAsprintf(buf, "%u=%s;", cache, mask_str); VIR_FREE(mask_str); } -virBufferTrim(&buf, ";", 1); -virBufferAddChar(&buf, '\n'); +virBufferTrim(buf, ";", 1); +virBufferAddChar(buf, '\n'); } } -virBufferCheckError(&buf); +return virBufferCheckError(buf); +} + + +char * +virResctrlAllocFormat(virResctrlAllocPtr alloc) +{ +virBuffer buf = VIR_BUFFER_INITIALIZER; + +if (!alloc) +return NULL; + +if (virResctrlAllocFormatCache(alloc, &buf) < 0) { +virBufferFreeAndReset(&buf); +return NULL; +} + return virBufferContentAndReset(&buf); } -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 11/17] conf: Rename cachetune to resctrl
From: Bing Niu Resctrl not only supports cache tuning, but also memory bandwidth tuning. Renaming cachetune to resctrl to reflect that. With resctrl, all allocation for different resources (cache, memory bandwidth) are aggregated and represented by a virResctrlAllocPtr inside virDomainResctrlDef. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 44 ++-- src/conf/domain_conf.h | 10 +- src/qemu/qemu_domain.c | 2 +- src/qemu/qemu_process.c | 18 +- 4 files changed, 37 insertions(+), 37 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index abb6c5e..c1527b2 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2966,14 +2966,14 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) static void -virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl) { -if (!cachetune) +if (!resctrl) return; -virObjectUnref(cachetune->alloc); -virBitmapFree(cachetune->vcpus); -VIR_FREE(cachetune); +virObjectUnref(resctrl->alloc); +virBitmapFree(resctrl->vcpus); +VIR_FREE(resctrl); } @@ -3163,9 +3163,9 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems); -for (i = 0; i < def->ncachetunes; i++) -virDomainCachetuneDefFree(def->cachetunes[i]); -VIR_FREE(def->cachetunes); +for (i = 0; i < def->nresctrls; i++) +virDomainResctrlDefFree(def->resctrls[i]); +VIR_FREE(def->resctrls); VIR_FREE(def->keywrap); @@ -19034,7 +19034,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = virResctrlAllocNew(); -virDomainCachetuneDefPtr tmp_cachetune = NULL; +virDomainResctrlDefPtr tmp_resctrl = NULL; char *tmp = NULL; char *vcpus_str = NULL; char *alloc_id = NULL; @@ -19047,7 +19047,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (!alloc) goto cleanup; -if (VIR_ALLOC(tmp_cachetune) < 0) +if (VIR_ALLOC(tmp_resctrl) < 0) goto cleanup; vcpus_str = virXMLPropString(node, "vcpus"); @@ -19088,8 +19088,8 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } -for (i = 0; i < def->ncachetunes; i++) { -if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { +for (i = 0; i < def->nresctrls; i++) { +if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) { virReportError(VIR_ERR_XML_ERROR, "%s", _("Overlapping vcpus in cachetunes")); goto cleanup; @@ -19119,16 +19119,16 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (virResctrlAllocSetID(alloc, alloc_id) < 0) goto cleanup; -VIR_STEAL_PTR(tmp_cachetune->vcpus, vcpus); -VIR_STEAL_PTR(tmp_cachetune->alloc, alloc); +VIR_STEAL_PTR(tmp_resctrl->vcpus, vcpus); +VIR_STEAL_PTR(tmp_resctrl->alloc, alloc); -if (VIR_APPEND_ELEMENT(def->cachetunes, def->ncachetunes, tmp_cachetune) < 0) +if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0) goto cleanup; ret = 0; cleanup: ctxt->node = oldnode; -virDomainCachetuneDefFree(tmp_cachetune); +virDomainResctrlDefFree(tmp_resctrl); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(alloc_id); @@ -26994,7 +26994,7 @@ virDomainCachetuneDefFormatHelper(unsigned int level, static int virDomainCachetuneDefFormat(virBufferPtr buf, -virDomainCachetuneDefPtr cachetune, +virDomainResctrlDefPtr resctrl, unsigned int flags) { virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; @@ -27002,7 +27002,7 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); -virResctrlAllocForeachCache(cachetune->alloc, +virResctrlAllocForeachCache(resctrl->alloc, virDomainCachetuneDefFormatHelper, &childrenBuf); @@ -27015,14 +27015,14 @@ virDomainCachetuneDefFormat(virBufferPtr buf, goto cleanup; } -vcpus = virBitmapFormat(cachetune->vcpus); +vcpus = virBitmapFormat(resctrl->vcpus); if (!vcpus) goto cleanup; virBufferAsprintf(buf, "alloc); +const char *alloc_id = virResctrlAllocGetID(resctrl->alloc); if (!alloc_id) goto cleanup; @@ -27143,8 +27143,8 @@ virDomainCputuneDefFormat(virBufferPtr buf, def->iothreadids[i]->iothread_id); } -for (i = 0; i < def->ncachetunes; i++) -virDomainCachetuneDefFormat(&childrenBuf, def->cachetunes[i], flags); +for (i = 0; i < def->nresctrls; i++) +virDomainCachetuneDefFormat(&childrenBuf, de
[libvirt] [PATCH v2 RESEND 09/17] util: Introduce virResctrlAllocForeachMemory
From: Bing Niu Introduce an API that will traverse the memory bandwidth data calling a callback function for each defined bandwidth entry. Signed-off-by: Bing Niu --- src/libvirt_private.syms | 1 + src/util/virresctrl.c| 33 + src/util/virresctrl.h| 9 + 3 files changed, 43 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index bc489cb..cd598f9 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2649,6 +2649,7 @@ virResctrlAllocAddPID; virResctrlAllocCreate; virResctrlAllocDeterminePath; virResctrlAllocForeachCache; +virResctrlAllocForeachMemory; virResctrlAllocFormat; virResctrlAllocGetID; virResctrlAllocGetUnused; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index c22b30f..dc57c89 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -965,6 +965,39 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, } +/* virResctrlAllocForeachMemory + * @alloc: Pointer to an active allocation + * @cb: Callback function + * @opaque: Opaque data to be passed to @cb + * + * If available, traverse the defined memory bandwidth allocations and + * call the @cb function. + * + * Returns 0 on success, -1 and immediate failure if the @cb has any failure. + */ +int +virResctrlAllocForeachMemory(virResctrlAllocPtr alloc, + virResctrlAllocForeachMemoryCallback cb, + void *opaque) +{ +size_t i = 0; +virResctrlAllocMemBWPtr mem_bw; + +if (!alloc || !alloc->mem_bw) +return 0; + +mem_bw = alloc->mem_bw; +for (i = 0; i < mem_bw->nbandwidths; i++) { +if (mem_bw->bandwidths[i]) { +if (cb(i, *mem_bw->bandwidths[i], opaque) < 0) +return -1; +} +} + +return 0; +} + + int virResctrlAllocSetID(virResctrlAllocPtr alloc, const char *id) diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index d657c06..5ea5b27 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -73,6 +73,10 @@ typedef int virResctrlAllocForeachCacheCallback(unsigned int level, unsigned long long size, void *opaque); +typedef int virResctrlAllocForeachMemoryCallback(unsigned int id, + unsigned int size, + void *opaque); + virResctrlAllocPtr virResctrlAllocNew(void); @@ -92,6 +96,11 @@ virResctrlAllocForeachCache(virResctrlAllocPtr alloc, void *opaque); int +virResctrlAllocForeachMemory(virResctrlAllocPtr resctrl, + virResctrlAllocForeachMemoryCallback cb, + void *opaque); + +int virResctrlAllocSetID(virResctrlAllocPtr alloc, const char *id); const char * -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 12/17] conf: Factor out vcpus parsing part from virDomainCachetuneDefParse
From: Bing Niu Extract vcpus parsing part from virDomainCachetuneDefParse into one function called virDomainResctrlParseVcpus. So that vcpus parsing logic can be reused by other resource control technologies. Adjust error message and use node->name so that the error message can fit to all technologies. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 48 +--- 1 file changed, 33 insertions(+), 15 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c1527b2..d6314de 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -18951,6 +18951,38 @@ virDomainDefParseBootOptions(virDomainDefPtr def, static int +virDomainResctrlParseVcpus(virDomainDefPtr def, + xmlNodePtr node, + virBitmapPtr *vcpus) +{ +char *vcpus_str = NULL; +int ret = -1; + +vcpus_str = virXMLPropString(node, "vcpus"); +if (!vcpus_str) { +virReportError(VIR_ERR_XML_ERROR, _("Missing %s attribute 'vcpus'"), + node->name); +goto cleanup; +} +if (virBitmapParse(vcpus_str, vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { +virReportError(VIR_ERR_XML_ERROR, + _("Invalid %s attribute 'vcpus' value '%s'"), + vcpus_str, node->name); +goto cleanup; +} + +/* We need to limit the bitmap to number of vCPUs. If there's nothing left, + * then we can just clean up and return 0 immediately */ +virBitmapShrink(*vcpus, def->maxvcpus); + +ret = 0; + cleanup: +VIR_FREE(vcpus_str); +return ret; +} + + +static int virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, xmlNodePtr node, virResctrlAllocPtr alloc) @@ -19050,22 +19082,8 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (VIR_ALLOC(tmp_resctrl) < 0) goto cleanup; -vcpus_str = virXMLPropString(node, "vcpus"); -if (!vcpus_str) { -virReportError(VIR_ERR_XML_ERROR, "%s", - _("Missing cachetune attribute 'vcpus'")); -goto cleanup; -} -if (virBitmapParse(vcpus_str, &vcpus, VIR_DOMAIN_CPUMASK_LEN) < 0) { -virReportError(VIR_ERR_XML_ERROR, - _("Invalid cachetune attribute 'vcpus' value '%s'"), - vcpus_str); +if (virDomainResctrlParseVcpus(def, node, &vcpus) < 0) goto cleanup; -} - -/* We need to limit the bitmap to number of vCPUs. If there's nothing left, - * then we can just clean up and return 0 immediately */ -virBitmapShrink(vcpus, def->maxvcpus); if (virBitmapIsAllClear(vcpus)) { ret = 0; -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 00/17] Introduce RDT memory bandwidth allocation support
From: Bing Niu This series is to introduce RDT memory bandwidth allocation support by extending current virresctrl implementation. The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate control over memory bandwidth available per-core. This feature provides a method to control applications which may be over-utilizing bandwidth relative to their priority in environments such as the data-center. The details can be found in Intel's SDM 17.19.7. Kernel supports MBA through resctrl file system same as CAT. Each resctrl group have a MB parameter to control how much memory bandwidth it can utilize in unit of percentage. In this series, MBA is enabled by enhancing existing virresctrl implementation. The policy employed for MBA is similar with CAT: The sum of each MBA group's bandwidth dose not exceed 100%. The enhancement of virresctrl include two main parts: Part 1: Add two new structures virResctrlInfoMemMB and virResctrlAllocMemBW for collecting host system MBA capability and domain memory bandwidth allocation. Those two structures are the extension of existing virResctrlInfo and virResctrlAlloc. With them, virresctrl framework can support MBA and CAT concurrently. Each virResctrlAlloc represent a resource allocation including CAT, or MBA, or CAT&MBA. The policy of MBA is that: total memory bandwidth of each resctrl group, created by virresctrl, does not exceed to 100%. Part 2: On XML part, add new elements to host capabilities query and domain allocation to support memory bandwidth allocation. - For host capabilities XML, new XML format like below example, . granularity --- memory bandwidth granularity min --- minimum memory bandwidth allowed maxAllocs --- maximum concurrent memory bandwidth allocation allowed. - For domain XML, new format as below example .. .. 1024 .. id --- node where memory bandwidth allocation will happen bandwidth --- bandwidth allocated in percentage -- With this extension of the virresctrl, the overall working follow of CAT and MBA is described by below picture. XML parser will aggregate MBA and CAT configuration and represents it in one virresctrl object. The methods of virresctrl class will manipulate resctrl interface to allocate corresponding resources. +-+ | XML | + parser +---+ | | +--+ | | internal object+--v--+ virResctrlAlloc| backing object| +--+--+ | | +--+ | +--v---+ | | | schemata | /sys/fs/resctrl | tasks| | . | | . | | | +--+ - previous versions and discussion can be found at v1: https://www.redhat.com/archives/libvir-list/2018-July/msg01144.html RFC v2: https://www.redhat.com/archives/libvir-list/2018-June/msg01268.html RFC v1: https://www.redhat.com/archives/libvir-list/2018-May/msg02101.html Changelog: v1 -> this: John's comment: 1. Split calculation of number of memory bandwidth control to one patch. 2. Split virResctrlAllocMemBW relating methods to 5 patch, each provides one kind of function, eg: schemata processing, memory bandwidth calculation. 3. Use resctrl to replace cachetune in domain conf. 4. Split refactor virDomainCachetuneDefParse into 3 patches. And adjust some logic, eg: use %s format e
[libvirt] [PATCH v2 RESEND 01/17] util: Rename some functions of virresctrl
From: Bing Niu Some functions in virresctrl are for CAT only, while some of other functions are for resource allocation, not just CAT. So change their names to reflect the reality. Signed-off-by: Bing Niu Reviewed-by: John Ferlan --- src/conf/domain_conf.c | 8 src/libvirt_private.syms | 4 ++-- src/util/virresctrl.c| 30 +++--- src/util/virresctrl.h| 26 +- 4 files changed, 34 insertions(+), 34 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index eff8af2..abb6c5e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -19013,7 +19013,7 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr ctxt, ULLONG_MAX, true) < 0) goto cleanup; -if (virResctrlAllocSetSize(alloc, level, type, cache, size) < 0) +if (virResctrlAllocSetCacheSize(alloc, level, type, cache, size) < 0) goto cleanup; ret = 0; @@ -27002,9 +27002,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); -virResctrlAllocForeachSize(cachetune->alloc, - virDomainCachetuneDefFormatHelper, - &childrenBuf); +virResctrlAllocForeachCache(cachetune->alloc, +virDomainCachetuneDefFormatHelper, +&childrenBuf); if (virBufferCheckError(&childrenBuf) < 0) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index fc386e1..bc489cb 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2648,14 +2648,14 @@ virCacheTypeToString; virResctrlAllocAddPID; virResctrlAllocCreate; virResctrlAllocDeterminePath; -virResctrlAllocForeachSize; +virResctrlAllocForeachCache; virResctrlAllocFormat; virResctrlAllocGetID; virResctrlAllocGetUnused; virResctrlAllocNew; virResctrlAllocRemove; +virResctrlAllocSetCacheSize; virResctrlAllocSetID; -virResctrlAllocSetSize; virResctrlInfoGetCache; virResctrlInfoNew; diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index e492a63..6d69c8d 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -754,11 +754,11 @@ virResctrlAllocCheckCollision(virResctrlAllocPtr alloc, int -virResctrlAllocSetSize(virResctrlAllocPtr alloc, - unsigned int level, - virCacheType type, - unsigned int cache, - unsigned long long size) +virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, +unsigned int level, +virCacheType type, +unsigned int cache, +unsigned long long size) { if (virResctrlAllocCheckCollision(alloc, level, type, cache)) { virReportError(VIR_ERR_XML_ERROR, @@ -773,9 +773,9 @@ virResctrlAllocSetSize(virResctrlAllocPtr alloc, int -virResctrlAllocForeachSize(virResctrlAllocPtr alloc, - virResctrlAllocForeachSizeCallback cb, - void *opaque) +virResctrlAllocForeachCache(virResctrlAllocPtr alloc, +virResctrlAllocForeachCacheCallback cb, +void *opaque) { int ret = 0; unsigned int level = 0; @@ -939,9 +939,9 @@ virResctrlAllocParseProcessCache(virResctrlInfoPtr resctrl, static int -virResctrlAllocParseProcessLine(virResctrlInfoPtr resctrl, -virResctrlAllocPtr alloc, -char *line) +virResctrlAllocParseCacheLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) { char **caches = NULL; char *tmp = NULL; @@ -1009,7 +1009,7 @@ virResctrlAllocParse(virResctrlInfoPtr resctrl, lines = virStringSplitCount(schemata, "\n", 0, &nlines); for (i = 0; i < nlines; i++) { -if (virResctrlAllocParseProcessLine(resctrl, alloc, lines[i]) < 0) +if (virResctrlAllocParseCacheLine(resctrl, alloc, lines[i]) < 0) goto cleanup; } @@ -1401,8 +1401,8 @@ virResctrlAllocCopyMasks(virResctrlAllocPtr dst, * transforming `sizes` into `masks`. */ static int -virResctrlAllocMasksAssign(virResctrlInfoPtr resctrl, - virResctrlAllocPtr alloc) +virResctrlAllocAssign(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc) { int ret = -1; unsigned int level = 0; @@ -1526,7 +1526,7 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl, if (lockfd < 0) goto cleanup; -if (virResctrlAllocMasksAssign(resctrl, alloc) < 0) +if (virResctrlAllocAssign(resctrl, alloc) < 0) goto cleanup; alloc_str = virResctrlAllocFormat(alloc); diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 9052a2b..d657c06 100644 --- a/src/util/v
[libvirt] [PATCH v2 RESEND 04/17] util: Add MBA capability information query to resctrl
From: Bing Niu Introducing virResctrlInfoMemBW for the information memory bandwidth allocation information. Signed-off-by: Bing Niu Reviewed-by: John Ferlan --- src/util/virresctrl.c | 84 +++ 1 file changed, 84 insertions(+) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index a38c926..b12a05c 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -80,6 +80,9 @@ typedef virResctrlInfoPerType *virResctrlInfoPerTypePtr; typedef struct _virResctrlInfoPerLevel virResctrlInfoPerLevel; typedef virResctrlInfoPerLevel *virResctrlInfoPerLevelPtr; +typedef struct _virResctrlInfoMemBW virResctrlInfoMemBW; +typedef virResctrlInfoMemBW *virResctrlInfoMemBWPtr; + typedef struct _virResctrlAllocPerType virResctrlAllocPerType; typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; @@ -116,11 +119,30 @@ struct _virResctrlInfoPerLevel { virResctrlInfoPerTypePtr *types; }; +/* Information about memory bandwidth allocation */ +struct _virResctrlInfoMemBW { +/* minimum memory bandwidth allowed */ +unsigned int min_bandwidth; +/* bandwidth granularity */ +unsigned int bandwidth_granularity; +/* Maximum number of simultaneous allocations */ +unsigned int max_allocation; +/* level number of last level cache */ +unsigned int last_level_cache; +/* max id of last level cache, this is used to track + * how many last level cache available in host system, + * the number of memory bandwidth allocation controller + * is identical with last level cache. */ +unsigned int max_id; +}; + struct _virResctrlInfo { virObject parent; virResctrlInfoPerLevelPtr *levels; size_t nlevels; + +virResctrlInfoMemBWPtr membw_info; }; @@ -146,6 +168,7 @@ virResctrlInfoDispose(void *obj) VIR_FREE(level); } +VIR_FREE(resctrl->membw_info); VIR_FREE(resctrl->levels); } @@ -443,6 +466,60 @@ virResctrlGetCacheInfo(virResctrlInfoPtr resctrl, static int +virResctrlGetMemoryBandwidthInfo(virResctrlInfoPtr resctrl) +{ +int ret = -1; +int rv = -1; +virResctrlInfoMemBWPtr i_membw = NULL; + +/* query memory bandwidth allocation info */ +if (VIR_ALLOC(i_membw) < 0) +goto cleanup; +rv = virFileReadValueUint(&i_membw->bandwidth_granularity, + SYSFS_RESCTRL_PATH "/info/MB/bandwidth_gran"); +if (rv == -2) { +/* The file doesn't exist, so it's unusable for us, + * probably memory bandwidth allocation unsupported */ +VIR_INFO("The path '" SYSFS_RESCTRL_PATH "/info/MB/bandwidth_gran'" + "does not exist"); +ret = 0; +goto cleanup; +} else if (rv < 0) { +/* Other failures are fatal, so just quit */ +goto cleanup; +} + +rv = virFileReadValueUint(&i_membw->min_bandwidth, + SYSFS_RESCTRL_PATH "/info/MB/min_bandwidth"); +if (rv == -2) { +/* If the previous file exists, so should this one. Hence -2 is + * fatal in this case (errors out in next condition) - the kernel + * interface might've changed too much or something else is wrong. */ +virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get min bandwidth from resctrl memory info")); +} +if (rv < 0) +goto cleanup; + +rv = virFileReadValueUint(&i_membw->max_allocation, + SYSFS_RESCTRL_PATH "/info/MB/num_closids"); +if (rv == -2) { + /* Similar reasoning to min_bandwidth above. */ +virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Cannot get max allocation from resctrl memory info")); +} +if (rv < 0) +goto cleanup; + +VIR_STEAL_PTR(resctrl->membw_info, i_membw); +ret = 0; + cleanup: +VIR_FREE(i_membw); +return ret; +} + + +static int virResctrlGetInfo(virResctrlInfoPtr resctrl) { DIR *dirp = NULL; @@ -452,6 +529,10 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) if (ret <= 0) goto cleanup; +ret = virResctrlGetMemoryBandwidthInfo(resctrl); +if (ret < 0) +goto cleanup; + ret = virResctrlGetCacheInfo(resctrl, dirp); if (ret < 0) goto cleanup; @@ -493,6 +574,9 @@ virResctrlInfoIsEmpty(virResctrlInfoPtr resctrl) if (!resctrl) return true; +if (resctrl->membw_info) +return false; + for (i = 0; i < resctrl->nlevels; i++) { virResctrlInfoPerLevelPtr i_level = resctrl->levels[i]; -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 02/17] util: Refactor virResctrlGetInfo in virresctrl
From: Bing Niu Separate resctrl common information parts from CAT specific parts, so that common information parts can be reused among different resource allocation technologies. Signed-off-by: Bing Niu Reviewed-by: John Ferlan --- src/util/virresctrl.c | 32 +++- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 6d69c8d..313f964 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -318,9 +318,9 @@ virResctrlUnlock(int fd) /* virResctrlInfo-related definitions */ static int -virResctrlGetInfo(virResctrlInfoPtr resctrl) +virResctrlGetCacheInfo(virResctrlInfoPtr resctrl, + DIR *dirp) { -DIR *dirp = NULL; char *endptr = NULL; char *tmp_str = NULL; int ret = -1; @@ -332,12 +332,6 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) virResctrlInfoPerLevelPtr i_level = NULL; virResctrlInfoPerTypePtr i_type = NULL; -rv = virDirOpenIfExists(&dirp, SYSFS_RESCTRL_PATH "/info"); -if (rv <= 0) { -ret = rv; -goto cleanup; -} - while ((rv = virDirRead(dirp, &ent, SYSFS_RESCTRL_PATH "/info")) > 0) { VIR_DEBUG("Parsing info type '%s'", ent->d_name); if (ent->d_name[0] != 'L') @@ -443,12 +437,32 @@ virResctrlGetInfo(virResctrlInfoPtr resctrl) ret = 0; cleanup: -VIR_DIR_CLOSE(dirp); VIR_FREE(i_type); return ret; } +static int +virResctrlGetInfo(virResctrlInfoPtr resctrl) +{ +DIR *dirp = NULL; +int ret = -1; + +ret = virDirOpenIfExists(&dirp, SYSFS_RESCTRL_PATH "/info"); +if (ret <= 0) +goto cleanup; + +ret = virResctrlGetCacheInfo(resctrl, dirp); +if (ret < 0) +goto cleanup; + +ret = 0; + cleanup: +VIR_DIR_CLOSE(dirp); +return ret; +} + + virResctrlInfoPtr virResctrlInfoNew(void) { -- 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 RESEND 06/17] util: Add MBA allocation to virresctrl
From: Bing Niu Add memory bandwidth allocation support to virresctrl class. Introducing virResctrlAllocMemBW which is used for allocating memory bandwidth. Following virResctrlAllocPerType, it also employs a nested sparse array to indicate whether allocation is available for particular last level cache. Signed-off-by: Bing Niu --- src/util/virresctrl.c | 63 +++ 1 file changed, 59 insertions(+), 4 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index f454868..8a25798 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -36,9 +36,9 @@ VIR_LOG_INIT("util.virresctrl") /* Resctrl is short for Resource Control. It might be implemented for various - * resources, but at the time of this writing this is only supported for cache - * allocation technology (aka CAT). Hence the reson for leaving 'Cache' out of - * all the structure and function names for now (can be added later if needed. + * resources. Currently this supports cache allocation technology (aka CAT) and + * memory bandwidth allocation (aka MBA). More resources technologies may be + * added in the future. */ @@ -89,6 +89,9 @@ typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel; typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr; +typedef struct _virResctrlAllocMemBW virResctrlAllocMemBW; +typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; + /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; @@ -180,7 +183,10 @@ virResctrlInfoDispose(void *obj) * consequently a directory under /sys/fs/resctrl). Since it can have multiple * parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be NULL - * in case there is no allocation for that particular one (level, cache, ...)). + * in case there is no allocation for that particular cache allocation (level, + * cache, ...) or memory allocation for particular node). + * + * =Cache allocation technology (CAT)= * * Since one allocation can be made for caches on different levels, the first * nested sparse array is of types virResctrlAllocPerLevel. For example if you @@ -205,6 +211,17 @@ virResctrlInfoDispose(void *obj) * all of them. While doing that we store the bitmask in a sparse array of * virBitmaps named `masks` indexed the same way as `sizes`. The upper bounds * of the sparse arrays are stored in nmasks or nsizes, respectively. + + * + * =Memory Bandwidth allocation technology (MBA)= + * + * The memory bandwidth allocation support in virResctrlAlloc works in the + * same fashion as CAT. However, memory bandwidth controller doesn't have a + * hierarchy organization as cache, each node have one memory bandwidth + * controller to memory bandwidth distribution. The number of memory bandwidth + * controller is identical with number of last level cache. So MBA also employs + * a sparse array to represent whether a memory bandwidth allocation happens + * on corresponding node. The available memory controller number is collected + * in 'virResctrlInfo'. */ struct _virResctrlAllocPerType { /* There could be bool saying whether this is set or not, but since everything @@ -225,12 +242,24 @@ struct _virResctrlAllocPerLevel { * VIR_CACHE_TYPE_LAST number of items */ }; +/* + * virResctrlAllocMemBW represents one memory bandwidth allocation. + * Since it can have several last level caches in a NUMA system, it is + * also represented as a nested sparse arrays as virRestrlAllocPerLevel. + */ +struct _virResctrlAllocMemBW { +unsigned int **bandwidths; +size_t nbandwidths; +}; + struct _virResctrlAlloc { virObject parent; virResctrlAllocPerLevelPtr *levels; size_t nlevels; +virResctrlAllocMemBWPtr mem_bw; + /* The identifier (any unique string for now) */ char *id; /* libvirt-generated path in /sys/fs/resctrl for this particular @@ -274,6 +303,13 @@ virResctrlAllocDispose(void *obj) VIR_FREE(level); } +if (alloc->mem_bw) { +virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw; +for (i = 0; i < mem_bw->nbandwidths; i++) +VIR_FREE(mem_bw->bandwidths[i]); +VIR_FREE(alloc->mem_bw); +} + VIR_FREE(alloc->id); VIR_FREE(alloc->path); VIR_FREE(alloc->levels); @@ -692,6 +728,9 @@ virResctrlAllocIsEmpty(virResctrlAllocPtr alloc) if (!alloc) return true; +if (alloc->mem_bw) +return false; + for (i = 0; i < alloc->nlevels; i++) { virResctrlAllocPerLevelPtr a_level = alloc->levels[i]; @@ -1266,6 +1305,22 @@ virResctrlAllocNewFromInfo(virResctrlInfoPtr info) } } +/* set default free memory bandwidth to 100%*/ +if (info->membw_info) { +if (VIR_ALLOC(ret->mem_bw) < 0) +goto error; + +
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
On 2018.07.27 16:45:55 +0200, Erik Skultety wrote: > On Thu, Jul 26, 2018 at 06:04:10PM +0200, Cornelia Huck wrote: > > On Thu, 26 Jul 2018 17:43:45 +0200 > > Erik Skultety wrote: > > > > > On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote: > > > > One thing I noticed is that we have seem to have an optional (?) > > > > vendor-driver created "aggregation" attribute (which always prints > > > > "true" in the Intel driver). Would it be better or worse for libvirt if > > > > it contained some kind of upper boundary or so? Additionally, would it > > > > > > Can you be more specific? Although, I wouldn't argue against data that > > > conveys > > > some information, since I would treat the mere presence of the optional > > > attribute as a supported feature that we can expose. Therefore, additional > > > *structured* data which sets clear limits to a certain feature is only a > > > plus > > > that we can expose to the users/management layer. > > > > My question is what would be easiest for libvirt: > > > > - "aggregation" attribute only present when driver supports aggregation > > (possibly containing max number of resources to be aggregated) > > - "aggregation" attribute always present; contains '1' if driver does > > not support aggregation and 'm' if driver can aggregate 'm' resources > > Both are fine from libvirt's POV, but IMHO the former makes a bit more sense > and I'm in favour of that one, IOW the presence of an attribute denotes a new > functionality which we can report, if it's missing, the feature is clearly > lacking- I don't think we (libvirt) should be reporting the value 1 explicitly > in the XML if the feature is missing, we can assume 1 as the default. > Good I'll adhere to that, thanks! -- Open Source Technology Center, Intel ltd. $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827 signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] Matching the type of mediated devices in the migration
BACKGROUND As the live migration of mdev is going to be supported in VFIO, a scheme of deciding if a mdev could be migratable between the source machine and the destination machine is needed. Mostly, this email is going to discuss a possible solution which needs fewer modifications of libvirt/VFIO. The configuration of a mdev is located in the domain XML, which guides libvirt how to find the mdev and generating the command line for QEMU. It basically only includes the UUID of a mdev. The domain XML of the source machine and destination machine are going to be compared before the migration really happens. Each configuration item would be compared and checked by libvirt. If one item of the source machine is different from the item of destination machine, the migration fails. For mdev, there is no any check/match before the migration happens yet. The user could use the node device list of libvirt to list the host devices and see the capabilities of those devices. The current node device code of libvirt has already been able to extract the supported mdev types from a host PCI device, plus some basic information, like max supported mdev instance of a host PCI device. THE SOLUTION To strictly check the mdev type and make sure the migration happens between the compatible mediated devices, three new mandatory elements in the domain XML below the hostdev element would be introduced: vendorid: The vendor ID of the mdev, which comes from the host PCI device. A user could obtain this information from the host PCI device which supports mdev in the node device list. productid: The product ID of the mdev, which also comes from the host PCI device. A user could obtain this information from the same approach above. mdevtype: The type of the mdev. As the creation of the mdev is managed by the user, the user knows the type of the mdev and would be responsible for filling out this information. These three elements are only needed when the device API of a mdev is "vfio-PCI". Take the example of mdev configuration from https://libvirt.org/formatdomain.html to illustrate the modification: 0xdead 0xbeef type With the newly introduced elements above, the flow of the creation of a domain XML with mdev will be like: 1. The user obtains the vendorid/productid from node device list 2. The user fills the vendorid/productid/mdevtype in the domain XML 3. When a migration happens, libvirt check these elements. If one item is different between two domain XML, then migration fails. POSSIBLE MODIFICATION OF LIBVIRT 1) Introduce three new elements in domain XML parsing and processing functions. 2) Extend the function virDomainDeviceInfoCheckABIStability() which is going to check the host dev part of the domain XMLs between the source machine and the destination machine. So it could fail the migration when it finds out the IDs and the mdev type are different between domain XMLs. PROS Minor changes in libvirt could achieve the mdev type match in the migration. Modifying VFIO and other mdev components is not necessary. Thanks, Zhi. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list