Re: [libvirt] [RFCv2 PATCH 3/5] conf: rename cachetune to restune
On 2018年07月06日 06:40, John Ferlan wrote: On 07/03/2018 12:10 AM, bing.niu wrote: Hi John, Thanks for reviewing! Since major questions are from this thread, so I think we can start from this. On 2018年06月30日 06:47, John Ferlan wrote: On 06/15/2018 05:29 AM, bing@intel.com wrote: From: Bing Niu resctrl not only supports cache tuning, but also memory bandwidth tuning. Rename cachetune to restune(resource tuning) to reflect that. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 44 ++-- src/conf/domain_conf.h | 10 +- src/qemu/qemu_process.c | 18 +- 3 files changed, 36 insertions(+), 36 deletions(-) Not so sure I'm liking [R|r]estune[s]... Still @cachetunes is an array of a structure that contains a vCPU bitmap and a virResctrlAllocPtr for the /cputune/cachetunes data. The virResctrlAllocPtr was changed to add a the bandwidth allocation, so does this really have to change? The cachetunes from Cache Allocation Technology(CAT) and memorytunes from Memory Bandwidth Allocation(MBA) are both features from Resource Directory Technology. RDT is a collection of resource distribution technology, right now it includes CAT and MBA. Details information can be found 17.18.1 of Intel's sdm. https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf tl;dr :-) So some day soon it'll be more complicated and have even more rules?!?! Hopefully not, and Intel do backward compatible well.:) So overall programing module should be steady. The RDT capability and configuration methods is exposed to user land in form of resctrl file system by kernel. The basic programming model for this resctrl fs interface like this: 1. create a folder under /sys/fs/resctrl. And this folder is called one resctrl group. 2. writing user desired CAT and MBA config to the file /sys/fs/resctrl//schemata to allocate the cache or memory bandwidth. you can set CAT and MBA together or any of them. 3. moving the threads you want to that resctrl group by writing threads' PID to /sys/fs/resctrl//task file. So that those threads can be assign with the resource allocated in step 2. And those resources are shared by those threads. *NOTE*: A thread only can be put in one resctrl group. This requirement is from how RDT and resctrl works. IOW: A vCPU or the vCPUs being used by CAT and MBA would need to be the same because in "this world" the group *is* the single vCPU or the range of vCPU's. Yup! vCPU actually is a thread of QEMU from host view. Each resctrl group has a closid. The resource configuration in above step2 is set to that closid's msr(IA32_L?_QoS) to describe the resource allocation for this closid. And thread use this closid to claim the allocated resource during context switch(scheduled_in()) in kernel by writing the closid to the msr IA32_PQR_ASSOC. With closid wrote in IA32_PQR_ASSOC, RDT hardware knows current running closid and allocated resource basing on this closid's IA32_L?_QoS msr. That why a thread can be put into one restrcl group only. Details description of resctrl can be found: https://github.com/torvalds/linux/blob/v4.17/Documentation/x86/intel_rdt_ui.txt :) More lengthy things that I didn't read... Thanks for the pointers. I think keeping track of all the rules and relationships is going to be a real "challenge". Yes, that is a real "challenge". However, kernel modules usually give a consistent and unified interface. That will help a lot. We should be able to support new features by extending the existing logic. Basing on above information, my understanding is that virResctrlAllocPtr is not only bind to cachetunes. It should be a backing object to describe a resctrl group. Especially current virresctrl class already works as that. Libvirt will create one resctrl group for each virResctrlAllocPtr by virResctrlAllocCreate() interface. So from components view, the big picture is something like below. Yay, pictures! Thanks for taking the time. +-+ | XML | + parser +---+ | | +--+ | | internal resctrl +--v--+ backing object | virResctrlAllocPtr | +--+--+ | | +--+ | +--v---+ | | | schemata | /sys/fs/resctrl | tasks | | . | | . | | |
Re: [libvirt] [RFCv2 PATCH 3/5] conf: rename cachetune to restune
On 07/03/2018 12:10 AM, bing.niu wrote: > Hi John, > Thanks for reviewing! Since major questions are from this thread, so I > think we can start from this. > > On 2018年06月30日 06:47, John Ferlan wrote: >> >> >> On 06/15/2018 05:29 AM, bing@intel.com wrote: >>> From: Bing Niu >>> >>> resctrl not only supports cache tuning, but also memory bandwidth >>> tuning. Rename cachetune to restune(resource tuning) to reflect >>> that. >>> >>> Signed-off-by: Bing Niu >>> --- >>> src/conf/domain_conf.c | 44 >>> ++-- >>> src/conf/domain_conf.h | 10 +- >>> src/qemu/qemu_process.c | 18 +- >>> 3 files changed, 36 insertions(+), 36 deletions(-) >>> >> >> Not so sure I'm liking [R|r]estune[s]... Still @cachetunes is an array >> of a structure that contains a vCPU bitmap and a virResctrlAllocPtr for >> the /cputune/cachetunes data. The virResctrlAllocPtr was changed to add >> a the bandwidth allocation, so does this really have to change? >> > The cachetunes from Cache Allocation Technology(CAT) and memorytunes > from Memory Bandwidth Allocation(MBA) are both features from Resource > Directory Technology. RDT is a collection of resource distribution > technology, right now it includes CAT and MBA. Details information can > be found 17.18.1 of Intel's sdm. > https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf > tl;dr :-) So some day soon it'll be more complicated and have even more rules?!?! > > The RDT capability and configuration methods is exposed to user land in > form of resctrl file system by kernel. The basic programming model for > this resctrl fs interface like this: > > 1. create a folder under /sys/fs/resctrl. And this folder is called one > resctrl group. > > 2. writing user desired CAT and MBA config to the file > /sys/fs/resctrl//schemata to allocate the cache or > memory bandwidth. you can set CAT and MBA together or any of them. > > 3. moving the threads you want to that resctrl group by writing threads' > PID to /sys/fs/resctrl//task file. So that those > threads can be assign with the resource allocated in step 2. And those > resources are shared by those threads. > > *NOTE*: A thread only can be put in one resctrl group. This requirement > is from how RDT and resctrl works. IOW: A vCPU or the vCPUs being used by CAT and MBA would need to be the same because in "this world" the group *is* the single vCPU or the range of vCPU's. > Each resctrl group has a closid. The resource configuration in above > step2 is set to that closid's msr(IA32_L?_QoS) to describe the resource > allocation for this closid. > And thread use this closid to claim the allocated resource during > context switch(scheduled_in()) in kernel by writing the closid to the > msr IA32_PQR_ASSOC. > With closid wrote in IA32_PQR_ASSOC, RDT hardware knows current running > closid and allocated resource basing on this closid's IA32_L?_QoS msr. > That why a thread can be put into one restrcl group only. > > Details description of resctrl can be found: > https://github.com/torvalds/linux/blob/v4.17/Documentation/x86/intel_rdt_ui.txt > > > :) More lengthy things that I didn't read... Thanks for the pointers. I think keeping track of all the rules and relationships is going to be a real "challenge". > > Basing on above information, my understanding is that virResctrlAllocPtr > is not only bind to cachetunes. It should be a backing object to > describe a resctrl group. Especially current virresctrl class already > works as that. > Libvirt will create one resctrl group for each virResctrlAllocPtr by > virResctrlAllocCreate() interface. > > So from components view, the big picture is something like below. Yay, pictures! Thanks for taking the time. > > > > +-+ > | > XML | + > parser +---+ > | > | > +--+ > | > | > internal resctrl +--v--+ > backing object | virResctrlAllocPtr | > +--+--+ > | > | > +--+ > | > +--v---+ > | | > | schemata | > /sys/fs/resctrl | tasks | > | . | > | . | > | | > +--+ > > Does this make sense? :) > Yes and I think further has me believe that the vCPUs in this case are the groups and perhaps becomes the internal "key" for all this. I didn't go look, but have to
Re: [libvirt] [RFCv2 PATCH 3/5] conf: rename cachetune to restune
Hi John, Thanks for reviewing! Since major questions are from this thread, so I think we can start from this. On 2018年06月30日 06:47, John Ferlan wrote: On 06/15/2018 05:29 AM, bing@intel.com wrote: From: Bing Niu resctrl not only supports cache tuning, but also memory bandwidth tuning. Rename cachetune to restune(resource tuning) to reflect that. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 44 ++-- src/conf/domain_conf.h | 10 +- src/qemu/qemu_process.c | 18 +- 3 files changed, 36 insertions(+), 36 deletions(-) Not so sure I'm liking [R|r]estune[s]... Still @cachetunes is an array of a structure that contains a vCPU bitmap and a virResctrlAllocPtr for the /cputune/cachetunes data. The virResctrlAllocPtr was changed to add a the bandwidth allocation, so does this really have to change? The cachetunes from Cache Allocation Technology(CAT) and memorytunes from Memory Bandwidth Allocation(MBA) are both features from Resource Directory Technology. RDT is a collection of resource distribution technology, right now it includes CAT and MBA. Details information can be found 17.18.1 of Intel's sdm. https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf The RDT capability and configuration methods is exposed to user land in form of resctrl file system by kernel. The basic programming model for this resctrl fs interface like this: 1. create a folder under /sys/fs/resctrl. And this folder is called one resctrl group. 2. writing user desired CAT and MBA config to the file /sys/fs/resctrl//schemata to allocate the cache or memory bandwidth. you can set CAT and MBA together or any of them. 3. moving the threads you want to that resctrl group by writing threads' PID to /sys/fs/resctrl//task file. So that those threads can be assign with the resource allocated in step 2. And those resources are shared by those threads. *NOTE*: A thread only can be put in one resctrl group. This requirement is from how RDT and resctrl works. Each resctrl group has a closid. The resource configuration in above step2 is set to that closid's msr(IA32_L?_QoS) to describe the resource allocation for this closid. And thread use this closid to claim the allocated resource during context switch(scheduled_in()) in kernel by writing the closid to the msr IA32_PQR_ASSOC. With closid wrote in IA32_PQR_ASSOC, RDT hardware knows current running closid and allocated resource basing on this closid's IA32_L?_QoS msr. That why a thread can be put into one restrcl group only. Details description of resctrl can be found: https://github.com/torvalds/linux/blob/v4.17/Documentation/x86/intel_rdt_ui.txt :) Basing on above information, my understanding is that virResctrlAllocPtr is not only bind to cachetunes. It should be a backing object to describe a resctrl group. Especially current virresctrl class already works as that. Libvirt will create one resctrl group for each virResctrlAllocPtr by virResctrlAllocCreate() interface. So from components view, the big picture is something like below. +-+ | XML | + parser +---+ | | +--+ | | internal resctrl +--v--+ backing object| virResctrlAllocPtr | +--+--+ | | +--+ | +--v---+ | | | schemata | /sys/fs/resctrl | tasks| | . | | . | | | +--+ Does this make sense? :) I think the question is - if both cachetunes and memtunes exist in domain XML, then for which does the @vcpus relate to. That is, from the cover there's: and then there's a Now what? I haven't looked at patch 4 yet to even know how this "works". I also raised this concern in the first round review. Ideally, if we can have a domain XML like this That will be more clear. Above domain XML means: one resctrl group with memory bandwidth 20 @node 0, L3 cache 3MB @ node0 and node 1. Put vcpu '0-3' to this resctrl group. So those resources are shared among vcpus. However, Pavel pointed out that we must keep backward compatible. And we need keep . In RFC v1, I also proposed to put memory bandwidth into cachetunes section like: Maint
Re: [libvirt] [RFCv2 PATCH 3/5] conf: rename cachetune to restune
On 06/15/2018 05:29 AM, bing@intel.com wrote: > From: Bing Niu > > resctrl not only supports cache tuning, but also memory bandwidth > tuning. Rename cachetune to restune(resource tuning) to reflect > that. > > Signed-off-by: Bing Niu > --- > src/conf/domain_conf.c | 44 ++-- > src/conf/domain_conf.h | 10 +- > src/qemu/qemu_process.c | 18 +- > 3 files changed, 36 insertions(+), 36 deletions(-) > Not so sure I'm liking [R|r]estune[s]... Still @cachetunes is an array of a structure that contains a vCPU bitmap and a virResctrlAllocPtr for the /cputune/cachetunes data. The virResctrlAllocPtr was changed to add a the bandwidth allocation, so does this really have to change? I think the question is - if both cachetunes and memtunes exist in domain XML, then for which does the @vcpus relate to. That is, from the cover there's: and then there's a Now what? I haven't looked at patch 4 yet to even know how this "works". I think what you want to do is create a virDomainMemtuneDef that mimics virDomainCachetuneDef, but I'm not 100% sure. Of course that still leaves me wondering how much of virResctrlAllocPtr then becomes wasted. You chose an integration point, but essentially have separate technologies. I'm trying to make sense of it all and am starting to get lost in the process of doing that. If @bandwidth can not be > 100... Can we duplicate the @id somehow? So how does that free buffer that got 100% and then gets subtracted from really work in this model? I probably need to study the code some more, but it's not clear it requires using the "same" sort of logic that cachetune uses where there could be different types (instructs, data, or both) that would be drawing from the size, right? I think that's how that worked. So what's the corollary for memtunes? You set a range from 0 to 100, right and that's it. OK, too many questions and it's late so I'll stop here... Hopefully I didn't leave any stray thoughts in the review of patch 1 or 2. John > diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c > index 62c412e..b3543f3 100644 > --- a/src/conf/domain_conf.c > +++ b/src/conf/domain_conf.c > @@ -2950,14 +2950,14 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) > > > static void > -virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) > +virDomainRestuneDefFree(virDomainRestuneDefPtr restune) > { > -if (!cachetune) > +if (!restune) > return; > > -virObjectUnref(cachetune->alloc); > -virBitmapFree(cachetune->vcpus); > -VIR_FREE(cachetune); > +virObjectUnref(restune->alloc); > +virBitmapFree(restune->vcpus); > +VIR_FREE(restune); > } > > > @@ -3147,9 +3147,9 @@ void virDomainDefFree(virDomainDefPtr def) > virDomainShmemDefFree(def->shmems[i]); > VIR_FREE(def->shmems); > > -for (i = 0; i < def->ncachetunes; i++) > -virDomainCachetuneDefFree(def->cachetunes[i]); > -VIR_FREE(def->cachetunes); > +for (i = 0; i < def->nrestunes; i++) > +virDomainRestuneDefFree(def->restunes[i]); > +VIR_FREE(def->restunes); > > VIR_FREE(def->keywrap); > > @@ -18971,7 +18971,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, > xmlNodePtr *nodes = NULL; > virBitmapPtr vcpus = NULL; > virResctrlAllocPtr alloc = virResctrlAllocNew(); > -virDomainCachetuneDefPtr tmp_cachetune = NULL; > +virDomainRestuneDefPtr tmp_restune = NULL; > char *tmp = NULL; > char *vcpus_str = NULL; > char *alloc_id = NULL; > @@ -18984,7 +18984,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, > if (!alloc) > goto cleanup; > > -if (VIR_ALLOC(tmp_cachetune) < 0) > +if (VIR_ALLOC(tmp_restune) < 0) > goto cleanup; > > vcpus_str = virXMLPropString(node, "vcpus"); > @@ -19025,8 +19025,8 @@ virDomainCachetuneDefParse(virDomainDefPtr def, > goto cleanup; > } > > -for (i = 0; i < def->ncachetunes; i++) { > -if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { > +for (i = 0; i < def->nrestunes; i++) { > +if (virBitmapOverlaps(def->restunes[i]->vcpus, vcpus)) { > virReportError(VIR_ERR_XML_ERROR, "%s", > _("Overlapping vcpus in cachetunes")); > goto cleanup; > @@ -19056,16 +19056,16 @@ virDomainCachetuneDefParse(virDomainDefPtr def, > if (virResctrlAllocSetID(alloc, alloc_id) < 0) > goto cleanup; > > -VIR_STEAL_PTR(tmp_cachetune->vcpus, vcpus); > -VIR_STEAL_PTR(tmp_cachetune->alloc, alloc); > +VIR_STEAL_PTR(tmp_restune->vcpus, vcpus); > +VIR_STEAL_PTR(tmp_restune->alloc, alloc); > > -if (VIR_APPEND_ELEMENT(def->cachetunes, def->ncachetunes, tmp_cachetune) > < 0) > +if (VIR_APPEND_ELEMENT(def->restunes, def->nrestunes, tmp_restune) < 0) > goto cleanup; > > ret = 0; > cleanup: >
[libvirt] [RFCv2 PATCH 3/5] conf: rename cachetune to restune
From: Bing Niu resctrl not only supports cache tuning, but also memory bandwidth tuning. Rename cachetune to restune(resource tuning) to reflect that. Signed-off-by: Bing Niu --- src/conf/domain_conf.c | 44 ++-- src/conf/domain_conf.h | 10 +- src/qemu/qemu_process.c | 18 +- 3 files changed, 36 insertions(+), 36 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 62c412e..b3543f3 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -2950,14 +2950,14 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader) static void -virDomainCachetuneDefFree(virDomainCachetuneDefPtr cachetune) +virDomainRestuneDefFree(virDomainRestuneDefPtr restune) { -if (!cachetune) +if (!restune) return; -virObjectUnref(cachetune->alloc); -virBitmapFree(cachetune->vcpus); -VIR_FREE(cachetune); +virObjectUnref(restune->alloc); +virBitmapFree(restune->vcpus); +VIR_FREE(restune); } @@ -3147,9 +3147,9 @@ void virDomainDefFree(virDomainDefPtr def) virDomainShmemDefFree(def->shmems[i]); VIR_FREE(def->shmems); -for (i = 0; i < def->ncachetunes; i++) -virDomainCachetuneDefFree(def->cachetunes[i]); -VIR_FREE(def->cachetunes); +for (i = 0; i < def->nrestunes; i++) +virDomainRestuneDefFree(def->restunes[i]); +VIR_FREE(def->restunes); VIR_FREE(def->keywrap); @@ -18971,7 +18971,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, xmlNodePtr *nodes = NULL; virBitmapPtr vcpus = NULL; virResctrlAllocPtr alloc = virResctrlAllocNew(); -virDomainCachetuneDefPtr tmp_cachetune = NULL; +virDomainRestuneDefPtr tmp_restune = NULL; char *tmp = NULL; char *vcpus_str = NULL; char *alloc_id = NULL; @@ -18984,7 +18984,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (!alloc) goto cleanup; -if (VIR_ALLOC(tmp_cachetune) < 0) +if (VIR_ALLOC(tmp_restune) < 0) goto cleanup; vcpus_str = virXMLPropString(node, "vcpus"); @@ -19025,8 +19025,8 @@ virDomainCachetuneDefParse(virDomainDefPtr def, goto cleanup; } -for (i = 0; i < def->ncachetunes; i++) { -if (virBitmapOverlaps(def->cachetunes[i]->vcpus, vcpus)) { +for (i = 0; i < def->nrestunes; i++) { +if (virBitmapOverlaps(def->restunes[i]->vcpus, vcpus)) { virReportError(VIR_ERR_XML_ERROR, "%s", _("Overlapping vcpus in cachetunes")); goto cleanup; @@ -19056,16 +19056,16 @@ virDomainCachetuneDefParse(virDomainDefPtr def, if (virResctrlAllocSetID(alloc, alloc_id) < 0) goto cleanup; -VIR_STEAL_PTR(tmp_cachetune->vcpus, vcpus); -VIR_STEAL_PTR(tmp_cachetune->alloc, alloc); +VIR_STEAL_PTR(tmp_restune->vcpus, vcpus); +VIR_STEAL_PTR(tmp_restune->alloc, alloc); -if (VIR_APPEND_ELEMENT(def->cachetunes, def->ncachetunes, tmp_cachetune) < 0) +if (VIR_APPEND_ELEMENT(def->restunes, def->nrestunes, tmp_restune) < 0) goto cleanup; ret = 0; cleanup: ctxt->node = oldnode; -virDomainCachetuneDefFree(tmp_cachetune); +virDomainRestuneDefFree(tmp_restune); virObjectUnref(alloc); virBitmapFree(vcpus); VIR_FREE(alloc_id); @@ -26874,7 +26874,7 @@ virDomainCachetuneDefFormatHelper(unsigned int level, static int virDomainCachetuneDefFormat(virBufferPtr buf, -virDomainCachetuneDefPtr cachetune, +virDomainRestuneDefPtr restune, unsigned int flags) { virBuffer childrenBuf = VIR_BUFFER_INITIALIZER; @@ -26882,7 +26882,7 @@ virDomainCachetuneDefFormat(virBufferPtr buf, int ret = -1; virBufferSetChildIndent(&childrenBuf, buf); -virResctrlAllocForeachCache(cachetune->alloc, +virResctrlAllocForeachCache(restune->alloc, virDomainCachetuneDefFormatHelper, &childrenBuf); @@ -26895,14 +26895,14 @@ virDomainCachetuneDefFormat(virBufferPtr buf, goto cleanup; } -vcpus = virBitmapFormat(cachetune->vcpus); +vcpus = virBitmapFormat(restune->vcpus); if (!vcpus) goto cleanup; virBufferAsprintf(buf, "alloc); +const char *alloc_id = virResctrlAllocGetID(restune->alloc); if (!alloc_id) goto cleanup; @@ -27023,8 +27023,8 @@ virDomainCputuneDefFormat(virBufferPtr buf, def->iothreadids[i]->iothread_id); } -for (i = 0; i < def->ncachetunes; i++) -virDomainCachetuneDefFormat(&childrenBuf, def->cachetunes[i], flags); +for (i = 0; i < def->nrestunes; i++) +virDomainCachetuneDefFormat(&childrenBuf, def->restunes[i], flags); if (virBufferCheckError(&childrenBuf) < 0) return -1; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 6344c02..9e71a0b 1006