Re: [RFC v2 0/7] Introduce SMP Cache Topology
Hi Daniel,

On Tue, Jun 04, 2024 at 10:29:15AM +0100, Daniel P. Berrangé wrote:
> Date: Tue, 4 Jun 2024 10:29:15 +0100
> From: "Daniel P. Berrangé"
> Subject: Re: [RFC v2 0/7] Introduce SMP Cache Topology
>
> On Thu, May 30, 2024 at 06:15:32PM +0800, Zhao Liu wrote:
> > Hi,
> >
> > Now that the i386 cache model can define the topology clearly, it's
> > time to move on to discussing/advancing the feature of configuring
> > the cache topology with -smp, as in the following example:
> >
> > -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
> >      l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> >
> > With the new cache topology options ("l1d-cache", "l1i-cache",
> > "l2-cache" and "l3-cache"), we could adjust the cache topology via
> > -smp.
>
> Switching to QAPI for a second, your proposal is effectively
>
> { 'enum': 'SMPCacheTopo',
>   'data': [ 'default', 'socket', 'die', 'cluster', 'module', 'core', 'thread' ] }
>
> { 'struct': 'SMPConfiguration',
>   'data': {
>     '*cpus': 'int',
>     '*drawers': 'int',
>     '*books': 'int',
>     '*sockets': 'int',
>     '*dies': 'int',
>     '*clusters': 'int',
>     '*modules': 'int',
>     '*cores': 'int',
>     '*threads': 'int',
>     '*maxcpus': 'int',
>     '*l1d-cache': 'SMPCacheTopo',
>     '*l1i-cache': 'SMPCacheTopo',
>     '*l2-cache': 'SMPCacheTopo',
>     '*l3-cache': 'SMPCacheTopo',
>   } }
>
> I think that would be more natural to express as an array of structs
> thus:
>
> { 'enum': 'SMPCacheTopo',
>   'data': [ 'default', 'socket', 'die', 'cluster', 'module', 'core', 'thread' ] }
>
> { 'enum': 'SMPCacheType',
>   'data': [ 'l1d', 'l1i', 'l2', 'l3' ] }
>
> { 'struct': 'SMPCache',
>   'data': {
>     'type': 'SMPCacheType',
>     'topo': 'SMPCacheTopo',
>   } }
>
> { 'struct': 'SMPConfiguration',
>   'data': {
>     '*cpus': 'int',
>     '*drawers': 'int',
>     '*books': 'int',
>     '*sockets': 'int',
>     '*dies': 'int',
>     '*clusters': 'int',
>     '*modules': 'int',
>     '*cores': 'int',
>     '*threads': 'int',
>     '*maxcpus': 'int',
>     'caches': [ 'SMPCache' ]
>   } }
>
> Giving an example in (hypothetical) JSON cli syntax of:
>
>   -smp "{'cpus':32,'sockets':2,'dies':2,'modules':2,
>          'cores':2,'threads':2,'maxcpus':32,'caches': [
>            {'type':'l1d','topo':'core' },
>            {'type':'l1i','topo':'core' },
>            {'type':'l2','topo':'core' },
>            {'type':'l3','topo':'die' },
>          ]}"

Thanks! This looks clean to me and I think it is OK. Just one further
question: for this case, where the configuration must be expressed as a
raw JSON string, does QEMU require that a simple, command-line-friendly
variant corresponding to it also be provided?

> > Open about How to Handle the Default Options
> >
> > (For the detailed description of this series, please skip this
> > "long" section and review the subsequent content.)
> >
> >
> > Background of OPEN
> > ------------------
> >
> > Daniel and I discussed initial thoughts on cache topology, and there
> > was an idea that the default *cache_topo is on the CORE level [3]:
> >
> > > simply preferring "cores" for everything is a reasonable
> > > default long term plan for everything, unless the specific
> > > architecture target has no concept of "cores".
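As an illustration of the question above about a command-line-friendly
variant: if -smp were also wired up to QEMU's generic keyval parser (a
hypothetical assumption — nothing in this series implements it), the
caches array could conceivably be spelled flat, using dotted keys with
numeric list indices:

  -smp cpus=32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
       caches.0.type=l1d,caches.0.topo=core,\
       caches.1.type=l1i,caches.1.topo=core,\
       caches.2.type=l2,caches.2.topo=core,\
       caches.3.type=l3,caches.3.topo=die

Both this spelling and the JSON one quoted above are hypothetical:
today -smp only accepts the flat key=value form shown at the top of
the thread.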
Re: [RFC v2 0/7] Introduce SMP Cache Topology
On Thu, May 30, 2024 at 06:15:32PM +0800, Zhao Liu wrote:
> Hi,
>
> Now that the i386 cache model can define the topology clearly, it's
> time to move on to discussing/advancing the feature of configuring
> the cache topology with -smp, as in the following example:
>
> -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
>      l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
>
> With the new cache topology options ("l1d-cache", "l1i-cache",
> "l2-cache" and "l3-cache"), we could adjust the cache topology via
> -smp.

Switching to QAPI for a second, your proposal is effectively

{ 'enum': 'SMPCacheTopo',
  'data': [ 'default', 'socket', 'die', 'cluster', 'module', 'core', 'thread' ] }

{ 'struct': 'SMPConfiguration',
  'data': {
    '*cpus': 'int',
    '*drawers': 'int',
    '*books': 'int',
    '*sockets': 'int',
    '*dies': 'int',
    '*clusters': 'int',
    '*modules': 'int',
    '*cores': 'int',
    '*threads': 'int',
    '*maxcpus': 'int',
    '*l1d-cache': 'SMPCacheTopo',
    '*l1i-cache': 'SMPCacheTopo',
    '*l2-cache': 'SMPCacheTopo',
    '*l3-cache': 'SMPCacheTopo',
  } }

I think that would be more natural to express as an array of structs
thus:

{ 'enum': 'SMPCacheTopo',
  'data': [ 'default', 'socket', 'die', 'cluster', 'module', 'core', 'thread' ] }

{ 'enum': 'SMPCacheType',
  'data': [ 'l1d', 'l1i', 'l2', 'l3' ] }

{ 'struct': 'SMPCache',
  'data': {
    'type': 'SMPCacheType',
    'topo': 'SMPCacheTopo',
  } }

{ 'struct': 'SMPConfiguration',
  'data': {
    '*cpus': 'int',
    '*drawers': 'int',
    '*books': 'int',
    '*sockets': 'int',
    '*dies': 'int',
    '*clusters': 'int',
    '*modules': 'int',
    '*cores': 'int',
    '*threads': 'int',
    '*maxcpus': 'int',
    'caches': [ 'SMPCache' ]
  } }

Giving an example in (hypothetical) JSON cli syntax of:

  -smp "{'cpus':32,'sockets':2,'dies':2,'modules':2,
         'cores':2,'threads':2,'maxcpus':32,'caches': [
           {'type':'l1d','topo':'core' },
           {'type':'l1i','topo':'core' },
           {'type':'l2','topo':'core' },
           {'type':'l3','topo':'die' },
         ]}"

> Open about How to Handle the Default Options
>
> (For the detailed description of this series, please skip this "long"
> section and review the subsequent content.)
>
>
> Background of OPEN
> ------------------
>
> Daniel and I discussed initial thoughts on cache topology, and there
> was an idea that the default *cache_topo is on the CORE level [3]:
>
> > simply preferring "cores" for everything is a reasonable
> > default long term plan for everything, unless the specific
> > architecture target has no concept of "cores".

FYI, when I wrote that I wasn't specifically thinking about cache
mappings. I just meant that when exposing SMP topology to guests,
'cores' is a good default, compared to 'sockets', 'threads', etc.

Defaults for cache <-> topology mappings should be whatever makes the
most sense for the architecture target/CPU.

> Problem with this OPEN
> ----------------------
>
> Some arches have their own arch-specific cache topology, such as L1
> per core / L2 per core / L3 per die for i386. And as Jeehang proposed
> for RISC-V, the cache topologies are like: L1/L2 per core and L3 per
> cluster.
>
> Taking L3 as an example, logically there is a difference between the
> two starting points: the user-specified core level and the default
> core level.
>
> For example,
>
> "(user-specified) l3-cache-topo=core" should override i386's default
> L3 per die, but i386's default L3 per die should also override
> "(default) l3-cache-topo=core", because this default value is like a
> placeholder that specifies nothing.
>
> However, from a command line parsing perspective, it's impossible to
> tell what the "l3-cache-topo=core" setting is for...
Yes, we need to explicitly distinguish built-in defaults from
user-specified data, otherwise we risk too many mistakes.

> Options to solve OPEN
> ---------------------
>
> So, I think we have the following options:
>
>
> 1. Can we avoid such default parameters?
>
>
> This would reduce the pain in QEMU, but I'm not sure if it's possible
> to make libvirt happy?

I think having an explicit "default" value is inevitable, not simply
because of libvirt. Long experience with QEMU shows that we need to be
able to reliably distinguish direct user input from built-in defaults
in cases like this.

> It is also possible to expose cache topology information as the CPU
> properties in "query-cpu-model-expansion type=full", but that adds
> arch-specific work.
>
> If omitted, I think it's just like omitting "cores"/"sockets",
> leaving it up to the machine to decide based on the specific CPU
> model (and now the cache topology is indeed determined by the CPU
> model as well).
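As a concrete illustration of why the explicit value helps: with the
'default' member that the SMPCacheTopo enum earlier in this thread
already carries, the two cases Zhao describes become distinguishable
at parse time (hypothetical spellings):

  'caches': [ { 'type': 'l3', 'topo': 'default' } ]
      # placeholder: specifies nothing; the machine substitutes its
      # arch default (L3 per die on i386, L3 per cluster in Jeehang's
      # RISC-V proposal)

  'caches': [ { 'type': 'l3', 'topo': 'core' } ]
      # direct user input: overrides the arch default

Omitting the entry altogether would then presumably behave like
'default', leaving the level up to the machine/CPU model as discussed
above.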
[RFC v2 0/7] Introduce SMP Cache Topology
Hi,

Now that the i386 cache model can define the topology clearly, it's
time to move on to discussing/advancing the feature of configuring the
cache topology with -smp, as in the following example:

-smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
     l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die

With the new cache topology options ("l1d-cache", "l1i-cache",
"l2-cache" and "l3-cache"), we could adjust the cache topology via
-smp.

But in a recent discussion with Daniel, I think there's currently
uncertainty as to whether newly added parameters should be given
default values or should be omitted by default... Therefore, I keep
the "RFC" tag for this series. For my difficulties with this OPEN, see
the first section below ("Open about How to Handle the Default
Options").

This patch set is based on a little cleanup:

https://lore.kernel.org/qemu-devel/20240527131837.2630961-1-zhao1@intel.com/

And you can find the RFC v1 here:

https://lore.kernel.org/qemu-devel/20240220092504.726064-1-zhao1@linux.intel.com/

Since the ARM [1] and RISC-V [2] folks have similar needs for the
cache topology, I've also cc'd the ARM and RISC-V folks and lists.

I've explained the current problem at length in the first section
below, and I appreciate your time and patience. Your feedback is
welcome!


Open about How to Handle the Default Options

(For the detailed description of this series, please skip this "long"
section and review the subsequent content.)


Background of OPEN
------------------

Daniel and I discussed initial thoughts on cache topology, and there
was an idea that the default *cache_topo is on the CORE level [3]:

> simply preferring "cores" for everything is a reasonable
> default long term plan for everything, unless the specific
> architecture target has no concept of "cores".

The original purpose of considering *_cache_topo=core was to achieve
behavior similar to the "parameters=1" handling supported by other CPU
topology options in -smp. This way, even if a machine doesn't support
configuring the cache topology (via l1_separated_cache_supported/
[l2|l3]_unified_cache_supported = false), such a *_cache_topo=core
could be treated as a valid parameter, except that it would be quietly
ignored.

This has the advantage of facilitating upper-level libvirt support:
currently, there is no way to expose cache topology support
information to libvirt, so libvirt doesn't know which machines support
these cache topology options.

However, I have a problem here.


Problem with this OPEN
----------------------

Some arches have their own arch-specific cache topology, such as L1
per core / L2 per core / L3 per die for i386. And as Jeehang proposed
for RISC-V, the cache topologies are like: L1/L2 per core and L3 per
cluster.

Taking L3 as an example, logically there is a difference between the
two starting points: the user-specified core level and the default
core level.

For example,

"(user-specified) l3-cache-topo=core" should override i386's default
L3 per die, but i386's default L3 per die should also override
"(default) l3-cache-topo=core", because this default value is like a
placeholder that specifies nothing.

However, from a command line parsing perspective, it's impossible to
tell what the "l3-cache-topo=core" setting is for...


Options to solve OPEN
---------------------

So, I think we have the following options:


1. Can we avoid such default parameters?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This would reduce the pain in QEMU, but I'm not sure if it's possible
to make libvirt happy?
It is also possible to expose cache topology information as the CPU
properties in "query-cpu-model-expansion type=full", but that adds
arch-specific work.

If omitted, I think it's just like omitting "cores"/"sockets": leaving
it up to the machine to decide based on the specific CPU model (and
now the cache topology is indeed determined by the CPU model as well).


2. If default is required, can we use a specific abstract word?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

That is, is it possible to use a specific word like "auto"/"invalid"/
"default" and avoid a specific topology level? For example, setting
"l3-cache-topo=invalid" (since I've only added the invalid hierarchy
so far ;-) ).

The cache topology varies so much across arches that, I'm sorry to
say, it's hard to have a uniform default cache topology.

I apologize for the very lengthy note and appreciate your time spent
reviewing it!


Introduction

Background
----------

Intel client platforms (ADL/RPL/MTL) and E-core server platforms (SRF)
share the L2 cache domain among multiple E cores (in the same module).
Thus we need a way to adjust the cache topology so that users can
create a cache topology for the Guest that is nearly identical to the
Host's. This is necessary in cases where there are bound vCPUs,
especially consid