Re: [RFC v2 3/7] hw/core: Add cache topology options in -smp

2024-06-04 Thread Zhao Liu
Hi Markus and Daniel,

(I replied to both of you because I discussed the idea of cache object
at the end)

On Tue, Jun 04, 2024 at 10:32:14AM +0100, Daniel P. Berrangé wrote:
> Date: Tue, 4 Jun 2024 10:32:14 +0100
> From: "Daniel P. Berrangé" 
> Subject: Re: [RFC v2 3/7] hw/core: Add cache topology options in -smp
> 
> On Tue, Jun 04, 2024 at 10:54:51AM +0200, Markus Armbruster wrote:
> > Zhao Liu  writes:
> > 
> > > Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> > > -smp to define the cache topology for SMP system.
> > >
> > > Signed-off-by: Zhao Liu 
> > 
> > [...]
> > 
> > > diff --git a/qapi/machine.json b/qapi/machine.json
> > > index 7ac5a05bb9c9..8fa5af69b1bf 100644
> > > --- a/qapi/machine.json
> > > +++ b/qapi/machine.json
> > > @@ -1746,6 +1746,23 @@
> > >  #
> > >  # @threads: number of threads per core
> > >  #
> > > +# @l1d-cache: topology hierarchy of L1 data cache. It accepts the CPU
> > > +# topology enumeration as the parameter, i.e., CPUs in the same
> > > +# topology container share the same L1 data cache. (since 9.1)
> > > +#
> > > +# @l1i-cache: topology hierarchy of L1 instruction cache. It accepts
> > > +# the CPU topology enumeration as the parameter, i.e., CPUs in the
> > > +# same topology container share the same L1 instruction cache.
> > > +# (since 9.1)
> > > +#
> > > +# @l2-cache: topology hierarchy of L2 unified cache. It accepts the CPU
> > > +# topology enumeration as the parameter, i.e., CPUs in the same
> > > +# topology container share the same L2 unified cache. (since 9.1)
> > > +#
> > > +# @l3-cache: topology hierarchy of L3 unified cache. It accepts the CPU
> > > +# topology enumeration as the parameter, i.e., CPUs in the same
> > > +# topology container share the same L3 unified cache. (since 9.1)
> > > +#
> > >  # Since: 6.1
> > >  ##
> > 
> > The new members are all optional.  What does "absent" mean?  No such
> > cache?  Some default topology?

I suppose "absent" means using the the default arch-specific cache topo.

For example, x86 has its own default cache topo. my purpose in providing
this interface is to allow users to override the default (arch-specific)
cache topology.

Daniel suggested in cover letter's reply that I could add the value
“default”, I think omitting it would have the same effect as setting it
to “default”, and omitting it shouldn't be regarded as disabling a
particular cache, since the interface is only about the topology!

> > Is this sufficiently general?  Do all machines of interest have a split
> > level 1 cache, a level 2 cache, and a level 3 cache?

The different cache support can be extended by such control fields in
MachineClass:

typedef struct {
...
bool l1_separated_cache_supported;
bool l2_unified_cache_supported;
bool l3_unified_cache_supported;
} SMPCompatProps;

For example, if a machine with l1 unified cache appears, it can add the
"l1_unified_cache_supported", and add "l1" option in QAPI.

Then separate L1 cache has “l1i” and “l1d” options, and unified L1 cache
should have “l1” option.

> Level 4 cache is apparently a thing
> 
> https://www.guru3d.com/story/intel-confirms-l4-cache-in-upcoming-meteor-lake-cpus/
> 
> but given that any new cache levels will require new code in QEMU to
> wire up, its not a big deal to add new properties at the same time.
> 
> That said see my reply just now to the cover letter, where I suggest
> we should have a "caches" property that takes an array of cache
> info objects.

Yes, I agree.

> > 
> > Is the CPU topology level the only cache property we'll want to
> > configure here?  If the answer isn't "yes", then we should perhaps wrap
> > it in an object, so we can easily add more members later.
> 
> Cache size is a piece of info I could see us wanting to express

For the simplicity and clarity, I think it's better to only cover
topology in -smp, including the current CPU topology and the cache
topology I'm working on.

I've thought about the idea of converting the cache into the device,
which in turn could define more properties for the cache.

This one is mostly in connection with the previous qom-topo idea (aka
abstracting CPU topology device [1]):

-device cpu-socket,id=sock0 \
-device cpu-die,id=die0,parent=sock0 \
-device cpu-core,id=core0,parent=die0,nr-threads=2 \
-device cpu-cache,id=cache0,parent=core0,level=l1,type=inst \
-device cpu-cache,id=cache1,parent=core0,level=l1,type

Re: [RFC v2 3/7] hw/core: Add cache topology options in -smp

2024-06-04 Thread Daniel P . Berrangé
On Tue, Jun 04, 2024 at 10:54:51AM +0200, Markus Armbruster wrote:
> Zhao Liu  writes:
> 
> > Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> > -smp to define the cache topology for SMP system.
> >
> > Signed-off-by: Zhao Liu 
> 
> [...]
> 
> > diff --git a/qapi/machine.json b/qapi/machine.json
> > index 7ac5a05bb9c9..8fa5af69b1bf 100644
> > --- a/qapi/machine.json
> > +++ b/qapi/machine.json
> > @@ -1746,6 +1746,23 @@
> >  #
> >  # @threads: number of threads per core
> >  #
> > +# @l1d-cache: topology hierarchy of L1 data cache. It accepts the CPU
> > +# topology enumeration as the parameter, i.e., CPUs in the same
> > +# topology container share the same L1 data cache. (since 9.1)
> > +#
> > +# @l1i-cache: topology hierarchy of L1 instruction cache. It accepts
> > +# the CPU topology enumeration as the parameter, i.e., CPUs in the
> > +# same topology container share the same L1 instruction cache.
> > +# (since 9.1)
> > +#
> > +# @l2-cache: topology hierarchy of L2 unified cache. It accepts the CPU
> > +# topology enumeration as the parameter, i.e., CPUs in the same
> > +# topology container share the same L2 unified cache. (since 9.1)
> > +#
> > +# @l3-cache: topology hierarchy of L3 unified cache. It accepts the CPU
> > +# topology enumeration as the parameter, i.e., CPUs in the same
> > +# topology container share the same L3 unified cache. (since 9.1)
> > +#
> >  # Since: 6.1
> >  ##
> 
> The new members are all optional.  What does "absent" mean?  No such
> cache?  Some default topology?
> 
> Is this sufficiently general?  Do all machines of interest have a split
> level 1 cache, a level 2 cache, and a level 3 cache?

Level 4 cache is apparently a thing

https://www.guru3d.com/story/intel-confirms-l4-cache-in-upcoming-meteor-lake-cpus/

but given that any new cache levels will require new code in QEMU to
wire up, its not a big deal to add new properties at the same time.

That said see my reply just now to the cover letter, where I suggest
we should have a "caches" property that takes an array of cache
info objects.

> 
> Is the CPU topology level the only cache property we'll want to
> configure here?  If the answer isn't "yes", then we should perhaps wrap
> it in an object, so we can easily add more members later.

Cache size is a piece of info I could see us wanting to express

> Two spaces between sentences for consistency, please.
> 
> >  { 'struct': 'SMPConfiguration', 'data': {
> > @@ -1758,7 +1775,11 @@
> >   '*modules': 'int',
> >   '*cores': 'int',
> >   '*threads': 'int',
> > - '*maxcpus': 'int' } }
> > + '*maxcpus': 'int',
> > + '*l1d-cache': 'CPUTopoLevel',
> > + '*l1i-cache': 'CPUTopoLevel',
> > + '*l2-cache': 'CPUTopoLevel',
> > + '*l3-cache': 'CPUTopoLevel' } }
> >  
> >  ##
> >  # @x-query-irq:
> > diff --git a/system/vl.c b/system/vl.c
> 
> [...]
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC v2 3/7] hw/core: Add cache topology options in -smp

2024-06-04 Thread Markus Armbruster
Zhao Liu  writes:

> Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> -smp to define the cache topology for SMP system.
>
> Signed-off-by: Zhao Liu 

[...]

> diff --git a/qapi/machine.json b/qapi/machine.json
> index 7ac5a05bb9c9..8fa5af69b1bf 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1746,6 +1746,23 @@
>  #
>  # @threads: number of threads per core
>  #
> +# @l1d-cache: topology hierarchy of L1 data cache. It accepts the CPU
> +# topology enumeration as the parameter, i.e., CPUs in the same
> +# topology container share the same L1 data cache. (since 9.1)
> +#
> +# @l1i-cache: topology hierarchy of L1 instruction cache. It accepts
> +# the CPU topology enumeration as the parameter, i.e., CPUs in the
> +# same topology container share the same L1 instruction cache.
> +# (since 9.1)
> +#
> +# @l2-cache: topology hierarchy of L2 unified cache. It accepts the CPU
> +# topology enumeration as the parameter, i.e., CPUs in the same
> +# topology container share the same L2 unified cache. (since 9.1)
> +#
> +# @l3-cache: topology hierarchy of L3 unified cache. It accepts the CPU
> +# topology enumeration as the parameter, i.e., CPUs in the same
> +# topology container share the same L3 unified cache. (since 9.1)
> +#
>  # Since: 6.1
>  ##

The new members are all optional.  What does "absent" mean?  No such
cache?  Some default topology?

Is this sufficiently general?  Do all machines of interest have a split
level 1 cache, a level 2 cache, and a level 3 cache?

Is the CPU topology level the only cache property we'll want to
configure here?  If the answer isn't "yes", then we should perhaps wrap
it in an object, so we can easily add more members later.

Two spaces between sentences for consistency, please.

>  { 'struct': 'SMPConfiguration', 'data': {
> @@ -1758,7 +1775,11 @@
>   '*modules': 'int',
>   '*cores': 'int',
>   '*threads': 'int',
> - '*maxcpus': 'int' } }
> + '*maxcpus': 'int',
> + '*l1d-cache': 'CPUTopoLevel',
> + '*l1i-cache': 'CPUTopoLevel',
> + '*l2-cache': 'CPUTopoLevel',
> + '*l3-cache': 'CPUTopoLevel' } }
>  
>  ##
>  # @x-query-irq:
> diff --git a/system/vl.c b/system/vl.c

[...]




[RFC v2 3/7] hw/core: Add cache topology options in -smp

2024-05-30 Thread Zhao Liu
Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
-smp to define the cache topology for SMP system.

Signed-off-by: Zhao Liu 
---
Changes since RFC v1:
 * Set has_*_cache field in machine_get_smp(). (JeeHeng)
 * Adjust string breaking style in error_setg() for more semantic
   sentence breaking conventions. (Jonathan)
 * Add more description about cache options. (Markus)
 * Now in v2, config->*_cache field stores topology enumeration instead
   of string, no need to parse, so just make machine_check_cache_topo()
   return boolean.
---
 hw/core/machine-smp.c  | 146 +
 hw/core/machine.c  |  20 ++
 qapi/machine.json  |  23 ++-
 system/vl.c|  12 
 tests/unit/meson.build |   3 +-
 5 files changed, 202 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 5d8d7edcbd3f..c79464cf3d2c 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -61,6 +61,150 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
 return g_string_free(s, false);
 }
 
+static bool machine_check_topo_support(MachineState *ms,
+   CPUTopoLevel topo)
+{
+MachineClass *mc = MACHINE_GET_CLASS(ms);
+
+if (topo == CPU_TOPO_LEVEL_MODULE && !mc->smp_props.modules_supported) {
+return false;
+}
+
+if (topo == CPU_TOPO_LEVEL_CLUSTER && !mc->smp_props.clusters_supported) {
+return false;
+}
+
+if (topo == CPU_TOPO_LEVEL_DIE && !mc->smp_props.dies_supported) {
+return false;
+}
+
+if (topo == CPU_TOPO_LEVEL_BOOK && !mc->smp_props.books_supported) {
+return false;
+}
+
+if (topo == CPU_TOPO_LEVEL_DRAWER && !mc->smp_props.drawers_supported) {
+return false;
+}
+
+return true;
+}
+
+static bool machine_check_cache_topo(MachineState *ms,
+ CPUTopoLevel topo,
+ Error **errp)
+{
+if (topo == CPU_TOPO_LEVEL__MAX || topo == CPU_TOPO_LEVEL_INVALID) {
+error_setg(errp,
+   "Invalid cache topology level: %s. "
+   "The cache topology should match the "
+   "valid CPU topology level",
+   cpu_topo_to_string(topo));
+return false;
+}
+
+if (!machine_check_topo_support(ms, topo)) {
+error_setg(errp,
+   "Invalid cache topology level: %s. "
+   "The topology level is not supported by this machine",
+   cpu_topo_to_string(topo));
+return false;
+}
+
+return true;
+}
+
+static void machine_parse_smp_cache_config(MachineState *ms,
+   const SMPConfiguration *config,
+   Error **errp)
+{
+MachineClass *mc = MACHINE_GET_CLASS(ms);
+
+/*
+ * The cache topology does not support a default entry similar to
+ * CPU topology with parameters=1. So when the machine explicitly
+ * does not support cache topology, return the error.
+ */
+if (config->has_l1d_cache) {
+if (!mc->smp_props.l1_separated_cache_supported) {
+error_setg(errp,
+   "L1 D-cache topology not supported by this machine");
+return;
+}
+
+if (!machine_check_cache_topo(ms, config->l1d_cache, errp)) {
+return;
+}
+
+ms->smp_cache.l1d = config->l1d_cache;
+}
+
+if (config->has_l1i_cache) {
+if (!mc->smp_props.l1_separated_cache_supported) {
+error_setg(errp,
+   "L1 I-cache topology not supported by this machine");
+return;
+}
+
+if (!machine_check_cache_topo(ms, config->l1i_cache, errp)) {
+return;
+}
+
+ms->smp_cache.l1i = config->l1i_cache;
+}
+
+if (config->has_l2_cache) {
+if (!mc->smp_props.l2_unified_cache_supported) {
+error_setg(errp,
+   "L2 cache topology not supported by this machine");
+return;
+}
+
+if (!machine_check_cache_topo(ms, config->l2_cache, errp)) {
+return;
+}
+
+ms->smp_cache.l2 = config->l2_cache;
+
+/*
+ * Cache topology is initialized by default to CPU_TOPO_LEVEL_INVALID,
+ * which is the lowest level, so such a check is OK, even if the config
+ * doesn't override that field.
+ */
+if (ms->smp_cache.l1d > ms->smp_cache.l2 ||
+ms->smp_cache.l1i > ms->smp_cache.l2) {
+error_setg(errp,
+   "Invalid L2 cache topology. "
+   "L2 cache topology level should not be lower than "
+   "L1 D-cache/L1 I-cache");
+return;
+}
+}
+
+if (config->has_l3_cache) {
+if (!mc->smp_props.l2_unified_cache_supported) {
+