Re: [PATCH v2 00/24] x86/resctrl: Merge the CDP resources

2021-04-08 Thread James Morse
Hi Babu,

On 06/04/2021 22:37, Babu Moger wrote:
> On 4/6/21 12:19 PM, James Morse wrote:
>> On 30/03/2021 21:36, Babu Moger wrote:
>>> On 3/12/21 11:58 AM, James Morse wrote:
 This series re-folds the resctrl code so the CDP resources (L3CODE et al)
 behaviour is all contained in the filesystem parts, with a minimum amount
 of arch specific code.

 This series collapses the CODE/DATA resources, moving all the user-visible
 resctrl ABI into what becomes the filesystem code. CDP becomes the type of
 configuration being applied to a cache. This is done by adding a
 struct resctrl_schema to the parts of resctrl that will move to fs. This
 holds the arch-code resource that is in use for this schema, along with
 other properties like the name, and whether the configuration being applied
 is CODE/DATA/BOTH.
>>
>>
>>> I applied your patches on my AMD box.
>>
>> Great! Thanks for taking a look,
>>
>>
>>> Seeing some difference in the behavior.
>>
>> Ooer,
>>
>>
>>> Before these patches.
>>>
>>> # dmesg |grep -i resctrl
>>> [   13.076973] resctrl: L3 allocation detected
>>> [   13.087835] resctrl: L3DATA allocation detected
>>> [   13.092886] resctrl: L3CODE allocation detected
>>> [   13.097936] resctrl: MB allocation detected
>>> [   13.102599] resctrl: L3 monitoring detected
>>>
>>>
>>> After the patches.
>>>
>>> # dmesg |grep -i resctrl
>>> [   13.076973] resctrl: L3 allocation detected
>>> [   13.097936] resctrl: MB allocation detected
>>> [   13.102599] resctrl: L3 monitoring detected
>>>
>>> You can see that L3DATA and L3CODE disappeared. I think we should keep the
>>> behavior same for x86(at least).
>>
>> This is the kernel log ... what user-space software is parsing that for an 
>> expected value?
>> What happens if the resctrl strings have been overwritten by more kernel log?
>>
>> I don't think user-space should be relying on this. I'd argue any user-space 
>> doing this is
>> already broken. Is it just the kernel selftest's filter_dmesg()? It doesn't 
>> seem to do
>> anything useful
>>
>> Whether resctrl is support can be read from /proc/filesystems. CDP is 
>> probably a
>> try-it-and-see. User-space could parse /proc/cpuinfo, but its probably not a 
>> good idea.

> Yes. Agree. Looking at the dmesg may no be right way to figure out all the
> support details. As a normal practice, I searched for these texts and
> noticed difference. That is why I felt it is best to keep those texts same
> as before.

>> Its easy to fix, but it seems odd that the kernel has to print things for 
>> user-space to
>> try and parse. (I'd like to point at the user-space software that depends on 
>> this)
> 
> I dont think there is any software that parses the dmesg for these
> details. These are info messages for the developers.

The kernel log changes all the time, messages at boot aren't something you can 
depend on
seeing later. Unless there is some user-space software broken by this, I'm 
afraid I don't
think its a good idea to add extra code to keep it the same.

Printing 'CDP supported by Lx' would be more useful to developers perusing the 
log. Even
more useful would be exposing feature attributes via sysfs to say what resctrl 
supports
without having to mount-it-and-see. This can then be used by user-space too.
e.g.:
| cat /sys/fs/ext4/features/fast_commit


>>> I am still not clear why we needed resctrl_conf_type
>>>
>>> enum resctrl_conf_type {
>>> CDP_BOTH,
>>> CDP_CODE,
>>> CDP_DATA,
>>> };
>>>
>>> Right now, I see all the resources are initialized as CDP_BOTH.
>>>
>>>  [RDT_RESOURCE_L3] =
>>> {
>>> .conf_type  = CDP_BOTH,
>>>  [RDT_RESOURCE_L2] =
>>> {
>>> .conf_type  = CDP_BOTH,
>>>  [RDT_RESOURCE_MBA] =
>>> {
>>> .conf_type  = CDP_BOTH,
>>
>> Ah, those should have been removed in patch 24. Once all the resources are 
>> the same, the
>> resource doesn't need to describe what kind it is.
>>
>>
>>> If all the resources are CDP_BOTH, then why we need separate CDP_CODE and
>>> CDP_DATA?
>>
>> The filesystem code for resctrl that will eventually move out of arch/x86 
>> needs to be able
>> to describe the type of configuration change being made back to the arch 
>> code. The enum
>> gets used for that.
>>
>> x86 needs this as it affects which MSRs the configuration value is written 
>> to.
>>
>>
>>> Are these going to be different for ARM?
>>
>> Nope. Arm's MPAM ends up emulating CDP with the closid values that get 
>> applied to
>> transactions.
>>
>>
>>> Also initializing RDT_RESOURCE_MBA as CDP_BOTH does not seem right. I dont
>>> think there will CDP support in MBA in future.
>>
>> Its not code or data, which makes it both. 'BOTH' is more of a 'do nothing 
>> special', there
>> may be a better name, but I'm not very good at naming things. (any 
>> suggestions?)

> Do you think CDP_NONE will make some sense?

If you t

Re: [PATCH v2 00/24] x86/resctrl: Merge the CDP resources

2021-04-06 Thread Babu Moger



On 4/6/21 12:19 PM, James Morse wrote:
> Hi Babu,
> 
> On 30/03/2021 21:36, Babu Moger wrote:
>> On 3/12/21 11:58 AM, James Morse wrote:
>>> This series re-folds the resctrl code so the CDP resources (L3CODE et al)
>>> behaviour is all contained in the filesystem parts, with a minimum amount
>>> of arch specific code.
>>>
>>> Arm have some CPU support for dividing caches into portions, and
>>> applying bandwidth limits at various points in the SoC. The collective term
>>> for these features is MPAM: Memory Partitioning and Monitoring.
>>>
>>> MPAM is similar enough to Intel RDT, that it should use the defacto linux
>>> interface: resctrl. This filesystem currently lives under arch/x86, and is
>>> tightly coupled to the architecture.
>>> Ultimately, my plan is to split the existing resctrl code up to have an
>>> arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
>>> MPAM can be wired up.
>>>
>>> x86 might have two resources with cache controls, (L2 and L3) but has
>>> extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
>>> if CDP is enabled for the corresponding cache.
>>>
>>> MPAM has an equivalent feature to CDP, but its a property of the CPU,
>>> not the cache. Resctrl needs to have x86's odd/even behaviour, as that
>>> its the ABI, but this isn't how the MPAM hardware works. It is entirely
>>> possible that an in-kernel user of MPAM would not be using CDP, whereas
>>> resctrl is.
>>> Pretending L3CODE and L3DATA are entirely separate resources is a neat
>>> trick, but doing this is specific to x86.
>>> Doing this leaves the arch code in control of various parts of the
>>> filesystem ABI: the resources names, and the way the schemata are parsed.
>>> Allowing this stuff to vary between architectures is bad for user space.
>>>
>>> This series collapses the CODE/DATA resources, moving all the user-visible
>>> resctrl ABI into what becomes the filesystem code. CDP becomes the type of
>>> configuration being applied to a cache. This is done by adding a
>>> struct resctrl_schema to the parts of resctrl that will move to fs. This
>>> holds the arch-code resource that is in use for this schema, along with
>>> other properties like the name, and whether the configuration being applied
>>> is CODE/DATA/BOTH.
> 
> 
>> I applied your patches on my AMD box.
> 
> Great! Thanks for taking a look,
> 
> 
>> Seeing some difference in the behavior.
> 
> Ooer,
> 
> 
>> Before these patches.
>>
>> # dmesg |grep -i resctrl
>> [   13.076973] resctrl: L3 allocation detected
>> [   13.087835] resctrl: L3DATA allocation detected
>> [   13.092886] resctrl: L3CODE allocation detected
>> [   13.097936] resctrl: MB allocation detected
>> [   13.102599] resctrl: L3 monitoring detected
>>
>>
>> After the patches.
>>
>> # dmesg |grep -i resctrl
>> [   13.076973] resctrl: L3 allocation detected
>> [   13.097936] resctrl: MB allocation detected
>> [   13.102599] resctrl: L3 monitoring detected
>>
>> You can see that L3DATA and L3CODE disappeared. I think we should keep the
>> behavior same for x86(at least).
> 
> This is the kernel log ... what user-space software is parsing that for an 
> expected value?
> What happens if the resctrl strings have been overwritten by more kernel log?
> 
> I don't think user-space should be relying on this. I'd argue any user-space 
> doing this is
> already broken. Is it just the kernel selftest's filter_dmesg()? It doesn't 
> seem to do
> anything useful
> 
> Whether resctrl is support can be read from /proc/filesystems. CDP is 
> probably a
> try-it-and-see. User-space could parse /proc/cpuinfo, but its probably not a 
> good idea.

Yes. Agree. Looking at the dmesg may no be right way to figure out all the
support details. As a normal practice, I searched for these texts and
noticed difference. That is why I felt it is best to keep those texts same
as before.
> 
> 
> Its easy to fix, but it seems odd that the kernel has to print things for 
> user-space to
> try and parse. (I'd like to point at the user-space software that depends on 
> this)

I dont think there is any software that parses the dmesg for these
details. These are info messages for the developers.

> 
> 
>> I am still not clear why we needed resctrl_conf_type
>>
>> enum resctrl_conf_type {
>> CDP_BOTH,
>> CDP_CODE,
>> CDP_DATA,
>> };
>>
>> Right now, I see all the resources are initialized as CDP_BOTH.
>>
>>  [RDT_RESOURCE_L3] =
>> {
>> .conf_type  = CDP_BOTH,
>>  [RDT_RESOURCE_L2] =
>> {
>> .conf_type  = CDP_BOTH,
>>  [RDT_RESOURCE_MBA] =
>> {
>> .conf_type  = CDP_BOTH,
> 
> Ah, those should have been removed in patch 24. Once all the resources are 
> the same, the
> resource doesn't need to describe what kind it is.
> 
> 
>> If all the resources are CDP_BOTH, then why we need separate CDP_CODE and
>> CDP_DATA?
> 
> The filesystem code for r

Re: [PATCH v2 00/24] x86/resctrl: Merge the CDP resources

2021-04-06 Thread James Morse
Hi Babu,

On 30/03/2021 21:36, Babu Moger wrote:
> On 3/12/21 11:58 AM, James Morse wrote:
>> This series re-folds the resctrl code so the CDP resources (L3CODE et al)
>> behaviour is all contained in the filesystem parts, with a minimum amount
>> of arch specific code.
>>
>> Arm have some CPU support for dividing caches into portions, and
>> applying bandwidth limits at various points in the SoC. The collective term
>> for these features is MPAM: Memory Partitioning and Monitoring.
>>
>> MPAM is similar enough to Intel RDT, that it should use the defacto linux
>> interface: resctrl. This filesystem currently lives under arch/x86, and is
>> tightly coupled to the architecture.
>> Ultimately, my plan is to split the existing resctrl code up to have an
>> arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
>> MPAM can be wired up.
>>
>> x86 might have two resources with cache controls, (L2 and L3) but has
>> extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
>> if CDP is enabled for the corresponding cache.
>>
>> MPAM has an equivalent feature to CDP, but its a property of the CPU,
>> not the cache. Resctrl needs to have x86's odd/even behaviour, as that
>> its the ABI, but this isn't how the MPAM hardware works. It is entirely
>> possible that an in-kernel user of MPAM would not be using CDP, whereas
>> resctrl is.
>> Pretending L3CODE and L3DATA are entirely separate resources is a neat
>> trick, but doing this is specific to x86.
>> Doing this leaves the arch code in control of various parts of the
>> filesystem ABI: the resources names, and the way the schemata are parsed.
>> Allowing this stuff to vary between architectures is bad for user space.
>>
>> This series collapses the CODE/DATA resources, moving all the user-visible
>> resctrl ABI into what becomes the filesystem code. CDP becomes the type of
>> configuration being applied to a cache. This is done by adding a
>> struct resctrl_schema to the parts of resctrl that will move to fs. This
>> holds the arch-code resource that is in use for this schema, along with
>> other properties like the name, and whether the configuration being applied
>> is CODE/DATA/BOTH.


> I applied your patches on my AMD box.

Great! Thanks for taking a look,


> Seeing some difference in the behavior.

Ooer,


> Before these patches.
> 
> # dmesg |grep -i resctrl
> [   13.076973] resctrl: L3 allocation detected
> [   13.087835] resctrl: L3DATA allocation detected
> [   13.092886] resctrl: L3CODE allocation detected
> [   13.097936] resctrl: MB allocation detected
> [   13.102599] resctrl: L3 monitoring detected
> 
> 
> After the patches.
> 
> # dmesg |grep -i resctrl
> [   13.076973] resctrl: L3 allocation detected
> [   13.097936] resctrl: MB allocation detected
> [   13.102599] resctrl: L3 monitoring detected
> 
> You can see that L3DATA and L3CODE disappeared. I think we should keep the
> behavior same for x86(at least).

This is the kernel log ... what user-space software is parsing that for an 
expected value?
What happens if the resctrl strings have been overwritten by more kernel log?

I don't think user-space should be relying on this. I'd argue any user-space 
doing this is
already broken. Is it just the kernel selftest's filter_dmesg()? It doesn't 
seem to do
anything useful

Whether resctrl is support can be read from /proc/filesystems. CDP is probably a
try-it-and-see. User-space could parse /proc/cpuinfo, but its probably not a 
good idea.


Its easy to fix, but it seems odd that the kernel has to print things for 
user-space to
try and parse. (I'd like to point at the user-space software that depends on 
this)


> I am still not clear why we needed resctrl_conf_type
> 
> enum resctrl_conf_type {
> CDP_BOTH,
> CDP_CODE,
> CDP_DATA,
> };
> 
> Right now, I see all the resources are initialized as CDP_BOTH.
> 
>  [RDT_RESOURCE_L3] =
> {
> .conf_type  = CDP_BOTH,
>  [RDT_RESOURCE_L2] =
> {
> .conf_type  = CDP_BOTH,
>  [RDT_RESOURCE_MBA] =
> {
> .conf_type  = CDP_BOTH,

Ah, those should have been removed in patch 24. Once all the resources are the 
same, the
resource doesn't need to describe what kind it is.


> If all the resources are CDP_BOTH, then why we need separate CDP_CODE and
> CDP_DATA?

The filesystem code for resctrl that will eventually move out of arch/x86 needs 
to be able
to describe the type of configuration change being made back to the arch code. 
The enum
gets used for that.

x86 needs this as it affects which MSRs the configuration value is written to.


> Are these going to be different for ARM?

Nope. Arm's MPAM ends up emulating CDP with the closid values that get applied 
to
transactions.


> Also initializing RDT_RESOURCE_MBA as CDP_BOTH does not seem right. I dont
> think there will CDP support in MBA in future.

Its not code or data, which m

Re: [PATCH v2 00/24] x86/resctrl: Merge the CDP resources

2021-03-30 Thread Babu Moger
Hi James,
Thanks for the patches. Few comments below.

On 3/12/21 11:58 AM, James Morse wrote:
> Hi folks,
> 
> Thanks to Reinette and Jamie for the comments on v1. Major changes in v2 are
> to keep the closid in resctrl_arch_update_domains(), eliminating one patch,
> splitting another that was making two sorts of change, and to re-order the
> first few patches. See each patches changelog for more.
> 
> 
> This series re-folds the resctrl code so the CDP resources (L3CODE et al)
> behaviour is all contained in the filesystem parts, with a minimum amount
> of arch specific code.
> 
> Arm have some CPU support for dividing caches into portions, and
> applying bandwidth limits at various points in the SoC. The collective term
> for these features is MPAM: Memory Partitioning and Monitoring.
> 
> MPAM is similar enough to Intel RDT, that it should use the defacto linux
> interface: resctrl. This filesystem currently lives under arch/x86, and is
> tightly coupled to the architecture.
> Ultimately, my plan is to split the existing resctrl code up to have an
> arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
> MPAM can be wired up.
> 
> x86 might have two resources with cache controls, (L2 and L3) but has
> extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
> if CDP is enabled for the corresponding cache.
> 
> MPAM has an equivalent feature to CDP, but its a property of the CPU,
> not the cache. Resctrl needs to have x86's odd/even behaviour, as that
> its the ABI, but this isn't how the MPAM hardware works. It is entirely
> possible that an in-kernel user of MPAM would not be using CDP, whereas
> resctrl is.
> Pretending L3CODE and L3DATA are entirely separate resources is a neat
> trick, but doing this is specific to x86.
> Doing this leaves the arch code in control of various parts of the
> filesystem ABI: the resources names, and the way the schemata are parsed.
> Allowing this stuff to vary between architectures is bad for user space.
> 
> This series collapses the CODE/DATA resources, moving all the user-visible
> resctrl ABI into what becomes the filesystem code. CDP becomes the type of
> configuration being applied to a cache. This is done by adding a
> struct resctrl_schema to the parts of resctrl that will move to fs. This
> holds the arch-code resource that is in use for this schema, along with
> other properties like the name, and whether the configuration being applied
> is CODE/DATA/BOTH.

I applied your patches on my AMD box. Seeing some difference in the behavior.

Before these patches.

# dmesg |grep -i resctrl
[   13.076973] resctrl: L3 allocation detected
[   13.087835] resctrl: L3DATA allocation detected
[   13.092886] resctrl: L3CODE allocation detected
[   13.097936] resctrl: MB allocation detected
[   13.102599] resctrl: L3 monitoring detected


After the patches.

# dmesg |grep -i resctrl
[   13.076973] resctrl: L3 allocation detected
[   13.097936] resctrl: MB allocation detected
[   13.102599] resctrl: L3 monitoring detected

You can see that L3DATA and L3CODE disappeared. I think we should keep the
behavior same for x86(at least).



I am still not clear why we needed resctrl_conf_type

enum resctrl_conf_type {
CDP_BOTH,
CDP_CODE,
CDP_DATA,
};

Right now, I see all the resources are initialized as CDP_BOTH.

 [RDT_RESOURCE_L3] =
{
.conf_type  = CDP_BOTH,
 [RDT_RESOURCE_L2] =
{
.conf_type  = CDP_BOTH,
 [RDT_RESOURCE_MBA] =
{
.conf_type  = CDP_BOTH,

If all the resources are CDP_BOTH, then why we need separate CDP_CODE and
CDP_DATA? Are these going to be different for ARM?

Also initializing RDT_RESOURCE_MBA as CDP_BOTH does not seem right. I dont
think there will CDP support in MBA in future.



> 
> This lets us fold the extra resources out of the arch code so that they
> don't need to be duplicated if the equivalent feature to CDP is missing, or
> implemented in a different way.
> 
> 
> The first two patches split the resource and domain structs to have an
> arch specific 'hw' portion, and the rest that is visible to resctrl.
> Future series massage the resctrl code so there are no accesses to 'hw'
> structures in the parts of resctrl that will move to fs, providing helpers
> where necessary.
> 
> This series adds temporary scaffolding, which it removes a few patches
> later. This is to allow things like the ctrlval arrays and resources to be
> merged separately, which should make is easier to bisect. These things
> are marked temporary, and should all be gone by the end of the series.
> 
> This series is a little rough around the monitors, would a fake
> struct resctrl_schema for the monitors simplify things, or be a source
> of bugs?
> 
> This series is based on v5.12-rc2, and can be retrieved from:
> git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git 
> mpam/resctrl_merg

[PATCH v2 00/24] x86/resctrl: Merge the CDP resources

2021-03-12 Thread James Morse
Hi folks,

Thanks to Reinette and Jamie for the comments on v1. Major changes in v2 are
to keep the closid in resctrl_arch_update_domains(), eliminating one patch,
splitting another that was making two sorts of change, and to re-order the
first few patches. See each patches changelog for more.


This series re-folds the resctrl code so the CDP resources (L3CODE et al)
behaviour is all contained in the filesystem parts, with a minimum amount
of arch specific code.

Arm have some CPU support for dividing caches into portions, and
applying bandwidth limits at various points in the SoC. The collective term
for these features is MPAM: Memory Partitioning and Monitoring.

MPAM is similar enough to Intel RDT, that it should use the defacto linux
interface: resctrl. This filesystem currently lives under arch/x86, and is
tightly coupled to the architecture.
Ultimately, my plan is to split the existing resctrl code up to have an
arch<->fs abstraction, then move all the bits out to fs/resctrl. From there
MPAM can be wired up.

x86 might have two resources with cache controls, (L2 and L3) but has
extra copies for CDP: L{2,3}{CODE,DATA}, which are marked as enabled
if CDP is enabled for the corresponding cache.

MPAM has an equivalent feature to CDP, but its a property of the CPU,
not the cache. Resctrl needs to have x86's odd/even behaviour, as that
its the ABI, but this isn't how the MPAM hardware works. It is entirely
possible that an in-kernel user of MPAM would not be using CDP, whereas
resctrl is.
Pretending L3CODE and L3DATA are entirely separate resources is a neat
trick, but doing this is specific to x86.
Doing this leaves the arch code in control of various parts of the
filesystem ABI: the resources names, and the way the schemata are parsed.
Allowing this stuff to vary between architectures is bad for user space.

This series collapses the CODE/DATA resources, moving all the user-visible
resctrl ABI into what becomes the filesystem code. CDP becomes the type of
configuration being applied to a cache. This is done by adding a
struct resctrl_schema to the parts of resctrl that will move to fs. This
holds the arch-code resource that is in use for this schema, along with
other properties like the name, and whether the configuration being applied
is CODE/DATA/BOTH.

This lets us fold the extra resources out of the arch code so that they
don't need to be duplicated if the equivalent feature to CDP is missing, or
implemented in a different way.


The first two patches split the resource and domain structs to have an
arch specific 'hw' portion, and the rest that is visible to resctrl.
Future series massage the resctrl code so there are no accesses to 'hw'
structures in the parts of resctrl that will move to fs, providing helpers
where necessary.

This series adds temporary scaffolding, which it removes a few patches
later. This is to allow things like the ctrlval arrays and resources to be
merged separately, which should make is easier to bisect. These things
are marked temporary, and should all be gone by the end of the series.

This series is a little rough around the monitors, would a fake
struct resctrl_schema for the monitors simplify things, or be a source
of bugs?

This series is based on v5.12-rc2, and can be retrieved from:
git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git 
mpam/resctrl_merge_cdp/v2

v1 was posted here:
https://lore.kernel.org/lkml/20201030161120.227225-1-james.mo...@arm.com/

Parts were previously posted as an RFC here:
https://lore.kernel.org/lkml/20200214182947.39194-1-james.mo...@arm.com/


Thanks,

James Morse (24):
  x86/resctrl: Split struct rdt_resource
  x86/resctrl: Split struct rdt_domain
  x86/resctrl: Add a separate schema list for resctrl
  x86/resctrl: Pass the schema in info dir's private pointer
  x86/resctrl: Label the resources with their configuration type
  x86/resctrl: Walk the resctrl schema list instead of an arch list
  x86/resctrl: Store the effective num_closid in the schema
  x86/resctrl: Add resctrl_arch_get_num_closid()
  x86/resctrl: Pass the schema to resctrl filesystem functions
  x86/resctrl: Swizzle rdt_resource and resctrl_schema in
pseudo_lock_region
  x86/resctrl: Move the schemata names into struct resctrl_schema
  x86/resctrl: Group staged configuration into a separate struct
  x86/resctrl: Allow different CODE/DATA configurations to be staged
  x86/resctrl: Rename update_domains() resctrl_arch_update_domains()
  x86/resctrl: Add a helper to read a closid's configuration
  x86/resctrl: Add a helper to read/set the CDP configuration
  x86/resctrl: Use cdp_enabled in rdt_domain_reconfigure_cdp()
  x86/resctrl: Pass configuration type to resctrl_arch_get_config()
  x86/resctrl: Make ctrlval arrays the same size
  x86/resctrl: Apply offset correction when config is staged
  x86/resctrl: Calculate the index from the configuration type
  x86/resctrl: Merge the ctrl_val arrays
  x86/resctrl: Remove rdt_cdp_peer_get()
  x86/resctrl: Merge