Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Mark Rutland
On Tue, Aug 20, 2019 at 04:55:24PM +0100, Raphael Gault wrote:
> Hi Mark,
> 
> Thank you for your comments.
> 
> On 8/20/19 4:49 PM, Mark Rutland wrote:
> > On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote:
> > > Hi Raphael,
> > > 
> > > On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:
> > > > This feature is required in order to enable PMU counters direct
> > > > access from userspace only when the system is homogeneous.
> > > > This feature checks the model of each CPU brought online and compares it
> > > > to the boot CPU. If it differs then it is heterogeneous.
> > > 
> > > It would be worth noting that this patch prevents heterogeneous CPUs
> > > being brought online late if the system was uniform at boot time.
> > 
> > Looking again, I think I'd misunderstood how
> > ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a
> > problem in this area.
> > 
> > [...]
> > 
> > > 
> > > > +   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
> > > > +   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> > > > +   .matches = has_heterogeneous_pmu,
> > > > +   },
> > 
> > I had a quick chat with Will, and we concluded that we must permit late
> > onlining of heterogeneous CPUs here as people are likely to rely on
> > late CPU onlining on some heterogeneous systems.
> > 
> > I think the above permits that, but that also means that we need some
> > support code to fail gracefully in that case (e.g. without sending
> > a SIGILL to unaware userspace code).
> 
> I understand, however, I understood that ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU
> did not allow later CPU to be heterogeneous if the capability wasn't already
> enabled.

Yes, I think that you're right. IIUC the absence of
ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU is what prevents that from
happening.

> Thus if as you say we need to allow the system to switch from
> homogeneous to heterogeneous, then I should change the type of this
> capability.

I'm afraid so!

I believe we need both ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU and
ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU, so I guess we should be using
ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE.

Does that sound right to you? ... or have I confused myself again?

Thanks,
Mark.

> > That means that we'll need the counter emulation code that you had in
> > previous versions of this patch (e.g. to handle potential UNDEFs when a
> > new CPU has fewer counters than the previously online CPUs).
> > 
> > Further, I think the context switch (and event index) code needs to take
> > this cap into account, and disable direct access once the system becomes
> > heterogeneous.
> 
> That is a good point indeed.
> 
> Thanks,
> 
> -- 
> Raphael Gault
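
The late-CPU rules debated in this message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the flag names mirror the ones quoted in the thread (without the ARM64_ prefix), but the bit positions, the two CPUCAP_TYPE_* names, and the helper late_cpu_allowed() are hypothetical and chosen for illustration. The authoritative definitions and checks are in the kernel's arm64 cpufeature code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are illustrative assumptions, not taken from the kernel. */
#define CPUCAP_SCOPE_LOCAL_CPU         ((uint16_t)(1u << 0))
#define CPUCAP_PERMITTED_FOR_LATE_CPU  ((uint16_t)(1u << 4))
#define CPUCAP_OPTIONAL_FOR_LATE_CPU   ((uint16_t)(1u << 5))

/* Type used in the v3 patch: a late CPU may lack the cap, but may not introduce it. */
#define CPUCAP_TYPE_V3_PATCH \
	(CPUCAP_SCOPE_LOCAL_CPU | CPUCAP_OPTIONAL_FOR_LATE_CPU)

/* Combination proposed in the reply: a late CPU may lack the cap or introduce it. */
#define CPUCAP_TYPE_WEAK_LOCAL \
	(CPUCAP_SCOPE_LOCAL_CPU | CPUCAP_OPTIONAL_FOR_LATE_CPU | \
	 CPUCAP_PERMITTED_FOR_LATE_CPU)

/*
 * Hypothetical helper: decide whether a late CPU may come online.
 *   system_has_cap: the capability was already set when the CPU arrives
 *   cpu_has_cap:    the capability matches on the arriving CPU
 */
static bool late_cpu_allowed(uint16_t type, bool system_has_cap, bool cpu_has_cap)
{
	if (system_has_cap)
		/* Missing an enabled cap is fine only if it is optional. */
		return cpu_has_cap || (type & CPUCAP_OPTIONAL_FOR_LATE_CPU) != 0;

	/* Introducing a new cap is fine only if it is permitted. */
	return !cpu_has_cap || (type & CPUCAP_PERMITTED_FOR_LATE_CPU) != 0;
}
```

In this model, a system that booted uniform (system_has_cap false) rejects a late CPU of a different model (cpu_has_cap true) under CPUCAP_TYPE_V3_PATCH and accepts it under CPUCAP_TYPE_WEAK_LOCAL, which is the behavioural change discussed above.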


Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Raphael Gault

Hi Mark,

Thank you for your comments.

On 8/20/19 4:49 PM, Mark Rutland wrote:
> On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote:
> > Hi Raphael,
> >
> > On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:
> > > This feature is required in order to enable PMU counters direct
> > > access from userspace only when the system is homogeneous.
> > > This feature checks the model of each CPU brought online and compares it
> > > to the boot CPU. If it differs then it is heterogeneous.
> >
> > It would be worth noting that this patch prevents heterogeneous CPUs
> > being brought online late if the system was uniform at boot time.
>
> Looking again, I think I'd misunderstood how
> ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a
> problem in this area.
>
> [...]
>
> > > +   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
> > > +   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> > > +   .matches = has_heterogeneous_pmu,
> > > +   },
>
> I had a quick chat with Will, and we concluded that we must permit late
> onlining of heterogeneous CPUs here as people are likely to rely on
> late CPU onlining on some heterogeneous systems.
>
> I think the above permits that, but that also means that we need some
> support code to fail gracefully in that case (e.g. without sending
> a SIGILL to unaware userspace code).

I understand, however, I understood that
ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU did not allow later CPU to be
heterogeneous if the capability wasn't already enabled. Thus if as you
say we need to allow the system to switch from homogeneous to
heterogeneous, then I should change the type of this capability.

> That means that we'll need the counter emulation code that you had in
> previous versions of this patch (e.g. to handle potential UNDEFs when a
> new CPU has fewer counters than the previously online CPUs).
>
> Further, I think the context switch (and event index) code needs to take
> this cap into account, and disable direct access once the system becomes
> heterogeneous.

That is a good point indeed.

Thanks,

--
Raphael Gault


Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Mark Rutland
On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote:
> Hi Raphael,
> 
> On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:
> > This feature is required in order to enable PMU counters direct
> > access from userspace only when the system is homogeneous.
> > This feature checks the model of each CPU brought online and compares it
> > to the boot CPU. If it differs then it is heterogeneous.
> 
> It would be worth noting that this patch prevents heterogeneous CPUs
> being brought online late if the system was uniform at boot time.

Looking again, I think I'd misunderstood how
ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a
problem in this area.

[...]

> 
> > +   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
> > +   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> > +   .matches = has_heterogeneous_pmu,
> > +   },

I had a quick chat with Will, and we concluded that we must permit late
onlining of heterogeneous CPUs here as people are likely to rely on
late CPU onlining on some heterogeneous systems.

I think the above permits that, but that also means that we need some
support code to fail gracefully in that case (e.g. without sending
a SIGILL to unaware userspace code).

That means that we'll need the counter emulation code that you had in
previous versions of this patch (e.g. to handle potential UNDEFs when a
new CPU has fewer counters than the previously online CPUs).

Further, I think the context switch (and event index) code needs to take
this cap into account, and disable direct access once the system becomes
heterogeneous.

Thanks,
Mark.


Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Mark Rutland
Hi Raphael,

On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:
> This feature is required in order to enable PMU counters direct
> access from userspace only when the system is homogeneous.
> This feature checks the model of each CPU brought online and compares it
> to the boot CPU. If it differs then it is heterogeneous.

It would be worth noting that this patch prevents heterogeneous CPUs
being brought online late if the system was uniform at boot time.

> 
> Signed-off-by: Raphael Gault 
> ---
>  arch/arm64/include/asm/cpucaps.h |  3 ++-
>  arch/arm64/kernel/cpufeature.c   | 20 
>  arch/arm64/kernel/perf_event.c   |  1 +
>  3 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index f19fe4b9acc4..040370af38ad 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -52,7 +52,8 @@
>  #define ARM64_HAS_IRQ_PRIO_MASKING   42
>  #define ARM64_HAS_DCPODP 43
>  #define ARM64_WORKAROUND_1463225 44
> +#define ARM64_HAS_HETEROGENEOUS_PMU  45
>  
> -#define ARM64_NCAPS  45
> +#define ARM64_NCAPS  46
>  
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 9323bcc40a58..bbdd809f12a6 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1260,6 +1260,15 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
>  }
>  #endif
>  
> +static bool has_heterogeneous_pmu(const struct arm64_cpu_capabilities *entry,
> +  int scope)
> +{
> + u32 model = read_cpuid_id() & MIDR_CPU_MODEL_MASK;
> + struct cpuinfo_arm64 *boot = &per_cpu(cpu_data, 0);
> +
> + return  (boot->reg_midr & MIDR_CPU_MODEL_MASK) != model;
> +}

We should use boot_cpu_data rather than &per_cpu(cpu_data, 0) here. We
can make that __ro_after_init, and declare it in
arch/arm64/include/asm/smp.h.

That caters for CPU0 being hotplugged off and then a different physical
CPU being hotplugged on in its place.

> +
>  static const struct arm64_cpu_capabilities arm64_features[] = {
>   {
>   .desc = "GIC system register CPU interface",
> @@ -1560,6 +1569,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>   .min_field_value = 1,
>   },
>  #endif
> + {
> + /*
> +  * Detect whether the system is heterogeneous or
> +  * homogeneous
> +  */
> + .desc = "Detect whether we have heterogeneous CPUs",

The desc gets printed in dmesg with a prefix, e.g.

[0.058267][T1] CPU features: detected: Privileged Access Never
[0.058340][T1] CPU features: detected: LSE atomic instructions
[0.058416][T1] CPU features: detected: RAS Extension Support
[0.058489][T1] CPU features: detected: CRC32 instructions

... so this should only say "Heterogeneous CPUs".

> + .capability = ARM64_HAS_HETEROGENEOUS_PMU,
> + .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> + .matches = has_heterogeneous_pmu,
> + },
>   {},
>  };
>  
> @@ -1727,6 +1746,7 @@ static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps)
>   cap_set_elf_hwcap(hwcaps);
>  }
>  
> +

This whitespace addition can go.

>  static void update_cpu_capabilities(u16 scope_mask)
>  {
>   int i;
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 2d3bdebdf6df..a0b4f1bca491 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

I think this should be added in a separate patch.

It looks like this is a missing include that we need today for
smp_processor_id(), so please spin that as a preparatory patch (with my
Acked-by).

Thanks,
Mark.

>  
>  /* ARMv8 Cortex-A53 specific event types. */
>  #define ARMV8_A53_PERFCTR_PREF_LINEFILL  0xC2
> -- 
> 2.17.1
>
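
The model comparison in has_heterogeneous_pmu(), together with the suggestion to compare against a boot-time record rather than whatever currently occupies CPU0's per-cpu slot, can be sketched in userspace C as follows. The mask layout follows the MIDR_EL1 fields (implementer in bits [31:24], architecture in [19:16], part number in [15:4]). The names boot_midr_snapshot, record_boot_cpu() and differs_from_boot_model() are hypothetical stand-ins for illustration, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 fields that identify a CPU model; variant and revision are excluded. */
#define MIDR_IMPLEMENTOR_MASK   0xff000000u  /* bits [31:24] */
#define MIDR_ARCHITECTURE_MASK  0x000f0000u  /* bits [19:16] */
#define MIDR_PARTNUM_MASK       0x0000fff0u  /* bits [15:4]  */
#define MIDR_CPU_MODEL_MASK \
	(MIDR_IMPLEMENTOR_MASK | MIDR_ARCHITECTURE_MASK | MIDR_PARTNUM_MASK)

/*
 * Hypothetical stand-in for a boot-time record: written once while booting
 * and never updated, so it stays valid even if logical CPU0 is later
 * replaced by a different physical CPU.
 */
static uint32_t boot_midr_snapshot;

static void record_boot_cpu(uint32_t midr)
{
	boot_midr_snapshot = midr;
}

/* True when the given CPU's model differs from the recorded boot CPU's model. */
static bool differs_from_boot_model(uint32_t midr)
{
	return (boot_midr_snapshot & MIDR_CPU_MODEL_MASK) !=
	       (midr & MIDR_CPU_MODEL_MASK);
}
```

As an example, with a boot MIDR of 0x410fd034 (part number 0xd03, Cortex-A53), a CPU reporting 0x410fd033 differs only in revision and counts as the same model, while 0x411fd072 (part number 0xd07, Cortex-A57) counts as different.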