Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-19 Thread Dietmar Eggemann
On 19/04/2021 04:55, Ruifeng Zhang wrote:
> On Sat, Apr 17, 2021 at 1:00 AM Dietmar Eggemann wrote:
>>
>> On 16/04/2021 13:04, Ruifeng Zhang wrote:
>>> On Fri, Apr 16, 2021 at 6:39 PM Dietmar Eggemann wrote:
>>>>
>>>> On 16/04/2021 11:32, Valentin Schneider wrote:
>>>>> On 16/04/21 15:47, Ruifeng Zhang wrote:

[...]

>> I'm afraid that this is now a much weaker case to get this into
>> mainline.
> 
> But it's still a problem, and the patch doesn't break the original logic
> (parsing the topology from MPIDR, or parsing capacity); it only adds
> support for parsing the topology from DT.
> I think it should still be merged into mainline. If it isn't, the
> DynamIQ SoC will have issues in sched and cpufreq.

IMHO, not necessarily. Your DynamIQ SoC is one cluster with 8 CPUs. It's
subdivided into 2 Frequency Domains (FDs).

CFS Energy-Aware-Scheduling (EAS, find_energy_efficient_cpu()) and
Capacity-Aware-Scheduling (CAS, select_idle_sibling() ->
select_idle_capacity()) work correctly even in case you only have an MC
sched domain (sd).
No matter which sd (MC or DIE) sd_asym_cpucapacity points to, we always
iterate over all CPUs: per Performance Domain (i.e. FD) in EAS and
over sched_domain_span(sd) in CAS.

CFS load-balancing (in case your system is `over-utilized`) might work
slightly differently due to the missing DIE sd, but not necessarily worse.

Do you have benchmarks or testcases in mind which convince you that
Phantom Domains are something you need? BTW, they are called
Phantom since they let you use uarch and/or max CPU frequency domains to
fake real topology (like LLC) boundaries.

[...]

> Why do you keep the topology_parse_cpu_capacity logic in the arm
> get_coretype_capacity function? capacity-dmips-mhz will be parsed
> by drivers/base/arch_topology.c as follows:
> parse_dt_topology
>   parse_cluster
>     parse_core
>       get_cpu_for_node
>         topology_parse_cpu_capacity

I think we still need it for systems out there w/o cpu-map in dt, like
my arm32 TC2 with mainline vexpress-v2p-ca15_a7.dts.

It's called twice on each CPU in case I add the cpu-map dt entry though.
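
For reference, a minimal Python sketch of what happens to the parsed
capacity-dmips-mhz values afterwards. It is modelled loosely on
topology_normalize_cpu_scale() in drivers/base/arch_topology.c: each raw
capacity is multiplied by a per-CPU frequency factor and the result is
scaled so that the biggest CPU ends up at 1024. The Python names are
illustrative only, and the equal frequency factors are a simplifying
assumption, not the real TC2 frequencies.

```python
SCHED_CAPACITY_SHIFT = 10  # normalized capacities top out at 1 << 10 = 1024


def normalize_cpu_scale(raw_capacity, freq_factor):
    """Scale raw_capacity[i] * freq_factor[i] so the largest product maps to 1024."""
    products = [cap * freq for cap, freq in zip(raw_capacity, freq_factor)]
    capacity_scale = max(products + [1])  # avoid dividing by zero
    return [(p << SCHED_CAPACITY_SHIFT) // capacity_scale for p in products]


# 1024 (A15) and 516 (A7) are the capacity-dmips-mhz values seen in the dts.
# Assumptions: the remaining A7s also use 516, and all frequency factors are equal.
raw = [1024, 1024, 516, 516, 516]
print(normalize_cpu_scale(raw, [1] * len(raw)))
```

Since normalization only depends on the final per-CPU raw values, parsing
the same property twice should not change the result in this model.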


Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-18 Thread Ruifeng Zhang
On Sat, Apr 17, 2021 at 1:00 AM Dietmar Eggemann wrote:
>
> On 16/04/2021 13:04, Ruifeng Zhang wrote:
> > On Fri, Apr 16, 2021 at 6:39 PM Dietmar Eggemann wrote:
> >>
> >> On 16/04/2021 11:32, Valentin Schneider wrote:
> >>> On 16/04/21 15:47, Ruifeng Zhang wrote:
>
> [...]
>
> >> I'm confused. Do you have the MT bit set to 1 then? So the issue that
> >> the mpidr handling in arm32's store_cpu_topology() is not correct does
> >> not exist?
> >
> > I have reconfirmed it: the MT bit is set to 1. I am sorry for the
> > confusion in my previous messages.
> > The MPIDR parsing in store_cpu_topology() is correct, at least for the sc9863a.
>
> Nice! This is sorted then.
>
> [...]
>
> >> Is this what you need for your arm32 kernel system? Adding the
> >> possibility to parse cpu-map to create Phantom Domains?
> >
> > Yes, I need to parse the DT cpu-map to create different Phantom Domains.
> > With it, the dts should be changed to:
> > cpu-map {
> > cluster0 {
> > core0 {
> > cpu = <>;
> > };
> > core1 {
> > cpu = <>;
> > };
> > core2 {
> > cpu = <>;
> > };
> > core3 {
> > cpu = <>;
> > };
> > };
> >
> > cluster1 {
> > core0 {
> > cpu = <>;
> > };
> > core1 {
> > cpu = <>;
> > };
> > core2 {
> > cpu = <>;
> > };
> > core3 {
> > cpu = <>;
> > };
> > };
> > };
> >
>
> I'm afraid that this is now a much weaker case to get this into
> mainline.

But it's still a problem, and the patch doesn't break the original logic
(parsing the topology from MPIDR, or parsing capacity); it only adds
support for parsing the topology from DT.
I think it should still be merged into mainline. If it isn't, the
DynamIQ SoC will have issues in sched and cpufreq.
>
> I'm able to run with an extra cpu-map entry:

Great.
>
> diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> index 012d40a7228c..f60d9b448253 100644
> --- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> +++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> @@ -35,6 +35,29 @@ cpus {
> #address-cells = <1>;
> #size-cells = <0>;
>
> +   cpu-map {
> +   cluster0 {
> +   core0 {
> +   cpu = <>;
> +   };
> +   core1 {
> +   cpu = <>;
> +   };
> +   };
> +
> +   cluster1 {
> +   core0 {
> +   cpu = <>;
> +   };
> +   core1 {
> +   cpu = <>;
> +   };
> +   core2 {
> +   cpu = <>;
> +   };
> +   };
> +   };
> +
> cpu0: cpu@0 {
>
> a condensed version (see below) of your patch on my Arm32 TC2.
> The move of update_cpu_capacity() in store_cpu_topology() is only
> necessary when I use the old clock-frequency based cpu_efficiency
> approach for asymmetric CPU capacity (TC2 is a15/a7):
>
> diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> index f60d9b448253..e0679cca40ed 100644
> --- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> +++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
> @@ -64,7 +64,7 @@ cpu0: cpu@0 {
> reg = <0>;
> cci-control-port = <_control1>;
> cpu-idle-states = <_SLEEP_BIG>;
> -   capacity-dmips-mhz = <1024>;
> +   clock-frequency = <10>;
> dynamic-power-coefficient = <990>;
> };
>
> @@ -74,7 +74,7 @@ cpu1: cpu@1 {
> reg = <1>;
> cci-control-port = <_control1>;
> cpu-idle-states = <_SLEEP_BIG>;
> -   capacity-dmips-mhz = <1024>;
> +   

Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-16 Thread Dietmar Eggemann
On 16/04/2021 13:04, Ruifeng Zhang wrote:
> On Fri, Apr 16, 2021 at 6:39 PM Dietmar Eggemann wrote:
>>
>> On 16/04/2021 11:32, Valentin Schneider wrote:
>>> On 16/04/21 15:47, Ruifeng Zhang wrote:

[...]

>> I'm confused. Do you have the MT bit set to 1 then? So the issue that
>> the mpidr handling in arm32's store_cpu_topology() is not correct does
>> not exist?
> 
> I have reconfirmed it: the MT bit is set to 1. I am sorry for the
> confusion in my previous messages.
> The MPIDR parsing in store_cpu_topology() is correct, at least for the sc9863a.

Nice! This is sorted then.

[...]

>> Is this what you need for your arm32 kernel system? Adding the
>> possibility to parse cpu-map to create Phantom Domains?
> 
> Yes, I need to parse the DT cpu-map to create different Phantom Domains.
> With it, the dts should be changed to:
> cpu-map {
> cluster0 {
> core0 {
> cpu = <>;
> };
> core1 {
> cpu = <>;
> };
> core2 {
> cpu = <>;
> };
> core3 {
> cpu = <>;
> };
> };
> 
> cluster1 {
> core0 {
> cpu = <>;
> };
> core1 {
> cpu = <>;
> };
> core2 {
> cpu = <>;
> };
> core3 {
> cpu = <>;
> };
> };
> };
> 

I'm afraid that this is now a much weaker case to get this into
mainline.

I'm able to run with an extra cpu-map entry:

diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
index 012d40a7228c..f60d9b448253 100644
--- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
+++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
@@ -35,6 +35,29 @@ cpus {
#address-cells = <1>;
#size-cells = <0>;
 
+   cpu-map {
+   cluster0 {
+   core0 {
+   cpu = <>;
+   };
+   core1 {
+   cpu = <>;
+   };
+   };
+
+   cluster1 {
+   core0 {
+   cpu = <>;
+   };
+   core1 {
+   cpu = <>;
+   };
+   core2 {
+   cpu = <>;
+   };
+   };
+   };
+
cpu0: cpu@0 {
 
a condensed version (see below) of your patch on my Arm32 TC2.
The move of update_cpu_capacity() in store_cpu_topology() is only
necessary when I use the old clock-frequency based cpu_efficiency
approach for asymmetric CPU capacity (TC2 is a15/a7):

diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
index f60d9b448253..e0679cca40ed 100644
--- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
+++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
@@ -64,7 +64,7 @@ cpu0: cpu@0 {
reg = <0>;
cci-control-port = <_control1>;
cpu-idle-states = <_SLEEP_BIG>;
-   capacity-dmips-mhz = <1024>;
+   clock-frequency = <10>;
dynamic-power-coefficient = <990>;
};
 
@@ -74,7 +74,7 @@ cpu1: cpu@1 {
reg = <1>;
cci-control-port = <_control1>;
cpu-idle-states = <_SLEEP_BIG>;
-   capacity-dmips-mhz = <1024>;
+   clock-frequency = <10>;
dynamic-power-coefficient = <990>;
};
 
@@ -84,7 +84,7 @@ cpu2: cpu@2 {
reg = <0x100>;
cci-control-port = <_control2>;
cpu-idle-states = <_SLEEP_LITTLE>;
-   capacity-dmips-mhz = <516>;
+   clock-frequency = <8>;
dynamic-power-coefficient = <133>;
};
 
@@ -94,7 +94,7 @@ cpu3: cpu@3 {
reg = <0x101>;
 

Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-16 Thread Ruifeng Zhang
On Fri, Apr 16, 2021 at 6:39 PM Dietmar Eggemann wrote:
>
> On 16/04/2021 11:32, Valentin Schneider wrote:
> > On 16/04/21 15:47, Ruifeng Zhang wrote:
> >> There is a further requirement, though: if all cores are in one
> >> physical cluster, {aff2} has the same value for all of them.
> >> E.g. on the sc9863a:
> >> core0: 81000000
> >> core1: 81000100
> >> core2: 81000200
> >> core3: 81000300
> >> core4: 81000400
> >> core5: 81000500
> >> core6: 81000600
> >> core7: 81000700
> >>
> >> According to MPIDR, all cores are parsed into one cluster, but this
> >> is a big.LITTLE system and needs two logical clusters for scheduling
> >> and cpufreq.
> >> So I think it's better to add logic to parse the topology from DT.
> >
> > Ah, so it's a slightly different issue, but still one that requires a
> > different means of specifying topology.
>
> I'm confused. Do you have the MT bit set to 1 then? So the issue that
> the mpidr handling in arm32's store_cpu_topology() is not correct does
> not exist?

I have reconfirmed it: the MT bit is set to 1. I am sorry for the
confusion in my previous messages.
The MPIDR parsing in store_cpu_topology() is correct, at least for the sc9863a.

>
> With DynamIQ you have only *one* cluster, you should also be able to run
> your big.LITTLE system with only an MC sched domain.
>
> # cat /proc/schedstat
> cpu0 
> domain0 ff ... <- MC
> ...
>
> You can introduce a cpu-map to create what we called Phantom Domains in
> Android products.
>
> # cat /proc/schedstat
>
> cpu0 
> domain0 0f ... <- MC
> domain1 ff ... <- DIE
>
> Is this what you need for your arm32 kernel system? Adding the
> possibility to parse cpu-map to create Phantom Domains?

Yes, I need to parse the DT cpu-map to create different Phantom Domains.
With it, the dts should be changed to:
cpu-map {
    cluster0 {
        core0 {
            cpu = <>;
        };
        core1 {
            cpu = <>;
        };
        core2 {
            cpu = <>;
        };
        core3 {
            cpu = <>;
        };
    };

    cluster1 {
        core0 {
            cpu = <>;
        };
        core1 {
            cpu = <>;
        };
        core2 {
            cpu = <>;
        };
        core3 {
            cpu = <>;
        };
    };
};
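
To illustrate the intent, here is a simplified Python model (not kernel
code) of how such a two-cluster cpu-map would translate into sched domain
spans. Which CPUs the phandles refer to is an assumption here: cluster0 is
taken to hold CPUs 0-3 and cluster1 CPUs 4-7. CPU_MAP, cpumask() and
domain_spans() are hypothetical names.

```python
# Assumed CPU numbering behind the cpu-map phandles.
CPU_MAP = {
    "cluster0": [0, 1, 2, 3],
    "cluster1": [4, 5, 6, 7],
}


def cpumask(cpus):
    """Build an integer bitmask with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def domain_spans(cpu, cpu_map):
    """Return (MC span, DIE span) for cpu as integer bitmasks."""
    die = cpumask(c for cores in cpu_map.values() for c in cores)
    for cores in cpu_map.values():
        if cpu in cores:
            # MC covers the CPU's own cluster, DIE covers every cluster.
            return cpumask(cores), die
    raise ValueError(f"cpu{cpu} is not in the cpu-map")


mc, die = domain_spans(0, CPU_MAP)
print(f"domain0 {mc:02x} <- MC")
print(f"domain1 {die:02x} <- DIE")
```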


Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-16 Thread Valentin Schneider
On 16/04/21 12:39, Dietmar Eggemann wrote:
> On 16/04/2021 11:32, Valentin Schneider wrote:
>> On 16/04/21 15:47, Ruifeng Zhang wrote:
>>> There is a further requirement, though: if all cores are in one
>>> physical cluster, {aff2} has the same value for all of them.
>>> E.g. on the sc9863a:
>>> core0: 81000000
>>> core1: 81000100
>>> core2: 81000200
>>> core3: 81000300
>>> core4: 81000400
>>> core5: 81000500
>>> core6: 81000600
>>> core7: 81000700
>>>
>>> According to MPIDR, all cores are parsed into one cluster, but this
>>> is a big.LITTLE system and needs two logical clusters for scheduling
>>> and cpufreq.
>>> So I think it's better to add logic to parse the topology from DT.
>>
>> Ah, so it's a slightly different issue, but still one that requires a
>> different means of specifying topology.
>
> I'm confused. Do you have the MT bit set to 1 then? So the issue that
> the mpidr handling in arm32's store_cpu_topology() is not correct does
> not exist?
>
> With DynamIQ you have only *one* cluster, you should also be able to run
> your big.LITTLE system with only an MC sched domain.
>
> # cat /proc/schedstat
> cpu0 
> domain0 ff ... <- MC
> ...
>

You're right, this is actually a DynamIQ system, not a (legacy) big.LITTLE
one, so all CPUs are under the same LLC (the DSU). I probably should have
checked this earlier on, but this is quite obvious from sc9863a.dtsi:

cpu-map {
    cluster0 {
        core0 {
            cpu = <>;
        };
        core1 {
            cpu = <>;
        };
        core2 {
            cpu = <>;
        };
        core3 {
            cpu = <>;
        };
        core4 {
            cpu = <>;
        };
        core5 {
            cpu = <>;
        };
        core6 {
            cpu = <>;
        };
        core7 {
            cpu = <>;
        };
    };
};

All CPUs are in the same cluster, and the MPIDR values actually match that.

> You can introduce a cpu-map to create what we called Phantom Domains in
> Android products.
>
> # cat /proc/schedstat
>
> cpu0 
> domain0 0f ... <- MC
> domain1 ff ... <- DIE
>
> Is this what you need for your arm32 kernel system? Adding the
> possibility to parse cpu-map to create Phantom Domains?


Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-16 Thread Dietmar Eggemann
On 16/04/2021 11:32, Valentin Schneider wrote:
> On 16/04/21 15:47, Ruifeng Zhang wrote:
>> There is a further requirement, though: if all cores are in one
>> physical cluster, {aff2} has the same value for all of them.
>> E.g. on the sc9863a:
>> core0: 81000000
>> core1: 81000100
>> core2: 81000200
>> core3: 81000300
>> core4: 81000400
>> core5: 81000500
>> core6: 81000600
>> core7: 81000700
>>
>> According to MPIDR, all cores are parsed into one cluster, but this
>> is a big.LITTLE system and needs two logical clusters for scheduling
>> and cpufreq.
>> So I think it's better to add logic to parse the topology from DT.
> 
> Ah, so it's a slightly different issue, but still one that requires a
> different means of specifying topology.

I'm confused. Do you have the MT bit set to 1 then? So the issue that
the mpidr handling in arm32's store_cpu_topology() is not correct does
not exist?

With DynamIQ you have only *one* cluster, you should also be able to run
your big.LITTLE system with only an MC sched domain.

# cat /proc/schedstat
cpu0 
domain0 ff ... <- MC
...

You can introduce a cpu-map to create what we called Phantom Domains in
Android products.

# cat /proc/schedstat

cpu0 
domain0 0f ... <- MC
domain1 ff ... <- DIE

Is this what you need for your arm32 kernel system? Adding the
possibility to parse cpu-map to create Phantom Domains?
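
The domain lines in /proc/schedstat carry the span as a hexadecimal
cpumask. A small Python sketch for turning such a mask into a CPU list,
assuming the usual hex format with optional comma-separated 32-bit groups;
cpus_in_mask() is a made-up helper name:

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hex cpumask string."""
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


print("MC :", cpus_in_mask("0f"))
print("DIE:", cpus_in_mask("ff"))
```

With the masks above, the MC domain of cpu0 spans four CPUs and the DIE
domain spans all eight.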


Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-16 Thread Valentin Schneider
On 16/04/21 15:47, Ruifeng Zhang wrote:
> There is a further requirement, though: if all cores are in one
> physical cluster, {aff2} has the same value for all of them.
> E.g. on the sc9863a:
> core0: 81000000
> core1: 81000100
> core2: 81000200
> core3: 81000300
> core4: 81000400
> core5: 81000500
> core6: 81000600
> core7: 81000700
>
> According to MPIDR, all cores are parsed into one cluster, but this
> is a big.LITTLE system and needs two logical clusters for scheduling
> and cpufreq.
> So I think it's better to add logic to parse the topology from DT.

Ah, so it's a slightly different issue, but still one that requires a
different means of specifying topology.


Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-16 Thread Ruifeng Zhang
On Fri, Apr 16, 2021 at 4:10 AM Dietmar Eggemann wrote:
>
> On 15/04/2021 20:09, Valentin Schneider wrote:
> > On 14/04/21 20:23, Ruifeng Zhang wrote:
> >> From: Ruifeng Zhang 
> >>
> >> On Unisoc's sc9863a SoC, which uses Cortex-A55, there are two software
> >> versions; in one of them the kernel runs at EL1 in aarch32.
> >>             user(EL0)  kernel(EL1)
> >> sc9863a_go  aarch32    aarch32
> >> sc9863a     aarch64    aarch64
> >>
> >> When the kernel runs at EL1 in aarch32, the topology is parsed wrongly.
> >> The MPIDR has been written to the chip register in armv8.2 format.
> >> For example,
> >> core0: 80000000
> >> core1: 80000100
> >> core2: 80000200
> >> ...
> >>
> >> It will be parsed as:
> >> |       | aff2 | packageid | coreid |
> >> |-------+------+-----------+--------|
> >> | Core0 |    0 |         0 |      0 |
> >> | Core1 |    0 |         1 |      0 |
> >> | Core2 |    0 |         2 |      0 |
> >> |  ...  |      |           |        |
> >>
> >> The topology is wrong in that every coreid is 0 and the packageid
> >> values are unexpected.
> >>
> >> The reason is that the MPIDR format differs between armv7 and armv8.2.
> >> The armv7 (A7) mpidr is:
> >> [11:8]    [7:2]      [1:0]
> >> cluster   reserved   cpu
> >> See the Cortex-A7 spec, DDI0464F, section 4.3.5:
> >> https://developer.arm.com/documentation/ddi0464/f/?lang=en
> >>
> >> The armv8.2 (A55) mpidr is:
> >> [23:16]   [15:8]   [7:0]
> >> cluster   cpu      thread
> >>
> >
> > What I had understood from our conversation was that there *isn't* a format
> > difference (at least for the bottom 32 bits) - arm64/kernel/topology.c
> > would parse it the same, except that MPIDR parsing has been deprecated for
> > arm64.

I agree, I should change my description.

> >
> > The problem is that those MPIDR values don't match the actual topology. If
> > they had the MT bit set, i.e.
> >
> >   core0: 81000000
> >   core1: 81000100
> >   core2: 81000200
> >
> > then it would be parsed as:
> >
> >   |       | package_id | core_id | thread_id |
> >   |-------+------------+---------+-----------|
> >   | Core0 |          0 |       0 |         0 |
> >   | Core1 |          0 |       1 |         0 |
> >   | Core2 |          0 |       2 |         0 |
> >
> > which would make more sense (wrt the actual, physical topology).
>
> ... and this would be in sync with
> https://developer.arm.com/documentation/100442/0200/register-descriptions/aarch32-system-registers/mpidr--multiprocessor-affinity-register
>
> MT, [24]
>
>0b1 ...
>
> There is no 0b0 for MT.
>
As you said, MT must be 0b1, so {aff1} is the coreid on A55, and parsing
the coreid is not a problem.

There is a further requirement, though: if all cores are in one physical
cluster, {aff2} has the same value for all of them.
E.g. on the sc9863a:
core0: 81000000
core1: 81000100
core2: 81000200
core3: 81000300
core4: 81000400
core5: 81000500
core6: 81000600
core7: 81000700

According to MPIDR, all cores are parsed into one cluster, but this is a
big.LITTLE system and needs two logical clusters for scheduling and
cpufreq.
So I think it's better to add logic to parse the topology from DT.


Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-15 Thread Dietmar Eggemann
On 15/04/2021 20:09, Valentin Schneider wrote:
> On 14/04/21 20:23, Ruifeng Zhang wrote:
>> From: Ruifeng Zhang 
>>
>> On Unisoc's sc9863a SoC, which uses Cortex-A55, there are two software
>> versions; in one of them the kernel runs at EL1 in aarch32.
>>             user(EL0)  kernel(EL1)
>> sc9863a_go  aarch32    aarch32
>> sc9863a     aarch64    aarch64
>>
>> When the kernel runs at EL1 in aarch32, the topology is parsed wrongly.
>> The MPIDR has been written to the chip register in armv8.2 format.
>> For example,
>> core0: 80000000
>> core1: 80000100
>> core2: 80000200
>> ...
>>
>> It will be parsed as:
>> |       | aff2 | packageid | coreid |
>> |-------+------+-----------+--------|
>> | Core0 |    0 |         0 |      0 |
>> | Core1 |    0 |         1 |      0 |
>> | Core2 |    0 |         2 |      0 |
>> |  ...  |      |           |        |
>>
>> The topology is wrong in that every coreid is 0 and the packageid
>> values are unexpected.
>>
>> The reason is that the MPIDR format differs between armv7 and armv8.2.
>> The armv7 (A7) mpidr is:
>> [11:8]    [7:2]      [1:0]
>> cluster   reserved   cpu
>> See the Cortex-A7 spec, DDI0464F, section 4.3.5:
>> https://developer.arm.com/documentation/ddi0464/f/?lang=en
>>
>> The armv8.2 (A55) mpidr is:
>> [23:16]   [15:8]   [7:0]
>> cluster   cpu      thread
>>
> 
> What I had understood from our conversation was that there *isn't* a format
> difference (at least for the bottom 32 bits) - arm64/kernel/topology.c
> would parse it the same, except that MPIDR parsing has been deprecated for
> arm64.
> 
> The problem is that those MPIDR values don't match the actual topology. If
> they had the MT bit set, i.e.
> 
>   core0: 81000000
>   core1: 81000100
>   core2: 81000200
> 
> then it would be parsed as:
> 
>   |       | package_id | core_id | thread_id |
>   |-------+------------+---------+-----------|
>   | Core0 |          0 |       0 |         0 |
>   | Core1 |          0 |       1 |         0 |
>   | Core2 |          0 |       2 |         0 |
> 
> which would make more sense (wrt the actual, physical topology).

... and this would be in sync with
https://developer.arm.com/documentation/100442/0200/register-descriptions/aarch32-system-registers/mpidr--multiprocessor-affinity-register

MT, [24]

   0b1 ...

There is no 0b0 for MT.



Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

2021-04-15 Thread Valentin Schneider
On 14/04/21 20:23, Ruifeng Zhang wrote:
> From: Ruifeng Zhang 
>
> On Unisoc's sc9863a SoC, which uses Cortex-A55, there are two software
> versions; in one of them the kernel runs at EL1 in aarch32.
>             user(EL0)  kernel(EL1)
> sc9863a_go  aarch32    aarch32
> sc9863a     aarch64    aarch64
>
> When the kernel runs at EL1 in aarch32, the topology is parsed wrongly.
> The MPIDR has been written to the chip register in armv8.2 format.
> For example,
> core0: 80000000
> core1: 80000100
> core2: 80000200
> ...
>
> It will be parsed as:
> |       | aff2 | packageid | coreid |
> |-------+------+-----------+--------|
> | Core0 |    0 |         0 |      0 |
> | Core1 |    0 |         1 |      0 |
> | Core2 |    0 |         2 |      0 |
> |  ...  |      |           |        |
>
> The topology is wrong in that every coreid is 0 and the packageid
> values are unexpected.
>
> The reason is that the MPIDR format differs between armv7 and armv8.2.
> The armv7 (A7) mpidr is:
> [11:8]    [7:2]      [1:0]
> cluster   reserved   cpu
> See the Cortex-A7 spec, DDI0464F, section 4.3.5:
> https://developer.arm.com/documentation/ddi0464/f/?lang=en
>
> The armv8.2 (A55) mpidr is:
> [23:16]   [15:8]   [7:0]
> cluster   cpu      thread
>

What I had understood from our conversation was that there *isn't* a format
difference (at least for the bottom 32 bits) - arm64/kernel/topology.c
would parse it the same, except that MPIDR parsing has been deprecated for
arm64.

The problem is that those MPIDR values don't match the actual topology. If
they had the MT bit set, i.e.

  core0: 81000000
  core1: 81000100
  core2: 81000200

then it would be parsed as:

  |       | package_id | core_id | thread_id |
  |-------+------------+---------+-----------|
  | Core0 |          0 |       0 |         0 |
  | Core1 |          0 |       1 |         0 |
  | Core2 |          0 |       2 |         0 |

which would make more sense (wrt the actual, physical topology).
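
A minimal Python sketch of that parsing, modelled on the MT-bit handling
in arm32's store_cpu_topology(). The helper names are illustrative, and
the kernel's actual code may differ in detail:

```python
MPIDR_MT_BIT = 1 << 24  # MT, bit [24]


def affinity(mpidr, level):
    """Extract the 8-bit affinity field Aff<level> (0, 1 or 2)."""
    return (mpidr >> (8 * level)) & 0xFF


def decode_mpidr(mpidr):
    """Map an MPIDR value to package/core/thread ids."""
    if mpidr & MPIDR_MT_BIT:
        # MT set: Aff0 is the thread, Aff1 the core, Aff2 the package.
        return {
            "package_id": affinity(mpidr, 2),
            "core_id": affinity(mpidr, 1),
            "thread_id": affinity(mpidr, 0),
        }
    # MT clear: Aff0 is the core, Aff1 the package, no thread level.
    return {
        "package_id": affinity(mpidr, 1),
        "core_id": affinity(mpidr, 0),
        "thread_id": -1,
    }


for value in (0x81000100, 0x80000100):
    print(f"{value:08x}", decode_mpidr(value))
```

The two values differ only in the MT bit, yet the second one moves the
core number into package_id, which is the mismatch described above.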