Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-28 Thread Peter Zijlstra
On Thu, Feb 28, 2019 at 10:59:19AM -0500, Len Brown wrote:

> The SDM is like software -- usually (but not always) you are better
> off with the latest version:-)

If only it existed as a .tex file in a git repo. It has been very useful
to have older versions around a number of times.


Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-28 Thread Len Brown
On Tue, Feb 26, 2019 at 8:54 AM Peter Zijlstra  wrote:

> > > It would've been nice to have the CPUID instruction 1F leaf reference
> > > 3B-3.9 in the SDM, and maybe mention this here too.
> >
> > I didn't mention SDM sections because they change -- leaving stale
> > pointers in our commit messages.  The SDM is re-published 4 times per
> > year.
>
> Yah, I know. Which is why I keep all SDMs. So if you say, book 3 section
> 8 of Jul'17, I can find it :-)

The SDM is like software -- usually (but not always) you are better
off with the latest version:-)

> > Cache enumeration in Leaf-4 is totally unchanged.
> > ACPI NUMA tables are totally unchanged.
>
> Sure; and yet Sub-NUMA-Clustering broke stuff in interesting ways. I'm
> trying to get a feel for how these levels will interact with all that.
>
> Before that SNC stuff, caches had never spanned NODEs (and I still
> think that is 'creative' at best).

Yeah, SNC is sort of a curve ball.  I guess it made enough stuff run better that
it is available as an option.  But it doesn't help everything, so it is disabled
by default...

I think from a scheduler point of view, sticking with the output of
CPUID.4 for the cache topology, and the ACPI tables for the
node topology/distances, is the right strategy.

> > From a scheduler point of view, imagine that a SKX system with 4 die
> > in 4 packages was mechanically re-designed so that those 4 die resided
> > in 2 double-sized packages.
> >
> > They may have tweaked the links between the die, but logically it is
> > identical and compatible, and the legacy kernel will function
> > properly.
>
> This example has LLC in die and yes that works.
>
> But I can imagine things like L2 in tile and L3 across tiles but within
> DIE and then it _might_ make sense to still consider the tile for
> scheduling.
>
> Another option is having the LLC off die; also not unheard of.
>
> And then there's many creative and slightly crazy ways this can all be
> combined :/

If any of those crazy things happen,  CPUID.B/CPUID.1F are not
going to help software understand it -- CPUID.4 and the NUMA tables
are the tool of choice.

> > So the effect of Leaf B,1F is that it defines the scope of MSRs.  eg.
> > what processors does a die-scope MSR cover.  That is why the rest of
> > the patch is about sysfs topology, and about package MSR scope.
> >
> > Yes, there will be more exotic MSR situations in future products --
> > the first ones are pretty simple -- something  called a
> > package-scope-MSR  in the SDM today becomes a die-scope-MSR in this
> > generation on a multi-die/package system.
>
> Yes :-(
>
> > It also reflects how many packages appear in sysfs, and this can
> > effect licensing of some kinds of software.
>
> That's just plain insanity and we should not let that affect our sysfs
> interfaces.

This change isn't made for compatibility with per-package licensing.
Indeed, vendors, who license  based on package-count need to
be made aware that on a system with multi-die/package, they'll
see their package count go _down_ as a result of this change.
Thankfully, I'm told that per-package licensing is quite rare --
most stuff that cares has moved to per-CPU.

I think a good semantic side effect of this series is that it
maintains the invariant that a physical package and a socket are synonymous.
While we don't use the word "socket" in Linux anymore, the industry
broadly assume that the two are synonyms.  And people expect that a
physical package really is a physical package -- you can see it,
buy it in a box, and hold it in your hand.

Functionally, the bottom line is that it allows software to discover topology
levels that previously needed to be discovered by looking up family/model,
in the past, which was sort of annoying.  The things that care are
things that care about MSR scope.   Thankfully, the list of things that care
about MSR scope is quite finite.

thanks,
-Len




--
Len Brown, Intel Open Source Technology Center


Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-26 Thread Peter Zijlstra
On Wed, Feb 20, 2019 at 10:08:48AM -0500, Len Brown wrote:
> Thanks for the comments, Peter. I'll update the patch to address the
> syntax points.  (Maybe checkpatch.pl should be updated to reflect your
> preferences?).

Don't know about checkpatch; I ignore plenty of its output. I think tglx
started a document somewhere for what tip prefers, but I'm not sure
where that went.

> About macros vs C.  I agree with your preference.
> I used macros to be consistent with the existing code, and to be as
> backport friendly as possible.
> (a number of distros need to pull these patches into their supported kernels)
> Sure, I'm willing to write in a cosmetic-only patch, after the
> functional changes are upstream.

Fair enough.

> > It would've been nice to have the CPUID instruction 1F leaf reference
> > 3B-3.9 in the SDM, and maybe mention this here too.
> 
> I didn't mention SDM sections because they change -- leaving stale
> pointers in our commit messages.  The SDM is re-published 4 times per
> year.

Yah, I know. Which is why I keep all SDMs. So if you say, book 3 section
8 of Jul'17, I can find it :-)

> > You haven't explained, and I can't readily find it in the SDM either,
> > how these topology bits relate to caches and interconnects.
> >
> > Will these die thingies share LLC, or will LLC be per die. Will NUMA
> > span dies or not.
> 
> Excellent question.
> Cache enumeration in Leaf-4 is totally unchanged.
> ACPI NUMA tables are totally unchanged.

Sure; and yet Sub-NUMA-Clustering broke stuff in interesting ways. I'm
trying to get a feel for how these levels will interact with all that.

Before that SNC stuff, caches had never spanned NODEs (and I still
think that is 'creative' at best).

> From a scheduler point of view, imagine that a SKX system with 4 die
> in 4 packages was mechanically re-designed so that those 4 die resided
> in 2 double-sized packages.
> 
> They may have tweaked the links between the die, but logically it is
> identical and compatible, and the legacy kernel will function
> properly.

This example has LLC in die and yes that works.

But I can imagine things like L2 in tile and L3 across tiles but within
DIE and then it _might_ make sense to still consider the tile for
scheduling.

Another option is having the LLC off die; also not unheard of.

And then there's many creative and slightly crazy ways this can all be
combined :/

> So the effect of Leaf B,1F is that it defines the scope of MSRs.  eg.
> what processors does a die-scope MSR cover.  That is why the rest of
> the patch is about sysfs topology, and about package MSR scope.
> 
> Yes, there will be more exotic MSR situations in future products --
> the first ones are pretty simple -- something  called a
> package-scope-MSR  in the SDM today becomes a die-scope-MSR in this
> generation on a multi-die/package system.

Yes :-(

> It also reflects how many packages appear in sysfs, and this can
> effect licensing of some kinds of software.

That's just plain insanity and we should not let that affect our sysfs
interfaces.


RE: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-25 Thread Brown, Len
Hi Brice,

Yes, you re-discovered the bug that Kan Liang pointed out last week.

I have updated this patch set, and the latest for testing is in my git tree on 
kernel.org or

https://github.com/lenb/linux.git x86

Note that I took your advice and left the core_siblings with its original 
definition,
And created package_threads as a synonym.  I will e-mail out the patch set again
When I do 2 more things:

1. add core_threads map to sysfs
2. replace unique_die_id with logical_die_id -- turns out it is useful for same 
reason as logical_package_id.

Thanks,
-Len




-Original Message-
From: Brice Goglin [mailto:brice.gog...@inria.fr] 
Sent: Sunday, February 24, 2019 5:04 AM
To: Len Brown ; x...@kernel.org
Cc: linux-kernel@vger.kernel.org; Brown, Len ; 
linux-...@vger.kernel.org; Like Xu 
Subject: Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

Le 19/02/2019 à 04:40, Len Brown a écrit :
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c 
> index ccd1f2a8e557..4250a87f57db 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -393,6 +393,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct 
> cpuinfo_x86 *o)
>   int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
>  
>   if (c->phys_proc_id == o->phys_proc_id &&
> + c->cpu_die_id == o->cpu_die_id &&
>   per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
>   if (c->cpu_core_id == o->cpu_core_id)
>   return topology_sane(c, o, "smt"); @@ -404,6 
> +405,7 @@ static 
> bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
>   }
>  
>   } else if (c->phys_proc_id == o->phys_proc_id &&
> +c->cpu_die_id == o->cpu_die_id &&
>  c->cpu_core_id == o->cpu_core_id) {
>   return topology_sane(c, o, "smt");
>   }
> @@ -461,7 +463,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct 
> cpuinfo_x86 *o)
>   */
>  static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)  
> {
> - if (c->phys_proc_id == o->phys_proc_id)
> + if (c->cpu_die_id == o->cpu_die_id)
>   return true;
>   return false;
>  }

Hello Len,

I am testing your patches in a VM with CPUID.1f QEMU patches (author Like Xu is 
CC'ed), booted to get 2 packages with 1 NUMA node each and 2 dies each:

-smp cpus=16,threads=2,cores=2,dies=2,sockets=2 -numa node,cpus=0-7,nodeid=0 
-numa node,cpus=8-15,nodeid=1

sysfs files expose wrong information:
core_siblings contains threads of the local die AND of die with same number in 
other processors, eg cpu 0-3 and 8-11 instead of 0-3 only.

The issue is that you seem to assume that die ids will always be unique across 
multiple packages.
Is this a valid assumption? If so, QEMU patches should be fixed.
If not, I fixed the issue by changing match_die() to check both phys_proc_id 
and cpu_die_id:

static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o) {
if (c->phys_proc_id == o->phys_proc_id && c->cpu_die_id == 
o->cpu_die_id)
return true;
return false;
}

Thanks
Brice




Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-24 Thread Like Xu

On 2019/2/24 18:04, Brice Goglin wrote:

Le 19/02/2019 à 04:40, Len Brown a écrit :

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ccd1f2a8e557..4250a87f57db 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -393,6 +393,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct 
cpuinfo_x86 *o)
int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
  
  		if (c->phys_proc_id == o->phys_proc_id &&

+   c->cpu_die_id == o->cpu_die_id &&
per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
if (c->cpu_core_id == o->cpu_core_id)
return topology_sane(c, o, "smt");
@@ -404,6 +405,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct 
cpuinfo_x86 *o)
}
  
  	} else if (c->phys_proc_id == o->phys_proc_id &&

+  c->cpu_die_id == o->cpu_die_id &&
   c->cpu_core_id == o->cpu_core_id) {
return topology_sane(c, o, "smt");
}
@@ -461,7 +463,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct 
cpuinfo_x86 *o)
   */
  static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
  {
-   if (c->phys_proc_id == o->phys_proc_id)
+   if (c->cpu_die_id == o->cpu_die_id)
return true;
return false;
  }


Hello Len,

I am testing your patches in a VM with CPUID.1f QEMU patches (author Like Xu is 
CC'ed),
booted to get 2 packages with 1 NUMA node each and 2 dies each:

-smp cpus=16,threads=2,cores=2,dies=2,sockets=2 -numa node,cpus=0-7,nodeid=0 
-numa node,cpus=8-15,nodeid=1

sysfs files expose wrong information:
core_siblings contains threads of the local die AND of die with same number in 
other processors,
eg cpu 0-3 and 8-11 instead of 0-3 only.

The issue is that you seem to assume that die ids will always be unique across 
multiple packages.


I don't think the intel MCP topolohgy has this global uniqueness 
assumption for die_id according to CPUID.1F SDM.



Is this a valid assumption? If so, QEMU patches should be fixed.
If not, I fixed the issue by changing match_die() to check both phys_proc_id 
and cpu_die_id:

static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
{
if (c->phys_proc_id == o->phys_proc_id && c->cpu_die_id == 
o->cpu_die_id)
return true;
return false;
}

Thanks
Brice







Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-24 Thread Brice Goglin
Le 19/02/2019 à 04:40, Len Brown a écrit :
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index ccd1f2a8e557..4250a87f57db 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -393,6 +393,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct 
> cpuinfo_x86 *o)
>   int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
>  
>   if (c->phys_proc_id == o->phys_proc_id &&
> + c->cpu_die_id == o->cpu_die_id &&
>   per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
>   if (c->cpu_core_id == o->cpu_core_id)
>   return topology_sane(c, o, "smt");
> @@ -404,6 +405,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct 
> cpuinfo_x86 *o)
>   }
>  
>   } else if (c->phys_proc_id == o->phys_proc_id &&
> +c->cpu_die_id == o->cpu_die_id &&
>  c->cpu_core_id == o->cpu_core_id) {
>   return topology_sane(c, o, "smt");
>   }
> @@ -461,7 +463,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct 
> cpuinfo_x86 *o)
>   */
>  static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
>  {
> - if (c->phys_proc_id == o->phys_proc_id)
> + if (c->cpu_die_id == o->cpu_die_id)
>   return true;
>   return false;
>  }

Hello Len,

I am testing your patches in a VM with CPUID.1f QEMU patches (author Like Xu is 
CC'ed),
booted to get 2 packages with 1 NUMA node each and 2 dies each:

-smp cpus=16,threads=2,cores=2,dies=2,sockets=2 -numa node,cpus=0-7,nodeid=0 
-numa node,cpus=8-15,nodeid=1

sysfs files expose wrong information:
core_siblings contains threads of the local die AND of die with same number in 
other processors,
eg cpu 0-3 and 8-11 instead of 0-3 only.

The issue is that you seem to assume that die ids will always be unique across 
multiple packages.
Is this a valid assumption? If so, QEMU patches should be fixed.
If not, I fixed the issue by changing match_die() to check both phys_proc_id 
and cpu_die_id:

static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
{
if (c->phys_proc_id == o->phys_proc_id && c->cpu_die_id == 
o->cpu_die_id)
return true;
return false;
}

Thanks
Brice




Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-20 Thread Len Brown
Thanks for the comments, Peter. I'll update the patch to address the
syntax points.  (Maybe checkpatch.pl should be updated to reflect your
preferences?).

About macros vs C.  I agree with your preference.
I used macros to be consistent with the existing code, and to be as
backport friendly as possible.
(a number of distros need to pull these patches into their supported kernels)
Sure, I'm willing to write in a cosmetic-only patch, after the
functional changes are upstream.

> It would've been nice to have the CPUID instruction 1F leaf reference
> 3B-3.9 in the SDM, and maybe mention this here too.

I didn't mention SDM sections because they change -- leaving stale
pointers in our commit messages.  The SDM is re-published 4 times per
year.

> Also, figure 8-6 uses Q,R ID, without prior mention.

Yeah, the tech-writer took my real example and turned it into a
generic example.  Probably a good idea, even if not gracefully
executed.  The point of undefined "Q" and "R" are that a new level can
be invented at any time in the future, and the existing code that
doesn't know about future levels should still function properly.

The back-story is that Leaf-B was supposed to work this way, but the
original SDM code example was hard-coded to assume 3-levels.  Plenty
of software was hard-coded and would have broken if we had actually
added new levels to Leaf-B.  And so Leaf-B had to be "forked" into
Leaf-1F, with the implicit contract that only new code that can
function properly in the face of unknown levels parses leaf-1F.

Yes, the parsing routine in the Linux Kernel will work fine in the
face of future unknown levels.  Some utilities (such as cpuid), would
have actually crashed if we added levels to Leaf-B.

> You haven't explained, and I can't readily find it in the SDM either,
> how these topology bits relate to caches and interconnects.
>
> Will these die thingies share LLC, or will LLC be per die. Will NUMA
> span dies or not.

Excellent question.
Cache enumeration in Leaf-4 is totally unchanged.
ACPI NUMA tables are totally unchanged.

>From a scheduler point of view, imagine that a SKX system with 4 die
in 4 packages
was mechanically re-designed so that those 4 die resided in 2
double-sized packages.

They may have tweaked the links between the die, but logically it is
identical and compatible,
and the legacy kernel will function properly.

So the effect of Leaf B,1F is that it defines the scope of MSRs.
eg. what processors does a die-scope MSR cover.
That is why the rest of the patch is about sysfs topology, and about
package MSR scope.

Yes, there will be more exotic MSR situations in future products --
the first ones are pretty simple --
something  called a package-scope-MSR  in the SDM today becomes a
die-scope-MSR in this generation
on a multi-die/package system.

It also reflects how many packages appear in sysfs, and this can
effect licensing of some kinds of software.

thanks,
Len Brown, Intel Open Source Technology Center


Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-20 Thread Peter Zijlstra
On Mon, Feb 18, 2019 at 10:40:05PM -0500, Len Brown wrote:
> From: Len Brown 
> 
> Some new systems have multiple software-visible die within each package.
> The new CPUID.1F leaf can enumerate this multi-die/package topology.
> 
> CPUID.1F a super-set of the CPUID.B "Extended Toplogy Leaf",
> and a common updated routine can parse either leaf.
> 
> Legacy systems without CPUID.1F, and systems without multi-die/package
> hardware, will see no functional change from this patch series.
> 
> Multi-die/package systems will use CPUID.B before this patch,
> and CPUID.1F after this patch.  In the CPUID.B case, all die appear
> (incorrectly) to software as individual packages.  In the CPUID.1F case,
> the package id's reflect reality, and die_id's become meaningful.
> 
> Subsequent patches in this series update the kernel to be multi-die aware.
> In particular, some software needs to know the difference between
> a die-scope MSR and a package-scope MSR.

It would've been nice to have the CPUID instruction 1F leaf reference
3B-3.9 in the SDM, and maybe mention this here too.

Also, figure 8-6 uses Q,R ID, without prior mention.

> +/*
> + * Check if given CPUID extended toplogy "leaf" is implemented
> + */
> +static int check_extended_topology_leaf(int leaf)
> +{
>   unsigned int eax, ebx, ecx, edx;
>  
> + cpuid_count(leaf, SMT_LEVEL, , , , );
> +
> + if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
>   return -1;
>  
> + return 0;
> +}
> +/*
> + * Return best CPUID Extended Toplogy Leaf supported
> + */
> +static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
> +{
> + if (c->cpuid_level >= 0x1f)
> + if (check_extended_topology_leaf(0x1f) == 0)
> + return 0x1f;

Coding-style requires { } on the outer if-stmt.

>  
> + if (c->cpuid_level >= 0xb)
> + if (check_extended_topology_leaf(0xb) == 0)
> + return 0xb;

idem.

> + return -1;
> +}
> +#endif
> +
> +int detect_extended_topology_early(struct cpuinfo_x86 *c)
> +{
> +#ifdef CONFIG_SMP
> + unsigned int eax, ebx, ecx, edx;
> + int leaf;
> +
> + leaf = detect_extended_topology_leaf(c);
> + if (leaf < 0)
>   return -1;
>  
>   set_cpu_cap(c, X86_FEATURE_XTOPOLOGY);
>  
> + cpuid_count(leaf, SMT_LEVEL, , , , );
>   /*
>* initial apic id, which also represents 32-bit extended x2apic id.
>*/
> @@ -52,7 +80,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
>  }
>  
>  /*
> + * Check for extended topology enumeration cpuid leaf, and if it
>   * exists, use it for populating initial_apicid and cpu topology
>   * detection.
>   */
> @@ -60,46 +88,62 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
>  {
>  #ifdef CONFIG_SMP
>   unsigned int eax, ebx, ecx, edx, sub_index;
> + unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
>   unsigned int core_select_mask, core_level_siblings;
> + unsigned int die_select_mask, die_level_siblings;
> + int leaf;
>  
> + leaf = detect_extended_topology_leaf(c);
> + if (leaf  < 0)

s/  / /

>   return -1;
>  
>   /*
>* Populate HT related information from sub-leaf level 0.
>*/
> + cpuid_count(leaf, SMT_LEVEL, , , , );
> + c->initial_apicid = edx;
>   core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
>   core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
> + die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
> + die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
>  
>   sub_index = 1;
>   do {
> + cpuid_count(leaf, sub_index, , , , );
>  
>   /*
>* Check for the Core type in the implemented sub leaves.
>*/
>   if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
>   core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
> + die_level_siblings = core_level_siblings;
>   core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
> - break;
> + }
> + if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
> + die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
> + die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
>   }
>  
>   sub_index++;
>   } while (LEAFB_SUBTYPE(ecx) != INVALID_TYPE);

Personally I much prefer union based decoding of cpuid leafs over this
macro magic (git grep cpuid10_):

union cpuid1f_eax {
struct {
unsigned int x2apic_shift : 5;
} split;
unsigned int full;
};

union cpuid1f_ebx {
struct {
unsigned int nr_cpus : 16;
} split;
unsigned int full;
};

enum level_type_enum {
invalid_type = 0,
smt_type,
core_type,
module_type,
tile_type,
die_type,
};

union cpuid1f_ecx {
struct {
   

Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-19 Thread Len Brown
On Tue, Feb 19, 2019 at 9:59 PM Like Xu  wrote:

> > -/* leaf 0xb sub-leaf types */
> > +/* extended topology sub-leaf types */
> >   #define INVALID_TYPE0
> >   #define SMT_TYPE1
> >   #define CORE_TYPE   2
> > +#define DIE_TYPE 5
>
> This patch set is going to export die topology via sysfs
> while the extended topology sub-leaf type could be one of followings:
> SMT/Core/Module/Tile/Die according to CPUID.1F from SDM.
>
> Why not choose to export module/tile topology as well?

Excellent question.  (and thank you for reading the SDM:-)

While it is true that there are shipping products with
software-visible CPU modules and tiles,
they shipped before this mechanism was available.  As a result, there
are currently zero
products that use this mechanism to enumerate modules and tiles.  If
future products
have software-visible modules and tiles, and they choose to use this mechanism,
we can add support for them.  But I do not advocate adding code to the kernel
"just in case".

In contrast, the need to support multi-die/package products as soon as
possible is quite clear.

thanks,
Len Brown, Intel Open Source Technology Center


Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-19 Thread Like Xu

On 2019/2/19 11:40, Len Brown wrote:

From: Len Brown 

Some new systems have multiple software-visible die within each package.
The new CPUID.1F leaf can enumerate this multi-die/package topology.

CPUID.1F a super-set of the CPUID.B "Extended Toplogy Leaf",
and a common updated routine can parse either leaf.

Legacy systems without CPUID.1F, and systems without multi-die/package
hardware, will see no functional change from this patch series.

Multi-die/package systems will use CPUID.B before this patch,
and CPUID.1F after this patch.  In the CPUID.B case, all die appear
(incorrectly) to software as individual packages.  In the CPUID.1F case,
the package id's reflect reality, and die_id's become meaningful.

Subsequent patches in this series update the kernel to be multi-die aware.
In particular, some software needs to know the difference between
a die-scope MSR and a package-scope MSR.

Signed-off-by: Len Brown 
Cc: linux-...@vger.kernel.org
Signed-off-by: Len Brown 
---
  Documentation/x86/topology.txt   |  4 ++
  arch/x86/include/asm/processor.h |  4 +-
  arch/x86/kernel/cpu/topology.c   | 82 
  arch/x86/kernel/smpboot.c|  4 +-
  4 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/Documentation/x86/topology.txt b/Documentation/x86/topology.txt
index 06b3cdbc4048..8107b6cfc9ea 100644
--- a/Documentation/x86/topology.txt
+++ b/Documentation/x86/topology.txt
@@ -46,6 +46,10 @@ The topology of a system is described in the units of:
  
  The number of cores in a package. This information is retrieved via CPUID.
  
+  - cpuinfo_x86.x86_max_dies:

+
+The number of dies in a package. This information is retrieved via CPUID.
+
- cpuinfo_x86.phys_proc_id:
  
  The physical ID of the package. This information is retrieved via CPUID

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 33051436c864..f2856fe03715 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -105,7 +105,8 @@ struct cpuinfo_x86 {
int x86_power;
unsigned long   loops_per_jiffy;
/* cpuid returned max cores value: */
-   u16  x86_max_cores;
+   u16 x86_max_cores;
+   u16 x86_max_dies;
u16 apicid;
u16 initial_apicid;
u16 x86_clflush_size;
@@ -117,6 +118,7 @@ struct cpuinfo_x86 {
u16 logical_proc_id;
/* Core id: */
u16 cpu_core_id;
+   u16 cpu_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 8f6c784141d1..6dce6ee77849 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -15,33 +15,61 @@
  /* leaf 0xb SMT level */
  #define SMT_LEVEL 0
  
-/* leaf 0xb sub-leaf types */

+/* extended topology sub-leaf types */
  #define INVALID_TYPE  0
  #define SMT_TYPE  1
  #define CORE_TYPE 2
+#define DIE_TYPE   5


This patch set is going to export die topology via sysfs
while the extended topology sub-leaf type could be one of followings:
SMT/Core/Module/Tile/Die according to CPUID.1F from SDM.

Why not choose to export module/tile topology as well?

  
  #define LEAFB_SUBTYPE(ecx)		(((ecx) >> 8) & 0xff)

  #define BITS_SHIFT_NEXT_LEVEL(eax)((eax) & 0x1f)
  #define LEVEL_MAX_SIBLINGS(ebx)   ((ebx) & 0x)
  
-int detect_extended_topology_early(struct cpuinfo_x86 *c)

-{
  #ifdef CONFIG_SMP
+/*
+ * Check if given CPUID extended toplogy "leaf" is implemented
+ */
+static int check_extended_topology_leaf(int leaf)
+{
unsigned int eax, ebx, ecx, edx;
  
-	if (c->cpuid_level < 0xb)

+   cpuid_count(leaf, SMT_LEVEL, , , , );
+
+   if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
return -1;
  
-	cpuid_count(0xb, SMT_LEVEL, , , , );

+   return 0;
+}
+/*
+ * Return best CPUID Extended Toplogy Leaf supported
+ */
+static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
+{
+   if (c->cpuid_level >= 0x1f)
+   if (check_extended_topology_leaf(0x1f) == 0)
+   return 0x1f;
  
-	/*

-* check if the cpuid leaf 0xb is actually implemented.
-*/
-   if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
+   if (c->cpuid_level >= 0xb)
+   if (check_extended_topology_leaf(0xb) == 0)
+   return 0xb;
+
+   return -1;
+}
+#endif
+
+int detect_extended_topology_early(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_SMP
+   unsigned int eax, ebx, ecx, edx;
+   int leaf;
+
+   leaf = detect_extended_topology_leaf(c);
+   if (leaf < 0)
return -1;
  
  	set_cpu_cap(c, 

RE: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-19 Thread Brown, Len
>> @@ -461,7 +463,7 @@ static bool match_llc(struct cpuinfo_x86 *c, struct 
>> cpuinfo_x86 *o)
>>*/
>>   static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
>>   {
>> -if (c->>phys_proc_id == o->>phys_proc_id)
>> +if (c->>cpu_die_id == o->>cpu_die_id)
>>  return true;
>>  return false;
>>   }

> Shouldn't we check the unique_die_id here?
> Die from different package can have the same cpu_die_id.

Good catch.
Updated this hunk as below.

Thanks!
-Len

@@ -461,7 +463,8 @@ static bool match_llc(struct cpuinfo_x86 *c, struct 
cpuinfo_x86 *o)
  */
 static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
-   if (c->phys_proc_id == o->phys_proc_id)
+   if ((c->phys_proc_id == o->phys_proc_id) &&
+   (c->cpu_die_id == o->cpu_die_id))
return true;
return false;
 }



Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-19 Thread Liang, Kan




On 2/18/2019 10:40 PM, Len Brown wrote:

From: Len Brown 

Some new systems have multiple software-visible die within each package.
The new CPUID.1F leaf can enumerate this multi-die/package topology.

CPUID.1F a super-set of the CPUID.B "Extended Toplogy Leaf",
and a common updated routine can parse either leaf.

Legacy systems without CPUID.1F, and systems without multi-die/package
hardware, will see no functional change from this patch series.

Multi-die/package systems will use CPUID.B before this patch,
and CPUID.1F after this patch.  In the CPUID.B case, all die appear
(incorrectly) to software as individual packages.  In the CPUID.1F case,
the package id's reflect reality, and die_id's become meaningful.

Subsequent patches in this series update the kernel to be multi-die aware.
In particular, some software needs to know the difference between
a die-scope MSR and a package-scope MSR.

Signed-off-by: Len Brown 
Cc: linux-...@vger.kernel.org
Signed-off-by: Len Brown 
---
  Documentation/x86/topology.txt   |  4 ++
  arch/x86/include/asm/processor.h |  4 +-
  arch/x86/kernel/cpu/topology.c   | 82 
  arch/x86/kernel/smpboot.c|  4 +-
  4 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/Documentation/x86/topology.txt b/Documentation/x86/topology.txt
index 06b3cdbc4048..8107b6cfc9ea 100644
--- a/Documentation/x86/topology.txt
+++ b/Documentation/x86/topology.txt
@@ -46,6 +46,10 @@ The topology of a system is described in the units of:
  
  The number of cores in a package. This information is retrieved via CPUID.
  
+  - cpuinfo_x86.x86_max_dies:

+
+The number of dies in a package. This information is retrieved via CPUID.
+
- cpuinfo_x86.phys_proc_id:
  
  The physical ID of the package. This information is retrieved via CPUID

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 33051436c864..f2856fe03715 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -105,7 +105,8 @@ struct cpuinfo_x86 {
int x86_power;
unsigned long   loops_per_jiffy;
/* cpuid returned max cores value: */
-   u16  x86_max_cores;
+   u16 x86_max_cores;
+   u16 x86_max_dies;
u16 apicid;
u16 initial_apicid;
u16 x86_clflush_size;
@@ -117,6 +118,7 @@ struct cpuinfo_x86 {
u16 logical_proc_id;
/* Core id: */
u16 cpu_core_id;
+   u16 cpu_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 8f6c784141d1..6dce6ee77849 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -15,33 +15,61 @@
  /* leaf 0xb SMT level */
  #define SMT_LEVEL 0
  
-/* leaf 0xb sub-leaf types */

+/* extended topology sub-leaf types */
  #define INVALID_TYPE  0
  #define SMT_TYPE  1
  #define CORE_TYPE 2
+#define DIE_TYPE   5
  
  #define LEAFB_SUBTYPE(ecx)		(((ecx) >> 8) & 0xff)

  #define BITS_SHIFT_NEXT_LEVEL(eax)((eax) & 0x1f)
  #define LEVEL_MAX_SIBLINGS(ebx)   ((ebx) & 0x)
  
-int detect_extended_topology_early(struct cpuinfo_x86 *c)

-{
  #ifdef CONFIG_SMP
+/*
+ * Check if given CPUID extended toplogy "leaf" is implemented
+ */
+static int check_extended_topology_leaf(int leaf)
+{
unsigned int eax, ebx, ecx, edx;
  
-	if (c->cpuid_level < 0xb)

+   cpuid_count(leaf, SMT_LEVEL, , , , );
+
+   if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
return -1;
  
-	cpuid_count(0xb, SMT_LEVEL, , , , );

+   return 0;
+}
+/*
+ * Return best CPUID Extended Toplogy Leaf supported
+ */
+static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
+{
+   if (c->cpuid_level >= 0x1f)
+   if (check_extended_topology_leaf(0x1f) == 0)
+   return 0x1f;
  
-	/*

-* check if the cpuid leaf 0xb is actually implemented.
-*/
-   if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
+   if (c->cpuid_level >= 0xb)
+   if (check_extended_topology_leaf(0xb) == 0)
+   return 0xb;
+
+   return -1;
+}
+#endif
+
+int detect_extended_topology_early(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_SMP
+   unsigned int eax, ebx, ecx, edx;
+   int leaf;
+
+   leaf = detect_extended_topology_leaf(c);
+   if (leaf < 0)
return -1;
  
  	set_cpu_cap(c, X86_FEATURE_XTOPOLOGY);
  
+	cpuid_count(leaf, SMT_LEVEL, , , , );

/*
 * initial apic id, which also represents 32-bit extended x2apic id.
 */
@@ -52,7 +80,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
  

[PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

2019-02-18 Thread Len Brown
From: Len Brown 

Some new systems have multiple software-visible die within each package.
The new CPUID.1F leaf can enumerate this multi-die/package topology.

CPUID.1F a super-set of the CPUID.B "Extended Toplogy Leaf",
and a common updated routine can parse either leaf.

Legacy systems without CPUID.1F, and systems without multi-die/package
hardware, will see no functional change from this patch series.

Multi-die/package systems will use CPUID.B before this patch,
and CPUID.1F after this patch.  In the CPUID.B case, all die appear
(incorrectly) to software as individual packages.  In the CPUID.1F case,
the package id's reflect reality, and die_id's become meaningful.

Subsequent patches in this series update the kernel to be multi-die aware.
In particular, some software needs to know the difference between
a die-scope MSR and a package-scope MSR.

Signed-off-by: Len Brown 
Cc: linux-...@vger.kernel.org
Signed-off-by: Len Brown 
---
 Documentation/x86/topology.txt   |  4 ++
 arch/x86/include/asm/processor.h |  4 +-
 arch/x86/kernel/cpu/topology.c   | 82 
 arch/x86/kernel/smpboot.c|  4 +-
 4 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/Documentation/x86/topology.txt b/Documentation/x86/topology.txt
index 06b3cdbc4048..8107b6cfc9ea 100644
--- a/Documentation/x86/topology.txt
+++ b/Documentation/x86/topology.txt
@@ -46,6 +46,10 @@ The topology of a system is described in the units of:
 
 The number of cores in a package. This information is retrieved via CPUID.
 
+  - cpuinfo_x86.x86_max_dies:
+
+The number of dies in a package. This information is retrieved via CPUID.
+
   - cpuinfo_x86.phys_proc_id:
 
 The physical ID of the package. This information is retrieved via CPUID
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 33051436c864..f2856fe03715 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -105,7 +105,8 @@ struct cpuinfo_x86 {
int x86_power;
unsigned long   loops_per_jiffy;
/* cpuid returned max cores value: */
-   u16  x86_max_cores;
+   u16 x86_max_cores;
+   u16 x86_max_dies;
u16 apicid;
u16 initial_apicid;
u16 x86_clflush_size;
@@ -117,6 +118,7 @@ struct cpuinfo_x86 {
u16 logical_proc_id;
/* Core id: */
u16 cpu_core_id;
+   u16 cpu_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 8f6c784141d1..6dce6ee77849 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -15,33 +15,61 @@
 /* leaf 0xb SMT level */
 #define SMT_LEVEL  0
 
-/* leaf 0xb sub-leaf types */
+/* extended topology sub-leaf types */
 #define INVALID_TYPE   0
 #define SMT_TYPE   1
 #define CORE_TYPE  2
+#define DIE_TYPE   5
 
 #define LEAFB_SUBTYPE(ecx) (((ecx) >> 8) & 0xff)
 #define BITS_SHIFT_NEXT_LEVEL(eax) ((eax) & 0x1f)
 #define LEVEL_MAX_SIBLINGS(ebx)((ebx) & 0x)
 
-int detect_extended_topology_early(struct cpuinfo_x86 *c)
-{
 #ifdef CONFIG_SMP
+/*
+ * Check if given CPUID extended toplogy "leaf" is implemented
+ */
+static int check_extended_topology_leaf(int leaf)
+{
unsigned int eax, ebx, ecx, edx;
 
-   if (c->cpuid_level < 0xb)
+   cpuid_count(leaf, SMT_LEVEL, , , , );
+
+   if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
return -1;
 
-   cpuid_count(0xb, SMT_LEVEL, , , , );
+   return 0;
+}
+/*
+ * Return best CPUID Extended Toplogy Leaf supported
+ */
+static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
+{
+   if (c->cpuid_level >= 0x1f)
+   if (check_extended_topology_leaf(0x1f) == 0)
+   return 0x1f;
 
-   /*
-* check if the cpuid leaf 0xb is actually implemented.
-*/
-   if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
+   if (c->cpuid_level >= 0xb)
+   if (check_extended_topology_leaf(0xb) == 0)
+   return 0xb;
+
+   return -1;
+}
+#endif
+
+int detect_extended_topology_early(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_SMP
+   unsigned int eax, ebx, ecx, edx;
+   int leaf;
+
+   leaf = detect_extended_topology_leaf(c);
+   if (leaf < 0)
return -1;
 
set_cpu_cap(c, X86_FEATURE_XTOPOLOGY);
 
+   cpuid_count(leaf, SMT_LEVEL, , , , );
/*
 * initial apic id, which also represents 32-bit extended x2apic id.
 */
@@ -52,7 +80,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
 }
 
 /*
- * Check for extended