Re: [PATCH v2 4/4] EDAC/mce_amd: Add support for FRU Text in MCA

2024-06-27 Thread Yazen Ghannam
On Wed, Jun 26, 2024 at 08:20:13PM +0200, Borislav Petkov wrote:
> On Wed, Jun 26, 2024 at 01:00:30PM -0500, Naik, Avadhut wrote:
> > > 
> > > Why are you clearing it if you're overwriting it immediately?
> > > 
> > Since it's a local variable, I wanted to ensure that the memory is zeroed
> > out to prevent any issues with the %s specifier used later on.
> 
> What issues?
> 
> > Would you recommend removing that and using an initializer instead for
> > the string?
> 
> I'd recommend looking at what the code does and then really thinking whether
> that makes any sense.
>

We need to make sure the string is NUL-terminated. So the memset()
could be replaced with this:

frutext[16] = '\0';

Or better yet, maybe we can use scnprintf() or similar.
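
To make the two options concrete, here is a minimal userspace sketch. It
assumes the FRU text arrives as 16 raw bytes split across two 64-bit words
and is stored in a 17-byte buffer; the helper names are hypothetical, and
snprintf() stands in for the kernel's scnprintf() (the two differ only in
their return value when the output is truncated).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FRU_TEXT_LEN 16	/* assumed payload size, matching frutext[16] above */

/*
 * Hypothetical helper: copy two 64-bit words of raw text into buf.
 * The copies overwrite bytes 0..15, so only byte 16 needs to be set
 * explicitly; a memset() of the whole buffer is redundant.
 */
static void fru_text_from_words(char buf[FRU_TEXT_LEN + 1],
				uint64_t lo, uint64_t hi)
{
	memcpy(&buf[0], &lo, 8);
	memcpy(&buf[8], &hi, 8);
	buf[FRU_TEXT_LEN] = '\0';
}

/*
 * Alternative: bound the read with a precision so the source does not
 * need a terminator at all. Returns what snprintf() returns.
 */
static int fru_text_format(char *out, size_t outlen,
			   const char raw[FRU_TEXT_LEN])
{
	return snprintf(out, outlen, "FRU Text: %.*s", FRU_TEXT_LEN, raw);
}
```

Either variant keeps %s from reading past the 16 payload bytes.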

Thanks,
Yazen



Re: [PATCH 0/4] MCE wrapper and support for new SMCA syndrome MSRs

2024-06-21 Thread Yazen Ghannam
On Fri, Jun 21, 2024 at 06:58:23PM +0200, Borislav Petkov wrote:
> On Thu, May 30, 2024 at 04:16:16PM -0500, Avadhut Naik wrote:
> >  arch/x86/include/asm/mce.h              |  20 ++-
> >  arch/x86/kernel/cpu/mce/apei.c          | 111 ++
> >  arch/x86/kernel/cpu/mce/core.c          | 191 ++--
> >  arch/x86/kernel/cpu/mce/dev-mcelog.c    |   2 +-
> >  arch/x86/kernel/cpu/mce/genpool.c       |  20 +--
> >  arch/x86/kernel/cpu/mce/inject.c        |   4 +-
> >  arch/x86/kernel/cpu/mce/internal.h      |   4 +-
> >  drivers/acpi/acpi_extlog.c              |   2 +-
> >  drivers/acpi/nfit/mce.c                 |   2 +-
> >  drivers/edac/i7core_edac.c              |   2 +-
> >  drivers/edac/igen6_edac.c               |   2 +-
> >  drivers/edac/mce_amd.c                  |  27 +++-
> >  drivers/edac/pnd2_edac.c                |   2 +-
> >  drivers/edac/sb_edac.c                  |   2 +-
> >  drivers/edac/skx_common.c               |   2 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c |   2 +-
> >  drivers/ras/amd/fmpm.c                  |   2 +-
> >  drivers/ras/cec.c                       |   2 +-
> >  include/trace/events/mce.h              |  51 ---
> >  19 files changed, 286 insertions(+), 164 deletions(-)
> 
> This doesn't apply anymore - please redo this on top of the latest tip/master.
>

Avadhut,

You can drop the dependencies on other sets. We can sort out any
conflicts as needed.

Thanks,
Yazen



Re: [PATCH] EDAC/AMD64: Update scrub register addresses for newer models

2021-01-20 Thread Yazen Ghannam
On Mon, Jan 18, 2021 at 08:31:12PM +0100, Borislav Petkov wrote:
> On Sat, Jan 16, 2021 at 02:33:53PM +0000, Yazen Ghannam wrote:
> > +static struct {
> > +   u32 base, limit;
> > +} f17h_scrub_regs = {F17H_M30H_SCR_BASE_ADDR, F17H_M30H_SCR_LIMIT_ADDR};
> 
> Why not make this part of struct amd64_umc so that you can access them
> through pvt->umc?
>

We have a struct amd64_umc per channel, so putting these fixed values
there seemed redundant. Would you mind if we put this in struct
amd64_family_type? We can then set the values per family/model group
like we do with the max_mcs.
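
Roughly what I have in mind, as a standalone sketch. The struct below is a
hypothetical, trimmed-down stand-in for struct amd64_family_type (the real
one has more fields), and the selection helper assumes it is only reached
on UMC-based systems. The offsets and the family/model condition are the
ones from the patch.

```c
#include <stdint.h>

/* Offsets from the patch under discussion. */
#define F17H_SCR_BASE_ADDR		0x48
#define F17H_SCR_LIMIT_ADDR		0x4C
#define F17H_M30H_SCR_BASE_ADDR		0x40
#define F17H_M30H_SCR_LIMIT_ADDR	0x44

/* Hypothetical stand-in for struct amd64_family_type. */
struct family_type_sketch {
	const char *ctl_name;
	uint32_t scr_base;	/* scrub base address register offset */
	uint32_t scr_limit;	/* scrub limit address register offset */
};

static const struct family_type_sketch old_scrub_sketch = {
	"F17h", F17H_SCR_BASE_ADDR, F17H_SCR_LIMIT_ADDR
};

static const struct family_type_sketch new_scrub_sketch = {
	"F17h_M30h", F17H_M30H_SCR_BASE_ADDR, F17H_M30H_SCR_LIMIT_ADDR
};

/*
 * Family 17h models before 30h and Family 18h keep the old offsets;
 * everything else handled here uses the Model 30h offsets.
 */
static const struct family_type_sketch *
pick_family_sketch(unsigned int fam, unsigned int model)
{
	if ((fam == 0x17 && model < 0x30) || fam == 0x18)
		return &old_scrub_sketch;

	return &new_scrub_sketch;
}
```

With that, the scrub helpers would read the offsets from the selected
entry instead of a file-scope variable.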

Thanks,
Yazen


Re: [PATCH] EDAC/AMD64: Update scrub register addresses for newer models

2021-01-20 Thread Yazen Ghannam
On Mon, Jan 18, 2021 at 04:30:58AM +0300, WGH wrote:
> On 16/01/2021 17:33, Yazen Ghannam wrote:
> > From: Yazen Ghannam 
> >
> > The Family 17h scrubber registers moved to different offsets starting
> > with Model 30h. The new register offsets are used for all currently
> > available models since then.
> >
> > Use the new register addresses as the defaults.
> >
> > Set the proper scrub register addresses during module init for older
> > models.
> 
> So I tested the patch on my machine (AMD Ryzen 9 3900XT on ASRock B550 
> Extreme4 motherboard, Linux 5.10.7).
> 
> The /sys/devices/system/edac/mc/mc0/sdram_scrub_rate value seems to be stuck 
> at 12284069 right after the boot, and does not change.
> Writes to the file do not report any errors.
> 
> dmesg:
> 
> [    0.549451] EDAC MC: Ver: 3.0.0
> [    0.817576] EDAC amd64: F17h_M70h detected (node 0).
> [    0.818159] EDAC amd64: Node 0: DRAM ECC enabled.
> [    0.818717] EDAC amd64: MCT channel count: 2
> [    0.819324] EDAC MC0: Giving out device to module amd64_edac controller F17h_M70h: DEV :00:18.3 (INTERRUPT)
> [    0.819909] EDAC MC: UMC0 chip selects:
> [    0.819910] EDAC amd64: MC: 0: 16384MB 1: 16384MB
> [    0.820488] EDAC amd64: MC: 2: 16384MB 3: 16384MB
> [    0.821067] EDAC MC: UMC1 chip selects:
> [    0.821067] EDAC amd64: MC: 0: 16384MB 1: 16384MB
> [    0.821630] EDAC amd64: MC: 2: 16384MB 3: 16384MB
> [    0.822187] EDAC amd64: using x16 syndromes.
> [    0.822739] EDAC PCI0: Giving out device to module amd64_edac controller EDAC PCI controller: DEV :00:18.0 (POLLED)
> [    0.823314] AMD64 EDAC driver v3.5.0
> 
>

Thanks for testing. I'll try to find a similar system and check it out.

Thanks,
Yazen


[PATCH] EDAC/AMD64: Update scrub register addresses for newer models

2021-01-16 Thread Yazen Ghannam
From: Yazen Ghannam 

The Family 17h scrubber registers moved to different offsets starting
with Model 30h. The new register offsets are used for all currently
available models since then.

Use the new register addresses as the defaults.

Set the proper scrub register addresses during module init for older
models.

Reported-by: WGH 
Signed-off-by: Yazen Ghannam 
---
 drivers/edac/amd64_edac.c | 23 ++-
 drivers/edac/amd64_edac.h |  2 ++
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 9868f95a5622..b324b1589e5a 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -167,6 +167,10 @@ static inline int amd64_read_dct_pci_cfg(struct amd64_pvt *pvt, u8 dct,
  * other archs, we might not have access to the caches directly.
  */
 
+static struct {
+   u32 base, limit;
+} f17h_scrub_regs = {F17H_M30H_SCR_BASE_ADDR, F17H_M30H_SCR_LIMIT_ADDR};
+
 static inline void __f17h_set_scrubval(struct amd64_pvt *pvt, u32 scrubval)
 {
/*
@@ -176,10 +180,10 @@ static inline void __f17h_set_scrubval(struct amd64_pvt *pvt, u32 scrubval)
 */
if (scrubval >= 0x5 && scrubval <= 0x14) {
scrubval -= 0x5;
-   pci_write_bits32(pvt->F6, F17H_SCR_LIMIT_ADDR, scrubval, 0xF);
-   pci_write_bits32(pvt->F6, F17H_SCR_BASE_ADDR, 1, 0x1);
+   pci_write_bits32(pvt->F6, f17h_scrub_regs.limit, scrubval, 0xF);
+   pci_write_bits32(pvt->F6, f17h_scrub_regs.base, 1, 0x1);
} else {
-   pci_write_bits32(pvt->F6, F17H_SCR_BASE_ADDR, 0, 0x1);
+   pci_write_bits32(pvt->F6, f17h_scrub_regs.base, 0, 0x1);
}
 }
 /*
@@ -257,9 +261,9 @@ static int get_scrub_rate(struct mem_ctl_info *mci)
u32 scrubval = 0;
 
if (pvt->umc) {
-   amd64_read_pci_cfg(pvt->F6, F17H_SCR_BASE_ADDR, &scrubval);
+   amd64_read_pci_cfg(pvt->F6, f17h_scrub_regs.base, &scrubval);
if (scrubval & BIT(0)) {
-   amd64_read_pci_cfg(pvt->F6, F17H_SCR_LIMIT_ADDR, &scrubval);
+   amd64_read_pci_cfg(pvt->F6, f17h_scrub_regs.limit, &scrubval);
scrubval &= 0xF;
scrubval += 0x5;
} else {
@@ -3568,6 +3572,14 @@ f17h_determine_edac_ctl_cap(struct mem_ctl_info *mci, struct amd64_pvt *pvt)
}
 }
 
+static void f17h_set_scrub_regs(struct amd64_pvt *pvt)
+{
+   if ((pvt->fam == 0x17 && pvt->model < 0x30) || pvt->fam == 0x18) {
+   f17h_scrub_regs.base = F17H_SCR_BASE_ADDR;
+   f17h_scrub_regs.limit = F17H_SCR_LIMIT_ADDR;
+   }
+}
+
 static void setup_mci_misc_attrs(struct mem_ctl_info *mci)
 {
struct amd64_pvt *pvt = mci->pvt_info;
@@ -3577,6 +3589,7 @@ static void setup_mci_misc_attrs(struct mem_ctl_info *mci)
 
if (pvt->umc) {
f17h_determine_edac_ctl_cap(mci, pvt);
+   f17h_set_scrub_regs(pvt);
} else {
if (pvt->nbcap & NBCAP_SECDED)
mci->edac_ctl_cap |= EDAC_FLAG_SECDED;
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 85aa820bc165..4606f72f4258 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -213,6 +213,8 @@
 #define F15H_M60H_SCRCTRL  0x1C8
 #define F17H_SCR_BASE_ADDR 0x48
 #define F17H_SCR_LIMIT_ADDR        0x4C
+#define F17H_M30H_SCR_BASE_ADDR    0x40
+#define F17H_M30H_SCR_LIMIT_ADDR   0x44
 
 /*
  * Function 3 - Misc Control
-- 
2.25.1



[tip: x86/urgent] x86/cpu/amd: Set __max_die_per_package on AMD

2021-01-12 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 76e2fc63ca40977af893b724b00cc2f8e9ce47a4
Gitweb:        https://git.kernel.org/tip/76e2fc63ca40977af893b724b00cc2f8e9ce47a4
Author:        Yazen Ghannam
AuthorDate:    Mon, 11 Jan 2021 11:04:29 +01:00
Committer:     Borislav Petkov
CommitterDate: Tue, 12 Jan 2021 12:21:01 +01:00

x86/cpu/amd: Set __max_die_per_package on AMD

Set the maximum DIE per package variable on AMD using the
NodesPerProcessor topology value. This will be used by RAPL, among
others, to determine the maximum number of DIEs on the system in order
to do per-DIE manipulations.
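
As a standalone illustration of the decode in the diff further down, here
is a minimal sketch. The helper names are hypothetical; the shifts and
masks are the ones used in bsp_init_amd(). Both sources hold the value as
"count minus one" in a 3-bit field, hence the + 1.

```c
#include <stdint.h>

/* CPUID path: the field sits in ECX[10:8] of the topology leaf. */
static unsigned int dies_from_cpuid_ecx(uint32_t ecx)
{
	return ((ecx >> 8) & 7) + 1;
}

/* MSR path: the field sits in bits [5:3] of MSR_FAM10H_NODE_ID. */
static unsigned int dies_from_nodeid_msr(uint64_t value)
{
	return (unsigned int)((value >> 3) & 7) + 1;
}
```

A field value of 0 therefore means one die per package, and 7 means eight.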

 [ bp: Productize into a proper patch. ]

Fixes: 028c221ed190 ("x86/CPU/AMD: Save AMD NodeId as cpu_die_id")
Reported-by: Johnathan Smithinovic 
Reported-by: Rafael Kitover 
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Tested-by: Johnathan Smithinovic 
Tested-by: Rafael Kitover 
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
Link: https://lkml.kernel.org/r/20210106112106.ge5...@zn.tnic
Link: https://lkml.kernel.org/r/2021001455.1194-1...@alien8.de
---
 arch/x86/kernel/cpu/amd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f8ca66f..347a956 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -542,12 +542,12 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
u32 ecx;
 
ecx = cpuid_ecx(0x8000001e);
-   nodes_per_socket = ((ecx >> 8) & 7) + 1;
+   __max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
u64 value;
 
rdmsrl(MSR_FAM10H_NODE_ID, value);
-   nodes_per_socket = ((value >> 3) & 7) + 1;
+   __max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1;
}
 
if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&


[PATCH] EDAC/amd64: Tone down messages about missing PCI IDs

2020-12-15 Thread Yazen Ghannam
From: Yazen Ghannam 

Give these messages a debug severity as they are really only useful to
the module developers.

Also, drop the "(broken BIOS?)" phrase, since this can cause churn for
BIOS folks. The PCI IDs needed by the module, at least on modern systems,
are fixed in hardware.

Signed-off-by: Yazen Ghannam 
---
 drivers/edac/amd64_edac.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index f7087b90..a3770ffee2ea 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2665,7 +2665,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
if (pvt->umc) {
pvt->F0 = pci_get_related_function(pvt->F3->vendor, pci_id1, pvt->F3);
if (!pvt->F0) {
-   amd64_err("F0 not found, device 0x%x (broken BIOS?)\n", pci_id1);
+   edac_dbg(1, "F0 not found, device 0x%x\n", pci_id1);
return -ENODEV;
}
 
@@ -2674,7 +2674,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
pci_dev_put(pvt->F0);
pvt->F0 = NULL;
 
-   amd64_err("F6 not found: device 0x%x (broken BIOS?)\n", pci_id2);
+   edac_dbg(1, "F6 not found: device 0x%x\n", pci_id2);
return -ENODEV;
}
 
@@ -2691,7 +2691,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
/* Reserve the ADDRESS MAP Device */
pvt->F1 = pci_get_related_function(pvt->F3->vendor, pci_id1, pvt->F3);
if (!pvt->F1) {
-   amd64_err("F1 not found: device 0x%x (broken BIOS?)\n", pci_id1);
+   edac_dbg(1, "F1 not found: device 0x%x\n", pci_id1);
return -ENODEV;
}
 
@@ -2701,7 +2701,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
pci_dev_put(pvt->F1);
pvt->F1 = NULL;
 
-   amd64_err("F2 not found: device 0x%x (broken BIOS?)\n", pci_id2);
+   edac_dbg(1, "F2 not found: device 0x%x\n", pci_id2);
return -ENODEV;
}
 
-- 
2.25.1



Re: [PATCH 2/2] EDAC/amd64: Merge error injection sysfs facilities

2020-12-15 Thread Yazen Ghannam
On Tue, Dec 15, 2020 at 12:05:17PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov 
> 
> Merge them into the main driver and put them inside an EDAC_DEBUG
> ifdeffery to simplify the driver and have all debugging/injection stuff
> behind a debug build-time switch.
> 
> No functional changes.
> 
> Signed-off-by: Borislav Petkov 
> ---
>  drivers/edac/Kconfig          |   7 +-
>  drivers/edac/Makefile         |   6 +-
>  drivers/edac/amd64_edac.c     | 237 +-
>  drivers/edac/amd64_edac.h     |   8 --
>  drivers/edac/amd64_edac_inj.c | 235 -
>  5 files changed, 236 insertions(+), 257 deletions(-)
>  delete mode 100644 drivers/edac/amd64_edac_inj.c
> 
> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
> index 7a47680d6f07..9c2e719cb86a 100644
> --- a/drivers/edac/Kconfig
> +++ b/drivers/edac/Kconfig
> @@ -81,10 +81,9 @@ config EDAC_AMD64
> Support for error detection and correction of DRAM ECC errors on
> the AMD64 families (>= K8) of memory controllers.
>  
> -config EDAC_AMD64_ERROR_INJECTION
> - bool "Sysfs HW Error injection facilities"
> - depends on EDAC_AMD64
> - help
> +   When EDAC_DEBUG is enabled, hardware error injection facilities
> +   through sysfs are available:
> +
> Recent Opterons (Family 10h and later) provide for Memory Error

Can we say "Opterons (Family 10h to Family 15h)"? It may also apply to
Family 16h, but I don't know if they were branded as Opterons.

The injection code in this module doesn't apply to Family 17h and later.

Also, Family 17h and later systems don't allow the OS direct access to the
error injection registers. They're locked down by security policy, etc.

> Injection into the ECC detection circuits. The amd64_edac module
> allows the operator/user to inject Uncorrectable and Correctable

...

> +
> +static umode_t inj_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
> +{
> + struct device *dev = kobj_to_dev(kobj);
> + struct mem_ctl_info *mci = container_of(dev, struct mem_ctl_info, dev);
> + struct amd64_pvt *pvt = mci->pvt_info;
> +
> + if (pvt->fam < 0x10)

Related to the comment above, can this be changed to the following?

if (pvt->fam < 0x10 || pvt->fam >= 0x17)
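
To spell out the effect of that condition, here is a minimal standalone
sketch. The helper name and the plain integer types are hypothetical
stand-ins for the kernel's is_visible callback and umode_t; only the family
check is taken from the suggestion above.

```c
/*
 * Return 0 (attribute hidden) outside the Family 10h..16h range,
 * otherwise pass the attribute's own mode through unchanged.
 */
static unsigned short inj_mode_sketch(unsigned int fam, unsigned short mode)
{
	if (fam < 0x10 || fam >= 0x17)
		return 0;

	return mode;
}
```

So the injection files would show up on Family 10h through 16h and stay
hidden on K8 and on Family 17h and later.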

> + return 0;
> + return attr->mode;
> +}
> +

Everything else looks good to me.

Reviewed-by: Yazen Ghannam 

Thanks,
Yazen


Re: [PATCH 1/2] EDAC/amd64: Merge sysfs debugging attributes setup code

2020-12-15 Thread Yazen Ghannam
On Tue, Dec 15, 2020 at 12:05:16PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov 
> 
> There's no need for them to be in a separate file so merge them into the
> main driver compilation unit like the other EDAC drivers do.
> 
> Drop now-unneeded function export, make the function static and shorten
> static function names.
> 
> No functional changes.
> 
> Signed-off-by: Borislav Petkov 

Reviewed-by: Yazen Ghannam 

Thanks,
Yazen


[tip: x86/cpu] EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId

2020-11-19 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID: 8de0c9917cc1297bc5543b61992d5bdee4ce621a
Gitweb:        https://git.kernel.org/tip/8de0c9917cc1297bc5543b61992d5bdee4ce621a
Author:        Yazen Ghannam
AuthorDate:    Mon, 09 Nov 2020 21:06:58
Committer:     Borislav Petkov
CommitterDate: Thu, 19 Nov 2020 11:43:21 +01:00

EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact.

But the NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.
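
A toy illustration of the mismatch, as a standalone sketch. The table is
entirely hypothetical: it models a two-die system where an interleaving
mode exposes both dies as a single NUMA node. The two lookup helpers stand
in for cpu_to_node() and topology_die_id() respectively.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical per-CPU topology: one NUMA node spanning two dies. */
struct cpu_topo_sketch {
	int numa_node;	/* logical, depends on firmware settings */
	int die_id;	/* physical, fixed by hardware */
};

static const struct cpu_topo_sketch topo_sketch[NR_CPUS_SKETCH] = {
	{ 0, 0 }, { 0, 0 }, { 0, 1 }, { 0, 1 },
};

/* Old input to decode_dram_ecc(): the NUMA node of the CPU. */
static int node_id_from_numa(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return topo_sketch[cpu].numa_node;
}

/* New input: the physical die the CPU sits on. */
static int node_id_from_die(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return topo_sketch[cpu].die_id;
}
```

For CPU 3 in this table the old lookup yields node 0 while the error was
reported from die 1, which is the case where address translation would be
done against the wrong die.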

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/20201109210659.754018-4-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 85095e3..5dd905a 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1003,7 +1003,7 @@ static void decode_smca_error(struct mce *m)
pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-   decode_dram_ecc(cpu_to_node(m->extcpu), m);
+   decode_dram_ecc(topology_die_id(m->extcpu), m);
 }
 
 static inline void amd_decode_err_code(u16 ec)


[tip: x86/cpu] x86/CPU/AMD: Remove amd_get_nb_id()

2020-11-19 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID: db970bd231c2264a062e0de4dcf4ead5e6669e7a
Gitweb:        https://git.kernel.org/tip/db970bd231c2264a062e0de4dcf4ead5e6669e7a
Author:        Yazen Ghannam
AuthorDate:    Mon, 09 Nov 2020 21:06:57
Committer:     Borislav Petkov
CommitterDate: Thu, 19 Nov 2020 11:43:17 +01:00

x86/CPU/AMD: Remove amd_get_nb_id()

The Last Level Cache ID is returned by amd_get_nb_id(). In practice,
this value is the same as the AMD NodeId for callers of this function.
The NodeId is saved in struct cpuinfo_x86.cpu_die_id.

Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and
remove the function.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/20201109210659.754018-3-yazen.ghan...@amd.com
---
 arch/x86/events/amd/core.c       | 2 +-
 arch/x86/include/asm/processor.h | 2 --
 arch/x86/kernel/amd_nb.c         | 4 ++--
 arch/x86/kernel/cpu/amd.c        | 6 --
 arch/x86/kernel/cpu/cacheinfo.c  | 2 +-
 arch/x86/kernel/cpu/mce/amd.c    | 4 ++--
 arch/x86/kernel/cpu/mce/inject.c | 4 ++--
 drivers/edac/amd64_edac.c        | 4 ++--
 drivers/edac/mce_amd.c           | 2 +-
 9 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 39eb276..2c1791c 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -538,7 +538,7 @@ static void amd_pmu_cpu_starting(int cpu)
if (!x86_pmu.amd_nb_constraints)
return;
 
-   nb_id = amd_get_nb_id(cpu);
+   nb_id = topology_die_id(cpu);
WARN_ON_ONCE(nb_id == BAD_APICID);
 
for_each_online_cpu(i) {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 82a08b5..c20a52b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -813,10 +813,8 @@ extern int set_tsc_mode(unsigned int val);
 DECLARE_PER_CPU(u64, msr_misc_features_shadow);
 
 #ifdef CONFIG_CPU_SUP_AMD
-extern u16 amd_get_nb_id(int cpu);
 extern u32 amd_get_nodes_per_socket(void);
 #else
-static inline u16 amd_get_nb_id(int cpu)   { return 0; }
 static inline u32 amd_get_nodes_per_socket(void)   { return 0; }
 #endif
 
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 18f6b7c..b439695 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -384,7 +384,7 @@ struct resource *amd_get_mmconfig_range(struct resource *res)
 
 int amd_get_subcaches(int cpu)
 {
-   struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link;
+   struct pci_dev *link = node_to_amd_nb(topology_die_id(cpu))->link;
unsigned int mask;
 
if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
@@ -398,7 +398,7 @@ int amd_get_subcaches(int cpu)
 int amd_set_subcaches(int cpu, unsigned long mask)
 {
static unsigned int reset, ban;
-   struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu));
+   struct amd_northbridge *nb = node_to_amd_nb(topology_die_id(cpu));
unsigned int reg;
int cuid;
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 2f1fbd8..1f71c76 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -424,12 +424,6 @@ clear_ppin:
clear_cpu_cap(c, X86_FEATURE_AMD_PPIN);
 }
 
-u16 amd_get_nb_id(int cpu)
-{
-   return per_cpu(cpu_llc_id, cpu);
-}
-EXPORT_SYMBOL_GPL(amd_get_nb_id);
-
 u32 amd_get_nodes_per_socket(void)
 {
return nodes_per_socket;
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index f9ac682..3ca9be4 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -580,7 +580,7 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index)
if (index < 3)
return;
 
-   node = amd_get_nb_id(smp_processor_id());
+   node = topology_die_id(smp_processor_id());
this_leaf->nb = node_to_amd_nb(node);
if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)
amd_calc_l3_indices(this_leaf->nb);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 0c6b02d..e486f96 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1341,7 +1341,7 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu,
return -ENODEV;
 
if (is_shared_bank(bank)) {
-   nb = node_to_amd_nb(amd_get_nb_id(cpu));
+   nb = node_to_amd_nb(topology_die_id(cpu));
 
/* threshold descriptor already initialized on this node? */
if (nb && nb->bank4) {
@@ -1445,7 +1445,7 @@ static void threshold_remove_bank(struct threshold_bank *bank)
 * The last CPU on this node using the shared bank is going
 * away, remove that 

[tip: x86/cpu] x86/topology: Set cpu_die_id only if DIE_TYPE found

2020-11-19 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID: cb09a379724d299c603a7a79f444f52a9a75b8d2
Gitweb:        https://git.kernel.org/tip/cb09a379724d299c603a7a79f444f52a9a75b8d2
Author:        Yazen Ghannam
AuthorDate:    Mon, 09 Nov 2020 21:06:59
Committer:     Borislav Petkov
CommitterDate: Thu, 19 Nov 2020 11:43:25 +01:00

x86/topology: Set cpu_die_id only if DIE_TYPE found

CPUID Leaf 0x1F defines a DIE_TYPE level (nb: ECX[8:15] level type == 0x5),
but CPUID Leaf 0xB does not. However, detect_extended_topology() will
set struct cpuinfo_x86.cpu_die_id regardless of whether a valid Die ID
was found.

Only set cpu_die_id if a DIE_TYPE level is found. CPU topology code may
use another value for cpu_die_id, e.g. the AMD NodeId on AMD-based
systems. Code ordering should be maintained so that the CPUID Leaf 0x1F
Die ID value will take precedence on systems that may use another value.
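
The check can be sketched outside the kernel as follows. This is a minimal
illustration, not the kernel code: the ECX values are passed in as an array
instead of being read with CPUID, and the helper names are hypothetical.
The level type lives in ECX[15:8], with 0 marking the end of enumeration
and 5 marking a die level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LEVEL_TYPE(ecx)	(((ecx) >> 8) & 0xff)	/* ECX[15:8] */
#define INVALID_TYPE	0
#define DIE_TYPE	5

/*
 * Walk the per-sub-leaf ECX values until an invalid level is hit and
 * report whether a die level was enumerated along the way.
 */
static bool die_level_present_sketch(const uint32_t *ecx_by_subleaf, size_t n)
{
	assert(ecx_by_subleaf != NULL);

	for (size_t i = 0; i < n; i++) {
		unsigned int type = LEVEL_TYPE(ecx_by_subleaf[i]);

		if (type == INVALID_TYPE)
			break;
		if (type == DIE_TYPE)
			return true;
	}

	return false;
}

/* Only overwrite the stored die ID when a die level was really found. */
static void set_die_id_sketch(int *cpu_die_id, bool present, int candidate)
{
	if (present)
		*cpu_die_id = candidate;
}
```

When no die level is enumerated, whatever value was stored earlier (for
example a vendor-specific ID) is left alone.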

Suggested-by: Borislav Petkov 
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/20201109210659.754018-5-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/topology.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index d3a0791..1068002 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -96,6 +96,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
unsigned int die_select_mask, die_level_siblings;
+   bool die_level_present = false;
int leaf;
 
leaf = detect_extended_topology_leaf(c);
@@ -126,6 +127,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
+   die_level_present = true;
die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
@@ -139,8 +141,12 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 
c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
ht_mask_width) & core_select_mask;
-   c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
-   core_plus_mask_width) & die_select_mask;
+
+   if (die_level_present) {
+   c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
+   core_plus_mask_width) & die_select_mask;
+   }
+
c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
die_plus_mask_width);
/*


[tip: x86/cpu] x86/CPU/AMD: Save AMD NodeId as cpu_die_id

2020-11-19 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID: 028c221ed1904af9ac3c5162ee98f48966de6b3d
Gitweb:        https://git.kernel.org/tip/028c221ed1904af9ac3c5162ee98f48966de6b3d
Author:        Yazen Ghannam
AuthorDate:    Mon, 09 Nov 2020 21:06:56
Committer:     Borislav Petkov
CommitterDate: Thu, 19 Nov 2020 11:43:13 +01:00

x86/CPU/AMD: Save AMD NodeId as cpu_die_id

AMD systems provide a "NodeId" value that represents a global ID
indicating to which "Node" a logical CPU belongs. The "Node" is a
physical structure equivalent to a Die, and it should not be confused
with logical structures like NUMA nodes. Logical nodes can be adjusted
based on firmware or other settings whereas the physical nodes/dies are
fixed based on hardware topology.

The NodeId value can be used when a physical ID is needed by software.

Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value
from CPUID or MSR as appropriate. Default to phys_proc_id otherwise.
Do so for both AMD and Hygon systems.
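
The selection order can be sketched in isolation like this. The input
struct and its field names are hypothetical; the masks are the ones used in
the diff below (low byte of ECX for the CPUID path, low three bits for the
MSR path), and the fallback is phys_proc_id as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bundle of the values the real code reads from hardware. */
struct nodeid_inputs_sketch {
	bool has_cpuid_nodeid;	/* CPUID topology leaf is usable */
	uint32_t cpuid_ecx;
	bool has_nodeid_msr;	/* NodeId MSR is usable */
	uint64_t msr_value;
	unsigned int phys_proc_id;
};

/* Pick the die ID: CPUID first, then the MSR, then the package ID. */
static unsigned int amd_die_id_sketch(const struct nodeid_inputs_sketch *in)
{
	assert(in != NULL);

	if (in->has_cpuid_nodeid)
		return in->cpuid_ecx & 0xff;

	if (in->has_nodeid_msr)
		return (unsigned int)(in->msr_value & 7);

	return in->phys_proc_id;
}
```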

Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no
longer needed.

Update the x86 topology documentation.

Suggested-by: Borislav Petkov 
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/20201109210659.754018-2-yazen.ghan...@amd.com
---
 Documentation/x86/topology.rst   |  9 +
 arch/x86/include/asm/cacheinfo.h |  4 ++--
 arch/x86/kernel/cpu/amd.c        | 11 +--
 arch/x86/kernel/cpu/cacheinfo.c  |  6 +++---
 arch/x86/kernel/cpu/hygon.c      | 11 +--
 5 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
index e297399..7f58010 100644
--- a/Documentation/x86/topology.rst
+++ b/Documentation/x86/topology.rst
@@ -41,6 +41,8 @@ Package
 Packages contain a number of cores plus shared resources, e.g. DRAM
 controller, shared caches etc.
 
+Modern systems may also use the term 'Die' for package.
+
 AMD nomenclature for package is 'Node'.
 
 Package-related topology information in the kernel:
@@ -53,11 +55,18 @@ Package-related topology information in the kernel:
 
 The number of dies in a package. This information is retrieved via CPUID.
 
+  - cpuinfo_x86.cpu_die_id:
+
+The physical ID of the die. This information is retrieved via CPUID.
+
   - cpuinfo_x86.phys_proc_id:
 
 The physical ID of the package. This information is retrieved via CPUID
 and deduced from the APIC IDs of the cores in the package.
 
+Modern systems use this value for the socket. There may be multiple
+packages within a socket. This value may differ from cpu_die_id.
+
   - cpuinfo_x86.logical_proc_id:
 
 The logical ID of the package. As we do not trust BIOSes to enumerate the
diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index 86b63c7..86b2e0d 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_CACHEINFO_H
 #define _ASM_X86_CACHEINFO_H
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
-void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
+void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);
 
 #endif /* _ASM_X86_CACHEINFO_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 6062ce5..2f1fbd8 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -330,7 +330,6 @@ static void legacy_fixup_core_id(struct cpuinfo_x86 *c)
  */
 static void amd_get_topology(struct cpuinfo_x86 *c)
 {
-   u8 node_id;
int cpu = smp_processor_id();
 
/* get information required for multi-node processors */
@@ -340,7 +339,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
 
-   node_id  = ecx & 0xff;
+   c->cpu_die_id  = ecx & 0xff;
 
if (c->x86 == 0x15)
c->cu_id = ebx & 0xff;
@@ -360,15 +359,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
if (!err)
c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
-   cacheinfo_amd_init_llc_id(c, cpu, node_id);
+   cacheinfo_amd_init_llc_id(c, cpu);
 
} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
u64 value;
 
rdmsrl(MSR_FAM10H_NODE_ID, value);
-   node_id = value & 7;
+   c->cpu_die_id = value & 7;
 
-   per_cpu(cpu_llc_id, cpu) = node_id;
+   per_cpu(cpu_llc_id, cpu) = c->cpu_die_id;
} else
return;
 
@@ -393,7 +392,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
/* Convert the initial APIC ID into t

[PATCH 3/4] EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId

2020-11-09 Thread Yazen Ghannam
From: Yazen Ghannam 

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact. But the
NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam 
---
 drivers/edac/mce_amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 85095e3902ec..5dd905a3f30c 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1003,7 +1003,7 @@ static void decode_smca_error(struct mce *m)
pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-   decode_dram_ecc(cpu_to_node(m->extcpu), m);
+   decode_dram_ecc(topology_die_id(m->extcpu), m);
 }
 
 static inline void amd_decode_err_code(u16 ec)
-- 
2.25.1



[PATCH 4/4] x86/topology: Set cpu_die_id only if DIE_TYPE found

2020-11-09 Thread Yazen Ghannam
From: Yazen Ghannam 

CPUID Leaf 0x1F defines a DIE_TYPE level, but CPUID Leaf 0xB does not.
However, detect_extended_topology() will set struct
cpuinfo_x86.cpu_die_id regardless of whether a valid Die ID was found.

Only set cpu_die_id if a DIE_TYPE level is found. CPU topology code may
use another value for cpu_die_id, e.g. the AMD NodeId on AMD-based
systems. Code ordering should be maintained so that the CPUID Leaf 0x1F
Die ID value will take precedence on systems that may use another value.

Suggested-by: Borislav Petkov 
Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/cpu/topology.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index d3a0791bc052..1068002c8532 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -96,6 +96,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
unsigned int die_select_mask, die_level_siblings;
+   bool die_level_present = false;
int leaf;
 
leaf = detect_extended_topology_leaf(c);
@@ -126,6 +127,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
+   die_level_present = true;
die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
@@ -139,8 +141,12 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 
c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
ht_mask_width) & core_select_mask;
-   c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
-   core_plus_mask_width) & die_select_mask;
+
+   if (die_level_present) {
+   c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
+   core_plus_mask_width) & die_select_mask;
+   }
+
c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
die_plus_mask_width);
/*
-- 
2.25.1
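
For illustration only (not part of the patch): a minimal userspace sketch
of the rule described in the commit message above, assuming a simplified
list of enumerated level types instead of real CPUID sub-leaf reads. The
names struct cpu_ids and apply_die_level() are hypothetical. The level
type values mirror the ones used in arch/x86/kernel/cpu/topology.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Level types as reported in ECX[15:8] of CPUID Leaf 0xB/0x1F sub-leaves. */
enum level_type {
	INVALID_TYPE = 0,
	SMT_TYPE     = 1,
	CORE_TYPE    = 2,
	DIE_TYPE     = 5,
};

/* Hypothetical stand-in for the relevant part of struct cpuinfo_x86. */
struct cpu_ids {
	uint16_t cpu_die_id;
};

/*
 * Walk the enumerated levels and overwrite cpu_die_id only if a DIE_TYPE
 * level was seen. Otherwise keep whatever value was set earlier, e.g. the
 * AMD NodeId. Returns true if a DIE_TYPE level was found.
 */
static bool apply_die_level(struct cpu_ids *c, const enum level_type *levels,
			    size_t n, uint16_t die_id_from_apicid)
{
	bool die_level_present = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (levels[i] == INVALID_TYPE)
			break;
		if (levels[i] == DIE_TYPE)
			die_level_present = true;
	}

	if (die_level_present)
		c->cpu_die_id = die_id_from_apicid;

	return die_level_present;
}
```

With a Leaf 0xB style list (SMT and Core only) the earlier cpu_die_id is
left alone. With a Leaf 0x1F style list that includes a Die level, the
value derived from the APIC ID takes precedence.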



[PATCH 2/4] x86/CPU/AMD: Remove amd_get_nb_id()

2020-11-09 Thread Yazen Ghannam
From: Yazen Ghannam 

The Last Level Cache ID is returned by amd_get_nb_id(). In practice,
this value is the same as the AMD NodeId for callers of this function.
The NodeId is saved in struct cpuinfo_x86.cpu_die_id.

Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and
remove the function.

Signed-off-by: Yazen Ghannam 
---
 arch/x86/events/amd/core.c   | 2 +-
 arch/x86/include/asm/processor.h | 2 --
 arch/x86/kernel/amd_nb.c | 4 ++--
 arch/x86/kernel/cpu/amd.c| 6 --
 arch/x86/kernel/cpu/cacheinfo.c  | 2 +-
 arch/x86/kernel/cpu/mce/amd.c| 4 ++--
 arch/x86/kernel/cpu/mce/inject.c | 4 ++--
 drivers/edac/amd64_edac.c| 4 ++--
 drivers/edac/mce_amd.c   | 2 +-
 9 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 39eb276d0277..2c1791c4a518 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -538,7 +538,7 @@ static void amd_pmu_cpu_starting(int cpu)
if (!x86_pmu.amd_nb_constraints)
return;
 
-   nb_id = amd_get_nb_id(cpu);
+   nb_id = topology_die_id(cpu);
WARN_ON_ONCE(nb_id == BAD_APICID);
 
for_each_online_cpu(i) {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 60dbcdcb833f..a411466a6e74 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -815,10 +815,8 @@ extern int set_tsc_mode(unsigned int val);
 DECLARE_PER_CPU(u64, msr_misc_features_shadow);
 
 #ifdef CONFIG_CPU_SUP_AMD
-extern u16 amd_get_nb_id(int cpu);
 extern u32 amd_get_nodes_per_socket(void);
 #else
-static inline u16 amd_get_nb_id(int cpu)   { return 0; }
 static inline u32 amd_get_nodes_per_socket(void)   { return 0; }
 #endif
 
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 18f6b7c4bd79..b4396952c9a6 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -384,7 +384,7 @@ struct resource *amd_get_mmconfig_range(struct resource *res)
 
 int amd_get_subcaches(int cpu)
 {
-   struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link;
+   struct pci_dev *link = node_to_amd_nb(topology_die_id(cpu))->link;
unsigned int mask;
 
if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
@@ -398,7 +398,7 @@ int amd_get_subcaches(int cpu)
 int amd_set_subcaches(int cpu, unsigned long mask)
 {
static unsigned int reset, ban;
-   struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu));
+   struct amd_northbridge *nb = node_to_amd_nb(topology_die_id(cpu));
unsigned int reg;
int cuid;
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 2f1fbd8150af..1f71c7616917 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -424,12 +424,6 @@ static void amd_detect_ppin(struct cpuinfo_x86 *c)
clear_cpu_cap(c, X86_FEATURE_AMD_PPIN);
 }
 
-u16 amd_get_nb_id(int cpu)
-{
-   return per_cpu(cpu_llc_id, cpu);
-}
-EXPORT_SYMBOL_GPL(amd_get_nb_id);
-
 u32 amd_get_nodes_per_socket(void)
 {
return nodes_per_socket;
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index f9ac682e75e7..3ca9be482a9e 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -580,7 +580,7 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index)
if (index < 3)
return;
 
-   node = amd_get_nb_id(smp_processor_id());
+   node = topology_die_id(smp_processor_id());
this_leaf->nb = node_to_amd_nb(node);
if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)
amd_calc_l3_indices(this_leaf->nb);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 0c6b02dd744c..e486f96b3cb3 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1341,7 +1341,7 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu,
return -ENODEV;
 
if (is_shared_bank(bank)) {
-   nb = node_to_amd_nb(amd_get_nb_id(cpu));
+   nb = node_to_amd_nb(topology_die_id(cpu));
 
/* threshold descriptor already initialized on this node? */
if (nb && nb->bank4) {
@@ -1445,7 +1445,7 @@ static void threshold_remove_bank(struct threshold_bank *bank)
 * The last CPU on this node using the shared bank is going
 * away, remove that bank now.
 */
-   nb = node_to_amd_nb(amd_get_nb_id(smp_processor_id()));
+   nb = node_to_amd_nb(topology_die_id(smp_processor_id()));
nb->bank4 = NULL;
}
 
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 3a44346f2276..7b360731fc2d 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86

[PATCH 1/4] x86/CPU/AMD: Save AMD NodeId as cpu_die_id

2020-11-09 Thread Yazen Ghannam
From: Yazen Ghannam 

AMD systems provide a "NodeId" value that represents a global ID
indicating to which "Node" a logical CPU belongs. The "Node" is a
physical structure equivalent to a Die, and it should not be confused
with logical structures like NUMA nodes. Logical nodes can be adjusted
based on firmware or other settings whereas the physical nodes/dies are
fixed based on hardware topology.

The NodeId value can be used when a physical ID is needed by software.

Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value
from CPUID or MSR as appropriate. Default to phys_proc_id otherwise.
Do so for both AMD and Hygon systems.

Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no
longer needed.

Update the x86 topology documentation.

[ Use cpu_die_id. ]
Suggested-by: Borislav Petkov 
Signed-off-by: Yazen Ghannam 
---
 Documentation/x86/topology.rst   |  9 +
 arch/x86/include/asm/cacheinfo.h |  4 ++--
 arch/x86/kernel/cpu/amd.c| 11 +--
 arch/x86/kernel/cpu/cacheinfo.c  |  6 +++---
 arch/x86/kernel/cpu/hygon.c  | 11 +--
 5 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
index e29739904e37..7f58010ea86a 100644
--- a/Documentation/x86/topology.rst
+++ b/Documentation/x86/topology.rst
@@ -41,6 +41,8 @@ Package
 Packages contain a number of cores plus shared resources, e.g. DRAM
 controller, shared caches etc.
 
+Modern systems may also use the term 'Die' for package.
+
 AMD nomenclature for package is 'Node'.
 
 Package-related topology information in the kernel:
@@ -53,11 +55,18 @@ Package-related topology information in the kernel:
 
 The number of dies in a package. This information is retrieved via CPUID.
 
+  - cpuinfo_x86.cpu_die_id:
+
+The physical ID of the die. This information is retrieved via CPUID.
+
   - cpuinfo_x86.phys_proc_id:
 
 The physical ID of the package. This information is retrieved via CPUID
 and deduced from the APIC IDs of the cores in the package.
 
+Modern systems use this value for the socket. There may be multiple
+packages within a socket. This value may differ from cpu_die_id.
+
   - cpuinfo_x86.logical_proc_id:
 
 The logical ID of the package. As we do not trust BIOSes to enumerate the
diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index 86b63c7feab7..86b2e0dcc4bf 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_CACHEINFO_H
 #define _ASM_X86_CACHEINFO_H
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
-void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
+void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);
 
 #endif /* _ASM_X86_CACHEINFO_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 6062ce586b95..2f1fbd8150af 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -330,7 +330,6 @@ static void legacy_fixup_core_id(struct cpuinfo_x86 *c)
  */
 static void amd_get_topology(struct cpuinfo_x86 *c)
 {
-   u8 node_id;
int cpu = smp_processor_id();
 
/* get information required for multi-node processors */
@@ -340,7 +339,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
 
-   node_id  = ecx & 0xff;
+   c->cpu_die_id  = ecx & 0xff;
 
if (c->x86 == 0x15)
c->cu_id = ebx & 0xff;
@@ -360,15 +359,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
if (!err)
c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
-   cacheinfo_amd_init_llc_id(c, cpu, node_id);
+   cacheinfo_amd_init_llc_id(c, cpu);
 
} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
u64 value;
 
rdmsrl(MSR_FAM10H_NODE_ID, value);
-   node_id = value & 7;
+   c->cpu_die_id = value & 7;
 
-   per_cpu(cpu_llc_id, cpu) = node_id;
+   per_cpu(cpu_llc_id, cpu) = c->cpu_die_id;
} else
return;
 
@@ -393,7 +392,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
/* use socket ID also for last level cache */
-   per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+   per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->phys_proc_id;
 }
 
 static void amd_detect_ppin(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 57074cf3ad7c..f9ac682e75e7 100644

[PATCH 0/4] Set and use cpu_die_id on AMD-based systems

2020-11-09 Thread Yazen Ghannam
From: Yazen Ghannam 

AMD-based systems currently use a "NodeId" when referencing a
software-visible hardware structure. This may be referred to as a "Die"
in x86 documentation, "Node" in some AMD documentation, and "Package" in
Linux documentation.

Recently a cpu_die_id value was added to struct cpuinfo_x86. This value
can be used on AMD-based systems rather than using an AMD-specific value
throughout the kernel.

This set is based on patches 1-3 from the following set.
https://lkml.kernel.org/r/20200903200144.310991-1-yazen.ghan...@amd.com

Thanks,
Yazen

Yazen Ghannam (4):
  x86/CPU/AMD: Save AMD NodeId as cpu_die_id
  x86/CPU/AMD: Remove amd_get_nb_id()
  EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
  x86/topology: Set cpu_die_id only if DIE_TYPE found

 Documentation/x86/topology.rst   |  9 +
 arch/x86/events/amd/core.c   |  2 +-
 arch/x86/include/asm/cacheinfo.h |  4 ++--
 arch/x86/include/asm/processor.h |  2 --
 arch/x86/kernel/amd_nb.c |  4 ++--
 arch/x86/kernel/cpu/amd.c| 17 +
 arch/x86/kernel/cpu/cacheinfo.c  |  8 
 arch/x86/kernel/cpu/hygon.c  | 11 +--
 arch/x86/kernel/cpu/mce/amd.c|  4 ++--
 arch/x86/kernel/cpu/mce/inject.c |  4 ++--
 arch/x86/kernel/cpu/topology.c   | 10 --
 drivers/edac/amd64_edac.c|  4 ++--
 drivers/edac/mce_amd.c   |  4 ++--
 13 files changed, 44 insertions(+), 39 deletions(-)

-- 
2.25.1
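
For illustration only (not part of the series): a minimal sketch of how
patch 1/4 derives the value saved in cpu_die_id, assuming the inputs have
already been read. The struct and function names are hypothetical. The
masks (ecx & 0xff for CPUID Leaf 0x8000001e, value & 7 for
MSR_FAM10H_NODE_ID) and the phys_proc_id default are taken from the patch.
The real code sets the default first in amd_detect_cmp() and overrides it
later in amd_get_topology(); the sketch folds that into one function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for values the kernel reads from hardware. */
struct node_id_sources {
	bool     has_topoext;        /* CPUID Leaf 0x8000001e is available */
	uint32_t leaf_8000001e_ecx;  /* ECX from that leaf */
	bool     has_nodeid_msr;     /* X86_FEATURE_NODEID_MSR is set */
	uint64_t nodeid_msr;         /* raw MSR_FAM10H_NODE_ID value */
	uint16_t phys_proc_id;       /* socket ID, used as the default */
};

/* Pick the NodeId from CPUID, then the MSR, then default to the socket ID. */
static uint16_t amd_pick_die_id(const struct node_id_sources *s)
{
	if (s->has_topoext)
		return s->leaf_8000001e_ecx & 0xff;

	if (s->has_nodeid_msr)
		return s->nodeid_msr & 7;

	return s->phys_proc_id;
}
```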



[PATCH] EDAC/amd64: Set proper family type for Family 19h Models 20h-2Fh

2020-10-09 Thread Yazen Ghannam
From: Yazen Ghannam 

AMD Family 19h Models 20h-2Fh use the same PCI IDs as Family 17h Models
70h-7Fh. The same family ops and number of channels also apply.

Use the Family17h Model 70h family_type and ops for Family 19h Models
20h-2Fh. Update the controller name to match the system.

Signed-off-by: Yazen Ghannam 
---
 drivers/edac/amd64_edac.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index fcc08bbf6945..1362274d840b 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3385,6 +3385,12 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt)
break;
 
case 0x19:
+   if (pvt->model >= 0x20 && pvt->model <= 0x2f) {
+   fam_type = &family_types[F17_M70H_CPUS];
+   pvt->ops = &family_types[F17_M70H_CPUS].ops;
+   fam_type->ctl_name = "F19h_M20h";
+   break;
+   }
fam_type= &family_types[F19_CPUS];
pvt->ops= &family_types[F19_CPUS].ops;
family_types[F19_CPUS].ctl_name = "F19h";
-- 
2.25.1
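
For illustration only (not part of the patch): the model-range check
above reduced to a standalone helper. The enum and function names are
hypothetical; only the Family 19h and Models 20h-2Fh values come from the
patch.

```c
#include <stdint.h>

/* Hypothetical subset of the family_types[] indices used by the driver. */
enum fam_index {
	FAM_F17_M70H,
	FAM_F19,
	FAM_UNKNOWN,
};

/*
 * Family 19h Models 20h-2Fh reuse the Family 17h Model 70h ops.
 * All other Family 19h models use the generic Family 19h entry.
 */
static enum fam_index pick_family_type(uint8_t family, uint8_t model)
{
	if (family != 0x19)
		return FAM_UNKNOWN;

	if (model >= 0x20 && model <= 0x2f)
		return FAM_F17_M70H;

	return FAM_F19;
}
```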



Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-29 Thread Yazen Ghannam
On Mon, Sep 28, 2020 at 08:14:07PM +0200, Borislav Petkov wrote:
> On Mon, Sep 28, 2020 at 10:53:50AM -0500, Yazen Ghannam wrote:
> 
> > I agree that the translation code is implementation-specific and applies
> > only to DRAM ECC errors, so it makes sense to have it in amd64_edac. The
> > only issue is getting the address translation to earlier notifiers. I
> > think we can add a new one in amd64_edac to run before others. Maybe this
> > can be a new priority class like MCE_PRIO_PREPROCESS, or something like
> > that for notifiers that fixup the MCE data.
> 
> Well, I'm not sure you need notifiers here - you wanna call
> mce_usable_address() and in it, it should do the address conversion
> calculation to give you a physical address which you can feed to
> memory_failure etc.
> 
> Now, mce_usable_address() is core code and we can make core code call
> into a module but that is yucky. So *that* is your reason for keeping it
> where it is.
>

Okay, we'll keep the code where it is. I'll work on another set to call
the address translation with mce_usable_address().

> Looking at its size:
> 
> $ readelf -s vmlinux | grep umc_normaddr_to
>   2864: 817d8ae5   168 FUNC    LOCAL  DEFAULT    1 umc_normaddr_to_[...]
>  91866: 81030e00  1127 FUNC    GLOBAL DEFAULT    1 umc_normaddr_to_[...]
> 
> that's something like ~1.3K and if you split it and do some
> experimenting, you might get it even slimmer. Not that ~1.3K is that
> huge for current standards but we should always aim at not bloating the
> fat guy our kernel already is.
>

Okay, I'll keep an eye on this and try to slim it down.

Thanks,
Yazen


Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-28 Thread Yazen Ghannam
On Mon, Sep 28, 2020 at 11:47:59AM +0200, Borislav Petkov wrote:
> On Fri, Sep 25, 2020 at 02:51:27PM -0500, Yazen Ghannam wrote:
> 
> > The address translation needs to be done before the notifiers that need
> > it, and EDAC comes after all of them. There's also the case where the
> > EDAC interface isn't wanted, so amd64_edac will be unloaded.
> 
> I'd be interested as to why. Because decoding addresses is amd64_edac
> *core* functionality. We can stick it in drivers/edac/mce_amd.c but I'd
> like to hear what those valid reasons are, not to use the driver which
> is supposed to do that anyway.
>

I don't have any clear reasons. I just get vague use cases sometimes
about not using EDAC and relying on other things. But it shouldn't hurt
to have the module load anyway. The EDAC messages can be suppressed, and
the sysfs interface can be ignored. So, after a bit more thought, this
doesn't seem like a good reason.

I agree that the translation code is implementation-specific and applies
only to DRAM ECC errors, so it makes sense to have it in amd64_edac. The
only issue is getting the address translation to earlier notifiers. I
think we can add a new one in amd64_edac to run before others. Maybe this
can be a new priority class like MCE_PRIO_PREPROCESS, or something like
that for notifiers that fixup the MCE data.

I can start by moving the address translation to amd64_edac and doing
the code cleanup.

Thanks,
Yazen


Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-25 Thread Yazen Ghannam
On Fri, Sep 25, 2020 at 09:22:31AM +0200, Borislav Petkov wrote:
> On Wed, Sep 23, 2020 at 11:25:10AM -0500, Yazen Ghannam wrote:
> > I don't remember the original reason, and I was recently asked about
> > this code living in a module. I did some looking after this ask, and I
> > found that we should be using this translation to get a proper value for
> > the memory error notifiers to use. So I think we still need to use this
> > function some way with the core code even if the EDAC interface isn't
> > used.
> 
> You'd need to be more specific here, you want to bypass amd64_edac to
> decode errors? Judging by the current RAS activity coming from you guys,
> I'm thinking firmware. But then wouldn't the firmware do the decoding
> for us and then this function is not even needed?
>

The UC, NFIT, and CEC notifiers all operate on system physical
addresses. The address in the MCE record is checked by
mce_usable_address() to see if it can be used by the kernel, i.e. the
address is a system physical address. Right now, this check passes on
AMD systems if MCA_STATUS[AddrV] is set. This works for memory errors on
legacy AMD systems, since the NB MCA bank logs a physical address for
DRAM ECC errors. But this won't work on newer systems, because the UMC
MCA bank does not log a system physical address for DRAM ECC errors. So
the address provided by the hardware will need to be translated to a
physical address before the notifiers in the MCE chain can use it.

We can add support to get the physical address from firmware in some
cases. But it looks to me that we'll still need to keep updating the
translation code in the kernel to cover some platform/user
configurations. So it makes sense to me to move the functionality into a
module to make it easier to update.

The address translation needs to be done before the notifiers that need
it, and EDAC comes after all of them. There's also the case where the
EDAC interface isn't wanted, so amd64_edac will be unloaded. But the
functionality in the other notifiers is still expected to be available.
So it's more than just decoding the error like we do now with amd64_edac.
That's why I think the translation code can be in a separate module with
a notifier that runs before the others. This can do the translation once
then pass the result down to the CEC, UC, NFIT, and EDAC notifiers to
use as needed.

Thanks,
Yazen
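
For illustration only: a minimal userspace sketch of the ordering idea in
the message above, assuming a simple priority-sorted chain rather than
the kernel's notifier API. All names are hypothetical, and the
translation itself is a made-up placeholder offset. The point is that a
high-priority notifier translates the address once, stores the result in
a separate field so the hardware-logged value is kept, and later
notifiers consume the translated value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical error record; the original address is never overwritten. */
struct err_record {
	uint64_t addr;            /* address as logged by the hardware */
	uint64_t sys_addr;        /* translated system physical address */
	bool     sys_addr_valid;
};

typedef void (*notifier_fn)(struct err_record *rec);

struct notifier {
	notifier_fn fn;
	int priority;             /* higher priority runs first */
};

/* Sort helper: descending priority. */
static int cmp_prio_desc(const void *a, const void *b)
{
	const struct notifier *na = a;
	const struct notifier *nb = b;

	return (nb->priority > na->priority) - (nb->priority < na->priority);
}

/* Run every notifier on the record, highest priority first. */
static void run_chain(struct notifier *chain, size_t n, struct err_record *rec)
{
	size_t i;

	qsort(chain, n, sizeof(*chain), cmp_prio_desc);
	for (i = 0; i < n; i++)
		chain[i].fn(rec);
}

/* Placeholder translation: a made-up fixed offset, not the real algorithm. */
static void translate_notifier(struct err_record *rec)
{
	rec->sys_addr = rec->addr + 0x100000000ULL;
	rec->sys_addr_valid = true;
}

/* Stand-in for a consumer such as a CEC, UC, NFIT, or EDAC notifier. */
static uint64_t seen_by_consumer;

static void consumer_notifier(struct err_record *rec)
{
	seen_by_consumer = rec->sys_addr_valid ? rec->sys_addr : 0;
}
```

Registering the consumer first and the translator second still yields the
translated address at the consumer, because the chain is ordered by
priority and not by registration order.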


Re: [PATCH v4] cper, apei, mce: Pass x86 CPER through the MCA handling chain

2020-09-25 Thread Yazen Ghannam
On Fri, Sep 25, 2020 at 09:54:06AM +0900, Punit Agrawal wrote:
> Borislav Petkov  writes:
> 
> > On Thu, Sep 24, 2020 at 12:23:27PM -0500, Smita Koralahalli Channabasappa wrote:
> >> > Even though it's not defined in the UEFI spec, it doesn't mean a
> >> > structure definition cannot be created.
> >
> > Created for what? That structure better have a big fat comment above it,
> > what firmware generates its layout.
> 
> Maybe I could've used a better choice of words - I meant to define a
> structure with meaningful member names to replace the *(ptr + i)
> accesses in the patch.
> 
> The requirement for documenting the record layout doesn't change -
> whether using raw pointer arithmetic vs a structure definition.
> 
> >> > After all, the patch is relying on some guarantee of the meaning of
> >> > the values and their ordering.
> >
> > AFAICT, this looks like an ad-hoc definition and the moment they change
> > it in some future revision, that struct of yours becomes invalid so we'd
> > need to add another one.
> 
> If there's no spec backing the current layout, then it'll indeed be an
> ad-hoc definition of a structure in the kernel. But considering that
> it's part of firmware / OS interface for an important part of the RAS
> story I would hope that the code is based on a spec - having that
> reference included would help maintainability.
> 
> Incompatible changes will indeed break the assumptions in the kernel and
> code will need to be updated - regardless of the choice of kernel
> implementation; pointer arithmetic, structure definition - ad-hoc or
> spec provided.
> 
> Having versioning will allow running older kernels on newer hardware and
> vice versa - but I don't see why that is important only when using a
> structure based access.
>

There is no versioning option for the x86 context info structure in the
UEFI spec, so I don't think there'd be a clean way to include version
information.

The format of the data in the context info is not totally ad-hoc, and it
does follow the UEFI spec. The "Register Array" field is raw data. This
may follow one of the predefined formats in the UEFI spec like the "X64
Register State", etc. Or, in the case of MSR and Memory Mapped
Registers, this is a raw dump of the registers starting from the address
shown in the structure. The two values that can be changed are the
starting address and the array size. These two together provide a window
to the registers. The registers are fixed, so a single context info
structure should include a single contiguous range of registers. Multiple
context info structures can be provided to include registers from
different, non-contiguous ranges.

This patch is checking if an MSR context info structure lines up with
the MCAX register space used on Scalable MCA systems. This register
space is defined in the AMD Processor Programming Reference for various
products. This is considered a hardware feature extension, so the
existing register layout won't change though new registers may be added.
A layout change would require moving to another register space which is
what happened going from legacy MCA (starting at address 0x400) to MCAX
(starting at address 0xC0002000) registers.

The only two things firmware can change are the address at which the
info starts and where the info ends. So the implementation-specific
details here are that currently the starting address is MCA_STATUS (in
MCAX space) for a bank and the remaining info includes the other MCA
registers for this bank.

So I think the kernel can be strict with this format, i.e. the two
variables match what we're looking for. This patch already has a check
on the starting address. It should also include a check that "Register
Array Size" is large enough to include all the registers we want to
extract. If the format doesn't match, then we fall back to a raw dump
of the data like we have today.

Or the kernel can be more flexible and try to find the window of
registers based on the starting address. I think this is really
open-ended though.

Does this sound reasonable?

Thanks,
Yazen
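
For illustration only: a minimal sketch of the strict check described in
the message above, i.e. the starting address must be MCA_STATUS of a bank
in MCAX space and "Register Array Size" must cover all the registers we
want. The struct is a hypothetical subset of the x86 context info
structure, and the helper name is made up. The MCAX base 0xC0002000 is
from the message. The MSR context type value, the 0x10 per-bank stride,
and MCA_STATUS being the second register of a bank are assumptions here
and should be checked against the UEFI spec and the AMD PPR.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTX_TYPE_MSR        1u           /* assumed: MSR register context */
#define SMCA_MSR_BASE       0xC0002000u  /* MCAX base address */
#define SMCA_BANK_STRIDE    0x10u        /* assumed: MSRs reserved per bank */
#define SMCA_STATUS_OFFSET  0x1u         /* assumed: MCA_STATUS within a bank */

/* Hypothetical subset of the x86 processor context info structure. */
struct ctx_info {
	uint16_t reg_ctx_type;
	uint16_t reg_arr_size;   /* size of the register array in bytes */
	uint32_t msr_addr;       /* starting MSR address */
};

/*
 * Return true, and the bank number, only if the context starts at a bank's
 * MCA_STATUS in MCAX space and holds at least nregs_wanted 64-bit values.
 * Callers would fall back to a raw dump when this returns false.
 */
static bool ctx_matches_smca_status(const struct ctx_info *ctx,
				    unsigned int nregs_wanted,
				    unsigned int *bank)
{
	uint32_t off;

	if (ctx->reg_ctx_type != CTX_TYPE_MSR)
		return false;

	if (ctx->msr_addr < SMCA_MSR_BASE)
		return false;

	off = ctx->msr_addr - SMCA_MSR_BASE;
	if (off % SMCA_BANK_STRIDE != SMCA_STATUS_OFFSET)
		return false;

	if (ctx->reg_arr_size < nregs_wanted * sizeof(uint64_t))
		return false;

	*bank = off / SMCA_BANK_STRIDE;
	return true;
}
```

The sketch does not bound the bank number. A real check would also
reject addresses beyond the last implemented bank.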


Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-23 Thread Yazen Ghannam
On Wed, Sep 23, 2020 at 10:20:39AM +0200, Borislav Petkov wrote:
> On Thu, Sep 03, 2020 at 08:01:44PM +0000, Yazen Ghannam wrote:
> > From: Muralidhara M K 
> > 
> > Add support for new memory interleaving modes used in current AMD systems.
> >
> > Check if the system is using a current Data Fabric version or a legacy
> > version as some bit and register definitions have changed.
> > 
> > Tested on AMD reference platforms with the following memory interleaving
> > options.
> > 
> > Naples
> > - None
> > - Channel
> > - Die
> > - Socket
> > 
> > Rome (NPS = Nodes per Socket)
> > - None
> > - NPS0
> > - NPS1
> > - NPS2
> > - NPS4
> > 
> > The fixes tag refers to the commit that allows amd64_edac_mod to load on
> > Rome systems.
> 
> Err, why? This is adding new stuff to an address translation function.
> How does that fix amd64_edac loading on Rome?
> 
> > The module may report incorrect system addresses on
> > Rome systems depending on the interleaving option used.
> 
> That doesn't stop it from loading, sorry.
>

Okay, no problem.

> Now, before you guys do any new features, I'd like you to split this
> humongous function umc_normaddr_to_sysaddr() logically into separate
> helpers and each helper does exactly one thing and one thing only.
> 
> Then use a verb in its name: umc_translate_normaddr_to_sysaddr() or so.
>

Okay, will do.

> Also, Yazen, remind me again pls why isn't this function in
> drivers/edac/amd64_edac.c, where it is needed?
> 
> If the reason is not valid anymore, let's move it there before splitting
> so that it doesn't bloat the core code.
>

I don't remember the original reason, and I was recently asked about
this code living in a module. I did some looking after this ask, and I
found that we should be using this translation to get a proper value for
the memory error notifiers to use. So I think we still need to use this
function some way with the core code even if the EDAC interface isn't
used.

I think this set can be split up.

1) Set with patches 1-3 fixed up to use cpu_die_id.
2) Set with the address translation updates.
   a) Move umc_normaddr_to_sysaddr() into a new module under EDAC.
   b) Hook the new module into amd64_edac.c where it's used today.
   c) Refactor the code as you suggested above.
   d) Add the new features.
3) New set that sets up a proper notifier for the address translation.
   a) Unhook the new module from amd64_edac.c.
   b) Register a notifier that runs before any notifiers that operate on
  memory errors.
   c) Find a way to pass the translated address through the chain
  without losing the original value.

What do you think?

Thanks,
Yazen


Re: [PATCH v2 6/8] x86/MCE/AMD: Drop tmp variable in translation code

2020-09-23 Thread Yazen Ghannam
On Wed, Sep 23, 2020 at 10:05:56AM +0200, Borislav Petkov wrote:
> On Thu, Sep 03, 2020 at 08:01:42PM +0000, Yazen Ghannam wrote:
> > From: Yazen Ghannam 
> > 
> > Remove the "tmp" variable used to save register values. Save the values
> > in existing variables, if possible.
> > 
> > The register values are 32 bits. Use separate "reg_" variables to hold
> > the register values if the existing variable sizes don't match, or if
> > no bitfields in a register share the same name as the register.
> 
> So I'm missing the "why" in the commit message. Why are you doing this?
> 
> Is there some reason which I'll find out later? If not, then this is
> just unnecessary churn.
>

I don't have a strong reason other than trying to address a comment in
the first version. I can drop this patch if you prefer.

Thanks,
Yazen


Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-17 Thread Yazen Ghannam
On Thu, Sep 17, 2020 at 06:40:48PM +0200, Borislav Petkov wrote:
> On Thu, Sep 17, 2020 at 11:20:53AM -0500, Yazen Ghannam wrote:
> > But newer systems support CPUID Leaf 0xB, so cpu_die_id will get
> > explicitly set by detect_extended_topology(). The value set is
> > different from the AMD NodeId. And at that point I shied away from
> > doing any override or fixup.
> 
> Well, different how? Can you extract the node_id you need
> from CPUID(0xb)? If yes, we can do an AMD-specific branch in
> detect_extended_topology() but that better be future proof.
> 
> IOW, is information from CPUID(0xb) ever going to be needed in the
> kernel?
> 
> Also, and independently, if its definition does not give you the
> node_id you need, then you can just as well overwrite ->cpu_die_id in
> detect_extended_topology() because that value - whatever that is, could
> be garbage, just as well - is wrong on AMD anyway.
> 
> So it would be a fix for the leaf parsing, regardless of whether you
> need it or not.
> 
> Makes sense?
>

Yes, I think so. "Die" is not defined in CPUID(0xb), only SMT and Core,
so the cpu_die_id value is not valid. In which case, we can overwrite
it. CPUID(0xb) doesn't have anything equivalent to AMD NodeId. So on
systems with CPUID < 0x1F, we should be okay with using cpu_die_id equal
to AMD NodeId.

I have an idea on what to do, so I'll send another rev if that's okay.
Do you have any comments on the other patches in the set?

Thanks,
Yazen


Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-17 Thread Yazen Ghannam
On Thu, Sep 17, 2020 at 12:37:20PM +0200, Borislav Petkov wrote:
> On Wed, Sep 16, 2020 at 02:51:52PM -0500, Yazen Ghannam wrote:
> > What do you think?
> 
> Yeah, forget logical_proc_id - the galactic senate of x86 maintainers
> said that we're keeping that for when BIOS vendors f*ck up with the
> phys_proc_id enumeration on AMD. Then we'll need that as a workaround.
> 
> Look instead at:
> 
> struct cpuinfo_x86 {
> 
>   ...
> 
> u16 cpu_die_id;
> u16 logical_die_id;
> 
> and
> 
> 7745f03eb395 ("x86/topology: Add CPUID.1F multi-die/package support")
> 
> "Some new systems have multiple software-visible die within each
> package."
> 
> and you could map the AMD packages to those dies. And if you guys
> implement CPUID.1F to enumerate those packages the same way, then all
> should just work (famous last words).
>
> Because Intel dies are basically AMD packages consisting of a CCX, caches
> and DF.
> 
> We would have to update the documentation in the end to denote that but
> let's see if this should work for you too first. Because the concepts
> sound very similar, if not identical...
>

Yep, we could ask the hardware folks to implement CPUID Leaf 0x1F, but
that'll be in some future products. 

I actually tried using cpu_die_id, but I ran into an issue on newer
systems.

On older systems, there is no CPUID Leaf 0xB or 0x1F, and cpu_die_id
doesn't get explicitly set. So setting cpu_die_id equal to AMD NodeId
would work. But newer systems support CPUID Leaf 0xB, so cpu_die_id
will get explicitly set by detect_extended_topology(). The value set is
different from the AMD NodeId. And at that point I shied away from
doing any override or fixup.

Thanks,
Yazen


Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-16 Thread Yazen Ghannam
On Tue, Sep 15, 2020 at 10:35:15AM +0200, Borislav Petkov wrote:
...
> > Yeah, I think example 4b works here. The mismatch though is with
> > phys_proc_id and package on AMD systems. You can see above that
> > phys_proc_id gives a socket number, and the AMD NodeId gives a package
> > number.
> 
> Ok, now looka here:
> 
> "  - cpuinfo_x86.logical_proc_id:
> 
> The logical ID of the package. As we do not trust BIOSes to enumerate the
> packages in a consistent way, we introduced the concept of logical package
> ID so we can sanely calculate the number of maximum possible packages in
> the system and have the packages enumerated linearly."
> 
> Doesn't that sound like exactly what you need?
> 
> Because that DF ID *is* practically the package ID as there's 1:1
> mapping between DF and a package, as you say above.
> 
> Right?
> 
> Now, it says
> 
> [7.670791] smpboot: Max logical packages: 2
> 
> on my Rome box but what you want sounds very much like the logical
> package ID and if we define that on AMD to be that and document it this
> way, I guess that should work too, provided there are no caveats like
> sched is using this info for proper task placement and so on. That would
> need code audit, of course...
>

The only use of logical_proc_id seems to be in hswep_uncore_cpu_init().
So I think maybe we can use this.

However, I think there are two issues.

1) The logical_proc_id seems like it should refer to the same type of
structure as phys_proc_id. In our case, this won't be true as
phys_proc_id would refer to the "socket" on AMD and logical_proc_id
would refer to the package/AMD NodeId.

2) The AMD NodeId is read during c_init()/init_amd(), so logical_proc_id
can be set here. But then logical_proc_id will get overwritten later in 
topology_update_package_map(). I don't know if it'd be good to modify
the generic flow to support this vendor-specific behavior.

What do you think?

Thanks,
Yazen


Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-14 Thread Yazen Ghannam
On Thu, Sep 10, 2020 at 12:14:43PM +0200, Borislav Petkov wrote:
> On Wed, Sep 09, 2020 at 03:17:55PM -0500, Yazen Ghannam wrote:
> > We need to access specific instances of hardware registers in the
> > Northbridge or Data Fabric. The code in arch/x86/kernel/amd_nb.c does
> > this.
> 
> So you don't need the node_id - you need the northbridge/data fabric ID?
> I'm guessing NB == DF, i.e., it was NB before Zen and it is DF now.
> 
> Yes?
>

Yes, that's right.

I called it "node_id" based on the AMD documentation and what it's
called today in the Linux code. It's called other things like nb_id and
nid too.

I think we can call it something else to avoid confusion with NUMA nodes
if that'll help.

> > Package = Socket, i.e. a field replaceable unit. Socket may not be
> > useful for software, but I think it helps users identify the hardware.
> > 
> > I think the following could be changed in the documentation:
> > 
> > "In the past a socket always contained a single package (see below), but
> > with the advent of Multi Chip Modules (MCM) a socket can hold more than one
> > package."
> > 
> > Replace "package" with "die".
> 
> So first of all, we have:
> 
> "AMD nomenclature for package is 'Node'."
> 
> so we either change that because as you explain, node != package on AMD.
> 
> What you need is the ID of that northbridge or data fabric instance,
> AFAIU.
> 
> > You take multiple dies from the foundry and you "package" them together
> > into a single unit.
> 
> I think you're overloading the word "package" here and that leads to
> more confusion. Package in our definition - Linux' - is:
> 
> "Packages contain a number of cores plus shared resources, e.g. DRAM
> controller, shared caches etc." If you glue several packages together,
> you get an MCM.
> 

Yes, you're right. The AMD documentation is different, so I'll try to
stick with the Linux documentation and qualify names with "AMD" when
noting the usage by the AMD docs.

> > They could be equal depending on the system. The values are different on
> > MCM systems like Bulldozer and Naples though.
> > 
> > The functions and structures in amd_nb.c are indexed by the node_id.
> > This is done implicitly right now by using amd_get_nb_id()/cpu_llc_id.
> > But the LLC isn't always equal to the Node/Die like in Naples. So the
> > patches in this set save and explicitly use the node_id when needed.
> > 
> > What do you think?
> 
> Sounds to me that you want to ID that data fabric instance which
> logically belongs to one or multiple packages. Or can a DF a single
> package?
> 
> So let's start simple: how does a DF instance map to a logical NUMA
> node or package? Can a DF serve multiple packages?
> 

There's one DF/NB per package and it's a fixed value, i.e. it shouldn't
change based on the NUMA configuration.

Here's an example of a 2-socket Naples system with 4 packages per socket,
set up to have 1 NUMA node. The "node_id" value is the AMD NodeId
from CPUID.

CPU=0 phys_proc_id=0 node_id=0 cpu_to_node()=0
CPU=8 phys_proc_id=0 node_id=1 cpu_to_node()=0
CPU=16 phys_proc_id=0 node_id=2 cpu_to_node()=0
CPU=24 phys_proc_id=0 node_id=3 cpu_to_node()=0
CPU=32 phys_proc_id=1 node_id=4 cpu_to_node()=0
CPU=40 phys_proc_id=1 node_id=5 cpu_to_node()=0
CPU=48 phys_proc_id=1 node_id=6 cpu_to_node()=0
CPU=56 phys_proc_id=1 node_id=7 cpu_to_node()=0
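
To make the relationship in the table explicit, here is a rough user-space
sketch. The helper names and the two constants (8 logical CPUs per AMD node,
4 nodes per socket) are assumptions read off the sample rows above, not
values from the kernel. The real code takes the NodeId from the low byte of
ECX, as in the patch.

```c
#include <stdint.h>

/* Assumed layout of the example system above; not kernel constants. */
#define EXAMPLE_CPUS_PER_NODE    8
#define EXAMPLE_NODES_PER_SOCKET 4

/* Patch 1/8 keeps the low byte of ECX from the topology CPUID leaf as NodeId. */
static inline uint8_t example_node_id_from_ecx(uint32_t ecx)
{
	return ecx & 0xff;
}

/* NodeId expected for a given logical CPU number in the example table. */
static inline unsigned int example_node_id(unsigned int cpu)
{
	return cpu / EXAMPLE_CPUS_PER_NODE;
}

/* phys_proc_id (socket number) expected for a given NodeId in the example. */
static inline unsigned int example_phys_proc_id(unsigned int node_id)
{
	return node_id / EXAMPLE_NODES_PER_SOCKET;
}
```

With these assumptions, CPU 40 lands on node_id 5 and phys_proc_id 1, which
matches the row above. cpu_to_node() stays 0 throughout because it follows
the NUMA configuration rather than the hardware layout.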

> You could use the examples at the end of Documentation/x86/topology.rst
> to explain how those things play together. And remember to not think
> about the physical aspect of the hardware structure because it doesn't
> mean anything to software. All you wanna do is address the proper DF
> instance so this needs to be enumerable and properly represented by sw.
>

Yeah, I think example 4b works here. The mismatch though is with
phys_proc_id and package on AMD systems. You can see above that
phys_proc_id gives a socket number, and the AMD NodeId gives a package
number.

Should we add a note under cpuinfo_x86.phys_proc_id to make this
distinction?

> Confused?
> 
> I am.
> 
> :-)
>

Yeah, me too. :)

Thanks,
Yazen


Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-09 Thread Yazen Ghannam
On Wed, Sep 09, 2020 at 08:06:47PM +0200, Borislav Petkov wrote:
> On Thu, Sep 03, 2020 at 08:01:37PM +0000, Yazen Ghannam wrote:
> > From: Yazen Ghannam 
> > 
> > AMD systems provide a "NodeId" value that represents a global ID
> > indicating to which "Node" a logical CPU belongs. The "Node" is a
> > physical structure equivalent to a Die, and it should not be confused
> > with logical structures like NUMA node.
> 
> So we said in Documentation/x86/topology.rst that:
> 
> "The kernel does not care about the concept of physical sockets because
> a socket has no relevance to software. It's an electromechanical
> component."
> 

Yes, I agree with this.

> Now, you're talking, AFAIU, about physical components. Why do you need
> them?
> 

We need to access specific instances of hardware registers in the
Northbridge or Data Fabric. The code in arch/x86/kernel/amd_nb.c does
this.

> What is then:
> 
>   - cpuinfo_x86.phys_proc_id:
> 
> The physical ID of the package. This information is retrieved via CPUID
> and deduced from the APIC IDs of the cores in the package.
> 
> supposed to mean?
> 

Package = Socket, i.e. a field replaceable unit. Socket may not be
useful for software, but I think it helps users identify the hardware.

I think the following could be changed in the documentation:

"In the past a socket always contained a single package (see below), but
with the advent of Multi Chip Modules (MCM) a socket can hold more than one
package."

Replace "package" with "die".

You take multiple dies from the foundry and you "package" them together
into a single unit.

> Why isn't phys_proc_id != node_id?
> 

They could be equal depending on the system. The values are different on
MCM systems like Bulldozer and Naples though.

The functions and structures in amd_nb.c are indexed by the node_id.
This is done implicitly right now by using amd_get_nb_id()/cpu_llc_id.
But the LLC isn't always equal to the Node/Die like in Naples. So the
patches in this set save and explicitly use the node_id when needed.

What do you think?

Thanks,
Yazen


[PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

AMD systems provide a "NodeId" value that represents a global ID
indicating to which "Node" a logical CPU belongs. The "Node" is a
physical structure equivalent to a Die, and it should not be confused
with logical structures like NUMA node. Logical nodes can be adjusted
based on firmware or other settings whereas the physical nodes/dies are
fixed based on hardware topology.

The NodeId value can be used when a physical ID is needed by software.

Save the AMD NodeId to struct cpuinfo_x86. Use the value from CPUID or
MSR as appropriate. Default to phys_proc_id otherwise. Do so for both
AMD and Hygon systems.

Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no
longer needed.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-2-yazen.ghan...@amd.com

v1 -> v2:
* New patch based on review comment to save value to struct cpuinfo_x86.

 arch/x86/include/asm/cacheinfo.h |  4 ++--
 arch/x86/include/asm/processor.h |  1 +
 arch/x86/kernel/cpu/amd.c| 11 +--
 arch/x86/kernel/cpu/cacheinfo.c  |  6 +++---
 arch/x86/kernel/cpu/hygon.c  | 11 +--
 5 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index 86b63c7feab7..86b2e0dcc4bf 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_CACHEINFO_H
 #define _ASM_X86_CACHEINFO_H
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
-void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
+void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);
 
 #endif /* _ASM_X86_CACHEINFO_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 97143d87994c..a776b7886ec0 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -95,6 +95,7 @@ struct cpuinfo_x86 {
/* CPUID returned core id bits: */
__u8x86_coreid_bits;
__u8cu_id;
+   __u8node_id;
/* Max extended CPUID function supported: */
__u32   extended_cpuid_level;
/* Maximum supported CPUID level, -1=no CPUID: */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index dcc3d943c68f..5eef4cc1e5b7 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -330,7 +330,6 @@ static void legacy_fixup_core_id(struct cpuinfo_x86 *c)
  */
 static void amd_get_topology(struct cpuinfo_x86 *c)
 {
-   u8 node_id;
int cpu = smp_processor_id();
 
/* get information required for multi-node processors */
@@ -340,7 +339,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
 
-   node_id  = ecx & 0xff;
+   c->node_id  = ecx & 0xff;
 
if (c->x86 == 0x15)
c->cu_id = ebx & 0xff;
@@ -360,15 +359,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
if (!err)
c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
-   cacheinfo_amd_init_llc_id(c, cpu, node_id);
+   cacheinfo_amd_init_llc_id(c, cpu);
 
} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
u64 value;
 
rdmsrl(MSR_FAM10H_NODE_ID, value);
-   node_id = value & 7;
+   c->node_id = value & 7;
 
-   per_cpu(cpu_llc_id, cpu) = node_id;
+   per_cpu(cpu_llc_id, cpu) = c->node_id;
} else
return;
 
@@ -393,7 +392,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
/* Convert the initial APIC ID into the socket ID */
c->phys_proc_id = c->initial_apicid >> bits;
/* use socket ID also for last level cache */
-   per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+   per_cpu(cpu_llc_id, cpu) = c->node_id = c->phys_proc_id;
 }
 
 static void amd_detect_ppin(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 57074cf3ad7c..81dfddae4470 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -646,7 +646,7 @@ static int find_num_cache_leaves(struct cpuinfo_x86 *c)
return i;
 }
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id)
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu)
 {
/*
 * We may have multiple LLCs if L3 caches exist, so check if we
@@ -657,7 +657,7 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int 
cpu, u8 node_id)
 
if (c->x86 < 0x17) {
/* LLC is at the nod

[PATCH v2 2/8] x86/CPU/AMD: Remove amd_get_nb_id()

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

The Last Level Cache ID is returned by amd_get_nb_id(). In practice,
this value is the same as the AMD NodeId for callers of this function.
The NodeId is saved in struct cpuinfo_x86.node_id.

Replace calls to amd_get_nb_id() with the logical CPU's node_id and
remove the function.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-2-yazen.ghan...@amd.com

v1 -> v2:
* New patch.

 arch/x86/events/amd/core.c   | 2 +-
 arch/x86/include/asm/processor.h | 2 --
 arch/x86/kernel/amd_nb.c | 4 ++--
 arch/x86/kernel/cpu/amd.c| 6 --
 arch/x86/kernel/cpu/cacheinfo.c  | 2 +-
 arch/x86/kernel/cpu/mce/amd.c| 4 ++--
 arch/x86/kernel/cpu/mce/inject.c | 4 ++--
 drivers/edac/amd64_edac.c| 4 ++--
 drivers/edac/mce_amd.c   | 2 +-
 9 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 39eb276d0277..01b9b943dcf4 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -538,7 +538,7 @@ static void amd_pmu_cpu_starting(int cpu)
if (!x86_pmu.amd_nb_constraints)
return;
 
-   nb_id = amd_get_nb_id(cpu);
+   nb_id = cpu_data(cpu).node_id;
WARN_ON_ONCE(nb_id == BAD_APICID);
 
for_each_online_cpu(i) {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a776b7886ec0..408977a323d3 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -871,10 +871,8 @@ extern int set_tsc_mode(unsigned int val);
 DECLARE_PER_CPU(u64, msr_misc_features_shadow);
 
 #ifdef CONFIG_CPU_SUP_AMD
-extern u16 amd_get_nb_id(int cpu);
 extern u32 amd_get_nodes_per_socket(void);
 #else
-static inline u16 amd_get_nb_id(int cpu)   { return 0; }
 static inline u32 amd_get_nodes_per_socket(void)   { return 0; }
 #endif
 
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 18f6b7c4bd79..2bd8abdbed8e 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -384,7 +384,7 @@ struct resource *amd_get_mmconfig_range(struct resource 
*res)
 
 int amd_get_subcaches(int cpu)
 {
-   struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link;
+   struct pci_dev *link = node_to_amd_nb(cpu_data(cpu).node_id)->link;
unsigned int mask;
 
if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
@@ -398,7 +398,7 @@ int amd_get_subcaches(int cpu)
 int amd_set_subcaches(int cpu, unsigned long mask)
 {
static unsigned int reset, ban;
-   struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu));
+   struct amd_northbridge *nb = node_to_amd_nb(cpu_data(cpu).node_id);
unsigned int reg;
int cuid;
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 5eef4cc1e5b7..846367a69c4a 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -424,12 +424,6 @@ static void amd_detect_ppin(struct cpuinfo_x86 *c)
clear_cpu_cap(c, X86_FEATURE_AMD_PPIN);
 }
 
-u16 amd_get_nb_id(int cpu)
-{
-   return per_cpu(cpu_llc_id, cpu);
-}
-EXPORT_SYMBOL_GPL(amd_get_nb_id);
-
 u32 amd_get_nodes_per_socket(void)
 {
return nodes_per_socket;
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 81dfddae4470..8e34e90bb872 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -580,7 +580,7 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs 
*this_leaf, int index)
if (index < 3)
return;
 
-   node = amd_get_nb_id(smp_processor_id());
+   node = cpu_data(smp_processor_id()).node_id;
this_leaf->nb = node_to_amd_nb(node);
if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)
amd_calc_l3_indices(this_leaf->nb);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 0c6b02dd744c..be96f77004ad 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1341,7 +1341,7 @@ static int threshold_create_bank(struct threshold_bank 
**bp, unsigned int cpu,
return -ENODEV;
 
if (is_shared_bank(bank)) {
-   nb = node_to_amd_nb(amd_get_nb_id(cpu));
+   nb = node_to_amd_nb(cpu_data(cpu).node_id);
 
/* threshold descriptor already initialized on this node? */
if (nb && nb->bank4) {
@@ -1445,7 +1445,7 @@ static void threshold_remove_bank(struct threshold_bank 
*bank)
 * The last CPU on this node using the shared bank is going
 * away, remove that bank now.
 */
-   nb = node_to_amd_nb(amd_get_nb_id(smp_processor_id()));
+   nb = node_to_amd_nb(cpu_data(smp_processor_id()).node_id);
nb->bank4 = NULL;
}
 
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cp

[PATCH v2 7/8] x86/MCE/AMD: Group register reads in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

...so that bitfield extraction can be done together to simplify future
patches.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com

v1 -> v2:
* New patch based on comments for v1 Patch 2.

 arch/x86/kernel/cpu/mce/amd.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 5a18937ff7cd..f5440f8000e9 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -729,11 +729,18 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
goto out_err;
}
 
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, 
&reg_dram_limit_addr))
+   goto out_err;
+
lgcy_mmio_hole_en = get_bit(reg_dram_base_addr, 1);
intlv_num_chan= get_bits(reg_dram_base_addr, 7, 4);
intlv_addr_sel= get_bits(reg_dram_base_addr, 10, 8);
dram_base_addr= get_bits(reg_dram_base_addr, 31, 12) << 28;
 
+   intlv_num_sockets = get_bit(reg_dram_limit_addr, 8);
+   intlv_num_dies= get_bits(reg_dram_limit_addr, 11, 10);
+   dram_limit_addr   = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | 
GENMASK_ULL(27, 0);
+
/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
if (intlv_addr_sel > 3) {
pr_err("%s: Invalid interleave address select %d.\n",
@@ -741,13 +748,6 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
goto out_err;
}
 
-   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, 
&reg_dram_limit_addr))
-   goto out_err;
-
-   intlv_num_sockets = get_bit(reg_dram_limit_addr, 8);
-   intlv_num_dies= get_bits(reg_dram_limit_addr, 11, 10);
-   dram_limit_addr   = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | 
GENMASK_ULL(27, 0);
-
intlv_addr_bit = intlv_addr_sel + 8;
 
/* Re-use intlv_num_chan by setting it equal to log2(#channels) */
-- 
2.25.1
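
For readers following along outside the kernel tree, the DramLimitAddress
extraction grouped in this patch can be sketched roughly as below. This is
only an illustration: the EX_/ex_ names and the struct are hypothetical,
EX_GENMASK_ULL() is a user-space stand-in for the kernel's GENMASK_ULL(),
and the bit positions are the ones used in the diff above.

```c
#include <stdint.h>

/* User-space stand-in for the kernel's GENMASK_ULL(); assumes 63 >= msb >= lsb. */
#define EX_GENMASK_ULL(msb, lsb) \
	((~0ULL >> (63 - (msb))) & (~0ULL << (lsb)))

/* Same shape as get_bits()/get_bit() from patch 5/8, with extra parentheses. */
#define ex_get_bits(x, msb, lsb) (((x) & EX_GENMASK_ULL(msb, lsb)) >> (lsb))
#define ex_get_bit(x, bit)       (((x) >> (bit)) & 1ULL)

/* Hypothetical container for the fields decoded from DramLimitAddress. */
struct ex_dram_limit {
	uint8_t  intlv_num_sockets;
	uint8_t  intlv_num_dies;
	uint64_t dram_limit_addr;
};

static inline struct ex_dram_limit ex_decode_dram_limit(uint32_t reg)
{
	struct ex_dram_limit d;

	d.intlv_num_sockets = ex_get_bit(reg, 8);
	d.intlv_num_dies    = ex_get_bits(reg, 11, 10);
	/* Register bits [31:12] hold address bits [47:28]; the low 28 bits are all ones. */
	d.dram_limit_addr   = (ex_get_bits(reg, 31, 12) << 28) | EX_GENMASK_ULL(27, 0);
	return d;
}
```

Reading the register once and decoding all of its fields in one place is what
lets the later patch switch the field layout per Data Fabric version without
moving the register reads around.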



[PATCH v2 6/8] x86/MCE/AMD: Drop tmp variable in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

Remove the "tmp" variable used to save register values. Save the values
in existing variables, if possible.

The register values are 32 bits. Use separate "reg_" variables to hold
the register values if the existing variable sizes don't match, or if
no bitfields in a register share the same name as the register.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com

v1 -> v2:
* New patch based on comments for v1 Patch 2.

 arch/x86/kernel/cpu/mce/amd.c | 56 +++
 1 file changed, 30 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 90c3ad61ae19..5a18937ff7cd 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -688,11 +688,14 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 
 int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 {
-   u64 dram_base_addr, dram_limit_addr, dram_hole_base;
/* We start from the normalized address */
u64 ret_addr = norm_addr;
 
-   u32 tmp;
+   u64 dram_base_addr, dram_limit_addr;
+   u32 dram_hole_base;
+
+   u32 reg_dram_base_addr, reg_dram_limit_addr;
+   u32 reg_dram_offset;
 
u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask;
u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets;
@@ -702,12 +705,12 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
u8 cs_mask, cs_id = 0;
bool hash_enabled = false;
 
-   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, &tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, 
&reg_dram_offset))
goto out_err;
 
/* Remove HiAddrOffset from normalized address, if enabled: */
-   if (tmp & BIT(0)) {
-   u64 hi_addr_offset = get_bits(tmp, 31, 20) << 28;
+   if (reg_dram_offset & BIT(0)) {
+   u64 hi_addr_offset = get_bits(reg_dram_offset, 31, 20) << 28;
 
/* Check if base 1 is used. */
if (norm_addr >= hi_addr_offset) {
@@ -716,20 +719,20 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
}
}
 
-   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMBASEADDR + (8 * base), umc, 
&tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMBASEADDR + (8 * base), umc, 
&reg_dram_base_addr))
goto out_err;
 
/* Check if address range is valid. */
-   if (!(tmp & BIT(0))) {
+   if (!(reg_dram_base_addr & BIT(0))) {
pr_err("%s: Invalid DramBaseAddress range: 0x%x.\n",
-   __func__, tmp);
+   __func__, reg_dram_base_addr);
goto out_err;
}
 
-   lgcy_mmio_hole_en = get_bit(tmp, 1);
-   intlv_num_chan= get_bits(tmp, 7, 4);
-   intlv_addr_sel= get_bits(tmp, 10, 8);
-   dram_base_addr= get_bits(tmp, 31, 12) << 28;
+   lgcy_mmio_hole_en = get_bit(reg_dram_base_addr, 1);
+   intlv_num_chan= get_bits(reg_dram_base_addr, 7, 4);
+   intlv_addr_sel= get_bits(reg_dram_base_addr, 10, 8);
+   dram_base_addr= get_bits(reg_dram_base_addr, 31, 12) << 28;
 
/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
if (intlv_addr_sel > 3) {
@@ -738,12 +741,12 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
goto out_err;
}
 
-   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, 
&tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, 
&reg_dram_limit_addr))
goto out_err;
 
-   intlv_num_sockets = get_bit(tmp, 8);
-   intlv_num_dies= get_bits(tmp, 11, 10);
-   dram_limit_addr   = (get_bits(tmp, 31, 12) << 28) | GENMASK_ULL(27, 0);
+   intlv_num_sockets = get_bit(reg_dram_limit_addr, 8);
+   intlv_num_dies= get_bits(reg_dram_limit_addr, 11, 10);
+   dram_limit_addr   = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | 
GENMASK_ULL(27, 0);
 
intlv_addr_bit = intlv_addr_sel + 8;
 
@@ -786,17 +789,18 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
 
if (num_intlv_bits > 0) {
u64 temp_addr_x, temp_addr_i, temp_addr_y;
-   u8 die_id_bit, sock_id_bit, cs_fabric_id;
+   u32 reg_sys_fabric_id, cs_fabric_id;
+   u8 die_id_bit, sock_id_bit;
 
/*
 * This is the fabric id for this coherent slave. Use
 * umc/channel# as instance id of the coherent slave
 * for FICAA.
 */
-   if (amd_df_indirect_read(nid, 0, DF_F0_FABRICINSTINFO3, umc, 
&tmp))
+  

[PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-03 Thread Yazen Ghannam
From: Muralidhara M K 

Add support for new memory interleaving modes used in current AMD systems.

Check if the system is using a current Data Fabric version or a legacy
version as some bit and register definitions have changed.

Tested on AMD reference platforms with the following memory interleaving
options.

Naples
- None
- Channel
- Die
- Socket

Rome (NPS = Nodes per Socket)
- None
- NPS0
- NPS1
- NPS2
- NPS4

The fixes tag refers to the commit that allows amd64_edac_mod to load on
Rome systems. The module may report incorrect system addresses on
Rome systems depending on the interleaving option used.

Fixes: 6e846239e548 ("EDAC/amd64: Add Family 17h Model 30h PCI IDs")
Signed-off-by: Muralidhara M K 
Co-developed-by: Naveen Krishna Chtradhi 
Signed-off-by: Naveen Krishna Chtradhi 
Co-developed-by: Yazen Ghannam 
Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com

v1 -> v2:
* Rebased on cleanup patches.
* Save and use the Data Fabric version.
* Reorder code to execute non-legacy flows first. This change wasn't
  made to the section with the "hashed_bit" calculation, since the
  current flow reads easier IMHO.

 arch/x86/kernel/cpu/mce/amd.c | 222 ++
 1 file changed, 172 insertions(+), 50 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index f5440f8000e9..c14076bcabf2 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -683,8 +683,10 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 #define DF_F0_DRAMBASEADDR 0x110
 #define DF_F0_DRAMLIMITADDR0x114
 #define DF_F0_DRAMOFFSET   0x1B4
+#define DF_F0_DFGLOBALCTRL 0x3F8
 
 #define DF_F1_SYSFABRICID  0x208
+#define DF_F1_SYSFABRICID1 0x20C
 
 int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 {
@@ -695,22 +697,30 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
u32 dram_hole_base;
 
u32 reg_dram_base_addr, reg_dram_limit_addr;
-   u32 reg_dram_offset;
+   u32 reg_dram_offset, reg_sys_fabric_id;
+
+   bool hash_enabled = false, split_normalized = false;
 
-   u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask;
u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets;
-   u8 intlv_addr_sel, intlv_addr_bit;
-   u8 num_intlv_bits, hashed_bit;
+   u8 intlv_addr_sel, intlv_addr_bit, num_intlv_bits;
+   u8 cs_mask, cs_id = 0, dst_fabric_id = 0;
u8 lgcy_mmio_hole_en, base = 0;
-   u8 cs_mask, cs_id = 0;
-   bool hash_enabled = false;
+   u8 df_version;
+
+   if (amd_df_indirect_read(nid, 1, DF_F1_SYSFABRICID, umc, 
&reg_sys_fabric_id))
+   goto out_err;
+
+   df_version = (reg_sys_fabric_id & 0xFF) ? 3 : 2;
 
if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, 
&reg_dram_offset))
goto out_err;
 
/* Remove HiAddrOffset from normalized address, if enabled: */
if (reg_dram_offset & BIT(0)) {
-   u64 hi_addr_offset = get_bits(reg_dram_offset, 31, 20) << 28;
+   u8 hi_addr_offset_lsb = (df_version >= 3) ? 12 : 20;
+   u64 hi_addr_offset = get_bits(reg_dram_offset, 31, 
hi_addr_offset_lsb);
+
+   hi_addr_offset <<= 28;
 
/* Check if base 1 is used. */
if (norm_addr >= hi_addr_offset) {
@@ -733,19 +743,23 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
goto out_err;
 
lgcy_mmio_hole_en = get_bit(reg_dram_base_addr, 1);
-   intlv_num_chan= get_bits(reg_dram_base_addr, 7, 4);
-   intlv_addr_sel= get_bits(reg_dram_base_addr, 10, 8);
dram_base_addr= get_bits(reg_dram_base_addr, 31, 12) << 28;
-
-   intlv_num_sockets = get_bit(reg_dram_limit_addr, 8);
-   intlv_num_dies= get_bits(reg_dram_limit_addr, 11, 10);
dram_limit_addr   = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | 
GENMASK_ULL(27, 0);
 
-   /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
-   if (intlv_addr_sel > 3) {
-   pr_err("%s: Invalid interleave address select %d.\n",
-   __func__, intlv_addr_sel);
-   goto out_err;
+   if (df_version >= 3) {
+   intlv_num_chan= get_bits(reg_dram_base_addr, 5, 2);
+   intlv_num_dies= get_bits(reg_dram_base_addr, 7, 6);
+   intlv_num_sockets = get_bit(reg_dram_base_addr, 8);
+   intlv_addr_sel= get_bits(reg_dram_base_addr, 11, 9);
+
+   dst_fabric_id = get_bits(reg_dram_limit_addr, 9, 0);
+   } else {
+   intlv_num_chan= get_bits(reg_dram_base_addr, 7, 4);
+   intlv_addr_sel= get_bits(reg_dram_base_addr, 10, 8);
+
+   dst_fabric_id 

[PATCH v2 5/8] x86/MCE/AMD: Use macros to get bitfields in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

Define macros to get individual bits and bitfields. Use these to make
the code more readable.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com

v1 -> v2:
* New patch based on comments for v1 Patch 2.

 arch/x86/kernel/cpu/mce/amd.c | 46 +--
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 1e0510fd5afc..90c3ad61ae19 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -675,6 +675,9 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
deferred_error_interrupt_enable(c);
 }
 
+#define get_bits(x, msb, lsb)  ((x & GENMASK_ULL(msb, lsb)) >> lsb)
+#define get_bit(x, bit)((x >> bit) & BIT(0))
+
 #define DF_F0_FABRICINSTINFO3  0x50
 #define DF_F0_MMIOHOLE 0x104
 #define DF_F0_DRAMBASEADDR 0x110
@@ -704,7 +707,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, 
u64 *sys_addr)
 
/* Remove HiAddrOffset from normalized address, if enabled: */
if (tmp & BIT(0)) {
-   u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8;
+   u64 hi_addr_offset = get_bits(tmp, 31, 20) << 28;
 
/* Check if base 1 is used. */
if (norm_addr >= hi_addr_offset) {
@@ -723,10 +726,10 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
goto out_err;
}
 
-   lgcy_mmio_hole_en = tmp & BIT(1);
-   intlv_num_chan= (tmp >> 4) & 0xF;
-   intlv_addr_sel= (tmp >> 8) & 0x7;
-   dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16;
+   lgcy_mmio_hole_en = get_bit(tmp, 1);
+   intlv_num_chan= get_bits(tmp, 7, 4);
+   intlv_addr_sel= get_bits(tmp, 10, 8);
+   dram_base_addr= get_bits(tmp, 31, 12) << 28;
 
/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
if (intlv_addr_sel > 3) {
@@ -738,9 +741,9 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, 
u64 *sys_addr)
if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, 
&tmp))
goto out_err;
 
-   intlv_num_sockets = (tmp >> 8) & 0x1;
-   intlv_num_dies= (tmp >> 10) & 0x3;
-   dram_limit_addr   = ((tmp & GENMASK_ULL(31, 12)) << 16) | 
GENMASK_ULL(27, 0);
+   intlv_num_sockets = get_bit(tmp, 8);
+   intlv_num_dies= get_bits(tmp, 11, 10);
+   dram_limit_addr   = (get_bits(tmp, 31, 12) << 28) | GENMASK_ULL(27, 0);
 
intlv_addr_bit = intlv_addr_sel + 8;
 
@@ -793,7 +796,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, 
u64 *sys_addr)
if (amd_df_indirect_read(nid, 0, DF_F0_FABRICINSTINFO3, umc, 
&tmp))
goto out_err;
 
-   cs_fabric_id = (tmp >> 8) & 0xFF;
+   cs_fabric_id = get_bits(tmp, 15, 8);
die_id_bit   = 0;
 
/* If interleaved over more than 1 channel: */
@@ -812,16 +815,16 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
/* If interleaved over more than 1 die. */
if (intlv_num_dies) {
sock_id_bit  = die_id_bit + intlv_num_dies;
-   die_id_shift = (tmp >> 24) & 0xF;
-   die_id_mask  = (tmp >> 8) & 0xFF;
+   die_id_shift = get_bits(tmp, 27, 24);
+   die_id_mask  = get_bits(tmp, 15, 8);
 
cs_id |= ((cs_fabric_id & die_id_mask) >> die_id_shift) 
<< die_id_bit;
}
 
/* If interleaved over more than 1 socket. */
if (intlv_num_sockets) {
-   socket_id_shift = (tmp >> 28) & 0xF;
-   socket_id_mask  = (tmp >> 16) & 0xFF;
+   socket_id_shift = get_bits(tmp, 31, 28);
+   socket_id_mask  = get_bits(tmp, 23, 16);
 
cs_id |= ((cs_fabric_id & socket_id_mask) >> 
socket_id_shift) << sock_id_bit;
}
@@ -834,7 +837,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, 
u64 *sys_addr)
 * bits there are. "intlv_addr_bit" tells us how many "Y" bits
 * there are (where "I" starts).
 */
-   temp_addr_y = ret_addr & GENMASK_ULL(intlv_addr_bit-1, 0);
+   temp_addr_y = get_bits(ret_addr, intlv_addr_bit-1, 0);
temp_addr_i = (cs_id << intlv_addr_bit);
temp_addr_x = (ret_addr & GENMASK_ULL(63, intlv_addr_bit)) << 
num_intlv

[PATCH v2 4/8] x86/MCE/AMD: Use defines for register addresses in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

Replace raw register offset values in the AMD address translation code
with named definitions.

Also, drop comments that only note the register names.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com

v1 -> v2:
* New patch based on comments for v1 Patch 2.

 arch/x86/kernel/cpu/mce/amd.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index be96f77004ad..1e0510fd5afc 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -675,6 +675,14 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
deferred_error_interrupt_enable(c);
 }
 
+#define DF_F0_FABRICINSTINFO3  0x50
+#define DF_F0_MMIOHOLE 0x104
+#define DF_F0_DRAMBASEADDR 0x110
+#define DF_F0_DRAMLIMITADDR0x114
+#define DF_F0_DRAMOFFSET   0x1B4
+
+#define DF_F1_SYSFABRICID  0x208
+
 int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 {
u64 dram_base_addr, dram_limit_addr, dram_hole_base;
@@ -691,22 +699,21 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 
umc, u64 *sys_addr)
u8 cs_mask, cs_id = 0;
bool hash_enabled = false;
 
-   /* Read D18F0x1B4 (DramOffset), check if base 1 is used. */
-   if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, &tmp))
goto out_err;
 
/* Remove HiAddrOffset from normalized address, if enabled: */
if (tmp & BIT(0)) {
u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8;
 
+   /* Check if base 1 is used. */
if (norm_addr >= hi_addr_offset) {
ret_addr -= hi_addr_offset;
base = 1;
}
}
 
-   /* Read D18F0x110 (DramBaseAddress). */
-   if (amd_df_indirect_read(nid, 0, 0x110 + (8 * base), umc, &tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMBASEADDR + (8 * base), umc, 
&tmp))
goto out_err;
 
/* Check if address range is valid. */
@@ -728,8 +735,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, 
u64 *sys_addr)
goto out_err;
}
 
-   /* Read D18F0x114 (DramLimitAddress). */
-   if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &tmp))
goto out_err;
 
intlv_num_sockets = (tmp >> 8) & 0x1;
@@ -780,12 +786,11 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
u8 die_id_bit, sock_id_bit, cs_fabric_id;
 
/*
-* Read FabricBlockInstanceInformation3_CS[BlockFabricID].
 * This is the fabric id for this coherent slave. Use
 * umc/channel# as instance id of the coherent slave
 * for FICAA.
 */
-   if (amd_df_indirect_read(nid, 0, 0x50, umc, &tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_FABRICINSTINFO3, umc, &tmp))
goto out_err;
 
cs_fabric_id = (tmp >> 8) & 0xFF;
@@ -800,9 +805,8 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 
sock_id_bit = die_id_bit;
 
-   /* Read D18F1x208 (SystemFabricIdMask). */
if (intlv_num_dies || intlv_num_sockets)
-   if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp))
+   if (amd_df_indirect_read(nid, 1, DF_F1_SYSFABRICID, umc, &tmp))
goto out_err;
 
/* If interleaved over more than 1 die. */
@@ -841,7 +845,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 
/* If legacy MMIO hole enabled */
if (lgcy_mmio_hole_en) {
-   if (amd_df_indirect_read(nid, 0, 0x104, umc, &tmp))
+   if (amd_df_indirect_read(nid, 0, DF_F0_MMIOHOLE, umc, &tmp))
goto out_err;
 
dram_hole_base = tmp & GENMASK(31, 24);
-- 
2.25.1



[PATCH v2 0/8] AMD MCA Address Translation Updates

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

This patchset includes updates for the MCA Address Translation process
on recent AMD systems.

Patches 1 & 3:
Fixes an input to the address translation function. The translation
requires a physical Die ID (NodeId in AMD documentation) rather than a
logical NUMA node ID. This is because the physical and logical nodes
may not always match.

Patch 2:
Removes a function that is no longer needed with Patch 1.

Patches 4-7:
Code cleanup in preparation for Patch 8.

Patch 8:
Add translation support for new memory interleaving options available in
Rome systems. The patch is based on the latest AMD reference code for
the address translation.

Patches 6-8 have checkpatch warnings about long lines, but I kept the
long lines for readability.

Thanks,
Yazen

Link:
https://lkml.kernel.org/r/20200814191449.183998-1-yazen.ghan...@amd.com

v1 -> v2:
* Save the AMD NodeId value in struct cpuinfo_x86 rather than use a
  local value in MCA code.
* Include code cleanup for AMD MCA Address Translation function before
  adding new functionality.

Muralidhara M K (1):
  x86/MCE/AMD Support new memory interleaving modes during address
translation

Yazen Ghannam (7):
  x86/CPU/AMD: Save NodeId on AMD-based systems
  x86/CPU/AMD: Remove amd_get_nb_id()
  EDAC/mce_amd: Use struct cpuinfo_x86.node_id for NodeId
  x86/MCE/AMD: Use defines for register addresses in translation code
  x86/MCE/AMD: Use macros to get bitfields in translation code
  x86/MCE/AMD: Drop tmp variable in translation code
  x86/MCE/AMD: Group register reads in translation code

 arch/x86/events/amd/core.c   |   2 +-
 arch/x86/include/asm/cacheinfo.h |   4 +-
 arch/x86/include/asm/processor.h |   3 +-
 arch/x86/kernel/amd_nb.c |   4 +-
 arch/x86/kernel/cpu/amd.c|  17 +-
 arch/x86/kernel/cpu/cacheinfo.c  |   8 +-
 arch/x86/kernel/cpu/hygon.c  |  11 +-
 arch/x86/kernel/cpu/mce/amd.c| 284 ++-
 arch/x86/kernel/cpu/mce/inject.c |   4 +-
 drivers/edac/amd64_edac.c|   4 +-
 drivers/edac/mce_amd.c   |   4 +-
 11 files changed, 233 insertions(+), 112 deletions(-)

-- 
2.25.1



[PATCH v2 3/8] EDAC/mce_amd: Use struct cpuinfo_x86.node_id for NodeId

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam 

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact. But the
NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.node_id for the node_id parameter to ensure the
physical ID is used.

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200814191449.183998-2-yazen.ghan...@amd.com

v1 -> v2:
* Redo based on change in Patch 1.

 drivers/edac/mce_amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index ac9bd74c92cd..91b5e3e0744e 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1003,7 +1003,7 @@ static void decode_smca_error(struct mce *m)
pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-   decode_dram_ecc(cpu_to_node(m->extcpu), m);
+   decode_dram_ecc(cpu_data(m->extcpu).node_id, m);
 }
 
 static inline void amd_decode_err_code(u16 ec)
-- 
2.25.1



Re: [PATCH v2 1/2] cper, apei, mce: Pass x86 CPER through the MCA handling chain

2020-09-01 Thread Yazen Ghannam
On Fri, Aug 28, 2020 at 03:33:31PM -0500, Smita Koralahalli wrote:
...
> +int apei_mce_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
> +{
> + const u64 *i_mce = ((const void *) (ctx_info + 1));
> + unsigned int cpu;
> + struct mce m;
> +
> + if (!boot_cpu_has(X86_FEATURE_SMCA))
> + return -EINVAL;
> +

This function is called on any context type, but it can only decode
"MSR" types that follow the MCAX register layout used on Scalable MCA
systems.

So I think there should be a couple of checks added:
1) Context type is "MSR".
2) Register layout follows what is expected below. There's no explicit
way to do this, since the data is implementation-specific. But at least
there can be a check that the starting MSR address matches the first
expected register: Bank's MCA_STATUS in MCAX space (0xC0002XX1).

For example:

(ctx_info->msr_addr & 0xC0002001) == 0xC0002001

The raw value in the example should be defined with a name.
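
A minimal sketch of such a named check, written as standalone C rather than
kernel code. The names MCAX_STATUS_MSR_BITS, is_mcax_status_msr() and
mcax_msr_to_bank() are hypothetical and do not come from the patch. The mask
only tests that the listed bits are set, so this should be read as a sanity
check, not as a full validation of the context layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical name for the raw value in the example above: the bits that
 * are set in a bank's MCA_STATUS address in MCAX space (0xC0002XX1).
 */
#define MCAX_STATUS_MSR_BITS 0xC0002001u

/* True if every bit of the expected pattern is set in the MSR address. */
static bool is_mcax_status_msr(uint32_t msr_addr)
{
	return (msr_addr & MCAX_STATUS_MSR_BITS) == MCAX_STATUS_MSR_BITS;
}

/* Bank number, using the same shift and mask as the quoted patch. */
static unsigned int mcax_msr_to_bank(uint32_t msr_addr)
{
	/* Precondition: the caller has already validated the address. */
	assert(is_mcax_status_msr(msr_addr));
	return (msr_addr >> 4) & 0xFF;
}
```

With these helpers, the context would be rejected before any of the register
values are copied into struct mce.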

> + mce_setup(&m);
> +
> + m.extcpu = -1;
> + m.socketid = -1;
> +
> + for_each_possible_cpu(cpu) {
> + if (cpu_data(cpu).initial_apicid == lapic_id) {
> + m.extcpu = cpu;
> + m.socketid = cpu_data(m.extcpu).phys_proc_id;
> + break;
> + }
> + }
> +
> + m.apicid = lapic_id;
> + m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
> + m.status = *i_mce;
> + m.addr = *(i_mce + 1);
> + m.misc = *(i_mce + 2);
> + /* Skipping MCA_CONFIG */
> + m.ipid = *(i_mce + 4);
> + m.synd = *(i_mce + 5);
> +
> + mce_log(&m);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(apei_mce_report_x86_error);
> +

Thanks,
Yazen


[PATCH v2] x86/mce: Increase maximum number of banks to 64

2020-08-28 Thread Yazen Ghannam
From: Akshay Gupta 

...because future AMD systems will support up to 64 MCA banks per CPU.

MAX_NR_BANKS is used to allocate a number of data structures, and it is
used as a ceiling for values read from MCG_CAP[Count]. Therefore, this
change will have no functional effect on existing systems with 32 or
fewer MCA banks per CPU.

However, this will increase the size of the following structures.

Global bitmaps:
- core.c / mce_banks_ce_disabled
- core.c / all_banks
- core.c / valid_banks
- core.c / toclear
- Total: 32 new bits * 4 bitmaps = 16 new bytes

Per-CPU bitmaps:
- core.c / mce_poll_banks
- intel.c / mce_banks_owned
- Total: 32 new bits * 2 bitmaps = 8 new bytes

The bitmaps are arrays of longs. So this change will only affect 32-bit
execution, since there will be one additional long used. There will be
no additional memory use on 64-bit execution, because the size of long
is 64 bits.

Global structs:
- amd.c / struct smca_bank smca_banks[]: 16 bytes per bank
- core.c / struct mce_bank_dev mce_bank_devs[]: 56 bytes per bank
- Total: 32 new banks * (16 + 56) bytes = 2304 new bytes

Per-CPU structs:
- core.c / struct mce_bank mce_banks_array[]: 16 bytes per bank
- Total: 32 new banks * 16 bytes = 512 new bytes

32-bit
Total global size increase: 2320 bytes
Total per-CPU size increase: 520 bytes

64-bit
Total global size increase: 2304 bytes
Total per-CPU size increase: 512 bytes

This additional memory should still fit within the existing .data
section of the kernel binary. However, in the case where it doesn't fit,
an additional page (4kB) of memory will be added to the binary to
accommodate the extra data.

Signed-off-by: Akshay Gupta 
[ Adjust commit message and code comment. ]
Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20200820170624.1855825-1-yazen.ghan...@amd.com

v1->v2:
* Update commit message with discussion details from review.

 arch/x86/include/asm/mce.h | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 6adced6e7dd3..109af5c7f515 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -200,12 +200,8 @@ void mce_setup(struct mce *m);
 void mce_log(struct mce *m);
 DECLARE_PER_CPU(struct device *, mce_device);
 
-/*
- * Maximum banks number.
- * This is the limit of the current register layout on
- * Intel CPUs.
- */
-#define MAX_NR_BANKS 32
+/* Maximum number of MCA banks per CPU. */
+#define MAX_NR_BANKS 64
 
 #ifdef CONFIG_X86_MCE_INTEL
 void mce_intel_feature_init(struct cpuinfo_x86 *c);
-- 
2.25.1



Re: [PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-24 Thread Yazen Ghannam
On Thu, Aug 20, 2020 at 06:15:15PM +0000, Luck, Tony wrote:
> >> How much does vmlinux size grow with your change?
> >>
> >
> > It seems to get smaller.
> >
> > -rwxrwxr-x   1 yghannam yghannam 807634088 Aug 20 17:51 vmlinux-32banks
> > -rwxrwxr-x   1 yghannam yghannam 807634072 Aug 20 17:50 vmlinux-64banks
> 
> You need to run:
> 
> $ size vmlinux
>     text     data      bss      dec     hex filename
> 20334755 12569682 14798924 47703361 2d7e541 vmlinux
> 
> Likely the extra space is added to the third element ("bss"). That doesn't show
> up in the vmlinux file, but does add to memory footprint while running.

Thanks. Yeah, they're identical:
    text     data     bss      dec     hex filename
15710076 13519306 5398528 34627910 2106146 vmlinux-32banks
15710076 13519306 5398528 34627910 2106146 vmlinux-64banks

I did a quick audit of the statically allocated data structures which
use MAX_NR_BANKS.

Global bitmaps:
- core.c / mce_banks_ce_disabled
- core.c / all_banks
- core.c / valid_banks
- core.c / toclear
- Total: 32 new bits * 4 bitmaps = 16 new bytes

Per-CPU bitmaps:
- core.c / mce_poll_banks
- intel.c / mce_banks_owned
- Total: 32 new bits * 2 bitmaps = 8 new bytes

The bitmaps are arrays of longs. So this change will only affect 32-bit
execution (I assume), since there will be one additional long used.
There will be no additional memory use on 64-bit execution, because the
size of long is 64 bits.
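
The rounding behind those numbers can be sketched in standalone C. The helper
names bitmap_bytes() and bitmap_growth() are made up for illustration; the
rounding is meant to mirror what the kernel's BITS_TO_LONGS() does, with
sizeof(long) passed in explicitly so both the 32-bit and 64-bit cases can be
checked on one machine.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes used by a bitmap of nr_bits when it is stored as an array of longs
 * that are long_bytes wide. The bit count is rounded up to whole longs.
 */
static size_t bitmap_bytes(size_t nr_bits, size_t long_bytes)
{
	size_t bits_per_long = long_bytes * 8;

	/* Only the 32-bit and 64-bit cases are considered here. */
	assert(long_bytes == 4 || long_bytes == 8);
	return ((nr_bits + bits_per_long - 1) / bits_per_long) * long_bytes;
}

/* Extra bytes needed when nr_bitmaps bitmaps grow from old_bits to new_bits. */
static size_t bitmap_growth(size_t old_bits, size_t new_bits,
			    size_t long_bytes, size_t nr_bitmaps)
{
	return nr_bitmaps * (bitmap_bytes(new_bits, long_bytes) -
			     bitmap_bytes(old_bits, long_bytes));
}
```

For example, bitmap_growth(32, 64, 4, 4) gives the 16 bytes listed for the
global bitmaps on 32-bit, while bitmap_growth(32, 64, 8, 4) gives 0 on 64-bit.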

Global structs:
- amd.c / struct smca_bank smca_banks[]: 16 bytes per bank
- core.c / struct mce_bank_dev mce_bank_devs[]: 56 bytes per bank
- Total: 32 new banks * (16 + 56) bytes = 2304 new bytes

Per-CPU structs:
- core.c / struct mce_bank mce_banks_array[]: 16 bytes per bank
- Total: 32 new banks * 16 bytes = 512 new bytes

32-bit
Total global size increase: 2320 bytes
Total per-CPU size increase: 520 bytes

64-bit
Total global size increase: 2304 bytes
Total per-CPU size increase: 512 bytes

Is this okay?

Thanks,
Yazen


Re: [PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-20 Thread Yazen Ghannam
On Thu, Aug 20, 2020 at 07:15:18PM +0200, Borislav Petkov wrote:
> On Thu, Aug 20, 2020 at 05:06:24PM +0000, Yazen Ghannam wrote:
> > From: Akshay Gupta 
> > 
> > ...because future AMD systems will support up to 64 MCA banks per CPU.
> > 
> > MAX_NR_BANKS is used to allocate a number of data structures, and it is
> > used as a ceiling for values read from MCG_CAP[Count]. Therefore, this
> > change will have no functional effect on existing systems with 32 or
> > fewer MCA banks per CPU.
> 
> Of course it will, grep for MAX_NR_BANKS and look at all those bitmaps
> and arrays which get defined with MAX_NR_BANKS size. With your change,
> they will double in size.
> 
> How much does vmlinux size grow with your change?
>

It seems to get smaller.

-rwxrwxr-x   1 yghannam yghannam 807634088 Aug 20 17:51 vmlinux-32banks
-rwxrwxr-x   1 yghannam yghannam 807634072 Aug 20 17:50 vmlinux-64banks

Any ideas? Maybe there's some alignment change? Or a build issue on my
end?

Thanks,
Yazen


[PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-20 Thread Yazen Ghannam
From: Akshay Gupta 

...because future AMD systems will support up to 64 MCA banks per CPU.

MAX_NR_BANKS is used to allocate a number of data structures, and it is
used as a ceiling for values read from MCG_CAP[Count]. Therefore, this
change will have no functional effect on existing systems with 32 or
fewer MCA banks per CPU.

Signed-off-by: Akshay Gupta 
[ Adjust commit message and code comment. ]
Signed-off-by: Yazen Ghannam 
---
 arch/x86/include/asm/mce.h | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 6adced6e7dd3..109af5c7f515 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -200,12 +200,8 @@ void mce_setup(struct mce *m);
 void mce_log(struct mce *m);
 DECLARE_PER_CPU(struct device *, mce_device);
 
-/*
- * Maximum banks number.
- * This is the limit of the current register layout on
- * Intel CPUs.
- */
-#define MAX_NR_BANKS 32
+/* Maximum number of MCA banks per CPU. */
+#define MAX_NR_BANKS 64
 
 #ifdef CONFIG_X86_MCE_INTEL
 void mce_intel_feature_init(struct cpuinfo_x86 *c);
-- 
2.25.1



[tip: ras/core] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap

2020-08-20 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the ras/core branch of tip:

Commit-ID: 368d1887200d68075c064a62a9aa191168cf1eed
Gitweb:
https://git.kernel.org/tip/368d1887200d68075c064a62a9aa191168cf1eed
Author:Yazen Ghannam 
AuthorDate:Mon, 20 Jul 2020 14:53:53 
Committer: Borislav Petkov 
CommitterDate: Thu, 20 Aug 2020 10:34:38 +02:00

x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap

The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
was intended to be used by the kernel to filter out invalid error codes
on a system. However, this is unnecessary after a few product releases
because the hardware will only report valid error codes. Thus, there's
no need for it with future systems.

Remove the xec_bitmap field and all references to it.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/20200720145353.43924-1-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h|  1 +-
 arch/x86/kernel/cpu/mce/amd.c | 44 +-
 drivers/edac/mce_amd.c|  4 +---
 3 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index cf50382..6adced6 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -328,7 +328,6 @@ enum smca_bank_types {
 struct smca_hwid {
unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */
u32 hwid_mcatype;   /* (hwid,mcatype) tuple */
-   u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */
u8 count;   /* Number of instances. */
 };
 
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 99be063..0c6b02d 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -132,49 +132,49 @@ static enum smca_bank_types smca_get_bank_type(unsigned int bank)
 }
 
 static struct smca_hwid smca_hwid_mcatypes[] = {
-   /* { bank_type, hwid_mcatype, xec_bitmap } */
+   /* { bank_type, hwid_mcatype } */
 
/* Reserved type */
-   { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 },
+   { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0)},
 
/* ZN Core (HWID=0xB0) MCA types */
-   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0), 0x1F },
-   { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10), 0xFF },
-   { SMCA_IF,   HWID_MCATYPE(0xB0, 0x1), 0x3FFF },
-   { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF },
-   { SMCA_DE,   HWID_MCATYPE(0xB0, 0x3), 0x1FF },
+   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0)},
+   { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10)   },
+   { SMCA_IF,   HWID_MCATYPE(0xB0, 0x1)},
+   { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2)},
+   { SMCA_DE,   HWID_MCATYPE(0xB0, 0x3)},
/* HWID 0xB0 MCATYPE 0x4 is Reserved */
-   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5), 0xFFF },
-   { SMCA_FP,   HWID_MCATYPE(0xB0, 0x6), 0x7F },
-   { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF },
+   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5)},
+   { SMCA_FP,   HWID_MCATYPE(0xB0, 0x6)},
+   { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7)},
 
/* Data Fabric MCA types */
-   { SMCA_CS,   HWID_MCATYPE(0x2E, 0x0), 0x1FF },
-   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1), 0x1F },
-   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF },
+   { SMCA_CS,   HWID_MCATYPE(0x2E, 0x0)},
+   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1)},
+   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2)},
 
/* Unified Memory Controller MCA type */
-   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0), 0xFF },
+   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0)},
 
/* Parameter Block MCA type */
-   { SMCA_PB,   HWID_MCATYPE(0x05, 0x0), 0x1 },
+   { SMCA_PB,   HWID_MCATYPE(0x05, 0x0)},
 
/* Platform Security Processor MCA type */
-   { SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0), 0x1 },
-   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1), 0x3 },
+   { SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0)},
+   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1)},
 
/* System Management Unit MCA type */
-   { SMCA_SMU,  HWID_MCATYPE(0x01, 0x0), 0x1 },
-   { SMCA_SMU_V2,   HWID_MCATYPE(0x01, 0x1), 0x7FF },
+   { SMCA_SMU,  HWID_MCATYPE(0x01, 0x0)},
+   { SMCA_SMU_V2,   HWID_MCATYPE(0x01, 0x1)},
 
/* Microprocessor 5 Unit MCA type */
-   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2), 0x3FF },
+   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2)},
 
/* Northbridge IO Unit MCA type */
-   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F },
+   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0)},
 
/* PCI Express Unit MCA type */
-   { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0),

Re: [PATCH 2/2] x86/MCE/AMD Support new memory interleaving schemes during address translation

2020-08-18 Thread Yazen Ghannam
On Sat, Aug 15, 2020 at 11:13:36AM +0200, Ingo Molnar wrote:
> 
> * Yazen Ghannam  wrote:
> 
> > + /* Read D18F1x208 (System Fabric ID Mask 0). */
> > + if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp))
> > + goto out_err;
> > +
> > + /* Determine if system is a legacy Data Fabric type. */
> > + legacy_df = !(tmp & 0xFF);
> 
> 1)
> 
> I see this pattern in a lot of places in the code, first the magic 
> constant 0x208 is explained a comment, then it is *repeated* and used 
> it in the code...
> 
> How about introducing an obviously named enum for it instead, which 
> would then be self-documenting, saving the comment and removing magic 
> numbers:
> 
>   if (amd_df_indirect_read(nid, 1, AMD_REG_FAB_ID, umc, &reg_fab_id))
>   goto out_err;
> 
> (The symbolic name should be something better, I just guessed 
> something quickly.)
> 
> Please clean this up in a separate patch, not part of the already 
> large patch that introduces a new feature.
>

Okay, will do.

> 2)
> 
> 'tmp & 0xFF' is some sort of fabric version ID value, with a value of 
> 0 denoting legacy (pre-Rome) systems, right?
> 
> How about making that explicit:
> 
>   df_version = reg_fab_id & 0xFF;
> 
> I'm pretty sure such a version ID might come handy later on, should 
> there be quirks or new capabilities with the newer systems ...
> 

Not exactly. The register field is Read-as-Zero on legacy systems. The
versions are 2 and 3 where 2 is the "legacy" version. But I can make
this change.

For example:

df_version = reg_fab_id & 0xFF ? 3 : 2;
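
As a rough standalone sketch of how that version value could then select
between the two DramBaseAddress layouts quoted below: the shifts and masks are
taken from the patch under review, while struct df_intlv and both function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for two of the interleave fields. */
struct df_intlv {
	uint8_t num_chan;
	uint8_t addr_sel;
};

/* The low byte is Read-as-Zero on legacy (version 2) systems. */
static unsigned int df_version_from_fab_id(uint32_t reg_fab_id)
{
	return (reg_fab_id & 0xFF) ? 3 : 2;
}

/* Pick the bitfield layout of DramBaseAddress based on the DF version. */
static struct df_intlv decode_dram_base_intlv(uint32_t reg, unsigned int df_version)
{
	struct df_intlv f;

	assert(df_version == 2 || df_version == 3);

	if (df_version == 2) {
		f.num_chan = (reg >> 4) & 0xF;
		f.addr_sel = (reg >> 8) & 0x7;
	} else {
		f.num_chan = (reg >> 2) & 0xF;
		f.addr_sel = (reg >> 9) & 0x7;
	}

	return f;
}
```

Keeping the version as a number rather than a legacy_df bool leaves room for
further layouts without changing the callers.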

> 
> > ret_addr -= hi_addr_offset;
> > @@ -728,23 +740,31 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, 
> > u8 umc, u64 *sys_addr)
> > }
> >  
> > lgcy_mmio_hole_en = tmp & BIT(1);
> > -   intlv_num_chan= (tmp >> 4) & 0xF;
> > -   intlv_addr_sel= (tmp >> 8) & 0x7;
> > -   dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16;
> >  
> > -   /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
> > -   if (intlv_addr_sel > 3) {
> > -   pr_err("%s: Invalid interleave address select %d.\n",
> > -   __func__, intlv_addr_sel);
> > -   goto out_err;
> > +   if (legacy_df) {
> > +   intlv_num_chan= (tmp >> 4) & 0xF;
> > +   intlv_addr_sel= (tmp >> 8) & 0x7;
> > +   } else {
> > +   intlv_num_chan= (tmp >> 2) & 0xF;
> > +   intlv_num_dies= (tmp >> 6) & 0x3;
> > +   intlv_num_sockets = (tmp >> 8) & 0x1;
> > +   intlv_addr_sel= (tmp >> 9) & 0x7;
> > }
> >  
> > +   dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16;
> > +
> > /* Read D18F0x114 (DramLimitAddress). */
> > if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp))
> > goto out_err;
> >  
> > -   intlv_num_sockets = (tmp >> 8) & 0x1;
> > -   intlv_num_dies= (tmp >> 10) & 0x3;
> > +   if (legacy_df) {
> > +   intlv_num_sockets = (tmp >> 8) & 0x1;
> > +   intlv_num_dies= (tmp >> 10) & 0x3;
> > +   dst_fabric_id = tmp & 0xFF;
> > +   } else {
> > +   dst_fabric_id = tmp & 0x3FF;
> > +   }
> > +
> > > dram_limit_addr   = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0);
> 
> Could we please structure this code in a bit more readable fashion?
> 
> 1)
> 
> Such as not using the meaningless 'tmp' variable name to first read 
> out DramOffset, then DramLimitAddress?
> 

IIRC, the "tmp" variable came to be in the review for the patch which
added this function. There are a few places where the register name and
the value needed have the same or similar name. For example,
DramLimitAddress is the register name and also a field within the
register. So we'd have a reg_dram_limit_addr and val_dram_limit_addr.
The "tmp" variable removes the need for the "reg_" variable.

But I think this can be reworked so that the final variable name is
reused. The register value can read into the variable, extra fields can
be extracted from it, and the final value can be adjusted as needed.
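
A minimal sketch of that rework for the DramBaseAddress read, as standalone C.
The GENMASK_ULL() and BIT_ULL() macros are local stand-ins for the kernel
helpers, struct dram_base and decode_dram_base_reg() are hypothetical, and the
register value is passed in instead of being read with amd_df_indirect_read().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros of the same name. */
#define BIT_ULL(n)        (1ULL << (n))
#define GENMASK_ULL(h, l) (((~0ULL) >> (63 - (h))) & ~((1ULL << (l)) - 1))

/* Hypothetical result type, only for this illustration. */
struct dram_base {
	uint64_t addr;
	bool lgcy_mmio_hole_en;
};

static struct dram_base decode_dram_base_reg(uint64_t reg_value)
{
	struct dram_base out;
	/* The final variable first holds the raw register value ... */
	uint64_t dram_base_addr = reg_value;

	/* The Data Fabric register is 32 bits wide. */
	assert(reg_value <= UINT32_MAX);

	/* ... extra fields are extracted from it ... */
	out.lgcy_mmio_hole_en = dram_base_addr & BIT_ULL(1);

	/* ... and then it is adjusted in place into the value it is named for. */
	dram_base_addr = (dram_base_addr & GENMASK_ULL(31, 12)) << 16;

	out.addr = dram_base_addr;
	return out;
}
```

This avoids both the generic "tmp" name and a separate "reg_" variable for
each register.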

> How about naming them a bit more obviously, and retrieving them in a 
> single step:
> 
> if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &reg_dram_off))
> goto out_err;
>

[tip: ras/core] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap

2020-08-18 Thread tip-bot2 for Yazen Ghannam
The following commit has been merged into the ras/core branch of tip:

Commit-ID: 5f2c67bd0f8a470a12c38a8786c42c043e100014
Gitweb:
https://git.kernel.org/tip/5f2c67bd0f8a470a12c38a8786c42c043e100014
Author:Yazen Ghannam 
AuthorDate:Mon, 20 Jul 2020 14:53:53 
Committer: Borislav Petkov 
CommitterDate: Tue, 18 Aug 2020 12:15:43 +02:00

x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap

The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
was intended to be used by the kernel to filter out invalid error codes
on a system. However, this is unnecessary after a few product releases
because the hardware will only report valid error codes. Thus, there's
no need for it with future systems.

Remove the xec_bitmap field and all references to it.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Link: https://lkml.kernel.org/r/20200720145353.43924-1-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h|  1 +-
 arch/x86/kernel/cpu/mce/amd.c | 44 +-
 drivers/edac/mce_amd.c|  4 +---
 3 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index cf50382..6adced6 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -328,7 +328,6 @@ enum smca_bank_types {
 struct smca_hwid {
unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */
u32 hwid_mcatype;   /* (hwid,mcatype) tuple */
-   u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */
u8 count;   /* Number of instances. */
 };
 
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 99be063..0c6b02d 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -132,49 +132,49 @@ static enum smca_bank_types smca_get_bank_type(unsigned int bank)
 }
 
 static struct smca_hwid smca_hwid_mcatypes[] = {
-   /* { bank_type, hwid_mcatype, xec_bitmap } */
+   /* { bank_type, hwid_mcatype } */
 
/* Reserved type */
-   { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 },
+   { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0)},
 
/* ZN Core (HWID=0xB0) MCA types */
-   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0), 0x1F },
-   { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10), 0xFF },
-   { SMCA_IF,   HWID_MCATYPE(0xB0, 0x1), 0x3FFF },
-   { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF },
-   { SMCA_DE,   HWID_MCATYPE(0xB0, 0x3), 0x1FF },
+   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0)},
+   { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10)   },
+   { SMCA_IF,   HWID_MCATYPE(0xB0, 0x1)},
+   { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2)},
+   { SMCA_DE,   HWID_MCATYPE(0xB0, 0x3)},
/* HWID 0xB0 MCATYPE 0x4 is Reserved */
-   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5), 0xFFF },
-   { SMCA_FP,   HWID_MCATYPE(0xB0, 0x6), 0x7F },
-   { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF },
+   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5)},
+   { SMCA_FP,   HWID_MCATYPE(0xB0, 0x6)},
+   { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7)},
 
/* Data Fabric MCA types */
-   { SMCA_CS,   HWID_MCATYPE(0x2E, 0x0), 0x1FF },
-   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1), 0x1F },
-   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF },
+   { SMCA_CS,   HWID_MCATYPE(0x2E, 0x0)},
+   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1)},
+   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2)},
 
/* Unified Memory Controller MCA type */
-   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0), 0xFF },
+   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0)},
 
/* Parameter Block MCA type */
-   { SMCA_PB,   HWID_MCATYPE(0x05, 0x0), 0x1 },
+   { SMCA_PB,   HWID_MCATYPE(0x05, 0x0)},
 
/* Platform Security Processor MCA type */
-   { SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0), 0x1 },
-   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1), 0x3 },
+   { SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0)},
+   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1)},
 
/* System Management Unit MCA type */
-   { SMCA_SMU,  HWID_MCATYPE(0x01, 0x0), 0x1 },
-   { SMCA_SMU_V2,   HWID_MCATYPE(0x01, 0x1), 0x7FF },
+   { SMCA_SMU,  HWID_MCATYPE(0x01, 0x0)},
+   { SMCA_SMU_V2,   HWID_MCATYPE(0x01, 0x1)},
 
/* Microprocessor 5 Unit MCA type */
-   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2), 0x3FF },
+   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2)},
 
/* Northbridge IO Unit MCA type */
-   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F },
+   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0)},
 
/* PCI Express Unit MCA type */
-   { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0),

Re: [PATCH] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap

2020-08-17 Thread Yazen Ghannam
On Mon, Aug 17, 2020 at 11:40:07AM +0200, Borislav Petkov wrote:
> On Mon, Jul 20, 2020 at 02:53:53PM +0000, Yazen Ghannam wrote:
> > From: Yazen Ghannam 
> > 
> > The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
> > was intended to be used by the kernel to filter out invalid error codes
> > on a system. However, this is unnecessary because the hardware will only
> > report valid error codes.
> 
> That's a kinda bold statement. :)
> 

Yeah, I'm trying to keep "may" out of my vocabulary. :)

> Are you saying, you wanna trust verification and that check is totally
> useless?
> 

I do. This check was added because I wasn't sure what to expect with
this new architectural extension. But after a few product releases, it
has been unnecessary. And I don't see a need for it with future systems.

Thanks,
Yazen


Re: [PATCH 1/2] x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode

2020-08-17 Thread Yazen Ghannam
On Sat, Aug 15, 2020 at 10:42:12AM +0200, Ingo Molnar wrote:
> 
> * Yazen Ghannam  wrote:
> 
> > From: Yazen Ghannam 
> > 
> > The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
> > later systems. This function is used in amd64_edac_mod to do
> > system-specific decoding for DRAM ECC errors. The function takes a
> > "NodeId" as a parameter.
> > 
> > In AMD documentation, NodeId is used to identify a physical die in a
> > system. This can be used to identify a node in the AMD_NB code and also
> > it is used with umc_normaddr_to_sysaddr().
> > 
> > However, the input used for decode_dram_ecc() is currently the NUMA node
> > of a logical CPU. In the default configuration, the NUMA node and
> > physical die will be equivalent, so this doesn't have an impact. But the
> > NUMA node configuration can be adjusted with optional memory
> > interleaving schemes. This will cause the NUMA node enumeration to not
> > match the physical die enumeration. The mismatch will cause the address
> > translation function to fail or report incorrect results.
> > 
> > Save the "NodeId" as a percpu value during init in AMD MCE code. Export
> > a function to return the value which can be used from modules like
> > edac_mce_amd.
> > 
> > Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
> > Signed-off-by: Yazen Ghannam 
> > ---
> >  arch/x86/include/asm/mce.h|  2 ++
> >  arch/x86/kernel/cpu/mce/amd.c | 11 +++
> >  drivers/edac/mce_amd.c|  2 +-
> >  3 files changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> > index cf503824529c..92527cc9ed06 100644
> > --- a/arch/x86/include/asm/mce.h
> > +++ b/arch/x86/include/asm/mce.h
> > @@ -343,6 +343,8 @@ extern struct smca_bank smca_banks[MAX_NR_BANKS];
> >  extern const char *smca_get_long_name(enum smca_bank_types t);
> >  extern bool amd_mce_is_memory_error(struct mce *m);
> >  
> > +extern u8 amd_cpu_to_node(unsigned int cpu);
> > +
> >  extern int mce_threshold_create_device(unsigned int cpu);
> >  extern int mce_threshold_remove_device(unsigned int cpu);
> >  
> > diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
> > index 99be063fcb1b..524edf81e287 100644
> > --- a/arch/x86/kernel/cpu/mce/amd.c
> > +++ b/arch/x86/kernel/cpu/mce/amd.c
> > @@ -202,6 +202,9 @@ static DEFINE_PER_CPU(unsigned int, bank_map);
> >  /* Map of banks that have more than MCA_MISC0 available. */
> >  static DEFINE_PER_CPU(u32, smca_misc_banks_map);
> >  
> > +/* CPUID_Fn8000001E_ECX[NodeId] used to identify a physical node/die. */
> > +static DEFINE_PER_CPU(u8, node_id);
> > +
> >  static void amd_threshold_interrupt(void);
> >  static void amd_deferred_error_interrupt(void);
> >  
> > @@ -233,6 +236,12 @@ static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu)
> >  
> >  }
> >  
> > +u8 amd_cpu_to_node(unsigned int cpu)
> > +{
> > +   return per_cpu(node_id, cpu);
> > +}
> > +EXPORT_SYMBOL_GPL(amd_cpu_to_node);
> > +
> >  static void smca_configure(unsigned int bank, unsigned int cpu)
> >  {
> > unsigned int i, hwid_mcatype;
> > @@ -240,6 +249,8 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
> > u32 high, low;
> > u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank);
> >  
> > +   this_cpu_write(node_id, cpuid_ecx(0x8000001e) & 0xFF);
> 
> So we already have this magic number used for a similar purpose, in 
> amd_get_topology():
> 
> cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
> 
> node_id  = ecx & 0xff;
>

Yes, that's right. I did have a patch that tried to leverage the
existing topology variables. But it wasn't working for all targeted
systems. So I thought to have something local to the AMD MCA code in
order to avoid messing with the topology code just for this feature.

> Firstly, could we please at least give 0x8000001e a proper symbolic 
> name, use it in hygon.c too (which AFAIK is derived from AMD anyway), 
> and then use it in these new patches?
> 

Sure, but all places that use a symbolic name for a CPUID leaf define it
locally. Should the same be done here? Or should there be a common place
for all the defines, like in  or maybe a new header
file?

> Secondly, why not stick node_id into struct cpuinfo_x86, where the MCA 
> code can then use it without having to introduce a new percpu data 
> structure?
> 

I think this would be the simplest approach. I can write it. Also, the
amd_get_nb_id() function could then be replaced with this.

> There's also the underlying assumption that there's only ever going to 
> be 256 nodes, which limitation I'm sure we'll hear about in a couple 
> of years as not being quite enough. ;-)
> 

Yeah, CPU topology seems very fractal-like. :)

> So less hardcoding and more generalizations please.
> 

Will do.

Thanks,
Yazen


[PATCH 1/2] x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode

2020-08-14 Thread Yazen Ghannam
From: Yazen Ghannam 

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and is
also used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact. But the
NUMA node configuration can be adjusted with optional memory
interleaving schemes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Save the "NodeId" as a percpu value during init in AMD MCE code. Export
a function to return the value which can be used from modules like
edac_mce_amd.
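
For illustration only, here is a minimal user-space sketch of the field
extraction the patch relies on: NodeId is taken from the low 8 bits of the
ECX value that the patch reads. The helper name node_id_from_ecx() is
hypothetical and is not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): extract the NodeId field,
 * bits [7:0], from a raw ECX value. This mirrors the "& 0xFF" mask
 * used in smca_configure().
 */
static inline uint8_t node_id_from_ecx(uint32_t ecx)
{
	return (uint8_t)(ecx & 0xFF);
}
```

The upper bits of ECX are simply masked off here. The value returned is a
physical die identifier, which is not necessarily the same number that
cpu_to_node() reports for the same CPU.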

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam 
---
 arch/x86/include/asm/mce.h|  2 ++
 arch/x86/kernel/cpu/mce/amd.c | 11 +++
 drivers/edac/mce_amd.c|  2 +-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index cf503824529c..92527cc9ed06 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -343,6 +343,8 @@ extern struct smca_bank smca_banks[MAX_NR_BANKS];
 extern const char *smca_get_long_name(enum smca_bank_types t);
 extern bool amd_mce_is_memory_error(struct mce *m);
 
+extern u8 amd_cpu_to_node(unsigned int cpu);
+
 extern int mce_threshold_create_device(unsigned int cpu);
 extern int mce_threshold_remove_device(unsigned int cpu);
 
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 99be063fcb1b..524edf81e287 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -202,6 +202,9 @@ static DEFINE_PER_CPU(unsigned int, bank_map);
 /* Map of banks that have more than MCA_MISC0 available. */
 static DEFINE_PER_CPU(u32, smca_misc_banks_map);
 
+/* CPUID_Fn8000001E_ECX[NodeId] used to identify a physical node/die. */
+static DEFINE_PER_CPU(u8, node_id);
+
 static void amd_threshold_interrupt(void);
 static void amd_deferred_error_interrupt(void);
 
@@ -233,6 +236,12 @@ static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu)
 
 }
 
+u8 amd_cpu_to_node(unsigned int cpu)
+{
+   return per_cpu(node_id, cpu);
+}
+EXPORT_SYMBOL_GPL(amd_cpu_to_node);
+
 static void smca_configure(unsigned int bank, unsigned int cpu)
 {
unsigned int i, hwid_mcatype;
@@ -240,6 +249,8 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
u32 high, low;
u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank);
 
+   this_cpu_write(node_id, cpuid_ecx(0x8000001e) & 0xFF);
+
/* Set appropriate bits in MCA_CONFIG */
if (!rdmsr_safe(smca_config, &low, &high)) {
/*
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 325aedf46ff2..9476097d0fdb 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -996,7 +996,7 @@ static void decode_smca_error(struct mce *m)
}
 
if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-   decode_dram_ecc(cpu_to_node(m->extcpu), m);
+   decode_dram_ecc(amd_cpu_to_node(m->extcpu), m);
 }
 
 static inline void amd_decode_err_code(u16 ec)
-- 
2.25.1



[PATCH 2/2] x86/MCE/AMD Support new memory interleaving schemes during address translation

2020-08-14 Thread Yazen Ghannam
From: Muralidhara M K 

Add support for new memory interleaving schemes used in current AMD
systems.

Check if the system is using a current Data Fabric version or a legacy
version as some bit and register definitions have changed.

Tested on AMD reference platforms with the following memory interleaving
options.

Naples
- None
- Channel
- Die
- Socket

Rome (NPS = Nodes per Socket)
- None
- NPS0
- NPS1
- NPS2
- NPS4

The fixes tag refers to the commit that allows amd64_edac_mod to load on
Rome systems. The module may report an incorrect system address on Rome
systems depending on the interleaving option used.

Fixes: 6e846239e548 ("EDAC/amd64: Add Family 17h Model 30h PCI IDs")
Signed-off-by: Muralidhara M K 
Co-developed-by: Naveen Krishna Chtradhi 
Signed-off-by: Naveen Krishna Chtradhi 
Co-developed-by: Yazen Ghannam 
Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/cpu/mce/amd.c | 237 +++---
 1 file changed, 188 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 524edf81e287..a687aa898fef 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -689,18 +689,25 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 {
u64 dram_base_addr, dram_limit_addr, dram_hole_base;
+
/* We start from the normalized address */
u64 ret_addr = norm_addr;
 
u32 tmp;
 
-   u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask;
+   bool hash_enabled = false, split_normalized = false, legacy_df = false;
+
u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets;
-   u8 intlv_addr_sel, intlv_addr_bit;
-   u8 num_intlv_bits, hashed_bit;
+   u8 intlv_addr_sel, intlv_addr_bit, num_intlv_bits;
+   u8 cs_mask, cs_id = 0, dst_fabric_id = 0;
u8 lgcy_mmio_hole_en, base = 0;
-   u8 cs_mask, cs_id = 0;
-   bool hash_enabled = false;
+
+   /* Read D18F1x208 (System Fabric ID Mask 0). */
+   if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp))
+   goto out_err;
+
+   /* Determine if system is a legacy Data Fabric type. */
+   legacy_df = !(tmp & 0xFF);
 
/* Read D18F0x1B4 (DramOffset), check if base 1 is used. */
if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &tmp))
@@ -708,7 +715,12 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 
/* Remove HiAddrOffset from normalized address, if enabled: */
if (tmp & BIT(0)) {
-   u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8;
+   u8 hi_addr_offset_lsb = legacy_df ? 20 : 12;
+   u64 hi_addr_offset = tmp & GENMASK_ULL(31, hi_addr_offset_lsb);
+
+   /* Align to bit 28 regardless of the LSB used. */
+   hi_addr_offset >>= hi_addr_offset_lsb;
+   hi_addr_offset <<= 28;
 
if (norm_addr >= hi_addr_offset) {
ret_addr -= hi_addr_offset;
@@ -728,23 +740,31 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
}
 
lgcy_mmio_hole_en = tmp & BIT(1);
-   intlv_num_chan= (tmp >> 4) & 0xF;
-   intlv_addr_sel= (tmp >> 8) & 0x7;
-   dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16;
 
-   /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
-   if (intlv_addr_sel > 3) {
-   pr_err("%s: Invalid interleave address select %d.\n",
-   __func__, intlv_addr_sel);
-   goto out_err;
+   if (legacy_df) {
+   intlv_num_chan= (tmp >> 4) & 0xF;
+   intlv_addr_sel= (tmp >> 8) & 0x7;
+   } else {
+   intlv_num_chan= (tmp >> 2) & 0xF;
+   intlv_num_dies= (tmp >> 6) & 0x3;
+   intlv_num_sockets = (tmp >> 8) & 0x1;
+   intlv_addr_sel= (tmp >> 9) & 0x7;
}
 
+   dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16;
+
/* Read D18F0x114 (DramLimitAddress). */
if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp))
goto out_err;
 
-   intlv_num_sockets = (tmp >> 8) & 0x1;
-   intlv_num_dies= (tmp >> 10) & 0x3;
+   if (legacy_df) {
+   intlv_num_sockets = (tmp >> 8) & 0x1;
+   intlv_num_dies= (tmp >> 10) & 0x3;
+   dst_fabric_id = tmp & 0xFF;
+   } else {
+   dst_fabric_id = tmp & 0x3FF;
+   }
+
dram_limit_addr   = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0);
 
intlv_addr_bit = intlv_addr_sel + 8;
@@ -757,8 +777,27 @@ int umc_nor

[PATCH 0/2] AMD MCA Address Translation Updates

2020-08-14 Thread Yazen Ghannam
From: Yazen Ghannam 

This patchset includes updates for the MCA Address Translation process
on recent AMD systems.

Patch 1:
Fixes an input to the address translation function. The translation
requires a physical Die ID (NodeId in AMD documentation) rather than a
logical NUMA node ID. This is because the physical and logical nodes
may not always match.

Patch 2:
Add translation support for new memory interleaving options available in
Rome systems. The patch is based on the latest AMD reference code for
the address translation.

Both patches have fixes tags, since they do fix some issues. However,
stable is not copied. Patch 1 needs some fixups to apply. Patch 2 is
large and doesn't seem to meet the requirements for stable, though
comments are welcome on whether it should be applied.

Thanks,
Yazen

Muralidhara M K (1):
  x86/MCE/AMD Support new memory interleaving schemes during address
translation

Yazen Ghannam (1):
  x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode

 arch/x86/include/asm/mce.h|   2 +
 arch/x86/kernel/cpu/mce/amd.c | 248 +++---
 drivers/edac/mce_amd.c|   2 +-
 3 files changed, 202 insertions(+), 50 deletions(-)

-- 
2.25.1



Re: [PATCH] x86/MCE/AMD, EDAC/mce_amd

2020-08-10 Thread Yazen Ghannam
On Sun, Aug 09, 2020 at 12:35:59PM +0800, Feng zhou wrote:
> From: zhoufeng 
> 
> The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
> later systems. This function is used in amd64_edac_mod to do
> system-specific decoding for DRAM ECC errors. The function takes a
> "NodeId" as a parameter.
> 
> In AMD documentation, NodeId is used to identify a physical die in a
> system. This can be used to identify a node in the AMD_NB code and also
> it is used with umc_normaddr_to_sysaddr().
> 
> However, the input used for decode_dram_ecc() is currently the NUMA node
> of a logical CPU, so this will cause the address translation function to
> fail or report incorrect results.
> 
> Signed-off-by: zhoufeng 
> ---
>  drivers/edac/mce_amd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
> index 325aedf46ff2..73c805113322 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -996,7 +996,7 @@ static void decode_smca_error(struct mce *m)
>   }
>  
>   if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
> - decode_dram_ecc(cpu_to_node(m->extcpu), m);
> + decode_dram_ecc(topology_physical_package_id(m->extcpu), m);

This will break on Naples systems, because the NodeId and the physical
package ID will not match.

I can send a patch soon that will work for Naples, Rome, and later
systems.

Thanks,
Yazen


[PATCH] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap

2020-07-20 Thread Yazen Ghannam
From: Yazen Ghannam 

The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
was intended to be used by the kernel to filter out invalid error codes
on a system. However, this is unnecessary because the hardware will only
report valid error codes.

Remove the xec_bitmap field and all references to it.

Signed-off-by: Yazen Ghannam 
---
 arch/x86/include/asm/mce.h|  1 -
 arch/x86/kernel/cpu/mce/amd.c | 44 +--
 drivers/edac/mce_amd.c|  4 +---
 3 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 734ffe78a3d6..c18e87aeeccc 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -327,7 +327,6 @@ enum smca_bank_types {
 struct smca_hwid {
unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */
u32 hwid_mcatype;   /* (hwid,mcatype) tuple */
-   u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */
u8 count;   /* Number of instances. */
 };
 
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 327b85304cdd..a578df70768b 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -132,49 +132,49 @@ static enum smca_bank_types smca_get_bank_type(unsigned int bank)
 }
 
 static struct smca_hwid smca_hwid_mcatypes[] = {
-   /* { bank_type, hwid_mcatype, xec_bitmap } */
+   /* { bank_type, hwid_mcatype } */
 
/* Reserved type */
-   { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 },
+   { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0)},
 
/* ZN Core (HWID=0xB0) MCA types */
-   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0), 0x1F },
-   { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10), 0xFF },
-   { SMCA_IF,   HWID_MCATYPE(0xB0, 0x1), 0x3FFF },
-   { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF },
-   { SMCA_DE,   HWID_MCATYPE(0xB0, 0x3), 0x1FF },
+   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0)},
+   { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10)   },
+   { SMCA_IF,   HWID_MCATYPE(0xB0, 0x1)},
+   { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2)},
+   { SMCA_DE,   HWID_MCATYPE(0xB0, 0x3)},
/* HWID 0xB0 MCATYPE 0x4 is Reserved */
-   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5), 0xFFF },
-   { SMCA_FP,   HWID_MCATYPE(0xB0, 0x6), 0x7F },
-   { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF },
+   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5)},
+   { SMCA_FP,   HWID_MCATYPE(0xB0, 0x6)},
+   { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7)},
 
/* Data Fabric MCA types */
-   { SMCA_CS,   HWID_MCATYPE(0x2E, 0x0), 0x1FF },
-   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1), 0x1F },
-   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF },
+   { SMCA_CS,   HWID_MCATYPE(0x2E, 0x0)},
+   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1)},
+   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2)},
 
/* Unified Memory Controller MCA type */
-   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0), 0xFF },
+   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0)},
 
/* Parameter Block MCA type */
-   { SMCA_PB,   HWID_MCATYPE(0x05, 0x0), 0x1 },
+   { SMCA_PB,   HWID_MCATYPE(0x05, 0x0)},
 
/* Platform Security Processor MCA type */
-   { SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0), 0x1 },
-   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1), 0x3 },
+   { SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0)},
+   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1)},
 
/* System Management Unit MCA type */
-   { SMCA_SMU,  HWID_MCATYPE(0x01, 0x0), 0x1 },
-   { SMCA_SMU_V2,   HWID_MCATYPE(0x01, 0x1), 0x7FF },
+   { SMCA_SMU,  HWID_MCATYPE(0x01, 0x0)},
+   { SMCA_SMU_V2,   HWID_MCATYPE(0x01, 0x1)},
 
/* Microprocessor 5 Unit MCA type */
-   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2), 0x3FF },
+   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2)},
 
/* Northbridge IO Unit MCA type */
-   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F },
+   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0)},
 
/* PCI Express Unit MCA type */
-   { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0), 0x1F },
+   { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0)},
 };
 
 struct smca_bank smca_banks[MAX_NR_BANKS];
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 4fd06a3dc6fe..7f28edb070bd 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -999,10 +999,8 @@ static void decode_smca_error(struct mce *m)
pr_emerg(HW_ERR "%s Ext. Error Code: %d", ip_name, xec);
 
/* Only print the decode of valid error codes */
-   if (xec < smca_mce_descs[bank_type].num_descs &&
-   (hwid->

[PATCH] EDAC/mce_amd: Add new error descriptions for existing types

2020-07-08 Thread Yazen Ghannam
From: Yazen Ghannam 

A few existing MCA bank types will have new error types in future SMCA
systems.

Add the descriptions for the new error types.

Signed-off-by: Yazen Ghannam 
---
 drivers/edac/mce_amd.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 325aedf46ff2..4fd06a3dc6fe 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -210,6 +210,11 @@ static const char * const smca_if_mce_desc[] = {
"L2 BTB Multi-Match Error",
"L2 Cache Response Poison Error",
"System Read Data Error",
+   "Hardware Assertion Error",
+   "L1-TLB Multi-Hit",
+   "L2-TLB Multi-Hit",
+   "BSR Parity Error",
+   "CT MCE",
 };
 
 static const char * const smca_l2_mce_desc[] = {
@@ -228,7 +233,8 @@ static const char * const smca_de_mce_desc[] = {
"Fetch address FIFO parity error",
"Patch RAM data parity error",
"Patch RAM sequencer parity error",
-   "Micro-op buffer parity error"
+   "Micro-op buffer parity error",
+   "Hardware Assertion MCA Error",
 };
 
 static const char * const smca_ex_mce_desc[] = {
@@ -244,6 +250,8 @@ static const char * const smca_ex_mce_desc[] = {
"Scheduling queue parity error",
"Branch buffer queue parity error",
"Hardware Assertion error",
+   "Spec Map parity error",
+   "Retire Map parity error",
 };
 
 static const char * const smca_fp_mce_desc[] = {
@@ -360,6 +368,7 @@ static const char * const smca_smu2_mce_desc[] = {
"Instruction Tag Cache Bank A ECC or parity error",
"Instruction Tag Cache Bank B ECC or parity error",
"System Hub Read Buffer ECC or parity error",
+   "PHY RAM ECC error",
 };
 
 static const char * const smca_mp5_mce_desc[] = {
-- 
2.25.1



Re: [PATCH 0/2] MCA and EDAC updates for AMD Family 17h, Model 60h

2020-06-16 Thread Yazen Ghannam
On Mon, Jun 15, 2020 at 07:59:50AM -0400, Borislav Petkov wrote:
> + Yazen and linux-hwmon.
> 
> On Sun, Jun 07, 2020 at 12:37:07PM +0800, Jacky Hu wrote:
> > This patchset adds MCA and EDAC support for AMD Family 17h, Model 60h.
> > 
> > Also k10temp works with 4800h
> > 
> > k10temp-pci-00c3
> > Adapter: PCI adapter
> > Vcore: 1.55 V
> > Vsoc:  1.55 V
> > Tctl: +49.6°C
> > Tdie: +49.6°C
> > Icore: 0.00 A
> > Isoc:  0.00 A
> > 
> > Jacky Hu (2):
> >   x86/amd_nb: Add Family 17h, Model 60h PCI IDs
> >   EDAC/amd64: Add family ops for Family 17h Models 60h-6Fh
> > 
> >  arch/x86/kernel/amd_nb.c  |  5 +
> >  drivers/edac/amd64_edac.c | 14 ++
> >  drivers/edac/amd64_edac.h |  3 +++
> >  drivers/hwmon/k10temp.c   |  2 ++
> >  include/linux/pci_ids.h   |  1 +
> >  5 files changed, 25 insertions(+)
> >

PCI IDs and EDAC look good to me.

Acked-by: Yazen Ghannam 

Thanks,
Yazen


Re: [PATCH] x86/mce: fix a wrong assignment of i_mce.status

2020-06-11 Thread Yazen Ghannam
On Thu, Jun 11, 2020 at 12:55:00PM -0400, Luck, Tony wrote:
> +Yazen
> 
> On Thu, Jun 11, 2020 at 10:32:38AM +0800, Zhenzhong Duan wrote:
> > The original code is a nop as i_mce.status is or'ed with part of itself,
> > fix it.
> > 
> > Signed-off-by: Zhenzhong Duan 
> > ---
> >  arch/x86/kernel/cpu/mce/inject.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
> > index 3413b41..dc28a61 100644
> > --- a/arch/x86/kernel/cpu/mce/inject.c
> > +++ b/arch/x86/kernel/cpu/mce/inject.c
> > @@ -511,7 +511,7 @@ static void do_inject(void)
> >  */
> > if (inj_type == DFR_INT_INJ) {
> > i_mce.status |= MCI_STATUS_DEFERRED;
> > -   i_mce.status |= (i_mce.status & ~MCI_STATUS_UC);
> > +   i_mce.status &= ~MCI_STATUS_UC;
> 
> Boris: "git blame" says you wrote this code. Patch looks right (in
> that it makes the code do what the comment just above says it is trying
> to do):
> 
>  * - MCx_STATUS[UC] cleared: deferred errors are _not_ UC
> 
> But this is AMD specific, so I'll defer judgement
>

Acked-by: Yazen Ghannam 

Thanks,
Yazen
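
For readers following along, a small user-space sketch of why the original
line was a no-op and what the fix changes. The two bit positions below are
assumptions based on my reading of asm/mce.h (UC at bit 61, Deferred at
bit 44), and both function names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MCI_STATUS_UC        (1ULL << 61)
#define MCI_STATUS_DEFERRED  (1ULL << 44)

/* Old behaviour: OR-ing a value with a subset of itself changes nothing. */
static uint64_t deferred_status_old(uint64_t status)
{
	status |= MCI_STATUS_DEFERRED;
	status |= (status & ~MCI_STATUS_UC);	/* no-op: UC stays as it was */
	return status;
}

/* Fixed behaviour: UC is actually cleared, as the code comment intends. */
static uint64_t deferred_status_fixed(uint64_t status)
{
	status |= MCI_STATUS_DEFERRED;
	status &= ~MCI_STATUS_UC;
	return status;
}
```

With an input that has UC set, the old variant returns it with UC still
set, while the fixed variant returns the same value with UC cleared and
Deferred set.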


Re: 5.6.12 MCE on AMD EPYC 7502

2020-05-29 Thread Yazen Ghannam
On Fri, May 29, 2020 at 07:57:20AM -0400, Borislav Petkov wrote:
> On Fri, May 29, 2020 at 01:55:29PM +0300, Dmitry Antipov wrote:
> > Hello,
> > 
> > I'm facing the following kernel messages running Debian 9 with
> > custom 5.6.12 kernel running on AMD EPYC 7502 - based hardware:
> > 
> > [138537.806814] mce: [Hardware Error]: Machine check events logged
> > [138537.806818] [Hardware Error]: Corrected error, no action required.
> > [138537.808456] [Hardware Error]: CPU:0 (17:31:0) MC27_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd822080b
> > [138537.810080] [Hardware Error]: IPID: 0x0001002e1e01, Syndrome: 0x5a05
> > [138537.811694] [Hardware Error]: Power, Interrupts, etc. Ext. Error Code: 2, Link Error.
> > [138537.813281] [Hardware Error]: cache level: L3/GEN, mem/io: IO, mem-tx: GEN, part-proc: SRC (no timeout)
> > 
> > Is it related to some (not so) known CPU errata?
> 
> Who knows.
>

There aren't any reported errata related to this that I could find.

> > Should I try to update microcode, motherboard firmware, kernel, or whatever else?
> 
> Yeah, BIOS update might be a good idea, if there's a newer version for
> your board.
>

I agree. The link settings are generally tuned for the platform. So the
platform vendor may have a fix.

Thanks,
Yazen


Re: [PATCH 3/3] EDAC/amd64: Add AMD family 17h model 60h PCI IDs

2020-05-13 Thread Yazen Ghannam
On Sun, May 10, 2020 at 04:48:42PM -0400, Alexander Monakov wrote:
> Add support for AMD Renoir (4000-series Ryzen CPUs).
> 
> Signed-off-by: Alexander Monakov 
> Cc: Thomas Gleixner 
> Cc: Borislav Petkov 
> Cc: x...@kernel.org
> Cc: Yazen Ghannam 
> Cc: Brian Woods 
> Cc: Clemens Ladisch 
> Cc: Jean Delvare 
> Cc: Guenter Roeck 
> Cc: linux-hw...@vger.kernel.org
> Cc: linux-e...@vger.kernel.org

Acked-by: Yazen Ghannam 

Thanks,
Yazen


Re: [PATCH 1/3] x86/amd_nb: add AMD family 17h model 60h PCI IDs

2020-05-13 Thread Yazen Ghannam
On Sun, May 10, 2020 at 04:48:40PM -0400, Alexander Monakov wrote:
> Add PCI IDs for AMD Renoir (4000-series Ryzen CPUs). This is necessary
> to enable support for temperature sensors via the k10temp module.
> 
> Signed-off-by: Alexander Monakov 
> Cc: Thomas Gleixner 
> Cc: Borislav Petkov 
> Cc: x...@kernel.org
> Cc: Yazen Ghannam 
> Cc: Brian Woods 
> Cc: Clemens Ladisch 
> Cc: Jean Delvare 
> Cc: Guenter Roeck 
> Cc: linux-hw...@vger.kernel.org
> Cc: linux-e...@vger.kernel.org

Acked-by: Yazen Ghannam 

Thanks,
Yazen


[tip:ras/core] x86/MCE: Determine MCA banks' init state properly

2019-06-11 Thread tip-bot for Yazen Ghannam
Commit-ID:  068b053dca0e2ab40b3d953b102a178654eec282
Gitweb: https://git.kernel.org/tip/068b053dca0e2ab40b3d953b102a178654eec282
Author: Yazen Ghannam 
AuthorDate: Fri, 7 Jun 2019 20:18:06 +
Committer:  Borislav Petkov 
CommitDate: Tue, 11 Jun 2019 15:23:34 +0200

x86/MCE: Determine MCA banks' init state properly

The OS is expected to write all bits to MCA_CTL for each bank,
thus enabling error reporting in all banks. However, some banks
may be unused in which case the registers for such banks are
Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control
bits because of quirks, etc.

A bank can be considered uninitialized if the MCA_CTL register returns
zero. This is because either the OS did not write anything or because
the hardware is enforcing RAZ/WI for the bank.

Set a bank's init value based on whether the control bits are set in
hardware. Return an error code in the sysfs interface for uninitialized
banks.

Do a final bank init check in a separate function which is not part of
any user-controlled code flows. This is so a user may enable/disable a
bank during runtime without having to restart their system.
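
As a rough model of the final check described above, the following
user-space sketch treats a bank as initialized only if the value read back
from its MCA_CTL register is non-zero. The struct and function names are
hypothetical stand-ins, and hw_ctl is an assumed input that replaces the
actual MSR read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-bank bookkeeping. */
struct bank_model {
	uint64_t ctl;	/* control bits the OS wants enabled */
	bool init;	/* true if the bank is considered usable */
};

/*
 * hw_ctl is the value read back from the bank's MCA_CTL register.
 * A zero readback means either nothing was written or the hardware
 * enforces Read-as-Zero/Writes-Ignored, so the bank is marked unused.
 * Banks already marked uninitialized are left alone.
 */
static void check_bank_init(struct bank_model *b, uint64_t hw_ctl)
{
	if (!b->init)
		return;

	b->init = (hw_ctl != 0);
}
```

In this model the sysfs show/store paths would return an error for any
bank whose init flag ends up false, which matches the -ENODEV returns in
the diff below.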

 [ bp: Massage a bit. Discover bank init state at boot. ]

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: "linux-e...@vger.kernel.org" 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: "x...@kernel.org" 
Link: https://lkml.kernel.org/r/20190607201752.221446-6-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/core.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 10f9f140985e..c2c93e9195ed 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1490,6 +1490,11 @@ static void __mcheck_cpu_mce_banks_init(void)
for (i = 0; i < n_banks; i++) {
struct mce_bank *b = &mce_banks[i];
 
+   /*
+* Init them all, __mcheck_cpu_apply_quirks() is going to apply
+* the required vendor quirks before
+* __mcheck_cpu_init_clear_banks() does the final bank setup.
+*/
b->ctl = -1ULL;
b->init = 1;
}
@@ -1562,6 +1567,33 @@ static void __mcheck_cpu_init_clear_banks(void)
}
 }
 
+/*
+ * Do a final check to see if there are any unused/RAZ banks.
+ *
+ * This must be done after the banks have been initialized and any quirks have
+ * been applied.
+ *
+ * Do not call this from any user-initiated flows, e.g. CPU hotplug or sysfs.
+ * Otherwise, a user who disables a bank will not be able to re-enable it
+ * without a system reboot.
+ */
+static void __mcheck_cpu_check_banks(void)
+{
+   struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+   u64 msrval;
+   int i;
+
+   for (i = 0; i < this_cpu_read(mce_num_banks); i++) {
+   struct mce_bank *b = &mce_banks[i];
+
+   if (!b->init)
+   continue;
+
+   rdmsrl(msr_ops.ctl(i), msrval);
+   b->init = !!msrval;
+   }
+}
+
 /*
  * During IFU recovery Sandy Bridge -EP4S processors set the RIPV and
  * EIPV bits in MCG_STATUS to zero on the affected logical processor (SDM
@@ -1849,6 +1881,7 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
__mcheck_cpu_init_generic();
__mcheck_cpu_init_vendor(c);
__mcheck_cpu_init_clear_banks();
+   __mcheck_cpu_check_banks();
__mcheck_cpu_setup_timer();
 }
 
@@ -2085,6 +2118,9 @@ static ssize_t show_bank(struct device *s, struct device_attribute *attr,
 
b = &per_cpu(mce_banks_array, s->id)[bank];
 
+   if (!b->init)
+   return -ENODEV;
+
return sprintf(buf, "%llx\n", b->ctl);
 }
 
@@ -2103,6 +2139,9 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,
 
b = &per_cpu(mce_banks_array, s->id)[bank];
 
+   if (!b->init)
+   return -ENODEV;
+
b->ctl = new;
mce_restart();
 


[tip:ras/core] x86/MCE: Make the number of MCA banks a per-CPU variable

2019-06-11 Thread tip-bot for Yazen Ghannam
Commit-ID:  c7d314f386e987be8b51eeb7dd947756ae23f6b6
Gitweb: https://git.kernel.org/tip/c7d314f386e987be8b51eeb7dd947756ae23f6b6
Author: Yazen Ghannam 
AuthorDate: Fri, 7 Jun 2019 20:18:05 +
Committer:  Borislav Petkov 
CommitDate: Tue, 11 Jun 2019 15:23:09 +0200

x86/MCE: Make the number of MCA banks a per-CPU variable

The number of MCA banks is provided per logical CPU. Historically, this
number has been the same across all CPUs, but this is not an
architectural guarantee. Future AMD systems may have MCA bank counts
that vary between logical CPUs in a system.

This issue was partially addressed in

  006c077041dc ("x86/mce: Handle varying MCA bank counts")

by allocating structures using the maximum number of MCA banks and by
saving the maximum MCA bank count in a system as the global count. This
means that some extra structures are allocated. Also, this means that
CPUs will spend more time in the #MC and other handlers checking extra
MCA banks.

Thus, define the number of MCA banks as a per-CPU variable.

 [ bp: Make mce_num_banks an unsigned int. ]

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: "linux-e...@vger.kernel.org" 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: "x...@kernel.org" 
Link: https://lkml.kernel.org/r/20190607201752.221446-5-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/amd.c  | 19 
 arch/x86/kernel/cpu/mce/core.c | 45 +-
 arch/x86/kernel/cpu/mce/internal.h |  2 +-
 3 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index d4d6e4b7f9dc..fb5c935af2c5 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -495,7 +495,7 @@ static u32 get_block_address(u32 current_addr, u32 low, u32 high,
 {
u32 addr = 0, offset = 0;
 
-   if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS))
+   if ((bank >= per_cpu(mce_num_banks, cpu)) || (block >= NR_BLOCKS))
return addr;
 
if (mce_flags.smca)
@@ -627,11 +627,12 @@ void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
-   u32 low = 0, high = 0, address = 0;
unsigned int bank, block, cpu = smp_processor_id();
+   u32 low = 0, high = 0, address = 0;
int offset = -1;
 
-   for (bank = 0; bank < mca_cfg.banks; ++bank) {
+
+   for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
if (mce_flags.smca)
smca_configure(bank, cpu);
 
@@ -976,7 +977,7 @@ static void amd_deferred_error_interrupt(void)
 {
unsigned int bank;
 
-   for (bank = 0; bank < mca_cfg.banks; ++bank)
+   for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank)
log_error_deferred(bank);
 }
 
@@ -1017,7 +1018,7 @@ static void amd_threshold_interrupt(void)
struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
unsigned int bank, cpu = smp_processor_id();
 
-   for (bank = 0; bank < mca_cfg.banks; ++bank) {
+   for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
 
@@ -1204,7 +1205,7 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
u32 low, high;
int err;
 
-   if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS))
+   if ((bank >= per_cpu(mce_num_banks, cpu)) || (block >= NR_BLOCKS))
return 0;
 
if (rdmsr_safe_on_cpu(cpu, address, &low, &high))
@@ -1438,7 +1439,7 @@ int mce_threshold_remove_device(unsigned int cpu)
 {
unsigned int bank;
 
-   for (bank = 0; bank < mca_cfg.banks; ++bank) {
+   for (bank = 0; bank < per_cpu(mce_num_banks, cpu); ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
threshold_remove_bank(cpu, bank);
@@ -1459,14 +1460,14 @@ int mce_threshold_create_device(unsigned int cpu)
if (bp)
return 0;
 
-   bp = kcalloc(mca_cfg.banks, sizeof(struct threshold_bank *),
+   bp = kcalloc(per_cpu(mce_num_banks, cpu), sizeof(struct threshold_bank *),
 GFP_KERNEL);
if (!bp)
return -ENOMEM;
 
per_cpu(threshold_banks, cpu) = bp;
 
-   for (bank = 0; bank < mca_cfg.banks; ++bank) {
+   for (bank = 0; bank < per_cpu(mce_num_banks, cpu); ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
err = threshold_create_bank(cpu, bank);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/ker

[tip:ras/core] x86/MCE/AMD: Don't cache block addresses on SMCA systems

2019-06-11 Thread tip-bot for Yazen Ghannam
Commit-ID:  95d057f54664f3c6e8f650faf5690b82b30a9e52
Gitweb: https://git.kernel.org/tip/95d057f54664f3c6e8f650faf5690b82b30a9e52
Author: Yazen Ghannam 
AuthorDate: Fri, 7 Jun 2019 20:18:04 +
Committer:  Borislav Petkov 
CommitDate: Tue, 11 Jun 2019 15:22:41 +0200

x86/MCE/AMD: Don't cache block addresses on SMCA systems

On legacy systems, the addresses of the MCA_MISC* registers need to be
recursively discovered based on a Block Pointer field in the registers.

On Scalable MCA systems, the register space is fixed, and particular
addresses can be derived by regular offsets for bank and register type.
This fixed address space includes the MCA_MISC* registers.

MCA_MISC0 is always available for each MCA bank. MCA_MISC1 through
MCA_MISC4 are considered available if MCA_MISC0[BlkPtr]=1.

Cache the value of MCA_MISC0[BlkPtr] for each bank and per CPU. This
needs to be done only during init. The values should be saved per CPU
to accommodate heterogeneous SMCA systems.

Redo smca_get_block_address() to directly return the block addresses.
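
A minimal sketch of the idea, assuming the fixed SMCA layout described
above: block 0 always maps to MCA_MISC0, and blocks 1-4 map to
MCA_MISC1-4 only if the bank's bit is set in the cached map. The MSR base
values and the per-bank stride below are assumptions from my reading of
msr-index.h, and smca_block_address() is a hypothetical user-space
stand-in, not the kernel function.

```c
#include <stdint.h>

/* Assumed SMCA MSR layout, for illustration only. */
#define SMCA_MC0_MISC0		0xc0002003u
#define SMCA_MC0_MISC1		0xc000200au
#define SMCA_BANK_STRIDE	0x10u
#define SMCA_MAX_BLOCKS		5u

/*
 * misc_banks_map: one bit per bank, set if MCA_MISC0[BlkPtr] was
 * non-zero at init, i.e. MCA_MISC1-4 are available for that bank.
 * Returns 0 if the requested block does not exist.
 */
static uint32_t smca_block_address(uint32_t misc_banks_map,
				   unsigned int bank, unsigned int block)
{
	if (bank >= 32 || block >= SMCA_MAX_BLOCKS)
		return 0;

	if (block == 0)
		return SMCA_MC0_MISC0 + SMCA_BANK_STRIDE * bank;

	if (!(misc_banks_map & (1u << bank)))
		return 0;

	return SMCA_MC0_MISC1 + (block - 1) + SMCA_BANK_STRIDE * bank;
}
```

Because the address is computed directly from the bank, the block, and a
single cached bit, no table of discovered addresses is needed, which is
what allows the smca_bank_addrs[][] array to go away in the diff below.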

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: "linux-e...@vger.kernel.org" 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: "x...@kernel.org" 
Link: https://lkml.kernel.org/r/20190607201752.221446-4-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/amd.c | 73 ++-
 1 file changed, 37 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index d904aafe6409..d4d6e4b7f9dc 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -101,11 +101,6 @@ static struct smca_bank_name smca_names[] = {
[SMCA_PCIE] = { "pcie", "PCI Express Unit" },
 };
 
-static u32 smca_bank_addrs[MAX_NR_BANKS][NR_BLOCKS] __ro_after_init =
-{
-   [0 ... MAX_NR_BANKS - 1] = { [0 ... NR_BLOCKS - 1] = -1 }
-};
-
 static const char *smca_get_name(enum smca_bank_types t)
 {
if (t >= N_SMCA_BANK_TYPES)
@@ -199,6 +194,9 @@ static char buf_mcatype[MAX_MCATYPE_NAME_LEN];
 static DEFINE_PER_CPU(struct threshold_bank **, threshold_banks);
 static DEFINE_PER_CPU(unsigned int, bank_map); /* see which banks are on */
 
+/* Map of banks that have more than MCA_MISC0 available. */
+static DEFINE_PER_CPU(u32, smca_misc_banks_map);
+
 static void amd_threshold_interrupt(void);
 static void amd_deferred_error_interrupt(void);
 
@@ -208,6 +206,28 @@ static void default_deferred_error_interrupt(void)
 }
 void (*deferred_error_int_vector)(void) = default_deferred_error_interrupt;
 
+static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu)
+{
+   u32 low, high;
+
+   /*
+* For SMCA enabled processors, BLKPTR field of the first MISC register
+* (MCx_MISC0) indicates presence of additional MISC regs set (MISC1-4).
+*/
+   if (rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high))
+   return;
+
+   if (!(low & MCI_CONFIG_MCAX))
+   return;
+
+   if (rdmsr_safe(MSR_AMD64_SMCA_MCx_MISC(bank), &low, &high))
+   return;
+
+   if (low & MASK_BLKPTR_LO)
+   per_cpu(smca_misc_banks_map, cpu) |= BIT(bank);
+
+}
+
 static void smca_configure(unsigned int bank, unsigned int cpu)
 {
unsigned int i, hwid_mcatype;
@@ -245,6 +265,8 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
wrmsr(smca_config, low, high);
}
 
+   smca_set_misc_banks_map(bank, cpu);
+
/* Return early if this bank was already initialized. */
if (smca_banks[bank].hwid)
return;
@@ -455,42 +477,21 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
wrmsr(MSR_CU_DEF_ERR, low, high);
 }
 
-static u32 smca_get_block_address(unsigned int bank, unsigned int block)
+static u32 smca_get_block_address(unsigned int bank, unsigned int block,
+ unsigned int cpu)
 {
-   u32 low, high;
-   u32 addr = 0;
-
-   if (smca_get_bank_type(bank) == SMCA_RESERVED)
-   return addr;
-
if (!block)
return MSR_AMD64_SMCA_MCx_MISC(bank);
 
-   /* Check our cache first: */
-   if (smca_bank_addrs[bank][block] != -1)
-   return smca_bank_addrs[bank][block];
-
-   /*
-* For SMCA enabled processors, BLKPTR field of the first MISC register
-* (MCx_MISC0) indicates presence of additional MISC regs set (MISC1-4).
-*/
-   if (rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high))
-   goto out;
-
-   if (!(low & MCI_CONFIG_MCAX))
-   goto out;
-
-   if (!rdmsr_safe(MSR_AMD64_SMCA_MCx_MISC(bank), &low, &high) &&
-   (low & MASK_BLKPTR_LO))
-   addr = MSR_AMD64_SMCA_MCx_MISC

[tip:ras/core] x86/MCE: Make mce_banks a per-CPU array

2019-06-11 Thread tip-bot for Yazen Ghannam
Commit-ID:  b4914508f1fe0eca1cd011b6026ff762a1aa62d5
Gitweb: https://git.kernel.org/tip/b4914508f1fe0eca1cd011b6026ff762a1aa62d5
Author: Yazen Ghannam 
AuthorDate: Fri, 7 Jun 2019 20:18:04 +
Committer:  Borislav Petkov 
CommitDate: Tue, 11 Jun 2019 15:22:13 +0200

x86/MCE: Make mce_banks a per-CPU array

Current AMD systems have unique MCA banks per logical CPU even though
the type of the banks may all align to the same bank number. Each CPU
will have control of a set of MCA banks in the hardware and these are
not shared with other CPUs.

For example, bank 0 may be the Load-Store Unit on every logical CPU, but
each bank 0 is a unique structure in the hardware. In other words, there
isn't a *single* Load-Store Unit at MCA bank 0 that all logical CPUs
share.

This idea extends even to non-core MCA banks. For example, CPU0 and CPU4
may see a Unified Memory Controller at bank 15, but each CPU is actually
seeing a unique hardware structure that is not shared with other CPUs.

Because the MCA banks are all unique hardware structures, it would be
good to control them in a more granular way. For example, if there is a
known issue with the Floating Point Unit on CPU5 and a user wishes to
disable an error type on the Floating Point Unit, then it would be good
to do this only for CPU5 rather than all CPUs.

Also, future AMD systems may have heterogeneous MCA banks. Meaning
the bank numbers may not necessarily represent the same types between
CPUs. For example, bank 20 visible to CPU0 may be a Unified Memory
Controller and bank 20 visible to CPU4 may be a Coherent Slave. So
granular control will be even more necessary should the user wish to
control specific MCA banks.

Split the device attributes from struct mce_bank leaving only the MCA
bank control fields.

Make struct mce_banks[] per_cpu in order to have more granular control
over individual MCA banks in the hardware.
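
As a rough illustration of what per-CPU granularity buys, the following userspace sketch models the per-CPU array as a two-dimensional array. DEMO_NR_CPUS, DEMO_NR_BANKS and the helper names are hypothetical; the kernel uses DEFINE_PER_CPU storage instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS   8    /* hypothetical CPU count for the sketch */
#define DEMO_NR_BANKS  32   /* stand-in for MAX_NR_BANKS */

struct mce_bank {
    uint64_t ctl;   /* subevents to enable */
    bool init;      /* initialise bank? */
};

/* Stand-in for the per-CPU mce_banks_array: one row per CPU. */
static struct mce_bank banks[DEMO_NR_CPUS][DEMO_NR_BANKS];

/* Enable every subevent on every bank of every CPU. */
static void banks_init(void)
{
    for (unsigned int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
        for (unsigned int b = 0; b < DEMO_NR_BANKS; b++) {
            banks[cpu][b].ctl = ~0ULL;
            banks[cpu][b].init = true;
        }
    }
}

/* Disable one error type (one ctl bit) on one bank of one CPU only. */
static void disable_error_type(unsigned int cpu, unsigned int bank,
                               unsigned int bit)
{
    banks[cpu][bank].ctl &= ~(1ULL << bit);
}
```

With a single shared array, the same call would have changed the control bits for that bank number on all CPUs.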

Allocate the device attributes statically based on the maximum number of
MCA banks supported. The sysfs interface will use as many as needed per
CPU. Currently, this is set to mca_cfg.banks, but will be changed to a
per_cpu bank count in a future patch.

Allocate the MCA control bits statically. This is in order to avoid
locking warnings when memory is allocated during secondary CPUs' init
sequences.

Also, remove the now unnecessary return values from
__mcheck_cpu_mce_banks_init() and __mcheck_cpu_cap_init().

Redo the sysfs store/show functions to handle the per_cpu mce_banks[].

 [ bp: s/mce_banks_percpu/mce_banks_array/g ]

[ Locking issue reported by ]
Reported-by: kernel test robot 
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: "linux-e...@vger.kernel.org" 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: "x...@kernel.org" 
Link: https://lkml.kernel.org/r/20190607201752.221446-3-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/core.c | 76 ++
 1 file changed, 48 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 55bdbedde0b8..49fac95d036b 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -65,16 +65,21 @@ static DEFINE_MUTEX(mce_sysfs_mutex);
 
 DEFINE_PER_CPU(unsigned, mce_exception_count);
 
-#define ATTR_LEN   16
-/* One object for each MCE bank, shared by all CPUs */
 struct mce_bank {
u64 ctl; /* subevents to enable */
bool init;   /* initialise bank? */
+};
+static DEFINE_PER_CPU_READ_MOSTLY(struct mce_bank[MAX_NR_BANKS], mce_banks_array);
+
+#define ATTR_LEN   16
+/* One object for each MCE bank, shared by all CPUs */
+struct mce_bank_dev {
struct device_attribute attr;   /* device attribute */
char attrname[ATTR_LEN]; /* attribute name */
+   u8  bank;   /* bank number */
 };
+static struct mce_bank_dev mce_bank_devs[MAX_NR_BANKS];
 
-static struct mce_bank *mce_banks __read_mostly;
 struct mce_vendor_flags mce_flags __read_mostly;
 
 struct mca_config mca_cfg __read_mostly = {
@@ -684,6 +689,7 @@ DEFINE_PER_CPU(unsigned, mce_poll_count);
  */
 bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 {
+   struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
bool error_seen = false;
struct mce m;
int i;
@@ -1131,6 +1137,7 @@ static void __mc_scan_banks(struct mce *m, struct mce *final,
unsigned long *toclear, unsigned long *valid_banks,
int no_way_out, int *worst)
 {
+   struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
int severity, i;
 
@@ -1472,27 +1479,23 @@ int mce_notify_irq(void)
 }
 EXPORT_SYMBOL_GPL(mce_notify_irq);
 

[tip:ras/core] x86/MCE: Make struct mce_banks[] static

2019-06-11 Thread tip-bot for Yazen Ghannam
Commit-ID:  95fdce6b24f3526c2bd1aad15978d238b79da6bd
Gitweb: https://git.kernel.org/tip/95fdce6b24f3526c2bd1aad15978d238b79da6bd
Author: Yazen Ghannam 
AuthorDate: Fri, 7 Jun 2019 20:18:03 +
Committer:  Borislav Petkov 
CommitDate: Tue, 11 Jun 2019 15:13:51 +0200

x86/MCE: Make struct mce_banks[] static

The struct mce_banks[] array is only used in mce/core.c so move its
definition there and make it static. Also, change the "init" field to
bool type.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-edac 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: "x...@kernel.org" 
Link: https://lkml.kernel.org/r/20190607201752.221446-2-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/core.c | 11 ++-
 arch/x86/kernel/cpu/mce/internal.h | 10 --
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 282916f3b8d8..55bdbedde0b8 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -65,7 +65,16 @@ static DEFINE_MUTEX(mce_sysfs_mutex);
 
 DEFINE_PER_CPU(unsigned, mce_exception_count);
 
-struct mce_bank *mce_banks __read_mostly;
+#define ATTR_LEN   16
+/* One object for each MCE bank, shared by all CPUs */
+struct mce_bank {
+   u64 ctl; /* subevents to enable */
+   bool init;   /* initialise bank? */
+   struct device_attribute attr;   /* device attribute */
+   char attrname[ATTR_LEN]; /* attribute name */
+};
+
+static struct mce_bank *mce_banks __read_mostly;
 struct mce_vendor_flags mce_flags __read_mostly;
 
 struct mca_config mca_cfg __read_mostly = {
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index a34b55baa7aa..35b3e5c02c1c 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -22,17 +22,8 @@ enum severity_level {
 
 extern struct blocking_notifier_head x86_mce_decoder_chain;
 
-#define ATTR_LEN   16
 #define INITIAL_CHECK_INTERVAL 5 * 60 /* 5 minutes */
 
-/* One object for each MCE bank, shared by all CPUs */
-struct mce_bank {
-   u64 ctl; /* subevents to enable */
-   unsigned char init; /* initialise bank? */
-   struct device_attribute attr;   /* device attribute */
-   char attrname[ATTR_LEN]; /* attribute name */
-};
-
 struct mce_evt_llist {
struct llist_node llnode;
struct mce mce;
@@ -47,7 +38,6 @@ struct llist_node *mce_gen_pool_prepare_records(void);
 extern int (*mce_severity)(struct mce *a, int tolerant, char **msg, bool is_excp);
 struct dentry *mce_get_debugfs_dir(void);
 
-extern struct mce_bank *mce_banks;
 extern mce_banks_t mce_banks_ce_disabled;
 
 #ifdef CONFIG_X86_MCE_INTEL


[tip:ras/core] x86/MCE: Add an MCE-record filtering function

2019-04-23 Thread tip-bot for Yazen Ghannam
Commit-ID:  45d4b7b9cb88526f6d5bd4c03efab88d75d10e4f
Gitweb: https://git.kernel.org/tip/45d4b7b9cb88526f6d5bd4c03efab88d75d10e4f
Author: Yazen Ghannam 
AuthorDate: Mon, 25 Mar 2019 16:34:22 +
Committer:  Borislav Petkov 
CommitDate: Tue, 23 Apr 2019 18:04:47 +0200

x86/MCE: Add an MCE-record filtering function

Some systems may report spurious MCA errors. In general, spurious MCA
errors may be disabled by clearing a particular bit in MCA_CTL. However,
clearing a bit in MCA_CTL may not be recommended for some errors, so the
only option is to ignore them.

An MCA error is printed and handled after it has been added to the MCE
event pool. So an MCA error can be ignored by not adding it to that pool
in the first place.

Add such a filtering function.
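
The gating idea can be sketched in plain C as follows. This is only an illustration under stated assumptions: struct demo_mce, the fixed-size pool, and the "bank 1 is spurious" policy are all made up here, while the patch itself adds a filter that always returns false.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_POOL_SIZE 4   /* hypothetical pool capacity */

struct demo_mce {
    uint8_t bank;
    uint64_t status;
};

static struct demo_mce pool[DEMO_POOL_SIZE];
static size_t pool_len;

/* Hypothetical policy: treat every record from bank 1 as spurious. */
static bool demo_filter_mce(const struct demo_mce *m)
{
    return m->bank == 1;
}

/* Returns 0 on success, -EINVAL if filtered out, -ENOMEM if the pool is full. */
static int demo_pool_add(const struct demo_mce *m)
{
    /* A filtered record never reaches the pool, so it is never reported. */
    if (demo_filter_mce(m))
        return -EINVAL;

    if (pool_len == DEMO_POOL_SIZE)
        return -ENOMEM;

    pool[pool_len++] = *m;
    return 0;
}
```

Placing the check at the single entry point of the pool means every later consumer is covered without having to know about the filter.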

 [ bp: Move function prototype to the internal header and massage. ]

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: Arnd Bergmann 
Cc: "cle...@gmail.com" 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Pu Wen 
Cc: Qiuxu Zhuo 
Cc: "ra...@milecki.pl" 
Cc: Shirish S 
Cc:  # 5.0.x
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190325163410.171021-1-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/core.c | 5 +
 arch/x86/kernel/cpu/mce/genpool.c  | 3 +++
 arch/x86/kernel/cpu/mce/internal.h | 3 +++
 3 files changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3e081428117c..80b8c6bff8ed 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1775,6 +1775,11 @@ static void __mcheck_cpu_init_timer(void)
mce_start_timer(t);
 }
 
+bool filter_mce(struct mce *m)
+{
+   return false;
+}
+
 /* Handle unconfigured int18 (should never happen) */
 static void unexpected_machine_check(struct pt_regs *regs, long error_code)
 {
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index 3395549c51d3..64d1d5a00f39 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -99,6 +99,9 @@ int mce_gen_pool_add(struct mce *mce)
 {
struct mce_evt_llist *node;
 
+   if (filter_mce(mce))
+   return -EINVAL;
+
if (!mce_evt_pool)
return -EINVAL;
 
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index af5eab1e65e2..b822a645395d 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -173,4 +173,7 @@ struct mca_msr_regs {
 
 extern struct mca_msr_regs msr_ops;
 
+/* Decide whether to add MCE record to MCE event pool or filter it out. */
+extern bool filter_mce(struct mce *m);
+
 #endif /* __X86_MCE_INTERNAL_H__ */


[tip:ras/core] x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models

2019-04-23 Thread tip-bot for Yazen Ghannam
Commit-ID:  71a84402b93e5fbd8f817f40059c137e10171788
Gitweb: https://git.kernel.org/tip/71a84402b93e5fbd8f817f40059c137e10171788
Author: Yazen Ghannam 
AuthorDate: Mon, 25 Mar 2019 16:34:22 +
Committer:  Borislav Petkov 
CommitDate: Tue, 23 Apr 2019 18:16:07 +0200

x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models

AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA
errors under certain conditions. The errors are benign and can safely be
ignored. However, the high error rate may cause the MCA threshold
counter to overflow causing a high rate of thresholding interrupts.

In addition, users may see the errors reported through the AMD MCE
decoder module, even with the interrupt disabled, due to MCA polling.

Clear the "Counter Present" bit in the Instruction Fetch bank's
MCA_MISC0 register. This will prevent enabling MCA thresholding on this
bank which will prevent the high interrupt rate due to this error.

Define an AMD-specific function to filter these errors from the MCE
event pool so that they don't get reported during early boot.
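
A small standalone sketch of the two checks involved may help. It mirrors the bit positions used in the patch (extended error code in MCA_STATUS[21:16], CntP at bit 62 of MCA_MISC0), but the function names are hypothetical and the is_if_bank flag stands in for the kernel's bank-type lookup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extended error code: MCA_STATUS bits [21:16], as extracted in the patch. */
static uint8_t demo_xec(uint64_t status)
{
    return (status >> 16) & 0x3F;
}

/*
 * True for the error that Erratum #1114 says can be ignored.
 * is_if_bank stands in for smca_get_bank_type(bank) == SMCA_IF.
 */
static bool demo_is_erratum_1114(unsigned int family, unsigned int model,
                                 bool is_if_bank, uint64_t status)
{
    return family == 0x17 &&
           model >= 0x10 && model <= 0x2F &&
           is_if_bank && demo_xec(status) == 10;
}

/* Clearing CntP (bit 62) keeps thresholding from being enabled on the bank. */
static uint64_t demo_clear_cntp(uint64_t misc0)
{
    return misc0 & ~(1ULL << 62);
}
```

The two measures are complementary: clearing CntP stops the interrupt storm, and the filter keeps the polled records out of the log.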

Rename filter function in EDAC/mce_amd to avoid a naming conflict, while
at it.

 [ bp: Move function prototype to the internal header and
   massage/cleanup, fix typos. ]

Reported-by: Rafał Miłecki 
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: "cle...@gmail.com" 
Cc: Arnd Bergmann 
Cc: Ingo Molnar 
Cc: James Morse 
Cc: Kees Cook 
Cc: Mauro Carvalho Chehab 
Cc: Pu Wen 
Cc: Qiuxu Zhuo 
Cc: Shirish S 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: linux-edac 
Cc: x86-ml 
Cc:  # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
Cc:  # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
Cc:  # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in <asm/mce.h>
Cc:  # 5.0.x
Link: https://lkml.kernel.org/r/20190325163410.171021-2-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/amd.c  | 52 --
 arch/x86/kernel/cpu/mce/core.c |  3 +++
 arch/x86/kernel/cpu/mce/internal.h |  6 +
 drivers/edac/mce_amd.c |  4 +--
 4 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index e64de5149e50..d904aafe6409 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -563,33 +563,59 @@ out:
return offset;
 }
 
+bool amd_filter_mce(struct mce *m)
+{
+   enum smca_bank_types bank_type = smca_get_bank_type(m->bank);
+   struct cpuinfo_x86 *c = &boot_cpu_data;
+   u8 xec = (m->status >> 16) & 0x3F;
+
+   /* See Family 17h Models 10h-2Fh Erratum #1114. */
+   if (c->x86 == 0x17 &&
+   c->x86_model >= 0x10 && c->x86_model <= 0x2F &&
+   bank_type == SMCA_IF && xec == 10)
+   return true;
+
+   return false;
+}
+
 /*
- * Turn off MC4_MISC thresholding banks on all family 0x15 models since
- * they're not supported there.
+ * Turn off thresholding banks for the following conditions:
+ * - MC4_MISC thresholding is not supported on Family 0x15.
+ * - Prevent possible spurious interrupts from the IF bank on Family 0x17
+ *   Models 0x10-0x2F due to Erratum #1114.
  */
-void disable_err_thresholding(struct cpuinfo_x86 *c)
+void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 {
-   int i;
+   int i, num_msrs;
u64 hwcr;
bool need_toggle;
-   u32 msrs[] = {
-   0x0413, /* MC4_MISC0 */
-   0xc408, /* MC4_MISC1 */
-   };
+   u32 msrs[NR_BLOCKS];
+
+   if (c->x86 == 0x15 && bank == 4) {
+   msrs[0] = 0x0413; /* MC4_MISC0 */
+   msrs[1] = 0xc408; /* MC4_MISC1 */
+   num_msrs = 2;
+   } else if (c->x86 == 0x17 &&
+  (c->x86_model >= 0x10 && c->x86_model <= 0x2F)) {
 
-   if (c->x86 != 0x15)
+   if (smca_get_bank_type(bank) != SMCA_IF)
+   return;
+
+   msrs[0] = MSR_AMD64_SMCA_MCx_MISC(bank);
+   num_msrs = 1;
+   } else {
return;
+   }
 
rdmsrl(MSR_K7_HWCR, hwcr);
 
/* McStatusWrEn has to be set */
need_toggle = !(hwcr & BIT(18));
-
if (need_toggle)
wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
 
/* Clear CntP bit safely */
-   for (i = 0; i < ARRAY_SIZE(msrs); i++)
+   for (i = 0; i < num_msrs; i++)
msr_clear_bit(msrs[i], 62);
 
/* restore old settings */
@@ -604,12 +630,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
unsigned int bank, block, cpu = smp_processor_id();
int offset = -1;
 
- 

[tip:ras/core] x86/mce: Handle varying MCA bank counts

2019-03-27 Thread tip-bot for Yazen Ghannam
Commit-ID:  006c077041dc73b9490fffc4c6af5befe0687110
Gitweb: https://git.kernel.org/tip/006c077041dc73b9490fffc4c6af5befe0687110
Author: Yazen Ghannam 
AuthorDate: Fri, 27 Jul 2018 16:40:09 -0500
Committer:  Borislav Petkov 
CommitDate: Wed, 27 Mar 2019 13:12:49 +0100

x86/mce: Handle varying MCA bank counts

Linux reads MCG_CAP[Count] to find the number of MCA banks visible to a
CPU. Currently, this number is the same for all CPUs and a warning is
shown if there is a difference. The number of banks is overwritten with
the MCG_CAP[Count] value of each following CPU that boots.

According to the Intel SDM and AMD APM, the MCG_CAP[Count] value gives
the number of banks that are available to a "processor implementation".
The AMD BKDGs/PPRs further clarify that this value is per core. This
value has historically been the same for every core in the system, but
that is not an architectural requirement.

Future AMD systems may have different MCG_CAP[Count] values per core,
so the assumption that all CPUs will have the same MCG_CAP[Count] value
will no longer be valid.

Also, the first CPU to boot will allocate the struct mce_banks[] array
using the number of banks based on its MCG_CAP[Count] value. The machine
check handler and other functions use the global number of banks to
iterate and index into the mce_banks[] array. So it's possible to use an
out-of-bounds index on an asymmetric system where a following CPU sees a
MCG_CAP[Count] value greater than its predecessors.

Thus, allocate the mce_banks[] array to the maximum number of banks.
This will avoid the potential out-of-bounds index since the value of
mca_cfg.banks is capped to MAX_NR_BANKS.

Set the value of mca_cfg.banks equal to the max of the previous value
and the value for the current CPU. This way mca_cfg.banks will always
represent the max number of banks detected on any CPU in the system.
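
The "cap, then keep the maximum" rule can be sketched as below. This is a userspace approximation: demo_banks stands in for mca_cfg.banks, and the values of DEMO_MAX_NR_BANKS and DEMO_BANKCNT_MASK are assumptions rather than values taken from this patch.

```c
#include <stdint.h>

#define DEMO_MAX_NR_BANKS  32     /* assumed cap; the kernel uses MAX_NR_BANKS */
#define DEMO_BANKCNT_MASK  0xffu  /* assumed: MCG_CAP[Count] is the low 8 bits */

static uint8_t demo_banks;  /* stand-in for mca_cfg.banks */

/* Called once per CPU, in boot order, with that CPU's MCG_CAP value. */
static void demo_cap_init(uint64_t mcg_cap)
{
    uint8_t b = mcg_cap & DEMO_BANKCNT_MASK;

    /* Never index past the statically sized array. */
    if (b > DEMO_MAX_NR_BANKS)
        b = DEMO_MAX_NR_BANKS;

    /* Keep the largest count seen so far instead of overwriting it. */
    if (b > demo_banks)
        demo_banks = b;
}
```

Because the stored count only grows, a CPU that boots later with fewer banks cannot shrink the range that earlier CPUs iterate over.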

This will ensure that all CPUs will access all the banks that are
visible to them. A CPU that can access fewer than the max number of
banks will find the registers of the extra banks to be read-as-zero.

Furthermore, print the resulting number of MCA banks in use. Do this in
mcheck_late_init() so that the final value is printed after all CPUs
have been initialized.

Finally, get bank count from target CPU when doing injection with mce-inject
module.

 [ bp: Remove out-of-bounds example, passify and cleanup commit message. ]

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: linux-edac 
Cc: Pu Wen 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20180727214009.78289-1-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/core.c   | 22 +++---
 arch/x86/kernel/cpu/mce/inject.c | 14 +++---
 2 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e558ca77cfe8..c3498732ba28 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1481,13 +1481,12 @@ EXPORT_SYMBOL_GPL(mce_notify_irq);
 static int __mcheck_cpu_mce_banks_init(void)
 {
int i;
-   u8 num_banks = mca_cfg.banks;
 
-   mce_banks = kcalloc(num_banks, sizeof(struct mce_bank), GFP_KERNEL);
+   mce_banks = kcalloc(MAX_NR_BANKS, sizeof(struct mce_bank), GFP_KERNEL);
if (!mce_banks)
return -ENOMEM;
 
-   for (i = 0; i < num_banks; i++) {
+   for (i = 0; i < MAX_NR_BANKS; i++) {
struct mce_bank *b = &mce_banks[i];
 
b->ctl = -1ULL;
@@ -1501,28 +1500,19 @@ static int __mcheck_cpu_mce_banks_init(void)
  */
 static int __mcheck_cpu_cap_init(void)
 {
-   unsigned b;
u64 cap;
+   u8 b;
 
rdmsrl(MSR_IA32_MCG_CAP, cap);
 
b = cap & MCG_BANKCNT_MASK;
-   if (!mca_cfg.banks)
-   pr_info("CPU supports %d MCE banks\n", b);
-
-   if (b > MAX_NR_BANKS) {
-   pr_warn("Using only %u machine check banks out of %u\n",
-   MAX_NR_BANKS, b);
+   if (WARN_ON_ONCE(b > MAX_NR_BANKS))
b = MAX_NR_BANKS;
-   }
 
-   /* Don't support asymmetric configurations today */
-   WARN_ON(mca_cfg.banks != 0 && b != mca_cfg.banks);
-   mca_cfg.banks = b;
+   mca_cfg.banks = max(mca_cfg.banks, b);
 
if (!mce_banks) {
int err = __mcheck_cpu_mce_banks_init();
-
if (err)
return err;
}
@@ -2481,6 +2471,8 @@ EXPORT_SYMBOL_GPL(mcsafe_key);
 
 static int __init mcheck_late_init(void)
 {
+   pr_info("Using %d MCE banks\n", mca_cfg.banks);
+
if (mca_cfg.recovery)
static_branch_inc(&mcsafe_key);
 
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 8492ef7d9015..3f82afd0f46f 100644
--- a/ar

[tip:ras/core] x86/MCE: Group AMD function prototypes in <asm/mce.h>

2019-03-24 Thread tip-bot for Yazen Ghannam
Commit-ID:  9308fd4074551f222f30322d1ee8c5aff18e9747
Gitweb: https://git.kernel.org/tip/9308fd4074551f222f30322d1ee8c5aff18e9747
Author: Yazen Ghannam 
AuthorDate: Fri, 22 Mar 2019 20:29:00 +
Committer:  Borislav Petkov 
CommitDate: Sun, 24 Mar 2019 10:54:13 +0100

x86/MCE: Group AMD function prototypes in <asm/mce.h>

There are two groups of "ifdef CONFIG_X86_MCE_AMD" function prototypes
in <asm/mce.h>. Merge these two groups.

No functional change.

 [ bp: align vertically. ]

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: Arnd Bergmann 
Cc: "cle...@gmail.com" 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Pu Wen 
Cc: Qiuxu Zhuo 
Cc: "ra...@milecki.pl" 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190322202848.20749-3-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h | 25 +++--
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 22d05e3835f0..dc2d4b206ab7 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -210,16 +210,6 @@ static inline void cmci_rediscover(void) {}
 static inline void cmci_recheck(void) {}
 #endif
 
-#ifdef CONFIG_X86_MCE_AMD
-void mce_amd_feature_init(struct cpuinfo_x86 *c);
-int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr);
-#else
-static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
-static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; };
-#endif
-
-static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); }
-
 int mce_available(struct cpuinfo_x86 *c);
 bool mce_is_memory_error(struct mce *m);
 bool mce_is_correctable(struct mce *m);
@@ -345,12 +335,19 @@ extern bool amd_mce_is_memory_error(struct mce *m);
 extern int mce_threshold_create_device(unsigned int cpu);
 extern int mce_threshold_remove_device(unsigned int cpu);
 
-#else
+void mce_amd_feature_init(struct cpuinfo_x86 *c);
+int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr);
 
-static inline int mce_threshold_create_device(unsigned int cpu) { return 0; };
-static inline int mce_threshold_remove_device(unsigned int cpu) { return 0; };
-static inline bool amd_mce_is_memory_error(struct mce *m) { return false; };
+#else
 
+static inline int mce_threshold_create_device(unsigned int cpu) { return 0; };
+static inline int mce_threshold_remove_device(unsigned int cpu) { return 0; };
+static inline bool amd_mce_is_memory_error(struct mce *m)  { return false; };
+static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
+static inline int
+umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; };
 #endif
 
+static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c)   { return mce_amd_feature_init(c); }
+
 #endif /* _ASM_X86_MCE_H */


[tip:ras/core] EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit

2019-02-15 Thread tip-bot for Yazen Ghannam
Commit-ID:  3f4da372ec8e4ce58c17ac4f2e3c8891bbfea17e
Gitweb: https://git.kernel.org/tip/3f4da372ec8e4ce58c17ac4f2e3c8891bbfea17e
Author: Yazen Ghannam 
AuthorDate: Tue, 12 Feb 2019 21:24:28 +
Committer:  Borislav Petkov 
CommitDate: Fri, 15 Feb 2019 14:25:58 +0100

EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit

Previous AMD systems have had a bit in MCA_STATUS to indicate that an
error was detected on a scrub operation. However, this bit was defined
differently within different banks and families/models.

Starting with Family 17h, MCA_STATUS[40] is either Reserved/Read-as-Zero
or defined as "Scrub", for all MCA banks and CPU models. Therefore, this
bit can be defined as the "Scrub" bit.

Define MCA_STATUS[40] as "Scrub" and decode it in the AMD MCE decoding
module for Family 17h and newer systems.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: James Morse 
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: Pu Wen 
Cc: Qiuxu Zhuo 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190212212417.107049-1-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h | 1 +
 drivers/edac/mce_amd.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 299a38536567..22d05e3835f0 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -48,6 +48,7 @@
 #define MCI_STATUS_SYNDV   BIT_ULL(53)  /* synd reg. valid */
 #define MCI_STATUS_DEFERRED BIT_ULL(44)  /* uncorrected error, deferred exception */
 #define MCI_STATUS_POISON  BIT_ULL(43)  /* access poisonous data */
+#define MCI_STATUS_SCRUB   BIT_ULL(40)  /* Error detected during scrub operation */
 
 /*
  * McaX field if set indicates a given bank supports MCA extensions:
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index f286b880f981..b349c22bb386 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1078,6 +1078,9 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
if (ecc)
pr_cont("|%sECC", ((ecc == 2) ? "C" : "U"));
 
+   if (fam >= 0x17)
+   pr_cont("|%s", (m->status & MCI_STATUS_SCRUB ? "Scrub" : "-"));
+
pr_cont("]: 0x%016llx\n", m->status);
 
if (m->status & MCI_STATUS_ADDRV)


[tip:ras/core] EDAC/mce_amd: Decode MCA_STATUS in bit definition order

2019-02-15 Thread tip-bot for Yazen Ghannam
Commit-ID:  a0bcd3c0b8a52ba0eb74371fa6be15ad0390ba67
Gitweb: https://git.kernel.org/tip/a0bcd3c0b8a52ba0eb74371fa6be15ad0390ba67
Author: Yazen Ghannam 
AuthorDate: Tue, 12 Feb 2019 21:24:29 +
Committer:  Borislav Petkov 
CommitDate: Fri, 15 Feb 2019 14:36:31 +0100

EDAC/mce_amd: Decode MCA_STATUS in bit definition order

Sort the MCA_STATUS bits in decode output to follow how they are defined
in the register.

The order is as follows:

  Bit | Decode
  ----+---------
  62  | Over
  61  | UC
  59  | MiscV
  58  | AddrV
  57  | PCC
  55  | TCC
  53  | SyndV
  46  | CECC
  45  | UECC
  44  | Deferred
  43  | Poison
  40  | Scrub
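
One way to keep output and register layout in step is to drive the decode from a table sorted like the list above. The sketch below is a simplification and not what mce_amd.c does: the real decoder prints some fields conditionally (family, SMCA, bank) and uses pr_cont() rather than a buffer, and all demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct demo_status_bit {
    unsigned int bit;
    const char *name;
};

/* Same order as the list above: highest bit first. */
static const struct demo_status_bit demo_bits[] = {
    { 62, "Over" },  { 61, "UC" },       { 59, "MiscV" },  { 58, "AddrV" },
    { 57, "PCC" },   { 55, "TCC" },      { 53, "SyndV" },  { 46, "CECC" },
    { 45, "UECC" },  { 44, "Deferred" }, { 43, "Poison" }, { 40, "Scrub" },
};

/* Writes a '|'-separated decode into buf, "-" for clear bits; returns buf. */
static char *demo_decode_status(uint64_t status, char *buf, size_t len)
{
    size_t used = 0;

    if (len)
        buf[0] = '\0';

    for (size_t i = 0; i < sizeof(demo_bits) / sizeof(demo_bits[0]); i++) {
        const char *s = ((status >> demo_bits[i].bit) & 1) ? demo_bits[i].name : "-";
        int n = snprintf(buf + used, len - used, "%s%s", i ? "|" : "", s);

        /* Stop on error or truncation; buf stays NUL-terminated. */
        if (n < 0 || (size_t)n >= len - used)
            break;
        used += (size_t)n;
    }

    return buf;
}
```

With a table like this, adding a newly defined bit is a one-line change at the right position, and the printed order cannot drift from the register order.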

 [ bp: Massage a bit. ]

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: Mauro Carvalho Chehab 
Cc: linux-edac 
Cc: x...@kernel.org
Link: https://lkml.kernel.org/r/20190212212417.107049-2-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index b349c22bb386..0a1814dad6cf 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1051,26 +1051,18 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
((m->status & MCI_STATUS_UC)? "UE":
 (m->status & MCI_STATUS_DEFERRED) ? "-"  : "CE"),
((m->status & MCI_STATUS_MISCV) ? "MiscV" : "-"),
-   ((m->status & MCI_STATUS_PCC)   ? "PCC"   : "-"),
-   ((m->status & MCI_STATUS_ADDRV) ? "AddrV" : "-"));
-
-   if (fam >= 0x15) {
-   pr_cont("|%s", (m->status & MCI_STATUS_DEFERRED ? "Deferred" : "-"));
-
-   /* F15h, bank4, bit 43 is part of McaStatSubCache. */
-   if (fam != 0x15 || m->bank != 4)
-   pr_cont("|%s", (m->status & MCI_STATUS_POISON ? "Poison" : "-"));
-   }
+   ((m->status & MCI_STATUS_ADDRV) ? "AddrV" : "-"),
+   ((m->status & MCI_STATUS_PCC)   ? "PCC"   : "-"));
 
if (boot_cpu_has(X86_FEATURE_SMCA)) {
u32 low, high;
u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
 
-   pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
-
if (!rdmsr_safe(addr, &low, &high) &&
(low & MCI_CONFIG_MCAX))
pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
+
+   pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
}
 
/* do the two bits[14:13] together */
@@ -1078,6 +1070,14 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
if (ecc)
pr_cont("|%sECC", ((ecc == 2) ? "C" : "U"));
 
+   if (fam >= 0x15) {
+   pr_cont("|%s", (m->status & MCI_STATUS_DEFERRED ? "Deferred" : "-"));
+
+   /* F15h, bank4, bit 43 is part of McaStatSubCache. */
+   if (fam != 0x15 || m->bank != 4)
+   pr_cont("|%s", (m->status & MCI_STATUS_POISON ? "Poison" : "-"));
+   }
+
if (fam >= 0x17)
pr_cont("|%s", (m->status & MCI_STATUS_SCRUB ? "Scrub" : "-"));
 


[tip:ras/core] EDAC, mce_amd: Print ExtErrorCode and description on a single line

2019-02-04 Thread tip-bot for Yazen Ghannam
Commit-ID:  1c1522d32ac49065f88e5a8b3d6e3a5613b20118
Gitweb: https://git.kernel.org/tip/1c1522d32ac49065f88e5a8b3d6e3a5613b20118
Author: Yazen Ghannam 
AuthorDate: Fri, 1 Feb 2019 22:55:54 +
Committer:  Borislav Petkov 
CommitDate: Mon, 4 Feb 2019 19:29:13 +0100

EDAC, mce_amd: Print ExtErrorCode and description on a single line

Save a log line by printing the extended error code and the description
on a single line. This is similar to how errors are printed in other
subsystems, e.g. "#, description". If we don't have a valid description
then only the number/code is printed.
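
The resulting "#, description" format can be approximated in userspace like this. It is a sketch only: demo_format_xec() is a hypothetical helper that writes into a buffer, whereas the patch prints with pr_emerg() and pr_cont().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Formats "<ip> Ext. Error Code: <n>, <description>." on one line.
 * desc may be NULL when the code has no valid description, in which
 * case only the number is printed. Returns what snprintf() returns.
 */
static int demo_format_xec(char *buf, size_t len, const char *ip_name,
                           unsigned int xec, const char *desc)
{
    if (desc)
        return snprintf(buf, len, "%s Ext. Error Code: %u, %s.",
                        ip_name, xec, desc);

    return snprintf(buf, len, "%s Ext. Error Code: %u", ip_name, xec);
}
```

Keeping the number first means the line is still searchable by code when the description text changes between documentation revisions.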

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: Tony Luck 
Cc: x...@kernel.org
Link: https://lkml.kernel.org/r/20190201225534.8177-6-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 7e29ceabdf6f..f286b880f981 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -965,13 +965,12 @@ static void decode_smca_error(struct mce *m)
 
ip_name = smca_get_long_name(bank_type);
 
-   pr_emerg(HW_ERR "%s Extended Error Code: %d\n", ip_name, xec);
+   pr_emerg(HW_ERR "%s Ext. Error Code: %d", ip_name, xec);
 
/* Only print the decode of valid error codes */
if (xec < smca_mce_descs[bank_type].num_descs &&
(hwid->xec_bitmap & BIT_ULL(xec))) {
-   pr_emerg(HW_ERR "%s Error: ", ip_name);
-   pr_cont("%s.\n", smca_mce_descs[bank_type].descs[xec]);
+   pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
}
 
if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)


[tip:ras/core] EDAC, mce_amd: Match error descriptions to latest documentation

2019-02-04 Thread tip-bot for Yazen Ghannam
Commit-ID:  e03447ee718b331be8f3abc388c7bf7d325dfab4
Gitweb: https://git.kernel.org/tip/e03447ee718b331be8f3abc388c7bf7d325dfab4
Author: Yazen Ghannam 
AuthorDate: Fri, 1 Feb 2019 22:55:53 +
Committer:  Borislav Petkov 
CommitDate: Sun, 3 Feb 2019 13:16:50 +0100

EDAC, mce_amd: Match error descriptions to latest documentation

Update the error descriptions to match the latest documentation for
easier searching. In some cases the changes are small and in other cases
the changes may be total rewording of the description.

No functional changes.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: Tony Luck 
Cc: x...@kernel.org
Link: https://lkml.kernel.org/r/20190201225534.8177-5-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 166 -
 1 file changed, 83 insertions(+), 83 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index c79e650aa606..7e29ceabdf6f 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -151,74 +151,74 @@ static const char * const mc6_mce_desc[] = {
 
 /* Scalable MCA error strings */
 static const char * const smca_ls_mce_desc[] = {
-   "Load queue parity",
-   "Store queue parity",
-   "Miss address buffer payload parity",
-   "L1 TLB parity",
+   "Load queue parity error",
+   "Store queue parity error",
+   "Miss address buffer payload parity error",
+   "Level 1 TLB parity error",
"DC Tag error type 5",
-   "DC tag error type 6",
-   "DC tag error type 1",
+   "DC Tag error type 6",
+   "DC Tag error type 1",
"Internal error type 1",
"Internal error type 2",
-   "Sys Read data error thread 0",
-   "Sys read data error thread 1",
-   "DC tag error type 2",
-   "DC data error type 1 (poison consumption)",
-   "DC data error type 2",
-   "DC data error type 3",
-   "DC tag error type 4",
-   "L2 TLB parity",
+   "System Read Data Error Thread 0",
+   "System Read Data Error Thread 1",
+   "DC Tag error type 2",
+   "DC Data error type 1 and poison consumption",
+   "DC Data error type 2",
+   "DC Data error type 3",
+   "DC Tag error type 4",
+   "Level 2 TLB parity error",
"PDC parity error",
-   "DC tag error type 3",
-   "DC tag error type 5",
-   "L2 fill data error",
+   "DC Tag error type 3",
+   "DC Tag error type 5",
+   "L2 Fill Data error",
 };
 
 static const char * const smca_if_mce_desc[] = {
-   "microtag probe port parity error",
-   "IC microtag or full tag multi-hit error",
-   "IC full tag parity",
-   "IC data array parity",
-   "Decoupling queue phys addr parity error",
-   "L0 ITLB parity error",
-   "L1 ITLB parity error",
-   "L2 ITLB parity error",
-   "BPQ snoop parity on Thread 0",
-   "BPQ snoop parity on Thread 1",
-   "L1 BTB multi-match error",
-   "L2 BTB multi-match error",
-   "L2 Cache Response Poison error",
-   "System Read Data error",
+   "Op Cache Microtag Probe Port Parity Error",
+   "IC Microtag or Full Tag Multi-hit Error",
+   "IC Full Tag Parity Error",
+   "IC Data Array Parity Error",
+   "Decoupling Queue PhysAddr Parity Error",
+   "L0 ITLB Parity Error",
+   "L1 ITLB Parity Error",
+   "L2 ITLB Parity Error",
+   "BPQ Thread 0 Snoop Parity Error",
+   "BPQ Thread 1 Snoop Parity Error",
+   "L1 BTB Multi-Match Error",
+   "L2 BTB Multi-Match Error",
+   "L2 Cache Response Poison Error",
+   "System Read Data Error",
 };
 
 static const char * const smca_l2_mce_desc[] = {
-   "L2M tag multi-way-hit error",
-   "L2M tag ECC error",
-   "L2M data ECC error",
-   "HW assert",
+   "L2M Tag Multiple-Way-Hit error",
+   "L2M Tag or State Array ECC Error",
+   "L2M Data Array ECC Error",
+   "Hardware Assert Error",
 };
 
 static const char * const smca_de_mce_desc[] = {
-   "uop cache tag parity error",
-   "uop cache data parity error",
-   "Insn buffer parity error",
-   "uop queue parity error",

[tip:ras/core] x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units

2019-02-04 Thread tip-bot for Yazen Ghannam
Commit-ID:  3ad7e748c12cc771df6020a552def3e1727e8a17
Gitweb: https://git.kernel.org/tip/3ad7e748c12cc771df6020a552def3e1727e8a17
Author: Yazen Ghannam 
AuthorDate: Fri, 1 Feb 2019 22:55:52 +
Committer:  Borislav Petkov 
CommitDate: Sun, 3 Feb 2019 13:01:57 +0100

x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units

The existing CS, PSP, and SMU SMCA bank types will see new versions (as
indicated by their McaTypes) in future SMCA systems.

Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the
same names as the older versions, since they are logically the same to
the user. SMCA systems won't mix and match IP blocks with different
McaType versions in the same system, so there isn't a need to
distinguish them. The MCA_IPID register is saved when logging an MCA
error, and that can be used to triage the error.

Also, add the new error descriptions to edac_mce_amd. Some error types
(positions in the list) are overloaded compared to the previous
McaTypes. Therefore, just create new lists of the error descriptions to
keep things simple even if some of the error descriptions are the same
between versions.
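
A minimal sketch of the tuple lookup described above, in userspace C. The (HWID, McaType) values are taken from this patch; the enum, the struct layout, lookup_bank(), and the packing used by the HWID_MCATYPE() macro here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum bank_type { BANK_CS, BANK_CS_V2, BANK_PSP, BANK_PSP_V2 };

/* Assumed packing for this sketch: HWID in the upper half, McaType below. */
#define HWID_MCATYPE(hwid, mcatype)	(((uint32_t)(hwid) << 16) | (mcatype))

struct hwid_entry {
	enum bank_type type;
	uint32_t hwid_mcatype;
	const char *name;	/* both versions share one user-visible name */
};

static const struct hwid_entry hwid_table[] = {
	{ BANK_CS,     HWID_MCATYPE(0x2E, 0x0), "coherent_slave" },
	{ BANK_CS_V2,  HWID_MCATYPE(0x2E, 0x2), "coherent_slave" },
	{ BANK_PSP,    HWID_MCATYPE(0xFF, 0x0), "psp" },
	{ BANK_PSP_V2, HWID_MCATYPE(0xFF, 0x1), "psp" },
};

/* Returns the matching entry, or NULL for an unknown tuple. */
static const struct hwid_entry *lookup_bank(uint32_t hwid, uint32_t mcatype)
{
	size_t i;

	assert(hwid <= 0xFFFF && mcatype <= 0xFFFF);

	for (i = 0; i < sizeof(hwid_table) / sizeof(hwid_table[0]); i++) {
		if (hwid_table[i].hwid_mcatype == HWID_MCATYPE(hwid, mcatype))
			return &hwid_table[i];
	}
	return NULL;
}
```

Because the two versions differ only in McaType, a lookup on (0x2E, 0x2) returns a distinct bank type that still carries the same name as (0x2E, 0x0).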

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: Arnd Bergmann 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Kees Cook 
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: Pu Wen 
Cc: Qiuxu Zhuo 
Cc: Shirish S 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190201225534.8177-3-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h|  3 +++
 arch/x86/kernel/cpu/mce/amd.c |  6 +
 drivers/edac/mce_amd.c| 55 +++
 3 files changed, 64 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 91b65d859ca8..299a38536567 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -307,11 +307,14 @@ enum smca_bank_types {
SMCA_FP,/* Floating Point */
SMCA_L3_CACHE,  /* L3 Cache */
SMCA_CS,/* Coherent Slave */
+   SMCA_CS_V2, /* Coherent Slave */
SMCA_PIE,   /* Power, Interrupts, etc. */
SMCA_UMC,   /* Unified Memory Controller */
SMCA_PB,/* Parameter Block */
SMCA_PSP,   /* Platform Security Processor */
+   SMCA_PSP_V2,/* Platform Security Processor */
SMCA_SMU,   /* System Management Unit */
+   SMCA_SMU_V2,/* System Management Unit */
SMCA_MP5,   /* Microprocessor 5 Unit */
SMCA_NBIO,  /* Northbridge IO Unit */
SMCA_PCIE,  /* PCI Express Unit */
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 00f60b8c7e4f..bd1331b241ca 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -88,11 +88,14 @@ static struct smca_bank_name smca_names[] = {
[SMCA_FP]   = { "floating_point",   "Floating Point Unit" },
[SMCA_L3_CACHE] = { "l3_cache", "L3 Cache" },
[SMCA_CS]   = { "coherent_slave",   "Coherent Slave" },
+   [SMCA_CS_V2]= { "coherent_slave",   "Coherent Slave" },
[SMCA_PIE]  = { "pie",  "Power, Interrupts, etc." },
[SMCA_UMC]  = { "umc",  "Unified Memory Controller" },
[SMCA_PB]   = { "param_block",  "Parameter Block" },
[SMCA_PSP]  = { "psp",  "Platform Security Processor" },
+   [SMCA_PSP_V2]   = { "psp",  "Platform Security Processor" },
[SMCA_SMU]  = { "smu",  "System Management Unit" },
+   [SMCA_SMU_V2]   = { "smu",  "System Management Unit" },
[SMCA_MP5]  = { "mp5",  "Microprocessor 5 Unit" },
[SMCA_NBIO] = { "nbio", "Northbridge IO Unit" },
[SMCA_PCIE] = { "pcie", "PCI Express Unit" },
@@ -153,6 +156,7 @@ static struct smca_hwid smca_hwid_mcatypes[] = {
/* Data Fabric MCA types */
{ SMCA_CS,   HWID_MCATYPE(0x2E, 0x0), 0x1FF },
{ SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1), 0xF },
+   { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF },
 
/* Unified Memory Controller MCA type */
{ SMCA_UMC,  HWID_MCATYPE(0x96, 0x0), 0x3F },
@@ -162,9 +166,11 @@ static struct smca_hwid smca_hwid_mcatypes[] = {
 
/* Platform Security Processor MCA type */
{ SMCA_PSP,  HWID_MCATYPE(0xFF, 0x0), 0x1 },
+   { SMCA_PSP_V2,   HWID_MCATYPE(0xFF, 0x1), 0x3 },
 
/* System Management Unit MCA type */
{ SMCA_SMU,  HWID_MCATYPE(0x01, 0x0), 0x1 },
+   { SMCA_SMU_V2,   HWID_MCA

[tip:ras/core] x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA bank types

2019-02-04 Thread tip-bot for Yazen Ghannam
Commit-ID:  8a5dd2cd2f2e94878cacc969655a69ca214795ab
Gitweb: https://git.kernel.org/tip/8a5dd2cd2f2e94878cacc969655a69ca214795ab
Author: Yazen Ghannam 
AuthorDate: Fri, 1 Feb 2019 22:55:52 +
Committer:  Borislav Petkov 
CommitDate: Sun, 3 Feb 2019 13:05:16 +0100

x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA bank types

Some SMCA bank types on future systems will report new error types even
though the bank type is not treated as a new version. These new error
types will be reported by bits that are reserved in past systems.

Add the new error descriptions to the lists in edac_mce_amd.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Kees Cook 
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: Shirish S 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190201225534.8177-4-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/amd.c | 8 
 drivers/edac/mce_amd.c| 6 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index bd1331b241ca..e64de5149e50 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -144,22 +144,22 @@ static struct smca_hwid smca_hwid_mcatypes[] = {
{ SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 },
 
/* ZN Core (HWID=0xB0) MCA types */
-   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0), 0x1FFFEF },
+   { SMCA_LS,   HWID_MCATYPE(0xB0, 0x0), 0x1F },
{ SMCA_IF,   HWID_MCATYPE(0xB0, 0x1), 0x3FFF },
{ SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF },
{ SMCA_DE,   HWID_MCATYPE(0xB0, 0x3), 0x1FF },
/* HWID 0xB0 MCATYPE 0x4 is Reserved */
-   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5), 0x7FF },
+   { SMCA_EX,   HWID_MCATYPE(0xB0, 0x5), 0xFFF },
{ SMCA_FP,   HWID_MCATYPE(0xB0, 0x6), 0x7F },
{ SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF },
 
/* Data Fabric MCA types */
{ SMCA_CS,   HWID_MCATYPE(0x2E, 0x0), 0x1FF },
-   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1), 0xF },
+   { SMCA_PIE,  HWID_MCATYPE(0x2E, 0x1), 0x1F },
{ SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF },
 
/* Unified Memory Controller MCA type */
-   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0), 0x3F },
+   { SMCA_UMC,  HWID_MCATYPE(0x96, 0x0), 0xFF },
 
/* Parameter Block MCA type */
{ SMCA_PB,   HWID_MCATYPE(0x05, 0x0), 0x1 },
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 184c90172d17..c79e650aa606 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -155,7 +155,7 @@ static const char * const smca_ls_mce_desc[] = {
"Store queue parity",
"Miss address buffer payload parity",
"L1 TLB parity",
-   "Reserved",
+   "DC Tag error type 5",
"DC tag error type 6",
"DC tag error type 1",
"Internal error type 1",
@@ -222,6 +222,7 @@ static const char * const smca_ex_mce_desc[] = {
"Retire status queue parity error",
"Scheduling queue parity error",
"Branch buffer queue parity error",
+   "Hardware Assertion error",
 };
 
 static const char * const smca_fp_mce_desc[] = {
@@ -279,6 +280,7 @@ static const char * const smca_pie_mce_desc[] = {
"Internal PIE register security violation",
"Error on GMI link",
"Poison data written to internal PIE register",
+   "A deferred error was detected in the DF"
 };
 
 static const char * const smca_umc_mce_desc[] = {
@@ -288,6 +290,8 @@ static const char * const smca_umc_mce_desc[] = {
"Advanced peripheral bus error",
"Command/address parity error",
"Write data CRC error",
+   "DCQ SRAM ECC error",
+   "AES SRAM ECC error",
 };
 
 static const char * const smca_pb_mce_desc[] = {


[tip:ras/core] x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types

2019-02-04 Thread tip-bot for Yazen Ghannam
Commit-ID:  cbfa447edd6a3825fdb8a4ffae74ff7208f2d2c0
Gitweb: https://git.kernel.org/tip/cbfa447edd6a3825fdb8a4ffae74ff7208f2d2c0
Author: Yazen Ghannam 
AuthorDate: Fri, 1 Feb 2019 22:55:51 +
Committer:  Borislav Petkov 
CommitDate: Sun, 3 Feb 2019 13:01:44 +0100

x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types

Add the (HWID, MCATYPE) tuples and names for the new MP5, NBIO, and
PCIE SMCA bank types.

Also, add their respective error descriptions to the MCE decoding module
edac_mce_amd.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: Arnd Bergmann 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Kees Cook 
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: Pu Wen 
Cc: Qiuxu Zhuo 
Cc: Shirish S 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vishal Verma 
Cc: x86-ml 
Link: https://lkml.kernel.org/r/20190201225534.8177-2-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h|  3 +++
 arch/x86/kernel/cpu/mce/amd.c | 12 
 drivers/edac/mce_amd.c| 32 
 3 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index c1a812bd5a27..91b65d859ca8 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -312,6 +312,9 @@ enum smca_bank_types {
SMCA_PB,/* Parameter Block */
SMCA_PSP,   /* Platform Security Processor */
SMCA_SMU,   /* System Management Unit */
+   SMCA_MP5,   /* Microprocessor 5 Unit */
+   SMCA_NBIO,  /* Northbridge IO Unit */
+   SMCA_PCIE,  /* PCI Express Unit */
N_SMCA_BANK_TYPES
 };
 
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index ed3327342b40..00f60b8c7e4f 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -93,6 +93,9 @@ static struct smca_bank_name smca_names[] = {
[SMCA_PB]   = { "param_block",  "Parameter Block" },
[SMCA_PSP]  = { "psp",  "Platform Security Processor" },
[SMCA_SMU]  = { "smu",  "System Management Unit" },
+   [SMCA_MP5]  = { "mp5",  "Microprocessor 5 Unit" },
+   [SMCA_NBIO] = { "nbio", "Northbridge IO Unit" },
+   [SMCA_PCIE] = { "pcie", "PCI Express Unit" },
 };
 
 static u32 smca_bank_addrs[MAX_NR_BANKS][NR_BLOCKS] __ro_after_init =
@@ -162,6 +165,15 @@ static struct smca_hwid smca_hwid_mcatypes[] = {
 
/* System Management Unit MCA type */
{ SMCA_SMU,  HWID_MCATYPE(0x01, 0x0), 0x1 },
+
+   /* Microprocessor 5 Unit MCA type */
+   { SMCA_MP5,  HWID_MCATYPE(0x01, 0x2), 0x3FF },
+
+   /* Northbridge IO Unit MCA type */
+   { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F },
+
+   /* PCI Express Unit MCA type */
+   { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0), 0x1F },
 };
 
 struct smca_bank smca_banks[MAX_NR_BANKS];
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index c605089d899f..5ab4ab3f0ce6 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -285,6 +285,35 @@ static const char * const smca_smu_mce_desc[] = {
"SMU RAM ECC or parity error",
 };
 
+static const char * const smca_mp5_mce_desc[] = {
+   "High SRAM ECC or parity error",
+   "Low SRAM ECC or parity error",
+   "Data Cache Bank A ECC or parity error",
+   "Data Cache Bank B ECC or parity error",
+   "Data Tag Cache Bank A ECC or parity error",
+   "Data Tag Cache Bank B ECC or parity error",
+   "Instruction Cache Bank A ECC or parity error",
+   "Instruction Cache Bank B ECC or parity error",
+   "Instruction Tag Cache Bank A ECC or parity error",
+   "Instruction Tag Cache Bank B ECC or parity error",
+};
+
+static const char * const smca_nbio_mce_desc[] = {
+   "ECC or Parity error",
+   "PCIE error",
+   "SDP ErrEvent error",
+   "SDP Egress Poison Error",
+   "IOHC Internal Poison Error",
+};
+
+static const char * const smca_pcie_mce_desc[] = {
+   "CCIX PER Message logging",
+   "CCIX Read Response with Status: Non-Data Error",
+   "CCIX Write Response with Status: Non-Data Error",
+   "CCIX Read Response with Status: Data Error",
+   "CCIX Non-okay write response with data error",
+};
+
 struct smca_mce_desc {
const char * const *descs;
unsigned int num_descs;
@@ -304,6 +333,9 @@ static struct smca_mce_desc smca_mce_descs[] = {
[SMCA_PB]   = { smca_pb_mce_desc,   ARRAY_SIZE(smca_pb_mce_desc) },
[SMCA_PSP]  = { smca_psp_mce_desc,  ARRAY_SIZE(smca_psp_mce_d

[PATCH 2/2] x86/MCE/AMD: Skip creating kobjects with NULL names

2018-08-09 Thread Yazen Ghannam
From: Yazen Ghannam 

During mce_threshold_create_device() data structures are allocated for
each CPU's MCA banks and thresholding blocks. These data structures are
used to save information related to AMD's MCA Error Thresholding
feature. The structures are used in the thresholding interrupt handler,
and they are exposed to the user through sysfs. The sysfs interface has
user-friendly names for each bank.

However, errors in mce_threshold_create_device() will cause all the data
structures to be deallocated. This will break the thresholding interrupt
handler since it depends on these structures.

One possible error is creating a kobject with a NULL name. This will
happen if a bank exists on a system that doesn't have a name, e.g. new
bank types on future systems.

Skip creating kobjects for banks without a name. This means that the
sysfs interface for this bank will not exist. But this will keep all the
data structures allocated, so the thresholding interrupt handler will
work, even for the unnamed bank. Also, the sysfs interface will still be
populated for all existing, known bank types.
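
The idea can be sketched in userspace C as follows. This is a simplified model under stated assumptions: struct bank_state and setup_banks() are hypothetical, and the "exposed" flag merely stands in for the presence of a sysfs entry.

```c
#include <assert.h>
#include <stddef.h>

struct bank_state {
	int allocated;	/* bookkeeping the interrupt path depends on */
	int exposed;	/* stands in for "has a sysfs entry" */
};

/* Returns how many banks were exposed; every bank is still allocated. */
static size_t setup_banks(struct bank_state *banks,
			  const char * const *names, size_t n)
{
	size_t i, exposed = 0;

	assert(banks && names);

	for (i = 0; i < n; i++) {
		banks[i].allocated = 1;
		/* A missing name only skips the user-visible part. */
		banks[i].exposed = (names[i] != NULL);
		exposed += banks[i].exposed;
	}
	return exposed;
}
```

A bank with a NULL name ends up allocated but not exposed, so code that walks the structures keeps working.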

Cc:  # 4.13.x
Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 2dbf34250bbf..521fd8f406df 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -1130,6 +1130,7 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
struct threshold_block *b = NULL;
u32 low, high;
int err;
+   const char *name = NULL;
 
if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS))
return 0;
@@ -1176,9 +1177,13 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
per_cpu(threshold_banks, cpu)[bank]->blocks = b;
}
 
+   name = get_name(bank, b);
+   if (!name)
+   goto recurse;
+
err = kobject_init_and_add(&b->kobj, &threshold_ktype,
   per_cpu(threshold_banks, cpu)[bank]->kobj,
-  get_name(bank, b));
+  name);
if (err)
goto out_free;
 recurse:
@@ -1265,12 +1270,16 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
goto out;
}
 
+   if (!name)
+   goto allocate;
+
b->kobj = kobject_create_and_add(name, &dev->kobj);
if (!b->kobj) {
err = -EINVAL;
goto out_free;
}
 
+allocate:
per_cpu(threshold_banks, cpu)[bank] = b;
 
if (is_shared_bank(bank)) {
-- 
2.17.1



[PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-09 Thread Yazen Ghannam
From: Yazen Ghannam 

If threshold_init_device() fails then per_cpu(threshold_banks) will be
deallocated. The thresholding interrupt handler will still be active, so
it's possible to get a NULL pointer dereference if a THR interrupt
happens and any of the structures are NULL.

Exit the handler if per_cpu(threshold_banks) is NULL and skip NULL
banks. MCA error information will still be in the registers. The
information will be logged during polling or in another MCA exception or
interrupt handler.
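
A minimal userspace sketch of the defensive checks described above. The types, the NR_BANKS value, and handle_threshold_event() are hypothetical stand-ins; the real handler works on per-CPU data and does considerably more per block.

```c
#include <assert.h>
#include <stddef.h>

#define NR_BANKS 4	/* illustrative bank count, not the kernel's value */

struct block { int hits; };
struct bank { struct block *blocks; };

/* Returns the number of banks actually processed. */
static unsigned int handle_threshold_event(struct bank **banks,
					   unsigned int bank_map)
{
	unsigned int i, handled = 0;

	assert(NR_BANKS <= 32);	/* bank_map is a 32-bit mask here */

	/* The whole array may be gone after a failed initialization. */
	if (!banks)
		return 0;

	for (i = 0; i < NR_BANKS; i++) {
		if (!(bank_map & (1u << i)))
			continue;

		/* Skip banks, or block lists, that were never set up. */
		if (!banks[i] || !banks[i]->blocks)
			continue;

		banks[i]->blocks->hits++;
		handled++;
	}
	return handled;
}
```

Skipped banks are simply left alone, which matches the expectation stated above that their error information is picked up later by another path.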

Fixes: 17ef4af0ec0f ("x86/mce/AMD: Use saved threshold block info in interrupt handler")
Cc:  # 4.13.x
Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index dd33c357548f..2dbf34250bbf 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -934,13 +934,21 @@ static void log_and_reset_block(struct threshold_block *block)
 static void amd_threshold_interrupt(void)
 {
struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
+   struct threshold_bank *th_bank = NULL;
unsigned int bank, cpu = smp_processor_id();
 
+   if (!per_cpu(threshold_banks, cpu))
+   return;
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (!(per_cpu(bank_map, cpu) & (1 << bank)))
continue;
 
-   first_block = per_cpu(threshold_banks, cpu)[bank]->blocks;
+   th_bank = per_cpu(threshold_banks, cpu)[bank];
+   if (!th_bank)
+   continue;
+
+   first_block = th_bank->blocks;
if (!first_block)
continue;
 
-- 
2.17.1



[PATCH] x86/mce: Handle varying MCA bank counts

2018-07-27 Thread Yazen Ghannam
From: Yazen Ghannam 

Linux reads MCG_CAP[Count] to find the number of MCA banks visible to a
CPU. Currently, this is assumed to be the same for all CPUs and a
warning is shown if there is a difference. The number of banks is
overwritten with the MCG_CAP[Count] value of each following CPU that
boots.

According to the Intel SDM and AMD APM, the MCG_CAP[Count] value gives
the number of banks that are available to a "processor implementation".
The AMD BKDGs/PPRs further clarify that this value is per core. This
value has historically been the same for every core in the system, but
that is not an architectural requirement.

Future AMD systems may have different MCG_CAP[Count] values per core,
so the assumption that all CPUs will have the same MCG_CAP[Count] value
will no longer be valid.

Also, the first CPU to boot will allocate the struct mce_banks[] array
using the number of banks based on its MCG_CAP[Count] value. The machine
check handler and other functions use the global number of banks to
iterate and index into the mce_banks[] array. So it's possible to use an
out-of-bounds index on an asymmetric system where a following CPU sees a
MCG_CAP[Count] value greater than its predecessors.

For example, CPU0 sees MCG_CAP[Count]=2. It sets mca_cfg.banks=2 and
allocates mce_banks[] with 2 elements. CPU1 sees MCG_CAP[Count]=3 and
sets mca_cfg.banks=3, but mce_banks[] is already allocated and remains
having 2 elements.

Allocate the mce_banks[] array to the maximum number of banks. This will
avoid the potential out-of-bounds index since we cap the value of
mca_cfg.banks to MAX_NR_BANKS.

Set the value of mca_cfg.banks equal to the max of the previous value
and the value for the current CPU. This way mca_cfg.banks will always
represent the max number of banks detected on any CPU in the system.
This will ensure that all CPUs will access all the banks that are
visible to them. A CPU that can access fewer than the max number of
banks will find the registers of the extra banks to be read-as-zero.
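
The "cap, then keep the maximum" rule described above can be sketched in userspace C. The names and the cap value are assumptions for illustration; only the use of the low eight bits of MCG_CAP as the bank count reflects the architectural definition.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_BANKS	32	/* illustrative cap, assumed here */
#define BANKCNT_MASK		0xff	/* MCG_CAP[Count] is bits 7:0 */

/* Stands in for the global bank count shared by all CPUs. */
static uint8_t global_banks;

/* Called once per CPU with that CPU's MCG_CAP value. */
static uint8_t update_bank_count(uint64_t mcg_cap)
{
	uint8_t b = mcg_cap & BANKCNT_MASK;

	if (b > SKETCH_MAX_BANKS)
		b = SKETCH_MAX_BANKS;

	/* Never shrink: a later CPU with fewer banks must not lower it. */
	if (b > global_banks)
		global_banks = b;

	assert(global_banks <= SKETCH_MAX_BANKS);
	return global_banks;
}
```

Using the example from above, CPU0 reporting 2 banks followed by CPU1 reporting 3 leaves the global count at 3, and an array sized to the cap can always be indexed safely by that count.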

Print the number of MCA banks that we're using. Do this in
mcheck_late_init() so that we print the final value after all CPUs have
been initialized.

Get bank count from target CPU when doing injection with mce-inject
module.

Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/cpu/mcheck/mce-inject.c | 14 +++---
 arch/x86/kernel/cpu/mcheck/mce.c| 21 +++--
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c b/arch/x86/kernel/cpu/mcheck/mce-inject.c
index c805a06e14c3..5dda56d56dd3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -46,8 +46,6 @@
 static struct mce i_mce;
 static struct dentry *dfs_inj;
 
-static u8 n_banks;
-
 #define MAX_FLAG_OPT_SIZE  4
 #define NBCFG  0x44
 
@@ -567,9 +565,15 @@ static void do_inject(void)
 static int inj_bank_set(void *data, u64 val)
 {
struct mce *m = (struct mce *)data;
+   u64 cap;
+   u8 n_banks;
+
+   /* Get bank count on target CPU so we can handle non-uniform values. */
+   rdmsrl_on_cpu(m->extcpu, MSR_IA32_MCG_CAP, &cap);
+   n_banks = cap & MCG_BANKCNT_MASK;
 
if (val >= n_banks) {
-   pr_err("Non-existent MCE bank: %llu\n", val);
+   pr_err("MCA bank %llu non-existent on CPU%d\n", val, m->extcpu);
return -EINVAL;
}
 
@@ -659,10 +663,6 @@ static struct dfs_node {
 static int __init debugfs_init(void)
 {
unsigned int i;
-   u64 cap;
-
-   rdmsrl(MSR_IA32_MCG_CAP, cap);
-   n_banks = cap & MCG_BANKCNT_MASK;
 
dfs_inj = debugfs_create_dir("mce-inject", NULL);
if (!dfs_inj)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 4b767284b7f5..4238c65a0cce 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1479,13 +1479,12 @@ EXPORT_SYMBOL_GPL(mce_notify_irq);
 static int __mcheck_cpu_mce_banks_init(void)
 {
int i;
-   u8 num_banks = mca_cfg.banks;
 
-   mce_banks = kcalloc(num_banks, sizeof(struct mce_bank), GFP_KERNEL);
+   mce_banks = kcalloc(MAX_NR_BANKS, sizeof(struct mce_bank), GFP_KERNEL);
if (!mce_banks)
return -ENOMEM;
 
-   for (i = 0; i < num_banks; i++) {
+   for (i = 0; i < MAX_NR_BANKS; i++) {
struct mce_bank *b = &mce_banks[i];
 
b->ctl = -1ULL;
@@ -1499,24 +1498,16 @@ static int __mcheck_cpu_mce_banks_init(void)
  */
 static int __mcheck_cpu_cap_init(void)
 {
-   unsigned b;
+   u8 b;
u64 cap;
 
rdmsrl(MSR_IA32_MCG_CAP, cap);
 
b = cap & MCG_BANKCNT_MASK;
-   if (!mca_cfg.banks)
-   pr_info("CPU supports %d MCE banks\n", b);
-
-   if (b > MAX_NR_BANKS) {
-   pr_warn("Using only

[tip:efi/core] efi: Decode IA32/X64 Context Info structure

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  9c178663cbf2e754be322505078306b4a380a697
Gitweb: https://git.kernel.org/tip/9c178663cbf2e754be322505078306b4a380a697
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:56 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode IA32/X64 Context Info structure

Print the fields of the IA32/X64 Context Information structure.

Print the "Register Array" as raw values. Some context types are defined
in the UEFI spec, so more detailed decoding may be added in the future.

Based on UEFI 2.7 section N.2.4.2.2 IA32/X64 Processor Context
Information Structure.
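
Walking a run of such variable-length records can be sketched in userspace C. The field names follow the ones used in this patch, but the struct layout, count_ctx_records(), and the bounds checks are assumptions added for the sketch; the patch itself relies on the record count from the validation bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header modeled on the fields printed by the patch; layout is assumed. */
struct ctx_hdr {
	uint16_t reg_ctx_type;
	uint16_t reg_arr_size;	/* bytes of register data after the header */
	uint32_t msr_addr;
	uint64_t mm_reg_addr;
};

/* Counts records that fit entirely inside buf, up to max_records. */
static size_t count_ctx_records(const uint8_t *buf, size_t len,
				size_t max_records)
{
	size_t off = 0, n = 0;

	assert(buf || len == 0);

	while (n < max_records && len - off >= sizeof(struct ctx_hdr)) {
		struct ctx_hdr hdr;

		/* Copy out the header to avoid unaligned access. */
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* Stop if the register array would run past the buffer. */
		if (hdr.reg_arr_size > len - off - sizeof(hdr))
			break;

		/* The next record starts right after this register array. */
		off += sizeof(hdr) + hdr.reg_arr_size;
		n++;
	}
	return n;
}
```

Each step advances by the header size plus reg_arr_size, which is the same stride the patch computes before moving to the next structure.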

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-11-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 356b8d326219..2531de49f56c 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -10,6 +10,7 @@
 #define VALID_LAPIC_ID BIT_ULL(0)
 #define VALID_CPUID_INFO   BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)  (((bits) & GENMASK_ULL(7, 2)) >> 2)
+#define VALID_PROC_CXT_INFO_NUM(bits)  (((bits) & GENMASK_ULL(13, 8)) >> 8)
 
 #define INFO_ERR_STRUCT_TYPE_CACHE \
GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,   \
@@ -71,6 +72,9 @@
 #define CHECK_MS_RESTARTABLE_IP BIT_ULL(22)
 #define CHECK_MS_OVERFLOW  BIT_ULL(23)
 
+#define CTX_TYPE_MSR   1
+#define CTX_TYPE_MMREG 7
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -134,6 +138,17 @@ static const char * const ia_check_ms_error_type_strs[] = {
"Internal Unclassified",
 };
 
+static const char * const ia_reg_ctx_strs[] = {
+   "Unclassified Data",
+   "MSR Registers (Machine Check and other MSRs)",
+   "32-bit Mode Execution Context",
+   "64-bit Mode Execution Context",
+   "FXSAVE Context",
+   "32-bit Mode Debug Registers (DR0-DR7)",
+   "64-bit Mode Debug Registers (DR0-DR7)",
+   "Memory Mapped Registers",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -242,6 +257,7 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
int i;
struct cper_ia_err_info *err_info;
+   struct cper_ia_proc_ctx *ctx_info;
char newpfx[64], infopfx[64];
u8 err_type;
 
@@ -305,4 +321,36 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 
err_info++;
}
+
+   ctx_info = (struct cper_ia_proc_ctx *)err_info;
+   for (i = 0; i < VALID_PROC_CXT_INFO_NUM(proc->validation_bits); i++) {
+   int size = sizeof(*ctx_info) + ctx_info->reg_arr_size;
+   int groupsize = 4;
+
+   printk("%sContext Information Structure %d:\n", pfx, i);
+
+   printk("%sRegister Context Type: %s\n", newpfx,
+  ctx_info->reg_ctx_type < ARRAY_SIZE(ia_reg_ctx_strs) ?
+  ia_reg_ctx_strs[ctx_info->reg_ctx_type] : "unknown");
+
+   printk("%sRegister Array Size: 0x%04x\n", newpfx,
+  ctx_info->reg_arr_size);
+
+   if (ctx_info->reg_ctx_type == CTX_TYPE_MSR) {
+   groupsize = 8; /* MSRs are 8 bytes wide. */
+   printk("%sMSR Address: 0x%08x\n", newpfx,
+  ctx_info->msr_addr);
+   }
+
+   if (ctx_info->reg_ctx_type == CTX_TYPE_MMREG) {
+   printk("%sMM Register Address: 0x%016llx\n", newpfx,
+  ctx_info->mm_reg_addr);
+   }
+
+   printk("%sRegister Array:\n", newpfx);
+   print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, groupsize,
+  (ctx_info + 1), ctx_info->reg_arr_size, 0);
+
+   ctx_info = (struct cper_ia_proc_ctx *)((long)ctx_info + size);
+   }
 }


[tip:efi/core] efi: Decode IA32/X64 MS Check structure

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  a32bc29ed19776ef6827d6336847de9a0b7a8dc5
Gitweb: https://git.kernel.org/tip/a32bc29ed19776ef6827d6336847de9a0b7a8dc5
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:55 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode IA32/X64 MS Check structure

The IA32/X64 MS Check structure varies from the other Check structures
in the bit positions of its fields, and it includes an additional
"Error Type" field.

Decode the MS Check structure in a separate function.

Based on UEFI 2.7 Table 257. IA32/X64 MS Check Field Description.
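
A minimal userspace sketch of the "check the validation bit, then extract the field" pattern used for the Error Type. The bit positions and strings mirror the ones in this patch; the macro names and ms_err_type_name() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VALID_MS_ERR_TYPE	(1u << 0)
/* Error Type occupies bits 18:16 of the check value. */
#define MS_ERR_TYPE(check)	((unsigned int)(((check) >> 16) & 0x7))

static const char * const ms_err_type_strs[] = {
	"No Error",
	"Unclassified",
	"Microcode ROM Parity Error",
	"External Error",
	"FRC Error",
	"Internal Unclassified",
};

/*
 * Returns NULL when the field is not marked valid, "unknown" for an
 * encoding without a string, and the matching string otherwise.
 */
static const char *ms_err_type_name(uint16_t validation_bits, uint64_t check)
{
	size_t n = sizeof(ms_err_type_strs) / sizeof(ms_err_type_strs[0]);
	unsigned int type = MS_ERR_TYPE(check);

	assert(n == 6);

	if (!(validation_bits & VALID_MS_ERR_TYPE))
		return NULL;

	return type < n ? ms_err_type_strs[type] : "unknown";
}
```

The three-bit field can encode values 6 and 7, for which no string exists, so the range check before indexing is what keeps the lookup in bounds.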

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-10-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/cper-x86.c | 55 -
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5e6716564dba..356b8d326219 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -57,6 +57,20 @@
 #define CHECK_BUS_TIME_OUT BIT_ULL(32)
 #define CHECK_BUS_ADDR_SPACE(check)(((check) & GENMASK_ULL(34, 33)) >> 33)
 
+#define CHECK_VALID_MS_ERR_TYPE BIT_ULL(0)
+#define CHECK_VALID_MS_PCC BIT_ULL(1)
+#define CHECK_VALID_MS_UNCORRECTED BIT_ULL(2)
+#define CHECK_VALID_MS_PRECISE_IP  BIT_ULL(3)
+#define CHECK_VALID_MS_RESTARTABLE_IP  BIT_ULL(4)
+#define CHECK_VALID_MS_OVERFLOW BIT_ULL(5)
+
+#define CHECK_MS_ERR_TYPE(check)   (((check) & GENMASK_ULL(18, 16)) >> 16)
+#define CHECK_MS_PCC   BIT_ULL(19)
+#define CHECK_MS_UNCORRECTED   BIT_ULL(20)
+#define CHECK_MS_PRECISE_IP BIT_ULL(21)
+#define CHECK_MS_RESTARTABLE_IP BIT_ULL(22)
+#define CHECK_MS_OVERFLOW  BIT_ULL(23)
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -111,17 +125,56 @@ static const char * const ia_check_bus_addr_space_strs[] = {
"Other Transaction",
 };
 
+static const char * const ia_check_ms_error_type_strs[] = {
+   "No Error",
+   "Unclassified",
+   "Microcode ROM Parity Error",
+   "External Error",
+   "FRC Error",
+   "Internal Unclassified",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
 }
 
+static void print_err_info_ms(const char *pfx, u16 validation_bits, u64 check)
+{
+   if (validation_bits & CHECK_VALID_MS_ERR_TYPE) {
+   u8 err_type = CHECK_MS_ERR_TYPE(check);
+
+   printk("%sError Type: %u, %s\n", pfx, err_type,
+  err_type < ARRAY_SIZE(ia_check_ms_error_type_strs) ?
+  ia_check_ms_error_type_strs[err_type] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_MS_PCC)
+   print_bool("Processor Context Corrupt", pfx, check, CHECK_MS_PCC);
+
+   if (validation_bits & CHECK_VALID_MS_UNCORRECTED)
+   print_bool("Uncorrected", pfx, check, CHECK_MS_UNCORRECTED);
+
+   if (validation_bits & CHECK_VALID_MS_PRECISE_IP)
+   print_bool("Precise IP", pfx, check, CHECK_MS_PRECISE_IP);
+
+   if (validation_bits & CHECK_VALID_MS_RESTARTABLE_IP)
+   print_bool("Restartable IP", pfx, check, CHECK_MS_RESTARTABLE_IP);
+
+   if (validation_bits & CHECK_VALID_MS_OVERFLOW)
+   print_bool("Overflow", pfx, check, CHECK_MS_OVERFLOW);
+}
+
 static void print_err_info(const char *pfx, u8 err_type, u64 check)
 {
u16 validation_bits = CHECK_VALID_BITS(check);
 
+   /*
+* The MS Check structure varies a lot from the others, so use a
+* separate function for decoding.
+*/
if (err_type == ERR_TYPE_MS)
-   return;
+   return print_err_info_ms(pfx, validation_bits, check);
 
if (validation_bits & CHECK_VALID_TRANS_TYPE) {
u8 trans_type = CHECK_TRANS_TYPE(check);
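
For readers who want to try the MS Check decoding outside the kernel, here is a
minimal userspace sketch. It assumes only the bit positions given in the macros
above. The MS_* names and both helper functions are local to this sketch and
are not kernel identifiers; only two of the flags are shown to keep it short.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Bit positions follow the macros in the patch. The MS_* names are
 * hypothetical, chosen for this sketch only.
 */
#define MS_VALID_ERR_TYPE	(1ULL << 0)
#define MS_VALID_PCC		(1ULL << 1)
#define MS_ERR_TYPE(check)	(((check) >> 16) & 0x7ULL)	/* bits 18:16 */
#define MS_PCC			(1ULL << 19)

static const char * const ms_err_type_strs[] = {
	"No Error",
	"Unclassified",
	"Microcode ROM Parity Error",
	"External Error",
	"FRC Error",
	"Internal Unclassified",
};

#define MS_N_STRS (sizeof(ms_err_type_strs) / sizeof(ms_err_type_strs[0]))

/* NULL when the validation bit is clear, otherwise a name or "unknown". */
static const char *ms_err_type_name(uint64_t check)
{
	uint8_t type;

	if (!(check & MS_VALID_ERR_TYPE))
		return NULL;

	type = MS_ERR_TYPE(check);
	return type < MS_N_STRS ? ms_err_type_strs[type] : "unknown";
}

/* -1 when the flag is not marked valid, otherwise 0 or 1. */
static int ms_flag(uint64_t check, uint64_t valid_bit, uint64_t bit)
{
	if (!(check & valid_bit))
		return -1;

	return (check & bit) ? 1 : 0;
}
```

The 3-bit Error Type field can hold values 6 and 7, which have no string in the
table, so the bounds check against the array size is what keeps the lookup safe.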


[tip:efi/core] efi: Decode IA32/X64 Cache, TLB, and Bus Check structures

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  a9c1e3e791409e35207277b7873efc756b6fb625
Gitweb: https://git.kernel.org/tip/a9c1e3e791409e35207277b7873efc756b6fb625
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:53 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode IA32/X64 Cache, TLB, and Bus Check structures

Print the common fields of the Cache, TLB, and Bus check structures. The
fields of these three check types are the same except for a few more
fields in the Bus check structure. The remaining Bus check structure
fields will be decoded in a following patch.

Based on UEFI 2.7,
Table 254. IA32/X64 Cache Check Structure
Table 255. IA32/X64 TLB Check Structure
Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-8-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/cper-x86.c | 99 -
 1 file changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5438097b93ac..f70c46f7a4db 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -30,6 +30,25 @@
 #define INFO_VALID_RESPONDER_ID        BIT_ULL(3)
 #define INFO_VALID_IP  BIT_ULL(4)
 
+#define CHECK_VALID_TRANS_TYPE         BIT_ULL(0)
+#define CHECK_VALID_OPERATION          BIT_ULL(1)
+#define CHECK_VALID_LEVEL              BIT_ULL(2)
+#define CHECK_VALID_PCC                BIT_ULL(3)
+#define CHECK_VALID_UNCORRECTED        BIT_ULL(4)
+#define CHECK_VALID_PRECISE_IP         BIT_ULL(5)
+#define CHECK_VALID_RESTARTABLE_IP     BIT_ULL(6)
+#define CHECK_VALID_OVERFLOW           BIT_ULL(7)
+
+#define CHECK_VALID_BITS(check)        (((check) & GENMASK_ULL(15, 0)))
+#define CHECK_TRANS_TYPE(check)        (((check) & GENMASK_ULL(17, 16)) >> 16)
+#define CHECK_OPERATION(check)         (((check) & GENMASK_ULL(21, 18)) >> 18)
+#define CHECK_LEVEL(check)             (((check) & GENMASK_ULL(24, 22)) >> 22)
+#define CHECK_PCC                      BIT_ULL(25)
+#define CHECK_UNCORRECTED              BIT_ULL(26)
+#define CHECK_PRECISE_IP               BIT_ULL(27)
+#define CHECK_RESTARTABLE_IP           BIT_ULL(28)
+#define CHECK_OVERFLOW                 BIT_ULL(29)
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -52,11 +71,81 @@ static enum err_types cper_get_err_type(const guid_t *err_type)
return N_ERR_TYPES;
 }
 
+static const char * const ia_check_trans_type_strs[] = {
+   "Instruction",
+   "Data Access",
+   "Generic",
+};
+
+static const char * const ia_check_op_strs[] = {
+   "generic error",
+   "generic read",
+   "generic write",
+   "data read",
+   "data write",
+   "instruction fetch",
+   "prefetch",
+   "eviction",
+   "snoop",
+};
+
+static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
+{
+   printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
+}
+
+static void print_err_info(const char *pfx, u8 err_type, u64 check)
+{
+   u16 validation_bits = CHECK_VALID_BITS(check);
+
+   if (err_type == ERR_TYPE_MS)
+   return;
+
+   if (validation_bits & CHECK_VALID_TRANS_TYPE) {
+   u8 trans_type = CHECK_TRANS_TYPE(check);
+
+   printk("%sTransaction Type: %u, %s\n", pfx, trans_type,
+  trans_type < ARRAY_SIZE(ia_check_trans_type_strs) ?
+  ia_check_trans_type_strs[trans_type] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_OPERATION) {
+   u8 op = CHECK_OPERATION(check);
+
+   /*
+* CACHE has more operation types than TLB or BUS, though the
+* name and the order are the same.
+*/
+   u8 max_ops = (err_type == ERR_TYPE_CACHE) ? 9 : 7;
+
+   printk("%sOperation: %u, %s\n", pfx, op,
+  op < max_ops ? ia_check_op_strs[op] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_LEVEL)
+   printk("%sLevel: %llu\n", pfx, CHECK_LEVEL(check));
+
+   if (validation_bits & CHECK_VALID_PCC)
+   print_bool("Processor Context Corrupt", pfx, check, CHECK_PCC);
+
+   if (validation_bits & CHECK_VALID_UNCORRECTED)
+   print_bool("Uncorrected", pfx, check, CHECK_UNCORRECTED);
+
+   if (validation_bits & CHECK_VALID_PRECISE_IP)
+   prin

[tip:efi/core] efi: Decode additional IA32/X64 Bus Check fields

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  c6bc4ac0aadede7a5c5260bcc315cd2b18c6b471
Gitweb: https://git.kernel.org/tip/c6bc4ac0aadede7a5c5260bcc315cd2b18c6b471
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:54 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode additional IA32/X64 Bus Check fields

The "Participation Type", "Time Out", and "Address Space" fields are
unique to the IA32/X64 Bus Check structure. Print these fields.

Based on UEFI 2.7 Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-9-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/cper-x86.c | 44 +
 1 file changed, 44 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index f70c46f7a4db..5e6716564dba 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -39,6 +39,10 @@
 #define CHECK_VALID_RESTARTABLE_IP BIT_ULL(6)
 #define CHECK_VALID_OVERFLOW   BIT_ULL(7)
 
+#define CHECK_VALID_BUS_PART_TYPE  BIT_ULL(8)
+#define CHECK_VALID_BUS_TIME_OUT   BIT_ULL(9)
+#define CHECK_VALID_BUS_ADDR_SPACE BIT_ULL(10)
+
 #define CHECK_VALID_BITS(check)        (((check) & GENMASK_ULL(15, 0)))
 #define CHECK_TRANS_TYPE(check)        (((check) & GENMASK_ULL(17, 16)) >> 16)
 #define CHECK_OPERATION(check)         (((check) & GENMASK_ULL(21, 18)) >> 18)
@@ -49,6 +53,10 @@
 #define CHECK_RESTARTABLE_IP   BIT_ULL(28)
 #define CHECK_OVERFLOW BIT_ULL(29)
 
+#define CHECK_BUS_PART_TYPE(check) (((check) & GENMASK_ULL(31, 30)) >> 30)
+#define CHECK_BUS_TIME_OUT BIT_ULL(32)
+#define CHECK_BUS_ADDR_SPACE(check)    (((check) & GENMASK_ULL(34, 33)) >> 33)
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -89,6 +97,20 @@ static const char * const ia_check_op_strs[] = {
"snoop",
 };
 
+static const char * const ia_check_bus_part_type_strs[] = {
+   "Local Processor originated request",
+   "Local Processor responded to request",
+   "Local Processor observed",
+   "Generic",
+};
+
+static const char * const ia_check_bus_addr_space_strs[] = {
+   "Memory Access",
+   "Reserved",
+   "I/O",
+   "Other Transaction",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -139,6 +161,28 @@ static void print_err_info(const char *pfx, u8 err_type, u64 check)
 
if (validation_bits & CHECK_VALID_OVERFLOW)
print_bool("Overflow", pfx, check, CHECK_OVERFLOW);
+
+   if (err_type != ERR_TYPE_BUS)
+   return;
+
+   if (validation_bits & CHECK_VALID_BUS_PART_TYPE) {
+   u8 part_type = CHECK_BUS_PART_TYPE(check);
+
+   printk("%sParticipation Type: %u, %s\n", pfx, part_type,
+  part_type < ARRAY_SIZE(ia_check_bus_part_type_strs) ?
+  ia_check_bus_part_type_strs[part_type] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_BUS_TIME_OUT)
+   print_bool("Time Out", pfx, check, CHECK_BUS_TIME_OUT);
+
+   if (validation_bits & CHECK_VALID_BUS_ADDR_SPACE) {
+   u8 addr_space = CHECK_BUS_ADDR_SPACE(check);
+
+   printk("%sAddress Space: %u, %s\n", pfx, addr_space,
+  addr_space < ARRAY_SIZE(ia_check_bus_addr_space_strs) ?
+  ia_check_bus_addr_space_strs[addr_space] : "unknown");
+   }
 }
 
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
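
The Bus Check fields are all mask-and-shift extractions from the 64-bit Check
Information value. A small userspace sketch of that pattern follows. It assumes
the bit ranges from the macros above; genmask_ull() is a local stand-in for the
kernel's GENMASK_ULL(), and the other helper names are made up for this sketch.

```c
#include <stdint.h>

/*
 * Local stand-in for GENMASK_ULL(h, l): a mask with bits h..l set.
 * Callers must pass l <= h <= 63.
 */
static uint64_t genmask_ull(unsigned int h, unsigned int l)
{
	return (~0ULL << l) & (~0ULL >> (63 - h));
}

/* Extract bits h..l of a Check Information value, shifted down to bit 0. */
static uint64_t check_field(uint64_t check, unsigned int h, unsigned int l)
{
	return (check & genmask_ull(h, l)) >> l;
}

/* Participation Type: bits 31:30 */
static uint8_t bus_part_type(uint64_t check)
{
	return (uint8_t)check_field(check, 31, 30);
}

/* Time Out: bit 32 */
static int bus_time_out(uint64_t check)
{
	return check_field(check, 32, 32) != 0;
}

/* Address Space: bits 34:33 */
static uint8_t bus_addr_space(uint64_t check)
{
	return (uint8_t)check_field(check, 34, 33);
}
```

Since both multi-bit fields are two bits wide and both string tables have four
entries, every possible value has a string; the ARRAY_SIZE() checks in the patch
are defensive rather than strictly required.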


[tip:efi/core] efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  dc2d26e4b667c8005c58669e71de3efd17f4390f
Gitweb: https://git.kernel.org/tip/dc2d26e4b667c8005c58669e71de3efd17f4390f
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:52 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:47 +0200

efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs

For easier handling, match the known IA32/X64 error structure GUIDs to
enums.

Also, print out the name of the matching Error Structure Type.

Only print the GUID for unknown types.

GUIDs taken from UEFI 2.7 section N.2.4.2.1 IA32/X64 Processor Error
Information Structure.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-7-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/cper-x86.c | 47 +++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index e0633a103fcf..5438097b93ac 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -11,17 +11,53 @@
 #define VALID_CPUID_INFO   BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)  (((bits) & GENMASK_ULL(7, 2)) >> 2)
 
+#define INFO_ERR_STRUCT_TYPE_CACHE \
+   GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,   \
+ 0x57, 0x3F, 0xAD, 0x2C)
+#define INFO_ERR_STRUCT_TYPE_TLB   \
+   GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25, 0x0A, 0x3B,   \
+ 0x9A, 0xDB, 0x63, 0xC3)
+#define INFO_ERR_STRUCT_TYPE_BUS   \
+   GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59, 0x5E, 0xEF,   \
+ 0x92, 0xFF, 0xA6, 0x3C)
+#define INFO_ERR_STRUCT_TYPE_MS    \
+   GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3, 0xB0, 0xB5,   \
+ 0xB0, 0xA7, 0x43, 0x14)
+
 #define INFO_VALID_CHECK_INFO  BIT_ULL(0)
 #define INFO_VALID_TARGET_ID   BIT_ULL(1)
 #define INFO_VALID_REQUESTOR_ID        BIT_ULL(2)
 #define INFO_VALID_RESPONDER_ID        BIT_ULL(3)
 #define INFO_VALID_IP  BIT_ULL(4)
 
+enum err_types {
+   ERR_TYPE_CACHE = 0,
+   ERR_TYPE_TLB,
+   ERR_TYPE_BUS,
+   ERR_TYPE_MS,
+   N_ERR_TYPES
+};
+
+static enum err_types cper_get_err_type(const guid_t *err_type)
+{
+   if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_CACHE))
+   return ERR_TYPE_CACHE;
+   else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_TLB))
+   return ERR_TYPE_TLB;
+   else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_BUS))
+   return ERR_TYPE_BUS;
+   else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_MS))
+   return ERR_TYPE_MS;
+   else
+   return N_ERR_TYPES;
+}
+
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
int i;
struct cper_ia_err_info *err_info;
char newpfx[64];
+   u8 err_type;
 
if (proc->validation_bits & VALID_LAPIC_ID)
printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
@@ -38,8 +74,15 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
printk("%sError Information Structure %d:\n", pfx, i);
 
-   printk("%sError Structure Type: %pUl\n", newpfx,
-  &err_info->err_type);
+   err_type = cper_get_err_type(&err_info->err_type);
+   printk("%sError Structure Type: %s\n", newpfx,
+  err_type < ARRAY_SIZE(cper_proc_error_type_strs) ?
+  cper_proc_error_type_strs[err_type] : "unknown");
+
+   if (err_type >= N_ERR_TYPES) {
+   printk("%sError Structure Type: %pUl\n", newpfx,
+  &err_info->err_type);
+   }
 
if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
printk("%sCheck Information: 0x%016llx\n", newpfx,
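
The GUID-to-enum matching can also be written as a table lookup. The sketch
below is a userspace illustration, not what the patch does. It assumes the
16-byte layout produced by the kernel's GUID_INIT() (first three fields stored
little-endian); guid16_t, GUID16() and get_err_type() are hypothetical names
used here in place of guid_t, GUID_INIT() and cper_get_err_type().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's guid_t. */
typedef struct { uint8_t b[16]; } guid16_t;

/* Same byte order as GUID_INIT(): a, b and c are stored little-endian. */
#define GUID16(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)			\
	{{ (a) & 0xff, ((a) >> 8) & 0xff,				\
	   ((a) >> 16) & 0xff, ((a) >> 24) & 0xff,			\
	   (b) & 0xff, ((b) >> 8) & 0xff,				\
	   (c) & 0xff, ((c) >> 8) & 0xff,				\
	   (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }}

enum err_types {
	ERR_TYPE_CACHE = 0,
	ERR_TYPE_TLB,
	ERR_TYPE_BUS,
	ERR_TYPE_MS,
	N_ERR_TYPES
};

/* GUID values copied from the patch above. */
static const guid16_t err_type_guids[N_ERR_TYPES] = {
	[ERR_TYPE_CACHE] = GUID16(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72,
				  0x24, 0x9B, 0x57, 0x3F, 0xAD, 0x2C),
	[ERR_TYPE_TLB]   = GUID16(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25,
				  0x0A, 0x3B, 0x9A, 0xDB, 0x63, 0xC3),
	[ERR_TYPE_BUS]   = GUID16(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59,
				  0x5E, 0xEF, 0x92, 0xFF, 0xA6, 0x3C),
	[ERR_TYPE_MS]    = GUID16(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3,
				  0xB0, 0xB5, 0xB0, 0xA7, 0x43, 0x14),
};

/* Return the matching enum value, or N_ERR_TYPES for an unknown GUID. */
static enum err_types get_err_type(const guid16_t *guid)
{
	int i;

	for (i = 0; i < N_ERR_TYPES; i++) {
		if (!memcmp(guid, &err_type_guids[i], sizeof(*guid)))
			return (enum err_types)i;
	}

	return N_ERR_TYPES;
}
```

Returning N_ERR_TYPES for "no match" mirrors the patch, which lets the caller
use a single bounds check to decide whether to print a name or the raw GUID.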


[tip:efi/core] efi: Decode IA32/X64 Processor Error Info Structure

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  7c9449b8c8a59511b7d749afb193c96353451c82
Gitweb: https://git.kernel.org/tip/7c9449b8c8a59511b7d749afb193c96353451c82
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:51 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:47 +0200

efi: Decode IA32/X64 Processor Error Info Structure

Print the fields in the IA32/X64 Processor Error Info Structure.

Based on UEFI 2.7 Table 253. IA32/X64 Processor Error Information
Structure.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-6-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 863f0cd2a0ff..e0633a103fcf 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -9,9 +9,20 @@
  */
 #define VALID_LAPIC_ID BIT_ULL(0)
 #define VALID_CPUID_INFO   BIT_ULL(1)
+#define VALID_PROC_ERR_INFO_NUM(bits)  (((bits) & GENMASK_ULL(7, 2)) >> 2)
+
+#define INFO_VALID_CHECK_INFO  BIT_ULL(0)
+#define INFO_VALID_TARGET_ID   BIT_ULL(1)
+#define INFO_VALID_REQUESTOR_ID        BIT_ULL(2)
+#define INFO_VALID_RESPONDER_ID        BIT_ULL(3)
+#define INFO_VALID_IP  BIT_ULL(4)
 
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
+   int i;
+   struct cper_ia_err_info *err_info;
+   char newpfx[64];
+
if (proc->validation_bits & VALID_LAPIC_ID)
printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
 
@@ -20,4 +31,41 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
   sizeof(proc->cpuid), 0);
}
+
+   snprintf(newpfx, sizeof(newpfx), "%s ", pfx);
+
+   err_info = (struct cper_ia_err_info *)(proc + 1);
+   for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
+   printk("%sError Information Structure %d:\n", pfx, i);
+
+   printk("%sError Structure Type: %pUl\n", newpfx,
+  &err_info->err_type);
+
+   if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
+   printk("%sCheck Information: 0x%016llx\n", newpfx,
+  err_info->check_info);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_TARGET_ID) {
+   printk("%sTarget Identifier: 0x%016llx\n",
+  newpfx, err_info->target_id);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_REQUESTOR_ID) {
+   printk("%sRequestor Identifier: 0x%016llx\n",
+  newpfx, err_info->requestor_id);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_RESPONDER_ID) {
+   printk("%sResponder Identifier: 0x%016llx\n",
+  newpfx, err_info->responder_id);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_IP) {
+   printk("%sInstruction Pointer: 0x%016llx\n",
+  newpfx, err_info->ip);
+   }
+
+   err_info++;
+   }
 }
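
To make the record layout concrete: the Error Information Structures sit
directly after the fixed header, and their count comes from bits 7:2 of the
header's validation bits. The userspace sketch below walks such a buffer. The
struct definitions are local approximations assumed to be byte-packed, the
names ia_hdr, ia_err_info and get_err_info() are hypothetical, and the length
check is an addition of this sketch that the patch itself does not perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
/* Local approximation of the section header (assumed byte-packed). */
struct ia_hdr {
	uint64_t validation_bits;
	uint64_t lapic_id;
	uint8_t  cpuid[48];
};

/* Local approximation of one Error Information Structure. */
struct ia_err_info {
	uint8_t  err_type[16];
	uint64_t validation_bits;
	uint64_t check_info;
	uint64_t target_id;
	uint64_t requestor_id;
	uint64_t responder_id;
	uint64_t ip;
};
#pragma pack(pop)

/* Number of Error Information Structures: bits 7:2 of the header bits. */
static unsigned int err_info_num(uint64_t validation_bits)
{
	return (unsigned int)((validation_bits >> 2) & 0x3f);
}

/*
 * Copy the idx-th Error Information Structure out of buf.
 * Returns 0 on success, -1 if idx is out of range or buf is too short.
 */
static int get_err_info(const uint8_t *buf, size_t len, unsigned int idx,
			struct ia_err_info *out)
{
	struct ia_hdr hdr;
	size_t off;

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));

	if (idx >= err_info_num(hdr.validation_bits))
		return -1;

	off = sizeof(hdr) + (size_t)idx * sizeof(*out);
	if (len < off + sizeof(*out))
		return -1;

	memcpy(out, buf + off, sizeof(*out));
	return 0;
}
```

Copying with memcpy() instead of casting the buffer pointer avoids alignment
assumptions in userspace; the kernel code simply does pointer arithmetic on the
packed structs.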


[tip:efi/core] efi: Decode IA32/X64 Processor Error Section

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  f9e1bdb9f35f4f5cfa7c9025ac68c02909b6d3b1
Gitweb: https://git.kernel.org/tip/f9e1bdb9f35f4f5cfa7c9025ac68c02909b6d3b1
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:50 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:47 +0200

efi: Decode IA32/X64 Processor Error Section

Recognize the IA32/X64 Processor Error Section.

Do the section decoding in a new "cper-x86.c" file and add this to the
Makefile depending on a new "UEFI_CPER_X86" config option.

Print the Local APIC ID and CPUID info from the Processor Error Record.

The "Processor Error Info" and "Processor Context" fields will be
decoded in following patches.

Based on UEFI 2.7 Table 252. Processor Error Record.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-5-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/Kconfig    |  5 +
 drivers/firmware/efi/Makefile   |  1 +
 drivers/firmware/efi/cper-x86.c | 23 +++
 drivers/firmware/efi/cper.c     | 10 ++
 include/linux/cper.h            |  2 ++
 5 files changed, 41 insertions(+)

diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 3098410abad8..781a4a337557 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -174,6 +174,11 @@ config UEFI_CPER_ARM
depends on UEFI_CPER && ( ARM || ARM64 )
default y
 
+config UEFI_CPER_X86
+   bool
+   depends on UEFI_CPER && X86
+   default y
+
 config EFI_DEV_PATH_PARSER
bool
depends on ACPI
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index cb805374f4bc..5f9f5039de50 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -31,3 +31,4 @@ obj-$(CONFIG_ARM) += $(arm-obj-y)
 obj-$(CONFIG_ARM64)                += $(arm-obj-y)
 obj-$(CONFIG_EFI_CAPSULE_LOADER)   += capsule-loader.o
 obj-$(CONFIG_UEFI_CPER_ARM)        += cper-arm.o
+obj-$(CONFIG_UEFI_CPER_X86)        += cper-x86.o
diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
new file mode 100644
index ..863f0cd2a0ff
--- /dev/null
+++ b/drivers/firmware/efi/cper-x86.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018, Advanced Micro Devices, Inc.
+
+#include 
+
+/*
+ * We don't need a "CPER_IA" prefix since these are all locally defined.
+ * This will save us a lot of line space.
+ */
+#define VALID_LAPIC_ID BIT_ULL(0)
+#define VALID_CPUID_INFO   BIT_ULL(1)
+
+void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
+{
+   if (proc->validation_bits & VALID_LAPIC_ID)
+   printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
+
+   if (proc->validation_bits & VALID_CPUID_INFO) {
+   printk("%sCPUID Info:\n", pfx);
+   print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
+  sizeof(proc->cpuid), 0);
+   }
+}
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index ab21f1614007..3bf0dca378a6 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -467,6 +467,16 @@ cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata
cper_print_proc_arm(newpfx, arm_err);
else
goto err_section_too_small;
+#endif
+#if defined(CONFIG_UEFI_CPER_X86)
+   } else if (guid_equal(sec_type, &CPER_SEC_PROC_IA)) {
+   struct cper_sec_proc_ia *ia_err = acpi_hest_get_payload(gdata);
+
+   printk("%ssection_type: IA32/X64 processor error\n", newpfx);
+   if (gdata->error_data_length >= sizeof(*ia_err))
+   cper_print_proc_ia(newpfx, ia_err);
+   else
+   goto err_section_too_small;
 #endif
} else {
const void *err = acpi_hest_get_payload(gdata);
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 4b5f8459b403..9c703a0abe6e 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -551,5 +551,7 @@ const char *cper_mem_err_unpack(struct trace_seq *,
struct cper_mem_err_compact *);
 void cper_print_proc_arm(const char *pfx,
 const struct cper_sec_proc_arm *proc);
+void cper_print_proc_ia(const char *pfx,
+   const struct cper_sec_proc_ia *proc);
 
 #endif


[tip:efi/core] efi: Fix IA32/X64 Processor Error Record definition

2018-05-14 Thread tip-bot for Yazen Ghannam
Commit-ID:  742632d237ce180439ab4af31e9891df0df81233
Gitweb: https://git.kernel.org/tip/742632d237ce180439ab4af31e9891df0df81233
Author: Yazen Ghannam 
AuthorDate: Fri, 4 May 2018 07:59:49 +0200
Committer:  Ingo Molnar 
CommitDate: Mon, 14 May 2018 08:57:47 +0200

efi: Fix IA32/X64 Processor Error Record definition

Based on UEFI 2.7 Table 252. Processor Error Record, the "Local APIC_ID"
field is 8 bytes but Linux defines this field as 1 byte.

Fix this in the struct cper_sec_proc_ia definition.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Ard Biesheuvel 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-4-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar 
---
 include/linux/cper.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/cper.h b/include/linux/cper.h
index d14ef4e77c8a..4b5f8459b403 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -381,7 +381,7 @@ struct cper_sec_proc_generic {
 /* IA32/X64 Processor Error Section */
 struct cper_sec_proc_ia {
__u64   validation_bits;
-   __u8    lapic_id;
+   __u64   lapic_id;
__u8    cpuid[48];
 };
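
The effect of the one-line change is easiest to see from the field offsets. The
userspace sketch below compares the two layouts. It assumes the structure is
byte-packed, as CPER structures are expected to be; proc_ia_old and proc_ia_new
are hypothetical names for the before and after definitions.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Layout before the fix: 1-byte Local APIC_ID. */
struct proc_ia_old {
	uint64_t validation_bits;
	uint8_t  lapic_id;
	uint8_t  cpuid[48];
};

/* Layout after the fix: 8-byte Local APIC_ID. */
struct proc_ia_new {
	uint64_t validation_bits;
	uint64_t lapic_id;
	uint8_t  cpuid[48];
};
#pragma pack(pop)

/* Byte offset of the CPUID data in the old layout. */
static size_t cpuid_offset_old(void)
{
	return offsetof(struct proc_ia_old, cpuid);
}

/* Byte offset of the CPUID data in the new layout. */
static size_t cpuid_offset_new(void)
{
	return offsetof(struct proc_ia_new, cpuid);
}
```

Under that packing assumption the old definition places cpuid[] seven bytes too
early and makes the header 57 bytes instead of 64, so anything located by
"header + 1" pointer arithmetic would also be misplaced.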
 


[tip:x86/urgent] x86/smpboot: Don't use mwait_play_dead() on AMD systems

2018-04-26 Thread tip-bot for Yazen Ghannam
Commit-ID:  da6fa7ef67f07108a1b0cb9fd9e7fcaabd39c051
Gitweb: https://git.kernel.org/tip/da6fa7ef67f07108a1b0cb9fd9e7fcaabd39c051
Author: Yazen Ghannam 
AuthorDate: Tue, 3 Apr 2018 09:02:28 -0500
Committer:  Thomas Gleixner 
CommitDate: Thu, 26 Apr 2018 16:06:19 +0200

x86/smpboot: Don't use mwait_play_dead() on AMD systems

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper cstates than C1 on current systems.

play_dead() expects to use the deepest state available.  The deepest state
available on AMD systems is reached through SystemIO or HALT. If MWAIT is
available, it is preferred over the other methods, so the CPU never reaches
the deepest possible state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, use CPUIDLE
to enter the deepest state advertised by firmware. If CPUIDLE is not
available then fall back to HALT.

Signed-off-by: Yazen Ghannam 
Signed-off-by: Thomas Gleixner 
Reviewed-by: Borislav Petkov 
Cc: sta...@vger.kernel.org
Cc: Yazen Ghannam 
Link: https://lkml.kernel.org/r/20180403140228.58540-1-yazen.ghan...@amd.com

---
 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 45175b81dd5b..0f1cbb042f49 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1571,6 +1571,8 @@ static inline void mwait_play_dead(void)
void *mwait_ptr;
int i;
 
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+   return;
if (!this_cpu_has(X86_FEATURE_MWAIT))
return;
if (!this_cpu_has(X86_FEATURE_CLFLUSH))
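
The resulting selection order can be modelled in a few lines of userspace C.
This is a simplified illustration of the behaviour described in the commit
message, not the kernel's code: struct cpu_caps, enum dead_method and
pick_dead_method() are hypothetical, and the real mwait_play_dead() performs
further checks beyond the two feature flags shown here.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch. */
enum dead_method {
	DEAD_MWAIT,
	DEAD_CPUIDLE,
	DEAD_HLT
};

/* Hypothetical capability summary for one CPU. */
struct cpu_caps {
	bool vendor_amd;
	bool has_mwait;
	bool has_clflush;
	bool cpuidle_available;
};

/*
 * Model of the order after this patch: MWAIT is skipped on AMD, then
 * CPUIDLE is tried, and HALT is the last resort.
 */
static enum dead_method pick_dead_method(const struct cpu_caps *caps)
{
	if (!caps->vendor_amd && caps->has_mwait && caps->has_clflush)
		return DEAD_MWAIT;

	if (caps->cpuidle_available)
		return DEAD_CPUIDLE;

	return DEAD_HLT;
}
```

Placing the vendor check first means an AMD CPU that advertises MWAIT still
falls through to CPUIDLE, which is the point of the patch.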


[PATCH v2] x86/smpboot: Don't do mwait_play_dead() on AMD systems

2018-04-03 Thread Yazen Ghannam
From: Yazen Ghannam 

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper cstates than C1 on current systems.

With play_dead() we expect the OS to use the deepest state available.
The deepest state available on AMD systems is reached through SystemIO
or HALT. If MWAIT is available, we use it instead of the other methods,
so we never reach the deepest state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, we'll use
CPUIDLE to enter the deepest state advertised by firmware. If CPUIDLE is
not available then we fall back to HALT.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180402183424.48222-1-yazen.ghan...@amd.com

v1->v2:
* Drop comment in code.

 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ff99e2b6fc54..12599e55e040 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1536,6 +1536,8 @@ static inline void mwait_play_dead(void)
void *mwait_ptr;
int i;
 
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+   return;
if (!this_cpu_has(X86_FEATURE_MWAIT))
return;
if (!this_cpu_has(X86_FEATURE_CLFLUSH))
-- 
2.14.1



[PATCH] x86/MCE, EDAC/mce_amd: Save all aux registers on SMCA systems

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

The Intel SDM and AMD APM both state that the auxiliary MCA registers
should be read if their respective valid bits are set in MCA_STATUS.

The Processor Programming Reference for AMD Fam17h systems has a new
recommendation that the auxiliary registers should be saved
unconditionally. This recommendation can be retroactively applied to
older AMD systems. However, we only need to apply this to SMCA systems
to avoid modifying behavior on older systems.

Define a separate function to save all auxiliary registers on SMCA
systems. Call this function from both the MCE handlers and the AMD LVT
interrupt handlers so that we don't duplicate code.

Print all auxiliary registers in EDAC/mce_amd. Don't restrict this to
SMCA systems in order to save a conditional and keep the format similar
between SMCA and non-SMCA systems.

Signed-off-by: Yazen Ghannam 
---
Links:
https://lkml.kernel.org/r/20180326191526.64314-1-yazen.ghan...@amd.com
https://lkml.kernel.org/r/20180326191526.64314-2-yazen.ghan...@amd.com

 arch/x86/kernel/cpu/mcheck/mce-internal.h |  6 +++
 arch/x86/kernel/cpu/mcheck/mce.c          | 20 ++
 arch/x86/kernel/cpu/mcheck/mce_amd.c      | 65 +--
 drivers/edac/mce_amd.c                    | 12 ++
 4 files changed, 57 insertions(+), 46 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 374d1aa66952..67a2c7c095ca 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -59,6 +59,12 @@ static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 static inline void cmci_disable_bank(int bank) { }
 #endif
 
+#ifdef CONFIG_X86_MCE_AMD
+bool smca_read_aux(struct mce *m, int bank);
+#else
+static inline bool smca_read_aux(struct mce *m, int bank) { return false; }
+#endif
+
 void mce_timer_kick(unsigned long interval);
 
 #ifdef CONFIG_ACPI_APEI
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 42cf2880d0ed..6be63e9e067d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -639,6 +639,9 @@ static struct notifier_block mce_default_nb = {
  */
 static void mce_read_aux(struct mce *m, int i)
 {
+   if (smca_read_aux(m, i))
+   return;
+
if (m->status & MCI_STATUS_MISCV)
m->misc = mce_rdmsrl(msr_ops.misc(i));
 
@@ -653,23 +656,6 @@ static void mce_read_aux(struct mce *m, int i)
m->addr >>= shift;
m->addr <<= shift;
}
-
-   /*
-* Extract [55:lsb] where lsb is the least significant
-* *valid* bit of the address bits.
-*/
-   if (mce_flags.smca) {
-   u8 lsb = (m->addr >> 56) & 0x3f;
-
-   m->addr &= GENMASK_ULL(55, lsb);
-   }
-   }
-
-   if (mce_flags.smca) {
-   m->ipid = mce_rdmsrl(MSR_AMD64_SMCA_MCx_IPID(i));
-
-   if (m->status & MCI_STATUS_SYNDV)
-   m->synd = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND(i));
}
 }
 
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index f7666eef4a87..b00d5fff1848 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -244,6 +244,47 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
}
 }
 
+
+static bool _smca_read_aux(struct mce *m, int bank, bool read_addr)
+{
+   if (!mce_flags.smca)
+   return false;
+
+   rdmsrl(MSR_AMD64_SMCA_MCx_IPID(bank), m->ipid);
+   rdmsrl(MSR_AMD64_SMCA_MCx_SYND(bank), m->synd);
+
+   /*
+* We should already have a value if we're coming from the Threshold LVT
+* interrupt handler. Otherwise, read it now.
+*/
+   if (!m->misc)
+   rdmsrl(msr_ops.misc(bank), m->misc);
+
+   /*
+* Read MCA_ADDR if we don't have it already. We should already have it
+* if we're coming from the interrupt handlers.
+*/
+   if (read_addr)
+   rdmsrl(msr_ops.addr(bank), m->addr);
+
+   /*
+* Extract [55:lsb] where lsb is the least significant
+* *valid* bit of the address bits.
+*/
+   if (m->addr) {
+   u8 lsb = (m->addr >> 56) & 0x3f;
+
+   m->addr &= GENMASK_ULL(55, lsb);
+   }
+
+   return true;
+}
+
+bool smca_read_aux(struct mce *m, int bank)
+{
+   return _smca_read_aux(m, bank, true);
+}
+
 struct thresh_restart {
struct threshold_block  *b;
int reset;
@@ -799,30 +840,12 @@ static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc)
mce_setup(&m);
 
m.status = statu
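
The address masking in _smca_read_aux() can be tried in isolation. In the
sketch below, which assumes only what the hunk above shows, bits 61:56 of the
raw MCA_ADDR value give the least significant valid address bit and the result
keeps bits 55:lsb. genmask_ull() is a local stand-in for GENMASK_ULL() and
smca_mask_addr() is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Local stand-in for GENMASK_ULL(h, l): bits h..l set. When l > h the
 * two halves do not overlap and the result is 0.
 */
static uint64_t genmask_ull(unsigned int h, unsigned int l)
{
	return (~0ULL << l) & (~0ULL >> (63 - h));
}

/* Keep only the valid address bits [55:lsb] of a raw MCA_ADDR value. */
static uint64_t smca_mask_addr(uint64_t addr)
{
	if (addr) {
		/* The 6-bit LSB field sits in bits 61:56. */
		uint8_t lsb = (addr >> 56) & 0x3f;

		addr &= genmask_ull(55, lsb);
	}

	return addr;
}
```

An lsb value above 55 yields an empty mask here, so such an input reduces to 0
rather than leaving stale upper bits in the address.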

[PATCH] x86/smpboot: Don't do mwait_play_dead() on AMD systems

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper cstates than C1 on current systems.

With play_dead() we expect the OS to use the deepest state available.
The deepest state available on AMD systems is reached through SystemIO
or HALT. If MWAIT is available, we use it instead of the other methods,
so we never reach the deepest state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, we'll use
CPUIDLE to enter the deepest state advertised by firmware. If CPUIDLE is
not available then we fall back to HALT.

Signed-off-by: Yazen Ghannam 
---
 arch/x86/kernel/smpboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ff99e2b6fc54..67cf00b25f83 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1536,6 +1536,9 @@ static inline void mwait_play_dead(void)
void *mwait_ptr;
int i;
 
+   /* Don't try native MWAIT on AMD. Stick to CPUIDLE and HALT. */
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+   return;
if (!this_cpu_has(X86_FEATURE_MWAIT))
return;
if (!this_cpu_has(X86_FEATURE_CLFLUSH))
-- 
2.14.1



[PATCH v4 1/8] efi: Fix IA32/X64 Processor Error Record definition

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

Based on UEFI 2.7 Table 252. Processor Error Record, the "Local APIC_ID"
field is 8 bytes but Linux defines this field as 1 byte.

Fix this in the struct cper_sec_proc_ia definition.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-2-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.

v1->v2:
* No changes

 include/linux/cper.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/cper.h b/include/linux/cper.h
index d14ef4e77c8a..4b5f8459b403 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -381,7 +381,7 @@ struct cper_sec_proc_generic {
 /* IA32/X64 Processor Error Section */
 struct cper_sec_proc_ia {
__u64   validation_bits;
-   __u8    lapic_id;
+   __u64   lapic_id;
__u8    cpuid[48];
 };
 
-- 
2.14.1



[PATCH v4 5/8] efi: Decode IA32/X64 Cache, TLB, and Bus Check structures

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

Print the common fields of the Cache, TLB, and Bus check structures. The
fields of these three check types are the same except for a few more
fields in the Bus check structure. The remaining Bus check structure
fields will be decoded in a following patch.

Based on UEFI 2.7,
Table 254. IA32/X64 Cache Check Structure
Table 255. IA32/X64 TLB Check Structure
Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-6-yazen.ghan...@amd.com

v3->v4:
* Drop INDENT_SP use.

v2->v3:
* Fix table numbers in commit message.
* Don't print raw validation bits.

v1->v2:
* Add parentheses around "check" expression in macro.
* Change use of enum type to u8.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 99 -
 1 file changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5438097b93ac..f70c46f7a4db 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -30,6 +30,25 @@
 #define INFO_VALID_RESPONDER_ID BIT_ULL(3)
 #define INFO_VALID_IP  BIT_ULL(4)
 
+#define CHECK_VALID_TRANS_TYPE BIT_ULL(0)
+#define CHECK_VALID_OPERATION  BIT_ULL(1)
+#define CHECK_VALID_LEVEL  BIT_ULL(2)
+#define CHECK_VALID_PCC BIT_ULL(3)
+#define CHECK_VALID_UNCORRECTED BIT_ULL(4)
+#define CHECK_VALID_PRECISE_IP BIT_ULL(5)
+#define CHECK_VALID_RESTARTABLE_IP BIT_ULL(6)
+#define CHECK_VALID_OVERFLOW   BIT_ULL(7)
+
+#define CHECK_VALID_BITS(check) (((check) & GENMASK_ULL(15, 0)))
+#define CHECK_TRANS_TYPE(check) (((check) & GENMASK_ULL(17, 16)) >> 16)
+#define CHECK_OPERATION(check) (((check) & GENMASK_ULL(21, 18)) >> 18)
+#define CHECK_LEVEL(check) (((check) & GENMASK_ULL(24, 22)) >> 22)
+#define CHECK_PCC  BIT_ULL(25)
+#define CHECK_UNCORRECTED  BIT_ULL(26)
+#define CHECK_PRECISE_IP   BIT_ULL(27)
+#define CHECK_RESTARTABLE_IP   BIT_ULL(28)
+#define CHECK_OVERFLOW BIT_ULL(29)
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -52,11 +71,81 @@ static enum err_types cper_get_err_type(const guid_t *err_type)
return N_ERR_TYPES;
 }
 
+static const char * const ia_check_trans_type_strs[] = {
+   "Instruction",
+   "Data Access",
+   "Generic",
+};
+
+static const char * const ia_check_op_strs[] = {
+   "generic error",
+   "generic read",
+   "generic write",
+   "data read",
+   "data write",
+   "instruction fetch",
+   "prefetch",
+   "eviction",
+   "snoop",
+};
+
+static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
+{
+   printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
+}
+
+static void print_err_info(const char *pfx, u8 err_type, u64 check)
+{
+   u16 validation_bits = CHECK_VALID_BITS(check);
+
+   if (err_type == ERR_TYPE_MS)
+   return;
+
+   if (validation_bits & CHECK_VALID_TRANS_TYPE) {
+   u8 trans_type = CHECK_TRANS_TYPE(check);
+
+   printk("%sTransaction Type: %u, %s\n", pfx, trans_type,
+  trans_type < ARRAY_SIZE(ia_check_trans_type_strs) ?
+  ia_check_trans_type_strs[trans_type] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_OPERATION) {
+   u8 op = CHECK_OPERATION(check);
+
+   /*
+* CACHE has more operation types than TLB or BUS, though the
+* name and the order are the same.
+*/
+   u8 max_ops = (err_type == ERR_TYPE_CACHE) ? 9 : 7;
+
+   printk("%sOperation: %u, %s\n", pfx, op,
+  op < max_ops ? ia_check_op_strs[op] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_LEVEL)
+   printk("%sLevel: %llu\n", pfx, CHECK_LEVEL(check));
+
+   if (validation_bits & CHECK_VALID_PCC)
+   print_bool("Processor Context Corrupt", pfx, check, CHECK_PCC);
+
+   if (validation_bits & CHECK_VALID_UNCORRECTED)
+   print_bool("Uncorrected", pfx, check, CHECK_UNCORRECTED);
+
+   if (validation_bits & CHECK_VALID_PRECISE_IP)
+   print_bool("Precise IP", pfx, check, CHECK_PRECISE_IP);
+
+   if (validation_bits & CHECK_VALID_RESTARTABLE_IP)
+   print_bool("Restartable IP", pfx, check, CHECK_RESTA

[PATCH v4 7/8] efi: Decode IA32/X64 MS Check structure

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

The IA32/X64 MS Check structure varies from the other Check structures
in the bit positions of its fields, and it includes an additional
"Error Type" field.

Decode the MS Check structure in a separate function.

Based on UEFI 2.7 Table 257. IA32/X64 MS Check Field Description.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-8-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.

v1->v2:
* Add parentheses around "check" expression in macro.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 55 -
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5e6716564dba..356b8d326219 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -57,6 +57,20 @@
 #define CHECK_BUS_TIME_OUT BIT_ULL(32)
 #define CHECK_BUS_ADDR_SPACE(check)(((check) & GENMASK_ULL(34, 33)) >> 33)
 
+#define CHECK_VALID_MS_ERR_TYPE BIT_ULL(0)
+#define CHECK_VALID_MS_PCC BIT_ULL(1)
+#define CHECK_VALID_MS_UNCORRECTED BIT_ULL(2)
+#define CHECK_VALID_MS_PRECISE_IP  BIT_ULL(3)
+#define CHECK_VALID_MS_RESTARTABLE_IP  BIT_ULL(4)
+#define CHECK_VALID_MS_OVERFLOW BIT_ULL(5)
+
+#define CHECK_MS_ERR_TYPE(check)   (((check) & GENMASK_ULL(18, 16)) >> 16)
+#define CHECK_MS_PCC   BIT_ULL(19)
+#define CHECK_MS_UNCORRECTED   BIT_ULL(20)
+#define CHECK_MS_PRECISE_IP BIT_ULL(21)
+#define CHECK_MS_RESTARTABLE_IP BIT_ULL(22)
+#define CHECK_MS_OVERFLOW  BIT_ULL(23)
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -111,17 +125,56 @@ static const char * const ia_check_bus_addr_space_strs[] = {
"Other Transaction",
 };
 
+static const char * const ia_check_ms_error_type_strs[] = {
+   "No Error",
+   "Unclassified",
+   "Microcode ROM Parity Error",
+   "External Error",
+   "FRC Error",
+   "Internal Unclassified",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
 }
 
+static void print_err_info_ms(const char *pfx, u16 validation_bits, u64 check)
+{
+   if (validation_bits & CHECK_VALID_MS_ERR_TYPE) {
+   u8 err_type = CHECK_MS_ERR_TYPE(check);
+
+   printk("%sError Type: %u, %s\n", pfx, err_type,
+  err_type < ARRAY_SIZE(ia_check_ms_error_type_strs) ?
+  ia_check_ms_error_type_strs[err_type] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_MS_PCC)
+   print_bool("Processor Context Corrupt", pfx, check, CHECK_MS_PCC);
+
+   if (validation_bits & CHECK_VALID_MS_UNCORRECTED)
+   print_bool("Uncorrected", pfx, check, CHECK_MS_UNCORRECTED);
+
+   if (validation_bits & CHECK_VALID_MS_PRECISE_IP)
+   print_bool("Precise IP", pfx, check, CHECK_MS_PRECISE_IP);
+
+   if (validation_bits & CHECK_VALID_MS_RESTARTABLE_IP)
+   print_bool("Restartable IP", pfx, check, CHECK_MS_RESTARTABLE_IP);
+
+   if (validation_bits & CHECK_VALID_MS_OVERFLOW)
+   print_bool("Overflow", pfx, check, CHECK_MS_OVERFLOW);
+}
+
 static void print_err_info(const char *pfx, u8 err_type, u64 check)
 {
u16 validation_bits = CHECK_VALID_BITS(check);
 
+   /*
+* The MS Check structure varies a lot from the others, so use a
+* separate function for decoding.
+*/
if (err_type == ERR_TYPE_MS)
-   return;
+   return print_err_info_ms(pfx, validation_bits, check);
 
if (validation_bits & CHECK_VALID_TRANS_TYPE) {
u8 trans_type = CHECK_TRANS_TYPE(check);
-- 
2.14.1
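
The bounds-checked table lookup used for the "Error Type" field can be tried out in userspace. The sketch below assumes the same bit positions as the patch (bits 18:16) but writes the mask by hand instead of using GENMASK_ULL(); the helper name ms_err_type_name() is hypothetical and is not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* "Error Type" lives in bits 18:16 of the MS check word. */
#define CHECK_MS_ERR_TYPE(check) (((check) >> 16) & 0x7ULL)

static const char * const ms_error_type_strs[] = {
	"No Error",
	"Unclassified",
	"Microcode ROM Parity Error",
	"External Error",
	"FRC Error",
	"Internal Unclassified",
};

/*
 * Hypothetical helper: map the 3-bit Error Type to its name. The field can
 * hold 0..7 but only six names are defined, so the rest map to "unknown".
 */
static const char *ms_err_type_name(uint64_t check)
{
	uint8_t t = (uint8_t)CHECK_MS_ERR_TYPE(check);

	return t < ARRAY_SIZE(ms_error_type_strs) ?
	       ms_error_type_strs[t] : "unknown";
}
```

Indexing through ARRAY_SIZE() rather than a hard-coded limit keeps the lookup safe if names are added to the table later.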



[PATCH v4 8/8] efi: Decode IA32/X64 Context Info structure

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

Print the fields of the IA32/X64 Context Information structure.

Print the "Register Array" as raw values. Some context types are defined
in the UEFI spec, so more detailed decoding may be added in the future.

Based on UEFI 2.7 section N.2.4.2.2 IA32/X64 Processor Context
Information Structure.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-9-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* No change.

v1->v2:
* Add parentheses around "bits" expression in macro.
* Change VALID_PROC_CNTXT_INFO_NUM to VALID_PROC_CXT_INFO_NUM.
* Fix indentation on multi-line statements.
* Remove conditional to skip unknown context types. The context info
  should be printed even if the type is unknown. This is just like what
  we do for the error information.

 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 356b8d326219..2531de49f56c 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -10,6 +10,7 @@
 #define VALID_LAPIC_ID BIT_ULL(0)
 #define VALID_CPUID_INFO   BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)  (((bits) & GENMASK_ULL(7, 2)) >> 2)
+#define VALID_PROC_CXT_INFO_NUM(bits)  (((bits) & GENMASK_ULL(13, 8)) >> 8)
 
 #define INFO_ERR_STRUCT_TYPE_CACHE \
GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,   \
@@ -71,6 +72,9 @@
 #define CHECK_MS_RESTARTABLE_IP BIT_ULL(22)
 #define CHECK_MS_OVERFLOW  BIT_ULL(23)
 
+#define CTX_TYPE_MSR   1
+#define CTX_TYPE_MMREG 7
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -134,6 +138,17 @@ static const char * const ia_check_ms_error_type_strs[] = {
"Internal Unclassified",
 };
 
+static const char * const ia_reg_ctx_strs[] = {
+   "Unclassified Data",
+   "MSR Registers (Machine Check and other MSRs)",
+   "32-bit Mode Execution Context",
+   "64-bit Mode Execution Context",
+   "FXSAVE Context",
+   "32-bit Mode Debug Registers (DR0-DR7)",
+   "64-bit Mode Debug Registers (DR0-DR7)",
+   "Memory Mapped Registers",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -242,6 +257,7 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
int i;
struct cper_ia_err_info *err_info;
+   struct cper_ia_proc_ctx *ctx_info;
char newpfx[64], infopfx[64];
u8 err_type;
 
@@ -305,4 +321,36 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 
err_info++;
}
+
+   ctx_info = (struct cper_ia_proc_ctx *)err_info;
+   for (i = 0; i < VALID_PROC_CXT_INFO_NUM(proc->validation_bits); i++) {
+   int size = sizeof(*ctx_info) + ctx_info->reg_arr_size;
+   int groupsize = 4;
+
+   printk("%sContext Information Structure %d:\n", pfx, i);
+
+   printk("%sRegister Context Type: %s\n", newpfx,
+  ctx_info->reg_ctx_type < ARRAY_SIZE(ia_reg_ctx_strs) ?
+  ia_reg_ctx_strs[ctx_info->reg_ctx_type] : "unknown");
+
+   printk("%sRegister Array Size: 0x%04x\n", newpfx,
+  ctx_info->reg_arr_size);
+
+   if (ctx_info->reg_ctx_type == CTX_TYPE_MSR) {
+   groupsize = 8; /* MSRs are 8 bytes wide. */
+   printk("%sMSR Address: 0x%08x\n", newpfx,
+  ctx_info->msr_addr);
+   }
+
+   if (ctx_info->reg_ctx_type == CTX_TYPE_MMREG) {
+   printk("%sMM Register Address: 0x%016llx\n", newpfx,
+  ctx_info->mm_reg_addr);
+   }
+
+   printk("%sRegister Array:\n", newpfx);
+   print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, groupsize,
+  (ctx_info + 1), ctx_info->reg_arr_size, 0);
+
+   ctx_info = (struct cper_ia_proc_ctx *)((long)ctx_info + size);
+   }
 }
-- 
2.14.1
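
The loop above steps through variable-size records: each one is a fixed header followed by reg_arr_size bytes of register data. A userspace sketch of that walk follows, under stated assumptions: struct ctx_hdr is a hypothetical 16-byte stand-in built from the fields the patch reads, and the length checks are an addition for illustration that the posted patch does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header layout, modelled on the fields the patch dereferences. */
struct ctx_hdr {
	uint16_t reg_ctx_type;
	uint16_t reg_arr_size;
	uint32_t msr_addr;
	uint64_t mm_reg_addr;
};

/*
 * Hypothetical helper: count the complete context records in buf.
 * Stops at max records, or earlier if a header or its register array
 * would run past len bytes.
 */
static int count_ctx_records(const uint8_t *buf, size_t len, int max)
{
	size_t off = 0;
	int n = 0;

	while (n < max && len - off >= sizeof(struct ctx_hdr)) {
		struct ctx_hdr hdr;
		size_t size;

		/* Copy the header out to avoid unaligned accesses. */
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* Same size computation as the patch: header + register array. */
		size = sizeof(hdr) + hdr.reg_arr_size;
		if (size > len - off)
			break;

		off += size;
		n++;
	}

	return n;
}
```

The step size comes from data inside the record itself, which is why the sketch compares it against the remaining length before advancing.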



[PATCH v4 6/8] efi: Decode additional IA32/X64 Bus Check fields

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

The "Participation Type", "Time Out", and "Address Space" fields are
unique to the IA32/X64 Bus Check structure. Print these fields.

Based on UEFI 2.7 Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-7-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.

v1->v2:
* Add parentheses around "check" expression in macro.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 44 +
 1 file changed, 44 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index f70c46f7a4db..5e6716564dba 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -39,6 +39,10 @@
 #define CHECK_VALID_RESTARTABLE_IP BIT_ULL(6)
 #define CHECK_VALID_OVERFLOW   BIT_ULL(7)
 
+#define CHECK_VALID_BUS_PART_TYPE  BIT_ULL(8)
+#define CHECK_VALID_BUS_TIME_OUT   BIT_ULL(9)
+#define CHECK_VALID_BUS_ADDR_SPACE BIT_ULL(10)
+
 #define CHECK_VALID_BITS(check) (((check) & GENMASK_ULL(15, 0)))
 #define CHECK_TRANS_TYPE(check) (((check) & GENMASK_ULL(17, 16)) >> 16)
 #define CHECK_OPERATION(check) (((check) & GENMASK_ULL(21, 18)) >> 18)
@@ -49,6 +53,10 @@
 #define CHECK_RESTARTABLE_IP   BIT_ULL(28)
 #define CHECK_OVERFLOW BIT_ULL(29)
 
+#define CHECK_BUS_PART_TYPE(check) (((check) & GENMASK_ULL(31, 30)) >> 30)
+#define CHECK_BUS_TIME_OUT BIT_ULL(32)
+#define CHECK_BUS_ADDR_SPACE(check)(((check) & GENMASK_ULL(34, 33)) >> 33)
+
 enum err_types {
ERR_TYPE_CACHE = 0,
ERR_TYPE_TLB,
@@ -89,6 +97,20 @@ static const char * const ia_check_op_strs[] = {
"snoop",
 };
 
+static const char * const ia_check_bus_part_type_strs[] = {
+   "Local Processor originated request",
+   "Local Processor responded to request",
+   "Local Processor observed",
+   "Generic",
+};
+
+static const char * const ia_check_bus_addr_space_strs[] = {
+   "Memory Access",
+   "Reserved",
+   "I/O",
+   "Other Transaction",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -139,6 +161,28 @@ static void print_err_info(const char *pfx, u8 err_type, u64 check)
 
if (validation_bits & CHECK_VALID_OVERFLOW)
print_bool("Overflow", pfx, check, CHECK_OVERFLOW);
+
+   if (err_type != ERR_TYPE_BUS)
+   return;
+
+   if (validation_bits & CHECK_VALID_BUS_PART_TYPE) {
+   u8 part_type = CHECK_BUS_PART_TYPE(check);
+
+   printk("%sParticipation Type: %u, %s\n", pfx, part_type,
+  part_type < ARRAY_SIZE(ia_check_bus_part_type_strs) ?
+  ia_check_bus_part_type_strs[part_type] : "unknown");
+   }
+
+   if (validation_bits & CHECK_VALID_BUS_TIME_OUT)
+   print_bool("Time Out", pfx, check, CHECK_BUS_TIME_OUT);
+
+   if (validation_bits & CHECK_VALID_BUS_ADDR_SPACE) {
+   u8 addr_space = CHECK_BUS_ADDR_SPACE(check);
+
+   printk("%sAddress Space: %u, %s\n", pfx, addr_space,
+  addr_space < ARRAY_SIZE(ia_check_bus_addr_space_strs) ?
+  ia_check_bus_addr_space_strs[addr_space] : "unknown");
+   }
 }
 
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
-- 
2.14.1
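
To check the Bus Check bit positions outside the kernel, the following sketch may help. BIT_ULL() and GENMASK_ULL() are re-defined here as userspace stand-ins for the kernel macros, and make_bus_check() is a hypothetical helper that packs a test value; neither is part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT_ULL() and GENMASK_ULL(). */
#define BIT_ULL(n)        (1ULL << (n))
#define GENMASK_ULL(h, l) ((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/* Validation bits and field accessors, copied from the patch. */
#define CHECK_VALID_BUS_PART_TYPE   BIT_ULL(8)
#define CHECK_VALID_BUS_TIME_OUT    BIT_ULL(9)
#define CHECK_VALID_BUS_ADDR_SPACE  BIT_ULL(10)

#define CHECK_BUS_PART_TYPE(check)  (((check) & GENMASK_ULL(31, 30)) >> 30)
#define CHECK_BUS_TIME_OUT          BIT_ULL(32)
#define CHECK_BUS_ADDR_SPACE(check) (((check) & GENMASK_ULL(34, 33)) >> 33)

/*
 * Hypothetical helper: build a check word with all three Bus-only fields
 * marked valid and set to the given values (each 2-bit field is masked).
 */
static uint64_t make_bus_check(unsigned int part_type, int time_out,
			       unsigned int addr_space)
{
	uint64_t check = CHECK_VALID_BUS_PART_TYPE |
			 CHECK_VALID_BUS_TIME_OUT |
			 CHECK_VALID_BUS_ADDR_SPACE;

	check |= ((uint64_t)part_type & 0x3) << 30;
	if (time_out)
		check |= CHECK_BUS_TIME_OUT;
	check |= ((uint64_t)addr_space & 0x3) << 33;

	return check;
}
```

Packing a value and reading it back through the accessors is a quick way to confirm that the mask and the shift of each field agree.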



[PATCH v4 4/8] efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

For easier handling, match the known IA32/X64 error structure GUIDs to
enums.

Also, print out the name of the matching Error Structure Type.

Only print the GUID for unknown types.

GUIDs taken from UEFI 2.7 section N.2.4.2.1 IA32/X64 Processor Error
Information Structure.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-5-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Only print raw GUID for unknown error types.

v1->v2:
* Change use of enum type to u8.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 47 +++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index e0633a103fcf..5438097b93ac 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -11,17 +11,53 @@
 #define VALID_CPUID_INFO   BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)  (((bits) & GENMASK_ULL(7, 2)) >> 2)
 
+#define INFO_ERR_STRUCT_TYPE_CACHE \
+   GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,   \
+ 0x57, 0x3F, 0xAD, 0x2C)
+#define INFO_ERR_STRUCT_TYPE_TLB   \
+   GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25, 0x0A, 0x3B,   \
+ 0x9A, 0xDB, 0x63, 0xC3)
+#define INFO_ERR_STRUCT_TYPE_BUS   \
+   GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59, 0x5E, 0xEF,   \
+ 0x92, 0xFF, 0xA6, 0x3C)
+#define INFO_ERR_STRUCT_TYPE_MS \
+   GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3, 0xB0, 0xB5,   \
+ 0xB0, 0xA7, 0x43, 0x14)
+
 #define INFO_VALID_CHECK_INFO  BIT_ULL(0)
 #define INFO_VALID_TARGET_ID   BIT_ULL(1)
 #define INFO_VALID_REQUESTOR_ID BIT_ULL(2)
 #define INFO_VALID_RESPONDER_ID BIT_ULL(3)
 #define INFO_VALID_IP  BIT_ULL(4)
 
+enum err_types {
+   ERR_TYPE_CACHE = 0,
+   ERR_TYPE_TLB,
+   ERR_TYPE_BUS,
+   ERR_TYPE_MS,
+   N_ERR_TYPES
+};
+
+static enum err_types cper_get_err_type(const guid_t *err_type)
+{
+   if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_CACHE))
+   return ERR_TYPE_CACHE;
+   else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_TLB))
+   return ERR_TYPE_TLB;
+   else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_BUS))
+   return ERR_TYPE_BUS;
+   else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_MS))
+   return ERR_TYPE_MS;
+   else
+   return N_ERR_TYPES;
+}
+
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
int i;
struct cper_ia_err_info *err_info;
char newpfx[64];
+   u8 err_type;
 
if (proc->validation_bits & VALID_LAPIC_ID)
printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
@@ -38,8 +74,15 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
printk("%sError Information Structure %d:\n", pfx, i);
 
-   printk("%sError Structure Type: %pUl\n", newpfx,
-  &err_info->err_type);
+   err_type = cper_get_err_type(&err_info->err_type);
+   printk("%sError Structure Type: %s\n", newpfx,
+  err_type < ARRAY_SIZE(cper_proc_error_type_strs) ?
+  cper_proc_error_type_strs[err_type] : "unknown");
+
+   if (err_type >= N_ERR_TYPES) {
+   printk("%sError Structure Type: %pUl\n", newpfx,
+  &err_info->err_type);
+   }
 
if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
printk("%sCheck Information: 0x%016llx\n", newpfx,
-- 
2.14.1
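
The GUID-to-enum mapping can also be written as a table lookup. The sketch below is a userspace variant, not the if/else chain from the patch: guid_t is a simplified 16-byte stand-in, GUID_INIT() is re-defined locally with the little-endian packing of the first three fields that the kernel macro uses, and get_err_type() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kernel's guid_t. */
typedef struct { uint8_t b[16]; } guid_t;

/* Local stand-in for GUID_INIT(): first three fields are little-endian. */
#define GUID_INIT(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)		\
	{{ (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff,		\
	   ((a) >> 24) & 0xff,						\
	   (b) & 0xff, ((b) >> 8) & 0xff,				\
	   (c) & 0xff, ((c) >> 8) & 0xff,				\
	   (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }}

enum err_types {
	ERR_TYPE_CACHE = 0,
	ERR_TYPE_TLB,
	ERR_TYPE_BUS,
	ERR_TYPE_MS,
	N_ERR_TYPES
};

/* GUID values copied from the patch, indexed by enum err_types. */
static const guid_t err_guids[N_ERR_TYPES] = {
	GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,
		  0x57, 0x3F, 0xAD, 0x2C),
	GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25, 0x0A, 0x3B,
		  0x9A, 0xDB, 0x63, 0xC3),
	GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59, 0x5E, 0xEF,
		  0x92, 0xFF, 0xA6, 0x3C),
	GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3, 0xB0, 0xB5,
		  0xB0, 0xA7, 0x43, 0x14),
};

/* Hypothetical helper: return the matching type, or N_ERR_TYPES if none. */
static enum err_types get_err_type(const guid_t *g)
{
	for (int i = 0; i < N_ERR_TYPES; i++) {
		if (!memcmp(g, &err_guids[i], sizeof(*g)))
			return (enum err_types)i;
	}

	return N_ERR_TYPES;
}
```

Returning N_ERR_TYPES for an unmatched GUID mirrors the patch, which uses that value to decide when to print the raw GUID.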



[PATCH v4 3/8] efi: Decode IA32/X64 Processor Error Info Structure

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

Print the fields in the IA32/X64 Processor Error Info Structure.

Based on UEFI 2.7 Table 253. IA32/X64 Processor Error Information
Structure.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-4-yazen.ghan...@amd.com

v3->v4:
* Drop INDENT_SP use.

v2->v3:
* Fix table number in commit message.
* Don't print raw validation bits.

v1->v2:
* Add parentheses around "bits" expression in macro.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 863f0cd2a0ff..e0633a103fcf 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -9,9 +9,20 @@
  */
 #define VALID_LAPIC_ID BIT_ULL(0)
 #define VALID_CPUID_INFO   BIT_ULL(1)
+#define VALID_PROC_ERR_INFO_NUM(bits)  (((bits) & GENMASK_ULL(7, 2)) >> 2)
+
+#define INFO_VALID_CHECK_INFO  BIT_ULL(0)
+#define INFO_VALID_TARGET_ID   BIT_ULL(1)
+#define INFO_VALID_REQUESTOR_ID BIT_ULL(2)
+#define INFO_VALID_RESPONDER_ID BIT_ULL(3)
+#define INFO_VALID_IP  BIT_ULL(4)
 
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
+   int i;
+   struct cper_ia_err_info *err_info;
+   char newpfx[64];
+
if (proc->validation_bits & VALID_LAPIC_ID)
printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
 
@@ -20,4 +31,41 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
   sizeof(proc->cpuid), 0);
}
+
+   snprintf(newpfx, sizeof(newpfx), "%s ", pfx);
+
+   err_info = (struct cper_ia_err_info *)(proc + 1);
+   for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
+   printk("%sError Information Structure %d:\n", pfx, i);
+
+   printk("%sError Structure Type: %pUl\n", newpfx,
+  &err_info->err_type);
+
+   if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
+   printk("%sCheck Information: 0x%016llx\n", newpfx,
+  err_info->check_info);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_TARGET_ID) {
+   printk("%sTarget Identifier: 0x%016llx\n",
+  newpfx, err_info->target_id);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_REQUESTOR_ID) {
+   printk("%sRequestor Identifier: 0x%016llx\n",
+  newpfx, err_info->requestor_id);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_RESPONDER_ID) {
+   printk("%sResponder Identifier: 0x%016llx\n",
+  newpfx, err_info->responder_id);
+   }
+
+   if (err_info->validation_bits & INFO_VALID_IP) {
+   printk("%sInstruction Pointer: 0x%016llx\n",
+  newpfx, err_info->ip);
+   }
+
+   err_info++;
+   }
 }
-- 
2.14.1
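
The number of Error Information Structures comes from bits 7:2 of the section's validation bits, and the number of context structures from bits 13:8. A small userspace sketch of that arithmetic follows. The two size constants are assumptions (a 64-byte section header and 64 bytes per Error Information Structure), the masks are written by hand rather than with GENMASK_ULL(), and all three helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts encoded in the section's validation bits. */
#define VALID_PROC_ERR_INFO_NUM(bits) (((bits) >> 2) & 0x3fULL) /* bits 7:2 */
#define VALID_PROC_CXT_INFO_NUM(bits) (((bits) >> 8) & 0x3fULL) /* bits 13:8 */

/* Assumed sizes: validation_bits + lapic_id + cpuid[48], and one info entry. */
#define PROC_IA_HDR_SIZE 64u
#define ERR_INFO_SIZE    64u

/* Hypothetical helper: number of Error Information Structures. */
static unsigned int err_info_count(uint64_t bits)
{
	return (unsigned int)VALID_PROC_ERR_INFO_NUM(bits);
}

/* Hypothetical helper: number of Context Information Structures. */
static unsigned int ctx_info_count(uint64_t bits)
{
	return (unsigned int)VALID_PROC_CXT_INFO_NUM(bits);
}

/*
 * Hypothetical helper: smallest section length that can hold the header
 * plus every Error Information Structure the validation bits announce.
 */
static size_t min_len_for_err_info(uint64_t bits)
{
	return PROC_IA_HDR_SIZE + (size_t)err_info_count(bits) * ERR_INFO_SIZE;
}
```

A caller could compare such a minimum against the section length it was handed before walking the array.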



[PATCH v4 2/8] efi: Decode IA32/X64 Processor Error Section

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

Recognize the IA32/X64 Processor Error Section.

Do the section decoding in a new "cper-x86.c" file and add this to the
Makefile depending on a new "UEFI_CPER_X86" config option.

Print the Local APIC ID and CPUID info from the Processor Error Record.

The "Processor Error Info" and "Processor Context" fields will be
decoded in following patches.

Based on UEFI 2.7 Table 252. Processor Error Record.

Signed-off-by: Yazen Ghannam 
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-3-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.
* Don't print raw validation bits.

v1->v2:
* Change config option depends to "X86" instead of "X86_32 || X64_64".
* Remove extra newline in Makefile changes.
* Drop author copyright line.

 drivers/firmware/efi/Kconfig    |  5 +
 drivers/firmware/efi/Makefile   |  1 +
 drivers/firmware/efi/cper-x86.c | 23 +++
 drivers/firmware/efi/cper.c     | 10 ++
 include/linux/cper.h            |  2 ++
 5 files changed, 41 insertions(+)
 create mode 100644 drivers/firmware/efi/cper-x86.c

diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 3098410abad8..781a4a337557 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -174,6 +174,11 @@ config UEFI_CPER_ARM
depends on UEFI_CPER && ( ARM || ARM64 )
default y
 
+config UEFI_CPER_X86
+   bool
+   depends on UEFI_CPER && X86
+   default y
+
 config EFI_DEV_PATH_PARSER
bool
depends on ACPI
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index cb805374f4bc..5f9f5039de50 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -31,3 +31,4 @@ obj-$(CONFIG_ARM) += $(arm-obj-y)
 obj-$(CONFIG_ARM64) += $(arm-obj-y)
 obj-$(CONFIG_EFI_CAPSULE_LOADER)   += capsule-loader.o
 obj-$(CONFIG_UEFI_CPER_ARM) += cper-arm.o
+obj-$(CONFIG_UEFI_CPER_X86) += cper-x86.o
diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
new file mode 100644
index ..863f0cd2a0ff
--- /dev/null
+++ b/drivers/firmware/efi/cper-x86.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018, Advanced Micro Devices, Inc.
+
+#include 
+
+/*
+ * We don't need a "CPER_IA" prefix since these are all locally defined.
+ * This will save us a lot of line space.
+ */
+#define VALID_LAPIC_ID BIT_ULL(0)
+#define VALID_CPUID_INFO   BIT_ULL(1)
+
+void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
+{
+   if (proc->validation_bits & VALID_LAPIC_ID)
+   printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
+
+   if (proc->validation_bits & VALID_CPUID_INFO) {
+   printk("%sCPUID Info:\n", pfx);
+   print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
+  sizeof(proc->cpuid), 0);
+   }
+}
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index c165933ebf38..5a59b582c9aa 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -469,6 +469,16 @@ cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata
cper_print_proc_arm(newpfx, arm_err);
else
goto err_section_too_small;
+#endif
+#if defined(CONFIG_UEFI_CPER_X86)
+   } else if (guid_equal(sec_type, &CPER_SEC_PROC_IA)) {
+   struct cper_sec_proc_ia *ia_err = acpi_hest_get_payload(gdata);
+
+   printk("%ssection_type: IA32/X64 processor error\n", newpfx);
+   if (gdata->error_data_length >= sizeof(*ia_err))
+   cper_print_proc_ia(newpfx, ia_err);
+   else
+   goto err_section_too_small;
 #endif
} else {
const void *err = acpi_hest_get_payload(gdata);
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 4b5f8459b403..9c703a0abe6e 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -551,5 +551,7 @@ const char *cper_mem_err_unpack(struct trace_seq *,
struct cper_mem_err_compact *);
 void cper_print_proc_arm(const char *pfx,
 const struct cper_sec_proc_arm *proc);
+void cper_print_proc_ia(const char *pfx,
+   const struct cper_sec_proc_ia *proc);
 
 #endif
-- 
2.14.1



[PATCH v4 0/8] Decode IA32/X64 CPER

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam 

This series adds decoding for the IA32/X64 Common Platform Error Record.

Patch 1 fixes the IA32/X64 Processor Error Section definition to match
the UEFI spec.

Patches 2-8 add the new decoding. The patches incrementally add the
decoding starting from the top-level "Error Section". Hopefully, this
will make reviewing a bit easier compared to one large patch.

The formatting of the field names and options is taken from the UEFI
spec. I tried to keep everything the same to make searching easier.

The patches were written to the UEFI 2.7 spec though the definition of
the IA32/X64 CPER seems to be the same as when it was introduced in
the UEFI 2.1 spec.

Link:
https://lkml.kernel.org/r/20180324184940.19762-1-yazen.ghan...@amd.com

Changes V3 to V4:
* Drop INDENT_SP use.

Changes V2 to V3:
* Fix table numbers in commit messages.
* Don't print raw validation bits.
* Only print GUID for unknown error types.

Changes V1 to V2:
* Remove stable request for all patches.
* Address Ard's comments on formatting and other issues.
* In Patch 8, always print context info even if the type is not
  recognized.

Yazen Ghannam (8):
  efi: Fix IA32/X64 Processor Error Record definition
  efi: Decode IA32/X64 Processor Error Section
  efi: Decode IA32/X64 Processor Error Info Structure
  efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs
  efi: Decode IA32/X64 Cache, TLB, and Bus Check structures
  efi: Decode additional IA32/X64 Bus Check fields
  efi: Decode IA32/X64 MS Check structure
  efi: Decode IA32/X64 Context Info structure

 drivers/firmware/efi/Kconfig    |   5 +
 drivers/firmware/efi/Makefile   |   1 +
 drivers/firmware/efi/cper-x86.c | 356 
 drivers/firmware/efi/cper.c     |  10 ++
 include/linux/cper.h            |   4 +-
 5 files changed, 375 insertions(+), 1 deletion(-)
 create mode 100644 drivers/firmware/efi/cper-x86.c

-- 
2.14.1


