Re: [PATCH v2 4/4] EDAC/mce_amd: Add support for FRU Text in MCA
On Wed, Jun 26, 2024 at 08:20:13PM +0200, Borislav Petkov wrote:
> On Wed, Jun 26, 2024 at 01:00:30PM -0500, Naik, Avadhut wrote:
> > > Why are you clearing it if you're overwriting it immediately?
> > >
> > Since its a local variable, wanted to ensure that the memory is zeroed out
> > to prevent any issues with the %s specifier, used later on.
>
> What issues?
>
> > Would you recommend removing that and using initializer instead for the
> > string?
>
> I'd recommend looking at what the code does and then really thinking whether
> that makes any sense.
>

We need to make sure the string is NULL-terminated. So the memset() could be
replaced with this:

	frutext[16] = '\0';

Or better yet, maybe we can use scnprintf() or similar.

Thanks,
Yazen
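For illustration only, the termination point above can be sketched in
userspace C. This is not the code from the patch: snprintf() stands in for the
kernel's scnprintf(), and FRU_TEXT_LEN, format_fru_text() and the two 8-byte
source fields are assumptions made up for the example.

```c
#include <stdio.h>

/* Hypothetical size: 16 bytes of text, as implied by frutext[16] = '\0'. */
#define FRU_TEXT_LEN 16

/*
 * Copy two 8-byte fields, which may lack a terminator, into a buffer of
 * FRU_TEXT_LEN + 1 bytes. The "%.8s" precision keeps snprintf() from reading
 * past either field, and snprintf() always writes a terminating NUL when the
 * size argument is non-zero, so no separate memset() is needed.
 */
static void format_fru_text(char out[FRU_TEXT_LEN + 1],
			    const char part0[8], const char part1[8])
{
	snprintf(out, FRU_TEXT_LEN + 1, "%.8s%.8s", part0, part1);
}
```

The design point is the same as in the reply: a bounded formatting call makes
the terminator explicit instead of relying on the buffer having been zeroed.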
Re: [PATCH 0/4] MCE wrapper and support for new SMCA syndrome MSRs
On Fri, Jun 21, 2024 at 06:58:23PM +0200, Borislav Petkov wrote:
> On Thu, May 30, 2024 at 04:16:16PM -0500, Avadhut Naik wrote:
> >  arch/x86/include/asm/mce.h              |  20 ++-
> >  arch/x86/kernel/cpu/mce/apei.c          | 111 ++
> >  arch/x86/kernel/cpu/mce/core.c          | 191 ++--
> >  arch/x86/kernel/cpu/mce/dev-mcelog.c    |   2 +-
> >  arch/x86/kernel/cpu/mce/genpool.c       |  20 +--
> >  arch/x86/kernel/cpu/mce/inject.c        |   4 +-
> >  arch/x86/kernel/cpu/mce/internal.h      |   4 +-
> >  drivers/acpi/acpi_extlog.c              |   2 +-
> >  drivers/acpi/nfit/mce.c                 |   2 +-
> >  drivers/edac/i7core_edac.c              |   2 +-
> >  drivers/edac/igen6_edac.c               |   2 +-
> >  drivers/edac/mce_amd.c                  |  27 +++-
> >  drivers/edac/pnd2_edac.c                |   2 +-
> >  drivers/edac/sb_edac.c                  |   2 +-
> >  drivers/edac/skx_common.c               |   2 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c |   2 +-
> >  drivers/ras/amd/fmpm.c                  |   2 +-
> >  drivers/ras/cec.c                       |   2 +-
> >  include/trace/events/mce.h              |  51 ---
> >  19 files changed, 286 insertions(+), 164 deletions(-)
>
> This doesn't apply anymore - please redo this ontop of the latest tip/master.
>

Avadhut,
You can drop the dependencies on other sets. We can sort out any conflicts as
needed.

Thanks,
Yazen
Re: [PATCH] EDAC/AMD64: Update scrub register addresses for newer models
On Mon, Jan 18, 2021 at 08:31:12PM +0100, Borislav Petkov wrote:
> On Sat, Jan 16, 2021 at 02:33:53PM +0000, Yazen Ghannam wrote:
> > +static struct {
> > +	u32 base, limit;
> > +} f17h_scrub_regs = {F17H_M30H_SCR_BASE_ADDR, F17H_M30H_SCR_LIMIT_ADDR};
>
> Why not make this part of struct amd64_umc so that you can access them
> through pvt->umc?
>

We have a struct amd64_umc per channel, so putting these fixed values there
seemed redundant.

Would you mind if we put this in struct amd64_family_type? We can then set the
values per family/model group like we do with the max_mcs.

Thanks,
Yazen
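As a rough sketch of the "per family/model group" idea, the selection could
look like the userspace C below. The offsets and the family/model condition
are taken from the patch under discussion; struct scrub_regs and
scrub_regs_for() are hypothetical names used only for this example and are
not part of the driver.

```c
#include <stdint.h>

/* Register offsets as defined in the patch under discussion. */
#define F17H_SCR_BASE_ADDR		0x48
#define F17H_SCR_LIMIT_ADDR		0x4C
#define F17H_M30H_SCR_BASE_ADDR		0x40
#define F17H_M30H_SCR_LIMIT_ADDR	0x44

/* Hypothetical stand-in for scrub fields in a per-family descriptor. */
struct scrub_regs {
	uint32_t base;
	uint32_t limit;
};

/*
 * Hypothetical helper: default to the newer (Model 30h and later) offsets
 * and fall back to the older ones for Family 17h models below 30h and for
 * Family 18h, matching the condition used in the patch.
 */
static struct scrub_regs scrub_regs_for(unsigned int fam, unsigned int model)
{
	struct scrub_regs regs = {
		.base  = F17H_M30H_SCR_BASE_ADDR,
		.limit = F17H_M30H_SCR_LIMIT_ADDR,
	};

	if ((fam == 0x17 && model < 0x30) || fam == 0x18) {
		regs.base  = F17H_SCR_BASE_ADDR;
		regs.limit = F17H_SCR_LIMIT_ADDR;
	}

	return regs;
}
```

Keeping the offsets in a per-family descriptor, rather than in a file-scope
variable, would avoid mutable global state; whether that fits struct
amd64_family_type is the open question in this thread.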
Re: [PATCH] EDAC/AMD64: Update scrub register addresses for newer models
On Mon, Jan 18, 2021 at 04:30:58AM +0300, WGH wrote:
> On 16/01/2021 17:33, Yazen Ghannam wrote:
> > From: Yazen Ghannam
> >
> > The Family 17h scrubber registers moved to different offset starting
> > with Model 30h. The new register offsets are used for all currently
> > available models since then.
> >
> > Use the new register addresses as the defaults.
> >
> > Set the proper scrub register addresses during module init for older
> > models.
>
> So I tested the patch on my machine (AMD Ryzen 9 3900XT on ASRock B550
> Extreme4 motherboard, Linux 5.10.7).
>
> The /sys/devices/system/edac/mc/mc0/sdram_scrub_rate value seems to be stuck
> at 12284069 right after the boot, and does not change.
> Writes to the file do not report any errors.
>
> dmesg:
>
> [ 0.549451] EDAC MC: Ver: 3.0.0
> [ 0.817576] EDAC amd64: F17h_M70h detected (node 0).
> [ 0.818159] EDAC amd64: Node 0: DRAM ECC enabled.
> [ 0.818717] EDAC amd64: MCT channel count: 2
> [ 0.819324] EDAC MC0: Giving out device to module amd64_edac controller
> F17h_M70h: DEV :00:18.3 (INTERRUPT)
> [ 0.819909] EDAC MC: UMC0 chip selects:
> [ 0.819910] EDAC amd64: MC: 0: 16384MB 1: 16384MB
> [ 0.820488] EDAC amd64: MC: 2: 16384MB 3: 16384MB
> [ 0.821067] EDAC MC: UMC1 chip selects:
> [ 0.821067] EDAC amd64: MC: 0: 16384MB 1: 16384MB
> [ 0.821630] EDAC amd64: MC: 2: 16384MB 3: 16384MB
> [ 0.822187] EDAC amd64: using x16 syndromes.
> [ 0.822739] EDAC PCI0: Giving out device to module amd64_edac controller
> EDAC PCI controller: DEV :00:18.0 (POLLED)
> [ 0.823314] AMD64 EDAC driver v3.5.0
>

Thanks for testing. I'll try to find a similar system and check it out.

Thanks,
Yazen
[PATCH] EDAC/AMD64: Update scrub register addresses for newer models
From: Yazen Ghannam

The Family 17h scrubber registers moved to different offset starting
with Model 30h. The new register offsets are used for all currently
available models since then.

Use the new register addresses as the defaults.

Set the proper scrub register addresses during module init for older
models.

Reported-by: WGH
Signed-off-by: Yazen Ghannam
---
 drivers/edac/amd64_edac.c | 23 ++-
 drivers/edac/amd64_edac.h |  2 ++
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 9868f95a5622..b324b1589e5a 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -167,6 +167,10 @@ static inline int amd64_read_dct_pci_cfg(struct amd64_pvt *pvt, u8 dct,
  * other archs, we might not have access to the caches directly.
  */
 
+static struct {
+	u32 base, limit;
+} f17h_scrub_regs = {F17H_M30H_SCR_BASE_ADDR, F17H_M30H_SCR_LIMIT_ADDR};
+
 static inline void __f17h_set_scrubval(struct amd64_pvt *pvt, u32 scrubval)
 {
 	/*
@@ -176,10 +180,10 @@ static inline void __f17h_set_scrubval(struct amd64_pvt *pvt, u32 scrubval)
 	 */
 	if (scrubval >= 0x5 && scrubval <= 0x14) {
 		scrubval -= 0x5;
-		pci_write_bits32(pvt->F6, F17H_SCR_LIMIT_ADDR, scrubval, 0xF);
-		pci_write_bits32(pvt->F6, F17H_SCR_BASE_ADDR, 1, 0x1);
+		pci_write_bits32(pvt->F6, f17h_scrub_regs.limit, scrubval, 0xF);
+		pci_write_bits32(pvt->F6, f17h_scrub_regs.base, 1, 0x1);
 	} else {
-		pci_write_bits32(pvt->F6, F17H_SCR_BASE_ADDR, 0, 0x1);
+		pci_write_bits32(pvt->F6, f17h_scrub_regs.base, 0, 0x1);
 	}
 }
 /*
@@ -257,9 +261,9 @@ static int get_scrub_rate(struct mem_ctl_info *mci)
 	u32 scrubval = 0;
 
 	if (pvt->umc) {
-		amd64_read_pci_cfg(pvt->F6, F17H_SCR_BASE_ADDR, &scrubval);
+		amd64_read_pci_cfg(pvt->F6, f17h_scrub_regs.base, &scrubval);
 		if (scrubval & BIT(0)) {
-			amd64_read_pci_cfg(pvt->F6, F17H_SCR_LIMIT_ADDR, &scrubval);
+			amd64_read_pci_cfg(pvt->F6, f17h_scrub_regs.limit, &scrubval);
 			scrubval &= 0xF;
 			scrubval += 0x5;
 		} else {
@@ -3568,6 +3572,14 @@ f17h_determine_edac_ctl_cap(struct mem_ctl_info *mci, struct amd64_pvt *pvt)
 	}
 }
 
+static void f17h_set_scrub_regs(struct amd64_pvt *pvt)
+{
+	if ((pvt->fam == 0x17 && pvt->model < 0x30) || pvt->fam == 0x18) {
+		f17h_scrub_regs.base = F17H_SCR_BASE_ADDR;
+		f17h_scrub_regs.limit = F17H_SCR_LIMIT_ADDR;
+	}
+}
+
 static void setup_mci_misc_attrs(struct mem_ctl_info *mci)
 {
 	struct amd64_pvt *pvt = mci->pvt_info;
@@ -3577,6 +3589,7 @@ static void setup_mci_misc_attrs(struct mem_ctl_info *mci)
 
 	if (pvt->umc) {
 		f17h_determine_edac_ctl_cap(mci, pvt);
+		f17h_set_scrub_regs(pvt);
 	} else {
 		if (pvt->nbcap & NBCAP_SECDED)
 			mci->edac_ctl_cap |= EDAC_FLAG_SECDED;
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 85aa820bc165..4606f72f4258 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -213,6 +213,8 @@
 #define F15H_M60H_SCRCTRL		0x1C8
 #define F17H_SCR_BASE_ADDR		0x48
 #define F17H_SCR_LIMIT_ADDR		0x4C
+#define F17H_M30H_SCR_BASE_ADDR		0x40
+#define F17H_M30H_SCR_LIMIT_ADDR	0x44
 
 /*
  * Function 3 - Misc Control
-- 
2.25.1
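The scrub value handling that the hunks above touch can be modelled in a few
lines of userspace C. This is a sketch of the encoding visible in
__f17h_set_scrubval() and get_scrub_rate() only: struct scrub_fields,
encode_scrubval() and decode_scrubval() are hypothetical names, and no PCI
config access is performed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two register fields the driver touches. */
struct scrub_fields {
	bool enable;	/* bit 0 of the scrub base register */
	uint8_t limit;	/* bits 3:0 of the scrub limit register */
};

/*
 * Write path: codes 0x5..0x14 are stored as (code - 0x5) and scrubbing is
 * enabled; any other code disables scrubbing. The real driver leaves the
 * limit register untouched when disabling; this model simply reports 0.
 */
static struct scrub_fields encode_scrubval(uint32_t scrubval)
{
	struct scrub_fields f = { .enable = false, .limit = 0 };

	if (scrubval >= 0x5 && scrubval <= 0x14) {
		f.enable = true;
		f.limit = (uint8_t)(scrubval - 0x5);
	}

	return f;
}

/* Read path: returns false when scrubbing is disabled. */
static bool decode_scrubval(struct scrub_fields f, uint32_t *scrubval)
{
	if (!f.enable)
		return false;

	*scrubval = (uint32_t)(f.limit & 0xF) + 0x5;
	return true;
}
```

The 16 valid codes fit the 4-bit limit field exactly, which is why both paths
apply the same 0x5 offset.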
[tip: x86/urgent] x86/cpu/amd: Set __max_die_per_package on AMD
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     76e2fc63ca40977af893b724b00cc2f8e9ce47a4
Gitweb:        https://git.kernel.org/tip/76e2fc63ca40977af893b724b00cc2f8e9ce47a4
Author:        Yazen Ghannam
AuthorDate:    Mon, 11 Jan 2021 11:04:29 +01:00
Committer:     Borislav Petkov
CommitterDate: Tue, 12 Jan 2021 12:21:01 +01:00

x86/cpu/amd: Set __max_die_per_package on AMD

Set the maximum DIE per package variable on AMD using the
NodesPerProcessor topology value. This will be used by RAPL, among
others, to determine the maximum number of DIEs on the system in order
to do per-DIE manipulations.

 [ bp: Productize into a proper patch. ]

Fixes: 028c221ed190 ("x86/CPU/AMD: Save AMD NodeId as cpu_die_id")
Reported-by: Johnathan Smithinovic
Reported-by: Rafael Kitover
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Tested-by: Johnathan Smithinovic
Tested-by: Rafael Kitover
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
Link: https://lkml.kernel.org/r/20210106112106.ge5...@zn.tnic
Link: https://lkml.kernel.org/r/2021001455.1194-1...@alien8.de
---
 arch/x86/kernel/cpu/amd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f8ca66f..347a956 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -542,12 +542,12 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
 		u32 ecx;
 
 		ecx = cpuid_ecx(0x8000001e);
-		nodes_per_socket = ((ecx >> 8) & 7) + 1;
+		__max_die_per_package = nodes_per_socket = ((ecx >> 8) & 7) + 1;
 	} else if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		nodes_per_socket = ((value >> 3) & 7) + 1;
+		__max_die_per_package = nodes_per_socket = ((value >> 3) & 7) + 1;
 	}
 
 	if (!boot_cpu_has(X86_FEATURE_AMD_SSBD) &&
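The two bit-field extractions in the hunk above can be sketched as plain
functions. The shifts and masks are copied from the diff; the function names
are hypothetical, and the raw register values are passed in as arguments
instead of being read with CPUID or RDMSR.

```c
#include <stdint.h>

/*
 * ECX of the CPUID leaf used in the diff: bits 10:8 hold the node count
 * minus one, so add one to get nodes per socket.
 */
static unsigned int nodes_from_cpuid_ecx(uint32_t ecx)
{
	return ((ecx >> 8) & 7) + 1;
}

/*
 * MSR_FAM10H_NODE_ID value: bits 5:3 hold the node count minus one on
 * systems that report it through the MSR instead of CPUID.
 */
static unsigned int nodes_from_nodeid_msr(uint64_t value)
{
	return (unsigned int)((value >> 3) & 7) + 1;
}
```

With the patch applied, the same result is stored in both nodes_per_socket and
__max_die_per_package, so consumers such as RAPL see a consistent die count.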
[PATCH] EDAC/amd64: Tone down messages about missing PCI IDs
From: Yazen Ghannam

Give these messages a debug severity as they are really only useful to
the module developers.

Also, drop the "(broken BIOS?)" phrase, since this can cause churn for
BIOS folks. The PCI IDs needed by the module, at least on modern systems,
are fixed in hardware.

Signed-off-by: Yazen Ghannam
---
 drivers/edac/amd64_edac.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index f7087b90..a3770ffee2ea 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2665,7 +2665,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
 	if (pvt->umc) {
 		pvt->F0 = pci_get_related_function(pvt->F3->vendor, pci_id1, pvt->F3);
 		if (!pvt->F0) {
-			amd64_err("F0 not found, device 0x%x (broken BIOS?)\n", pci_id1);
+			edac_dbg(1, "F0 not found, device 0x%x\n", pci_id1);
 			return -ENODEV;
 		}
 
@@ -2674,7 +2674,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
 			pci_dev_put(pvt->F0);
 			pvt->F0 = NULL;
 
-			amd64_err("F6 not found: device 0x%x (broken BIOS?)\n", pci_id2);
+			edac_dbg(1, "F6 not found: device 0x%x\n", pci_id2);
 			return -ENODEV;
 		}
 
@@ -2691,7 +2691,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
 	/* Reserve the ADDRESS MAP Device */
 	pvt->F1 = pci_get_related_function(pvt->F3->vendor, pci_id1, pvt->F3);
 	if (!pvt->F1) {
-		amd64_err("F1 not found: device 0x%x (broken BIOS?)\n", pci_id1);
+		edac_dbg(1, "F1 not found: device 0x%x\n", pci_id1);
 		return -ENODEV;
 	}
 
@@ -2701,7 +2701,7 @@ reserve_mc_sibling_devs(struct amd64_pvt *pvt, u16 pci_id1, u16 pci_id2)
 		pci_dev_put(pvt->F1);
 		pvt->F1 = NULL;
 
-		amd64_err("F2 not found: device 0x%x (broken BIOS?)\n", pci_id2);
+		edac_dbg(1, "F2 not found: device 0x%x\n", pci_id2);
 		return -ENODEV;
 	}
 
-- 
2.25.1
Re: [PATCH 2/2] EDAC/amd64: Merge error injection sysfs facilities
On Tue, Dec 15, 2020 at 12:05:17PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov
>
> Merge them into the main driver and put them inside an EDAC_DEBUG
> ifdeffery to simplify the driver and have all debugging/injection stuff
> behind a debug build-time switch.
>
> No functional changes.
>
> Signed-off-by: Borislav Petkov
> ---
>  drivers/edac/Kconfig          |   7 +-
>  drivers/edac/Makefile         |   6 +-
>  drivers/edac/amd64_edac.c     | 237 +-
>  drivers/edac/amd64_edac.h     |   8 --
>  drivers/edac/amd64_edac_inj.c | 235 -
>  5 files changed, 236 insertions(+), 257 deletions(-)
>  delete mode 100644 drivers/edac/amd64_edac_inj.c
>
> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
> index 7a47680d6f07..9c2e719cb86a 100644
> --- a/drivers/edac/Kconfig
> +++ b/drivers/edac/Kconfig
> @@ -81,10 +81,9 @@ config EDAC_AMD64
>  	  Support for error detection and correction of DRAM ECC errors on
>  	  the AMD64 families (>= K8) of memory controllers.
>
> -config EDAC_AMD64_ERROR_INJECTION
> -	bool "Sysfs HW Error injection facilities"
> -	depends on EDAC_AMD64
> -	help
> +	  When EDAC_DEBUG is enabled, hardware error injection facilities
> +	  through sysfs are available:
> +
>  	  Recent Opterons (Family 10h and later) provide for Memory Error

Can we say "Opterons (Family 10h to Family 15h)"? It may also apply to Family
16h, but I don't know if they were branded as Opterons.

The injection code in this module doesn't apply to Family 17h and later. Also,
Family 17h and later doesn't allow the OS direct access to the error injection
registers. They're locked down by security policy, etc.

>  	  Injection into the ECC detection circuits. The amd64_edac module
>  	  allows the operator/user to inject Uncorrectable and Correctable

...

> +
> +static umode_t inj_is_visible(struct kobject *kobj, struct attribute *attr, int idx)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct mem_ctl_info *mci = container_of(dev, struct mem_ctl_info, dev);
> +	struct amd64_pvt *pvt = mci->pvt_info;
> +
> +	if (pvt->fam < 0x10)

Related to the comment above, can this be changed to the following?

	if (pvt->fam < 0x10 || pvt->fam >= 0x17)

> +		return 0;
> +	return attr->mode;
> +}
> +

Everything else looks good to me.

Reviewed-by: Yazen Ghannam

Thanks,
Yazen
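To make the suggested range explicit, here is a small userspace sketch of the
visibility test. inj_attrs_visible() is a hypothetical helper that takes only
the family number; the real callback returns attr->mode or 0 and reads the
family from struct amd64_pvt.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: the injection attributes are shown only for
 * Family 10h through Family 16h, i.e. hidden below 0x10 and from 0x17 on,
 * which is the condition proposed in the review comment.
 */
static bool inj_attrs_visible(unsigned int fam)
{
	if (fam < 0x10 || fam >= 0x17)
		return false;

	return true;
}
```

Hiding the attributes on Family 17h and later matches the note above that the
injection registers are not directly accessible to the OS on those parts.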
Re: [PATCH 1/2] EDAC/amd64: Merge sysfs debugging attributes setup code
On Tue, Dec 15, 2020 at 12:05:16PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov
>
> There's no need for them to be in a separate file so merge them into the
> main driver compilation unit like the other EDAC drivers do.
>
> Drop now-unneeded function export, make the function static and shorten
> static function names.
>
> No functional changes.
>
> Signed-off-by: Borislav Petkov

Reviewed-by: Yazen Ghannam

Thanks,
Yazen
[tip: x86/cpu] EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     8de0c9917cc1297bc5543b61992d5bdee4ce621a
Gitweb:        https://git.kernel.org/tip/8de0c9917cc1297bc5543b61992d5bdee4ce621a
Author:        Yazen Ghannam
AuthorDate:    Mon, 09 Nov 2020 21:06:58
Committer:     Borislav Petkov
CommitterDate: Thu, 19 Nov 2020 11:43:21 +01:00

EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact.

But the NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20201109210659.754018-4-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 85095e3..5dd905a 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1003,7 +1003,7 @@ static void decode_smca_error(struct mce *m)
 		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
 	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-		decode_dram_ecc(cpu_to_node(m->extcpu), m);
+		decode_dram_ecc(topology_die_id(m->extcpu), m);
 }
 
 static inline void amd_decode_err_code(u16 ec)
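The mismatch described in the commit message can be illustrated with a toy
table. Everything here is hypothetical: the 4-CPU, 2-die layout, the single
NUMA node (as might result from an interleaving mode), and the helper names.
It only shows why the two IDs can diverge and which one the decoder needs.

```c
/* Hypothetical per-CPU topology record. */
struct cpu_topo {
	int numa_node;	/* logical node, may change with firmware settings */
	int die_id;	/* physical die (AMD NodeId), fixed by hardware */
};

/*
 * Assumed example system: two dies, but memory interleaving makes firmware
 * expose a single NUMA node, so every CPU reports numa_node 0.
 */
static const struct cpu_topo cpus[4] = {
	{ .numa_node = 0, .die_id = 0 },
	{ .numa_node = 0, .die_id = 0 },
	{ .numa_node = 0, .die_id = 1 },
	{ .numa_node = 0, .die_id = 1 },
};

/* What the old code passed to the decoder (analogous to cpu_to_node()). */
static int old_node_id(int cpu)
{
	return cpus[cpu].numa_node;
}

/* What the patch passes instead (analogous to topology_die_id()). */
static int new_node_id(int cpu)
{
	return cpus[cpu].die_id;
}
```

In this assumed layout an error reported by CPU 2 would be decoded against
die 0 with the old lookup and against die 1 with the new one.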
[tip: x86/cpu] x86/CPU/AMD: Remove amd_get_nb_id()
The following commit has been merged into the x86/cpu branch of tip: Commit-ID: db970bd231c2264a062e0de4dcf4ead5e6669e7a Gitweb: https://git.kernel.org/tip/db970bd231c2264a062e0de4dcf4ead5e6669e7a Author:Yazen Ghannam AuthorDate:Mon, 09 Nov 2020 21:06:57 Committer: Borislav Petkov CommitterDate: Thu, 19 Nov 2020 11:43:17 +01:00 x86/CPU/AMD: Remove amd_get_nb_id() The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in struct cpuinfo_x86.cpu_die_id. Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and remove the function. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/20201109210659.754018-3-yazen.ghan...@amd.com --- arch/x86/events/amd/core.c | 2 +- arch/x86/include/asm/processor.h | 2 -- arch/x86/kernel/amd_nb.c | 4 ++-- arch/x86/kernel/cpu/amd.c| 6 -- arch/x86/kernel/cpu/cacheinfo.c | 2 +- arch/x86/kernel/cpu/mce/amd.c| 4 ++-- arch/x86/kernel/cpu/mce/inject.c | 4 ++-- drivers/edac/amd64_edac.c| 4 ++-- drivers/edac/mce_amd.c | 2 +- 9 files changed, 11 insertions(+), 19 deletions(-) diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index 39eb276..2c1791c 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -538,7 +538,7 @@ static void amd_pmu_cpu_starting(int cpu) if (!x86_pmu.amd_nb_constraints) return; - nb_id = amd_get_nb_id(cpu); + nb_id = topology_die_id(cpu); WARN_ON_ONCE(nb_id == BAD_APICID); for_each_online_cpu(i) { diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 82a08b5..c20a52b 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -813,10 +813,8 @@ extern int set_tsc_mode(unsigned int val); DECLARE_PER_CPU(u64, msr_misc_features_shadow); #ifdef CONFIG_CPU_SUP_AMD -extern u16 amd_get_nb_id(int cpu); extern u32 amd_get_nodes_per_socket(void); #else -static inline u16 
amd_get_nb_id(int cpu) { return 0; } static inline u32 amd_get_nodes_per_socket(void) { return 0; } #endif diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c index 18f6b7c..b439695 100644 --- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -384,7 +384,7 @@ struct resource *amd_get_mmconfig_range(struct resource *res) int amd_get_subcaches(int cpu) { - struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link; + struct pci_dev *link = node_to_amd_nb(topology_die_id(cpu))->link; unsigned int mask; if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING)) @@ -398,7 +398,7 @@ int amd_get_subcaches(int cpu) int amd_set_subcaches(int cpu, unsigned long mask) { static unsigned int reset, ban; - struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu)); + struct amd_northbridge *nb = node_to_amd_nb(topology_die_id(cpu)); unsigned int reg; int cuid; diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 2f1fbd8..1f71c76 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -424,12 +424,6 @@ clear_ppin: clear_cpu_cap(c, X86_FEATURE_AMD_PPIN); } -u16 amd_get_nb_id(int cpu) -{ - return per_cpu(cpu_llc_id, cpu); -} -EXPORT_SYMBOL_GPL(amd_get_nb_id); - u32 amd_get_nodes_per_socket(void) { return nodes_per_socket; diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index f9ac682..3ca9be4 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -580,7 +580,7 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index) if (index < 3) return; - node = amd_get_nb_id(smp_processor_id()); + node = topology_die_id(smp_processor_id()); this_leaf->nb = node_to_amd_nb(node); if (this_leaf->nb && !this_leaf->nb->l3_cache.indices) amd_calc_l3_indices(this_leaf->nb); diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 0c6b02d..e486f96 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1341,7 
+1341,7 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu, return -ENODEV; if (is_shared_bank(bank)) { - nb = node_to_amd_nb(amd_get_nb_id(cpu)); + nb = node_to_amd_nb(topology_die_id(cpu)); /* threshold descriptor already initialized on this node? */ if (nb && nb->bank4) { @@ -1445,7 +1445,7 @@ static void threshold_remove_bank(struct threshold_bank *bank) * The last CPU on this node using the shared bank is going * away, remove that
[tip: x86/cpu] x86/topology: Set cpu_die_id only if DIE_TYPE found
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     cb09a379724d299c603a7a79f444f52a9a75b8d2
Gitweb:        https://git.kernel.org/tip/cb09a379724d299c603a7a79f444f52a9a75b8d2
Author:        Yazen Ghannam
AuthorDate:    Mon, 09 Nov 2020 21:06:59
Committer:     Borislav Petkov
CommitterDate: Thu, 19 Nov 2020 11:43:25 +01:00

x86/topology: Set cpu_die_id only if DIE_TYPE found

CPUID Leaf 0x1F defines a DIE_TYPE level (nb: ECX[8:15] level type == 0x5),
but CPUID Leaf 0xB does not. However, detect_extended_topology() will
set struct cpuinfo_x86.cpu_die_id regardless of whether a valid Die ID
was found.

Only set cpu_die_id if a DIE_TYPE level is found. CPU topology code may
use another value for cpu_die_id, e.g. the AMD NodeId on AMD-based
systems. Code ordering should be maintained so that the CPUID Leaf 0x1F
Die ID value will take precedence on systems that may use another value.

Suggested-by: Borislav Petkov
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20201109210659.754018-5-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/topology.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index d3a0791..1068002 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -96,6 +96,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 	unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
 	unsigned int core_select_mask, core_level_siblings;
 	unsigned int die_select_mask, die_level_siblings;
+	bool die_level_present = false;
 	int leaf;
 
 	leaf = detect_extended_topology_leaf(c);
@@ -126,6 +127,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 			die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
 		}
 		if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
+			die_level_present = true;
 			die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
 			die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
 		}
@@ -139,8 +141,12 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 
 	c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
 				ht_mask_width) & core_select_mask;
-	c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
-				core_plus_mask_width) & die_select_mask;
+
+	if (die_level_present) {
+		c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
+					core_plus_mask_width) & die_select_mask;
+	}
+
 	c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
 				die_plus_mask_width);
 	/*
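The guard added by this patch can be sketched outside the kernel as follows.
The level-type encoding (ECX bits 15:8, DIE_TYPE == 5) follows the commit
message; maybe_set_die_id(), the ecx_levels array and the candidate argument
are hypothetical stand-ins for the CPUID sub-leaf loop and the APIC-ID-derived
value in detect_extended_topology().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DIE_TYPE		5
#define LEAFB_SUBTYPE(ecx)	(((ecx) >> 8) & 0xff)

/*
 * Hypothetical helper: ecx_levels holds the ECX value read for each
 * enumerated sub-leaf. cpu_die_id is overwritten with candidate only when
 * one of the levels is a DIE_TYPE level; otherwise an earlier value (for
 * example an AMD NodeId) is preserved.
 */
static void maybe_set_die_id(unsigned int *cpu_die_id,
			     const uint32_t *ecx_levels, size_t nr_levels,
			     unsigned int candidate)
{
	bool die_level_present = false;
	size_t i;

	for (i = 0; i < nr_levels; i++) {
		if (LEAFB_SUBTYPE(ecx_levels[i]) == DIE_TYPE)
			die_level_present = true;
	}

	if (die_level_present)
		*cpu_die_id = candidate;
}
```

The point of the flag is ordering: a leaf that reports no die level must not
clobber a die ID that another part of the topology code already filled in.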
[tip: x86/cpu] x86/CPU/AMD: Save AMD NodeId as cpu_die_id
The following commit has been merged into the x86/cpu branch of tip: Commit-ID: 028c221ed1904af9ac3c5162ee98f48966de6b3d Gitweb: https://git.kernel.org/tip/028c221ed1904af9ac3c5162ee98f48966de6b3d Author:Yazen Ghannam AuthorDate:Mon, 09 Nov 2020 21:06:56 Committer: Borislav Petkov CommitterDate: Thu, 19 Nov 2020 11:43:13 +01:00 x86/CPU/AMD: Save AMD NodeId as cpu_die_id AMD systems provide a "NodeId" value that represents a global ID indicating to which "Node" a logical CPU belongs. The "Node" is a physical structure equivalent to a Die, and it should not be confused with logical structures like NUMA nodes. Logical nodes can be adjusted based on firmware or other settings whereas the physical nodes/dies are fixed based on hardware topology. The NodeId value can be used when a physical ID is needed by software. Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value from CPUID or MSR as appropriate. Default to phys_proc_id otherwise. Do so for both AMD and Hygon systems. Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no longer needed. Update the x86 topology documentation. Suggested-by: Borislav Petkov Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/20201109210659.754018-2-yazen.ghan...@amd.com --- Documentation/x86/topology.rst | 9 + arch/x86/include/asm/cacheinfo.h | 4 ++-- arch/x86/kernel/cpu/amd.c| 11 +-- arch/x86/kernel/cpu/cacheinfo.c | 6 +++--- arch/x86/kernel/cpu/hygon.c | 11 +-- 5 files changed, 24 insertions(+), 17 deletions(-) diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst index e297399..7f58010 100644 --- a/Documentation/x86/topology.rst +++ b/Documentation/x86/topology.rst @@ -41,6 +41,8 @@ Package Packages contain a number of cores plus shared resources, e.g. DRAM controller, shared caches etc. +Modern systems may also use the term 'Die' for package. + AMD nomenclature for package is 'Node'. 
Package-related topology information in the kernel: @@ -53,11 +55,18 @@ Package-related topology information in the kernel: The number of dies in a package. This information is retrieved via CPUID. + - cpuinfo_x86.cpu_die_id: + +The physical ID of the die. This information is retrieved via CPUID. + - cpuinfo_x86.phys_proc_id: The physical ID of the package. This information is retrieved via CPUID and deduced from the APIC IDs of the cores in the package. +Modern systems use this value for the socket. There may be multiple +packages within a socket. This value may differ from cpu_die_id. + - cpuinfo_x86.logical_proc_id: The logical ID of the package. As we do not trust BIOSes to enumerate the diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h index 86b63c7..86b2e0d 100644 --- a/arch/x86/include/asm/cacheinfo.h +++ b/arch/x86/include/asm/cacheinfo.h @@ -2,7 +2,7 @@ #ifndef _ASM_X86_CACHEINFO_H #define _ASM_X86_CACHEINFO_H -void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id); -void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id); +void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu); +void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu); #endif /* _ASM_X86_CACHEINFO_H */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 6062ce5..2f1fbd8 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -330,7 +330,6 @@ static void legacy_fixup_core_id(struct cpuinfo_x86 *c) */ static void amd_get_topology(struct cpuinfo_x86 *c) { - u8 node_id; int cpu = smp_processor_id(); /* get information required for multi-node processors */ @@ -340,7 +339,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c) cpuid(0x801e, &eax, &ebx, &ecx, &edx); - node_id = ecx & 0xff; + c->cpu_die_id = ecx & 0xff; if (c->x86 == 0x15) c->cu_id = ebx & 0xff; @@ -360,15 +359,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c) if (!err) c->x86_coreid_bits = 
get_count_order(c->x86_max_cores); - cacheinfo_amd_init_llc_id(c, cpu, node_id); + cacheinfo_amd_init_llc_id(c, cpu); } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { u64 value; rdmsrl(MSR_FAM10H_NODE_ID, value); - node_id = value & 7; + c->cpu_die_id = value & 7; - per_cpu(cpu_llc_id, cpu) = node_id; + per_cpu(cpu_llc_id, cpu) = c->cpu_die_id; } else return; @@ -393,7 +392,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) /* Convert the initial APIC ID into t
[PATCH 3/4] EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
From: Yazen Ghannam

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact.

But the NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam
---
 drivers/edac/mce_amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 85095e3902ec..5dd905a3f30c 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1003,7 +1003,7 @@ static void decode_smca_error(struct mce *m)
 		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 
 	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
-		decode_dram_ecc(cpu_to_node(m->extcpu), m);
+		decode_dram_ecc(topology_die_id(m->extcpu), m);
 }
 
 static inline void amd_decode_err_code(u16 ec)
-- 
2.25.1
[PATCH 4/4] x86/topology: Set cpu_die_id only if DIE_TYPE found
From: Yazen Ghannam

CPUID Leaf 0x1F defines a DIE_TYPE level, but CPUID Leaf 0xB does not.
However, detect_extended_topology() will set struct
cpuinfo_x86.cpu_die_id regardless of whether a valid Die ID was found.

Only set cpu_die_id if a DIE_TYPE level is found. CPU topology code may
use another value for cpu_die_id, e.g. the AMD NodeId on AMD-based
systems. Code ordering should be maintained so that the CPUID Leaf 0x1F
Die ID value will take precedence on systems that may use another value.

Suggested-by: Borislav Petkov
Signed-off-by: Yazen Ghannam
---
 arch/x86/kernel/cpu/topology.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index d3a0791bc052..1068002c8532 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -96,6 +96,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 	unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
 	unsigned int core_select_mask, core_level_siblings;
 	unsigned int die_select_mask, die_level_siblings;
+	bool die_level_present = false;
 	int leaf;
 
 	leaf = detect_extended_topology_leaf(c);
@@ -126,6 +127,7 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 			die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
 		}
 		if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
+			die_level_present = true;
 			die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
 			die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
 		}
@@ -139,8 +141,12 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
 
 	c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
 				ht_mask_width) & core_select_mask;
-	c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
-				core_plus_mask_width) & die_select_mask;
+
+	if (die_level_present) {
+		c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
+					core_plus_mask_width) & die_select_mask;
+	}
+
 	c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
 				die_plus_mask_width);
 	/*
-- 
2.25.1
[PATCH 2/4] x86/CPU/AMD: Remove amd_get_nb_id()
From: Yazen Ghannam

The Last Level Cache ID is returned by amd_get_nb_id(). In practice,
this value is the same as the AMD NodeId for callers of this function.
The NodeId is saved in struct cpuinfo_x86.cpu_die_id.

Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and
remove the function.

Signed-off-by: Yazen Ghannam
---
 arch/x86/events/amd/core.c       | 2 +-
 arch/x86/include/asm/processor.h | 2 --
 arch/x86/kernel/amd_nb.c         | 4 ++--
 arch/x86/kernel/cpu/amd.c        | 6 --
 arch/x86/kernel/cpu/cacheinfo.c  | 2 +-
 arch/x86/kernel/cpu/mce/amd.c    | 4 ++--
 arch/x86/kernel/cpu/mce/inject.c | 4 ++--
 drivers/edac/amd64_edac.c        | 4 ++--
 drivers/edac/mce_amd.c           | 2 +-
 9 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 39eb276d0277..2c1791c4a518 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -538,7 +538,7 @@ static void amd_pmu_cpu_starting(int cpu)
 	if (!x86_pmu.amd_nb_constraints)
 		return;
 
-	nb_id = amd_get_nb_id(cpu);
+	nb_id = topology_die_id(cpu);
 	WARN_ON_ONCE(nb_id == BAD_APICID);
 
 	for_each_online_cpu(i) {
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 60dbcdcb833f..a411466a6e74 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -815,10 +815,8 @@ extern int set_tsc_mode(unsigned int val);
 DECLARE_PER_CPU(u64, msr_misc_features_shadow);
 
 #ifdef CONFIG_CPU_SUP_AMD
-extern u16 amd_get_nb_id(int cpu);
 extern u32 amd_get_nodes_per_socket(void);
 #else
-static inline u16 amd_get_nb_id(int cpu)		{ return 0; }
 static inline u32 amd_get_nodes_per_socket(void)	{ return 0; }
 #endif
 
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 18f6b7c4bd79..b4396952c9a6 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -384,7 +384,7 @@ struct resource *amd_get_mmconfig_range(struct resource *res)
 
 int amd_get_subcaches(int cpu)
 {
-	struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link;
+	struct pci_dev *link = node_to_amd_nb(topology_die_id(cpu))->link;
 	unsigned int mask;
 
 	if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
@@ -398,7 +398,7 @@ int amd_get_subcaches(int cpu)
 int amd_set_subcaches(int cpu, unsigned long mask)
 {
 	static unsigned int reset, ban;
-	struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu));
+	struct amd_northbridge *nb = node_to_amd_nb(topology_die_id(cpu));
 	unsigned int reg;
 	int cuid;
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 2f1fbd8150af..1f71c7616917 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -424,12 +424,6 @@ static void amd_detect_ppin(struct cpuinfo_x86 *c)
 		clear_cpu_cap(c, X86_FEATURE_AMD_PPIN);
 }
 
-u16 amd_get_nb_id(int cpu)
-{
-	return per_cpu(cpu_llc_id, cpu);
-}
-EXPORT_SYMBOL_GPL(amd_get_nb_id);
-
 u32 amd_get_nodes_per_socket(void)
 {
 	return nodes_per_socket;
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index f9ac682e75e7..3ca9be482a9e 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -580,7 +580,7 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index)
 	if (index < 3)
 		return;
 
-	node = amd_get_nb_id(smp_processor_id());
+	node = topology_die_id(smp_processor_id());
 	this_leaf->nb = node_to_amd_nb(node);
 	if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)
 		amd_calc_l3_indices(this_leaf->nb);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 0c6b02dd744c..e486f96b3cb3 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1341,7 +1341,7 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu,
 		return -ENODEV;
 
 	if (is_shared_bank(bank)) {
-		nb = node_to_amd_nb(amd_get_nb_id(cpu));
+		nb = node_to_amd_nb(topology_die_id(cpu));
 
 		/* threshold descriptor already initialized on this node? */
 		if (nb && nb->bank4) {
@@ -1445,7 +1445,7 @@ static void threshold_remove_bank(struct threshold_bank *bank)
 		 * The last CPU on this node using the shared bank is going
 		 * away, remove that bank now.
 		 */
-		nb = node_to_amd_nb(amd_get_nb_id(smp_processor_id()));
+		nb = node_to_amd_nb(topology_die_id(smp_processor_id()));
 		nb->bank4 = NULL;
 	}
 
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 3a44346f2276..7b360731fc2d 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86
[PATCH 1/4] x86/CPU/AMD: Save AMD NodeId as cpu_die_id
From: Yazen Ghannam

AMD systems provide a "NodeId" value that represents a global ID
indicating to which "Node" a logical CPU belongs. The "Node" is a
physical structure equivalent to a Die, and it should not be confused
with logical structures like NUMA nodes. Logical nodes can be adjusted
based on firmware or other settings whereas the physical nodes/dies are
fixed based on hardware topology.

The NodeId value can be used when a physical ID is needed by software.

Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value
from CPUID or MSR as appropriate. Default to phys_proc_id otherwise.
Do so for both AMD and Hygon systems.

Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no
longer needed.

Update the x86 topology documentation.

[ Use cpu_die_id. ]

Suggested-by: Borislav Petkov
Signed-off-by: Yazen Ghannam
---
 Documentation/x86/topology.rst   |  9 +
 arch/x86/include/asm/cacheinfo.h |  4 ++--
 arch/x86/kernel/cpu/amd.c        | 11 +--
 arch/x86/kernel/cpu/cacheinfo.c  |  6 +++---
 arch/x86/kernel/cpu/hygon.c      | 11 +--
 5 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/topology.rst b/Documentation/x86/topology.rst
index e29739904e37..7f58010ea86a 100644
--- a/Documentation/x86/topology.rst
+++ b/Documentation/x86/topology.rst
@@ -41,6 +41,8 @@ Package
 Packages contain a number of cores plus shared resources, e.g. DRAM
 controller, shared caches etc.
 
+Modern systems may also use the term 'Die' for package.
+
 AMD nomenclature for package is 'Node'.
 
 Package-related topology information in the kernel:
@@ -53,11 +55,18 @@ Package-related topology information in the kernel:
 
     The number of dies in a package. This information is retrieved via CPUID.
 
+  - cpuinfo_x86.cpu_die_id:
+
+    The physical ID of the die. This information is retrieved via CPUID.
+
   - cpuinfo_x86.phys_proc_id:
 
     The physical ID of the package. This information is retrieved via CPUID
     and deduced from the APIC IDs of the cores in the package.
 
+    Modern systems use this value for the socket. There may be multiple
+    packages within a socket. This value may differ from cpu_die_id.
+
   - cpuinfo_x86.logical_proc_id:
 
     The logical ID of the package. As we do not trust BIOSes to enumerate the
diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index 86b63c7feab7..86b2e0dcc4bf 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_CACHEINFO_H
 #define _ASM_X86_CACHEINFO_H
 
-void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
-void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id);
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
+void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);
 
 #endif /* _ASM_X86_CACHEINFO_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 6062ce586b95..2f1fbd8150af 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -330,7 +330,6 @@ static void legacy_fixup_core_id(struct cpuinfo_x86 *c)
  */
 static void amd_get_topology(struct cpuinfo_x86 *c)
 {
-	u8 node_id;
 	int cpu = smp_processor_id();
 
 	/* get information required for multi-node processors */
@@ -340,7 +339,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 
 		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
 
-		node_id  = ecx & 0xff;
+		c->cpu_die_id  = ecx & 0xff;
 
 		if (c->x86 == 0x15)
 			c->cu_id = ebx & 0xff;
@@ -360,15 +359,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 		if (!err)
 			c->x86_coreid_bits = get_count_order(c->x86_max_cores);
 
-		cacheinfo_amd_init_llc_id(c, cpu, node_id);
+		cacheinfo_amd_init_llc_id(c, cpu);
 
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;
 
 		rdmsrl(MSR_FAM10H_NODE_ID, value);
-		node_id = value & 7;
+		c->cpu_die_id = value & 7;
 
-		per_cpu(cpu_llc_id, cpu) = node_id;
+		per_cpu(cpu_llc_id, cpu) = c->cpu_die_id;
 	} else
 		return;
 
@@ -393,7 +392,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c)
 	/* Convert the initial APIC ID into the socket ID */
 	c->phys_proc_id = c->initial_apicid >> bits;
 	/* use socket ID also for last level cache */
-	per_cpu(cpu_llc_id, cpu) = c->phys_proc_id;
+	per_cpu(cpu_llc_id, cpu) = c->cpu_die_id = c->phys_proc_id;
 }
 
 static void amd_detect_ppin(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 57074cf3ad7c..f9ac682e75e7 100644
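The selection order described in the commit message above (CPUID value, else MSR value, else phys_proc_id) can be sketched in userspace C. This is only an illustration under stated assumptions: struct cpu_ids, enum nodeid_source and pick_die_id() are hypothetical names, not kernel symbols, and the raw register value is passed in rather than read from hardware. The masks (low 8 bits of ECX, low 3 bits of the MSR) are the ones used in the diff.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few struct cpuinfo_x86 fields used here. */
struct cpu_ids {
	uint16_t phys_proc_id;	/* socket ID derived from the APIC ID */
	uint16_t cpu_die_id;	/* receives the AMD NodeId */
};

/* Hypothetical tag for where the NodeId comes from on a given system. */
enum nodeid_source { NODEID_FROM_CPUID, NODEID_FROM_MSR, NODEID_NONE };

/*
 * Store the die ID following the order in the commit message:
 *  - CPUID: NodeId is ECX[7:0] of the leaf read in amd_get_topology()
 *  - MSR:   NodeId is bits [2:0] of the Fam10h NodeId MSR
 *  - else:  default to phys_proc_id
 * 'raw' is the already-read ECX or MSR value (unused for NODEID_NONE).
 */
static uint16_t pick_die_id(struct cpu_ids *c, enum nodeid_source src,
			    uint64_t raw)
{
	switch (src) {
	case NODEID_FROM_CPUID:
		c->cpu_die_id = raw & 0xff;
		break;
	case NODEID_FROM_MSR:
		c->cpu_die_id = raw & 7;
		break;
	default:
		c->cpu_die_id = c->phys_proc_id;
		break;
	}
	return c->cpu_die_id;
}
```

In the actual patch the default is applied first in amd_detect_cmp() and later overwritten by amd_get_topology(); the sketch folds both steps into one function for brevity.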
[PATCH 0/4] Set and use cpu_die_id on AMD-based systems
From: Yazen Ghannam

AMD-based systems currently use a "NodeId" when referencing a
software-visible hardware structure. This may be referred to as a "Die"
in x86 documentation, "Node" in some AMD documentation, and "Package" in
Linux documentation.

Recently a cpu_die_id value was added to struct cpuinfo_x86. This value
can be used on AMD-based systems rather than using an AMD-specific value
throughout the kernel.

This set is based on patches 1-3 from the following set.

https://lkml.kernel.org/r/20200903200144.310991-1-yazen.ghan...@amd.com

Thanks,
Yazen

Yazen Ghannam (4):
  x86/CPU/AMD: Save AMD NodeId as cpu_die_id
  x86/CPU/AMD: Remove amd_get_nb_id()
  EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
  x86/topology: Set cpu_die_id only if DIE_TYPE found

 Documentation/x86/topology.rst   |  9 +
 arch/x86/events/amd/core.c       |  2 +-
 arch/x86/include/asm/cacheinfo.h |  4 ++--
 arch/x86/include/asm/processor.h |  2 --
 arch/x86/kernel/amd_nb.c         |  4 ++--
 arch/x86/kernel/cpu/amd.c        | 17 +
 arch/x86/kernel/cpu/cacheinfo.c  |  8
 arch/x86/kernel/cpu/hygon.c      | 11 +--
 arch/x86/kernel/cpu/mce/amd.c    |  4 ++--
 arch/x86/kernel/cpu/mce/inject.c |  4 ++--
 arch/x86/kernel/cpu/topology.c   | 10 --
 drivers/edac/amd64_edac.c        |  4 ++--
 drivers/edac/mce_amd.c           |  4 ++--
 13 files changed, 44 insertions(+), 39 deletions(-)

-- 
2.25.1
[PATCH] EDAC/amd64: Set proper family type for Family 19h Models 20h-2Fh
From: Yazen Ghannam

AMD Family 19h Models 20h-2Fh use the same PCI IDs as Family 17h Models
70h-7Fh. The same family ops and number of channels also apply.

Use the Family17h Model 70h family_type and ops for Family 19h Models
20h-2Fh. Update the controller name to match the system.

Signed-off-by: Yazen Ghannam
---
 drivers/edac/amd64_edac.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index fcc08bbf6945..1362274d840b 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3385,6 +3385,12 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt)
 		break;
 
 	case 0x19:
+		if (pvt->model >= 0x20 && pvt->model <= 0x2f) {
+			fam_type = &family_types[F17_M70H_CPUS];
+			pvt->ops = &family_types[F17_M70H_CPUS].ops;
+			fam_type->ctl_name = "F19h_M20h";
+			break;
+		}
 		fam_type	= &family_types[F19_CPUS];
 		pvt->ops	= &family_types[F19_CPUS].ops;
 		family_types[F19_CPUS].ctl_name = "F19h";
-- 
2.25.1
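The model-range selection in this patch can be tried outside the kernel with a trimmed-down table. The sketch below is not the driver code: struct family_type here carries only a name, the initial "F17h_M70h" string is an assumed placeholder, and pick_family_19h() is a hypothetical helper standing in for the case 0x19 branch of per_family_init().

```c
/* Hypothetical, trimmed-down family table; index names mirror the patch. */
enum fam_index { F17_M70H_CPUS, F19_CPUS, NUM_FAMILIES };

struct family_type {
	const char *ctl_name;
};

static struct family_type family_types[NUM_FAMILIES] = {
	[F17_M70H_CPUS]	= { "F17h_M70h" },	/* placeholder name */
	[F19_CPUS]	= { "F19h" },
};

/*
 * Family 19h Models 20h-2Fh reuse the Family 17h Model 70h entry and
 * only rename the controller; every other Family 19h model uses the
 * generic Family 19h entry.
 */
static struct family_type *pick_family_19h(unsigned int model)
{
	struct family_type *fam_type;

	if (model >= 0x20 && model <= 0x2f) {
		fam_type = &family_types[F17_M70H_CPUS];
		fam_type->ctl_name = "F19h_M20h";
		return fam_type;
	}

	fam_type = &family_types[F19_CPUS];
	fam_type->ctl_name = "F19h";
	return fam_type;
}
```

As in the patch, the rename writes to the shared F17_M70H_CPUS table entry rather than to a copy, which is harmless as long as only one family is present in a running system.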
Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation
On Mon, Sep 28, 2020 at 08:14:07PM +0200, Borislav Petkov wrote:
> On Mon, Sep 28, 2020 at 10:53:50AM -0500, Yazen Ghannam wrote:
>
> > I agree that the translation code is implementation-specific and applies
> > only to DRAM ECC errors, so it makes sense to have it in amd64_edac. The
> > only issue is getting the address translation to earlier notifiers. I
> > think we can add a new one in amd64_edac to run before others. Maybe this
> > can be a new priority class like MCE_PRIO_PREPROCESS, or something like
> > that for notifiers that fixup the MCE data.
>
> Well, I'm not sure you need notifiers here - you wanna call
> mce_usable_address() and in it, it should do the address conversion
> calculation to give you a physical address which you can feed to
> memory_failure etc.
>
> Now, mce_usable_address() is core code and we can make core code call
> into a module but that is yucky. So *that* is your reason for keeping it
> where it is.
>

Okay, we'll keep the code where it is. I'll work on another set to call
the address translation with mce_usable_address().

> Looking at its size:
>
> $ readelf -s vmlinux | grep umc_normaddr_to
> 2864: 817d8ae5 168 FUNC LOCAL DEFAULT 1 umc_normaddr_to_[...]
> 91866: 81030e00 1127 FUNC GLOBAL DEFAULT 1 umc_normaddr_to_[...]
>
> that's something like ~1.3K and if you split it and do some
> experimenting, you might get it even slimmer. Not that ~1.3K is that
> huge for current standards but we should always aim at not bloating the
> fat guy our kernel already is.
>

Okay, I'll keep an eye on this and try to slim it down.

Thanks,
Yazen
Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation
On Mon, Sep 28, 2020 at 11:47:59AM +0200, Borislav Petkov wrote:
> On Fri, Sep 25, 2020 at 02:51:27PM -0500, Yazen Ghannam wrote:
>
> > The address translation needs to be done before the notifiers that need
> > it, and EDAC comes after all of them. There's also the case where the
> > EDAC interface isn't wanted, so amd64_edac will be unloaded.
>
> I'd be interested as to why. Because decoding addresses is amd64_edac
> *core* functionality. We can stick it in drivers/edac/mce_amd.c but I'd
> like to hear what those valid reasons are, not to use the driver which
> is supposed to do that anyway.
>

I don't have any clear reasons. I just get vague use cases sometimes
about not using EDAC and relying on other things. But it shouldn't hurt
to have the module load anyway. The EDAC messages can be suppressed, and
the sysfs interface can be ignored. So, after a bit more thought, this
doesn't seem like a good reason.

I agree that the translation code is implementation-specific and applies
only to DRAM ECC errors, so it makes sense to have it in amd64_edac. The
only issue is getting the address translation to earlier notifiers. I
think we can add a new one in amd64_edac to run before others. Maybe this
can be a new priority class like MCE_PRIO_PREPROCESS, or something like
that for notifiers that fixup the MCE data.

I can start by moving the address translation to amd64_edac and doing
the code cleanup.

Thanks,
Yazen
Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation
On Fri, Sep 25, 2020 at 09:22:31AM +0200, Borislav Petkov wrote:
> On Wed, Sep 23, 2020 at 11:25:10AM -0500, Yazen Ghannam wrote:
> > I don't remember the original reason, and I was recently asked about
> > this code living in a module. I did some looking after this ask, and I
> > found that we should be using this translation to get a proper value for
> > the memory error notifiers to use. So I think we still need to use this
> > function some way with the core code even if the EDAC interface isn't
> > used.
>
> You'd need to be more specific here, you want to bypass amd64_edac to
> decode errors? Judging by the current RAS activity coming from you guys,
> I'm thinking firmware. But then wouldn't the firmware do the decoding
> for us and then this function is not even needed?
>

The UC, NFIT, and CEC notifiers all operate on system physical
addresses. The address in the MCE record is checked by
mce_usable_address() to see if it can be used by the kernel, i.e. the
address is a system physical address. Right now, this check passes on
AMD systems if MCA_STATUS[AddrV] is set.

This works for memory errors on legacy AMD systems, since the NB MCA
bank logs a physical address for DRAM ECC errors. But this won't work on
newer systems, because the UMC MCA bank does not log a system physical
address for DRAM ECC errors. So the address provided by the hardware
will need to be translated to a physical address before the notifiers in
the MCE chain can use it.

We can add support to get the physical address from firmware in some
cases. But it looks to me that we'll still need to keep updating the
translation code in the kernel to cover some platform/user
configurations. So it makes sense to me to move the functionality into a
module to make it easier to update.

The address translation needs to be done before the notifiers that need
it, and EDAC comes after all of them. There's also the case where the
EDAC interface isn't wanted, so amd64_edac will be unloaded. But the
functionality in the other notifiers is still expected to be available.
So it's more than just decoding the error like we do now with
amd64_edac.

That's why I think the translation code can be in a separate module with
a notifier that runs before the others. This can do the translation once
then pass the result down to the CEC, UC, NFIT, and EDAC notifiers to
use as needed.

Thanks,
Yazen
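The ordering idea in this message (translate once in an early notifier, then let later notifiers consume the result) can be modelled with a small priority-sorted chain. This is a userspace sketch under assumptions, not the kernel notifier API: struct err_record, chain_register(), chain_call(), the priority values and FAKE_OFFSET are all hypothetical, and the "translation" is a placeholder addition rather than the real normalized-address algorithm. The one borrowed convention is that a higher priority runs first, as in the kernel's notifier chains.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error record: keeps the logged and the translated address. */
struct err_record {
	uint64_t addr;		/* address as logged by hardware */
	uint64_t sys_addr;	/* translated system physical address */
	int sys_addr_valid;
};

struct notifier {
	int priority;		/* higher priority runs first */
	void (*fn)(struct err_record *rec);
	struct notifier *next;
};

/* Hypothetical priorities: the preprocess step outranks every consumer. */
enum { PRIO_CONSUMER = 1, PRIO_PREPROCESS = 10 };

/* Placeholder offset standing in for the real address translation. */
#define FAKE_OFFSET 0x100000000ULL

/* Insert 'n' so that the list stays sorted by descending priority. */
static void chain_register(struct notifier **head, struct notifier *n)
{
	while (*head && (*head)->priority >= n->priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

/* Call every notifier in priority order with the same record. */
static void chain_call(struct notifier *head, struct err_record *rec)
{
	for (; head; head = head->next)
		head->fn(rec);
}

/* Early step: translate once, leave the original address untouched. */
static void translate_addr(struct err_record *rec)
{
	rec->sys_addr = rec->addr + FAKE_OFFSET;
	rec->sys_addr_valid = 1;
}

static uint64_t seen_by_consumer;

/* Later step: only uses the address if an earlier step translated it. */
static void consume_sys_addr(struct err_record *rec)
{
	seen_by_consumer = rec->sys_addr_valid ? rec->sys_addr : 0;
}
```

Registering consume_sys_addr() at PRIO_CONSUMER and translate_addr() at PRIO_PREPROCESS, in either order, leaves the translation at the head of the chain, so the consumer sees the translated value while rec->addr still holds what the hardware logged.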
Re: [PATCH v4] cper, apei, mce: Pass x86 CPER through the MCA handling chain
On Fri, Sep 25, 2020 at 09:54:06AM +0900, Punit Agrawal wrote:
> Borislav Petkov writes:
>
> > On Thu, Sep 24, 2020 at 12:23:27PM -0500, Smita Koralahalli Channabasappa
> > wrote:
> >> > Even though it's not defined in the UEFI spec, it doesn't mean a
> >> > structure definition cannot be created.
> >
> > Created for what? That structure better have a big fat comment above it,
> > what firmware generates its layout.
>
> Maybe I could've used a better choice of words - I meant to define a
> structure with meaningful member names to replace the *(ptr + i)
> accesses in the patch.
>
> The requirement for documenting the record layout doesn't change -
> whether using raw pointer arithmetic vs a structure definition.
>
> >> > After all, the patch is relying on some guarantee of the meaning of
> >> > the values and their ordering.
> >
> > AFAICT, this looks like an ad-hoc definition and the moment they change
> > it in some future revision, that struct of yours becomes invalid so we'd
> > need to add another one.
>
> If there's no spec backing the current layout, then it'll indeed be an
> ad-hoc definition of a structure in the kernel. But considering that
> it's part of firmware / OS interface for an important part of the RAS
> story I would hope that the code is based on a spec - having that
> reference included would help maintainability.
>
> Incompatible changes will indeed break the assumptions in the kernel and
> code will need to be updated - regardless of the choice of kernel
> implementation; pointer arithmetic, structure definition - ad-hoc or
> spec provided.
>
> Having versioning will allow running older kernels on newer hardware and
> vice versa - but I don't see why that is important only when using a
> structure based access.
>

There is no versioning option for the x86 context info structure in the
UEFI spec, so I don't think there'd be a clean way to include version
information.

The format of the data in the context info is not totally ad-hoc, and it
does follow the UEFI spec. The "Register Array" field is raw data. This
may follow one of the predefined formats in the UEFI spec like the "X64
Register State", etc. Or, in the case of MSR and Memory Mapped
Registers, this is a raw dump of the registers starting from the address
shown in the structure. The two values that can be changed are the
starting address and the array size. These two together provide a window
to the registers. The registers are fixed, so a single context info
structure should include a single contiguous range of registers.
Multiple context info structures can be provided to include registers
from different, non-contiguous ranges.

This patch is checking if an MSR context info structure lines up with
the MCAX register space used on Scalable MCA systems. This register
space is defined in the AMD Processor Programming Reference for various
products. This is considered a hardware feature extension, so the
existing register layout won't change though new registers may be added.
A layout change would require moving to another register space which is
what happened going from legacy MCA (starting at address 0x400) to MCAX
(starting at address 0xC0002000) registers.

The only two things firmware can change are the address where the info
starts and where the info ends. So the implementation-specific details
here are that currently the starting address is MCA_STATUS (in MCAX
space) for a bank and the remaining info includes the other MCA
registers for this bank.

So I think the kernel can be strict with this format, i.e. the two
variables match what we're looking for. This patch already has a check
on the starting address. It should also include a check that "Register
Array Size" is large enough to include all the registers we want to
extract. If the format doesn't match, then we fall back to a raw dump of
the data like we have today.

Or the kernel can be more flexible and try to find the window of
registers based on the starting address. I think this is really
open-ended though.

Does this sound reasonable?

Thanks,
Yazen
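The "strict" option described in this message (check the starting address, then check that "Register Array Size" covers the wanted registers) could look roughly like the sketch below. Everything beyond the 0xC0002000 base quoted in the message is an assumption labeled in the comments: the 16-MSR stride per bank and MCA_STATUS at offset 1 follow the layout I believe the kernel's SMCA MSR definitions use, the count of six wanted registers is illustrative, and ctx_matches_mcax() is a hypothetical helper, not the function from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCAX_BASE		0xC0002000u	/* MCAX start, from the message */
#define MCAX_REGS_PER_BANK	0x10u		/* assumption: 16 MSRs per bank */
#define MCAX_STATUS_OFFSET	0x1u		/* assumption: STATUS follows CTL */
#define MCAX_REGS_WANTED	6u		/* assumption: STATUS and 5 more */

/*
 * Strict check of an MSR context info window:
 *  1) the starting MSR address must be MCA_STATUS of some MCAX bank
 *  2) the register array must hold every register we want to extract
 * Each register in the array is a 64-bit value. On success the bank
 * number is returned through 'bank'. No upper bound on the bank number
 * is enforced in this sketch.
 */
static bool ctx_matches_mcax(uint32_t msr_addr, uint16_t reg_arr_size,
			     unsigned int *bank)
{
	uint32_t off;

	if (msr_addr < MCAX_BASE)
		return false;

	off = msr_addr - MCAX_BASE;
	if (off % MCAX_REGS_PER_BANK != MCAX_STATUS_OFFSET)
		return false;

	if (reg_arr_size < MCAX_REGS_WANTED * sizeof(uint64_t))
		return false;

	*bank = off / MCAX_REGS_PER_BANK;
	return true;
}
```

A caller would fall back to the existing raw dump whenever this returns false, which matches the fallback behavior described above.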
Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation
On Wed, Sep 23, 2020 at 10:20:39AM +0200, Borislav Petkov wrote:
> On Thu, Sep 03, 2020 at 08:01:44PM +0000, Yazen Ghannam wrote:
> > From: Muralidhara M K
> >
> > Add support for new memory interleaving modes used in current AMD systems.
> >
> > Check if the system is using a current Data Fabric version or a legacy
> > version as some bit and register definitions have changed.
> >
> > Tested on AMD reference platforms with the following memory interleaving
> > options.
> >
> > Naples
> > - None
> > - Channel
> > - Die
> > - Socket
> >
> > Rome (NPS = Nodes per Socket)
> > - None
> > - NPS0
> > - NPS1
> > - NPS2
> > - NPS4
> >
> > The fixes tag refers to the commit that allows amd64_edac_mod to load on
> > Rome systems.
>
> Err, why? This is adding new stuff to an address translation function.
> How does that fix amd64_edac loading on Rome?
>
> > The module may report incorrect system addresses on
> > Rome systems depending on the interleaving option used.
>
> That doesn't stop it from loading, sorry.
>

Okay, no problem.

> Now, before you guys do any new features, I'd like you to split this
> humongous function umc_normaddr_to_sysaddr() logically into separate
> helpers and each helper does exactly one thing and one thing only.
>
> Then use a verb in its name: umc_translate_normaddr_to_sysaddr() or so.
>

Okay, will do.

> Also, Yazen, remind me again pls why isn't this function in
> drivers/edac/amd64_edac.c, where it is needed?
>
> If the reason is not valid anymore, let's move it there before splitting
> so that it doesn't bloat the core code.
>

I don't remember the original reason, and I was recently asked about
this code living in a module. I did some looking after this ask, and I
found that we should be using this translation to get a proper value for
the memory error notifiers to use. So I think we still need to use this
function some way with the core code even if the EDAC interface isn't
used.

I think this set can be split up.

1) Set with patches 1-3 fixed up to use cpu_die_id.
2) Set with the address translation updates.
	a) Move umc_normaddr_to_sysaddr() into a new module under EDAC.
	b) Hook the new module into amd64_edac.c where it's used today.
	c) Refactor the code as you suggested above.
	d) Add the new features.
3) New set that sets up a proper notifier for the address translation.
	a) Unhook the new module from amd64_edac.c.
	b) Register a notifier that runs before any notifiers that
	   operate on memory errors.
	c) Find a way to pass the translated address through the chain
	   without losing the original value.

What do you think?

Thanks,
Yazen
Re: [PATCH v2 6/8] x86/MCE/AMD: Drop tmp variable in translation code
On Wed, Sep 23, 2020 at 10:05:56AM +0200, Borislav Petkov wrote:
> On Thu, Sep 03, 2020 at 08:01:42PM +0000, Yazen Ghannam wrote:
> > From: Yazen Ghannam
> >
> > Remove the "tmp" variable used to save register values. Save the values
> > in existing variables, if possible.
> >
> > The register values are 32 bits. Use separate "reg_" variables to hold
> > the register values if the existing variable sizes don't match, or if
> > no bitfields in a register share the same name as the register.
>
> So I'm missing the "why" in the commit message. Why are you doing this?
>
> Is there some reason which I'll find out later? If not, then this is
> just unnecessary churn.
>

I don't have a strong reason other than trying to address a comment in
the first version. I can drop this patch if you prefer.

Thanks,
Yazen
Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
On Thu, Sep 17, 2020 at 06:40:48PM +0200, Borislav Petkov wrote:
> On Thu, Sep 17, 2020 at 11:20:53AM -0500, Yazen Ghannam wrote:
> > But newer systems support CPUID Leaf 0xB, so cpu_die_id will get
> > explicitly set by detect_extended_topology(). The value set is
> > different from the AMD NodeId. And at that point I shied away from
> > doing any override or fixup.
>
> Well, different how? Can you extract the node_id you need
> from CPUID(0xb)? If yes, we can do an AMD-specific branch in
> detect_extended_topology() but that better be future proof.
>
> IOW, is information from CPUID(0xb) ever going to be needed in the
> kernel?
>
> Also, and independently, if its definition does not give you the
> node_id you need, then you can just as well overwrite ->cpu_die_id in
> detect_extended_topology() because that value - whatever that is, could
> be garbage, just as well - is wrong on AMD anyway.
>
> So it would be a fix for the leaf parsing, regardless of whether you
> need it or not.
>
> Makes sense?
>

Yes, I think so. "Die" is not defined in CPUID(0xb), only SMT and Core,
so the cpu_die_id value is not valid. In which case, we can overwrite
it.

CPUID(0xb) doesn't have anything equivalent to AMD NodeId. So on systems
with CPUID < 0x1F, we should be okay with using cpu_die_id equal to AMD
NodeId.

I have an idea on what to do, so I'll send another rev if that's okay.
Do you have any comments on the other patches in the set?

Thanks,
Yazen
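The rule this exchange converges on can be written as a small decision function: keep the enumerated die ID only when the topology leaf really defines a die level, otherwise use the AMD NodeId. This is a sketch of that reasoning, not the eventual patch; enum topo_leaf and resolve_die_id() are hypothetical names, and the inputs are passed in rather than read from CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag for which extended topology leaf the CPU supports. */
enum topo_leaf { TOPO_LEAF_NONE, TOPO_LEAF_0B, TOPO_LEAF_1F };

/*
 * Leaf 0xB only defines SMT and Core levels, so any die ID derived from
 * it is not meaningful and the AMD NodeId is used instead. A die ID
 * from leaf 0x1F is kept only if a die-type level was actually found.
 */
static uint16_t resolve_die_id(enum topo_leaf leaf, bool die_type_found,
			       uint16_t enumerated_die_id,
			       uint16_t amd_node_id)
{
	if (leaf == TOPO_LEAF_1F && die_type_found)
		return enumerated_die_id;

	return amd_node_id;
}
```

For example, a system that only reports leaf 0xB would end up with cpu_die_id equal to its NodeId, regardless of what the leaf parsing produced.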
Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
On Thu, Sep 17, 2020 at 12:37:20PM +0200, Borislav Petkov wrote: > On Wed, Sep 16, 2020 at 02:51:52PM -0500, Yazen Ghannam wrote: > > What do you think? > > Yeah, forget logical_proc_id - the galactic senate of x86 maintainers > said that we're keeping that for when BIOS vendors f*ck up with the > phys_proc_id enumeration on AMD. Then we'll need that as a workaround. > > Look instead at: > > struct cpuinfo_x86 { > > ... > > u16 cpu_die_id; > u16 logical_die_id; > > and > > 7745f03eb395 ("x86/topology: Add CPUID.1F multi-die/package support") > > "Some new systems have multiple software-visible die within each > package." > > and you could map the AMD packages to those dies. And if you guys > implement CPUID.1F to enumerate those packages the same way, then all > should just work (famous last words). > > Because Intel dies is basically AMD packages consisting of a CCX, caches > and DF. > > We would have to update the documentation in the end to denote that but > let's see if this should work for you too first. Because the concepts > sound very similar, if not identical... > Yep, we could ask the hardware folks to implement CPUID Leaf 0x1F, but that'll be in some future products. I actually tried using cpu_die_id, but I ran into an issue on newer systems. On older systems, there is no CPUID Leaf 0xB or 0x1F, and cpu_die_id doesn't get explicitly set. So setting cpu_die_id equal to AMD NodeId would work. But newer systems support CPUID Leaf 0xB, so cpu_die_id will get explicitly set by detect_extended_topology(). The value set is different from the AMD NodeId. And at that point I shied away from doing any override or fixup. Thanks, Yazen
Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
On Tue, Sep 15, 2020 at 10:35:15AM +0200, Borislav Petkov wrote: ... > > Yeah, I think example 4b works here. The mismatch though is with > > phys_proc_id and package on AMD systems. You can see above that > > phys_proc_id gives a socket number, and the AMD NodeId gives a package > > number. > > Ok, now looka here: > > " - cpuinfo_x86.logical_proc_id: > > The logical ID of the package. As we do not trust BIOSes to enumerate the > packages in a consistent way, we introduced the concept of logical package > ID so we can sanely calculate the number of maximum possible packages in > the system and have the packages enumerated linearly." > > Doesn't that sound like exactly what you need? > > Because that DF ID *is* practically the package ID as there's 1:1 > mapping between DF and a package, as you say above. > > Right? > > Now, it says > > [7.670791] smpboot: Max logical packages: 2 > > on my Rome box but what you want sounds very much like the logical > package ID and if we define that on AMD to be that and document it this > way, I guess that should work too, provided there are no caveats like > sched is using this info for proper task placement and so on. That would > need code audit, of course... > The only use of logical_proc_id seems to be in hswep_uncore_cpu_init(). So I think maybe we can use this. However, I think there are two issues. 1) The logical_proc_id seems like it should refer to the same type of structure as phys_proc_id. In our case, this won't be true as phys_proc_id would refer to the "socket" on AMD and logical_proc_id would refer to the package/AMD NodeId. 2) The AMD NodeId is read during c_init()/init_amd(), so logical_proc_id can be set here. But then logical_proc_id will get overwritten later in topology_update_package_map(). I don't know if it'd be good to modify the generic flow to support this vendor-specific behavior. What do you think? Thanks, Yazen
Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
On Thu, Sep 10, 2020 at 12:14:43PM +0200, Borislav Petkov wrote:
> On Wed, Sep 09, 2020 at 03:17:55PM -0500, Yazen Ghannam wrote:
> > We need to access specific instances of hardware registers in the
> > Northbridge or Data Fabric. The code in arch/x86/kernel/amd_nb.c does
> > this.
>
> So you don't need the node_id - you need the northbridge/data fabric ID?
> I'm guessing NB == DF, i.e., it was NB before Zen and it is DF now.
>
> Yes?
>

Yes, that's right. I called it "node_id" based on the AMD documentation
and what it's called today in the Linux code. It's called other things
like nb_id and nid too. I think we can call it something else to avoid
confusion with NUMA nodes if that'll help.

> > Package = Socket, i.e. a field replaceable unit. Socket may not be
> > useful for software, but I think it helps users identify the hardware.
> >
> > I think the following could be changed in the documentation:
> >
> > "In the past a socket always contained a single package (see below), but
> > with the advent of Multi Chip Modules (MCM) a socket can hold more than one
> > package."
> >
> > Replace "package" with "die".
>
> So first of all, we have:
>
> "AMD nomenclature for package is 'Node'."
>
> so we either change that because as you explain, node != package on AMD.
>
> What you need is the ID of that northbridge or data fabric instance,
> AFAIU.
>
> > You take multiple dies from the foundry and you "package" them together
> > into a single unit.
>
> I think you're overloading the word "package" here and that leads to
> more confusion. Package in our definition - Linux' - is:
>
> "Packages contain a number of cores plus shared resources, e.g. DRAM
> controller, shared caches etc." If you glue several packages together,
> you get an MCM.
>

Yes, you're right. The AMD documentation is different, so I'll try to
stick with the Linux documentation and qualify names with "AMD" when
noting the usage by the AMD docs.

> > They could be equal depending on the system. The values are different on
> > MCM systems like Bulldozer and Naples though.
> >
> > The functions and structures in amd_nb.c are indexed by the node_id.
> > This is done implicitly right now by using amd_get_nb_id()/cpu_llc_id.
> > But the LLC isn't always equal to the Node/Die like in Naples. So the
> > patches in this set save and explicitly use the node_id when needed.
> >
> > What do you think?
>
> Sounds to me that you want to ID that data fabric instance which
> logically belongs to one or multiple packages. Or can a DF a single
> package?
>
> So let's start simple: how does a DF instance map to a logical NUMA
> node or package? Can a DF serve multiple packages?
>

There's one DF/NB per package and it's a fixed value, i.e. it shouldn't
change based on the NUMA configuration.

Here's an example of a 2 socket Naples system with 4 packages per socket
and setup to have 1 NUMA node. The "node_id" value is the AMD NodeId
from CPUID.

CPU=0  phys_proc_id=0 node_id=0 cpu_to_node()=0
CPU=8  phys_proc_id=0 node_id=1 cpu_to_node()=0
CPU=16 phys_proc_id=0 node_id=2 cpu_to_node()=0
CPU=24 phys_proc_id=0 node_id=3 cpu_to_node()=0
CPU=32 phys_proc_id=1 node_id=4 cpu_to_node()=0
CPU=40 phys_proc_id=1 node_id=5 cpu_to_node()=0
CPU=48 phys_proc_id=1 node_id=6 cpu_to_node()=0
CPU=56 phys_proc_id=1 node_id=7 cpu_to_node()=0

> You could use the examples at the end of Documentation/x86/topology.rst
> to explain how those things play together. And remember to not think
> about the physical aspect of the hardware structure because it doesn't
> mean anything to software. All you wanna do is address the proper DF
> instance so this needs to be enumerable and properly represented by sw.
>

Yeah, I think example 4b works here. The mismatch though is with
phys_proc_id and package on AMD systems. You can see above that
phys_proc_id gives a socket number, and the AMD NodeId gives a package
number.

Should we add a note under cpuinfo_x86.phys_proc_id to make this
distinction?

> Confused?
>
> I am.
>
> :-)
>

Yeah, me too. :)

Thanks,
Yazen
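The rows of the Naples example in this message can be reproduced with simple integer arithmetic, which makes the socket versus NodeId distinction easy to see. The sketch assumes what the listed rows suggest and nothing more: 8 consecutively numbered CPUs per node and 4 nodes per socket. Real CPU numbering (SMT siblings in particular) may differ, and the example_*() helpers are hypothetical, not kernel functions.

```c
/* Assumptions taken from the listed rows of the 2 socket Naples example. */
#define EXAMPLE_CPUS_PER_NODE		8u
#define EXAMPLE_NODES_PER_SOCKET	4u

/* AMD NodeId: fixed by hardware, one per DF/NB instance. */
static unsigned int example_node_id(unsigned int cpu)
{
	return cpu / EXAMPLE_CPUS_PER_NODE;
}

/* phys_proc_id: the socket that holds the node. */
static unsigned int example_phys_proc_id(unsigned int cpu)
{
	return example_node_id(cpu) / EXAMPLE_NODES_PER_SOCKET;
}

/* cpu_to_node(): the example system is set up with a single NUMA node. */
static unsigned int example_numa_node(unsigned int cpu)
{
	(void)cpu;
	return 0;
}
```

CPU 40, for instance, maps to socket 1 and NodeId 5 while still sitting in NUMA node 0, which is why the NodeId cannot be derived from either the socket number or the NUMA configuration alone.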
Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
On Wed, Sep 09, 2020 at 08:06:47PM +0200, Borislav Petkov wrote: > On Thu, Sep 03, 2020 at 08:01:37PM +0000, Yazen Ghannam wrote: > > From: Yazen Ghannam > > > > AMD systems provide a "NodeId" value that represents a global ID > > indicating to which "Node" a logical CPU belongs. The "Node" is a > > physical structure equivalent to a Die, and it should not be confused > > with logical structures like NUMA node. > > So we said in Documentation/x86/topology.rst that: > > "The kernel does not care about the concept of physical sockets because > a socket has no relevance to software. It's an electromechanical > component." > Yes, I agree with this. > Now, you're talking, AFAIU, about physical components. Why do you need > them? > We need to access specific instances of hardware registers in the Northbridge or Data Fabric. The code in arch/x86/kernel/amd_nb.c does this. > What is then: > > - cpuinfo_x86.phys_proc_id: > > The physical ID of the package. This information is retrieved via CPUID > and deduced from the APIC IDs of the cores in the package. > > supposed to mean? > Package = Socket, i.e. a field replaceable unit. Socket may not be useful for software, but I think it helps users identify the hardware. I think the following could be changed in the documentation: "In the past a socket always contained a single package (see below), but with the advent of Multi Chip Modules (MCM) a socket can hold more than one package." Replace "package" with "die". You take multiple dies from the foundry and you "package" them together into a single unit. > Why isn't phys_proc_id != node_id? > They could be equal depending on the system. The values are different on MCM systems like Bulldozer and Naples though. The functions and structures in amd_nb.c are indexed by the node_id. This is done implicitly right now by using amd_get_nb_id()/cpu_llc_id. But the LLC isn't always equal to the Node/Die like in Naples. 
So the patches in this set save and explicitly use the node_id when needed. What do you think? Thanks, Yazen
[PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
From: Yazen Ghannam AMD systems provide a "NodeId" value that represents a global ID indicating to which "Node" a logical CPU belongs. The "Node" is a physical structure equivalent to a Die, and it should not be confused with logical structures like NUMA node. Logical nodes can be adjusted based on firmware or other settings whereas the physical nodes/dies are fixed based on hardware topology. The NodeId value can be used when a physical ID is needed by software. Save the AMD NodeId to struct cpuinfo_x86. Use the value from CPUID or MSR as appropriate. Default to phys_proc_id otherwise. Do so for both AMD and Hygon systems. Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no longer needed. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-2-yazen.ghan...@amd.com v1 -> v2: * New patch based on review comment to save value to struct cpuinfo_x86. arch/x86/include/asm/cacheinfo.h | 4 ++-- arch/x86/include/asm/processor.h | 1 + arch/x86/kernel/cpu/amd.c| 11 +-- arch/x86/kernel/cpu/cacheinfo.c | 6 +++--- arch/x86/kernel/cpu/hygon.c | 11 +-- 5 files changed, 16 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h index 86b63c7feab7..86b2e0dcc4bf 100644 --- a/arch/x86/include/asm/cacheinfo.h +++ b/arch/x86/include/asm/cacheinfo.h @@ -2,7 +2,7 @@ #ifndef _ASM_X86_CACHEINFO_H #define _ASM_X86_CACHEINFO_H -void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id); -void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id); +void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu); +void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu); #endif /* _ASM_X86_CACHEINFO_H */ diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 97143d87994c..a776b7886ec0 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -95,6 +95,7 @@ struct cpuinfo_x86 { /* CPUID 
returned core id bits: */ __u8 x86_coreid_bits; __u8 cu_id; + __u8 node_id; /* Max extended CPUID function supported: */ __u32 extended_cpuid_level; /* Maximum supported CPUID level, -1=no CPUID: */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index dcc3d943c68f..5eef4cc1e5b7 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -330,7 +330,6 @@ static void legacy_fixup_core_id(struct cpuinfo_x86 *c) */ static void amd_get_topology(struct cpuinfo_x86 *c) { - u8 node_id; int cpu = smp_processor_id(); /* get information required for multi-node processors */ @@ -340,7 +339,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c) cpuid(0x8000001e, &eax, &ebx, &ecx, &edx); - node_id = ecx & 0xff; + c->node_id = ecx & 0xff; if (c->x86 == 0x15) c->cu_id = ebx & 0xff; @@ -360,15 +359,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c) if (!err) c->x86_coreid_bits = get_count_order(c->x86_max_cores); - cacheinfo_amd_init_llc_id(c, cpu, node_id); + cacheinfo_amd_init_llc_id(c, cpu); } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) { u64 value; rdmsrl(MSR_FAM10H_NODE_ID, value); - node_id = value & 7; + c->node_id = value & 7; - per_cpu(cpu_llc_id, cpu) = node_id; + per_cpu(cpu_llc_id, cpu) = c->node_id; } else return; @@ -393,7 +392,7 @@ static void amd_detect_cmp(struct cpuinfo_x86 *c) /* Convert the initial APIC ID into the socket ID */ c->phys_proc_id = c->initial_apicid >> bits; /* use socket ID also for last level cache */ - per_cpu(cpu_llc_id, cpu) = c->phys_proc_id; + per_cpu(cpu_llc_id, cpu) = c->node_id = c->phys_proc_id; } static void amd_detect_ppin(struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index 57074cf3ad7c..81dfddae4470 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -646,7 +646,7 @@ static int find_num_cache_leaves(struct cpuinfo_x86 *c) return i; } -void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id) 
+void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu) { /* * We may have multiple LLCs if L3 caches exist, so check if we @@ -657,7 +657,7 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id) if (c->x86 < 0x17) { /* LLC is at the nod
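For readers following along, here is a minimal userspace sketch of the three ways the patch above derives node_id: from ECX of CPUID leaf 0x8000001e, from the Family 10h NodeId MSR, and from the socket ID as a fallback. The helper names are made up for illustration and are not kernel functions; only the masks and the shift come from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: NodeId is the low 8 bits of ECX from
 * CPUID leaf 0x8000001e, matching "c->node_id = ecx & 0xff". */
static inline uint8_t node_id_from_cpuid_ecx(uint32_t ecx)
{
	return ecx & 0xff;
}

/* Hypothetical helper: NodeId is the low 3 bits of the Family 10h
 * NodeId MSR value, matching "c->node_id = value & 7". */
static inline uint8_t node_id_from_msr(uint64_t value)
{
	return value & 7;
}

/* Hypothetical helper: fallback path from amd_detect_cmp(), where
 * node_id is set to the socket ID (initial APIC ID >> bits). */
static inline uint8_t node_id_fallback(uint32_t initial_apicid, unsigned int bits)
{
	return (uint8_t)(initial_apicid >> bits);
}

/* Small usage example with made-up register values. */
void node_id_example(void)
{
	assert(node_id_from_cpuid_ecx(0x00000103) == 3);
	assert(node_id_from_msr(0xF5) == 5);
	assert(node_id_fallback(0x25, 4) == 2);
}
```

The point of the sketch is only that all three paths end up writing the same field, so later code can read node_id without caring which path was taken.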
[PATCH v2 2/8] x86/CPU/AMD: Remove amd_get_nb_id()
From: Yazen Ghannam The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in struct cpuinfo_x86.node_id. Replace calls to amd_get_nb_id() with the logical CPU's node_id and remove the function. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-2-yazen.ghan...@amd.com v1 -> v2: * New patch. arch/x86/events/amd/core.c | 2 +- arch/x86/include/asm/processor.h | 2 -- arch/x86/kernel/amd_nb.c | 4 ++-- arch/x86/kernel/cpu/amd.c| 6 -- arch/x86/kernel/cpu/cacheinfo.c | 2 +- arch/x86/kernel/cpu/mce/amd.c| 4 ++-- arch/x86/kernel/cpu/mce/inject.c | 4 ++-- drivers/edac/amd64_edac.c| 4 ++-- drivers/edac/mce_amd.c | 2 +- 9 files changed, 11 insertions(+), 19 deletions(-) diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index 39eb276d0277..01b9b943dcf4 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -538,7 +538,7 @@ static void amd_pmu_cpu_starting(int cpu) if (!x86_pmu.amd_nb_constraints) return; - nb_id = amd_get_nb_id(cpu); + nb_id = cpu_data(cpu).node_id; WARN_ON_ONCE(nb_id == BAD_APICID); for_each_online_cpu(i) { diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index a776b7886ec0..408977a323d3 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -871,10 +871,8 @@ extern int set_tsc_mode(unsigned int val); DECLARE_PER_CPU(u64, msr_misc_features_shadow); #ifdef CONFIG_CPU_SUP_AMD -extern u16 amd_get_nb_id(int cpu); extern u32 amd_get_nodes_per_socket(void); #else -static inline u16 amd_get_nb_id(int cpu) { return 0; } static inline u32 amd_get_nodes_per_socket(void) { return 0; } #endif diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c index 18f6b7c4bd79..2bd8abdbed8e 100644 --- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -384,7 +384,7 @@ struct resource *amd_get_mmconfig_range(struct resource 
*res) int amd_get_subcaches(int cpu) { - struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link; + struct pci_dev *link = node_to_amd_nb(cpu_data(cpu).node_id)->link; unsigned int mask; if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING)) @@ -398,7 +398,7 @@ int amd_get_subcaches(int cpu) int amd_set_subcaches(int cpu, unsigned long mask) { static unsigned int reset, ban; - struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu)); + struct amd_northbridge *nb = node_to_amd_nb(cpu_data(cpu).node_id); unsigned int reg; int cuid; diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 5eef4cc1e5b7..846367a69c4a 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -424,12 +424,6 @@ static void amd_detect_ppin(struct cpuinfo_x86 *c) clear_cpu_cap(c, X86_FEATURE_AMD_PPIN); } -u16 amd_get_nb_id(int cpu) -{ - return per_cpu(cpu_llc_id, cpu); -} -EXPORT_SYMBOL_GPL(amd_get_nb_id); - u32 amd_get_nodes_per_socket(void) { return nodes_per_socket; diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index 81dfddae4470..8e34e90bb872 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -580,7 +580,7 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index) if (index < 3) return; - node = amd_get_nb_id(smp_processor_id()); + node = cpu_data(smp_processor_id()).node_id; this_leaf->nb = node_to_amd_nb(node); if (this_leaf->nb && !this_leaf->nb->l3_cache.indices) amd_calc_l3_indices(this_leaf->nb); diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 0c6b02dd744c..be96f77004ad 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -1341,7 +1341,7 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu, return -ENODEV; if (is_shared_bank(bank)) { - nb = node_to_amd_nb(amd_get_nb_id(cpu)); + nb = node_to_amd_nb(cpu_data(cpu).node_id); /* threshold descriptor already initialized 
on this node? */ if (nb && nb->bank4) { @@ -1445,7 +1445,7 @@ static void threshold_remove_bank(struct threshold_bank *bank) * The last CPU on this node using the shared bank is going * away, remove that bank now. */ - nb = node_to_amd_nb(amd_get_nb_id(smp_processor_id())); + nb = node_to_amd_nb(cpu_data(smp_processor_id()).node_id); nb->bank4 = NULL; } diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cp
[PATCH v2 7/8] x86/MCE/AMD: Group register reads in translation code
From: Yazen Ghannam ...so that bitfield extraction can be done together to simplify future patches. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * New patch based on comments for v1 Patch 2. arch/x86/kernel/cpu/mce/amd.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 5a18937ff7cd..f5440f8000e9 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -729,11 +729,18 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) goto out_err; } + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &reg_dram_limit_addr)) + goto out_err; + lgcy_mmio_hole_en = get_bit(reg_dram_base_addr, 1); intlv_num_chan = get_bits(reg_dram_base_addr, 7, 4); intlv_addr_sel = get_bits(reg_dram_base_addr, 10, 8); dram_base_addr = get_bits(reg_dram_base_addr, 31, 12) << 28; + intlv_num_sockets = get_bit(reg_dram_limit_addr, 8); + intlv_num_dies = get_bits(reg_dram_limit_addr, 11, 10); + dram_limit_addr = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | GENMASK_ULL(27, 0); + /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */ if (intlv_addr_sel > 3) { pr_err("%s: Invalid interleave address select %d.\n", @@ -741,13 +748,6 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) goto out_err; } - if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &reg_dram_limit_addr)) - goto out_err; - - intlv_num_sockets = get_bit(reg_dram_limit_addr, 8); - intlv_num_dies = get_bits(reg_dram_limit_addr, 11, 10); - dram_limit_addr = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | GENMASK_ULL(27, 0); - intlv_addr_bit = intlv_addr_sel + 8; /* Re-use intlv_num_chan by setting it equal to log2(#channels) */ -- 2.25.1
[PATCH v2 6/8] x86/MCE/AMD: Drop tmp variable in translation code
From: Yazen Ghannam Remove the "tmp" variable used to save register values. Save the values in existing variables, if possible. The register values are 32 bits. Use separate "reg_" variables to hold the register values if the existing variable sizes don't match, or if no bitfields in a register share the same name as the register. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * New patch based on comments for v1 Patch 2. arch/x86/kernel/cpu/mce/amd.c | 56 +++ 1 file changed, 30 insertions(+), 26 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 90c3ad61ae19..5a18937ff7cd 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -688,11 +688,14 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { - u64 dram_base_addr, dram_limit_addr, dram_hole_base; /* We start from the normalized address */ u64 ret_addr = norm_addr; - u32 tmp; + u64 dram_base_addr, dram_limit_addr; + u32 dram_hole_base; + + u32 reg_dram_base_addr, reg_dram_limit_addr; + u32 reg_dram_offset; u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask; u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets; @@ -702,12 +705,12 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) u8 cs_mask, cs_id = 0; bool hash_enabled = false; - if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, &reg_dram_offset)) goto out_err; /* Remove HiAddrOffset from normalized address, if enabled: */ - if (tmp & BIT(0)) { - u64 hi_addr_offset = get_bits(tmp, 31, 20) << 28; + if (reg_dram_offset & BIT(0)) { + u64 hi_addr_offset = get_bits(reg_dram_offset, 31, 20) << 28; /* Check if base 1 is used. 
*/ if (norm_addr >= hi_addr_offset) { @@ -716,20 +719,20 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) } } - if (amd_df_indirect_read(nid, 0, DF_F0_DRAMBASEADDR + (8 * base), umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMBASEADDR + (8 * base), umc, &reg_dram_base_addr)) goto out_err; /* Check if address range is valid. */ - if (!(tmp & BIT(0))) { + if (!(reg_dram_base_addr & BIT(0))) { pr_err("%s: Invalid DramBaseAddress range: 0x%x.\n", - __func__, tmp); + __func__, reg_dram_base_addr); goto out_err; } - lgcy_mmio_hole_en = get_bit(tmp, 1); - intlv_num_chan = get_bits(tmp, 7, 4); - intlv_addr_sel = get_bits(tmp, 10, 8); - dram_base_addr = get_bits(tmp, 31, 12) << 28; + lgcy_mmio_hole_en = get_bit(reg_dram_base_addr, 1); + intlv_num_chan = get_bits(reg_dram_base_addr, 7, 4); + intlv_addr_sel = get_bits(reg_dram_base_addr, 10, 8); + dram_base_addr = get_bits(reg_dram_base_addr, 31, 12) << 28; /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */ if (intlv_addr_sel > 3) { @@ -738,12 +741,12 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) goto out_err; } - if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &reg_dram_limit_addr)) goto out_err; - intlv_num_sockets = get_bit(tmp, 8); - intlv_num_dies = get_bits(tmp, 11, 10); - dram_limit_addr = (get_bits(tmp, 31, 12) << 28) | GENMASK_ULL(27, 0); + intlv_num_sockets = get_bit(reg_dram_limit_addr, 8); + intlv_num_dies = get_bits(reg_dram_limit_addr, 11, 10); + dram_limit_addr = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | GENMASK_ULL(27, 0); intlv_addr_bit = intlv_addr_sel + 8; @@ -786,17 +789,18 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) if (num_intlv_bits > 0) { u64 temp_addr_x, temp_addr_i, temp_addr_y; - u8 die_id_bit, sock_id_bit, cs_fabric_id; + u32 reg_sys_fabric_id, cs_fabric_id; + u8 die_id_bit, 
sock_id_bit; /* * This is the fabric id for this coherent slave. Use * umc/channel# as instance id of the coherent slave * for FICAA. */ - if (amd_df_indirect_read(nid, 0, DF_F0_FABRICINSTINFO3, umc, &tmp)) +
[PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation
From: Muralidhara M K Add support for new memory interleaving modes used in current AMD systems. Check if the system is using a current Data Fabric version or a legacy version as some bit and register definitions have changed. Tested on AMD reference platforms with the following memory interleaving options. Naples - None - Channel - Die - Socket Rome (NPS = Nodes per Socket) - None - NPS0 - NPS1 - NPS2 - NPS4 The fixes tag refers to the commit that allows amd64_edac_mod to load on Rome systems. The module may report incorrect system addresses on Rome systems depending on the interleaving option used. Fixes: 6e846239e548 ("EDAC/amd64: Add Family 17h Model 30h PCI IDs") Signed-off-by: Muralidhara M K Co-developed-by: Naveen Krishna Chtradhi Signed-off-by: Naveen Krishna Chtradhi Co-developed-by: Yazen Ghannam Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * Rebased on cleanup patches. * Save and use the Data Fabric version. * Reorder code to execute non-legacy flows first. This change wasn't made to the section with the "hashed_bit" calculation, since the current flow reads easier IMHO. 
arch/x86/kernel/cpu/mce/amd.c | 222 ++ 1 file changed, 172 insertions(+), 50 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index f5440f8000e9..c14076bcabf2 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -683,8 +683,10 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) #define DF_F0_DRAMBASEADDR 0x110 #define DF_F0_DRAMLIMITADDR 0x114 #define DF_F0_DRAMOFFSET 0x1B4 +#define DF_F0_DFGLOBALCTRL 0x3F8 #define DF_F1_SYSFABRICID 0x208 +#define DF_F1_SYSFABRICID1 0x20C int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { @@ -695,22 +697,30 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) u32 dram_hole_base; u32 reg_dram_base_addr, reg_dram_limit_addr; - u32 reg_dram_offset; + u32 reg_dram_offset, reg_sys_fabric_id; + + bool hash_enabled = false, split_normalized = false; - u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask; u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets; - u8 intlv_addr_sel, intlv_addr_bit; - u8 num_intlv_bits, hashed_bit; + u8 intlv_addr_sel, intlv_addr_bit, num_intlv_bits; + u8 cs_mask, cs_id = 0, dst_fabric_id = 0; u8 lgcy_mmio_hole_en, base = 0; - u8 cs_mask, cs_id = 0; - bool hash_enabled = false; + u8 df_version; + + if (amd_df_indirect_read(nid, 1, DF_F1_SYSFABRICID, umc, &reg_sys_fabric_id)) + goto out_err; + + df_version = (reg_sys_fabric_id & 0xFF) ? 3 : 2; if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, &reg_dram_offset)) goto out_err; /* Remove HiAddrOffset from normalized address, if enabled: */ if (reg_dram_offset & BIT(0)) { - u64 hi_addr_offset = get_bits(reg_dram_offset, 31, 20) << 28; + u8 hi_addr_offset_lsb = (df_version >= 3) ? 12 : 20; + u64 hi_addr_offset = get_bits(reg_dram_offset, 31, hi_addr_offset_lsb); + + hi_addr_offset <<= 28; /* Check if base 1 is used. 
*/ if (norm_addr >= hi_addr_offset) { @@ -733,19 +743,23 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) goto out_err; lgcy_mmio_hole_en = get_bit(reg_dram_base_addr, 1); - intlv_num_chan= get_bits(reg_dram_base_addr, 7, 4); - intlv_addr_sel= get_bits(reg_dram_base_addr, 10, 8); dram_base_addr= get_bits(reg_dram_base_addr, 31, 12) << 28; - - intlv_num_sockets = get_bit(reg_dram_limit_addr, 8); - intlv_num_dies= get_bits(reg_dram_limit_addr, 11, 10); dram_limit_addr = (get_bits(reg_dram_limit_addr, 31, 12) << 28) | GENMASK_ULL(27, 0); - /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */ - if (intlv_addr_sel > 3) { - pr_err("%s: Invalid interleave address select %d.\n", - __func__, intlv_addr_sel); - goto out_err; + if (df_version >= 3) { + intlv_num_chan= get_bits(reg_dram_base_addr, 5, 2); + intlv_num_dies= get_bits(reg_dram_base_addr, 7, 6); + intlv_num_sockets = get_bit(reg_dram_base_addr, 8); + intlv_addr_sel= get_bits(reg_dram_base_addr, 11, 9); + + dst_fabric_id = get_bits(reg_dram_limit_addr, 9, 0); + } else { + intlv_num_chan= get_bits(reg_dram_base_addr, 7, 4); + intlv_addr_sel= get_bits(reg_dram_base_addr, 10, 8); + + dst_fabric_id
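As a reading aid, the version-dependent field layout in the hunk above can be sketched in plain userspace C. This is only an illustration under assumptions: the struct and function names are invented, the DF3 bit positions are copied from the added lines, and the legacy (DF2) positions are taken from the lines the hunk removes, since the else branch is cut off in this copy of the mail. dst_fabric_id is left out for the same reason.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the interleave fields; not a kernel type. */
struct intlv_fields {
	uint8_t num_chan;
	uint8_t num_dies;
	uint8_t num_sockets;
	uint8_t addr_sel;
};

/* Extract bits [msb:lsb] of a 32-bit register value. */
static inline uint32_t field(uint32_t x, unsigned int msb, unsigned int lsb)
{
	return (uint32_t)((x >> lsb) & ((1ULL << (msb - lsb + 1)) - 1));
}

/* Same test as the patch: a non-zero low byte in SystemFabricId means DF3. */
static inline unsigned int df_version_from_fabric_id(uint32_t reg_sys_fabric_id)
{
	return (reg_sys_fabric_id & 0xFF) ? 3 : 2;
}

/* Decode the interleave fields from DramBaseAddress/DramLimitAddress. */
static struct intlv_fields decode_intlv(uint32_t base, uint32_t limit,
					unsigned int df_version)
{
	struct intlv_fields f;

	if (df_version >= 3) {
		/* DF3: every field lives in DramBaseAddress. */
		f.num_chan    = field(base, 5, 2);
		f.num_dies    = field(base, 7, 6);
		f.num_sockets = field(base, 8, 8);
		f.addr_sel    = field(base, 11, 9);
	} else {
		/* Legacy: socket and die counts live in DramLimitAddress. */
		f.num_chan    = field(base, 7, 4);
		f.addr_sel    = field(base, 10, 8);
		f.num_sockets = field(limit, 8, 8);
		f.num_dies    = field(limit, 11, 10);
	}
	return f;
}

/* Usage example with made-up register values. */
void decode_intlv_example(void)
{
	struct intlv_fields f = decode_intlv(0x794, 0, 3);

	assert(f.num_chan == 5 && f.num_dies == 2);
	assert(f.num_sockets == 1 && f.addr_sel == 3);
}
```

Keeping the two layouts side by side in one function makes it easier to see that only the bit positions change between versions, not the meaning of the fields.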
[PATCH v2 5/8] x86/MCE/AMD: Use macros to get bitfields in translation code
From: Yazen Ghannam Define macros to get individual bits and bitfields. Use these to make the code more readable. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * New patch based on comments for v1 Patch 2. arch/x86/kernel/cpu/mce/amd.c | 46 +-- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 1e0510fd5afc..90c3ad61ae19 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -675,6 +675,9 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) deferred_error_interrupt_enable(c); } +#define get_bits(x, msb, lsb) ((x & GENMASK_ULL(msb, lsb)) >> lsb) +#define get_bit(x, bit) ((x >> bit) & BIT(0)) + #define DF_F0_FABRICINSTINFO3 0x50 #define DF_F0_MMIOHOLE 0x104 #define DF_F0_DRAMBASEADDR 0x110 @@ -704,7 +707,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) /* Remove HiAddrOffset from normalized address, if enabled: */ if (tmp & BIT(0)) { - u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8; + u64 hi_addr_offset = get_bits(tmp, 31, 20) << 28; /* Check if base 1 is used. 
*/ if (norm_addr >= hi_addr_offset) { @@ -723,10 +726,10 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) goto out_err; } - lgcy_mmio_hole_en = tmp & BIT(1); - intlv_num_chan= (tmp >> 4) & 0xF; - intlv_addr_sel= (tmp >> 8) & 0x7; - dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16; + lgcy_mmio_hole_en = get_bit(tmp, 1); + intlv_num_chan= get_bits(tmp, 7, 4); + intlv_addr_sel= get_bits(tmp, 10, 8); + dram_base_addr= get_bits(tmp, 31, 12) << 28; /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */ if (intlv_addr_sel > 3) { @@ -738,9 +741,9 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &tmp)) goto out_err; - intlv_num_sockets = (tmp >> 8) & 0x1; - intlv_num_dies= (tmp >> 10) & 0x3; - dram_limit_addr = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0); + intlv_num_sockets = get_bit(tmp, 8); + intlv_num_dies= get_bits(tmp, 11, 10); + dram_limit_addr = (get_bits(tmp, 31, 12) << 28) | GENMASK_ULL(27, 0); intlv_addr_bit = intlv_addr_sel + 8; @@ -793,7 +796,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) if (amd_df_indirect_read(nid, 0, DF_F0_FABRICINSTINFO3, umc, &tmp)) goto out_err; - cs_fabric_id = (tmp >> 8) & 0xFF; + cs_fabric_id = get_bits(tmp, 15, 8); die_id_bit = 0; /* If interleaved over more than 1 channel: */ @@ -812,16 +815,16 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) /* If interleaved over more than 1 die. */ if (intlv_num_dies) { sock_id_bit = die_id_bit + intlv_num_dies; - die_id_shift = (tmp >> 24) & 0xF; - die_id_mask = (tmp >> 8) & 0xFF; + die_id_shift = get_bits(tmp, 27, 24); + die_id_mask = get_bits(tmp, 15, 8); cs_id |= ((cs_fabric_id & die_id_mask) >> die_id_shift) << die_id_bit; } /* If interleaved over more than 1 socket. 
*/ if (intlv_num_sockets) { - socket_id_shift = (tmp >> 28) & 0xF; - socket_id_mask = (tmp >> 16) & 0xFF; + socket_id_shift = get_bits(tmp, 31, 28); + socket_id_mask = get_bits(tmp, 23, 16); cs_id |= ((cs_fabric_id & socket_id_mask) >> socket_id_shift) << sock_id_bit; } @@ -834,7 +837,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) * bits there are. "intlv_addr_bit" tells us how many "Y" bits * there are (where "I" starts). */ - temp_addr_y = ret_addr & GENMASK_ULL(intlv_addr_bit-1, 0); + temp_addr_y = get_bits(ret_addr, intlv_addr_bit-1, 0); temp_addr_i = (cs_id << intlv_addr_bit); temp_addr_x = (ret_addr & GENMASK_ULL(63, intlv_addr_bit)) << num_intlv
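A quick way to convince yourself that the rewritten shifts in this patch are equivalent to the old ones is to try the macros outside the kernel. The sketch below is a userspace approximation, not the kernel headers: GENMASK_ULL and BIT are re-defined locally, the macro arguments are parenthesized (the patch leaves them bare), and the two helper functions are hypothetical names for the old and new expressions.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros, for userspace use only. */
#define GENMASK_ULL(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define BIT(n)            (1ULL << (n))

/* Same idea as the patch, with arguments parenthesized for safety. */
#define get_bits(x, msb, lsb) (((x) & GENMASK_ULL(msb, lsb)) >> (lsb))
#define get_bit(x, bit)       (((x) >> (bit)) & BIT(0))

/* Old form: mask bits [31:20] in place, then shift left by 8. */
static inline uint64_t hi_addr_offset_old(uint32_t tmp)
{
	return (tmp & GENMASK_ULL(31, 20)) << 8;
}

/* New form: extract bits [31:20] as a value, then shift left by 28. */
static inline uint64_t hi_addr_offset_new(uint32_t tmp)
{
	return get_bits(tmp, 31, 20) << 28;
}

/* Usage example: both forms place the field at bits [39:28]. */
void get_bits_example(void)
{
	uint32_t tmp = 0xABCD1235;

	assert(get_bits(tmp, 31, 20) == 0xABC);
	assert(get_bit(tmp, 0) == 1);
	assert(hi_addr_offset_old(tmp) == hi_addr_offset_new(tmp));
	assert(hi_addr_offset_new(tmp) == 0xABC0000000ULL);
}
```

The same reasoning covers dram_base_addr: masking bits [31:12] in place and shifting by 16 gives the same result as extracting the 20-bit value and shifting by 28.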
[PATCH v2 4/8] x86/MCE/AMD: Use defines for register addresses in translation code
From: Yazen Ghannam Replace raw register offset values in the AMD address translation code with named definitions. Also, drop comments that only note the register names. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * New patch based on comments for v1 Patch 2. arch/x86/kernel/cpu/mce/amd.c | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index be96f77004ad..1e0510fd5afc 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -675,6 +675,14 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) deferred_error_interrupt_enable(c); } +#define DF_F0_FABRICINSTINFO3 0x50 +#define DF_F0_MMIOHOLE 0x104 +#define DF_F0_DRAMBASEADDR 0x110 +#define DF_F0_DRAMLIMITADDR0x114 +#define DF_F0_DRAMOFFSET 0x1B4 + +#define DF_F1_SYSFABRICID 0x208 + int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { u64 dram_base_addr, dram_limit_addr, dram_hole_base; @@ -691,22 +699,21 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) u8 cs_mask, cs_id = 0; bool hash_enabled = false; - /* Read D18F0x1B4 (DramOffset), check if base 1 is used. */ - if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMOFFSET, umc, &tmp)) goto out_err; /* Remove HiAddrOffset from normalized address, if enabled: */ if (tmp & BIT(0)) { u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8; + /* Check if base 1 is used. */ if (norm_addr >= hi_addr_offset) { ret_addr -= hi_addr_offset; base = 1; } } - /* Read D18F0x110 (DramBaseAddress). */ - if (amd_df_indirect_read(nid, 0, 0x110 + (8 * base), umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMBASEADDR + (8 * base), umc, &tmp)) goto out_err; /* Check if address range is valid. 
*/ @@ -728,8 +735,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) goto out_err; } - /* Read D18F0x114 (DramLimitAddress). */ - if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_DRAMLIMITADDR + (8 * base), umc, &tmp)) goto out_err; intlv_num_sockets = (tmp >> 8) & 0x1; @@ -780,12 +786,11 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) u8 die_id_bit, sock_id_bit, cs_fabric_id; /* -* Read FabricBlockInstanceInformation3_CS[BlockFabricID]. * This is the fabric id for this coherent slave. Use * umc/channel# as instance id of the coherent slave * for FICAA. */ - if (amd_df_indirect_read(nid, 0, 0x50, umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_FABRICINSTINFO3, umc, &tmp)) goto out_err; cs_fabric_id = (tmp >> 8) & 0xFF; @@ -800,9 +805,8 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) sock_id_bit = die_id_bit; - /* Read D18F1x208 (SystemFabricIdMask). */ if (intlv_num_dies || intlv_num_sockets) - if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp)) + if (amd_df_indirect_read(nid, 1, DF_F1_SYSFABRICID, umc, &tmp)) goto out_err; /* If interleaved over more than 1 die. */ @@ -841,7 +845,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) /* If legacy MMIO hole enabled */ if (lgcy_mmio_hole_en) { - if (amd_df_indirect_read(nid, 0, 0x104, umc, &tmp)) + if (amd_df_indirect_read(nid, 0, DF_F0_MMIOHOLE, umc, &tmp)) goto out_err; dram_hole_base = tmp & GENMASK(31, 24); -- 2.25.1
[PATCH v2 0/8] AMD MCA Address Translation Updates
From: Yazen Ghannam This patchset includes updates for the MCA Address Translation process on recent AMD systems. Patches 1 & 3: Fixes an input to the address translation function. The translation requires a physical Die ID (NodeId in AMD documentation) rather than a logical NUMA node ID. This is because the physical and logical nodes may not always match. Patch 2: Removes a function that is no longer needed with Patch 1. Patches 4-7: Code cleanup in preparation for Patch 8. Patch 8: Add translation support for new memory interleaving options available in Rome systems. The patch is based on the latest AMD reference code for the address translation. Patches 6-8 have checkpatch warnings about long lines, but I kept the long lines for readability. Thanks, Yazen Link: https://lkml.kernel.org/r/20200814191449.183998-1-yazen.ghan...@amd.com v1 -> v2: * Save the AMD NodeId value in struct cpuinfo_x86 rather than use a local value in MCA code. * Include code cleanup for AMD MCA Address Translation function before adding new functionality. Muralidhara M K (1): x86/MCE/AMD Support new memory interleaving modes during address translation Yazen Ghannam (7): x86/CPU/AMD: Save NodeId on AMD-based systems x86/CPU/AMD: Remove amd_get_nb_id() EDAC/mce_amd: Use struct cpuinfo_x86.node_id for NodeId x86/MCE/AMD: Use defines for register addresses in translation code x86/MCE/AMD: Use macros to get bitfields in translation code x86/MCE/AMD: Drop tmp variable in translation code x86/MCE/AMD: Group register reads in translation code arch/x86/events/amd/core.c | 2 +- arch/x86/include/asm/cacheinfo.h | 4 +- arch/x86/include/asm/processor.h | 3 +- arch/x86/kernel/amd_nb.c | 4 +- arch/x86/kernel/cpu/amd.c | 17 +- arch/x86/kernel/cpu/cacheinfo.c | 8 +- arch/x86/kernel/cpu/hygon.c | 11 +- arch/x86/kernel/cpu/mce/amd.c | 284 ++- arch/x86/kernel/cpu/mce/inject.c | 4 +- drivers/edac/amd64_edac.c | 4 +- drivers/edac/mce_amd.c | 4 +- 11 files changed, 233 insertions(+), 112 deletions(-) -- 2.25.1
[PATCH v2 3/8] EDAC/mce_amd: Use struct cpuinfo_x86.node_id for NodeId
From: Yazen Ghannam The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and later systems. This function is used in amd64_edac_mod to do system-specific decoding for DRAM ECC errors. The function takes a "NodeId" as a parameter. In AMD documentation, NodeId is used to identify a physical die in a system. This can be used to identify a node in the AMD_NB code and also it is used with umc_normaddr_to_sysaddr(). However, the input used for decode_dram_ecc() is currently the NUMA node of a logical CPU. In the default configuration, the NUMA node and physical die will be equivalent, so this doesn't have an impact. But the NUMA node configuration can be adjusted with optional memory interleaving modes. This will cause the NUMA node enumeration to not match the physical die enumeration. The mismatch will cause the address translation function to fail or report incorrect results. Use struct cpuinfo_x86.node_id for the node_id parameter to ensure the physical ID is used. Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID") Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-2-yazen.ghan...@amd.com v1 -> v2: * Redo based on change in Patch 1. drivers/edac/mce_amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index ac9bd74c92cd..91b5e3e0744e 100644 --- a/drivers/edac/mce_amd.c +++ b/drivers/edac/mce_amd.c @@ -1003,7 +1003,7 @@ static void decode_smca_error(struct mce *m) pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]); if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc) - decode_dram_ecc(cpu_to_node(m->extcpu), m); + decode_dram_ecc(cpu_data(m->extcpu).node_id, m); } static inline void amd_decode_err_code(u16 ec) -- 2.25.1
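To make the mismatch concrete, here is a small userspace sketch built from the 2-socket Naples example posted earlier in this thread (eight dies, firmware configured for a single NUMA node). The table values are copied from that example; the struct and the lookup helpers are hypothetical stand-ins for cpu_to_node() and cpu_data(cpu).node_id, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* One row per sampled CPU from the Naples example in this thread. */
struct cpu_topo {
	int cpu;
	int phys_proc_id;	/* socket number */
	int node_id;		/* AMD NodeId, i.e. the physical die */
	int numa_node;		/* what cpu_to_node() reported */
};

static const struct cpu_topo naples_example[] = {
	{  0, 0, 0, 0 }, {  8, 0, 1, 0 }, { 16, 0, 2, 0 }, { 24, 0, 3, 0 },
	{ 32, 1, 4, 0 }, { 40, 1, 5, 0 }, { 48, 1, 6, 0 }, { 56, 1, 7, 0 },
};

static const struct cpu_topo *find_cpu(int cpu)
{
	size_t i;

	for (i = 0; i < sizeof(naples_example) / sizeof(naples_example[0]); i++)
		if (naples_example[i].cpu == cpu)
			return &naples_example[i];
	return NULL;
}

/* Hypothetical stand-in for the old input: the logical NUMA node. */
int id_via_cpu_to_node(int cpu)
{
	const struct cpu_topo *t = find_cpu(cpu);

	return t ? t->numa_node : -1;
}

/* Hypothetical stand-in for the new input: the physical NodeId. */
int id_via_node_id(int cpu)
{
	const struct cpu_topo *t = find_cpu(cpu);

	return t ? t->node_id : -1;
}

/* Usage example: CPU 40 sits on die 5 but reports NUMA node 0. */
void node_mismatch_example(void)
{
	assert(id_via_cpu_to_node(40) == 0);
	assert(id_via_node_id(40) == 5);
}
```

With the old input, an error reported on CPU 40 would be decoded against die 0's Data Fabric instance, which is the failure mode the commit message describes.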
Re: [PATCH v2 1/2] cper, apei, mce: Pass x86 CPER through the MCA handling chain
On Fri, Aug 28, 2020 at 03:33:31PM -0500, Smita Koralahalli wrote:
...
> +int apei_mce_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
> +{
> +	const u64 *i_mce = ((const void *) (ctx_info + 1));
> +	unsigned int cpu;
> +	struct mce m;
> +
> +	if (!boot_cpu_has(X86_FEATURE_SMCA))
> +		return -EINVAL;
> +

This function is called on any context type, but it can only decode "MSR"
types that follow the MCAX register layout used on Scalable MCA systems. So
I think there should be a couple of checks added:

1) Context type is "MSR".
2) Register layout follows what is expected below. There's no explicit way
   to do this, since the data is implementation-specific. But at least
   there can be a check that the starting MSR address matches the first
   expected register: Bank's MCA_STATUS in MCAX space (0xC0002XX1).

   For example: (ctx_info->msr_addr & 0xC0002001) == 0xC0002001

   The raw value in the example should be defined with a name.

> +	mce_setup(&m);
> +
> +	m.extcpu = -1;
> +	m.socketid = -1;
> +
> +	for_each_possible_cpu(cpu) {
> +		if (cpu_data(cpu).initial_apicid == lapic_id) {
> +			m.extcpu = cpu;
> +			m.socketid = cpu_data(m.extcpu).phys_proc_id;
> +			break;
> +		}
> +	}
> +
> +	m.apicid = lapic_id;
> +	m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
> +	m.status = *i_mce;
> +	m.addr = *(i_mce + 1);
> +	m.misc = *(i_mce + 2);
> +	/* Skipping MCA_CONFIG */
> +	m.ipid = *(i_mce + 4);
> +	m.synd = *(i_mce + 5);
> +
> +	mce_log(&m);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(apei_mce_report_x86_error);
> +

Thanks,
Yazen
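The two checks suggested in the review above could look roughly like the sketch below. It is standalone userspace C, not kernel code. The names CTX_TYPE_MSR, MCAX_STATUS_MATCH and struct ctx_info_sketch are made up for illustration, and the value 1 for the "MSR" context type is an assumption taken from the CPER IA32/X64 context definitions, not from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: the review only asks that the raw value get a name. */
#define CTX_TYPE_MSR		1		/* assumed CPER "MSR" context type */
#define MCAX_STATUS_MATCH	0xC0002001u	/* MCA_STATUS in MCAX space: 0xC0002XX1 */

/* Pared-down stand-in for struct cper_ia_proc_ctx; only the fields used here. */
struct ctx_info_sketch {
	uint16_t reg_ctx_type;
	uint32_t msr_addr;
};

/* True if the context holds MSRs and starts at some bank's MCA_STATUS. */
static bool ctx_is_smca_status(const struct ctx_info_sketch *ctx)
{
	if (ctx->reg_ctx_type != CTX_TYPE_MSR)
		return false;

	return (ctx->msr_addr & MCAX_STATUS_MATCH) == MCAX_STATUS_MATCH;
}

/* Bank number as derived in the quoted patch: bits [11:4] of the address. */
static unsigned int ctx_bank(const struct ctx_info_sketch *ctx)
{
	return (ctx->msr_addr >> 4) & 0xFF;
}
```

With a check like this in front, the function would bail out with -EINVAL before touching the register array whenever the context is not an MSR dump that starts at MCA_STATUS.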
[PATCH v2] x86/mce: Increase maximum number of banks to 64
From: Akshay Gupta ...because future AMD systems will support up to 64 MCA banks per CPU. MAX_NR_BANKS is used to allocate a number of data structures, and it is used as a ceiling for values read from MCG_CAP[Count]. Therefore, this change will have no functional effect on existing systems with 32 or fewer MCA banks per CPU. However, this will increase the size of the following structures. Global bitmaps: - core.c / mce_banks_ce_disabled - core.c / all_banks - core.c / valid_banks - core.c / toclear - Total: 32 new bits * 4 bitmaps = 16 new bytes Per-CPU bitmaps: - core.c / mce_poll_banks - intel.c / mce_banks_owned - Total: 32 new bits * 2 bitmaps = 8 new bytes The bitmaps are arrays of longs. So this change will only affect 32-bit execution, since there will be one additional long used. There will be no additional memory use on 64-bit execution, because the size of long is 64 bits. Global structs: - amd.c / struct smca_bank smca_banks[]: 16 bytes per bank - core.c / struct mce_bank_dev mce_bank_devs[]: 56 bytes per bank - Total: 32 new banks * (16 + 56) bytes = 2304 new bytes Per-CPU structs: - core.c / struct mce_bank mce_banks_array[]: 16 bytes per bank - Total: 32 new banks * 16 bytes = 512 new bytes 32-bit Total global size increase: 2320 bytes Total per-CPU size increase: 520 bytes 64-bit Total global size increase: 2304 bytes Total per-CPU size increase: 512 bytes This additional memory should still fit within the existing .data section of the kernel binary. However, in the case where it doesn't fit, an additional page (4kB) of memory will be added to the binary to accommodate the extra data. Signed-off-by: Akshay Gupta [ Adjust commit message and code comment. ] Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200820170624.1855825-1-yazen.ghan...@amd.com v1->v2: * Update commit message with discussion details from review. 
arch/x86/include/asm/mce.h | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 6adced6e7dd3..109af5c7f515 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -200,12 +200,8 @@ void mce_setup(struct mce *m); void mce_log(struct mce *m); DECLARE_PER_CPU(struct device *, mce_device); -/* - * Maximum banks number. - * This is the limit of the current register layout on - * Intel CPUs. - */ -#define MAX_NR_BANKS 32 +/* Maximum number of MCA banks per CPU. */ +#define MAX_NR_BANKS 64 #ifdef CONFIG_X86_MCE_INTEL void mce_intel_feature_init(struct cpuinfo_x86 *c); -- 2.25.1
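The size totals in the commit message can be reproduced with a small standalone calculation. This is only a sketch: the helper names are made up, and the per-bank struct sizes (16 and 56 bytes) are taken from the commit message as-is and may differ with kernel configuration.

```c
#include <stddef.h>

/* Bytes used by a bitmap of nbits stored as an array of longs. */
static size_t bitmap_bytes(size_t nbits, size_t long_bytes)
{
	size_t bits_per_long = long_bytes * 8;
	size_t nlongs = (nbits + bits_per_long - 1) / bits_per_long;

	return nlongs * long_bytes;
}

/* Growth of global data: 4 bitmaps plus smca_banks[] and mce_bank_devs[]. */
static size_t global_growth(size_t old_banks, size_t new_banks, size_t long_bytes)
{
	size_t bitmaps = 4 * (bitmap_bytes(new_banks, long_bytes) -
			      bitmap_bytes(old_banks, long_bytes));
	size_t structs = (new_banks - old_banks) * (16 + 56);

	return bitmaps + structs;
}

/* Growth of per-CPU data: 2 bitmaps plus mce_banks_array[]. */
static size_t percpu_growth(size_t old_banks, size_t new_banks, size_t long_bytes)
{
	size_t bitmaps = 2 * (bitmap_bytes(new_banks, long_bytes) -
			      bitmap_bytes(old_banks, long_bytes));
	size_t structs = (new_banks - old_banks) * 16;

	return bitmaps + structs;
}
```

Passing long_bytes = 4 models 32-bit execution and long_bytes = 8 models 64-bit execution, where a single long already covers 64 banks and the bitmaps do not grow.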
Re: [PATCH] x86/mce: Increase maximum number of banks to 64
On Thu, Aug 20, 2020 at 06:15:15PM +0000, Luck, Tony wrote:
> >> How much does vmlinux size grow with your change?
> >>
> >
> > It seems to get smaller.
> >
> > -rwxrwxr-x 1 yghannam yghannam 807634088 Aug 20 17:51 vmlinux-32banks
> > -rwxrwxr-x 1 yghannam yghannam 807634072 Aug 20 17:50 vmlinux-64banks
>
> You need to run:
>
> $ size vmlinux
>     text     data      bss      dec     hex filename
> 20334755 12569682 14798924 47703361 2d7e541 vmlinux
>
> Likely the extra space is added to the third element ("bss"). That doesn't show
> up in the vmlinux file, but does add to memory footprint while running.

Thanks. Yeah, they're identical:

    text     data     bss      dec     hex filename
15710076 13519306 5398528 34627910 2106146 vmlinux-32banks
15710076 13519306 5398528 34627910 2106146 vmlinux-64banks

I did a quick audit of the statically allocated data structures which use
MAX_NR_BANKS.

Global bitmaps:
- core.c / mce_banks_ce_disabled
- core.c / all_banks
- core.c / valid_banks
- core.c / toclear
- Total: 32 new bits * 4 bitmaps = 16 new bytes

Per-CPU bitmaps:
- core.c / mce_poll_banks
- intel.c / mce_banks_owned
- Total: 32 new bits * 2 bitmaps = 8 new bytes

The bitmaps are arrays of longs. So this change will only affect 32-bit
execution (I assume), since there will be one additional long used. There
will be no additional memory use on 64-bit execution, because the size of
long is 64 bits.

Global structs:
- amd.c / struct smca_bank smca_banks[]: 16 bytes per bank
- core.c / struct mce_bank_dev mce_bank_devs[]: 56 bytes per bank
- Total: 32 new banks * (16 + 56) bytes = 2304 new bytes

Per-CPU structs:
- core.c / struct mce_bank mce_banks_array[]: 16 bytes per bank
- Total: 32 new banks * 16 bytes = 512 new bytes

32-bit
Total global size increase: 2320 bytes
Total per-CPU size increase: 520 bytes

64-bit
Total global size increase: 2304 bytes
Total per-CPU size increase: 512 bytes

Is this okay?

Thanks,
Yazen
Re: [PATCH] x86/mce: Increase maximum number of banks to 64
On Thu, Aug 20, 2020 at 07:15:18PM +0200, Borislav Petkov wrote: > On Thu, Aug 20, 2020 at 05:06:24PM +0000, Yazen Ghannam wrote: > > From: Akshay Gupta > > > > ...because future AMD systems will support up to 64 MCA banks per CPU. > > > > MAX_NR_BANKS is used to allocate a number of data structures, and it is > > used as a ceiling for values read from MCG_CAP[Count]. Therefore, this > > change will have no functional effect on existing systems with 32 or > > fewer MCA banks per CPU. > > Of course it will, grep for MAX_NR_BANKS and look at all those bitmaps > and arrays which get defined with MAX_NR_BANKS size. With your change, > they will double in size. > > How much does vmlinux size grow with your change? > It seems to get smaller. -rwxrwxr-x 1 yghannam yghannam 807634088 Aug 20 17:51 vmlinux-32banks -rwxrwxr-x 1 yghannam yghannam 807634072 Aug 20 17:50 vmlinux-64banks Any ideas? Maybe there's some alignment change? Or a build issue on my end? Thanks, Yazen
[PATCH] x86/mce: Increase maximum number of banks to 64
From: Akshay Gupta ...because future AMD systems will support up to 64 MCA banks per CPU. MAX_NR_BANKS is used to allocate a number of data structures, and it is used as a ceiling for values read from MCG_CAP[Count]. Therefore, this change will have no functional effect on existing systems with 32 or fewer MCA banks per CPU. Signed-off-by: Akshay Gupta [ Adjust commit message and code comment. ] Signed-off-by: Yazen Ghannam --- arch/x86/include/asm/mce.h | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 6adced6e7dd3..109af5c7f515 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -200,12 +200,8 @@ void mce_setup(struct mce *m); void mce_log(struct mce *m); DECLARE_PER_CPU(struct device *, mce_device); -/* - * Maximum banks number. - * This is the limit of the current register layout on - * Intel CPUs. - */ -#define MAX_NR_BANKS 32 +/* Maximum number of MCA banks per CPU. */ +#define MAX_NR_BANKS 64 #ifdef CONFIG_X86_MCE_INTEL void mce_intel_feature_init(struct cpuinfo_x86 *c); -- 2.25.1
[tip: ras/core] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The following commit has been merged into the ras/core branch of tip: Commit-ID: 368d1887200d68075c064a62a9aa191168cf1eed Gitweb: https://git.kernel.org/tip/368d1887200d68075c064a62a9aa191168cf1eed Author:Yazen Ghannam AuthorDate:Mon, 20 Jul 2020 14:53:53 Committer: Borislav Petkov CommitterDate: Thu, 20 Aug 2020 10:34:38 +02:00 x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid error codes on a system. However, this is unnecessary after a few product releases because the hardware will only report valid error codes. Thus, there's no need for it with future systems. Remove the xec_bitmap field and all references to it. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/20200720145353.43924-1-yazen.ghan...@amd.com --- arch/x86/include/asm/mce.h| 1 +- arch/x86/kernel/cpu/mce/amd.c | 44 +- drivers/edac/mce_amd.c| 4 +--- 3 files changed, 23 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index cf50382..6adced6 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -328,7 +328,6 @@ enum smca_bank_types { struct smca_hwid { unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */ u32 hwid_mcatype; /* (hwid,mcatype) tuple */ - u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */ u8 count; /* Number of instances. 
*/ }; diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 99be063..0c6b02d 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -132,49 +132,49 @@ static enum smca_bank_types smca_get_bank_type(unsigned int bank) } static struct smca_hwid smca_hwid_mcatypes[] = { - /* { bank_type, hwid_mcatype, xec_bitmap } */ + /* { bank_type, hwid_mcatype } */ /* Reserved type */ - { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 }, + { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0)}, /* ZN Core (HWID=0xB0) MCA types */ - { SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1F }, - { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10), 0xFF }, - { SMCA_IF, HWID_MCATYPE(0xB0, 0x1), 0x3FFF }, - { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF }, - { SMCA_DE, HWID_MCATYPE(0xB0, 0x3), 0x1FF }, + { SMCA_LS, HWID_MCATYPE(0xB0, 0x0)}, + { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10) }, + { SMCA_IF, HWID_MCATYPE(0xB0, 0x1)}, + { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2)}, + { SMCA_DE, HWID_MCATYPE(0xB0, 0x3)}, /* HWID 0xB0 MCATYPE 0x4 is Reserved */ - { SMCA_EX, HWID_MCATYPE(0xB0, 0x5), 0xFFF }, - { SMCA_FP, HWID_MCATYPE(0xB0, 0x6), 0x7F }, - { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF }, + { SMCA_EX, HWID_MCATYPE(0xB0, 0x5)}, + { SMCA_FP, HWID_MCATYPE(0xB0, 0x6)}, + { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7)}, /* Data Fabric MCA types */ - { SMCA_CS, HWID_MCATYPE(0x2E, 0x0), 0x1FF }, - { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0x1F }, - { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF }, + { SMCA_CS, HWID_MCATYPE(0x2E, 0x0)}, + { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1)}, + { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2)}, /* Unified Memory Controller MCA type */ - { SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0xFF }, + { SMCA_UMC, HWID_MCATYPE(0x96, 0x0)}, /* Parameter Block MCA type */ - { SMCA_PB, HWID_MCATYPE(0x05, 0x0), 0x1 }, + { SMCA_PB, HWID_MCATYPE(0x05, 0x0)}, /* Platform Security Processor MCA type */ - { SMCA_PSP, HWID_MCATYPE(0xFF, 0x0), 0x1 }, - { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1), 0x3 }, + { SMCA_PSP, 
HWID_MCATYPE(0xFF, 0x0)}, + { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1)}, /* System Management Unit MCA type */ - { SMCA_SMU, HWID_MCATYPE(0x01, 0x0), 0x1 }, - { SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1), 0x7FF }, + { SMCA_SMU, HWID_MCATYPE(0x01, 0x0)}, + { SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1)}, /* Microprocessor 5 Unit MCA type */ - { SMCA_MP5, HWID_MCATYPE(0x01, 0x2), 0x3FF }, + { SMCA_MP5, HWID_MCATYPE(0x01, 0x2)}, /* Northbridge IO Unit MCA type */ - { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F }, + { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0)}, /* PCI Express Unit MCA type */ - { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0),
Re: [PATCH 2/2] x86/MCE/AMD Support new memory interleaving schemes during address translation
On Sat, Aug 15, 2020 at 11:13:36AM +0200, Ingo Molnar wrote:
>
> * Yazen Ghannam wrote:
>
> > +	/* Read D18F1x208 (System Fabric ID Mask 0). */
> > +	if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp))
> > +		goto out_err;
> > +
> > +	/* Determine if system is a legacy Data Fabric type. */
> > +	legacy_df = !(tmp & 0xFF);
>
> 1)
>
> I see this pattern in a lot of places in the code, first the magic
> constant 0x208 is explained in a comment, then it is *repeated* and used
> in the code...
>
> How about introducing an obviously named enum for it instead, which
> would then be self-documenting, saving the comment and removing magic
> numbers:
>
>	if (amd_df_indirect_read(nid, 1, AMD_REG_FAB_ID, umc, &reg_fab_id))
>		goto out_err;
>
> (The symbolic name should be something better, I just guessed
> something quickly.)
>
> Please clean this up in a separate patch, not part of the already
> large patch that introduces a new feature.
>

Okay, will do.

> 2)
>
> 'tmp & 0xFF' is some sort of fabric version ID value, with a value of
> 0 denoting legacy (pre-Rome) systems, right?
>
> How about making that explicit:
>
>	df_version = reg_fab_id & 0xFF;
>
> I'm pretty sure such a version ID might come handy later on, should
> there be quirks or new capabilities with the newer systems ...
>

Not exactly. The register field is Read-as-Zero on legacy systems. The
versions are 2 and 3 where 2 is the "legacy" version. But I can make this
change.

For example:
	df_version = reg_fab_id & 0xFF ? 3 : 2;

> >  			ret_addr -= hi_addr_offset;
> > @@ -728,23 +740,31 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
> >  	}
> >
> >  	lgcy_mmio_hole_en = tmp & BIT(1);
> > -	intlv_num_chan	  = (tmp >> 4) & 0xF;
> > -	intlv_addr_sel	  = (tmp >> 8) & 0x7;
> > -	dram_base_addr	  = (tmp & GENMASK_ULL(31, 12)) << 16;
> >
> > -	/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
> > -	if (intlv_addr_sel > 3) {
> > -		pr_err("%s: Invalid interleave address select %d.\n",
> > -			__func__, intlv_addr_sel);
> > -		goto out_err;
> > +	if (legacy_df) {
> > +		intlv_num_chan	  = (tmp >> 4) & 0xF;
> > +		intlv_addr_sel	  = (tmp >> 8) & 0x7;
> > +	} else {
> > +		intlv_num_chan	  = (tmp >> 2) & 0xF;
> > +		intlv_num_dies	  = (tmp >> 6) & 0x3;
> > +		intlv_num_sockets = (tmp >> 8) & 0x1;
> > +		intlv_addr_sel	  = (tmp >> 9) & 0x7;
> >  	}
> >
> > +	dram_base_addr	  = (tmp & GENMASK_ULL(31, 12)) << 16;
> > +
> >  	/* Read D18F0x114 (DramLimitAddress). */
> >  	if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp))
> >  		goto out_err;
> >
> > -	intlv_num_sockets = (tmp >> 8) & 0x1;
> > -	intlv_num_dies	  = (tmp >> 10) & 0x3;
> > +	if (legacy_df) {
> > +		intlv_num_sockets = (tmp >> 8) & 0x1;
> > +		intlv_num_dies	  = (tmp >> 10) & 0x3;
> > +		dst_fabric_id	  = tmp & 0xFF;
> > +	} else {
> > +		dst_fabric_id	  = tmp & 0x3FF;
> > +	}
> > +
> >  	dram_limit_addr	  = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0);
>
> Could we please structure this code in a bit more readable fashion?
>
> 1)
>
> Such as not using the meaningless 'tmp' variable name to first read
> out DramOffset, then DramLimitAddress?
>

IIRC, the "tmp" variable came to be in the review for the patch which added
this function. There are a few places where the register name and the value
needed have the same or similar name. For example, DramLimitAddress is the
register name and also a field within the register. So we'd have a
reg_dram_limit_addr and val_dram_limit_addr. The "tmp" variable removes the
need for the "reg_" variable.

But I think this can be reworked so that the final variable name is reused.
The register value can be read into the variable, extra fields can be
extracted from it, and the final value can be adjusted as needed.

> How about naming them a bit more obviously, and retrieving them in a
> single step:
>
>	if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &reg_dram_off))
>		goto out_err;
>
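To make the version-dependent field layout discussed above concrete, here is a standalone sketch in userspace C. The bit positions are copied from the quoted hunk and the 2/3 versioning from the reply. The helper names (get_bits, df_version, decode_dram_base) and the struct are made up for illustration and are not the names used in the eventual cleanup patches.

```c
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit register value. */
static inline uint32_t get_bits(uint32_t reg, unsigned int hi, unsigned int lo)
{
	uint64_t mask = (UINT64_C(1) << (hi - lo + 1)) - 1;

	return (uint32_t)((reg >> lo) & mask);
}

/* The low byte of D18F1x208 reads as zero on legacy (version 2) systems. */
static unsigned int df_version(uint32_t reg_fab_id_mask)
{
	return (reg_fab_id_mask & 0xFF) ? 3 : 2;
}

struct intlv_fields {
	uint8_t num_chan;
	uint8_t num_dies;
	uint8_t num_sockets;
	uint8_t addr_sel;
};

/*
 * Decode the interleave fields held in the DRAM base register value.
 * On version 2 the die and socket counts live in the limit register
 * instead, so they are left at zero here.
 */
static struct intlv_fields decode_dram_base(uint32_t reg, unsigned int version)
{
	struct intlv_fields f = { 0 };

	if (version == 2) {
		f.num_chan = get_bits(reg, 7, 4);
		f.addr_sel = get_bits(reg, 10, 8);
	} else {
		f.num_chan    = get_bits(reg, 5, 2);
		f.num_dies    = get_bits(reg, 7, 6);
		f.num_sockets = get_bits(reg, 8, 8);
		f.addr_sel    = get_bits(reg, 11, 9);
	}

	return f;
}
```

Keying the decode on a version number rather than a legacy_df boolean leaves room for a later layout without adding another flag, which seems to be the point of the df_version suggestion.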
[tip: ras/core] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The following commit has been merged into the ras/core branch of tip: Commit-ID: 5f2c67bd0f8a470a12c38a8786c42c043e100014 Gitweb: https://git.kernel.org/tip/5f2c67bd0f8a470a12c38a8786c42c043e100014 Author:Yazen Ghannam AuthorDate:Mon, 20 Jul 2020 14:53:53 Committer: Borislav Petkov CommitterDate: Tue, 18 Aug 2020 12:15:43 +02:00 x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid error codes on a system. However, this is unnecessary after a few product releases because the hardware will only report valid error codes. Thus, there's no need for it with future systems. Remove the xec_bitmap field and all references to it. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/20200720145353.43924-1-yazen.ghan...@amd.com --- arch/x86/include/asm/mce.h| 1 +- arch/x86/kernel/cpu/mce/amd.c | 44 +- drivers/edac/mce_amd.c| 4 +--- 3 files changed, 23 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index cf50382..6adced6 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -328,7 +328,6 @@ enum smca_bank_types { struct smca_hwid { unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */ u32 hwid_mcatype; /* (hwid,mcatype) tuple */ - u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */ u8 count; /* Number of instances. 
*/ }; diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 99be063..0c6b02d 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -132,49 +132,49 @@ static enum smca_bank_types smca_get_bank_type(unsigned int bank) } static struct smca_hwid smca_hwid_mcatypes[] = { - /* { bank_type, hwid_mcatype, xec_bitmap } */ + /* { bank_type, hwid_mcatype } */ /* Reserved type */ - { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 }, + { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0)}, /* ZN Core (HWID=0xB0) MCA types */ - { SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1F }, - { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10), 0xFF }, - { SMCA_IF, HWID_MCATYPE(0xB0, 0x1), 0x3FFF }, - { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF }, - { SMCA_DE, HWID_MCATYPE(0xB0, 0x3), 0x1FF }, + { SMCA_LS, HWID_MCATYPE(0xB0, 0x0)}, + { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10) }, + { SMCA_IF, HWID_MCATYPE(0xB0, 0x1)}, + { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2)}, + { SMCA_DE, HWID_MCATYPE(0xB0, 0x3)}, /* HWID 0xB0 MCATYPE 0x4 is Reserved */ - { SMCA_EX, HWID_MCATYPE(0xB0, 0x5), 0xFFF }, - { SMCA_FP, HWID_MCATYPE(0xB0, 0x6), 0x7F }, - { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF }, + { SMCA_EX, HWID_MCATYPE(0xB0, 0x5)}, + { SMCA_FP, HWID_MCATYPE(0xB0, 0x6)}, + { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7)}, /* Data Fabric MCA types */ - { SMCA_CS, HWID_MCATYPE(0x2E, 0x0), 0x1FF }, - { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0x1F }, - { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF }, + { SMCA_CS, HWID_MCATYPE(0x2E, 0x0)}, + { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1)}, + { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2)}, /* Unified Memory Controller MCA type */ - { SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0xFF }, + { SMCA_UMC, HWID_MCATYPE(0x96, 0x0)}, /* Parameter Block MCA type */ - { SMCA_PB, HWID_MCATYPE(0x05, 0x0), 0x1 }, + { SMCA_PB, HWID_MCATYPE(0x05, 0x0)}, /* Platform Security Processor MCA type */ - { SMCA_PSP, HWID_MCATYPE(0xFF, 0x0), 0x1 }, - { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1), 0x3 }, + { SMCA_PSP, 
HWID_MCATYPE(0xFF, 0x0)}, + { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1)}, /* System Management Unit MCA type */ - { SMCA_SMU, HWID_MCATYPE(0x01, 0x0), 0x1 }, - { SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1), 0x7FF }, + { SMCA_SMU, HWID_MCATYPE(0x01, 0x0)}, + { SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1)}, /* Microprocessor 5 Unit MCA type */ - { SMCA_MP5, HWID_MCATYPE(0x01, 0x2), 0x3FF }, + { SMCA_MP5, HWID_MCATYPE(0x01, 0x2)}, /* Northbridge IO Unit MCA type */ - { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F }, + { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0)}, /* PCI Express Unit MCA type */ - { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0),
Re: [PATCH] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
On Mon, Aug 17, 2020 at 11:40:07AM +0200, Borislav Petkov wrote:
> On Mon, Jul 20, 2020 at 02:53:53PM +0000, Yazen Ghannam wrote:
> > From: Yazen Ghannam
> >
> > The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
> > was intended to be used by the kernel to filter out invalid error codes
> > on a system. However, this is unnecessary because the hardware will only
> > report valid error codes.
>
> That's a kinda bold statement. :)
>

Yeah, I'm trying to keep "may" out of my vocabulary. :)

> Are you saying, you wanna trust verification and that check is totally
> useless?
>

I do. This check was added because I wasn't sure what to expect with this
new architectural extension. But after a few product releases, it has been
unnecessary. And I don't see a need for it with future systems.

Thanks,
Yazen
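For readers who have not seen the check being removed, a bitmap filter of this kind works roughly as sketched below. This is a standalone illustration with a made-up helper name, not the code deleted from drivers/edac/mce_amd.c. The 0x1F value in the usage note is the SMCA_LS entry from the removed table.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit N of xec_bitmap set means extended error code N is valid for the
 * bank type. Codes beyond the width of the bitmap are treated as invalid.
 */
static bool xec_is_valid(uint32_t xec_bitmap, unsigned int xec)
{
	if (xec >= 32)
		return false;

	return (xec_bitmap >> xec) & 1u;
}
```

With a bitmap of 0x1F, codes 0 through 4 pass and anything higher is filtered out. Dropping the field means the decoder relies on the description table bounds alone.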
Re: [PATCH 1/2] x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode
On Sat, Aug 15, 2020 at 10:42:12AM +0200, Ingo Molnar wrote: > > * Yazen Ghannam wrote: > > > From: Yazen Ghannam > > > > The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and > > later systems. This function is used in amd64_edac_mod to do > > system-specific decoding for DRAM ECC errors. The function takes a > > "NodeId" as a parameter. > > > > In AMD documentation, NodeId is used to identify a physical die in a > > system. This can be used to identify a node in the AMD_NB code and also > > it is used with umc_normaddr_to_sysaddr(). > > > > However, the input used for decode_dram_ecc() is currently the NUMA node > > of a logical CPU. In the default configuration, the NUMA node and > > physical die will be equivalent, so this doesn't have an impact. But the > > NUMA node configuration can be adjusted with optional memory > > interleaving schemes. This will cause the NUMA node enumeration to not > > match the physical die enumeration. The mismatch will cause the address > > translation function to fail or report incorrect results. > > > > Save the "NodeId" as a percpu value during init in AMD MCE code. Export > > a function to return the value which can be used from modules like > > edac_mce_amd. 
> > > > Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID") > > Signed-off-by: Yazen Ghannam > > --- > > arch/x86/include/asm/mce.h| 2 ++ > > arch/x86/kernel/cpu/mce/amd.c | 11 +++ > > drivers/edac/mce_amd.c| 2 +- > > 3 files changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h > > index cf503824529c..92527cc9ed06 100644 > > --- a/arch/x86/include/asm/mce.h > > +++ b/arch/x86/include/asm/mce.h > > @@ -343,6 +343,8 @@ extern struct smca_bank smca_banks[MAX_NR_BANKS]; > > extern const char *smca_get_long_name(enum smca_bank_types t); > > extern bool amd_mce_is_memory_error(struct mce *m); > > > > +extern u8 amd_cpu_to_node(unsigned int cpu); > > + > > extern int mce_threshold_create_device(unsigned int cpu); > > extern int mce_threshold_remove_device(unsigned int cpu); > > > > diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c > > index 99be063fcb1b..524edf81e287 100644 > > --- a/arch/x86/kernel/cpu/mce/amd.c > > +++ b/arch/x86/kernel/cpu/mce/amd.c > > @@ -202,6 +202,9 @@ static DEFINE_PER_CPU(unsigned int, bank_map); > > /* Map of banks that have more than MCA_MISC0 available. */ > > static DEFINE_PER_CPU(u32, smca_misc_banks_map); > > > > +/* CPUID_Fn801E_ECX[NodeId] used to identify a physical node/die. 
*/ > > +static DEFINE_PER_CPU(u8, node_id); > > + > > static void amd_threshold_interrupt(void); > > static void amd_deferred_error_interrupt(void); > > > > @@ -233,6 +236,12 @@ static void smca_set_misc_banks_map(unsigned int bank, > > unsigned int cpu) > > > > } > > > > +u8 amd_cpu_to_node(unsigned int cpu) > > +{ > > + return per_cpu(node_id, cpu); > > +} > > +EXPORT_SYMBOL_GPL(amd_cpu_to_node); > > + > > static void smca_configure(unsigned int bank, unsigned int cpu) > > { > > unsigned int i, hwid_mcatype; > > @@ -240,6 +249,8 @@ static void smca_configure(unsigned int bank, unsigned > > int cpu) > > u32 high, low; > > u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank); > > > > + this_cpu_write(node_id, cpuid_ecx(0x801e) & 0xFF); > > So we already have this magic number used for a similar purpose, in > amd_get_topology(): > > cpuid(0x801e, &eax, &ebx, &ecx, &edx); > > node_id = ecx & 0xff; > Yes, that's right. I did have a patch that tried to leverage the existing topology variables. But it wasn't working for all targeted systems. So I thought to have something local to the AMD MCA code in order to avoid messing with the topology code just for this feature. > Firstly, could we please at least give 0x801e a proper symbolic > name, use it in hygon.c too (which AFAIK is derived from AMD anyway), > and then use it in these new patches? > Sure, but all places that use a symbolic name for a CPUID leaf define it locally. Should the same be done here? Or should there be common place for all the defines like in or maybe a new header file? > Secondly, why not stick node_id into struct cpuinfo_x86, where the MCA > code can then use it without having to introduce a new percpu data > structure? > I think this would be the simplest approach. I can write it. Also, the amd_get_nb_id() function could then be replaced with this. 
> There's also the underlying assumption that there's only ever going to > be 256 nodes, which limitation I'm sure we'll hear about in a couple > of years as not being quite enough. ;-) > Yeah, CPU topology seems very fractal-like. :) > So less hardcoding and more generalizations please. > Will do. Thanks, Yazen
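The two suggestions in this exchange (a symbolic name for the CPUID leaf, and keeping NodeId next to the other per-CPU topology data) could be sketched as follows. This is standalone C with made-up names. The leaf value 0x8000001e is an assumption: it is what the leaf shown abbreviated as 0x801e in this archive is understood to be. The struct is only a stand-in for struct cpuinfo_x86.

```c
#include <stdint.h>

/* Assumed full value of the leaf that appears as 0x801e in the thread. */
#define CPUID_LEAF_AMD_NODE_ID	0x8000001eu
/* NodeId is taken from the low byte of ECX, as in the quoted code. */
#define AMD_NODE_ID_MASK	0xffu

/* Stand-in for the one field that would be added to struct cpuinfo_x86. */
struct cpuinfo_sketch {
	uint8_t node_id;
};

/* Store the physical NodeId given the ECX value returned by the leaf. */
static void save_node_id(struct cpuinfo_sketch *c, uint32_t ecx)
{
	c->node_id = (uint8_t)(ecx & AMD_NODE_ID_MASK);
}
```

A consumer such as the EDAC decoder would then read the stored node_id for the CPU that logged the error instead of asking for its NUMA node.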
[PATCH 1/2] x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode
From: Yazen Ghannam The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and later systems. This function is used in amd64_edac_mod to do system-specific decoding for DRAM ECC errors. The function takes a "NodeId" as a parameter. In AMD documentation, NodeId is used to identify a physical die in a system. This can be used to identify a node in the AMD_NB code and also it is used with umc_normaddr_to_sysaddr(). However, the input used for decode_dram_ecc() is currently the NUMA node of a logical CPU. In the default configuration, the NUMA node and physical die will be equivalent, so this doesn't have an impact. But the NUMA node configuration can be adjusted with optional memory interleaving schemes. This will cause the NUMA node enumeration to not match the physical die enumeration. The mismatch will cause the address translation function to fail or report incorrect results. Save the "NodeId" as a percpu value during init in AMD MCE code. Export a function to return the value which can be used from modules like edac_mce_amd. 
Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID") Signed-off-by: Yazen Ghannam --- arch/x86/include/asm/mce.h| 2 ++ arch/x86/kernel/cpu/mce/amd.c | 11 +++ drivers/edac/mce_amd.c| 2 +- 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index cf503824529c..92527cc9ed06 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -343,6 +343,8 @@ extern struct smca_bank smca_banks[MAX_NR_BANKS]; extern const char *smca_get_long_name(enum smca_bank_types t); extern bool amd_mce_is_memory_error(struct mce *m); +extern u8 amd_cpu_to_node(unsigned int cpu); + extern int mce_threshold_create_device(unsigned int cpu); extern int mce_threshold_remove_device(unsigned int cpu); diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 99be063fcb1b..524edf81e287 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -202,6 +202,9 @@ static DEFINE_PER_CPU(unsigned int, bank_map); /* Map of banks that have more than MCA_MISC0 available. */ static DEFINE_PER_CPU(u32, smca_misc_banks_map); +/* CPUID_Fn801E_ECX[NodeId] used to identify a physical node/die. 
*/ +static DEFINE_PER_CPU(u8, node_id); + static void amd_threshold_interrupt(void); static void amd_deferred_error_interrupt(void); @@ -233,6 +236,12 @@ static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu) } +u8 amd_cpu_to_node(unsigned int cpu) +{ + return per_cpu(node_id, cpu); +} +EXPORT_SYMBOL_GPL(amd_cpu_to_node); + static void smca_configure(unsigned int bank, unsigned int cpu) { unsigned int i, hwid_mcatype; @@ -240,6 +249,8 @@ static void smca_configure(unsigned int bank, unsigned int cpu) u32 high, low; u32 smca_config = MSR_AMD64_SMCA_MCx_CONFIG(bank); + this_cpu_write(node_id, cpuid_ecx(0x801e) & 0xFF); + /* Set appropriate bits in MCA_CONFIG */ if (!rdmsr_safe(smca_config, &low, &high)) { /* diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index 325aedf46ff2..9476097d0fdb 100644 --- a/drivers/edac/mce_amd.c +++ b/drivers/edac/mce_amd.c @@ -996,7 +996,7 @@ static void decode_smca_error(struct mce *m) } if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc) - decode_dram_ecc(cpu_to_node(m->extcpu), m); + decode_dram_ecc(amd_cpu_to_node(m->extcpu), m); } static inline void amd_decode_err_code(u16 ec) -- 2.25.1
[PATCH 2/2] x86/MCE/AMD Support new memory interleaving schemes during address translation
From: Muralidhara M K Add support for new memory interleaving schemes used in current AMD systems. Check if the system is using a current Data Fabric version or a legacy version as some bit and register definitions have changed. Tested on AMD reference platforms with the following memory interleaving options. Naples - None - Channel - Die - Socket Rome (NPS = Nodes per Socket) - None - NPS0 - NPS1 - NPS2 - NPS4 The fixes tag refers to the commit that allows amd64_edac_mod to load on Rome systems. The module may report an incorrect system address on Rome systems depending on the interleaving option used. Fixes: 6e846239e548 ("EDAC/amd64: Add Family 17h Model 30h PCI IDs") Signed-off-by: Muralidhara M K Co-developed-by: Naveen Krishna Chtradhi Signed-off-by: Naveen Krishna Chtradhi Co-developed-by: Yazen Ghannam Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/mce/amd.c | 237 +++--- 1 file changed, 188 insertions(+), 49 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 524edf81e287..a687aa898fef 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -689,18 +689,25 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { u64 dram_base_addr, dram_limit_addr, dram_hole_base; + /* We start from the normalized address */ u64 ret_addr = norm_addr; u32 tmp; - u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask; + bool hash_enabled = false, split_normalized = false, legacy_df = false; + u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets; - u8 intlv_addr_sel, intlv_addr_bit; - u8 num_intlv_bits, hashed_bit; + u8 intlv_addr_sel, intlv_addr_bit, num_intlv_bits; + u8 cs_mask, cs_id = 0, dst_fabric_id = 0; u8 lgcy_mmio_hole_en, base = 0; - u8 cs_mask, cs_id = 0; - bool hash_enabled = false; + + /* Read D18F1x208 (System Fabric ID Mask 0). 
*/ + if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp)) + goto out_err; + + /* Determine if system is a legacy Data Fabric type. */ + legacy_df = !(tmp & 0xFF); /* Read D18F0x1B4 (DramOffset), check if base 1 is used. */ if (amd_df_indirect_read(nid, 0, 0x1B4, umc, &tmp)) @@ -708,7 +715,12 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) /* Remove HiAddrOffset from normalized address, if enabled: */ if (tmp & BIT(0)) { - u64 hi_addr_offset = (tmp & GENMASK_ULL(31, 20)) << 8; + u8 hi_addr_offset_lsb = legacy_df ? 20 : 12; + u64 hi_addr_offset = tmp & GENMASK_ULL(31, hi_addr_offset_lsb); + + /* Align to bit 28 regardless of the LSB used. */ + hi_addr_offset >>= hi_addr_offset_lsb; + hi_addr_offset <<= 28; if (norm_addr >= hi_addr_offset) { ret_addr -= hi_addr_offset; @@ -728,23 +740,31 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) } lgcy_mmio_hole_en = tmp & BIT(1); - intlv_num_chan= (tmp >> 4) & 0xF; - intlv_addr_sel= (tmp >> 8) & 0x7; - dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16; - /* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */ - if (intlv_addr_sel > 3) { - pr_err("%s: Invalid interleave address select %d.\n", - __func__, intlv_addr_sel); - goto out_err; + if (legacy_df) { + intlv_num_chan= (tmp >> 4) & 0xF; + intlv_addr_sel= (tmp >> 8) & 0x7; + } else { + intlv_num_chan= (tmp >> 2) & 0xF; + intlv_num_dies= (tmp >> 6) & 0x3; + intlv_num_sockets = (tmp >> 8) & 0x1; + intlv_addr_sel= (tmp >> 9) & 0x7; } + dram_base_addr= (tmp & GENMASK_ULL(31, 12)) << 16; + /* Read D18F0x114 (DramLimitAddress). 
*/ if (amd_df_indirect_read(nid, 0, 0x114 + (8 * base), umc, &tmp)) goto out_err; - intlv_num_sockets = (tmp >> 8) & 0x1; - intlv_num_dies= (tmp >> 10) & 0x3; + if (legacy_df) { + intlv_num_sockets = (tmp >> 8) & 0x1; + intlv_num_dies= (tmp >> 10) & 0x3; + dst_fabric_id = tmp & 0xFF; + } else { + dst_fabric_id = tmp & 0x3FF; + } + dram_limit_addr = ((tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0); intlv_addr_bit = intlv_addr_sel + 8; @@ -757,8 +777,27 @@ int umc_nor
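Not part of the patch: a small user-space sketch of the HiAddrOffset handling in the hunk above, for readers who want to check the arithmetic. Only the LSB choice (20 on a legacy Data Fabric, 12 otherwise) and the realignment to bit 28 come from the diff. genmask_u64() is a stand-in for the kernel's GENMASK_ULL(), and demo_hi_addr_offset() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for GENMASK_ULL(hi, lo): a mask with bits [hi:lo] set. */
static uint64_t genmask_u64(unsigned int hi, unsigned int lo)
{
	assert(hi < 64 && lo <= hi);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/*
 * Hypothetical helper: extract the HiAddrOffset field from the raw
 * DramOffset register value and align it to address bit 28.
 * The enable bit (bit 0) is not checked here.
 */
static uint64_t demo_hi_addr_offset(uint32_t reg, bool legacy_df)
{
	unsigned int lsb = legacy_df ? 20 : 12;
	uint64_t off = reg & genmask_u64(31, lsb);

	/* Align to bit 28 regardless of the LSB used. */
	off >>= lsb;
	off <<= 28;

	return off;
}
```

For the legacy layout this gives the same result as the old "(tmp & GENMASK_ULL(31, 20)) << 8", since shifting right by 20 and left by 28 is a net left shift of 8.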
[PATCH 0/2] AMD MCA Address Translation Updates
From: Yazen Ghannam

This patchset includes updates for the MCA Address Translation process on
recent AMD systems.

Patch 1:
Fixes an input to the address translation function. The translation
requires a physical Die ID (NodeId in AMD documentation) rather than a
logical NUMA node ID. This is because the physical and logical nodes may
not always match.

Patch 2:
Add translation support for new memory interleaving options available in
Rome systems. The patch is based on the latest AMD reference code for the
address translation.

Both patches have fixes tags, since they do fix some issues. However,
stable is not copied. Patch 1 needs some fixups to apply. Patch 2 is large
and doesn't seem to meet the requirements for stable, though comments are
welcome on whether it should be applied.

Thanks,
Yazen

Muralidhara M K (1):
  x86/MCE/AMD Support new memory interleaving schemes during address translation

Yazen Ghannam (1):
  x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode

 arch/x86/include/asm/mce.h    |   2 +
 arch/x86/kernel/cpu/mce/amd.c | 248 +++---
 drivers/edac/mce_amd.c        |   2 +-
 3 files changed, 202 insertions(+), 50 deletions(-)

-- 
2.25.1
Re: [PATCH] x86/MCE/AMD, EDAC/mce_amd
On Sun, Aug 09, 2020 at 12:35:59PM +0800, Feng zhou wrote:
> From: zhoufeng
>
> The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
> later systems. This function is used in amd64_edac_mod to do
> system-specific decoding for DRAM ECC errors. The function takes a
> "NodeId" as a parameter.
>
> In AMD documentation, NodeId is used to identify a physical die in a
> system. This can be used to identify a node in the AMD_NB code and also
> it is used with umc_normaddr_to_sysaddr().
>
> However, the input used for decode_dram_ecc() is currently the NUMA node
> of a logical CPU. So this will cause the address translation function to
> fail or report incorrect results.
>
> Signed-off-by: zhoufeng
> ---
>  drivers/edac/mce_amd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
> index 325aedf46ff2..73c805113322 100644
> --- a/drivers/edac/mce_amd.c
> +++ b/drivers/edac/mce_amd.c
> @@ -996,7 +996,7 @@ static void decode_smca_error(struct mce *m)
>  	}
>
>  	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
> -		decode_dram_ecc(cpu_to_node(m->extcpu), m);
> +		decode_dram_ecc(topology_physical_package_id(m->extcpu), m);

This will break on Naples systems, because the NodeId and the physical
package ID will not match.

I can send a patch soon that will work for Naples, Rome, and later systems.

Thanks,
Yazen
[PATCH] x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
From: Yazen Ghannam The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid error codes on a system. However, this is unnecessary because the hardware will only report valid error codes. Remove the xec_bitmap field and all references to it. Signed-off-by: Yazen Ghannam --- arch/x86/include/asm/mce.h| 1 - arch/x86/kernel/cpu/mce/amd.c | 44 +-- drivers/edac/mce_amd.c| 4 +--- 3 files changed, 23 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 734ffe78a3d6..c18e87aeeccc 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -327,7 +327,6 @@ enum smca_bank_types { struct smca_hwid { unsigned int bank_type; /* Use with smca_bank_types for easy indexing. */ u32 hwid_mcatype; /* (hwid,mcatype) tuple */ - u32 xec_bitmap; /* Bitmap of valid ExtErrorCodes; current max is 21. */ u8 count; /* Number of instances. */ }; diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 327b85304cdd..a578df70768b 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -132,49 +132,49 @@ static enum smca_bank_types smca_get_bank_type(unsigned int bank) } static struct smca_hwid smca_hwid_mcatypes[] = { - /* { bank_type, hwid_mcatype, xec_bitmap } */ + /* { bank_type, hwid_mcatype } */ /* Reserved type */ - { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 }, + { SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0)}, /* ZN Core (HWID=0xB0) MCA types */ - { SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1F }, - { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10), 0xFF }, - { SMCA_IF, HWID_MCATYPE(0xB0, 0x1), 0x3FFF }, - { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF }, - { SMCA_DE, HWID_MCATYPE(0xB0, 0x3), 0x1FF }, + { SMCA_LS, HWID_MCATYPE(0xB0, 0x0)}, + { SMCA_LS_V2,HWID_MCATYPE(0xB0, 0x10) }, + { SMCA_IF, HWID_MCATYPE(0xB0, 0x1)}, + { SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2)}, + { SMCA_DE, HWID_MCATYPE(0xB0, 0x3)}, /* HWID 0xB0 
MCATYPE 0x4 is Reserved */ - { SMCA_EX, HWID_MCATYPE(0xB0, 0x5), 0xFFF }, - { SMCA_FP, HWID_MCATYPE(0xB0, 0x6), 0x7F }, - { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF }, + { SMCA_EX, HWID_MCATYPE(0xB0, 0x5)}, + { SMCA_FP, HWID_MCATYPE(0xB0, 0x6)}, + { SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7)}, /* Data Fabric MCA types */ - { SMCA_CS, HWID_MCATYPE(0x2E, 0x0), 0x1FF }, - { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0x1F }, - { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF }, + { SMCA_CS, HWID_MCATYPE(0x2E, 0x0)}, + { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1)}, + { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2)}, /* Unified Memory Controller MCA type */ - { SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0xFF }, + { SMCA_UMC, HWID_MCATYPE(0x96, 0x0)}, /* Parameter Block MCA type */ - { SMCA_PB, HWID_MCATYPE(0x05, 0x0), 0x1 }, + { SMCA_PB, HWID_MCATYPE(0x05, 0x0)}, /* Platform Security Processor MCA type */ - { SMCA_PSP, HWID_MCATYPE(0xFF, 0x0), 0x1 }, - { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1), 0x3 }, + { SMCA_PSP, HWID_MCATYPE(0xFF, 0x0)}, + { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1)}, /* System Management Unit MCA type */ - { SMCA_SMU, HWID_MCATYPE(0x01, 0x0), 0x1 }, - { SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1), 0x7FF }, + { SMCA_SMU, HWID_MCATYPE(0x01, 0x0)}, + { SMCA_SMU_V2, HWID_MCATYPE(0x01, 0x1)}, /* Microprocessor 5 Unit MCA type */ - { SMCA_MP5, HWID_MCATYPE(0x01, 0x2), 0x3FF }, + { SMCA_MP5, HWID_MCATYPE(0x01, 0x2)}, /* Northbridge IO Unit MCA type */ - { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F }, + { SMCA_NBIO, HWID_MCATYPE(0x18, 0x0)}, /* PCI Express Unit MCA type */ - { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0), 0x1F }, + { SMCA_PCIE, HWID_MCATYPE(0x46, 0x0)}, }; struct smca_bank smca_banks[MAX_NR_BANKS]; diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index 4fd06a3dc6fe..7f28edb070bd 100644 --- a/drivers/edac/mce_amd.c +++ b/drivers/edac/mce_amd.c @@ -999,10 +999,8 @@ static void decode_smca_error(struct mce *m) pr_emerg(HW_ERR "%s Ext. 
Error Code: %d", ip_name, xec); /* Only print the decode of valid error codes */ - if (xec < smca_mce_descs[bank_type].num_descs && - (hwid->
[PATCH] EDAC/mce_amd: Add new error descriptions for existing types
From: Yazen Ghannam

A few existing MCA bank types will have new error types in future SMCA
systems. Add the descriptions for the new error types.

Signed-off-by: Yazen Ghannam
---
 drivers/edac/mce_amd.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 325aedf46ff2..4fd06a3dc6fe 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -210,6 +210,11 @@ static const char * const smca_if_mce_desc[] = {
 	"L2 BTB Multi-Match Error",
 	"L2 Cache Response Poison Error",
 	"System Read Data Error",
+	"Hardware Assertion Error",
+	"L1-TLB Multi-Hit",
+	"L2-TLB Multi-Hit",
+	"BSR Parity Error",
+	"CT MCE",
 };
 
 static const char * const smca_l2_mce_desc[] = {
@@ -228,7 +233,8 @@ static const char * const smca_de_mce_desc[] = {
 	"Fetch address FIFO parity error",
 	"Patch RAM data parity error",
 	"Patch RAM sequencer parity error",
-	"Micro-op buffer parity error"
+	"Micro-op buffer parity error",
+	"Hardware Assertion MCA Error",
 };
 
 static const char * const smca_ex_mce_desc[] = {
@@ -244,6 +250,8 @@ static const char * const smca_ex_mce_desc[] = {
 	"Scheduling queue parity error",
 	"Branch buffer queue parity error",
 	"Hardware Assertion error",
+	"Spec Map parity error",
+	"Retire Map parity error",
 };
 
 static const char * const smca_fp_mce_desc[] = {
@@ -360,6 +368,7 @@ static const char * const smca_smu2_mce_desc[] = {
 	"Instruction Tag Cache Bank A ECC or parity error",
 	"Instruction Tag Cache Bank B ECC or parity error",
 	"System Hub Read Buffer ECC or parity error",
+	"PHY RAM ECC error",
 };
 
 static const char * const smca_mp5_mce_desc[] = {
-- 
2.25.1
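Not part of the patch: the description tables extended here are indexed by the extended error code, so a new string is only printed when its position in the table matches the code reported by hardware. A minimal user-space sketch of that kind of bounds-checked lookup follows. The table is a hypothetical, shortened one; its indices do not correspond to real extended error codes, and demo_xec_to_desc() / demo_lookup() are made-up names.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, shortened table. Real tables are longer and ordered by xec. */
static const char * const demo_desc[] = {
	"Scheduling queue parity error",
	"Branch buffer queue parity error",
	"Hardware Assertion error",
	"Spec Map parity error",
	"Retire Map parity error",
};

/* Return the description for xec, or NULL if the table has no entry for it. */
static const char *demo_xec_to_desc(const char * const *tbl, size_t num_descs,
				    unsigned int xec)
{
	assert(tbl != NULL);

	if (xec >= num_descs)
		return NULL;

	return tbl[xec];
}

/* Convenience wrapper around the demo table above. */
static const char *demo_lookup(unsigned int xec)
{
	return demo_xec_to_desc(demo_desc, DEMO_ARRAY_SIZE(demo_desc), xec);
}
```

A caller would print the decoded string only when the lookup returns non-NULL and fall back to the raw code otherwise.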
Re: [PATCH 0/2] MCA and EDAC updates for AMD Family 17h, Model 60h
On Mon, Jun 15, 2020 at 07:59:50AM -0400, Borislav Petkov wrote: > + Yazen and linux-hwmon. > > On Sun, Jun 07, 2020 at 12:37:07PM +0800, Jacky Hu wrote: > > This patchset adds MCA and EDAC support for AMD Family 17h, Model 60h. > > > > Also k10temp works with 4800h > > > > k10temp-pci-00c3 > > Adapter: PCI adapter > > Vcore: 1.55 V > > Vsoc: 1.55 V > > Tctl: +49.6°C > > Tdie: +49.6°C > > Icore: 0.00 A > > Isoc: 0.00 A > > > > Jacky Hu (2): > > x86/amd_nb: Add Family 17h, Model 60h PCI IDs > > EDAC/amd64: Add family ops for Family 17h Models 60h-6Fh > > > > arch/x86/kernel/amd_nb.c | 5 + > > drivers/edac/amd64_edac.c | 14 ++ > > drivers/edac/amd64_edac.h | 3 +++ > > drivers/hwmon/k10temp.c | 2 ++ > > include/linux/pci_ids.h | 1 + > > 5 files changed, 25 insertions(+) > > PCI IDs and EDAC look good to me. Acked-by: Yazen Ghannam Thanks, Yazen
Re: [PATCH] x86/mce: fix a wrong assignment of i_mce.status
On Thu, Jun 11, 2020 at 12:55:00PM -0400, Luck, Tony wrote:
> +Yazen
>
> On Thu, Jun 11, 2020 at 10:32:38AM +0800, Zhenzhong Duan wrote:
> > The original code is a nop as i_mce.status is or'ed with part of itself,
> > fix it.
> >
> > Signed-off-by: Zhenzhong Duan
> > ---
> >  arch/x86/kernel/cpu/mce/inject.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mce/inject.c
> > b/arch/x86/kernel/cpu/mce/inject.c
> > index 3413b41..dc28a61 100644
> > --- a/arch/x86/kernel/cpu/mce/inject.c
> > +++ b/arch/x86/kernel/cpu/mce/inject.c
> > @@ -511,7 +511,7 @@ static void do_inject(void)
> >  	 */
> >  	if (inj_type == DFR_INT_INJ) {
> >  		i_mce.status |= MCI_STATUS_DEFERRED;
> > -		i_mce.status |= (i_mce.status & ~MCI_STATUS_UC);
> > +		i_mce.status &= ~MCI_STATUS_UC;
>
> Boris: "git blame" says you wrote this code. Patch looks right (in
> that it makes the code do what the comment just above says it is trying
> to do):
>
>  * - MCx_STATUS[UC] cleared: deferred errors are _not_ UC
>
> But this is AMD specific, so I'll defer judgement
>

Acked-by: Yazen Ghannam

Thanks,
Yazen
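For anyone who wants to convince themselves that the old line really was a no-op, here is a small user-space sketch (not kernel code). The DEMO_* bit positions are illustrative stand-ins for MCI_STATUS_UC and MCI_STATUS_DEFERRED and should be checked against asm/mce.h; the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions; verify against asm/mce.h before relying on them. */
#define DEMO_STATUS_UC		(1ULL << 61)
#define DEMO_STATUS_DEFERRED	(1ULL << 44)

/* Old behaviour: ORs in bits that are already set, so UC is never cleared. */
static uint64_t demo_old_update(uint64_t status)
{
	status |= DEMO_STATUS_DEFERRED;
	status |= (status & ~DEMO_STATUS_UC);
	return status;
}

/* Fixed behaviour: UC is masked out, as the comment in do_inject() intends. */
static uint64_t demo_fixed_update(uint64_t status)
{
	status |= DEMO_STATUS_DEFERRED;
	status &= ~DEMO_STATUS_UC;
	return status;
}

/* Small self-check showing the difference for a status with UC set. */
static void demo_selfcheck(void)
{
	uint64_t s = DEMO_STATUS_UC;

	assert(demo_old_update(s) & DEMO_STATUS_UC);
	assert(!(demo_fixed_update(s) & DEMO_STATUS_UC));
	assert(demo_fixed_update(s) & DEMO_STATUS_DEFERRED);
}
```

Since x | (x & m) == x for any x and m, the second statement in the old version can never change the value.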
Re: 5.6.12 MCE on AMD EPYC 7502
On Fri, May 29, 2020 at 07:57:20AM -0400, Borislav Petkov wrote:
> On Fri, May 29, 2020 at 01:55:29PM +0300, Dmitry Antipov wrote:
> > Hello,
> >
> > I'm facing the following kernel messages running Debian 9 with
> > custom 5.6.12 kernel running on AMD EPYC 7502 - based hardware:
> >
> > [138537.806814] mce: [Hardware Error]: Machine check events logged
> > [138537.806818] [Hardware Error]: Corrected error, no action required.
> > [138537.808456] [Hardware Error]: CPU:0 (17:31:0)
> > MC27_STATUS[Over|CE|MiscV|-|-|-|SyndV|-|-|-]: 0xd822080b
> > [138537.810080] [Hardware Error]: IPID: 0x0001002e1e01, Syndrome:
> > 0x5a05
> > [138537.811694] [Hardware Error]: Power, Interrupts, etc. Ext. Error Code:
> > 2, Link Error.
> > [138537.813281] [Hardware Error]: cache level: L3/GEN, mem/io: IO, mem-tx:
> > GEN, part-proc: SRC (no timeout)
> >
> > Is it related to some (not so) known CPU errata?
>
> Who knows.
>

There aren't any reported errata related to this that I could find.

> > Should I try to update microcode, motherboard firmware, kernel, or whatever
> > else?
>
> Yeah, BIOS update might be a good idea, if there's a newer version for
> your board.
>

I agree. The link settings are generally tuned for the platform. So the
platform vendor may have a fix.

Thanks,
Yazen
Re: [PATCH 3/3] EDAC/amd64: Add AMD family 17h model 60h PCI IDs
On Sun, May 10, 2020 at 04:48:42PM -0400, Alexander Monakov wrote:
> Add support for AMD Renoir (4000-series Ryzen CPUs).
>
> Signed-off-by: Alexander Monakov
> Cc: Thomas Gleixner
> Cc: Borislav Petkov
> Cc: x...@kernel.org
> Cc: Yazen Ghannam
> Cc: Brian Woods
> Cc: Clemens Ladisch
> Cc: Jean Delvare
> Cc: Guenter Roeck
> Cc: linux-hw...@vger.kernel.org
> Cc: linux-e...@vger.kernel.org

Acked-by: Yazen Ghannam

Thanks,
Yazen
Re: [PATCH 1/3] x86/amd_nb: add AMD family 17h model 60h PCI IDs
On Sun, May 10, 2020 at 04:48:40PM -0400, Alexander Monakov wrote:
> Add PCI IDs for AMD Renoir (4000-series Ryzen CPUs). This is necessary
> to enable support for temperature sensors via the k10temp module.
>
> Signed-off-by: Alexander Monakov
> Cc: Thomas Gleixner
> Cc: Borislav Petkov
> Cc: x...@kernel.org
> Cc: Yazen Ghannam
> Cc: Brian Woods
> Cc: Clemens Ladisch
> Cc: Jean Delvare
> Cc: Guenter Roeck
> Cc: linux-hw...@vger.kernel.org
> Cc: linux-e...@vger.kernel.org

Acked-by: Yazen Ghannam

Thanks,
Yazen
[tip:ras/core] x86/MCE: Determine MCA banks' init state properly
Commit-ID: 068b053dca0e2ab40b3d953b102a178654eec282 Gitweb: https://git.kernel.org/tip/068b053dca0e2ab40b3d953b102a178654eec282 Author: Yazen Ghannam AuthorDate: Fri, 7 Jun 2019 20:18:06 + Committer: Borislav Petkov CommitDate: Tue, 11 Jun 2019 15:23:34 +0200 x86/MCE: Determine MCA banks' init state properly The OS is expected to write all bits to MCA_CTL for each bank, thus enabling error reporting in all banks. However, some banks may be unused in which case the registers for such banks are Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control bits because of quirks, etc. A bank can be considered uninitialized if the MCA_CTL register returns zero. This is because either the OS did not write anything or because the hardware is enforcing RAZ/WI for the bank. Set a bank's init value based on if the control bits are set or not in hardware. Return an error code in the sysfs interface for uninitialized banks. Do a final bank init check in a separate function which is not part of any user-controlled code flows. This is so a user may enable/disable a bank during runtime without having to restart their system. [ bp: Massage a bit. Discover bank init state at boot. ] Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. 
Peter Anvin" Cc: Ingo Molnar Cc: "linux-e...@vger.kernel.org" Cc: Thomas Gleixner Cc: Tony Luck Cc: "x...@kernel.org" Link: https://lkml.kernel.org/r/20190607201752.221446-6-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/core.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 10f9f140985e..c2c93e9195ed 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1490,6 +1490,11 @@ static void __mcheck_cpu_mce_banks_init(void) for (i = 0; i < n_banks; i++) { struct mce_bank *b = &mce_banks[i]; + /* +* Init them all, __mcheck_cpu_apply_quirks() is going to apply +* the required vendor quirks before +* __mcheck_cpu_init_clear_banks() does the final bank setup. +*/ b->ctl = -1ULL; b->init = 1; } @@ -1562,6 +1567,33 @@ static void __mcheck_cpu_init_clear_banks(void) } } +/* + * Do a final check to see if there are any unused/RAZ banks. + * + * This must be done after the banks have been initialized and any quirks have + * been applied. + * + * Do not call this from any user-initiated flows, e.g. CPU hotplug or sysfs. + * Otherwise, a user who disables a bank will not be able to re-enable it + * without a system reboot. 
+ */ +static void __mcheck_cpu_check_banks(void) +{ + struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array); + u64 msrval; + int i; + + for (i = 0; i < this_cpu_read(mce_num_banks); i++) { + struct mce_bank *b = &mce_banks[i]; + + if (!b->init) + continue; + + rdmsrl(msr_ops.ctl(i), msrval); + b->init = !!msrval; + } +} + /* * During IFU recovery Sandy Bridge -EP4S processors set the RIPV and * EIPV bits in MCG_STATUS to zero on the affected logical processor (SDM @@ -1849,6 +1881,7 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c) __mcheck_cpu_init_generic(); __mcheck_cpu_init_vendor(c); __mcheck_cpu_init_clear_banks(); + __mcheck_cpu_check_banks(); __mcheck_cpu_setup_timer(); } @@ -2085,6 +2118,9 @@ static ssize_t show_bank(struct device *s, struct device_attribute *attr, b = &per_cpu(mce_banks_array, s->id)[bank]; + if (!b->init) + return -ENODEV; + return sprintf(buf, "%llx\n", b->ctl); } @@ -2103,6 +2139,9 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr, b = &per_cpu(mce_banks_array, s->id)[bank]; + if (!b->init) + return -ENODEV; + b->ctl = new; mce_restart();
[tip:ras/core] x86/MCE: Make the number of MCA banks a per-CPU variable
Commit-ID: c7d314f386e987be8b51eeb7dd947756ae23f6b6 Gitweb: https://git.kernel.org/tip/c7d314f386e987be8b51eeb7dd947756ae23f6b6 Author: Yazen Ghannam AuthorDate: Fri, 7 Jun 2019 20:18:05 + Committer: Borislav Petkov CommitDate: Tue, 11 Jun 2019 15:23:09 +0200 x86/MCE: Make the number of MCA banks a per-CPU variable The number of MCA banks is provided per logical CPU. Historically, this number has been the same across all CPUs, but this is not an architectural guarantee. Future AMD systems may have MCA bank counts that vary between logical CPUs in a system. This issue was partially addressed in 006c077041dc ("x86/mce: Handle varying MCA bank counts") by allocating structures using the maximum number of MCA banks and by saving the maximum MCA bank count in a system as the global count. This means that some extra structures are allocated. Also, this means that CPUs will spend more time in the #MC and other handlers checking extra MCA banks. Thus, define the number of MCA banks as a per-CPU variable. [ bp: Make mce_num_banks an unsigned int. ] Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. 
Peter Anvin" Cc: Ingo Molnar Cc: "linux-e...@vger.kernel.org" Cc: Thomas Gleixner Cc: Tony Luck Cc: "x...@kernel.org" Link: https://lkml.kernel.org/r/20190607201752.221446-5-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/amd.c | 19 arch/x86/kernel/cpu/mce/core.c | 45 +- arch/x86/kernel/cpu/mce/internal.h | 2 +- 3 files changed, 36 insertions(+), 30 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index d4d6e4b7f9dc..fb5c935af2c5 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -495,7 +495,7 @@ static u32 get_block_address(u32 current_addr, u32 low, u32 high, { u32 addr = 0, offset = 0; - if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS)) + if ((bank >= per_cpu(mce_num_banks, cpu)) || (block >= NR_BLOCKS)) return addr; if (mce_flags.smca) @@ -627,11 +627,12 @@ void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank) /* cpu init entry point, called from mce.c with preempt off */ void mce_amd_feature_init(struct cpuinfo_x86 *c) { - u32 low = 0, high = 0, address = 0; unsigned int bank, block, cpu = smp_processor_id(); + u32 low = 0, high = 0, address = 0; int offset = -1; - for (bank = 0; bank < mca_cfg.banks; ++bank) { + + for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) { if (mce_flags.smca) smca_configure(bank, cpu); @@ -976,7 +977,7 @@ static void amd_deferred_error_interrupt(void) { unsigned int bank; - for (bank = 0; bank < mca_cfg.banks; ++bank) + for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) log_error_deferred(bank); } @@ -1017,7 +1018,7 @@ static void amd_threshold_interrupt(void) struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL; unsigned int bank, cpu = smp_processor_id(); - for (bank = 0; bank < mca_cfg.banks; ++bank) { + for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) { if (!(per_cpu(bank_map, cpu) & (1 << bank))) continue; @@ -1204,7 +1205,7 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned 
int bank, u32 low, high; int err; - if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS)) + if ((bank >= per_cpu(mce_num_banks, cpu)) || (block >= NR_BLOCKS)) return 0; if (rdmsr_safe_on_cpu(cpu, address, &low, &high)) @@ -1438,7 +1439,7 @@ int mce_threshold_remove_device(unsigned int cpu) { unsigned int bank; - for (bank = 0; bank < mca_cfg.banks; ++bank) { + for (bank = 0; bank < per_cpu(mce_num_banks, cpu); ++bank) { if (!(per_cpu(bank_map, cpu) & (1 << bank))) continue; threshold_remove_bank(cpu, bank); @@ -1459,14 +1460,14 @@ int mce_threshold_create_device(unsigned int cpu) if (bp) return 0; - bp = kcalloc(mca_cfg.banks, sizeof(struct threshold_bank *), + bp = kcalloc(per_cpu(mce_num_banks, cpu), sizeof(struct threshold_bank *), GFP_KERNEL); if (!bp) return -ENOMEM; per_cpu(threshold_banks, cpu) = bp; - for (bank = 0; bank < mca_cfg.banks; ++bank) { + for (bank = 0; bank < per_cpu(mce_num_banks, cpu); ++bank) { if (!(per_cpu(bank_map, cpu) & (1 << bank))) continue; err = threshold_create_bank(cpu, bank); diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/ker
[tip:ras/core] x86/MCE/AMD: Don't cache block addresses on SMCA systems
Commit-ID: 95d057f54664f3c6e8f650faf5690b82b30a9e52 Gitweb: https://git.kernel.org/tip/95d057f54664f3c6e8f650faf5690b82b30a9e52 Author: Yazen Ghannam AuthorDate: Fri, 7 Jun 2019 20:18:04 + Committer: Borislav Petkov CommitDate: Tue, 11 Jun 2019 15:22:41 +0200 x86/MCE/AMD: Don't cache block addresses on SMCA systems On legacy systems, the addresses of the MCA_MISC* registers need to be recursively discovered based on a Block Pointer field in the registers. On Scalable MCA systems, the register space is fixed, and particular addresses can be derived by regular offsets for bank and register type. This fixed address space includes the MCA_MISC* registers. MCA_MISC0 is always available for each MCA bank. MCA_MISC1 through MCA_MISC4 are considered available if MCA_MISC0[BlkPtr]=1. Cache the value of MCA_MISC0[BlkPtr] for each bank and per CPU. This needs to be done only during init. The values should be saved per CPU to accommodate heterogeneous SMCA systems. Redo smca_get_block_address() to directly return the block addresses. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: "linux-e...@vger.kernel.org" Cc: Thomas Gleixner Cc: Tony Luck Cc: "x...@kernel.org" Link: https://lkml.kernel.org/r/20190607201752.221446-4-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/amd.c | 73 ++- 1 file changed, 37 insertions(+), 36 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index d904aafe6409..d4d6e4b7f9dc 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -101,11 +101,6 @@ static struct smca_bank_name smca_names[] = { [SMCA_PCIE] = { "pcie", "PCI Express Unit" }, }; -static u32 smca_bank_addrs[MAX_NR_BANKS][NR_BLOCKS] __ro_after_init = -{ - [0 ... MAX_NR_BANKS - 1] = { [0 ... 
NR_BLOCKS - 1] = -1 } -}; - static const char *smca_get_name(enum smca_bank_types t) { if (t >= N_SMCA_BANK_TYPES) @@ -199,6 +194,9 @@ static char buf_mcatype[MAX_MCATYPE_NAME_LEN]; static DEFINE_PER_CPU(struct threshold_bank **, threshold_banks); static DEFINE_PER_CPU(unsigned int, bank_map); /* see which banks are on */ +/* Map of banks that have more than MCA_MISC0 available. */ +static DEFINE_PER_CPU(u32, smca_misc_banks_map); + static void amd_threshold_interrupt(void); static void amd_deferred_error_interrupt(void); @@ -208,6 +206,28 @@ static void default_deferred_error_interrupt(void) } void (*deferred_error_int_vector)(void) = default_deferred_error_interrupt; +static void smca_set_misc_banks_map(unsigned int bank, unsigned int cpu) +{ + u32 low, high; + + /* +* For SMCA enabled processors, BLKPTR field of the first MISC register +* (MCx_MISC0) indicates presence of additional MISC regs set (MISC1-4). +*/ + if (rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high)) + return; + + if (!(low & MCI_CONFIG_MCAX)) + return; + + if (rdmsr_safe(MSR_AMD64_SMCA_MCx_MISC(bank), &low, &high)) + return; + + if (low & MASK_BLKPTR_LO) + per_cpu(smca_misc_banks_map, cpu) |= BIT(bank); + +} + static void smca_configure(unsigned int bank, unsigned int cpu) { unsigned int i, hwid_mcatype; @@ -245,6 +265,8 @@ static void smca_configure(unsigned int bank, unsigned int cpu) wrmsr(smca_config, low, high); } + smca_set_misc_banks_map(bank, cpu); + /* Return early if this bank was already initialized. 
*/ if (smca_banks[bank].hwid) return; @@ -455,42 +477,21 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c) wrmsr(MSR_CU_DEF_ERR, low, high); } -static u32 smca_get_block_address(unsigned int bank, unsigned int block) +static u32 smca_get_block_address(unsigned int bank, unsigned int block, + unsigned int cpu) { - u32 low, high; - u32 addr = 0; - - if (smca_get_bank_type(bank) == SMCA_RESERVED) - return addr; - if (!block) return MSR_AMD64_SMCA_MCx_MISC(bank); - /* Check our cache first: */ - if (smca_bank_addrs[bank][block] != -1) - return smca_bank_addrs[bank][block]; - - /* -* For SMCA enabled processors, BLKPTR field of the first MISC register -* (MCx_MISC0) indicates presence of additional MISC regs set (MISC1-4). -*/ - if (rdmsr_safe(MSR_AMD64_SMCA_MCx_CONFIG(bank), &low, &high)) - goto out; - - if (!(low & MCI_CONFIG_MCAX)) - goto out; - - if (!rdmsr_safe(MSR_AMD64_SMCA_MCx_MISC(bank), &low, &high) && - (low & MASK_BLKPTR_LO)) - addr = MSR_AMD64_SMCA_MCx_MISC
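Not part of the commit, and the diff above is cut off before the new function body, so the following is only a user-space sketch of the scheme the commit message describes: block 0 always maps to MCA_MISC0, and blocks 1-4 map to MCA_MISC1-4 only when the cached BlkPtr bit for that bank is set. The MSR numbers are assumptions recalled from msr-index.h and should be verified; the extra block range check and all DEMO_*/demo_* names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MSR numbers; verify against msr-index.h before reuse. */
#define DEMO_SMCA_MC0_MISC0	0xc0002003u
#define DEMO_SMCA_MC0_MISC1	0xc000200au

/* Each bank occupies a fixed 0x10-wide window in the SMCA register space. */
#define DEMO_SMCA_MISC(bank)		(DEMO_SMCA_MC0_MISC0 + 0x10u * (bank))
#define DEMO_SMCA_MISCy(bank, y)	((DEMO_SMCA_MC0_MISC1 + (y)) + 0x10u * (bank))

#define DEMO_NR_BLOCKS	5

/*
 * misc_banks_map models the per-CPU map: bit N set means bank N has
 * MCA_MISC1-4 in addition to MCA_MISC0. Returns 0 for "no such block".
 */
static uint32_t demo_block_address(uint32_t misc_banks_map, unsigned int bank,
				   unsigned int block)
{
	assert(bank < 32);

	if (block >= DEMO_NR_BLOCKS)
		return 0;

	/* MCA_MISC0 is always available. */
	if (!block)
		return DEMO_SMCA_MISC(bank);

	if (!(misc_banks_map & (1u << bank)))
		return 0;

	return DEMO_SMCA_MISCy(bank, block - 1);
}
```

With the map filled in once at init, no MSR read and no address cache are needed on the lookup path.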
[tip:ras/core] x86/MCE: Make mce_banks a per-CPU array
Commit-ID: b4914508f1fe0eca1cd011b6026ff762a1aa62d5 Gitweb: https://git.kernel.org/tip/b4914508f1fe0eca1cd011b6026ff762a1aa62d5 Author: Yazen Ghannam AuthorDate: Fri, 7 Jun 2019 20:18:04 + Committer: Borislav Petkov CommitDate: Tue, 11 Jun 2019 15:22:13 +0200 x86/MCE: Make mce_banks a per-CPU array Current AMD systems have unique MCA banks per logical CPU even though the type of the banks may all align to the same bank number. Each CPU will have control of a set of MCA banks in the hardware and these are not shared with other CPUs. For example, bank 0 may be the Load-Store Unit on every logical CPU, but each bank 0 is a unique structure in the hardware. In other words, there isn't a *single* Load-Store Unit at MCA bank 0 that all logical CPUs share. This idea extends even to non-core MCA banks. For example, CPU0 and CPU4 may see a Unified Memory Controller at bank 15, but each CPU is actually seeing a unique hardware structure that is not shared with other CPUs. Because the MCA banks are all unique hardware structures, it would be good to control them in a more granular way. For example, if there is a known issue with the Floating Point Unit on CPU5 and a user wishes to disable an error type on the Floating Point Unit, then it would be good to do this only for CPU5 rather than all CPUs. Also, future AMD systems may have heterogeneous MCA banks. Meaning the bank numbers may not necessarily represent the same types between CPUs. For example, bank 20 visible to CPU0 may be a Unified Memory Controller and bank 20 visible to CPU4 may be a Coherent Slave. So granular control will be even more necessary should the user wish to control specific MCA banks. Split the device attributes from struct mce_bank leaving only the MCA bank control fields. Make struct mce_banks[] per_cpu in order to have more granular control over individual MCA banks in the hardware. Allocate the device attributes statically based on the maximum number of MCA banks supported. 
The sysfs interface will use as many as needed per CPU. Currently, this is set to mca_cfg.banks, but will be changed to a per_cpu bank count in a future patch. Allocate the MCA control bits statically. This is in order to avoid locking warnings when memory is allocated during secondary CPUs' init sequences. Also, remove the now unnecessary return values from __mcheck_cpu_mce_banks_init() and __mcheck_cpu_cap_init(). Redo the sysfs store/show functions to handle the per_cpu mce_banks[]. [ bp: s/mce_banks_percpu/mce_banks_array/g ] [ Locking issue reported by ] Reported-by: kernel test robot Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: "linux-e...@vger.kernel.org" Cc: Thomas Gleixner Cc: Tony Luck Cc: "x...@kernel.org" Link: https://lkml.kernel.org/r/20190607201752.221446-3-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/core.c | 76 ++ 1 file changed, 48 insertions(+), 28 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 55bdbedde0b8..49fac95d036b 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -65,16 +65,21 @@ static DEFINE_MUTEX(mce_sysfs_mutex); DEFINE_PER_CPU(unsigned, mce_exception_count); -#define ATTR_LEN 16 -/* One object for each MCE bank, shared by all CPUs */ struct mce_bank { u64 ctl;/* subevents to enable */ boolinit; /* initialise bank? 
*/ +}; +static DEFINE_PER_CPU_READ_MOSTLY(struct mce_bank[MAX_NR_BANKS], mce_banks_array); + +#define ATTR_LEN 16 +/* One object for each MCE bank, shared by all CPUs */ +struct mce_bank_dev { struct device_attribute attr; /* device attribute */ charattrname[ATTR_LEN]; /* attribute name */ + u8 bank; /* bank number */ }; +static struct mce_bank_dev mce_bank_devs[MAX_NR_BANKS]; -static struct mce_bank *mce_banks __read_mostly; struct mce_vendor_flags mce_flags __read_mostly; struct mca_config mca_cfg __read_mostly = { @@ -684,6 +689,7 @@ DEFINE_PER_CPU(unsigned, mce_poll_count); */ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) { + struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array); bool error_seen = false; struct mce m; int i; @@ -1131,6 +1137,7 @@ static void __mc_scan_banks(struct mce *m, struct mce *final, unsigned long *toclear, unsigned long *valid_banks, int no_way_out, int *worst) { + struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array); struct mca_config *cfg = &mca_cfg; int severity, i; @@ -1472,27 +1479,23 @@ int mce_notify_irq(void) } EXPORT_SYMBOL_GPL(mce_notify_irq);
[tip:ras/core] x86/MCE: Make struct mce_banks[] static
Commit-ID: 95fdce6b24f3526c2bd1aad15978d238b79da6bd Gitweb: https://git.kernel.org/tip/95fdce6b24f3526c2bd1aad15978d238b79da6bd Author: Yazen Ghannam AuthorDate: Fri, 7 Jun 2019 20:18:03 + Committer: Borislav Petkov CommitDate: Tue, 11 Jun 2019 15:13:51 +0200 x86/MCE: Make struct mce_banks[] static The struct mce_banks[] array is only used in mce/core.c so move its definition there and make it static. Also, change the "init" field to bool type. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: linux-edac Cc: Thomas Gleixner Cc: Tony Luck Cc: "x...@kernel.org" Link: https://lkml.kernel.org/r/20190607201752.221446-2-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/core.c | 11 ++- arch/x86/kernel/cpu/mce/internal.h | 10 -- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 282916f3b8d8..55bdbedde0b8 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -65,7 +65,16 @@ static DEFINE_MUTEX(mce_sysfs_mutex); DEFINE_PER_CPU(unsigned, mce_exception_count); -struct mce_bank *mce_banks __read_mostly; +#define ATTR_LEN 16 +/* One object for each MCE bank, shared by all CPUs */ +struct mce_bank { + u64 ctl;/* subevents to enable */ + boolinit; /* initialise bank? 
*/ + struct device_attribute attr; /* device attribute */ + charattrname[ATTR_LEN]; /* attribute name */ +}; + +static struct mce_bank *mce_banks __read_mostly; struct mce_vendor_flags mce_flags __read_mostly; struct mca_config mca_cfg __read_mostly = { diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h index a34b55baa7aa..35b3e5c02c1c 100644 --- a/arch/x86/kernel/cpu/mce/internal.h +++ b/arch/x86/kernel/cpu/mce/internal.h @@ -22,17 +22,8 @@ enum severity_level { extern struct blocking_notifier_head x86_mce_decoder_chain; -#define ATTR_LEN 16 #define INITIAL_CHECK_INTERVAL 5 * 60 /* 5 minutes */ -/* One object for each MCE bank, shared by all CPUs */ -struct mce_bank { - u64 ctl;/* subevents to enable */ - unsigned char init; /* initialise bank? */ - struct device_attribute attr; /* device attribute */ - charattrname[ATTR_LEN]; /* attribute name */ -}; - struct mce_evt_llist { struct llist_node llnode; struct mce mce; @@ -47,7 +38,6 @@ struct llist_node *mce_gen_pool_prepare_records(void); extern int (*mce_severity)(struct mce *a, int tolerant, char **msg, bool is_excp); struct dentry *mce_get_debugfs_dir(void); -extern struct mce_bank *mce_banks; extern mce_banks_t mce_banks_ce_disabled; #ifdef CONFIG_X86_MCE_INTEL
[tip:ras/core] x86/MCE: Add an MCE-record filtering function
Commit-ID:  45d4b7b9cb88526f6d5bd4c03efab88d75d10e4f
Gitweb:     https://git.kernel.org/tip/45d4b7b9cb88526f6d5bd4c03efab88d75d10e4f
Author:     Yazen Ghannam
AuthorDate: Mon, 25 Mar 2019 16:34:22 +
Committer:  Borislav Petkov
CommitDate: Tue, 23 Apr 2019 18:04:47 +0200

x86/MCE: Add an MCE-record filtering function

Some systems may report spurious MCA errors. In general, spurious MCA
errors may be disabled by clearing a particular bit in MCA_CTL. However,
clearing a bit in MCA_CTL may not be recommended for some errors, so the
only option is to ignore them.

An MCA error is printed and handled after it has been added to the MCE
event pool. So an MCA error can be ignored by not adding it to that pool
in the first place.

Add such a filtering function.

 [ bp: Move function prototype to the internal header and massage. ]

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: Arnd Bergmann
Cc: "cle...@gmail.com"
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: "ra...@milecki.pl"
Cc: Shirish S
Cc: # 5.0.x
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190325163410.171021-1-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/core.c     | 5 +
 arch/x86/kernel/cpu/mce/genpool.c  | 3 +++
 arch/x86/kernel/cpu/mce/internal.h | 3 +++
 3 files changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3e081428117c..80b8c6bff8ed 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1775,6 +1775,11 @@ static void __mcheck_cpu_init_timer(void)
 	mce_start_timer(t);
 }
 
+bool filter_mce(struct mce *m)
+{
+	return false;
+}
+
 /* Handle unconfigured int18 (should never happen) */
 static void unexpected_machine_check(struct pt_regs *regs, long error_code)
 {
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index 3395549c51d3..64d1d5a00f39 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -99,6 +99,9 @@ int mce_gen_pool_add(struct mce *mce)
 {
 	struct mce_evt_llist *node;
 
+	if (filter_mce(mce))
+		return -EINVAL;
+
 	if (!mce_evt_pool)
 		return -EINVAL;
 
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index af5eab1e65e2..b822a645395d 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -173,4 +173,7 @@ struct mca_msr_regs {
 
 extern struct mca_msr_regs msr_ops;
 
+/* Decide whether to add MCE record to MCE event pool or filter it out. */
+extern bool filter_mce(struct mce *m);
+
 #endif /* __X86_MCE_INTERNAL_H__ */
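For illustration only, here is a userspace sketch of the idea in this patch: a record that the filter rejects never reaches the event pool, so nothing downstream prints or handles it. The types and names (struct mce_rec, filter_rec(), pool_add()) are hypothetical stand-ins and not the kernel's; the filter condition (drop bank 1) is arbitrary, whereas the patch itself adds a filter that always returns false.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct mce. */
struct mce_rec {
	uint64_t status;
	uint8_t bank;
};

#define POOL_SIZE 4

static struct mce_rec pool[POOL_SIZE];
static size_t pool_used;

/* Illustrative filter: treat everything from bank 1 as spurious. */
static bool filter_rec(const struct mce_rec *m)
{
	return m->bank == 1;
}

/* Mirrors the shape of the patched add path: filter first, then store. */
static int pool_add(const struct mce_rec *m)
{
	if (filter_rec(m))
		return -EINVAL;

	if (pool_used == POOL_SIZE)
		return -ENOMEM;

	pool[pool_used++] = *m;
	return 0;
}
```

Because the check sits at the single entry point to the pool, callers need no changes; a filtered record just looks like a failed add.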
[tip:ras/core] x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models
Commit-ID: 71a84402b93e5fbd8f817f40059c137e10171788 Gitweb: https://git.kernel.org/tip/71a84402b93e5fbd8f817f40059c137e10171788 Author: Yazen Ghannam AuthorDate: Mon, 25 Mar 2019 16:34:22 + Committer: Borislav Petkov CommitDate: Tue, 23 Apr 2019 18:16:07 +0200 x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts. In addition, users may see the errors reported through the AMD MCE decoder module, even with the interrupt disabled, due to MCA polling. Clear the "Counter Present" bit in the Instruction Fetch bank's MCA_MISC0 register. This will prevent enabling MCA thresholding on this bank which will prevent the high interrupt rate due to this error. Define an AMD-specific function to filter these errors from the MCE event pool so that they don't get reported during early boot. Rename filter function in EDAC/mce_amd to avoid a naming conflict, while at it. [ bp: Move function prototype to the internal header and massage/cleanup, fix typos. ] Reported-by: Rafał Miłecki Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. 
Peter Anvin" Cc: "cle...@gmail.com" Cc: Arnd Bergmann Cc: Ingo Molnar Cc: James Morse Cc: Kees Cook Cc: Mauro Carvalho Chehab Cc: Pu Wen Cc: Qiuxu Zhuo Cc: Shirish S Cc: Thomas Gleixner Cc: Tony Luck Cc: Vishal Verma Cc: linux-edac Cc: x86-ml Cc: # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models Cc: # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk Cc: # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in Cc: # 5.0.x Link: https://lkml.kernel.org/r/20190325163410.171021-2-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/amd.c | 52 -- arch/x86/kernel/cpu/mce/core.c | 3 +++ arch/x86/kernel/cpu/mce/internal.h | 6 + drivers/edac/mce_amd.c | 4 +-- 4 files changed, 50 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index e64de5149e50..d904aafe6409 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -563,33 +563,59 @@ out: return offset; } +bool amd_filter_mce(struct mce *m) +{ + enum smca_bank_types bank_type = smca_get_bank_type(m->bank); + struct cpuinfo_x86 *c = &boot_cpu_data; + u8 xec = (m->status >> 16) & 0x3F; + + /* See Family 17h Models 10h-2Fh Erratum #1114. */ + if (c->x86 == 0x17 && + c->x86_model >= 0x10 && c->x86_model <= 0x2F && + bank_type == SMCA_IF && xec == 10) + return true; + + return false; +} + /* - * Turn off MC4_MISC thresholding banks on all family 0x15 models since - * they're not supported there. + * Turn off thresholding banks for the following conditions: + * - MC4_MISC thresholding is not supported on Family 0x15. + * - Prevent possible spurious interrupts from the IF bank on Family 0x17 + * Models 0x10-0x2F due to Erratum #1114. 
*/ -void disable_err_thresholding(struct cpuinfo_x86 *c) +void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank) { - int i; + int i, num_msrs; u64 hwcr; bool need_toggle; - u32 msrs[] = { - 0x0413, /* MC4_MISC0 */ - 0xc408, /* MC4_MISC1 */ - }; + u32 msrs[NR_BLOCKS]; + + if (c->x86 == 0x15 && bank == 4) { + msrs[0] = 0x0413; /* MC4_MISC0 */ + msrs[1] = 0xc408; /* MC4_MISC1 */ + num_msrs = 2; + } else if (c->x86 == 0x17 && + (c->x86_model >= 0x10 && c->x86_model <= 0x2F)) { - if (c->x86 != 0x15) + if (smca_get_bank_type(bank) != SMCA_IF) + return; + + msrs[0] = MSR_AMD64_SMCA_MCx_MISC(bank); + num_msrs = 1; + } else { return; + } rdmsrl(MSR_K7_HWCR, hwcr); /* McStatusWrEn has to be set */ need_toggle = !(hwcr & BIT(18)); - if (need_toggle) wrmsrl(MSR_K7_HWCR, hwcr | BIT(18)); /* Clear CntP bit safely */ - for (i = 0; i < ARRAY_SIZE(msrs); i++) + for (i = 0; i < num_msrs; i++) msr_clear_bit(msrs[i], 62); /* restore old settings */ @@ -604,12 +630,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) unsigned int bank, block, cpu = smp_processor_id(); int offset = -1; -
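As a rough sketch of the match condition used by amd_filter_mce() above: the extended error code is taken from MCA_STATUS bits [21:16], and a record is filtered only when family, model range, bank type and error code all match. The helper names below are hypothetical, and the bank-type lookup (smca_get_bank_type() in the patch) is replaced by a flag passed in by the caller so the example stands alone in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extended error code: MCA_STATUS bits [21:16], as extracted in the patch. */
static uint8_t mca_xec(uint64_t status)
{
	return (uint8_t)((status >> 16) & 0x3F);
}

/*
 * Hypothetical helper mirroring the Erratum #1114 condition: Family 17h,
 * Models 10h-2Fh, Instruction Fetch bank, extended error code 10.
 */
static bool matches_erratum_1114(unsigned int family, unsigned int model,
				 bool bank_is_if, uint64_t status)
{
	return family == 0x17 &&
	       model >= 0x10 && model <= 0x2F &&
	       bank_is_if && mca_xec(status) == 10;
}
```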
[tip:ras/core] x86/mce: Handle varying MCA bank counts
Commit-ID: 006c077041dc73b9490fffc4c6af5befe0687110 Gitweb: https://git.kernel.org/tip/006c077041dc73b9490fffc4c6af5befe0687110 Author: Yazen Ghannam AuthorDate: Fri, 27 Jul 2018 16:40:09 -0500 Committer: Borislav Petkov CommitDate: Wed, 27 Mar 2019 13:12:49 +0100 x86/mce: Handle varying MCA bank counts Linux reads MCG_CAP[Count] to find the number of MCA banks visible to a CPU. Currently, this number is the same for all CPUs and a warning is shown if there is a difference. The number of banks is overwritten with the MCG_CAP[Count] value of each following CPU that boots. According to the Intel SDM and AMD APM, the MCG_CAP[Count] value gives the number of banks that are available to a "processor implementation". The AMD BKDGs/PPRs further clarify that this value is per core. This value has historically been the same for every core in the system, but that is not an architectural requirement. Future AMD systems may have different MCG_CAP[Count] values per core, so the assumption that all CPUs will have the same MCG_CAP[Count] value will no longer be valid. Also, the first CPU to boot will allocate the struct mce_banks[] array using the number of banks based on its MCG_CAP[Count] value. The machine check handler and other functions use the global number of banks to iterate and index into the mce_banks[] array. So it's possible to use an out-of-bounds index on an asymmetric system where a following CPU sees a MCG_CAP[Count] value greater than its predecessors. Thus, allocate the mce_banks[] array to the maximum number of banks. This will avoid the potential out-of-bounds index since the value of mca_cfg.banks is capped to MAX_NR_BANKS. Set the value of mca_cfg.banks equal to the max of the previous value and the value for the current CPU. This way mca_cfg.banks will always represent the max number of banks detected on any CPU in the system. This will ensure that all CPUs will access all the banks that are visible to them. 
A CPU that can access fewer than the max number of banks will find the registers of the extra banks to be read-as-zero. Furthermore, print the resulting number of MCA banks in use. Do this in mcheck_late_init() so that the final value is printed after all CPUs have been initialized. Finally, get bank count from target CPU when doing injection with mce-inject module. [ bp: Remove out-of-bounds example, passify and cleanup commit message. ] Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: linux-edac Cc: Pu Wen Cc: Thomas Gleixner Cc: Tony Luck Cc: Vishal Verma Cc: x86-ml Link: https://lkml.kernel.org/r/20180727214009.78289-1-yazen.ghan...@amd.com --- arch/x86/kernel/cpu/mce/core.c | 22 +++--- arch/x86/kernel/cpu/mce/inject.c | 14 +++--- 2 files changed, 14 insertions(+), 22 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index e558ca77cfe8..c3498732ba28 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1481,13 +1481,12 @@ EXPORT_SYMBOL_GPL(mce_notify_irq); static int __mcheck_cpu_mce_banks_init(void) { int i; - u8 num_banks = mca_cfg.banks; - mce_banks = kcalloc(num_banks, sizeof(struct mce_bank), GFP_KERNEL); + mce_banks = kcalloc(MAX_NR_BANKS, sizeof(struct mce_bank), GFP_KERNEL); if (!mce_banks) return -ENOMEM; - for (i = 0; i < num_banks; i++) { + for (i = 0; i < MAX_NR_BANKS; i++) { struct mce_bank *b = &mce_banks[i]; b->ctl = -1ULL; @@ -1501,28 +1500,19 @@ static int __mcheck_cpu_mce_banks_init(void) */ static int __mcheck_cpu_cap_init(void) { - unsigned b; u64 cap; + u8 b; rdmsrl(MSR_IA32_MCG_CAP, cap); b = cap & MCG_BANKCNT_MASK; - if (!mca_cfg.banks) - pr_info("CPU supports %d MCE banks\n", b); - - if (b > MAX_NR_BANKS) { - pr_warn("Using only %u machine check banks out of %u\n", - MAX_NR_BANKS, b); + if (WARN_ON_ONCE(b > MAX_NR_BANKS)) b = MAX_NR_BANKS; - } - /* Don't support asymmetric configurations today */ - 
WARN_ON(mca_cfg.banks != 0 && b != mca_cfg.banks); - mca_cfg.banks = b; + mca_cfg.banks = max(mca_cfg.banks, b); if (!mce_banks) { int err = __mcheck_cpu_mce_banks_init(); - if (err) return err; } @@ -2481,6 +2471,8 @@ EXPORT_SYMBOL_GPL(mcsafe_key); static int __init mcheck_late_init(void) { + pr_info("Using %d MCE banks\n", mca_cfg.banks); + if (mca_cfg.recovery) static_branch_inc(&mcsafe_key); diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c index 8492ef7d9015..3f82afd0f46f 100644 --- a/ar
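A minimal userspace sketch of the bank-count rule described in the commit message: each CPU's MCG_CAP[Count] is clamped to the array size, and the system-wide count only ever grows to the largest value seen. update_bank_count() and SKETCH_MAX_NR_BANKS are hypothetical names, and the cap of 32 is an assumption made for the example rather than a value taken from this patch.

```c
#include <stdint.h>

#define MCG_BANKCNT_MASK	0xff	/* MCG_CAP[Count] lives in bits [7:0] */
#define SKETCH_MAX_NR_BANKS	32	/* assumed cap, stands in for MAX_NR_BANKS */

/*
 * Fold one CPU's MCG_CAP value into the running system-wide bank count.
 * The count is clamped first, then combined with max(), so a CPU that
 * reports fewer banks never lowers the value.
 */
static uint8_t update_bank_count(uint8_t system_max, uint64_t mcg_cap)
{
	uint8_t b = (uint8_t)(mcg_cap & MCG_BANKCNT_MASK);

	if (b > SKETCH_MAX_NR_BANKS)
		b = SKETCH_MAX_NR_BANKS;

	return b > system_max ? b : system_max;
}
```

With this rule the result does not depend on the order in which CPUs come up, which is what makes it safe to print the final value once in mcheck_late_init().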
[tip:ras/core] x86/MCE: Group AMD function prototypes in
Commit-ID: 9308fd4074551f222f30322d1ee8c5aff18e9747 Gitweb: https://git.kernel.org/tip/9308fd4074551f222f30322d1ee8c5aff18e9747 Author: Yazen Ghannam AuthorDate: Fri, 22 Mar 2019 20:29:00 + Committer: Borislav Petkov CommitDate: Sun, 24 Mar 2019 10:54:13 +0100 x86/MCE: Group AMD function prototypes in There are two groups of "ifdef CONFIG_X86_MCE_AMD" function prototypes in . Merge these two groups. No functional change. [ bp: align vertically. ] Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: Arnd Bergmann Cc: "cle...@gmail.com" Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Pu Wen Cc: Qiuxu Zhuo Cc: "ra...@milecki.pl" Cc: Thomas Gleixner Cc: Tony Luck Cc: Vishal Verma Cc: x86-ml Link: https://lkml.kernel.org/r/20190322202848.20749-3-yazen.ghan...@amd.com --- arch/x86/include/asm/mce.h | 25 +++-- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 22d05e3835f0..dc2d4b206ab7 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -210,16 +210,6 @@ static inline void cmci_rediscover(void) {} static inline void cmci_recheck(void) {} #endif -#ifdef CONFIG_X86_MCE_AMD -void mce_amd_feature_init(struct cpuinfo_x86 *c); -int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr); -#else -static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { } -static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; }; -#endif - -static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); } - int mce_available(struct cpuinfo_x86 *c); bool mce_is_memory_error(struct mce *m); bool mce_is_correctable(struct mce *m); @@ -345,12 +335,19 @@ extern bool amd_mce_is_memory_error(struct mce *m); extern int mce_threshold_create_device(unsigned int cpu); extern int mce_threshold_remove_device(unsigned int cpu); -#else +void mce_amd_feature_init(struct cpuinfo_x86 *c); +int 
umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr); -static inline int mce_threshold_create_device(unsigned int cpu) { return 0; }; -static inline int mce_threshold_remove_device(unsigned int cpu) { return 0; }; -static inline bool amd_mce_is_memory_error(struct mce *m) { return false; }; +#else +static inline int mce_threshold_create_device(unsigned int cpu) { return 0; }; +static inline int mce_threshold_remove_device(unsigned int cpu) { return 0; }; +static inline bool amd_mce_is_memory_error(struct mce *m) { return false; }; +static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { } +static inline int +umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr) { return -EINVAL; }; #endif +static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); } + #endif /* _ASM_X86_MCE_H */
[tip:ras/core] EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit
Commit-ID:  3f4da372ec8e4ce58c17ac4f2e3c8891bbfea17e
Gitweb:     https://git.kernel.org/tip/3f4da372ec8e4ce58c17ac4f2e3c8891bbfea17e
Author:     Yazen Ghannam
AuthorDate: Tue, 12 Feb 2019 21:24:28 +
Committer:  Borislav Petkov
CommitDate: Fri, 15 Feb 2019 14:25:58 +0100

EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit

Previous AMD systems have had a bit in MCA_STATUS to indicate that an
error was detected on a scrub operation. However, this bit was defined
differently within different banks and families/models.

Starting with Family 17h, MCA_STATUS[40] is either Reserved/Read-as-Zero
or defined as "Scrub", for all MCA banks and CPU models. Therefore, this
bit can be defined as the "Scrub" bit.

Define MCA_STATUS[40] as "Scrub" and decode it in the AMD MCE decoding
module for Family 17h and newer systems.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: James Morse
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190212212417.107049-1-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h | 1 +
 drivers/edac/mce_amd.c     | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 299a38536567..22d05e3835f0 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -48,6 +48,7 @@
 #define MCI_STATUS_SYNDV	BIT_ULL(53)  /* synd reg. valid */
 #define MCI_STATUS_DEFERRED	BIT_ULL(44)  /* uncorrected error, deferred exception */
 #define MCI_STATUS_POISON	BIT_ULL(43)  /* access poisonous data */
+#define MCI_STATUS_SCRUB	BIT_ULL(40)  /* Error detected during scrub operation */
 
 /*
  * McaX field if set indicates a given bank supports MCA extensions:
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index f286b880f981..b349c22bb386 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1078,6 +1078,9 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 	if (ecc)
 		pr_cont("|%sECC", ((ecc == 2) ? "C" : "U"));
 
+	if (fam >= 0x17)
+		pr_cont("|%s", (m->status & MCI_STATUS_SCRUB ? "Scrub" : "-"));
+
 	pr_cont("]: 0x%016llx\n", m->status);
 
 	if (m->status & MCI_STATUS_ADDRV)
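To make the family gating concrete, here is a small userspace sketch of the Scrub decode: the field is only emitted for Family 17h and newer, and on those systems it reads "Scrub" or "-" depending on MCA_STATUS[40]. scrub_field() and STATUS_SCRUB are hypothetical names for the example; returning NULL to mean "do not print this field" is a convention chosen here, not something the decoder does.

```c
#include <stddef.h>
#include <stdint.h>

#define STATUS_SCRUB	(1ULL << 40)	/* MCA_STATUS[40] */

/*
 * Returns the text for the Scrub field, or NULL when the field is not
 * printed at all (families older than 17h, where bit 40 has no uniform
 * meaning).
 */
static const char *scrub_field(unsigned int family, uint64_t status)
{
	if (family < 0x17)
		return NULL;

	return (status & STATUS_SCRUB) ? "Scrub" : "-";
}
```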
[tip:ras/core] EDAC/mce_amd: Decode MCA_STATUS in bit definition order
Commit-ID:  a0bcd3c0b8a52ba0eb74371fa6be15ad0390ba67
Gitweb:     https://git.kernel.org/tip/a0bcd3c0b8a52ba0eb74371fa6be15ad0390ba67
Author:     Yazen Ghannam
AuthorDate: Tue, 12 Feb 2019 21:24:29 +
Committer:  Borislav Petkov
CommitDate: Fri, 15 Feb 2019 14:36:31 +0100

EDAC/mce_amd: Decode MCA_STATUS in bit definition order

Sort the MCA_STATUS bits in decode output to follow how they are defined
in the register. The order is as follows:

  Bit | Decode
  62  | Over
  61  | UC
  59  | MiscV
  58  | AddrV
  57  | PCC
  55  | TCC
  53  | SyndV
  46  | CECC
  45  | UECC
  44  | Deferred
  43  | Poison
  40  | Scrub

 [ bp: Massage a bit. ]

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: Mauro Carvalho Chehab
Cc: linux-edac
Cc: x...@kernel.org
Link: https://lkml.kernel.org/r/20190212212417.107049-2-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 24
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index b349c22bb386..0a1814dad6cf 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1051,26 +1051,18 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 		((m->status & MCI_STATUS_UC) ? "UE" :
 		 (m->status & MCI_STATUS_DEFERRED) ? "-" : "CE"),
 		((m->status & MCI_STATUS_MISCV) ? "MiscV" : "-"),
-		((m->status & MCI_STATUS_PCC) ? "PCC" : "-"),
-		((m->status & MCI_STATUS_ADDRV) ? "AddrV" : "-"));
-
-	if (fam >= 0x15) {
-		pr_cont("|%s", (m->status & MCI_STATUS_DEFERRED ? "Deferred" : "-"));
-
-		/* F15h, bank4, bit 43 is part of McaStatSubCache. */
-		if (fam != 0x15 || m->bank != 4)
-			pr_cont("|%s", (m->status & MCI_STATUS_POISON ? "Poison" : "-"));
-	}
+		((m->status & MCI_STATUS_ADDRV) ? "AddrV" : "-"),
+		((m->status & MCI_STATUS_PCC) ? "PCC" : "-"));
 
 	if (boot_cpu_has(X86_FEATURE_SMCA)) {
 		u32 low, high;
 		u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
 
-		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
-
 		if (!rdmsr_safe(addr, &low, &high) &&
 		    (low & MCI_CONFIG_MCAX))
 			pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
+
+		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
 	}
 
 	/* do the two bits[14:13] together */
@@ -1078,6 +1070,14 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 	if (ecc)
 		pr_cont("|%sECC", ((ecc == 2) ? "C" : "U"));
 
+	if (fam >= 0x15) {
+		pr_cont("|%s", (m->status & MCI_STATUS_DEFERRED ? "Deferred" : "-"));
+
+		/* F15h, bank4, bit 43 is part of McaStatSubCache. */
+		if (fam != 0x15 || m->bank != 4)
+			pr_cont("|%s", (m->status & MCI_STATUS_POISON ? "Poison" : "-"));
+	}
+
 	if (fam >= 0x17)
 		pr_cont("|%s", (m->status & MCI_STATUS_SCRUB ? "Scrub" : "-"));
[tip:ras/core] EDAC, mce_amd: Print ExtErrorCode and description on a single line
Commit-ID:  1c1522d32ac49065f88e5a8b3d6e3a5613b20118
Gitweb:     https://git.kernel.org/tip/1c1522d32ac49065f88e5a8b3d6e3a5613b20118
Author:     Yazen Ghannam
AuthorDate: Fri, 1 Feb 2019 22:55:54 +
Committer:  Borislav Petkov
CommitDate: Mon, 4 Feb 2019 19:29:13 +0100

EDAC, mce_amd: Print ExtErrorCode and description on a single line

Save a log line by printing the extended error code and the description
on a single line. This is similar to how errors are printed in other
subsystems, e.g. "#, description".

If we don't have a valid description then only the number/code is
printed.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Tony Luck
Cc: x...@kernel.org
Link: https://lkml.kernel.org/r/20190201225534.8177-6-yazen.ghan...@amd.com
---
 drivers/edac/mce_amd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 7e29ceabdf6f..f286b880f981 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -965,13 +965,12 @@ static void decode_smca_error(struct mce *m)
 
 	ip_name = smca_get_long_name(bank_type);
 
-	pr_emerg(HW_ERR "%s Extended Error Code: %d\n", ip_name, xec);
+	pr_emerg(HW_ERR "%s Ext. Error Code: %d", ip_name, xec);
 
 	/* Only print the decode of valid error codes */
 	if (xec < smca_mce_descs[bank_type].num_descs &&
 			(hwid->xec_bitmap & BIT_ULL(xec))) {
-		pr_emerg(HW_ERR "%s Error: ", ip_name);
-		pr_cont("%s.\n", smca_mce_descs[bank_type].descs[xec]);
+		pr_cont(", %s.\n", smca_mce_descs[bank_type].descs[xec]);
 	}
 
 	if (bank_type == SMCA_UMC && xec == 0 && decode_dram_ecc)
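The resulting output shape can be sketched in userspace with snprintf(): the code is always printed, and the description is appended on the same line only when the code indexes a known description and its bit is set in the validity bitmap. format_ext_error() is a hypothetical helper, not part of the driver. Unlike the patch, which leaves the line open for pr_cont(), this sketch terminates the line itself when no description is available; that is an assumption made so the example produces complete lines.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format "<ip> Ext. Error Code: <n>, <description>." on one line, or just
 * the code when no valid description exists. Returns what snprintf() returns.
 */
static int format_ext_error(char *buf, size_t len, const char *ip_name,
			    unsigned int xec, const char *const *descs,
			    size_t num_descs, uint64_t xec_bitmap)
{
	/* Only decode error codes that are in range and marked valid. */
	if (xec < num_descs && xec < 64 && (xec_bitmap & (1ULL << xec)))
		return snprintf(buf, len, "%s Ext. Error Code: %u, %s.\n",
				ip_name, xec, descs[xec]);

	return snprintf(buf, len, "%s Ext. Error Code: %u\n", ip_name, xec);
}
```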
[tip:ras/core] EDAC, mce_amd: Match error descriptions to latest documentation
Commit-ID: e03447ee718b331be8f3abc388c7bf7d325dfab4 Gitweb: https://git.kernel.org/tip/e03447ee718b331be8f3abc388c7bf7d325dfab4 Author: Yazen Ghannam AuthorDate: Fri, 1 Feb 2019 22:55:53 + Committer: Borislav Petkov CommitDate: Sun, 3 Feb 2019 13:16:50 +0100 EDAC, mce_amd: Match error descriptions to latest documentation Update the error descriptions to match the latest documentation for easier searching. In some cases the changes are small and in other cases the changes may be total rewording of the description. No functional changes. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: linux-edac Cc: Mauro Carvalho Chehab Cc: Tony Luck Cc: x...@kernel.org Link: https://lkml.kernel.org/r/20190201225534.8177-5-yazen.ghan...@amd.com --- drivers/edac/mce_amd.c | 166 - 1 file changed, 83 insertions(+), 83 deletions(-) diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c index c79e650aa606..7e29ceabdf6f 100644 --- a/drivers/edac/mce_amd.c +++ b/drivers/edac/mce_amd.c @@ -151,74 +151,74 @@ static const char * const mc6_mce_desc[] = { /* Scalable MCA error strings */ static const char * const smca_ls_mce_desc[] = { - "Load queue parity", - "Store queue parity", - "Miss address buffer payload parity", - "L1 TLB parity", + "Load queue parity error", + "Store queue parity error", + "Miss address buffer payload parity error", + "Level 1 TLB parity error", "DC Tag error type 5", - "DC tag error type 6", - "DC tag error type 1", + "DC Tag error type 6", + "DC Tag error type 1", "Internal error type 1", "Internal error type 2", - "Sys Read data error thread 0", - "Sys read data error thread 1", - "DC tag error type 2", - "DC data error type 1 (poison consumption)", - "DC data error type 2", - "DC data error type 3", - "DC tag error type 4", - "L2 TLB parity", + "System Read Data Error Thread 0", + "System Read Data Error Thread 1", + "DC Tag error type 2", + "DC Data error type 1 and poison consumption", + "DC Data error type 2", + "DC Data error type 3", + 
"DC Tag error type 4", + "Level 2 TLB parity error", "PDC parity error", - "DC tag error type 3", - "DC tag error type 5", - "L2 fill data error", + "DC Tag error type 3", + "DC Tag error type 5", + "L2 Fill Data error", }; static const char * const smca_if_mce_desc[] = { - "microtag probe port parity error", - "IC microtag or full tag multi-hit error", - "IC full tag parity", - "IC data array parity", - "Decoupling queue phys addr parity error", - "L0 ITLB parity error", - "L1 ITLB parity error", - "L2 ITLB parity error", - "BPQ snoop parity on Thread 0", - "BPQ snoop parity on Thread 1", - "L1 BTB multi-match error", - "L2 BTB multi-match error", - "L2 Cache Response Poison error", - "System Read Data error", + "Op Cache Microtag Probe Port Parity Error", + "IC Microtag or Full Tag Multi-hit Error", + "IC Full Tag Parity Error", + "IC Data Array Parity Error", + "Decoupling Queue PhysAddr Parity Error", + "L0 ITLB Parity Error", + "L1 ITLB Parity Error", + "L2 ITLB Parity Error", + "BPQ Thread 0 Snoop Parity Error", + "BPQ Thread 1 Snoop Parity Error", + "L1 BTB Multi-Match Error", + "L2 BTB Multi-Match Error", + "L2 Cache Response Poison Error", + "System Read Data Error", }; static const char * const smca_l2_mce_desc[] = { - "L2M tag multi-way-hit error", - "L2M tag ECC error", - "L2M data ECC error", - "HW assert", + "L2M Tag Multiple-Way-Hit error", + "L2M Tag or State Array ECC Error", + "L2M Data Array ECC Error", + "Hardware Assert Error", }; static const char * const smca_de_mce_desc[] = { - "uop cache tag parity error", - "uop cache data parity error", - "Insn buffer parity error", - "uop queue parity error",
[tip:ras/core] x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units
Commit-ID: 3ad7e748c12cc771df6020a552def3e1727e8a17 Gitweb: https://git.kernel.org/tip/3ad7e748c12cc771df6020a552def3e1727e8a17 Author: Yazen Ghannam AuthorDate: Fri, 1 Feb 2019 22:55:52 + Committer: Borislav Petkov CommitDate: Sun, 3 Feb 2019 13:01:57 +0100 x86/MCE/AMD, EDAC/mce_amd: Add new McaTypes for CS, PSP, and SMU units The existing CS, PSP, and SMU SMCA bank types will see new versions (as indicated by their McaTypes) in future SMCA systems. Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the same names as the older versions, since they are logically the same to the user. SMCA systems won't mix and match IP blocks with different McaType versions in the same system, so there isn't a need to distinguish them. The MCA_IPID register is saved when logging an MCA error, and that can be used to triage the error. Also, add the new error descriptions to edac_mce_amd. Some error types (positions in the list) are overloaded compared to the previous McaTypes. Therefore, just create new lists of the error descriptions to keep things simple even if some of the error descriptions are the same between versions. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: Arnd Bergmann Cc: "H. 
Peter Anvin" Cc: Ingo Molnar Cc: Kees Cook Cc: linux-edac Cc: Mauro Carvalho Chehab Cc: Pu Wen Cc: Qiuxu Zhuo Cc: Shirish S Cc: Thomas Gleixner Cc: Tony Luck Cc: Vishal Verma Cc: x86-ml Link: https://lkml.kernel.org/r/20190201225534.8177-3-yazen.ghan...@amd.com --- arch/x86/include/asm/mce.h| 3 +++ arch/x86/kernel/cpu/mce/amd.c | 6 + drivers/edac/mce_amd.c| 55 +++ 3 files changed, 64 insertions(+) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index 91b65d859ca8..299a38536567 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -307,11 +307,14 @@ enum smca_bank_types { SMCA_FP,/* Floating Point */ SMCA_L3_CACHE, /* L3 Cache */ SMCA_CS,/* Coherent Slave */ + SMCA_CS_V2, /* Coherent Slave */ SMCA_PIE, /* Power, Interrupts, etc. */ SMCA_UMC, /* Unified Memory Controller */ SMCA_PB,/* Parameter Block */ SMCA_PSP, /* Platform Security Processor */ + SMCA_PSP_V2,/* Platform Security Processor */ SMCA_SMU, /* System Management Unit */ + SMCA_SMU_V2,/* System Management Unit */ SMCA_MP5, /* Microprocessor 5 Unit */ SMCA_NBIO, /* Northbridge IO Unit */ SMCA_PCIE, /* PCI Express Unit */ diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 00f60b8c7e4f..bd1331b241ca 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -88,11 +88,14 @@ static struct smca_bank_name smca_names[] = { [SMCA_FP] = { "floating_point", "Floating Point Unit" }, [SMCA_L3_CACHE] = { "l3_cache", "L3 Cache" }, [SMCA_CS] = { "coherent_slave", "Coherent Slave" }, + [SMCA_CS_V2]= { "coherent_slave", "Coherent Slave" }, [SMCA_PIE] = { "pie", "Power, Interrupts, etc." 
}, [SMCA_UMC] = { "umc", "Unified Memory Controller" }, [SMCA_PB] = { "param_block", "Parameter Block" }, [SMCA_PSP] = { "psp", "Platform Security Processor" }, + [SMCA_PSP_V2] = { "psp", "Platform Security Processor" }, [SMCA_SMU] = { "smu", "System Management Unit" }, + [SMCA_SMU_V2] = { "smu", "System Management Unit" }, [SMCA_MP5] = { "mp5", "Microprocessor 5 Unit" }, [SMCA_NBIO] = { "nbio", "Northbridge IO Unit" }, [SMCA_PCIE] = { "pcie", "PCI Express Unit" }, @@ -153,6 +156,7 @@ static struct smca_hwid smca_hwid_mcatypes[] = { /* Data Fabric MCA types */ { SMCA_CS, HWID_MCATYPE(0x2E, 0x0), 0x1FF }, { SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0xF }, + { SMCA_CS_V2,HWID_MCATYPE(0x2E, 0x2), 0x3FFF }, /* Unified Memory Controller MCA type */ { SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0x3F }, @@ -162,9 +166,11 @@ static struct smca_hwid smca_hwid_mcatypes[] = { /* Platform Security Processor MCA type */ { SMCA_PSP, HWID_MCATYPE(0xFF, 0x0), 0x1 }, + { SMCA_PSP_V2, HWID_MCATYPE(0xFF, 0x1), 0x3 }, /* System Management Unit MCA type */ { SMCA_SMU, HWID_MCATYPE(0x01, 0x0), 0x1 }, + { SMCA_SMU_V2, HWID_MCA
[tip:ras/core] x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA bank types
Commit-ID:  8a5dd2cd2f2e94878cacc969655a69ca214795ab
Gitweb:     https://git.kernel.org/tip/8a5dd2cd2f2e94878cacc969655a69ca214795ab
Author:     Yazen Ghannam
AuthorDate: Fri, 1 Feb 2019 22:55:52 +
Committer:  Borislav Petkov
CommitDate: Sun, 3 Feb 2019 13:05:16 +0100

x86/MCE/AMD, EDAC/mce_amd: Add new error descriptions for some SMCA bank types

Some SMCA bank types on future systems will report new error types even
though the bank type is not treated as a new version. These new error
types will be reported by bits that are reserved in past systems.

Add the new error descriptions to the lists in edac_mce_amd.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Kees Cook
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Shirish S
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190201225534.8177-4-yazen.ghan...@amd.com
---
 arch/x86/kernel/cpu/mce/amd.c | 8
 drivers/edac/mce_amd.c        | 6 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index bd1331b241ca..e64de5149e50 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -144,22 +144,22 @@ static struct smca_hwid smca_hwid_mcatypes[] = {
 	{ SMCA_RESERVED, HWID_MCATYPE(0x00, 0x0), 0x0 },

 	/* ZN Core (HWID=0xB0) MCA types */
-	{ SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1FFFEF },
+	{ SMCA_LS, HWID_MCATYPE(0xB0, 0x0), 0x1FFFFF },
 	{ SMCA_IF, HWID_MCATYPE(0xB0, 0x1), 0x3FFF },
 	{ SMCA_L2_CACHE, HWID_MCATYPE(0xB0, 0x2), 0xF },
 	{ SMCA_DE, HWID_MCATYPE(0xB0, 0x3), 0x1FF },
 	/* HWID 0xB0 MCATYPE 0x4 is Reserved */
-	{ SMCA_EX, HWID_MCATYPE(0xB0, 0x5), 0x7FF },
+	{ SMCA_EX, HWID_MCATYPE(0xB0, 0x5), 0xFFF },
 	{ SMCA_FP, HWID_MCATYPE(0xB0, 0x6), 0x7F },
 	{ SMCA_L3_CACHE, HWID_MCATYPE(0xB0, 0x7), 0xFF },

 	/* Data Fabric MCA types */
 	{ SMCA_CS, HWID_MCATYPE(0x2E, 0x0), 0x1FF },
-	{ SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0xF },
+	{ SMCA_PIE, HWID_MCATYPE(0x2E, 0x1), 0x1F },
 	{ SMCA_CS_V2, HWID_MCATYPE(0x2E, 0x2), 0x3FFF },

 	/* Unified Memory Controller MCA type */
-	{ SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0x3F },
+	{ SMCA_UMC, HWID_MCATYPE(0x96, 0x0), 0xFF },

 	/* Parameter Block MCA type */
 	{ SMCA_PB, HWID_MCATYPE(0x05, 0x0), 0x1 },
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 184c90172d17..c79e650aa606 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -155,7 +155,7 @@ static const char * const smca_ls_mce_desc[] = {
 	"Store queue parity",
 	"Miss address buffer payload parity",
 	"L1 TLB parity",
-	"Reserved",
+	"DC Tag error type 5",
 	"DC tag error type 6",
 	"DC tag error type 1",
 	"Internal error type 1",
@@ -222,6 +222,7 @@ static const char * const smca_ex_mce_desc[] = {
 	"Retire status queue parity error",
 	"Scheduling queue parity error",
 	"Branch buffer queue parity error",
+	"Hardware Assertion error",
 };

 static const char * const smca_fp_mce_desc[] = {
@@ -279,6 +280,7 @@ static const char * const smca_pie_mce_desc[] = {
 	"Internal PIE register security violation",
 	"Error on GMI link",
 	"Poison data written to internal PIE register",
+	"A deferred error was detected in the DF"
 };

 static const char * const smca_umc_mce_desc[] = {
@@ -288,6 +290,8 @@ static const char * const smca_umc_mce_desc[] = {
 	"Advanced peripheral bus error",
 	"Command/address parity error",
 	"Write data CRC error",
+	"DCQ SRAM ECC error",
+	"AES SRAM ECC error",
 };

 static const char * const smca_pb_mce_desc[] = {
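Editor's note: the third field of each smca_hwid_mcatypes[] entry touched above appears to be a bitmap of valid extended error codes, where bit N set means code N is defined for that bank type. A minimal userspace sketch of that check, assuming this reading of the field, could look like the following. xec_is_valid() and the *_XEC_BITMAP_* names are hypothetical and are not kernel symbols; the mask values are the SMCA_PIE and SMCA_UMC values from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old and new bitmaps, copied from the diff above. */
#define PIE_XEC_BITMAP_OLD 0x0Fu /* codes 0-3 */
#define PIE_XEC_BITMAP_NEW 0x1Fu /* codes 0-4 */
#define UMC_XEC_BITMAP_OLD 0x3Fu /* codes 0-5 */
#define UMC_XEC_BITMAP_NEW 0xFFu /* codes 0-7 */

/* Hypothetical helper: true if extended error code 'xec' is marked valid. */
static bool xec_is_valid(uint32_t bitmap, unsigned int xec)
{
	return xec < 32 && ((bitmap >> xec) & 1u);
}
```

With these values, code 4 on a PIE bank is rejected by the old bitmap and accepted by the new one, which lines up with the single description string added to smca_pie_mce_desc[].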
[tip:ras/core] x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types
Commit-ID:  cbfa447edd6a3825fdb8a4ffae74ff7208f2d2c0
Gitweb:     https://git.kernel.org/tip/cbfa447edd6a3825fdb8a4ffae74ff7208f2d2c0
Author:     Yazen Ghannam
AuthorDate: Fri, 1 Feb 2019 22:55:51 +
Committer:  Borislav Petkov
CommitDate: Sun, 3 Feb 2019 13:01:44 +0100

x86/MCE/AMD, EDAC/mce_amd: Add new MP5, NBIO, and PCIE SMCA bank types

Add the (HWID, MCATYPE) tuples and names for the new MP5, NBIO, and PCIE
SMCA bank types.

Also, add their respective error descriptions to the MCE decoding module
edac_mce_amd.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: Arnd Bergmann
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Kees Cook
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: Shirish S
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190201225534.8177-2-yazen.ghan...@amd.com
---
 arch/x86/include/asm/mce.h    | 3 +++
 arch/x86/kernel/cpu/mce/amd.c | 12
 drivers/edac/mce_amd.c        | 32
 3 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index c1a812bd5a27..91b65d859ca8 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -312,6 +312,9 @@ enum smca_bank_types {
 	SMCA_PB,	/* Parameter Block */
 	SMCA_PSP,	/* Platform Security Processor */
 	SMCA_SMU,	/* System Management Unit */
+	SMCA_MP5,	/* Microprocessor 5 Unit */
+	SMCA_NBIO,	/* Northbridge IO Unit */
+	SMCA_PCIE,	/* PCI Express Unit */
 	N_SMCA_BANK_TYPES
 };

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index ed3327342b40..00f60b8c7e4f 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -93,6 +93,9 @@ static struct smca_bank_name smca_names[] = {
 	[SMCA_PB] = { "param_block", "Parameter Block" },
 	[SMCA_PSP] = { "psp", "Platform Security Processor" },
 	[SMCA_SMU] = { "smu", "System Management Unit" },
+	[SMCA_MP5] = { "mp5", "Microprocessor 5 Unit" },
+	[SMCA_NBIO] = { "nbio", "Northbridge IO Unit" },
+	[SMCA_PCIE] = { "pcie", "PCI Express Unit" },
 };

 static u32 smca_bank_addrs[MAX_NR_BANKS][NR_BLOCKS] __ro_after_init =
@@ -162,6 +165,15 @@ static struct smca_hwid smca_hwid_mcatypes[] = {

 	/* System Management Unit MCA type */
 	{ SMCA_SMU, HWID_MCATYPE(0x01, 0x0), 0x1 },
+
+	/* Microprocessor 5 Unit MCA type */
+	{ SMCA_MP5, HWID_MCATYPE(0x01, 0x2), 0x3FF },
+
+	/* Northbridge IO Unit MCA type */
+	{ SMCA_NBIO, HWID_MCATYPE(0x18, 0x0), 0x1F },
+
+	/* PCI Express Unit MCA type */
+	{ SMCA_PCIE, HWID_MCATYPE(0x46, 0x0), 0x1F },
 };

 struct smca_bank smca_banks[MAX_NR_BANKS];
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index c605089d899f..5ab4ab3f0ce6 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -285,6 +285,35 @@ static const char * const smca_smu_mce_desc[] = {
 	"SMU RAM ECC or parity error",
 };

+static const char * const smca_mp5_mce_desc[] = {
+	"High SRAM ECC or parity error",
+	"Low SRAM ECC or parity error",
+	"Data Cache Bank A ECC or parity error",
+	"Data Cache Bank B ECC or parity error",
+	"Data Tag Cache Bank A ECC or parity error",
+	"Data Tag Cache Bank B ECC or parity error",
+	"Instruction Cache Bank A ECC or parity error",
+	"Instruction Cache Bank B ECC or parity error",
+	"Instruction Tag Cache Bank A ECC or parity error",
+	"Instruction Tag Cache Bank B ECC or parity error",
+};
+
+static const char * const smca_nbio_mce_desc[] = {
+	"ECC or Parity error",
+	"PCIE error",
+	"SDP ErrEvent error",
+	"SDP Egress Poison Error",
+	"IOHC Internal Poison Error",
+};
+
+static const char * const smca_pcie_mce_desc[] = {
+	"CCIX PER Message logging",
+	"CCIX Read Response with Status: Non-Data Error",
+	"CCIX Write Response with Status: Non-Data Error",
+	"CCIX Read Response with Status: Data Error",
+	"CCIX Non-okay write response with data error",
+};
+
 struct smca_mce_desc {
 	const char * const *descs;
 	unsigned int num_descs;
@@ -304,6 +333,9 @@ static struct smca_mce_desc smca_mce_descs[] = {
 	[SMCA_PB] = { smca_pb_mce_desc, ARRAY_SIZE(smca_pb_mce_desc) },
 	[SMCA_PSP] = { smca_psp_mce_desc, ARRAY_SIZE(smca_psp_mce_d
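Editor's note: the patch above pairs each description array with its length in a table indexed by bank type. The sketch below shows how such a table can be used for a bounds-checked lookup in plain userspace C. It is an illustration only: enum bank_type, struct mce_desc, and describe_error() are hypothetical names, not the decoder's real interface, and only the NBIO and PCIE strings are taken from the patch.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical two-entry stand-in for enum smca_bank_types. */
enum bank_type { BANK_NBIO, BANK_PCIE, N_BANK_TYPES };

static const char * const nbio_desc[] = {
	"ECC or Parity error",
	"PCIE error",
	"SDP ErrEvent error",
	"SDP Egress Poison Error",
	"IOHC Internal Poison Error",
};

static const char * const pcie_desc[] = {
	"CCIX PER Message logging",
	"CCIX Read Response with Status: Non-Data Error",
	"CCIX Write Response with Status: Non-Data Error",
	"CCIX Read Response with Status: Data Error",
	"CCIX Non-okay write response with data error",
};

struct mce_desc {
	const char * const *descs;
	unsigned int num_descs;
};

/* One entry per bank type: pointer to the strings plus their count. */
static const struct mce_desc mce_descs[N_BANK_TYPES] = {
	[BANK_NBIO] = { nbio_desc, ARRAY_SIZE(nbio_desc) },
	[BANK_PCIE] = { pcie_desc, ARRAY_SIZE(pcie_desc) },
};

/* Return the description for (type, xec), or NULL when out of range. */
static const char *describe_error(enum bank_type type, unsigned int xec)
{
	if ((unsigned int)type >= N_BANK_TYPES)
		return NULL;
	if (xec >= mce_descs[type].num_descs)
		return NULL;
	return mce_descs[type].descs[xec];
}
```

Storing the count next to the pointer means a new string appended to an array is picked up by the range check without touching the lookup code.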
[PATCH 2/2] x86/MCE/AMD: Skip creating kobjects with NULL names
From: Yazen Ghannam

During mce_threshold_create_device() data structures are allocated for
each CPU's MCA banks and thresholding blocks. These data structures are
used to save information related to AMD's MCA Error Thresholding
feature. The structures are used in the thresholding interrupt handler,
and they are exposed to the user through sysfs. The sysfs interface has
user-friendly names for each bank.

However, errors in mce_threshold_create_device() will cause all the data
structures to be deallocated. This will break the thresholding interrupt
handler since it depends on these structures.

One possible error is creating a kobject with a NULL name. This will
happen if a bank exists on a system that doesn't have a name, e.g. new
bank types on future systems.

Skip creating kobjects for banks without a name. This means that the
sysfs interface for this bank will not exist. But this will keep all the
data structures allocated, so the thresholding interrupt handler will
work, even for the unnamed bank. Also, the sysfs interface will still be
populated for all existing, known bank types.

Cc: # 4.13.x
Signed-off-by: Yazen Ghannam
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 2dbf34250bbf..521fd8f406df 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -1130,6 +1130,7 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
 	struct threshold_block *b = NULL;
 	u32 low, high;
 	int err;
+	const char *name = NULL;

 	if ((bank >= mca_cfg.banks) || (block >= NR_BLOCKS))
 		return 0;
@@ -1176,9 +1177,13 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
 		per_cpu(threshold_banks, cpu)[bank]->blocks = b;
 	}

+	name = get_name(bank, b);
+	if (!name)
+		goto recurse;
+
 	err = kobject_init_and_add(&b->kobj, &threshold_ktype,
 				   per_cpu(threshold_banks, cpu)[bank]->kobj,
-				   get_name(bank, b));
+				   name);
 	if (err)
 		goto out_free;
 recurse:
@@ -1265,12 +1270,16 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
 		goto out;
 	}

+	if (!name)
+		goto allocate;
+
 	b->kobj = kobject_create_and_add(name, &dev->kobj);
 	if (!b->kobj) {
 		err = -EINVAL;
 		goto out_free;
 	}

+allocate:
 	per_cpu(threshold_banks, cpu)[bank] = b;

 	if (is_shared_bank(bank)) {
-- 
2.17.1
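Editor's note: the idea in this patch, keep the data structure but skip the user-visible object when no name is available, can be sketched outside the kernel as follows. Everything here is hypothetical scaffolding: struct bank, expose_bank() (a stand-in for kobject creation that fails on a NULL name), and create_bank() are illustrative names, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct bank {
	const char *name;	/* NULL for a bank type without a known name */
	bool exposed;		/* stands in for "has a sysfs kobject" */
};

/* Hypothetical stand-in for kobject creation; fails on a NULL name. */
static int expose_bank(struct bank *b)
{
	if (!b->name)
		return -1;
	b->exposed = true;
	return 0;
}

/* Allocate a bank; skip exposure (rather than fail) when it has no name. */
static struct bank *create_bank(const char *name)
{
	struct bank *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;

	b->name = name;
	if (!name)
		return b;	/* keep the data, skip the user-visible object */

	if (expose_bank(b)) {
		free(b);
		return NULL;
	}
	return b;
}
```

A caller gets a usable structure in both cases; only the named bank is exposed, which is the behaviour the commit message describes for unnamed banks.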
[PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler
From: Yazen Ghannam

If threshold_init_device() fails then per_cpu(threshold_banks) will be
deallocated. The thresholding interrupt handler will still be active, so
it's possible to get a NULL pointer dereference if a THR interrupt
happens and any of the structures are NULL.

Exit the handler if per_cpu(threshold_banks) is NULL and skip NULL
banks. MCA error information will still be in the registers. The
information will be logged during polling or in another MCA exception or
interrupt handler.

Fixes: 17ef4af0ec0f ("x86/mce/AMD: Use saved threshold block info in interrupt handler")
Cc: # 4.13.x
Signed-off-by: Yazen Ghannam
---
 arch/x86/kernel/cpu/mcheck/mce_amd.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index dd33c357548f..2dbf34250bbf 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -934,13 +934,21 @@ static void log_and_reset_block(struct threshold_block *block)
 static void amd_threshold_interrupt(void)
 {
 	struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
+	struct threshold_bank *th_bank = NULL;
 	unsigned int bank, cpu = smp_processor_id();

+	if (!per_cpu(threshold_banks, cpu))
+		return;
+
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (!(per_cpu(bank_map, cpu) & (1 << bank)))
 			continue;

-		first_block = per_cpu(threshold_banks, cpu)[bank]->blocks;
+		th_bank = per_cpu(threshold_banks, cpu)[bank];
+		if (!th_bank)
+			continue;
+
+		first_block = th_bank->blocks;
 		if (!first_block)
 			continue;

-- 
2.17.1
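Editor's note: the three levels of NULL checking added to the handler (the per-CPU array, each bank, each bank's first block) can be illustrated with a small userspace sketch. The types and handle_banks() below are hypothetical simplifications; NR_BANKS is an arbitrary value chosen for the example.

```c
#include <stddef.h>

#define NR_BANKS 4	/* arbitrary size for this sketch */

struct block { int count; };
struct bank { struct block *blocks; };

/*
 * Count the banks that have a usable first block, tolerating a missing
 * bank array, missing banks, and banks without blocks.
 */
static int handle_banks(struct bank **banks)
{
	int handled = 0;

	if (!banks)
		return 0;

	for (int i = 0; i < NR_BANKS; i++) {
		struct bank *b = banks[i];

		if (!b || !b->blocks)
			continue;
		handled++;
	}
	return handled;
}
```

Each check guards exactly one dereference, so a partially torn-down set of structures is skipped instead of crashing the handler.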
[PATCH] x86/mce: Handle varying MCA bank counts
From: Yazen Ghannam

Linux reads MCG_CAP[Count] to find the number of MCA banks visible to a
CPU. Currently, this is assumed to be the same for all CPUs and a
warning is shown if there is a difference. The number of banks is
overwritten with the MCG_CAP[Count] value of each following CPU that
boots.

According to the Intel SDM and AMD APM, the MCG_CAP[Count] value gives
the number of banks that are available to a "processor implementation".
The AMD BKDGs/PPRs further clarify that this value is per core. This
value has historically been the same for every core in the system, but
that is not an architectural requirement.

Future AMD systems may have different MCG_CAP[Count] values per core, so
the assumption that all CPUs will have the same MCG_CAP[Count] value
will no longer be valid.

Also, the first CPU to boot will allocate the struct mce_banks[] array
using the number of banks based on its MCG_CAP[Count] value. The machine
check handler and other functions use the global number of banks to
iterate and index into the mce_banks[] array. So it's possible to use an
out-of-bounds index on an asymmetric system where a following CPU sees a
MCG_CAP[Count] value greater than its predecessors.

For example, CPU0 sees MCG_CAP[Count]=2. It sets mca_cfg.banks=2 and
allocates mce_banks[] with 2 elements. CPU1 sees MCG_CAP[Count]=3 and
sets mca_cfg.banks=3, but mce_banks[] is already allocated and still has
only 2 elements.

Allocate the mce_banks[] array to the maximum number of banks. This will
avoid the potential out-of-bounds index since we cap the value of
mca_cfg.banks to MAX_NR_BANKS.

Set the value of mca_cfg.banks equal to the max of the previous value
and the value for the current CPU. This way mca_cfg.banks will always
represent the max number of banks detected on any CPU in the system.
This will ensure that all CPUs will access all the banks that are
visible to them. A CPU that can access fewer than the max number of
banks will find the registers of the extra banks to be read-as-zero.

Print the number of MCA banks that we're using. Do this in
mcheck_late_init() so that we print the final value after all CPUs have
been initialized.

Get bank count from target CPU when doing injection with mce-inject
module.

Signed-off-by: Yazen Ghannam
---
 arch/x86/kernel/cpu/mcheck/mce-inject.c | 14 +++---
 arch/x86/kernel/cpu/mcheck/mce.c        | 21 +++--
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c b/arch/x86/kernel/cpu/mcheck/mce-inject.c
index c805a06e14c3..5dda56d56dd3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -46,8 +46,6 @@ static struct mce i_mce;
 static struct dentry *dfs_inj;

-static u8 n_banks;
-
 #define MAX_FLAG_OPT_SIZE	4
 #define NBCFG			0x44

@@ -567,9 +565,15 @@ static void do_inject(void)
 static int inj_bank_set(void *data, u64 val)
 {
 	struct mce *m = (struct mce *)data;
+	u64 cap;
+	u8 n_banks;
+
+	/* Get bank count on target CPU so we can handle non-uniform values. */
+	rdmsrl_on_cpu(m->extcpu, MSR_IA32_MCG_CAP, &cap);
+	n_banks = cap & MCG_BANKCNT_MASK;

 	if (val >= n_banks) {
-		pr_err("Non-existent MCE bank: %llu\n", val);
+		pr_err("MCA bank %llu non-existent on CPU%d\n", val, m->extcpu);
 		return -EINVAL;
 	}

@@ -659,10 +663,6 @@ static struct dfs_node {
 static int __init debugfs_init(void)
 {
 	unsigned int i;
-	u64 cap;
-
-	rdmsrl(MSR_IA32_MCG_CAP, cap);
-	n_banks = cap & MCG_BANKCNT_MASK;

 	dfs_inj = debugfs_create_dir("mce-inject", NULL);
 	if (!dfs_inj)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 4b767284b7f5..4238c65a0cce 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1479,13 +1479,12 @@ EXPORT_SYMBOL_GPL(mce_notify_irq);
 static int __mcheck_cpu_mce_banks_init(void)
 {
 	int i;
-	u8 num_banks = mca_cfg.banks;

-	mce_banks = kcalloc(num_banks, sizeof(struct mce_bank), GFP_KERNEL);
+	mce_banks = kcalloc(MAX_NR_BANKS, sizeof(struct mce_bank), GFP_KERNEL);
 	if (!mce_banks)
 		return -ENOMEM;

-	for (i = 0; i < num_banks; i++) {
+	for (i = 0; i < MAX_NR_BANKS; i++) {
 		struct mce_bank *b = &mce_banks[i];

 		b->ctl = -1ULL;
@@ -1499,24 +1498,16 @@ static int __mcheck_cpu_mce_banks_init(void)
  */
 static int __mcheck_cpu_cap_init(void)
 {
-	unsigned b;
+	u8 b;
 	u64 cap;

 	rdmsrl(MSR_IA32_MCG_CAP, cap);

 	b = cap & MCG_BANKCNT_MASK;
-	if (!mca_cfg.banks)
-		pr_info("CPU supports %d MCE banks\n", b);
-
-	if (b > MAX_NR_BANKS) {
-		pr_warn("Using only
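Editor's note: the commit message describes folding each CPU's MCG_CAP[Count] into a global maximum that is capped at MAX_NR_BANKS. The userspace sketch below walks through the CPU0/CPU1 example from the message. It is not the kernel code: cpu_cap_init() and nr_banks are hypothetical names, the MAX_NR_BANKS value of 32 is an assumption for this sketch, and the mask assumes the count lives in bits 7:0 of MCG_CAP.

```c
#include <stdint.h>

#define MCG_BANKCNT_MASK 0xffu	/* MCG_CAP[Count], bits 7:0 */
#define MAX_NR_BANKS     32u	/* assumed cap for this sketch */

/* Global bank count: the largest value seen on any CPU so far. */
static uint8_t nr_banks;

/* Fold one CPU's MCG_CAP value into the global maximum. */
static void cpu_cap_init(uint64_t mcg_cap)
{
	uint8_t b = (uint8_t)(mcg_cap & MCG_BANKCNT_MASK);

	if (b > MAX_NR_BANKS)
		b = MAX_NR_BANKS;
	if (b > nr_banks)
		nr_banks = b;
}
```

Calling cpu_cap_init(2) and then cpu_cap_init(3) leaves nr_banks at 3, and a later CPU reporting 2 does not lower it. Because the array is sized to MAX_NR_BANKS up front, no value that nr_banks can take indexes past its end.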
[tip:efi/core] efi: Decode IA32/X64 Context Info structure
Commit-ID:  9c178663cbf2e754be322505078306b4a380a697
Gitweb:     https://git.kernel.org/tip/9c178663cbf2e754be322505078306b4a380a697
Author:     Yazen Ghannam
AuthorDate: Fri, 4 May 2018 07:59:56 +0200
Committer:  Ingo Molnar
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode IA32/X64 Context Info structure

Print the fields of the IA32/X64 Context Information structure.

Print the "Register Array" as raw values. Some context types are defined
in the UEFI spec, so more detailed decoding may be added in the future.

Based on UEFI 2.7 section N.2.4.2.2 IA32/X64 Processor Context
Information Structure.

Signed-off-by: Yazen Ghannam
Signed-off-by: Ard Biesheuvel
Cc: Linus Torvalds
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-11-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar
---
 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 356b8d326219..2531de49f56c 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -10,6 +10,7 @@
 #define VALID_LAPIC_ID			BIT_ULL(0)
 #define VALID_CPUID_INFO		BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)	(((bits) & GENMASK_ULL(7, 2)) >> 2)
+#define VALID_PROC_CXT_INFO_NUM(bits)	(((bits) & GENMASK_ULL(13, 8)) >> 8)

 #define INFO_ERR_STRUCT_TYPE_CACHE					\
 	GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,	\
@@ -71,6 +72,9 @@
 #define CHECK_MS_RESTARTABLE_IP		BIT_ULL(22)
 #define CHECK_MS_OVERFLOW		BIT_ULL(23)

+#define CTX_TYPE_MSR			1
+#define CTX_TYPE_MMREG			7
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -134,6 +138,17 @@ static const char * const ia_check_ms_error_type_strs[] = {
 	"Internal Unclassified",
 };

+static const char * const ia_reg_ctx_strs[] = {
+	"Unclassified Data",
+	"MSR Registers (Machine Check and other MSRs)",
+	"32-bit Mode Execution Context",
+	"64-bit Mode Execution Context",
+	"FXSAVE Context",
+	"32-bit Mode Debug Registers (DR0-DR7)",
+	"64-bit Mode Debug Registers (DR0-DR7)",
+	"Memory Mapped Registers",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
 	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -242,6 +257,7 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
 	int i;
 	struct cper_ia_err_info *err_info;
+	struct cper_ia_proc_ctx *ctx_info;
 	char newpfx[64], infopfx[64];
 	u8 err_type;

@@ -305,4 +321,36 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)

 		err_info++;
 	}
+
+	ctx_info = (struct cper_ia_proc_ctx *)err_info;
+	for (i = 0; i < VALID_PROC_CXT_INFO_NUM(proc->validation_bits); i++) {
+		int size = sizeof(*ctx_info) + ctx_info->reg_arr_size;
+		int groupsize = 4;
+
+		printk("%sContext Information Structure %d:\n", pfx, i);
+
+		printk("%sRegister Context Type: %s\n", newpfx,
+		       ctx_info->reg_ctx_type < ARRAY_SIZE(ia_reg_ctx_strs) ?
+		       ia_reg_ctx_strs[ctx_info->reg_ctx_type] : "unknown");
+
+		printk("%sRegister Array Size: 0x%04x\n", newpfx,
+		       ctx_info->reg_arr_size);
+
+		if (ctx_info->reg_ctx_type == CTX_TYPE_MSR) {
+			groupsize = 8; /* MSRs are 8 bytes wide. */
+			printk("%sMSR Address: 0x%08x\n", newpfx,
+			       ctx_info->msr_addr);
+		}
+
+		if (ctx_info->reg_ctx_type == CTX_TYPE_MMREG) {
+			printk("%sMM Register Address: 0x%016llx\n", newpfx,
+			       ctx_info->mm_reg_addr);
+		}
+
+		printk("%sRegister Array:\n", newpfx);
+		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, groupsize,
+			       (ctx_info + 1), ctx_info->reg_arr_size, 0);
+
+		ctx_info = (struct cper_ia_proc_ctx *)((long)ctx_info + size);
+	}
 }
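Editor's note: the loop above steps through back-to-back records, each a fixed header followed by reg_arr_size bytes of register data. The following userspace sketch shows that walking pattern on a byte buffer. It departs from the patch in two labeled ways: struct ctx_hdr is a simplified, hypothetical header (the real structure has more fields), and the sketch adds a length check that the patch itself does not perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical header; not the real CPER layout. */
struct ctx_hdr {
	uint16_t reg_ctx_type;
	uint16_t reg_arr_size;	/* bytes of register data after the header */
};

/*
 * Count up to 'max' records in buf, stopping early if a header or its
 * register array would run past 'len' bytes.
 */
static unsigned int count_ctx_records(const uint8_t *buf, size_t len,
				      unsigned int max)
{
	unsigned int n = 0;
	size_t off = 0;

	while (n < max && len - off >= sizeof(struct ctx_hdr)) {
		struct ctx_hdr hdr;

		/* Copy the header out to avoid unaligned access. */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.reg_arr_size > len - off - sizeof(hdr))
			break;

		/* Next record starts right after this one's register array. */
		off += sizeof(hdr) + hdr.reg_arr_size;
		n++;
	}
	return n;
}
```

The advance step mirrors the patch's `size = sizeof(*ctx_info) + ctx_info->reg_arr_size` computation; the early break is the extra safety net for truncated input.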
[tip:efi/core] efi: Decode IA32/X64 MS Check structure
Commit-ID:  a32bc29ed19776ef6827d6336847de9a0b7a8dc5
Gitweb:     https://git.kernel.org/tip/a32bc29ed19776ef6827d6336847de9a0b7a8dc5
Author:     Yazen Ghannam
AuthorDate: Fri, 4 May 2018 07:59:55 +0200
Committer:  Ingo Molnar
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode IA32/X64 MS Check structure

The IA32/X64 MS Check structure varies from the other Check structures
in the bit positions of its fields, and it includes an additional
"Error Type" field.

Decode the MS Check structure in a separate function.

Based on UEFI 2.7 Table 257. IA32/X64 MS Check Field Description.

Signed-off-by: Yazen Ghannam
Signed-off-by: Ard Biesheuvel
Cc: Linus Torvalds
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-10-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar
---
 drivers/firmware/efi/cper-x86.c | 55 -
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5e6716564dba..356b8d326219 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -57,6 +57,20 @@
 #define CHECK_BUS_TIME_OUT		BIT_ULL(32)
 #define CHECK_BUS_ADDR_SPACE(check)	(((check) & GENMASK_ULL(34, 33)) >> 33)

+#define CHECK_VALID_MS_ERR_TYPE		BIT_ULL(0)
+#define CHECK_VALID_MS_PCC		BIT_ULL(1)
+#define CHECK_VALID_MS_UNCORRECTED	BIT_ULL(2)
+#define CHECK_VALID_MS_PRECISE_IP	BIT_ULL(3)
+#define CHECK_VALID_MS_RESTARTABLE_IP	BIT_ULL(4)
+#define CHECK_VALID_MS_OVERFLOW		BIT_ULL(5)
+
+#define CHECK_MS_ERR_TYPE(check)	(((check) & GENMASK_ULL(18, 16)) >> 16)
+#define CHECK_MS_PCC			BIT_ULL(19)
+#define CHECK_MS_UNCORRECTED		BIT_ULL(20)
+#define CHECK_MS_PRECISE_IP		BIT_ULL(21)
+#define CHECK_MS_RESTARTABLE_IP		BIT_ULL(22)
+#define CHECK_MS_OVERFLOW		BIT_ULL(23)
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -111,17 +125,56 @@ static const char * const ia_check_bus_addr_space_strs[] = {
 	"Other Transaction",
 };

+static const char * const ia_check_ms_error_type_strs[] = {
+	"No Error",
+	"Unclassified",
+	"Microcode ROM Parity Error",
+	"External Error",
+	"FRC Error",
+	"Internal Unclassified",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
 	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
 }

+static void print_err_info_ms(const char *pfx, u16 validation_bits, u64 check)
+{
+	if (validation_bits & CHECK_VALID_MS_ERR_TYPE) {
+		u8 err_type = CHECK_MS_ERR_TYPE(check);
+
+		printk("%sError Type: %u, %s\n", pfx, err_type,
+		       err_type < ARRAY_SIZE(ia_check_ms_error_type_strs) ?
+		       ia_check_ms_error_type_strs[err_type] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_MS_PCC)
+		print_bool("Processor Context Corrupt", pfx, check, CHECK_MS_PCC);
+
+	if (validation_bits & CHECK_VALID_MS_UNCORRECTED)
+		print_bool("Uncorrected", pfx, check, CHECK_MS_UNCORRECTED);
+
+	if (validation_bits & CHECK_VALID_MS_PRECISE_IP)
+		print_bool("Precise IP", pfx, check, CHECK_MS_PRECISE_IP);
+
+	if (validation_bits & CHECK_VALID_MS_RESTARTABLE_IP)
+		print_bool("Restartable IP", pfx, check, CHECK_MS_RESTARTABLE_IP);
+
+	if (validation_bits & CHECK_VALID_MS_OVERFLOW)
+		print_bool("Overflow", pfx, check, CHECK_MS_OVERFLOW);
+}
+
 static void print_err_info(const char *pfx, u8 err_type, u64 check)
 {
 	u16 validation_bits = CHECK_VALID_BITS(check);

+	/*
+	 * The MS Check structure varies a lot from the others, so use a
+	 * separate function for decoding.
+	 */
 	if (err_type == ERR_TYPE_MS)
-		return;
+		return print_err_info_ms(pfx, validation_bits, check);

 	if (validation_bits & CHECK_VALID_TRANS_TYPE) {
 		u8 trans_type = CHECK_TRANS_TYPE(check);
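Editor's note: the decoding pattern in this patch is "test a validation bit, then extract the matching field with a mask and shift". A minimal userspace sketch of that pattern for two of the MS Check fields follows. The BIT_ULL() and GENMASK_ULL() definitions are local stand-ins for the kernel macros, and struct ms_check and decode_ms_check() are hypothetical names introduced only for this illustration; the bit positions are the ones in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's BIT_ULL() and GENMASK_ULL(). */
#define BIT_ULL(n)		(1ULL << (n))
#define GENMASK_ULL(h, l)	((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/* Bit positions copied from the diff above. */
#define CHECK_VALID_MS_ERR_TYPE		BIT_ULL(0)
#define CHECK_VALID_MS_UNCORRECTED	BIT_ULL(2)
#define CHECK_MS_ERR_TYPE(check)	(((check) & GENMASK_ULL(18, 16)) >> 16)
#define CHECK_MS_UNCORRECTED		BIT_ULL(20)

/* Hypothetical decoded form; a field is meaningful only if its has_* is set. */
struct ms_check {
	bool has_err_type;
	bool has_uncorrected;
	unsigned int err_type;
	bool uncorrected;
};

static struct ms_check decode_ms_check(uint64_t check)
{
	uint64_t valid = check & GENMASK_ULL(15, 0);
	struct ms_check out = { 0 };

	if (valid & CHECK_VALID_MS_ERR_TYPE) {
		out.has_err_type = true;
		out.err_type = (unsigned int)CHECK_MS_ERR_TYPE(check);
	}

	if (valid & CHECK_VALID_MS_UNCORRECTED) {
		out.has_uncorrected = true;
		out.uncorrected = (check & CHECK_MS_UNCORRECTED) != 0;
	}

	return out;
}
```

A field whose validation bit is clear is left at zero and flagged as absent, matching the patch's behaviour of printing only validated fields.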
[tip:efi/core] efi: Decode IA32/X64 Cache, TLB, and Bus Check structures
Commit-ID:  a9c1e3e791409e35207277b7873efc756b6fb625
Gitweb:     https://git.kernel.org/tip/a9c1e3e791409e35207277b7873efc756b6fb625
Author:     Yazen Ghannam
AuthorDate: Fri, 4 May 2018 07:59:53 +0200
Committer:  Ingo Molnar
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode IA32/X64 Cache, TLB, and Bus Check structures

Print the common fields of the Cache, TLB, and Bus check structures. The
fields of these three check types are the same except for a few more
fields in the Bus check structure. The remaining Bus check structure
fields will be decoded in a following patch.

Based on UEFI 2.7,
Table 254. IA32/X64 Cache Check Structure
Table 255. IA32/X64 TLB Check Structure
Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam
Signed-off-by: Ard Biesheuvel
Cc: Linus Torvalds
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-8-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar
---
 drivers/firmware/efi/cper-x86.c | 99 -
 1 file changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5438097b93ac..f70c46f7a4db 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -30,6 +30,25 @@
 #define INFO_VALID_RESPONDER_ID		BIT_ULL(3)
 #define INFO_VALID_IP			BIT_ULL(4)

+#define CHECK_VALID_TRANS_TYPE		BIT_ULL(0)
+#define CHECK_VALID_OPERATION		BIT_ULL(1)
+#define CHECK_VALID_LEVEL		BIT_ULL(2)
+#define CHECK_VALID_PCC			BIT_ULL(3)
+#define CHECK_VALID_UNCORRECTED		BIT_ULL(4)
+#define CHECK_VALID_PRECISE_IP		BIT_ULL(5)
+#define CHECK_VALID_RESTARTABLE_IP	BIT_ULL(6)
+#define CHECK_VALID_OVERFLOW		BIT_ULL(7)
+
+#define CHECK_VALID_BITS(check)		(((check) & GENMASK_ULL(15, 0)))
+#define CHECK_TRANS_TYPE(check)		(((check) & GENMASK_ULL(17, 16)) >> 16)
+#define CHECK_OPERATION(check)		(((check) & GENMASK_ULL(21, 18)) >> 18)
+#define CHECK_LEVEL(check)		(((check) & GENMASK_ULL(24, 22)) >> 22)
+#define CHECK_PCC			BIT_ULL(25)
+#define CHECK_UNCORRECTED		BIT_ULL(26)
+#define CHECK_PRECISE_IP		BIT_ULL(27)
+#define CHECK_RESTARTABLE_IP		BIT_ULL(28)
+#define CHECK_OVERFLOW			BIT_ULL(29)
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -52,11 +71,81 @@ static enum err_types cper_get_err_type(const guid_t *err_type)
 		return N_ERR_TYPES;
 }

+static const char * const ia_check_trans_type_strs[] = {
+	"Instruction",
+	"Data Access",
+	"Generic",
+};
+
+static const char * const ia_check_op_strs[] = {
+	"generic error",
+	"generic read",
+	"generic write",
+	"data read",
+	"data write",
+	"instruction fetch",
+	"prefetch",
+	"eviction",
+	"snoop",
+};
+
+static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
+{
+	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
+}
+
+static void print_err_info(const char *pfx, u8 err_type, u64 check)
+{
+	u16 validation_bits = CHECK_VALID_BITS(check);
+
+	if (err_type == ERR_TYPE_MS)
+		return;
+
+	if (validation_bits & CHECK_VALID_TRANS_TYPE) {
+		u8 trans_type = CHECK_TRANS_TYPE(check);
+
+		printk("%sTransaction Type: %u, %s\n", pfx, trans_type,
+		       trans_type < ARRAY_SIZE(ia_check_trans_type_strs) ?
+		       ia_check_trans_type_strs[trans_type] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_OPERATION) {
+		u8 op = CHECK_OPERATION(check);
+
+		/*
+		 * CACHE has more operation types than TLB or BUS, though the
+		 * name and the order are the same.
+		 */
+		u8 max_ops = (err_type == ERR_TYPE_CACHE) ? 9 : 7;
+
+		printk("%sOperation: %u, %s\n", pfx, op,
+		       op < max_ops ? ia_check_op_strs[op] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_LEVEL)
+		printk("%sLevel: %llu\n", pfx, CHECK_LEVEL(check));
+
+	if (validation_bits & CHECK_VALID_PCC)
+		print_bool("Processor Context Corrupt", pfx, check, CHECK_PCC);
+
+	if (validation_bits & CHECK_VALID_UNCORRECTED)
+		print_bool("Uncorrected", pfx, check, CHECK_UNCORRECTED);
+
+	if (validation_bits & CHECK_VALID_PRECISE_IP)
+		prin
[tip:efi/core] efi: Decode additional IA32/X64 Bus Check fields
Commit-ID:  c6bc4ac0aadede7a5c5260bcc315cd2b18c6b471
Gitweb:     https://git.kernel.org/tip/c6bc4ac0aadede7a5c5260bcc315cd2b18c6b471
Author:     Yazen Ghannam
AuthorDate: Fri, 4 May 2018 07:59:54 +0200
Committer:  Ingo Molnar
CommitDate: Mon, 14 May 2018 08:57:48 +0200

efi: Decode additional IA32/X64 Bus Check fields

The "Participation Type", "Time Out", and "Address Space" fields are
unique to the IA32/X64 Bus Check structure.

Print these fields.

Based on UEFI 2.7 Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam
Signed-off-by: Ard Biesheuvel
Cc: Linus Torvalds
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-9-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar
---
 drivers/firmware/efi/cper-x86.c | 44 +
 1 file changed, 44 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index f70c46f7a4db..5e6716564dba 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -39,6 +39,10 @@
 #define CHECK_VALID_RESTARTABLE_IP	BIT_ULL(6)
 #define CHECK_VALID_OVERFLOW		BIT_ULL(7)

+#define CHECK_VALID_BUS_PART_TYPE	BIT_ULL(8)
+#define CHECK_VALID_BUS_TIME_OUT	BIT_ULL(9)
+#define CHECK_VALID_BUS_ADDR_SPACE	BIT_ULL(10)
+
 #define CHECK_VALID_BITS(check)		(((check) & GENMASK_ULL(15, 0)))
 #define CHECK_TRANS_TYPE(check)		(((check) & GENMASK_ULL(17, 16)) >> 16)
 #define CHECK_OPERATION(check)		(((check) & GENMASK_ULL(21, 18)) >> 18)
@@ -49,6 +53,10 @@
 #define CHECK_RESTARTABLE_IP		BIT_ULL(28)
 #define CHECK_OVERFLOW			BIT_ULL(29)

+#define CHECK_BUS_PART_TYPE(check)	(((check) & GENMASK_ULL(31, 30)) >> 30)
+#define CHECK_BUS_TIME_OUT		BIT_ULL(32)
+#define CHECK_BUS_ADDR_SPACE(check)	(((check) & GENMASK_ULL(34, 33)) >> 33)
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -89,6 +97,20 @@ static const char * const ia_check_op_strs[] = {
 	"snoop",
 };

+static const char * const ia_check_bus_part_type_strs[] = {
+	"Local Processor originated request",
+	"Local Processor responded to request",
+	"Local Processor observed",
+	"Generic",
+};
+
+static const char * const ia_check_bus_addr_space_strs[] = {
+	"Memory Access",
+	"Reserved",
+	"I/O",
+	"Other Transaction",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
 	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -139,6 +161,28 @@ static void print_err_info(const char *pfx, u8 err_type, u64 check)

 	if (validation_bits & CHECK_VALID_OVERFLOW)
 		print_bool("Overflow", pfx, check, CHECK_OVERFLOW);
+
+	if (err_type != ERR_TYPE_BUS)
+		return;
+
+	if (validation_bits & CHECK_VALID_BUS_PART_TYPE) {
+		u8 part_type = CHECK_BUS_PART_TYPE(check);
+
+		printk("%sParticipation Type: %u, %s\n", pfx, part_type,
+		       part_type < ARRAY_SIZE(ia_check_bus_part_type_strs) ?
+		       ia_check_bus_part_type_strs[part_type] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_BUS_TIME_OUT)
+		print_bool("Time Out", pfx, check, CHECK_BUS_TIME_OUT);
+
+	if (validation_bits & CHECK_VALID_BUS_ADDR_SPACE) {
+		u8 addr_space = CHECK_BUS_ADDR_SPACE(check);
+
+		printk("%sAddress Space: %u, %s\n", pfx, addr_space,
+		       addr_space < ARRAY_SIZE(ia_check_bus_addr_space_strs) ?
+		       ia_check_bus_addr_space_strs[addr_space] : "unknown");
+	}
 }

 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
[tip:efi/core] efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs
Commit-ID:  dc2d26e4b667c8005c58669e71de3efd17f4390f
Gitweb:     https://git.kernel.org/tip/dc2d26e4b667c8005c58669e71de3efd17f4390f
Author:     Yazen Ghannam
AuthorDate: Fri, 4 May 2018 07:59:52 +0200
Committer:  Ingo Molnar
CommitDate: Mon, 14 May 2018 08:57:47 +0200

efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs

For easier handling, match the known IA32/X64 error structure GUIDs to
enums.

Also, print out the name of the matching Error Structure Type.

Only print the GUID for unknown types.

GUIDs taken from UEFI 2.7 section N.2.4.2.1 IA32/X64 Processor Error
Information Structure.

Signed-off-by: Yazen Ghannam
Signed-off-by: Ard Biesheuvel
Cc: Linus Torvalds
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-7-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar
---
 drivers/firmware/efi/cper-x86.c | 47 +++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index e0633a103fcf..5438097b93ac 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -11,17 +11,53 @@
 #define VALID_CPUID_INFO		BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)	(((bits) & GENMASK_ULL(7, 2)) >> 2)

+#define INFO_ERR_STRUCT_TYPE_CACHE					\
+	GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,	\
+		  0x57, 0x3F, 0xAD, 0x2C)
+#define INFO_ERR_STRUCT_TYPE_TLB					\
+	GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25, 0x0A, 0x3B,	\
+		  0x9A, 0xDB, 0x63, 0xC3)
+#define INFO_ERR_STRUCT_TYPE_BUS					\
+	GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59, 0x5E, 0xEF,	\
+		  0x92, 0xFF, 0xA6, 0x3C)
+#define INFO_ERR_STRUCT_TYPE_MS						\
+	GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3, 0xB0, 0xB5,	\
+		  0xB0, 0xA7, 0x43, 0x14)
+
 #define INFO_VALID_CHECK_INFO		BIT_ULL(0)
 #define INFO_VALID_TARGET_ID		BIT_ULL(1)
 #define INFO_VALID_REQUESTOR_ID		BIT_ULL(2)
 #define INFO_VALID_RESPONDER_ID		BIT_ULL(3)
 #define INFO_VALID_IP			BIT_ULL(4)

+enum err_types {
+	ERR_TYPE_CACHE = 0,
+	ERR_TYPE_TLB,
+	ERR_TYPE_BUS,
+	ERR_TYPE_MS,
+	N_ERR_TYPES
+};
+
+static enum err_types cper_get_err_type(const guid_t *err_type)
+{
+	if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_CACHE))
+		return ERR_TYPE_CACHE;
+	else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_TLB))
+		return ERR_TYPE_TLB;
+	else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_BUS))
+		return ERR_TYPE_BUS;
+	else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_MS))
+		return ERR_TYPE_MS;
+	else
+		return N_ERR_TYPES;
+}
+
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
 	int i;
 	struct cper_ia_err_info *err_info;
 	char newpfx[64];
+	u8 err_type;

 	if (proc->validation_bits & VALID_LAPIC_ID)
 		printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
@@ -38,8 +74,15 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 	for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
 		printk("%sError Information Structure %d:\n", pfx, i);

-		printk("%sError Structure Type: %pUl\n", newpfx,
-		       &err_info->err_type);
+		err_type = cper_get_err_type(&err_info->err_type);
+		printk("%sError Structure Type: %s\n", newpfx,
+		       err_type < ARRAY_SIZE(cper_proc_error_type_strs) ?
+		       cper_proc_error_type_strs[err_type] : "unknown");
+
+		if (err_type >= N_ERR_TYPES) {
+			printk("%sError Structure Type: %pUl\n", newpfx,
+			       &err_info->err_type);
+		}

 		if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
 			printk("%sCheck Information: 0x%016llx\n", newpfx,
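Editor's note: the patch maps a 16-byte GUID to an enum with a chain of guid_equal() calls. The userspace sketch below shows the same mapping as a table lookup, which is a swapped-in variant and not what the patch does. The guid_t type and the GUID_INIT() macro here are local approximations written for this example (little-endian encoding of the first three fields is assumed), and get_err_type() is a hypothetical name; the GUID values are the ones in the diff.

```c
#include <stdint.h>
#include <string.h>

/* Local approximation of a 16-byte GUID for this sketch. */
typedef struct { uint8_t b[16]; } guid_t;

/* Brace initializer; first three fields stored little-endian (assumed). */
#define GUID_INIT(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)		\
	{{ (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff,		\
	   ((a) >> 24) & 0xff, (b) & 0xff, ((b) >> 8) & 0xff,		\
	   (c) & 0xff, ((c) >> 8) & 0xff,				\
	   (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }}

enum err_types {
	ERR_TYPE_CACHE = 0,
	ERR_TYPE_TLB,
	ERR_TYPE_BUS,
	ERR_TYPE_MS,
	N_ERR_TYPES
};

/* GUID values copied from the diff above, indexed by error type. */
static const guid_t known_guids[N_ERR_TYPES] = {
	[ERR_TYPE_CACHE] = GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72,
				     0x24, 0x9B, 0x57, 0x3F, 0xAD, 0x2C),
	[ERR_TYPE_TLB]   = GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25,
				     0x0A, 0x3B, 0x9A, 0xDB, 0x63, 0xC3),
	[ERR_TYPE_BUS]   = GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59,
				     0x5E, 0xEF, 0x92, 0xFF, 0xA6, 0x3C),
	[ERR_TYPE_MS]    = GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3,
				     0xB0, 0xB5, 0xB0, 0xA7, 0x43, 0x14),
};

/* Return the matching error type, or N_ERR_TYPES for an unknown GUID. */
static enum err_types get_err_type(const guid_t *g)
{
	for (int i = 0; i < N_ERR_TYPES; i++) {
		if (memcmp(g, &known_guids[i], sizeof(*g)) == 0)
			return (enum err_types)i;
	}
	return N_ERR_TYPES;
}
```

Returning N_ERR_TYPES for "no match" keeps the result usable as a range check against any table sized by the enum, which is how the patch decides whether to fall back to printing the raw GUID.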
[tip:efi/core] efi: Decode IA32/X64 Processor Error Info Structure
Commit-ID: 7c9449b8c8a59511b7d749afb193c96353451c82 Gitweb: https://git.kernel.org/tip/7c9449b8c8a59511b7d749afb193c96353451c82 Author: Yazen Ghannam AuthorDate: Fri, 4 May 2018 07:59:51 +0200 Committer: Ingo Molnar CommitDate: Mon, 14 May 2018 08:57:47 +0200 efi: Decode IA32/X64 Processor Error Info Structure Print the fields in the IA32/X64 Processor Error Info Structure. Based on UEFI 2.7 Table 253. IA32/X64 Processor Error Information Structure. Signed-off-by: Yazen Ghannam Signed-off-by: Ard Biesheuvel Cc: Linus Torvalds Cc: Matt Fleming Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-...@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-6-ard.biesheu...@linaro.org Signed-off-by: Ingo Molnar --- drivers/firmware/efi/cper-x86.c | 48 + 1 file changed, 48 insertions(+) diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c index 863f0cd2a0ff..e0633a103fcf 100644 --- a/drivers/firmware/efi/cper-x86.c +++ b/drivers/firmware/efi/cper-x86.c @@ -9,9 +9,20 @@ */ #define VALID_LAPIC_ID BIT_ULL(0) #define VALID_CPUID_INFO BIT_ULL(1) +#define VALID_PROC_ERR_INFO_NUM(bits) (((bits) & GENMASK_ULL(7, 2)) >> 2) + +#define INFO_VALID_CHECK_INFO BIT_ULL(0) +#define INFO_VALID_TARGET_ID BIT_ULL(1) +#define INFO_VALID_REQUESTOR_IDBIT_ULL(2) +#define INFO_VALID_RESPONDER_IDBIT_ULL(3) +#define INFO_VALID_IP BIT_ULL(4) void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc) { + int i; + struct cper_ia_err_info *err_info; + char newpfx[64]; + if (proc->validation_bits & VALID_LAPIC_ID) printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id); @@ -20,4 +31,41 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc) print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid, sizeof(proc->cpuid), 0); } + + snprintf(newpfx, sizeof(newpfx), "%s ", pfx); + + err_info = (struct cper_ia_err_info *)(proc + 1); + for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) { + printk("%sError 
Information Structure %d:\n", pfx, i); + + printk("%sError Structure Type: %pUl\n", newpfx, + &err_info->err_type); + + if (err_info->validation_bits & INFO_VALID_CHECK_INFO) { + printk("%sCheck Information: 0x%016llx\n", newpfx, + err_info->check_info); + } + + if (err_info->validation_bits & INFO_VALID_TARGET_ID) { + printk("%sTarget Identifier: 0x%016llx\n", + newpfx, err_info->target_id); + } + + if (err_info->validation_bits & INFO_VALID_REQUESTOR_ID) { + printk("%sRequestor Identifier: 0x%016llx\n", + newpfx, err_info->requestor_id); + } + + if (err_info->validation_bits & INFO_VALID_RESPONDER_ID) { + printk("%sResponder Identifier: 0x%016llx\n", + newpfx, err_info->responder_id); + } + + if (err_info->validation_bits & INFO_VALID_IP) { + printk("%sInstruction Pointer: 0x%016llx\n", + newpfx, err_info->ip); + } + + err_info++; + } }
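The loop bound in this patch comes from bits [7:2] of the section's validation bits. A minimal userspace sketch of that extraction follows; MASK_ULL() and proc_err_info_num() are hypothetical names that stand in for the kernel's GENMASK_ULL() and the VALID_PROC_ERR_INFO_NUM() macro.

```c
#include <stdint.h>

/* Userspace equivalent of GENMASK_ULL(h, l): bits h..l set, inclusive. */
#define MASK_ULL(h, l)	((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/*
 * Bits [7:2] of the validation bits hold the number of Processor Error
 * Info structures that follow the fixed part of the section.
 */
static unsigned int proc_err_info_num(uint64_t validation_bits)
{
	return (unsigned int)((validation_bits & MASK_ULL(7, 2)) >> 2);
}
```

For example, a validation_bits value of 0x7 (LAPIC ID valid, CPUID valid, count field equal to 1) gives one structure to walk.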
[tip:efi/core] efi: Decode IA32/X64 Processor Error Section
Commit-ID: f9e1bdb9f35f4f5cfa7c9025ac68c02909b6d3b1 Gitweb: https://git.kernel.org/tip/f9e1bdb9f35f4f5cfa7c9025ac68c02909b6d3b1 Author: Yazen Ghannam AuthorDate: Fri, 4 May 2018 07:59:50 +0200 Committer: Ingo Molnar CommitDate: Mon, 14 May 2018 08:57:47 +0200 efi: Decode IA32/X64 Processor Error Section Recognize the IA32/X64 Processor Error Section. Do the section decoding in a new "cper-x86.c" file and add this to the Makefile depending on a new "UEFI_CPER_X86" config option. Print the Local APIC ID and CPUID info from the Processor Error Record. The "Processor Error Info" and "Processor Context" fields will be decoded in following patches. Based on UEFI 2.7 Table 252. Processor Error Record. Signed-off-by: Yazen Ghannam Signed-off-by: Ard Biesheuvel Cc: Linus Torvalds Cc: Matt Fleming Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-...@vger.kernel.org Link: http://lkml.kernel.org/r/20180504060003.19618-5-ard.biesheu...@linaro.org Signed-off-by: Ingo Molnar --- drivers/firmware/efi/Kconfig| 5 + drivers/firmware/efi/Makefile | 1 + drivers/firmware/efi/cper-x86.c | 23 +++ drivers/firmware/efi/cper.c | 10 ++ include/linux/cper.h| 2 ++ 5 files changed, 41 insertions(+) diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig index 3098410abad8..781a4a337557 100644 --- a/drivers/firmware/efi/Kconfig +++ b/drivers/firmware/efi/Kconfig @@ -174,6 +174,11 @@ config UEFI_CPER_ARM depends on UEFI_CPER && ( ARM || ARM64 ) default y +config UEFI_CPER_X86 + bool + depends on UEFI_CPER && X86 + default y + config EFI_DEV_PATH_PARSER bool depends on ACPI diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile index cb805374f4bc..5f9f5039de50 100644 --- a/drivers/firmware/efi/Makefile +++ b/drivers/firmware/efi/Makefile @@ -31,3 +31,4 @@ obj-$(CONFIG_ARM) += $(arm-obj-y) obj-$(CONFIG_ARM64)+= $(arm-obj-y) obj-$(CONFIG_EFI_CAPSULE_LOADER) += capsule-loader.o obj-$(CONFIG_UEFI_CPER_ARM)+= cper-arm.o +obj-$(CONFIG_UEFI_CPER_X86)+= cper-x86.o diff 
--git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c new file mode 100644 index ..863f0cd2a0ff --- /dev/null +++ b/drivers/firmware/efi/cper-x86.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (C) 2018, Advanced Micro Devices, Inc. + +#include + +/* + * We don't need a "CPER_IA" prefix since these are all locally defined. + * This will save us a lot of line space. + */ +#define VALID_LAPIC_ID BIT_ULL(0) +#define VALID_CPUID_INFO BIT_ULL(1) + +void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc) +{ + if (proc->validation_bits & VALID_LAPIC_ID) + printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id); + + if (proc->validation_bits & VALID_CPUID_INFO) { + printk("%sCPUID Info:\n", pfx); + print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid, + sizeof(proc->cpuid), 0); + } +} diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index ab21f1614007..3bf0dca378a6 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -467,6 +467,16 @@ cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata cper_print_proc_arm(newpfx, arm_err); else goto err_section_too_small; +#endif +#if defined(CONFIG_UEFI_CPER_X86) + } else if (guid_equal(sec_type, &CPER_SEC_PROC_IA)) { + struct cper_sec_proc_ia *ia_err = acpi_hest_get_payload(gdata); + + printk("%ssection_type: IA32/X64 processor error\n", newpfx); + if (gdata->error_data_length >= sizeof(*ia_err)) + cper_print_proc_ia(newpfx, ia_err); + else + goto err_section_too_small; #endif } else { const void *err = acpi_hest_get_payload(gdata); diff --git a/include/linux/cper.h b/include/linux/cper.h index 4b5f8459b403..9c703a0abe6e 100644 --- a/include/linux/cper.h +++ b/include/linux/cper.h @@ -551,5 +551,7 @@ const char *cper_mem_err_unpack(struct trace_seq *, struct cper_mem_err_compact *); void cper_print_proc_arm(const char *pfx, const struct cper_sec_proc_arm *proc); +void 
cper_print_proc_ia(const char *pfx, + const struct cper_sec_proc_ia *proc); #endif
[tip:efi/core] efi: Fix IA32/X64 Processor Error Record definition
Commit-ID:  742632d237ce180439ab4af31e9891df0df81233
Gitweb:     https://git.kernel.org/tip/742632d237ce180439ab4af31e9891df0df81233
Author:     Yazen Ghannam
AuthorDate: Fri, 4 May 2018 07:59:49 +0200
Committer:  Ingo Molnar
CommitDate: Mon, 14 May 2018 08:57:47 +0200

efi: Fix IA32/X64 Processor Error Record definition

Based on UEFI 2.7 Table 255. Processor Error Record, the "Local APIC_ID"
field is 8 bytes but Linux defines this field as 1 byte.

Fix this in the struct cper_sec_proc_ia definition.

Signed-off-by: Yazen Ghannam
Signed-off-by: Ard Biesheuvel
Cc: Linus Torvalds
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-4-ard.biesheu...@linaro.org
Signed-off-by: Ingo Molnar
---
 include/linux/cper.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/cper.h b/include/linux/cper.h
index d14ef4e77c8a..4b5f8459b403 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -381,7 +381,7 @@ struct cper_sec_proc_generic {
 /* IA32/X64 Processor Error Section */
 struct cper_sec_proc_ia {
 	__u64	validation_bits;
-	__u8	lapic_id;
+	__u64	lapic_id;
 	__u8	cpuid[48];
 };
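The effect of the one-line change is easiest to see in the field offsets. This is a userspace sketch, not kernel code: proc_ia_old, proc_ia_new and the two helper functions are hypothetical names, and the structs use fixed-width stdint types in place of __u64/__u8.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout before the fix: a 1-byte Local APIC_ID. */
struct proc_ia_old {
	uint64_t validation_bits;
	uint8_t  lapic_id;
	uint8_t  cpuid[48];
};

/* Layout after the fix: an 8-byte Local APIC_ID, as the commit describes. */
struct proc_ia_new {
	uint64_t validation_bits;
	uint64_t lapic_id;
	uint8_t  cpuid[48];
};

/* Byte offset of the CPUID data in each layout. */
static size_t cpuid_offset_old(void)
{
	return offsetof(struct proc_ia_old, cpuid);
}

static size_t cpuid_offset_new(void)
{
	return offsetof(struct proc_ia_new, cpuid);
}
```

With the 1-byte field, the CPUID data would be read starting at byte 9 of the record instead of byte 16, so every field after the Local APIC_ID would be misinterpreted.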
[tip:x86/urgent] x86/smpboot: Don't use mwait_play_dead() on AMD systems
Commit-ID:  da6fa7ef67f07108a1b0cb9fd9e7fcaabd39c051
Gitweb:     https://git.kernel.org/tip/da6fa7ef67f07108a1b0cb9fd9e7fcaabd39c051
Author:     Yazen Ghannam
AuthorDate: Tue, 3 Apr 2018 09:02:28 -0500
Committer:  Thomas Gleixner
CommitDate: Thu, 26 Apr 2018 16:06:19 +0200

x86/smpboot: Don't use mwait_play_dead() on AMD systems

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper C-states than C1 on current systems.

play_dead() expects to use the deepest state available. The deepest state
available on AMD systems is reached through SystemIO or HALT. If MWAIT is
available, it is preferred over the other methods, so the CPU never reaches
the deepest possible state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, use CPUIDLE
to enter the deepest state advertised by firmware. If CPUIDLE is not
available then fall back to HALT.

Signed-off-by: Yazen Ghannam
Signed-off-by: Thomas Gleixner
Reviewed-by: Borislav Petkov
Cc: sta...@vger.kernel.org
Cc: Yazen Ghannam
Link: https://lkml.kernel.org/r/20180403140228.58540-1-yazen.ghan...@amd.com
---
 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 45175b81dd5b..0f1cbb042f49 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1571,6 +1571,8 @@ static inline void mwait_play_dead(void)
 	void *mwait_ptr;
 	int i;
 
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+		return;
 	if (!this_cpu_has(X86_FEATURE_MWAIT))
 		return;
 	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
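The resulting order of preference can be modelled in a few lines of userspace C. This is a simplified model, not the kernel implementation: struct cpu_caps, enum dead_method and pick_play_dead() are hypothetical, and the real mwait_play_dead() performs further checks that are left out here.

```c
#include <stdbool.h>

/* Hypothetical summary of what the offlined CPU supports. */
struct cpu_caps {
	bool is_amd;
	bool has_mwait;
	bool has_clflush;
	bool has_cpuidle;
};

enum dead_method { DEAD_MWAIT, DEAD_CPUIDLE, DEAD_HALT };

/*
 * Model of the selection after this commit: MWAIT is tried first, except
 * on AMD where it is skipped; then CPUIDLE; HALT is the last resort.
 */
static enum dead_method pick_play_dead(const struct cpu_caps *c)
{
	if (!c->is_amd && c->has_mwait && c->has_clflush)
		return DEAD_MWAIT;

	if (c->has_cpuidle)
		return DEAD_CPUIDLE;

	return DEAD_HALT;
}
```

In this model an AMD CPU that advertises MWAIT still ends up in CPUIDLE, or in HALT when CPUIDLE is unavailable, which is the behaviour the commit message asks for.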
[PATCH v2] x86/smpboot: Don't do mwait_play_dead() on AMD systems
From: Yazen Ghannam

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper C-states than C1 on current systems.

With play_dead() we expect the OS to use the deepest state available. The
deepest state available on AMD systems is reached through SystemIO or
HALT. If MWAIT is available, we use it instead of the other methods, so we
never reach the deepest state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, we'll use
CPUIDLE to enter the deepest state advertised by firmware. If CPUIDLE is
not available then we fall back to HALT.

Signed-off-by: Yazen Ghannam
---
Link: https://lkml.kernel.org/r/20180402183424.48222-1-yazen.ghan...@amd.com

v1->v2:
* Drop comment in code.

 arch/x86/kernel/smpboot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ff99e2b6fc54..12599e55e040 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1536,6 +1536,8 @@ static inline void mwait_play_dead(void)
 	void *mwait_ptr;
 	int i;
 
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+		return;
 	if (!this_cpu_has(X86_FEATURE_MWAIT))
 		return;
 	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
-- 
2.14.1
[PATCH] x86/MCE, EDAC/mce_amd: Save all aux registers on SMCA systems
From: Yazen Ghannam The Intel SDM and AMD APM both state that the auxiliary MCA registers should be read if their respective valid bits are set in MCA_STATUS. The Processor Programming Reference for AMD Fam17h systems has a new recommendation that the auxiliary registers should be saved unconditionally. This recommendation can be retroactively applied to older AMD systems. However, we only need to apply this to SMCA systems to avoid modifying behavior on older systems. Define a separate function to save all auxiliary registers on SMCA systems. Call this function from both the MCE handlers and the AMD LVT interrupt handlers so that we don't duplicate code. Print all auxiliary registers in EDAC/mce_amd. Don't restrict this to SMCA systems in order to save a conditional and keep the format similar between SMCA and non-SMCA systems. Signed-off-by: Yazen Ghannam --- Links: https://lkml.kernel.org/r/20180326191526.64314-1-yazen.ghan...@amd.com https://lkml.kernel.org/r/20180326191526.64314-2-yazen.ghan...@amd.com arch/x86/kernel/cpu/mcheck/mce-internal.h | 6 +++ arch/x86/kernel/cpu/mcheck/mce.c | 20 ++ arch/x86/kernel/cpu/mcheck/mce_amd.c | 65 +-- drivers/edac/mce_amd.c| 12 ++ 4 files changed, 57 insertions(+), 46 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h index 374d1aa66952..67a2c7c095ca 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h @@ -59,6 +59,12 @@ static inline void mce_intel_hcpu_update(unsigned long cpu) { } static inline void cmci_disable_bank(int bank) { } #endif +#ifdef CONFIG_X86_MCE_AMD +bool smca_read_aux(struct mce *m, int bank); +#else +static inline bool smca_read_aux(struct mce *m, int bank) { return false; } +#endif + void mce_timer_kick(unsigned long interval); #ifdef CONFIG_ACPI_APEI diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 42cf2880d0ed..6be63e9e067d 100644 --- 
a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -639,6 +639,9 @@ static struct notifier_block mce_default_nb = { */ static void mce_read_aux(struct mce *m, int i) { + if (smca_read_aux(m, i)) + return; + if (m->status & MCI_STATUS_MISCV) m->misc = mce_rdmsrl(msr_ops.misc(i)); @@ -653,23 +656,6 @@ static void mce_read_aux(struct mce *m, int i) m->addr >>= shift; m->addr <<= shift; } - - /* -* Extract [55:] where lsb is the least significant -* *valid* bit of the address bits. -*/ - if (mce_flags.smca) { - u8 lsb = (m->addr >> 56) & 0x3f; - - m->addr &= GENMASK_ULL(55, lsb); - } - } - - if (mce_flags.smca) { - m->ipid = mce_rdmsrl(MSR_AMD64_SMCA_MCx_IPID(i)); - - if (m->status & MCI_STATUS_SYNDV) - m->synd = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND(i)); } } diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c index f7666eef4a87..b00d5fff1848 100644 --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c @@ -244,6 +244,47 @@ static void smca_configure(unsigned int bank, unsigned int cpu) } } + +static bool _smca_read_aux(struct mce *m, int bank, bool read_addr) +{ + if (!mce_flags.smca) + return false; + + rdmsrl(MSR_AMD64_SMCA_MCx_IPID(bank), m->ipid); + rdmsrl(MSR_AMD64_SMCA_MCx_SYND(bank), m->synd); + + /* +* We should already have a value if we're coming from the Threshold LVT +* interrupt handler. Otherwise, read it now. +*/ + if (!m->misc) + rdmsrl(msr_ops.misc(bank), m->misc); + + /* +* Read MCA_ADDR if we don't have it already. We should already have it +* if we're coming from the interrupt handlers. +*/ + if (read_addr) + rdmsrl(msr_ops.addr(bank), m->addr); + + /* +* Extract [55:] where lsb is the least significant +* *valid* bit of the address bits. 
+*/ + if (m->addr) { + u8 lsb = (m->addr >> 56) & 0x3f; + + m->addr &= GENMASK_ULL(55, lsb); + } + + return true; +} + +bool smca_read_aux(struct mce *m, int bank) +{ + return _smca_read_aux(m, bank, true); +} + struct thresh_restart { struct threshold_block *b; int reset; @@ -799,30 +840,12 @@ static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc) mce_setup(&m); m.status = statu
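The address fix-up in _smca_read_aux() keeps bits [55:lsb] of MCA_ADDR, where lsb is read from bits [61:56] of the same register. A userspace sketch of just that step follows; MASK_ULL() and smca_extract_addr() are hypothetical names, with MASK_ULL() standing in for GENMASK_ULL().

```c
#include <stdint.h>

/* Userspace equivalent of GENMASK_ULL(h, l): bits h..l set, inclusive. */
#define MASK_ULL(h, l)	((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/*
 * Bits [61:56] of the raw value give the least significant *valid*
 * address bit. Keep bits [55:lsb] and clear everything else.
 */
static uint64_t smca_extract_addr(uint64_t raw)
{
	uint8_t lsb = (raw >> 56) & 0x3f;

	return raw & MASK_ULL(55, lsb);
}
```

For a raw value with lsb equal to 12, the low 12 bits and the lsb field itself are cleared, leaving a page-aligned address.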
[PATCH] x86/smpboot: Don't do mwait_play_dead() on AMD systems
From: Yazen Ghannam

Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper C-states than C1 on current systems.

With play_dead() we expect the OS to use the deepest state available. The
deepest state available on AMD systems is reached through SystemIO or
HALT. If MWAIT is available, we use it instead of the other methods, so we
never reach the deepest state.

Don't try to use MWAIT to play_dead() on AMD systems. Instead, we'll use
CPUIDLE to enter the deepest state advertised by firmware. If CPUIDLE is
not available then we fall back to HALT.

Signed-off-by: Yazen Ghannam
---
 arch/x86/kernel/smpboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ff99e2b6fc54..67cf00b25f83 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1536,6 +1536,9 @@ static inline void mwait_play_dead(void)
 	void *mwait_ptr;
 	int i;
 
+	/* Don't try native MWAIT on AMD. Stick to CPUIDLE and HALT. */
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+		return;
 	if (!this_cpu_has(X86_FEATURE_MWAIT))
 		return;
 	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
-- 
2.14.1
[PATCH v4 1/8] efi: Fix IA32/X64 Processor Error Record definition
From: Yazen Ghannam

Based on UEFI 2.7 Table 255. Processor Error Record, the "Local APIC_ID"
field is 8 bytes but Linux defines this field as 1 byte.

Fix this in the struct cper_sec_proc_ia definition.

Signed-off-by: Yazen Ghannam
---
Link: https://lkml.kernel.org/r/20180324184940.19762-2-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.

v1->v2:
* No changes

 include/linux/cper.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/cper.h b/include/linux/cper.h
index d14ef4e77c8a..4b5f8459b403 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -381,7 +381,7 @@ struct cper_sec_proc_generic {
 /* IA32/X64 Processor Error Section */
 struct cper_sec_proc_ia {
 	__u64	validation_bits;
-	__u8	lapic_id;
+	__u64	lapic_id;
 	__u8	cpuid[48];
 };
-- 
2.14.1
[PATCH v4 5/8] efi: Decode IA32/X64 Cache, TLB, and Bus Check structures
From: Yazen Ghannam

Print the common fields of the Cache, TLB, and Bus check structures. The
fields of these three check types are the same except for a few more
fields in the Bus check structure. The remaining Bus check structure
fields will be decoded in a following patch.

Based on UEFI 2.7,
Table 254. IA32/X64 Cache Check Structure
Table 255. IA32/X64 TLB Check Structure
Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam
---
Link: https://lkml.kernel.org/r/20180324184940.19762-6-yazen.ghan...@amd.com

v3->v4:
* Drop INDENT_SP use.

v2->v3:
* Fix table numbers in commit message.
* Don't print raw validation bits.

v1->v2:
* Add parentheses around "check" expression in macro.
* Change use of enum type to u8.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 99 -
 1 file changed, 98 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5438097b93ac..f70c46f7a4db 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -30,6 +30,25 @@
 #define INFO_VALID_RESPONDER_ID		BIT_ULL(3)
 #define INFO_VALID_IP			BIT_ULL(4)
 
+#define CHECK_VALID_TRANS_TYPE		BIT_ULL(0)
+#define CHECK_VALID_OPERATION		BIT_ULL(1)
+#define CHECK_VALID_LEVEL		BIT_ULL(2)
+#define CHECK_VALID_PCC			BIT_ULL(3)
+#define CHECK_VALID_UNCORRECTED		BIT_ULL(4)
+#define CHECK_VALID_PRECISE_IP		BIT_ULL(5)
+#define CHECK_VALID_RESTARTABLE_IP	BIT_ULL(6)
+#define CHECK_VALID_OVERFLOW		BIT_ULL(7)
+
+#define CHECK_VALID_BITS(check)		(((check) & GENMASK_ULL(15, 0)))
+#define CHECK_TRANS_TYPE(check)		(((check) & GENMASK_ULL(17, 16)) >> 16)
+#define CHECK_OPERATION(check)		(((check) & GENMASK_ULL(21, 18)) >> 18)
+#define CHECK_LEVEL(check)		(((check) & GENMASK_ULL(24, 22)) >> 22)
+#define CHECK_PCC			BIT_ULL(25)
+#define CHECK_UNCORRECTED		BIT_ULL(26)
+#define CHECK_PRECISE_IP		BIT_ULL(27)
+#define CHECK_RESTARTABLE_IP		BIT_ULL(28)
+#define CHECK_OVERFLOW			BIT_ULL(29)
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -52,11 +71,81 @@ static enum err_types cper_get_err_type(const guid_t *err_type)
 		return N_ERR_TYPES;
 }
 
+static const char * const ia_check_trans_type_strs[] = {
+	"Instruction",
+	"Data Access",
+	"Generic",
+};
+
+static const char * const ia_check_op_strs[] = {
+	"generic error",
+	"generic read",
+	"generic write",
+	"data read",
+	"data write",
+	"instruction fetch",
+	"prefetch",
+	"eviction",
+	"snoop",
+};
+
+static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
+{
+	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
+}
+
+static void print_err_info(const char *pfx, u8 err_type, u64 check)
+{
+	u16 validation_bits = CHECK_VALID_BITS(check);
+
+	if (err_type == ERR_TYPE_MS)
+		return;
+
+	if (validation_bits & CHECK_VALID_TRANS_TYPE) {
+		u8 trans_type = CHECK_TRANS_TYPE(check);
+
+		printk("%sTransaction Type: %u, %s\n", pfx, trans_type,
+		       trans_type < ARRAY_SIZE(ia_check_trans_type_strs) ?
+		       ia_check_trans_type_strs[trans_type] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_OPERATION) {
+		u8 op = CHECK_OPERATION(check);
+
+		/*
+		 * CACHE has more operation types than TLB or BUS, though the
+		 * name and the order are the same.
+		 */
+		u8 max_ops = (err_type == ERR_TYPE_CACHE) ? 9 : 7;
+
+		printk("%sOperation: %u, %s\n", pfx, op,
+		       op < max_ops ? ia_check_op_strs[op] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_LEVEL)
+		printk("%sLevel: %llu\n", pfx, CHECK_LEVEL(check));
+
+	if (validation_bits & CHECK_VALID_PCC)
+		print_bool("Processor Context Corrupt", pfx, check, CHECK_PCC);
+
+	if (validation_bits & CHECK_VALID_UNCORRECTED)
+		print_bool("Uncorrected", pfx, check, CHECK_UNCORRECTED);
+
+	if (validation_bits & CHECK_VALID_PRECISE_IP)
+		print_bool("Precise IP", pfx, check, CHECK_PRECISE_IP);
+
+	if (validation_bits & CHECK_VALID_RESTARTABLE_IP)
+		print_bool("Restartable IP", pfx, check, CHECK_RESTA
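The operation lookup in this patch shares one string table between the three check types and limits how much of it is visible per type. The sketch below isolates that idea in userspace C; op_strs and op_name() are hypothetical names, while the strings and the 9/7 limits are taken from the patch.

```c
enum err_types {
	ERR_TYPE_CACHE = 0,
	ERR_TYPE_TLB,
	ERR_TYPE_BUS,
	ERR_TYPE_MS,
	N_ERR_TYPES
};

/* One table for all check types; Cache uses all nine entries. */
static const char * const op_strs[] = {
	"generic error",
	"generic read",
	"generic write",
	"data read",
	"data write",
	"instruction fetch",
	"prefetch",
	"eviction",
	"snoop",
};

/* TLB and Bus only define the first seven operations. */
static const char *op_name(enum err_types type, unsigned int op)
{
	unsigned int max_ops = (type == ERR_TYPE_CACHE) ? 9 : 7;

	return op < max_ops ? op_strs[op] : "unknown";
}
```

Capping the index per type avoids a second, nearly identical table at the cost of the two magic numbers, which must be kept in step with the table.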
[PATCH v4 7/8] efi: Decode IA32/X64 MS Check structure
From: Yazen Ghannam

The IA32/X64 MS Check structure varies from the other Check structures in
the bit positions of its fields, and it includes an additional "Error Type"
field.

Decode the MS Check structure in a separate function.

Based on UEFI 2.7 Table 257. IA32/X64 MS Check Field Description.

Signed-off-by: Yazen Ghannam
---
Link: https://lkml.kernel.org/r/20180324184940.19762-8-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.

v1->v2:
* Add parentheses around "check" expression in macro.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 55 -
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 5e6716564dba..356b8d326219 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -57,6 +57,20 @@
 #define CHECK_BUS_TIME_OUT		BIT_ULL(32)
 #define CHECK_BUS_ADDR_SPACE(check)	(((check) & GENMASK_ULL(34, 33)) >> 33)
 
+#define CHECK_VALID_MS_ERR_TYPE		BIT_ULL(0)
+#define CHECK_VALID_MS_PCC		BIT_ULL(1)
+#define CHECK_VALID_MS_UNCORRECTED	BIT_ULL(2)
+#define CHECK_VALID_MS_PRECISE_IP	BIT_ULL(3)
+#define CHECK_VALID_MS_RESTARTABLE_IP	BIT_ULL(4)
+#define CHECK_VALID_MS_OVERFLOW		BIT_ULL(5)
+
+#define CHECK_MS_ERR_TYPE(check)	(((check) & GENMASK_ULL(18, 16)) >> 16)
+#define CHECK_MS_PCC			BIT_ULL(19)
+#define CHECK_MS_UNCORRECTED		BIT_ULL(20)
+#define CHECK_MS_PRECISE_IP		BIT_ULL(21)
+#define CHECK_MS_RESTARTABLE_IP		BIT_ULL(22)
+#define CHECK_MS_OVERFLOW		BIT_ULL(23)
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -111,17 +125,56 @@ static const char * const ia_check_bus_addr_space_strs[] = {
 	"Other Transaction",
 };
 
+static const char * const ia_check_ms_error_type_strs[] = {
+	"No Error",
+	"Unclassified",
+	"Microcode ROM Parity Error",
+	"External Error",
+	"FRC Error",
+	"Internal Unclassified",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
 	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
 }
 
+static void print_err_info_ms(const char *pfx, u16 validation_bits, u64 check)
+{
+	if (validation_bits & CHECK_VALID_MS_ERR_TYPE) {
+		u8 err_type = CHECK_MS_ERR_TYPE(check);
+
+		printk("%sError Type: %u, %s\n", pfx, err_type,
+		       err_type < ARRAY_SIZE(ia_check_ms_error_type_strs) ?
+		       ia_check_ms_error_type_strs[err_type] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_MS_PCC)
+		print_bool("Processor Context Corrupt", pfx, check, CHECK_MS_PCC);
+
+	if (validation_bits & CHECK_VALID_MS_UNCORRECTED)
+		print_bool("Uncorrected", pfx, check, CHECK_MS_UNCORRECTED);
+
+	if (validation_bits & CHECK_VALID_MS_PRECISE_IP)
+		print_bool("Precise IP", pfx, check, CHECK_MS_PRECISE_IP);
+
+	if (validation_bits & CHECK_VALID_MS_RESTARTABLE_IP)
+		print_bool("Restartable IP", pfx, check, CHECK_MS_RESTARTABLE_IP);
+
+	if (validation_bits & CHECK_VALID_MS_OVERFLOW)
+		print_bool("Overflow", pfx, check, CHECK_MS_OVERFLOW);
+}
+
 static void print_err_info(const char *pfx, u8 err_type, u64 check)
 {
 	u16 validation_bits = CHECK_VALID_BITS(check);
 
+	/*
+	 * The MS Check structure varies a lot from the others, so use a
+	 * separate function for decoding.
+	 */
 	if (err_type == ERR_TYPE_MS)
-		return;
+		return print_err_info_ms(pfx, validation_bits, check);
 
 	if (validation_bits & CHECK_VALID_TRANS_TYPE) {
 		u8 trans_type = CHECK_TRANS_TYPE(check);
-- 
2.14.1
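To illustrate the different bit positions, here is a userspace sketch that decodes only the MS Check "Error Type" field. ms_error_type_strs and ms_err_type_name() are hypothetical names; the bit positions (valid bit 0, value in bits [18:16]) and the strings are those used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

static const char * const ms_error_type_strs[] = {
	"No Error",
	"Unclassified",
	"Microcode ROM Parity Error",
	"External Error",
	"FRC Error",
	"Internal Unclassified",
};

/*
 * Return the name of the MS Check error type, "unknown" for values past
 * the end of the table, or NULL when the field is not marked valid.
 */
static const char *ms_err_type_name(uint64_t check)
{
	size_t n = sizeof(ms_error_type_strs) / sizeof(ms_error_type_strs[0]);
	unsigned int err_type;

	if (!(check & (1ULL << 0)))	/* Error Type valid bit */
		return NULL;

	err_type = (check >> 16) & 0x7;	/* bits [18:16] */

	return err_type < n ? ms_error_type_strs[err_type] : "unknown";
}
```

In the Cache, TLB and Bus structures bits [17:16] carry the transaction type instead, which is why a shared decoder would misreport MS Check values.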
[PATCH v4 8/8] efi: Decode IA32/X64 Context Info structure
From: Yazen Ghannam

Print the fields of the IA32/X64 Context Information structure.

Print the "Register Array" as raw values. Some context types are defined
in the UEFI spec, so more detailed decoding may be added in the future.

Based on UEFI 2.7 section N.2.4.2.2 IA32/X64 Processor Context Information
Structure.

Signed-off-by: Yazen Ghannam
---
Link: https://lkml.kernel.org/r/20180324184940.19762-9-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* No change.

v1->v2:
* Add parentheses around "bits" expression in macro.
* Change VALID_PROC_CNTXT_INFO_NUM to VALID_PROC_CTX_INFO_NUM.
* Fix indentation on multi-line statements.
* Remove conditional to skip unknown context types. The context info
  should be printed even if the type is unknown. This is just like what
  we do for the error information.

 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 356b8d326219..2531de49f56c 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -10,6 +10,7 @@
 #define VALID_LAPIC_ID			BIT_ULL(0)
 #define VALID_CPUID_INFO		BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)	(((bits) & GENMASK_ULL(7, 2)) >> 2)
+#define VALID_PROC_CXT_INFO_NUM(bits)	(((bits) & GENMASK_ULL(13, 8)) >> 8)
 
 #define INFO_ERR_STRUCT_TYPE_CACHE					\
 	GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,	\
@@ -71,6 +72,9 @@
 #define CHECK_MS_RESTARTABLE_IP		BIT_ULL(22)
 #define CHECK_MS_OVERFLOW		BIT_ULL(23)
 
+#define CTX_TYPE_MSR			1
+#define CTX_TYPE_MMREG			7
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -134,6 +138,17 @@ static const char * const ia_check_ms_error_type_strs[] = {
 	"Internal Unclassified",
 };
 
+static const char * const ia_reg_ctx_strs[] = {
+	"Unclassified Data",
+	"MSR Registers (Machine Check and other MSRs)",
+	"32-bit Mode Execution Context",
+	"64-bit Mode Execution Context",
+	"FXSAVE Context",
+	"32-bit Mode Debug Registers (DR0-DR7)",
+	"64-bit Mode Debug Registers (DR0-DR7)",
+	"Memory Mapped Registers",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
 	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -242,6 +257,7 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
 	int i;
 	struct cper_ia_err_info *err_info;
+	struct cper_ia_proc_ctx *ctx_info;
 	char newpfx[64], infopfx[64];
 	u8 err_type;
 
@@ -305,4 +321,36 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 
 		err_info++;
 	}
+
+	ctx_info = (struct cper_ia_proc_ctx *)err_info;
+	for (i = 0; i < VALID_PROC_CXT_INFO_NUM(proc->validation_bits); i++) {
+		int size = sizeof(*ctx_info) + ctx_info->reg_arr_size;
+		int groupsize = 4;
+
+		printk("%sContext Information Structure %d:\n", pfx, i);
+
+		printk("%sRegister Context Type: %s\n", newpfx,
+		       ctx_info->reg_ctx_type < ARRAY_SIZE(ia_reg_ctx_strs) ?
+		       ia_reg_ctx_strs[ctx_info->reg_ctx_type] : "unknown");
+
+		printk("%sRegister Array Size: 0x%04x\n", newpfx,
+		       ctx_info->reg_arr_size);
+
+		if (ctx_info->reg_ctx_type == CTX_TYPE_MSR) {
+			groupsize = 8; /* MSRs are 8 bytes wide. */
+			printk("%sMSR Address: 0x%08x\n", newpfx,
+			       ctx_info->msr_addr);
+		}
+
+		if (ctx_info->reg_ctx_type == CTX_TYPE_MMREG) {
+			printk("%sMM Register Address: 0x%016llx\n", newpfx,
+			       ctx_info->mm_reg_addr);
+		}
+
+		printk("%sRegister Array:\n", newpfx);
+		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, groupsize,
+			       (ctx_info + 1), ctx_info->reg_arr_size, 0);
+
+		ctx_info = (struct cper_ia_proc_ctx *)((long)ctx_info + size);
+	}
 }
-- 
2.14.1
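Because each Context Information structure is followed by a register array of variable size, the structures cannot be indexed like an array and have to be walked one by one. The sketch below shows that walk in userspace C. It is not the patch's code: struct ctx_hdr is a hypothetical mirror of struct cper_ia_proc_ctx whose 16-byte layout is assumed from the fields the patch prints, walk_ctx() is a made-up helper, and the bounds check is an addition for illustration that the patch does not perform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fixed header; the register array follows it directly. */
struct ctx_hdr {
	uint16_t reg_ctx_type;
	uint16_t reg_arr_size;	/* size of the trailing array in bytes */
	uint32_t msr_addr;
	uint64_t mm_reg_addr;
};

/*
 * Count the complete context structures in buf, visiting at most max of
 * them and stopping early if a structure would run past len bytes.
 */
static size_t walk_ctx(const uint8_t *buf, size_t len, size_t max)
{
	size_t off = 0, n = 0;

	while (n < max && len - off >= sizeof(struct ctx_hdr)) {
		struct ctx_hdr h;
		size_t size;

		/* memcpy avoids unaligned access into the raw record. */
		memcpy(&h, buf + off, sizeof(h));
		size = sizeof(h) + h.reg_arr_size;

		if (size > len - off)
			break;

		off += size;	/* step over header plus register array */
		n++;
	}

	return n;
}
```

The step "header size plus reg_arr_size" corresponds to the size variable in the patch's loop; max plays the role of VALID_PROC_CXT_INFO_NUM().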
[PATCH v4 6/8] efi: Decode additional IA32/X64 Bus Check fields
From: Yazen Ghannam

The "Participation Type", "Time Out", and "Address Space" fields are
unique to the IA32/X64 Bus Check structure. Print these fields.

Based on UEFI 2.7 Table 256. IA32/X64 Bus Check Structure

Signed-off-by: Yazen Ghannam
---
Link: https://lkml.kernel.org/r/20180324184940.19762-7-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.

v1->v2:
* Add parentheses around "check" expression in macro.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 44 +
 1 file changed, 44 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index f70c46f7a4db..5e6716564dba 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -39,6 +39,10 @@
 #define CHECK_VALID_RESTARTABLE_IP	BIT_ULL(6)
 #define CHECK_VALID_OVERFLOW		BIT_ULL(7)
 
+#define CHECK_VALID_BUS_PART_TYPE	BIT_ULL(8)
+#define CHECK_VALID_BUS_TIME_OUT	BIT_ULL(9)
+#define CHECK_VALID_BUS_ADDR_SPACE	BIT_ULL(10)
+
 #define CHECK_VALID_BITS(check)		(((check) & GENMASK_ULL(15, 0)))
 #define CHECK_TRANS_TYPE(check)		(((check) & GENMASK_ULL(17, 16)) >> 16)
 #define CHECK_OPERATION(check)		(((check) & GENMASK_ULL(21, 18)) >> 18)
@@ -49,6 +53,10 @@
 #define CHECK_RESTARTABLE_IP		BIT_ULL(28)
 #define CHECK_OVERFLOW			BIT_ULL(29)
 
+#define CHECK_BUS_PART_TYPE(check)	(((check) & GENMASK_ULL(31, 30)) >> 30)
+#define CHECK_BUS_TIME_OUT		BIT_ULL(32)
+#define CHECK_BUS_ADDR_SPACE(check)	(((check) & GENMASK_ULL(34, 33)) >> 33)
+
 enum err_types {
 	ERR_TYPE_CACHE = 0,
 	ERR_TYPE_TLB,
@@ -89,6 +97,20 @@ static const char * const ia_check_op_strs[] = {
 	"snoop",
 };
 
+static const char * const ia_check_bus_part_type_strs[] = {
+	"Local Processor originated request",
+	"Local Processor responded to request",
+	"Local Processor observed",
+	"Generic",
+};
+
+static const char * const ia_check_bus_addr_space_strs[] = {
+	"Memory Access",
+	"Reserved",
+	"I/O",
+	"Other Transaction",
+};
+
 static inline void print_bool(char *str, const char *pfx, u64 check, u64 bit)
 {
 	printk("%s%s: %s\n", pfx, str, (check & bit) ? "true" : "false");
@@ -139,6 +161,28 @@ static void print_err_info(const char *pfx, u8 err_type, u64 check)
 
 	if (validation_bits & CHECK_VALID_OVERFLOW)
 		print_bool("Overflow", pfx, check, CHECK_OVERFLOW);
+
+	if (err_type != ERR_TYPE_BUS)
+		return;
+
+	if (validation_bits & CHECK_VALID_BUS_PART_TYPE) {
+		u8 part_type = CHECK_BUS_PART_TYPE(check);
+
+		printk("%sParticipation Type: %u, %s\n", pfx, part_type,
+		       part_type < ARRAY_SIZE(ia_check_bus_part_type_strs) ?
+		       ia_check_bus_part_type_strs[part_type] : "unknown");
+	}
+
+	if (validation_bits & CHECK_VALID_BUS_TIME_OUT)
+		print_bool("Time Out", pfx, check, CHECK_BUS_TIME_OUT);
+
+	if (validation_bits & CHECK_VALID_BUS_ADDR_SPACE) {
+		u8 addr_space = CHECK_BUS_ADDR_SPACE(check);
+
+		printk("%sAddress Space: %u, %s\n", pfx, addr_space,
+		       addr_space < ARRAY_SIZE(ia_check_bus_addr_space_strs) ?
+		       ia_check_bus_addr_space_strs[addr_space] : "unknown");
+	}
 }
 
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
-- 
2.14.1
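The three Bus-only fields can be pulled out of the 64-bit check value as follows. This is a userspace sketch; struct bus_fields and decode_bus_fields() are hypothetical names, and only the bit positions (valid bits 8 to 10, values in bits [31:30], 32 and [34:33]) come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the Bus-only fields and their valid flags. */
struct bus_fields {
	bool part_type_valid;
	bool time_out_valid;
	bool addr_space_valid;
	unsigned int part_type;		/* bits [31:30] */
	bool time_out;			/* bit 32 */
	unsigned int addr_space;	/* bits [34:33] */
};

static struct bus_fields decode_bus_fields(uint64_t check)
{
	struct bus_fields f = {
		.part_type_valid  = (check & (1ULL << 8)) != 0,
		.time_out_valid   = (check & (1ULL << 9)) != 0,
		.addr_space_valid = (check & (1ULL << 10)) != 0,
		.part_type        = (unsigned int)((check >> 30) & 0x3),
		.time_out         = (check & (1ULL << 32)) != 0,
		.addr_space       = (unsigned int)((check >> 33) & 0x3),
	};

	return f;
}
```

A caller would consult each *_valid flag before printing the corresponding value, in the same way the patch tests validation_bits first.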
[PATCH v4 4/8] efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs
From: Yazen Ghannam

For easier handling, match the known IA32/X64 error structure GUIDs to
enums.

Also, print out the name of the matching Error Structure Type.

Only print the GUID for unknown types.

GUIDs taken from UEFI 2.7 section N.2.4.2.1 IA32/X64 Processor Error
Information Structure.

Signed-off-by: Yazen Ghannam
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-5-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Only print raw GUID for unknown error types.

v1->v2:
* Change use of enum type to u8.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 47 +++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index e0633a103fcf..5438097b93ac 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -11,17 +11,53 @@
 #define VALID_CPUID_INFO		BIT_ULL(1)
 #define VALID_PROC_ERR_INFO_NUM(bits)	(((bits) & GENMASK_ULL(7, 2)) >> 2)
 
+#define INFO_ERR_STRUCT_TYPE_CACHE					\
+	GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72, 0x24, 0x9B,	\
+		  0x57, 0x3F, 0xAD, 0x2C)
+#define INFO_ERR_STRUCT_TYPE_TLB					\
+	GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25, 0x0A, 0x3B,	\
+		  0x9A, 0xDB, 0x63, 0xC3)
+#define INFO_ERR_STRUCT_TYPE_BUS					\
+	GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59, 0x5E, 0xEF,	\
+		  0x92, 0xFF, 0xA6, 0x3C)
+#define INFO_ERR_STRUCT_TYPE_MS						\
+	GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3, 0xB0, 0xB5,	\
+		  0xB0, 0xA7, 0x43, 0x14)
+
 #define INFO_VALID_CHECK_INFO		BIT_ULL(0)
 #define INFO_VALID_TARGET_ID		BIT_ULL(1)
 #define INFO_VALID_REQUESTOR_ID		BIT_ULL(2)
 #define INFO_VALID_RESPONDER_ID		BIT_ULL(3)
 #define INFO_VALID_IP			BIT_ULL(4)
 
+enum err_types {
+	ERR_TYPE_CACHE = 0,
+	ERR_TYPE_TLB,
+	ERR_TYPE_BUS,
+	ERR_TYPE_MS,
+	N_ERR_TYPES
+};
+
+static enum err_types cper_get_err_type(const guid_t *err_type)
+{
+	if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_CACHE))
+		return ERR_TYPE_CACHE;
+	else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_TLB))
+		return ERR_TYPE_TLB;
+	else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_BUS))
+		return ERR_TYPE_BUS;
+	else if (guid_equal(err_type, &INFO_ERR_STRUCT_TYPE_MS))
+		return ERR_TYPE_MS;
+	else
+		return N_ERR_TYPES;
+}
+
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
 	int i;
 	struct cper_ia_err_info *err_info;
 	char newpfx[64];
+	u8 err_type;
 
 	if (proc->validation_bits & VALID_LAPIC_ID)
 		printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
@@ -38,8 +74,15 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 	for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
 		printk("%sError Information Structure %d:\n", pfx, i);
 
-		printk("%sError Structure Type: %pUl\n", newpfx,
-		       &err_info->err_type);
+		err_type = cper_get_err_type(&err_info->err_type);
+		printk("%sError Structure Type: %s\n", newpfx,
+		       err_type < ARRAY_SIZE(cper_proc_error_type_strs) ?
+		       cper_proc_error_type_strs[err_type] : "unknown");
+
+		if (err_type >= N_ERR_TYPES) {
+			printk("%sError Structure Type: %pUl\n", newpfx,
+			       &err_info->err_type);
+		}
 
 		if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
 			printk("%sCheck Information: 0x%016llx\n", newpfx,
-- 
2.14.1
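For readers following the GUID matching in this patch, the same mapping can also be written as a table lookup instead of an if/else chain. The following is only a userspace sketch of that alternative, not the code from the patch. The guid_t type and GUID_INIT() macro below are local stand-ins that assume the kernel's layout (first three fields stored little-endian), and known_guids and get_err_type() are hypothetical names chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's guid_t (assumption: 16 raw bytes). */
typedef struct { uint8_t b[16]; } guid_t;

/*
 * Stand-in for the kernel's GUID_INIT(): the first three fields are
 * stored little-endian, the trailing eight bytes are stored as given.
 */
#define GUID_INIT(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)		\
	{{ (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff,		\
	   ((a) >> 24) & 0xff,						\
	   (b) & 0xff, ((b) >> 8) & 0xff,				\
	   (c) & 0xff, ((c) >> 8) & 0xff,				\
	   (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }}

enum err_types {
	ERR_TYPE_CACHE = 0,
	ERR_TYPE_TLB,
	ERR_TYPE_BUS,
	ERR_TYPE_MS,
	N_ERR_TYPES
};

/* GUID values copied from the patch; indexed by enum err_types. */
static const guid_t known_guids[N_ERR_TYPES] = {
	[ERR_TYPE_CACHE] = GUID_INIT(0xA55701F5, 0xE3EF, 0x43DE, 0xAC, 0x72,
				     0x24, 0x9B, 0x57, 0x3F, 0xAD, 0x2C),
	[ERR_TYPE_TLB]   = GUID_INIT(0xFC06B535, 0x5E1F, 0x4562, 0x9F, 0x25,
				     0x0A, 0x3B, 0x9A, 0xDB, 0x63, 0xC3),
	[ERR_TYPE_BUS]   = GUID_INIT(0x1CF3F8B3, 0xC5B1, 0x49a2, 0xAA, 0x59,
				     0x5E, 0xEF, 0x92, 0xFF, 0xA6, 0x3C),
	[ERR_TYPE_MS]    = GUID_INIT(0x48AB7F57, 0xDC34, 0x4f6c, 0xA7, 0xD3,
				     0xB0, 0xB5, 0xB0, 0xA7, 0x43, 0x14),
};

/* Return the matching enum value, or N_ERR_TYPES for an unknown GUID. */
enum err_types get_err_type(const guid_t *g)
{
	int i;

	for (i = 0; i < N_ERR_TYPES; i++)
		if (!memcmp(g, &known_guids[i], sizeof(*g)))
			return (enum err_types)i;

	return N_ERR_TYPES;
}
```

As in the patch, N_ERR_TYPES doubles as the "unknown" result, so a caller can use a single range check to decide whether to print a name or fall back to the raw GUID.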
[PATCH v4 3/8] efi: Decode IA32/X64 Processor Error Info Structure
From: Yazen Ghannam

Print the fields in the IA32/X64 Processor Error Info Structure.

Based on UEFI 2.7 Table 253. IA32/X64 Processor Error Information
Structure.

Signed-off-by: Yazen Ghannam
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-4-yazen.ghan...@amd.com

v3->v4:
* Drop INDENT_SP use.

v2->v3:
* Fix table number in commit message.
* Don't print raw validation bits.

v1->v2:
* Add parentheses around "bits" expression in macro.
* Fix indentation on multi-line statements.

 drivers/firmware/efi/cper-x86.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
index 863f0cd2a0ff..e0633a103fcf 100644
--- a/drivers/firmware/efi/cper-x86.c
+++ b/drivers/firmware/efi/cper-x86.c
@@ -9,9 +9,20 @@
  */
 #define VALID_LAPIC_ID			BIT_ULL(0)
 #define VALID_CPUID_INFO		BIT_ULL(1)
+#define VALID_PROC_ERR_INFO_NUM(bits)	(((bits) & GENMASK_ULL(7, 2)) >> 2)
+
+#define INFO_VALID_CHECK_INFO		BIT_ULL(0)
+#define INFO_VALID_TARGET_ID		BIT_ULL(1)
+#define INFO_VALID_REQUESTOR_ID		BIT_ULL(2)
+#define INFO_VALID_RESPONDER_ID		BIT_ULL(3)
+#define INFO_VALID_IP			BIT_ULL(4)
 
 void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 {
+	int i;
+	struct cper_ia_err_info *err_info;
+	char newpfx[64];
+
 	if (proc->validation_bits & VALID_LAPIC_ID)
 		printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
 
@@ -20,4 +31,41 @@ void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
 		print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
 			       sizeof(proc->cpuid), 0);
 	}
+
+	snprintf(newpfx, sizeof(newpfx), "%s ", pfx);
+
+	err_info = (struct cper_ia_err_info *)(proc + 1);
+	for (i = 0; i < VALID_PROC_ERR_INFO_NUM(proc->validation_bits); i++) {
+		printk("%sError Information Structure %d:\n", pfx, i);
+
+		printk("%sError Structure Type: %pUl\n", newpfx,
+		       &err_info->err_type);
+
+		if (err_info->validation_bits & INFO_VALID_CHECK_INFO) {
+			printk("%sCheck Information: 0x%016llx\n", newpfx,
+			       err_info->check_info);
+		}
+
+		if (err_info->validation_bits & INFO_VALID_TARGET_ID) {
+			printk("%sTarget Identifier: 0x%016llx\n",
+			       newpfx, err_info->target_id);
+		}
+
+		if (err_info->validation_bits & INFO_VALID_REQUESTOR_ID) {
+			printk("%sRequestor Identifier: 0x%016llx\n",
+			       newpfx, err_info->requestor_id);
+		}
+
+		if (err_info->validation_bits & INFO_VALID_RESPONDER_ID) {
+			printk("%sResponder Identifier: 0x%016llx\n",
+			       newpfx, err_info->responder_id);
+		}
+
+		if (err_info->validation_bits & INFO_VALID_IP) {
+			printk("%sInstruction Pointer: 0x%016llx\n",
+			       newpfx, err_info->ip);
+		}
+
+		err_info++;
+	}
 }
-- 
2.14.1
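The loop in this patch relies on two things: the number of Error Information Structures is encoded in bits 7:2 of the record's validation bits, and the structures follow the fixed header directly, which is why the walk starts at (proc + 1). A rough userspace sketch of that pattern is below. The struct layouts are simplified stand-ins for cper_sec_proc_ia and cper_ia_err_info (an assumption for illustration, not the kernel definitions), the mask 0xFC is written out in place of GENMASK_ULL(7, 2), and count_valid_ip() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Bits 7:2 of the record's validation bits; 0xFC equals GENMASK_ULL(7, 2). */
#define VALID_PROC_ERR_INFO_NUM(bits)	(((bits) & 0xFCULL) >> 2)
#define INFO_VALID_IP			(1ULL << 4)

/* Simplified stand-in (assumption) for struct cper_sec_proc_ia. */
struct proc_hdr {
	uint64_t validation_bits;
	uint64_t lapic_id;
	uint8_t  cpuid[48];
};

/* Simplified stand-in (assumption) for struct cper_ia_err_info. */
struct err_info {
	uint8_t  err_type[16];
	uint64_t validation_bits;
	uint64_t check_info;
	uint64_t target_id;
	uint64_t requestor_id;
	uint64_t responder_id;
	uint64_t ip;
};

/* Count the error info structures that carry a valid instruction pointer. */
size_t count_valid_ip(const struct proc_hdr *proc)
{
	/* The first structure starts right after the fixed header. */
	const struct err_info *info = (const struct err_info *)(proc + 1);
	size_t n = VALID_PROC_ERR_INFO_NUM(proc->validation_bits);
	size_t i, hits = 0;

	for (i = 0; i < n; i++, info++)
		if (info->validation_bits & INFO_VALID_IP)
			hits++;

	return hits;
}
```

For example, validation bits of 0x0F give a structure count of 3, since the low two bits are the LAPIC ID and CPUID flags and do not contribute to the count.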
[PATCH v4 2/8] efi: Decode IA32/X64 Processor Error Section
From: Yazen Ghannam

Recognize the IA32/X64 Processor Error Section.

Do the section decoding in a new "cper-x86.c" file and add this to the
Makefile depending on a new "UEFI_CPER_X86" config option.

Print the Local APIC ID and CPUID info from the Processor Error Record.

The "Processor Error Info" and "Processor Context" fields will be
decoded in following patches.

Based on UEFI 2.7 Table 252. Processor Error Record.

Signed-off-by: Yazen Ghannam
---
Link:
https://lkml.kernel.org/r/20180324184940.19762-3-yazen.ghan...@amd.com

v3->v4:
* No changes.

v2->v3:
* Fix table number in commit message.
* Don't print raw validation bits.

v1->v2:
* Change config option depends to "X86" instead of "X86_32 || X86_64".
* Remove extra newline in Makefile changes.
* Drop author copyright line.

 drivers/firmware/efi/Kconfig    |  5 +
 drivers/firmware/efi/Makefile   |  1 +
 drivers/firmware/efi/cper-x86.c | 23 +++
 drivers/firmware/efi/cper.c     | 10 ++
 include/linux/cper.h            |  2 ++
 5 files changed, 41 insertions(+)
 create mode 100644 drivers/firmware/efi/cper-x86.c

diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 3098410abad8..781a4a337557 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -174,6 +174,11 @@ config UEFI_CPER_ARM
 	depends on UEFI_CPER && ( ARM || ARM64 )
 	default y
 
+config UEFI_CPER_X86
+	bool
+	depends on UEFI_CPER && X86
+	default y
+
 config EFI_DEV_PATH_PARSER
 	bool
 	depends on ACPI
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index cb805374f4bc..5f9f5039de50 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -31,3 +31,4 @@ obj-$(CONFIG_ARM)			+= $(arm-obj-y)
 obj-$(CONFIG_ARM64)			+= $(arm-obj-y)
 obj-$(CONFIG_EFI_CAPSULE_LOADER)	+= capsule-loader.o
 obj-$(CONFIG_UEFI_CPER_ARM)		+= cper-arm.o
+obj-$(CONFIG_UEFI_CPER_X86)		+= cper-x86.o
diff --git a/drivers/firmware/efi/cper-x86.c b/drivers/firmware/efi/cper-x86.c
new file mode 100644
index ..863f0cd2a0ff
--- /dev/null
+++ b/drivers/firmware/efi/cper-x86.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018, Advanced Micro Devices, Inc.
+
+#include
+
+/*
+ * We don't need a "CPER_IA" prefix since these are all locally defined.
+ * This will save us a lot of line space.
+ */
+#define VALID_LAPIC_ID			BIT_ULL(0)
+#define VALID_CPUID_INFO		BIT_ULL(1)
+
+void cper_print_proc_ia(const char *pfx, const struct cper_sec_proc_ia *proc)
+{
+	if (proc->validation_bits & VALID_LAPIC_ID)
+		printk("%sLocal APIC_ID: 0x%llx\n", pfx, proc->lapic_id);
+
+	if (proc->validation_bits & VALID_CPUID_INFO) {
+		printk("%sCPUID Info:\n", pfx);
+		print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, proc->cpuid,
+			       sizeof(proc->cpuid), 0);
+	}
+}
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index c165933ebf38..5a59b582c9aa 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -469,6 +469,16 @@ cper_estatus_print_section(const char *pfx, struct acpi_hest_generic_data *gdata
 			cper_print_proc_arm(newpfx, arm_err);
 		else
 			goto err_section_too_small;
+#endif
+#if defined(CONFIG_UEFI_CPER_X86)
+	} else if (guid_equal(sec_type, &CPER_SEC_PROC_IA)) {
+		struct cper_sec_proc_ia *ia_err = acpi_hest_get_payload(gdata);
+
+		printk("%ssection_type: IA32/X64 processor error\n", newpfx);
+		if (gdata->error_data_length >= sizeof(*ia_err))
+			cper_print_proc_ia(newpfx, ia_err);
+		else
+			goto err_section_too_small;
 #endif
 	} else {
 		const void *err = acpi_hest_get_payload(gdata);
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 4b5f8459b403..9c703a0abe6e 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -551,5 +551,7 @@ const char *cper_mem_err_unpack(struct trace_seq *,
 				struct cper_mem_err_compact *);
 void cper_print_proc_arm(const char *pfx,
 			 const struct cper_sec_proc_arm *proc);
+void cper_print_proc_ia(const char *pfx,
+			const struct cper_sec_proc_ia *proc);
 
 #endif
-- 
2.14.1
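Two checks in this patch work together: cper.c only calls the decoder when the section payload is at least as large as the fixed header, and the decoder only prints a field when its validation bit is set. The sketch below combines both in one userspace function, purely as an illustration. struct proc_hdr is a simplified stand-in for cper_sec_proc_ia (an assumption), and format_lapic_id() is a hypothetical helper that writes to a buffer with snprintf() where the patch uses printk().

```c
#include <stdint.h>
#include <stdio.h>

#define VALID_LAPIC_ID	(1ULL << 0)

/* Simplified stand-in (assumption) for struct cper_sec_proc_ia. */
struct proc_hdr {
	uint64_t validation_bits;
	uint64_t lapic_id;
	uint8_t  cpuid[48];
};

/*
 * Format the Local APIC ID line into buf.
 *
 * Returns the snprintf() result on success, 0 if the LAPIC ID is not
 * marked valid (buf is left untouched), or -1 if the payload is shorter
 * than the fixed header and must not be decoded at all.
 */
int format_lapic_id(char *buf, size_t len, const char *pfx,
		    const void *payload, size_t payload_len)
{
	const struct proc_hdr *proc = payload;

	/* Same idea as the error_data_length check in cper.c. */
	if (payload_len < sizeof(*proc))
		return -1;

	/* Same idea as the validation_bits gate in cper_print_proc_ia(). */
	if (!(proc->validation_bits & VALID_LAPIC_ID))
		return 0;

	return snprintf(buf, len, "%sLocal APIC_ID: 0x%llx\n", pfx,
			(unsigned long long)proc->lapic_id);
}
```

Doing the length check before touching any field keeps a truncated section from being read past its end, which is the purpose of the err_section_too_small path in the patch.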
[PATCH v4 0/8] Decode IA32/X64 CPER
From: Yazen Ghannam

This series adds decoding for the IA32/X64 Common Platform Error Record.

Patch 1 fixes the IA32/X64 Processor Error Section definition to match
the UEFI spec.

Patches 2-8 add the new decoding. The patches incrementally add the
decoding starting from the top-level "Error Section". Hopefully, this
will make reviewing a bit easier compared to one large patch.

The formatting of the field names and options is taken from the UEFI
spec. I tried to keep everything the same to make searching easier.

The patches were written to the UEFI 2.7 spec though the definition of
the IA32/X64 CPER seems to be the same as when it was introduced in the
UEFI 2.1 spec.

Link:
https://lkml.kernel.org/r/20180324184940.19762-1-yazen.ghan...@amd.com

Changes V3 to V4:
* Drop INDENT_SP use.

Changes V2 to V3:
* Fix table numbers in commit messages.
* Don't print raw validation bits.
* Only print GUID for unknown error types.

Changes V1 to V2:
* Remove stable request for all patches.
* Address Ard's comments on formatting and other issues.
* In Patch 8, always print context info even if the type is not
  recognized.

Yazen Ghannam (8):
  efi: Fix IA32/X64 Processor Error Record definition
  efi: Decode IA32/X64 Processor Error Section
  efi: Decode IA32/X64 Processor Error Info Structure
  efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs
  efi: Decode IA32/X64 Cache, TLB, and Bus Check structures
  efi: Decode additional IA32/X64 Bus Check fields
  efi: Decode IA32/X64 MS Check structure
  efi: Decode IA32/X64 Context Info structure

 drivers/firmware/efi/Kconfig    |   5 +
 drivers/firmware/efi/Makefile   |   1 +
 drivers/firmware/efi/cper-x86.c | 356
 drivers/firmware/efi/cper.c     |  10 ++
 include/linux/cper.h            |   4 +-
 5 files changed, 375 insertions(+), 1 deletion(-)
 create mode 100644 drivers/firmware/efi/cper-x86.c

-- 
2.14.1