[PATCH] x86/AMD: Fix Socket ID for LLC topology for AMD Fam17h systems

2016-08-31 Thread Yazen Ghannam
The Socket ID is ApicId[bits] on Fam17h systems. Change subtraction to logical AND when extracting socket_id from c->apicid. Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/

[PATCH v2] x86/mce: Always save severity in machine_check_poll

2017-06-21 Thread Yazen Ghannam
From: Yazen Ghannam The severity gives a hint as to how to handle the error. The notifier blocks can then use the severity to decide on an action. It's not necessary for machine_check_poll() to filter errors for the notifier chain, since each block will check its own set of conditions b

[PATCH v2] x86/MCE/AMD: Always give PANIC severity for UC errors IN_KERNEL context

2017-11-01 Thread Yazen Ghannam
From: Yazen Ghannam The AMD severity grading function was introduced in v4.1 and has remained logically unchanged with the exception of a separate SMCA severity grading function for SMCA systems. The current logic can possibly give MCE_AR_SEVERITY for uncorrectable errors in kernel context. The

[PATCH] x86/mce/AMD: Fix mce_severity_amd_smca() signature

2017-11-01 Thread Yazen Ghannam
From: Yazen Ghannam Change the err_ctx type to "enum context" to match the type passed in. Suggested-by: Borislav Petkov Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/mcheck/mce-severity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/

[PATCH 1/2] x86/MCE/AMD: Check for NULL banks in THR interrupt handler

2018-08-09 Thread Yazen Ghannam
From: Yazen Ghannam If threshold_init_device() fails then per_cpu(threshold_banks) will be deallocated. The thresholding interrupt handler will still be active, so it's possible to get a NULL pointer dereference if a THR interrupt happens and any of the structures are NULL. Exit the handl

[PATCH 2/2] x86/MCE/AMD: Skip creating kobjects with NULL names

2018-08-09 Thread Yazen Ghannam
From: Yazen Ghannam During mce_threshold_create_device() data structures are allocated for each CPU's MCA banks and thresholding blocks. These data structures are used to save information related to AMD's MCA Error Thresholding feature. The structures are used in the thresholding inte

[PATCH] x86/mce: Handle varying MCA bank counts

2018-07-27 Thread Yazen Ghannam
From: Yazen Ghannam Linux reads MCG_CAP[Count] to find the number of MCA banks visible to a CPU. Currently, this is assumed to be the same for all CPUs and a warning is shown if there is a difference. The number of banks is overwritten with the MCG_CAP[Count] value of each following CPU that

[PATCH] PCI/ACPI: Disable AER when _OSC control bit is clear.

2018-01-11 Thread Yazen Ghannam
From: Yazen Ghannam Currently, aer_service_init() checks if AER is available and that Firmware First handling is not enabled. The _OSC request for AER is not taken into account when deciding to enable AER in Linux. We should check that the _OSC control for AER is set. If it's not the

[PATCH 3/3] x86/MCE/AMD: Get address from already initialized block

2018-02-01 Thread Yazen Ghannam
From: Yazen Ghannam The block address is saved after the block is initialized when threshold_init_device() is called. Use the saved block address, if available, rather than trying to rediscover it. We can avoid some *on_cpu() calls in the init path that will cause a call trace when resuming

[PATCH 1/3] x86/MCE/AMD: Redo function to get SMCA bank type

2018-02-01 Thread Yazen Ghannam
From: Yazen Ghannam Pass the bank number to smca_get_bank_type() since that's all we need. Also, we should compare the bank number to the size of the smca_banks array not the number of bank types. Bank types are reused for multiple banks, so the number of types can be different from the n
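
The bounds check described in this snippet can be sketched as follows. This is a minimal illustration, not the kernel's code: the type names, the `banks` array, and `get_bank_type()` are hypothetical stand-ins. The point is only that the bank number is compared against the size of the per-bank array, because several banks can share one type.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical bank types; the real list lives in the kernel sources. */
enum bank_type { TYPE_A, TYPE_B, N_BANK_TYPES };

/*
 * One entry per bank. Types are reused across banks, so this array
 * can be longer than N_BANK_TYPES.
 */
static const enum bank_type banks[] = {
	TYPE_A, TYPE_B, TYPE_B, TYPE_B,
};

/* Return the type of @bank, or N_BANK_TYPES if @bank is out of range. */
static enum bank_type get_bank_type(unsigned int bank)
{
	/* Compare against the number of banks, not the number of types. */
	if (bank >= ARRAY_SIZE(banks))
		return N_BANK_TYPES;

	return banks[bank];
}
```

With the check written against N_BANK_TYPES instead, bank 3 in this sketch would be rejected even though it has a valid entry.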

[PATCH 2/3] x86/MCE/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type

2018-02-01 Thread Yazen Ghannam
From: Yazen Ghannam Currently, bank 4 is reserved on Fam17h, so we chose not to initialize bank 4 in the smca_banks array. This means that when we check if a bank is initialized, like during boot or resume, we will see that bank 4 is not initialized and try to initialize it. This may cause a

[PATCH] ACPI / processor_idle: Set default C1 state description

2018-01-29 Thread Yazen Ghannam
From: Yazen Ghannam The acpi_idle driver will default to ACPI_CSTATE_HALT for C1 if a _CST object for C1 is not defined. However, the description will not be set, so users will see "" when reading the description from sysfs. Set the C1 state description when defaulting to ACPI_C

[PATCH v2] PCI/ACPI: Disable AER when _OSC control bit is clear.

2018-01-15 Thread Yazen Ghannam
From: Yazen Ghannam Currently, aer_service_init() checks if AER is available and that Firmware First handling is not enabled. The _OSC request for AER is not taken into account when deciding to enable AER in Linux. From ACPI 6.2 Section 6.2.11.3, "If any bits in the Control Field are

[PATCH v3 1/8] efi: Fix IA32/X64 Processor Error Record definition

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam Based on UEFI 2.7 Table 252. Processor Error Record, the "Local APIC_ID" field is 8 bytes but Linux defines this field as 1 byte. Fix this in the struct cper_sec_proc_ia definition. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/201802261939

[PATCH v3 8/8] efi: Decode IA32/X64 Context Info structure

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam Print the fields of the IA32/X64 Context Information structure. Print the "Register Array" as raw values. Some context types are defined in the UEFI spec, so more detailed decoding may be added in the future. Based on UEFI 2.7 section N.2.4.2.2 IA32/X64 Process

[PATCH v3 6/8] efi: Decode additional IA32/X64 Bus Check fields

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam The "Participation Type", "Time Out", and "Address Space" fields are unique to the IA32/X64 Bus Check structure. Print these fields. Based on UEFI 2.7 Table 256. IA32/X64 Bus Check Structure Signed-off-by: Yazen Ghannam --- L

[PATCH v3 7/8] efi: Decode IA32/X64 MS Check structure

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam The IA32/X64 MS Check structure varies from the other Check structures in the bit positions of its fields, and it includes an additional "Error Type" field. Decode the MS Check structure in a separate function. Based on UEFI 2.7 Table 257. IA32/X64 MS C

[PATCH v3 0/8] Decode IA32/X64 CPER

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam This series adds decoding for the IA32/X64 Common Platform Error Record. Patch 1 fixes the IA32/X64 Processor Error Section definition to match the UEFI spec. Patches 2-8 add the new decoding. The patches incrementally add the decoding starting from the top-level "

[PATCH v3 5/8] efi: Decode IA32/X64 Cache, TLB, and Bus Check structures

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam Print the common fields of the Cache, TLB, and Bus check structures. The fields of these three check types are the same except for a few more fields in the Bus check structure. The remaining Bus check structure fields will be decoded in a following patch. Based on UEFI 2.7

[PATCH v3 3/8] efi: Decode IA32/X64 Processor Error Info Structure

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam Print the fields in the IA32/X64 Processor Error Info Structure. Based on UEFI 2.7 Table 253. IA32/X64 Processor Error Information Structure. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20180226193904.20532-4-yazen.ghan...@amd.com v2->v3: * Fix ta

[PATCH v3 4/8] efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam For easier handling, match the known IA32/X64 error structure GUIDs to enums. Also, print out the name of the matching Error Structure Type. Only print the GUID for unknown types. GUIDs taken from UEFI 2.7 section N.2.4.2.1 IA32/X64 Processor Error Information Structure

[PATCH v3 2/8] efi: Decode IA32/X64 Processor Error Section

2018-03-24 Thread Yazen Ghannam
From: Yazen Ghannam Recognize the IA32/X64 Processor Error Section. Do the section decoding in a new "cper-x86.c" file and add this to the Makefile depending on a new "UEFI_CPER_X86" config option. Print the Local APIC ID and CPUID info from the Processor Error Record.

[PATCH 1/2] Revert "x86/mce/AMD: Collect error info even if valid bits are not set"

2018-03-26 Thread Yazen Ghannam
From: Yazen Ghannam This reverts commit 4b1e84276a6172980c5bf39aa091ba13e90d6dad. Software uses the valid bits to decide if the values can be used for further processing or other actions. So setting the valid bits will have software act on values that it shouldn't be acting on.
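
The role of the valid bits mentioned here can be sketched as below. The bit positions for ADDRV (58) and MISCV (59) follow the architectural MCA status layout; the `mca_record` struct and the helper names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_MISCV (1ULL << 59)	/* MCA_STATUS[MISCV]: MCA_MISC is valid */
#define STATUS_ADDRV (1ULL << 58)	/* MCA_STATUS[ADDRV]: MCA_ADDR is valid */

/* Hypothetical record of saved register contents. */
struct mca_record {
	uint64_t status;
	uint64_t addr;
	uint64_t misc;
};

/*
 * Consumers gate on the valid bit before acting on a saved value.
 * Returns true and stores the address only if ADDRV is set.
 */
static bool record_get_addr(const struct mca_record *r, uint64_t *addr)
{
	if (!(r->status & STATUS_ADDRV))
		return false;

	*addr = r->addr;
	return true;
}
```

Under this pattern, forcing a valid bit on would make `record_get_addr()` hand out a value that hardware never marked as usable, which is the concern the revert describes.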

[PATCH 2/2] x86/MCE: Always save MCA_{ADDR,MISC,SYND} register contents

2018-03-26 Thread Yazen Ghannam
From: Yazen Ghannam The Intel SDM and AMD APM both state that the contents of the MCA_ADDR register should be saved if MCA_STATUS[ADDRV] is set. The same applies to MCA_MISC and MCA_SYND (on SMCA systems) and their respective valid bits. However, the Fam17h Processor Programming Reference

[PATCH 2/3] EDAC/amd64: Only remove instances that exist

2018-03-21 Thread Yazen Ghannam
From: Yazen Ghannam An instance may have failed probing because the probed node did not have DRAM installed. When the module is unloaded we'll get a WARNing when we try to remove a non-existent instance. Save a bitmask of enabled instances with a bit for each node. Only try to remo
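
A rough sketch of the bitmask bookkeeping described above, with one bit per node. The variable and function names are hypothetical, and the 32-node limit is an assumption of this example rather than something stated in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 32	/* assumed limit for this sketch */

/* Bit N is set when the instance for node N probed successfully. */
static uint32_t enabled_instances;

/* Record the probe result for @node; returns @probed_ok unchanged. */
static bool record_probe(unsigned int node, bool probed_ok)
{
	if (probed_ok && node < MAX_NODES)
		enabled_instances |= 1u << node;

	return probed_ok;
}

/* Remove the instance for @node only if it exists. */
static bool remove_instance(unsigned int node)
{
	if (node >= MAX_NODES || !(enabled_instances & (1u << node)))
		return false;	/* nothing to remove, so no warning path */

	enabled_instances &= ~(1u << node);
	return true;
}
```

On unload, the caller would walk every node and call `remove_instance()`, which silently skips nodes whose probe failed.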

[PATCH 1/3] EDAC/amd64: Print ECC enabled/disabled for nodes with enabled MCs

2018-03-21 Thread Yazen Ghannam
From: Yazen Ghannam It's possible that a system can be used without any DRAM populated on one or more physical Dies on multi-die systems. Firmware will not enable DRAM ECC on Dies without DRAM. Users will then see a message about DRAM ECC disabled on those nodes without DRAM. However, DRA

[PATCH 3/3] EDAC/amd64: Add DIMM device type for Fam17h

2018-03-21 Thread Yazen Ghannam
From: Yazen Ghannam Set the DIMM device type for Fam17h. Cc: # 4.14.x Signed-off-by: Yazen Ghannam --- drivers/edac/amd64_edac.c | 17 ++--- drivers/edac/amd64_edac.h | 1 + 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/edac/amd64_edac.c b/drivers/edac

Re: [PATCH 0/4] MCE wrapper and support for new SMCA syndrome MSRs

2024-06-21 Thread Yazen Ghannam
On Fri, Jun 21, 2024 at 06:58:23PM +0200, Borislav Petkov wrote: > On Thu, May 30, 2024 at 04:16:16PM -0500, Avadhut Naik wrote: > > arch/x86/include/asm/mce.h | 20 ++- > > arch/x86/kernel/cpu/mce/apei.c | 111 ++ > > arch/x86/kernel/cpu/mce/core.c | 19

Re: [PATCH v2 4/4] EDAC/mce_amd: Add support for FRU Text in MCA

2024-06-27 Thread Yazen Ghannam
On Wed, Jun 26, 2024 at 08:20:13PM +0200, Borislav Petkov wrote: > On Wed, Jun 26, 2024 at 01:00:30PM -0500, Naik, Avadhut wrote: > > > > > > Why are you clearing it if you're overwriting it immediately? > > > > > Since it's a local variable, wanted to ensure that the memory is zeroed out > > to

[PATCH v2] x86/smpboot: Don't do mwait_play_dead() on AMD systems

2018-04-03 Thread Yazen Ghannam
From: Yazen Ghannam Recent AMD systems support using MWAIT for C1 state. However, MWAIT will not allow deeper cstates than C1 on current systems. With play_dead() we expect the OS to use the deepest state available. The deepest state available on AMD systems is reached through SystemIO or HALT

[PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-20 Thread Yazen Ghannam
32 or fewer MCA banks per CPU. Signed-off-by: Akshay Gupta [ Adjust commit message and code comment. ] Signed-off-by: Yazen Ghannam --- arch/x86/include/asm/mce.h | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h

Re: [PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-20 Thread Yazen Ghannam
On Thu, Aug 20, 2020 at 07:15:18PM +0200, Borislav Petkov wrote: > On Thu, Aug 20, 2020 at 05:06:24PM +0000, Yazen Ghannam wrote: > > From: Akshay Gupta > > > > ...because future AMD systems will support up to 64 MCA banks per CPU. > > > > MAX_NR_BANKS is

Re: [PATCH] x86/MCE/AMD, EDAC/mce_amd

2020-08-10 Thread Yazen Ghannam
On Sun, Aug 09, 2020 at 12:35:59PM +0800, Feng zhou wrote: > From: zhoufeng > > The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and > later systems. This function is used in amd64_edac_mod to do > system-specific decoding for DRAM ECC errors. The function takes a > "NodeId" as a

Re: [PATCH 2/2] x86/MCE/AMD Support new memory interleaving schemes during address translation

2020-08-18 Thread Yazen Ghannam
On Sat, Aug 15, 2020 at 11:13:36AM +0200, Ingo Molnar wrote: > > * Yazen Ghannam wrote: > > > + /* Read D18F1x208 (System Fabric ID Mask 0). */ > > + if (amd_df_indirect_read(nid, 1, 0x208, umc, &tmp)) > > + goto out_err; > > + > >

Re: [PATCH] EDAC/AMD64: Update scrub register addresses for newer models

2021-01-20 Thread Yazen Ghannam
On Mon, Jan 18, 2021 at 04:30:58AM +0300, WGH wrote: > On 16/01/2021 17:33, Yazen Ghannam wrote: > > From: Yazen Ghannam > > > > The Family 17h scrubber registers moved to different offset starting > > with Model 30h. The new register offsets are used for all currently

Re: [PATCH] EDAC/AMD64: Update scrub register addresses for newer models

2021-01-20 Thread Yazen Ghannam
On Mon, Jan 18, 2021 at 08:31:12PM +0100, Borislav Petkov wrote: > On Sat, Jan 16, 2021 at 02:33:53PM +0000, Yazen Ghannam wrote: > > +static struct { > > + u32 base, limit; > > +} f17h_scrub_regs = {F17H_M30H_SCR_BASE_ADDR, F17H_M30H_SCR_LIMIT_ADDR}; > > Why n

[PATCH] EDAC/AMD64: Update scrub register addresses for newer models

2021-01-16 Thread Yazen Ghannam
From: Yazen Ghannam The Family 17h scrubber registers moved to a different offset starting with Model 30h. The new register offsets are used for all currently available models since then. Use the new register addresses as the defaults. Set the proper scrub register addresses during module init

Re: [PATCH 1/2] EDAC/amd64: Merge sysfs debugging attributes setup code

2020-12-15 Thread Yazen Ghannam
ake the function static and shorten > static function names. > > No functional changes. > > Signed-off-by: Borislav Petkov Reviewed-by: Yazen Ghannam Thanks, Yazen

Re: [PATCH 2/2] EDAC/amd64: Merge error injection sysfs facilities

2020-12-15 Thread Yazen Ghannam
ed to the comment above, can this be changed to the following? if (pvt->fam < 0x10 || pvt->fam >= 0x17) > + return 0; > + return attr->mode; > +} > + Everything else looks good to me. Reviewed-by: Yazen Ghannam Thanks, Yazen

[PATCH] EDAC/amd64: Tone down messages about missing PCI IDs

2020-12-15 Thread Yazen Ghannam
From: Yazen Ghannam Give these messages a debug severity as they are really only useful to the module developers. Also, drop the "(broken BIOS?)" phrase, since this can cause churn for BIOS folks. The PCI IDs needed by the module, at least on modern systems, are fixed in hardware.

[PATCH] arm64: crypto: Add ARM64 CRC32 hw accelerated module

2014-11-19 Thread Yazen Ghannam
% speedup. Signed-off-by: Yazen Ghannam Acked-by: Steve Capper Acked-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig | 4 + arch/arm64/crypto/Makefile | 4 + arch/arm64/crypto/crc32-arm64.c | 274 3 files changed, 282 insertions(+) create

Re: [PATCH] arm64: crypto: Add ARM64 CRC32 hw accelerated module

2014-11-20 Thread Yazen Ghannam
+linux-arm-ker...@lists.infradead.org On Wed, Nov 19, 2014 at 11:19 AM, Yazen Ghannam wrote: > This module registers a crc32 algorithm and a crc32c algorithm > that use the optional CRC32 and CRC32C instructions in ARMv8. > > Tested on AMD Seattle. > > Improvement compared

Re: [PATCH] arm64: crypto: Add ARM64 CRC32 hw accelerated module

2014-11-25 Thread Yazen Ghannam
at 3:39 PM, Ard Biesheuvel wrote: > On 20 November 2014 15:22, Yazen Ghannam wrote: >> +linux-arm-ker...@lists.infradead.org >> >> On Wed, Nov 19, 2014 at 11:19 AM, Yazen Ghannam >> wrote: >>> This module registers a crc32 algorithm and a crc32c algorithm >>

[PATCH v4 0/8] Decode IA32/X64 CPER

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam This series adds decoding for the IA32/X64 Common Platform Error Record. Patch 1 fixes the IA32/X64 Processor Error Section definition to match the UEFI spec. Patches 2-8 add the new decoding. The patches incrementally add the decoding starting from the top-level "

[PATCH v4 3/8] efi: Decode IA32/X64 Processor Error Info Structure

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam Print the fields in the IA32/X64 Processor Error Info Structure. Based on UEFI 2.7 Table 253. IA32/X64 Processor Error Information Structure. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20180324184940.19762-4-yazen.ghan...@amd.com v3->v4: * D

[PATCH v4 2/8] efi: Decode IA32/X64 Processor Error Section

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam Recognize the IA32/X64 Processor Error Section. Do the section decoding in a new "cper-x86.c" file and add this to the Makefile depending on a new "UEFI_CPER_X86" config option. Print the Local APIC ID and CPUID info from the Processor Error Record.

[PATCH v4 4/8] efi: Decode UEFI-defined IA32/X64 Error Structure GUIDs

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam For easier handling, match the known IA32/X64 error structure GUIDs to enums. Also, print out the name of the matching Error Structure Type. Only print the GUID for unknown types. GUIDs taken from UEFI 2.7 section N.2.4.2.1 IA32/X64 Processor Error Information Structure

[PATCH v4 6/8] efi: Decode additional IA32/X64 Bus Check fields

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam The "Participation Type", "Time Out", and "Address Space" fields are unique to the IA32/X64 Bus Check structure. Print these fields. Based on UEFI 2.7 Table 256. IA32/X64 Bus Check Structure Signed-off-by: Yazen Ghannam --- L

[PATCH v4 8/8] efi: Decode IA32/X64 Context Info structure

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam Print the fields of the IA32/X64 Context Information structure. Print the "Register Array" as raw values. Some context types are defined in the UEFI spec, so more detailed decoding may be added in the future. Based on UEFI 2.7 section N.2.4.2.2 IA32/X64 Process

[PATCH v4 7/8] efi: Decode IA32/X64 MS Check structure

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam The IA32/X64 MS Check structure varies from the other Check structures in the bit positions of its fields, and it includes an additional "Error Type" field. Decode the MS Check structure in a separate function. Based on UEFI 2.7 Table 257. IA32/X64 MS C

[PATCH v4 5/8] efi: Decode IA32/X64 Cache, TLB, and Bus Check structures

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam Print the common fields of the Cache, TLB, and Bus check structures. The fields of these three check types are the same except for a few more fields in the Bus check structure. The remaining Bus check structure fields will be decoded in a following patch. Based on UEFI 2.7

[PATCH v4 1/8] efi: Fix IA32/X64 Processor Error Record definition

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam Based on UEFI 2.7 Table 252. Processor Error Record, the "Local APIC_ID" field is 8 bytes but Linux defines this field as 1 byte. Fix this in the struct cper_sec_proc_ia definition. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/201803241849

[PATCH] x86/smpboot: Don't do mwait_play_dead() on AMD systems

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam Recent AMD systems support using MWAIT for C1 state. However, MWAIT will not allow deeper cstates than C1 on current systems. With play_dead() we expect the OS to use the deepest state available. The deepest state available on AMD systems is reached through SystemIO or HALT

[PATCH] x86/MCE, EDAC/mce_amd: Save all aux registers on SMCA systems

2018-04-02 Thread Yazen Ghannam
From: Yazen Ghannam The Intel SDM and AMD APM both state that the auxiliary MCA registers should be read if their respective valid bits are set in MCA_STATUS. The Processor Programming Reference for AMD Fam17h systems has a new recommendation that the auxiliary registers should be saved

Re: [PATCH] x86/AMD: Fix LLC ID for AMD Fam17h systems

2016-10-27 Thread Yazen Ghannam
>> +/* >> + * LLC is at the Core Complex level. >> + * Core Complex Id is ApicId[3]. >> + */ >> +else if (c->x86 == 0x17) >> +per_cpu(cpu_llc_id, cpu) = c->initial_apicid

[PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family

2016-10-28 Thread Yazen Ghannam
10h and 15h will have a Node ID of 0 which will be the same as the phys_proc_id, so we don't need to check for multiple nodes before using the node_id. Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/amd.c | 32 1 file changed, 20 insertions(+), 12 dele

[PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems

2016-10-28 Thread Yazen Ghannam
-off-by: Yazen Ghannam Cc: # v4.4.. Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems") --- arch/x86/kernel/cpu/amd.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c ind

Re: [PATCH -v1.1] x86/topology: Document cpu_llc_id

2016-11-17 Thread Yazen Ghannam
On Thu, Nov 17, 2016 at 10:45:57AM +0100, Borislav Petkov wrote: > It means different things on Intel and AMD so write it down so that > there's no confusion. > > Signed-off-by: Borislav Petkov > Cc: Peter Zijlstra > Cc: Thomas Gleixner > Cc: Yazen Ghannam

Re: [tip:ras/core] x86/RAS: Simplify SMCA HWID descriptor struct

2016-11-10 Thread Yazen Ghannam
> static void get_smca_bank_info(unsigned int bank) > { > unsigned int i, hwid_mcatype, cpu = smp_processor_id(); > - struct smca_hwid_mcatype *type; > + struct smca_hwid *s_hwid; > u32 high, instance_id; > - u16 hwid, mcatype; > > /* Collect bank_info using CPU 0

Re: [tip:ras/core] x86/RAS: Simplify SMCA HWID descriptor struct

2016-11-10 Thread Yazen Ghannam
> > > > Argh, the macro should be adding the additional parentheses: > > > > #define HWID_MCATYPE(hwid, mcatype) (((hwid) << 16) | (mcatype)) > > > > That should fix the issue too. > Yep, sure does. > Patch please. Will do. Thanks, Yazen
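
Why the outer parentheses in the quoted macro matter can be shown with a small sketch. The thread does not say which expression actually misbehaved, so the comparison used here is only an assumed example, and `HWID_MCATYPE_UNSAFE` and both helper functions are hypothetical.

```c
#include <stdint.h>

/* Form quoted in the thread: the whole expansion is parenthesized. */
#define HWID_MCATYPE(hwid, mcatype) (((hwid) << 16) | (mcatype))

/* Hypothetical variant without the outer parentheses, for contrast. */
#define HWID_MCATYPE_UNSAFE(hwid, mcatype) ((hwid) << 16) | (mcatype)

/* Compares the combined value against @want, as intended. */
static int matches_safe(uint32_t hwid, uint32_t mcatype, uint32_t want)
{
	return HWID_MCATYPE(hwid, mcatype) == want;
}

/*
 * Expands to ((hwid) << 16) | ((mcatype) == want), because ==
 * binds tighter than |. The result is nonzero for any nonzero hwid.
 */
static int matches_unsafe(uint32_t hwid, uint32_t mcatype, uint32_t want)
{
	return HWID_MCATYPE_UNSAFE(hwid, mcatype) == want;
}
```

A compiler with warnings enabled will typically flag the unparenthesized expansion, which is one more reason to wrap macro bodies as a whole.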

Re: [PATCH v2 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family

2016-10-31 Thread Yazen Ghannam
> > > The NODEID_MSR feature only applies to Fam10h in which case the llc is at > > s/llc/LLC (Last Level Cache/ > > Let's try to have abbreviations written out in their first mention in the > text. > Okay. > > the node level. > > > > The TOPOEXT feature is used on families 15h, 16h and 17h

Re: [PATCH v2 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems

2016-10-31 Thread Yazen Ghannam
y 3. > > "... because then the LSBit will be the Core Complex ID." > Ack. > > We can fix the underflow bug and simplify the code by replacing the > > current cpu_llc_id derivation with a right shift. > > > > Signed-off-by: Yazen Ghannam > > Cc

[PATCH v3 2/2] x86/AMD: Group cpu_llc_id assignment by topology feature and family

2016-11-01 Thread Yazen Ghannam
node_id for TOPOEXT systems. Single node systems in families 10h and 15h will have a Node ID of 0 which will be the same as the phys_proc_id, so we don't need to check for multiple nodes before using the node_id. Signed-off-by: Yazen Ghannam --- Link: http://lkml.kernel.org/r/1477669918-56

[PATCH v3 1/2] x86/AMD: Fix cpu_llc_id for AMD Fam17h systems

2016-11-01 Thread Yazen Ghannam
bug and simplify the code by replacing the current cpu_llc_id derivation with a right shift. Signed-off-by: Yazen Ghannam Cc: # v4.4.. Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h systems") --- Link: http://lkml.kernel.org/r/1477669918-56261-1-git-
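
The shift-based derivation can be sketched as below, assuming (per the related thread, which places the Core Complex ID at ApicId[3]) a shift of 3. The subtraction helper is hypothetical and only illustrates how an unsigned difference can wrap; it is not taken from the code being replaced.

```c
#include <stdint.h>

/* Assumed: bit 3 of the APIC ID is the lowest Core Complex ID bit. */
#define CCX_SHIFT 3

/* LLC ID derived with a right shift; cannot underflow. */
static uint32_t llc_id_from_apicid(uint32_t apicid)
{
	return apicid >> CCX_SHIFT;
}

/*
 * Hypothetical subtraction-based derivation. With unsigned operands,
 * the result wraps to a huge value whenever @base exceeds @id.
 */
static uint32_t llc_id_by_subtraction(uint32_t id, uint32_t base)
{
	return id - base;
}
```

For example, APIC IDs 0x8 through 0xF all map to LLC ID 1 under the shift, while `llc_id_by_subtraction(1, 2)` wraps to 0xFFFFFFFF.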

[PATCH] x86/AMD: Fix LLC ID for AMD Fam17h systems

2016-10-26 Thread Yazen Ghannam
Fix an underflow bug with the current Fam17h LLC ID derivation by simplifying the derivation, and also move it into amd_get_topology(). Signed-off-by: Yazen Ghannam Cc: sta...@vger.kernel.org # v4.6.. Fixes: 3849e91f571d ("x86/AMD: Fix last level cache topology for AMD Fam17h sy

Re: linux-next: manual merge of the edac-amd tree with the edac tree

2016-12-01 Thread Yazen Ghannam
' > > With was introduced by this commit: > > commit d12a969ebbfcfc25853c4147d42b388f758e8784 > Author: Yazen Ghannam > Date: Thu Nov 17 17:57:32 2016 -0500 > > EDAC, amd64: Add Deferred Error type > > Currently, deferred errors are cla

Re: linux-next: manual merge of the edac-amd tree with the edac tree

2016-12-01 Thread Yazen Ghannam
On Thu, Dec 01, 2016 at 07:15:01PM +0100, Borislav Petkov wrote: > On Thu, Dec 01, 2016 at 11:02:04AM -0500, Yazen Ghannam wrote: > > A deferred error is an uncorrectable error whose handling can be > > deferred, i.e. it's not urgent. This affects the system behavior, but >

[PATCH] x86/mce/AMD: Give a name to MCA bank 3 to use with legacy MSRs

2017-03-21 Thread Yazen Ghannam
From: Yazen Ghannam MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name. However, MCA bank 3 is defined on Fam17h systems and can be accessed using legacy MSRs. Without a name we get a stack trace on Fam17h systems when trying to register sysfs files for bank 3 on kernels

[PATCH 2/2] x86/mce/AMD: Carve out SMCA bank configuration

2017-03-22 Thread Yazen Ghannam
From: Yazen Ghannam Scalable MCA systems have a new MCA_CONFIG register that we use to configure each bank. We currently use this when we set up thresholding. However, this is logically separate. Move setup of MCA_CONFIG into a separate function. Signed-off-by: Yazen Ghannam --- arch/x86

[PATCH 1/2] x86/mce/AMD: Redo use of SMCA MCA_DE{STAT,ADDR} registers

2017-03-22 Thread Yazen Ghannam
From: Yazen Ghannam We have support for the new SMCA MCA_DE{STAT,ADDR} registers in Linux. So we've used these registers in place of MCA_{STATUS,ADDR} on SMCA systems. However, the guidance for current implementations of SMCA is to continue using MCA_{STATUS,ADDR} and to use MCA_DE{STAT

[PATCH v2 2/4] x86/mce/AMD; EDAC,amd64: Move find_umc_channel() to AMD mcheck

2017-03-20 Thread Yazen Ghannam
we're only looking at UMCs in case the UMC instance IDs ever match up with other bank types. Signed-off-by: Yazen Ghannam --- Link: http://lkml.kernel.org/r/1486760120-60944-2-git-send-email-yazen.ghan...@amd.com v1->v2: - Redo commit message based on comments. - Add UMC bank type sanity

[PATCH v2 1/4] EDAC,mce_amd: Find node ID on SMCA systems using generic methods

2017-03-20 Thread Yazen Ghannam
From: Yazen Ghannam We should move away from using AMD-specific amd_get_nb_id() to find a node ID and move toward using generic Linux methods. We can use cpu_to_node() since NUMA should be working as expected on newly released Fam17h systems. Replace call to amd_get_nb_id() and related shifting

[PATCH v2 3/4] x86/mce/AMD: Mark Deferred errors as Action Optional on SMCA systems

2017-03-20 Thread Yazen Ghannam
From: Yazen Ghannam Give Deferred errors an Action Optional severity on SMCA systems so that the SRAO notifier block can potentially handle them. Signed-off-by: Yazen Ghannam --- Link: http://lkml.kernel.org/r/1486760120-60944-3-git-send-email-yazen.ghan...@amd.com v1->v2: - New in v2. Ba

[PATCH v2 0/4] Call memory_failure() on Deferred errors

2017-03-20 Thread Yazen Ghannam
From: Yazen Ghannam This set is based on an earlier 3 patch set. Patch 1: - Address comments by using cpu_to_node() when finding a node ID rather than amd_get_nb_id(). Link: http://lkml.kernel.org/r/1486760120-60944-1-git-send-email-yazen.ghan...@amd.com Patch 2: - Fix up commit message. Link

[PATCH v2 4/4] x86/mce: Add AMD SMCA support to SRAO notifier

2017-03-20 Thread Yazen Ghannam
From: Yazen Ghannam Deferred errors on AMD systems may get an Action Optional severity with the goal of being handled by the SRAO notifier block. However, the process of determining if an address is usable is different between Intel and AMD. So define vendor-specific functions for this. Also

[PATCH] x86/mce: Do feature check earlier

2017-03-15 Thread Yazen Ghannam
ature initialization will still happen after generic init. Signed-off-by: Yazen Ghannam --- arch/x86/kernel/cpu/mcheck/mce.c | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 17

Re: [PATCH 1/3] x86/RAS: Simplify SMCA bank descriptor struct

2016-11-04 Thread Yazen Ghannam
> > Call the struct simply smca_bank, its instance ID can be simply ->id. > Makes the code much more readable. > > Signed-off-by: Borislav Petkov Looks good to me. Please add: Tested-by: Yazen Ghannam Ditto for the others. Thanks, Yazen

Re: [PATCH] x86/mce: fix a wrong assignment of i_mce.status

2020-06-11 Thread Yazen Ghannam
_mce.status & ~MCI_STATUS_UC); > > + i_mce.status &= ~MCI_STATUS_UC; > > Boris: "git blame" says you wrote this code. Patch looks right (in > that it makes the code do what the comment just above says it is trying > to do): > > * - MCx_STATUS[UC] cleared: deferred errors are _not_ UC > > But this is AMD specific, so I'll defer judgement > Acked-by: Yazen Ghannam Thanks, Yazen
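
The difference between the two assignments can be sketched as follows. The quoted diff is truncated, so the "before" form here is an assumption about how such a statement can fail to clear a bit; only the `&=` form appears in full in the thread. MCI_STATUS_UC is bit 61 of the status register.

```c
#include <stdint.h>

#define MCI_STATUS_UC (1ULL << 61)	/* MCx_STATUS[UC] */

/*
 * Assumed shape of the buggy statement: OR-ing a value with a subset
 * of its own bits changes nothing, so UC stays set.
 */
static uint64_t clear_uc_noop(uint64_t status)
{
	status |= (status & ~MCI_STATUS_UC);
	return status;
}

/* Fixed form from the patch: mask the UC bit off. */
static uint64_t clear_uc(uint64_t status)
{
	status &= ~MCI_STATUS_UC;
	return status;
}
```

Only `clear_uc()` matches the comment cited in the review, which says deferred errors are not UC.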

[PATCH] EDAC/mce_amd: Add new error descriptions for existing types

2020-07-08 Thread Yazen Ghannam
From: Yazen Ghannam A few existing MCA bank types will have new error types in future SMCA systems. Add the descriptions for the new error types. Signed-off-by: Yazen Ghannam --- drivers/edac/mce_amd.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers

Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-29 Thread Yazen Ghannam
On Mon, Sep 28, 2020 at 08:14:07PM +0200, Borislav Petkov wrote: > On Mon, Sep 28, 2020 at 10:53:50AM -0500, Yazen Ghannam wrote: > > > I agree that the translation code is implementation-specific and applies > > only to DRAM ECC errors, so it makes sense to have it in amd64_

[PATCH] EDAC/amd64: Set proper family type for Family 19h Models 20h-2Fh

2020-10-09 Thread Yazen Ghannam
From: Yazen Ghannam AMD Family 19h Models 20h-2Fh use the same PCI IDs as Family 17h Models 70h-7Fh. The same family ops and number of channels also apply. Use the Family17h Model 70h family_type and ops for Family 19h Models 20h-2Fh. Update the controller name to match the system. Signed-off

Re: [PATCH v4] cper, apei, mce: Pass x86 CPER through the MCA handling chain

2020-09-25 Thread Yazen Ghannam
On Fri, Sep 25, 2020 at 09:54:06AM +0900, Punit Agrawal wrote: > Borislav Petkov writes: > > > On Thu, Sep 24, 2020 at 12:23:27PM -0500, Smita Koralahalli Channabasappa > > wrote: > >> > Even though it's not defined in the UEFI spec, it doesn't mean a > >> > structure definition cannot be create

Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-25 Thread Yazen Ghannam
On Fri, Sep 25, 2020 at 09:22:31AM +0200, Borislav Petkov wrote: > On Wed, Sep 23, 2020 at 11:25:10AM -0500, Yazen Ghannam wrote: > > I don't remember the original reason, and I was recently asked about > > this code living in a module. I did some looking after this ask, and

Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-28 Thread Yazen Ghannam
On Mon, Sep 28, 2020 at 11:47:59AM +0200, Borislav Petkov wrote: > On Fri, Sep 25, 2020 at 02:51:27PM -0500, Yazen Ghannam wrote: > > > The address translation needs to be done before the notifiers that need > > it, and EDAC comes after all of them. There's also the

Re: 5.6.12 MCE on AMD EPYC 7502

2020-05-29 Thread Yazen Ghannam
On Fri, May 29, 2020 at 07:57:20AM -0400, Borislav Petkov wrote: > On Fri, May 29, 2020 at 01:55:29PM +0300, Dmitry Antipov wrote: > > Hello, > > > > I'm facing the following kernel messages running Debian 9 with > > custom 5.6.12 kernel running on AMD EPYC 7502 - based hardware: > > > > [138537.

Re: [PATCH 1/3] x86/amd_nb: add AMD family 17h model 60h PCI IDs

2020-05-13 Thread Yazen Ghannam
islav Petkov > Cc: x...@kernel.org > Cc: Yazen Ghannam > Cc: Brian Woods > Cc: Clemens Ladisch > Cc: Jean Delvare > Cc: Guenter Roeck > Cc: linux-hw...@vger.kernel.org > Cc: linux-e...@vger.kernel.org Acked-by: Yazen Ghannam Thanks, Yazen

Re: [PATCH 3/3] EDAC/amd64: Add AMD family 17h model 60h PCI IDs

2020-05-13 Thread Yazen Ghannam
On Sun, May 10, 2020 at 04:48:42PM -0400, Alexander Monakov wrote: > Add support for AMD Renoir (4000-series Ryzen CPUs). > > Signed-off-by: Alexander Monakov > Cc: Thomas Gleixner > Cc: Borislav Petkov > Cc: x...@kernel.org > Cc: Yazen Ghannam > Cc: Brian Woods >

[PATCH v2] x86/mce: Increase maximum number of banks to 64

2020-08-28 Thread Yazen Ghannam
binary. However, in the case where it doesn't fit, an additional page (4kB) of memory will be added to the binary to accommodate the extra data. Signed-off-by: Akshay Gupta [ Adjust commit message and code comment. ] Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.

[PATCH v2 3/8] EDAC/mce_amd: Use struct cpuinfo_x86.node_id for NodeId

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and later systems. This function is used in amd64_edac_mod to do system-specific decoding for DRAM ECC errors. The function takes a "NodeId" as a parameter. In AMD documentation, NodeId is used to

[PATCH v2 4/8] x86/MCE/AMD: Use defines for register addresses in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam Replace raw register offset values in the AMD address translation code with named definitions. Also, drop comments that only note the register names. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 ->

[PATCH v2 0/8] AMD MCA Address Translation Updates

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam This patchset includes updates for the MCA Address Translation process on recent AMD systems. Patches 1 & 3: Fixes an input to the address translation function. The translation requires a physical Die ID (NodeId in AMD documentation) rather than a logical NUMA node ID.

[PATCH v2 5/8] x86/MCE/AMD: Use macros to get bitfields in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam Define macros to get individual bits and bitfields. Use these to make the code more readable. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * New patch based on comments for v1 Patch 2. arch/
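
For readers unfamiliar with the idea in patch 5/8, here is a minimal sketch of what bit and bitfield accessors for a 32-bit register value might look like. The helper names get_bit() and get_bits() are hypothetical and chosen only for illustration; the snippet above does not show the macros the patch actually defines, and this sketch uses inline functions rather than macros to keep the example type-safe.

```c
#include <stdint.h>

/*
 * Illustrative helpers only; the names and signatures are assumptions,
 * not the definitions introduced by the patch.
 */

/* Return the value (0 or 1) of a single bit in a 32-bit register value. */
static inline uint32_t get_bit(uint32_t reg, unsigned int bit)
{
	return (reg >> bit) & 1u;
}

/*
 * Return bits [hi:lo] (inclusive) of a 32-bit register value,
 * shifted down so the field starts at bit 0.
 * Requires 0 <= lo <= hi <= 31.
 */
static inline uint32_t get_bits(uint32_t reg, unsigned int hi, unsigned int lo)
{
	unsigned int width = hi - lo + 1;
	/* Avoid an undefined 32-bit shift when the full width is requested. */
	uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);

	return (reg >> lo) & mask;
}
```

With helpers like these, an open-coded expression such as (reg >> 8) & 0xFF can be written as get_bits(reg, 15, 8), which states the field boundaries directly.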

[PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation

2020-09-03 Thread Yazen Ghannam
interleaving option used. Fixes: 6e846239e548 ("EDAC/amd64: Add Family 17h Model 30h PCI IDs") Signed-off-by: Muralidhara M K Co-developed-by: Naveen Krishna Chtradhi Signed-off-by: Naveen Krishna Chtradhi Co-developed-by: Yazen Ghannam Signed-off-by: Yazen Ghannam --- Link: https://lkml.

[PATCH v2 7/8] x86/MCE/AMD: Group register reads in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam ...so that bitfield extraction can be done together to simplify future patches. Signed-off-by: Yazen Ghannam --- Link: https://lkml.kernel.org/r/20200814191449.183998-3-yazen.ghan...@amd.com v1 -> v2: * New patch based on comments for v1 Patch 2. arch/x86/kernel/cpu/

[PATCH v2 6/8] x86/MCE/AMD: Drop tmp variable in translation code

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam Remove the "tmp" variable used to save register values. Save the values in existing variables, if possible. The register values are 32 bits. Use separate "reg_" variables to hold the register values if the existing variable sizes don't match

[PATCH v2 2/8] x86/CPU/AMD: Remove amd_get_nb_id()

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in struct cpuinfo_x86.node_id. Replace calls to amd_get_nb_id() with the logical CPU's node_id and remov

[PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

2020-09-03 Thread Yazen Ghannam
From: Yazen Ghannam AMD systems provide a "NodeId" value that represents a global ID indicating to which "Node" a logical CPU belongs. The "Node" is a physical structure equivalent to a Die, and it should not be confused with logical structures like NUMA node. Logic

Re: [PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-24 Thread Yazen Ghannam
On Thu, Aug 20, 2020 at 06:15:15PM +, Luck, Tony wrote: > >> How much does vmlinux size grow with your change? > >> > > > > It seems to get smaller. > > > > -rwxrwxr-x 1 yghannam yghannam 807634088 Aug 20 17:51 vmlinux-32banks > > -rwxrwxr-x 1 yghannam yghannam 807634072 Aug 20 17:50 vmlinu

Re: [PATCH v2 1/2] cper, apei, mce: Pass x86 CPER through the MCA handling chain

2020-09-01 Thread Yazen Ghannam
On Fri, Aug 28, 2020 at 03:33:31PM -0500, Smita Koralahalli wrote: ... > +int apei_mce_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 > lapic_id) > +{ > + const u64 *i_mce = ((const void *) (ctx_info + 1)); > + unsigned int cpu; > + struct mce m; > + > + if (!boot_cpu_has(

Re: [PATCH 1/2] x86/MCE/AMD, EDAC/mce_amd: Use AMD NodeId for Family17h+ DRAM Decode

2020-08-17 Thread Yazen Ghannam
On Sat, Aug 15, 2020 at 10:42:12AM +0200, Ingo Molnar wrote: > > * Yazen Ghannam wrote: > > > From: Yazen Ghannam > > > > The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and > > later systems. This function is used in amd64_edac_mod to do >
