Hi Borislav, On 1/10/2019 3:38 AM, Borislav Petkov wrote: > On Mon, Jan 07, 2019 at 10:37:24AM +0000, S, Shirish wrote: >> This patch adds threshold quirk applicable for family 15 > Same issue with "This patch" here. Agree, have avoided the phrase in both patches that originate from this discussion. > >> in resume path as well, since mce_amd_feature_init() >> does not have quirks applied when originating from mce_syscore_resume(), >> resulting in the below message at every successful resume: >> >> "[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..." >> >> Signed-off-by: Shirish S <shiris...@amd.com> >> --- >> arch/x86/kernel/cpu/mce/amd.c | 34 ++++++++++++++++++++++++++++++++++ >> 1 file changed, 34 insertions(+) >> >> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c >> index 89298c8..27cbf66 100644 >> --- a/arch/x86/kernel/cpu/mce/amd.c >> +++ b/arch/x86/kernel/cpu/mce/amd.c >> @@ -545,6 +545,34 @@ prepare_threshold_block(unsigned int bank, unsigned int >> block, u32 addr, >> return offset; >> } >> >> +void disable_err_thresholding(struct cpuinfo_x86 *c) >> +{ >> + int i; >> + u64 hwcr; >> + bool need_toggle; >> + u32 msrs[] = { >> + 0x00000413, /* MC4_MISC0 */ >> + 0xc0000408, /* MC4_MISC1 */ >> + }; >> + >> + if (c->x86_model >= 0x10 && c->x86_model <= 0x7f) { > You can save yourself an indentation level by reversing the logic here: > > if (c->x86 != 0x15) > return; > > Also, I'm wondering if you simply can't do > > if (c->x86_model < 0x10) > return; > > The assumption being that all the models - even after 0x7f - are highly > unlikely to get MC4_MISC thresholding supported, all of a sudden. Might > wanna run it by HW guys first though.
Perhaps no need to check the model altogether, this is applicable to entire family 15, since there is no 8th Gen. > >> + rdmsrl(MSR_K7_HWCR, hwcr); >> + >> + /* McStatusWrEn has to be set */ >> + need_toggle = !(hwcr & BIT(18)); >> + >> + if (need_toggle) >> + wrmsrl(MSR_K7_HWCR, hwcr | BIT(18)); >> + >> + /* Clear CntP bit safely */ >> + for (i = 0; i < ARRAY_SIZE(msrs); i++) >> + msr_clear_bit(msrs[i], 62); >> + >> + /* restore old settings */ >> + if (need_toggle) >> + wrmsrl(MSR_K7_HWCR, hwcr); >> + } > So you copied the same code from __mcheck_cpu_apply_quirks(). > > No. > > In a first patch, you carve that CntP clearing code in a separate > function disable_err_thresholding() like you've done before. Agree, first patch here: https://lkml.org/lkml/2019/1/10/70 > Then, in a second patch, you call it from mce/amd.c and you move all the > family/model checks inside the function so that you have a sole > > disable_err_thresholding(); > > calls where you need them. > > You don't have to pass in struct cpuinfo_x86 *c - you can use > boot_cpu_data in the function. Agree, second patch here: https://lkml.org/lkml/2019/1/10/72 > >> +} >> /* cpu init entry point, called from mce.c with preempt off */ >> void mce_amd_feature_init(struct cpuinfo_x86 *c) >> { >> @@ -552,6 +580,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c) >> unsigned int bank, block, cpu = smp_processor_id(); >> int offset = -1; >> >> + /* Disable error thresholding bank in S3 resume path as well, > What S3 resume path? That's the CPU init path. I have also put in the commit message, but for discussion sake, there are 2 code paths leading to mce_amd_feature_init() as below. 1) S5 -> S0: (boot) secondary_startup_64 -> start_kernel -> identify_boot_cpu -> identify_cpu -> mcheck_cpu_init (calls __mcheck_cpu_apply_quirks before) -> mce_amd_feature_init 2) S3 -> S0: (resume) syscore_resume -> mce_syscore_resume -> mce_amd_feature_init I have summarized this sequence after capturing the call trace in both scenarios. Its clear that the quirks are not applied during S3-> S0 transition, hence the patch. I cannot move the quirk to mce_amd_feature_init() since amd.c is guarded by CONFIG_X86_MCE_AMD and the quirk should be applicable in case the config is off via __mcheck_cpu_apply_quirks() at boot. > Also, kernel comments style is: > > /* > * A sentence ending with a full-stop. > * Another sentence. ... > * More sentences. ... > */ > Agree, followed it in the above mentioned patches. Thanks. Regards, Shirish S > Good luck! > > :-) > -- Regards, Shirish S