RE: [PATCH v3] x86/paravirt: Disable virt spinlock on bare metal
> From: Nikolay Borisov
> [...]
> >> Actually now shouldn't the CONFIG_PARAVIRT_SPINLOCKS check be retained?
> >> Otherwise we'll have the virtspinlock enabled even if we are a guest
> >> but CONFIG_PARAVIRT_SPINLOCKS is disabled, no ?
> >>
> >
> > It seems to be the expected behavior? If CONFIG_PARAVIRT_SPINLOCKS is
> > disabled, should the virt_spin_lock_key be enabled in the guest?
>
> No, but if it's disabled and we are under a hypervisor shouldn't the virt
> spinlock be kept disabled?

No, the virt_spin_lock_key shouldn't be kept disabled. According to the
comments [1], when running under a hypervisor with CONFIG_PARAVIRT_SPINLOCKS
disabled, the virt_spin_lock_key should be enabled so that the lock falls
back to the TAS spinlock.

[1] https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/qspinlock.h#L94

According to the comments [2], my understanding is that, under a hypervisor,
keeping virt_spin_lock_key enabled lets the spinlock fall back to TAS if the
PV spinlock is not supported (either CONFIG_PARAVIRT_SPINLOCKS=n or the host
doesn't support the PV feature).

[2] https://github.com/torvalds/linux/blob/master/arch/x86/kernel/kvm.c#L1073

> As it stands now everytime we are under a
> hypervisor the virt spinlock is enabled irrespective of the PARAVIRT_SPINLOCK
> config state.

According to [1] and [2], yes, I think so.

-Qiuxu
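The behaviour discussed in this reply can be sketched as a small decision model. This is only an illustration of the reasoning in the mail, not the kernel implementation: the enum, the function name pick_lock_path(), and its three boolean parameters are hypothetical names introduced here.

```c
#include <stdbool.h>

/* Hypothetical labels for the three lock paths mentioned in the thread. */
enum lock_path {
	LOCK_NATIVE_QSPINLOCK,	/* bare metal: virt_spin_lock_key disabled */
	LOCK_PV_QSPINLOCK,	/* guest with working PV spinlocks */
	LOCK_TAS_FALLBACK,	/* guest without PV spinlocks: key stays enabled */
};

/*
 * Simplified model of the discussion (an assumption, not kernel code):
 * the key is only turned off on bare metal; in a guest without usable
 * PV spinlocks the enabled key selects the test-and-set fallback.
 */
enum lock_path pick_lock_path(bool on_hypervisor,
			      bool config_pv_spinlocks,
			      bool host_supports_pv)
{
	if (!on_hypervisor)
		return LOCK_NATIVE_QSPINLOCK;

	if (config_pv_spinlocks && host_supports_pv)
		return LOCK_PV_QSPINLOCK;

	return LOCK_TAS_FALLBACK;
}
```

In this model a guest built with CONFIG_PARAVIRT_SPINLOCKS=n lands on the TAS fallback, which is the case the question was about.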
RE: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
> [...]
>
> I think 507b460f8144 appeared in v5.11, so not something we broke in v5.12.
> Applied to pci/error for v5.13, thanks!

Thanks Bjorn!

> If I understand correctly, we previously only got this right in one
> case:
>
>    0 == PCI_SLOT(00.0)    # correct
>    1 == PCI_SLOT(00.1)    # incorrect
>    2 == PCI_SLOT(00.2)    # incorrect
>    ...
>    8 == PCI_SLOT(01.0)    # incorrect
>    9 == PCI_SLOT(01.1)    # incorrect
>    ...
>   31 == PCI_SLOT(03.7)    # incorrect

Yes, you're right. Thanks!

-Qiuxu
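The table above can be checked with a few lines of user-space C. The macros below use the same bit layout as the kernel's PCI_DEVFN()/PCI_SLOT()/PCI_FUNC() (device number in bits 7:3, function number in bits 2:0); the helper count_devfn_where_old_check_was_right() is a hypothetical name added for illustration.

```c
/* Same encoding as the kernel macros: device in bits 7:3, function in 2:0. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Count the devfn values for which the old comparison (bit index == devfn)
 * and the fixed comparison (bit index == PCI_SLOT(devfn)) select the same
 * bit of the bitmap.
 */
int count_devfn_where_old_check_was_right(void)
{
	int agree = 0;

	for (unsigned int devfn = 0; devfn < 256; devfn++)
		if (devfn == PCI_SLOT(devfn))
			agree++;

	return agree;
}
```

Only devfn 0 (device 0, function 0) satisfies devfn == PCI_SLOT(devfn), which matches the single "correct" row in the table.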
RE: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
Hi Bjorn,

Do you have any comments on this patch? If it needs any changes, please let
me know.

Thanks!
-Qiuxu

> -----Original Message-----
> From: Zhuo, Qiuxu
> Sent: Monday, February 22, 2021 9:17 AM
> To: Bjorn Helgaas
> Cc: Zhuo, Qiuxu; Lorenzo Pieralisi; Krzysztof Wilczyński; Kelley, Sean V;
> Luck, Tony; Jin, Wen; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
>
> Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single byte
> encoding the device and function number) as the device number to check
> whether the corresponding bit was set in the RCiEPBitmap of the RCEC (Root
> Complex Event Collector) while enumerating over each bit of the RCiEPBitmap.
>
> As per the PCI Express Base Specification, Revision 5.0, Version 1.0, Section
> 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to use a device
> number to check whether the corresponding bit was set in the RCiEPBitmap.
>
> Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value of
> "rciep->devfn" to a device number to ensure that the RCiEP devices associated
> with the RCEC are linked when the RCEC is enumerated.
>
> Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> Reported-and-tested-by: Wen Jin
> Reviewed-by: Sean V Kelley
> Signed-off-by: Qiuxu Zhuo
> ---
> v2->v3:
>  Drop "[ Krzysztof: Update commit message. ]" from the commit message
>
>  drivers/pci/pcie/rcec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
> index 2c5c552994e4..d0bcd141ac9c 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
>
>  	/* Same bus, so check bitmap */
>  	for_each_set_bit(devn, , 32)
> -		if (devn == rciep->devfn)
> +		if (devn == PCI_SLOT(rciep->devfn))
>  			return true;
>
>  	return false;
> --
> 2.17.1
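For readers who want to see the effect of the one-line change outside the kernel, here is a minimal user-space sketch. It is not the kernel function: assoc_old() and assoc_fixed() are hypothetical stand-ins that take the 32-bit RCiEPBitmap and a devfn byte directly, and the bit loop is written out by hand instead of using for_each_set_bit().

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)

/* Pre-fix comparison: the whole devfn byte is treated as the bit index. */
bool assoc_old(uint32_t bitmap, uint8_t devfn)
{
	for (unsigned int devn = 0; devn < 32; devn++)
		if ((bitmap & (UINT32_C(1) << devn)) &&
		    devn == (unsigned int)devfn)
			return true;
	return false;
}

/* Post-fix comparison: only the device number selects the bit. */
bool assoc_fixed(uint32_t bitmap, uint8_t devfn)
{
	for (unsigned int devn = 0; devn < 32; devn++)
		if ((bitmap & (UINT32_C(1) << devn)) &&
		    devn == (unsigned int)PCI_SLOT(devfn))
			return true;
	return false;
}
```

With a bitmap that has only bit 1 set (device 1), an RCiEP at 01.0 has devfn 8: assoc_old() misses it because bit 8 is clear, while assoc_fixed() finds it.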
RE: [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
> ...
> > [ Krzysztof: Update commit message. ]
> [...]
>
> Thank you! I appreciate that. However, we probably should drop this from the
> commit message. Perhaps either Bjorn or Lorenzo could do it when applying
> changes.

OK, will send out the v3 that drops "[ Krzysztof: Update commit message. ]"
from the commit message.

-Qiuxu
RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
> ...
>
> I took your suggestion and came up with the following:
>
>   Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
>   byte encoding the device and function number) as the device number to
>   check whether the corresponding bit was set in the RCiEPBitmap of the
>   RCEC (Root Complex Event Collector) while enumerating over each bit of
>   the RCiEPBitmap.
>
>   As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
>   Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
>   use a device number to check whether the corresponding bit was set in
>   the RCiEPBitmap.
>
>   Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
>   of "rciep->devfn" to a device number to ensure that the RCiEP devices
>   are associated with the RCEC are linked when the RCEC is enumerated.
>
> Using either of the following as the subject:
>
>   PCI/RCEC: Use device number to check RCiEPBitmap of RCEC
>   PCI/RCEC: Fix RCiEP capable devices RCEC association
>
> What do you think? Also, feel free to change whatever you see fit, of
> course, as this is only a suggestion.
>

Hi Krzysztof,

Thanks for improving the commit message. It looks clearer.
Will send out a v2 with this commit message.

Thanks!
-Qiuxu
RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
> ...
>
> We could probably add the following:
>
> Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
>

OK. Will add this to the v2.

Thanks!
-Qiuxu
RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices
Hi Krzysztof,

Sorry, just back from Chinese New Year holiday.

> From: Krzysztof Wilczyński
> ...
> Would this only affect error injection or would this be also a generic problem
> with the driver itself causing issues regardless of whether it was an error
> injection or not for this particular device? I am asking, as there is a lot
> going on in the commit message.

This is also a generic problem.

> I wonder if simplifying this commit message so that it clearly explains what
> was broken, why, and how this patch is fixing it, would perhaps be an option?
> The backstory of how you found the issue while doing some testing and error
> injection is nice, but not sure if needed.
>
> What do you think?

Agree to simplify the commit message. How about the following subject and
commit message?

Subject:

  Use device number to check RCiEPBitmap of RCEC

Commit message:

  rcec_assoc_rciep() used the combination of device number and function
  number 'devfn' to check whether the corresponding bit in the RCiEPBitmap
  of the RCEC was set. According to [1], only the device number is needed
  to check whether the corresponding bit in the RCiEPBitmap was set. So fix
  it by using PCI_SLOT() to convert 'devfn' to a device number in
  rcec_assoc_rciep().

  [1] PCIe r5.0, sec "7.9.10.2 Association Bitmap for RCiEPs"

Thanks!
-Qiuxu
RE: [RFC PATCH 5/9] PCI/AER: Apply function level reset to RCiEP on fatal error
> From: Jonathan Cameron
> Sent: Monday, July 27, 2020 7:17 PM
> To: Kelley, Sean V
> Cc: bhelg...@google.com; r...@rjwysocki.net; ashok@kernel.org; Luck, Tony;
> sathyanarayanan.kuppusw...@linux.intel.com; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org; Zhuo, Qiuxu
> Subject: Re: [RFC PATCH 5/9] PCI/AER: Apply function level reset to RCiEP
> on fatal error
>
> On Fri, 24 Jul 2020 10:22:19 -0700
> Sean V Kelley wrote:
>
> > From: Qiuxu Zhuo
> >
> > Attempt to do function level reset for an RCiEP associated with an
> > RCEC device on fatal error.
>
> I'd like to understand more on your reasoning for flr here.
> Is it simply that it is all we can do, or is there some basis in a spec
> somewhere?
>

Yes. Though there isn't the link reset for the RCiEP here, I think we should
still be able to reset the RCiEP via FLR on fatal error, if the RCiEP
supports FLR.

-Qiuxu

> >
> > Signed-off-by: Qiuxu Zhuo
> > ---
> >  drivers/pci/pcie/err.c | 31 ++-
> >  1 file changed, 22 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > index 044df004f20b..9b3ec94bdf1d 100644
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -170,6 +170,17 @@ static void pci_walk_dev_affected(struct pci_dev *dev, int (*cb)(struct pci_dev
> >  	}
> >  }
> >
> > +static enum pci_channel_state flr_on_rciep(struct pci_dev *dev)
> > +{
> > +	if (!pcie_has_flr(dev))
> > +		return PCI_ERS_RESULT_NONE;
> > +
> > +	if (pcie_flr(dev))
> > +		return PCI_ERS_RESULT_DISCONNECT;
> > +
> > +	return PCI_ERS_RESULT_RECOVERED;
> > +}
> > +
> >  pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> >  			enum pci_channel_state state,
> >  			pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
> > @@ -191,15 +202,17 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> >  	if (state == pci_channel_io_frozen) {
> >  		pci_walk_dev_affected(dev, report_frozen_detected, );
> >  		if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> > -			pci_warn(dev, "link reset not possible for RCiEP\n");
> > -			status = PCI_ERS_RESULT_NONE;
> > -			goto failed;
> > -		}
> > -
> > -		status = reset_link(dev);
> > -		if (status != PCI_ERS_RESULT_RECOVERED) {
> > -			pci_warn(dev, "link reset failed\n");
> > -			goto failed;
> > +			status = flr_on_rciep(dev);
> > +			if (status != PCI_ERS_RESULT_RECOVERED) {
> > +				pci_warn(dev, "function level reset failed\n");
> > +				goto failed;
> > +			}
> > +		} else {
> > +			status = reset_link(dev);
> > +			if (status != PCI_ERS_RESULT_RECOVERED) {
> > +				pci_warn(dev, "link reset failed\n");
> > +				goto failed;
> > +			}
> >  		}
> >  	} else {
> >  		pci_walk_dev_affected(dev, report_normal_detected, );
>
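The control flow of the quoted RFC patch can be modelled in plain C to make the three FLR outcomes explicit. This is a sketch only: the enum values, flr_on_rciep_model(), and frozen_recovery_proceeds() are hypothetical stand-ins for the kernel's PCI_ERS_RESULT_* values and for the patched branch in pcie_do_recovery(), and the boolean and integer parameters replace the real device queries.

```c
#include <stdbool.h>

/* Stand-ins for PCI_ERS_RESULT_NONE / _DISCONNECT / _RECOVERED. */
enum ers_result { ERS_NONE, ERS_DISCONNECT, ERS_RECOVERED };

/*
 * Models flr_on_rciep() from the patch: has_flr stands for pcie_has_flr(),
 * flr_rc for the return value of pcie_flr() (0 on success).
 */
enum ers_result flr_on_rciep_model(bool has_flr, int flr_rc)
{
	if (!has_flr)
		return ERS_NONE;

	if (flr_rc)
		return ERS_DISCONNECT;

	return ERS_RECOVERED;
}

/*
 * Models the frozen-channel branch after the patch: an RCiEP is reset
 * via FLR, any other device via the link reset callback. Returns false
 * where the patch would "goto failed".
 */
bool frozen_recovery_proceeds(bool is_rciep, bool has_flr, int flr_rc,
			      enum ers_result link_reset_result)
{
	enum ers_result status;

	if (is_rciep)
		status = flr_on_rciep_model(has_flr, flr_rc);
	else
		status = link_reset_result;

	return status == ERS_RECOVERED;
}
```

In this model an RCiEP without FLR support still ends up on the failure path, just as it did before the patch; only RCiEPs whose FLR succeeds continue with recovery.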
RE: [PATCH] EDAC, pnd2: Fix ioremap() size in dnv_rd_reg() from 64K -> 32K
>
> BIOS has marked the 32K MCHBAR window as reserved, so when dnv_rd_reg()
> tries to ioremap() a 64KB region you get warnings like:
>
> resource sanity check: requesting [mem 0xfed1-0xfed1], which spans
> more than reserved [mem 0xfed1-0xfed17fff] caller
> dnv_rd_reg+0xc8/0x240 [pnd2_edac] mapping multiple BARs
>
> ioremap() the correct size on Denverton platforms to get rid of those
> warnings.

I have several dmesg logs from loading the pnd2_edac driver successfully on
a Denverton server, but none of those logs contains such a warning.

-Qiuxu
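The warning quoted above is about a requested mapping that extends past the end of a reserved region. A minimal sketch of that containment check follows; mapping_fits() and EXAMPLE_BASE are hypothetical (the real addresses are truncated in the quoted log, so an arbitrary base is used here), and this is not the kernel's resource sanity check.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_32K		UINT64_C(0x8000)
#define SZ_64K		UINT64_C(0x10000)

/* Arbitrary base address for illustration only, not the real MCHBAR. */
#define EXAMPLE_BASE	UINT64_C(0x10000000)

/*
 * True if the mapping [start, start + len) lies entirely inside the
 * reserved region [res_start, res_end] (res_end inclusive).
 */
bool mapping_fits(uint64_t start, uint64_t len,
		  uint64_t res_start, uint64_t res_end)
{
	uint64_t end = start + len - 1;

	return len != 0 && start >= res_start && end <= res_end;
}
```

With a 32K reserved window starting at EXAMPLE_BASE, a 64K request starting at the same base does not fit, while a 32K request does.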
RE: [PATCH 1/3] x86/CPU: Add more Icelake model number
> From: Borislav Petkov [mailto:b...@alien8.de]
> > ...
> > Dropping my SOB or adding a text "[Qiuxu: Get the macros in the Ice Lake
> > group sorted by model number.]" at the end of the commit message - which
> > one is better/clear for you?
>
> I'll add that note when applying.
>
> Thx.

Thanks Boris!

-Qiuxu
RE: [PATCH 1/3] x86/CPU: Add more Icelake model number
> From: Borislav Petkov [mailto:b...@alien8.de]
> ...
> > From: Kan Liang
> >
> > Add the CPUID model number of Icelake (ICL) desktop and server
> > processors to the Intel family list.
> >
> > Signed-off-by: Kan Liang
> > Signed-off-by: Qiuxu Zhuo
>
> You're sending this patch but it has Qiuxu's SOB too. What's that supposed to
> mean?

Hi Boris,

During internal collaboration, I took Kan's original patch, sorted the
"#define" lines in the Ice Lake group by model number (the header of the
file requires the sorting), and added my SOB.

Should I drop my SOB, or add the text "[Qiuxu: Get the macros in the Ice
Lake group sorted by model number.]" at the end of the commit message?
Which one is better/clearer for you?

Thanks!
-Qiuxu
RE: [PATCH] Raise maximum number of memory controllers
Hi Justin,

> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller
> Skylake Socket#1 IMC#1

Just curious: does the system (two memory controllers per socket) have more
than 8 sockets? Normally, the socket number in the above string
"Skylake Socket#1 IMC#1" should be 7 (that is, 15/2), but it is 1 here.

Thanks!
-Qiuxu
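The arithmetic behind the "should be 7" remark can be written out as follows. It assumes, as the mail does, two memory controllers per socket and EDAC MC indices handed out sequentially socket by socket; the macro and both function names are hypothetical.

```c
/* Assumption from the mail: two integrated memory controllers per socket. */
#define IMCS_PER_SOCKET	2

/* Expected socket number for a sequentially assigned EDAC MC index. */
int mc_to_socket(int mc)
{
	return mc / IMCS_PER_SOCKET;
}

/* Expected IMC number within that socket. */
int mc_to_imc(int mc)
{
	return mc % IMCS_PER_SOCKET;
}
```

For MC15 this gives socket 7 and IMC 1, so the "IMC#1" part of the log line agrees with the expectation while "Socket#1" does not.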
RE: [PATCH] EDAC, sb_edac: mark expected switch fall-through
Hi Silva,

The actual intention of the code is NOT to fall through, though the current
code happens to work correctly.
Thanks for finding this. If you don't mind, I'll submit a fix for it with a
'Reported-by:' tag crediting you.

Thanks!
- Qiuxu

> From: linux-edac-ow...@vger.kernel.org [mailto:linux-edac-ow...@vger.kernel.org]
> On Behalf Of Gustavo A. R. Silva
> Sent: Saturday, October 14, 2017 4:28 AM
> To: Mauro Carvalho Chehab; Borislav Petkov
> Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; Gustavo A. R. Silva
> Subject: [PATCH] EDAC, sb_edac: mark expected switch fall-through
>
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we
> are expecting to fall through.
>
> Signed-off-by: Gustavo A. R. Silva
> ---
> This code was tested by compilation only (GCC 7.2.0 was used).
> Please, verify if the actual intention of the code is to fall through.
>
>  drivers/edac/sb_edac.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
> index 72b98a0..b50d714 100644
> --- a/drivers/edac/sb_edac.c
> +++ b/drivers/edac/sb_edac.c
> @@ -2485,6 +2485,7 @@ static int ibridge_mci_bind_devs(struct mem_ctl_info *mci,
>  	case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA:
>  	case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA:
>  		pvt->pci_ta = pdev;
> +		/* fall through */
>  	case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_RAS:
>  	case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS:
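Since the reply says the intent is not to fall through, the shape of such a fix would be an explicit break after the TA case rather than a fall-through comment. The following is a self-contained sketch of that pattern only: the enum values, struct pvt_model, its pci_ras field, and bind_dev() are hypothetical stand-ins (the body of the RAS case is not shown in the quoted diff), and this is not the patch that was eventually submitted.

```c
/* Hypothetical stand-ins for the PCI device IDs in the quoted diff. */
enum dev_id { DEV_HA0_TA, DEV_HA1_TA, DEV_HA0_RAS, DEV_HA1_RAS, DEV_OTHER };

/* Hypothetical stand-in for the driver's private data. */
struct pvt_model {
	int pci_ta;
	int pci_ras;
};

void bind_dev(struct pvt_model *pvt, enum dev_id id, int pdev)
{
	switch (id) {
	case DEV_HA0_TA:
	case DEV_HA1_TA:
		pvt->pci_ta = pdev;
		break;		/* explicit break: do not run the RAS case */
	case DEV_HA0_RAS:
	case DEV_HA1_RAS:
		pvt->pci_ras = pdev;
		break;
	default:
		break;
	}
}
```

With the break in place, binding a TA device leaves the RAS field untouched, and -Wimplicit-fallthrough has nothing to warn about.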
RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error
> From: Borislav Petkov [mailto:b...@alien8.de]
>
> Xiaolong,
>
> can you please run Qiuxu's patch to verify it fixes your issue?

Hi Boris,

I manually verified the fix on the Broadwell-DE server on which Xiaolong
found the bug: sb_edac loads successfully, and it correctly identifies the
size and type of the DIMMs installed in each slot (see attached
dmesg.sb_edac.on.Broadwell-DE.log).

Hi Xiaolong,

Would you please also test the patch with the LKP method that found the
issue last time?

Thanks!
BR
Qiuxu

Attachment: dmesg.sb_edac.on.Broadwell-DE.log
RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error
Hi Xiaolong,

This issue is fixed by the patch "EDAC, sb_edac: Avoid creating 'SOCK' EDAC
memory controller" (you were also CCed via the 'Reported-by' tag).
Thanks for this test case :-)

BR
qiuxu

-----Original Message-----
From: Zhuo, Qiuxu
Sent: Monday, June 5, 2017 9:08 PM
To: Ye, Xiaolong <xiaolong...@intel.com>
Cc: Borislav Petkov <b...@suse.de>; linux-edac <linux-e...@vger.kernel.org>; LKML <linux-kernel@vger.kernel.org>; Stephen Rothwell <s...@canb.auug.org.au>; l...@01.org
Subject: RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

Hi Xiaolong,

Thanks! I'll look at it and give feedback ASAP.

BR
qiuxu

-----Original Message-----
From: lkp-robot-requ...@eclists.intel.com [mailto:lkp-robot-requ...@eclists.intel.com] On Behalf Of Ye, Xiaolong
Sent: Monday, June 5, 2017 2:23 PM
To: Zhuo, Qiuxu <qiuxu.z...@intel.com>
Cc: Borislav Petkov <b...@suse.de>; linux-edac <linux-e...@vger.kernel.org>; LKML <linux-kernel@vger.kernel.org>; Stephen Rothwell <s...@canb.auug.org.au>; l...@01.org
Subject: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

FYI, we noticed the following commit:

commit: e2f747b1f42a2f6b0cf5416be1684c1b94a42f0f ("EDAC, sb_edac: Assign EDAC memory controller per h/w controller")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: netperf
with following parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 25%
	cluster: cs-localhost
	test: TCP_CRR
	cpufreq_governor: performance

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/

on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

kern :err : [ 17.309988] EDAC sbridge: Some needed devices are missing
kern :info : [ 17.504765] ast :07:00.0: fb0: astdrmfb frame buffer device
kern :info : [ 17.533508] EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV :ff:12.0
kern :err : [ 17.533529] EDAC sbridge: Couldn't find mci handler
kern :err : [ 17.533530] EDAC sbridge: Failed to register device with error -19.

To reproduce:

	git clone https://github.com/01org/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml  # job file is attached in this email
	bin/lkp run job.yaml

Thanks,
Xiaolong
RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error
Hi Xiaolong,

Thanks! I'll look at it and give feedback ASAP.

BR
qiuxu

-----Original Message-----
From: lkp-robot-requ...@eclists.intel.com [mailto:lkp-robot-requ...@eclists.intel.com] On Behalf Of Ye, Xiaolong
Sent: Monday, June 5, 2017 2:23 PM
To: Zhuo, Qiuxu <qiuxu.z...@intel.com>
Cc: Borislav Petkov <b...@suse.de>; linux-edac <linux-e...@vger.kernel.org>; LKML <linux-kernel@vger.kernel.org>; Stephen Rothwell <s...@canb.auug.org.au>; l...@01.org
Subject: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

FYI, we noticed the following commit:

commit: e2f747b1f42a2f6b0cf5416be1684c1b94a42f0f ("EDAC, sb_edac: Assign EDAC memory controller per h/w controller")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: netperf
with following parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 25%
	cluster: cs-localhost
	test: TCP_CRR
	cpufreq_governor: performance

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/

on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

kern :err : [ 17.309988] EDAC sbridge: Some needed devices are missing
kern :info : [ 17.504765] ast :07:00.0: fb0: astdrmfb frame buffer device
kern :info : [ 17.533508] EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV :ff:12.0
kern :err : [ 17.533529] EDAC sbridge: Couldn't find mci handler
kern :err : [ 17.533530] EDAC sbridge: Failed to register device with error -19.

To reproduce:

	git clone https://github.com/01org/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml  # job file is attached in this email
	bin/lkp run job.yaml

Thanks,
Xiaolong
RE: [tip:x86/urgent] x86/entry: Restore traditional SYSENTER calling convention
Hi Linus and Andy,

We tested on our side with v4.4-rc8 + Andy's vDSO v2 patches + Android M
(bionic libc using sysenter)
  ==> Device can boot up successfully

Other tests were:

- Android L (bionic libc using int80) + v4.4-rc8
  ==> Device can boot up successfully
- Android L (bionic libc using int80) + v4.4-rc8 + Andy's v2 patches
  ==> Device can boot up successfully
- Android M (bionic libc using sysenter) + v4.4-rc8
  ==> Device can NOT boot up successfully
- Android M (bionic libc using sysenter) + v4.4-rc8 + Andy's v2 patches
  ==> Device can boot up successfully

Thanks!
BR
qiuxu

-----Original Message-----
From: linus...@gmail.com [mailto:linus...@gmail.com] On Behalf Of Linus Torvalds
Sent: Tuesday, January 5, 2016 3:28 AM
To: H. Peter Anvin
Cc: Andy Lutomirski; Shi, Mingwei; Fu, Borun; Gross, Mark; Andrew Lutomirski; Su, Tao; Borislav Petkov; Ingo Molnar; Brian Gerst; linux-kernel@vger.kernel.org; Zhuo, Qiuxu; Thomas Gleixner; Denys Vlasenko; Wang, Frank; linux-tip-comm...@vger.kernel.org
Subject: Re: [tip:x86/urgent] x86/entry: Restore traditional SYSENTER calling convention

On Mon, Jan 4, 2016 at 10:48 AM, H. Peter Anvin wrote:
>
> Linus has frequently stated that if it is something that is critical
> enough for stable, it is critical enough for final. Linus will decide
> if an additional -rc is needed for that reason.

So it would have been good to have it in an -rc, but at the same time
I'm not particularly worried about this one. It's not like it's
complicated, and I'm assuming it got tested and passed all our current
test-cases (which are much more complete than anything we've ever had
historically).

Linus