RE: [PATCH v3] x86/paravirt: Disable virt spinlock on bare metal

2024-06-25 Thread Zhuo, Qiuxu
> From: Nikolay Borisov 
> [...]
> >> Actually now shouldn't the CONFIG_PARAVIRT_SPINLOCKS check be retained?
> >> Otherwise we'll have the virtspinlock enabled even if we are a guest
> >> but CONFIG_PARAVIRT_SPINLOCKS is disabled, no?
> >>
> >
> > It seems to be the expected behavior? If CONFIG_PARAVIRT_SPINLOCKS is
> > disabled, should the virt_spin_lock_key be enabled in the guest?
> 
> No, but if it's disabled and we are under a hypervisor shouldn't the virt
> spinlock be kept disabled? 

No, the virt_spin_lock_key shouldn't be kept disabled.

According to the comments [1], when running under a hypervisor with
CONFIG_PARAVIRT_SPINLOCKS disabled, the virt_spin_lock_key should be enabled
so that the spinlock falls back to the TAS (test-and-set) spinlock.

[1] 
https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/qspinlock.h#L94

According to the comments [2], my understanding is that, under a hypervisor,
keeping virt_spin_lock_key enabled allows the spinlock to fall back to TAS if
the PV spinlock is not supported (either CONFIG_PARAVIRT_SPINLOCKS=n or the
host doesn't support the PV feature).

[2] https://github.com/torvalds/linux/blob/master/arch/x86/kernel/kvm.c#L1073
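
For illustration only, the fallback discussed above can be modelled in user
space. The following is a minimal C11 sketch, not the kernel code:
`virt_spin_lock_enabled`, `struct model_lock` and all `model_*` functions are
hypothetical stand-ins for the static key and the qspinlock paths, and
`model_key_enabled()` encodes a simplified reading of this thread (key off on
bare metal, key on under a hypervisor when PV spinlocks are not in use).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the virt_spin_lock_key static key. */
static bool virt_spin_lock_enabled;

struct model_lock {
	atomic_int val;		/* 0 = unlocked, 1 = locked */
};

/*
 * Simplified reading of the thread: the key stays enabled only under a
 * hypervisor when PV spinlocks are not in use.
 */
static bool model_key_enabled(bool on_hypervisor, bool pv_spinlocks_active)
{
	return on_hypervisor && !pv_spinlocks_active;
}

/*
 * Model of the test-and-set fallback: returns true if the lock was taken
 * on the TAS path, false if the caller should use the regular queued
 * slow path instead (key disabled).
 */
static bool model_virt_spin_lock(struct model_lock *lock)
{
	int expected;

	if (!virt_spin_lock_enabled)
		return false;

	for (;;) {
		expected = 0;
		/* Try to move 0 -> 1; on failure, spin and retry. */
		if (atomic_compare_exchange_weak(&lock->val, &expected, 1))
			return true;
	}
}

static void model_spin_unlock(struct model_lock *lock)
{
	atomic_store(&lock->val, 0);
}
```

In this model a guest without PV spinlock support takes the TAS path, while
bare metal always returns false and would continue into the queued slow path.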

> As it stands now every time we are under a
> hypervisor the virt spinlock is enabled irrespective of the PARAVIRT_SPINLOCK
> config state.

According to [1] and [2], yes, I think so.

-Qiuxu 



RE: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association

2021-03-10 Thread Zhuo, Qiuxu
> [...]
> 
> I think 507b460f8144 appeared in v5.11, so not something we broke in v5.12.
> Applied to pci/error for v5.13, thanks!

Thanks Bjorn!

> If I understand correctly, we previously only got this right in one
> case:
> 
>    0 == PCI_SLOT(00.0)  # correct
>    1 == PCI_SLOT(00.1)  # incorrect
>    2 == PCI_SLOT(00.2)  # incorrect
>    ...
>    8 == PCI_SLOT(01.0)  # incorrect
>    9 == PCI_SLOT(01.1)  # incorrect
>    ...
>   31 == PCI_SLOT(03.7)  # incorrect

Yes, you're right. 
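
As a small self-contained check of the table quoted above, the sketch below
redefines the devfn bit layout locally (device number in bits 7:3, function
number in bits 2:0, the same layout as the kernel's PCI_SLOT() and PCI_FUNC()
macros). The `MODEL_*` macros and the `old_match()`, `new_match()` and
`count_old_correct()` helpers are names made up for this illustration.

```c
#include <stdbool.h>

/*
 * Same bit layout as the kernel's PCI_SLOT()/PCI_FUNC() macros:
 * devfn = (device << 3) | function.
 */
#define MODEL_PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define MODEL_PCI_FUNC(devfn)	((devfn) & 0x07)

/* Pre-fix comparison: the whole devfn byte is used as the device number. */
static bool old_match(unsigned int devn, unsigned int devfn)
{
	return devn == devfn;
}

/* Post-fix comparison: the device number is extracted first. */
static bool new_match(unsigned int devn, unsigned int devfn)
{
	return devn == MODEL_PCI_SLOT(devfn);
}

/* Count the devfn values whose own device-number bit the old test accepted. */
static unsigned int count_old_correct(void)
{
	unsigned int devfn, hits = 0;

	for (devfn = 0; devfn < 256; devfn++)
		if (old_match(MODEL_PCI_SLOT(devfn), devfn))
			hits++;

	return hits;
}
```

Under this model only devfn 0 (device 00, function 0) satisfies the old
comparison for its own device number, which matches the "one case" above.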

Thanks!
-Qiuxu



RE: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association

2021-03-04 Thread Zhuo, Qiuxu
Hi Bjorn,

Do you have any comments on this patch? If any changes are needed, please let
me know.
Thanks!

-Qiuxu

> -----Original Message-----
> From: Zhuo, Qiuxu 
> Sent: Monday, February 22, 2021 9:17 AM
> To: Bjorn Helgaas 
> Cc: Zhuo, Qiuxu; Lorenzo Pieralisi; Krzysztof Wilczyński; Kelley, Sean V;
> Luck, Tony; Jin, Wen; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH v3 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association
> 
> Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single byte
> encoding the device and function number) as the device number to check
> whether the corresponding bit was set in the RCiEPBitmap of the RCEC (Root
> Complex Event Collector) while enumerating over each bit of the RCiEPBitmap.
> 
> As per the PCI Express Base Specification, Revision 5.0, Version 1.0, Section
> 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to use a device
> number to check whether the corresponding bit was set in the RCiEPBitmap.
> 
> Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value of
> "rciep->devfn" to a device number to ensure that the RCiEP devices associated
> with the RCEC are linked when the RCEC is enumerated.
> 
> Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> Reported-and-tested-by: Wen Jin 
> Reviewed-by: Sean V Kelley 
> Signed-off-by: Qiuxu Zhuo 
> ---
> v2->v3:
>  Drop "[ Krzysztof: Update commit message. ]" from the commit message
> 
>  drivers/pci/pcie/rcec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
> index 2c5c552994e4..d0bcd141ac9c 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -32,7 +32,7 @@ static bool rcec_assoc_rciep(struct pci_dev *rcec, struct pci_dev *rciep)
>
>  	/* Same bus, so check bitmap */
>  	for_each_set_bit(devn, &bitmap, 32)
> -		if (devn == rciep->devfn)
> +		if (devn == PCI_SLOT(rciep->devfn))
>  			return true;
>
>  	return false;
> --
> 2.17.1
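
To make the effect of the one-line change concrete, here is a hedged
user-space model of the bitmap check before and after the fix. It is a sketch
only: `assoc_buggy()` and `assoc_fixed()` are hypothetical names, the bitmap
is passed in as a plain 32-bit value, and the kernel's for_each_set_bit()
helper is replaced by an explicit loop over the 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pre-fix model: walk the set bits and compare each bit index with devfn. */
static bool assoc_buggy(uint32_t bitmap, uint8_t devfn)
{
	unsigned int devn;

	for (devn = 0; devn < 32; devn++)
		if ((bitmap & (UINT32_C(1) << devn)) && devn == devfn)
			return true;

	return false;
}

/* Post-fix model: compare each set bit index with the device number only. */
static bool assoc_fixed(uint32_t bitmap, uint8_t devfn)
{
	unsigned int devn;

	for (devn = 0; devn < 32; devn++)
		if ((bitmap & (UINT32_C(1) << devn)) &&
		    devn == ((devfn >> 3) & 0x1fu))
			return true;

	return false;
}
```

With only bit 3 set, the pre-fix model misses device 03, function 7
(devfn 0x1f) and wrongly accepts device 00, function 3 (devfn 0x03); the
post-fix model gets both right.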



RE: [PATCH v2 1/1] PCI/RCEC: Fix RCiEP capable devices RCEC association

2021-02-21 Thread Zhuo, Qiuxu
> ...
> > [ Krzysztof: Update commit message. ]
> [...]
> 
> Thank you!  I appreciate that.  However, we probably should drop this from the
> commit message.  Perhaps either Bjorn or Lorenzo could do it when applying
> changes.

OK, will send out the v3 that drops "[ Krzysztof: Update commit message. ]" 
from the commit message.

-Qiuxu


RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices

2021-02-18 Thread Zhuo, Qiuxu
> ...
> 
> I took your suggestion and came up with the following:
> 
>   Function rcec_assoc_rciep() incorrectly used "rciep->devfn" (a single
>   byte encoding the device and function number) as the device number to
>   check whether the corresponding bit was set in the RCiEPBitmap of the
>   RCEC (Root Complex Event Collector) while enumerating over each bit of
>   the RCiEPBitmap.
> 
>   As per the PCI Express Base Specification, Revision 5.0, Version 1.0,
>   Section 7.9.10.2, "Association Bitmap for RCiEPs", p. 935, only needs to
>   use a device number to check whether the corresponding bit was set in
>   the RCiEPBitmap.
> 
>   Fix rcec_assoc_rciep() using the PCI_SLOT() macro and convert the value
>   of "rciep->devfn" to a device number to ensure that the RCiEP devices
>   associated with the RCEC are linked when the RCEC is enumerated.
>
> Using either of the following as the subject:
> 
>   PCI/RCEC: Use device number to check RCiEPBitmap of RCEC
>   PCI/RCEC: Fix RCiEP capable devices RCEC association
> 
> What do you think?  Also, feel free to change whatever you see fit, of
> course, as this is only a suggestion.
> 

Hi Krzysztof,

Thanks for improving the commit message. It looks clearer. 
Will send out a v2 with this commit message.

Thanks!
-Qiuxu



RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices

2021-02-18 Thread Zhuo, Qiuxu
>...
> 
> We could probably add the following:
> 
>   Fixes: 507b460f8144 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
> 

OK. Will add this to the v2.

Thanks!
-Qiuxu


RE: [PATCH 1/1] PCI/RCEC: Fix failure to inject errors to some RCiEP devices

2021-02-17 Thread Zhuo, Qiuxu
Hi Krzysztof,

Sorry for the late reply; I just got back from the Chinese New Year holiday.

> From: Krzysztof Wilczyński 
> ...
> ...
> Would this only affect error injection or would this be also a generic problem
> with the driver itself causing issues regardless of whether it was an error
> injection or not for this particular device?  I am asking, as there is a lot
> going on in the commit message.

This is also a generic problem.

> I wonder if simplifying this commit message so that it clearly explains what
> was broken, why, and how this patch is fixing it, would perhaps be an option?
> The backstory of how you found the issue while doing some testing and error
> injection is nice, but not sure if needed.
> 
> What do you think?

I agree to simplify the commit message. How about the following subject and
commit message?

Subject:
Use device number to check RCiEPBitmap of RCEC

Commit message:
rcec_assoc_rciep() used 'devfn', the combination of device number and
function number, to check whether the corresponding bit in the RCiEPBitmap of
the RCEC was set. According to [1], only the device number is needed to check
whether the corresponding bit in the RCiEPBitmap is set. So fix it by using
PCI_SLOT() to convert 'devfn' to a device number in rcec_assoc_rciep().

[1] PCIe r5.0, sec "7.9.10.2 Association Bitmap for RCiEPs"


Thanks!
-Qiuxu


RE: [RFC PATCH 5/9] PCI/AER: Apply function level reset to RCiEP on fatal error

2020-07-28 Thread Zhuo, Qiuxu
> From: Jonathan Cameron 
> Sent: Monday, July 27, 2020 7:17 PM
> To: Kelley, Sean V 
> Cc: bhelg...@google.com; r...@rjwysocki.net; ashok@kernel.org; Luck, Tony;
> sathyanarayanan.kuppusw...@linux.intel.com; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org; Zhuo, Qiuxu
> Subject: Re: [RFC PATCH 5/9] PCI/AER: Apply function level reset to RCiEP
> on fatal error
> 
> On Fri, 24 Jul 2020 10:22:19 -0700
> Sean V Kelley  wrote:
> 
> > From: Qiuxu Zhuo 
> >
> > Attempt to do function level reset for an RCiEP associated with an
> > RCEC device on fatal error.
> 
> I'd like to understand more on your reasoning for flr here.
> Is it simply that it is all we can do, or is there some basis in a spec
> somewhere?
> 

Yes. Although a link reset is not possible for the RCiEP here, I think we
should still be able to reset the RCiEP via FLR on a fatal error, if the RCiEP
supports FLR.
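
The decision described above can be summarised with a small user-space model.
This is only a sketch of the control flow in the quoted patch below: the enum,
the `struct reset_model_dev` fields and the `model_*` functions are
hypothetical, and the real reset operations are replaced by boolean flags
that simulate success or failure.

```c
#include <stdbool.h>

/* Simplified stand-ins for the PCI_ERS_RESULT_* values used in the patch. */
enum model_result { MODEL_NONE, MODEL_DISCONNECT, MODEL_RECOVERED };

struct reset_model_dev {
	bool is_rciep;		/* Root Complex Integrated Endpoint? */
	bool has_flr;		/* device advertises Function Level Reset */
	bool flr_fails;		/* simulate a failing FLR */
	bool link_reset_fails;	/* simulate a failing link reset */
};

/* Mirrors the shape of flr_on_rciep(): no FLR support means no reset. */
static enum model_result model_flr_on_rciep(const struct reset_model_dev *dev)
{
	if (!dev->has_flr)
		return MODEL_NONE;

	if (dev->flr_fails)
		return MODEL_DISCONNECT;

	return MODEL_RECOVERED;
}

/* On a fatal error: FLR for an RCiEP, link reset for everything else. */
static enum model_result model_fatal_reset(const struct reset_model_dev *dev)
{
	if (dev->is_rciep)
		return model_flr_on_rciep(dev);

	return dev->link_reset_fails ? MODEL_DISCONNECT : MODEL_RECOVERED;
}
```

In this model an RCiEP without FLR support still ends up with no reset at all,
which matches the "if the RCiEP supports FLR" caveat.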

-Qiuxu

> >
> > Signed-off-by: Qiuxu Zhuo 
> > ---
> >  drivers/pci/pcie/err.c | 31 ++-
> >  1 file changed, 22 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> > index 044df004f20b..9b3ec94bdf1d 100644
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -170,6 +170,17 @@ static void pci_walk_dev_affected(struct pci_dev *dev, int (*cb)(struct pci_dev
> > }
> >  }
> >
> > +static enum pci_channel_state flr_on_rciep(struct pci_dev *dev) {
> > +   if (!pcie_has_flr(dev))
> > +   return PCI_ERS_RESULT_NONE;
> > +
> > +   if (pcie_flr(dev))
> > +   return PCI_ERS_RESULT_DISCONNECT;
> > +
> > +   return PCI_ERS_RESULT_RECOVERED;
> > +}
> > +
> >  pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> > enum pci_channel_state state,
> > pci_ers_result_t (*reset_link)(struct pci_dev *pdev))
> > @@ -191,15 +202,17 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> > if (state == pci_channel_io_frozen) {
> > pci_walk_dev_affected(dev, report_frozen_detected, &status);
> > if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END) {
> > -   pci_warn(dev, "link reset not possible for RCiEP\n");
> > -   status = PCI_ERS_RESULT_NONE;
> > -   goto failed;
> > -   }
> > -
> > -   status = reset_link(dev);
> > -   if (status != PCI_ERS_RESULT_RECOVERED) {
> > -   pci_warn(dev, "link reset failed\n");
> > -   goto failed;
> > +   status = flr_on_rciep(dev);
> > +   if (status != PCI_ERS_RESULT_RECOVERED) {
> > +   pci_warn(dev, "function level reset failed\n");
> > +   goto failed;
> > +   }
> > +   } else {
> > +   status = reset_link(dev);
> > +   if (status != PCI_ERS_RESULT_RECOVERED) {
> > +   pci_warn(dev, "link reset failed\n");
> > +   goto failed;
> > +   }
> > }
> > } else {
> > pci_walk_dev_affected(dev, report_normal_detected, &status);
> 



RE: [PATCH] EDAC, pnd2: Fix ioremap() size in dnv_rd_reg() from 64K -> 32K

2019-08-09 Thread Zhuo, Qiuxu
> 
> BIOS has marked the 32K MCHBAR window as reserved, so when dnv_rd_reg()
> tries to ioremap() a 64KB region you get warnings like:
> 
> resource sanity check: requesting [mem 0xfed10000-0xfed1ffff], which spans
> more than reserved [mem 0xfed10000-0xfed17fff] caller
> dnv_rd_reg+0xc8/0x240 [pnd2_edac] mapping multiple BARs
>
> ioremap() the correct size on Denverton platforms to get rid of those
> warnings.

I have several dmesg logs from successfully loading the pnd2_edac driver on a
Denverton server, but none of them contains such a warning.
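
The size mismatch described in the quoted commit message comes down to simple
range arithmetic. The sketch below assumes a window base of 0xfed10000, which
is an assumption chosen to be consistent with the reserved range ending at
0xfed17fff; `struct mem_range`, `make_range()` and `spans_more_than()` are
hypothetical helpers, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

struct mem_range {
	uint64_t start;
	uint64_t end;		/* inclusive, as in the dmesg output */
};

static struct mem_range make_range(uint64_t base, uint64_t size)
{
	struct mem_range r = { base, base + size - 1 };

	return r;
}

/* True if req overlaps res without being fully contained in it. */
static bool spans_more_than(struct mem_range req, struct mem_range res)
{
	bool overlaps = req.start <= res.end && req.end >= res.start;
	bool contained = req.start >= res.start && req.end <= res.end;

	return overlaps && !contained;
}
```

With these assumptions a 64K request ends at 0xfed1ffff and spills past the
32K reserved window, while a 32K request fits exactly.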

-Qiuxu 


RE: [PATCH 1/3] x86/CPU: Add more Icelake model number

2019-06-06 Thread Zhuo, Qiuxu
> From: Borislav Petkov [mailto:b...@alien8.de]
> > ...
> > Dropping my SOB or adding a text "[Qiuxu: Get the macros in the Ice Lake
> > group sorted by model number.]" at the end of the commit message - which
> > one is better/clear for you?
>
> I'll add that note when applying.
> 
> Thx.

Thanks Boris!
-Qiuxu


RE: [PATCH 1/3] x86/CPU: Add more Icelake model number

2019-06-06 Thread Zhuo, Qiuxu
> From: Borislav Petkov [mailto:b...@alien8.de]
> ...
> > From: Kan Liang 
> >
> > Add the CPUID model number of Icelake (ICL) desktop and server
> > processors to the Intel family list.
> >
> > Signed-off-by: Kan Liang 
> > Signed-off-by: Qiuxu Zhuo 
> 
> You're sending this patch but it has Qiuxu's SOB too. What's that supposed to 
> mean?

Hi Boris,

During internal collaboration, I took Kan's original patch, sorted the
"#define"s in the Ice Lake group by model number (the header of the file
requires the sorting), and added my SOB. Should I drop my SOB, or add the text
"[Qiuxu: Get the macros in the Ice Lake group sorted by model number.]" at the
end of the commit message? Which one is better/clearer for you?

Thanks!
-Qiuxu




RE: [PATCH] Raise maximum number of memory controllers

2018-09-26 Thread Zhuo, Qiuxu
Hi Justin,

> [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller 
> Skylake Socket#1 IMC#1

Just curious, does the system (two memory controllers per socket) have more
than 8 sockets?
Normally, the socket number "1" in the above string "Skylake Socket#1 IMC#1"
should be 7 (i.e. 15/2), but it was 1 here.
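
The arithmetic behind the expected value can be written out as follows. This
is a sketch under the assumption stated in the message (two memory controllers
per socket, with EDAC MC indices assigned consecutively); `IMCS_PER_SOCKET`,
`mc_to_socket()` and `mc_to_imc()` are names made up for the illustration and
are not taken from skx_edac.

```c
/* Assumption from the message: two memory controllers per socket. */
#define IMCS_PER_SOCKET	2

/* Socket that a consecutively numbered EDAC MC index would belong to. */
static unsigned int mc_to_socket(unsigned int mc)
{
	return mc / IMCS_PER_SOCKET;	/* integer division: 15 / 2 == 7 */
}

/* Memory controller index within that socket. */
static unsigned int mc_to_imc(unsigned int mc)
{
	return mc % IMCS_PER_SOCKET;
}
```

Under that numbering MC15 would map to socket 7, IMC 1, which is why the
"Socket#1" in the log line looks unexpected.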

Thanks!
-Qiuxu


RE: [PATCH] EDAC, sb_edac: mark expected switch fall-through

2017-10-15 Thread Zhuo, Qiuxu
Hi Silva,

The actual intention of the code is NOT to fall through, though the current
code works correctly.
Thanks for finding this. If you don't mind, I'll submit a fix for it with a
'Reported-by:' tag crediting you.
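
For readers unfamiliar with the issue, the difference between an implicit
fall-through and an explicit break can be shown with a tiny stand-alone
example. It is not the sb_edac code: `struct model_pvt`, the case values and
both functions are hypothetical and only mimic the shape of the switch in the
quoted patch.

```c
struct model_pvt {
	int ta;
	int ras;
};

/* Without a break, case 0 continues into case 1 and also assigns ras. */
static void bind_fallthrough(struct model_pvt *pvt, int id, int dev)
{
	switch (id) {
	case 0:
		pvt->ta = dev;
		/* fall through */
	case 1:
		pvt->ras = dev;
		break;
	}
}

/* With a break, each case assigns only its own field. */
static void bind_with_break(struct model_pvt *pvt, int id, int dev)
{
	switch (id) {
	case 0:
		pvt->ta = dev;
		break;
	case 1:
		pvt->ras = dev;
		break;
	}
}
```

In the first variant a device matched by case 0 is stored in both fields; in
the second it is stored only in `ta`.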

Thanks!
- Qiuxu

> From: linux-edac-ow...@vger.kernel.org [mailto:linux-edac-
> ow...@vger.kernel.org] On Behalf Of Gustavo A. R. Silva
> Sent: Saturday, October 14, 2017 4:28 AM
> To: Mauro Carvalho Chehab ; Borislav Petkov
> 
> Cc: linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org; Gustavo A. R.
> Silva 
> Subject: [PATCH] EDAC, sb_edac: mark expected switch fall-through
> 
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we
> are expecting to fall through.
> 
> Signed-off-by: Gustavo A. R. Silva 
> ---
> This code was tested by compilation only (GCC 7.2.0 was used).
> Please, verify if the actual intention of the code is to fall through.
> 
>  drivers/edac/sb_edac.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
> index 72b98a0..b50d714 100644
> --- a/drivers/edac/sb_edac.c
> +++ b/drivers/edac/sb_edac.c
> @@ -2485,6 +2485,7 @@ static int ibridge_mci_bind_devs(struct mem_ctl_info *mci,
>   case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA:
>   case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA:
>   pvt->pci_ta = pdev;
> + /* fall through */
>   case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_RAS:
>   case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS:


RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

2017-06-09 Thread Zhuo, Qiuxu
> From: Borislav Petkov [mailto:b...@alien8.de] 
>
> Xiaolong,
>
> can you please run Qiuxu's patch to verify it fixes your issue?


Hi Boris,
I manually verified the fix on the Broadwell-DE server on which Xiaolong
found the bug: sb_edac loads successfully and correctly identifies the size
and type of the DIMMs installed in each slot (see attached
dmesg.sb_edac.on.Broadwell-DE.log).

Hi Xiaolong,
Would you please also test the patch with the LKP method that found the issue
last time?

Thanks!

BR
Qiuxu 





dmesg.sb_edac.on.Broadwell-DE.log
Description: dmesg.sb_edac.on.Broadwell-DE.log


RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

2017-06-08 Thread Zhuo, Qiuxu
Hi Xiaolong,

This issue is fixed by the patch "EDAC, sb_edac: Avoid creating 'SOCK' EDAC
memory controller" (you are CCed on it via the 'Reported-by' tag).
Thanks for this test case :-)

BR
qiuxu

-----Original Message-----
From: Zhuo, Qiuxu 
Sent: Monday, June 5, 2017 9:08 PM
To: Ye, Xiaolong <xiaolong...@intel.com>
Cc: Borislav Petkov <b...@suse.de>; linux-edac <linux-e...@vger.kernel.org>; 
LKML <linux-kernel@vger.kernel.org>; Stephen Rothwell <s...@canb.auug.org.au>; 
l...@01.org
Subject: RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: 
kmsg.EDAC_sbridge:Failed_to_register_device_with_error

Hi Xiaolong,

   Thanks!  I'll look into it and give feedback ASAP.

BR
qiuxu

-----Original Message-----
From: lkp-robot-requ...@eclists.intel.com 
[mailto:lkp-robot-requ...@eclists.intel.com] On Behalf Of Ye, Xiaolong
Sent: Monday, June 5, 2017 2:23 PM
To: Zhuo, Qiuxu <qiuxu.z...@intel.com>
Cc: Borislav Petkov <b...@suse.de>; linux-edac <linux-e...@vger.kernel.org>; 
LKML <linux-kernel@vger.kernel.org>; Stephen Rothwell <s...@canb.auug.org.au>; 
l...@01.org
Subject: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: 
kmsg.EDAC_sbridge:Failed_to_register_device_with_error


FYI, we noticed the following commit:

commit: e2f747b1f42a2f6b0cf5416be1684c1b94a42f0f ("EDAC, sb_edac: Assign EDAC 
memory controller per h/w controller") 
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: netperf
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 25%
cluster: cs-localhost
test: TCP_CRR
cpufreq_governor: performance

test-description: Netperf is a benchmark that can be used to measure various 
aspects of networking performance.
test-url: http://www.netperf.org/netperf/


on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


kern  :err   : [   17.309988] EDAC sbridge: Some needed devices are missing
kern  :info  : [   17.504765] ast 0000:07:00.0: fb0: astdrmfb frame buffer 
device
kern  :info  : [   17.533508] EDAC MC: Removed device 0 for sbridge_edac.c 
Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0
kern  :err   : [   17.533529] EDAC sbridge: Couldn't find mci handler
kern  :err   : [   17.533530] EDAC sbridge: Failed to register device with 
error -19.
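
As a side note on the last log line: kernel-style return codes are negated
errno values, and on Linux errno 19 is ENODEV. The helpers below are a hedged
user-space sketch for decoding such a value; `is_enodev()` and
`describe_ret()` are hypothetical names, and the text returned by strerror()
depends on the C library (typically "No such device" on Linux).

```c
#include <errno.h>
#include <string.h>

/* True if a kernel-style (negative errno) return code means ENODEV. */
static int is_enodev(int ret)
{
	return ret == -ENODEV;
}

/* Human-readable text for a kernel-style return code such as -19. */
static const char *describe_ret(int ret)
{
	return strerror(-ret);
}
```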




To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml  # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Xiaolong


RE: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

2017-06-05 Thread Zhuo, Qiuxu
Hi Xiaolong,

   Thanks!  I'll look into it and give feedback ASAP.

BR
qiuxu

-----Original Message-----
From: lkp-robot-requ...@eclists.intel.com 
[mailto:lkp-robot-requ...@eclists.intel.com] On Behalf Of Ye, Xiaolong
Sent: Monday, June 5, 2017 2:23 PM
To: Zhuo, Qiuxu <qiuxu.z...@intel.com>
Cc: Borislav Petkov <b...@suse.de>; linux-edac <linux-e...@vger.kernel.org>; 
LKML <linux-kernel@vger.kernel.org>; Stephen Rothwell <s...@canb.auug.org.au>; 
l...@01.org
Subject: [lkp-robot] [EDAC, sb_edac] e2f747b1f4: 
kmsg.EDAC_sbridge:Failed_to_register_device_with_error


FYI, we noticed the following commit:

commit: e2f747b1f42a2f6b0cf5416be1684c1b94a42f0f ("EDAC, sb_edac: Assign EDAC 
memory controller per h/w controller") 
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: netperf
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 25%
cluster: cs-localhost
test: TCP_CRR
cpufreq_governor: performance

test-description: Netperf is a benchmark that can be used to measure various 
aspects of networking performance.
test-url: http://www.netperf.org/netperf/


on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


kern  :err   : [   17.309988] EDAC sbridge: Some needed devices are missing
kern  :info  : [   17.504765] ast 0000:07:00.0: fb0: astdrmfb frame buffer 
device
kern  :info  : [   17.533508] EDAC MC: Removed device 0 for sbridge_edac.c 
Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0
kern  :err   : [   17.533529] EDAC sbridge: Couldn't find mci handler
kern  :err   : [   17.533530] EDAC sbridge: Failed to register device with 
error -19.




To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml  # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Xiaolong


RE: [tip:x86/urgent] x86/entry: Restore traditional SYSENTER calling convention

2016-01-04 Thread Zhuo, Qiuxu
Hi Linus and Andy,

  We tested on our side with v4.4-rc8 + Andy's vDSO v2 patches + Android M 
(bionic libc using sysenter) ==> Device can boot up successfully

  Other tests were:
  - Android L (bionic libc using int80) + v4.4-rc8 ==> Device can boot up 
successfully
  - Android L (bionic libc using int80) + v4.4-rc8 + Andy's v2 patches ==> 
Device can boot up successfully
  - Android M (bionic libc using sysenter) + v4.4-rc8 ==> Device can NOT boot 
up successfully
  - Android M (bionic libc using sysenter) + v4.4-rc8 + Andy's v2 patches ==> 
Device can boot up successfully

   Thanks!
BR
qiuxu

-----Original Message-----
From: linus...@gmail.com [mailto:linus...@gmail.com] On Behalf Of Linus Torvalds
Sent: Tuesday, January 5, 2016 3:28 AM
To: H. Peter Anvin
Cc: Andy Lutomirski; Shi, Mingwei; Fu, Borun; Gross, Mark; Andrew Lutomirski; 
Su, Tao; Borislav Petkov; Ingo Molnar; Brian Gerst; 
linux-kernel@vger.kernel.org; Zhuo, Qiuxu; Thomas Gleixner; Denys Vlasenko; 
Wang, Frank; linux-tip-comm...@vger.kernel.org
Subject: Re: [tip:x86/urgent] x86/entry: Restore traditional SYSENTER calling 
convention

On Mon, Jan 4, 2016 at 10:48 AM, H. Peter Anvin  wrote:
>
> Linus has frequently stated that if it is something that is critical 
> enough for stable, it is critical enough for final.  Linus will decide 
> if an additional -rc is needed for that reason.

So it would have been good to have it in an -rc, but at the same time I'm not 
particularly worried about this one.

It's not like it's complicated, and I'm assuming it got tested and passed all 
our current test-cases (which are much more complete than anything we've ever 
had historically).

 Linus

