Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi Huacai, On Sat, Jul 21, 2018 at 09:35:59AM +0800, 陈华才 wrote: > SFB can improve the memory bandwidth as much as 30%, and we are > planning to enable SFB by default. So, we want to control cpu_relax() > under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT. OK, applied to mips-next for 4.19 with changes to the commit message & comment. Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi Huacai, On Sat, Jul 21, 2018 at 09:35:59AM +0800, 陈华才 wrote: > SFB can improve the memory bandwidth as much as 30%, and we are > planning to enable SFB by default. So, we want to control cpu_relax() > under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT. OK, applied to mips-next for 4.19 with changes to the commit message & comment. Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi, Paul, SFB can improve the memory bandwidth as much as 30%, and we are planning to enable SFB by default. So, we want to control cpu_relax() under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT. Huacai -- Original -- From: "Paul Burton"; Date: Fri, Jul 20, 2018 05:15 AM To: "Huacai Chen"; Cc: "Ralf Baechle"; "James Hogan"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Alan Stern"; "Andrea Parri"; "Will Deacon"; "Peter Zijlstra"; "Boqun Feng"; "Nicholas Piggin"; "David Howells"; "Jade Alglave"; "Luc Maranget"; "Paul E. McKenney"; "Akira Yokosawa"; "LKML"; Subject: Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3 Hi Huacai, On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote: > >> diff --git a/arch/mips/include/asm/processor.h > >> b/arch/mips/include/asm/processor.h > >> index af34afb..a8c4a3a 100644 > >> --- a/arch/mips/include/asm/processor.h > >> +++ b/arch/mips/include/asm/processor.h > >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > >> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > >> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > >> > >> +#ifdef CONFIG_CPU_LOONGSON3 > >> +/* > >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > >> read > >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, > >> force > >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending > >> writes will > >> + * become available as expected. > >> + */ > > > > I think "may starve writes" or "may queue writes indefinitely" would be > > clearer than "may get starved". > > Need I change the comment and resend? Or you change the comment and get > merged? I'm happy to fix up the comment - but have a couple more questions. Looking into the history, would it be fair to say that this is only a problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y, which adds code to enable the SFB? If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select the use of smp_mb()? How much does performance gain does enabling the SFB give you? Would it be reasonable to just disable it, rather than using this workaround? Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi, Paul, SFB can improve the memory bandwidth as much as 30%, and we are planning to enable SFB by default. So, we want to control cpu_relax() under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT. Huacai -- Original -- From: "Paul Burton"; Date: Fri, Jul 20, 2018 05:15 AM To: "Huacai Chen"; Cc: "Ralf Baechle"; "James Hogan"; "linux-mips"; "Fuxin Zhang"; "wuzhangjin"; "stable"; "Alan Stern"; "Andrea Parri"; "Will Deacon"; "Peter Zijlstra"; "Boqun Feng"; "Nicholas Piggin"; "David Howells"; "Jade Alglave"; "Luc Maranget"; "Paul E. McKenney"; "Akira Yokosawa"; "LKML"; Subject: Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3 Hi Huacai, On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote: > >> diff --git a/arch/mips/include/asm/processor.h > >> b/arch/mips/include/asm/processor.h > >> index af34afb..a8c4a3a 100644 > >> --- a/arch/mips/include/asm/processor.h > >> +++ b/arch/mips/include/asm/processor.h > >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > >> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > >> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > >> > >> +#ifdef CONFIG_CPU_LOONGSON3 > >> +/* > >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > >> read > >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, > >> force > >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending > >> writes will > >> + * become available as expected. > >> + */ > > > > I think "may starve writes" or "may queue writes indefinitely" would be > > clearer than "may get starved". > > Need I change the comment and resend? Or you change the comment and get > merged? I'm happy to fix up the comment - but have a couple more questions. Looking into the history, would it be fair to say that this is only a problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y, which adds code to enable the SFB? If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select the use of smp_mb()? How much does performance gain does enabling the SFB give you? Would it be reasonable to just disable it, rather than using this workaround? Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi Huacai, On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote: > >> diff --git a/arch/mips/include/asm/processor.h > >> b/arch/mips/include/asm/processor.h > >> index af34afb..a8c4a3a 100644 > >> --- a/arch/mips/include/asm/processor.h > >> +++ b/arch/mips/include/asm/processor.h > >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > >> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > >> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > >> > >> +#ifdef CONFIG_CPU_LOONGSON3 > >> +/* > >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > >> read > >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, > >> force > >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending > >> writes will > >> + * become available as expected. > >> + */ > > > > I think "may starve writes" or "may queue writes indefinitely" would be > > clearer than "may get starved". > > Need I change the comment and resend? Or you change the comment and get > merged? I'm happy to fix up the comment - but have a couple more questions. Looking into the history, would it be fair to say that this is only a problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y, which adds code to enable the SFB? If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select the use of smp_mb()? How much does performance gain does enabling the SFB give you? Would it be reasonable to just disable it, rather than using this workaround? Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi Huacai, On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote: > >> diff --git a/arch/mips/include/asm/processor.h > >> b/arch/mips/include/asm/processor.h > >> index af34afb..a8c4a3a 100644 > >> --- a/arch/mips/include/asm/processor.h > >> +++ b/arch/mips/include/asm/processor.h > >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > >> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > >> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > >> > >> +#ifdef CONFIG_CPU_LOONGSON3 > >> +/* > >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > >> read > >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, > >> force > >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending > >> writes will > >> + * become available as expected. > >> + */ > > > > I think "may starve writes" or "may queue writes indefinitely" would be > > clearer than "may get starved". > > Need I change the comment and resend? Or you change the comment and get > merged? I'm happy to fix up the comment - but have a couple more questions. Looking into the history, would it be fair to say that this is only a problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y, which adds code to enable the SFB? If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select the use of smp_mb()? How much does performance gain does enabling the SFB give you? Would it be reasonable to just disable it, rather than using this workaround? Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
On Wed, Jul 18, 2018 at 1:52 AM, Paul Burton wrote: > Hi Huacai, > > On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: >> Linux expects that if a CPU modifies a memory location, then that >> modification will eventually become visible to other CPUs in the system. >> >> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be >> prioritised over stores so it is possible for a store operation to be >> postponed if a polling loop immediately follows it. If the variable >> being polled indirectly depends on the outstanding store [for example, >> another CPU may be polling the variable that is pending modification] >> then there is the potential for deadlock if interrupts are disabled. >> This deadlock occurs in qspinlock code. >> >> This patch changes the definition of cpu_relax() to smp_mb() for >> Loongson-3, forcing a flushing of the SFB on SMP systems before the >> next load takes place. If the Kernel is not compiled for SMP support, >> this will expand to a barrier() as before. >> >> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() >> for ARM11MPCore) >> Cc: sta...@vger.kernel.org >> Signed-off-by: Huacai Chen >> --- >> arch/mips/include/asm/processor.h | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/arch/mips/include/asm/processor.h >> b/arch/mips/include/asm/processor.h >> index af34afb..a8c4a3a 100644 >> --- a/arch/mips/include/asm/processor.h >> +++ b/arch/mips/include/asm/processor.h >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); >> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) >> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) >> >> +#ifdef CONFIG_CPU_LOONGSON3 >> +/* >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, >> force >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes >> will >> + * become available as expected. >> + */ > > I think "may starve writes" or "may queue writes indefinitely" would be > clearer than "may get starved". Need I change the comment and resend? Or you change the comment and get merged? Huacai > >> +#define cpu_relax() smp_mb() >> +#else >> #define cpu_relax() barrier() >> +#endif >> >> /* >> * Return_address is a replacement for __builtin_return_address(count) >> -- >> 2.7.0 > > Apart from the comment above though this looks better to me. > > Re-copying the LKMM maintainers - are you happy(ish) with this? > > Thanks, > Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
On Wed, Jul 18, 2018 at 1:52 AM, Paul Burton wrote: > Hi Huacai, > > On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: >> Linux expects that if a CPU modifies a memory location, then that >> modification will eventually become visible to other CPUs in the system. >> >> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be >> prioritised over stores so it is possible for a store operation to be >> postponed if a polling loop immediately follows it. If the variable >> being polled indirectly depends on the outstanding store [for example, >> another CPU may be polling the variable that is pending modification] >> then there is the potential for deadlock if interrupts are disabled. >> This deadlock occurs in qspinlock code. >> >> This patch changes the definition of cpu_relax() to smp_mb() for >> Loongson-3, forcing a flushing of the SFB on SMP systems before the >> next load takes place. If the Kernel is not compiled for SMP support, >> this will expand to a barrier() as before. >> >> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() >> for ARM11MPCore) >> Cc: sta...@vger.kernel.org >> Signed-off-by: Huacai Chen >> --- >> arch/mips/include/asm/processor.h | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/arch/mips/include/asm/processor.h >> b/arch/mips/include/asm/processor.h >> index af34afb..a8c4a3a 100644 >> --- a/arch/mips/include/asm/processor.h >> +++ b/arch/mips/include/asm/processor.h >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); >> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) >> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) >> >> +#ifdef CONFIG_CPU_LOONGSON3 >> +/* >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, >> force >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes >> will >> + * become available as expected. >> + */ > > I think "may starve writes" or "may queue writes indefinitely" would be > clearer than "may get starved". Need I change the comment and resend? Or you change the comment and get merged? Huacai > >> +#define cpu_relax() smp_mb() >> +#else >> #define cpu_relax() barrier() >> +#endif >> >> /* >> * Return_address is a replacement for __builtin_return_address(count) >> -- >> 2.7.0 > > Apart from the comment above though this looks better to me. > > Re-copying the LKMM maintainers - are you happy(ish) with this? > > Thanks, > Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote: > On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: > > Linux expects that if a CPU modifies a memory location, then that > > modification will eventually become visible to other CPUs in the system. > > > > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be > > prioritised over stores so it is possible for a store operation to be > > postponed if a polling loop immediately follows it. If the variable > > being polled indirectly depends on the outstanding store [for example, > > another CPU may be polling the variable that is pending modification] > > then there is the potential for deadlock if interrupts are disabled. > > This deadlock occurs in qspinlock code. > > > > This patch changes the definition of cpu_relax() to smp_mb() for > > Loongson-3, forcing a flushing of the SFB on SMP systems before the > > next load takes place. If the Kernel is not compiled for SMP support, > > this will expand to a barrier() as before. > > > > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() > > for ARM11MPCore) > > Cc: sta...@vger.kernel.org > > Signed-off-by: Huacai Chen > > --- > > arch/mips/include/asm/processor.h | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/arch/mips/include/asm/processor.h > > b/arch/mips/include/asm/processor.h > > index af34afb..a8c4a3a 100644 > > --- a/arch/mips/include/asm/processor.h > > +++ b/arch/mips/include/asm/processor.h > > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > > #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > > > > +#ifdef CONFIG_CPU_LOONGSON3 > > +/* > > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > > read > > + * loop. Since spin loops of any kind should have a cpu_relax() in them, > > force > > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes > > will > > + * become available as expected. > > + */ > > I think "may starve writes" or "may queue writes indefinitely" would be > clearer than "may get starved". Agreed. > > +#define cpu_relax()smp_mb() > > +#else > > #define cpu_relax()barrier() > > +#endif > > > > /* > > * Return_address is a replacement for __builtin_return_address(count) > > Apart from the comment above though this looks better to me. > > Re-copying the LKMM maintainers - are you happy(ish) with this? Right, thanks for adding us back on :-) Yes, this is much better, although I myself would also prefer explicit mention that this is a work-around for a hardware bug. But aside from the actual comment bike-shedding, this looks entirely acceptible (also because ARM is already doing this -- and the Changelog might want to refer to that particular patch).
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote: > On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: > > Linux expects that if a CPU modifies a memory location, then that > > modification will eventually become visible to other CPUs in the system. > > > > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be > > prioritised over stores so it is possible for a store operation to be > > postponed if a polling loop immediately follows it. If the variable > > being polled indirectly depends on the outstanding store [for example, > > another CPU may be polling the variable that is pending modification] > > then there is the potential for deadlock if interrupts are disabled. > > This deadlock occurs in qspinlock code. > > > > This patch changes the definition of cpu_relax() to smp_mb() for > > Loongson-3, forcing a flushing of the SFB on SMP systems before the > > next load takes place. If the Kernel is not compiled for SMP support, > > this will expand to a barrier() as before. > > > > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() > > for ARM11MPCore) > > Cc: sta...@vger.kernel.org > > Signed-off-by: Huacai Chen > > --- > > arch/mips/include/asm/processor.h | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/arch/mips/include/asm/processor.h > > b/arch/mips/include/asm/processor.h > > index af34afb..a8c4a3a 100644 > > --- a/arch/mips/include/asm/processor.h > > +++ b/arch/mips/include/asm/processor.h > > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > > #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > > > > +#ifdef CONFIG_CPU_LOONGSON3 > > +/* > > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > > read > > + * loop. Since spin loops of any kind should have a cpu_relax() in them, > > force > > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes > > will > > + * become available as expected. > > + */ > > I think "may starve writes" or "may queue writes indefinitely" would be > clearer than "may get starved". Agreed. > > +#define cpu_relax()smp_mb() > > +#else > > #define cpu_relax()barrier() > > +#endif > > > > /* > > * Return_address is a replacement for __builtin_return_address(count) > > Apart from the comment above though this looks better to me. > > Re-copying the LKMM maintainers - are you happy(ish) with this? Right, thanks for adding us back on :-) Yes, this is much better, although I myself would also prefer explicit mention that this is a work-around for a hardware bug. But aside from the actual comment bike-shedding, this looks entirely acceptible (also because ARM is already doing this -- and the Changelog might want to refer to that particular patch).
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote: > Hi Huacai, > > On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: > > Linux expects that if a CPU modifies a memory location, then that > > modification will eventually become visible to other CPUs in the system. > > > > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be > > prioritised over stores so it is possible for a store operation to be > > postponed if a polling loop immediately follows it. If the variable > > being polled indirectly depends on the outstanding store [for example, > > another CPU may be polling the variable that is pending modification] > > then there is the potential for deadlock if interrupts are disabled. > > This deadlock occurs in qspinlock code. > > > > This patch changes the definition of cpu_relax() to smp_mb() for > > Loongson-3, forcing a flushing of the SFB on SMP systems before the > > next load takes place. If the Kernel is not compiled for SMP support, > > this will expand to a barrier() as before. > > > > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() > > for ARM11MPCore) > > Cc: sta...@vger.kernel.org > > Signed-off-by: Huacai Chen > > --- > > arch/mips/include/asm/processor.h | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/arch/mips/include/asm/processor.h > > b/arch/mips/include/asm/processor.h > > index af34afb..a8c4a3a 100644 > > --- a/arch/mips/include/asm/processor.h > > +++ b/arch/mips/include/asm/processor.h > > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > > #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > > > > +#ifdef CONFIG_CPU_LOONGSON3 > > +/* > > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > > read > > + * loop. Since spin loops of any kind should have a cpu_relax() in them, > > force > > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes > > will > > + * become available as expected. > > + */ > > I think "may starve writes" or "may queue writes indefinitely" would be > clearer than "may get starved". > > > +#define cpu_relax()smp_mb() > > +#else > > #define cpu_relax()barrier() > > +#endif > > > > /* > > * Return_address is a replacement for __builtin_return_address(count) > > -- > > 2.7.0 > > Apart from the comment above though this looks better to me. > > Re-copying the LKMM maintainers - are you happy(ish) with this? This looks much better to me. Thanx, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote: > Hi Huacai, > > On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: > > Linux expects that if a CPU modifies a memory location, then that > > modification will eventually become visible to other CPUs in the system. > > > > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be > > prioritised over stores so it is possible for a store operation to be > > postponed if a polling loop immediately follows it. If the variable > > being polled indirectly depends on the outstanding store [for example, > > another CPU may be polling the variable that is pending modification] > > then there is the potential for deadlock if interrupts are disabled. > > This deadlock occurs in qspinlock code. > > > > This patch changes the definition of cpu_relax() to smp_mb() for > > Loongson-3, forcing a flushing of the SFB on SMP systems before the > > next load takes place. If the Kernel is not compiled for SMP support, > > this will expand to a barrier() as before. > > > > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() > > for ARM11MPCore) > > Cc: sta...@vger.kernel.org > > Signed-off-by: Huacai Chen > > --- > > arch/mips/include/asm/processor.h | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/arch/mips/include/asm/processor.h > > b/arch/mips/include/asm/processor.h > > index af34afb..a8c4a3a 100644 > > --- a/arch/mips/include/asm/processor.h > > +++ b/arch/mips/include/asm/processor.h > > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > > #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > > > > +#ifdef CONFIG_CPU_LOONGSON3 > > +/* > > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a > > read > > + * loop. Since spin loops of any kind should have a cpu_relax() in them, > > force > > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes > > will > > + * become available as expected. > > + */ > > I think "may starve writes" or "may queue writes indefinitely" would be > clearer than "may get starved". > > > +#define cpu_relax()smp_mb() > > +#else > > #define cpu_relax()barrier() > > +#endif > > > > /* > > * Return_address is a replacement for __builtin_return_address(count) > > -- > > 2.7.0 > > Apart from the comment above though this looks better to me. > > Re-copying the LKMM maintainers - are you happy(ish) with this? This looks much better to me. Thanx, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi Huacai, On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: > Linux expects that if a CPU modifies a memory location, then that > modification will eventually become visible to other CPUs in the system. > > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be > prioritised over stores so it is possible for a store operation to be > postponed if a polling loop immediately follows it. If the variable > being polled indirectly depends on the outstanding store [for example, > another CPU may be polling the variable that is pending modification] > then there is the potential for deadlock if interrupts are disabled. > This deadlock occurs in qspinlock code. > > This patch changes the definition of cpu_relax() to smp_mb() for > Loongson-3, forcing a flushing of the SFB on SMP systems before the > next load takes place. If the Kernel is not compiled for SMP support, > this will expand to a barrier() as before. > > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() > for ARM11MPCore) > Cc: sta...@vger.kernel.org > Signed-off-by: Huacai Chen > --- > arch/mips/include/asm/processor.h | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/mips/include/asm/processor.h > b/arch/mips/include/asm/processor.h > index af34afb..a8c4a3a 100644 > --- a/arch/mips/include/asm/processor.h > +++ b/arch/mips/include/asm/processor.h > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > > +#ifdef CONFIG_CPU_LOONGSON3 > +/* > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read > + * loop. Since spin loops of any kind should have a cpu_relax() in them, > force > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes > will > + * become available as expected. > + */ I think "may starve writes" or "may queue writes indefinitely" would be clearer than "may get starved". > +#define cpu_relax() smp_mb() > +#else > #define cpu_relax() barrier() > +#endif > > /* > * Return_address is a replacement for __builtin_return_address(count) > -- > 2.7.0 Apart from the comment above though this looks better to me. Re-copying the LKMM maintainers - are you happy(ish) with this? Thanks, Paul
Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3
Hi Huacai, On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote: > Linux expects that if a CPU modifies a memory location, then that > modification will eventually become visible to other CPUs in the system. > > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be > prioritised over stores so it is possible for a store operation to be > postponed if a polling loop immediately follows it. If the variable > being polled indirectly depends on the outstanding store [for example, > another CPU may be polling the variable that is pending modification] > then there is the potential for deadlock if interrupts are disabled. > This deadlock occurs in qspinlock code. > > This patch changes the definition of cpu_relax() to smp_mb() for > Loongson-3, forcing a flushing of the SFB on SMP systems before the > next load takes place. If the Kernel is not compiled for SMP support, > this will expand to a barrier() as before. > > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() > for ARM11MPCore) > Cc: sta...@vger.kernel.org > Signed-off-by: Huacai Chen > --- > arch/mips/include/asm/processor.h | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/mips/include/asm/processor.h > b/arch/mips/include/asm/processor.h > index af34afb..a8c4a3a 100644 > --- a/arch/mips/include/asm/processor.h > +++ b/arch/mips/include/asm/processor.h > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p); > #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29]) > #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status) > > +#ifdef CONFIG_CPU_LOONGSON3 > +/* > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read > + * loop. Since spin loops of any kind should have a cpu_relax() in them, > force > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes > will > + * become available as expected. > + */ I think "may starve writes" or "may queue writes indefinitely" would be clearer than "may get starved". > +#define cpu_relax() smp_mb() > +#else > #define cpu_relax() barrier() > +#endif > > /* > * Return_address is a replacement for __builtin_return_address(count) > -- > 2.7.0 Apart from the comment above though this looks better to me. Re-copying the LKMM maintainers - are you happy(ish) with this? Thanks, Paul