Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-23 Thread Paul Burton
Hi Huacai,

On Sat, Jul 21, 2018 at 09:35:59AM +0800, 陈华才 wrote:
> SFB can improve the memory bandwidth as much as 30%, and we are
> planning to enable SFB by default. So, we want to control cpu_relax()
> under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

OK, applied to mips-next for 4.19 with changes to the commit message &
comment.

Thanks,
Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-23 Thread Paul Burton
Hi Huacai,

On Sat, Jul 21, 2018 at 09:35:59AM +0800, 陈华才 wrote:
> SFB can improve the memory bandwidth as much as 30%, and we are
> planning to enable SFB by default. So, we want to control cpu_relax()
> under CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

OK, applied to mips-next for 4.19 with changes to the commit message &
comment.

Thanks,
Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-20 Thread 陈华才
Hi, Paul,
 
SFB can improve the memory bandwidth as much as 30%, and we are planning to 
enable SFB by default. So, we want to control cpu_relax() under 
CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

Huacai
 
-- Original --
From:  "Paul Burton";
Date:  Fri, Jul 20, 2018 05:15 AM
To:  "Huacai Chen"; 
Cc:  "Ralf Baechle"; "James Hogan"; 
"linux-mips"; "Fuxin Zhang"; 
"wuzhangjin"; "stable"; "Alan 
Stern"; "Andrea 
Parri"; "Will Deacon"; 
"Peter Zijlstra"; "Boqun Feng"; 
"Nicholas Piggin"; "David Howells"; 
"Jade Alglave"; "Luc Maranget"; 
"Paul E. McKenney"; "Akira 
Yokosawa"; "LKML"; 
Subject:  Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

 
Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h 
> >> b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> >> read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> >> force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending 
> >> writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Need I change the comment and resend? Or you change the comment and get 
> merged?

I'm happy to fix up the comment - but have a couple more questions.

Looking into the history, would it be fair to say that this is only a
problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce
CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y,
which adds code to enable the SFB?

If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select
the use of smp_mb()?

How much does performance gain does enabling the SFB give you? Would it
be reasonable to just disable it, rather than using this workaround?

Thanks,
Paul

Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-20 Thread 陈华才
Hi, Paul,
 
SFB can improve the memory bandwidth as much as 30%, and we are planning to 
enable SFB by default. So, we want to control cpu_relax() under 
CONFIG_CPU_LOONGSON3, not under CONFIG_LOONGSON3_ENHANCEMENT.

Huacai
 
-- Original --
From:  "Paul Burton";
Date:  Fri, Jul 20, 2018 05:15 AM
To:  "Huacai Chen"; 
Cc:  "Ralf Baechle"; "James Hogan"; 
"linux-mips"; "Fuxin Zhang"; 
"wuzhangjin"; "stable"; "Alan 
Stern"; "Andrea 
Parri"; "Will Deacon"; 
"Peter Zijlstra"; "Boqun Feng"; 
"Nicholas Piggin"; "David Howells"; 
"Jade Alglave"; "Luc Maranget"; 
"Paul E. McKenney"; "Akira 
Yokosawa"; "LKML"; 
Subject:  Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

 
Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h 
> >> b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> >> read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> >> force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending 
> >> writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Need I change the comment and resend? Or you change the comment and get 
> merged?

I'm happy to fix up the comment - but have a couple more questions.

Looking into the history, would it be fair to say that this is only a
problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce
CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y,
which adds code to enable the SFB?

If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select
the use of smp_mb()?

How much does performance gain does enabling the SFB give you? Would it
be reasonable to just disable it, rather than using this workaround?

Thanks,
Paul

Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-19 Thread Paul Burton
Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h 
> >> b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> >> read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> >> force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending 
> >> writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Need I change the comment and resend? Or you change the comment and get 
> merged?

I'm happy to fix up the comment - but have a couple more questions.

Looking into the history, would it be fair to say that this is only a
problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce
CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y,
which adds code to enable the SFB?

If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select
the use of smp_mb()?

How much does performance gain does enabling the SFB give you? Would it
be reasonable to just disable it, rather than using this workaround?

Thanks,
Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-19 Thread Paul Burton
Hi Huacai,

On Wed, Jul 18, 2018 at 09:15:46AM +0800, Huacai Chen wrote:
> >> diff --git a/arch/mips/include/asm/processor.h 
> >> b/arch/mips/include/asm/processor.h
> >> index af34afb..a8c4a3a 100644
> >> --- a/arch/mips/include/asm/processor.h
> >> +++ b/arch/mips/include/asm/processor.h
> >> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >>
> >> +#ifdef CONFIG_CPU_LOONGSON3
> >> +/*
> >> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> >> read
> >> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> >> force
> >> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending 
> >> writes will
> >> + * become available as expected.
> >> + */
> >
> > I think "may starve writes" or "may queue writes indefinitely" would be
> > clearer than "may get starved".
>
> Need I change the comment and resend? Or you change the comment and get 
> merged?

I'm happy to fix up the comment - but have a couple more questions.

Looking into the history, would it be fair to say that this is only a
problem after commit 1e820da3c9af ("MIPS: Loongson-3: Introduce
CONFIG_LOONGSON3_ENHANCEMENT") when CONFIG_LOONGSON3_ENHANCEMENT=y,
which adds code to enable the SFB?

If so would it make sense to use CONFIG_LOONGSON3_ENHANCEMENT to select
the use of smp_mb()?

How much does performance gain does enabling the SFB give you? Would it
be reasonable to just disable it, rather than using this workaround?

Thanks,
Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Huacai Chen
On Wed, Jul 18, 2018 at 1:52 AM, Paul Burton  wrote:
> Hi Huacai,
>
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
>> Linux expects that if a CPU modifies a memory location, then that
>> modification will eventually become visible to other CPUs in the system.
>>
>> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
>> prioritised over stores so it is possible for a store operation to be
>> postponed if a polling loop immediately follows it. If the variable
>> being polled indirectly depends on the outstanding store [for example,
>> another CPU may be polling the variable that is pending modification]
>> then there is the potential for deadlock if interrupts are disabled.
>> This deadlock occurs in qspinlock code.
>>
>> This patch changes the definition of cpu_relax() to smp_mb() for
>> Loongson-3, forcing a flushing of the SFB on SMP systems before the
>> next load takes place. If the Kernel is not compiled for SMP support,
>> this will expand to a barrier() as before.
>>
>> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
>> for ARM11MPCore)
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Huacai Chen 
>> ---
>>  arch/mips/include/asm/processor.h | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/mips/include/asm/processor.h 
>> b/arch/mips/include/asm/processor.h
>> index af34afb..a8c4a3a 100644
>> --- a/arch/mips/include/asm/processor.h
>> +++ b/arch/mips/include/asm/processor.h
>> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
>>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
>>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>>
>> +#ifdef CONFIG_CPU_LOONGSON3
>> +/*
>> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
>> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
>> force
>> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
>> will
>> + * become available as expected.
>> + */
>
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".
Need I change the comment and resend? Or you change the comment and get merged?

Huacai

>
>> +#define cpu_relax()  smp_mb()
>> +#else
>>  #define cpu_relax()  barrier()
>> +#endif
>>
>>  /*
>>   * Return_address is a replacement for __builtin_return_address(count)
>> --
>> 2.7.0
>
> Apart from the comment above though this looks better to me.
>
> Re-copying the LKMM maintainers - are you happy(ish) with this?
>
> Thanks,
> Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Huacai Chen
On Wed, Jul 18, 2018 at 1:52 AM, Paul Burton  wrote:
> Hi Huacai,
>
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
>> Linux expects that if a CPU modifies a memory location, then that
>> modification will eventually become visible to other CPUs in the system.
>>
>> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
>> prioritised over stores so it is possible for a store operation to be
>> postponed if a polling loop immediately follows it. If the variable
>> being polled indirectly depends on the outstanding store [for example,
>> another CPU may be polling the variable that is pending modification]
>> then there is the potential for deadlock if interrupts are disabled.
>> This deadlock occurs in qspinlock code.
>>
>> This patch changes the definition of cpu_relax() to smp_mb() for
>> Loongson-3, forcing a flushing of the SFB on SMP systems before the
>> next load takes place. If the Kernel is not compiled for SMP support,
>> this will expand to a barrier() as before.
>>
>> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
>> for ARM11MPCore)
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Huacai Chen 
>> ---
>>  arch/mips/include/asm/processor.h | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/mips/include/asm/processor.h 
>> b/arch/mips/include/asm/processor.h
>> index af34afb..a8c4a3a 100644
>> --- a/arch/mips/include/asm/processor.h
>> +++ b/arch/mips/include/asm/processor.h
>> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
>>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
>>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>>
>> +#ifdef CONFIG_CPU_LOONGSON3
>> +/*
>> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
>> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
>> force
>> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
>> will
>> + * become available as expected.
>> + */
>
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".
Need I change the comment and resend? Or you change the comment and get merged?

Huacai

>
>> +#define cpu_relax()  smp_mb()
>> +#else
>>  #define cpu_relax()  barrier()
>> +#endif
>>
>>  /*
>>   * Return_address is a replacement for __builtin_return_address(count)
>> --
>> 2.7.0
>
> Apart from the comment above though this looks better to me.
>
> Re-copying the LKMM maintainers - are you happy(ish) with this?
>
> Thanks,
> Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Peter Zijlstra
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote:
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> > Linux expects that if a CPU modifies a memory location, then that
> > modification will eventually become visible to other CPUs in the system.
> > 
> > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> > prioritised over stores so it is possible for a store operation to be
> > postponed if a polling loop immediately follows it. If the variable
> > being polled indirectly depends on the outstanding store [for example,
> > another CPU may be polling the variable that is pending modification]
> > then there is the potential for deadlock if interrupts are disabled.
> > This deadlock occurs in qspinlock code.
> > 
> > This patch changes the definition of cpu_relax() to smp_mb() for
> > Loongson-3, forcing a flushing of the SFB on SMP systems before the
> > next load takes place. If the Kernel is not compiled for SMP support,
> > this will expand to a barrier() as before.
> > 
> > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
> > for ARM11MPCore)
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Huacai Chen 
> > ---
> >  arch/mips/include/asm/processor.h | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/mips/include/asm/processor.h 
> > b/arch/mips/include/asm/processor.h
> > index af34afb..a8c4a3a 100644
> > --- a/arch/mips/include/asm/processor.h
> > +++ b/arch/mips/include/asm/processor.h
> > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >  
> > +#ifdef CONFIG_CPU_LOONGSON3
> > +/*
> > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> > read
> > + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> > force
> > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
> > will
> > + * become available as expected.
> > + */
> 
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".

Agreed.

> > +#define cpu_relax()smp_mb()
> > +#else
> >  #define cpu_relax()barrier()
> > +#endif
> >  
> >  /*
> >   * Return_address is a replacement for __builtin_return_address(count)
> 
> Apart from the comment above though this looks better to me.
> 
> Re-copying the LKMM maintainers - are you happy(ish) with this?

Right, thanks for adding us back on :-)

Yes, this is much better, although I myself would also prefer explicit
mention that this is a work-around for a hardware bug.

But aside from the actual comment bike-shedding, this looks entirely
acceptible (also because ARM is already doing this -- and the Changelog
might want to refer to that particular patch).


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Peter Zijlstra
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote:
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> > Linux expects that if a CPU modifies a memory location, then that
> > modification will eventually become visible to other CPUs in the system.
> > 
> > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> > prioritised over stores so it is possible for a store operation to be
> > postponed if a polling loop immediately follows it. If the variable
> > being polled indirectly depends on the outstanding store [for example,
> > another CPU may be polling the variable that is pending modification]
> > then there is the potential for deadlock if interrupts are disabled.
> > This deadlock occurs in qspinlock code.
> > 
> > This patch changes the definition of cpu_relax() to smp_mb() for
> > Loongson-3, forcing a flushing of the SFB on SMP systems before the
> > next load takes place. If the Kernel is not compiled for SMP support,
> > this will expand to a barrier() as before.
> > 
> > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
> > for ARM11MPCore)
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Huacai Chen 
> > ---
> >  arch/mips/include/asm/processor.h | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/mips/include/asm/processor.h 
> > b/arch/mips/include/asm/processor.h
> > index af34afb..a8c4a3a 100644
> > --- a/arch/mips/include/asm/processor.h
> > +++ b/arch/mips/include/asm/processor.h
> > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >  
> > +#ifdef CONFIG_CPU_LOONGSON3
> > +/*
> > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> > read
> > + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> > force
> > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
> > will
> > + * become available as expected.
> > + */
> 
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".

Agreed.

> > +#define cpu_relax()smp_mb()
> > +#else
> >  #define cpu_relax()barrier()
> > +#endif
> >  
> >  /*
> >   * Return_address is a replacement for __builtin_return_address(count)
> 
> Apart from the comment above though this looks better to me.
> 
> Re-copying the LKMM maintainers - are you happy(ish) with this?

Right, thanks for adding us back on :-)

Yes, this is much better, although I myself would also prefer explicit
mention that this is a work-around for a hardware bug.

But aside from the actual comment bike-shedding, this looks entirely
acceptible (also because ARM is already doing this -- and the Changelog
might want to refer to that particular patch).


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Paul E. McKenney
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote:
> Hi Huacai,
> 
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> > Linux expects that if a CPU modifies a memory location, then that
> > modification will eventually become visible to other CPUs in the system.
> > 
> > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> > prioritised over stores so it is possible for a store operation to be
> > postponed if a polling loop immediately follows it. If the variable
> > being polled indirectly depends on the outstanding store [for example,
> > another CPU may be polling the variable that is pending modification]
> > then there is the potential for deadlock if interrupts are disabled.
> > This deadlock occurs in qspinlock code.
> > 
> > This patch changes the definition of cpu_relax() to smp_mb() for
> > Loongson-3, forcing a flushing of the SFB on SMP systems before the
> > next load takes place. If the Kernel is not compiled for SMP support,
> > this will expand to a barrier() as before.
> > 
> > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
> > for ARM11MPCore)
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Huacai Chen 
> > ---
> >  arch/mips/include/asm/processor.h | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/mips/include/asm/processor.h 
> > b/arch/mips/include/asm/processor.h
> > index af34afb..a8c4a3a 100644
> > --- a/arch/mips/include/asm/processor.h
> > +++ b/arch/mips/include/asm/processor.h
> > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >  
> > +#ifdef CONFIG_CPU_LOONGSON3
> > +/*
> > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> > read
> > + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> > force
> > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
> > will
> > + * become available as expected.
> > + */
> 
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".
> 
> > +#define cpu_relax()smp_mb()
> > +#else
> >  #define cpu_relax()barrier()
> > +#endif
> >  
> >  /*
> >   * Return_address is a replacement for __builtin_return_address(count)
> > -- 
> > 2.7.0
> 
> Apart from the comment above though this looks better to me.
> 
> Re-copying the LKMM maintainers - are you happy(ish) with this?

This looks much better to me.

Thanx, Paul



Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Paul E. McKenney
On Tue, Jul 17, 2018 at 10:52:32AM -0700, Paul Burton wrote:
> Hi Huacai,
> 
> On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> > Linux expects that if a CPU modifies a memory location, then that
> > modification will eventually become visible to other CPUs in the system.
> > 
> > On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> > prioritised over stores so it is possible for a store operation to be
> > postponed if a polling loop immediately follows it. If the variable
> > being polled indirectly depends on the outstanding store [for example,
> > another CPU may be polling the variable that is pending modification]
> > then there is the potential for deadlock if interrupts are disabled.
> > This deadlock occurs in qspinlock code.
> > 
> > This patch changes the definition of cpu_relax() to smp_mb() for
> > Loongson-3, forcing a flushing of the SFB on SMP systems before the
> > next load takes place. If the Kernel is not compiled for SMP support,
> > this will expand to a barrier() as before.
> > 
> > References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
> > for ARM11MPCore)
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Huacai Chen 
> > ---
> >  arch/mips/include/asm/processor.h | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/mips/include/asm/processor.h 
> > b/arch/mips/include/asm/processor.h
> > index af34afb..a8c4a3a 100644
> > --- a/arch/mips/include/asm/processor.h
> > +++ b/arch/mips/include/asm/processor.h
> > @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
> >  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
> >  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
> >  
> > +#ifdef CONFIG_CPU_LOONGSON3
> > +/*
> > + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a 
> > read
> > + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> > force
> > + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
> > will
> > + * become available as expected.
> > + */
> 
> I think "may starve writes" or "may queue writes indefinitely" would be
> clearer than "may get starved".
> 
> > +#define cpu_relax()smp_mb()
> > +#else
> >  #define cpu_relax()barrier()
> > +#endif
> >  
> >  /*
> >   * Return_address is a replacement for __builtin_return_address(count)
> > -- 
> > 2.7.0
> 
> Apart from the comment above though this looks better to me.
> 
> Re-copying the LKMM maintainers - are you happy(ish) with this?

This looks much better to me.

Thanx, Paul



Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Paul Burton
Hi Huacai,

On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> Linux expects that if a CPU modifies a memory location, then that
> modification will eventually become visible to other CPUs in the system.
> 
> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> prioritised over stores so it is possible for a store operation to be
> postponed if a polling loop immediately follows it. If the variable
> being polled indirectly depends on the outstanding store [for example,
> another CPU may be polling the variable that is pending modification]
> then there is the potential for deadlock if interrupts are disabled.
> This deadlock occurs in qspinlock code.
> 
> This patch changes the definition of cpu_relax() to smp_mb() for
> Loongson-3, forcing a flushing of the SFB on SMP systems before the
> next load takes place. If the Kernel is not compiled for SMP support,
> this will expand to a barrier() as before.
> 
> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
> for ARM11MPCore)
> Cc: sta...@vger.kernel.org
> Signed-off-by: Huacai Chen 
> ---
>  arch/mips/include/asm/processor.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/mips/include/asm/processor.h 
> b/arch/mips/include/asm/processor.h
> index af34afb..a8c4a3a 100644
> --- a/arch/mips/include/asm/processor.h
> +++ b/arch/mips/include/asm/processor.h
> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>  
> +#ifdef CONFIG_CPU_LOONGSON3
> +/*
> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> force
> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
> will
> + * become available as expected.
> + */

I think "may starve writes" or "may queue writes indefinitely" would be
clearer than "may get starved".

> +#define cpu_relax()  smp_mb()
> +#else
>  #define cpu_relax()  barrier()
> +#endif
>  
>  /*
>   * Return_address is a replacement for __builtin_return_address(count)
> -- 
> 2.7.0

Apart from the comment above though this looks better to me.

Re-copying the LKMM maintainers - are you happy(ish) with this?

Thanks,
Paul


Re: [PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

2018-07-17 Thread Paul Burton
Hi Huacai,

On Fri, Jul 13, 2018 at 03:37:57PM +0800, Huacai Chen wrote:
> Linux expects that if a CPU modifies a memory location, then that
> modification will eventually become visible to other CPUs in the system.
> 
> On Loongson-3 processor with SFB (Store Fill Buffer), loads may be
> prioritised over stores so it is possible for a store operation to be
> postponed if a polling loop immediately follows it. If the variable
> being polled indirectly depends on the outstanding store [for example,
> another CPU may be polling the variable that is pending modification]
> then there is the potential for deadlock if interrupts are disabled.
> This deadlock occurs in qspinlock code.
> 
> This patch changes the definition of cpu_relax() to smp_mb() for
> Loongson-3, forcing a flushing of the SFB on SMP systems before the
> next load takes place. If the Kernel is not compiled for SMP support,
> this will expand to a barrier() as before.
> 
> References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() 
> for ARM11MPCore)
> Cc: sta...@vger.kernel.org
> Signed-off-by: Huacai Chen 
> ---
>  arch/mips/include/asm/processor.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/mips/include/asm/processor.h 
> b/arch/mips/include/asm/processor.h
> index af34afb..a8c4a3a 100644
> --- a/arch/mips/include/asm/processor.h
> +++ b/arch/mips/include/asm/processor.h
> @@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
>  #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
>  #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>  
> +#ifdef CONFIG_CPU_LOONGSON3
> +/*
> + * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
> + * loop. Since spin loops of any kind should have a cpu_relax() in them, 
> force
> + * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes 
> will
> + * become available as expected.
> + */

I think "may starve writes" or "may queue writes indefinitely" would be
clearer than "may get starved".

> +#define cpu_relax()  smp_mb()
> +#else
>  #define cpu_relax()  barrier()
> +#endif
>  
>  /*
>   * Return_address is a replacement for __builtin_return_address(count)
> -- 
> 2.7.0

Apart from the comment above though this looks better to me.

Re-copying the LKMM maintainers - are you happy(ish) with this?

Thanks,
Paul