[RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-02-27 Thread Pranith Kumar
The ARM64 documentation recommends keeping exclusive loads and stores as
close together as possible. Any instructions which do not depend on the
value loaded should be moved outside the ldxr/stxr pair.

In the current implementation of cmpxchg(), there is a mov instruction
which can be moved before the load-exclusive instruction without any change
in functionality. This patch makes that change.

Signed-off-by: Pranith Kumar 
---
 arch/arm64/include/asm/cmpxchg.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index cb95930..8057735 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -89,8 +89,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 1:
do {
asm volatile("// __cmpxchg1\n"
-   "   ldxrb   %w1, %2\n"
"   mov %w0, #0\n"
+   "   ldxrb   %w1, %2\n"
"   cmp %w1, %w3\n"
"   b.ne1f\n"
"   stxrb   %w0, %w4, %2\n"
@@ -104,8 +104,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 2:
do {
asm volatile("// __cmpxchg2\n"
-   "   ldxrh   %w1, %2\n"
"   mov %w0, #0\n"
+   "   ldxrh   %w1, %2\n"
"   cmp %w1, %w3\n"
"   b.ne1f\n"
"   stxrh   %w0, %w4, %2\n"
@@ -119,8 +119,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 4:
do {
asm volatile("// __cmpxchg4\n"
-   "   ldxr%w1, %2\n"
"   mov %w0, #0\n"
+   "   ldxr%w1, %2\n"
"   cmp %w1, %w3\n"
"   b.ne1f\n"
"   stxr%w0, %w4, %2\n"
@@ -134,8 +134,8 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
case 8:
do {
asm volatile("// __cmpxchg8\n"
-   "   ldxr%1, %2\n"
"   mov %w0, #0\n"
+   "   ldxr%1, %2\n"
"   cmp %1, %3\n"
"   b.ne1f\n"
"   stxr%w0, %4, %2\n"
@@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
do {
asm volatile("// __cmpxchg_double8\n"
+   "   mov %w0, #0\n"
"   ldxp%0, %1, %2\n"
"   eor %0, %0, %3\n"
"   eor %1, %1, %4\n"
"   orr %1, %0, %1\n"
-   "   mov %w0, #0\n"
"   cbnz%1, 1f\n"
"   stxp%w0, %5, %6, %2\n"
"1:\n"
-- 
1.9.1



Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-02-27 Thread Will Deacon
On Fri, Feb 27, 2015 at 08:09:17PM +, Pranith Kumar wrote:
> The ARM64 documentation recommends keeping exclusive loads and stores as
> close together as possible. Any instructions which do not depend on the
> value loaded should be moved outside the ldxr/stxr pair.
> 
> In the current implementation of cmpxchg(), there is a mov instruction
> which can be moved before the load-exclusive instruction without any change
> in functionality. This patch makes that change.
> 
> Signed-off-by: Pranith Kumar 
> ---
>  arch/arm64/include/asm/cmpxchg.h | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

[...]

> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
>   VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
>   do {
>   asm volatile("// __cmpxchg_double8\n"
> + "   mov %w0, #0\n"
>   "   ldxp%0, %1, %2\n"

Seriously, you might want to test this before you mindlessly make changes to
low-level synchronisation code. Not only is the change completely unnecessary
but it is actively harmful.

Have a good weekend,

Will


Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-02-27 Thread Pranith Kumar
On Fri, Feb 27, 2015 at 3:15 PM, Will Deacon  wrote:
> On Fri, Feb 27, 2015 at 08:09:17PM +, Pranith Kumar wrote:
>> The ARM64 documentation recommends keeping exclusive loads and stores as
>> close together as possible. Any instructions which do not depend on the
>> value loaded should be moved outside the ldxr/stxr pair.
>>
>> In the current implementation of cmpxchg(), there is a mov instruction
>> which can be moved before the load-exclusive instruction without any change
>> in functionality. This patch makes that change.
>>
>> Signed-off-by: Pranith Kumar 
>> ---
>>  arch/arm64/include/asm/cmpxchg.h | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> [...]
>
>> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
>>   VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
>>   do {
>>   asm volatile("// __cmpxchg_double8\n"
>> + "   mov %w0, #0\n"
>>   "   ldxp%0, %1, %2\n"
>
> Seriously, you might want to test this before you mindlessly make changes to
> low-level synchronisation code. Not only is the change completely unnecessary
> but it is actively harmful.
>

Oops, I apologize for this. I should have looked more closely. It is
wrong to do this in cmpxchg_double(). What about the other cases?


-- 
Pranith


Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-02-27 Thread Pranith Kumar
On Fri, Feb 27, 2015 at 3:17 PM, Pranith Kumar  wrote:
> On Fri, Feb 27, 2015 at 3:15 PM, Will Deacon  wrote:
>>> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
>>>   VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
>>>   do {
>>>   asm volatile("// __cmpxchg_double8\n"
>>> + "   mov %w0, #0\n"
>>>   "   ldxp%0, %1, %2\n"
>>
>> Seriously, you might want to test this before you mindlessly make changes to
>> low-level synchronisation code. Not only is the change completely unnecessary
>> but it is actively harmful.
>>
>
> Oops, I apologize for this. I should have looked more closely. It is
> wrong to do this in cmpxchg_double(). What about the other cases?
>

Hi Will,

I tried looking closely at what might be the problem here. I am
waiting on a HiKey arm64 board, and I agree I should not send
changes without running/testing them first.

Could you please explain (for educational purposes) why you think this
change is harmful?


Thanks,
-- 
Pranith


Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-03-03 Thread Catalin Marinas
On Fri, Feb 27, 2015 at 03:29:38PM -0500, Pranith Kumar wrote:
> On Fri, Feb 27, 2015 at 3:17 PM, Pranith Kumar  wrote:
> > On Fri, Feb 27, 2015 at 3:15 PM, Will Deacon  wrote:
> >>> @@ -166,11 +166,11 @@ static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2,
> >>>   VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1);
> >>>   do {
> >>>   asm volatile("// __cmpxchg_double8\n"
> >>> + "   mov %w0, #0\n"
> >>>   "   ldxp%0, %1, %2\n"
> >>
> >> Seriously, you might want to test this before you mindlessly make changes
> >> to low-level synchronisation code. Not only is the change completely
> >> unnecessary but it is actively harmful.
> >>
> >
> > Oops, I apologize for this. I should have looked more closely. It is
> > wrong to do this in cmpxchg_double(). What about the other cases?
> 
> I tried looking closely at what might be the problem here. I am
> waiting on a HiKey arm64 board, and I agree I should not send
> changes without running/testing them first.
> 
> Could you please explain (for educational purposes) why you think this
> change is harmful?

Do you mean the cmpxchg_double() change? Because %w0 and %0 are the same
physical register. You set it to 0 and immediately overwrite it with
ldxp.
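
To illustrate, here is a minimal user-space sketch (hypothetical code,
not from the kernel; the names are made up) of the same aliasing: the
mov to %w0 is dead because the 64-bit load overwrites the whole
register.

    #include <stdio.h>

    int main(void)
    {
            unsigned long x = 0xdeadbeefcafef00dUL;
            unsigned long out;

            asm volatile(
            "       mov     %w0, #0\n"      /* zeroes the 32-bit view of %0 */
            "       ldr     %0, %1\n"       /* the 64-bit load then clobbers it */
            : "=&r" (out)
            : "m" (x));

            /* prints deadbeefcafef00d: the earlier mov had no effect */
            printf("%lx\n", out);
            return 0;
    }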

-- 
Catalin


Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-03-03 Thread Pranith Kumar
On Tue, Mar 3, 2015 at 9:34 AM, Catalin Marinas  wrote:
>
> Do you mean the cmpxchg_double() change? Because %w0 and %0 are the same
> physical register. You set it to 0 and immediately overwrite it with
> ldxp.
>

Thanks, Catalin. I realized the blunder a while after Will pointed it
out. The asm here was a bit confusing: %0 usually refers to the first
input/output variable, but for the ldxp instruction (which is just a
double-word load, not exclusive) it refers to the physical registers.

What about the changes in cmpxchg()? Why do we need to set %w0 to 0
after the ldxrh instruction? Also, could you please point me to any
arm64 asm reference?

Thanks!
-- 
Pranith


Re: [RFC PATCH] arm64: cmpxchg.h: Bring ldxr and stxr closer

2015-03-03 Thread Catalin Marinas
On Tue, Mar 03, 2015 at 11:58:26AM -0500, Pranith Kumar wrote:
> On Tue, Mar 3, 2015 at 9:34 AM, Catalin Marinas  wrote:
> > Do you mean the cmpxchg_double() change? Because %w0 and %0 are the same
> > physical register. You set it to 0 and immediately overwrite it with
> > ldxp.
> 
> Thanks, Catalin. I realized the blunder a while after Will pointed it
> out. The asm here was a bit confusing: %0 usually refers to the first
> input/output variable, but for the ldxp instruction (which is just a
> double-word load, not exclusive) it refers to the physical registers.

OK, so please try not to touch any asm code until you understand (a) the
AArch64 instruction set and (b) the gcc inline assembly syntax ;).

> What about the changes in cmpxchg()? Why do we need to set %w0 to 0
> after the ldxrh instruction? Also, could you please point me to any
> arm64 asm reference?

"info gcc" for the inline assembly syntax and the ARMv8 ARM for the
description of the AArch64 instruction set. It also helps if you look at
the code generated by the compiler (e.g. objdump -d).
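
For example (a hedged sketch: the file name, cross-compiler prefix and
exact output depend on your toolchain):

    /* cas.c -- build and inspect with something like:
     *   aarch64-linux-gnu-gcc -O2 -c cas.c
     *   aarch64-linux-gnu-objdump -d cas.o
     */
    #include <stdbool.h>

    bool cas(unsigned long *p, unsigned long old, unsigned long new)
    {
            /* GCC typically expands this builtin to an exclusive
             * load/store (ldaxr/stlxr) retry loop, much like the
             * kernel's __cmpxchg above. */
            return __atomic_compare_exchange_n(p, &old, new, false,
                                               __ATOMIC_SEQ_CST,
                                               __ATOMIC_SEQ_CST);
    }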

-- 
Catalin