Re: [PATCH] x86: cmpxchg_double: Add missing memory clobber

2015-10-07 Thread Peter Zijlstra
On Tue, Oct 06, 2015 at 02:39:02PM -0700, H. Peter Anvin wrote:

> However, I think one of the major uses for cmpxchg_double() is for page
> table manipulation, and for that it isn't clear that a compiler barrier
> is needed nor desired.

See mm/slub.c, that uses cmpxchg_double() (the LOCK prefixed one) and
one would expect that to also include a compiler barrier.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: cmpxchg_double: Add missing memory clobber

2015-10-06 Thread H. Peter Anvin
On 10/06/2015 01:29 PM, Pranith Kumar wrote:
> On Tue, Oct 6, 2015 at 4:16 PM, H. Peter Anvin  wrote:
>>
>> NAK.  We already have the "+m" for exactly this reason; adding an
>> explicit memory clobber should only be used to prevent movement of
>> *other* memory operations around this one (i.e. a barrier).
>>
> 
> OK. If that is so, can you please explain why we need it in the
> __raw_cmpxchg() case? I think it is a good idea to make cmpxchg() and
> cmpxchg_double() have similar barrier semantics.
> 

OK, it is a bit of a mess.  We use the same macros for locked operations
(__cmpxchg and __sync_cmpxchg) and unlocked operations
(__cmpxchg_local).  For locked operations we generally want a compiler
barrier, although there are exceptions.  I'm wondering if it would be
better to add an explicit barrier(); to the locked versions.

However, I think one of the major uses for cmpxchg_double() is for page
table manipulation, and for that it isn't clear that a compiler barrier
is needed nor desired.

On the other hand, perhaps all of this is false optimization and we
should just add the memory clobber.  The real issue is the impact on the
_local variants.

-hpa
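
The split discussed above (a compiler barrier for the locked variants, none for the _local ones) can be sketched in user space roughly as follows. This is only an illustration, assuming x86-64 and GCC or Clang: the function names cmpxchg64_local_sketch() and cmpxchg64_locked_sketch() are hypothetical and are not the kernel's macros. Only the barrier() definition mirrors what the kernel uses (an empty asm with a "memory" clobber).

```c
#include <stdint.h>

/* Compiler-only barrier: emits no instruction, but the "memory" clobber
 * stops the compiler from moving memory accesses across this point. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical unlocked ("_local") variant. The "+m" operand declares
 * that *ptr is read and written; no other memory is declared as touched,
 * so the compiler may still reorder unrelated loads and stores around it.
 */
static inline uint64_t cmpxchg64_local_sketch(uint64_t *ptr, uint64_t old,
                                              uint64_t new_val)
{
    uint64_t prev = old;

    __asm__ __volatile__("cmpxchgq %2, %1"
                         : "+a" (prev), "+m" (*ptr)
                         : "r" (new_val)
                         : "cc");
    return prev;    /* value found in *ptr; equals old on success */
}

/*
 * Hypothetical locked variant: the same asm with a LOCK prefix, wrapped
 * in explicit barrier() calls instead of carrying its own "memory"
 * clobber. This relies on the compiler not reordering volatile asm
 * statements relative to one another.
 */
static inline uint64_t cmpxchg64_locked_sketch(uint64_t *ptr, uint64_t old,
                                               uint64_t new_val)
{
    uint64_t prev = old;

    barrier();
    __asm__ __volatile__("lock cmpxchgq %2, %1"
                         : "+a" (prev), "+m" (*ptr)
                         : "r" (new_val)
                         : "cc");
    barrier();
    return prev;
}
```

With this layout the _local path pays no compiler-barrier cost, which is the impact on the _local variants raised above; whether that saving is worth the extra complexity is exactly the open question in this message.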



Re: [PATCH] x86: cmpxchg_double: Add missing memory clobber

2015-10-06 Thread Pranith Kumar
On Tue, Oct 6, 2015 at 4:16 PM, H. Peter Anvin  wrote:
>
> NAK.  We already have the "+m" for exactly this reason; adding an
> explicit memory clobber should only be used to prevent movement of
> *other* memory operations around this one (i.e. a barrier).
>

OK. If that is so, can you please explain why we need it in the
__raw_cmpxchg() case? I think it is a good idea to make cmpxchg() and
cmpxchg_double() have similar barrier semantics.

Thanks!
-- 
Pranith


Re: [PATCH] x86: cmpxchg_double: Add missing memory clobber

2015-10-06 Thread H. Peter Anvin
On 10/06/2015 11:54 AM, Pranith Kumar wrote:
> We are reading from memory locations pointed to by p1 and p2 in the asm
> block. Add a memory clobber flag to make gcc aware of this.
> 
> Signed-off-by: Pranith Kumar 
> ---
>  arch/x86/include/asm/cmpxchg.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
> index 4a2e5bc..3e83949 100644
> --- a/arch/x86/include/asm/cmpxchg.h
> +++ b/arch/x86/include/asm/cmpxchg.h
> @@ -214,7 +214,8 @@ extern void __add_wrong_size(void)
>                       : "=a" (__ret), "+d" (__old2),            \
>                         "+m" (*(p1)), "+m" (*(p2))              \
>                       : "i" (2 * sizeof(long)), "a" (__old1),   \
> -                       "b" (__new1), "c" (__new2));            \
> +                       "b" (__new1), "c" (__new2)              \
> +                     : "memory");                              \
>          __ret;                                                 \
>  })

NAK.  We already have the "+m" for exactly this reason; adding an
explicit memory clobber should only be used to prevent movement of
*other* memory operations around this one (i.e. a barrier).

-hpa
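
To make the distinction in the NAK concrete, here is a minimal user-space sketch, assuming x86-64 and GCC or Clang. The struct pair type and both function names are hypothetical stand-ins, not the kernel's __cmpxchg_double() macro. The first version declares only the two words as read and written through "+m" operands; the second adds the "memory" clobber, which additionally makes the asm a compiler barrier for all other memory accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word target; cmpxchg16b needs 16-byte alignment. */
struct pair {
    uint64_t lo;
    uint64_t hi;
} __attribute__((aligned(16)));

/*
 * "+m" on both words: the compiler knows the asm reads and writes
 * p->lo and p->hi, so it will not cache them across the asm. Unrelated
 * loads and stores may still be moved around it.
 */
static inline bool cmpxchg_double_nobarrier(struct pair *p,
                                            uint64_t old_lo, uint64_t old_hi,
                                            uint64_t new_lo, uint64_t new_hi)
{
    bool ret;

    assert(((uintptr_t)p % 16) == 0);
    __asm__ __volatile__("lock cmpxchg16b %2; sete %0"
                         : "=a" (ret), "+d" (old_hi),
                           "+m" (p->lo), "+m" (p->hi)
                         : "a" (old_lo), "b" (new_lo), "c" (new_hi)
                         : "cc");
    return ret;
}

/*
 * Same instruction plus a "memory" clobber: the compiler must now assume
 * any memory may have changed, so no other memory access is moved across
 * the asm (a compiler barrier).
 */
static inline bool cmpxchg_double_barrier(struct pair *p,
                                          uint64_t old_lo, uint64_t old_hi,
                                          uint64_t new_lo, uint64_t new_hi)
{
    bool ret;

    assert(((uintptr_t)p % 16) == 0);
    __asm__ __volatile__("lock cmpxchg16b %2; sete %0"
                         : "=a" (ret), "+d" (old_hi),
                           "+m" (p->lo), "+m" (p->hi)
                         : "a" (old_lo), "b" (new_lo), "c" (new_hi)
                         : "cc", "memory");
    return ret;
}
```

Both versions behave identically when called in isolation; the difference only shows up in how the compiler may schedule surrounding loads and stores.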




[PATCH] x86: cmpxchg_double: Add missing memory clobber

2015-10-06 Thread Pranith Kumar
We are reading from memory locations pointed to by p1 and p2 in the asm
block. Add a memory clobber flag to make gcc aware of this.

Signed-off-by: Pranith Kumar 
---
 arch/x86/include/asm/cmpxchg.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index 4a2e5bc..3e83949 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -214,7 +214,8 @@ extern void __add_wrong_size(void)
                      : "=a" (__ret), "+d" (__old2),            \
                        "+m" (*(p1)), "+m" (*(p2))              \
                      : "i" (2 * sizeof(long)), "a" (__old1),   \
-                       "b" (__new1), "c" (__new2));            \
+                       "b" (__new1), "c" (__new2)              \
+                     : "memory");                              \
         __ret;                                                 \
 })
 
-- 
2.5.3


