Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-14 Thread Dmitry Vyukov
On Tue, Mar 14, 2017 at 4:44 PM, Mark Rutland  wrote:
>> > -static __always_inline int atomic_read(const atomic_t *v)
>> > +static __always_inline int arch_atomic_read(const atomic_t *v)
>> >  {
>> > -   return READ_ONCE((v)->counter);
>> > +   return READ_ONCE_NOCHECK((v)->counter);
>>
>> Should NOCHECK come with a comment, because I've no idea why this is so.
>
> I suspect the idea is that given the wrapper will have done the KASAN
> check, duplicating it here is either sub-optimal, or results in
> duplicate splats. READ_ONCE() has an implicit KASAN check,
> READ_ONCE_NOCHECK() does not.
>
> If this is to solve duplicate splats, it'd be worth having a
> WRITE_ONCE_NOCHECK() for arch_atomic_set().
>
> Agreed on the comment, regardless.


Reverted xchg changes.
Added comments re READ_ONCE_NOCHECK() and WRITE_ONCE().
Added file comment.
Split into 3 patches and mailed.

Thanks!


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-14 Thread Mark Rutland
On Tue, Mar 14, 2017 at 04:32:30PM +0100, Peter Zijlstra wrote:
> On Tue, Mar 14, 2017 at 04:22:52PM +0100, Dmitry Vyukov wrote:
> > -static __always_inline int atomic_read(const atomic_t *v)
> > +static __always_inline int arch_atomic_read(const atomic_t *v)
> >  {
> > -   return READ_ONCE((v)->counter);
> > +   return READ_ONCE_NOCHECK((v)->counter);
> 
>> Should NOCHECK come with a comment, because I've no idea why this is so.

I suspect the idea is that given the wrapper will have done the KASAN
check, duplicating it here is either sub-optimal, or results in
duplicate splats. READ_ONCE() has an implicit KASAN check,
READ_ONCE_NOCHECK() does not.

If this is to solve duplicate splats, it'd be worth having a
WRITE_ONCE_NOCHECK() for arch_atomic_set().

Agreed on the comment, regardless.

Thanks,
Mark.
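
A minimal sketch of the split described above (not the posted patch; it assumes the arch_/wrapper naming used in this thread, and WRITE_ONCE_NOCHECK() is only the suggested counterpart, not an existing accessor):

static __always_inline int arch_atomic_read(const atomic_t *v)
{
	/* The wrapper below has already checked @v; avoid a duplicate splat. */
	return READ_ONCE_NOCHECK((v)->counter);
}

static __always_inline int atomic_read(const atomic_t *v)
{
	kasan_check_read(v, sizeof(*v));	/* the single, explicit KASAN check */
	return arch_atomic_read(v);
}

static __always_inline void arch_atomic_set(atomic_t *v, int i)
{
	/* A hypothetical WRITE_ONCE_NOCHECK() would avoid the duplicate check here. */
	WRITE_ONCE(v->counter, i);
}

static __always_inline void atomic_set(atomic_t *v, int i)
{
	kasan_check_write(v, sizeof(*v));
	arch_atomic_set(v, i);
}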


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-14 Thread Peter Zijlstra
On Tue, Mar 14, 2017 at 04:22:52PM +0100, Dmitry Vyukov wrote:
> -static __always_inline int atomic_read(const atomic_t *v)
> +static __always_inline int arch_atomic_read(const atomic_t *v)
>  {
> - return READ_ONCE((v)->counter);
> + return READ_ONCE_NOCHECK((v)->counter);

Should NOCHECK come with a comment, because I've no idea why this is so.

>  }


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-14 Thread Peter Zijlstra
On Tue, Mar 14, 2017 at 04:22:52PM +0100, Dmitry Vyukov wrote:
> Any other suggestions?

> - return i + xadd(&v->counter, i);
> + return i + arch_xadd(&v->counter, i);

> +#define xadd(ptr, v) \
> +({   \
> + __typeof__(ptr) __ptr = (ptr);\
> + kasan_check_write(__ptr, sizeof(*__ptr));   \
> + arch_xadd(__ptr, (v));\
> +})

xadd() isn't a generic thing, it only exists inside x86 as a helper to
implement atomic bits.
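
One way to honour that constraint -- a sketch only, not what was eventually posted -- is to leave xadd() untouched as an x86-internal helper and put the check in the generic wrapper that calls into the arch code:

static __always_inline int atomic_add_return(int i, atomic_t *v)
{
	kasan_check_write(v, sizeof(*v));
	return arch_atomic_add_return(i, v);	/* still implemented with xadd() internally */
}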


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-14 Thread Dmitry Vyukov
On Wed, Mar 8, 2017 at 6:43 PM, Will Deacon  wrote:
> On Wed, Mar 08, 2017 at 03:20:41PM +, Mark Rutland wrote:
>> On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
>> > I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
>> > consequently x86/arm64) initially, it becomes more realistic. For the
>> > tools we don't care about absolute efficiency and this gets rid of
>> > Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
>> > Re (3) I think rmb/wmb can be reasonably replaced with
>> > atomic_thread_fence(acquire/release). Re (5) situation with
>> > correctness becomes better very quickly as more people use them in
>> > user-space. Since KASAN is not intended to be used in production (or
>> > at least such build is expected to crash), we can afford to shake out
>> > any remaining correctness issues in such build. (1) I don't fully
>> > understand, what exactly is the problem with seq_cst?
>>
>> I'll have to leave it to Will to have the final word on these; I'm
>> certainly not familiar enough with the C11 memory model to comment on
>> (1).
>
> rmb()/wmb() are not remotely similar to
> atomic_thread_fence_{acquire,release}, even if you restrict ordering to
> coherent CPUs (i.e. the smp_* variants). Please don't do that :)
>
> I'm also terrified of the optimisations that the compiler is theoretically
> allowed to make to C11 atomics given the assumptions of the language
> virtual machine, which are not necessarily valid in the kernel environment.
> We would at least need well-supported compiler options to disable these
> optimisations, and also to allow data races with things like READ_ONCE.

Hello,

I've prototyped what Mark suggested:
 - prefix arch atomics with arch_
 - add  which defines atomics and
forwards to the arch_ version

Patch attached. It boots with/without KASAN.

Does it look reasonable to you?

If so, I will split it into:
 - minor kasan patch that adds volatile to kasan_check_read/write
 - main patch that adds arch_ prefix and
 header
 - kasan instrumentation in 

Any other suggestions?
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..605a29c9fe81 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -16,36 +16,36 @@
 #define ATOMIC_INIT(i)	{ (i) }
 
 /**
- * atomic_read - read atomic variable
+ * arch_atomic_read - read atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically reads the value of @v.
  */
-static __always_inline int atomic_read(const atomic_t *v)
+static __always_inline int arch_atomic_read(const atomic_t *v)
 {
-	return READ_ONCE((v)->counter);
+	return READ_ONCE_NOCHECK((v)->counter);
 }
 
 /**
- * atomic_set - set atomic variable
+ * arch_atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static __always_inline void atomic_set(atomic_t *v, int i)
+static __always_inline void arch_atomic_set(atomic_t *v, int i)
 {
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic_add - add integer to atomic variable
+ * arch_atomic_add - add integer to atomic variable
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic_add(int i, atomic_t *v)
+static __always_inline void arch_atomic_add(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		 : "+m" (v->counter)
@@ -53,13 +53,13 @@ static __always_inline void atomic_add(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub - subtract integer from atomic variable
+ * arch_atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
  * Atomically subtracts @i from @v.
  */
-static __always_inline void atomic_sub(int i, atomic_t *v)
+static __always_inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		 : "+m" (v->counter)
@@ -67,7 +67,7 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub_and_test - subtract value from variable and test result
+ * arch_atomic_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
@@ -75,63 +75,63 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
+static __always_inline bool arch_atomic_sub_and_test(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic_inc - increment atomic variable
+ * arch_atomic_inc - increment atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic_inc(atomic_t *v)
+static __always_inline void arch_atomic_inc(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "incl %0"
 		 

Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Will Deacon
On Wed, Mar 08, 2017 at 03:20:41PM +, Mark Rutland wrote:
> On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
> > I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
> > consequently x86/arm64) initially, it becomes more realistic. For the
> > tools we don't care about absolute efficiency and this gets rid of
> > Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
> > Re (3) I think rmb/wmb can be reasonably replaced with
> > atomic_thread_fence(acquire/release). Re (5) situation with
> > correctness becomes better very quickly as more people use them in
> > user-space. Since KASAN is not intended to be used in production (or
> > at least such build is expected to crash), we can afford to shake out
> > any remaining correctness issues in such build. (1) I don't fully
> > understand, what exactly is the problem with seq_cst?
> 
> I'll have to leave it to Will to have the final word on these; I'm
> certainly not familiar enough with the C11 memory model to comment on
> (1).

rmb()/wmb() are not remotely similar to
atomic_thread_fence_{acquire,release}, even if you restrict ordering to
coherent CPUs (i.e. the smp_* variants). Please don't do that :)

I'm also terrified of the optimisations that the compiler is theoretically
allowed to make to C11 atomics given the assumptions of the language
virtual machine, which are not necessarily valid in the kernel environment.
We would at least need well-supported compiler options to disable these
optimisations, and also to allow data races with things like READ_ONCE.

Will


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Dmitry Vyukov
On Mon, Mar 6, 2017 at 9:35 PM, Peter Zijlstra  wrote:
> On Mon, Mar 06, 2017 at 04:20:18PM +, Mark Rutland wrote:
>> > >> So the problem is doing load/stores from asm bits, and GCC
>> > >> (traditionally) doesn't try and interpret APP asm bits.
>> > >>
>> > >> However, could we not write a GCC plugin that does exactly that?
>> > >> Something that interprets the APP asm bits and generates these KASAN
>> > >> bits that go with it?
>
>> I don't think there's much you'll be able to do within the compiler,
>> assuming you mean to derive this from the asm block inputs and outputs.
>
> Nah, I was thinking about a full asm interpreter.
>
>> Those can hide address-generation (e.g. with per-cpu stuff), which the
>> compiler may erroneously detect as racing.
>>
>> Those may also take fake inputs (e.g. the sp input to arm64's
>> __my_cpu_offset()) which may confuse matters.
>>
>> Parsing the assembly itself will be *extremely* painful due to the way
>> that's set up for run-time patching.
>
> Argh, yah, completely forgot about all that alternative and similar
> nonsense :/



I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
consequently x86/arm64) initially, it becomes more realistic. For the
tools we don't care about absolute efficiency and this gets rid of
Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
Re (3) I think rmb/wmb can be reasonably replaced with
atomic_thread_fence(acquire/release). Re (5) situation with
correctness becomes better very quickly as more people use them in
user-space. Since KASAN is not intended to be used in production (or
at least such build is expected to crash), we can afford to shake out
any remaining correctness issues in such build. (1) I don't fully
understand, what exactly is the problem with seq_cst?

I've sketched a patch that does it, and did some testing with/without
KASAN on x86_64.

In short, it adds include/linux/atomic_compiler.h which is included
from include/linux/atomic.h when CONFIG_COMPILER_ATOMIC is defined;
and  is not included when CONFIG_COMPILER_ATOMIC is
defined.
For bitops it is similar except that only parts of asm/bitops.h are
selectively disabled when CONFIG_COMPILER_ATOMIC, because it also
defines other stuff.
asm/barriers.h is left intact for now. We don't need it for KASAN. But
for KTSAN we can do a similar thing -- selectively disable some of the
barriers in asm/barriers.h (e.g. leaving dma_rmb/wmb per arch).

Such a change would allow us to support atomic ops for multiple arches
for all of KASAN/KTSAN/KMSAN.

Thoughts?
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..7bcb10544fc1 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -1,6 +1,10 @@
 #ifndef _ASM_X86_ATOMIC_H
 #define _ASM_X86_ATOMIC_H
 
+#ifdef CONFIG_COMPILER_ATOMIC
+#error "should not be included"
+#endif
+
 #include 
 #include 
 #include 
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 854022772c5b..e42b85f1ed75 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -68,6 +68,7 @@
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void
 set_bit(long nr, volatile unsigned long *addr)
 {
@@ -81,6 +82,7 @@ set_bit(long nr, volatile unsigned long *addr)
 			: BITOP_ADDR(addr) : "Ir" (nr) : "memory");
 	}
 }
+#endif
 
 /**
  * __set_bit - Set a bit in memory
@@ -106,6 +108,7 @@ static __always_inline void __set_bit(long nr, volatile unsigned long *addr)
  * you should call smp_mb__before_atomic() and/or smp_mb__after_atomic()
  * in order to ensure changes are visible on other processors.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void
 clear_bit(long nr, volatile unsigned long *addr)
 {
@@ -119,6 +122,7 @@ clear_bit(long nr, volatile unsigned long *addr)
 			: "Ir" (nr));
 	}
 }
+#endif
 
 /*
  * clear_bit_unlock - Clears a bit in memory
@@ -128,17 +132,20 @@ clear_bit(long nr, volatile unsigned long *addr)
  * clear_bit() is atomic and implies release semantics before the memory
  * operation. It can be used for an unlock.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
 {
 	barrier();
 	clear_bit(nr, addr);
 }
+#endif
 
 static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
 {
 	asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
 }
 
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
 {
 	bool negative;
@@ -151,6 +158,7 @@ static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile
 
 // Let everybody know we have it
 #define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+#endif
 
 /*
  * __clear_bit_unlock - Clears a bit 
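
For illustration, the include/linux/atomic_compiler.h idea described above could take roughly this shape (a sketch under the assumption that relaxed compiler builtins match the semantics of atomic_read/set/add; this is not the attached patch):

static __always_inline int atomic_read(const atomic_t *v)
{
	/* Compiler-visible load, so the sanitizers can instrument it. */
	return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
}

static __always_inline void atomic_set(atomic_t *v, int i)
{
	__atomic_store_n(&v->counter, i, __ATOMIC_RELAXED);
}

static __always_inline void atomic_add(int i, atomic_t *v)
{
	(void)__atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
}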

Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Mark Rutland
On Wed, Mar 08, 2017 at 04:45:58PM +0100, Dmitry Vyukov wrote:
> On Wed, Mar 8, 2017 at 4:43 PM, Mark Rutland  wrote:
> > On Wed, Mar 08, 2017 at 04:27:11PM +0100, Dmitry Vyukov wrote:
> >> On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland  wrote:
> >> > As in my other reply, I'd prefer that we wrapped the (arch-specific)
> >> > atomic implementations such that we can instrument them explicitly in a
> >> > core header. That means that the implementation and semantics of the
> >> > atomics don't change at all.
> >> >
> >> > Note that we could initially do this just for x86 and arm64, e.g. by
> >> > having those explicitly include an 
> >> > at the end of their .
> >>
> >> How exactly do you want to do this incrementally?
> >> I don't feel ready to shuffle all archs, but doing x86 in one patch
> >> and then arm64 in another looks tractable.
> >
> > I guess we'd have three patches: one adding the header and any core
> > infrastructure, followed by separate patches migrating arm64 and x86
> > over.
> 
> But if we add e.g. atomic_read() which forwards to arch_atomic_read()
> to , it will break all archs that don't rename their
> atomic_read() to arch_atomic_read().

... as above, that'd be handled by placing this in an
 file, that we only include at the
end of the arch implementation.

So we'd only include that on arm64 and x86, without needing to change
the names elsewhere.

Thanks,
Mark.


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Dmitry Vyukov
On Wed, Mar 8, 2017 at 4:43 PM, Mark Rutland  wrote:
> On Wed, Mar 08, 2017 at 04:27:11PM +0100, Dmitry Vyukov wrote:
>> On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland  wrote:
>> > As in my other reply, I'd prefer that we wrapped the (arch-specific)
>> > atomic implementations such that we can instrument them explicitly in a
>> > core header. That means that the implementation and semantics of the
>> > atomics don't change at all.
>> >
>> > Note that we could initially do this just for x86 and arm64, e.g. by
>> > having those explicitly include an 
>> > at the end of their .
>>
>> How exactly do you want to do this incrementally?
>> I don't feel ready to shuffle all archs, but doing x86 in one patch
>> and then arm64 in another looks tractable.
>
> I guess we'd have three patches: one adding the header and any core
> infrastructure, followed by separate patches migrating arm64 and x86
> over.

But if we add e.g. atomic_read() which forwards to arch_atomic_read()
to , it will break all archs that don't rename their
atomic_read() to arch_atomic_read().


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Mark Rutland
On Wed, Mar 08, 2017 at 04:27:11PM +0100, Dmitry Vyukov wrote:
> On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland  wrote:
> > As in my other reply, I'd prefer that we wrapped the (arch-specific)
> > atomic implementations such that we can instrument them explicitly in a
> > core header. That means that the implementation and semantics of the
> > atomics don't change at all.
> >
> > Note that we could initially do this just for x86 and arm64, e.g. by
> > having those explicitly include an 
> > at the end of their .
> 
> How exactly do you want to do this incrementally?
> I don't feel ready to shuffle all archs, but doing x86 in one patch
> and then arm64 in another looks tractable.

I guess we'd have three patches: one adding the header and any core
infrastructure, followed by separate patches migrating arm64 and x86
over.

Thanks,
Mark.


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Mark Rutland
Hi,

On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
> I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
> consequently x86/arm64) initially, it becomes more realistic. For the
> tools we don't care about absolute efficiency and this gets rid of
> Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
> Re (3) I think rmb/wmb can be reasonably replaced with
> atomic_thread_fence(acquire/release). Re (5) situation with
> correctness becomes better very quickly as more people use them in
> user-space. Since KASAN is not intended to be used in production (or
> at least such build is expected to crash), we can afford to shake out
> any remaining correctness issues in such build. (1) I don't fully
> understand, what exactly is the problem with seq_cst?

I'll have to leave it to Will to have the final word on these; I'm
certainly not familiar enough with the C11 memory model to comment on
(1).

However, w.r.t. (3), I don't think we can substitute rmb() and wmb()
with atomic_thread_fence_acquire() and atomic_thread_fence_release()
respectively on arm64.

The former use barriers with full system scope, whereas the latter may
be limited to the inner shareable domain. While I'm not sure of the
precise intended semantics of wmb() and rmb(), I believe this
substitution would break some cases (e.g. communicating with a
non-coherent master).

Note that regardless, we'd have to special-case __iowmb() to use a full
system barrier.

Also, w.r.t. (5), modulo the lack of atomic instrumentation, people use
KASAN today, with compilers that are known to have bugs in their atomics
(e.g. GCC bug 69875). Thus, we cannot rely on the compiler's
implementation of atomics without introducing a functional regression.

> I've sketched a patch that does it, and did some testing with/without
> KASAN on x86_64.
> 
> In short, it adds include/linux/atomic_compiler.h which is included
> from include/linux/atomic.h when CONFIG_COMPILER_ATOMIC is defined;
> and  is not included when CONFIG_COMPILER_ATOMIC is
> defined.
> For bitops it is similar except that only parts of asm/bitops.h are
> selectively disabled when CONFIG_COMPILER_ATOMIC, because it also
> defines other stuff.
> asm/barriers.h is left intact for now. We don't need it for KASAN. But
> for KTSAN we can do similar thing -- selectively disable some of the
> barriers in asm/barriers.h (e.g. leaving dma_rmb/wmb per arch).
> 
> Such change would allow us to support atomic ops for multiple arches
> for all of KASAN/KTSAN/KMSAN.
> 
> Thoughts?

As in my other reply, I'd prefer that we wrapped the (arch-specific)
atomic implementations such that we can instrument them explicitly in a
core header. That means that the implementation and semantics of the
atomics don't change at all.

Note that we could initially do this just for x86 and arm64, e.g. by
having those explicitly include an 
at the end of their .

For architectures which can use the compiler's atomics, we can allow
them to do so, skipping the redundant explicit instrumentation.

Other than being potentially slower (which we've established we don't
care too much about above), is there a problem with that approach?

Thanks,
Mark.
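
For reference, the scope distinction drawn above maps onto arm64 roughly as in the sketch below (the instruction choices are an illustration from memory, not quoted from the thread):

/* arm64 mandatory barriers: DSB with full-system scope */
#define rmb()	asm volatile("dsb ld" ::: "memory")
#define wmb()	asm volatile("dsb st" ::: "memory")

/* C11 fences: typically DMB limited to the inner-shareable domain */
static inline void c11_acquire_fence(void)
{
	__atomic_thread_fence(__ATOMIC_ACQUIRE);	/* e.g. dmb ishld */
}

static inline void c11_release_fence(void)
{
	__atomic_thread_fence(__ATOMIC_RELEASE);	/* e.g. dmb ish */
}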


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-08 Thread Dmitry Vyukov
On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland  wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
>> I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
>> consequently x86/arm64) initially, it becomes more realistic. For the
>> tools we don't care about absolute efficiency and this gets rid of
>> Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
>> Re (3) I think rmb/wmb can be reasonably replaced with
>> atomic_thread_fence(acquire/release). Re (5) situation with
>> correctness becomes better very quickly as more people use them in
>> user-space. Since KASAN is not intended to be used in production (or
>> at least such build is expected to crash), we can afford to shake out
>> any remaining correctness issues in such build. (1) I don't fully
>> understand, what exactly is the problem with seq_cst?
>
> I'll have to leave it to Will to have the final word on these; I'm
> certainly not familiar enough with the C11 memory model to comment on
> (1).
>
> However, w.r.t. (3), I don't think we can substitute rmb() and wmb()
> with atomic_thread_fence_acquire() and atomic_thread_fence_release()
> respectively on arm64.
>
> The former use barriers with full system scope, whereas the latter may
> be limited to the inner shareable domain. While I'm not sure of the
> precise intended semantics of wmb() and rmb(), I believe this
> substitution would break some cases (e.g. communicating with a
> non-coherent master).
>
> Note that regardless, we'd have to special-case __iowmb() to use a full
> system barrier.
>
> Also, w.r.t. (5), modulo the lack of atomic instrumentation, people use
> KASAN today, with compilers that are known to have bugs in their atomics
> (e.g. GCC bug 69875). Thus, we cannot rely on the compiler's
> implementation of atomics without introducing a functional regression.
>
>> I've sketched a patch that does it, and did some testing with/without
>> KASAN on x86_64.
>>
>> In short, it adds include/linux/atomic_compiler.h which is included
>> from include/linux/atomic.h when CONFIG_COMPILER_ATOMIC is defined;
>> and  is not included when CONFIG_COMPILER_ATOMIC is
>> defined.
>> For bitops it is similar except that only parts of asm/bitops.h are
>> selectively disabled when CONFIG_COMPILER_ATOMIC, because it also
>> defines other stuff.
>> asm/barriers.h is left intact for now. We don't need it for KASAN. But
>> for KTSAN we can do similar thing -- selectively disable some of the
>> barriers in asm/barriers.h (e.g. leaving dma_rmb/wmb per arch).
>>
>> Such change would allow us to support atomic ops for multiple arches
>> for all of KASAN/KTSAN/KMSAN.
>>
>> Thoughts?
>
> As in my other reply, I'd prefer that we wrapped the (arch-specific)
> atomic implementations such that we can instrument them explicitly in a
> core header. That means that the implementation and semantics of the
> atomics don't change at all.
>
> Note that we could initially do this just for x86 and arm64, e.g. by
> having those explicitly include an 
> at the end of their .

How exactly do you want to do this incrementally?
I don't feel ready to shuffle all archs, but doing x86 in one patch
and then arm64 in another looks tractable.


> For architectures which can use the compiler's atomics, we can allow
> them to do so, skipping the redundant explicit instrumentation.
>
> Other than being potentially slower (which we've established we don't
> care too much about above), is there a problem with that approach?


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Peter Zijlstra
On Mon, Mar 06, 2017 at 04:20:18PM +, Mark Rutland wrote:
> > >> So the problem is doing load/stores from asm bits, and GCC
> > >> (traditionally) doesn't try and interpret APP asm bits.
> > >>
> > >> However, could we not write a GCC plugin that does exactly that?
> > >> Something that interprets the APP asm bits and generates these KASAN
> > >> bits that go with it?

> I don't think there's much you'll be able to do within the compiler,
> assuming you mean to derive this from the asm block inputs and outputs.

Nah, I was thinking about a full asm interpreter.

> Those can hide address-generation (e.g. with per-cpu stuff), which the
> compiler may erroneously detect as racing.
> 
> Those may also take fake inputs (e.g. the sp input to arm64's
> __my_cpu_offset()) which may confuse matters.
> 
> Parsing the assembly itself will be *extremely* painful due to the way
> that's set up for run-time patching.

Argh, yah, completely forgot about all that alternative and similar
nonsense :/



Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Mark Rutland
On Mon, Mar 06, 2017 at 05:27:44PM +0100, Dmitry Vyukov wrote:
> On Mon, Mar 6, 2017 at 5:20 PM, Mark Rutland  wrote:
> > On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> >> On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra  
> >> wrote:
> >> > On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
> >> >> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> >> >> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  
> >> >> > wrote:
> >> >> > > KASAN uses compiler instrumentation to intercept all memory 
> >> >> > > accesses.
> >> >> > > But it does not see memory accesses done in assembly code.
> >> >> > > One notable user of assembly code is atomic operations. Frequently,
> >> >> > > for example, an atomic reference decrement is the last access to an
> >> >> > > object and a good candidate for a racy use-after-free.
> >> >> > >
> >> >> > > Add manual KASAN checks to atomic operations.
> >> >> > > Note: we need checks only before asm blocks and don't need them
> >> >> > > in atomic functions composed of other atomic functions
> >> >> > > (e.g. load-cmpxchg loops).
> >> >> >
> >> >> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add 
> >> >> > them in v2.
> >> >> >
> >> >>
> >> >> > >  static __always_inline void atomic_add(int i, atomic_t *v)
> >> >> > >  {
> >> >> > > +   kasan_check_write(v, sizeof(*v));
> >> >> > > asm volatile(LOCK_PREFIX "addl %1,%0"
> >> >> > >  : "+m" (v->counter)
> >> >> > >  : "ir" (i));

> >> Bottom line:
> >> 1. Involving compiler looks quite complex, hard to deploy, and it's
> >> unclear if it will actually make things easier.
> >> 2. This patch is the simplest short-term option (I am leaning towards
> >> adding bitops to this patch and leaving percpu out for now).
> >> 3. Providing an implementation of atomic ops based on compiler
> >> builtins looks like a nice option for other archs and tools, but is
> >> more work. If you consider this as a good solution, we can move
> >> straight to this option.
> >
> > Having *only* seen the assembly snippet at the top of this mail, I can't
> > say whether this is the simplest implementation.
> >
> > However, I do think that annotation of this sort is the only reasonable
> > way to handle this.
> 
> Here is the whole patch:
> https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/X76pwg_tAwAJ

I see.

Given we'd have to instrument each architecture's atomics in an
identical fashion, maybe we should follow the example of spinlocks, and
add an arch_ prefix to the arch-specific implementation, and place the
instrumentation in a common wrapper.

i.e. have something like:

static __always_inline void atomic_inc(atomic_t *v)
{
	kasan_check_write(v, sizeof(*v));
	arch_atomic_inc(v);
}

... in asm-generic somewhere.

It's more churn initially, but it should be a saving overall, and I
imagine for KMSAN or other things we may want more instrumentation
anyway...

Thanks,
Mark.


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Andrey Ryabinin
On 03/06/2017 05:24 PM, Dmitry Vyukov wrote:

> Let me provide more context and design alternatives.
> 
> There are also other archs, at least arm64 for now.
> There are also other tools. For KTSAN (race detector) we will
> absolutely need to hook into atomic ops. For KMSAN (uses of unit
> values) we also need to understand atomic ops at least to some degree.
> Both of them will require different instrumentation.
> For KASAN we are also more interested in cases where it's more likely
> that an object is touched only by an asm, but not by normal memory
> accesses (otherwise we would report the bug on the normal access,
> which is fine, this makes atomic ops stand out in my opinion).
> 
> We could involve compiler (and by compiler I mean clang, because we
> are not going to touch gcc, any volunteers?).

We've tried this with gcc about 3 years ago. Here is the patch - 
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02447.html
The problem is that a memory block in an "m" constraint doesn't actually mean
that the inline asm will access it. It only means that the asm block *may* access
that memory (or part of it).
This causes false positives. As I vaguely remember, I hit some false positives in
FPU-related code.
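
(Hypothetical illustration, not the actual FPU code: the empty asm below
declares *p through an "m" operand but never loads or stores it, yet
constraint-based instrumentation would still emit a check for it.)

static inline void no_access(int *p)
{
        /* "m" only says the asm *may* touch *p; this one never does */
        asm volatile("" : : "m" (*p));
}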

This problem gave birth to another idea - add a new constraint to strictly mark
the memory access inside the asm block. See https://gcc.gnu.org/ml/gcc/2014-09/msg00237.html
But it all ended with nothing.



> However, it's unclear if it will be simpler or not. There will
> definitely will be a problem with uaccess asm blocks. Currently KASAN
> relies of the fact that it does not see uaccess accesses and the user
> addresses are considered bad by KASAN. There can also be a problem
> with offsets/sizes, it's not possible to figure out what exactly an
> asm block touches, we can only assume that it directly dereferences
> the passed pointer. However, for example, bitops touch the pointer
> with offset. Looking at the current x86 impl, we should be able to
> handle it because the offset is computed outside of asm blocks. But
> it's unclear if we hit this problem in other places.
>
> I also see that arm64 bitops are implemented in .S files. And we won't
> be able to instrument them in compiler.
> There can also be other problems. Is it possible that some asm blocks
> accept e.g. physical addresses? KASAN would consider them as bad.
> 



Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Dmitry Vyukov
On Mon, Mar 6, 2017 at 5:20 PM, Mark Rutland  wrote:
> Hi,
>
> [roping in Will, since he loves atomics]
>
> On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
>> On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra  wrote:
>> > On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
>> >> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
>> >> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  
>> >> > wrote:
>> >> > > KASAN uses compiler instrumentation to intercept all memory accesses.
>> >> > > But it does not see memory accesses done in assembly code.
>> >> > > One notable user of assembly code is atomic operations. Frequently,
>> >> > > for example, an atomic reference decrement is the last access to an
>> >> > > object and a good candidate for a racy use-after-free.
>> >> > >
>> >> > > Add manual KASAN checks to atomic operations.
>> >> > > Note: we need checks only before asm blocks and don't need them
>> >> > > in atomic functions composed of other atomic functions
>> >> > > (e.g. load-cmpxchg loops).
>> >> >
>> >> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them 
>> >> > in v2.
>> >> >
>> >>
>> >> > >  static __always_inline void atomic_add(int i, atomic_t *v)
>> >> > >  {
>> >> > > +   kasan_check_write(v, sizeof(*v));
>> >> > > asm volatile(LOCK_PREFIX "addl %1,%0"
>> >> > >  : "+m" (v->counter)
>> >> > >  : "ir" (i));
>> >>
>> >>
>> >> So the problem is doing load/stores from asm bits, and GCC
>> >> (traditionally) doesn't try and interpret APP asm bits.
>> >>
>> >> However, could we not write a GCC plugin that does exactly that?
>> >> Something that interprets the APP asm bits and generates these KASAN
>> >> bits that go with it?
>> >
>> > Another suspect is the per-cpu stuff, that's all asm foo as well.
>
> Unfortunately, I think that manual annotation is the only way to handle
> these (as we already do for kernel part of the uaccess sequences), since
> we hide things from the compiler or otherwise trick it into doing what
> we want.
>
>> +x86, Mark
>>
>> Let me provide more context and design alternatives.
>>
>> There are also other archs, at least arm64 for now.
>> There are also other tools. For KTSAN (race detector) we will
>> absolutely need to hook into atomic ops. For KMSAN (uses of unit
>> values) we also need to understand atomic ops at least to some degree.
>> Both of them will require different instrumentation.
>> For KASAN we are also more interested in cases where it's more likely
>> that an object is touched only by an asm, but not by normal memory
>> accesses (otherwise we would report the bug on the normal access,
>> which is fine, this makes atomic ops stand out in my opinion).
>>
>> We could involve compiler (and by compiler I mean clang, because we
>> are not going to touch gcc, any volunteers?).
>
> I don't think there's much you'll be able to do within the compiler,
> assuming you mean to derive this from the asm block inputs and outputs.
>
> Those can hide address-generation (e.g. with per-cpu stuff), which the
> compiler may erroneously be detected as racing.
>
> Those may also take fake inputs (e.g. the sp input to arm64's
> __my_cpu_offset()) which may confuse matters.
>
> Parsing the assembly itself will be *extremely* painful due to the way
> that's set up for run-time patching.
>
>> However, it's unclear if it will be simpler or not. There will
>> definitely will be a problem with uaccess asm blocks. Currently KASAN
>> relies of the fact that it does not see uaccess accesses and the user
>> addresses are considered bad by KASAN. There can also be a problem
>> with offsets/sizes, it's not possible to figure out what exactly an
>> asm block touches, we can only assume that it directly dereferences
>> the passed pointer. However, for example, bitops touch the pointer
>> with offset. Looking at the current x86 impl, we should be able to
>> handle it because the offset is computed outside of asm blocks. But
>> it's unclear if we hit this problem in other places.
>
> As above, I think you'd see more fun for the percpu stuff, since the
> pointer passed into those is "fake", with a percpu pointer accessing
> different addresses dependent on the CPU it is executed on.
>
>> I also see that arm64 bitops are implemented in .S files. And we won't
>> be able to instrument them in compiler.
>> There can also be other problems. Is it possible that some asm blocks
>> accept e.g. physical addresses? KASAN would consider them as bad.
>
> I'm not sure I follow what you mean here.
>
> I can imagine physical addresses being passed into asm statements that
> don't access memory (e.g. for setting up the base registers for page
> tables).
>
>> We could also provide a parallel implementation of atomic ops based on
>> the new compiler builtins (__atomic_load_n and friends):
>> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
>> and enable it under KSAN. The nice thing about 

Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Mark Rutland
Hi,

[roping in Will, since he loves atomics]

On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra  wrote:
> > On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
> >> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> >> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  wrote:
> >> > > KASAN uses compiler instrumentation to intercept all memory accesses.
> >> > > But it does not see memory accesses done in assembly code.
> >> > > One notable user of assembly code is atomic operations. Frequently,
> >> > > for example, an atomic reference decrement is the last access to an
> >> > > object and a good candidate for a racy use-after-free.
> >> > >
> >> > > Add manual KASAN checks to atomic operations.
> >> > > Note: we need checks only before asm blocks and don't need them
> >> > > in atomic functions composed of other atomic functions
> >> > > (e.g. load-cmpxchg loops).
> >> >
> >> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them 
> >> > in v2.
> >> >
> >>
> >> > >  static __always_inline void atomic_add(int i, atomic_t *v)
> >> > >  {
> >> > > +   kasan_check_write(v, sizeof(*v));
> >> > > asm volatile(LOCK_PREFIX "addl %1,%0"
> >> > >  : "+m" (v->counter)
> >> > >  : "ir" (i));
> >>
> >>
> >> So the problem is doing load/stores from asm bits, and GCC
> >> (traditionally) doesn't try and interpret APP asm bits.
> >>
> >> However, could we not write a GCC plugin that does exactly that?
> >> Something that interprets the APP asm bits and generates these KASAN
> >> bits that go with it?
> >
> > Another suspect is the per-cpu stuff, that's all asm foo as well.

Unfortunately, I think that manual annotation is the only way to handle
these (as we already do for the kernel part of the uaccess sequences), since
we hide things from the compiler or otherwise trick it into doing what
we want.

> +x86, Mark
> 
> Let me provide more context and design alternatives.
> 
> There are also other archs, at least arm64 for now.
> There are also other tools. For KTSAN (race detector) we will
> absolutely need to hook into atomic ops. For KMSAN (uses of unit
> values) we also need to understand atomic ops at least to some degree.
> Both of them will require different instrumentation.
> For KASAN we are also more interested in cases where it's more likely
> that an object is touched only by an asm, but not by normal memory
> accesses (otherwise we would report the bug on the normal access,
> which is fine, this makes atomic ops stand out in my opinion).
> 
> We could involve compiler (and by compiler I mean clang, because we
> are not going to touch gcc, any volunteers?).

I don't think there's much you'll be able to do within the compiler,
assuming you mean to derive this from the asm block inputs and outputs.

Those can hide address generation (e.g. with per-cpu stuff), which the
compiler may erroneously detect as racing.

Those may also take fake inputs (e.g. the sp input to arm64's
__my_cpu_offset()) which may confuse matters.

Parsing the assembly itself will be *extremely* painful due to the way
that's set up for run-time patching.

> However, it's unclear if it will be simpler or not. There will
> definitely will be a problem with uaccess asm blocks. Currently KASAN
> relies of the fact that it does not see uaccess accesses and the user
> addresses are considered bad by KASAN. There can also be a problem
> with offsets/sizes, it's not possible to figure out what exactly an
> asm block touches, we can only assume that it directly dereferences
> the passed pointer. However, for example, bitops touch the pointer
> with offset. Looking at the current x86 impl, we should be able to
> handle it because the offset is computed outside of asm blocks. But
> it's unclear if we hit this problem in other places.

As above, I think you'd see more fun for the percpu stuff, since the
pointer passed into those is "fake", with a percpu pointer accessing
different addresses dependent on the CPU it is executed on.

> I also see that arm64 bitops are implemented in .S files. And we won't
> be able to instrument them in compiler.
> There can also be other problems. Is it possible that some asm blocks
> accept e.g. physical addresses? KASAN would consider them as bad.

I'm not sure I follow what you mean here.

I can imagine physical addresses being passed into asm statements that
don't access memory (e.g. for setting up the base registers for page
tables).

> We could also provide a parallel implementation of atomic ops based on
> the new compiler builtins (__atomic_load_n and friends):
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> and enable it under KSAN. The nice thing about it is that it will
> automatically support arm64 and KMSAN and KTSAN.
> But it's more work.

These don't permit runtime patching, and there are some differences
between the C11 and Linux 

Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Mark Rutland
On Mon, Mar 06, 2017 at 04:20:13PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> > We could also provide a parallel implementation of atomic ops based on
> > the new compiler builtins (__atomic_load_n and friends):
> > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > and enable it under KSAN. The nice thing about it is that it will
> > automatically support arm64 and KMSAN and KTSAN.
> > But it's more work.
> 
> There's a summary out there somewhere, I think Will knows, that explain
> how the C/C++ memory model and the Linux Kernel Memory model differ and
> how its going to be 'interesting' to make using the C/C++ builtin crud
> with the kernel 'correct.

Trivially, the C++ model doesn't feature I/O ordering [1]...

Otherwise Will pointed out a few details in [2].

Thanks,
Mark.

[1] https://lwn.net/Articles/698014/
[2] http://lwn.net/Articles/691295/


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Peter Zijlstra
On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> We could also provide a parallel implementation of atomic ops based on
> the new compiler builtins (__atomic_load_n and friends):
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> and enable it under KSAN. The nice thing about it is that it will
> automatically support arm64 and KMSAN and KTSAN.
> But it's more work.

There's a summary out there somewhere, I think Will knows, that explains
how the C/C++ memory model and the Linux kernel memory model differ and
how it's going to be 'interesting' to make using the C/C++ builtin crud
with the kernel 'correct'.



Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Peter Zijlstra
On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> We could involve compiler (and by compiler I mean clang, because we
> are not going to touch gcc, any volunteers?).

FWIW, clang isn't even close to being a viable compiler for the kernel.
It lacks far too many features, _IF_ you can get it to compile in the
first place.


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Dmitry Vyukov
On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra  wrote:
> On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
>> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
>> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  wrote:
>> > > KASAN uses compiler instrumentation to intercept all memory accesses.
>> > > But it does not see memory accesses done in assembly code.
>> > > One notable user of assembly code is atomic operations. Frequently,
>> > > for example, an atomic reference decrement is the last access to an
>> > > object and a good candidate for a racy use-after-free.
>> > >
>> > > Add manual KASAN checks to atomic operations.
>> > > Note: we need checks only before asm blocks and don't need them
>> > > in atomic functions composed of other atomic functions
>> > > (e.g. load-cmpxchg loops).
>> >
>> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in 
>> > v2.
>> >
>>
>> > >  static __always_inline void atomic_add(int i, atomic_t *v)
>> > >  {
>> > > +   kasan_check_write(v, sizeof(*v));
>> > > asm volatile(LOCK_PREFIX "addl %1,%0"
>> > >  : "+m" (v->counter)
>> > >  : "ir" (i));
>>
>>
>> So the problem is doing load/stores from asm bits, and GCC
>> (traditionally) doesn't try and interpret APP asm bits.
>>
>> However, could we not write a GCC plugin that does exactly that?
>> Something that interprets the APP asm bits and generates these KASAN
>> bits that go with it?
>
> Another suspect is the per-cpu stuff, that's all asm foo as well.


+x86, Mark

Let me provide more context and design alternatives.

There are also other archs, at least arm64 for now.
There are also other tools. For KTSAN (race detector) we will
absolutely need to hook into atomic ops. For KMSAN (uses of uninit
values) we also need to understand atomic ops at least to some degree.
Both of them will require different instrumentation.
For KASAN we are also more interested in cases where it's more likely
that an object is touched only by an asm, but not by normal memory
accesses (otherwise we would report the bug on the normal access,
which is fine, this makes atomic ops stand out in my opinion).

We could involve the compiler (and by compiler I mean clang, because we
are not going to touch gcc, any volunteers?).
However, it's unclear if it will be simpler or not. There will
definitely be a problem with uaccess asm blocks. Currently KASAN
relies on the fact that it does not see uaccess accesses and the user
addresses are considered bad by KASAN. There can also be a problem
with offsets/sizes, it's not possible to figure out what exactly an
asm block touches, we can only assume that it directly dereferences
the passed pointer. However, for example, bitops touch the pointer
with an offset. Looking at the current x86 impl, we should be able to
handle it because the offset is computed outside of asm blocks. But
it's unclear if we hit this problem in other places.
I also see that arm64 bitops are implemented in .S files. And we won't
be able to instrument them in compiler.
There can also be other problems. Is it possible that some asm blocks
accept e.g. physical addresses? KASAN would consider them as bad.
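
(Illustration only, not from the patch: if bitops grew the same kind of
check, it would have to cover the word the bit actually lands in, e.g.
assuming a hypothetical arch_set_bit() that does the real asm:)

static __always_inline void set_bit(long nr, volatile unsigned long *addr)
{
        /* bit 'nr' lands in addr[nr / BITS_PER_LONG], not necessarily *addr */
        kasan_check_write((void *)(addr + nr / BITS_PER_LONG), sizeof(long));
        arch_set_bit(nr, addr);
}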

We could also provide a parallel implementation of atomic ops based on
the new compiler builtins (__atomic_load_n and friends):
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
and enable it under KASAN. The nice thing about it is that it will
automatically support arm64 and KMSAN and KTSAN.
But it's more work.
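
(A purely illustrative sketch, ignoring for the moment the C11 vs. kernel
memory-model differences, of what such a builtin-based variant could look
like:)

static __always_inline int atomic_read(const atomic_t *v)
{
        return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
}

static __always_inline void atomic_add(int i, atomic_t *v)
{
        __atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
}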

Re per-cpu asm. I would say that it's less critical than atomic ops.
Static per-cpu slots are not subject to use-after-free. Dynamic slots
can be subject to use-after-free and it would be nice to catch bugs
there. However, I think we will need to add manual
poisoning/unpoisoning of dynamic slots as well.

Bottom line:
1. Involving compiler looks quite complex, hard to deploy, and it's
unclear if it will actually make things easier.
2. This patch is the simplest short-term option (I am leaning towards
adding bitops to this patch and leaving percpu out for now).
3. Providing an implementation of atomic ops based on compiler
builtins looks like a nice option for other archs and tools, but is
more work. If you consider this as a good solution, we can move
straight to this option.


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Peter Zijlstra
On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  wrote:
> > > KASAN uses compiler instrumentation to intercept all memory accesses.
> > > But it does not see memory accesses done in assembly code.
> > > One notable user of assembly code is atomic operations. Frequently,
> > > for example, an atomic reference decrement is the last access to an
> > > object and a good candidate for a racy use-after-free.
> > >
> > > Add manual KASAN checks to atomic operations.
> > > Note: we need checks only before asm blocks and don't need them
> > > in atomic functions composed of other atomic functions
> > > (e.g. load-cmpxchg loops).
> > 
> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in 
> > v2.
> > 
> 
> > >  static __always_inline void atomic_add(int i, atomic_t *v)
> > >  {
> > > +   kasan_check_write(v, sizeof(*v));
> > > asm volatile(LOCK_PREFIX "addl %1,%0"
> > >  : "+m" (v->counter)
> > >  : "ir" (i));
> 
> 
> So the problem is doing load/stores from asm bits, and GCC
> (traditionally) doesn't try and interpret APP asm bits.
> 
> However, could we not write a GCC plugin that does exactly that?
> Something that interprets the APP asm bits and generates these KASAN
> bits that go with it?

Another suspect is the per-cpu stuff, that's all asm foo as well.


Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Peter Zijlstra
On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  wrote:
> > KASAN uses compiler instrumentation to intercept all memory accesses.
> > But it does not see memory accesses done in assembly code.
> > One notable user of assembly code is atomic operations. Frequently,
> > for example, an atomic reference decrement is the last access to an
> > object and a good candidate for a racy use-after-free.
> >
> > Add manual KASAN checks to atomic operations.
> > Note: we need checks only before asm blocks and don't need them
> > in atomic functions composed of other atomic functions
> > (e.g. load-cmpxchg loops).
> 
> Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
> 

> >  static __always_inline void atomic_add(int i, atomic_t *v)
> >  {
> > +   kasan_check_write(v, sizeof(*v));
> > asm volatile(LOCK_PREFIX "addl %1,%0"
> >  : "+m" (v->counter)
> >  : "ir" (i));


So the problem is doing load/stores from asm bits, and GCC
(traditionally) doesn't try and interpret APP asm bits.

However, could we not write a GCC plugin that does exactly that?
Something that interprets the APP asm bits and generates these KASAN
bits that go with it?


[PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Dmitry Vyukov
KASAN uses compiler instrumentation to intercept all memory accesses.
But it does not see memory accesses done in assembly code.
One notable user of assembly code is atomic operations. Frequently,
for example, an atomic reference decrement is the last access to an
object and a good candidate for a racy use-after-free.

Add manual KASAN checks to atomic operations.
Note: we need checks only before asm blocks and don't need them
in atomic functions composed of other atomic functions
(e.g. load-cmpxchg loops).
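
E.g. atomic_add() gains a check right before its asm block:

static __always_inline void atomic_add(int i, atomic_t *v)
{
        kasan_check_write(v, sizeof(*v));
        asm volatile(LOCK_PREFIX "addl %1,%0"
                     : "+m" (v->counter)
                     : "ir" (i));
}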

Signed-off-by: Dmitry Vyukov 
Cc: Andrew Morton 
Cc: Andrey Ryabinin 
Cc: kasan-...@googlegroups.com
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org

---
Within a day it has found its first bug:

==
BUG: KASAN: use-after-free in atomic_dec_and_test
arch/x86/include/asm/atomic.h:123 [inline] at addr 880079c30158
BUG: KASAN: use-after-free in put_task_struct
include/linux/sched/task.h:93 [inline] at addr 880079c30158
BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
kernel/events/core.c:1131 at addr 880079c30158
Write of size 4 by task syz-executor6/25698
CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x2fb/0x3fd lib/dump_stack.c:52
 kasan_object_err+0x1c/0x90 mm/kasan/report.c:166
 print_address_description mm/kasan/report.c:208 [inline]
 kasan_report_error mm/kasan/report.c:292 [inline]
 kasan_report.part.2+0x1b0/0x460 mm/kasan/report.c:314
 kasan_report+0x21/0x30 mm/kasan/report.c:301
 check_memory_region_inline mm/kasan/kasan.c:326 [inline]
 check_memory_region+0x139/0x190 mm/kasan/kasan.c:333
 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
 atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
 put_task_struct include/linux/sched/task.h:93 [inline]
 put_ctx+0xcf/0x110 kernel/events/core.c:1131
 perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
 perf_release+0x37/0x50 kernel/events/core.c:4338
 __fput+0x332/0x800 fs/file_table.c:209
 fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x197/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xb38/0x29c0 kernel/exit.c:880
 do_group_exit+0x149/0x420 kernel/exit.c:984
 get_signal+0x7e0/0x1820 kernel/signal.c:2318
 do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
 exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
 syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
 do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x4458d9
RSP: 002b:7f3f07187cf8 EFLAGS: 0246 ORIG_RAX: 00ca
RAX: fe00 RBX: 007080c8 RCX: 004458d9
RDX:  RSI:  RDI: 007080c8
RBP: 007080a8 R08:  R09: 
R10:  R11: 0246 R12: 
R13:  R14: 7f3f071889c0 R15: 7f3f07188700
Object at 880079c30140, in cache task_struct size: 5376
Allocated:
PID = 25681
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:616
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:555
 kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
 alloc_task_struct_node kernel/fork.c:153 [inline]
 dup_task_struct kernel/fork.c:495 [inline]
 copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
 copy_process kernel/fork.c:1531 [inline]
 _do_fork+0x200/0x1010 kernel/fork.c:1994
 SYSC_clone kernel/fork.c:2104 [inline]
 SyS_clone+0x37/0x50 kernel/fork.c:2098
 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
 return_from_SYSCALL_64+0x0/0x7a
Freed:
PID = 25681
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
 __cache_free mm/slab.c:3514 [inline]
 kmem_cache_free+0x71/0x240 mm/slab.c:3774
 free_task_struct kernel/fork.c:158 [inline]
 free_task+0x151/0x1d0 kernel/fork.c:370
 copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
 copy_process kernel/fork.c:1531 [inline]
 _do_fork+0x200/0x1010 kernel/fork.c:1994
 SYSC_clone kernel/fork.c:2104 [inline]
 SyS_clone+0x37/0x50 kernel/fork.c:2098
 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
 return_from_SYSCALL_64+0x0/0x7a
---
 arch/x86/include/asm/atomic.h  | 11 +++
 arch/x86/include/asm/atomic64_64.h | 10 ++
 arch/x86/include/asm/cmpxchg.h |  4 
 3 files changed, 25 insertions(+)

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..64f0a7fb9b2f 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -2,6 +2,7 @@
 #define _ASM_X86_ATOMIC_H
 
 #include 

Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Dmitry Vyukov
On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  wrote:
> KASAN uses compiler instrumentation to intercept all memory accesses.
> But it does not see memory accesses done in assembly code.
> One notable user of assembly code is atomic operations. Frequently,
> for example, an atomic reference decrement is the last access to an
> object and a good candidate for a racy use-after-free.
>
> Add manual KASAN checks to atomic operations.
> Note: we need checks only before asm blocks and don't need them
> in atomic functions composed of other atomic functions
> (e.g. load-cmpxchg loops).

Peter also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.


> Signed-off-by: Dmitry Vyukov 
> Cc: Andrew Morton 
> Cc: Andrey Ryabinin 
> Cc: kasan-...@googlegroups.com
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org
>
> ---
> Within a day it has found its first bug:
>
> ==
> BUG: KASAN: use-after-free in atomic_dec_and_test
> arch/x86/include/asm/atomic.h:123 [inline] at addr 880079c30158
> BUG: KASAN: use-after-free in put_task_struct
> include/linux/sched/task.h:93 [inline] at addr 880079c30158
> BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
> kernel/events/core.c:1131 at addr 880079c30158
> Write of size 4 by task syz-executor6/25698
> CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x2fb/0x3fd lib/dump_stack.c:52
>  kasan_object_err+0x1c/0x90 mm/kasan/report.c:166
>  print_address_description mm/kasan/report.c:208 [inline]
>  kasan_report_error mm/kasan/report.c:292 [inline]
>  kasan_report.part.2+0x1b0/0x460 mm/kasan/report.c:314
>  kasan_report+0x21/0x30 mm/kasan/report.c:301
>  check_memory_region_inline mm/kasan/kasan.c:326 [inline]
>  check_memory_region+0x139/0x190 mm/kasan/kasan.c:333
>  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
>  atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
>  put_task_struct include/linux/sched/task.h:93 [inline]
>  put_ctx+0xcf/0x110 kernel/events/core.c:1131
>  perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
>  perf_release+0x37/0x50 kernel/events/core.c:4338
>  __fput+0x332/0x800 fs/file_table.c:209
>  fput+0x15/0x20 fs/file_table.c:245
>  task_work_run+0x197/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0xb38/0x29c0 kernel/exit.c:880
>  do_group_exit+0x149/0x420 kernel/exit.c:984
>  get_signal+0x7e0/0x1820 kernel/signal.c:2318
>  do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
>  exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
>  syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
>  do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x4458d9
> RSP: 002b:7f3f07187cf8 EFLAGS: 0246 ORIG_RAX: 00ca
> RAX: fe00 RBX: 007080c8 RCX: 004458d9
> RDX:  RSI:  RDI: 007080c8
> RBP: 007080a8 R08:  R09: 
> R10:  R11: 0246 R12: 
> R13:  R14: 7f3f071889c0 R15: 7f3f07188700
> Object at 880079c30140, in cache task_struct size: 5376
> Allocated:
> PID = 25681
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:616
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:555
>  kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
>  alloc_task_struct_node kernel/fork.c:153 [inline]
>  dup_task_struct kernel/fork.c:495 [inline]
>  copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
>  copy_process kernel/fork.c:1531 [inline]
>  _do_fork+0x200/0x1010 kernel/fork.c:1994
>  SYSC_clone kernel/fork.c:2104 [inline]
>  SyS_clone+0x37/0x50 kernel/fork.c:2098
>  do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>  return_from_SYSCALL_64+0x0/0x7a
> Freed:
> PID = 25681
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
>  __cache_free mm/slab.c:3514 [inline]
>  kmem_cache_free+0x71/0x240 mm/slab.c:3774
>  free_task_struct kernel/fork.c:158 [inline]
>  free_task+0x151/0x1d0 kernel/fork.c:370
>  copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
>  copy_process kernel/fork.c:1531 [inline]
>  _do_fork+0x200/0x1010 kernel/fork.c:1994
>  SYSC_clone kernel/fork.c:2104 [inline]
>  SyS_clone+0x37/0x50 kernel/fork.c:2098
>  do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>  return_from_SYSCALL_64+0x0/0x7a
> ---
>  

Re: [PATCH] x86, kasan: add KASAN checks to atomic operations

2017-03-06 Thread Dmitry Vyukov
On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov  wrote:
> KASAN uses compiler instrumentation to intercept all memory accesses.
> But it does not see memory accesses done in assembly code.
> One notable user of assembly code is atomic operations. Frequently,
> for example, an atomic reference decrement is the last access to an
> object and a good candidate for a racy use-after-free.
>
> Add manual KASAN checks to atomic operations.
> Note: we need checks only before asm blocks and don't need them
> in atomic functions composed of other atomic functions
> (e.g. load-cmpxchg loops).

Peter also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
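
For illustration, here is a minimal sketch of the kind of check described above: a manual KASAN check in front of the asm block of an x86 atomic op, plus an example of why wrappers built from other atomics need no extra call. This is not the exact patch text; it assumes the kasan_check_write() helper from <linux/kasan-checks.h>, and the *_sketch name is made up for the example.

/*
 * Illustrative sketch, not the exact patch text: a manual KASAN check
 * placed before the asm block, assuming kasan_check_write() from
 * <linux/kasan-checks.h>.
 */
static __always_inline void atomic_inc(atomic_t *v)
{
	/* Report a freed/bad object before the locked instruction touches it. */
	kasan_check_write(v, sizeof(*v));
	asm volatile(LOCK_PREFIX "incl %0"
		     : "+m" (v->counter));
}

/*
 * A load-cmpxchg loop needs no check of its own: the atomic_read() and
 * atomic_cmpxchg() it is built from are already checked. The _sketch
 * suffix only marks this as an example, not an existing symbol.
 */
static __always_inline int atomic_add_unless_sketch(atomic_t *v, int a, int u)
{
	int c, old;

	c = atomic_read(v);
	for (;;) {
		if (unlikely(c == u))
			break;
		old = atomic_cmpxchg(v, c, c + a);
		if (likely(old == c))
			break;
		c = old;
	}
	return c;
}

Placing the check before the lock-prefixed instruction means a racy last decrement of a freed object is reported at the atomic operation itself, which is exactly the pattern in the use-after-free report quoted below.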


> Signed-off-by: Dmitry Vyukov 
> Cc: Andrew Morton 
> Cc: Andrey Ryabinin 
> Cc: kasan-...@googlegroups.com
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org
>
> ---
> Within a day it has found its first bug:
>
> ==
> BUG: KASAN: use-after-free in atomic_dec_and_test
> arch/x86/include/asm/atomic.h:123 [inline] at addr 880079c30158
> BUG: KASAN: use-after-free in put_task_struct
> include/linux/sched/task.h:93 [inline] at addr 880079c30158
> BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
> kernel/events/core.c:1131 at addr 880079c30158
> Write of size 4 by task syz-executor6/25698
> CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x2fb/0x3fd lib/dump_stack.c:52
>  kasan_object_err+0x1c/0x90 mm/kasan/report.c:166
>  print_address_description mm/kasan/report.c:208 [inline]
>  kasan_report_error mm/kasan/report.c:292 [inline]
>  kasan_report.part.2+0x1b0/0x460 mm/kasan/report.c:314
>  kasan_report+0x21/0x30 mm/kasan/report.c:301
>  check_memory_region_inline mm/kasan/kasan.c:326 [inline]
>  check_memory_region+0x139/0x190 mm/kasan/kasan.c:333
>  kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
>  atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
>  put_task_struct include/linux/sched/task.h:93 [inline]
>  put_ctx+0xcf/0x110 kernel/events/core.c:1131
>  perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
>  perf_release+0x37/0x50 kernel/events/core.c:4338
>  __fput+0x332/0x800 fs/file_table.c:209
>  fput+0x15/0x20 fs/file_table.c:245
>  task_work_run+0x197/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0xb38/0x29c0 kernel/exit.c:880
>  do_group_exit+0x149/0x420 kernel/exit.c:984
>  get_signal+0x7e0/0x1820 kernel/signal.c:2318
>  do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
>  exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
>  syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
>  do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x4458d9
> RSP: 002b:7f3f07187cf8 EFLAGS: 0246 ORIG_RAX: 00ca
> RAX: fe00 RBX: 007080c8 RCX: 004458d9
> RDX:  RSI:  RDI: 007080c8
> RBP: 007080a8 R08:  R09: 
> R10:  R11: 0246 R12: 
> R13:  R14: 7f3f071889c0 R15: 7f3f07188700
> Object at 880079c30140, in cache task_struct size: 5376
> Allocated:
> PID = 25681
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:616
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:555
>  kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
>  alloc_task_struct_node kernel/fork.c:153 [inline]
>  dup_task_struct kernel/fork.c:495 [inline]
>  copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
>  copy_process kernel/fork.c:1531 [inline]
>  _do_fork+0x200/0x1010 kernel/fork.c:1994
>  SYSC_clone kernel/fork.c:2104 [inline]
>  SyS_clone+0x37/0x50 kernel/fork.c:2098
>  do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>  return_from_SYSCALL_64+0x0/0x7a
> Freed:
> PID = 25681
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
>  __cache_free mm/slab.c:3514 [inline]
>  kmem_cache_free+0x71/0x240 mm/slab.c:3774
>  free_task_struct kernel/fork.c:158 [inline]
>  free_task+0x151/0x1d0 kernel/fork.c:370
>  copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
>  copy_process kernel/fork.c:1531 [inline]
>  _do_fork+0x200/0x1010 kernel/fork.c:1994
>  SYSC_clone kernel/fork.c:2104 [inline]
>  SyS_clone+0x37/0x50 kernel/fork.c:2098
>  do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>  return_from_SYSCALL_64+0x0/0x7a
> ---
>  arch/x86/include/asm/atomic.h  | 11 +++
>  arch/x86/include/asm/atomic64_64.h | 10