Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-10-02 Thread Arnd Bergmann
On Thu, Sep 28, 2017 at 4:30 PM, Arnd Bergmann  wrote:
> On Thu, Sep 28, 2017 at 6:09 AM, Andrey Ryabinin
>  wrote:
>> On 09/27/2017 04:26 PM, Arnd Bergmann wrote:
>>> On Tue, Sep 26, 2017 at 9:49 AM, Andrey Ryabinin
>>>  wrote:
>
>>> --- a/include/linux/string.h
>>> +++ b/include/linux/string.h
>>> @@ -227,7 +227,7 @@ static inline const char *kbasename(const char *path)
>>>  #define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline))
>>>  #define __RENAME(x) __asm__(#x)
>>>
>>> -void fortify_panic(const char *name) __noreturn __cold;
>>> +void fortify_panic(const char *name) __cold;
>>>  void __read_overflow(void) __compiletime_error("detected read beyond
>>> size of object passed as 1st parameter");
>>>  void __read_overflow2(void) __compiletime_error("detected read beyond
>>> size of object passed as 2nd parameter");
>>>  void __read_overflow3(void) __compiletime_error("detected read beyond
>>> size of object passed as 3rd parameter");
>>>
>>> I don't immediately see why the __noreturn changes the behavior here, any 
>>> idea?
>>>
>>
>>
>> At first I thought that this might somehow be related to
>> __asan_handle_no_return(), which GCC calls before a noreturn function.
>> So I made a patch to remove generation of these calls (we don't need
>> them in the kernel anyway), but it didn't help. It must be something
>> else then.
>
> I made a reduced test case yesterday (see http://paste.ubuntu.com/25628030/),
> and it shows the same behavior with and without the sanitizer: it uses 128
> bytes without the noreturn attribute and 480 bytes when it's added, and the
> sanitizer adds a factor of 1.5x on top. It's possible that I did something
> wrong while reducing, since the original driver file uses very little stack
> (a few hundred bytes) without -fsanitize=kernel-address, but finding out
> what happens in the reduced case may still help understand the other one.

This is now GCC PR82365, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365

I've come up with a workaround, but I'm not sure whether it is any better than
the alternatives. I will send the patch as a follow-up in a bit.

 Arnd


Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-28 Thread Arnd Bergmann
On Thu, Sep 28, 2017 at 6:09 AM, Andrey Ryabinin
 wrote:
> On 09/27/2017 04:26 PM, Arnd Bergmann wrote:
>> On Tue, Sep 26, 2017 at 9:49 AM, Andrey Ryabinin
>>  wrote:

>> --- a/include/linux/string.h
>> +++ b/include/linux/string.h
>> @@ -227,7 +227,7 @@ static inline const char *kbasename(const char *path)
>>  #define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline))
>>  #define __RENAME(x) __asm__(#x)
>>
>> -void fortify_panic(const char *name) __noreturn __cold;
>> +void fortify_panic(const char *name) __cold;
>>  void __read_overflow(void) __compiletime_error("detected read beyond
>> size of object passed as 1st parameter");
>>  void __read_overflow2(void) __compiletime_error("detected read beyond
>> size of object passed as 2nd parameter");
>>  void __read_overflow3(void) __compiletime_error("detected read beyond
>> size of object passed as 3rd parameter");
>>
>> I don't immediately see why the __noreturn changes the behavior here, any 
>> idea?
>>
>
>
> At first I thought that this might somehow be related to
> __asan_handle_no_return(), which GCC calls before a noreturn function.
> So I made a patch to remove generation of these calls (we don't need
> them in the kernel anyway), but it didn't help. It must be something
> else then.

I made a reduced test case yesterday (see http://paste.ubuntu.com/25628030/),
and it shows the same behavior with and without the sanitizer: it uses 128
bytes without the noreturn attribute and 480 bytes when it's added, and the
sanitizer adds a factor of 1.5x on top. It's possible that I did something
wrong while reducing, since the original driver file uses very little stack
(a few hundred bytes) without -fsanitize=kernel-address, but finding out
what happens in the reduced case may still help understand the other one.
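
For readers without access to the paste, the general shape of such a reduced
case can be sketched as below. This is not the content of the paste:
demo_info, demo_panic, demo_use, demo_copy and demo_init are made-up names,
and the 128-byte struct size is an arbitrary assumption. Whether the frame of
demo_init() actually grows depends on the compiler version and options;
building with GCC's -fstack-usage, once as is and once with
-DDEMO_NO_NORETURN, is one way to compare.

```c
#include <stdlib.h>
#include <string.h>

/* Build with -DDEMO_NO_NORETURN to drop the attribute and compare. */
#ifndef DEMO_NO_NORETURN
#define DEMO_NORETURN __attribute__((noreturn))
#else
#define DEMO_NORETURN
#endif

/* Hypothetical stand-in for struct i2c_board_info (size is assumed). */
struct demo_info {
    char type[20];
    char pad[108];
};

/* Hypothetical stand-in for fortify_panic(). */
DEMO_NORETURN __attribute__((noinline, cold)) void demo_panic(void)
{
    abort();
}

/* Out-of-line consumer so each local struct really has to exist. */
__attribute__((noinline)) int demo_use(const struct demo_info *info)
{
    return info->type[0];
}

/* Inlined copy with a conditional call to the panic helper. */
static inline void demo_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size)
        demo_panic();
    memcpy(dst, src, len + 1);
}

/* One local struct per case, as in the switch (dev->model) of the driver. */
int demo_init(int model)
{
    int ret = 0;

    switch (model) {
    case 0: {
        struct demo_info info = { 0 };

        demo_copy(info.type, sizeof(info.type), "si2168");
        ret = demo_use(&info);
        break;
    }
    case 1: {
        struct demo_info info = { 0 };

        demo_copy(info.type, sizeof(info.type), "tda10071");
        ret = demo_use(&info);
        break;
    }
    }
    return ret;
}
```

Ideally the two info variables would share one stack slot, since their
scopes do not overlap; the question in this thread is why the noreturn
attribute appears to interfere with that.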

Arnd


Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-28 Thread Andrey Ryabinin
On 09/27/2017 04:26 PM, Arnd Bergmann wrote:
> On Tue, Sep 26, 2017 at 9:49 AM, Andrey Ryabinin
>  wrote:
>>
>>
>> On 09/26/2017 09:47 AM, Arnd Bergmann wrote:
>>> On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann  wrote:
> 
>>> +   ret = __builtin_strlen(q);
>>
>>
>> I think this is not correct. The fortified strlen is called here on purpose.
>> If sizeof q is known at compile time and 'q' is not NUL-terminated, the
>> fortified strlen() will panic.
> 
> Ok, got it.
> 
>>> if (size) {
>>> size_t len = (ret >= size) ? size - 1 : ret;
>>> if (__builtin_constant_p(len) && len >= p_size)
>>>
>>> The problem is apparently that the fortified strlcpy calls the fortified 
>>> strlen,
>>> which in turn calls strnlen and that ends up calling the extern 
>>> '__real_strnlen'
>>> that gcc cannot reduce to a constant expression for a constant input.
>>
>>
>> Per my observation, it's the code like this:
>> if ()
>> fortify_panic(__func__);
>>
>>
>> that somehow prevents gcc from merging several "struct i2c_board_info info;"
>> into one stack slot.
>> With the hack below, stack usage is reduced to ~1.6K:
> 
> 1.6k is also what I see with my patch, or any other approach I tried
> that changes
> string.h. With the split up em28xx_dvb_init() function (and without
> changes to string.h),
> I got down to a few hundred bytes for the largest handler.
> 
>> ---
>>  include/linux/string.h | 4 
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/include/linux/string.h b/include/linux/string.h
>> index 54d21783e18d..9a96ff3ebf94 100644
>> --- a/include/linux/string.h
>> +++ b/include/linux/string.h
>> @@ -261,8 +261,6 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p)
>> if (p_size == (size_t)-1)
>> return __builtin_strlen(p);
>> ret = strnlen(p, p_size);
>> -   if (p_size <= ret)
>> -   fortify_panic(__func__);
>> return ret;
>>  }
>>
>> @@ -271,8 +269,6 @@ __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, 
>> __kernel_size_t maxlen)
>>  {
>> size_t p_size = __builtin_object_size(p, 0);
>> __kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : 
>> p_size);
>> -   if (p_size <= ret && maxlen != ret)
>> -   fortify_panic(__func__);
>> return ret;
> 
> I've reduced it further to this change:
> 
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -227,7 +227,7 @@ static inline const char *kbasename(const char *path)
>  #define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline))
>  #define __RENAME(x) __asm__(#x)
> 
> -void fortify_panic(const char *name) __noreturn __cold;
> +void fortify_panic(const char *name) __cold;
>  void __read_overflow(void) __compiletime_error("detected read beyond
> size of object passed as 1st parameter");
>  void __read_overflow2(void) __compiletime_error("detected read beyond
> size of object passed as 2nd parameter");
>  void __read_overflow3(void) __compiletime_error("detected read beyond
> size of object passed as 3rd parameter");
> 
> I don't immediately see why the __noreturn changes the behavior here, any 
> idea?
> 


At first I thought that this might somehow be related to
__asan_handle_no_return(), which GCC calls before a noreturn function.
So I made a patch to remove generation of these calls (we don't need
them in the kernel anyway), but it didn't help. It must be something
else then.


>>> Not sure if that change is the best fix, but it seems to address the 
>>> problem in
>>> this driver and probably leads to better code in other places as well.
>>>
>>
>> Probably it would be better to solve this on the strlcpy side, but I haven't 
>> found the way to do this right.
>> Alternative solutions:
>>
>>  - use memcpy() instead of strlcpy(). All source strings are smaller than 
>> I2C_NAME_SIZE, so we could
>>do something like this - memcpy(info.type, "si2168", sizeof("si2168"));
>>Also this should be faster.
> 
> This would be very similar to the patch I posted at the start of this
> thread to use strncpy(), right?

Sure.

> I was hoping that changing strlcpy() here could also improve other
> users that might run into
> the same situation, but stay below the 2048-byte stack frame limit.
> 
>>  - Moving the code under each "case:" in the switch(dev->model) into a
>> separate function should help as well.
>>    But it might be harder to backport into stable kernels.
> 
> Agreed, I posted this in earlier versions of the patch series, see
> https://patchwork.kernel.org/patch/9601025/
> 
> The new patch was a result of me trying to come up with a less
> invasive version to
> make it easier to backport, since I would like to backport the last
> patch in the series
> that depends on all the earlier ones.
> 
>  Arnd
> 


Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-27 Thread Arnd Bergmann
On Tue, Sep 26, 2017 at 9:49 AM, Andrey Ryabinin
 wrote:
>
>
> On 09/26/2017 09:47 AM, Arnd Bergmann wrote:
>> On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann  wrote:

>> +   ret = __builtin_strlen(q);
>
>
> I think this is not correct. The fortified strlen is called here on purpose.
> If sizeof q is known at compile time and 'q' is not NUL-terminated, the
> fortified strlen() will panic.

Ok, got it.

>> if (size) {
>> size_t len = (ret >= size) ? size - 1 : ret;
>> if (__builtin_constant_p(len) && len >= p_size)
>>
>> The problem is apparently that the fortified strlcpy calls the fortified 
>> strlen,
>> which in turn calls strnlen and that ends up calling the extern 
>> '__real_strnlen'
>> that gcc cannot reduce to a constant expression for a constant input.
>
>
> Per my observation, it's the code like this:
> if ()
> fortify_panic(__func__);
>
>
> that somehow prevents gcc from merging several "struct i2c_board_info info;"
> into one stack slot.
> With the hack below, stack usage is reduced to ~1.6K:

1.6k is also what I see with my patch, or any other approach I tried
that changes
string.h. With the split up em28xx_dvb_init() function (and without
changes to string.h),
I got down to a few hundred bytes for the largest handler.

> ---
>  include/linux/string.h | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/include/linux/string.h b/include/linux/string.h
> index 54d21783e18d..9a96ff3ebf94 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -261,8 +261,6 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p)
> if (p_size == (size_t)-1)
> return __builtin_strlen(p);
> ret = strnlen(p, p_size);
> -   if (p_size <= ret)
> -   fortify_panic(__func__);
> return ret;
>  }
>
> @@ -271,8 +269,6 @@ __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, 
> __kernel_size_t maxlen)
>  {
> size_t p_size = __builtin_object_size(p, 0);
> __kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : 
> p_size);
> -   if (p_size <= ret && maxlen != ret)
> -   fortify_panic(__func__);
> return ret;

I've reduced it further to this change:

--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -227,7 +227,7 @@ static inline const char *kbasename(const char *path)
 #define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline))
 #define __RENAME(x) __asm__(#x)

-void fortify_panic(const char *name) __noreturn __cold;
+void fortify_panic(const char *name) __cold;
 void __read_overflow(void) __compiletime_error("detected read beyond
size of object passed as 1st parameter");
 void __read_overflow2(void) __compiletime_error("detected read beyond
size of object passed as 2nd parameter");
 void __read_overflow3(void) __compiletime_error("detected read beyond
size of object passed as 3rd parameter");

I don't immediately see why the __noreturn changes the behavior here, any idea?

>> Not sure if that change is the best fix, but it seems to address the problem 
>> in
>> this driver and probably leads to better code in other places as well.
>>
>
> Probably it would be better to solve this on the strlcpy side, but I haven't 
> found the way to do this right.
> Alternative solutions:
>
>  - use memcpy() instead of strlcpy(). All source strings are smaller than 
> I2C_NAME_SIZE, so we could
>do something like this - memcpy(info.type, "si2168", sizeof("si2168"));
>Also this should be faster.

This would be very similar to the patch I posted at the start of this
thread to use strncpy(), right?
I was hoping that changing strlcpy() here could also improve other
users that might run into
the same situation, but stay below the 2048-byte stack frame limit.

>  - Moving the code under each "case:" in the switch(dev->model) into a
> separate function should help as well.
>    But it might be harder to backport into stable kernels.

Agreed, I posted this in earlier versions of the patch series, see
https://patchwork.kernel.org/patch/9601025/

The new patch was a result of me trying to come up with a less
invasive version to
make it easier to backport, since I would like to backport the last
patch in the series
that depends on all the earlier ones.

 Arnd


Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-26 Thread Andrey Ryabinin


On 09/26/2017 09:47 AM, Arnd Bergmann wrote:
> On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann  wrote:
>> On Mon, Sep 25, 2017 at 7:41 AM, David Laight  
>> wrote:
>>> From: Arnd Bergmann
>>>> Sent: 22 September 2017 22:29
>>> ...
>>>> It seems that this is triggered in part by using strlcpy(), which the
>>>> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy
>>>> is not part of the C standard.
>>>
>>> Neither is strncpy().
>>>
>>> It'll almost certainly be a marker in a header file somewhere,
>>> so it should be possible to teach it about other functions.
>>
>> I'm currently travelling and haven't investigated in detail, but from
>> taking a closer look here, I found that the hardened 'strlcpy()'
>> in include/linux/string.h triggers it. There is also a hardened
>> (much shorter) 'strncpy()' that doesn't trigger it in the same file,
>> and having only the extern declaration of strncpy also doesn't.
> 
> And a little more experimenting leads to this simple patch that fixes
> the problem:
> 
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -254,7 +254,7 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const
> char *q, size_t size)
> size_t q_size = __builtin_object_size(q, 0);
> if (p_size == (size_t)-1 && q_size == (size_t)-1)
> return __real_strlcpy(p, q, size);
> -   ret = strlen(q);
> +   ret = __builtin_strlen(q);


I think this is not correct. The fortified strlen is called here on purpose.
If sizeof q is known at compile time and 'q' is not NUL-terminated, the
fortified strlen() will panic.
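
A userspace sketch of the check being described, under stated assumptions:
demo_checked_strlen and demo_overflow_seen are hypothetical names, the
src_size argument stands in for __builtin_object_size(q, 0), and a flag is
set where the kernel would call fortify_panic().

```c
#include <stddef.h>

/* Set where the kernel helper would call fortify_panic(). */
static int demo_overflow_seen;

/*
 * Bounded length check. src_size plays the role of
 * __builtin_object_size(q, 0) in the fortified helpers.
 */
static size_t demo_checked_strlen(const char *q, size_t src_size)
{
    size_t ret = 0;

    /* Bounded scan, equivalent to strnlen(q, src_size). */
    while (ret < src_size && q[ret] != '\0')
        ret++;

    /* No NUL found inside the object: this is the overflow case. */
    if (src_size <= ret)
        demo_overflow_seen = 1;
    return ret;
}
```

With a 4-byte buffer holding 'a', 'b', 'c', 'd' and no terminator, the scan
stops at 4 and the flag is set. A plain __builtin_strlen() would keep
reading past the end of the object instead.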


> if (size) {
> size_t len = (ret >= size) ? size - 1 : ret;
> if (__builtin_constant_p(len) && len >= p_size)
> 
> The problem is apparently that the fortified strlcpy calls the fortified 
> strlen,
> which in turn calls strnlen and that ends up calling the extern 
> '__real_strnlen'
> that gcc cannot reduce to a constant expression for a constant input.


Per my observation, it's the code like this:
if () 
fortify_panic(__func__);


that somehow prevents gcc from merging several "struct i2c_board_info info;"
into one stack slot.
With the hack below, stack usage is reduced to ~1.6K:

---
 include/linux/string.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/include/linux/string.h b/include/linux/string.h
index 54d21783e18d..9a96ff3ebf94 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -261,8 +261,6 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p)
if (p_size == (size_t)-1)
return __builtin_strlen(p);
ret = strnlen(p, p_size);
-   if (p_size <= ret)
-   fortify_panic(__func__);
return ret;
 }
 
@@ -271,8 +269,6 @@ __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, 
__kernel_size_t maxlen)
 {
size_t p_size = __builtin_object_size(p, 0);
__kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : 
p_size);
-   if (p_size <= ret && maxlen != ret)
-   fortify_panic(__func__);
return ret;
 }




> Not sure if that change is the best fix, but it seems to address the problem 
> in
> this driver and probably leads to better code in other places as well.
> 

Probably it would be better to solve this on the strlcpy side, but I haven't 
found the way to do this right.
Alternative solutions:

 - use memcpy() instead of strlcpy(). All source strings are smaller than 
I2C_NAME_SIZE, so we could
   do something like this - memcpy(info.type, "si2168", sizeof("si2168"));
   Also this should be faster.

 - Moving the code under each "case:" in the switch(dev->model) into a
separate function should help as well.
   But it might be harder to backport into stable kernels.
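
The memcpy() alternative from the first bullet can be sketched in userspace
as follows. The names demo_board_info, demo_set_type and DEMO_NAME_SIZE are
hypothetical; DEMO_NAME_SIZE is assumed to mirror I2C_NAME_SIZE, and the
compile-time check is an addition that is not part of the suggestion itself.

```c
#include <string.h>

#define DEMO_NAME_SIZE 20   /* assumed to mirror I2C_NAME_SIZE */

/* Hypothetical, cut-down stand-in for struct i2c_board_info. */
struct demo_board_info {
    char type[DEMO_NAME_SIZE];
};

static void demo_set_type(struct demo_board_info *info)
{
    /* sizeof("si2168") is 7: six characters plus the terminating NUL. */
    _Static_assert(sizeof("si2168") <= DEMO_NAME_SIZE, "name must fit");

    memset(info, 0, sizeof(*info));
    memcpy(info->type, "si2168", sizeof("si2168"));
}
```

Because both the source and its size are compile-time constants, no run-time
length computation is involved, unlike with strlcpy().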





Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-26 Thread Arnd Bergmann
On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann  wrote:
> On Mon, Sep 25, 2017 at 7:41 AM, David Laight  wrote:
>> From: Arnd Bergmann
>>> Sent: 22 September 2017 22:29
>> ...
>>> It seems that this is triggered in part by using strlcpy(), which the
>>> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy
>>> is not part of the C standard.
>>
>> Neither is strncpy().
>>
>> It'll almost certainly be a marker in a header file somewhere,
>> so it should be possible to teach it about other functions.
>
> I'm currently travelling and haven't investigated in detail, but from
> taking a closer look here, I found that the hardened 'strlcpy()'
> in include/linux/string.h triggers it. There is also a hardened
> (much shorter) 'strncpy()' that doesn't trigger it in the same file,
> and having only the extern declaration of strncpy also doesn't.

And a little more experimenting leads to this simple patch that fixes
the problem:

--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -254,7 +254,7 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const
char *q, size_t size)
size_t q_size = __builtin_object_size(q, 0);
if (p_size == (size_t)-1 && q_size == (size_t)-1)
return __real_strlcpy(p, q, size);
-   ret = strlen(q);
+   ret = __builtin_strlen(q);
if (size) {
size_t len = (ret >= size) ? size - 1 : ret;
if (__builtin_constant_p(len) && len >= p_size)

The problem is apparently that the fortified strlcpy calls the fortified strlen,
which in turn calls strnlen and that ends up calling the extern '__real_strnlen'
that gcc cannot reduce to a constant expression for a constant input.
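
For reference, the clamp in the hunk above follows the usual strlcpy()
contract: return the full source length, and copy at most size - 1 bytes
plus a terminator. A plain userspace sketch, without any of the fortify
checks and with the hypothetical name demo_strlcpy:

```c
#include <stddef.h>
#include <string.h>

/* Sketch of strlcpy() semantics only; not the kernel's fortified version. */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
    size_t ret = strlen(src);   /* always the full source length */

    if (size) {
        /* Leave room for the terminator when truncating. */
        size_t len = (ret >= size) ? size - 1 : ret;

        memcpy(dst, src, len);
        dst[len] = '\0';
    }
    return ret;
}
```

When src is a string literal, both ret and len could in principle be known
at compile time, which is why the way strlen() gets evaluated matters here.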

Not sure if that change is the best fix, but it seems to address the problem in
this driver and probably leads to better code in other places as well.

  Arnd


Re: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-26 Thread Arnd Bergmann
On Mon, Sep 25, 2017 at 7:41 AM, David Laight  wrote:
> From: Arnd Bergmann
>> Sent: 22 September 2017 22:29
> ...
>> It seems that this is triggered in part by using strlcpy(), which the
>> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy
>> is not part of the C standard.
>
> Neither is strncpy().
>
> It'll almost certainly be a marker in a header file somewhere,
> so it should be possible to teach it about other functions.

I'm currently travelling and haven't investigated in detail, but from
taking a closer look here, I found that the hardened 'strlcpy()'
in include/linux/string.h triggers it. There is also a hardened
(much shorter) 'strncpy()' that doesn't trigger it in the same file,
and having only the extern declaration of strncpy also doesn't.

Arnd


RE: [PATCH v4 4/9] em28xx: fix em28xx_dvb_init for KASAN

2017-09-25 Thread David Laight
From: Arnd Bergmann
> Sent: 22 September 2017 22:29
...
> It seems that this is triggered in part by using strlcpy(), which the
> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy
> is not part of the C standard.

Neither is strncpy().

It'll almost certainly be a marker in a header file somewhere,
so it should be possible to teach it about other functions.

David