Re: [PATCH] powerpc/kcsan: Properly instrument arch_spin_unlock()

2023-05-30 Thread Marco Elver
On Mon, 29 May 2023 at 17:50, Christophe Leroy
 wrote:
>
> The following boot-time error is encountered with an SMP kernel:
>
>   kcsan: improperly instrumented type=(0): arch_spin_unlock(_spinlock)
>   kcsan: improperly instrumented type=(0): spin_unlock(_spinlock)
>   kcsan: improperly instrumented type=(KCSAN_ACCESS_WRITE): 
> arch_spin_unlock(_spinlock)
>   kcsan: improperly instrumented type=(KCSAN_ACCESS_WRITE): 
> spin_unlock(_spinlock)
>   kcsan: improperly instrumented type=(KCSAN_ACCESS_WRITE | 
> KCSAN_ACCESS_COMPOUND): arch_spin_unlock(_spinlock)
>   kcsan: improperly instrumented type=(KCSAN_ACCESS_WRITE | 
> KCSAN_ACCESS_COMPOUND): spin_unlock(_spinlock)
>   kcsan: selftest: test_barrier failed
>   kcsan: selftest: 2/3 tests passed
>   Kernel panic - not syncing: selftests failed
>
> Properly instrument arch_spin_unlock() with kcsan_mb().
>
> Signed-off-by: Christophe Leroy 

Acked-by: Marco Elver 

> ---
>  arch/powerpc/include/asm/simple_spinlock.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/simple_spinlock.h 
> b/arch/powerpc/include/asm/simple_spinlock.h
> index 9dcc7e9993b9..4dd12dcb9ef8 100644
> --- a/arch/powerpc/include/asm/simple_spinlock.h
> +++ b/arch/powerpc/include/asm/simple_spinlock.h
> @@ -15,6 +15,7 @@
>   * (the type definitions are in asm/simple_spinlock_types.h)
>   */
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -126,6 +127,7 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
>
>  static inline void arch_spin_unlock(arch_spinlock_t *lock)
>  {
> +   kcsan_mb();
> __asm__ __volatile__("# arch_spin_unlock\n\t"
> PPC_RELEASE_BARRIER: : :"memory");
> lock->slock = 0;
> --
> 2.40.1
>
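For readers unfamiliar with KCSAN's barrier instrumentation, here is a minimal
sketch of the pattern the patch applies. The lock type and barrier are
placeholders (not the actual powerpc code); the point is only that a raw asm
barrier plus a plain store is invisible to KCSAN, so the unlock must call
kcsan_mb() explicitly.

/*
 * Sketch only, with a placeholder lock type and barrier: KCSAN cannot see
 * the ordering implied by a raw asm barrier followed by a plain store, so
 * the unlock path tells it explicitly that a full barrier happens here.
 */
#include <linux/kcsan-checks.h>

typedef struct { volatile unsigned int slock; } example_spinlock_t;

static inline void example_arch_spin_unlock(example_spinlock_t *lock)
{
	kcsan_mb();                             /* model the implied full barrier for KCSAN */
	__asm__ __volatile__("" ::: "memory");  /* stands in for PPC_RELEASE_BARRIER */
	lock->slock = 0;                        /* plain store that releases the lock */
}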


Re: [PATCH 0/3] Extend KCSAN to all powerpc

2023-05-12 Thread Marco Elver
On Fri, 12 May 2023 at 17:31, Christophe Leroy
 wrote:
>
> This series enables KCSAN on all powerpc.
>
> To do this, a fix is required to KCSAN core.
>
> Once that fix is done, the stubs can also be removed from xtensa.
>
> It would be nice if patch 1 could go in v6.4 as a fix, then patches 2 and 3
> could be handled separately in each architecture in next cycle.
>
> Christophe Leroy (2):
>   kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures
>   xtensa: Remove 64 bits atomic builtins stubs
>
> Rohan McLure (1):
>   powerpc/{32,book3e}: kcsan: Extend KCSAN Support

Acked-by: Marco Elver 


Re: [PATCH 1/3] kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures

2023-05-12 Thread Marco Elver
On Fri, 12 May 2023 at 17:31, Christophe Leroy
 wrote:
>
> Activating KCSAN on a 32-bit architecture leads to the following
> link-time failure:
>
> LD  .tmp_vmlinux.kallsyms1
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function `__tsan_atomic64_load':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_load_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_store':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_store_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_exchange':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_exchange_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_fetch_add':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_fetch_add_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_fetch_sub':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_fetch_sub_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_fetch_and':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_fetch_and_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_fetch_or':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_fetch_or_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_fetch_xor':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_fetch_xor_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_fetch_nand':
>   kernel/kcsan/core.c:1273: undefined reference to `__atomic_fetch_nand_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_compare_exchange_strong':
>   kernel/kcsan/core.c:1273: undefined reference to 
> `__atomic_compare_exchange_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_compare_exchange_weak':
>   kernel/kcsan/core.c:1273: undefined reference to 
> `__atomic_compare_exchange_8'
>   powerpc64-linux-ld: kernel/kcsan/core.o: in function 
> `__tsan_atomic64_compare_exchange_val':
>   kernel/kcsan/core.c:1273: undefined reference to 
> `__atomic_compare_exchange_8'
>
> 32-bit architectures don't have 64-bit atomic builtins. Only
> include DEFINE_TSAN_ATOMIC_OPS(64) on 64-bit architectures.
>
> Fixes: 0f8ad5f2e934 ("kcsan: Add support for atomic builtins")
> Suggested-by: Marco Elver 
> Signed-off-by: Christophe Leroy 

Reviewed-by: Marco Elver 

Do you have your own tree to take this through with the other patches?

> ---
>  kernel/kcsan/core.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
> index 5a60cc52adc0..8a7baf4e332e 100644
> --- a/kernel/kcsan/core.c
> +++ b/kernel/kcsan/core.c
> @@ -1270,7 +1270,9 @@ static __always_inline void 
> kcsan_atomic_builtin_memorder(int memorder)
>  DEFINE_TSAN_ATOMIC_OPS(8);
>  DEFINE_TSAN_ATOMIC_OPS(16);
>  DEFINE_TSAN_ATOMIC_OPS(32);
> +#ifdef CONFIG_64BIT
>  DEFINE_TSAN_ATOMIC_OPS(64);
> +#endif
>
>  void __tsan_atomic_thread_fence(int memorder);
>  void __tsan_atomic_thread_fence(int memorder)
> --
> 2.40.1
>
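As an aside, the undefined references above can be reproduced with a trivial
test case outside the kernel; this snippet is illustrative only (the compiler
invocation is an assumption, not from the thread):

/*
 * Compile for a 32-bit target (e.g. "powerpc-linux-gnu-gcc -m32 -O2 -c t.c")
 * and try to link: the 8-byte atomic is lowered to a libatomic call
 * (__atomic_load_8) that the kernel does not provide, which is exactly the
 * failure quoted in the commit message above.
 */
unsigned long long load64_seq_cst(const unsigned long long *p)
{
	return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}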


Re: [PATCH] mm: kfence: Fix false positives on big endian

2023-05-05 Thread Marco Elver
On Fri, 5 May 2023 at 05:51, Michael Ellerman  wrote:
>
> Since commit 1ba3cbf3ec3b ("mm: kfence: improve the performance of
> __kfence_alloc() and __kfence_free()"), kfence reports failures in
> random places at boot on big endian machines.
>
> The problem is that the new KFENCE_CANARY_PATTERN_U64 encodes the
> address of each byte in its value, so it needs to be byte swapped on big
> endian machines.
>
> The compiler is smart enough to do the le64_to_cpu() at compile time, so
> there is no runtime overhead.
>
> Fixes: 1ba3cbf3ec3b ("mm: kfence: improve the performance of __kfence_alloc() 
> and __kfence_free()")
> Signed-off-by: Michael Ellerman 

Reviewed-by: Marco Elver 

Andrew, is the Fixes enough to make it to stable as well or do we also
need Cc: stable?

Thanks,
-- Marco

> ---
>  mm/kfence/kfence.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/kfence/kfence.h b/mm/kfence/kfence.h
> index 2aafc46a4aaf..392fb273e7bd 100644
> --- a/mm/kfence/kfence.h
> +++ b/mm/kfence/kfence.h
> @@ -29,7 +29,7 @@
>   * canary of every 8 bytes is the same. 64-bit memory can be filled and 
> checked
>   * at a time instead of byte by byte to improve performance.
>   */
> -#define KFENCE_CANARY_PATTERN_U64 ((u64)0xaaaaaaaaaaaaaaaa ^ 
> (u64)(0x0706050403020100))
> +#define KFENCE_CANARY_PATTERN_U64 ((u64)0xaaaaaaaaaaaaaaaa ^ 
> (u64)(le64_to_cpu(0x0706050403020100)))
>
>  /* Maximum stack depth for reports. */
>  #define KFENCE_STACK_DEPTH 64
> --
> 2.40.1
>
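A brief illustration of why the byte swap matters (a sketch, not the actual
mm/kfence code): the 64-bit pattern is only valid if byte i of the value, in
memory order, equals the per-byte canary 0xaa ^ i.

/*
 * Sketch: verify that each byte of the 64-bit canary pattern matches the
 * per-byte value 0xaa ^ (offset & 7). Without le64_to_cpu(), the byte order
 * of 0x0706050403020100 is reversed on big endian and every check fails,
 * which is the false positive described above.
 */
static bool canary_pattern_consistent(u64 pattern)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (((u8 *)&pattern)[i] != (0xaa ^ i))
			return false;
	}
	return true;
}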


Re: [PATCH mm] kasan, powerpc: Don't rename memintrinsics if compiler adds prefixes

2023-02-27 Thread Marco Elver
On Mon, 27 Feb 2023 at 23:16, Andrew Morton  wrote:
>
> On Mon, 27 Feb 2023 10:47:27 +0100 Marco Elver  wrote:
>
> > With appropriate compiler support [1], KASAN builds use __asan prefixed
> > memintrinsics, and KASAN no longer overrides memcpy/memset/memmove.
> >
> > If compiler support is detected (CC_HAS_KASAN_MEMINTRINSIC_PREFIX),
> > define memintrinsics normally (do not prefix '__').
> >
> > On powerpc, KASAN is the only user of __mem functions, which are used to
> > define instrumented memintrinsics. Alias the normal versions for KASAN
> > to use in its implementation.
> >
> > Link: 
> > https://lore.kernel.org/all/20230224085942.1791837-1-el...@google.com/ [1]
> > Link: 
> > https://lore.kernel.org/oe-kbuild-all/202302271348.u5lvmo0s-...@intel.com/
> > Reported-by: kernel test robot 
> > Signed-off-by: Marco Elver 
>
> Seems this is a fix against "kasan: treat meminstrinsic as builtins in
> uninstrumented files", so I'll plan to fold this patch into that patch.

Yes, that looks right.

If a powerpc maintainer could take a quick look as well, that would be good.
The maze of memcpy/memmove/memset definitions and redefinitions isn't
the simplest - I hope in a few years we can delete all the old code
(before CC_HAS_KASAN_MEMINTRINSIC_PREFIX), and let the compilers just
"do the right thing".

Thanks,
-- Marco
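To illustrate what the prefix support changes (assumed behaviour under
CC_HAS_KASAN_MEMINTRINSIC_PREFIX, not code from the patch): an instrumented
translation unit keeps calling memcpy() in the source, and the compiler emits
the __asan-prefixed intrinsic, so the kernel no longer has to rename
memcpy/memset/memmove in order to intercept them.

/*
 * Illustration only: with -fsanitize=kernel-address and memintrinsic-prefix
 * support, the call below is emitted as a call to the compiler's prefixed
 * intrinsic (e.g. __asan_memcpy); without that support, KASAN instead has to
 * override memcpy() itself, which is the old renaming scheme the patch keeps
 * only as a fallback.
 */
#include <linux/string.h>

void *copy_buffer(void *dst, const void *src, unsigned long n)
{
	return memcpy(dst, src, n);
}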


[PATCH mm] kasan, powerpc: Don't rename memintrinsics if compiler adds prefixes

2023-02-27 Thread Marco Elver
With appropriate compiler support [1], KASAN builds use __asan prefixed
memintrinsics, and KASAN no longer overrides memcpy/memset/memmove.

If compiler support is detected (CC_HAS_KASAN_MEMINTRINSIC_PREFIX),
define memintrinsics normally (do not prefix '__').

On powerpc, KASAN is the only user of __mem functions, which are used to
define instrumented memintrinsics. Alias the normal versions for KASAN
to use in its implementation.

Link: https://lore.kernel.org/all/20230224085942.1791837-1-el...@google.com/ [1]
Link: https://lore.kernel.org/oe-kbuild-all/202302271348.u5lvmo0s-...@intel.com/
Reported-by: kernel test robot 
Signed-off-by: Marco Elver 
---
 arch/powerpc/include/asm/kasan.h   |  2 +-
 arch/powerpc/include/asm/string.h  | 15 +++
 arch/powerpc/kernel/prom_init_check.sh |  9 +++--
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index 92a968202ba7..365d2720097c 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -2,7 +2,7 @@
 #ifndef __ASM_KASAN_H
 #define __ASM_KASAN_H
 
-#ifdef CONFIG_KASAN
+#if defined(CONFIG_KASAN) && !defined(CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX)
 #define _GLOBAL_KASAN(fn)  _GLOBAL(__##fn)
 #define _GLOBAL_TOC_KASAN(fn)  _GLOBAL_TOC(__##fn)
 #define EXPORT_SYMBOL_KASAN(fn)EXPORT_SYMBOL(__##fn)
diff --git a/arch/powerpc/include/asm/string.h 
b/arch/powerpc/include/asm/string.h
index 2aa0e31e6884..60ba22770f51 100644
--- a/arch/powerpc/include/asm/string.h
+++ b/arch/powerpc/include/asm/string.h
@@ -30,11 +30,17 @@ extern int memcmp(const void *,const void 
*,__kernel_size_t);
 extern void * memchr(const void *,int,__kernel_size_t);
 void memcpy_flushcache(void *dest, const void *src, size_t size);
 
+#ifdef CONFIG_KASAN
> +/* __mem variants are used by KASAN to implement instrumented memintrinsics. 
*/
+#ifdef CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX
+#define __memset memset
+#define __memcpy memcpy
+#define __memmove memmove
+#else /* CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX */
 void *__memset(void *s, int c, __kernel_size_t count);
 void *__memcpy(void *to, const void *from, __kernel_size_t n);
 void *__memmove(void *to, const void *from, __kernel_size_t n);
-
-#if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
+#ifndef __SANITIZE_ADDRESS__
 /*
  * For files that are not instrumented (e.g. mm/slub.c) we
  * should use not instrumented version of mem* functions.
@@ -46,8 +52,9 @@ void *__memmove(void *to, const void *from, __kernel_size_t 
n);
 #ifndef __NO_FORTIFY
 #define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
 #endif
-
-#endif
+#endif /* !__SANITIZE_ADDRESS__ */
+#endif /* CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX */
+#endif /* CONFIG_KASAN */
 
 #ifdef CONFIG_PPC64
 #ifndef CONFIG_KASAN
diff --git a/arch/powerpc/kernel/prom_init_check.sh 
b/arch/powerpc/kernel/prom_init_check.sh
index 311890d71c4c..f3f43a8f48cf 100644
--- a/arch/powerpc/kernel/prom_init_check.sh
+++ b/arch/powerpc/kernel/prom_init_check.sh
@@ -13,8 +13,13 @@
 # If you really need to reference something from prom_init.o add
 # it to the list below:
 
-grep "^CONFIG_KASAN=y$" ${KCONFIG_CONFIG} >/dev/null
-if [ $? -eq 0 ]
+has_renamed_memintrinsics()
+{
+   grep -q "^CONFIG_KASAN=y$" ${KCONFIG_CONFIG} && \
+   ! grep -q "^CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y" 
${KCONFIG_CONFIG}
+}
+
+if has_renamed_memintrinsics
 then
MEM_FUNCS="__memcpy __memset"
 else
-- 
2.39.2.637.g21b0678d19-goog



Re: [PATCH 1/2] kcsan: xtensa: Add atomic builtin stubs for 32-bit systems

2023-02-16 Thread Marco Elver
On Thu, Feb 16, 2023 at 07:12AM +, Christophe Leroy wrote:
> 
> 
> Le 16/02/2023 à 06:09, Rohan McLure a écrit :
> > KCSAN instruments calls to atomic builtins, and will in turn call these
> > builtins itself. As such, architectures supporting KCSAN must have
> > compiler support for these atomic primitives.
> > 
> > Since 32-bit systems are unlikely to have 64-bit compiler builtins,
> > provide a stub for each missing builtin, and use BUG() to assert
> > unreachability.
> > 
> > In commit 725aea873261 ("xtensa: enable KCSAN"), xtensa implements these
> > locally. Move these definitions to be accessible to all 32-bit
> > architectures that do not provide the necessary builtins, with opt in
> > for PowerPC and xtensa.
> > 
> > Signed-off-by: Rohan McLure 
> > Reviewed-by: Max Filippov 
> 
> This series should also be addressed to the KCSAN maintainers, shouldn't it?
> 
> KCSAN
> M:Marco Elver 
> R:Dmitry Vyukov 
> L:kasan-...@googlegroups.com
> S:Maintained
> F:Documentation/dev-tools/kcsan.rst
> F:include/linux/kcsan*.h
> F:kernel/kcsan/
> F:lib/Kconfig.kcsan
> F:scripts/Makefile.kcsan
> 
> 
> > ---
> > Previously issued as a part of a patch series adding KCSAN support to
> > 64-bit.
> > Link: 
> > https://lore.kernel.org/linuxppc-dev/167646486000.1421441.10070059569986228558.b4...@ellerman.id.au/T/#t
> > v1: Remove __has_builtin check, as gcc is not obligated to inline
> > builtins detected using this check, but instead is permitted to supply
> > them in libatomic:
> > Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108734
> > Instead, opt-in PPC32 and xtensa.
> > ---
> >   arch/xtensa/lib/Makefile  | 1 -
> >   kernel/kcsan/Makefile | 2 ++
> >   arch/xtensa/lib/kcsan-stubs.c => kernel/kcsan/stubs.c | 0
> >   3 files changed, 2 insertions(+), 1 deletion(-)
> >   rename arch/xtensa/lib/kcsan-stubs.c => kernel/kcsan/stubs.c (100%)
> > 
> > diff --git a/arch/xtensa/lib/Makefile b/arch/xtensa/lib/Makefile
> > index 7ecef0519a27..d69356dc97df 100644
> > --- a/arch/xtensa/lib/Makefile
> > +++ b/arch/xtensa/lib/Makefile
> > @@ -8,5 +8,4 @@ lib-y   += memcopy.o memset.o checksum.o \
> >divsi3.o udivsi3.o modsi3.o umodsi3.o mulsi3.o umulsidi3.o \
> >usercopy.o strncpy_user.o strnlen_user.o
> >   lib-$(CONFIG_PCI) += pci-auto.o
> > -lib-$(CONFIG_KCSAN) += kcsan-stubs.o
> >   KCSAN_SANITIZE_kcsan-stubs.o := n
> > diff --git a/kernel/kcsan/Makefile b/kernel/kcsan/Makefile
> > index 8cf70f068d92..86dd713d8855 100644
> > --- a/kernel/kcsan/Makefile
> > +++ b/kernel/kcsan/Makefile
> > @@ -12,6 +12,8 @@ CFLAGS_core.o := $(call cc-option,-fno-conserve-stack) \
> > -fno-stack-protector -DDISABLE_BRANCH_PROFILING
> >   
> >   obj-y := core.o debugfs.o report.o
> > +obj-$(CONFIG_PPC32) += stubs.o
> > +obj-$(CONFIG_XTENSA) += stubs.o
> 
> Not sure it is acceptable to do it that way.
> 
> There should likely be something like a CONFIG_ARCH_WANTS_KCSAN_STUBS in 
> KCSAN's Kconfig then PPC32 and XTENSA should select it.

The longer I think about it, since these stubs all BUG() anyway, perhaps
we ought to just avoid them altogether. If you delete all the stubs from
ppc and xtensa, but do this:

 | diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
 | index 54d077e1a2dc..8169d6dadd0e 100644
 | --- a/kernel/kcsan/core.c
 | +++ b/kernel/kcsan/core.c
 | @@ -1261,7 +1261,9 @@ static __always_inline void 
kcsan_atomic_builtin_memorder(int memorder)
 |  DEFINE_TSAN_ATOMIC_OPS(8);
 |  DEFINE_TSAN_ATOMIC_OPS(16);
 |  DEFINE_TSAN_ATOMIC_OPS(32);
 | +#ifdef CONFIG_64BIT
 |  DEFINE_TSAN_ATOMIC_OPS(64);
 | +#endif
 |  
 |  void __tsan_atomic_thread_fence(int memorder);
 |  void __tsan_atomic_thread_fence(int memorder)

Does that work?
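For context, the stubs being discussed look roughly like the sketch below
(modelled on arch/xtensa/lib/kcsan-stubs.c): each unsupported 64-bit builtin
simply BUG()s, since nothing on these 32-bit configurations should ever reach
it.

#include <linux/bug.h>
#include <linux/types.h>

/* Two representative stubs; the real file covers all 64-bit builtins. */
void __atomic_store_8(volatile void *p, u64 v, int memorder)
{
	BUG();
}

u64 __atomic_load_8(const volatile void *p, int memorder)
{
	BUG();
}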


Re: [PATCH v4 7/7] powerpc: kcsan: Add KCSAN Support

2023-02-08 Thread Marco Elver
On Wed, 8 Feb 2023 at 04:23, Rohan McLure  wrote:
>
> Enable HAVE_ARCH_KCSAN on all powerpc platforms, permitting use of the
> kernel concurrency sanitiser through the CONFIG_KCSAN_* kconfig options.
> KCSAN requires the compiler's __atomic_* builtins for 64-bit values, and so
> support is only reported on PPC64.

I thought the stubs solve that?

In general the whole builtin atomic support is only there to support
some odd corner cases today (Clang GCOV generates builtin atomic ops,
occasionally some driver sneaks in a builtin atomic although it's
technically strongly discouraged, and probably Rust stuff in future).
So I'd like to keep them, although they shouldn't cause issues.

> See documentation in Documentation/dev-tools/kcsan.rst for more
> information.

Do the tests pass?

> Signed-off-by: Rohan McLure 
> ---
> v3: Restrict support to 64-bit, as TSAN expects 64-bit __atomic_* compiler
> built-ins.
> ---
>  arch/powerpc/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b8c4ac56bddc..55bc2d724c73 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -198,6 +198,7 @@ config PPC
> select HAVE_ARCH_KASAN  if PPC_RADIX_MMU
> select HAVE_ARCH_KASAN  if PPC_BOOK3E_64
> select HAVE_ARCH_KASAN_VMALLOC  if HAVE_ARCH_KASAN
> +   select HAVE_ARCH_KCSAN  if PPC64
> select HAVE_ARCH_KFENCE if 
> ARCH_SUPPORTS_DEBUG_PAGEALLOC
> select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> select HAVE_ARCH_KGDB
> --
> 2.37.2
>


Re: [PATCH v3 1/7] kernel/fork: convert vma assignment to a memcpy

2023-02-07 Thread Marco Elver
On Tue, 7 Feb 2023 at 18:24, Suren Baghdasaryan  wrote:
>
> On Tue, Feb 7, 2023 at 9:16 AM Marco Elver  wrote:
> >
> > On Thu, Jan 26, 2023 at 09:27AM -0800, Paul E. McKenney wrote:
> > > On Wed, Jan 25, 2023 at 05:34:49PM -0800, Andrew Morton wrote:
> > > > On Wed, 25 Jan 2023 16:50:01 -0800 Suren Baghdasaryan 
> > > >  wrote:
> > > >
> > > > > On Wed, Jan 25, 2023 at 4:22 PM Andrew Morton 
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, 25 Jan 2023 15:35:48 -0800 Suren Baghdasaryan 
> > > > > >  wrote:
> > > > > >
> > > > > > > Convert vma assignment in vm_area_dup() to a memcpy() to prevent 
> > > > > > > compiler
> > > > > > > errors when we add a const modifier to vma->vm_flags.
> > > > > > >
> > > > > > > ...
> > > > > > >
> > > > > > > --- a/kernel/fork.c
> > > > > > > +++ b/kernel/fork.c
> > > > > > > @@ -482,7 +482,7 @@ struct vm_area_struct *vm_area_dup(struct 
> > > > > > > vm_area_struct *orig)
> > > > > > >* orig->shared.rb may be modified concurrently, 
> > > > > > > but the clone
> > > > > > >* will be reinitialized.
> > > > > > >*/
> > > > > > > - *new = data_race(*orig);
> > > > > > > + memcpy(new, orig, sizeof(*new));
> > > > > >
> > > > > > The data_race() removal is unchangelogged?
> > > > >
> > > > > True. I'll add a note in the changelog about that. Ideally I would
> > > > > like to preserve it but I could not find a way to do that.
> > > >
> > > > Perhaps Paul can comment?
> > > >
> > > > I wonder if KCSAN knows how to detect this race, given that it's now in
> > > > a memcpy.  I assume so.
> > >
> > > I ran an experiment memcpy()ing between a static array and an onstack
> > > array, and KCSAN did not complain.  But maybe I was setting it up wrong.
> > >
> > > This is what I did:
> > >
> > >   long myid = (long)arg; /* different value for each task */
> > >   static unsigned long z1[10] = { 0 };
> > >   unsigned long z2[10];
> > >
> > >   ...
> > >
> > >   memcpy(z1, z2, ARRAY_SIZE(z1) * sizeof(z1[0]));
> > >   for (zi = 0; zi < ARRAY_SIZE(z1); zi++)
> > >   z2[zi] += myid;
> > >   memcpy(z2, z1, ARRAY_SIZE(z1) * sizeof(z1[0]));
> > >
> > > Adding Marco on CC for his thoughts.
> >
> > ( Sorry for not seeing it earlier - just saw this by chance. )
> >
> > memcpy() data races will be detected as of (given a relatively recent
> > Clang compiler):
> >
> >   
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7c201739beef
> >
> > Also beware that the compiler is free to "optimize" things by either
> > inlining memcpy() (turning an explicit memcpy() into just a bunch of
> > loads/stores), or outline plain assignments into memcpy() calls. So the
> > only way to be sure what ends up there is to look at the disassembled
> > code.
> >
> > The data_race() was introduced by:
> >
> >   
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cda099b37d716
> >
> > It says:
> >  "vm_area_dup() blindly copies all fields of original VMA to the new one.
> >   This includes copying vm_area_struct::shared.rb which is normally
> >   protected by i_mmap_lock. But this is fine because the read value will
> >   be overwritten on the following __vma_link_file() under proper
> >   protection. Thus, mark it as an intentional data race and insert a few
> >   assertions for the fields that should not be modified concurrently."
> >
> > And as far as I can tell this hasn't changed.
>
> Thanks for the feedback, Marco!
> So, IIUC Mel's proposal to use data_race(memcpy(new, orig,
> sizeof(*new))); is fine in this case, right?

Yes, that'd work.
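A sketch of the agreed-upon pattern (the allocation details are illustrative,
not the exact kernel/fork.c code):

/*
 * Sketch: data_race() marks the copy as an intentional data race for KCSAN
 * (orig->shared.rb may be written concurrently but is reinitialized later),
 * while using memcpy() instead of a struct assignment avoids compiler errors
 * once vm_flags gains a const modifier.
 */
struct vm_area_struct *example_vm_area_dup(struct vm_area_struct *orig)
{
	struct vm_area_struct *new = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);

	if (new)
		data_race(memcpy(new, orig, sizeof(*new)));
	return new;
}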


Re: [PATCH v3 1/7] kernel/fork: convert vma assignment to a memcpy

2023-02-07 Thread Marco Elver
On Thu, Jan 26, 2023 at 09:27AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 05:34:49PM -0800, Andrew Morton wrote:
> > On Wed, 25 Jan 2023 16:50:01 -0800 Suren Baghdasaryan  
> > wrote:
> > 
> > > On Wed, Jan 25, 2023 at 4:22 PM Andrew Morton  
> > > wrote:
> > > >
> > > > On Wed, 25 Jan 2023 15:35:48 -0800 Suren Baghdasaryan 
> > > >  wrote:
> > > >
> > > > > Convert vma assignment in vm_area_dup() to a memcpy() to prevent 
> > > > > compiler
> > > > > errors when we add a const modifier to vma->vm_flags.
> > > > >
> > > > > ...
> > > > >
> > > > > --- a/kernel/fork.c
> > > > > +++ b/kernel/fork.c
> > > > > @@ -482,7 +482,7 @@ struct vm_area_struct *vm_area_dup(struct 
> > > > > vm_area_struct *orig)
> > > > >* orig->shared.rb may be modified concurrently, but 
> > > > > the clone
> > > > >* will be reinitialized.
> > > > >*/
> > > > > - *new = data_race(*orig);
> > > > > + memcpy(new, orig, sizeof(*new));
> > > >
> > > > The data_race() removal is unchangelogged?
> > > 
> > > True. I'll add a note in the changelog about that. Ideally I would
> > > like to preserve it but I could not find a way to do that.
> > 
> > Perhaps Paul can comment?
> > 
> > I wonder if KCSAN knows how to detect this race, given that it's now in
> > a memcpy.  I assume so.
> 
> I ran an experiment memcpy()ing between a static array and an onstack
> array, and KCSAN did not complain.  But maybe I was setting it up wrong.
> 
> This is what I did:
> 
>   long myid = (long)arg; /* different value for each task */
>   static unsigned long z1[10] = { 0 };
>   unsigned long z2[10];
> 
>   ...
> 
>   memcpy(z1, z2, ARRAY_SIZE(z1) * sizeof(z1[0]));
>   for (zi = 0; zi < ARRAY_SIZE(z1); zi++)
>   z2[zi] += myid;
>   memcpy(z2, z1, ARRAY_SIZE(z1) * sizeof(z1[0]));
> 
> Adding Marco on CC for his thoughts.

( Sorry for not seeing it earlier - just saw this by chance. )

memcpy() data races will be detected as of (given a relatively recent
Clang compiler):

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7c201739beef

Also beware that the compiler is free to "optimize" things by either
inlining memcpy() (turning an explicit memcpy() into just a bunch of
loads/stores), or outline plain assignments into memcpy() calls. So the
only way to be sure what ends up there is to look at the disassembled
code.

The data_race() was introduced by:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cda099b37d716

It says:
 "vm_area_dup() blindly copies all fields of original VMA to the new one.
  This includes copying vm_area_struct::shared.rb which is normally
  protected by i_mmap_lock. But this is fine because the read value will
  be overwritten on the following __vma_link_file() under proper
  protection. Thus, mark it as an intentional data race and insert a few
  assertions for the fields that should not be modified concurrently."

And as far as I can tell this hasn't changed.

Thanks,
-- Marco


Re: [PATCH AUTOSEL 5.19 05/29] powerpc/hw_breakpoint: Avoid relying on caller synchronization

2022-10-17 Thread Marco Elver
On Mon, 17 Oct 2022 at 17:08, Sasha Levin  wrote:
>
> From: Marco Elver 
>
> [ Upstream commit f95e5a3d59011eec1257d0e76de1e1f8969d426f ]
>
> Internal data structures (cpu_bps, task_bps) of powerpc's hw_breakpoint
> implementation have relied on nr_bp_mutex serializing access to them.
>
> Before overhauling synchronization of kernel/events/hw_breakpoint.c,
> introduce 2 spinlocks to synchronize cpu_bps and task_bps respectively,
> thus avoiding reliance on callers synchronizing powerpc's hw_breakpoint.
>
> Reported-by: Dmitry Vyukov 
> Signed-off-by: Marco Elver 
> Signed-off-by: Peter Zijlstra (Intel) 
> Acked-by: Dmitry Vyukov 
> Acked-by: Ian Rogers 
> Link: https://lore.kernel.org/r/20220829124719.675715-10-el...@google.com
> Signed-off-by: Sasha Levin 

Backporting this patch seems unnecessary if we're not backporting the
hw_breakpoint overhaul.

Without the overhaul, nothing will break without this patch.

Thanks,
-- Marco

> ---
>  arch/powerpc/kernel/hw_breakpoint.c | 53 ++---
>  1 file changed, 40 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
> b/arch/powerpc/kernel/hw_breakpoint.c
> index 2669f80b3a49..8db1a15d7acb 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>
> @@ -129,7 +130,14 @@ struct breakpoint {
> bool ptrace_bp;
>  };
>
> +/*
> + * While kernel/events/hw_breakpoint.c does its own synchronization, we 
> cannot
> + * rely on it safely synchronizing internals here; however, we can rely on it
> + * not requesting more breakpoints than available.
> + */
> +static DEFINE_SPINLOCK(cpu_bps_lock);
>  static DEFINE_PER_CPU(struct breakpoint *, cpu_bps[HBP_NUM_MAX]);
> +static DEFINE_SPINLOCK(task_bps_lock);
>  static LIST_HEAD(task_bps);
>
>  static struct breakpoint *alloc_breakpoint(struct perf_event *bp)
> @@ -174,7 +182,9 @@ static int task_bps_add(struct perf_event *bp)
> if (IS_ERR(tmp))
> return PTR_ERR(tmp);
>
> +   spin_lock(&task_bps_lock);
> list_add(&tmp->list, &task_bps);
> +   spin_unlock(&task_bps_lock);
> return 0;
>  }
>
> @@ -182,6 +192,7 @@ static void task_bps_remove(struct perf_event *bp)
>  {
> struct list_head *pos, *q;
>
> +   spin_lock(&task_bps_lock);
> list_for_each_safe(pos, q, &task_bps) {
> struct breakpoint *tmp = list_entry(pos, struct breakpoint, 
> list);
>
> @@ -191,6 +202,7 @@ static void task_bps_remove(struct perf_event *bp)
> break;
> }
> }
> +   spin_unlock(&task_bps_lock);
>  }
>
>  /*
> @@ -200,12 +212,17 @@ static void task_bps_remove(struct perf_event *bp)
>  static bool all_task_bps_check(struct perf_event *bp)
>  {
> struct breakpoint *tmp;
> +   bool ret = false;
>
> +   spin_lock(&task_bps_lock);
> list_for_each_entry(tmp, &task_bps, list) {
> -   if (!can_co_exist(tmp, bp))
> -   return true;
> +   if (!can_co_exist(tmp, bp)) {
> +   ret = true;
> +   break;
> +   }
> }
> -   return false;
> +   spin_unlock(&task_bps_lock);
> +   return ret;
>  }
>
>  /*
> @@ -215,13 +232,18 @@ static bool all_task_bps_check(struct perf_event *bp)
>  static bool same_task_bps_check(struct perf_event *bp)
>  {
> struct breakpoint *tmp;
> +   bool ret = false;
>
> +   spin_lock(&task_bps_lock);
> list_for_each_entry(tmp, &task_bps, list) {
> if (tmp->bp->hw.target == bp->hw.target &&
> -   !can_co_exist(tmp, bp))
> -   return true;
> +   !can_co_exist(tmp, bp)) {
> +   ret = true;
> +   break;
> +   }
> }
> -   return false;
> +   spin_unlock(&task_bps_lock);
> +   return ret;
>  }
>
>  static int cpu_bps_add(struct perf_event *bp)
> @@ -234,6 +256,7 @@ static int cpu_bps_add(struct perf_event *bp)
> if (IS_ERR(tmp))
> return PTR_ERR(tmp);
>
> +   spin_lock(&cpu_bps_lock);
> cpu_bp = per_cpu_ptr(cpu_bps, bp->cpu);
> for (i = 0; i < nr_wp_slots(); i++) {
> if (!cpu_bp[i]) {
> @@ -241,6 +264,7 @@ static int cpu_bps_add(struct perf_event *bp)
> break;
> }
> }
> +   spin_unlock(&cpu_bps_lock);
> return 0;
>  }
>
> @@ -249,6 +273,7 @@ st

Re: [PATCH v14 4/4] kasan: use MAX_PTRS_PER_* for early shadow tables

2021-06-17 Thread Marco Elver
On Thu, 17 Jun 2021 at 08:40, Daniel Axtens  wrote:
>
> powerpc has a variable number of PTRS_PER_*, set at runtime based
> on the MMU that the kernel is booted under.
>
> This means the PTRS_PER_* are no longer constants, and therefore
> breaks the build. Switch to using MAX_PTRS_PER_*, which are constant.
>
> Suggested-by: Christophe Leroy 
> Suggested-by: Balbir Singh 
> Reviewed-by: Christophe Leroy 
> Reviewed-by: Balbir Singh 
> Signed-off-by: Daniel Axtens 

Reviewed-by: Marco Elver 


> ---
>  include/linux/kasan.h | 6 +++---
>  mm/kasan/init.c   | 6 +++---
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 768d7d342757..5310e217bd74 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -41,9 +41,9 @@ struct kunit_kasan_expectation {
>  #endif
>
>  extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
> -extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE + PTE_HWTABLE_PTRS];
> -extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
> -extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
> +extern pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE + PTE_HWTABLE_PTRS];
> +extern pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD];
> +extern pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD];
>  extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
>
>  int kasan_populate_early_shadow(const void *shadow_start,
> diff --git a/mm/kasan/init.c b/mm/kasan/init.c
> index 348f31d15a97..cc64ed6858c6 100644
> --- a/mm/kasan/init.c
> +++ b/mm/kasan/init.c
> @@ -41,7 +41,7 @@ static inline bool kasan_p4d_table(pgd_t pgd)
>  }
>  #endif
>  #if CONFIG_PGTABLE_LEVELS > 3
> -pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
> +pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD] __page_aligned_bss;
>  static inline bool kasan_pud_table(p4d_t p4d)
>  {
> return p4d_page(p4d) == 
> virt_to_page(lm_alias(kasan_early_shadow_pud));
> @@ -53,7 +53,7 @@ static inline bool kasan_pud_table(p4d_t p4d)
>  }
>  #endif
>  #if CONFIG_PGTABLE_LEVELS > 2
> -pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
> +pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD] __page_aligned_bss;
>  static inline bool kasan_pmd_table(pud_t pud)
>  {
> return pud_page(pud) == 
> virt_to_page(lm_alias(kasan_early_shadow_pmd));
> @@ -64,7 +64,7 @@ static inline bool kasan_pmd_table(pud_t pud)
> return false;
>  }
>  #endif
> -pte_t kasan_early_shadow_pte[PTRS_PER_PTE + PTE_HWTABLE_PTRS]
> +pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE + PTE_HWTABLE_PTRS]
> __page_aligned_bss;
>
>  static inline bool kasan_pte_table(pmd_t pmd)
> --
> 2.30.2
>


Re: [PATCH v14 3/4] mm: define default MAX_PTRS_PER_* in include/pgtable.h

2021-06-17 Thread Marco Elver
On Thu, 17 Jun 2021 at 08:40, Daniel Axtens  wrote:
>
> Commit c65e774fb3f6 ("x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable")
> made PTRS_PER_P4D variable on x86 and introduced MAX_PTRS_PER_P4D as a
> constant for cases which need a compile-time constant (e.g. fixed-size
> arrays).
>
> powerpc likewise has boot-time selectable MMU features which can cause
> other mm "constants" to vary. For KASAN, we have some static
> PTE/PMD/PUD/P4D arrays so we need compile-time maximums for all these
> constants. Extend the MAX_PTRS_PER_ idiom, and place default definitions
> in include/pgtable.h. These define MAX_PTRS_PER_x to be PTRS_PER_x unless
> an architecture has defined MAX_PTRS_PER_x in its arch headers.
>
> Clean up pgtable-nop4d.h and s390's MAX_PTRS_PER_P4D definitions while
> we're at it: both can just pick up the default now.
>
> Signed-off-by: Daniel Axtens 

Reviewed-by: Marco Elver 


> ---
>
> s390 was compile tested only.
> ---
>  arch/s390/include/asm/pgtable.h |  2 --
>  include/asm-generic/pgtable-nop4d.h |  1 -
>  include/linux/pgtable.h | 22 ++
>  3 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 7c66ae5d7e32..cf05954ce013 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -342,8 +342,6 @@ static inline int is_module_addr(void *addr)
>  #define PTRS_PER_P4D   _CRST_ENTRIES
>  #define PTRS_PER_PGD   _CRST_ENTRIES
>
> -#define MAX_PTRS_PER_P4D   PTRS_PER_P4D
> -
>  /*
>   * Segment table and region3 table entry encoding
>   * (R = read-only, I = invalid, y = young bit):
> diff --git a/include/asm-generic/pgtable-nop4d.h 
> b/include/asm-generic/pgtable-nop4d.h
> index ce2cbb3c380f..2f6b1befb129 100644
> --- a/include/asm-generic/pgtable-nop4d.h
> +++ b/include/asm-generic/pgtable-nop4d.h
> @@ -9,7 +9,6 @@
>  typedef struct { pgd_t pgd; } p4d_t;
>
>  #define P4D_SHIFT  PGDIR_SHIFT
> -#define MAX_PTRS_PER_P4D   1
>  #define PTRS_PER_P4D   1
>  #define P4D_SIZE   (1UL << P4D_SHIFT)
>  #define P4D_MASK   (~(P4D_SIZE-1))
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 9e6f71265f72..69700e3e615f 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1625,4 +1625,26 @@ typedef unsigned int pgtbl_mod_mask;
>  #define pte_leaf_size(x) PAGE_SIZE
>  #endif
>
> +/*
> + * Some architectures have MMUs that are configurable or selectable at boot
> + * time. These lead to variable PTRS_PER_x. For statically allocated arrays 
> it
> + * helps to have a static maximum value.
> + */
> +
> +#ifndef MAX_PTRS_PER_PTE
> +#define MAX_PTRS_PER_PTE PTRS_PER_PTE
> +#endif
> +
> +#ifndef MAX_PTRS_PER_PMD
> +#define MAX_PTRS_PER_PMD PTRS_PER_PMD
> +#endif
> +
> +#ifndef MAX_PTRS_PER_PUD
> +#define MAX_PTRS_PER_PUD PTRS_PER_PUD
> +#endif
> +
> +#ifndef MAX_PTRS_PER_P4D
> +#define MAX_PTRS_PER_P4D PTRS_PER_P4D
> +#endif
> +
>  #endif /* _LINUX_PGTABLE_H */
> --
> 2.30.2
>


Re: [PATCH v14 1/4] kasan: allow an architecture to disable inline instrumentation

2021-06-17 Thread Marco Elver
On Thu, 17 Jun 2021 at 08:40, Daniel Axtens  wrote:
>
> For annoying architectural reasons, it's very difficult to support inline
> instrumentation on powerpc64.*
>
> Add a Kconfig flag to allow an arch to disable inline. (It's a bit
> annoying to be 'backwards', but I'm not aware of any way to have
> an arch force a symbol to be 'n', rather than 'y'.)
>
> We also disable stack instrumentation in this case as it does things that
> are functionally equivalent to inline instrumentation, namely adding
> code that touches the shadow directly without going through a C helper.
>
> * on ppc64 atm, the shadow lives in virtual memory and isn't accessible in
> real mode. However, before we turn on virtual memory, we parse the device
> tree to determine which platform and MMU we're running under. That calls
> generic DT code, which is instrumented. Inline instrumentation in DT would
> unconditionally attempt to touch the shadow region, which we won't have
> set up yet, and would crash. We can make outline mode wait for the arch to
> be ready, but we can't change what the compiler inserts for inline mode.
>
> Signed-off-by: Daniel Axtens 

Reviewed-by: Marco Elver 


> ---
>  lib/Kconfig.kasan | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index cffc2ebbf185..cb5e02d09e11 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -12,6 +12,15 @@ config HAVE_ARCH_KASAN_HW_TAGS
>  config HAVE_ARCH_KASAN_VMALLOC
> bool
>
> +config ARCH_DISABLE_KASAN_INLINE
> +   bool
> +   help
> + Sometimes an architecture might not be able to support inline
> + instrumentation but might be able to support outline 
> instrumentation.
> + This option allows an architecture to prevent inline and stack
> + instrumentation from being enabled.
> +
> +
>  config CC_HAS_KASAN_GENERIC
> def_bool $(cc-option, -fsanitize=kernel-address)
>
> @@ -130,6 +139,7 @@ config KASAN_OUTLINE
>
>  config KASAN_INLINE
> bool "Inline instrumentation"
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> help
>   Compiler directly inserts code checking shadow memory before
>   memory accesses. This is faster than outline (in some workloads
> @@ -141,6 +151,7 @@ endchoice
>  config KASAN_STACK
> bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && 
> !COMPILE_TEST
> depends on KASAN_GENERIC || KASAN_SW_TAGS
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> default y if CC_IS_GCC
> help
>   The LLVM stack address sanitizer has a known problem that
> @@ -154,6 +165,9 @@ config KASAN_STACK
>   but clang users can still enable it for builds without
>   CONFIG_COMPILE_TEST.  On gcc it is assumed to always be safe
>   to use and enabled by default.
> + If the architecture disables inline instrumentation, this is
> + also disabled as it adds inline-style instrumentation that
> + is run unconditionally.
>
>  config KASAN_SW_TAGS_IDENTIFY
> bool "Enable memory corruption identification"
> --
> 2.30.2
>


Re: [PATCH v14 2/4] kasan: allow architectures to provide an outline readiness check

2021-06-17 Thread Marco Elver
On Thu, 17 Jun 2021 at 08:40, Daniel Axtens  wrote:
>
> Allow architectures to define a kasan_arch_is_ready() hook that bails
> out of any function that's about to touch the shadow unless the arch
> says that it is ready for the memory to be accessed. This is fairly
> uninvasive and should have a negligible performance penalty.
>
> This will only work in outline mode, so an arch must specify
> ARCH_DISABLE_KASAN_INLINE if it requires this.
>
> Cc: Balbir Singh 
> Cc: Aneesh Kumar K.V 
> Suggested-by: Christophe Leroy 
> Signed-off-by: Daniel Axtens 

With Christophe's suggestion:

Reviewed-by: Marco Elver 


> --
>
> Both previous RFCs for ppc64 - by 2 different people - have
> needed this trick! See:
>  - https://lore.kernel.org/patchwork/patch/592820/ # ppc64 hash series
>  - https://patchwork.ozlabs.org/patch/795211/  # ppc radix series
>
> I haven't been able to exercise the arch hook error for !GENERIC as I
> don't have a particularly modern aarch64 toolchain or a lot of experience
> cross-compiling with clang. But it does fire for GENERIC + INLINE on x86.
> ---
>  mm/kasan/common.c  | 4 
>  mm/kasan/generic.c | 3 +++
>  mm/kasan/kasan.h   | 8 
>  mm/kasan/shadow.c  | 8 
>  4 files changed, 23 insertions(+)
>
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 10177cc26d06..0ad615f3801d 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -331,6 +331,10 @@ static inline bool kasan_slab_free(struct kmem_cache 
> *cache, void *object,
> u8 tag;
> void *tagged_object;
>
> +   /* Bail if the arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return false;
> +
> tag = get_tag(object);
> tagged_object = object;
> object = kasan_reset_tag(object);
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index 53cbf28859b5..c3f5ba7a294a 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -163,6 +163,9 @@ static __always_inline bool check_region_inline(unsigned 
> long addr,
> size_t size, bool write,
> unsigned long ret_ip)
>  {
> +   if (!kasan_arch_is_ready())
> +   return true;
> +
> if (unlikely(size == 0))
> return true;
>
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 8f450bc28045..b18abaf8c78e 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -449,6 +449,14 @@ static inline void kasan_poison_last_granule(const void 
> *address, size_t size) {
>
>  #endif /* CONFIG_KASAN_GENERIC */
>
> +#ifndef kasan_arch_is_ready
> +static inline bool kasan_arch_is_ready(void)   { return true; }
> +#else
> +#if !defined(CONFIG_KASAN_GENERIC) || !defined(CONFIG_KASAN_OUTLINE)
> +#error kasan_arch_is_ready only works in KASAN generic outline mode!
> +#endif
> +#endif
> +
>  /*
>   * Exported functions for interfaces called from assembly or from generated
>   * code. Declarations here to avoid warning about missing declarations.
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 082ee5b6d9a1..3c7f7efe6f68 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -73,6 +73,10 @@ void kasan_poison(const void *addr, size_t size, u8 value, 
> bool init)
>  {
> void *shadow_start, *shadow_end;
>
> +   /* Don't touch the shadow memory if arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return;
> +
> /*
>  * Perform shadow offset calculation based on untagged address, as
>  * some of the callers (e.g. kasan_poison_object_data) pass tagged
> @@ -99,6 +103,10 @@ EXPORT_SYMBOL(kasan_poison);
>  #ifdef CONFIG_KASAN_GENERIC
>  void kasan_poison_last_granule(const void *addr, size_t size)
>  {
> +   /* Don't touch the shadow memory if arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return;
> +
> if (size & KASAN_GRANULE_MASK) {
> u8 *shadow = (u8 *)kasan_mem_to_shadow(addr + size);
> *shadow = size & KASAN_GRANULE_MASK;
> --
> 2.30.2
>


Re: [PATCH v13 1/3] kasan: allow an architecture to disable inline instrumentation

2021-06-16 Thread Marco Elver
On Wed, 16 Jun 2021 at 10:02, Daniel Axtens  wrote:
>
> For annoying architectural reasons, it's very difficult to support inline
> instrumentation on powerpc64.*
>
> Add a Kconfig flag to allow an arch to disable inline. (It's a bit
> annoying to be 'backwards', but I'm not aware of any way to have
> an arch force a symbol to be 'n', rather than 'y'.)
>
> We also disable stack instrumentation in this case as it does things that
> are functionally equivalent to inline instrumentation, namely adding
> code that touches the shadow directly without going through a C helper.
>
> * on ppc64 atm, the shadow lives in virtual memory and isn't accessible in
> real mode. However, before we turn on virtual memory, we parse the device
> tree to determine which platform and MMU we're running under. That calls
> generic DT code, which is instrumented. Inline instrumentation in DT would
> unconditionally attempt to touch the shadow region, which we won't have
> set up yet, and would crash. We can make outline mode wait for the arch to
> be ready, but we can't change what the compiler inserts for inline mode.
>
> Signed-off-by: Daniel Axtens 

Reviewed-by: Marco Elver 

Thank you.

> ---
>  lib/Kconfig.kasan | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index cffc2ebbf185..cb5e02d09e11 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -12,6 +12,15 @@ config HAVE_ARCH_KASAN_HW_TAGS
>  config HAVE_ARCH_KASAN_VMALLOC
> bool
>
> +config ARCH_DISABLE_KASAN_INLINE
> +   bool
> +   help
> + Sometimes an architecture might not be able to support inline
> + instrumentation but might be able to support outline 
> instrumentation.
> + This option allows an architecture to prevent inline and stack
> + instrumentation from being enabled.
> +
> +
>  config CC_HAS_KASAN_GENERIC
> def_bool $(cc-option, -fsanitize=kernel-address)
>
> @@ -130,6 +139,7 @@ config KASAN_OUTLINE
>
>  config KASAN_INLINE
> bool "Inline instrumentation"
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> help
>   Compiler directly inserts code checking shadow memory before
>   memory accesses. This is faster than outline (in some workloads
> @@ -141,6 +151,7 @@ endchoice
>  config KASAN_STACK
> bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && 
> !COMPILE_TEST
> depends on KASAN_GENERIC || KASAN_SW_TAGS
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> default y if CC_IS_GCC
> help
>   The LLVM stack address sanitizer has a known problem that
> @@ -154,6 +165,9 @@ config KASAN_STACK
>   but clang users can still enable it for builds without
>   CONFIG_COMPILE_TEST.  On gcc it is assumed to always be safe
>   to use and enabled by default.
> + If the architecture disables inline instrumentation, this is
> + also disabled as it adds inline-style instrumentation that
> + is run unconditionally.
>
>  config KASAN_SW_TAGS_IDENTIFY
> bool "Enable memory corruption identification"
> --
> 2.30.2
>


Re: [PATCH v13 3/3] kasan: define and use MAX_PTRS_PER_* for early shadow tables

2021-06-16 Thread Marco Elver
On Wed, 16 Jun 2021 at 10:03, Daniel Axtens  wrote:
[...]
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 768d7d342757..fd65f477ac92 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -40,10 +40,22 @@ struct kunit_kasan_expectation {
>  #define PTE_HWTABLE_PTRS 0
>  #endif
>
> +#ifndef MAX_PTRS_PER_PTE
> +#define MAX_PTRS_PER_PTE PTRS_PER_PTE
> +#endif
> +
> +#ifndef MAX_PTRS_PER_PMD
> +#define MAX_PTRS_PER_PMD PTRS_PER_PMD
> +#endif
> +
> +#ifndef MAX_PTRS_PER_PUD
> +#define MAX_PTRS_PER_PUD PTRS_PER_PUD
> +#endif

This is introducing new global constants in a KASAN header. It
feels like this should be in <linux/pgtable.h> together with a
comment. Because the KASAN header is pulled in widely, most of the
kernel will get these new definitions. That in itself is fine, but
it feels wrong that the KASAN header introduces these.

Thoughts?

Sorry for only realizing this now.

Thanks,
-- Marco

>  extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
> -extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE + PTE_HWTABLE_PTRS];
> -extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
> -extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
> +extern pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE + PTE_HWTABLE_PTRS];
> +extern pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD];
> +extern pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD];
>  extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
>
>  int kasan_populate_early_shadow(const void *shadow_start,
> diff --git a/mm/kasan/init.c b/mm/kasan/init.c
> index 348f31d15a97..cc64ed6858c6 100644
> --- a/mm/kasan/init.c
> +++ b/mm/kasan/init.c
> @@ -41,7 +41,7 @@ static inline bool kasan_p4d_table(pgd_t pgd)
>  }
>  #endif
>  #if CONFIG_PGTABLE_LEVELS > 3
> -pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
> +pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD] __page_aligned_bss;
>  static inline bool kasan_pud_table(p4d_t p4d)
>  {
> return p4d_page(p4d) == 
> virt_to_page(lm_alias(kasan_early_shadow_pud));
> @@ -53,7 +53,7 @@ static inline bool kasan_pud_table(p4d_t p4d)
>  }
>  #endif
>  #if CONFIG_PGTABLE_LEVELS > 2
> -pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
> +pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD] __page_aligned_bss;
>  static inline bool kasan_pmd_table(pud_t pud)
>  {
> return pud_page(pud) == 
> virt_to_page(lm_alias(kasan_early_shadow_pmd));
> @@ -64,7 +64,7 @@ static inline bool kasan_pmd_table(pud_t pud)
> return false;
>  }
>  #endif
> -pte_t kasan_early_shadow_pte[PTRS_PER_PTE + PTE_HWTABLE_PTRS]
> +pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE + PTE_HWTABLE_PTRS]
> __page_aligned_bss;
>
>  static inline bool kasan_pte_table(pmd_t pmd)
> --
> 2.30.2
>


Re: [PATCH v13 2/3] kasan: allow architectures to provide an outline readiness check

2021-06-16 Thread Marco Elver
On Wed, 16 Jun 2021 at 10:02, Daniel Axtens  wrote:
> Allow architectures to define a kasan_arch_is_ready() hook that bails
> out of any function that's about to touch the shadow unless the arch
> says that it is ready for the memory to be accessed. This is fairly
> uninvasive and should have a negligible performance penalty.
>
> This will only work in outline mode, so an arch must specify
> ARCH_DISABLE_KASAN_INLINE if it requires this.
>
> Cc: Balbir Singh 
> Cc: Aneesh Kumar K.V 
> Suggested-by: Christophe Leroy 
> Signed-off-by: Daniel Axtens 

Reviewed-by: Marco Elver 

but please also check whether an assertion that this is only used with
KASAN_GENERIC makes sense (below). It depends on how much we want to
keep kasan_arch_is_ready() potentially useful for other modes (which I
don't think makes sense).

> --
>
> I discuss the justification for this later in the series. Also,
> both previous RFCs for ppc64 - by 2 different people - have
> needed this trick! See:
>  - https://lore.kernel.org/patchwork/patch/592820/ # ppc64 hash series
>  - https://patchwork.ozlabs.org/patch/795211/  # ppc radix series
> ---
>  mm/kasan/common.c  | 4 
>  mm/kasan/generic.c | 3 +++
>  mm/kasan/kasan.h   | 4 
>  mm/kasan/shadow.c  | 8 
>  4 files changed, 19 insertions(+)
>
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 10177cc26d06..0ad615f3801d 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -331,6 +331,10 @@ static inline bool kasan_slab_free(struct kmem_cache 
> *cache, void *object,
> u8 tag;
> void *tagged_object;
>
> +   /* Bail if the arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return false;
> +
> tag = get_tag(object);
> tagged_object = object;
> object = kasan_reset_tag(object);
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index 53cbf28859b5..c3f5ba7a294a 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -163,6 +163,9 @@ static __always_inline bool check_region_inline(unsigned 
> long addr,
> size_t size, bool write,
> unsigned long ret_ip)
>  {
> +   if (!kasan_arch_is_ready())
> +   return true;
> +
> if (unlikely(size == 0))
> return true;
>
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 8f450bc28045..19323a3d5975 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -449,6 +449,10 @@ static inline void kasan_poison_last_granule(const void 
> *address, size_t size) {
>
>  #endif /* CONFIG_KASAN_GENERIC */
>
> +#ifndef kasan_arch_is_ready
> +static inline bool kasan_arch_is_ready(void)   { return true; }
> +#endif
> +

I've been trying to think of a way to make it clear this is only for
KASAN_GENERIC mode, and not the others. An arch can always define this
function, but of course it might not be used. One way would be to add
an '#ifndef CONFIG_KASAN_GENERIC' in the #else case and #error if it's
not generic mode.

I think trying to make this do anything useful for SW_TAGS or HW_TAGS
modes does not make sense (at least right now).

>  /*
>   * Exported functions for interfaces called from assembly or from generated
>   * code. Declarations here to avoid warning about missing declarations.
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 082ee5b6d9a1..3c7f7efe6f68 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -73,6 +73,10 @@ void kasan_poison(const void *addr, size_t size, u8 value, 
> bool init)
>  {
> void *shadow_start, *shadow_end;
>
> +   /* Don't touch the shadow memory if arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return;
> +
> /*
>  * Perform shadow offset calculation based on untagged address, as
>  * some of the callers (e.g. kasan_poison_object_data) pass tagged
> @@ -99,6 +103,10 @@ EXPORT_SYMBOL(kasan_poison);
>  #ifdef CONFIG_KASAN_GENERIC
>  void kasan_poison_last_granule(const void *addr, size_t size)
>  {
> +   /* Don't touch the shadow memory if arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return;
> +
> if (size & KASAN_GRANULE_MASK) {
> u8 *shadow = (u8 *)kasan_mem_to_shadow(addr + size);
> *shadow = size & KASAN_GRANULE_MASK;
> --
> 2.30.2
>


Re: [PATCH v12 0/6] KASAN core changes for ppc64 radix KASAN

2021-06-15 Thread Marco Elver
[+Cc Andrey]

On Tue, 15 Jun 2021 at 03:47, Daniel Axtens  wrote:
>
> Building on the work of Christophe, Aneesh and Balbir, I've ported
> KASAN to 64-bit Book3S kernels running on the Radix MMU.
>
> I've been trying this for a while, but we keep having collisions
> between the kasan code in the mm tree and the code I want to put in to
> the ppc tree. So my aim here is for patches 1 through 4 or 1 through 5
> to go in via the mm tree.

I think this is reasonable. I'd suggest just sending non-ppc patches
separately (i.e. split the series explicitly) to KASAN maintainers,
and ensure to Cc Andrew, too. Just point at this series to illustrate
how it'll be used.

I think the patches are fine, but I'm not entirely sure about the
current placements of kasan_arch_is_ready(), so hopefully Andrey can
also have a look.


> I will then propose the powerpc changes for
> a later cycle. (I have attached them to this series as an RFC, and
> there are still outstanding review comments I need to attend to.)
>
> v12 applies to next-20210611. There should be no noticable changes to
> other platforms.
>
> Kind regards,
> Daniel
>
> Daniel Axtens (6):
>   kasan: allow an architecture to disable inline instrumentation
>   kasan: allow architectures to provide an outline readiness check
>   kasan: define and use MAX_PTRS_PER_* for early shadow tables

^^ Up to here could be a separate series to go through -mm.

>   kasan: Document support on 32-bit powerpc

^^ The Documentation changes are minimal and not just confined to
kasan.rst it seems. In fact your "powerpc: Book3S .." patch changes
Documentation more. So you could just take "kasan: Document support on
32-bit powerpc" through ppc tree as well.

>   powerpc/mm/kasan: rename kasan_init_32.c to init_32.c
>   [RFC] powerpc: Book3S 64-bit outline-only KASAN support


Re: [PATCH v12 2/6] kasan: allow architectures to provide an outline readiness check

2021-06-15 Thread Marco Elver
On Tue, 15 Jun 2021 at 03:47, Daniel Axtens  wrote:
>
> Allow architectures to define a kasan_arch_is_ready() hook that bails
> out of any function that's about to touch the shadow unless the arch
> says that it is ready for the memory to be accessed. This is fairly
> uninvasive and should have a negligible performance penalty.
>
> This will only work in outline mode, so an arch must specify
> ARCH_DISABLE_KASAN_INLINE if it requires this.
>
> Cc: Balbir Singh 
> Cc: Aneesh Kumar K.V 
> Suggested-by: Christophe Leroy 
> Signed-off-by: Daniel Axtens 
>
> --
>
> I discuss the justification for this later in the series. Also,
> both previous RFCs for ppc64 - by 2 different people - have
> needed this trick! See:
>  - https://lore.kernel.org/patchwork/patch/592820/ # ppc64 hash series
>  - https://patchwork.ozlabs.org/patch/795211/  # ppc radix series
> ---
>  mm/kasan/common.c  | 4 
>  mm/kasan/generic.c | 3 +++
>  mm/kasan/kasan.h   | 4 
>  mm/kasan/shadow.c  | 4 
>  4 files changed, 15 insertions(+)
>
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 10177cc26d06..0ad615f3801d 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -331,6 +331,10 @@ static inline bool kasan_slab_free(struct kmem_cache 
> *cache, void *object,
> u8 tag;
> void *tagged_object;
>
> +   /* Bail if the arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return false;
> +
> tag = get_tag(object);
> tagged_object = object;
> object = kasan_reset_tag(object);
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index 53cbf28859b5..c3f5ba7a294a 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -163,6 +163,9 @@ static __always_inline bool check_region_inline(unsigned 
> long addr,
> size_t size, bool write,
> unsigned long ret_ip)
>  {
> +   if (!kasan_arch_is_ready())
> +   return true;
> +
> if (unlikely(size == 0))
> return true;
>
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 8f450bc28045..19323a3d5975 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -449,6 +449,10 @@ static inline void kasan_poison_last_granule(const void 
> *address, size_t size) {
>
>  #endif /* CONFIG_KASAN_GENERIC */
>
> +#ifndef kasan_arch_is_ready
> +static inline bool kasan_arch_is_ready(void)   { return true; }
> +#endif
> +
>  /*
>   * Exported functions for interfaces called from assembly or from generated
>   * code. Declarations here to avoid warning about missing declarations.
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 082ee5b6d9a1..74134b657d7d 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -73,6 +73,10 @@ void kasan_poison(const void *addr, size_t size, u8 value, 
> bool init)
>  {
> void *shadow_start, *shadow_end;
>
> +   /* Don't touch the shadow memory if arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return;
> +

What about kasan_poison_last_granule()? kasan_unpoison() currently
seems to potentially trip on that.
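
For illustration, a minimal sketch of the kind of follow-up this question
points at (not part of the posted series; it assumes kasan_poison_last_granule()
keeps its current shape in mm/kasan/shadow.c):

#ifdef CONFIG_KASAN_GENERIC
void kasan_poison_last_granule(const void *addr, size_t size)
{
	/* Bail out like kasan_poison() does above, so that kasan_unpoison()
	 * cannot touch shadow memory before the arch is ready. */
	if (!kasan_arch_is_ready())
		return;

	if (size & KASAN_GRANULE_MASK) {
		u8 *shadow = (u8 *)kasan_mem_to_shadow(addr + size);

		*shadow = size & KASAN_GRANULE_MASK;
	}
}
#endif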


Re: [PATCH v12 1/6] kasan: allow an architecture to disable inline instrumentation

2021-06-15 Thread Marco Elver
On Tue, 15 Jun 2021 at 03:47, Daniel Axtens  wrote:
>
> For annoying architectural reasons, it's very difficult to support inline
> instrumentation on powerpc64.
>
> Add a Kconfig flag to allow an arch to disable inline. (It's a bit
> annoying to be 'backwards', but I'm not aware of any way to have
> an arch force a symbol to be 'n', rather than 'y'.)
>
> We also disable stack instrumentation in this case as it does things that
> are functionally equivalent to inline instrumentation, namely adding
> code that touches the shadow directly without going through a C helper.
>
> Signed-off-by: Daniel Axtens 
> ---
>  lib/Kconfig.kasan | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index cffc2ebbf185..935814f332a7 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -12,6 +12,15 @@ config HAVE_ARCH_KASAN_HW_TAGS
>  config HAVE_ARCH_KASAN_VMALLOC
> bool
>
> +# Sometimes an architecture might not be able to support inline 
> instrumentation
> +# but might be able to support outline instrumentation. This option allows an
> +# arch to prevent inline and stack instrumentation from being enabled.

This comment could be moved into 'help' of this new config option.

> +# ppc64 turns on virtual memory late in boot, after calling into generic code
> +# like the device-tree parser, so it uses this in conjuntion with a hook in
> +# outline mode to avoid invalid access early in boot.

I think the ppc64-related comment isn't necessary and can be moved to
arch/ppc64 somewhere, if there isn't one already.

> +config ARCH_DISABLE_KASAN_INLINE
> +   bool
> +
>  config CC_HAS_KASAN_GENERIC
> def_bool $(cc-option, -fsanitize=kernel-address)
>
> @@ -130,6 +139,7 @@ config KASAN_OUTLINE
>
>  config KASAN_INLINE
> bool "Inline instrumentation"
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> help
>   Compiler directly inserts code checking shadow memory before
>   memory accesses. This is faster than outline (in some workloads
> @@ -141,6 +151,7 @@ endchoice
>  config KASAN_STACK
> bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && 
> !COMPILE_TEST
> depends on KASAN_GENERIC || KASAN_SW_TAGS
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> default y if CC_IS_GCC
> help
>   The LLVM stack address sanitizer has a know problem that
> @@ -154,6 +165,9 @@ config KASAN_STACK
>   but clang users can still enable it for builds without
>   CONFIG_COMPILE_TEST.  On gcc it is assumed to always be safe
>   to use and enabled by default.
> + If the architecture disables inline instrumentation, this is
> + also disabled as it adds inline-style instrumentation that
> + is run unconditionally.
>
>  config KASAN_SW_TAGS_IDENTIFY
> bool "Enable memory corruption identification"
> --
> 2.27.0
>
> --
> You received this message because you are subscribed to the Google Groups 
> "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kasan-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/kasan-dev/20210615014705.2234866-2-dja%40axtens.net.


Re: [PATCH v11 1/6] kasan: allow an architecture to disable inline instrumentation

2021-03-22 Thread Marco Elver
On Fri, 19 Mar 2021 at 15:41, Daniel Axtens  wrote:
>
> For annoying architectural reasons, it's very difficult to support inline
> instrumentation on powerpc64.
>
> Add a Kconfig flag to allow an arch to disable inline. (It's a bit
> annoying to be 'backwards', but I'm not aware of any way to have
> an arch force a symbol to be 'n', rather than 'y'.)
>
> We also disable stack instrumentation in this case as it does things that
> are functionally equivalent to inline instrumentation, namely adding
> code that touches the shadow directly without going through a C helper.
>
> Signed-off-by: Daniel Axtens 
> ---
>  lib/Kconfig.kasan | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index cffc2ebbf185..7e237dbb6df3 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -12,6 +12,9 @@ config HAVE_ARCH_KASAN_HW_TAGS
>  config HAVE_ARCH_KASAN_VMALLOC
> bool
>
> +config ARCH_DISABLE_KASAN_INLINE
> +   def_bool n
> +

Does just "bool" work here?

>  config CC_HAS_KASAN_GENERIC
> def_bool $(cc-option, -fsanitize=kernel-address)
>
> @@ -130,6 +133,7 @@ config KASAN_OUTLINE
>
>  config KASAN_INLINE
> bool "Inline instrumentation"
> +   depends on !ARCH_DISABLE_KASAN_INLINE
> help
>   Compiler directly inserts code checking shadow memory before
>   memory accesses. This is faster than outline (in some workloads
> @@ -142,6 +146,7 @@ config KASAN_STACK
> bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && 
> !COMPILE_TEST
> depends on KASAN_GENERIC || KASAN_SW_TAGS
> default y if CC_IS_GCC
> +   depends on !ARCH_DISABLE_KASAN_INLINE

Minor, but perhaps this 'depends on' line could be moved up 1 line to
be grouped with the other 'depends on'.


> help
>   The LLVM stack address sanitizer has a know problem that
>   causes excessive stack usage in a lot of functions, see
> @@ -154,6 +159,9 @@ config KASAN_STACK
>   but clang users can still enable it for builds without
>   CONFIG_COMPILE_TEST.  On gcc it is assumed to always be safe
>   to use and enabled by default.
> + If the architecture disables inline instrumentation, this is
> + also disabled as it adds inline-style instrumentation that
> + is run unconditionally.
>
>  config KASAN_SW_TAGS_IDENTIFY
> bool "Enable memory corruption identification"
> --
> 2.27.0
>
> --
> You received this message because you are subscribed to the Google Groups 
> "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kasan-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/kasan-dev/20210319144058.772525-2-dja%40axtens.net.


Re: [PATCH v11 2/6] kasan: allow architectures to provide an outline readiness check

2021-03-22 Thread Marco Elver
On Fri, 19 Mar 2021 at 15:41, Daniel Axtens  wrote:
> Allow architectures to define a kasan_arch_is_ready() hook that bails
> out of any function that's about to touch the shadow unless the arch
> says that it is ready for the memory to be accessed. This is fairly
> uninvasive and should have a negligible performance penalty.
>
> This will only work in outline mode, so an arch must specify
> ARCH_DISABLE_KASAN_INLINE if it requires this.
>
> Cc: Balbir Singh 
> Cc: Aneesh Kumar K.V 
> Suggested-by: Christophe Leroy 
> Signed-off-by: Daniel Axtens 
>
> --
>
> I discuss the justification for this later in the series. Also,
> both previous RFCs for ppc64 - by 2 different people - have
> needed this trick! See:
>  - https://lore.kernel.org/patchwork/patch/592820/ # ppc64 hash series
>  - https://patchwork.ozlabs.org/patch/795211/  # ppc radix series
> ---
>  include/linux/kasan.h | 4 
>  mm/kasan/common.c | 4 
>  mm/kasan/generic.c| 3 +++
>  mm/kasan/shadow.c | 4 
>  4 files changed, 15 insertions(+)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 8b3b99d659b7..6bd8343f0033 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h

Does kasan_arch_is_ready() need to be defined in the public interface
of KASAN? Could it instead be moved to mm/kasan/kasan.h?

> @@ -23,6 +23,10 @@ struct kunit_kasan_expectation {
>
>  #endif
>
> +#ifndef kasan_arch_is_ready
> +static inline bool kasan_arch_is_ready(void)   { return true; }
> +#endif
> +
>  #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
>
>  #include 
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 6bb87f2acd4e..f23a9e2dce9f 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -345,6 +345,10 @@ static inline bool kasan_slab_free(struct kmem_cache 
> *cache, void *object,
> if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU))
> return false;
>
> +   /* We can't read the shadow byte if the arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return false;
> +

While it probably doesn't matter much, it seems this check could be
moved up, rather than having it in the middle here.


> if (!kasan_byte_accessible(tagged_object)) {
> kasan_report_invalid_free(tagged_object, ip);
> return true;
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index 53cbf28859b5..c3f5ba7a294a 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -163,6 +163,9 @@ static __always_inline bool check_region_inline(unsigned 
> long addr,
> size_t size, bool write,
> unsigned long ret_ip)
>  {
> +   if (!kasan_arch_is_ready())
> +   return true;
> +
> if (unlikely(size == 0))
> return true;
>
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 727ad4629173..1f650c521037 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -80,6 +80,10 @@ void kasan_poison(const void *addr, size_t size, u8 value, 
> bool init)
>  */
> addr = kasan_reset_tag(addr);
>
> +   /* Don't touch the shadow memory if arch isn't ready */
> +   if (!kasan_arch_is_ready())
> +   return;
> +
> /* Skip KFENCE memory if called explicitly outside of sl*b. */
> if (is_kfence_address(addr))
> return;
> --
> 2.27.0
>
> --
> You received this message because you are subscribed to the Google Groups 
> "kasan-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kasan-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/kasan-dev/20210319144058.772525-3-dja%40axtens.net.
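
For illustration, this is roughly what the arch-side override being discussed
would look like; the flag name and file placement are hypothetical, not taken
from the posted series:

/* Arch-provided definition, e.g. in its asm/kasan.h. */
extern bool kasan_shadow_initialized;	/* set once shadow memory is mapped */

static inline bool kasan_arch_is_ready(void)
{
	return kasan_shadow_initialized;
}
#define kasan_arch_is_ready kasan_arch_is_ready

With that define in place, the generic fallback guarded by the #ifndef in the
patch above is not used, and the outline checks bail out until the arch flips
the flag.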


Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-05 Thread Marco Elver
On Fri, 5 Mar 2021 at 12:49, Michael Ellerman  wrote:
> Marco Elver  writes:
> ...
> >
> > The choice is between:
> >
> > 1. ARCH_FUNC_PREFIX (as a matter of fact, the ARCH_FUNC_PREFIX patch
> > is already in -mm). Perhaps we could optimize it further, by checking
> > ARCH_FUNC_PREFIX in buf, and advancing buf like you propose, but I'm
> > not sure it's worth worrying about.
> >
> > 2. The dynamic solution that I proposed that does not use a hard-coded
> > '.' (or some variation thereof).
> >
> > Please tell me which solution you prefer, 1 or 2 -- I'd like to stop
> > bikeshedding here. If there's a compelling argument for hard-coding
> > the '.' in non-arch code, please clarify, but otherwise I'd like to
> > keep arch-specific things out of generic code.
>
> It's your choice, I was just trying to minimise the size of the wart you
> have to carry in kfence code to deal with it.
>
> The ARCH_FUNC_PREFIX solution is fine by me.

Thank you -- the ARCH_FUNC_PREFIX version is already in -mm, so let's
keep it. It's purely static vs the other options. Should another
debugging tool need something similar we can revisit whether to change
or move it.

Thanks,
-- Marco


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 19:51, Mark Rutland  wrote:
> On Thu, Mar 04, 2021 at 07:22:53PM +0100, Marco Elver wrote:
> > On Thu, 4 Mar 2021 at 19:02, Mark Rutland  wrote:
> > > On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > > > On Thu, Mar 04, 2021 at 04:59PM +, Mark Rutland wrote:
> > > > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland  
> > > > > > wrote:
> > > > > > > [adding Mark Brown]
> > > > > > >
> > > > > > > The bigger problem here is that skipping is dodgy to begin with, 
> > > > > > > and
> > > > > > > this is still liable to break in some cases. One big concern is 
> > > > > > > that
> > > > > > > (especially with LTO) we cannot guarantee the compiler will not 
> > > > > > > inline
> > > > > > > or outline functions, causing the skip value to be too large or 
> > > > > > > too
> > > > > > > small. That's liable to happen to callers, and in theory (though
> > > > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > > > stack_trace_save() could get outlined too.
> > > > > > >
> > > > > > > Unless we can get some strong guarantees from compiler folk such 
> > > > > > > that we
> > > > > > > can guarantee a specific function acts as a boundary for unwinding (and
> > > > > > > doesn't itself get split, etc), the only reliable way I can think 
> > > > > > > to
> > > > > > > solve this requires an assembly trampoline. Whatever we do is 
> > > > > > > liable to
> > > > > > > need some invasive rework.
> > > > > >
> > > > > > Will LTO and friends respect 'noinline'?
> > > > >
> > > > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > > > know whether they actually do so.
> > > > >
> > > > > I suspect even with 'noinline' the compiler is permitted to outline
> > > > > portions of a function if it wanted to (and IIUC it could still make
> > > > > specialized copies in the absence of 'noclone').
> > > > >
> > > > > > One thing I also noticed is that tail calls would also cause the 
> > > > > > stack
> > > > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > > > disabled tail call optimizations).
> > > > >
> > > > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > > > trace A->C? ... or is A going missing too?
> > > >
> > > > Correct, it's just the A->C outcome.
> > >
> > > I'd assumed that those cases were benign, e.g. for livepatching what
> > > matters is what can be returned to, so B disappearing from the trace
> > > isn't a problem there.
> > >
> > > Is the concern debugability, or is there a functional issue you have in
> > > mind?
> >
> > For me, it's just been debuggability, and reliable test cases.
> >
> > > > > > Is there a way to also mark a function non-tail-callable?
> > > > >
> > > > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > > > function-local optimization options), but I don't expect there's any 
> > > > > way
> > > > > to mark a callee as not being tail-callable.
> > > >
> > > > I don't think this is reliable. It'd be
> > > > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > > > work if applied to the function we do not want to tail-call-optimize,
> > > > but would have to be applied to the function that does the tail-calling.
> > >
> > > Yup; that's what I meant when I said you could do that on the caller but
> > > not the callee.
> > >
> > > I don't follow why you'd want to put this on the callee, though, so I
> > > think I'm missing something. Considering a set of functions in different
> > > compilation units:
> > >
> > >   A->B->C->D->E->F->G->H->I->J->K

Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 19:02, Mark Rutland  wrote:
> On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > On Thu, Mar 04, 2021 at 04:59PM +, Mark Rutland wrote:
> > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland  wrote:
> > > > > [adding Mark Brown]
> > > > >
> > > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > > this is still liable to break in some cases. One big concern is that
> > > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > > or outline functions, causing the skip value to be too large or too
> > > > > small. That's liable to happen to callers, and in theory (though
> > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > stack_trace_save() could get outlined too.
> > > > >
> > > > > Unless we can get some strong guarantees from compiler folk such that 
> > > > > we
> > > > > can guarantee a specific function acts as a boundary for unwinding (and
> > > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > > solve this requires an assembly trampoline. Whatever we do is liable 
> > > > > to
> > > > > need some invasive rework.
> > > >
> > > > Will LTO and friends respect 'noinline'?
> > >
> > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > know whether they actually do so.
> > >
> > > I suspect even with 'noinline' the compiler is permitted to outline
> > > portions of a function if it wanted to (and IIUC it could still make
> > > specialized copies in the absence of 'noclone').
> > >
> > > > One thing I also noticed is that tail calls would also cause the stack
> > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > disabled tail call optimizations).
> > >
> > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > trace A->C? ... or is A going missing too?
> >
> > Correct, it's just the A->C outcome.
>
> I'd assumed that those cases were benign, e.g. for livepatching what
> matters is what can be returned to, so B disappearing from the trace
> isn't a problem there.
>
> Is the concern debugability, or is there a functional issue you have in
> mind?

For me, it's just been debuggability, and reliable test cases.

> > > > Is there a way to also mark a function non-tail-callable?
> > >
> > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > function-local optimization options), but I don't expect there's any way
> > > to mark a callee as not being tail-callable.
> >
> > I don't think this is reliable. It'd be
> > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > work if applied to the function we do not want to tail-call-optimize,
> > but would have to be applied to the function that does the tail-calling.
>
> Yup; that's what I meant when I said you could do that on the caller but
> not the callee.
>
> I don't follow why you'd want to put this on the callee, though, so I
> think I'm missing something. Considering a set of functions in different
> compilation units:
>
>   A->B->C->D->E->F->G->H->I->J->K

I was having this problem with KCSAN, where the compiler would
tail-call-optimize __tsan_X instrumentation. This would mean that
KCSAN runtime functions ended up in the trace, but the function where
the access happened would not. However, I don't care about the runtime
functions, and instead want to see the function where the access
happened. In that case, I'd like to just mark __tsan_X and any other
kcsan instrumentation functions as do-not-tail-call-optimize, which
would solve the problem.

The solution today is that when you compile a kernel with KCSAN, every
instrumented TU is compiled with -fno-optimize-sibling-calls. The
better solution would be to just mark KCSAN runtime functions somehow,
but permit tail calling other things. Although, I probably still want
to see the full trace, and would decide that having
-fno-optimize-sibling-calls is a small price to pay in a
debug-only-kernel to get complete traces.

> ... if K were marked in this way, and J was compiled with visibility of
> this, J would stick around, but J's callers might not, and so the
> trace might see:
>
>   A->J->K
>
> ... do you just care about the final caller, i.e. you just need
> certainty that J will be in the trace?

Yes. But maybe it's a special problem that only sanitizers have.

> If so, we can somewhat bodge that by having K have an __always_inline
> wrapper which has a barrier() or similar after the real call to K, so
> the call couldn't be TCO'd.
>
> Otherwise I'd expect we'd probably need to disable TCO generally.

Thanks,
-- Marco
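
A rough sketch of the wrapper idea mentioned above, with hypothetical names;
the only point is that having work after the real call stops the compiler
from emitting it as a tail call:

void __real_k(void);			/* the actual implementation */

static __always_inline void k(void)
{
	__real_k();
	barrier();	/* keeps the call above from becoming a tail call,
			 * so the caller of k() stays in the stack trace */
}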


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Marco Elver
On Thu, Mar 04, 2021 at 04:59PM +, Mark Rutland wrote:
> On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > On Thu, 4 Mar 2021 at 15:57, Mark Rutland  wrote:
> > > [adding Mark Brown]
> > >
> > > The bigger problem here is that skipping is dodgy to begin with, and
> > > this is still liable to break in some cases. One big concern is that
> > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > or outline functions, causing the skip value to be too large or too
> > > small. That's liable to happen to callers, and in theory (though
> > > unlikely in practice), portions of arch_stack_walk() or
> > > stack_trace_save() could get outlined too.
> > >
> > > Unless we can get some strong guarantees from compiler folk such that we
> > > can guarantee a specific function acts as a boundary for unwinding (and
> > > doesn't itself get split, etc), the only reliable way I can think to
> > > solve this requires an assembly trampoline. Whatever we do is liable to
> > > need some invasive rework.
> > 
> > Will LTO and friends respect 'noinline'?
> 
> I hope so (and suspect we'd have more problems otherwise), but I don't
> know whether they actually do so.
> 
> I suspect even with 'noinline' the compiler is permitted to outline
> portions of a function if it wanted to (and IIUC it could still make
> specialized copies in the absence of 'noclone').
> 
> > One thing I also noticed is that tail calls would also cause the stack
> > trace to appear somewhat incomplete (for some of my tests I've
> > disabled tail call optimizations).
> 
> I assume you mean for a chain A->B->C where B tail-calls C, you get a
> trace A->C? ... or is A going missing too?

Correct, it's just the A->C outcome.

> > Is there a way to also mark a function non-tail-callable?
> 
> I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> function-local optimization options), but I don't expect there's any way
> to mark a callee as not being tail-callable.

I don't think this is reliable. It'd be
__attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
work if applied to the function we do not want to tail-call-optimize,
but would have to be applied to the function that does the tail-calling.
So it's a bit backwards, even if it worked.

> According to the GCC documentation, GCC won't TCO noreturn functions, but
> obviously that's not something we can use generally.
> 
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

Perhaps we can ask the toolchain folks to help add such an attribute. Or
maybe the feature already exists somewhere, but hidden.

+Cc linux-toolcha...@vger.kernel.org

> > But I'm also not sure if with all that we'd be guaranteed the code we
> > want, even though in practice it might.
> 
> True! I'd just like to be on the least dodgy ground we can be.

It's been dodgy for a while, and I'd welcome any low-cost fixes to make
it less dodgy in the short-term at least. :-)

Thanks,
-- Marco
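
For reference, the caller-side workaround discussed above would look roughly
like this (sketch only; as noted, GCC's support for function-local optimize
attributes is not considered reliable):

void callee(void);

/* Applied to the function doing the call, not to the callee. */
__attribute__((optimize("-fno-optimize-sibling-calls")))
void caller(void)
{
	callee();	/* emitted as a normal call rather than a tail call */
}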


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 15:57, Mark Rutland  wrote:
> [adding Mark Brown]
>
> On Wed, Mar 03, 2021 at 04:20:43PM +0100, Marco Elver wrote:
> > On Wed, Mar 03, 2021 at 03:52PM +0100, Christophe Leroy wrote:
> > > Le 03/03/2021 à 15:38, Marco Elver a écrit :
> > > > On Wed, 3 Mar 2021 at 15:09, Christophe Leroy
> > > >  wrote:
> > > > >
> > > > > It seems like all other sane architectures, namely x86 and arm64
> > > > > at least, include the running function as top entry when saving
> > > > > stack trace.
> > > > >
> > > > > Functionalities like KFENCE expect it.
> > > > >
> > > > > Do the same on powerpc, it allows KFENCE to properly identify the 
> > > > > faulting
> > > > > function as depicted below. Before the patch KFENCE was identifying
> > > > > finish_task_switch.isra as the faulting function.
> > > > >
> > > > > [   14.937370] 
> > > > > ==
> > > > > [   14.948692] BUG: KFENCE: invalid read in 
> > > > > test_invalid_access+0x54/0x108
> > > > > [   14.948692]
> > > > > [   14.956814] Invalid read at 0xdf98800a:
> > > > > [   14.960664]  test_invalid_access+0x54/0x108
> > > > > [   14.964876]  finish_task_switch.isra.0+0x54/0x23c
> > > > > [   14.969606]  kunit_try_run_case+0x5c/0xd0
> > > > > [   14.973658]  kunit_generic_run_threadfn_adapter+0x24/0x30
> > > > > [   14.979079]  kthread+0x15c/0x174
> > > > > [   14.982342]  ret_from_kernel_thread+0x14/0x1c
> > > > > [   14.986731]
> > > > > [   14.988236] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB  
> > > > >5.12.0-rc1-01537-g95f6e2088d7e-dirty #4682
> > > > > [   14.999795] NIP:  c016ec2c LR: c02f517c CTR: c016ebd8
> > > > > [   15.004851] REGS: e2449d90 TRAP: 0301   Tainted: GB
> > > > >   (5.12.0-rc1-01537-g95f6e2088d7e-dirty)
> > > > > [   15.015274] MSR:  9032   CR: 2204  XER: 
> > > > > 
> > > > > [   15.022043] DAR: df98800a DSISR: 2000
> > > > > [   15.022043] GPR00: c02f517c e2449e50 c1142080 e100dd24 c084b13c 
> > > > > 0008 c084b32b c016ebd8
> > > > > [   15.022043] GPR08: c085 df988000 c0d1 e2449eb0 22000288
> > > > > [   15.040581] NIP [c016ec2c] test_invalid_access+0x54/0x108
> > > > > [   15.046010] LR [c02f517c] kunit_try_run_case+0x5c/0xd0
> > > > > [   15.051181] Call Trace:
> > > > > [   15.053637] [e2449e50] [c005a68c] 
> > > > > finish_task_switch.isra.0+0x54/0x23c (unreliable)
> > > > > [   15.061338] [e2449eb0] [c02f517c] kunit_try_run_case+0x5c/0xd0
> > > > > [   15.067215] [e2449ed0] [c02f648c] 
> > > > > kunit_generic_run_threadfn_adapter+0x24/0x30
> > > > > [   15.074472] [e2449ef0] [c004e7b0] kthread+0x15c/0x174
> > > > > [   15.079571] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> > > > > [   15.085798] Instruction dump:
> > > > > [   15.088784] 8129d608 38e7ebd8 81020280 911f004c 3900 995f0024 
> > > > > 907f0028 90ff001c
> > > > > [   15.096613] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 
> > > > > 3908adb0 812a4b98 3d40c02f
> > > > > [   15.104612] 
> > > > > ==
> > > > >
> > > > > Signed-off-by: Christophe Leroy 
> > > >
> > > > Acked-by: Marco Elver 
> > > >
> > > > Thank you, I think this looks like the right solution. Just a question 
> > > > below:
> > > >
> > > ...
> > >
> > > > > @@ -59,23 +70,26 @@ void save_stack_trace(struct stack_trace *trace)
> > > > >
> > > > >  sp = current_stack_frame();
> > > > >
> > > > > -   save_context_stack(trace, sp, current, 1);
> > > > > +   save_context_stack(trace, sp, (unsigned 
> > > > > long)save_stack_trace, current, 1);
> > > >
> > > > This causes ip == save_stack_trace and also below for
> > > > save_stack_trace_tsk. Does this mean save_stack_trace() is included in
> > > > the trace? Looking at kernel/stacktrace.c, I think the l

Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 15:08, Christophe Leroy
 wrote:
>
>
>
> Le 04/03/2021 à 13:48, Marco Elver a écrit :
> >  From d118080eb9552073f5dcf1f86198f3d86d5ea850 Mon Sep 17 00:00:00 2001
> > From: Marco Elver 
> > Date: Thu, 4 Mar 2021 13:15:51 +0100
> > Subject: [PATCH] kfence: fix reports if constant function prefixes exist
> >
> > Some architectures prefix all functions with a constant string ('.' on
> > ppc64). Add ARCH_FUNC_PREFIX, which may optionally be defined in
> > <asm/kfence.h>, so that get_stack_skipnr() can work properly.
>
>
> It works, thanks.
>
> >
> > Link: 
> > https://lkml.kernel.org/r/f036c53d-7e81-763c-47f4-6024c6c5f...@csgroup.eu
> > Reported-by: Christophe Leroy 
> > Signed-off-by: Marco Elver 
>
> Tested-by: Christophe Leroy 

Thanks, I'll send this to Andrew for inclusion in -mm, since this is
not a strict dependency (it'll work without the patch, just the stack
traces aren't that pretty but still useful). If the ppc patches and
this make it into the next merge window, everything should be good for
5.13.

> > ---
> >   mm/kfence/report.c | 18 --
> >   1 file changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/kfence/report.c b/mm/kfence/report.c
> > index 519f037720f5..e3f71451ad9e 100644
> > --- a/mm/kfence/report.c
> > +++ b/mm/kfence/report.c
> > @@ -20,6 +20,11 @@
> >
> >   #include "kfence.h"
> >
> > +/* May be overridden by <asm/kfence.h>. */
> > +#ifndef ARCH_FUNC_PREFIX
> > +#define ARCH_FUNC_PREFIX ""
> > +#endif
> > +
> >   extern bool no_hash_pointers;
> >
> >   /* Helper function to either print to a seq_file or to console. */
> > @@ -67,8 +72,9 @@ static int get_stack_skipnr(const unsigned long 
> > stack_entries[], int num_entries
> >   for (skipnr = 0; skipnr < num_entries; skipnr++) {
> >   int len = scnprintf(buf, sizeof(buf), "%ps", (void 
> > *)stack_entries[skipnr]);
> >
> > - if (str_has_prefix(buf, "kfence_") || str_has_prefix(buf, 
> > "__kfence_") ||
> > - !strncmp(buf, "__slab_free", len)) {
> > + if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfence_") ||
> > + str_has_prefix(buf, ARCH_FUNC_PREFIX "__kfence_") ||
> > + !strncmp(buf, ARCH_FUNC_PREFIX "__slab_free", len)) {
> >   /*
> >* In case of tail calls from any of the below
> >* to any of the above.
> > @@ -77,10 +83,10 @@ static int get_stack_skipnr(const unsigned long 
> > stack_entries[], int num_entries
> >   }
> >
> >   /* Also the *_bulk() variants by only checking prefixes. */
> > - if (str_has_prefix(buf, "kfree") ||
> > - str_has_prefix(buf, "kmem_cache_free") ||
> > - str_has_prefix(buf, "__kmalloc") ||
> > - str_has_prefix(buf, "kmem_cache_alloc"))
> > + if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfree") ||
> > + str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_free") ||
> > + str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmalloc") ||
> > + str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_alloc"))
> >   goto found;
> >   }
> >   if (fallback < num_entries)
> >


Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-04 Thread Marco Elver
On Thu, Mar 04, 2021 at 12:48PM +0100, Christophe Leroy wrote:
> Le 04/03/2021 à 12:31, Marco Elver a écrit :
> > On Thu, 4 Mar 2021 at 12:23, Christophe Leroy
> >  wrote:
> > > Le 03/03/2021 à 11:56, Marco Elver a écrit :
> > > > 
> > > > Somewhat tangentially, I also note that e.g. show_regs(regs) (which
> > > > was printed along the KFENCE report above) didn't include the top
> > > > frame in the "Call Trace", so this assumption is definitely not
> > > > isolated to KFENCE.
> > > > 
> > > 
> > > Now, I have tested PPC64 (with the patch I sent yesterday to modify 
> > > save_stack_trace_regs()
> > > applied), and I get many failures. Any idea ?
> > > 
> > > [   17.653751][   T58] 
> > > ==
> > > [   17.654379][   T58] BUG: KFENCE: invalid free in 
> > > .kfence_guarded_free+0x2e4/0x530
> > > [   17.654379][   T58]
> > > [   17.654831][   T58] Invalid free of 0xc0003c9c (in kfence-#77):
> > > [   17.655358][   T58]  .kfence_guarded_free+0x2e4/0x530
> > > [   17.655775][   T58]  .__slab_free+0x320/0x5a0
> > > [   17.656039][   T58]  .test_double_free+0xe0/0x198
> > > [   17.656308][   T58]  .kunit_try_run_case+0x80/0x110
> > > [   17.656523][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> > > [   17.657161][   T58]  .kthread+0x18c/0x1a0
> > > [   17.659148][   T58]  .ret_from_kernel_thread+0x58/0x70
> > > [   17.659869][   T58]
[...]
> > 
> > Looks like something is prepending '.' to function names. We expect
> > the function name to appear as-is, e.g. "kfence_guarded_free",
> > "test_double_free", etc.
> > 
> > Is there something special on ppc64, where the '.' is some convention?
> > 
> 
> I think so, see 
> https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#FUNC-DES
> 
> Also see commit https://github.com/linuxppc/linux/commit/02424d896

Thanks -- could you try the below patch? You'll need to define
ARCH_FUNC_PREFIX accordingly.

We think, since there are only very few architectures that add a prefix,
requiring <asm/kfence.h> to define something like ARCH_FUNC_PREFIX is
the simplest option. Let me know if this works for you.

There is an alternative option, which is to dynamically figure out the
prefix, but if this simpler option is fine with you, we'd prefer it.

Thanks,
-- Marco

-- >8 --

>From d118080eb9552073f5dcf1f86198f3d86d5ea850 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Thu, 4 Mar 2021 13:15:51 +0100
Subject: [PATCH] kfence: fix reports if constant function prefixes exist

Some architectures prefix all functions with a constant string ('.' on
ppc64). Add ARCH_FUNC_PREFIX, which may optionally be defined in
<asm/kfence.h>, so that get_stack_skipnr() can work properly.

Link: https://lkml.kernel.org/r/f036c53d-7e81-763c-47f4-6024c6c5f...@csgroup.eu
Reported-by: Christophe Leroy 
Signed-off-by: Marco Elver 
---
 mm/kfence/report.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/kfence/report.c b/mm/kfence/report.c
index 519f037720f5..e3f71451ad9e 100644
--- a/mm/kfence/report.c
+++ b/mm/kfence/report.c
@@ -20,6 +20,11 @@
 
 #include "kfence.h"
 
+/* May be overridden by <asm/kfence.h>. */
+#ifndef ARCH_FUNC_PREFIX
+#define ARCH_FUNC_PREFIX ""
+#endif
+
 extern bool no_hash_pointers;
 
 /* Helper function to either print to a seq_file or to console. */
@@ -67,8 +72,9 @@ static int get_stack_skipnr(const unsigned long 
stack_entries[], int num_entries
for (skipnr = 0; skipnr < num_entries; skipnr++) {
int len = scnprintf(buf, sizeof(buf), "%ps", (void 
*)stack_entries[skipnr]);
 
-   if (str_has_prefix(buf, "kfence_") || str_has_prefix(buf, 
"__kfence_") ||
-   !strncmp(buf, "__slab_free", len)) {
+   if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfence_") ||
+   str_has_prefix(buf, ARCH_FUNC_PREFIX "__kfence_") ||
+   !strncmp(buf, ARCH_FUNC_PREFIX "__slab_free", len)) {
/*
 * In case of tail calls from any of the below
 * to any of the above.
@@ -77,10 +83,10 @@ static int get_stack_skipnr(const unsigned long 
stack_entries[], int num_entries
}
 
/* Also the *_bulk() variants by only checking prefixes. */
-   if (str_has_prefix(buf, "kfree") ||
-   str_has_prefix(buf, "kmem_cache_free") ||
-   str_has_prefix(buf, "__kmalloc") ||
-   str_has_prefix(buf, "kmem_cache_alloc"))
+   if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfree") ||
+   str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_free") ||
+   str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmalloc") ||
+   str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_alloc"))
goto found;
}
if (fallback < num_entries)
-- 
2.30.1.766.gb4fecdf3b7-goog
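
The matching arch-side definition would be a one-liner; for ppc64 it could
look something like the following (sketch, with a hypothetical placement in
the arch's asm/kfence.h):

/* ppc64 ABIs using function descriptors prefix text symbols with a dot. */
#define ARCH_FUNC_PREFIX "."

Architectures without a symbol prefix need nothing, since report.c falls back
to the empty default above.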


Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 13:00, Christophe Leroy
 wrote:
>
>
>
> Le 04/03/2021 à 12:48, Christophe Leroy a écrit :
> >
> >
> > Le 04/03/2021 à 12:31, Marco Elver a écrit :
> >> On Thu, 4 Mar 2021 at 12:23, Christophe Leroy
> >>  wrote:
> >>> Le 03/03/2021 à 11:56, Marco Elver a écrit :
> >>>>
> >>>> Somewhat tangentially, I also note that e.g. show_regs(regs) (which
> >>>> was printed along the KFENCE report above) didn't include the top
> >>>> frame in the "Call Trace", so this assumption is definitely not
> >>>> isolated to KFENCE.
> >>>>
> >>>
> >>> Now, I have tested PPC64 (with the patch I sent yesterday to modify 
> >>> save_stack_trace_regs()
> >>> applied), and I get many failures. Any idea ?
> >>>
> >>> [   17.653751][   T58] 
> >>> ==
> >>> [   17.654379][   T58] BUG: KFENCE: invalid free in 
> >>> .kfence_guarded_free+0x2e4/0x530
> >>> [   17.654379][   T58]
> >>> [   17.654831][   T58] Invalid free of 0xc0003c9c (in kfence-#77):
> >>> [   17.655358][   T58]  .kfence_guarded_free+0x2e4/0x530
> >>> [   17.655775][   T58]  .__slab_free+0x320/0x5a0
> >>> [   17.656039][   T58]  .test_double_free+0xe0/0x198
> >>> [   17.656308][   T58]  .kunit_try_run_case+0x80/0x110
> >>> [   17.656523][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> >>> [   17.657161][   T58]  .kthread+0x18c/0x1a0
> >>> [   17.659148][   T58]  .ret_from_kernel_thread+0x58/0x70
> >>> [   17.659869][   T58]
> >>> [   17.663954][   T58] kfence-#77 [0xc0003c9c-0xc0003c9c001f, 
> >>> size=32, cache=kmalloc-32]
> >>> allocated by task 58:
> >>> [   17.666113][   T58]  .__kfence_alloc+0x1bc/0x510
> >>> [   17.667069][   T58]  .__kmalloc+0x280/0x4f0
> >>> [   17.667452][   T58]  .test_alloc+0x19c/0x430
> >>> [   17.667732][   T58]  .test_double_free+0x88/0x198
> >>> [   17.667971][   T58]  .kunit_try_run_case+0x80/0x110
> >>> [   17.668283][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> >>> [   17.668553][   T58]  .kthread+0x18c/0x1a0
> >>> [   17.669315][   T58]  .ret_from_kernel_thread+0x58/0x70
> >>> [   17.669711][   T58]
> >>> [   17.669711][   T58] freed by task 58:
> >>> [   17.670116][   T58]  .kfence_guarded_free+0x3d0/0x530
> >>> [   17.670421][   T58]  .__slab_free+0x320/0x5a0
> >>> [   17.670603][   T58]  .test_double_free+0xb4/0x198
> >>> [   17.670827][   T58]  .kunit_try_run_case+0x80/0x110
> >>> [   17.671073][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> >>> [   17.671410][   T58]  .kthread+0x18c/0x1a0
> >>> [   17.671618][   T58]  .ret_from_kernel_thread+0x58/0x70
> >>> [   17.671972][   T58]
> >>> [   17.672638][   T58] CPU: 0 PID: 58 Comm: kunit_try_catch Tainted: G
> >>> B
> >>> 5.12.0-rc1-01540-g0783285cc1b8-dirty #4685
> >>> [   17.673768][   T58] 
> >>> ==
> >>> [   17.677031][   T58] # test_double_free: EXPECTATION FAILED at 
> >>> mm/kfence/kfence_test.c:380
> >>> [   17.677031][   T58] Expected report_matches() to be true, 
> >>> but is false
> >>> [   17.684397][T1] not ok 7 - test_double_free
> >>> [   17.686463][   T59] # test_double_free-memcache: setup_test_cache: 
> >>> size=32, ctor=0x0
> >>> [   17.688403][   T59] # test_double_free-memcache: test_alloc: 
> >>> size=32, gfp=cc0, policy=any,
> >>> cache=1
> >>
> >> Looks like something is prepending '.' to function names. We expect
> >> the function name to appear as-is, e.g. "kfence_guarded_free",
> >> "test_double_free", etc.
> >>
> >> Is there something special on ppc64, where the '.' is some convention?
> >>
> >
> > I think so, see 
> > https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#FUNC-DES
> >
> > Also see commit https://github.com/linuxppc/linux/commit/02424d896
> >
>
> But I'm wondering, if the dot is the problem, how so is the following one ok ?
>
> [   79.574457][   T75] # test_krealloc: test_alloc: size=32, gfp=cc0, 
> policy=any, cache=0
> [   79.682728][   T75] 
> =

Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 12:23, Christophe Leroy
 wrote:
> Le 03/03/2021 à 11:56, Marco Elver a écrit :
> >
> > Somewhat tangentially, I also note that e.g. show_regs(regs) (which
> > was printed along the KFENCE report above) didn't include the top
> > frame in the "Call Trace", so this assumption is definitely not
> > isolated to KFENCE.
> >
>
> Now, I have tested PPC64 (with the patch I sent yesterday to modify 
> save_stack_trace_regs()
> applied), and I get many failures. Any idea ?
>
> [   17.653751][   T58] 
> ==
> [   17.654379][   T58] BUG: KFENCE: invalid free in 
> .kfence_guarded_free+0x2e4/0x530
> [   17.654379][   T58]
> [   17.654831][   T58] Invalid free of 0xc0003c9c (in kfence-#77):
> [   17.655358][   T58]  .kfence_guarded_free+0x2e4/0x530
> [   17.655775][   T58]  .__slab_free+0x320/0x5a0
> [   17.656039][   T58]  .test_double_free+0xe0/0x198
> [   17.656308][   T58]  .kunit_try_run_case+0x80/0x110
> [   17.656523][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> [   17.657161][   T58]  .kthread+0x18c/0x1a0
> [   17.659148][   T58]  .ret_from_kernel_thread+0x58/0x70
> [   17.659869][   T58]
> [   17.663954][   T58] kfence-#77 [0xc0003c9c-0xc0003c9c001f, 
> size=32, cache=kmalloc-32]
> allocated by task 58:
> [   17.666113][   T58]  .__kfence_alloc+0x1bc/0x510
> [   17.667069][   T58]  .__kmalloc+0x280/0x4f0
> [   17.667452][   T58]  .test_alloc+0x19c/0x430
> [   17.667732][   T58]  .test_double_free+0x88/0x198
> [   17.667971][   T58]  .kunit_try_run_case+0x80/0x110
> [   17.668283][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> [   17.668553][   T58]  .kthread+0x18c/0x1a0
> [   17.669315][   T58]  .ret_from_kernel_thread+0x58/0x70
> [   17.669711][   T58]
> [   17.669711][   T58] freed by task 58:
> [   17.670116][   T58]  .kfence_guarded_free+0x3d0/0x530
> [   17.670421][   T58]  .__slab_free+0x320/0x5a0
> [   17.670603][   T58]  .test_double_free+0xb4/0x198
> [   17.670827][   T58]  .kunit_try_run_case+0x80/0x110
> [   17.671073][   T58]  .kunit_generic_run_threadfn_adapter+0x38/0x50
> [   17.671410][   T58]  .kthread+0x18c/0x1a0
> [   17.671618][   T58]  .ret_from_kernel_thread+0x58/0x70
> [   17.671972][   T58]
> [   17.672638][   T58] CPU: 0 PID: 58 Comm: kunit_try_catch Tainted: GB
> 5.12.0-rc1-01540-g0783285cc1b8-dirty #4685
> [   17.673768][   T58] 
> ==
> [   17.677031][   T58] # test_double_free: EXPECTATION FAILED at 
> mm/kfence/kfence_test.c:380
> [   17.677031][   T58] Expected report_matches() to be true, but 
> is false
> [   17.684397][T1] not ok 7 - test_double_free
> [   17.686463][   T59] # test_double_free-memcache: setup_test_cache: 
> size=32, ctor=0x0
> [   17.688403][   T59] # test_double_free-memcache: test_alloc: size=32, 
> gfp=cc0, policy=any,
> cache=1

Looks like something is prepending '.' to function names. We expect
the function name to appear as-is, e.g. "kfence_guarded_free",
"test_double_free", etc.

Is there something special on ppc64, where the '.' is some convention?

Thanks,
-- Marco
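
To make the mismatch concrete (illustration only, not a patch): the skip
heuristic compares the "%ps"-formatted frame against known prefixes, so a
leading '.' defeats every check:

/* Simplified from get_stack_skipnr(); buf holds the symbolized frame. */
static bool frame_is_kfence_internal(const char *buf)
{
	return str_has_prefix(buf, "kfence_") ||
	       str_has_prefix(buf, "__kfence_");
}

/* frame_is_kfence_internal("kfence_guarded_free+0x2e4/0x530")  -> true
 * frame_is_kfence_internal(".kfence_guarded_free+0x2e4/0x530") -> false,
 * so on ppc64 the frame is never skipped and the report header ends up
 * naming a KFENCE-internal function instead of the test function. */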


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-03 Thread Marco Elver
On Wed, Mar 03, 2021 at 03:52PM +0100, Christophe Leroy wrote:
> Le 03/03/2021 à 15:38, Marco Elver a écrit :
> > On Wed, 3 Mar 2021 at 15:09, Christophe Leroy
> >  wrote:
> > > 
> > > It seems like all other sane architectures, namely x86 and arm64
> > > at least, include the running function as top entry when saving
> > > stack trace.
> > > 
> > > Functionalities like KFENCE expect it.
> > > 
> > > Do the same on powerpc, it allows KFENCE to properly identify the faulting
> > > function as depicted below. Before the patch KFENCE was identifying
> > > finish_task_switch.isra as the faulting function.
> > > 
> > > [   14.937370] 
> > > ==
> > > [   14.948692] BUG: KFENCE: invalid read in test_invalid_access+0x54/0x108
> > > [   14.948692]
> > > [   14.956814] Invalid read at 0xdf98800a:
> > > [   14.960664]  test_invalid_access+0x54/0x108
> > > [   14.964876]  finish_task_switch.isra.0+0x54/0x23c
> > > [   14.969606]  kunit_try_run_case+0x5c/0xd0
> > > [   14.973658]  kunit_generic_run_threadfn_adapter+0x24/0x30
> > > [   14.979079]  kthread+0x15c/0x174
> > > [   14.982342]  ret_from_kernel_thread+0x14/0x1c
> > > [   14.986731]
> > > [   14.988236] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB  
> > >5.12.0-rc1-01537-g95f6e2088d7e-dirty #4682
> > > [   14.999795] NIP:  c016ec2c LR: c02f517c CTR: c016ebd8
> > > [   15.004851] REGS: e2449d90 TRAP: 0301   Tainted: GB  
> > > (5.12.0-rc1-01537-g95f6e2088d7e-dirty)
> > > [   15.015274] MSR:  9032   CR: 2204  XER: 
> > > 
> > > [   15.022043] DAR: df98800a DSISR: 2000
> > > [   15.022043] GPR00: c02f517c e2449e50 c1142080 e100dd24 c084b13c 
> > > 0008 c084b32b c016ebd8
> > > [   15.022043] GPR08: c085 df988000 c0d1 e2449eb0 22000288
> > > [   15.040581] NIP [c016ec2c] test_invalid_access+0x54/0x108
> > > [   15.046010] LR [c02f517c] kunit_try_run_case+0x5c/0xd0
> > > [   15.051181] Call Trace:
> > > [   15.053637] [e2449e50] [c005a68c] finish_task_switch.isra.0+0x54/0x23c 
> > > (unreliable)
> > > [   15.061338] [e2449eb0] [c02f517c] kunit_try_run_case+0x5c/0xd0
> > > [   15.067215] [e2449ed0] [c02f648c] 
> > > kunit_generic_run_threadfn_adapter+0x24/0x30
> > > [   15.074472] [e2449ef0] [c004e7b0] kthread+0x15c/0x174
> > > [   15.079571] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> > > [   15.085798] Instruction dump:
> > > [   15.088784] 8129d608 38e7ebd8 81020280 911f004c 3900 995f0024 
> > > 907f0028 90ff001c
> > > [   15.096613] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 
> > > 812a4b98 3d40c02f
> > > [   15.104612] 
> > > ==
> > > 
> > > Signed-off-by: Christophe Leroy 
> > 
> > Acked-by: Marco Elver 
> > 
> > Thank you, I think this looks like the right solution. Just a question 
> > below:
> > 
> ...
> 
> > > @@ -59,23 +70,26 @@ void save_stack_trace(struct stack_trace *trace)
> > > 
> > >  sp = current_stack_frame();
> > > 
> > > -   save_context_stack(trace, sp, current, 1);
> > > +   save_context_stack(trace, sp, (unsigned long)save_stack_trace, 
> > > current, 1);
> > 
> > This causes ip == save_stack_trace and also below for
> > save_stack_trace_tsk. Does this mean save_stack_trace() is included in
> > the trace? Looking at kernel/stacktrace.c, I think the library wants
> > to exclude itself from the trace, as it does '.skip = skipnr + 1' (and
> > '.skip   = skipnr + (current == tsk)' for the _tsk variant).
> > 
> > If the arch-helper here is included, should this use _RET_IP_ instead?
> > 
> 
> Don't really know, I was inspired by arm64 which has:
> 
> void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
>struct task_struct *task, struct pt_regs *regs)
> {
>   struct stackframe frame;
> 
>   if (regs)
>   start_backtrace(, regs->regs[29], regs->pc);
>   else if (task == current)
>   start_backtrace(,
>   (unsigned long)__builtin_frame_address(0),
>   (unsigned long)arch_stack_walk);
>   else
>   start_backtrace(, thread_saved_fp(task),
> 
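
A sketch of the _RET_IP_ alternative asked about above (hypothetical, not the
posted patch): record the caller's address as the top entry so that the helper
itself does not appear in the trace:

void save_stack_trace(struct stack_trace *trace)
{
	unsigned long sp = current_stack_frame();

	save_context_stack(trace, sp, _RET_IP_, current, 1);
}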

Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-03 Thread Marco Elver
On Wed, 3 Mar 2021 at 15:09, Christophe Leroy
 wrote:
>
> It seems like all other sane architectures, namely x86 and arm64
> at least, include the running function as top entry when saving
> stack trace.
>
> Functionalities like KFENCE expect it.
>
> Do the same on powerpc, it allows KFENCE to properly identify the faulting
> function as depicted below. Before the patch KFENCE was identifying
> finish_task_switch.isra as the faulting function.
>
> [   14.937370] 
> ==
> [   14.948692] BUG: KFENCE: invalid read in test_invalid_access+0x54/0x108
> [   14.948692]
> [   14.956814] Invalid read at 0xdf98800a:
> [   14.960664]  test_invalid_access+0x54/0x108
> [   14.964876]  finish_task_switch.isra.0+0x54/0x23c
> [   14.969606]  kunit_try_run_case+0x5c/0xd0
> [   14.973658]  kunit_generic_run_threadfn_adapter+0x24/0x30
> [   14.979079]  kthread+0x15c/0x174
> [   14.982342]  ret_from_kernel_thread+0x14/0x1c
> [   14.986731]
> [   14.988236] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB  
>5.12.0-rc1-01537-g95f6e2088d7e-dirty #4682
> [   14.999795] NIP:  c016ec2c LR: c02f517c CTR: c016ebd8
> [   15.004851] REGS: e2449d90 TRAP: 0301   Tainted: GB  
> (5.12.0-rc1-01537-g95f6e2088d7e-dirty)
> [   15.015274] MSR:  9032   CR: 2204  XER: 
> [   15.022043] DAR: df98800a DSISR: 2000
> [   15.022043] GPR00: c02f517c e2449e50 c1142080 e100dd24 c084b13c 0008 
> c084b32b c016ebd8
> [   15.022043] GPR08: c085 df988000 c0d1 e2449eb0 22000288
> [   15.040581] NIP [c016ec2c] test_invalid_access+0x54/0x108
> [   15.046010] LR [c02f517c] kunit_try_run_case+0x5c/0xd0
> [   15.051181] Call Trace:
> [   15.053637] [e2449e50] [c005a68c] finish_task_switch.isra.0+0x54/0x23c 
> (unreliable)
> [   15.061338] [e2449eb0] [c02f517c] kunit_try_run_case+0x5c/0xd0
> [   15.067215] [e2449ed0] [c02f648c] 
> kunit_generic_run_threadfn_adapter+0x24/0x30
> [   15.074472] [e2449ef0] [c004e7b0] kthread+0x15c/0x174
> [   15.079571] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> [   15.085798] Instruction dump:
> [   15.088784] 8129d608 38e7ebd8 81020280 911f004c 3900 995f0024 907f0028 
> 90ff001c
> [   15.096613] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 
> 812a4b98 3d40c02f
> [   15.104612] 
> ==
>
> Signed-off-by: Christophe Leroy 

Acked-by: Marco Elver 

Thank you, I think this looks like the right solution. Just a question below:

> ---
>  arch/powerpc/kernel/stacktrace.c | 42 +---
>  1 file changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/kernel/stacktrace.c 
> b/arch/powerpc/kernel/stacktrace.c
> index b6440657ef92..67c2b8488035 100644
> --- a/arch/powerpc/kernel/stacktrace.c
> +++ b/arch/powerpc/kernel/stacktrace.c
> @@ -22,16 +22,32 @@
>  #include 
>
>  #include 
> +#include 
>
>  /*
>   * Save stack-backtrace addresses into a stack_trace buffer.
>   */
> +static void save_entry(struct stack_trace *trace, unsigned long ip, int 
> savesched)
> +{
> +   if (savesched || !in_sched_functions(ip)) {
> +   if (!trace->skip)
> +   trace->entries[trace->nr_entries++] = ip;
> +   else
> +   trace->skip--;
> +   }
> +}
> +
>  static void save_context_stack(struct stack_trace *trace, unsigned long sp,
> -   struct task_struct *tsk, int savesched)
> +  unsigned long ip, struct task_struct *tsk, int 
> savesched)
>  {
> +   save_entry(trace, ip, savesched);
> +
> +   if (trace->nr_entries >= trace->max_entries)
> +   return;
> +
> for (;;) {
> unsigned long *stack = (unsigned long *) sp;
> -   unsigned long newsp, ip;
> +   unsigned long newsp;
>
> if (!validate_sp(sp, tsk, STACK_FRAME_OVERHEAD))
> return;
> @@ -39,12 +55,7 @@ static void save_context_stack(struct stack_trace *trace, 
> unsigned long sp,
> newsp = stack[0];
> ip = stack[STACK_FRAME_LR_SAVE];
>
> -   if (savesched || !in_sched_functions(ip)) {
> -   if (!trace->skip)
> -   trace->entries[trace->nr_entries++] = ip;
> -   else
> -   trace->skip--;
> -   }
> +   save_entry(trace, ip, savesched);
>
> if (trace->nr_entries >= trace->max_entri

Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-03 Thread Marco Elver
On Wed, 3 Mar 2021 at 11:39, Christophe Leroy
 wrote:
>
>
>
> Le 02/03/2021 à 12:39, Marco Elver a écrit :
> > On Tue, 2 Mar 2021 at 12:21, Christophe Leroy
> >  wrote:
> > [...]
> >>>> Booting with 'no_hash_pointers" I get the following. Does it helps ?
> >>>>
> >>>> [   16.837198] 
> >>>> ==
> >>>> [   16.848521] BUG: KFENCE: invalid read in 
> >>>> finish_task_switch.isra.0+0x54/0x23c
> >>>> [   16.848521]
> >>>> [   16.857158] Invalid read at 0xdf98800a:
> >>>> [   16.861004]  finish_task_switch.isra.0+0x54/0x23c
> >>>> [   16.865731]  kunit_try_run_case+0x5c/0xd0
> >>>> [   16.869780]  kunit_generic_run_threadfn_adapter+0x24/0x30
> >>>> [   16.875199]  kthread+0x15c/0x174
> >>>> [   16.878460]  ret_from_kernel_thread+0x14/0x1c
> >>>> [   16.882847]
> >>>> [   16.884351] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB
> >>>> 5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty #4674
> >>>> [   16.895908] NIP:  c016eb8c LR: c02f50dc CTR: c016eb38
> >>>> [   16.900963] REGS: e2449d90 TRAP: 0301   Tainted: GB
> >>>> (5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty)
> >>>> [   16.911386] MSR:  9032   CR: 2204  XER: 
> >>>> 
> >>>> [   16.918153] DAR: df98800a DSISR: 2000
> >>>> [   16.918153] GPR00: c02f50dc e2449e50 c1140d00 e100dd24 c084b13c 
> >>>> 0008 c084b32b c016eb38
> >>>> [   16.918153] GPR08: c085 df988000 c0d1 e2449eb0 22000288
> >>>> [   16.936695] NIP [c016eb8c] test_invalid_access+0x54/0x108
> >>>> [   16.942125] LR [c02f50dc] kunit_try_run_case+0x5c/0xd0
> >>>> [   16.947292] Call Trace:
> >>>> [   16.949746] [e2449e50] [c005a5ec] 
> >>>> finish_task_switch.isra.0+0x54/0x23c (unreliable)
> >>>
> >>> The "(unreliable)" might be a clue that it's related to ppc32 stack
> >>> unwinding. Any ppc expert know what this is about?
> >>>
> >>>> [   16.957443] [e2449eb0] [c02f50dc] kunit_try_run_case+0x5c/0xd0
> >>>> [   16.963319] [e2449ed0] [c02f63ec] 
> >>>> kunit_generic_run_threadfn_adapter+0x24/0x30
> >>>> [   16.970574] [e2449ef0] [c004e710] kthread+0x15c/0x174
> >>>> [   16.975670] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> >>>> [   16.981896] Instruction dump:
> >>>> [   16.984879] 8129d608 38e7eb38 81020280 911f004c 3900 995f0024 
> >>>> 907f0028 90ff001c
> >>>> [   16.992710] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 
> >>>> 812a4b98 3d40c02f
> >>>> [   17.000711] 
> >>>> ==
> >>>> [   17.008223] # test_invalid_access: EXPECTATION FAILED at 
> >>>> mm/kfence/kfence_test.c:636
> >>>> [   17.008223] Expected report_matches() to be true, but is 
> >>>> false
> >>>> [   17.023243] not ok 21 - test_invalid_access
> >>>
> >>> On a fault in test_invalid_access, KFENCE prints the stack trace based
> >>> on the information in pt_regs. So we do not think there's anything we
> > > can do to improve stack printing per se.
> >>
> >> stack printing, probably not. Would be good anyway to mark the last level 
> >> [unreliable] as the ppc does.
> >
> > We use stack_trace_save_regs() + stack_trace_print().
> >
> >> IIUC, on ppc the address in the stack frame of the caller is written by 
> >> the caller. In most tests,
> >> there is some function call being done before the fault, for instance
> >> test_kmalloc_aligned_oob_read() does a call to kunit_do_assertion which 
> >> populates the address of the
> >> call in the stack. However this is fragile.
> >
> > Interesting, this might explain it.
> >
> >> This works for function calls because in order to call a subfunction, a 
> >> function has to set up a
> >> stack frame in order to same the value in the Link Register, which 
> >> contains the address of the
> >> function's parent and that will be clobbered by the sub-function call.
> >>
> >> However, it cannot be done by exceptions, because exceptions can happen in 
> >> a function that h

Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-03 Thread Marco Elver
On Wed, 3 Mar 2021 at 11:32, Christophe Leroy
 wrote:
>
>
>
> Le 02/03/2021 à 10:53, Marco Elver a écrit :
> > On Tue, 2 Mar 2021 at 10:27, Christophe Leroy
> >  wrote:
> >> Le 02/03/2021 à 10:21, Alexander Potapenko a écrit :
> >>>> [   14.998426] BUG: KFENCE: invalid read in 
> >>>> finish_task_switch.isra.0+0x54/0x23c
> >>>> [   14.998426]
> >>>> [   15.007061] Invalid read at 0x(ptrval):
> >>>> [   15.010906]  finish_task_switch.isra.0+0x54/0x23c
> >>>> [   15.015633]  kunit_try_run_case+0x5c/0xd0
> >>>> [   15.019682]  kunit_generic_run_threadfn_adapter+0x24/0x30
> >>>> [   15.025099]  kthread+0x15c/0x174
> >>>> [   15.028359]  ret_from_kernel_thread+0x14/0x1c
> >>>> [   15.032747]
> >>>> [   15.034251] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB
> >>>> 5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty #4674
> >>>> [   15.045811] 
> >>>> ==
> >>>> [   15.053324] # test_invalid_access: EXPECTATION FAILED at 
> >>>> mm/kfence/kfence_test.c:636
> >>>> [   15.053324] Expected report_matches() to be true, but is 
> >>>> false
> >>>> [   15.068359] not ok 21 - test_invalid_access
> >>>
> >>> The test expects the function name to be test_invalid_access, i. e.
> >>> the first line should be "BUG: KFENCE: invalid read in
> >>> test_invalid_access".
> >>> The error reporting function unwinds the stack, skips a couple of
> >>> "uninteresting" frames
> >>> (https://elixir.bootlin.com/linux/v5.12-rc1/source/mm/kfence/report.c#L43)
> >>> and uses the first "interesting" one frame to print the report header
> >>> (https://elixir.bootlin.com/linux/v5.12-rc1/source/mm/kfence/report.c#L226).
> >>>
> >>> It's strange that test_invalid_access is missing altogether from the
> >>> stack trace - is that expected?
> >>> Can you try printing the whole stacktrace without skipping any frames
> >>> to see if that function is there?
> >>>
> >>
> >> Booting with 'no_hash_pointers" I get the following. Does it helps ?
> >>
> >> [   16.837198] 
> >> ==
> >> [   16.848521] BUG: KFENCE: invalid read in 
> >> finish_task_switch.isra.0+0x54/0x23c
> >> [   16.848521]
> >> [   16.857158] Invalid read at 0xdf98800a:
> >> [   16.861004]  finish_task_switch.isra.0+0x54/0x23c
> >> [   16.865731]  kunit_try_run_case+0x5c/0xd0
> >> [   16.869780]  kunit_generic_run_threadfn_adapter+0x24/0x30
> >> [   16.875199]  kthread+0x15c/0x174
> >> [   16.878460]  ret_from_kernel_thread+0x14/0x1c
> >> [   16.882847]
> >> [   16.884351] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB
> >> 5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty #4674
> >> [   16.895908] NIP:  c016eb8c LR: c02f50dc CTR: c016eb38
> >> [   16.900963] REGS: e2449d90 TRAP: 0301   Tainted: GB
> >> (5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty)
> >> [   16.911386] MSR:  9032   CR: 2204  XER: 
> >> [   16.918153] DAR: df98800a DSISR: 2000
> >> [   16.918153] GPR00: c02f50dc e2449e50 c1140d00 e100dd24 c084b13c 
> >> 0008 c084b32b c016eb38
> >> [   16.918153] GPR08: c085 df988000 c0d1 e2449eb0 22000288
> >> [   16.936695] NIP [c016eb8c] test_invalid_access+0x54/0x108
> >> [   16.942125] LR [c02f50dc] kunit_try_run_case+0x5c/0xd0
> >> [   16.947292] Call Trace:
> >> [   16.949746] [e2449e50] [c005a5ec] finish_task_switch.isra.0+0x54/0x23c 
> >> (unreliable)
> >
> > The "(unreliable)" might be a clue that it's related to ppc32 stack
> > unwinding. Any ppc expert know what this is about?
> >
> >> [   16.957443] [e2449eb0] [c02f50dc] kunit_try_run_case+0x5c/0xd0
> >> [   16.963319] [e2449ed0] [c02f63ec] 
> >> kunit_generic_run_threadfn_adapter+0x24/0x30
> >> [   16.970574] [e2449ef0] [c004e710] kthread+0x15c/0x174
> >> [   16.975670] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> >> [   16.981896] Instruction dump:
> >> [   16.984879] 8129d608 38e7eb38 81020280 911f004c 3900 995f0024 
> >> 907f0028 90ff001c
> >> [   16.992710] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 
>

Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-02 Thread Marco Elver
On Tue, 2 Mar 2021 at 12:21, Christophe Leroy
 wrote:
[...]
> >> Booting with 'no_hash_pointers' I get the following. Does it help?
> >>
> >> [   16.837198] 
> >> ==
> >> [   16.848521] BUG: KFENCE: invalid read in 
> >> finish_task_switch.isra.0+0x54/0x23c
> >> [   16.848521]
> >> [   16.857158] Invalid read at 0xdf98800a:
> >> [   16.861004]  finish_task_switch.isra.0+0x54/0x23c
> >> [   16.865731]  kunit_try_run_case+0x5c/0xd0
> >> [   16.869780]  kunit_generic_run_threadfn_adapter+0x24/0x30
> >> [   16.875199]  kthread+0x15c/0x174
> >> [   16.878460]  ret_from_kernel_thread+0x14/0x1c
> >> [   16.882847]
> >> [   16.884351] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: GB
> >> 5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty #4674
> >> [   16.895908] NIP:  c016eb8c LR: c02f50dc CTR: c016eb38
> >> [   16.900963] REGS: e2449d90 TRAP: 0301   Tainted: GB
> >> (5.12.0-rc1-s3k-dev-01534-g4f14ae75edf0-dirty)
> >> [   16.911386] MSR:  9032   CR: 2204  XER: 
> >> [   16.918153] DAR: df98800a DSISR: 2000
> >> [   16.918153] GPR00: c02f50dc e2449e50 c1140d00 e100dd24 c084b13c 
> >> 0008 c084b32b c016eb38
> >> [   16.918153] GPR08: c085 df988000 c0d1 e2449eb0 22000288
> >> [   16.936695] NIP [c016eb8c] test_invalid_access+0x54/0x108
> >> [   16.942125] LR [c02f50dc] kunit_try_run_case+0x5c/0xd0
> >> [   16.947292] Call Trace:
> >> [   16.949746] [e2449e50] [c005a5ec] finish_task_switch.isra.0+0x54/0x23c 
> >> (unreliable)
> >
> > The "(unreliable)" might be a clue that it's related to ppc32 stack
> > unwinding. Any ppc expert know what this is about?
> >
> >> [   16.957443] [e2449eb0] [c02f50dc] kunit_try_run_case+0x5c/0xd0
> >> [   16.963319] [e2449ed0] [c02f63ec] 
> >> kunit_generic_run_threadfn_adapter+0x24/0x30
> >> [   16.970574] [e2449ef0] [c004e710] kthread+0x15c/0x174
> >> [   16.975670] [e2449f30] [c001317c] ret_from_kernel_thread+0x14/0x1c
> >> [   16.981896] Instruction dump:
> >> [   16.984879] 8129d608 38e7eb38 81020280 911f004c 3900 995f0024 
> >> 907f0028 90ff001c
> >> [   16.992710] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 
> >> 812a4b98 3d40c02f
> >> [   17.000711] 
> >> ==
> >> [   17.008223] # test_invalid_access: EXPECTATION FAILED at 
> >> mm/kfence/kfence_test.c:636
> >> [   17.008223] Expected report_matches() to be true, but is 
> >> false
> >> [   17.023243] not ok 21 - test_invalid_access
> >
> > On a fault in test_invalid_access, KFENCE prints the stack trace based
> > on the information in pt_regs. So we do not think there's anything we
> can do to improve stack printing per se.
>
> stack printing, probably not. Would be good anyway to mark the last level 
> [unreliable] as the ppc does.

We use stack_trace_save_regs() + stack_trace_print().
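A minimal sketch of that path, using the generic stacktrace API rather than the exact KFENCE report code:

  #include <linux/kernel.h>
  #include <linux/ptrace.h>
  #include <linux/stacktrace.h>

  /* Save a trace starting from the exception's pt_regs, then print it. */
  static void print_trace_from_regs(struct pt_regs *regs)
  {
          unsigned long entries[64];
          unsigned int nr_entries;

          nr_entries = stack_trace_save_regs(regs, entries, ARRAY_SIZE(entries), 0);
          stack_trace_print(entries, nr_entries, 0);
  }
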

> IIUC, on ppc the address in the stack frame of the caller is written by the 
> caller. In most tests,
> there is some function call being done before the fault, for instance
> test_kmalloc_aligned_oob_read() does a call to kunit_do_assertion which 
> populates the address of the
> call in the stack. However this is fragile.

Interesting, this might explain it.

> This works for function calls because in order to call a subfunction, a 
> function has to set up a
> stack frame in order to save the value in the Link Register, which contains 
> the address of the
> function's parent and that will be clobbered by the sub-function call.
>
> However, it cannot be done by exceptions, because exceptions can happen in a 
> function that has no
> stack frame (because that function has no need to call a subfunction and 
> doesn't need to save
> anything on the stack). If the exception handler was writing the caller's 
> address in the stack
> frame, it would in fact write it in the parent's frame, leading to a mess.
>
> But in fact the information is in pt_regs: it is in regs->nip, so KFENCE 
> should be able to use that
> instead of the stack.

Perhaps stack_trace_save_regs() needs fixing for ppc32? Although that
seems to use arch_stack_walk().
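To illustrate the idea, an arch_stack_walk()-style walker that consumes regs->nip first (so the faulting function is always reported) could look roughly like this; the frame-validity check and the ppc32 frame layout below are simplified assumptions, not the real powerpc unwinder:

  #include <linux/ptrace.h>
  #include <linux/stacktrace.h>

  static void sketch_stack_walk_regs(stack_trace_consume_fn consume_entry,
                                     void *cookie, struct pt_regs *regs)
  {
          unsigned long sp = regs->gpr[1];        /* r1 is the stack pointer */
          int limit = 32;                         /* arbitrary depth limit */

          /* The faulting address is known precisely from pt_regs. */
          if (!consume_entry(cookie, regs->nip))
                  return;

          /*
           * Fall back to walking the back-chain: each ppc32 frame keeps the
           * back-chain pointer at sp + 0 and a saved LR slot at sp + 4, but
           * that slot is only meaningful if it was actually written, which
           * is what makes such entries "unreliable".
           */
          while (limit--) {
                  unsigned long *frame = (unsigned long *)sp;

                  if (!frame || !frame[0])
                          break;
                  if (!consume_entry(cookie, frame[1]))
                          return;
                  sp = frame[0];
          }
  }
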

> > What's confusing is that it's only this test, and none of the others.
> > Given that, it might be code-gen related, which results in some subtle
> > issue with stack unwinding. There are a few things to try, if you feel
> > like it:
> >
> > -- Change the unwinder, if it's possible for ppc32.
>
> I don't think it is possible.
>
> >
> > -- Add code to test_invalid_access(), to get the compiler to emit
> > different code. E.g. add a bunch (unnecessary) function calls, or add
> > barriers, etc.
>
> The following does the trick
>
> diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
> index 4acf4251ee04..22550676cd1f 100644
> --- a/mm/kfence/kfence_test.c
> +++ b/mm/kfence/kfence_test.c
> @@ -631,8 +631,11 @@ static void test_invalid_access(struct kunit *test)
> .addr = &__kfence_pool[10],
> 

Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-02 Thread Marco Elver
-- Add code to test_invalid_access(), to get the compiler to emit
different code. E.g. add a bunch (unnecessary) function calls, or add
barriers, etc.

-- Play with compiler options. We already pass
-fno-optimize-sibling-calls for kfence_test.o to avoid tail-call
optimizations that'd hide stack trace entries. But perhaps there's
something ppc-specific we missed?

Well, the good thing is that KFENCE detects the bad access just fine.
Since, according to the test, everything works from KFENCE's side, I'd
be happy to give my Ack:

  Acked-by: Marco Elver 

Thanks,
-- Marco


Re: [RFC PATCH v1] powerpc: Enable KFENCE for PPC32

2021-03-02 Thread Marco Elver
On Tue, 2 Mar 2021 at 09:37, Christophe Leroy
 wrote:
> Add architecture specific implementation details for KFENCE and enable
> KFENCE for the ppc32 architecture. In particular, this implements the
> required interface in <asm/kfence.h>.

Nice!

> KFENCE requires that attributes for pages from its memory pool can
> individually be set. Therefore, force the Read/Write linear map to be
> mapped at page granularity.
>
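For context, the <asm/kfence.h> interface mentioned above consists of two hooks. A rough sketch of their shape for a page-granular linear map, assuming a powerpc-style pte_update() and virt_to_kpte() (an illustration only, not the code from this patch):

  #include <linux/mm.h>
  #include <linux/pgtable.h>
  #include <asm/tlbflush.h>

  static inline bool arch_kfence_init_pool(void)
  {
          /* Nothing extra to do once the pool is mapped with page-size PTEs. */
          return true;
  }

  static inline bool kfence_protect_page(unsigned long addr, bool protect)
  {
          pte_t *kpte = virt_to_kpte(addr);       /* assumes a page-granular mapping */

          if (!kpte)
                  return false;

          if (protect)
                  pte_update(&init_mm, addr, kpte, _PAGE_PRESENT, 0, 0);
          else
                  pte_update(&init_mm, addr, kpte, 0, _PAGE_PRESENT, 0);

          flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
          return true;
  }
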
> Unit tests succeed on all tests but one:
>
> [   15.053324] # test_invalid_access: EXPECTATION FAILED at 
> mm/kfence/kfence_test.c:636
> [   15.053324] Expected report_matches() to be true, but 
> is false
> [   15.068359] not ok 21 - test_invalid_access

This is strange, given all the other tests passed. Do you mind sharing
the full test log?

Thanks,
-- Marco


Re: [PATCH v2 1/2] kasan: support instrumented bitops combined with generic bitops

2019-11-14 Thread Marco Elver
On Mon, 28 Oct 2019 at 14:56, Daniel Axtens  wrote:
>
> Hi all,
>
> Would it be possible to get an Ack from a KASAN maintainter? mpe is
> happy to take this through powerpc but would like an ack first.
>
> Regards,
> Daniel
>
> >> Currently bitops-instrumented.h assumes that the architecture provides
> >> atomic, non-atomic and locking bitops (e.g. both set_bit and __set_bit).
> >> This is true on x86 and s390, but is not always true: there is a
> >> generic bitops/non-atomic.h header that provides generic non-atomic
> >> operations, and also a generic bitops/lock.h for locking operations.
> >>
> >> powerpc uses the generic non-atomic version, so it does not have its
> >> own e.g. __set_bit that could be renamed arch___set_bit.
> >>
> >> Split up bitops-instrumented.h to mirror the atomic/non-atomic/lock
> >> split. This allows arches to only include the headers where they
> >> have arch-specific versions to rename. Update x86 and s390.
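The wrapper pattern in those instrumented-*.h headers is small; roughly, in its 2019-era kasan_check_write() form, the atomic variant looks like this (shape only):

  #include <linux/bits.h>
  #include <linux/kasan-checks.h>

  /*
   * Check the accessed word with KASAN, then defer to the arch_
   * implementation that the architecture renamed.
   */
  static inline void set_bit(long nr, volatile unsigned long *addr)
  {
          kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
          arch_set_bit(nr, addr);
  }
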
> >
> > This patch should not cause any functional change on either arch.
> >
> > To verify, I have compiled kernels with and without these. With the
> > appropriate setting of environment variables and the general assorted
> > mucking around required for reproducible builds, I have tested:
> >
> >  - s390, without kasan - byte-for-byte identical vmlinux before and after
> >  - x86,  without kasan - byte-for-byte identical vmlinux before and after
> >
> >  - s390, inline kasan  - byte-for-byte identical vmlinux before and after
> >
> >  - x86,  inline kasan  - 3 functions in drivers/rtc/dev.o are reordered,
> >  build-id and __ex_table differ, rest is unchanged
> >
> > The kernels were based on defconfigs. I disabled debug info (as that
> > obviously changes with code being rearranged) and initrd support (as the
> > cpio wrapper doesn't seem to take KBUILD_BUILD_TIMESTAMP but the current
> > time, and that screws things up).
> >
> > I wouldn't read too much in to the weird result on x86 with inline
> > kasan: the code I moved about is compiled even without KASAN enabled.
> >
> > Regards,
> > Daniel
> >
> >
> >>
> >> (The generic operations are automatically instrumented because they're
> >> written in C, not asm.)
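In other words, the generic non-atomic versions that powerpc already uses are plain C accesses which the compiler's KASAN instrumentation covers directly, roughly:

  #include <linux/bits.h>

  /* Generic form, as in include/asm-generic/bitops/non-atomic.h (roughly). */
  static inline void __set_bit(int nr, volatile unsigned long *addr)
  {
          unsigned long mask = BIT_MASK(nr);
          unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);

          *p |= mask;
  }
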
> >>
> >> Suggested-by: Christophe Leroy 
> >> Reviewed-by: Christophe Leroy 
> >> Signed-off-by: Daniel Axtens 
> >> ---
> >>  Documentation/core-api/kernel-api.rst |  17 +-
> >>  arch/s390/include/asm/bitops.h|   4 +-
> >>  arch/x86/include/asm/bitops.h |   4 +-
> >>  include/asm-generic/bitops-instrumented.h | 263 --
> >>  .../asm-generic/bitops/instrumented-atomic.h  | 100 +++
> >>  .../asm-generic/bitops/instrumented-lock.h|  81 ++
> >>  .../bitops/instrumented-non-atomic.h  | 114 
> >>  7 files changed, 317 insertions(+), 266 deletions(-)
> >>  delete mode 100644 include/asm-generic/bitops-instrumented.h
> >>  create mode 100644 include/asm-generic/bitops/instrumented-atomic.h
> >>  create mode 100644 include/asm-generic/bitops/instrumented-lock.h
> >>  create mode 100644 include/asm-generic/bitops/instrumented-non-atomic.h
> >>
> >> diff --git a/Documentation/core-api/kernel-api.rst 
> >> b/Documentation/core-api/kernel-api.rst
> >> index 08af5caf036d..2e21248277e3 100644
> >> --- a/Documentation/core-api/kernel-api.rst
> >> +++ b/Documentation/core-api/kernel-api.rst
> >> @@ -54,7 +54,22 @@ The Linux kernel provides more basic utility functions.
> >>  Bit Operations
> >>  --
> >>
> >> -.. kernel-doc:: include/asm-generic/bitops-instrumented.h
> >> +Atomic Operations
> >> +~
> >> +
> >> +.. kernel-doc:: include/asm-generic/bitops/instrumented-atomic.h
> >> +   :internal:
> >> +
> >> +Non-atomic Operations
> >> +~
> >> +
> >> +.. kernel-doc:: include/asm-generic/bitops/instrumented-non-atomic.h
> >> +   :internal:
> >> +
> >> +Locking Operations
> >> +~~
> >> +
> >> +.. kernel-doc:: include/asm-generic/bitops/instrumented-lock.h
> >> :internal:
> >>
> >>  Bitmap Operations
> >> diff --git a/arch/s390/include/asm/bitops.h 
> >> b/arch/s390/include/asm/bitops.h
> >> index b8833ac983fa..0ceb12593a68 100644
> >> --- a/arch/s390/include/asm/bitops.h
> >> +++ b/arch/s390/include/asm/bitops.h
> >> @@ -241,7 +241,9 @@ static inline void arch___clear_bit_unlock(unsigned 
> >> long nr,
> >>  arch___clear_bit(nr, ptr);
> >>  }
> >>
> >> -#include <asm-generic/bitops-instrumented.h>
> >> +#include <asm-generic/bitops/instrumented-atomic.h>
> >> +#include <asm-generic/bitops/instrumented-lock.h>
> >> +#include <asm-generic/bitops/instrumented-non-atomic.h>
> >>
> >>  /*
> >>   * Functions which use MSB0 bit numbering.
> >> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> >> index ba15d53c1ca7..4a2e2432238f 100644
> >> --- a/arch/x86/include/asm/bitops.h
> >> +++ b/arch/x86/include/asm/bitops.h
> >> @@ -389,7 +389,9 @@ static __always_inline int fls64(__u64 x)
> >>
> >>  #include 
> >>
> >> -#include <asm-generic/bitops-instrumented.h>
> >> +#include <asm-generic/bitops/instrumented-atomic.h>
> >> +#include <asm-generic/bitops/instrumented-lock.h>
> >> +#include <asm-generic/bitops/instrumented-non-atomic.h>
> >>
> >>  #include 
> >>
> >> diff --git