Re: Race condition in ext4 (was Re: 4.11-rc1 acpi stomping ext4 slabs)

2017-03-08 Thread Nikolay Borisov


On  9.03.2017 03:58, Theodore Ts'o wrote:
> On Tue, Mar 07, 2017 at 10:40:53PM +0200, Nikolay Borisov wrote:
>> So this is wrong, the reason why the issues seemed fix is because I
>> switched my compiler to version 5.4.0. So this manifests only if I'm
>> using gcc 4.7.4. With the pr_info added here is the output of a boot. So
>> there are multiple invocations of ext4_ext_map_blocks and the freeing,
>> including with the address being used in subsequent kasan reports :
>> 88006ae8fdb0
> 
> Can you help bisect this, then?  I'm using Debian Testing, and the
> default gcc is gcc 6.3.0.  I'm currently forcing the use of gcc 5.4.1
> because I was running into problems with gcc 6.x a while back.  (TBH,
> I was thinking about trying to see if gcc 6.3 was stable for kernel
> compiles when I had some spare time.)  But I don't have access to
> *any* gcc 4.x on my development system, and I don't think I've tried
> using gcc 4.x in a long, Long, LONG time.
> 
> I'm currently kicking off a test run using 5.4.1 with KASAN enabled to
> see if I can trigger it myself.  Can you send me a copy of your
> .config so I can see what else might be interesting with your config?
> (e.g., SLAB vs SLUB, etc.)

Attached the config. After further debugging and talking with the KASAN
developers, I think this might actually be a KASAN problem when used with
an old compiler.  I bisected this all the way to 1771c6e1a567ea0ba2,
which is the commit introducing the user access instrumentation. Here is
a mail thread where I confirmed that this might be a KASAN issue:
https://lkml.org/lkml/2017/3/8/69

What I believe is happening is that the manual checks inserted in the user
access code miss some context information, because that instrumentation is
not inserted by the compiler. KASAN gets confused as a result, hence the
warnings.
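
As a sanity check on the false-positive theory, here is a rough Python sketch of how a KASAN-style shadow lookup decides whether a read is legal, applied to the numbers from the first report in this thread (object at 88006bc2b080, read of size 20 at 88006bc2b0ae, shadow row "00 00 00 00 00 00 00 00 05 fc ..."). It assumes the generic KASAN encoding: one shadow byte covers 8 bytes of memory, 0 means all 8 are addressable, 1..7 means only the first N are, and negative values mark redzones or freed memory. The function and variable names are made up for illustration, and the addresses are used exactly as they appear in the archived log.

```python
GRANULE = 8  # bytes of memory described by one shadow byte (assumed generic KASAN)


def to_signed(byte):
    """Interpret a shadow byte (0..255) as a signed 8-bit value."""
    return byte - 256 if byte >= 0x80 else byte


def range_is_addressable(shadow, offset, size):
    """Return True if every byte in [offset, offset + size) is addressable.

    shadow[i] describes bytes [8*i, 8*i + 8) of the region: 0 means all
    eight are valid, 1..7 means only the first N are valid, and a negative
    value means none of them are (redzone, freed, ...).
    """
    for pos in range(offset, offset + size):
        index = pos // GRANULE
        if index >= len(shadow):
            return False  # outside the shadow row we were given
        state = to_signed(shadow[index])
        if state < 0:
            return False  # poisoned granule
        if state != 0 and pos % GRANULE >= state:
            return False  # past the valid prefix of a partial granule
    return True


# Shadow row printed for the object in the report (hypothetical variable names).
row = bytes.fromhex("00 00 00 00 00 00 00 00 05 fc fc fc fc fc fc fc")
object_start = 0x88006bc2b080   # "Object at ..." as printed in the log
access_start = 0x88006bc2b0ae   # "at addr ..." as printed in the log
offset = access_start - object_start

print(offset, range_is_addressable(row, offset, 20))
```

If the shadow row in the report is accurate, the object has 8 * 8 + 5 = 69 addressable bytes and the read covers offsets 46 through 65, so the sketch returns True. That would be consistent with the report being a false positive rather than a real out-of-bounds read, but it only checks the printed numbers and does not reproduce what the kernel actually did.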


> 
> Thanks,
> 
>  - Ted
> 
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.7.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION="-nbor"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
# CONFIG_USELIB is not set
# CONFIG_AUDIT is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# 

Re: Race condition in ext4 (was Re: 4.11-rc1 acpi stomping ext4 slabs)

2017-03-08 Thread Theodore Ts'o
On Tue, Mar 07, 2017 at 10:40:53PM +0200, Nikolay Borisov wrote:
> So this is wrong, the reason why the issues seemed fix is because I
> switched my compiler to version 5.4.0. So this manifests only if I'm
> using gcc 4.7.4. With the pr_info added here is the output of a boot. So
> there are multiple invocations of ext4_ext_map_blocks and the freeing,
> including with the address being used in subsequent kasan reports :
> 88006ae8fdb0

Can you help bisect this, then?  I'm using Debian Testing, and the
default gcc is gcc 6.3.0.  I'm currently forcing the use of gcc 5.4.1
because I was running into problems with gcc 6.x a while back.  (TBH,
I was thinking about trying to see if gcc 6.3 was stable for kernel
compiles when I had some spare time.)  But I don't have access to
*any* gcc 4.x on my development system, and I don't think I've tried
using gcc 4.x in a long, Long, LONG time.

I'm currently kicking off a test run using 5.4.1 with KASAN enabled to
see if I can trigger it myself.  Can you send me a copy of your
.config so I can see what else might be interesting with your config?
(e.g., SLAB vs SLUB, etc.)
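
For comparing configs along these lines, a small helper can pull the relevant symbols out of a .config file. This is only a sketch: the function names are invented for illustration, and CONFIG_SLAB, CONFIG_SLUB and CONFIG_KASAN are simply the symbols that seem relevant to the question; none of them appears in the truncated config excerpt in this thread, so the sample input below uses lines that do.

```python
import re

# "CONFIG_FOO=y" style lines and "# CONFIG_FOO is not set" style lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map each CONFIG_ symbol to its value, or to None if explicitly unset."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2).strip('"')
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def summarize(options, names):
    """Report the chosen symbols, marking ones missing from the file."""
    return {name: options.get(name, "<absent>") for name in names}


sample = """\
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_XZ is not set
CONFIG_LOCALVERSION="-nbor"
"""

opts = parse_kconfig(sample)
print(summarize(opts, ["CONFIG_SLAB", "CONFIG_SLUB", "CONFIG_KASAN"]))
```

Run against the full .config instead of the sample string, the same summary would show which allocator and which KASAN settings were in effect for the failing build.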

Thanks,

   - Ted


Re: Race condition in ext4 (was Re: 4.11-rc1 acpi stomping ext4 slabs)

2017-03-07 Thread Nikolay Borisov


On  7.03.2017 16:33, Nikolay Borisov wrote:
> 
> 
> On  7.03.2017 11:38, Nikolay Borisov wrote:
>>
>>
>> On  7.03.2017 00:35, Rafael J. Wysocki wrote:
>>> On Mon, Mar 6, 2017 at 9:31 PM, Nikolay Borisov
>>>  wrote:
>>>> Hello,
>>>>
>>>> Booting 4.11-rc1 with kasan enabled and "slub_debug=F" produces the
>>>> following errors:
>>>>
>>>> [7.070797] ==
>>>> [7.071724] BUG: KASAN: slab-out-of-bounds in filldir+0xc3/0x160 at addr 88006bc2b0ae
>>>> [7.071724] Read of size 20 by task systemd/1
>>>> [7.071724] CPU: 1 PID: 1 Comm: systemd Not tainted 4.11.0-rc1-nbor #150
>>>> [7.071724] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>> [7.071724] Call Trace:
>>>> [7.071724]  dump_stack+0x85/0xc9
>>>> [7.071724]  kasan_object_err+0x2c/0x90
>>>> [7.071724]  kasan_report+0x285/0x510
>>>> [7.071724]  check_memory_region+0x137/0x160
>>>> [7.071724]  kasan_check_read+0x11/0x20
>>>> [7.071724]  filldir+0xc3/0x160
>>>> [7.071724]  call_filldir+0x88/0x140
>>>> [7.071724]  ext4_readdir+0x757/0x920
>>>> [7.071724]  ? iterate_dir+0x49/0x190
>>>> [7.071724]  iterate_dir+0x7d/0x190
>>>> [7.071724]  ? entry_SYSCALL_64_fastpath+0x5/0xc6
>>>> [7.071724]  SyS_getdents+0xac/0x170
>>>> [7.071724]  ? filldir64+0x170/0x170
>>>> [7.071724]  entry_SYSCALL_64_fastpath+0x23/0xc6
>>>> [7.071724] RIP: 0033:0x7fa37ca2dd3b
>>>> [7.071724] RSP: 002b:7ffc63daf400 EFLAGS: 0206 ORIG_RAX: 004e
>>>> [7.071724] RAX: ffda RBX: 0046 RCX: 7fa37ca2dd3b
>>>> [7.071724] RDX: 8000 RSI: 560b369e4a10 RDI: 0004
>>>> [7.071724] RBP: 7fa37cd29b20 R08: 7fa37cd29bd8 R09:
>>>> [7.071724] R10: 008f R11: 0206 R12: 8041
>>>> [7.071724] R13: 7fa37cd29b78 R14: 270f R15: 7fa37cd29b78
>>>> [7.071724] Object at 88006bc2b080, in cache kmalloc-96 size: 96
>>>> [7.071724] Allocated:
>>>> [7.071724] PID = 1
>>>> [7.071724]  save_stack_trace+0x1b/0x20
>>>> [7.071724]  kasan_kmalloc.part.4+0x64/0xf0
>>>> [7.071724]  kasan_kmalloc+0x85/0xb0
>>>> [7.071724]  __kmalloc+0x12b/0x320
>>>> [7.071724]  ext4_htree_store_dirent+0x3e/0x120
>>>> [7.071724]  htree_dirblock_to_tree+0xb9/0x1a0
>>>> [7.071724]  ext4_htree_fill_tree+0xa3/0x310
>>>> [7.071724]  ext4_readdir+0x6a9/0x920
>>>> [7.071724]  iterate_dir+0x7d/0x190
>>>> [7.071724]  SyS_getdents+0xac/0x170
>>>> [7.071724]  entry_SYSCALL_64_fastpath+0x23/0xc6
>>>> [7.071724] Freed:
>>>> [7.071724] PID = 1
>>>> [7.071724]  save_stack_trace+0x1b/0x20
>>>> [7.071724]  kasan_slab_free+0xbe/0x190
>>>> [7.071724]  kfree+0xff/0x2f0
>>>> [7.071724]  acpi_ut_evaluate_object+0x18e/0x19d
>>>> [7.071724]  acpi_ut_execute_STA+0x26/0x53
>>>> [7.071724]  acpi_ns_get_device_callback+0x73/0x163
>>>> [7.071724]  acpi_ns_walk_namespace+0xc0/0x17a
>>>> [7.071724]  acpi_get_devices+0x66/0x7d
>>>> [7.071724]  pnpacpi_init+0x52/0x74
>>>> [7.071724]  do_one_initcall+0x51/0x1b0
>>>> [7.071724]  kernel_init_freeable+0x20a/0x2a1
>>>> [7.071724]  kernel_init+0xe/0x100
>>>> [7.071724]  ret_from_fork+0x31/0x40
>>>> [7.071724] Memory state around the buggy address:
>>>> [7.071724]  88006bc2af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>> [7.071724]  88006bc2b000: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>>>> [7.071724] >88006bc2b080: 00 00 00 00 00 00 00 00 05 fc fc fc fc fc fc fc
>>>> [7.071724]^
>>>> [7.071724]  88006bc2b100: 00 00 00 00 00 00 00 00 00 04 fc fc fc fc fc fc
>>>> [7.071724]  88006bc2b180: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
>>>>
>>>> Not killing the VM instantly produces a continuous stream of kasan errors.
>>>> Most of them are identical to the one above, however there was one which
>>>> was different:
>>>>
>>>> [5.846193] ==
>>>> [5.846787] BUG: KASAN: slab-out-of-bounds in filldir+0xc3/0x160 at addr 88006c783eae
>>>> [5.847177] Read of size 22 by task systemd/1
>>>> [5.847177] CPU: 3 PID: 1 Comm: systemd Tainted: GB   4.11.0-rc1-nbor #150
>>>> [5.847177] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>>> [5.847177] Call Trace:
>>>> [5.847177]  dump_stack+0x85/0xc9
>>>> [5.847177]  kasan_object_err+0x2c/0x90
>>>> [5.847177]  kasan_report+0x285/0x510
>>>> [5.847177]  check_memory_region+0x137/0x160
>>>> [5.847177]

Race condition in ext4 (was Re: 4.11-rc1 acpi stomping ext4 slabs)

2017-03-07 Thread Nikolay Borisov


On  7.03.2017 11:38, Nikolay Borisov wrote:
> 
> 
> On  7.03.2017 00:35, Rafael J. Wysocki wrote:
>> On Mon, Mar 6, 2017 at 9:31 PM, Nikolay Borisov
>>  wrote:
>>> Hello,
>>>
>>> Booting 4.11-rc1 with kasan enabled and "slub_debug=F" produces the 
>>> following errors:
>>>
>>> [7.070797] ==
>>> [7.071724] BUG: KASAN: slab-out-of-bounds in filldir+0xc3/0x160 at addr 88006bc2b0ae
>>> [7.071724] Read of size 20 by task systemd/1
>>> [7.071724] CPU: 1 PID: 1 Comm: systemd Not tainted 4.11.0-rc1-nbor #150
>>> [7.071724] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> [7.071724] Call Trace:
>>> [7.071724]  dump_stack+0x85/0xc9
>>> [7.071724]  kasan_object_err+0x2c/0x90
>>> [7.071724]  kasan_report+0x285/0x510
>>> [7.071724]  check_memory_region+0x137/0x160
>>> [7.071724]  kasan_check_read+0x11/0x20
>>> [7.071724]  filldir+0xc3/0x160
>>> [7.071724]  call_filldir+0x88/0x140
>>> [7.071724]  ext4_readdir+0x757/0x920
>>> [7.071724]  ? iterate_dir+0x49/0x190
>>> [7.071724]  iterate_dir+0x7d/0x190
>>> [7.071724]  ? entry_SYSCALL_64_fastpath+0x5/0xc6
>>> [7.071724]  SyS_getdents+0xac/0x170
>>> [7.071724]  ? filldir64+0x170/0x170
>>> [7.071724]  entry_SYSCALL_64_fastpath+0x23/0xc6
>>> [7.071724] RIP: 0033:0x7fa37ca2dd3b
>>> [7.071724] RSP: 002b:7ffc63daf400 EFLAGS: 0206 ORIG_RAX: 004e
>>> [7.071724] RAX: ffda RBX: 0046 RCX: 7fa37ca2dd3b
>>> [7.071724] RDX: 8000 RSI: 560b369e4a10 RDI: 0004
>>> [7.071724] RBP: 7fa37cd29b20 R08: 7fa37cd29bd8 R09:
>>> [7.071724] R10: 008f R11: 0206 R12: 8041
>>> [7.071724] R13: 7fa37cd29b78 R14: 270f R15: 7fa37cd29b78
>>> [7.071724] Object at 88006bc2b080, in cache kmalloc-96 size: 96
>>> [7.071724] Allocated:
>>> [7.071724] PID = 1
>>> [7.071724]  save_stack_trace+0x1b/0x20
>>> [7.071724]  kasan_kmalloc.part.4+0x64/0xf0
>>> [7.071724]  kasan_kmalloc+0x85/0xb0
>>> [7.071724]  __kmalloc+0x12b/0x320
>>> [7.071724]  ext4_htree_store_dirent+0x3e/0x120
>>> [7.071724]  htree_dirblock_to_tree+0xb9/0x1a0
>>> [7.071724]  ext4_htree_fill_tree+0xa3/0x310
>>> [7.071724]  ext4_readdir+0x6a9/0x920
>>> [7.071724]  iterate_dir+0x7d/0x190
>>> [7.071724]  SyS_getdents+0xac/0x170
>>> [7.071724]  entry_SYSCALL_64_fastpath+0x23/0xc6
>>> [7.071724] Freed:
>>> [7.071724] PID = 1
>>> [7.071724]  save_stack_trace+0x1b/0x20
>>> [7.071724]  kasan_slab_free+0xbe/0x190
>>> [7.071724]  kfree+0xff/0x2f0
>>> [7.071724]  acpi_ut_evaluate_object+0x18e/0x19d
>>> [7.071724]  acpi_ut_execute_STA+0x26/0x53
>>> [7.071724]  acpi_ns_get_device_callback+0x73/0x163
>>> [7.071724]  acpi_ns_walk_namespace+0xc0/0x17a
>>> [7.071724]  acpi_get_devices+0x66/0x7d
>>> [7.071724]  pnpacpi_init+0x52/0x74
>>> [7.071724]  do_one_initcall+0x51/0x1b0
>>> [7.071724]  kernel_init_freeable+0x20a/0x2a1
>>> [7.071724]  kernel_init+0xe/0x100
>>> [7.071724]  ret_from_fork+0x31/0x40
>>> [7.071724] Memory state around the buggy address:
>>> [7.071724]  88006bc2af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> [7.071724]  88006bc2b000: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>>> [7.071724] >88006bc2b080: 00 00 00 00 00 00 00 00 05 fc fc fc fc fc fc fc
>>> [7.071724]^
>>> [7.071724]  88006bc2b100: 00 00 00 00 00 00 00 00 00 04 fc fc fc fc fc fc
>>> [7.071724]  88006bc2b180: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
>>>
>>> Not killing the VM instantly produces a continuous stream of kasan errors.
>>> Most of them are identical to the one above, however there was one which
>>> was different:
>>>
>>> [5.846193] ==
>>> [5.846787] BUG: KASAN: slab-out-of-bounds in filldir+0xc3/0x160 at addr 88006c783eae
>>> [5.847177] Read of size 22 by task systemd/1
>>> [5.847177] CPU: 3 PID: 1 Comm: systemd Tainted: GB   4.11.0-rc1-nbor #150
>>> [5.847177] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>>> [5.847177] Call Trace:
>>> [5.847177]  dump_stack+0x85/0xc9
>>> [5.847177]  kasan_object_err+0x2c/0x90
>>> [5.847177]  kasan_report+0x285/0x510
>>> [5.847177]  check_memory_region+0x137/0x160
>>> [5.847177]  kasan_check_read+0x11/0x20
>>> [5.847177]  filldir+0xc3/0x160
>>> [5.847177]  call_filldir+0x88/0x140
>>> [5.847177]  ext4_readdir+0x757/0x920
>>> [5.847177]  ?
