Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-04-08 Thread syzbot
syzbot suspects this issue was fixed by commit:

commit f4e61f0c9add3b00bd5f2df3c814d688849b8707
Author: Wanpeng Li 
Date:   Mon Mar 15 06:55:28 2021 +

x86/kvm: Fix broken irq restoration in kvm_wait

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1022d7aad0
start commit:   144c79ef Merge tag 'perf-tools-fixes-for-v5.12-2020-03-07'..
git tree:   upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=db9c6adb4986f2f2
dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=167574dad0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c8f566d0

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: x86/kvm: Fix broken irq restoration in kvm_wait

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-10 Thread syzbot
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any 
issue:

Reported-and-tested-by: syzbot+ac39856cb1b332dbb...@syzkaller.appspotmail.com

Tested on:

commit: 7d41e854 io_uring: remove indirect ctx into sqo injection
git tree:   git://git.kernel.dk/linux-block io_uring-5.12
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3c6cab008c50864
dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
compiler:   

Note: testing is done by a robot and is best-effort only.


Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-10 Thread Jens Axboe
#syz test: git://git.kernel.dk/linux-block io_uring-5.12

-- 
Jens Axboe



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-10 Thread Jens Axboe
On 3/10/21 6:40 AM, Pavel Begunkov wrote:
> On 10/03/2021 04:10, Hillf Danton wrote:
>> Fix 05ff6c4a0e07 ("io_uring: SQPOLL parking fixes") in the current tree
>> by removing the extra set of IO_SQ_THREAD_SHOULD_STOP in response to
>> the arrival of urgent signal because it misleads io_sq_thread_stop(),
>> though a followup cleanup should go there.
> 
> That's actually reasonable, just like
> 8bff1bf8abeda ("io_uring: fix io_sq_offload_create error handling")
> 
> Are you going to send a patch?

Agree - Hillf, do you mind if I just fold this one in?

-- 
Jens Axboe



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-10 Thread Pavel Begunkov
On 10/03/2021 04:10, Hillf Danton wrote:
> Fix 05ff6c4a0e07 ("io_uring: SQPOLL parking fixes") in the current tree
> by removing the extra set of IO_SQ_THREAD_SHOULD_STOP in response to
> the arrival of urgent signal because it misleads io_sq_thread_stop(),
> though a followup cleanup should go there.

That's actually reasonable, just like
8bff1bf8abeda ("io_uring: fix io_sq_offload_create error handling")

Are you going to send a patch?

> 
> --- x/fs/io_uring.c
> +++ y/fs/io_uring.c
> @@ -6689,10 +6689,8 @@ static int io_sq_thread(void *data)
>  			io_sqd_init_new(sqd);
>  			timeout = jiffies + sqd->sq_thread_idle;
>  		}
> -		if (fatal_signal_pending(current)) {
> -			set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
> +		if (fatal_signal_pending(current))
>  			break;
> -		}
>  		sqt_spin = false;
>  		cap_entries = !list_is_singular(&sqd->ctx_list);
>  		list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> 
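
To make the reasoning behind this change easier to follow, here is a toy,
single-threaded C model. It is only a sketch: struct sq_model, model_stop()
and model_fatal_signal() are hypothetical names, and the early return in
model_stop() is an assumption about how an already-set
IO_SQ_THREAD_SHOULD_STOP could mislead io_sq_thread_stop(). It is not the
kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SQPOLL state; not the kernel's io_sq_data. */
struct sq_model {
	bool should_stop;	/* models IO_SQ_THREAD_SHOULD_STOP */
};

/*
 * Assumed shape of the stop path: an already-set flag is read as
 * "someone else is stopping the thread", so the caller returns early.
 * Returns true only when the caller would go on to wait for thread exit.
 */
static bool model_stop(struct sq_model *sqd)
{
	if (sqd->should_stop)
		return false;	/* skipped the wait */
	sqd->should_stop = true;
	return true;		/* would wait for the thread to exit */
}

/*
 * The thread's reaction to a fatal signal. With set_flag == true it
 * mirrors the removed lines of the diff; with false, the proposed code.
 */
static void model_fatal_signal(struct sq_model *sqd, bool set_flag)
{
	if (set_flag)
		sqd->should_stop = true;
	/* either way the thread leaves its main loop here */
}
```

In this model, model_fatal_signal(&sqd, true) followed by model_stop(&sqd)
returns false: the stopper does not wait, so it could go on to free the data
while the thread is still running. That would be consistent with the KASAN
use-after-free reports in this thread. With set_flag == false, model_stop()
returns true and the wait happens.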

-- 
Pavel Begunkov


Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread syzbot
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an 
issue:
KASAN: use-after-free Read in io_sq_thread

==
BUG: KASAN: use-after-free in __lock_acquire+0x3e6f/0x54c0 
kernel/locking/lockdep.c:4770
Read of size 8 at addr 88801d418c78 by task iou-sqp-10269/10271

CPU: 1 PID: 10271 Comm: iou-sqp-10269 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
 __kasan_report mm/kasan/report.c:399 [inline]
 kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
 __lock_acquire+0x3e6f/0x54c0 kernel/locking/lockdep.c:4770
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
 down_write+0x92/0x150 kernel/locking/rwsem.c:1406
 io_sq_thread+0x1220/0x1b10 fs/io_uring.c:6754
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Allocated by task 10269:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:427 [inline]
 kasan_kmalloc mm/kasan/common.c:506 [inline]
 kasan_kmalloc mm/kasan/common.c:465 [inline]
 __kasan_kmalloc+0x99/0xc0 mm/kasan/common.c:515
 kmalloc include/linux/slab.h:554 [inline]
 kzalloc include/linux/slab.h:684 [inline]
 io_get_sq_data fs/io_uring.c:7153 [inline]
 io_sq_offload_create fs/io_uring.c:7827 [inline]
 io_uring_create fs/io_uring.c:9443 [inline]
 io_uring_setup+0x154b/0x2940 fs/io_uring.c:9523
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 9:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357
 kasan_slab_free mm/kasan/common.c:360 [inline]
 kasan_slab_free mm/kasan/common.c:325 [inline]
 __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367
 kasan_slab_free include/linux/kasan.h:199 [inline]
 slab_free_hook mm/slub.c:1562 [inline]
 slab_free_freelist_hook+0x92/0x210 mm/slub.c:1600
 slab_free mm/slub.c:3161 [inline]
 kfree+0xe5/0x7f0 mm/slub.c:4213
 io_put_sq_data fs/io_uring.c:7095 [inline]
 io_sq_thread_finish+0x48e/0x5b0 fs/io_uring.c:7113
 io_ring_ctx_free fs/io_uring.c:8355 [inline]
 io_ring_exit_work+0x333/0xcf0 fs/io_uring.c:8525
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

The buggy address belongs to the object at 88801d418c00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 120 bytes inside of
 512-byte region [88801d418c00, 88801d418e00)
The buggy address belongs to the page:
page:311e6f59 refcount:1 mapcount:0 mapping: index:0x0 
pfn:0x1d418
head:311e6f59 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff0010200(slab|head)
raw: 00fff0010200 dead0100 dead0122 88800fc41c80
raw:  00100010 0001 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 88801d418b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 88801d418b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>88801d418c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
 88801d418c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 88801d418d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==


Tested on:

commit: dc5c40fb io_uring: always wait for sqd exited when stoppin..
git tree:   git://git.kernel.dk/linux-block io_uring-5.12
console output: https://syzkaller.appspot.com/x/log.txt?x=111d175cd0
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3c6cab008c50864
dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
compiler:   



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread syzbot
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an 
issue:
KASAN: use-after-free Read in io_sq_thread

==
BUG: KASAN: use-after-free in __lock_acquire+0x3e6f/0x54c0 
kernel/locking/lockdep.c:4770
Read of size 8 at addr 888023e47c78 by task iou-sqp-10156/10158

CPU: 0 PID: 10158 Comm: iou-sqp-10156 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
 __kasan_report mm/kasan/report.c:399 [inline]
 kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
 __lock_acquire+0x3e6f/0x54c0 kernel/locking/lockdep.c:4770
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
 down_write+0x92/0x150 kernel/locking/rwsem.c:1406
 io_sq_thread+0x1220/0x1b10 fs/io_uring.c:6754
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Allocated by task 10156:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:427 [inline]
 kasan_kmalloc mm/kasan/common.c:506 [inline]
 kasan_kmalloc mm/kasan/common.c:465 [inline]
 __kasan_kmalloc+0x99/0xc0 mm/kasan/common.c:515
 kmalloc include/linux/slab.h:554 [inline]
 kzalloc include/linux/slab.h:684 [inline]
 io_get_sq_data fs/io_uring.c:7153 [inline]
 io_sq_offload_create fs/io_uring.c:7827 [inline]
 io_uring_create fs/io_uring.c:9443 [inline]
 io_uring_setup+0x154b/0x2940 fs/io_uring.c:9523
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 3392:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357
 kasan_slab_free mm/kasan/common.c:360 [inline]
 kasan_slab_free mm/kasan/common.c:325 [inline]
 __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367
 kasan_slab_free include/linux/kasan.h:199 [inline]
 slab_free_hook mm/slub.c:1562 [inline]
 slab_free_freelist_hook+0x92/0x210 mm/slub.c:1600
 slab_free mm/slub.c:3161 [inline]
 kfree+0xe5/0x7f0 mm/slub.c:4213
 io_put_sq_data fs/io_uring.c:7095 [inline]
 io_sq_thread_finish+0x48e/0x5b0 fs/io_uring.c:7113
 io_ring_ctx_free fs/io_uring.c:8355 [inline]
 io_ring_exit_work+0x333/0xcf0 fs/io_uring.c:8525
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

The buggy address belongs to the object at 888023e47c00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 120 bytes inside of
 512-byte region [888023e47c00, 888023e47e00)
The buggy address belongs to the page:
page:200f7571 refcount:1 mapcount:0 mapping: 
index:0x888023e47400 pfn:0x23e44
head:200f7571 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff0010200(slab|head)
raw: 00fff0010200 ea5f6908 ea527508 88800fc41c80
raw: 888023e47400 001f 0001 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 888023e47b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 888023e47b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>888023e47c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
 888023e47c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 888023e47d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==


Tested on:

commit: dc5c40fb io_uring: always wait for sqd exited when stoppin..
git tree:   git://git.kernel.dk/linux-block io_uring-5.12
console output: https://syzkaller.appspot.com/x/log.txt?x=16cd022cd0
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3c6cab008c50864
dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
compiler:   



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread Jens Axboe
#syz test: git://git.kernel.dk/linux-block io_uring-5.12

-- 
Jens Axboe



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread Jens Axboe
#syz test: git://git.kernel.dk/linux-block io_uring-5.12

-- 
Jens Axboe



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread syzbot
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an 
issue:
KASAN: use-after-free Read in io_sq_thread

==
BUG: KASAN: use-after-free in __lock_acquire+0x3e6f/0x54c0 
kernel/locking/lockdep.c:4770
Read of size 8 at addr 888034cbfc78 by task iou-sqp-10518/10523

CPU: 0 PID: 10523 Comm: iou-sqp-10518 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
 __kasan_report mm/kasan/report.c:399 [inline]
 kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
 __lock_acquire+0x3e6f/0x54c0 kernel/locking/lockdep.c:4770
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
 down_write+0x92/0x150 kernel/locking/rwsem.c:1406
 io_sq_thread+0x1220/0x1b10 fs/io_uring.c:6754
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Allocated by task 10518:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:427 [inline]
 kasan_kmalloc mm/kasan/common.c:506 [inline]
 kasan_kmalloc mm/kasan/common.c:465 [inline]
 __kasan_kmalloc+0x99/0xc0 mm/kasan/common.c:515
 kmalloc include/linux/slab.h:554 [inline]
 kzalloc include/linux/slab.h:684 [inline]
 io_get_sq_data fs/io_uring.c:7156 [inline]
 io_sq_offload_create fs/io_uring.c:7830 [inline]
 io_uring_create fs/io_uring.c:9443 [inline]
 io_uring_setup+0x1552/0x2860 fs/io_uring.c:9523
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 396:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357
 kasan_slab_free mm/kasan/common.c:360 [inline]
 kasan_slab_free mm/kasan/common.c:325 [inline]
 __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367
 kasan_slab_free include/linux/kasan.h:199 [inline]
 slab_free_hook mm/slub.c:1562 [inline]
 slab_free_freelist_hook+0x92/0x210 mm/slub.c:1600
 slab_free mm/slub.c:3161 [inline]
 kfree+0xe5/0x7f0 mm/slub.c:4213
 io_put_sq_data fs/io_uring.c:7098 [inline]
 io_sq_thread_finish+0x4b0/0x5f0 fs/io_uring.c:7116
 io_ring_ctx_free fs/io_uring.c:8355 [inline]
 io_ring_exit_work+0x333/0xcf0 fs/io_uring.c:8525
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

The buggy address belongs to the object at 888034cbfc00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 120 bytes inside of
 512-byte region [888034cbfc00, 888034cbfe00)
The buggy address belongs to the page:
page:4a1f04c4 refcount:1 mapcount:0 mapping: index:0x0 
pfn:0x34cbc
head:4a1f04c4 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff0010200(slab|head)
raw: 00fff0010200 dead0100 dead0122 88800fc41c80
raw:  00100010 0001 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 888034cbfb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 888034cbfb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>888034cbfc00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
 888034cbfc80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 888034cbfd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==


Tested on:

commit: 8bf06ba6 io_uring: remove unneeded variable 'ret'
git tree:   git://git.kernel.dk/linux-block io_uring-5.12
console output: https://syzkaller.appspot.com/x/log.txt?x=13fcd952d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3c6cab008c50864
dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
compiler:   



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread Jens Axboe
On 3/9/21 7:04 AM, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:144c79ef Merge tag 'perf-tools-fixes-for-v5.12-2020-03-07'..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=129addbcd0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=db9c6adb4986f2f2
> dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=167574dad0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c8f566d0

#syz test: git://git.kernel.dk/linux-block io_uring-5.12

-- 
Jens Axboe



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-09 Thread syzbot
syzbot has found a reproducer for the following issue on:

HEAD commit:144c79ef Merge tag 'perf-tools-fixes-for-v5.12-2020-03-07'..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=129addbcd0
kernel config:  https://syzkaller.appspot.com/x/.config?x=db9c6adb4986f2f2
dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=167574dad0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c8f566d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ac39856cb1b332dbb...@syzkaller.appspotmail.com


WARNING: possible recursive locking detected
5.12.0-rc2-syzkaller #0 Not tainted

kworker/u4:7/8696 is trying to acquire lock:
888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_stop fs/io_uring.c:7099 [inline]
888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_put_sq_data fs/io_uring.c:7115 [inline]
888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_finish+0x408/0x650 fs/io_uring.c:7139

but task is already holding lock:
888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park fs/io_uring.c:7088 [inline]
888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park+0x63/0xc0 fs/io_uring.c:7082

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&sqd->lock);
  lock(&sqd->lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/u4:7/8696:
 #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2246
 #1: c9000253fda8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2250
 #2: 888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park fs/io_uring.c:7088 [inline]
 #2: 888015395870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park+0x63/0xc0 fs/io_uring.c:7082

stack backtrace:
CPU: 0 PID: 8696 Comm: kworker/u4:7 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: events_unbound io_ring_exit_work
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
 __mutex_lock_common kernel/locking/mutex.c:946 [inline]
 __mutex_lock+0x139/0x1120 kernel/locking/mutex.c:1093
 io_sq_thread_stop fs/io_uring.c:7099 [inline]
 io_put_sq_data fs/io_uring.c:7115 [inline]
 io_sq_thread_finish+0x408/0x650 fs/io_uring.c:7139
 io_ring_ctx_free fs/io_uring.c:8408 [inline]
 io_ring_exit_work+0x82/0x9a0 fs/io_uring.c:8539
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
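
The report above says that a task is taking a lock of a class it already
holds, and hints at "missing lock nesting notation". A much simplified model
of that kind of check might look like the sketch below. The names
(task_model, held_lock_model, model_acquire()) are hypothetical and none of
lockdep's real data structures are used.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* A held lock is identified by its class and subclass, not its address. */
struct held_lock_model {
	int class_id;
	int subclass;
};

struct task_model {
	struct held_lock_model held[MAX_HELD];
	size_t depth;
};

/*
 * Returns true if acquiring (class_id, subclass) would be reported as
 * possible recursive locking, i.e. the task already holds a lock of the
 * same class and subclass. Otherwise records the lock as held.
 */
static bool model_acquire(struct task_model *t, int class_id, int subclass)
{
	for (size_t i = 0; i < t->depth; i++) {
		if (t->held[i].class_id == class_id &&
		    t->held[i].subclass == subclass)
			return true;
	}
	if (t->depth < MAX_HELD) {
		t->held[t->depth].class_id = class_id;
		t->held[t->depth].subclass = subclass;
		t->depth++;
	}
	return false;
}
```

In this model a second acquisition with the same class and subclass is
flagged, while one with a different subclass is not. In the kernel that is
the role played by a nesting annotation such as mutex_lock_nested().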



Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-07 Thread Pavel Begunkov
On 07/03/2021 12:39, Pavel Begunkov wrote:
> On 07/03/2021 09:49, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:a38fd874 Linux 5.12-rc2
>> git tree:   upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=143ee02ad0
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=db9c6adb4986f2f2
>> dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+ac39856cb1b332dbb...@syzkaller.appspotmail.com
> 
> Legit error, park() might take an sqd lock, and then we take it again.
> I'll patch it up

I was wrong, it looks fine, io_put_sq_data() and io_sq_thread_park()
don't nest. I wonder if that's a false positive due to conditional
locking as below

	if (sqd->thread == current)
		return;
	mutex_lock(&sqd->lock);
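
A userspace sketch of that conditional-locking shape, for illustration only.
The names sq_model, model_park() and model_unpark() are made up, task
identity is reduced to an integer, and the mutex is reduced to a flag:

```c
#include <stdbool.h>

/* Hypothetical model: task ids are plain ints, the mutex is a flag. */
struct sq_model {
	int thread_id;	/* id of the SQPOLL thread */
	bool locked;	/* stands in for the sqd mutex */
};

/*
 * Take the lock unless the caller is the SQPOLL thread itself.
 * Returns true if the lock was taken by this call.
 */
static bool model_park(struct sq_model *sqd, int current_id)
{
	if (sqd->thread_id == current_id)
		return false;	/* conditional path: no lock taken */
	sqd->locked = true;
	return true;
}

/* Release the lock under the same condition that park used to take it. */
static void model_unpark(struct sq_model *sqd, int current_id)
{
	if (sqd->thread_id == current_id)
		return;
	sqd->locked = false;
}
```

Whether the lock is held after model_park() returns depends on who the
caller is, so the matching release has to test the same condition.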

> 
>>
>> 
>> WARNING: possible recursive locking detected
>> 5.12.0-rc2-syzkaller #0 Not tainted
>> 
>> kworker/u4:7/7615 is trying to acquire lock:
>> 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_stop fs/io_uring.c:7099 [inline]
>> 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_put_sq_data fs/io_uring.c:7115 [inline]
>> 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_finish+0x408/0x650 fs/io_uring.c:7139
>>
>> but task is already holding lock:
>> 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park fs/io_uring.c:7088 [inline]
>> 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park+0x63/0xc0 fs/io_uring.c:7082
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>        ----
>>   lock(&sqd->lock);
>>   lock(&sqd->lock);
>>
>>  *** DEADLOCK ***
>>
>>  May be due to missing lock nesting notation
>>
>> 3 locks held by kworker/u4:7/7615:
>>  #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
>>  #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
>>  #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
>>  #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>>  #0: 888010469138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2246
>>  #1: c900023a7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2250
>>  #2: 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park fs/io_uring.c:7088 [inline]
>>  #2: 888144a02870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park+0x63/0xc0 fs/io_uring.c:7082
>>
>> stack backtrace:
>> CPU: 1 PID: 7615 Comm: kworker/u4:7 Not tainted 5.12.0-rc2-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>> Google 01/01/2011
>> Workqueue: events_unbound io_ring_exit_work
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:79 [inline]
>>  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
>>  print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
>>  check_deadlock kernel/locking/lockdep.c:2872 [inline]
>>  validate_chain kernel/locking/lockdep.c:3661 [inline]
>>  __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
>>  lock_acquire kernel/locking/lockdep.c:5510 [inline]
>>  lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
>>  __mutex_lock_common kernel/locking/mutex.c:946 [inline]
>>  __mutex_lock+0x139/0x1120 kernel/locking/mutex.c:1093
>>  io_sq_thread_stop fs/io_uring.c:7099 [inline]
>>  io_put_sq_data fs/io_uring.c:7115 [inline]
>>  io_sq_thread_finish+0x408/0x650 fs/io_uring.c:7139
>>  io_ring_ctx_free fs/io_uring.c:8408 [inline]
>>  io_ring_exit_work+0x82/0x9a0 fs/io_uring.c:8539
>>  process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
>>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>>
>>
>> ---
>> This report is generated by a bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for more information about syzbot.
>> syzbot engineers can be reached at syzkal...@googlegroups.com.
>>
>> syzbot will keep track of this issue. See:
>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>
> 

-- 
Pavel Begunkov


Re: [syzbot] possible deadlock in io_sq_thread_finish

2021-03-07 Thread Pavel Begunkov
On 07/03/2021 09:49, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:a38fd874 Linux 5.12-rc2
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=143ee02ad0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=db9c6adb4986f2f2
> dashboard link: https://syzkaller.appspot.com/bug?extid=ac39856cb1b332dbbdda
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+ac39856cb1b332dbb...@syzkaller.appspotmail.com

Legit error, park() might take an sqd lock, and then we take it again.
I'll patch it up

-- 
Pavel Begunkov