On Mon, Jul 01, 2019 at 05:14:12PM -0700, Andrew Morton wrote: > On Mon, 01 Jul 2019 01:27:04 -0700 syzbot > <[email protected]> wrote: > > > Hello, > > > > syzbot found the following crash on: > > At a guess I'd say that perf_mmap() hit a deadlock on event->mmap_mutex > while holding down_write(mmap_sem) (via vm_mmap_pgoff). The > down_read(mmap_sem) in do_exit() happened to stumble across this and > that's what got reported.
lockdep never reported that and I don't see event->mmap_mutex being held anywhere. AFAICT CPU0 is running 8355 and only 'has' mmap_sem -- it's blocked waiting to acquire. CPU1 is running 8354 and has mmap_sem and is waiting to acquire event->mmap_mutex. But nobody is actually owning it We take mmap_mutex in: perf_mmap() - called with mmap_sem held perf_mmap_close() - called with mmap_sem held _free_event() - no faults/mmap while holding it perf_poll() - idem perf_event_set_output() - idem I don't see any of those functions in the below stacktrace, and having just looked them over, I don't see how they would end up trying to acquire mmap_sem and AB-BA. Now, clearly there's something screwy, but I'm not seeing a deadlock. Let me go play with that reproducer. > > HEAD commit: 249155c2 Merge branch 'parisc-5.2-4' of git://git.kernel.o.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=1306be61a00000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=9a31528e58cc12e2 > > dashboard link: https://syzkaller.appspot.com/bug?extid=8cc1843d4eec9c0dfb35 > > compiler: clang version 9.0.0 (/home/glider/llvm/clang > > 80fee25776c2fb61e74c1ecb1a523375c2500b69) > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14a85379a00000 > > > > Bisection is inconclusive: the bug happens on the oldest tested release. > > > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=119f6249a00000 > > final crash: https://syzkaller.appspot.com/x/report.txt?x=139f6249a00000 > > console output: https://syzkaller.appspot.com/x/log.txt?x=159f6249a00000 > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > Reported-by: [email protected] > > > > INFO: task syz-executor.0:8352 blocked for more than 143 seconds. > > Not tainted 5.2.0-rc6+ #7 > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > syz-executor.0 D24576 8352 8340 0x80004000 > > Call Trace: > > context_switch kernel/sched/core.c:2818 [inline] > > __schedule+0x658/0x9e0 kernel/sched/core.c:3445 > > schedule+0x131/0x1d0 kernel/sched/core.c:3509 > > __rwsem_down_read_failed_common+0x345/0x790 > > kernel/locking/rwsem-xadd.c:495 > > rwsem_down_read_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:515 > > __down_read+0x72/0x1a0 kernel/locking/rwsem.h:178 > > down_read+0x45/0x50 kernel/locking/rwsem.c:26 > > exit_mm+0xdb/0x630 kernel/exit.c:513 > > do_exit+0x5c3/0x2300 kernel/exit.c:864 > > do_group_exit+0x15c/0x2a0 kernel/exit.c:981 > > __do_sys_exit_group+0x17/0x20 kernel/exit.c:992 > > __se_sys_exit_group+0x14/0x20 kernel/exit.c:990 > > __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:990 > > do_syscall_64+0xfe/0x140 arch/x86/entry/common.c:301 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x459519 > > Code: Bad RIP value. > > RSP: 002b:00007ffdb1483048 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 > > RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000459519 > > RDX: 0000000000413201 RSI: fffffffffffffff7 RDI: 0000000000000000 > > RBP: 0000000000000000 R08: ffffffffffffffff R09: 00007ffdb14830a0 > > R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000001 > > R13: 00007ffdb14830a0 R14: 0000000000000000 R15: 00007ffdb14830b0 > > INFO: task syz-executor.0:8355 blocked for more than 143 seconds. > > Not tainted 5.2.0-rc6+ #7 > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > > syz-executor.0 D25512 8355 8340 0x80004000 > > Call Trace: > > context_switch kernel/sched/core.c:2818 [inline] > > __schedule+0x658/0x9e0 kernel/sched/core.c:3445 > > schedule+0x131/0x1d0 kernel/sched/core.c:3509 > > __rwsem_down_read_failed_common+0x345/0x790 > > kernel/locking/rwsem-xadd.c:495 > > rwsem_down_read_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:515 > > __down_read+0x72/0x1a0 kernel/locking/rwsem.h:178 > > down_read+0x45/0x50 kernel/locking/rwsem.c:26 > > exit_mm+0xdb/0x630 kernel/exit.c:513 > > do_exit+0x5c3/0x2300 kernel/exit.c:864 > > do_group_exit+0x15c/0x2a0 kernel/exit.c:981 > > get_signal+0x6df/0x21f0 kernel/signal.c:2640 > > do_signal+0x7b/0x750 arch/x86/kernel/signal.c:815 > > exit_to_usermode_loop arch/x86/entry/common.c:164 [inline] > > prepare_exit_to_usermode+0x2f5/0x4f0 arch/x86/entry/common.c:199 > > syscall_return_slowpath+0x110/0x440 arch/x86/entry/common.c:279 > > do_syscall_64+0x126/0x140 arch/x86/entry/common.c:304 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x459519 > > Code: Bad RIP value. > > RSP: 002b:00007f33b051bcf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca > > RAX: fffffffffffffe00 RBX: 000000000075bfd0 RCX: 0000000000459519 > > RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000075bfd0 > > RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000246 R12: 000000000075bfd4 > > R13: 00007ffdb1482e3f R14: 00007f33b051c9c0 R15: 000000000075bfd4 > > > > Showing all locks held in the system: > > 1 lock held by khungtaskd/1043: > > #0: 000000004a05a158 (rcu_read_lock){....}, at: rcu_lock_acquire+0x4/0x30 > > > > include/linux/rcupdate.h:207 > > 1 lock held by rsyslogd/7880: > > #0: 00000000946eafbf (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x243/0x2e0 > > fs/file.c:801 > > 2 locks held by getty/7970: > > #0: 00000000837c576c (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 000000008890c3b0 (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 2 locks held by getty/7971: > > #0: 00000000b66a4c98 (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 00000000d588511a (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 2 locks held by getty/7972: > > #0: 00000000881b5f61 (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 00000000343a7af4 (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 2 locks held by getty/7973: > > #0: 000000009862f21e (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 0000000033c60fb7 (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 2 locks held by getty/7974: > > #0: 00000000b1abdc0b (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 0000000040853254 (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 2 locks held by getty/7975: > > #0: 00000000fff02dba (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 0000000036cfe603 (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 2 locks held by getty/7976: > > #0: 0000000059ae43cb (&tty->ldisc_sem){++++}, at: > > tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272 > > #1: 0000000032d54919 (&ldata->atomic_read_lock){+.+.}, at: > > n_tty_read+0x2ee/0x1c80 drivers/tty/n_tty.c:2156 > > 1 lock held by syz-executor.0/8352: > > #0: 0000000049c5e979 (&mm->mmap_sem#2){++++}, at: exit_mm+0xdb/0x630 > > kernel/exit.c:513 > > 2 locks held by syz-executor.0/8354: > > 1 lock held by syz-executor.0/8355: > > #0: 0000000049c5e979 (&mm->mmap_sem#2){++++}, at: exit_mm+0xdb/0x630 > > kernel/exit.c:513 > > > > ============================================= > > > > NMI backtrace for cpu 1 > > CPU: 1 PID: 1043 Comm: khungtaskd Not tainted 5.2.0-rc6+ #7 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:77 [inline] > > dump_stack+0x1d8/0x2f8 lib/dump_stack.c:113 > > nmi_cpu_backtrace+0x89/0x160 lib/nmi_backtrace.c:101 > > nmi_trigger_cpumask_backtrace+0x125/0x230 lib/nmi_backtrace.c:62 > > arch_trigger_cpumask_backtrace+0x10/0x20 arch/x86/kernel/apic/hw_nmi.c:38 > > trigger_all_cpu_backtrace+0x17/0x20 include/linux/nmi.h:146 > > check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline] > > watchdog+0xbb9/0xbd0 kernel/hung_task.c:289 > > kthread+0x325/0x350 kernel/kthread.c:255 > > ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 > > Sending NMI from CPU 1 to CPUs 0: > > NMI backtrace for cpu 0 > > CPU: 0 PID: 8354 Comm: syz-executor.0 Not tainted 5.2.0-rc6+ #7 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > RIP: 0010:lock_is_held_type+0x26c/0x2b0 kernel/locking/lockdep.c:4346 > > Code: c7 c7 90 63 aa 88 e8 43 4a 54 00 48 83 3d 9b d2 4f 07 00 74 56 4c 89 > > e7 57 9d 0f 1f 44 00 00 89 d8 48 83 c4 28 5b 41 5c 41 5d <41> 5e 41 5f 5d > > c3 44 89 e9 80 e1 07 80 c1 03 38 c1 7c a8 4c 89 ef > > RSP: 0018:ffff8880a4d87708 EFLAGS: 00000296 > > RAX: 0000000000000000 RBX: ffff88809f0c6040 RCX: 1ffff110149b0f10 > > RDX: dffffc0000000000 RSI: ffff88809f0c68e0 RDI: 0000000000000286 > > RBP: ffff8880a4d87718 R08: ffffffff818fd4ed R09: 0000000000000000 > > R10: ffffed101155a22f R11: 1ffff1101155a22e R12: 0000000000000000 > > R13: 1ffff11013e18c0a R14: dffffc0000000000 R15: 1ffff11013e18d17 > > FS: 00007f33b053d700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000000000000 CR3: 00000000941bd000 CR4: 00000000001406f0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > lock_is_held include/linux/lockdep.h:356 [inline] > > ___might_sleep+0x84/0x530 kernel/sched/core.c:6103 > > __might_sleep+0x8f/0x100 kernel/sched/core.c:6091 > > __mutex_lock_common+0xc8/0x2fc0 kernel/locking/mutex.c:909 > > __mutex_lock kernel/locking/mutex.c:1073 [inline] > > mutex_lock_nested+0x1b/0x30 kernel/locking/mutex.c:1088 > > perf_mmap+0x76d/0x16c0 kernel/events/core.c:5672 > > call_mmap include/linux/fs.h:1877 [inline] > > mmap_region+0x186d/0x1d80 mm/mmap.c:1788 > > do_mmap+0x9de/0x1010 mm/mmap.c:1561 > > do_mmap_pgoff include/linux/mm.h:2402 [inline] > > vm_mmap_pgoff+0x190/0x240 mm/util.c:363 > > ksys_mmap_pgoff+0x4ed/0x5f0 mm/mmap.c:1611 > > __do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline] > > __se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline] > > __x64_sys_mmap+0x103/0x120 arch/x86/kernel/sys_x86_64.c:91 > > do_syscall_64+0xfe/0x140 arch/x86/entry/common.c:301 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > RIP: 0033:0x459519 > > Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 > > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff > > ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > > RSP: 002b:00007f33b053cc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009 > > RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000459519 > > RDX: 0000000000000000 RSI: 0000000000003000 RDI: 0000000020ffd000 > > RBP: 000000000075bf20 R08: 0000000000000003 R09: 0000000000000000 > > R10: 0080000000000011 R11: 0000000000000246 R12: 00007f33b053d6d4 > > R13: 00000000004c5822 R14: 00000000004d9ed8 R15: 00000000ffffffff

