Re: INFO: rcu detected stall in tasklet_action_common

2020-12-10 Thread Dmitry Vyukov
On Wed, Dec 9, 2020 at 10:53 PM syzbot wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    b3298500 Merge tag 'for-5.10/dm-fixes' of git://git.kernel..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=135a07ab50
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e49433cfed49b7d9
> dashboard link: https://syzkaller.appspot.com/bug?extid=cdb28ae22d09b2142434
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17b4334550
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=102d68df50
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+cdb28ae22d09b2142...@syzkaller.appspotmail.com

The reproducer is basically just perf_event_open:

r0 = perf_event_open(&(0x7f000500)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0,
0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_bp={0x0}}, 0x0,
0x, 0x, 0x0)
fcntl$setstatus(r0, 0x4, 0x42000)

I guess it's the old case of perf starving the system with what is
effectively busy looping due to a small sampling period (?).

Peter, IIRC you had some patches to make the lower bound controllable or
something. Were they ever merged? Maybe we need to tune some sysctls
on syzbot?



> rcu: INFO: rcu_preempt self-detected stall on CPU
> rcu:0-...!: (10194 ticks this GP) idle=25a/1/0x4000 
> softirq=8880/8882 fqs=1
> (t=10502 jiffies g=8805 q=65)
> rcu: rcu_preempt kthread starved for 10501 jiffies! g8805 f0x0 
> RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
> rcu:Unless rcu_preempt kthread gets sufficient CPU time, OOM is now 
> expected behavior.
> rcu: RCU grace-period kthread stack dump:
> task:rcu_preempt state:R  running task stack:29512 pid:   11 ppid:
>  2 flags:0x4000
> Call Trace:
>  context_switch kernel/sched/core.c:3779 [inline]
>  __schedule+0x893/0x2130 kernel/sched/core.c:4528
>  schedule+0xcf/0x270 kernel/sched/core.c:4606
>  schedule_timeout+0x148/0x250 kernel/time/timer.c:1871
>  rcu_gp_fqs_loop kernel/rcu/tree.c:1925 [inline]
>  rcu_gp_kthread+0xb4c/0x1c90 kernel/rcu/tree.c:2099
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
> NMI backtrace for cpu 0
> CPU: 0 PID: 8880 Comm: syz-executor561 Not tainted 5.10.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
>  nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
>  trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
>  rcu_dump_cpu_stacks+0x1e3/0x21e kernel/rcu/tree_stall.h:331
>  print_cpu_stall kernel/rcu/tree_stall.h:563 [inline]
>  check_cpu_stall kernel/rcu/tree_stall.h:637 [inline]
>  rcu_pending kernel/rcu/tree.c:3694 [inline]
>  rcu_sched_clock_irq.cold+0x472/0xee8 kernel/rcu/tree.c:2567
>  update_process_times+0x77/0xd0 kernel/time/timer.c:1709
>  tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176
>  tick_sched_timer+0x1d1/0x2a0 kernel/time/tick-sched.c:1328
>  __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
>  __hrtimer_run_queues+0x1ce/0xea0 kernel/time/hrtimer.c:1583
>  hrtimer_interrupt+0x334/0x940 kernel/time/hrtimer.c:1645
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1080 [inline]
>  __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1097
>  run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:91 [inline]
>  sysvec_apic_timer_interrupt+0x48/0x100 arch/x86/kernel/apic/apic.c:1091
>  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
> RIP: 0010:__ieee80211_rx_handle_packet net/mac80211/rx.c:4568 [inline]
> RIP: 0010:ieee80211_rx_list+0x9fc/0x23d0 net/mac80211/rx.c:4759
> Code: d2 0f 85 8d 17 00 00 48 8b 44 24 10 bf 50 00 00 00 0f b7 00 41 89 c4 66 
> 89 44 24 08 66 41 81 e4 fc 00 44 89 e6 e8 e4 36 21 f9 <66> 41 83 fc 50 0f 84 
> be 11 00 00 e8 a4 3e 21 f9 44 89 e6 bf 80 00
> RSP: 0018:c9007cb8 EFLAGS: 0246
> RAX:  RBX:  RCX: 884ec5cc
> RDX: 0080 RSI: 88801b054ec0 RDI: 0003
> RBP: 88801a9ae140 R08: 0001 R09: c9007d48
> R10: 0050 R11: 0001 R12: 0080
> R13: 88801a9ae140 R14: 88801a650c80 R15: c9007d48
>  ieee80211_rx_napi+0xf7/0x3d0 net/mac80211/rx.c:4780
>  ieee80211_rx include/net/mac80211.h:4502 [inline]
>  ieee80211_tasklet_handler+0xd3/0x130 net/mac80211/main.c:235
>  tasklet_action_common.constprop.0+0x22f/0x2d0 kernel/softirq.c:560
>  __do_softirq+0x2a0/0x9f6 kernel/softirq.c:298
>  

Re: UBSAN: shift-out-of-bounds in ext4_fill_super

2020-12-10 Thread Dmitry Vyukov
On Thu, Dec 10, 2020 at 4:50 AM syzbot wrote:
>
> Hello,
>
> syzbot tried to test the proposed patch but the build/boot failed:
>
> failed to checkout kernel repo 
> git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git on commit 
> e360ba58d067a30a4e3e7d55ebdd919885a058d6: failed to run ["git" "fetch" 
> "--tags" "d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8"]: exit status 1
> From git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
>  * [new branch]bisect-test-ext4-035 -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/bisect-test-ext4-035
>  * [new branch]bisect-test-generic-307  -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/bisect-test-generic-307
>  * [new branch]dev  -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/dev
>  * [new branch]ext4-3.18-> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/ext4-3.18
>  * [new branch]ext4-4.1 -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/ext4-4.1
>  * [new branch]ext4-4.4 -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/ext4-4.4
>  * [new branch]ext4-4.9 -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/ext4-4.9
>  * [new branch]ext4-dax -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/ext4-dax
>  * [new branch]ext4-tools   -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/ext4-tools
>  * [new branch]fix-bz-206443-> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/fix-bz-206443
>  * [new branch]for-stable   -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/for-stable
>  * [new branch]fsverity -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/fsverity
>  * [new branch]lazy_journal -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/lazy_journal
>  * [new branch]master   -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/master
>  * [new branch]origin   -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/origin
>  * [new branch]pu   -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/pu
>  * [new branch]test -> 
> d06f7b29746c7f0a52f349ff7fbf2a3f22d27cf8/test
>  * [new tag]   ext4-for-linus-5.8-rc1-2 -> 
> ext4-for-linus-5.8-rc1-2
>  ! [rejected]  ext4_for_linus   -> ext4_for_linus  
> (would clobber existing tag)

Interesting. This is the first time I've seen this. Should syzkaller use
'git fetch --tags --force'?
StackOverflow suggests it should help:
https://stackoverflow.com/questions/58031165/how-to-get-rid-of-would-clobber-existing-tag


>  * [new tag]   ext4_for_linus_bugfixes  -> 
> ext4_for_linus_bugfixes
>  * [new tag]   ext4_for_linus_cleanups  -> 
> ext4_for_linus_cleanups
>  * [new tag]   ext4_for_linus_fixes -> 
> ext4_for_linus_fixes
>  * [new tag]   ext4_for_linus_fixes2-> 
> ext4_for_linus_fixes2
>
>
>
> Tested on:
>
> commit: [unknown
> git tree:   git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git 
> e360ba58d067a30a4e3e7d55ebdd919885a058d6
> dashboard link: https://syzkaller.appspot.com/bug?extid=345b75652b1d24227443
> compiler:   gcc (GCC) 10.1.0-syz 20200507
> patch:  https://syzkaller.appspot.com/x/patch.diff?x=1499c28750


Re: linux-next: build warning after merge of the akpm tree

2020-12-09 Thread Dmitry Vyukov
On Mon, Dec 7, 2020 at 1:52 PM Marco Elver wrote:
>
> On Mon, 7 Dec 2020 at 13:38, 'Dmitry Vyukov' via kasan-dev wrote:
> > On Mon, Dec 7, 2020 at 1:08 PM Dmitry Vyukov wrote:
> > > > > Hi all,
> > > > >
> > > > > After merging the akpm tree, today's linux-next build (powerpc
> > > > > allyesconfig) produced warnings like this:
> > > > >
> > > > > kernel/kcov.c:296:14: warning: conflicting types for built-in 
> > > > > function '__sanitizer_cov_trace_switch'; expected 'void(long unsigned 
> > > > > int,  void *)' [-Wbuiltin-declaration-mismatch]
> > > > >   296 | void notrace __sanitizer_cov_trace_switch(u64 val, u64 *cases)
> > > > >   |  ^~~~
> > > >
> > > > Odd.  clang wants that signature, according to
> > > > https://clang.llvm.org/docs/SanitizerCoverage.html.  But gcc seems to
> > > > want a different signature.  Beats me - best I can do is to cc various
> > > > likely culprits ;)
> > > >
> > > > Which gcc version?  Did you recently update gcc?
> > > >
> > > > > ld: warning: orphan section `.data..Lubsan_data177' from 
> > > > > `arch/powerpc/oprofile/op_model_pa6t.o' being placed in section 
> > > > > `.data..Lubsan_data177'
> > > > >
> > > > > (lots of these latter ones)
> > > > >
> > > > > I don't know what produced these, but it is in the akpm-current or
> > > > > akpm trees.
> > >
> > > I can reproduce this in x86_64 build as well but only if I enable
> > > UBSAN as well. There were some recent UBSAN changes by Kees, so maybe
> > > that's what affected the warning.
> > > Though, the warning itself looks legit and unrelated to UBSAN. In
> > > fact, if the compiler expects long and we accept u64, it may be broken
> > > on 32-bit arches...
> >
> > No, I think it works, the argument should be uint64.
> >
> > I think both gcc and clang signatures are correct and both want
> > uint64_t. The question is just how uint64_t is defined :) The old
> > printf joke that one can't write portable format specifier for
> > uint64_t.
> >
> > What I know so far:
> > clang 11 does not produce this warning even with obviously wrong
> > signatures (e.g. short).
> > I wasn't able to trigger it with gcc on 32-bits at all. KCOV is not
> > supported on i386 and on arm I got no warnings even with obviously
> > wrong signatures (e.g. short).
> > Using "(unsigned long val, void *cases)" fixes the warning on x86_64.
> >
> > I am still puzzled why gcc considers this a builtin because we
> > don't enable -fsanitize-coverage on this file. I am also puzzled how
> > UBSAN affects things.
>
> It might be some check-for-builtins check gone wrong if it enables any
> one of the sanitizers. That would be confirmed if it works with
>
> UBSAN_SANITIZE_kcov.o := n

Yes, it "fixes" the warning.
Initially I thought it was not a good solution because we want to detect
UBSAN bugs in KCOV. But on second thought, if UBSAN detects a bug in
KCOV, it may lead to infinite recursion. We already disable all other
sanitizers on KCOV for this reason, so it's reasonable to disable
UBSAN too. As a side effect, it also "resolves" the warning.
I mailed:
https://lore.kernel.org/lkml/20201209100152.2492072-1-dvyu...@google.com/T/#u

Thanks

> > We could change the signature to long, but it feels wrong/dangerous
> > because the variable should really be 64-bits (long is broken on
> > 32-bits).
> > Or we could introduce a typedef that is long on 64-bits and 'long
> > long' on 32-bits.
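
For illustration, the type question above can be reproduced in user space.
This is only a sketch under stated assumptions, not kernel code: kernel_u64
stands in for the kernel's u64 (always 'unsigned long long'), cov_u64 is the
hypothetical typedef floated above ('long' on 64-bit, 'long long' on 32-bit),
and same_width() and format_u64() are made-up helper names. It also shows the
usual answer to the printf joke, the PRIu64 macro from <inttypes.h>.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's u64, which is always 'unsigned long long'. */
typedef unsigned long long kernel_u64;

/* Hypothetical typedef: 'long' where long is 64 bits, 'long long' otherwise. */
#if ULONG_MAX == 0xffffffffffffffffUL
typedef unsigned long cov_u64;
#else
typedef unsigned long long cov_u64;
#endif

/* Returns 1 when all three spellings are 64 bits wide on this target. */
int same_width(void)
{
	return sizeof(kernel_u64) == 8 && sizeof(cov_u64) == 8 &&
	       sizeof(uint64_t) == 8;
}

/* PRIu64 expands to the correct conversion specifier for uint64_t. */
int format_u64(char *buf, size_t len, uint64_t val)
{
	return snprintf(buf, len, "%" PRIu64, val);
}
```

The widths agree everywhere, which is why the call works at runtime; the
warning is about the distinct type names, not about size.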


[PATCH] kcov: don't instrument with UBSAN

2020-12-09 Thread Dmitry Vyukov
Both KCOV and UBSAN use compiler instrumentation. If UBSAN detects a bug
in KCOV, it may cause infinite recursion via printk and other common
functions. We already don't instrument KCOV with KASAN/KCSAN for this
reason, so don't instrument it with UBSAN either.

As a side effect this also resolves the following gcc warning:

conflicting types for built-in function '__sanitizer_cov_trace_switch';
expected 'void(long unsigned int,  void *)' [-Wbuiltin-declaration-mismatch]

It's only reported when kcov.c is compiled with any of the sanitizers
enabled. The size of the arguments is correct; it's just that gcc uses 'long'
on 64-bit arches and 'long long' on 32-bit arches, while the kernel type is
always 'long long'.

Reported-by: Stephen Rothwell 
Suggested-by: Marco Elver 
Signed-off-by: Dmitry Vyukov 
---
 kernel/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/Makefile b/kernel/Makefile
index aac15aeb9d69..efa42857532b 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -34,8 +34,11 @@ KCOV_INSTRUMENT_extable.o := n
 KCOV_INSTRUMENT_stacktrace.o := n
 # Don't self-instrument.
 KCOV_INSTRUMENT_kcov.o := n
+# If sanitizers detect any issues in kcov, it may lead to recursion
+# via printk, etc.
 KASAN_SANITIZE_kcov.o := n
 KCSAN_SANITIZE_kcov.o := n
+UBSAN_SANITIZE_kcov.o := n
 CFLAGS_kcov.o := $(call cc-option, -fno-conserve-stack) -fno-stack-protector
 
 obj-y += sched/
-- 
2.29.2.576.ga3fc446d84-goog



Re: BUG: MAX_LOCKDEP_KEYS too low!

2020-12-09 Thread Dmitry Vyukov
On Sun, Oct 27, 2019 at 4:31 AM syzbot wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    65921376 Merge branch 'net-fix-nested-device-bugs'
> git tree:       net
> console output: https://syzkaller.appspot.com/x/log.txt?x=1637fdc0e0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e0ac4d9b35046343
> dashboard link: https://syzkaller.appspot.com/bug?extid=692f39f040c1f415567b
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+692f39f040c1f4155...@syzkaller.appspotmail.com

This stopped happening a while ago, so let's close this to get
notifications about new instances.
One of the likely candidates:

#syz fix: net: partially revert dynamic lockdep key changes


> BUG: MAX_LOCKDEP_KEYS too low!
> turning off the locking correctness validator.
> CPU: 0 PID: 15175 Comm: syz-executor.5 Not tainted 5.4.0-rc3+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:77 [inline]
>   dump_stack+0x172/0x1f0 lib/dump_stack.c:113
>   register_lock_class.cold+0x1b/0x27 kernel/locking/lockdep.c:1222
>   __lock_acquire+0xf4/0x4a00 kernel/locking/lockdep.c:3837
>   lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
>   __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
>   _raw_spin_lock_bh+0x33/0x50 kernel/locking/spinlock.c:175
>   spin_lock_bh include/linux/spinlock.h:343 [inline]
>   netif_addr_lock_bh include/linux/netdevice.h:4055 [inline]
>   __dev_mc_add+0x2e/0xd0 net/core/dev_addr_lists.c:765
>   dev_mc_add+0x20/0x30 net/core/dev_addr_lists.c:783
>   igmp6_group_added+0x3b5/0x460 net/ipv6/mcast.c:672
>   __ipv6_dev_mc_inc+0x727/0xa60 net/ipv6/mcast.c:931
>   ipv6_dev_mc_inc+0x20/0x30 net/ipv6/mcast.c:938
>   ipv6_add_dev net/ipv6/addrconf.c:456 [inline]
>   ipv6_add_dev+0xa3d/0x10b0 net/ipv6/addrconf.c:363
>   addrconf_notify+0x97d/0x23b0 net/ipv6/addrconf.c:3491
>   notifier_call_chain+0xc2/0x230 kernel/notifier.c:95
>   __raw_notifier_call_chain kernel/notifier.c:396 [inline]
>   raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:403
>   call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1668
>   call_netdevice_notifiers_extack net/core/dev.c:1680 [inline]
>   call_netdevice_notifiers net/core/dev.c:1694 [inline]
>   register_netdevice+0x950/0xeb0 net/core/dev.c:9114
>   ieee80211_if_add+0xf51/0x1730 net/mac80211/iface.c:1881
>   ieee80211_register_hw+0x36e6/0x3ac0 net/mac80211/main.c:1256
>   mac80211_hwsim_new_radio+0x20d9/0x4360
> drivers/net/wireless/mac80211_hwsim.c:3031
>   hwsim_new_radio_nl+0x9e3/0x1070 drivers/net/wireless/mac80211_hwsim.c:3586
>   genl_family_rcv_msg+0x74b/0xf90 net/netlink/genetlink.c:629
>   genl_rcv_msg+0xca/0x170 net/netlink/genetlink.c:654
>   netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
>   genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
>   netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
>   netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
>   netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
>   sock_sendmsg_nosec net/socket.c:637 [inline]
>   sock_sendmsg+0xd7/0x130 net/socket.c:657
>   ___sys_sendmsg+0x803/0x920 net/socket.c:2311
>   __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
>   __do_sys_sendmsg net/socket.c:2365 [inline]
>   __se_sys_sendmsg net/socket.c:2363 [inline]
>   __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
>   do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x459f39
> Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7fd0af43ac78 EFLAGS: 0246 ORIG_RAX: 002e
> RAX: ffda RBX: 0003 RCX: 00459f39
> RDX:  RSI: 2180 RDI: 0003
> RBP: 0075bf20 R08:  R09: 
> R10:  R11: 0246 R12: 7fd0af43b6d4
> R13: 004c82f8 R14: 004de3f0 R15: 
> kobject: 'batman_adv' (9392522f): kobject_add_internal:
> parent: 'wlan1810', set: ''
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to syzkaller-bugs+unsubscr...@googlegroups.com.
> To view this discussion 

BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

2020-12-09 Thread Dmitry Vyukov
This stopped happening a while ago, so let's close this to get
notifications about new instances.
One of the likely candidates:

#syz fix: net: partially revert dynamic lockdep key changes


Re: memory leak in generic_parse_monolithic [+PATCH]

2020-12-08 Thread Dmitry Vyukov
On Wed, Dec 9, 2020 at 12:15 AM Randy Dunlap wrote:
>
> On 12/8/20 2:54 PM, David Howells wrote:
> > Randy Dunlap  wrote:
> >
> > >>> Now the backtrace only shows what the state was when the string was
> > >>> allocated; it doesn't show what happened to it after that, so another
> > >>> possibility is that the filesystem being mounted nicked what
> > >>> vfs_parse_fs_param() had rightfully stolen, transferring fc->source
> > >>> somewhere else and then failed to release it - most likely on mount
> > >>> failure (ie. it's an error handling bug in the filesystem).
> >>>
> >>> Do we know what filesystem it was?
> >>
> > >> Yes, it's called AFS (or kAFS).
> >
> > Hmmm...  afs parses the string in afs_parse_source() without modifying it,
> > then moves the pointer to fc->source (parallelling vfs_parse_fs_param()) and
> > doesn't touch it again.  fc->source should be cleaned up by do_new_mount()
> > calling put_fs_context() at the end of the function.
> >
> > > As far as I can tell with the attached print-insertion patch, it works, called
> > > by the following commands, some of which are correct and some which aren't:
> >
> > # mount -t afs none /xfstest.test/ -o dyn
> > # umount /xfstest.test
> > # mount -t afs "" /xfstest.test/ -o foo
> > > mount: /xfstest.test: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount. helper program.
> > # umount /xfstest.test
> > umount: /xfstest.test: not mounted.
> > # mount -t afs %xfstest.test20 /xfstest.test/ -o foo
> > > mount: /xfstest.test: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount. helper program.
> > # umount /xfstest.test
> > umount: /xfstest.test: not mounted.
> > # mount -t afs %xfstest.test20 /xfstest.test/
> > # umount /xfstest.test
> >
> > Do you know if the mount was successful and what the mount parameters were?
>
> Here's the syzbot reproducer:
> https://syzkaller.appspot.com/x/repro.c?x=129ca3d650
>
> The "interesting" mount params are:
> source=%^]$[+%](${:\017k[)-:,source=%^]$[+.](%{:\017\200[)-:,\000
>
> There is no other AFS activity: nothing mounted, no cells known (or
> whatever that is), etc.
>
> I don't recall if the mount was successful and I can't test it just now.
> My laptop is mucked up.
>
>
> Be aware that this report could just be a false positive: it waits
> for 5 seconds then looks for a memleak. AFAIK, it's possible that the "leaked"
> memory is still in valid use and will be freed some day.

FWIW KMEMLEAK scans memory for pointers. If it claims a memory leak,
it means the heap object is not referenced anywhere anymore. There are
no live pointers left through which to call kfree or anything else.
Some false positives are theoretically possible, but so far I don't
remember any; all reported ones were true leaks:
https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-gce-leak
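
To make the scanning idea concrete, here is a toy user-space sketch. It is an
illustration under assumptions, not KMEMLEAK's actual implementation: an array
of words stands in for the memory being scanned, and an allocation counts as
leaked when no scanned word holds an address inside it. All names (struct
toy_object, toy_points_into, toy_is_leaked) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* A tracked allocation covering the range [start, start + size). */
struct toy_object {
	uintptr_t start;
	size_t size;
};

/* Returns 1 if 'word' looks like a pointer into 'obj'. */
int toy_points_into(uintptr_t word, const struct toy_object *obj)
{
	return word >= obj->start && word - obj->start < obj->size;
}

/*
 * Conservative scan: walk every word of the scanned memory and check
 * whether any of them references the object. If none does, nothing can
 * reach the object to free it, so it is reported as leaked.
 */
int toy_is_leaked(const uintptr_t *words, size_t nwords,
		  const struct toy_object *obj)
{
	for (size_t i = 0; i < nwords; i++)
		if (toy_points_into(words[i], obj))
			return 0;
	return 1;
}
```

The scan is conservative in one direction only: a stray integer that happens
to look like a pointer can hide a real leak, but an object with no matching
word anywhere really is unreachable.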



> > David
> > ---
> > diff --git a/fs/afs/super.c b/fs/afs/super.c
> > index 6c5900df6aa5..4c44ec0196c9 100644
> > --- a/fs/afs/super.c
> > +++ b/fs/afs/super.c
> > @@ -299,7 +299,7 @@ static int afs_parse_source(struct fs_context *fc, 
> > struct fs_parameter *param)
> >   ctx->cell = cell;
> >   }
> >
> > - _debug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
> > + kdebug("CELL:%s [%p] VOLUME:%*.*s SUFFIX:%s TYPE:%d%s",
> >  ctx->cell->name, ctx->cell,
> >  ctx->volnamesz, ctx->volnamesz, ctx->volname,
> >  suffix ?: "-", ctx->type, ctx->force ? " FORCE" : "");
> > @@ -318,6 +318,8 @@ static int afs_parse_param(struct fs_context *fc, 
> > struct fs_parameter *param)
> >   struct afs_fs_context *ctx = fc->fs_private;
> >   int opt;
> >
> > + kenter("%s,%p '%s'", param->key, param->string, param->string);
> > +
> >   opt = fs_parse(fc, afs_fs_parameters, param, );
> >   if (opt < 0)
> >   return opt;
> > diff --git a/fs/fs_context.c b/fs/fs_context.c
> > index 2834d1afa6e8..f530a33876ce 100644
> > --- a/fs/fs_context.c
> > +++ b/fs/fs_context.c
> > @@ -450,6 +450,8 @@ void put_fs_context(struct fs_context *fc)
> >   put_user_ns(fc->user_ns);
> >   put_cred(fc->cred);
> >   put_fc_log(fc);
> > + if (strcmp(fc->fs_type->name, "afs") == 0)
> > + printk("PUT %p '%s'\n", fc->source, fc->source);
> >   put_filesystem(fc->fs_type);
> >   kfree(fc->source);
> >   kfree(fc);
> > @@ -671,6 +673,8 @@ void vfs_clean_context(struct fs_context *fc)
> >   fc->s_fs_info = NULL;
> >   fc->sb_flags = 0;
> >   security_free_mnt_opts(>security);
> > + if (strcmp(fc->fs_type->name, "afs") == 0)
> > + printk("CLEAN %p '%s'\n", fc->source, fc->source);
> >   kfree(fc->source);
> >   fc->source = NULL;
> >
> >
>
> I'll check more after my test machine is working again.
>
> thanks.
> --
> ~Randy
>

Re: BUG: unable to handle kernel paging request in bpf_lru_populate

2020-12-07 Thread Dmitry Vyukov
On Mon, Dec 7, 2020 at 12:43 PM syzbot wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    bcd684aa net/nfc/nci: Support NCI 2.x initial sequence
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=12001bd350
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3cb098ab0334059f
> dashboard link: https://syzkaller.appspot.com/bug?extid=ec2234240c96fdd26b93
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11f7f2ef50
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=103833f750
>
> The issue was bisected to:
>
> commit b93ef089d35c3386dd197e85afb6399bbd54cfb3
> Author: Martin KaFai Lau 
> Date:   Mon Nov 16 20:01:13 2020 +
>
> bpf: Fix the irq and nmi check in bpf_sk_storage for tracing usage
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1103b83750
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=1303b83750
> console output: https://syzkaller.appspot.com/x/log.txt?x=1503b83750
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+ec2234240c96fdd26...@syzkaller.appspotmail.com
> Fixes: b93ef089d35c ("bpf: Fix the irq and nmi check in bpf_sk_storage for 
> tracing usage")

I assume this is also

#syz fix: bpf: Avoid overflows involving hash elem_size
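
For readers unfamiliar with this class of bug, a hedged user-space sketch
follows. The assumption, which the thread itself does not spell out, is that
an element offset computed as index * elem_size in 32-bit arithmetic wraps
for very large tables, so a populate loop ends up touching the wrong address.
The function names are hypothetical and this is not the kernel code.

```c
#include <stdint.h>

/* Offset computed in 32 bits: the product wraps modulo 2^32. */
uint64_t offset_u32(uint32_t index, uint32_t elem_size)
{
	uint32_t off = index * elem_size;

	return off;
}

/* Widening one operand first keeps the full 64-bit product. */
uint64_t offset_u64(uint32_t index, uint32_t elem_size)
{
	return (uint64_t)index * elem_size;
}
```

With index 0x10000000 and elem_size 0x20, the 32-bit version yields 0 while
the 64-bit version yields 0x200000000.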


> BUG: unable to handle page fault for address: f5200471266c
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x) - not-present page
> PGD 23fff2067 P4D 23fff2067 PUD 101a4067 PMD 32e3a067 PTE 0
> Oops:  [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 8503 Comm: syz-executor608 Not tainted 5.10.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> RIP: 0010:bpf_common_lru_populate kernel/bpf/bpf_lru_list.c:569 [inline]
> RIP: 0010:bpf_lru_populate+0xd8/0x5e0 kernel/bpf/bpf_lru_list.c:614
> Code: 03 4d 01 e7 48 01 d8 48 89 4c 24 10 4d 89 fe 48 89 44 24 08 e8 99 23 eb 
> ff 49 8d 7e 12 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 <0f> b6 04 18 38 d0 7f 
> 08 84 c0 0f 85 80 04 00 00 49 8d 7e 13 41 c6
> RSP: 0018:c9000126fc20 EFLAGS: 00010202
> RAX: 19200471266c RBX: dc00 RCX: 8184e3e2
> RDX: 0002 RSI: 8184e2e7 RDI: c90023893362
> RBP: 00bc R08: 107c R09: 
> R10: 107c R11:  R12: 0001
> R13: 107c R14: c90023893350 R15: c900234832f0
> FS:  00fe0880() GS:8880b9f0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: f5200471266c CR3: 1ba62000 CR4: 001506e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  prealloc_init kernel/bpf/hashtab.c:319 [inline]
>  htab_map_alloc+0xf6e/0x1230 kernel/bpf/hashtab.c:507
>  find_and_alloc_map kernel/bpf/syscall.c:123 [inline]
>  map_create kernel/bpf/syscall.c:829 [inline]
>  __do_sys_bpf+0xa81/0x5170 kernel/bpf/syscall.c:4374
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x4402e9
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 7b 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7ffe77af23b8 EFLAGS: 0246 ORIG_RAX: 0141
> RAX: ffda RBX: 004002c8 RCX: 004402e9
> RDX: 0040 RSI: 2000 RDI: 0d00
> RBP: 006ca018 R08:  R09: 
> R10:  R11: 0246 R12: 00401af0
> R13: 00401b80 R14:  R15: 
> Modules linked in:
> CR2: f5200471266c
> ---[ end trace 4f3928bacde7b3ed ]---
> RIP: 0010:bpf_common_lru_populate kernel/bpf/bpf_lru_list.c:569 [inline]
> RIP: 0010:bpf_lru_populate+0xd8/0x5e0 kernel/bpf/bpf_lru_list.c:614
> Code: 03 4d 01 e7 48 01 d8 48 89 4c 24 10 4d 89 fe 48 89 44 24 08 e8 99 23 eb 
> ff 49 8d 7e 12 48 89 f8 48 89 fa 48 c1 e8 03 83 e2 07 <0f> b6 04 18 38 d0 7f 
> 08 84 c0 0f 85 80 04 00 00 49 8d 7e 13 41 c6
> RSP: 0018:c9000126fc20 EFLAGS: 00010202
> RAX: 19200471266c RBX: dc00 RCX: 8184e3e2
> RDX: 0002 RSI: 8184e2e7 RDI: c90023893362
> RBP: 00bc R08: 107c R09: 
> R10: 107c R11:  R12: 0001
> R13: 107c R14: c90023893350 R15: c900234832f0
> FS:  00fe0880() GS:8880b9f0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: f5200471266c CR3: 1ba62000 CR4: 001506e0
> DR0:  DR1: 

Re: KASAN: vmalloc-out-of-bounds Write in pcpu_freelist_populate

2020-12-07 Thread Dmitry Vyukov
On Mon, Dec 7, 2020 at 9:03 PM syzbot wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    34da8721 selftests/bpf: Test bpf_sk_storage_get in tcp ite..
> git tree:       bpf-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=10c3b83750
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3cb098ab0334059f
> dashboard link: https://syzkaller.appspot.com/bug?extid=942085bfb8f7a276af1c
> compiler:       gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+942085bfb8f7a276a...@syzkaller.appspotmail.com

I assume this is also

#syz fix: bpf: Avoid overflows involving hash elem_size


> ==
> BUG: KASAN: vmalloc-out-of-bounds in pcpu_freelist_push_node 
> kernel/bpf/percpu_freelist.c:33 [inline]
> BUG: KASAN: vmalloc-out-of-bounds in pcpu_freelist_populate+0x1fe/0x260 
> kernel/bpf/percpu_freelist.c:114
> Write of size 8 at addr c90119e78020 by task syz-executor.4/27988
>
> CPU: 1 PID: 27988 Comm: syz-executor.4 Not tainted 5.10.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  print_address_description.constprop.0.cold+0x5/0x4c8 mm/kasan/report.c:385
>  __kasan_report mm/kasan/report.c:545 [inline]
>  kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
>  pcpu_freelist_push_node kernel/bpf/percpu_freelist.c:33 [inline]
>  pcpu_freelist_populate+0x1fe/0x260 kernel/bpf/percpu_freelist.c:114
>  prealloc_init kernel/bpf/hashtab.c:323 [inline]
>  htab_map_alloc+0x981/0x1230 kernel/bpf/hashtab.c:507
>  find_and_alloc_map kernel/bpf/syscall.c:123 [inline]
>  map_create kernel/bpf/syscall.c:829 [inline]
>  __do_sys_bpf+0xa81/0x5170 kernel/bpf/syscall.c:4374
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x45e0f9
> Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f679c7a7c68 EFLAGS: 0246
>  ORIG_RAX: 0141
> RAX: ffda RBX: 0003 RCX: 0045e0f9
> RDX: 0040 RSI: 2040 RDI: 
> RBP: 0119c068 R08:  R09: 
> R10:  R11: 0246 R12: 0119c034
> R13: 7fffd601c75f R14: 7f679c7a89c0 R15: 0119c034
>
>
> Memory state around the buggy address:
>  c90119e77f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>  c90119e77f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> >c90119e78000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>^
>  c90119e78080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>  c90119e78100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> ==
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to syzkaller-bugs+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/syzkaller-bugs/caabb705b5e550aa%40google.com.


Re: INFO: rcu detected stall in __se_sys_mount

2020-12-07 Thread Dmitry Vyukov
On Mon, Dec 7, 2020 at 9:06 PM syzbot wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 1d0e850a49a5b56f8f3cb51e74a11e2fedb96be6
> Author: David Howells 
> Date:   Fri Oct 16 12:21:14 2020 +
>
> afs: Fix cell removal
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=162cebcf50
> start commit:   c85fb28b Merge tag 'arm64-fixes' of git://git.kernel.org/p..
> git tree:       upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=de7f697da23057c7
> dashboard link: https://syzkaller.appspot.com/bug?extid=3f2db34df769d77edf8c
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11df5d4f90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=157851e050
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: afs: Fix cell removal
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: afs: Fix cell removal


Re: WARNING: filesystem loop0 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-07 Thread Dmitry Vyukov
On Mon, Sep 28, 2020 at 11:08 AM Tigran Aivazian wrote:
>
> On Mon, 28 Sep 2020 at 09:29, Dmitry Vyukov  wrote:
> > On Mon, Sep 28, 2020 at 10:23 AM Tigran Aivazian
> > > No, this is not an issue. In the latest change to BFS I added the
> > > following comment to the header fs/bfs/bfs.h, which explains it:
> > >
> > > /* In theory BFS supports up to 512 inodes, numbered from 2 (for /) up
> > > to 513 inclusive.
> > >In actual fact, attempting to create the 512th inode (i.e. inode
> > > No. 513 or file No. 511)
> > >will fail with ENOSPC in bfs_add_entry(): the root directory cannot
> > > contain so many entries, counting '..'.
> > >So, mkfs.bfs(8) should really limit its -N option to 511 and not
> > > 512. For now, we just print a warning
> > >if a filesystem is mounted with such "impossible to fill up" number
> > > of inodes */
> >
> > There are rules for use of "WARNING" in output required to support
> > kernel testing:
> > https://github.com/torvalds/linux/blob/master/include/asm-generic/bug.h#L67-L80
> > This seems to be triggerable by external inputs and breaks these rules.
>
> Thank you, I didn't know about these rules. Ok, then, since this
> warning does not "need prompt attention if it should ever occur at
> runtime", the easiest solution is to change "WARNING" to lower case
> "warning" in that printk in fs/bfs/inode.c:
>
> --- fs/bfs/inode.c.0 2020-09-28 10:03:00.658549556 +0100
> +++ fs/bfs/inode.c 2020-09-28 10:03:05.408548250 +0100
> @@ -351,7 +351,7 @@
>
>   info->si_lasti = (le32_to_cpu(bfs_sb->s_start) - BFS_BSIZE) /
> sizeof(struct bfs_inode) + BFS_ROOT_INO - 1;
>   if (info->si_lasti == BFS_MAX_LASTI)
> - printf("WARNING: filesystem %s was created with 512 inodes, the real
> maximum is 511, mounting anyway\n", s->s_id);
> + printf("warning: filesystem %s was created with 512 inodes, the real
> maximum is 511, mounting anyway\n", s->s_id);
>   else if (info->si_lasti > BFS_MAX_LASTI) {
>   printf("Impossible last inode number %lu > %d on %s\n",
> info->si_lasti, BFS_MAX_LASTI, s->s_id);
>   goto out1;
>
> If you want to submit this patch to the appropriate place(s), feel
> free to do this -- I approve it. If the comment in asm/bug.h is
> inaccurate and its mention of "BUG/WARNING" implies the lowercase
> "bug/warning" also, then one can remove the prefix "warning: " from
> the patch altogether and proper case "filesystem" to "Filesystem".
>
> Kind regards,
> Tigran
>
> Acked-By: Tigran Aivazian 
> Approved-By: Tigran Aivazian 


#syz fix: bfs: don't use WARNING: string when it's just info.
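
For readers following the 512-versus-511 arithmetic, here is a minimal standalone C sketch of the si_lasti expression quoted in the patch context above. The constant values (512-byte blocks, a 64-byte on-disk inode, root inode 2, BFS_MAX_LASTI of 513) are assumptions recalled from the BFS headers rather than stated in this thread, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; verify against fs/bfs/bfs.h and the BFS uapi header. */
#define BFS_BSIZE        512u  /* block size in bytes */
#define BFS_INODE_SIZE    64u  /* assumed sizeof(struct bfs_inode) */
#define BFS_ROOT_INO       2u  /* inode number of "/" */
#define BFS_MAX_LASTI    513u  /* highest inode number the format allows */

/* Hypothetical helper mirroring the si_lasti expression in bfs_fill_super(). */
static uint32_t bfs_last_inode(uint32_t s_start)
{
	return (s_start - BFS_BSIZE) / BFS_INODE_SIZE + BFS_ROOT_INO - 1;
}

/* True for the "created with 512 inodes" case that prints the message. */
static bool bfs_prints_notice(uint32_t s_start)
{
	return bfs_last_inode(s_start) == BFS_MAX_LASTI;
}
```

Under these assumptions, s_start = 512 + 512 * 64 = 33280 gives a last inode of 513, which is the case that prints the message; one inode fewer (s_start = 33216) gives 512 and stays silent.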


Re: WARNING: filesystem loop1 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-07 Thread Dmitry Vyukov
#syz fix: bfs: don't use WARNING: string when it's just info.

On Mon, Sep 28, 2020 at 8:10 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:c9c9e6a4 Merge tag 'trace-v5.9-rc5-2' of git://git.kernel...
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=138b09d990
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5f4c828c9e3cef97
> dashboard link: https://syzkaller.appspot.com/bug?extid=2435de7315366e15f0ca
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2435de7315366e15f...@syzkaller.appspotmail.com
>
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop1 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Inode 0x0002 corrupted on loop1
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop1 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Inode 0x0002 corrupted on loop1
>
>


Re: WARNING: filesystem loop4 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-07 Thread Dmitry Vyukov
#syz fix: bfs: don't use WARNING: string when it's just info.

On Sat, Nov 21, 2020 at 8:33 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:09162bc3 Linux 5.10-rc4
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=103f4fbe50
> kernel config:  https://syzkaller.appspot.com/x/.config?x=75292221eb79ace2
> dashboard link: https://syzkaller.appspot.com/bug?extid=1a219abc12077a390bc9
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+1a219abc12077a390...@syzkaller.appspotmail.com
>
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop4 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Last block not available on loop4: 1507328
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop4 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Last block not available on loop4: 1507328
>
>


Re: WARNING: filesystem loop2 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-07 Thread Dmitry Vyukov
#syz fix: bfs: don't use WARNING: string when it's just info.

On Sat, Nov 21, 2020 at 8:33 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:09162bc3 Linux 5.10-rc4
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16e9a48650
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e93bbe4ce29223b
> dashboard link: https://syzkaller.appspot.com/bug?extid=ae3ff0bb2a0133596a5b
> compiler:   clang version 11.0.0 
> (https://github.com/llvm/llvm-project.git 
> ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+ae3ff0bb2a0133596...@syzkaller.appspotmail.com
>
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop2 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Last block not available on loop2: 1507328
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop2 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Last block not available on loop2: 1507328
>
>


Re: WARNING: filesystem loop3 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-07 Thread Dmitry Vyukov
#syz fix: bfs: don't use WARNING: string when it's just info.

On Thu, Sep 24, 2020 at 11:40 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:98477740 Merge branch 'rcu/urgent' of git://git.kernel.org..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15964ec390
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6f192552d75898a1
> dashboard link: https://syzkaller.appspot.com/bug?extid=293714df4fe354fae488
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+293714df4fe354fae...@syzkaller.appspotmail.com
>
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop3 was created with 512 
> inodes, the real maximum is 511, mounting anyway
> BFS-fs: bfs_fill_super(): Inode 0x0002 corrupted on loop3
>
>


Re: WARNING: filesystem loop0 was created with 512 inodes, the real maximum is 511, mounting anywa

2020-12-07 Thread Dmitry Vyukov
#syz fix: bfs: don't use WARNING: string when it's just info.

On Mon, Dec 7, 2020 at 1:53 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:34816d20 Merge tag 'gfs2-v5.10-rc5-fixes' of git://git.ker..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=157dad0750
> kernel config:  https://syzkaller.appspot.com/x/.config?x=b3a044ccf5b03ac4
> dashboard link: https://syzkaller.appspot.com/bug?extid=02c44c7f92e70a73730a
> compiler:   gcc (GCC) 10.1.0-syz 20200507
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=152b05ab50
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14fc3fad50
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+02c44c7f92e70a737...@syzkaller.appspotmail.com
>
> BFS-fs: bfs_fill_super(): WARNING: filesystem loop0 was created with 512 
> inodes, the real maximum is 511, mounting anywa
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this issue, for details see:
> https://goo.gl/tpsmEJ#testing-patches
>


Re: linux-next: build warning after merge of the akpm tree

2020-12-07 Thread Dmitry Vyukov
On Mon, Dec 7, 2020 at 1:08 PM Dmitry Vyukov  wrote:
> > > Hi all,
> > >
> > > After merging the akpm tree, today's linux-next build (powerpc
> > > allyesconfig) produced warnings like this:
> > >
> > > kernel/kcov.c:296:14: warning: conflicting types for built-in function 
> > > '__sanitizer_cov_trace_switch'; expected 'void(long unsigned int,  void 
> > > *)' [-Wbuiltin-declaration-mismatch]
> > >   296 | void notrace __sanitizer_cov_trace_switch(u64 val, u64 *cases)
> > >   |  ^~~~
> >
> > Odd.  clang wants that signature, according to
> > https://clang.llvm.org/docs/SanitizerCoverage.html.  But gcc seems to
> > want a different signature.  Beats me - best I can do is to cc various
> > likely culprits ;)
> >
> > Which gcc version?  Did you recently update gcc?
> >
> > > ld: warning: orphan section `.data..Lubsan_data177' from 
> > > `arch/powerpc/oprofile/op_model_pa6t.o' being placed in section 
> > > `.data..Lubsan_data177'
> > >
> > > (lots of these latter ones)
> > >
> > > I don't know what produced these, but it is in the akpm-current or
> > > akpm trees.
>
> I can reproduce this in x86_64 build as well but only if I enable
> UBSAN as well. There were some recent UBSAN changes by Kees, so maybe
> that's what affected the warning.
> Though, the warning itself looks legit and unrelated to UBSAN. In
> fact, if the compiler expects long and we accept u64, it may be broken
> on 32-bit arches...

No, I think it works, the argument should be uint64.

I think both gcc and clang signatures are correct and both want
uint64_t. The question is just how uint64_t is defined :) It's the old
printf joke that one can't write a portable format specifier for
uint64_t.

What I know so far:
clang 11 does not produce this warning even with obviously wrong
signatures (e.g. short).
I wasn't able to trigger it with gcc on 32-bits at all. KCOV is not
supported on i386 and on arm I got no warnings even with obviously
wrong signatures (e.g. short).
Using "(unsigned long val, void *cases)" fixes the warning on x86_64.

I am still puzzled why gcc considers this as a builtin because we
don't enable -fsanitize-coverage on this file. I am also puzzled how
UBSAN affects things.

We could change the signature to long, but it feels wrong/dangerous
because the variable should really be 64-bits (long is broken on
32-bits).
Or we could introduce a typedef that is long on 64-bits and 'long
long' on 32-bits.
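
A minimal sketch of the typedef idea from the last paragraph, as standalone C. The names kcov_switch_val_t and trace_switch_sketch are hypothetical and not kernel identifiers, and whether this spelling silences -Wbuiltin-declaration-mismatch on every architecture is an untested assumption; the thread only confirms that "unsigned long" works on x86_64.

```c
#include <limits.h>

/*
 * Hypothetical typedef: choose the basic integer type that is 64 bits wide
 * on this target, so the parameter is spelled "unsigned long" where long is
 * 64 bits (the type gcc reported as expected on x86_64) and
 * "unsigned long long" where long is only 32 bits.
 */
#if ULONG_MAX > 0xffffffffUL
typedef unsigned long kcov_switch_val_t;
#else
typedef unsigned long long kcov_switch_val_t;
#endif

/* Compile-time check that the argument is 64 bits wide on every target. */
_Static_assert(sizeof(kcov_switch_val_t) * CHAR_BIT == 64,
	       "switch value must be 64 bits wide");

/* Last value seen by the stub below; only here so the sketch is observable. */
static kcov_switch_val_t last_switch_val;

/* Stand-in for the real hook; it only records the value it was given. */
void trace_switch_sketch(kcov_switch_val_t val, void *cases)
{
	(void)cases;
	last_switch_val = val;
}
```

The point of the static assertion is that the width stays 64 bits regardless of which branch is taken, which addresses the "long is broken on 32-bits" concern while still letting the 64-bit spelling match what gcc expects.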


Re: linux-next: build warning after merge of the akpm tree

2020-12-07 Thread Dmitry Vyukov
On Sat, Dec 5, 2020 at 6:19 AM Andrew Morton  wrote:
>
> On Fri, 4 Dec 2020 21:00:00 +1100 Stephen Rothwell  
> wrote:
>
> > Hi all,
> >
> > After merging the akpm tree, today's linux-next build (powerpc
> > allyesconfig) produced warnings like this:
> >
> > kernel/kcov.c:296:14: warning: conflicting types for built-in function 
> > '__sanitizer_cov_trace_switch'; expected 'void(long unsigned int,  void *)' 
> > [-Wbuiltin-declaration-mismatch]
> >   296 | void notrace __sanitizer_cov_trace_switch(u64 val, u64 *cases)
> >   |  ^~~~
>
> Odd.  clang wants that signature, according to
> https://clang.llvm.org/docs/SanitizerCoverage.html.  But gcc seems to
> want a different signature.  Beats me - best I can do is to cc various
> likely culprits ;)
>
> Which gcc version?  Did you recently update gcc?
>
> > ld: warning: orphan section `.data..Lubsan_data177' from 
> > `arch/powerpc/oprofile/op_model_pa6t.o' being placed in section 
> > `.data..Lubsan_data177'
> >
> > (lots of these latter ones)
> >
> > I don't know what produced these, but it is in the akpm-current or
> > akpm trees.

I can reproduce this in x86_64 build as well but only if I enable
UBSAN as well. There were some recent UBSAN changes by Kees, so maybe
that's what affected the warning.
Though, the warning itself looks legit and unrelated to UBSAN. In
fact, if the compiler expects long and we accept u64, it may be broken
on 32-bit arches...

I have gcc version 10.2.0 (Debian 10.2.0-15)
On next-20201207
config is defconfig +
CONFIG_KCOV=y
CONFIG_KCOV_ENABLE_COMPARISONS=y
CONFIG_UBSAN=y

$ make -j8 kernel/kcov.o
  CC  kernel/kcov.o
kernel/kcov.c:296:14: warning: conflicting types for built-in function
‘__sanitizer_cov_trace_switch’; expected ‘void(long unsigned int,
void *)’ [-Wbuiltin-declaration-mismatch]
  296 | void notrace __sanitizer_cov_trace_switch(u64 val, u64 *cases)


Re: KASAN: slab-out-of-bounds Read in btrfs_scan_one_device

2020-12-07 Thread Dmitry Vyukov
On Mon, Dec 7, 2020 at 10:34 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 0697d9a610998b8bdee6b2390836cb2391d8fd1a
> Author: Johannes Thumshirn 
> Date:   Wed Nov 18 09:03:26 2020 +
>
> btrfs: don't access possibly stale fs_info data for printing duplicate 
> device
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10fb0d9b50
> start commit:   521b619a Merge tag 'linux-kselftest-kunit-fixes-5.10-rc3' ..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=e791ddf0875adf65
> dashboard link: https://syzkaller.appspot.com/bug?extid=c4b1e5278d93269fd69c
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=16296f5c50
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1614e74650
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: btrfs: don't access possibly stale fs_info data for printing 
> duplicate device
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix:
btrfs: don't access possibly stale fs_info data for printing duplicate device


Re: WARN_ON_ONCE

2020-12-06 Thread Dmitry Vyukov
On Sat, Dec 5, 2020 at 1:05 PM Michael Ellerman  wrote:
>
> Alexey Kardashevskiy  writes:
> > On 04/12/2020 12:25, Michael Ellerman wrote:
> >> Dmitry Vyukov  writes:
> >>> On Thu, Dec 3, 2020 at 10:19 AM Dmitry Vyukov  wrote:
> >>>> On Thu, Dec 3, 2020 at 10:10 AM Alexey Kardashevskiy  
> >>>> wrote:
> >>>>>
> >>>>> Hi!
> >>>>>
> >>>>> Syzkaller triggered WARN_ON_ONCE at
> >>>>>
> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/tracepoint.c?h=v5.10-rc6#n266
> >>>>>
> >>>>>
> >>>>> ===
> >>>>> static int tracepoint_add_func(struct tracepoint *tp,
> >>>>> struct tracepoint_func *func, int prio)
> >>>>> {
> >>>>>  struct tracepoint_func *old, *tp_funcs;
> >>>>>  int ret;
> >>>>>
> >>>>>  if (tp->regfunc && !static_key_enabled(&tp->key)) {
> >>>>>  ret = tp->regfunc();
> >>>>>  if (ret < 0)
> >>>>>  return ret;
> >>>>>  }
> >>>>>
> >>>>>  tp_funcs = rcu_dereference_protected(tp->funcs,
> >>>>>  lockdep_is_held(&tracepoints_mutex));
> >>>>>  old = func_add(&tp_funcs, func, prio);
> >>>>>  if (IS_ERR(old)) {
> >>>>>  WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
> >>>>>  return PTR_ERR(old);
> >>>>>  }
> >>>>>
> >>>>> ===
> >>>>>
> >>>>> What is the common approach here? Syzkaller reacts on this as if it was
> >>>>> a bug but WARN_ON_ONCE here seems intentional. Do we still push for
> >>>>> removing such warnings?
> >>
> >> AFAICS it is a bug if that fires.
> >>
> >> See the commit that added it:
> >>d66a270be331 ("tracepoint: Do not warn on ENOMEM")
> >>
> >> Which says:
> >>Tracepoint should only warn when a kernel API user does not respect the
> >>required preconditions (e.g. same tracepoint enabled twice,
> >
> > This says that the userspace can trigger the warning if it does not use
> > the API right.
>
> No I don't think it says that.
>
> It's saying that it should be a WARN if a *kernel* user of the
> tracepoint API violates the API. The implication is that this condition
> should never happen if the kernel is using the tracepoint API correctly,
> and so if we hit this condition it indicates a bug in the kernel that
> should be fixed.
>
> >> or called
> >>to remove a tracepoint that does not exist).
> >>
> >>Silence warning in out-of-memory conditions, given that the error is
> >>returned to the caller.
> >>
> >>
> >> So if you're seeing it then you've somehow caused it to return something
> >> other than ENOMEM, and that is a bug.
> >
> > This is a userspace bug which registers the same thing twice, the
> > kernel returns a correct error. The question is should it warn by
> > WARN_ON or pr_err(). The comment in bug.h suggests pr_err() is the right
> > way, is it not?
>
> Userspace must not be able to trigger a WARN.
>
> What is the path into that code from userspace?

There is a lot of info on this WARNING in the syzbot report:
https://syzkaller.appspot.com/bug?id=41f4318cf01762389f4d1c1c459da4f542fe5153
https://lore.kernel.org/lkml/a6348d05a9234...@google.com/

There are lots of sample stacks and reproducers, also happens on 4.14 and 4.19.

> Either something on that path should be checking that it's not violating
> the API and triggering the WARN, or if that's not possible/easy then the
> WARN should be removed.
>
> cheers
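
To illustrate the behaviour under discussion, here is a small userspace model in C of the "warn once unless the error is ENOMEM" check. It is a sketch only: add_func_model() and warn_count are hypothetical names, the real WARN_ON_ONCE() is a kernel macro with more machinery, and the use of -EEXIST for a duplicate registration is an assumption about what func_add() returns.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of times the modelled one-shot warning has fired. */
static int warn_count;

/*
 * Model of the error path in tracepoint_add_func(): any error is passed
 * back to the caller, but only errors other than -ENOMEM trigger the
 * warning, and only the first such error does so.
 */
static int add_func_model(int func_add_err)
{
	static bool warned;

	if (func_add_err < 0) {
		if (func_add_err != -ENOMEM && !warned) {
			warned = true;
			warn_count++;
			fprintf(stderr, "WARNING: unexpected error %d\n",
				func_add_err);
		}
		return func_add_err;
	}
	return 0;
}
```

In this model an out-of-memory failure is returned silently, while a duplicate registration (modelled as -EEXIST) both returns the error and emits the "WARNING:" line once. If userspace can reach that second path, a checker that treats "WARNING:" as a kernel bug will flag it, which is the disagreement in the thread above.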


Re: WARNING: filesystem loop5 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-03 Thread Dmitry Vyukov
On Thu, Dec 3, 2020 at 1:55 PM Dmitry Vyukov  wrote:
>
> On Thu, Dec 3, 2020 at 5:15 AM Randy Dunlap  wrote:
> >
> > On 12/1/20 1:17 PM, Randy Dunlap wrote:
> > > On 11/30/20 11:47 PM, Dmitry Vyukov wrote:
> > >> On Tue, Dec 1, 2020 at 2:03 AM Randy Dunlap  
> > >> wrote:
> > >>>
> > >>> On 11/30/20 12:43 AM, Dmitry Vyukov wrote:
> > >>>> On Mon, Nov 30, 2020 at 5:29 AM Randy Dunlap  
> > >>>> wrote:
> > >>>>>
> > >>>>> On 11/27/20 4:32 AM, syzbot wrote:
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> syzbot found the following issue on:
> > >>>>>>
> > >>>>>> HEAD commit:418baf2c Linux 5.10-rc5
> > >>>>>> git tree:   upstream
> > >>>>>> console output: 
> > >>>>>> https://syzkaller.appspot.com/x/log.txt?x=171555b950
> > >>>>>> kernel config:  
> > >>>>>> https://syzkaller.appspot.com/x/.config?x=b81aff78c272da44
> > >>>>>> dashboard link: 
> > >>>>>> https://syzkaller.appspot.com/bug?extid=3fd34060f26e766536ff
> > >>>>>> compiler:   gcc (GCC) 10.1.0-syz 20200507
> > >>>>>>
> > >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> > >>>>>>
> > >>>>>> IMPORTANT: if you fix the issue, please add the following tag to the 
> > >>>>>> commit:
> > >>>>>> Reported-by: syzbot+3fd34060f26e76653...@syzkaller.appspotmail.com
> > >>>>>>
> > >>>>>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> > >>>>>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 
> > >>>>>> 512 inodes, the real maximum is 511, mounting anyway
> > >>>>>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> > >>>>>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> > >>>>>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 
> > >>>>>> 512 inodes, the real maximum is 511, mounting anyway
> > >>>>>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> > >>>>>>
> > >>>>>>
> > >>>>>> ---
> > >>>>>> This report is generated by a bot. It may contain errors.
> > >>>>>> See https://goo.gl/tpsmEJ for more information about syzbot.
> > >>>>>> syzbot engineers can be reached at syzkal...@googlegroups.com.
> > >>>>>>
> > >>>>>> syzbot will keep track of this issue. See:
> > >>>>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
> > ...
> >
> > >>>> Hi Randy,
> > >>>>
> > >>>> I see this bug was reported with a reproducer:
> > >>>> https://syzkaller.appspot.com/bug?id=a32ebd5db2f7c957b82cf54b97bdecf367bf0421
> > >>>> I assume it's a dup of this one.
> > >>>
> > >>> Sure, looks the same.
> > >>>
> > >>>> If you need the image itself, you can dump it to a file in the C
> > >>>> reproducer inside of syz_mount_image before mount call.
> > >>>
> > >>> Yes, got that.
> > >>>
> > >>> What outcome or result are you looking for here?
> > >>> Or what do you see as the problem?
> > >>
> > >> Hi Randy,
> > >>
> > >> "WARNING:" in kernel output is supposed to mean a kernel source bug.
> > >> Presence of that kernel bug is what syzbot has reported.
> > >>
> > >> Note: the bug may be a misuse of the "WARNING:" for invalid user
> > >> inputs in output as well :)
> > >
> > >
> > > [adding Al Viro]
> > >
> > > Hi Dmitry,
> > >
> > > I expect that the "WARNING:" message is being interpreted incorrectly 
> > > here,
> > > but that's a minor issue IMO.
> > >
> > >   if (info->si_lasti == BFS_MAX_LASTI)
> > >   printf("WARNING: filesystem %s was created with 512 inodes, 
> > > the real maximum is 511, mounting anyway\n", s->s_id);
> > >
> >
> > ...
> >
> > >
> > >
> > > However, in testing this, I see that the BFS image is not mounted
> > > on /dev/loop# at all.
> > >
> > > 'mount' says:
> > >
> > > # mount -t bfs -o loop bfsfilesyz000.img  /mnt/stand
> > > mount: /mnt/stand: mount(2) system call failed: Not a directory.
> > >
> > > (but it is a directory)
> > >
> > > and I have tracked that down to fs/namespace.c::graft_tree()
> > > returning -ENOTDIR, but I don't know why that is happening.
> > >
> > >
> > > Al, can you provide any insights on this?
> >
> > OK, with Al's help, here is the situation.
> >
> > If I use a regular file instead of a directory, the mount
> > command succeeds.
> >
> > The printk() from fs/bfs/inode.c that uses the WARNING: string
> > is not a WARN() or WARN_ON(). It's just a printk().
> >
> >  says:
> >
> >  * Do not include "BUG"/"WARNING" in format strings manually to make these
> >  * conditions distinguishable from kernel issues.
> >
> > so if I change fs/bfs/inode.c to use "warning:" or "Warning," or "Note:",
> > this little problem should go away.  Is that correct?
>
> Hi,
>
> Yes, any of these prefixes will work (not be considered as a kernel
> issue). syzkaller only matches "WARNING:" verbatim. I don't know about
> all other kernel testing systems, but at least it's distinguishable.
>
> Maybe also worth adding "bfs:" prefix for cases when people stare at
> dmesg afterwards.

Oh, sorry, there are already enough prefixes (BFS-fs: bfs_fill_super():).


Re: WARNING: filesystem loop5 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-03 Thread Dmitry Vyukov
On Thu, Dec 3, 2020 at 5:15 AM Randy Dunlap  wrote:
>
> On 12/1/20 1:17 PM, Randy Dunlap wrote:
> > On 11/30/20 11:47 PM, Dmitry Vyukov wrote:
> >> On Tue, Dec 1, 2020 at 2:03 AM Randy Dunlap  wrote:
> >>>
> >>> On 11/30/20 12:43 AM, Dmitry Vyukov wrote:
> >>>> On Mon, Nov 30, 2020 at 5:29 AM Randy Dunlap  
> >>>> wrote:
> >>>>>
> >>>>> On 11/27/20 4:32 AM, syzbot wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> syzbot found the following issue on:
> >>>>>>
> >>>>>> HEAD commit:418baf2c Linux 5.10-rc5
> >>>>>> git tree:   upstream
> >>>>>> console output: 
> >>>>>> https://syzkaller.appspot.com/x/log.txt?x=171555b950
> >>>>>> kernel config:  
> >>>>>> https://syzkaller.appspot.com/x/.config?x=b81aff78c272da44
> >>>>>> dashboard link: 
> >>>>>> https://syzkaller.appspot.com/bug?extid=3fd34060f26e766536ff
> >>>>>> compiler:   gcc (GCC) 10.1.0-syz 20200507
> >>>>>>
> >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>>
> >>>>>> IMPORTANT: if you fix the issue, please add the following tag to the 
> >>>>>> commit:
> >>>>>> Reported-by: syzbot+3fd34060f26e76653...@syzkaller.appspotmail.com
> >>>>>>
> >>>>>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> >>>>>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 
> >>>>>> 512 inodes, the real maximum is 511, mounting anyway
> >>>>>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >>>>>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> >>>>>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 
> >>>>>> 512 inodes, the real maximum is 511, mounting anyway
> >>>>>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >>>>>>
> >>>>>>
> >>>>>> ---
> >>>>>> This report is generated by a bot. It may contain errors.
> >>>>>> See https://goo.gl/tpsmEJ for more information about syzbot.
> >>>>>> syzbot engineers can be reached at syzkal...@googlegroups.com.
> >>>>>>
> >>>>>> syzbot will keep track of this issue. See:
> >>>>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> ...
>
> >>>> Hi Randy,
> >>>>
> >>>> I see this bug was reported with a reproducer:
> >>>> https://syzkaller.appspot.com/bug?id=a32ebd5db2f7c957b82cf54b97bdecf367bf0421
> >>>> I assume it's a dup of this one.
> >>>
> >>> Sure, looks the same.
> >>>
> >>>> If you need the image itself, you can dump it to a file in the C
> >>>> reproducer inside of syz_mount_image before mount call.
> >>>
> >>> Yes, got that.
> >>>
> >>> What outcome or result are you looking for here?
> >>> Or what do you see as the problem?
> >>
> >> Hi Randy,
> >>
> >> "WARNING:" in kernel output is supposed to mean a kernel source bug.
> >> Presence of that kernel bug is what syzbot has reported.
> >>
> >> Note: the bug may be a misuse of the "WARNING:" for invalid user
> >> inputs in output as well :)
> >
> >
> > [adding Al Viro]
> >
> > Hi Dmitry,
> >
> > I expect that the "WARNING:" message is being interpreted incorrectly here,
> > but that's a minor issue IMO.
> >
> >   if (info->si_lasti == BFS_MAX_LASTI)
> >   printf("WARNING: filesystem %s was created with 512 inodes, 
> > the real maximum is 511, mounting anyway\n", s->s_id);
> >
>
> ...
>
> >
> >
> > However, in testing this, I see that the BFS image is not mounted
> > on /dev/loop# at all.
> >
> > 'mount' says:
> >
> > # mount -t bfs -o loop bfsfilesyz000.img  /mnt/stand
> > mount: /mnt/stand: mount(2) system call failed: Not a directory.
> >
> > (but it is a directory)
> >
> > and I have tracked that down to fs/namespace.c::graft_tree()
> > returning -ENOTDIR, but I don't know why that is happening.
> >
> >
> > Al, can you provide any insights on this?
>
> OK, with Al's help, here is the situation.
>
> If I use a regular file instead of a directory, the mount
> command succeeds.
>
> The printk() from fs/bfs/inode.c that uses the WARNING: string
> is not a WARN() or WARN_ON(). It's just a printk().
>
>  says:
>
>  * Do not include "BUG"/"WARNING" in format strings manually to make these
>  * conditions distinguishable from kernel issues.
>
> so if I change fs/bfs/inode.c to use "warning:" or "Warning," or "Note:",
> this little problem should go away.  Is that correct?

Hi,

Yes, any of these prefixes will work (not be considered as a kernel
issue). syzkaller only matches "WARNING:" verbatim. I don't know about
all other kernel testing systems, but at least it's distinguishable.

Maybe also worth adding "bfs:" prefix for cases when people stare at
dmesg afterwards.
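
As a rough illustration of what "matches verbatim" means here, the following C sketch does a case-sensitive substring search on a console line. This is not syzkaller's code (its report parsing is more involved), and looks_like_kernel_warning() is a hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Case-sensitive check for the exact token "WARNING:". Lower-case or
 * mixed-case spellings such as "warning:" or "Warning," do not match.
 */
static bool looks_like_kernel_warning(const char *line)
{
	return strstr(line, "WARNING:") != NULL;
}
```

With this kind of check, "BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 ..." is flagged, while the same line with "warning:" or "Note:" is not.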


Re: WARN_ON_ONCE

2020-12-03 Thread Dmitry Vyukov
On Thu, Dec 3, 2020 at 10:19 AM Dmitry Vyukov  wrote:
>
> On Thu, Dec 3, 2020 at 10:10 AM Alexey Kardashevskiy  wrote:
> >
> > Hi!
> >
> > Syzkaller triggered WARN_ON_ONCE at
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/tracepoint.c?h=v5.10-rc6#n266
> >
> >
> > ===
> > static int tracepoint_add_func(struct tracepoint *tp,
> >struct tracepoint_func *func, int prio)
> > {
> > struct tracepoint_func *old, *tp_funcs;
> > int ret;
> >
> > if (tp->regfunc && !static_key_enabled(&tp->key)) {
> > ret = tp->regfunc();
> > if (ret < 0)
> > return ret;
> > }
> >
> > tp_funcs = rcu_dereference_protected(tp->funcs,
> > lockdep_is_held(&tracepoints_mutex));
> > old = func_add(&tp_funcs, func, prio);
> > if (IS_ERR(old)) {
> > WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
> > return PTR_ERR(old);
> > }
> >
> > ===
> >
> > What is the common approach here? Syzkaller reacts on this as if it was
> > a bug but WARN_ON_ONCE here seems intentional. Do we still push for
> > removing such warnings?
>
> +LKML

+LKML for real

> Hi Alexey,
>
> Yes, see the guidelines here:
> https://elixir.bootlin.com/linux/v5.10-rc6/source/include/asm-generic/bug.h#L67
>
> Without a criterion for kernel bug/not a kernel bug, no kernel testing
> is possible.
>
> But this may be a real bug as well. The code seems to assume that
> ENOMEM is the only possible error here, which is not the case in
> reality.
>
>
> > Another example is:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/tracepoint.h?h=v5.10-rc6#n313
> >
> > My VMs crash on dereferencing it_func_ptr which is easily fixable by:
> >
> > @@ -307,9 +307,11 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> >  \
> >  it_func_ptr =   \
> >  rcu_dereference_raw((&__tracepoint_##_name)->funcs); \
> > +   if (it_func_ptr)\
> >  do {\
> >  it_func = (it_func_ptr)->func;  \
> >  __data = (it_func_ptr)->data;   \
> >
> >
> > But - this only happens when OOM killer starts killing syzkaller
> > processes (I do not give it much memory so it is quite artificial
> > environment). Do we push these?
> >
> > Are there guidelines of some sort? Thanks,
> >
> >
> > --
> > Alexey
> >


Re: [PATCH v5 0/4] kasan: add workqueue stack for generic KASAN

2020-12-02 Thread Dmitry Vyukov
On Thu, Dec 3, 2020 at 3:21 AM Walter Wu  wrote:
>
> Syzbot reports many UAF issues for workqueues, see [1].
> In some of these, the access/allocation happened in process_one_work(),
> so the free stack in the KASAN report is useless: it doesn't help
> programmers solve the workqueue UAF.
>
> This patchset improves KASAN reports by adding the workqueue
> queueing stack. This helps programmers solve use-after-free
> and double-free memory issues.
>
> Generic KASAN also records the last two workqueue stacks and prints
> them in the KASAN report. This only applies to generic KASAN.
>
> [1]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work
> [2]https://bugzilla.kernel.org/show_bug.cgi?id=198437
>
> Walter Wu (4):
> workqueue: kasan: record workqueue stack
> kasan: print workqueue stack
> lib/test_kasan.c: add workqueue test case
> kasan: update documentation for generic kasan
>
> ---
> Changes since v4:
> - No timer use case was found, so the timer patch was removed
> - Removed a mention of call_rcu() from kasan_record_aux_stack()
>   Thanks to Dmitry and Alexander for the suggestions.
>
> Changes since v3:
> - The test cases had a merge conflict, so they were
>   rebased onto KASAN-KUNIT.
>
> Changes since v2:
> - Made the KASAN documentation more readable.
>   Thanks to Marco for the suggestion.
>
> Changes since v1:
> - Thanks to Marco and Thomas for the suggestions.
> - Removed unnecessary code and fixed the commit log
> - Reused kasan_record_aux_stack() and aux_stack
>   to record timer and workqueue stacks.
> - Changed the aux stack title to a common name.
>
> ---
> Documentation/dev-tools/kasan.rst |  5 +++--
> kernel/workqueue.c                |  3 +++
> lib/test_kasan_module.c           | 29 +++++++++++++++++++++++++++++
> mm/kasan/generic.c                |  4 +---
> mm/kasan/report.c                 |  4 ++--
> 5 files changed, 38 insertions(+), 7 deletions(-)


Hi Walter,

Thanks for the update.
The series still looks good to me. I see patches already have my
Reviewed-by, so I will not resend them.


Re: [PATCH v3 1/1] kasan: fix object remain in offline per-cpu quarantine

2020-12-02 Thread Dmitry Vyukov
On Wed, Dec 2, 2020 at 8:58 AM Kuan-Ying Lee  wrote:
>
> We hit this issue in our internal test.
> When generic KASAN is enabled, a kfree()'d object is first put into the
> per-cpu quarantine. If the cpu goes offline, the object remains in
> the per-cpu quarantine. If we then call kmem_cache_destroy(), slub
> reports an "Objects remaining" error.
>
> [   74.982625] 
> =
> [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in 
> test_module_slab on __kmem_cache_shutdown()
> [   74.984145] 
> -
> [   74.984145]
> [   74.984883] Disabling lock debugging due to kernel taint
> [   74.985561] INFO: Slab 0x(ptrval) objects=34 used=1 
> fp=0x(ptrval) flags=0x20010200
> [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: GB 
> 5.10.0-rc1-7-g4525c8781ec0-dirty #10
> [   74.987262] Hardware name: linux,dummy-virt (DT)
> [   74.987606] Call trace:
> [   74.987924]  dump_backtrace+0x0/0x2b0
> [   74.988296]  show_stack+0x18/0x68
> [   74.988698]  dump_stack+0xfc/0x168
> [   74.989030]  slab_err+0xac/0xd4
> [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> [   74.989779]  kmem_cache_destroy+0x68/0x130
> [   74.990176]  test_version_show+0x84/0xf0
> [   74.990679]  module_attr_show+0x40/0x60
> [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> [   74.991656]  kernfs_seq_show+0xa0/0xb8
> [   74.992059]  seq_read+0x1f0/0x7e8
> [   74.992415]  kernfs_fop_read+0x70/0x338
> [   74.993051]  vfs_read+0xe4/0x250
> [   74.993498]  ksys_read+0xc8/0x180
> [   74.993825]  __arm64_sys_read+0x44/0x58
> [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> [   74.994708]  do_el0_svc+0x38/0xa0
> [   74.995088]  el0_sync_handler+0x170/0x178
> [   74.995497]  el0_sync+0x174/0x180
> [   74.996050] INFO: Object 0x(ptrval) @offset=15848
> [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 
> pid=172
> [   75.000802]  stack_trace_save+0x9c/0xd0
> [   75.002420]  set_track+0x64/0xf0
> [   75.002770]  alloc_debug_processing+0x104/0x1a0
> [   75.003171]  ___slab_alloc+0x628/0x648
> [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> [   75.004757]  kmem_cache_alloc+0x560/0x588
> [   75.005376]  test_version_show+0x98/0xf0
> [   75.005756]  module_attr_show+0x40/0x60
> [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> [   75.007433]  kernfs_seq_show+0xa0/0xb8
> [   75.007800]  seq_read+0x1f0/0x7e8
> [   75.008128]  kernfs_fop_read+0x70/0x338
> [   75.008507]  vfs_read+0xe4/0x250
> [   75.008990]  ksys_read+0xc8/0x180
> [   75.009462]  __arm64_sys_read+0x44/0x58
> [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has 
> objects
>
> Register a cpu hotplug function to remove all objects in the offline
> per-cpu quarantine when cpu is going offline. Set a per-cpu variable
> to indicate this cpu is offline.
>
> Signed-off-by: Kuan-Ying Lee 
> Suggested-by: Dmitry Vyukov 
> Reported-by: Guangye Yang 
> Cc: Andrey Ryabinin 
> Cc: Alexander Potapenko 
> Cc: Andrew Morton 
> Cc: Matthias Brugger 

Looks good to me, thanks.

Reviewed-by: Dmitry Vyukov 
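
For readers skimming the thread, the intended behaviour can be modelled in
userspace roughly as follows. This is a toy sketch, not the patch itself:
the toy_* names are invented, objects are only counted rather than linked,
and the "drain on offline" step is my reading of the commit message rather
than a copy of the hotplug callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU quarantine: counts objects instead of linking them. */
struct toy_quarantine {
	size_t queued;  /* objects currently held in quarantine */
	size_t freed;   /* objects released back to the allocator */
	bool offline;   /* set while the owning CPU is offline */
};

/* Models the check added to quarantine_put(): an offline CPU's
 * quarantine accepts no new objects, they are freed right away. */
void toy_quarantine_put(struct toy_quarantine *q)
{
	if (q->offline) {
		q->freed++;
		return;
	}
	q->queued++;
}

/* Assumed shape of the "CPU going offline" step: mark the quarantine
 * offline, then release everything it still holds. */
void toy_cpu_offline(struct toy_quarantine *q)
{
	q->offline = true;
	q->freed += q->queued;
	q->queued = 0;
}

/* Assumed shape of the "CPU coming online" step. */
void toy_cpu_online(struct toy_quarantine *q)
{
	q->offline = false;
}
```

The point of the flag is that nothing can linger in an offline CPU's
quarantine, so a later kmem_cache_destroy() finds no remaining objects there.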


> ---
>  mm/kasan/quarantine.c | 40 
>  1 file changed, 40 insertions(+)
>
> diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> index 4c5375810449..cac7c617df72 100644
> --- a/mm/kasan/quarantine.c
> +++ b/mm/kasan/quarantine.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "../slab.h"
>  #include "kasan.h"
> @@ -43,6 +44,7 @@ struct qlist_head {
> struct qlist_node *head;
> struct qlist_node *tail;
> size_t bytes;
> +   bool offline;
>  };
>
>  #define QLIST_INIT { NULL, NULL, 0 }
> @@ -188,6 +190,11 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
>         local_irq_save(flags);
>
>         q = this_cpu_ptr(&cpu_quarantine);
> +       if (q->offline) {
> +               qlink_free(&info->quarantine_link, cache);
> +               local_irq_restore(flags);
> +               return;
> +       }
>         qlist_put(q, &info->quarantine_link, cache->size);
>         if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
>                 qlist_move_all(q, &temp);
> @@ -328,3 +335,36 @@ void quarantine_remove_cache(struct kmem_cache *cache)
>
>         synchronize_srcu(&remove_cache_srcu);
>  }
> +
> +static int kasan_cpu_online(unsigned int cpu)
> +{
> +       this_cpu_ptr(&cpu_quarantine

Re: WARNING: filesystem loop5 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-12-02 Thread Dmitry Vyukov
On Tue, Dec 1, 2020 at 10:17 PM Randy Dunlap  wrote:
>
> On 11/30/20 11:47 PM, Dmitry Vyukov wrote:
> > On Tue, Dec 1, 2020 at 2:03 AM Randy Dunlap  wrote:
> >>
> >> On 11/30/20 12:43 AM, Dmitry Vyukov wrote:
> >>> On Mon, Nov 30, 2020 at 5:29 AM Randy Dunlap  
> >>> wrote:
> >>>>
> >>>> On 11/27/20 4:32 AM, syzbot wrote:
> >>>>> Hello,
> >>>>>
> >>>>> syzbot found the following issue on:
> >>>>>
> >>>>> HEAD commit:418baf2c Linux 5.10-rc5
> >>>>> git tree:   upstream
> >>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=171555b950
> >>>>> kernel config:  
> >>>>> https://syzkaller.appspot.com/x/.config?x=b81aff78c272da44
> >>>>> dashboard link: 
> >>>>> https://syzkaller.appspot.com/bug?extid=3fd34060f26e766536ff
> >>>>> compiler:   gcc (GCC) 10.1.0-syz 20200507
> >>>>>
> >>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>
> >>>>> IMPORTANT: if you fix the issue, please add the following tag to the 
> >>>>> commit:
> >>>>> Reported-by: syzbot+3fd34060f26e76653...@syzkaller.appspotmail.com
> >>>>>
> >>>>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> >>>>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 
> >>>>> 512 inodes, the real maximum is 511, mounting anyway
> >>>>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >>>>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> >>>>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 
> >>>>> 512 inodes, the real maximum is 511, mounting anyway
> >>>>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >>>>>
> >>>>>
> >>>>> ---
> >>>>> This report is generated by a bot. It may contain errors.
> >>>>> See https://goo.gl/tpsmEJ for more information about syzbot.
> >>>>> syzbot engineers can be reached at syzkal...@googlegroups.com.
> >>>>>
> >>>>> syzbot will keep track of this issue. See:
> >>>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >>>>>
> >>>>
> >>>> Hi,
> >>>> Can you provide the BFS image file that is being mounted?
> >>>> (./file0 I think.)
> >>>>
> >>>> --
> >>>> ~Randy
> >>>
> >>>
> >>> Hi Randy,
> >>>
> >>> I see this bug was reported with a reproducer:
> >>> https://syzkaller.appspot.com/bug?id=a32ebd5db2f7c957b82cf54b97bdecf367bf0421
> >>> I assume it's a dup of this one.
> >>
> >> Sure, looks the same.
> >>
> >>> If you need the image itself, you can dump it to a file in the C
> >>> reproducer inside of syz_mount_image before mount call.
> >>
> >> Yes, got that.
> >>
> >> What outcome or result are you looking for here?
> >> Or what do you see as the problem?
> >
> > Hi Randy,
> >
> > "WARNING:" in kernel output is supposed to mean a kernel source bug.
> > The presence of that kernel bug is what syzbot has reported.
> >
> > Note: the bug may also be a misuse of "WARNING:" in the output for
> > invalid user input :)
>
>
> [adding Al Viro]
>
> Hi Dmitry,
>
> I expect that the "WARNING:" message is being interpreted incorrectly here,
> but that's a minor issue IMO.
>
> if (info->si_lasti == BFS_MAX_LASTI)
> printf("WARNING: filesystem %s was created with 512 inodes, 
> the real maximum is 511, mounting anyway\n", s->s_id);
>
>
> If you/we look at fs/bfs/bfs.h, it says:
>
> /* In theory BFS supports up to 512 inodes, numbered from 2 (for /) up to 513 
> inclusive.
>In actual fact, attempting to create the 512th inode (i.e. inode No. 513 
> or file No. 511)
>will fail with ENOSPC in bfs_add_entry(): the root directory cannot 
> contain so many entries, counting '..'.
>So, mkfs.bfs(8) should really limit its -N option to 511 and not 512. For 
> now, we just print a warning
>if a filesystem is mounted with such "impossible to fill up" number of 
> inodes */
>
> so one question is why does syzkaller try to do this at all?

Solely for kernel testing purposes.

> Why not set number-of-inodes to 511 instead of 512 in the BFS image file?
>
> However, in testing this, I see that the BFS image is not mounted
> on /dev/loop# at all.
>
> 'mount' says:
>
> # mount -t bfs -o loop bfsfilesyz000.img  /mnt/stand
> mount: /mnt/stand: mount(2) system call failed: Not a directory.
>
> (but it is a directory)
>
> and I have tracked that down to fs/namespace.c::graft_tree()
> returning -ENOTDIR, but I don't know why that is happening.
>
>
> Al, can you provide any insights on this?
>
> thanks.
> --
> ~Randy
>


Re: [PATCH v4 0/6] kasan: add workqueue and timer stack for generic KASAN

2020-12-01 Thread Dmitry Vyukov
On Tue, Dec 1, 2020 at 3:13 PM Thomas Gleixner  wrote:
> >> > Syzbot reports many UAF issues for workqueues and timers, see [1] and [2].
> >> > In some of these, the access/allocation happened in process_one_work(),
> >> > so the free stack in the KASAN report is useless: it doesn't help
> >> > programmers solve the workqueue UAF. The same may hold for timers.
> >> >
> >> > This patchset improves KASAN reports by adding the workqueue
> >> > queueing stack and timer stack information. This helps programmers
> >> > solve use-after-free and double-free memory issues.
> >> >
> >> > Generic KASAN also records the last two workqueue and timer stacks and
> >> > prints them in the KASAN report. This only applies to generic KASAN.
> >
> > Walter, did you mail v5?
> > I am checking the status of KASAN issues, and this does not seem to be in
> > linux-next.
> >
> >> > [1]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work
> >> > [2]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22%20expire_timers
> >>
> >> How are these links useful for people who do not have a gurgle account?
> >
> > This is a public mailing list archive, so effectively the same way as
> > lore links ;)
>
> Just that it asked me to log in last time. That's why I wrote the
> above. Today it does not, odd.

Some random permissions settings changes were observed before, so I
can believe that.


Re: [PATCH v4 0/6] kasan: add workqueue and timer stack for generic KASAN

2020-12-01 Thread Dmitry Vyukov
On Tue, Dec 1, 2020 at 12:17 PM Walter Wu  wrote:
>
> Hi Dmitry,
>
> On Tue, 2020-12-01 at 08:59 +0100, 'Dmitry Vyukov' via kasan-dev wrote:
> > On Wed, Sep 30, 2020 at 5:29 PM Thomas Gleixner  wrote:
> > >
> > > On Thu, Sep 24 2020 at 12:01, Walter Wu wrote:
> > > > Syzbot reports many UAF issues for workqueues and timers, see [1] and [2].
> > > > In some of these, the access/allocation happened in process_one_work(),
> > > > so the free stack in the KASAN report is useless: it doesn't help
> > > > programmers solve the workqueue UAF. The same may hold for timers.
> > > >
> > > > This patchset improves KASAN reports by adding the workqueue
> > > > queueing stack and timer stack information. This helps programmers
> > > > solve use-after-free and double-free memory issues.
> > > >
> > > > Generic KASAN also records the last two workqueue and timer stacks and
> > > > prints them in the KASAN report. This only applies to generic KASAN.
> >
> > Walter, did you mail v5?
> > I am checking the status of KASAN issues, and this does not seem to be in
> > linux-next.
> >
>
> Sorry for the delay in responding to this patch. I have been busy these
> past few months, so I suspended work on it.
> Yes, I will send it next week. But for v4 we need to confirm that the timer
> stack is useful. I haven't found an example. Do you have any suggestions
> about timers?

Good question.

We had some use-after-frees that mention call_timer_fn:
https://groups.google.com/g/syzkaller-bugs/search?q=%22kasan%22%20%22use-after-free%22%20%22expire_timers%22%20%22call_timer_fn%22%20
In the reports I checked, call_timer_fn appears in the "access" stack
rather than in the "free" stack.

Looking at these reports, I cannot conclude that the do_init_timer stack
would be useful.
I am mildly leaning towards not recording the do_init_timer stack for now
(until we have clear use cases), as the number of aux stacks is very
limited (2).
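
To make the trade-off concrete, here is a rough userspace model of a
two-slot aux stack record. It assumes the newest stack goes into slot 0 and
the previous one shifts to slot 1, which is how I understand
kasan_record_aux_stack() in generic KASAN; struct aux_meta and
record_aux_stack() are names made up for this sketch, and labels stand in
for stack depot handles:

```c
#include <assert.h>
#include <string.h>

#define AUX_SLOTS 2

/* Hypothetical stand-in for per-object metadata: the kernel stores
 * stack handles in these slots, this model stores labels. */
struct aux_meta {
	const char *aux_stack[AUX_SLOTS];
};

/* Record a new aux stack: the previous newest entry moves to slot 1
 * and whatever was in slot 1 is lost. */
void record_aux_stack(struct aux_meta *meta, const char *stack)
{
	meta->aux_stack[1] = meta->aux_stack[0];
	meta->aux_stack[0] = stack;
}
```

If an object is touched by call_rcu, queue_work and do_init_timer in that
order, only the last two survive, so each extra producer makes it more
likely that the stack we actually need has been evicted.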


Re: [PATCH v4 0/6] kasan: add workqueue and timer stack for generic KASAN

2020-12-01 Thread Dmitry Vyukov
On Wed, Sep 30, 2020 at 5:29 PM Thomas Gleixner  wrote:
>
> On Thu, Sep 24 2020 at 12:01, Walter Wu wrote:
> > Syzbot reports many UAF issues for workqueues and timers, see [1] and [2].
> > In some of these, the access/allocation happened in process_one_work(),
> > so the free stack in the KASAN report is useless: it doesn't help
> > programmers solve the workqueue UAF. The same may hold for timers.
> >
> > This patchset improves KASAN reports by adding the workqueue
> > queueing stack and timer stack information. This helps programmers
> > solve use-after-free and double-free memory issues.
> >
> > Generic KASAN also records the last two workqueue and timer stacks and
> > prints them in the KASAN report. This only applies to generic KASAN.

Walter, did you mail v5?
I am checking the status of KASAN issues, and this does not seem to be in
linux-next.

> > [1]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work
> > [2]https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22%20expire_timers
>
> How are these links useful for people who do not have a gurgle account?

This is a public mailing list archive, so effectively the same way as
lore links ;)


Re: WARNING: filesystem loop5 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-11-30 Thread Dmitry Vyukov
On Tue, Dec 1, 2020 at 2:03 AM Randy Dunlap  wrote:
>
> On 11/30/20 12:43 AM, Dmitry Vyukov wrote:
> > On Mon, Nov 30, 2020 at 5:29 AM Randy Dunlap  wrote:
> >>
> >> On 11/27/20 4:32 AM, syzbot wrote:
> >>> Hello,
> >>>
> >>> syzbot found the following issue on:
> >>>
> >>> HEAD commit:418baf2c Linux 5.10-rc5
> >>> git tree:   upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=171555b950
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b81aff78c272da44
> >>> dashboard link: 
> >>> https://syzkaller.appspot.com/bug?extid=3fd34060f26e766536ff
> >>> compiler:   gcc (GCC) 10.1.0-syz 20200507
> >>>
> >>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>
> >>> IMPORTANT: if you fix the issue, please add the following tag to the 
> >>> commit:
> >>> Reported-by: syzbot+3fd34060f26e76653...@syzkaller.appspotmail.com
> >>>
> >>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> >>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 512 
> >>> inodes, the real maximum is 511, mounting anyway
> >>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >>> BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> >>> BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 512 
> >>> inodes, the real maximum is 511, mounting anyway
> >>> BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >>>
> >>>
> >>> ---
> >>> This report is generated by a bot. It may contain errors.
> >>> See https://goo.gl/tpsmEJ for more information about syzbot.
> >>> syzbot engineers can be reached at syzkal...@googlegroups.com.
> >>>
> >>> syzbot will keep track of this issue. See:
> >>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >>>
> >>
> >> Hi,
> >> Can you provide the BFS image file that is being mounted?
> >> (./file0 I think.)
> >>
> >> --
> >> ~Randy
> >
> >
> > Hi Randy,
> >
> > I see this bug was reported with a reproducer:
> > https://syzkaller.appspot.com/bug?id=a32ebd5db2f7c957b82cf54b97bdecf367bf0421
> > I assume it's a dup of this one.
>
> Sure, looks the same.
>
> > If you need the image itself, you can dump it to a file in the C
> > reproducer inside of syz_mount_image before mount call.
>
> Yes, got that.
>
> What outcome or result are you looking for here?
> Or what do you see as the problem?

Hi Randy,

"WARNING:" in kernel output is supposed to mean a kernel source bug.
The presence of that kernel bug is what syzbot has reported.

Note: the bug may also be a misuse of "WARNING:" in the output for
invalid user input :)


Re: BUG: rwlock bad magic on CPU, kworker/0:LINE/NUM, ADDR

2020-11-30 Thread Dmitry Vyukov
On Mon, Nov 30, 2020 at 12:33 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:90cf87d1 enetc: Let the hardware auto-advance the taprio b..
> git tree:   net
> console output: https://syzkaller.appspot.com/x/log.txt?x=135479b350
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5720c06118e6c4cc
> dashboard link: https://syzkaller.appspot.com/bug?extid=cb987a9c796abc570b47
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+cb987a9c796abc570...@syzkaller.appspotmail.com
>
> tipc: 32-bit node address hash set to aa1414ac
> BUG: rwlock bad magic on CPU#0, kworker/0:18/18158, 859f2a8d
> CPU: 0 PID: 18158 Comm: kworker/0:18 Not tainted 5.10.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: events tipc_net_finalize_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  rwlock_bug kernel/locking/spinlock_debug.c:144 [inline]
>  debug_write_lock_before kernel/locking/spinlock_debug.c:182 [inline]
>  do_raw_write_lock+0x1ef/0x280 kernel/locking/spinlock_debug.c:206
>  tipc_mon_reinit_self+0x1f7/0x630 net/tipc/monitor.c:685

There was also "general protection fault in tipc_mon_reinit_self":
https://syzkaller.appspot.com/bug?id=dc141b9a05cb48d3d9b46837bc2fdc9e7d95dbe9
which also happened once. Smells like an intricate race condition.


>  tipc_net_finalize net/tipc/net.c:134 [inline]
>  tipc_net_finalize+0x1df/0x310 net/tipc/net.c:125
>  process_one_work+0x933/0x15a0 kernel/workqueue.c:2272
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2418
>  kthread+0x3af/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>


Re: WARNING: filesystem loop5 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-11-30 Thread Dmitry Vyukov
On Mon, Nov 30, 2020 at 5:29 AM Randy Dunlap  wrote:
>
> On 11/27/20 4:32 AM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:418baf2c Linux 5.10-rc5
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=171555b950
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=b81aff78c272da44
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3fd34060f26e766536ff
> > compiler:   gcc (GCC) 10.1.0-syz 20200507
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+3fd34060f26e76653...@syzkaller.appspotmail.com
> >
> > BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> > BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 512 
> > inodes, the real maximum is 511, mounting anyway
> > BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> > BFS-fs: bfs_fill_super(): loop5 is unclean, continuing
> > BFS-fs: bfs_fill_super(): WARNING: filesystem loop5 was created with 512 
> > inodes, the real maximum is 511, mounting anyway
> > BFS-fs: bfs_fill_super(): Last block not available on loop5: 120
> >
> >
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkal...@googlegroups.com.
> >
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
>
> Hi,
> Can you provide the BFS image file that is being mounted?
> (./file0 I think.)
>
> --
> ~Randy


Hi Randy,

I see this bug was reported with a reproducer:
https://syzkaller.appspot.com/bug?id=a32ebd5db2f7c957b82cf54b97bdecf367bf0421
I assume it's a dup of this one.

If you need the image itself, you can dump it to a file in the C
reproducer, inside syz_mount_image, before the mount call.
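
Something along these lines should do. It is only a sketch: dump_image(),
its arguments and the output path are placeholders, not names taken from the
generated reproducer, so you will need to point it at whatever buffer holds
the image there.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `size` bytes starting at `data` to `path`.
 * Returns 0 on success, -1 on any error. */
int dump_image(const char *path, const void *data, size_t size)
{
	const char *p = data;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;
	while (size > 0) {
		ssize_t n = write(fd, p, size);

		/* Treat errors and zero-length writes as failure. */
		if (n <= 0) {
			close(fd);
			return -1;
		}
		p += n;
		size -= (size_t)n;
	}
	return close(fd);
}
```

Calling it just before the mount call leaves a copy of the image on disk
that can then be inspected or mounted by hand.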

Thanks


Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-27 Thread Dmitry Vyukov
On Sun, Nov 22, 2020 at 2:56 AM Tetsuo Handa
 wrote:
>
> On 2020/11/20 18:27, Dmitry Vyukov wrote:
> > Peter, so far it looks like just a very large, but normal graph to me.
> > The cheapest from an engineering point of view solution would be just
> > to increase the constants. I assume a 2x increase should buy us lots
> > of time to overflow.
> > I can think of more elaborate solutions, e.g. using bitmasks to
> > represent hot leaf and top-level locks. But it will both increase the
> > resulting code complexity (no uniform representation anymore, all code
> > will need to deal with different representations) and require some
> > time investments (that I can't justify for me at least as compared to
> > just throwing a bit more machine memory at it). And in the end it
> > won't really reduce the size of the graph.
> > What do you think?
> >
>
> Yes, I think it is a normal graph; simply syzkaller kernels tend to record
> a few times more dependencies than my idle kernel (shown bottom).
>
> Peter, you guessed that the culprit is sysfs at
> https://lkml.kernel.org/r/20200916115057.go2...@hirez.programming.kicks-ass.net ,
> but syzbot reported at
> https://syzkaller.appspot.com/text?tag=MachineInfo=99b8f2b092d9714f
> that "BUG: MAX_LOCKDEP_ENTRIES too low!" can occur on a VM with only 2 CPUs.
> Does your guess catch the culprit?
>
> We could improve a few locks, but on the whole we won't be able to keep
> the sum of individual dependencies under the current threshold. Therefore,
> allowing lockdep's capacity to be tuned, and allowing syzkaller to dump
> when the capacity is reached, is the way to go.
>
>
>
> # cat /proc/lockdep_stats
>  lock-classes: 1236 [max: 8192]
>  direct dependencies:  9610 [max: 32768]
>  indirect dependencies:   40401
>  all direct dependencies:174635
>  dependency chains:   11398 [max: 65536]
>  dependency chain hlocks used:42830 [max: 327680]
>  dependency chain hlocks lost:0
>  in-hardirq chains:  61
>  in-softirq chains: 414
>  in-process chains:   10923
>  stack-trace entries: 93041 [max: 524288]
>  number of stack traces:   4997
>  number of stack hash chains:  4292
>  combined max dependencies:   281074520
>  hardirq-safe locks: 50
>  hardirq-unsafe locks:  805
>  softirq-safe locks:146
>  softirq-unsafe locks:  722
>  irq-safe locks:155
>  irq-unsafe locks:  805
>  hardirq-read-safe locks: 2
>  hardirq-read-unsafe locks: 129
>  softirq-read-safe locks:11
>  softirq-read-unsafe locks: 123
>  irq-read-safe locks:11
>  irq-read-unsafe locks: 129
>  uncategorized locks:   224
>  unused locks:0
>  max locking depth:  15
>  max bfs queue depth:   215
>  chain lookup misses: 11664
>  chain lookup hits:37393935
>  cyclic checks:   11053
>  redundant checks:0
>  redundant links: 0
>  find-mask forwards checks:1588
>  find-mask backwards checks:   1779
>  hardirq on events:17502380
>  hardirq off events:   17502376
>  redundant hardirq ons:   0
>  redundant hardirq offs:  0
>  softirq on events:   90845
>  softirq off events:  90845
>  redundant softirq ons:   0
>  redundant softirq offs:  0
>  debug_locks: 1
>
>  zapped classes:  0
>  zapped lock chains:  0
>  large chain blocks:  1
> # awk ' { if ($2 == "OPS:") print $5" "$9 } ' /proc/lockdep | sort -rV | head -n 30
> 423 (wq_completion)events
> 405 (wq_completion)events_unbound
> 393 >f_pos_lock
> 355 >lock
> 349 sb_writers#3
> 342 sb_writers#6
> 338 >mutex
> 330 (work_completion)(>work)
> 330 pernet_ops_rwsem
> 289 epmutex
> 288 >mtx
> 281 tty_mutex
> 280 >legacy_mutex
> 273 >legacy_mutex/1
> 269 >ldisc_sem
> 268 (wq_completion)ipv6_addrconf
> 266 (work_completion)(&(>dad_work)->work)
> 266 (linkwatch_work).work
> 266 (addr_chk_work).work
> 266 >atomic_read_lock
> 265 (work_completion)(>

Re: BUG: receive list entry not found for dev vxcan1, id 002, mask C00007FF

2020-11-25 Thread Dmitry Vyukov
On Wed, Nov 25, 2020 at 5:04 PM Oliver Hartkopp  wrote:
>
> Hello all,
>
> AFAICS the problems are caused by the WARN() statement here:
>
> https://elixir.bootlin.com/linux/v5.10-rc4/source/net/can/af_can.c#L546
>
> The idea was to check whether CAN protocol implementations work
> correctly on their filter lists.
>
> With fault injection, it seems we're getting a race between
> closing the socket and removing the netdevice.
>
> This seems to be very rare, but it does not break anything.
>
> Would removing the WARN(1) or replacing it with pr_warn() be ok to close
> this issue?

Hi Oliver,

Yes, this is the intended way to deal with this:
https://elixir.bootlin.com/linux/v5.10-rc5/source/include/asm-generic/bug.h#L75

This may also be a good opportunity to add an explanatory comment about
how this should not happen but can.

Thanks for looking into this.




> On 23.11.20 12:58, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:c2e7554e Merge tag 'gfs2-v5.10-rc4-fixes' of git://git.ker..
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=117f03ba50
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=75292221eb79ace2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=381d06e0c8eaacb8706f
> > compiler:   gcc (GCC) 10.1.0-syz 20200507
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+381d06e0c8eaacb87...@syzkaller.appspotmail.com
> >
> > ------------[ cut here ]------------
> > BUG: receive list entry not found for dev vxcan1, id 002, mask C7FF
> > WARNING: CPU: 1 PID: 12946 at net/can/af_can.c:546 can_rx_unregister+0x5a4/0x700 net/can/af_can.c:546
> > Modules linked in:
> > CPU: 1 PID: 12946 Comm: syz-executor.1 Not tainted 5.10.0-rc4-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> > Google 01/01/2011
> > RIP: 0010:can_rx_unregister+0x5a4/0x700 net/can/af_can.c:546
> > Code: 8b 7c 24 78 44 8b 64 24 68 49 c7 c5 20 ac 56 8a e8 01 6c 97 f9 44 89 
> > f9 44 89 e2 4c 89 ee 48 c7 c7 60 ac 56 8a e8 66 af d3 00 <0f> 0b 48 8b 7c 
> > 24 28 e8 b0 25 0f 01 e9 54 fb ff ff e8 26 e0 d8 f9
> > RSP: 0018:c90017e2fb38 EFLAGS: 00010286
> > RAX:  RBX:  RCX: 
> > RDX: 8880147a8000 RSI: 8158f3c5 RDI: f52002fc5f59
> > RBP: 0118 R08: 0001 R09: 8880b9f2011b
> > R10:  R11:  R12: 0002
> > R13: 8880254c R14: 192002fc5f6e R15: c7ff
> > FS:  01ddc940() GS:8880b9f0() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 001b2f121000 CR3: 152c CR4: 001506e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: fffe0ff0 DR7: 0400
> > Call Trace:
> >   isotp_notifier+0x2a7/0x540 net/can/isotp.c:1303
> >   call_netdevice_notifier net/core/dev.c:1735 [inline]
> >   call_netdevice_unregister_notifiers+0x156/0x1c0 net/core/dev.c:1763
> >   call_netdevice_unregister_net_notifiers net/core/dev.c:1791 [inline]
> >   unregister_netdevice_notifier+0xcd/0x170 net/core/dev.c:1870
> >   isotp_release+0x136/0x600 net/can/isotp.c:1011
> >   __sock_release+0xcd/0x280 net/socket.c:596
> >   sock_close+0x18/0x20 net/socket.c:1277
> >   __fput+0x285/0x920 fs/file_table.c:281
> >   task_work_run+0xdd/0x190 kernel/task_work.c:151
> >   tracehook_notify_resume include/linux/tracehook.h:188 [inline]
> >   exit_to_user_mode_loop kernel/entry/common.c:164 [inline]
> >   exit_to_user_mode_prepare+0x17e/0x1a0 kernel/entry/common.c:191
> >   syscall_exit_to_user_mode+0x38/0x260 kernel/entry/common.c:266
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > RIP: 0033:0x417811
> > Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 a4 1a 00 00 c3 48 
> > 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 
> > 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
> > RSP: 002b:0169fbf0 EFLAGS: 0293 ORIG_RAX: 0003
> > RAX:  RBX: 0004 RCX: 00417811
> > RDX:  RSI: 13b7 RDI: 0003
> > RBP: 0001 R08: acabb3b7 R09: acabb3bb
> > R10: 0169fcd0 R11: 0293 R12: 0118c9a0
> > R13: 0118c9a0 R14: 03e8 R15: 0118bf2c
> >
> >
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkal...@googlegroups.com.
> >
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
>

Re: [PATCH] rcu: kasan: record and print kvfree_call_rcu call stack

2020-11-20 Thread Dmitry Vyukov
On Fri, Nov 20, 2020 at 3:34 PM Paul E. McKenney  wrote:
>
> On Fri, Nov 20, 2020 at 09:51:15AM +0100, Dmitry Vyukov wrote:
> > On Thu, Nov 19, 2020 at 10:49 PM Paul E. McKenney wrote:
> > >
> > > On Wed, Nov 18, 2020 at 11:53:09AM +0800, qiang.zh...@windriver.com wrote:
> > > > From: Zqiang 
> > > >
> > > > Add kasan_record_aux_stack function for kvfree_call_rcu function to
> > > > record call stacks.
> > > >
> > > > Signed-off-by: Zqiang 
> > >
> > > Thank you, but this does not apply on the "dev" branch of the -rcu tree.
> > > See file:///home/git/kernel.org/rcutodo.html for more info.
> > >
> > > Adding others on CC who might have feedback on the general approach.
> > >
> > > Thanx, Paul
> > >
> > > > ---
> > > >  kernel/rcu/tree.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index da3414522285..a252b2f0208d 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -3506,7 +3506,7 @@ void kvfree_call_rcu(struct rcu_head *head, 
> > > > rcu_callback_t func)
> > > >   success = true;
> > > >   goto unlock_return;
> > > >   }
> > > > -
> > > > + kasan_record_aux_stack(ptr);
> > > >   success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
> > > >   if (!success) {
> > > >   run_page_cache_worker(krcp);
> >
> > kvfree_call_rcu is intended to free objects, right? If so this is:
>
> True, but mightn't there still be RCU readers referencing this object for
> some time, as in up to the point that the RCU grace period ends?  If so,
> won't adding this cause KASAN to incorrectly complain about those readers?
>
> Or am I missing something here?

kasan_record_aux_stack does not check anything and does not poison the
object for future accesses (it is also called in call_rcu, which does
not necessarily free the object).
It just records the current stack so that it can be shown in reports later.
The problem is that the free stack is pointless for objects freed by
RCU. In such cases we want the call_rcu/kvfree_call_rcu stack in
use-after-free reports.


Re: [PATCH] rcu: kasan: record and print kvfree_call_rcu call stack

2020-11-20 Thread Dmitry Vyukov
On Fri, Nov 20, 2020 at 12:59 PM Uladzislau Rezki  wrote:
>
> On Thu, Nov 19, 2020 at 01:49:34PM -0800, Paul E. McKenney wrote:
> > On Wed, Nov 18, 2020 at 11:53:09AM +0800, qiang.zh...@windriver.com wrote:
> > > From: Zqiang 
> > >
> > > Add kasan_record_aux_stack function for kvfree_call_rcu function to
> > > record call stacks.
> > >
> > > Signed-off-by: Zqiang 
> >
> > Thank you, but this does not apply on the "dev" branch of the -rcu tree.
> > See file:///home/git/kernel.org/rcutodo.html for more info.
> >
> > Adding others on CC who might have feedback on the general approach.
> >
> >   Thanx, Paul
> >
> > > ---
> > >  kernel/rcu/tree.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index da3414522285..a252b2f0208d 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -3506,7 +3506,7 @@ void kvfree_call_rcu(struct rcu_head *head, 
> > > rcu_callback_t func)
> > > success = true;
> > > goto unlock_return;
> > > }
> > > -
> > > +   kasan_record_aux_stack(ptr);
> Is it safe to invoke it on a vmalloc'ed pointer?

Yes, kasan_record_aux_stack should figure it out itself.
We call kasan_record_aux_stack on call_rcu as well, and rcu structs
can be anywhere.
See:
https://elixir.bootlin.com/linux/v5.10-rc4/source/mm/kasan/generic.c#L335


Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-20 Thread Dmitry Vyukov
On Fri, Nov 20, 2020 at 10:22 AM Dmitry Vyukov  wrote:
>
> On Thu, Nov 19, 2020 at 7:08 PM Dmitry Vyukov  wrote:
> > > > > On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
> > > > >  wrote:
> > > > > >
> > > > > > On 2020/11/19 22:06, Dmitry Vyukov wrote:
> > > > > > >>>>
> > > > > > > >>>> I am trying to reproduce this locally first. syzbot claims it can
> > > > > > > >>>> reproduce it with a number of much simpler reproducers (like spawn
> > > > > > > >>>> process, unshare, create socket):
> > > > > > >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > > > > > >>>>
> > > > > > >>>> I see a very slow drift, but it's very slow, so get only to:
> > > > > > >>>>  direct dependencies: 22072 [max: 32768]
> > > > > > >>>>
> > > > > > >>>> But that's running a very uniform workload.
> > > > > > >>>>
> > > > > > > >>>> However when I tried to cat /proc/lockdep to see if there is anything
> > > > > > > >>>> fishy already,
> > > > > > >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> > > > > > >>>>
> > > > > > >>>> Some missing locks?
> > > > > >
> > > > > > Not a TOMOYO's bug. Maybe a lockdep's bug.
> > > > > >
> > > > > > >
> > > > > > > But I don't know if it's enough to explain the overflow or not...
> > > > > > >
> > > > > >
> > > > > > > Since you can't hit the limit locally, I guess we need to ask syzbot to
> > > > > > > run massive testcases.
> > > > >
> > > > > > I am trying to test the code that will do this. Otherwise we will get
> > > > > > days-long round-trips for stupid bugs. These files are also quite
> > > > > > huge; I'm afraid they may not fit into storage.
> > > > >
> > > > > So far I get to at most:
> > > > >
> > > > >  lock-classes: 2901 [max: 8192]
> > > > >  direct dependencies: 25574 [max: 32768]
> > > > >  dependency chains:   40605 [max: 65536]
> > > > >  dependency chain hlocks used:   176814 [max: 327680]
> > > > >  stack-trace entries:258590 [max: 524288]
> > > > >
> > > > > with these worst offenders:
> > > > >
> > > > > # egrep "BD: [0-9]" /proc/lockdep
> > > > > df5b6792 FD:2 BD: 1235 -.-.: _hash[i].lock
> > > > > 5dfeb73c FD:1 BD: 1236 ..-.: pool_lock
> > > > > b86254b1 FD:   14 BD:  -.-.: >lock
> > > > > 866efb75 FD:1 BD: 1112 : _b->lock
> > > > > 6970cf1a FD:2 BD: 1126 : tk_core.seq.seqcount
> > > > > f49d95b0 FD:3 BD: 1180 -.-.: >lock
> > > > > ba3f8454 FD:5 BD: 1115 -.-.: hrtimer_bases.lock
> > > > > fb340f16 FD:   16 BD: 1030 -.-.: >pi_lock
> > > > > > c9f6f58c FD:1 BD: 1114 -.-.: _cpu_ptr(group->pcpu, cpu)->seq
> > > > > 49d3998c FD:1 BD: 1112 -.-.: _rq->removed.lock
> > > > > fdf7f396 FD:7 BD: 1112 -...: _b->rt_runtime_lock
> > > > > 21aedb8d FD:1 BD: 1113 -...: _rq->rt_runtime_lock
> > > > > 4e34c8d4 FD:1 BD: 1112 : >lock
> > > > > b2ac5d96 FD:1 BD: 1127 -.-.: pvclock_gtod_data
> > > > > c5df4dc3 FD:1 BD: 1031 ..-.: >delays->lock
> > > > > > fe623698 FD:1 BD: 1112 -...: per_cpu_ptr(_rstat_cpu_lock, cpu)
> > > > >
> > > > >
> > > > > But the kernel continues to crash on different unrelated bugs...
> > > >
> > > >
> > > > Here is one successful sample. How do we debug it? What should we be
> > > > looking for?
> > >

Re: [PATCH] rcu: kasan: record and print kvfree_call_rcu call stack

2020-11-20 Thread Dmitry Vyukov
On Thu, Nov 19, 2020 at 10:49 PM Paul E. McKenney  wrote:
>
> On Wed, Nov 18, 2020 at 11:53:09AM +0800, qiang.zh...@windriver.com wrote:
> > From: Zqiang 
> >
> > Add kasan_record_aux_stack function for kvfree_call_rcu function to
> > record call stacks.
> >
> > Signed-off-by: Zqiang 
>
> Thank you, but this does not apply on the "dev" branch of the -rcu tree.
> See file:///home/git/kernel.org/rcutodo.html for more info.
>
> Adding others on CC who might have feedback on the general approach.
>
> Thanx, Paul
>
> > ---
> >  kernel/rcu/tree.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index da3414522285..a252b2f0208d 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3506,7 +3506,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> >   success = true;
> >   goto unlock_return;
> >   }
> > -
> > + kasan_record_aux_stack(ptr);
> >   success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
> >   if (!success) {
> >   run_page_cache_worker(krcp);


kvfree_call_rcu is intended to free objects, right? If so this is:

Acked-by: Dmitry Vyukov 


Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-19 Thread Dmitry Vyukov
On Thu, Nov 19, 2020 at 3:30 PM Dmitry Vyukov  wrote:
> >
> > On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
> >  wrote:
> > >
> > > On 2020/11/19 22:06, Dmitry Vyukov wrote:
> > > >>>>
> > > > >>>> I am trying to reproduce this locally first. syzbot claims it can
> > > > >>>> reproduce it with a number of much simpler reproducers (like spawn
> > > > >>>> process, unshare, create socket):
> > > >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > > >>>>
> > > >>>> I see a very slow drift, but it's very slow, so get only to:
> > > >>>>  direct dependencies: 22072 [max: 32768]
> > > >>>>
> > > >>>> But that's running a very uniform workload.
> > > >>>>
> > > >>>> However when I tried to cat /proc/lockdep to see if there is anything
> > > >>>> fishy already,
> > > >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> > > >>>>
> > > >>>> Some missing locks?
> > >
> > > Not a TOMOYO's bug. Maybe a lockdep's bug.
> > >
> > > >
> > > > But I don't know if it's enough to explain the overflow or not...
> > > >
> > >
> > > Since you can't hit the limit locally, I guess we need to ask syzbot to
> > > run massive testcases.
> >
> > I am trying to test the code that will do this. Otherwise we will get
> > days-long round-trips for stupid bugs. These files are also quite
> > huge; I'm afraid they may not fit into storage.
> >
> > So far I get to at most:
> >
> >  lock-classes: 2901 [max: 8192]
> >  direct dependencies: 25574 [max: 32768]
> >  dependency chains:   40605 [max: 65536]
> >  dependency chain hlocks used:   176814 [max: 327680]
> >  stack-trace entries:258590 [max: 524288]
> >
> > with these worst offenders:
> >
> > # egrep "BD: [0-9]" /proc/lockdep
> > df5b6792 FD:2 BD: 1235 -.-.: _hash[i].lock
> > 5dfeb73c FD:1 BD: 1236 ..-.: pool_lock
> > b86254b1 FD:   14 BD:  -.-.: >lock
> > 866efb75 FD:1 BD: 1112 : _b->lock
> > 6970cf1a FD:2 BD: 1126 : tk_core.seq.seqcount
> > f49d95b0 FD:3 BD: 1180 -.-.: >lock
> > ba3f8454 FD:5 BD: 1115 -.-.: hrtimer_bases.lock
> > fb340f16 FD:   16 BD: 1030 -.-.: >pi_lock
> > c9f6f58c FD:1 BD: 1114 -.-.: _cpu_ptr(group->pcpu, cpu)->seq
> > 49d3998c FD:1 BD: 1112 -.-.: _rq->removed.lock
> > fdf7f396 FD:7 BD: 1112 -...: _b->rt_runtime_lock
> > 21aedb8d FD:1 BD: 1113 -...: _rq->rt_runtime_lock
> > 4e34c8d4 FD:1 BD: 1112 : >lock
> > b2ac5d96 FD:1 BD: 1127 -.-.: pvclock_gtod_data
> > c5df4dc3 FD:1 BD: 1031 ..-.: >delays->lock
> > fe623698 FD:1 BD: 1112 -...: per_cpu_ptr(_rstat_cpu_lock, cpu)
> >
> >
> > But the kernel continues to crash on different unrelated bugs...
>
>
> Here is one successful sample. How do we debug it? What should we be
> looking for?
>
> p.s. it's indeed huge, full log was 11MB, this probably won't be
> chewed by syzbot.
> Peter, are these [hex numbers] needed? Could we strip them during
> post-processing? At first sight they look like derivatives of the
> name.

The worst back-edge offenders are:

b445a595 FD:2 BD: 1595 -.-.: _hash[i].lock
55ae0468 FD:1 BD: 1596 ..-.: pool_lock
b1336dc4 FD:2 BD: 1002 ..-.: >lock
9a0cabce FD:1 BD: 1042 ...-: &s->seqcount
1f2849b5 FD:1 BD: 1192 ..-.: depot_lock
d044255b FD:1 BD: 1038 -.-.: >list_lock
5868699e FD:   17 BD: 1447 -.-.: >lock
bb52ab59 FD:1 BD: 1448 : _b->lock
4f442fff FD:2 BD: 1469 : tk_core.seq.seqcount
c908cc32 FD:3 BD: 1512 -.-.: >lock
478677cc FD:5 BD: 1435 -.-.: hrtimer_bases.lock
b5b65cb1 FD:   19 BD: 1255 -.-.: >pi_lock
7f313bd5 FD:1 BD: 1451 -.-.: _cpu_ptr(group->pcpu, cpu)->seq
bac5d8ed FD:1 BD: 1004 ...-: &s->seqcount#2
0f57e411 FD:1 BD: 1448 -.-.: _rq->removed.lock
13c1ab65 FD:7 BD: 1449 -.-.: _b->rt_runtime_lock
3bdf78f4 FD:1 BD: 1450 -.-.: _rq->rt_runtime_lock
975d5b80 FD:1 BD: 1448 : >lock
2586e81b FD:1 BD: 1471 -.-.: pvclock_gtod_data
d03aed24 FD:1 BD: 1275 ..-.: >delays->lock
1119414f FD:1 BD: 1448 -...: per_cpu_ptr(_rstat_cpu_lock, cpu)
6f3d793b FD:6 BD: 1449 -.-.: >lock
f3f0190c FD:9 BD: 1448 -...: >lock/1
7410cf1a FD:1 BD: 1448 -...: >rto_lock

There are 19 with ~1500 incoming edges. So that's 20K.

In my local testing I was at around 20-something K and these worst
offenders were at ~1000 back edges.
Now they got to 1500, so that is what got us over the 32K limit, right?

Does this analysis make sense?

Any ideas what to do with these?
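
For anyone who wants to repeat this ranking on a saved dump, here is a minimal post-processing sketch in Python. It assumes the line layout shown above (optional hex key, `FD:`, `BD:`, usage flags, class name); the function names are made up for illustration and are not part of syzkaller or lockdep. Lines whose BD value is missing are skipped rather than guessed. It only sorts and filters what /proc/lockdep prints; whether the BD figures add up against the "direct dependencies" limit is exactly the open question above, and the script does not answer it.

```python
import re

# One class line as quoted in this thread, e.g.
#   "55ae0468 FD:1 BD: 1596 ..-.: pool_lock"
# search() is used so a leading hex key or "> " quote prefix is tolerated.
LINE_RE = re.compile(r"FD:\s*(\d+)\s+BD:\s*(\d+)\s+[^:\s]*:\s*(.*)$")


def parse_lockdep(text):
    """Return (name, fd, bd) for every line that has numeric FD and BD."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            name = m.group(3).strip() or "<unnamed>"
            rows.append((name, int(m.group(1)), int(m.group(2))))
    return rows


def worst_offenders(text, min_bd=1000):
    """Classes with at least min_bd backward dependencies, largest first."""
    rows = [r for r in parse_lockdep(text) if r[2] >= min_bd]
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Three lines from the list above, plus one line (from an earlier mail in
# this thread) whose BD value was lost; that line is ignored.
SAMPLE = """\
b445a595 FD:2 BD: 1595 -.-.: _hash[i].lock
55ae0468 FD:1 BD: 1596 ..-.: pool_lock
b1336dc4 FD:2 BD: 1002 ..-.: >lock
b86254b1 FD:   14 BD:  -.-.: >lock
"""

if __name__ == "__main__":
    for name, fd, bd in worst_offenders(SAMPLE):
        print(f"{bd:6d} {fd:4d} {name}")
```

On a real dump one would pass the contents of /proc/lockdep instead of `SAMPLE`.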


Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-19 Thread Dmitry Vyukov
On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
 wrote:
>
> On 2020/11/19 22:06, Dmitry Vyukov wrote:
> >>>>
> >>>> I am trying to reproduce this locally first. syzbot claims it can
> >>>> reproduce it with a number of much simpler reproducers (like spawn
> >>>> process, unshare, create socket):
> >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> >>>>
> >>>> I see a very slow drift, but it's very slow, so get only to:
> >>>>  direct dependencies: 22072 [max: 32768]
> >>>>
> >>>> But that's running a very uniform workload.
> >>>>
> >>>> However when I tried to cat /proc/lockdep to see if there is anything
> >>>> fishy already,
> >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> >>>>
> >>>> Some missing locks?
>
> Not a TOMOYO's bug. Maybe a lockdep's bug.
>
> >
> > But I don't know if it's enough to explain the overflow or not...
> >
>
> Since you can't hit the limit locally, I guess we need to ask syzbot to
> run massive testcases.

I am trying to test the code that will do this. Otherwise we will get
days-long round-trips for stupid bugs. These files are also quite
huge; I'm afraid they may not fit into storage.

So far I get to at most:

 lock-classes: 2901 [max: 8192]
 direct dependencies: 25574 [max: 32768]
 dependency chains:   40605 [max: 65536]
 dependency chain hlocks used:   176814 [max: 327680]
 stack-trace entries:258590 [max: 524288]

with these worst offenders:

# egrep "BD: [0-9]" /proc/lockdep
df5b6792 FD:2 BD: 1235 -.-.: _hash[i].lock
5dfeb73c FD:1 BD: 1236 ..-.: pool_lock
b86254b1 FD:   14 BD:  -.-.: >lock
866efb75 FD:1 BD: 1112 : _b->lock
6970cf1a FD:2 BD: 1126 : tk_core.seq.seqcount
f49d95b0 FD:3 BD: 1180 -.-.: >lock
ba3f8454 FD:5 BD: 1115 -.-.: hrtimer_bases.lock
fb340f16 FD:   16 BD: 1030 -.-.: >pi_lock
c9f6f58c FD:1 BD: 1114 -.-.: _cpu_ptr(group->pcpu, cpu)->seq
49d3998c FD:1 BD: 1112 -.-.: _rq->removed.lock
fdf7f396 FD:7 BD: 1112 -...: _b->rt_runtime_lock
21aedb8d FD:1 BD: 1113 -...: _rq->rt_runtime_lock
4e34c8d4 FD:1 BD: 1112 : >lock
b2ac5d96 FD:1 BD: 1127 -.-.: pvclock_gtod_data
c5df4dc3 FD:1 BD: 1031 ..-.: >delays->lock
fe623698 FD:1 BD: 1112 -...: per_cpu_ptr(_rstat_cpu_lock, cpu)


But the kernel continues to crash on different unrelated bugs...
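
To see at a glance which table is closest to its limit, the counters above can be turned into percentages. A small sketch, assuming only the `name: used [max: limit]` layout quoted in this mail; the helper names are hypothetical.

```python
import re

# Matches e.g. " direct dependencies: 25574 [max: 32768]" and also
# "stack-trace entries:258590 [max: 524288]" (no space after the colon).
STAT_RE = re.compile(r"^\W*([A-Za-z][\w\- ]*?):\s*(\d+)\s*\[max:\s*(\d+)\]")


def utilization(text):
    """Map counter name -> (used, limit, percent used)."""
    stats = {}
    for line in text.splitlines():
        m = STAT_RE.search(line)
        if m:
            used, limit = int(m.group(2)), int(m.group(3))
            stats[m.group(1).strip()] = (used, limit, 100.0 * used / limit)
    return stats


def closest_to_limit(stats):
    """Name of the counter with the highest percent used."""
    return max(stats, key=lambda name: stats[name][2])


# The figures quoted above.
SAMPLE = """\
 lock-classes: 2901 [max: 8192]
 direct dependencies: 25574 [max: 32768]
 dependency chains:   40605 [max: 65536]
 dependency chain hlocks used:   176814 [max: 327680]
 stack-trace entries:258590 [max: 524288]
"""

if __name__ == "__main__":
    stats = utilization(SAMPLE)
    for name, (used, limit, pct) in stats.items():
        print(f"{name:30s} {used:8d} / {limit:8d}  {pct:5.1f}%")
    print("closest to limit:", closest_to_limit(stats))
```

With these numbers "direct dependencies" is the fullest table, which matches the table that overflows on syzbot.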


Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-19 Thread Dmitry Vyukov
On Thu, Nov 19, 2020 at 1:49 PM Dmitry Vyukov  wrote:
>
> On Thu, Nov 19, 2020 at 1:43 PM Dmitry Vyukov  wrote:
> > > On Wed, Nov 18, 2020 at 4:32 PM Tetsuo Handa
> > >  wrote:
> > > >
> > > > On 2020/11/19 0:10, Peter Zijlstra wrote:
> > > > > On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
> > > > > >> The problem is that we can't know what exactly is consuming these resources.
> > > > > >> My question is do you have a plan to make it possible to know what exactly is
> > > > > >> consuming these resources.
> > > > >
> > > > > I'm pretty sure it's in /proc/lockdep* somewhere.
> > > >
> > > > OK. Then...
> > > >
> > > > > Dmitry, can you update syzkaller to dump /proc/lockdep* before terminating as
> > > > > a crash as soon as encountering one of
> > > >
> > > >   BUG: MAX_LOCKDEP_ENTRIES too low!
> > > >   BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > >   BUG: MAX_LOCKDEP_CHAINS too low!
> > > >   BUG: MAX_LOCKDEP_KEYS too low!
> > > >   WARNING in print_bfs_bug
> > > >
> > > > messages?
> > > >
> > > > On 2020/09/16 21:14, Dmitry Vyukov wrote:
> > > > > On Wed, Sep 16, 2020 at 1:51 PM  wrote:
> > > > >>
> > > > >> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> > > > >>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> > > > >>>  wrote:
> > > > >>>>
> > > > >>>> Hello. Can we apply this patch?
> > > > >>>>
> > > > > >>>> This patch addresses top crashers for syzbot, and applying this patch
> > > > > >>>> will help utilizing syzbot's resource for finding other bugs.
> > > > >>>
> > > > >>> Acked-by: Dmitry Vyukov 
> > > > >>>
> > > > >>> Peter, do you still have concerns with this?
> > > > >>
> > > > >> Yeah, I still hate it with a passion; it discourages thinking. A bad
> > > > >> annotation that blows up the lockdep storage, no worries, we'll just
> > > > >> increase this :/
> > > > >>
> > > > >> IIRC the issue with syzbot is that the current sysfs annotation is
> > > > >> pretty terrible and generates a gazillion classes, and syzbot likes
> > > > >> poking at /sys a lot and thus floods the system.
> > > > >>
> > > > > >> I don't know enough about sysfs to suggest an alternative, and haven't
> > > > > >> exactly had spare time to look into it either :/
> > > > > >>
> > > > > >> Examples of bad annotations is getting every CPU a separate class, that
> > > > > >> leads to nr_cpus! chains if CPUs arbitrarily nest (nr_cpus^2 if there's
> > > > > >> only a single nesting level).
> > > > >
> > > > > Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> > > > > sort and show existing chains so that it's possible to identify if
> > > > > there are any worst offenders and who they are.
> > > > >
> > > > > Currently we only have a hypothesis that there are some worst
> > > > > offenders vs lots of normal load. And we can't point fingers which
> > > > > means that, say, sysfs, or other maintainers won't be too inclined to
> > > > > fix anything.
> > > > >
> > > > > If we would know for sure that lock class X is guilty. That would make
> > > > > the situation much more actionable.
> > >
> > > I am trying to reproduce this locally first. syzbot claims it can
> > > reproduce it with a number of much simpler reproducers (like spawn
> > > process, unshare, create socket):
> > > https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > >
> > > I see a very slow drift, but it's very slow, so get only to:
> > >  direct dependencies: 22072 [max: 32768]
> > >
> > > But that's running a very uniform workload.
> > >
> > > However when I tried to cat /proc/lockdep to see if there is anything
>

Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-19 Thread Dmitry Vyukov
On Thu, Nov 19, 2020 at 1:43 PM Dmitry Vyukov  wrote:
> > On Wed, Nov 18, 2020 at 4:32 PM Tetsuo Handa
> >  wrote:
> > >
> > > On 2020/11/19 0:10, Peter Zijlstra wrote:
> > > > On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
> > > > >> The problem is that we can't know what exactly is consuming these resources.
> > > > >> My question is do you have a plan to make it possible to know what exactly is
> > > > >> consuming these resources.
> > > >
> > > > I'm pretty sure it's in /proc/lockdep* somewhere.
> > >
> > > OK. Then...
> > >
> > > > Dmitry, can you update syzkaller to dump /proc/lockdep* before terminating as
> > > > a crash as soon as encountering one of
> > >
> > >   BUG: MAX_LOCKDEP_ENTRIES too low!
> > >   BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > >   BUG: MAX_LOCKDEP_CHAINS too low!
> > >   BUG: MAX_LOCKDEP_KEYS too low!
> > >   WARNING in print_bfs_bug
> > >
> > > messages?
> > >
> > > On 2020/09/16 21:14, Dmitry Vyukov wrote:
> > > > On Wed, Sep 16, 2020 at 1:51 PM  wrote:
> > > >>
> > > >> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> > > >>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> > > >>>  wrote:
> > > >>>>
> > > >>>> Hello. Can we apply this patch?
> > > >>>>
> > > >>>> This patch addresses top crashers for syzbot, and applying this patch
> > > >>>> will help utilizing syzbot's resource for finding other bugs.
> > > >>>
> > > >>> Acked-by: Dmitry Vyukov 
> > > >>>
> > > >>> Peter, do you still have concerns with this?
> > > >>
> > > >> Yeah, I still hate it with a passion; it discourages thinking. A bad
> > > >> annotation that blows up the lockdep storage, no worries, we'll just
> > > >> increase this :/
> > > >>
> > > >> IIRC the issue with syzbot is that the current sysfs annotation is
> > > >> pretty terrible and generates a gazillion classes, and syzbot likes
> > > >> poking at /sys a lot and thus floods the system.
> > > >>
> > > >> I don't know enough about sysfs to suggest an alternative, and haven't
> > > >> exactly had spare time to look into it either :/
> > > >>
> > > >> Examples of bad annotations is getting every CPU a separate class, that
> > > >> leads to nr_cpus! chains if CPUs arbitrarily nest (nr_cpus^2 if there's
> > > >> only a single nesting level).
> > > >
> > > > Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> > > > sort and show existing chains so that it's possible to identify if
> > > > there are any worst offenders and who they are.
> > > >
> > > > Currently we only have a hypothesis that there are some worst
> > > > offenders vs lots of normal load. And we can't point fingers which
> > > > means that, say, sysfs, or other maintainers won't be too inclined to
> > > > fix anything.
> > > >
> > > > If we would know for sure that lock class X is guilty. That would make
> > > > the situation much more actionable.
> >
> > I am trying to reproduce this locally first. syzbot claims it can
> > reproduce it with a number of much simpler reproducers (like spawn
> > process, unshare, create socket):
> > https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> >
> > I see a very slow drift, but it's very slow, so get only to:
> >  direct dependencies: 22072 [max: 32768]
> >
> > But that's running a very uniform workload.
> >
> > However when I tried to cat /proc/lockdep to see if there is anything
> > fishy already,
> > I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> >
> > Some missing locks?
> >
> > ==
> > BUG: KASAN: use-after-free in string_nocheck lib/vsprintf.c:611 [inline]
> > BUG: KASAN: use-after-free in string+0x39c/0x3d0 lib/vsprintf.c:693
> > Read of size 1 at addr 888295833740 by task less/1855
> >
> > CPU: 0 PID: 1855 Comm: less Tainted: GW 5.10.0-rc4+ #68
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2

Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-19 Thread Dmitry Vyukov
On Thu, Nov 19, 2020 at 1:33 PM Dmitry Vyukov  wrote:
>
> On Wed, Nov 18, 2020 at 4:32 PM Tetsuo Handa
>  wrote:
> >
> > On 2020/11/19 0:10, Peter Zijlstra wrote:
> > > On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
> > > >> The problem is that we can't know what exactly is consuming these resources.
> > > >> My question is do you have a plan to make it possible to know what exactly is
> > > >> consuming these resources.
> > >
> > > I'm pretty sure it's in /proc/lockdep* somewhere.
> >
> > OK. Then...
> >
> > > Dmitry, can you update syzkaller to dump /proc/lockdep* before terminating as
> > > a crash as soon as encountering one of
> >
> >   BUG: MAX_LOCKDEP_ENTRIES too low!
> >   BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> >   BUG: MAX_LOCKDEP_CHAINS too low!
> >   BUG: MAX_LOCKDEP_KEYS too low!
> >   WARNING in print_bfs_bug
> >
> > messages?
> >
> > On 2020/09/16 21:14, Dmitry Vyukov wrote:
> > > On Wed, Sep 16, 2020 at 1:51 PM  wrote:
> > >>
> > >> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> > >>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> > >>>  wrote:
> > >>>>
> > >>>> Hello. Can we apply this patch?
> > >>>>
> > >>>> This patch addresses top crashers for syzbot, and applying this patch
> > >>>> will help utilizing syzbot's resource for finding other bugs.
> > >>>
> > >>> Acked-by: Dmitry Vyukov 
> > >>>
> > >>> Peter, do you still have concerns with this?
> > >>
> > >> Yeah, I still hate it with a passion; it discourages thinking. A bad
> > >> annotation that blows up the lockdep storage, no worries, we'll just
> > >> increase this :/
> > >>
> > >> IIRC the issue with syzbot is that the current sysfs annotation is
> > >> pretty terrible and generates a gazillion classes, and syzbot likes
> > >> poking at /sys a lot and thus floods the system.
> > >>
> > >> I don't know enough about sysfs to suggest an alternative, and haven't
> > >> exactly had spare time to look into it either :/
> > >>
> > >> Examples of bad annotations is getting every CPU a separate class, that
> > >> leads to nr_cpus! chains if CPUs arbitrarily nest (nr_cpus^2 if there's
> > >> only a single nesting level).
> > >
> > > Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> > > sort and show existing chains so that it's possible to identify if
> > > there are any worst offenders and who they are.
> > >
> > > Currently we only have a hypothesis that there are some worst
> > > offenders vs lots of normal load. And we can't point fingers which
> > > means that, say, sysfs, or other maintainers won't be too inclined to
> > > fix anything.
> > >
> > > If we would know for sure that lock class X is guilty. That would make
> > > the situation much more actionable.
>
> I am trying to reproduce this locally first. syzbot claims it can
> reproduce it with a number of much simpler reproducers (like spawn
> process, unshare, create socket):
> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
>
> I see a very slow drift, but it's very slow, so get only to:
>  direct dependencies: 22072 [max: 32768]
>
> But that's running a very uniform workload.
>
> However when I tried to cat /proc/lockdep to see if there is anything
> fishy already,
> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
>
> Some missing locks?
>
> ==
> BUG: KASAN: use-after-free in string_nocheck lib/vsprintf.c:611 [inline]
> BUG: KASAN: use-after-free in string+0x39c/0x3d0 lib/vsprintf.c:693
> Read of size 1 at addr 888295833740 by task less/1855
>
> CPU: 0 PID: 1855 Comm: less Tainted: GW 5.10.0-rc4+ #68
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  print_address_description.constprop.0.cold+0xae/0x4c8 mm/kasan/report.c:385
>  __kasan_report mm/kasan/report.c:545 [inline]
>  kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
>  string_nocheck lib/vsprintf.c:611 [inline]
>  string+0x39c/0x3d0 lib/vspri

Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.

2020-11-19 Thread Dmitry Vyukov
On Wed, Nov 18, 2020 at 4:32 PM Tetsuo Handa
 wrote:
>
> On 2020/11/19 0:10, Peter Zijlstra wrote:
> > On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
> >> The problem is that we can't know what exactly is consuming these resources.
> >> My question is do you have a plan to make it possible to know what exactly is
> >> consuming these resources.
> >
> > I'm pretty sure it's in /proc/lockdep* somewhere.
>
> OK. Then...
>
> Dmitry, can you update syzkaller to dump /proc/lockdep* before terminating as
> a crash as soon as encountering one of
>
>   BUG: MAX_LOCKDEP_ENTRIES too low!
>   BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
>   BUG: MAX_LOCKDEP_CHAINS too low!
>   BUG: MAX_LOCKDEP_KEYS too low!
>   WARNING in print_bfs_bug
>
> messages?
>
> On 2020/09/16 21:14, Dmitry Vyukov wrote:
> > On Wed, Sep 16, 2020 at 1:51 PM  wrote:
> >>
> >> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> >>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> >>>  wrote:
> >>>>
> >>>> Hello. Can we apply this patch?
> >>>>
> >>>> This patch addresses top crashers for syzbot, and applying this patch
> >>>> will help utilizing syzbot's resource for finding other bugs.
> >>>
> >>> Acked-by: Dmitry Vyukov 
> >>>
> >>> Peter, do you still have concerns with this?
> >>
> >> Yeah, I still hate it with a passion; it discourages thinking. A bad
> >> annotation that blows up the lockdep storage, no worries, we'll just
> >> increase this :/
> >>
> >> IIRC the issue with syzbot is that the current sysfs annotation is
> >> pretty terrible and generates a gazillion classes, and syzbot likes
> >> poking at /sys a lot and thus floods the system.
> >>
> >> I don't know enough about sysfs to suggest an alternative, and haven't
> >> exactly had spare time to look into it either :/
> >>
> >> Examples of bad annotations is getting every CPU a separate class, that
> >> leads to nr_cpus! chains if CPUs arbitrarily nest (nr_cpus^2 if there's
> >> only a single nesting level).
> >
> > Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> > sort and show existing chains so that it's possible to identify if
> > there are any worst offenders and who they are.
> >
> > Currently we only have a hypothesis that there are some worst
> > offenders vs lots of normal load. And we can't point fingers which
> > means that, say, sysfs, or other maintainers won't be too inclined to
> > fix anything.
> >
> > If we would know for sure that lock class X is guilty. That would make
> > the situation much more actionable.
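
For a sense of scale of the per-CPU-class example quoted above, the two growth rates can be compared with plain arithmetic. This is only a sketch on the quoted figures (nr_cpus! versus nr_cpus^2), set against the 65536 chain limit reported in the lockdep stats elsewhere in this thread. It is not a model of how lockdep actually accounts chains, and the function names are made up.

```python
import math


def chain_bounds(nr_cpus):
    """(arbitrary nesting, single nesting level) per the quoted estimate."""
    return math.factorial(nr_cpus), nr_cpus ** 2


def first_overflow(limit=65536):
    """Smallest nr_cpus for which nr_cpus! exceeds the given limit."""
    n = 1
    while math.factorial(n) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        arbitrary, single = chain_bounds(n)
        print(f"nr_cpus={n:2d}  n!={arbitrary}  n^2={single}")
    print("n! first exceeds 65536 at nr_cpus =", first_overflow())
```

The point of the comparison is that with arbitrary nesting a single badly annotated per-CPU lock could exhaust the chain table on a fairly small machine, while the single-level case stays small for realistic CPU counts.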

I am trying to reproduce this locally first. syzbot claims it can
reproduce it with a number of much simpler reproducers (like spawn
process, unshare, create socket):
https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc

I see a very slow drift, but it's very slow, so get only to:
 direct dependencies: 22072 [max: 32768]

But that's running a very uniform workload.

However when I tried to cat /proc/lockdep to see if there is anything
fishy already,
I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).

Some missing locks?

==
BUG: KASAN: use-after-free in string_nocheck lib/vsprintf.c:611 [inline]
BUG: KASAN: use-after-free in string+0x39c/0x3d0 lib/vsprintf.c:693
Read of size 1 at addr 888295833740 by task less/1855

CPU: 0 PID: 1855 Comm: less Tainted: GW 5.10.0-rc4+ #68
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x4c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 string_nocheck lib/vsprintf.c:611 [inline]
 string+0x39c/0x3d0 lib/vsprintf.c:693
 vsnprintf+0x71b/0x14f0 lib/vsprintf.c:2618
 seq_vprintf fs/seq_file.c:398 [inline]
 seq_printf+0x195/0x240 fs/seq_file.c:413
 print_name+0x98/0x1d0 kernel/locking/lockdep_proc.c:50
 l_show+0x111/0x2c0 kernel/locking/lockdep_proc.c:82
 seq_read_iter+0xae4/0x10c0 fs/seq_file.c:268
 proc_reg_read_iter+0x1fb/0x2d0 fs/proc/inode.c:310
 call_read_iter include/linux/fs.h:1897 [inline]
 new_sync_read+0x41e/0x6e0 fs/read_write.c:415
 vfs_read+0x35c/0x570 fs/read_write.c:496
 ksys_read+0x12d/0x250 fs/read_write.c:634
 do_syscall_64+0x2d/0x70 arch/x86/en

Re: Collecting both remote and "local" coverage with KCOV

2020-11-18 Thread Dmitry Vyukov
On Wed, Nov 18, 2020 at 3:21 AM Alexander Bulekov  wrote:
>
> On 201116 1805, Andrey Konovalov wrote:
> > On Mon, Nov 16, 2020 at 9:35 AM Dmitry Vyukov  wrote:
> > >
> > > On Mon, Nov 16, 2020 at 3:39 AM Alexander Bulekov  wrote:
> > > >
> > > > Hello,
> > > > I'm trying to collect coverage over the syscalls issued by my process,
> > > > as well as the kthreads spawned as a result of these syscalls
> > > > (eg coverage over vhost ioctls and the worker kthread). Is there a way
> > > > to collect coverage with both KCOV_REMOTE_ENABLE(with common_handle) and
> > > > KCOV_ENABLE, simultaneously?
> > > >
> > > > Based on the code it seems that these two modes are mutually
> > > > exclusive within a single task, but I don't think this is mentioned in
> > > > the Documentation, so I want to make sure I'm not missing something.
> > >
> > > Hi Alex,
> > >
> > > Yes, it's probably not supported within a single task. The easiest way
> > > to verify is to try it ;)
> > >
> > > It is possible to collect both coverages, but you will need 2 threads
> > > (one just to set up remote KCOV).
> > >
> > > Unless I am missing any fundamental limitations, I would say it would
> > > be reasonable to support this within a single task as well.
> >
> > I think the reason we did that initially, is because we don't care
> > about normal coverage for USB emitting pseudo-syscalls. Filed a bug
> > for this: https://bugzilla.kernel.org/show_bug.cgi?id=210225
>
> I'm interested in adding support for this. Looking through the code, I
> can think of ~two approaches:
>
> 1.) Allow the same kcov fd to be used to track coverage with both
> KCOV_REMOTE_ENABLE and KCOV_ENABLE. If we try to use the same coverage
> bitmap for both the remote and the local coverage, I think the local
> part would have to deal with the kcov_remote_lock. If the local part
> continues to write directly into the user-space coverage-area, as it
> does now, it seems it would require locking for each __sanitizer_cov
> call.  Alternatively, the local and the remote parts could write into
> different coverage-bitmaps, but I'm not sure if there is a neat way to
> do this.

This has 2 problems:
 - performance (__sanitizer_cov is by far the most performance
critical part of kernel with KCOV=y)
 - recursion: locks are also traced, so we can't really
call anything there

> 2.) Allow multiple kcov fds to be used by the same task. In the task,
> keep a linked-list of pointers to kcov objects (remote or local). For
> each __sanitizer_... call, walk the linked list and check if any of the
> kcov objects match the requirements (trace_cmp/trace_pc/remote). This
> would also have the side-effect of enabling simultaneous PC and CMP
> tracing. Of course, it seems that this would add some overhead (in the
> case of a single open fd, there would be extra pointer dereferences to
> get the area[], size, etc).

Walking a linked list in __sanitizer_... has the same performance
problems, but I think we don't really need to do it.
Assuming we have at most 1 KCOV that traces the task itself we can
continue keeping it cached in task_struct:
https://elixir.bootlin.com/linux/v5.10-rc4/source/include/linux/sched.h#L1254
and __sanitizer_... will continue using these fields.

For the kcov pointer in task struct:
https://elixir.bootlin.com/linux/v5.10-rc4/source/include/linux/sched.h#L1257
we either have a linked list, or 1 pointer for local tracking and a
separate list for remote kcov's:
struct kcov *kcov; // local tracing
struct kcov *remote_kcovs; // remote tracing, can be more than 1
I am not sure which is better: it seems that some functions would
benefit from a single list (KCOV_DISABLE), while others would benefit
from separate fields (KCOV_ENABLE).
Maybe the simplest code will be if we use both approaches -- put all
kcov's into a list, but also cache the local kcov into a separate
field? Then KCOV_DISABLE could just walk the list, but KCOV_ENABLE can
continue checking 1 field.
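
To make the "list plus cached local pointer" idea concrete, here is a
minimal userspace sketch. All names (kcov_sketch, task_sketch,
sketch_*) are hypothetical stand-ins, not the real kernel structures,
and it ignores locking and refcounting entirely; it only shows why
enable can check one field while disable walks one list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kcov; not the real kernel layout. */
struct kcov_sketch {
	int remote;               /* 1 for a remote kcov, 0 for the local one */
	int enabled;
	struct kcov_sketch *next; /* links every kcov attached to the task */
};

/* Hypothetical stand-in for the kcov-related fields of task_struct. */
struct task_sketch {
	struct kcov_sketch *kcov_local; /* cached local kcov, checked on the fast path */
	struct kcov_sketch *kcov_list;  /* all attached kcovs, local and remote */
};

/* KCOV_ENABLE-like path: only the single cached field needs checking. */
static int sketch_enable_local(struct task_sketch *t, struct kcov_sketch *k)
{
	assert(t != NULL && k != NULL);
	if (t->kcov_local)
		return -1; /* at most one local kcov per task */
	k->remote = 0;
	k->enabled = 1;
	k->next = t->kcov_list;
	t->kcov_list = k;
	t->kcov_local = k;
	return 0;
}

/* KCOV_REMOTE_ENABLE-like path: any number of remote kcovs may be attached. */
static void sketch_enable_remote(struct task_sketch *t, struct kcov_sketch *k)
{
	assert(t != NULL && k != NULL);
	k->remote = 1;
	k->enabled = 1;
	k->next = t->kcov_list;
	t->kcov_list = k;
}

/* KCOV_DISABLE-like path: walk the one list, detach everything, return the count. */
static int sketch_disable_all(struct task_sketch *t)
{
	int n = 0;
	struct kcov_sketch *k = t->kcov_list;

	while (k) {
		struct kcov_sketch *next = k->next;

		k->enabled = 0;
		k->next = NULL;
		n++;
		k = next;
	}
	t->kcov_list = NULL;
	t->kcov_local = NULL;
	return n;
}
```

With this shape the __sanitizer_... fast path would still only read
the cached local fields, and the list is touched only on the slow
enable/disable paths.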


Re: [PATCH mm v3 17/19] kasan: clean up metadata allocation and usage

2020-11-17 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 11:20 PM Andrey Konovalov  wrote:
>
> KASAN marks caches that are sanitized with the SLAB_KASAN cache flag.
> Currently if the metadata that is appended after the object (stores e.g.
> stack trace ids) doesn't fit into KMALLOC_MAX_SIZE (can only happen with
> SLAB, see the comment in the patch), KASAN turns off sanitization
> completely.
>
> With this change sanitization of the object data is always enabled.
> However the metadata is only stored when it fits. Instead of checking for
> SLAB_KASAN flag across the code to find out whether the metadata is
> there, use cache->kasan_info.alloc/free_meta_offset. As 0 can be a valid
> value for free_meta_offset, introduce KASAN_NO_FREE_META as an indicator
> that the free metadata is missing.
>
> Along the way rework __kasan_cache_create() and add clarifying comments.
>
> Co-developed-by: Vincenzo Frascino 
> Signed-off-by: Vincenzo Frascino 
> Signed-off-by: Andrey Konovalov 
> Link: 
> https://linux-review.googlesource.com/id/Icd947e2bea054cb5cfbdc6cf6652227d97032dcb
> ---
>  mm/kasan/common.c | 112 +-
>  mm/kasan/generic.c|  15 ++---
>  mm/kasan/hw_tags.c|   6 +-
>  mm/kasan/kasan.h  |  13 -
>  mm/kasan/quarantine.c |   8 +++
>  mm/kasan/report.c |  43 ---
>  mm/kasan/report_sw_tags.c |   9 ++-
>  mm/kasan/sw_tags.c|   4 ++
>  8 files changed, 139 insertions(+), 71 deletions(-)
>
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 42ba64fce8a3..cf874243efab 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -115,9 +115,6 @@ void __kasan_free_pages(struct page *page, unsigned int 
> order)
>   */
>  static inline unsigned int optimal_redzone(unsigned int object_size)
>  {
> -   if (!IS_ENABLED(CONFIG_KASAN_GENERIC))
> -   return 0;
> -
> return
> object_size <= 64- 16   ? 16 :
> object_size <= 128   - 32   ? 32 :
> @@ -131,47 +128,77 @@ static inline unsigned int optimal_redzone(unsigned int 
> object_size)
>  void __kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
>   slab_flags_t *flags)
>  {
> -   unsigned int orig_size = *size;
> -   unsigned int redzone_size;
> -   int redzone_adjust;
> +   unsigned int ok_size;
> +   unsigned int optimal_size;
> +
> +   /*
> +* SLAB_KASAN is used to mark caches as ones that are sanitized by
> +* KASAN. Currently this is used in two places:
> +* 1. In slab_ksize() when calculating the size of the accessible
> +*memory within the object.
> +* 2. In slab_common.c to prevent merging of sanitized caches.
> +*/
> +   *flags |= SLAB_KASAN;
>
> -   if (!kasan_stack_collection_enabled()) {
> -   *flags |= SLAB_KASAN;
> +   if (!kasan_stack_collection_enabled())
> return;
> -   }
>
> -   /* Add alloc meta. */
> +   ok_size = *size;
> +
> +   /* Add alloc meta into redzone. */
> cache->kasan_info.alloc_meta_offset = *size;
> *size += sizeof(struct kasan_alloc_meta);
>
> -   /* Add free meta. */
> -   if (IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> -   (cache->flags & SLAB_TYPESAFE_BY_RCU || cache->ctor ||
> -cache->object_size < sizeof(struct kasan_free_meta))) {
> -   cache->kasan_info.free_meta_offset = *size;
> -   *size += sizeof(struct kasan_free_meta);
> +   /*
> +* If alloc meta doesn't fit, don't add it.
> +* This can only happen with SLAB, as it has KMALLOC_MAX_SIZE equal
> +* to KMALLOC_MAX_CACHE_SIZE and doesn't fall back to page_alloc for
> +* larger sizes.
> +*/
> +   if (*size > KMALLOC_MAX_SIZE) {
> +   cache->kasan_info.alloc_meta_offset = 0;
> +   *size = ok_size;
> +   /* Continue, since free meta might still fit. */
> }
>
> -   redzone_size = optimal_redzone(cache->object_size);
> -   redzone_adjust = redzone_size - (*size - cache->object_size);
> -   if (redzone_adjust > 0)
> -   *size += redzone_adjust;
> -
> -   *size = min_t(unsigned int, KMALLOC_MAX_SIZE,
> -   max(*size, cache->object_size + redzone_size));
> +   /* Only the generic mode uses free meta or flexible redzones. */
> +   if (!IS_ENABLED(CONFIG_KASAN_GENERIC)) {
> +   cache->kasan_info.free_meta_offset = KASAN_NO_FREE_META;
> +   return;
> +   }
>
> /*
> -* If the metadata doesn't fit, don't enable KASAN at all.
> +* Add free meta into redzone when it's not possible to store
> +* it in the object. This is the case when:
> +* 1. Object is SLAB_TYPESAFE_BY_RCU, which means that it can
> +*be touched after it was freed, or
> +* 2. Object has a constructor, which means it's 

Re: [PATCH mm v3 19/19] kasan: update documentation

2020-11-17 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 4:47 PM Marco Elver  wrote:
>
> On Fri, Nov 13, 2020 at 11:20PM +0100, Andrey Konovalov wrote:
> > This change updates KASAN documentation to reflect the addition of boot
> > parameters and also reworks and clarifies some of the existing sections,
> > in particular: defines what a memory granule is, mentions quarantine,
> > makes Kunit section more readable.
> >
> > Signed-off-by: Andrey Konovalov 
> > Link: 
> > https://linux-review.googlesource.com/id/Ib1f83e91be273264b25f42b04448ac96b858849f
>
> Reviewed-by: Marco Elver 

Reviewed-by: Dmitry Vyukov 

> > ---
> >  Documentation/dev-tools/kasan.rst | 186 +++---
> >  1 file changed, 116 insertions(+), 70 deletions(-)
> >
> > diff --git a/Documentation/dev-tools/kasan.rst 
> > b/Documentation/dev-tools/kasan.rst
> > index ffbae8ce5748..0d5d77919b1a 100644
> > --- a/Documentation/dev-tools/kasan.rst
> > +++ b/Documentation/dev-tools/kasan.rst
> > @@ -4,8 +4,9 @@ The Kernel Address Sanitizer (KASAN)
> >  Overview
> >  
> >
> > -KernelAddressSANitizer (KASAN) is a dynamic memory error detector designed 
> > to
> > -find out-of-bound and use-after-free bugs. KASAN has three modes:
> > +KernelAddressSANitizer (KASAN) is a dynamic memory safety error detector
> > +designed to find out-of-bound and use-after-free bugs. KASAN has three 
> > modes:
> > +
> >  1. generic KASAN (similar to userspace ASan),
> >  2. software tag-based KASAN (similar to userspace HWASan),
> >  3. hardware tag-based KASAN (based on hardware memory tagging).
> > @@ -39,23 +40,13 @@ CONFIG_KASAN_INLINE. Outline and inline are compiler 
> > instrumentation types.
> >  The former produces smaller binary while the latter is 1.1 - 2 times 
> > faster.
> >
> >  Both software KASAN modes work with both SLUB and SLAB memory allocators,
> > -hardware tag-based KASAN currently only support SLUB.
> > -For better bug detection and nicer reporting, enable CONFIG_STACKTRACE.
> > +while the hardware tag-based KASAN currently only support SLUB.
> > +
> > +For better error reports that include stack traces, enable 
> > CONFIG_STACKTRACE.
> >
> >  To augment reports with last allocation and freeing stack of the physical 
> > page,
> >  it is recommended to enable also CONFIG_PAGE_OWNER and boot with 
> > page_owner=on.
> >
> > -To disable instrumentation for specific files or directories, add a line
> > -similar to the following to the respective kernel Makefile:
> > -
> > -- For a single file (e.g. main.o)::
> > -
> > -KASAN_SANITIZE_main.o := n
> > -
> > -- For all files in one directory::
> > -
> > -KASAN_SANITIZE := n
> > -
> >  Error reports
> >  ~
> >
> > @@ -140,22 +131,75 @@ freed (in case of a use-after-free bug report). Next 
> > comes a description of
> >  the accessed slab object and information about the accessed memory page.
> >
> >  In the last section the report shows memory state around the accessed 
> > address.
> > -Reading this part requires some understanding of how KASAN works.
> > -
> > -The state of each 8 aligned bytes of memory is encoded in one shadow byte.
> > -Those 8 bytes can be accessible, partially accessible, freed or be a 
> > redzone.
> > -We use the following encoding for each shadow byte: 0 means that all 8 
> > bytes
> > -of the corresponding memory region are accessible; number N (1 <= N <= 7) 
> > means
> > -that the first N bytes are accessible, and other (8 - N) bytes are not;
> > -any negative value indicates that the entire 8-byte word is inaccessible.
> > -We use different negative values to distinguish between different kinds of
> > -inaccessible memory like redzones or freed memory (see mm/kasan/kasan.h).
> > +Internally KASAN tracks memory state separately for each memory granule, 
> > which
> > +is either 8 or 16 aligned bytes depending on KASAN mode. Each number in the
> > +memory state section of the report shows the state of one of the memory
> > +granules that surround the accessed address.
> > +
> > +For generic KASAN the size of each memory granule is 8. The state of each
> > +granule is encoded in one shadow byte. Those 8 bytes can be accessible,
> > +partially accessible, freed or be a part of a redzone. KASAN uses the 
> > following
> > +encoding for each shadow byte: 0 means that all 8 bytes of the 
> > corresponding
> > +memory region are accessible; number N (1 <= N <= 7) means that 
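
The generic-KASAN shadow encoding quoted above (0 means all 8 bytes
accessible, N in 1..7 means only the first N bytes are, any negative
value means none are) can be sketched in userspace C as below. The
sketch_* names are made up for illustration, and treating positive
values of 8 and above as inaccessible is an assumption of this sketch,
since the quoted text does not describe them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GRANULE_SIZE 8 /* generic KASAN granule size, per the quoted text */

/* Number of accessible bytes at the start of a granule for a given shadow byte. */
static unsigned int sketch_accessible_bytes(int8_t shadow)
{
	if (shadow == 0)
		return SKETCH_GRANULE_SIZE; /* whole granule accessible */
	if (shadow > 0 && shadow < SKETCH_GRANULE_SIZE)
		return (unsigned int)shadow; /* only the first N bytes */
	/* Negative values (redzone, freed, ...); other values are an assumption. */
	return 0;
}

/* Whether the byte at the given offset inside the granule may be accessed. */
static bool sketch_byte_accessible(int8_t shadow, unsigned int offset)
{
	assert(offset < SKETCH_GRANULE_SIZE);
	return offset < sketch_accessible_bytes(shadow);
}
```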

Re: [PATCH mm v3 17/19] kasan: clean up metadata allocation and usage

2020-11-17 Thread Dmitry Vyukov
On Tue, Nov 17, 2020 at 2:18 PM Marco Elver  wrote:
>
> On Tue, 17 Nov 2020 at 14:12, Dmitry Vyukov  wrote:
>
> > > +*/
> > > *(u8 *)kasan_mem_to_shadow(object) = KASAN_KMALLOC_FREE;
> > > +
> > > ___cache_free(cache, object, _THIS_IP_);
> > >
> > > if (IS_ENABLED(CONFIG_SLAB))
> > > @@ -168,6 +173,9 @@ void quarantine_put(struct kmem_cache *cache, void 
> > > *object)
> > > struct qlist_head temp = QLIST_INIT;
> > > struct kasan_free_meta *meta = kasan_get_free_meta(cache, object);
> > >
> > > +   if (!meta)
> > > +   return;
> >
> > Humm... is this possible? If yes, we would be leaking the object here...
> > Perhaps BUG_ON with a comment instead.
>
> If this is possible in prod-mode KASAN, a WARN_ON() that returns would be 
> safer.

We only compile quarantine.c for CONFIG_KASAN_GENERIC.


Re: [PATCH mm v3 18/19] kasan, mm: allow cache merging with no metadata

2020-11-17 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 11:20 PM Andrey Konovalov  wrote:
>
> The reason cache merging is disabled with KASAN is because KASAN puts its
> metadata right after the allocated object. When the merged caches have
> slightly different sizes, the metadata ends up in different places, which
> KASAN doesn't support.
>
> It might be possible to adjust the metadata allocation algorithm and make
> it friendly to the cache merging code. Instead this change takes a simpler
> approach and allows merging caches when no metadata is present. Which is
> the case for hardware tag-based KASAN with kasan.mode=prod.
>
> Co-developed-by: Vincenzo Frascino 
> Signed-off-by: Vincenzo Frascino 
> Signed-off-by: Andrey Konovalov 
> Link: 
> https://linux-review.googlesource.com/id/Ia114847dfb2244f297d2cb82d592bf6a07455dba

Somehow gerrit contains an old version... so I was going to
independently propose what Marco already proposed as simplification...
until I looked at the patch in the email :)

Reviewed-by: Dmitry Vyukov 
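
For readers skimming the diff below, the merge decision boils down to
roughly the following userspace sketch. The flag values and sketch_*
names are hypothetical, and the two booleans stand in for
kasan_enabled() and kasan_stack_collection_enabled(); this is an
illustration of the logic, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int sketch_flags_t;

/* Hypothetical flag values, chosen only so the bits do not overlap. */
#define SKETCH_SLAB_RED_ZONE 0x1u
#define SKETCH_SLAB_POISON   0x2u
#define SKETCH_SLAB_KASAN    0x4u

/* SLAB_KASAN blocks merging only when KASAN is on and collects stacks (metadata present). */
static sketch_flags_t sketch_kasan_never_merge(bool kasan_enabled,
					       bool stack_collection)
{
	if (kasan_enabled && stack_collection)
		return SKETCH_SLAB_KASAN;
	return 0;
}

/* A cache may be merged if none of its flags are in the never-merge mask. */
static bool sketch_can_merge(sketch_flags_t cache_flags, bool kasan_enabled,
			     bool stack_collection)
{
	sketch_flags_t never_merge = SKETCH_SLAB_RED_ZONE | SKETCH_SLAB_POISON |
		sketch_kasan_never_merge(kasan_enabled, stack_collection);

	return (cache_flags & never_merge) == 0;
}
```

So a SLAB_KASAN cache stays mergeable with kasan.mode=prod (no stack
collection), while debug flags such as red zoning still block merging
as before.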

> ---
>  include/linux/kasan.h | 21 +++--
>  mm/kasan/common.c | 11 +++
>  mm/slab_common.c  |  3 ++-
>  3 files changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 16cf53eac29b..173a8e81d001 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -81,17 +81,30 @@ struct kasan_cache {
>  };
>
>  #ifdef CONFIG_KASAN_HW_TAGS
> +
>  DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled);
> +
>  static __always_inline bool kasan_enabled(void)
>  {
> return static_branch_likely(&kasan_flag_enabled);
>  }
> -#else
> +
> +#else /* CONFIG_KASAN_HW_TAGS */
> +
>  static inline bool kasan_enabled(void)
>  {
> return true;
>  }
> -#endif
> +
> +#endif /* CONFIG_KASAN_HW_TAGS */
> +
> +slab_flags_t __kasan_never_merge(void);
> +static __always_inline slab_flags_t kasan_never_merge(void)
> +{
> +   if (kasan_enabled())
> +   return __kasan_never_merge();
> +   return 0;
> +}
>
>  void __kasan_unpoison_range(const void *addr, size_t size);
>  static __always_inline void kasan_unpoison_range(const void *addr, size_t 
> size)
> @@ -238,6 +251,10 @@ static inline bool kasan_enabled(void)
>  {
> return false;
>  }
> +static inline slab_flags_t kasan_never_merge(void)
> +{
> +   return 0;
> +}
>  static inline void kasan_unpoison_range(const void *address, size_t size) {}
>  static inline void kasan_alloc_pages(struct page *page, unsigned int order) 
> {}
>  static inline void kasan_free_pages(struct page *page, unsigned int order) {}
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index cf874243efab..a5a4dcb1254d 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -87,6 +87,17 @@ asmlinkage void kasan_unpoison_task_stack_below(const void 
> *watermark)
>  }
>  #endif /* CONFIG_KASAN_STACK */
>
> +/*
> + * Only allow cache merging when stack collection is disabled and no metadata
> + * is present.
> + */
> +slab_flags_t __kasan_never_merge(void)
> +{
> +   if (kasan_stack_collection_enabled())
> +   return SLAB_KASAN;
> +   return 0;
> +}
> +
>  void __kasan_alloc_pages(struct page *page, unsigned int order)
>  {
> u8 tag;
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 0b5ae1819a8b..075b23ce94ec 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -54,7 +55,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
>   */
>  #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
> SLAB_TRACE | SLAB_TYPESAFE_BY_RCU | SLAB_NOLEAKTRACE | \
> -   SLAB_FAILSLAB | SLAB_KASAN)
> +   SLAB_FAILSLAB | kasan_never_merge())
>
>  #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
>  SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
> --
> 2.29.2.299.gdc1121823c-goog
>


Re: [PATCH mm v3 12/19] kasan, mm: check kasan_enabled in annotations

2020-11-17 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 4:26 PM Marco Elver  wrote:
>
> On Fri, Nov 13, 2020 at 11:20PM +0100, Andrey Konovalov wrote:
> > Declare the kasan_enabled static key in include/linux/kasan.h and in
> > include/linux/mm.h and check it in all kasan annotations. This allows to
> > avoid any slowdown caused by function calls when kasan_enabled is
> > disabled.
> >
> > Co-developed-by: Vincenzo Frascino 
> > Signed-off-by: Vincenzo Frascino 
> > Signed-off-by: Andrey Konovalov 
> > Link: 
> > https://linux-review.googlesource.com/id/I2589451d3c96c97abbcbf714baabe6161c6f153e
>
> Reviewed-by: Marco Elver 

Also much nicer with kasan_enabled() now.

Reviewed-by: Dmitry Vyukov 
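
The wrapper pattern in the diff below can be illustrated with a small
userspace sketch. A plain bool stands in for the static key here (an
assumption: the real static key patches the branch at runtime, which
plain C cannot show), and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool sketch_kasan_flag_enabled;  /* stand-in for the static key */
static unsigned long sketch_slow_calls; /* counts out-of-line calls, for illustration */

static inline bool sketch_kasan_enabled(void)
{
	return sketch_kasan_flag_enabled;
}

/* Out-of-line part; plays the role of the __kasan_*() functions. */
static void sketch_unpoison_range_slow(const void *addr, size_t size)
{
	assert(addr != NULL);
	(void)size;
	sketch_slow_calls++;
}

/* Inline annotation: the function call is skipped entirely when disabled. */
static inline void sketch_unpoison_range(const void *addr, size_t size)
{
	if (sketch_kasan_enabled())
		sketch_unpoison_range_slow(addr, size);
}
```

Callers keep using the short name, and the cost with the feature
disabled is reduced to the inline check.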

> > ---
> >  include/linux/kasan.h | 213 --
> >  include/linux/mm.h|  22 +++--
> >  mm/kasan/common.c |  56 +--
> >  3 files changed, 210 insertions(+), 81 deletions(-)
> >
> > diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> > index 872bf145ddde..6bd95243a583 100644
> > --- a/include/linux/kasan.h
> > +++ b/include/linux/kasan.h
> > @@ -2,6 +2,7 @@
> >  #ifndef _LINUX_KASAN_H
> >  #define _LINUX_KASAN_H
> >
> > +#include 
> >  #include 
> >
> >  struct kmem_cache;
> > @@ -74,54 +75,176 @@ static inline void kasan_disable_current(void) {}
> >
> >  #ifdef CONFIG_KASAN
> >
> > -void kasan_unpoison_range(const void *address, size_t size);
> > +struct kasan_cache {
> > + int alloc_meta_offset;
> > + int free_meta_offset;
> > +};
> >
> > -void kasan_alloc_pages(struct page *page, unsigned int order);
> > -void kasan_free_pages(struct page *page, unsigned int order);
> > +#ifdef CONFIG_KASAN_HW_TAGS
> > +DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled);
> > +static __always_inline bool kasan_enabled(void)
> > +{
> > + return static_branch_likely(&kasan_flag_enabled);
> > +}
> > +#else
> > +static inline bool kasan_enabled(void)
> > +{
> > + return true;
> > +}
> > +#endif
> >
> > -void kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
> > - slab_flags_t *flags);
> > +void __kasan_unpoison_range(const void *addr, size_t size);
> > +static __always_inline void kasan_unpoison_range(const void *addr, size_t 
> > size)
> > +{
> > + if (kasan_enabled())
> > + __kasan_unpoison_range(addr, size);
> > +}
> >
> > -void kasan_poison_slab(struct page *page);
> > -void kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
> > -void kasan_poison_object_data(struct kmem_cache *cache, void *object);
> > -void * __must_check kasan_init_slab_obj(struct kmem_cache *cache,
> > - const void *object);
> > +void __kasan_alloc_pages(struct page *page, unsigned int order);
> > +static __always_inline void kasan_alloc_pages(struct page *page,
> > + unsigned int order)
> > +{
> > + if (kasan_enabled())
> > + __kasan_alloc_pages(page, order);
> > +}
> >
> > -void * __must_check kasan_kmalloc_large(const void *ptr, size_t size,
> > - gfp_t flags);
> > -void kasan_kfree_large(void *ptr, unsigned long ip);
> > -void kasan_poison_kfree(void *ptr, unsigned long ip);
> > -void * __must_check kasan_kmalloc(struct kmem_cache *s, const void *object,
> > - size_t size, gfp_t flags);
> > -void * __must_check kasan_krealloc(const void *object, size_t new_size,
> > - gfp_t flags);
> > +void __kasan_free_pages(struct page *page, unsigned int order);
> > +static __always_inline void kasan_free_pages(struct page *page,
> > + unsigned int order)
> > +{
> > + if (kasan_enabled())
> > + __kasan_free_pages(page, order);
> > +}
> >
> > -void * __must_check kasan_slab_alloc(struct kmem_cache *s, void *object,
> > - gfp_t flags);
> > -bool kasan_slab_free(struct kmem_cache *s, void *object, unsigned long ip);
> > +void __kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
> > + slab_flags_t *flags);
> > +static __always_inline void kasan_cache_create(struct kmem_cache *cache,
> > + unsigned int *size, slab_flags_t *flags)
> > +{
> > + if (kasan_enabled())
> > + __k

Re: [PATCH mm v3 11/19] kasan: add and integrate kasan boot parameters

2020-11-17 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 4:15 PM Marco Elver  wrote:
>
> On Fri, Nov 13, 2020 at 11:20PM +0100, Andrey Konovalov wrote:
> > Hardware tag-based KASAN mode is intended to eventually be used in
> > production as a security mitigation. Therefore there's a need for finer
> > control over KASAN features and for an existence of a kill switch.
> >
> > This change adds a few boot parameters for hardware tag-based KASAN that
> > allow to disable or otherwise control particular KASAN features.
> >
> > The features that can be controlled are:
> >
> > 1. Whether KASAN is enabled at all.
> > 2. Whether KASAN collects and saves alloc/free stacks.
> > 3. Whether KASAN panics on a detected bug or not.
> >
> > With this change a new boot parameter kasan.mode allows to choose one of
> > three main modes:
> >
> > - kasan.mode=off - KASAN is disabled, no tag checks are performed
> > - kasan.mode=prod - only essential production features are enabled
> > - kasan.mode=full - all KASAN features are enabled
> >
> > The chosen mode provides default control values for the features mentioned
> > above. However it's also possible to override the default values by
> > providing:
> >
> > - kasan.stacktrace=off/on - enable alloc/free stack collection
> > (default: on for mode=full, otherwise off)
> > - kasan.fault=report/panic - only report tag fault or also panic
> >  (default: report)
> >
> > If kasan.mode parameter is not provided, it defaults to full when
> > CONFIG_DEBUG_KERNEL is enabled, and to prod otherwise.
> >
> > It is essential that switching between these modes doesn't require
> > rebuilding the kernel with different configs, as this is required by
> > the Android GKI (Generic Kernel Image) initiative [1].
> >
> > [1] 
> > https://source.android.com/devices/architecture/kernel/generic-kernel-image
> >
> > Signed-off-by: Andrey Konovalov 
> > Link: 
> > https://linux-review.googlesource.com/id/If7d37003875b2ed3e0935702c8015c223d6416a4
>
> Reviewed-by: Marco Elver 

Much nicer with the wrappers now.

Reviewed-by: Dmitry Vyukov 
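
As a reading aid, the defaulting rules from the commit message can be
sketched in userspace C as below. The struct, enum and function names
are hypothetical, NULL means "parameter not given", and ignoring
unrecognized values is an assumption of this sketch rather than
something the patch description states.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum sketch_mode { SKETCH_MODE_OFF, SKETCH_MODE_PROD, SKETCH_MODE_FULL };

struct sketch_config {
	bool enabled;        /* whether KASAN is enabled at all */
	bool stacktrace;     /* whether alloc/free stacks are collected */
	bool panic_on_fault; /* whether a detected bug panics */
};

/* Resolve kasan.mode, kasan.stacktrace and kasan.fault into feature flags. */
static struct sketch_config sketch_resolve(const char *mode,
					   const char *stacktrace,
					   const char *fault,
					   bool debug_kernel)
{
	/* No kasan.mode: full with CONFIG_DEBUG_KERNEL, prod otherwise. */
	enum sketch_mode m = debug_kernel ? SKETCH_MODE_FULL : SKETCH_MODE_PROD;
	struct sketch_config cfg;

	if (mode) {
		if (!strcmp(mode, "off"))
			m = SKETCH_MODE_OFF;
		else if (!strcmp(mode, "prod"))
			m = SKETCH_MODE_PROD;
		else if (!strcmp(mode, "full"))
			m = SKETCH_MODE_FULL;
	}

	/* Defaults implied by the chosen mode. */
	cfg.enabled = (m != SKETCH_MODE_OFF);
	cfg.stacktrace = (m == SKETCH_MODE_FULL);
	cfg.panic_on_fault = false; /* kasan.fault defaults to report */

	/* Per-feature overrides. */
	if (stacktrace) {
		if (!strcmp(stacktrace, "on"))
			cfg.stacktrace = true;
		else if (!strcmp(stacktrace, "off"))
			cfg.stacktrace = false;
	}
	if (fault) {
		if (!strcmp(fault, "panic"))
			cfg.panic_on_fault = true;
		else if (!strcmp(fault, "report"))
			cfg.panic_on_fault = false;
	}
	return cfg;
}
```

For example, kasan.mode=prod kasan.stacktrace=on would keep the
production defaults but turn stack collection back on.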

> > ---
> >  mm/kasan/common.c  |  22 +--
> >  mm/kasan/hw_tags.c | 151 +
> >  mm/kasan/kasan.h   |  16 +
> >  mm/kasan/report.c  |  14 -
> >  4 files changed, 196 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> > index 1ac4f435c679..a11e3e75eb08 100644
> > --- a/mm/kasan/common.c
> > +++ b/mm/kasan/common.c
> > @@ -135,6 +135,11 @@ void kasan_cache_create(struct kmem_cache *cache, 
> > unsigned int *size,
> >   unsigned int redzone_size;
> >   int redzone_adjust;
> >
> > + if (!kasan_stack_collection_enabled()) {
> > + *flags |= SLAB_KASAN;
> > + return;
> > + }
> > +
> >   /* Add alloc meta. */
> >   cache->kasan_info.alloc_meta_offset = *size;
> >   *size += sizeof(struct kasan_alloc_meta);
> > @@ -171,6 +176,8 @@ void kasan_cache_create(struct kmem_cache *cache, 
> > unsigned int *size,
> >
> >  size_t kasan_metadata_size(struct kmem_cache *cache)
> >  {
> > + if (!kasan_stack_collection_enabled())
> > + return 0;
> >   return (cache->kasan_info.alloc_meta_offset ?
> >   sizeof(struct kasan_alloc_meta) : 0) +
> >   (cache->kasan_info.free_meta_offset ?
> > @@ -263,11 +270,13 @@ void * __must_check kasan_init_slab_obj(struct 
> > kmem_cache *cache,
> >  {
> >   struct kasan_alloc_meta *alloc_meta;
> >
> > - if (!(cache->flags & SLAB_KASAN))
> > - return (void *)object;
> > + if (kasan_stack_collection_enabled()) {
> > + if (!(cache->flags & SLAB_KASAN))
> > + return (void *)object;
> >
> > - alloc_meta = kasan_get_alloc_meta(cache, object);
> > - __memset(alloc_meta, 0, sizeof(*alloc_meta));
> > + alloc_meta = kasan_get_alloc_meta(cache, object);
> > + __memset(alloc_meta, 0, sizeof(*alloc_meta));
> > + }
> >
> >   if (IS_ENABLED(CONFIG_KASAN_SW_TAGS) || 
> > IS_ENABLED(CONFIG_KASAN_HW_TAGS))
> >   object = set_tag(object, assign_tag(cache, object, true, 
> > false));
> > @@ -307,6 +316,9 @@ static bool __kasan_slab_free(struct kmem_cache *cache, 
> > void *object,
> >   rounded_up_size = round_up(

Re: [PATCH mm v3 08/19] kasan: inline random_tag for HW_TAGS

2020-11-17 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 11:20 PM Andrey Konovalov  wrote:
>
> Using random_tag() currently results in a function call. Move its
> definition to mm/kasan/kasan.h and turn it into a static inline function
> for hardware tag-based mode to avoid unneeded function calls.
>
> Signed-off-by: Andrey Konovalov 
> Reviewed-by: Marco Elver 

Reviewed-by: Dmitry Vyukov 

> Link: 
> https://linux-review.googlesource.com/id/Iac5b2faf9a912900e16cca6834d621f5d4abf427
> ---
>  mm/kasan/hw_tags.c |  5 -
>  mm/kasan/kasan.h   | 31 ++-
>  2 files changed, 14 insertions(+), 22 deletions(-)
>
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index a34476764f1d..3cdd87d189f6 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -51,11 +51,6 @@ void unpoison_range(const void *address, size_t size)
> round_up(size, KASAN_GRANULE_SIZE), get_tag(address));
>  }
>
> -u8 random_tag(void)
> -{
> -   return hw_get_random_tag();
> -}
> -
>  bool check_invalid_free(void *addr)
>  {
> u8 ptr_tag = get_tag(addr);
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 5e8cd2080369..7876a2547b7d 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -190,6 +190,12 @@ static inline bool addr_has_metadata(const void *addr)
>
>  #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
>
> +#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
> +void print_tags(u8 addr_tag, const void *addr);
> +#else
> +static inline void print_tags(u8 addr_tag, const void *addr) { }
> +#endif
> +
>  bool check_invalid_free(void *addr);
>
>  void *find_first_bad_addr(void *addr, size_t size);
> @@ -225,23 +231,6 @@ static inline void quarantine_reduce(void) { }
>  static inline void quarantine_remove_cache(struct kmem_cache *cache) { }
>  #endif
>
> -#if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
> -
> -void print_tags(u8 addr_tag, const void *addr);
> -
> -u8 random_tag(void);
> -
> -#else
> -
> -static inline void print_tags(u8 addr_tag, const void *addr) { }
> -
> -static inline u8 random_tag(void)
> -{
> -   return 0;
> -}
> -
> -#endif
> -
>  #ifndef arch_kasan_set_tag
>  static inline const void *arch_kasan_set_tag(const void *addr, u8 tag)
>  {
> @@ -281,6 +270,14 @@ static inline const void *arch_kasan_set_tag(const void 
> *addr, u8 tag)
>
>  #endif /* CONFIG_KASAN_HW_TAGS */
>
> +#ifdef CONFIG_KASAN_SW_TAGS
> +u8 random_tag(void);
> +#elif defined(CONFIG_KASAN_HW_TAGS)
> +static inline u8 random_tag(void) { return hw_get_random_tag(); }
> +#else
> +static inline u8 random_tag(void) { return 0; }
> +#endif
> +
>  /*
>   * Exported functions for interfaces called from assembly or from generated
>   * code. Declarations here to avoid warning about missing declarations.
> --
> 2.29.2.299.gdc1121823c-goog
>


Re: [PATCH mm v3 07/19] kasan: inline kasan_reset_tag for tag-based modes

2020-11-17 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 11:20 PM Andrey Konovalov  wrote:
>
> Using kasan_reset_tag() currently results in a function call. As it's
> called quite often from the allocator code, this leads to a noticeable
> slowdown. Move it to include/linux/kasan.h and turn it into a static
> inline function. Also remove the now unneeded reset_tag() internal KASAN
> macro and use kasan_reset_tag() instead.
>
> Signed-off-by: Andrey Konovalov 
> Reviewed-by: Marco Elver 

Reviewed-by: Dmitry Vyukov 

> Link: 
> https://linux-review.googlesource.com/id/I4d2061acfe91d480a75df00b07c22d8494ef14b5
> ---
>  include/linux/kasan.h | 5 -
>  mm/kasan/common.c | 6 +++---
>  mm/kasan/hw_tags.c| 9 ++---
>  mm/kasan/kasan.h  | 4 
>  mm/kasan/report.c | 4 ++--
>  mm/kasan/report_hw_tags.c | 2 +-
>  mm/kasan/report_sw_tags.c | 4 ++--
>  mm/kasan/shadow.c | 4 ++--
>  mm/kasan/sw_tags.c| 9 ++---
>  9 files changed, 18 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index f2109bf0c5f9..1594177f86bb 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -193,7 +193,10 @@ static inline void kasan_record_aux_stack(void *ptr) {}
>
>  #if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
>
> -void *kasan_reset_tag(const void *addr);
> +static inline void *kasan_reset_tag(const void *addr)
> +{
> +   return (void *)arch_kasan_reset_tag(addr);
> +}
>
>  bool kasan_report(unsigned long addr, size_t size,
> bool is_write, unsigned long ip);
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index fabd843eff3d..1ac4f435c679 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -180,14 +180,14 @@ size_t kasan_metadata_size(struct kmem_cache *cache)
>  struct kasan_alloc_meta *kasan_get_alloc_meta(struct kmem_cache *cache,
>   const void *object)
>  {
> -   return (void *)reset_tag(object) + 
> cache->kasan_info.alloc_meta_offset;
> +   return kasan_reset_tag(object) + cache->kasan_info.alloc_meta_offset;
>  }
>
>  struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
> const void *object)
>  {
> BUILD_BUG_ON(sizeof(struct kasan_free_meta) > 32);
> -   return (void *)reset_tag(object) + cache->kasan_info.free_meta_offset;
> +   return kasan_reset_tag(object) + cache->kasan_info.free_meta_offset;
>  }
>
>  void kasan_poison_slab(struct page *page)
> @@ -284,7 +284,7 @@ static bool __kasan_slab_free(struct kmem_cache *cache, 
> void *object,
>
> tag = get_tag(object);
> tagged_object = object;
> -   object = reset_tag(object);
> +   object = kasan_reset_tag(object);
>
> if (is_kfence_address(object))
> return false;
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index 68e77363e58b..a34476764f1d 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -31,18 +31,13 @@ void __init kasan_init_hw_tags(void)
> pr_info("KernelAddressSanitizer initialized\n");
>  }
>
> -void *kasan_reset_tag(const void *addr)
> -{
> -   return reset_tag(addr);
> -}
> -
>  void poison_range(const void *address, size_t size, u8 value)
>  {
> /* Skip KFENCE memory if called explicitly outside of sl*b. */
> if (is_kfence_address(address))
> return;
>
> -   hw_set_mem_tag_range(reset_tag(address),
> +   hw_set_mem_tag_range(kasan_reset_tag(address),
> round_up(size, KASAN_GRANULE_SIZE), value);
>  }
>
> @@ -52,7 +47,7 @@ void unpoison_range(const void *address, size_t size)
> if (is_kfence_address(address))
> return;
>
> -   hw_set_mem_tag_range(reset_tag(address),
> +   hw_set_mem_tag_range(kasan_reset_tag(address),
> round_up(size, KASAN_GRANULE_SIZE), get_tag(address));
>  }
>
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 0eab7e4cecb8..5e8cd2080369 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -248,15 +248,11 @@ static inline const void *arch_kasan_set_tag(const void 
> *addr, u8 tag)
> return addr;
>  }
>  #endif
> -#ifndef arch_kasan_reset_tag
> -#define arch_kasan_reset_tag(addr) ((void *)(addr))
> -#endif
>  #ifndef arch_kasan_get_tag
>  #define arch_kasan_get_tag(addr)   0
>  #endif
>
>  #define set_tag(addr, tag) ((void *)arch_kasan_set_tag((addr), (tag)))
> -#define reset_tag(addr)((void *)arch_kasan_reset_tag(addr))

suspicious capability check in ovl_ioctl_set_flags

2020-11-16 Thread Dmitry Vyukov
Hi Miklos,

We've detected a suspicious double-fetch of user-space data in
ovl_ioctl_set_flags using a prototype tool (see report below [1]).

It points to ovl_ioctl_set_flags, which does a capability check using
flags, but then the real ioctl fetches flags from user space again and
may use a different value:

static long ovl_ioctl_set_flags(struct file *file, unsigned int cmd,
unsigned long arg, unsigned int flags)
{
...
/* Check the capability before cred override */
oldflags = ovl_iflags_to_fsflags(READ_ONCE(inode->i_flags));
ret = vfs_ioc_setflags_prepare(inode, oldflags, flags);
if (ret)
goto unlock;
...
ret = ovl_real_ioctl(file, cmd, arg);

All fs impls call vfs_ioc_setflags_prepare again, so the capability is
checked again.

But I think this makes the vfs_ioc_setflags_prepare check in overlayfs
pointless (?) and the "Check the capability before cred override"
comment misleading: a user can bypass this check by presenting benign
flags first and then overwriting them with non-benign flags. Or, if this
check is still needed... it is wrong (?). The code would then need to
arrange for both ioctls to operate on the same data.
Does it make any sense?
Thanks
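
To make the concern concrete, here is a minimal user-space sketch of the
double-fetch pattern. It is not the overlayfs code: every demo_* name and
DEMO_FL_PRIVILEGED are made up for illustration, and the "attacker" is
simulated deterministically by rewriting the value right after the first
read instead of racing from another thread.

```c
#include <stdint.h>

/* Hypothetical flag standing in for one that requires a capability. */
#define DEMO_FL_PRIVILEGED 0x10u

/* Simulated user memory, plus the value an attacker swaps in later. */
static uint32_t demo_user_flags;
static uint32_t demo_attacker_value;
static int demo_fetch_count;

/* Models a get_user()-style read: each call re-reads user-controlled memory. */
static uint32_t demo_fetch_user(void)
{
	uint32_t v = demo_user_flags;

	/* After the first read, the simulated attacker overwrites the value. */
	if (++demo_fetch_count == 1)
		demo_user_flags = demo_attacker_value;
	return v;
}

/* Models the capability check: privileged flags are refused without has_cap. */
static int demo_check(uint32_t flags, int has_cap)
{
	return ((flags & DEMO_FL_PRIVILEGED) && !has_cap) ? -1 : 0;
}

/* Racy shape: the check and the use each fetch from user memory. */
int demo_set_flags_double_fetch(int has_cap, uint32_t *applied)
{
	if (demo_check(demo_fetch_user(), has_cap))
		return -1;
	*applied = demo_fetch_user();	/* second fetch, may differ from the checked value */
	return 0;
}

/* Safer shape: fetch once, then check and use the same copy. */
int demo_set_flags_single_fetch(int has_cap, uint32_t *applied)
{
	uint32_t flags = demo_fetch_user();

	if (demo_check(flags, has_cap))
		return -1;
	*applied = flags;
	return 0;
}

/* Resets the simulation: benign value first, privileged value swapped in later. */
void demo_reset(void)
{
	demo_user_flags = 0;
	demo_attacker_value = DEMO_FL_PRIVILEGED;
	demo_fetch_count = 0;
}
```

In this model an unprivileged caller (has_cap == 0) gets the privileged
flag applied through the double-fetch variant, while the single-fetch
variant applies only the value that was actually checked.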

[1] BUG: multi-read in __x64_sys_ioctl  between ovl_ioctl and ext4_ioctl
=== First Address Range Stack ===
 df_save_stack+0x33/0x70 lib/df-detection.c:208
 add_address+0x2ac/0x352 lib/df-detection.c:47
 ovl_ioctl_set_fsflags fs/overlayfs/file.c:607 [inline]
 ovl_ioctl+0x7d/0x290 fs/overlayfs/file.c:654
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
=== Second Address Range Stack ===
 df_save_stack+0x33/0x70 lib/df-detection.c:208
 add_address+0x2ac/0x352 lib/df-detection.c:47
 ext4_ioctl+0x13b1/0x27f0 fs/ext4/ioctl.c:833
 vfs_ioctl+0x30/0x80 fs/ioctl.c:48
 ovl_real_ioctl+0xed/0x100 fs/overlayfs/file.c:539
 ovl_ioctl_set_flags+0x11d/0x180 fs/overlayfs/file.c:574
 ovl_ioctl_set_fsflags fs/overlayfs/file.c:610 [inline]
 ovl_ioctl+0x11e/0x290 fs/overlayfs/file.c:654
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
syscall number 16  System Call: __x64_sys_ioctl+0x0/0x140 fs/ioctl.c:800
First 2000 len 4 Caller vfs_ioctl fs/ioctl.c:48 [inline]
First 2000 len 4 Caller __do_sys_ioctl fs/ioctl.c:753 [inline]
First 2000 len 4 Caller __se_sys_ioctl fs/ioctl.c:739 [inline]
First 2000 len 4 Caller __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:739
Second 2000 len 4 Caller vfs_ioctl+0x30/0x80 fs/ioctl.c:48
==


Re: [PATCH v2 1/1] kasan: fix object remain in offline per-cpu quarantine

2020-11-16 Thread Dmitry Vyukov
On Tue, Nov 17, 2020 at 7:46 AM Kuan-Ying Lee
 wrote:
>
> On Mon, 2020-11-16 at 10:26 +0100, Dmitry Vyukov wrote:
> > On Mon, Nov 16, 2020 at 7:30 AM Kuan-Ying Lee
> >  wrote:
> > >
> > > We hit this issue in our internal test.
> > > When enabling generic kasan, a kfree()'d object is put into per-cpu
> > > quarantine first. If the cpu goes offline, object still remains in
> > > the per-cpu quarantine. If we call kmem_cache_destroy() now, slub
> > > will report "Objects remaining" error.
> > >
> > > [   74.982625] 
> > > =
> > > [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in 
> > > test_module_slab on __kmem_cache_shutdown()
> > > [   74.984145] 
> > > -
> > > [   74.984145]
> > > [   74.984883] Disabling lock debugging due to kernel taint
> > > [   74.985561] INFO: Slab 0x(ptrval) objects=34 used=1 
> > > fp=0x(ptrval) flags=0x20010200
> > > [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: GB 
> > > 5.10.0-rc1-7-g4525c8781ec0-dirty #10
> > > [   74.987262] Hardware name: linux,dummy-virt (DT)
> > > [   74.987606] Call trace:
> > > [   74.987924]  dump_backtrace+0x0/0x2b0
> > > [   74.988296]  show_stack+0x18/0x68
> > > [   74.988698]  dump_stack+0xfc/0x168
> > > [   74.989030]  slab_err+0xac/0xd4
> > > [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> > > [   74.989779]  kmem_cache_destroy+0x68/0x130
> > > [   74.990176]  test_version_show+0x84/0xf0
> > > [   74.990679]  module_attr_show+0x40/0x60
> > > [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> > > [   74.991656]  kernfs_seq_show+0xa0/0xb8
> > > [   74.992059]  seq_read+0x1f0/0x7e8
> > > [   74.992415]  kernfs_fop_read+0x70/0x338
> > > [   74.993051]  vfs_read+0xe4/0x250
> > > [   74.993498]  ksys_read+0xc8/0x180
> > > [   74.993825]  __arm64_sys_read+0x44/0x58
> > > [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> > > [   74.994708]  do_el0_svc+0x38/0xa0
> > > [   74.995088]  el0_sync_handler+0x170/0x178
> > > [   74.995497]  el0_sync+0x174/0x180
> > > [   74.996050] INFO: Object 0x(ptrval) @offset=15848
> > > [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 
> > > cpu=6 pid=172
> > > [   75.000802]  stack_trace_save+0x9c/0xd0
> > > [   75.002420]  set_track+0x64/0xf0
> > > [   75.002770]  alloc_debug_processing+0x104/0x1a0
> > > [   75.003171]  ___slab_alloc+0x628/0x648
> > > [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> > > [   75.004757]  kmem_cache_alloc+0x560/0x588
> > > [   75.005376]  test_version_show+0x98/0xf0
> > > [   75.005756]  module_attr_show+0x40/0x60
> > > [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> > > [   75.007433]  kernfs_seq_show+0xa0/0xb8
> > > [   75.007800]  seq_read+0x1f0/0x7e8
> > > [   75.008128]  kernfs_fop_read+0x70/0x338
> > > [   75.008507]  vfs_read+0xe4/0x250
> > > [   75.008990]  ksys_read+0xc8/0x180
> > > [   75.009462]  __arm64_sys_read+0x44/0x58
> > > [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> > > [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has 
> > > objects
> > >
> > > Register a cpu hotplug function to remove all objects in the offline
> > > per-cpu quarantine when cpu is going offline. Set a per-cpu variable
> > > to indicate this cpu is offline.
> > >
> > > Signed-off-by: Kuan-Ying Lee 
> > > Suggested-by: Dmitry Vyukov 
> > > Reported-by: Guangye Yang 
> > > Cc: Andrey Ryabinin 
> > > Cc: Alexander Potapenko 
> > > Cc: Andrew Morton 
> > > Cc: Matthias Brugger 
> > > ---
> > >  mm/kasan/quarantine.c | 35 +++
> > >  1 file changed, 35 insertions(+)
> > >
> > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > > index 4c5375810449..16e618ea805e 100644
> > > --- a/mm/kasan/quarantine.c
> > > +++ b/mm/kasan/quarantine.c
> > > @@ -29,6 +29,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #include "../slab.h"
> > >  #include "kasan.h"
> > > @@ -43,6 +44,7 @@ struct qlist_head {
> > 

Re: [PATCH RFC v2 04/21] kasan: unpoison stack only with CONFIG_KASAN_STACK

2020-11-16 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 1:16 PM Catalin Marinas  wrote:
>
> On Mon, Nov 16, 2020 at 12:50:00PM +0100, Marco Elver wrote:
> > On Mon, 16 Nov 2020 at 11:59, Dmitry Vyukov  wrote:
> > > On Thu, Oct 29, 2020 at 8:57 PM 'Andrey Konovalov' via kasan-dev
> > >  wrote:
> > > > On Tue, Oct 27, 2020 at 1:44 PM Dmitry Vyukov  
> > > > wrote:
> > > > >
> > > > > On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov 
> > > > >  wrote:
> > > > > >
> > > > > > There's a config option CONFIG_KASAN_STACK that has to be enabled 
> > > > > > for
> > > > > > KASAN to use stack instrumentation and perform validity checks for
> > > > > > stack variables.
> > > > > >
> > > > > > There's no need to unpoison stack when CONFIG_KASAN_STACK is not 
> > > > > > enabled.
> > > > > > Only call kasan_unpoison_task_stack[_below]() when 
> > > > > > CONFIG_KASAN_STACK is
> > > > > > enabled.
> > > > > >
> > > > > > Signed-off-by: Andrey Konovalov 
> > > > > > Link: 
> > > > > > https://linux-review.googlesource.com/id/If8a891e9fe01ea543e00b576852685afec0887e3
> > > > > > ---
> > > > > >  arch/arm64/kernel/sleep.S|  2 +-
> > > > > >  arch/x86/kernel/acpi/wakeup_64.S |  2 +-
> > > > > >  include/linux/kasan.h| 10 ++
> > > > > >  mm/kasan/common.c|  2 ++
> > > > > >  4 files changed, 10 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
> > > > > > index ba40d57757d6..bdadfa56b40e 100644
> > > > > > --- a/arch/arm64/kernel/sleep.S
> > > > > > +++ b/arch/arm64/kernel/sleep.S
> > > > > > @@ -133,7 +133,7 @@ SYM_FUNC_START(_cpu_resume)
> > > > > >  */
> > > > > > bl  cpu_do_resume
> > > > > >
> > > > > > -#ifdef CONFIG_KASAN
> > > > > > +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> > > > > > mov x0, sp
> > > > > > bl  kasan_unpoison_task_stack_below
> > > > > >  #endif
> > > > > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S 
> > > > > > b/arch/x86/kernel/acpi/wakeup_64.S
> > > > > > index c8daa92f38dc..5d3a0b8fd379 100644
> > > > > > --- a/arch/x86/kernel/acpi/wakeup_64.S
> > > > > > +++ b/arch/x86/kernel/acpi/wakeup_64.S
> > > > > > @@ -112,7 +112,7 @@ SYM_FUNC_START(do_suspend_lowlevel)
> > > > > > movqpt_regs_r14(%rax), %r14
> > > > > > movqpt_regs_r15(%rax), %r15
> > > > > >
> > > > > > -#ifdef CONFIG_KASAN
> > > > > > +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> > > > > > /*
> > > > > >  * The suspend path may have poisoned some areas deeper in 
> > > > > > the stack,
> > > > > >  * which we now need to unpoison.
> > > > > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> > > > > > index 3f3f541e5d5f..7be9fb9146ac 100644
> > > > > > --- a/include/linux/kasan.h
> > > > > > +++ b/include/linux/kasan.h
> > > > > > @@ -68,8 +68,6 @@ static inline void kasan_disable_current(void) {}
> > > > > >
> > > > > >  void kasan_unpoison_memory(const void *address, size_t size);
> > > > > >
> > > > > > -void kasan_unpoison_task_stack(struct task_struct *task);
> > > > > > -
> > > > > >  void kasan_alloc_pages(struct page *page, unsigned int order);
> > > > > >  void kasan_free_pages(struct page *page, unsigned int order);
> > > > > >
> > > > > > @@ -114,8 +112,6 @@ void kasan_restore_multi_shot(bool enabled);
> > > > > >
> > > > > >  static inline void kasan_unpoison_memory(const void *address, 
> > > > > > size_t size) {}
> > > > > >
> > > > > > -static inline void kasan_unpoison_task_stack(struct task_struct 
> > > > > > *task) {}
> > > > > > -
> > > >

Re: KASAN: invalid-free in p9_client_create

2020-11-16 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 11:30 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:92edc4ae Add linux-next specific files for 20201113
> git tree:   linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=142f881650
> kernel config:  https://syzkaller.appspot.com/x/.config?x=79ad4f8ad2d96176
> dashboard link: https://syzkaller.appspot.com/bug?extid=3a0f6c96e37e347c6ba9
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+3a0f6c96e37e347c6...@syzkaller.appspotmail.com

Looks like a real double free in slab code. +MM maintainers
Note there was a preceding kmalloc failure in sysfs_slab_add.


> RBP: 7fa358076ca0 R08: 2080 R09: 
> R10:  R11: 0246 R12: 001f
> R13: 7fff7dcf224f R14: 7fa3580779c0 R15: 0118bf2c
> kobject_add_internal failed for 9p-fcall-cache (error: -12 parent: slab)
> ==
> BUG: KASAN: double-free or invalid-free in slab_free mm/slub.c:3157 [inline]
> BUG: KASAN: double-free or invalid-free in kmem_cache_free+0x82/0x350 
> mm/slub.c:3173
>
> CPU: 0 PID: 15981 Comm: syz-executor.5 Not tainted 
> 5.10.0-rc3-next-20201113-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:120
>  print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:230
>  kasan_report_invalid_free+0x51/0x80 mm/kasan/report.c:355
>  kasan_slab_free+0x100/0x110 mm/kasan/common.c:352
>  kasan_slab_free include/linux/kasan.h:194 [inline]
>  slab_free_hook mm/slub.c:1548 [inline]
>  slab_free_freelist_hook+0x5d/0x150 mm/slub.c:1586
>  slab_free mm/slub.c:3157 [inline]
>  kmem_cache_free+0x82/0x350 mm/slub.c:3173
>  create_cache mm/slab_common.c:274 [inline]
>  kmem_cache_create_usercopy+0x2ab/0x300 mm/slab_common.c:357
>  p9_client_create+0xc4d/0x10c0 net/9p/client.c:1063
>  v9fs_session_init+0x1dd/0x1770 fs/9p/v9fs.c:406
>  v9fs_mount+0x79/0x9b0 fs/9p/vfs_super.c:126
>  legacy_get_tree+0x105/0x220 fs/fs_context.c:592
>  vfs_get_tree+0x89/0x2f0 fs/super.c:1549
>  do_new_mount fs/namespace.c:2896 [inline]
>  path_mount+0x12ae/0x1e70 fs/namespace.c:3227
>  do_mount fs/namespace.c:3240 [inline]
>  __do_sys_mount fs/namespace.c:3448 [inline]
>  __se_sys_mount fs/namespace.c:3425 [inline]
>  __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x45deb9
> Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 
> 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7fa358076c78 EFLAGS: 0246 ORIG_RAX: 00a5
> RAX: ffda RBX: 00021800 RCX: 0045deb9
> RDX: 2100 RSI: 2040 RDI: 
> RBP: 7fa358076ca0 R08: 2080 R09: 
> R10:  R11: 0246 R12: 001f
> R13: 7fff7dcf224f R14: 7fa3580779c0 R15: 0118bf2c
>
> Allocated by task 15981:
>  kasan_save_stack+0x1b/0x40 mm/kasan/common.c:39
>  kasan_set_track mm/kasan/common.c:47 [inline]
>  set_alloc_info mm/kasan/common.c:403 [inline]
>  kasan_kmalloc.constprop.0+0x82/0xa0 mm/kasan/common.c:434
>  kasan_slab_alloc include/linux/kasan.h:211 [inline]
>  slab_post_alloc_hook mm/slab.h:512 [inline]
>  slab_alloc_node mm/slub.c:2903 [inline]
>  slab_alloc mm/slub.c:2911 [inline]
>  kmem_cache_alloc+0x12a/0x470 mm/slub.c:2916
>  kmem_cache_zalloc include/linux/slab.h:672 [inline]
>  create_cache mm/slab_common.c:251 [inline]
>  kmem_cache_create_usercopy+0x1a6/0x300 mm/slab_common.c:357
>  p9_client_create+0xc4d/0x10c0 net/9p/client.c:1063
>  v9fs_session_init+0x1dd/0x1770 fs/9p/v9fs.c:406
>  v9fs_mount+0x79/0x9b0 fs/9p/vfs_super.c:126
>  legacy_get_tree+0x105/0x220 fs/fs_context.c:592
>  vfs_get_tree+0x89/0x2f0 fs/super.c:1549
>  do_new_mount fs/namespace.c:2896 [inline]
>  path_mount+0x12ae/0x1e70 fs/namespace.c:3227
>  do_mount fs/namespace.c:3240 [inline]
>  __do_sys_mount fs/namespace.c:3448 [inline]
>  __se_sys_mount fs/namespace.c:3425 [inline]
>  __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Freed by task 15981:
>  kasan_save_stack+0x1b/0x40 mm/kasan/common.c:39
>  kasan_set_track+0x1c/0x30 mm/kasan/common.c:47
>  kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:359
>  kasan_slab_free+0xe1/0x110 mm/kasan/common.c:373
>  kasan_slab_free include/linux/kasan.h:194 [inline]
>  

Re: [PATCH mm v3 05/19] kasan: allow VMAP_STACK for HW_TAGS mode

2020-11-16 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 11:20 PM 'Andrey Konovalov' via kasan-dev
 wrote:
>
> Even though hardware tag-based mode currently doesn't support checking
> vmalloc allocations, it doesn't use shadow memory and works with
> VMAP_STACK as is. Change VMAP_STACK definition accordingly.
>
> Signed-off-by: Andrey Konovalov 
> Reviewed-by: Marco Elver 

Reviewed-by: Dmitry Vyukov 

> Acked-by: Catalin Marinas 
> Link: 
> https://linux-review.googlesource.com/id/I3552cbc12321dec82cd7372676e9372a2eb452ac
> ---
>  arch/Kconfig | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 9ebdab3d0ca2..546869c3269d 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -921,16 +921,16 @@ config VMAP_STACK
> default y
> bool "Use a virtually-mapped stack"
> depends on HAVE_ARCH_VMAP_STACK
> -   depends on !KASAN || KASAN_VMALLOC
> +   depends on !KASAN || KASAN_HW_TAGS || KASAN_VMALLOC
> help
>   Enable this if you want the use virtually-mapped kernel stacks
>   with guard pages.  This causes kernel stack overflows to be
>   caught immediately rather than causing difficult-to-diagnose
>   corruption.
>
> - To use this with KASAN, the architecture must support backing
> - virtual mappings with real shadow memory, and KASAN_VMALLOC must
> - be enabled.
> + To use this with software KASAN modes, the architecture must support
> + backing virtual mappings with real shadow memory, and KASAN_VMALLOC
> + must be enabled.
>
>  config ARCH_OPTIONAL_KERNEL_RWX
> def_bool n
> --
> 2.29.2.299.gdc1121823c-goog
>


Re: [PATCH mm v3 04/19] kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK

2020-11-16 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 11:20 PM Andrey Konovalov  wrote:
>
> There's a config option CONFIG_KASAN_STACK that has to be enabled for
> KASAN to use stack instrumentation and perform validity checks for
> stack variables.
>
> There's no need to unpoison stack when CONFIG_KASAN_STACK is not enabled.
> Only call kasan_unpoison_task_stack[_below]() when CONFIG_KASAN_STACK is
> enabled.
>
> Note, that CONFIG_KASAN_STACK is an option that is currently always
> defined when CONFIG_KASAN is enabled, and therefore has to be tested
> with #if instead of #ifdef.
>
> Signed-off-by: Andrey Konovalov 
> Reviewed-by: Marco Elver 

Reviewed-by: Dmitry Vyukov 

> Acked-by: Catalin Marinas 
> Link: 
> https://linux-review.googlesource.com/id/If8a891e9fe01ea543e00b576852685afec0887e3
> ---
>  arch/arm64/kernel/sleep.S|  2 +-
>  arch/x86/kernel/acpi/wakeup_64.S |  2 +-
>  include/linux/kasan.h| 10 ++
>  mm/kasan/common.c|  2 ++
>  4 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
> index ba40d57757d6..bdadfa56b40e 100644
> --- a/arch/arm64/kernel/sleep.S
> +++ b/arch/arm64/kernel/sleep.S
> @@ -133,7 +133,7 @@ SYM_FUNC_START(_cpu_resume)
>  */
> bl  cpu_do_resume
>
> -#ifdef CONFIG_KASAN
> +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> mov x0, sp
> bl  kasan_unpoison_task_stack_below
>  #endif
> diff --git a/arch/x86/kernel/acpi/wakeup_64.S 
> b/arch/x86/kernel/acpi/wakeup_64.S
> index c8daa92f38dc..5d3a0b8fd379 100644
> --- a/arch/x86/kernel/acpi/wakeup_64.S
> +++ b/arch/x86/kernel/acpi/wakeup_64.S
> @@ -112,7 +112,7 @@ SYM_FUNC_START(do_suspend_lowlevel)
> movqpt_regs_r14(%rax), %r14
> movqpt_regs_r15(%rax), %r15
>
> -#ifdef CONFIG_KASAN
> +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> /*
>  * The suspend path may have poisoned some areas deeper in the stack,
>  * which we now need to unpoison.
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 0c89e6fdd29e..f2109bf0c5f9 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -76,8 +76,6 @@ static inline void kasan_disable_current(void) {}
>
>  void kasan_unpoison_range(const void *address, size_t size);
>
> -void kasan_unpoison_task_stack(struct task_struct *task);
> -
>  void kasan_alloc_pages(struct page *page, unsigned int order);
>  void kasan_free_pages(struct page *page, unsigned int order);
>
> @@ -122,8 +120,6 @@ void kasan_restore_multi_shot(bool enabled);
>
>  static inline void kasan_unpoison_range(const void *address, size_t size) {}
>
> -static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
> -
>  static inline void kasan_alloc_pages(struct page *page, unsigned int order) 
> {}
>  static inline void kasan_free_pages(struct page *page, unsigned int order) {}
>
> @@ -175,6 +171,12 @@ static inline size_t kasan_metadata_size(struct 
> kmem_cache *cache) { return 0; }
>
>  #endif /* CONFIG_KASAN */
>
> +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> +void kasan_unpoison_task_stack(struct task_struct *task);
> +#else
> +static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
> +#endif
> +
>  #ifdef CONFIG_KASAN_GENERIC
>
>  void kasan_cache_shrink(struct kmem_cache *cache);
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 0a420f1dbc54..7648a2452a01 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -64,6 +64,7 @@ void kasan_unpoison_range(const void *address, size_t size)
> unpoison_range(address, size);
>  }
>
> +#if CONFIG_KASAN_STACK
>  static void __kasan_unpoison_stack(struct task_struct *task, const void *sp)
>  {
> void *base = task_stack_page(task);
> @@ -90,6 +91,7 @@ asmlinkage void kasan_unpoison_task_stack_below(const void 
> *watermark)
>
> unpoison_range(base, watermark - base);
>  }
> +#endif /* CONFIG_KASAN_STACK */
>
>  void kasan_alloc_pages(struct page *page, unsigned int order)
>  {
> --
> 2.29.2.299.gdc1121823c-goog
>


Re: [PATCH RFC v2 04/21] kasan: unpoison stack only with CONFIG_KASAN_STACK

2020-11-16 Thread Dmitry Vyukov
On Thu, Oct 29, 2020 at 8:57 PM 'Andrey Konovalov' via kasan-dev
 wrote:
>
> On Tue, Oct 27, 2020 at 1:44 PM Dmitry Vyukov  wrote:
> >
> > On Thu, Oct 22, 2020 at 3:19 PM Andrey Konovalov  
> > wrote:
> > >
> > > There's a config option CONFIG_KASAN_STACK that has to be enabled for
> > > KASAN to use stack instrumentation and perform validity checks for
> > > stack variables.
> > >
> > > There's no need to unpoison stack when CONFIG_KASAN_STACK is not enabled.
> > > Only call kasan_unpoison_task_stack[_below]() when CONFIG_KASAN_STACK is
> > > enabled.
> > >
> > > Signed-off-by: Andrey Konovalov 
> > > Link: 
> > > https://linux-review.googlesource.com/id/If8a891e9fe01ea543e00b576852685afec0887e3
> > > ---
> > >  arch/arm64/kernel/sleep.S|  2 +-
> > >  arch/x86/kernel/acpi/wakeup_64.S |  2 +-
> > >  include/linux/kasan.h| 10 ++
> > >  mm/kasan/common.c|  2 ++
> > >  4 files changed, 10 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
> > > index ba40d57757d6..bdadfa56b40e 100644
> > > --- a/arch/arm64/kernel/sleep.S
> > > +++ b/arch/arm64/kernel/sleep.S
> > > @@ -133,7 +133,7 @@ SYM_FUNC_START(_cpu_resume)
> > >  */
> > > bl  cpu_do_resume
> > >
> > > -#ifdef CONFIG_KASAN
> > > +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> > > mov x0, sp
> > > bl  kasan_unpoison_task_stack_below
> > >  #endif
> > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S 
> > > b/arch/x86/kernel/acpi/wakeup_64.S
> > > index c8daa92f38dc..5d3a0b8fd379 100644
> > > --- a/arch/x86/kernel/acpi/wakeup_64.S
> > > +++ b/arch/x86/kernel/acpi/wakeup_64.S
> > > @@ -112,7 +112,7 @@ SYM_FUNC_START(do_suspend_lowlevel)
> > > movqpt_regs_r14(%rax), %r14
> > > movqpt_regs_r15(%rax), %r15
> > >
> > > -#ifdef CONFIG_KASAN
> > > +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> > > /*
> > >  * The suspend path may have poisoned some areas deeper in the 
> > > stack,
> > >  * which we now need to unpoison.
> > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> > > index 3f3f541e5d5f..7be9fb9146ac 100644
> > > --- a/include/linux/kasan.h
> > > +++ b/include/linux/kasan.h
> > > @@ -68,8 +68,6 @@ static inline void kasan_disable_current(void) {}
> > >
> > >  void kasan_unpoison_memory(const void *address, size_t size);
> > >
> > > -void kasan_unpoison_task_stack(struct task_struct *task);
> > > -
> > >  void kasan_alloc_pages(struct page *page, unsigned int order);
> > >  void kasan_free_pages(struct page *page, unsigned int order);
> > >
> > > @@ -114,8 +112,6 @@ void kasan_restore_multi_shot(bool enabled);
> > >
> > >  static inline void kasan_unpoison_memory(const void *address, size_t 
> > > size) {}
> > >
> > > -static inline void kasan_unpoison_task_stack(struct task_struct *task) {}
> > > -
> > >  static inline void kasan_alloc_pages(struct page *page, unsigned int 
> > > order) {}
> > >  static inline void kasan_free_pages(struct page *page, unsigned int 
> > > order) {}
> > >
> > > @@ -167,6 +163,12 @@ static inline size_t kasan_metadata_size(struct 
> > > kmem_cache *cache) { return 0; }
> > >
> > >  #endif /* CONFIG_KASAN */
> > >
> > > +#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
> >
> > && defined(CONFIG_KASAN_STACK) for consistency
>
> CONFIG_KASAN_STACK is different from other KASAN configs. It's always
> defined, and its value is what controls whether stack instrumentation
> is enabled.

Not sure why we did this instead of the following, but okay.

 config KASAN_STACK
-   int
-   default 1 if KASAN_STACK_ENABLE || CC_IS_GCC
-   default 0
+   bool
+   default y if KASAN_STACK_ENABLE || CC_IS_GCC
+   default n
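
For anyone following along, a small sketch of why the "int" form forces
#if rather than #ifdef. The two #define lines are assumed values picked
for illustration (KASAN on, stack instrumentation off), and the demo_*
function names are made up.

```c
/* Assumed configuration for illustration only. */
#define CONFIG_KASAN 1
#define CONFIG_KASAN_STACK 0	/* "int" option: always defined, 0 means off */

/* #ifdef only asks whether the macro exists, so a value of 0 still passes. */
int demo_stack_enabled_ifdef(void)
{
#ifdef CONFIG_KASAN_STACK
	return 1;
#else
	return 0;
#endif
}

/* #if evaluates the value, which is what the patch relies on. */
int demo_stack_enabled_if(void)
{
#if defined(CONFIG_KASAN) && CONFIG_KASAN_STACK
	return 1;
#else
	return 0;
#endif
}
```

With a "bool" option the macro is defined as 1 when the option is y and
is not defined at all when it is n, so a plain #ifdef would give the
expected answer there.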


Re: [PATCH v2 1/1] kasan: fix object remain in offline per-cpu quarantine

2020-11-16 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 7:30 AM Kuan-Ying Lee
 wrote:
>
> We hit this issue in our internal test.
> When enabling generic kasan, a kfree()'d object is put into per-cpu
> quarantine first. If the cpu goes offline, object still remains in
> the per-cpu quarantine. If we call kmem_cache_destroy() now, slub
> will report "Objects remaining" error.
>
> [   74.982625] 
> =
> [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in 
> test_module_slab on __kmem_cache_shutdown()
> [   74.984145] 
> -
> [   74.984145]
> [   74.984883] Disabling lock debugging due to kernel taint
> [   74.985561] INFO: Slab 0x(ptrval) objects=34 used=1 
> fp=0x(ptrval) flags=0x20010200
> [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: GB 
> 5.10.0-rc1-7-g4525c8781ec0-dirty #10
> [   74.987262] Hardware name: linux,dummy-virt (DT)
> [   74.987606] Call trace:
> [   74.987924]  dump_backtrace+0x0/0x2b0
> [   74.988296]  show_stack+0x18/0x68
> [   74.988698]  dump_stack+0xfc/0x168
> [   74.989030]  slab_err+0xac/0xd4
> [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> [   74.989779]  kmem_cache_destroy+0x68/0x130
> [   74.990176]  test_version_show+0x84/0xf0
> [   74.990679]  module_attr_show+0x40/0x60
> [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> [   74.991656]  kernfs_seq_show+0xa0/0xb8
> [   74.992059]  seq_read+0x1f0/0x7e8
> [   74.992415]  kernfs_fop_read+0x70/0x338
> [   74.993051]  vfs_read+0xe4/0x250
> [   74.993498]  ksys_read+0xc8/0x180
> [   74.993825]  __arm64_sys_read+0x44/0x58
> [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> [   74.994708]  do_el0_svc+0x38/0xa0
> [   74.995088]  el0_sync_handler+0x170/0x178
> [   74.995497]  el0_sync+0x174/0x180
> [   74.996050] INFO: Object 0x(ptrval) @offset=15848
> [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 
> pid=172
> [   75.000802]  stack_trace_save+0x9c/0xd0
> [   75.002420]  set_track+0x64/0xf0
> [   75.002770]  alloc_debug_processing+0x104/0x1a0
> [   75.003171]  ___slab_alloc+0x628/0x648
> [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> [   75.004757]  kmem_cache_alloc+0x560/0x588
> [   75.005376]  test_version_show+0x98/0xf0
> [   75.005756]  module_attr_show+0x40/0x60
> [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> [   75.007433]  kernfs_seq_show+0xa0/0xb8
> [   75.007800]  seq_read+0x1f0/0x7e8
> [   75.008128]  kernfs_fop_read+0x70/0x338
> [   75.008507]  vfs_read+0xe4/0x250
> [   75.008990]  ksys_read+0xc8/0x180
> [   75.009462]  __arm64_sys_read+0x44/0x58
> [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has 
> objects
>
> Register a cpu hotplug function to remove all objects in the offline
> per-cpu quarantine when cpu is going offline. Set a per-cpu variable
> to indicate this cpu is offline.
>
> Signed-off-by: Kuan-Ying Lee 
> Suggested-by: Dmitry Vyukov 
> Reported-by: Guangye Yang 
> Cc: Andrey Ryabinin 
> Cc: Alexander Potapenko 
> Cc: Andrew Morton 
> Cc: Matthias Brugger 
> ---
>  mm/kasan/quarantine.c | 35 +++
>  1 file changed, 35 insertions(+)
>
> diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> index 4c5375810449..16e618ea805e 100644
> --- a/mm/kasan/quarantine.c
> +++ b/mm/kasan/quarantine.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "../slab.h"
>  #include "kasan.h"
> @@ -43,6 +44,7 @@ struct qlist_head {
> struct qlist_node *head;
> struct qlist_node *tail;
> size_t bytes;
> +   bool offline;
>  };
>
>  #define QLIST_INIT { NULL, NULL, 0 }
> @@ -188,6 +190,11 @@ void quarantine_put(struct kasan_free_meta *info, struct 
> kmem_cache *cache)
> local_irq_save(flags);
>
> q = this_cpu_ptr(&cpu_quarantine);
> +   if (q->offline) {
> +   qlink_free(&info->quarantine_link, cache);
> +   local_irq_restore(flags);
> +   return;
> +   }
> qlist_put(q, &info->quarantine_link, cache->size);
> if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
> qlist_move_all(q, &temp);
> @@ -328,3 +335,31 @@ void quarantine_remove_cache(struct kmem_cache *cache)
>
> synchronize_srcu(&remove_cache_srcu);
>  }
> +
> +static int kasan_cpu_online(unsigned int cpu)
> +{
> +   this_cpu_ptr(&cpu_quarantine)->offline = false;
> +   return 0;
> +}
> +
> +
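
The idea in the patch description (free directly once the CPU is marked
offline, and drain whatever is already parked when it goes offline) can
be modelled in user space roughly as below. This is only a sketch under
assumptions: per-CPU data is a plain array indexed by a cpu number, there
is no locking or IRQ handling, all demo_* names are hypothetical, and
setting the flag before draining is a choice of this sketch rather than
something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 2

struct demo_node {
	struct demo_node *next;
};

/* Simplified stand-in for a per-CPU quarantine list. */
struct demo_qlist {
	struct demo_node *head;
	size_t count;
	bool offline;
};

static struct demo_qlist demo_quarantine[DEMO_NR_CPUS];
static size_t demo_freed;

/* Stands in for returning the object to the slab allocator. */
static void demo_really_free(struct demo_node *n)
{
	free(n);
	demo_freed++;
}

/* Free path: park the object unless this CPU's quarantine is offline. */
void demo_quarantine_put(unsigned int cpu, struct demo_node *n)
{
	struct demo_qlist *q = &demo_quarantine[cpu];

	if (q->offline) {
		demo_really_free(n);
		return;
	}
	n->next = q->head;
	q->head = n;
	q->count++;
}

/* "Going offline" callback: stop accepting objects, then drain the list. */
void demo_cpu_offline(unsigned int cpu)
{
	struct demo_qlist *q = &demo_quarantine[cpu];

	q->offline = true;
	while (q->head) {
		struct demo_node *n = q->head;

		q->head = n->next;
		q->count--;
		demo_really_free(n);
	}
}

/* "Coming online" callback: accept objects into the quarantine again. */
void demo_cpu_online(unsigned int cpu)
{
	demo_quarantine[cpu].offline = false;
}

size_t demo_quarantine_count(unsigned int cpu)
{
	return demo_quarantine[cpu].count;
}

size_t demo_freed_count(void)
{
	return demo_freed;
}
```

In this model nothing stays parked on an offline CPU, which is the
property a later kmem_cache_destroy() depends on.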

Re: Collecting both remote and "local" coverage with KCOV

2020-11-16 Thread Dmitry Vyukov
On Mon, Nov 16, 2020 at 3:39 AM Alexander Bulekov  wrote:
>
> Hello,
> I'm trying to collect coverage over the syscalls issued by my process,
> as well as the kthreads spawned as a result of these syscalls
> (eg coverage over vhost ioctls and the worker kthread). Is there a way
> to collect coverage with both KCOV_REMOTE_ENABLE(with common_handle) and
> KCOV_ENABLE, simultaneously?
>
> Based on the code it seems that these two modes are mutually
> exclusive within a single task, but I don't think this is mentioned in
> the Documentation, so I want to make sure I'm not missing something.

Hi Alex,

Yes, it's probably not supported within a single task. The easiest way
to verify is to try it ;)

It is possible to collect both coverages, but you will need 2 threads
(one just to set up remote KCOV).

Unless I am missing any fundamental limitations, I would say it would
be reasonable to support this within a single task as well.


Re: bpf test error: BUG: sleeping function called from invalid context in sta_info_move_state

2020-11-15 Thread Dmitry Vyukov
On Sat, Nov 14, 2020 at 9:42 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:96021828 MAINTAINERS/bpf: Update Andrii's entry.
> git tree:   bpf
> console output: https://syzkaller.appspot.com/x/log.txt?x=102717be50
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61033507391c77ff
> dashboard link: https://syzkaller.appspot.com/bug?extid=5921b7c1b10a0ddd02bc
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+5921b7c1b10a0ddd0...@syzkaller.appspotmail.com

#syz fix: mac80211: free sta in sta_info_insert_finish() on errors

> BUG: sleeping function called from invalid context at 
> net/mac80211/sta_info.c:1962
> in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 124, name: kworker/u4:3
> 4 locks held by kworker/u4:3/124:
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: 
> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: atomic64_set 
> include/asm-generic/atomic-instrumented.h:856 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: atomic_long_set 
> include/asm-generic/atomic-long.h:41 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: set_work_data 
> kernel/workqueue.c:616 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: 
> set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: 
> process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
>  #1: c9000132fda8 ((work_completion)(>work)){+.+.}-{0:0}, at: 
> process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
>  #2: 88801ab98d00 (>mtx){+.+.}-{3:3}, at: sdata_lock 
> net/mac80211/ieee80211_i.h:1021 [inline]
>  #2: 88801ab98d00 (>mtx){+.+.}-{3:3}, at: 
> ieee80211_ibss_work+0x93/0xe80 net/mac80211/ibss.c:1683
>  #3: 8b337160 (rcu_read_lock){}-{1:2}, at: sta_info_insert_finish 
> net/mac80211/sta_info.c:644 [inline]
>  #3: 8b337160 (rcu_read_lock){}-{1:2}, at: 
> sta_info_insert_rcu+0x680/0x2ba0 net/mac80211/sta_info.c:732
> Preemption disabled at:
> [] __mutex_lock_common kernel/locking/mutex.c:955 [inline]
> [] __mutex_lock+0x10f/0x10e0 kernel/locking/mutex.c:1103
> CPU: 0 PID: 124 Comm: kworker/u4:3 Not tainted 5.10.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue: phy4 ieee80211_iface_work
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  ___might_sleep.cold+0x1e8/0x22e kernel/sched/core.c:7298
>  sta_info_move_state+0x32/0x8d0 net/mac80211/sta_info.c:1962
>  sta_info_free+0x65/0x3b0 net/mac80211/sta_info.c:274
>  sta_info_insert_rcu+0x303/0x2ba0 net/mac80211/sta_info.c:738
>  ieee80211_ibss_finish_sta+0x212/0x390 net/mac80211/ibss.c:592
>  ieee80211_ibss_work+0x2c7/0xe80 net/mac80211/ibss.c:1700
>  ieee80211_iface_work+0x82e/0x970 net/mac80211/iface.c:1476
>  process_one_work+0x933/0x15a0 kernel/workqueue.c:2272
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2418
>  kthread+0x3af/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
>
> =
> [ BUG: Invalid wait context ]
> 5.10.0-rc2-syzkaller #0 Tainted: GW
> -
> kworker/u4:3/124 is trying to lock:
> 888035f2a9d0 (>chanctx_mtx){+.+.}-{3:3}, at: 
> ieee80211_recalc_min_chandef+0x49/0x140 net/mac80211/util.c:2740
> other info that might help us debug this:
> context-{4:4}
> 4 locks held by kworker/u4:3/124:
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: 
> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: atomic64_set 
> include/asm-generic/atomic-instrumented.h:856 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: atomic_long_set 
> include/asm-generic/atomic-long.h:41 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: set_work_data 
> kernel/workqueue.c:616 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: 
> set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>  #0: 888035f48938 ((wq_completion)phy4){+.+.}-{0:0}, at: 
> process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
>  #1: c9000132fda8 ((work_completion)(>work)){+.+.}-{0:0}, at: 
> process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
>  #2: 88801ab98d00 (>mtx){+.+.}-{3:3}, at: sdata_lock 
> net/mac80211/ieee80211_i.h:1021 [inline]
>  #2: 88801ab98d00 (>mtx){+.+.}-{3:3}, at: 
> ieee80211_ibss_work+0x93/0xe80 net/mac80211/ibss.c:1683
>  #3: 8b337160 (rcu_read_lock){}-{1:2}, at: sta_info_insert_finish 
> net/mac80211/sta_info.c:644 [inline]
>  #3: 8b337160 (rcu_read_lock){}-{1:2}, at: 
> 

Re: INFO: task hung in reboot

2020-11-14 Thread Dmitry Vyukov
On Sat, Nov 14, 2020 at 2:42 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:e7018751 usb: host: ehci-mxc: Remove the driver
> git tree:   
> https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-testing
> console output: https://syzkaller.appspot.com/x/log.txt?x=15d0c6a150
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a84e3059298aeb27
> dashboard link: https://syzkaller.appspot.com/bug?extid=9dec836197fea6892a28
> compiler:   gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+9dec836197fea6892...@syzkaller.appspotmail.com

It seems syzkaller has found another way to reboot the machine.
systemd initiated shutdown:
[  OK  ] Stopped Raise network interfaces.

but it's unclear why.

The kernel also hung:

[ 2599.057387][T1] systemd-shutdown[1]: Sending SIGTERM to
remaining processes...
[ 2599.704086][T1] systemd-shutdown[1]: Sending SIGKILL to
remaining processes...
[ 2600.107613][T1] systemd-shutdown[1]: All loop devices detached.
[ 2746.006716][ T1253] INFO: task systemd-shutdow:1 blocked for more
than 143 seconds.
[ 2746.014884][ T1253]   Not tainted 5.10.0-rc3-syzkaller #0

It seems to be due to some misbehaving device:

[ 2746.057796][ T1253]  schedule+0xcb/0x270
[ 2746.061900][ T1253]  wait_for_device_probe+0x1be/0x220
[ 2746.089974][ T1253]  device_shutdown+0x18/0x5c0
[ 2746.094669][ T1253]  __do_sys_reboot.cold+0x5d/0x97

Most likely some USB device, because there is a lot of USB-related
output before that.



> INFO: task systemd-shutdow:1 blocked for more than 143 seconds.
>   Not tainted 5.10.0-rc3-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:systemd-shutdow state:D stack:23432 pid:1 ppid: 0 
> flags:0x
> Call Trace:
>  context_switch kernel/sched/core.c:3774 [inline]
>  __schedule+0x8a2/0x1f30 kernel/sched/core.c:4523
>  schedule+0xcb/0x270 kernel/sched/core.c:4601
>  wait_for_device_probe+0x1be/0x220 drivers/base/dd.c:702
>  device_shutdown+0x18/0x5c0 drivers/base/core.c:4007
>  kernel_restart_prepare kernel/reboot.c:76 [inline]
>  kernel_restart kernel/reboot.c:246 [inline]
>  __do_sys_reboot.cold+0x5d/0x97 kernel/reboot.c:347
>  do_syscall_64+0x2d/0x40 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7fa389e6c8c6
> Code: Unable to access opcode bytes at RIP 0x7fa389e6c89c.
> RSP: 002b:7ffe1fb46e18 EFLAGS: 0206 ORIG_RAX: 00a9
> RAX: ffda RBX:  RCX: 7fa389e6c8c6
> RDX: 01234567 RSI: 28121969 RDI: fee1dead
> RBP: 7ffe1fb46ea8 R08: 2800 R09: 0005
> R10: 0002 R11: 0206 R12: 
> R13:  R14: 55c2d05dc150 R15: 7ffe1fb47198
>
> Showing all locks held in the system:
> 1 lock held by systemd-shutdow/1:
>  #0: 871288a8 (system_transition_mutex){+.+.}-{3:3}, at: 
> __do_sys_reboot+0x1a4/0x3e0 kernel/reboot.c:344
> 5 locks held by kworker/1:0/17:
> 1 lock held by khungtaskd/1253:
>  #0: 872492e0 (rcu_read_lock){}-{1:2}, at: 
> debug_show_all_locks+0x53/0x269 kernel/locking/lockdep.c:6253
>
> =
>
> NMI backtrace for cpu 1
> CPU: 1 PID: 1253 Comm: khungtaskd Not tainted 5.10.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  nmi_cpu_backtrace.cold+0x46/0xe0 lib/nmi_backtrace.c:105
>  nmi_trigger_cpumask_backtrace+0x1da/0x200 lib/nmi_backtrace.c:62
>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>  check_hung_uninterruptible_tasks kernel/hung_task.c:209 [inline]
>  watchdog+0xd32/0xf70 kernel/hung_task.c:294
>  kthread+0x38c/0x460 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0 skipped: idling at native_safe_halt 
> arch/x86/include/asm/irqflags.h:60 [inline]
> NMI backtrace for cpu 0 skipped: idling at arch_safe_halt 
> arch/x86/include/asm/irqflags.h:103 [inline]
> NMI backtrace for cpu 0 skipped: idling at acpi_safe_halt 
> drivers/acpi/processor_idle.c:111 [inline]
> NMI backtrace for cpu 0 skipped: idling at acpi_idle_do_entry+0x1c9/0x250 
> drivers/acpi/processor_idle.c:517
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Re: KASAN: use-after-free Write in afs_manage_cell

2020-11-14 Thread Dmitry Vyukov
On Sat, Nov 14, 2020 at 2:58 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 1d0e850a49a5b56f8f3cb51e74a11e2fedb96be6
> Author: David Howells 
> Date:   Fri Oct 16 12:21:14 2020 +
>
> afs: Fix cell removal
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15b78dba50
> start commit:   da690031 Merge branch 'i2c/for-current' of git://git.kerne..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=de7f697da23057c7
> dashboard link: https://syzkaller.appspot.com/bug?extid=f59c67285cb61166a0cf
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10960a8b90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17e938cf90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: afs: Fix cell removal
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: afs: Fix cell removal


Re: INFO: task can't die in nbd_ioctl

2020-11-13 Thread Dmitry Vyukov
On Tue, Nov 3, 2020 at 8:21 AM Ming Lei  wrote:
>
> On Sat, Oct 31, 2020 at 4:01 AM syzbot
>  wrote:
> >
> > syzbot has found a reproducer for the following issue on:
> >
> > HEAD commit:4e78c578 Add linux-next specific files for 20201030
> > git tree:   linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=158c179850
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=83318758268dc331
> > dashboard link: https://syzkaller.appspot.com/bug?extid=69a90a5e8f6b59086b2a
> > compiler:   gcc (GCC) 10.1.0-syz 20200507
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=15e051a850
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15bf75b850
>
> I could not reproduce this issue with the above C reproducer and the
> kernel config after hours of running
> on Linus' tree.

Let's see how reproducible this is by syzbot:

#syz test: 
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next-history.git
4e78c578


Re: kernel panic: stack is corrupted in get_kernel_gp_address

2020-11-13 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 9:27 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit a49145acfb975d921464b84fe00279f99827d816
> Author: George Kennedy 
> Date:   Tue Jul 7 19:26:03 2020 +
>
> fbmem: add margin check to fb_check_caps()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10ff757250
> start commit:   f4d51dff Linux 5.9-rc4
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
> dashboard link: https://syzkaller.appspot.com/bug?extid=d6459d8f8984c0929e54
> userspace arch: i386
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=164270dd90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: fbmem: add margin check to fb_check_caps()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: fbmem: add margin check to fb_check_caps()


Re: [PATCH 1/1] kasan: fix object remain in offline per-cpu quarantine

2020-11-12 Thread Dmitry Vyukov
On Fri, Nov 13, 2020 at 3:32 AM Kuan-Ying Lee
 wrote:
>
> On Thu, 2020-11-12 at 09:39 +0100, Dmitry Vyukov wrote:
> > On Thu, Nov 12, 2020 at 7:25 AM Kuan-Ying Lee
> >  wrote:
> > >
> > > We hit this issue in our internal test.
> > > When enabling generic kasan, a kfree()'d object is put into per-cpu
> > > quarantine first. If the cpu goes offline, object still remains in
> > > the per-cpu quarantine. If we call kmem_cache_destroy() now, slub
> > > will report "Objects remaining" error.
> > >
> > > [   74.982625] 
> > > =
> > > [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in 
> > > test_module_slab on __kmem_cache_shutdown()
> > > [   74.984145] 
> > > -
> > > [   74.984145]
> > > [   74.984883] Disabling lock debugging due to kernel taint
> > > [   74.985561] INFO: Slab 0x(ptrval) objects=34 used=1 
> > > fp=0x(ptrval) flags=0x20010200
> > > [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: GB 
> > > 5.10.0-rc1-7-g4525c8781ec0-dirty #10
> > > [   74.987262] Hardware name: linux,dummy-virt (DT)
> > > [   74.987606] Call trace:
> > > [   74.987924]  dump_backtrace+0x0/0x2b0
> > > [   74.988296]  show_stack+0x18/0x68
> > > [   74.988698]  dump_stack+0xfc/0x168
> > > [   74.989030]  slab_err+0xac/0xd4
> > > [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> > > [   74.989779]  kmem_cache_destroy+0x68/0x130
> > > [   74.990176]  test_version_show+0x84/0xf0
> > > [   74.990679]  module_attr_show+0x40/0x60
> > > [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> > > [   74.991656]  kernfs_seq_show+0xa0/0xb8
> > > [   74.992059]  seq_read+0x1f0/0x7e8
> > > [   74.992415]  kernfs_fop_read+0x70/0x338
> > > [   74.993051]  vfs_read+0xe4/0x250
> > > [   74.993498]  ksys_read+0xc8/0x180
> > > [   74.993825]  __arm64_sys_read+0x44/0x58
> > > [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> > > [   74.994708]  do_el0_svc+0x38/0xa0
> > > [   74.995088]  el0_sync_handler+0x170/0x178
> > > [   74.995497]  el0_sync+0x174/0x180
> > > [   74.996050] INFO: Object 0x(ptrval) @offset=15848
> > > [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 
> > > cpu=6 pid=172
> > > [   75.000802]  stack_trace_save+0x9c/0xd0
> > > [   75.002420]  set_track+0x64/0xf0
> > > [   75.002770]  alloc_debug_processing+0x104/0x1a0
> > > [   75.003171]  ___slab_alloc+0x628/0x648
> > > [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> > > [   75.004757]  kmem_cache_alloc+0x560/0x588
> > > [   75.005376]  test_version_show+0x98/0xf0
> > > [   75.005756]  module_attr_show+0x40/0x60
> > > [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> > > [   75.007433]  kernfs_seq_show+0xa0/0xb8
> > > [   75.007800]  seq_read+0x1f0/0x7e8
> > > [   75.008128]  kernfs_fop_read+0x70/0x338
> > > [   75.008507]  vfs_read+0xe4/0x250
> > > [   75.008990]  ksys_read+0xc8/0x180
> > > [   75.009462]  __arm64_sys_read+0x44/0x58
> > > [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> > > [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has 
> > > objects
> > >
> > > Register a cpu hotplug function to remove all objects in the offline
> > > per-cpu quarantine when cpu is going offline. Set a per-cpu variable
> > > to indicate this cpu is offline.
> > >
> > > Signed-off-by: Kuan-Ying Lee 
> > > ---
> > >  mm/kasan/quarantine.c | 59 +--
> > >  1 file changed, 57 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > > index 4c5375810449..67fb91ae2bd0 100644
> > > --- a/mm/kasan/quarantine.c
> > > +++ b/mm/kasan/quarantine.c
> > > @@ -29,6 +29,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #include "../slab.h"
> > >  #include "kasan.h"
> > > @@ -97,6 +98,7 @@ static void qlist_move_all(struct qlist_head *from, 
> > > struct qlist_head *to)
> > >   * guarded by quarantine_lock.
> > >   */
> >
> > Hi Kuan-Ying,
> >
> > Thanks for fixing this.

Re: Process-wide watchpoints

2020-11-12 Thread Dmitry Vyukov
On Thu, Nov 12, 2020 at 11:31 AM Peter Zijlstra  wrote:
>
> On Thu, Nov 12, 2020 at 08:46:23AM +0100, Dmitry Vyukov wrote:
>
> > for sampling race detection),
> > number of threads in the process can be up to, say, ~~10K and the
> > watchpoint is intended to be set for a very brief period of time
> > (~~few ms).
>
> Performance is a consideration here, doing lots of IPIs in such a short
> window, on potentially large machines is a DoS risk.
>
> > This can be done today with both perf_event_open and ptrace.
> > However, the problem is that both APIs work on a single thread level
> > (? perf_event_open can be inherited by children, but not for existing
> > siblings). So doing this would require iterating over, say, 10K
>
> One way would be to create the event before the process starts spawning
> threads and keeping it disabled. Then every thread will inherit it, but
> it'll be inactive.
>
> > I see at least one potential problem: what do we do if some sibling
> > thread already has all 4 watchpoints consumed?
>
> That would be immediately avoided by this, since it will have the
> watchpoint reserved per inheriting the event.
>
> Then you can do ioctl(PERF_EVENT_IOC_{MODIFY_ATTRIBUTES,ENABLE,DISABLE})
> to update the watch location and enable/disable it. This _will_ indeed
> result in a shitload of IPIs if the threads are active, but it should
> work.

Aha! That's the possibility I missed.
We will try to prototype this and get back with more questions if/when
we have them.
Thanks!
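
For reference, a minimal sketch of that scheme in C, assuming headers
that provide PERF_EVENT_IOC_MODIFY_ATTRIBUTES. The helper names
(init_watch_attr, open_inherited_watchpoint, retarget_watchpoint) are
made up for illustration, and the choice of a write-only, 8-byte
watchpoint is an assumption. Signal delivery is not covered here.

```c
#define _GNU_SOURCE
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill attr for a watchpoint that starts disabled and is inherited by
 * threads created after the event is opened. */
void init_watch_attr(struct perf_event_attr *attr, uint64_t addr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_BREAKPOINT;
    attr->size = sizeof(*attr);
    attr->bp_type = HW_BREAKPOINT_W;      /* assumption: watch writes only */
    attr->bp_addr = addr;
    attr->bp_len = HW_BREAKPOINT_LEN_8;
    attr->disabled = 1;                   /* keep it inactive until needed */
    attr->inherit = 1;                    /* new threads get a copy */
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/* Open the event on the calling thread (pid 0), any CPU, no group.
 * Meant to be called before the process starts spawning threads. */
int open_inherited_watchpoint(struct perf_event_attr *attr)
{
    return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0);
}

/* Move the watchpoint to new_addr and enable it.
 * Returns 0 on success, -1 on the first failing ioctl. */
int retarget_watchpoint(int fd, struct perf_event_attr *attr,
                        uint64_t new_addr)
{
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0))
        return -1;
    attr->bp_addr = new_addr;
    if (ioctl(fd, PERF_EVENT_IOC_MODIFY_ATTRIBUTES, attr))
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}
```

When done, ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) would make the event
inactive again. How the modification propagates to already inherited
copies depends on the kernel version, so treat this as something to
prototype rather than as known-good behavior.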


Re: WARNING in irqentry_exit

2020-11-12 Thread Dmitry Vyukov
On Thu, Nov 12, 2020 at 3:01 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 4d004099a668c41522242aa146a38cc4eb59cb1e
> Author: Peter Zijlstra 
> Date:   Fri Oct 2 09:04:21 2020 +
>
> lockdep: Fix lockdep recursion
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1202919a50
> start commit:   f873db9a Merge tag 'io_uring-5.9-2020-08-21' of git://git...
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a0437fdd630bee11
> dashboard link: https://syzkaller.appspot.com/bug?extid=d4336c84ed0099fdbe47
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=15312a6690
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13b0164190
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: lockdep: Fix lockdep recursion
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: lockdep: Fix lockdep recursion


Re: [PATCH 1/1] kasan: fix object remain in offline per-cpu quarantine

2020-11-12 Thread Dmitry Vyukov
On Thu, Nov 12, 2020 at 7:25 AM Kuan-Ying Lee
 wrote:
>
> We hit this issue in our internal test.
> When enabling generic kasan, a kfree()'d object is put into per-cpu
> quarantine first. If the cpu goes offline, object still remains in
> the per-cpu quarantine. If we call kmem_cache_destroy() now, slub
> will report "Objects remaining" error.
>
> [   74.982625] 
> =
> [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in 
> test_module_slab on __kmem_cache_shutdown()
> [   74.984145] 
> -
> [   74.984145]
> [   74.984883] Disabling lock debugging due to kernel taint
> [   74.985561] INFO: Slab 0x(ptrval) objects=34 used=1 
> fp=0x(ptrval) flags=0x20010200
> [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: GB 
> 5.10.0-rc1-7-g4525c8781ec0-dirty #10
> [   74.987262] Hardware name: linux,dummy-virt (DT)
> [   74.987606] Call trace:
> [   74.987924]  dump_backtrace+0x0/0x2b0
> [   74.988296]  show_stack+0x18/0x68
> [   74.988698]  dump_stack+0xfc/0x168
> [   74.989030]  slab_err+0xac/0xd4
> [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> [   74.989779]  kmem_cache_destroy+0x68/0x130
> [   74.990176]  test_version_show+0x84/0xf0
> [   74.990679]  module_attr_show+0x40/0x60
> [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> [   74.991656]  kernfs_seq_show+0xa0/0xb8
> [   74.992059]  seq_read+0x1f0/0x7e8
> [   74.992415]  kernfs_fop_read+0x70/0x338
> [   74.993051]  vfs_read+0xe4/0x250
> [   74.993498]  ksys_read+0xc8/0x180
> [   74.993825]  __arm64_sys_read+0x44/0x58
> [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> [   74.994708]  do_el0_svc+0x38/0xa0
> [   74.995088]  el0_sync_handler+0x170/0x178
> [   74.995497]  el0_sync+0x174/0x180
> [   74.996050] INFO: Object 0x(ptrval) @offset=15848
> [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 
> pid=172
> [   75.000802]  stack_trace_save+0x9c/0xd0
> [   75.002420]  set_track+0x64/0xf0
> [   75.002770]  alloc_debug_processing+0x104/0x1a0
> [   75.003171]  ___slab_alloc+0x628/0x648
> [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> [   75.004757]  kmem_cache_alloc+0x560/0x588
> [   75.005376]  test_version_show+0x98/0xf0
> [   75.005756]  module_attr_show+0x40/0x60
> [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> [   75.007433]  kernfs_seq_show+0xa0/0xb8
> [   75.007800]  seq_read+0x1f0/0x7e8
> [   75.008128]  kernfs_fop_read+0x70/0x338
> [   75.008507]  vfs_read+0xe4/0x250
> [   75.008990]  ksys_read+0xc8/0x180
> [   75.009462]  __arm64_sys_read+0x44/0x58
> [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has 
> objects
>
> Register a cpu hotplug function to remove all objects in the offline
> per-cpu quarantine when cpu is going offline. Set a per-cpu variable
> to indicate this cpu is offline.
>
> Signed-off-by: Kuan-Ying Lee 
> ---
>  mm/kasan/quarantine.c | 59 +--
>  1 file changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> index 4c5375810449..67fb91ae2bd0 100644
> --- a/mm/kasan/quarantine.c
> +++ b/mm/kasan/quarantine.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "../slab.h"
>  #include "kasan.h"
> @@ -97,6 +98,7 @@ static void qlist_move_all(struct qlist_head *from, struct 
> qlist_head *to)
>   * guarded by quarantine_lock.
>   */

Hi Kuan-Ying,

Thanks for fixing this.

>  static DEFINE_PER_CPU(struct qlist_head, cpu_quarantine);
> +static DEFINE_PER_CPU(int, cpu_quarantine_offline);

I think cpu_quarantine_offline would be better as part of cpu_quarantine,
because it logically belongs there and we already obtain a pointer to
cpu_quarantine in quarantine_put, so it will also make the code a bit
shorter.
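
To illustrate, here is a user-space model in C (not kernel code). The
list layout is simplified, malloc/free stand in for the slab calls, and
the function names with the _model suffix are made up. The idea is just
that the flag lives in the same per-cpu structure as the list, so one
pointer lookup covers both.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct qlist_node {
    struct qlist_node *next;
};

/* Per-cpu quarantine with the offline flag folded into it. */
struct qlist_head {
    struct qlist_node *head;
    struct qlist_node *tail;
    size_t bytes;
    bool offline;
};

static void qlist_put(struct qlist_head *q, struct qlist_node *n, size_t size)
{
    n->next = NULL;
    if (!q->head)
        q->head = n;
    else
        q->tail->next = n;
    q->tail = n;
    q->bytes += size;
}

/* Returns true if the object was quarantined, false if it was freed
 * right away because this cpu's quarantine is offline. */
bool quarantine_put_model(struct qlist_head *q, struct qlist_node *n,
                          size_t size)
{
    if (q->offline) {
        free(n);
        return false;
    }
    qlist_put(q, n, size);
    return true;
}

/* Model of the cpu-going-offline step: set the flag, then drain the
 * list. Returns the number of objects freed. */
size_t quarantine_offline_model(struct qlist_head *q)
{
    size_t freed = 0;

    q->offline = true;
    while (q->head) {
        struct qlist_node *n = q->head;

        q->head = n->next;
        free(n);
        freed++;
    }
    q->tail = NULL;
    q->bytes = 0;
    return freed;
}
```

With this shape, nothing can stay behind in an offline cpu's quarantine,
which is what kmem_cache_destroy() was tripping over.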


>  /* Round-robin FIFO array of batches. */
>  static struct qlist_head global_quarantine[QUARANTINE_BATCHES];
> @@ -176,6 +178,8 @@ void quarantine_put(struct kasan_free_meta *info, struct 
> kmem_cache *cache)
> unsigned long flags;
> struct qlist_head *q;
> struct qlist_head temp = QLIST_INIT;
> +   int *offline;
> +   struct qlist_head q_offline = QLIST_INIT;
>
> /*
>  * Note: irq must be disabled until after we move the batch to the
> @@ -187,8 +191,16 @@ void quarantine_put(struct kasan_free_meta *info, struct 
> kmem_cache *cache)
>  */
> local_irq_save(flags);
>
> -   q = this_cpu_ptr(&cpu_quarantine);
> -   qlist_put(q, &info->quarantine_link, cache->size);
> +   offline = this_cpu_ptr(&cpu_quarantine_offline);
> +   if (*offline == 0) {
> +   q = this_cpu_ptr(&cpu_quarantine);
> +   qlist_put(q, &info->quarantine_link, cache->size);
> +   } else {
> +   qlist_put(&q_offline, &info->quarantine_link, 

Process-wide watchpoints

2020-11-11 Thread Dmitry Vyukov
Hello perf maintainers,

I have a wish for a particular kernel functionality related to
watchpoints, and I would appreciate it if you could say how
feasible/complex it would be to add (mostly gluing existing infra pieces,
or redesigning and adding lots of new code), or maybe it exists
already and I am missing it.

You can think of the functionality as setting PROT_NONE, but for a few
bytes only, using watchpoints. On access, the accessing thread
should receive a signal (similar to SIGSEGV). Kernel copy_to/from_user
should not be affected (no EFAULT); I think this is already satisfied
for watchpoints. This functionality is also intended for production
environments (if you are interested -- for sampling race detection).
The number of threads in the process can be up to, say, ~~10K, and the
watchpoint is intended to be set for a very brief period of time
(~~few ms).

This can be done today with both perf_event_open and ptrace.
However, the problem is that both APIs work on a single thread level
(? perf_event_open can be inherited by children, but not for existing
siblings). So doing this would require iterating over, say, 10K
threads, calling perf_event_open, F_SETOWN, F_SETSIG, later close and
consuming 40K file descriptors.
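
For concreteness, the per-thread step looks roughly like the C sketch
below. watch_one_thread() is a made-up name; the 4-byte read/write
watchpoint and the sampling fields are assumptions, and F_SETOWN_EX with
F_OWNER_TID is used here as a variant of F_SETOWN that names one
specific thread. Whether a signal is actually raised can depend on
details not shown.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open one watchpoint event on thread `tid` and configure the descriptor
 * for signal-driven notification with `signo`.
 * Returns the new fd, or -1 on any failure. */
int watch_one_thread(pid_t tid, uint64_t addr, int signo)
{
    struct perf_event_attr attr;
    struct f_owner_ex owner;
    int fd, flags;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_BREAKPOINT;
    attr.size = sizeof(attr);
    attr.bp_type = HW_BREAKPOINT_RW;     /* assumption: reads and writes */
    attr.bp_addr = addr;
    attr.bp_len = HW_BREAKPOINT_LEN_4;
    attr.sample_period = 1;              /* assumption: notify on every hit */
    attr.wakeup_events = 1;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;

    /* One event per thread: pid = tid, any CPU, no group, no flags. */
    fd = (int)syscall(SYS_perf_event_open, &attr, tid, -1, -1, 0);
    if (fd < 0)
        return -1;

    owner.type = F_OWNER_TID;
    owner.pid = tid;
    flags = fcntl(fd, F_GETFL);
    if (flags < 0 ||
        fcntl(fd, F_SETFL, flags | O_ASYNC) < 0 ||
        fcntl(fd, F_SETSIG, signo) < 0 ||
        fcntl(fd, F_SETOWN_EX, &owner) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

That is one perf_event_open, three or four fcntl calls and later one
close per thread per watchpoint, which is where the cost for ~~10K
threads comes from.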

What I would like to have is a single syscall that does all of it for
the whole process (either sending IPIs to currently running siblings,
or maybe activating this only on the next sched in).

I see at least one potential problem: what do we do if some sibling
thread already has all 4 watchpoints consumed? We don't necessarily
want to iterate over all 10K threads synchronously, nor do we even want
to fail in this case. The intended use case is that mostly only this
feature will use watchpoints, so all threads will have an equal number of
available watchpoints. So perhaps the removal of the watchpoint could
just communicate that there were some threads that were not able to
install the watchpoint.

Does this make any sense? How feasible/complex would it be to add?

Thanks in advance


Re: KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

2020-11-11 Thread Dmitry Vyukov
On Mon, Nov 2, 2020 at 12:54 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
> git tree:   bpf
> console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c50
> kernel config:  https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
> dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
> compiler:   gcc (GCC) 10.1.0-syz 20200507
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10f4b03250
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1371a47c50
>
> The issue was bisected to:
>
> commit 9df1c28bb75217b244257152ab7d788bb2a386d0
> Author: Matt Mullins 
> Date:   Fri Apr 26 18:49:47 2019 +
>
> bpf: add writable context for raw tracepoints


We have a number of kernel memory corruptions related to bpf_trace_run now:
https://groups.google.com/g/syzkaller-bugs/search?q=kernel%2Ftrace%2Fbpf_trace.c

Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
Or should they not be able to?

Looking at the description of Matt's commit, it seems that corruptions
should not be possible (bounded buffer, checked size, etc). Does that
mean it's a real kernel bug?



> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=12b6c4da50
> final oops: https://syzkaller.appspot.com/x/report.txt?x=11b6c4da50
> console output: https://syzkaller.appspot.com/x/log.txt?x=16b6c4da50
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+d29e58bb557324e55...@syzkaller.appspotmail.com
> Fixes: 9df1c28bb752 ("bpf: add writable context for raw tracepoints")
>
> ==
> BUG: KASAN: vmalloc-out-of-bounds in __bpf_trace_run 
> kernel/trace/bpf_trace.c:2045 [inline]
> BUG: KASAN: vmalloc-out-of-bounds in bpf_trace_run3+0x3e0/0x3f0 
> kernel/trace/bpf_trace.c:2083
> Read of size 8 at addr c9e6c030 by task kworker/0:3/3754
>
> CPU: 0 PID: 3754 Comm: kworker/0:3 Not tainted 5.9.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Workqueue:  0x0 (events)
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:118
>  print_address_description.constprop.0.cold+0x5/0x4c8 mm/kasan/report.c:385
>  __kasan_report mm/kasan/report.c:545 [inline]
>  kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
>  __bpf_trace_run kernel/trace/bpf_trace.c:2045 [inline]
>  bpf_trace_run3+0x3e0/0x3f0 kernel/trace/bpf_trace.c:2083
>  __bpf_trace_sched_switch+0xdc/0x120 include/trace/events/sched.h:138
>  __traceiter_sched_switch+0x64/0xb0 include/trace/events/sched.h:138
>  trace_sched_switch include/trace/events/sched.h:138 [inline]
>  __schedule+0xeb8/0x2130 kernel/sched/core.c:4520
>  schedule+0xcf/0x270 kernel/sched/core.c:4601
>  worker_thread+0x14c/0x1120 kernel/workqueue.c:2439
>  kthread+0x3af/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
>
>
> Memory state around the buggy address:
>  c9e6bf00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>  c9e6bf80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> >c9e6c000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>  ^
>  c9e6c080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>  c9e6c100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> ==
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this issue, for details see:
> https://goo.gl/tpsmEJ#testing-patches


Re: WARNING in wp_page_copy

2020-11-11 Thread Dmitry Vyukov
On Tue, Mar 24, 2020 at 3:47 AM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit c3e5ea6ee574ae5e845a40ac8198de1fb63bb3ab
> Author: Kirill A. Shutemov 
> Date:   Fri Mar 6 06:28:32 2020 +
>
> mm: avoid data corruption on CoW fault into PFN-mapped VMA
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1170c813e0
> start commit:   e31736d9 Merge tag 'nios2-v5.5-rc2' of git://git.kernel.or..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=79f79de2a27d3e3d
> dashboard link: https://syzkaller.appspot.com/bug?extid=9301f2f33873407d5b33
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10fd9fb1e0
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: mm: avoid data corruption on CoW fault into PFN-mapped VMA
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: mm: avoid data corruption on CoW fault into PFN-mapped VMA


Re: inconsistent lock state in icmp_send

2020-11-11 Thread Dmitry Vyukov
On Mon, May 25, 2020 at 12:19 PM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit 1378817486d6860f6a927f573491afe65287abf1
> Author: Eric Dumazet 
> Date:   Thu May 21 18:29:58 2020 +
>
> tipc: block BH before using dst_cache
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10cbef0610
> start commit:   f5d58277 Merge branch 'for-linus' of git://git.kernel.org/..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c8970c89a0efbb23
> dashboard link: https://syzkaller.appspot.com/bug?extid=251ec6887ada6eac4921
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10ab6ba340
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: tipc: block BH before using dst_cache
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: tipc: block BH before using dst_cache


Re: INFO: task hung in ctrl_getfamily

2020-11-11 Thread Dmitry Vyukov
On Mon, Sep 14, 2020 at 12:43 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 47733f9daf4fe4f7e0eb9e273f21ad3a19130487
> Author: Cong Wang 
> Date:   Sat Aug 15 23:29:15 2020 +
>
> tipc: fix uninit skb->data in tipc_nl_compat_dumpit()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13f287b390
> start commit:   f5d58277 Merge branch 'for-linus' of git://git.kernel.org/..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c8970c89a0efbb23
> dashboard link: https://syzkaller.appspot.com/bug?extid=36edb5cac286af8e3385
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=139f101b40
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: tipc: fix uninit skb->data in tipc_nl_compat_dumpit()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: tipc: fix uninit skb->data in tipc_nl_compat_dumpit()


Re: WARNING: refcount bug in p9_req_put

2020-11-11 Thread Dmitry Vyukov
On Sat, Aug 15, 2020 at 7:23 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit a39c46067c845a8a2d7144836e9468b7f072343e
> Author: Christoph Hellwig 
> Date:   Fri Jul 10 08:57:22 2020 +
>
> net/9p: validate fds in p9_fd_open
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1015f01290
> start commit:   459e3a21 gcc-9: properly declare the {pv,hv}clock_page sto..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=ef1b87b455c397cf
> dashboard link: https://syzkaller.appspot.com/bug?extid=edec7868af5997928fe9
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1642ee48a0
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: net/9p: validate fds in p9_fd_open
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: net/9p: validate fds in p9_fd_open


Re: possible deadlock in mnt_want_write

2020-11-11 Thread Dmitry Vyukov
On Sat, Nov 7, 2020 at 1:10 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 146d62e5a5867fbf84490d82455718bfb10fe824
> Author: Amir Goldstein 
> Date:   Thu Apr 18 14:42:08 2019 +
>
> ovl: detect overlapping layers
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11e4018450
> start commit:   6d906f99 Merge tag 'arm64-fixes' of git://git.kernel.org/p..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=856fc6d0fbbeede9
> dashboard link: https://syzkaller.appspot.com/bug?extid=ae82084b07d0297e566b
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=111767b720
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1611ab2d20
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: ovl: detect overlapping layers
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: ovl: detect overlapping layers


Re: INFO: rcu detected stall in do_swap_page

2020-11-11 Thread Dmitry Vyukov
On Thu, Nov 7, 2019 at 3:25 PM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit 32aaf0553df99cc4314f6e9f43216cd83afc6c20
> Author: Pengfei Li 
> Date:   Mon Sep 23 22:36:58 2019 +
>
>  mm/compaction.c: remove unnecessary zone parameter in isolate_migratepages()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=159dcaaae0
> start commit:   c6dd78fc Merge branch 'x86-urgent-for-linus' of git://git...
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3c8985c08e1f9727
> dashboard link: https://syzkaller.appspot.com/bug?extid=2a89b1fb6539ff150c16
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1456f9d060
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: mm/compaction.c: remove unnecessary zone parameter in isolate_migratepages()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

This does not look like the right commit. The bug happened only twice,
a long time ago, so:

#syz invalid
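
For readers who want to tally the replies in this archive, the commands seen above (`#syz fix: <commit title>` and `#syz invalid`) can be pulled out of a message body with a few lines of Python. This is only a minimal sketch: the function name `extract_syz_commands` and the regular expression are assumptions made for illustration, not syzbot's own parser, and a command whose title is wrapped onto the following line comes back with an empty argument.

```python
import re

# Hypothetical pattern (an assumption, not syzbot's parser): a line that
# starts with "#syz", then a command word, an optional colon, and the rest
# of the line as the argument.
SYZ_COMMAND = re.compile(r"^#syz\s+([a-z-]+):?[ \t]*(.*)$")


def extract_syz_commands(body: str) -> list[tuple[str, str]]:
    """Return (command, argument) pairs from the unquoted lines of a reply."""
    commands = []
    for line in body.splitlines():
        # Quoted lines (starting with ">") only repeat syzbot's suggestion;
        # they are not commands issued by the person replying.
        if line.startswith(">"):
            continue
        match = SYZ_COMMAND.match(line.strip())
        if match:
            commands.append((match.group(1), match.group(2).strip()))
    return commands
```

Skipping quoted lines matters here because every syzbot message already contains the suggested `#syz fix:` line inside the quote, so counting those would double every reply.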


Re: INFO: task hung in flush_to_ldisc

2020-11-11 Thread Dmitry Vyukov
On Tue, Mar 17, 2020 at 10:43 AM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit e8c75a30a23c6ba63f4ef6895cbf41fd42f21aa2
> Author: Jiri Slaby 
> Date:   Fri Feb 28 11:54:06 2020 +
>
> vt: selection, push sel_lock up
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17bd23e3e0
> start commit:   07c4b9e9 Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=79f79de2a27d3e3d
> dashboard link: https://syzkaller.appspot.com/bug?extid=e199b43b49192126ff69
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=11a4efdae0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11ccf946e0
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: vt: selection, push sel_lock up
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: vt: selection, push sel_lock up


Re: WARNING in percpu_ref_exit (2)

2020-11-11 Thread Dmitry Vyukov
On Wed, Nov 11, 2020 at 4:09 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit c1e2148f8ecb26863b899d402a823dab8e26efd1
> Author: Jens Axboe 
> Date:   Wed Mar 4 14:25:50 2020 +
>
> io_uring: free fixed_file_data after RCU grace period
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=161ea46e50
> start commit:   63849c8f Merge tag 'linux-kselftest-5.6-rc5' of git://git...
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=4527d1e2fb19fd5c
> dashboard link: https://syzkaller.appspot.com/bug?extid=8c4a14856e657b43487c
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13c30061e0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1251b731e0
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: io_uring: free fixed_file_data after RCU grace period
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: io_uring: free fixed_file_data after RCU grace period


Re: INFO: task hung in htable_put

2020-11-11 Thread Dmitry Vyukov
On Fri, Mar 20, 2020 at 5:42 AM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit 99b79c3900d4627672c85d9f344b5b0f06bc2a4d
> Author: Cong Wang 
> Date:   Thu Feb 13 06:53:52 2020 +
>
> netfilter: xt_hashlimit: unregister proc file before releasing mutex
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17446eb1e0
> start commit:   f2850dd5 Merge tag 'kbuild-fixes-v5.6' of git://git.kernel..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=735296e4dd620b10
> dashboard link: https://syzkaller.appspot.com/bug?extid=84936245a918e2cddb32
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17a96c29e0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12fcc65ee0
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: netfilter: xt_hashlimit: unregister proc file before releasing mutex
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: netfilter: xt_hashlimit: unregister proc file before releasing mutex


Re: WARNING: refcount bug in l2cap_chan_put

2020-11-11 Thread Dmitry Vyukov
On Sun, Sep 6, 2020 at 3:07 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit b83764f9220a4a14525657466f299850bbc98de9
> Author: Miao-chen Chou 
> Date:   Tue Jun 30 03:15:00 2020 +
>
> Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11aaff5d90
> start commit:   fffe3ae0 Merge tag 'for-linus-hmm' of git://git.kernel.org..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=18bb86f2e4ebfda2
> dashboard link: https://syzkaller.appspot.com/bug?extid=198362c76088d1515529
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=152a482c90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=109b781a90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()


Re: KASAN: use-after-free Write in refcount_warn_saturate

2020-11-11 Thread Dmitry Vyukov
On Fri, Sep 4, 2020 at 4:44 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit b83764f9220a4a14525657466f299850bbc98de9
> Author: Miao-chen Chou 
> Date:   Tue Jun 30 03:15:00 2020 +
>
> Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10f92e3e90
> start commit:   c0842fbc random32: move the pseudo-random 32-bit definitio..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=cf567e8c7428377e
> dashboard link: https://syzkaller.appspot.com/bug?extid=7dd7f2f77a7a01d1dc14
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=15b606dc90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=123e87cc90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()


Re: INFO: trying to register non-static key in uhid_dev_destroy

2020-11-11 Thread Dmitry Vyukov
On Tue, Oct 6, 2020 at 6:54 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit bce1305c0ece3dc549663605e567655dd701752c
> Author: Marc Zyngier 
> Date:   Sat Aug 29 11:26:01 2020 +
>
> HID: core: Correctly handle ReportSize being zero
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10b82f5050
> start commit:   152036d1 Merge tag 'nfsd-5.7-rc-2' of git://git.linux-nfs...
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=efdde85c3af536b5
> dashboard link: https://syzkaller.appspot.com/bug?extid=0c601d7fbb8122d39093
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10ebad0c10
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d6c21c10
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: HID: core: Correctly handle ReportSize being zero
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: HID: core: Correctly handle ReportSize being zero


Re: INFO: trying to register non-static key in uhid_char_release

2020-11-11 Thread Dmitry Vyukov
On Wed, Oct 7, 2020 at 7:01 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit bce1305c0ece3dc549663605e567655dd701752c
> Author: Marc Zyngier 
> Date:   Sat Aug 29 11:26:01 2020 +
>
> HID: core: Correctly handle ReportSize being zero
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=12d1937050
> start commit:   1127b219 Merge tag 'fallthrough-fixes-5.9-rc3' of git://gi..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=891ca5711a9f1650
> dashboard link: https://syzkaller.appspot.com/bug?extid=8357fbef0d7bb602de45
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=102c472e90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1308105690
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: HID: core: Correctly handle ReportSize being zero
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: HID: core: Correctly handle ReportSize being zero


Re: general protection fault in tcf_action_destroy (2)

2020-11-11 Thread Dmitry Vyukov
On Wed, Apr 29, 2020 at 5:03 AM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit 0d1c3530e1bd38382edef72591b78e877e0edcd3
> Author: Cong Wang 
> Date:   Thu Mar 12 05:42:28 2020 +
>
> net_sched: keep alloc_hash updated after hash allocation
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15e7415410
> start commit:   67d584e3 Merge tag 'for-5.6-rc6-tag' of git://git.kernel.o..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=9f894bd92023de02
> dashboard link: https://syzkaller.appspot.com/bug?extid=92a80fff3b3af6c4464e
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=160c3223e0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14f790ade0
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: net_sched: keep alloc_hash updated after hash allocation
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: net_sched: keep alloc_hash updated after hash allocation


Re: KASAN: use-after-free Write in tcindex_change

2020-11-11 Thread Dmitry Vyukov
On Fri, Apr 17, 2020 at 9:05 PM syzbot
 wrote:
>
> syzbot suspects this bug was fixed by commit:
>
> commit 0d1c3530e1bd38382edef72591b78e877e0edcd3
> Author: Cong Wang 
> Date:   Thu Mar 12 05:42:28 2020 +
>
> net_sched: keep alloc_hash updated after hash allocation
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15a956d7e0
> start commit:   ac309e77 Merge branch 'for-linus' of git://git.kernel.org/..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
> dashboard link: https://syzkaller.appspot.com/bug?extid=ba4bcf1563f90386910f
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1771b973e0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1248a61de0
>
> If the result looks correct, please mark the bug fixed by replying with:
>
> #syz fix: net_sched: keep alloc_hash updated after hash allocation
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: net_sched: keep alloc_hash updated after hash allocation


Re: WARNING: refcount bug in do_enable_set

2020-11-11 Thread Dmitry Vyukov
On Sun, Sep 6, 2020 at 7:31 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit b83764f9220a4a14525657466f299850bbc98de9
> Author: Miao-chen Chou 
> Date:   Tue Jun 30 03:15:00 2020 +
>
> Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=115a424590
> start commit:   fffe3ae0 Merge tag 'for-linus-hmm' of git://git.kernel.org..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=226c7a97d80bec54
> dashboard link: https://syzkaller.appspot.com/bug?extid=2e9900a1e1b3c9c96a77
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12b3efea90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1113128490
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: Bluetooth: Fix kernel oops triggered by hci_adv_monitors_clear()


Re: WARNING: suspicious RCU usage in ctrl_cmd_new_lookup

2020-11-11 Thread Dmitry Vyukov
On Thu, Oct 22, 2020 at 2:40 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit a7809ff90ce6c48598d3c4ab54eb599bec1e9c42
> Author: Manivannan Sadhasivam 
> Date:   Sat Sep 26 16:56:25 2020 +
>
> net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=143867c850
> start commit:   7ae77150 Merge tag 'powerpc-5.8-1' of git://git.kernel.org..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=d195fe572fb15312
> dashboard link: https://syzkaller.appspot.com/bug?extid=3025b9294f8cb0ede850
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=11802cf910
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=144acc0310
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks


Re: WARNING in rxrpc_recvmsg

2020-11-11 Thread Dmitry Vyukov
On Thu, Aug 6, 2020 at 5:25 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 65550098c1c4db528400c73acf3e46bfa78d9264
> Author: David Howells 
> Date:   Tue Jul 28 23:03:56 2020 +
>
> rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10bd3bcc90
> start commit:   7cc2a8ea Merge tag 'block-5.8-2020-07-01' of git://git.ker..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=7be693511b29b338
> dashboard link: https://syzkaller.appspot.com/bug?extid=1a68d5c4e74edea44294
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17a5022f10
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=150932a710
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: rxrpc: Fix race between recvmsg and sendmsg on immediate call failure


Re: KASAN: use-after-free Read in __cfg802154_wpan_dev_from_attrs (2)

2020-11-11 Thread Dmitry Vyukov
On Thu, Aug 6, 2020 at 9:00 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit bf64ff4c2aac65d680dc639a511c781cf6b6ec08
> Author: Cong Wang 
> Date:   Sat Jun 27 07:12:24 2020 +
>
> genetlink: get rid of family->attrbuf
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1206949490
> start commit:   e44f65fd xen-netfront: remove redundant assignment to vari..
> git tree:   net-next
> kernel config:  https://syzkaller.appspot.com/x/.config?x=829871134ca5e230
> dashboard link: https://syzkaller.appspot.com/bug?extid=14e0e4960091ffae7cf7
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=11818aa710
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10f997d310
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: genetlink: get rid of family->attrbuf
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: genetlink: get rid of family->attrbuf


Re: BUG: corrupted list in kobject_add_internal

2020-11-11 Thread Dmitry Vyukov
On Sun, Nov 8, 2020 at 11:55 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit a46b7ed4d52d09bd6c7ab53b2217d04fc2f02c65
> Author: Sonny Sasaka 
> Date:   Fri Aug 14 19:09:09 2020 +
>
> Bluetooth: Fix auto-creation of hci_conn at Conn Complete event
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13d7579250
> start commit:   d6efb3ac Merge tag 'tty-5.9-rc1' of git://git.kernel.org/p..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=ff87594cecb7e666
> dashboard link: https://syzkaller.appspot.com/bug?extid=dd768a260f7358adbaf9
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=105054aa90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16ab697690
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: Bluetooth: Fix auto-creation of hci_conn at Conn Complete event
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: Bluetooth: Fix auto-creation of hci_conn at Conn Complete event


Re: INFO: task hung in io_uring_flush

2020-11-11 Thread Dmitry Vyukov
On Thu, Sep 17, 2020 at 3:42 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit b7ddce3cbf010edbfac6c6d8cc708560a7bcd7a4
> Author: Pavel Begunkov 
> Date:   Sat Sep 5 21:45:14 2020 +
>
> io_uring: fix cancel of deferred reqs with ->files
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=173d9b0d90
> start commit:   9123e3a7 Linux 5.9-rc1
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3d400a47d1416652
> dashboard link: https://syzkaller.appspot.com/bug?extid=6338dcebf269a590b668
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1573f11690
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=144d307290
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: io_uring: fix cancel of deferred reqs with ->files
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: io_uring: fix cancel of deferred reqs with ->files


Re: general protection fault in rt6_fill_node

2020-11-11 Thread Dmitry Vyukov
On Thu, Oct 1, 2020 at 8:46 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit eeaac3634ee0e3f35548be35275efeca888e9b23
> Author: Nikolay Aleksandrov 
> Date:   Sat Aug 22 12:06:36 2020 +
>
> net: nexthop: don't allow empty NHA_GROUP
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=12beed5b90
> start commit:   c3d8f220 Merge tag 'kbuild-fixes-v5.9' of git://git.kernel..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a0437fdd630bee11
> dashboard link: https://syzkaller.appspot.com/bug?extid=81af6e9b3c4b8bc874f8
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=13ff853990
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=143f3a9690
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: net: nexthop: don't allow empty NHA_GROUP
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: net: nexthop: don't allow empty NHA_GROUP


Re: WARNING: ODEBUG bug in exit_to_user_mode_prepare

2020-11-11 Thread Dmitry Vyukov
On Fri, Aug 28, 2020 at 5:08 PM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    d012a719 Linux 5.9-rc2
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15e9e90e90
> kernel config:  https://syzkaller.appspot.com/x/.config?x=978db74cb30aa994
> dashboard link: https://syzkaller.appspot.com/bug?extid=fbd7ba7207767ed15165
> compiler:   gcc (GCC) 10.1.0-syz 20200507
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=12f8166690
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15abb10e90
>
> The issue was bisected to:
>
> commit a9ed4a6560b8562b7e2e2bed9527e88001f7b682
> Author: Marc Zyngier 
> Date:   Wed Aug 19 16:12:17 2020 +
>
> epoll: Keep a reference on files added to the check list
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=130823a990
> final oops: https://syzkaller.appspot.com/x/report.txt?x=108823a990
> console output: https://syzkaller.appspot.com/x/log.txt?x=170823a990

#syz fix:
fix regression in "epoll: Keep a reference on files added to the check list"

> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+fbd7ba7207767ed15...@syzkaller.appspotmail.com
> Fixes: a9ed4a6560b8 ("epoll: Keep a reference on files added to the check list")
>
> [ cut here ]
> ODEBUG: free active (active state 1) object type: rcu_head hint: 0x0
> WARNING: CPU: 1 PID: 10170 at lib/debugobjects.c:485 debug_print_object+0x160/0x250 lib/debugobjects.c:485
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 1 PID: 10170 Comm: syz-executor403 Not tainted 5.9.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x18f/0x20d lib/dump_stack.c:118
>  panic+0x2e3/0x75c kernel/panic.c:231
>  __warn.cold+0x20/0x4a kernel/panic.c:600
>  report_bug+0x1bd/0x210 lib/bug.c:198
>  handle_bug+0x38/0x90 arch/x86/kernel/traps.c:234
>  exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:254
>  asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536
> RIP: 0010:debug_print_object+0x160/0x250 lib/debugobjects.c:485
> Code: dd e0 26 94 88 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 bf 00 00 00 48 8b 14 dd e0 26 94 88 48 c7 c7 40 1c 94 88 e8 82 3b a6 fd <0f> 0b 83 05 83 54 13 07 01 48 83 c4 20 5b 5d 41 5c 41 5d c3 48 89
> RSP: 0018:c9000dabfdd0 EFLAGS: 00010082
> RAX:  RBX: 0003 RCX: 
> RDX: 888093ada540 RSI: 815dafc7 RDI: f52001b57fac
> RBP: 0001 R08: 0001 R09: 8880ae720f8b
> R10:  R11:  R12: 89bd6780
> R13:  R14: dead0100 R15: dc00
>  __debug_check_no_obj_freed lib/debugobjects.c:967 [inline]
>  debug_check_no_obj_freed+0x301/0x41c lib/debugobjects.c:998
>  kmem_cache_free.part.0+0x16d/0x1f0 mm/slab.c:3692
>  task_work_run+0xdd/0x190 kernel/task_work.c:141
>  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:140 [inline]
>  exit_to_user_mode_prepare+0x195/0x1c0 kernel/entry/common.c:167
>  syscall_exit_to_user_mode+0x59/0x2b0 kernel/entry/common.c:242
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x447849
> Code: e8 9c e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 04 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7f37799a7db8 EFLAGS: 0246 ORIG_RAX: 00e9
> RAX:  RBX: 006ddc68 RCX: 00447849
> RDX: 0003 RSI: 0001 RDI: 0004
> RBP: 006ddc60 R08:  R09: 
> R10: 2000 R11: 0246 R12: 006ddc6c
> R13: 7ffcdbf7f6bf R14: 7f37799a89c0 R15: 
> Shutting down cpus with NMI
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this issue, for details see:
> https://goo.gl/tpsmEJ#testing-patches
>

Re: general protection fault in nexthop_is_blackhole

2020-11-11 Thread Dmitry Vyukov
On Thu, Oct 1, 2020 at 5:34 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit eeaac3634ee0e3f35548be35275efeca888e9b23
> Author: Nikolay Aleksandrov 
> Date:   Sat Aug 22 12:06:36 2020 +
>
> net: nexthop: don't allow empty NHA_GROUP
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=116177a790
> start commit:   c3d8f220 Merge tag 'kbuild-fixes-v5.9' of git://git.kernel..
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=bb68b9e8a8cc842f
> dashboard link: https://syzkaller.appspot.com/bug?extid=b2c08a2f5cfef635cc3a
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14d75e3990
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12aea51990
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: net: nexthop: don't allow empty NHA_GROUP
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection


#syz fix: net: nexthop: don't allow empty NHA_GROUP


Re: KASAN: use-after-free Read in delete_partition

2020-11-11 Thread Dmitry Vyukov
On Thu, Oct 8, 2020 at 5:38 AM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 08fc1ab6d748ab1a690fd483f41e2938984ce353
> Author: Christoph Hellwig 
> Date:   Tue Sep 1 09:59:41 2020 +
>
> block: fix locking in bdev_del_partition
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1259b1e790
> start commit:   f75aef39 Linux 5.9-rc3
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3c5f6ce8d5b68299
> dashboard link: https://syzkaller.appspot.com/bug?extid=b8639c8dcb5ec4483d4f
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=15c43c7990
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=173dfa1e90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: block: fix locking in bdev_del_partition
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: block: fix locking in bdev_del_partition


Re: INFO: rcu detected stall in exit_group

2020-11-11 Thread Dmitry Vyukov
On Mon, Nov 9, 2020 at 12:03 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 1d0e850a49a5b56f8f3cb51e74a11e2fedb96be6
> Author: David Howells 
> Date:   Fri Oct 16 12:21:14 2020 +
>
> afs: Fix cell removal
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14b65c3a50
> start commit:   34d4ddd3 Merge tag 'linux-kselftest-5.9-rc5' of git://git...
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
> dashboard link: https://syzkaller.appspot.com/bug?extid=1a14a0f8ce1a06d4415f
> userspace arch: i386
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10c6642d90
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=132d00fd90
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: afs: Fix cell removal
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: afs: Fix cell removal


Re: WARNING in syscall_exit_to_user_mode

2020-11-11 Thread Dmitry Vyukov
On Sun, Nov 8, 2020 at 6:22 PM syzbot
 wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit a49145acfb975d921464b84fe00279f99827d816
> Author: George Kennedy 
> Date:   Tue Jul 7 19:26:03 2020 +
>
> fbmem: add margin check to fb_check_caps()
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17a21f3250
> start commit:   f4d51dff Linux 5.9-rc4
> git tree:   upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a9075b36a6ae26c9
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ffc7214b893651d52b8
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=122d733590
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13cea1a590
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: fbmem: add margin check to fb_check_caps()
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

#syz fix: fbmem: add margin check to fb_check_caps()

