Re: WARNING: refcount bug in find_key_to_update
Okay, I managed to catch a backtrace for this line: encrypted_key: key user:syz not found (-126) looking like: CPU: 0 PID: 8878 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: dump_stack+0x172/0x1f0 request_master_key.isra.0.cold+0x62/0xc3 encrypted_read+0x221/0x830 ? get_derived_key+0xf0/0xf0 ? keyctl_read_key+0x1c2/0x2b0 ? __kasan_check_write+0x14/0x20 ? down_read+0x109/0x430 ? security_key_permission+0x8d/0xc0 ? down_read_killable+0x490/0x490 ? key_task_permission+0x1b5/0x3a0 keyctl_read_key+0x231/0x2b0 __x64_sys_keyctl+0x171/0x470 do_syscall_64+0xfa/0x760 entry_SYSCALL_64_after_hwframe+0x49/0xbe So something somewhere is calling keyctl_read() in userspace on the encrypted key and that is then referring across to the user key added. Also, the encrypted key is being given the following payload: ENCRYPTED: 'new default user:syz 04096' in at least one of the cases that encrypted_update() being called. David
Re: WARNING: refcount bug in find_key_to_update
Linus Torvalds wrote: > > syzbot has bisected this bug to 0570bc8b7c9b ("Merge tag > > 'riscv/for-v5.3-rc1' ...") > > Yeah, that looks unlikely. The only non-riscv changes are from > documentation updates and moving a config variable around. > > Looks like the crash is quite unlikely, and only happens in one out of > ten runs for the ones it has happened to. > > The backtrace looks simple enough, though: > > RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156 >__key_get include/linux/key.h:281 [inline] >find_key_to_update+0x67/0x80 security/keys/keyring.c:1127 >key_create_or_update+0x4e5/0xb20 security/keys/key.c:905 >__do_sys_add_key security/keys/keyctl.c:132 [inline] >__se_sys_add_key security/keys/keyctl.c:72 [inline] >__x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72 >do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296 >entry_SYSCALL_64_after_hwframe+0x49/0xbe > > which to me implies that there's some locking bug, and somebody > released the key without holding a lock. I'm wondering if this is actually a bug in the error handling in the encrypted key type. Looking in the syzbot console log, there's a lot of output from there prior to the crash, of which the following is an excerpt: [ 248.516746][T27381] encrypted_key: key user:syz not found [ 248.524392][T27382] encrypted_key: key user:syz not found [ 248.616141][T27392] encrypted_key: key user:syz not found [ 248.618890][T27393] encrypted_key: key user:syz not found [ 248.690844][T27404] encrypted_key: key user:syz not found [ 248.739405][T27403] encrypted_key: key user:syz not found [ 248.804881][T27417] encrypted_key: key user:syz not found [ 248.828354][T27418] encrypted_key: keyword 'new' not allowed when called from .update method [ 248.925249][T27427] encrypted_key: keyword 'new' not allowed when called from .update method [ 248.928200][T27415] Bad refcount user syz [ 248.934043][T27428] encrypted_key: key user:syz not found [ 248.939502][T27429] encrypted_key: key user:syz not found [ 248.968744][T27434] encrypted_key: key user:syz not found [ 248.982201][T27415] == [ 248.996072][T27415] BUG: KASAN: use-after-free in refcount_inc_not_zero_checked+0x81/0x200 Note that the "Bad refcount user syz" is a bit I patched in to print the type and description of the key that incurred the error. It's a tad difficult to say exactly what's going on since I've no idea what the syzbot reproducer is actually doing. #{"threaded":true,"collide":true,"repeat":true,"procs":6,"sandbox":"namespace","fault_call":-1,"tun":true,"netdev":true,"resetnet":true,"cgroups":true,"binfmt_misc":true,"close_fds":true,"tmpdir":true,"segv":true} perf_event_open(&(0x7f01d000)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0x, 0x, 0x0) keyctl$instantiate(0xc, 0x0, &(0x7f000100)=ANY=[@ANYBLOB='new default user:syz 04096'], 0x1, 0x0) r0 = add_key(&(0x7f000140)='encrypted\x00', &(0x7f000180)={'syz'}, &(0x7f000100), 0xca, 0xfffe) add_key$user(&(0x7f40)='user\x00', &(0x7f00)={'syz'}, &(0x7f000440)='X', 0x1, 0xfffe) keyctl$read(0xb, r0, &(0x7f000240)=""/112, 0x349b7f55) However, it looks like the encrypted key type is trying to access a user key, so maybe there's an overput there? I'm trying to insert more debugging, but the test doesn't always fail. syzbot wrote: > HEAD commit:bc88f85c kthread: make __kthread_queue_delayed_work static > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1730584b60 > kernel config: https://syzkaller.appspot.com/x/.config?x=e0ac4d9b35046343 > dashboard link: https://syzkaller.appspot.com/bug?extid=6455648abc28dbdd1e7f > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8adab60 David
Re: WARNING: refcount bug in find_key_to_update
On Mon, Oct 21, 2019 at 5:59 PM David Howells wrote: > > syzbot wrote: > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8adab60 > > How do I tell what's been passed into the add_key for the encrypted key? Hi David, The easiest and most reliable would be to run it and dump the data in the kernel function.
Re: WARNING: refcount bug in find_key_to_update
syzbot wrote: > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8adab60 How do I tell what's been passed into the add_key for the encrypted key? David
Re: WARNING: refcount bug in find_key_to_update
Hillf Danton wrote: > - (1 << KEY_FLAG_REVOKED))) { > + (1 << KEY_FLAG_REVOKED)) || !key_tryget(key)) { > kleave(" = NULL [x]"); > return NULL; > } > - __key_get(key); That should be ineffective and ought not to fix the bug. David
Re: WARNING: refcount bug in find_key_to_update
Linus Torvalds wrote: > The backtrace looks simple enough, though: > > RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156 >__key_get include/linux/key.h:281 [inline] >find_key_to_update+0x67/0x80 security/keys/keyring.c:1127 >key_create_or_update+0x4e5/0xb20 security/keys/key.c:905 >__do_sys_add_key security/keys/keyctl.c:132 [inline] >__se_sys_add_key security/keys/keyctl.c:72 [inline] >__x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72 >do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296 >entry_SYSCALL_64_after_hwframe+0x49/0xbe > > which to me implies that there's some locking bug, and somebody > released the key without holding a lock. > > That code looks a bit confused to me. Releasing a key without holding > a lock looks permitted, but if that's the case then __key_get() is > complete garbage. It would need to use 'refcount_inc_not_zero()' and > failure would require failing the caller. find_key_to_update() must be called with the keyring-to-be-searched locked, as stated in the comment on that function. If a key-to-be-updated can be found in that keyring, then the keyring must be holding a ref on that key already, so it's refcount must be > 0, so it shouldn't be necessary to use refcount_inc_not_zero(). There shouldn't be a race with key_link(), key_unlink(), key_move(), keyring_clear() or keyring_gc() (garbage collection) as all of those take a write-lock on the keyring. > But I haven't followed the key locking rules, so who knows. That "put > without lock" scenario would explain the crash, though. That shouldn't explain it. When key_put() reduces the refcount to 0, it just schedules the garbage collector. It doesn't touch the key again directly. I would guess that something incorrectly put a ref when it shouldn't have. Do we know which type of key is involved? Looking at the syzkaller reproducer, it's adding an encrypted key and a user key to the process keyring - presumably repeating the procedure within the same process, hence how it finds something to update. David
Re: WARNING: refcount bug in find_key_to_update
On Thu, Oct 17, 2019 at 08:53:06AM -0700, Linus Torvalds wrote: > On Wed, Oct 16, 2019 at 7:42 PM syzbot > wrote: > > > > syzbot has bisected this bug to 0570bc8b7c9b ("Merge tag > > 'riscv/for-v5.3-rc1' ...") > > Yeah, that looks unlikely. The only non-riscv changes are from > documentation updates and moving a config variable around. > > Looks like the crash is quite unlikely, and only happens in one out of > ten runs for the ones it has happened to. > > The backtrace looks simple enough, though: > > RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156 >__key_get include/linux/key.h:281 [inline] >find_key_to_update+0x67/0x80 security/keys/keyring.c:1127 >key_create_or_update+0x4e5/0xb20 security/keys/key.c:905 >__do_sys_add_key security/keys/keyctl.c:132 [inline] >__se_sys_add_key security/keys/keyctl.c:72 [inline] >__x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72 >do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296 >entry_SYSCALL_64_after_hwframe+0x49/0xbe > > which to me implies that there's some locking bug, and somebody > released the key without holding a lock. > > That code looks a bit confused to me. Releasing a key without holding > a lock looks permitted, but if that's the case then __key_get() is > complete garbage. It would need to use 'refcount_inc_not_zero()' and > failure would require failing the caller. > > But I haven't followed the key locking rules, so who knows. That "put > without lock" scenario would explain the crash, though. > > David? > Yes this is a bogus bisection. The key is supposed to have refcount >= 1 since it's in a keyring. So some bug is causing it to have refcount 0. Perhaps some place calling key_put() too many times. Unfortunately I can't get the reproducer to work locally. Note that there are 2 other syzbot reports that look related. No reproducers for them, though: Title: KASAN: use-after-free Read in key_put Last occurred: 1 day ago Reported: 28 days ago Branches: Mainline Dashboard link: https://syzkaller.appspot.com/bug?id=f13750b1124e01191250cf930086dcc40740fa30 Original thread: https://lore.kernel.org/lkml/8c3e590592cf4...@google.com/T/#u Title: KASAN: use-after-free Read in keyring_compare_object Last occurred: 49 days ago Reported: 84 days ago Branches: Mainline Dashboard link: https://syzkaller.appspot.com/bug?id=529ab6a98286c2a97c445988a62760a58d4a1d4b Original thread: https://lore.kernel.org/lkml/038ef6058e6f3...@google.com/T/#u - Eric
Re: WARNING: refcount bug in find_key_to_update
On Wed, Oct 16, 2019 at 7:42 PM syzbot wrote: > > syzbot has bisected this bug to 0570bc8b7c9b ("Merge tag > 'riscv/for-v5.3-rc1' ...") Yeah, that looks unlikely. The only non-riscv changes are from documentation updates and moving a config variable around. Looks like the crash is quite unlikely, and only happens in one out of ten runs for the ones it has happened to. The backtrace looks simple enough, though: RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156 __key_get include/linux/key.h:281 [inline] find_key_to_update+0x67/0x80 security/keys/keyring.c:1127 key_create_or_update+0x4e5/0xb20 security/keys/key.c:905 __do_sys_add_key security/keys/keyctl.c:132 [inline] __se_sys_add_key security/keys/keyctl.c:72 [inline] __x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72 do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe which to me implies that there's some locking bug, and somebody released the key without holding a lock. That code looks a bit confused to me. Releasing a key without holding a lock looks permitted, but if that's the case then __key_get() is complete garbage. It would need to use 'refcount_inc_not_zero()' and failure would require failing the caller. But I haven't followed the key locking rules, so who knows. That "put without lock" scenario would explain the crash, though. David? Linus
Re: WARNING: refcount bug in find_key_to_update
syzbot has bisected this bug to: commit 0570bc8b7c9b41deba6f61ac218922e7168ad648 Author: Linus Torvalds Date: Thu Jul 18 19:26:59 2019 + Merge tag 'riscv/for-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11b6e2bb60 start commit: bc88f85c kthread: make __kthread_queue_delayed_work static git tree: upstream final crash:https://syzkaller.appspot.com/x/report.txt?x=13b6e2bb60 console output: https://syzkaller.appspot.com/x/log.txt?x=15b6e2bb60 kernel config: https://syzkaller.appspot.com/x/.config?x=e0ac4d9b35046343 dashboard link: https://syzkaller.appspot.com/bug?extid=6455648abc28dbdd1e7f syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8adab60 Reported-by: syzbot+6455648abc28dbdd1...@syzkaller.appspotmail.com Fixes: 0570bc8b7c9b ("Merge tag 'riscv/for-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux") For information about bisection process see: https://goo.gl/tpsmEJ#bisection
WARNING: refcount bug in find_key_to_update
Hello, syzbot found the following crash on: HEAD commit:bc88f85c kthread: make __kthread_queue_delayed_work static git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=1730584b60 kernel config: https://syzkaller.appspot.com/x/.config?x=e0ac4d9b35046343 dashboard link: https://syzkaller.appspot.com/bug?extid=6455648abc28dbdd1e7f compiler: gcc (GCC) 9.0.0 20181231 (experimental) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8adab60 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+6455648abc28dbdd1...@syzkaller.appspotmail.com [ cut here ] refcount_t: increment on 0; use-after-free. WARNING: CPU: 1 PID: 9064 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline] WARNING: CPU: 1 PID: 9064 at lib/refcount.c:156 refcount_inc_checked+0x61/0x70 lib/refcount.c:154 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 9064 Comm: syz-executor.5 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x35 kernel/panic.c:582 report_bug+0x289/0x300 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:179 [inline] fixup_bug arch/x86/kernel/traps.c:174 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028 RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline] RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154 Code: 1d 58 46 7e 06 31 ff 89 de e8 0b cb 2e fe 84 db 75 dd e8 c2 c9 2e fe 48 c7 c7 40 ad e6 87 c6 05 38 46 7e 06 01 e8 67 0c 00 fe <0f> 0b eb c1 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 RSP: 0018:888081447c68 EFLAGS: 00010286 RAX: RBX: RCX: RDX: RSI: 815cb646 RDI: ed1010288f7f RBP: 888081447c78 R08: 8880a231a080 R09: ed1015d26159 R10: ed1015d26158 R11: 8880ae930ac7 R12: 8880a4518940 R13: R14: 888081447e10 R15: 8880a4518c40 __key_get include/linux/key.h:281 [inline] find_key_to_update+0x8b/0xc0 security/keys/keyring.c:1127 key_create_or_update+0x588/0xbe0 security/keys/key.c:905 __do_sys_add_key security/keys/keyctl.c:132 [inline] __se_sys_add_key security/keys/keyctl.c:72 [inline] __x64_sys_add_key+0x2bd/0x4f0 security/keys/keyctl.c:72 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x459a59 Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f22e3171c78 EFLAGS: 0246 ORIG_RAX: 00f8 RAX: ffda RBX: 0005 RCX: 00459a59 RDX: 2440 RSI: 2000 RDI: 2040 RBP: 0075bf20 R08: fffe R09: R10: 0001 R11: 0246 R12: 7f22e31726d4 R13: 004bfab8 R14: 004d1ad8 R15: Kernel Offset: disabled Rebooting in 86400 seconds.. --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot can test patches for this bug, for details see: https://goo.gl/tpsmEJ#testing-patches