On Thu, Apr 13, 2017 at 5:50 PM, Suzuki K. Poulose <suzuki.poul...@arm.com> wrote: > On Thu, Apr 13, 2017 at 10:17:54AM +0100, Suzuki K Poulose wrote: >> On 12/04/17 19:43, Marc Zyngier wrote: >> > On 12/04/17 17:19, Andrey Konovalov wrote: >> > >> > Hi Andrey, >> > >> > > Apparently this wasn't fixed, I've got this report again on >> > > linux-next-c4e7b35a3 (Apr 11), which includes 8b3405e34 "kvm: >> > > arm/arm64: Fix locking for kvm_free_stage2_pgd". >> > >> > This looks like a different bug. >> > >> > > >> > > I now have a way to reproduce it, so I can test proposed patches. I >> > > don't have a simple C reproducer though. >> > > >> > > The bug happens when the following syzkaller program is executed: >> > > >> > > mmap(&(0x7f0000000000/0xc000)=nil, (0xc000), 0x3, 0x32, >> > > 0xffffffffffffffff, 0x0) >> > > unshare(0x400) >> > > perf_event_open(&(0x7f000002f000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0, >> > > 0x0, 0x6, 0x0, 0x0, 0xd34, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, >> > > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffff, >> > > 0xffffffffffffffff, 0x0) >> > > r0 = openat$kvm(0xffffffffffffff9c, >> > > &(0x7f000000c000-0x9)="2f6465762f6b766d00", 0x0, 0x0) >> > > ioctl$TIOCSBRK(0xffffffffffffffff, 0x5427) >> > > r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0) >> > > syz_kvm_setup_cpu$arm64(r1, 0xffffffffffffffff, >> > > &(0x7f0000dc6000/0x18000)=nil, &(0x7f000000c000)=[{0x0, >> > > &(0x7f000000c000)="5ba3c16f533efbed09f8221253c73763327fadce2371813b45dd7f7982f84a873e4ae89a6c2bd1af83a6024c36a1ff518318", >> > > 0x32}], 0x1, 0x0, &(0x7f000000d000-0x10)=[@featur2={0x1, 0x3}], 0x1) >> > >> > Is that the only thing the program does? Or is there anything running in >> > parallel? >> > >> > > ================================================================== >> > > BUG: KASAN: use-after-free in arch_spin_is_locked >> > > include/linux/compiler.h:254 [inline] >> > > BUG: KASAN: use-after-free in unmap_stage2_range+0x990/0x9a8 >> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295 >> > > Read of size 8 at addr ffff800004476730 by task syz-executor/13106 >> > > >> > > CPU: 1 PID: 13106 Comm: syz-executor Not tainted >> > > 4.11.0-rc6-next-20170411-xc2-11025-gc4e7b35a33d4-dirty #5 >> > > Hardware name: Hardkernel ODROID-C2 (DT) >> > > Call trace: >> > > [<ffff20000808fd08>] dump_backtrace+0x0/0x440 >> > > arch/arm64/kernel/traps.c:505 >> > > [<ffff2000080903c0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228 >> > > [<ffff2000088df030>] __dump_stack lib/dump_stack.c:16 [inline] >> > > [<ffff2000088df030>] dump_stack+0x110/0x168 lib/dump_stack.c:52 >> > > [<ffff200008406db8>] print_address_description+0x60/0x248 >> > > mm/kasan/report.c:252 >> > > [<ffff2000084072c8>] kasan_report_error mm/kasan/report.c:351 [inline] >> > > [<ffff2000084072c8>] kasan_report+0x218/0x300 mm/kasan/report.c:408 >> > > [<ffff200008407428>] __asan_report_load8_noabort+0x18/0x20 >> > > mm/kasan/report.c:429 >> > > [<ffff2000080db1b8>] arch_spin_is_locked include/linux/compiler.h:254 >> > > [inline] >> > >> > This is the assert on the spinlock, and the memory is gone. >> > >> > > [<ffff2000080db1b8>] unmap_stage2_range+0x990/0x9a8 >> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295 >> > > [<ffff2000080db248>] kvm_free_stage2_pgd.part.16+0x30/0x98 >> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:842 >> > > [<ffff2000080ddfb8>] kvm_free_stage2_pgd >> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:838 [inline] >> > >> > But we've taken than lock here. There's only a handful of instructions >> > in between, and the memory can only go away if there is something >> > messing with us in parallel. >> > >> > > [<ffff2000080ddfb8>] kvm_arch_flush_shadow_all+0x40/0x58 >> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1895 >> > > [<ffff2000080c379c>] kvm_mmu_notifier_release+0x154/0x1d0 >> > > arch/arm64/kvm/../../../virt/kvm/kvm_main.c:472 >> > > [<ffff2000083f2b60>] __mmu_notifier_release+0x1c0/0x3e0 >> > > mm/mmu_notifier.c:75 >> > > [<ffff2000083a1fb4>] mmu_notifier_release >> > > include/linux/mmu_notifier.h:235 [inline] >> > > [<ffff2000083a1fb4>] exit_mmap+0x21c/0x288 mm/mmap.c:2941 >> > > [<ffff20000810ecd4>] __mmput kernel/fork.c:888 [inline] >> > > [<ffff20000810ecd4>] mmput+0xdc/0x2e0 kernel/fork.c:910 >> > > [<ffff20000811fda8>] exit_mm kernel/exit.c:557 [inline] >> > > [<ffff20000811fda8>] do_exit+0x648/0x2020 kernel/exit.c:865 >> > > [<ffff2000081218b4>] do_group_exit+0xdc/0x260 kernel/exit.c:982 >> > > [<ffff20000813adf0>] get_signal+0x358/0xf58 kernel/signal.c:2318 >> > > [<ffff20000808de98>] do_signal+0x170/0xc10 arch/arm64/kernel/signal.c:370 >> > > [<ffff20000808edb4>] do_notify_resume+0xe4/0x120 >> > > arch/arm64/kernel/signal.c:421 >> > > [<ffff200008083e68>] work_pending+0x8/0x14 >> > >> > So we're being serviced with a signal. Do you know if this signal is >> > generated by your syzkaller program? We could be racing between do_exit >> > triggered by a fatal signal (this trace) and the closing of the two file >> > descriptors (vcpu and vm). >> > >> > Paolo: does this look possible to you? I can't see what locking we have >> > that could prevent this race. >> >> On a quick look, I see two issues: >> >> 1) It looks like the mmu_notifier->ops.release could be called twice for a >> notifier, >> from mmu_notifier_unregister() and exit_mmap()->mmu_notifier_release(), >> which is >> causing the problem as above. >> >> This could possibly be avoided by swapping the order of the following >> operations >> in themmu_notifier_unregister(): >> >> a) Invoke ops->release under src_read_lock() >> b) Delete the notifier from the list. >> >> which can prevent mmu_notifier_release() calling the ops->release() again, >> before >> we reach (b). >> >> >> 2) The core KVM code does an mmgrab()/mmdrop on the current->mm to pin the >> mm_struct. But >> this doesn't prevent the "real_address user space" from being destroyed. >> Since KVM >> actually depends on the user pages and page tables, it should really/also(?) >> use >> mmget()/mmput() (See Documentation/vm/active_mm.txt). I understand that >> mmget() shouldn't >> be used for pinning unbounded amount of time. But since we do it from within >> the same >> process context (like say threads), we should be safe to do so. > > Here is a patch which implements (2) above.
Hi Suzuki, Your patch fixes KASAN reports for me. Thanks! > > ----8>----- > > kvm: Hold reference to the user address space > > The core KVM code, uses mmgrab/mmdrop to pin the mm struct of the user > application. mmgrab only guarantees that the mm struct is available, > while the "real address space" (see Documentation/vm/active_mm.txt) may > be destroyed. Since the KVM depends on the user space page tables for > the Guest pages, we should instead do an mmget/mmput. Even though > mmget/mmput is not encouraged for uses with unbounded time, the KVM > is fine to do so, as we are doing it from the context of the same process. > > This also prevents the race condition where mmu_notifier_release() could > be called in parallel and one instance could end up using a free'd kvm > instance. > > Cc: Mark Rutland <mark.rutl...@arm.com> > Cc: Paolo Bonzin <pbonz...@redhat.com> > Cc: Radim Krčmář <rkrc...@redhat.com> > Cc: Marc Zyngier <marc.zyng...@arm.com> > Cc: Christoffer Dall <christoffer.d...@linaro.org> > Cc: andreyk...@google.com > Signed-off-by: Suzuki K Poulose <suzuki.poul...@arm.com> > --- > virt/kvm/kvm_main.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 88257b3..555712e 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -613,7 +613,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > return ERR_PTR(-ENOMEM); > > spin_lock_init(&kvm->mmu_lock); > - mmgrab(current->mm); > + mmget(current->mm); > kvm->mm = current->mm; > kvm_eventfd_init(kvm); > mutex_init(&kvm->lock); > @@ -685,7 +685,7 @@ static struct kvm *kvm_create_vm(unsigned long type) > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) > kvm_free_memslots(kvm, kvm->memslots[i]); > kvm_arch_free_vm(kvm); > - mmdrop(current->mm); > + mmput(current->mm); > return ERR_PTR(r); > } > > @@ -747,7 +747,7 @@ static void kvm_destroy_vm(struct kvm *kvm) > kvm_arch_free_vm(kvm); > preempt_notifier_dec(); > hardware_disable_all(); > - mmdrop(mm); > + mmput(mm); > } > > void kvm_get_kvm(struct kvm *kvm) > -- > 2.7.4 > IMPORTANT NOTICE: The contents of this email and any attachments are > confidential and may also be privileged. If you are not the intended > recipient, please notify the sender immediately and do not disclose the > contents to any other person, use it for any purpose, or store or copy the > information in any medium. Thank you. _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm