On Thu, Apr 13, 2017 at 10:17:54AM +0100, Suzuki K Poulose wrote:
> On 12/04/17 19:43, Marc Zyngier wrote:
> > On 12/04/17 17:19, Andrey Konovalov wrote:
> >
> > Hi Andrey,
> >
> > > Apparently this wasn't fixed, I've got this report again on
> > > linux-next-c4e7b35a3 (Apr 11), which includes 8b3405e34 "kvm:
> > > arm/arm64: Fix locking for kvm_free_stage2_pgd".
> >
> > This looks like a different bug.
> >
> > >
> > > I now have a way to reproduce it, so I can test proposed patches. I
> > > don't have a simple C reproducer though.
> > >
> > > The bug happens when the following syzkaller program is executed:
> > >
> > > mmap(&(0x7f0000000000/0xc000)=nil, (0xc000), 0x3, 0x32, 
> > > 0xffffffffffffffff, 0x0)
> > > unshare(0x400)
> > > perf_event_open(&(0x7f000002f000-0x78)={0x1, 0x78, 0x0, 0x0, 0x0, 0x0,
> > > 0x0, 0x6, 0x0, 0x0, 0xd34, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 0x0, 0xffffffff,
> > > 0xffffffffffffffff, 0x0)
> > > r0 = openat$kvm(0xffffffffffffff9c,
> > > &(0x7f000000c000-0x9)="2f6465762f6b766d00", 0x0, 0x0)
> > > ioctl$TIOCSBRK(0xffffffffffffffff, 0x5427)
> > > r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
> > > syz_kvm_setup_cpu$arm64(r1, 0xffffffffffffffff,
> > > &(0x7f0000dc6000/0x18000)=nil, &(0x7f000000c000)=[{0x0,
> > > &(0x7f000000c000)="5ba3c16f533efbed09f8221253c73763327fadce2371813b45dd7f7982f84a873e4ae89a6c2bd1af83a6024c36a1ff518318",
> > > 0x32}], 0x1, 0x0, &(0x7f000000d000-0x10)=[@featur2={0x1, 0x3}], 0x1)
> >
> > Is that the only thing the program does? Or is there anything running in
> > parallel?
> >
> > > ==================================================================
> > > BUG: KASAN: use-after-free in arch_spin_is_locked
> > > include/linux/compiler.h:254 [inline]
> > > BUG: KASAN: use-after-free in unmap_stage2_range+0x990/0x9a8
> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295
> > > Read of size 8 at addr ffff800004476730 by task syz-executor/13106
> > >
> > > CPU: 1 PID: 13106 Comm: syz-executor Not tainted
> > > 4.11.0-rc6-next-20170411-xc2-11025-gc4e7b35a33d4-dirty #5
> > > Hardware name: Hardkernel ODROID-C2 (DT)
> > > Call trace:
> > > [<ffff20000808fd08>] dump_backtrace+0x0/0x440 
> > > arch/arm64/kernel/traps.c:505
> > > [<ffff2000080903c0>] show_stack+0x20/0x30 arch/arm64/kernel/traps.c:228
> > > [<ffff2000088df030>] __dump_stack lib/dump_stack.c:16 [inline]
> > > [<ffff2000088df030>] dump_stack+0x110/0x168 lib/dump_stack.c:52
> > > [<ffff200008406db8>] print_address_description+0x60/0x248 
> > > mm/kasan/report.c:252
> > > [<ffff2000084072c8>] kasan_report_error mm/kasan/report.c:351 [inline]
> > > [<ffff2000084072c8>] kasan_report+0x218/0x300 mm/kasan/report.c:408
> > > [<ffff200008407428>] __asan_report_load8_noabort+0x18/0x20 
> > > mm/kasan/report.c:429
> > > [<ffff2000080db1b8>] arch_spin_is_locked include/linux/compiler.h:254 
> > > [inline]
> >
> > This is the assert on the spinlock, and the memory is gone.
> >
> > > [<ffff2000080db1b8>] unmap_stage2_range+0x990/0x9a8
> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:295
> > > [<ffff2000080db248>] kvm_free_stage2_pgd.part.16+0x30/0x98
> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:842
> > > [<ffff2000080ddfb8>] kvm_free_stage2_pgd
> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:838 [inline]
> >
> > But we've taken than lock here. There's only a handful of instructions
> > in between, and the memory can only go away if there is something
> > messing with us in parallel.
> >
> > > [<ffff2000080ddfb8>] kvm_arch_flush_shadow_all+0x40/0x58
> > > arch/arm64/kvm/../../../arch/arm/kvm/mmu.c:1895
> > > [<ffff2000080c379c>] kvm_mmu_notifier_release+0x154/0x1d0
> > > arch/arm64/kvm/../../../virt/kvm/kvm_main.c:472
> > > [<ffff2000083f2b60>] __mmu_notifier_release+0x1c0/0x3e0 
> > > mm/mmu_notifier.c:75
> > > [<ffff2000083a1fb4>] mmu_notifier_release
> > > include/linux/mmu_notifier.h:235 [inline]
> > > [<ffff2000083a1fb4>] exit_mmap+0x21c/0x288 mm/mmap.c:2941
> > > [<ffff20000810ecd4>] __mmput kernel/fork.c:888 [inline]
> > > [<ffff20000810ecd4>] mmput+0xdc/0x2e0 kernel/fork.c:910
> > > [<ffff20000811fda8>] exit_mm kernel/exit.c:557 [inline]
> > > [<ffff20000811fda8>] do_exit+0x648/0x2020 kernel/exit.c:865
> > > [<ffff2000081218b4>] do_group_exit+0xdc/0x260 kernel/exit.c:982
> > > [<ffff20000813adf0>] get_signal+0x358/0xf58 kernel/signal.c:2318
> > > [<ffff20000808de98>] do_signal+0x170/0xc10 arch/arm64/kernel/signal.c:370
> > > [<ffff20000808edb4>] do_notify_resume+0xe4/0x120 
> > > arch/arm64/kernel/signal.c:421
> > > [<ffff200008083e68>] work_pending+0x8/0x14
> >
> > So we're being serviced with a signal. Do you know if this signal is
> > generated by your syzkaller program? We could be racing between do_exit
> > triggered by a fatal signal (this trace) and the closing of the two file
> > descriptors (vcpu and vm).
> >
> > Paolo: does this look possible to you? I can't see what locking we have
> > that could prevent this race.
>
> On a quick look, I see two issues:
>
> 1) It looks like the mmu_notifier->ops.release could be called twice for a 
> notifier,
> from mmu_notifier_unregister() and exit_mmap()->mmu_notifier_release(), which 
> is
> causing the problem as above.
>
> This could possibly be avoided by swapping the order of the following 
> operations
> in themmu_notifier_unregister():
>
>  a) Invoke ops->release under src_read_lock()
>  b) Delete the notifier from the list.
>
> which can prevent mmu_notifier_release() calling the ops->release() again, 
> before
> we reach (b).
>
>
> 2) The core KVM code does an mmgrab()/mmdrop on the current->mm to pin the 
> mm_struct. But
> this doesn't prevent the "real_address user space" from being destroyed. 
> Since KVM
> actually depends on the user pages and page tables, it should really/also(?) 
> use
> mmget()/mmput() (See Documentation/vm/active_mm.txt). I understand that 
> mmget() shouldn't
> be used for pinning unbounded amount of time. But since we do it from within 
> the same
> process context (like say threads), we should be safe to do so.

Here is a patch which implements (2) above.

----8>-----

kvm: Hold reference to the user address space

The core KVM code, uses mmgrab/mmdrop to pin the mm struct of the user
application. mmgrab only guarantees that the mm struct is available,
while the "real address space" (see Documentation/vm/active_mm.txt) may
be destroyed. Since the KVM depends on the user space page tables for
the Guest pages, we should instead do an mmget/mmput. Even though
mmget/mmput is not encouraged for uses with unbounded time, the KVM
is fine to do so, as we are doing it from the context of the same process.

This also prevents the race condition where mmu_notifier_release() could
be called in parallel and one instance could end up using a free'd kvm
instance.

Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Paolo Bonzin <pbonz...@redhat.com>
Cc: Radim Krčmář <rkrc...@redhat.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Christoffer Dall <christoffer.d...@linaro.org>
Cc: andreyk...@google.com
Signed-off-by: Suzuki K Poulose <suzuki.poul...@arm.com>
---
 virt/kvm/kvm_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 88257b3..555712e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -613,7 +613,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
                return ERR_PTR(-ENOMEM);

        spin_lock_init(&kvm->mmu_lock);
-       mmgrab(current->mm);
+       mmget(current->mm);
        kvm->mm = current->mm;
        kvm_eventfd_init(kvm);
        mutex_init(&kvm->lock);
@@ -685,7 +685,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
        for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
                kvm_free_memslots(kvm, kvm->memslots[i]);
        kvm_arch_free_vm(kvm);
-       mmdrop(current->mm);
+       mmput(current->mm);
        return ERR_PTR(r);
 }

@@ -747,7 +747,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
        kvm_arch_free_vm(kvm);
        preempt_notifier_dec();
        hardware_disable_all();
-       mmdrop(mm);
+       mmput(mm);
 }

 void kvm_get_kvm(struct kvm *kvm)
--
2.7.4
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to