On Wed, Mar 15, 2017 at 01:28:07PM +0000, Marc Zyngier wrote:
> On 15/03/17 10:56, Christoffer Dall wrote:
> > On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
> >> On 15/03/17 09:21, Christoffer Dall wrote:
> >>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
> >>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
> >>>> unmap_stage2_range() on the entire memory range for the guest. This could
> >>>> cause problems with other callers (e.g, munmap on a memslot) trying to
> >>>> unmap a range.
> >>>>
> >>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
> >>>> Cc: sta...@vger.kernel.org # v3.10+
> >>>> Cc: Marc Zyngier <marc.zyng...@arm.com>
> >>>> Cc: Christoffer Dall <christoffer.d...@linaro.org>
> >>>> Signed-off-by: Suzuki K Poulose <suzuki.poul...@arm.com>
> >>>> ---
> >>>>  arch/arm/kvm/mmu.c | 3 +++
> >>>>  1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >>>> index 13b9c1f..b361f71 100644
> >>>> --- a/arch/arm/kvm/mmu.c
> >>>> +++ b/arch/arm/kvm/mmu.c
> >>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> >>>>          if (kvm->arch.pgd == NULL)
> >>>>                  return;
> >>>>  
> >>>> +        spin_lock(&kvm->mmu_lock);
> >>>>          unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> >>>> +        spin_unlock(&kvm->mmu_lock);
> >>>> +
> >>>
> >>> This ends up holding the spin lock for potentially quite a while, where
> >>> we can do things like __flush_dcache_area(), which I think can fault.
> >>
> >> I believe we're always using the linear mapping (or kmap on 32bit) in
> >> order not to fault.
> >>
> > 
> > ok, then there's just the concern that we may be holding a spinlock for
> > a very long time.  I seem to recall Mario once added something where he
> > unlocked and gave a chance to schedule something else for each PUD or
> > something like that, because he ran into the issue during migration.  Am
> > I confusing this with something else?
> 
> That definitely rings a bell: stage2_wp_range() uses that kind of trick
> to give the system a chance to breathe. Maybe we could use a similar
> trick in our S2 unmapping code? How about this (completely untested) patch:
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1786c24212d4 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, 
> phys_addr_t start, u64 size)
>       phys_addr_t addr = start, end = start + size;
>       phys_addr_t next;
>  
> +     BUG_ON(!spin_is_locked(&kvm->mmu_lock));
> +
>       pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>       do {
> +             if (need_resched() || spin_needbreak(&kvm->mmu_lock))
> +                     cond_resched_lock(&kvm->mmu_lock);
> +
>               next = stage2_pgd_addr_end(addr, end);
>               if (!stage2_pgd_none(*pgd))
>                       unmap_stage2_puds(kvm, pgd, addr, next);
> 
> The additional BUG_ON() is just for my own peace of mind - we seem to
> have missed a couple of these lately, and the "breathing" code makes
> it imperative that this lock is being taken prior to entering the
> function.
> 

Looks good to me!

-Christoffer

Reply via email to