I've been considering whether keeping the shadow TLB data structure around is
the right approach. To begin with, Linux does not keep an analogous
structure, and it's basically the same problem: when resuming the guest
after a context switch, the TLB will be mostly empty of guest entries,
and suffer a high TLB miss rate until it can fault more back in.

The shadow TLB allows us to skip the repopulation phase, but at the
expense of some memory and some overhead on every exit. Also, it's
difficult or impossible to implement a shadow TLB on cores where
software cannot address TLB entries by index (i.e. the hardware
automatically selects the index).

If we did remove the shadow TLB, I think we'd suffer even more TLB
misses, so we'd need to really optimize our TLB miss handler. (Arguably
we should already be doing that anyway.) Linux's TLB miss handler walks
the Linux page tables entirely in assembly, which also removes the need
to save/restore the full C ABI set of GPRs.

Logically speaking, our TLB miss handler must do the following:
     1. walk guest TLB(s) to find a matching entry
     2. if not present, deliver fault to guest
     3. if present
             3.1. calculate guest physical address from TLB entry
             3.2. check if the guest is allowed to access that address
             3.3. if yes, write new TLB entry into hardware
             3.4. if no, deliver fault to userspace
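
As a rough illustration of that control flow only, here is a minimal
userspace C sketch. Every name in it (struct guest_tlbe, struct guest_mem,
handle_tlb_miss(), and so on) is hypothetical and not taken from the KVM
sources. The guest's visible memory is modeled as one contiguous range, and
the hardware TLB write is replaced by returning the computed guest physical
address. Translating that address to a host pfn is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical, for illustration only. */
enum miss_result {
    MISS_GUEST_FAULT,     /* step 2: no guest TLB entry, reflect to guest */
    MISS_MAPPED,          /* step 3.3: entry would be written to hardware */
    MISS_USERSPACE_FAULT, /* step 3.4: guest may not access that address */
};

struct guest_tlbe {
    bool     valid;
    uint64_t eaddr; /* effective base address, aligned to size */
    uint64_t raddr; /* guest physical base address, aligned to size */
    uint64_t size;  /* mapping size in bytes, a power of two */
};

/* Assumption: guest-visible memory is a single contiguous range. */
struct guest_mem {
    uint64_t base;
    uint64_t len;
};

/* Step 1: linear walk of the guest TLB for an entry covering eaddr. */
static const struct guest_tlbe *
guest_tlb_search(const struct guest_tlbe *tlb, size_t n, uint64_t eaddr)
{
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && (eaddr & ~(tlb[i].size - 1)) == tlb[i].eaddr)
            return &tlb[i];
    }
    return NULL;
}

/* Step 3.2: is this guest physical address inside the visible range? */
static bool gpa_is_visible(const struct guest_mem *mem, uint64_t gpa)
{
    return gpa >= mem->base && gpa - mem->base < mem->len;
}

static enum miss_result
handle_tlb_miss(const struct guest_tlbe *tlb, size_t n,
                const struct guest_mem *mem, uint64_t eaddr,
                uint64_t *gpa_out)
{
    const struct guest_tlbe *e = guest_tlb_search(tlb, n, eaddr);

    if (!e)
        return MISS_GUEST_FAULT;

    /* Step 3.1: page base from the entry plus the offset within the page. */
    uint64_t gpa = e->raddr | (eaddr & (e->size - 1));

    if (!gpa_is_visible(mem, gpa))
        return MISS_USERSPACE_FAULT;

    *gpa_out = gpa; /* stand-in for writing the hardware TLB entry */
    return MISS_MAPPED;
}
```

The point of the sketch is that steps 1 through 3.3 need nothing beyond
loads, compares and masks, which is what makes an assembly fast path
plausible as long as the visibility check and the pfn lookup do not have
to call into C.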

I was hoping to be able to implement 1 through 3.3 in assembly. Doing that
would require hoisting some of the logic currently in kvmppc_mmu_map() so
that it executes before the page fault occurs, mostly because that logic
is implemented in C. In particular we need kvm_is_visible_gfn(), and most
unfortunately gfn_to_pfn(). I started moving the logic around, but
gfn_to_pfn() has me stuck because calling that on *all* gfns at memory
registration time would a) pin all guest memory all the time, and
b) require us to keep a large data structure to remember all the pfns.

Avi suggested that we could keep a cache of "gotten" pfns (gfn_to_pfn()
calls get_user_pages()). That could obviate both C calls, but now I'm
not sure it's worth the development complexity.
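
To make the cache idea a little more concrete, here is one possible shape
for it, sketched in plain C. This is only an assumption about what such a
cache could look like, not something Avi or the existing code specifies:
a small direct-mapped table keyed by gfn, with hypothetical names
(struct pfn_cache, pfn_cache_lookup(), pfn_cache_insert()) and an arbitrary
slot count. A hit avoids the C call; a miss would fall back to the slow
path, which calls gfn_to_pfn() and inserts the result.

```c
#include <stdbool.h>
#include <stdint.h>

#define PFN_CACHE_SLOTS 64 /* hypothetical size, must be a power of two */

/* All names here are hypothetical, for illustration only. */
struct pfn_cache_slot {
    bool     valid;
    uint64_t gfn;
    uint64_t pfn;
};

struct pfn_cache {
    struct pfn_cache_slot slot[PFN_CACHE_SLOTS];
};

/* Fast path: on a hit, store the cached pfn in *pfn and return true. */
static bool pfn_cache_lookup(const struct pfn_cache *c, uint64_t gfn,
                             uint64_t *pfn)
{
    const struct pfn_cache_slot *s = &c->slot[gfn & (PFN_CACHE_SLOTS - 1)];

    if (!s->valid || s->gfn != gfn)
        return false;
    *pfn = s->pfn;
    return true;
}

/*
 * Slow path: remember a freshly obtained pfn. If a different gfn occupied
 * the slot, return true and report its pfn in *evicted_pfn so the caller
 * can drop the reference that was pinning that page.
 */
static bool pfn_cache_insert(struct pfn_cache *c, uint64_t gfn, uint64_t pfn,
                             uint64_t *evicted_pfn)
{
    struct pfn_cache_slot *s = &c->slot[gfn & (PFN_CACHE_SLOTS - 1)];
    bool evicted = s->valid && s->gfn != gfn;

    if (evicted)
        *evicted_pfn = s->pfn;
    s->valid = true;
    s->gfn = gfn;
    s->pfn = pfn;
    return evicted;
}
```

With something like this, only the pages currently in the cache stay
pinned, and the table size bounds the bookkeeping. The cost is the extra
invalidation and release logic, which is the complexity I'm unsure about.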

I'll have to keep thinking about those other cores without directly
indexable TLBs.

-- 
Hollis Blanchard
IBM Linux Technology Center
