On 11.02.2011, at 01:22, Alexander Graf wrote:

> On 11.02.2011, at 01:20, Alexander Graf wrote:
>
>> On 10.02.2011, at 19:51, Scott Wood wrote:
>>
>>> On Thu, 10 Feb 2011 12:45:38 +0100
>>> Alexander Graf <ag...@suse.de> wrote:
>>>
>>>> Ok, thinking about this a bit more. You're basically proposing a list of
>>>> tlb set calls, with each array field identifying one tlb set call. What
>>>> I was thinking of was a full TLB sync, so we could keep qemu's internal
>>>> TLB representation identical to the ioctl layout and then just call that
>>>> one ioctl to completely overwrite all of qemu's internal data (and vice
>>>> versa).
>>>
>>> No, this is a full sync -- the list replaces any existing TLB entries (need
>>> to make that explicit in the doc). Basically it's an invalidate plus a
>>> list of tlb set operations.
>>>
>>> Qemu's internal representation will want to be ordered with no missing
>>> entries. If we require that of the transfer representation, we can't do
>>> early termination. It would also limit Qemu's flexibility in choosing its
>>> internal representation, and make it more awkward to support multiple MMU
>>> types.
>>
>> Well, but this way it means we'll have to assemble/disassemble a list of
>> entries multiple times:
>>
>> SET:
>> * qemu assembles the list from its internal representation
>> * kvm disassembles the list into its internal structure
>>
>> GET:
>> * kvm assembles the list from its internal representation
>> * qemu disassembles the list into its internal structure
>>
>> Maybe we should go with Avi's proposal after all and simply keep the full
>> soft-mmu synced between kernel and user space? That way we only need a
>> setup call at first, no copying in between, and we simply update the user
>> space version whenever something changes in the guest.
>>
>> We need to store the TLB's contents off somewhere anyways, so all we need
>> is an additional in-kernel array with internal translation data, but that
>> can be separate from the guest visible data, right?
>
> If we could then keep qemu's internal representation == shared data with kvm
> == kvm's internal data for guest visible stuff, we get this done with almost
> no additional overhead. And I don't see any problem with this. Should be
> easily doable.
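The shared soft-mmu idea above can be sketched roughly as follows. This is an illustrative assumption, not the real kvm ABI: the field names (mas1..mas3, after the e500 MAS registers), array size, and helper names are all made up for the example. The point is that the guest-visible TLB lives in one array whose layout qemu, the transfer format, and kvm all agree on, while kvm keeps its host translation data in a separate, parallel array.

```c
#include <stdint.h>
#include <string.h>

#define TLB_SIZE 64

struct guest_tlbe {          /* guest-visible: shared qemu <-> kvm */
    uint32_t mas1;           /* valid bit, TID, page size (illustrative) */
    uint32_t mas2;           /* effective page number, WIMGE */
    uint32_t mas3;           /* real page number, permissions */
};

struct shadow_tlbe {         /* kvm-private translation data */
    uint64_t host_pfn;       /* backing host page frame */
    int      mapped;         /* is a shadow mapping installed? */
};

static struct guest_tlbe  guest_tlb[TLB_SIZE];  /* would be mmap'ed/shared */
static struct shadow_tlbe shadow_tlb[TLB_SIZE]; /* stays in the kernel */

/* qemu (or the guest) rewrites one guest-visible entry; the only extra
 * work on the kvm side is dropping the stale shadow mapping for that
 * index, to be refilled lazily on next use. */
static void tlb_write_entry(int idx, uint32_t mas1, uint32_t mas2,
                            uint32_t mas3)
{
    guest_tlb[idx].mas1 = mas1;
    guest_tlb[idx].mas2 = mas2;
    guest_tlb[idx].mas3 = mas3;
    shadow_tlb[idx].mapped = 0;
}

/* Since both sides already see the same array, a full GET/SET needs no
 * assemble/disassemble step at all; a full flush is just zeroing. */
static void tlb_flush_all(void)
{
    memset(guest_tlb, 0, sizeof(guest_tlb));
    memset(shadow_tlb, 0, sizeof(shadow_tlb));
}
```

A setup ioctl would hand kvm the shared array once; afterwards both sides read and write it in place, which is exactly the "almost no additional overhead" property claimed above.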
So then everything we need to get all the functionality we want is a hint
from kernel to user space that something changed, and vice versa.

From kernel to user space it's simple. We can just document that after every
RUN, all fields can be modified.

From user space to kernel, we could modify the entries directly and then
issue an ioctl that passes a dirty bitmap to kernel space. KVM can then
decide what to do with it. I guess the easiest implementation for now would
be to ignore the bitmap and simply flush the shadow tlb.

That gives us the flush almost for free. All we need to do is set the tlb
to all zeros (which should be done by env init anyway) and issue the
"something changed" call. KVM can then decide to either drop all of its
shadow state or loop through every shadow entry and flush it individually.
Maybe we should give a hint on the number of changed entries, so KVM can
implement some threshold.

Also, please tell me you didn't implement the previous revisions already.
It'd be a real bummer to see that work wasted only because we're still
iterating through the spec O_o.

Alex
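The user-space-to-kernel hint described above could look something like the sketch below. Everything here is a hypothetical illustration of the proposal, not an existing kvm interface: the struct, the threshold value, and the handler name are assumptions. It shows the two strategies the text mentions: ignore the bitmap and drop all shadow state, or use the dirty count as a threshold to flush entries individually.

```c
#include <stdint.h>

#define TLB_SIZE        64
#define FLUSH_THRESHOLD 8   /* above this many dirty entries, drop everything */

/* Hypothetical ioctl payload: one bit per modified TLB entry, plus a
 * count so the kernel can apply a threshold without scanning first. */
struct tlb_dirty_hint {
    uint64_t bitmap[TLB_SIZE / 64];
    uint32_t num_dirty;
};

static int popcount64(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;     /* clear lowest set bit */
        n++;
    }
    return n;
}

/* Kernel-side handler.  Returns 1 when it drops all shadow state, 0 when
 * it flushes only the entries marked dirty.  The easiest first
 * implementation would simply always take the "drop everything" path. */
static int kvm_handle_dirty_tlb(const struct tlb_dirty_hint *hint)
{
    if (hint->num_dirty > FLUSH_THRESHOLD)
        return 1;                        /* full shadow TLB flush */

    int flushed = 0;
    for (unsigned int i = 0; i < TLB_SIZE / 64; i++)
        flushed += popcount64(hint->bitmap[i]);
    /* ... flush each marked shadow entry individually here ... */
    (void)flushed;
    return 0;
}
```

With this shape, the "flush almost for free" trick is just: zero the shared array, set every bit (or a large num_dirty), and issue the call; the kernel's threshold path then degenerates into a full shadow flush.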