On Thu, Apr 04, 2013 at 02:32:08PM +0200, Alexander Graf wrote:
> 
> On 04.04.2013, at 14:08, Gleb Natapov wrote:
> 
> > On Thu, Apr 04, 2013 at 01:57:34PM +0200, Alexander Graf wrote:
> >> 
> >> On 04.04.2013, at 12:50, Michael S. Tsirkin wrote:
> >> 
> >>> With KVM, MMIO is much slower than PIO, due to the need to
> >>> do page walk and emulation. But with EPT, it does not have to be: we
> >>> know the address from the VMCS so if the address is unique, we can look
> >>> up the eventfd directly, bypassing emulation.
> >>> 
> >>> Add an interface for userspace to specify this per-address, we can
> >>> use this e.g. for virtio.
> >>> 
> >>> The implementation adds a separate bus internally. This serves two
> >>> purposes:
> >>> - minimize overhead for old userspace that does not use PV MMIO
> >>> - minimize disruption in other code (since we don't know the length,
> >>>   devices on the MMIO bus only get a valid address in write, this
> >>>   way we don't need to touch all devices to teach them to handle
> >>>   an invalid length)
> >>> 
> >>> At the moment, this optimization is only supported for EPT on x86 and
> >>> silently ignored for NPT and MMU, so everything works correctly but
> >>> slowly.
> >>> 
> >>> TODO: NPT, MMU and non x86 architectures.
> >>> 
> >>> The idea was suggested by Peter Anvin. Lots of thanks to Gleb for
> >>> pre-review and suggestions.
> >>> 
> >>> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> >> 
> >> This still uses page fault intercepts which are orders of magnitude
> >> slower than hypercalls. Why don't you just create a PV MMIO hypercall that
> >> the guest can use to invoke MMIO accesses towards the host based on
> >> physical addresses with explicit length encodings?
> >> 
> > It is slower, but not an order of magnitude slower. It becomes faster
> > with newer HW.
> > 
> >> That way you simplify and speed up all code paths, exceeding the speed of
> >> PIO exits even. It should also be quite easily portable, as all other
> >> platforms have hypercalls available as well.
> >> 
> > We are trying to avoid PV as much as possible (well this is also PV,
> > but not guest visible
> 
> Also, how is this not guest visible?

It's visible but it does not require special code in the guest. Only
in qemu. Guest can use standard iowritel/w/b.

> Who sets
> KVM_IOEVENTFD_FLAG_PV_MMIO?

It's an ioctl so qemu does this.

> The comment above its definition indicates
> that the guest does so, so it is guest visible.
> 
> +/*
> + * PV_MMIO - Guest can promise us that all accesses touching this address
> + * are writes of specified length, starting at the specified address.
> + * If not - it's a Guest bug.
> + * Can not be used together with either PIO or DATAMATCH.
> + */
> 
> 
> Alex

The requirement is to only access a specific address by a single aligned
instruction. For example, only aligned single-word accesses are allowed.
This is standard practice with many devices; in fact, the virtio spec
already requires this for notifications.

--
MST
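
To make the lookup and the access rule concrete, here is a minimal sketch
in C. It is a userspace model under stated assumptions, not the KVM code
from the patch: the names pv_mmio_entry, pv_mmio_bus, pv_mmio_register and
pv_mmio_lookup are hypothetical, and a plain int stands in for the eventfd.
It shows the two points made above: registration only accepts a unique,
naturally aligned address of a fixed width, and the fast path matches on
the faulting address alone because the length is not known at that point.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not KVM code: one entry per registered doorbell. */
struct pv_mmio_entry {
    uint64_t addr; /* guest-physical address of the doorbell */
    uint32_t len;  /* promised access width in bytes: 1, 2, 4 or 8 */
    int fd;        /* stand-in for the eventfd that would be signalled */
};

#define PV_MMIO_MAX 16

struct pv_mmio_bus {
    struct pv_mmio_entry entries[PV_MMIO_MAX];
    size_t count;
};

static bool pv_mmio_width_ok(uint32_t len)
{
    return len == 1 || len == 2 || len == 4 || len == 8;
}

/*
 * Register a doorbell. Returns 0 on success, -1 if the width is not a
 * supported size, the address is not naturally aligned for that width,
 * the table is full, or the range overlaps an existing entry (the
 * address must be unique for the direct lookup to be valid).
 * Address wrap-around near UINT64_MAX is ignored in this sketch.
 */
static int pv_mmio_register(struct pv_mmio_bus *bus, uint64_t addr,
                            uint32_t len, int fd)
{
    if (!pv_mmio_width_ok(len) || (addr & (len - 1)) != 0 ||
        bus->count == PV_MMIO_MAX)
        return -1;

    for (size_t i = 0; i < bus->count; i++) {
        const struct pv_mmio_entry *e = &bus->entries[i];

        if (addr < e->addr + e->len && e->addr < addr + len)
            return -1; /* overlapping ranges: address not unique */
    }

    bus->entries[bus->count++] = (struct pv_mmio_entry){ addr, len, fd };
    return 0;
}

/*
 * Fast path: only the faulting address is available, not the access
 * length, so match on the exact start address. Returns the fd to
 * signal, or -1 to fall back to the normal emulation path.
 */
static int pv_mmio_lookup(const struct pv_mmio_bus *bus, uint64_t fault_addr)
{
    for (size_t i = 0; i < bus->count; i++) {
        if (bus->entries[i].addr == fault_addr)
            return bus->entries[i].fd;
    }
    return -1;
}
```

A guest write that starts anywhere other than the registered address does
not hit the fast path in this model and would go through ordinary
emulation. A write of the wrong width to the registered address cannot be
detected here at all, which is why the flag comment calls that case a
guest bug.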