Re: [Qemu-devel] [PATCHv2] virtio-pci: add MMIO property
> @@ -682,10 +733,18 @@ void virtio_init_pci(VirtIOPCIProxy *proxy, VirtIODevice *vdev)
>     if (size & (size-1))
>         size = 1 << qemu_fls(size);
>
> +    proxy->bar0_mask = size - 1;

You'll get better performance if you use page-sized mappings. You're already
creating a mapping bigger than the actual data (rounding up to a power of
two), so you may as well pick a value that's convenient for qemu to map into
the address space.

Paul

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
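The rounding Paul suggests can be sketched as follows. This is a minimal
standalone illustration, not QEMU code: `round_bar_size` is a hypothetical
helper, and a 4 KiB page is assumed.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed host/guest page size */

/* Round a BAR size up to a power of two (as the quoted patch already
 * does), then up to at least one page, so the resulting region is
 * convenient for qemu to map into the address space. */
static uint32_t round_bar_size(uint32_t size)
{
    uint32_t p = 1;
    while (p < size)        /* next power of two >= size */
        p <<= 1;
    if (p < PAGE_SIZE)      /* and never smaller than a page */
        p = PAGE_SIZE;
    return p;
}
```

With this, a 0x30-byte config region still costs a full 4 KiB mapping, but
the host can back it with a single page rather than an awkward sub-page
region.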
Re: [Qemu-devel] [RFC 0/4] virtio-mmio transport
> I can do that, but not this year (on holiday from Friday 16th, without
> any access to Internet whatsoever :-) One thing to be decided is in what
> order the halves should be filled? Low first, then high? High then low?
> Does it matter at all? :-)

My initial thought was that you shouldn't be changing this value while the
ring is enabled. Unfortunately you disable the ring by setting the address
to zero, so that argument doesn't work :-/

I suggest that the device buffer writes to the high part, and construct the
actual 64-bit value when the low part is written. That allows 32-bit guests
to ignore the high part entirely. Requiring the guest to always write high
then low also works, though I don't see any benefit to either guest or host.

Acting on writes as soon as they occur is not a viable option, as it causes
the device to be enabled before the full address has been written. You could
say the address takes effect after both halves have been written, and that
writes must come in pairs but may occur in either order. However there is a
risk that host and guest will somehow get out of sync on which values pair
together, so IMO this is a bad idea.

If you can't stomach the above then I guess changing the condition for ring
enablement to QueueNum != 0 and rearranging the configuration sequence
appropriately would also do the trick. Having this be different between PCI
and mmio is probably not worth the confusion though.

Paul
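The latch-high, commit-on-low scheme Paul proposes could look something like
the sketch below. The register offsets, struct, and function names here are
hypothetical, for illustration only; the real layout is whatever the
virtio-mmio spec ends up defining.

```c
#include <stdint.h>

/* Hypothetical register offsets, for illustration only. */
enum { QUEUE_PFN_LOW = 0x40, QUEUE_PFN_HIGH = 0x44 };

typedef struct {
    uint32_t pfn_high_latch;  /* buffered high half, no side effects */
    uint64_t queue_pfn;       /* full address, set on low-half write */
    int      enabled;
} VirtQueueState;

/* Buffer writes to the high part; construct and act on the full 64-bit
 * value only when the low part is written. A 32-bit guest that never
 * touches the high register therefore still works (the latch stays 0),
 * and the ring is never enabled with a half-written address. */
static void queue_addr_write(VirtQueueState *vq, uint32_t offset, uint32_t val)
{
    if (offset == QUEUE_PFN_HIGH) {
        vq->pfn_high_latch = val;
    } else if (offset == QUEUE_PFN_LOW) {
        vq->queue_pfn = ((uint64_t)vq->pfn_high_latch << 32) | val;
        vq->enabled = (vq->queue_pfn != 0);
    }
}
```

Note how disabling still works through the existing convention: the guest
writes zero to both halves, and the ring is disabled when the low write
commits the all-zero address.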
Re: [Qemu-devel] [PATCH] virtio-spec: document block CMD and FLUSH
> Does this mean that virtio-blk supports all three combinations?
>
>    1. FLUSH that isn't a barrier
>    2. FLUSH that is also a barrier
>    3. Barrier that is not a flush
>
> 1 is good for fsync-like operations;
> 2 is good for journalling-like ordered operations.
> 3 sounds like it doesn't mean a lot as the host cache provides no
> guarantees and has no ordering facility that can be used.

(3) allows the guest to queue overlapping transfers with well-defined
results. I have no idea how useful this is in practice, but it's certainly
plausible.

Paul
Re: [Qemu-devel] [PATCH 0/4] megaraid_sas HBA emulation
> But I certainly do _not_ want to update the SCSI disk
> emulation, as this is really quite tied to the SCSI parallel
> interface used by the old lsi53c895a.c.

This is completely untrue. scsi-disk.c contains no transport-specific code.
It is deliberately designed to be independent of both the transport (e.g.
parallel SCSI or USB) and the mechanism by which the initiator transfers
data to the host.

Paul
Re: [Qemu-devel] Re: virtio-serial: An interface for host-guest communication
> However, as I've mentioned repeatedly, the reason I won't merge
> virtio-serial is that it duplicates functionality with virtio-console.
> If the two are converged, I'm happy to merge it. I'm not opposed to
> having more functionality.

I strongly agree.

Paul
Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication
On Tuesday 23 June 2009, Christian Bornträger wrote:
> On Tuesday 23 June 2009 14:55:52, Paul Brook wrote:
> > > Here are two patches. One implements a virtio-serial device in qemu
> > > and the other is the driver for a guest kernel.
> >
> > So I'll ask again. Why is this separate from virtio-console?
>
> I did some work on virtio-console, since kvm on s390 does not provide any
> other. I don't think we should mix two different types of devices into one
> driver. The only thing that these drivers have in common is the fact that
> there are two virtqueues, piping data (single bytes or larger chunks). So
> you could make the same argument with the first virtio_net driver (the one
> before GSO) - which is obviously wrong. The common part of the transport is
> already factored out to virtio_ring and the transports.

virtio-net is packet based, not stream based.

> In addition there are two ABIs involved: a userspace ABI (/dev/hvc0) and a
> guest/host ABI for this console. (and virtio was not meant to be a KVM-only
> interface, that we can change all the time). David A. Wheeler's 'SLOCCount'
> gives me 141 lines of code for virtio_console.c. I am quite confident that
> the saving we could achieve by merging these two drivers is not worth the
> hassle.

AFAICS the functionality provided is exactly the same. The host API is
identical, and the guest userspace API only has trivial differences (which
could be eliminated with a simple udev rule). By my reading virtio-serial
makes virtio-console entirely redundant.

> Discussion about merging the console code into this distracts from the main
> problem: To get the interface and functionality right before it becomes an
> ABI (is it /dev/ttyS, network like or is it something completely
> different?).

Ah, now that's a different question. I don't know what the requirements are
for the higher level vmchannel interface. However I also don't care.

Paul
Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication
On Tuesday 23 June 2009, Amit Shah wrote:
> On (Tue) Jun 23 2009 [13:55:52], Paul Brook wrote:
> > > Here are two patches. One implements a virtio-serial device in qemu
> > > and the other is the driver for a guest kernel.
> >
> > So I'll ask again. Why is this separate from virtio-console?
>
> I'm basically writing a vmchannel and found out that a lot can be shared
> between some virtio devices. So I'm just trying to abstract out those
> things in virtio-serial. Once we're sure virtio-serial is good and ready
> to be merged, I will look at converting over virtio-console to the
> virtio-serial interface.

That doesn't really answer my question. We already have a virtual serial
device (called virtio-console). Why are you inventing another one?

Paul
Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication
> Here are two patches. One implements a virtio-serial device in qemu
> and the other is the driver for a guest kernel.

So I'll ask again. Why is this separate from virtio-console?

Paul
Re: [Qemu-devel] [PATCHv5 08/13] qemu: add support for resizing regions
On Thursday 18 June 2009, Michael S. Tsirkin wrote:
> Make it possible to resize PCI regions. This will be used by virtio
> with MSI-X, where the region size depends on whether MSI-X is enabled,
> and can change across load/save.

I thought we'd agreed we shouldn't be doing this, i.e. if the user tries to
load the wrong device, just bail out.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
> > If we can't start a new qemu with the same hardware configuration then we
> > should not be allowing migration or loading of snapshots.
>
> OK, so I'll add an option in virtio-net to disable msi-x, and such
> an option will be added in any device with msi-x support.
> Will that address your concern?

Yes, as long as migration fails when you try to migrate to the wrong kind of
device.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
On Wednesday 10 June 2009, Michael S. Tsirkin wrote:
> On Wed, Jun 10, 2009 at 05:46:03PM +0100, Paul Brook wrote:
> > > > If you can't create an identical machine from scratch then I don't
> > > > consider snapshot/migration to be a useful feature. i.e. as soon as
> > > > you shutdown and restart the guest it is liable to break anyway.
> > >
> > > Why is it liable to break?
> >
> > A VM booted on an old version of qemu and migrated to a new version will
> > behave differently to the same VM booted on a new version of qemu.
>
> It will behave identically. That's what the patch does: discover
> how the device behaved on old qemu, and make it behave the same way
> on new qemu.

You're missing the point. After doing a live migration from old-qemu to
new-qemu, there is no snapshot to load. We need to be able to shut down the
guest, kill qemu (without saving a snapshot), then start qemu with the exact
same hardware. If we can't start a new qemu with the same hardware
configuration then we should not be allowing migration or loading of
snapshots.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
> > If you can't create an identical machine from scratch then I don't
> > consider snapshot/migration to be a useful feature. i.e. as soon as you
> > shutdown and restart the guest it is liable to break anyway.
>
> Why is it liable to break?

A VM booted on an old version of qemu and migrated to a new version will
behave differently to the same VM booted on a new version of qemu. I hope I
don't need to explain why this is bad. As previously discussed, any guest
visible changes are liable to break a guest OS, particularly guests like
Windows which deliberately break when run on "different" hardware.

Personally I don't particularly care, but if we support live migration we
also need to support "cold" migration - i.e. shutdown and restart.

> So once you load an image with MSIX capability off,
> it will stay off across guest restarts.

I'm assuming guest restart includes restarting qemu.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
On Wednesday 10 June 2009, Michael S. Tsirkin wrote:
> On Wed, Jun 10, 2009 at 04:15:04PM +0100, Paul Brook wrote:
> > > > That seems just plain wrong to me. Loading a VM shouldn't
> > > > do anything that can't happen during normal operation.
> > >
> > > At least wrt pci, we are very far from this state: load just overwrites
> > > all registers, readonly or not, which can never happen during normal
> > > operation.
> >
> > IMO that code is wrong. We should only be loading things that the guest
> > can change (directly or indirectly).
>
> Making it work this way will mean that minor changes to a device can
> break backwards compatibility with old images, often in surprising ways.
> What are the advantages?

If you can't create an identical machine from scratch then I don't consider
snapshot/migration to be a useful feature. i.e. as soon as you shutdown and
restart the guest it is liable to break anyway.

It may be that the snapshot/migration code wants to include a machine
config, and create a new machine from that. However this is a separate
issue, and arguably something your VM manager should be handling for you.

Paul
Re: [Qemu-devel] [PATCHv3 03/13] qemu: add routines to manage PCI capabilities
> > caps can be anywhere, but we don't expect it to change during machine
> > execution lifetime.
> >
> > Or am I just confused by the name "pci_device_load"?
>
> Right. So I want to load an image and it has capability X at offset Y.
> wmask has to match. I don't want to assume that we never change Y
> for the device without breaking old images, so I clear wmask here
> and set it up again after looking up capabilities that I loaded.

We should not be loading state into a different device (or a similar device
with a different set of capabilities). If you want to provide backwards
compatibility then you should do that by creating a device that is the same
as the original.

As I mentioned in my earlier mail, loading a snapshot should never do
anything that can not be achieved through normal operation.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
> > That seems just plain wrong to me. Loading a VM shouldn't
> > do anything that can't happen during normal operation.
>
> At least wrt pci, we are very far from this state: load just overwrites
> all registers, readonly or not, which can never happen during normal
> operation.

IMO that code is wrong. We should only be loading things that the guest can
change (directly or indirectly).

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
> > If we really need to avoid MSI-X capable devices then that should be done
> > explicitly per-device. i.e. you have a different virtio-net device that
> > does not use MSI-X.
> >
> > Paul
>
> Why should it be done per-device?

Because otherwise you end up with the horrible hacks that you're currently
tripping over: devices have to magically morph into a different device when
you load a VM. That seems just plain wrong to me. Loading a VM shouldn't do
anything that can't happen during normal operation.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
> > > Note that platform must set a flag to declare MSI supported.
> > > For PC this will be set by APIC.
> >
> > This sounds wrong. The device shouldn't know or care whether the system
> > has a MSI capable interrupt controller. That's for the guest OS to
> > figure out.
>
> You are right of course. In theory there's nothing that breaks if I
> set this flag to on, on all platforms. OTOH if qemu emulates some
> controller incorrectly, guest might misdetect MSI support in the
> controller, and things will break horribly.
>
> It seems safer to have a flag that can be enabled by people
> that know about a specific platform.

No. The solution is to fix whatever is broken. If we really need to avoid
MSI-X capable devices then that should be done explicitly per-device. i.e.
you have a different virtio-net device that does not use MSI-X.

Paul
Re: [Qemu-devel] [PATCH 05/11] qemu: MSI-X support functions
On Monday 25 May 2009, Michael S. Tsirkin wrote:
> Add functions implementing MSI-X support. First user will be virtio-pci.
> Note that platform must set a flag to declare MSI supported.
> For PC this will be set by APIC.

This sounds wrong. The device shouldn't know or care whether the system has
a MSI capable interrupt controller. That's for the guest OS to figure out.

Paul
Re: [PATCH] qemu: virtio save/load bindings
> > -/* FIXME: load/save binding. */
> > -//pci_device_save(&vdev->pci_dev, f);
> > -//msix_save(&vdev->pci_dev, f);
>
> qdev regressed save/restore? What else is broken right now from the
> qdev commit?
>
> I'm beginning to think committing in the state it was in was a mistake.
> Paul, can you put together a TODO so that we know all of the things that
> have regressed so we can get things back into shape?

Sorry, this one apparently slipped past. I'd intended to fix it, but
apparently never ported that bit of the patch.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
On Thursday 21 May 2009, Avi Kivity wrote:
> Paul Brook wrote:
> > > kvm implements the APIC in the host kernel (qemu upstream doesn't
> > > support this yet). The fast path is wired to the in-kernel APIC, not
> > > the cpu core directly.
> > >
> > > The idea is to wire it to UIO for device assignment, to a virtio-device
> > > implemented in the kernel, and to qemu.
> >
> > I still don't see why you're trying to bypass straight from the pci layer
> > to the apic. Why can't you just pass the apic MMIO writes to the kernel?
> > You've presumably got to update the apic state anyway.
>
> The fast path is an eventfd so that we don't have to teach all the
> clients about the details of MSI. Userspace programs the MSI details
> into kvm and hands the client an eventfd. All the client has to do is
> bang on the eventfd for the interrupt to be queued. The eventfd
> provides event coalescing and is equally useful from the kernel and
> userspace, and can be used with targets other than kvm.

So presumably if a device triggers an APIC interrupt using a write that
doesn't correspond to one of the currently configured PCI devices, it all
explodes horribly?

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
> > > > kvm has no business messing with the PCI device code.
> > >
> > > kvm has a fast path for irq injection. If qemu wants to support it we
> > > need some abstraction here.
> >
> > Fast path from where to where? Having the PCI layer bypass/re-implement
> > the APIC and inject the interrupt directly into the cpu core sounds like
> > a particularly bad idea.
>
> kvm implements the APIC in the host kernel (qemu upstream doesn't
> support this yet). The fast path is wired to the in-kernel APIC, not
> the cpu core directly.
>
> The idea is to wire it to UIO for device assignment, to a virtio-device
> implemented in the kernel, and to qemu.

I still don't see why you're trying to bypass straight from the pci layer to
the apic. Why can't you just pass the apic MMIO writes to the kernel? You've
presumably got to update the apic state anyway.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
On Thursday 21 May 2009, Avi Kivity wrote:
> Paul Brook wrote:
> > > > which is a trivial wrapper around stl_phys.
> > >
> > > OK, but I'm adding another level of indirection in the middle,
> > > to allow us to tie in a kvm backend.
> >
> > kvm has no business messing with the PCI device code.
>
> kvm has a fast path for irq injection. If qemu wants to support it we
> need some abstraction here.

Fast path from where to where? Having the PCI layer bypass/re-implement the
APIC and inject the interrupt directly into the cpu core sounds like a
particularly bad idea.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
> > which is a trivial wrapper around stl_phys.
>
> OK, but I'm adding another level of indirection in the middle,
> to allow us to tie in a kvm backend.

kvm has no business messing with the PCI device code.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
On Thursday 21 May 2009, Paul Brook wrote:
> > > MSI provides multiple edge triggered interrupts, whereas traditional
> > > mode provides a single level triggered interrupt. My guess is most
> > > devices will want to treat these differently anyway.
> >
> > So, is qemu_send_msi better than qemu_set_irq?
>
> Neither. pci_send_msi, which is a trivial wrapper around stl_phys.

To clarify, you seem to be trying to fuse two largely separate features
together.

MSI is a standard PCI device capability[1] that involves the device
performing a 32-bit memory write when something interesting occurs. These
writes may or may not be directed at an APIC.

The x86 APIC has a memory mapped interface that allows generation of CPU
interrupts in response to memory writes. These may or may not come from an
MSI capable PCI device.

Paul

[1] Note a *device* capability, not a bus capability.
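Paul's point - that raising an MSI is nothing more than a 32-bit memory
write to a device-programmed address - can be sketched as below. `stl_phys`
is QEMU's real physical-memory store helper, stubbed here against a flat
array so the sketch is self-contained; `pci_send_msi` and the struct layout
are hypothetical, illustrating the proposed wrapper rather than an actual
QEMU function.

```c
#include <stdint.h>
#include <string.h>

static uint8_t phys_mem[0x10000];  /* stand-in for guest physical memory */

/* Stand-in for QEMU's stl_phys(): a 32-bit store to a physical address. */
static void stl_phys(uint64_t addr, uint32_t val)
{
    memcpy(&phys_mem[addr], &val, 4);
}

/* MSI state the guest programs through PCI config space. */
typedef struct {
    uint64_t msg_addr;  /* where the device writes (often the APIC window) */
    uint32_t msg_data;  /* base message value */
} PCIDevice;

/* The proposed trivial wrapper: raising vector n is just a memory write.
 * The target (APIC or plain RAM) cannot tell this apart from any other
 * write, which is exactly the property Paul is arguing from. */
static void pci_send_msi(PCIDevice *d, int vector)
{
    stl_phys(d->msg_addr, d->msg_data + vector);
}
```

Because the target address is ordinary physical memory, this is also why
the write would naturally be subject to IOMMU remapping like any other
device memory access, and why a system could even point `msg_addr` at RAM
and poll it.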
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
> > MSI provides multiple edge triggered interrupts, whereas traditional mode
> > provides a single level triggered interrupt. My guess is most devices
> > will want to treat these differently anyway.
>
> So, is qemu_send_msi better than qemu_set_irq?

Neither. pci_send_msi, which is a trivial wrapper around stl_phys.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
> > A tight coupling between PCI devices and the APIC is just going to cause
> > us problems later on. I'm going to come back to the fact that these are
> > memory writes, so once we get IOMMU support they will presumably be
> > subject to remapping by that, just like any other memory access.
>
> I'm not suggesting the qemu_irq will extend all the way to the apic.
> Think of it as connecting the device core with its interrupt unit.
>
> > Even ignoring that, qemu_irq isn't really the right interface. An MSI
> > is a one-off event, not a level state. OTOH stl_phys is exactly the
> > right interface.
>
> The qemu_irq callback should do an stl_phys(). The device is happy
> since it's using the same API it uses for non-MSI.

MSI provides multiple edge triggered interrupts, whereas traditional mode
provides a single level triggered interrupt. My guess is most devices will
want to treat these differently anyway.

Either way, this is an implementation detail between pci.c and individual
devices. It has nothing to do with the APIC.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
On Thursday 21 May 2009, Avi Kivity wrote:
> Paul Brook wrote:
> > > > > In any case we need some internal API for this, and qemu_irq looks
> > > > > like a good choice.
> > > >
> > > > What do you expect to be using this API?
> > >
> > > virtio, emulated devices capable of supporting MSI (e1000?), device
> > > assignment (not yet in qemu.git).
> >
> > It probably makes sense to have common infrastructure in pci.c to
> > expose/implement device side MSI functionality. However I see no need
> > for a direct API between the device and the APIC. We already have an
> > API for memory accesses and MMIO regions. I'm pretty sure a system
> > could implement MSI by pointing the device at system ram, and having
> > the CPU periodically poll that.
>
> Instead of writing directly, let's abstract it behind a qemu_set_irq().
> This is easier for device authors. The default implementation of the
> irq callback could write to apic memory, while for kvm we can directly
> trigger the interrupt via the kvm APIs.

I'm still not convinced. A tight coupling between PCI devices and the APIC
is just going to cause us problems later on. I'm going to come back to the
fact that these are memory writes, so once we get IOMMU support they will
presumably be subject to remapping by that, just like any other memory
access.

Even ignoring that, qemu_irq isn't really the right interface. An MSI is a
one-off event, not a level state. OTOH stl_phys is exactly the right
interface. The KVM interface should be contained within the APIC
implementation.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
> > > The PCI bus doesn't need any special support (I think) but something on
> > > the other end needs to interpret those writes.
> >
> > Sure. But there's definitely nothing PCI specific about it. I assumed
> > this would all be contained within the APIC.
>
> MSIs are defined by PCI and their configuration is done using the PCI
> configuration space.

A MSI is just a regular memory write, and the PCI spec explicitly states
that a target (e.g. the APIC) is unable to distinguish between a MSI and
any other write. The PCI config bits just provide a way of telling the
device where/what to write.

> > > In any case we need some internal API for this, and qemu_irq looks like
> > > a good choice.
> >
> > What do you expect to be using this API?
>
> virtio, emulated devices capable of supporting MSI (e1000?), device
> assignment (not yet in qemu.git).

It probably makes sense to have common infrastructure in pci.c to
expose/implement device side MSI functionality. However I see no need for a
direct API between the device and the APIC. We already have an API for
memory accesses and MMIO regions. I'm pretty sure a system could implement
MSI by pointing the device at system ram, and having the CPU periodically
poll that.

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
> The PCI bus doesn't need any special support (I think) but something on
> the other end needs to interpret those writes.

Sure. But there's definitely nothing PCI specific about it. I assumed this
would all be contained within the APIC.

> In any case we need some internal API for this, and qemu_irq looks like
> a good choice.

What do you expect to be using this API?

Paul
Re: [Qemu-devel] [PATCH] qemu: msi irq allocation api
On Wednesday 20 May 2009, Michael S. Tsirkin wrote:
> define api for allocating/setting up msi-x irqs, and for updating them
> with msi-x vector information, supply implementation in ioapic. Please
> comment on this API: I intend to port my msi-x patch to work on top of
> it.

I thought the point of MSI is that they are just regular memory writes, and
don't require any special bus support.

Paul