Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)
Alexander Graf wrote:
>> Marcelo Tosatti wrote:
>>> Add three PCI bridges to support 128 slots.
>>>
>>> Changes since v1:
>>> - Remove I/O address range "support" (so standard PCI I/O space is
>>> used).
>>> - Verify that there's no special quirks for 82801 PCI bridge.
>>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>>
>>
>> I've cooled off on the 128 slot stuff, mainly because most real hosts
>> don't have them. An unusual configuration will likely lead to problems
>> as most guest OSes and workloads will not have been tested thoroughly
>> with them.
>
> This is more of a "let's do this conditionally" than a "let's not do
> it" reason imho.

Yes. More precisely, let's not do it until we're sure it works and
performs. I don't think a queue-per-disk approach will perform well,
since the queue will always be very short and will not be able to
amortize exit costs and ring management overhead very well.

>> - it requires a large number of interrupts, which are difficult to
>> provide, and which it is hard to ensure all OSes support. MSI is
>> relatively new.
>
> We could just as well extend the device layout to have every device be
> attached to one virtual IOAPIC pin, so we'd have like 128 / 4 = 32
> IOAPICs in the system and one interrupt for each device.

That's problematic for these reasons:

- how many OSes work well with 32 IOAPICs?
- at some point, you run out of interrupt vectors (~220 per cpu if the
  OS can allocate per-cpu vectors; otherwise just ~220 total)
- you will have many interrupts fired, each for a single device with a
  few requests, reducing performance

>> - if only a few interrupts are available, then each interrupt requires
>> scanning a large number of queues
>
> This case should be rare, basically only existent with OSes that don't
> support the APIC properly.

Hopefully.

>> The alternative approach of having the virtio block device control up to
>> 16 disks allows having those 80 disks with just 5 slots (and 5
>> interrupts). This is similar to the way traditional SCSI controllers
>> behave, and so should not surprise the guest OS.
>
> The one thing I'm actually really missing here is use cases. What are
> we doing this for? And further along the line, are there other
> approaches to the problems for which this was supposed to be a
> solution? Maybe someone can raise a case where it's not virtblk /
> virtnet.

The requirement for lots of storage is a given. There are two ways of
doing that: paying a lot of money to EMC or NetApp for a storage
controller, or connecting lots of disks directly and doing the storage
controller in the OS (which is what EMC and NetApp do anyway, inside
their boxes). zfs is a good example of a use case, and I'd guess
databases could use this too if they were able to supply the redundancy.

-- 
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>> Add three PCI bridges to support 128 slots.
>>>
>>> Changes since v1:
>>> - Remove I/O address range "support" (so standard PCI I/O space is
>>> used).
>>> - Verify that there's no special quirks for 82801 PCI bridge.
>>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>>
>>
>> I've cooled off on the 128 slot stuff, mainly because most real hosts
>> don't have them. An unusual configuration will likely lead to
>> problems as most guest OSes and workloads will not have been tested
>> thoroughly with them.
>>
>> - it requires a large number of interrupts, which are difficult to
>> provide, and which it is hard to ensure all OSes support. MSI is
>> relatively new.
>> - if only a few interrupts are available, then each interrupt
>> requires scanning a large number of queues
>>
>> If we are to do this, then we need better tests than "80 disks show up".
>>
>> The alternative approach of having the virtio block device control up
>> to 16 disks allows having those 80 disks with just 5 slots (and 5
>> interrupts). This is similar to the way traditional SCSI controllers
>> behave, and so should not surprise the guest OS.
>>
>
> If you have a single virtio-blk device that shows up as 8 functions,
> we could achieve the same thing. We can cheat with the interrupt
> handlers to avoid cache line bouncing too.

You can't cheat on all guests, and even on Linux, it's better to keep
on doing what real hardware does than to go off on a tangent that no
one else uses.

You'll have to cheat on ->kick(), too. Virtio needs one exit per
O(queue depth). With one spindle per ring, it doesn't make sense to
have a queue depth > 4 (or latency goes to hell), so you have many
exits.

> Plus, we can use PCI hotplug so we don't have to reinvent a new
> hotplug mechanism.

You can plug disks into a Fibre Channel mesh, so presumably that works
on real hardware somehow.
> I'm inclined to think that ring sharing isn't as useful as it seems
> as long as we don't have indirect scatter gather lists.

I agree, but I think that indirect sg is very important for storage:

- a long sg list is cheap from the disk's point of view (the seeks are
  what's expensive)
- it is important to keep the queue depth meaningful and small
  (O(spindles * 3)), as it drastically affects latency

-- 
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)
On May 4, 2008, at 9:56 AM, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> Add three PCI bridges to support 128 slots.
>>
>> Changes since v1:
>> - Remove I/O address range "support" (so standard PCI I/O space is
>> used).
>> - Verify that there's no special quirks for 82801 PCI bridge.
>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>
>
> I've cooled off on the 128 slot stuff, mainly because most real hosts
> don't have them. An unusual configuration will likely lead to problems
> as most guest OSes and workloads will not have been tested thoroughly
> with them.

This is more of a "let's do this conditionally" than a "let's not do
it" reason imho.

> - it requires a large number of interrupts, which are difficult to
> provide, and which it is hard to ensure all OSes support. MSI is
> relatively new.

We could just as well extend the device layout to have every device be
attached to one virtual IOAPIC pin, so we'd have like 128 / 4 = 32
IOAPICs in the system and one interrupt for each device.

> - if only a few interrupts are available, then each interrupt requires
> scanning a large number of queues

This case should be rare, basically only existent with OSes that don't
support the APIC properly.

> If we are to do this, then we need better tests than "80 disks show up".

True.

> The alternative approach of having the virtio block device control up
> to 16 disks allows having those 80 disks with just 5 slots (and 5
> interrupts). This is similar to the way traditional SCSI controllers
> behave, and so should not surprise the guest OS.

The one thing I'm actually really missing here is use cases. What are
we doing this for? And further along the line, are there other
approaches to the problems for which this was supposed to be a
solution? Maybe someone can raise a case where it's not virtblk /
virtnet.

Alex
Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)
Avi Kivity wrote:
> Marcelo Tosatti wrote:
>
>> Add three PCI bridges to support 128 slots.
>>
>> Changes since v1:
>> - Remove I/O address range "support" (so standard PCI I/O space is used).
>> - Verify that there's no special quirks for 82801 PCI bridge.
>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>
>
> I've cooled off on the 128 slot stuff, mainly because most real hosts
> don't have them. An unusual configuration will likely lead to problems
> as most guest OSes and workloads will not have been tested thoroughly
> with them.
>
> - it requires a large number of interrupts, which are difficult to
> provide, and which it is hard to ensure all OSes support. MSI is
> relatively new.
> - if only a few interrupts are available, then each interrupt requires
> scanning a large number of queues
>
> If we are to do this, then we need better tests than "80 disks show up".
>
> The alternative approach of having the virtio block device control up to
> 16 disks allows having those 80 disks with just 5 slots (and 5
> interrupts). This is similar to the way traditional SCSI controllers
> behave, and so should not surprise the guest OS.
>

If you have a single virtio-blk device that shows up as 8 functions, we
could achieve the same thing. We can cheat with the interrupt handlers
to avoid cache line bouncing too.

Plus, we can use PCI hotplug so we don't have to reinvent a new hotplug
mechanism.

I'm inclined to think that ring sharing isn't as useful as it seems as
long as we don't have indirect scatter gather lists.

Regards,

Anthony Liguori
Re: [kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)
Marcelo Tosatti wrote:
> Add three PCI bridges to support 128 slots.
>
> Changes since v1:
> - Remove I/O address range "support" (so standard PCI I/O space is used).
> - Verify that there's no special quirks for 82801 PCI bridge.
> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>

I've cooled off on the 128 slot stuff, mainly because most real hosts
don't have them. An unusual configuration will likely lead to problems
as most guest OSes and workloads will not have been tested thoroughly
with them.

- it requires a large number of interrupts, which are difficult to
provide, and which it is hard to ensure all OSes support. MSI is
relatively new.
- if only a few interrupts are available, then each interrupt requires
scanning a large number of queues

If we are to do this, then we need better tests than "80 disks show up".

The alternative approach of having the virtio block device control up to
16 disks allows having those 80 disks with just 5 slots (and 5
interrupts). This is similar to the way traditional SCSI controllers
behave, and so should not surprise the guest OS.

-- 
error compiling committee.c: too many arguments to function
[kvm-devel] [patch 0/3] QEMU/KVM: add support for 128 PCI slots (v2)
Add three PCI bridges to support 128 slots.

Changes since v1:
- Remove I/O address range "support" (so standard PCI I/O space is used).
- Verify that there's no special quirks for 82801 PCI bridge.
- Introduce separate flat IRQ mapping function for non-SPARC targets.