Alexander Graf wrote:
>> Marcelo Tosatti wrote:
>>> Add three PCI bridges to support 128 slots.
>>>
>>> Changes since v1:
>>> - Remove I/O address range "support" (so standard PCI I/O space is
>>>   used).
>>> - Verify that there are no special quirks for the 82801 PCI bridge.
>>> - Introduce separate flat IRQ mapping function for non-SPARC targets.
>>>
>> I've cooled off on the 128 slot stuff, mainly because most real hosts
>> don't have them. An unusual configuration will likely lead to
>> problems, as most guest OSes and workloads will not have been tested
>> thoroughly with it.
>
> This is more of a "let's do this conditionally" than a "let's not do
> it" reason imho.

Yes. More precisely, let's not do it until we're sure it works and
performs. I don't think a queue-per-disk approach will perform well:
each queue will always be very short, so it cannot amortize exit costs
and ring management overhead.
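To put a number on the batching argument, here's a toy model (both the
model and the numbers are invented purely for illustration): each
guest->host notification costs an exit and retires whatever happens to
be queued, so requests retired per exit is just the average queue depth
at notification time.

#include <stdio.h>

int main(void)
{
    /* Invented workload: 64 requests in flight across 80 disks. */
    double outstanding = 64.0;
    double disks = 80.0;

    /* One queue per disk: average depth is outstanding/disks. */
    double per_disk = outstanding / disks;
    if (per_disk < 1.0)
        per_disk = 1.0;    /* an exit retires at least one request */

    /* One shared queue: all in-flight requests batch together. */
    double shared = outstanding;

    printf("per-disk queues: %4.1f requests/exit\n", per_disk);
    printf("shared queue:    %4.1f requests/exit\n", shared);
    return 0;
}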
>> - it requires a large number of interrupts, which are difficult to
>>   provide, and which it is hard to ensure all OSes support. MSI is
>>   relatively new.
>
> We could just as well extend the device layout to have every device be
> attached to one virtual IOAPIC pin, so we'd have like 128 / 4 = 32
> IOAPICs in the system and one interrupt for each device.

That's problematic for these reasons:

- how many OSes work well with 32 IOAPICs?
- at some point you run out of interrupt vectors (~220 per cpu if the
  OS can allocate vectors per-cpu, otherwise ~220 total; x86 has 256
  IDT entries, minus the exceptions and system vectors)
- you will have many interrupts fired, each for a single device with a
  few requests, which reduces performance

>> - if only a few interrupts are available, then each interrupt
>>   requires scanning a large number of queues
>
> This case should be rare, basically only existing with OSes that don't
> support the APIC properly.

Hopefully.

>> The alternative approach of having the virtio block device control up
>> to 16 disks allows having those 80 disks with just 5 slots (and 5
>> interrupts). This is similar to the way traditional SCSI controllers
>> behave, and so should not surprise the guest OS.
>
> The one thing I'm actually really missing here is use cases. What are
> we doing this for? And further along the line, are there other
> approaches to the problems for which this was supposed to be a
> solution? Maybe someone can raise a case where it's not virtblk /
> virtnet.

The requirement for lots of storage is a given. There are two ways of
getting it: paying a lot of money to EMC or NetApp for a storage
controller, or connecting lots of disks directly and implementing the
storage controller in the OS (which is what EMC and NetApp do anyway,
inside their boxes). zfs is a good example of a use case, and I'd guess
databases could use this too if they were able to supply the
redundancy. For concreteness, a sketch of what a multi-disk virtio-blk
layout could look like follows below the sig.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
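Here is roughly what the config space of such a multi-disk virtio-blk
device could look like. The structs and field names below are invented
for illustration, not taken from any patch; the point is just that one
slot and one interrupt can serve many disks if each request carries a
disk index:

#include <stdint.h>

/* Hypothetical: one virtio-blk device fronting up to 16 disks. */
#define VIRTIO_MULTIBLK_MAX_DISKS 16

struct virtio_multiblk_config {
    uint32_t num_disks;              /* disks actually present */
    struct {
        uint64_t capacity;           /* in 512-byte sectors */
        uint32_t blk_size;           /* preferred I/O size, bytes */
    } disks[VIRTIO_MULTIBLK_MAX_DISKS];
};

/* Request headers grow a disk index next to the usual type/sector
 * fields, so a single queue can serve all the disks behind the slot. */
struct virtio_multiblk_req_hdr {
    uint32_t type;                   /* read/write/flush */
    uint32_t disk;                   /* 0 .. num_disks-1 */
    uint64_t sector;                 /* start sector on that disk */
};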