Re: [Qemu-devel] [PATCH 0/3 v2] virtio: improve virtio devices initialization time
Hi Gal,

So the good news is that, before I applied your patch, I found that initialization time had already improved: for 128 virtio-net devices it has gone from 137s the last time I tested it, to 40s today. The bad news is that for 256 virtio-net devices it now just hangs. However, when I apply your patch, it core dumps for 128 virtio-net devices. You can find the output here: https://pastebin.com/W4DXZ6J5. My config is attached to this email.

On 17/01/2018 09:28, Gal Hammer wrote:
> Hi Ray,
>
> On Tue, Jan 16, 2018 at 5:40 PM, Kinsella, Ray <m...@ashroe.eu> wrote:
>> Hi Gal,
>
> I'm not sure my patch will help with your problem of multiple PCI
> devices (lots of devices). My patch improves the eventfd registration
> path, while it seems that the source of your problem is either a
> limitation enforced by the PCI specifications or by QEMU's PCI
> emulation code.

Did you read down the thread to my timings?
https://marc.info/?l=qemu-devel=150100585427348=2

[device "pxb"]
  driver = "pxb-pcie"
  bus = "pcie.0"
  bus_nr = "0x80"

[device "io0"]
  driver = "ioh3420"
  multifunction = "on"
  addr = "0.0"
  chassis = "0"
  bus = "pxb"

[chardev "vhost0"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user0.sock"
  server = "off"

[netdev "netdev0"]
  type = "vhost-user"
  chardev = "vhost0"

[device "virtio0"]
  driver = "virtio-net-pci"
  netdev = "netdev0"
  bus = "io0"
  multifunction = "on"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.0"

[chardev "vhost1"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user1.sock"
  server = "off"

[netdev "netdev1"]
  type = "vhost-user"
  chardev = "vhost1"

[device "virtio1"]
  driver = "virtio-net-pci"
  netdev = "netdev1"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.1"

[chardev "vhost2"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user2.sock"
  server = "off"

[netdev "netdev2"]
  type = "vhost-user"
  chardev = "vhost2"

[device "virtio2"]
  driver = "virtio-net-pci"
  netdev = "netdev2"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.2"

[chardev "vhost3"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user3.sock"
  server = "off"

[netdev "netdev3"]
  type = "vhost-user"
  chardev = "vhost3"

[device "virtio3"]
  driver = "virtio-net-pci"
  netdev = "netdev3"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.3"

[chardev "vhost4"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user4.sock"
  server = "off"

[netdev "netdev4"]
  type = "vhost-user"
  chardev = "vhost4"

[device "virtio4"]
  driver = "virtio-net-pci"
  netdev = "netdev4"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.4"

[chardev "vhost5"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user5.sock"
  server = "off"

[netdev "netdev5"]
  type = "vhost-user"
  chardev = "vhost5"

[device "virtio5"]
  driver = "virtio-net-pci"
  netdev = "netdev5"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.5"

[chardev "vhost6"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user6.sock"
  server = "off"

[netdev "netdev6"]
  type = "vhost-user"
  chardev = "vhost6"

[device "virtio6"]
  driver = "virtio-net-pci"
  netdev = "netdev6"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.6"

[chardev "vhost7"]
  backend = "socket"
  path = "/tmp/vpp/vhost-user7.sock"
  server = "off"

[netdev "netdev7"]
  type = "vhost-user"
  chardev = "vhost7"

[device "virtio7"]
  driver = "virtio-net-pci"
  netdev = "netdev7"
  bus = "io0"
  disable-modern = "off"
  disable-legacy = "on"
  x-ignore-backend-features = "on"
  addr = "0.7"
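As an aside, the eight chardev/netdev/virtio stanzas in the attached config differ only in their index, so a config like this can be generated with a short script rather than written by hand. This is a sketch, not part of the thread: the names, socket paths and property values simply mirror the config above, and the helper name `make_stanzas` is made up.

```python
# Sketch: generate the repetitive chardev/netdev/virtio-net-pci stanzas
# seen in the attached config (names and socket paths mirror that config).

def make_stanzas(count):
    out = []
    for i in range(count):
        out.append(f'[chardev "vhost{i}"]\n'
                   f'  backend = "socket"\n'
                   f'  path = "/tmp/vpp/vhost-user{i}.sock"\n'
                   f'  server = "off"\n')
        out.append(f'[netdev "netdev{i}"]\n'
                   f'  type = "vhost-user"\n'
                   f'  chardev = "vhost{i}"\n')
        # function 0 of a multi-function slot carries multifunction = "on",
        # matching how "virtio0" is declared in the config above
        mf = '  multifunction = "on"\n' if i % 8 == 0 else ''
        out.append(f'[device "virtio{i}"]\n'
                   f'  driver = "virtio-net-pci"\n'
                   f'  netdev = "netdev{i}"\n'
                   f'  bus = "io0"\n'
                   f'{mf}'
                   f'  disable-modern = "off"\n'
                   f'  disable-legacy = "on"\n'
                   f'  x-ignore-backend-features = "on"\n'
                   f'  addr = "{i // 8}.{i % 8}"\n')
    return "\n".join(out)

if __name__ == "__main__":
    print(make_stanzas(8))
```

The output of `make_stanzas(8)` reproduces the vhost0..vhost7 groups above and can be fed to QEMU with -readconfig.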
Re: [Qemu-devel] [PATCH 0/3 v2] virtio: improve virtio devices initialization time
Hi Gal,

Brilliant - will test this in the next day or two. Hopefully this will help resolve the issues I reported last summer.
http://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg05268.html

Ray K

On 14/01/2018 10:06, Gal Hammer wrote:
> A bug was reported about a very slow boot time and 100% CPU usage by
> both Windows and Linux guests when running a VM with multiple
> virtio-serial devices (https://bugzilla.redhat.com/1528588). For
> example, running a VM with 25 virtio-serial devices, each one with
> max_ports=511, could have a boot time of around 30 minutes. With this
> patch (and another patch to kvm) the boot time is reduced to
> approximately 3 minutes.
>
> The patch wraps all the changes made to the Memory Regions during the
> eventfd registrations in a memory regions transaction. I had to add a
> cleanup callback function to the EventNotifier struct, so it will be
> possible to use a transaction in the shutdown code path as well.
>
> Gal Hammer (3):
>   qemu: add a cleanup callback function to EventNotifier
>   virtio: postpone the execution of event_notifier_cleanup function
>   virtio: improve virtio devices initialization time
>
>  accel/kvm/kvm-all.c           |  4
>  hw/virtio/virtio-bus.c        | 19 +++
>  hw/virtio/virtio.c            |  5 +
>  include/qemu/event_notifier.h |  1 +
>  util/event_notifier-posix.c   |  5 -
>  util/event_notifier-win32.c   |  2 ++
>  6 files changed, 27 insertions(+), 9 deletions(-)
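The performance idea in the cover letter - batch all the memory-region changes made during eventfd registration into one transaction, so the expensive address-space rebuild happens once per batch instead of once per registration - can be sketched outside QEMU. This toy model is illustrative only: the class, method and counter names here are invented for the sketch and are not QEMU's API.

```python
# Toy model of the batching idea from the cover letter: committing the
# memory map is expensive, so defer it until a whole batch of eventfd
# registrations has been queued. All names are illustrative, not QEMU's.

class MemoryMap:
    def __init__(self):
        self.eventfds = []
        self.rebuilds = 0   # counts expensive address-space rebuilds
        self._depth = 0     # transaction nesting depth

    def transaction_begin(self):
        self._depth += 1

    def transaction_commit(self):
        self._depth -= 1
        if self._depth == 0:
            self.rebuilds += 1          # one rebuild for the whole batch

    def add_eventfd(self, fd):
        self.eventfds.append(fd)
        if self._depth == 0:
            self.rebuilds += 1          # no transaction: rebuild every time

# Registering 512 eventfds one at a time costs 512 rebuilds...
unbatched = MemoryMap()
for fd in range(512):
    unbatched.add_eventfd(fd)

# ...but inside a single transaction it costs exactly one.
batched = MemoryMap()
batched.transaction_begin()
for fd in range(512):
    batched.add_eventfd(fd)
batched.transaction_commit()

print(unbatched.rebuilds, batched.rebuilds)  # 512 1
```

The nesting counter mirrors why a transaction helps on the shutdown path too: any number of deregistrations inside begin/commit still trigger only one rebuild.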
Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
Marcel,

The findings are pretty consistent with what I identified, although it looks like SeaBIOS fares better than UEFI. Thanks for the heads-up, will reply on the thread itself.

Ray K

-----Original Message-----
From: Marcel Apfelbaum [mailto:mar...@redhat.com]
Sent: Wednesday, August 9, 2017 3:53 AM
To: Kinsella, Ray <ray.kinse...@intel.com>; Kevin O'Connor <ke...@koconnor.net>
Cc: Tan, Jianfeng <jianfeng@intel.com>; seab...@seabios.org; Michael Tsirkin <m...@redhat.com>; qemu-devel@nongnu.org; Gerd Hoffmann <kra...@redhat.com>
Subject: Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices

On 07/08/2017 22:00, Kinsella, Ray wrote:
> Hi Marcel,

Hi Ray,

Please have a look at this thread, I think Laszlo and Paolo found the root cause:
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg01368.html

It seems hot-plugging the devices would not help.

Thanks,
Marcel

> Yup - I am using SeaBIOS by default.
> I took all the measurements from the kernel time reported in syslog,
> as SeaBIOS wasn't exhibiting any obvious scaling problem.
>
> Ray K
>
> -----Original Message-----
> From: Marcel Apfelbaum [mailto:mar...@redhat.com]
> Sent: Wednesday, August 2, 2017 5:43 AM
> To: Kinsella, Ray <ray.kinse...@intel.com>; Kevin O'Connor
> <ke...@koconnor.net>
> Cc: Tan, Jianfeng <jianfeng@intel.com>; seab...@seabios.org; Michael
> Tsirkin <m...@redhat.com>; qemu-devel@nongnu.org; Gerd Hoffmann
> <kra...@redhat.com>
> Subject: Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
>
> It is an issue worth looking into. One more question: are all the
> measurements from OS boot? Do you use SeaBIOS?
> No problems with the firmware?
>
> Thanks,
> Marcel
Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
Hi Marcel,

Yup - I am using SeaBIOS by default. I took all the measurements from the kernel time reported in syslog, as SeaBIOS wasn't exhibiting any obvious scaling problem.

Ray K

-----Original Message-----
From: Marcel Apfelbaum [mailto:mar...@redhat.com]
Sent: Wednesday, August 2, 2017 5:43 AM
To: Kinsella, Ray <ray.kinse...@intel.com>; Kevin O'Connor <ke...@koconnor.net>
Cc: Tan, Jianfeng <jianfeng@intel.com>; seab...@seabios.org; Michael Tsirkin <m...@redhat.com>; qemu-devel@nongnu.org; Gerd Hoffmann <kra...@redhat.com>
Subject: Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices

It is an issue worth looking into. One more question: are all the measurements from OS boot? Do you use SeaBIOS? No problems with the firmware?

Thanks,
Marcel
Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
Hi Marcel,

On 24/07/2017 00:14, Marcel Apfelbaum wrote:
> On 24/07/2017 7:53, Kinsella, Ray wrote:
>
> Even if I am not aware of how much time it would take to init a
> bare-metal PCIe Root Port, it seems too much.

So I repeated the testing for 64, 128, 256 and 512 ports. I ensured the configuration was sane: at 128, there were twice the number of root ports and virtio-net-pci devices as at 64. I got the following results, shown in seconds. As you can see it is non-linear but not exponential; there is something that is not scaling well.

                   64   128   256   512
PCIe Root Ports    14    72   430  2672
ACPI                4    35   342  3863
Loading Drivers     1     1    31   621
Total Boot         34   137   890  7516

(I did try to test 1024 devices, but it just dies silently.)

Ray K
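As a rough sanity check on the "non-linear but not exponential" observation, the per-doubling growth factors can be computed from the root-port row of the table above. This is a back-of-the-envelope sketch using only the numbers reported in the message:

```python
import math

# PCIe Root Port init time (seconds) vs. port count, from the table above.
times = {64: 14, 128: 72, 256: 430, 512: 2672}

counts = sorted(times)
for a, b in zip(counts, counts[1:]):
    factor = times[b] / times[a]
    # exponent k such that time ~ n**k across this doubling step
    k = math.log2(factor)
    print(f"{a} -> {b}: x{factor:.1f} (n^{k:.2f})")
```

Each doubling multiplies the time by roughly 5-6, i.e. growth somewhere between quadratic and cubic in the port count - bad, but indeed not exponential.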
Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
So as it turns out, at 512 devices it has nothing to do with SeaBIOS; it was the kernel again. It is taking quite a while to start up - a little over two hours (7489 seconds). The main culprits appear to be enumerating/initializing the PCI Express Root Ports and enabling interrupts.

The PCI Express Root Ports are taking a long time to enumerate/initialize: about 43 minutes in total (2751 - 172 = 2579 s) across 64 ports, roughly 40 seconds each.

[   50.612822] pci_bus :80: root bus resource [bus 80-c1]
[  172.345361] pci :80:00.0: PCI bridge to [bus 81]
...
[ 2724.734240] pci :80:08.0: PCI bridge to [bus c1]
[ 2751.154702] ACPI: Enabled 2 GPEs in block 00 to 3F

I assume the hour or so (3827 seconds) below is being spent enabling interrupts.

[ 2899.394288] ACPI: PCI Interrupt Link [GSIG] enabled at IRQ 22
[ 2899.531324] ACPI: PCI Interrupt Link [GSIH] enabled at IRQ 23
[ 2899.534778] ACPI: PCI Interrupt Link [GSIE] enabled at IRQ 20
[ 6726.914388] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 6726.937932] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 6726.964699] Linux agpgart interface v0.103

Finally, there is another 20 minutes or so still to be accounted for in the boot.

[ 7489.202589] virtio_net virtio515 enp193s0f0: renamed from eth513

Poky (Yocto Project Reference Distro) 2.3 qemux86-64 ttyS0
qemux86-64 login: root

I will remove the virtio-net-pci devices and hotplug them instead. In theory this should improve boot time, at the expense of incurring some of these costs at runtime.
Ray K

-----Original Message-----
From: Kevin O'Connor [mailto:ke...@koconnor.net]
Sent: Sunday, July 23, 2017 1:05 PM
To: Marcel Apfelbaum <mar...@redhat.com>; Kinsella, Ray <ray.kinse...@intel.com>
Cc: qemu-devel@nongnu.org; seab...@seabios.org; Gerd Hoffmann <kra...@redhat.com>; Michael Tsirkin <m...@redhat.com>
Subject: Re: >256 Virtio-net-pci hotplug Devices

On Sun, Jul 23, 2017 at 07:28:01PM +0300, Marcel Apfelbaum wrote:
> On 22/07/2017 2:57, Kinsella, Ray wrote:
> > When scaling up to 512 virtio-net devices, SeaBIOS appears to really
> > slow down when configuring PCI config space - haven't managed to get
> > this to work yet.

If there is a slowdown in SeaBIOS, it would help to produce a log with timing information - see:
https://www.seabios.org/Debugging#Timing_debug_messages

It may also help to increase the debug level in SeaBIOS to get more fine grained timing reports.

-Kevin
Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
Hi Marcel,

On 21/07/2017 01:33, Marcel Apfelbaum wrote:
> On 20/07/2017 3:44, Kinsella, Ray wrote:
>
> That's strange. Please ensure the virtio devices are working in
> virtio 1.0 mode (disable-modern=0,disable-legacy=1). Let us know any
> problems you see.
>
>> Not sure what yet, I will try scaling it with hotplugging tomorrow.
>
> Updates?

I have managed to scale it to 128 devices. The kernel does complain about IO address space exhaustion:

[   83.697956] pci :80:00.0: BAR 13: no space for [io size 0x1000]
[   83.700958] pci :80:00.0: BAR 13: failed to assign [io size 0x1000]
[   83.701689] pci :80:00.1: BAR 13: no space for [io size 0x1000]
[   83.702378] pci :80:00.1: BAR 13: failed to assign [io size 0x1000]
[   83.703093] pci :80:00.2: BAR 13: no space for [io size 0x1000]

I was surprised that I am running out of IO address space, as I am disabling legacy virtio. I assumed that this would remove the need for SeaBIOS to allocate IO address space for the PCI Express Root Ports. In any case, it doesn't stop the virtio-net devices coming up and working as expected.

[  668.692081] virtio_net virtio103 enp141s0f4: renamed from eth101
[  668.707114] virtio_net virtio130 enp144s0f7: renamed from eth128
[  668.719795] virtio_net virtio129 enp144s0f6: renamed from eth127

I encountered some issues in vhost due to open-file exhaustion, but resolved these with 'ulimit' in the usual way - burned a lot of time on that today.

When scaling up to 512 virtio-net devices, SeaBIOS appears to really slow down when configuring PCI config space - haven't managed to get this to work yet.

> Not really. All you have to do is to add a property to the
> pxb-pcie/pxb devices: pci_domain=x; then update the ACPI table to
> include the pxb domain. You also have to tweak the pxb-pcie/pxb
> devices a little to not share the bus numbers if pci_domain > 0.

Thanks for the information, will add to the list.

Ray K
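The IO exhaustion in the log above is easy to explain with a little arithmetic. This is a back-of-the-envelope sketch, not from the thread: the 64 KiB figure is the size of the x86 port IO space, and 0x1000 is the per-bridge IO window size visible in the "BAR 13" messages.

```python
# Each PCI bridge (here: a PCIe Root Port) that forwards IO needs a
# 4 KiB-aligned IO window of at least 0x1000 bytes, but x86 port IO
# space is only 64 KiB in total (and legacy devices take a slice of it).
io_space = 0x10000          # 64 KiB of x86 port IO space
io_window = 0x1000          # bridge IO window size, from the log above

max_ports_with_io = io_space // io_window
print(max_ports_with_io)    # 16
```

So at most 16 root ports can get an IO window even in the best case; with 128 ports, the vast majority inevitably fail IO assignment, which is harmless here since modern (virtio 1.0) devices don't need port IO.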
Re: [Qemu-devel] >256 Virtio-net-pci hotplug Devices
Hi Marcel,

> You can use multi-function PCIe Root Ports, this will give you 8
> ports per slot; if you have 16 empty slots (I think we have more) you
> reach 128 root ports. Then you can use multi-function virtio-net-pci
> devices, this will give you 8 functions per port, so you reach the
> target of 1024 devices. You lose hot-plug granularity since you can
> only hot-plug 8-function groups, but maybe that is OK, depending on
> your scenario.

Thanks for the advice - losing the hotplug granularity is something I think I can live with. It would mean I would have to track how many ports are allocated to a VM, and create 8 new ports when 1 is required, caching the other 7 for when they are needed.

> Even so, you can use one cold-plugged pxb-pcie if you don't have
> enough empty slots on pcie.0, in order to reach the maximum number of
> PCIe Root Ports (256), which is the maximum for a single PCI domain.

Took your advice - see the attached cfg, it works exactly as you indicated. If you are interested, you can use it from your VM by adding -readconfig to your qemu cmd line. I can currently only manage to start a VM with around 50 coldplugged virtio devices before something breaks. Not sure what yet, I will try scaling it with hotplugging tomorrow.

> If you need granularity per single device (1000+ hot-pluggable), you
> could enhance the pxb-pcie to support multiple pci domains.

Do you think there would be much work in this?

Thanks,
Ray K

test.cfg.gz
Description: application/gzip
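For reference, the multi-function Root Port layout described above looks roughly like this in -readconfig syntax. This is a sketch modelled on the config earlier in the thread, not the attached test.cfg: the slot number (4), chassis values and device names are illustrative placeholders.

```
[device "io0"]
  driver = "ioh3420"
  multifunction = "on"
  bus = "pcie.0"
  addr = "4.0"
  chassis = "1"

[device "io1"]
  driver = "ioh3420"
  bus = "pcie.0"
  addr = "4.1"
  chassis = "2"

# ...functions 4.2 through 4.7 follow the same pattern, giving
# 8 Root Ports in a single pcie.0 slot; 16 such slots yield 128.
```

Function 0 of the slot carries multifunction = "on", and each Root Port needs its own chassis number so hotplug slots stay distinct.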
[Qemu-devel] >256 Virtio-net-pci hotplug Devices
Hi folks,

I am trying to create a VM that supports hot-plugging a large number of virtio-net-pci devices - up to 1000 devices initially. From the docs (see below) and from playing with QEMU, it looks like there are two options, both with limitations.

PCI Express switch

It looks like using a PCI Express switch hierarchy is not an option, due to bus exhaustion. Each downstream port creates a separate bus, and each downstream port only supports hot-plugging a single device. So this gives us a max of 256-ish bus/device pairs.

PCI Express Root Ports

The other option is to use a flatter hierarchy, with a number of multi-function PCI Express Root Ports hanging off 'pcie.0'. However, each Root Port can support hot-plugging only a single device. So this method really becomes a function of how many free addresses we have on 'pcie.0'. If we make room for, say, 16 multifunction devices, we get 16*8 = 128.

So ultimately, this approach will give us a similar number to using a switch.

Is there another method? (pxb-pcie doesn't support hotplug, for instance, and only a single PCIe domain is supported by qemu.)

Thanks,
Ray K

pcie.txt (https://github.com/qemu/qemu/blob/master/docs/pcie.txt)
Q35 preso (http://wiki.qemu.org/images/4/4e/Q35.pdf)