Hi,

I recently made the unfortunate discovery that virtio-serial-pci is quite expensive to stop/start during live migration.

By default we support 32 ports, each of which uses 2 queues. In my case it takes 2-3ms per queue to disconnect on the source host, and another 2-3ms per queue to reconnect on the target host. With 32 ports x 2 queues = 64 queues at roughly 5ms each, that adds up to a total cost of >300ms.

In our case this roughly tripled the outage time of a libvirt-based live migration, from about 150ms to almost 500ms.

It seems like the main problem is that we loop over all the queues, calling virtio_pci_set_host_notifier_internal() on each of them. That in turn calls memory_region_add_eventfd(), which calls memory_region_transaction_commit(), which rescans all the address spaces; that scan appears to account for the vast majority of the time.

Yes, setting max_ports to something smaller does help, but each port still adds 10-12ms to the overall live migration time, which is crazy.

Is there anything that could be done to make this code more efficient? Could we tweak the API so that we add all the eventfds and then do a single commit at the end (something like the sketch below)? Do we really need to scan the entire address space? I don't know the code well enough to answer these questions, but I'm hoping that one of you does.
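
To make the idea concrete, here's a rough sketch of what I had in mind, based on my (possibly wrong) reading that memory_region_transaction_begin()/memory_region_transaction_commit() nest via a depth counter and defer the expensive rescan until the outermost commit. The loop and names are paraphrased from hw/virtio/virtio-pci.c, not exact code:

    int n, r;

    /* Sketch only: batch all the eventfd updates into one memory
     * transaction so the address spaces get rescanned once, rather
     * than once per queue. */
    memory_region_transaction_begin();
    for (n = 0; n < nvqs; n++) {
        if (!virtio_queue_get_num(vdev, n)) {
            continue;
        }
        /* With a transaction open, memory_region_add_eventfd()
         * (called further down this path) should only queue the
         * update; the commit below does one rescan for all queues. */
        r = virtio_pci_set_host_notifier_internal(proxy, n, true, true);
        if (r < 0) {
            break;
        }
    }
    memory_region_transaction_commit();

If commits really do nest like that, the same trick would presumably apply on the disconnect side as well.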

Thanks,
Chris