Hi Alex,
On 2021/9/9 4:45, Alex Williamson wrote:
On Fri, 3 Sep 2021 17:36:10 +0800
Kunkun Jiang <jiangkun...@huawei.com> wrote:
We expand MemoryRegions of vfio-pci sub-page MMIO BARs to
vfio_pci_write_config to improve IO performance.
The MemoryRegions of destination VM will not be expanded
successful in live migration, because their addresses have
been updated in vmstate_load_state (vfio_pci_load_config).
What's the call path through vfio_pci_write_config() that you're
relying on to get triggered to enable this and why wouldn't we just
walk all sub-page BARs in vfio_pci_load_config() to resolve the issue
then? It's my understanding that we do this update in write-config
because it's required that the VM sizes the BAR before using it, which
is not the case when we resume from migration. Thanks,
Let's take an example:
AArch64
host page granularity: 64KB
PCI device: *Bar2 size 32KB* [mem 0x8000200000-0x800020ffff 64bit pref]
When enable Command register bit 1(Memory Space), the code flow is
as follows:
vfio_pci_write_config (addr: 4 val: 2 len: 2)
// record the old address of each bar, 0xffffffffffffffff
old_addr[bar] = pdev->io_regions[bar].addr;
pci_default_write_config
pci_update_mappings
new_addr = pci_bar_address // 0x8000200000
r->addr = new_addr;
memory_region_addr_subregion_overlap
...
*vfio_listener_region_add*
alignment check of the ram section address and size
fail, return
*kvm_region_add*
kvm_set_phys_mem
alignment check of the ram section address and
size fail, return
// old_addr[bar] != pdev->io_regions[bar].addr &&
// 0 < vdev->bars[bar].region.size < qemu_real_host_page_size
vfio_sub_page_bar_update_mapping
*bar size = qemu_real_host_page_size*
...
vfio_listener_region_add
map success
kvm_region_add
kvm_set_phys_mem
map success
In live migration, only pci config data is sent to the destination VM.
Therefore, we need to update the bar's size before destination VM
using it.
In vfio_pci_load_config, the code flow is as follows:
vfio_pci_load_config
vmstate_load_state
*get_pci_config_device*
pci_update_mappings
...
// bar's addr is updated(0x8000200000), but bar's size
is still 32KB, so map failed
vfio_pci_write_config
// bar's addr will not be changed, so
vfio_sub_page_bar_update_mapping won't be called
My idea is that removing the check 'old_addr[bar] !=
pdev->io_regions[bar].addr' doesn't
affect the previous process. There's also a bar size check. In
vfio_sub_page_bar_update_mapping,
it will check if bar is mapped and page aligned.
1) If bar's addr is 0xffffffffffffffff, it will not pass the
vfio_sub_page_bar_update_mapping check.
2) If bar's size has been updated, it will not pass the bar size check
in vfio_pci_write_config.
Thanks,
Kunkun Jiang
Alex
Remove the restriction on base address change in
vfio_pci_write_config for correct mmapping sub-page MMIO
BARs. Accroding to my analysis, the remaining parameter
verification is enough.
Fixes: c5e2fb3ce4d (vfio: Add save and load functions for VFIO PCI devices)
Reported-by: Nianyao Tang <tangnian...@huawei.com>
Reported-by: Qixin Gan <ganqi...@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkun...@huawei.com>
---
hw/vfio/pci.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e1ea1d8a23..891b211ddf 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1189,18 +1189,12 @@ void vfio_pci_write_config(PCIDevice *pdev,
}
} else if (ranges_overlap(addr, len, PCI_BASE_ADDRESS_0, 24) ||
range_covers_byte(addr, len, PCI_COMMAND)) {
- pcibus_t old_addr[PCI_NUM_REGIONS - 1];
int bar;
- for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
- old_addr[bar] = pdev->io_regions[bar].addr;
- }
-
pci_default_write_config(pdev, addr, val, len);
for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
- if (old_addr[bar] != pdev->io_regions[bar].addr &&
- vdev->bars[bar].region.size > 0 &&
+ if (vdev->bars[bar].region.size > 0 &&
vdev->bars[bar].region.size < qemu_real_host_page_size) {
vfio_sub_page_bar_update_mapping(pdev, bar);
}
.