Fix issues in PCIe SR-IOV configuration register handling that caused inconsistent internal state due to improper write mask handling and incorrect migration behavior.
Two main problems were identified: 1. VF Enable bit write mask handling: pcie_sriov_config_write() incorrectly assumed that its val parameter was already masked, causing it to ignore the actual write mask. This led to the VF Enable bit being processed even when masked, resulting in incorrect VF registration/unregistration. 2. Migration state inconsistency: pcie_sriov_pf_post_load() unconditionally called register_vfs() regardless of the VF Enable bit state, creating inconsistent internal state when VFs should not be enabled. Additionally, it failed to properly update the NumVFs write mask based on the current configuration. Root cause analysis revealed that both functions relied on incorrect special-case assumptions instead of properly reading and consuming the actual configuration values. This change introduces a unified consume_config() function that reads actual configuration values and synchronize the internal state without special-case assumptions. The solution only adds register read overhead in non-hot-path code while ensuring correct SR-IOV state management across configuration writes and migration scenarios. Fixes: 5e7dd17e4348 ("pcie_sriov: Remove num_vfs from PCIESriovPF") Fixes: f9efcd47110d ("pcie_sriov: Register VFs after migration") Reported-by: Corentin BAYET <corentin.ba...@reversetactics.com> Signed-off-by: Akihiko Odaki <od...@rsg.ci.i.u-tokyo.ac.jp> --- hw/pci/pcie_sriov.c | 42 +++++++++++++++++++++++------------------- 1 file changed, 23 insertions(+), 19 deletions(-) diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c index 3ad18744f4a8ed2b35144fafcdc8e7e00fec3672..a08b5258275f51876c18d8a3f5787c5e351c38f6 100644 --- a/hw/pci/pcie_sriov.c +++ b/hw/pci/pcie_sriov.c @@ -64,6 +64,27 @@ static void unregister_vfs(PCIDevice *dev) pci_set_word(dev->wmask + dev->exp.sriov_cap + PCI_SRIOV_NUM_VF, 0xffff); } +static void consume_config(PCIDevice *dev) +{ + uint16_t sriov_cap = dev->exp.sriov_cap; + uint8_t *cfg = dev->config + sriov_cap; + uint8_t *wmask = dev->wmask + sriov_cap; + uint16_t num_vfs = pci_get_word(cfg + PCI_SRIOV_NUM_VF); + uint16_t wmask_val = PCI_SRIOV_CTRL_MSE | PCI_SRIOV_CTRL_ARI; + + if (pci_get_word(cfg + PCI_SRIOV_CTRL) & PCI_SRIOV_CTRL_VFE) { + register_vfs(dev); + } else { + unregister_vfs(dev); + } + + if (num_vfs <= pci_get_word(cfg + PCI_SRIOV_TOTAL_VF)) { + wmask_val |= PCI_SRIOV_CTRL_VFE; + } + + pci_set_word(wmask + PCI_SRIOV_CTRL, wmask_val); +} + static bool pcie_sriov_pf_init_common(PCIDevice *dev, uint16_t offset, uint16_t vf_dev_id, uint16_t init_vfs, uint16_t total_vfs, uint16_t vf_offset, @@ -416,30 +437,13 @@ void pcie_sriov_config_write(PCIDevice *dev, uint32_t address, trace_sriov_config_write(dev->name, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn), off, val, len); - if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) { - if (val & PCI_SRIOV_CTRL_VFE) { - register_vfs(dev); - } else { - unregister_vfs(dev); - } - } else if (range_covers_byte(off, len, PCI_SRIOV_NUM_VF)) { - uint8_t *cfg = dev->config + sriov_cap; - uint8_t *wmask = dev->wmask + sriov_cap; - uint16_t num_vfs = pci_get_word(cfg + PCI_SRIOV_NUM_VF); - uint16_t wmask_val = PCI_SRIOV_CTRL_MSE | PCI_SRIOV_CTRL_ARI; - - if (num_vfs <= pci_get_word(cfg + PCI_SRIOV_TOTAL_VF)) { - wmask_val |= PCI_SRIOV_CTRL_VFE; - } - - pci_set_word(wmask + PCI_SRIOV_CTRL, wmask_val); - } + consume_config(dev); } void pcie_sriov_pf_post_load(PCIDevice *dev) { if (dev->exp.sriov_cap) { - register_vfs(dev); + consume_config(dev); } } --- base-commit: f0737158b483e7ec2b2512145aeab888b85cc1f7 change-id: 20250709-wmask-29475ccdb3d6 Best regards, -- Akihiko Odaki <od...@rsg.ci.i.u-tokyo.ac.jp>