On 6/11/2021 2:15 PM, Steven Sistare wrote: > On 5/24/2021 2:29 PM, Steven Sistare wrote: >> On 5/21/2021 6:24 PM, Alex Williamson wrote:> On Fri, 7 May 2021 05:25:10 >> -0700 >>> Steve Sistare <steven.sist...@oracle.com> wrote: >>> >>>> [...] >>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>>> index 7a4fb6c..f7ac9f03 100644 >>>> --- a/hw/vfio/pci.c >>>> +++ b/hw/vfio/pci.c >>>> @@ -29,6 +29,8 @@ >>>> #include "hw/qdev-properties.h" >>>> #include "hw/qdev-properties-system.h" >>>> #include "migration/vmstate.h" >>>> +#include "migration/cpr.h" >>>> +#include "qemu/env.h" >>>> #include "qemu/error-report.h" >>>> #include "qemu/main-loop.h" >>>> #include "qemu/module.h" >>>> @@ -1612,6 +1614,14 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice >>>> *vdev, bool enabled) >>>> } >>>> } >>>> >>>> +static void vfio_config_sync(VFIOPCIDevice *vdev, uint32_t offset, size_t >>>> len) >>>> +{ >>>> + if (pread(vdev->vbasedev.fd, vdev->pdev.config + offset, len, >>>> + vdev->config_offset + offset) != len) { >>>> + error_report("vfio_config_sync pread failed"); >>>> + } >>>> +} >>>> + >>>> static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr) >>>> { >>>> VFIOBAR *bar = &vdev->bars[nr]; >>>> @@ -1652,6 +1662,7 @@ static void vfio_bars_prepare(VFIOPCIDevice *vdev) >>>> static void vfio_bar_register(VFIOPCIDevice *vdev, int nr) >>>> { >>>> VFIOBAR *bar = &vdev->bars[nr]; >>>> + PCIDevice *pdev = &vdev->pdev; >>>> char *name; >>>> >>>> if (!bar->size) { >>>> @@ -1672,7 +1683,10 @@ static void vfio_bar_register(VFIOPCIDevice *vdev, >>>> int nr) >>>> } >>>> } >>>> >>>> - pci_register_bar(&vdev->pdev, nr, bar->type, bar->mr); >>>> + pci_register_bar(pdev, nr, bar->type, bar->mr); >>>> + if (pdev->reused) { >>>> + vfio_config_sync(vdev, pci_bar(pdev, nr), 8); >>> >>> Assuming 64-bit BARs? This might be the first case where we actually >>> rely on the kernel BAR values, IIRC we usually use QEMU's emulation. >> >> No asssumptions. vfio_config_sync() preads a piece of config space using a >> single >> system call, copying directly to the qemu buffer, not looking at words or >> calling any >> action functions. >> >> [...] >>>> @@ -3098,6 +3115,11 @@ static void vfio_realize(PCIDevice *pdev, Error >>>> **errp) >>>> vfio_register_req_notifier(vdev); >>>> vfio_setup_resetfn_quirk(vdev); >>>> >>>> + vfio_config_sync(vdev, pdev->msix_cap + PCI_MSIX_FLAGS, 2); >>>> + if (pdev->reused) { >>>> + pci_update_mappings(pdev); >>>> + } >>>> + >>> >>> Are the msix flag sync and mapping update related? They seem >>> independent to me. A blank line and comment would be helpful. >> >> OK. >> >>> I expect we'd need to call msix_enabled() somewhere for the msix flag >>> sync to be effective. >> >> Yes, vfio_pci_post_load in cpr part 2 calls msix_enabled. >> >>> Is there an assumption here of msi-x only support or is it not needed >>> for msi or intx? >> >> The code supports msi-x and msi. However, I should only be sync'ing >> PCI_MSIX_FLAGS >> if pdev->cap_present & QEMU_PCI_CAP_MSIX. And, I am missing a sync for >> PCI_MSI_FLAGS. >> I'll fix that. > > Hi Alex, FYI, I am making more changes here. The calls to vfio_config_sync > fix pdev->config[] > words that are initialized during vfio_realize(), by pread'ing from the live > kernel config. > However, it makes more sense to suppress the undesired re-initialization, > rather than undo > the damage later. Thus I will add a few more 'if (!pdev->reused)' guards in > msix and pci bar > init functions, and delete vfio_config_sync. > > Most of the config is preserved in the kernel across restart. However, the > bits that are > purely emulated (indicated by the emulated_config_bits mask) may be rejected > when they > are written through to the kernel, and thus are currently lost on restart. I > need to save > pdev->config[] in the vmstate file, and in vfio_pci_post_load, merge it with > the kernel > config using emulated_config_bits. > > Sound sane?
Furthermore, there is no need to check reused and suppress initialization of msix and pci bar, as the vmstate loader fixes them up. - Steve