On 04/07/2018 08:42, Cédric Le Goater wrote:
> On 07/04/2018 04:26 AM, Peter Xu wrote:
>> On Tue, Jul 03, 2018 at 02:45:24PM +0200, Cédric Le Goater wrote:
>>> On 07/02/2018 05:57 AM, Peter Xu wrote:
>>>> On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
>>>>> When a PCI device is unplugged, the PCI memory regions are deleted
>>>>> before the optional ROM RAMBlock is flagged non-migratable. But, when
>>>>> this is done, the RAMBlock has already been cleared from the region,
>>>>> leading to a segv.
>>>>>
>>>>> Fix the issue by testing the RAMBlock before flagging it, as it is
>>>>> done in qemu_ram_unset_idstr()
>>>>>
>>>>> Signed-off-by: Cédric Le Goater <c...@kaod.org>
>>>>> ---
>>>>>
>>>>>  I caught this bug while deleting a passthrough device from a pseries
>>>>>  machine. Here is the stack:
>>>>>    
>>>>>    #0  qemu_ram_unset_migratable (rb=0x0) at 
>>>>> /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
>>>>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, 
>>>>> dev=<optimized out>)
>>>>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
>>>>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
>>>>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, 
>>>>> value=<optimized out>, errp=0x0)
>>>>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, 
>>>>> v=<optimized out>, name=<optimized out>, 
>>>>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, 
>>>>> v=0x7fff70033110, 
>>>>>    #7  0x00000001007aaf1c in object_property_set_qobject 
>>>>> (obj=0x101796330, value=<optimized out>, 
>>>>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, 
>>>>> value=<optimized out>, 
>>>>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
>>>>>    #10 0x00000001007a6dd0 in object_finalize_child_property 
>>>>> (obj=<optimized out>, name=<optimized out>, 
>>>>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, 
>>>>> child=0x101796330, 
>>>>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb 
>>>>> (dev=0x101796330)
>>>>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
>>>>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
>>>>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
>>>>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, 
>>>>> idx=<optimized out>)
>>>>>
>>>>>  May be we should call pci_del_option_rom() before
>>>>>  pci_unregister_io_regions() ?
>>>>
>>>> This seems to make more sense to me.
>>>>
>>>> Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
>>>> it's not really deleting the ROM but unregistering the ROM only.
>>>> Instead IIUC it's pci_unregister_io_regions() which deleted that. So
>>>> maybe we can either rename the function pci_del_option_rom(), or we
>>>> can pick the ROM destruction out of pci_unregister_io_regions() and
>>>> put it into pci_del_option_rom() to make sure it's done as the last
>>>> step?
>>>
>>> So it is a little more complex than I thought. 
>>>
>>> The PCI device is a vfio PCI device and the PCI ROM region is initialized 
>>> in vfio_pci_size_rom() with memory_region_init_io(), which does not 
>>> allocate the RAMBlock, but has_rom is still set to true. 
>>>
>>> When the device is deleted, pci_del_option_rom() is called and with it, 
>>> vmstate_unregister_ram() because has_rom is set to true. Leading to the
>>> SEGV.
>>>
>>> I am not sure how to handle this case. It seems that the realize routine 
>>> of VFIOPCIDevice is hijacking a little the PCIDevice layer.
>>
>> Indeed.
>>
>> Then now I'm a bit confused on who actually deleted the ROM memory
>> region that was created when pci_add_option_rom() was called.  It
>> seems to be leaked.
>>
>> AFAIU the rest of the memory regions of the BARs (0-5) are managed by
>> specific device emulation code, however this ROM memory region is
>> managed by PCI subsystem.  Not sure whether that means we should
>> destroy the region in PCI subsystem too, e.g. in pci_del_option_rom().
>>
>> And now I see this patch might be a valid fix for the VFIO-specific
>> issue (though we might comment that a bit somewhere).
> 
> yes. I will send a v2 with an updated commit log.

I wonder if the fix is simply to... get rid of vmstate_unregister_ram.

It was added in

    commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
    Author: Hu Tao <hu...@cn.fujitsu.com>
    Date:   Wed Apr 2 15:13:27 2014 +0800

    unset RAMBlock idstr when unregister MemoryRegion

    Signed-off-by: Hu Tao <hu...@cn.fujitsu.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

whose commit message is a bit lacking, but
http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
more.  It seems like the original bug was a reference count issue.

Clearing the new migratable flag should also be unnecessary.

Paolo

Reply via email to