>-----Original Message-----
>From: David Hildenbrand <[email protected]>
>Subject: Re: [PATCH] vfio/container: Remap only populated parts in a section
>
>On 05.09.25 11:04, Duan, Zhenzhong wrote:
>> + David Hildenbrand
>>
>>> -----Original Message-----
>>> From: Steven Sistare <[email protected]>
>>> Subject: Re: [PATCH] vfio/container: Remap only populated parts in a section
>>>
>>> On 8/31/2025 10:13 PM, Duan, Zhenzhong wrote:
>>>>> -----Original Message-----
>>>>> From: Steven Sistare <[email protected]>
>>>>> Subject: Re: [PATCH] vfio/container: Remap only populated parts in a section
>>>>>
>>>>> On 8/13/2025 11:24 PM, Zhenzhong Duan wrote:
>>>>>> If there are multiple containers and unmap-all fails for some container,
>>>>>> we need to remap vaddr for the other containers for which unmap-all
>>>>>> succeeded. When ram discard is enabled, we should only remap the
>>>>>> populated parts in a section instead of the whole section.
>>>>>>
>>>>>> Export vfio_ram_discard_notify_populate() and use it to do population.
>>>>>>
>>>>>> Fixes: eba1f657cbb1 ("vfio/container: recover from unmap-all-vaddr failure")
>>>>>> Signed-off-by: Zhenzhong Duan <[email protected]>
>>>>>> ---
>>>>>> btw: I didn't find it easy to test this corner case, only code inspection
>>>>>
>>>>> Thanks Zhenzhong, this looks correct.
>>>>>
>>>>> However, I never liked patch
>>>>>     eba1f657cbb1 ("vfio/container: recover from unmap-all-vaddr failure")
>>>>>
>>>>> I think it adds too much complexity for a rare case. In fact, if we
>>>>> examine all the possible error return codes, I believe they would all
>>>>> be caused by other qemu application bugs, or kernel bugs:
>>>>>
>>>>> vfio_dma_do_unmap()
>>>>>     returns -EBUSY if an mdev exists. qemu adds a live update blocker
>>>>>     when an mdev is present, so if this occurs, the blocker has a bug.
>>>>>     returns -EINVAL if the vaddr was already invalidated. qemu already
>>>>>     invalidated it, or never remapped the vaddr after a previous live
>>>>>     update. Both are qemu bugs.
>>>>>
>>>>> iopt_unmap_all
>>>>>     iopt_unmap_iova_range
>>>>>         -EBUSY - qemu is concurrently performing other dma map or
>>>>>         unmap operations. A bug.
>>>>>         -EDEADLOCK - Something is not responding to unmap requests.
>>>>>
>>>>> Therefore, I think we should just revert eba1f657cbb1, and assert that
>>>>> the qemu vfio_dma_unmap_vaddr_all() call succeeds.
>>>>>
>>>>> Thoughts?
>>>>
>>>> I agree it's a rare case and your suggestion would make the code simpler,
>>>> but I feel it's aggressive to kill the QEMU instance if live update fails;
>>>> trying to restore and keep the current instance running is important in a
>>>> cloud environment and looks more moderate.
>>>
>>> OK.
>>>
>>> Reviewed-by: Steve Sistare <[email protected]>
>>>
>>> (but you should also seek an RB from someone who is more familiar with
>>> ram discard and its callbacks).
>>
>> Hi David, looking forward to your comments and suggestions. Thanks
>
>Hi!
>
>I mean, the
>
>    return vrdl->listener.notify_populate(&vrdl->listener, section) == 0;
>
>is completely wrong.
>
>ram_discard_manager_replay_populated() should be the right thing to do.
>
>Was this patch tested somehow (e.g., with virtio-mem), so we're sure it's
>now behaving as expected?
I used a trick to trigger the error path and confirmed it works as expected:
force each device to attach to a separate container, so that unmap_all passes
for the first device and fails for the second. There are other fixes needed
by CPR-transfer, but they are unrelated to this scenario; see
https://github.com/yiliu1765/qemu/commits/rdm_v1/ for the trick and all the
patches.

>
>I would add an empty line in vfio_cpr_rdm_remap().
>
>Apart from that, LGTM
>
>Reviewed-by: David Hildenbrand <[email protected]>

I made a minor adjustment to call rdl->notify_populate() instead of
vfio_ram_discard_notify_populate(), to avoid exporting it; this looks
cleaner. The effect is the same, so I dared to keep your R-b; let me know
if not.
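For reference, the remap path now looks roughly like the sketch below: replay
only the populated parts of the section via
ram_discard_manager_replay_populated(), calling the listener's
notify_populate() for each populated subrange. The callback name
vfio_cpr_rdm_remap_fn() and the exact signature of vfio_cpr_rdm_remap() are
illustrative here, not the literal patch hunk:

/*
 * Rough sketch, not the literal patch: the callback name and the
 * signature of vfio_cpr_rdm_remap() are illustrative assumptions.
 */
#include "system/memory.h"   /* "exec/memory.h" on older QEMU trees */

/* Replay callback: @section covers exactly one populated subrange. */
static int vfio_cpr_rdm_remap_fn(MemoryRegionSection *section, void *opaque)
{
    RamDiscardListener *rdl = opaque;

    /* Remap (re-populate) just this populated part. */
    return rdl->notify_populate(rdl, section);
}

static int vfio_cpr_rdm_remap(RamDiscardListener *rdl,
                              MemoryRegionSection *section)
{
    RamDiscardManager *rdm =
        memory_region_get_ram_discard_manager(section->mr);

    /*
     * Walk only the populated parts of the section; discarded parts were
     * never mapped, so they must not be remapped.
     */
    return ram_discard_manager_replay_populated(rdm, section,
                                                vfio_cpr_rdm_remap_fn, rdl);
}

Since the listener already carries the notify_populate callback, this also
lets vfio_ram_discard_notify_populate() stay private to vfio.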
Thanks
Zhenzhong