On 2/28/23 16:16, Maxime Coquelin wrote: > Hi, > > On 2/28/23 10:33, Wan Junjie wrote: >> Hi, >> >> On Tue, Feb 28, 2023 at 4:03 PM David Marchand >> <david.march...@redhat.com> wrote: >>> >>> Hello, >>> >>> On Fri, Feb 24, 2023 at 2:55 AM Han Ding <hand...@chinatelecom.cn> wrote: >>>>> On 2/23/23 10:35, Han Ding wrote: >>>>>> >>>>>> When use ovs-vsctl to delete vhostuserclient port while qemu destroy >>>>>> virtio, >>>>>> there is a deadlock between OVS main thread and the vhost-events thread. >>>>>> >>>>>> openvswitch is 2.14 and dpdk is 20.11 >>>>> >>>>> FWIW, 2.14 supposed to be used with 19.11. >>> >>> Indeed. >>> >>> >>>> >>>> Thanks for reply. >>>> >>>> The code flow path of openvswitch 3.0.1 and dpdk 21.11, dpdk 22.11 is same >>>> for destroying vhostuserclient device. So, I think there is a same bug in >>>> new version. >>> >>> Please double check the dpdk versions you are using, then have a look >>> at my comment below. >>> >>> >>>> >>>> >>>>>> >>>>>> Main thread: >>>>>> ofport_remove >>>>>> -> netdev_unref >>>>>> -> netdev_dpdk_vhost_destruct >>>>>> -> rte_vhost_driver_unregister (If fdentry is busy, circle >>>>>> again until the fdentry is not busy) >>>>>> -> fdset_try_del (fdentry is busy now, return -1. Goto >>>>>> again) >>>>>> >>>> >>>> >>>> >>>>>> vhost-nets thread: >>>>>> fdset_event_dispatch (set fdentry to busy. When destroy_device fuction >>>>>> return set the fdentry to no busy) >>>>>> -> vhost_user_read_cb >>>>>> -> vhost_user_msg_handler >>>>>> -> vhost_user_get_vring_base >>>>>> ->destroy_device >>>>>> -> ovsrcu_synchronize (Wait for other threads to >>>>>> quiesce) >>>> >>>> >>>> >>>>>> The vhost-nets thread wait for main thread to quiesce, but the main >>>>>> thread now is waiting for fdentry to no busy >>>>>> and circle all the time. >>>> >> >> >> I believe this issue exists in older versions, we have seen it before. >> >> When we create and destroy vm concurrently and continuously. This >> could be reproduced easily. >> >> We found a related patch from dpdk which is not accepted. >> >> https://patchwork.dpdk.org/project/dpdk/patch/1584007039-12437-1-git-send-email-wangzh...@jd.com/ > > Thanks for the pointer. > > I spent some time thinking about it, and I do not see other alternative > than doing what the patch you shared do, i.e. returning -EAGAIN and > expecting the application to handle it properly. > > In order not to break existing applications that expect > rte_vhost_driver_unregister to block, maybe we can introduce a new flag > passed at registration time specifying the application support non- > blocking unregistration. > > Without the flag passed, the lib would behave the same as currently, > i.e. block until fdet_try_del() succeeds. > > Would that solution work with OVS?
We can't really fail the netdev_dpdk_vhost_destruct() in OVS as there is no mechanism to re-try it. > Any other alternative in mind? We may postpone the rte_vhost_driver_unregister() call in OVS, but we'll need to make sure somehow that the same port is not re-created while it's not yet destroyed. Just a thought, I don't really know how to implement that as failing the netdev_open() might also be a problem. Best regards, Ilya Maximets. > > Regards, > Maxime > >> >>>> >>>>>> >>>>>> Whether the ovsrcu_synchronize is necessary in destrory_device >>>>> >>>>> Yes, we have to wait for all threads to stop using this device. >>>>> >>>>>> or the rte_vhost_driver_unregister is correct in fdset_try_del ? >>>>> >>>>> CC: David and Maxime. >>> >>> This could be a different issue, but there was a change for this area >>> of the vhost library in v21.11: >>> 451dc0fad83d ("vhost: fix crash on port deletion") >>> >>> I don't see it backported to 19.11 stable release, but it got to >>> v20.11.4 (used in 2.15 and 2.16 ovs branches). >>> >>> I would suggest backporting it to your 19.11 dpdk and retest with ovs 2.14. >>> >>> >>> -- >>> David Marchand >>> >>> _______________________________________________ >>> dev mailing list >>> d...@openvswitch.org >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev