On 2/28/23 16:16, Maxime Coquelin wrote:
> Hi,
> 
> On 2/28/23 10:33, Wan Junjie wrote:
>> Hi,
>>
>> On Tue, Feb 28, 2023 at 4:03 PM David Marchand
>> <david.march...@redhat.com> wrote:
>>>
>>> Hello,
>>>
>>> On Fri, Feb 24, 2023 at 2:55 AM Han Ding <hand...@chinatelecom.cn> wrote:
>>>>> On 2/23/23 10:35, Han Ding wrote:
>>>>>>
>>>>>> When use ovs-vsctl to delete vhostuserclient port while qemu destroy 
>>>>>> virtio,
>>>>>> there is a deadlock between OVS main thread and the vhost-events thread.
>>>>>>
>>>>>> openvswitch is 2.14 and dpdk is 20.11
>>>>>
>>>>> FWIW, 2.14 supposed to be used with 19.11.
>>>
>>> Indeed.
>>>
>>>
>>>>
>>>> Thanks for reply.
>>>>
>>>> The code flow path of openvswitch 3.0.1 and dpdk 21.11, dpdk 22.11 is same
>>>> for destroying vhostuserclient device. So, I think  there is a same bug in 
>>>> new version.
>>>
>>> Please double check the dpdk versions you are using, then have a look
>>> at my comment below.
>>>
>>>
>>>>
>>>>
>>>>>>
>>>>>> Main thread:
>>>>>> ofport_remove
>>>>>>    -> netdev_unref
>>>>>>        -> netdev_dpdk_vhost_destruct
>>>>>>             -> rte_vhost_driver_unregister (If fdentry is busy, circle 
>>>>>> again until the fdentry is not busy)
>>>>>>                 -> fdset_try_del  (fdentry is busy now, return -1. Goto 
>>>>>> again)
>>>>>>
>>>>
>>>>
>>>>
>>>>>> vhost-nets thread:
>>>>>> fdset_event_dispatch (set fdentry to busy. When destroy_device fuction 
>>>>>> return set the fdentry to no busy)
>>>>>>     -> vhost_user_read_cb
>>>>>>       -> vhost_user_msg_handler
>>>>>>           -> vhost_user_get_vring_base
>>>>>>               ->destroy_device
>>>>>>                  -> ovsrcu_synchronize (Wait for other threads to 
>>>>>> quiesce)
>>>>
>>>>
>>>>
>>>>>> The  vhost-nets thread wait for main thread to quiesce, but the main 
>>>>>> thread now is waiting for fdentry to no busy
>>>>>> and circle all the time.
>>>>
>>
>>
>> I believe this issue exists in older versions, we have seen it before.
>>
>> When we create and destroy vm concurrently and continuously. This
>> could be reproduced easily.
>>
>> We found a related patch from dpdk which is not accepted.
>>
>> https://patchwork.dpdk.org/project/dpdk/patch/1584007039-12437-1-git-send-email-wangzh...@jd.com/
> 
> Thanks for the pointer.
> 
> I spent some time thinking about it, and I do not see other alternative
> than doing what the patch you shared do, i.e. returning -EAGAIN and
> expecting the application to handle it properly.
> 
> In order not to break existing applications that expect
> rte_vhost_driver_unregister to block, maybe we can introduce a new flag
> passed at registration time specifying the application support non-
> blocking unregistration.
> 
> Without the flag passed, the lib would behave the same as currently,
> i.e. block until fdet_try_del() succeeds.
> 
> Would that solution work with OVS?

We can't really fail the netdev_dpdk_vhost_destruct() in OVS as there
is no mechanism to re-try it.

> Any other alternative in mind?

We may postpone the rte_vhost_driver_unregister() call in OVS, but
we'll need to make sure somehow that the same port is not re-created
while it's not yet destroyed.

Just a thought, I don't really know how to implement that as failing
the netdev_open() might also be a problem.

Best regards, Ilya Maximets.

> 
> Regards,
> Maxime
> 
>>
>>>>
>>>>>>
>>>>>> Whether the ovsrcu_synchronize  is necessary in destrory_device
>>>>>
>>>>> Yes, we have to wait for all threads to stop using this device.
>>>>>
>>>>>> or the rte_vhost_driver_unregister is correct in fdset_try_del ?
>>>>>
>>>>> CC: David and Maxime.
>>>
>>> This could be a different issue, but there was a change for this area
>>> of the vhost library in v21.11:
>>> 451dc0fad83d ("vhost: fix crash on port deletion")
>>>
>>> I don't see it backported to 19.11 stable release, but it got to
>>> v20.11.4 (used in 2.15 and 2.16 ovs branches).
>>>
>>> I would suggest backporting it to your 19.11 dpdk and retest with ovs 2.14.
>>>
>>>
>>> -- 
>>> David Marchand
>>>
>>> _______________________________________________
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to