On Thu, Jan 12, 2023 at 1:26 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>
> On 1/6/23 16:58, David Marchand wrote:
> > The DPDK vhost-user library maintains more granular per queue stats
> > which can replace what OVS was providing for vhost-user ports.
> >
> > The benefits for OVS:
> > - OVS can skip parsing packet sizes on the rx side,
> > - dev->stats_lock won't be taken in rx/tx code unless some packet is
> >   dropped,
> > - vhost-user is aware of which packets are transmitted to the guest,
> >   so per *transmitted* packet size stats can be reported,
> > - more internal stats from vhost-user may be exposed, without OVS
> >   needing to understand them,
> >
> > Note: the vhost-user library does not provide global stats for a port.
> > The proposed implementation is to have the global stats (exposed via
> > netdev_get_stats()) computed by querying and aggregating all per queue
> > stats.
> > Since per queue stats are exposed via another netdev ops
> > (netdev_get_custom_stats()), this may lead to some race and small
> > discrepancies.
> > This issue might already affect other netdev classes.
>
> Hi, David and Maxime.
>
> There is a deadlock between OVS main thread and the vhost-events thread.
>
> Main thread:
>
>   iface_refresh_stats()
>   -> netdev_dpdk_vhost_get_custom_stats()
>      -> ovs_mutex_lock(&dev->mutex);
>         rte_vhost_vring_stats_get()
>         -> rte_spinlock_lock(&vq->access_lock);
>
>
> vhost-events:
>
>   vhost_user_msg_handler()
>   -> vhost_user_lock_all_queue_pairs()
>      -> rte_spinlock_lock(&vq->access_lock);
>   -> vhost_user_notify_queue_state()
>      -> vring_state_changed()
>         -> ovs_mutex_lock(&dev->mutex);
>
> So, vhost-events is waiting on dev->mutex while holding vq->access_lock.
> Main thread is waiting on vq->access_lock while holding dev->mutex.
>
>
> Can be occasionally reproduced with system-dpdk testsuite while attaching
> testpmd to a vhost-user port.
>
> GDB output:
>
> (gdb) info threads
>   Id   Target Id                                             Frame
> * 1    Thread 0x7f2d4ee87000 (LWP 2659740) "ovs-vswitchd"    
> 0x0000000001683529 in rte_spinlock_lock (sl=0x22003c60a4)
>     at ../lib/eal/x86/include/rte_spinlock.h:28
>   67   Thread 0x7f2c337fe400 (LWP 2659831) "vhost-events"    
> 0x00007f2d4ec89c50 in __lll_lock_wait () from /lib64/libc.so.6
>
> (gdb) thread 67
> [Switching to thread 67 (Thread 0x7f2c337fe400 (LWP 2659831))]
> #0  0x00007f2d4ec89c50 in __lll_lock_wait () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00007f2d4ec89c50 in __lll_lock_wait () from /lib64/libc.so.6
> #1  0x00007f2d4ec901e2 in pthread_mutex_lock@@GLIBC_2.2.5 () from 
> /lib64/libc.so.6
> #2  0x00000000018953a8 in ovs_mutex_lock_at (l_=l_@entry=0x22003d83c0, 
> where=where@entry=0x1bd20af "lib/netdev-dpdk.c:4271") at lib/ovs-thread.c:76
> #3  0x0000000001915ebd in vring_state_changed (vid=<optimized out>, 
> queue_id=<optimized out>, enable=1) at lib/netdev-dpdk.c:4271
> #4  0x00000000007ad088 in vhost_user_msg_handler (vid=<optimized out>, 
> fd=fd@entry=80) at ../lib/vhost/vhost_user.c:3159
> #5  0x000000000167f4bf in vhost_user_read_cb (connfd=80, dat=0x7f2c28000f80, 
> remove=0x7f2c337fb0d8) at ../lib/vhost/socket.c:312
> #6  0x000000000167e844 in fdset_event_dispatch (arg=0x20b7760 
> <vhost_user+8192>) at ../lib/vhost/fd_man.c:280
> #7  0x00007f2d4ec8ce2d in start_thread () from /lib64/libc.so.6
> #8  0x00007f2d4ed121b0 in clone3 () from /lib64/libc.so.6
>
>
> (gdb) thread 1
> [Switching to thread 1 (Thread 0x7f2d4ee87000 (LWP 2659740))]
> #0  0x0000000001683529 in rte_spinlock_lock (sl=0x22003c60a4) at 
> ../lib/eal/x86/include/rte_spinlock.h:28
> 28              asm volatile (
> (gdb) bt
> #0  0x0000000001683529 in rte_spinlock_lock (sl=0x22003c60a4) at 
> ../lib/eal/x86/include/rte_spinlock.h:28
> #1  rte_vhost_vring_stats_get (vid=vid@entry=0, queue_id=queue_id@entry=1, 
> stats=stats@entry=0x383d8e0, n=n@entry=17) at ../lib/vhost/vhost.c:2099
> #2  0x0000000001910ed4 in netdev_dpdk_vhost_get_custom_stats 
> (netdev=0x22003d8400, custom_stats=0x7ffe3d7c0490) at lib/netdev-dpdk.c:3161
> #3  0x000000000176fffc in iface_refresh_stats (iface=iface@entry=0x3838f00) 
> at vswitchd/bridge.c:2639
> #4  0x0000000001779da1 in iface_refresh_stats (iface=0x3838f00) at 
> vswitchd/bridge.c:2635
> #5  run_stats_update () at vswitchd/bridge.c:3066
> #6  bridge_run () at vswitchd/bridge.c:3329
> #7  0x00000000007cbe7a in main (argc=<optimized out>, argv=<optimized out>) 
> at vswitchd/ovs-vswitchd.c:129
>

Indeed, that is a bad deadlock.
We have been discussing this with Maxime.
A fix on DPDK is probably needed but it won't be available soon.

For now, I am looking into moving state change notification handling
from the vhost-events thread to the OVS main thread.
I will post a patch later.


-- 
David Marchand

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to