Yes you're right, the master branch has no problem here. I am sorry for that it's my own code's errors causing the deadlock. We MUST make sure that dp->port_mutex is not used in pmd thread context.
Thanks very much. > -----Original Message----- > From: Ilya Maximets [mailto:i.maxim...@samsung.com] > Sent: Thursday, January 31, 2019 5:12 PM > To: Lilijun (Jerry, Cloud Networking) <jerry.lili...@huawei.com>; > d...@openvswitch.org > Cc: Stokes, Ian <ian.sto...@intel.com> > Subject: Re: [ovs-dev] dpif-netdev:fix reload pmd's dead lock > > Hi. > > On 31.01.2019 8:57, Lilijun wrote: > > This patch fix the dead lock when using dpdk userspace datapath. The > > problem is described as follows: > > 1) when add or delete port, the main thread will call > > reconfigure_datapath() in the function dpif_netdev_run() > > 2) Here the dp->port_mutex is locked. In dp_netdev_reload_pmd__(), it > > will notify each pmd to reload. > > 3) If pmd is doing packet upcall in fast_path_processing() and try to > > get > > dp->port_mutex in > > do_xlate_actions()-->tnl_route_lookup_flow()-- > >dpif_netdev_port_query_by_name(). > > Beside the fact that this patch could not be applied to master because of > some conflicts, I do not see how the dpif_netdev_port_query_by_name() > could be called from tnl_route_lookup_flow(). What OVS version you're > using ? > Could you, please, provide more detailed call trace for this deadlock ? > > Also, reload_affected_pmds() called 4 times during the reconfiguration. > Why you unlocking only for these two ? > > > Here pmd get lock failed because the main thread has got the lock in > > step 2. > > 4) So the main thread was stuck for waiting pmd to reload done. Now we > > got a dead lock. > > > > Here reload_affected_pmds() may not need to lock dp->port_mutex. So > we > > release the lock temporarily when calling reload_affected_pmds(). > > > > Signed-off-by: Lilijun <jerry.lili...@huawei.com> > > --- > > lib/dpif-netdev.c | 7 ++++++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > > 1564db9c6..bfd6aa74c 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -4665,8 +4665,11 @@ reconfigure_datapath(struct dp_netdev *dp) > > > > /* Reload affected pmd threads. We must wait for the pmd threads > before > > * reconfiguring the ports, because a port cannot be reconfigured while > > - * it's being used. */ > > + * it's being used. We need release dp->port_mutex to make sure that > pmds > > + * don't wait for getting the mutex when handling packet upcalls*/ > > + ovs_mutex_unlock(&dp->port_mutex); > > reload_affected_pmds(dp); > > + ovs_mutex_lock(&dp->port_mutex); > > > > /* Step 3: Reconfigure ports. */ > > > > @@ -4761,7 +4764,9 @@ reconfigure_datapath(struct dp_netdev *dp) > > } > > > > /* Reload affected pmd threads. */ > > + ovs_mutex_unlock(&dp->port_mutex); > > reload_affected_pmds(dp); > > + ovs_mutex_lock(&dp->port_mutex); > > } > > > > /* Returns true if one of the netdevs in 'dp' requires a > > reconfiguration */ > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev