Hi, Regarding the asynchronous call - thought about it, but then the request will always return OK to user-space and I will have no way to return failure error codes back to user-space.
If the above explanation is acceptable, per your other comments - I can send a new patch without the parameter change , without the empty line, and with the comment moved to the proper place in the code. Waiting for your decision, Elad. On Fri, Feb 19, 2021 at 8:42 PM Ferruh Yigit <ferruh.yi...@intel.com> wrote: > > On 11/26/2020 2:46 PM, Elad Nachman wrote: > > This patch leverages on Stephen Hemminger's 64106 patch from Dec 2019, > > and fixes the issues reported by Ferruh and Igor: > > > > A. KNI sync lock is being locked while rtnl is held. > > If two threads are calling kni_net_process_request() , > > then the first one wil take the sync lock, release rtnl lock then sleep. > > The second thread will try to lock sync lock while holding rtnl. > > The first thread will wake, and try to lock rtnl, resulting in a deadlock. > > The remedy is to release rtnl before locking the KNI sync lock. > > Since in between nothing is accessing Linux network-wise, > > no rtnl locking is needed. > > Hi Elad, > > Thanks for explanation, that clarifies the issue. > Also I confirm I don't see the hang, at least as much as I test. > > > > > B. There is a race condition in __dev_close_many() processing the > > close_list while the application terminates. > > It looks like if two vEth devices are terminating, > > and one releases the rtnl lock, the other takes it, > > updating the close_list in an unstable state, > > causing the close_list to become a circular linked list, > > hence list_for_each_entry() will endlessly loop inside > > __dev_close_many() . > > Since the description for the original patch indicate the > > original motivation was bringing the device up, > > I have changed kni_net_process_request() to hold the rtnl mutex > > in case of bringing the device down since this is the path called > > from __dev_close_many() , causing the corruption of the close_list. > > > > I can't reproduce this case, I see the protection in the code, but better to > get > confirmation from Igor. > > > > Overall the issue seems calling a function pointed by 'rte_kni_ops' which > requires to acquire the rtnl lock. > So I wonder if this can't be handled in the ops function, by processing the > request asynchronously, > like recording the request, return from 'rte_kni_ops', and process the request > afterwards? > > I assume the application we mention is not kni sample application. > > > > > > > Signed-off-by: Elad Nachman <ela...@gmail.com> > > --- > > kernel/linux/kni/kni_net.c | 47 +++++++++++++++++++++++++------------- > > 1 file changed, 31 insertions(+), 16 deletions(-) > > > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > > index 4b752083d..cf5b0845d 100644 > > --- a/kernel/linux/kni/kni_net.c > > +++ b/kernel/linux/kni/kni_net.c > > @@ -17,6 +17,7 @@ > > #include <linux/skbuff.h> > > #include <linux/kthread.h> > > #include <linux/delay.h> > > +#include <linux/rtnetlink.h> > > > > #include <rte_kni_common.h> > > #include <kni_fifo.h> > > @@ -102,18 +103,26 @@ get_data_kva(struct kni_dev *kni, void *pkt_kva) > > * It can be called to process the request. > > */ > > static int > > -kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > > +kni_net_process_request(struct net_device *dev, struct rte_kni_request > > *req) > > { > > + struct kni_dev *kni = netdev_priv(dev); > > int ret = -1; > > void *resp_va; > > uint32_t num; > > int ret_val; > > + int req_is_dev_stop = 0; > > > > - if (!kni || !req) { > > - pr_err("No kni instance or request\n"); > > - return -EINVAL; > > - } > > + if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF && > > + req->if_up == 0) > > + req_is_dev_stop = 1; > > > > + ASSERT_RTNL(); > > + > > + if (!req_is_dev_stop) { > > + dev_hold(dev); > > + rtnl_unlock(); > > + } > > + > > mutex_lock(&kni->sync_lock); > > > > /* Construct data */ > > @@ -125,8 +134,13 @@ kni_net_process_request(struct kni_dev *kni, struct > > rte_kni_request *req) > > goto fail; > > } > > > > + /* Since we need to wait and RTNL mutex is held > > + * drop the mutex and hold refernce to keep device > > + */ > > + > > Comment seems left here, need to go up. s/refernce/reference > > > ret_val = wait_event_interruptible_timeout(kni->wq, > > kni_fifo_count(kni->resp_q), 3 * HZ); > > + > > if (signal_pending(current) || ret_val <= 0) { > > ret = -ETIME; > > goto fail; > > @@ -144,6 +158,13 @@ kni_net_process_request(struct kni_dev *kni, struct > > rte_kni_request *req) > > > > fail: > > mutex_unlock(&kni->sync_lock); > > + > > + > > extra empty line > > > + if (!req_is_dev_stop) { > > + rtnl_lock(); > > + dev_put(dev); > > + } > > + > > return ret; > > } > > > > @@ -155,7 +176,6 @@ kni_net_open(struct net_device *dev) > > { > > int ret; > > struct rte_kni_request req; > > - struct kni_dev *kni = netdev_priv(dev); > > > > netif_start_queue(dev); > > if (kni_dflt_carrier == 1) > > @@ -168,7 +188,7 @@ kni_net_open(struct net_device *dev) > > > > /* Setting if_up to non-zero means up */ > > req.if_up = 1; > > - ret = kni_net_process_request(kni, &req); > > + ret = kni_net_process_request(dev, &req); > > > > Althoug it is not soo confusing, these lines and following ones are noise for > this patch, they are just for 'kni_net_process_request' paramter change. > > What do you think do the 'kni_net_process_request' parameter change in first > patch, and fix the issue in second, this way second patch can contain only the > actual changes required for fix.