בתאריך יום ב׳, 4 באוק׳ 2021, 17:51, מאת Ferruh Yigit < ferruh.yi...@intel.com>:
> On 10/4/2021 3:25 PM, Elad Nachman wrote: > > Can you please try to not top post, it will make impossible to follow this > discussion later from the mail archives. > > > 1. Userspace will get an error > > So there is nothing special with returning '-EAGAIN', user will only > observe an > error. > Wasn't initial intention to use '-EAGAIN' to try request again? > > To signal user-space to retry the operation. > > > 2. Waiting with rtnl locked causes a deadlock; waiting with rtnl unlocked > > for interface down command causes a crash because of a race condition in > > the device delete/unregister list in the kernel. > > > > Why waiting with rthnl lock causes a deadlock? As said below we are already > doing it, why it is different with retry logic? > > Because it can be interface down request. > I agree to not wait with rtnl unlocked. > > > FYI, > > > > Elad. > > > > בתאריך יום ב׳, 4 באוק׳ 2021, 17:13, מאת Ferruh Yigit < > > ferruh.yi...@intel.com>: > > > >> On 10/4/2021 2:09 PM, Elad Nachman wrote: > >>> Hi, > >>> > >>> EAGAIN is propogated back to the kernel and to the caller. > >>> > >> > >> So will the user get an error, or it will be handled by the kernel and > >> retried? > >> > >>> We cannot retry from the kni kernel module since we hold the rtnl lock. > >>> > >> > >> Why not? We are already waiting until a command time out, like > >> 'kni_net_open()' > >> can retry if 'kni_net_process_request()' returns '-EAGAIN'. And we can > >> limit the > >> number of retry for safety. > >> > >>> FYI, > >>> > >>> Elad > >>> > >>> בתאריך יום ב׳, 4 באוק׳ 2021, 16:05, מאת Ferruh Yigit < > >>> ferruh.yi...@intel.com>: > >>> > >>>> On 9/24/2021 11:54 AM, Elad Nachman wrote: > >>>>> Fix lack of multiple KNI requests handling support by introducing a > >>>>> request in progress flag which will fail additional requests with > >>>>> EAGAIN return code if the original request has not been processed > >>>>> by user-space. > >>>>> > >>>>> Bugzilla ID: 809 > >>>> > >>>> Hi Eric, > >>>> > >>>> Can you please test this patch, if it solves the issue you reported? > >>>> > >>>>> > >>>>> Signed-off-by: Elad Nachman <ela...@gmail.com> > >>>>> --- > >>>>> kernel/linux/kni/kni_net.c | 9 +++++++++ > >>>>> lib/kni/rte_kni.c | 2 ++ > >>>>> lib/kni/rte_kni_common.h | 1 + > >>>>> 3 files changed, 12 insertions(+) > >>>>> > >>>> > >>>> <...> > >>>> > >>>>> @@ -123,7 +124,15 @@ kni_net_process_request(struct net_device *dev, > >>>> struct rte_kni_request *req) > >>>>> > >>>>> mutex_lock(&kni->sync_lock); > >>>>> > >>>>> + /* Check that existing request has been processed: */ > >>>>> + cur_req = (struct rte_kni_request *)kni->sync_kva; > >>>>> + if (cur_req->req_in_progress) { > >>>>> + ret = -EAGAIN; > >>>> > >>>> Overall logic in the KNI looks good to me, this helps to serialize the > >>>> requests > >>>> even for async ones. > >>>> > >>>> But can you please clarify how it behaves in the kernel side with > >> '-EAGAIN' > >>>> return type? Will linux call the ndo again, or will it just fail. > >>>> > >>>> If it just fails should we handle the re-try on '-EAGAIN' within the > kni > >>>> module? > >>>> > >>>> > >> > >> > > Elad.