Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Igor Ryzhov
Elad, To make it work on Mellanox NIC, you need to find a way to send ALL requests to userspace without rtnl_lock held, including link down. As I understand, the race condition in __dev_close_many must be solved somehow. I can't provide remote access, but I am happy to test on Mellanox NICs, if y

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Stephen Hemminger
On Wed, 24 Feb 2021 15:49:49 +0300 Igor Ryzhov wrote: > This looks more like a hack than an actual fix to me. > > After this commit: > "ip link set up" is sent to the userspace with unlocked rtnl_lock > "ip link set down" is sent to the userspace with locked rtnl_lock Calling userspace with rtn

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Igor Ryzhov
Stephen's idea was to fix the deadlock when working with the bifurcated driver. Your rework breaks this because you still send link down requests under rtnl_lock. Did you test your patch with Mellanox devices? On Wed, Feb 24, 2021 at 5:56 PM Elad Nachman wrote: > The deadlock scenarios are expla

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Elad Nachman
The deadlock scenarios are explained below: It is described in Stephen Hemminger's original patch: " This fixes a deadlock when using KNI with bifurcated drivers. Bringing kni device up always times out when using Mellanox devices. The kernel KNI driver sends message to userspace to complete th

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Igor Ryzhov
Both link up and link down also work for me without this patch. So what's the point in merging it? Just to clarify - I am not against the idea of this patch. Talking to userspace under rtnl_lock is a bad idea. I just think that any patch should fix some specified problem. If this patch is trying

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Elad Nachman
I tested both link up and link down many times without any problems on 100 restarts of the application. Having KNI deadlock frequently for real life applications is far worst, IMHO. FYI Elad. On Wed, Feb 24, 2021 at 4:04 PM Igor Ryzhov wrote: > > Elad, > > I understand your point. > But the fa

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Igor Ryzhov
Elad, I understand your point. But the fact that this fix works for you doesn't mean that it will work for all DPDK users. For example, I provided two simple commands: "ip link set up" and "ip link set down". Your fix works for only one of them. For me, this is not a proper fix. It may work for y

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Elad Nachman
Currently KNI has a lot of issues with deadlocks locking the code, after this commit, they are gone, and the code runs properly without crashing. That was tested with over 100 restarts of the application, which previously required a hard reset of the board. I think this benefit overweights the com

Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-24 Thread Igor Ryzhov
This looks more like a hack than an actual fix to me. After this commit: "ip link set up" is sent to the userspace with unlocked rtnl_lock "ip link set down" is sent to the userspace with locked rtnl_lock How is this really fixing anything? IMHO it only complicates the code. If talking with users

[dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v3

2021-02-23 Thread Elad Nachman
This part of the series includes my fixes for the issues reported by Ferruh and Igor on top of part 1 of the patch series: A. KNI sync lock is being locked while rtnl is held. If two threads are calling kni_net_process_request() , then the first one will take the sync lock, release rtnl lock then