On 5/6/2020 1:14 AM, Stephen Hemminger wrote:
> On Wed, 18 Mar 2020 16:17:57 +0100
> Thomas Monjalon <tho...@monjalon.net> wrote:
>
>> 17/01/2020 17:43, Ferruh Yigit:
>>> On 12/22/2019 5:55 PM, Stephen Hemminger wrote:
>>>> This fixes a deadlock when using KNI with bifurcated drivers.
>>>> Bringing kni device up always times out when using Mellanox
>>>> devices.
>>>>
>>>> The kernel KNI driver sends message to userspace to complete
>>>> the request. For the case of bifurcated driver, this may involve
>>>> an additional request to kernel to change state. This request
>>>> would deadlock because KNI was holding the RTNL mutex.
>>>>
>>>> This was a bad design which goes back to the original code.
>>>> A workaround is for KNI driver to drop RTNL while waiting.
>>>> To prevent the device from disappearing while the operation
>>>> is in progress, it needs to hold reference to network device
>>>> while waiting.
>>>>
>>>> As an added benefit, an useless error check can also be removed.
>>>>
>>>> Fixes: 3fc5ca2f6352 ("kni: initial import")
>>>> Cc: sta...@dpdk.org
>>>> Signed-off-by: Stephen Hemminger <step...@networkplumber.org>
>>>> ---
>>>
>>> This patch cause a hang on my server, not sure what exactly was the problem but
>>> kernel log was continuously printing "Cannot send to req_q". Will dig more.
>>>
>>
>> Ferruh, did you have a chance to check what is hanging?
>> Stephen, is there any news on your side?
>>
>>
>
> It did not hang when I tested it. The bug report is still open
>

Sorry for the delay. Since I am working remotely I was worried about losing the connection to my server, so I finally created a virtual environment to test again.

I can confirm the hang is observed 100% of the time when two different processes update the KNI interface, for example when two different processes set the MTU. Without this patch this works fine.

I understand the motivation of the patch, but with this change there is a possibility of hanging the server, which we can't allow. We need to find another way.

Can the mlx interface update wait for the KNI interface operation to complete?
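
For anyone who wants to try the two-process case, a rough sketch is below. It is not the exact test that was run; it only shows the idea of two processes changing the MTU of the same KNI interface at the same time. The interface name "vEth0", the MTU values, and the helper names are assumptions for illustration, and it relies on the iproute2 `ip` tool being installed.

```python
import subprocess

IFNAME = "vEth0"  # assumption: name of the KNI interface under test


def build_mtu_cmd(ifname, mtu):
    """Return the iproute2 command line that sets the MTU on ifname."""
    return ["ip", "link", "set", "dev", ifname, "mtu", str(mtu)]


def set_mtu_concurrently(ifname, mtus, timeout=30):
    """Start one `ip` process per MTU value, then wait for each in turn.

    All processes are started before any of them is waited on, so the
    requests reach the kernel at roughly the same time.
    Returns a list of exit codes; None marks a process that had not
    finished after being waited on for `timeout` seconds (it is sent
    SIGKILL before moving on).
    """
    procs = [subprocess.Popen(build_mtu_cmd(ifname, m)) for m in mtus]
    results = []
    for p in procs:
        try:
            results.append(p.wait(timeout=timeout))
        except subprocess.TimeoutExpired:
            p.kill()
            results.append(None)
    return results
```

It would be called as root, with the KNI application already running, as `set_mtu_concurrently(IFNAME, [1400, 1500])`. A `None` entry in the result would suggest that request did not complete. If the kernel side is really stuck the process may not be killable, so this is better tried in a VM.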