ocrdma failure in 4.4.0-rc5
Hi Devesh,

Testing 4.4.0-rc5, the ocrdma driver is failing for me (100% reliably). If you have vlans off of the main device, this is what I get from the Fedora rawhide 4.4.0-rc5 kernel:

--
Doug Ledford
GPG KeyID: 0E572FDD

[ 26.692881] be2net :85:00.0 ocrdma_roce: Link is Up
[ 26.693339] ==
[ 26.693340] [ INFO: possible circular locking dependency detected ]
[ 26.693341] 4.4.0-0.rc5.git3.1.fc24.x86_64 #1 Tainted: G I
[ 26.693341] ---
[ 26.693342] NetworkManager/2867 is trying to acquire lock:
[ 26.693348]  (be_adapter_list_lock){+.+.+.}, at: [] be_roce_dev_open+0x35/0x70 [be2net]
[ 26.693349] but task is already holding lock:
[ 26.693354]  (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0x1b/0x40
[ 26.693355] which lock already depends on the new lock.
[ 26.693355] the existing dependency chain (in reverse order) is:
[ 26.693356] -> #2 (rtnl_mutex){+.+.+.}:
[ 26.693361]        [] lock_acquire+0xce/0x1c0
[ 26.693366]        [] mutex_lock_nested+0x86/0x400
[ 26.693368]        [] rtnl_lock+0x17/0x20
[ 26.693375]        [] enum_all_gids_of_dev_cb+0x25/0xd0 [ib_core]
[ 26.693379]        [] ib_enum_roce_netdev+0x128/0x130 [ib_core]
[ 26.693382]        [] roce_rescan_device+0x21/0x30 [ib_core]
[ 26.693385]        [] ib_cache_setup_one+0x2bc/0x3b0 [ib_core]
[ 26.693388]        [] ib_register_device+0x2e3/0x420 [ib_core]
[ 26.693391]        [] ocrdma_add+0x43a/0x710 [ocrdma]
[ 26.693393]        [] _be_roce_dev_add+0x17d/0x1e0 [be2net]
[ 26.693396]        [] be_roce_register_driver+0x6a/0xd0 [be2net]
[ 26.693402]        [] target_dev_control_store+0x15/0x20 [target_core_mod]
[ 26.693406]        [] do_one_initcall+0xb3/0x200
[ 26.693408]        [] do_init_module+0x5f/0x1e7
[ 26.693410]        [] load_module+0x2126/0x27d0
[ 26.693411]        [] SyS_init_module+0x172/0x1b0
[ 26.693412]        [] entry_SYSCALL_64_fastpath+0x12/0x76
[ 26.693414] -> #1 (device_mutex){+.+.+.}:
[ 26.693415]        [] lock_acquire+0xce/0x1c0
[ 26.693417]        [] mutex_lock_nested+0x86/0x400
[ 26.693420]        [] ib_register_device+0x3f/0x420 [ib_core]
[ 26.693422]        [] ocrdma_add+0x43a/0x710 [ocrdma]
[ 26.693423]        [] _be_roce_dev_add+0x17d/0x1e0 [be2net]
[ 26.693425]        [] be_roce_register_driver+0x6a/0xd0 [be2net]
[ 26.693428]        [] target_dev_control_store+0x15/0x20 [target_core_mod]
[ 26.693430]        [] do_one_initcall+0xb3/0x200
[ 26.693431]        [] do_init_module+0x5f/0x1e7
[ 26.693432]        [] load_module+0x2126/0x27d0
[ 26.693433]        [] SyS_init_module+0x172/0x1b0
[ 26.693435]        [] entry_SYSCALL_64_fastpath+0x12/0x76
[ 26.693436] -> #0 (be_adapter_list_lock){+.+.+.}:
[ 26.693437]        [] __lock_acquire+0x18f9/0x1b70
[ 26.693439]        [] lock_acquire+0xce/0x1c0
[ 26.693440]        [] mutex_lock_nested+0x86/0x400
[ 26.693442]        [] be_roce_dev_open+0x35/0x70 [be2net]
[ 26.693444]        [] be_open+0x670/0x700 [be2net]
[ 26.693446]        [] __dev_open+0xc8/0x140
[ 26.693448]        [] __dev_change_flags+0x9d/0x160
[ 26.693449]        [] dev_change_flags+0x29/0x70
[ 26.693451]        [] do_setlink+0x636/0xb80
[ 26.693452]        [] rtnl_newlink+0x5ac/0x8a0
[ 26.693454]        [] rtnetlink_rcv_msg+0xe6/0x240
[ 26.693456]        [] netlink_rcv_skb+0xa4/0xc0
[ 26.693457]        [] rtnetlink_rcv+0x2a/0x40
[ 26.693459]        [] netlink_unicast+0x19a/0x290
[ 26.693460]        [] netlink_sendmsg+0x4c3/0x620
[ 26.693462]        [] sock_sendmsg+0x38/0x50
[ 26.693463]        [] ___sys_sendmsg+0x2c9/0x2e0
[ 26.693465]        [] __sys_sendmsg+0x51/0x90
[ 26.693466]        [] SyS_sendmsg+0x12/0x20
[ 26.693467]        [] entry_SYSCALL_64_fastpath+0x12/0x76
[ 26.693468] other info that might help us debug this:
[ 26.693469] Chain exists of: be_adapter_list_lock --> device_mutex --> rtnl_mutex
[ 26.693470]  Possible unsafe locking scenario:
[ 26.693470]        CPU0                    CPU1
[ 26.693470]
[ 26.693471]   lock(rtnl_mutex);
[ 26.693472]                                lock(device_mutex);
[ 26.693472]                                lock(rtnl_mutex);
[ 26.693473]   lock(be_adapter_list_lock);
[ 26.693473]  *** DEADLOCK ***
[ 26.693474] 1 lock held by NetworkManager/2867:
[ 26.693476]  #0: (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0x1b/0x40
[ 26.693476] stack backtrace:
[ 26.693478] CPU: 14 PID: 2867 Comm: NetworkManager Tainted: G I 4.4.0-0.rc5.git3.1.fc24.x86_64 #1
[ 26.693479] Hardware name: Dell Inc. PowerEdge R730xd/0599V5, BIOS 1.0.4 08/28/2014
[ 26.693481]
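The report above reduces to a lock-ordering cycle: the module-load path establishes be_adapter_list_lock --> device_mutex --> rtnl_mutex, while the netdev open path takes be_adapter_list_lock with rtnl_mutex already held. The following is a minimal userspace sketch of that reasoning, assuming a toy dependency graph over three lock classes. It is not how lockdep is implemented, and the names lock_graph, reachable and record_dependency are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: lock classes are small integers, not real kernel locks. */
enum lock_class { RTNL_MUTEX, DEVICE_MUTEX, BE_ADAPTER_LIST_LOCK, NR_CLASSES };

/* dep[a][b] is true when some path took lock b while already holding lock a. */
struct lock_graph {
	bool dep[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record the ordering "held -> wanted". Returns false, and records nothing,
 * when the new edge would close a cycle, that is, when 'held' is already
 * reachable from 'wanted'.
 */
static bool record_dependency(struct lock_graph *g, int held, int wanted)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(g, wanted, held, seen))
		return false;
	g->dep[held][wanted] = true;
	return true;
}
```

Feeding the two module-load edges (BE_ADAPTER_LIST_LOCK held while taking DEVICE_MUTEX, then DEVICE_MUTEX held while taking RTNL_MUTEX) into record_dependency() succeeds; the third edge from the open path (RTNL_MUTEX held while taking BE_ADAPTER_LIST_LOCK) is rejected, which corresponds to the "possible circular locking dependency" in the log.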
[PATCH] IB/usnic: delete unneeded IS_ERR test
kzalloc doesn't return ERR_PTR, so there is no need to test for it.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

//
@@
expression x,e;
@@

* x = kzalloc(...)
... when != x = e
* IS_ERR_OR_NULL(x)
//

Signed-off-by: Julia Lawall

---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index f8e3211..20f53e5 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -625,8 +625,8 @@ struct ib_mr *usnic_ib_reg_mr(struct ib_pd *pd, u64 start, u64 length,
 			virt_addr, length);

 	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
-	if (IS_ERR_OR_NULL(mr))
-		return ERR_PTR(mr ? PTR_ERR(mr) : -ENOMEM);
+	if (!mr)
+		return ERR_PTR(-ENOMEM);

 	mr->umem = usnic_uiom_reg_get(to_upd(pd)->umem_pd, start, length,
 					access_flags, 0);

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
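The patch rests on the difference between two failure conventions: an allocator that returns NULL, and a function that encodes a negative errno in the pointer value. Below is a rough userspace sketch of that distinction. ERR_PTR, PTR_ERR and IS_ERR are re-implemented here to mimic the kernel helpers of the same names, and zalloc_or_null, region_create and struct region are hypothetical stand-ins, not code from the usnic driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitations of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the top MAX_ERRNO addresses; NULL is not an error pointer. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct region {
	size_t length;
};

/* Stand-in for kzalloc(): returns NULL on failure, never an ERR_PTR value. */
static void *zalloc_or_null(size_t size, bool simulate_failure)
{
	return simulate_failure ? NULL : calloc(1, size);
}

/*
 * Hypothetical caller in the style of the patched function: the allocator
 * reports failure as NULL, so a plain NULL test suffices before converting
 * to the ERR_PTR convention that this function's own callers expect.
 */
static struct region *region_create(size_t length, bool simulate_failure)
{
	struct region *r = zalloc_or_null(sizeof(*r), simulate_failure);

	if (!r)
		return ERR_PTR(-ENOMEM);
	r->length = length;
	return r;
}
```

A caller would check the result with IS_ERR() and read the errno with PTR_ERR(); the allocation itself only ever needs the `!r` test, which is the simplification the patch makes.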
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/18/2015 3:55 PM, Christoph Hellwig wrote:
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 284916d..e45776e 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1306,7 +1306,6 @@ struct ib_mr {
>  	u64 iova;
>  	u32 length;
>  	unsigned int page_size;
> -	atomic_t usecnt; /* count number of MWs */
>  };

This comment is part of Roland's uverbs commit.

I wonder whether LL drivers supporting the IB_WR_BIND_MW op take a ref on the MR on post send and drop it on completion?

Or.

>
>  struct ib_mw {
> --
> 1.9.1
Re: RoCE passive side failures on 4.4-rc5
On 12/17/2015 3:58 PM, Or Gerlitz wrote:
> Using 4.4-rc5+ [1] and **not** applying any of the patches I sent today, I noted that RoCE passive side isn't working (rdma-cm, ibv_rc_pingpong works).
>
> I have two nodes in ConnectX3 VPI config (port1 IB and port2 Eth), the one with the 4.4-rc5 kernel can act as both (rping) client/server for IB links but only (rping) client for RoCE.
>
> I tried both inter-node and loopback runs, in all cases, the client side gets CM reject with reason 28, see [2], tried both iser and rping.
>
> Eth (ICMP, TCP) works OK.

OK, small progress: when I force the Eth link type on my IB port (using mlx4 sysfs), things work.

You should be able to reproduce it on your non-VPI systems the other way around, by forcing the IB link type on one of the Eth ports and seeing the failure.

I saw the same behavior with both 4.4-rc2 and 4.4-rc5.

Or.
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/18/2015 4:14 PM, Bart Van Assche wrote:
> On 12/18/2015 02:55 PM, Christoph Hellwig wrote:
>> Signed-off-by: Christoph Hellwig
>
> Shouldn't the description of this patch be changed into something like "Remove the usecnt field from ib_mr since it is always zero" ?

Agree. I would like us to avoid empty change logs for IB core patches, and actually all over the rdma subsystem.

Or.
Re: [PATCH 09/10] IB: remove the struct ib_phys_buf definition
On 12/18/2015 3:55 PM, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig
> Reviewed-by: Sagi Grimberg
> Reviewed-by: Jason Gunthorpe [core]
> Reviewed-by: Steve Wise

Here, too, please avoid empty change logs for IB core patches.

Or.

> ---
>  include/rdma/ib_verbs.h | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index ea093ee..284916d 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1143,11 +1143,6 @@ enum ib_access_flags {
>  	IB_ACCESS_ON_DEMAND = (1<<6),
>  };
>
> -struct ib_phys_buf {
> -	u64 addr;
> -	u64 size;
> -};
> -
>  /*
>   * XXX: these are apparently used for ->rereg_user_mr, no idea why they
>   * are hidden here instead of a uapi header!
Re: [PATCH 10/10] IB: remove the unused usecnt field from struct ib_mr
On 12/20/2015 9:25 AM, Or Gerlitz wrote:
> On 12/18/2015 3:55 PM, Christoph Hellwig wrote:
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index 284916d..e45776e 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -1306,7 +1306,6 @@ struct ib_mr {
>>  	u64 iova;
>>  	u32 length;
>>  	unsigned int page_size;
>> -	atomic_t usecnt; /* count number of MWs */
>>  };
>
> This comment is part of Roland's uverbs commit.

I see now that you removed in-kernel support for MWs in a later patch of this series. I guess this cleanup should go there, since the refcount field of kernel MRs relates to MWs. That would also help anyone who restores in-kernel MW support in the future to avoid forgetting to take the refs.

Or.
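To make the refcounting under discussion concrete, here is a toy userspace model of an MR use count that tracks bound MWs. It is a sketch under stated assumptions, not the kernel's struct ib_mr or any driver's behavior: toy_mr, toy_mw, toy_bind_mw, toy_unbind_mw and toy_dereg_mr are hypothetical names, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's struct ib_mr/ib_mw. */
struct toy_mr {
	atomic_int usecnt;	/* number of MWs currently bound to this MR */
};

struct toy_mw {
	struct toy_mr *bound_mr;	/* NULL while the window is unbound */
};

/* Unbinding a window drops the reference it held on its MR, if any. */
static void toy_unbind_mw(struct toy_mw *mw)
{
	if (mw->bound_mr) {
		atomic_fetch_sub(&mw->bound_mr->usecnt, 1);
		mw->bound_mr = NULL;
	}
}

/* Binding takes a reference on the new MR after releasing any old binding. */
static void toy_bind_mw(struct toy_mw *mw, struct toy_mr *mr)
{
	toy_unbind_mw(mw);
	atomic_fetch_add(&mr->usecnt, 1);
	mw->bound_mr = mr;
}

/* Deregistration is refused while any window still references the MR. */
static int toy_dereg_mr(struct toy_mr *mr)
{
	return atomic_load(&mr->usecnt) ? -EBUSY : 0;
}
```

In this model a counter that nothing increments is always zero, which matches the point raised elsewhere in the thread that the field can be removed when no in-kernel path takes the references.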
[PATCH] libmlx5: Add gitignore file to the project
From: Leon Romanovsky

Add gitignore file to the libmlx5 project.

Signed-off-by: Leon Romanovsky
---
 .gitignore | 20
 1 file changed, 20 insertions(+)
 create mode 100644 .gitignore

diff --git a/.gitignore b/.gitignore
new file mode 100644
index ..be8e0f03eb93
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,20 @@
+*.o
+*.lo
+*.swp
+configure
+Makefile.in
+autom4te.cache
+aclocal.m4
+stamp-h.in
+config.h.in
+config.h.in~
+config.log
+config.h
+.libs
+.deps
+libmlx5.spec
+Makefile
+config.status
+stamp-h1
+libtool
+tags
--
1.7.12.4