Re: [PATCH v1 for-next 00/16] On demand paging
On 7/3/2014 11:44 AM, Haggai Eran wrote: Hi Roland, I understand that you were reluctant to review these patches as long as there was an ongoing debate on whether or not the i_mmap_mutex should be changed into a spinlock. It seems that the debate concluded with the decision to change it into a rwsem [1], as apparently this provides the optimal performance with the new optimistic spinning patch [2]. I believe this means that there will be no problem adding paging support to the RDMA stack that depends on sleepable MMU notifiers. Changes from V0: http://marc.info/?l=linux-rdmam=139375790322547w=2 - Rebased against latest upstream / for-next branch. - Removed dependency on patches that were accepted upstream. - Removed pre-patches that were accepted upstream [3]. - Add extended uverb call for querying device (patch 1) and use kernel device attributes to report ODP capabilities through the new uverb entry instead of having a special verb. - Allow upgrading page access permissions during page faults. - Minor fixes to issues that came up during regression testing of the patches. The following set of patches implements on-demand paging (ODP) support in the RDMA stack and in the mlx5_ib Infiniband driver. Can we get someone to review/respond on this patch set?? pretty please? Haggai can't just keep rebasing and resending this to lay around over the mailing list... Sagi. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
Hi Chuk, Can this patch be lined up for merger. Looks like there are no oppositions to this change -Regards Devesh -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Devesh Sharma Sent: Thursday, July 31, 2014 10:45 AM To: Chuck Lever Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- r...@vger.kernel.org Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Chuck Lever Sent: Thursday, July 31, 2014 12:09 AM To: Devesh Sharma Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- r...@vger.kernel.org Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module On Jul 22, 2014, at 1:06 AM, Devesh Sharma devesh.sha...@emulex.com wrote: -Original Message- From: Chuck Lever [mailto:chuck.le...@oracle.com] Sent: Monday, July 21, 2014 11:01 PM To: Devesh Sharma Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- r...@vger.kernel.org Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module On Jul 21, 2014, at 1:07 PM, Devesh Sharma devesh.sha...@emulex.com wrote: Until that support is in place, obviously I would prefer that the removal of the underlying driver be prevented while there are NFS mounts in place. I think that's what NFS users have come to expect. In other words, don't allow device removal until we have support for device insertion :-) This needs a fresh series of patches? do not allow removal would likely take an approach similar to what you originally posted, unless someone has an idea how to use a CM_EVENT to make this work, or there is an objection from the kernel RDMA folks. Okay, I will re-work the patch if need be. Devesh, if there isn't one already, could you file an upstream bug at http://bugzilla.linux-nfs.org that documents the shutdown hang and perhaps summarizes this thread? Thanks! A bug has been filed https://bugzilla.linux-nfs.org/show_bug.cgi?id=266 -- Chuck Lever chuck[dot]lever[at]oracle[dot]com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
3.17-rc1 oops during network interface configuration
Hello, Has anyone else already tried to boot kernel 3.17-rc1 on an IB system ? The following call trace is triggered during boot on a system on which kernel 3.16 runs fine: BUG: unable to handle kernel paging request at 8809007e IP: __dev_queue_xmit+0x519 Call Trace: ? __dev_queue_xmit+0x49 dev_queue_xmit+0x10 neigh_connected_output ? ip_finish_output ip_finish_output ? ip_finish_output ? netif_rx_ni ip_mc_output ip_local_out_sk ip_send_skb udp_send_skb udp_sendmsg ? ip_reply_glue_bits ? __lock_is_held inet_sendmsg ? inet_sendmsg sock_sendmsg ? might_fault ? might_fault ? move_addr_to_kernel.part.38 SYSC_sendto ? sysret_check ? trace_hardirqs_on_caller ? trace_hardirqs_on_thunk SyS_sendto system_call_fastpath Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x0 from 0x8100 (relocation range: 0x8000-0x9fff) drm_kms_helper: panic occurred, switching back to text console A screenshot of this kernel oops can be found here: https://drive.google.com/file/d/0B1YQOreL3_FxVDB5UTNwekF6LVU/ gdb translates the crash address into the following (not sure this makes sense since offset 0x519 is past the end of __dev_queue_xmit()): (gdb) list *(__dev_queue_xmit+0x519) 0x8136bc89 is in netdev_adjacent_rename_links (net/core/dev.c:5167). 5162void netdev_adjacent_rename_links(struct net_device *dev, char *oldname) 5163{ 5164struct netdev_adjacent *iter; 5165 5166list_for_each_entry(iter, dev-adj_list.upper, list) { 5167netdev_adjacent_sysfs_del(iter-dev, oldname, 5168 iter-dev-adj_list.lower); 5169netdev_adjacent_sysfs_add(iter-dev, dev, 5170 iter-dev-adj_list.lower); 5171} And the address __dev_queue_xmit+0x49 is translated by gdb into: (gdb) list *(__dev_queue_xmit+0x49) 0x8136b7b9 is in __dev_queue_xmit (./arch/x86/include/asm/preempt.h:75). 70 * The various preempt_count add/sub methods 71 */ 72 73 static __always_inline void __preempt_count_add(int val) 74 { 75 raw_cpu_add_4(__preempt_count, val); 76 } 77 78 static __always_inline void __preempt_count_sub(int val) 79 { Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] qib_keys: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
The uses of rcu_assign_pointer() are NULLing out the pointers. According to RCU_INIT_POINTER()'s block comment: 1. This use of RCU_INIT_POINTER() is NULLing out the pointer it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat bernat@gmail.com --- drivers/infiniband/hw/qib/qib_keys.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/qib/qib_keys.c b/drivers/infiniband/hw/qib/qib_keys.c index 3b9afcc..ad843c7 100644 --- a/drivers/infiniband/hw/qib/qib_keys.c +++ b/drivers/infiniband/hw/qib/qib_keys.c @@ -122,10 +122,10 @@ void qib_free_lkey(struct qib_mregion *mr) if (!mr-lkey_published) goto out; if (lkey == 0) - rcu_assign_pointer(dev-dma_mr, NULL); + RCU_INIT_POINTER(dev-dma_mr, NULL); else { r = lkey (32 - ib_qib_lkey_table_size); - rcu_assign_pointer(rkt-table[r], NULL); + RCU_INIT_POINTER(rkt-table[r], NULL); } qib_put_mr(mr); mr-lkey_published = 0; -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] qib_qp: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
The uses of rcu_assign_pointer() are NULLing out the pointers. According to RCU_INIT_POINTER()'s block comment: 1. This use of RCU_INIT_POINTER() is NULLing out the pointer it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat bernat@gmail.com --- drivers/infiniband/hw/qib/qib_qp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/qib/qib_qp.c b/drivers/infiniband/hw/qib/qib_qp.c index 7fcc150..17daf42 100644 --- a/drivers/infiniband/hw/qib/qib_qp.c +++ b/drivers/infiniband/hw/qib/qib_qp.c @@ -255,10 +255,10 @@ static void remove_qp(struct qib_ibdev *dev, struct qib_qp *qp) if (rcu_dereference_protected(ibp-qp0, lockdep_is_held(dev-qpt_lock)) == qp) { - rcu_assign_pointer(ibp-qp0, NULL); + RCU_INIT_POINTER(ibp-qp0, NULL); } else if (rcu_dereference_protected(ibp-qp1, lockdep_is_held(dev-qpt_lock)) == qp) { - rcu_assign_pointer(ibp-qp1, NULL); + RCU_INIT_POINTER(ibp-qp1, NULL); } else { struct qib_qp *q; struct qib_qp __rcu **qpp; @@ -315,7 +315,7 @@ unsigned qib_free_all_qps(struct qib_devdata *dd) for (n = 0; n dev-qp_table_size; n++) { qp = rcu_dereference_protected(dev-qp_table[n], lockdep_is_held(dev-qpt_lock)); - rcu_assign_pointer(dev-qp_table[n], NULL); + RCU_INIT_POINTER(dev-qp_table[n], NULL); for (; qp; qp = rcu_dereference_protected(qp-next, lockdep_is_held(dev-qpt_lock))) -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
On Aug 18, 2014, at 5:52 AM, Devesh Sharma devesh.sha...@emulex.com wrote: Hi Chuk, Can this patch be lined up for merger. Looks like there are no oppositions to this change Sorry, I was expecting a fresh rewritten patch. I haven't done a deep review of the original. Anything you want to merge should be independently tested, of course. For example Steve should include it in his testing branch. -Regards Devesh -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Devesh Sharma Sent: Thursday, July 31, 2014 10:45 AM To: Chuck Lever Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- r...@vger.kernel.org Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Chuck Lever Sent: Thursday, July 31, 2014 12:09 AM To: Devesh Sharma Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- r...@vger.kernel.org Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module On Jul 22, 2014, at 1:06 AM, Devesh Sharma devesh.sha...@emulex.com wrote: -Original Message- From: Chuck Lever [mailto:chuck.le...@oracle.com] Sent: Monday, July 21, 2014 11:01 PM To: Devesh Sharma Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- r...@vger.kernel.org Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module On Jul 21, 2014, at 1:07 PM, Devesh Sharma devesh.sha...@emulex.com wrote: Until that support is in place, obviously I would prefer that the removal of the underlying driver be prevented while there are NFS mounts in place. I think that's what NFS users have come to expect. In other words, don't allow device removal until we have support for device insertion :-) This needs a fresh series of patches? do not allow removal would likely take an approach similar to what you originally posted, unless someone has an idea how to use a CM_EVENT to make this work, or there is an objection from the kernel RDMA folks. Okay, I will re-work the patch if need be. Devesh, if there isn't one already, could you file an upstream bug at http://bugzilla.linux-nfs.org that documents the shutdown hang and perhaps summarizes this thread? Thanks! A bug has been filed https://bugzilla.linux-nfs.org/show_bug.cgi?id=266 -- Chuck Lever chuck[dot]lever[at]oracle[dot]com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband: hw: cxgb3: cxio_hal.c: Cleaning up missing null-terminate after strncpy call
Hi, Le lundi 18 août 2014 à 00:40 +0200, Rickard Strandqvist a écrit : Added a guaranteed null-terminate after call to strncpy. Good catch. Do you have an automated way to catch such mistake ? Signed-off-by: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se --- drivers/infiniband/hw/cxgb3/cxio_hal.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index de1c61b4..5fc04e4 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -933,6 +933,7 @@ int cxio_rdev_open(struct cxio_rdev *rdev_p) netdev_p = rdev_p-t3cdev_p-lldev; strncpy(rdev_p-dev_name, rdev_p-t3cdev_p-name, T3_MAX_DEV_NAME_LEN); + rdev_p-dev_name[T3_MAX_DEV_NAME_LEN - 1] = '\0'; Why not replacing this by strlcpy(rdev_p-dev_name, rdev_p-t3cdev_p-name, T3_MAX_DEV_NAME_LEN); Regards. -- Yann Droneaud OPTEYA -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ipoib: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
The use of rcu_assign_pointer() is NULLing out the pointer. According to RCU_INIT_POINTER()'s block comment: 1. This use of RCU_INIT_POINTER() is NULLing out the pointer it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat bernat@gmail.com --- drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 5786a78..64cd5b5 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1229,7 +1229,7 @@ static void ipoib_flush_neighs(struct ipoib_dev_priv *priv) } free_htbl: - rcu_assign_pointer(ntbl-htbl, NULL); + RCU_INIT_POINTER(ntbl-htbl, NULL); call_rcu(htbl-rcu, neigh_hash_free_rcu); out_unlock: -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] infiniband: hw: cxgb3: cxio_hal.c: Cleaning up missing null-terminate after strncpy call
-Original Message- From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Rickard Strandqvist Sent: Sunday, August 17, 2014 5:40 PM To: Steve Wise; Roland Dreier Cc: Rickard Strandqvist; Sean Hefty; Hal Rosenstock; linux-rdma@vger.kernel.org; linux- ker...@vger.kernel.org Subject: [PATCH] infiniband: hw: cxgb3: cxio_hal.c: Cleaning up missing null-terminate after strncpy call Added a guaranteed null-terminate after call to strncpy. Signed-off-by: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se --- drivers/infiniband/hw/cxgb3/cxio_hal.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index de1c61b4..5fc04e4 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -933,6 +933,7 @@ int cxio_rdev_open(struct cxio_rdev *rdev_p) netdev_p = rdev_p-t3cdev_p-lldev; strncpy(rdev_p-dev_name, rdev_p-t3cdev_p-name, T3_MAX_DEV_NAME_LEN); + rdev_p-dev_name[T3_MAX_DEV_NAME_LEN - 1] = '\0'; } else { PDBG(%s t3cdev_p or dev_name must be set\n, __func__); return -EINVAL; cxio_rdev_open() is called only by open_rnic_dev() which allocates the device structure by calling ib_alloc_device() which uses kzalloc(). So this change really isn't needed, is it? Steve. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband: hw: cxgb3: cxio_hal.c: Cleaning up missing null-terminate after strncpy call
2014-08-18 16:27 GMT+02:00 Steve Wise sw...@opengridcomputing.com: -Original Message- From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Rickard Strandqvist Sent: Sunday, August 17, 2014 5:40 PM To: Steve Wise; Roland Dreier Cc: Rickard Strandqvist; Sean Hefty; Hal Rosenstock; linux-rdma@vger.kernel.org; linux- ker...@vger.kernel.org Subject: [PATCH] infiniband: hw: cxgb3: cxio_hal.c: Cleaning up missing null-terminate after strncpy call Added a guaranteed null-terminate after call to strncpy. Signed-off-by: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se --- drivers/infiniband/hw/cxgb3/cxio_hal.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/hw/cxgb3/cxio_hal.c b/drivers/infiniband/hw/cxgb3/cxio_hal.c index de1c61b4..5fc04e4 100644 --- a/drivers/infiniband/hw/cxgb3/cxio_hal.c +++ b/drivers/infiniband/hw/cxgb3/cxio_hal.c @@ -933,6 +933,7 @@ int cxio_rdev_open(struct cxio_rdev *rdev_p) netdev_p = rdev_p-t3cdev_p-lldev; strncpy(rdev_p-dev_name, rdev_p-t3cdev_p-name, T3_MAX_DEV_NAME_LEN); + rdev_p-dev_name[T3_MAX_DEV_NAME_LEN - 1] = '\0'; } else { PDBG(%s t3cdev_p or dev_name must be set\n, __func__); return -EINVAL; cxio_rdev_open() is called only by open_rnic_dev() which allocates the device structure by calling ib_alloc_device() which uses kzalloc(). So this change really isn't needed, is it? Steve. Hi Sure, strlcpy is preferable in many ways if we only can guarantee that it is safe. I have seldom received so much criticism when I start switching to strlcpy, although much of it was justified :) Even Linus was getting into the debate. See more: https://plus.google.com/111049168280159033135/posts/1amLbuhWbh5 I change to strlcpy only when I'm sure it will not cause any other problems. But if you or anyone else can guarantee that in this case, so I'd make a new patch with strlcpy. And no Steve, or then you have to use: strncpy(rdev_p-dev_name, rdev_p-t3cdev_p-name, T3_MAX_DEV_NAME_LEN - 1); But the kzalloc should mean that there will not be a problem to switch to strlcpy .. Do we agree, I'll make a new path with strlcpy? I have run a static code analysis programs cppcheck, including finding this type of problem. But since I noticed that there were so many mistakes in addition, I have now looked through all use of strncpy in kernel Kind regards Rickard Strandqvist -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
mlx5 + SRP: max_qp_sz mismatch
I have a client machine that is trying to establish an SRP connection, and it is failing due to an ENOMEM memory allocation error. I traced it down to the max_qp_sz field -- the mlx5 limits this to 16384 but the request wants 32768. I spent some time trying to figure out how this limit is set, but it isn't quite obvious. Is there a driver parameter I can set, or a hard coded limit somewhere? I'm using Ubuntu 14.04 and targetcli on the target side, Windows 2008r2 and WinOFED on the client side. Thanks, Mark -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/8] IPoIB: Make the carrier_on_task race aware
On Tue, Aug 12, 2014 at 4:38 PM, Doug Ledford dledf...@redhat.com wrote: We blindly assume that we can just take the rtnl lock and that will prevent races with downing this interface. Unfortunately, that's not the case. In ipoib_mcast_stop_thread() we will call flush_workqueue() in an attempt to clear out all remaining instances of ipoib_join_task. But, since this task is put on the same workqueue as the join task, the flush_workqueue waits on this thread too. But this thread is deadlocked on the rtnl lock. The better thing here is to use trylock and loop on that until we either get the lock or we see that FLAG_ADMIN_UP has been cleared, in which case we don't need to do anything anyway and we just return. Signed-off-by: Doug Ledford dledf...@redhat.com --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 21 +++-- 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index a0a42859f12..7e9cd39b5ef 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -353,18 +353,27 @@ void ipoib_mcast_carrier_on_task(struct work_struct *work) carrier_on_task); struct ib_port_attr attr; - /* -* Take rtnl_lock to avoid racing with ipoib_stop() and -* turning the carrier back on while a device is being -* removed. -*/ if (ib_query_port(priv-ca, priv-port, attr) || attr.state != IB_PORT_ACTIVE) { ipoib_dbg(priv, Keeping carrier off until IB port is active\n); return; } - rtnl_lock(); + /* +* Take rtnl_lock to avoid racing with ipoib_stop() and +* turning the carrier back on while a device is being +* removed. However, ipoib_stop() will attempt to flush +* the workqueue while holding the rtnl lock, so loop +* on trylock until either we get the lock or we see +* FLAG_ADMIN_UP go away as that signals that we are bailing +* and can safely ignore the carrier on work +*/ + while (!rtnl_trylock()) { + if (!test_bit(IPOIB_FLAG_ADMIN_UP, priv-flags)) + return; + else + msleep(20); + } I always think rtnl lock is too big for this purpose... and that 20 ms is not ideal either. Could we have a new IPOIB private mutex used by ipoib_stop() and this section of code ? So something like: ipoib_stop() {. mutex_lock(something_new); clear_bit(IPOIB_FLAG_ADMIN_UP, priv-flags); ... mutex_unlock(something_new); return 0; } Then the loop would become: // this while-loop will be very short - since we either get the mutex quickly or return quickly. while (!mutex_trylock(something_new)) { if (!test_bit(IPOIB_FLAG_ADMIN_UP, priv-flags)) return; } if (!ipoib_cm_admin_enabled(priv-dev)) dev_set_mtu(priv-dev, min(priv-mcast_mtu, priv-admin_mtu)); netif_carrier_on(priv-dev); -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html