RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
> -Original Message- > From: Chuck Lever [mailto:chuck.le...@oracle.com] > Sent: Monday, July 21, 2014 11:01 PM > To: Devesh Sharma > Cc: Steve Wise; Shirley Ma; Hefty, Sean; Roland Dreier; linux- > r...@vger.kernel.org > Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider > module > > > On Jul 21, 2014, at 1:07 PM, Devesh Sharma > wrote: > > > Until that support is in place, obviously I would prefer that the > > removal of the underlying driver be prevented while there are NFS > > mounts in place. I think that's what NFS users have come to expect. > > > > In other words, don't allow device removal until we have support > > for device insertion :-) > > > > This needs a fresh series of patches? > > "do not allow removal" would likely take an approach similar to what you > originally posted, unless someone has an idea how to use a CM_EVENT to > make this work, or there is an objection from the kernel RDMA folks. Okay, I will re-work the patch if need be. > > -- > Chuck Lever > chuck[dot]lever[at]oracle[dot]com > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] Misc. fix for cxgb4 and iw_cxgb4
From: Hariprasad Shenai Date: Mon, 21 Jul 2014 20:55:11 +0530 > This patch series adds support to enchance error reporting, log detailed > warning for negative advice, support query_qp verb and advertise correct > device max attributes for iwarp. > > The patches series is created against 'net-next' tree. > And includes patches on cxgb4 and iw_cxgb4 driver. > > Since this patch-series contains changes which are dependent on commit id > 4c2c5763 ("cxgb4/iw_cxgb4: use firmware ord/ird resource limits") of net-next > tree we would like to request this patch series to get merged via David > Miller's > 'net-next' tree. > > We have included all the maintainers of respective drivers. Kindly review the > change and let us know in case of any review comments. Series applied to net-next, thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
On Jul 21, 2014, at 1:07 PM, Devesh Sharma wrote: > Until that support is in place, obviously I would prefer that the > removal of the underlying driver be prevented while there are NFS > mounts in place. I think that's what NFS users have come to expect. > > In other words, don't allow device removal until we have support > for device insertion :-) > > This needs a fresh series of patches? “do not allow removal” would likely take an approach similar to what you originally posted, unless someone has an idea how to use a CM_EVENT to make this work, or there is an objection from the kernel RDMA folks. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
> -Original Message- > From: Steve Wise [mailto:sw...@opengridcomputing.com] > Sent: Monday, July 21, 2014 8:53 PM > To: 'Chuck Lever' > Cc: Devesh Sharma; 'Shirley Ma'; 'Hefty, Sean'; 'Roland Dreier'; linux- > r...@vger.kernel.org > Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider > module > > > > > -Original Message- > > From: Chuck Lever [mailto:chuck.le...@oracle.com] > > Sent: Monday, July 21, 2014 10:21 AM > > To: Steve Wise > > Cc: Devesh Sharma; Shirley Ma; Hefty, Sean; Roland Dreier; > > linux-rdma@vger.kernel.org > > Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider > > module > > > > > > On Jul 21, 2014, at 11:03 AM, Steve Wise > wrote: > > > > > > > > > > >> -Original Message- > > >> From: Chuck Lever [mailto:chuck.le...@oracle.com] > > >> Sent: Monday, July 21, 2014 9:54 AM > > >> To: Devesh Sharma > > >> Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; > > >> linux-rdma@vger.kernel.org > > >> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma > > >> provider module > > >> > > >> Hi Devesh- > > >> > > >> Thanks for drilling into this further. > > >> > > >> On Jul 21, 2014, at 7:48 AM, Devesh Sharma > wrote: > > >> > > >>> In rpcrdma_ep_connect(): > > >>> > > >>> write_lock(&ia->ri_qplock); > > >>> old = ia->ri_id; > > >>> ia->ri_id = id; > > >>> write_unlock(&ia->ri_qplock); > > >>> > > >>> rdma_destroy_qp(old); > > >>> rdma_destroy_id(old); => Cm -id is destroyed > here. > > >>> > > >>> > > >>> If following code fails in rpcrdma_ep_connect(): > > >>> id = rpcrdma_create_id(xprt, ia, > > >>> (struct sockaddr *)&xprt->rx_data.addr); > > >>> if (IS_ERR(id)) { > > >>> rc = -EHOSTUNREACH; > > >>> goto out; > > >>> } > > >>> > > >>> it leaves old cm-id still alive. This will always fail if Device > > >>> is removed > abruptly. > > >> > > >> For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets > > >> ep->rep_connected to -ENODEV. > > >> > > >> Then: > > >> > > >> 929 int > > >> 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia > > >> *ia) > > >> 931 { > > >> 932 struct rdma_cm_id *id, *old; > > >> 933 int rc = 0; > > >> 934 int retry_count = 0; > > >> 935 > > >> 936 if (ep->rep_connected != 0) { > > >> 937 struct rpcrdma_xprt *xprt; > > >> 938 retry: > > >> 939 dprintk("RPC: %s: reconnecting...\n", > > >> __func__); > > >> > > >> ep->rep_connected is probably -ENODEV after a device removal. It > > >> ep->would be > > >> possible for the connect worker to destroy everything associated > > >> with this connection in that case to ensure the underlying object > > >> reference counts are cleared. > > >> > > >> The immediate danger is that if there are pending RPCs, they could > > >> exit while qp/cm_id are NULL, triggering a panic in > rpcrdma_deregister_frmr_external(). > > >> Checking for NULL pointers inside the ri_qplock would prevent that. > > >> > > >> However, NFS mounts via this adapter will hang indefinitely after > > >> all transports are torn down and the adapter is gone. The only > > >> thing that can be done is something drastic like "echo b > > /proc/sysrq_trigger" on the client. > > >> > > >> Thus, IMO hot-plugging or passive fail-over are the only scenarios > > >> where this makes sense. If we have an immediate problem here, is it > > >> a problem with system shutdown ordering that can be addressed in > some other way? > > >> > > >> Until that support is in place, obviously I would prefer that the > > >> removal of the underlying driver be prevented while there are NFS > > >> mounts in place. I think that's what NFS users have come to expect. > > >> > > >> In other words, don't allow device removal until we have support > > >> for device insertion :-) This needs a fresh series of patches? > > >> > > >> > > > > > > > > > If we fix the above problems on provider unload, shouldn't the mount > > > recover if the provider module is subsequently loaded? Or another > > > provider configured such that > > > rdma_resolve_addr/route() then picks an active device? > > > > On device removal, we have to destroy everything. > > > > On insertion, we'll have to create a fresh PD and MRs, which isn't > > done now during reconnect. So, insertion needs work too. Makes Sense to me too. Thanks Chuck. -Regards Devesh > > > > Got it, thanks! > > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
> -Original Message- > From: Chuck Lever [mailto:chuck.le...@oracle.com] > Sent: Monday, July 21, 2014 10:21 AM > To: Steve Wise > Cc: Devesh Sharma; Shirley Ma; Hefty, Sean; Roland Dreier; > linux-rdma@vger.kernel.org > Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module > > > On Jul 21, 2014, at 11:03 AM, Steve Wise wrote: > > > > > > >> -Original Message- > >> From: Chuck Lever [mailto:chuck.le...@oracle.com] > >> Sent: Monday, July 21, 2014 9:54 AM > >> To: Devesh Sharma > >> Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; > >> linux-rdma@vger.kernel.org > >> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider > >> module > >> > >> Hi Devesh- > >> > >> Thanks for drilling into this further. > >> > >> On Jul 21, 2014, at 7:48 AM, Devesh Sharma > >> wrote: > >> > >>> In rpcrdma_ep_connect(): > >>> > >>> write_lock(&ia->ri_qplock); > >>> old = ia->ri_id; > >>> ia->ri_id = id; > >>> write_unlock(&ia->ri_qplock); > >>> > >>> rdma_destroy_qp(old); > >>> rdma_destroy_id(old); => Cm -id is destroyed > >>> here. > >>> > >>> > >>> If following code fails in rpcrdma_ep_connect(): > >>> id = rpcrdma_create_id(xprt, ia, > >>> (struct sockaddr *)&xprt->rx_data.addr); > >>> if (IS_ERR(id)) { > >>> rc = -EHOSTUNREACH; > >>> goto out; > >>> } > >>> > >>> it leaves old cm-id still alive. This will always fail if Device is > >>> removed abruptly. > >> > >> For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep->rep_connected > >> to -ENODEV. > >> > >> Then: > >> > >> 929 int > >> 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia) > >> 931 { > >> 932 struct rdma_cm_id *id, *old; > >> 933 int rc = 0; > >> 934 int retry_count = 0; > >> 935 > >> 936 if (ep->rep_connected != 0) { > >> 937 struct rpcrdma_xprt *xprt; > >> 938 retry: > >> 939 dprintk("RPC: %s: reconnecting...\n", __func__); > >> > >> ep->rep_connected is probably -ENODEV after a device removal. It would be > >> possible for the connect worker to destroy everything associated with this > >> connection in that case to ensure the underlying object reference counts > >> are cleared. > >> > >> The immediate danger is that if there are pending RPCs, they could exit > >> while > >> qp/cm_id are NULL, triggering a panic in > >> rpcrdma_deregister_frmr_external(). > >> Checking for NULL pointers inside the ri_qplock would prevent that. > >> > >> However, NFS mounts via this adapter will hang indefinitely after all > >> transports are torn down and the adapter is gone. The only thing that can > >> be > >> done is something drastic like "echo b > /proc/sysrq_trigger" on the > >> client. > >> > >> Thus, IMO hot-plugging or passive fail-over are the only scenarios where > >> this makes sense. If we have an immediate problem here, is it a problem > >> with > >> system shutdown ordering that can be addressed in some other way? > >> > >> Until that support is in place, obviously I would prefer that the removal > >> of > >> the underlying driver be prevented while there are NFS mounts in place. I > >> think that's what NFS users have come to expect. > >> > >> In other words, don't allow device removal until we have support for device > >> insertion :-) > >> > >> > > > > > > If we fix the above problems on provider unload, shouldn't the mount > > recover if the > > provider module is subsequently loaded? Or another provider configured > > such that > > rdma_resolve_addr/route() then picks an active device? > > On device removal, we have to destroy everything. > > On insertion, we'll have to create a fresh PD and MRs, which isn't done now > during reconnect. So, insertion needs work too. > Got it, thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
On Jul 21, 2014, at 11:03 AM, Steve Wise wrote: > > >> -Original Message- >> From: Chuck Lever [mailto:chuck.le...@oracle.com] >> Sent: Monday, July 21, 2014 9:54 AM >> To: Devesh Sharma >> Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; >> linux-rdma@vger.kernel.org >> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module >> >> Hi Devesh- >> >> Thanks for drilling into this further. >> >> On Jul 21, 2014, at 7:48 AM, Devesh Sharma wrote: >> >>> In rpcrdma_ep_connect(): >>> >>> write_lock(&ia->ri_qplock); >>> old = ia->ri_id; >>> ia->ri_id = id; >>> write_unlock(&ia->ri_qplock); >>> >>> rdma_destroy_qp(old); >>> rdma_destroy_id(old); => Cm -id is destroyed >>> here. >>> >>> >>> If following code fails in rpcrdma_ep_connect(): >>> id = rpcrdma_create_id(xprt, ia, >>> (struct sockaddr *)&xprt->rx_data.addr); >>> if (IS_ERR(id)) { >>> rc = -EHOSTUNREACH; >>> goto out; >>> } >>> >>> it leaves old cm-id still alive. This will always fail if Device is removed >>> abruptly. >> >> For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep->rep_connected >> to -ENODEV. >> >> Then: >> >> 929 int >> 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia) >> 931 { >> 932 struct rdma_cm_id *id, *old; >> 933 int rc = 0; >> 934 int retry_count = 0; >> 935 >> 936 if (ep->rep_connected != 0) { >> 937 struct rpcrdma_xprt *xprt; >> 938 retry: >> 939 dprintk("RPC: %s: reconnecting...\n", __func__); >> >> ep->rep_connected is probably -ENODEV after a device removal. It would be >> possible for the connect worker to destroy everything associated with this >> connection in that case to ensure the underlying object reference counts >> are cleared. >> >> The immediate danger is that if there are pending RPCs, they could exit while >> qp/cm_id are NULL, triggering a panic in rpcrdma_deregister_frmr_external(). >> Checking for NULL pointers inside the ri_qplock would prevent that. >> >> However, NFS mounts via this adapter will hang indefinitely after all >> transports are torn down and the adapter is gone. The only thing that can be >> done is something drastic like "echo b > /proc/sysrq_trigger" on the client. >> >> Thus, IMO hot-plugging or passive fail-over are the only scenarios where >> this makes sense. If we have an immediate problem here, is it a problem with >> system shutdown ordering that can be addressed in some other way? >> >> Until that support is in place, obviously I would prefer that the removal of >> the underlying driver be prevented while there are NFS mounts in place. I >> think that's what NFS users have come to expect. >> >> In other words, don't allow device removal until we have support for device >> insertion :-) >> >> > > > If we fix the above problems on provider unload, shouldn't the mount recover > if the > provider module is subsequently loaded? Or another provider configured such > that > rdma_resolve_addr/route() then picks an active device? On device removal, we have to destroy everything. On insertion, we’ll have to create a fresh PD and MRs, which isn’t done now during reconnect. So, insertion needs work too. -- Chuck Lever chuck[dot]lever[at]oracle[dot]com -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/5] iw_cxgb4: Don't limit TPTE count to 32KB
Use the size advertised by FW Signed-off-by: Steve Wise Signed-off-by: Hariprasad Shenai --- drivers/infiniband/hw/cxgb4/iw_cxgb4.h |2 +- drivers/infiniband/hw/cxgb4/t4.h |1 - 2 files changed, 1 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h index 69f047c..c378fd2 100644 --- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h +++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h @@ -193,7 +193,7 @@ static inline int c4iw_fatal_error(struct c4iw_rdev *rdev) static inline int c4iw_num_stags(struct c4iw_rdev *rdev) { - return min((int)T4_MAX_NUM_STAG, (int)(rdev->lldi.vr->stag.size >> 5)); + return (int)(rdev->lldi.vr->stag.size >> 5); } #define C4IW_WR_TO (30*HZ) diff --git a/drivers/infiniband/hw/cxgb4/t4.h b/drivers/infiniband/hw/cxgb4/t4.h index 641ab55..df5edfa 100644 --- a/drivers/infiniband/hw/cxgb4/t4.h +++ b/drivers/infiniband/hw/cxgb4/t4.h @@ -37,7 +37,6 @@ #include "t4fw_ri_api.h" #define T4_MAX_NUM_PD 65536 -#define T4_MAX_NUM_STAG (1<<15) #define T4_MAX_MR_SIZE (~0ULL) #define T4_PAGESIZE_MASK 0x000 /* 4KB-128MB */ #define T4_STAG_UNSET 0x -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/5] cxgb4: Add the MC1 registers to read in the interrupt handler
Signed-off-by: Hariprasad Shenai --- drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 16 +--- drivers/net/ethernet/chelsio/cxgb4/t4_regs.h |3 +++ 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c index eb5a278..e768852 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c @@ -1719,16 +1719,24 @@ static void mps_intr_handler(struct adapter *adapter) */ static void mem_intr_handler(struct adapter *adapter, int idx) { - static const char name[3][5] = { "EDC0", "EDC1", "MC" }; + static const char name[4][7] = { "EDC0", "EDC1", "MC/MC0", "MC1" }; unsigned int addr, cnt_addr, v; if (idx <= MEM_EDC1) { addr = EDC_REG(EDC_INT_CAUSE, idx); cnt_addr = EDC_REG(EDC_ECC_STATUS, idx); + } else if (idx == MEM_MC) { + if (is_t4(adapter->params.chip)) { + addr = MC_INT_CAUSE; + cnt_addr = MC_ECC_STATUS; + } else { + addr = MC_P_INT_CAUSE; + cnt_addr = MC_P_ECC_STATUS; + } } else { - addr = MC_INT_CAUSE; - cnt_addr = MC_ECC_STATUS; + addr = MC_REG(MC_P_INT_CAUSE, 1); + cnt_addr = MC_REG(MC_P_ECC_STATUS, 1); } v = t4_read_reg(adapter, addr) & MEM_INT_MASK; @@ -1892,6 +1900,8 @@ int t4_slow_intr_handler(struct adapter *adapter) pcie_intr_handler(adapter); if (cause & MC) mem_intr_handler(adapter, MEM_MC); + if (!is_t4(adapter->params.chip) && (cause & MC1)) + mem_intr_handler(adapter, MEM_MC1); if (cause & EDC0) mem_intr_handler(adapter, MEM_EDC0); if (cause & EDC1) diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h index 3b244ab..e3146e8 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h @@ -448,11 +448,13 @@ #define TDUE 0x0001U #define MC_INT_CAUSE 0x7518 +#define MC_P_INT_CAUSE 0x41318 #define ECC_UE_INT_CAUSE 0x0004U #define ECC_CE_INT_CAUSE 0x0002U #define PERR_INT_CAUSE 0x0001U #define MC_ECC_STATUS 0x751c +#define MC_P_ECC_STATUS 0x4131c #define ECC_CECNT_MASK 0xU #define ECC_CECNT_SHIFT 16 #define ECC_CECNT(x) ((x) << ECC_CECNT_SHIFT) @@ -1101,6 +1103,7 @@ #define I2CM 0x0002U #define CIM0x0001U +#define MC1 0x31 #define PL_INT_ENABLE 0x19410 #define PL_INT_MAP0 0x19414 #define PL_RST 0x19428 -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] iw_cxgb4: Support query_qp() verb
Signed-off-by: Steve Wise Signed-off-by: Hariprasad Shenai --- drivers/infiniband/hw/cxgb4/qp.c |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c index fd66bd9..0e7e0e3 100644 --- a/drivers/infiniband/hw/cxgb4/qp.c +++ b/drivers/infiniband/hw/cxgb4/qp.c @@ -1856,5 +1856,11 @@ int c4iw_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, memset(attr, 0, sizeof *attr); memset(init_attr, 0, sizeof *init_attr); attr->qp_state = to_ib_qp_state(qhp->attr.state); + init_attr->cap.max_send_wr = qhp->attr.sq_num_entries; + init_attr->cap.max_recv_wr = qhp->attr.rq_num_entries; + init_attr->cap.max_send_sge = qhp->attr.sq_max_sges; + init_attr->cap.max_recv_sge = qhp->attr.sq_max_sges; + init_attr->cap.max_inline_data = T4_MAX_SEND_INLINE; + init_attr->sq_sig_type = qhp->sq_sig_all ? IB_SIGNAL_ALL_WR : 0; return 0; } -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/5] iw_cxgb4: advertise the correct device max attributes
Advertise the actual max limits for things like qp depths, number of qps, cqs, etc. Clean up the queue allocation for qps and cqs. Signed-off-by: Steve Wise Signed-off-by: Hariprasad Shenai --- drivers/infiniband/hw/cxgb4/cq.c |8 +-- drivers/infiniband/hw/cxgb4/device.c | 16 +++--- drivers/infiniband/hw/cxgb4/provider.c |4 +- drivers/infiniband/hw/cxgb4/qp.c | 35 +++ drivers/infiniband/hw/cxgb4/t4.h |2 - 5 files changed, 32 insertions(+), 33 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c index de9bcf2..0f773e7 100644 --- a/drivers/infiniband/hw/cxgb4/cq.c +++ b/drivers/infiniband/hw/cxgb4/cq.c @@ -913,14 +913,8 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int entries, /* * memsize must be a multiple of the page size if its a user cq. */ - if (ucontext) { + if (ucontext) memsize = roundup(memsize, PAGE_SIZE); - hwentries = memsize / sizeof *chp->cq.queue; - while (hwentries > rhp->rdev.hw_queue.t4_max_iq_size) { - memsize -= PAGE_SIZE; - hwentries = memsize / sizeof *chp->cq.queue; - } - } chp->cq.size = hwentries; chp->cq.memsize = memsize; chp->cq.vector = vector; diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c index 770ee6f..67fe634 100644 --- a/drivers/infiniband/hw/cxgb4/device.c +++ b/drivers/infiniband/hw/cxgb4/device.c @@ -934,17 +934,17 @@ static struct c4iw_dev *c4iw_alloc(const struct cxgb4_lld_info *infop) devp->rdev.hw_queue.t4_eq_status_entries = devp->rdev.lldi.sge_ingpadboundary > 64 ? 2 : 1; - devp->rdev.hw_queue.t4_max_eq_size = - 65520 - devp->rdev.hw_queue.t4_eq_status_entries; - devp->rdev.hw_queue.t4_max_iq_size = 65520 - 1; - devp->rdev.hw_queue.t4_max_rq_size = - 8192 - devp->rdev.hw_queue.t4_eq_status_entries; + devp->rdev.hw_queue.t4_max_eq_size = 65520; + devp->rdev.hw_queue.t4_max_iq_size = 65520; + devp->rdev.hw_queue.t4_max_rq_size = 8192 - + devp->rdev.hw_queue.t4_eq_status_entries - 1; devp->rdev.hw_queue.t4_max_sq_size = - devp->rdev.hw_queue.t4_max_eq_size - 1; + devp->rdev.hw_queue.t4_max_eq_size - + devp->rdev.hw_queue.t4_eq_status_entries - 1; devp->rdev.hw_queue.t4_max_qp_depth = - devp->rdev.hw_queue.t4_max_rq_size - 1; + devp->rdev.hw_queue.t4_max_rq_size; devp->rdev.hw_queue.t4_max_cq_depth = - devp->rdev.hw_queue.t4_max_iq_size - 1; + devp->rdev.hw_queue.t4_max_iq_size - 2; devp->rdev.hw_queue.t4_stat_len = devp->rdev.lldi.sge_egrstatuspagesize; diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c index 67c4a69..72e3b69 100644 --- a/drivers/infiniband/hw/cxgb4/provider.c +++ b/drivers/infiniband/hw/cxgb4/provider.c @@ -318,7 +318,7 @@ static int c4iw_query_device(struct ib_device *ibdev, props->vendor_id = (u32)dev->rdev.lldi.pdev->vendor; props->vendor_part_id = (u32)dev->rdev.lldi.pdev->device; props->max_mr_size = T4_MAX_MR_SIZE; - props->max_qp = T4_MAX_NUM_QP; + props->max_qp = dev->rdev.lldi.vr->qp.size / 2; props->max_qp_wr = dev->rdev.hw_queue.t4_max_qp_depth; props->max_sge = T4_MAX_RECV_SGE; props->max_sge_rd = 1; @@ -326,7 +326,7 @@ static int c4iw_query_device(struct ib_device *ibdev, props->max_qp_rd_atom = min(dev->rdev.lldi.max_ordird_qp, c4iw_max_read_depth); props->max_qp_init_rd_atom = props->max_qp_rd_atom; - props->max_cq = T4_MAX_NUM_CQ; + props->max_cq = dev->rdev.lldi.vr->qp.size; props->max_cqe = dev->rdev.hw_queue.t4_max_cq_depth; props->max_mr = c4iw_num_stags(&dev->rdev); props->max_pd = T4_MAX_NUM_PD; diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c index 0e7e0e3..c158fcc 100644 --- a/drivers/infiniband/hw/cxgb4/qp.c +++ b/drivers/infiniband/hw/cxgb4/qp.c @@ -205,9 +205,9 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq, } /* -* RQT must be a power of 2. +* RQT must be a power of 2 and at least 16 deep. */ - wq->rq.rqt_size = roundup_pow_of_two(wq->rq.size); + wq->rq.rqt_size = roundup_pow_of_two(max_t(u16, wq->rq.size, 16)); wq->rq.rqt_hwaddr = c4iw_rqtpool_alloc(rdev, wq->rq.rqt_size); if (!wq->rq.rqt_hwaddr) { ret = -ENOMEM; @@ -1621,13 +1621,17 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *attrs, if (attrs->cap.max_inline_data > T4_MAX_SEND_INLINE) return ERR_PTR(
[PATCH 0/5] Misc. fix for cxgb4 and iw_cxgb4
Hi, This patch series adds support to enchance error reporting, log detailed warning for negative advice, support query_qp verb and advertise correct device max attributes for iwarp. The patches series is created against 'net-next' tree. And includes patches on cxgb4 and iw_cxgb4 driver. Since this patch-series contains changes which are dependent on commit id 4c2c5763 ("cxgb4/iw_cxgb4: use firmware ord/ird resource limits") of net-next tree we would like to request this patch series to get merged via David Miller's 'net-next' tree. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. Thanks Hariprasad Shenai (5): cxgb4: Add the MC1 registers to read in the interrupt handler iw_cxgb4: log detailed warnings for negative advice iw_cxgb4: Support query_qp() verb iw_cxgb4: advertise the correct device max attributes iw_cxgb4: Don't limit TPTE count to 32KB drivers/infiniband/hw/cxgb4/cm.c | 29 ++ drivers/infiniband/hw/cxgb4/cq.c |8 + drivers/infiniband/hw/cxgb4/device.c | 16 +- drivers/infiniband/hw/cxgb4/iw_cxgb4.h |2 +- drivers/infiniband/hw/cxgb4/provider.c |4 +- drivers/infiniband/hw/cxgb4/qp.c | 41 +- drivers/infiniband/hw/cxgb4/t4.h |3 -- drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 16 -- drivers/net/ethernet/chelsio/cxgb4/t4_regs.h |3 ++ 9 files changed, 78 insertions(+), 44 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] iw_cxgb4: log detailed warnings for negative advice
Signed-off-by: Steve Wise Signed-off-by: Hariprasad Shenai --- drivers/infiniband/hw/cxgb4/cm.c | 29 +++-- 1 files changed, 23 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c index df5bd3d..6d61a16 100644 --- a/drivers/infiniband/hw/cxgb4/cm.c +++ b/drivers/infiniband/hw/cxgb4/cm.c @@ -1813,6 +1813,20 @@ static int is_neg_adv(unsigned int status) status == CPL_ERR_KEEPALV_NEG_ADVICE; } +static char *neg_adv_str(unsigned int status) +{ + switch (status) { + case CPL_ERR_RTX_NEG_ADVICE: + return "Retransmit timeout"; + case CPL_ERR_PERSIST_NEG_ADVICE: + return "Persist timeout"; + case CPL_ERR_KEEPALV_NEG_ADVICE: + return "Keepalive timeout"; + default: + return "Unknown"; + } +} + static void set_tcp_window(struct c4iw_ep *ep, struct port_info *pi) { ep->snd_win = snd_win; @@ -2011,8 +2025,9 @@ static int act_open_rpl(struct c4iw_dev *dev, struct sk_buff *skb) status, status2errno(status)); if (is_neg_adv(status)) { - printk(KERN_WARNING MOD "Connection problems for atid %u\n", - atid); + dev_warn(&dev->rdev.lldi.pdev->dev, +"Connection problems for atid %u status %u (%s)\n", +atid, status, neg_adv_str(status)); return 0; } @@ -2488,8 +2503,9 @@ static int peer_abort(struct c4iw_dev *dev, struct sk_buff *skb) ep = lookup_tid(t, tid); if (is_neg_adv(req->status)) { - PDBG("%s neg_adv_abort ep %p tid %u\n", __func__, ep, -ep->hwtid); + dev_warn(&dev->rdev.lldi.pdev->dev, +"Negative advice on abort - tid %u status %d (%s)\n", +ep->hwtid, req->status, neg_adv_str(req->status)); return 0; } PDBG("%s ep %p tid %u state %u\n", __func__, ep, ep->hwtid, @@ -3894,8 +3910,9 @@ static int peer_abort_intr(struct c4iw_dev *dev, struct sk_buff *skb) return 0; } if (is_neg_adv(req->status)) { - PDBG("%s neg_adv_abort ep %p tid %u\n", __func__, ep, -ep->hwtid); + dev_warn(&dev->rdev.lldi.pdev->dev, +"Negative advice on abort - tid %u status %d (%s)\n", +ep->hwtid, req->status, neg_adv_str(req->status)); kfree_skb(skb); return 0; } -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
> -Original Message- > From: Chuck Lever [mailto:chuck.le...@oracle.com] > Sent: Monday, July 21, 2014 9:54 AM > To: Devesh Sharma > Cc: Shirley Ma; Steve Wise; Hefty, Sean; Roland Dreier; > linux-rdma@vger.kernel.org > Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module > > Hi Devesh- > > Thanks for drilling into this further. > > On Jul 21, 2014, at 7:48 AM, Devesh Sharma wrote: > > > In rpcrdma_ep_connect(): > > > > write_lock(&ia->ri_qplock); > >old = ia->ri_id; > >ia->ri_id = id; > >write_unlock(&ia->ri_qplock); > > > >rdma_destroy_qp(old); > >rdma_destroy_id(old); => Cm -id is destroyed > > here. > > > > > > If following code fails in rpcrdma_ep_connect(): > > id = rpcrdma_create_id(xprt, ia, > >(struct sockaddr *)&xprt->rx_data.addr); > >if (IS_ERR(id)) { > >rc = -EHOSTUNREACH; > >goto out; > >} > > > > it leaves old cm-id still alive. This will always fail if Device is removed > > abruptly. > > For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep->rep_connected > to -ENODEV. > > Then: > > 929 int > 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia) > 931 { > 932 struct rdma_cm_id *id, *old; > 933 int rc = 0; > 934 int retry_count = 0; > 935 > 936 if (ep->rep_connected != 0) { > 937 struct rpcrdma_xprt *xprt; > 938 retry: > 939 dprintk("RPC: %s: reconnecting...\n", __func__); > > ep->rep_connected is probably -ENODEV after a device removal. It would be > possible for the connect worker to destroy everything associated with this > connection in that case to ensure the underlying object reference counts > are cleared. > > The immediate danger is that if there are pending RPCs, they could exit while > qp/cm_id are NULL, triggering a panic in rpcrdma_deregister_frmr_external(). > Checking for NULL pointers inside the ri_qplock would prevent that. > > However, NFS mounts via this adapter will hang indefinitely after all > transports are torn down and the adapter is gone. The only thing that can be > done is something drastic like "echo b > /proc/sysrq_trigger" on the client. > > Thus, IMO hot-plugging or passive fail-over are the only scenarios where > this makes sense. If we have an immediate problem here, is it a problem with > system shutdown ordering that can be addressed in some other way? > > Until that support is in place, obviously I would prefer that the removal of > the underlying driver be prevented while there are NFS mounts in place. I > think that's what NFS users have come to expect. > > In other words, don't allow device removal until we have support for device > insertion :-) > > If we fix the above problems on provider unload, shouldn't the mount recover if the provider module is subsequently loaded? Or another provider configured such that rdma_resolve_addr/route() then picks an active device? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
Hi Devesh- Thanks for drilling into this further. On Jul 21, 2014, at 7:48 AM, Devesh Sharma wrote: > In rpcrdma_ep_connect(): > > write_lock(&ia->ri_qplock); >old = ia->ri_id; >ia->ri_id = id; >write_unlock(&ia->ri_qplock); > >rdma_destroy_qp(old); >rdma_destroy_id(old); => Cm -id is destroyed here. > > > If following code fails in rpcrdma_ep_connect(): > id = rpcrdma_create_id(xprt, ia, >(struct sockaddr *)&xprt->rx_data.addr); >if (IS_ERR(id)) { >rc = -EHOSTUNREACH; >goto out; >} > > it leaves old cm-id still alive. This will always fail if Device is removed > abruptly. For CM_EVENT_DEVICE_REMOVAL, rpcrdma_conn_upcall() sets ep->rep_connected to -ENODEV. Then: 929 int 930 rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia) 931 { 932 struct rdma_cm_id *id, *old; 933 int rc = 0; 934 int retry_count = 0; 935 936 if (ep->rep_connected != 0) { 937 struct rpcrdma_xprt *xprt; 938 retry: 939 dprintk("RPC: %s: reconnecting...\n", __func__); ep->rep_connected is probably -ENODEV after a device removal. It would be possible for the connect worker to destroy everything associated with this connection in that case to ensure the underlying object reference counts are cleared. The immediate danger is that if there are pending RPCs, they could exit while qp/cm_id are NULL, triggering a panic in rpcrdma_deregister_frmr_external(). Checking for NULL pointers inside the ri_qplock would prevent that. However, NFS mounts via this adapter will hang indefinitely after all transports are torn down and the adapter is gone. The only thing that can be done is something drastic like “echo b > /proc/sysrq_trigger” on the client. Thus, IMO hot-plugging or passive fail-over are the only scenarios where this makes sense. If we have an immediate problem here, is it a problem with system shutdown ordering that can be addressed in some other way? Until that support is in place, obviously I would prefer that the removal of the underlying driver be prevented while there are NFS mounts in place. I think that’s what NFS users have come to expect. In other words, don’t allow device removal until we have support for device insertion :-) > In rdma_resolve_addr()/rdma_destroy_id() cm_dev is referenced/de-referenced > here (cma.c): > > static int cma_acquire_dev(struct rdma_id_private *id_priv, > struct rdma_id_private *listen_id_priv) { > . > . > if (!ret) >cma_attach_to_dev(id_priv, cma_dev); > } > > static void cma_release_dev(struct rdma_id_private *id_priv) > { >mutex_lock(&lock); >list_del(&id_priv->list); >cma_deref_dev(id_priv->cma_dev); > . > . > } > > Since as per design of nfs-rdma at-least previously known good cm-id always > remains live utill > another good cm-id is created, cma_dev->refcount never becomes 0 upon device > removal . > Thus blocking the rmmod forever. > > -Regards > Devesh > >> -Original Message- >> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- >> ow...@vger.kernel.org] On Behalf Of Devesh Sharma >> Sent: Monday, July 21, 2014 11:42 AM >> To: Shirley Ma; Steve Wise; 'Chuck Lever' >> Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org >> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider >> module >> >> Shirley, >> >> Once rmmod is issued, the connection corresponding to the active mount is >> destroyed and all the associated resources Are freed. As per the processing >> logic of DEVICE-REMOVAL event, nfs-rdma wakes-up all the waiters, This >> results into Re-establishment efforts, since the device is not present any >> more, rdma_resolve_address() fails with CM resolution Error. This loop >> continues forever. >> >> I am yet to find out which part of ocrdma is blocked. I am putting some debug >> messages to find it out. I will get back to The group with an update. >> >> -Regards >> Devesh >> >>> -Original Message- >>> From: Shirley Ma [mailto:shirley...@oracle.com] >>> Sent: Friday, July 18, 2014 9:18 PM >>> To: Steve Wise; Devesh Sharma; 'Chuck Lever' >>> Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org >>> Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider >>> module >>> >>> >>> On 07/18/2014 06:27 AM, Steve Wise wrote: >>> We can't really deal with a CM_DEVICE_REMOVE event while there >>> are active NFS mounts. >>> >>> System shutdown ordering should guarantee (one would hope) that >>> NFS >>> mount points are unmounted before the RDMA/IB core >> infrastructure >>> is torn down. Ordering shouldn't matter as long all NFS activity >>> has ceased before the CM tries to remove the device. >>> >>
RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
In rpcrdma_ep_connect(): write_lock(&ia->ri_qplock); old = ia->ri_id; ia->ri_id = id; write_unlock(&ia->ri_qplock); rdma_destroy_qp(old); rdma_destroy_id(old); => Cm -id is destroyed here. If following code fails in rpcrdma_ep_connect(): id = rpcrdma_create_id(xprt, ia, (struct sockaddr *)&xprt->rx_data.addr); if (IS_ERR(id)) { rc = -EHOSTUNREACH; goto out; } it leaves old cm-id still alive. This will always fail if Device is removed abruptly. In rdma_resolve_addr()/rdma_destroy_id() cm_dev is referenced/de-referenced here (cma.c): static int cma_acquire_dev(struct rdma_id_private *id_priv, struct rdma_id_private *listen_id_priv) { . . if (!ret) cma_attach_to_dev(id_priv, cma_dev); } static void cma_release_dev(struct rdma_id_private *id_priv) { mutex_lock(&lock); list_del(&id_priv->list); cma_deref_dev(id_priv->cma_dev); . . } Since as per design of nfs-rdma at-least previously known good cm-id always remains live utill another good cm-id is created, cma_dev->refcount never becomes 0 upon device removal . Thus blocking the rmmod forever. -Regards Devesh > -Original Message- > From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- > ow...@vger.kernel.org] On Behalf Of Devesh Sharma > Sent: Monday, July 21, 2014 11:42 AM > To: Shirley Ma; Steve Wise; 'Chuck Lever' > Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org > Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider > module > > Shirley, > > Once rmmod is issued, the connection corresponding to the active mount is > destroyed and all the associated resources Are freed. As per the processing > logic of DEVICE-REMOVAL event, nfs-rdma wakes-up all the waiters, This > results into Re-establishment efforts, since the device is not present any > more, rdma_resolve_address() fails with CM resolution Error. This loop > continues forever. > > I am yet to find out which part of ocrdma is blocked. I am putting some debug > messages to find it out. I will get back to The group with an update. > > -Regards > Devesh > > > -Original Message- > > From: Shirley Ma [mailto:shirley...@oracle.com] > > Sent: Friday, July 18, 2014 9:18 PM > > To: Steve Wise; Devesh Sharma; 'Chuck Lever' > > Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org > > Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider > > module > > > > > > On 07/18/2014 06:27 AM, Steve Wise wrote: > > We can't really deal with a CM_DEVICE_REMOVE event while there > > are active NFS mounts. > > > > System shutdown ordering should guarantee (one would hope) that > > NFS > > mount points are unmounted before the RDMA/IB core > infrastructure > > is torn down. Ordering shouldn't matter as long all NFS activity > > has ceased before the CM tries to remove the device. > > > > So if something is hanging up the CM, there's something xprtrdma > > is not cleaning up properly. > > > > >>> > > >>> > > >>> Devesh, how are you reproducing this? Are you just rmmod'ing the > > >>> ocrdma module while there are active mounts? > > >> > > >> Yes, I am issuing rmmod while there is an active mount. In my case > > >> rmmod ocrdma remains blocked forever. > > Where is it blocked? > > > > >> Off-the-course of this discussion: Is there a reasoning behind not > > >> using > > >> ib_register_client()/ib_unregister_client() framework? > > > > > > I think the idea is that you don't need to use it if you are > > > transport-independent and use the rdmacm... > > > > > > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" > > > in the body of a message to majord...@vger.kernel.org More > > majordomo > > > info at http://vger.kernel.org/majordomo-info.html > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the > body of a message to majord...@vger.kernel.org More majordomo info at > http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html