Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices
On 4/28/2015 2:51 PM, Or Gerlitz wrote:
On Mon, Apr 27, 2015 at 11:25 AM, Matan Barak mat...@mellanox.com wrote:
On 4/26/2015 11:10 PM, Or Gerlitz wrote:
On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur somnath.ko...@emulex.com wrote:

From: Matan Barak mat...@mellanox.com

Previously, we used the device_mutex lock to protect the device list. That meant that, in order to guarantee a device isn't freed while we use it, we had to lock all devices.

Matan, looking at the cover letter, it says:

[...] Patch 0002 adds a reference count mechanism to IB devices. This mechanism is similar to dev_hold and dev_put available for net devices. This is mandatory for later patches [...]

Correct, I'll change that into:

Currently we use the device_mutex lock to protect the device list. In the current approach, in order to guarantee a device isn't freed we have to lock all devices.

Adding a kref per IB device. Before an IB device is unregistered, we wait until it's not held anymore.

Why is this change mandatory for the proposed design?

The cleanup of roce_gid_cache is done in a different context, so we need to make sure the device is still alive while doing so. In addition, we don't want the unregistration process of ib_core to free our context data.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
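The hold/put idea discussed in this message (take a reference while using a device, and let unregistration complete only once nobody holds it) can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch's code: `demo_device`, `demo_register`, `demo_hold`, `demo_put` and `demo_unregister` are hypothetical names, the counter is not atomic (the kernel's `struct kref` is), and a real implementation would block in unregister rather than return a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an IB device; not the kernel's struct ib_device. */
struct demo_device {
    int refcount;   /* current holders, including the registration itself */
    bool released;  /* set once the last reference has been dropped */
};

/* Registration takes the initial reference. */
static void demo_register(struct demo_device *dev)
{
    dev->refcount = 1;
    dev->released = false;
}

/* Take a reference, similar in spirit to dev_hold() for net devices. */
static void demo_hold(struct demo_device *dev)
{
    dev->refcount++;
}

/* Drop a reference; the last put marks the device as safe to free. */
static void demo_put(struct demo_device *dev)
{
    if (--dev->refcount == 0)
        dev->released = true;
}

/*
 * Drop the registration reference. Returns true only if no other holder
 * remains. Assumption: a real implementation would wait here instead.
 */
static bool demo_unregister(struct demo_device *dev)
{
    demo_put(dev);
    return dev->released;
}
```

With this shape, a cleanup path that runs in a different context only needs to hold its own reference; it no longer has to take a lock that covers the whole device list.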
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On 04/28/2015 03:24 AM, Doug Ledford wrote:
[snip]

Also wondering, why add UDP to USNIC, is there a different USNIC?

Yes, there are two transports, one a distinct ethertype and one that encapsulates USNIC in UDP.

But this new enum isn't about transport, it's about protocol. So is there one USNIC protocol, with a raw layering and a separate one with UDP? Or is it one USNIC protocol with two different framings? Seems there should be at least the USNIC protocol, without the _UDP decoration, and I don't see it in the enum.

Keep in mind that this enum was Liran's response to Michael's original patch. In the enum in Michael's patch, there was both USNIC and USNIC_UDP.

Yeah, I haven't added an enum value PROTOCOL_USNIC since nothing currently needs it... The only three cases currently are:

1. transport IB, link layer IB    // PROTOCOL_IB
2. transport IB, link layer ETH   // PROTOCOL_IBOE
3. transport IWARP                // PROTOCOL_IWARP

Regards,
Michael Wang

Naming multiple layers together seems confusing and maybe in the end will create more code to deal with the differences. For example, what token will RoCEv2 take? RoCE_UDP, RoCE_v2 or ... ?

Uncertain as of now.

Ok, but it's imminent, right? What's the preference/guidance?

There is a patchset from Devesh Sharma at Emulex. It added the RoCEv2 capability. As I recall, it used a new flag added to the existing port capabilities bitmask and notably did not modify either the node type or link layer that are currently used to differentiate between the different protocols. That's from memory though, so I could be mistaken. But that patchset was not written with this patchset in mind, and merging the two may well change that. In any case, there is a proposed spec to follow, so for now that's the preference/guidance (unless this rework means that we need to depart from the spec on internals for implementation reasons).
Re: [PATCH for-4.1] iw_cxgb4: Fix kbuild bot reported warnings
On Wed, 2015-04-29 at 00:05 +0530, Hariprasad Shenai wrote:

Commit 20dca80f (iw_cxgb4: 32b platform fixes) introduced warnings related to inappropriate argument type while printing arguments

The original patch has not yet been pushed upstream. There is no need to submit both it and a fix patch. I've already modified my copy of your patch to correct the errors (most of them, there were two spots I missed that need corrected still). However, that said...

Reported by: Dan Carpenter dan.carpen...@oracle.com
Reported by: kbuild test robot fengguang...@intel.com
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cq.c  | 5 +++--
 drivers/infiniband/hw/cxgb4/mem.c | 4 ++--
 drivers/infiniband/hw/cxgb4/qp.c  | 4 ++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index be66d5d..1f114c0 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -340,7 +340,8 @@ static void advance_oldest_read(struct t4_wq *wq)
  */
 void c4iw_flush_hw_cq(struct c4iw_cq *chp)
 {
-    struct t4_cqe *hw_cqe, *swcqe, read_cqe;
+    struct t4_cqe *hw_cqe = NULL;
+    struct t4_cqe *swcqe, read_cqe;
     struct c4iw_qp *qhp;
     struct t4_swsqe *swsqe;
     int ret;
@@ -975,7 +976,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int entries,
         mm2->len = PAGE_SIZE;
         insert_mmap(ucontext, mm2);
     }
-    PDBG("%s cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n",
+    PDBG("%s cqid 0x%0x chp %p size %u memsize %lu, dma_addr 0x%0llx\n",
          __func__, chp->cq.cqid, chp, chp->cq.size, (uintptr_t)chp->cq.memsize,
                                                     ^^

This is still wrong in your fixup. The uintptr_t is to be used where you want to shove an int into a ptr or vice versa and you know that the sizes are appropriate and you don't want the compiler complaining. It isn't something that should be used in casting for a printf format. So it shouldn't have been added here in the first place. The right fix is to remove this cast so the original memsize printf format works properly again.

          (unsigned long long) chp->cq.dma_addr);
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 9a26649..42805f6 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -930,7 +930,7 @@ struct ib_fast_reg_page_list *c4iw_alloc_fastreg_pbl(struct ib_device *device,
     PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
          __func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
-         (void *)(uintptr_t)c4pl->dma_addr);
+         (dma_addr_t *)(uintptr_t)c4pl->dma_addr);

This is wrong too. It makes no sense to cast this as uintptr and then back to dma_addr_t when it was dma_addr_t to begin with.

     return &c4pl->ibpl;
 }
@@ -941,7 +941,7 @@ void c4iw_free_fastreg_pbl(struct ib_fast_reg_page_list *ibpl)
     PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
          __func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
-         (void *)(uintptr_t)c4pl->dma_addr);
+         (dma_addr_t *)(uintptr_t)c4pl->dma_addr);

     dma_free_coherent(&c4pl->dev->rdev.lldi.pdev->dev,
                       c4pl->pll_len,
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index 7ce40c3..176a238 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -1784,8 +1784,8 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *attrs,
     qhp->ibqp.qp_num = qhp->wq.sq.qid;
     init_timer(&(qhp->timer));
     INIT_LIST_HEAD(&qhp->db_fc_entry);
-    PDBG("%s sq id %u size %u memsize %zu num_entries %u "
-         "rq id %u size %u memsize %zu num_entries %u\n", __func__,
+    PDBG("%s sq id %u size %u memsize %lu num_entries %u "
+         "rq id %u size %u memsize %lu num_entries %u\n", __func__,
          qhp->wq.sq.qid, qhp->wq.sq.size, (unsigned long)qhp->wq.sq.memsize,
          attrs->cap.max_send_wr, qhp->wq.rq.qid, qhp->wq.rq.size,
          (unsigned long)qhp->wq.rq.memsize, attrs->cap.max_recv_wr);

--
Doug Ledford dledf...@redhat.com
GPG KeyID: 0E572FDD
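The review point in this message, that the printf conversion should match the argument's real type instead of being papered over with a uintptr_t cast, can be shown with a small userspace sketch. The assumptions are stated up front: `format_memsize` and `format_addr` are hypothetical helpers, memsize is assumed to be a size_t (which the original %zu format suggests), and the kernel-only `%pad` specifier is left out because libc has no equivalent.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a size_t with %zu, the conversion defined for size_t.
 * No cast is needed. Returns what snprintf returns.
 */
static int format_memsize(char *buf, size_t len, size_t memsize)
{
    return snprintf(buf, len, "memsize %zu", memsize);
}

/*
 * Format a 64-bit address whose underlying type may vary by platform:
 * widen it explicitly to unsigned long long and print it with %llx.
 */
static int format_addr(char *buf, size_t len, uint64_t addr)
{
    return snprintf(buf, len, "dma_addr 0x%llx", (unsigned long long)addr);
}
```

The cast in `format_addr` is a value conversion to the exact type the conversion expects; it is not a pointer/integer reinterpretation, which is the job uintptr_t exists for.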
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On Tue, 2015-04-28 at 22:11 +0300, Or Gerlitz wrote:
On Tue, Apr 28, 2015 at 9:56 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote:
On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote:
On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote:

Having some of it refer to things as IBOE and some as ROCE would be similarly confusing, and switching existing IBOE usage to ROCE would cause pain to people with out of tree drivers (Lustre is the main one I know of). There's not a good answer here. There's only less sucky ones.

The tide has already turned, we should ditch iboe:

$ git grep -i roce_ drivers/infiniband/ | wc -l
91
$ git grep -i iboe_ drivers/infiniband/ | wc -l
37

It isn't really mainline's role to be too concerned about out of tree things like Lustre.

FWIW, note that Lustre has been under staging for a while, not sure how close they are to actual acceptance.

I thought that was just the client and didn't include the server...

--
Doug Ledford dledf...@redhat.com
GPG KeyID: 0E572FDD
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On Tue, 2015-04-28 at 12:56 -0600, Jason Gunthorpe wrote:
On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote:
On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote:

Having some of it refer to things as IBOE and some as ROCE would be similarly confusing, and switching existing IBOE usage to ROCE would cause pain to people with out of tree drivers (Lustre is the main one I know of). There's not a good answer here. There's only less sucky ones.

The tide has already turned, we should ditch iboe:

$ git grep -i roce_ drivers/infiniband/ | wc -l
91
$ git grep -i iboe_ drivers/infiniband/ | wc -l
37

It isn't really mainline's role to be too concerned about out of tree things like Lustre.

While I generally agree, one need not be totally callous about out of tree things either.

--
Doug Ledford dledf...@redhat.com
GPG KeyID: 0E572FDD
Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices
On Mon, Apr 27, 2015 at 11:25 AM, Matan Barak mat...@mellanox.com wrote:
On 4/26/2015 11:10 PM, Or Gerlitz wrote:
On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur somnath.ko...@emulex.com wrote:

From: Matan Barak mat...@mellanox.com

Previously, we used the device_mutex lock to protect the device list. That meant that, in order to guarantee a device isn't freed while we use it, we had to lock all devices.

Matan, looking at the cover letter, it says:

[...] Patch 0002 adds a reference count mechanism to IB devices. This mechanism is similar to dev_hold and dev_put available for net devices. This is mandatory for later patches [...]

Correct, I'll change that into:

Currently we use the device_mutex lock to protect the device list. In the current approach, in order to guarantee a device isn't freed we have to lock all devices.

Adding a kref per IB device. Before an IB device is unregistered, we wait until it's not held anymore.

Why is this change mandatory for the proposed design?
Re: [PATCH v3 for-next 01/33] IB/core: Add RoCE GID cache
On Tue, Apr 28, 2015 at 10:17 AM, Matan Barak mat...@mellanox.com wrote:
On 4/27/2015 9:22 PM, Or Gerlitz wrote:

I think the real question is why deal with RCU, which would require re-allocating entries when that isn't necessary, or why use an rwlock when the kernel provides a mechanism (called seqcount) that fits this problem better? I disagree about seqcount being complex - if you look at its API you'll find it's a lot simpler than RCU.

I took a second look: seqcount is indeed way simpler than RCU, and by itself is simple to use. If you feel it provides a better solution than a simple rwlock, I'm good with that.

Or.
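For readers who have not used a sequence counter, the idea being compared with rwlock and RCU here can be sketched in userspace with C11 atomics. This is not the kernel's seqcount API and not the GID cache code: `demo_seq_entry`, `demo_write` and `demo_read` are hypothetical names, the payload is an arbitrary pair of 64-bit values, and a single writer is assumed (writers would otherwise need their own lock).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical table entry guarded by a sequence counter. */
struct demo_seq_entry {
    atomic_uint seq;          /* even: stable, odd: write in progress */
    _Atomic uint64_t gid_hi;  /* payload; atomic so racing reads are defined */
    _Atomic uint64_t gid_lo;
};

/* Writer: make the counter odd, update the payload, make it even again. */
static void demo_write(struct demo_seq_entry *e, uint64_t hi, uint64_t lo)
{
    unsigned int s = atomic_load_explicit(&e->seq, memory_order_relaxed);

    atomic_store_explicit(&e->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&e->gid_hi, hi, memory_order_relaxed);
    atomic_store_explicit(&e->gid_lo, lo, memory_order_relaxed);
    atomic_store_explicit(&e->seq, s + 2, memory_order_release);
}

/* Reader: retry until the counter is even and unchanged across the reads. */
static void demo_read(struct demo_seq_entry *e, uint64_t *hi, uint64_t *lo)
{
    unsigned int before, after;

    do {
        before = atomic_load_explicit(&e->seq, memory_order_acquire);
        *hi = atomic_load_explicit(&e->gid_hi, memory_order_relaxed);
        *lo = atomic_load_explicit(&e->gid_lo, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&e->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

Readers never block the writer and entries are updated in place, which is the property the thread contrasts with RCU's copy-and-replace style.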
[PATCH for-4.1] iw_cxgb4: Fix kbuild bot reported warnings
Commit 20dca80f (iw_cxgb4: 32b platform fixes) introduced warnings related to inappropriate argument type while printing arguments

Reported by: Dan Carpenter dan.carpen...@oracle.com
Reported by: kbuild test robot fengguang...@intel.com
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cq.c  | 5 +++--
 drivers/infiniband/hw/cxgb4/mem.c | 4 ++--
 drivers/infiniband/hw/cxgb4/qp.c  | 4 ++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index be66d5d..1f114c0 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -340,7 +340,8 @@ static void advance_oldest_read(struct t4_wq *wq)
  */
 void c4iw_flush_hw_cq(struct c4iw_cq *chp)
 {
-    struct t4_cqe *hw_cqe, *swcqe, read_cqe;
+    struct t4_cqe *hw_cqe = NULL;
+    struct t4_cqe *swcqe, read_cqe;
     struct c4iw_qp *qhp;
     struct t4_swsqe *swsqe;
     int ret;
@@ -975,7 +976,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int entries,
         mm2->len = PAGE_SIZE;
         insert_mmap(ucontext, mm2);
     }
-    PDBG("%s cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n",
+    PDBG("%s cqid 0x%0x chp %p size %u memsize %lu, dma_addr 0x%0llx\n",
          __func__, chp->cq.cqid, chp, chp->cq.size, (uintptr_t)chp->cq.memsize,
          (unsigned long long) chp->cq.dma_addr);
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 9a26649..42805f6 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -930,7 +930,7 @@ struct ib_fast_reg_page_list *c4iw_alloc_fastreg_pbl(struct ib_device *device,
     PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
          __func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
-         (void *)(uintptr_t)c4pl->dma_addr);
+         (dma_addr_t *)(uintptr_t)c4pl->dma_addr);

     return &c4pl->ibpl;
 }
@@ -941,7 +941,7 @@ void c4iw_free_fastreg_pbl(struct ib_fast_reg_page_list *ibpl)
     PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
          __func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
-         (void *)(uintptr_t)c4pl->dma_addr);
+         (dma_addr_t *)(uintptr_t)c4pl->dma_addr);

     dma_free_coherent(&c4pl->dev->rdev.lldi.pdev->dev,
                       c4pl->pll_len,
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index 7ce40c3..176a238 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -1784,8 +1784,8 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *attrs,
     qhp->ibqp.qp_num = qhp->wq.sq.qid;
     init_timer(&(qhp->timer));
     INIT_LIST_HEAD(&qhp->db_fc_entry);
-    PDBG("%s sq id %u size %u memsize %zu num_entries %u "
-         "rq id %u size %u memsize %zu num_entries %u\n", __func__,
+    PDBG("%s sq id %u size %u memsize %lu num_entries %u "
+         "rq id %u size %u memsize %lu num_entries %u\n", __func__,
          qhp->wq.sq.qid, qhp->wq.sq.size, (unsigned long)qhp->wq.sq.memsize,
          attrs->cap.max_send_wr, qhp->wq.rq.qid, qhp->wq.rq.size,
          (unsigned long)qhp->wq.rq.memsize, attrs->cap.max_recv_wr);
--
2.3.4
[PATCH v7 20/23] IB/Verbs: Use management helper cap_ib_mcast()
Introduce helper cap_ib_mcast() to help us check whether a port of an IB device supports Infiniband Multicast.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c       |  6 +++---
 drivers/infiniband/core/multicast.c |  6 +++---
 include/rdma/ib_verbs.h             | 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ec3a901..c06ca60 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1007,7 +1007,7 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
         mc = container_of(id_priv->mc_list.next,
                           struct cma_multicast, list);
         list_del(&mc->list);
-        if (rdma_protocol_ib(id_priv->cma_dev->device,
+        if (cap_ib_mcast(id_priv->cma_dev->device,
                              id_priv->id.port_num)) {
             ib_sa_free_multicast(mc->multicast.ib);
             kfree(mc);
@@ -3321,7 +3321,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
     if (rdma_protocol_iboe(id->device, id->port_num)) {
         kref_init(&mc->mcref);
         ret = cma_iboe_join_multicast(id_priv, mc);
-    } else if (rdma_protocol_ib(id->device, id->port_num))
+    } else if (cap_ib_mcast(id->device, id->port_num))
         ret = cma_join_ib_multicast(id_priv, mc);
     else
         ret = -ENOSYS;
@@ -3355,7 +3355,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)

             BUG_ON(id_priv->cma_dev->device != id->device);

-            if (rdma_protocol_ib(id->device, id->port_num)) {
+            if (cap_ib_mcast(id->device, id->port_num)) {
                 ib_sa_free_multicast(mc->multicast.ib);
                 kfree(mc);
             } else if (rdma_protocol_iboe(id->device, id->port_num))
diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index b57ed03..bdc1880 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
     int index;

     dev = container_of(handler, struct mcast_device, event_handler);
-    if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
+    if (WARN_ON(!cap_ib_mcast(dev->device, event->element.port_num)))
         return;

     index = event->element.port_num - dev->start_port;
@@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
     }

     for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-        if (!rdma_protocol_ib(device, dev->start_port + i))
+        if (!cap_ib_mcast(device, dev->start_port + i))
             continue;
         port = &dev->port[i];
         port->dev = dev;
@@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
     flush_workqueue(mcast_wq);

     for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-        if (rdma_protocol_ib(device, dev->start_port + i)) {
+        if (cap_ib_mcast(device, dev->start_port + i)) {
             port = &dev->port[i];
             deref_port(port);
             wait_for_completion(&port->comp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index f3d9760..dde2aa9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1849,6 +1849,21 @@ static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
     return rdma_protocol_ib(device, port_num);
 }

+/**
+ * cap_ib_mcast - Check if the port of device has the capability Infiniband
+ * Multicast.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Multicast.
+ */
+static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
+{
+    return cap_ib_sa(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
                  u8 port_num, int index, union ib_gid *gid);

--
2.1.0
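The layering this patch relies on (a capability helper built on another capability helper, which is built on a protocol query) can be modelled in a few lines of standalone C. This is a sketch under assumptions, not the kernel headers: `demo_device`, `demo_query_protocol` and the `demo_` helpers are hypothetical stand-ins for `struct ib_device`, the `query_protocol()` callback, and the helpers named in the series.

```c
#include <stdint.h>

/* Mirrors the protocol names used in this series; values are arbitrary. */
enum demo_protocol {
    DEMO_PROTOCOL_IB,
    DEMO_PROTOCOL_IBOE,
    DEMO_PROTOCOL_IWARP
};

/* Hypothetical device: one protocol per port, ports numbered from 1. */
struct demo_device {
    uint8_t num_ports;
    enum demo_protocol port_protocol[4];
};

/* Stand-in for the per-driver query_protocol() callback. */
static enum demo_protocol demo_query_protocol(const struct demo_device *dev,
                                              uint8_t port_num)
{
    return dev->port_protocol[port_num - 1];
}

/* Raw helper: is this port plain Infiniband? */
static int demo_protocol_ib(const struct demo_device *dev, uint8_t port_num)
{
    return demo_query_protocol(dev, port_num) == DEMO_PROTOCOL_IB;
}

/* Capability helper: SA support currently means "protocol is IB". */
static int demo_cap_ib_sa(const struct demo_device *dev, uint8_t port_num)
{
    return demo_protocol_ib(dev, port_num);
}

/* Capability helper: multicast support currently means "has SA". */
static int demo_cap_ib_mcast(const struct demo_device *dev, uint8_t port_num)
{
    return demo_cap_ib_sa(dev, port_num);
}
```

Callers ask about the feature they need, so if multicast ever stops being tied to SA support only the body of the multicast helper has to change.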
[PATCH v7 19/23] IB/Verbs: Use management helper cap_ib_sa()
Introduce helper cap_ib_sa() to help us check whether a port of an IB device supports Infiniband Subnet Administration.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c      |  4 ++--
 drivers/infiniband/core/sa_query.c | 10 +-
 drivers/infiniband/core/ucma.c     |  2 +-
 include/rdma/ib_verbs.h            | 15 +++
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 7d55296..ec3a901 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -933,7 +933,7 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)

 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-    if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
+    if (cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
         if (id_priv->query)
             ib_sa_cancel_query(id_priv->query_id, id_priv->query);
     }
@@ -1957,7 +1957,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
         return -EINVAL;

     atomic_inc(&id_priv->refcount);
-    if (rdma_protocol_ib(id->device, id->port_num))
+    if (cap_ib_sa(id->device, id->port_num))
         ret = cma_resolve_ib_route(id_priv, timeout_ms);
     else if (rdma_protocol_iboe(id->device, id->port_num))
         ret = cma_resolve_iboe_route(id_priv);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index b115c28..c82aa48 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
         struct ib_sa_port *port =
             &sa_dev->port[event->element.port_num - sa_dev->start_port];

-        if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
+        if (WARN_ON(!cap_ib_sa(handler->device, port->port_num)))
             return;

         spin_lock_irqsave(&port->ah_lock, flags);
@@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)

     for (i = 0; i <= e - s; ++i) {
         spin_lock_init(&sa_dev->port[i].ah_lock);
-        if (!rdma_protocol_ib(device, i + 1))
+        if (!cap_ib_sa(device, i + 1))
             continue;

         sa_dev->port[i].sm_ah = NULL;
@@ -1208,7 +1208,7 @@ static void ib_sa_add_one(struct ib_device *device)
         goto err;

     for (i = 0; i <= e - s; ++i) {
-        if (rdma_protocol_ib(device, i + 1))
+        if (cap_ib_sa(device, i + 1))
             update_sm_ah(&sa_dev->port[i].update_task);
     }

@@ -1216,7 +1216,7 @@ static void ib_sa_add_one(struct ib_device *device)

 err:
     while (--i >= 0) {
-        if (rdma_protocol_ib(device, i + 1))
+        if (cap_ib_sa(device, i + 1))
             ib_unregister_mad_agent(sa_dev->port[i].agent);
     }
 free:
@@ -1237,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
     flush_workqueue(ib_wq);

     for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-        if (rdma_protocol_ib(device, i + 1)) {
+        if (cap_ib_sa(device, i + 1)) {
             ib_unregister_mad_agent(sa_dev->port[i].agent);
             if (sa_dev->port[i].sm_ah)
                 kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index dae7620..6204065 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
     resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
     resp.port_num = ctx->cm_id->port_num;

-    if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+    if (cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
         ucma_copy_ib_route(&resp, &ctx->cm_id->route);
     else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
         ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d69e467..f3d9760 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1834,6 +1834,21 @@ static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
     return rdma_protocol_iwarp(device, port_num);
 }

+/**
+ * cap_ib_sa - Check if the port of device has the capability Infiniband
+ * Subnet Administration.
+ *
+ * @device: Device to be checked
[PATCH v7 18/23] IB/Verbs: Use management helper cap_iw_cm()
Introduce helper cap_iw_cm() to help us check whether a port of an IB device supports IWARP Communication Manager.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 14 +++---
 include/rdma/ib_verbs.h       | 15 +++
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ecb0484..7d55296 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -754,7 +754,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,

         if (qp_attr->qp_state == IB_QPS_RTR)
             qp_attr->rq_psn = id_priv->seq_num;
-    } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+    } else if (cap_iw_cm(id->device, id->port_num)) {
         if (!id_priv->cm_id.iw) {
             qp_attr->qp_access_flags = 0;
             *qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
@@ -1036,7 +1036,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
         if (cap_ib_cm(id_priv->id.device, 1)) {
             if (id_priv->cm_id.ib)
                 ib_destroy_cm_id(id_priv->cm_id.ib);
-        } else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
+        } else if (cap_iw_cm(id_priv->id.device, 1)) {
             if (id_priv->cm_id.iw)
                 iw_destroy_cm_id(id_priv->cm_id.iw);
         }
@@ -2520,7 +2520,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
             ret = cma_ib_listen(id_priv);
             if (ret)
                 goto err;
-        } else if (rdma_protocol_iwarp(id->device, 1)) {
+        } else if (cap_iw_cm(id->device, 1)) {
             ret = cma_iw_listen(id_priv, backlog);
             if (ret)
                 goto err;
@@ -2865,7 +2865,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
             ret = cma_resolve_ib_udp(id_priv, conn_param);
         else
             ret = cma_connect_ib(id_priv, conn_param);
-    } else if (rdma_protocol_iwarp(id->device, id->port_num))
+    } else if (cap_iw_cm(id->device, id->port_num))
         ret = cma_connect_iw(id_priv, conn_param);
     else
         ret = -ENOSYS;
@@ -2987,7 +2987,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
             else
                 ret = cma_rep_recv(id_priv);
         }
-    } else if (rdma_protocol_iwarp(id->device, id->port_num))
+    } else if (cap_iw_cm(id->device, id->port_num))
         ret = cma_accept_iw(id_priv, conn_param);
     else
         ret = -ENOSYS;
@@ -3042,7 +3042,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
             ret = ib_send_cm_rej(id_priv->cm_id.ib,
                                  IB_CM_REJ_CONSUMER_DEFINED, NULL,
                                  0, private_data, private_data_len);
-    } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+    } else if (cap_iw_cm(id->device, id->port_num)) {
         ret = iw_cm_reject(id_priv->cm_id.iw,
                            private_data, private_data_len);
     } else
@@ -3068,7 +3068,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
         /* Initiate or respond to a disconnect. */
         if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
             ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
-    } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+    } else if (cap_iw_cm(id->device, id->port_num)) {
         ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
     } else
         ret = -EINVAL;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 87b07f2..d69e467 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1819,6 +1819,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
     return rdma_ib_or_iboe(device, port_num);
 }

+/**
+ * cap_iw_cm - Check if the port of device has the capability IWARP
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support IWARP
+ * Communication Manager.
+ */
+static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
+{
+    return rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
                  u8 port_num, int index, union ib_gid
[PATCH v7 14/23] IB/Verbs: Reform rest part in IB-core cma
Use the raw management helpers to rework the remaining checks in IB-core cma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 3fb3458..d43f492f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -447,10 +447,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
     pkey = ntohs(addr->sib_pkey);

     list_for_each_entry(cur_dev, &dev_list, list) {
-        if (rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
-            continue;
-
         for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+            if (!rdma_ib_or_iboe(cur_dev->device, p))
+                continue;
+
             if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
                 continue;

@@ -645,10 +645,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private *id_priv,
     if (ret)
         goto out;

-    if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
-        == RDMA_TRANSPORT_IB &&
-        rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
-        == IB_LINK_LAYER_ETHERNET) {
+    BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
+
+    if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num)) {
         ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);

         if (ret)
@@ -712,11 +711,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
     int ret;
     u16 pkey;

-    if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) ==
-        IB_LINK_LAYER_INFINIBAND)
-        pkey = ib_addr_get_pkey(dev_addr);
-    else
+    if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num))
         pkey = 0xffff;
+    else
+        pkey = ib_addr_get_pkey(dev_addr);

     ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
                               pkey, &qp_attr->pkey_index);
--
2.1.0
[PATCH v7 16/23] IB/Verbs: Use management helper cap_ib_smi()
Introduce helper cap_ib_smi() to help us check whether a port of an IB device supports Infiniband Subnet Management Interface.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/agent.c |  2 +-
 drivers/infiniband/core/mad.c   |  2 +-
 include/rdma/ib_verbs.h         | 15 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index 89d4fbc..61471ee 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
         goto error1;
     }

-    if (rdma_protocol_ib(device, port_num)) {
+    if (cap_ib_smi(device, port_num)) {
         /* Obtain send only MAD agent for SMI QP */
         port_priv->agent[0] = ib_register_mad_agent(device, port_num,
                                                     IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 59459e7..ee3a05e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
     init_mad_qp(port_priv, &port_priv->qp_info[1]);

     cq_size = mad_sendq_size + mad_recvq_size;
-    has_smi = rdma_protocol_ib(device, port_num);
+    has_smi = cap_ib_smi(device, port_num);
     if (has_smi)
         cq_size *= 2;

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cb3ba2d..b364a82 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1789,6 +1789,21 @@ static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
     return rdma_ib_or_iboe(device, port_num);
 }

+/**
+ * cap_ib_smi - Check if the port of device has the capability Infiniband
+ * Subnet Management Interface.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Subnet Management Interface.
+ */
+static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
+{
+    return rdma_protocol_ib(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
                  u8 port_num, int index, union ib_gid *gid);

--
2.1.0
[PATCH v7 02/23] IB/Verbs: Implement raw management helpers
Add raw helpers:
	rdma_protocol_ib
	rdma_protocol_iboe
	rdma_protocol_iwarp
	rdma_ib_or_iboe
to help us detect which technology the port supports.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 include/rdma/ib_verbs.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 080f204..acdba60 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1752,6 +1752,28 @@ int ib_query_port(struct ib_device *device,
 enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
 					       u8 port_num);
 
+static inline int rdma_protocol_ib(struct ib_device *device, u8 port_num)
+{
+	return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IB;
+}
+
+static inline int rdma_protocol_iboe(struct ib_device *device, u8 port_num)
+{
+	return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IBOE;
+}
+
+static inline int rdma_protocol_iwarp(struct ib_device *device, u8 port_num)
+{
+	return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IWARP;
+}
+
+static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
+{
+	enum rdma_protocol_type pt = device->query_protocol(device, port_num);
+
+	return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
+}
+
 int ib_query_gid(struct ib_device *device,
 		 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0
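To try the raw helpers outside the kernel, a minimal user-space sketch could look like the following. The struct ib_device here is a stripped-down stand-in that models only the callback, the numeric enum values are arbitrary, and fake_query_protocol() with its port layout is made up for illustration. Only the four helper bodies mirror the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;

/* Protocol names follow the series; the numeric values are arbitrary here. */
enum rdma_protocol_type {
	RDMA_PROTOCOL_IB,
	RDMA_PROTOCOL_IBOE,
	RDMA_PROTOCOL_IWARP,
	RDMA_PROTOCOL_USNIC_UDP
};

/* Stand-in for the kernel structure: only the new callback is modelled. */
struct ib_device {
	enum rdma_protocol_type (*query_protocol)(struct ib_device *device,
						   u8 port_num);
};

static inline int rdma_protocol_ib(struct ib_device *device, u8 port_num)
{
	return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IB;
}

static inline int rdma_protocol_iboe(struct ib_device *device, u8 port_num)
{
	return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IBOE;
}

static inline int rdma_protocol_iwarp(struct ib_device *device, u8 port_num)
{
	return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IWARP;
}

static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
{
	enum rdma_protocol_type pt = device->query_protocol(device, port_num);

	return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
}

/*
 * Hypothetical driver callback: port 1 is native IB, port 2 is IBoE
 * (an mlx4-like mix), and any other port reports iWARP.
 */
static enum rdma_protocol_type fake_query_protocol(struct ib_device *device,
						   u8 port_num)
{
	(void)device;

	if (port_num == 1)
		return RDMA_PROTOCOL_IB;
	if (port_num == 2)
		return RDMA_PROTOCOL_IBOE;
	return RDMA_PROTOCOL_IWARP;
}

static struct ib_device fake_dev = { .query_protocol = fake_query_protocol };
```

Because each helper asks the port rather than the device, a mixed-port device answers differently for port 1 and port 2, which is the case the old node-type check could not express.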
[PATCH v7 00/23] IB/Verbs: IB Management Helpers
Since v6:
  * Thanks to Ira and Devesh for the review and testing :-)
  * Thanks for the comments from Sean, Tom, Jason, Doug, Devesh, Ira
    and Liran :-) Please remind me if I missed anything :-P
  * Use query_protocol() and enum protocol type in 1#
  * Use rdma_protocol_XX() in 2#
  * Drop cma_set_legacy_transport()
  * Reserve rdma_ib_or_iboe() and rdma_node_get_transport()
  * Updated github repository to v7

There is plenty of lengthy code that checks the transport type of an IB
device, or the link layer type of its port, when what it actually wants
to know is whether a particular management feature is supported by the
device/port.

Thus instead of inferring, we should have our own mechanism for IB
management capability/protocol/feature checking; several proposals are
listed below.

This patch set introduces query_protocol() to check the management
requirement directly instead of inferring it from the transport and the
link layer, along with a new enum for the protocol type.

Mapping List:
            node-type   link-layer  transport   protocol
nes         RNIC        ETH         IWARP       IWARP
amso1100    RNIC        ETH         IWARP       IWARP
cxgb3       RNIC        ETH         IWARP       IWARP
cxgb4       RNIC        ETH         IWARP       IWARP
usnic       USNIC_UDP   ETH         USNIC_UDP   USNIC_UDP
ocrdma      IB_CA       ETH         IB          IBOE
mlx4        IB_CA       IB/ETH      IB          IB/IBOE
mlx5        IB_CA       IB          IB          IB
ehca        IB_CA       IB          IB          IB
ipath       IB_CA       IB          IB          IB
mthca       IB_CA       IB          IB          IB
qib         IB_CA       IB          IB          IB

For example:
	if (transport == IB) && (link-layer == ETH)
will now become:
	if (query_protocol() == IBOE)

Thus we will be able to get rid of the respective transport and
link-layer checks, which makes it easier to add a new
protocol/technology (like OPA). With the management helpers introduced
here, the IB management logic also becomes clearer and easier to extend.
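The mapping list can be read as a pure function from the old (transport, link-layer) pair to the new protocol value. A small user-space sketch of that reading follows. The enums are local stand-ins, and legacy_to_protocol() with its -1 "no such row" result is illustrative only and not part of the series.

```c
#include <assert.h>

/* Local stand-ins for the legacy and new classifications. */
enum transport { TRANSPORT_IB, TRANSPORT_IWARP, TRANSPORT_USNIC_UDP };
enum link_layer { LINK_LAYER_IB, LINK_LAYER_ETH };
enum protocol {
	PROTOCOL_IB,
	PROTOCOL_IBOE,
	PROTOCOL_IWARP,
	PROTOCOL_USNIC_UDP
};

/*
 * Collapse the old two-step classification into one protocol value,
 * following the rows of the mapping list. Returns -1 for a combination
 * that does not appear in the list.
 */
static int legacy_to_protocol(enum transport t, enum link_layer ll)
{
	switch (t) {
	case TRANSPORT_IB:
		/* the only transport whose protocol depends on the link layer */
		return ll == LINK_LAYER_ETH ? PROTOCOL_IBOE : PROTOCOL_IB;
	case TRANSPORT_IWARP:
		/* every iWARP row in the list has an Ethernet link layer */
		return ll == LINK_LAYER_ETH ? PROTOCOL_IWARP : -1;
	case TRANSPORT_USNIC_UDP:
		return ll == LINK_LAYER_ETH ? PROTOCOL_USNIC_UDP : -1;
	}
	return -1;
}
```

The IB transport is the only one that needs the second input, which is why callers of the old interface had to nest a link-layer check inside the transport check.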
Highlights:
    The 'mgmt-helpers' branch of 'g...@github.com:ywang-pb/infiniband-wy.git'
    contains this series based on the latest 'infiniband/for-next'.

    The patch set covers a wide range of IB code, so suggestions from
    those who are familiar with a particular part would be invaluable ;-)

    Patches 1#~14# contain all the logic reform, 15#~23# introduce the
    management helpers.

    We would appreciate a Tested-by from anyone who has the HW :-)

    Doug suggested the bitmask mechanism:
	https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
    which could be the plan for future reforming; we prefer that to be
    another series which focuses on semantics and performance.

    This patch set is somewhat 'bloated' now and it may be a good time
    for staging. I'd like to suggest we focus on improving the existing
    helpers and push all further reforms into the next series ;-)

Proposals:
    Sean:  https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23339.html
    Doug:  https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23418.html
           https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
    Jason: https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23425.html

Michael Wang (23):
  IB/Verbs: Implement new callback query_protocol()
  IB/Verbs: Implement raw management helpers
  IB/Verbs: Reform IB-core mad/agent/user_mad
  IB/Verbs: Reform IB-core cm
  IB/Verbs: Reform IB-core sa_query
  IB/Verbs: Reform IB-core multicast
  IB/Verbs: Reform IB-ulp ipoib
  IB/Verbs: Reform IB-ulp xprtrdma
  IB/Verbs: Reform IB-core verbs
  IB/Verbs: Reform cm related part in IB-core cma/ucm
  IB/Verbs: Reform route related part in IB-core cma
  IB/Verbs: Reform mcast related part in IB-core cma
  IB/Verbs: Reform cma_acquire_dev()
  IB/Verbs: Reform rest part in IB-core cma
  IB/Verbs: Use management helper cap_ib_mad()
  IB/Verbs: Use management helper cap_ib_smi()
  IB/Verbs: Use management helper cap_ib_cm()
  IB/Verbs: Use management helper cap_iw_cm()
  IB/Verbs: Use management helper cap_ib_sa()
  IB/Verbs: Use management helper cap_ib_mcast()
  IB/Verbs: Use management helper cap_read_multi_sge()
  IB/Verbs: Use management helper cap_af_ib()
  IB/Verbs: Use management helper cap_eth_ah()

 drivers/infiniband/core/agent.c     | 2 +-
 drivers/infiniband/core/cm.c        | 20 ++-
 drivers/infiniband/core/cma.c       | 257 +++
 drivers/infiniband/core/device.c    | 1 +
 drivers/infiniband/core/mad.c       | 43 +++--
 drivers/infiniband/core/multicast.c | 12 +-
[PATCH v7 03/23] IB/Verbs: Reform IB-core mad/agent/user_mad
Use raw management helpers to reform IB-core mad/agent/user_mad.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/agent.c    | 2 +-
 drivers/infiniband/core/mad.c      | 43 +++---
 drivers/infiniband/core/user_mad.c | 26 ---
 3 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..89d4fbc 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
 		goto error1;
 	}
 
-	if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
+	if (rdma_protocol_ib(device, port_num)) {
 		/* Obtain send only MAD agent for SMI QP */
 		port_priv->agent[0] = ib_register_mad_agent(device, port_num,
 							    IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..507eb67 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
 	init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
 	cq_size = mad_sendq_size + mad_recvq_size;
-	has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND;
+	has_smi = rdma_protocol_ib(device, port_num);
 	if (has_smi)
 		cq_size *= 2;
 
@@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
 {
 	int start, end, i;
 
-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-		return;
-
 	if (device->node_type == RDMA_NODE_IB_SWITCH) {
 		start = 0;
 		end = 0;
@@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
 	}
 
 	for (i = start; i <= end; i++) {
+		if (!rdma_ib_or_iboe(device, i))
+			continue;
+
 		if (ib_mad_port_open(device, i)) {
 			dev_err(&device->dev, "Couldn't open port %d\n", i);
 			goto error;
@@ -3086,40 +3086,39 @@ error_agent:
 		dev_err(&device->dev, "Couldn't close port %d\n", i);
 
 error:
-	i--;
+	while (--i >= start) {
+		if (!rdma_ib_or_iboe(device, i))
+			continue;
 
-	while (i >= start) {
 		if (ib_agent_port_close(device, i))
 			dev_err(&device->dev,
 				"Couldn't close port %d for agents\n", i);
 		if (ib_mad_port_close(device, i))
 			dev_err(&device->dev, "Couldn't close port %d\n", i);
-		i--;
 	}
 }
 
 static void ib_mad_remove_device(struct ib_device *device)
 {
-	int i, num_ports, cur_port;
-
-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-		return;
+	int start, end, i;
 
 	if (device->node_type == RDMA_NODE_IB_SWITCH) {
-		num_ports = 1;
-		cur_port = 0;
+		start = 0;
+		end = 0;
 	} else {
-		num_ports = device->phys_port_cnt;
-		cur_port = 1;
+		start = 1;
+		end = device->phys_port_cnt;
 	}
-	for (i = 0; i < num_ports; i++, cur_port++) {
-		if (ib_agent_port_close(device, cur_port))
+
+	for (i = start; i <= end; i++) {
+		if (!rdma_ib_or_iboe(device, i))
+			continue;
+
+		if (ib_agent_port_close(device, i))
 			dev_err(&device->dev,
-				"Couldn't close port %d for agents\n",
-				cur_port);
-		if (ib_mad_port_close(device, cur_port))
-			dev_err(&device->dev, "Couldn't close port %d\n",
-				cur_port);
+				"Couldn't close port %d for agents\n", i);
+		if (ib_mad_port_close(device, i))
+			dev_err(&device->dev, "Couldn't close port %d\n", i);
 	}
 }
 
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 928cdd2..aa8b334 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
 {
 	struct ib_umad_device *umad_dev;
 	int s, e, i;
-
-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
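The reworked error path in ib_mad_init_device() walks back over the ports it has already visited and skips the ones that were never opened because they are neither IB nor IBoE. A stand-alone sketch of that unwind pattern is below. The port_supported[] table, open_port(), close_port() and the fail_on_port switch are all invented for the example and do not correspond to kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PORTS 4

/* Hypothetical per-port state; index 0 is unused so ports run 1..NUM_PORTS. */
static bool port_supported[NUM_PORTS + 1] = { false, true, false, true, true };
static bool port_open[NUM_PORTS + 1];
static int fail_on_port;	/* 0 means "never fail" */

static int open_port(int i)
{
	if (i == fail_on_port)
		return -1;
	port_open[i] = true;
	return 0;
}

static void close_port(int i)
{
	port_open[i] = false;
}

/*
 * Open every supported port in [start, end]. On failure, close the
 * ports opened so far, in reverse order, skipping unsupported ones.
 */
static int init_ports(int start, int end)
{
	int i;

	for (i = start; i <= end; i++) {
		if (!port_supported[i])
			continue;
		if (open_port(i))
			goto error;
	}
	return 0;

error:
	/* i is the port that failed, so unwinding starts at i - 1 */
	while (--i >= start) {
		if (!port_supported[i])
			continue;	/* never opened, nothing to close */
		close_port(i);
	}
	return -1;
}

/* Count how many ports are currently open. */
static int count_open_ports(void)
{
	int i, n = 0;

	for (i = 1; i <= NUM_PORTS; i++)
		n += port_open[i];
	return n;
}
```

Folding the decrement into the loop condition, as the patch does with `while (--i >= start)`, keeps the `continue` from skipping the decrement, which the old `i--` at the bottom of the loop body would have allowed.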
[PATCH v7 21/23] IB/Verbs: Use management helper cap_read_multi_sge()
Introduce helper cap_read_multi_sge() to help us check if the port of an
IB device supports RDMA Read Multiple Scatter-Gather Entries.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 include/rdma/ib_verbs.h                 | 15 +++
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index dde2aa9..cca0293 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1864,6 +1864,21 @@ static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
 	return cap_ib_sa(device, port_num);
 }
 
+/**
+ * cap_read_multi_sge - Check if the port of device has the capability
+ * RDMA Read Multiple Scatter-Gather Entries.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * RDMA Read Multiple Scatter-Gather Entries.
+ */
+static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
+{
+	return !rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 		 u8 port_num, int index, union ib_gid *gid);
 
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 2cc625d..7711b7a 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,7 +117,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-	if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+	if (!cap_read_multi_sge(xprt->sc_cm_id->device,
 				xprt->sc_cm_id->port_num))
 		return 1;
 	else
-- 
2.1.0
[PATCH v7 22/23] IB/Verbs: Use management helper cap_af_ib()
Introduce helper cap_af_ib() to help us check if the port of an
IB device supports Native InfiniBand Address.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 2 +-
 include/rdma/ib_verbs.h       | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c06ca60..c3dbcdd 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -448,7 +448,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
 
 	list_for_each_entry(cur_dev, &dev_list, list) {
 		for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
-			if (!rdma_ib_or_iboe(cur_dev->device, p))
+			if (!cap_af_ib(cur_dev->device, p))
 				continue;
 
 			if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cca0293..c045be1 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1865,6 +1865,21 @@ static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
 }
 
 /**
+ * cap_af_ib - Check if the port of device has the capability
+ * Native Infiniband Address.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * Native Infiniband Address.
+ */
+static inline int cap_af_ib(struct ib_device *device, u8 port_num)
+{
+	return rdma_ib_or_iboe(device, port_num);
+}
+
+/**
  * cap_read_multi_sge - Check if the port of device has the capability
  * RDMA Read Multiple Scatter-Gather Entries.
  *
-- 
2.1.0
[PATCH v7 12/23] IB/Verbs: Reform mcast related part in IB-core cma
Use raw management helpers to reform mcast related part in IB-core cma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 56 ++-
 1 file changed, 18 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 36c5f8a..34ec13f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -997,17 +997,12 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
 		mc = container_of(id_priv->mc_list.next,
 				  struct cma_multicast, list);
 		list_del(&mc->list);
-		switch (rdma_port_get_link_layer(id_priv->cma_dev->device, id_priv->id.port_num)) {
-		case IB_LINK_LAYER_INFINIBAND:
+		if (rdma_protocol_ib(id_priv->cma_dev->device,
+				     id_priv->id.port_num)) {
 			ib_sa_free_multicast(mc->multicast.ib);
 			kfree(mc);
-			break;
-		case IB_LINK_LAYER_ETHERNET:
+		} else
 			kref_put(&mc->mcref, release_mc);
-			break;
-		default:
-			break;
-		}
 	}
 }
 
@@ -3314,24 +3309,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
 	list_add(&mc->list, &id_priv->mc_list);
 	spin_unlock(&id_priv->lock);
 
-	switch (rdma_node_get_transport(id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
-		switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-		case IB_LINK_LAYER_INFINIBAND:
-			ret = cma_join_ib_multicast(id_priv, mc);
-			break;
-		case IB_LINK_LAYER_ETHERNET:
-			kref_init(&mc->mcref);
-			ret = cma_iboe_join_multicast(id_priv, mc);
-			break;
-		default:
-			ret = -EINVAL;
-		}
-		break;
-	default:
+	if (rdma_protocol_iboe(id->device, id->port_num)) {
+		kref_init(&mc->mcref);
+		ret = cma_iboe_join_multicast(id_priv, mc);
+	} else if (rdma_protocol_ib(id->device, id->port_num))
+		ret = cma_join_ib_multicast(id_priv, mc);
+	else
 		ret = -ENOSYS;
-		break;
-	}
 
 	if (ret) {
 		spin_lock_irq(&id_priv->lock);
@@ -3359,19 +3343,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
 				ib_detach_mcast(id->qp,
 						&mc->multicast.ib->rec.mgid,
 						be16_to_cpu(mc->multicast.ib->rec.mlid));
-			if (rdma_node_get_transport(id_priv->cma_dev->device->node_type) == RDMA_TRANSPORT_IB) {
-				switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-				case IB_LINK_LAYER_INFINIBAND:
-					ib_sa_free_multicast(mc->multicast.ib);
-					kfree(mc);
-					break;
-				case IB_LINK_LAYER_ETHERNET:
-					kref_put(&mc->mcref, release_mc);
-					break;
-				default:
-					break;
-				}
-			}
+
+			BUG_ON(id_priv->cma_dev->device != id->device);
+
+			if (rdma_protocol_ib(id->device, id->port_num)) {
+				ib_sa_free_multicast(mc->multicast.ib);
+				kfree(mc);
+			} else if (rdma_protocol_iboe(id->device, id->port_num))
+				kref_put(&mc->mcref, release_mc);
+
 			return;
 		}
 	}
-- 
2.1.0
[PATCH v7 11/23] IB/Verbs: Reform route related part in IB-core cma
Use raw management helpers to reform route related part in IB-core cma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c  | 31 ---
 drivers/infiniband/core/ucma.c | 25 ++---
 2 files changed, 14 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8a07e89..36c5f8a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -923,13 +923,9 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-	switch (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)) {
-	case IB_LINK_LAYER_INFINIBAND:
+	if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
 		if (id_priv->query)
 			ib_sa_cancel_query(id_priv->query_id, id_priv->query);
-		break;
-	default:
-		break;
 	}
 }
 
@@ -1957,26 +1953,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
 		return -EINVAL;
 
 	atomic_inc(&id_priv->refcount);
-	switch (rdma_node_get_transport(id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
-		switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-		case IB_LINK_LAYER_INFINIBAND:
-			ret = cma_resolve_ib_route(id_priv, timeout_ms);
-			break;
-		case IB_LINK_LAYER_ETHERNET:
-			ret = cma_resolve_iboe_route(id_priv);
-			break;
-		default:
-			ret = -ENOSYS;
-		}
-		break;
-	case RDMA_TRANSPORT_IWARP:
+	if (rdma_protocol_ib(id->device, id->port_num))
+		ret = cma_resolve_ib_route(id_priv, timeout_ms);
+	else if (rdma_protocol_iboe(id->device, id->port_num))
+		ret = cma_resolve_iboe_route(id_priv);
+	else if (rdma_protocol_iwarp(id->device, id->port_num))
 		ret = cma_resolve_iw_route(id_priv, timeout_ms);
-		break;
-	default:
+	else
 		ret = -ENOSYS;
-		break;
-	}
+
 	if (ret)
 		goto err;
 
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 45d67e9..dae7620 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,
 
 	resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
 	resp.port_num = ctx->cm_id->port_num;
-	switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
-	case RDMA_TRANSPORT_IB:
-		switch (rdma_port_get_link_layer(ctx->cm_id->device,
-			ctx->cm_id->port_num)) {
-		case IB_LINK_LAYER_INFINIBAND:
-			ucma_copy_ib_route(&resp, &ctx->cm_id->route);
-			break;
-		case IB_LINK_LAYER_ETHERNET:
-			ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
-			break;
-		default:
-			break;
-		}
-		break;
-	case RDMA_TRANSPORT_IWARP:
+
+	if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+		ucma_copy_ib_route(&resp, &ctx->cm_id->route);
+	else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
+		ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
+	else if (rdma_protocol_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
 		ucma_copy_iw_route(&resp, &ctx->cm_id->route);
-		break;
-	default:
-		break;
-	}
 
 out:
 	if (copy_to_user((void __user *)(unsigned long)cmd.response,
-- 
2.1.0
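One way to convince yourself that the flat chain in rdma_resolve_route() selects the same resolver as the nested switch it replaces is to model both shapes in user space and compare them. In the sketch below the enums, the resolver tags and both pick_resolver_*() functions are local stand-ins. The tags only name which cma_resolve_*_route() would be called; nothing here calls kernel code.

```c
#include <assert.h>
#include <errno.h>

enum transport { TRANSPORT_IB, TRANSPORT_IWARP, TRANSPORT_USNIC_UDP };
enum link_layer { LINK_LAYER_IB, LINK_LAYER_ETH };
enum protocol {
	PROTOCOL_IB,
	PROTOCOL_IBOE,
	PROTOCOL_IWARP,
	PROTOCOL_USNIC_UDP
};

/* Tags standing in for the three resolver functions. */
enum resolver { RESOLVE_IB = 1, RESOLVE_IBOE, RESOLVE_IW };

/* Old shape: classify by transport first, then by link layer. */
static int pick_resolver_old(enum transport t, enum link_layer ll)
{
	switch (t) {
	case TRANSPORT_IB:
		switch (ll) {
		case LINK_LAYER_IB:
			return RESOLVE_IB;
		case LINK_LAYER_ETH:
			return RESOLVE_IBOE;
		default:
			return -ENOSYS;
		}
	case TRANSPORT_IWARP:
		return RESOLVE_IW;
	default:
		return -ENOSYS;
	}
}

/* New shape: one protocol value, one flat if/else chain. */
static int pick_resolver_new(enum protocol p)
{
	if (p == PROTOCOL_IB)
		return RESOLVE_IB;
	else if (p == PROTOCOL_IBOE)
		return RESOLVE_IBOE;
	else if (p == PROTOCOL_IWARP)
		return RESOLVE_IW;
	else
		return -ENOSYS;
}
```

Both shapes fall through to -ENOSYS for usNIC, so the patch changes how the decision is spelled, not which ports can resolve a route.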
[PATCH v7 04/23] IB/Verbs: Reform IB-core cm
Use raw management helpers to reform IB-core cm.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cm.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..add5e484 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3760,11 +3760,9 @@ static void cm_add_one(struct ib_device *ib_device)
 	};
 	unsigned long flags;
 	int ret;
+	int count = 0;
 	u8 i;
 
-	if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
-		return;
-
 	cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
 			 ib_device->phys_port_cnt, GFP_KERNEL);
 	if (!cm_dev)
@@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)
 
 	set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
 	for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+		if (!rdma_ib_or_iboe(ib_device, i))
+			continue;
+
 		port = kzalloc(sizeof *port, GFP_KERNEL);
 		if (!port)
 			goto error1;
@@ -3809,7 +3810,13 @@ static void cm_add_one(struct ib_device *ib_device)
 		ret = ib_modify_port(ib_device, i, 0, &port_modify);
 		if (ret)
 			goto error3;
+
+		count++;
 	}
+
+	if (!count)
+		goto free;
+
 	ib_set_client_data(ib_device, &cm_client, cm_dev);
 
 	write_lock_irqsave(&cm.device_lock, flags);
@@ -3825,11 +3832,15 @@ error1:
 	port_modify.set_port_cap_mask = 0;
 	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
 	while (--i) {
+		if (!rdma_ib_or_iboe(ib_device, i))
+			continue;
+
 		port = cm_dev->port[i-1];
 		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
 		ib_unregister_mad_agent(port->mad_agent);
 		cm_remove_port_fs(port);
 	}
+free:
 	device_unregister(cm_dev->device);
 	kfree(cm_dev);
 }
@@ -3853,6 +3864,9 @@ static void cm_remove_one(struct ib_device *ib_device)
 	write_unlock_irqrestore(&cm.device_lock, flags);
 
 	for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+		if (!rdma_ib_or_iboe(ib_device, i))
+			continue;
+
 		port = cm_dev->port[i-1];
 		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
 		ib_unregister_mad_agent(port->mad_agent);
-- 
2.1.0
Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices
On Tue, Apr 28, 2015 at 11:32:08AM +0300, Matan Barak wrote:

> This was already asked by Haggai Eran a while ago and was answered.
> Anyway, in ib_unregister_device we delete all client's related data.
> We would like to ensure that all references were released before this
> data is being deleted. Meaning, we would like to ensure the device is
> still functioning but isn't referenced rather than just to avoid
> freeing the IB device's memory.

A kref is the wrong data structure for that purpose.

Jason
[PATCH v7 13/23] IB/Verbs: Reform cma_acquire_dev()
Reform cma_acquire_dev() with management helpers, introduce
cma_validate_port() to make the code cleaner.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 68 +--
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 34ec13f..3fb3458 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -349,18 +349,35 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
 	return ret;
 }
 
+static inline int cma_validate_port(struct ib_device *device, u8 port,
+				      union ib_gid *gid, int dev_type)
+{
+	u8 found_port;
+	int ret = -ENODEV;
+
+	if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
+		return ret;
+
+	if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
+		return ret;
+
+	ret = ib_find_cached_gid(device, gid, &found_port, NULL);
+	if (port != found_port)
+		return -ENODEV;
+
+	return ret;
+}
+
 static int cma_acquire_dev(struct rdma_id_private *id_priv,
 			   struct rdma_id_private *listen_id_priv)
 {
 	struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
 	struct cma_device *cma_dev;
-	union ib_gid gid, iboe_gid;
+	union ib_gid gid, iboe_gid, *gidp;
 	int ret = -ENODEV;
-	u8 port, found_port;
-	enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
-		IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
+	u8 port;
 
-	if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
+	if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
 	    id_priv->id.ps == RDMA_PS_IPOIB)
 		return -EINVAL;
 
@@ -370,41 +387,36 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
 
 	memcpy(&gid, dev_addr->src_dev_addr +
 	       rdma_addr_gid_offset(dev_addr), sizeof gid);
-	if (listen_id_priv &&
-	    rdma_port_get_link_layer(listen_id_priv->id.device,
-				     listen_id_priv->id.port_num) == dev_ll) {
+
+	if (listen_id_priv) {
 		cma_dev = listen_id_priv->cma_dev;
 		port = listen_id_priv->id.port_num;
-		if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
-		    rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
-			ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
-						 &found_port, NULL);
-		else
-			ret = ib_find_cached_gid(cma_dev->device, &gid,
-						 &found_port, NULL);
+		gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+		       &iboe_gid : &gid;
 
-		if (!ret && (port == found_port)) {
-			id_priv->id.port_num = found_port;
+		ret = cma_validate_port(cma_dev->device, port, gidp,
+					dev_addr->dev_type);
+		if (!ret) {
+			id_priv->id.port_num = port;
 			goto out;
 		}
 	}
+
 	list_for_each_entry(cma_dev, &dev_list, list) {
 		for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
 			if (listen_id_priv &&
 			    listen_id_priv->cma_dev == cma_dev &&
 			    listen_id_priv->id.port_num == port)
 				continue;
-			if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
-				if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
-				    rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
-					ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
-				else
-					ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
-
-				if (!ret && (port == found_port)) {
-					id_priv->id.port_num = found_port;
-					goto out;
-				}
+
+			gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+			       &iboe_gid : &gid;
+
+			ret = cma_validate_port(cma_dev->device, port, gidp,
Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices
On 4/28/2015 7:03 PM, Jason Gunthorpe wrote:
> On Tue, Apr 28, 2015 at 11:32:08AM +0300, Matan Barak wrote:
>
>> This was already asked by Haggai Eran a while ago and was answered.
>> Anyway, in ib_unregister_device we delete all client's related data.
>> We would like to ensure that all references were released before this
>> data is being deleted. Meaning, we would like to ensure the device is
>> still functioning but isn't referenced rather than just to avoid
>> freeing the IB device's memory.
>
> A kref is the wrong data structure for that purpose.

What is the right data structure in your opinion?

> Jason

Matan
RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
> For the UDP port used by the usNIC QP, the usnic_verbs kernel driver
> requires user space to pass a file descriptor of a regular UDP socket
> down at create_qp time. The reference count on this socket is
> incremented to make sure that the socket can't disappear out from
> under us. Then an RX filter is installed in the NIC which matches
> UDP/IP/Ethernet packets that are destined for the UDP port to which
> the given socket is already bound.
>
> So there is a real UDP socket to make most of the usual things happen
> in the net stack, but the raw UDP/IP/Ethernet packets get delivered
> directly to the user space queues by the NIC. E.g., netstat and lsof
> show you proper addressing information, though obviously any
> information related to data-path statistics will not be accurate.
>
> At teardown we just reverse the steps.
>
> However, I'm not sure if that's the sort of information you were
> looking for.

This is more part of the RoCEv2 discussion than this thread. But, yes,
this is what I was looking for. Conceptually, this is loosely similar to
the port mapper functionality in iWarp, with a direct port mapping.

Thanks.
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On Apr 28, 2015, at 2:53 PM, Hefty, Sean sean.he...@intel.com wrote:

>> Is the concern here about CM issues or the UDP ports used by the
>> actual usNIC RQs?
>
> UDP port space sharing

For the UDP port used by the usNIC QP, the usnic_verbs kernel driver
requires user space to pass a file descriptor of a regular UDP socket
down at create_qp time. The reference count on this socket is
incremented to make sure that the socket can't disappear out from under
us. Then an RX filter is installed in the NIC which matches
UDP/IP/Ethernet packets that are destined for the UDP port to which the
given socket is already bound.

So there is a real UDP socket to make most of the usual things happen in
the net stack, but the raw UDP/IP/Ethernet packets get delivered
directly to the user space queues by the NIC. E.g., netstat and lsof
show you proper addressing information, though obviously any information
related to data-path statistics will not be accurate.

At teardown we just reverse the steps.

However, I'm not sure if that's the sort of information you were looking
for.

-Dave
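Seen from user space, the first half of that flow is ordinary socket code: create a UDP socket, bind it, and read back the port that an RX filter would later match on. The sketch below uses only POSIX socket calls. open_bound_udp_socket() is a made-up helper, binding to loopback with a kernel-chosen port is an arbitrary choice for the example, and the hand-off of the descriptor to usnic_verbs at create_qp time is deliberately not shown because its interface is not described in this thread.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket bound to 127.0.0.1 on a kernel-chosen port.
 * Returns the descriptor, or -1 on failure, and stores the bound port
 * in host byte order in *port.
 */
static int open_bound_udp_socket(uint16_t *port)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd;

	assert(port != NULL);

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	/* bind, then ask the kernel which port it actually assigned */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*port = ntohs(addr.sin_port);
	return fd;
}
```

Since the socket stays open and bound for the lifetime of the QP, the port remains reserved in the normal UDP port space, which is what lets tools such as netstat and lsof report the address.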
Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices
On 4/27/2015 7:22 PM, Jason Gunthorpe wrote: On Mon, Apr 27, 2015 at 11:25:56AM +0300, Matan Barak wrote: On 4/26/2015 11:10 PM, Or Gerlitz wrote: On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur somnath.ko...@emulex.com wrote: From: Matan Barak mat...@mellanox.com Previously. we used device_mutex lock in order to protect the device's list. That means that in order to guarantee a device isn't freed while we use it, we had to lock all devices. Matan, looking on the cover letter, it says: [...] Patch 0002 adds a reference count mechanism to IB devices. This mechanism is similar to dev_hold and dev_put available for net devices. This is mandatory for later patches [...] So in that respect, saying here Previously. we used device_mutex lock is a bit cryptic, @ least one typo must exist in this sentence, right? did you want to say Currently we use device_mutex lock for XXX [...] and this should be replaced as of a YYY change which is to be introduced [...] please clarify Correct, I'll change that into: Currently we use device_mutex lock for protecting the device's list. In the current approach, in order to guarantee a device isn't freed we have to lock all devices. Adding a kref per IB device. Before an IB device is unregistered, we wait until it's not held anymore. Why do we need two krefs for this structure? There is already a kref inside the embedded 'struct device dev' Sounds wrong to me without a lot more explanation. This was already asked by Haggai Eran awhile ago and was answered. Anyway, in ib_unregister_device we delete all client's related data. We would like to ensure that all references were released before this data is being deleted. Meaning, we would like to ensure the device is still functioning but isn't referenced rather than just to avoid freeing the IB device's memory. ib_device_get and ib_device_hold are APIs for the clients, similar to dev_hold and dev_put. 
Jason Regards, Matan
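The hold/put idea Matan describes (clients take a reference, and unregistration does not delete client data until nobody holds the device) can be sketched in user space as follows. This is only an analogy under stated assumptions: `struct demo_device` and the `demo_device_*` functions are hypothetical names, C11 atomics stand in for the kernel's kref, and where the patch waits for the count to drop this sketch merely reports whether unregistration could proceed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IB device; not the kernel's struct ib_device. */
struct demo_device {
	atomic_int holders;	/* number of outstanding client holds */
	bool registered;	/* client data still valid while true */
};

static void demo_device_init(struct demo_device *dev)
{
	atomic_init(&dev->holders, 0);
	dev->registered = true;
}

/* A client takes a hold, in the spirit of dev_hold() for net devices. */
static void demo_device_hold(struct demo_device *dev)
{
	atomic_fetch_add(&dev->holders, 1);
}

/* A client drops its hold, in the spirit of dev_put(). */
static void demo_device_put(struct demo_device *dev)
{
	int prev = atomic_fetch_sub(&dev->holders, 1);

	assert(prev > 0);	/* a put without a matching hold is a bug */
}

/*
 * Unregistration may only tear down client data once no client holds
 * the device. A real implementation would sleep until then; this
 * sketch returns false instead of waiting.
 */
static bool demo_device_try_unregister(struct demo_device *dev)
{
	if (atomic_load(&dev->holders) != 0)
		return false;
	dev->registered = false;
	return true;
}
```

The point of the separate counter, as argued in the message, is that it tracks whether the device is still in use by clients, which is a different question from whether the memory behind the embedded struct device may be freed.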
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote: On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote: Having some of it refer to things as IBOE and some as ROCE would be similarly confusing, and switching existing IBOE usage to ROCE would cause pain to people with out of tree drivers (Lustre is the main one I know of). There's not a good answer here. There's only less sucky ones.

The tide has already turned, we should ditch iboe:

$ git grep -i roce_ drivers/infiniband/ | wc -l
91
$ git grep -i iboe_ drivers/infiniband/ | wc -l
37

It isn't really mainline's role to be too concerned about out of tree things like Lustre.

Jason
Re: [PATCH v7 04/23] IB/Verbs: Reform IB-core cm
On Tue, Apr 28, 2015 at 6:10 PM, Michael Wang yun.w...@profitbricks.com wrote: Use raw management helpers to reform IB-core cm. Cc: Hal Rosenstock h...@dev.mellanox.co.il Cc: Steve Wise sw...@opengridcomputing.com Cc: Tom Talpey t...@talpey.com Cc: Jason Gunthorpe jguntho...@obsidianresearch.com Cc: Doug Ledford dledf...@redhat.com Cc: Ira Weiny ira.we...@intel.com Cc: Sean Hefty sean.he...@intel.com Signed-off-by: Michael Wang yun.w...@profitbricks.com --- drivers/infiniband/core/cm.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) Hi Michael, I don't really see the benefit (e.g for someone doing bisection 1/2/5/10 years from now and landing here) of listing all the group of reviewers for each of the ~30 patches that make this series, any special reason that caused you doing so? Or.
recent qib patch
Al,

In commit

commit 4961772560d2f19695c73ece943716033ad62ac2
Author: Al Viro v...@zeniv.linux.org.uk
Date: Sat Apr 4 00:11:32 2015 -0400

    infinibad: weird APIs switched to ->write_iter()

    Things Not To Do When Writing A Driver, part 1001st: have writev() and write() on the same file doing completely different things. As in, interpret very different sets of commands. We _can_ handle that, but it's a bloody bad idea. Don't do that in new drivers. Ever.

    Signed-off-by: Al Viro v...@zeniv.linux.org.uk

You note an objection to qib's (and ipath's) use of the write overload for control messages. What would be an acceptable mechanism for control v.s. data path? It looks to me like this current implementation might have been to avoid using ioctl.

Mike
Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices
On Tue, Apr 28, 2015 at 8:43 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Tue, Apr 28, 2015 at 05:03:11PM +0300, Matan Barak wrote: [...] Or: This should have been fixed after Haggai brought it up... Jason, looking again on the correspondence between Matan and Haggai, I think this one was sort of left in the air (or actually fell on the floor), happens, and indeed we should strive to do better and avoid that, thanks for the 2nd eye. Or.
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On Tue, Apr 28, 2015 at 9:56 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote: On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote: Having some of it refer to things as IBOE and some as ROCE would be similarly confusing, and switching existing IBOE usage to ROCE would cause pain to people with out of tree drivers (Lustre is the main one I know of). There's not a good answer here. There's only less sucky ones. The tide has already turned, we should ditch iboe: $git grep -i roce_ drivers/infiniband/ | wc -l 91 $git grep -i iboe_ drivers/infiniband/ | wc -l 37 It isn't really mainline's role to be too concerned about out of tree things like Lustre. FWIW, note that Lustre is under staging for a while, not sure how close they are for actual acceptance. Or.
RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
Keep in mind that this enum was Liran's response to Michael's original patch. In the enum in Michael's patch, there was both USNIC and USNIC_UDP. Right! That's why I'm confused. Seems wrong to drop it, right? I think the original USNIC protocol is layered directly over Ethernet. The protocol basically stole an Ethertype (the one used for IBoE/RoCE) and implemented a proprietary protocol instead. I have no idea how you resolve that, but I also don't think it's used anymore. USNIC_UDP is just UDP. Well, if RoCEv2 uses the same protocol enum, that may introduce new confusion, for example there will be some new CM handling for UDP encap, source port selection, and of course vlan/tag assignment, etc. But if there is support under way, and everyone is clear, then, ok. RoCEv2/IBoUDP shares the same port space as UDP. It has a similar issues as iWarp does sharing state with the main network stack. I'm not aware of any proposal for resolving that. Does it require using a separate IP address? Does it use a port mapper function? Does netdev care for UDP? I'm not sure what USNIC does for this either, but a common solution between USNIC and IBoUDP seems reasonable.
Re: [PATCH v3 for-next 01/33] IB/core: Add RoCE GID cache
On 4/27/2015 9:22 PM, Or Gerlitz wrote: On Mon, Apr 27, 2015 at 10:32 AM, Matan Barak mat...@mellanox.com wrote: On 4/26/2015 8:20 PM, Or Gerlitz wrote: On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur somnath.ko...@emulex.com wrote: From: Matan Barak mat...@mellanox.com In order to manage multiple types, vlans and MACs per GID, we need to store them along the GID itself. We store the net device as well, as sometimes GIDs should be handled according to the net device they came from. Since populating the GID table should be identical for every RoCE provider, the GIDs table should be handled in ib_core. Adding a GID cache table that supports a lockless find, add and delete gids. The lockless nature comes from using a unique sequence number per table entry and detecting that while reading/ writing this sequence wasn't changed. Matan, please use existing mechanism which fits the problem you are trying to solve, I guess one of RCU or seqlock should do the job. seqcount fits this problem better. Since if a write and read are done in parallel, there's a good chance we read an out of date entry and we are going to use a GID entry that's going to change in T+epsilon, so RCU doesn't really have an advantage here. So going back to the problem... we are talking on applications/drivers that attempt to establish new connections doing reads and writes done on behalf of IP stack changes, both are very much not critical path. So this is kind of similar to the neighbour table maintained by ND subsystem which is used by all IP based networking applications and that code uses RCU. I don't see what's wrong with RCU for our sort smaller scale subsystem and what is even wrong with simple rwlock which is the mechanism used today by the IB core GID cache, this goes too complex and for no reason that I can think of. 
I think the real question is why to deal with RCUs that will require re-allocation of entries when it's not necessary or why do we want to use rwlock if the kernel provides a mechanism (called seqcount) that fits this problem better? I disagree about seqcount being complex - if you look at its API you'll find it's a lot simpler than RCU. The current implementation is a bit more efficient than seqcount, as it allows early termination of read-while-write (because the write puts a known currently updating value that the read knows to ignore). AFAIK, this doesn't exist in the current seqcount implementation. However, since this isn't a crucial data-path, I'll change that to seqcount. seqcount is preferred over seqlock, as I don't need the spinlock in seqlock.
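For readers unfamiliar with the sequence-counter pattern being debated, the following user-space sketch shows the general idea: the writer makes the counter odd while it updates an entry, and a reader retries whenever it sees an odd counter or a counter that changed during its read. It is a simplified illustration with C11 atomics, not the kernel's seqcount_t and not the code from the patch; `struct demo_entry` and the `demo_*` functions are hypothetical, the payload is a single integer rather than a GID, and writers are assumed to be serialized by some other means.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical table entry guarded by a sequence counter. */
struct demo_entry {
	atomic_uint seq;	/* odd while a write is in progress */
	atomic_uint value;	/* payload; a real entry would hold a GID */
};

static void demo_entry_init(struct demo_entry *e)
{
	atomic_init(&e->seq, 0);
	atomic_init(&e->value, 0);
}

/* Writer side: callers must already be serialized against each other. */
static void demo_write_begin(struct demo_entry *e)
{
	atomic_fetch_add(&e->seq, 1);	/* counter becomes odd */
}

static void demo_write_end(struct demo_entry *e)
{
	atomic_fetch_add(&e->seq, 1);	/* counter becomes even again */
}

/* One read attempt; returns false if it raced with a writer. */
static bool demo_try_read(struct demo_entry *e, unsigned int *out)
{
	unsigned int start = atomic_load(&e->seq);
	unsigned int v;

	if (start & 1)
		return false;	/* write in progress, caller should retry */
	v = atomic_load(&e->value);
	if (atomic_load(&e->seq) != start)
		return false;	/* entry changed while we were reading */
	*out = v;
	return true;
}
```

A reader would normally call `demo_try_read()` in a loop until it succeeds. The early bail-out on an odd counter is similar in spirit to the "early termination of read-while-write" that the message attributes to the existing implementation.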
Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
On Apr 28, 2015, at 1:14 AM, Hefty, Sean sean.he...@intel.com wrote: Keep in mind that this enum was Liran's response to Michael's original patch. In the enum in Michael's patch, there was both USNIC and USNIC_UDP. Right! That's why I'm confused. Seems wrong to drop it, right? I think the original USNIC protocol is layered directly over Ethernet. The protocol basically stole an Ethertype (the one used for IBoE/RoCE) and implemented a proprietary protocol instead. I have no idea how you resolve that, but I also don't think it's used anymore. USNIC_UDP is just UDP. Sean is correct. The legacy RDMA_TRANSPORT_USNIC code used a proprietary protocol over plain Ethernet frames. The newer RDMA_TRANSPORT_USNIC_UDP code is just standard UDP/IP/Ethernet packets exposed to user space via the uverbs stack. The current kernel module will support both formats, it just depends on which user space requests at create_qp time. From the kernel point of view there is no common protocol between the two TRANSPORTs (other than sharing partially similar Ethernet frames at L2). I posted last week to clarify some of this: http://marc.info/?l=linux-rdma&m=142972177830718&w=2 Well, if RoCEv2 uses the same protocol enum, that may introduce new confusion, for example there will be some new CM handling for UDP encap, source port selection, and of course vlan/tag assignment, etc. But if there is support under way, and everyone is clear, then, ok. RoCEv2/IBoUDP shares the same port space as UDP. It has a similar issues as iWarp does sharing state with the main network stack. I'm not aware of any proposal for resolving that. Does it require using a separate IP address? Does it use a port mapper function? Does netdev care for UDP? I'm not sure what USNIC does for this either, but a common solution between USNIC and IBoUDP seems reasonable. Is the concern here about CM issues or the UDP ports used by the actual usNIC RQs? CM is not used/supported for usNIC at this time. 
-Dave
RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()
Is the concern here about CM issues or the UDP ports used by the actual usNIC RQs? UDP port space sharing
[PATCH v7 15/23] IB/Verbs: Use management helper cap_ib_mad()
Introduce helper cap_ib_mad() to help us check if the port of an
IB device support Infiniband Management Datagrams.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/mad.c      |  6 +++---
 drivers/infiniband/core/user_mad.c |  6 +++---
 include/rdma/ib_verbs.h            | 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 507eb67..59459e7 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
 	}

 	for (i = start; i <= end; i++) {
-		if (!rdma_ib_or_iboe(device, i))
+		if (!cap_ib_mad(device, i))
 			continue;

 		if (ib_mad_port_open(device, i)) {
@@ -3087,7 +3087,7 @@ error_agent:

 error:
 	while (--i >= start) {
-		if (!rdma_ib_or_iboe(device, i))
+		if (!cap_ib_mad(device, i))
 			continue;

 		if (ib_agent_port_close(device, i))
@@ -3111,7 +3111,7 @@ static void ib_mad_remove_device(struct ib_device *device)
 	}

 	for (i = start; i <= end; i++) {
-		if (!rdma_ib_or_iboe(device, i))
+		if (!cap_ib_mad(device, i))
 			continue;

 		if (ib_agent_port_close(device, i))
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index aa8b334..e3ccbf2 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
 	umad_dev->end_port = e;

 	for (i = s; i <= e; ++i) {
-		if (!rdma_ib_or_iboe(device, i))
+		if (!cap_ib_mad(device, i))
 			continue;

 		umad_dev->port[i - s].umad_dev = umad_dev;
@@ -1315,7 +1315,7 @@ static void ib_umad_add_one(struct ib_device *device)

 err:
 	while (--i >= s) {
-		if (!rdma_ib_or_iboe(device, i))
+		if (!cap_ib_mad(device, i))
 			continue;

 		ib_umad_kill_port(&umad_dev->port[i - s]);
@@ -1333,7 +1333,7 @@ static void ib_umad_remove_one(struct ib_device *device)
 		return;

 	for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
-		if (rdma_ib_or_iboe(device, i))
+		if (cap_ib_mad(device, i))
 			ib_umad_kill_port(&umad_dev->port[i]);
 	}

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index acdba60..cb3ba2d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1774,6 +1774,21 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
 	return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
 }

+/**
+ * cap_ib_mad - Check if the port of device has the capability Infiniband
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Management Datagrams.
+ */
+static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
+{
+	return rdma_ib_or_iboe(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 		 u8 port_num, int index, union ib_gid *gid);

--
2.1.0
[PATCH v7 07/23] IB/Verbs: Reform IB-ulp ipoib
Use raw management helpers to reform IB-ulp ipoib.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7cad4dd..468fc2b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1680,9 +1680,7 @@ static void ipoib_add_one(struct ib_device *device)
 	struct net_device *dev;
 	struct ipoib_dev_priv *priv;
 	int s, e, p;
-
-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-		return;
+	int count = 0;

 	dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
 	if (!dev_list)
@@ -1699,15 +1697,21 @@ static void ipoib_add_one(struct ib_device *device)
 	}

 	for (p = s; p <= e; ++p) {
-		if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
+		if (!rdma_protocol_ib(device, p))
 			continue;
 		dev = ipoib_add_port("ib%d", device, p);
 		if (!IS_ERR(dev)) {
 			priv = netdev_priv(dev);
 			list_add_tail(&priv->list, dev_list);
+			count++;
 		}
 	}

+	if (!count) {
+		kfree(dev_list);
+		return;
+	}
+
 	ib_set_client_data(device, &ipoib_client, dev_list);
 }

@@ -1716,9 +1720,6 @@ static void ipoib_remove_one(struct ib_device *device)
 	struct ipoib_dev_priv *priv, *tmp;
 	struct list_head *dev_list;

-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-		return;
-
 	dev_list = ib_get_client_data(device, &ipoib_client);
 	if (!dev_list)
 		return;
--
2.1.0
[PATCH v7 08/23] IB/Verbs: Reform IB-ulp xprtrdma
Use raw management helpers to reform IB-ulp xprtrdma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |  4 +--
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +---
 2 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f9f13a3..2cc625d 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,

 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-	if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
-	     RDMA_TRANSPORT_IWARP)
+	if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+				xprt->sc_cm_id->port_num))
 		return 1;
 	else
 		return min_t(int, sge_count, xprt->sc_max_sge);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..3df8320 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
 	struct ib_qp_init_attr qp_attr;
 	struct ib_device_attr devattr;
 	int uninitialized_var(dma_mr_acc);
-	int need_dma_mr;
+	int need_dma_mr = 0;
 	int ret;
 	int i;

@@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
 	/*
 	 * Determine if a DMA MR is required and if so, what privs are required
 	 */
-	switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
-	case RDMA_TRANSPORT_IWARP:
-		newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
-		if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-			need_dma_mr = 1;
-			dma_mr_acc =
-				(IB_ACCESS_LOCAL_WRITE |
-				 IB_ACCESS_REMOTE_WRITE);
-		} else if (!(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
-			need_dma_mr = 1;
-			dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-		} else
-			need_dma_mr = 0;
-		break;
-	case RDMA_TRANSPORT_IB:
-		if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-			need_dma_mr = 1;
-			dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-		} else if (!(devattr.device_cap_flags &
-			     IB_DEVICE_LOCAL_DMA_LKEY)) {
-			need_dma_mr = 1;
-			dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-		} else
-			need_dma_mr = 0;
-		break;
-	default:
+	if (!rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+				 newxprt->sc_cm_id->port_num) &&
+	    !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
+			     newxprt->sc_cm_id->port_num))
 		goto errout;
+
+	if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
+	    !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
+		need_dma_mr = 1;
+		dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
+		if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+					newxprt->sc_cm_id->port_num) &&
+		    !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
+			dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
 	}

+	if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+				newxprt->sc_cm_id->port_num))
+		newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
+
 	/* Create the DMA MR if needed, otherwise, use the DMA LKEY */
 	if (need_dma_mr) {
 		/* Register all of physical memory */
--
2.1.0
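The svc_rdma_accept() hunk above collapses a per-transport switch into two flat conditions, so it is worth checking that the DMA MR decision is unchanged for every combination of inputs. The sketch below models both versions in user space and compares them exhaustively. It rests on assumptions: the operators lost from the archived diff are read as `&`, `&&` and `||`, the `DEMO_*` flag values and `demo_*` names are hypothetical, and only the iWARP and IB/IBoE cases are modelled (other transports take the errout path in both versions).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's constants. */
enum { DEMO_LOCAL_WRITE = 1, DEMO_REMOTE_WRITE = 2 };

struct demo_mr {
	bool need;	/* is a DMA MR required? */
	int acc;	/* access flags, meaningful only when need is true */
};

/* Models the removed switch statement. */
static struct demo_mr demo_old(bool iwarp, bool fast_reg, bool dma_lkey)
{
	struct demo_mr r = { false, 0 };

	if (iwarp) {
		if (!fast_reg) {
			r.need = true;
			r.acc = DEMO_LOCAL_WRITE | DEMO_REMOTE_WRITE;
		} else if (!dma_lkey) {
			r.need = true;
			r.acc = DEMO_LOCAL_WRITE;
		}
	} else {
		if (!fast_reg) {
			r.need = true;
			r.acc = DEMO_LOCAL_WRITE;
		} else if (!dma_lkey) {
			r.need = true;
			r.acc = DEMO_LOCAL_WRITE;
		}
	}
	return r;
}

/* Models the replacement logic added by the patch. */
static struct demo_mr demo_new(bool iwarp, bool fast_reg, bool dma_lkey)
{
	struct demo_mr r = { false, 0 };

	if (!fast_reg || !dma_lkey) {
		r.need = true;
		r.acc = DEMO_LOCAL_WRITE;
		if (iwarp && !fast_reg)
			r.acc |= DEMO_REMOTE_WRITE;
	}
	return r;
}

/* Walks all eight input combinations and counts disagreements. */
static int demo_count_mismatches(void)
{
	int bad = 0;

	for (int bits = 0; bits < 8; bits++) {
		bool iwarp = bits & 1;
		bool fast_reg = bits & 2;
		bool dma_lkey = bits & 4;
		struct demo_mr a = demo_old(iwarp, fast_reg, dma_lkey);
		struct demo_mr b = demo_new(iwarp, fast_reg, dma_lkey);

		if (a.need != b.need || (a.need && a.acc != b.acc))
			bad++;
	}
	return bad;
}
```

Under these assumptions the two versions agree on all eight combinations, which is consistent with the patch being a pure refactoring of this decision.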
[PATCH v7 06/23] IB/Verbs: Reform IB-core multicast
Use raw management helpers to reform IB-core multicast.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/multicast.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index fa17b55..b57ed03 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
 	int index;

 	dev = container_of(handler, struct mcast_device, event_handler);
-	if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
-	    IB_LINK_LAYER_INFINIBAND)
+	if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
 		return;

 	index = event->element.port_num - dev->start_port;
@@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
 	int i;
 	int count = 0;

-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-		return;
-
 	dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
 		      GFP_KERNEL);
 	if (!dev)
@@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
 	}

 	for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-		if (rdma_port_get_link_layer(device, dev->start_port + i) !=
-		    IB_LINK_LAYER_INFINIBAND)
+		if (!rdma_protocol_ib(device, dev->start_port + i))
 			continue;
 		port = &dev->port[i];
 		port->dev = dev;
@@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
 	flush_workqueue(mcast_wq);

 	for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-		if (rdma_port_get_link_layer(device, dev->start_port + i) ==
-		    IB_LINK_LAYER_INFINIBAND) {
+		if (rdma_protocol_ib(device, dev->start_port + i)) {
 			port = &dev->port[i];
 			deref_port(port);
 			wait_for_completion(&port->comp);
--
2.1.0
[PATCH v7 05/23] IB/Verbs: Reform IB-core sa_query
Use raw management helpers to reform IB-core sa_query.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/sa_query.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index c38f030..b115c28 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
 		struct ib_sa_port *port =
 			&sa_dev->port[event->element.port_num - sa_dev->start_port];

-		if (rdma_port_get_link_layer(handler->device, port->port_num) != IB_LINK_LAYER_INFINIBAND)
+		if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
 			return;

 		spin_lock_irqsave(&port->ah_lock, flags);
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
 	ah_attr->port_num = port_num;
 	ah_attr->static_rate = rec->rate;

-	force_grh = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_ETHERNET;
+	force_grh = rdma_protocol_iboe(device, port_num);

 	if (rec->hop_limit > 1 || force_grh) {
 		ah_attr->ah_flags = IB_AH_GRH;
@@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
 {
 	struct ib_sa_device *sa_dev;
 	int s, e, i;
-
-	if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-		return;
+	int count = 0;

 	if (device->node_type == RDMA_NODE_IB_SWITCH)
 		s = e = 0;
@@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)

 	for (i = 0; i <= e - s; ++i) {
 		spin_lock_init(&sa_dev->port[i].ah_lock);
-		if (rdma_port_get_link_layer(device, i + 1) != IB_LINK_LAYER_INFINIBAND)
+		if (!rdma_protocol_ib(device, i + 1))
 			continue;

 		sa_dev->port[i].sm_ah    = NULL;
@@ -1189,8 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
 			goto err;

 		INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
+
+		count++;
 	}

+	if (!count)
+		goto free;
+
 	ib_set_client_data(device, &sa_client, sa_dev);

 	/*
@@ -1204,19 +1207,20 @@ static void ib_sa_add_one(struct ib_device *device)
 	if (ib_register_event_handler(&sa_dev->event_handler))
 		goto err;

-	for (i = 0; i <= e - s; ++i)
-		if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+	for (i = 0; i <= e - s; ++i) {
+		if (rdma_protocol_ib(device, i + 1))
 			update_sm_ah(&sa_dev->port[i].update_task);
+	}

 	return;

 err:
-	while (--i >= 0)
-		if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+	while (--i >= 0) {
+		if (rdma_protocol_ib(device, i + 1))
 			ib_unregister_mad_agent(sa_dev->port[i].agent);
-
+	}
+free:
 	kfree(sa_dev);
-
 	return;
 }

@@ -1233,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
 	flush_workqueue(ib_wq);

 	for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-		if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND) {
+		if (rdma_protocol_ib(device, i + 1)) {
 			ib_unregister_mad_agent(sa_dev->port[i].agent);
 			if (sa_dev->port[i].sm_ah)
 				kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
--
2.1.0