Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices

2015-04-28 Thread Matan Barak



On 4/28/2015 2:51 PM, Or Gerlitz wrote:

On Mon, Apr 27, 2015 at 11:25 AM, Matan Barak mat...@mellanox.com wrote:

On 4/26/2015 11:10 PM, Or Gerlitz wrote:

On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur
somnath.ko...@emulex.com wrote:


From: Matan Barak mat...@mellanox.com

Previously, we used device_mutex lock in order to protect
the device's list. That means that in order to guarantee a
device isn't freed while we use it, we had to lock all
devices.



Matan, looking at the cover letter, it says: [...] Patch 0002 adds a
reference count mechanism to IB devices. This mechanism is similar to
dev_hold and dev_put available for net devices. This is mandatory for
later patches [...]



Correct, I'll change that into:
Currently we use device_mutex lock for protecting the device's list. In the
current approach, in order to guarantee a device isn't freed we have to lock
all devices.
Adding a kref per IB device. Before an IB device is unregistered, we wait
until it's not held anymore.


Why is this change mandatory for the proposed design?



The cleanup of roce_gid_cache is done in a different context, so we need
to make sure the device is still alive while doing so. In addition, we
don't want ib_core's unregistration process to free our context data
while it is still in use.
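As a rough illustration of the hold/put idea described above, here is a minimal single-threaded sketch in plain C. It is not the patch's code: `struct demo_ibdev` and the `demo_ibdev_*` functions are hypothetical names, and the plain integer counter stands in for the atomic reference count that a real kref provides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IB device; not the kernel's struct ib_device. */
struct demo_ibdev {
	int refcount;   /* number of holders, including the registration itself */
	bool released;  /* set once the last reference has been dropped */
};

static void demo_ibdev_init(struct demo_ibdev *dev)
{
	dev->refcount = 1;      /* reference owned by the registration */
	dev->released = false;
}

/* Take a reference before using the device from another context. */
static void demo_ibdev_hold(struct demo_ibdev *dev)
{
	assert(dev->refcount > 0);  /* a released device must not be revived */
	dev->refcount++;
}

/* Drop a reference; the last put marks the device as safe to free. */
static void demo_ibdev_put(struct demo_ibdev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->released = true;
}

/*
 * Unregistration drops the registration's own reference and reports
 * whether teardown may proceed right away (true) or must wait for the
 * remaining holders (false).
 */
static bool demo_ibdev_unregister(struct demo_ibdev *dev)
{
	demo_ibdev_put(dev);
	return dev->released;
}
```

In this sketch a cleanup path that runs in another context would call `demo_ibdev_hold()` first, so an unregister in the meantime returns false and the device stays valid until the matching `demo_ibdev_put()`.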

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Michael Wang


On 04/28/2015 03:24 AM, Doug Ledford wrote:
[snip]
 Also wondering, why add UDP to USNIC, is there a different USNIC?

 Yes, there are two transports, one a distinct ethertype and one that
 encapsulates USNIC in UDP.

 But this new enum isn't about transport, it's about protocol. So is
 there one USNIC protocol, with a raw layering and a separate one with
 UDP? Or is it one USNIC protocol with two different framings? Seems
 there should be at least the USNIC protocol, without the _UDP
 decoration, and I don't see it in the enum.
 
 Keep in mind that this enum was Liran's response to Michael's original
 patch.  In the enum in Michael's patch, there was both USNIC and
 USNIC_UDP.

Yeah, I have not added PROTOCOL_USNIC to the enum, since currently
nothing needs it...

The only three cases currently are:
1. transport IB, link layer IB  //PROTOCOL_IB
2. transport IB, link layer ETH //PROTOCOL_IBOE
3. transport IWARP  //PROTOCOL_IWARP
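The three cases listed above could be sketched as a small mapping function. This is only an illustration: the `demo_*` enums and `demo_port_protocol()` are hypothetical, and only the PROTOCOL_* names come from the list.

```c
#include <assert.h>

/* Hypothetical enums for illustration; they do not come from a kernel header. */
enum demo_transport { DEMO_TRANSPORT_IB, DEMO_TRANSPORT_IWARP };
enum demo_link_layer { DEMO_LINK_IB, DEMO_LINK_ETH };
enum demo_protocol { PROTOCOL_IB, PROTOCOL_IBOE, PROTOCOL_IWARP };

/* Map a (transport, link layer) pair onto one of the three protocols. */
static enum demo_protocol demo_port_protocol(enum demo_transport transport,
					      enum demo_link_layer link)
{
	if (transport == DEMO_TRANSPORT_IWARP)
		return PROTOCOL_IWARP;  /* case 3: link layer is not consulted */

	assert(transport == DEMO_TRANSPORT_IB);
	/* cases 1 and 2: IB transport, distinguished by link layer */
	return link == DEMO_LINK_ETH ? PROTOCOL_IBOE : PROTOCOL_IB;
}
```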

Regards,
Michael Wang

 

 Naming multiple layers together seems confusing and maybe in the end
 will create more code to deal with the differences. For example, what
 token will RoCEv2 take? RoCE_UDP, RoCE_v2 or ... ?

 Uncertain as of now.

 Ok, but it's imminent, right? What's the preference/guidance?
 
 There is a patchset from Devesh Sharma at Emulex.  It added the RoCEv2
 capability.  As I recall, it used a new flag added to the existing port
 capabilities bitmask and notably did not modify either the node type or
 link layer that are currently used to differentiate between the
 different protocols.  That's from memory though, so I could be mistaken.
 
 But that patchset was not written with this patchset in mind, and
 merging the two may well change that.  In any case, there is a proposed
 spec to follow, so for now that's the preference/guidance (unless this
 rework means that we need to depart from the spec on internals for
 implementation reasons).
 
 


Re: [PATCH for-4.1] iw_cxgb4: Fix kbuild bot reported warnings

2015-04-28 Thread Doug Ledford
On Wed, 2015-04-29 at 00:05 +0530, Hariprasad Shenai wrote:
 Commit 20dca80f (iw_cxgb4: 32b platform fixes) introduced warnings
 related to inappropriate argument type while printing arguments

The original patch has not yet been pushed upstream.  There is no need
to submit both it and a fix patch.  I've already modified my copy of
your patch to correct the errors (most of them, there were two spots I
missed that need corrected still).  However, that said...

 Reported by: Dan Carpenter dan.carpen...@oracle.com
 Reported by: kbuild test robot fengguang...@intel.com
 Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
 ---
  drivers/infiniband/hw/cxgb4/cq.c  | 5 +++--
  drivers/infiniband/hw/cxgb4/mem.c | 4 ++--
  drivers/infiniband/hw/cxgb4/qp.c  | 4 ++--
  3 files changed, 7 insertions(+), 6 deletions(-)
 
 diff --git a/drivers/infiniband/hw/cxgb4/cq.c 
 b/drivers/infiniband/hw/cxgb4/cq.c
 index be66d5d..1f114c0 100644
 --- a/drivers/infiniband/hw/cxgb4/cq.c
 +++ b/drivers/infiniband/hw/cxgb4/cq.c
 @@ -340,7 +340,8 @@ static void advance_oldest_read(struct t4_wq *wq)
   */
  void c4iw_flush_hw_cq(struct c4iw_cq *chp)
  {
 - struct t4_cqe *hw_cqe, *swcqe, read_cqe;
 + struct t4_cqe *hw_cqe = NULL;
 + struct t4_cqe *swcqe, read_cqe;
   struct c4iw_qp *qhp;
   struct t4_swsqe *swsqe;
   int ret;
 @@ -975,7 +976,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int 
 entries,
   mm2->len = PAGE_SIZE;
   insert_mmap(ucontext, mm2);
   }
 - PDBG("%s cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n",
 + PDBG("%s cqid 0x%0x chp %p size %u memsize %lu, dma_addr 0x%0llx\n",
__func__, chp->cq.cqid, chp, chp->cq.size,
(uintptr_t)chp->cq.memsize,
   ^^
   This is still wrong in your fixup.  The uintptr_t is to
be used where you want to shove an int into a ptr or vice versa and you
know that the sizes are appropriate and you don't want the compiler
complaining.  It isn't something that should be used in casting for a
printf format.  So it shouldn't have been added here in the first place.
The right fix is to remove this cast so the original memsize printf
format works properly again.

(unsigned long long) chp->cq.dma_addr);
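To make the point about format specifiers concrete, here is a small user-space sketch. `demo_format_cq()` and its parameters are hypothetical, and it uses snprintf rather than the driver's PDBG macro. It shows a size_t printed with %zu without any cast, and a 64-bit address widened explicitly to match %llx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a queue description into buf and returns
 * snprintf's result (the number of characters that were or would be written).
 */
static int demo_format_cq(char *buf, size_t len, unsigned int cqid,
			  size_t memsize, uint64_t dma_addr)
{
	assert(buf != NULL && len > 0);
	/*
	 * %zu matches size_t on both 32-bit and 64-bit targets, so memsize
	 * needs no cast. The address is cast to unsigned long long so that
	 * it matches %llx regardless of how wide the address type is.
	 */
	return snprintf(buf, len, "cqid 0x%x memsize %zu dma_addr 0x%llx",
			cqid, memsize, (unsigned long long)dma_addr);
}
```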
  


 diff --git a/drivers/infiniband/hw/cxgb4/mem.c 
 b/drivers/infiniband/hw/cxgb4/mem.c
 index 9a26649..42805f6 100644
 --- a/drivers/infiniband/hw/cxgb4/mem.c
 +++ b/drivers/infiniband/hw/cxgb4/mem.c
 @@ -930,7 +930,7 @@ struct ib_fast_reg_page_list 
 *c4iw_alloc_fastreg_pbl(struct ib_device *device,
  
   PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
__func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
 -  (void *)(uintptr_t)c4pl->dma_addr);
 +  (dma_addr_t *)(uintptr_t)c4pl->dma_addr);

This is wrong too.  It makes no sense to cast this as uintptr and then
back to dma_addr_t when it was dma_addr_t to begin with.
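For context, the kernel's printk documentation describes %pad as taking the dma_addr_t by reference, which is why casting the address value itself to a pointer cannot be right. The user-space sketch below only mimics that calling convention; `demo_dma_addr_t` and `demo_format_dma()` are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t demo_dma_addr_t;  /* hypothetical stand-in for dma_addr_t */

/*
 * Mimics a by-reference format argument: the caller passes the address
 * of the variable, and the formatter dereferences it to get the value.
 */
static int demo_format_dma(char *buf, size_t len, const demo_dma_addr_t *addr)
{
	assert(addr != NULL);
	return snprintf(buf, len, "dma_addr 0x%llx", (unsigned long long)*addr);
}
```

A caller would write `demo_format_dma(buf, sizeof(buf), &entry_dma_addr)`, passing the variable's address with no cast at all.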

 
   return &c4pl->ibpl;
  }
 @@ -941,7 +941,7 @@ void c4iw_free_fastreg_pbl(struct ib_fast_reg_page_list 
 *ibpl)
  
   PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
__func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
 -  (void *)(uintptr_t)c4pl->dma_addr);
 +  (dma_addr_t *)(uintptr_t)c4pl->dma_addr);
  
   dma_free_coherent(&c4pl->dev->rdev.lldi.pdev->dev,
 c4pl->pll_len,
 diff --git a/drivers/infiniband/hw/cxgb4/qp.c 
 b/drivers/infiniband/hw/cxgb4/qp.c
 index 7ce40c3..176a238 100644
 --- a/drivers/infiniband/hw/cxgb4/qp.c
 +++ b/drivers/infiniband/hw/cxgb4/qp.c
 @@ -1784,8 +1784,8 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct 
 ib_qp_init_attr *attrs,
   qhp->ibqp.qp_num = qhp->wq.sq.qid;
   init_timer(&(qhp->timer));
   INIT_LIST_HEAD(&qhp->db_fc_entry);
 - PDBG("%s sq id %u size %u memsize %zu num_entries %u "
 -  "rq id %u size %u memsize %zu num_entries %u\n", __func__,
 + PDBG("%s sq id %u size %u memsize %lu num_entries %u "
 +  "rq id %u size %u memsize %lu num_entries %u\n", __func__,
qhp->wq.sq.qid, qhp->wq.sq.size, (unsigned long)qhp->wq.sq.memsize,
attrs->cap.max_send_wr, qhp->wq.rq.qid, qhp->wq.rq.size,
(unsigned long)qhp->wq.rq.memsize, attrs->cap.max_recv_wr);


-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD






Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Doug Ledford
On Tue, 2015-04-28 at 22:11 +0300, Or Gerlitz wrote:
 On Tue, Apr 28, 2015 at 9:56 PM, Jason Gunthorpe
 jguntho...@obsidianresearch.com wrote:
  On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote:
  On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote:
 
  Having some of it refer to things as IBOE and some as ROCE would be
  similarly confusing, and switching existing IBOE usage to ROCE would
  cause pain to people with out of tree drivers (Lustre is the main one I
  know of).  There's not a good answer here.  There's only less sucky
  ones.
 
  The tide has already turned, we should ditch iboe:
 
  $git grep -i roce_ drivers/infiniband/ | wc -l
  91
  $git grep -i iboe_ drivers/infiniband/ | wc -l
  37
 
  It isn't really mainline's role to be too concerned about out of tree
  things like Lustre.
 
 FWIW, note that Lustre is under staging for a while, not sure how
 close they are for actual acceptance.

I thought that was just the client and didn't include the server...


-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD






Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Doug Ledford
On Tue, 2015-04-28 at 12:56 -0600, Jason Gunthorpe wrote:
 On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote:
  On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote:
 
  Having some of it refer to things as IBOE and some as ROCE would be
  similarly confusing, and switching existing IBOE usage to ROCE would
  cause pain to people with out of tree drivers (Lustre is the main one I
  know of).  There's not a good answer here.  There's only less sucky
  ones.
 
 The tide has already turned, we should ditch iboe:
 
 $git grep -i roce_ drivers/infiniband/ | wc -l
 91
 $git grep -i iboe_ drivers/infiniband/ | wc -l
 37
 
 It isn't really mainline's role to be too concerned about out of tree
 things like Lustre.

While I generally agree, one need not be totally callous about out of
tree things either.


-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD






Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices

2015-04-28 Thread Or Gerlitz
On Mon, Apr 27, 2015 at 11:25 AM, Matan Barak mat...@mellanox.com wrote:
 On 4/26/2015 11:10 PM, Or Gerlitz wrote:
 On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur
 somnath.ko...@emulex.com wrote:

 From: Matan Barak mat...@mellanox.com

 Previously, we used device_mutex lock in order to protect
 the device's list. That means that in order to guarantee a
 device isn't freed while we use it, we had to lock all
 devices.


 Matan, looking at the cover letter, it says: [...] Patch 0002 adds a
 reference count mechanism to IB devices. This mechanism is similar to
 dev_hold and dev_put available for net devices. This is mandatory for
 later patches [...]

 Correct, I'll change that into:
 Currently we use device_mutex lock for protecting the device's list. In the
 current approach, in order to guarantee a device isn't freed we have to lock
 all devices.
 Adding a kref per IB device. Before an IB device is unregistered, we wait
 until it's not held anymore.

Why is this change mandatory for the proposed design?


Re: [PATCH v3 for-next 01/33] IB/core: Add RoCE GID cache

2015-04-28 Thread Or Gerlitz
On Tue, Apr 28, 2015 at 10:17 AM, Matan Barak mat...@mellanox.com wrote:
 On 4/27/2015 9:22 PM, Or Gerlitz wrote:
 I think the real question is why to deal with RCUs that will require
 re-allocation of entries when it's not necessary or why do we want to use
 rwlock if the kernel provides a mechanism (called seqcount) that fits this
 problem better?
 I disagree about seqcount being complex - if you look at its API you'll find
 it's a lot simpler than RCU.

I took a second look: seqcount is indeed much simpler than RCU, and by
itself it is simple to use. If you feel it provides a better solution
than a simple rwlock, I'm good with that.

Or.
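For readers unfamiliar with the pattern being compared here, a sequence counter can be sketched in user space with C11 atomics. This is an assumption-laden illustration, not the kernel's seqcount_t implementation: `struct demo_entry`, `demo_write()` and `demo_read()` are hypothetical, the sketch assumes a single writer, and the payload fields are atomics purely to keep the example free of data races.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical single-writer sequence counter protecting a two-field entry. */
struct demo_entry {
	atomic_uint seq;  /* even: stable, odd: write in progress */
	atomic_int gid;   /* payload fields; atomics keep the sketch race-free */
	atomic_int ndev;
};

/* Writer side: bump the counter around the update (single writer assumed). */
static void demo_write(struct demo_entry *e, int gid, int ndev)
{
	assert((atomic_load(&e->seq) & 1u) == 0);  /* no write already in progress */
	atomic_fetch_add(&e->seq, 1);   /* becomes odd: readers will retry */
	atomic_store(&e->gid, gid);
	atomic_store(&e->ndev, ndev);
	atomic_fetch_add(&e->seq, 1);   /* becomes even: entry is consistent */
}

/* Reader side: retry until a snapshot is taken with no write in between. */
static void demo_read(struct demo_entry *e, int *gid, int *ndev)
{
	unsigned int start;

	do {
		start = atomic_load(&e->seq);
		*gid = atomic_load(&e->gid);
		*ndev = atomic_load(&e->ndev);
	} while ((start & 1u) || start != atomic_load(&e->seq));
}
```

Readers never block the writer and need no reallocation of entries; they simply retry when the counter changed underneath them, which is the property argued for above.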


[PATCH for-4.1] iw_cxgb4: Fix kbuild bot reported warnings

2015-04-28 Thread Hariprasad Shenai
Commit 20dca80f (iw_cxgb4: 32b platform fixes) introduced warnings
related to inappropriate argument type while printing arguments

Reported by: Dan Carpenter dan.carpen...@oracle.com
Reported by: kbuild test robot fengguang...@intel.com
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cq.c  | 5 +++--
 drivers/infiniband/hw/cxgb4/mem.c | 4 ++--
 drivers/infiniband/hw/cxgb4/qp.c  | 4 ++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index be66d5d..1f114c0 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -340,7 +340,8 @@ static void advance_oldest_read(struct t4_wq *wq)
  */
 void c4iw_flush_hw_cq(struct c4iw_cq *chp)
 {
-   struct t4_cqe *hw_cqe, *swcqe, read_cqe;
+   struct t4_cqe *hw_cqe = NULL;
+   struct t4_cqe *swcqe, read_cqe;
struct c4iw_qp *qhp;
struct t4_swsqe *swsqe;
int ret;
@@ -975,7 +976,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int 
entries,
mm2->len = PAGE_SIZE;
insert_mmap(ucontext, mm2);
}
-   PDBG("%s cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n",
+   PDBG("%s cqid 0x%0x chp %p size %u memsize %lu, dma_addr 0x%0llx\n",
 __func__, chp->cq.cqid, chp, chp->cq.size,
 (uintptr_t)chp->cq.memsize,
 (unsigned long long) chp->cq.dma_addr);
diff --git a/drivers/infiniband/hw/cxgb4/mem.c 
b/drivers/infiniband/hw/cxgb4/mem.c
index 9a26649..42805f6 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -930,7 +930,7 @@ struct ib_fast_reg_page_list *c4iw_alloc_fastreg_pbl(struct 
ib_device *device,
 
PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
 __func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
-(void *)(uintptr_t)c4pl->dma_addr);
+(dma_addr_t *)(uintptr_t)c4pl->dma_addr);
 
return &c4pl->ibpl;
 }
@@ -941,7 +941,7 @@ void c4iw_free_fastreg_pbl(struct ib_fast_reg_page_list 
*ibpl)
 
PDBG("%s c4pl %p pll_len %u page_list %p dma_addr %pad\n",
 __func__, c4pl, c4pl->pll_len, c4pl->ibpl.page_list,
-(void *)(uintptr_t)c4pl->dma_addr);
+(dma_addr_t *)(uintptr_t)c4pl->dma_addr);
 
dma_free_coherent(&c4pl->dev->rdev.lldi.pdev->dev,
  c4pl->pll_len,
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index 7ce40c3..176a238 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -1784,8 +1784,8 @@ struct ib_qp *c4iw_create_qp(struct ib_pd *pd, struct 
ib_qp_init_attr *attrs,
qhp->ibqp.qp_num = qhp->wq.sq.qid;
init_timer(&(qhp->timer));
INIT_LIST_HEAD(&qhp->db_fc_entry);
-   PDBG("%s sq id %u size %u memsize %zu num_entries %u "
-"rq id %u size %u memsize %zu num_entries %u\n", __func__,
+   PDBG("%s sq id %u size %u memsize %lu num_entries %u "
+"rq id %u size %u memsize %lu num_entries %u\n", __func__,
 qhp->wq.sq.qid, qhp->wq.sq.size, (unsigned long)qhp->wq.sq.memsize,
 attrs->cap.max_send_wr, qhp->wq.rq.qid, qhp->wq.rq.size,
 (unsigned long)qhp->wq.rq.memsize, attrs->cap.max_recv_wr);
-- 
2.3.4



[PATCH v7 20/23] IB/Verbs: Use management helper cap_ib_mcast()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_mcast() to help us check if the port of an
IB device supports Infiniband Multicast.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c   |  6 +++---
 drivers/infiniband/core/multicast.c |  6 +++---
 include/rdma/ib_verbs.h | 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ec3a901..c06ca60 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1007,7 +1007,7 @@ static void cma_leave_mc_groups(struct rdma_id_private 
*id_priv)
mc = container_of(id_priv->mc_list.next,
  struct cma_multicast, list);
list_del(&mc->list);
-   if (rdma_protocol_ib(id_priv->cma_dev->device,
+   if (cap_ib_mcast(id_priv->cma_dev->device,
  id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
@@ -3321,7 +3321,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr,
if (rdma_protocol_iboe(id->device, id->port_num)) {
kref_init(&mc->mcref);
ret = cma_iboe_join_multicast(id_priv, mc);
-   } else if (rdma_protocol_ib(id->device, id->port_num))
+   } else if (cap_ib_mcast(id->device, id->port_num))
ret = cma_join_ib_multicast(id_priv, mc);
else
ret = -ENOSYS;
@@ -3355,7 +3355,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr)
 
BUG_ON(id_priv->cma_dev->device != id->device);
 
-   if (rdma_protocol_ib(id->device, id->port_num)) {
+   if (cap_ib_mcast(id->device, id->port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else if (rdma_protocol_iboe(id->device, id->port_num))
diff --git a/drivers/infiniband/core/multicast.c 
b/drivers/infiniband/core/multicast.c
index b57ed03..bdc1880 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler 
*handler,
int index;
 
dev = container_of(handler, struct mcast_device, event_handler);
-   if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
+   if (WARN_ON(!cap_ib_mcast(dev->device, event->element.port_num)))
return;
 
index = event->element.port_num - dev->start_port;
@@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (!rdma_protocol_ib(device, dev->start_port + i))
+   if (!cap_ib_mcast(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_protocol_ib(device, dev->start_port + i)) {
+   if (cap_ib_mcast(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index f3d9760..dde2aa9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1849,6 +1849,21 @@ static inline int cap_ib_sa(struct ib_device *device, u8 
port_num)
return rdma_protocol_ib(device, port_num);
 }
 
+/**
+ * cap_ib_mcast - Check if the port of device has the capability Infiniband
+ * Multicast.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Multicast.
+ */
+static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
+{
+   return cap_ib_sa(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0
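The layering in this patch, where a capability helper is defined on top of another capability helper and ultimately on a per-port protocol check, can be modelled in user space roughly as follows. Everything prefixed `demo_` or `DEMO_` is hypothetical and only mirrors the shape of cap_ib_mcast() -> cap_ib_sa() -> rdma_protocol_ib(); it is not the kernel code.

```c
#include <assert.h>

enum demo_port_proto { DEMO_PROTO_IB, DEMO_PROTO_IBOE, DEMO_PROTO_IWARP };

#define DEMO_MAX_PORTS 2

/* Hypothetical device model: one protocol per port, ports numbered from 1. */
struct demo_device {
	enum demo_port_proto port_proto[DEMO_MAX_PORTS];
};

/* Raw helper: does this port speak native InfiniBand? */
static int demo_protocol_ib(const struct demo_device *dev, unsigned int port_num)
{
	assert(port_num >= 1 && port_num <= DEMO_MAX_PORTS);
	return dev->port_proto[port_num - 1] == DEMO_PROTO_IB;
}

/* Capability helper: subnet administration follows the IB protocol check. */
static int demo_cap_ib_sa(const struct demo_device *dev, unsigned int port_num)
{
	return demo_protocol_ib(dev, port_num);
}

/* Capability helper: multicast is expressed in terms of the SA capability. */
static int demo_cap_ib_mcast(const struct demo_device *dev, unsigned int port_num)
{
	return demo_cap_ib_sa(dev, port_num);
}
```

Call sites then ask about the capability they actually depend on, so a later change to how a capability is derived touches one helper instead of every caller.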



[PATCH v7 19/23] IB/Verbs: Use management helper cap_ib_sa()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_sa() to help us check if the port of an
IB device supports Infiniband Subnet Administration.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c  |  4 ++--
 drivers/infiniband/core/sa_query.c | 10 +-
 drivers/infiniband/core/ucma.c |  2 +-
 include/rdma/ib_verbs.h| 15 +++
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 7d55296..ec3a901 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -933,7 +933,7 @@ static inline int cma_user_data_offset(struct 
rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-   if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
+   if (cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
}
@@ -1957,7 +1957,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int 
timeout_ms)
return -EINVAL;
 
atomic_inc(&id_priv->refcount);
-   if (rdma_protocol_ib(id->device, id->port_num))
+   if (cap_ib_sa(id->device, id->port_num))
ret = cma_resolve_ib_route(id_priv, timeout_ms);
else if (rdma_protocol_iboe(id->device, id->port_num))
ret = cma_resolve_iboe_route(id_priv);
diff --git a/drivers/infiniband/core/sa_query.c 
b/drivers/infiniband/core/sa_query.c
index b115c28..c82aa48 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, 
struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - 
sa_dev->start_port];
 
-   if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
+   if (WARN_ON(!cap_ib_sa(handler->device, port->port_num)))
return;
 
spin_lock_irqsave(&port->ah_lock, flags);
@@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
-   if (!rdma_protocol_ib(device, i + 1))
+   if (!cap_ib_sa(device, i + 1))
continue;
 
sa_dev->port[i].sm_ah= NULL;
@@ -1208,7 +1208,7 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;
 
for (i = 0; i <= e - s; ++i) {
-   if (rdma_protocol_ib(device, i + 1))
+   if (cap_ib_sa(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
}
 
@@ -1216,7 +1216,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
 err:
while (--i >= 0) {
-   if (rdma_protocol_ib(device, i + 1))
+   if (cap_ib_sa(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
}
 free:
@@ -1237,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);
 
for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-   if (rdma_protocol_ib(device, i + 1)) {
+   if (cap_ib_sa(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, 
free_sm_ah);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index dae7620..6204065 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
 
-   if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+   if (cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_ib_route(&resp, &ctx->cm_id->route);
else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d69e467..f3d9760 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1834,6 +1834,21 @@ static inline int cap_iw_cm(struct ib_device *device, u8 
port_num)
return rdma_protocol_iwarp(device, port_num);
 }
 
+/**
+ * cap_ib_sa - Check if the port of device has the capability Infiniband
+ * Subnet Administration.
+ *
+ * @device: Device to be checked

[PATCH v7 18/23] IB/Verbs: Use management helper cap_iw_cm()

2015-04-28 Thread Michael Wang
Introduce helper cap_iw_cm() to help us check if the port of an
IB device supports IWARP Communication Manager.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 14 +++---
 include/rdma/ib_verbs.h   | 15 +++
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ecb0484..7d55296 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -754,7 +754,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct 
ib_qp_attr *qp_attr,
 
if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (cap_iw_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
@@ -1036,7 +1036,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
if (cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
-   } else if (rdma_protocol_iwarp(id_priv->id.device, 1)) {
+   } else if (cap_iw_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
}
@@ -2520,7 +2520,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
-   } else if (rdma_protocol_iwarp(id->device, 1)) {
+   } else if (cap_iw_cm(id->device, 1)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
@@ -2865,7 +2865,7 @@ int rdma_connect(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num))
+   } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -2987,7 +2987,7 @@ int rdma_accept(struct rdma_cm_id *id, struct 
rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
-   } else if (rdma_protocol_iwarp(id->device, id->port_num))
+   } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3042,7 +3042,7 @@ int rdma_reject(struct rdma_cm_id *id, const void 
*private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
 IB_CM_REJ_CONSUMER_DEFINED, NULL,
 0, private_data, private_data_len);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
   private_data, private_data_len);
} else
@@ -3068,7 +3068,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
-   } else if (rdma_protocol_iwarp(id->device, id->port_num)) {
+   } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
} else
ret = -EINVAL;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 87b07f2..d69e467 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1819,6 +1819,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 
port_num)
return rdma_ib_or_iboe(device, port_num);
 }
 
+/**
+ * cap_iw_cm - Check if the port of device has the capability IWARP
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support IWARP
+ * Communication Manager.
+ */
+static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid 

[PATCH v7 14/23] IB/Verbs: Reform rest part in IB-core cma

2015-04-28 Thread Michael Wang
Use the raw management helpers to reform the remaining parts of IB-core cma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 3fb3458..d43f492f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -447,10 +447,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private 
*id_priv)
pkey = ntohs(addr->sib_pkey);
 
list_for_each_entry(cur_dev, &dev_list, list) {
-   if (rdma_node_get_transport(cur_dev->device->node_type) != 
RDMA_TRANSPORT_IB)
-   continue;
-
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+   if (!rdma_ib_or_iboe(cur_dev->device, p))
+   continue;
+
if (ib_find_cached_pkey(cur_dev->device, p, pkey, 
&index))
continue;
 
@@ -645,10 +645,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private 
*id_priv,
if (ret)
goto out;
 
-   if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
-   == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
-   == IB_LINK_LAYER_ETHERNET) {
+   BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
+
+   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num)) {
ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);
 
if (ret)
@@ -712,11 +711,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private 
*id_priv,
int ret;
u16 pkey;
 
-   if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) 
==
-   IB_LINK_LAYER_INFINIBAND)
-   pkey = ib_addr_get_pkey(dev_addr);
-   else
+   if (rdma_protocol_iboe(id_priv->id.device, id_priv->id.port_num))
pkey = 0xffff;
+   else
+   pkey = ib_addr_get_pkey(dev_addr);
 
ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
  pkey, &qp_attr->pkey_index);
-- 
2.1.0



[PATCH v7 16/23] IB/Verbs: Use management helper cap_ib_smi()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_smi() to help us check if the port of an
IB device supports Infiniband Subnet Management Interface.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/agent.c |  2 +-
 drivers/infiniband/core/mad.c   |  2 +-
 include/rdma/ib_verbs.h | 15 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index 89d4fbc..61471ee 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int 
port_num)
goto error1;
}
 
-   if (rdma_protocol_ib(device, port_num)) {
+   if (cap_ib_smi(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 59459e7..ee3a05e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
cq_size = mad_sendq_size + mad_recvq_size;
-   has_smi = rdma_protocol_ib(device, port_num);
+   has_smi = cap_ib_smi(device, port_num);
if (has_smi)
cq_size *= 2;
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cb3ba2d..b364a82 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1789,6 +1789,21 @@ static inline int cap_ib_mad(struct ib_device *device, 
u8 port_num)
return rdma_ib_or_iboe(device, port_num);
 }
 
+/**
+ * cap_ib_smi - Check if the port of device has the capability Infiniband
+ * Subnet Management Interface.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support the Infiniband
+ * Subnet Management Interface.
+ */
+static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
+{
+   return rdma_protocol_ib(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0



[PATCH v7 02/23] IB/Verbs: Implement raw management helpers

2015-04-28 Thread Michael Wang
Add raw helpers:
  rdma_protocol_ib
  rdma_protocol_iboe
  rdma_protocol_iwarp
  rdma_ib_or_iboe
to help us detect which technology a port supports.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 include/rdma/ib_verbs.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 080f204..acdba60 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1752,6 +1752,28 @@ int ib_query_port(struct ib_device *device,
 enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
   u8 port_num);
 
+static inline int rdma_protocol_ib(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IB;
+}
+
+static inline int rdma_protocol_iboe(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IBOE;
+}
+
+static inline int rdma_protocol_iwarp(struct ib_device *device, u8 port_num)
+{
+   return device->query_protocol(device, port_num) == RDMA_PROTOCOL_IWARP;
+}
+
+static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
+{
+   enum rdma_protocol_type pt = device->query_protocol(device, port_num);
+
+   return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0



[PATCH v7 00/23] IB/Verbs: IB Management Helpers

2015-04-28 Thread Michael Wang
Since v6:
  * Thanks to Ira, Devesh for the review and testing :-)
  * Thanks for the comments from Sean, Tom, Jason, Doug, Devesh, Ira,
    Liran :-) Please remind me if I missed anything :-P
  * Use query_protocol() and enum protocol type in 1#
  * Use rdma_protocol_XX() in 2#
  * Drop cma_set_legacy_transport()
  * Reserve rdma_ib_or_iboe() and rdma_node_get_transport()
  * Updated github repository to v7

There is plenty of lengthy code that checks the transport type of an IB
device, or the link-layer type of its port, when what we actually want to
know is whether a particular management capability or feature is supported
by the device/port.

Thus instead of inferring, we should have our own mechanism for checking IB
management capabilities/protocols/features; several proposals are listed below.

This patch set introduces query_protocol(), along with a new enum for the
protocol type, to check management requirements directly instead of inferring
them from the transport and link layer.

Mapping List:
            node-type   link-layer  transport   protocol
nes         RNIC        ETH         IWARP       IWARP
amso1100    RNIC        ETH         IWARP       IWARP
cxgb3       RNIC        ETH         IWARP       IWARP
cxgb4       RNIC        ETH         IWARP       IWARP
usnic       USNIC_UDP   ETH         USNIC_UDP   USNIC_UDP
ocrdma      IB_CA       ETH         IB          IBOE
mlx4        IB_CA       IB/ETH      IB          IB/IBOE
mlx5        IB_CA       IB          IB          IB
ehca        IB_CA       IB          IB          IB
ipath       IB_CA       IB          IB          IB
mthca       IB_CA       IB          IB          IB
qib         IB_CA       IB          IB          IB

For example:
if (transport == IB) && (link-layer == ETH)
will now become:
if (query_protocol() == IBOE)

Thus we will be able to get rid of the separate transport and link-layer
checks, which will make it easier to add a new protocol/technology (like OPA).
Also, with the management helpers introduced here, the IB management logic
will be clearer and easier to extend.
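
To make the before/after concrete, here is a minimal user-space sketch of the
idea, not the kernel code itself. All enum and function names below are
hypothetical stand-ins invented for illustration; only the mapping itself is
taken from the list above.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel enums (illustration only). */
enum transport  { TRANSPORT_IB, TRANSPORT_IWARP, TRANSPORT_USNIC_UDP };
enum link_layer { LINK_IB, LINK_ETH };
enum protocol   { PROTO_IB, PROTO_IBOE, PROTO_IWARP, PROTO_USNIC_UDP };

/* Derive the protocol from the legacy (transport, link-layer) pair,
 * following the mapping list in this cover letter. */
static enum protocol protocol_from_legacy(enum transport t, enum link_layer ll)
{
    switch (t) {
    case TRANSPORT_IB:
        return ll == LINK_ETH ? PROTO_IBOE : PROTO_IB;
    case TRANSPORT_IWARP:
        return PROTO_IWARP;
    case TRANSPORT_USNIC_UDP:
        return PROTO_USNIC_UDP;
    }
    assert(!"unknown transport");
    return PROTO_IB;
}

/* Old style: the caller infers IBoE from two separate properties. */
static int is_iboe_old(enum transport t, enum link_layer ll)
{
    return t == TRANSPORT_IB && ll == LINK_ETH;
}

/* New style: a single check against the protocol. */
static int is_iboe_new(enum protocol p)
{
    return p == PROTO_IBOE;
}
```

Both checks agree for every row of the mapping list; the second simply moves
the inference into one place instead of repeating it at every call site.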

Highlights:
The 'mgmt-helpers' branch of 'g...@github.com:ywang-pb/infiniband-wy.git'
contains this series, based on the latest 'infiniband/for-next'.

The patch set covers a wide range of IB code, so suggestions from those who
are familiar with a particular part would be invaluable ;-)

Patches 1#~14# contain all the logic reforms; 15#~23# introduce the
management helpers.

We would appreciate a Tested-by from anyone who has the HW :-)

Doug suggested the bitmask mechanism:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
which could be the plan for future reforms; we would prefer that to be a
separate series focusing on semantics and performance.
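
For readers unfamiliar with the general idea, a rough user-space sketch of a
per-port capability bitmask follows. This is NOT Doug's actual proposal (see
the link for that); every name here is a hypothetical one made up for
illustration. The only parts taken from this series are the rules the shown
helpers encode: cap_ib_mad and cap_af_ib hold for IB or IBoE, cap_ib_smi
holds for IB only, and cap_read_multi_sge holds for everything except iWARP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits (illustration only). */
enum {
    CAP_IB_MAD         = 1u << 0,
    CAP_IB_SMI         = 1u << 1,
    CAP_AF_IB          = 1u << 2,
    CAP_READ_MULTI_SGE = 1u << 3,
};

/* Hypothetical stand-in for the protocol enum. */
enum protocol { PROTO_IB, PROTO_IBOE, PROTO_IWARP };

/* Compute the capability mask once per port, so that later checks
 * become a single AND instead of a protocol comparison. */
static uint32_t port_caps(enum protocol p)
{
    uint32_t caps = 0;

    if (p == PROTO_IB || p == PROTO_IBOE)
        caps |= CAP_IB_MAD | CAP_AF_IB;
    if (p == PROTO_IB)
        caps |= CAP_IB_SMI;
    if (p != PROTO_IWARP)
        caps |= CAP_READ_MULTI_SGE;

    return caps;
}

/* Return nonzero when every bit in 'cap' is set in 'caps'. */
static int port_has_cap(uint32_t caps, uint32_t cap)
{
    assert(cap != 0);
    return (caps & cap) == cap;
}
```

The attraction of such a scheme is that the mask can be cached per port and
tested cheaply; whether and how to do that is left to the follow-up series.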

This patch set is somewhat 'bloated' now, and it may be a good time for
staging. I'd like to suggest we focus on improving the existing helpers and
push all further reforms into the next series ;-)


Proposals:
Sean:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23339.html
Doug:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23418.html
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
Jason:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23425.html

Michael Wang (23):
  IB/Verbs: Implement new callback query_protocol()
  IB/Verbs: Implement raw management helpers
  IB/Verbs: Reform IB-core mad/agent/user_mad
  IB/Verbs: Reform IB-core cm
  IB/Verbs: Reform IB-core sa_query
  IB/Verbs: Reform IB-core multicast
  IB/Verbs: Reform IB-ulp ipoib
  IB/Verbs: Reform IB-ulp xprtrdma
  IB/Verbs: Reform IB-core verbs
  IB/Verbs: Reform cm related part in IB-core cma/ucm
  IB/Verbs: Reform route related part in IB-core cma
  IB/Verbs: Reform mcast related part in IB-core cma
  IB/Verbs: Reform cma_acquire_dev()
  IB/Verbs: Reform rest part in IB-core cma
  IB/Verbs: Use management helper cap_ib_mad()
  IB/Verbs: Use management helper cap_ib_smi()
  IB/Verbs: Use management helper cap_ib_cm()
  IB/Verbs: Use management helper cap_iw_cm()
  IB/Verbs: Use management helper cap_ib_sa()
  IB/Verbs: Use management helper cap_ib_mcast()
  IB/Verbs: Use management helper cap_read_multi_sge()
  IB/Verbs: Use management helper cap_af_ib()
  IB/Verbs: Use management helper cap_eth_ah()

 drivers/infiniband/core/agent.c  |   2 +-
 drivers/infiniband/core/cm.c |  20 ++-
 drivers/infiniband/core/cma.c| 257 +++
 drivers/infiniband/core/device.c |   1 +
 drivers/infiniband/core/mad.c|  43 +++--
 drivers/infiniband/core/multicast.c  |  12 +-
 

[PATCH v7 03/23] IB/Verbs: Reform IB-core mad/agent/user_mad

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core mad/agent/user_mad.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/agent.c|  2 +-
 drivers/infiniband/core/mad.c  | 43 +++---
 drivers/infiniband/core/user_mad.c | 26 ---
 3 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..89d4fbc 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int 
port_num)
goto error1;
}
 
-   if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..507eb67 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);
 
cq_size = mad_sendq_size + mad_recvq_size;
-   has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND;
+   has_smi = rdma_protocol_ib(device, port_num);
if (has_smi)
cq_size *= 2;
 
@@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
 {
int start, end, i;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
if (device->node_type == RDMA_NODE_IB_SWITCH) {
start = 0;
end   = 0;
@@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
+
if (ib_mad_port_open(device, i)) {
dev_err(&device->dev, "Couldn't open port %d\n", i);
goto error;
@@ -3086,40 +3086,39 @@ error_agent:
dev_err(&device->dev, "Couldn't close port %d\n", i);
 
 error:
-   i--;
+   while (--i >= start) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
 
-   while (i >= start) {
if (ib_agent_port_close(device, i))
dev_err(&device->dev,
"Couldn't close port %d for agents\n", i);
if (ib_mad_port_close(device, i))
dev_err(&device->dev, "Couldn't close port %d\n", i);
-   i--;
}
 }
 
 static void ib_mad_remove_device(struct ib_device *device)
 {
-   int i, num_ports, cur_port;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int start, end, i;
 
if (device->node_type == RDMA_NODE_IB_SWITCH) {
-   num_ports = 1;
-   cur_port = 0;
+   start = 0;
+   end   = 0;
} else {
-   num_ports = device->phys_port_cnt;
-   cur_port = 1;
+   start = 1;
+   end   = device->phys_port_cnt;
}
-   for (i = 0; i < num_ports; i++, cur_port++) {
-   if (ib_agent_port_close(device, cur_port))
+
+   for (i = start; i <= end; i++) {
+   if (!rdma_ib_or_iboe(device, i))
+   continue;
+
+   if (ib_agent_port_close(device, i))
dev_err(&device->dev,
-   "Couldn't close port %d for agents\n",
-   cur_port);
-   if (ib_mad_port_close(device, cur_port))
-   dev_err(&device->dev, "Couldn't close port %d\n",
-   cur_port);
+   "Couldn't close port %d for agents\n", i);
+   if (ib_mad_port_close(device, i))
+   dev_err(&device->dev, "Couldn't close port %d\n", i);
}
 }
 
diff --git a/drivers/infiniband/core/user_mad.c 
b/drivers/infiniband/core/user_mad.c
index 928cdd2..aa8b334 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
 {
struct ib_umad_device *umad_dev;
int s, e, i;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   

[PATCH v7 21/23] IB/Verbs: Use management helper cap_read_multi_sge()

2015-04-28 Thread Michael Wang
Introduce helper cap_read_multi_sge() to check whether a port of an
IB device supports RDMA Read Multiple Scatter-Gather Entries.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 include/rdma/ib_verbs.h | 15 +++
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index dde2aa9..cca0293 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1864,6 +1864,21 @@ static inline int cap_ib_mcast(struct ib_device *device, 
u8 port_num)
return cap_ib_sa(device, port_num);
 }
 
+/**
+ * cap_read_multi_sge - Check if the port of device has the capability
+ * RDMA Read Multiple Scatter-Gather Entries.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * RDMA Read Multiple Scatter-Gather Entries.
+ */
+static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
+{
+   return !rdma_protocol_iwarp(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 2cc625d..7711b7a 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,7 +117,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-   if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+   if (!cap_read_multi_sge(xprt->sc_cm_id->device,
xprt->sc_cm_id->port_num))
return 1;
else
-- 
2.1.0



[PATCH v7 22/23] IB/Verbs: Use management helper cap_af_ib()

2015-04-28 Thread Michael Wang
Introduce helper cap_af_ib() to check whether a port of an
IB device supports Native Infiniband Address.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c |  2 +-
 include/rdma/ib_verbs.h   | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c06ca60..c3dbcdd 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -448,7 +448,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private 
*id_priv)
 
list_for_each_entry(cur_dev, &dev_list, list) {
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
-   if (!rdma_ib_or_iboe(cur_dev->device, p))
+   if (!cap_af_ib(cur_dev->device, p))
continue;
 
if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cca0293..c045be1 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1865,6 +1865,21 @@ static inline int cap_ib_mcast(struct ib_device *device, 
u8 port_num)
 }
 
 /**
+ * cap_af_ib - Check if the port of device has the capability
+ * Native Infiniband Address.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * Native Infiniband Address.
+ */
+static inline int cap_af_ib(struct ib_device *device, u8 port_num)
+{
+   return rdma_ib_or_iboe(device, port_num);
+}
+
+/**
  * cap_read_multi_sge - Check if the port of device has the capability
  * RDMA Read Multiple Scatter-Gather Entries.
  *
-- 
2.1.0



[PATCH v7 12/23] IB/Verbs: Reform mcast related part in IB-core cma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform mcast related part in IB-core cma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 56 ++-
 1 file changed, 18 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 36c5f8a..34ec13f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -997,17 +997,12 @@ static void cma_leave_mc_groups(struct rdma_id_private 
*id_priv)
mc = container_of(id_priv->mc_list.next,
  struct cma_multicast, list);
list_del(&mc->list);
-   switch (rdma_port_get_link_layer(id_priv->cma_dev->device, id_priv->id.port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
+   if (rdma_protocol_ib(id_priv->cma_dev->device,
+ id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
+   } else
kref_put(&mc->mcref, release_mc);
-   break;
-   default:
-   break;
-   }
}
 }
 
@@ -3314,24 +3309,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr,
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);
 
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ret = cma_join_ib_multicast(id_priv, mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   kref_init(&mc->mcref);
-   ret = cma_iboe_join_multicast(id_priv, mc);
-   break;
-   default:
-   ret = -EINVAL;
-   }
-   break;
-   default:
+   if (rdma_protocol_iboe(id->device, id->port_num)) {
+   kref_init(&mc->mcref);
+   ret = cma_iboe_join_multicast(id_priv, mc);
+   } else if (rdma_protocol_ib(id->device, id->port_num))
+   ret = cma_join_ib_multicast(id_priv, mc);
+   else
ret = -ENOSYS;
-   break;
-   }
 
if (ret) {
spin_lock_irq(&id_priv->lock);
@@ -3359,19 +3343,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct 
sockaddr *addr)
ib_detach_mcast(id->qp,
&mc->multicast.ib->rec.mgid,
be16_to_cpu(mc->multicast.ib->rec.mlid));
-   if (rdma_node_get_transport(id_priv->cma_dev->device->node_type) == RDMA_TRANSPORT_IB) {
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ib_sa_free_multicast(mc->multicast.ib);
-   kfree(mc);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   kref_put(&mc->mcref, release_mc);
-   break;
-   default:
-   break;
-   }
-   }
+
+   BUG_ON(id_priv->cma_dev->device != id->device);
+
+   if (rdma_protocol_ib(id->device, id->port_num)) {
+   ib_sa_free_multicast(mc->multicast.ib);
+   kfree(mc);
+   } else if (rdma_protocol_iboe(id->device, id->port_num))
+   kref_put(&mc->mcref, release_mc);
+
return;
}
}
-- 
2.1.0



[PATCH v7 11/23] IB/Verbs: Reform route related part in IB-core cma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform route related part in IB-core cma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c  | 31 ---
 drivers/infiniband/core/ucma.c | 25 ++---
 2 files changed, 14 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8a07e89..36c5f8a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -923,13 +923,9 @@ static inline int cma_user_data_offset(struct 
rdma_id_private *id_priv)
 
 static void cma_cancel_route(struct rdma_id_private *id_priv)
 {
-   switch (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
+   if (rdma_protocol_ib(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
-   break;
-   default:
-   break;
}
 }
 
@@ -1957,26 +1953,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int 
timeout_ms)
return -EINVAL;
 
atomic_inc(&id_priv->refcount);
-   switch (rdma_node_get_transport(id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(id->device, id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ret = cma_resolve_ib_route(id_priv, timeout_ms);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   ret = cma_resolve_iboe_route(id_priv);
-   break;
-   default:
-   ret = -ENOSYS;
-   }
-   break;
-   case RDMA_TRANSPORT_IWARP:
+   if (rdma_protocol_ib(id->device, id->port_num))
+   ret = cma_resolve_ib_route(id_priv, timeout_ms);
+   else if (rdma_protocol_iboe(id->device, id->port_num))
+   ret = cma_resolve_iboe_route(id_priv);
+   else if (rdma_protocol_iwarp(id->device, id->port_num))
ret = cma_resolve_iw_route(id_priv, timeout_ms);
-   break;
-   default:
+   else
ret = -ENOSYS;
-   break;
-   }
+
if (ret)
goto err;
 
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 45d67e9..dae7620 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,
 
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
-   switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
-   case RDMA_TRANSPORT_IB:
-   switch (rdma_port_get_link_layer(ctx->cm_id->device,
-   ctx->cm_id->port_num)) {
-   case IB_LINK_LAYER_INFINIBAND:
-   ucma_copy_ib_route(&resp, &ctx->cm_id->route);
-   break;
-   case IB_LINK_LAYER_ETHERNET:
-   ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
-   break;
-   default:
-   break;
-   }
-   break;
-   case RDMA_TRANSPORT_IWARP:
+
+   if (rdma_protocol_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+   ucma_copy_ib_route(&resp, &ctx->cm_id->route);
+   else if (rdma_protocol_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
+   ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
+   else if (rdma_protocol_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iw_route(&resp, &ctx->cm_id->route);
-   break;
-   default:
-   break;
-   }
 
 out:
if (copy_to_user((void __user *)(unsigned long)cmd.response,
-- 
2.1.0



[PATCH v7 04/23] IB/Verbs: Reform IB-core cm

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core cm.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cm.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..add5e484 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3760,11 +3760,9 @@ static void cm_add_one(struct ib_device *ib_device)
};
unsigned long flags;
int ret;
+   int count = 0;
u8 i;
 
-   if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
 ib_device->phys_port_cnt, GFP_KERNEL);
if (!cm_dev)
@@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)
 
set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+   if (!rdma_ib_or_iboe(ib_device, i))
+   continue;
+
port = kzalloc(sizeof *port, GFP_KERNEL);
if (!port)
goto error1;
@@ -3809,7 +3810,13 @@ static void cm_add_one(struct ib_device *ib_device)
ret = ib_modify_port(ib_device, i, 0, &port_modify);
if (ret)
goto error3;
+
+   count++;
}
+
+   if (!count)
+   goto free;
+
ib_set_client_data(ib_device, &cm_client, cm_dev);
 
write_lock_irqsave(&cm.device_lock, flags);
@@ -3825,11 +3832,15 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
+   if (!rdma_ib_or_iboe(ib_device, i))
+   continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
cm_remove_port_fs(port);
}
+free:
device_unregister(cm_dev->device);
kfree(cm_dev);
 }
@@ -3853,6 +3864,9 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);
 
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+   if (!rdma_ib_or_iboe(ib_device, i))
+   continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
-- 
2.1.0



Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices

2015-04-28 Thread Jason Gunthorpe
On Tue, Apr 28, 2015 at 11:32:08AM +0300, Matan Barak wrote:

> This was already asked by Haggai Eran awhile ago and was answered.
> Anyway, in ib_unregister_device we delete all client's related data.
> We would like to ensure that all references were released before this
> data is being deleted. Meaning, we would like to ensure the device
> is still functioning but isn't referenced rather than just to avoid
> freeing the IB device's memory.

A kref is the wrong datastructure for that purpose.

Jason


[PATCH v7 13/23] IB/Verbs: Reform cma_acquire_dev()

2015-04-28 Thread Michael Wang
Reform cma_acquire_dev() with management helpers, and introduce
cma_validate_port() to make the code cleaner.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/cma.c | 68 +--
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 34ec13f..3fb3458 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -349,18 +349,35 @@ static int cma_translate_addr(struct sockaddr *addr, 
struct rdma_dev_addr *dev_a
return ret;
 }
 
+static inline int cma_validate_port(struct ib_device *device, u8 port,
+ union ib_gid *gid, int dev_type)
+{
+   u8 found_port;
+   int ret = -ENODEV;
+
+   if ((dev_type == ARPHRD_INFINIBAND) && !rdma_protocol_ib(device, port))
+   return ret;
+
+   if ((dev_type != ARPHRD_INFINIBAND) && rdma_protocol_ib(device, port))
+   return ret;
+
+   ret = ib_find_cached_gid(device, gid, &found_port, NULL);
+   if (port != found_port)
+   return -ENODEV;
+
+   return ret;
+}
+
 static int cma_acquire_dev(struct rdma_id_private *id_priv,
   struct rdma_id_private *listen_id_priv)
 {
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct cma_device *cma_dev;
-   union ib_gid gid, iboe_gid;
+   union ib_gid gid, iboe_gid, *gidp;
int ret = -ENODEV;
-   u8 port, found_port;
-   enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
-   IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
+   u8 port;
 
-   if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
+   if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
id_priv->id.ps == RDMA_PS_IPOIB)
return -EINVAL;
 
@@ -370,41 +387,36 @@ static int cma_acquire_dev(struct rdma_id_private 
*id_priv,
 
memcpy(&gid, dev_addr->src_dev_addr +
   rdma_addr_gid_offset(dev_addr), sizeof gid);
-   if (listen_id_priv &&
-   rdma_port_get_link_layer(listen_id_priv->id.device,
-   listen_id_priv->id.port_num) == dev_ll) {
+
+   if (listen_id_priv) {
cma_dev = listen_id_priv->cma_dev;
port = listen_id_priv->id.port_num;
-   if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
-   ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
-   &found_port, NULL);
-   else
-   ret = ib_find_cached_gid(cma_dev->device, &gid,
-   &found_port, NULL);
+   gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+  &iboe_gid : &gid;
 
-   if (!ret && (port  == found_port)) {
-   id_priv->id.port_num = found_port;
+   ret = cma_validate_port(cma_dev->device, port, gidp,
+   dev_addr->dev_type);
+   if (!ret) {
+   id_priv->id.port_num = port;
goto out;
}
}
+
list_for_each_entry(cma_dev, &dev_list, list) {
for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
if (listen_id_priv &&
listen_id_priv->cma_dev == cma_dev &&
listen_id_priv->id.port_num == port)
continue;
-   if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
-   if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
-   rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
-   ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
-   else
-   ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
-
-   if (!ret && (port == found_port)) {
-   id_priv->id.port_num = found_port;
-   goto out;
-   }
+
+   gidp = rdma_protocol_iboe(cma_dev->device, port) ?
+  &iboe_gid : &gid;
+
+   ret = cma_validate_port(cma_dev->device, port, gidp,
+

Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices

2015-04-28 Thread Matan Barak



On 4/28/2015 7:03 PM, Jason Gunthorpe wrote:
> On Tue, Apr 28, 2015 at 11:32:08AM +0300, Matan Barak wrote:
>
>> This was already asked by Haggai Eran awhile ago and was answered.
>> Anyway, in ib_unregister_device we delete all client's related data.
>> We would like to ensure that all references were released before this
>> data is being deleted. Meaning, we would like to ensure the device
>> is still functioning but isn't referenced rather than just to avoid
>> freeing the IB device's memory.
>
> A kref is the wrong datastructure for that purpose.

What is the right data structure in your opinion?

> Jason

Matan


RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Hefty, Sean
> For the UDP port used by the usNIC QP, the usnic_verbs kernel driver
> requires user space to pass a file descriptor of a regular UDP socket down
> at create_qp time.  The reference count on this socket is incremented to
> make sure that the socket can't disappear out from under us.  Then an RX
> filter is installed in the NIC which matches UDP/IP/Ethernet packets that
> are destined for the UDP port to which the given socket is already bound.
> So there is a real UDP socket to make most of the usual things happen in
> the net stack, but the raw UDP/IP/Ethernet packets get delivered directly
> to the user space queues by the NIC.  E.g., netstat and lsof show you
> proper addressing information, though obviously any information related to
> data-path statistics will not be accurate.  At teardown we just reverse
> the steps.
>
> However, I'm not sure if that's the sort of information you were looking
> for.

This is more part of the RoCEv2 discussion than this thread.  But, yes, this is 
what I was looking for.  Conceptually, this is loosely similar to the port 
mapper functionality in iWarp, with a direct port mapping. Thanks.




Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Dave Goodell (dgoodell)
On Apr 28, 2015, at 2:53 PM, Hefty, Sean sean.he...@intel.com wrote:

 Is the concern here about CM issues or the UDP ports used by the actual
 usNIC RQs? 
 
 UDP port space sharing

For the UDP port used by the usNIC QP, the usnic_verbs kernel driver requires 
user space to pass a file descriptor of a regular UDP socket down at create_qp 
time.  The reference count on this socket is incremented to make sure that the 
socket can't disappear out from under us.  Then an RX filter is installed in 
the NIC which matches UDP/IP/Ethernet packets that are destined for the UDP 
port to which the given socket is already bound.  So there is a real UDP socket 
to make most of the usual things happen in the net stack, but the raw 
UDP/IP/Ethernet packets get delivered directly to the user space queues by the 
NIC.  E.g., netstat and lsof show you proper addressing information, though 
obviously any information related to data-path statistics will not be accurate. 
 At teardown we just reverse the steps.

However, I'm not sure if that's the sort of information you were looking for.

-Dave
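
For readers following along, the user-space half of the flow described above might look roughly like the sketch below. It only opens a regular UDP socket, binds it, and reads back the port the kernel assigned. The helper name demo_open_bound_udp() is hypothetical, and the step that hands the descriptor to usnic_verbs at create_qp time is driver-specific and deliberately not shown.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any usNIC library): open a regular UDP
 * socket bound to an ephemeral loopback port.  Returns the descriptor
 * (>= 0) and stores the bound port in host byte order, or returns -1.
 */
static int demo_open_bound_udp(unsigned short *port_out)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd;

	assert(port_out != NULL);

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	/* This is the port an RX filter would have to match on. */
	*port_out = ntohs(addr.sin_port);
	return fd;
}
```

Because the socket stays open and bound for the lifetime of the QP, the port remains reserved in the normal UDP port space, which is why tools like netstat and lsof can report sensible addressing information.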



Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices

2015-04-28 Thread Matan Barak



On 4/27/2015 7:22 PM, Jason Gunthorpe wrote:

On Mon, Apr 27, 2015 at 11:25:56AM +0300, Matan Barak wrote:



On 4/26/2015 11:10 PM, Or Gerlitz wrote:

On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur
somnath.ko...@emulex.com wrote:

From: Matan Barak mat...@mellanox.com

Previously. we used device_mutex lock in order to protect
the device's list. That means that in order to guarantee a
device isn't freed while we use it, we had to lock all
devices.


Matan, looking on the cover letter, it says: [...] Patch 0002 adds a
reference count mechanism to IB devices. This mechanism is similar to
dev_hold and dev_put available for net devices. This is mandatory for
later patches [...]

So in that respect, saying here "Previously. we used device_mutex
lock" is a bit cryptic, at least one typo must exist in this sentence,
right? Did you want to say "Currently we use device_mutex lock for XXX
[...] and this should be replaced as of a YYY change which is to be
introduced [...]"? Please clarify.


Correct, I'll change that into:

Currently we use device_mutex lock for protecting the device's list.
In the current approach, in order to guarantee a device isn't freed
we have to lock all devices.

Adding a kref per IB device. Before an IB device is unregistered, we
wait until it's not held anymore.


Why do we need two krefs for this structure? There is already a kref
inside the embedded 'struct device dev'

Sounds wrong to me without a lot more explanation.



This was already asked by Haggai Eran a while ago and was answered.
Anyway, in ib_unregister_device we delete all client-related data.

We would like to ensure that all references were released before this
data is deleted. That is, we want to ensure the device is still
functioning but no longer referenced, rather than just avoid freeing
the IB device's memory.

ib_device_get and ib_device_hold are APIs for the clients, similar to
dev_hold and dev_put.


Jason



Regards,
Matan
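
To make the "hold, put, and wait before unregistering" idea concrete, here is a minimal user-space sketch. It is not the patch's code and does not use the kernel's kref; struct demo_device and the demo_device_*() functions are hypothetical names, and a pthread mutex plus condition variable stand in for whatever waiting primitive the kernel code would use.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for an IB device; not the kernel's struct ib_device. */
struct demo_device {
	pthread_mutex_t lock;
	pthread_cond_t  released;	/* signalled when refs drops to zero */
	int             refs;		/* 1 for registration + 1 per holder */
};

static void demo_device_register(struct demo_device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	pthread_cond_init(&dev->released, NULL);
	dev->refs = 1;
}

/* Similar in spirit to dev_hold(): pin the device while a client uses it. */
static void demo_device_hold(struct demo_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	assert(dev->refs > 0);
	dev->refs++;
	pthread_mutex_unlock(&dev->lock);
}

/* Similar in spirit to dev_put(): drop a reference, wake a waiter at zero. */
static void demo_device_put(struct demo_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	assert(dev->refs > 0);
	if (--dev->refs == 0)
		pthread_cond_broadcast(&dev->released);
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Drop the registration reference, then block until every holder has let
 * go.  Only after this returns would client data be torn down.  Returns
 * the final count, which is always 0.
 */
static int demo_device_unregister(struct demo_device *dev)
{
	int refs;

	demo_device_put(dev);

	pthread_mutex_lock(&dev->lock);
	while (dev->refs > 0)
		pthread_cond_wait(&dev->released, &dev->lock);
	refs = dev->refs;
	pthread_mutex_unlock(&dev->lock);
	return refs;
}
```

The point of the sketch is that the counter alone is not enough: the unregister path also needs something to sleep on until the count reaches zero, which is a different requirement from merely deferring the free of the device memory.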


Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Jason Gunthorpe
On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote:
 On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote:

 Having some of it refer to things as IBOE and some as ROCE would be
 similarly confusing, and switching existing IBOE usage to ROCE would
 cause pain to people with out of tree drivers (Lustre is the main one I
 know of).  There's not a good answer here.  There's only less sucky
 ones.

The tide has already turned, we should ditch iboe:

$git grep -i roce_ drivers/infiniband/ | wc -l
91
$git grep -i iboe_ drivers/infiniband/ | wc -l
37

It isn't really mainline's role to be too concerned about out of tree
things like Lustre.

Jason


Re: [PATCH v7 04/23] IB/Verbs: Reform IB-core cm

2015-04-28 Thread Or Gerlitz
On Tue, Apr 28, 2015 at 6:10 PM, Michael Wang yun.w...@profitbricks.com wrote:
 Use raw management helpers to reform IB-core cm.

 Cc: Hal Rosenstock h...@dev.mellanox.co.il
 Cc: Steve Wise sw...@opengridcomputing.com
 Cc: Tom Talpey t...@talpey.com
 Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
 Cc: Doug Ledford dledf...@redhat.com
 Cc: Ira Weiny ira.we...@intel.com
 Cc: Sean Hefty sean.he...@intel.com
 Signed-off-by: Michael Wang yun.w...@profitbricks.com
 ---
  drivers/infiniband/core/cm.c | 20 +---
  1 file changed, 17 insertions(+), 3 deletions(-)

Hi Michael,

I don't really see the benefit (e.g. for someone doing bisection
1/2/5/10 years from now and landing here) of listing the whole group of
reviewers on each of the ~30 patches that make up this series. Is there
any special reason you did so?

Or.


recent qib patch

2015-04-28 Thread Marciniszyn, Mike
Al,

In commit 

commit 4961772560d2f19695c73ece943716033ad62ac2
Author: Al Viro v...@zeniv.linux.org.uk
Date:   Sat Apr 4 00:11:32 2015 -0400

    infinibad: weird APIs switched to ->write_iter()

    Things Not To Do When Writing A Driver, part 1001st:
    have writev() and write() on the same file doing completely
    different things.  As in, interpret very different sets of
    commands.

    We _can_ handle that, but it's a bloody bad idea.
    Don't do that in new drivers.  Ever.

    Signed-off-by: Al Viro v...@zeniv.linux.org.uk

You note an objection to qib's (and ipath's) use of the write overload for 
control messages.

What would be an acceptable mechanism for separating the control path from the data path?

It looks to me like this current implementation might have been to avoid using 
ioctl.

Mike





Re: [PATCH v3 for-next 02/33] IB/core: Add kref to IB devices

2015-04-28 Thread Or Gerlitz
On Tue, Apr 28, 2015 at 8:43 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
 On Tue, Apr 28, 2015 at 05:03:11PM +0300, Matan Barak wrote:

[...]
 Or: This should have been fixed after Haggai brought it up...

Jason, looking again at the correspondence between Matan and Haggai, I
think this one was sort of left in the air (or actually fell on the
floor). It happens, and indeed we should strive to do better and avoid
that. Thanks for the second look.

Or.


Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Or Gerlitz
On Tue, Apr 28, 2015 at 9:56 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
 On Mon, Apr 27, 2015 at 09:24:35PM -0400, Doug Ledford wrote:
 On Mon, 2015-04-27 at 17:53 -0700, Tom Talpey wrote:

 Having some of it refer to things as IBOE and some as ROCE would be
 similarly confusing, and switching existing IBOE usage to ROCE would
 cause pain to people with out of tree drivers (Lustre is the main one I
 know of).  There's not a good answer here.  There's only less sucky
 ones.

 The tide has already turned, we should ditch iboe:

 $git grep -i roce_ drivers/infiniband/ | wc -l
 91
 $git grep -i iboe_ drivers/infiniband/ | wc -l
 37

 It isn't really mainline's role to be too concerned about out of tree
 things like Lustre.

FWIW, note that Lustre has been under staging for a while; I'm not sure
how close it is to actual acceptance.

Or.


RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Hefty, Sean
> > Keep in mind that this enum was Liran's response to Michael's original
> > patch.  In the enum in Michael's patch, there was both USNIC and
> > USNIC_UDP.
>
> Right! That's why I'm confused. Seems wrong to drop it, right?

I think the original USNIC protocol is layered directly over Ethernet.  The 
protocol basically stole an Ethertype (the one used for IBoE/RoCE) and 
implemented a proprietary protocol instead.  I have no idea how you resolve 
that, but I also don't think it's used anymore.  USNIC_UDP is just UDP.

> Well, if RoCEv2 uses the same protocol enum, that may introduce new
> confusion, for example there will be some new CM handling for UDP encap,
> source port selection, and of course vlan/tag assignment, etc. But if
> there is support under way, and everyone is clear, then, ok.

RoCEv2/IBoUDP shares the same port space as UDP.  It has similar issues to 
iWarp in sharing state with the main network stack.  I'm not aware of any 
proposal for resolving that.  Does it require using a separate IP address?  
Does it use a port mapper function?  Does netdev care for UDP?  I'm not sure 
what USNIC does for this either, but a common solution between USNIC and IBoUDP 
seems reasonable.



Re: [PATCH v3 for-next 01/33] IB/core: Add RoCE GID cache

2015-04-28 Thread Matan Barak



On 4/27/2015 9:22 PM, Or Gerlitz wrote:

On Mon, Apr 27, 2015 at 10:32 AM, Matan Barak mat...@mellanox.com wrote:

On 4/26/2015 8:20 PM, Or Gerlitz wrote:

On Thu, Mar 26, 2015 at 12:19 AM, Somnath Kotur
somnath.ko...@emulex.com wrote:

From: Matan Barak mat...@mellanox.com



In order to manage multiple types, vlans and MACs per GID, we
need to store them along the GID itself. We store the net device
as well, as sometimes GIDs should be handled according to the
net device they came from. Since populating the GID table should
be identical for every RoCE provider, the GIDs table should be
handled in ib_core.

Adding a GID cache table that supports a lockless find, add and
delete gids. The lockless nature comes from using a unique
sequence number per table entry and detecting that while reading/
writing this sequence wasn't changed.



Matan, please use existing mechanism which fits the problem you are
trying to solve, I guess one of RCU or seqlock should do the job.



seqcount fits this problem better. Since if a write and read are done in
parallel, there's a good chance we read an out of date entry and we are
going to use a GID entry that's going to change in T+epsilon, so RCU doesn't
really have an advantage here.


So going back to the problem... we are talking about applications/drivers
that attempt to establish new connections doing reads, and writes done
on behalf of IP stack changes; neither is anywhere near the critical path.
So this is kind of similar to the neighbour table maintained by the ND
subsystem, which is used by all IP-based networking applications, and
that code uses RCU. I don't see what's wrong with RCU for our sort of
smaller-scale subsystem, or even what is wrong with a simple rwlock,
which is the mechanism used today by the IB core GID cache. This is
getting too complex, and for no reason that I can think of.



I think the real question is why deal with RCU, which will require 
re-allocation of entries when that's not necessary, or why use an 
rwlock when the kernel provides a mechanism (called seqcount) that 
fits this problem better?
I disagree about seqcount being complex - if you look at its API you'll 
find it's a lot simpler than RCU.





The current implementation is a bit more efficient than seqcount, as it
allows early termination of read-while-write (because the write puts a known
"currently updating" value that the read knows to ignore). AFAIK, this
doesn't exist in the current seqcount implementation. However, since this
isn't a crucial data path, I'll change that to seqcount.

seqcount is preferred over seqlock, as I don't need the spinlock in seqlock.
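
As a rough illustration of the sequence-counter idea discussed here, the following user-space sketch uses C11 atomics rather than the kernel's seqcount_t. The struct and function names are hypothetical, the entry layout is invented for the example, and a single writer is assumed (concurrent writers would still need their own serialization).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table entry; the real patch's layout is not shown here. */
struct demo_gid_entry {
	atomic_uint      seq;	/* even: stable, odd: write in progress */
	_Atomic uint64_t gid_hi;
	_Atomic uint64_t gid_lo;
};

static void demo_gid_init(struct demo_gid_entry *e)
{
	atomic_init(&e->seq, 0u);
	atomic_init(&e->gid_hi, 0);
	atomic_init(&e->gid_lo, 0);
}

/* Single writer assumed: bump to odd, update, bump back to even. */
static void demo_gid_write(struct demo_gid_entry *e, uint64_t hi, uint64_t lo)
{
	unsigned int prev = atomic_fetch_add(&e->seq, 1u);

	assert((prev & 1u) == 0);	/* no other write may be in flight */
	atomic_store(&e->gid_hi, hi);
	atomic_store(&e->gid_lo, lo);
	atomic_fetch_add(&e->seq, 1u);
}

/*
 * One read attempt.  Returns false if a write was in progress or the
 * sequence changed underneath us, in which case the caller retries.
 */
static bool demo_gid_try_read(struct demo_gid_entry *e,
			      uint64_t *hi, uint64_t *lo)
{
	unsigned int start = atomic_load(&e->seq);

	if (start & 1u)
		return false;	/* writer active: bail out early */
	*hi = atomic_load(&e->gid_hi);
	*lo = atomic_load(&e->gid_lo);
	return atomic_load(&e->seq) == start;
}
```

Readers never block the writer and never allocate; the cost is that a reader which races with an update has to try again, which is acceptable when reads and writes are both off the data path.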



Re: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Dave Goodell (dgoodell)
On Apr 28, 2015, at 1:14 AM, Hefty, Sean sean.he...@intel.com wrote:

 Keep in mind that this enum was Liran's response to Michael's original
 patch.  In the enum in Michael's patch, there was both USNIC and
 USNIC_UDP.
 
 Right! That's why I'm confused. Seems wrong to drop it, right?
 
 I think the original USNIC protocol is layered directly over Ethernet.  The 
 protocol basically stole an Ethertype (the one used for IBoE/RoCE) and 
 implemented a proprietary protocol instead.  I have no idea how you resolve 
 that, but I also don't think it's used anymore.  USNIC_UDP is just UDP.

Sean is correct. The legacy RDMA_TRANSPORT_USNIC code used a proprietary 
protocol over plain Ethernet frames.  The newer RDMA_TRANSPORT_USNIC_UDP code 
is just standard UDP/IP/Ethernet packets exposed to user space via the uverbs 
stack.  The current kernel module will support both formats, it just depends on 
which user space requests at create_qp time.  From the kernel point of view 
there is no common protocol between the two TRANSPORTs (other than sharing 
partially similar Ethernet frames at L2).

I posted last week to clarify some of this: 
http://marc.info/?l=linux-rdma&m=142972177830718&w=2

 Well, if RoCEv2 uses the same protocol enum, that may introduce new
 confusion, for example there will be some new CM handling for UDP encap,
 source port selection, and of course vlan/tag assignment, etc. But if
 there is support under way, and everyone is clear, then, ok.
 
 RoCEv2/IBoUDP shares the same port space as UDP.  It has a similar issues as 
 iWarp does sharing state with the main network stack.  I'm not aware of any 
 proposal for resolving that.  Does it require using a separate IP address?  
 Does it use a port mapper function?  Does netdev care for UDP?  I'm not sure 
 what USNIC does for this either, but a common solution between USNIC and 
 IBoUDP seems reasonable.

Is the concern here about CM issues or the UDP ports used by the actual usNIC 
RQs?  CM is not used/supported for usNIC at this time.

-Dave



RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-28 Thread Hefty, Sean
> Is the concern here about CM issues or the UDP ports used by the actual
> usNIC RQs?

UDP port space sharing


[PATCH v7 15/23] IB/Verbs: Use management helper cap_ib_mad()

2015-04-28 Thread Michael Wang
Introduce helper cap_ib_mad() to help us check if the port of an
IB device support Infiniband Management Datagrams.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/mad.c  |  6 +++---
 drivers/infiniband/core/user_mad.c |  6 +++---
 include/rdma/ib_verbs.h| 15 +++
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 507eb67..59459e7 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
if (ib_mad_port_open(device, i)) {
@@ -3087,7 +3087,7 @@ error_agent:
 
 error:
while (--i >= start) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
if (ib_agent_port_close(device, i))
@@ -3111,7 +3111,7 @@ static void ib_mad_remove_device(struct ib_device *device)
}
 
for (i = start; i <= end; i++) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
if (ib_agent_port_close(device, i))
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index aa8b334..e3ccbf2 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port   = e;
 
for (i = s; i <= e; ++i) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
umad_dev->port[i - s].umad_dev = umad_dev;
@@ -1315,7 +1315,7 @@ static void ib_umad_add_one(struct ib_device *device)
 
 err:
while (--i >= s) {
-   if (!rdma_ib_or_iboe(device, i))
+   if (!cap_ib_mad(device, i))
continue;
 
ib_umad_kill_port(&umad_dev->port[i - s]);
@@ -1333,7 +1333,7 @@ static void ib_umad_remove_one(struct ib_device *device)
return;
 
for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
-   if (rdma_ib_or_iboe(device, i))
+   if (cap_ib_mad(device, i))
ib_umad_kill_port(&umad_dev->port[i]);
}
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index acdba60..cb3ba2d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1774,6 +1774,21 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
return (pt == RDMA_PROTOCOL_IB || pt == RDMA_PROTOCOL_IBOE);
 }
 
+/**
+ * cap_ib_mad - Check if the port of device has the capability Infiniband
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support Infiniband
+ * Management Datagrams.
+ */
+static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
+{
+   return rdma_ib_or_iboe(device, port_num);
+}
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
2.1.0
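
The helper above simply forwards to rdma_ib_or_iboe(), so its value is in naming the capability at the call sites. A self-contained mock of that layering might look like the sketch below; the demo_* types and functions are hypothetical stand-ins for illustration only, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock of per-port protocol data; not the kernel's types. */
enum demo_protocol {
	DEMO_PROTOCOL_IB,
	DEMO_PROTOCOL_IBOE,
	DEMO_PROTOCOL_IWARP,
	DEMO_PROTOCOL_USNIC_UDP,
};

#define DEMO_MAX_PORTS 4

struct demo_device {
	unsigned int       phys_port_cnt;
	enum demo_protocol port_proto[DEMO_MAX_PORTS];	/* index 0 is port 1 */
};

static enum demo_protocol demo_query_protocol(const struct demo_device *dev,
					      unsigned int port_num)
{
	assert(port_num >= 1 && port_num <= dev->phys_port_cnt);
	return dev->port_proto[port_num - 1];
}

static bool demo_ib_or_iboe(const struct demo_device *dev, unsigned int port_num)
{
	enum demo_protocol pt = demo_query_protocol(dev, port_num);

	return pt == DEMO_PROTOCOL_IB || pt == DEMO_PROTOCOL_IBOE;
}

/* Capability helper: callers ask "does this port do MAD?", not "which protocol?". */
static bool demo_cap_ib_mad(const struct demo_device *dev, unsigned int port_num)
{
	return demo_ib_or_iboe(dev, port_num);
}

/* Same loop shape as the MAD init path: skip ports without the capability. */
static unsigned int demo_count_mad_ports(const struct demo_device *dev)
{
	unsigned int i, count = 0;

	for (i = 1; i <= dev->phys_port_cnt; i++) {
		if (!demo_cap_ib_mad(dev, i))
			continue;
		count++;
	}
	return count;
}
```

If the set of MAD-capable protocols ever changes, only the body of the capability helper needs to change, and loops like the one above stay as they are.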



[PATCH v7 07/23] IB/Verbs: Reform IB-ulp ipoib

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-ulp ipoib.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 7cad4dd..468fc2b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1680,9 +1680,7 @@ static void ipoib_add_one(struct ib_device *device)
struct net_device *dev;
struct ipoib_dev_priv *priv;
int s, e, p;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
if (!dev_list)
@@ -1699,15 +1697,21 @@ static void ipoib_add_one(struct ib_device *device)
}
 
for (p = s; p <= e; ++p) {
-   if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
priv = netdev_priv(dev);
list_add_tail(&priv->list, dev_list);
+   count++;
}
}
 
+   if (!count) {
+   kfree(dev_list);
+   return;
+   }
+
ib_set_client_data(device, &ipoib_client, dev_list);
 }
 
@@ -1716,9 +1720,6 @@ static void ipoib_remove_one(struct ib_device *device)
struct ipoib_dev_priv *priv, *tmp;
struct list_head *dev_list;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
dev_list = ib_get_client_data(device, &ipoib_client);
if (!dev_list)
return;
-- 
2.1.0



[PATCH v7 08/23] IB/Verbs: Reform IB-ulp xprtrdma

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-ulp xprtrdma.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |  4 +--
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +---
 2 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f9f13a3..2cc625d 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
 
 static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
 {
-   if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
-RDMA_TRANSPORT_IWARP)
+   if (rdma_protocol_iwarp(xprt->sc_cm_id->device,
+   xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..3df8320 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
struct ib_qp_init_attr qp_attr;
struct ib_device_attr devattr;
int uninitialized_var(dma_mr_acc);
-   int need_dma_mr;
+   int need_dma_mr = 0;
int ret;
int i;
 
@@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
/*
 * Determine if a DMA MR is required and if so, what privs are required
 */
-   switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
-   case RDMA_TRANSPORT_IWARP:
-   newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
-   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-   need_dma_mr = 1;
-   dma_mr_acc =
-   (IB_ACCESS_LOCAL_WRITE |
-IB_ACCESS_REMOTE_WRITE);
-   } else if (!(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else
-   need_dma_mr = 0;
-   break;
-   case RDMA_TRANSPORT_IB:
-   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else if (!(devattr.device_cap_flags &
-IB_DEVICE_LOCAL_DMA_LKEY)) {
-   need_dma_mr = 1;
-   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
-   } else
-   need_dma_mr = 0;
-   break;
-   default:
+   if (!rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+newxprt->sc_cm_id->port_num) &&
+   !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
+newxprt->sc_cm_id->port_num))
goto errout;
+
+   if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
+   !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
+   need_dma_mr = 1;
+   dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
+   if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+   newxprt->sc_cm_id->port_num) &&
+   !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
+   dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
}
 
+   if (rdma_protocol_iwarp(newxprt->sc_cm_id->device,
+   newxprt->sc_cm_id->port_num))
+   newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
+
/* Create the DMA MR if needed, otherwise, use the DMA LKEY */
if (need_dma_mr) {
/* Register all of physical memory */
-- 
2.1.0
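
Since the svc_rdma_accept() hunk above collapses a two-case switch into flat conditionals, a small truth-table check is one way to see whether the two shapes agree. The sketch below models only the need_dma_mr / dma_mr_acc decision with three booleans; the demo_* names and flag values are hypothetical, and the access mask is modelled as 0 when no DMA MR is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's constants. */
#define DEMO_ACCESS_LOCAL_WRITE  (1u << 0)
#define DEMO_ACCESS_REMOTE_WRITE (1u << 1)

struct demo_mr_decision {
	bool         need_dma_mr;
	unsigned int access;	/* modelled as 0 when no DMA MR is needed */
};

/* Shape of the removed switch: one branch per transport. */
static struct demo_mr_decision demo_old_logic(bool iwarp, bool fast_reg,
					      bool local_dma_lkey)
{
	struct demo_mr_decision d = { false, 0 };

	if (iwarp) {
		if (!fast_reg) {
			d.need_dma_mr = true;
			d.access = DEMO_ACCESS_LOCAL_WRITE |
				   DEMO_ACCESS_REMOTE_WRITE;
		} else if (!local_dma_lkey) {
			d.need_dma_mr = true;
			d.access = DEMO_ACCESS_LOCAL_WRITE;
		}
	} else {
		if (!fast_reg || !local_dma_lkey) {
			d.need_dma_mr = true;
			d.access = DEMO_ACCESS_LOCAL_WRITE;
		}
	}
	return d;
}

/* Shape of the added code: one combined test plus an iWARP special case. */
static struct demo_mr_decision demo_new_logic(bool iwarp, bool fast_reg,
					      bool local_dma_lkey)
{
	struct demo_mr_decision d = { false, 0 };

	if (!fast_reg || !local_dma_lkey) {
		d.need_dma_mr = true;
		d.access = DEMO_ACCESS_LOCAL_WRITE;
		if (iwarp && !fast_reg)
			d.access |= DEMO_ACCESS_REMOTE_WRITE;
	}
	return d;
}

/* Walk all eight input combinations and insist that both shapes agree. */
static void demo_assert_equivalent(void)
{
	for (int bits = 0; bits < 8; bits++) {
		bool iwarp = bits & 1, fast_reg = bits & 2, lkey = bits & 4;
		struct demo_mr_decision a = demo_old_logic(iwarp, fast_reg, lkey);
		struct demo_mr_decision b = demo_new_logic(iwarp, fast_reg, lkey);

		assert(a.need_dma_mr == b.need_dma_mr);
		assert(a.access == b.access);
	}
}
```

Under this model the two versions make the same decision for every input; the model does not cover the errout path for other transports or the READ_W_INV capability bit.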



[PATCH v7 06/23] IB/Verbs: Reform IB-core multicast

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core multicast.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/multicast.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index fa17b55..b57ed03 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
int index;
 
dev = container_of(handler, struct mcast_device, event_handler);
-   if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
-   IB_LINK_LAYER_INFINIBAND)
+   if (WARN_ON(!rdma_protocol_ib(dev->device, event->element.port_num)))
return;
 
index = event->element.port_num - dev->start_port;
@@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
int i;
int count = 0;
 
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
-
dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
  GFP_KERNEL);
if (!dev)
@@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_port_get_link_layer(device, dev->start_port + i) !=
-   IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
-   if (rdma_port_get_link_layer(device, dev->start_port + i) ==
-   IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
-- 
2.1.0



[PATCH v7 05/23] IB/Verbs: Reform IB-core sa_query

2015-04-28 Thread Michael Wang
Use raw management helpers to reform IB-core sa_query.

Cc: Hal Rosenstock h...@dev.mellanox.co.il
Cc: Steve Wise sw...@opengridcomputing.com
Cc: Tom Talpey t...@talpey.com
Cc: Jason Gunthorpe jguntho...@obsidianresearch.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Ira Weiny ira.we...@intel.com
Cc: Sean Hefty sean.he...@intel.com
Signed-off-by: Michael Wang yun.w...@profitbricks.com
---
 drivers/infiniband/core/sa_query.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index c38f030..b115c28 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - sa_dev->start_port];
 
-   if (rdma_port_get_link_layer(handler->device, port->port_num) != IB_LINK_LAYER_INFINIBAND)
+   if (WARN_ON(!rdma_protocol_ib(handler->device, port->port_num)))
return;
 
spin_lock_irqsave(&port->ah_lock, flags);
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;
 
-   force_grh = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_ETHERNET;
+   force_grh = rdma_protocol_iboe(device, port_num);
 
if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
@@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
 {
struct ib_sa_device *sa_dev;
int s, e, i;
-
-   if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
-   return;
+   int count = 0;
 
if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
 
for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
-   if (rdma_port_get_link_layer(device, i + 1) != IB_LINK_LAYER_INFINIBAND)
+   if (!rdma_protocol_ib(device, i + 1))
continue;
 
sa_dev->port[i].sm_ah= NULL;
@@ -1189,8 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;
 
INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
+
+   count++;
}
 
+   if (!count)
+   goto free;
+
ib_set_client_data(device, &sa_client, sa_dev);
 
/*
@@ -1204,19 +1207,20 @@ static void ib_sa_add_one(struct ib_device *device)
if (ib_register_event_handler(&sa_dev->event_handler))
goto err;
 
-   for (i = 0; i <= e - s; ++i)
-   if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+   for (i = 0; i <= e - s; ++i) {
+   if (rdma_protocol_ib(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
+   }
 
return;
 
 err:
-   while (--i >= 0)
-   if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+   while (--i >= 0) {
+   if (rdma_protocol_ib(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
-
+   }
+free:
kfree(sa_dev);
-
return;
 }
 
@@ -1233,7 +1237,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);
 
for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
-   if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND) {
+   if (rdma_protocol_ib(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
-- 
2.1.0
