Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()

2015-04-17 Thread Michael Wang
On 04/16/2015 08:07 PM, Steve Wise wrote:
 
 
 -Original Message-
 From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com]
 Sent: Thursday, April 16, 2015 11:43 AM
 To: Michael Wang
 Cc: Roland Dreier; Sean Hefty; Hal Rosenstock; linux-rdma@vger.kernel.org; 
 linux-ker...@vger.kernel.org; Tom Tucker; Steve Wise;
 Hoang-Nam Nguyen; Christoph Raisch; Mike Marciniszyn; Eli Cohen; Faisal 
 Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran; Ira
 Weiny;
 Tom Talpey; Doug Ledford
 Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()

 On Tue, Apr 14, 2015 at 11:13:03AM +0200, Michael Wang wrote:

 I would be very happy to see a patch that adds cap_ib_smi to the
 current tree and states 'This patch is tested to have no change on the
 binary compilation results'

 There is too much reform there (per-dev to per-port), I guess the binary
 will change more or less anyway...

 I think this patch series is huge, and every time someone new looks at
 it small functional errors seem to pop up..

 Doing something to reduce the review surface would be really helpful
 here. Not changing the same line twice, using tools to perform these
 transforms and then assert the patch is a NOP because .. tools. Some
 other idea?

 
 Don't try and change everything in one giant series.   Just do some changes 
 this cycle (keep it at  8 or 10 patches), and do more
 later...

Actually only patches 1~15 are related to the logical reform; the rest are
just replacements :-)

I too would like to stop introducing new stuff at this point, and focus on
improving what we have already settled.

Regards,
Michael Wang

 
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

2015-04-17 Thread Michael Wang
Hi, Roland

Thanks for the comment :-)

On 04/16/2015 07:02 PM, Roland Dreier wrote:
 On Thu, Apr 16, 2015 at 9:44 AM, Jason Gunthorpe
 jguntho...@obsidianresearch.com wrote:
 We can give the client->add() callback a return value and make
 ib_register_device() return -ENOMEM when it fails; just wondering
 why we didn't do this in the first place, any special reason?
 
 No idea, but having ib_register_device fail and unwind if a client
 fails to attach makes sense to me.
 
 It seems a bit unfriendly to fail an entire device if one ULP has a
 problem.  Let's say you have a system whose main network connection is
 IPoIB.  Would you want that connection to come up even if, say, the
 NFS/RDMA server fails to find the memory registration type it likes?

Agreed, the idea is right: one client's initialization failure should not
affect the whole device, as long as the remaining clients can keep the device
working (but how do we determine that...).

Still, just ignoring the failure seems really strange...

Regards,
Michael Wang

 
  - R.
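
The trade-off discussed in this thread (let add() report failure, but do not fail the whole device because one ULP could not attach) can be sketched in plain C. This is only an illustration under stated assumptions: `demo_device`, `demo_client` and `demo_register_device()` are hypothetical stand-ins, not the kernel's `struct ib_client` or `ib_register_device()`.

```c
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's ib_device / ib_client. */
struct demo_device {
	const char *name;
};

struct demo_client {
	const char *name;
	/* Returns 0 on success, a negative errno-style value on failure. */
	int (*add)(struct demo_device *dev);
	int attached;	/* filled in by demo_register_device() */
};

/*
 * Offer the device to every client. A client that fails is skipped
 * instead of failing the whole device, but the number of failures is
 * returned so the caller can report it rather than silently ignore it.
 */
int demo_register_device(struct demo_device *dev,
			 struct demo_client *clients, size_t n)
{
	int failed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		clients[i].attached = (clients[i].add(dev) == 0);
		if (!clients[i].attached)
			failed++;
	}
	return failed;
}

/* Two toy callbacks for demonstration only. */
int demo_add_ok(struct demo_device *dev)   { (void)dev; return 0; }
int demo_add_fail(struct demo_device *dev) { (void)dev; return -12; }
```

With one working and one failing client, the device still comes up for the working one, and the caller learns that one client did not attach.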
 


Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

2015-04-17 Thread Michael Wang
On 04/16/2015 07:05 PM, Weiny, Ira wrote:

 On Wed, Apr 15, 2015 at 09:58:18AM +0200, Michael Wang wrote:

 We can give the client->add() callback a return value and make
 ib_register_device() return -ENOMEM when it fails; just wondering why
 we didn't do this in the first place, any special reason?

 No idea, but having ib_register_device fail and unwind if a client fails to 
 attach
 makes sense to me.
 
 Yes that is what we should do _but_ 
 
 I think we should tackle that in a different series.
 
 As you said in another email, this series is getting very long and hard to 
 review/prove is correct.  This is why I was advocating keeping a check at the 
 top of cm_add_one which verified all Ports supported the CM.  This is the 
 current logic today and is proven to work for the devices/use cases out there.
 
 We can clean up the initialization code and implement support for individual 
 ports in follow on patches.

Agreed, as long as this series does not introduce any bugs, I suggest we
put other reform ideas into the next series :-)

We have already eliminated the old inferring approach and integrated all the
cases into helpers; further reform should be far clearer on top of
this foundation.

Regards,
Michael Wang

 
 Ira
 


Re: [PATCH v4 10/27] IB/Verbs: Reform cm related part in IB-core cma/ucm

2015-04-17 Thread Michael Wang


On 04/16/2015 07:30 PM, Hefty, Sean wrote:
 To be confirmed:
 PORT ASSIGNED
 rdma_init_qp_attr   Y
 rdma_destroy_id unknown
 cma_listen_on_dev   N
 cma_bind_loopback   N
 
 Bind loopback will attach to a port, but the id does not have one on entry.
 
 rdma_listen N

 Why N? rdma_listen() can be constrained to a single port, right?
 And even if wildcarded, it needs to act on multiple ports, which is
 to say, it will fail only if no ports are eligible.
 
 Rdma listen should be unknown.  The id may be assigned to a port.  It depends 
 on the source address.

Agreed, so for those marked 'N' or 'unknown', let's use port 1 directly.

 
 rdma_connect        Y
 rdma_accept         Y
 rdma_reject         Y
 rdma_disconnect     Y
 ib_ucm_add_one      N
 
 Others look correct.
 
 Btw, thanks for your work on this.  I know this is becoming a much bigger 
 change than you originally intended.  :)

My pleasure :-) It did exceed my expectations; fortunately I have you guys
on my side. Without your comments, it would have been impossible to get this far :-)

Regards,
Michael Wang

 
 - Sean
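
The rule agreed on in this message (use the id's own port where one is assigned, otherwise fall back to port 1) can be sketched as below. It is a minimal illustration, assuming that a port number of 0 means "not assigned yet", as mentioned elsewhere in the thread; `demo_cm_id` and `demo_check_port()` are hypothetical names, not part of the rdma_cm API.

```c
/* Hypothetical mirror of the single field that matters here. */
struct demo_cm_id {
	unsigned char port_num;	/* 0: no port assigned yet (assumption) */
};

/*
 * Port to pass to a per-port capability check: the assigned port when
 * there is one, otherwise port 1 as the fallback discussed above.
 */
unsigned char demo_check_port(const struct demo_cm_id *id)
{
	return id->port_num ? id->port_num : 1;
}
```

Callers in the 'Y' rows of the table would always take the first branch; the 'N' and 'unknown' rows may take either.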
 


Re: [PATCH v4 10/27] IB/Verbs: Reform cm related part in IB-core cma/ucm

2015-04-17 Thread Michael Wang


On 04/16/2015 05:58 PM, Jason Gunthorpe wrote:
 On Thu, Apr 16, 2015 at 10:08:10AM +0200, Michael Wang wrote:

 Use raw management helpers to reform cm related part in IB-core cma/ucm.

 These checks focus on the device cm type rather than the port capability;
 directly passing port 1 works currently, but can't support devices with
 mixed cm types in the future.
 
 After the discussion settled, I ended up thinking that implementing
 explicit device checks, for use by CM, and the BUG_ON at register to
 require all ports have the same value was the best option.
 
 It also looks like hardwired 1 won't work on switch ports, so it is no-go.

AFAIK, current HW won't trigger such a bug; actually only mlx4 uses port_num
to check the link layer (but it is still ib cm anyway), the others are just
static whatever the port_num is.

Thus as long as the check still relies on transport type and link layer, the
BUG may never be triggered...

Regards,
Michael Wang

 
 Jason
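
Jason's suggestion (an explicit device-level check that requires every port to report the same value) could look roughly like the sketch below. It is written as ordinary userspace C; `demo_ports_agree()` and the `port_cap` array are hypothetical, and in the kernel the result would presumably feed a BUG_ON or WARN_ON at registration time.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * port_cap[i] holds the per-port capability of port i + 1, for example
 * "this port uses the IB CM". Returns true when all ports agree, so a
 * device-wide answer can be derived from any single port.
 */
bool demo_ports_agree(const bool *port_cap, size_t nports)
{
	size_t i;

	for (i = 1; i < nports; i++) {
		if (port_cap[i] != port_cap[0])
			return false;
	}
	return true;
}
```

Only when this holds is it safe to ask one port on behalf of the whole device, which is what the hardwired port 1 relies on implicitly.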
 


Re: [PATCH v4 10/27] IB/Verbs: Reform cm related part in IB-core cma/ucm

2015-04-17 Thread Michael Wang


On 04/16/2015 07:21 PM, Tom Talpey wrote:
 On 4/16/2015 11:22 AM, Michael Wang wrote:


 On 04/16/2015 04:31 PM, Hefty, Sean wrote:
 This is equivalent to today where the checks are per node rather than
 per port.

 Should all checks here be port 1 based or only certain ones like listen
 ? For example, in connect/reject/disconnect, don't we already have port
 ? Guess this can be dealt with later as this is not a regression from
 the current implementation.

 Yeah, these parts of cma may need more carving up in the future, like some
 new callback for different CM types as Sean suggested.

 Maybe directly using 1 could help to highlight the problem ;-)

 Only a few checks need to be per device.  I think I pointed those out 
 previously.  Testing should show anywhere that we miss fairly quickly, 
 since port would still be 0.  For the checks that can be updated to be per 
 port, I would rather go ahead and convert them.

 Got it, will be changed in next version :-)

 To be confirmed:
 PORT ASSIGNED
 rdma_init_qp_attr   Y
 rdma_destroy_id     unknown
 cma_listen_on_dev   N
 cma_bind_loopback   N
 rdma_listen         N
 
 Why N? rdma_listen() can be constrained to a single port, right?
 And even if wildcarded, it needs to act on multiple ports, which is
 to say, it will fail only if no ports are eligible.

Yeah, it may or may not be; maybe 'unknown' is better :-)

Regards,
Michael Wang

 
 Tom.
 
 
 rdma_connect        Y
 rdma_accept         Y
 rdma_reject         Y
 rdma_disconnect     Y
 ib_ucm_add_one      N

 Is this list correct?

 Regards,
 Michael Wang


 - Sean



 


[PATCH for-4.1 2/4] iw_cxgb4: 32b platform fixes

2015-04-17 Thread Hariprasad Shenai
- get_dma_mr() was using ~0UL, which should be ~0ULL.  This causes the
DMA MR to get set up incorrectly in hardware.

- wr_log_show() needed a 64b divide function, div64_u64(), instead of
doing division directly.

- fixed warnings about recasting a pointer to a u64

Signed-off-by: Steve Wise sw...@opengridcomputing.com
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cm.c |  2 +-
 drivers/infiniband/hw/cxgb4/cq.c |  6 +++---
 drivers/infiniband/hw/cxgb4/device.c |  6 +++---
 drivers/infiniband/hw/cxgb4/mem.c| 10 +-
 drivers/infiniband/hw/cxgb4/qp.c | 14 +++---
 5 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index 6ed5025..636fe84 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -3571,7 +3571,7 @@ static void send_fw_pass_open_req(struct c4iw_dev *dev, struct sk_buff *skb,
 	 * TP will ignore any value > 0 for MSS index.
 	 */
 	req->tcb.opt0 = cpu_to_be64(MSS_IDX_V(0xF));
-	req->cookie = (unsigned long)skb;
+	req->cookie = (u64)(unsigned long)skb;
 
 	set_wr_txq(req_skb, CPL_PRIORITY_CONTROL, port_id);
 	ret = cxgb4_ofld_send(dev->rdev.lldi.ports[0], req_skb);
diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index ab7692a..a0358b1 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -55,7 +55,7 @@ static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 			FW_RI_RES_WR_NRES_V(1) |
 			FW_WR_COMPL_F);
 	res_wr->len16_pkd = cpu_to_be32(DIV_ROUND_UP(wr_len, 16));
-	res_wr->cookie = (unsigned long) &wr_wait;
+	res_wr->cookie = (u64)(unsigned long)&wr_wait;
 	res = res_wr->res;
 	res->u.cq.restype = FW_RI_RES_TYPE_CQ;
 	res->u.cq.op = FW_RI_RES_OP_RESET;
@@ -125,7 +125,7 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 			FW_RI_RES_WR_NRES_V(1) |
 			FW_WR_COMPL_F);
 	res_wr->len16_pkd = cpu_to_be32(DIV_ROUND_UP(wr_len, 16));
-	res_wr->cookie = (unsigned long) &wr_wait;
+	res_wr->cookie = (u64)(unsigned long)&wr_wait;
 	res = res_wr->res;
 	res->u.cq.restype = FW_RI_RES_TYPE_CQ;
 	res->u.cq.op = FW_RI_RES_OP_WRITE;
@@ -970,7 +970,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int entries,
 	}
 	PDBG("%s cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n",
 	     __func__, chp->cq.cqid, chp, chp->cq.size,
-	     chp->cq.memsize,
+	     (unsigned long)chp->cq.memsize,
 	     (unsigned long long) chp->cq.dma_addr);
 	return &chp->ibcq;
 err5:
diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
index 8fb295e..5f7bb78 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -151,7 +151,7 @@ static int wr_log_show(struct seq_file *seq, void *v)
 	int prev_ts_set = 0;
 	int idx, end;
 
-#define ts2ns(ts) div64_ul((ts) * dev->rdev.lldi.cclk_ps, 1000)
+#define ts2ns(ts) div64_u64((ts) * dev->rdev.lldi.cclk_ps, 1000)
 
 	idx = atomic_read(&dev->rdev.wr_log_idx) &
 		(dev->rdev.wr_log_size - 1);
@@ -784,10 +784,10 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
 	     rdev->lldi.vr->qp.size,
 	     rdev->lldi.vr->cq.start,
 	     rdev->lldi.vr->cq.size);
-	PDBG("udb len 0x%x udb base %llx db_reg %p gts_reg %p qpshift %lu "
+	PDBG("udb len 0x%x udb base %p db_reg %p gts_reg %p qpshift %lu "
 	     "qpmask 0x%x cqshift %lu cqmask 0x%x\n",
 	     (unsigned)pci_resource_len(rdev->lldi.pdev, 2),
-	     (u64)pci_resource_start(rdev->lldi.pdev, 2),
+	     (void *)(unsigned long)pci_resource_start(rdev->lldi.pdev, 2),
 	     rdev->lldi.db_reg,
 	     rdev->lldi.gts_reg,
 	     rdev->qpshift, rdev->qpmask,
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 6791fd1..30db971 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -144,7 +144,7 @@ static int _c4iw_write_mem_inline(struct c4iw_rdev *rdev, u32 addr, u32 len,
 		if (i == (num_wqe-1)) {
 			req->wr.wr_hi = cpu_to_be32(FW_WR_OP_V(FW_ULPTX_WR) |
 						    FW_WR_COMPL_F);
-			req->wr.wr_lo = (__force __be64)(unsigned long) &wr_wait;
+			req->wr.wr_lo = (__force __be64)(unsigned long)&wr_wait;
 		} else
 			req->wr.wr_hi = cpu_to_be32(FW_WR_OP_V(FW_ULPTX_WR));
 		req->wr.wr_mid = cpu_to_be32(
@@ -676,12 +676,12 @@ struct ib_mr *c4iw_get_dma_mr(struct ib_pd *pd, int acc)
 	mhp->attr.zbva = 0;
 	mhp->attr.va_fbo = 0;
 	mhp->attr.page_size = 0;
-	mhp->attr.len =
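
The ~0UL versus ~0ULL point in the commit message can be shown with a small userspace sketch. It assumes the destination field is 64 bits wide and simulates a 32b platform by using `uint32_t` in place of `unsigned long`; the function names are hypothetical.

```c
#include <stdint.h>

/*
 * What ~0UL turns into when stored in a 64-bit field on a platform
 * where unsigned long is 32 bits: the value is zero-extended, so the
 * upper 32 bits end up clear.
 */
uint64_t demo_len_from_32bit_long(void)
{
	uint32_t all_ones = ~(uint32_t)0;	/* stands in for ~0UL on 32b */

	return all_ones;
}

/*
 * unsigned long long is at least 64 bits everywhere, so ~0ULL fills
 * the whole 64-bit field regardless of the platform.
 */
uint64_t demo_len_from_ull(void)
{
	return ~0ULL;
}
```

The first helper yields 0x00000000ffffffff, the second 0xffffffffffffffff, which is the difference between a 4 GiB length and the intended "everything" length.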

[PATCH for-4.1 3/4] iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs

2015-04-17 Thread Hariprasad Shenai
For T5, we must not use the kdb/kgts registers, in order to avoid db drops
under extreme loads.

Signed-off-by: Steve Wise sw...@opengridcomputing.com
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cq.c | 15 +++
 drivers/infiniband/hw/cxgb4/t4.h |  7 ---
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index a0358b1..2504bce 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -156,12 +156,19 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 		goto err4;
 
 	cq->gen = 1;
-	cq->gts = rdev->lldi.gts_reg;
 	cq->rdev = rdev;
 	if (user) {
-		cq->ugts = (u64)pci_resource_start(rdev->lldi.pdev, 2) +
-					(cq->cqid << rdev->cqshift);
-		cq->ugts &= PAGE_MASK;
+		u32 off = (cq->cqid << rdev->cqshift) & PAGE_MASK;
+
+		cq->ugts = (u64)rdev->bar2_pa + off;
+	} else if (is_t4(rdev->lldi.adapter_type)) {
+		cq->gts = rdev->lldi.gts_reg;
+		cq->qid_mask = -1U;
+	} else {
+		u32 off = ((cq->cqid << rdev->cqshift) & PAGE_MASK) + 12;
+
+		cq->gts = rdev->bar2_kva + off;
+		cq->qid_mask = rdev->qpmask;
 	}
 	return 0;
 err4:
diff --git a/drivers/infiniband/hw/cxgb4/t4.h b/drivers/infiniband/hw/cxgb4/t4.h
index 871cdca..0b47f9a 100644
--- a/drivers/infiniband/hw/cxgb4/t4.h
+++ b/drivers/infiniband/hw/cxgb4/t4.h
@@ -539,6 +539,7 @@ struct t4_cq {
 	size_t memsize;
 	__be64 bits_type_ts;
 	u32 cqid;
+	u32 qid_mask;
 	int vector;
 	u16 size; /* including status page */
 	u16 cidx;
@@ -563,12 +564,12 @@ static inline int t4_arm_cq(struct t4_cq *cq, int se)
 	set_bit(CQ_ARMED, &cq->flags);
 	while (cq->cidx_inc > CIDXINC_M) {
 		val = SEINTARM_V(0) | CIDXINC_V(CIDXINC_M) | TIMERREG_V(7) |
-		      INGRESSQID_V(cq->cqid);
+		      INGRESSQID_V(cq->cqid & cq->qid_mask);
 		writel(val, cq->gts);
 		cq->cidx_inc -= CIDXINC_M;
 	}
 	val = SEINTARM_V(se) | CIDXINC_V(cq->cidx_inc) | TIMERREG_V(6) |
-	      INGRESSQID_V(cq->cqid);
+	      INGRESSQID_V(cq->cqid & cq->qid_mask);
 	writel(val, cq->gts);
 	cq->cidx_inc = 0;
 	return 0;
@@ -601,7 +602,7 @@ static inline void t4_hwcq_consume(struct t4_cq *cq)
 		u32 val;
 
 		val = SEINTARM_V(0) | CIDXINC_V(cq->cidx_inc) | TIMERREG_V(7) |
-		      INGRESSQID_V(cq->cqid);
+		      INGRESSQID_V(cq->cqid & cq->qid_mask);
 		writel(val, cq->gts);
 		cq->cidx_inc = 0;
 	}
-- 
2.3.4
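
The address arithmetic in the patch above can be tried out in userspace with the sketch below. It is an illustration only: the `DEMO_` macros and function names are hypothetical, a 4 KiB page size is assumed, and the `+ 12` and the masking simply mirror what the diff does without claiming anything further about the hardware layout.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define DEMO_PAGE_MASK	(~((UINT32_C(1) << DEMO_PAGE_SHIFT) - 1))

/* Page-aligned BAR2 offset for a queue id (the user-mode branch). */
uint32_t demo_bar2_page_off(uint32_t cqid, unsigned int cqshift)
{
	return (cqid << cqshift) & DEMO_PAGE_MASK;
}

/* Non-T4 kernel-mode branch: the patch adds 12 to the page offset. */
uint32_t demo_bar2_gts_off(uint32_t cqid, unsigned int cqshift)
{
	return demo_bar2_page_off(cqid, cqshift) + 12;
}

/*
 * Queue id as written to the GTS register: with qid_mask == -1U (the
 * T4 branch) the id passes through unchanged, otherwise only the bits
 * selected by the mask survive.
 */
uint32_t demo_gts_qid(uint32_t cqid, uint32_t qid_mask)
{
	return cqid & qid_mask;
}
```

For example, with a made-up cqid of 0x123 and cqshift of 7, the page offset is 0x9000 and the kernel-mode GTS offset is 0x900c.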



[PATCH for-4.1 4/4] iw_cxgb4: enforce qp/cq id requirements

2015-04-17 Thread Hariprasad Shenai
Currently the iw_cxgb4 implementation requires the qp and cq qid densities
to match as well as the qp and cq id ranges.  So fail a device open if
the device configuration doesn't meet the requirements.

The reason for these restrictions has to do with the fact that IQ qid X
has a UGTS register in the same bar2 page as EQ qid X.  Thus both qids
need to be allocated to the same user process for security reasons.
The logic that does this (the qpid allocator in iw_cxgb4/resource.c)
handles this but requires the above restrictions.

Signed-off-by: Steve Wise sw...@opengridcomputing.com
Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/device.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
index 5f7bb78..f36aa21 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -765,6 +765,29 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
 	c4iw_init_dev_ucontext(rdev, &rdev->uctx);
 
 	/*
+	 * This implementation assumes udb_density == ucq_density!  Eventually
+	 * we might need to support this but for now fail the open. Also the
+	 * cqid and qpid range must match for now.
+	 */
+	if (rdev->lldi.udb_density != rdev->lldi.ucq_density) {
+		pr_err(MOD "%s: unsupported udb/ucq densities %u/%u\n",
+		       pci_name(rdev->lldi.pdev), rdev->lldi.udb_density,
+		       rdev->lldi.ucq_density);
+		err = -EINVAL;
+		goto err1;
+	}
+	if (rdev->lldi.vr->qp.start != rdev->lldi.vr->cq.start ||
+	    rdev->lldi.vr->qp.size != rdev->lldi.vr->cq.size) {
+		pr_err(MOD "%s: unsupported qp and cq id ranges "
+		       "qp start %u size %u cq start %u size %u\n",
+		       pci_name(rdev->lldi.pdev), rdev->lldi.vr->qp.start,
+		       rdev->lldi.vr->qp.size, rdev->lldi.vr->cq.size,
+		       rdev->lldi.vr->cq.size);
+		err = -EINVAL;
+		goto err1;
+	}
+
+	/*
 	 * qpshift is the number of bits to shift the qpid left in order
 	 * to get the correct address of the doorbell for that qp.
 	 */
-- 
2.3.4
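
The two requirements enforced by this patch (equal qp/cq densities, identical qp and cq id ranges) reduce to a simple predicate. The sketch below restates it outside the driver; `demo_lldi`, `demo_range` and `demo_config_supported()` are hypothetical names that only loosely mirror the fields used in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of the fields checked in the patch. */
struct demo_range {
	uint32_t start;
	uint32_t size;
};

struct demo_lldi {
	uint32_t udb_density;
	uint32_t ucq_density;
	struct demo_range qp;
	struct demo_range cq;
};

/*
 * True when the configuration meets both requirements: the densities
 * match, and the qp and cq id ranges have the same start and size.
 */
bool demo_config_supported(const struct demo_lldi *l)
{
	if (l->udb_density != l->ucq_density)
		return false;
	return l->qp.start == l->cq.start && l->qp.size == l->cq.size;
}
```

A caller would fail the device open with an error such as -EINVAL whenever this returns false, as the patch does.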



[PATCH for-4.1 0/4] Misc. fixes and cleanup for iw_cxgb4

2015-04-17 Thread Hariprasad Shenai
Hi,

This patch series changes a macro definition to be consistent with other
register macro defines, fixes 32b platform issues, uses the BAR2 register for
kernel mode CQs, and enforces qp/cq id requirements.

The patch series is created against the github for-4.1 branch
and includes patches for the iw_cxgb4 driver.

We have included all the maintainers of the respective drivers. Kindly review
the changes and let us know in case of any review comments.

Thanks

Hariprasad Shenai (4):
  iw_cxgb4: Cleanup register defines/MACROS
  iw_cxgb4: 32b platform fixes
  iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
  iw_cxgb4: enforce qp/cq id requirements

 drivers/infiniband/hw/cxgb4/cm.c  |  6 +++---
 drivers/infiniband/hw/cxgb4/cq.c  | 21 ++---
 drivers/infiniband/hw/cxgb4/device.c  | 29 ++---
 drivers/infiniband/hw/cxgb4/mem.c | 10 +-
 drivers/infiniband/hw/cxgb4/qp.c  | 14 +++---
 drivers/infiniband/hw/cxgb4/t4.h  |  7 ---
 drivers/infiniband/hw/cxgb4/t4fw_ri_api.h |  4 +++-
 7 files changed, 62 insertions(+), 29 deletions(-)

-- 
2.3.4



[PATCH for-4.1 1/4] iw_cxgb4: Cleanup register defines/MACROS

2015-04-17 Thread Hariprasad Shenai
Cleanup macros and register defines for consistency

Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
---
 drivers/infiniband/hw/cxgb4/cm.c  | 4 ++--
 drivers/infiniband/hw/cxgb4/t4fw_ri_api.h | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index 57176dd..6ed5025 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -675,7 +675,7 @@ static int send_connect(struct c4iw_ep *ep)
 	if (is_t5(ep->com.dev->rdev.lldi.adapter_type)) {
 		opt2 |= T5_OPT_2_VALID_F;
 		opt2 |= CONG_CNTRL_V(CONG_ALG_TAHOE);
-		opt2 |= CONG_CNTRL_VALID; /* OPT_2_ISS for T5 */
+		opt2 |= T5_ISS_F;
 	}
 	t4_set_arp_err_handler(skb, ep, act_open_req_arp_failure);
 
@@ -2214,7 +2214,7 @@ static void accept_cr(struct c4iw_ep *ep, struct sk_buff *skb,
 		u32 isn = (prandom_u32() & ~7UL) - 1;
 		opt2 |= T5_OPT_2_VALID_F;
 		opt2 |= CONG_CNTRL_V(CONG_ALG_TAHOE);
-		opt2 |= CONG_CNTRL_VALID; /* OPT_2_ISS for T5 */
+		opt2 |= T5_ISS_F;
 		rpl5 = (void *)rpl;
 		memset(&rpl5->iss, 0, roundup(sizeof(*rpl5)-sizeof(*rpl), 16));
 		if (peer2peer)
diff --git a/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h b/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h
index 5e53327..343e8daf 100644
--- a/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h
+++ b/drivers/infiniband/hw/cxgb4/t4fw_ri_api.h
@@ -848,6 +848,8 @@ enum { /* TCP congestion control algorithms */
 #define CONG_CNTRL_V(x) ((x) << CONG_CNTRL_S)
 #define CONG_CNTRL_G(x) (((x) >> CONG_CNTRL_S) & CONG_CNTRL_M)
 
-#define CONG_CNTRL_VALID   (1 << 18)
+#define T5_ISS_S	18
+#define T5_ISS_V(x)	((x) << T5_ISS_S)
+#define T5_ISS_F	T5_ISS_V(1U)
 
 #endif /* _T4FW_RI_API_H_ */
-- 
2.3.4
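
The `_S` / `_V` / `_F` naming that this cleanup moves to can be read as shift, value and flag. The sketch below reproduces the pattern with hypothetical `DEMO_` names so it compiles on its own; bit position 18 is taken from the patch, and `demo_set_iss()` is an illustrative helper, not driver code.

```c
/* _S: bit position of the field. */
#define DEMO_ISS_S	18
/* _V(x): place value x into the field. */
#define DEMO_ISS_V(x)	((x) << DEMO_ISS_S)
/* _F: the field as a single-bit flag. */
#define DEMO_ISS_F	DEMO_ISS_V(1U)

/* Set the flag in an option word, as the patched code does with opt2. */
unsigned int demo_set_iss(unsigned int opt2)
{
	return opt2 | DEMO_ISS_F;
}
```

The flag evaluates to the same value as the old `(1 << 18)` constant, so the change is one of naming rather than behaviour.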



Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()

2015-04-17 Thread Michael Wang


On 04/16/2015 06:43 PM, Jason Gunthorpe wrote:
 On Tue, Apr 14, 2015 at 11:13:03AM +0200, Michael Wang wrote:
 
 I would be very happy to see a patch that adds cap_ib_smi to the
 current tree and states 'This patch is tested to have no change on the
 binary compilation results'

 There is too much reform there (per-dev to per-port), I guess the binary
 will change more or less anyway...
 
 I think this patch series is huge, and every time someone new looks at
 it small functional errors seem to pop up..

This is a big change after all :-P

As Doug suggested at the very beginning, all these changes are necessary
in order to eliminate the old inferring method; then we will
have a clean stage for the next reform.

And since it's big, I tried to classify the patches by logic
to make review easier. I'm not sure, but compressing the series
may increase the difficulty of reviewing...

 
 Doing something to reduce the review surface would be really helpful
 here. Not changing the same line twice, using tools to perform these
 transforms and then assert the patch is a NOP because .. tools. Some
 other idea?

Actually the main reform work is finished in patches 1~15; the rest just
introduce cap_XX, for which we only need to check the description and
usage. Thus I'd like to suggest we focus on reviewing 1~15; after all,
the rest won't introduce bugs and we can edit them at any time :-P

Frankly speaking, I think it's a good thing that we are locating errors at
this point; whenever someone finds issues, that means the patch has
been reviewed thoroughly. I think maybe in just a few more versions this
series will become stable ;-)

Regards,
Michael Wang


 
 Jason
 


Re: Stepping down as maintainer (was Re: [PATCH for-next 0/9] mlx4 changes in virtual GID management)

2015-04-17 Thread Or Gerlitz
On Fri, Apr 17, 2015 at 1:03 AM, Doug Ledford dledf...@redhat.com wrote:
 On Thu, 2015-04-16 at 15:37 -0600, Jason Gunthorpe wrote:
 On Fri, Apr 17, 2015 at 12:26:47AM +0300, Or Gerlitz wrote:
  On Thu, Apr 16, 2015 at 11:13 PM, Jason Gunthorpe
  jguntho...@obsidianresearch.com wrote:
   On Wed, Apr 15, 2015 at 08:39:13AM -0400, Doug Ledford wrote:
  
   Let’s take this as a starting point.  If there are other patches
   people think should make 4.1, please bring them to my attention and
   I’ll add them for review and possible addition to the list.
  
   git://github.com/dledford/linux.git for-4.1
  
   I noticed a few patches didn't get Reviewed-By's etc from the mailing
   list annotated..
 
  Every patch need not have a Reviewed-By annotation, it's perfectly up
  to the maintainer judgement to make sure there was either consensus,

 More clearly: There were several patches where a Reviewed-By/etc was
 *provided* on-list but did not get captured in the git commit.

 Sure, that's what I figured you meant.  That most likely means that I
 pulled the bundle from patchworks before their review was captured.  It
 means I would need to either re-pull the bundle or manually add the
 review.

 AFAIK, capturing provided tags is considered best practice as a way to
 recognize and retain contributors.

 Agreed.  I'll double check them tomorrow.

Doug,

Seems like Roland isn't around to carry the rdma pull request for 4.1 ...

I assume the safest way to do that would be to ask Dave Miller to pull
the patches from your tree into net-next ASAP, next, have them sit
there for 1-2 days to pass linux-next etc auto merge tests, and then
have Dave send a pull request. Dave, I hope it would be OK with you
to help us here with the 4.1-rc1 bits. For 4.2 and 4.1-rc > 1, Doug
should be ready to send pull requests directly to Linus.

Or.


Re: RDMA patches for 4.1

2015-04-17 Thread Christoph Lameter
On Thu, 16 Apr 2015, Or Gerlitz wrote:

 I still didn't hear clear response from Roland confirming that he's gonna send
 pull request with these bits.

 Roland, please confirm that you are taking this and if not, Dave can you help
 us here? copied linux-rdma.

Or, I think the best way forward is just to set up a tree, put the stuff in
there, get some review and then send the pull request yourself if
Roland does not answer.




Re: Stepping down as maintainer (was Re: [PATCH for-next 0/9] mlx4 changes in virtual GID management)

2015-04-17 Thread Or Gerlitz
On Fri, Apr 17, 2015 at 8:03 PM, Roland Dreier rol...@kernel.org wrote:
 On Fri, Apr 17, 2015 at 4:15 AM, Or Gerlitz gerlitz...@gmail.com wrote:
 Seems like Roland isn't around to carry the rdma pull request for 4.1 ...

 Why do you ignore my emails (I see no reply to
 http://www.spinics.net/lists/linux-rdma/msg24194.html) and jump to
 this conclusion?

Sorry, I misread where things are, I was under the impression that all
was in place and expected you to pull things after you made this post
@ Wednesday http://www.spinics.net/lists/linux-rdma/msg24191.html, NM,
we're on track. Thanks a lot for driving the bits for the 4.1 merge
window.

 I will send a pull request for what is in Doug's tree (and also in my tree).

cool


Re: Stepping down as maintainer (was Re: [PATCH for-next 0/9] mlx4 changes in virtual GID management)

2015-04-17 Thread Roland Dreier
On Apr 11, 2015 12:28 PM, Or Gerlitz gerlitz...@gmail.com wrote:

 If taking off the maintainer hat would get you some free/spare cycles,
 maybe you could resume posting in the digitalvampire blog and/or
 participate in the upstreamming of soft-RoCE driver? some folks here
 are working on this, so if you're interested, let me know...

Thank you very much for the direction.  I'm OK figuring out what to do
for myself though.


Re: [PATCH for-4.1 2/4] iw_cxgb4: 32b platform fixes

2015-04-17 Thread Doug Ledford
On Fri, 2015-04-17 at 20:35 +0530, Hariprasad Shenai wrote:
 - get_dma_mr() was using ~0UL, which should be ~0ULL.  This causes the
 DMA MR to get set up incorrectly in hardware.
 
 - wr_log_show() needed a 64b divide function, div64_u64(), instead of
 doing division directly.
 
 - fixed warnings about recasting a pointer to a u64
 
 Signed-off-by: Steve Wise sw...@opengridcomputing.com
 Signed-off-by: Hariprasad Shenai haripra...@chelsio.com
 ---
  drivers/infiniband/hw/cxgb4/cm.c |  2 +-
  drivers/infiniband/hw/cxgb4/cq.c |  6 +++---
  drivers/infiniband/hw/cxgb4/device.c |  6 +++---
  drivers/infiniband/hw/cxgb4/mem.c| 10 +-
  drivers/infiniband/hw/cxgb4/qp.c | 14 +++---
  5 files changed, 19 insertions(+), 19 deletions(-)
 
 diff --git a/drivers/infiniband/hw/cxgb4/cm.c 
 b/drivers/infiniband/hw/cxgb4/cm.c
 index 6ed5025..636fe84 100644
 --- a/drivers/infiniband/hw/cxgb4/cm.c
 +++ b/drivers/infiniband/hw/cxgb4/cm.c
 @@ -3571,7 +3571,7 @@ static void send_fw_pass_open_req(struct c4iw_dev *dev, 
 struct sk_buff *skb,
* TP will ignore any value  0 for MSS index.
*/
   req-tcb.opt0 = cpu_to_be64(MSS_IDX_V(0xF));
 - req-cookie = (unsigned long)skb;
 + req-cookie = (u64)(unsigned long)skb;

Wouldn't uintptr be a better option here?  Storing a pointer in an int
(or vice versa) is exactly the kind of thing it was created to handle in
the first place.
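
The uintptr suggestion above can be illustrated in userspace C with `uintptr_t` from `<stdint.h>`. This is a sketch under stated assumptions: the cookie is taken to be a plain 64-bit integer, and `demo_ptr_to_cookie()` / `demo_cookie_to_ptr()` are hypothetical helpers, not functions from the driver.

```c
#include <stdint.h>

/*
 * Store a pointer in a 64-bit cookie. Converting through uintptr_t
 * first avoids a pointer-to-integer cast of mismatched size on 32b
 * platforms; the widening to 64 bits is then an ordinary integer
 * conversion.
 */
uint64_t demo_ptr_to_cookie(void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer from the cookie, again via uintptr_t. */
void *demo_cookie_to_ptr(uint64_t cookie)
{
	return (void *)(uintptr_t)cookie;
}
```

A round trip through both helpers returns the original pointer on 32b and 64b builds alike.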

   set_wr_txq(req_skb, CPL_PRIORITY_CONTROL, port_id);
   ret = cxgb4_ofld_send(dev-rdev.lldi.ports[0], req_skb);
 diff --git a/drivers/infiniband/hw/cxgb4/cq.c 
 b/drivers/infiniband/hw/cxgb4/cq.c
 index ab7692a..a0358b1 100644
 --- a/drivers/infiniband/hw/cxgb4/cq.c
 +++ b/drivers/infiniband/hw/cxgb4/cq.c
 @@ -55,7 +55,7 @@ static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq 
 *cq,
   FW_RI_RES_WR_NRES_V(1) |
   FW_WR_COMPL_F);
   res_wr-len16_pkd = cpu_to_be32(DIV_ROUND_UP(wr_len, 16));
 - res_wr-cookie = (unsigned long) wr_wait;
 + res_wr-cookie = (u64)(unsigned long)wr_wait;
   res = res_wr-res;
   res-u.cq.restype = FW_RI_RES_TYPE_CQ;
   res-u.cq.op = FW_RI_RES_OP_RESET;
 @@ -125,7 +125,7 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq 
 *cq,
   FW_RI_RES_WR_NRES_V(1) |
   FW_WR_COMPL_F);
   res_wr-len16_pkd = cpu_to_be32(DIV_ROUND_UP(wr_len, 16));
 - res_wr-cookie = (unsigned long) wr_wait;
 + res_wr-cookie = (u64)(unsigned long)wr_wait;
   res = res_wr-res;
   res-u.cq.restype = FW_RI_RES_TYPE_CQ;
   res-u.cq.op = FW_RI_RES_OP_WRITE;
 @@ -970,7 +970,7 @@ struct ib_cq *c4iw_create_cq(struct ib_device *ibdev, int 
 entries,
   }
   PDBG(%s cqid 0x%0x chp %p size %u memsize %zu, dma_addr 0x%0llx\n,
__func__, chp-cq.cqid, chp, chp-cq.size,
 -  chp-cq.memsize,
 +  (unsigned long)chp-cq.memsize,
(unsigned long long) chp-cq.dma_addr);
   return chp-ibcq;
  err5:
 diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
 index 8fb295e..5f7bb78 100644
 --- a/drivers/infiniband/hw/cxgb4/device.c
 +++ b/drivers/infiniband/hw/cxgb4/device.c
 @@ -151,7 +151,7 @@ static int wr_log_show(struct seq_file *seq, void *v)
   int prev_ts_set = 0;
   int idx, end;
  
 -#define ts2ns(ts) div64_ul((ts) * dev->rdev.lldi.cclk_ps, 1000)
 +#define ts2ns(ts) div64_u64((ts) * dev->rdev.lldi.cclk_ps, 1000)
  
   idx = atomic_read(&dev->rdev.wr_log_idx) &
   (dev->rdev.wr_log_size - 1);
 @@ -784,10 +784,10 @@ static int c4iw_rdev_open(struct c4iw_rdev *rdev)
    rdev->lldi.vr->qp.size,
    rdev->lldi.vr->cq.start,
    rdev->lldi.vr->cq.size);
 - PDBG("udb len 0x%x udb base %llx db_reg %p gts_reg %p qpshift %lu "
 + PDBG("udb len 0x%x udb base %p db_reg %p gts_reg %p qpshift %lu "
    "qpmask 0x%x cqshift %lu cqmask 0x%x\n",
    (unsigned)pci_resource_len(rdev->lldi.pdev, 2),
 -  (u64)pci_resource_start(rdev->lldi.pdev, 2),
 +  (void *)(unsigned long)pci_resource_start(rdev->lldi.pdev, 2),
    rdev->lldi.db_reg,
    rdev->lldi.gts_reg,
    rdev->qpshift, rdev->qpmask,
 diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
 index 6791fd1..30db971 100644
 --- a/drivers/infiniband/hw/cxgb4/mem.c
 +++ b/drivers/infiniband/hw/cxgb4/mem.c
 @@ -144,7 +144,7 @@ static int _c4iw_write_mem_inline(struct c4iw_rdev *rdev, u32 addr, u32 len,
   if (i == (num_wqe-1)) {
   req->wr.wr_hi = cpu_to_be32(FW_WR_OP_V(FW_ULPTX_WR) |
   FW_WR_COMPL_F);
 - req->wr.wr_lo = (__force __be64)(unsigned long) &wr_wait;
 + req->wr.wr_lo = (__force __be64)(unsigned long)&wr_wait;
   } else
   req->wr.wr_hi = cpu_to_be32(FW_WR_OP_V(FW_ULPTX_WR));

Re: Stepping down as maintainer (was Re: [PATCH for-next 0/9] mlx4 changes in virtual GID management)

2015-04-17 Thread Roland Dreier
On Fri, Apr 17, 2015 at 4:15 AM, Or Gerlitz gerlitz...@gmail.com wrote:
 Seems like Roland isn't around to carry the rdma pull request for 4.1 ...

Why do you ignore my emails (I see no reply to
http://www.spinics.net/lists/linux-rdma/msg24194.html) and jump to
this conclusion?

I will send a pull request for what is in Doug's tree (and also in my tree).


Re: [PATCH V1 net-next] IB/ipoib: Fix ndo_get_iflink

2015-04-17 Thread David Miller
From: Erez Shitrit ere...@mellanox.com
Date: Thu, 16 Apr 2015 16:34:34 +0300

 Currently, iflink of the parent interface was always accessed, even 
 when interface didn't have a parent and hence we crashed there.
 
 Handle the interface types properly: for a child interface, return
 the ifindex of the parent, for parent interface, return its ifindex.
 
 For child devices, make sure to set the parent pointer prior to
 invoking register_netdevice(), this allows the new ndo to be called
 by the stack immediately after the child device is registered.
 
 Fixes: 5aa7add8f14b ('infiniband/ipoib: implement ndo_get_iflink')
 Reported-by: Honggang Li ho...@redhat.com
 Signed-off-by: Erez Shitrit ere...@mellanox.com
 Signed-off-by: Honggang Li ho...@redhat.com

Applied, thanks.
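
The rule described in the quoted commit message (a child interface reports its parent's ifindex, a parent reports its own) can be illustrated with a minimal user-space sketch. The struct and function names here are hypothetical stand-ins, not the ipoib driver's types, and the real fix also depends on setting the parent pointer before register_netdevice() is called.

```c
#include <stddef.h>

/* Hypothetical stand-in for a network device; not the kernel's net_device. */
struct iface {
	int ifindex;
	const struct iface *parent;	/* NULL for a parent (top-level) interface */
};

/*
 * Return the link index for an interface: the parent's ifindex for a
 * child, the interface's own ifindex otherwise. Checking the parent
 * pointer first avoids dereferencing NULL on a parent interface.
 */
int iface_get_iflink(const struct iface *dev)
{
	if (dev->parent)
		return dev->parent->ifindex;
	return dev->ifindex;
}
```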


[PATCH for-4.1 1/3] RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the remote connecting peer

2015-04-17 Thread Tatyana Nikolova
This patch series was first submitted for inclusion upstream on 10/13/14, but
it seems to have slipped through the cracks.

Add functionality to allow the port mapper to provide its client with
the actual (non-mapped) IP/TCP address information of the remote connecting peer:

1) Add remote_info_cb() to process the address info of
   the remote peer when it initiates the connection
2) Add a hash list to store the remote address info
3) Add functionality to add/remove the remote address info.
   After the info has been provided to the port mapper client,
   it is removed from the hash list.


The port mapper client needs to know the actual (non-mapped) address info of
the remote peer for each connection, because it reports that info to the
user space application, which isn't aware of the port mapper service.
(The clients already know the local non-mapped address info.)
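
As a rough illustration of the store-then-remove behaviour described above, here is a minimal user-space sketch of a chained hash list keyed by the mapped local and mapped remote addresses. It is not the patch's code: the IPv4-only struct endpoint, the bucket count, the toy hash, and the names remote_info_add() and remote_info_take() are all assumptions made for the example, and it omits the locking a kernel implementation would need.

```c
#include <stdint.h>
#include <stdlib.h>

#define BUCKETS 64	/* illustrative size, not taken from the patch */

/* Hypothetical IPv4-only endpoint; the patch stores struct sockaddr_storage. */
struct endpoint {
	uint32_t addr;
	uint16_t port;
};

struct remote_info {
	struct endpoint mapped_loc;	/* lookup key, part 1 */
	struct endpoint mapped_rem;	/* lookup key, part 2 */
	struct endpoint remote;		/* actual (non-mapped) peer address */
	struct remote_info *next;	/* bucket chain */
};

static struct remote_info *table[BUCKETS];

/* Toy hash that mixes both addresses and both ports. */
static unsigned int bucket_of(const struct endpoint *loc,
			      const struct endpoint *rem)
{
	uint32_t h = loc->addr ^ rem->addr ^
		     ((uint32_t)loc->port << 16) ^ rem->port;

	return h % BUCKETS;
}

static int same_endpoint(const struct endpoint *a, const struct endpoint *b)
{
	return a->addr == b->addr && a->port == b->port;
}

/* Store one entry. Returns 0 on success, -1 if allocation fails. */
int remote_info_add(struct endpoint loc, struct endpoint rem,
		    struct endpoint actual)
{
	struct remote_info *info = calloc(1, sizeof(*info));
	unsigned int b;

	if (!info)
		return -1;
	info->mapped_loc = loc;
	info->mapped_rem = rem;
	info->remote = actual;
	b = bucket_of(&loc, &rem);
	info->next = table[b];
	table[b] = info;
	return 0;
}

/*
 * One-shot lookup: copy the actual address out, then unlink and free the
 * entry so it cannot be handed out twice. Returns 0 if found, -1 if not.
 */
int remote_info_take(struct endpoint loc, struct endpoint rem,
		     struct endpoint *out)
{
	struct remote_info **pp = &table[bucket_of(&loc, &rem)];

	for (; *pp; pp = &(*pp)->next) {
		struct remote_info *info = *pp;

		if (same_endpoint(&info->mapped_loc, &loc) &&
		    same_endpoint(&info->mapped_rem, &rem)) {
			*out = info->remote;
			*pp = info->next;
			free(info);
			return 0;
		}
	}
	return -1;
}
```

Removing the entry on the first successful lookup keeps the list from growing with stale connection info, at the cost that a second query for the same connection finds nothing.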


Signed-off-by: Tatyana Nikolova tatyana.e.nikol...@intel.com
Reviewed-by: Steve Wise sw...@opengridcomputing.com
---
 drivers/infiniband/core/iwpm_msg.c  |   73 -
 drivers/infiniband/core/iwpm_util.c |  208 +--
 drivers/infiniband/core/iwpm_util.h |   15 +++
 include/rdma/iw_portmap.h   |   25 
 include/uapi/rdma/rdma_netlink.h|1 +
 5 files changed, 288 insertions(+), 34 deletions(-)

diff --git a/drivers/infiniband/core/iwpm_msg.c b/drivers/infiniband/core/iwpm_msg.c
index b85ddbc..ab08170 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -468,7 +468,8 @@ add_mapping_response_exit:
 }
 EXPORT_SYMBOL(iwpm_add_mapping_cb);
 
-/* netlink attribute policy for the response to add and query mapping request */
+/* netlink attribute policy for the response to add and query mapping request
+ * and response with remote address info */
 static const struct nla_policy resp_query_policy[IWPM_NLA_RQUERY_MAPPING_MAX] = {
    [IWPM_NLA_QUERY_MAPPING_SEQ]  = { .type = NLA_U32 },
    [IWPM_NLA_QUERY_LOCAL_ADDR]   = { .len = sizeof(struct sockaddr_storage) },
@@ -559,6 +560,76 @@ query_mapping_response_exit:
 }
 EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
 
+/*
+ * iwpm_remote_info_cb - Process a port mapper message, containing
+ *   the remote connecting peer address info
+ */
+int iwpm_remote_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
+{
+   struct nlattr *nltb[IWPM_NLA_RQUERY_MAPPING_MAX];
+   struct sockaddr_storage *local_sockaddr, *remote_sockaddr;
+   struct sockaddr_storage *mapped_loc_sockaddr, *mapped_rem_sockaddr;
+   struct iwpm_remote_info *rem_info;
+   const char *msg_type;
+   u8 nl_client;
+   int ret = -EINVAL;
+
+   msg_type = "Remote Mapping info";
+   if (iwpm_parse_nlmsg(cb, IWPM_NLA_RQUERY_MAPPING_MAX,
+   resp_query_policy, nltb, msg_type))
+   return ret;
+
+   nl_client = RDMA_NL_GET_CLIENT(cb->nlh->nlmsg_type);
+   if (!iwpm_valid_client(nl_client)) {
+   pr_info("%s: Invalid port mapper client = %d\n",
+   __func__, nl_client);
+   return ret;
+   }
+   atomic_set(&echo_nlmsg_seq, cb->nlh->nlmsg_seq);
+
+   local_sockaddr = (struct sockaddr_storage *)
+   nla_data(nltb[IWPM_NLA_QUERY_LOCAL_ADDR]);
+   remote_sockaddr = (struct sockaddr_storage *)
+   nla_data(nltb[IWPM_NLA_QUERY_REMOTE_ADDR]);
+   mapped_loc_sockaddr = (struct sockaddr_storage *)
+   nla_data(nltb[IWPM_NLA_RQUERY_MAPPED_LOC_ADDR]);
+   mapped_rem_sockaddr = (struct sockaddr_storage *)
+   nla_data(nltb[IWPM_NLA_RQUERY_MAPPED_REM_ADDR]);
+
+   if (mapped_loc_sockaddr->ss_family != local_sockaddr->ss_family ||
+   mapped_rem_sockaddr->ss_family != remote_sockaddr->ss_family) {
+   pr_info("%s: Sockaddr family doesn't match the requested one\n",
+   __func__);
+   return ret;
+   }
+   rem_info = kzalloc(sizeof(struct iwpm_remote_info), GFP_ATOMIC);
+   if (!rem_info) {
+   pr_err("%s: Unable to allocate a remote info\n", __func__);
+   ret = -ENOMEM;
+   return ret;
+   }
+   memcpy(&rem_info->mapped_loc_sockaddr, mapped_loc_sockaddr,
+  sizeof(struct sockaddr_storage));
+   memcpy(&rem_info->remote_sockaddr, remote_sockaddr,
+  sizeof(struct sockaddr_storage));
+   memcpy(&rem_info->mapped_rem_sockaddr, mapped_rem_sockaddr,
+  sizeof(struct sockaddr_storage));
+   rem_info->nl_client = nl_client;
+
+   iwpm_add_remote_info(rem_info);
+
+   iwpm_print_sockaddr(local_sockaddr,
+   "remote_info: Local sockaddr:");
+   iwpm_print_sockaddr(mapped_loc_sockaddr,
+   "remote_info: Mapped local sockaddr:");
+   iwpm_print_sockaddr(remote_sockaddr,
+   

[PATCH for-4.1 3/3] RDMA/cxgb4: Report the actual address of the remote connecting peer

2015-04-17 Thread Tatyana Nikolova
From: Steve Wise sw...@opengridcomputing.com

When iWARP port mapping is being done, the passive side of a connection
only knows the mapped address/port of the peer.  So now query the IWPM
to get the actual address/port of the peer.

Also set up the passive side endpoint to correctly display the actual
and mapped addresses for the new connection.
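
The fallback behaviour described here (start from the mapped peer address, then replace it with the actual one if the port mapper knows it) can be sketched in user-space C roughly as follows. The types, the callback signature, and the two sample lookups are hypothetical and exist only to make the example self-contained; the driver itself calls iwpm_get_remote_info() on struct sockaddr_storage values.

```c
#include <stdint.h>

/* Hypothetical IPv4-only endpoint used only for this illustration. */
struct endpoint {
	uint32_t addr;
	uint16_t port;
};

struct conn {
	struct endpoint mapped_local;	/* what is seen on the wire, local side */
	struct endpoint mapped_remote;	/* what is seen on the wire, peer side */
	struct endpoint remote;		/* what is reported to the application */
};

/* Hypothetical lookup: returns 0 and fills *actual when the peer is known. */
typedef int (*lookup_fn)(const struct endpoint *mapped_loc,
			 const struct endpoint *mapped_rem,
			 struct endpoint *actual);

/*
 * Default the reported remote address to the mapped one, then overwrite
 * it only if the lookup succeeds. If no mapping was done, the lookup
 * fails and remote stays equal to mapped_remote.
 */
int resolve_remote(struct conn *c, lookup_fn lookup)
{
	struct endpoint actual;
	int ret;

	c->remote = c->mapped_remote;
	ret = lookup(&c->mapped_local, &c->mapped_remote, &actual);
	if (ret == 0)
		c->remote = actual;
	return ret;
}

/* Sample lookup that always misses, as when no mapping exists. */
int lookup_miss(const struct endpoint *mapped_loc,
		const struct endpoint *mapped_rem, struct endpoint *actual)
{
	(void)mapped_loc;
	(void)mapped_rem;
	(void)actual;
	return -1;
}

/* Sample lookup that always reports a fixed actual address. */
int lookup_hit(const struct endpoint *mapped_loc,
	       const struct endpoint *mapped_rem, struct endpoint *actual)
{
	(void)mapped_loc;
	(void)mapped_rem;
	actual->addr = 0x0a000063u;
	actual->port = 4321;
	return 0;
}
```

Copying the mapped address first means a failed lookup is not fatal: the connection still has a usable, if mapped, peer address to display.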

Signed-off-by: Steve Wise sw...@opengridcomputing.com
---

 drivers/infiniband/hw/cxgb4/cm.c |   54 +++---
 drivers/infiniband/hw/cxgb4/device.c |1 +
 2 files changed, 51 insertions(+), 4 deletions(-)


diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index c2fb71c..f6c7b32 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -580,6 +580,22 @@ static void c4iw_record_pm_msg(struct c4iw_ep *ep,
 sizeof(ep->com.mapped_remote_addr));
 }
 
+static int get_remote_addr(struct c4iw_ep *ep)
+{
+   int ret;
+
+   print_addr(&ep->com, __func__, "get_remote_addr");
+
+   ret = iwpm_get_remote_info(&ep->com.mapped_local_addr,
+  &ep->com.mapped_remote_addr,
+  &ep->com.remote_addr, RDMA_NL_C4IW);
+   if (ret)
+   pr_info(MOD "Unable to find remote peer addr info - err %d\n",
+   ret);
+
+   return ret;
+}
+
 static void best_mtu(const unsigned short *mtus, unsigned short mtu,
 unsigned int *idx, int use_ts)
 {
@@ -2343,27 +2359,57 @@ static int pass_accept_req(struct c4iw_dev *dev, struct sk_buff *skb)
 state_set(&child_ep->com, CONNECTING);
 child_ep->com.dev = dev;
 child_ep->com.cm_id = NULL;
+
+   /*
+* The mapped_local and mapped_remote addresses get setup with
+* the actual 4-tuple.  The local address will be based on the
+* actual local address of the connection, but on the port number
+* of the parent listening endpoint.  The remote address is
+* setup based on a query to the IWPM since we don't know what it
+* originally was before mapping.  If no mapping was done, then
+* mapped_remote == remote, and mapped_local == local.
+*/
 if (iptype == 4) {
 struct sockaddr_in *sin = (struct sockaddr_in *)
-   &child_ep->com.local_addr;
+   &child_ep->com.mapped_local_addr;
+
 sin->sin_family = PF_INET;
 sin->sin_port = local_port;
 sin->sin_addr.s_addr = *(__be32 *)local_ip;
-   sin = (struct sockaddr_in *)&child_ep->com.remote_addr;
+
+   sin = (struct sockaddr_in *)&child_ep->com.local_addr;
+   sin->sin_family = PF_INET;
+   sin->sin_port = ((struct sockaddr_in *)
+&parent_ep->com.local_addr)->sin_port;
+   sin->sin_addr.s_addr = *(__be32 *)local_ip;
+
+   sin = (struct sockaddr_in *)&child_ep->com.mapped_remote_addr;
 sin->sin_family = PF_INET;
 sin->sin_port = peer_port;
 sin->sin_addr.s_addr = *(__be32 *)peer_ip;
 } else {
 struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)
-   &child_ep->com.local_addr;
+   &child_ep->com.mapped_local_addr;
+
 sin6->sin6_family = PF_INET6;
 sin6->sin6_port = local_port;
 memcpy(sin6->sin6_addr.s6_addr, local_ip, 16);
-   sin6 = (struct sockaddr_in6 *)&child_ep->com.remote_addr;
+
+   sin6 = (struct sockaddr_in6 *)&child_ep->com.local_addr;
+   sin6->sin6_family = PF_INET6;
+   sin6->sin6_port = ((struct sockaddr_in6 *)
+  &parent_ep->com.local_addr)->sin6_port;
+   memcpy(sin6->sin6_addr.s6_addr, local_ip, 16);
+
+   sin6 = (struct sockaddr_in6 *)&child_ep->com.mapped_remote_addr;
 sin6->sin6_family = PF_INET6;
 sin6->sin6_port = peer_port;
 memcpy(sin6->sin6_addr.s6_addr, peer_ip, 16);
 }
+   memcpy(&child_ep->com.remote_addr, &child_ep->com.mapped_remote_addr,
+  sizeof(child_ep->com.remote_addr));
+   get_remote_addr(child_ep);
+
 c4iw_get_ep(&parent_ep->com);
 child_ep->parent_ep = parent_ep;
 child_ep->tos = GET_POPEN_TOS(ntohl(req->tos_stid));
diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
index 72f1f05..4c0f238 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -93,6 +93,7 @@ static struct ibnl_client_cbs c4iw_nl_cb_table[] = {
[RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
[RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = iwpm_add_and_query_mapping_cb},
[RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb},
+   [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb},

[PATCH for-4.1 2/3] RDMA/nes: Report the actual address of the remote connecting peer

2015-04-17 Thread Tatyana Nikolova
Get the actual (non-mapped) IP/TCP address of the remote connecting peer
from the port mapper and report the address info to the user space application
at the time of connection establishment.
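
The patch that follows converts between the sockaddr form used by the port mapper (network byte order) and the host-order address and port kept in the nes connection state. A minimal user-space sketch of that round trip, using only standard POSIX socket headers, might look like this; sockaddr_to_host() and host_to_sockaddr() are hypothetical names chosen for the example, and like the driver helper the sketch handles IPv4 only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Extract a host-order IPv4 address and port from a sockaddr_storage.
 * Returns 0 for AF_INET input, -1 (outputs untouched) for anything else.
 */
int sockaddr_to_host(const struct sockaddr_storage *ss,
		     uint32_t *ip, uint16_t *port)
{
	struct sockaddr_in sin;

	if (ss->ss_family != AF_INET)
		return -1;
	memcpy(&sin, ss, sizeof(sin));
	*ip = ntohl(sin.sin_addr.s_addr);
	*port = ntohs(sin.sin_port);
	return 0;
}

/* Inverse direction: build an AF_INET sockaddr from host-order values. */
void host_to_sockaddr(uint32_t ip, uint16_t port,
		      struct sockaddr_storage *ss)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(ip);
	sin.sin_port = htons(port);
	memset(ss, 0, sizeof(*ss));
	memcpy(ss, &sin, sizeof(sin));
}
```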

Signed-off-by: Tatyana Nikolova tatyana.e.nikol...@intel.com
---
 drivers/infiniband/hw/nes/nes.c|1 +
 drivers/infiniband/hw/nes/nes_cm.c |   65 ++-
 2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/nes/nes.c b/drivers/infiniband/hw/nes/nes.c
index 3b2a6dc..9f9d5c5 100644
--- a/drivers/infiniband/hw/nes/nes.c
+++ b/drivers/infiniband/hw/nes/nes.c
@@ -116,6 +116,7 @@ static struct ibnl_client_cbs nes_nl_cb_table[] = {
[RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
[RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
[RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = iwpm_add_and_query_mapping_cb},
+   [RDMA_NL_IWPM_REMOTE_INFO] = {.dump = iwpm_remote_info_cb},
[RDMA_NL_IWPM_HANDLE_ERR] = {.dump = iwpm_mapping_error_cb},
[RDMA_NL_IWPM_MAPINFO] = {.dump = iwpm_mapping_info_cb},
[RDMA_NL_IWPM_MAPINFO_NUM] = {.dump = iwpm_ack_mapping_info_cb}
diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 6f09a72..785c2fa 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -596,27 +596,52 @@ static void nes_form_reg_msg(struct nes_vnic *nesvnic,
 memcpy(pm_msg->if_name, nesvnic->netdev->name, IWPM_IFNAME_SIZE);
 }
 
+static void record_sockaddr_info(struct sockaddr_storage *addr_info,
+   nes_addr_t *ip_addr, u16 *port_num)
+{
+   struct sockaddr_in *in_addr = (struct sockaddr_in *)addr_info;
+
+   if (in_addr->sin_family == AF_INET) {
+   *ip_addr = ntohl(in_addr->sin_addr.s_addr);
+   *port_num = ntohs(in_addr->sin_port);
+   }
+}
+
 /*
  * nes_record_pm_msg - Save the received mapping info
  */
 static void nes_record_pm_msg(struct nes_cm_info *cm_info,
struct iwpm_sa_data *pm_msg)
 {
-   struct sockaddr_in *mapped_loc_addr =
-   (struct sockaddr_in *)&pm_msg->mapped_loc_addr;
-   struct sockaddr_in *mapped_rem_addr =
-   (struct sockaddr_in *)&pm_msg->mapped_rem_addr;
-
-   if (mapped_loc_addr->sin_family == AF_INET) {
-   cm_info->mapped_loc_addr =
-   ntohl(mapped_loc_addr->sin_addr.s_addr);
-   cm_info->mapped_loc_port = ntohs(mapped_loc_addr->sin_port);
-   }
-   if (mapped_rem_addr->sin_family == AF_INET) {
-   cm_info->mapped_rem_addr =
-   ntohl(mapped_rem_addr->sin_addr.s_addr);
-   cm_info->mapped_rem_port = ntohs(mapped_rem_addr->sin_port);
-   }
+   record_sockaddr_info(&pm_msg->mapped_loc_addr,
+   &cm_info->mapped_loc_addr, &cm_info->mapped_loc_port);
+
+   record_sockaddr_info(&pm_msg->mapped_rem_addr,
+   &cm_info->mapped_rem_addr, &cm_info->mapped_rem_port);
+}
+
+/*
+ * nes_get_reminfo - Get the address info of the remote connecting peer
+ */
+static int nes_get_remote_addr(struct nes_cm_node *cm_node)
+{
+   struct sockaddr_storage mapped_loc_addr, mapped_rem_addr;
+   struct sockaddr_storage remote_addr;
+   int ret;
+
+   nes_create_sockaddr(htonl(cm_node->mapped_loc_addr),
+   htons(cm_node->mapped_loc_port), &mapped_loc_addr);
+   nes_create_sockaddr(htonl(cm_node->mapped_rem_addr),
+   htons(cm_node->mapped_rem_port), &mapped_rem_addr);
+
+   ret = iwpm_get_remote_info(&mapped_loc_addr, &mapped_rem_addr,
+   &remote_addr, RDMA_NL_NES);
+   if (ret)
+   nes_debug(NES_DBG_CM, "Unable to find remote peer address info\n");
+   else
+   record_sockaddr_info(&remote_addr, &cm_node->rem_addr,
+   &cm_node->rem_port);
+   return ret;
 }
 
 /**
@@ -1566,9 +1591,14 @@ static struct nes_cm_node *make_cm_node(struct nes_cm_core *cm_core,
 return NULL;
 
 /* set our node specific transport info */
-   cm_node->loc_addr = cm_info->loc_addr;
+   if (listener) {
+   cm_node->loc_addr = listener->loc_addr;
+   cm_node->loc_port = listener->loc_port;
+   } else {
+   cm_node->loc_addr = cm_info->loc_addr;
+   cm_node->loc_port = cm_info->loc_port;
+   }
 cm_node->rem_addr = cm_info->rem_addr;
-   cm_node->loc_port = cm_info->loc_port;
 cm_node->rem_port = cm_info->rem_port;
 
 cm_node->mapped_loc_addr = cm_info->mapped_loc_addr;
@@ -2151,6 +2181,7 @@ static int handle_ack_pkt(struct nes_cm_node *cm_node, struct sk_buff *skb,
 cm_node->state = NES_CM_STATE_ESTABLISHED;
 if (datasize) {
 cm_node->tcp_cntxt.rcv_nxt = inc_sequence + datasize;
+