Re: [ewg] [PATCH] mlx4: remove limitation on LSO header size

2009-10-12 Thread Roland Dreier

 +   *blh = unlikely(halign  64) ? 1 : 0;

   This idiom of (boolean condition) ? 1 : 0 looks odd to me... doesn't
   (halign  64) already evaluate to 1 or 0 anyway?  Does the unlikely()
   actually affect code generation here?

  True, (halign  64) is the same and is cleaner. As for the unlikely()
  -- well it's already been there and besides, we're never sure if it
  will improve anything so the same question could be asked for other
  places in the code.

I was just wondering in this case where you are just assigning the
boolean value of the expression to a variable how unlikely affects
things.  I can understand for conditional jumps how the compiler can
choose to make the likely case more efficient, but when there are no
jumps then I was just curious how the hint could help.

   I assume this initialization is to avoid a compiler warning.  But the
   code is actually correct without initializing blh -- so I think that we
   save a tiny bit of code by doing uninitialized_var() instead?

  We must initialize blh since it is used for any send request and not
  just LSO opcodes. 

So then this patch was buggy because blh was not reinitialized as we
loop through multiple work requests?  eg an LSO request followed by a
non-LSO request?

Anyway I'll look over the newer patch.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] [PATCH] mlx4: remove limitation on LSO header size

2009-10-11 Thread Eli Cohen
On Wed, Oct 07, 2009 at 03:45:16PM -0700, Roland Dreier wrote:
 
   +  *blh = unlikely(halign  64) ? 1 : 0;
 
 This idiom of (boolean condition) ? 1 : 0 looks odd to me... doesn't
 (halign  64) already evaluate to 1 or 0 anyway?  Does the unlikely()
 actually affect code generation here?
True, (halign  64) is the same and is cleaner. As for the unlikely()
-- well it's already been there and besides, we're never sure if it
will improve anything so the same question could be asked for other
places in the code.


 
 With that said, see below...
 
   +  int blh = 0;
 
 I assume this initialization is to avoid a compiler warning.  But the
 code is actually correct without initializing blh -- so I think that we
 save a tiny bit of code by doing uninitialized_var() instead?
We must initialize blh since it is used for any send request and not
just LSO opcodes. 


 
   +  (blh ? cpu_to_be32(1  6) : 0);
 
 ...given that the only use of blh is as a flag to decide what constant
 to use here, does it generate better code to make blh be __be32 and set
 the value directly in build_lso_seg, ie do:
 
   *blh = unlikely(halign  64) ? cpu_to_be32(1  6) : 0;
 
 and then use blh without ?: in mlx4_ib_post_send...
 

So we can let build_lso_header() put the corrent value for blh and
unconditionally or it into owner_opcode.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] [PATCH] mlx4: remove limitation on LSO header size

2009-10-07 Thread Roland Dreier

  +*blh = unlikely(halign  64) ? 1 : 0;

This idiom of (boolean condition) ? 1 : 0 looks odd to me... doesn't
(halign  64) already evaluate to 1 or 0 anyway?  Does the unlikely()
actually affect code generation here?

With that said, see below...

  +int blh = 0;

I assume this initialization is to avoid a compiler warning.  But the
code is actually correct without initializing blh -- so I think that we
save a tiny bit of code by doing uninitialized_var() instead?

  +(blh ? cpu_to_be32(1  6) : 0);

...given that the only use of blh is as a flag to decide what constant
to use here, does it generate better code to make blh be __be32 and set
the value directly in build_lso_seg, ie do:

*blh = unlikely(halign  64) ? cpu_to_be32(1  6) : 0;

and then use blh without ?: in mlx4_ib_post_send...

 - R.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] [PATCH] mlx4: remove limitation on LSO header size

2009-09-30 Thread Eli Cohen
Current code has a limitation as for the size of an LSO header not allowed to
cross a 64 byte boundary. This patch removes this limitation by setting the WQE
RR for large headers thus allowing LSO headers of any size. The extra buffer
reserved for MLX4_IB_QP_LSO QPs has been doubled, from 64 to 128 bytes,
assuming this is reasonable upper limit to header length.
Also, this patch will cause IB_DEVICE_UD_TSO to be set only of FW versions that
set MLX4_DEV_CAP_FLAG_BLH; e.g. FW version 2.6.000 and higher.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/hw/mlx4/main.c |2 +-
 drivers/infiniband/hw/mlx4/qp.c   |   17 +++--
 include/linux/mlx4/device.h   |1 +
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 3cb3f47..e596537 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -103,7 +103,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props-device_cap_flags |= IB_DEVICE_UD_AV_PORT_ENFORCE;
if (dev-dev-caps.flags  MLX4_DEV_CAP_FLAG_IPOIB_CSUM)
props-device_cap_flags |= IB_DEVICE_UD_IP_CSUM;
-   if (dev-dev-caps.max_gso_sz)
+   if (dev-dev-caps.max_gso_sz  dev-dev-caps.flags  
MLX4_DEV_CAP_FLAG_BLH)
props-device_cap_flags |= IB_DEVICE_UD_TSO;
if (dev-dev-caps.bmme_flags  MLX4_BMME_FLAG_RESERVED_LKEY)
props-device_cap_flags |= IB_DEVICE_LOCAL_DMA_LKEY;
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 219b103..1b356cf 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -261,7 +261,7 @@ static int send_wqe_overhead(enum ib_qp_type type, u32 
flags)
case IB_QPT_UD:
return sizeof (struct mlx4_wqe_ctrl_seg) +
sizeof (struct mlx4_wqe_datagram_seg) +
-   ((flags  MLX4_IB_QP_LSO) ? 64 : 0);
+   ((flags  MLX4_IB_QP_LSO) ? 128 : 0);
case IB_QPT_UC:
return sizeof (struct mlx4_wqe_ctrl_seg) +
sizeof (struct mlx4_wqe_raddr_seg);
@@ -1467,16 +1467,11 @@ static void __set_data_seg(struct mlx4_wqe_data_seg 
*dseg, struct ib_sge *sg)
 
 static int build_lso_seg(struct mlx4_wqe_lso_seg *wqe, struct ib_send_wr *wr,
 struct mlx4_ib_qp *qp, unsigned *lso_seg_len,
-__be32 *lso_hdr_sz)
+__be32 *lso_hdr_sz, int *blh)
 {
unsigned halign = ALIGN(sizeof *wqe + wr-wr.ud.hlen, 16);
 
-   /*
-* This is a temporary limitation and will be removed in
-* a forthcoming FW release:
-*/
-   if (unlikely(halign  64))
-   return -EINVAL;
+   *blh = unlikely(halign  64) ? 1 : 0;
 
if (unlikely(!(qp-flags  MLX4_IB_QP_LSO) 
 wr-num_sge  qp-sq.max_gs - (halign  4)))
@@ -1523,6 +1518,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,
__be32 *lso_wqe;
__be32 uninitialized_var(lso_hdr_sz);
int i;
+   int blh = 0;
 
spin_lock_irqsave(qp-sq.lock, flags);
 
@@ -1616,7 +1612,7 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,
size += sizeof (struct mlx4_wqe_datagram_seg) / 16;
 
if (wr-opcode == IB_WR_LSO) {
-   err = build_lso_seg(wqe, wr, qp, seglen, 
lso_hdr_sz);
+   err = build_lso_seg(wqe, wr, qp, seglen, 
lso_hdr_sz, blh);
if (unlikely(err)) {
*bad_wr = wr;
goto out;
@@ -1687,7 +1683,8 @@ int mlx4_ib_post_send(struct ib_qp *ibqp, struct 
ib_send_wr *wr,
}
 
ctrl-owner_opcode = mlx4_ib_opcode[wr-opcode] |
-   (ind  qp-sq.wqe_cnt ? cpu_to_be32(1  31) : 0);
+   (ind  qp-sq.wqe_cnt ? cpu_to_be32(1  31) : 0) |
+   (blh ? cpu_to_be32(1  6) : 0);
 
stamp = ind + qp-sq_spare_wqes;
ind += DIV_ROUND_UP(size * 16, 1U  qp-sq.wqe_shift);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index ce7cc6c..e92d1bf 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -61,6 +61,7 @@ enum {
MLX4_DEV_CAP_FLAG_BAD_PKEY_CNTR = 1   8,
MLX4_DEV_CAP_FLAG_BAD_QKEY_CNTR = 1   9,
MLX4_DEV_CAP_FLAG_DPDP  = 1  12,
+   MLX4_DEV_CAP_FLAG_BLH   = 1  15,
MLX4_DEV_CAP_FLAG_MEM_WINDOW= 1  16,
MLX4_DEV_CAP_FLAG_APM   = 1  17,
MLX4_DEV_CAP_FLAG_ATOMIC= 1  18,
-- 
1.6.4.3

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg