Re: [PATCH v2] opensm: bug in trap report for MC create(66) and delete(67) traps

2010-02-07 Thread Eli Dorfman (Voltaire)
Hal Rosenstock wrote:
 On Fri, Feb 5, 2010 at 9:18 AM, Eli Dorfman dorfman@gmail.com wrote:
 On Thu, Feb 4, 2010 at 10:52 PM, Hal Rosenstock
 hal.rosenst...@gmail.com wrote:
 On Thu, Feb 4, 2010 at 12:43 PM, Eli Dorfman (Voltaire)
 dorfman@gmail.com wrote:
 Subject: [PATCH] Wrong handling of MC create and delete traps

 For these traps the GID in the data details is the MGID and
 not the source port gid.
 So the SM should check that subscriber port has the pkey of the MC group.
 There was also an error in comparing the subnet prefix and guid due to
 host/network order mismatch.

 Signed-off-by: Eli Dorfman e...@voltaire.com
 ---
  opensm/opensm/osm_inform.c |  151 
 ---
  1 files changed, 98 insertions(+), 53 deletions(-)

 diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c
 index 8108213..ae4fe71 100644
 --- a/opensm/opensm/osm_inform.c
 +++ b/opensm/opensm/osm_inform.c
 @@ -341,6 +341,103 @@ Exit:
return status;
  }

 +static int is_access_permitted( osm_infr_t *p_infr_rec,
 +   osm_infr_match_ctxt_t *p_infr_match )
 +{
 +   cl_list_t *p_infr_to_remove_list = 
 p_infr_match-p_remove_infr_list;
 +   ib_inform_info_t *p_ii = (p_infr_rec-inform_record.inform_info);
 +   ib_mad_notice_attr_t *p_ntc = p_infr_match-p_ntc;
 +   uint16_t trap_num = cl_ntoh16(p_ntc-g_or_v.generic.trap_num);
 +   osm_subn_t *p_subn = p_infr_rec-sa-p_subn;
 +   osm_log_t *p_log = p_infr_rec-sa-p_log;
 +   char gid_str[INET6_ADDRSTRLEN];
 +   osm_mgrp_t *p_mgrp;
 +   ib_gid_t source_gid;
 +   osm_port_t *p_src_port;
 +   osm_port_t *p_dest_port;
 +
 +   /* In case of GID_IN(64) or GID_OUT(65) traps the source gid
 +  comparison should be done on the trap source (saved as the gid 
 in the
 +  data details field).
 +  For traps MC_CREATE(66) or MC_DELETE(67) the data details gid is
 +  the MGID. We need to check whether subscriber has the pky of

   typo  

   pkey


 +  the MC group.
 Shouldn't this be the subscriber has a compatible pkey with MC group ?
 The MC group has a full member PKey and the members can be full or
 limited.
 I accept the correction.
 
 Doesn't this require a code change for handling trap cases 66-67 ?

I think that you referred to the comment since the code is handling this 
properly (in my opinion).

 
 Sasha, can you please change this in the commit (only if there are not
 other remarks).
 
 Is that what you are asking Sasha to do (beyond the typos) ?

I asked Sasha to fix only the typo in commit.

 
 BTW, there is no explicit reference in the IB spec for MC group
 create/delete trap (at least I didn't find it).
 
 Not sure what you mean by this. What didn't you find ?

in the spec see o13-17.1.2

Thanks,
Eli
 
 -- Hal
 
 +  In all other cases the issuer gis is the trap source.
   typo  ^^^
   gid

 and this typo of course.

 Thanks,
 Eli
 -- Hal

 +   */
 +   if (trap_num = 64  trap_num = 67 )
 +   /* The issuer of these traps is the SM so source_gid
 +  is the gid saved on the data details */
 +   source_gid = p_ntc-data_details.ntc_64_67.gid;
 +   else
 +   source_gid = p_ntc-issuer_gid;
 +
 +   p_dest_port =
 +   cl_ptr_vector_get(p_subn-port_lid_tbl,
 + cl_ntoh16(p_infr_rec-report_addr.dest_lid));
 +   if (!p_dest_port) {
 +   OSM_LOG(p_log, OSM_LOG_INFO,
 +   Cannot find destination port with LID:%u\n,
 +   cl_ntoh16(p_infr_rec-report_addr.dest_lid));
 +   goto Exit;
 +   }
 +
 +   switch (trap_num) {
 +   case 66:
 +   case 67:
 +   p_mgrp = osm_get_mgrp_by_mgid(p_subn, source_gid);
 +   if (!p_mgrp) {
 +   OSM_LOG(p_log, OSM_LOG_INFO,
 +   Cannot find MGID %s\n,
 +   inet_ntop(AF_INET6, 
 source_gid.raw, gid_str, sizeof gid_str));
 +   goto Exit;
 +   }
 +
 +   if (!osm_physp_has_pkey(p_log,
 +   p_mgrp-mcmember_rec.pkey,
 +   p_dest_port-p_physp)) {
 +   OSM_LOG(p_log, OSM_LOG_INFO,
 +   MGID %s and port GUID:0x%016 
 PRIx64  do not share same pkey\n,
 +   inet_ntop(AF_INET6, 
 source_gid.raw, gid_str, sizeof gid_str),
 +   cl_ntoh64(p_dest_port-guid));
 +   goto Exit;
 +   }
 +   

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Tziporet Koren

On 2/7/2010 3:22 AM, Steve Wise wrote:
   


Good catch, I'll update the patch and submit for 2.6.33 on Monday.


   

NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1.

   

If this patch will be accepted to the kernel 2.6.33 we can take it too

Tziporet
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/8] ib/iser: major face lift of the data path code

2010-02-07 Thread Or Gerlitz
Vladislav Bolkhovitin wrote:
 Or Gerlitz wrote:

 From where did you get those latency numbers?

read iostat(8), you'll see that await is The average time (in milliseconds) 
for I/O requests issued to the device to be served

 what kind of test did you do?

I connected a Linux box through iser to a target exposing four NULL luns, 
i.e luns which are not backed by real storage. I run four copies of disktest
each over a different lun, where each was reading 1k blocks, etc. 

Or.


--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise

Tziporet Koren wrote:

On 2/7/2010 3:22 AM, Steve Wise wrote:
  


Good catch, I'll update the patch and submit for 2.6.33 on Monday.


   

NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1.

   

If this patch will be accepted to the kernel 2.6.33 we can take it too

If ofed-1.5.1 is based on 2.6.33 then it will get this patch 
automatically (assuming it goes upstream and makes 2.6.33).  Or we can 
pull it in as a kernel_patches/fixes/ patch.


My point, though, is that even with this patch in ofed-1.5.1, we still 
have an openmpi/IB/rdmacm regression.  The only way to avoid this 
regression without changing openmpi is to disallow _all_ rdma binds to 
127.0.0.1.


Steve.





Tziporet


--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Roland Dreier
  My point, though, is that even with this patch in ofed-1.5.1, we still
  have an openmpi/IB/rdmacm regression.  The only way to avoid this
  regression without changing openmpi is to disallow _all_ rdma binds to
  127.0.0.1.

Can you identify the source of the regression?  ie what was the change
that broke things?

I'm most concerned that there is another regression in 2.6.33, and if so
I would like to try and avoid letting that get into the final release.
-- 
Roland Dreier rola...@cisco.com
Cisco.com - http://www.cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise

Roland Dreier wrote:

  My point, though, is that even with this patch in ofed-1.5.1, we still
  have an openmpi/IB/rdmacm regression.  The only way to avoid this
  regression without changing openmpi is to disallow _all_ rdma binds to
  127.0.0.1.

Can you identify the source of the regression?  ie what was the change
that broke things?

  


It is the same commit you sited earlier.  It enables binding rdma cm_ids 
to 127.0.0.1.  Sean's proposed patch on top of that disables this only 
for iwarp devices.




I'm most concerned that there is another regression in 2.6.33, and if so
I would like to try and avoid letting that get into the final release.
  


--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Sean Hefty
Can you identify the source of the regression?  ie what was the change
that broke things?

My understanding is that support for loopback addresses exposes an existing bug
in openmpi.  It tries to bind to 127.0.0.1, which now succeeds.  Openmpi passes
that address to a remote node for use in connections.

I'm most concerned that there is another regression in 2.6.33, and if so
I would like to try and avoid letting that get into the final release.

Unless we never support loopback addresses, openmpi will see a regression.  The
only other problem that I'm aware of for 2.6.33 is that the bind to a loopback
address will succeed, even though the RDMA device may not support loopback.
This is true for the Chelsio and Ammasso drivers.  Connections should still
fail, but the bind is basically useless in this case.  I will try to get a patch
for that tomorrow.

- Sean

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [for-2.6.33] rdma/cm: disallow loopback address for iwarp devices

2010-02-07 Thread Sean Hefty
Since iWarp devices are not guaranteed to support loopback connections,
prevent rdma_bind_addr from associating the loopback address with
an iWarp device.

Signed-off-by: Sean Hefty sean.he...@intel.com
---
This includes feedback from Steve Wise based on the initial rfc patch.

Although this patch is needed to prevent binding to RDMA devices that
may not support loopback addressing, is also works around a bug in
openmpi using the loopback address to bind to an iwarp device.

This is not a perfect solution either, since it disable all iwarp devices.
The NES driver should be able to support loopback connections, though that
feature has never been tested.  We may need a per device attribute.
I will look at creating such a patch, but wanted to post this in case I
can't get that one done in the next couple of days. 

 drivers/infiniband/core/cma.c |   18 --
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc9b594..fe8b0c0 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1739,6 +1739,9 @@ err:
 }
 EXPORT_SYMBOL(rdma_resolve_route);
 
+/*
+ * Only IB devices are guaranteed to support loopback connections.
+ */
 static int cma_bind_loopback(struct rdma_id_private *id_priv)
 {
struct cma_device *cma_dev;
@@ -1753,14 +1756,19 @@ static int cma_bind_loopback(struct rdma_id_private 
*id_priv)
ret = -ENODEV;
goto out;
}
-   list_for_each_entry(cma_dev, dev_list, list)
+   list_for_each_entry(cma_dev, dev_list, list) {
+   if (rdma_node_get_transport(cma_dev-device-node_type) !=
+   RDMA_TRANSPORT_IB)
+   continue;
+
for (p = 1; p = cma_dev-device-phys_port_cnt; ++p)
if (!ib_query_port(cma_dev-device, p, port_attr) 
port_attr.state == IB_PORT_ACTIVE)
goto port_found;
+   }
 
-   p = 1;
-   cma_dev = list_entry(dev_list.next, struct cma_device, list);
+   ret = -ENODEV;
+   goto out;
 
 port_found:
ret = ib_get_cached_gid(cma_dev-device, p, 0, gid);
@@ -1771,9 +1779,7 @@ port_found:
if (ret)
goto out;
 
-   id_priv-id.route.addr.dev_addr.dev_type =
-   (rdma_node_get_transport(cma_dev-device-node_type) == 
RDMA_TRANSPORT_IB) ?
-   ARPHRD_INFINIBAND : ARPHRD_ETHER;
+   id_priv-id.route.addr.dev_addr.dev_type = ARPHRD_INFINIBAND;
 
rdma_addr_set_sgid(id_priv-id.route.addr.dev_addr, gid);
ib_addr_set_pkey(id_priv-id.route.addr.dev_addr, pkey);



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html