Re: [PATCH v2] opensm: bug in trap report for MC create(66) and delete(67) traps
Hal Rosenstock wrote: On Fri, Feb 5, 2010 at 9:18 AM, Eli Dorfman dorfman@gmail.com wrote: On Thu, Feb 4, 2010 at 10:52 PM, Hal Rosenstock hal.rosenst...@gmail.com wrote: On Thu, Feb 4, 2010 at 12:43 PM, Eli Dorfman (Voltaire) dorfman@gmail.com wrote: Subject: [PATCH] Wrong handling of MC create and delete traps For these traps the GID in the data details is the MGID and not the source port gid. So the SM should check that subscriber port has the pkey of the MC group. There was also an error in comparing the subnet prefix and guid due to host/network order mismatch. Signed-off-by: Eli Dorfman e...@voltaire.com --- opensm/opensm/osm_inform.c | 151 --- 1 files changed, 98 insertions(+), 53 deletions(-) diff --git a/opensm/opensm/osm_inform.c b/opensm/opensm/osm_inform.c index 8108213..ae4fe71 100644 --- a/opensm/opensm/osm_inform.c +++ b/opensm/opensm/osm_inform.c @@ -341,6 +341,103 @@ Exit: return status; } +static int is_access_permitted( osm_infr_t *p_infr_rec, + osm_infr_match_ctxt_t *p_infr_match ) +{ + cl_list_t *p_infr_to_remove_list = p_infr_match-p_remove_infr_list; + ib_inform_info_t *p_ii = (p_infr_rec-inform_record.inform_info); + ib_mad_notice_attr_t *p_ntc = p_infr_match-p_ntc; + uint16_t trap_num = cl_ntoh16(p_ntc-g_or_v.generic.trap_num); + osm_subn_t *p_subn = p_infr_rec-sa-p_subn; + osm_log_t *p_log = p_infr_rec-sa-p_log; + char gid_str[INET6_ADDRSTRLEN]; + osm_mgrp_t *p_mgrp; + ib_gid_t source_gid; + osm_port_t *p_src_port; + osm_port_t *p_dest_port; + + /* In case of GID_IN(64) or GID_OUT(65) traps the source gid + comparison should be done on the trap source (saved as the gid in the + data details field). + For traps MC_CREATE(66) or MC_DELETE(67) the data details gid is + the MGID. We need to check whether subscriber has the pky of typo pkey + the MC group. Shouldn't this be the subscriber has a compatible pkey with MC group ? The MC group has a full member PKey and the members can be full or limited. I accept the correction. Doesn't this require a code change for handling trap cases 66-67 ? I think that you referred to the comment since the code is handling this properly (in my opinion). Sasha, can you please change this in the commit (only if there are not other remarks). Is that what you are asking Sasha to do (beyond the typos) ? I asked Sasha to fix only the typo in commit. BTW, there is no explicit reference in the IB spec for MC group create/delete trap (at least I didn't find it). Not sure what you mean by this. What didn't you find ? in the spec see o13-17.1.2 Thanks, Eli -- Hal + In all other cases the issuer gis is the trap source. typo ^^^ gid and this typo of course. Thanks, Eli -- Hal + */ + if (trap_num = 64 trap_num = 67 ) + /* The issuer of these traps is the SM so source_gid + is the gid saved on the data details */ + source_gid = p_ntc-data_details.ntc_64_67.gid; + else + source_gid = p_ntc-issuer_gid; + + p_dest_port = + cl_ptr_vector_get(p_subn-port_lid_tbl, + cl_ntoh16(p_infr_rec-report_addr.dest_lid)); + if (!p_dest_port) { + OSM_LOG(p_log, OSM_LOG_INFO, + Cannot find destination port with LID:%u\n, + cl_ntoh16(p_infr_rec-report_addr.dest_lid)); + goto Exit; + } + + switch (trap_num) { + case 66: + case 67: + p_mgrp = osm_get_mgrp_by_mgid(p_subn, source_gid); + if (!p_mgrp) { + OSM_LOG(p_log, OSM_LOG_INFO, + Cannot find MGID %s\n, + inet_ntop(AF_INET6, source_gid.raw, gid_str, sizeof gid_str)); + goto Exit; + } + + if (!osm_physp_has_pkey(p_log, + p_mgrp-mcmember_rec.pkey, + p_dest_port-p_physp)) { + OSM_LOG(p_log, OSM_LOG_INFO, + MGID %s and port GUID:0x%016 PRIx64 do not share same pkey\n, + inet_ntop(AF_INET6, source_gid.raw, gid_str, sizeof gid_str), + cl_ntoh64(p_dest_port-guid)); + goto Exit; + } +
Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes
On 2/7/2010 3:22 AM, Steve Wise wrote: Good catch, I'll update the patch and submit for 2.6.33 on Monday. NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1. If this patch will be accepted to the kernel 2.6.33 we can take it too Tziporet -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] ib/iser: major face lift of the data path code
Vladislav Bolkhovitin wrote: Or Gerlitz wrote: From where did you get those latency numbers? read iostat(8), you'll see that await is The average time (in milliseconds) for I/O requests issued to the device to be served what kind of test did you do? I connected a Linux box through iser to a target exposing four NULL luns, i.e luns which are not backed by real storage. I run four copies of disktest each over a different lun, where each was reading 1k blocks, etc. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes
Tziporet Koren wrote: On 2/7/2010 3:22 AM, Steve Wise wrote: Good catch, I'll update the patch and submit for 2.6.33 on Monday. NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1. If this patch will be accepted to the kernel 2.6.33 we can take it too If ofed-1.5.1 is based on 2.6.33 then it will get this patch automatically (assuming it goes upstream and makes 2.6.33). Or we can pull it in as a kernel_patches/fixes/ patch. My point, though, is that even with this patch in ofed-1.5.1, we still have an openmpi/IB/rdmacm regression. The only way to avoid this regression without changing openmpi is to disallow _all_ rdma binds to 127.0.0.1. Steve. Tziporet -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes
My point, though, is that even with this patch in ofed-1.5.1, we still have an openmpi/IB/rdmacm regression. The only way to avoid this regression without changing openmpi is to disallow _all_ rdma binds to 127.0.0.1. Can you identify the source of the regression? ie what was the change that broke things? I'm most concerned that there is another regression in 2.6.33, and if so I would like to try and avoid letting that get into the final release. -- Roland Dreier rola...@cisco.com Cisco.com - http://www.cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/index.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes
Roland Dreier wrote: My point, though, is that even with this patch in ofed-1.5.1, we still have an openmpi/IB/rdmacm regression. The only way to avoid this regression without changing openmpi is to disallow _all_ rdma binds to 127.0.0.1. Can you identify the source of the regression? ie what was the change that broke things? It is the same commit you sited earlier. It enables binding rdma cm_ids to 127.0.0.1. Sean's proposed patch on top of that disables this only for iwarp devices. I'm most concerned that there is another regression in 2.6.33, and if so I would like to try and avoid letting that get into the final release. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [ewg] bug 1918 - openmpi broken due to rdma-cm changes
Can you identify the source of the regression? ie what was the change that broke things? My understanding is that support for loopback addresses exposes an existing bug in openmpi. It tries to bind to 127.0.0.1, which now succeeds. Openmpi passes that address to a remote node for use in connections. I'm most concerned that there is another regression in 2.6.33, and if so I would like to try and avoid letting that get into the final release. Unless we never support loopback addresses, openmpi will see a regression. The only other problem that I'm aware of for 2.6.33 is that the bind to a loopback address will succeed, even though the RDMA device may not support loopback. This is true for the Chelsio and Ammasso drivers. Connections should still fail, but the bind is basically useless in this case. I will try to get a patch for that tomorrow. - Sean -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [for-2.6.33] rdma/cm: disallow loopback address for iwarp devices
Since iWarp devices are not guaranteed to support loopback connections, prevent rdma_bind_addr from associating the loopback address with an iWarp device. Signed-off-by: Sean Hefty sean.he...@intel.com --- This includes feedback from Steve Wise based on the initial rfc patch. Although this patch is needed to prevent binding to RDMA devices that may not support loopback addressing, is also works around a bug in openmpi using the loopback address to bind to an iwarp device. This is not a perfect solution either, since it disable all iwarp devices. The NES driver should be able to support loopback connections, though that feature has never been tested. We may need a per device attribute. I will look at creating such a patch, but wanted to post this in case I can't get that one done in the next couple of days. drivers/infiniband/core/cma.c | 18 -- 1 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index cc9b594..fe8b0c0 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1739,6 +1739,9 @@ err: } EXPORT_SYMBOL(rdma_resolve_route); +/* + * Only IB devices are guaranteed to support loopback connections. + */ static int cma_bind_loopback(struct rdma_id_private *id_priv) { struct cma_device *cma_dev; @@ -1753,14 +1756,19 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv) ret = -ENODEV; goto out; } - list_for_each_entry(cma_dev, dev_list, list) + list_for_each_entry(cma_dev, dev_list, list) { + if (rdma_node_get_transport(cma_dev-device-node_type) != + RDMA_TRANSPORT_IB) + continue; + for (p = 1; p = cma_dev-device-phys_port_cnt; ++p) if (!ib_query_port(cma_dev-device, p, port_attr) port_attr.state == IB_PORT_ACTIVE) goto port_found; + } - p = 1; - cma_dev = list_entry(dev_list.next, struct cma_device, list); + ret = -ENODEV; + goto out; port_found: ret = ib_get_cached_gid(cma_dev-device, p, 0, gid); @@ -1771,9 +1779,7 @@ port_found: if (ret) goto out; - id_priv-id.route.addr.dev_addr.dev_type = - (rdma_node_get_transport(cma_dev-device-node_type) == RDMA_TRANSPORT_IB) ? - ARPHRD_INFINIBAND : ARPHRD_ETHER; + id_priv-id.route.addr.dev_addr.dev_type = ARPHRD_INFINIBAND; rdma_addr_set_sgid(id_priv-id.route.addr.dev_addr, gid); ib_addr_set_pkey(id_priv-id.route.addr.dev_addr, pkey); -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html