[PATCH] osm_sa_sminfo_record.c: fix sminfo sa query returns all sminfo records when filtering by LID in osm_smir_rcv_process
From fefee6db594ab6cc66da9aef1084f14d7006 Mon Sep 17 00:00:00 2001 From: Dan Ben Yosef da...@mellanox.com Date: Wed, 5 Jun 2013 14:52:01 +0300 Subject: [PATCH] osm_sa_sminfo_record.c: fix sminfo sa query returns all sminfo records when filtering by LID in osm_smir_rcv_process Signed-off-by: Dan Ben Yosef da...@mellanox.com --- opensm/osm_sa_sminfo_record.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/osm_sa_sminfo_record.c b/opensm/osm_sa_sminfo_record.c index f6bd9be..f73b433 100644 --- a/opensm/osm_sa_sminfo_record.c +++ b/opensm/osm_sa_sminfo_record.c @@ -310,7 +310,7 @@ void osm_smir_rcv_process(IN void *ctx, IN void *data) OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 280A: No remote SM for GUID 0x%016 PRIx64 \n, cl_ntoh64(port_guid)); - } else { + } else if (!p_port) { /* Go over all other known (remote) SMs */ cl_qmap_apply_func(sa-p_subn-sm_guid_tbl, sa_smir_by_comp_mask_cb, context); -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags] libibnetdisc/Makefile.am: Add missing makefile dependency
Found-by: Doron Tsur dor...@mellanox.com Signed-off-by: Hal Rosenstock h...@mellanox.com --- diff --git a/libibnetdisc/Makefile.am b/libibnetdisc/Makefile.am index d05604f..2cfcc33 100644 --- a/libibnetdisc/Makefile.am +++ b/libibnetdisc/Makefile.am @@ -35,6 +35,7 @@ libibnetdiscincludedir = $(includedir)/infiniband test_testleaks_SOURCES = test/testleaks.c test_testleaks_CFLAGS = -Wall $(DBGFLAGS) test_testleaks_LDFLAGS = -libnetdisc +test_testleaks_DEPENDENCIES = libibnetdisc.la libibnetdiscinclude_HEADERS = $(srcdir)/include/infiniband/ibnetdisc.h \ $(srcdir)/include/infiniband/ibnetdisc_osd.h -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V2 0/4] Add IPv6 support for iWARP
Hi All, This patch series adds IPv6 support for iWARP. It enables Chelsio's T4 and T5 adapters to transmitt RDMA traffic over IPv6 address. It adds new apis and cpl messages in cxgb4 to support IPv6 operations and uses them in RDMA/cxgb4. The patch series modifies the type of local_addr and remote_addr fields in struct iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold IPv6 and IPv4 addresses uniformly. It changes the references of local_addr and remote_addr in RDMA/cxgb4, RDMA/cxgb3, RDMA/nes and amso drivers such that build failure is avoided. We would like to submit this patch series via Roland's infiniband tree for-next branch. However the series requires the latest changes pushed in cxgb4 driver in David Miller's net-next tree. We request Roland to merge cxgb4 and RDMA/cxgb4 drivers from net-next tree before applying this series on for-next branch. We have created this patch series on top of net-next tree. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. Thanks, Vipul Pandya V2: Used local variables instead of typecasting in patch 1/4. Steve Wise (1): RDMA/cma: Add IPv6 support for iWARP. Vipul Pandya (3): cxgb4: Add routines to create and remove listening IPv6 servers cxgb4: Add CLIP support to store compressed IPv6 address RDMA/cxgb4: Add support for active and passive open connection with IPv6 address drivers/infiniband/core/cma.c | 69 +- drivers/infiniband/hw/amso1100/c2_ae.c | 18 +- drivers/infiniband/hw/amso1100/c2_cm.c | 16 +- drivers/infiniband/hw/cxgb3/iwch_cm.c | 46 +- drivers/infiniband/hw/cxgb4/cm.c| 842 drivers/infiniband/hw/cxgb4/device.c| 116 +++- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 4 +- drivers/infiniband/hw/nes/nes_cm.c | 148 +++-- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 277 +++- drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h | 5 + drivers/net/ethernet/chelsio/cxgb4/t4_msg.h | 17 +- drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h | 23 + include/rdma/iw_cm.h| 8 +- 13 files changed, 1142 insertions(+), 447 deletions(-) -- 1.8.0 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V2 3/4] cxgb4: Add CLIP support to store compressed IPv6 address
The Compressed LIP region is used to hold a limited number of Local IPv6 addresses. This region is primarily used to reduce the TCAM space consumed for an IPv6 offloaded connection. A 128-bit LIP will be reduced to 13-bit and stored in the TCAM if there is a match between the IPv6 tuple's LIP and the one stored in the CLIP region. Signed-off-by: Vipul Pandya vi...@chelsio.com --- V2: Used local variables instead of typecasting in patch 1/4. drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 206 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h | 23 +++ 2 files changed, 229 insertions(+) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index d1d6ff7..4ae287c 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -60,6 +60,7 @@ #include linux/workqueue.h #include net/neighbour.h #include net/netevent.h +#include net/addrconf.h #include asm/uaccess.h #include cxgb4.h @@ -68,6 +69,11 @@ #include t4fw_api.h #include l2t.h +#include ../drivers/net/bonding/bonding.h + +#ifdef DRV_VERSION +#undef DRV_VERSION +#endif #define DRV_VERSION 2.0.0-ko #define DRV_DESC Chelsio T4/T5 Network Driver @@ -3227,6 +3233,38 @@ static int tid_init(struct tid_info *t) return 0; } +static int cxgb4_clip_get(const struct net_device *dev, + const struct in6_addr *lip) +{ + struct adapter *adap; + struct fw_clip_cmd c; + + adap = netdev2adap(dev); + memset(c, 0, sizeof(c)); + c.op_to_write = htonl(FW_CMD_OP(FW_CLIP_CMD) | + FW_CMD_REQUEST | FW_CMD_WRITE); + c.alloc_to_len16 = htonl(F_FW_CLIP_CMD_ALLOC | FW_LEN16(c)); + *(__be64 *)c.ip_hi = *(__be64 *)(lip-s6_addr); + *(__be64 *)c.ip_lo = *(__be64 *)(lip-s6_addr + 8); + return t4_wr_mbox_meat(adap, adap-mbox, c, sizeof(c), c, false); +} + +static int cxgb4_clip_release(const struct net_device *dev, + const struct in6_addr *lip) +{ + struct adapter *adap; + struct fw_clip_cmd c; + + adap = netdev2adap(dev); + memset(c, 0, sizeof(c)); + c.op_to_write = htonl(FW_CMD_OP(FW_CLIP_CMD) | + FW_CMD_REQUEST | FW_CMD_READ); + c.alloc_to_len16 = htonl(F_FW_CLIP_CMD_FREE | FW_LEN16(c)); + *(__be64 *)c.ip_hi = *(__be64 *)(lip-s6_addr); + *(__be64 *)c.ip_lo = *(__be64 *)(lip-s6_addr + 8); + return t4_wr_mbox_meat(adap, adap-mbox, c, sizeof(c), c, false); +} + /** * cxgb4_create_server - create an IP server * @dev: the device @@ -3878,6 +3916,169 @@ int cxgb4_unregister_uld(enum cxgb4_uld type) } EXPORT_SYMBOL(cxgb4_unregister_uld); +/* Check if netdev on which event is occured belongs to us or not. Return + * suceess (1) if it belongs otherwise failure (0). + */ +static int cxgb4_netdev(struct net_device *netdev) +{ + struct adapter *adap; + int i; + + mutex_lock(uld_mutex); + list_for_each_entry(adap, adapter_list, list_node) + for (i = 0; i MAX_NPORTS; i++) + if (adap-port[i] == netdev) { + mutex_unlock(uld_mutex); + return 1; + } + mutex_unlock(uld_mutex); + return 0; +} + +static int clip_add(struct net_device *event_dev, struct inet6_ifaddr *ifa, + unsigned long event) +{ + int ret = NOTIFY_DONE; + + rcu_read_lock(); + if (cxgb4_netdev(event_dev)) { + switch (event) { + case NETDEV_UP: + ret = cxgb4_clip_get(event_dev, + (const struct in6_addr *)ifa-addr.s6_addr); + if (ret 0) { + rcu_read_unlock(); + return ret; + } + ret = NOTIFY_OK; + break; + case NETDEV_DOWN: + cxgb4_clip_release(event_dev, + (const struct in6_addr *)ifa-addr.s6_addr); + ret = NOTIFY_OK; + break; + default: + break; + } + } + rcu_read_unlock(); + return ret; +} + +static int cxgb4_inet6addr_handler(struct notifier_block *this, + unsigned long event, void *data) +{ + struct inet6_ifaddr *ifa = data; + struct net_device *event_dev; + int ret = NOTIFY_DONE; + int cnt; + struct bonding *bond = netdev_priv(ifa-idev-dev); + struct slave *slave; + struct pci_dev *first_pdev = NULL; + + if (ifa-idev-dev-priv_flags IFF_802_1Q_VLAN) { + event_dev = vlan_dev_real_dev(ifa-idev-dev); + ret = clip_add(event_dev, ifa, event); + } else if (ifa-idev-dev-flags
[PATCH V2 4/4] RDMA/cxgb4: Add support for active and passive open connection with IPv6 address
Add new cpl messages, cpl_act_open_req6 and cpl_t5_act_open_req6, for initiating active open connections. Use LLD api cxgb4_create_server and cxgb4_create_server6 for initiating passive open connections. Similarly use cxgb4_remove_server to remove the passive open connections in place of listen_stop. Add support for iWARP over VLAN device and enable IPv6 support on VLAN device. Make use of import_ep in c4iw_reconnect. Signed-off-by: Vipul Pandya vi...@chelsio.com --- V2: Used local variables instead of typecasting in patch 1/4. drivers/infiniband/hw/cxgb4/cm.c | 805 ++--- drivers/infiniband/hw/cxgb4/device.c | 116 +++-- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 4 +- 3 files changed, 626 insertions(+), 299 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c index 76d8812..0745ae4 100644 --- a/drivers/infiniband/hw/cxgb4/cm.c +++ b/drivers/infiniband/hw/cxgb4/cm.c @@ -44,6 +44,8 @@ #include net/netevent.h #include net/route.h #include net/tcp.h +#include net/ip6_route.h +#include net/addrconf.h #include iw_cxgb4.h @@ -333,19 +335,72 @@ static struct sk_buff *get_skb(struct sk_buff *skb, int len, gfp_t gfp) return skb; } -static struct rtable *find_route(struct c4iw_dev *dev, __be32 local_ip, +static struct net_device *get_real_dev(struct net_device *egress_dev) +{ + struct net_device *phys_dev = egress_dev; + if (egress_dev-priv_flags IFF_802_1Q_VLAN) + phys_dev = vlan_dev_real_dev(egress_dev); + return phys_dev; +} + +static int our_interface(struct c4iw_dev *dev, struct net_device *egress_dev) +{ + int i; + + egress_dev = get_real_dev(egress_dev); + for (i = 0; i dev-rdev.lldi.nports; i++) + if (dev-rdev.lldi.ports[i] == egress_dev) + return 1; + return 0; +} + +static struct dst_entry *find_route6(struct c4iw_dev *dev, __u8 *local_ip, +__u8 *peer_ip, __be16 local_port, +__be16 peer_port, u8 tos, +__u32 sin6_scope_id) +{ + struct flowi6 fl6; + struct dst_entry *dst; + + memset(fl6, 0, sizeof(fl6)); + memcpy(fl6.daddr, peer_ip, 16); + memcpy(fl6.saddr, local_ip, 16); + if (ipv6_addr_type(fl6.daddr) IPV6_ADDR_LINKLOCAL) + fl6.flowi6_oif = sin6_scope_id; + dst = ip6_route_output(init_net, NULL, fl6); + if (!dst) + goto out; + if (!our_interface(dev, ip6_dst_idev(dst)-dev) + !(ip6_dst_idev(dst)-dev-flags IFF_LOOPBACK)) { + dst_release(dst); + dst = NULL; + } +out: + return dst; +} + +static struct dst_entry *find_route(struct c4iw_dev *dev, __be32 local_ip, __be32 peer_ip, __be16 local_port, __be16 peer_port, u8 tos) { struct rtable *rt; struct flowi4 fl4; + struct neighbour *n; rt = ip_route_output_ports(init_net, fl4, NULL, peer_ip, local_ip, peer_port, local_port, IPPROTO_TCP, tos, 0); if (IS_ERR(rt)) return NULL; - return rt; + n = dst_neigh_lookup(rt-dst, peer_ip); + if (!n) + return NULL; + if (!our_interface(dev, n-dev)) { + dst_release(rt-dst); + return NULL; + } + neigh_release(n); + return rt-dst; } static void arp_failure_discard(void *handle, struct sk_buff *skb) @@ -512,15 +567,28 @@ static int send_connect(struct c4iw_ep *ep) { struct cpl_act_open_req *req; struct cpl_t5_act_open_req *t5_req; + struct cpl_act_open_req6 *req6; + struct cpl_t5_act_open_req6 *t5_req6; struct sk_buff *skb; u64 opt0; u32 opt2; unsigned int mtu_idx; int wscale; - int size = is_t4(ep-com.dev-rdev.lldi.adapter_type) ? - sizeof(struct cpl_act_open_req) : - sizeof(struct cpl_t5_act_open_req); - int wrlen = roundup(size, 16); + int wrlen; + int sizev4 = is_t4(ep-com.dev-rdev.lldi.adapter_type) ? + sizeof(struct cpl_act_open_req) : + sizeof(struct cpl_t5_act_open_req); + int sizev6 = is_t4(ep-com.dev-rdev.lldi.adapter_type) ? + sizeof(struct cpl_act_open_req6) : + sizeof(struct cpl_t5_act_open_req6); + struct sockaddr_in *la = (struct sockaddr_in *)ep-com.local_addr; + struct sockaddr_in *ra = (struct sockaddr_in *)ep-com.remote_addr; + struct sockaddr_in6 *la6 = (struct sockaddr_in6 *)ep-com.local_addr; + struct sockaddr_in6 *ra6 = (struct sockaddr_in6 *)ep-com.remote_addr; + + wrlen = (ep-com.remote_addr.ss_family == AF_INET) ? +
[PATCH V2 2/4] cxgb4: Add routines to create and remove listening IPv6 servers
Add cxgb4_create_server6 and cxgb4_remove_server routines to create and remove listening IPv6 servers. Return success (0) from cxgb4_create_server in case of ctrl queue congestion since in case of congestion, passive open request gets queued and gets processed later. If non zero value is returned it can be treated as an error and ULD can free STID which can result into an error in passive open reply. Add cpl structure for active open request with IPv6 address for T5. Signed-off-by: Vipul Pandya vi...@chelsio.com --- V2: Used local variables instead of typecasting in patch 1/4. drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 71 - drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h | 5 ++ drivers/net/ethernet/chelsio/cxgb4/t4_msg.h | 17 +- 3 files changed, 91 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 5a3256b..d1d6ff7 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -3246,6 +3246,7 @@ int cxgb4_create_server(const struct net_device *dev, unsigned int stid, struct sk_buff *skb; struct adapter *adap; struct cpl_pass_open_req *req; + int ret; skb = alloc_skb(sizeof(*req), GFP_KERNEL); if (!skb) @@ -3263,10 +3264,78 @@ int cxgb4_create_server(const struct net_device *dev, unsigned int stid, req-opt0 = cpu_to_be64(TX_CHAN(chan)); req-opt1 = cpu_to_be64(CONN_POLICY_ASK | SYN_RSS_ENABLE | SYN_RSS_QUEUE(queue)); - return t4_mgmt_tx(adap, skb); + ret = t4_mgmt_tx(adap, skb); + return net_xmit_eval(ret); } EXPORT_SYMBOL(cxgb4_create_server); +/* cxgb4_create_server6 - create an IPv6 server + * @dev: the device + * @stid: the server TID + * @sip: local IPv6 address to bind server to + * @sport: the server's TCP port + * @queue: queue to direct messages from this server to + * + * Create an IPv6 server for the given port and address. + * Returns 0 on error and one of the %NET_XMIT_* values on success. + */ +int cxgb4_create_server6(const struct net_device *dev, unsigned int stid, +const struct in6_addr *sip, __be16 sport, +unsigned int queue) +{ + unsigned int chan; + struct sk_buff *skb; + struct adapter *adap; + struct cpl_pass_open_req6 *req; + int ret; + + skb = alloc_skb(sizeof(*req), GFP_KERNEL); + if (!skb) + return -ENOMEM; + + adap = netdev2adap(dev); + req = (struct cpl_pass_open_req6 *)__skb_put(skb, sizeof(*req)); + INIT_TP_WR(req, 0); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_PASS_OPEN_REQ6, stid)); + req-local_port = sport; + req-peer_port = htons(0); + req-local_ip_hi = *(__be64 *)(sip-s6_addr); + req-local_ip_lo = *(__be64 *)(sip-s6_addr + 8); + req-peer_ip_hi = cpu_to_be64(0); + req-peer_ip_lo = cpu_to_be64(0); + chan = rxq_to_chan(adap-sge, queue); + req-opt0 = cpu_to_be64(TX_CHAN(chan)); + req-opt1 = cpu_to_be64(CONN_POLICY_ASK | + SYN_RSS_ENABLE | SYN_RSS_QUEUE(queue)); + ret = t4_mgmt_tx(adap, skb); + return net_xmit_eval(ret); +} +EXPORT_SYMBOL(cxgb4_create_server6); + +int cxgb4_remove_server(const struct net_device *dev, unsigned int stid, + unsigned int queue, bool ipv6) +{ + struct sk_buff *skb; + struct adapter *adap; + struct cpl_close_listsvr_req *req; + int ret; + + adap = netdev2adap(dev); + + skb = alloc_skb(sizeof(*req), GFP_KERNEL); + if (!skb) + return -ENOMEM; + + req = (struct cpl_close_listsvr_req *)__skb_put(skb, sizeof(*req)); + INIT_TP_WR(req, 0); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_CLOSE_LISTSRV_REQ, stid)); + req-reply_ctrl = htons(NO_REPLY(0) | (ipv6 ? LISTSVR_IPV6(1) : + LISTSVR_IPV6(0)) | QUEUENO(queue)); + ret = t4_mgmt_tx(adap, skb); + return net_xmit_eval(ret); +} +EXPORT_SYMBOL(cxgb4_remove_server); + /** * cxgb4_best_mtu - find the entry in the MTU table closest to an MTU * @mtus: the HW MTU table diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h index 4faf4d0..6f21f24 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h @@ -154,6 +154,11 @@ struct in6_addr; int cxgb4_create_server(const struct net_device *dev, unsigned int stid, __be32 sip, __be16 sport, __be16 vlan, unsigned int queue); +int cxgb4_create_server6(const struct net_device *dev, unsigned int stid, +const struct in6_addr *sip, __be16 sport, +unsigned
[PATCH V2 1/4] RDMA/cma: Add IPv6 support for iWARP.
From: Steve Wise sw...@opengridcomputing.com This patch modifies the type of local_addr and remote_addr fields in struct iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold IPv6 and IPv4 addresses uniformly. It changes the references of local_addr and remote_addr in RDMA/cxgb4, RDMA/cxgb3, RDMA/nes and amso drivers such that build failure is avoided. However to be able to actully run the traffic over IPv6 address respective drivers have to add supportive code. Signed-off-by: Steve Wise sw...@opengridcomputing.com --- V2: Used local variables instead of typecasting in patch 1/4. drivers/infiniband/core/cma.c | 69 --- drivers/infiniband/hw/amso1100/c2_ae.c | 18 ++-- drivers/infiniband/hw/amso1100/c2_cm.c | 16 +++- drivers/infiniband/hw/cxgb3/iwch_cm.c | 46 ++ drivers/infiniband/hw/cxgb4/cm.c | 55 ++-- drivers/infiniband/hw/nes/nes_cm.c | 148 ++--- include/rdma/iw_cm.h | 8 +- 7 files changed, 205 insertions(+), 155 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 34fbc2f..72b15c1 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1335,8 +1335,9 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event) { struct rdma_id_private *id_priv = iw_id-context; struct rdma_cm_event event; - struct sockaddr_in *sin; int ret = 0; + struct sockaddr *laddr = (struct sockaddr *)iw_event-local_addr; + struct sockaddr *raddr = (struct sockaddr *)iw_event-remote_addr; if (cma_disable_callback(id_priv, RDMA_CM_CONNECT)) return 0; @@ -1347,10 +1348,10 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event) event.event = RDMA_CM_EVENT_DISCONNECTED; break; case IW_CM_EVENT_CONNECT_REPLY: - sin = (struct sockaddr_in *) id_priv-id.route.addr.src_addr; - *sin = iw_event-local_addr; - sin = (struct sockaddr_in *) id_priv-id.route.addr.dst_addr; - *sin = iw_event-remote_addr; + memcpy(id_priv-id.route.addr.src_addr, laddr, + ip_addr_size(laddr)); + memcpy(id_priv-id.route.addr.dst_addr, raddr, + ip_addr_size(raddr)); switch (iw_event-status) { case 0: event.event = RDMA_CM_EVENT_ESTABLISHED; @@ -1400,11 +1401,12 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, { struct rdma_cm_id *new_cm_id; struct rdma_id_private *listen_id, *conn_id; - struct sockaddr_in *sin; struct net_device *dev = NULL; struct rdma_cm_event event; int ret; struct ib_device_attr attr; + struct sockaddr *laddr = (struct sockaddr *)iw_event-local_addr; + struct sockaddr *raddr = (struct sockaddr *)iw_event-remote_addr; listen_id = cm_id-context; if (cma_disable_callback(listen_id, RDMA_CM_LISTEN)) @@ -1422,14 +1424,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, mutex_lock_nested(conn_id-handler_mutex, SINGLE_DEPTH_NESTING); conn_id-state = RDMA_CM_CONNECT; - dev = ip_dev_find(init_net, iw_event-local_addr.sin_addr.s_addr); - if (!dev) { - ret = -EADDRNOTAVAIL; - mutex_unlock(conn_id-handler_mutex); - rdma_destroy_id(new_cm_id); - goto out; - } - ret = rdma_copy_addr(conn_id-id.route.addr.dev_addr, dev, NULL); + ret = rdma_translate_ip(laddr, conn_id-id.route.addr.dev_addr); if (ret) { mutex_unlock(conn_id-handler_mutex); rdma_destroy_id(new_cm_id); @@ -1447,10 +1442,8 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, cm_id-context = conn_id; cm_id-cm_handler = cma_iw_handler; - sin = (struct sockaddr_in *) new_cm_id-route.addr.src_addr; - *sin = iw_event-local_addr; - sin = (struct sockaddr_in *) new_cm_id-route.addr.dst_addr; - *sin = iw_event-remote_addr; + memcpy(new_cm_id-route.addr.src_addr, laddr, ip_addr_size(laddr)); + memcpy(new_cm_id-route.addr.dst_addr, raddr, ip_addr_size(raddr)); ret = ib_query_device(conn_id-id.device, attr); if (ret) { @@ -1526,8 +1519,9 @@ static int cma_ib_listen(struct rdma_id_private *id_priv) static int cma_iw_listen(struct rdma_id_private *id_priv, int backlog) { int ret; - struct sockaddr_in *sin; struct iw_cm_id *id; + struct sockaddr *src_addr = (struct sockaddr *) + id_priv-id.route.addr.src_addr; id = iw_create_cm_id(id_priv-id.device, iw_conn_req_handler, @@ -1537,8 +1531,8 @@ static int cma_iw_listen(struct rdma_id_private *id_priv, int backlog)
Re: [PATCH] IB/srp: Maintain a single connection per I_T nexus
Hi all, as proposed by Or, let's discuss this on the mailing list. This is a fundamental change required for everything related to multipathing. It influences automatic reconnect patches which will follow. So let's agree on the right solution here first before looking at other patches. In my opinion the 'add_target' sysfs attribute shouldn't be used for any manual reconnect as well. This is why my patch rejects the double login attempt instead of reconnecting an existing connection. This can help to find scripting issues and things like this. We can't expect that all users are using the srp-tools. Please compare with Bart's version and let's discuss this here. https://github.com/bvanassche/ib_srp-backport/commit/7d8774ff58d489858b1c046b2bf01b4e84e8dd9b Cheers, Sebastian On 12.06.2013 13:29, Sebastian Riemer wrote: The sysfs attribute 'add_target' may not be used for multiple logins to the same target. If doing so with multipathing, this crashes the multipath-tools. Furthermore, we want to prevent the possibility of data corruption here. If manual reconnect is necessary, then the user can use the 'delete' sysfs attribute of the remote port before connecting. Note: The function srp_conn_unique() has been taken from Bart Van Assche. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/14] IB SRP initiator patches for kernel 3.11
The purpose of this InfiniBand SRP initiator patch series is as follows: - Make the SRP initiator driver better suited for use in a H.A. setup. Speed up failover by reducing the IB RC retry count and by notifying multipathd faster about transport layer failures by adding fast_io_fail_tmo and dev_loss_tmo parameters. - Make the SRP initiator better suited for use on NUMA systems by making the HCA completion vector configurable. Some of the patches in this patch series were posted previously as Make ib_srp better suited for H.A. purposes. The changes compared to version five of that patch series are: - Left out patches that are already upstream. - Made it possible to set dev_loss_tmo to off. This is useful in a setup using initiator side mirroring to avoid that new /dev/sd* names are reassigned after a failover or cable pull and reinsert. - Added kernel module parameters to ib_srp for configuring default values of the fast_io_fail_tmo and dev_loss_tmo parameters. - Added a patch from Dotan Barak that fixes a kernel oops during rmmod triggered by resource allocation failure at module load time. - Avoid duplicate connections by refusing relogins instead of dropping duplicate connections, as proposed by Sebastian Riemer. - Added a patch from Sebastian Riemer for failing SCSI commands silently. - Added a patch from Vu Pham to make the transport layer (IB RC) retry count configurable. - Made HCA completion vector configurable. Changes since v4: - Added a patch for removing SCSI devices upon a port down event Changes since v3: - Restored the dev_loss_tmo and fast_io_fail_tmo sysfs attributes. - Included a patch to fix an ib_srp crash that could be triggered by cable pulling. Changes since v2: - Addressed the v2 review comments. - Dropped the patches that have already been merged. - Dropped the patches for integration with multipathd. - Dropped the micro-optimization of the IB completion handlers. The individual patches in this series are as follows: 0001-IB-srp-Fix-remove_one-crash-due-to-resource-exhausti.patch 0002-IB-srp-Fix-race-between-srp_queuecommand-and-srp_cla.patch 0003-IB-srp-Avoid-that-srp_reset_host-is-skipped-after-a-.patch 0004-IB-srp-Skip-host-settle-delay.patch 0005-IB-srp-Maintain-a-single-connection-per-I_T-nexus.patch 0006-IB-srp-Keep-rport-as-long-as-the-IB-transport-layer.patch 0007-scsi_transport_srp-Add-transport-layer-error-handlin.patch 0008-IB-srp-Add-srp_terminate_io.patch 0009-IB-srp-Use-SRP-transport-layer-error-recovery.patch 0010-IB-srp-Start-timers-if-a-transport-layer-error-occur.patch 0011-IB-srp-Fail-SCSI-commands-silently.patch 0012-IB-srp-Make-HCA-completion-vector-configurable.patch 0013-IB-srp-Make-transport-layer-retry-count-configurable.patch 0014-IB-srp-Bump-driver-version-and-release-date.patch -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
If the add_one callback fails during driver load no resources are allocated so there isn't a need to release any resources. Trying to clean the resource may lead to the following kernel panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [a0132331] srp_remove_one+0x31/0x240 [ib_srp] RIP: 0010:[a0132331] [a0132331] srp_remove_one+0x31/0x240 [ib_srp] Process rmmod (pid: 4562, threadinfo 8800dd738000, task 8801167e60c0) Call Trace: [a024500e] ib_unregister_client+0x4e/0x120 [ib_core] [a01361bd] srp_cleanup_module+0x15/0x71 [ib_srp] [810ac6a4] sys_delete_module+0x194/0x260 [8100b0f2] system_call_fastpath+0x16/0x1b [bvanassche: Shortened patch description] Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Reviewed-by: Eli Cohen e...@mellanox.co.il Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 7ccf328..368d160 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2507,6 +2507,8 @@ static void srp_remove_one(struct ib_device *device) struct srp_target_port *target; srp_dev = ib_get_client_data(device, srp_client); + if (!srp_dev) + return; list_for_each_entry_safe(host, tmp_host, srp_dev-dev_list, list) { device_unregister(host-dev); -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/14] IB/srp: Skip host settle delay
The SRP initiator implements host reset by reconnecting to the SRP target. That means that communication with the target is possible as soon as host reset finished. Hence skip the host settle delay. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index fb37b47..be12780 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1949,6 +1949,7 @@ static struct scsi_host_template srp_template = { .eh_abort_handler = srp_abort, .eh_device_reset_handler= srp_reset_device, .eh_host_reset_handler = srp_reset_host, + .skip_settle_delay = true, .sg_tablesize = SRP_DEF_SG_TABLESIZE, .can_queue = SRP_CMD_SQ_SIZE, .this_id= -1, -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
An SRP target is required to maintain a single connection between initiator and target. This means that if the 'add_target' attribute is used to create a second connection to a target that the first connection will be logged out and that the SCSI error handler will kick in. The SCSI error handler will cause the SRP initiator to reconnect, which will cause I/O over the second connection to fail. Avoid such ping-pong behavior by disabling relogins. Note: if reconnecting manually is necessary, that is possible by deleting and recreating an rport via sysfs. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@kernel.org Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 38 +++ 1 file changed, 38 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index be12780..1a73b24 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -556,6 +556,36 @@ static void srp_rport_delete(struct srp_rport *rport) srp_queue_remove_work(target); } +/** + * srp_conn_unique() - check whether the connection to a target is unique + */ +static bool srp_conn_unique(struct srp_host *host, + struct srp_target_port *target) +{ + struct srp_target_port *t; + bool ret = false; + + if (target-state == SRP_TARGET_REMOVED) + goto out; + + ret = true; + + spin_lock(host-target_lock); + list_for_each_entry(t, host-target_list, list) { + if (t != target + target-id_ext == t-id_ext + target-ioc_guid == t-ioc_guid + target-initiator_ext == t-initiator_ext) { + ret = false; + break; + } + } + spin_unlock(host-target_lock); + +out: + return ret; +} + static int srp_connect_target(struct srp_target_port *target) { int retries = 3; @@ -2261,6 +2291,14 @@ static ssize_t srp_create_target(struct device *dev, if (ret) goto err; + if (!srp_conn_unique(target-srp_host, target)) { + shost_printk(KERN_INFO, target-scsi_host, +PFX Already connected to target port %.*s\n, +(int)count, buf); + ret = -EEXIST; + goto err; + } + if (!host-srp_dev-fmr_pool !target-allow_ext_sg target-cmd_sg_cnt target-sg_tablesize) { pr_warn(No FMR pool and no external indirect descriptors, limiting sg_tablesize to cmd_sg_cnt\n); -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/14] IB/srp: Keep rport as long as the IB transport layer
Keep the rport data structure around after srp_remove_host() has finished until cleanup of the IB transport layer has finished completely. This is necessary because later patches use the rport pointer inside the queuecommand callback. Without this patch accessing the rport from inside a queuecommand callback is racy because srp_remove_host() must be invoked before scsi_remove_host() and because the queuecommand callback may get invoked after srp_remove_host() has finished. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: James Bottomley jbottom...@parallels.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |3 +++ drivers/infiniband/ulp/srp/ib_srp.h |1 + drivers/scsi/scsi_transport_srp.c | 18 ++ include/scsi/scsi_transport_srp.h |2 ++ 4 files changed, 24 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 1a73b24..2bd0587 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -526,11 +526,13 @@ static void srp_remove_target(struct srp_target_port *target) WARN_ON_ONCE(target-state != SRP_TARGET_REMOVED); srp_del_scsi_host_attr(target-scsi_host); + srp_rport_get(target-rport); srp_remove_host(target-scsi_host); scsi_remove_host(target-scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target-cm_id); srp_free_target_ib(target); + srp_rport_put(target-rport); srp_free_req_data(target); scsi_host_put(target-scsi_host); } @@ -2009,6 +2011,7 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target) } rport-lld_data = target; + target-rport = rport; spin_lock(host-target_lock); list_add_tail(target-list, host-target_list); diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h index 66fbedd..1817ed5 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.h +++ b/drivers/infiniband/ulp/srp/ib_srp.h @@ -153,6 +153,7 @@ struct srp_target_port { u16 io_class; struct srp_host*srp_host; struct Scsi_Host *scsi_host; + struct srp_rport *rport; chartarget_name[32]; unsigned intscsi_id; unsigned intsg_tablesize; diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c index f379c7f..f7ba94a 100644 --- a/drivers/scsi/scsi_transport_srp.c +++ b/drivers/scsi/scsi_transport_srp.c @@ -185,6 +185,24 @@ static int srp_host_match(struct attribute_container *cont, struct device *dev) } /** + * srp_rport_get() - increment rport reference count + */ +void srp_rport_get(struct srp_rport *rport) +{ + get_device(rport-dev); +} +EXPORT_SYMBOL(srp_rport_get); + +/** + * srp_rport_put() - decrement rport reference count + */ +void srp_rport_put(struct srp_rport *rport) +{ + put_device(rport-dev); +} +EXPORT_SYMBOL(srp_rport_put); + +/** * srp_rport_add - add a SRP remote port to the device hierarchy * @shost: scsi host the remote port is connected to. * @ids: The port id for the remote port. diff --git a/include/scsi/scsi_transport_srp.h b/include/scsi/scsi_transport_srp.h index ff0f04a..5a2d2d1 100644 --- a/include/scsi/scsi_transport_srp.h +++ b/include/scsi/scsi_transport_srp.h @@ -38,6 +38,8 @@ extern struct scsi_transport_template * srp_attach_transport(struct srp_function_template *); extern void srp_release_transport(struct scsi_transport_template *); +extern void srp_rport_get(struct srp_rport *rport); +extern void srp_rport_put(struct srp_rport *rport); extern struct srp_rport *srp_rport_add(struct Scsi_Host *, struct srp_rport_identifiers *); extern void srp_rport_del(struct srp_rport *); -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/14] scsi_transport_srp: Add transport layer error handling
Add the necessary functions in the SRP transport module to allow an SRP initiator driver to implement transport layer error handling similar to the functionality already provided by the FC transport layer. This includes: - Support for implementing fast_io_fail_tmo, the time that should elapse after having detected a transport layer problem and before failing I/O. - Support for implementing dev_loss_tmo, the time that should elapse after having detected a transport layer problem and before removing a remote port. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: James Bottomley jbottom...@parallels.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- Documentation/ABI/stable/sysfs-transport-srp | 36 ++ drivers/scsi/scsi_transport_srp.c| 489 +- include/scsi/scsi_transport_srp.h| 54 ++- 3 files changed, 576 insertions(+), 3 deletions(-) diff --git a/Documentation/ABI/stable/sysfs-transport-srp b/Documentation/ABI/stable/sysfs-transport-srp index b36fb0d..b7759b1 100644 --- a/Documentation/ABI/stable/sysfs-transport-srp +++ b/Documentation/ABI/stable/sysfs-transport-srp @@ -5,6 +5,23 @@ Contact: linux-s...@vger.kernel.org, linux-rdma@vger.kernel.org Description: Instructs an SRP initiator to disconnect from a target and to remove all LUNs imported from that target. +What: /sys/class/srp_remote_ports/port-h:n/dev_loss_tmo +Date: September 1, 2013 +KernelVersion: 3.11 +Contact: linux-s...@vger.kernel.org, linux-rdma@vger.kernel.org +Description: Number of seconds the SCSI layer will wait after a transport + layer error has been observed before removing a target port. + Zero means immediate removal. + +What: /sys/class/srp_remote_ports/port-h:n/fast_io_fail_tmo +Date: September 1, 2013 +KernelVersion: 3.11 +Contact: linux-s...@vger.kernel.org, linux-rdma@vger.kernel.org +Description: Number of seconds the SCSI layer will wait after a transport + layer error has been observed before failing I/O. Zero means + immediate removal. A negative value will disable this + behavior. + What: /sys/class/srp_remote_ports/port-h:n/port_id Date: June 27, 2007 KernelVersion: 2.6.24 @@ -12,8 +29,27 @@ Contact: linux-s...@vger.kernel.org Description: 16-byte local SRP port identifier in hexadecimal format. An example: 4c:49:4e:55:58:20:56:49:4f:00:00:00:00:00:00:00. +What: /sys/class/srp_remote_ports/port-h:n/reconnect_delay +Date: September 1, 2013 +KernelVersion: 3.11 +Contact: linux-s...@vger.kernel.org, linux-rdma@vger.kernel.org +Description: Number of seconds the SCSI layer will wait after a reconnect + attempt failed before retrying. + What: /sys/class/srp_remote_ports/port-h:n/roles Date: June 27, 2007 KernelVersion: 2.6.24 Contact: linux-s...@vger.kernel.org Description: Role of the remote port. Either SRP Initiator or SRP Target. + +What: /sys/class/srp_remote_ports/port-h:n/state +Date: September 1, 2013 +KernelVersion: 3.11 +Contact: linux-s...@vger.kernel.org, linux-rdma@vger.kernel.org +Description: State of the transport layer to the remote port. running if + the transport layer is operational; blocked if a transport + layer error has been encountered but the fail_io_fast_tmo + timer has not yet fired; fail-fast after the + fail_io_fast_tmo timer has fired and before the dev_loss_tmo + timer has fired; lost after the dev_loss_tmo timer has + fired and before the port is finally removed. diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c index f7ba94a..53b34b3 100644 --- a/drivers/scsi/scsi_transport_srp.c +++ b/drivers/scsi/scsi_transport_srp.c @@ -24,12 +24,15 @@ #include linux/err.h #include linux/slab.h #include linux/string.h +#include linux/delay.h #include scsi/scsi.h #include scsi/scsi_device.h #include scsi/scsi_host.h #include scsi/scsi_transport.h +#include scsi/scsi_cmnd.h #include scsi/scsi_transport_srp.h +#include scsi_priv.h #include scsi_transport_srp_internal.h struct srp_host_attrs { @@ -38,7 +41,7 @@ struct srp_host_attrs { #define to_srp_host_attrs(host)((struct srp_host_attrs *)(host)-shost_data) #define SRP_HOST_ATTRS 0 -#define SRP_RPORT_ATTRS 3 +#define SRP_RPORT_ATTRS 8 struct srp_internal { struct scsi_transport_template t; @@ -54,6 +57,28 @@ struct srp_internal { #definedev_to_rport(d) container_of(d, struct srp_rport, dev) #define transport_class_to_srp_rport(dev) dev_to_rport((dev)-parent) +static inline struct Scsi_Host *rport_to_shost(struct
[PATCH 08/14] IB/srp: Add srp_terminate_io()
Finish all outstanding I/O requests after fast_io_fail_tmo expired, which speeds up failover in a multipath setup. This patch is a reworked version of a patch from Sebastian Riemer. Reported-by: Sebastian Riemer sebastian.rie...@profitbricks.com Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@kernel.org Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 2bd0587..ffda0ca 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -716,17 +716,29 @@ static void srp_free_req(struct srp_target_port *target, spin_unlock_irqrestore(target-lock, flags); } -static void srp_reset_req(struct srp_target_port *target, struct srp_request *req) +static void srp_finish_req(struct srp_target_port *target, + struct srp_request *req, int result) { struct scsi_cmnd *scmnd = srp_claim_req(target, req, NULL); if (scmnd) { srp_free_req(target, req, scmnd, 0); - scmnd-result = DID_RESET 16; + scmnd-result = result; scmnd-scsi_done(scmnd); } } +static void srp_terminate_io(struct srp_rport *rport) +{ + struct srp_target_port *target = rport-lld_data; + int i; + + for (i = 0; i SRP_CMD_SQ_SIZE; ++i) { + struct srp_request *req = target-req_ring[i]; + srp_finish_req(target, req, DID_TRANSPORT_FAILFAST 16); + } +} + static int srp_reconnect_target(struct srp_target_port *target) { struct Scsi_Host *shost = target-scsi_host; @@ -753,8 +765,7 @@ static int srp_reconnect_target(struct srp_target_port *target) for (i = 0; i SRP_CMD_SQ_SIZE; ++i) { struct srp_request *req = target-req_ring[i]; - if (req-scmnd) - srp_reset_req(target, req); + srp_finish_req(target, req, DID_RESET 16); } INIT_LIST_HEAD(target-free_tx); @@ -1809,7 +1820,7 @@ static int srp_reset_device(struct scsi_cmnd *scmnd) for (i = 0; i SRP_CMD_SQ_SIZE; ++i) { struct srp_request *req = target-req_ring[i]; if (req-scmnd req-scmnd-device == scmnd-device) - srp_reset_req(target, req); + srp_finish_req(target, req, DID_RESET 16); } return SUCCESS; @@ -2589,6 +2600,7 @@ static void srp_remove_one(struct ib_device *device) static struct srp_function_template ib_srp_transport_functions = { .rport_delete= srp_rport_delete, + .terminate_rport_io = srp_terminate_io, }; static int __init srp_init_module(void) -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/14] IB/srp: Use SRP transport layer error recovery
Enable fast_io_fail_tmo and dev_loss_tmo functionality for the IB SRP initiator. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@kernel.org Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 123 +-- drivers/infiniband/ulp/srp/ib_srp.h |1 - 2 files changed, 88 insertions(+), 36 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index ffda0ca..6f3e0e5 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -86,6 +86,31 @@ module_param(topspin_workarounds, int, 0444); MODULE_PARM_DESC(topspin_workarounds, Enable workarounds for Topspin/Cisco SRP target bugs if != 0); +static int srp_reconnect_delay = 10; +module_param_named(reconnect_delay, srp_reconnect_delay, int, S_IRUGO|S_IWUSR); +MODULE_PARM_DESC(reconnect_delay, Time between successive reconnect attempts); + +static struct kernel_param_ops srp_tmo_ops; + +static int srp_fast_io_fail_tmo = 15; +module_param_cb(fast_io_fail_tmo, srp_tmo_ops, srp_fast_io_fail_tmo, + S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(fast_io_fail_tmo, +Number of seconds between the observation of a transport + layer error and failing all I/O. \off\ means that this + functionality is disabled.); + +static int srp_dev_loss_tmo = 600; +module_param_cb(dev_loss_tmo, srp_tmo_ops, srp_dev_loss_tmo, + S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(dev_loss_tmo, +Maximum number of seconds that the SRP transport should + insulate transport layer errors. After this time has been + exceeded the SCSI target is removed. Should be + between 1 and __stringify(SCSI_DEVICE_BLOCK_MAX_TIMEOUT) + if fast_io_fail_tmo has not been set. \off\ means that + this functionality is disabled.); + static void srp_add_one(struct ib_device *device); static void srp_remove_one(struct ib_device *device); static void srp_recv_completion(struct ib_cq *cq, void *target_ptr); @@ -102,6 +127,44 @@ static struct ib_client srp_client = { static struct ib_sa_client srp_sa_client; +static int srp_tmo_get(char *buffer, const struct kernel_param *kp) +{ + int tmo = *(int *)kp-arg; + + if (tmo = 0) + return sprintf(buffer, %d, tmo); + else + return sprintf(buffer, off); +} + +static int srp_tmo_set(const char *val, const struct kernel_param *kp) +{ + int tmo, res; + + if (strcmp(val, off) != 0) { + res = kstrtoint(val, 0, tmo); + if (res) + goto out; + } else { + tmo = -1; + } + if (kp-arg == srp_fast_io_fail_tmo) + res = srp_tmo_valid(tmo, srp_dev_loss_tmo); + else + res = srp_tmo_valid(srp_fast_io_fail_tmo, tmo); + if (res) + goto out; + *(int *)kp-arg = tmo; + +out: + return res; +} + +static struct kernel_param_ops srp_tmo_ops = { + .get = srp_tmo_get, + .set = srp_tmo_set, +}; + static inline struct srp_target_port *host_to_target(struct Scsi_Host *host) { return (struct srp_target_port *) host-hostdata; @@ -739,13 +802,20 @@ static void srp_terminate_io(struct srp_rport *rport) } } -static int srp_reconnect_target(struct srp_target_port *target) +/* + * It is up to the caller to ensure that srp_rport_reconnect() calls are + * serialized and that no concurrent srp_queuecommand(), srp_abort(), + * srp_reset_device() or srp_reset_host() calls will occur while this function + * is in progress. One way to realize that is not to call this function + * directly but to call srp_reconnect_rport() instead since that last function + * serializes calls of this function via rport-mutex and also blocks + * srp_queuecommand() calls before invoking this function. + */ +static int srp_rport_reconnect(struct srp_rport *rport) { - struct Scsi_Host *shost = target-scsi_host; + struct srp_target_port *target = rport-lld_data; int i, ret; - scsi_target_block(shost-shost_gendev); - srp_disconnect_target(target); /* * Now get a new local CM ID so that we avoid confusing the target in @@ -775,28 +845,9 @@ static int srp_reconnect_target(struct srp_target_port *target) if (ret == 0) ret = srp_connect_target(target); - scsi_target_unblock(shost-shost_gendev, ret == 0 ? SDEV_RUNNING : - SDEV_TRANSPORT_OFFLINE); - target-transport_offline = !!ret; - - if (ret) - goto err; - - shost_printk(KERN_INFO, target-scsi_host, PFX reconnect succeeded\n); - - return ret; - -err: - shost_printk(KERN_ERR, target-scsi_host, -
[PATCH 10/14] IB/srp: Start timers if a transport layer error occurs
Start the reconnect timer, fast_io_fail timer and dev_loss timer if a transport layer error occurs. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@kernel.org Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 19 +++ drivers/infiniband/ulp/srp/ib_srp.h |1 + 2 files changed, 20 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 6f3e0e5..7cf44e1 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -595,6 +595,7 @@ static void srp_remove_target(struct srp_target_port *target) srp_disconnect_target(target); ib_destroy_cm_id(target-cm_id); srp_free_target_ib(target); + cancel_work_sync(target-tl_err_work); srp_rport_put(target-rport); srp_free_req_data(target); scsi_host_put(target-scsi_host); @@ -1394,6 +1395,21 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc) PFX Recv failed with error code %d\n, res); } +/** + * srp_tl_err_work() - handle a transport layer error + * + * Note: This function may get invoked before the rport has been created, + * hence the target-rport test. + */ +static void srp_tl_err_work(struct work_struct *work) +{ + struct srp_target_port *target; + + target = container_of(work, struct srp_target_port, tl_err_work); + if (target-rport) + srp_start_tl_fail_timers(target-rport); +} + static void srp_handle_qp_err(enum ib_wc_status wc_status, enum ib_wc_opcode wc_opcode, struct srp_target_port *target) @@ -1403,6 +1419,7 @@ static void srp_handle_qp_err(enum ib_wc_status wc_status, PFX failed %s status %d\n, wc_opcode IB_WC_RECV ? receive : send, wc_status); + queue_work(system_long_wq, target-tl_err_work); } target-qp_in_error = true; } @@ -1763,6 +1780,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event) if (ib_send_cm_drep(cm_id, NULL, 0)) shost_printk(KERN_ERR, target-scsi_host, PFX Sending CM DREP failed\n); + queue_work(system_long_wq, target-tl_err_work); break; case IB_CM_TIMEWAIT_EXIT: @@ -2374,6 +2392,7 @@ static ssize_t srp_create_target(struct device *dev, sizeof (struct srp_indirect_buf) + target-cmd_sg_cnt * sizeof (struct srp_direct_buf); + INIT_WORK(target-tl_err_work, srp_tl_err_work); INIT_WORK(target-remove_work, srp_remove_work); spin_lock_init(target-lock); INIT_LIST_HEAD(target-free_tx); diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h index fda82f7..e45d9d0 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.h +++ b/drivers/infiniband/ulp/srp/ib_srp.h @@ -175,6 +175,7 @@ struct srp_target_port { struct srp_iu *rx_ring[SRP_RQ_SIZE]; struct srp_request req_ring[SRP_CMD_SQ_SIZE]; + struct work_struct tl_err_work; struct work_struct remove_work; struct list_headlist; -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/14] IB/srp: Fail SCSI commands silently
From: Sebastian Riemer sebastian.rie...@profitbricks.com Avoid that path failover in a multipath setup causes the SCSI layer to generate kernel messages about SCSI command failures. This patch speeds up SRP initiator operation significantly when monitoring kernel messages over a serial port. [bvanassche: Changed patch description] Signed-off-by: Sebastian Riemer sebastian.rie...@profitbricks.com Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com --- drivers/infiniband/ulp/srp/ib_srp.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 7cf44e1..dbedac9 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -787,6 +787,7 @@ static void srp_finish_req(struct srp_target_port *target, if (scmnd) { srp_free_req(target, req, scmnd, 0); + scmnd-request-cmd_flags |= REQ_QUIET; scmnd-result = result; scmnd-scsi_done(scmnd); } @@ -1467,6 +1468,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) result = srp_chkready(target-rport); if (unlikely(result)) { + scmnd-request-cmd_flags |= REQ_QUIET; scmnd-result = result; scmnd-scsi_done(scmnd); return 0; @@ -1868,6 +1870,7 @@ static int srp_abort(struct scsi_cmnd *scmnd) else ret = FAILED; srp_free_req(target, req, scmnd, 0); + scmnd-request-cmd_flags |= REQ_QUIET; scmnd-result = DID_ABORT 16; scmnd-scsi_done(scmnd); -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/14] IB/srp: Make HCA completion vector configurable
Several InfiniBand HCA's allow to configure the completion vector per queue pair. This allows to spread the workload created by IB completion interrupts over multiple MSI-X vectors and hence over multiple CPU cores. In other words, configuring the completion vector properly not only allows to reduce latency on an initiator connected to multiple SRP targets but also allows to improve throughput. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@kernel.org Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 26 -- drivers/infiniband/ulp/srp/ib_srp.h |1 + 2 files changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index dbedac9..49d8464 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -294,14 +294,16 @@ static int srp_create_target_ib(struct srp_target_port *target) return -ENOMEM; recv_cq = ib_create_cq(target-srp_host-srp_dev-dev, - srp_recv_completion, NULL, target, SRP_RQ_SIZE, 0); + srp_recv_completion, NULL, target, SRP_RQ_SIZE, + target-comp_vector); if (IS_ERR(recv_cq)) { ret = PTR_ERR(recv_cq); goto err; } send_cq = ib_create_cq(target-srp_host-srp_dev-dev, - srp_send_completion, NULL, target, SRP_SQ_SIZE, 0); + srp_send_completion, NULL, target, SRP_SQ_SIZE, + target-comp_vector); if (IS_ERR(send_cq)) { ret = PTR_ERR(send_cq); goto err_recv_cq; @@ -2006,6 +2008,14 @@ static ssize_t show_local_ib_device(struct device *dev, return sprintf(buf, %s\n, target-srp_host-srp_dev-dev-name); } +static ssize_t show_comp_vector(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct srp_target_port *target = host_to_target(class_to_shost(dev)); + + return sprintf(buf, %d\n, target-comp_vector); +} + static ssize_t show_cmd_sg_entries(struct device *dev, struct device_attribute *attr, char *buf) { @@ -2032,6 +2042,7 @@ static DEVICE_ATTR(req_lim, S_IRUGO, show_req_lim, NULL); static DEVICE_ATTR(zero_req_lim,S_IRUGO, show_zero_req_lim, NULL); static DEVICE_ATTR(local_ib_port, S_IRUGO, show_local_ib_port, NULL); static DEVICE_ATTR(local_ib_device, S_IRUGO, show_local_ib_device, NULL); +static DEVICE_ATTR(comp_vector, S_IRUGO, show_comp_vector, NULL); static DEVICE_ATTR(cmd_sg_entries, S_IRUGO, show_cmd_sg_entries, NULL); static DEVICE_ATTR(allow_ext_sg,S_IRUGO, show_allow_ext_sg,NULL); @@ -2046,6 +2057,7 @@ static struct device_attribute *srp_host_attrs[] = { dev_attr_zero_req_lim, dev_attr_local_ib_port, dev_attr_local_ib_device, + dev_attr_comp_vector, dev_attr_cmd_sg_entries, dev_attr_allow_ext_sg, NULL @@ -2140,6 +2152,7 @@ enum { SRP_OPT_CMD_SG_ENTRIES = 1 9, SRP_OPT_ALLOW_EXT_SG= 1 10, SRP_OPT_SG_TABLESIZE= 1 11, + SRP_OPT_COMP_VECTOR = 1 12, SRP_OPT_ALL = (SRP_OPT_ID_EXT | SRP_OPT_IOC_GUID | SRP_OPT_DGID | @@ -2160,6 +2173,7 @@ static const match_table_t srp_opt_tokens = { { SRP_OPT_CMD_SG_ENTRIES, cmd_sg_entries=%u }, { SRP_OPT_ALLOW_EXT_SG, allow_ext_sg=%u }, { SRP_OPT_SG_TABLESIZE, sg_tablesize=%u }, + { SRP_OPT_COMP_VECTOR, comp_vector=%u}, { SRP_OPT_ERR, NULL} }; @@ -2315,6 +2329,14 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target) target-sg_tablesize = token; break; + case SRP_OPT_COMP_VECTOR: + if (match_int(args, token) || token 0) { + pr_warn(bad comp_vector parameter '%s'\n, p); + goto out; + } + target-comp_vector = token; + break; + default: pr_warn(unknown parameter or missing value '%s' in target creation request\n, p); diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h index e45d9d0..cbc0b14 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.h +++ b/drivers/infiniband/ulp/srp/ib_srp.h @@ -156,6 +156,7 @@ struct srp_target_port { char
[PATCH 13/14] IB/srp: Make transport layer retry count configurable
From: Vu Pham vuhu...@mellanox.com Allow the InfiniBand RC retry count to be configured by the user as an option in the target login string. The transport layer timeout in nanoseconds is computed as follows from the retry count: rc_timeout = rc_retry_count * 4 * 4096 * (1 qp-timeout) The default value for tl_retry_count is changed from 7 into 2. Hence with a qp-timedout value of 19 this patch reduces the default transport layer timeout from about 60s to about 17s. The purpose of this patch is to reduce the time needed for SCSI error handling significantly and at the same time to avoid activating the SCSI error handler on an IB path with a regular BER or due to brief IB network congestion. [bvanassche: Rewrote patch description] Signed-off-by: Vu Pham v...@mellanox.com Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 24 +++- drivers/infiniband/ulp/srp/ib_srp.h |1 + 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 49d8464..44ee93a 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -453,7 +453,7 @@ static int srp_send_req(struct srp_target_port *target) req-param.responder_resources= 4; req-param.remote_cm_response_timeout = 20; req-param.local_cm_response_timeout = 20; - req-param.retry_count= 7; + req-param.retry_count= target-tl_retry_count; req-param.rnr_retry_count= 7; req-param.max_cm_retries = 15; @@ -2016,6 +2016,14 @@ static ssize_t show_comp_vector(struct device *dev, return sprintf(buf, %d\n, target-comp_vector); } +static ssize_t show_tl_retry_count(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct srp_target_port *target = host_to_target(class_to_shost(dev)); + + return sprintf(buf, %d\n, target-tl_retry_count); +} + static ssize_t show_cmd_sg_entries(struct device *dev, struct device_attribute *attr, char *buf) { @@ -2043,6 +2051,7 @@ static DEVICE_ATTR(zero_req_lim,S_IRUGO, show_zero_req_lim, NULL); static DEVICE_ATTR(local_ib_port, S_IRUGO, show_local_ib_port, NULL); static DEVICE_ATTR(local_ib_device, S_IRUGO, show_local_ib_device, NULL); static DEVICE_ATTR(comp_vector, S_IRUGO, show_comp_vector, NULL); +static DEVICE_ATTR(tl_retry_count, S_IRUGO, show_tl_retry_count, NULL); static DEVICE_ATTR(cmd_sg_entries, S_IRUGO, show_cmd_sg_entries, NULL); static DEVICE_ATTR(allow_ext_sg,S_IRUGO, show_allow_ext_sg,NULL); @@ -2058,6 +2067,7 @@ static struct device_attribute *srp_host_attrs[] = { dev_attr_local_ib_port, dev_attr_local_ib_device, dev_attr_comp_vector, + dev_attr_tl_retry_count, dev_attr_cmd_sg_entries, dev_attr_allow_ext_sg, NULL @@ -2153,6 +2163,7 @@ enum { SRP_OPT_ALLOW_EXT_SG= 1 10, SRP_OPT_SG_TABLESIZE= 1 11, SRP_OPT_COMP_VECTOR = 1 12, + SRP_OPT_TL_RETRY_COUNT = 1 13, SRP_OPT_ALL = (SRP_OPT_ID_EXT | SRP_OPT_IOC_GUID | SRP_OPT_DGID | @@ -2174,6 +2185,7 @@ static const match_table_t srp_opt_tokens = { { SRP_OPT_ALLOW_EXT_SG, allow_ext_sg=%u }, { SRP_OPT_SG_TABLESIZE, sg_tablesize=%u }, { SRP_OPT_COMP_VECTOR, comp_vector=%u}, + { SRP_OPT_TL_RETRY_COUNT, tl_retry_count=%u }, { SRP_OPT_ERR, NULL} }; @@ -2337,6 +2349,15 @@ static int srp_parse_options(const char *buf, struct srp_target_port *target) target-comp_vector = token; break; + case SRP_OPT_TL_RETRY_COUNT: + if (match_int(args, token) || token 2 || token 7) { + pr_warn(bad tl_retry_count parameter '%s' (must be a number between 2 and 7)\n, + p); + goto out; + } + target-tl_retry_count = token; + break; + default: pr_warn(unknown parameter or missing value '%s' in target creation request\n, p); @@ -2391,6 +2412,7 @@ static ssize_t srp_create_target(struct device *dev, target-cmd_sg_cnt = cmd_sg_entries; target-sg_tablesize= indirect_sg_entries ? : cmd_sg_entries; target-allow_ext_sg= allow_ext_sg;
[PATCH 14/14] IB/srp: Bump driver version and release date
From: Vu Pham vuhu...@mellanox.com Signed-off-by: Vu Pham v...@mellanox.com Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 44ee93a..dd4b1a7 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -53,8 +53,8 @@ #define DRV_NAME ib_srp #define PFXDRV_NAME : -#define DRV_VERSION0.2 -#define DRV_RELDATENovember 1, 2005 +#define DRV_VERSION1.0 +#define DRV_RELDATEJune 1, 2013 MODULE_AUTHOR(Roland Dreier); MODULE_DESCRIPTION(InfiniBand SCSI RDMA Protocol initiator -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
On 06/12/13 15:20, Bart Van Assche wrote: If the add_one callback fails during driver load no resources are allocated so there isn't a need to release any resources. Trying to clean the resource may lead to the following kernel panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [a0132331] srp_remove_one+0x31/0x240 [ib_srp] RIP: 0010:[a0132331] [a0132331] srp_remove_one+0x31/0x240 [ib_srp] Process rmmod (pid: 4562, threadinfo 8800dd738000, task 8801167e60c0) Call Trace: [a024500e] ib_unregister_client+0x4e/0x120 [ib_core] [a01361bd] srp_cleanup_module+0x15/0x71 [ib_srp] [810ac6a4] sys_delete_module+0x194/0x260 [8100b0f2] system_call_fastpath+0x16/0x1b [bvanassche: Shortened patch description] Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Reviewed-by: Eli Cohen e...@mellanox.co.il Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 7ccf328..368d160 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2507,6 +2507,8 @@ static void srp_remove_one(struct ib_device *device) struct srp_target_port *target; srp_dev = ib_get_client_data(device, srp_client); + if (!srp_dev) + return; list_for_each_entry_safe(host, tmp_host, srp_dev-dev_list, list) { device_unregister(host-dev); Please note that this patch was authored by Dotan Barak, so I should have mentioned: From: Dotan Barak dot...@dev.mellanox.co.il -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags 1/6] ibtracert.c: fix resource leak
From: Dan Ben Yosef da...@dev.mellanox.co.il Call return without freeing the buffer. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibtracert.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/src/ibtracert.c b/src/ibtracert.c index 5800e40..cc0ac98 100644 --- a/src/ibtracert.c +++ b/src/ibtracert.c @@ -567,13 +567,15 @@ static Node *find_mcpath(ib_portid_t * from, int mlid) if (from-drpath.cnt 0) path-drpath.cnt--; } else { - if (!(port = calloc(1, sizeof(Port + if (!(port = calloc(1, sizeof(Port { IBERROR(out of memory); - + return 0; + } if (get_port(port, i, path) 0) { IBWARN (can't reach node %s port %d, portid2str(path), i); + free(port); return 0; } @@ -585,8 +587,10 @@ static Node *find_mcpath(ib_portid_t * from, int mlid) link_port(port, node); #endif - if (extend_dpath(path-drpath, i) 0) + if (extend_dpath(path-drpath, i) 0) { + free(port); return 0; + } } if (!(remotenode = calloc(1, sizeof(Node -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags 5/6] ibtracert.c: Uninitialized variable
From: Dan Ben Yosef da...@dev.mellanox.co.il Reading from uninitialized variable Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibtracert.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/src/ibtracert.c b/src/ibtracert.c index cc0ac98..9699ccc 100644 --- a/src/ibtracert.c +++ b/src/ibtracert.c @@ -250,6 +250,13 @@ static int find_route(ib_portid_t * from, ib_portid_t * to, int dump) int maxhops = MAXHOPS; int portnum, outport; + memset(fromnode,0,sizeof(Node)); + memset(tonode,0,sizeof(Node)); + memset(nextnode,0,sizeof(Node)); + memset(fromport,0,sizeof(Port)); + memset(toport,0,sizeof(Port)); + memset(nextport,0,sizeof(Port)); + DEBUG(from %s, portid2str(from)); if (get_node(fromnode, fromport, from) 0 || -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags 3/6] ibstat.c: fix buffer-not-null-terminated
From: Dan Ben Yosef da...@dev.mellanox.co.il Buffer may not have a null terminator if the source string's length is equal to the buffer size. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibstat.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index acb9e8e..bdc021b 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -279,7 +279,7 @@ int main(int argc, char *argv[]) int dev_port = -1; int n, i; - memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN)); + memset(names, 0, sizeof(names[0][0]) * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN); const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags 4/6] ibsysstat.c: fix resource leak
From: Dan Ben Yosef da...@dev.mellanox.co.il Memory leak. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibsysstat.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/src/ibsysstat.c b/src/ibsysstat.c index eb876d4..c1438c1 100644 --- a/src/ibsysstat.c +++ b/src/ibsysstat.c @@ -271,8 +271,10 @@ int build_cpuinfo(void) while (fgets(line, sizeof(line) - 1, f)) { if (!strncmp(line, processor\t, 10)) { ncpu++; - if (ncpu MAX_CPUS) + if (ncpu MAX_CPUS) { + fclose(f); return MAX_CPUS; + } continue; } -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null-terminated
From: Dan Ben Yosef da...@dev.mellanox.co.il Buffer may not have a null terminator if the source string's length is equal to the buffer size. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibstat.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..acb9e8e 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -279,6 +279,7 @@ int main(int argc, char *argv[]) int dev_port = -1; int n, i; + memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN)); const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, @@ -314,7 +315,7 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[i], argv[0], sizeof names[i]-1); n = 1; } -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH infiniband-diags 6/6] saquery.c: resource leak
From: Dan Ben Yosef da...@dev.mellanox.co.il Overwriting node_name in call node_name = remap_node_name(node_name_map, nr-node_info.node_guid, (char*)nr-node_desc.description) leaks the storage that node_name points to. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/saquery.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/src/saquery.c b/src/saquery.c index d31d77d..ca223c6 100644 --- a/src/saquery.c +++ b/src/saquery.c @@ -365,6 +365,8 @@ static void dump_multicast_member_record(ib_member_rec_t *p_mcmr, ib_node_record_t *nr = sa_get_query_rec(nr_result-p_result_madw, i); if (nr-node_info.port_guid == p_mcmr-port_gid.unicast.interface_id) { + if(node_name != NULL) + free(node_name); node_name = remap_node_name(node_name_map, nr-node_info.node_guid, (char *)nr-node_desc.description); -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion
On 12.06.2013 15:38, Bart Van Assche wrote: On 06/12/13 15:20, Bart Van Assche wrote: If the add_one callback fails during driver load no resources are allocated so there isn't a need to release any resources. Trying to clean the resource may lead to the following kernel panic: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [a0132331] srp_remove_one+0x31/0x240 [ib_srp] RIP: 0010:[a0132331] [a0132331] srp_remove_one+0x31/0x240 [ib_srp] Process rmmod (pid: 4562, threadinfo 8800dd738000, task 8801167e60c0) Call Trace: [a024500e] ib_unregister_client+0x4e/0x120 [ib_core] [a01361bd] srp_cleanup_module+0x15/0x71 [ib_srp] [810ac6a4] sys_delete_module+0x194/0x260 [8100b0f2] system_call_fastpath+0x16/0x1b [bvanassche: Shortened patch description] Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il Reviewed-by: Eli Cohen e...@mellanox.co.il Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 7ccf328..368d160 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2507,6 +2507,8 @@ static void srp_remove_one(struct ib_device *device) struct srp_target_port *target; srp_dev = ib_get_client_data(device, srp_client); +if (!srp_dev) +return; list_for_each_entry_safe(host, tmp_host, srp_dev-dev_list, list) { device_unregister(host-dev); Please note that this patch was authored by Dotan Barak, so I should have mentioned: From: Dotan Barak dot...@dev.mellanox.co.il Acked-by: Sebastian Riemer sebastian.rie...@profitbricks.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE:[PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
An SRP target is required to maintain a single connection between initiator and target. This means that if the 'add_target' attribute is used to create a second connection to a target that the first connection will be logged out and that the SCSI error handler will kick in. The SCSI error handler will cause the SRP initiator to reconnect, which will cause I/O over the second connection to fail. Avoid such ping-pong behavior by disabling relogins. Note: if reconnecting manually is necessary, that is possible by deleting and recreating an rport via sysfs. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@kernel.org Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Thanks Bart for refreshing this new patch set. I think you should add Signed-off-by for Sebastian ? Best regards, Jack -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
Wait a minute, so you've changed this commit to also hold that target lock in the following functions in error case: srp_unmap_data(), srp_put_tx_iu() This is different from: https://github.com/bvanassche/ib_srp-backport/commit/6ce0e30dbb69973926df84292239f0c20f6a2d6c srp_unmap_data() calls ib_fmr_pool_unmap() which uses an own spin lock (pool-pool_lock). srp_put_tx_iu() acquires the target lock as well (target-lock). That's spin lock in spin lock. I would say that this dead locks. I like the other version more. Cheers, Sebastian On 12.06.2013 15:21, Bart Van Assche wrote: Avoid that srp_claim_command() can claim a command while srp_queuecommand() is still busy queueing the same command. Found this via source reading. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 368d160..9c638dd 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1367,7 +1367,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) req = list_first_entry(target-free_reqs, struct srp_request, list); list_del(req-list); - spin_unlock_irqrestore(target-lock, flags); dev = target-srp_host-srp_dev-dev; ib_dma_sync_single_for_cpu(dev, iu-dma, target-max_iu_len, @@ -1401,6 +1400,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) shost_printk(KERN_ERR, target-scsi_host, PFX Send failed\n); goto err_unmap; } + spin_unlock_irqrestore(target-lock, flags); return 0; @@ -1409,8 +1409,6 @@ err_unmap: err_iu: srp_put_tx_iu(target, iu, SRP_IU_CMD); - - spin_lock_irqsave(target-lock, flags); list_add(req-list, target-free_reqs); err_unlock: -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()
On 06/12/13 16:58, Sebastian Riemer wrote: Wait a minute, so you've changed this commit to also hold that target lock in the following functions in error case: srp_unmap_data(), srp_put_tx_iu() This is different from: https://github.com/bvanassche/ib_srp-backport/commit/6ce0e30dbb69973926df84292239f0c20f6a2d6c srp_unmap_data() calls ib_fmr_pool_unmap() which uses an own spin lock (pool-pool_lock). srp_put_tx_iu() acquires the target lock as well (target-lock). That's spin lock in spin lock. I would say that this dead locks. I like the other version more. Not sure how I missed that ... I will drop this version and replace it with the proper fix. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus
On 06/12/13 16:29, Jack Wang wrote: An SRP target is required to maintain a single connection between initiator and target. This means that if the 'add_target' attribute is used to create a second connection to a target that the first connection will be logged out and that the SCSI error handler will kick in. The SCSI error handler will cause the SRP initiator to reconnect, which will cause I/O over the second connection to fail. Avoid such ping-pong behavior by disabling relogins. Note: if reconnecting manually is necessary, that is possible by deleting and recreating an rport via sysfs. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@kernel.org Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Thanks Bart for refreshing this new patch set. I think you should add Signed-off-by for Sebastian ? This patch differs slightly from what Sebastian had posted. But if Sebastian agrees with this version, I'll be happy to add his signed-off-by. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH] infiniband_diags/src/saquery.c: Add SMInfoRecord support
Husam, Sorry for the delay but I have to report that this patch is somehow corrupted. I can't even apply the patch by hand after cleaning up the invalid line wrap. 10:46:39 patch -p 1 ../incomming/infiniband_diags-src-saquery.c-Add-SMInfoRecord-support.patch patching file doc/rst/saquery.8.in.rst Hunk #1 FAILED at 156. 1 out of 1 hunk FAILED -- saving rejects to file doc/rst/saquery.8.in.rst.rej patching file src/saquery.c Hunk #1 FAILED at 487. Hunk #2 FAILED at 1192. 2 out of 2 hunks FAILED -- saving rejects to file src/saquery.c.rej Could you rebase and try again? Ira -Original Message- From: Husam Kahalah [mailto:hkaha...@asaltech.com] Sent: Tuesday, May 28, 2013 5:04 AM To: Weiny, Ira Cc: linux-rdma@vger.kernel.org; da...@mellanox.com; dani...@mellanox.com Subject: [PATCH] infiniband_diags/src/saquery.c: Add SMInfoRecord support This patch is the combination of 2 patches infiniband_diags/src/saquery.c: Add SMInfoRecord support infiniband_diags/doc/rst/saquery.8.in.rst: Add support to the manual Signed-off-by: Husam Kahalah hkaha...@asaltech.com --- doc/rst/saquery.8.in.rst |1 + src/saquery.c| 45 + 2 files changed, 46 insertions(+) diff --git a/doc/rst/saquery.8.in.rst b/doc/rst/saquery.8.in.rst index 8d87096..0551917 100644 --- a/doc/rst/saquery.8.in.rst +++ b/doc/rst/saquery.8.in.rst @@ -156,6 +156,7 @@ Supported query names (and aliases): MFTRecord (MFTR) [[mlid]/[position]/[block]] GUIDInfoRecord (GIR) [[lid]/[block]] SwitchInfoRecord (SWIR) [lid] +SMInfoRecord (SMIR) [lid] diff --git a/src/saquery.c b/src/saquery.c index a5f9171..d60ac19 100644 --- a/src/saquery.c +++ b/src/saquery.c @@ -487,6 +487,30 @@ static void dump_service_record(void *data) cl_ntoh64(p_sr-service_data64[1])); } +static void dump_sm_info_record(void *data) { +ib_sminfo_record_t *p_smr = data; +const ib_sm_info_t *const p_smi = p_smr-sm_info; +uint8_t priority, state; +priority = ib_sminfo_get_priority(p_smi); +state = ib_sminfo_get_state(p_smi); + +printf(SMInfoRecord dump:\n + \t\tRID\n + \t\tLID...%u\n + \t\tSMInfo dump:\n + \t\tGUID..0x%016 PRIx64 \n + \t\tSM_Key0x%016 PRIx64 \n + \t\tActCount..0x%X\n + \t\tPriority..%u\n + \t\tSMState...%u\n, + cl_ntoh16(p_smr-lid), + cl_ntoh64(p_smr-sm_info.guid), + cl_ntoh64(p_smr-sm_info.sm_key), + cl_ntoh32(p_smr-sm_info.act_count), + priority, state); +} + static void dump_switch_info_record(void *data) { ib_switch_info_record_t *p_sir = data; @@ -1192,6 +1216,25 @@ static int query_service_records(const struct query_cmd *q, struct sa_handle * h dump_service_record); } +static int query_sm_info_records(const struct query_cmd *q, + struct sa_handle * h, struct query_params *p, + int argc, char *argv[]) { +ib_sminfo_record_t smir; +ib_net64_t comp_mask = 0; +int lid = 0; + +if (argc 0) +parse_lid_and_ports(h, argv[0], lid, NULL, NULL); + +memset(smir, 0, sizeof(smir)); +CHECK_AND_SET_VAL(lid, 16, 0, smir.lid, SMIR, LID); + +return get_and_dump_any_records(h, IB_SA_ATTR_SMINFORECORD, 0, +comp_mask, smir, sizeof(smir), +dump_sm_info_record); } + static int query_switchinfo_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) @@ -1404,6 +1447,8 @@ static const struct query_cmd query_cmds[] = { [[lid]/[block]], query_guidinfo_records}, {SwitchInfoRecord, SWIR, IB_SA_ATTR_SWITCHINFORECORD, [lid], query_switchinfo_records}, +{SMInfoRecord, SMIR, IB_SA_ATTR_SMINFORECORD, +[lid], query_sm_info_records}, {0} }; -- 1.7.9.6 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH infiniband-diags] libibnetdisc/Makefile.am: Add missing makefile dependency
-Original Message- From: Hal Rosenstock [mailto:h...@dev.mellanox.co.il] Subject: Re: [PATCH infiniband-diags] libibnetdisc/Makefile.am: Add missing makefile dependency On 6/12/2013 2:25 PM, Weiny, Ira wrote: What is the error which this fixed? It's an issue when parallel make is used (make -j) where this library might not have completed building first. Ok, I've never hit this, just luck I guess. ;-) I would think that the test_testleaks_LDFLAGS would take care of this? I don't think so. Ok, seems reasonable. Thanks, applied, Ira -- Hal Ira -Original Message- From: Hal Rosenstock [mailto:h...@dev.mellanox.co.il] Sent: Wednesday, June 12, 2013 4:04 AM To: Weiny, Ira Cc: linux-rdma (linux-rdma@vger.kernel.org); Doron Tsur Subject: [PATCH infiniband-diags] libibnetdisc/Makefile.am: Add missing makefile dependency Found-by: Doron Tsur dor...@mellanox.com Signed-off-by: Hal Rosenstock h...@mellanox.com --- diff --git a/libibnetdisc/Makefile.am b/libibnetdisc/Makefile.am index d05604f..2cfcc33 100644 --- a/libibnetdisc/Makefile.am +++ b/libibnetdisc/Makefile.am @@ -35,6 +35,7 @@ libibnetdiscincludedir = $(includedir)/infiniband test_testleaks_SOURCES = test/testleaks.c test_testleaks_CFLAGS = -Wall $(DBGFLAGS) test_testleaks_LDFLAGS = -libnetdisc +test_testleaks_DEPENDENCIES = libibnetdisc.la libibnetdiscinclude_HEADERS = $(srcdir)/include/infiniband/ibnetdisc.h \ $(srcdir)/include/infiniband/ibnetdisc_osd.h -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH V2 1/4] RDMA/cma: Add IPv6 support for iWARP.
Hello Steve, I reviewed the iWarp IPv6 nes patch. Thanks, Tatyana -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Vipul Pandya Sent: Wednesday, June 12, 2013 6:42 AM To: linux-rdma@vger.kernel.org; net...@vger.kernel.org Cc: rol...@purestorage.com; da...@davemloft.net; d...@chelsio.com; d...@chelsio.com; sw...@opengridcomputing.com; rol...@kernel.org; Hefty, Sean; hal.rosenst...@gmail.com; t...@opengridcomputing.com; Latif, Faisal; a...@linux-foundation.org; sasha.le...@oracle.com; nirran...@chelsio.com; vi...@chelsio.com Subject: [PATCH V2 1/4] RDMA/cma: Add IPv6 support for iWARP. From: Steve Wise sw...@opengridcomputing.com This patch modifies the type of local_addr and remote_addr fields in struct iw_cm_id from struct sockaddr_in to struct sockaddr_storage to hold IPv6 and IPv4 addresses uniformly. It changes the references of local_addr and remote_addr in RDMA/cxgb4, RDMA/cxgb3, RDMA/nes and amso drivers such that build failure is avoided. However to be able to actully run the traffic over IPv6 address respective drivers have to add supportive code. Signed-off-by: Steve Wise sw...@opengridcomputing.com --- V2: Used local variables instead of typecasting in patch 1/4. drivers/infiniband/core/cma.c | 69 --- drivers/infiniband/hw/amso1100/c2_ae.c | 18 ++-- drivers/infiniband/hw/amso1100/c2_cm.c | 16 +++- drivers/infiniband/hw/cxgb3/iwch_cm.c | 46 ++ drivers/infiniband/hw/cxgb4/cm.c | 55 ++-- drivers/infiniband/hw/nes/nes_cm.c | 148 ++--- include/rdma/iw_cm.h | 8 +- 7 files changed, 205 insertions(+), 155 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 34fbc2f..72b15c1 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1335,8 +1335,9 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event) { struct rdma_id_private *id_priv = iw_id-context; struct rdma_cm_event event; - struct sockaddr_in *sin; int ret = 0; + struct sockaddr *laddr = (struct sockaddr *)iw_event-local_addr; + struct sockaddr *raddr = (struct sockaddr *)iw_event-remote_addr; if (cma_disable_callback(id_priv, RDMA_CM_CONNECT)) return 0; @@ -1347,10 +1348,10 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event) event.event = RDMA_CM_EVENT_DISCONNECTED; break; case IW_CM_EVENT_CONNECT_REPLY: - sin = (struct sockaddr_in *) id_priv-id.route.addr.src_addr; - *sin = iw_event-local_addr; - sin = (struct sockaddr_in *) id_priv-id.route.addr.dst_addr; - *sin = iw_event-remote_addr; + memcpy(id_priv-id.route.addr.src_addr, laddr, + ip_addr_size(laddr)); + memcpy(id_priv-id.route.addr.dst_addr, raddr, + ip_addr_size(raddr)); switch (iw_event-status) { case 0: event.event = RDMA_CM_EVENT_ESTABLISHED; @@ -1400,11 +1401,12 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, { struct rdma_cm_id *new_cm_id; struct rdma_id_private *listen_id, *conn_id; - struct sockaddr_in *sin; struct net_device *dev = NULL; struct rdma_cm_event event; int ret; struct ib_device_attr attr; + struct sockaddr *laddr = (struct sockaddr *)iw_event-local_addr; + struct sockaddr *raddr = (struct sockaddr *)iw_event-remote_addr; listen_id = cm_id-context; if (cma_disable_callback(listen_id, RDMA_CM_LISTEN)) @@ -1422,14 +1424,7 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, mutex_lock_nested(conn_id-handler_mutex, SINGLE_DEPTH_NESTING); conn_id-state = RDMA_CM_CONNECT; - dev = ip_dev_find(init_net, iw_event-local_addr.sin_addr.s_addr); - if (!dev) { - ret = -EADDRNOTAVAIL; - mutex_unlock(conn_id-handler_mutex); - rdma_destroy_id(new_cm_id); - goto out; - } - ret = rdma_copy_addr(conn_id-id.route.addr.dev_addr, dev, NULL); + ret = rdma_translate_ip(laddr, conn_id-id.route.addr.dev_addr); if (ret) { mutex_unlock(conn_id-handler_mutex); rdma_destroy_id(new_cm_id); @@ -1447,10 +1442,8 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, cm_id-context = conn_id; cm_id-cm_handler = cma_iw_handler; - sin = (struct sockaddr_in *) new_cm_id-route.addr.src_addr; - *sin = iw_event-local_addr; - sin = (struct sockaddr_in *) new_cm_id-route.addr.dst_addr; - *sin = iw_event-remote_addr; + memcpy(new_cm_id-route.addr.src_addr, laddr, ip_addr_size(laddr)); +
Re: [PATCH V2 1/4] RDMA/cma: Add IPv6 support for iWARP.
From: Nikolova, Tatyana E tatyana.e.nikol...@intel.com Date: Wed, 12 Jun 2013 19:56:37 + I reviewed the iWarp IPv6 nes patch. Can you please not top-post and quote the entire patch like this? -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 1/6] ibtracert.c: fix resource leak
On 6/12/2013 4:54 PM, Ira Weiny wrote: On Wed, 12 Jun 2013 09:55:26 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Call return without freeing the buffer. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com Thanks applied with modifications below. --- src/ibtracert.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/src/ibtracert.c b/src/ibtracert.c index 5800e40..cc0ac98 100644 --- a/src/ibtracert.c +++ b/src/ibtracert.c @@ -567,13 +567,15 @@ static Node *find_mcpath(ib_portid_t * from, int mlid) if (from-drpath.cnt 0) path-drpath.cnt--; } else { -if (!(port = calloc(1, sizeof(Port +if (!(port = calloc(1, sizeof(Port { IBERROR(out of memory); - +return 0; There is no need for this as IBERROR exits. Agreed. Did you want to continue execution for some reason? If so you'll need to change IBERROR to IBWARN and resubmit this hunk as a separate patch. I think it's consistent to the other memory failure handling without this hunk but Dan has the final word as to what he would like done. Thanks. -- Hal Thanks, Ira +} if (get_port(port, i, path) 0) { IBWARN (can't reach node %s port %d, portid2str(path), i); +free(port); return 0; } @@ -585,8 +587,10 @@ static Node *find_mcpath(ib_portid_t * from, int mlid) link_port(port, node); #endif -if (extend_dpath(path-drpath, i) 0) +if (extend_dpath(path-drpath, i) 0) { +free(port); return 0; +} } if (!(remotenode = calloc(1, sizeof(Node -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of ummunot branch?
On Jun 10, 2013, at 11:56 AM, Liran Liss lir...@mellanox.com wrote: Register all address space is the moral equivalent of not having userspace registration, so let's talk about it in those terms. Specifically, there's a subtle difference between: a) telling verbs to register (0...2^64) -- Which is weird because it tells verbs to register memory that isn't in my address space Another way to look at it is specify IO access permissions for address space ranges. This could be useful to implement a buffer pool to be used for a specific MR only, yet still map/unmap memory within this pool on the fly to optimize physical memory utilization. In this case, you would provide smaller ranges than 2^64... Hmm; I'm not sure I understand. Userspace doesn't control what virtual addresses it gets back from mmap/etc. So how is what you're talking about different than regular/reactive memory registration? (vs. pre-emptively registering a whole pile of memory that doesn't exist yet) Specifically: I'm confused because you said you could (preemptively) register some small regions (that assumedly don't yet exist in your virtual memory address space) and use them as memory pools. But given that userspace doesn't control its virtual address ranges, I'm not sure how that's useful. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of ummunot branch?
On Wed, Jun 12, 2013 at 09:10:57PM +, Jeff Squyres (jsquyres) wrote: Another way to look at it is specify IO access permissions for address space ranges. This could be useful to implement a buffer pool to be used for a specific MR only, yet still map/unmap memory within this pool on the fly to optimize physical memory utilization. In this case, you would provide smaller ranges than 2^64... Hmm; I'm not sure I understand. Userspace doesn't control what virtual addresses it gets back from mmap/etc. Yes, it can, via MAP_FIXED. There are lots of fun tricks you can play using that. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of ummunot branch?
On Jun 10, 2013, at 1:26 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: I agree that pushing all registration issues out of the application and (somewhere) into the verbs stack would be a nice solution. Well, it creates a mess in another sense, because now you've lost context. When your MPI goes to do a 1byte send the kernel may well prefetch a few megabytes of page tables, whereas an implementation in userspace still has the context and can say, no I don't need that.. It seems like there are Big Problems on either side of this problem (userspace and kernel). I thought that ummunotify was a good balance between the two -- MPI kept its registration caches (which are annoying, but we have long-since understood that *someone* has to maintain them), but it gets a bulletproof way to keep them coherent. That is what is missing in today's solutions: bulletproofness (plus we have to use the horrid glibc malloc hooks, which are deprecated and are going away). That being said, everyone I've talked to about ODP finds it very, very strange that the kernel would keep memory registrations around for memory that is no longer part of a process. Not only does it MRs are badly named. They are not 'memory registrations'. They are 'address registrations'. Don't conflat address === memory in your head, then it seems weird :) The memory the address space points to is flexible. The address space is tied to the lifetime of the process. It doesn't matter if there is no memory mapped to the address space, the address space is still there. Liran had a good example. You can register address space and then use mmap/munmap/MAP_FIXED to mess around with where it points to ...but this is not how people write applications. Real apps use malloc (and some direct mmap, and perhaps even some shared memory). They don't pay attention to the contiguiousness (is that a word?) of memory/addresses in the large scale. To be clear: the most tightly bound codes *do* actually care about cache hits and locality, but that's in the small scale -- not in the large scale. I would find it hard to believe that a real code would pay attention to where in its address range a given malloc() returns, for example. *That's* what makes this whole concept weird. It seems like this is a perfect kernel space concept, but is quite foreign to userspace developers. A practical example of using this would be to avoid the need to send scatter buffer pointers to the remote. The remote writes into a memory ring and the ring is made 'endless' by clever use of remapping. I don't understand -- please explain your example a bit more...? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/ -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null-terminated
On Wed, 12 Jun 2013 09:55:32 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Buffer may not have a null terminator if the source string's length is equal to the buffer size. Good catch, but I think this is an issue in umad_get_cas_names! Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibstat.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..acb9e8e 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -279,6 +279,7 @@ int main(int argc, char *argv[]) int dev_port = -1; int n, i; + memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN)); I don't think you need this as it will not fix umad_get_cas_names. const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, @@ -314,7 +315,7 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[i], argv[0], sizeof names[i]-1); This is actually dead code. IBPANIC exits, if your linking to libibmad. Do you have a different IBPANIC which does not exit? Do you have a use case which hits this bug? [root@iqa-136 sbin]# ./ibstat myverylongca_name_0123456789_0123456789 ibpanic: [9262] main: 'myverylongca_name_0123456789_0123456789' IB device can't be found: Success [root@iqa-136 sbin]# ./ibstat notfoundca ibpanic: [9273] main: 'notfoundca' IB device can't be found: Success Ira n = 1; } -- 1.7.8.2 -- Ira Weiny ira.we...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 3/6] ibstat.c: fix buffer-not-null-terminated
Same comment as previous patch and this should have been rolled into that patch anyway. Ira On Wed, 12 Jun 2013 09:55:37 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Buffer may not have a null terminator if the source string's length is equal to the buffer size. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibstat.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index acb9e8e..bdc021b 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -279,7 +279,7 @@ int main(int argc, char *argv[]) int dev_port = -1; int n, i; - memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN)); + memset(names, 0, sizeof(names[0][0]) * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN); const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Ira Weiny ira.we...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 4/6] ibsysstat.c: fix resource leak
On Wed, 12 Jun 2013 09:55:43 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Thanks applied! Ira Memory leak. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibsysstat.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/src/ibsysstat.c b/src/ibsysstat.c index eb876d4..c1438c1 100644 --- a/src/ibsysstat.c +++ b/src/ibsysstat.c @@ -271,8 +271,10 @@ int build_cpuinfo(void) while (fgets(line, sizeof(line) - 1, f)) { if (!strncmp(line, processor\t, 10)) { ncpu++; - if (ncpu MAX_CPUS) + if (ncpu MAX_CPUS) { + fclose(f); return MAX_CPUS; + } continue; } -- 1.7.8.2 -- Ira Weiny ira.we...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 5/6] ibtracert.c: Uninitialized variable
On Wed, 12 Jun 2013 09:55:48 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Thanks applied! Ira Reading from uninitialized variable Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibtracert.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/src/ibtracert.c b/src/ibtracert.c index cc0ac98..9699ccc 100644 --- a/src/ibtracert.c +++ b/src/ibtracert.c @@ -250,6 +250,13 @@ static int find_route(ib_portid_t * from, ib_portid_t * to, int dump) int maxhops = MAXHOPS; int portnum, outport; + memset(fromnode,0,sizeof(Node)); + memset(tonode,0,sizeof(Node)); + memset(nextnode,0,sizeof(Node)); + memset(fromport,0,sizeof(Port)); + memset(toport,0,sizeof(Port)); + memset(nextport,0,sizeof(Port)); + DEBUG(from %s, portid2str(from)); if (get_node(fromnode, fromport, from) 0 || -- 1.7.8.2 -- Ira Weiny ira.we...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 6/6] saquery.c: resource leak
On Wed, 12 Jun 2013 09:55:53 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Thanks applied! Ira Overwriting node_name in call node_name = remap_node_name(node_name_map, nr-node_info.node_guid, (char*)nr-node_desc.description) leaks the storage that node_name points to. Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/saquery.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/src/saquery.c b/src/saquery.c index d31d77d..ca223c6 100644 --- a/src/saquery.c +++ b/src/saquery.c @@ -365,6 +365,8 @@ static void dump_multicast_member_record(ib_member_rec_t *p_mcmr, ib_node_record_t *nr = sa_get_query_rec(nr_result-p_result_madw, i); if (nr-node_info.port_guid == p_mcmr-port_gid.unicast.interface_id) { + if(node_name != NULL) + free(node_name); node_name = remap_node_name(node_name_map, nr-node_info.node_guid, (char *)nr-node_desc.description); -- 1.7.8.2 -- Ira Weiny ira.we...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Status of ummunot branch?
On Wed, Jun 12, 2013 at 09:18:34PM +, Jeff Squyres (jsquyres) wrote: Well, it creates a mess in another sense, because now you've lost context. When your MPI goes to do a 1byte send the kernel may well prefetch a few megabytes of page tables, whereas an implementation in userspace still has the context and can say, no I don't need that.. It seems like there are Big Problems on either side of this problem (userspace and kernel). I thought that ummunotify was a good balance between the two -- MPI kept its registration caches (which are annoying, but we have long-since understood that *someone* has to maintain them), but it gets a bulletproof way to keep them coherent. That is what is missing in today's solutions: bulletproofness (plus we have to use the horrid glibc malloc hooks, which are deprecated and are going away). Ditto. Someone has to finish the ummunotify rewrite Roland started. Realistically MPI is going to be the only user, can someone from the MPI world do this? It doesn't matter if there is no memory mapped to the address space, the address space is still there. Liran had a good example. You can register address space and then use mmap/munmap/MAP_FIXED to mess around with where it points to ...but this is not how people write applications. Real apps use malloc (and some direct mmap, and perhaps even some shared memory). *shrug* I used MAP_FIXED for some RDMA regions in my IB verbs apps, specifically to create specalized high-performance memory structures. It isn't a general purpose technique for non-RDMA apps - but especially when combined with ODP it is useful in some places. A practical example of using this would be to avoid the need to send scatter buffer pointers to the remote. The remote writes into a memory ring and the ring is made 'endless' by clever use of remapping. I don't understand -- please explain your example a bit more...? You have a memory pool. There are two mappings to this physical memory, one for the CPU to use, one for RDMA to use. The RDMA mapping is a linear ring, the remote just spews linearly via RDMA WRITE. When messages arrive the CPU xlates the RDMA ring virtual address to the CPU address, and accesses the memory from there. It then finds a free block in the memory pool and remaps it into the RDMA pool and tells the remote that there is more free memory. From the perspective of the remote this creates an endless, apparently linear, ring. When the CPU is done with its memory it adds it back to free block pool. At the start of time the RDMA ring maps 1:1 to the CPU pool. As xfers happen the RDMA rings maps non-linearly depending on when the CPU is done with the memory. There are lots of details to make this work, but you avoid sending s/g lists, and generally make communication more asynchronous. s/g lists are expensive. A 1GB ring requires nearly 2MB to describe with s/g lists, and a 40GB nic can turn that ring over 4 times per second! You can do something similar with sends, but sends have to pre-size buffers, wheras this scheme lets you send any size message with optimal memory usage. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null-terminated
On 6/12/2013 5:19 PM, Ira Weiny wrote: On Wed, 12 Jun 2013 09:55:32 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Buffer may not have a null terminator if the source string's length is equal to the buffer size. Good catch, but I think this is an issue in umad_get_cas_names! Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibstat.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..acb9e8e 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -279,6 +279,7 @@ int main(int argc, char *argv[]) int dev_port = -1; int n, i; +memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN)); I don't think you need this as it will not fix umad_get_cas_names. The proper memset is already in the tree. const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, @@ -314,7 +315,7 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); -strncpy(names[i], argv[0], sizeof names[i]); +strncpy(names[i], argv[0], sizeof names[i]-1); This is actually dead code. IBPANIC exits, if your linking to libibmad. Do you have a different IBPANIC which does not exit? Do you have a use case which hits this bug? [root@iqa-136 sbin]# ./ibstat myverylongca_name_0123456789_0123456789 ibpanic: [9262] main: 'myverylongca_name_0123456789_0123456789' IB device can't be found: Success [root@iqa-136 sbin]# ./ibstat notfoundca ibpanic: [9273] main: 'notfoundca' IB device can't be found: Success I think the change is to quiet a Coverity detected error but Dan is the definitive source for this change. -- Hal Ira n = 1; } -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null-terminated
On 6/12/2013 6:03 PM, Hal Rosenstock wrote: On 6/12/2013 5:19 PM, Ira Weiny wrote: On Wed, 12 Jun 2013 09:55:32 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il Buffer may not have a null terminator if the source string's length is equal to the buffer size. Good catch, but I think this is an issue in umad_get_cas_names! Signed-off-by: Dan Ben Yosef da...@dev.mellanox.co.il Signed-off-by: Hal Rosenstock h...@mellanox.com --- src/ibstat.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..acb9e8e 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -279,6 +279,7 @@ int main(int argc, char *argv[]) int dev_port = -1; int n, i; + memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * UMAD_CA_NAME_LEN)); I don't think you need this as it will not fix umad_get_cas_names. The proper memset is already in the tree. It isn't in the tree; I was looking at modified source... const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, @@ -314,7 +315,7 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[i], argv[0], sizeof names[i]-1); This is actually dead code. IBPANIC exits, if your linking to libibmad. Do you have a different IBPANIC which does not exit? Do you have a use case which hits this bug? [root@iqa-136 sbin]# ./ibstat myverylongca_name_0123456789_0123456789 ibpanic: [9262] main: 'myverylongca_name_0123456789_0123456789' IB device can't be found: Success [root@iqa-136 sbin]# ./ibstat notfoundca ibpanic: [9273] main: 'notfoundca' IB device can't be found: Success I think the change is to quiet a Coverity detected error but Dan is the definitive source for this change. -- Hal Ira n = 1; } -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null-terminated
-Original Message- From: Hal Rosenstock [mailto:h...@dev.mellanox.co.il] Subject: Re: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null- terminated On 6/12/2013 6:03 PM, Hal Rosenstock wrote: On 6/12/2013 5:19 PM, Ira Weiny wrote: On Wed, 12 Jun 2013 09:55:32 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: From: Dan Ben Yosef da...@dev.mellanox.co.il + memset(names, 0, sizeof(names[0][0] * UMAD_MAX_DEVICES * +UMAD_CA_NAME_LEN)); I don't think you need this as it will not fix umad_get_cas_names. The proper memset is already in the tree. It isn't in the tree; I was looking at modified source... Sounds like that should be pushed upstream! ;-) const struct ibdiag_opt opts[] = { {list_of_cas, 'l', 0, NULL, list all IB devices}, {short, 's', 0, NULL, short output}, @@ -314,7 +315,7 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[i], argv[0], sizeof names[i]-1); This is actually dead code. IBPANIC exits, if your linking to libibmad. Do you have a different IBPANIC which does not exit? Do you have a use case which hits this bug? [root@iqa-136 sbin]# ./ibstat myverylongca_name_0123456789_0123456789 ibpanic: [9262] main: 'myverylongca_name_0123456789_0123456789' IB device can't be found: Success [root@iqa-136 sbin]# ./ibstat notfoundca ibpanic: [9273] main: 'notfoundca' IB device can't be found: Success I think the change is to quiet a Coverity detected error but Dan is the definitive source for this change. I think we should remove the dead code and that should fix Coverity. I thought it detected dead code. I wonder if I missed something? Ira -- Hal Ira n = 1; } -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null-terminated
-Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- Subject: RE: [PATCH infiniband-diags 2/6] ibstat.c: fix buffer-not-null- terminated I think the change is to quiet a Coverity detected error but Dan is the definitive source for this change. I think we should remove the dead code and that should fix Coverity. I thought it detected dead code. I wonder if I missed something? That is not dead code. I misread where the break jumps to. However, the code is redundant and there is a bug.[*] 16:16:51 ibstat -l mlx4_0 mlx4_1 16:17:57 ibstat -l mlx4_1 mlx4_0 The intent was to copy argv[0] into names[0] and set n=1. But that is not happening... Ira [*] not that the above is really a valid use case. Ira -- Hal Ira n = 1; } -- 1.7.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] infiniband-diags: ibstat fix strncpy coverity check and -l bug
Signed-off-by: Ira Weiny ira.we...@intel.com --- src/ibstat.c |9 ++--- 1 files changed, 2 insertions(+), 7 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..1ef27f2 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -314,7 +314,8 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[0], argv[0], sizeof(names[0])-1); + names[0][sizeof(names[0])-1] = '\0'; n = 1; } @@ -324,12 +325,6 @@ int main(int argc, char *argv[]) return 0; } - if (!list_only argc) { - if (ca_stat(argv[0], dev_port, short_format) 0) - IBPANIC(stat of IB device '%s' failed, argv[0]); - return 0; - } - for (i = 0; i n; i++) { if (list_only) printf(%s\n, names[i]); -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband-diags: ibstat fix strncpy coverity check and -l bug
On 6/12/2013 8:03 PM, Ira Weiny wrote: This seems like 2 different patches to me. Signed-off-by: Ira Weiny ira.we...@intel.com --- src/ibstat.c |9 ++--- 1 files changed, 2 insertions(+), 7 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..1ef27f2 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -314,7 +314,8 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[0], argv[0], sizeof(names[0])-1); + names[0][sizeof(names[0])-1] = '\0'; n = 1; } @@ -324,12 +325,6 @@ int main(int argc, char *argv[]) return 0; } - if (!list_only argc) { I think I see the issue. Rather than eliminate this if clause, I think the line above should be: if (!list_only argc 1) { so that this properly handles the case when the port option is supplied. -- Hal - if (ca_stat(argv[0], dev_port, short_format) 0) - IBPANIC(stat of IB device '%s' failed, argv[0]); - return 0; - } - for (i = 0; i n; i++) { if (list_only) printf(%s\n, names[i]); -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] infiniband-diags: ibstat fix strncpy coverity check and -l bug
On Wed, 12 Jun 2013 21:38:58 -0400 Hal Rosenstock h...@dev.mellanox.co.il wrote: On 6/12/2013 8:03 PM, Ira Weiny wrote: This seems like 2 different patches to me. Not really... Signed-off-by: Ira Weiny ira.we...@intel.com --- src/ibstat.c |9 ++--- 1 files changed, 2 insertions(+), 7 deletions(-) @@ -324,12 +325,6 @@ int main(int argc, char *argv[]) return 0; } - if (!list_only argc) { I think I see the issue. Rather than eliminate this if clause, I think the line above should be: if (!list_only argc 1) { so that this properly handles the case when the port option is supplied. V2 on the way. I think we should still get rid of that if clause but your right I broke the port parameter. Ira -- Hal - if (ca_stat(argv[0], dev_port, short_format) 0) - IBPANIC(stat of IB device '%s' failed, argv[0]); - return 0; - } - for (i = 0; i n; i++) { if (list_only) printf(%s\n, names[i]); -- Ira Weiny ira.we...@intel.com -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] infiniband-diags: ibstat fix strncpy coverity check and -l bug
Changes since V1: Fix port parameter usage. Signed-off-by: Ira Weiny ira.we...@intel.com --- src/ibstat.c | 11 +++ 1 files changed, 3 insertions(+), 8 deletions(-) diff --git a/src/ibstat.c b/src/ibstat.c index 665bb0a..37f2361 100644 --- a/src/ibstat.c +++ b/src/ibstat.c @@ -314,7 +314,8 @@ int main(int argc, char *argv[]) if (i = n) IBPANIC('%s' IB device can't be found, argv[0]); - strncpy(names[i], argv[0], sizeof names[i]); + strncpy(names[0], argv[0], sizeof(names[0])-1); + names[0][sizeof(names[0])-1] = '\0'; n = 1; } @@ -324,16 +325,10 @@ int main(int argc, char *argv[]) return 0; } - if (!list_only argc) { - if (ca_stat(argv[0], dev_port, short_format) 0) - IBPANIC(stat of IB device '%s' failed, argv[0]); - return 0; - } - for (i = 0; i n; i++) { if (list_only) printf(%s\n, names[i]); - else if (ca_stat(names[i], -1, short_format) 0) + else if (ca_stat(names[i], dev_port, short_format) 0) IBPANIC(stat of IB device '%s' failed, names[i]); } -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html