[openib-general] [PATCH] IB/ipoib: compliance/interoperability fix
ipoib assumes that the high (reserved) octet in the hardware address is 0, and copies it into the QPN. This violates RFC 4391 (which requires that the high 8 bits be ignored on receive), and will result in an invalid QPN being passed to hardware when interoperating with IPoIB connected mode.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 1eaf00e..cdc98b1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -49,6 +49,8 @@ #include <linux/in.h>
 #include <net/dst.h>
 
+#define IPOIB_QPN(ha) (be32_to_cpup((__be32 *) ha) & 0xffffff)
+
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("IP-over-InfiniBand net driver");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -517,6 +516,5 @@ static void neigh_add_path(struct sk_buf
-		ipoib_send(dev, skb, path->ah,
-			   be32_to_cpup((__be32 *) skb->dst->neighbour->ha));
+		ipoib_send(dev, skb, path->ah, IPOIB_QPN(skb->dst->neighbour->ha));
 	} else {
 		neigh->ah  = NULL;
 		__skb_queue_tail(&neigh->queue, skb);
@@ -599,8 +594,7 @@ static void unicast_arp_send(struct sk_b
 		ipoib_dbg(priv, "Send unicast ARP to %04x\n",
 			  be16_to_cpu(path->pathrec.dlid));
 
-		ipoib_send(dev, skb, path->ah,
-			   be32_to_cpup((__be32 *) phdr->hwaddr));
+		ipoib_send(dev, skb, path->ah, IPOIB_QPN(phdr->hwaddr));
 	} else if ((path->query || !path_rec_start(dev, path)) &&
 		   skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
 		/* put pseudoheader back on for next time */
@@ -661,8 +655,7 @@ static int ipoib_start_xmit(struct sk_bu
 			goto out;
 		}
 
-		ipoib_send(dev, skb, neigh->ah,
-			   be32_to_cpup((__be32 *) skb->dst->neighbour->ha));
+		ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha));
 		goto out;
 	}
 
@@ -694,7 +687,7 @@ static int ipoib_start_xmit(struct sk_bu
 				   IPOIB_GID_FMT "\n",
 				   skb->dst ? "neigh" : "dst",
 				   be16_to_cpup((__be16 *) skb->data),
-				   be32_to_cpup((__be32 *) phdr->hwaddr),
+				   IPOIB_QPN(phdr->hwaddr),
 				   IPOIB_GID_RAW_ARG(phdr->hwaddr + 4));
 			dev_kfree_skb_any(skb);
 			++priv->stats.tx_dropped;
@@ -777,7 +770,7 @@ static void ipoib_neigh_destructor(struc
 	ipoib_dbg(priv,
 		  "neigh_destructor for %06x " IPOIB_GID_FMT "\n",
-		  be32_to_cpup((__be32 *) n->ha),
+		  IPOIB_QPN(n->ha),
 		  IPOIB_GID_RAW_ARG(n->ha + 4));
 
 	spin_lock_irqsave(&priv->lock, flags);

--
MST
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
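For readers following along outside the kernel, the masking idea in this patch can be sketched in plain C. This is only an illustration under stated assumptions: load_be32() and ipoib_qpn_from_hwaddr() are hypothetical names, not kernel APIs, and the sketch assumes the first four octets of the 20-byte IPoIB hardware address hold one reserved/flags octet followed by the 24-bit QPN in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load a big-endian 32-bit value byte by byte, so the result does not
 * depend on host byte order or on the alignment of the buffer. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical userspace counterpart of IPOIB_QPN(): keep only the low
 * 24 bits, so whatever the sender put in the high octet is ignored. */
static uint32_t ipoib_qpn_from_hwaddr(const uint8_t *ha)
{
	assert(ha != NULL);
	return load_be32(ha) & 0xffffff;
}
```

With a hardware address starting 0x80 0x12 0x34 0x56, the helper yields QPN 0x123456 regardless of the high octet, which is the behaviour the commit message asks for on receive.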
[openib-general] I need your help.
Hello,

I've been developing my MPICH projects on an InfiniBand cluster for two months.

$ ibstat
CA type: MT25204
Number of ports: 1
Firmware version: 1.1.0
Hardware version: a0
Node GUID: 0xe865620060529997
System image GUID: 0xe86562006052999a
Port 1:
	State: Active
	Physical state: LinkUp
	Rate: 10
	Base lid: 82
	LMC: 0
	SM lid: 82
	Capability mask: 0x02510a6a
	Port GUID: 0xe865620060529998

I've downloaded the Mellanox IB-Verbs API (VAPI), but I work on the OpenIB version. Would you mind telling me where I can download the API manual for OpenIB?

Thank you in advance.

Wang
Nov. 15
Re: [openib-general] [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael S. Tsirkin
Subject: Re: [openib-general] [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)

> Quoting Ramachandra K [EMAIL PROTECTED]:
> > Subject: [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)
> >
> > This patch set adds support for the QLogic Virtual Ethernet I/O controller (VEx), which presents a true Ethernet NIC to the host. This driver provides a standard Ethernet NIC interface to the system and treats IB as an I/O bus to allow a host CPU to use the VEx card as its NIC.
>
> Is the VEx wire protocol documented somewhere? For example, what is a viport? What is a netpath? It's somewhat hard to understand the code without the protocol spec it is trying to implement.
>
> --
> MST

The VNIC software is a device driver for a remote device on the IB fabric, the VEx. We have followed the convention and standard set by previous submitters of device driver code to either OpenFabrics or the Linux kernel, in that the code is the documentation. The device drivers for mthca, ehca, ipath, or for that matter, Ethernet NICs like the Intel Pro 1000, do not document the protocol they implement when managing their device over PCI or PCI-X. The VNIC manages the VEx over the IB bus and as such it is a device driver in the same class as those mentioned above.

Madhu
[openib-general] [PATCH] IB/ipoib: fix skb leak
ipoib_neigh_free is sometimes called while the neighbour is still alive, so it might have queued skbs. Fix the skb leak in this case.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

---

Hi, Roland!
I saw this potential issue when I went over the code. What do you think?

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index e5b793d..c0fb316 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -260,7 +279,7 @@ static inline struct ipoib_neigh **to_ip
 }
 
 struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
-void ipoib_neigh_free(struct ipoib_neigh *neigh);
+void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 1eaf00e..ac7e421 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -262,7 +264,7 @@ static void path_free(struct net_device
 		if (neigh->ah)
 			ipoib_put_ah(neigh->ah);
 
-		ipoib_neigh_free(neigh);
+		ipoib_neigh_free(dev, neigh);
 	}
 
 	spin_unlock_irqrestore(&priv->lock, flags);
@@ -517,9 +516,10 @@ static void neigh_add_path(struct sk_buf
 	} else {
 		neigh->ah  = NULL;
-		__skb_queue_tail(&neigh->queue, skb);
 
 		if (!path->query && path_rec_start(dev, path))
 			goto err_list;
+
+		__skb_queue_tail(&neigh->queue, skb);
 	}
 
 	spin_unlock(&priv->lock);
@@ -537,7 +533,7 @@ err_list:
 	list_del(&neigh->list);
 
 err_path:
-	ipoib_neigh_free(neigh);
+	ipoib_neigh_free(dev, neigh);
 	++priv->stats.tx_dropped;
 	dev_kfree_skb_any(skb);
 
@@ -655,9 +650,9 @@ static int ipoib_start_xmit(struct sk_bu
 			 */
 			ipoib_put_ah(neigh->ah);
 			list_del(&neigh->list);
-			ipoib_neigh_free(neigh);
+			ipoib_neigh_free(dev, neigh);
 			spin_unlock(&priv->lock);
 			ipoib_path_lookup(skb, dev);
 			goto out;
 		}
@@ -787,7 +781,7 @@ static void ipoib_neigh_destructor(struc
 		if (neigh->ah)
 			ah = neigh->ah;
 		list_del(&neigh->list);
-		ipoib_neigh_free(neigh);
+		ipoib_neigh_free(dev, neigh);
 	}
 
 	spin_unlock_irqrestore(&priv->lock, flags);
@@ -810,9 +804,15 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
 	return neigh;
 }
 
-void ipoib_neigh_free(struct ipoib_neigh *neigh)
+void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh)
 {
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+	struct sk_buff *skb;
 	*to_ipoib_neigh(neigh->neighbour) = NULL;
+	while ((skb = __skb_dequeue(&neigh->queue))) {
+		++priv->stats.tx_dropped;
+		dev_kfree_skb_any(skb);
+	}
 	kfree(neigh);
 }
 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 3faa182..d282d65 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -114,7 +114,7 @@ static void ipoib_mcast_free(struct ipoi
 		 */
 		if (neigh->ah)
 			ipoib_put_ah(neigh->ah);
-		ipoib_neigh_free(neigh);
+		ipoib_neigh_free(dev, neigh);
 	}
 
 	spin_unlock_irqrestore(&priv->lock, flags);

--
MST
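The pattern in this patch (drain whatever is still queued on an object before freeing it, and account for each dropped item) can be sketched in userspace C. Everything below is a hypothetical stand-in: struct pkt, struct neigh_entry and struct stats are not kernel types, and plain malloc/free replace the skb helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pkt { struct pkt *next; };            /* stand-in for an skb */
struct stats { unsigned long tx_dropped; };  /* stand-in for device stats */

struct neigh_entry {                         /* stand-in for ipoib_neigh */
	struct pkt *head;
	struct pkt *tail;
};

/* Append a packet to the tail of the neighbour's pending queue. */
static void neigh_enqueue(struct neigh_entry *n, struct pkt *p)
{
	p->next = NULL;
	if (n->tail)
		n->tail->next = p;
	else
		n->head = p;
	n->tail = p;
}

/* Remove and return the packet at the head, or NULL if the queue is empty. */
static struct pkt *neigh_dequeue(struct neigh_entry *n)
{
	struct pkt *p = n->head;

	if (p) {
		n->head = p->next;
		if (!n->head)
			n->tail = NULL;
	}
	return p;
}

/* Free the neighbour, first dropping (and counting) anything still queued,
 * so queued packets are not leaked when the entry goes away early. */
static void neigh_entry_free(struct stats *st, struct neigh_entry *n)
{
	struct pkt *p;

	assert(st != NULL && n != NULL);
	while ((p = neigh_dequeue(n)) != NULL) {
		++st->tx_dropped;
		free(p);
	}
	free(n);
}
```

Passing the stats owner into the free routine mirrors why the patch adds a dev argument to ipoib_neigh_free(): the drop counter lives with the device, not with the neighbour.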
[openib-general] Send_bw in UD
Hi,

The InfiniBand specification says that, in the case of RC, the completion notification occurs when the data has actually reached the destination buffer, whereas for UD it is given when the data is placed on the InfiniBand line.

I was going through the code of send_bw.c ( https://openfabrics.org/svn/gen2/tags/openib-1.0-rc1/src/userspace/perftest/send_bw.c ). This tells the time taken for the data to reach the destination. In the case of UD the same code is used. Should it not have code which waits for the acknowledgement from the destination? Alternatively, is the bandwidth computation wrong in this case?

Comments will be welcome,
-Chev
[openib-general] Mellanox ibtp requires vl.h which is not found
Hello,

Many of the Mellanox tests require a header file named vl.h.
For example:
https://openib.org/svn/trunk/contrib/mellanox/ibtp/gen2/userspace/useraccess/qp_test/main.c

Where can I find it? It's not anywhere in /usr/local/ofed nor /usr/include ...

Thanks
___
Yosef Etigin, ib-host-stack | +972-9-971-7630 (o) | +972-54-218 8036(m)
Voltaire - The Grid Backbone
www.voltaire.com
Re: [openib-general] Mellanox ibtp requires vl.h which is not found
Hi Yosef,

You can find the vl library under:
https://openib.org/svn/trunk/contrib/mellanox/ibtp/common/tools/vl

Regards,
Vladimir

Yosef Etigin wrote:
> Hello,
>
> Many of the Mellanox tests require a header file named "vl.h".
> For example:
> https://openib.org/svn/trunk/contrib/mellanox/ibtp/gen2/userspace/useraccess/qp_test/main.c
>
> Where can I find it? It's not anywhere in /usr/local/ofed nor /usr/include ...
>
> Thanks
> ___
> Yosef Etigin, ib-host-stack | +972-9-971-7630 (o) | +972-54-218 8036(m)
> Voltaire - The Grid Backbone
> www.voltaire.com
Re: [openib-general] Mellanox ibtp requires vl.h which is not found
Hi Yosef.

> Hello,
>
> Many of the Mellanox tests require a header file named vl.h.
> For example:
> https://openib.org/svn/trunk/contrib/mellanox/ibtp/gen2/userspace/useraccess/qp_test/main.c
>
> Where can I find it? It's not anywhere in /usr/local/ofed nor /usr/include ...
>
> Thanks

The VL library can be found at the following URL:
https://openib.org/svn/trunk/contrib/mellanox/ibtp/common/tools/vl/

Thanks
Dotan
Re: [openib-general] Send_bw in UD
Hi.

> Hi,
> Infiniband specification says that the completion notification in case of RC occurs when the data has actually reached the destination buffer. Whereas for UD it is given when the data is placed on the infiniband line.

You are absolutely right.

> I was going through the code send_bw.c ( https://openfabrics.org/svn/gen2/tags/openib-1.0-rc1/src/userspace/perftest/send_bw.c ). This tells the time taken for the data to reach the destination. In the case of UD the same code is used. Should it not have the code which waits for the acknowledgement from the destination? Alternately, is the bandwidth computation wrong in this case?
> Comments will be welcome,
> -Chev

This test is a pingpong test, so if data is being received from the remote side (even for UD QPs), that means the remote side got the data. This test assumes that no packet was dropped.

Dotan
Re: [openib-general] Send_bw in UD
Quoting r. [EMAIL PROTECTED] [EMAIL PROTECTED]:
> > I was going through the code send_bw.c ( https://openfabrics.org/svn/gen2/tags/openib-1.0-rc1/src/userspace/perftest/send_bw.c ).
>
> This test is a pingpong test

I think that there's no ping pong in send_bw - it measures one way streaming bw. We have the following comment at line 815:

	/* client is posting and not receiving. */

--
MST
Re: [openib-general] what should happen in a completion event channel is being destroyed when there are several CQs associated to it?
Hi Roland.

> > What should happen if a completion event channel is destroyed while several CQs are still associated with it? Should this operation fail (return EBUSY)?
>
> I think that would be the most consistent thing, since we return EBUSY for example if a CQ is destroyed with QPs still attached.
>
> > When I tried to do it and later on tried to wait for a completion on this event channel, I got a seg fault...
>
> Does the destroy succeed? Anyway I'll look at this code to see if it seems OK.
>
> - R.

I'm writing the man pages for this verb, so which behaviour should I document: the current behaviour or the future behaviour? For now, I'm documenting the current behaviour.

Thanks
Dotan
Re: [openib-general] Send_bw in UD
Quoting r. Chevchenkovic Chevchenkovic [EMAIL PROTECTED]:
> Subject: Send_bw in UD
>
> Hi,
> Infiniband specification says that the completion notification in case of RC occurs when the data has actually reached the destination buffer. Whereas for UD it is given when the data is placed on the infiniband line.
> I was going through the code send_bw.c ( https://openfabrics.org/svn/gen2/tags/openib-1.0-rc1/src/userspace/perftest/send_bw.c ). This tells the time taken for the data to reach the destination.

No, this test measures streaming bandwidth. Compare this to a UDP bandwidth test.

> In the case of UD the same code is used. Should it not have the code which waits for the acknowledgement from the destination?

Once, at the end of the test? I believe the difference will be negligible, and the test will get more confusing.

> Alternately, is the bandwidth computation wrong in this case?
> Comments will be welcome,

The computation is performed correctly. The test currently will simply block forever on the server side if there is some packet loss. If the test runs to completion, no packets were lost, and this means that streaming bandwidth was measured correctly.

--
MST
[openib-general] First draft of the man pages
Hi.

Attached is the first draft of the man pages for libibverbs. I hope that in the next few weeks the man pages will be committed to the openib svn (I guess with several changes).

Feedback is always welcome.

Dotan

Attachment: man_pages.tar.gz (GNU Zip compressed data)
[openib-general] [PATCH 1/2] libibumad/libibmad/diags: fix printf style uses
This fixes various uses of printf() style functions.

Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
---
 diags/src/ibnetdiscover.c |    7 ++++---
 diags/src/ibtracert.c     |   15 ++++++++-------
 libibmad/src/rpc.c        |    2 +-
 libibumad/src/umad.c      |   16 ++++++++--------
 4 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/diags/src/ibnetdiscover.c b/diags/src/ibnetdiscover.c
index c6e35e4..612aee0 100644
--- a/diags/src/ibnetdiscover.c
+++ b/diags/src/ibnetdiscover.c
@@ -44,6 +44,7 @@ #include <time.h>
 #include <string.h>
 #include <getopt.h>
 #include <ctype.h>
+#include <inttypes.h>
 
 #define __BUILD_VERSION_TAG__ 1.2
 #include "common.h"
@@ -175,8 +176,8 @@ get_node(Node *node, Port *port, ib_port
 		mad_decode_field(si, IB_SW_ENHANCED_PORT0_F, &node->smaenhsp0);
 	}
 
-	DEBUG("portid %s: got switch node %Lx '%s'",
-	      portid2str(portid), node->nodeguid, nd);
+	DEBUG("portid %s: got switch node %" PRIx64 " '%s'",
+	      portid2str(portid), node->nodeguid, node->nodedesc);
 	return 1;
 }
 
@@ -242,7 +243,7 @@ insert_node(Node *new)
 	for (node = nodestbl[hash]; node; node = node->htnext)
 		if (node->nodeguid == new->nodeguid) {
-			DEBUG("node %Lx already exists", new->nodeguid);
+			DEBUG("node %" PRIx64 " already exists", new->nodeguid);
 			return node;
 		}
 
diff --git a/diags/src/ibtracert.c b/diags/src/ibtracert.c
index 64dbe00..56c312d 100644
--- a/diags/src/ibtracert.c
+++ b/diags/src/ibtracert.c
@@ -43,6 +43,7 @@ #include <stdarg.h>
 #include <ctype.h>
 #include <getopt.h>
 #include <netinet/in.h>
+#include <inttypes.h>
 
 #define __BUILD_VERSION_TAG__ 1.2
 #include "common.h"
@@ -166,7 +167,7 @@ get_node(Node *node, Port *port, ib_port
 	mad_decode_field(pi, IB_PORT_LMC_F, &port->lmc);
 	mad_decode_field(pi, IB_PORT_STATE_F, &port->state);
 
-	DEBUG("portid %s: got node %Lx '%s'", portid2str(portid), node->nodeguid, nd);
+	DEBUG("portid %s: got node %" PRIx64 " '%s'", portid2str(portid), node->nodeguid, node->nodedesc);
 	return 0;
 }
 
@@ -332,7 +333,7 @@ find_route(ib_portid_t *from, ib_portid_
 		DEBUG("ca or router node");
 		if (!sameport(port, fromport)) {
-			IBWARN("can't continue: reached CA or router port %Lx, lid %d",
+			IBWARN("can't continue: reached CA or router port %" PRIx64 ", lid %d",
 			       port->portguid, port->lid);
 			return -1;
 		}
@@ -378,7 +379,7 @@ badoutport:
 	return -1;
 badtbl:
 	IBWARN("Bad forwarding table entry found at: node \"%s\" lid entry %d is %d (top %d)",
-	       node->nodedesc, to, outport, sw.linearFDBtop);
+	       node->nodedesc, to->lid, outport, sw.linearFDBtop);
 	return -1;
 badpath:
 	IBWARN("Direct path too long!");
@@ -402,7 +403,7 @@ insert_node(Node *new)
 	for (node = nodestbl[hash]; node; node = node->htnext)
 		if (node->nodeguid == new->nodeguid) {
-			DEBUG("node %Lx already exists", new->nodeguid);
+			DEBUG("node %" PRIx64 " already exists", new->nodeguid);
 			return -1;
 		}
 
@@ -501,7 +502,7 @@ switch_mclookup(Node *node, ib_portid_t
 				*map = 1;
 			else
 				continue;
-			VERBOSE("Switch guid 0x%Lx: mlid 0x%x is forwarded to port %d",
+			VERBOSE("Switch guid 0x%" PRIx64 ": mlid 0x%x is forwarded to port %d",
 				node->nodeguid, mlid + 0xc000, i + set * 16);
 		}
 	}
@@ -565,7 +566,7 @@ find_mcpath(ib_portid_t *from, int mlid)
 		leafport = path->drpath.p[path->drpath.cnt];
 		map[port->portnum] = 1;
 		node->upport = 0;	/* starting here */
-		DEBUG("Starting from CA 0x%Lx lid %d port %d (leafport %d)",
+		DEBUG("Starting from CA 0x%" PRIx64 " lid %d port %d (leafport %d)",
 		      node->nodeguid, port->lid, port->portnum, leafport);
 	} else {	/* switch */
@@ -574,7 +575,7 @@ find_mcpath(ib_portid_t *from, int mlid)
 			node->upport = leafport;
 
 		if (switch_mclookup(node, path, mlid, map) < 0) {
-			IBWARN("skipping bad Switch 0x%Lx lid %d",
+			IBWARN("skipping bad Switch 0x%" PRIx64 " lid %" PRIx64 "",
 			       node->nodeguid, port->portguid);
 			continue;
 		}
diff --git a/libibmad/src/rpc.c
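As a standalone illustration of the %Lx to PRIx64 conversion made throughout this patch, here is a small sketch. format_guid() is a hypothetical helper and not part of the diags code; the only facts it relies on are the C99 <inttypes.h> macros and snprintf().

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit GUID as 0x followed by 16 hex digits.
 * PRIx64 expands to the right length modifier for uint64_t on the current
 * platform, which %Lx does not guarantee. */
static int format_guid(char *buf, size_t len, uint64_t guid)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%016" PRIx64, guid);
}
```

Note that PRIx64 is a string literal that is concatenated with its neighbours, so the format string has to be closed before it and reopened after it, as in the DEBUG() and IBWARN() calls of the patch.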
[openib-general] [PATCH 2/2] libibcommon: enable printf() style format strict checking
This enables strict format/args checking for printf() style functions.

Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
---
 libibcommon/include/infiniband/common.h |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
index 83c0679..66afab0 100644
--- a/libibcommon/include/infiniband/common.h
+++ b/libibcommon/include/infiniband/common.h
@@ -114,11 +114,16 @@ #endif
 #define ENUM_STR_DEF(enumname, last, val) (((unsigned)(val) < last) ? enumname ## _str[val] : "???")
 #define ENUM_STR_ARRAY(name) char * name ## _str[]
 
+#ifdef __GNUC__
+#define STRICT_FORMAT __attribute__((format(printf, 2, 3)))
+#else
+#define STRICT_FORMAT
+#endif
 
 /* util.c: debugging and tracing */
-void ibwarn(const char * const fn, char *msg, ...);
-void ibpanic(const char * const fn, char *msg, ...);
-void logmsg(const char *const fn, char *msg, ...);
+void ibwarn(const char * const fn, char *msg, ...) STRICT_FORMAT;
+void ibpanic(const char * const fn, char *msg, ...) STRICT_FORMAT;
+void logmsg(const char *const fn, char *msg, ...) STRICT_FORMAT;
 
 void xdump(FILE *file, char *msg, void *p, int size);
 
--
1.4.3.2.g4bf7
Re: [openib-general] [PATCH 2/2] libibcommon: enable printf() style format strict checking
> diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
> index 83c0679..66afab0 100644
> --- a/libibcommon/include/infiniband/common.h
> +++ b/libibcommon/include/infiniband/common.h
> @@ -114,11 +114,16 @@ #endif
>  #define ENUM_STR_DEF(enumname, last, val) (((unsigned)(val) < last) ? enumname ## _str[val] : "???")
>  #define ENUM_STR_ARRAY(name) char * name ## _str[]
>
> +#ifdef __GNUC__
> +#define STRICT_FORMAT __attribute__((format(printf, 2, 3)))
> +#else
> +#define STRICT_FORMAT
> +#endif

You are polluting the global namespace - macros must be prefixed with the library name.

But anyway - why is this necessary? Does anyone actually try compiling libibcommon with something other than gcc? Why? And AFAIK e.g. the Intel compiler implements this __attribute__.

>  /* util.c: debugging and tracing */
> -void ibwarn(const char * const fn, char *msg, ...);
> -void ibpanic(const char * const fn, char *msg, ...);
> -void logmsg(const char *const fn, char *msg, ...);
> +void ibwarn(const char * const fn, char *msg, ...) STRICT_FORMAT;
> +void ibpanic(const char * const fn, char *msg, ...) STRICT_FORMAT;
> +void logmsg(const char *const fn, char *msg, ...) STRICT_FORMAT;
>
>  void xdump(FILE *file, char *msg, void *p, int size);

--
MST
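One way to address the namespace comment while keeping the format checking might look like the sketch below. IBCOMMON_STRICT_FORMAT and demo_warn() are hypothetical names chosen for illustration, not names from libibcommon; the macro takes the argument positions as parameters so it can be reused on functions with a different signature than ibwarn().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Library-prefixed wrapper around the GCC format attribute. Compilers that
 * do not define __GNUC__ simply see an empty macro. */
#ifdef __GNUC__
#define IBCOMMON_STRICT_FORMAT(fmt_idx, arg_idx) \
	__attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define IBCOMMON_STRICT_FORMAT(fmt_idx, arg_idx)
#endif

/* Hypothetical logger writing into a caller-supplied buffer: the format
 * string is parameter 4 and the variadic arguments start at parameter 5. */
static int demo_warn(char *out, size_t len, const char *fn,
		     const char *msg, ...) IBCOMMON_STRICT_FORMAT(4, 5);

static int demo_warn(char *out, size_t len, const char *fn,
		     const char *msg, ...)
{
	char body[256];
	va_list ap;

	va_start(ap, msg);
	vsnprintf(body, sizeof(body), msg, ap);
	va_end(ap);
	return snprintf(out, len, "ibwarn: [%s] %s", fn, body);
}
```

With the attribute in place, a call such as demo_warn(out, sizeof(out), "main", "lid %d", "82") draws a -Wformat warning from gcc, which is the class of mistake patch 1/2 of this series cleans up.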
Re: [openib-general] [PATCH/RFC 1/2] IB: Return maybe_missed_event hint from ib_req_notify_cq()
[EMAIL PROTECTED] wrote on 11/14/2006 03:18:23 PM:

> Shirley> The rotting packet situation consistently happens for
> Shirley> ehca driver. The napi could poll forever with your
> Shirley> original patch. That's the reason I defer the rotting
> Shirley> packet process in next napi poll.
>
> Hmm, I don't see it. In my latest patch, the poll routine does:
>
> repoll:
> 	done  = 0;
> 	empty = 0;
>
> 	while (max) {
> 		t = min(IPOIB_NUM_WC, max);
> 		n = ib_poll_cq(priv->cq, t, priv->ibwc);
>
> 		for (i = 0; i < n; ++i) {
> 			if (priv->ibwc[i].wr_id & IPOIB_OP_RECV) {
> 				++done;
> 				--max;
> 				ipoib_ib_handle_rx_wc(dev, priv->ibwc + i);
> 			} else
> 				ipoib_ib_handle_tx_wc(dev, priv->ibwc + i);
> 		}
>
> 		if (n != t) {
> 			empty = 1;
> 			break;
> 		}
> 	}
>
> 	dev->quota -= done;
> 	*budget    -= done;
>
> 	if (empty) {
> 		netif_rx_complete(dev);
> 		if (unlikely(ib_req_notify_cq(priv->cq,
> 					      IB_CQ_NEXT_COMP |
> 					      IB_CQ_REPORT_MISSED_EVENTS)) &&
> 		    netif_rx_reschedule(dev, 0))
> 			goto repoll;
>
> 		return 0;
> 	}
>
> 	return 1;
>
> so every receive completion will count against the limit set by the variable max. The only way I could see the driver staying in the poll routine for a long time would be if it was only processing send completions, but even that doesn't actually seem bad: the driver is making progress handling completions.

Is it possible that when one gets into the rotting packet case, the quota is at or close to 0 (on ehca)? If it is 0 and the netif_rx_reschedule() case wins (over netif_rx_schedule()), then it keeps spinning, unable to process any packets, since the undo parameter for netif_rx_reschedule() is 0. If netif_rx_reschedule() keeps winning for a few iterations, then the receive queues fill up and start dropping packets, causing a loss in performance.

If this is indeed the case, then one option to try may be to change the undo parameter of netif_rx_reschedule() to either IB_WC or even dev->weight.

> Shirley> It does help the performance from 1XXMb/s to 7XXMb/s, but
> Shirley> not as expected 3XXXMb/s.
>
> Is that 3xxx Mb/sec the performance you see without the NAPI patch?
>
> Shirley> With the defer rotting packet process patch, I can see
> Shirley> packets out of order problem in TCP layer. Is it
> Shirley> possible there is a race somewhere causing two napi polls
> Shirley> in the same time? mthca seems to use irq auto affinity,
> Shirley> but ehca uses round-robin interrupt.
>
> I don't see how two NAPI polls could run at once, and I would expect worse effects from them stepping on each other than just out-of-order packets. However, the fact that ehca does round-robin interrupt handling might lead to out-of-order packets just because different CPUs are all feeding packets into the network stack.
>
> - R.
Re: [openib-general] OpenSM log growing too big
Not sure what question you are asking exactly. Is it what do those messages mean or the file getting large or both ? What options are you using on OpenSM startup ? Also, any chance you can move forward on a more recent and better OpenSM ? -- Hal From: [EMAIL PROTECTED] on behalf of Venkatesh Babu Sent: Wed 11/15/2006 10:22 PM To: openib-general@openib.org Cc: Venkatesh Babu Subject: [openib-general] OpenSM log growing too big I have OFED 1.0 stack and running OpenSM on a server connected to a IB subnet with couple of nodes. Usually the log file size is small. But ocassionally it is growing too big and filling up the whole hard disk. [EMAIL PROTECTED] ~]# ls -l /var/log/opensm* -rw-r--r-- 1 root root 33879121502 Nov 15 14:54 /var/log/opensm.log Most of the opensm.log file is filled with following messages. Out of 240,168,770 lines of log file 239,782,972 lines are from this __osm_trap_rcv_process_request. Nov 14 13:59:35 273746 [42803960] - __osm_trap_rcv_process_request: ERR 3804: Received trap 127 times consecutively Nov 14 13:59:35 273908 [41401960] - __osm_trap_rcv_process_request: Received Generic Notice type:0x01 num:128 Producer:2 from LID:0x0005 TID:0x09733372Nov 14 13:59:35 273966 [41401960] - __osm_trap_rcv_process_request: ERR 3804: Received trap 128 times consecutively Nov 14 13:59:35 274176 [41E02960] - __osm_trap_rcv_process_request: Received Generic Notice type:0x01 num:128 Producer:2 from LID:0x0005 TID:0x09733373Nov 14 13:59:35 274234 [41E02960] - __osm_trap_rcv_process_request: ERR 3804: Received trap 129 times consecutively Nov 14 13:59:35 274380 [43204960] - __osm_trap_rcv_process_request: Received Generic Notice type:0x01 num:128 Producer:2 from LID:0x0005 TID:0x09733374Nov 14 13:59:35 274436 [43204960] - __osm_trap_rcv_process_request: ERR 3804: Received trap 130 times consecutively Nov 14 13:59:35 274662 [42803960] - __osm_trap_rcv_process_request: Received Generic Notice type:0x01 num:128 Producer:2 from LID:0x0005 TID:0x09733375Nov 14 
13:59:35 274720 [42803960] - __osm_trap_rcv_process_request: ERR 3804: Received trap 131 times consecutively Nov 14 13:59:35 274970 [41401960] - __osm_trap_rcv_process_request: Received Generic Notice type:0x01 num:128 Producer:2 from LID:0x0005 TID:0x09733376Nov 14 13:59:35 275026 [41401960] - __osm_trap_rcv_process_request: ERR 3804: Received trap 132 times consecutively ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] IB/ucm: fix deadlock in cleanup
Michael S. Tsirkin wrote:
> ib_ucm_cleanup_events has file_mutex while calling ib_destroy_cm_id. It seems this can deadlock since ib_destroy_cm_id flushes event handlers, and ib_ucm_event_handler needs file_mutex, too.
>
> Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]
>
> ---
>
> I'll be testing the following the next night - but it seems a right thing to do regardless of whether it fixes the issues I reported earlier.
> Sean, does this make sense to you? If yes, please ack for 2.6.19.

Yes - this looks right to me.

Acked-by: Sean Hefty [EMAIL PROTECTED]

> diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
> index ad4f4d5..0128288 100644
> --- a/drivers/infiniband/core/ucm.c
> +++ b/drivers/infiniband/core/ucm.c
> @@ -161,12 +161,14 @@ static void ib_ucm_cleanup_events(struct
>  				    struct ib_ucm_event, ctx_list);
>  		list_del(&uevent->file_list);
>  		list_del(&uevent->ctx_list);
> +		mutex_unlock(&ctx->file->file_mutex);
>
>  		/* clear incoming connections. */
>  		if (ib_ucm_new_cm_id(uevent->resp.event))
>  			ib_destroy_cm_id(uevent->cm_id);
>
>  		kfree(uevent);
> +		mutex_lock(&ctx->file->file_mutex);
>  	}
>  	mutex_unlock(&ctx->file->file_mutex);
>  }
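The locking pattern in this fix (unlink the entry under the mutex, drop the mutex around the call that may need it, then retake it before looking at the list again) can be sketched in userspace with pthreads. All names below are hypothetical stand-ins; destroy_and_flush() merely simulates a destroy call whose flushed handler takes the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct event { struct event *next; };

struct ctx {
	pthread_mutex_t lock;    /* stand-in for file_mutex */
	struct event *events;    /* pending events, protected by lock */
	int handler_runs;        /* how often the simulated handler ran */
};

/* Stand-in for a destroy call that flushes a handler which itself takes
 * ctx->lock. Calling it with the lock held would self-deadlock on a
 * default (non-recursive) mutex. */
static void destroy_and_flush(struct ctx *c)
{
	pthread_mutex_lock(&c->lock);
	c->handler_runs++;
	pthread_mutex_unlock(&c->lock);
}

static void cleanup_events(struct ctx *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->events != NULL) {
		struct event *ev = c->events;

		c->events = ev->next;            /* unlink while holding the lock */
		pthread_mutex_unlock(&c->lock);  /* drop it around the flush */

		destroy_and_flush(c);
		free(ev);

		pthread_mutex_lock(&c->lock);    /* retake before re-reading the list */
	}
	pthread_mutex_unlock(&c->lock);
}
```

Because the entry is already unlinked when the lock is dropped, no other thread can reach it through the list, and re-reading the list head after retaking the lock keeps the loop correct even if entries were added or removed in the meantime.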
Re: [openib-general] [PATCH/RFC 1/2] IB: Return maybe_missed_event hint from ib_req_notify_cq()
Quoting r. Roland Dreier [EMAIL PROTECTED]:
> I would really like to understand why ehca does worse with NAPI. In my tests both mthca and ipath exhibit various degrees of improvement depending on the test -- but I've never seen performance get worse. This is the main thing holding back merging NAPI.

Documentation/networking/NAPI_HOWTO.txt says:

	APPENDIX 3: Scheduling issues

	As seen NAPI moves processing to softirq level. Linux uses the ksoftirqd as the general solution to schedule softirq's to run before next interrupt and by putting them under scheduler control. Also this prevents consecutive softirq's from monopolize the CPU. This also have the effect that the priority of ksoftirq needs to be considered when running very CPU-intensive applications and networking to get the proper balance of softirq/user balance. Increasing ksoftirq priority to 0 (eventually more) is reported cure problems with low network performance at high CPU load.

So I wonder:
1. Was this tried? It's clear that we have high CPU load.
2. Could this be the reason that e.g. e1000 disables NAPI by default?

The issue seems sufficiently tricky that we may yet find ourselves debugging NAPI performance problems in the field. Maybe we still need a module option ...

--
MST
[openib-general] Question about the query QP mask
Hi.

In the file ib_verbs, the description of the verb ib_query_qp says: "The qp_attr_mask may be used to limit the query to gathering only the selected attributes."

I checked the low-level drivers of all of the HCAs, and only eHCA actually behaves like this (and sets ONLY the masked attributes).

What should be the expected behavior? Should this description be changed, or should the low-level drivers of mthca and ipath be changed?

Thanks
Dotan
[openib-general] What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
Here is my next and hopefully last problem. As described earlier, I'm connecting gen2 x86 clients to a gen1 PowerPC server. After having sorted out the trouble with the CM parameters, I'm now having trouble with RDMA read from the client on the server.

What works is:
- gen2 x86 client doing a VAPI_SEND to gen1 PowerPC server (this wasn't working last time)
- RDMA write from gen1 PowerPC server to gen2 x86 client

What is not working is:
- RDMA read from gen2 x86 client on gen1 PowerPC server. I'm getting a vendor_error 0x81 VAPI_RETRY_EXC_ERR in the send completion queue.

The RDMA start address, length and key have been exchanged and look identical on both sides.

Doing connections and transfers between x86 machines only (gen1 server, gen2 client) works in all directions (send and receive as well as RDMA read and write). So a gen2 client can do an RDMA read from a gen1 server!

Having a gen1 PowerPC server and a gen1 x86 client works as well. So a gen1 PowerPC server can be RDMA read from an x86 client!

I'm again a little puzzled what can the gen2 server do wrong in a RDMA read on a PowerPC server when it can do the same operation a x86 server?

Any ideas, thoughts, help are more than welcome.

Thanks
Thomas

Here is the code I'm using to do RDMA. I always have only a single segment to be transmitted!

rdma(ibv_sge *sgList, int sgListlen, int size, bool write)
{
    struct ibv_send_wr wr;
    struct ibv_send_wr *bad_wr;
    int res;
    int localErrno = 0;
    uint64_t remainingBytes = ntohl(_remoteBufferInfo->totalSize);

    sgList[0].length = remainingBytes;

    memset(&wr, 0, sizeof(wr));
    wr.next       = NULL;
    wr.wr_id      = 1;
    wr.opcode     = write ? IBV_WR_RDMA_WRITE : IBV_WR_RDMA_READ;
    wr.send_flags = IBV_SEND_SIGNALED;
    wr.sg_list    = sgList;
    wr.num_sge    = 1;
    wr.wr.rdma.remote_addr = ntohll(_remoteBufferInfo->sgList[0].addr);
    wr.wr.rdma.rkey        = ntohl(_remoteBufferInfo->sgList[0].lkey);

    cleanCq(_sCq);
    res = ibv_post_send(_dataQp, &wr, &bad_wr);
    if (res != 0) {
        DEBUG1("Error in RDMA operation scheduling: %s\n", strerror(res));
        sv2BreakConnection();
        localErrno = ENOTCONN;
        return 0;
    } else {
        if (waitOnCq(_sCq)) {
            localErrno = -1;
        }
    }
}

Thomas Bub
Grass Valley Germany GmbH
Brunnenweg 9
64331 Weiterstadt, Germany
Tel: +49 6150 104 147
Fax: +49 6150 104 656
Email: [EMAIL PROTECTED]
www.GrassValley.com http://www.grassvalley.com
Re: [openib-general] What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
> Subject: What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
>
> Here is my next and hopefully last problem. As described earlier I’m connecting a gen2 x86 clients to a gen1 PowerPC server

Endian-ness issues?

--
MST
Re: [openib-general] [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)
If you think these patches are good enough, could you please create a branch in your git tree based on for-2.6.20 for this code ? Yes, I will create a vex branch for this in my tree. However, moving this further upstream will depend on getting a real review of the code, and some sort of protocol document will probably be required for anyone to wade through this... - R. Thanks, Roland. I understand that the code has to go through several rounds of review before it is moved upstream. Would the branch that you create be in sync with the for-2.6.20 branch ? That way I can keep the code in sync with the latest changes. Also is the branch already created ? I tried to update my copy of the tree but could not see a vex branch. Please note that this driver is a device driver for a remote device and the communication between the driver and the device is like any other device driver, its just that this driver uses IB as its bus where as others use PCI etc. The entire communication between the driver and VEx can be understood from a reading of the code. To make it simpler to understand the code, I am providing a small note about the terminology and code organization: Each virtual NIC has a netpath, which is an abstraction of a connection to the VEx. Each netpath has a viport, the virtual port, which is an abstraction of the control and data IB connections through which control and data messages are exchanged. The control messages which are exchanged can be seen in vnic_control_pkt.h (patch 4). The data messages are nothing but transfer of data itself.(patch 5) The series of functions that are called in viport_statemachine() in vnic_viport.c (patch 3) are a good starting point to understand the control path. In this, establishment of control and data IB connections is done in viport_handle_init_states(). 
After that, the sequence of request and response messages that are exchanged can be seen in viport_handle_control_states(), viport_handle_data_states(), viport_handle_xchgpool_states(), viport_handle_idle_states() and so on. The code flow of the driver itself begins when the VEx information is written to the sysfs file create_primary. [vnic_create_primary() in vnic_sys.c (patch 8)] Regards, Ram ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] RDMA/iwcm: Get rid of extra call to list_empty()
The race is that you've deleted the work queue element that is enqueued on the iwcm_wq. It's as simple as that. To prove it to yourself, apply your patch. Turn on memory debug support in the kernel and recompile your code. Then run rdma_krping clients in four different threads against your server with an I/O count of 1. You'll hit the race and can look at it yourself. I don't know any better way to explain it... Sorry. On 11/13/06 10:44 PM, Krishna Kumar2 [EMAIL PROTECTED] wrote: Hi Tom, No, to understand why go look at the implementation of queue_work. BTW, this I was describing the implementation of queue_work() in my previous mail. So sorry to be dense, but I do not understand why this patch introduces a race. Can you explain the race that you had found ? What I understood of queue_work() is : If cm_work_handler() is already running and processing the last entry at the same time this new entry was added, it is guaranteed to find this new entry in it's current run iteration, and process it. The only issue is with the extra queue_work by iwcm parallely on a different cpu for the same case. So if iwcm had done a redundant queue_work on this queue, which, besides adding the new entry to the workqueue, also does a wakeup of worker_thread (which is still running the previous iteration of run_workqueue - cm_work_handler). I am assuming that the wake up function is default_wake_function(), since I couldn't locate in wait* code where this is initialized. When cm_work_handler finishes removing this new entry, it returns to worker_thread, which will do a schedule() and sleep till it is woken up again (since default_wake_function found that the thread is already running and had done nothing). Are you referring to a race where the queue_work is done between the time cm_work_handler finished running and before it gets back to schedule ? 
I feel that should not matter as the run_workqueue() will find this entry in it's cwq-worklist and continue processing instead of exiting to worker_thread() and schedule(). Still confused about the race :) Thanks, - KK ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH] IB/mthca: HCA profile module parameters
Roland Dreier wrote:
> The patch is line-wrapped and bizarrely corrupted and won't apply, eg:
>
> + mthca_warn(mdev, "num_qp rounded to power of 2 (%d).\n",
> + default_profile.num_qp);
> +}
>
> This is completely unnecessary:
>
> +#define to_up_power_of_2(x) (x = roundup_pow_of_two(x))
>
> ...just open code this. And this seems strange:
>
> +#define is_power_of_2(x) (x>0 && (x & (x - 1)))
>
> so there's no warning if someone passes in a negative value?? and it's backwards too, (x & (x - 1)) is 0 precisely for the powers of 2. Was this patch tested at all?
>
> Anyway, all this
>
> + if (!is_power_of_2(default_profile.num_qp)){
> + to_up_power_of_2(default_profile.num_qp);
> + mthca_warn(mdev, "num_qp rounded to power of 2 (%d).\n",
> + default_profile.num_qp);
> +}
>
> seems very repetitive. Can't it be wrapped up in a function so we just do something like
>
> mthca_check_profile_value(default_profile.num_qp);
> mthca_check_profile_value(default_profile.rdb_per_qp);
> mthca_check_profile_value(default_profile.num_cq);
>
> etc.
>
> - R.

Thanks for the comments.

The lines became wrapped because I used the wrong email client. I'll re-submit with another client, but this will be in a new thread because I still have problems reading mail with it and therefore I can't reply to this thread. Sorry for the bother...

The patch was tested, but unfortunately I sent the wrong one (not the final). The new version is the one I should have sent, plus changes according to the comments here.

thanks
MoniS
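The remark that the test is backwards can be checked in isolation. What follows is only a minimal userspace sketch, not code from the patch: `demo_is_power_of_2` and `demo_roundup_pow_of_two` are hypothetical names, and the second one merely stands in for the kernel's roundup_pow_of_two().

```c
#include <stdbool.h>

/*
 * x & (x - 1) clears the lowest set bit of x, so the result is 0
 * exactly when x has a single bit set. The x > 0 check rejects zero
 * and negative values. Hypothetical helper, not the patch's macro.
 */
static bool demo_is_power_of_2(int x)
{
    return x > 0 && (x & (x - 1)) == 0;
}

/*
 * Userspace stand-in for the kernel's roundup_pow_of_two().
 * Assumes 0 < x <= 2^30 so that the shift cannot overflow an int.
 */
static int demo_roundup_pow_of_two(int x)
{
    int p = 1;

    while (p < x)
        p <<= 1;
    return p;
}
```

Note that rounding up a value that is already a power of two returns it unchanged, so a separate "is it a power of two" test before rounding buys nothing.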
[openib-general] [PATCH v2] IB_mthca HCA profile module parameters
From: Leonid Arsh [EMAIL PROTECTED] Adds module parameters that enable settting some of the HCA profile values Signed-off-by: Leonid Arsh [EMAIL PROTECTED] Signed-off-by: Moni Shoua [EMAIL PROTECTED] --- mthca_main.c | 139 ++- 1 files changed, 128 insertions(+), 11 deletions(-) --- mthca_main.c.orig 2006-11-14 22:07:58.0 -0500 +++ mthca_main.c2006-11-16 11:27:17.683513163 -0500 @@ -80,21 +80,134 @@ module_param(tune_pci, int, 0444); MODULE_PARM_DESC(tune_pci, increase PCI burst from the default set by BIOS if nonzero); +#define MTHCA_DEFAULT_NUM_QP(1 16) +#define MTHCA_DEFAULT_RDB_PER_QP(1 2) +#define MTHCA_DEFAULT_NUM_CQ(1 16) +#define MTHCA_DEFAULT_NUM_MCG (1 13) +#define MTHCA_DEFAULT_NUM_MPT (1 17) +#define MTHCA_DEFAULT_NUM_MTT (1 20) +#define MTHCA_DEFAULT_NUM_UDAV (1 15) +#define MTHCA_DEFAULT_NUM_RESERVED_MTTS (1 18) +#define MTHCA_DEFAULT_NUM_UARC_SIZE (1 18) + +static struct mthca_profile default_profile = { + .num_qp= MTHCA_DEFAULT_NUM_QP, + .rdb_per_qp= MTHCA_DEFAULT_RDB_PER_QP, + .num_cq= MTHCA_DEFAULT_NUM_CQ, + .num_mcg = MTHCA_DEFAULT_NUM_MCG, + .num_mpt = MTHCA_DEFAULT_NUM_MPT, + .num_mtt = MTHCA_DEFAULT_NUM_MTT, + .num_udav = MTHCA_DEFAULT_NUM_UDAV,/* Tavor only */ + .fmr_reserved_mtts = MTHCA_DEFAULT_NUM_RESERVED_MTTS, /* Tavor only */ + .uarc_size = MTHCA_DEFAULT_NUM_UARC_SIZE, /* Arbel only */ +}; + +module_param_named(num_qp, default_profile.num_qp, int, 0444); +MODULE_PARM_DESC(num_qp, maximum number of available QPs per HCA); + +module_param_named(rdb_per_qp, default_profile.rdb_per_qp, int, 0444); +MODULE_PARM_DESC(rdb_per_qp, number of RDB buffers per QP); + +module_param_named(num_cq, default_profile.num_cq, int, 0444); +MODULE_PARM_DESC(num_cq, maximum number of CQs per HCA); + +module_param_named(num_mcg, default_profile.num_mcg, int, 0444); +MODULE_PARM_DESC(num_mcg, maximum number of multicast groups per HCA); + +module_param_named(num_mpt, default_profile.num_mpt, int, 0444); +MODULE_PARM_DESC(num_mpt, + maximum number of memory 
protection pable entries per HCA); + +module_param_named(num_mtt, default_profile.num_mtt, int, 0444); +MODULE_PARM_DESC(num_mtt, +maximum number of memory translation table segments per HCA); +/* Tavor only */ +module_param_named(num_udav, default_profile.num_udav, int, 0444); +MODULE_PARM_DESC(num_udav, maximum number of UD address vectors per HCA); + +/* Tavor only */ +module_param_named(fmr_reserved_mtts, default_profile.fmr_reserved_mtts, int, 0444); +MODULE_PARM_DESC(fmr_reserved_mtts, +number of memory translation table segments reserved for FMR); + static const char mthca_version[] __devinitdata = DRV_NAME : Mellanox InfiniBand HCA driver v DRV_VERSION ( DRV_RELDATE )\n; -static struct mthca_profile default_profile = { - .num_qp= 1 16, - .rdb_per_qp= 4, - .num_cq= 1 16, - .num_mcg = 1 13, - .num_mpt = 1 17, - .num_mtt = 1 20, - .num_udav = 1 15, /* Tavor only */ - .fmr_reserved_mtts = 1 18, /* Tavor only */ - .uarc_size = 1 18, /* Arbel only */ -}; +#define is_power_of_2(x) (!(x (x - 1))) + +static int __devinit mthca_check_profile_value(int* pval,int pval_default){ +/* value must be positive and power of 2 */ +int old_pval = *pval; +if (old_pval = 0) { +*pval = pval_default; +} else if (!is_power_of_2(old_pval)) { +*pval = roundup_pow_of_two(old_pval); +} +return old_pval-*pval; +} + +static int __devinit mthca_validate_profile(struct mthca_dev *mdev, + struct mthca_profile *profile) +{ +if (mthca_check_profile_value(default_profile.num_qp, + MTHCA_DEFAULT_NUM_QP)){ + mthca_warn(mdev,invalid num_qp passed. changed to %d.\n, + default_profile.num_qp); + } + + if (mthca_check_profile_value(default_profile.rdb_per_qp, + MTHCA_DEFAULT_RDB_PER_QP)){ +mthca_warn(mdev,invalid rdb_per_qp passed. changed to %d\n, + default_profile.rdb_per_qp); + } + + if (mthca_check_profile_value(default_profile.num_cq, + MTHCA_DEFAULT_NUM_CQ)){ + mthca_warn(mdev,invalid num_cq passed. 
changed to %d\n, + default_profile.num_cq); + } + + if (mthca_check_profile_value(default_profile.num_mcg, + MTHCA_DEFAULT_NUM_MCG)){ + mthca_warn(mdev,invalid num_mcg passed. changed to
Re: [openib-general] [PATCH] IB/SRP - increase supported CDB size
Definitely makes sense. I queued the following version for 2.6.20, which gets the max CDB size directly from struct srp_cmd. Does this look OK to you?

Thanks,
  Roland

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 4b09147..01776c9 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1716,7 +1716,8 @@ static ssize_t srp_create_target(struct
 	if (!target_host)
 		return -ENOMEM;
 
-	target_host->max_lun = SRP_MAX_LUN;
+	target_host->max_lun     = SRP_MAX_LUN;
+	target_host->max_cmd_len = sizeof ((struct srp_cmd *) (void *) 0L)->cdb;
 
 	target = host_to_target(target_host);
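For readers unfamiliar with the sizeof idiom in that hunk, here is a small standalone sketch. `struct demo_cmd` is a hypothetical stand-in, not the real struct srp_cmd layout; only the presence of a `cdb` array member matters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct srp_cmd; field layout is illustrative. */
struct demo_cmd {
    uint8_t  opcode;
    uint64_t tag;
    uint8_t  cdb[16];
};

/*
 * sizeof does not evaluate its operand, so the null pointer is never
 * dereferenced. The expression is a compile-time constant that tracks
 * the member's declared size, with no separate #define to keep in sync.
 */
#define DEMO_MAX_CDB_LEN (sizeof(((struct demo_cmd *) 0)->cdb))
```

If the `cdb` array in the structure ever grows, the limit follows automatically, which is the point of taking the size from the structure rather than from a constant.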
Re: [openib-general] I need your help.
强 马 wrote:
> Hello! sir. I've been developing my mpich projects on an infiniband cluster for two months.
>
> $ ibstat
> CA type: MT25204
> Number of ports: 1
> Firmware version: 1.1.0
> Hardware version: a0
> Node GUID: 0xe865620060529997
> System image GUID: 0xe86562006052999a
> Port 1:
>   State: Active
>   Physical state: LinkUp
>   Rate: 10
>   Base lid: 82
>   LMC: 0
>   SM lid: 82
>   Capability mask: 0x02510a6a
>   Port GUID: 0xe865620060529998
>
> I've downloaded the Mellanox IB-Verbs API (VAPI), but I work on the openib version. Would you mind telling me where I can download the API manual for OpenIB? Thank you in advance.

There is no API manual for openib; however, Dotan has started man pages. See his email on the list with the man pages attached.

Tziporet
Re: [openib-general] [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)
> Would the branch that you create be in sync with the for-2.6.20 branch ? That way I can keep the code in sync with the latest changes. Also is the branch already created ? I tried to update my copy of the tree but could not see a vex branch.

I have not created the branch yet. I will probably be able to do it today. The way I will do it is to make a branch from the master branch so that your patches should just be on top of Linus's tree.

> Please note that this driver is a device driver for a remote device and the communication between the driver and the device is like any other device driver, its just that this driver uses IB as its bus where as others use PCI etc.

OK...

> Each virtual NIC has a netpath, which is an abstraction of a connection to the VEx. Each netpath has a viport, the virtual port, which is an abstraction of the control and data IB connections through which control and data messages are exchanged.

...but this seems like over-abstraction that makes the code harder to understand.

 - R.
Re: [openib-general] What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
> I'm again a little puzzled what can the gen2 server do wrong in a RDMA read on a PowerPC server when it can do the same operation a x86 server?

Do you have any endian conversion bugs?

> wr.wr.rdma.rkey = ntohl(_remoteBufferInfo->sgList[0].lkey);

The R_Key you use should be the remote side's R_Key, not the L_Key (_local_ key) that it has. This doesn't matter for HCAs where they are the same, but it's better to do things correctly...

 - R.
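Both points (byte order and which key to use) can be illustrated with a small sketch of decoding the buffer description the peer sends. Everything here is an assumption for illustration: the 16-byte wire layout, `struct demo_remote_buf`, and the `demo_` helpers are hypothetical and are not taken from the poster's protocol. Decoding byte by byte gives the same result on x86 and PowerPC, so no host-specific swap can be forgotten.

```c
#include <stdint.h>

/* Hypothetical host-order view of what the remote side advertises. */
struct demo_remote_buf {
    uint64_t addr;    /* would be copied to wr.wr.rdma.remote_addr */
    uint32_t rkey;    /* the peer's R_Key, would go to wr.wr.rdma.rkey */
    uint32_t length;
};

/* Read a big-endian 32-bit value independent of host byte order. */
static uint32_t demo_get_be32(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}

/* Read a big-endian 64-bit value as two 32-bit halves. */
static uint64_t demo_get_be64(const uint8_t *p)
{
    return ((uint64_t) demo_get_be32(p) << 32) | demo_get_be32(p + 4);
}

/* Assumed wire layout: 8-byte addr, 4-byte rkey, 4-byte length, big-endian. */
static struct demo_remote_buf demo_parse_remote_buf(const uint8_t wire[16])
{
    struct demo_remote_buf b;

    b.addr   = demo_get_be64(wire);
    b.rkey   = demo_get_be32(wire + 8);
    b.length = demo_get_be32(wire + 12);
    return b;
}
```

The sender would fill the `rkey` slot from the rkey of its registered memory region, not the lkey, so the receiver never has to rely on the two happening to be equal.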
Re: [openib-general] What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
> > Subject: What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
> >
> > Here is my next and hopefully last problem. As described earlier I'm connecting a gen2 x86 clients to a gen1 PowerPC server
>
> Endian-ness issues?

How do you connect the QPs? If you are using CM, I think there is an endianness inconsistency between the CMs.

I suggest you check the following attributes between the two sides:
rq_psn
sq_psn
dlid
dest_qp_num

A problem in one of those attributes can cause the "retry exceeded" error you got ...

Dotan
Re: [openib-general] What could prevent a gen2 x86 client qp from doing RDMA_READ on a gen1 PowerPC client?
> > Endian-ness issues?
>
> Yes that is the first thought. But where? Since my gen2 x86 rdma code can do an rdma read from a gen1 and gen2 x86 server I think the only values in the ibv_send_wr that can be wrong talking to a PowerPC server can be remote_addr and rkey, right? I already swapped both but without success. Are there other places in the ibv_send_wr or the underlying code that might be endian-ness fooled? Since I can do a VAPI_SEND (non RDMA) from the gen2 x86 client to the gen1 PowerPC server I think the qp should be OK? Is there something RDMA READ specific in the qp that still might not be right after my CM connection from gen2 to gen1? Don't forget the RDMA WRITE from the gen1 PowerPC server to the gen2 x86 client on the same qp works just before the RDMA READ from gen2 x86 client on the gen1 PowerPC server fails. Still confusing.
>
> Thomas

Yes. There are 2 attributes in every QP that handle RDMA Read/Atomic operations:
a) how many outstanding RDMA Read/Atomic operations the QP may send as an initiator
b) how many outstanding RDMA Read/Atomic operations the QP may handle as a target

The connectivity between QPs X and Y should be:
Xa = Yb
Xb = Ya

And of course RDMA Read needs to be enabled in the QP access permissions and in the MR permissions ...

Dotan
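The pairing rule above can be written down as a small check. This is a sketch under assumptions: `struct demo_rd_atomic` and `demo_rd_atomic_compatible` are hypothetical names (in libibverbs the two values live in struct ibv_qp_attr as max_rd_atomic and max_dest_rd_atomic), and the sketch accepts "initiator depth not larger than the peer's target depth", of which the equality stated in the message is the exact-match case. That relaxation is my reading, not something the message says.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-QP view of the two attributes from the message. */
struct demo_rd_atomic {
    uint8_t as_initiator;  /* (a) outstanding reads/atomics this QP may issue */
    uint8_t as_target;     /* (b) outstanding reads/atomics this QP can serve */
};

/*
 * Each side must not issue more outstanding RDMA Reads/Atomics than the
 * other side is prepared to serve. Xa == Yb and Xb == Ya satisfies this.
 */
static bool demo_rd_atomic_compatible(struct demo_rd_atomic x,
                                      struct demo_rd_atomic y)
{
    return x.as_initiator <= y.as_target &&
           y.as_initiator <= x.as_target;
}
```

A target depth of 0 on the server side would make every RDMA Read from the client fail while sends and RDMA Writes keep working, which matches the symptom pattern described in the thread, though the thread does not confirm that this was the cause.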
Re: [openib-general] OpenSM log growing too big
Hal Rosenstock wrote:
> Not sure what question you are asking exactly. Is it what do those messages mean or the file getting large or both ?

Both. The message looks like LID 5 is generating too many events. The log file grows a few MB a second. Whatever the problem with the port, it should not generate this many log messages. I guess it is an OpenSM bug.

> What options are you using on OpenSM startup ?

root 7703 0.0 0.0 92784 1652 ? Sl 05:00 0:01 /usr/bin/opensm -g 0x005045014ac20001 -p 11 -s 10 -u -f /var/log/opensm.log

> Also, any chance you can move forward on a more recent and better OpenSM ?

It is difficult to use OpenSM from OFED 1.1, because we need to do another QA verification cycle with our product. But if I can find the specific patch to OpenSM, I can apply that patch to the existing OpenSM.

VBabu

> -- Hal
[openib-general] Setting up VLArbitration tables SL2VLMapping
Hi,

I have installed the OFED 1.1 distribution and have a Mellanox 25208 HCA. I want to know if there is any particular program/application that would allow me to set the SL2VL mapping and VL arbitration tables for the HCA?

Thanks,
Adit

Adit Ranadive
MS CS Candidate
Georgia Institute of Technology, Atlanta, GA
Re: [openib-general] Question about multicast GIDs
Robert Walsh wrote:
> Roland Dreier wrote:
>>> Is there a registration authority for multicast GIDs? Or at least a safe way of assigning a range of GIDs to a vendor?
>>
>> I don't think so. Perhaps RFC 3307 would be of some use...
>
> Ah - looks exactly like what I was looking for. Thanks.

Hmm - spoke too soon. This seems to be related to IPv6 multicast GIDs, but not IB. The idea is similar, but the allocation mechanism is entirely arbitrary (but consistent) and I don't think it would map from IPv6 to IB in any meaningful way.

I'll talk to the folks here who are on the various IB committees and see if they have any thoughts on this.

Regards,
Robert.
Re: [openib-general] [PATCH/RFC 1/2] IB: Return maybe_missed_event hint from ib_req_notify_cq()
    Pradeep> Is it possible that when one gets into the rotting
    Pradeep> packet case, the quota is at or close to 0 (on ehca). If
    Pradeep> in the case it is 0 and netif_rx_reschedule() case wins
    Pradeep> (over netif_rx_schedule()) then it keeps spinning unable
    Pradeep> to process any packets since the undo parameter for
    Pradeep> netif_reschedule() is 0.

It is possible that the quota is close to 0, but I don't see how the poll routine could spin with quota (the variable max) equal to 0. If max is 0, then the while (max) loop will never be entered, empty will remain 0, and the poll routine will simply fall through and return 1. Do you agree with that summary?

We don't want the undo parameter of netif_rx_reschedule() to be non-zero because when we go back to repoll, done is reset to 0. So there's no reason to increase the quota again.

I guess you could instrument how many iterations there are with a small value of max, but I would assume it's self-limiting, since the last few completions should appear fairly quickly.

 - R.
[openib-general] account on the new ofa server
How do I get an account on the new ofa server? Thanks, Steve.
Re: [openib-general] [PATCH/RFC 1/2] IB: Return maybe_missed_event hint from ib_req_notify_cq()
> What I have found in ehca driver, n != t, doesn't mean it's empty. If poll again, there are still some packets in cq. IB_CQ_REPORT_MISSED_EVENTS most of the time reports 1. It relies on netif_rx_reschedule() returns 0 to exit napi poll. That might be the reason in poll routine for a long time? I will rerun my test to use n != 0 to see any difference here.

Maybe there's an ehca bug in poll CQ? If n != t then it should mean that the CQ was indeed drained. I would expect a missed event would be rare, because it means a completion occurs between the last poll CQ and the request notify, and that shouldn't be that common...

My rough estimate is that even at a higher throughput than what you're seeing, IPoIB should only generate ~ 500K completions/sec, which means the average delay between completions is 2 microseconds. So I wouldn't expect completions to hit the window between poll and request notify that often.

 - R.
Re: [openib-general] make ipoib_ib_dev_stop void?
> Shouldn't ipoib_ib_dev_stop be void?

Looks like it -- after all we never check the return value.
[openib-general] [Fwd: [PATCH 0/7 v2] for 2.6.20 rdma/cma: add userspace support]
Hey Roland,

What's the plan on this series? Do you plan on pulling these into your for-2.6.20 tree? (don't mean to push...just wondering if they're on track)

Thanks,
Steve.

-------- Forwarded Message --------
From: Sean Hefty [EMAIL PROTECTED]
To: 'Roland Dreier' [EMAIL PROTECTED], openib-general@openib.org
Subject: [openib-general] [PATCH 0/7 v2] for 2.6.20 rdma/cma: add userspace support
Date: Tue, 24 Oct 2006 15:25:48 -0700

The following set of patches expand the rdma_cm support to include UD and multicast, and expose the rdma_cm to userspace. I would like to target the 2.6.20 kernel, but at least getting them into one or more branches would be helpful for other developers to test against these changes.

As mentioned in the RFC, the patches borrow heavily from the code checked into openfabrics svn, but there are some notable differences. The main difference from the patches submitted for the RFC is the integration of the ib_multicast module with the ib_sa module. The two modules are loosely coupled, with minimal changes made to the existing sa_query code.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]
Re: [openib-general] [PATCH v2 5/11] Implementation of Data path of the communication protocol
While importing these patches, I got several "Space in indent is followed by a tab." errors. For example, the line

+ __constant_cpu_to_be16(ETH_P_8021Q))) {

which also leads to the comment that there's no reason for __constant_cpu_to_be16() here -- just use cpu_to_be16 and let the compiler do the optimization. (the __constant form is only needed in places where the function call is a syntax error)
Re: [openib-general] [Fwd: [PATCH 0/7 v2] for 2.6.20 rdma/cma: add userspace support]
> What's the plan on this series? Do you plan on pulling these into your for-2.6.20 tree?

I need to make time to read them over. And I would like to get some resolution for the IPoIB crashes that Mellanox sees before we commit to merging them into 2.6.20.
Re: [openib-general] [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)
OK, I just pushed out a vex branch with these patches in it.

I noticed that you put your code under ulp/vnic -- that seems a little too generic to me, given that this is one particular proprietary vnic implementation. Maybe something like ulp/qlvex or something like that? And similarly for the config options -- probably something like CONFIG_INFINIBAND_QLOGIC_VEX would be better to avoid clashes.

 - R.
Re: [openib-general] Question about the query QP mask
> What should be the expected behavior? Should this description be changed, or should the low-level drivers of mthca and ipath be changed?

The mask is used as a hint to the low-level driver about which attributes the consumer cares about. The driver may fill in more fields, but it can use the mask to optimize some calls, if filling in a particular field is expensive and that field is not requested by the consumer.

I guess we should update the documentation to reflect this.

 - R.
Re: [openib-general] [PATCH v2 0/11] [RFC] Support for QLogic Virtual Ethernet I/O Controller (VEx)
Oh, one other thing -- you probably want to add a MAINTAINERS entry for your driver so people know who to bother...

 - R.
Re: [openib-general] [PATCH v2] IB_mthca HCA profile module parameters
We seem to be making negative progress :(

The patch is still corrupted, eg:

> +module_param_named(num_mpt, default_profile.num_mpt, int, 0444);
> +MODULE_PARM_DESC(num_mpt,
> + "maximum number of memory protection pable entries per HCA");

Indentation is completely borken:

> +static int __devinit mthca_check_profile_value(int* pval,int pval_default){
> +/* value must be positive and power of 2 */
> +int old_pval = *pval;

No braces needed around one-statement blocks:

> +if (old_pval <= 0) {
> +*pval = pval_default;
> +} else if (!is_power_of_2(old_pval)) {

And that test is_power_of_2() is completely unnecessary -- just set *pval to roundup_pow_of_two unconditionally (and kill the is_power_of_2 macro completely).

> +if (mthca_check_profile_value(&default_profile.num_qp,
> + MTHCA_DEFAULT_NUM_QP)){
> +mthca_warn(mdev,"invalid num_qp passed. changed to %d.\n",
> + default_profile.num_qp);
> + }

You should be able to create a macro that passes the name of the parameter in too, and move the if statement and the warning into mthca_check_profile_value...

 - R.
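One way the suggested macro could look, as a userspace sketch only: all `demo_`/`DEMO_` names and `struct demo_profile` are hypothetical, fprintf() stands in for mthca_warn(), and `demo_roundup_pow_of_two` stands in for the kernel's roundup_pow_of_two(). The `#field` stringification is what lets a single helper print the parameter name.

```c
#include <stdio.h>

/* Hypothetical cut-down profile; the real one has more members. */
struct demo_profile {
    int num_qp;
    int rdb_per_qp;
};

/* Stand-in for roundup_pow_of_two(); assumes 0 < x <= 2^30. */
static int demo_roundup_pow_of_two(int x)
{
    int p = 1;

    while (p < x)
        p <<= 1;
    return p;
}

/*
 * Fix up one value and warn if it changed. Non-positive values fall
 * back to the default; everything else is rounded up unconditionally.
 * Returns nonzero when the value had to be changed.
 */
static int demo_check_profile_value(const char *name, int *pval,
                                    int pval_default)
{
    int old = *pval;

    if (old <= 0)
        *pval = pval_default;
    else
        *pval = demo_roundup_pow_of_two(old);

    if (*pval != old)
        fprintf(stderr, "invalid %s passed. changed to %d\n", name, *pval);
    return *pval != old;
}

/* #field turns the member name into the string used in the warning. */
#define DEMO_CHECK_PROFILE_VALUE(profile, field, def) \
    demo_check_profile_value(#field, &(profile).field, (def))
```

With this shape, each parameter needs one line at the call site, for example `DEMO_CHECK_PROFILE_VALUE(p, num_qp, 1 << 16);`, and the if statement and warning exist in only one place.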
[openib-general] opensm problem
I'm using 2.6.19-rc5 + sean's ucma patch series plus the latest userspace/management code from svn. I'm running mthca point to point between two servers. When I start opensm, it continually logs these messages:

Nov 16 14:15:15 567111 [42003940] - __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x00010083, expected comp mask = 0x000130c7, MGID: 0xff12401b : 0x0016 from port 0x0002c902002147c9

I think this is some sort of mismatch between the mcast code in sean's patch series and the management code maybe? Anybody seen this?

Thanks,
Steve.
[openib-general] DevCon decision on userspace
Following Roland's lead on his userspace libraries (verbs and mthca), the DevCon decided that the userspace trunk will be moved to git, with each component maintainer having a public tree with one or more branches to be pushed up to a git trunk. It is a requirement to import all the version history from svn and prune as appropriate. The timeframe for this is TBD. Any comments from maintainers and any consumers of the current userspace trunk? -- Hal
Re: [openib-general] Setting up VLArbitration tables SL2VLMapping
OpenSM supports setting up these tables. There is info on this in the opensm man page. -- Hal

From: [EMAIL PROTECTED] on behalf of Adit Ranadive
Sent: Thu 11/16/2006 1:50 PM
To: openib-general@openib.org
Subject: [openib-general] Setting up VLArbitration tables SL2VLMapping

Hi, I have installed the OFED 1.1 distribution and have a Mellanox 25208 HCA. I want to know if there is any particular program/application that would allow me to set the SL2VL mapping and VL arbitration tables for the HCA?

Thanks, Adit
- Adit Ranadive MS CS Candidate Georgia Institute of Technology, Atlanta, GA
Re: [openib-general] opensm problem
Steve, Those messages mean that you are joining a MC group which is not already created. The MGID of 0xff12401b : 0x0016 is for 224.0.0.22. That is for IGMP on your IPoIB subnet. The group either needs to be preconfigured or the first joiner needs to create the group (which requires more characteristics). OpenSM already precreates some groups but not this one. This can be added easily. Can it wait until next week? -- Hal

From: [EMAIL PROTECTED] on behalf of Steve Wise
Sent: Thu 11/16/2006 3:18 PM
To: openib-general
Subject: [openib-general] opensm problem

I'm using 2.6.19-rc5 + sean's ucma patch series plus the latest userspace/management code from svn. I'm running mthca point to point between two servers. When I start opensm, it continually logs these messages:

Nov 16 14:15:15 567111 [42003940] - __osm_mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x00010083, expected comp mask = 0x000130c7, MGID: 0xff12401b : 0x0016 from port 0x0002c902002147c9

I think this is some sort of mismatch between the mcast code in seans patch series and the management code maybe? Anybody seen this? Thanks, Steve.
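[Editor's sketch, not part of the thread] The correspondence between 224.0.0.22 and the MGID in the log follows the IPv4 multicast mapping of RFC 4391: a fixed 0xff1<scope> prefix, the IPoIB IPv4 signature 0x401b, the partition key, and the low 28 bits of the group address. A minimal illustration is below; the function name example_ipv4_to_mgid is hypothetical, and the P_Key and scope used in the usage note (0xffff, link-local scope 2) are assumptions, since the log line in this archive is truncated.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build the 16-byte IPoIB MGID for an IPv4 multicast address
 * (host byte order), following the layout in RFC 4391:
 *   ff 1<scope> | 40 1b | P_Key | zeros | low 28 bits of the IP address
 */
void example_ipv4_to_mgid(uint32_t ip, uint16_t pkey, uint8_t scope,
			  uint8_t mgid[16])
{
	memset(mgid, 0, 16);
	mgid[0] = 0xff;
	mgid[1] = (uint8_t)(0x10 | (scope & 0x0f)); /* flags = 1, then scope */
	mgid[2] = 0x40;                             /* IPv4 signature 0x401b */
	mgid[3] = 0x1b;
	mgid[4] = (uint8_t)(pkey >> 8);
	mgid[5] = (uint8_t)(pkey & 0xff);
	mgid[12] = (uint8_t)((ip >> 24) & 0x0f);    /* only 28 bits are kept */
	mgid[13] = (uint8_t)((ip >> 16) & 0xff);
	mgid[14] = (uint8_t)((ip >> 8) & 0xff);
	mgid[15] = (uint8_t)(ip & 0xff);
}
```

For 224.0.0.22 (0xe0000016) the top four address bits are dropped, so the MGID begins ff 12 40 1b and ends in 0x16, which matches the visible parts of the logged value.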
Re: [openib-general] opensm problem
On Thu, 2006-11-16 at 23:28 +0200, Hal Rosenstock wrote:
> Steve, Those messages mean that you are joining a MC group which is not already created. The MGID of 0xff12401b : 0x0016 is for 224.0.0.22. That is for IGMP on your IPoIB subnet. The group either needs to be preconfigured or the first joiner needs to create the group (which requires more characteristics). OpenSM already precreates some groups but not this one. This can be added easily. Can it wait until next week ?

I guess, but I'm wondering what changed? This used to just work out of the box. Perhaps the IPoIB module in 2.6.19 isn't up to date?
Re: [openib-general] opensm problem
Steve, Did you configure the kernel differently? Is IGMP turned on somehow? (I haven't run with Sean's multicast code.) BTW, as I mentioned, this can be solved on the client side equally as well as the SM side. -- Hal

From: Steve Wise [mailto:[EMAIL PROTECTED]
Sent: Thu 11/16/2006 4:32 PM
To: Hal Rosenstock
Cc: openib-general@openib.org
Subject: RE: [openib-general] opensm problem

On Thu, 2006-11-16 at 23:28 +0200, Hal Rosenstock wrote:
> Steve, Those messages mean that you are joining a MC group which is not already created. The MGID of 0xff12401b : 0x0016 is for 224.0.0.22. That is for IGMP on your IPoIB subnet. The group either needs to be preconfigured or the first joiner needs to create the group (which requires more characteristics). OpenSM already precreates some groups but not this one. This can be added easily. Can it wait until next week ?

I guess, but I'm wondering what changed? This used to just work out of the box. Perhaps the IPoIB module in 2.6.19 isn't up to date?
Re: [openib-general] OpenSM log growing too big
Hi Venkat, See embedded <hnr>...</hnr> comments below. -- Hal

From: Venkatesh Babu [mailto:[EMAIL PROTECTED]
Sent: Thu 11/16/2006 1:39 PM
To: Hal Rosenstock
Cc: openib-general@openib.org
Subject: Re: [openib-general] OpenSM log growing too big

Hal Rosenstock wrote:
> Not sure what question you are asking exactly. Is it what do those messages mean or the file getting large or both ?

Both. The message looks like LID 5 is generating too many events.

<hnr> Yes, LID 5 is a switch LID and there is a port which is flapping. Bad cable ? </hnr>

The log file grows by a few MB a second. Whatever the problem with the port, it should not generate this many log messages. I guess it is an OpenSM bug.

<hnr> The code is reducing the messages which are similar (approx 128 traps). The SM is repressing the trap and then the switch regenerates it because there is a port going up and down. That issue should be resolved. There has been discussion on the list, and patches for dealing with the log and limiting its size are in more recent versions of OpenSM. I'll look at it to see if I can reduce these messages further. </hnr>

> What options are you using on OpenSM startup ?

root 7703 0.0 0.0 92784 1652 ? Sl 05:00 0:01 /usr/bin/opensm -g 0x005045014ac20001 -p 11 -s 10 -u -f /var/log/opensm.log

> Also, any chance you can move forward on a more recent and better OpenSM ?

It is difficult to use OpenSM from OFED 1.1, because we would need to do another QA verification cycle with our product. But if I can find the specific patch to OpenSM, I can apply that patch to the existing OpenSM.

<hnr> I would highly recommend moving to OFED 1.1 OpenSM (from OFED 1.0). Many bugs have been fixed and it is much more robust. </hnr>

VBabu

> -- Hal
Re: [openib-general] opensm problem
On Thu, 2006-11-16 at 23:37 +0200, Hal Rosenstock wrote:
> Steve, Did you configure the kernel differently ? Is IGMP turned on somehow ? (I haven't run with Sean's multicast code.)

IGMP turned on where?

> BTW, as I mentioned, this can be solved on the client side equally as well as the SM side.

How?
Re: [openib-general] [PATCH] IB/ipoib: compliance/interoperability fix
Thanks, I queued this for 2.6.19. I assume it's been tested carefully? - R.
Re: [openib-general] opensm problem
Steve, See <hnr>...</hnr> embedded comments below. -- Hal

From: Steve Wise [mailto:[EMAIL PROTECTED]
Sent: Thu 11/16/2006 4:51 PM
To: Hal Rosenstock
Cc: openib-general@openib.org
Subject: RE: [openib-general] opensm problem

On Thu, 2006-11-16 at 23:37 +0200, Hal Rosenstock wrote:
> Steve, Did you configure the kernel differently ? Is IGMP turned on somehow ? (I haven't run with Sean's multicast code.)

IGMP turned on where?

<hnr> Not sure what turns this on. I think IP multicast needs to be configured in the kernel. I don't think it is automatic although that might be the default config. Also, using IP multicast (via Sean's multicast code) likely causes IGMP to be used so the routers know the IPmc groups being created/joined/left. </hnr>

> BTW, as I mentioned, this can be solved on the client side equally as well as the SM side.

How?

<hnr> The client side can do a full join with all the SA required characteristics (called components in IB). There are downsides to both approaches: configuration (SM side), versus the first joiner node's characteristics being the ones enforced for the group, which may not be the desired result. </hnr>
Re: [openib-general] opensm problem
> IGMP turned on where?
> <hnr> Not sure what turns this on. I think IP multicast needs to be configured in the kernel. I don't think it is automatic although that might be the default config. Also, using IP multicast (via Sean's multicast code) likely causes IGMP to be used so the routers know the IPmc groups being created/joined/left. </hnr>
> BTW, as I mentioned, this can be solved on the client side equally as well as the SM side.

I figured out what was causing all the joins. I was running mdnsd (Multicast DNS daemon). I turned it off and things are nice and quiet. I guess SUSE 10.1 turns this on by default... Thanks for the clues!! Steve.
[openib-general] Reach beyond your expectations
Title: To live the life you always wanted, you need... a "springboard". Personal breakthrough: a process that includes a workshop combined with personal coaching (COACHING). What is a personal breakthrough? An act or process that brings us to recognize and walk a new and challenging path that was "blocked" to us in the past, a path leading to the results we want to achieve in our lives. Head of the Charisma MP Ltd. institute: Eran Shai, social psychologist (MA), facilitator and coach, CEO of the company. Expert in facilitating groups and individuals with the cognitive-behavioral approach. Professional facilitator of coaches in personal-breakthrough workshops. More than 15 years of experience in facilitation and training. Click the link and start the long-awaited change: http://www.charismassertiv.com/contactus.php To unsubscribe/REMOVE ME
Re: [openib-general] [PATCH 2/2] libibcommon: enable printf() style format strict checking
On 17:03 Thu 16 Nov, Michael S. Tsirkin wrote:
> > diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
> > index 83c0679..66afab0 100644
> > --- a/libibcommon/include/infiniband/common.h
> > +++ b/libibcommon/include/infiniband/common.h
> > @@ -114,11 +114,16 @@ #endif
> >  #define ENUM_STR_DEF(enumname, last, val) (((unsigned)(val) < last) ? enumname ## _str[val] : "???")
> >  #define ENUM_STR_ARRAY(name) char * name ## _str[]
> > +#ifdef __GNUC__
> > +#define STRICT_FORMAT __attribute__((format(printf, 2, 3)))
> > +#else
> > +#define STRICT_FORMAT
> > +#endif
>
> You are polluting the global namespace - macros must be prefixed with library name.

This is not the style for this library, but I have nothing against adding a prefix here. Will do.

> But anyway - why is this necessary? Does anyone actually try compiling libibcommon not in gcc? Why?

I don't know if anyone will want to build this with a non-gcc compiler, but I know that this attribute is a gcc extension.

> And AFAIK e.g. intel compiler implements this __attribute__.

As well as format(printf(...))? It is nice. I don't have icc to check this, but feel free to send the patch if you like.

Sasha

> >  /* util.c: debugging and tracing */
> > -void ibwarn(const char * const fn, char *msg, ...);
> > -void ibpanic(const char * const fn, char *msg, ...);
> > -void logmsg(const char *const fn, char *msg, ...);
> > +void ibwarn(const char * const fn, char *msg, ...) STRICT_FORMAT;
> > +void ibpanic(const char * const fn, char *msg, ...) STRICT_FORMAT;
> > +void logmsg(const char *const fn, char *msg, ...) STRICT_FORMAT;
> >  void xdump(FILE *file, char *msg, void *p, int size);
>
> -- MST
Re: [openib-general] [Fwd: [PATCH 0/7 v2] for 2.6.20 rdma/cma: add userspace support]
> I need to make time to read them over. And I would like to get some resolution for the IPoIB crashes that Mellanox sees before we commit to merging them into 2.6.20.

I agree that we need to fix the ipoib crashes before merging this upstream. After that is resolved, I need to make a couple of small updates to the patches before resubmitting. If this misses 2.6.20, I don't think it'll be a big deal. - Sean
Re: [openib-general] [Fwd: [PATCH 0/7 v2] for 2.6.20 rdma/cma: add userspace support]
On Thu, 2006-11-16 at 15:00 -0800, Sean Hefty wrote:
> > I need to make time to read them over. And I would like to get some resolution for the IPoIB crashes that Mellanox sees before we commit to merging them into 2.6.20.
> I agree that we need to fix the ipoib crashes before merging this upstream. After that is resolved, I need to make a couple of small updates to the patches before resubmitting. If this misses 2.6.20, I don't think it'll be a big deal. - Sean

It would be nice to get the user mode connection setup code in 2.6.20. Without it, there's no user mode support for iwarp. The instability is in the mcast stuff, right? Can we separate the two and pull in the connection setup support for user mode? Steve.
Re: [openib-general] [PATCH/RFC 1/2] IB: Return maybe_missed_event hint from ib_req_notify_cq()
Roland Dreier [EMAIL PROTECTED] wrote on 11/16/2006 11:26:31 AM:
> > What I have found in the ehca driver is that n != t doesn't mean it's empty. If I poll again, there are still some packets in the cq. IB_CQ_REPORT_MISSED_EVENTS most of the time reports 1. It relies on netif_rx_reschedule() returning 0 to exit napi poll. That might be the reason it stays in the poll routine for a long time? I will rerun my test using n != 0 to see any difference here.
>
> Maybe there's an ehca bug in poll CQ? If n != t then it should mean that the CQ was indeed drained. I would expect a missed event would be rare, because it means a completion occurs between the last poll CQ and the request notify, and that shouldn't be that common... My rough estimate is that even at a higher throughput than what you're seeing, IPoIB should only generate ~ 500K completions/sec, which means the average delay between completions is 2 microseconds. So I wouldn't expect completions to hit the window between poll and request notify that often. - R.

I have tried setting low_latency to 1 to disable TCP prequeue; the throughput increased from 1XXMb/s to 4XXMb/s. If I delayed netif_receive_skb() a little bit, I could get around 1700Mb/s. If I totally disable netif_rx_reschedule(), so that there is no repoll and it returns 0, I could get around 2900Mb/s throughput without seeing packet out-of-order issues. I have tried adding a spin lock in ipoib_poll(), and I still see packets out of order.

disable prequeue: 2XXMb/s to 4XXMb/s (packets out of order)
slow down netif_receive_skb: 17XXMb/s (packets out of order)
don't handle missed event: 28XXMb/s (no packets out of order)
handle missed event later: 7XXMb/s to 11XXMb/s (packets out of order)

Maybe the ehca driver delivers packets much faster? Which makes me think that the user process's handling of the tcp backlog queue and prequeue might be out of order?

Thanks
Shirley Ma
Re: [openib-general] DevCon decision on userspace
On 23:19 Thu 16 Nov, Hal Rosenstock wrote:
> Following Roland's lead on his userspace libraries (verbs and mthca), the DevCon decided that the userspace trunk will be moved to git with each component maintainer have a public tree with one or more branches to be pushed up to a git trunk. It is a requirement to import all the version history from svn and prune as appropriate. The timeframe for this is TBD. Any comments from maintainers and any consumers of the current userspace trunk ?

Hal, we are lucky - we already have the src/userspace/management tree converted to git on the new server:

git://staging.openfabrics.org/~sashak/management.git

As the conversion tool I used the git-svnimport script distributed with git (the recent version is better - it supports openib subproject imports with the '-P' option). There is another tool, git-svn. At one time it was not able to import branches, but it was under active development the last time I looked, so it may be better now.

With git-svnimport the command should be like:

git-svnimport -v -r -m -F -S -C <git-dir> -A <authors-file> \
 -T gen2/trunk -b gen2/branches -t gen2/tags \
 -P src/userspace/management https://openib.org/svn

Some options (like -r, -v) can be omitted - see the --help output. It is better (and much, much faster) to run the import against a local SVN repository. After the import you may want to review the resulting git tree, remove unrelated tags and branches, run git-repack -a -d, and finally push the imported tree to a public place.

Sasha
[openib-general] [PATCH 2/2 v2] libibcommon: enable printf() style format strict checking
This enables strict format/args checking for printf() style functions.

Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
---
 libibcommon/include/infiniband/common.h | 11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
index 83c0679..c41217d 100644
--- a/libibcommon/include/infiniband/common.h
+++ b/libibcommon/include/infiniband/common.h
@@ -114,11 +114,16 @@ static inline uint64_t htonll(uint64_t x
 #define ENUM_STR_DEF(enumname, last, val) (((unsigned)(val) < last) ? enumname ## _str[val] : "???")
 #define ENUM_STR_ARRAY(name) char * name ## _str[]
+#ifdef __GNUC__
+#define IBCOMMON_STRICT_FORMAT __attribute__((format(printf, 2, 3)))
+#else
+#define IBCOMMON_STRICT_FORMAT
+#endif
 /* util.c: debugging and tracing */
-void ibwarn(const char * const fn, char *msg, ...);
-void ibpanic(const char * const fn, char *msg, ...);
-void logmsg(const char *const fn, char *msg, ...);
+void ibwarn(const char * const fn, char *msg, ...) IBCOMMON_STRICT_FORMAT;
+void ibpanic(const char * const fn, char *msg, ...) IBCOMMON_STRICT_FORMAT;
+void logmsg(const char *const fn, char *msg, ...) IBCOMMON_STRICT_FORMAT;
 void xdump(FILE *file, char *msg, void *p, int size);
--
1.4.4.g031c-dirty
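[Editor's sketch, not part of the thread] For readers unfamiliar with the attribute: format(printf, 2, 3) tells gcc that argument 2 is a printf-style format string and that the variadic arguments start at argument 3, so mismatched calls produce -Wformat warnings at compile time. The standalone example below mirrors the patch's pattern; EXAMPLE_STRICT_FORMAT and example_warn are hypothetical names, not libibcommon symbols.

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef __GNUC__
/* Argument 2 is the format string; the varargs begin at argument 3. */
#define EXAMPLE_STRICT_FORMAT __attribute__((format(printf, 2, 3)))
#else
#define EXAMPLE_STRICT_FORMAT
#endif

int example_warn(const char *fn, const char *msg, ...) EXAMPLE_STRICT_FORMAT;

/* Print "warn: [fn] <formatted message>" to stderr and return the
 * number of characters produced by the formatted part. */
int example_warn(const char *fn, const char *msg, ...)
{
	va_list ap;
	int n;

	fprintf(stderr, "warn: [%s] ", fn);
	va_start(ap, msg);
	n = vfprintf(stderr, msg, ap);
	va_end(ap);
	fputc('\n', stderr);
	return n;
}
```

A call such as example_warn("main", "port %d lid %u", 1, 5u) compiles cleanly, while passing a string where %d is expected should draw a format warning from gcc when -Wformat (part of -Wall) is enabled.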
Re: [openib-general] [Fwd: [PATCH 0/7 v2] for 2.6.20 rdma/cma: adduserspace support]
> It would be nice to get the user mode connection setup code in 2.6.20. Without it, there's no user mode support for iwarp. The instability is in the mcast stuff, right? Can we separate the two and pull in the connection setup support for user mode?

I'd rather not, since removing multicast support changes the ABI. I had another request from Voltaire to try to merge this upstream for 2.6.20, and I am actively debugging this when I'm not sitting in a conference... (I.e. I will try.) - Sean
[openib-general] [Bug 293] New: udev rules should be KERNEL== not KERNEL=
http://openib.org/bugzilla/show_bug.cgi?id=293

Summary: udev rules should be KERNEL== not KERNEL=
Product: OpenFabrics Linux
Version: gen2
Platform: All
OS/Version: Other
Status: NEW
Severity: major
Priority: P2
Component: IB Core
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]

Apparently newer udev versions complain about this. FC5 doesn't, but it works with the change. FC6 will emit errors when the rules are processed.

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
Re: [openib-general] [Fwd: [PATCH 0/7 v2] for 2.6.20 rdma/cma: adduserspace support]
If we're confident on the multicast ABI now, we could stub it out for 2.6.20 (just return -ENOSYS or something). Then the userspace side would fail gracefully against old kernels and we could merge multicast support later. But that adds work to strip out the multicast support. And it assumes that we know the ABI now. - R.
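[Editor's sketch, not part of the thread] The stubbing idea is that the command number for multicast stays reserved in the ABI while its handler simply reports "not implemented", so the numbering of the other commands never has to change later. A generic illustration follows; the enum values, table and function names are all hypothetical and do not reflect the actual ucma command layout.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers; the multicast slot is reserved now so
 * that adding the real handler later does not renumber anything. */
enum example_cmd {
	EXAMPLE_CMD_CONNECT,
	EXAMPLE_CMD_JOIN_MCAST,
	EXAMPLE_CMD_MAX
};

typedef int (*example_handler)(void);

static int example_connect(void)
{
	return 0; /* pretend the connection setup succeeded */
}

/* Placeholder for a command whose ABI is fixed but not yet implemented. */
static int example_not_supported(void)
{
	return -ENOSYS;
}

static const example_handler example_table[EXAMPLE_CMD_MAX] = {
	[EXAMPLE_CMD_CONNECT]    = example_connect,
	[EXAMPLE_CMD_JOIN_MCAST] = example_not_supported,
};

/* Dispatch a command; unknown numbers are rejected with -EINVAL. */
int example_dispatch(unsigned int cmd)
{
	if (cmd >= EXAMPLE_CMD_MAX)
		return -EINVAL;
	return example_table[cmd]();
}
```

A userspace library written against this table can treat -ENOSYS as "kernel too old for multicast" and carry on with connection setup, which is the graceful failure described above.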
Re: [openib-general] [PATCH] RFC libibverbs - Pass provider data through ibv_cmd_req_notify_cq()
OK, I applied these patches to libibverbs (and a corresponding patch to libmthca) and pushed the new trees out. Steve, can you pull and make sure I got everything you needed in? - R.
Re: [openib-general] [PATCH 2/2] libibcommon: enable printf() style format strict checking
See embedded comment below.

From: Sasha Khapyorsky [mailto:[EMAIL PROTECTED]
Sent: Thu 11/16/2006 5:52 PM
To: Michael S. Tsirkin
Cc: Hal Rosenstock; openib-general@openib.org
Subject: Re: [PATCH 2/2] libibcommon: enable printf() style format strict checking

On 17:03 Thu 16 Nov, Michael S. Tsirkin wrote:
> > diff --git a/libibcommon/include/infiniband/common.h b/libibcommon/include/infiniband/common.h
> > index 83c0679..66afab0 100644
> > --- a/libibcommon/include/infiniband/common.h
> > +++ b/libibcommon/include/infiniband/common.h
> > @@ -114,11 +114,16 @@ #endif
> >  #define ENUM_STR_DEF(enumname, last, val) (((unsigned)(val) < last) ? enumname ## _str[val] : "???")
> >  #define ENUM_STR_ARRAY(name) char * name ## _str[]
> > +#ifdef __GNUC__
> > +#define STRICT_FORMAT __attribute__((format(printf, 2, 3)))
> > +#else
> > +#define STRICT_FORMAT
> > +#endif
>
> You are polluting the global namespace - macros must be prefixed with library name.

This is not the style for this library,

<hnr> This is something I want to clean up by deprecating the non prefixed names. </hnr>

but I have nothing against adding prefix here. Will do.

> But anyway - why is this necessary? Does anyone actually try compiling libibcommon not in gcc? Why?

I don't know if anyone will want to build this with non-gcc compiler, but I know that this attribute is gcc extension.

> And AFAIK e.g. intel compiler implements this __attribute__.

As well as format(printf(...))? It is nice. I don't have icc to check this, but feel free to send the patch if you like.

Sasha

> >  /* util.c: debugging and tracing */
> > -void ibwarn(const char * const fn, char *msg, ...);
> > -void ibpanic(const char * const fn, char *msg, ...);
> > -void logmsg(const char *const fn, char *msg, ...);
> > +void ibwarn(const char * const fn, char *msg, ...) STRICT_FORMAT;
> > +void ibpanic(const char * const fn, char *msg, ...) STRICT_FORMAT;
> > +void logmsg(const char *const fn, char *msg, ...) STRICT_FORMAT;
> >  void xdump(FILE *file, char *msg, void *p, int size);
>
> -- MST
[openib-general] add SIGUSR1 to reopen osm.log
Our sysadmins have been rotating OpenSM's osm.log file and then restarting OpenSM. As this is a less than optimal solution if you have jobs running on the system, I wrote this patch (against OFED 1.1), which adds a handler for SIGUSR1 that reopens OpenSM's log file without a restart.

Ira Weiny [EMAIL PROTECTED]

Attachment: sigusr1-logreopen-opensm.patch (binary data)
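[Editor's sketch, not part of the thread] The attached patch is not reproduced in this archive, so the code below is only a generic illustration of the usual pattern, not Ira's implementation: the signal handler does nothing except set a flag, and the logging path closes and reopens the file the next time it writes. All names (example_install_handler, example_log_write, and so on) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set from the signal handler, consumed by the logging path. */
static volatile sig_atomic_t example_reopen_requested;
static FILE *example_log;

/* Only async-signal-safe work here: record the request and return. */
static void example_on_sigusr1(int sig)
{
	(void)sig;
	example_reopen_requested = 1;
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on error. */
int example_install_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = example_on_sigusr1;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Append one line to the log at path, reopening the file first if a
 * SIGUSR1 arrived since the previous write (e.g. after rotation). */
int example_log_write(const char *path, const char *msg)
{
	if (example_reopen_requested) {
		example_reopen_requested = 0;
		if (example_log) {
			fclose(example_log);
			example_log = NULL;
		}
	}
	if (!example_log) {
		example_log = fopen(path, "a");
		if (!example_log)
			return -1;
	}
	if (fprintf(example_log, "%s\n", msg) < 0)
		return -1;
	return fflush(example_log) == 0 ? 0 : -1;
}
```

With this shape, log rotation becomes "rename the file, then send SIGUSR1 to the daemon", and the next message lands in a fresh file at the original path. A multi-threaded daemon would additionally need to serialize access to the FILE pointer.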
Re: [openib-general] [PATCH 01/13] Linux RDMA Core Changes
This looks completely sane to me, so I have no problem merging this stuff once the rest of the Chelsio-specific stuff is reviewed. - R.
Re: [openib-general] [PATCH] RDMA/iwcm: Get rid of extra call to list_empty()
Hi Tom, OK, I will try it and report what I find. Thanks! - KK

Tom Tucker [EMAIL PROTECTED] wrote on 11/16/2006 10:08:03 PM:
> The race is that you've deleted the work queue element that is enqueued on the iwcm_wq. It's as simple as that. To prove it to yourself, apply your patch. Turn on memory debug support in the kernel and recompile your code. Then run rdma_krping clients in four different threads against your server with an I/O count of 1. You'll hit the race and can look at it yourself. I don't know any better way to explain it... Sorry.
>
> On 11/13/06 10:44 PM, Krishna Kumar2 [EMAIL PROTECTED] wrote:
> > Hi Tom,
> > > No, to understand why go look at the implementation of queue_work. BTW, this
> > I was describing the implementation of queue_work() in my previous mail. So sorry to be dense, but I do not understand why this patch introduces a race. Can you explain the race that you had found?
> >
> > What I understood of queue_work() is: if cm_work_handler() is already running and processing the last entry at the same time this new entry was added, it is guaranteed to find this new entry in its current run iteration, and process it. The only issue is with the extra queue_work done by iwcm in parallel on a different cpu for the same case. Suppose iwcm had done a redundant queue_work on this queue, which, besides adding the new entry to the workqueue, also does a wakeup of worker_thread (which is still running the previous iteration of run_workqueue - cm_work_handler). I am assuming that the wake up function is default_wake_function(), since I couldn't locate in the wait* code where this is initialized. When cm_work_handler finishes removing this new entry, it returns to worker_thread, which will do a schedule() and sleep till it is woken up again (since default_wake_function found that the thread is already running and had done nothing).
> >
> > Are you referring to a race where the queue_work is done between the time cm_work_handler finished running and before it gets back to schedule? I feel that should not matter, as run_workqueue() will find this entry in its cwq->worklist and continue processing instead of exiting to worker_thread() and schedule(). Still confused about the race :)
> >
> > Thanks, - KK
Re: [openib-general] OpenSM log growing too big
Hal Rosenstock wrote:
> <hnr> Yes, LID 5 is a switch LID and there is a port which is flapping. Bad cable ? </hnr>

When this port is disconnected, OpenSM stops logging these messages. It could have been a bad connection.

> <hnr> The code is reducing the messages which are similar (approx 128 traps). The SM is repressing the trap and then the switch regenerates it because there is a port going up and down. That issue should be resolved. There has been discussion on the list and patches on dealing with the log and limiting its size that are in more recent versions of OpenSM. I'll look at it to see if I can reduce these messages further. </hnr>

It would be great if you could provide this patch.

> <hnr> I would highly recommend moving to OFED 1.1 OpenSM (from OFED 1.0). Many bugs have been fixed and it is much more robust. </hnr>

I agree. I am trying to push this.

VBabu
[openib-general] [RFC] [PATCH] RDMA/iwcm: Cleanup IWCM_F_CALLBACK_DESTROY usage.
Cleanup IWCM_F_CALLBACK_DESTROY usage. It is being set only in cm_conn_req_handler(), and that too on a child handle. Remove IWCM_F_CALLBACK_DESTROY as the same result can be achieved otherwise. Patch against 2.6.19-rc5. Signed-off-by: Krishna Kumar [EMAIL PROTECTED] --- diff -ruNp org/drivers/infiniband/core/iwcm.c new/drivers/infiniband/core/iwcm.c --- org/drivers/infiniband/core/iwcm.c 2006-10-09 16:40:04.0 +0530 +++ new/drivers/infiniband/core/iwcm.c 2006-10-09 16:52:03.0 +0530 @@ -161,8 +161,6 @@ static int iwcm_deref_id(struct iwcm_id_ BUG_ON(!list_empty(cm_id_priv-work_list)); if (waitqueue_active(cm_id_priv-destroy_comp.wait)) { BUG_ON(cm_id_priv-state != IW_CM_STATE_DESTROYING); - BUG_ON(test_bit(IWCM_F_CALLBACK_DESTROY, - cm_id_priv-flags)); ret = 1; } complete(cm_id_priv-destroy_comp); @@ -386,7 +384,6 @@ void iw_destroy_cm_id(struct iw_cm_id *c struct iwcm_id_private *cm_id_priv; cm_id_priv = container_of(cm_id, struct iwcm_id_private, id); - BUG_ON(test_bit(IWCM_F_CALLBACK_DESTROY, cm_id_priv-flags)); destroy_cm_id(cm_id); @@ -833,11 +830,12 @@ static void cm_work_handler(void *arg) struct iwcm_id_private *cm_id_priv = work-cm_id; unsigned long flags; int empty; - int ret = 0; spin_lock_irqsave(cm_id_priv-lock, flags); empty = list_empty(cm_id_priv-work_list); while (!empty) { + int ret; + work = list_entry(cm_id_priv-work_list.next, struct iwcm_work, list); list_del_init(work-list); @@ -847,16 +845,13 @@ static void cm_work_handler(void *arg) spin_unlock_irqrestore(cm_id_priv-lock, flags); ret = process_event(cm_id_priv, work-event); - if (ret) { - set_bit(IWCM_F_CALLBACK_DESTROY, cm_id_priv-flags); + if (ret) destroy_cm_id(cm_id_priv-id); - } BUG_ON(atomic_read(cm_id_priv-refcount)==0); if (iwcm_deref_id(cm_id_priv)) return; - if (atomic_read(cm_id_priv-refcount)==0 - test_bit(IWCM_F_CALLBACK_DESTROY, cm_id_priv-flags)) { + if (ret atomic_read(cm_id_priv-refcount) == 0) { dealloc_work_entries(cm_id_priv); kfree(cm_id_priv); return; diff -ruNp 
org/drivers/infiniband/core/iwcm.h new/drivers/infiniband/core/iwcm.h --- org/drivers/infiniband/core/iwcm.h 2006-10-09 16:40:04.0 +0530 +++ new/drivers/infiniband/core/iwcm.h 2006-10-09 16:52:03.0 +0530 @@ -56,7 +56,6 @@ struct iwcm_id_private { struct list_head work_free_list; }; -#define IWCM_F_CALLBACK_DESTROY 1 -#define IWCM_F_CONNECT_WAIT 2 +#define IWCM_F_CONNECT_WAIT 1 #endif /* IWCM_H */ ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH] RDMA/iwcm: Teach lockdep about nesting of lock-classes.
I sometimes get this false-positive warning about lock recursion:

  [ INFO: possible recursive locking detected ]
  rdma_bw/3558 is trying to acquire lock:
   (&cq->lock){}, at: [<f9398d36>] c2_free_qp+0x78/0x180 [iw_c2]
  but task is already holding lock:
   (&cq->lock){}, at: [<f9398d29>] c2_free_qp+0x6b/0x180 [iw_c2]

The two locks belong to different CQs but share one lock class, so lockdep reports recursion. The fix is to teach lockdep about this nesting within a lock class.

Patch against 2.6.19-rc5.

Signed-off-by: Krishna Kumar [EMAIL PROTECTED]
---
diff -ruNp org/drivers/infiniband/hw/amso1100/c2_qp.c new/drivers/infiniband/hw/amso1100/c2_qp.c
--- org/drivers/infiniband/hw/amso1100/c2_qp.c	2006-11-15 12:40:04.0 +0530
+++ new/drivers/infiniband/hw/amso1100/c2_qp.c	2006-11-15 13:02:03.0 +0530
@@ -578,7 +578,7 @@ void c2_free_qp(struct c2_dev *c2dev, st
 	 */
 	spin_lock_irq(&send_cq->lock);
 	if (send_cq != recv_cq)
-		spin_lock(&recv_cq->lock);
+		spin_lock_nested(&recv_cq->lock, SINGLE_DEPTH_NESTING);
 
 	c2_free_qpn(c2dev, qp->qpn);
[openib-general] [RFC] [PATCH] RDMA/iwcm: Prevent deadlock in locking.
Since create_qp and destroy_qp can be called from userspace and from other kernel routines, different calls creating different QPs can pass the same two CQs with send_cq and recv_cq swapped (RFC). This can result in a deadlock if the two locks are then taken in opposite orders.

Patch against 2.6.19-rc5.

Signed-off-by: Krishna Kumar [EMAIL PROTECTED]
---
diff -ruNp org/drivers/infiniband/hw/amso1100/c2_qp.c new/drivers/infiniband/hw/amso1100/c2_qp.c
--- org/drivers/infiniband/hw/amso1100/c2_qp.c	2006-11-15 12:40:04.0 +0530
+++ new/drivers/infiniband/hw/amso1100/c2_qp.c	2006-11-16 18:10:03.0 +0530
@@ -564,6 +564,32 @@ int c2_alloc_qp(struct c2_dev *c2dev,
 	return err;
 }
 
+static inline void c2_lock_cqs(struct c2_cq *send_cq, struct c2_cq *recv_cq)
+{
+	if (send_cq == recv_cq)
+		spin_lock_irq(&send_cq->lock);
+	else if (send_cq > recv_cq) {
+		spin_lock_irq(&send_cq->lock);
+		spin_lock_nested(&recv_cq->lock, SINGLE_DEPTH_NESTING);
+	} else {
+		spin_lock_irq(&recv_cq->lock);
+		spin_lock_nested(&send_cq->lock, SINGLE_DEPTH_NESTING);
+	}
+}
+
+static inline void c2_unlock_cqs(struct c2_cq *send_cq, struct c2_cq *recv_cq)
+{
+	if (send_cq == recv_cq)
+		spin_unlock_irq(&send_cq->lock);
+	else if (send_cq > recv_cq) {
+		spin_unlock(&recv_cq->lock);
+		spin_unlock_irq(&send_cq->lock);
+	} else {
+		spin_unlock(&send_cq->lock);
+		spin_unlock_irq(&recv_cq->lock);
+	}
+}
+
 void c2_free_qp(struct c2_dev *c2dev, struct c2_qp *qp)
 {
 	struct c2_cq *send_cq;
@@ -576,15 +602,9 @@ void c2_free_qp(struct c2_dev *c2dev, st
 	 * Lock CQs here, so that CQ polling code can do QP lookup
 	 * without taking a lock.
 	 */
-	spin_lock_irq(&send_cq->lock);
-	if (send_cq != recv_cq)
-		spin_lock_nested(&recv_cq->lock, SINGLE_DEPTH_NESTING);
-
+	c2_lock_cqs(send_cq, recv_cq);
 	c2_free_qpn(c2dev, qp->qpn);
-
-	if (send_cq != recv_cq)
-		spin_unlock(&recv_cq->lock);
-	spin_unlock_irq(&send_cq->lock);
+	c2_unlock_cqs(send_cq, recv_cq);
 
 	/*
 	 * Destory qp in the rnic...
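The ordering idea can be tried outside the kernel. What follows is only a userspace sketch under stated assumptions: `struct demo_cq`, `demo_lock_cqs()` and `demo_unlock_cqs()` are hypothetical names, pthread mutexes stand in for the CQ spinlocks, and "lower address first" is an arbitrary choice. Any fixed order works as long as every caller uses the same one. It does not model interrupt disabling or lockdep annotations.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct c2_cq; only the lock matters here. */
struct demo_cq {
	pthread_mutex_t lock;
};

/*
 * Lock both CQs in one fixed global order (lower address first).
 * Returns the CQ whose lock was taken first, so the order is observable.
 */
static struct demo_cq *demo_lock_cqs(struct demo_cq *send_cq,
				     struct demo_cq *recv_cq)
{
	struct demo_cq *first, *second;

	assert(send_cq && recv_cq);
	if (send_cq == recv_cq) {
		/* Same CQ on both sides: take its lock exactly once. */
		pthread_mutex_lock(&send_cq->lock);
		return send_cq;
	}
	if ((uintptr_t)send_cq < (uintptr_t)recv_cq) {
		first = send_cq;
		second = recv_cq;
	} else {
		first = recv_cq;
		second = send_cq;
	}
	pthread_mutex_lock(&first->lock);
	pthread_mutex_lock(&second->lock);
	return first;
}

/* Release in the reverse of the order used by demo_lock_cqs(). */
static void demo_unlock_cqs(struct demo_cq *send_cq, struct demo_cq *recv_cq)
{
	assert(send_cq && recv_cq);
	if (send_cq == recv_cq) {
		pthread_mutex_unlock(&send_cq->lock);
		return;
	}
	if ((uintptr_t)send_cq < (uintptr_t)recv_cq) {
		pthread_mutex_unlock(&recv_cq->lock);
		pthread_mutex_unlock(&send_cq->lock);
	} else {
		pthread_mutex_unlock(&send_cq->lock);
		pthread_mutex_unlock(&recv_cq->lock);
	}
}
```

Because the order depends only on the two addresses, a caller passing (a, b) and another passing (b, a) take the locks in the same sequence, which removes the lock-order inversion described above.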
[openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()
cm_conn_req_handler() has the following bugs:

1. It sets IWCM_F_CALLBACK_DESTROY on cm_id (the child handle), which doesn't achieve anything: cm_work_handler() checks IWCM_F_CALLBACK_DESTROY in the parent's flags, so that check will never be true.
2. Calling destroy_cm_id() leaks 3 work 'free' list entries.
3. cm_id is freed instead of cm_id_priv. The effect is the same today, since cm_id is the first element of cm_id_priv, but it is still a bug and would break if cm_id ever stopped being the first element.
4. A reject message has to be sent on failure. I tested without the fix and found that the client hangs; I waited about 20 minutes and then pressed Ctrl-C, but the process was unkillable.

All 4 cases above were tested by injecting an error in iw_conn_req_handler(), and they were confirmed. I added the BUG_ON() to confirm the earlier check for refcount == 0.

Patch against 2.6.19-rc5.

Signed-off-by: Krishna Kumar [EMAIL PROTECTED]
---
diff -ruNp org/drivers/infiniband/core/iwcm.c new/drivers/infiniband/core/iwcm.c
--- org/drivers/infiniband/core/iwcm.c	2006-10-09 16:40:04.0 +0530
+++ new/drivers/infiniband/core/iwcm.c	2006-10-09 16:52:03.0 +0530
@@ -648,10 +648,9 @@ static void cm_conn_req_handler(struct i
 	/* Call the client CM handler */
 	ret = cm_id->cm_handler(cm_id, iw_event);
 	if (ret) {
-		set_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
-		destroy_cm_id(cm_id);
-		if (atomic_read(&cm_id_priv->refcount)==0)
-			kfree(cm_id);
+		BUG_ON(atomic_read(&cm_id_priv->refcount) != 1);
+		iw_cm_reject(cm_id, NULL, 0);
+		iw_destroy_cm_id(cm_id);
 	}
 
 	if (iw_event->private_data_len)
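Point 3 can be illustrated in userspace. The sketch below is hypothetical throughout: `demo_id`, `demo_priv_first`, `demo_priv_moved` and `demo_container_of()` are names invented here, not the iwcm structures or the kernel macro. It shows that the embedded pointer and the containing pointer coincide only while the embedded struct is the first member, whereas a container_of-style conversion gives the right address for either layout.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its container. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the public handle (struct iw_cm_id). */
struct demo_id {
	int handle;
};

/* Embedded handle is the first member, as the message describes. */
struct demo_priv_first {
	struct demo_id id;
	int refcount;
};

/* Same fields, but the handle is no longer first. */
struct demo_priv_moved {
	int refcount;
	struct demo_id id;
};

/* Freeing through &priv->id only works while this offset is zero. */
static_assert(offsetof(struct demo_priv_first, id) == 0,
	      "id must be first for the two pointers to coincide");

static struct demo_priv_first *demo_first_from_id(struct demo_id *id)
{
	return demo_container_of(id, struct demo_priv_first, id);
}

static struct demo_priv_moved *demo_moved_from_id(struct demo_id *id)
{
	return demo_container_of(id, struct demo_priv_moved, id);
}
```

With the second layout, passing `&priv->id` to a deallocator would hand it an address in the middle of the allocation, which is why freeing the private structure itself is the safer form.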
Re: [openib-general] [PATCH 09/13] Core WQE/CQE Types
+struct t3_send_wr {
+	struct fw_riwrh wrh;	/* 0 */
+	union t3_wrid wrid;	/* 1 */
+
+	enum t3_rdma_opcode rdmaop:8;
+	u32 reserved:24;	/* 2 */

Does this do the right thing wrt endianness? I'd be more comfortable with something like

	u8 rdmaop;
	u8 reserved[3];

(although the __attribute__((packed)) on enum t3_rdma_opcode does make it OK to use here, I guess)

+	u32 rem_stag;	/* 2 */
+	u32 plen;	/* 3 */
+	u32 num_sgle;
+	struct t3_sge sgl[T3_MAX_SGE];	/* 4+ */
+};
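To make the layout question concrete, here is a small userspace sketch. It assumes nothing about the real T3 structures: `demo_hdr_bitfield`, `demo_send_hdr`, `demo_pack()` and `demo_bitfield_first_byte()` are hypothetical names. The explicit-byte variant has offsets fixed by the declaration itself, while the allocation order of bit-fields within a storage unit is implementation-defined in C, so no particular result is claimed for the bit-field variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit-field variant: where rdmaop lands in memory is up to the ABI. */
struct demo_hdr_bitfield {
	uint32_t rdmaop:8;
	uint32_t reserved:24;
};

/* Explicit-byte variant along the lines suggested above. */
struct demo_send_hdr {
	uint8_t rdmaop;
	uint8_t reserved[3];
};

/* These properties follow from the declaration, not from the ABI. */
static_assert(sizeof(struct demo_send_hdr) == 4, "header is one 32-bit word");
static_assert(offsetof(struct demo_send_hdr, rdmaop) == 0, "opcode is byte 0");

/* Copy the explicit-byte header to a 4-byte buffer as it sits in memory. */
static void demo_pack(const struct demo_send_hdr *h, uint8_t out[4])
{
	memcpy(out, h, sizeof(*h));
}

/*
 * Return the first in-memory byte of the bit-field variant. The value is
 * implementation-defined, so this is for inspection only.
 */
static uint8_t demo_bitfield_first_byte(uint8_t op)
{
	struct demo_hdr_bitfield b = { .rdmaop = op, .reserved = 0 };
	uint8_t raw[sizeof(b)];

	memcpy(raw, &b, sizeof(b));
	return raw[0];
}
```

Printing `demo_bitfield_first_byte()` on each target of interest is one way to check whether a given compiler and ABI place the opcode where the hardware expects it.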