[openib-general] Re: [PATCH] rewrite perftest/README
Quoting r. Grant Grundler [EMAIL PROTECTED]:
> Subject: [PATCH] rewrite perftest/README
>
> Michael,
> Here's a complete rewrite of the README file. It should make it easier
> for people to understand:
>   o how to build
>   o how to run
>   o how to interpret results
>
> I'd still like to add a bit more about the statistical significance of the
> sample size of 1000, but am just refreshing my memory (ok, lame excuse,
> I've forgotten everything from 20 years ago :^) on how to do that.
> Maybe you can craft something based on your experience plus the
> observations below?
>
> Ditching the last two (extreme) readings from the server side of
> the histogram:
>   o standard deviation          86 cycles
>   o arithmetic mean (average)   7135 cycles
>   o median                      7126 cycles
>   o min                         6906 cycles (sorted sample #1)
>   o max                         7490 cycles (sorted sample #997)
>
> (For the record, #998 is 8798 and #999 is 50305; we are clearly
> measuring something else here too.)
>
> (1.5GHz IA64, PCI-X, 2.6.11, forgot which SVN they are running)
>
> The median value is *very* reproducible on this configuration:
> +- 1 cycle consistently over 5 runs of rdma_lat.
>
> thanks,
> grant
>
> Signed-off-by: Grant Grundler [EMAIL PROTECTED]

Thanks, checked in.

-- 
MST - Michael S. Tsirkin
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
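The numbers Grant quotes (min, max, median and mean after discarding the two largest readings) can be reproduced with a few lines of C. The following is only a sketch under stated assumptions: `struct lat_summary` and `summarize_latency()` are hypothetical names, not part of perftest, and standard deviation is left out to keep the example free of libm.

```c
#include <stddef.h>
#include <stdlib.h>

/* Summary of a sorted, trimmed sample set (all values in CPU cycles). */
struct lat_summary {
	unsigned long min;
	unsigned long max;
	unsigned long median;
	double mean;
};

/* qsort() comparator for unsigned long, written to avoid overflow. */
static int cmp_ulong(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/*
 * Sort the samples in place, ignore the `drop_high` largest readings and
 * summarize the rest. Returns 0 on success, -1 if nothing would be left.
 */
static int summarize_latency(unsigned long *samples, size_t n,
			     size_t drop_high, struct lat_summary *out)
{
	size_t kept, i;
	double sum = 0.0;

	if (n == 0 || drop_high >= n)
		return -1;

	qsort(samples, n, sizeof samples[0], cmp_ulong);
	kept = n - drop_high;

	for (i = 0; i < kept; i++)
		sum += (double)samples[i];

	out->min = samples[0];
	out->max = samples[kept - 1];
	/* Middle element; for an even count this is the upper middle value. */
	out->median = samples[kept / 2];
	out->mean = sum / (double)kept;
	return 0;
}
```

Calling `summarize_latency(samples, 1000, 2, &s)` would correspond to "ditching the last two readings" of a 1000-sample run; whether warmup samples should also be excluded is a separate decision.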
[openib-general] Re: diff-perftest-07 replace pp_get_local_lid()
Quoting r. Grant Grundler [EMAIL PROTECTED]:
> Subject: diff-perftest-07 replace pp_get_local_lid()
>
> Hi Michael,
> Following patch to rdma_lat.c:
>   o replaces pp_get_local_lid with code from ibv_pingpong.
>     This calls into libibverbs instead of fishing around in /sys FS.
>   o makes two minor white space fix-ups.
>
> Signed-off-by: Grant Grundler [EMAIL PROTECTED]
>
> I'd like to slowly restructure main() into multiple distinct parts:
>   1) parameter parsing/setting
>   2) global data init (e.g. srand())
>   3) setup connection
>   4) negotiate test+parameters with server/client
>   5) run test (maybe several iterations with different params)
>   6) exit/cleanup
>
> I'm thinking about how to keep the server running and iterating.
> The goal is to be able to run a sequence of tests just from the client side.
>
> Or is this a waste of time?
> Should I rather be looking at fixing up netperf to support IB?
>
> thanks,
> grant

thanks, checked in.

-- 
MST - Michael S. Tsirkin
[openib-general] Re: [RFC] [PATCH] user_mad: Support RMPP on send side
Quoting r. Roland Dreier [EMAIL PROTECTED]:
> Subject: Re: [RFC] [PATCH] user_mad: Support RMPP on send side
>
> This looks OK to check in with one small comment on the following:
>
>  -	if (copy_to_user(buf, &packet->mad, sizeof packet->mad))
>  +	if (copy_to_user(buf, &packet->mad,
>  +			 min(count, packet->length +
>  +			     sizeof (struct ib_user_mad))))
>   		ret = -EFAULT;
>   	else
>  -		ret = sizeof packet->mad;
>  +		ret = count;
>
> This code will truncate a received MAD that is bigger than the buffer
> passed into read(), but return the full size of the packet. I don't
> think read() is allowed to do this: the return value can be at most
> the count value passed in by the user.
>
> I think we have two options: truncate and return the actual amount of
> data read to the user, or return an error if the user's buffer is too
> small.
>
>  - R.

If you truncate, how will the user know the MAD was truncated?

-- 
MST - Michael S. Tsirkin
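The two options Roland lists can be compared in a small userspace model. This is a sketch, not the user_mad code: `model_read()` is a hypothetical helper, `memcpy()` stands in for copy_to_user(), and the choice of `-ENOSPC` for the "buffer too small" case is an assumption made for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of a read()-style handler. `pkt`/`pkt_len` stand in for
 * the queued MAD. Returns the number of bytes copied, or a negative errno.
 * If `reject_short` is nonzero, a too-small buffer is an error; otherwise
 * the data is truncated and the smaller, actually copied length is returned.
 */
static long model_read(void *buf, size_t count,
		       const void *pkt, size_t pkt_len, int reject_short)
{
	size_t n = pkt_len;

	if (count < pkt_len) {
		if (reject_short)
			return -ENOSPC;	/* assumed error code */
		n = count;		/* truncate: never report more than count */
	}
	memcpy(buf, pkt, n);
	return (long)n;
}
```

In both variants the return value never exceeds `count`. With truncation the caller cannot tell from the return value alone that data was lost, which is exactly the concern raised in the reply above.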
[openib-general] registering read-only memory
Roland, the following code snippet:

	const char foo[] = "Michael Tsirkin";
	ibv_reg_mr(ctx->pd, foo, strlen(foo), 0);

exposes two problems with ibv_reg_mr:

1. Compiling this code I get a warning:

	warning: passing arg 2 of `ibv_reg_mr' discards qualifiers from pointer target type

   The same happens if foo is declared volatile. I suggest changing ibv_reg_mr to accept const volatile void * as a second parameter. const is OK since ibv_reg_mr never actually writes to the buffer. volatile is needed by some applications to prevent the compiler from assuming it can re-order accesses to this buffer. Patch attached (below).

2. ibv_reg_mr fails. Why is that?

System details:

	gcc --version
	gcc (GCC) 3.3.3 (SuSE Linux)
	uname -a
	Linux swlab156 2.6.11-openib #29 SMP Mon Apr 18 16:17:51 IDT 2005 x86_64 x86_64 x86_64 GNU/Linux

Make ibv_reg_mr accept buffer as volatile const *.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Index: libibverbs/include/infiniband/verbs.h
===================================================================
--- libibverbs/include/infiniband/verbs.h	(revision 2408)
+++ libibverbs/include/infiniband/verbs.h	(working copy)
@@ -499,7 +499,7 @@ extern int ibv_dealloc_pd(struct ibv_pd
 /**
  * ibv_reg_mr - Register a memory region
  */
-extern struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr,
+extern struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, volatile const void *addr,
 				 size_t length, enum ibv_access_flags access);
 
 /**
Index: libibverbs/src/verbs.c
===================================================================
--- libibverbs/src/verbs.c	(revision 2408)
+++ libibverbs/src/verbs.c	(working copy)
@@ -64,7 +64,7 @@ int ibv_dealloc_pd(struct ibv_pd *pd)
 	return pd->context->ops.dealloc_pd(pd);
 }
 
-struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr,
+struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, volatile const void *addr,
 			  size_t length, enum ibv_access_flags access)
 {
 	struct ibv_mr *mr;
Index: libmthca/src/mthca.h
===================================================================
--- libmthca/src/mthca.h	(revision 2408)
+++ libmthca/src/mthca.h	(working copy)
@@ -260,7 +260,7 @@ extern int mthca_query_port(struct ibv_c
 extern struct ibv_pd *mthca_alloc_pd(struct ibv_context *context);
 extern int mthca_free_pd(struct ibv_pd *pd);
 
-extern struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, void *addr,
+extern struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, volatile const void *addr,
 				   size_t length, enum ibv_access_flags access);
 extern int mthca_dereg_mr(struct ibv_mr *mr);
 
Index: libmthca/src/verbs.c
===================================================================
--- libmthca/src/verbs.c	(revision 2408)
+++ libmthca/src/verbs.c	(working copy)
@@ -93,7 +93,8 @@ int mthca_free_pd(struct ibv_pd *pd)
 	return 0;
 }
 
-static struct ibv_mr *__mthca_reg_mr(struct ibv_pd *pd, void *addr,
+static struct ibv_mr *__mthca_reg_mr(struct ibv_pd *pd,
+				     volatile const void *addr,
 				     size_t length, uint64_t hca_va,
 				     enum ibv_access_flags access)
 {
@@ -113,7 +114,7 @@ static struct ibv_mr *__mthca_reg_mr(str
 	return mr;
 }
 
-struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, void *addr,
+struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, volatile const void *addr,
 			    size_t length, enum ibv_access_flags access)
 {
 	return __mthca_reg_mr(pd, addr, length, (uintptr_t) addr, access);

-- 
MST - Michael S. Tsirkin
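The qualifier argument in point 1 can be checked without libibverbs. In the sketch below `fake_reg_mr()` is a hypothetical stand-in that only mirrors the proposed parameter type; it does not register anything and says nothing about why the real ibv_reg_mr call fails (point 2).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registration call; it only inspects the range. */
static size_t fake_reg_mr(const volatile void *addr, size_t length)
{
	return addr ? length : 0;
}

static size_t register_examples(void)
{
	static const char ro[] = "Michael Tsirkin";
	static volatile char vol[16];
	static char plain[16];

	/*
	 * A pointer to const, to volatile, or to an unqualified object
	 * converts implicitly to `const volatile void *`, so none of these
	 * calls should trigger a "discards qualifiers" warning.
	 */
	return fake_reg_mr(ro, strlen(ro))
	     + fake_reg_mr(vol, sizeof vol)
	     + fake_reg_mr(plain, sizeof plain);
}
```

With a plain `void *addr` parameter, the first two calls would need a cast, which is the inconvenience the patch is meant to remove.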
RE: [openib-general] CM private data
Caitlin> I believe that an IT-API it_ep_accept() supplies private data
Caitlin> that it expects to be delivered to its peer when the three-way
Caitlin> handshake option is selected.

Both IT-API it_ep_accept() and DAT dat_cr_accept() cause the provider at the passive side to send a --REP--, so how is --RTU-- related to the _accept calls?

I think you meant to say that exposing a two-way handshake means that the consumer cannot supply private data for the RTU, as he is not aware of it (the RTU) being sent?

Or.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Caitlin Bestler
Sent: Thursday, May 19, 2005 2:16 AM
To: Sean Hefty
Cc: openib-general
Subject: Re: [openib-general] CM private data

I believe that an IT-API it_ep_accept() supplies private data that it expects to be delivered to its peer when the three-way handshake option is selected.

DAT only exposes a two-way handshake, so there it never requires private data on the RTU.

I don't know if any IT-API applications actually require the three-way handshake.

On 5/18/05, Sean Hefty [EMAIL PROTECTED] wrote:
> Do any applications make use of the private data in these CM messages:
> RTU, MRA, or DREP? I'm doubtful of the RTU or DREP, but not as sure of
> the MRA.
>
> Since no replies are generated in response to these messages, the CM does
> not keep them after their sends complete. However, it may need to resend
> the messages. For example, it will resend the RTU if a duplicate REP is
> received. If no applications are using the private data, I will not worry
> about storing the private data for the retransmissions at this time.
>
> - Sean
Re: [openib-general] CM private data
With a two-way handshake only the passive side accepts the connection request (it_ep_accept() or dat_cr_accept()). IT-API defines an optional *three-way* handshake where the active side must *also* call it_ep_accept() before the final connection establishment can proceed.

This does not map to iWARP/MPA in any standard way, so rather than defining a protocol to wrap the first handshake private data, DAT decided to stick with only two-way handshaking. There has been no screaming, so we can presume that *most* applications find two-way handshaking acceptable.

However, an IT-API provider is free to say that it supports three-way handshaking, and I believe it is implied (if not required) that InfiniBand providers do so. What I do not know is if any actual applications have ever made use of this capability, or if it is only a defensive encoding of a protocol option into an API that prefers to avoid removing options that are available at the wire level. DAT emphasized providing a clean API for transport neutral services, and so simply said use two-way handshaking.

In any event, if no support is going to be provided for private data on the second it_ep_accept at the verb layer then that should be explicitly documented, and I'd suggest sending a 'heads up' to the IT-API authors.

On 5/19/05, Or Gerlitz [EMAIL PROTECTED] wrote:
> Caitlin> I believe that an IT-API it_ep_accept() supplies private data
> Caitlin> that it expects to be delivered to its peer when the three-way
> Caitlin> handshake option is selected.
>
> Both IT-API it_ep_accept() and DAT dat_cr_accept() cause the provider at
> the passive side to send a --REP--, so how is --RTU-- related to the
> _accept calls?
>
> I think you meant to say that exposing a two-way handshake means that the
> consumer cannot supply private data for the RTU, as he is not aware of it
> (the RTU) being sent?
>
> Or.
[openib-general] [PATCH] management: resource leak fixes
Management libraries leak resources (memory, file/directory handles). Also a trailing whitespace fix in one place in libibcommon.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Index: libibumad/src/umad.c
===================================================================
--- libibumad/src/umad.c	(revision 2411)
+++ libibumad/src/umad.c	(working copy)
@@ -357,8 +357,10 @@ get_ca(char *ca_name, umad_ca_t *ca)
 	if (!(dir = opendir(dir_name)))
 		return -ENOENT;
 
-	if ((r = scandir(dir_name, &namelist, 0, alphasort)) < 0)
-		return -EIO;
+	if ((r = scandir(dir_name, &namelist, 0, alphasort)) < 0) {
+		ret = errno < 0 ? errno : -EIO;
+		goto error;
+	}
 
 	ret = 0;
 	ca->numports = 0;
@@ -388,6 +390,7 @@ get_ca(char *ca_name, umad_ca_t *ca)
 		free(namelist[i]);
 	free(namelist);
 
+	closedir(dir);
 	put_ca(ca);
 	return 0;
 
@@ -395,7 +398,8 @@ clean:
 	for (i = 0; i < r; i++)
 		free(namelist[i]);
 	free(namelist);
-
+error:
+	closedir(dir);
 	release_ca(ca);
 
 	return ret;
Index: libibcommon/src/sysfs.c
===================================================================
--- libibcommon/src/sysfs.c	(revision 2411)
+++ libibcommon/src/sysfs.c	(working copy)
@@ -56,6 +56,7 @@
 #include <sys/poll.h>
 #include <syslog.h>
 #include <netinet/in.h>
+#include <errno.h>
 
 #include <common.h>
 
@@ -86,14 +87,20 @@ sys_read_string(char *dir_name, char *fi
 	if ((fd = open(path, O_RDONLY)) < 0)
 		return ret_code();
 
-	if ((r = read(fd, str, max_len)) < 0)
+	if ((r = read(fd, str, max_len)) < 0) {
+		int e = errno;
+		close(fd);
+		errno = e;
 		return ret_code();
+	}
 
 	str[(r < max_len) ? r : max_len - 1] = 0;
 
 	if ((s = strrchr(str, '\n')))
 		*s = 0;
-	return 0;
+
+	close(fd);
+	return 0;
 }
 
 int

-- 
MST - Michael S. Tsirkin
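The pattern the sysfs.c hunk applies (close the descriptor on every exit path and keep errno intact across close()) can be shown in a standalone form. This is a sketch, not the libibcommon function: `read_sysfs_string()` is a hypothetical name, it returns `-errno` directly instead of going through ret_code(), and it reads at most `max_len - 1` bytes so that the terminator always fits.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to max_len - 1 bytes from `path` into `str`, NUL-terminate the
 * result and strip a trailing newline. Returns 0 on success or -errno on
 * failure. The descriptor is closed on every path.
 */
static int read_sysfs_string(const char *path, char *str, size_t max_len)
{
	ssize_t r;
	char *s;
	int fd, saved;

	if (max_len == 0)
		return -EINVAL;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	r = read(fd, str, max_len - 1);
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	if (r < 0)
		return -saved;

	str[r] = '\0';
	if ((s = strrchr(str, '\n')))
		*s = '\0';
	return 0;
}
```

Closing immediately after read() leaves a single close() call instead of one per exit path, which is one way to make this kind of leak harder to reintroduce.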
[openib-general] 2.6.12 question
Hello Is there a branch for 2.6.12? We have noticed that we get different performance when built on 2.6.11-5 vs 2.6.12-rc4. thanks
[openib-general] Re: 2.6.12 question
Quoting r. Parks Fields [EMAIL PROTECTED]:
> Subject: 2.6.12 question
>
> Hello
> Is there a branch for 2.6.12? We have noticed that we get different
> performance when built on 2.6.11-5 vs 2.6.12-rc4.
>
> thanks

1. Could you elaborate please?
2. I suggest you make sure the kernels are built with all the same options, especially regarding security, networking and cpu type options.

-- 
MST - Michael S. Tsirkin
[openib-general] Re: [PATCH] management: resource leak fixes
On Thu, 2005-05-19 at 09:24, Michael S. Tsirkin wrote:
> Management libraries leak resources (memory, file/directory handles).
> Also a trailing whitespace fix in one place in libibcommon.
>
> Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Thanks. Applied.

-- Hal
[openib-general] Re: 2.6.12 question
> 2. I suggest you make sure the kernels are built with all the same
> options, especially regarding security, networking and cpu type options.

This is on the same box, but I have to check the .config files.
Re: [openib-general] 2.6.12 question
    Parks> Is there a branch for 2.6.12? We have noticed that we get
    Parks> different performance when built on 2.6.11-5 vs 2.6.12-rc4.

Obviously the drivers/infiniband already in 2.6.12-rc4 should work fine. Current svn should also work -- I think the only change required is for SDP, and Tom Duffy posted a patch.

 - R.
[openib-general] Re: registering read-only memory
    Michael> 2. ibv_reg_mr fails. Why is that?

How does it fail?

 - R.
Re: [openib-general] CM private data
Caitlin Bestler wrote:
> In any event, if no support is going to be provided for private data
> on the second it_ep_accept at the verb layer then that should
> be explicitly documented, and I'd suggest sending a 'heads up'
> to the IT-API authors.

To clarify, I was only trying to determine when to implement this, not if. Based on the feedback, I will try to fix this as part of my next set of changes to the CM.

Thanks,
Sean
[openib-general] Re: sdp_link.c info_list locking question
On Thu, May 19, 2005 at 11:12:38AM +0300, Michael S. Tsirkin wrote:
> Libor, I'm looking at sdp_link.c, and I don't see any lock protecting
> the info_list linked list.
>
> What prevents sdp_path_info_lookup from being called while
> sdp_path_info_create or sdp_path_info_destroy is in progress?

Michael, you are correct that the locking is missing; there's actually a line about it in the TODO file. When I moved the sdp code to gen2 it was easier to rewrite sdp_link because of all the other junk that was in the old version, but to get something done quickly I left out locking. I figured eventually the IP to PathRecord service would be broken into a separate module, like the old ip2pr, since there is the possibility for multiple consumers. However, that has yet to happen, so if someone wants to add the locking, feel free.

-Libor
[openib-general] Re: sdp_link.c info_list locking question
Quoting r. Libor Michalek [EMAIL PROTECTED]:
> Subject: Re: sdp_link.c info_list locking question
>
> On Thu, May 19, 2005 at 11:12:38AM +0300, Michael S. Tsirkin wrote:
> > Libor, I'm looking at sdp_link.c, and I don't see any lock protecting
> > the info_list linked list.
> >
> > What prevents sdp_path_info_lookup from being called while
> > sdp_path_info_create or sdp_path_info_destroy is in progress?
>
> Michael, you are correct that the locking is missing; there's actually
> a line about it in the TODO file. When I moved the sdp code to gen2 it
> was easier to rewrite sdp_link because of all the other junk that was
> in the old version, but to get something done quickly I left out locking.
> I figured eventually the IP to PathRecord service would be broken into
> a separate module, like the old ip2pr, since there is the possibility for
> multiple consumers. However, that has yet to happen, so if someone wants
> to add the locking, feel free.
>
> -Libor

OK. I might look at this next week. First step would be to switch to standard list macros, though.

-- 
MST - Michael S. Tsirkin
RE: [openib-general] CM private data
> From: Caitlin Bestler [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 19, 2005 9:25 AM
>
> So being predictably unreliable for one implementation stage
> is certainly something you can get away with. Even when you
> add support it might be quite acceptable to send the private
> data *only* on the first try, or to require the IT-API layer
> to do the retries.

Only sending the user's private data on the first try requires the user to support connection establishment with:
- no private data (no RTU received)
- zero private data (RTU retry)
- remote peer's private data (first RTU)

The receipt of private data does not mean that private data is actually what the remote side sent, and also requires users to never use all zero private data since that would make the second and third case above indistinguishable. So if the CM is going to expose the private data, it needs to put that private data in all retries.

I'm still for hiding the RTU private data. I think it's useless because it's unreliable - anything exchanged via private data in the RTU must also be exchanged by other means in case the connection is established before the RTU is received. Any ULPs that depend on the RTU private data are setting themselves up for potential failures.

Just my opinion, though...

- Fab
Re: [openib-general] CM private data
Fab Tillier wrote:
> I'm still for hiding the RTU private data. I think it's useless because
> it's unreliable - anything exchanged via private data in the RTU must also
> be exchanged by other means in case the connection is established before
> the RTU is received. Any ULPs that depend on the RTU private data are
> setting themselves up for potential failures.

This depends on the implementation. If the server side of a connection initiates the data transfer, it cannot do so until an RTU is received.

- Sean
Re: [openib-general] How about ib_send_page() ?
<snip...>
> The most interesting optimization available is implementing the IPoIB
> connected mode draft, although I don't think it's as easy as Vivek
> indicated -- for example, I'm not sure how to deal with having
> different MTUs depending on the destination.
<snip...>

The draft does allow for a negotiation per connection for the implementations that wish to take advantage of it. However, an implementation can by default choose to use a 'connected-mode MTU', e.g. 32K, always. It can then, for every connection, choose to negotiate to this value and, if that is not workable, fall back to the UD mode and deny the connected mode.

The ARP entries hold the connected mode flags, thereby keeping track of the mode to use per destination.

I'd be more than happy to discuss other implementation issues. As I noted earlier, it will also help refine the draft.

Vivek

P.S. cc replies to [EMAIL PROTECTED]
[openib-general] Re: user_mad::ib_umad_read question
Quoting r. Hal Rosenstock [EMAIL PROTECTED]:
> Subject: user_mad::ib_umad_read question
>
> Hi,
>
> In ib_umad_read, there is currently (or soon to be something like) the
> following:
>
> 	...
> 	packet = list_entry(file->recv_list.next, struct ib_umad_packet, list);
> 	list_del(&packet->list);
> 	spin_unlock_irq(&file->recv_lock);
>
> 	if (copy_to_user(buf, &packet->mad,
> 			 min(count, packet->length +
> 			     sizeof (struct ib_user_mad))))
> 		ret = -EFAULT;
> 	else
> 		ret = count;
>
> 	kfree(packet);
> 	return ret;
>
> Should the packet be thrown away because copy_to_user() fails? Shouldn't
> it be placed back at the head of the list? Unfortunately, that would
> mean holding the recv lock longer (through the duration of
> copy_to_user).
>
> -- Hal

copy_to_user might sleep, so you can't call it under a spinlock. Since the user is only hurting himself by passing an illegal address, I'd think it doesn't hurt to drop the mad.

-- 
MST - Michael S. Tsirkin
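The ordering discussed here (unlink the packet while holding the lock, then do the potentially slow copy with the lock released) can be modelled in userspace. This is a sketch under stated assumptions: a C11 `atomic_flag` spin loop stands in for the kernel spinlock, `memcpy()` stands in for copy_to_user(), and `struct packet`, `struct recv_queue` and `q_read()` are hypothetical names rather than the user_mad structures.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

struct packet {
	struct packet *next;
	size_t length;
	char data[64];
};

struct recv_queue {
	atomic_flag lock;	/* toy spinlock; stands in for recv_lock */
	struct packet *head;
};

static void q_lock(struct recv_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void q_unlock(struct recv_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/*
 * Pop the first packet while holding the lock, then copy it out after the
 * lock is released. Returns bytes copied, or -1 if the queue is empty.
 * The packet stays owned by the caller; the kernel code frees it instead.
 */
static long q_read(struct recv_queue *q, void *buf, size_t count)
{
	struct packet *p;
	size_t n;

	q_lock(q);
	p = q->head;
	if (p)
		q->head = p->next;
	q_unlock(q);

	if (!p)
		return -1;

	/* The potentially slow copy runs with the lock dropped. */
	n = count < p->length ? count : p->length;
	memcpy(buf, p->data, n);
	return (long)n;
}
```

Because the packet is already off the list when the copy runs, putting it back after a failed copy would mean taking the lock a second time, not holding it across the copy.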
Re: [openib-general] CM private data
On Thu, May 19, 2005 at 09:34:16AM -0700, Fab Tillier wrote:
> > From: Caitlin Bestler [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, May 19, 2005 9:25 AM
> >
> > So being predictably unreliable for one implementation stage
> > is certainly something you can get away with. Even when you
> > add support it might be quite acceptable to send the private
> > data *only* on the first try, or to require the IT-API layer
> > to do the retries.
>
> I'm still for hiding the RTU private data. I think it's useless because
> it's unreliable - anything exchanged via private data in the RTU must also
> be exchanged by other means in case the connection is established before
> the RTU is received. Any ULPs that depend on the RTU private data are
> setting themselves up for potential failures.

I agree, for exactly the reason you give: I can't think of a legitimate use for RTU private data. I'd get rid of it entirely, from the code as well as the spec, which is why I think it would be a waste of someone's time to add the correct support for it.

-Libor
Re: [openib-general] How about ib_send_page() ?
Hi Vivek,

On Thu, 2005-05-19 at 12:41, Vivek Kashyap wrote:
> <snip...>
> > The most interesting optimization available is implementing the IPoIB
> > connected mode draft, although I don't think it's as easy as Vivek
> > indicated -- for example, I'm not sure how to deal with having
> > different MTUs depending on the destination.
> <snip...>
>
> The draft does allow for a negotiation per connection for the
> implementations that wish to take advantage of it. However, an
> implementation can by default choose to use a 'connected-mode MTU',
> e.g. 32K, always. It can then, for every connection, choose to negotiate
> to this value and, if that is not workable, fall back to the UD mode and
> deny the connected mode.
>
> The ARP entries hold the connected mode flags, thereby keeping track of
> the mode to use per destination.

Sounds like there should be an agreement on a default connected mode MTU, or else this will drop down to UD.

I have a couple of clarification questions on 5.1 Per-Connection MTU:

1. I presume the Receive MTU is in the first 2 bytes of the private data in the CM messages. Is that correct?

2. Also, CM REQ is mentioned for the requester receive MTU. Wouldn't CM REP carry the granted receive MTU, which is constrained to be the requested MTU or less? So 2 things on this: the I-D says "The private data field MUST carry the receive MTU." Does that include RTUs?

Thanks.

-- Hal
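The negotiation being asked about can be sketched as follows. Everything specific in this example is an assumption taken from Hal's questions rather than from the draft itself: the 2-byte big-endian field at the start of the private data, the "granted is at most requested" rule, and the idea that a result below some minimum means staying on UD. All function names are hypothetical.

```c
#include <stdint.h>

/* Store a 16-bit receive MTU in the first two bytes, big-endian (assumed layout). */
static void put_rx_mtu(uint8_t *priv, uint16_t mtu)
{
	priv[0] = (uint8_t)(mtu >> 8);
	priv[1] = (uint8_t)(mtu & 0xff);
}

/* Read the receive MTU back from the first two private data bytes. */
static uint16_t get_rx_mtu(const uint8_t *priv)
{
	return (uint16_t)((priv[0] << 8) | priv[1]);
}

/*
 * Grant the smaller of the requested and the locally supported MTU.
 * Returns 0 to signal "refuse connected mode, stay on UD" when the result
 * would fall below the minimum this side is willing to use.
 */
static uint16_t grant_mtu(uint16_t requested, uint16_t local_max,
			  uint16_t min_acceptable)
{
	uint16_t granted = requested < local_max ? requested : local_max;

	return granted >= min_acceptable ? granted : 0;
}
```

Under these assumptions a requester would call `put_rx_mtu()` on the REQ private data, and the responder would answer with `grant_mtu(get_rx_mtu(req_priv), local_max, min_acceptable)` in the REP.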
[openib-general] [PATCH] remove redundant check in mthca_provider.c
Remove redundant check of pd->uobject in mthca_provider.c

Signed-off-by: James Lentini [EMAIL PROTECTED]

Index: infiniband/hw/mthca/mthca_provider.c
===================================================================
--- infiniband/hw/mthca/mthca_provider.c	(revision 2404)
+++ infiniband/hw/mthca/mthca_provider.c	(working copy)
@@ -478,9 +478,7 @@
 				kfree(qp);
 				return ERR_PTR(err);
 			}
-		}
 
-		if (pd->uobject) {
 			qp->mr.ibmr.lkey = ucmd.lkey;
 			qp->sq.db_index  = ucmd.sq_db_index;
 			qp->rq.db_index  = ucmd.rq_db_index;
[openib-general] Re: diff-perftest-07 replace pp_get_local_lid()
On Thu, May 19, 2005 at 08:51:26AM +0300, Michael S. Tsirkin wrote:
> Quoting r. Michael S. Tsirkin [EMAIL PROTECTED]:
> > Subject: Re: diff-perftest-07 replace pp_get_local_lid()
> >
> > > Should I rather be looking at fixing up netperf to support IB?
> > >
> > > thanks,
> > > grant
> >
> > That may be kind of hard, given that uverbs API is completely different
> > from socket API.
>
> Wait, isn't this what SDP is doing?

SDP is a thin, kernel translation layer from IP sockets to verbs, right?

The idea I'm chasing is for netperf to use uverbs directly like rdma_lat does.

grant
Re: [openib-general] How about ib_send_page() ?
    Vivek> The draft does allow for a negotiation per connection for
    Vivek> the implementations that wish to take advantage of
    Vivek> it. However, an implementation can by default choose to use
    Vivek> a 'connected-mode MTU' e.g. 32K always. It can then, for
    Vivek> every connection choose to, negotiate to this value and if
    Vivek> it is not workable fall back to the UD mode and deny the
    Vivek> connection mode. The ARP entries hold the connected mode
    Vivek> flags thereby keeping track of the mode to use per
    Vivek> destination.

But this means that the MTU of the link will be different for UD destinations (including multicast) and RC destinations, right? Or am I missing something?

 - R.
Re: [openib-general] [PATCH] rewrite perftest/README
On Thu, May 19, 2005 at 06:32:32AM +0100, Paul Baxter wrote:
> Grant, Michael
>
> Great work!
>
> I wanted to point out a recent thread on comp.arch discussing the merits of
> using the geometric mean instead of the arithmetic mean as a basis for
> sampling the timing of a population.

Paul, thanks for pointing out the thread - I'm learning a lot! :^)

> My statistics is a bit rusty, but calculating and presenting GM might be
> useful as well?

After reading the original posting, I think the answer is NO. Particularly, two comments in John Mashey's original posting:

| The GM is the correct mean for combining benchmark results intended
| to be a sample from a larger population of programs and intended to
| predict the performance distribution of other benchmarks, which after
| all, is what people want for generalized performance comparisons.
| This is true if the population follows a *lognormal* [described later]
| distribution ... and it turns out, many do so.

rdma_lat is one benchmark - not a combination of benchmark results. If perftest ends up with more than two benchmarks, we can try to look at a weighted geometric mean to get one number from running the whole mess. But TBH, I have no interest or need to do that. Sounds like statistical wanking off to me. I'm much more interested in comparing differences of individual rdma_lat runs and sorting out why the results are different.

| If you know your workload, your benchmarks *are* the population,
| and you use algebra.

We know pretty well what we are measuring with rdma_lat. Well, Michael does. I only have a dangerously vague understanding. :^)

I know we have two sets of data: warmup and runtime. The first two or three measurements are related to warmup. John Mashey's posting suggested keeping separate groups of data apart, i.e. it was probably correct to not include warmup values in the std deviation and arithmetic mean (avg).

Conclusion: The median value still looks like the right thing to report.

http://groups.google.co.uk/group/comp.arch/browse_thread/thread/3f5a9ed1d79ed726/416e58b5e48c1715?q=geometric&rnum=6&hl=en#416e58b5e48c1715
[comp.arch, 'SPEC use of Geometric Mean', May 13th, 2005, John Mashey]

thanks again,
grant
Re: [openib-general] How about ib_send_page() ?
On Wed, May 18, 2005 at 08:00:15PM -0700, Felix Marti wrote:
> Hi Roland,
> define SMP :)

Anytime a CPU is cache coherent with another CPU.

> at these rates, system architecture comes into place,

Definitely. The architecture puts boundaries on how coherency can be implemented...and thus available memory bus bandwidth.

grant
RE: [openib-general] CM private data
All three of these have mechanisms whereby the message can be skipped, in which case applications should not depend on the private data (IB spec mentions that apps should not depend on them). For example, an inbound receive while in RTR state after having sent a REP can be treated as an RTU, in which case later arrival of the RTU is to be discarded by the CM.

Todd Rimmer

-----Original Message-----
From: Tillier, Fabian
Sent: Wednesday, May 18, 2005 3:28 PM
To: 'Sean Hefty'; openib-general
Subject: RE: [openib-general] CM private data

> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, May 18, 2005 12:23 PM
>
> Do any applications make use of the private data in these CM messages: RTU, MRA, or DREP? I'm doubtful of the RTU or DREP, but not as sure of the MRA. Since no replies are generated in response to these messages, the CM does not keep them after their sends complete. However, it may need to resend the messages. For example, it will resend the RTU if a duplicate REP is received. If no applications are using the private data, I will not worry about storing the private data for the retransmissions at this time.

If you're not going to store them, I would suggest removing the private data for the associated calls (RTU, MRA, DREP). That way it becomes very clear to applications wishing to use the private data that they can't.

- Fab
[openib-general] Infinipath support in OpenIB
I read with interest the PR blurb at http://supercomputingonline.com/article.php?sid=8740 regarding InfiniPath's very low latencies and high throughput for MPI even at very modest message sizes. The article states: 'InfiniPath will also support the OpenIB software stack providing full InfiniBand compliance.' Is this something the OpenIB software developers are working on and something that can be commented on in public, or should I speculate/ask Pathscale?

Paul
[openib-general] OOPS: ib_mad crashery on bootup
This is from latest bits, r2414. Bringing up interface ib0: stack segment: [1] SMP CPU 0 Modules linked in: ext3 jbd dm_mod video container button battery ac ohci_hcd tpm_nsc tpm i2c_amd756 i2c_core ib_mthca ib_ipoib ib_sa ib_mad ib_core tg3 floppy xfs exportfs mptscsih mptbase sd_mod scsi_mod Pid: 1253, comm: ib_mad1 Not tainted 2.6.12-rc4openib RIP: 0010:[880f633f] 880f633f{:ib_mad:ib_mad_send_done_handler+31} RSP: 0018:81007ed71d48 EFLAGS: 00010296 RAX: 01001508 RBX: 81007f346298 RCX: RDX: c2052001 RSI: 81007ed71dc8 RDI: 81007f3461c0 RBP: 01001500 R08: 81007ed7 R09: 81003fa69070 R10: 81003fc4d258 R11: 0008 R12: 81003f33c4b0 R13: 81003f33c4b0 R14: 81007ed71dc8 R15: 880f89a0 FS: 2aad4e00() GS:80489000() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 003a2428ebf0 CR3: 00101000 CR4: 06e0 Process ib_mad1 (pid: 1253, threadinfo 81007ed7, task 81003fa69070) Stack: 8001 81003fa69070 81007f346298 81007f3462a0 81003fd8c778 81007f3461c0 81007ed71dc8 880f89a0 880f8f1b Call Trace:880f89a0{:ib_mad:ib_mad_completion_handler+0} 880f8f1b{:ib_mad:ib_mad_completion_handler+1403} 8016686b{cache_free_debugcheck+715} 80130343{__wake_up+67} 880f89a0{:ib_mad:ib_mad_completion_handler+0} 80148a6c{worker_thread+476} 80132260{default_wake_function+0} 8014d0b0{keventd_create_kthread+0} 80148890{worker_thread+0} 8014d0b0{keventd_create_kthread+0} 8014d329{kthread+217} 80133770{schedule_tail+64} 8010f57f{child_rip+8} 8014d0b0{keventd_create_kthread+0} 8014d250{kthread+0} 8010f577{child_rip+0} Code: 4c 8b 7d 20 48 89 04 24 48 89 ef 31 db e8 bf 3c 25 f8 49 8b RIP 880f633f{:ib_mad:ib_mad_send_done_handler+31} RSP 81007ed71d48 Error, some other host already uses address 192.168.0.233. -- I wish we lived in the America of yesteryear that only exists in the minds of us Republicans. 
-- Ned Flanders
[openib-general] kdapltest regression? failing now...
I am not sure when this started, but after updating to top of trunk*, I can no longer get kdapltest to work properly. Both ipoib and sdp are working.

Both server and client are returning an error: DAT_INVALID_HANDLE. This is coming from ib_create_qp(). With debugging turned on:

[EMAIL PROTECTED] ~]# ./kdapltest -T S -D mthca0a -d
kDAPL: dapl_ia_open (mthca0a, 8, 81000b806308, 81000b8062d8)
kDAPL: dapl_ia_open () returns 0x0
kDAPL: dapl_pz_create (81001ba165c8, 81000b8062e0)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x20, 81000b8062e8)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0xa0, 81000b8062f0)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x10, 81000b806300)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x40, 81000b8062f8)
kDAPL: dapl_ep_create (81001ba165c8, 81001b9442c8, 81001ba164b0, 81001ba166e0, 81001ba22050, , 81000b806318)
kDAPL: dapl_ib_qp_alloc: ib_create_qp failed = -22
kDAPL: dapl_evd_free (81001ba22050)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba22168)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba166e0)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba164b0)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_pz_free (81001b9442c8)
kDAPL: dapl_ia_query (81001ba165c8, , , 81001bba7b28)
kDAPL: dapl_ia_query () returns 0x0
kDAPL: dapl_ia_close (81001ba165c8, 1)
kDAPL: dapl_evd_free (81001ba167f8)
kDAPL: dapl_evd_free () returns 0x0
Server_Cmd.debug: 1
Server_Cmd.dapl_name: mthca0a
DT_cs_Server: IA mthca0a opened
DT_cs_Server: PZ created
DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
DT_cs_Server: Waiting for clients to all go away...
DT_cs_Server: Cleaning up ...
DT_cs_Server: IA mthca0a closed
DT_cs_Server (mthca0a): Exiting.
TEST INSTANCE 0
TEST return code = 1

Also, the ib_at module prints this out now when you ping (after running kdapltest)...

ib_at: ib_at_arp_work: Process IB ARP ip 192.168.0.26 gid 0xfe82c9010a99e031

-tduffy

* running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm r2414 2 machines back-2-back
Re: [openib-general] kdapltest regression? failing now...
I'm looking into this Tom. The following code was added to hw/mthca/mthca_qp.c on Friday (starting on line 1233):

	if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
	    qp->sq.max_gs > dev->limits.max_sg ||
	    qp->rq.max_gs > dev->limits.max_sg)
		return -EINVAL;

If anyone knows what we have set incorrectly, please let me know.

Thanks,
james
Re: [openib-general] Infinipath support in OpenIB
Paul:

> The article states: 'InfiniPath will also support the OpenIB software stack providing full InfiniBand compliance.' Is this something the OpenIB software developers are working on and something that can be commented on in public, or should I speculate/ask Pathscale?

Thanks for asking. We are actively working on supporting OpenIB with the help of the OpenIB community and hope to be able to submit code that supports InfiniPath in the near future.

Johann
Re: [openib-general] Infinipath support in OpenIB
Paul Baxter wrote:
> I read with interest the PR blurb at http://supercomputingonline.com/article.php?sid=8740 regarding InfiniPath's very low latencies and high throughput for MPI even at very modest message sizes. The article states: 'InfiniPath will also support the OpenIB software stack providing full InfiniBand compliance.' Is this something the OpenIB software developers are working on and something that can be commented on in public, or should I speculate/ask Pathscale?

I would think that as long as they provide an HCA driver that registers with the core layer and exposes the proper APIs, they would be able to run with the openib stack. I do not know if their code is open source however.

- Sean
Re: [openib-general] Infinipath support in OpenIB
On Thu, 2005-05-19 at 20:29 +0100, Paul Baxter wrote:
> I read with interest the PR blurb at http://supercomputingonline.com/article.php?sid=8740 regarding InfiniPath's very low latencies and high throughput for MPI even at very modest message sizes. The article states: 'InfiniPath will also support the OpenIB software stack providing full InfiniBand compliance.'

Also interesting that OpenIB is suddenly an industry standard. ;-)

-tduffy
Re: [openib-general] kdapltest regression? failing now...
For what it's worth, this is the check that we are failing:

	qp->sq.max_gs > dev->limits.max_sg

(qp->sq.max_gs + 2 > dev->limits.max_sg is also true, but qp->transport == MLX is not).
Re: [openib-general] RE: [PATCH][kdapl] fix spin_lock_irqsave/spin_unlock_irqrestore i mplementation
On Wed, 2005-05-18 at 22:44 +0300, Itamar Rabenstein wrote:
> evd producer locking is something that we need in openib kdapl as it was in Source Forge implementation.

This is not a good excuse. OpenIB kdapl is a totally different beast from the sourceforge implementation and will differ over time.

-tduffy
Re: [openib-general] kdapltest regression? failing now...
I think I figured this out. DAPL was assuming a particular maximum scatter gather list size. I'm going to change it to query for this value. Hopefully I'll have a fix shortly.

james
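The idea described here, asking the device for its scatter/gather limit instead of assuming one, can be sketched as follows. This is a userspace illustration only: `struct dev_limits`, `struct qp_cap`, `qp_cap_valid` and `qp_cap_clamp` are hypothetical stand-ins, not the mthca or kDAPL structures. The validity test mirrors the non-MLX part of the mthca check quoted in this thread; the clamp is an assumed way for a consumer to stay within it.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's device limits and QP capabilities. */
struct dev_limits {
	int max_sg;		/* largest scatter/gather list the device accepts */
};

struct qp_cap {
	int max_send_sge;
	int max_recv_sge;
};

/* Returns 1 when the request fits the device, 0 when it would be rejected. */
static int qp_cap_valid(const struct qp_cap *cap, const struct dev_limits *lim)
{
	assert(cap && lim);
	return cap->max_send_sge <= lim->max_sg &&
	       cap->max_recv_sge <= lim->max_sg;
}

/* Reduce the request to what the queried limits allow. */
static void qp_cap_clamp(struct qp_cap *cap, const struct dev_limits *lim)
{
	assert(cap && lim);
	if (cap->max_send_sge > lim->max_sg)
		cap->max_send_sge = lim->max_sg;
	if (cap->max_recv_sge > lim->max_sg)
		cap->max_recv_sge = lim->max_sg;
}
```

Clamping before the create call turns a hard -EINVAL into a request the device can satisfy, at the cost of the consumer having to check what it was actually granted.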
RE: [openib-general] RE: [PATCH][kdapl] fix spin_lock_irqsave/spi n_unlock_irqrestore i mplementation
I am not saying that we need it in the OpenIB code because it was in the SF code. I said that we need it in the OpenIB implementation for the same reason we need it in SF.

We must have this lock. For example, if the same EVD is the CM EVD for two EPs, each on a different thread, both can try to add an event to the EVD at the same time. Therefore we need to lock the EVD when we take an empty event from one list (the empty list) and unlock it after we add the event to the second list (the waiting events list). The lock is taken in one function and released in a second function.

So we need it in our OpenIB code, and the SF code needs it too (;-)

Itamar
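The pattern described above, where the lock is taken in one function and released in another, means the saved interrupt flags have to travel between the two calls. The following is a single-threaded toy model of that hand-off, not kernel code: `struct toy_lock` only records whether it is held and what the saved state was, and every name here (`toy_lock_irqsave`, `toy_unlock_irqrestore`, `struct toy_evd`, `evd_get_event`, `evd_post_event`) is hypothetical.

```c
#include <assert.h>

/* Toy stand-in for a spinlock plus the CPU interrupt state. Not a real lock. */
struct toy_lock {
	int locked;
	unsigned long irq_enabled;	/* 1 = interrupts on, 0 = off */
};

static void toy_lock_irqsave(struct toy_lock *l, unsigned long *flags)
{
	*flags = l->irq_enabled;	/* remember the state before locking */
	l->irq_enabled = 0;
	assert(!l->locked);
	l->locked = 1;
}

static void toy_unlock_irqrestore(struct toy_lock *l, unsigned long flags)
{
	assert(l->locked);
	l->locked = 0;
	l->irq_enabled = flags;		/* restore what the caller saved */
}

struct toy_evd {
	struct toy_lock lock;
	int free_events;	/* the "empty" list */
	int pending_events;	/* the "waiting events" list */
};

/*
 * Take the lock and one free event. On success (returns 1) the lock stays
 * held and *flags must later be handed to evd_post_event(). On failure
 * (returns 0) the lock is released before returning.
 */
static int evd_get_event(struct toy_evd *evd, unsigned long *flags)
{
	toy_lock_irqsave(&evd->lock, flags);
	if (evd->free_events == 0) {
		toy_unlock_irqrestore(&evd->lock, *flags);
		return 0;
	}
	evd->free_events--;
	return 1;
}

/* Queue the event and release the lock taken by evd_get_event(). */
static void evd_post_event(struct toy_evd *evd, unsigned long flags)
{
	evd->pending_events++;
	toy_unlock_irqrestore(&evd->lock, flags);
}
```

Passing the flags by pointer into the acquiring function and by value into the releasing one keeps the saved state with the caller, which is the only place both calls are visible.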
[openib-general] [PATCH] user_mad: Support RMPP on send side
user_mad: Support RMPP on send side

Note that this change will need a coordinated change to OpenSM and some userspace/management libraries which will be done as soon as possible once this patch is accepted. Receive side support for RMPP will be added separately.

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: infiniband/include/ib_user_mad.h
===================================================================
--- infiniband/include/ib_user_mad.h	(revision 2413)
+++ infiniband/include/ib_user_mad.h	(working copy)
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses. You may choose to be licensed under the terms of the GNU
@@ -42,7 +43,7 @@
  * Increment this value if any changes that break userspace ABI
  * compatibility are made.
  */
-#define IB_USER_MAD_ABI_VERSION	2
+#define IB_USER_MAD_ABI_VERSION	3
 
 /*
  * Make sure that all structs defined in this file remain laid out so
@@ -51,8 +52,7 @@
  */
 
 /**
- * ib_user_mad - MAD packet
- * @data - Contents of MAD
+ * ib_user_mad_hdr - MAD packet header
  * @id - ID of agent MAD received with/to be sent with
  * @status - 0 on successful receive, ETIMEDOUT if no response
  * received (transaction ID in data[] will be set to TID of original
@@ -72,8 +72,7 @@
  *
  * All multi-byte quantities are stored in network (big endian) byte order.
  */
-struct ib_user_mad {
-	__u8	data[256];
+struct ib_user_mad_hdr {
 	__u32	id;
 	__u32	status;
 	__u32	timeout_ms;
@@ -91,6 +90,17 @@
 };
 
 /**
+ * ib_user_mad - MAD packet
+ * @hdr - MAD packet header
+ * @data - Contents of MAD
+ *
+ */
+struct ib_user_mad {
+	struct ib_user_mad_hdr hdr;
+	__u8	data[0];
+};
+
+/**
  * ib_user_mad_reg_req - MAD registration request
  * @id - Set by the kernel; used to identify agent in future requests.
  * @qpn - Queue pair number; must be 0 or 1.
@@ -103,6 +113,8 @@
  * management class to receive.
  * @oui: Indicates IEEE OUI when mgmt_class is a vendor class
  * in the range from 0x30 to 0x4f. Otherwise not used.
+ * @rmpp_version: If set, indicates the RMPP version used.
+ *
  */
 struct ib_user_mad_reg_req {
 	__u32	id;
@@ -111,6 +123,7 @@
 	__u8	mgmt_class;
 	__u8	mgmt_class_version;
 	__u8	oui[3];
+	__u8	rmpp_version;
 };
 
 #define IB_IOCTL_MAGIC 0x1b
Index: infiniband/core/user_mad.c
===================================================================
--- infiniband/core/user_mad.c	(revision 2413)
+++ infiniband/core/user_mad.c	(working copy)
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses. You may choose to be licensed under the terms of the GNU
@@ -94,10 +95,12 @@
 };
 
 struct ib_umad_packet {
-	struct ib_user_mad mad;
 	struct ib_ah *ah;
+	struct ib_mad_send_buf *msg;
 	struct list_head list;
+	int	length;
 	DECLARE_PCI_UNMAP_ADDR(mapping)
+	struct ib_user_mad mad;
 };
 
 static const dev_t base_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE);
@@ -114,10 +117,10 @@
 	int ret = 1;
 
 	down_read(&file->agent_mutex);
-	for (packet->mad.id = 0;
-	     packet->mad.id < IB_UMAD_MAX_AGENTS;
-	     packet->mad.id++)
-		if (agent == file->agent[packet->mad.id]) {
+	for (packet->mad.hdr.id = 0;
+	     packet->mad.hdr.id < IB_UMAD_MAX_AGENTS;
+	     packet->mad.hdr.id++)
+		if (agent == file->agent[packet->mad.hdr.id]) {
 			spin_lock_irq(&file->recv_lock);
 			list_add_tail(&packet->list, &file->recv_list);
 			spin_unlock_irq(&file->recv_lock);
@@ -138,14 +141,11 @@
 	struct ib_umad_packet *packet =
 		(void *) (unsigned long) send_wc->wr_id;
 
-	dma_unmap_single(agent->device->dma_device,
-			 pci_unmap_addr(packet, mapping),
-			 sizeof packet->mad.data,
-			 DMA_TO_DEVICE);
-	ib_destroy_ah(packet->ah);
+	ib_free_send_mad(packet->msg);
+	ib_destroy_ah(packet->msg->send_wr.wr.ud.ah);
 
 	if (send_wc->status == IB_WC_RESP_TIMEOUT_ERR) {
-		packet->mad.status = ETIMEDOUT;
+		packet->mad.hdr.status = ETIMEDOUT;
 
 		if (!queue_packet(file, agent, packet))
 			return;
@@ -159,30 +159,34 @@
 {
 	struct ib_umad_file *file = agent->context;
 	struct ib_umad_packet *packet;
+	int length;
+
 	if (mad_recv_wc->wc->status != IB_WC_SUCCESS)
 		goto out;
 
-	packet = kmalloc(sizeof *packet, GFP_KERNEL);
+	length = 256; /*
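The header change in this patch replaces a fixed 256-byte payload with a header followed by a zero-length array, so one allocation can carry a payload of any size. A minimal userspace sketch of that layout follows. The types `struct mad_hdr` and `struct mad_packet` and the helper `mad_packet_alloc` are hypothetical; the header shows only the three fields visible in the hunk context above, and the sketch uses the C99 flexible array member spelling `data[]` where the patch uses `data[0]`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, shortened header: only fields visible in the patch context. */
struct mad_hdr {
	uint32_t id;
	uint32_t status;
	uint32_t timeout_ms;
};

struct mad_packet {
	struct mad_hdr hdr;
	uint8_t data[];		/* payload follows the header in the same block */
};

/*
 * Allocate header and payload together and copy the payload in.
 * Returns NULL on allocation failure; the caller frees with free().
 */
static struct mad_packet *mad_packet_alloc(uint32_t id, const void *payload,
					    size_t len)
{
	struct mad_packet *p = malloc(sizeof(*p) + len);

	if (!p)
		return NULL;
	memset(&p->hdr, 0, sizeof(p->hdr));
	p->hdr.id = id;
	if (len)
		memcpy(p->data, payload, len);
	return p;
}
```

With this layout the payload length is no longer implied by the type, so it has to be tracked separately, which is presumably why the patch adds a `length` field to `struct ib_umad_packet`.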
RE: [openib-general] performance counters in /sys
-----Original Message-----
From: Hal Rosenstock [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 19, 2005 6:48 PM
To: Yaron Haviv
Cc: Mark Seger; openib-general@openib.org
Subject: RE: [openib-general] performance counters in /sys

On Thu, 2005-05-19 at 16:45, Yaron Haviv wrote:
> I believe you can use the per VL counters for that (IB allows counting traffic on a specific VL). By matching ULPs to VLs (e.g. through the ib_at lib we suggested) you can get both congestion isolation per traffic type as well as the ability to count traffic per ULP (note that up to 8 VLs are supported in the Mellanox chips)

PortXmitDataVL[n], PortRcvDataVL[n], PortXmitPktVL[n], and PortRcvPktVL[n] are all IB optional. Do the Mellanox HCAs support these counters ?

-- Hal

No
[openib-general] [PATCH] at: Change normal message from WARN to DEBUG
at: Change normal message from WARN to DEBUG

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: at.c
===================================================================
--- at.c	(revision 2379)
+++ at.c	(working copy)
@@ -737,13 +737,13 @@
 
 	arp = (struct ib_arp *)skb->nh.raw;
 
-	WARN("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
-	     (arp->src_ip & 0x000000ff),
-	     (arp->src_ip & 0x0000ff00) >> 8,
-	     (arp->src_ip & 0x00ff0000) >> 16,
-	     (arp->src_ip & 0xff000000) >> 24,
-	     be64_to_cpu(arp->src_gid.global.subnet_prefix),
-	     be64_to_cpu(arp->src_gid.global.interface_id));
+	DEBUG("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
+	      (arp->src_ip & 0x000000ff),
+	      (arp->src_ip & 0x0000ff00) >> 8,
+	      (arp->src_ip & 0x00ff0000) >> 16,
+	      (arp->src_ip & 0xff000000) >> 24,
+	      be64_to_cpu(arp->src_gid.global.subnet_prefix),
+	      be64_to_cpu(arp->src_gid.global.interface_id));
 
 	spin_lock_irqsave(&q->lock, flags);
 	for (a = q->next; a != q; a = a->next) {
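The message in this patch prints a 32-bit address as four octets using masks and shifts. A small standalone sketch of that arithmetic follows, assuming the first octet sits in the least significant byte of the value (as when a network-order address is read as an integer on a little-endian host). The helper `format_ip` is hypothetical and is not part of at.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the dotted-quad form of ip into buf, taking the first octet from
 * the low byte. Returns the snprintf() result (length of the full string).
 */
static int format_ip(uint32_t ip, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%u.%u.%u.%u",
			(unsigned)(ip & 0x000000ffu),
			(unsigned)((ip & 0x0000ff00u) >> 8),
			(unsigned)((ip & 0x00ff0000u) >> 16),
			(unsigned)((ip & 0xff000000u) >> 24));
}
```

Each mask isolates one byte and the shift moves it down to the low eight bits, so every argument passed to the format string lies between 0 and 255.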
Re: [openib-general] Infinipath support in OpenIB
On Thu, May 19, 2005 at 12:46:04PM -0700, Tom Duffy wrote:
> > The article states: 'InfiniPath will also support the OpenIB software stack providing full InfiniBand compliance.'
> Also interesting that OpenIB is suddenly an industry standard. ;-)

Tom, There's a reason for that -- when we asked customers, they asked us to support OpenIB in particular. So congratulations, not only is OpenIB a part of the IB ecosystem, it's one that everyone seems to want to support. We didn't want to invent a new flavor of verbs or DAPL anyway. No point to it, no benefit to anyone.

-- greg
Re: [openib-general] kdapltest regression? failing now...
I committed a fix for this in revision 2420. The problem turned out to be
that DAPL wasn't initializing the max_inline_data value of the QP attr's
cap structure. Let me know if you still have any problems.

There is a patch in the pipeline that will remove the IBAT printout you
mentioned.

james

On Thu, 19 May 2005, James Lentini wrote:

I think I figured this out. DAPL was assuming a particular maximum
scatter gather list size. I'm going to change it to query for this
value. Hopefully I'll have a fix shortly.

james

On Thu, 19 May 2005, James Lentini wrote:

For what it's worth, this is the check that we are failing:

  qp->sq.max_gs > dev->limits.max_sg

(qp->sq.max_gs + 2 > dev->limits.max_sg is also true but
qp->transport == MLX is not).

On Thu, 19 May 2005, James Lentini wrote:

I'm looking into this Tom.

The following code was added to hw/mthca/mthca_qp.c on Friday (starting
on line 1233):

	if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
	    qp->sq.max_gs > dev->limits.max_sg || qp->rq.max_gs > dev->limits.max_sg)
		return -EINVAL;

If anyone knows what we have set incorrectly, please let me know.

Thanks,
james

On Thu, 19 May 2005, Tom Duffy wrote:

tduffy> I am not sure when this started, but after updating to top of trunk*, I
tduffy> can no longer get kdapltest to work properly. Both ipoib and sdp are
tduffy> working.
tduffy>
tduffy> Both server and client are returning an error: DAT_INVALID_HANDLE. This
tduffy> is coming from ib_create_qp(). With debugging turned on:
tduffy>
tduffy> [EMAIL PROTECTED] ~]# ./kdapltest -T S -D mthca0a -d
tduffy> kDAPL: dapl_ia_open (mthca0a, 8, 81000b806308, 81000b8062d8)
tduffy> kDAPL: dapl_ia_open () returns 0x0
tduffy> kDAPL: dapl_pz_create (81001ba165c8, 81000b8062e0)
tduffy> kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x20, 81000b8062e8)
tduffy> kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0xa0, 81000b8062f0)
tduffy> kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x10, 81000b806300)
tduffy> kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x40, 81000b8062f8)
tduffy> kDAPL: dapl_ep_create (81001ba165c8, 81001b9442c8, 81001ba164b0, 81001ba166e0, 81001ba22050, , 81000b806318)
tduffy> kDAPL: dapl_ib_qp_alloc: ib_create_qp failed = -22
tduffy> kDAPL: dapl_evd_free (81001ba22050)
tduffy> kDAPL: dapl_evd_free () returns 0x0
tduffy> kDAPL: dapl_evd_free (81001ba22168)
tduffy> kDAPL: dapl_evd_free () returns 0x0
tduffy> kDAPL: dapl_evd_free (81001ba166e0)
tduffy> kDAPL: dapl_evd_free () returns 0x0
tduffy> kDAPL: dapl_evd_free (81001ba164b0)
tduffy> kDAPL: dapl_evd_free () returns 0x0
tduffy> kDAPL: dapl_pz_free (81001b9442c8)
tduffy> kDAPL: dapl_ia_query (81001ba165c8, , , 81001bba7b28)
tduffy> kDAPL: dapl_ia_query () returns 0x0
tduffy> kDAPL: dapl_ia_close (81001ba165c8, 1)
tduffy> kDAPL: dapl_evd_free (81001ba167f8)
tduffy> kDAPL: dapl_evd_free () returns 0x0
tduffy> Server_Cmd.debug: 1
tduffy> Server_Cmd.dapl_name: mthca0a
tduffy> DT_cs_Server: IA mthca0a opened
tduffy> DT_cs_Server: PZ created
tduffy> DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
tduffy> DT_cs_Server: Waiting for clients to all go away...
tduffy> DT_cs_Server: Cleaning up ...
tduffy> DT_cs_Server: IA mthca0a closed
tduffy> DT_cs_Server (mthca0a): Exiting.
tduffy> TEST INSTANCE 0
tduffy> TEST return code = 1
tduffy>
tduffy> Also, the ib_at module prints this out now when you ping (after running
tduffy> kdapltest)...
tduffy>
tduffy> ib_at: ib_at_arp_work: Process IB ARP ip 192.168.0.26 gid 0xfe82c9010a99e031
tduffy>
tduffy> -tduffy
tduffy>
tduffy> * running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm r2414
tduffy>   2 machines back-2-back
tduffy>
_______________________________________________
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
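For readers following the failure above, here is a minimal user-space sketch of how an uninitialized capability field could trip a limit check of the quoted shape. Everything in it is an assumption for illustration: the sketch_* types, the 16-byte segment size, and the idea that inline data is converted into scatter/gather entries are hypothetical stand-ins, not the real mthca or ib_verbs definitions, and the thread does not spell out the exact mechanism.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins for the device limits and QP capabilities
 * discussed in the thread; NOT the real mthca or ib_verbs types. */
struct sketch_limits {
	int max_sg;
};

struct sketch_qp_cap {
	int max_send_sge;
	int max_recv_sge;
	int max_inline_data;
};

/* Assumed for illustration: bytes carried by one scatter/gather entry. */
#define SKETCH_SEG_SIZE 16

/* Effective send s/g count: inline data may need more entries than
 * max_send_sge asks for (an assumption of this sketch). */
static int sketch_send_gs(const struct sketch_qp_cap *cap)
{
	int inline_gs = (cap->max_inline_data + SKETCH_SEG_SIZE - 1) / SKETCH_SEG_SIZE;

	return cap->max_send_sge > inline_gs ? cap->max_send_sge : inline_gs;
}

/* Same shape as the quoted check: reject requests beyond the device limit. */
static int sketch_check_cap(const struct sketch_limits *lim,
			    const struct sketch_qp_cap *cap)
{
	assert(lim != NULL && cap != NULL);
	if (sketch_send_gs(cap) > lim->max_sg || cap->max_recv_sge > lim->max_sg)
		return -EINVAL;
	return 0;
}

/* Consumer side: zero the whole structure, then set only what is needed,
 * so no field is ever left holding stack garbage. */
static void sketch_init_cap(struct sketch_qp_cap *cap, int send_sge, int recv_sge)
{
	memset(cap, 0, sizeof(*cap));
	cap->max_send_sge = send_sge;
	cap->max_recv_sge = recv_sge;
}
```

With the structure zeroed first, a request for 4 send and 4 receive entries passes against a limit of 30; setting max_inline_data to a large stray value makes the same request fail with -EINVAL, which matches the -22 seen in the log.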
[openib-general] [kDAPL] module parameter names
With revision 2420, I made the dat and ib_dat_provider module debug parameter names consistent. They are now both dbg_mask.
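As a rough illustration of what a debug mask like dbg_mask typically controls, here is a user-space sketch. The bit values and the sketch_dbg_log() helper are hypothetical; the real DAPL_DBG_TYPE_* values and dapl_dbg_log() may differ, and in the modules the mask would be set through the module parameter rather than a plain variable.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Illustrative bit values only; the real DAPL_DBG_TYPE_* values may differ. */
enum sketch_dbg_type {
	SKETCH_DBG_TYPE_ERR = 0x01,
	SKETCH_DBG_TYPE_EVD = 0x02,
	SKETCH_DBG_TYPE_EP  = 0x04
};

/* Stand-in for the module parameter; only errors are enabled by default. */
static unsigned int dbg_mask = SKETCH_DBG_TYPE_ERR;

/* Print the message only if its type bit is set in dbg_mask.
 * Returns the number of characters written, or 0 if filtered out. */
static int sketch_dbg_log(unsigned int type, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);
	if (!(dbg_mask & type))
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```

Having both modules use the same parameter name means the same mask value can be passed to each at load time without looking up which spelling each module expects.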
[openib-general] Re: [PATCH][kdapl] fix spin_lock_irqsave/spin_unlock_irqrestore implementation
Itamar,

Thank you for pointing this out. Long term I think it will be better to
keep the flags with the spin lock so that these scoping issues don't crop
up and force us to pass the flags around.

The fix for this is in revision 2420.

james

On Wed, 18 May 2005, Itamar wrote:

itamar> when spin_lock_irqsave and spin_unlock_irqrestore are not called in the same function
itamar> need to pass the flags from the save to the restore
itamar>
itamar> Signed-off-by: Itamar Rabenstein [EMAIL PROTECTED]
itamar>
itamar> Index: dapl_evd_util.c
itamar> ===================================================================
itamar> --- dapl_evd_util.c	(revision 2374)
itamar> +++ dapl_evd_util.c	(working copy)
itamar> @@ -378,20 +378,19 @@
itamar>   * that the lock is held.
itamar>   */
itamar>
itamar> -static struct dat_event *dapl_evd_get_event(DAPL_EVD * evd_ptr)
itamar> +static struct dat_event *dapl_evd_get_event(DAPL_EVD * evd_ptr, unsigned long *flags)
itamar>  {
itamar>  	struct dat_event *event;
itamar> -	unsigned long flags;
itamar>
itamar>  	if (evd_ptr->evd_producer_locking_needed) {
itamar> -		spin_lock_irqsave(&evd_ptr->header.lock, flags);
itamar> +		spin_lock_irqsave(&evd_ptr->header.lock, *flags);
itamar>  	}
itamar>
itamar>  	event = (struct dat_event *) dapl_rbuf_remove(&evd_ptr->free_event_queue);
itamar>
itamar>  	/* Release the lock if it was taken and the call failed. */
itamar>  	if (!event && evd_ptr->evd_producer_locking_needed) {
itamar> -		spin_unlock_irqrestore(&evd_ptr->header.lock, flags);
itamar> +		spin_unlock_irqrestore(&evd_ptr->header.lock, *flags);
itamar>  	}
itamar>
itamar>  	return event;
itamar> @@ -406,10 +405,10 @@
itamar>   */
itamar>
itamar>  static void dapl_evd_post_event(DAPL_EVD *evd_ptr,
itamar> -				const struct dat_event *event_ptr)
itamar> +				const struct dat_event *event_ptr,
itamar> +				unsigned long flags)
itamar>  {
itamar>  	u32 dat_status;
itamar> -	unsigned long flags;
itamar>  	DAPL_CNO * cno_to_trigger = NULL;
itamar>
itamar>  	dapl_dbg_log(DAPL_DBG_TYPE_EVD,
itamar> @@ -459,7 +458,7 @@
itamar>  				 DAPL_EVD * overflow_evd_ptr)
itamar>  {
itamar>  	struct dat_event *overflow_event;
itamar> -
itamar> +	unsigned long flags;
itamar>  	/* The overflow_evd_ptr mght be the same as evd.
itamar>  	 * In that case we've got a catastrophic overflow.
itamar>  	 */
itamar> @@ -469,7 +468,7 @@
itamar>  		return;
itamar>  	}
itamar>
itamar> -	overflow_event = dapl_evd_get_event(overflow_evd_ptr);
itamar> +	overflow_event = dapl_evd_get_event(overflow_evd_ptr, &flags);
itamar>  	if (!overflow_event) {
itamar>  		/* this is not good */
itamar>  		overflow_evd_ptr->catastrophic_overflow = TRUE;
itamar> @@ -477,17 +476,18 @@
itamar>  		return;
itamar>  	}
itamar>  	dapl_evd_format_overflow_event(overflow_evd_ptr, overflow_event);
itamar> -	dapl_evd_post_event(overflow_evd_ptr, overflow_event);
itamar> +	dapl_evd_post_event(overflow_evd_ptr, overflow_event, flags);
itamar>
itamar>  	return;
itamar>  }
itamar>
itamar>  static struct dat_event *dapl_evd_get_and_init_event(DAPL_EVD *evd_ptr,
itamar> -						     enum dat_event_number evno)
itamar> +						     enum dat_event_number evno,
itamar> +						     unsigned long *flags)
itamar>  {
itamar>  	struct dat_event *event_ptr;
itamar>
itamar> -	event_ptr = dapl_evd_get_event(evd_ptr);
itamar> +	event_ptr = dapl_evd_get_event(evd_ptr, flags);
itamar>  	if (!event_ptr)
itamar>  		dapl_evd_post_overflow_event(evd_ptr->header.owner_ia->
itamar>  					     async_error_evd, evd_ptr);
itamar> @@ -507,7 +507,8 @@
itamar>  				DAT_CR_HANDLE cr_handle)
itamar>  {
itamar>  	struct dat_event *event_ptr;
itamar> -	event_ptr = dapl_evd_get_and_init_event(evd_ptr, event_number);
itamar> +	unsigned long flags;
itamar> +	event_ptr = dapl_evd_get_and_init_event(evd_ptr, event_number, &flags);
itamar>  	/*
itamar>  	 * Note event lock may be held on successful return
itamar>  	 * to be released by dapl_evd_post_event(), if provider side locking
itamar> @@ -525,7 +526,7 @@
itamar>  	event_ptr->event_data.cr_arrival_event_data.conn_qual = conn_qual;
itamar>  	event_ptr->event_data.cr_arrival_event_data.cr_handle = cr_handle;
itamar>
itamar> -	dapl_evd_post_event(evd_ptr, event_ptr);
itamar> +	dapl_evd_post_event(evd_ptr, event_ptr, flags);
itamar>
itamar>  	return DAT_SUCCESS;
itamar>  }
itamar> @@ -537,7 +538,8 @@
itamar>  void
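The two approaches discussed in this thread (passing the saved flags from the function that takes the lock to the function that releases it, versus keeping the flags with the lock) can be sketched in user space as follows. This is only a model under stated assumptions: irq_state stands in for the CPU interrupt flags, and every sketch_* name is hypothetical rather than a kernel or DAPL API.

```c
#include <assert.h>

/* User-space stand-in: irq_state plays the role of the CPU interrupt flags. */
static unsigned long irq_state = 1;	/* 1 = interrupts enabled */

struct sketch_lock {
	int held;
	unsigned long saved_flags;	/* used only by variant B below */
};

/* Variant A: the caller owns the flags (the shape of the patch). */
static void sketch_lock_irqsave(struct sketch_lock *l, unsigned long *flags)
{
	*flags = irq_state;	/* save the current state */
	irq_state = 0;		/* "disable interrupts" */
	l->held = 1;
}

static void sketch_unlock_irqrestore(struct sketch_lock *l, unsigned long flags)
{
	assert(l->held);
	l->held = 0;
	irq_state = flags;	/* restore exactly what was saved */
}

/* Takes the lock and returns with it held; the flags travel out via *flags. */
static int sketch_get_event(struct sketch_lock *l, unsigned long *flags)
{
	sketch_lock_irqsave(l, flags);
	return 42;		/* placeholder for an event */
}

/* Releases the lock taken by sketch_get_event(), using the caller's flags. */
static void sketch_post_event(struct sketch_lock *l, unsigned long flags)
{
	sketch_unlock_irqrestore(l, flags);
}

/* Variant B: the flags live next to the lock, so nothing is passed around. */
static void sketch_lock_save(struct sketch_lock *l)
{
	unsigned long flags = irq_state;

	irq_state = 0;
	l->held = 1;
	l->saved_flags = flags;	/* written only once the lock is held */
}

static void sketch_unlock_restore(struct sketch_lock *l)
{
	unsigned long flags;

	assert(l->held);
	flags = l->saved_flags;	/* read before the lock is dropped */
	l->held = 0;
	irq_state = flags;
}
```

Variant A keeps the lock structure unchanged but widens every function signature between the save and the restore; variant B hides that plumbing at the cost of one extra word per lock.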
[openib-general] Re: [PATCH] kDAPL: add some clarification to a few debug printks
Committed in revision 2421.

On Wed, 18 May 2005, Tom Duffy wrote:

tduffy> I was running into some issues and noticed that these printks were not
tduffy> very clear about where they were coming from. So, adding a little more
tduffy> info.
tduffy>
tduffy> Signed-off-by: Tom Duffy [EMAIL PROTECTED]
tduffy>
tduffy> Index: linux-kernel/dat-provider/dapl_openib_qp.c
tduffy> ===================================================================
tduffy> --- linux-kernel/dat-provider/dapl_openib_qp.c	(revision 2382)
tduffy> +++ linux-kernel/dat-provider/dapl_openib_qp.c	(working copy)
tduffy> @@ -108,7 +108,8 @@ u32 dapl_ib_qp_alloc(DAPL_IA *ia_ptr, DA
tduffy>  	ep_ptr->qp_handle = ib_create_qp(ib_pd_handle, qp_attr);
tduffy>  	if (IS_ERR(ep_ptr->qp_handle)) {
tduffy>  		ib_status = PTR_ERR(ep_ptr->qp_handle);
tduffy> -		dapl_dbg_log(DAPL_DBG_TYPE_ERR, "failed code = %d\n",
tduffy> +		dapl_dbg_log(DAPL_DBG_TYPE_ERR,
tduffy> +			     "dapl_ib_qp_alloc: ib_create_qp failed = %d\n",
tduffy>  			     ib_status);
tduffy>  		return dapl_ib_status_convert(ib_status);
tduffy>  	}
tduffy> @@ -197,8 +198,9 @@ ib_cq_handle_t dapl_get_dto_cq(DAPL_IA *
tduffy>  			ib_status =
tduffy>  			    PTR_ERR(ia_ptr->hca_ptr->ib_trans.
tduffy>  				    null_ib_cq_handle);
tduffy> -			dapl_dbg_log(DAPL_DBG_TYPE_ERR, "failed code = %d\n",
tduffy> -				     ib_status);
tduffy> +			dapl_dbg_log(DAPL_DBG_TYPE_ERR,
tduffy> +				     "dapl_get_dto_cq: ib_create_cq failed "
tduffy> +				     "= %d\n", ib_status);
tduffy>  			ia_ptr->hca_ptr->ib_trans.null_ib_cq_handle =
tduffy>  			    IB_INVALID_HANDLE;
tduffy>  		}
tduffy>
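One way to get the "where did this message come from" information without typing the function name into every format string is the C99 `__func__` identifier. The sketch below is not what the patch does (the patch writes the names out by hand); SKETCH_ERR and sketch_qp_alloc are hypothetical names used only to show the idea in user space.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format "<function>: <message> = <code>" into buf.
 * __func__ expands to the name of the function that uses the macro. */
#define SKETCH_ERR(buf, len, msg, code) \
	snprintf((buf), (len), "%s: %s = %d\n", __func__, (msg), (code))

/* Pretend allocation routine whose create call failed with -22 (-EINVAL). */
static int sketch_qp_alloc(char *buf, size_t len)
{
	int ib_status = -22;

	assert(buf != NULL && len > 0);
	return SKETCH_ERR(buf, len, "ib_create_qp failed", ib_status);
}
```

Calling sketch_qp_alloc() with a 64-byte buffer yields a message of the same form as the one in the patch, with the prefix taken from the enclosing function, so it cannot go stale if the function is renamed.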
[openib-general] Re: [PATCH] kDAPL: return is not a function
Committed in revision 2422.

On Wed, 18 May 2005, Tom Duffy wrote:

tduffy> return is not a function
tduffy>
tduffy> Signed-off-by: Tom Duffy [EMAIL PROTECTED]
tduffy>
tduffy> Index: linux-kernel-return/dat-provider/dapl_cookie.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_cookie.c	(revision 2382)
tduffy> +++ linux-kernel-return/dat-provider/dapl_cookie.c	(working copy)
tduffy> @@ -134,9 +134,9 @@ u32 dapl_cb_create(DAPL_COOKIE_BUFFER *b
tduffy>  			buffer->pool[i].ep = ep;
tduffy>  		}
tduffy>
tduffy> -		return (DAT_SUCCESS);
tduffy> +		return DAT_SUCCESS;
tduffy>  	} else {
tduffy> -		return (DAT_INSUFFICIENT_RESOURCES);
tduffy> +		return DAT_INSUFFICIENT_RESOURCES;
tduffy>  	}
tduffy>  }
tduffy>
tduffy> Index: linux-kernel-return/dat-provider/dapl_ring_buffer_util.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_ring_buffer_util.c	(revision 2382)
tduffy> +++ linux-kernel-return/dat-provider/dapl_ring_buffer_util.c	(working copy)
tduffy> @@ -237,7 +237,7 @@ void *dapl_rbuf_remove(DAPL_RING_BUFFER
tduffy>  		if (val == pos) {
tduffy>  			pos = (pos + 1) & rbuf->lim;	/* verify in range */
tduffy>
tduffy> -			return (rbuf->base[pos]);
tduffy> +			return rbuf->base[pos];
tduffy>  		}
tduffy>  	}
tduffy>
tduffy> Index: linux-kernel-return/dat-provider/dapl_lmr_util.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_lmr_util.c	(revision 2382)
tduffy> +++ linux-kernel-return/dat-provider/dapl_lmr_util.c	(working copy)
tduffy> @@ -50,7 +50,7 @@ DAPL_LMR *dapl_lmr_alloc(DAPL_IA * ia,
tduffy>  	/* Allocate LMR */
tduffy>  	lmr = (DAPL_LMR *) kmalloc(sizeof(DAPL_LMR), GFP_ATOMIC);
tduffy>  	if (NULL == lmr) {
tduffy> -		return (NULL);
tduffy> +		return NULL;
tduffy>  	}
tduffy>
tduffy>  	/* zero the structure */
tduffy> @@ -80,7 +80,7 @@ DAPL_LMR *dapl_lmr_alloc(DAPL_IA * ia,
tduffy>  	lmr->param.mem_priv = mem_priv;
tduffy>  	atomic_set(&lmr->lmr_ref_count, 0);
tduffy>
tduffy> -	return (lmr);
tduffy> +	return lmr;
tduffy>  }
tduffy>
tduffy>  void dapl_lmr_dealloc(DAPL_LMR * lmr)
tduffy> Index: linux-kernel-return/dat-provider/dapl_hca_util.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_hca_util.c	(revision 2382)
tduffy> +++ linux-kernel-return/dat-provider/dapl_hca_util.c	(working copy)
tduffy> @@ -83,7 +83,7 @@ DAPL_HCA *dapl_hca_alloc(char *name, str
tduffy>  		}
tduffy>  	}
tduffy>
tduffy> -	return (hca_ptr);
tduffy> +	return hca_ptr;
tduffy>  }
tduffy>
tduffy>  /*
tduffy> Index: linux-kernel-return/dat-provider/dapl_cr_util.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_cr_util.c	(revision 2382)
tduffy> +++ linux-kernel-return/dat-provider/dapl_cr_util.c	(working copy)
tduffy> @@ -44,7 +44,7 @@ DAPL_CR *dapl_cr_alloc(DAPL_IA * ia_ptr)
tduffy>  	/* Allocate EP */
tduffy>  	cr_ptr = (DAPL_CR *) kmalloc(sizeof(DAPL_CR), GFP_ATOMIC);
tduffy>  	if (cr_ptr == NULL) {
tduffy> -		return (NULL);
tduffy> +		return NULL;
tduffy>  	}
tduffy>
tduffy>  	/* zero the structure */
tduffy> @@ -62,7 +62,7 @@ DAPL_CR *dapl_cr_alloc(DAPL_IA * ia_ptr)
tduffy>  	dapl_llist_init_entry(&cr_ptr->header.ia_list_entry);
tduffy>  	spin_lock_init(&cr_ptr->header.lock);
tduffy>
tduffy> -	return (cr_ptr);
tduffy> +	return cr_ptr;
tduffy>  }
tduffy>
tduffy>  /*
tduffy> Index: linux-kernel-return/dat-provider/dapl_ia_util.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_ia_util.c	(revision 2382)
tduffy> +++ linux-kernel-return/dat-provider/dapl_ia_util.c	(working copy)
tduffy> @@ -65,7 +65,7 @@ DAPL_IA *dapl_ia_alloc(struct dat_provid
tduffy>  	/* Allocate IA */
tduffy>  	ia_ptr = (DAPL_IA *) kmalloc(sizeof(DAPL_IA), GFP_ATOMIC);
tduffy>  	if (ia_ptr == NULL) {
tduffy> -		return (NULL);
tduffy> +		return NULL;
tduffy>  	}
tduffy>
tduffy>  	/* zero the structure */
tduffy> @@ -100,7 +100,7 @@ DAPL_IA *dapl_ia_alloc(struct dat_provid
tduffy>
tduffy>  	dapl_hca_link_ia(hca_ptr, ia_ptr);
tduffy>
tduffy> -	return (ia_ptr);
tduffy> +	return ia_ptr;
tduffy>  }
tduffy>
tduffy>  /*
tduffy> Index: linux-kernel-return/dat-provider/dapl_pz_util.c
tduffy> ===================================================================
tduffy> --- linux-kernel-return/dat-provider/dapl_pz_util.c
[openib-general] Re: [PATCH] at: Change normal message from WARN to DEBUG
Committed in revision 2423.

On Thu, 19 May 2005, Hal Rosenstock wrote:

halr> at: Change normal message from WARN to DEBUG
halr>
halr> Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]
halr>
halr> Index: at.c
halr> ===================================================================
halr> --- at.c	(revision 2379)
halr> +++ at.c	(working copy)
halr> @@ -737,13 +737,13 @@
halr>
halr>  	arp = (struct ib_arp *)skb->nh.raw;
halr>
halr> -	WARN("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
halr> -	     (arp->src_ip & 0x000000ff),
halr> -	     (arp->src_ip & 0x0000ff00) >> 8,
halr> -	     (arp->src_ip & 0x00ff0000) >> 16,
halr> -	     (arp->src_ip & 0xff000000) >> 24,
halr> -	     be64_to_cpu(arp->src_gid.global.subnet_prefix),
halr> -	     be64_to_cpu(arp->src_gid.global.interface_id));
halr> +	DEBUG("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
halr> +	      (arp->src_ip & 0x000000ff),
halr> +	      (arp->src_ip & 0x0000ff00) >> 8,
halr> +	      (arp->src_ip & 0x00ff0000) >> 16,
halr> +	      (arp->src_ip & 0xff000000) >> 24,
halr> +	      be64_to_cpu(arp->src_gid.global.subnet_prefix),
halr> +	      be64_to_cpu(arp->src_gid.global.interface_id));
halr>
halr>  	spin_lock_irqsave(&q->lock, flags);
halr>  	for (a = q->next; a != q; a = a->next) {
halr>
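The mask-and-shift arguments in the message above pull the four octets out of a 32-bit IPv4 address. A small user-space sketch of the same technique, assuming the lowest byte of the value is the first octet (which is what the shifts 0, 8, 16, 24 imply for a network-order address read on a little-endian host); sketch_format_ip is a hypothetical helper, not part of at.c:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 32-bit IPv4 address whose lowest byte is the first octet.
 * Returns the number of characters snprintf() would have written. */
static int sketch_format_ip(uint32_t ip, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%u.%u.%u.%u",
			(unsigned)(ip & 0x000000ff),
			(unsigned)((ip & 0x0000ff00) >> 8),
			(unsigned)((ip & 0x00ff0000) >> 16),
			(unsigned)((ip & 0xff000000) >> 24));
}
```

For example, the value 0x1a00a8c0 formats as 192.168.0.26, the address shown in the ib_at message quoted earlier in this digest. The sketch uses an unsigned type throughout so that the top-byte mask and the shift by 24 stay well defined.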
Re: [openib-general] OOPS: ib_mad crashery on bootup
> Error, some other host already uses address 192.168.0.233.

I hit a similar problem a while ago, but I haven't been able to reproduce
it since then. Is this reproducible if you configure duplicate IP
addresses?

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638
Re: [openib-general] OOPS: ib_mad crashery on bootup
On Thu, 2005-05-19 at 21:30 -0700, Shirley Ma wrote:
> > Error, some other host already uses address 192.168.0.233.
>
> I hit a similar problem a while ago. But I couldn't reproduce it since
> then. Is this reproducible if you configure duplicate IP addresses?

I don't think this particular error is related to the crash I observed.
I see this on normal bootup of FC3 w/ ib0 configured.

The crash I saw happened once, but upon reboot was not reproducible.

-tduffy
Re: [openib-general] kdapltest regression? failing now...
On Thu, 2005-05-19 at 21:43 -0400, James Lentini wrote:
> I committed a fix for this in revision 2420. The problem turned out to be
> that DAPL wasn't initializing the max_inline_data value of the QP attr's
> cap structure. Let me know if you still have any problems.

Good job. All is well now. Tested working.

Thanks,

-tduffy