[openib-general] Re: [PATCH] rewrite perftest/README

2005-05-19 Thread Michael S. Tsirkin
Quoting r. Grant Grundler [EMAIL PROTECTED]:
 Subject: [PATCH] rewrite perftest/README
 
 Michael,
 Here's a complete rewrite of the README file.
 Should make it easier for people to understand
   o how to build
   o run
   o interpret results
 
 
 I'd still like to add a bit more about statistical significance
 of the sample size of 1000 but am just refreshing my memory
 (ok, lame excuse, I've forgotten everything from 20 years ago :^)
 on how to do that. Maybe you can craft something based on your
 experience plus the observations below?
 
 Ditching the last two (extreme) readings from the server side
 of the histogram:
   o standard deviation          86 cycles
   o arithmetic mean (average)   7135 cycles
   o median                      7126 cycles
   o min                         6906 cycles (sorted sample #1)
   o max                         7490 cycles (sorted sample #997)
 
 (For the record, #998 is 8798 and #999 is 50305, we are clearly
 measuring something else here too)
 
 (1.5 GHz IA64, PCI-X, 2.6.11, forgot which SVN they are running)
 
 The median value is *very* reproducible on this configuration.
 +- 1 cycle consistently over 5 runs of rdma_lat.
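
 A minimal sketch of how such a summary could be computed, assuming the
 raw cycle counts are available as an array. The names summarize(),
 struct lat_stats and the `drop` parameter are made up for illustration
 and are not part of rdma_lat.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for unsigned long cycle counts. */
static int cmp_cycles(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

struct lat_stats {
    unsigned long min, max, median;
    double mean;
};

/*
 * Sort the samples in place, ignore the `drop` largest readings and
 * summarize the rest. `drop` is a hypothetical knob, not an rdma_lat
 * option.
 */
static struct lat_stats summarize(unsigned long *s, size_t n, size_t drop)
{
    struct lat_stats st;
    double sum = 0.0;
    size_t kept, i;

    assert(n > drop);
    qsort(s, n, sizeof *s, cmp_cycles);
    kept = n - drop;

    for (i = 0; i < kept; i++)
        sum += (double)s[i];

    st.min = s[0];
    st.max = s[kept - 1];
    st.median = s[kept / 2];    /* upper median when kept is even */
    st.mean = sum / (double)kept;
    return st;
}
```

 Calling summarize(samples, n, 2) corresponds to ditching the last two
 sorted readings before computing min, max, median and mean.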
 
 
 thanks,
 grant
 
 Signed-off-by: Grant Grundler [EMAIL PROTECTED]

Thanks, checked in.


-- 
MST - Michael S. Tsirkin
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: diff-perftest-07 replace pp_get_local_lid()

2005-05-19 Thread Michael S. Tsirkin
Quoting r. Grant Grundler [EMAIL PROTECTED]:
 Subject: diff-perftest-07 replace pp_get_local_lid()
 
 Hi Michael,
 
 Following patch to rdma_lat.c:
 o replaces pp_get_local_lid with code from ibv_pingpong.
   This calls into libibverbs instead of fishing around in /sys FS.
 
 o makes two minor white space fix-ups.
 
 Signed-off-by: Grant Grundler [EMAIL PROTECTED]
 
 
 I'd like to slowly restructure main() into multiple distinct parts:
   1) parameter parsing/setting
   2) global data init (e.g. srand())
   3) setup connection
   4) negotiate test+parameters with server/client
   5) run test (maybe several iterations with different params)
   6) exit/cleanup
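
 One possible shape for that split, as a sketch only. Every function and
 struct name below is hypothetical and none of it is taken from
 rdma_lat.c; the connection and negotiation phases are stubs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parameter block; not the structure used by rdma_lat.c. */
struct test_params {
    int iters;      /* number of iterations to run */
    int is_client;  /* nonzero when a server name was given */
};

/* Phase: parameter parsing/setting. */
static int parse_params(int argc, char **argv, struct test_params *p)
{
    assert(p != NULL);
    p->iters = 1000;
    p->is_client = 0;
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-n") == 0 && i + 1 < argc)
            p->iters = atoi(argv[++i]);
        else
            p->is_client = 1;   /* treat a bare argument as the server name */
    }
    return p->iters > 0 ? 0 : -1;
}

/* Phase: global data init. */
static void init_globals(void) { srand(1); }

/* Phases: connection setup, negotiation, test run, cleanup (all stubs). */
static int setup_connection(const struct test_params *p) { (void)p; return 0; }
static int negotiate(const struct test_params *p) { (void)p; return 0; }
static int run_test(const struct test_params *p) { return p->iters; }
static void cleanup(void) { }

/* What main() would shrink to: one call per phase. */
static int run_all(int argc, char **argv)
{
    struct test_params p;
    int done;

    if (parse_params(argc, argv, &p))
        return -1;
    init_globals();
    if (setup_connection(&p) || negotiate(&p)) {
        cleanup();
        return -1;
    }
    done = run_test(&p);
    cleanup();
    return done;
}
```

 With the phases separated like this, repeating the negotiate and run
 steps in a loop is a local change to run_all().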
 
 I'm thinking about how to keep the server running and iterating.
 The goal is to be able to run a sequence of tests just
 from the client side.
 
 Or is this a waste of time?
 Should I rather be looking at fixing up netperf to support IB?
 
 thanks,
 grant
 

thanks, checked in.

-- 
MST - Michael S. Tsirkin


[openib-general] Re: [RFC] [PATCH] user_mad: Support RMPP on send side

2005-05-19 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [RFC] [PATCH] user_mad: Support RMPP on send side
 
 This looks OK to check in with one small comment on the following:
 
 - if (copy_to_user(buf, &packet->mad, sizeof packet->mad))
 + if (copy_to_user(buf, &packet->mad,
 +  min(count, packet->length +
 +  sizeof (struct ib_user_mad))))
   ret = -EFAULT;
   else
 - ret = sizeof packet->mad;
 + ret = count;
 
 This code will truncate a received MAD that is bigger than the buffer
 passed into read(), but return the full size of the packet.  I don't
 think read() is allowed to do this: the return value can be at most
 the count value passed in by the user.
 
 I think we have two options: truncate and return the actual amount of
 data read to the user, or return an error if the user's buffer is too
 small.
 
  - R.
 

If you truncate, how will the user know the MAD was truncated?
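
The two options Roland lists can be sketched in user space as follows.
This is an illustration of the read() contract only, not the user_mad
code: memcpy() stands in for copy_to_user(), the function names are
invented, and -ENOSPC is an arbitrary choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Option 1: copy what fits and return the number of bytes copied. */
static ssize_t read_truncate(void *buf, size_t count,
                             const void *pkt, size_t pkt_len)
{
    size_t n = count < pkt_len ? count : pkt_len;

    assert(buf && pkt);
    memcpy(buf, pkt, n);
    return (ssize_t)n;          /* never larger than count */
}

/* Option 2: refuse the read when the buffer is too small. */
static ssize_t read_or_fail(void *buf, size_t count,
                            const void *pkt, size_t pkt_len)
{
    assert(buf && pkt);
    if (count < pkt_len)
        return -ENOSPC;         /* error code chosen for illustration */
    memcpy(buf, pkt, pkt_len);
    return (ssize_t)pkt_len;
}
```

With option 1 the return value alone cannot signal truncation; the
caller would need a length carried in the data itself to notice it.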

-- 
MST - Michael S. Tsirkin


[openib-general] registering read-only memory

2005-05-19 Thread Michael S. Tsirkin
Roland, the following code snippet:

const char foo[]="Michael Tsirkin";
ibv_reg_mr(ctx->pd, foo, strlen(foo), 0);

exposes two problems with ibv_reg_mr:

1. Compiling this code I get a warning:
warning: passing arg 2 of `ibv_reg_mr' discards qualifiers
from pointer target type.

Same if foo is declared volatile.

I suggest changing ibv_reg_mr to accept const volatile void *
as a second parameter.
const is OK since ibv_reg_mr never actually writes to the buffer.
volatile is needed by some applications to prevent the compiler from
assuming it can re-order accesses to this buffer.
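
A small stand-alone illustration of the qualifier point. fake_reg() is a
hypothetical stand-in, not the libibverbs function; it only shows that a
const volatile void * parameter accepts const, volatile and unqualified
buffers without a cast.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a registration call. It inspects only the
 * address and length and never writes through the pointer.
 */
static int fake_reg(const volatile void *addr, size_t length)
{
    return addr != NULL && length > 0;
}

static void demo(void)
{
    const char ro[] = "Michael Tsirkin";
    volatile char vol[16] = { 0 };
    char plain[16] = { 0 };

    /* All three convert implicitly; no qualifiers are discarded. */
    assert(fake_reg(ro, strlen(ro)));
    assert(fake_reg(vol, sizeof vol));
    assert(fake_reg(plain, sizeof plain));
}
```

Adding qualifiers in a pointer conversion is always allowed in C, so
existing callers that pass a plain void * are unaffected.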

Patch attached (below).

2. ibv_reg_mr fails.
Why is that?


System details:

gcc --version
gcc (GCC) 3.3.3 (SuSE Linux)

uname -a
Linux swlab156 2.6.11-openib #29 SMP Mon Apr 18 16:17:51 IDT 2005
x86_64 x86_64 x86_64 GNU/Linux

Make ibv_reg_mr accept buffer as volatile const *.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Index: libibverbs/include/infiniband/verbs.h
===
--- libibverbs/include/infiniband/verbs.h   (revision 2408)
+++ libibverbs/include/infiniband/verbs.h   (working copy)
@@ -499,7 +499,7 @@ extern int ibv_dealloc_pd(struct ibv_pd 
 /**
  * ibv_reg_mr - Register a memory region
  */
-extern struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr,
+extern struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, volatile const void *addr,
 size_t length, enum ibv_access_flags access);
 
 /**
Index: libibverbs/src/verbs.c
===
--- libibverbs/src/verbs.c  (revision 2408)
+++ libibverbs/src/verbs.c  (working copy)
@@ -64,7 +64,7 @@ int ibv_dealloc_pd(struct ibv_pd *pd)
return pd->context->ops.dealloc_pd(pd);
 }
 
-struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr,
+struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, volatile const void *addr,
  size_t length, enum ibv_access_flags access)
 {
struct ibv_mr *mr;
Index: libmthca/src/mthca.h
===
--- libmthca/src/mthca.h(revision 2408)
+++ libmthca/src/mthca.h(working copy)
@@ -260,7 +260,7 @@ extern int mthca_query_port(struct ibv_c
 extern struct ibv_pd *mthca_alloc_pd(struct ibv_context *context);
 extern int mthca_free_pd(struct ibv_pd *pd);
 
-extern struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, void *addr,
+extern struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, volatile const void *addr,
   size_t length, enum ibv_access_flags access);
 extern int mthca_dereg_mr(struct ibv_mr *mr);
 
Index: libmthca/src/verbs.c
===
--- libmthca/src/verbs.c(revision 2408)
+++ libmthca/src/verbs.c(working copy)
@@ -93,7 +93,8 @@ int mthca_free_pd(struct ibv_pd *pd)
return 0;
 }
 
-static struct ibv_mr *__mthca_reg_mr(struct ibv_pd *pd, void *addr,
+static struct ibv_mr *__mthca_reg_mr(struct ibv_pd *pd,
+volatile const void *addr,
 size_t length, uint64_t hca_va,
 enum ibv_access_flags access)
 {
@@ -113,7 +114,7 @@ static struct ibv_mr *__mthca_reg_mr(str
return mr;
 }
 
-struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, void *addr,
+struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, volatile const void *addr,
size_t length, enum ibv_access_flags access)
 {
return __mthca_reg_mr(pd, addr, length, (uintptr_t) addr, access);
-- 
MST - Michael S. Tsirkin


RE: [openib-general] CM private data

2005-05-19 Thread Or Gerlitz
Caitlin> I believe that an IT-API it_ep_accept() supplies private data
Caitlin> that it expects to be delivered to its peer when the three-way
Caitlin> handshake option is selected.

Both IT-API it_ep_accept() and DAT dat_cr_accept() cause the provider at
the passive side to send a --REP--,
so how is the --RTU-- related to the _accept calls?

I think you meant to say that exposing only a two-way handshake means that
the consumer cannot supply private data for
the RTU as he is not aware of it (the RTU) being sent?

Or.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Caitlin Bestler
Sent: Thursday, May 19, 2005 2:16 AM
To: Sean Hefty
Cc: openib-general
Subject: Re: [openib-general] CM private data

I believe that an IT-API it_ep_accept() supplies private data that it
expects to be delivered to its peer when the three-way handshake option
is selected.

DAT only exposes a two-way handshake, so there it never requires private
data on the RTU.

I don't know if any IT-API applications actually require the three-way
handshake.


On 5/18/05, Sean Hefty [EMAIL PROTECTED] wrote:
 Do any applications make use of the private data in these CM messages:
 RTU, MRA, or DREP?  I'm doubtful of the RTU or DREP, but not as sure
 of the MRA.
 
 Since no replies are generated in response to these messages, the CM 
 does not keep them after their sends complete.  However, it may need 
 to resend the messages.  For example, it will resend the RTU if a 
 duplicate REP is received.
 
 If no applications are using the private data, I will not worry about 
 storing the private data for the retransmissions at this time.
 
 - Sean


Re: [openib-general] CM private data

2005-05-19 Thread Caitlin Bestler
With a two-way handshake only the passive side accepts the 
connection request (it_ep_accept() or dat_cr_accept()). IT-API
defines an optional *three-way* handshake where the active
side must *also* call it_ep_accept() before the final connection
establishment can proceed.

This does not map to iWARP/MPA in any standard way, so
rather than defining a protocol to wrap the first handshake
private data, DAT decided to stick with only two-way 
handshaking.

There has been no screaming, so we can presume that
*most* applications find two-way handshaking acceptable.

However an IT-API provider is free to say that it supports
three-way handshaking, and I believe it is implied (if not
required) that InfiniBand providers do so.

What I do not know is if any actual applications have ever
made use of this capability, or if it is only a defensive
encoding of a protocol option into an API that prefers
to avoid removing options that are available at the
wire level. DAT emphasized providing a clean API for
transport neutral services, and so simply said use
two-way handshaking.

In any event, if no support is going to be provided for
private data on the second it_ep_accept at the verb
layer then that should be explicitly documented, and
I'd suggest sending a 'heads up' to the IT-API authors.

On 5/19/05, Or Gerlitz [EMAIL PROTECTED] wrote:
 Caitlin> I believe that an IT-API it_ep_accept() supplies private data
 Caitlin> that it expects to be delivered to its peer when the three-way
 Caitlin> handshake option is selected.
 
 Both IT-API it_ep_accept() and DAT dat_cr_accept() cause the provider at
 the passive side to send a --REP--,
 so how is the --RTU-- related to the _accept calls?
 
 I think you meant to say that exposing only a two-way handshake means that
 the consumer cannot supply private data for
 the RTU as he is not aware of it (the RTU) being sent?
 
 Or.
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Caitlin Bestler
 Sent: Thursday, May 19, 2005 2:16 AM
 To: Sean Hefty
 Cc: openib-general
 Subject: Re: [openib-general] CM private data
 
 I believe that an IT-API it_ep_accept() supplies private data that it
 expects to be delivered to its peer when the three-way handshake option
 is selected.
 
 DAT only exposes a two-way handshake, so there it never requires private
 data on the RTU.
 
 I don't know if any IT-API applications actually require the three-way
 handshake.
 
 
 On 5/18/05, Sean Hefty [EMAIL PROTECTED] wrote:
  Do any applications make use of the private data in these CM messages:
  RTU, MRA, or DREP?  I'm doubtful of the RTU or DREP, but not as sure
  of the MRA.
 
  Since no replies are generated in response to these messages, the CM
  does not keep them after their sends complete.  However, it may need
  to resend the messages.  For example, it will resend the RTU if a
  duplicate REP is received.
 
  If no applications are using the private data, I will not worry about
  storing the private data for the retransmissions at this time.
 
  - Sean


[openib-general] [PATCH] management: resource leak fixes

2005-05-19 Thread Michael S. Tsirkin
Management libraries leak resources (memory, file/directory handles).
Also a trailing whitespace fix in one place in libibcommon.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Index: libibumad/src/umad.c
===
--- libibumad/src/umad.c(revision 2411)
+++ libibumad/src/umad.c(working copy)
@@ -357,8 +357,10 @@ get_ca(char *ca_name, umad_ca_t *ca)
if (!(dir = opendir(dir_name)))
return -ENOENT;
 
-   if ((r = scandir(dir_name, &namelist, 0, alphasort)) < 0)
-   return -EIO;
+   if ((r = scandir(dir_name, &namelist, 0, alphasort)) < 0) {
+   ret = errno  0 ? errno : -EIO;
+   goto error;
+   }
 
ret = 0;
ca-numports = 0;
@@ -388,6 +390,7 @@ get_ca(char *ca_name, umad_ca_t *ca)
free(namelist[i]);
free(namelist);
 
+   closedir(dir);
put_ca(ca);
return 0;
 
@@ -395,7 +398,8 @@ clean:
for (i = 0; i < r; i++)
free(namelist[i]);
free(namelist);
-
+error:
+   closedir(dir);
release_ca(ca);
 
return ret;
Index: libibcommon/src/sysfs.c
===
--- libibcommon/src/sysfs.c (revision 2411)
+++ libibcommon/src/sysfs.c (working copy)
@@ -56,6 +56,7 @@
 #include <sys/poll.h>
 #include <syslog.h>
 #include <netinet/in.h>
+#include <errno.h>
 
 #include common.h
 
@@ -86,14 +87,20 @@ sys_read_string(char *dir_name, char *fi
if ((fd = open(path, O_RDONLY)) < 0)
return ret_code();
 
-   if ((r = read(fd, str, max_len)) < 0)
+   if ((r = read(fd, str, max_len)) < 0) {
+   int e = errno;
+   close(fd);
+   errno = e;
return ret_code();
+   }
 
str[(r < max_len) ? r : max_len - 1] = 0;
 
if ((s = strrchr(str, '\n')))
*s = 0;
-   return 0;   
+
+   close(fd);
+   return 0;
 }
 
 int
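
The pattern used in the sysfs.c hunk (close the descriptor on every
exit path, and save errno across close()) can be shown stand-alone.
This is a sketch under assumptions: read_first_line() is an invented
name, it returns -errno directly rather than going through the
library's ret_code(), and it reserves one byte for the terminator.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Read up to max_len - 1 bytes of `path` into str, cut the string at the
 * last newline, and close the descriptor on every path.
 * Returns 0 on success or a negative errno value.
 */
static int read_first_line(const char *path, char *str, size_t max_len)
{
    ssize_t r;
    char *s;
    int fd, e;

    assert(max_len > 0);
    if ((fd = open(path, O_RDONLY)) < 0)
        return -errno;

    if ((r = read(fd, str, max_len - 1)) < 0) {
        e = errno;      /* close() may overwrite errno */
        close(fd);
        return -e;
    }
    str[r] = '\0';
    if ((s = strrchr(str, '\n')))
        *s = '\0';

    close(fd);
    return 0;
}
```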

-- 
MST - Michael S. Tsirkin


[openib-general] 2.6.12 question

2005-05-19 Thread Parks Fields
Hello
Is there a branch for 2.6.12?
We have noticed that we get different performance when built on 2.6.11-5 vs 
2.6.12-rc4.

thanks


[openib-general] Re: 2.6.12 question

2005-05-19 Thread Michael S. Tsirkin
Quoting r. Parks Fields [EMAIL PROTECTED]:
 Subject: 2.6.12 question
 
 Hello
 
 Is there a branch for 2.6.12?
 We have noticed that we get different performance when built on 2.6.11-5 vs 
 2.6.12-rc4.
 
 thanks
 

1. Could you elaborate please?
2. I suggest you make sure the kernels are built with all the same options,
   especially regarding security, networking and cpu type options.

-- 
MST - Michael S. Tsirkin


[openib-general] Re: [PATCH] management: resource leak fixes

2005-05-19 Thread Hal Rosenstock
On Thu, 2005-05-19 at 09:24, Michael S. Tsirkin wrote:
 Management libraries leak resources (memory, file/directory handles).
 Also a trailing whitespace fix in one place in libibcommon.
 
 Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Thanks. Applied.

-- Hal



[openib-general] Re: 2.6.12 question

2005-05-19 Thread Parks Fields

> 2. I suggest you make sure the kernels are built with all the same options,
>    especially regarding security, networking and cpu type options.

This is on the same box, but I have to check the .config files.


Re: [openib-general] 2.6.12 question

2005-05-19 Thread Roland Dreier
    Parks> Is there a branch for 2.6.12?  We have noticed that we get
    Parks> different performance when built on 2.6.11-5 vs 2.6.12-rc4.

Obviously the drivers/infiniband already in 2.6.12-rc4 should work
fine.  Current svn should also work -- I think the only change
required is for SDP, and Tom Duffy posted a patch.

 - R.


[openib-general] Re: registering read-only memory

2005-05-19 Thread Roland Dreier
    Michael> 2. ibv_reg_mr fails.  Why is that?

How does it fail?

  R.


Re: [openib-general] CM private data

2005-05-19 Thread Sean Hefty
Caitlin Bestler wrote:
> In any event, if no support is going to be provided for
> private data on the second it_ep_accept at the verb
> layer then that should be explicitly documented, and
> I'd suggest sending a 'heads up' to the IT-API authors,

To clarify, I was only trying to determine when to implement this, not if. 
Based on the feedback, I will try to fix this as part of my next set of 
changes to the CM.

Thanks,
Sean


[openib-general] Re: sdp_link.c info_list locking question

2005-05-19 Thread Libor Michalek
On Thu, May 19, 2005 at 11:12:38AM +0300, Michael S. Tsirkin wrote:
 Libor, I'm looking at sdp_link.c, and I don't see any lock
 protecting the info_list linked list.
 What prevents sdp_path_info_lookup from being called while
 sdp_path_info_create or sdp_path_info_destroy is in progress?

Michael, you are correct that the locking is missing, there's
actually a line about it in the TODO file. When I moved the
sdp code to gen2 it was easier to rewrite sdp_link because of
all the other junk that was in the old version, but to get
something done quickly I left out locking. I figured eventually
the IP to PathRecord service would be broken into a separate
module, like the old ip2pr, since there is the possibility for
multiple consumers.  However, that has yet to happen, so if
someone wants to add the locking, feel free.


-Libor


[openib-general] Re: sdp_link.c info_list locking question

2005-05-19 Thread Michael S. Tsirkin
Quoting r. Libor Michalek [EMAIL PROTECTED]:
 Subject: Re: sdp_link.c info_list locking question
 
 On Thu, May 19, 2005 at 11:12:38AM +0300, Michael S. Tsirkin wrote:
  Libor, I'm looking at sdp_link.c, and I don't see any lock
  protecting the info_list linked list.
  What prevents sdp_path_info_lookup from being called while
  sdp_path_info_create or sdp_path_info_destroy is in progress?
 
 Michael, you are correct that the locking is missing, there's
 actually a line about it in the TODO file. When I moved the
 sdp code to gen2 it was easier to rewrite sdp_link because of
 all the other junk that was in the old version, but to get
 something done quickly I left out locking. I figured eventually
 the IP to PathRecord service would be broken into a separate
 module, like the old ip2pr, since there is the possibility for
 multiple consumers.  However, that has yet to happen, so if
 someone wants to add the locking, feel free.
 
 
 -Libor
 

OK. I might look at this next week.
First step would be to switch to standard list macros, though.
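
A user-space sketch of the kind of locking being discussed, under
stated assumptions: a pthread mutex stands in for whatever kernel lock
would actually be used, the list is a hand-rolled singly linked list
rather than the kernel list macros, and struct path_info and the
info_* functions are invented names, not the sdp_link.c ones.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a path info entry keyed by destination. */
struct path_info {
    unsigned int dst;
    struct path_info *next;
};

static struct path_info *info_list;
static pthread_mutex_t info_lock = PTHREAD_MUTEX_INITIALIZER;

static int info_create(unsigned int dst)
{
    struct path_info *info = malloc(sizeof *info);

    if (!info)
        return -1;
    info->dst = dst;
    pthread_mutex_lock(&info_lock);
    info->next = info_list;
    info_list = info;
    pthread_mutex_unlock(&info_lock);
    return 0;
}

/* Returns 1 if dst is present. The lock is held for the whole walk. */
static int info_lookup(unsigned int dst)
{
    struct path_info *info;
    int found = 0;

    pthread_mutex_lock(&info_lock);
    for (info = info_list; info; info = info->next)
        if (info->dst == dst) {
            found = 1;
            break;
        }
    pthread_mutex_unlock(&info_lock);
    return found;
}

static void info_destroy(unsigned int dst)
{
    struct path_info **pp, *victim = NULL;

    pthread_mutex_lock(&info_lock);
    for (pp = &info_list; *pp; pp = &(*pp)->next)
        if ((*pp)->dst == dst) {
            victim = *pp;
            *pp = victim->next;     /* unlink while still holding the lock */
            break;
        }
    pthread_mutex_unlock(&info_lock);
    free(victim);                   /* free(NULL) is a no-op */
}
```

The lookup returns a flag rather than a pointer on purpose: handing out
a pointer after dropping the lock would need reference counting so that
a concurrent destroy cannot free the entry underneath the caller.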

-- 
MST - Michael S. Tsirkin


RE: [openib-general] CM private data

2005-05-19 Thread Fab Tillier
 From: Caitlin Bestler [mailto:[EMAIL PROTECTED]
 Sent: Thursday, May 19, 2005 9:25 AM
 
 So being predictably unreliable for one implementation
 stage is certainly something you can get away with.
 Even when you add support it might be quite acceptable
 to send the private data *only* on the first try, or to
 require the IT-API layer to do the retries.

Only sending the user's private data on the first try requires the user to
support connection establishment with:
- no private data (no RTU received)
- zero private data (RTU retry)
- remote peer's private data (first RTU)

The receipt of private data does not mean that private data is actually what
the remote side sent, and also requires users to never use all zero private
data since that would make the second and third case above
indistinguishable.
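
To make the ambiguity concrete, here is a sketch of what a receiver
could tell apart. The enum, classify() and the 8-byte RTU_PRIV_LEN are
invented for illustration; the real private data size comes from the IB
specification and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RTU_PRIV_LEN 8  /* illustrative size only, not the IB spec value */

enum rtu_view {
    RTU_NONE,       /* connection established before any RTU arrived */
    RTU_ZERO,       /* all-zero private data */
    RTU_PEER_DATA   /* nonzero private data */
};

/* pdata == NULL models "no RTU received". */
static enum rtu_view classify(const unsigned char *pdata)
{
    static const unsigned char zero[RTU_PRIV_LEN];

    if (!pdata)
        return RTU_NONE;
    if (memcmp(pdata, zero, RTU_PRIV_LEN) == 0)
        return RTU_ZERO;
    return RTU_PEER_DATA;
}
```

A retried RTU carrying zeros and a first RTU whose sender really chose
all-zero data both classify as RTU_ZERO, which is the indistinguishable
case described above.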

So if the CM is going to expose the private data, it needs to put that
private data in all retries.

I'm still for hiding the RTU private data.  I think it's useless because
it's unreliable - anything exchanged via private data in the RTU must also
be exchanged by other means in case the connection is established before the
RTU is received.  Any ULPs that depend on the RTU private data are setting
themselves up for potential failures.

Just my opinion, though...

- Fab



Re: [openib-general] CM private data

2005-05-19 Thread Sean Hefty
Fab Tillier wrote:
> I'm still for hiding the RTU private data.  I think it's useless because
> it's unreliable - anything exchanged via private data in the RTU must also
> be exchanged by other means in case the connection is established before the
> RTU is received.  Any ULPs that depend on the RTU private data are setting
> themselves up for potential failures.

This depends on the implementation.  If the server side of a connection 
initiates the data transfer, it cannot do so until an RTU is received.

- Sean


Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Vivek Kashyap
snip...

 
 The most interesting optimization available is implementing the IPoIB
 connected mode draft, although I don't think it's as easy as Vivek
 indicated -- for example, I'm not sure how to deal with having
 different MTUs depending on the destination.

snip...

The draft does allow per-connection negotiation for implementations
that wish to take advantage of it. However, an implementation can by default
choose to always use a 'connected-mode MTU', e.g. 32K. It can then, for every
connection, try to negotiate this value and, if that is not workable, fall
back to UD mode and refuse connected mode. The ARP entries hold the
connected-mode flags, thereby keeping track of the mode to use per destination.
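
A sketch of that default policy, with everything in it assumed for
illustration: struct neigh_entry stands in for the flags kept with an
ARP entry, choose_mode() is an invented name, and the UD MTU value is a
placeholder rather than something taken from the draft.

```c
#include <assert.h>

enum ipoib_mode { MODE_UD, MODE_CONNECTED };

/* Hypothetical per-destination state, standing in for ARP entry flags. */
struct neigh_entry {
    enum ipoib_mode mode;
    unsigned int mtu;
};

#define DEFAULT_CM_MTU 32768u   /* the "32K" example from the text */
#define UD_MTU          2044u   /* placeholder UD value, an assumption */

/* Use connected mode only if the peer can receive the default MTU. */
static void choose_mode(struct neigh_entry *n, int peer_offers_cm,
                        unsigned int peer_recv_mtu)
{
    assert(n);
    if (peer_offers_cm && peer_recv_mtu >= DEFAULT_CM_MTU) {
        n->mode = MODE_CONNECTED;
        n->mtu = DEFAULT_CM_MTU;
    } else {
        n->mode = MODE_UD;      /* fall back and refuse connected mode */
        n->mtu = UD_MTU;
    }
}
```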

I'd be more than happy to discuss other implementation issues. As I noted 
earlier it will also help refine the draft.

Vivek

P.S. cc replies to [EMAIL PROTECTED]




[openib-general] Re: user_mad::ib_umad_read question

2005-05-19 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock [EMAIL PROTECTED]:
 Subject: user_mad::ib_umad_read question
 
 Hi,
 
 In ib_umad_read, there is currently (or soon to be something like) the
 following:
   ...
 packet = list_entry(file->recv_list.next, struct ib_umad_packet, list);
 list_del(&packet->list);
 
 spin_unlock_irq(&file->recv_lock);
 
 if (copy_to_user(buf, &packet->mad,
                  min(count, packet->length +
                      sizeof (struct ib_user_mad))))
         ret = -EFAULT;
 else
         ret = count;
 
 kfree(packet);
 return ret;
 
 Should the packet be thrown away because copy_to_user() fails ?
 Shouldn't it be placed back at the head of the list ? Unfortunately,
 that would mean holding the recv lock longer (through the duration of
 copy_to_user).
 
 -- Hal

copy_to_user might sleep, so you can't call it under a spinlock.

Since the user is only hurting himself by passing an illegal address,
I'd think it doesn't hurt to drop the MAD.
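
The ordering can be illustrated in user space: take the packet off the
list while holding the lock, release the lock, and only then do the
potentially slow copy. This is an analogy, not the user_mad code. A
pthread mutex replaces the spinlock, memcpy() replaces copy_to_user(),
all names are invented, and queue order is not modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct packet {
    struct packet *next;
    size_t length;
    char data[64];
};

static struct packet *recv_head;
static pthread_mutex_t recv_lock = PTHREAD_MUTEX_INITIALIZER;

static int enqueue(const char *data, size_t length)
{
    struct packet *p;

    if (length > sizeof p->data)
        return -1;
    p = malloc(sizeof *p);
    if (!p)
        return -1;
    memcpy(p->data, data, length);
    p->length = length;
    pthread_mutex_lock(&recv_lock);
    p->next = recv_head;
    recv_head = p;
    pthread_mutex_unlock(&recv_lock);
    return 0;
}

/* Returns the number of bytes copied, or -1 if nothing is queued. */
static long dequeue_and_copy(char *buf, size_t count)
{
    struct packet *p;
    size_t n;

    pthread_mutex_lock(&recv_lock);
    p = recv_head;
    if (p)
        recv_head = p->next;
    pthread_mutex_unlock(&recv_lock);   /* drop the lock before copying */

    if (!p)
        return -1;
    n = count < p->length ? count : p->length;
    memcpy(buf, p->data, n);
    free(p);    /* the packet is consumed whatever the copy delivered */
    return (long)n;
}
```

Because the packet is already unlinked when the copy runs, a failed
copy in the real code loses the packet, which is the trade-off under
discussion.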

-- 
MST - Michael S. Tsirkin


Re: [openib-general] CM private data

2005-05-19 Thread Libor Michalek
On Thu, May 19, 2005 at 09:34:16AM -0700, Fab Tillier wrote:
  From: Caitlin Bestler [mailto:[EMAIL PROTECTED]
  Sent: Thursday, May 19, 2005 9:25 AM
  
  So being predictably unreliable for one implementation
  stage is certainly something you can get away with.
  Even when you add support it might be quite acceptable
  to send the private data *only* on the first try, or to
  require the IT-API layer to do the retries.
 
 I'm still for hiding the RTU private data.  I think it's useless because
 it's unreliable - anything exchanged via private data in the RTU must also
 be exchanged by other means in case the connection is established before the
 RTU is received.  Any ULPs that depend on the RTU private data are setting
 themselves up for potential failures.

  I agree for exactly the reason you give, I can't think of a legitimate
use for RTU private data. I'd get rid of it entirely, from the code as
well as the spec, which is why I think it would be a waste of someone's
time to add the correct support for it.

-Libor


Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Hal Rosenstock
Hi Vivek,

On Thu, 2005-05-19 at 12:41, Vivek Kashyap wrote:
 snip...
 
  
  The most interesting optimization available is implementing the IPoIB
  connected mode draft, although I don't think it's as easy as Vivek
  indicated -- for example, I'm not sure how to deal with having
  different MTUs depending on the destination.
 
 snip...
 
 The draft does allow for a negotiation per connection for the implementations
 that wish to take advantage of it. However, an implementation can by default
 choose to use a 'connected-mode MTU' e.g. 32K always. It can then, for every 
 connection choose to, negotiate to this value and if it is not workable fall 
 back to the UD mode and deny the connection mode. The ARP entries hold the 
 connected mode flags thereby keeping track of the mode to use per 
 destination. 

Sounds like there should be an agreement on a default connected mode
MTU or else this will drop down to UD.

I have a couple of clarification questions on 5.1 Per-Connection MTU:

1. I presume the Receive MTU is in the first 2 bytes of the private data
in the CM messages. Is that correct ?

2. Also, CM REQ is mentioned for the requester receive MTU. Wouldn't CM
REP carry the granted receive MTU which is constrained to be the
requested MTU or less ? So 2 things on this:
The I-D says "The private data field MUST carry the receive MTU." Does
that include RTUs ?

Thanks.

-- Hal



[openib-general] [PATCH] remove redundant check in mthca_provider.c

2005-05-19 Thread James Lentini
Remove redundant check of pd->uobject in mthca_provider.c

Signed-off-by: James Lentini [EMAIL PROTECTED]

Index: infiniband/hw/mthca/mthca_provider.c
===
--- infiniband/hw/mthca/mthca_provider.c(revision 2404)
+++ infiniband/hw/mthca/mthca_provider.c(working copy)
@@ -478,9 +478,7 @@
kfree(qp);
return ERR_PTR(err);
}
-   }
 
-   if (pd->uobject) {
qp->mr.ibmr.lkey = ucmd.lkey;
qp->sq.db_index  = ucmd.sq_db_index;
qp->rq.db_index  = ucmd.rq_db_index;

[openib-general] Re: diff-perftest-07 replace pp_get_local_lid()

2005-05-19 Thread Grant Grundler
On Thu, May 19, 2005 at 08:51:26AM +0300, Michael S. Tsirkin wrote:
 Quoting r. Michael S. Tsirkin [EMAIL PROTECTED]:
  Subject: Re: diff-perftest-07 replace pp_get_local_lid()
   Should I rather be looking at fixing up netperf to support IB?
   
   thanks,
   grant
   
  
  That may be kind of hard, given that uverbs API is completely different
  from socket API.
 
 Wait, isn't this what SDP is doing?

SDP is a thin, kernel translation layer from IP sockets to verbs, right?
The idea I'm chasing is for netperf to use uverbs directly like
rdma_lat does.

grant


Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Roland Dreier
    Vivek> The draft does allow for a negotiation per connection for
    Vivek> the implementations that wish to take advantage of
    Vivek> it. However, an implementation can by default choose to use
    Vivek> a 'connected-mode MTU' e.g. 32K always. It can then, for
    Vivek> every connection choose to, negotiate to this value and if
    Vivek> it is not workable fall back to the UD mode and deny the
    Vivek> connection mode. The ARP entries hold the connected mode
    Vivek> flags thereby keeping track of the mode to use per
    Vivek> destination.

But this means that the MTU of the link will be different for UD
destinations (including multicast) and RC destinations, right?  Or am
I missing something?

 - R.


Re: [openib-general] [PATCH] rewrite perftest/README

2005-05-19 Thread Grant Grundler
On Thu, May 19, 2005 at 06:32:32AM +0100, Paul Baxter wrote:
 Grant, Michael
 
 Great work!
 
 I wanted to point out a recent thread on comp.arch discussing the merits of 
 using the geometric mean instead of the arithmetic mean as a basis for 
 sampling the timing of a population.

Paul,
thanks for pointing out the thread - I'm learning a lot! :^)

 My statistics is a bit rusty, but 
 calculating and presenting GM might be useful as well?

After reading the original posting, I think the answer is NO.

Particularly, two comments in John Mashey's original posting:
| The GM is the correct mean for combining benchmark results intended
| to be a sample from a larger population of programs and intended to
| predict the performance distribution of other benchmarks, which after
| all, is what people want for generalized performance comparisons.
| This is true if the population follows a *lognormal* [described later]
| distribution ... and it turns out, many do so. 


rdma_lat is one benchmark - not a combination of benchmark results.
If perftest ends up with more than two benchmarks, we can try to
look at a weighted geometric mean to get one number from running the whole
mess. But TBH, I have no interest or need to do that. Sounds like
statistical wanking off to me. I'm much more interested in comparing
individual rdma_lat runs and sorting out why the results
are different.


| If you know your workload, your benchmarks *are* the population,
| and you use algebra. 

We know pretty well what we are measuring with rdma_lat.
Well, Michael does. I only have a dangerously vague understanding. :^)

I know we have two sets of data: warmup and runtime.
The first two or three measurements are related to warmup.
John Mashey's posting suggested keeping separate groups of data apart.
Ie. it was probably correct to not include warmup values in the
std deviation and arithmetic mean (avg).

Conclusion: The median value still looks like the right thing to report.
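
To make the median-vs-mean point concrete, here is a minimal sketch in C.
It skips a number of leading warmup samples and then computes the median
and the arithmetic mean of what is left. The function names, the `skip`
parameter and the sample values are invented for this illustration; none
of this is taken from rdma_lat.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for cycle counts. */
static int cmp_cycles(const void *a, const void *b)
{
	unsigned long long x = *(const unsigned long long *)a;
	unsigned long long y = *(const unsigned long long *)b;

	return (x > y) - (x < y);
}

/*
 * Median of samples[skip..n-1] (the first 'skip' entries are treated as
 * warmup and ignored). Returns 0 on success, -1 if nothing is left to
 * measure or memory cannot be allocated.
 */
static int lat_median(const unsigned long long *samples, size_t n,
		      size_t skip, double *median)
{
	unsigned long long *sorted;
	size_t count;

	assert(samples && median);
	if (skip >= n)
		return -1;
	count = n - skip;

	/* Sort a copy so the caller's sample order is preserved. */
	sorted = malloc(count * sizeof *sorted);
	if (!sorted)
		return -1;
	memcpy(sorted, samples + skip, count * sizeof *sorted);
	qsort(sorted, count, sizeof *sorted, cmp_cycles);

	if (count % 2)
		*median = (double)sorted[count / 2];
	else
		*median = ((double)sorted[count / 2 - 1] +
			   (double)sorted[count / 2]) / 2.0;
	free(sorted);
	return 0;
}

/* Arithmetic mean of samples[skip..n-1]. Returns 0 on success, -1 on error. */
static int lat_mean(const unsigned long long *samples, size_t n,
		    size_t skip, double *mean)
{
	double sum = 0.0;
	size_t i;

	assert(samples && mean);
	if (skip >= n)
		return -1;
	for (i = skip; i < n; i++)
		sum += (double)samples[i];
	*mean = sum / (double)(n - skip);
	return 0;
}
```

With the made-up samples { 90000, 12000, 7100, 7126, 7130, 7140, 50305 }
and skip = 2, the median is 7130 while the single 50305 outlier drags the
mean up to about 15760, which is the kind of distortion the median avoids.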

 http://groups.google.co.uk/group/comp.arch/browse_thread/thread/3f5a9ed1d79ed726/416e58b5e48c1715?q=geometricrnum=6hl=en#416e58b5e48c1715
 
 [comp.arch, 'SPEC use of Geometric Mean', May 13th, 2005, John Mashey ] 

thanks again,
grant


Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Grant Grundler
On Wed, May 18, 2005 at 08:00:15PM -0700, Felix Marti wrote:
 Hi Roland,
  
 define SMP :)

Anytime a CPU is cache coherent with another CPU.

 at these rates, system architecture comes into place,

Definitely. The architecture puts boundaries on how coherency
can be implemented...and thus available memory bus bandwidth.

grant


RE: [openib-general] CM private data

2005-05-19 Thread Rimmer, Todd
All three of these have mechanisms whereby the message can be skipped, in which 
case applications should not depend on the private data (IB spec mentions that 
apps should not depend on them).

For example, an inbound receive while in RTR state after having sent a REP can 
be treated as an RTU, in which case the RTU itself is discarded by the CM if 
it arrives later.
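
A minimal sketch in C of that behaviour, under heavy simplification: the
states, events and field names below are hypothetical and are not taken
from the OpenIB CM. It only shows why an application cannot count on
seeing RTU private data: whichever event arrives first establishes the
connection, and a late RTU is dropped.

```c
#include <assert.h>

/* Hypothetical, simplified passive-side connection states. */
enum conn_state { CONN_REP_SENT, CONN_ESTABLISHED };

/* Hypothetical events: ordinary inbound data, or the RTU message itself. */
enum conn_event { EV_DATA_RECEIVED, EV_RTU_RECEIVED };

struct conn {
	enum conn_state state;
	int rtu_private_data_seen;	/* 1 if RTU private data reached the app */
};

/*
 * Returns 1 if the event established the connection, 0 if it was
 * discarded because the connection was already established.
 */
static int conn_handle_event(struct conn *c, enum conn_event ev)
{
	assert(c);
	if (c->state != CONN_REP_SENT)
		return 0;	/* e.g. a late RTU: discard it */

	c->state = CONN_ESTABLISHED;
	/* Private data is only available if the RTU itself got us here. */
	c->rtu_private_data_seen = (ev == EV_RTU_RECEIVED);
	return 1;
}
```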

Todd Rimmer

 -Original Message-
 From: Tillier, Fabian 
 Sent: Wednesday, May 18, 2005 3:28 PM
 To: 'Sean Hefty'; openib-general
 Subject: RE: [openib-general] CM private data
 
 
  From: Sean Hefty [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, May 18, 2005 12:23 PM
  
  Do any applications make use of the private data in these 
 CM message: RTU,
  MRA, or DREP?  I'm doubtful of the RTU or DREP, but not as 
 sure of the
 MRA.
  
  Since no replies are generated in response to these 
 messages, the CM does
  not keep them after their sends complete.  However, it may 
 need to resend
  the messages.  For example, it will resend the RTU if a 
 duplicate REP is
  received.
  
  If no applications are using the private data, I will not 
 worry about
  storing the private data for the retransmissions at this time.
 
 If you're not going to store them, I would suggest removing 
 the private data
 for the associated calls (RTU, MRA, DREP).  That way it 
 becomes very clear
 to applications wishing to use the private data that they can't.
 
 - Fab
 


[openib-general] Infinipath support in OpenIB

2005-05-19 Thread Paul Baxter
I read with interest the PR blurb at 
http://supercomputingonline.com/article.php?sid=8740
regarding InfiniPath's very low latencies and high throughput for MPI even 
at very modest message sizes.

The article states: 'InfiniPath will also support the OpenIB software stack 
providing full InfiniBand compliance.'

Is this something the OpenIB software developers are working on and 
something that can be commented on in public, or should I speculate/ask 
Pathscale?

Paul 



[openib-general] OOPS: ib_mad crashery on bootup

2005-05-19 Thread Tom Duffy
This is from latest bits, r2414.

Bringing up interface ib0:  stack segment:  [1] SMP
CPU 0
Modules linked in: ext3 jbd dm_mod video container button battery ac ohci_hcd 
tpm_nsc tpm i2c_amd756 i2c_core ib_mthca ib_ipoib ib_sa ib_mad ib_core tg3 
floppy xfs exportfs mptscsih mptbase sd_mod scsi_mod
Pid: 1253, comm: ib_mad1 Not tainted 2.6.12-rc4openib
RIP: 0010:[880f633f] 
880f633f{:ib_mad:ib_mad_send_done_handler+31}
RSP: 0018:81007ed71d48  EFLAGS: 00010296
RAX: 01001508 RBX: 81007f346298 RCX: 
RDX: c2052001 RSI: 81007ed71dc8 RDI: 81007f3461c0
RBP: 01001500 R08: 81007ed7 R09: 81003fa69070
R10: 81003fc4d258 R11: 0008 R12: 81003f33c4b0
R13: 81003f33c4b0 R14: 81007ed71dc8 R15: 880f89a0
FS:  2aad4e00() GS:80489000() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 003a2428ebf0 CR3: 00101000 CR4: 06e0
Process ib_mad1 (pid: 1253, threadinfo 81007ed7, task 81003fa69070)
Stack: 8001  81003fa69070 81007f346298
   81007f3462a0 81003fd8c778 81007f3461c0 81007ed71dc8
   880f89a0 880f8f1b
Call Trace:880f89a0{:ib_mad:ib_mad_completion_handler+0}
   880f8f1b{:ib_mad:ib_mad_completion_handler+1403}
   8016686b{cache_free_debugcheck+715} 
80130343{__wake_up+67}
   880f89a0{:ib_mad:ib_mad_completion_handler+0}
   80148a6c{worker_thread+476} 
80132260{default_wake_function+0}
   8014d0b0{keventd_create_kthread+0} 
80148890{worker_thread+0}
   8014d0b0{keventd_create_kthread+0} 
8014d329{kthread+217}
   80133770{schedule_tail+64} 8010f57f{child_rip+8}
   8014d0b0{keventd_create_kthread+0} 
8014d250{kthread+0}
   8010f577{child_rip+0}

Code: 4c 8b 7d 20 48 89 04 24 48 89 ef 31 db e8 bf 3c 25 f8 49 8b
RIP 880f633f{:ib_mad:ib_mad_send_done_handler+31} RSP 
81007ed71d48
 Error, some other host already uses address 192.168.0.233.

-- 
I wish we lived in the America of yesteryear that only exists in the
minds of us Republicans.
-- Ned Flanders



[openib-general] kdapltest regression? failing now...

2005-05-19 Thread Tom Duffy
I am not sure when this started, but after updating to top of trunk*, I
can no longer get kdapltest to work properly.  Both ipoib and sdp are
working.

Both server and client are returning an error: DAT_INVALID_HANDLE.  This
is coming from ib_create_qp().  With debugging turned on:

[EMAIL PROTECTED] ~]# ./kdapltest -T S -D mthca0a -d
kDAPL: dapl_ia_open (mthca0a, 8, 81000b806308, 81000b8062d8)
kDAPL: dapl_ia_open () returns 0x0
kDAPL: dapl_pz_create (81001ba165c8, 81000b8062e0)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x20, 81000b8062e8)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0xa0, 81000b8062f0)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x10, 81000b806300)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x40, 81000b8062f8)
kDAPL: dapl_ep_create (81001ba165c8, 81001b9442c8, 81001ba164b0, 
81001ba166e0, 81001ba22050, , 81000b806318)
kDAPL:  dapl_ib_qp_alloc: ib_create_qp failed = -22
kDAPL: dapl_evd_free (81001ba22050)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba22168)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba166e0)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba164b0)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_pz_free (81001b9442c8)
kDAPL: dapl_ia_query (81001ba165c8, , , 
81001bba7b28)
kDAPL: dapl_ia_query () returns 0x0
kDAPL: dapl_ia_close (81001ba165c8, 1)
kDAPL: dapl_evd_free (81001ba167f8)
kDAPL: dapl_evd_free () returns 0x0
Server_Cmd.debug:   1
Server_Cmd.dapl_name: mthca0a
DT_cs_Server: IA mthca0a opened
DT_cs_Server: PZ created
DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
DT_cs_Server: Waiting for clients to all go away...
DT_cs_Server: Cleaning up ...
DT_cs_Server: IA mthca0a closed
DT_cs_Server (mthca0a):  Exiting.
TEST INSTANCE 0
TEST return code = 1

Also, the ib_at module prints this out now when you ping (after running
kdapltest)... 

ib_at: ib_at_arp_work: Process IB ARP ip 192.168.0.26 gid 
0xfe82c9010a99e031

-tduffy

* running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm r2414 2 
machines back-2-back



Re: [openib-general] kdapltest regression? failing now...

2005-05-19 Thread James Lentini

I'm looking into this Tom. 

The following code was added to hw/mthca/mthca_qp.c on Friday 
(starting on line 1233):


if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
    qp->sq.max_gs > dev->limits.max_sg || qp->rq.max_gs > dev->limits.max_sg)
	return -EINVAL;

If anyone knows what we have set incorrectly, please let me know.

Thanks,
james

On Thu, 19 May 2005, Tom Duffy wrote:

tduffy I am not sure when this started, but after updating to top of trunk*, I
tduffy can no longer get kdapltest to work properly.  Both ipoib and sdp are
tduffy working.
tduffy 
tduffy Both server and client are returning an error: DAT_INVALID_HANDLE.  This
tduffy is coming from ib_create_qp().  With debugging turned on:
tduffy 
tduffy [EMAIL PROTECTED] ~]# ./kdapltest -T S -D mthca0a -d
tduffy kDAPL: dapl_ia_open (mthca0a, 8, 81000b806308, 81000b8062d8)
tduffy kDAPL: dapl_ia_open () returns 0x0
tduffy kDAPL: dapl_pz_create (81001ba165c8, 81000b8062e0)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x20, 
81000b8062e8)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0xa0, 
81000b8062f0)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x10, 
81000b806300)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x40, 
81000b8062f8)
tduffy kDAPL: dapl_ep_create (81001ba165c8, 81001b9442c8, 
81001ba164b0, 81001ba166e0, 81001ba22050, , 
81000b806318)
tduffy kDAPL:  dapl_ib_qp_alloc: ib_create_qp failed = -22
tduffy kDAPL: dapl_evd_free (81001ba22050)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_evd_free (81001ba22168)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_evd_free (81001ba166e0)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_evd_free (81001ba164b0)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_pz_free (81001b9442c8)
tduffy kDAPL: dapl_ia_query (81001ba165c8, , 
, 81001bba7b28)
tduffy kDAPL: dapl_ia_query () returns 0x0
tduffy kDAPL: dapl_ia_close (81001ba165c8, 1)
tduffy kDAPL: dapl_evd_free (81001ba167f8)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy Server_Cmd.debug:   1
tduffy Server_Cmd.dapl_name: mthca0a
tduffy DT_cs_Server: IA mthca0a opened
tduffy DT_cs_Server: PZ created
tduffy DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
tduffy DT_cs_Server: Waiting for clients to all go away...
tduffy DT_cs_Server: Cleaning up ...
tduffy DT_cs_Server: IA mthca0a closed
tduffy DT_cs_Server (mthca0a):  Exiting.
tduffy TEST INSTANCE 0
tduffy TEST return code = 1
tduffy 
tduffy Also, the ib_at module prints this out now when you ping (after running
tduffy kdapltest)... 
tduffy 
tduffy ib_at: ib_at_arp_work: Process IB ARP ip 192.168.0.26 gid 
0xfe82c9010a99e031
tduffy 
tduffy -tduffy
tduffy 
tduffy * running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm 
r2414 2 machines back-2-back
tduffy 


Re: [openib-general] Infinipath support in OpenIB

2005-05-19 Thread Johann George
Paul:

 The article states: 'InfiniPath will also support the OpenIB software stack 
 providing full InfiniBand compliance.'
 
 Is this something the OpenIB software developers are working on and 
 something that can be commented on in public, or should I speculate/ask 
 Pathscale?

Thanks for asking.  We are actively working on supporting OpenIB with the
help of the OpenIB community and hope to be able to submit code that
supports InfiniPath in the near future.

Johann


Re: [openib-general] Infinipath support in OpenIB

2005-05-19 Thread Sean Hefty
Paul Baxter wrote:
I read with interest the PR blurb at 
http://supercomputingonline.com/article.php?sid=8740
regarding InfiniPath's very low latencies and high throughput for MPI 
even at very modest message sizes.

The article states: 'InfiniPath will also support the OpenIB software 
stack providing full InfiniBand compliance.'

Is this something the OpenIB software developers are working on and 
something that can be commented on in public, or should I speculate/ask 
Pathscale?

I would think that as long as they provide an HCA driver that registers with
the core layer and exposes the proper APIs, they would be able to run with
the openib stack.  I do not know if their code is open source, however.

- Sean


Re: [openib-general] Infinipath support in OpenIB

2005-05-19 Thread Tom Duffy
On Thu, 2005-05-19 at 20:29 +0100, Paul Baxter wrote:
 I read with interest the PR blurb at 
 http://supercomputingonline.com/article.php?sid=8740
 regarding InfiniPath's very low latencies and high throughput for MPI even 
 at very modest message sizes.
 
 The article states: 'InfiniPath will also support the OpenIB software stack 
 providing full InfiniBand compliance.'

Also interesting that OpenIB is suddenly an industry standard.  ;-)

-tduffy



Re: [openib-general] kdapltest regression? failing now...

2005-05-19 Thread James Lentini
For what it's worth, this is the check that we are failing:
 qp->sq.max_gs > dev->limits.max_sg
( qp->sq.max_gs + 2 > dev->limits.max_sg is also true but
  qp->transport == MLX is not).
On Thu, 19 May 2005, James Lentini wrote:
I'm looking into this Tom.
The following code was added to hw/mthca/mthca_qp.c on Friday
(starting on line 1233):
if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
    qp->sq.max_gs > dev->limits.max_sg || qp->rq.max_gs > dev->limits.max_sg)
	return -EINVAL;
If anyone knows what we have set incorrectly, please let me know.
Thanks,
james


Re: [openib-general] RE: [PATCH][kdapl] fix spin_lock_irqsave/spin_unlock_irqrestore i mplementation

2005-05-19 Thread Tom Duffy
On Wed, 2005-05-18 at 22:44 +0300, Itamar Rabenstein wrote:
 evd producer locking is something that we need in openib kdapl
  as it was in Source Forge implementation.

This is not a good excuse.  OpenIB kdapl is a totally different beast
from the sourceforge implementation and will differ over time.

-tduffy



Re: [openib-general] kdapltest regression? failing now...

2005-05-19 Thread James Lentini
I think I figured this out. DAPL was assuming a particular maximum 
scatter gather list size. I'm going to change it to query for this 
value. Hopefully I'll have a fix shortly.
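
As a rough sketch of "query instead of assume", the C fragment below clamps
a requested scatter/gather count to a limit that is passed in rather than
hard-coded. The structs and function names are hypothetical stand-ins; in
the real stack the limit would presumably come from a device query (for
example the max_sge field filled in by ib_query_device()), and this is not
the actual DAPL change.

```c
#include <assert.h>

/* Hypothetical stand-in for the limit reported by the device. */
struct dev_limits {
	int max_sg;
};

/* Hypothetical stand-in for the capabilities a consumer asks for. */
struct qp_request {
	int send_sge;
	int recv_sge;
};

/* Same shape as the non-MLX part of the driver check quoted in this thread. */
static int qp_request_ok(const struct qp_request *req,
			 const struct dev_limits *lim)
{
	assert(req && lim);
	return req->send_sge <= lim->max_sg && req->recv_sge <= lim->max_sg;
}

/*
 * Reduce the request to what the device reports it can do.
 * Returns how many fields had to be lowered.
 */
static int clamp_qp_request(struct qp_request *req,
			    const struct dev_limits *lim)
{
	int reduced = 0;

	assert(req && lim && lim->max_sg > 0);
	if (req->send_sge > lim->max_sg) {
		req->send_sge = lim->max_sg;
		reduced++;
	}
	if (req->recv_sge > lim->max_sg) {
		req->recv_sge = lim->max_sg;
		reduced++;
	}
	return reduced;
}
```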

james
On Thu, 19 May 2005, James Lentini wrote:
For what it's worth, this is the check that we are failing:
qp->sq.max_gs > dev->limits.max_sg
( qp->sq.max_gs + 2 > dev->limits.max_sg is also true but
 qp->transport == MLX is not).
On Thu, 19 May 2005, James Lentini wrote:
I'm looking into this Tom.
The following code was added to hw/mthca/mthca_qp.c on Friday
(starting on line 1233):
if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
    qp->sq.max_gs > dev->limits.max_sg || qp->rq.max_gs > dev->limits.max_sg)
	return -EINVAL;

If anyone knows what we have set incorrectly, please let me know.
Thanks,
james




RE: [openib-general] RE: [PATCH][kdapl] fix spin_lock_irqsave/spi n_unlock_irqrestore i mplementation

2005-05-19 Thread Itamar Rabenstein
I am not saying that we need it in the OpenIB code because it was in the SF code.
I am saying that the OpenIB implementation needs it for the same reason the SF
implementation does.
We must have this lock because, for example, if the same evd is the CM evd
for two EPs, each on a different thread, both can try to add an event to the
evd at the same time.
Therefore we need to lock the evd when we take an empty event from one list
(the empty list) and unlock it after we add the event to the second list
(the waiting-events list).
The lock is taken in one function and released in a second function.

So we need it in our OpenIB code, and the SF code needs it too (;-)
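
For illustration only, here is a small userspace sketch in C of the
two-list pattern described above: an event is taken from a free list and
appended to a pending list while one lock is held. All names are
hypothetical, a pthread mutex stands in for the kernel spinlock, and
unlike the code under discussion this sketch takes and releases the lock
inside a single function.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct event {
	struct event *next;
	int data;
};

/* Hypothetical event dispatcher with a free list and a pending queue. */
struct evd {
	pthread_mutex_t lock;
	struct event *free_list;	/* unused ("empty") events */
	struct event *pending_head;	/* events waiting for the consumer */
	struct event *pending_tail;
};

static void evd_init(struct evd *evd, struct event *pool, size_t n)
{
	size_t i;

	assert(evd && pool);
	pthread_mutex_init(&evd->lock, NULL);
	evd->free_list = NULL;
	evd->pending_head = NULL;
	evd->pending_tail = NULL;
	for (i = 0; i < n; i++) {
		pool[i].next = evd->free_list;
		evd->free_list = &pool[i];
	}
}

/*
 * Move one event from the free list to the tail of the pending queue.
 * Both list updates happen under the same lock, so two producers
 * cannot grab the same free event or corrupt the queue.
 * Returns 0 on success, -1 if no free event is available.
 */
static int evd_post(struct evd *evd, int data)
{
	struct event *ev;
	int ret = -1;

	assert(evd);
	pthread_mutex_lock(&evd->lock);
	ev = evd->free_list;
	if (ev) {
		evd->free_list = ev->next;
		ev->data = data;
		ev->next = NULL;
		if (evd->pending_tail)
			evd->pending_tail->next = ev;
		else
			evd->pending_head = ev;
		evd->pending_tail = ev;
		ret = 0;
	}
	pthread_mutex_unlock(&evd->lock);
	return ret;
}
```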

 Itamar

 -Original Message-
 From: Tom Duffy [mailto:[EMAIL PROTECTED]
 Sent: Thursday, May 19, 2005 11:09 PM
 To: Itamar Rabenstein
 Cc: James Lentini; openib-general
 Subject: Re: [openib-general] RE: [PATCH][kdapl] fix
 spin_lock_irqsave/spin_unlock_irqrestore i mplementation
 
 
 On Wed, 2005-05-18 at 22:44 +0300, Itamar Rabenstein wrote:
  evd producer locking is something that we need in openib kdapl
   as it was in Source Forge implementation.
 
 This is not a good excuse.  OpenIB kdapl is a totally different beast
 from the sourceforge implementation and will differ over time.
 
 -tduffy
 


[openib-general] [PATCH] user_mad: Support RMPP on send side

2005-05-19 Thread Hal Rosenstock
user_mad: Support RMPP on send side

Note that this change will need a coordinated change to OpenSM and some
userspace/management libraries which will be done as soon as possible
once this patch is accepted.

Receive side support for RMPP will be added separately.

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: infiniband/include/ib_user_mad.h
===
--- infiniband/include/ib_user_mad.h(revision 2413)
+++ infiniband/include/ib_user_mad.h(working copy)
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005 Voltaire, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -42,7 +43,7 @@
  * Increment this value if any changes that break userspace ABI
  * compatibility are made.
  */
-#define IB_USER_MAD_ABI_VERSION	2
+#define IB_USER_MAD_ABI_VERSION	3
 
 /*
  * Make sure that all structs defined in this file remain laid out so
@@ -51,8 +52,7 @@
  */
 
 /**
- * ib_user_mad - MAD packet
- * @data - Contents of MAD
+ * ib_user_mad_hdr - MAD packet header
  * @id - ID of agent MAD received with/to be sent with
  * @status - 0 on successful receive, ETIMEDOUT if no response
  *   received (transaction ID in data[] will be set to TID of original
@@ -72,8 +72,7 @@
  *
  * All multi-byte quantities are stored in network (big endian) byte order.
  */
-struct ib_user_mad {
-	__u8	data[256];
+struct ib_user_mad_hdr {
 	__u32	id;
 	__u32	status;
 	__u32	timeout_ms;
@@ -91,6 +90,17 @@
 };
 
 /**
+ * ib_user_mad - MAD packet
+ * @hdr - MAD packet header
+ * @data - Contents of MAD
+ *
+ */
+struct ib_user_mad {
+	struct ib_user_mad_hdr hdr;
+	__u8	data[0];
+};
+
+/**
  * ib_user_mad_reg_req - MAD registration request
  * @id - Set by the kernel; used to identify agent in future requests.
  * @qpn - Queue pair number; must be 0 or 1.
@@ -103,6 +113,8 @@
  *   management class to receive.
  * @oui: Indicates IEEE OUI when mgmt_class is a vendor class
  *   in the range from 0x30 to 0x4f. Otherwise not used.
+ * @rmpp_version: If set, indicates the RMPP version used.
+ * 
  */
 struct ib_user_mad_reg_req {
 	__u32	id;
@@ -111,6 +123,7 @@
 	__u8	mgmt_class;
 	__u8	mgmt_class_version;
 	__u8	oui[3];
+	__u8	rmpp_version;
 };
 
 #define IB_IOCTL_MAGIC 0x1b
Index: infiniband/core/user_mad.c
===
--- infiniband/core/user_mad.c  (revision 2413)
+++ infiniband/core/user_mad.c  (working copy)
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2004 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005 Voltaire, Inc. All rights reserved. 
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -94,10 +95,12 @@
 };
 
 struct ib_umad_packet {
-	struct ib_user_mad mad;
 	struct ib_ah      *ah;
+	struct ib_mad_send_buf *msg;
 	struct list_head   list;
+	int		   length;
 	DECLARE_PCI_UNMAP_ADDR(mapping)
+	struct ib_user_mad mad;
 };
 
 static const dev_t base_dev = MKDEV(IB_UMAD_MAJOR, IB_UMAD_MINOR_BASE);
@@ -114,10 +117,10 @@
 	int ret = 1;
 
 	down_read(&file->agent_mutex);
-	for (packet->mad.id = 0;
-	     packet->mad.id < IB_UMAD_MAX_AGENTS;
-	     packet->mad.id++)
-		if (agent == file->agent[packet->mad.id]) {
+	for (packet->mad.hdr.id = 0;
+	     packet->mad.hdr.id < IB_UMAD_MAX_AGENTS;
+	     packet->mad.hdr.id++)
+		if (agent == file->agent[packet->mad.hdr.id]) {
 			spin_lock_irq(&file->recv_lock);
 			list_add_tail(&packet->list, &file->recv_list);
 			spin_unlock_irq(&file->recv_lock);
@@ -138,14 +141,11 @@
 	struct ib_umad_packet *packet =
 		(void *) (unsigned long) send_wc->wr_id;
 
-	dma_unmap_single(agent->device->dma_device,
-			 pci_unmap_addr(packet, mapping),
-			 sizeof packet->mad.data,
-			 DMA_TO_DEVICE);
-	ib_destroy_ah(packet->ah);
+	ib_free_send_mad(packet->msg);
+	ib_destroy_ah(packet->msg->send_wr.wr.ud.ah);
 
 	if (send_wc->status == IB_WC_RESP_TIMEOUT_ERR) {
-		packet->mad.status = ETIMEDOUT;
+		packet->mad.hdr.status = ETIMEDOUT;
 
 		if (!queue_packet(file, agent, packet))
 			return;
@@ -159,30 +159,34 @@
 {
 	struct ib_umad_file *file = agent->context;
 	struct ib_umad_packet *packet;
+	int length;
 
+
 	if (mad_recv_wc->wc->status != IB_WC_SUCCESS)
 		goto out;
 
-	packet = kmalloc(sizeof *packet, GFP_KERNEL);
+   length = 256;   /* 

RE: [openib-general] performance counters in /sys

2005-05-19 Thread Diego Crupnicoff
 -Original Message-
 From: Hal Rosenstock [mailto:[EMAIL PROTECTED]] 
 Sent: Thursday, May 19, 2005 6:48 PM
 To: Yaron Haviv
 Cc: Mark Seger; openib-general@openib.org
 Subject: RE: [openib-general] performance counters in /sys
 
 
 On Thu, 2005-05-19 at 16:45, Yaron Haviv wrote:
  I believe you can use the per VL counters for that
  (IB allows counting traffic on a specific VL)
  By matching ULPs to VLs (e.g. through the ib_at lib we suggested)
  You can get both congestion isolation per traffic type as 
 well as the
  ability to count traffic per ULP 
  (note that up to 8 VLs are supported in the Mellanox chips)
 
 PortXmitDataVL[n], PortRcvDataVL[n], PortXmitPktVL[n], and 
 PortRcvPktVL[n] are all IB optional. Do the Mellanox HCAs 
 support these counters ?
 
 -- Hal


No




[openib-general] [PATCH] at: Change normal message from WARN to DEBUG

2005-05-19 Thread Hal Rosenstock
at: Change normal message from WARN to DEBUG

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: at.c
===
--- at.c(revision 2379)
+++ at.c(working copy)
@@ -737,13 +737,13 @@
 
 	arp = (struct ib_arp *)skb->nh.raw;
 
-	WARN("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
-	     (arp->src_ip & 0x000000ff),
-	     (arp->src_ip & 0x0000ff00) >> 8,
-	     (arp->src_ip & 0x00ff0000) >> 16,
-	     (arp->src_ip & 0xff000000) >> 24,
-	     be64_to_cpu(arp->src_gid.global.subnet_prefix),
-	     be64_to_cpu(arp->src_gid.global.interface_id));
+	DEBUG("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
+	      (arp->src_ip & 0x000000ff),
+	      (arp->src_ip & 0x0000ff00) >> 8,
+	      (arp->src_ip & 0x00ff0000) >> 16,
+	      (arp->src_ip & 0xff000000) >> 24,
+	      be64_to_cpu(arp->src_gid.global.subnet_prefix),
+	      be64_to_cpu(arp->src_gid.global.interface_id));
 
 	spin_lock_irqsave(&q->lock, flags);
 	for (a = q->next; a != q; a = a->next) {




Re: [openib-general] Infinipath support in OpenIB

2005-05-19 Thread Greg Lindahl
On Thu, May 19, 2005 at 12:46:04PM -0700, Tom Duffy wrote:

  The article states: 'InfiniPath will also support the OpenIB software stack 
  providing full InfiniBand compliance.'
 
 Also interesting that OpenIB is suddenly an industry standard.  ;-)

Tom,

There's a reason for that -- when we asked customers, they asked us to
support OpenIB in particular. So congratulations, not only is OpenIB a
part of the IB ecosystem, it's one that everyone seems to want to
support.

We didn't want to invent a new flavor of verbs or DAPL anyway. No
point to it, no benefit to anyone.

-- greg



Re: [openib-general] kdapltest regression? failing now...

2005-05-19 Thread James Lentini
I committed a fix for this in revision 2420. The problem turned out to 
be that DAPL wasn't initializing the max_inline_data value of the QP 
attr's cap structure.
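
To illustrate the class of bug described above, here is a minimal user-space sketch. It is not the change made in revision 2420. The type qp_cap_sketch is a stand-in whose field names mirror struct ib_qp_cap, and the helper fill_qp_cap() is hypothetical. The idea is that zeroing the whole structure first gives max_inline_data a defined value even when no later assignment touches it.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the capability block of a QP init attribute.
 * Field names mirror struct ib_qp_cap; the type itself is hypothetical. */
struct qp_cap_sketch {
	uint32_t max_send_wr;
	uint32_t max_recv_wr;
	uint32_t max_send_sge;
	uint32_t max_recv_sge;
	uint32_t max_inline_data;
};

/* Hypothetical helper: zero every field, then set the requested sizes. */
static void fill_qp_cap(struct qp_cap_sketch *cap,
			uint32_t send_wr, uint32_t recv_wr,
			uint32_t send_sge, uint32_t recv_sge)
{
	/* After this, max_inline_data is 0 instead of leftover stack data. */
	memset(cap, 0, sizeof(*cap));

	cap->max_send_wr = send_wr;
	cap->max_recv_wr = recv_wr;
	cap->max_send_sge = send_sge;
	cap->max_recv_sge = recv_sge;
}
```

Without the memset, a structure declared on the stack could hand the driver an arbitrary max_inline_data value, which is one plausible way for a size check in the provider to fail intermittently.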

Let me know if you still have any problems.
There is a patch in the pipeline that will remove the IBAT printout 
you mentioned.

james
On Thu, 19 May 2005, James Lentini wrote:
I think I figured this out. DAPL was assuming a particular maximum scatter 
gather list size. I'm going to change it to query for this value. Hopefully 
I'll have a fix shortly.

james
On Thu, 19 May 2005, James Lentini wrote:
For what it's worth, this is the check that we are failing:
qp->sq.max_gs > dev->limits.max_sg
(qp->sq.max_gs + 2 > dev->limits.max_sg is also true, but
 qp->transport == MLX is not).
On Thu, 19 May 2005, James Lentini wrote:
I'm looking into this Tom.
The following code was added to hw/mthca/mthca_qp.c on Friday
(starting on line 1233):
if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
    qp->sq.max_gs > dev->limits.max_sg || qp->rq.max_gs > dev->limits.max_sg)
        return -EINVAL;
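
The check quoted above can be exercised outside the kernel with a small sketch. All types here (qp_sketch, limits_sketch, the transport enum) are hypothetical stand-ins for the mthca structures, and any limit value a caller passes in is made up for illustration. Only the shape of the condition is taken from the message.

```c
#include <errno.h>

/* Hypothetical stand-ins for the driver's types. */
enum transport_sketch { RC_SKETCH, UC_SKETCH, UD_SKETCH, MLX_SKETCH };

struct wq_sketch {
	int max_gs;			/* requested scatter/gather entries */
};

struct qp_sketch {
	enum transport_sketch transport;
	struct wq_sketch sq;		/* send queue */
	struct wq_sketch rq;		/* receive queue */
};

struct limits_sketch {
	int max_sg;			/* device limit on scatter/gather entries */
};

/* Same condition as the quoted check: MLX QPs need two extra send
 * entries, and neither queue may exceed the device limit. */
static int check_sg_limits(const struct qp_sketch *qp,
			   const struct limits_sketch *limits)
{
	if ((qp->transport == MLX_SKETCH &&
	     qp->sq.max_gs + 2 > limits->max_sg) ||
	    qp->sq.max_gs > limits->max_sg ||
	    qp->rq.max_gs > limits->max_sg)
		return -EINVAL;

	return 0;
}
```

On Linux, EINVAL is 22, which matches the "ib_create_qp failed = -22" line in the debug output further down the thread.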

If anyone knows what we have set incorrectly, please let me know.
Thanks,
james
On Thu, 19 May 2005, Tom Duffy wrote:
tduffy I am not sure when this started, but after updating to top of trunk*, I
tduffy can no longer get kdapltest to work properly.  Both ipoib and sdp are
tduffy working.
tduffy
tduffy Both server and client are returning an error: DAT_INVALID_HANDLE.  This
tduffy is coming from ib_create_qp().  With debugging turned on:
tduffy
tduffy [EMAIL PROTECTED] ~]# ./kdapltest -T S -D mthca0a -d
tduffy kDAPL: dapl_ia_open (mthca0a, 8, 81000b806308, 81000b8062d8)
tduffy kDAPL: dapl_ia_open () returns 0x0
tduffy kDAPL: dapl_pz_create (81001ba165c8, 81000b8062e0)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x20, 81000b8062e8)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0xa0, 81000b8062f0)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x10, 81000b806300)
tduffy kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x40, 81000b8062f8)
tduffy kDAPL: dapl_ep_create (81001ba165c8, 81001b9442c8, 81001ba164b0, 81001ba166e0, 81001ba22050, , 81000b806318)
tduffy kDAPL:  dapl_ib_qp_alloc: ib_create_qp failed = -22
tduffy kDAPL: dapl_evd_free (81001ba22050)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_evd_free (81001ba22168)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_evd_free (81001ba166e0)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_evd_free (81001ba164b0)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy kDAPL: dapl_pz_free (81001b9442c8)
tduffy kDAPL: dapl_ia_query (81001ba165c8, , , 81001bba7b28)
tduffy kDAPL: dapl_ia_query () returns 0x0
tduffy kDAPL: dapl_ia_close (81001ba165c8, 1)
tduffy kDAPL: dapl_evd_free (81001ba167f8)
tduffy kDAPL: dapl_evd_free () returns 0x0
tduffy Server_Cmd.debug:   1
tduffy Server_Cmd.dapl_name: mthca0a
tduffy DT_cs_Server: IA mthca0a opened
tduffy DT_cs_Server: PZ created
tduffy DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
tduffy DT_cs_Server: Waiting for clients to all go away...
tduffy DT_cs_Server: Cleaning up ...
tduffy DT_cs_Server: IA mthca0a closed
tduffy DT_cs_Server (mthca0a):  Exiting.
tduffy TEST INSTANCE 0
tduffy TEST return code = 1
tduffy
tduffy Also, the ib_at module prints this out now when you ping (after running
tduffy kdapltest)...
tduffy
tduffy ib_at: ib_at_arp_work: Process IB ARP ip 192.168.0.26 gid 0xfe82c9010a99e031
tduffy
tduffy -tduffy
tduffy
tduffy * running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm r2414
tduffy 2 machines back-2-back
tduffy





[openib-general] [kDAPL] module parameter names

2005-05-19 Thread James Lentini
With revision 2420, I made the dat and ib_dat_provider module 
debug parameter names consistent. They are now both dbg_mask.


[openib-general] Re: [PATCH][kdapl] fix spin_lock_irqsave/spin_unlock_irqrestore implementation

2005-05-19 Thread James Lentini

Itamar,

Thank you for pointing this out.

Long term I think it will be better to keep the flags with the spin 
lock so that these scoping issues don't crop up and force us to pass 
the flags around. The fix for this is in revision 2420.
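
The idea of keeping the flags with the lock can be sketched as follows. This is a single-threaded user-space simulation, not kernel code and not the change in revision 2420: sim_irqs_enabled stands in for the CPU interrupt state, and struct flagged_lock and its two helpers are hypothetical names. The saved state is written into the lock object only after the lock is held and is read back before the lock is dropped, so acquire and release can live in different functions without a flags argument being threaded between them.

```c
#include <stdbool.h>

/* Simulated interrupt-enable state; stands in for the real CPU flag. */
static bool sim_irqs_enabled = true;

/* Hypothetical wrapper: the saved state travels with the lock. */
struct flagged_lock {
	bool held;			/* simulated lock word */
	unsigned long saved_flags;	/* state captured at acquire time */
};

static void flagged_lock_acquire(struct flagged_lock *l)
{
	/* "Save": remember whether interrupts were enabled. */
	unsigned long flags = sim_irqs_enabled ? 1UL : 0UL;

	sim_irqs_enabled = false;	/* "disable interrupts" */
	l->held = true;			/* "take the lock" */
	l->saved_flags = flags;		/* store only once the lock is held */
}

static void flagged_lock_release(struct flagged_lock *l)
{
	/* Read the saved state before giving up the lock. */
	unsigned long flags = l->saved_flags;

	l->held = false;		/* "drop the lock" */
	sim_irqs_enabled = (flags != 0);	/* "restore" */
}
```

With this shape, a function such as dapl_evd_get_event() could return with the lock held and a later function could release it, and neither would need an extra flags parameter.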

james

On Wed, 18 May 2005, Itamar wrote:

itamar When spin_lock_irqsave and spin_unlock_irqrestore are not called in the
itamar same function, the flags need to be passed from the save to the restore.
itamar 
itamar Signed-off-by: Itamar Rabenstein [EMAIL PROTECTED]
itamar 
itamar Index: dapl_evd_util.c
itamar ===
itamar --- dapl_evd_util.c (revision 2374)
itamar +++ dapl_evd_util.c (working copy)
itamar @@ -378,20 +378,19 @@
itamar   * that the lock is held.
itamar   */
itamar  
itamar -static struct dat_event *dapl_evd_get_event(DAPL_EVD * evd_ptr)
itamar +static struct dat_event *dapl_evd_get_event(DAPL_EVD * evd_ptr, 
unsigned long *flags)
itamar  {
itamar struct dat_event *event;
itamar -   unsigned long flags;
itamar  
itamar if (evd_ptr->evd_producer_locking_needed) {
itamar -   spin_lock_irqsave(&evd_ptr->header.lock, flags);
itamar +   spin_lock_irqsave(&evd_ptr->header.lock, *flags);
itamar }
itamar  
itamar event = (struct dat_event *) dapl_rbuf_remove(&evd_ptr->free_event_queue);
itamar  
itamar /* Release the lock if it was taken and the call failed.  */
itamar if (!event && evd_ptr->evd_producer_locking_needed) {
itamar -   spin_unlock_irqrestore(&evd_ptr->header.lock, flags);
itamar +   spin_unlock_irqrestore(&evd_ptr->header.lock, *flags);
itamar }
itamar }
itamar  
itamar return event;
itamar @@ -406,10 +405,10 @@
itamar   */
itamar  
itamar  static void dapl_evd_post_event(DAPL_EVD *evd_ptr,
itamar -   const struct dat_event *event_ptr)
itamar +   const struct dat_event *event_ptr,
itamar +   unsigned long flags)
itamar  {
itamar u32 dat_status;
itamar -   unsigned long flags;
itamar DAPL_CNO * cno_to_trigger = NULL;
itamar  
itamar dapl_dbg_log(DAPL_DBG_TYPE_EVD,
itamar @@ -459,7 +458,7 @@
itamar  DAPL_EVD * overflow_evd_ptr)
itamar  {
itamar struct dat_event *overflow_event;
itamar -
itamar +   unsigned long flags;
itamar /* The overflow_evd_ptr mght be the same as evd.
itamar  * In that case we've got a catastrophic overflow.
itamar  */
itamar @@ -469,7 +468,7 @@
itamar return;
itamar }
itamar  
itamar -   overflow_event = dapl_evd_get_event(overflow_evd_ptr);
itamar +   overflow_event = dapl_evd_get_event(overflow_evd_ptr, &flags);
itamar if (!overflow_event) {
itamar /* this is not good */
itamar overflow_evd_ptr->catastrophic_overflow = TRUE;
itamar @@ -477,17 +476,18 @@
itamar return;
itamar }
itamar dapl_evd_format_overflow_event(overflow_evd_ptr, 
overflow_event);
itamar -   dapl_evd_post_event(overflow_evd_ptr, overflow_event);
itamar +   dapl_evd_post_event(overflow_evd_ptr, overflow_event, flags);
itamar  
itamar return;
itamar  }
itamar  
itamar  static struct dat_event *dapl_evd_get_and_init_event(DAPL_EVD *evd_ptr,
itamar -enum 
dat_event_number evno)
itamar +enum 
dat_event_number evno,
itamar +unsigned long 
*flags)
itamar  {
itamar struct dat_event *event_ptr;
itamar  
itamar -   event_ptr = dapl_evd_get_event(evd_ptr);
itamar +   event_ptr = dapl_evd_get_event(evd_ptr, flags);
itamar if (!event_ptr)
itamar dapl_evd_post_overflow_event(evd_ptr->header.owner_ia->
itamar  async_error_evd, evd_ptr);
itamar @@ -507,7 +507,8 @@
itamarDAT_CR_HANDLE cr_handle)
itamar  {
itamar struct dat_event *event_ptr;
itamar -   event_ptr = dapl_evd_get_and_init_event(evd_ptr, event_number);
itamar +   unsigned long flags;
itamar +   event_ptr = dapl_evd_get_and_init_event(evd_ptr, event_number, &flags);
itamar /*
itamar  * Note event lock may be held on successful return
itamar  * to be released by dapl_evd_post_event(), if provider side 
locking
itamar @@ -525,7 +526,7 @@
itamar event_ptr->event_data.cr_arrival_event_data.conn_qual = conn_qual;
itamar event_ptr->event_data.cr_arrival_event_data.cr_handle = cr_handle;
itamar  
itamar -   dapl_evd_post_event(evd_ptr, event_ptr);
itamar +   dapl_evd_post_event(evd_ptr, event_ptr, flags);
itamar  
itamar return DAT_SUCCESS;
itamar  }
itamar @@ -537,7 +538,8 @@
itamarvoid 

[openib-general] Re: [PATCH] kDAPL: add some clarification to a few debug printks

2005-05-19 Thread James Lentini

Committed in revision 2421.

On Wed, 18 May 2005, Tom Duffy wrote:

tduffy I was running into some issues and noticed that these printks were not
tduffy very clear about where they were coming from.  So, adding a little more
tduffy info.
tduffy 
tduffy Signed-off-by: Tom Duffy [EMAIL PROTECTED]
tduffy 
tduffy Index: linux-kernel/dat-provider/dapl_openib_qp.c
tduffy ===
tduffy --- linux-kernel/dat-provider/dapl_openib_qp.c  (revision 2382)
tduffy +++ linux-kernel/dat-provider/dapl_openib_qp.c  (working copy)
tduffy @@ -108,7 +108,8 @@ u32 dapl_ib_qp_alloc(DAPL_IA *ia_ptr, DA
tduffy ep_ptr->qp_handle = ib_create_qp(ib_pd_handle, qp_attr);
tduffy if (IS_ERR(ep_ptr->qp_handle)) {
tduffy ib_status = PTR_ERR(ep_ptr->qp_handle);
tduffy -   dapl_dbg_log(DAPL_DBG_TYPE_ERR, " failed code = %d\n",
tduffy +   dapl_dbg_log(DAPL_DBG_TYPE_ERR,
tduffy +            "dapl_ib_qp_alloc: ib_create_qp failed = %d\n",
tduffy  ib_status);
tduffy return dapl_ib_status_convert(ib_status);
tduffy }
tduffy @@ -197,8 +198,9 @@ ib_cq_handle_t dapl_get_dto_cq(DAPL_IA *
tduffy ib_status =
tduffy PTR_ERR(ia_ptr->hca_ptr->ib_trans.
tduffy null_ib_cq_handle);
tduffy -   dapl_dbg_log(DAPL_DBG_TYPE_ERR, " failed code = %d\n",
tduffy -ib_status);
tduffy +   dapl_dbg_log(DAPL_DBG_TYPE_ERR,
tduffy +            "dapl_get_dto_cq: ib_create_cq failed "
tduffy +            "= %d\n", ib_status);
tduffy ia_ptr->hca_ptr->ib_trans.null_ib_cq_handle =
tduffy IB_INVALID_HANDLE;
tduffy }
tduffy 


[openib-general] Re: [PATCH] kDAPL: return is not a function

2005-05-19 Thread James Lentini

Committed in revision 2422.

On Wed, 18 May 2005, Tom Duffy wrote:

tduffy return is not a function
tduffy 
tduffy Signed-off-by: Tom Duffy [EMAIL PROTECTED]
tduffy 
tduffy Index: linux-kernel-return/dat-provider/dapl_cookie.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_cookie.c  (revision 2382)
tduffy +++ linux-kernel-return/dat-provider/dapl_cookie.c  (working copy)
tduffy @@ -134,9 +134,9 @@ u32 dapl_cb_create(DAPL_COOKIE_BUFFER *b
tduffy buffer->pool[i].ep = ep;
tduffy }
tduffy  
tduffy -   return (DAT_SUCCESS);
tduffy +   return DAT_SUCCESS;
tduffy } else {
tduffy -   return (DAT_INSUFFICIENT_RESOURCES);
tduffy +   return DAT_INSUFFICIENT_RESOURCES;
tduffy }
tduffy  }
tduffy  
tduffy Index: linux-kernel-return/dat-provider/dapl_ring_buffer_util.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_ring_buffer_util.c (revision 2382)
tduffy +++ linux-kernel-return/dat-provider/dapl_ring_buffer_util.c (working copy)
tduffy @@ -237,7 +237,7 @@ void *dapl_rbuf_remove(DAPL_RING_BUFFER 
tduffy if (val == pos) {
tduffy pos = (pos + 1) & rbuf->lim;    /* verify in range */
tduffy  
tduffy -   return (rbuf->base[pos]);
tduffy +   return rbuf->base[pos];
tduffy }
tduffy }
tduffy  
tduffy Index: linux-kernel-return/dat-provider/dapl_lmr_util.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_lmr_util.c(revision 2382)
tduffy +++ linux-kernel-return/dat-provider/dapl_lmr_util.c(working copy)
tduffy @@ -50,7 +50,7 @@ DAPL_LMR *dapl_lmr_alloc(DAPL_IA * ia,
tduffy /* Allocate LMR */
tduffy lmr = (DAPL_LMR *) kmalloc(sizeof(DAPL_LMR), GFP_ATOMIC);
tduffy if (NULL == lmr) {
tduffy -   return (NULL);
tduffy +   return NULL;
tduffy }
tduffy  
tduffy /* zero the structure */
tduffy @@ -80,7 +80,7 @@ DAPL_LMR *dapl_lmr_alloc(DAPL_IA * ia,
tduffy lmr->param.mem_priv = mem_priv;
tduffy atomic_set(&lmr->lmr_ref_count, 0);
tduffy  
tduffy -   return (lmr);
tduffy +   return lmr;
tduffy  }
tduffy  
tduffy  void dapl_lmr_dealloc(DAPL_LMR * lmr)
tduffy Index: linux-kernel-return/dat-provider/dapl_hca_util.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_hca_util.c(revision 2382)
tduffy +++ linux-kernel-return/dat-provider/dapl_hca_util.c(working copy)
tduffy @@ -83,7 +83,7 @@ DAPL_HCA *dapl_hca_alloc(char *name, str
tduffy }
tduffy }
tduffy  
tduffy -   return (hca_ptr);
tduffy +   return hca_ptr;
tduffy  }
tduffy  
tduffy  /*
tduffy Index: linux-kernel-return/dat-provider/dapl_cr_util.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_cr_util.c (revision 2382)
tduffy +++ linux-kernel-return/dat-provider/dapl_cr_util.c (working copy)
tduffy @@ -44,7 +44,7 @@ DAPL_CR *dapl_cr_alloc(DAPL_IA * ia_ptr)
tduffy /* Allocate EP */
tduffy cr_ptr = (DAPL_CR *) kmalloc(sizeof(DAPL_CR), GFP_ATOMIC);
tduffy if (cr_ptr == NULL) {
tduffy -   return (NULL);
tduffy +   return NULL;
tduffy }
tduffy  
tduffy /* zero the structure */
tduffy @@ -62,7 +62,7 @@ DAPL_CR *dapl_cr_alloc(DAPL_IA * ia_ptr)
tduffy dapl_llist_init_entry(&cr_ptr->header.ia_list_entry);
tduffy spin_lock_init(&cr_ptr->header.lock);
tduffy  
tduffy -   return (cr_ptr);
tduffy +   return cr_ptr;
tduffy  }
tduffy  
tduffy  /*
tduffy Index: linux-kernel-return/dat-provider/dapl_ia_util.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_ia_util.c (revision 2382)
tduffy +++ linux-kernel-return/dat-provider/dapl_ia_util.c (working copy)
tduffy @@ -65,7 +65,7 @@ DAPL_IA *dapl_ia_alloc(struct dat_provid
tduffy /* Allocate IA */
tduffy ia_ptr = (DAPL_IA *) kmalloc(sizeof(DAPL_IA), GFP_ATOMIC);
tduffy if (ia_ptr == NULL) {
tduffy -   return (NULL);
tduffy +   return NULL;
tduffy }
tduffy  
tduffy /* zero the structure */
tduffy @@ -100,7 +100,7 @@ DAPL_IA *dapl_ia_alloc(struct dat_provid
tduffy  
tduffy dapl_hca_link_ia(hca_ptr, ia_ptr);
tduffy  
tduffy -   return (ia_ptr);
tduffy +   return ia_ptr;
tduffy  }
tduffy  
tduffy  /*
tduffy Index: linux-kernel-return/dat-provider/dapl_pz_util.c
tduffy ===
tduffy --- linux-kernel-return/dat-provider/dapl_pz_util.c 

[openib-general] Re: [PATCH] at: Change normal message from WARN to DEBUG

2005-05-19 Thread James Lentini

Committed in revision 2423.

On Thu, 19 May 2005, Hal Rosenstock wrote:

halr at: Change normal message from WARN to DEBUG
halr 
halr Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]
halr 
halr Index: at.c
halr ===
halr --- at.c    (revision 2379)
halr +++ at.c    (working copy)
halr @@ -737,13 +737,13 @@
halr  
halr    arp = (struct ib_arp *)skb->nh.raw;
halr  
halr -   WARN("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
halr -        (arp->src_ip & 0x000000ff),
halr -        (arp->src_ip & 0x0000ff00) >> 8,
halr -        (arp->src_ip & 0x00ff0000) >> 16,
halr -        (arp->src_ip & 0xff000000) >> 24,
halr -        be64_to_cpu(arp->src_gid.global.subnet_prefix),
halr -        be64_to_cpu(arp->src_gid.global.interface_id));
halr +   DEBUG("Process IB ARP ip %d.%d.%d.%d gid 0x%016llx%016llx",
halr +         (arp->src_ip & 0x000000ff),
halr +         (arp->src_ip & 0x0000ff00) >> 8,
halr +         (arp->src_ip & 0x00ff0000) >> 16,
halr +         (arp->src_ip & 0xff000000) >> 24,
halr +         be64_to_cpu(arp->src_gid.global.subnet_prefix),
halr +         be64_to_cpu(arp->src_gid.global.interface_id));
halr  
halr    spin_lock_irqsave(&q->lock, flags);
halr    for (a = q->next; a != q; a = a->next) {
halr 
halr 


Re: [openib-general] OOPS: ib_mad crashery on bootup

2005-05-19 Thread Shirley Ma

> Error, some other host already uses address 192.168.0.233.

I hit a similar problem a while ago, but I couldn't reproduce it since
then.
Is this reproducible if you configure duplicate IP addresses?

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Re: [openib-general] OOPS: ib_mad crashery on bootup

2005-05-19 Thread Tom Duffy
On Thu, 2005-05-19 at 21:30 -0700, Shirley Ma wrote:
 
> > Error, some other host already uses address 192.168.0.233.
>
> I hit a similar problem a while ago. But I couldn't reproduce it since
> then.
> Is this reproducible if you configure duplicate IP addresses?

I don't think this particular error is related to the crash I observed.
I see this on normal bootup of FC3 w/ ib0 configured.

The crash I saw happened once, but upon reboot was not reproducible.

-tduffy



Re: [openib-general] kdapltest regression? failing now...

2005-05-19 Thread Tom Duffy
On Thu, 2005-05-19 at 21:43 -0400, James Lentini wrote:
> I committed a fix for this in revision 2420. The problem turned out to
> be that DAPL wasn't initializing the max_inline_data value of the QP
> attr's cap structure.
>
> Let me know if you still have any problems.

Good job.  All is well now.  Tested working.

Thanks,

-tduffy

