[openib-general] Re: Re: [PATCH] rdma_lat-09 and results

2005-06-02 Thread Michael S. Tsirkin
Quoting r. Shirley Ma [EMAIL PROTECTED]: Subject: Re: Re: [PATCH] rdma_lat-09 and results ./rdma_lat libibverbs: Warning: no userspace device-specific driver found for uverbs0 driver search path: /usr/local/lib/infiniband No IB devices found I have one node working, one node

Re: [openib-general] user space verbs examples

2005-06-02 Thread Gleb Natapov
On Wed, Jun 01, 2005 at 07:24:18PM -0400, Hal Rosenstock wrote: Hi Roland, I can run ibv_devices but ibv_asyncwatch seems to fail: ibv_devices device node GUID -- mthca0 0008f10403960558 ibv_asyncwatch

Re: [openib-general] Re: [PATCH] rdma_lat-09 and results

2005-06-02 Thread Gleb Natapov
On Thu, Jun 02, 2005 at 08:29:34AM +0300, Michael S. Tsirkin wrote: Quoting r. Grant Grundler [EMAIL PROTECTED]: Subject: [PATCH] rdma_lat-09 and results Michael, Good news: My next cleanup of rdma_lat.c is working and patch is appended. Summary of changes below. Bad

[openib-general] Re: Re: [PATCH] rdma_lat-09 and results

2005-06-02 Thread Michael S. Tsirkin
Quoting r. Gleb Natapov [EMAIL PROTECTED]: Subject: Re: Re: [PATCH] rdma_lat-09 and results On Thu, Jun 02, 2005 at 08:29:34AM +0300, Michael S. Tsirkin wrote: Quoting r. Grant Grundler [EMAIL PROTECTED]: Subject: [PATCH] rdma_lat-09 and results Michael, Good news: My

[openib-general] Re: Re: [PATCH] rdma_lat-09 and results

2005-06-02 Thread Shirley Ma
libibverbs can't find the libmthca plugin. ibv_pingpong, ibv_asyncwatch, and ibv_devices work OK. Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503) 578-7638 ___ openib-general mailing list

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread Caitlin Bestler
I hadn't heard that insmod was being removed from Linux. In fact the DAPL Plugfest successfully used kernel daemons and kdapltest to demonstrate DAT interoperability across multiple vendors: kernel to kernel, kernel to user and user to user. These are existing applications already deployed. I

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread Christoph Hellwig
On Thu, Jun 02, 2005 at 05:22:26AM -0700, Caitlin Bestler wrote: I hadn't heard that insmod was being removed from Linux. No one claimed that. In fact the DAPL Plugfest successfully used kernel daemons and kdapltest to demonstrate DAT interoperability across multiple vendors: kernel to

[openib-general] [patch][kdapl] enable kdapltest -T P

2005-06-02 Thread Itamar
In order to enable kdapltest -T P, I needed to comment out attributes that are not set by openib gen2: 1) ia_attr.max_evd_qlen 2) ia_attr.max_rdma_read_per_ep_in 3) ia_attr.max_rdma_read_per_ep_out. Also, there was a bug in kdapltest memory registration (file dapl_bpool.c), and there was a bug where we free memory

RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-06-02 Thread James Lentini
On Tue, 31 May 2005, Tom Duffy wrote: On Tue, 2005-05-31 at 14:17 -0400, James Lentini wrote: Here's the specification's exact description: timeout: Duration of time, in microseconds, that a consumer waits for connection establishment. The value of DAT_TIMEOUT_INFINITE

RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-06-02 Thread James Lentini
On Tue, 31 May 2005, Tom Duffy wrote: On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote: Sean, Is there any way of requesting an infinite number of retries? There is not, but nothing prevents a user from simply re-issuing a request after it times out. Infinite retries inside the

[openib-general] Re: [PATCH] mthca: fix registration for giant MRs

2005-06-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: [PATCH] mthca: fix registration for giant MRs Here's a patch that allows mthca to break up registration of giant userspace MRs into multiple firmware commands. The net effect of this is that there should no longer be any limit (beyond

[openib-general] RE: [PATCH][kdapl] fix fatal bug in triger the evd upcall

2005-06-02 Thread James Lentini
Sorry about that. Should be fixed now. On Wed, 1 Jun 2005, Itamar Rabenstein wrote: itamar> Hi James, you replied with Committed in revision 2514. But I am checking and you have committed my patch with a change. Please whenever you decide to change a patch please

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread James Lentini
On Wed, 1 Jun 2005, Tom Duffy wrote: On Wed, 2005-06-01 at 12:04 +0200, Christoph Hellwig wrote: That being said, one of the first things you should get rid of if you want to be able to take code from kdapl to the generic rdma code is the way it deals with handles. The kdapl code gives up

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread Tom Duffy
On Thu, 2005-06-02 at 11:16 -0400, James Lentini wrote: - in dat.h, create a public structure for each object type: struct dat_ep { struct dat_provider *provider; }; - in the transport provider (dapl.h) have a private structure that contains the public one: struct
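The layout proposed in the quoted text (a public structure in dat.h, embedded in a provider-private structure) can be sketched in plain C roughly as follows. This is an illustration only, not code from the thread: struct dat_ep and its provider member come from the message, while struct dapl_ep, the qp_num and name fields, and the helper dapl_ep_from_dat() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Public part, as it would appear in dat.h (from the quoted message). */
struct dat_provider {
    const char *name;            /* hypothetical field, illustration only */
};

struct dat_ep {
    struct dat_provider *provider;
};

/* Private part, as it might appear in the provider's dapl.h.
 * The struct name and qp_num are assumptions, not taken from the thread. */
struct dapl_ep {
    struct dat_ep ep;            /* public structure embedded by value */
    unsigned int qp_num;         /* transport-specific state stays private */
};

/* Recover the private structure from a pointer to the embedded public one
 * (the same idea as the kernel's container_of()). */
static struct dapl_ep *dapl_ep_from_dat(struct dat_ep *ep)
{
    assert(ep != NULL);
    return (struct dapl_ep *)((char *)ep - offsetof(struct dapl_ep, ep));
}
```

With this shape a consumer only ever sees struct dat_ep *, so nothing transport-specific leaks through the public header, and the provider gets back to its own state without a handle lookup.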

[openib-general] Re: [PATCH] rdma_lat-09 and results

2005-06-02 Thread Grant Grundler
On Thu, Jun 02, 2005 at 08:29:34AM +0300, Michael S. Tsirkin wrote: ... I changed the timestamping strategy. I used to: ... This meant that tstamp instruction was out of the data path, while we did polling. On the negative side, although the average (and likely median) delta between tstamps

[openib-general] Re: station hang with the last patch to infiniband/core/sa_query.c

2005-06-02 Thread Roland Dreier
Thanks, I found a problem with that patch and checked in a fix. Can you let me know if it works better now? - R.

[openib-general] Re: [PATCH] mthca: fix registration for giant MRs

2005-06-02 Thread Roland Dreier
Michael> I'd like to suggest we keep passing struct mthca_buddy * to mthca_alloc_mtt, instead of passing around and keeping in memory the binary fmr flag, since all that this flag does is select the right allocator, and callers of

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread James Lentini
On Thu, 2 Jun 2005, Tom Duffy wrote: On Thu, 2005-06-02 at 11:16 -0400, James Lentini wrote: - in dat.h, create a public structure for each object type: struct dat_ep { struct dat_provider *provider; }; - in the transport provider (dapl.h) have a private structure that

RE: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA:Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread Ryan, Jim
Please see some comments below which I'm offering on my own but I hope speaking fairly and responsibly for the steering committee (more correctly the Board) of OpenIB. Jim Ryan, Chairman, OpenIB From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Venkata Jagana

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread Caitlin Bestler
On 6/2/05, Tom Duffy [EMAIL PROTECTED] wrote: On Thu, 2005-06-02 at 11:16 -0400, James Lentini wrote: - in dat.h, create a public structure for each object type: struct dat_ep { struct dat_provider *provider; }; - in the transport provider (dapl.h) have a private structure

[openib-general] Re: [PATCH] mthca: fix registration for giant MRs

2005-06-02 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH] mthca: fix registration for giant MRs Michael> Why don't we keep mthca_mtt by instance in struct mthca_mr, like this: struct mthca_mr { struct ib_mr ibmr; struct mthca_mtt mtt; }; Saves

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-02 Thread Caitlin Bestler
Why is it that you believe that the DAT registry does not support plug and play? The interface was most specifically designed to allow that. When a device is plugged in, the driver is loaded by existing OS mechanisms. It can then load the provider code (if needed). Whenever the provider module is

Re: [openib-general] [PATCH] rdma_lat-09 and results

2005-06-02 Thread Grant Grundler
On Wed, Jun 01, 2005 at 05:52:28PM -0700, Grant Grundler wrote: Here's with the new rdma_lat.c: Here are some pfmon31 perf results for the new rdma_lat. The first set is just L3 Cache misses. The second set shows all cache misses. The sample rate on the second set was a bit high - note the

RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-06-02 Thread Tom Duffy
On Thu, 2005-06-02 at 10:17 -0400, James Lentini wrote: On Tue, 31 May 2005, Tom Duffy wrote: On Tue, 2005-05-31 at 14:17 -0400, James Lentini wrote: Here's the specification's exact description: timeout: Duration of time, in microseconds, that a consumer waits for

[openib-general] separate CQs for QP 0/1?

2005-06-02 Thread Sean Hefty
I haven't been able to determine the cause of the MAD crashes, so I was going to separate the CQs for QP 0 and 1 to give me more context/state information when a completion occurs. Is this a change that we want permanently? If so, I will cleanup the change for submission, otherwise, I'll

Re: [openib-general] Re: opensm: new segv on shutdown

2005-06-02 Thread Tom Duffy
On Wed, 2005-06-01 at 20:45 -0400, Hal Rosenstock wrote: On Wed, 2005-06-01 at 16:51, Tom Duffy wrote: I am putting together a network with a dumb IB switch, a couple of Linux OpenIB boxes, a Solaris 10 box, a Solaris Nevada box, etc. I fired up opensm on one of the Linux nodes, tried to

Re: [openib-general] separate CQs for QP 0/1?

2005-06-02 Thread Roland Dreier
Sean> I haven't been able to determine the cause of the MAD crashes, so I was going to separate the CQs for QP 0 and 1 to give me more context/state information when a completion occurs. Is this a change that we want permanently? If so, I will

Re: [openib-general] separate CQs for QP 0/1?

2005-06-02 Thread Sean Hefty
Roland Dreier wrote: Sean> I haven't been able to determine the cause of the MAD crashes, so I was going to separate the CQs for QP 0 and 1 to give me more context/state information when a completion occurs. Is this a change that we want permanently? If

[openib-general] Re: [PATCH] mthca: fix registration for giant MRs

2005-06-02 Thread Roland Dreier
OK, I respun the patch. I didn't change the calculation of mtt->order to use fls, because the formula fls(max(size, MTHCA_MTT_SEG_SIZE / 8) - 1) should really be fls(max(size, MTHCA_MTT_SEG_SIZE / 8)) - fls(MTHCA_MTT_SEG_SIZE / 8) or something like that, and I ended up confusing myself and so I
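To make the arithmetic in this message concrete, here is a small user-space sketch; it is not the code from the patch. It assumes that the intended order is the number of doublings needed for one MTT segment to cover size entries, and it takes a hypothetical seg_entries parameter in place of MTHCA_MTT_SEG_SIZE / 8 (which must be a power of two for the fls form to hold). fls_u32() is a portable stand-in for the kernel's fls(). Under those assumptions, an fls expression that agrees with the loop is fls(max(size, seg) - 1) - fls(seg - 1).

```c
#include <assert.h>

/* Portable stand-in for the kernel's fls(): 1-based index of the highest
 * set bit, with fls(0) == 0. */
static int fls_u32(unsigned int x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Reference version: count doublings of one segment until it covers size. */
static int order_by_loop(unsigned int size, unsigned int seg_entries)
{
    unsigned long long i;   /* wider than size so the shift cannot wrap */
    int order = 0;

    assert(seg_entries != 0 && (seg_entries & (seg_entries - 1)) == 0);
    for (i = seg_entries; i < size; i <<= 1)
        order++;
    return order;
}

/* fls version: fls(n - 1) is ceil(log2(n)) for n >= 2, so subtracting
 * fls(seg - 1) == log2(seg) leaves the number of doublings. */
static int order_by_fls(unsigned int size, unsigned int seg_entries)
{
    unsigned int n = size > seg_entries ? size : seg_entries;

    assert(seg_entries != 0 && (seg_entries & (seg_entries - 1)) == 0);
    return fls_u32(n - 1) - fls_u32(seg_entries - 1);
}
```

With seg_entries = 8, sizes 8, 9 and 17 give orders 0, 1 and 2 from both functions. The loop is arguably easier to verify by eye, which may be a reason to prefer it even where fls() is available.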

[openib-general] kdaptest wedges server

2005-06-02 Thread Josh England
Hi, The kdapltest program consistently wedges a node. I just tried it with Revision 2523, and it runs fine the first couple of times. The third time took a while, and the fourth time killed the server. The server is running: kdapltest -T S -D mthca0a The client runs this several times

Re: [openib-general] Re: opensm: new segv on shutdown

2005-06-02 Thread Hal Rosenstock
On Thu, 2005-06-02 at 13:31, Tom Duffy wrote: On Wed, 2005-06-01 at 20:45 -0400, Hal Rosenstock wrote: On Wed, 2005-06-01 at 16:51, Tom Duffy wrote: I am putting together a network with a dumb IB switch, a couple of Linux OpenIB boxes, a Solaris 10 box, a Solaris Nevada box, etc. I

[openib-general] Re: [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-02 Thread James Lentini
I committed everything in revision 2526, except changing the name of dat_dictionary_key_is_equal to dat_dict_key_is_equal. I assumed that you did this to fit as much of the function declaration on one line as possible. I think I achieved the result you wanted by giving the parameters shorter

[openib-general] RE: station hang with the last patch to infiniband/core/sa_query. c

2005-06-02 Thread Itamar Rabenstein
Thanks, ping is working now (;-) Itamar -Original Message- From: Roland Dreier [mailto:[EMAIL PROTECTED] Sent: Thursday, June 02, 2005 7:05 PM To: Itamar Rabenstein Cc: openib-general@openib.org Subject: Re: station hang with the last patch to infiniband/core/sa_query.c

RE: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-02 Thread Itamar Rabenstein
Hi, I am not an expert on PCI hot plug, but as far as I know we currently don't have a way in kdapl to inform the consumer of kdapl that a provider is going to be unloaded. PCI hot plug should enable any low-level driver (openib) to inform a high-level driver (kdapl) that the device is going down. In

RE: [openib-general] kdaptest wedges server

2005-06-02 Thread Itamar Rabenstein
Hi Josh, Can you run it with the debug option (-d) and tell me what the last debug print on the server side is? When you say kill, can you hit ^C, or is it hanging the station? What is the system configuration (arch, OS, kernel)? Itamar -Original Message- From: Josh England

[openib-general] UCM accesses internal cm_id state

2005-06-02 Thread Sean Hefty
The ucm code in ib_ucm_event_handler() reads the cm_id->state information in response to an event occurrence. The cm_id state can change dynamically in response to an event, and in general should not be accessed. (I.e. the current cm_id state may not match up with the event being reported.)

[openib-general] [PATCH]libibcm: updated README for device node configuration

2005-06-02 Thread Bill Jordan
Patch to add basic content to the CM library README file. Add build instructions, install directory, and device node configuration. Signed-off-by: Bill Jordan [EMAIL PROTECTED] Index: README === --- README (revision 2523) +++

[openib-general] Re: [PATCHv3][RFC] kDAPL: use cm timers instead of own

2005-06-02 Thread Tom Duffy
On Thu, 2005-06-02 at 16:18 -0400, James Lentini wrote: I'd recommend these changes (see attached): - use the new microsecond conversion function you sent great - keep a default DAT timeout constant (I propose DAT_TIMEOUT_MAX) fair enough. - set the timeout a little differently. Instead
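The microsecond conversion function mentioned above is not shown in this excerpt. As a rough sketch of what such a conversion can look like, the following assumes the InfiniBand convention that a CM timeout field value n stands for 4.096 µs × 2^n and that the field is 5 bits wide; it picks the smallest n that covers a duration given in microseconds. The names usec_to_ib_cm_timeout() and IB_CM_TIMEOUT_MAX are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <limits.h>

#define IB_CM_TIMEOUT_MAX 31u   /* hypothetical name; 5-bit field assumed */

/* Smallest n such that 4.096 us * 2^n >= usec, clamped to the field width. */
static unsigned int usec_to_ib_cm_timeout(unsigned long long usec)
{
    unsigned long long target_ns;
    unsigned long long t_ns = 4096;   /* 4.096 us in ns, i.e. n == 0 */
    unsigned int n = 0;

    if (usec > ULLONG_MAX / 1000)     /* avoid overflow in the ns conversion */
        return IB_CM_TIMEOUT_MAX;
    target_ns = usec * 1000;

    while (t_ns < target_ns && n < IB_CM_TIMEOUT_MAX) {
        t_ns <<= 1;
        n++;
    }
    assert(n <= IB_CM_TIMEOUT_MAX);
    return n;
}
```

Because the encoding is exponential, the result rounds up to the next power-of-two step, and any duration beyond the largest encodable value (roughly 2.4 hours) saturates at the maximum. A caller that wants something like an infinite DAT timeout would still have to re-issue the request after it times out.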

[openib-general] Re: [PATCHv3][RFC] kDAPL: use cm timers instead of own

2005-06-02 Thread James Lentini
On Thu, 2 Jun 2005, Tom Duffy wrote: On Thu, 2005-06-02 at 16:18 -0400, James Lentini wrote: I'd recommend these changes (see attached): - use the new microsecond conversion function you sent great - keep a default DAT timeout constant (I propose DAT_TIMEOUT_MAX) fair enough. - set

Re: [openib-general] [PATCH]libibcm: updated README for device node configuration

2005-06-02 Thread William Jordan
Libor, Ignore this patch. I'll submit a different one to change the ib_ucm device node from /dev/infiniband/cm to /dev/infiniband_cm. This makes the udev setup unnecessary. And, this probably wasn't the right README file for the udev rules anyway. They belong with the module, not with the

[openib-general] Re: [patch][kdapl] enable kdapltest -T P

2005-06-02 Thread James Lentini
A couple of questions below: On Thu, 2 Jun 2005, Itamar wrote: In order to enable kdapltest -T P, I needed to comment out attributes that are not set by openib gen2: 1) ia_attr.max_evd_qlen 2) ia_attr.max_rdma_read_per_ep_in 3) ia_attr.max_rdma_read_per_ep_out. Also, there was a bug in kdapltest memory

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-02 Thread Jeff Carr
On 05/31/05 16:30, Sean Hefty wrote: Has anyone seen ib_send_cm_req() return -22? I believe that this is a timeout error, possibly indicating that the server side of the connection wasn't running. You may also want to verify the slid and dlid are correct for your configuration.

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-02 Thread Christoph Hellwig
On Thu, Jun 02, 2005 at 11:16:53AM -0400, James Lentini wrote: This could be an improvement. We just need to be careful that we don't expose anything transport specific. Off the top of my head, I can think of one way to do this: - in dat.h, create a public structure for each object type:

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-02 Thread Hal Rosenstock
On Thu, 2005-06-02 at 18:09, Jeff Carr wrote: Is there a simple way to discover the lid values of other systems? Locally, you can run /usr/local/ib/bin/ibstatus or ibstat. Remotely, you would need to know the remote GID (subnet prefix + GUID) and ask the SA what the LID was for that (by getting
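The "subnet prefix + GUID" composition described above can be sketched as follows: a GID is 16 bytes, with the 64-bit subnet prefix in the upper half and the 64-bit port GUID in the lower half, both in network byte order. The type struct gid128 and the helpers put_be64() and make_gid() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct gid128 {
    uint8_t raw[16];             /* hypothetical container for a raw GID */
};

/* Store a 64-bit value in big-endian (network) byte order. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    int i;

    for (i = 7; i >= 0; i--) {
        dst[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Build a GID from a subnet prefix (upper 8 bytes) and a port GUID
 * (lower 8 bytes). */
static void make_gid(struct gid128 *gid, uint64_t subnet_prefix,
                     uint64_t port_guid)
{
    assert(gid != NULL);
    put_be64(gid->raw, subnet_prefix);
    put_be64(gid->raw + 8, port_guid);
}
```

For example, with the default link-local prefix 0xfe80000000000000 and a made-up GUID of 0x0123456789abcdef, the raw GID starts with bytes fe 80 and ends with the byte ef. The resulting GID is what would go into an SA path record query to learn the remote LID.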

RE: [openib-general] RE: [ANNOUNCE][PATCH] New Linux 2.6.9 backportpatches and corresponding userspace tar ball available

2005-06-02 Thread Woodruff, Robert J
Michael Wrote Patches are located in the SVN tree under gen2/trunk/src/linux-kernel/patches/backport-to-2.6.9/ infiniband-backport-svn2425-to-2.6.9-kernel-fixups-01.diff infiniband-backport-svn2425-to-2.6.9-openib-drivers-02.diff

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-02 Thread Sean Hefty
Jeff Carr wrote: Is there a simple way to discover the lid values of other systems? Simple? Not really. You could query the SA to obtain a list of path records to all systems, and then extract the LIDs from those. I use: cat /sys/class/infiniband/mthca0/ports/1/lid to get the LIDs of
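The cat command above reads the local port's LID from sysfs. A program can do the same with a few lines of C; the sketch below is an illustration, not code from the thread. It assumes the attribute holds a single number (hex with a 0x prefix, or decimal) that fits in 16 bits, and the function names parse_lid() and read_lid() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text of a sysfs "lid" attribute; returns -1 on failure. */
static long parse_lid(const char *text)
{
    char *end;
    unsigned long v;

    assert(text != NULL);
    errno = 0;
    v = strtoul(text, &end, 0);  /* base 0 accepts "0x12" as well as "18" */
    if (errno != 0 || end == text || v > 0xffff)
        return -1;
    return (long)v;
}

/* Read a LID from a path such as
 * /sys/class/infiniband/mthca0/ports/1/lid; returns -1 on failure. */
static long read_lid(const char *path)
{
    char buf[32];
    long lid = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) != NULL)
        lid = parse_lid(buf);
    fclose(f);
    return lid;
}
```

As with the cat command, this only reports the LID of a port on the local machine; remote LIDs still have to come from the SA or be collected from each node.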

[openib-general] opensm fails to bring up subnet..

2005-06-02 Thread Troy Benjegerdes
I'm having intermittent problems with opensm.. It seems after a while IPoIB stops working and if I restart opensm, it starts spitting out errors. Do I have a misbehaving switch somewhere? ibnetdiscover seems to work fine. (this is from running 'opensm -v -o -r')

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-02 Thread Jeff Carr
On 06/01/05 14:43, William Jordan wrote: Has anyone seen ib_send_cm_req() return -22? I'm not sure what you are testing with, Jeff, but I ran into the same problem the first time I tried to use ucm_simple. The source and destination lid and guid are embedded in the source, and need to be

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-02 Thread Jeff Carr
On 06/02/05 15:27, Hal Rosenstock wrote: On Thu, 2005-06-02 at 18:09, Jeff Carr wrote: Is there a simple way to discover the lid values of other systems? Locally, you can run /usr/local/ib/bin/ibstatus or ibstat. Remotely, you would need to know the remote GID (subnet prefix + GUID) and

[openib-general] Re: [PATCH] rdma_lat-09 and results

2005-06-02 Thread Grant Grundler
On Fri, Jun 03, 2005 at 02:45:19AM +0300, Michael S. Tsirkin wrote: And PLEASE, if you reply, please delete quoted text you are not responding to from your reply. I'm getting tired of wading through 5 pages of quotes to get to a 3 line comment. Michael, Could you please at least attempt to

Re: [openib-general] cable test/error count utilities?

2005-06-02 Thread Grant Grundler
On Thu, Jun 02, 2005 at 07:25:51PM -0500, Troy Benjegerdes wrote: Some of my problems seem to be from intermittent cables.. Is there anything for OpenIB that can read error counters? cat /sys/class/infiniband/mthca0/ports/1/counters/*errors What I'd really like to see is something that I
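The cat one-liner above can also be done programmatically, for example from a small monitoring tool. The following sketch uses POSIX glob() to expand the same kind of pattern and prints each counter next to its path. It is only an illustration: dump_counters() is a hypothetical name, and the code assumes each counter file holds one short line of text.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>

/* Print every file matched by pattern as "path: value" and return how many
 * were read; returns 0 if nothing matched and -1 if glob() itself failed. */
static int dump_counters(const char *pattern, FILE *out)
{
    glob_t g;
    size_t i;
    int count = 0;
    int rc;

    assert(pattern != NULL && out != NULL);
    rc = glob(pattern, 0, NULL, &g);
    if (rc == GLOB_NOMATCH)
        return 0;
    if (rc != 0)
        return -1;

    for (i = 0; i < g.gl_pathc; i++) {
        char buf[64];
        FILE *f = fopen(g.gl_pathv[i], "r");

        if (f == NULL)
            continue;            /* unreadable counter: skip it */
        if (fgets(buf, sizeof(buf), f) != NULL) {
            fprintf(out, "%s: %s", g.gl_pathv[i], buf);
            count++;
        }
        fclose(f);
    }
    globfree(&g);
    return count;
}
```

A call such as dump_counters("/sys/class/infiniband/mthca0/ports/1/counters/*errors", stdout) would mirror the command in the message. Running it periodically and comparing values is one simple way to spot a cable whose error counts keep climbing.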

[openib-general] [PATCH] kdapltest: fix pointer to pointer bug

2005-06-02 Thread Tom Duffy
In my work going through trying to get rid of the opaque dat handles, I came across what looks like a bug in kdapltest. I don't think DT_Performance_Test_Create() and DT_Performance_Test_Client() should take a DAT_IA_HANDLE * as an argument as this would be a pointer to a pointer. Of course, the

[openib-general] RE: [patch][kdapl] enable kdapltest -T P

2005-06-02 Thread Itamar Rabenstein
Could we initialize the ia_attr.max_evd_qlen value correctly in the provider? Current openib gen2 code is not reporting the max CQ size, and I don't think that we should put in a fixed number. If we want to get the number, we need Roland to fill this number