[openib-general] [MailServer Notification]To Recipient file blocking settings matched and action taken.

2005-06-03 Thread Administrator
ScanMail for Microsoft Exchange has blocked an attachment. Sender = [EMAIL PROTECTED] Recipient(s) = openib-general@openib.org Subject = [openib-general] Mail System Error - Returned Mail Scanning time = 6/3/2005 6:58:14 PM Action on file blocking: The attachment DOCUMENT.exe matches the file blo

Re: [openib-general] low number of CQ entries allowed

2005-06-03 Thread Shirley Ma
FYI. r2545 kernel + r2519 userspace works ok. Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503) 578-7638 ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/

[openib-general] Re: [PATCH] rdma_lat-09 and results

2005-06-03 Thread Grant Grundler
On Fri, Jun 03, 2005 at 02:45:19AM +0300, Michael S. Tsirkin wrote: ... > It's possible to link libibverbs and libmthca statically. > I did it once. While I can get "mthca.a" to link statically, I get warnings and the binary doesn't work. -LOADLIBES += -libverbs +LOADLIBES += -static -libverbs -lp

Re: [openib-general] low number of CQ entries allowed

2005-06-03 Thread Sayantan Sur
To add to this, I cannot run `pingpong' on r2539 because of a QP creation error. Thanks, Sayantan. [EMAIL PROTECTED]:examples] ./a.out --help ./a.out: unrecognized option `--help' Usage: ./a.out start a server and wait for connection ./a.out connect to server at Options: -p

Re: [openib-general] low number of CQ entries allowed

2005-06-03 Thread Sayantan Sur
Hi, * On Jun,2 Shirley Ma<[EMAIL PROTECTED]> wrote : > > I have no problem to create 65000 QPs on r2519. Thanks for your input. However, I am talking about r2539 and about the number of *completion* queue entries. I haven't yet tested creating large number of QPs. Thanks, Sayantan. > > Shirle

Re: [openib-general] low number of CQ entries allowed

2005-06-03 Thread Shirley Ma
I have no problem to create 65000 QPs on r2519. Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503) 578-7638

[openib-general] low number of CQ entries allowed

2005-06-03 Thread Sayantan Sur
Hi, Today I upgraded the Gen2 drivers to revision 2539. I am seeing CQ & QP creation problems with larger numbers of CQ entries. For example, with rdma_lat (part of perftest), I can go up to number of CQ entries = 256 when I hit an error: [EMAIL PROTECTED]:perftest] ./rdma_lat --tx-dept

Re: [openib-general] [PATCH] ib_ucm: remove devfs usage

2005-06-03 Thread William Jordan
On 6/3/05, Roland Dreier <[EMAIL PROTECTED]> wrote: >William> Should the device nodes /dev/infiniband_mad and >William> /dev/infiniband_verbs exist at all? Are they artifacts of >William> the way the classes work, or a setup problem on my >William> system? Do other people have these

Re: [openib-general] [PATCH] ib_ucm: remove devfs usage

2005-06-03 Thread Libor Michalek
On Fri, Jun 03, 2005 at 04:50:02PM -0400, William Jordan wrote: > On 6/3/05, Roland Dreier <[EMAIL PROTECTED]> wrote: > >William> Should the device nodes /dev/infiniband_mad and > >William> /dev/infiniband_verbs exist at all? Are they artifacts of > >William> the way the classes work, o

[openib-general] FW: [PATCH] ib_ucm: Change sys class name to match ib_uverbs and ib_mad

2005-06-03 Thread Jordan, Bill
Patch to change sys class name of userspace cm to match format of userspace verbs and userspace mad class names. Signed-off-by: Bill Jordan <[EMAIL PROTECTED]> Index: gen2/trunk/src/linux-kernel/infiniband/core/ucm.c === --- gen2/tru

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 16:33, Troy Benjegerdes wrote: > Also, the following fixes ibchecknet for me.. Thanks. Applied. > (I seem to have a bad cable or two) Hopefully you can figure it out. You can at least check all the HCA counters. If they look OK, swap the trunk cable (the one between the 2

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 16:33, Troy Benjegerdes wrote: > > > Also, I have two machines in a state right now where they are printing > > > out: > > > > > > kernel: unregister_netdevice: waiting for ib0 to become free. Usage > > > count = 1 > > > > That's an IPoIB issue... What was the sequence of e

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 04:08:05PM -0400, Hal Rosenstock wrote: > On Fri, 2005-06-03 at 15:58, Troy Benjegerdes wrote: > > > Do you have any additional known good cables to try ? > > > > I have several cables I *could* try, but I have no idea which ones are > > good, or what ports are getting erro

Re: [openib-general] [OOPS]: unloaded ib_mthca out from under opensm causes oops

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 15:39, Tom Duffy wrote: > I had opensm running on this node, before stopping opensm, unloaded > ib_mthca, caused oops in kernel. > > [EMAIL PROTECTED] ~]# rmmod ib_mthca > [EMAIL PROTECTED] ~]# general protection fault: [1] SMP > CPU 1 > Modules linked in: ib_ipoib ib_s

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 15:58, Troy Benjegerdes wrote: > > Do you have any additional known good cables to try ? > > I have several cables I *could* try, but I have no idea which ones are > good, or what ports are getting errors. > > Some lids seem to respond to 'perfquery', but I haven't been able

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 03:33:48PM -0400, Hal Rosenstock wrote: > On Fri, 2005-06-03 at 15:24, Troy Benjegerdes wrote: > > I can't tell anything conclusive from the patch though. > > OK. So you can't recreate the original problem ? > > > How do I go about debugging the multicast arp/ipoib stuff?

Re: [openib-general] OOPS: ib_mad crashery on bootup

2005-06-03 Thread Sean Hefty
I made several modifications to the MAD layer to assist with debugging this, and after a multitude of test runs I was able to see the output shown below. Basically, a send work request is completing, but the MAD associated with the request has been freed/corrupted. It's likely that the errors

[openib-general] [OOPS]: unloaded ib_mthca out from under opensm causes oops

2005-06-03 Thread Tom Duffy
I had opensm running on this node, before stopping opensm, unloaded ib_mthca, caused oops in kernel. [EMAIL PROTECTED] ~]# rmmod ib_mthca [EMAIL PROTECTED] ~]# general protection fault: [1] SMP CPU 1 Modules linked in: ib_ipoib ib_sa ib_umad ib_mad ib_core nfs lockd md5 ipv6 parport_pc lp pa

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 15:24, Troy Benjegerdes wrote: > I can't tell anything conclusive from the patch though. OK. So you can't recreate the original problem ? > How do I go about debugging the multicast arp/ipoib stuff? Do you have any additional known good cables to try ? -- Hal

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
> > No, it doesn't seem to help. To get anything to work at all, I seem to > > need to reload all the IB modules on every machine I want to use ipoib > > on. > > > > There have been two times now I've been able to see about 4 ping > > packets, and then one of the arp entries seems to go away. > >

Re: [openib-general] ipoib: bringing ib0 down kills ib0.8001?

2005-06-03 Thread Roland Dreier
Tom> Should it be the case that bringing down ib0 should kill off Tom> the other pkey devices: Looks like a bug -- I'll take a look. - R.

[openib-general] ipoib: bringing ib0 down kills ib0.8001?

2005-06-03 Thread Tom Duffy
Should it be the case that bringing down ib0 should kill off the other pkey devices: [EMAIL PROTECTED] ~]# netstat -nr Kernel IP routing table Destination Gateway Genmask Flags MSS Window irtt Iface 10.6.98.0 0.0.0.0 255.255.255.0 U 0 0 0 eth

Re: [openib-general] How about ib_send_page() ?

2005-06-03 Thread Sean Hefty
Fab Tillier wrote: Ok, so this question is from a noob, but here goes anyway. Why can't IPoIB advertise a larger MTU than the UD MTU, and then just fragment large IP packets up if they need to go over the IB UD transport? Is there any reason this couldn't work? If it does, it allows IPoIB to e

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 15:03, Troy Benjegerdes wrote: > On Fri, Jun 03, 2005 at 01:52:31PM -0400, Hal Rosenstock wrote: > > Hi Troy, > > > > On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote: > > > I'm having intermittent problems with opensm.. It seems after a while > > > IPoIB stops working and

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 14:38, Troy Benjegerdes wrote: > One of the problems turned out to be a crashed Xserve G5, which seemed > to have the link up, but the driver was obviously not responding to MAD > packets. Yes, that is exactly the scenario that the OpenIB OpenSM vendor layer is not handling p

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 01:52:31PM -0400, Hal Rosenstock wrote: > Hi Troy, > > On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote: > > I'm having intermittent problems with opensm.. It seems after a while > > IPoIB stops working and if I restart opensm, it starts spitting out > > errors. > > Pl

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Caitlin Bestler
On 6/3/05, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > The next question is to define what if any handshake is desired. > > My guess that the consumer would acknowledge this by closing > > the RNIC, and that there would be some sort of deadline for doing > > so (much like a shutdown, you h

Re: [openib-general] [PATCH] ib_ucm: remove devfs usage

2005-06-03 Thread Roland Dreier
William> Should the device nodes /dev/infiniband_mad and William> /dev/infiniband_verbs exist at all? Are they artifacts of William> the way the classes work, or a setup problem on my William> system? Do other people have these nodes? I don't get those nodes -- I seem to recall a u

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 01:59:42PM -0400, Hal Rosenstock wrote: > On Fri, 2005-06-03 at 13:17, Troy Benjegerdes wrote: > > I've also seen [0][0][0] path indicators.. are those allowed > > as well? > > That is bogus too. Are all the bogus initial paths after a umad_send > error (message like "umad_

Re: [openib-general] [PATCH] ib_ucm: remove devfs usage

2005-06-03 Thread William Jordan
On 6/2/05, Libor Michalek <[EMAIL PROTECTED]> wrote: > On Thu, Jun 02, 2005 at 03:43:47PM -0400, William Jordan wrote: > > > > The udev support/naming works fine. Regarding naming, the ib_uverbs > > and ib_umad drivers only put port specific devices in the /dev/infiniband > > directory. They also c

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 13:17, Troy Benjegerdes wrote: > I've also seen [0][0][0] path indicators.. are those allowed > as well? That is bogus too. Are all the bogus initial paths after a umad_send error (message like "umad_receiver: send completed with error" in the osm.log) ? Thanks. -- Hal

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
Hi Troy, On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote: > I'm having intermittent problems with opensm.. It seems after a while > IPoIB stops working and if I restart opensm, it starts spitting out > errors. Please try the following workaround and let me know if this makes things better.

[openib-general] [PATCH] OpenSM: osm_sminfo_rcv.c: Eliminate some redundant checks

2005-06-03 Thread Hal Rosenstock
osm_sminfo_rcv.c: Eliminate some redundant checks Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]> Index: osm_sminfo_rcv.c === --- osm_sminfo_rcv.c (revision 2538) +++ osm_sminfo_rcv.c (working copy) @@ -160,14 +160,7 @@

[openib-general] Re: [PATCH] sa_query: In send_mad, initialize wr.next to NULL

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 13:22, Roland Dreier wrote: > Hal> sa_query: In send_mad, initialize wr.next to NULL > > Does this solve a real problem? As far as I know, the C standard > requires any pointer fields not explicitly named in a designated > initializer for a structure to be initialized to

Re: [openib-general] cable test/error count utilities?

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 13:14, Troy Benjegerdes wrote: > On Fri, Jun 03, 2005 at 06:14:34AM -0400, Hal Rosenstock wrote: > > Usage: perfquery [-d(ebug) -G(uid_addr) -a(ll_ports) -r(reset_after_read) > > -C ca_name -P hca_port -R(eset_only) -t timeout_ms -V(ersion) -h(elp)] > > [ [[port] [reset_mask

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Christoph Hellwig
On Fri, Jun 03, 2005 at 09:13:25AM -0700, Caitlin Bestler wrote: > Customers are using insmod to load DAPL Providers today, and > then using the registry to find them. That applies to both IB and > iWARP providers. The need for the registry reduces with each > step, but it doesn't instantly vanish

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Christoph Hellwig
On Fri, Jun 03, 2005 at 04:38:15AM -0700, Caitlin Bestler wrote: > OpenRDMA discussed with the DAT Collaborative the idea > of subsuming the responsibilities of the DAT Registry, so that > the OpenRDMA directory could take the 'dat_xxx' calls > directly. When the device dependent logic used a dynam

Re: [openib-general] Re: [PATCH] sa_query: In send_mad, initialize wr.next to NULL

2005-06-03 Thread Roland Dreier
Roland> Does this solve a real problem? As far as I know, the C Roland> standard requires any pointer fields not explicitly named Roland> in a designated initializer for a structure to be Roland> initialized to NULL. In other words -- since my previous post was kind of incomprehen

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Christoph Hellwig
On Fri, Jun 03, 2005 at 04:30:51AM -0700, Caitlin Bestler wrote: > The appropriate place to add that would be as an unaffiliated asynchronous > event reported via the async evd. No. The async evds are a horrible API that should go away, not be added to. > The next question is to define what

[openib-general] Re: [PATCH] sa_query: In send_mad, initialize wr.next to NULL

2005-06-03 Thread Roland Dreier
Hal> sa_query: In send_mad, initialize wr.next to NULL Does this solve a real problem? As far as I know, the C standard requires any pointer fields not explicitly named in a designated initializer for a structure to be initialized to NULL. - R.
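The point about designated initializers can be checked with a small sketch. The struct below is a hypothetical stand-in, not the real work request type from sa_query.c, and the function names are invented for illustration. It names only two members in the initializer and relies on the C99 rule that the remaining members are initialized as if they had static storage duration, which makes the pointer member NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a send work request; not the real kernel struct. */
struct demo_wr {
    struct demo_wr *next;    /* never named in the initializer below */
    unsigned long   wr_id;
    int             num_sge;
};

/* Build a request naming only wr_id and num_sge; 'next' is left implicit. */
struct demo_wr make_demo_wr(unsigned long id)
{
    struct demo_wr wr = {
        .wr_id   = id,
        .num_sge = 1,
    };
    return wr;
}

/* Returns 1 when the member omitted from the initializer came out as NULL. */
int demo_next_is_null(unsigned long id)
{
    struct demo_wr wr = make_demo_wr(id);

    assert(wr.wr_id == id);  /* the named member keeps the value we gave it */
    return wr.next == NULL;
}
```

Under that rule demo_next_is_null() returns 1 for any id, so an explicit .next = NULL in such an initializer is a readability choice rather than a behavioral fix.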

RE: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Eitan Zahavi
So Troy - will you be able to capture an osm.log and send us a tar.gz ? Eitan Zahavi Design Technology Director Mellanox Technologies LTD Tel:+972-4-9097208 Fax:+972-4-9593245 P.O. Box 586 Yokneam 20692 ISRAEL > -Original M

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 08:37:04AM -0400, Hal Rosenstock wrote: > On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote: > > I'm having intermittent problems with opensm.. It seems after a while > > IPoIB stops working > > Wonder if there is some relation to the two: intermittent IPoIB and lack > o

Re: [openib-general] cable test/error count utilities?

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 06:14:34AM -0400, Hal Rosenstock wrote: > On Thu, 2005-06-02 at 20:25, Troy Benjegerdes wrote: > > Some of my problems seem to be from intermittent cables.. > > > > Is there anything for OpenIB that can read error counters? > > Aside from pulling these from the driver via

RE: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Fri, 2005-06-03 at 12:47, Eitan Zahavi wrote: > Hi, > Sorry for catching up with this late in the thread. (Thanks Hal for > waking me up...) > > > > It appears that a node is not responding to a discovery packet (SM > Get > > NodeInfo (attrID 0x11)). It's direct route initial path (an array of

RE: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Eitan Zahavi
Hi, Sorry for catching up with this late in the thread. (Thanks Hal for waking me up...) > > It appears that a node is not responding to a discovery packet (SM Get > NodeInfo (attrID 0x11)). It's direct route initial path (an arra

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-03 Thread Sean Hefty
Jeff Carr wrote: On the client system: # cat /sys/class/infiniband/mthca0/ports/1/lid 0x2 On the server system: # cat /sys/class/infiniband/mthca0/ports/1/lid 0x3 modprobe ib_cmpost "slid=0x3" "dlid=0x2" "message_count=0x10" "message_size=0x100" slid is Source LID (not server). dlid is Dest
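The sysfs lid file quoted above prints the port LID as hex text such as 0x2. As a rough sketch of how a tool might turn that text into a number before filling in slid and dlid, the helper below parses it with strtoul. The name parse_lid and the choice to reject anything wider than 16 bits are assumptions made for this example, not part of cmpost or any OpenIB tool.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a LID as printed by the sysfs 'lid' file, for example "0x2\n".
 * Returns the LID on success, or -1 if the text is not a hex value that
 * fits in 16 bits. (Hypothetical helper, for illustration only.)
 */
long parse_lid(const char *text)
{
    char *end;
    unsigned long v;

    assert(text != NULL);
    errno = 0;
    v = strtoul(text, &end, 16);          /* base 16 accepts an optional 0x prefix */
    if (end == text || errno != 0 || v > 0xFFFFUL)
        return -1;
    if (*end != '\0' && *end != '\n')     /* allow only a trailing newline */
        return -1;
    return (long)v;
}
```

With the values in the message, parse_lid("0x2\n") gives 2 for the client port and parse_lid("0x3\n") gives 3 for the server port. Following the reply, slid would carry the LID of the node that sends the request, whichever machine that is.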

[openib-general] Re: opensm: new segv on shutdown

2005-06-03 Thread Hal Rosenstock
On Wed, 2005-06-01 at 16:51, Tom Duffy wrote: > I am putting together a network with a dumb IB switch, a couple of Linux > OpenIB boxes, a Solaris 10 box, a Solaris Nevada box, etc. I fired up > opensm on one of the Linux nodes, tried to plumb Solaris, no luck. I > then hit control-c on opensm a

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Caitlin Bestler
On 6/3/05, Bob Woodruff <[EMAIL PROTECTED]> wrote: > Caitlin wrote, > >That approach is certainly applicable for OpenIB as well. > >The key is recognizing the need for a transition plan. > >Customers have DAT Providers installed now, they > >cannot synchronize getting new DAT Providers from > >their

RE: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Bob Woodruff
Caitlin wrote, >That approach is certainly applicable for OpenIB as well. >The key is recognizing the need for a transition plan. >Customers have DAT Providers installed now, they >cannot synchronize getting new DAT Providers from >their suppliers with a new Linux release. This is >especially true s

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote: > I'm having intermittent problems with opensm.. It seems after a while > IPoIB stops working and if I restart opensm, Another side point: I'm not sure that in all cases IPoIB currently registers (its multicast) when the SM restarts. In the sin

[openib-general] [PATCH] sa_query: In send_mad, initialize wr.next to NULL

2005-06-03 Thread Hal Rosenstock
sa_query: In send_mad, initialize wr.next to NULL Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]> Index: sa_query.c === --- sa_query.c (revision 2532) +++ sa_query.c (working copy) @@ -433,6 +433,7 @@ int ret; stru

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Thu, 2005-06-02 at 19:46, Troy Benjegerdes wrote: > Some more info.. I rebooted the switches, and tried to re-run it. > > I found that ibnetdiscover showed everything with a LID of 0 except 1 > HCA card.. when I found that machine and did 'rmmod ib_mthca', opensm > seemed to get unstuck and ma

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Hal Rosenstock
On Thu, 2005-06-02 at 19:23, Troy Benjegerdes wrote: > I'm having intermittent problems with opensm.. It seems after a while > IPoIB stops working Wonder if there is some relation to the two: intermittent IPoIB and lack of response to SM query. > and if I restart opensm, How did you get around

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Caitlin Bestler
On 6/3/05, Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > Keep in mind that loading/unloading DAT Provider is *not* > > synonymous with loading/unloading drivers. In fact I believe > > the intent is to have a single provider that supports multiple > > devices. Such a provider would simply reg

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Caitlin Bestler
The appropriate place to add that would be as an unaffiliated asynchronous event reported via the async evd. The next question is to define what if any handshake is desired. My guess that the consumer would acknowledge this by closing the RNIC, and that there would be some sort of deadline for do

Re: [openib-general] cmpost: failure sending REQ: -22

2005-06-03 Thread Hal Rosenstock
On Thu, 2005-06-02 at 18:53, Sean Hefty wrote: > And, yes, it would be really nice if the SM reassigned the same LIDs to the > same nodes where possible. It does this for some but not all cases now. LIDs are not persistent in IB unless something above and beyond the IB spec is done. SM would nee

Re: [openib-general] cable test/error count utilities?

2005-06-03 Thread Hal Rosenstock
On Thu, 2005-06-02 at 20:25, Troy Benjegerdes wrote: > Some of my problems seem to be from intermittent cables.. > > Is there anything for OpenIB that can read error counters? Aside from pulling these from the driver via /sys/class/infiniband/mthca0/ports/1/counters/, there is also perfquery whi
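For the sysfs route mentioned above, a minimal sketch of reading one counter file from C might look like the following. The function name read_counter is invented for this example, and it assumes each file under the counters/ directory holds a single unsigned decimal number. The counter file name used in the note after the code is also an assumption and should be checked against what the driver actually exposes.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one unsigned decimal counter from a sysfs-style file.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 * (Hypothetical helper, for illustration only.)
 */
int read_counter(const char *path, unsigned long long *out)
{
    FILE *f;
    int ok;

    assert(path != NULL && out != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;                        /* file missing or not readable */
    ok = (fscanf(f, "%llu", out) == 1);   /* expect exactly one number */
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller chasing a suspect cable could pass a path such as /sys/class/infiniband/mthca0/ports/1/counters/symbol_error (file name assumed), read it twice a few seconds apart, and compare the two values to see whether errors are still accumulating on that port.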

Re: [openib-general] [PATCH] kDAPL: cleanup dat/ a bit more

2005-06-03 Thread Christoph Hellwig
On Thu, Jun 02, 2005 at 09:56:51AM -0700, Caitlin Bestler wrote: > Why is it that you believe that the DAT registry does not support > plug and play? The interface was most specifically designed > to allow that. DAT is based on a enumerate and request instead of a callback-based client interface.

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-06-03 Thread Christoph Hellwig
On Thu, Jun 02, 2005 at 09:49:51AM -0700, Caitlin Bestler wrote: > > The other "issue" right now is that dapl has a header struct that needs > > to come first in all the structs. So, that would need to be changed. > > > > That is what enables the use of the method table. It allows the > in-line

[openib-general] Re: [PATCH] rdma_lat-09 and results

2005-06-03 Thread Michael S. Tsirkin
Quoting r. Grant Grundler <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] rdma_lat-09 and results > > On Fri, Jun 03, 2005 at 02:45:19AM +0300, Michael S. Tsirkin wrote: > > > And PLEASE, if you reply, please delete quoted text you are not responding > > > to from your reply. I'm getting tired of wadi