Re: [openib-general] Port error rate detection

2007-02-27 Thread Troy Benjegerdes
On Mon, Feb 19, 2007 at 03:53:36PM -0500, Steven Carter wrote: > I have a Nagios module that alerts on connectivity, port errors, > speed/width problems. I would like to give it the ability to change the > severity of the alert depending on whether errors are just present or if > they are incre
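The severity idea raised in the quoted message (errors merely present versus errors still increasing) can be sketched by comparing two samples of a port error counter. This is only an illustration: the function name, the `crit_rate` threshold, and the use of Nagios-style status strings are assumptions, not details of the module being discussed.

```python
def port_error_severity(previous, current, interval_s, crit_rate=1.0):
    """Classify a port from two samples of one error counter.

    previous, current: counter values at the start and end of the interval.
    interval_s: seconds between the two samples.
    crit_rate: hypothetical threshold, in errors per second.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if current == 0:
        return "OK"
    delta = current - previous
    if delta < 0:
        # Counter was cleared between samples; count from zero.
        delta = current
    rate = delta / interval_s
    if rate >= crit_rate:
        return "CRITICAL"  # errors are still accumulating quickly
    return "WARNING"       # errors present, but static or growing slowly
```

A check would keep the previous sample between runs and pass both values in; the threshold would need tuning per counter type.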

[openib-general] remove www.openfabrics.org SVN links..

2007-02-27 Thread Troy Benjegerdes
Can someone please update the main www.openfabrics.org web page to remove all references to subversion, and link to a wiki page on how to get the latest source? Thanks. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/

Re: [openib-general] ehca build on 2.6.18.2??

2006-11-12 Thread Troy Benjegerdes
, 2006, at 4:57 PM, Troy Benjegerdes wrote: > what is up with subversion? it does not build with errors like this: > >CC [M] drivers/infiniband/core/uverbs_main.o > drivers/infiniband/core/uverbs_main.c: In function > 'uverbs_event_get_sb': > drivers/infiniband/core/u

[openib-general] ehca build on 2.6.18.2??

2006-11-12 Thread Troy Benjegerdes
what is up with subversion? it does not build with errors like this: CC [M] drivers/infiniband/core/uverbs_main.o drivers/infiniband/core/uverbs_main.c: In function 'uverbs_event_get_sb': drivers/infiniband/core/uverbs_main.c:811: error: too few arguments to function 'get_sb_pseudo' driver

Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-23 Thread Troy Benjegerdes
On Oct 23, 2006, at 8:42 AM, Hoang-Nam Nguyen wrote: > Hello Troy! >> The netpipe code is available with mercurial by: >> hg clone http://source.scl.ameslab.gov/hg/netpipe3-pvfs-dev >> Once you have pvfs2-1.5.1 installed, you should be able to do 'make >> pvfs' in the netpipe3-pvfs-dev directory

Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-19 Thread Troy Benjegerdes
>>> I'm not sure the standard OpenIB NetPIPE runs can reproduce this >>> type of workload. However, we have developed a working PVFS2- >>> NetPIPE module which can reproduce this problem on occasion, if >>> there is interest in further testing this on your end, I can make >>> it available. > Yes

[openib-general] ibv_reg_mr temporary vs permanent errors

2006-10-18 Thread Troy Benjegerdes
If ibv_reg_mr fails, can an application (or library, such as pvfs) assume that this is just a temporary error, and try to deregister some memory, then try again? How can we differentiate between the case where the hardware (such as ehca) actually has more information about why the memory reg
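The "deregister some memory, then try again" strategy the message asks about can be sketched as a bounded retry loop. In libibverbs, `ibv_reg_mr` returns NULL on failure and `ibv_dereg_mr` releases a region; the sketch below abstracts those two calls as callbacks and is written in Python for brevity. The names `register_with_retry`, `register`, `evict_one`, and `max_evictions` are hypothetical, and the loop does not answer whether a given failure is temporary or permanent: it only bounds how long the caller keeps trying.

```python
def register_with_retry(register, evict_one, max_evictions=8):
    """Attempt a registration, evicting cached regions between attempts.

    register: callable returning a handle, or None on failure
              (stands in for a wrapper around ibv_reg_mr).
    evict_one: callable that frees one cached region and returns True,
               or returns False when nothing is left to free
               (stands in for a wrapper around ibv_dereg_mr).
    Returns the handle, or None if every attempt failed.
    """
    for attempt in range(max_evictions + 1):
        handle = register()
        if handle is not None:
            return handle
        # Give up when the retry budget is spent or nothing can be freed.
        if attempt == max_evictions or not evict_one():
            break
    return None
```

If the function returns None after the cache has been emptied, the caller can reasonably treat the failure as permanent for that request.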

Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-18 Thread Troy Benjegerdes
(I am taking this back to the openib list because I think the list needs to hear about real applications that are hitting memory registration limits) What are the limits on the ehca memory registrations? Is there a limit to the number of regions that can be registered? Is there any way (wit

[openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-16 Thread Troy Benjegerdes
I am running PVFS2 on OpenIB, with IBM's ehca. When we start writing/reading large files, either with the NetPIPE PVFS module we have or a modified GAMESS executable that uses libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an error. This is also correlated with kernel log mess

Re: [openib-general] xt3 troubles (with OFED 1.0.1)

2006-08-01 Thread Troy Benjegerdes
On Tue, Aug 01, 2006 at 05:39:49PM -0400, Makia Minich wrote: > So, after flailing about with my IPOIB issue on the XT3, I decided that > perhaps a firmware upgrade (from 3.3.3 to 3.4.0) might be in order. Prior > to the upgrade, I was able to bring the entire stack online and see the > infiniband

[openib-general] making sense of dapl (and dat.conf)

2006-08-01 Thread Troy Benjegerdes
So, let's suppose I build ibverbs, libehca/libmthca, and dapl from subversion trunk.. what should my /etc/dat.conf file look like so things actually work? Right now I have: OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "10.40.4.56 0" "" OpenIB-cma-ip u1.2 nonth

[openib-general] more ehca errors..

2006-07-27 Thread Troy Benjegerdes
Does this mean I have a mismatched kernel/firmware/ehca revision? This is the 2.6.17 kernel with a relatively recent ehca driver from subversion (SVNEHCA_0009) p5l3:~# p5l3:~# strace -ewrite -ewrite=all ibv_rc_pingpong write(3, "\0\0\0\0\0\4\0\2\0\0\0\0\377\350\10\230", 16) = 16 | 0 00 00 0

[openib-general] debian packages for ehca?

2006-07-27 Thread Troy Benjegerdes
I see there are debian packages in testing for libibverbs and mthca.. is the ehca userspace component ready to be packaged and put in debian testing? If not, what does it need yet?

[openib-general] ehca issues, again

2006-07-12 Thread Troy Benjegerdes
This is the latest svn ehca code, 2.6.17 kernel. Can I also request that the EHCA driver print out what PHYP firmware it is known to work with, just like mthca prints out a warning if the mellanox card firmware is out of date? And while I'm asking about PHYP, what version are the ehca developer

Re: [openib-general] [ANNOUNCE] NetPIPE 3.7 release candidate 1

2006-07-07 Thread Troy Benjegerdes
Pradipta Kumar Banerjee wrote: > Sean Hefty wrote: >> Troy Benjegerdes wrote: >>> Is the connection manager api considered 'stable' yet? Last I knew >>> it was still undergoing a lot of development. >> >> The connection manager API is relatively

Re: [openib-general] [ANNOUNCE] NetPIPE 3.7 release candidate 1

2006-07-07 Thread Troy Benjegerdes
Pradipta Kumar Banerjee wrote: > Troy Benjegerdes wrote: > >> I am preparing to release an update to the NetPIPE benchmark >> ( http://scl.ameslab.gov/Projects/NetPIPE/NetPIPE.html ), and I would >> very much like to hear some feedback on the OpenIB verbs implementation

[openib-general] [ANNOUNCE] NetPIPE 3.7 release candidate 1

2006-07-06 Thread Troy Benjegerdes
I am preparing to release an update to the NetPIPE benchmark ( http://scl.ameslab.gov/Projects/NetPIPE/NetPIPE.html ), and I would very much like to hear some feedback on the OpenIB verbs implementation (NPibv), and take any patches to make it build on Windows as well. I would also like to hear from

[openib-general] libehca.conf required, but no documentation..

2006-07-05 Thread Troy Benjegerdes
Please provide some documentation for libehca.conf. There is not even a README in userspace/libehca. Also, what firmware are the people doing ehca development using? Most versions I have tried seem to have some sort of nasty issue if the SM doesn't bring the port active fast enough, which seems l

[openib-general] EHCA broken for 2.6.16?

2006-06-01 Thread Troy Benjegerdes
Okay guys, what's up this time? Kernel 2.6.16.. CC [M] drivers/infiniband/hw/ehca/ehca_main.o In file included from drivers/infiniband/hw/ehca/ehca_qes.h:47, from drivers/infiniband/hw/ehca/ipz_pt_fn.h:46, from drivers/infiniband/hw/ehca/ehca_classes.h:46,

[openib-general] opensm segfault?

2006-05-16 Thread Troy Benjegerdes
I got this after an indeterminate amount of time running opensm.. (gdb) bt #0 0x2b90b0dbebf3 in cl_memcpy (p_dest=0x2ac88850, p_src=0x0, count=64) at cl_memory_osd.c:87 #1 0x00415053 in osm_pkey_tbl_sync_new_blocks ( p_pkey_tbl=0x2ad99228) at osm_pkey.c:127 #2 0x000

Re: [openib-general] Re: TSO and IPoIB performance degradation

2006-05-02 Thread Troy Benjegerdes
On Thu, Apr 27, 2006 at 05:16:29PM -0700, Greg Lindahl wrote: > On Thu, Apr 27, 2006 at 04:22:40PM -0700, Grant Grundler wrote: > > > Anything preventing such a gateway from routing SDP to ethernet? > > Those gateways obviously will grok IB protocols. > > I'm asking because I don't understand/know

[openib-general] ibv_rc_pingpong debugging..

2006-04-28 Thread Troy Benjegerdes
So, how do I start debugging this? ibv_devinfo reports the port as active.. what else would cause this? (I am running the userspace modules from http://openib.red-bean.com/rc2/SOURCES/ , and kernel 2.6.16.11) [EMAIL PROTECTED] netpipe3-dev]# ibv_rc_pingpong -n 1 node4 local address: LID 0x00

[openib-general] Re: TSO and IPoIB performance degradation

2006-04-26 Thread Troy Benjegerdes
On Mon, Mar 20, 2006 at 02:37:04AM -0800, David S. Miller wrote: > From: "Michael S. Tsirkin" <[EMAIL PROTECTED]> > Date: Mon, 20 Mar 2006 12:22:34 +0200 > > > Quoting r. David S. Miller <[EMAIL PROTECTED]>: > > > The path an SKB can take is opaque and unknown until the very last > > > moment it i

[openib-general] opensm issues on 64 node RHEL4 cluster?

2006-04-13 Thread Troy Benjegerdes
We just moved a cluster over to the latest redhat release, and opensm seems to be having issues. This is running the redhat provided kernel and opensm packages [EMAIL PROTECTED] troy]# uname -r 2.6.9-34.ELsmp [EMAIL PROTECTED] troy]# cat /etc/redhat-release Red Hat Enterprise Linux WS release 4 (

Re: [openib-general] EHCA crash on module unload?

2006-04-11 Thread Troy Benjegerdes
I had unplugged, then re-plugged the cable, and then ran the following: rmmod hcad_mod ib_mthca ib_uverbs ib_ipoib ib_sa ib_mad ib_core Heiko J Schick wrote: Hello Troy, did you unload first all OpenIB modules and then the eHCA module or the other way around? Can you see any other message (

[openib-general] EHCA crash on module unload?

2006-04-11 Thread Troy Benjegerdes
p5l2:/usr/src/linux-2.6.16/drivers/infiniband# svnversion . 5988 p5l2:~# [86044.767087] Unable to handle kernel paging request for data at address 0x0068 [86044.767115] Faulting instruction address: 0xd00018fd4b38 [86044.767132] Oops: Kernel access of bad area, sig: 11 [#1] [86044.767149]

[openib-general] ehca error message translation request..

2006-03-23 Thread Troy Benjegerdes
Can someone please translate? babelfish doesn't talk ibmese.. [8270280.043608] eHCA Infiniband Device Driver (Rel.: SVNEHCA_0002) [8297399.067840] PU0002 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR opcode=168 ret=ffd3 arg1=10010304 arg2=2009 arg3=ac0 arg4=

Re: [openib-general] ehca ipz_qeit_reset???

2006-03-23 Thread Troy Benjegerdes
Okay guys, what gives here.. src/ehca_umain.c: In function 'ehcau_modify_qp': src/ehca_umain.c:407: warning: implicit declaration of function 'ipz_qeit_reset' gcc -DHAVE_CONFIG_H -I. -I. -I. -O2 -g -Wall -D_GNU_SOURCE -DP_SERIES -I../libibverbs/include -Isrc -g -O2 -MT src_libehca_la-ehca_umain.

Re: [openib-general] ehca weirdness??

2006-03-23 Thread Troy Benjegerdes
On Thu, Mar 23, 2006 at 01:35:50PM -0800, Roland Dreier wrote: > Troy> Okay, this is hokey. Both drivers should be able to > Troy> coexist. Here is a full strace with the libmthca.so removed, > Troy> which still doesn't seem to work right. > > Yes, the drivers should be able to coexist

Re: [openib-general] ehca weirdness??

2006-03-23 Thread Troy Benjegerdes
On Thu, Mar 23, 2006 at 11:10:19AM -0800, Roland Dreier wrote: > > libibverbs: Warning: no userspace device-specific driver found for uverbs0 > > driver search path: /usr/lib/infiniband > > Is the ehca driver in that directory? As far as I can tell from the > strace and the libehca sour

[openib-general] ehca weirdness??

2006-03-23 Thread Troy Benjegerdes
I just built a fresh 2.6.16 kernel, and the ehca (and associated ibverbs stuff) from the latest stubversion (5988), and things like 'ibv_devices' fail.. p5l3:~# ibv_devices libibverbs: Warning: no userspace device-specific driver found for uverbs0 driver search path: /usr/lib/infiniband

Re: [openib-general] debian package version check issues

2006-02-21 Thread Troy Benjegerdes
Note to self: Make sure all old '/usr/local/include/infiniband' stuff is nuked when you install the debian packages. (Btw, is that an error if configure picks up /usr/local/include/infiniband before /usr/include/infiniband?) On Tue, Feb 21, 2006 at 06:14:15PM -0600, Troy Benjege

Re: [openib-general] debian package version check issues

2006-02-21 Thread Troy Benjegerdes
On Tue, Feb 21, 2006 at 06:14:15PM -0600, Troy Benjegerdes wrote: > There's a few bogons in the libmthca version checks.. > And some build problems too, apparently.. I just installed the libibverbs-dev package. cc -DHAVE_CONFIG_H -I. -I. -I. -g -Wall -D_GNU_SOURCE -g -Wall -O2 -MT s

[openib-general] debian package version check issues

2006-02-21 Thread Troy Benjegerdes
There's a few bogons in the libmthca version checks.. opteron2:/usr/src/openib-src/userspace/libmthca# dpkg-buildpackage dpkg-buildpackage: source package is libmthca dpkg-buildpackage: source version is 1.0 dpkg-buildpackage: source changed by Roland Dreier <[EMAIL PROTECTED]> dpkg-buildpackage:

Re: [openib-general] iwarp: whats a pkey?

2006-01-27 Thread Troy Benjegerdes
On Fri, Jan 27, 2006 at 03:34:48PM -0800, Caitlin Bestler wrote: > [EMAIL PROTECTED] wrote: > > On Fri, 2006-01-27 at 15:32, Roland Dreier wrote: > >> Roland> No, I think trying to create a mapping is a bad idea. > >> The Roland> semantics of VLANs and IB partitions are sufficiently > >>

Re: [openib-general] 2.6.14 & ib_umad segfault

2006-01-26 Thread Troy Benjegerdes
On Thu, Jan 26, 2006 at 01:28:59PM -0800, Roland Dreier wrote: > http://openib.org/pipermail/openib-general/2006-January/015216.html Blah. I guess that's what I deserve for running something more than 2 weeks old ;)

[openib-general] 2.6.14 & ib_umad segfault

2006-01-26 Thread Troy Benjegerdes
2.6.14, amd64, svn 5193 This happens when loading 'ib_umad' [ 282.510929] Unable to handle kernel paging request at 0e70010c RIP: [ 282.516469] {kref_get+1} [ 282.524371] PGD d6f9d067 PUD d6825067 PMD 0 [ 282.529521] Oops: [1] SMP [ 282.533312] CPU 0 [ 282.535745] Modules linke

Re: [openib-general] [PATCH] OpenSM: include OpenIB svn version when OpenIB build

2006-01-26 Thread Troy Benjegerdes
the version. > I do not recall exactly where we have left it. I remember someone > proposed a standard svn command to extract that. > Sorry about that. We all would like to get that information too. > > Eitan > > > -Original Message- > > From: Troy Benj

Re: [openib-general] [PATCH] OpenSM: include OpenIB svn version when OpenIB build

2006-01-26 Thread Troy Benjegerdes
Is there a good reason that this patch hasn't been applied yet?? If you want me to provide useful debugging reports, I need to be able to tell from the log which SVN version opensm was built from. On Tue, Jan 03, 2006 at 12:43:33PM -0500, Hal Rosenstock wrote: > OpenSM: include OpenIB svn versi

Re: [openib-general] RE: [PATCH] [TRIVIAL] OpenSM: Separate out OSM_VERSION

2006-01-03 Thread Troy Benjegerdes
On Tue, Jan 03, 2006 at 11:01:43AM -0500, Hal Rosenstock wrote: > On Tue, 2006-01-03 at 10:43, Eitan Zahavi wrote: > > Hi Hal, > > > > Sounds good. > > I think you should be able to use the .svn/entries to get the last > > update revision and then use svn diff (or diff) to see if local mods are >

Re: [openib-general] Userspace testing results (2.6.15-rc7-git2 with modules)

2005-12-31 Thread Troy Benjegerdes
> Currently, I am running netpipe, iperf and netperf (these three tests > are giving horrible results but we are pretty sure that it is a local > issue, as both eth1 and ib0 based tests lead to poor performance) and > also netpipe with a patch from Shirley Ma to run over native IB [1]. > Additional

Re: [openib-general] OpenSM not coming out of standby state..

2005-11-30 Thread Troy Benjegerdes
On Wed, Nov 30, 2005 at 07:33:44PM -0600, Troy Benjegerdes wrote: > A couple of days ago I started up two instances of opensm on my network, > and set one with priority 11, the other with the default 10. > > I could kill one and the other would become master a few minutes later. >

[openib-general] OpenSM not coming out of standby state..

2005-11-30 Thread Troy Benjegerdes
em, let me know) -- -- Troy Benjegerdes'da hozer'[EMAIL PROTECTED] Somone asked me why I work on this free (http://www.fsf.org/philosophy/) software stuff and not get a real job. Charles Shultz had

Re: [openib-general] Re: [RFC] OpenSM: include svn version in build string

2005-11-22 Thread Troy Benjegerdes
> > > > I dont think its a good idea to add dependency on svnversion to > the makefile. > > Lets just add --with-version= option to configure. > Then the user can run it --with-version=`svnversion` If you remove the automatic generation of the version info from svnversion, you defeat the whole

Re: [openib-general] OpenSM Debug

2005-11-22 Thread Troy Benjegerdes
On Sun, Nov 20, 2005 at 09:18:27AM -0800, Fab Tillier wrote: > > From: Hal Rosenstock [mailto:[EMAIL PROTECTED] > > Sent: Sunday, November 20, 2005 4:59 AM > > > > Hi Fab, > > > > On Sat, 2005-11-19 at 13:50, Fab Tillier wrote: > > > > > > That's correct - structure definitions change between the

Re: [openib-general] Re: [RFC] OpenSM: include svn version in build string

2005-11-22 Thread Troy Benjegerdes
On Tue, Nov 22, 2005 at 06:00:47PM +0200, Michael S. Tsirkin wrote: > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>: > > Subject: [RFC] OpenSM: include svn version in build string > > > > Hi, > > > > It has been requested (several times now :-) that the svn version be > > included in the OpenSM b

Re: [openib-general] another opensm crash

2005-11-20 Thread Troy Benjegerdes
Eitan Zahavi wrote: Hi Hal, To reproduce the problems we see in large subnets we have to revive the simulator project. Yael will spend some time evolving the packet dropper test on the simulator and I hope we will be able to reproduce this kind of bugs. The limit of the current test is that it

Re: [openib-general] another opensm crash

2005-11-15 Thread Troy Benjegerdes
On Mon, Nov 14, 2005 at 09:54:28PM +0200, Eitan Zahavi wrote: > Hi Troy > > Try to move aside your /lib/tls directory and see if you still get these > crashes. > We have issues with TLS pthread and glibc We still have issues with -maxsmps=8. And no, running with maxsmps=1 is not an option on thi

[openib-general] another opensm crash

2005-11-14 Thread Troy Benjegerdes
(gdb) bt #0 0x08071ff3 in osm_si_rcv_process (p_rcv=0x8090138, p_madw=0x80a1de0) at osm_sw_info_rcv.c:679 #1 0xb7fb0213 in __cl_disp_worker (context=0x8090da4) at cl_dispatcher.c:108 #2 0xb7fb8557 in __cl_thread_pool_routine (context=0x8090de4) at cl_threadpool.c:78 #3 0xb7fb834d in __

Re: [openib-general] [PATCH] Opensm - lid assignment issues

2005-11-13 Thread Troy Benjegerdes
Yael Kalka wrote: Hi Hal, During some Windows tests we've discovered that there is still another problem in the lid_mgr. The problem happened when 2 HCAs had the same lid - opensm entered an infinite loop. The following patch fixes this. Thanks, Yael Signed-off-by: Yael Kalka <[EMAIL PROTECT

Re: [openib-general] OpenSM and Wrong SM_Key

2005-11-12 Thread Troy Benjegerdes
On Sat, Nov 12, 2005 at 07:34:44PM +0200, Eitan Zahavi wrote: > Hi Troy, > > Good to get a straight forward message. > > What I hear you saying is: > 1. There needs to be a parameter to control the SM behavior if it finds > another SM with non matching SM Key: > -> Either to ignore it or to die.

Re: [openib-general] OpenSM and Wrong SM_Key

2005-11-10 Thread Troy Benjegerdes
On Wed, Nov 09, 2005 at 09:46:06AM +0200, Eitan Zahavi wrote: > Hi Hal, > > I would like to bring this to MgtWG before we change anything. > IMO the situation when this happens is really not "legal" since if the > SM's are not coordinated at least in their SM_Key it will cause the two > masters on

Re: [openib-general] libehca causes segfault when not physically present..

2005-11-03 Thread Troy Benjegerdes
On Thu, Nov 03, 2005 at 11:13:58AM -0800, Roland Dreier wrote: > Heiko> this bug should be fixed in OpenIB trunk 3960. > > It's good to see this fixed and all the other cleanups in this > checkin. I'll have to go back to my ehca code reviewing > > However, when this code moves upstream,

[openib-general] OpenSM errors question..

2005-11-02 Thread Troy Benjegerdes
What does the following mean? (the ERR 1B11, in particular) Nov 02 16:18:33 656702 [41001960] -> osm_report_notice: Reporting Generic Notice type:4 num:144 from LID:0x001B GID:0xfe80,0x0002c90200402789 Nov 02 16:18:33 674607 [41802960] -> osm_ucast_mgr_process: Min Hop Tables configure

Re: [openib-general] opensm errors with ehca

2005-11-01 Thread Troy Benjegerdes
> Can you try the following opensm patch and see if this eliminates those > timeout messages ? > > This patch clears the high part of the attribute modifier when not a > switch (when obtaining the PKeyTable). > > -- Hal > > Index: osm_port_info_rcv.c > ===

[openib-general] libehca causes segfault when not physically present..

2005-10-30 Thread Troy Benjegerdes
On an Openpower720 system with a mellanox HCA (and no IBM ehca installed), I get the following when trying to run ibv_rc_pingpong: Starting program: /usr/src/openib-src/userspace/libibverbs/examples/.libs/ibv_rc_pingpong [Thread debugging using libthread_db enabled] [New Thread 4398046660640 (LWP

[openib-general] opensm errors with ehca

2005-10-30 Thread Troy Benjegerdes
The firmware on the IBM eHCA causes opensm to spit out these kinds of errors all the time.. Is there a way we can either not send P_KeyTable requests to any eHCA guids, or figure out what (if anything) is broken in their firmware? Is this a spec violation, or just ambiguities in implementation?

Re: [openib-general] prototype version of ebus driver

2005-10-28 Thread Troy Benjegerdes
On Wed, Oct 26, 2005 at 04:56:08PM +0200, IBMEHCA DD wrote: > on kernel 2.6.13 and 14 a "ebus" driver is needed to enable the ehca > driver on power5. > I just uploaded a prototype patch to gen2/users/ehca svn 3879 > Please get some responses from the PPC64 maintainers, or possibly linux-kernel.

Re: [openib-general] Re: ehca testing

2005-10-28 Thread Troy Benjegerdes
On Thu, Oct 27, 2005 at 10:03:17AM -0700, Roland Dreier wrote: > OK, looks like you have two problems. First of all, you seem to have > two versions of ib_mthca, one of which gets picked up by hotplug on > boot and one of which gets picked up by modprobe. Notice how you > don't see the > > d

Re: [openib-general] Re: ehca testing

2005-10-27 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 03:32:13PM -0700, Roland Dreier wrote: > Troy> There is some sort of strange initialization error going on here.. > > Yes, very strange. Can you add > > printk(KERN_ERR "hca->node_type = %d\n", hca->node_type); > > to the beginning of ipoib_add_port(), and >

[openib-general] ib_mthca panic on PPC64

2005-10-27 Thread Troy Benjegerdes
I got this the other day (before I had a chance to add the debug code) p5l0:~# [443954.161068] mthca0: ib_query_pkey port 0 failed (ret = -22) [443988.334644] mthca0: ib_query_pkey port 0 failed (ret = -22) [444037.579342] ib_mthca: Mellanox InfiniBand HCA driver v0.06 (June 23, 2005) [444037.5793

Re: [openib-general] [RFC] OpenSM Interactive Console

2005-10-27 Thread Troy Benjegerdes
972-4-9097208 > Fax:+972-4-9593245 > P.O. Box 586 Yokneam 20692 ISRAEL > > > > -Original Message- > > From: Hal Rosenstock [mailto:[EMAIL PROTECTED] > > Sent: Wednesday, October 26, 2005 7:44 PM > > To: Eitan Zahavi > > Cc: Troy Benjegerdes; openib-gen

Re: [openib-general] Re: ehca testing

2005-10-20 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 11:03:28AM -0700, Roland Dreier wrote: > Troy> I've since found I have the same problem without hcad_mod. I > Troy> don't see any errors in dmesg except for: > > Troy> [ 7415.421699] mthca0: ib_query_pkey port 0 failed (ret = -22) > > It's strange that IPoIB is

Re: [openib-general] Re: ehca testing

2005-10-20 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 11:03:28AM -0700, Roland Dreier wrote: > Troy> I've since found I have the same problem without hcad_mod. I > Troy> don't see any errors in dmesg except for: > > Troy> [ 7415.421699] mthca0: ib_query_pkey port 0 failed (ret = -22) > > It's strange that IPoIB is

Re: [openib-general] Re: ehca testing

2005-10-20 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 10:48:35AM -0700, Roland Dreier wrote: > Troy> This is strange.. This machine has a mellanox card, but no > Troy> ehca card. It looks like when hcad_mod and ib_mthca are > Troy> both loaded something conflicts. > > Have you confirmed that it works without hcad_

Re: [openib-general] EHCA-0028 userspace build fails with openib svn 3774

2005-10-20 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 08:46:12AM +0200, Heiko J Schick wrote: > Hello Troy, > > this problem should be solved in EHCA2_0033. The EHCA2_0028 package was > only tested with OpenIB trunk 3615. The problem is EHCA_0028 doesn't > included the raw_fw_ver pointer for ibv_cmd_query_device. EHCA_0033 (t

[openib-general] Re: ehca testing

2005-10-20 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 04:47:07PM +0200, Christoph Raisch wrote: > Can't promise the opensm part, but we'll try. > We have some intel machines with mellanox cards. A second possibility > would be to just use the mellanox cards in our power5 boxes for opensm and > see what happens. This is stran

Re: [openib-general] EHCA-0028 userspace build fails with openib svn 3774

2005-10-20 Thread Troy Benjegerdes
On Thu, Oct 20, 2005 at 08:46:12AM +0200, Heiko J Schick wrote: > Hello Troy, > > this problem should be solved in EHCA2_0033. The EHCA2_0028 package was > only tested with OpenIB trunk 3615. The problem is EHCA_0028 doesn't > included the raw_fw_ver pointer for ibv_cmd_query_device. > > Please u

Re: [openib-general] [RFC] OpenSM Interactive Console

2005-10-20 Thread Troy Benjegerdes
> > * Topology > > This can be done via SA queries currently. > > > * guid/lid/IPoIB address/switch port mappings > > The SM does not know (see) IPoIB addresses. The only thing it sees is > the part of the subnet address. > > The rest can be done via SA queries currently. > > > * link state >

Re: [openib-general] where is IB_WARN defined?

2005-10-19 Thread Troy Benjegerdes
> > Hrrm.. it looks like for some reason the top level makefile didn't > > rebuild libibcommon. > > Not sure why that would be. > > In my top level generated Makefile, > LIBS:=libibcommon libibumad libibmad > > @for i in $(LIBS); do\ > if [ -x $$i/autogen.sh ]; then\ >

[openib-general] EHCA-0028 userspace build fails with openib svn 3774

2005-10-19 Thread Troy Benjegerdes
make[1]: Entering directory `/usr/src/openib-src/userspace/libehca' if /bin/sh ./libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I.-O3 -g -Wall -D_GNU_SOURCE -DP_SERIES -I../libibverbs/include -Isrc -g -O2 -MTsrc_libehca_la-ehca_umain.lo -MD -MP -MF ".deps/src_libehca_la-ehca_umain.Tp

Re: [openib-general] [RFC] OpenSM Interactive Console

2005-10-19 Thread Troy Benjegerdes
On Tue, Oct 18, 2005 at 03:10:31PM -0400, Hal Rosenstock wrote: > Currently, OpenSM does not support an interactive console. There has > been a desire to introduce the ability to change certain parameters (as > well as display things) once OpenSM has started. This patch introduces > the first most

Re: [openib-general] where is IB_WARN defined?

2005-10-19 Thread Troy Benjegerdes
On Wed, Oct 19, 2005 at 04:30:06PM -0400, Hal Rosenstock wrote: > On Wed, 2005-10-19 at 16:09, Troy Benjegerdes wrote: > > I'm trying to rebuild opensm, and the libibumad configure is failing > > because > > IB_WARN is apparently not defined anyplace I can find it

[openib-general] EHCA ipoib error..

2005-10-19 Thread Troy Benjegerdes
I get the following errors when trying to bring up ipoib: 10:~# modprobe ib_hcad_mod ehca_nr_ports=1 FATAL: Module ib_hcad_mod not found. 10:~# modprobe hcad_mod ehca_nr_ports=1 [ 1401.993165] eHCA Infiniband Device Driver (Rel.: EHCA2_0028) [ 1401.994486] xics_enable_irq: irq=36868: ibm_int_on re

[openib-general] where is IB_WARN defined?

2005-10-19 Thread Troy Benjegerdes
I'm trying to rebuild opensm, and the libibumad configure is failing because IB_WARN is apparently not defined anyplace I can find it.

Re: [openib-general] Cray XD1 and OpenSM.. (ignoreing certain guids?)

2005-10-15 Thread Troy Benjegerdes
It's more than that: in addition to the IB hardware/driver difference, it will need to be ported from Linux to whatever Cray OS is. The Cray XD1 is actually running Linux.. I've even managed to build and boot my own kernel on one. They are actually using a derivative of the OpenIB SDP code

Re: [openib-general] Cray XD1 and OpenSM.. (ignoreing certain guids?)

2005-10-15 Thread Troy Benjegerdes
I'm unaware of such an option. Not sure how you would specify which nodes to ignore. Why would you want them on the net if they are to be ignored ? Nodes are supposed to be IB compliant: SMA is a required component of all nodes. So I presume there is no SMA for the Cray XD1. If someone

[openib-general] Cray XD1 and OpenSM.. (ignoreing certain guids?)

2005-10-14 Thread Troy Benjegerdes
In the interest of plugging absolutely everything I have with infiniband ports together and seeing what falls over, I connected a Cray XD1 to a small (2 machine) infiniband network running OpenSM. Ideally, I'd like to find out what sort of minimal emulation code needs to be running on the XD1 node

Re: [openib-general] Re: IBM eHCA testing..

2005-10-14 Thread Troy Benjegerdes
Hal Rosenstock wrote: On Thu, 2005-10-13 at 18:46, Troy Benjegerdes wrote: I'm also attaching part of an opensm log file. (the full copy is at http://scl.ameslab.gov/~troy/osm-ehca.log ) The IBM galaxy adapters are at: Initial path: [0][1][16] Initial path: [0][

Re: [openib-general] Re: IBM eHCA testing..

2005-10-13 Thread Troy Benjegerdes
On Wed, Oct 12, 2005 at 01:04:37PM +0200, IBMEHCA DD wrote: > I just released the ehca2_0028 which uses svn 3615 on > https://sourceforge.net/projects/ibmehcad/ > As you might notice the license already has changed to the openib.org > license. > > With 2.6.13 we had the non-issue that our maun f

Re: [openib-general] IBM eHCA testing..

2005-10-12 Thread Troy Benjegerdes
What is the turnaround time on a firmware change? If we can get an update, I think that would be the best solution. I'll be happy to test this. On Wed, Oct 12, 2005 at 11:36:59AM +0200, IBMEHCA DD wrote: > This is basically the answer why its so "sensitive" which port is plugged. > We're working o

Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Troy Benjegerdes
On Tue, Oct 11, 2005 at 09:13:20AM -0700, Shirley Ma wrote: > The IB stack doesn't handle errors during client initialization. This > problem is easy to reproduce by inducing errors (resouce allocation > failure or query failure) in mad_client or sa_client registration. I am > working on a patch

Re: [openib-general] IBM eHCA testing..

2005-10-09 Thread Troy Benjegerdes
What's the status on getting the ehca driver integrated into subversion? If there's something holding it up, can we at least get a version that can be dropped into drivers/infiniband/hw ? Also, one final note, is it really appropriate to have ehca/ebus in the infiniband directory? It's really a PP

Re: [openib-general] IBM eHCA testing..

2005-10-07 Thread Troy Benjegerdes
On Fri, Oct 07, 2005 at 09:33:27AM -0700, Shirley Ma wrote: > Hi, Troy, > > There is INSTALL file in the EHCA driver package. > In OpenPower 720 port 1 is at the top, port 2 is at the bottom. > In P570, port1 is at the bottom, port2 is at the top. Okay, I guess I should read more carefully ;) Wh

[openib-general] IBM eHCA testing..

2005-10-07 Thread Troy Benjegerdes
I have two IBM eHCA cards installed and it appears that OpenSM is happily talking to the firmware and bringing up the links. So now I'm looking at the install instructions for the ehca2_EHCA2_0025.tgz code drop, and wondering what (if any) issues there are with a 2.6.13 kernel, or later OpenIB svn

Re: [openib-general] [PATCH] udapl: PPC64 cpuinfo change

2005-10-06 Thread Troy Benjegerdes
On Thu, Oct 06, 2005 at 02:14:08PM -0700, Grant Grundler wrote: > On Thu, Oct 06, 2005 at 11:48:02AM -0600, Todd Bowman wrote: > > /proc/cpuinfo on PPC64 prints different label for processor speed. > ... > > ISTR the "clock" value in cpuinfo is NOT the same as the CPU MHz. > Can you remind me if "

Re: [openib-general] [ANNOUCEv2] OpenIB OpenSM 1.1.0: trunk now supports 1.8.0 features

2005-09-13 Thread Troy Benjegerdes
ve since unplugged that > > node, and can put it back in tomorrow if you want more debug info. > > Great. More later on the log itself... > > -- Hal > -- -- Troy Benjegerdes'da

Re: [openib-general] [PATCH v1/RFC] IB: Add SCSI RDMA Protocol (SRP) initiator

2005-09-13 Thread Troy Benjegerdes
On Tue, Sep 13, 2005 at 02:52:06PM -0700, Roland Dreier wrote: > >>>>> "Troy" == Troy Benjegerdes <[EMAIL PROTECTED]> writes: > > Troy> Is there anyplace I can find an SRP target for Linux? What > Troy> is available? (Ideally, I'd li

Re: [openib-general] IBM eHCA Device Driver for gen2 IB stack

2005-09-13 Thread Troy Benjegerdes
On Fri, Jul 22, 2005 at 01:41:31PM +0200, IBMEHCA DD wrote: > Hi, > we've completed the first alpha code drop of the Power5 IBM eHCA Device > Driver for the gen2 openib.org stack. > We're running IPoIB and ibv userspace programs successfully with this code > in our lab setup. > > The sou

Re: [openib-general] [PATCH v1/RFC] IB: Add SCSI RDMA Protocol (SRP) initiator

2005-09-13 Thread Troy Benjegerdes
Is there anyplace I can find an SRP target for Linux? What is available? (Ideally, I'd like one for 2.6.1[3,4] ) On Tue, Sep 13, 2005 at 11:14:55AM -0700, Roland Dreier wrote: > Sorry to interrupt the SAS arguments, but... > > Here's the latest version of the InfiniBand SRP initiator. I think >

Re: [openib-general] [ANNOUCEv2] OpenIB OpenSM 1.1.0: trunk now supports 1.8.0 features

2005-09-13 Thread Troy Benjegerdes
On Tue, Sep 13, 2005 at 07:20:27AM -0400, Hal Rosenstock wrote: > [This is a minor update to the previous announcement on this.] > > OpenIB OpenSM 1.1.0 now includes the OpenSM 1.8.0 functionality. > > Major thanks go to Yael Kalka and Eitan Zahavi of Mellanox. > > This is a complete merge of th

Re: [openib-general] mpi drop in openib tree

2005-08-27 Thread Troy Benjegerdes
On Thu, Aug 25, 2005 at 12:50:04PM -0400, Dhabaleswar Panda wrote: > Hi Roland, > > > As for whether MPI should be in the OpenIB subversion tree or not, my > > personal opinion is that having MPI there is only appropriate if the > > svn tree is being used as the primary development source tree. I

Re: [openib-general] 'Couldn't post send' error?

2005-08-13 Thread Troy Benjegerdes
On Fri, Aug 12, 2005 at 09:42:58PM -0500, Troy Benjegerdes wrote: > What's this mean? > > da4:~/NetPIPE_3.6.2# ibv_rc_pingpong 10.1.5.218 > local address: LID 0x0002, QPN 0x0d0404, PSN 0x599dea > remote address: LID 0x0001, QPN 0x090404, PSN 0x93b0c8 > Couldn't

[openib-general] 'Couldn't post send' error?

2005-08-12 Thread Troy Benjegerdes
s like 'uc' and 'ud' versions work just fine. -- ------ Troy Benjegerdes'da hozer'[EMAIL PROTECTED] Someone asked me why I work on this free (http://www.fsf.org/philo

Re: [openib-general] Re: [PATCH 05/16] IB uverbs: core implementation

2005-06-29 Thread Troy Benjegerdes
On Wed, Jun 29, 2005 at 09:12:09AM -0700, Greg KH wrote: > On Tue, Jun 28, 2005 at 11:13:22PM -0500, Troy Benjegerdes wrote: > > On Tue, Jun 28, 2005 at 05:27:09PM -0700, Greg KH wrote: > > > On Tue, Jun 28, 2005 at 04:03:43PM -0700, Roland Dreier wrote: > > > > +++

Re: [openib-general] Re: [PATCH 05/16] IB uverbs: core implementation

2005-06-28 Thread Troy Benjegerdes
e. But as soon as you build a binary against the Linux kernel, the binary is irrevocably GPL licensed. -- Troy Benjegerdes'da hozer'[EMAIL PROTECTED] Someone asked me why I work on this free (http://www.fsf.org/philosophy/) software stuff and not g

[openib-general] Re: SDP: device mthca0 does not support fast memory regions

2005-06-16 Thread Troy Benjegerdes
On Thu, Jun 16, 2005 at 10:03:55PM +0300, Michael S. Tsirkin wrote: > Quoting r. Libor Michalek <[EMAIL PROTECTED]>: > > Subject: Re: SDP: device mthca0 does not support fast memory regions > > > > On Wed, Jun 15, 2005 at 04:30:47PM -0500, Troy Benjegerdes wrote: >

[openib-general] SDP: device mthca0 does not support fast memory regions

2005-06-15 Thread Troy Benjegerdes
I'm getting a 'SDP: device mthca0 does not support fast memory regions' error on PPC64 systems. Is there something that needs to be done for PPC, or could I have an older version of the mthca module hanging around? ___ openib-general mailing list openib-

[openib-general] 2.6.11.11 NFS over IPoIB crash

2005-06-08 Thread Troy Benjegerdes
We are running NFS over IPoIB, and are getting kernel panics under heavy NFS I/O. This is on a PowerMac G5, and the server is a dual Opteron running 2.6.11 with the OpenIB code from subversion. It looks like a bug in NFS, but we've only seen it using IPoIB... Is it worth trying to reproduce th

Re: [openib-general] opensm fails to bring up subnet..

2005-06-03 Thread Troy Benjegerdes
On Fri, Jun 03, 2005 at 04:08:05PM -0400, Hal Rosenstock wrote: > On Fri, 2005-06-03 at 15:58, Troy Benjegerdes wrote: > > > Do you have any additional known good cables to try ? > > > > I have several cables I *could* try, but I have no idea which ones are > > g
