Re: [openib-general] bugs filed for problems compiling OFED 1.2 alpha1
> Some of these might be fixed in recent nightly builds. Specifically I know 383 was fixed yesterday. Please check this and let us know.

Thanks, what is the URL for the nightly builds?

Scott

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [ewg] RE: bugs filed for problems compiling OFED 1.2 alpha1
I want a full OFED build, please. This was agreed to in one of the OFED bi-weekly calls.

Scott

-----Original Message-----
From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED]
Sent: Monday, February 26, 2007 8:59 AM
To: Scott Weitzenkamp (sweitzen)
Cc: Michael S. Tsirkin; [EMAIL PROTECTED]; OPENIB
Subject: Re: [ewg] RE: bugs filed for problems compiling OFED 1.2 alpha1

On Mon, 2007-02-26 at 08:49 -0800, Scott Weitzenkamp (sweitzen) wrote:
> > Some of these might be fixed in recent nightly builds. Specifically I know 383 was fixed yesterday. Please check this and let us know.
>
> Thanks, what is the URL for the nightly builds?
>
> Scott

http://www.openfabrics.org/builds/ofa_1_2_kernel/

The latest:
http://www.openfabrics.org/builds/ofa_1_2_kernel/ofa_1_2_kernel-20070226-0405.tgz

--
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.
[openib-general] bugs filed for problems compiling OFED 1.2 alpha1
Please fix these bugs for beta. I've compiled for RHEL4 and SLES10 on x86_64, i686, ia64, and ppc64. I compiled all MPIs with GNU, Intel, and PGI compilers.

* 380 OFED 1.2 alpha1 gcc MVAPICH won't compile on RHEL4 IA64
* 381 OFED 1.2 alpha1 MVAPICH2 won't compile on RHEL4 IA64 with Intel compiler
* 382 OFED 1.2 alpha1 mpitests won't compile with Intel compiler for Open MPI (RHEL4 IA64)
* 383 OFED 1.2 alpha1 core/addr.c won't compile on SLES10 IA64
* 384 OFED 1.2 alpha1 ib-bonding won't compile on RHEL4 U3 ppc64
* 386 OFED 1.2 alpha1 gcc MVAPICH2 won't compile on RHEL4 ppc64 (add -m64)
* 387 OFED 1.2 alpha1 Open MPI won't compile on SLES10 ppc64

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
[openib-general] bugs filed for OFED 1.2 alpha1 MPI compiler support
Please fix these bugs for beta. I've compiled for RHEL4 and SLES10 on x86_64, i686, ia64, and ppc64. I compiled all MPIs with GNU, Intel, and PGI compilers, and tried compiling and running C, C++, Fortran 77, and Fortran 90 programs with each combo.

* 370 OFED 1.2 alpha1 MVAPICH does not have Intel Fortran support
* 372 MVAPICH2 GNU mpif90 uses PGI not GNU compiler
* 373 MVAPICH2 Intel mpif90 does not include -rpath like mpif77 does
* 374 MVAPICH2 PGI mpif90 link failure: undefined reference ..Dm_mpi
* 375 Open MPI PGI C++ failure at runtime

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] bugs filed for problems compiling OFED 1.2 alpha1
> Scott, you have assigned all bugs to [EMAIL PROTECTED] To have the bugs resolved, please assign them to maintainers of appropriate module.

Not sure what you mean by all, only 384 was not assigned to a specific person.

Scott
Re: [openib-general] [ewg] anyone have OFED 1.2 alpha1 compiling on ppc64
> That missing header (common.h) is in libibcommon. Somehow, libibcommon is not installed. libibumad depends on libibcommon. Is this a build/install script issue with OFED 1.2 ? Vlad ?
>
> -- Hal

I tried install.sh again, this time telling it to build libibcommon instead of relying on dependencies, and get this:

+ install -m 0755 /var/tmp/OFED/usr/local/ofed/bin32/mread /var/tmp/OFED/usr/local/ofed/bin
install: cannot stat `/var/tmp/OFED/usr/local/ofed/bin32/mread': No such file or directory

I believe mread has been renamed to mstmread.

# ls /var/tmp/OFED/usr/local/ofed/bin32
mstflint mstmread mstmwrite mstregdump mstvpd

Scott
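The failure above comes from an install step that names a file which no longer exists under that name. As a rough sketch only (this is not what install.sh does), a script could look for the first candidate name that is actually present before calling install. The helper name `first_existing` is hypothetical, and the old/new name pairing is taken from the directory listing in this message.

```shell
#!/bin/sh
# first_existing DIR NAME...
# Print the path of the first NAME that exists in DIR; return 1 if none does.
# Hypothetical helper for illustration; not part of the OFED scripts.
first_existing() {
    dir=$1
    shift
    for name in "$@"; do
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example use (paths as reported in the message above):
#   src=$(first_existing /var/tmp/OFED/usr/local/ofed/bin32 mread mstmread) &&
#       install -m 0755 "$src" /var/tmp/OFED/usr/local/ofed/bin
```

Listing the old name first keeps older trees working, while the new name is picked up when only it exists.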
Re: [openib-general] anyone have OFED 1.2 alpha1 compiling on ppc64
How do I upload sources?

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, February 22, 2007 1:00 AM
To: Scott Weitzenkamp (sweitzen)
Cc: [EMAIL PROTECTED]; OPENIB
Subject: Re: anyone have OFED 1.2 alpha1 compiling on ppc64

Quoting Scott Weitzenkamp (sweitzen) [EMAIL PROTECTED]:
> Subject: anyone have OFED 1.2 alpha1 compiling on ppc64
>
> I tried both RHEL4 and SLES10 using install.sh, and get this. I filed bug 379, anyone else tried ppc64?

Scott, could you pls upload the kernel sources and .config files to staging? If you do, we'll be able to add these to the nightly cross-build environment.

--
MST
Re: [openib-general] anyone have OFED 1.2 alpha1 compiling on ppc64
> Don't you have an account at ssh.openfabrics.org?

Can an admin please give me an account?

Scott
[openib-general] anyone have OFED 1.2 alpha1 compiling on ppc64
I tried both RHEL4 and SLES10 using install.sh, and get this. I filed bug 379, anyone else tried ppc64?

gcc -DHAVE_CONFIG_H -I. -I. -I. -I./include/infiniband -I./../libibcommon/include/infiniband -Wall -m64 -g -O2 -MT libibumad_la-umad.lo -MD -MP -MF .deps/libibumad_la-umad.Tpo -c src/umad.c -fPIC -DPIC -o .libs/libibumad_la-umad.o
In file included from src/umad.c:50:
./include/infiniband/umad.h:37:31: infiniband/common.h: No such file or directory
src/umad.c: In function `port_alloc':
src/umad.c:94: warning: implicit declaration of function `IBWARN'
src/umad.c: In function `get_port':
src/umad.c:160: warning: implicit declaration of function `snprintf'
src/umad.c:163: warning: implicit declaration of function `sys_read_uint'
src/umad.c:177: warning: implicit declaration of function `sys_read_uint64'
src/umad.c:182: warning: implicit declaration of function `sys_read_gid'
src/umad.c: In function `get_ca':
src/umad.c:354: warning: implicit declaration of function `sys_read_string'
src/umad.c:363: warning: implicit declaration of function `sys_read_guid'
make[3]: *** [libibumad_la-umad.lo] Error 1
make[3]: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/libibumad'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/libibumad'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management/libibumad'
make: *** [subdirs] Error 1
make: Leaving directory `/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/management'

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
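In the log above, the missing header is the root error and the implicit-declaration warnings appear to follow from it. A small sketch for pulling the missing header names out of a long build log is shown here. It assumes gcc diagnostics of the form `file:line:column: header: No such file or directory`, as in this log; `missing_headers` is a hypothetical helper name, not an OFED tool.

```shell
#!/bin/sh
# missing_headers [LOGFILE...]
# Print each header named in a gcc "No such file or directory" diagnostic,
# once per header. Reads standard input when no log file is given.
# Assumes the file:line:column: prefix seen in the log above.
missing_headers() {
    sed -n 's/^.*:[0-9][0-9]*:[0-9][0-9]*: \(.*\): No such file or directory$/\1/p' "$@" | sort -u
}

# Example use (log path is only an illustration):
#   missing_headers /tmp/build.log
```

For the log in this message it would report only infiniband/common.h, which points at libibcommon not being installed or not being on the include path.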
[openib-general] fix SDP bug 108 for OFED 1.2 beta?
Tziporet and Michael, ever since the SDP rewrite in OFED 1.0 rc5, SDP throughput drops with message size 64KB, see attached graph. Can you please fix this for OFED 1.2 beta?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

sdp2sdp.TCP_STREAM.000.tput_log.pdf
Description: sdp2sdp.TCP_STREAM.000.tput_log.pdf
Re: [openib-general] MVAPICH2 working with OFED 1.2 alpha1 and IB?
> It looks like you are using an older version of the SRPM: mvapich2-0.9.8-3. This version had some shared library issues with the ofed 1.2 build. The latest MVAPICH2 SRPM version is mvapich2-0.9.8-4. Shaun posted the following e-mail on Feb 15th. Please use this latest version and let us know whether the problem still persists.

This fixed it, thanks.

Scott
Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2
Tziporet and Doug, we can discuss this at the OFED conf call on Feb 26, I suggest we try to improve this area.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Doug Ledford
Sent: Wednesday, February 14, 2007 10:36 AM
To: Jeff Squyres (jsquyres)
Cc: [EMAIL PROTECTED]; 'Openib-General@Openib.Org'
Subject: Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2

On Fri, 2007-02-09 at 13:38 -0500, Jeff Squyres wrote:
> New SRPM on server that munges the %build section into the %install section. Yuck. :-)

Worse than yuck, it's wrong. Your SuSE %build section bug is a result of trying to build against something that isn't installed yet but is required for the build. You guys chose to split things up into modules, and that's fine and the way things should be, but that means you need to install required packages along the way if you want to build against them, not try to build against binaries in temporary directories. Apart from that though, I can assure you that on RHEL and FC, the %build section is a requirement if you want valid -debuginfo packages.

I've brought it up at the last two conferences I attended, and I usually get a brick wall when I do, but the OFED packaging process is broken by design. As Shaun brought up, one of the benefits of proper RPM packaging is reliable, reproducible builds, not to mention the whole issue of debugging with gdb is nigh impossible without valid debuginfo rpms; all of which are vital to supportability. I'm looking through the alpha1 tarball right now, I'll comment on it later under separate email. But, first glance is that I'll be ripping everything out and making it sane again.

Which brings up another point that I've mentioned before but nothing has happened on: as long as you guys keep making your distribution use an installation hierarchy that violates the rules for distributions shipping code, places like Novell or Red Hat have one of two choices: violate the Linux File Hierarchy Standard in our distributions or use a different hierarchy than you do. Obviously, we aren't going to forgo LFHS compliance of our entire product for just this, so we use a different hierarchy than you. In the end, this can end up causing confusion for customers, as well as inconsistency between what Red Hat or Novell or you guys choose to use as the file placement. Something needs to be done to standardize installation directories in an acceptable place IMO (/usr/local is verboten for a distribution to use, and theoretically that should include you guys since you are a distribution source, the only real reason people are compiling your code locally is that you don't provide binary RPMs or because they want a custom compiler instead of gcc, not because they are trying out new software they don't necessarily intend to keep/use or which is new enough that no one has formally packaged it up, which is what /usr/local is for).

> On Feb 7, 2007, at 11:42 AM, Vladimir Sokolovsky wrote:
> > Hi Jeff,
> > Please remove %build macro from the RPM spec file. On SuSE distros it removes RPM_BUILD_ROOT.
> >
> > Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.23343
> > + umask 022
> > + cd /var/tmp/OFEDRPM/BUILD
> > + /bin/rm -rf /var/tmp/OFED
> > ++ dirname /var/tmp/OFED
> > + /bin/mkdir -p /var/tmp
> > + /bin/mkdir /var/tmp/OFED
> > + cd openmpi-1.2b4ofedr13470
> > + fortify_source=1
> > + test '' '!=' ''
> > ...
> >
> > --
> > Vladimir Sokolovsky [EMAIL PROTECTED]
> > Mellanox Technologies Ltd.

--
Doug Ledford [EMAIL PROTECTED]
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
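The %build trace quoted above removes and recreates the build root before building. A toy simulation of why that breaks building against files another package staged there earlier is sketched below. It runs entirely in a temporary directory; the function name and the staged library path are made up for illustration and nothing here touches /var/tmp/OFED.

```shell
#!/bin/sh
# simulate_build_root_wipe
# Stage a fake library in a scratch "build root", repeat the rm -rf / mkdir
# sequence from the quoted %build trace, then report whether the staged file
# is still there. Illustration only; names below are hypothetical.
simulate_build_root_wipe() {
    root=$(mktemp -d) || return 1
    mkdir -p "$root/usr/local/ofed/lib64"
    : > "$root/usr/local/ofed/lib64/libexample.so"   # staged by an earlier package

    # Same two steps as in the trace: remove the build root, then recreate it.
    rm -rf "$root" && mkdir "$root"

    if [ -e "$root/usr/local/ofed/lib64/libexample.so" ]; then
        echo present
    else
        echo missing
    fi
    rm -rf "$root"
}

simulate_build_root_wipe
```

This is consistent with the point made in the message: anything a later package needs at build time has to be installed, not left in a temporary build root that the next rpmbuild run may wipe.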
Re: [openib-general] how to handle OFEd 1.2 bugs in bugzilla
Yes, I'd like to add alpha1, etc. version numbers in bugzilla.

For existing bugs, the Reporter and Assignee should try to communicate/negotiate Priority/Severity.

For bugs in areas that Cisco supports, I review the bugs and try to ask for desired ones to be fixed. I was happy with the responses I got for OFED 1.1 from Mellanox and Open MPI.

If you want a bug scrub, I suggest a distributed one, where someone from each company scrubs the bugs in areas they are responsible for.

Scott

-----Original Message-----
From: Tziporet Koren [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 14, 2007 6:18 AM
To: Scott Weitzenkamp (sweitzen)
Cc: EWG; OPENIB
Subject: how to handle OFEd 1.2 bugs in bugzilla

Hi Scott and all,

I wish to consult with you on the way we will treat OFED 1.2 bugs in bugzilla.

1. Do we want to have 1.2-alpha, 1.2-beta, 1.2-rcX in version, or just 1.2 as we have now?
2. What do we wish to do with bugs that were opened for 1.1 and are still open?
3. What to do with old bugs that were opened for gen2 in general?
4. What is our methodology for priority and severity setup? (There are too many blocker bugs still open in OFED 1.1, so either they are not actually blockers or they were fixed but not updated.)

Thanks,
Tziporet
Re: [openib-general] OFED 1.2 alpha release
I don't remember discussing dropping RHEL4 U3, and would like to add it back to the official list. IPoIB multicast does not work correctly (bug 266) in RHEL4 U4, thus RHEL4 U3 is the most recent working RHEL release in this area (unless it has been fixed in U4 errata kernels). The new ib-bonding RPM also says it only supports RHEL4 U3 for Red Hat releases.

We should probably also plan for SLES10 SP1 support in OFED 1.2.

Scott

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Wednesday, February 14, 2007 8:25 AM
To: EWG
Cc: OPENIB
Subject: [openib-general] OFED 1.2 alpha release

Hi,

After a two-week delay, we have published OFED 1.2-alpha1 at http://www.openfabrics.org/builds/ofed-1.2/
File: OFED-1.2-alpha1.tgz
BUILD_ID contains info on all packages sources location.

Please report any issues in bugzilla https://bugs.openfabrics.org/

Tziporet & Vlad

OS support:
Novell:
- SLES 9.0 SP3
- SLES10
Redhat:
- Redhat EL4 up4
- Redhat EL5 beta2 (only partially tested)
kernel.org:
- 2.6.20
- 2.6.19

Note: Redhat EL4 up3, Fedora C4, Fedora C6 and SuSE Pro 10 are not part of the official list. We keep the backport patches for these OSes and make sure OFED compiles and loads properly, but will not do a full QA cycle.

Systems:
* x86_64
* x86
* ia64
* ppc64 (have not tested user space)

Main changes from OFED-1.1:
1. iWARP is now supported with Chelsio T3
2. New kernel modules: VNIC, RDS, Bonding, SA cache
3. New packages: MVAPICH2
4. IPoIB Connected mode
5. Multicast join from user space
6. libibverbs 1.1
7. OpenSM new routing models: FAT tree routing and Taurus routing
8. GUI tool for network diagnostic
9. New MPI releases: MVAPICH: version 0.9.9, Open MPI: version 1.2, MVAPICH2: version 0.9.8

Detailed list of changes can be found in:
https://wiki.openfabrics.org/tiki-index.php?page=OFED+1.2+release+plan+and+features

Limitations and known issues:
1. ipath driver compilation fails on all systems, except for kernel 2.6.20
2. libipathverbs is not working with libibverbs 1.1
3. sdpnetstat is not available on RHEL5 (due to compilation errors)
4. Routing table problem in SLES10 when using port #2
5. RDS compiles only on kernel 2.6.18/19/20
6. MVAPICH2 installation fails on SuSE Pro 10.
7. mstflint is not working on ppc64
8. RDS was not tested

Missing features that should be completed for the Beta:
1. Add madeye utility
2. RDS to support SLES10 and RHEL

For details on each module status see:
https://wiki.openfabrics.org/tiki-index.php?page=Teleconf+02-12-2007
[openib-general] Problem with install.sh openib-diags OFED-1.2-20070208-1508.tgz
I'm using install.sh on RHEL4 U3 x86_64

Preparing... ##
kernel-ib-devel ##
kernel-ib ##
error: Failed dependencies:
perl(IBswcountlimits) is needed by openib-diags-1.2.0-pre1.x86_64
ERROR: Failed executing /bin/rpm -ihv
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/dapl-1.2.0-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/dapl-devel-1.2.0-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibcommon-1.0.2-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibcommon-devel-1.0.2-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibmad-1.0.2-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibmad-devel-1.0.2-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibumad-1.0.2-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibumad-devel-1.0.2-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibverbs-1.1-pre1.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibverbs-devel-1.1-pre1.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libibverbs-utils-1.1-pre1.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libmthca-1.0.4-pre.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libmthca-devel-1.0.4-pre.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libopensm-3.0.1-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libopensm-devel-3.0.1-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libosmcomp-3.0.1-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libosmcomp-devel-3.0.1-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libosmvendor-3.0.1-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libosmvendor-devel-3.0.1-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/librdmacm-0.9.0-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/librdmacm-devel-0.9.0-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/libsdp-1.1.99-0.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/openib-diags-1.2.0-pre1.x86_64.rpm
  /tmp/OFED-1.2-20070208-1508/RPMS/redhat-release-4AS-4.1/perftest-1.2-0.x86_64.rpm

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
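The install above fails because one package requires a capability, perl(IBswcountlimits), that nothing in the transaction provides. One way to spot this before running rpm -ihv is to compare what the packages require with what they provide. The sketch below only does the text comparison. It assumes the two lists were saved beforehand, for example with `rpm -qp --requires` and `rpm -qp --provides` over the RPM set. The helper name `unmet_requires` and the file names are hypothetical, and capabilities supplied by packages already installed on the system are not considered.

```shell
#!/bin/sh
# unmet_requires REQUIRES_FILE PROVIDES_FILE
# Print every capability named in REQUIRES_FILE (first field of each line)
# that does not appear as the first field of any line in PROVIDES_FILE.
# Version constraints such as ">= 1.0" are ignored in this sketch.
unmet_requires() {
    awk -v prov="$2" '
        BEGIN {
            # Load the provided capability names (first word of each line).
            while ((getline line < prov) > 0) {
                split(line, f, " ")
                provided[f[1]] = 1
            }
        }
        NF && !($1 in provided) { print $1 }
    ' "$1"
}

# Example use (file names are placeholders):
#   rpm -qp --requires *.rpm > requires.txt
#   rpm -qp --provides *.rpm > provides.txt
#   unmet_requires requires.txt provides.txt
```

Anything printed would still need to be checked against the installed system, since base OS packages provide many capabilities that the OFED RPMs do not.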
[openib-general] no RDS in OFED 1.2?
I don't see RDS in the feature freeze builds yet.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] dapltest?
I opened bug 350, I would like dapltest (and any other useful dapl test programs) too.

Scott

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Arlin Davis
Sent: Wednesday, February 07, 2007 11:03 AM
To: Steve Wise
Cc: openib-general; Arlin Davis
Subject: Re: [openib-general] dapltest?

Steve Wise wrote:
> Hey Arlin,
> Shouldn't dapl/test be shipped with OFED? It appears not to be...

Yes, I will try to get to this by next week at the latest. Can you add a bugzilla report to track against?

-arlin
Re: [openib-general] OFED-1.2 first release
Not getting MPI RPMS for Intel compilers, either.

Running /bin/rpm -Uhv /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/mpitests_mvapich2_gcc-2.0-698.x86_64.rpm
/tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/mvapich2_intel-0.9.8-1.x86_64.rpm not found
Running /bin/rpm -Uhv /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/openmpi_gcc-1.2b4ofedr13470-1ofed.x86_64.rpm
Running /bin/rpm -Uhv /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/mpitests_openmpi_gcc-2.0-698.x86_64.rpm
/tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/openmpi_intel-1.2b4ofedr13470-1ofed.x86_64.rpm not found
ERROR: -.x86_64.rpm not found under /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1.
Installation finished successfully...

Scott

From: Scott Weitzenkamp (sweitzen)
Sent: Monday, February 05, 2007 9:44 PM
To: Scott Weitzenkamp (sweitzen); 'Vladimir Sokolovsky'; '[EMAIL PROTECTED]'; 'Tziporet Koren'
Cc: 'openib-general@openib.org'
Subject: RE: [openib-general] OFED-1.2 first release

Moving on, I set ib_bonding=n in ofed.conf and try install.sh again, and now get this:

...
Building MVAPICH RPM. Please wait...
Using gcc compiler
Running rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm
ERROR: Failed executing rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm
See log file: /tmp/OFED.6120.log

# tail /tmp/OFED.6120.log
+ LANG=C
+ export LANG
+ unset DISPLAY
/var/tmp/rpm-tmp.870: line 33: syntax error near unexpected token `)'
error: Bad exit status from /var/tmp/rpm-tmp.870 (%install)
RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.870 (%install)
ERROR: Failed executing rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm

Scott

From: Scott Weitzenkamp (sweitzen)
Sent: Monday, February 05, 2007 9:27 PM
To: Vladimir Sokolovsky; [EMAIL PROTECTED]; Tziporet Koren; Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: RE: [openib-general] OFED-1.2 first release

Vlad and Tziporet,

It might help if you elaborated on what you meant by first release, you have been saying code freeze but really this is feature freeze, right? This announcement is quite a bit different from previous OFED announcements, where you detailed what features were available and what OS were supported.

The daily build email mentions compiling against kernels, but I haven't seen what distros were actually tested. Are we starting from scratch on compiling and testing with distros like RHEL4? Do you anticipate we will just go day by day with builds trying to stabilize things initially?

In any case, here's what I see when I try to compile with install.sh on RHEL4 U3 x86_64:

...
/tmp/OFED-1.2-20070205-1823/build.sh: line 802: kernel-ib: command not found
Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-docs-1.2-0.src.rpm
Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-docs-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1
Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-scripts-1.2-0.src.rpm
Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-scripts-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release
Re: [openib-general] OFED-1.2 first release
sdpnetstat is getting added to the dapl-devel RPM.

# rpm -qlip dapl-devel-1.2.0-0.x86_64.rpm
Name        : dapl-devel          Relocations: (not relocatable)
Version     : 1.2.0               Vendor: OpenFabrics
Release     : 0                   Build Date: Mon 05 Feb 2007 09:48:50 PM PST
Install Date: (not installed)     Build Host: svbu-qa1850-1.cisco.com
Group       : System Environment/Libraries   Source RPM: ofa_user-1.2-alpha1.src.rpm
Size        : 692598              License: GPL/BSD
Signature   : (none)
URL         : http://www.openfabrics.org/
Summary     : Development files for the libdat and libdapl libraries
Description :
Static libraries and header files for the libdat and libdapl library.
/usr/local/ofed/bin/sdpnetstat
/usr/local/ofed/include/dat/dat.h
/usr/local/ofed/include/dat/dat_error.h
/usr/local/ofed/include/dat/dat_platform_specific.h
/usr/local/ofed/include/dat/dat_redirection.h
/usr/local/ofed/include/dat/dat_registry.h
/usr/local/ofed/include/dat/dat_vendor_specific.h
/usr/local/ofed/include/dat/udat.h
/usr/local/ofed/include/dat/udat_config.h
/usr/local/ofed/include/dat/udat_redirection.h
/usr/local/ofed/include/dat/udat_vendor_specific.h
/usr/local/ofed/lib64/libdaplcma.a
/usr/local/ofed/lib64/libdaplcma.so
/usr/local/ofed/lib64/libdat.a
/usr/local/ofed/lib64/libdat.so

From: Scott Weitzenkamp (sweitzen)
Sent: Tuesday, February 06, 2007 12:07 AM
To: Scott Weitzenkamp (sweitzen); 'Vladimir Sokolovsky'; '[EMAIL PROTECTED]'; 'Tziporet Koren'
Cc: 'openib-general@openib.org'
Subject: RE: [openib-general] OFED-1.2 first release

Not getting MPI RPMS for Intel compilers, either.
Re: [openib-general] OFED-1.2 first release
libibverbs is not working. I have opened bugs 342-346 for the issues I've found so far:

# ibv_devices
libibverbs: Warning: couldn't open config directory '/usr/local/ofed/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
device node GUID
--

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

From: Scott Weitzenkamp (sweitzen)
Sent: Tuesday, February 06, 2007 9:34 AM
To: Scott Weitzenkamp (sweitzen); 'Vladimir Sokolovsky'; '[EMAIL PROTECTED]'; 'Tziporet Koren'
Cc: 'openib-general@openib.org'
Subject: RE: [openib-general] OFED-1.2 first release

sdpnetstat is getting added to the dapl-devel RPM.

# rpm -qlip dapl-devel-1.2.0-0.x86_64.rpm
Name        : dapl-devel          Relocations: (not relocatable)
Version     : 1.2.0               Vendor: OpenFabrics
Release     : 0                   Build Date: Mon 05 Feb 2007 09:48:50 PM PST
Install Date: (not installed)     Build Host: svbu-qa1850-1.cisco.com
Group       : System Environment/Libraries   Source RPM: ofa_user-1.2-alpha1.src.rpm
Size        : 692598              License: GPL/BSD
Signature   : (none)
URL         : http://www.openfabrics.org/
Summary     : Development files for the libdat and libdapl libraries
Description :
Static libraries and header files for the libdat and libdapl library.
/usr/local/ofed/bin/sdpnetstat /usr/local/ofed/include/dat/dat.h /usr/local/ofed/include/dat/dat_error.h /usr/local/ofed/include/dat/dat_platform_specific.h /usr/local/ofed/include/dat/dat_redirection.h /usr/local/ofed/include/dat/dat_registry.h /usr/local/ofed/include/dat/dat_vendor_specific.h /usr/local/ofed/include/dat/udat.h /usr/local/ofed/include/dat/udat_config.h /usr/local/ofed/include/dat/udat_redirection.h /usr/local/ofed/include/dat/udat_vendor_specific.h /usr/local/ofed/lib64/libdaplcma.a /usr/local/ofed/lib64/libdaplcma.so /usr/local/ofed/lib64/libdat.a /usr/local/ofed/lib64/libdat.so From: Scott Weitzenkamp (sweitzen) Sent: Tuesday, February 06, 2007 12:07 AM To: Scott Weitzenkamp (sweitzen); 'Vladimir Sokolovsky'; '[EMAIL PROTECTED]'; 'Tziporet Koren' Cc: 'openib-general@openib.org' Subject: RE: [openib-general] OFED-1.2 first release Not getting MPI RPMS for Intel compilers, either. Running /bin/rpm -Uhv /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/mp itests_mvapich2_gcc-2.0-698.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/mvapich2_intel-0 .9.8-1.x 86_64.rpm not found Running /bin/rpm -Uhv /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/op enmpi_gcc-1.2b4ofedr13470-1ofed.x86_64.rpm Running /bin/rpm -Uhv /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/mp itests_openmpi_gcc-2.0-698.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1/openmpi_intel-1. 2b4ofedr 13470-1ofed.x86_64.rpm not found ERROR: -.x86_64.rpm not found under /tmp/OFED-1.2-20070205-1823/RPMS/redhat-rele ase-4AS-4.1. Installation finished successfully... Scott From: Scott Weitzenkamp (sweitzen) Sent: Monday, February 05, 2007 9:44 PM To: Scott Weitzenkamp (sweitzen); 'Vladimir Sokolovsky'; '[EMAIL PROTECTED]'; 'Tziporet Koren' Cc: 'openib-general@openib.org' Subject: RE: [openib-general] OFED-1.2 first release Moving on, I set ib_bonding=n in ofed.conf and try install.sh again, and now get this: ... Building MVAPICH RPM. 
Please wait... Using gcc compiler Running rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi
Re: [openib-general] MVAPICH2 SRPM and install file patches
Shaun, Thanks for doing this. I see things like romio and shlibs configurable in the patch, what about other MVAPICH2 features like fault tolerance, multi rail, threads, and MPD? How can I configure them when I use install.sh to compile and install OFED? I also didn't quite understand the ib-vs-iwarp configuration; I thought OFED 1.2 would support both. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shaun Rowland Sent: Wednesday, January 31, 2007 5:33 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; openib-general@openib.org Subject: [openib-general] MVAPICH2 SRPM and install file patches I've placed the MVAPICH2 SRPM on the OFA server in ~rowland/ofed_1_2, and it is linked to here: http://www.openfabrics.org/~rowland/ofed_1_2/ Additionally, I am including a patch in this email that updates the ofed_1_2_scripts files from the GIT repository we were given to handle the MVAPICH2 SRPM file. Basically, installing MVAPICH2 is similar to the other MPI packages, except that I have added a choice option to build with iWARP support or not. The default is IB only. If the user has selected the librdmacm packages and the mvapich2 package, this choice is presented. This is also saved in the ofed.conf file using an MVAPICH2_IMPL variable, and the librdmacm packages are added as dependencies if the iWARP version of MVAPICH2 is desired and they are not already in the ofed.conf file, which seems like standard behavior in the scripts. The resulting binary RPM uses the name convention mvapich2_compiler as normal in either case. There are various ways this could be implemented, perhaps in a better manner. This is what I was able to come up with by today. Since the installation scripts given were very similar to the original OFED 1.1 scripts, I was able to test the installation procedure using OFED 1.1 files. 
Everything worked for me, including building the mpitests package against the mvapich2 package. There are some comments about this in what I have done. I hope that it is helpful in getting our SRPM integrated into the installation scripts. Additionally, I put a README file in my ofed_1_2 directory that contains information about the macros that can be used with our SRPM file. The SRPM can be used to install against an existing OFED installation, and those macros control various aspects of the result. There is one special macro I use for when the SRPM is being built along with the OFED source, and its use should be clear in the patched build.sh script and associated comment. -- Shaun Rowland [EMAIL PROTECTED] http://www.cse.ohio-state.edu/~rowland/ ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] OFED-1.2 first release
Vlad and Tziporet, It might help if you elaborated on what you meant by first release, you have been saying code freeze but really this is feature freeze, right? This announcement is quite a bit different from previous OFED announcements, where you detailed what features were available and what OS were supported. The daily build email mentions compiling against kernels, but I haven't seen what distros were actually tested. Are we starting from scratch on compiling and testing with distros like RHEL4? Do you anticipate we will just go day by day with builds trying to stabilize things initially? In any case, here's what I see when I try to compile with install.sh on RHEL4 U3 x86_64: ... /tmp/OFED-1.2-20070205-1823/build.sh: line 802: kernel-ib: command not found Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-docs-1.2-0.src.rpm Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-docs-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-scripts-1.2-0.src.rpm Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-scripts-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 Running rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ib-bonding-0.9.0-1.src.rpm Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 See log file: /tmp/OFED.10899.log # tail -10 /tmp/OFED.10899.log Checking for unpackaged file(s): /usr/lib/rpm/check-files /var/tmp/ib-bonding-0.9.0-root
Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1-rh-x86_64.rpm Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-debuginfo-0.9.0-1-rh-x86_64.rpm Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.98615 + umask 022 + cd /var/tmp/OFEDRPM/BUILD + rm -rf ib-bonding-0.9.0 + exit 0 /bin/mv: cannot stat `/var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm': No such file or directory ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 Scott From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Vladimir Sokolovsky Sent: Monday, February 05, 2007 2:26 PM To: [EMAIL PROTECTED] Cc: openib-general@openib.org Subject: [openib-general] OFED-1.2 first release Hi, OFED-1.2-20070205-1823.tgz can be downloaded from http://www.openfabrics.org/builds/ofed-1.2/ The first OFED package includes: ofa_kernel-1.2-alpha1.src.rpm ofa_user-1.2-alpha1.src.rpm mvapich-0.9.9-971.src.rpm mvapich2-0.9.8-1.src.rpm openmpi-1.2b4ofedr13470-1ofed.src.rpm mpitests-2.0-698.src.rpm open-iscsi-generic-2.0-742.src.rpm ib-bonding-0.9.0-1.src.rpm ofed-docs-1.2-0.src.rpm ofed-scripts-1.2-0.src.rpm Known issues: srptools - compilation fails openib_diags - compilation fails ibutils - not included yet To build OFED RPMs: cd OFED-1.2-20070205-1823 ./build.sh Created RPMs will be stored under OFED-1.2-20070205-1823/RPMS/ directory. To install OFED RPMs: cd OFED-1.2-20070205-1823 ./install.sh For a detailed installation guide, see OFED-1.2-xxx/docs/OFED_Installation_Guide.txt -- Vladimir Sokolovsky [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] Mellanox Technologies Ltd. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
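A note on the ib-bonding failure in the log above: rpmbuild reports writing ib-bonding-0.9.0-1-rh-x86_64.rpm, while the script then tries to move ib-bonding-0.9.0-1.x86_64.rpm, so the two names appear not to match. A minimal Python sketch of how one might spot this from the log follows. It is not part of OFED; written_rpms and find_mismatch are hypothetical helper names, and the sample log text is copied from the message above.

```python
import os
import re


def written_rpms(log_text):
    """Return the RPM paths that rpmbuild reported in 'Wrote:' entries."""
    return re.findall(r"Wrote:\s+(\S+\.rpm)", log_text)


def find_mismatch(log_text, expected_path):
    """If expected_path was not written, return the RPMs that were written
    into the same directory (likely candidates under a different name)."""
    written = written_rpms(log_text)
    if expected_path in written:
        return []
    expected_dir = os.path.dirname(expected_path)
    return [p for p in written if os.path.dirname(p) == expected_dir]


# Sample taken from the tail of /tmp/OFED.10899.log quoted above.
log = (
    "Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1-rh-x86_64.rpm\n"
    "Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-debuginfo-0.9.0-1-rh-x86_64.rpm\n"
)
# Path the install script tried to move.
expected = "/var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm"

for candidate in find_mismatch(log, expected):
    print("expected file missing; rpmbuild wrote:", candidate)
```

Whether the fix belongs in the spec file or in build.sh is not something the log settles; the sketch only makes the mismatch visible.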
Re: [openib-general] OFED-1.2 first release
Moving on, I set ib_bonding=n in ofed.conf and try install.sh again, and now get this: ... Building MVAPICH RPM. Please wait... Using gcc compiler Running rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm ERROR: Failed executing rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm See log file: /tmp/OFED.6120.log # tail /tmp/OFED.6120.log + LANG=C + export LANG + unset DISPLAY /var/tmp/rpm-tmp.870: line 33: syntax error near unexpected token `)' error: Bad exit status from /var/tmp/rpm-tmp.870 (%install) RPM build errors: Bad exit status from /var/tmp/rpm-tmp.870 (%install) ERROR: Failed executing rpmbuild -v --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mvapich_gcc' --define 'ofed 1' --define 'compiler gcc' --define 'openib_prefix /usr/local/ofed' --define 'build_root /var/tmp/OFED' --define '_prefix /usr/local/ofed/mpi/gcc/mvapich-0.9.9' /tmp/OFED-1.2-20070205-1823/SRPMS/mvapich-0.9.9-971.src.rpm Scott From: Scott Weitzenkamp (sweitzen) Sent: Monday, February 05, 2007 9:27 PM To: Vladimir Sokolovsky; [EMAIL PROTECTED]; Tziporet Koren; Scott Weitzenkamp (sweitzen) Cc: openib-general@openib.org Subject: RE: [openib-general] OFED-1.2 first release Vlad and Tziporet, It might help if you elaborated on what you meant by first release, you have been saying code freeze but really this is feature freeze, right? 
This announcement is quite a bit different from previous OFED announcements, where you detailed what features were available and what OS were supported. The daily build email mentions compiling against kernels, but I haven't seen what distros were actually tested. Are we starting from scratch on compiling and testing with distros like RHEL4? Do you anticipate we will just go day by day with builds trying to stabilize things initially? In any case, here's what I see when I try to compile with install.sh on RHEL4 U3 x86_64: ... /tmp/OFED-1.2-20070205-1823/build.sh: line 802: kernel-ib: command not found Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-docs-1.2-0.src.rpm Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-docs-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 Running rpmbuild --rebuild --target=noarch --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ofed-scripts-1.2-0.src.rpm Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/noarch/ofed-scripts-1.2-0.noarch.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 Running rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' /tmp/OFED-1.2-20070205-1823/SRPMS/ib-bonding-0.9.0-1.src.rpm Running /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 See log file: /tmp/OFED.10899.log # tail -10 /tmp/OFED.10899.log Checking for unpackaged file(s): /usr/lib/rpm/check-files /var/tmp/ib-bonding-0.9.0-root
Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1-rh-x86_64.rpm Wrote: /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-debuginfo-0.9.0-1-rh-x86_64.rpm Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.98615 + umask 022 + cd /var/tmp/OFEDRPM/BUILD + rm -rf ib-bonding-0.9.0 + exit 0 /bin/mv: cannot stat `/var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm': No such file or directory ERROR: Failed executing /bin/mv -f /var/tmp/OFEDRPM/RPMS/x86_64/ib-bonding-0.9.0-1.x86_64.rpm /tmp/OFED-1.2-20070205-1823/RPMS/redhat-release-4AS-4.1 Scott From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Vladimir Sokolovsky Sent: Monday, February 05, 2007 2:26 PM To: [EMAIL PROTECTED
Re: [openib-general] topspin vs ofed ?
If you have a Cisco support contract, you can use either stack and get support from Cisco, such as RPMs for some errata kernels. With OFED you can compile the source yourself for errata kernels. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jonas Mardosas Sent: Tuesday, January 30, 2007 2:29 AM To: openib-general@openib.org Subject: [openib-general] topspin vs ofed ? Hello, I need some information about InfiniBand drivers. I use Scientific Linux 4.4 and have now installed the newest kernel, but the Topspin drivers for my adapters don't work on the newest kernel. I looked on the Cisco website, and the InfiniBand host adapter drivers there are the same version as before, 3.2.0 (118). So, as I understand it, I can use OFED-1.1. What are the differences between the Topspin drivers and OFED? Which is better? What are your suggestions? Thank you for your responses. -- Jonas Mardosas BGM Sistemu inzinierius M.K.Ciurlionio 17, LT-03104 Vilnius mob.tel. +370 698 74002 mail:[EMAIL PROTECTED] http://www.bgm.lt ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] OFED 1.2 bug reporting
I have added a version 1.2. Tziporet, is the first build going to be called rc1 or something else? Scott -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Steve Wise Sent: Wednesday, January 24, 2007 6:41 AM To: openib-general Subject: [openib-general] OFED 1.2 bug reporting Should I be using the open fabrics bugzilla to open bugs against OFED 1.2? If so, should a new 'version' be added for ofed-1.2? Right now the only version that makes sense is 'gen2', but that doesn't really cover bugs against backport code... Steve. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] madeye
It's not well integrated into install.sh, you have to run: OPENIB_PARAMS=--with-madeye-mod ./install.sh Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Raleigh F Rinehart Sent: Wednesday, January 17, 2007 9:48 AM To: openib-general@openib.org Subject: [openib-general] madeye I'm trying to use madeye in OFED 1.1 Release to do some debugging but it does not seem to be present. I cracked open src tarball and all the right bits seem to be there (Kconfig, makefile, src) but it doesn't seem to get built and installed as part of the normal installation procedure (running install.sh). Has anyone had any success at building, installing and using madeye in a release version of OFED? thanks, -raleigh cat /usr/local/ofed/BUILD_ID OFED-1.1 openib-1.1 (REV=9905) # User space https://openib.org/svn/gen2/branches/1.1/src/userspace Git: ref: refs/heads/ofed_1_1 commit a083ec1174cb4b5a5052ef5de9a8175df82e864a # MPI mpi_osu-0.9.7-mlx2.2.0.tgz openmpi-1.1.1-1.src.rpm mpitests-2.0-0.src.rpm uname -a Linux merrill2 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux cat /etc/SuSE-release SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Reminder: OFED 1.2 coordination meeting next Monday at 9am PST
I'd like to explore adding MVAPICH2 to OFED 1.2, perhaps Dr Panda's team can help get the source RPM integrated with OFED 1.2. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren Sent: Thursday, January 11, 2007 4:15 AM To: EWG Cc: OPENIB Subject: [openib-general] Reminder: OFED 1.2 coordination meeting next Monday at 9am PST Hi All, After a long holidays break we are going to have our next OFED 1.2 coordination meeting on Monday Jan-15 at 9am PST (Jeff sent bridge info) The only agenda item I have is reviewing components' readiness for the end of month code freeze. If you have other items for the agenda please let me know Thanks, Tziporet ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] IPoIB Question
We see 3.6 Gb/sec with IPoIB using RHEL4U4 2.6.9-42 x86_64 kernel on Dell PE1950 Woodcrest systems. In my testing, faster hardware is more important than newer kernels, but I don't try newer kernels much. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Lindahl Sent: Tuesday, October 24, 2006 1:16 PM To: Sean Hubbell Cc: openib-general@openib.org Subject: Re: [openib-general] IPoIB Question On Tue, Oct 24, 2006 at 08:35:18AM -0500, Sean Hubbell wrote: We are currently looking at the new tickless kernel. Do you have one that you recommend? The main one to less-recommend is 2.6.9-based kernels, those are the slowest at TCP. Modern kernels, like the ones you see in Fedora 4 and up and SLES 10, seem to all be good and about equal in this area. I don't think we've tried a tickless kernel. We do most of our testing on the various kernels that ship with distros, plus the tip-of-tree kernel.org kernel. -- greg ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] IPoIB Question
IPoIB performance will vary quite a bit depending on what motherboard, CPU speed, and HCA type you have. What are the specs on the systems you are using? Netperf (www.netperf.org) is a good tool to measure IPoIB performance. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hubbell, Sean C Contractor/Decibel Sent: Monday, October 23, 2006 5:53 AM To: openib-general@openib.org Cc: Sean Hubbell Subject: [openib-general] IPoIB Question Hello, I currently have several applications that use a legacy IPv4 protocol, and I use IPoIB to utilize my InfiniBand network, which works great. I have completed some timing and throughput analysis and noticed that I do not get very much more if I use an InfiniBand network interface than using my GigE network interface. My question is, am I using IPoIB correctly or are these the typical numbers that everyone is seeing? Is there a standard application that I may use to test my current configuration? Thanks in advance, Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
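Netperf is the tool suggested above. To illustrate what a stream-throughput test of that kind measures, here is a minimal Python sketch. It is not netperf and makes no claim to match its numbers; measure_tcp_throughput is a hypothetical name, and it runs both ends in one process over loopback. For a real comparison the receiving side would run on a second host and host would be the address of the IPoIB or GigE interface, which is an assumption about the setup.

```python
import socket
import threading
import time


def _sink(server_sock, result):
    """Accept one connection and count every byte received until EOF."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def measure_tcp_throughput(host="127.0.0.1", total_bytes=8 * 1024 * 1024):
    """Stream total_bytes over TCP and return (bytes_received, gigabits_per_sec)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(len(payload), total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()                 # wait until the receiver has seen EOF
    elapsed = time.perf_counter() - start
    server.close()

    gbits_per_sec = result["bytes"] * 8 / elapsed / 1e9
    return result["bytes"], gbits_per_sec


if __name__ == "__main__":
    received, rate = measure_tcp_throughput()
    print(f"received {received} bytes at {rate:.2f} Gb/s")
```

Running the same script against each interface's address gives a like-for-like comparison, which is the point of the question in the thread; absolute figures will depend on CPU, motherboard, and HCA, as noted above.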
Re: [openib-general] IPoIB Question
If you are using TCP, you can use SDP transparently via libsdp to get improved latency and throughput. Scott -Original Message- From: Sean Hubbell [mailto:[EMAIL PROTECTED] Sent: Monday, October 23, 2006 8:56 AM To: Scott Weitzenkamp (sweitzen) Cc: openib-general@openib.org Subject: Re: [openib-general] IPoIB Question We currently have a non-homogeneous cluster, so that would possibly explain a few of the differences that I have seen on some of my tests. I will look at netperf.org and see what they have to offer. On another note, are there plans to have IPoIB support the full throughput that InfiniBand 4x or 12x has? Specifically, can I keep my legacy apps and just upgrade the network to take advantage of the bandwidth? Sean Scott Weitzenkamp (sweitzen) wrote: IPoIB performance will vary quite a bit depending on what motherboard, CPU speed, and HCA type you have. What are the specs on the systems you are using? Netperf (www.netperf.org) is a good tool to measure IPoIB performance. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hubbell, Sean C Contractor/Decibel Sent: Monday, October 23, 2006 5:53 AM To: openib-general@openib.org Cc: Sean Hubbell Subject: [openib-general] IPoIB Question Hello, I currently have several applications that use a legacy IPv4 protocol, and I use IPoIB to utilize my InfiniBand network, which works great. I have completed some timing and throughput analysis and noticed that I do not get very much more if I use an InfiniBand network interface than using my GigE network interface. My question is, am I using IPoIB correctly or are these the typical numbers that everyone is seeing? Is there a standard application that I may use to test my current configuration? 
Thanks in advance, Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] IPoIB Question
Nothing today in OF to accelerate UDP sockets. Scott Thanks for the reply again. The third-party API that we use leverages a combination of UDP and TCP socket connections for speed. Is there something for UDP as well? Sean Scott Weitzenkamp (sweitzen) wrote: If you are using TCP, you can use SDP transparently via libsdp to get improved latency and throughput. Scott ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] udev on RHEL
Doug, what udev does RHEL5 beta have? Any plans to upgrade udev for RHEL4 U5? Scott From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ishai Rabinovitz Sent: Tuesday, October 17, 2006 5:36 AM To: Sharma, Karun Cc: [EMAIL PROTECTED]; openib-general@openib.org Subject: Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule Hi, Let me first explain why the current OFED release does not support SRP-HA on RHEL4. SRP-HA uses Device Mapper multipath. Multipath's prerequisites include a udev version higher than 050. RHEL4 distributions include udev 039. udev is an important part of the distribution, and I do not think that users will be ready to upgrade it in order to have SRP-HA. To the best of my knowledge, the main reason that multipath needs at least udev 050 is that it uses the RUN option (this option executes its given parameter after the device exists). Multipath uses the RUN option to execute kpartx, which handles the partitions of the new device. SRP-HA also uses the RUN option to execute the multipath command. I have an idea on how to overcome this problem. I want to implement an srp-multipath-daemon. This daemon will get kpartx and multipath requests through a shared message queue. udev will use the PROGRAM option (which executes its given parameter immediately, before the device exists) to post a request to this shared message queue and return immediately. The daemon will wait for the device to be created and only then execute the commands. In any case this technique will not be a part of the coming OFED release. Ishai -Original Message- From: Sharma, Karun [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 17, 2006 5:11 AM To: Tziporet Koren; Open Fabrics Cc: openib Subject: RE: [openfabrics-ewg] OFED 1.1 release schedule The plan is OK with Silverstorm. I have a question though. What are the plans to support the SRP-HA feature on RHEL4 kernels? 
Thanks Karun ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
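The daemon idea Ishai describes above (a udev PROGRAM hook posts a request and returns at once; a daemon waits for the device node and only then runs the command) could be sketched roughly as follows. This is an illustration only, not the proposed srp-multipath-daemon: Python's in-process queue.Queue stands in for the shared message queue, run_command is a caller-supplied callable rather than a real kpartx or multipath invocation, and the names wait_for_device and daemon_loop are hypothetical.

```python
import os
import queue
import threading
import time


def wait_for_device(path, timeout=5.0, poll_interval=0.05):
    """Poll until the device node exists or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)
    return os.path.exists(path)


def daemon_loop(requests, run_command, timeout=5.0):
    """Serve (device_path, command) requests until a None sentinel arrives.

    Each command is run only after its device node has appeared, which is
    the behaviour the udev RUN option would otherwise provide.
    """
    while True:
        item = requests.get()
        if item is None:
            break
        device_path, command = item
        if wait_for_device(device_path, timeout):
            run_command(command)


if __name__ == "__main__":
    import tempfile

    requests = queue.Queue()
    executed = []
    worker = threading.Thread(target=daemon_loop, args=(requests, executed.append))
    worker.start()

    # Stand-in for the udev PROGRAM hook: post the request before the node exists.
    node = os.path.join(tempfile.mkdtemp(), "fake-device")
    requests.put((node, ["multipath", node]))

    time.sleep(0.2)
    open(node, "w").close()   # the "device" appears later
    requests.put(None)
    worker.join()
    print(executed)
```

The design point is that the hook never blocks: it only enqueues, so the PROGRAM step returns immediately while the waiting happens in the daemon.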
Re: [openib-general] [openfabrics-ewg] test results from Mellanox, Voltaire, QLogic, and IBM for OFED 1.1 rc7?
Do you have any performance data? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Robert Walsh Sent: Thursday, October 19, 2006 4:23 PM To: [EMAIL PROTECTED]; openib-general@openib.org Subject: Re: [openib-general] [openfabrics-ewg] test results from Mellanox, Voltaire, QLogic, and IBM for OFED 1.1 rc7? QLogic has tested OFED 1.1pre1 and is happy with the results. We have tested UD, UC, RC, IPoIB, SDP and uDAPL. Regards, Robert. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] OFED-1.1-pre1 is ready
Cisco is happy with OFED 1.1 pre1; we only did light testing because no C changes were made. The following bugs have been tested and closed. 273 OFED 1.1 rc7 does not work with Cisco FC Gateway 278 OFED 1.1: two copies of openib.spec in openib-1.1.tgz 268 OFED openibd script references IBG2 267 OFED 1.1 MVAPICH not working on SLES10 due to 127.0.0.2 /etc/hosts entry 271 misleading error message when stopping openibd if SDP in use 277 OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl 274 OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs 249 OFED 1.1: Open MPI 1.1.1 won't compile with Intel C 9.[01] on SLES 10 Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren Sent: Tuesday, October 17, 2006 12:09 PM To: Open Fabrics Cc: openib Subject: [openib-general] OFED-1.1-pre1 is ready Hi All, OFED 1.1-pre1 is available: URL: https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz According to the 1.1 release schedule I published yesterday, which all partners approved (QLogic has not answered, so I assumed it's OK with them too), each company has 3 days for basic dead-or-alive tests and for making sure that no blocker issues are still open. If everything goes well we will do the release at the end of this Thursday. Components owners: Please remember to update the release notes by Wednesday. Documents should be the only component that will be changed from this pre-release to the official release. 
Tziporet Vlad Release details: BUILD_ID: OFED-1.1-pre1 openib-1.1 (REV=9854) # User space https://openib.org/svn/gen2/branches/1.1/src/userspace Git: ref: refs/heads/ofed_1_1 commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 # MPI mpi_osu-0.9.7-mlx2.2.0.tgz openmpi-1.1.1-1.src.rpm mpitests-2.0-0.src.rpm Fixed bugs: BUG 273: OFED 1.1 rc7 does not work with Cisco FC Gateway BUG 274: OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs BUG 277: OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl BUG 278: OFED 1.1: two copies of openib.spec in openib-1.1.tgz Other changes from OFED-1.1-rc7: - Fix in ibdiagnet to support SM on a switch - Activate scaling code of ehca as default in the install - Documentation update - Dapl: removed SCM from the configuration file dat.conf. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7
SRP got broken in rc7 for the Cisco Fibre Channel gateway, so we couldn't test it with that. We have started testing with DDN IB storage, but don't have test results to share yet. I'm sad to report no SRP HA testing in Cisco SQA yet. It's next on the todo list (right after IPoIB HA). Scott From: Sujal Das [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 18, 2006 4:09 PM To: Scott Weitzenkamp (sweitzen) Cc: openib-general@openib.org Subject: RE: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7 Scott, thanks for the report. Based on this, it looks like Cisco did not test the SRP initiator and HA functions with any SRP targets. Is that a fair assessment? From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen) Sent: Wednesday, October 18, 2006 2:24 PM To: [EMAIL PROTECTED] Cc: openib-general@openib.org Subject: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7 Regression testing went well, using Cisco switches and Cisco (Mellanox) HCAs. See attached spreadsheet for more details. The following increase in testing happened: Started testing SLES10 IA32 (will have IA64 and PPC64 results for pre1). Switched to HP MPI 2.2.5, which is first version to support OF. The following bugs were tested and closed. 247 OFED IPoIB HA not working on RHEL4 U3 259 problems with OFED IPoIB HA on SLES10 173 OFED mpitests: add osu_{bw,latency,bibw,bcast}.c examples The following bugs were opened, but all have been marked fixed in pre1, thanks Mellanox folks for the quick response. 
273 OFED 1.1 rc7 does not work with Cisco FC Gateway 274 OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs 277 OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl 278 OFED 1.1: two copies of openib.spec in openib-1.1.tgz Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] test results from Mellanox, Voltaire, QLogic, and IBM for OFED 1.1 rc7?
What testing did these companies do with rc7? I'd kinda like to see performance data for the QLogic and IBM HCAs... Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] sysfs exposure of port counters useless?
I agree the 32-bit byte and packet counters are useless, as they get pegged in a few seconds on a busy IB network. I thought there was an effort in IBTA to fix this. For IB counters in a Cisco switch, we read and reset the 32-bit counters once per second and keep 64-bit counters internally. This would be possible in OF too, right? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Newton Sent: Tuesday, October 17, 2006 5:10 PM To: Hal Rosenstock Cc: openib-general@openib.org Subject: Re: [openib-general] sysfs exposure of port counters useless? On Tue, 17 Oct 2006, Hal Rosenstock wrote: On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote: From: Michael Newton Sent: Tuesday, October 17, 2006 3:02 AM To: openib-general@openib.org Subject: [openib-general] sysfs exposure of port counters useless? These are 32 bit counters. The rcv/xmit_data counters count 32-bit blocks. Also, these counts do not wrap: they peg at all 1s. At InfiniBand speeds, these counts can peg out very quickly indeed, to the point they can really only be of use if they can be reset each time they're read. Now if anyone who wants to use them has to go to the CLI to reset them, and there's little point in reading them without reset, why would anyone read them via sysfs? So why have them? We have found that while your comment is true for the data movement counters, the error counters should not peg quickly, hence it is valid its true i overstated the case just a little;) .. yes error counters should be fine and its mainly the data counters that are problematic (tho now im not sure i havent seen the packet counters freeze when the data ones peg out).. to read them without resetting. However it is also useful to have an ability to reset them. Of course if there are other CLI commands which do this easily, the sysfs info is of less value. There are diag tools for this. 
thats where we started.. the point im making is that exposing the data counters in sysfs is of little use, because if you have to go to other tools to reset, why wouldnt you use them to read as well? i was looking at exposing infiniband stats via PCP (http://oss.sgi.com/projects/pcp/). This would be useful for folk doing IB performance testing. Its very easy to just feed in the sysfs values.. unfortunately they turn out to be of little value. Life would be so much easier if there were 64 bit counters available. Instead I will probably need to have an additional daemon to construct them. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
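The read-and-reset approach described in this thread (sample the pegging 32-bit counter, reset it, and keep a wider running total) could be sketched roughly as follows. This is only an illustration: how a sample is actually obtained and reset is not shown, the class and function names are hypothetical, and the 4-bytes-per-count figure is taken from the "32-bit blocks" remark above.

```python
PEG_32 = 0xFFFFFFFF  # the thread says these counters peg at all 1s instead of wrapping


class Counter64:
    """Running total built from read-and-reset samples of a 32-bit counter.

    Hypothetical helper: the caller is assumed to read the hardware
    counter, reset it, and pass the value that was read to add_sample().
    """

    def __init__(self):
        self.total = 0              # Python ints are unbounded, so this never pegs
        self.saturated_samples = 0  # samples that had already pegged (data was lost)

    def add_sample(self, value):
        if not 0 <= value <= PEG_32:
            raise ValueError("sample does not fit in 32 bits")
        if value == PEG_32:
            # A pegged sample only gives a lower bound on the real count.
            self.saturated_samples += 1
        self.total += value
        return self.total


def seconds_until_peg(bytes_per_second, bytes_per_count=4):
    """Time for a 32-bit counter of 4-byte blocks to peg at a given data rate."""
    return PEG_32 * bytes_per_count / bytes_per_second
```

At an illustrative 1 GB/s, `seconds_until_peg(1e9)` comes out at a little over 17 seconds, which is why a once-per-second sampling interval leaves headroom while occasional manual reads do not.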
Re: [openib-general] [openfabrics-ewg] We wish to do the 1.1 release next week
Yes, that would be great. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Tziporet Koren [mailto:[EMAIL PROTECTED] Sent: Monday, October 16, 2006 9:26 AM To: Scott Weitzenkamp (sweitzen); Tziporet Koren; [EMAIL PROTECTED]; OPENIB Subject: RE: [openfabrics-ewg] [openib-general] We wish to do the 1.1 release next week This patch is already in. We will publish latest pre-release version tomorrow so everybody can do latest checks. Is this OK? Tziporet -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen) Sent: Sunday, October 15, 2006 10:16 PM To: Tziporet Koren; [EMAIL PROTECTED]; OPENIB Subject: Re: [openfabrics-ewg] [openib-general] We wish to do the 1.1 release next week Yes, bug 273 (http://openib.org/bugzilla/show_bug.cgi?id=273) is a blocking issue for Cisco. Roland sent a patch last Monday. I'm done testing the other parts of rc7, and am testing his patch later today. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren Sent: Thursday, October 12, 2006 7:44 AM To: [EMAIL PROTECTED]; OPENIB Subject: [openib-general] We wish to do the 1.1 release next week Hi all, I am back from vacation and found you waited with the release for me :-) From a quick look at status mails I think we can do the official release next week. Please reply if there are still any blocking issues you have. Also - please update all documents till end of Monday next week. 
Tziporet
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule
This plan is OK with Cisco.

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Monday, October 16, 2006 10:04 AM
To: Open Fabrics
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 release schedule

This is the plan to do the 1.1 release this week:

* We will publish 1.1-pre1 package tomorrow (Tue. 17-Oct)
* Only blocker issues from RC7 will be updated:
  * SRP fix for Cisco FC gateway
  * Small updates for the install
  * Fix in diagnet to support SM on a switch
  * Activate scaling code of ehca as default in the install
  * Documentation update
* Each company will have 3 days for latest certification process and then the release can be done on Thursday.

Company owners please approve if this is OK with you. If not please elaborate the blocking reasons.

Thanks,
Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380
Re: [openib-general] We wish to do the 1.1 release next week
Yes, bug 273 (http://openib.org/bugzilla/show_bug.cgi?id=273) is a blocking issue for Cisco. Roland sent a patch last Monday. I'm done testing the other parts of rc7, and am testing his patch later today. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren Sent: Thursday, October 12, 2006 7:44 AM To: [EMAIL PROTECTED]; OPENIB Subject: [openib-general] We wish to do the 1.1 release next week Hi all, I am back from vacation and found you waited with the release for me :-) From a quick look at status mails I think we can do the official release next week. Please reply if there are still any blocking issues you have. Also - please update all documents till end of Monday next week. Tziporet ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] Cisco SQA results for OFED 1.1 rc6
The following bugs and enhancement requests were filed.

247 OFED IPoIB HA not working on RHEL4 U3
We fixed it in RC7
Agreed fixed, thanks.

249 OFED 1.1: Open MPI 1.1.1 won't compile with Intel C 9.[01] on SLES 10
I guess this will not be fixed for OFED 1.1. Correct?
There is an update to Intel C 9.1, I'll be trying it out with rc7.

258 OFED: ppc64 GNU mpif90 missing for MVAPICH
Can you send us log file?
I put the logs in bugzilla.

259 problems with OFED IPoIB HA on SLES10
Fixed in RC7
Agreed fixed, thanks.

269 OFED 1.1 rc6 IPoIB does not interoperate with Cisco SFS 3001
Can Cisco debug it and send patches?
We're working on it.

270 tvflash does not work with HCA recovery jumper
Can Cisco send a fix?
We're working on it.

Scott
Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7
Vlad, do you have symbol cm_issue_drep in any .ko files, because I don't. Looks like the patch is not getting compiled in for some reason. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 2:00 AM To: Arlin Davis Cc: Aviram Gutman; Scott Weitzenkamp (sweitzen); Supalov, Alexander; Magro, Bill; EWG; Openib-General@Openib.Org Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 RC7 Hi Arlin, This patch is in OFED-1.1-rc7 and applied during installation. Regards, Vladimir On Tue, 2006-10-10 at 22:50 -0700, Arlin Davis wrote: Aviram Gutman wrote: OFED-1.1-rc7 is available on https://openib.org/svn/gen2/branches/1.1/ofed/releases/ File: OFED-1.1-rc7.tgz Please report any issues in bugzilla http://openib.org/bugzilla/ Aviram, Can you verify that the sean_cm_drep_on_not_found.patch is actually applied in RC7? Our delayed disconnect problems still exist. I don't see the new symbol cm_issue_drep in ib_cm.ko on our RC7 installed systems so I don't think the patch applied. Thanks, -arlin ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
You checked SUSE 10 or SLES 10, aren't those different distros? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 3:09 AM To: Scott Weitzenkamp (sweitzen) Cc: Pavel Shamis (Pasha); Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 On some of our SUSE 10 machines i found the 127.0.0.2 ip, but it was pointing to some random Linux site (linux.org) and has no effect on mpi runs. In you case the ip point to _real_ machine, it very strange. Scott Weitzenkamp (sweitzen) wrote: Aha, I found something in /etc/hosts, thanks for the hint. 127.0.0.2 svbu-qa1850-3.cisco.com svbu-qa1850-3 If I comment this line out, MVAPICH works fine. Does Mellanox have this entry in /etc/hosts? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Thursday, October 05, 2006 5:59 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 I see it for all MVAPICH tests, it's 100% consistent. MVAPICH tests are osu_benchmarks (bw/lt/etc..) or all test over mvapich on SUSE10 platform ? Please check /etc/hosts file on your machines, it should be exactly the same on all nodes. Regards, Pasha Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 03, 2006 3:37 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Hi Scott, Unfortunately was not able to reproduce the failure on our platforms. 
Do you see the problem with all tests or with the specific only ? Is it consistent problem ? Regards, Pasha Scott Weitzenkamp (sweitzen) wrote: $ uname -a Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname svbu-qa1850-4 svbu-qa1850-3 $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_bench marks-2.2/ osu_latency The last command just hangs. Can I try your binary RPMs? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Aviram Gutman [mailto:[EMAIL PROTECTED] Sent: Sunday, October 01, 2006 2:29 AM To: Scott Weitzenkamp (sweitzen) Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED] Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Can you please elaborate on MVAPICH issues, can you send command line? We ran it here on 32 Opteron nodes each quad core and also rigorous tests on the many other nodes? Scott Weitzenkamp (sweitzen) wrote: We are just getting started with OFED testing on SLES10, first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hang.This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -- -- ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg -- Pavel Shamis (Pasha) Software Engineer Mellanox Technologies LTD. [EMAIL PROTECTED] -- Pavel Shamis (Pasha) Software Engineer Mellanox Technologies LTD. [EMAIL PROTECTED] -- Pavel Shamis (Pasha) Software Engineer Mellanox Technologies LTD. 
[EMAIL PROTECTED]
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
We've installed four SLES10 machines so far, and they all have the 127.0.0.2 myhostname entry. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 8:49 AM To: Scott Weitzenkamp (sweitzen) Cc: Pavel Shamis (Pasha); Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 I mean SLES10. (yes it's different distros) Scott Weitzenkamp (sweitzen) wrote: You checked SUSE 10 or SLES 10, aren't those different distros? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 3:09 AM To: Scott Weitzenkamp (sweitzen) Cc: Pavel Shamis (Pasha); Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 On some of our SUSE 10 machines i found the 127.0.0.2 ip, but it was pointing to some random Linux site (linux.org) and has no effect on mpi runs. In you case the ip point to _real_ machine, it very strange. Scott Weitzenkamp (sweitzen) wrote: Aha, I found something in /etc/hosts, thanks for the hint. 127.0.0.2 svbu-qa1850-3.cisco.com svbu-qa1850-3 If I comment this line out, MVAPICH works fine. Does Mellanox have this entry in /etc/hosts? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Thursday, October 05, 2006 5:59 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 I see it for all MVAPICH tests, it's 100% consistent. MVAPICH tests are osu_benchmarks (bw/lt/etc..) or all test over mvapich on SUSE10 platform ? 
Please check /etc/hosts file on your machines, it should be exactly the same on all nodes. Regards, Pasha Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 03, 2006 3:37 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Hi Scott, Unfortunately was not able to reproduce the failure on our platforms. Do you see the problem with all tests or with the specific only ? Is it consistent problem ? Regards, Pasha Scott Weitzenkamp (sweitzen) wrote: $ uname -a Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname svbu-qa1850-4 svbu-qa1850-3 $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_bench marks-2.2/ osu_latency The last command just hangs. Can I try your binary RPMs? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Aviram Gutman [mailto:[EMAIL PROTECTED] Sent: Sunday, October 01, 2006 2:29 AM To: Scott Weitzenkamp (sweitzen) Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED] Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Can you please elaborate on MVAPICH issues, can you send command line? We ran it here on 32 Opteron nodes each quad core and also rigorous tests on the many other nodes? Scott Weitzenkamp (sweitzen) wrote: We are just getting started with OFED testing on SLES10, first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. 
MVAPICH with OSU benchmarks just hang.This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
We aren't using SLES auto-install. But I did google for SLES 127.0.0.2 and found this at http://www.novell.com/documentation/novellaudit20/readme/novellaudit20_readme.html:

2.8 SLES 10 hosts File
SLES 10 includes two localhost entries in the /etc/hosts file: 127.0.0.1 and 127.0.0.2.

The steps for installing Oracle10g on SLES10 at http://wiki.novell.com/index.php/Oracle10g_R2_Database_on_SLES10_for_i386_Step-by-Step_1 also reference commenting out the 127.0.0.2 line. Please add this step to the OFED 1.1 MVAPICH release notes.

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 9:03 AM To: Scott Weitzenkamp (sweitzen) Cc: Pavel Shamis (Pasha); Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Here is a link about SuSE bugs related to 127.0.0.2: https://bugzilla.novell.com/show_bug.cgi?id=165269 Check your SuSE auto-install stuff. It is possible that you have some broken configuration in it.

Scott Weitzenkamp (sweitzen) wrote: We've installed four SLES10 machines so far, and they all have the 127.0.0.2 myhostname entry. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 8:49 AM To: Scott Weitzenkamp (sweitzen) Cc: Pavel Shamis (Pasha); Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

I mean SLES10. (Yes, they are different distros.)

Scott Weitzenkamp (sweitzen) wrote: You checked SUSE 10 or SLES 10, aren't those different distros?
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 3:09 AM To: Scott Weitzenkamp (sweitzen) Cc: Pavel Shamis (Pasha); Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 On some of our SUSE 10 machines i found the 127.0.0.2 ip, but it was pointing to some random Linux site (linux.org) and has no effect on mpi runs. In you case the ip point to _real_ machine, it very strange. Scott Weitzenkamp (sweitzen) wrote: Aha, I found something in /etc/hosts, thanks for the hint. 127.0.0.2 svbu-qa1850-3.cisco.com svbu-qa1850-3 If I comment this line out, MVAPICH works fine. Does Mellanox have this entry in /etc/hosts? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Thursday, October 05, 2006 5:59 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 I see it for all MVAPICH tests, it's 100% consistent. MVAPICH tests are osu_benchmarks (bw/lt/etc..) or all test over mvapich on SUSE10 platform ? Please check /etc/hosts file on your machines, it should be exactly the same on all nodes. Regards, Pasha Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 03, 2006 3:37 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Hi Scott, Unfortunately was not able to reproduce the failure on our platforms. Do you see the problem with all tests or with the specific only ? 
Is it consistent problem ? Regards, Pasha Scott Weitzenkamp (sweitzen) wrote: $ uname -a Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname svbu-qa1850-4 svbu-qa1850-3 $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_bench marks-2.2/ osu_latency The last command just hangs. Can I try your binary RPMs? Scott Weitzenkamp SQA
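The /etc/hosts condition discussed in this thread (the machine's own hostname mapped to 127.0.0.2, which made MVAPICH hang until the line was commented out) can be checked for with a short script. The following is a minimal sketch, not part of OFED or MVAPICH; the function name is hypothetical and it only inspects text passed to it.

```python
import ipaddress

LOCALHOST = ipaddress.ip_address("127.0.0.1")


def loopback_aliases(hosts_text, hostname):
    """Return loopback addresses other than 127.0.0.1 that map to hostname."""
    hits = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # not an address line; ignore it
        if ip.version == 4 and ip.is_loopback and ip != LOCALHOST:
            short_names = {name.split(".", 1)[0] for name in names}
            if hostname in names or hostname in short_names:
                hits.append(addr)
    return hits
```

For example, passing the contents of /etc/hosts together with the node's short hostname returns `["127.0.0.2"]` when the entry quoted above is present, and an empty list once that line is commented out.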
Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7
Yes, the patch is being applied. Not sure why cm_issue_drep is not there though... Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 9:13 AM To: Scott Weitzenkamp (sweitzen) Cc: Arlin Davis; Magro, Bill; Supalov, Alexander; Openib-General@Openib.Org; EWG Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 RC7 Hi Scott, You can check OFED compilation log file to see if this patch was applied and compiled. To get the relevant log file: ls -ltr /tmp/OFED*log | tail -2 One of them will be the compilation log. Search for sean_cm_drep_on_not_found.patch inside... Regards, Vladimir On Wed, 2006-10-11 at 08:37 -0700, Scott Weitzenkamp (sweitzen) wrote: Vlad, do you have symbol cm_issue_drep in any .ko files, because I don't. Looks like the patch is not getting compiled in for some reason. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 11, 2006 2:00 AM To: Arlin Davis Cc: Aviram Gutman; Scott Weitzenkamp (sweitzen); Supalov, Alexander; Magro, Bill; EWG; Openib-General@Openib.Org Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 RC7 Hi Arlin, This patch is in OFED-1.1-rc7 and applied during installation. Regards, Vladimir On Tue, 2006-10-10 at 22:50 -0700, Arlin Davis wrote: Aviram Gutman wrote: OFED-1.1-rc7 is available on https://openib.org/svn/gen2/branches/1.1/ofed/releases/ File: OFED-1.1-rc7.tgz Please report any issues in bugzilla http://openib.org/bugzilla/ Aviram, Can you verify that the sean_cm_drep_on_not_found.patch is actually applied in RC7? Our delayed disconnect problems still exist. I don't see the new symbol cm_issue_drep in ib_cm.ko on our RC7 installed systems so I don't think the patch applied. 
Thanks, -arlin
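Vladimir's suggestion (take the two most recent /tmp/OFED*log files and search them for the patch name) could be scripted along these lines. This is a rough sketch: the function name and defaults are hypothetical, and it makes no assumption about the log's wording beyond the patch file name appearing in it.

```python
import glob
import os


def logs_mentioning(patch_name, pattern="/tmp/OFED*log", newest=2):
    """Return those of the newest matching log files that mention patch_name."""
    # Oldest first, like `ls -ltr`; keep only the last `newest` entries.
    paths = sorted(glob.glob(pattern), key=os.path.getmtime)[-newest:]
    found = []
    for path in paths:
        with open(path, errors="replace") as handle:
            if any(patch_name in line for line in handle):
                found.append(path)
    return found
```

Calling `logs_mentioning("sean_cm_drep_on_not_found.patch")` lists the logs that reference the patch. As the thread shows, a hit only indicates the patch was processed; it does not by itself explain whether the cm_issue_drep symbol is visible in ib_cm.ko.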
Re: [openib-general] FW: [PATCH fixed] [RFC] IB/srp: enable multiple connections to the same target
I am also having new problems configuring SRP with OFED 1.1 rc7, I have asked Roland to take a look on my test networks. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lakshmanan, Madhu Sent: Monday, October 09, 2006 4:54 AM To: Ishai Rabinovitz; openib-general@openib.org Cc: Roland Dreier (rdreier) Subject: [openib-general] FW: [PATCH fixed] [RFC] IB/srp: enable multiple connections to the same target Quoting r. Roland Dreier [EMAIL PROTECTED]: Thanks, queued for 2.6.19 I tested the patches, which are included in OFED 1.1 RC7, against Silverstorm SRP targets. The patch breaks backward compatibility for fabrics that use Silverstorm targets, due to the following: It defaults the new parameter initiator_ext to 0. Silverstorm SRP targets, when configured for working with OFED stacks, are usually set to expect an initiator extension of 1, to overcome the earlier limitation of OFED stacks setting initiator extension to the port number. This implies that a user must, without exception, add initiator_ext=n to the add target echo string. It'd be useful if either or both of the following could be done: 1. Release note the above requirement of adding the initiator_ext=n string to the add target echo string, for all Silverstorm targets. 2. Maintain the earlier default of the initiator extension being equal to the port number. I have prepared a patch that does step 2 above, which I'll send in a separate e-mail based on the feedback to the above suggestions. Madhu ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] FW: [PATCH fixed] [RFC] IB/srp: enable multiple connections to the same target
Vlad, I'd like either an rc8 with this patch, or a pre1 build a day before the final build so I can test this fix.

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: Roland Dreier (rdreier) Sent: Monday, October 09, 2006 10:17 AM To: Scott Weitzenkamp (sweitzen) Cc: Lakshmanan, Madhu; Ishai Rabinovitz; openib-general@openib.org Subject: Re: [openib-general] FW: [PATCH fixed] [RFC] IB/srp: enable multiple connections to the same target

I am also having new problems configuring SRP with OFED 1.1 rc7, I have asked Roland to take a look on my test networks.

The problem is that Cisco SRP targets insist on the initiator ID being 8 bytes of 0 followed by the initiator node GUID. The source says

    /*
     * Topspin/Cisco SRP targets will reject our login unless we
     * zero out the first 8 bytes of our initiator port ID. The
     * second 8 bytes must be our local node GUID, but we always
     * use that anyway.
     */

but with the change to allow userspace-specified initiator IDs, we don't use the node GUID anyway. I added the following on top of the multiple connections patch that was queued for 2.6.19. Can this be put into OFED as well?

- R.

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 3bf0c5b..4b09147 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -359,15 +359,16 @@ static int srp_send_req(struct srp_targe
 
     /*
      * Topspin/Cisco SRP targets will reject our login unless we
-     * zero out the first 8 bytes of our initiator port ID. The
-     * second 8 bytes must be our local node GUID, but we always
-     * use that anyway.
+     * zero out the first 8 bytes of our initiator port ID and set
+     * the second 8 bytes to the local node GUID.
      */
     if (topspin_workarounds && !memcmp(&target->ioc_guid, topspin_oui, 3)) {
         printk(KERN_DEBUG PFX "Topspin/Cisco initiator port ID workaround "
                "activated for target GUID %016llx\n",
                (unsigned long long) be64_to_cpu(target->ioc_guid));
         memset(req->priv.initiator_port_id, 0, 8);
+        memcpy(req->priv.initiator_port_id + 8,
+               &target->srp_host->dev->dev->node_guid, 8);
     }
 
     status = ib_send_cm_req(target->cm_id, &req->param);
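To make the 16-byte layout that Roland describes concrete (8 bytes of zero followed by the node GUID for Topspin/Cisco targets), here is a small userspace sketch. It is not the kernel code: the function name and the GUID in the usage note are made up, and placing the initiator extension in the first 8 bytes when the workaround is off is an assumption drawn from the discussion, not confirmed by the patch itself.

```python
import struct


def initiator_port_id(node_guid, initiator_ext=0, topspin_workaround=False):
    """Build a 16-byte initiator port ID as two big-endian 64-bit halves.

    Assumed layout: first half is the initiator extension (forced to zero
    when the Topspin/Cisco workaround applies), second half is the node GUID.
    """
    first_half = 0 if topspin_workaround else initiator_ext
    return struct.pack(">QQ", first_half, node_guid)
```

With a hypothetical GUID, `initiator_port_id(0x0002C90200001234, initiator_ext=1, topspin_workaround=True)` yields eight zero bytes followed by the GUID bytes, which is the form Cisco targets are said to require regardless of the extension a user requested.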
Re: [openib-general] Cisco SQA results for OFED 1.1 rc6
I forgot to summarize some IPoIB and SDP performance numbers. We saw SDP latency as low as 9.5 usec, SDP throughput as high as 8.33 Gb/sec on one port, IPoIB latency as low as 16.0 usec, and IPoIB throughput as high as 3.4 Gb/sec. This is all on RHEL4; the numbers varied quite a bit depending on the system and HCA used. Roland has seen IPoIB throughput of 5.5 Gb/sec using a newer kernel than what RHEL4 uses.

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Monday, October 09, 2006 12:35 AM
To: Open Fabrics
Cc: openib-General
Subject: [openib-general] Cisco SQA results for OFED 1.1 rc6

The testing in general went well. All testing was done on Mellanox HCAs, both SDR and DDR. Most testing was done on RHEL4 U3, but we have done some testing on RHEL4 U4 and SLES10, and in the future will test less and less on RHEL4 U3. See attached spreadsheet for more details.
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
Aha, I found something in /etc/hosts, thanks for the hint. 127.0.0.2 svbu-qa1850-3.cisco.com svbu-qa1850-3 If I comment this line out, MVAPICH works fine. Does Mellanox have this entry in /etc/hosts? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Thursday, October 05, 2006 5:59 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 I see it for all MVAPICH tests, it's 100% consistent. MVAPICH tests are osu_benchmarks (bw/lt/etc..) or all test over mvapich on SUSE10 platform ? Please check /etc/hosts file on your machines, it should be exactly the same on all nodes. Regards, Pasha Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 03, 2006 3:37 AM To: Scott Weitzenkamp (sweitzen) Cc: Aviram Gutman; OpenFabricsEWG; openib Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Hi Scott, Unfortunately was not able to reproduce the failure on our platforms. Do you see the problem with all tests or with the specific only ? Is it consistent problem ? Regards, Pasha Scott Weitzenkamp (sweitzen) wrote: $ uname -a Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname svbu-qa1850-4 svbu-qa1850-3 $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_bench marks-2.2/ osu_latency The last command just hangs. Can I try your binary RPMs? 
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Aviram Gutman [mailto:[EMAIL PROTECTED] Sent: Sunday, October 01, 2006 2:29 AM To: Scott Weitzenkamp (sweitzen) Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED] Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64 Can you please elaborate on MVAPICH issues, can you send command line? We ran it here on 32 Opteron nodes each quad core and also rigorous tests on the many other nodes? Scott Weitzenkamp (sweitzen) wrote: We are just getting started with OFED testing on SLES10, first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hang.This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -- -- ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg -- Pavel Shamis (Pasha) Software Engineer Mellanox Technologies LTD. [EMAIL PROTECTED] -- Pavel Shamis (Pasha) Software Engineer Mellanox Technologies LTD. [EMAIL PROTECTED] ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
I see it for all MVAPICH tests, it's 100% consistent.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 03, 2006 3:37 AM
To: Scott Weitzenkamp (sweitzen)
Cc: Aviram Gutman; OpenFabricsEWG; openib
Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Hi Scott,

Unfortunately I was not able to reproduce the failure on our platforms. Do you see the problem with all tests or only with specific ones? Is the problem consistent?

Regards,
Pasha

Scott Weitzenkamp (sweitzen) wrote:

> $ uname -a
> Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
> $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname
> svbu-qa1850-4
> svbu-qa1850-3
> $ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_benchmarks-2.2/osu_latency
>
> The last command just hangs. Can I try your binary RPMs?
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems

-----Original Message-----
From: Aviram Gutman [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 01, 2006 2:29 AM
To: Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED]
Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Can you please elaborate on the MVAPICH issues? Can you send the command line? We ran it here on 32 Opteron nodes, each quad core, and also ran rigorous tests on many other nodes.

Scott Weitzenkamp (sweitzen) wrote:

> We are just getting started with OFED testing on SLES10; the first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hangs. This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this?
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems

--
Pavel Shamis (Pasha)
Software Engineer
Mellanox Technologies LTD.
[EMAIL PROTECTED]
[openib-general] Problems with OFED IPoIB HA on SLES10
Vlad, I filed a bug for these issues.

1) If I start IPoIB HA with the ib0 IB port shut down (from the IB switch) and the ib1 IB port enabled, then IPoIB does not work, because "ip monitor linkall" does not report NO-CARRIER at startup, which is what ipoib_ha.pl is looking for. This is a major hole.

2) /etc/init.d/openibd runs ipoib_ha.pl with its stdout and stderr redirected to /dev/null. Should we run it with -v for verbose output instead, and redirect the output to a log file under /var/log?

# fgrep ipoib_ha.pl /etc/init.d/openibd
ipoib_ha.pl -p ${PRIMARY_IPOIB_DEV} -s ${SECONDARY_IPOIB_DEV} --with-arping --with-multicast > /dev/null 2>&1

3) I got IPoIB HA working on SLES 10, but the documentation is a little lacking. It looks like I have to put the same IP address in ifcfg-ib0 and ifcfg-ib1, is this correct?

# pwd
/etc/sysconfig/network
# cat ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=192.168.2.46
NETMASK=255.255.255.0
# cat ifcfg-ib1
DEVICE=ib1
BOOTPROTO=static
IPADDR=192.168.2.46
NETMASK=255.255.255.0

4) If I shut down the ib0 IB port, I see this from "/usr/local/ofed/bin/ipoib_ha.pl -v --with-arping --with-multicast":

Use of uninitialized value in concatenation (.) or string at /usr/local/ofed/bin/ipoib_ha.pl line 287.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
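For issue 1, one possible way for a monitoring script to learn the link state at startup, rather than waiting for a change event, is to read the carrier file that Linux exposes under /sys/class/net. The sketch below is an assumption about how that could be done, not a description of ipoib_ha.pl: the function name link_carrier_state and its optional second argument are made up for illustration, and whether the IPoIB driver in this OFED release reports carrier reliably through sysfs is not confirmed in this thread.

```shell
# Hypothetical helper: print "up", "down", or "unknown" for an interface,
# based on its sysfs carrier file. The optional second argument overrides
# the sysfs root and exists only to make the function easy to test.
link_carrier_state() {
    iface="$1"
    sysfs_root="${2:-/sys/class/net}"
    carrier_file="$sysfs_root/$iface/carrier"
    # Reading carrier can fail (for example while the interface is
    # administratively down), so a failed read maps to "unknown".
    if ! value=$(cat "$carrier_file" 2>/dev/null); then
        echo unknown
        return 0
    fi
    case "$value" in
        1) echo up ;;
        0) echo down ;;
        *) echo unknown ;;
    esac
}
```

A script could call link_carrier_state ib0 once before it starts watching for link events, and treat "down" the same way it treats a NO-CARRIER event.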
Re: [openib-general] [openfabrics-ewg] Problems with OFED IPoIB HA on SLES10
Vlad, thanks for the fast response. I have some follow-up questions about configuring IPoIB HA, see below.

>> 3) I got IPoIB HA working on SLES 10, but the documentation is a little lacking. Looks like I have to put the same IP address in ifcfg-ib0 and ifcfg-ib1, is this correct?
>
> Yes, IP address should be the same. Actually the configuration of the secondary interface does not matter. The High Availability daemon reads the configuration of the primary interface and migrates it between the interfaces in case of failure.

If I don't have an ifcfg-ib1 file, then ipoib_ha.pl won't start. I would prefer not to configure ifcfg-ib1, since I don't plan to use it.

# ipoib_ha.pl --with-arping --with-multicast -v
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
...

If I put different IP addresses in ifcfg-ib0 and ifcfg-ib1, then the ifcfg-ib1 IP address is used for both ib0 and ib1!

# pwd
/etc/sysconfig/network
# cat ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=192.168.2.46
NETMASK=255.255.255.0
# cat ifcfg-ib1
DEVICE=ib1
BOOTPROTO=static
IPADDR=192.168.6.46
NETMASK=255.255.255.0
# /etc/init.d/openibd start
Loading HCA driver and Access Layer: [ OK ]
Setting up InfiniBand network interfaces:
ib0 device: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
ib0 configuration: ib1
Bringing up interface ib0: [ OK ]
ib1 device: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
Bringing up interface ib1: [ OK ]
Setting up service network . . . [ done ]
# ifconfig ib0
ib0       Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:192.168.6.46 Bcast:192.168.6.255 Mask:255.255.255.0
          inet6 addr: fe80::202:c902:21:700d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b) TX bytes:224 (224.0 b)
# ifconfig ib1
ib1       Link encap:UNSPEC HWaddr 00-00-04-05-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:192.168.6.46 Bcast:192.168.6.255 Mask:255.255.255.0
          inet6 addr: fe80::202:c902:21:700e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b) TX bytes:304 (304.0 b)

Notice how both ib0 and ib1 have the IP address from ifcfg-ib1. This contradicts this info from ipoib_release_notes.txt:

b. The ib1 interface uses the configuration script of ib0.

Scott
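Since the answer above is that both files should carry the same address, a quick consistency check before starting openibd may help avoid the mismatch shown in the ifconfig output. This is a minimal sketch, assuming the simple KEY=value layout of the ifcfg files quoted in this message; the function names ifcfg_value and same_ipaddr are hypothetical and not part of OFED.

```shell
# Hypothetical helpers for KEY=value style ifcfg files.

# Print the value of a key (the last assignment wins), with quotes removed.
ifcfg_value() {
    awk -F= -v key="$2" '$1 == key { value = $2 } END { print value }' "$1" |
        tr -d "\"'"
}

# Succeed only if both files configure the same IPADDR.
same_ipaddr() {
    [ "$(ifcfg_value "$1" IPADDR)" = "$(ifcfg_value "$2" IPADDR)" ]
}
```

For example, same_ipaddr /etc/sysconfig/network/ifcfg-ib0 /etc/sysconfig/network/ifcfg-ib1 || echo "IPADDR differs" would flag the 192.168.2.46 versus 192.168.6.46 case from this message.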
Re: [openib-general] [openfabrics-ewg] Problems with OFED IPoIB HA on SLES10
{lock_timer_base+27}
80138f89{try_to_del_timer_sync+81}
883322b3{:ib_sa:send_handler+72}
8826762f{:ib_mad:ib_mad_complete_send_wr+421}
88267f00{:ib_mad:ib_mad_completion_handler+947}
88267b4d{:ib_mad:ib_mad_completion_handler+0}
80140177{run_workqueue+153}
8014081e{worker_thread+0}
801437e5{keventd_create_kthread+0}
80140927{worker_thread+265}
8012787f{__wake_up_common+62}
8012905a{default_wake_function+0}
801437e5{keventd_create_kthread+0}
80143aca{kthread+236}
8010b60a{child_rip+8}
801437e5{keventd_create_kthread+0}
801439de{kthread+0}
8010b602{child_rip+0}
Code: f0 ff 0f 0f 88 29 01 00 00 c3 fa f0 ff 0f 0f 88 2a 01 00 00
RIP 802cffea{_spin_lock_irqsave+3} RSP 810132a4fc20

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Tuesday, October 03, 2006 2:53 PM
To: Vladimir Sokolovsky
Cc: EWG; openib-General
Subject: Re: [openib-general] [openfabrics-ewg] Problems with OFED IPoIB HA on SLES10

Vlad, thanks for the fast response. I have some follow-up questions about configuring IPoIB HA, see below.

>> 3) I got IPoIB HA working on SLES 10, but the documentation is a little lacking. Looks like I have to put the same IP address in ifcfg-ib0 and ifcfg-ib1, is this correct?
>
> Yes, IP address should be the same. Actually the configuration of the secondary interface does not matter. The High Availability daemon reads the configuration of the primary interface and migrates it between the interfaces in case of failure.

If I don't have an ifcfg-ib1 file, then ipoib_ha.pl won't start. I would prefer not to configure ifcfg-ib1, since I don't plan to use it.

# ipoib_ha.pl --with-arping --with-multicast -v
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
...

If I put different IP addresses in ifcfg-ib0 and ifcfg-ib1, then the ifcfg-ib1 IP address is used for both ib0 and ib1!

# pwd
/etc/sysconfig/network
# cat ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=192.168.2.46
NETMASK=255.255.255.0
# cat ifcfg-ib1
DEVICE=ib1
BOOTPROTO=static
IPADDR=192.168.6.46
NETMASK=255.255.255.0
# /etc/init.d/openibd start
Loading HCA driver and Access Layer: [ OK ]
Setting up InfiniBand network interfaces:
ib0 device: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
ib0 configuration: ib1
Bringing up interface ib0: [ OK ]
ib1 device: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
Bringing up interface ib1: [ OK ]
Setting up service network . . . [ done ]
# ifconfig ib0
ib0       Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:192.168.6.46 Bcast:192.168.6.255 Mask:255.255.255.0
          inet6 addr: fe80::202:c902:21:700d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b) TX bytes:224 (224.0 b)
# ifconfig ib1
ib1       Link encap:UNSPEC HWaddr 00-00-04-05-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:192.168.6.46 Bcast:192.168.6.255 Mask:255.255.255.0
          inet6 addr: fe80::202:c902:21:700e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b) TX bytes:304 (304.0 b)

Notice how both ib0 and ib1 have the IP address from ifcfg-ib1. This contradicts this info from ipoib_release_notes.txt:

b. The ib1 interface uses the configuration script of ib0.

Scott
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
Is this communication protocol documented anywhere? How does this feature compare to IPoIB and SDP?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ramachandra K
Sent: Monday, October 02, 2006 12:58 PM
To: Roland Dreier (rdreier)
Cc: [EMAIL PROTECTED]; openib-General
Subject: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)

Hi Roland,

This patch series adds support for the SilverStorm Virtual Ethernet I/O Controllers (VEx) by adding a new kernel level driver. This kernel driver:

1. Communicates with the VEx on the SilverStorm fabric switches/directors using SilverStorm's native protocol
2. Presents a standard Ethernet NIC interface to the system
3. Uses IB reliable connection semantics
4. Is tuned for high performance and throughput

The SilverStorm VEx and the associated communication protocol are in wide use amongst users of SilverStorm IB fabric solutions.

This patch series is intended for your infiniband.git for-2.6.19 branch. It has also been tested against the for-2.6.20 branch.

Signed-off-by: Ramachandra K [EMAIL PROTECTED]

Regards,
Ram
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
Aviram, can I try Mellanox binary RPMs?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Scott Weitzenkamp (sweitzen)
Sent: Sunday, October 01, 2006 9:31 PM
To: 'Aviram Gutman'; Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED]
Subject: RE: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

$ uname -a
Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
$ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname
svbu-qa1850-4
svbu-qa1850-3
$ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_benchmarks-2.2/osu_latency

The last command just hangs. Can I try your binary RPMs?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Aviram Gutman [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 01, 2006 2:29 AM
To: Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED]
Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Can you please elaborate on the MVAPICH issues? Can you send the command line? We ran it here on 32 Opteron nodes, each quad core, and also ran rigorous tests on many other nodes.

Scott Weitzenkamp (sweitzen) wrote:

> We are just getting started with OFED testing on SLES10; the first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hangs. This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this?
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
[openib-general] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
We are just getting started with OFED testing on SLES10; the first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hangs. This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
$ uname -a
Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
$ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname
svbu-qa1850-4
svbu-qa1850-3
$ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_benchmarks-2.2/osu_latency

The last command just hangs. Can I try your binary RPMs?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Aviram Gutman [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 01, 2006 2:29 AM
To: Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED]
Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Can you please elaborate on the MVAPICH issues? Can you send the command line? We ran it here on 32 Opteron nodes, each quad core, and also ran rigorous tests on many other nodes.

Scott Weitzenkamp (sweitzen) wrote:

> We are just getting started with OFED testing on SLES10; the first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hangs. This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this?
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
Re: [openib-general] [openfabrics-ewg] OFED Status
Yes, this is fine with me.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Aviram Gutman
Sent: Tuesday, September 26, 2006 9:01 AM
To: EWG; Openib-General@Openib.Org
Subject: [openfabrics-ewg] OFED Status

Hi,

OFED 1.1 RC6 was released on Thu. The issues that have been resolved since are:

1) OpenIB Diags build on SLES10 ppc - Solved by Moshe Katzir from Voltaire
2) iSER build on SLES10 needs root privilege - Voltaire fixed it
3) Bug #233 SDP crash on ipath - I believe MST fixed it. Betsy, please confirm.
4) Fix IBDM to allow multiple devices on the same machine - Eitan Zahavi fixed
5) SRP HA - Fixed by Ishai
6) IPoIB HA on RH - Vlad made progress, issue is still not solved.
7) The CM fix that Arlin asked for - In

Pending that IPoIB HA is solved, I would like to issue RC7, which is supposed to be final. Is everyone OK with this approach?

Aviram
Re: [openib-general] all of the man pages should change the package name to OFED
OpenFabrics maybe, but not OFED in my opinion.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dotan Barak
Sent: Thursday, September 21, 2006 8:45 AM
To: Roland Dreier (rdreier); Hal Rosenstock
Cc: openib
Subject: [openib-general] all of the man pages should change the package name to OFED

Hi.

When I executed man ibv_devinfo or man ibstat (for example), I noticed that those man pages are marked as part of the OpenIB package. I believe that the package name should be changed to OFED.

What do you think?

Thanks,
Dotan
Re: [openib-general] [openfabrics-ewg] [PATCH] OFED 1.1-rc3 is ready
>> I installed RC5 and now it just hangs,
>
> Wow - we can't even get RC5 to build here. What distro are you running? I've tried this on RC4 + a fixed libipathverbs package and it runs OK (although it does take a while, which might explain the hang you were seeing.) But mostly I'm curious how you get RC5 to build at all. We really really really shouldn't be attempting to turn RC's around as fast as RC4 to RC5 went: we basically had about enough time to throw a patch together without being able to do much testing.

I think many of us are in agreement. Before RC6, I propose we only check in critical work on the release branch, and get some time in to thoroughly test RC5. Non-critical fixes can wait until after 1.1. I personally would like a week to test RC5.

I feel like we have forgotten what the E in OFED stands for. If we have to slip the release schedule to make this code really stable, I'm in favor of it.

Scott
Re: [openib-general] [openfabrics-ewg] OFED 1.1 status
When will rc4 be available? I'd also like to suggest we not rush the final build; end of this week seems too soon.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Thursday, September 07, 2006 1:02 PM
To: EWG
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 status

Hi,

OFED 1.1 RC4 will be published on Monday 11-Sep. We are currently working on several showstoppers:

1. 223: mthca.so not properly linked to libibverbs Vlad & Jack
2. 221: SRP on V40Z and Sun T4 gets Kernel BUG at spinlock:118 - Roland
3. 219: OFED 1.1rc3 contains prerelease unstable libibverbs code Vlad & Jack

Thus the final release date will be delayed to the end of next week.

Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380
Re: [openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
I only tested with renice -20.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Monday, September 11, 2006 11:02 PM
To: Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: Re: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors

Quoting r. [EMAIL PROTECTED] [EMAIL PROTECTED]:
> Subject: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
>
> http://openib.org/bugzilla/show_bug.cgi?id=229
>
> --- Comment #2 from [EMAIL PROTECTED] 2006-09-11 21:54 ---
> Cisco embedded SM on a switch, thus no SM on a host, only IB drivers.

Looks like we'll add the workaround for ofed. What renice level are you using?

--
MST
Re: [openib-general] [openfabrics-ewg] is there a plan for getting SDP into kernel.org?
> Scott> I would like to see netstat support, zcopy support, and
> Scott> ideally AIO support get added first...
>
> Better to merge first and then add features I think.
>
> - R.

How about just adding netstat before the merge, so we have some visibility into what SDP connections are in use?

Scott
Re: [openib-general] [openfabrics-ewg] OFED 1.1 status
Please make sure 1. and 3. are fixed before you release rc4, thanks.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Thursday, September 07, 2006 1:02 PM
To: EWG
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 status

Hi,

OFED 1.1 RC4 will be published on Monday 11-Sep. We are currently working on several showstoppers:

1. 223: mthca.so not properly linked to libibverbs Vlad & Jack
2. 221: SRP on V40Z and Sun T4 gets Kernel BUG at spinlock:118 - Roland
3. 219: OFED 1.1rc3 contains prerelease unstable libibverbs code Vlad & Jack

Thus the final release date will be delayed to the end of next week.

Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380
[openib-general] is there a plan for getting SDP into kernel.org?
I would like to see netstat support, zcopy support, and ideally AIO support get added first...

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc2 is ready (how do I enable madeye)?
> 5. Added Madeye utility

How do I build madeye? I don't see any reference to it in install.sh. Is there any documentation for madeye?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc3 is ready
RC3 includes a bunch of binary RPMs; please remove them for RC4. Look at the size of the RC3 tarball vs. previous ones:

$ ls -s | more
total 290848
 46512 OFED-1.1-rc1.tgz
     0 OFED-1.1-rc1.tgz.md5sum
 47048 OFED-1.1-rc2.tgz
     0 OFED-1.1-rc2.tgz.md5sum
197288 OFED-1.1-rc3.tgz
     0 OFED-1.1-rc3.tgz.md5sum

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Thursday, August 31, 2006 9:24 AM
To: EWG
Cc: OPENIB
Subject: [openfabrics-ewg] OFED 1.1-rc3 is ready

Hi,

OFED 1.1-RC3 is available on https://openib.org/svn/gen2/branches/1.1/ofed/releases/
File: OFED-1.1-rc3.tgz

Please report any issues in bugzilla http://openib.org/bugzilla/

Schedule reminder:
==================
Next milestones: RC4 is planned for 7-Sep. It should include critical bug fixes only. Final release will be on 11 or 12 Sep.

Owners - please update release notes for RC4.

Tziporet & Vlad

Release details:
================
Build_id: OFED-1.1-rc3
openib-1.1 (REV=9203)
# User space
https://openib.org/svn/gen2/branches/1.1/src/userspace
Git: ref: refs/heads/ofed_1_1
commit 338e942a4ae10d62f2632e6292f85bb1b15d154c
# MPI
mpi_osu-0.9.7-mlx2.2.0.tgz
openmpi-1.1.1-1.src.rpm
mpitests-2.0-0.src.rpm

OS support:
===========
Novell:
- SLES 9.0 SP3
- SLES10
Redhat:
- Redhat EL4 up3
- Redhat EL4 up4
kernel.org:
- Kernel 2.6.17

Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped from the list. We keep the backport patches for these OSes and make sure OFED compiles and loads properly, but will not do a full QA cycle.

Systems:
* x86_64
* x86
* ia64
* ppc64

Main changes from OFED-1.1-rc2:
===============================
1. Added ehca (IBM) driver. This driver can be compiled on kernel 2.6.18 only
3. Open MPI version update to openmpi-1.1.1-1
4. Core: Huge pages registration is supported
5. IPoIB high availability script supports multicast groups
6. RHEL4 up4 is now supported
7. SDP: fixed connection refused problem; get peer name working
8. libsdp: several bug fixes

Limitations and known issues:
=============================
1. SDP: For Mellanox Sinai HCAs one must use latest FW version (1.1.000).
2. SDP: Scalability issue when many connections are opened
3. SDP: If RTU packet is lost, Accept call blocks even if client connected.
4. ipath driver is not supported on SLES9 SP3
5. Compilation on kernel 2.6.18-rc5 is failing - to be fixed in RC4

Missing features that should be completed for RC4:
==================================================
None
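To see which packages account for the size jump reported at the top of this message, the tarball contents can be listed and filtered for RPMs that are not source RPMs. This is a rough sketch: the function name list_binary_rpms is hypothetical, and it assumes binary packages can be told apart by a .rpm suffix that is not .src.rpm; the actual layout of OFED-1.1-rc3.tgz is not shown in this thread.

```shell
# Hypothetical helper: list entries of a gzipped tarball that look like
# binary RPMs, meaning names ending in .rpm but not in .src.rpm.
# Exits non-zero when no such entry is found (grep's exit status).
list_binary_rpms() {
    tar -tzf "$1" | grep '\.rpm$' | grep -v '\.src\.rpm$'
}
```

Running list_binary_rpms OFED-1.1-rc3.tgz and then the same command on the rc2 tarball would show which entries are new.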
Re: [openib-general] [patch] libsdp typo in config_parser
Running an MPI command with LD_PRELOAD=libsdp.so at the beginning won't cause SDP to be used on remote nodes. You have to find a way to load libsdp.so on all nodes; this might work better:

LD_PRELOAD=libsdp.so mpirun -np 4 env LD_PRELOAD=libsdp.so /there/vasp/20060503/vasp.4.6/vasp.mpi

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bernhard Fischer
Sent: Friday, August 18, 2006 12:22 PM
To: Eitan Zahavi
Cc: openib-general@openib.org
Subject: Re: [openib-general] [patch] libsdp typo in config_parser

On Fri, Aug 18, 2006 at 10:05:35PM +0300, Eitan Zahavi wrote:
> Hi Bernhard
>
> SDP traffic will not show on the IPoIB counters. It does not go through IPoIB.

That's what i thought, thanks for confirming.

> You can use lsmod | grep ib_sdp to see how many connections are made over SDP.

Running lam via 2 nodes, on 2 CPUs each, i see:

# lsmod | grep ib_sdp
ib_sdp 28184 4
rdma_cm 27912 1 ib_sdp
ib_core 53632 12 ib_ucm,ib_uverbs,ib_sdp,rdma_cm,ib_cm,ib_local_sa,ib_umad,ib_ipoib,ib_multicast,ib_sa,ib_mthca,ib_mad

I did start lamboot with libsdp.so preloaded:

$ LD_PRELOAD=/usr/local/lib64/libsdp.so lamboot l
$ lamnodes C -c -n
node13ib.infiniband
node13ib.infiniband
node15ib.infiniband
node15ib.infiniband
$ LD_PRELOAD=/usr/local/lib64/libsdp.so mpirun -np 4 /there/vasp/20060503/vasp.4.6/vasp.mpi

Still, ifconfig ib0 (which hosts node??ib.infiniband on 10.100.0.0/24) shows that the communication is being sent over ipoib, as ifconfig's counters constantly go up when communicating (only one user is active on the system).

$ /sbin/ifconfig ib0
ib0       Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:10.100.0.13 Bcast:10.100.0.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
          RX packets:182037964 errors:0 dropped:0 overruns:0 frame:0
          TX packets:183607689 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:189334244937 (180563.2 Mb) TX bytes:194777918565 (185754.6 Mb)

My libsdp.conf looks like this:

$ cat /usr/local/etc/libsdp.conf
#log min-level 1 destination file libsdp.log
use bothconnect * 10.100.0.0/24:*
use bothserver * 10.100.0.0/24:*

So i fear i'm missing something crucial. Ideas?

> The exact number of packets and data flowing through the IB port can be obtained from:
> /sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets
> /sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets

$ for i in /sys/class/infiniband/mthca0/ports/1/counters/*packets;do echo -n $i:' ' ; cat $i;done
/sys/class/infiniband/mthca0/ports/1/counters/port_rcv_packets: 185010549
/sys/class/infiniband/mthca0/ports/1/counters/port_xmit_packets: 186584856

PS: The different pingpong tests (which have outdated names in the openib wiki, btw) do work just fine if run from the very same user, so i think that the basic verbs communication works properly.
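The counter loop quoted above can be wrapped in a small function so that snapshots taken before and after a run are easy to compare. This is a sketch only: print_packet_counters is a hypothetical name, and the directory argument is assumed to be a counters directory such as the /sys/class/infiniband/mthca0/ports/1/counters path mentioned in the thread. Note that these are port-level counters, so they cannot by themselves tell SDP traffic from IPoIB traffic.

```shell
# Hypothetical helper: print "name value" for every *packets counter file
# in the given counters directory, one counter per line.
print_packet_counters() {
    counters_dir="$1"
    for f in "$counters_dir"/*packets; do
        [ -r "$f" ] || continue    # skip unreadable files and unmatched globs
        printf '%s %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

For example, saving the output to one file before the MPI job and to another afterwards, then running diff on the two files, shows how many packets crossed the port during the job.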
[openib-general] does Oracle still support AIO SDP?
I know at one point Oracle supported AIO SDP on Linux; is it still supported? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary
We received our DDN equipment today and have started setting it up.

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Hansen (shahanse) Sent: Tuesday, July 25, 2006 1:39 PM To: Tziporet Koren; Matt Leininger Cc: [EMAIL PROTECTED]; openib Subject: Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary

Yes, Cisco plans to test OFED on a DDN SRP target.

--Shawn

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren Sent: Tuesday, July 25, 2006 8:40 AM To: Matt Leininger Cc: [EMAIL PROTECTED]; openib Subject: Re: [openfabrics-ewg] OFED 1.1 planning meeting - summary

Matt Leininger wrote:

5. SRP:
- GA quality
- DM (Device Mapper) - for high availability
- Basic failover/failback testing with daemon+srp+XVM/MPP and Engenio target

Tziporet, Are there any plans to test with the DDN SRP target? Several DoE sites are testing/using the DDN IB based storage.

Mellanox does not have a DDN SRP target. We will be happy to test it if DDN will loan us a system. Another option is that DDN takes the OFED 1.1 RCs and tests them in their labs. Can you approach them and ask about this? If yes, then I can cc them on the RC mails so they can do it.

Is there any other vendor who has a DDN SRP target and is going to test OFED with it?

Tziporet

___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg
Re: [openib-general] [openfabrics-ewg] OFED 1.1-rc1 is available
Bryan, can you please add a 1.1rc1 version to bugzilla?

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren Sent: Tuesday, August 08, 2006 7:48 AM To: [EMAIL PROTECTED] Cc: openib Subject: [openfabrics-ewg] OFED 1.1-rc1 is available

Hi,

After a two-week delay, we have published OFED 1.1-RC1 at https://openib.org/svn/gen2/branches/1.1/ofed/releases/

File: OFED-1.1-rc1.tgz

Build_id:
OFED-1.1-rc1

openib-1.1 (REV=8849)
# User space
https://openib.org/svn/gen2/branches/1.1/src/userspace
Git:
git://www.mellanox.co.il/~git/infiniband ofed_1_1
ref: refs/heads/ofed_1_1
commit df6aabce49695368fd004e6505102a1519b266a4

# MPI
mpi_osu-0.9.7-mlx2.2.0.tgz
openmpi-1.1-1.src.rpm
mpitests-2.0-0.src.rpm

OS support:
===
Novell:
- SLES 9.0 SP3*
- SLES10 (official release)*
Redhat:
- Redhat EL4 up3
- Redhat EL4 up4 (was not tested yet)
kernel.org:
- Kernel 2.6.17*

* Changed from 1.0 release

Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped from the list. We keep the backport patches for these OSes and make sure OFED compiles and loads properly, but will not do a full QA cycle.

Systems:
* x86_64
* x86
* ia64
* ppc64

Main changes from OFED-1.0:
===
o Bug fixes
o Enabled building 32-bit libraries on x86_64 and ppc64. Note: sysfsutils and sysfsutils-devel 32-bit required.
o Kernel code based on 2.6.18
o Kernel hot-plug support in uverbs - module removal waits until all user applications release their HCA resources.
o Package sources (user space and kernel modules) are placed under: prefix/src/ Original kernel sources are not replaced
o RDS was removed from the OFED package
o Set options in CMA uCMA
o OSM new features:
- Partition Manager (Pkey)
- Pre-computed routing load from file
- Primitive QoS - As technology preview
o SDP:
- Improved latency (13 usec with netperf tcprr)
- Implemented Nagle algorithm
- Memory leak fixes
- Error handling added
o MPI:
- OSU - MVAPICH: Message coalescing to improve message rate
- Open MPI 1.1-1: see changes: http://svn.open-mpi.org/svn/ompi/trunk/NEWS
- MPI tests: Replaced with the new test versions from LLNL, Intel, OSU
o SRP:
- Stability
o iSER:
- Stability
- Testing more platforms (e.g. ppc64 and ia64)
- Performance improvements
o Management:
- Add saquery tool
- Enhancement to ibnetdiscover tool with grouping function
- New ibutils package:
  o Port error counter check
  o Port performance counters dump
  o Link width and Link Speed check by flag
o uDAPL:
- Scalability features needed for Intel MPI
- Code was updated from trunk

Limitations and known issues:
=
1. ipath driver compilation fails on all systems
2. iSER support in install script for SLES 10 is missing
3. SDP:
- 32 bit systems might run out of low memory when opening hundreds of sockets.
- For Mellanox Sinai HCAs one must use the latest FW version (1.1.000).

Missing features that should be completed for RC2:
==
1. SRP:
- Complete testing with DM (Device Mapper) - for high availability
- New daemon
2. IPoIB: High availability support using a daemon in user level
3. SDP: support sending/receiving out of band data
4. Add Madeye utility
5. Fatal error support in mthca

Please report any issues in bugzilla http://openib.org/bugzilla/

Tziporet Vlad
Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary
Title: Message Cisco IB host drivers are available at http://www.cisco.com/cgi-bin/tablebuild.pl/sfs-linuxand http://www.cisco.com/cgi-bin/tablebuild.pl/sfs-win2K. Scott From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet KorenSent: Monday, July 24, 2006 2:45 PMTo: [EMAIL PROTECTED]Cc: openibSubject: Re: [openib-general] [openfabrics-ewg] OFED 1.1 planning meeting - summary Hi all, This is the outcome of the meeting we had today regarding OFED 1.1 schedule and features. Tziporet 1. Schedule: Target release date: 31-Aug Intermediate milestones: 1. Create 1.1 branch of user level code and rc1: 27-Jul 2. Feature freeze : 3-Aug 3. Code freeze (rc-x): 25-Aug 4. Final release: 31-Aug In general all agreed but it seems aggressive schedule. We will delay in 1 week if needed or drop some features. There was a request for another OFED release toward SC06 that will include most updated Open MPI version and we agreed this is possible. git tree of kernel code will be available on Sandia servers once their system administrator will setup the server with git installed (should be this week) 2. Features: Note: features that are under low priority may not be qualified in final release due to schedule limitations. 1. OS: Novell: - SLES 9.0 SP3* SLES10 (official release)* Redhat: Redhat EL4 up3 - Redhat EL4 up4* kernel.org: Kernel 2.6.17* * Changes from last release Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped from the list. We will keep the backport patches for these OSes and make sure OFED compile and loaded properly but will not do full QA cycle. 2. General changes: lib32 on 64 bits systems Kernel code based on 2.6.18 HCA fatal - full flow support - Low priority High Availability in IPoIB and SRP Bug fixes 3. OSM (new code based on the trunk): Partition Manager (Pkey) - Low priority Pre-computed routing load from file Primitive QoS- As technology preview Mainly developed and verified by Voltaire. 4. 
SDP: Beta quality (higher stability) Improved latency Improved bandwidth of small messages (Naggle algorithm) done Support the backlog parameter in the listen call support sending/receiving out of band data Interoperability with previousSDP implementation We need SDP from Cisco to test the interoperability with their SDP 5. SRP: GA quality DM (Device Mapper) - for high availability Basic failover/failback testing with daemon+srp+XVM/MPP and Engenio target A technical mail was published on the general list. (Subject: Needed changes to support fail-over drivers). Need help from Roland to close the technical details since he is SRP maintainer. 6. IPoIB: High availability supportusing a daemon in user level 7. uDAPL: Scalability features needed for Intel MPI Going to take the new code from the trunk 8. OSU MVAPICH: Based on 0.97 (+ bug fixes) Message coalescing 9. Open MPI: Open MPI 1.1.1 - Depending on the dates/schedule of OFED 1.1 and Open MPI 1.1.1 (If not then Open MPI 1.1 will be used) The major differences between Open MPI 1.1 and 1.1.1 can be seen here: http://svn.open-mpi.org/svn/ompi/trunk/NEWS 10. MPI tests: Replace to the new test versions from LLNL, Intel, OSU 11. iSER: Stability - code review and bug fixes at iser and libiscsi code related to error handling (libiscsi is a service module used by both iscsi_tcp iser) Testing more platforms (e.g. ppc64 and ia64) Performance improvements The libiscsi fixes are (2.6.18-rc2/3) and will (2.6.19) be pushed upstream and from there be propagated to distros (eg SLES10 RH5) through their merge process. 12. RDS: Oracle and SilverStorm should update Need to decide if RDS should be removed from OFED since Oracle does not support it for now. Sujal will check it and we will get to decision soon. 
13: Management: Madeye utility Add saquery tool Enhancement to ibnetdiscover tool with grouping function New ibutils package: o Port error counter check o Port performancecounters dump o Link width and Link Speed check by flag -Original Message-From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sujal DasSent: Monday, July 17, 2006 8:32 PMTo: [EMAIL PROTECTED]Subject: [openfabrics-ewg] OFED 1.1 planning meeting Hello all, We would like to call an OFED v1.1 planning meeting ASAP. We can use the regular time and call-in number that Shawn/Jeff from Cisco had set up earlier. Shawn/Jeff: will you please confirm and send a reminder?
Re: [openib-general] OFED 1.0 - Official Release
Tziporet, I see a few C code changes from pre1 in the form of patches. What are these and why were they added after pre1? $ diff -r OFED-1.0-pre1/SOURCES/openib-1.0/patches/OFED-1.0/SOURCES/openib-1.0/patches/ 21 | less... Only in OFED-1.0-pre1/SOURCES/openib-1.0/patches/fixes: handle_reconnect_of_offline_host.patchOnly in OFED-1.0/SOURCES/openib-1.0/patches/fixes: sdp_fix.patch Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet KorenSent: Friday, June 16, 2006 1:55 AMTo: OpenFabricsEWG; openibSubject: [openib-general] OFED 1.0 - Official Release I am happy to announce that OFED 1.0 Official Release is now available. The release can be found under: https://openib.org/svn/gen2/branches/1.0/ofed/releases/ And later today it will be on the OpenFabrics download page: http://www.openfabrics.org/downloads.html. This is the first release that was done in a joint effort of the following companies: Cisco SilverStorm Voltaire QLogic Intel Mellanox Technologies I wish to thank all who contributed to the success of this release. Tziporet === Release summary: The OFED software package is composed of several software modules intended for use on a computer cluster constructed as an InfiniBand network. OFED package contains the following components: o OpenFabrics core and ULPs: - HCA drivers (mthca, ipath) - core - Upper Layer Protocols: IPoIB, SDP, SRP Initiator, iSER Host, RDS and uDAPL o OpenFabrics utilities: - OpenSM: InfiniBand Subnet Manager - Diagnostic tools - Performance tests o MPI: - OSU MPI stack supporting the InfiniBand interface - Open MPI stack supporting the InfiniBand interface - MPI benchmark tests (OSU BW/LAT, Pallas, Presta) o Sources of all software modules (under conditions mentioned in the modules' LICENSE files) o Documentation Notes: 1. SDP and RDS are in technology preview state. 2. The SRP Initiator and Open MPI are in beta state. 3. 
All other OFED components are in production state. Supported Platforms and Operating Systems CPU architectures: * x86_64 * x86 * ia64 * ppc64 Linux Operating Systems: * RedHat EL4 up2: 2.6.9-22.ELsmp * RedHat EL4 up3: 2.6.9-34.ELsmp * Fedora C4: 2.6.11-1.1369_FC4 * SLES10 RC2: 2.6.16.16-1.6-smp (or RC 2.5 2.6.16.14-6-smp) * SLES10 RC1: 2.6.16.14-6-smp * SUSE 10 Pro: 2.6.13-15-smp * kernel.org: 2.6.16.x HCAs Supported Mellanox HCAs: - InfiniHost - InfiniHost III Ex (both modes: with memory and MemFree) - InfiniHost III Lx Both SDR and DDR mode of the InfiniHost III family are supported. For official FW versions please see: http://www.mellanox.com/support/firmware_table.php Qlogic HCAs: - QHT6040 (PathScale InfiniPath HT-460) - QHT6140 (PathScale InfiniPath HT-465) - QLE6140 (PathScale InfiniPath PE-880) Switches Supported This release was tested with switches and gateways provided by the following companies: - Cisco - Voltaire - SilverStorm - Flextronics Attached are the release notes Tziporet Koren Software Director Mellanox Technologies mailto: [EMAIL PROTECTED]Tel +972-4-9097200, ext 380 ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] MVAPICH failure on IBM PPC-64 Linux machine
I agree it's not working, and I have opened bug 135 (OFED 1.0: MVAPICH doesn't work on RHEL4 U3 ppc64).

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Boris Shpolyansky Sent: Monday, June 12, 2006 5:53 PM To: openib-general@openib.org Subject: [openib-general] MVAPICH failure on IBM PPC-64 Linux machine

Hi,

I've run into the following failure running OSU MPI out of OFED-rc5 on an IBM PPC-64 platform:

[1] Abort: Error creating QP at line 820 in file viainit.c
mpirun: executable version 1 does not match our version 3,

This seems to be a memory allocation issue, which could be easily explained (and overcome) if the job were launched with regular user permissions, but in my case it's root who launches it.

Has anybody tested OFED's OSU MPI on the PPC-64 platform recently, and can you comment on this?

Thanks,

Boris Shpolyansky Application Engineer Mellanox Technologies Inc. 2900 Stender Way Santa Clara, CA 95054 Tel.: (408) 916 0014 Fax: (408) 970 3403 Cell: (408) 834 9365 www.mellanox.com
Re: [openib-general] IB MTU tunable for uDAPL and/or Intel MPI?
This didn't help. osu_bibw.c still reports max bidirectional bandwidth in the 1600s; it should be in the 1900s. I looked back at my notes, and OFED 1.0 rc4 had the desired max bidirectional bandwidth. Did the uDAPL IB MTU change?

$ mpiexec -genv I_MPI_DAPL_PROVIDER OpenIB-scm -genv I_MPI_DEBUG 3 -genv I_MPI_DEVICE rdssm -genv LD_LIBRARY_PATH .../lib -n 2 ../osu_bibw.x
I_MPI: [0] set_up_devices(): will use device: libmpi.rdssm.so
I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma
I_MPI: [0] set_up_devices(): will use device: libmpi.rdssm.so
I_MPI: [0] set_up_devices(): will use DAPL provider: OpenIB-cma
# OSU MPI Bidirectional Bandwidth Test (Version 2.1)
# Size      Bi-Bandwidth (MB/s)
1           0.813478
2           1.637650
4           3.260333
8           6.627831
16          12.168080
32          25.683379
64          50.580351
128         95.035855
256         174.132061
512         310.656179
1024        513.066433
2048        726.685587
4096        877.233753
8192        973.311995
16384       1040.096136
32768       849.790165
65536       1088.723063
131072      1296.584344
262144      1428.176271
524288      1540.248671
1048576     1579.665660
2097152     1608.765475
4194304     1628.157462

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: Arlin Davis [mailto:[EMAIL PROTECTED] Sent: Friday, June 09, 2006 11:38 AM To: Scott Weitzenkamp (sweitzen) Cc: Tziporet Koren; [EMAIL PROTECTED]; Davis, Arlin R; Lentini, James; openib-general Subject: Re: [openib-general] IB MTU tunable for uDAPL and/or Intel MPI?

Scott Weitzenkamp (sweitzen) wrote:

While we're talking about MTUs, is the IB MTU tunable in uDAPL and/or Intel MPI via env var or config file? Looks like Intel MPI 2.0.1 uses 2K for IB MTU like MVAPICH does in OFED 1.0 rc4 and rc6, I'd like to try 1K with Intel MPI.

Scott

There is no mechanism for me to modify the MTU using rdma_cm, so whatever is returned in the path record is what you get with the OpenIB-cma provider. However, you could use the OpenIB-scm provider, which is hard coded for 1K MTU, as a comparison. Can you run with -genv I_MPI_DAPL_PROVIDER OpenIB-scm on your cluster?

-arlin

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen) Sent: Thursday, June 08, 2006 4:38 PM To: Tziporet Koren; [EMAIL PROTECTED] Cc: openib-general Subject: RE: [openib-general] OFED-1.0-rc6 is available

The MTU change undoes the changes for bug 81, so I have reopened bug 81 (http://openib.org/bugzilla/show_bug.cgi?id=81). With rc6, PCI-X osu_bw and osu_bibw performance is bad, and PCI-E osu_bibw performance is bad. I've enclosed some performance data; look at rc4 vs rc5 vs rc6 for Cougar/Cheetah/LionMini. Are there other benchmarks driving the changes in rc6 (and rc4)?

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

OSU MPI:
* Added mpi_alltoall fine tuning parameters
* Added default configuration/documentation file $MPIHOME/etc/mvapich.conf
* Added shell configuration files $MPIHOME/etc/mvapich.csh , $MPIHOME/etc/mvapich.csh
* Default MTU was changed back to 2K for InfiniHost III Ex and InfiniHost III Lx HCAs. For InfiniHost card recommended value is: VIADEV_DEFAULT_MTU=MTU1024
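Comparing runs such as rc4 vs rc6 by eye is error-prone, so the peak figure can be pulled out of the benchmark output mechanically. The following Python sketch assumes the two-column "size bandwidth" layout printed by osu_bibw in this thread; the function names are hypothetical.

```python
def parse_osu_bandwidth(output):
    """Return a list of (message_size_bytes, bandwidth_mb_per_s) rows."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly two columns; '#' lines are headers.
        if len(fields) != 2 or line.lstrip().startswith("#"):
            continue
        try:
            rows.append((int(fields[0]), float(fields[1])))
        except ValueError:
            continue  # skip any other two-word line that is not numeric
    return rows


def peak_bandwidth(rows):
    """Return the (size, bandwidth) row with the highest bandwidth."""
    return max(rows, key=lambda row: row[1])
```

Feeding it the captured stdout of two runs and comparing the two peak rows gives a quick regression check; log lines such as the I_MPI debug output are ignored because they do not have exactly two numeric fields.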
Re: [openib-general] [openfabrics-ewg] OFED 1.0-rc6 tarball available with working ipath driver
I agree, having an rc7 then ~three days to test it for regressions seems appropriate. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Woodruff, Robert J Sent: Monday, June 12, 2006 4:47 PM To: Hefty, Sean; Betsy Zeller; Tziporet Koren Cc: OpenFabricsEWG; openib-general Subject: Re: [openib-general] [openfabrics-ewg] OFED 1.0-rc6 tarball available with working ipath driver Sean wrote, Tziporet - Bryan has confirmed that with the patches you've copied, things should work correctly. We've been testing with our version, but I really want to test on the OFED-1.0 version that you've built. Can you send us a pointer to it? How can you go from an RC6 that doesn't build to a 1.0 release? Shouldn't you at least get a release candidate that builds first? - Sean I agree, don't see how we can go from something that has never been tested by the wider community to released. Has anyone run uDAPL tests or Intel MPI with Pathscale ? woody ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] IB MTU tunable for uDAPL and/or Intel MPI?
While we're talking about MTUs, is the IB MTU tunable in uDAPL and/or Intel MPI via env var or config file? Looks like Intel MPI 2.0.1 uses 2K for IB MTU like MVAPICH does in OFED 1.0 rc4 and rc6, I'd like to try 1K with Intel MPI.

Scott

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen) Sent: Thursday, June 08, 2006 4:38 PM To: Tziporet Koren; [EMAIL PROTECTED] Cc: openib-general Subject: RE: [openib-general] OFED-1.0-rc6 is available

The MTU change undoes the changes for bug 81, so I have reopened bug 81 (http://openib.org/bugzilla/show_bug.cgi?id=81). With rc6, PCI-X osu_bw and osu_bibw performance is bad, and PCI-E osu_bibw performance is bad. I've enclosed some performance data; look at rc4 vs rc5 vs rc6 for Cougar/Cheetah/LionMini. Are there other benchmarks driving the changes in rc6 (and rc4)?

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

OSU MPI:
* Added mpi_alltoall fine tuning parameters
* Added default configuration/documentation file $MPIHOME/etc/mvapich.conf
* Added shell configuration files $MPIHOME/etc/mvapich.csh , $MPIHOME/etc/mvapich.csh
* Default MTU was changed back to 2K for InfiniHost III Ex and InfiniHost III Lx HCAs. For InfiniHost card recommended value is: VIADEV_DEFAULT_MTU=MTU1024
RE: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
This is working for us on RHEL4 U3, thanks!

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

-Original Message- From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED] Sent: Thursday, May 25, 2006 2:49 AM To: Scott Weitzenkamp (sweitzen) Cc: Paul; openib-general@openib.org Subject: Re: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o

In OFED-1.0-rc5 all binaries and libraries will be compiled on ppc64 with the -m64 flag. This requires the sysfsutils and sysfsutils-devel 64-bit RPMs to be installed (in order to build libibverbs). Also, 64-bit pciutils and pciutils-devel are required for the tvflash package. libsdp will be built as both 32-bit and 64-bit libraries.

Note: in order to build the sysfsutils 64-bit RPM run:

CC="gcc -m64" rpmbuild --rebuild sysfsutils-1.3.0-1.2.1.src.rpm

(This was tested on Fedora C4 PPC64)

Regards, Vladimir

Scott Weitzenkamp (sweitzen) wrote:

I know Vlad made some changes for rc5 in this area, at least for libsdp, not sure if other libs got changed as well.

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

From: Paul [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 24, 2006 11:00 AM To: Scott Weitzenkamp (sweitzen) Cc: openib-general@openib.org Subject: Re: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o

Scott,

Upon further inspection, the build.sh and install.sh scripts built 32-bit libraries and binaries. If I export CFLAGS (and the like) to include -m64, then the build dies while looking for a 64-bit libsysfs. rhel4 u3 does not include a ppc64 sysfsutils, nor have I been able to find an actual 64-bit version of it. Is there a workaround for getting things to build actual ppc64 binaries/libraries?

The actual error is:

checking for dlsym in -ldl... yes
checking for pthread_mutex_init in -lpthread... yes
checking for sysfs_open_class in -lsysfs... no
configure: error: sysfs_open_class() not found. libibverbs requires libsysfs.
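A configure failure like the one in this thread often comes down to whether the installed libsysfs is a 32-bit or 64-bit object. As a rough sketch (not part of the OFED build scripts), the ELF class byte can be read directly in Python; the function name and the library path in the usage note are hypothetical.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    # Bytes 0-3 are the ELF magic; byte 4 (EI_CLASS) is 1 for 32-bit, 2 for 64-bit.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: 32, 2: 64}.get(header[4])
```

For example, calling elf_class on the libsysfs shared object found under the system library directory (the exact path depends on the distribution) would show whether a 64-bit build is present before re-running configure with -m64.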
RE: [openib-general] OFED-1.0-rc6 is available
The MTU change undoes the changes for bug 81, so I have reopened bug 81 (http://openib.org/bugzilla/show_bug.cgi?id=81). With rc6, PCI-X osu_bw and osu_bibw performance is bad, and PCI-E osu_bibw performance is bad. I've enclosed some performance data; look at rc4 vs rc5 vs rc6 for Cougar/Cheetah/LionMini. Are there other benchmarks driving the changes in rc6 (and rc4)?

Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems

OSU MPI:
* Added mpi_alltoall fine tuning parameters
* Added default configuration/documentation file $MPIHOME/etc/mvapich.conf
* Added shell configuration files $MPIHOME/etc/mvapich.csh , $MPIHOME/etc/mvapich.csh
* Default MTU was changed back to 2K for InfiniHost III Ex and InfiniHost III Lx HCAs. For InfiniHost card recommended value is: VIADEV_DEFAULT_MTU=MTU1024

Attachment: mpi_perf.xls
RE: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
Yes, the modules were loaded. Each of the 32 hosts had 3 IB ports up. Does Intel MPI or uDAPL use multiple ports and/or multiple HCAs? I shut down all but one port on each host, and now Pallas is running better on the 32 nodes using Intel MPI 2.0.1. HP MPI 2.2 started working too with Pallas too over uDAPL, so maybe this is a uDAPL issue? I need to repeat the tests to make sure this isn't a fluke. Thanks for your help so far. Scott -Original Message- From: Davis, Arlin R [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 07, 2006 12:11 PM To: Scott Weitzenkamp (sweitzen); Arlin Davis Cc: Lentini, James; openib-general Subject: RE: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS Scott, Can you take a look and see if rdma_cm and rdma_ucm modules are being loaded? I noticed on my latest OFED RC5 install that I had to start them manually. -arlin -Original Message- From: Scott Weitzenkamp (sweitzen) [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 06, 2006 5:08 PM To: Arlin Davis; Scott Weitzenkamp (sweitzen) Cc: Davis, Arlin R; Lentini, James; openib-general Subject: RE: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS this looks like a configuration issue and not the timeout. The CR timeouts occured with the rdma device and not the rdssm. Is IPoIB running on the ib0 interfaces across the fabric? Yes, IPoIB is running. Scott ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
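The module check suggested in this thread can be automated across a cluster. The sketch below is a Python illustration only, assuming the text comes from /proc/modules (or lsmod without its header line), where the first column is the module name; the function names are hypothetical.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(proc_modules_text, required=("rdma_cm", "rdma_ucm")):
    """Return the required modules that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]
```

Running missing_modules(open("/proc/modules").read()) on each host before starting the MPI job would flag nodes where rdma_cm or rdma_ucm still has to be loaded by hand.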
RE: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
I have not touched /etc/dat.conf, so I am using whatever comes with OFED 1.0 rc5. For whatever reason, things have improved some. I am now running Intel MPI right after bringing up hosts (previously I was trying MVAPICH, then Open MPI, then HP MPI, then Intel MPI). I've run twice, and see these failures: Run #1 (after rebooting all hosts): rank 13 in job 1 192.168.1.1_34674 caused collective abort of all ranks^M exit status of rank 13: killed by signal 11 ^M [EMAIL PROTECTED]:/data/home/scott/builds/TopspinOS-2.7.0/build013 /protes\ t/Lk3/060706_123945/[EMAIL PROTECTED] intel.intel]$ ### TEST-W: Could not run /data/home/scott/builds/TopspinOS-2.7.0/build013/prot\ est/Lk3/060706_123945/intel.intel/1149709233/IMB_2.3/src/IMB-MPI1 Allreduce : 0\ Run #2 (after rebooting all hosts): rank 6 in job 1 192.168.1.1_33649 caused collective abort of all ranks^M exit status of rank 6: killed by signal 11 ^M [EMAIL PROTECTED]:/data/home/scott/builds/TopspinOS-2.7.0/build013 /protes\ t/Lk3/060706_145739/[EMAIL PROTECTED] intel.intel]$ ### TEST-W: Could not run /data/home/scott/builds/TopspinOS-2.7.0/build013/prot\ est/Lk3/060706_145739/intel.intel/1149717497/IMB_2.3/src/IMB-MPI1 Exchange : 0 rank 21 in job 1 192.168.1.1_34734 caused collective abort of all ranks^M exit status of rank 21: killed by signal 11 ^M [EMAIL PROTECTED]:/data/home/scott/builds/TopspinOS-2.7.0/build013 /protes\ t/Lk3/060706_145739/[EMAIL PROTECTED] intel.intel]$ ### TEST-W: Could not run /data/home/scott/builds/TopspinOS-2.7.0/build013/prot\ est/Lk3/060706_145739/intel.intel/1149717497/IMB_2.3/src/IMB-MPI1 Allgatherrv -\ multi 1: 0 Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: Arlin Davis [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 07, 2006 3:25 PM To: Scott Weitzenkamp (sweitzen) Cc: Davis, Arlin R; Lentini, James; openib-general Subject: Re: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS 
Scott Weitzenkamp (sweitzen) wrote:

Yes, the modules were loaded. Each of the 32 hosts had 3 IB ports up. Does Intel MPI or uDAPL use multiple ports and/or multiple HCAs? I shut down all but one port on each host, and now Pallas is running better on the 32 nodes using Intel MPI 2.0.1. HP MPI 2.2 started working too with Pallas too over uDAPL, so maybe this is a uDAPL issue?

Can you tell me what adapters are installed (ibstat), how they are configured (ifconfig), and what your dat.conf looks like? It sounds like a device mapping issue during the dat_ia_open() processing. Multiple ports and HCAs should work fine, but some care is required in the configuration of dat.conf so that you consistently pick up the correct device across the cluster.

Intel MPI will simply open a device based on the provider/device name (example: setenv I_MPI_DAPL_PROVIDER=OpenIB-cma) defined in the dat.conf and query dapl for the address to be used for connections. This line in the dat.conf will determine which library to load and which IB device to open and bind to. If you have the exact same configuration on each node and know that ib0, ib1, ib2, etc. will always come up in the same order, then you can simply use the same netdev names across the cluster and use the same exact copy of dat.conf on each node.

Here are the dat.conf options for OpenIB-cma configurations.

# For cma version you specify ia_params as:
# network address, network hostname, or netdev name and 0 for port
#
# Simple (OpenIB-cma) default with netdev name provided first on list
# to enable use of same dat.conf version on all nodes
#
OpenIB-cma u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 ib0 0
OpenIB-cma-ip u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 192.168.0.22 0
OpenIB-cma-name u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 svr1-ib0 0
OpenIB-cma-netdev u1.2 nonthreadsafe default /usr/lib/libdaplcma.so mv_dapl.1.2 ib0 0

Which type are you using: address, hostname, or netdev names?

Also, Intel MPI is sometimes too smart for its own good when opening rdma devices via uDAPL. If the open fails with the first rdma device specified in the dat.conf, it will continue on to the next line until one is successful. If all rdma devices fail, it will then go on to the static device automatically. This sometimes does more harm than good, since one node could be failing over to the second device in your configuration while the other nodes are all on the first device. If they are all on the same subnet then it would work fine, but if they are on different subnets then we would not be able to connect.

If you send me your configuration, we can set it up here and hopefully duplicate your error case.

-arlin
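Since the advice above hinges on every node reading the same first dat.conf entry, a small parser can help compare files across hosts. This Python sketch assumes the usual eight-field dat.conf layout in which ia_params and platform_params are double-quoted strings (for example "ib0 0" followed by ""); the quotes do not appear in the archived text, so that layout and the field names below are assumptions, and the function names are hypothetical.

```python
import shlex

# Assumed field order of a dat.conf entry; names are illustrative only.
FIELDS = (
    "ia_name", "api_version", "threadsafety", "default",
    "lib_path", "provider_version", "ia_params", "platform_params",
)


def parse_dat_conf(text):
    """Return one dict per non-comment dat.conf line that has all eight fields."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens = shlex.split(line)  # keeps quoted strings such as "ib0 0" together
        if len(tokens) != len(FIELDS):
            continue  # ignore lines that do not match the assumed layout
        entries.append(dict(zip(FIELDS, tokens)))
    return entries


def first_provider(text):
    """Return the first entry, which is the one tried first, or None."""
    entries = parse_dat_conf(text)
    return entries[0] if entries else None
```

Collecting first_provider() from each node's dat.conf and checking that ia_name and ia_params agree everywhere is one way to catch the mixed-device situation described above.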
RE: [openib-general] Re: [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
Tziporet is the gatekeeper (does that make me the keymaster? :-). Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of James Lentini Sent: Tuesday, June 06, 2006 2:51 PM To: Arlin Davis Cc: 'openib-general' Subject: [openib-general] Re: [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS On Mon, 5 Jun 2006, Arlin Davis wrote: Here is a patch to the openib-cma provider that uses the new set_option feature of the uCMA to adjust connect request timeout and retry values. The defaults are a little quick for some consumers. They are now bumped up from 3 retries to 15 and are tunable with uDAPL environment variables. Also, included a fix to disallow any event after a disconnect event. Committed in revision 7755. I would like to get this in OFED RC6 if possible. Who is the gatekeeper for OFED? One of us should bring this to their attention, but I'm not sure who to contact. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
RE: [openib-general] [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS
this looks like a configuration issue and not the timeout. The CR timeouts occurred with the rdma device and not the rdssm. Is IPoIB running on the ib0 interfaces across the fabric?
Yes, IPoIB is running.
Scott
RE: [openib-general] Removing mpi subtree from ofed branch
I would like to remove the userspace mpi subtree from the ofed branch (https://openib.org/svn/gen2/branches/1.0/src/userspace). MPI is supplied in ofed as a separate package, which is not taken from the ofed branch. The presence of the mpi directory in the ofed branch is therefore misleading.
So why don't we put the OFED MVAPICH MPI source in the branch then? It is also kinda confusing that the OFED MVAPICH is a tarball and not in subversion, given that it is based off the code that is in subversion.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
RE: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
I know Vlad made some changes for rc5 in this area, at least for libsdp, not sure if other libs got changed as well.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
From: Paul [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 24, 2006 11:00 AM
To: Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: Re: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
Scott, Upon further inspection the build.sh and install.sh scripts built 32bit libraries and binaries. If I export CFLAGS (and the like) to include -m64 then the build dies while looking for a 64bit libsysfs. rhel4 u3 does not include a ppc64 sysfsutils, nor have I been able to find an actual 64bit version of it. Is there a workaround for getting things to build actual ppc64 binaries/libraries?
The actual error is:
checking for dlsym in -ldl... yes
checking for pthread_mutex_init in -lpthread... yes
checking for sysfs_open_class in -lsysfs... no
configure: error: sysfs_open_class() not found. libibverbs requires libsysfs.
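One way to confirm the 32-bit versus 64-bit mismatch described in the message above is to read the ELF class byte of the installed library. The sketch below is only an illustration and is not taken from the thread: `elf_class` is a made-up helper name, and the library path in the usage comment is an assumption about where libsysfs might live.

```shell
# Hypothetical helper: print 32 or 64 depending on the ELF class of a file.
# Byte 5 of the ELF header (EI_CLASS) is 01 for 32-bit and 02 for 64-bit objects.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    # Skip 4 bytes, read 1: the class byte.
    case $(od -An -tx1 -j4 -N1 "$1" | tr -d ' \n') in
        01) echo 32 ;;
        02) echo 64 ;;
        *)  echo unknown; return 1 ;;
    esac
}

# Usage (commented out; the path is an assumption):
#   elf_class /usr/lib/libsysfs.so
```

If this prints 32 for every libsysfs found on the system, a configure run with -m64 would be expected to fail the sysfs_open_class link test in the way shown. The `file` utility reports the same information where it is installed.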
RE: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
OFED 1.0 rc4 does compile and run on RHEL4 U3 ppc64.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Lundin
Sent: Tuesday, May 23, 2006 12:34 PM
To: openib-general@openib.org
Subject: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
Hi All, I just started working with openIB in the past week. I am having an issue getting the kernel modules to compile with the stock rhel4 u3 kernel. I have applied the patches found at https://openib.org/svn/gen2/branches/backport/2.6.9_U3/ and followed the instructions from https://openib.org/tiki/tiki-index.php?page=Installation+Cheat+Sheet but I have been getting the following error:
LD /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/built-in.o
LD /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/built-in.o
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/index.o
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/addr.o
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/cm.o
/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/cm.c: In function `ib_cm_cleanup':
/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/cm.c:3367: warning: implicit declaration of function `idr_destroy'
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/packer.o
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/ud_header.o
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/verbs.o
CC [M] /usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/sysfs.o
/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/sysfs.c:693: error: unknown field `uevent' specified in initializer
/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/sysfs.c:693: warning: initialization from incompatible pointer type
make[2]: *** [/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core/sysfs.o] Error 1
make[1]: *** [/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband/core] Error 2
make: *** [_module_/usr/src/kernels/2.6.9-34.EL-ppc64/drivers/infiniband] Error 2
make: Leaving directory `/usr/src/kernels/2.6.9-34.EL-ppc64'
Any help would be appreciated. As noted this is on a ppc64 machine. The rhel4 u3 install does *NOT* configure openIB by default like it does on intel architectures. I was wondering if openIB has been tested at all on ppc64 and if this was even possible at this point.
Regards.
Paul
RE: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
No clue, I know if you grab OFED 1.0 rc4 tarball and run install.sh, it should work.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
From: Paul [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 23, 2006 12:42 PM
To: Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: Re: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
Scott, Thanks for the confirmation and the quick reply. Any ideas as to what might be causing the error in question?
Regards.
RE: [openib-general] OFED-1.0-rc4 need db-devel
db-devel package is required to build open_iscsi package RPM. This package is not relevant for RHEL 4.3. There are two options to install OFED-1.0-rc4 on RHEL 4.3 without open_iscsi:
1. Select Custom installation and don't choose to install open_iscsi.
2. Edit ofed.conf (created automatically under OFED-1.0-rc4 directory when you run install.sh or build.sh) and set *open_iscsi=n*. Then run: ./install.sh -c ofed.conf
Why don't we ignore these packages on RHEL4 U3, just like we ignore uDAPL on ppc64?
Scott
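The second option in the message above (edit ofed.conf, then rerun the installer with -c) can be scripted. This is a rough sketch under stated assumptions: it assumes ofed.conf holds one name=value entry per line, which the message suggests with open_iscsi=n but does not spell out, and `ofed_disable_pkg` is a made-up helper name, not part of OFED.

```shell
# Hypothetical helper: force "<pkg>=n" in an ofed.conf-style file.
# Assumes one name=value entry per line (an assumption about the file format).
ofed_disable_pkg() {
    conf=$1
    pkg=$2
    if grep -q "^${pkg}=" "$conf"; then
        # Entry exists: rewrite it through a temp file, leaving other lines alone.
        sed "s/^${pkg}=.*/${pkg}=n/" "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
    else
        # No entry yet: append one.
        printf '%s=n\n' "$pkg" >> "$conf"
    fi
}

# Usage (commented out; needs an unpacked OFED-1.0-rc4 tree):
#   ofed_disable_pkg ofed.conf open_iscsi
#   ./install.sh -c ofed.conf
```

Keeping the edit in a function makes it easy to apply the same change for other packages that do not build on a given distribution.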
RE: [openib-general] ib_mthca fails to load with old firmware
Ken> I'm running into a problem when I try to use the OFED RC4 release on some blade systems that have TopSpin HCA daughter cards installed (actually Mellanox). I'm trying to figure out how to update the firmware to the latest [ http://mellanox.com/support/firmware_table.php ] but it seems I must know the PSID so I can grab the right firmware image. Can anyone point me in the right direction here?
For blade HCAs you should contact the HCA vendor for firmware updates. You could try passing the module option fw_cmd_doorbell=0 to ib_mthca. That may work around things.
- R.
What kind of blade systems are these? For some blade systems, Cisco provides HCA firmware that has been configured to provide better signal integrity. If you run /usr/local/ofed/sbin/tvflash -i, I can then tell which firmware you need.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
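The fw_cmd_doorbell=0 workaround mentioned above can be passed once on the modprobe command line or made persistent through a modprobe configuration file. A minimal sketch, with assumptions labeled: the helper name `set_module_option` is made up, and the file path /etc/modprobe.conf in the usage comment is an assumption that varies by distribution.

```shell
# Hypothetical helper: add "options <module> <param>=<value>" to a modprobe
# config file unless an identical line is already present.
set_module_option() {
    file=$1
    module=$2
    opt=$3
    line="options $module $opt"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# One-off load with the option (commented out; needs root and the HCA):
#   modprobe ib_mthca fw_cmd_doorbell=0
# Persistent setting (the file path is an assumption):
#   set_module_option /etc/modprobe.conf ib_mthca fw_cmd_doorbell=0
```

The helper is idempotent, so running it again from an install script does not pile up duplicate options lines.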
[openib-general] RE: OFED 1.0 rc4 won't compile on orig FC5 kernel
After running yum update, I was able to compile OFED 1.0 rc4 on 2.6.16-1.2111_FC5 kernel.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
-Original Message-
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 11, 2006 9:37 AM
To: Scott Weitzenkamp (sweitzen)
Cc: [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: OFED 1.0 rc4 won't compile on orig FC5 kernel
Quoting r. Scott Weitzenkamp (sweitzen) [EMAIL PROTECTED]:
Subject: OFED 1.0 rc4 won't compile on orig FC5 kernel
Is this a useful kernel to try, or should get latest FC5 kernel or 2.6.16 from kernel.org?
I think you should go to latest update.
-- MST
[openib-general] RE: [openfabrics-ewg] RE: OFED 1.0 rc4 won't compile on orig FC5 kernel
Actually, I spoke too soon. Kernel components compiled, but MVAPICH did not:
Compiling MVAPICH ... 2
mpirun_rsh.c: In function 'read_hostfile':
mpirun_rsh.c:1197: warning: incompatible implicit declaration of built-in function 'strndup'
mpirun_rsh.c:1205: warning: incompatible implicit declaration of built-in function 'strndup'
mpirun_rsh.c:1220: warning: incompatible implicit declaration of built-in function 'strndup'
mpirun_rsh.c:1220: error: too few arguments to function 'strndup'
make[3]: *** [mpirun_rsh] Error 1
Exit status from make was 2
make[2]: *** [mpilib] Error 1
make[1]: *** [mpi-modules] Error 2
make: *** [mpi] Error 2
Error in compiling MVAPICH. Check the log file: make.mvapich.log
Exiting
Mvapich installation failed
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Tuesday, May 16, 2006 10:48 AM
To: Michael S. Tsirkin
Cc: [EMAIL PROTECTED]; openib-general@openib.org
Subject: [openfabrics-ewg] RE: OFED 1.0 rc4 won't compile on orig FC5 kernel
After running yum update, I was able to compile OFED 1.0 rc4 on 2.6.16-1.2111_FC5 kernel.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
-Original Message-
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 11, 2006 9:37 AM
To: Scott Weitzenkamp (sweitzen)
Cc: [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: OFED 1.0 rc4 won't compile on orig FC5 kernel
Quoting r. Scott Weitzenkamp (sweitzen) [EMAIL PROTECTED]:
Subject: OFED 1.0 rc4 won't compile on orig FC5 kernel
Is this a useful kernel to try, or should get latest FC5 kernel or 2.6.16 from kernel.org?
I think you should go to latest update.
-- MST
[openib-general] OFED 1.0 rc4 won't compile on orig FC5 kernel
Is this a useful kernel to try, or should get latest FC5 kernel or 2.6.16 from kernel.org?
gcc -Wp,-MD,/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/.sysfs.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.1.0/include -D__KERNEL__ -I/var/tmp/OFED/tmp/openib/openib/include -I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include -Iinclude -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -g -march=k8 -mtune=nocona -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wdeclaration-after-statement -Wno-pointer-sign -I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include -I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/ulp/ipoib -I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/ulp/kdapl -I/var/tmp/OFED/tmp/openib/openib/drivers/infiniband/debug -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(sysfs)" -D"KBUILD_MODNAME=KBUILD_STR(ib_core)" -c -o /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/.tmp_sysfs.o /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c
In file included from include/asm/pci.h:9,
from include/linux/pci.h:648,
from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/rdma/ib_mad.h:42,
from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c:42:
include/linux/mm.h: In function 'kernel_map_pages':
include/linux/mm.h:1055: warning: implicit declaration of function 'mutex_debug_check_no_locks_freed'
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c: In function 'ib_device_uevent':
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c:443: warning: implicit declaration of function 'add_hotplug_env_var'
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c: At toplevel:
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c:674: error: unknown field 'hotplug' specified in initializer
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.c:674: warning: initialization from incompatible pointer type
make[3]: *** [/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/sysfs.o] Error 1
make[2]: *** [/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core] Error 2
make[1]: *** [_module_/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.15-1.2054_FC5-x86_64'
make: *** [kernel] Error 2
ERROR: Failed to execute: make kernel
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
[openib-general] bugs I see as must-fix for OFED 1.0 rc5
* SDP overhaul
* 49 OFED 1.0: MVAPICH won't compile on ppc64
* 51 OFED 1.0 rc4: SRP not available for RHEL4 U3
* 57 OFED 1.0 rc4: rdma_cm does not work for uDAPL
* 59 OFED 1.0: Open MPI not configured correctly to find shlibs
* 61 OFED 1.0 rc4: RDS does not load on RHEL4 U3
* 62 OFED 1.0 rc4: too many SRP patches, get this code checked in
* 64 OFED 1.0 rc4: Open MPI fails when host has more than one ...
* 74 OFED 1.0 rc4: Open MPI Pallas test hangs
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
RE: [openfabrics-ewg] RE: [openib-general] OFED-1.0-rc4 is available
5) Open MPI is not working well for Pallas benchmarks, more details to follow
[Scott Weitzenkamp (sweitzen)] First bug filed, Open MPI does not work when more than one IB port is active on a host (http://openib.org/bugzilla/show_bug.cgi?id=64).
Scott
[openib-general] IPoIB not working on ppc64 RHEL4 U3 w/OFED 1.0 rc4?
I can't seem to get IPoIB to work (didn't try earlier rc) on this combo, does anyone have it working?
[EMAIL PROTECTED] ~]# lsmod | grep ib_ipoib
ib_ipoib 70296 0
ib_sa 29536 1 ib_ipoib
ib_core 81720 3 ib_ipoib,ib_sa,ib_mad
[EMAIL PROTECTED] ~]# uname -a
Linux svbu-qa-js20-1 2.6.9-34.EL #1 SMP Fri Feb 24 16:46:57 EST 2006 ppc64 ppc64 ppc64 GNU/Linux
[EMAIL PROTECTED] ~]# ifconfig ib0
ib0: error fetching interface information: Device not found
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
RE: [openib-general] IPoIB not working on ppc64 RHEL4 U3 w/OFED 1.0 rc4?
Never mind, user error on my part.
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Monday, May 08, 2006 10:54 AM
To: [EMAIL PROTECTED]
Cc: openib-general@openib.org
Subject: [openib-general] IPoIB not working on ppc64 RHEL4 U3 w/OFED 1.0 rc4?
I can't seem to get IPoIB to work (didn't try earlier rc) on this combo, does anyone have it working?
[EMAIL PROTECTED] ~]# lsmod | grep ib_ipoib
ib_ipoib 70296 0
ib_sa 29536 1 ib_ipoib
ib_core 81720 3 ib_ipoib,ib_sa,ib_mad
[EMAIL PROTECTED] ~]# uname -a
Linux svbu-qa-js20-1 2.6.9-34.EL #1 SMP Fri Feb 24 16:46:57 EST 2006 ppc64 ppc64 ppc64 GNU/Linux
[EMAIL PROTECTED] ~]# ifconfig ib0
ib0: error fetching interface information: Device not found
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
RE: [openib-general] OFED-1.0-rc4 is available
Open MPI compiles on PPC64, at least on RHEL4 it does.
3. MPI OSU and Open MPI compilation fails on PPC64
RE: [openib-general] sdp test tools
We use netperf with libsdp.so, for example:
$ LD_PRELOAD=libsdp.so netperf ...
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of zhu shi song
Sent: Monday, May 08, 2006 7:52 PM
To: openib-general@openib.org
Subject: [openib-general] sdp test tools
I hope to test sdp connection capacity and its performance. Who has the test tools? Or which tool is more suitable?
tks
zhu shi song
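The LD_PRELOAD approach from the reply above works for any dynamically linked sockets program, not only netperf. The wrapper below is a small sketch of that idea; `sdp_run`, `sdp_preload_value`, and the `SDP_LIB` override variable are names invented for this example, and the library is assumed to be findable by the dynamic linker under the name libsdp.so as in the message.

```shell
# Build the LD_PRELOAD value: the SDP library first, then anything already set.
# SDP_LIB is a made-up override for systems where the library lives elsewhere.
sdp_preload_value() {
    printf '%s\n' "${SDP_LIB:-libsdp.so}${LD_PRELOAD:+:$LD_PRELOAD}"
}

# Run one command with the SDP library preloaded; the assignment applies only
# to that command, so the calling shell's environment is left alone.
sdp_run() {
    LD_PRELOAD=$(sdp_preload_value) "$@"
}

# Usage (commented out; needs libsdp.so and netperf, arguments elided as in the message):
#   sdp_run netperf ...
```

Scoping the preload to a single command avoids accidentally routing every later program in the session through libsdp.so.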
RE: [openib-general] Re: Need OpenIB bugzilla component for RDS
I'm not seeing an RDS component yet...
Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems
-Original Message-
From: Bryan O'Sullivan [mailto:[EMAIL PROTECTED]
Sent: Monday, May 08, 2006 9:51 PM
To: Ranjit Pandit
Cc: Scott Weitzenkamp (sweitzen); openib-general
Subject: Re: [openib-general] Re: Need OpenIB bugzilla component for RDS
On Sun, 2006-05-07 at 10:07 -0700, Ranjit Pandit wrote:
Please mark me as the default owner of RDS bugs.
OK, you're in.
b