Re: [OMPI devel] 1.6.2rc3 released

2012-09-12 Thread Paul Hargrove
Solaris and NetBSD platforms which displayed the VT Makefile error w/ 1.6.2rc2 have completed successful builds w/ 1.6.2rc3. The PGI-8.0 platform which showed the other VT problem is down at the moment. So, I've tested that fix on the older (and thus slower) PGI-7.2-5 platform which also

Re: [OMPI devel] 1.6.2rc3 released

2012-09-13 Thread Paul Hargrove
<matthias.jur...@tu-dresden.de> wrote: > On Wednesday 12 September 2012 17:16:48 Paul Hargrove wrote: >> Solaris and NetBSD platforms which displayed the VT Makefile error w/ >> 1.6.2rc2 have completed successful builds w/ 1.6.2rc3. >> >> The PGI-8.0 platform which sh

[OMPI devel] F90 build failures with XLF-14.1 and OMPI trunk

2012-09-13 Thread Paul Hargrove
I've just tried building the Open MPI Trunk on a PPC64/Linux node using the May 2012 XL compilers from IBM (XLC-12.1 and XLF-14.1) While I CAN build the 1.6.2 rc's, a build of the trunk is failing in the F90 bindings as shown at the end of this message. While MOST errors are "1513-230", note that

Re: [OMPI devel] F90 build failures with XLF-14.1 and OMPI trunk

2012-09-13 Thread Paul Hargrove
An earlier release of the XL compilers (XLC-11.1 and XLF-13.1) on a different host *also* displays the same errors. -Paul On Thu, Sep 13, 2012 at 2:33 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > I've just tried building the Open MPI Trunk on a PPC64/Linux node using > the

Re: [OMPI devel] making Fortran MPI_Status components public

2012-09-27 Thread Paul Hargrove
Unless I am missing something here the desired incantation is either "PUBLIC" to make an entire module's contents accessible, or "PUBLIC :: [component]" for individual control. PUBLIC should be a standard part of F95 (no configure probe required). However, the presence of "OMPI_PRIVATE" suggests

Re: [OMPI devel] making Fortran MPI_Status components public

2012-09-27 Thread Paul Hargrove
On Wed, Sep 26, 2012 at 10:56 PM, Jeff Squyres wrote: [...] > > However, the presence of "OMPI_PRIVATE" suggests you already have a > configure probe for the "PRIVATE" keyword. > > Yes, we do, because not all compilers support it (yet?). > Then I'd guess you'll need to

[OMPI devel] MPI-RMA on uGNI?

2012-10-19 Thread Paul Hargrove
I am trying to resolve an odd issue I am seeing with my own uGNI-based code, and was hoping to use OMPI's uGNI support as an example of correct usage. My particular interest is in RDMA, but as far as I can tell the uGNI btl in ompi-trunk doesn't have the btl_put or btl_get entry points. So, if I

Re: [OMPI devel] MPI-RMA on uGNI?

2012-10-22 Thread Paul Hargrove
Thanks, Pasha. I see them now. I asked precisely because I doubted that I have enough understanding of the code structure to find this code on my own (and I did look). Obviously I was right to doubt myself. -Paul On Mon, Oct 22, 2012 at 7:10 AM, Shamis, Pavel wrote: >

[OMPI devel] 1.7.0rc5 - make check failure on OpenBSD-5.1/{i386, amd64}

2012-10-31 Thread Paul Hargrove
/w. -Paul On Tue, Oct 30, 2012 at 8:13 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > On my FreeBSD-6.3/amd64 platform I see "make check" failing 3 tests under > test/datatype (see below). Of course "make" stops after that, making it > possible that additional tests

[OMPI devel] 1.7.0rc5 - build failure w/ gcc-3.4.6/x86-64 (regression)

2012-10-31 Thread Paul Hargrove
I have access to a Linux/x86-64 machine running "Red Hat Enterprise Linux AS release 4" It has a pretty old gcc: $ gcc --version | head -1 gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3) As shown below, this gcc is rejecting some portion of the atomics. I am certain I've tested ompi-1.5 and 1.6 on

[OMPI devel] 1.7.0rc5 - FORTRAN build failure w /pathcc-4.0.12.1

2012-10-31 Thread Paul Hargrove
I have a Linux/x86-64 system with PathScale's "ekopath-4.0.12.1" compilers. Building Fortran 2008 support fails as shown below. My records show the ompi-1.5 branch and a Feb 2012 trunk were OK on this configuration. -Paul PPFC mpi-f08-interfaces-callbacks.lo module

[OMPI devel] 1.7.0rc5 - FOTRAN build failure with Open64 compilers

2012-10-31 Thread Paul Hargrove
Linux/x86-64 host with Open64 compilers version 4.5.1 from AMD. Fortran 2008 support is failing to build as shown below. My records show the ompi-1.5 branch was fine on this configuration. -Paul PPFC mpi-f08-types.lo ^ openf95-855 openf90: ERROR MPI_F08_TYPES, File =

[OMPI devel] 1.7.0rc5 - FORTRAN build failure w/ XLF

2012-10-31 Thread Paul Hargrove
The problems I previously reported building the trunk with IBM's xlc/xlf: http://www.open-mpi.org/community/lists/devel/2012/09/11518.php are still present in OMPI-1.7.0rc5 -Paul On Tue, Oct 30, 2012 at 7:01 PM, Ralph Castain wrote: > Hi folks > > We have posted the next

Re: [OMPI devel] 1.7.0rc5 - FORTRAN build failure w /pathcc-4.0.12.1

2012-10-31 Thread Paul Hargrove
pathf95 from PathScale's 3.2.99 compiler suite fails in the same manner: LOGICAL(KIND=4) not allowed with BIND(C) -Paul On Tue, Oct 30, 2012 at 9:03 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > I have a Linux/x86-64 system with PathScale's "ekopath-4.0.12.1" comp

Re: [OMPI devel] Compile-time MPI_Datatype checking

2012-10-31 Thread Paul Hargrove
Note that with Apple's latest versions of Xcode (4.2 and higher, IIRC) Clang is now the default C compiler. I am told that Clang is the ONLY bundled compiler for OSX 10.8 (Mountain Lion) unless you take extra steps to install gcc (which is actually llvm-gcc and cross-compiles for OSX 10.7). So,

Re: [OMPI devel] About Marshalling and Umarshalling

2012-11-04 Thread Paul Hargrove
Santhosh, I think Ralph wants you to give your definitions for "Marshalling" and "Unmarshalling". Otherwise, it is not clear to him or others exactly what you are asking, because there are multiple possible meanings for those terms. -Paul On Sun, Nov 4, 2012 at 7:56 PM, Santhosh Kokala <

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27601 - trunk

2012-11-14 Thread Paul Hargrove
On Wed, Nov 14, 2012 at 6:26 PM, Larry Baker wrote: > m4 --version | sed -n -E -e > '1s/^.*[^A-Za-z0-9_-]?([0-9]+[.][0-9]+[.][0-9]+)[^A-Za-z0-9_-]?.*$/\1/p' > There are STILL problems with this approach as it is TWICE specific to GNU software: 1) M4 on OpenBSD (maybe others)

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27601 - trunk

2012-11-14 Thread Paul Hargrove
e broken parsing of Ralph's > "flex --version". Assuming the RE parser I wrote is satisfactory, it would > have to be adapted to fit in the framework, i.e., it has to be portable. > > Larry Baker > US Geological Survey > 650-329-5608 > ba...@usgs.gov > > >

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27580 - in trunk: ompi/mca/btl/openib ompi/mca/btl/wv ompi/mca/coll/ml opal/util/keyval orte/mca/rmaps/rank_file

2012-12-06 Thread Paul Hargrove
Nathan, What does this mean with respect to "clean up" and old flex? Have you simply conceded that old flex will leak memory? -Paul On Thu, Dec 6, 2012 at 4:16 PM, Hjelm, Nathan T wrote: > Done. We now clean up correctly in new flex while having support for old > flex. > >

Re: [OMPI devel] opempi with gcc4.8 on Mac OSX10.7

2012-12-14 Thread Paul Hargrove
Surendra, There is no "release" of gcc-4.8 yet, and my experience (not Open MPI related) with recent snapshots has encountered many bugs still to be fixed before any release. Even if you could build Open MPI with gcc-4.8, I would not (at this time) trust it with any "production" jobs. Looking

Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
Since I see OpenBSD twice on the list of changes, I've fired off my automated testing on my OpenBSD platforms. Since the "MPI datatype issues on OpenBSD" I reported against 1.7.0rc5 also appeared on FreeBSD-6.3, I've tested that platform as well. The good news is that the problems I've reported

Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
On Thu, Jan 17, 2013 at 2:26 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: [snip] > The BAD news is a new failure (SEGV in orted at exit) on > OpenBSD-5.2/amd64, which I will report in a separate email once I've > completed some triage. > [snip] You can disregard the "BAD

Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
> Have you tried llvm with clang? > > ** ** > > Ken > > ** ** > > *From:* devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] *On > Behalf Of *Paul Hargrove > *Sent:* Thursday, January 17, 2013 4:58 PM > *To:* Open MPI Developers > *Subject:* Re: [OMPI

Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
On Thu, Jan 17, 2013 at 4:37 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: [snip] > I just now ran tests on OpenBSD-5.2/i386 and OpenBSD-5.2/amd64, using > Clang-3.1. > Unfortunately, there is a mass of linker errors building libmpi_cxx.la (on > both systems) > I am trying aga

Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
On Thu, Jan 17, 2013 at 5:36 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > On Thu, Jan 17, 2013 at 4:37 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > [snip] > >> I just now ran tests on OpenBSD-5.2/i386 and OpenBSD-5.2/amd64, using >> Clang-3.1. >> U

[OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-18 Thread Paul Hargrove
My employer has a nice new Cray XC30 (aka Cascade), and I thought I'd give Open MPI a quick test. Given that it is INTENDED to be API-compatible with the XE series, I began configuring with CC=cc CXX=CC FC=ftn --with-platform=lanl/cray_xe6/optimized-nopanasas However, since this is Intel h/w,

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-22 Thread Paul Hargrove
27864 should fix this. > > George. > > > On Jan 18, 2013, at 06:21 , Paul Hargrove <phhargr...@lbl.gov> wrote: > > My employer has a nice new Cray XC30 (aka Cascade), and I thought I'd give > Open MPI a quick test. > > Given that it is INTENDED to be API-comp

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
any more with it now. > > Hopefully, this will be the minimum required. > > > On Jan 22, 2013, at 4:20 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > I am using the openmpi-1.9a1r27886 tarball and I still see an error for > one of the two duplicate symbols: > >

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
is a --disable-shared/--enable-static build which may differ from other systems where LUSTRE support gets used/tested. -Paul On Fri, Jan 25, 2013 at 12:01 PM, Ralph Castain <r...@open-mpi.org> wrote: > Thanks Paul > > I'm currently tracking down a problem on the Cray XE6 - it appe

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
g to create a critical bug to track this issue. > > Works in 1.7 :-/ ... If you add -lnuma to libs_static in > mpicc-wrapper-data.txt. > > -Nathan > HPC-3, LANL > > On Fri, Jan 25, 2013 at 02:13:41PM -0800, Paul Hargrove wrote: > > Still having problems on the Cray XC30

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
Adding --without-lustre to my configure args allowed me to compile and link ring_c. I am in the queue now and will report later on run results. -Paul On Fri, Jan 25, 2013 at 2:13 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Still having problems on the Cray XC30, but now they

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
> ;). mpicc is completely borked in the trunk. > > If you want to use the Cray wrappers with Open MPI I can give you a module > file that sets up the environment correctly (link against -lmpi not > -lmpich, etc). > > -Nathan > > On Fri, Jan 25, 2013 at 03:10:37PM -0800, Pa

Re: [OMPI devel] New ARM patch

2013-01-25 Thread Paul Hargrove
FYI: I currently have QEMU-based ARM platforms I use for testing other s/w: + a single-cpu ARMv5 system running Debian Squeeze + a dual-core ARMv7 system running Ubuntu Precise Since these are EMULATED platforms, they are a bit on the slow side, making periodic MTT runs untenable. However,

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
> So I'm not quite sure I understand the "mpicc is completely borked in the > trunk". Can you elaborate? > > On Jan 25, 2013, at 3:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Nathan, > > The 2nd and 3rd non-blank lines of my original post: > >

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
, at 4:29 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > LANL's platform file for the XE6 has enabled shared libs, which I must > disable on our XC30 which is not setup for them. > > > I'm not sure why it does as we build everything strictly static o

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
Ralph, Those are the result of the missing -lnuma that Nathan already identified earlier as missing in BOTH 1.7 and trunk. I see MORE missing symbols, which include ones from libxpmem and libugni. -Paul On Fri, Jan 25, 2013 at 4:59 PM, Ralph Castain wrote: > > On Jan 25,

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Paul Hargrove
the proper fix. -Paul On Fri, Jan 25, 2013 at 5:45 PM, Ralph Castain <r...@open-mpi.org> wrote: > > On Jan 25, 2013, at 5:12 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Ralph, > > Those are the result of the missing -lnuma that Nathan already identified >

[OMPI devel] One-line patch for warning in 1.7rc6

2013-01-25 Thread Paul Hargrove
While building 1.7rc6 on a i386 w/ InfiniBand I saw numerous instances of this warning: ../../../../../orte/mca/oob/ud/oob_ud.h:93: warning: cast from pointer to integer of different size The following 1-line change fixes this. Alternatively, a single cast to type uintptr_t is probably
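The warning quoted above is what gcc emits on a 32-bit target when a pointer is cast directly to a wider integer type. As a minimal sketch of the uintptr_t route mentioned in the message, and not the actual patch, the helpers below go through uintptr_t before widening. The names ptr_to_u64 and u64_to_ptr are hypothetical and do not come from oob_ud.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from oob_ud.h: widen a pointer to 64 bits.
 * The cast to uintptr_t yields an integer of the pointer's own size;
 * the second conversion is integer-to-integer, so there is no
 * pointer-to-integer cast of a different size left to warn about. */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Inverse conversion, meant only for values produced by ptr_to_u64(). */
static void *u64_to_ptr(uint64_t v)
{
    assert(v <= UINTPTR_MAX);   /* value must fit back into a pointer */
    return (void *)(uintptr_t)v;
}
```

On a 64-bit build both conversions are no-ops; on i386 the widening is explicit rather than implied by a pointer cast.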

[OMPI devel] Trunk broken on NERSC's Cray XE6

2013-01-25 Thread Paul Hargrove
.1.115.gem 30) eswrap/1.0.10 14) configuration/1.0-1.0401.35391.1.2.gem 31) xtpe-mc12 15) ccm/2.2.0-1.0401.37254.2.142 32) cray-shmem/5.6.0 16) audit/1.0.0-1.0401.37969.2.32.gem 33) PrgEnv-gnu/4.1.40 17) rca/1.0.0-2.0401.38656.2.2.gem -Paul On Fri, Jan 25, 2013 at 5:

Re: [OMPI devel] 1.6.4rc2 released

2013-01-28 Thread Paul Hargrove
I am pleased to say that 1.6.4rc2 builds and runs (single node, sm btl) on my BSD menagerie: freebsd6-amd64 freebsd7-amd64 freebsd8-amd64 freebsd8-i386 freebsd9-amd64 freebsd9-i386 netbsd6-amd64 netbsd6-i386 openbsd5-amd64 openbsd5-i386 The {Free,Net,Open}BSD

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-28 Thread Paul Hargrove
> LANL's Cray XE6, and mpicc worked just fine for me once built that way. > >> > >> So I'm not quite sure I understand the "mpicc is completely borked in > the trunk". Can you elaborate? > >> > >> On Jan 25, 2013, at 3:59 PM, Paul Hargrove <phh

[OMPI devel] 1.7rc6 build failure: bogus errmgr code

2013-01-28 Thread Paul Hargrove
When configured using --with-ft=cr on linux/x86 I see the following build failure: Making all in mca/errmgr make[2]: Entering directory `/home/pcp1/phargrov/OMPI/openmpi-1.7rc6-linux-x86-blcr/BLD/orte/mca/errmgr' CC base/errmgr_base_close.lo CC base/errmgr_base_select.lo CC

[OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Paul Hargrove
The following 2 fragments from config/orte_check_alps.m4 appear to be contradictory. By that I mean the first appears to mean that "--with-alps" with no argument means /opt/cray/alps/default/... for CLE5 and /usr/... for CLE4, while the second fragment appears to be doing the opposite:

Re: [OMPI devel] 1.7rc6 build failure: bogus errmgr code

2013-01-28 Thread Paul Hargrove
stain <r...@open-mpi.org> wrote: > Yes, we need to make it absolutely clear that c/r is no longer supported - > I'll remove that configure option. > > Thanks > Ralph > > On Jan 28, 2013, at 5:38 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > When configured us

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Paul Hargrove
n Mon, Jan 28, 2013 at 6:14 PM, Ralph Castain <r...@open-mpi.org> wrote: > > On Jan 28, 2013, at 6:10 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > The following 2 fragment from config/orte_check_alps.m4 appear to be > contradictory. > By that I mean the

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-28 Thread Paul Hargrove
3 at 6:30 PM, Ralph Castain <r...@open-mpi.org> wrote: > Like I said, I didn't write this code - all I can say for certain is that > it gets the right answer on the LANL Crays. I'll talk to Nathan (the > author) about it tomorrow. > > On Jan 28, 2013, at 6:23 PM, Paul Hargrove <

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-29 Thread Paul Hargrove
ob. Again, this probably slipped past Nathan because under CLE4 the alps headers are under /usr/include and therefore the missing CPPFLAGS were not actually required. -Paul On Mon, Jan 28, 2013 at 7:05 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Ralph and Nathan, > > As I sai

[OMPI devel] trunk install failure [brbarret]

2013-01-29 Thread Paul Hargrove
Using tonight's trunk tarball (r27954) configured using "--with-devel-headers" it looks like "make install" is trying to install rte_orte.h TWICE: /usr/bin/install -c -m 644 ../../../../../ompi/mca/rte/orte/rte_orte.h > ../../../../../ompi/mca/rte/orte/rte_orte.h >

[OMPI devel] Open MPI on Cray XC30 status

2013-01-29 Thread Paul Hargrove
OK, I am now on the openmpi-1.9a1r27954 tarball. In order to build OMPI and compile apps on this machine I must 1) edit the xe6 platform to --disable-shared/--enable-static (site-specific) 2) edit the xe6 platform file to provide a full path to the alps headers because the logic in

Re: [OMPI devel] openib fragment alignment

2013-02-20 Thread Paul Hargrove
Sounds like the problem comes down to just 32-bit systems that fault on unaligned 8-byte loads. That would be SPARC, IA64 and MIPS. For IB only SPARC is relevant. So perhaps alignment>2 should be conditional on 32-bit SPARC target. Additionally, an experiment to see if 4-byte alignment is "good
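To illustrate the kind of fault discussed above, here is a minimal sketch of an 8-byte load that makes no alignment assumption. It is independent of the openib fragment-alignment setting itself and is not taken from the Open MPI sources; the helper name load_u64_unaligned is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 64-bit value from a possibly unaligned
 * address. memcpy() makes no alignment assumption, so this avoids the
 * fault that a direct *(const uint64_t *)p load can raise on
 * strict-alignment targets. */
static uint64_t load_u64_unaligned(const void *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers typically turn the fixed-size memcpy() into a plain load where the target allows it, so the cost on alignment-tolerant hardware is usually small.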

Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Paul Hargrove
Pasha, I have at least one system where I can reproduce the problem, but don't have up-to-date autotools. So, I can only test from a tarball. If somebody can roll me a tarball of r28289 I can test ASAP. Otherwise I'll try to remember to test from tonight's trunk nightly once it appears. -Paul

Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Paul Hargrove
New tarball is on the website now and I've started my build... -Paul On Thu, Apr 4, 2013 at 2:15 PM, Ralph Castain <r...@open-mpi.org> wrote: > I can kick it off now - will take a little while to hit the web site. > > > On Apr 4, 2013, at 2:01 PM, Paul Hargrove <phha

Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Paul Hargrove
<sham...@ornl.gov> wrote: > > > Paul, > > > > I will prepare a tarball for you. > > > > Thanks ! > > > > Pavel (Pasha) Shamis > > --- > > Computer Science Research Group > > Computer Science and Math Division > > Oak Ridge National

Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Paul Hargrove
ng. -Paul On Thu, Apr 4, 2013 at 5:52 PM, Ralph Castain <r...@open-mpi.org> wrote: > Thanks Paul!! > > As always, much appreciated. > > On Apr 4, 2013, at 4:41 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > Pasha, > > Your fix appears to work. > > My p

Re: [OMPI devel] Using external libevent

2013-04-17 Thread Paul Hargrove
On Wed, Apr 17, 2013 at 10:40 AM, Orion Poplawski wrote: > So, would you be willing to provide more of the rationale as to why > libevent is bundled? Orion, I am NOT an Open MPI developer myself. So please don't take my response as speaking for the community. I found

Re: [OMPI devel] Simplified: Misuse or bug with nested types?

2013-04-23 Thread Paul Hargrove
Eric, Are you testing against the Open MPI svn trunk? I ask because on April 9 George committed a fix for the bug reported by Thomas Jahns: http://www.open-mpi.org/community/lists/devel/2013/04/12268.php -Paul On Tue, Apr 23, 2013 at 5:35 PM, Eric Chamberland <

Re: [OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?

2013-05-02 Thread Paul Hargrove
Ralph, I am not an expert, by any means, but based on a presentation I heard 4 hours ago: The Xeon and Phi instruction sets have a large intersection, but neither is a subset of the other. In particular, Phi has its own SIMD instructions *instead* of Xeon's MMX, SSEn, etc. There is also on

Re: [OMPI devel] Any plans to support Intel MIC (Xeon Phi) in Open-MPI?

2013-05-02 Thread Paul Hargrove
> know exactly what that means (I haven't read their docs about this stuff), > but I suspect that it's more than just launching MPI processes on them... > > > On May 2, 2013, at 8:54 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > Ralph, > > > > I am not

Re: [OMPI devel] [EXTERNAL] Re: RFC: Python-generated Fortran wrappers

2013-05-22 Thread Paul Hargrove
Let me jump in here with a different perspective. First, for those who don't know me: + I am NOT an OMPI developer + I am NOT an MPI application author either + I am a developer of "competing" HPC communications software (GASNet) + I contribute to OMPI mainly by building release candidates

Re: [OMPI devel] Open MPI shirts and more

2013-10-18 Thread Paul Hargrove
I don't know if it is just me, but the logo appears to have gotten cropped on the 2.25" button and the round magnet. -Paul On Fri, Oct 18, 2013 at 12:41 PM, Jeff Squyres (jsquyres) < jsquy...@cisco.com> wrote: > OMPI Developer Community -- > > Per the upcoming 10th anniversary of OMPI's SVN r1

Re: [OMPI devel] Open MPI shirts and more

2013-10-18 Thread Paul Hargrove
Still looks the same to me, even after changing browsers. Screen shot attached. -Paul On Fri, Oct 18, 2013 at 1:59 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > wrote: > Fixed! > > On Oct 18, 2013, at 3:44 PM, Paul Hargrove <phhargr...@lbl.gov> > wrote: > >

Re: [OMPI devel] shmem vs. oshmem

2013-10-25 Thread Paul Hargrove
On Fri, Oct 25, 2013 at 2:32 PM, Jeff Squyres (jsquyres) wrote: > We're shipping oshmem, not shmem, so why not call them oshmem examples > [that also happen to be shmem examples] -- rather than shmem examples [that > also happen to be oshmem examples]? My USD 0.02: If the

Re: [OMPI devel] RFC: usnic BTL MPI_T pvar scheme

2013-11-05 Thread Paul Hargrove
Jeff, If this approach is to be adopted by other components (and perhaps other MPIs), then it would be important for the enumeration variable name to be derived in a UNIFORM way: __SOMETHING Without a fixed value for "SOMETHING" somebody will need to read sources (or documentation) to make

Re: [OMPI devel] RFC: usnic BTL MPI_T pvar scheme

2013-11-05 Thread Paul Hargrove
On Tue, Nov 5, 2013 at 6:00 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Nov 5, 2013, at 2:54 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > If this approach is to be adopted by other components (and perhaps other > MPIs), then it would be im

Re: [OMPI devel] RFC: usnic BTL MPI_T pvar scheme

2013-11-05 Thread Paul Hargrove
ilities. > > How about using a * in the name, to represent where the match is? E.G., > btl_usnic_*_enum? > > It's a string, so it's not just limited to letters and underscores. > > Sent from my phone. No type good. > > On Nov 5, 2013, at 6:26 PM, "Paul Hargrov

Re: [OMPI devel] [PATCH 4/4] Trying to get the C/R code to compile again. (last)

2013-12-04 Thread Paul Hargrove
On Dec 4, 2013 12:07 PM, "Jeff Squyres (jsquyres)" wrote: [...] > But in some ways, having uncompilable code is a *good* thing, because it tells you exactly where you need to work on the architecture. Just updating it to *compile* removes that safeguard -- will you

[OMPI devel] 1.7.4rc1 build failure: FreeBSD-9

2013-12-19 Thread Paul Hargrove
I see the failure below when building 1.7.4rc1 on FreeBSD-9 (amd64). It looks to be just a missing header, probably sys/stat.h. $ gcc --version gcc (GCC) 4.2.1 20070831 patched [FreeBSD] Only configure option passed was --prefix-... -Paul Making all in mca/sharedfp/sm CC

[OMPI devel] 1.7.4rc1 build failure: OpenBSD-5 and NetBSD-6

2013-12-19 Thread Paul Hargrove
When building 1.7.4rc1 on OpenBSD-5 and NetBSD-6 (both amd64) I see what appears to be the same three errors ("make" output at end of this email) on both platforms. All three syntax errors appear to be collisions on the symbol if_mtu: -bash-4.2$ cat -n openmpi-1.7.4rc1/opal/util/if.h | grep -w

[OMPI devel] 1.74rc1 build failure: Solaris 11 / x86_64 / Sun Studio 12.3

2013-12-19 Thread Paul Hargrove
In 1.7.4rc1's README support is still claimed for Solaris 11 on x86_64 with Sun Studio (12.2 and 12.3): - Oracle Solaris 10 and 11, 32 and 64 bit (SPARC, i386, x86_64), with Oracle Solaris Studio 12.2 and 12.3 However, I get a build failure when configured with: CC=cc CFLAGS=-m64

Re: [OMPI devel] 1.7.4rc1 build failure: Solaris 11 / x86_64

2013-12-19 Thread Paul Hargrove
[2]: Leaving directory `/shared/OMPI/openmpi-1.7.4rc1-solaris11-x64-ib-gcc452/BLD/opal/mca/if/posix_ipv4 -Paul On Thu, Dec 19, 2013 at 3:51 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > In 1.7.4rc1's README support is still claimed for Solaris 11 on x86_64 > with Sun Studio (1

Re: [OMPI devel] 1.74rc1 build failure: Solaris 11 / x86_64 / Sun Studio 12.3

2013-12-19 Thread Paul Hargrove
{ > -opal_output(0, "btl_usnic_opal_ifinit: ioctl(SIOCGIFMTU) > failed with errno=%d", errno); > -break; > -} > -intf->if_mtu = ifr->ifr_mtu; > +/* get the MTU */ > +if (ioctl(sd, SIOCGIFMTU, ifr

Re: [OMPI devel] 1.7.4rc1 build failure: OpenBSD-5 and NetBSD-6

2013-12-19 Thread Paul Hargrove
atform does not list RLIMIT_AS. Running "grep -rl RLIMIT_AS /usr/include" confirms that this constant does not exist. So, I think "#ifdef RLIMIT_AS" is required. -Paul On Thu, Dec 19, 2013 at 4:39 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > wrote: > On Dec 19,
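The "#ifdef RLIMIT_AS" guard suggested above could look roughly like the following. This is a minimal sketch and not the actual Open MPI change; the helper name get_address_space_limit is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: query the address-space limit only where the
 * platform defines RLIMIT_AS. Returns 0 on success, -1 when the
 * constant is unavailable or getrlimit() fails. */
static int get_address_space_limit(struct rlimit *out)
{
    assert(out != NULL);
#ifdef RLIMIT_AS
    return getrlimit(RLIMIT_AS, out) == 0 ? 0 : -1;
#else
    (void)out;   /* constant absent on this platform */
    return -1;
#endif
}
```

Callers treat -1 as "no limit information", so the same code compiles whether or not the platform provides the constant.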

Re: [OMPI devel] 1.74rc1 build failure: Solaris 11 / x86_64 / Sun Studio 12.3

2013-12-19 Thread Paul Hargrove
. Perhaps just because the OFED stack is present? -Paul On Thu, Dec 19, 2013 at 4:39 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > wrote: > Try http://www.open-mpi.org/~jsquyres/unofficial/. > > Should have both "if" fixes in it. > > > On Dec 19, 2013,

[OMPI devel] 1.7.4rc1 run failure on Solaris 10 / SPARC (not SIGBUS)

2013-12-19 Thread Paul Hargrove
Testing with Solaris 10 on SPARC, I was expecting to encounter the bus error reported previously by Siegman Gross. Instead I see the following hwloc-related abort: $ env PATH=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin:$PATH

Re: [OMPI devel] 1.74rc1 build failure: Solaris 11 / x86_64 / Sun Studio 12.3

2013-12-19 Thread Paul Hargrove
-1.7.4rc2forpaul/ompi/mca/btl/usnic/btl_usnic_util.h:19:45: error: static declaration of ‘fls’ follows non-static declaration /usr/include/string.h:87:12: note: previous declaration of ‘fls’ was here make[2]: *** [btl_usnic_module.lo] Error 1 -Paul On Thu, Dec 19, 2013 at 6:35 PM, Paul Hargrove

[OMPI devel] 1.7.4rc1 install failure: NetBSD-6 amd64

2013-12-19 Thread Paul Hargrove
Attached is the output from "make install" of 1.7.4rc1 + Jeff's fix for the symbol conflict on "if_mtu". There appear to be at least 2 issues. 1) There are lots of (not fatal) messages about ldconfig not existing, but according to the NetBSD lists that utility went away with the conversion from

Re: [OMPI devel] 1.7.4rc1 run failure on Solaris 10 / SPARC (not SIGBUS)

2013-12-19 Thread Paul Hargrove
@open-mpi.org> wrote: > I believe this one has already been fixed and is in the nightly (1.7.4rc2) > - for now, you can just set "--bind-to none" on the cmd line to get past it > > > On Dec 19, 2013, at 6:42 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: >

[OMPI devel] 1.7.4rc1 autogen error: NetBSD-6

2013-12-19 Thread Paul Hargrove
Probably nobody cares, but I'll report this for completeness. In trying to understand the "make install" failure on NetBSD-6 I ran "autogen.sh". The versions detected: Searching for autoconf Found autoconf version 2.69; checking version... Found version component 2 -- need 2

Re: [OMPI devel] 1.74rc1 build failure: Solaris 11 / x86_64 / Sun Studio 12.3

2013-12-19 Thread Paul Hargrove
13 at 6:47 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Jeff, > > I didn't actually get very far after fixing __always_inline. > In fact, the build still fails on the *same* line, but for a different > (valid) reason: > fls() is declared in /usr/include/string.h >
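One common way to sidestep a clash with a system-declared fls() is to give the private helper a prefixed name. The sketch below assumes the BSD meaning of fls() (1-based index of the most significant set bit, 0 for a zero argument); the name example_fls is hypothetical and this is not the fix that went into the usnic BTL.

```c
/* Hypothetical name, chosen so it cannot collide with a system fls().
 * Returns the 1-based index of the most significant set bit of x,
 * or 0 when x is 0. */
static int example_fls(unsigned int x)
{
    int pos = 0;
    while (x != 0) {
        ++pos;
        x >>= 1;
    }
    return pos;
}
```

For example, example_fls(8) returns 4 and example_fls(0) returns 0.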

[OMPI devel] 1.7.4rc1 build failure: another OpenBSD-5

2013-12-20 Thread Paul Hargrove
Manually #ifdef'ing out the RLIMIT_AS code which lead to my previous failure on OpenBSD-5 allows me to reach the (sigh) *next* problem: Making all in mpi/cxx CXX mpicxx.lo /home/phargrov/OMPI/openmpi-1.7.4rc2forpaul-openbsd5-amd64/openmpi-1.7.4rc2forpaul/ompi/mpi/cxx/mpicxx.cc:120:21:

Re: [OMPI devel] 1.7.4rc1 run failure on Solaris 10 / SPARC (not SIGBUS)

2013-12-20 Thread Paul Hargrove
issues on Thu night. On Thu, Dec 19, 2013 at 7:19 PM, Ralph Castain <r...@open-mpi.org> wrote: > I believe this one has already been fixed and is in the nightly (1.7.4rc2) > - for now, you can just set "--bind-to none" on the cmd line to get past it > > > On Dec 19,

Re: [OMPI devel] 1.7.4rc1 run failure on Solaris 10 / SPARC (not SIGBUS)

2013-12-20 Thread Paul Hargrove
protection code just went in > today. Jeff has since regenerated the tarball for the web site, so the one > up there should have most (if not all) of these problems fixed > > Have a great holiday! > Ralph > > > On Dec 20, 2013, at 11:40 AM, Paul Hargrove <phhargr...@lbl

[OMPI devel] 1.7.4rc2r30031 testing summary

2013-12-20 Thread Paul Hargrove
This email is a summary of my results testing 1.7.4rc2r30031. I will send detailed follow-ups on the new issues. So, this is a heads-up to let you know this version still has significant problems for me. FreeBSD-9/amd64 + "mpirun -np 2 examples/ring_c" hangs! + "mpirun -np 1 examples/ring_c" runs

[OMPI devel] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
With plenty of help from Jeff and Ralph's bug fixes in the past 24 hours, I can now build OMPI for NetBSD. However, running even a simple example fails: Having set PATH and LD_LIBRARY_PATH: $ mpirun -np 1 examples/ring_c just hangs Output from "top" shows idle procs: PID USERNAME PRI NICE

[OMPI devel] 1.7.4rc2r30031 - FreeBSD mpirun warning

2013-12-20 Thread Paul Hargrove
I have a build of OMPI 1.7.4rc2r30031 on FreeBSD-9 finally. I can (as I will detail in another email) run only singletons at the moment. However, when I do I get a warning that is, IMHO, unnecessary: $ mpirun -np 1 examples/ring_c

Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
wba...@sandia.gov> wrote: > Paul - > > Any chance you could grab a stack trace from the mpi app? That's probably > the fastest next step > > Brian > > > > Sent with Good (www.good.com) > > > -Original Message- > *From: *Paul Hargrove [phhargr...@lbl.g

[OMPI devel] 1.7.4rc2r30031 - FreeBSD-9 mpirun hangs

2013-12-20 Thread Paul Hargrove
This case is not quite like my OpenBSD-5 report. On FreeBSD-9 I *can* run singletons, but "-np 2" hangs. The following hangs: $ mpirun -np 2 examples/ring_c The following complains about the "bogus" btl selection. So this is not the same as my problem with OpenBSD-5: $ mpirun -mca btl bogus -np

Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
c0f857c000 On Fri, Dec 20, 2013 at 2:48 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > Brian, > > Of course, I should have thought of that myself. > See below for backtrace from a singleton run. > > I'm starting an --enable-debug build to maybe get some line number info >

Re: [OMPI devel] 1.7.4rc2r30031 - FreeBSD mpirun warning

2013-12-20 Thread Paul Hargrove
On Fri, Dec 20, 2013 at 3:12 PM, Dave Goodell (dgoodell) <dgood...@cisco.com > wrote: > On Dec 20, 2013, at 4:43 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > > The warning is correct that no such interface exists. > > However 127.0.0.1/24 DOES exist: >

Re: [OMPI devel] 1.7.4rc2r30031 - FreeBSD-9 mpirun hangs

2013-12-20 Thread Paul Hargrove
, 2013 at 2:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > This case is not quite like my OpenBSD-5 report. > On FreeBSD-9 I *can* run singletons, but "-np 2" hangs. > > The following hangs: > $ mpirun -np 2 examples/ring_c > > The following complains

Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
, Paul Hargrove <phhargr...@lbl.gov> wrote: > Below is the backtrace again, this time configured w/ --enable-debug and > for all threads. > -Paul > > Thread 2 (thread 1021110): > #0 0x1bc0ef6c5e3a in nanosleep () at :2 > #1 0x1bc0f317c2d4 in nanosleep (rq

Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
ition) > > > On Dec 20, 2013, at 4:02 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > FWIW: > I've confirmed that this is a REGRESSION relative to 1.7.2, which works fine > on OpenBSD-5 > > I could not build 1.7.3 due to some of the issues fixed for 1.7.4rc in the

Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
On Fri, Dec 20, 2013 at 4:02 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > FWIW: > I've confirmed that this is a REGRESSION relative to 1.7.2, which works fine > on OpenBSD-5 > > I could not build 1.7.3 due to some of the issues fixed for 1.7.4rc in the > past 24 hours.

Re: [OMPI devel] [EXTERNAL] 1.7.4rc2r30031 - OpenBSD-5 mpirun hangs

2013-12-20 Thread Paul Hargrove
ke it appear. I'm investigating and think I > know where the issue might lie (a timer that is firing to indicate a failed > connection attempt and causing a race condition) > > > On Dec 20, 2013, at 4:02 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > > FWIW: > I've

Re: [OMPI devel] 1.7.4rc1 autogen error: NetBSD-6

2013-12-20 Thread Paul Hargrove
ime. However, I am sending this email to document my finding in case I don't get back to this. -Paul On Fri, Dec 20, 2013 at 6:49 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com > wrote: > I just submitted a CMR to Brian to fix this: > > https://svn.open-mpi.org/trac/ompi/ti

Re: [OMPI devel] 1.7.4rc1 autogen error: NetBSD-6

2013-12-20 Thread Paul Hargrove
; + open(OUT, ">configure.patched") || my_die "Can't open configure.patched"; print OUT $c; close(OUT); -Paul On Fri, Dec 20, 2013 at 6:04 PM, Paul Hargrove <phhargr...@lbl.gov> wrote: > As I indicated earlier today, the CMRed fix to push/pop "

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Paul Hargrove
Interestingly enough the 4MB latency actually improved significantly relative to the initial numbers. -Paul [Sent from my phone] On Jan 8, 2014 8:50 AM, "George Bosilca" wrote: > These results are way worse than the ones you sent in your previous email? > What is the reason?

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-08 Thread Paul Hargrove
Never mind, since Nathan just clarified that the results are not comparable. -Paul [Sent from my phone] On Jan 8, 2014 8:58 AM, "Paul Hargrove" <phhargr...@lbl.gov> wrote: > Interestingly enough the 4MB latency actually improved significantly > relative to the initial nu

[OMPI devel] 1.7.4rc2r30148 - static link failure on NetBSD

2014-01-08 Thread Paul Hargrove
When I compile the current 1.7.4rc on NetBSD with no configure arguments, I still get the "make install" failure that I have detailed in previous emails. HOWEVER, if I configure with "--enable-static --disable-shared" then I get an earlier failure at build time (partial "make V=1" output shown

[OMPI devel] 1.7.4rc2r30148 run failure NetBSD6-x86

2014-01-08 Thread Paul Hargrove
While I have yet to get a working build on NetBSD for x86-64 h/w, I *have* successfully built Open MPI's current 1.7.4rc tarball on NetBSD-6 for x86. However, I can't *run* anything: Attempting the ring_c example on 2 cores: -bash-4.2$ mpirun -mca btl sm,self -np 2 examples/ring_c
