Re: [OMPI devel] OpenMPI and R

2012-04-06 Thread TERRY DONTJE
Have you tried to compile and run a simple MPI program with your 
installed Open MPI?  If that works, then you need to figure out what the 
Makefile is doing when it is "testing if installed package can 
be loaded" and try to reproduce the issue manually.


BTW, I normally configure my OMPI with --enable-orterun-prefix-by-default 
to get OMPI to pull in the right library paths instead of relying on ldconfig.


In the ldconfig -p output below, you may also want to grep for mca to 
make sure the plugins being complained about in the R testing are found. 
I suspect they are, but it would be good to double-check.


--td

On 4/5/2012 7:59 PM, Benedict Holland wrote:
So I am now back on this full time, as I need this to work. Open MPI 
1.4.3 is deadlocking with Rmpi and I need the latest code. I still get 
the exact same problem. I configured it with --prefix=/usr to 
install everything in the default directories and added 
/usr/lib/openmpi to my ldconfig. There is no LD_LIBRARY_PATH global 
variable on Ubuntu 11.10.


ldconfig -p |grep mpi
libvt-mpi.so.0 (libc6,x86-64) => /usr/lib/libvt-mpi.so.0
libvt-mpi.so (libc6,x86-64) => /usr/lib/libvt-mpi.so
libvt-mpi-unify.so.0 (libc6,x86-64) => /usr/lib/libvt-mpi-unify.so.0
libvt-mpi-unify.so (libc6,x86-64) => /usr/lib/libvt-mpi-unify.so
libopenmpi_malloc.so.0 (libc6,x86-64) => /usr/lib/libopenmpi_malloc.so.0
libompitrace.so.0 (libc6,x86-64) => /usr/lib/libompitrace.so.0
libompitrace.so (libc6,x86-64) => /usr/lib/libompitrace.so
libompi_dbg_msgq.so (libc6,x86-64) => /usr/lib/openmpi/libompi_dbg_msgq.so
libmpi_f90.so.1 (libc6,x86-64) => /usr/lib/libmpi_f90.so.1
libmpi_f90.so.0 (libc6,x86-64) => /usr/lib/libmpi_f90.so.0
libmpi_f90.so (libc6,x86-64) => /usr/lib/libmpi_f90.so
libmpi_f77.so.1 (libc6,x86-64) => /usr/lib/libmpi_f77.so.1
libmpi_f77.so.0 (libc6,x86-64) => /usr/lib/libmpi_f77.so.0
libmpi_f77.so (libc6,x86-64) => /usr/lib/libmpi_f77.so
libmpi_cxx.so.1 (libc6,x86-64) => /usr/lib/libmpi_cxx.so.1
libmpi_cxx.so.0 (libc6,x86-64) => /usr/lib/libmpi_cxx.so.0
libmpi_cxx.so (libc6,x86-64) => /usr/lib/libmpi_cxx.so
libmpi.so.1 (libc6,x86-64) => /usr/lib/libmpi.so.1
libmpi.so.0 (libc6,x86-64) => /usr/lib/libmpi.so.0
libmpi.so (libc6,x86-64) => /usr/lib/libmpi.so
libexempi.so.3 (libc6,x86-64) => /usr/lib/libexempi.so.3
libcompizconfig.so.0 (libc6,x86-64) => /usr/lib/libcompizconfig.so.0

Compiling Rmpi from inside R gives me:

* installing *source* package 'Rmpi' ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
I am here /usr and it is OpenMPI
Trying to find mpi.h ...
Found in /usr/include
Trying to find libmpi.so or libmpich.a ...
Found libmpi in /usr/lib
checking for openpty in -lutil... yes
checking for main in -lpthread... yes
configure: creating ./config.status
config.status: creating src/Makevars
** Creating default NAMESPACE file
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c RegQuery.c -o RegQuery.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c Rmpi.c -o Rmpi.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c conversion.c -o conversion.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c internal.c -o internal.o
gcc -std=gnu99 -shared -o Rmpi.so RegQuery.o Rmpi.o conversion.o 
internal.o -L/usr/lib -lmpi -lutil -lpthread -L/usr/lib/R/lib -lR

installing to /usr/local/lib/R/site-library/Rmpi/libs
** R
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
[ben-Inspiron-1764:18216] mca: base: component_find: unable to open 
/usr/lib/openmpi/mca_paffinity_hwloc: 
/usr/lib/openmpi/mca_paffinity_hwloc.so: undefined symbol: 
opal_hwloc_topology (ignored)
[ben-Inspiron-1764:18216] mca: base: component_find: unable to open 
/usr/lib/openmpi/mca_shmem_posix: /usr/lib/openmpi/mca_shmem_posix.so: 
undefined symbol: opal_shmem_base_output (ignored)
[ben-Inspiron-1764:18216] 

Re: [OMPI devel] [patch] Bugs in mpi-f90-interfaces.h and its bridge implementation

2012-04-06 Thread Kawashima
Hi Jeff,

I've checked your code in bitbucket and found two types of errors.
I've attached a patch.

The first (ignore-tkr) seems to be an error from manual patching.
For the second (tkr), it seems the patch command could not apply my
fixes because neighboring lines had been modified in your code.

Regards,

Takahiro Kawashima,
MPI development team,
Fujitsu

> Jeffrey Squyres wrote:
> > 
> > On Apr 3, 2012, at 10:56 PM, Kawashima wrote:
> > 
> > > I and my coworkers checked mpi-f90-interfaces.h against MPI 2.2 standard
> > > and found many bugs in it. Attached patches fix them for trunk.
> > > Though some of them are trivial, others are not so trivial.
> > > So I'll explain them below.
> > 
> > Excellent -- many thanks for these!
> > 
> > I have some notes on the specific patches, below, but first note the 
> > following: Craig Rasmussen and I have been working on revamping the Open 
> > MPI Fortran bindings for quite a while.  Here's a quick summary of the 
> > changes:
> > 
> > 1. two prototypes of the new MPI-3 mpi_f08 module
> >a. Full implementation of all functions, not using F08 descriptors
> >b. 6-function MPI using F08 descriptors (showing that it can be done) 
> > for ifort
> > 2. for the existing mpi module:
> >a. New implementation using "ignore TKR" directives
> >b. If your fortran compiler doesn't support "ignore TKR" directives 
> > (e.g., gfortran), fall back and use the old mpi module implementation
> > 3. wrapper compiler changes
> >a. New "mpifort" wrapper compiler; all Fortran interfaces are available
> >b. mpif77 and mpif90 are sym links to mpifort (and may disappear someday)
> > 
> > All of this work is available in a public bitbucket here:
> > 
> >https://bitbucket.org/jsquyres/mpi3-fortran
> 
> Great. I'll see it later.
> 
> > There's a linker error in there at the tip at the moment; Craig is working 
> > on fixing it.  When that's done, we'll likely put out a final public test 
> > tarball, and assuming that goes well, merge all this stuff into the OMPI 
> > SVN trunk.  The SVN merge will likely be a little disruptive because the 
> > directory structure of the Fortran bindings changed a bit (e.g., all 5 
> > implementations are now under ompi/mpi/fortran).
> > 
> > The point is that I plan to bring in all your fixes to this bitbucket 
> > branch so that all the new stuff and all your fixes come in to the trunk at 
> > the same time.
> > 
> > 1.4 is dead; I doubt we'll be applying your fixes there.
> > 
> > 1.5 has transitioned to 1.6 (yesterday); I can look into making a patch for 
> > the v1.6 series.  The tricky part is preserving the mpi module ABI between 
> > 1.5.5 and 1.6.  We've done this before, though, so I think it'll be do-able.
> 
> Thanks for your explanation.
> I know we are preparing v1.6 and v1.4 is not active.
> 
> > > 1. incorrect parameter types
> > > 
> > >  Two trivial parameter type mismatches.
> > >  Fixed in my mpi-f90-interface.type-mismatch.patch.
> > > 
> > >  MPI_Cart_map periods: integer -> logical
> > >  MPI_Reduce_scatter recvcounts: missing "dimension(*)"
> > 
> > Applied, and also made corresponding fixes to my ignore TKR mpi module (the 
> > f08 module didn't have these issues).
> > 
> > > 2. incorrect intent against MPI 2.2 standard
> > > 
> > >  This is a somewhat complex issue.
> > >  First, I'll cite MPI 2.2 standard below.
> > > 
> > >  2.3 in MPI 2.2 standard says:
> > >There is one special case - if an argument is a handle to an opaque
> > >object (these terms are defined in Section 2.5.1), and the object is
> > >updated by the procedure call, then the argument is marked INOUT or 
> > > OUT.
> > >It is marked this way even though the handle itself is not modified -
> > >we use the INOUT or OUT attribute to denote that what the handle
> > >references is updated. Thus, in C++, IN arguments are usually either
> > >references or pointers to const objects.
> > > 
> > >  2.3 in MPI 2.2 standard also says:
> > >MPI's use of IN, OUT and INOUT is intended to indicate to the user
> > >how an argument is to be used, but does not provide a rigorous
> > >classification that can be translated directly into all language
> > >bindings (e.g., INTENT in Fortran 90 bindings or const in C bindings).
> > >For instance, the "constant" MPI_BOTTOM can usually be passed to
> > >OUT buffer arguments. Similarly, MPI_STATUS_IGNORE can be passed as
> > >the OUT status argument.
> > > 
> > >  16.2.4 in MPI 2.2 standard says:
> > >Advice to implementors.
> > >The appropriate INTENT may be different from what is given in the
> > >MPI generic interface. Implementations must choose INTENT so that
> > >the function adheres to the MPI standard.
> > > 
> > >  Hmm. intent in mpi-f90-interfaces.h does not necessarily match
> > >  IN/OUT/INOUT in MPI 2.2, especially regarding opaque objects.
> > >  mpi-f90-interfaces.h seems to have consideration of opaque objects
> > >  partially, which is h

Re: [OMPI devel] [patch] Bugs in mpi-f90-interfaces.h and its bridge implementation

2012-04-06 Thread Jeffrey Squyres
On Apr 6, 2012, at 7:09 AM, Kawashima wrote:

> I've checked your code in bitbucket and found two types of errors.
> I've attached a patch.
> 
> The first (ignore-tkr) seems to be an error from manual patching.
> For the second (tkr), it seems the patch command could not apply my
> fixes because neighboring lines had been modified in your code.

Thank you!  I've applied your patch.

I'm also in the middle of updating the code to handle Fortran mpiext's (per 
Josh's mail the other day).  I hope to finish this today.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] OpenMPI and R

2012-04-06 Thread Jeffrey Squyres
On Apr 5, 2012, at 9:07 PM, Benedict Holland wrote:

> Oh how interesting and I hope this helps someone. Following another link, I 
> had to use:
> 
> ./configure --prefix /usr --enable-shared --enable-static

This makes sense.  You were falling victim to the fact that R dlopens libmpi as 
a dynamic library in a private namespace.  Hence, when Open MPI then dlopens 
its own plugins, it can't find libmpi's symbols.  This is a generic problem 
with any system that opens plugins that, themselves, open plugins.  I wish there 
were a better solution to this -- the OS guys need to give us a better mechanism 
here.  :-(

OMPI's --enable-static option does two things:

- it builds libmpi.a (vs. libmpi.so)
- it slurps all of OMPI's plugins into libmpi.a (so it doesn't need to dlopen 
anything at run-time)

It's the latter point that is saving you.

Note that you could also just --disable-dlopen (vs. --disable-shared 
--enable-static), which just does the latter of the above things (meaning: OMPI 
still builds libmpi.so), and it should also work for you.

> when compiling this for Rmpi. Just curious, why isn't --enable-static a 
> default option? 

Among other reasons, shared libraries generally help save memory at run time.  
This is somewhat important as core counts go up.  If you "mpirun -np 32" on a 
single, 32-core machine, would you rather have 32 independent copies of 
libmpi.a loaded into RAM, or just one copy that all processes share?

Using libmpi.so enables the latter option.

Make sense?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] OpenMPI and R

2012-04-06 Thread Benedict Holland
Oh, this actually does make a lot of sense. The kicker is that Rmpi doesn't
like to use OMPI and really loves to use LAM, so I have to use R in batch
mode by running mpirun -np 12 and specifying the host file. I have a bad
feeling that this loads the library 12 times, once for each R namespace.
While this is annoying, not having Rmpi working was far more so, and RAM is
cheap. I agree wholeheartedly that dynamic libraries are far superior to
static ones, but I wonder whether building both by default would let the
applications that require static libraries compile while still letting the
applications that use dynamic library loading exist side by side. I don't
plan on writing applications that require -lmpi, but I know there are
developers who aren't nearly as in tune with this line of thinking or who
don't know the difference between static and dynamic libraries.

Also, this was tripping me up for weeks. I will let the Rmpi developer know,
though, and hopefully they can shift the code to use the dynamic libraries,
or perhaps take part in development of the library to make it load dynamic
libraries.

Anyway Jeff, thank you for a wonderful explanation. I wonder if this should
be posted somewhere on the Open MPI site as a note for Rmpi developers who
want to compile OMPI themselves. The packages out there are multiple years
old at this point, at least for Ubuntu, and the released version was
actually one of your development releases; there have been several large
bug fixes since then.

BTW, I expect to be using OMPI for a while; are there any simple projects I
might work on to get to know the code base and maybe move up the chain?
Fixing some low-hanging-fruit bugs and learning how to debug OMPI would be
optimal.

Thanks again,
~Ben

On Fri, Apr 6, 2012 at 8:45 AM, Jeffrey Squyres  wrote:

> On Apr 5, 2012, at 9:07 PM, Benedict Holland wrote:
>
> > Oh how interesting and I hope this helps someone. Following another
> link, I had to use:
> >
> > ./configure --prefix /usr --enable-shared --enable-static
>
> This makes sense.  You were falling victim to the fact that R dlopens
> libmpi as a dynamic library in a private namespace.  Hence, when Open MPI
> then dlopens its own plugins, it can't find libmpi's symbols.  This is a
> generic problem with any system that opens plugins that, themselves, open
> plugins.  I wish there were a better solution to this -- the OS guys need to
> give us a better mechanism here.  :-(
>
> OMPI's --enable-static option does two things:
>
> - it builds libmpi.a (vs. libmpi.so)
> - it slurps all of OMPI's plugins into libmpi.a (so it doesn't need to
> dlopen anything at run-time)
>
> It's the latter point that is saving you.
>
> Note that you could also just --disable-dlopen (vs. --disable-shared
> --enable-static), which just does the latter of the above things (meaning:
> OMPI still builds libmpi.so), and it should also work for you.
>
> > when compiling this for Rmpi. Just curious, why isn't --enable-static a
> default option?
>
> Among other reasons, shared libraries generally help save memory at run
> time.  This is somewhat important as core counts go up.  If you "mpirun -np
> 32" on a single, 32-core machine, would you rather have 32 independent
> copies of libmpi.a loaded into RAM, or just one copy that all processes
> share?
>
> Using libmpi.so enables the latter option.
>
> Make sense?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting

2012-04-06 Thread Ralph Castain
+1 for SJ - much easier to be someplace with a major airport.


On Apr 5, 2012, at 7:54 AM, Gutierrez, Samuel K wrote:

> My vote is for San Jose.
> 
> Sam
> 
> 
> From: devel-boun...@open-mpi.org [devel-boun...@open-mpi.org] on behalf of 
> Josh Hursey [jjhur...@open-mpi.org]
> Sent: Wednesday, April 04, 2012 5:14 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting
> 
> I second Oak Ridge (or even UTK) sometime in June.
> 
> -- Josh
> 
> On Tue, Apr 3, 2012 at 3:07 PM, Barrett, Brian W  wrote:
>> On 4/3/12 11:08 AM, "Jeffrey Squyres"  wrote:
>> 
>>> On Apr 3, 2012, at 11:44 AM, Barrett, Brian W wrote:
>>> 
 There is discussion of attempting to have a developers meeting this
 summer.  We haven't had one in a while and people thought it would be
 good
 to work through some of the ideas on how to implement features for 1.7.
 We don't have a location yet, but possibilities include Los Alamos and
 San
 Jose.  To help us get an idea of who can attend, please add your
 information to the doodle poll below.
 
 http://www.doodle.com/cei3ve3qyeer9bv9
>>> 
>>> 
>>> Since the meeting is likely to take a whole week, might I suggest making
>>> each Doodle entry represent an entire week?  E.g., June 4-11, June 11-15,
>>> etc.
>> 
>> We talked about 3 days, so I was thinking that perhaps there were half
>> weeks that worked well for people.
>> 
>> Brian
>> 
>> --
>> Brian W. Barrett
>> Dept. 1423: Scalable System Software
>> Sandia National Laboratories
>> 
>> 
>> 
>> 
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> 
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 




Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting

2012-04-06 Thread Barrett, Brian W
Agreed.

Brian

On Apr 6, 2012, at 7:31 PM, Ralph Castain wrote:

> +1 for SJ - much easier to be someplace with a major airport.
> 
> 
> On Apr 5, 2012, at 7:54 AM, Gutierrez, Samuel K wrote:
> 
>> My vote is for San Jose.
>> 
>> Sam
>> 
>> 
>> From: devel-boun...@open-mpi.org [devel-boun...@open-mpi.org] on behalf of 
>> Josh Hursey [jjhur...@open-mpi.org]
>> Sent: Wednesday, April 04, 2012 5:14 AM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting
>> 
>> I second Oak Ridge (or even UTK) sometime in June.
>> 
>> -- Josh
>> 
>> On Tue, Apr 3, 2012 at 3:07 PM, Barrett, Brian W  wrote:
>>> On 4/3/12 11:08 AM, "Jeffrey Squyres"  wrote:
>>> 
 On Apr 3, 2012, at 11:44 AM, Barrett, Brian W wrote:
 
> There is discussion of attempting to have a developers meeting this
> summer.  We haven't had one in a while and people thought it would be
> good
> to work through some of the ideas on how to implement features for 1.7.
> We don't have a location yet, but possibilities include Los Alamos and
> San
> Jose.  To help us get an idea of who can attend, please add your
> information to the doodle poll below.
> 
> http://www.doodle.com/cei3ve3qyeer9bv9
 
 
 Since the meeting is likely to take a whole week, might I suggest making
 each Doodle entry represent an entire week?  E.g., June 4-11, June 11-15,
 etc.
>>> 
>>> We talked about 3 days, so I was thinking that perhaps there were half
>>> weeks that worked well for people.
>>> 
>>> Brian
>>> 
>>> --
>>> Brian W. Barrett
>>> Dept. 1423: Scalable System Software
>>> Sandia National Laboratories
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>> 
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 


