Re: [OMPI devel] 1.10.0rc6 - slightly different mx problem

2015-08-25 Thread Brice Goglin
On 25/08/2015 05:59, Christopher Samuel wrote:
>
> INRIA does have Open-MX (Myrinet Express over Generic Ethernet
> Hardware), last release December 2014.  No idea if it's still developed
> or used..
>
> http://open-mx.gforge.inria.fr/
>
> Brice?
>
> Open-MPI is listed as working with it there. ;-)
>

It's not developed anymore. New releases just fix support for newer
kernels as long as the fix is easy.
There are still a couple of users, but I guess OMPI 1.8 is enough for them.

Brice



Re: [OMPI devel] v1.10.0rc7

2015-08-25 Thread Paul Hargrove
With only the slow qemu-emulated MIPS and ARM testers still running, I can
report that I have seen NO issues with rc7.

-Paul

On Mon, Aug 24, 2015 at 4:54 PM, Ralph Castain  wrote:

> Yet another step in the apparently never-ending quest to release v1.10.0…
>
> http://www.open-mpi.org/software/ompi/v1.10/
>
> Please check it out
> Ralph
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/08/17832.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] esslingen MTT?

2015-08-25 Thread Adrian Reber
On Mon, Aug 24, 2015 at 09:47:22PM +, Jeff Squyres (jsquyres) wrote:
> Who runs the esslingen MTT?
> 
> You're getting some build failures on master that I don't understand:
> 
> -
> make[3]: Entering directory
> '/home/adrian/mtt-scratch/mpi-install/FDvh/src/openmpi-dev-2350-geb25c00/ompi/mpi/fortran/mpif-h/profile'
>   GENERATE psizeof_f.f90
>   FC   psizeof_f.lo
> Usage: 
> /home/adrian/mtt-scratch/mpi-install/FDvh/src/openmpi-dev-2350-geb25c00/libtool
>  [OPTION]...
> [MODE-ARG]...
> Try 'libtool --help' for more information.
> Makefile:2609: recipe for target 'psizeof_f.lo' failed
> -
> 
> Can you do a "make V=1" so that I can see what exactly is going wrong?

make[3]: Entering directory 
'/home/adrian/ompi/build/ompi/mpi/fortran/mpif-h/profile'
/bin/sh ../../../../../libtool  --tag=FC   --mode=compile-c -o psizeof_f.lo 
 psizeof_f.f90
libtool: compile: unrecognized option `-c'
libtool: compile: Try `libtool --help' for more information.
Makefile:2598: recipe for target 'psizeof_f.lo' failed
make[3]: *** [psizeof_f.lo] Error 1

The system has no fortran compiler installed and after a

 yum install gcc-gfortran.ppc64

it builds again. So it seems a fortran compiler is now required.

Adrian


Re: [OMPI devel] esslingen MTT?

2015-08-25 Thread Gilles Gouaillardet

Thanks Adrian,

I fixed this in PR #831 (https://github.com/open-mpi/ompi/pull/831) and
will push it to master shortly.


Best regards,

Gilles






[OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet

Folks,

some time ago, some crashes were reported when using the Java bindings.
one of them was caused by mca_mtl_psm.so.
the root cause is that the libinfinipath.so initializer sets its own signal
handler, which conflicts with the signal handler set by the JVM.
the only workaround is to disable the psm mtl
(e.g. mpirun --mca mtl ^psm ...)
since mpirun --mca mtl_psm_priority 0 ... does not work
(libinfinipath.so is still loaded, so the initializer is run and the signal
handlers are set)

so the psm mtl cannot be disabled by the Java MPI_Init()

one option is to document this
another option is not to build the psm mtl if the Java bindings are built
and another option is to revamp mca_mtl_psm.so so it does not link with
libinfinipath.so
(use an intermediate component, or dlopen libinfinipath)
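
To illustrate why lowering the priority cannot help once the library is loaded, here is a minimal sketch of a load-time initializer. It assumes the initializer behaves like a GCC/Clang constructor function; all names below are hypothetical, SIGUSR1 is an arbitrary choice, and none of this is libinfinipath's actual code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the initializer so callers can see that it already ran. */
static volatile sig_atomic_t initializer_ran = 0;

/* Hypothetical handler standing in for the one a library installs. */
static void demo_handler(int signo)
{
    (void)signo;
}

/* GCC/Clang extension: runs when the object is loaded, before main()
 * (or during dlopen() for a shared library). Hypothetical stand-in for
 * a library initializer. */
__attribute__((constructor))
static void demo_library_init(void)
{
    struct sigaction action;

    action.sa_handler = demo_handler;
    action.sa_flags = 0;
    sigemptyset(&action.sa_mask);
    sigaction(SIGUSR1, &action, NULL);
    initializer_ran = 1;
}

/* Returns 1 if the initializer has already run, 0 otherwise. */
int demo_initializer_ran(void)
{
    return initializer_ran;
}

/* Returns 1 if demo_handler is the current SIGUSR1 handler, 0 otherwise. */
int demo_handler_is_active(void)
{
    struct sigaction current;

    if (sigaction(SIGUSR1, NULL, &current) != 0) {
        return 0;
    }
    return current.sa_handler == demo_handler;
}
```

By the time any selection logic in the program gets to run, the handler is already in place, which matches the observation that a priority setting evaluated after loading comes too late.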

any thoughts ?

Cheers,

Gilles


Re: [OMPI devel] esslingen MTT?

2015-08-25 Thread Jeff Squyres (jsquyres)
+1 -- thanks Adrian.



-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Jeff Squyres (jsquyres)
Is it possible to run-time detect this situation?  E.g., probe the signal 
handler, or somesuch.

Rationale: I'd rather have something run-time disabled than not built.

Would dlopen'ing libinfinipath actually change its signal handler
behavior?
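
For reference, probing the current disposition of a signal is possible with plain POSIX calls. A minimal sketch, assuming that checking one signal at a time is enough (this thread does not say which signals libinfinipath hooks, so the signal number is left to the caller, and both function names are made up for illustration):

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a custom handler is currently installed
 * for signo, 0 if the disposition is SIG_DFL or SIG_IGN, -1 on error.
 * Passing NULL as the new action makes sigaction() a pure query. */
int custom_handler_installed(int signo)
{
    struct sigaction current;

    if (sigaction(signo, NULL, &current) != 0) {
        return -1;
    }
    if (current.sa_flags & SA_SIGINFO) {
        /* Three-argument handlers are stored in sa_sigaction. */
        return 1;
    }
    return (current.sa_handler != SIG_DFL &&
            current.sa_handler != SIG_IGN) ? 1 : 0;
}

/* Hypothetical stand-in for "some library installed a handler". */
static void noop_handler(int signo)
{
    (void)signo;
}

/* Installs noop_handler for signo; returns 0 on success, -1 on error. */
int install_noop_handler(int signo)
{
    struct sigaction action;

    action.sa_handler = noop_handler;
    action.sa_flags = 0;
    sigemptyset(&action.sa_mask);
    return sigaction(signo, &action, NULL);
}
```

A caller could run the query right before and right after loading a component and compare the two answers. It only reports that a handler exists, not who installed it.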




-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
I do not know if this can be detected at run time ...
note we should report this to the Intel folks and ask them to advise.
ideally, they would provide a way to make sure libinfinipath.so does not
conflict with the JVM signal handlers.

my idea is to dlopen libinfinipath only if the Java bindings are not used.



[OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-25 Thread Gilles Gouaillardet
Folks,

I ran some basic tests with IPM, a profiler-like tool
(https://github.com/nerscadmin/IPM), and found that when Fortran calls an
MPI subroutine, the call is accounted twice.
IPM defines both the MPI_* subroutines and their Fortran mpi_*_ counterparts.
Since the ompi Fortran bindings call the MPI_* symbol (and not the PMPI_*
one), and IPM does nothing to prevent double accounting, all subroutines are
accounted twice.
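
The double accounting can be modelled without linking any MPI library. The sketch below is only an illustration under stated assumptions: every function is a hypothetical stand-in (no real MPI or IPM symbol is used), and the `use_pmpi` flag models which C symbol the Fortran binding forwards to.

```c
/* Number of calls recorded by the hypothetical profiling tool. */
static int recorded_calls = 0;

/* Stand-in for PMPI_Send: the real implementation, never intercepted. */
int demo_pmpi_send(void)
{
    return 0;
}

/* Stand-in for the tool's MPI_Send wrapper: count, then forward to PMPI. */
int demo_tool_mpi_send(void)
{
    recorded_calls++;
    return demo_pmpi_send();
}

/* Stand-in for the library's Fortran binding. use_pmpi selects whether it
 * forwards to the PMPI entry point or to the interceptable MPI one. */
int demo_lib_fortran_send(int use_pmpi)
{
    return use_pmpi ? demo_pmpi_send() : demo_tool_mpi_send();
}

/* Stand-in for the tool's mpi_send_ wrapper: count, then forward to the
 * library's Fortran binding. */
int demo_tool_fortran_send(int use_pmpi)
{
    recorded_calls++;
    return demo_lib_fortran_send(use_pmpi);
}

/* Returns how many times one Fortran-level send gets recorded. */
int count_one_fortran_send(int use_pmpi)
{
    recorded_calls = 0;
    demo_tool_fortran_send(use_pmpi);
    return recorded_calls;
}
```

With the binding forwarding to the MPI entry point, `count_one_fortran_send(0)` yields 2; forwarding to the PMPI entry point, `count_one_fortran_send(1)` yields 1.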

what is the rationale for calling MPI_* from fortran instead of PMPI_* ?

basically, I can see three options
1. we do nothing, this is an IPM problem, not an Open MPI one
2. we change ompi to call the PMPI_* symbols
3. we add a configure option to call PMPI_* symbols instead of the MPI_*
ones

any thoughts ?

Cheers,

Gilles


Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-25 Thread Bert Wesarg


One more datapoint, also from a monitor tool (Score-P, as some of you 
know): The Open SHMEM part of Open MPI also calls the MPI interface, not 
the PMPI. That may result in performance data from MPI calls in SHMEM 
applications, which seems weird too.


Bert







--
Dipl.-Inf. Bert Wesarg
wiss. Mitarbeiter

Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
01062 Dresden
Tel.: +49 (351) 463-42451
Fax: +49 (351) 463-37773
E-Mail: bert.wes...@tu-dresden.de





Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Jeff Squyres (jsquyres)
Intel folks: can you comment on this?  It appears that the libinfinipath signal 
handler is interfering with the java garbage collector.




-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Howard Pritchard
I think rather than trying workarounds of dubious robustness inside Open
MPI, we should:

- document the issue on either the somewhat aged Open MPI website FAQ or
add it to a wiki page on GitHub
- file a bug against Intel PSM

--

sent from my smart phonr so no good type.

Howard


Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2362-ge2124c6

2015-08-25 Thread Howard Pritchard
Is this going into v2.x?

--

sent from my smart phonr so no good type.

Howard
On Aug 25, 2015 7:54 AM,  wrote:

> This is an automated email from the git hooks/post-receive script. It was
> generated because a ref change was pushed to the repository containing
> the project "open-mpi/ompi".
>
> The branch, master has been updated
>via  e2124c61fee7bd5a156c90d559ba15f6ded34d53 (commit)
>   from  6f2e8d20737907b474a401d041b5c0b1059e7d3f (commit)
>
> Those revisions listed above that are new to this repository have
> not appeared on any other notification email; so we list those
> revisions in full, below.
>
> - Log -
>
> https://github.com/open-mpi/ompi/commit/e2124c61fee7bd5a156c90d559ba15f6ded34d53
>
> commit e2124c61fee7bd5a156c90d559ba15f6ded34d53
> Author: Jeff Squyres 
> Date:   Tue Aug 25 09:53:25 2015 -0400
>
> README: minor re-flowing on extra-long lines
>
> No other content changes; just re-flowing of long lines.
>
> diff --git a/README b/README
> index 70f251d..6883d1f 100644
> --- a/README
> +++ b/README
> @@ -436,8 +436,8 @@ General Run-Time Support Notes
>  MPI Functionality and Features
>  --
>
> -- Rank reordering support is available using the TreeMatch library. It is activated
> -  for the graph and dist_graph topologies.
> +- Rank reordering support is available using the TreeMatch library. It
> +  is activated for the graph and dist_graph topologies.
>
>  - All MPI-3 functionality is supported.
>
> @@ -532,37 +532,39 @@ MPI Collectives
>MPI process onto Mellanox QDR InfiniBand switch CPUs and HCAs.
>
>  - The "ML" coll component is an implementation of MPI collective
> -  operations that takes advantage of communication hierarchies
> -  in modern systems. A ML collective operation is implemented by
> +  operations that takes advantage of communication hierarchies in
> +  modern systems. A ML collective operation is implemented by
>combining multiple independently progressing collective primitives
>implemented over different communication hierarchies, hence a ML
> -  collective operation is also referred to as a hierarchical collective
> -  operation. The number of collective primitives that are included in a
> -  ML collective operation is a function of subgroups(hierarchies).
> -  Typically, MPI processes in a single communication hierarchy such as
> -  CPU socket, node, or subnet are grouped together into a single subgroup
> -  (hierarchy). The number of subgroups are configurable at runtime,
> -  and each different collective operation could be configured to have
> -  a different of number of subgroups.
> +  collective operation is also referred to as a hierarchical
> +  collective operation. The number of collective primitives that are
> +  included in a ML collective operation is a function of
> +  subgroups(hierarchies).  Typically, MPI processes in a single
> +  communication hierarchy such as CPU socket, node, or subnet are
> +  grouped together into a single subgroup (hierarchy). The number of
> +  subgroups are configurable at runtime, and each different collective
> +  operation could be configured to have a different of number of
> +  subgroups.
>
>The component frameworks and components used by/required for a
>"ML" collective operation.
>
>Frameworks:
> -  * "sbgp" - Provides functionality for grouping processes into subgroups
> +  * "sbgp" - Provides functionality for grouping processes into
> + subgroups
>* "bcol" - Provides collective primitives optimized for a particular
>   communication hierarchy
>
>Components:
> -  * sbgp components - Provides grouping functionality over a CPU socket
> -  ("basesocket"), shared memory ("basesmuma"),
> -  Mellanox's ConnectX HCA ("ibnet"), and other
> -  interconnects supported by PML ("p2p")
> -
> -  * BCOL components - Provides optimized collective primitives for
> -  shared memory ("basesmuma"), Mellanox's ConnectX
> -  HCA ("iboffload"), and other interconnects supported
> -  by PML ("ptpcoll")
> +  * sbgp components - Provides grouping functionality over a CPU
> +  socket ("basesocket"), shared memory
> +  ("basesmuma"), Mellanox's ConnectX HCA
> +  ("ibnet"), and other interconnects supported by
> +  PML ("p2p")
> +  * BCOL components - Provides optimized collective primitives for
> +  shared memory ("basesmuma"), Mellanox's ConnectX
> +  HCA ("iboffload"), and other interconnects
> +  supported by PML ("ptpcoll")
>
>  - The "cuda" coll component provides CUDA-aware support for the
>reduction type collectives with GPU buffers. This component is only
> @@ -100

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Jeff Squyres (jsquyres)
On Aug 25, 2015, at 10:00 AM, Howard Pritchard  wrote:
> 
> I think rather than trying workarounds of dubious robustness inside open mpi 
> we
> 
> - dicument the issue on either the somewhat aged open mpi website faq or add 
> it to a wiki page on github

It should probably be documented in the README and the FAQ.

I'd be against adding user documentation to the wiki -- this would be a 3rd 
place for users to look for information.

> - file a bug against  intel psm 

I'd like to hear what they have to say first... :-)



-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-25 Thread George Bosilca
This seems to be the case only with the TKR interface. All the others are
either calling the OMPI version directly (mpif-h), or are calling some
other internal (or weak-symbol) function.

  George.


>


Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Howard Pritchard
I'll update the java FAQ.

>


Re: [OMPI devel] v1.10.0rc7

2015-08-25 Thread Ralph Castain
Thanks Paul!!

> On Aug 24, 2015, at 11:46 PM, Paul Hargrove  wrote:
> 
> With only the slow qemu-emulated MIPS and ARM testers still running, I can 
> report that I have seen NO issues with rc7.
> 
> -Paul
> 
> On Mon, Aug 24, 2015 at 4:54 PM, Ralph Castain wrote:
> Yet another step in the apparently never-ending quest to release v1.10.0…
> 
> http://www.open-mpi.org/software/ompi/v1.10/ 
> 
> 
> Please check it out
> Ralph
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17832.php 
> 
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov 
> 
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17835.php



Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Nathaniel Graham
What if we modify the mpirun script to include the --mca mtl ^psm tag if
java is in the run string?
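
A minimal sketch of the check proposed above, assuming the launcher can scan its own argument list before it selects components. The helper name run_string_uses_java is hypothetical and is not part of Open MPI; it only reports whether "--mca mtl ^psm" would need to be added.

```c
#include <stddef.h>
#include <string.h>

/* Return the final path component, e.g. "/usr/bin/java" -> "java". */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Hypothetical helper (not an Open MPI function): returns 1 if any
 * argument after argv[0] names the java launcher, 0 otherwise. */
int run_string_uses_java(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(base_name(argv[i]), "java") == 0) {
            return 1;
        }
    }
    return 0;
}
```

Matching on the base name catches both "java" and "/usr/bin/java", but it would miss a JVM started through a wrapper script, so this is only a heuristic.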

-Nathan

On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard 
wrote:

> I'll update the java FAQ.
>
> 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) :
>
>> On Aug 25, 2015, at 10:00 AM, Howard Pritchard 
>> wrote:
>> >
>> > I think rather than trying workarounds of dubious robustness inside
>> open mpi we
>> >
>> > - document the issue on either the somewhat aged open mpi website faq
>> or add it to a wiki page on github
>>
>> It should probably be documented in the README and the FAQ.
>>
>> I'd be against adding user documentation to the wiki -- this would be a
>> 3rd place for users to look for information.
>>
>> > - file a bug against  intel psm
>>
>> I'd like to hear what they have to say first... :-)
>>
>> >
>> > --
>> >
>> > sent from my smart phonr so no good type.
>> >
>> > Howard
>> >
>> > On Aug 25, 2015 6:02 AM, "Gilles Gouaillardet" <
>> gilles.gouaillar...@gmail.com> wrote:
>> > i do not know if this can be runtime detected ...
>> > note we should report this to intel folks and ask them to advise.
>> > ideally, they would provide a way to make sure libinfinipath.so does
>> not conflict with the jvm signal handlers.
>> >
>> > my idea is to dlopen libinfinipath only if java bindings are not used.
>> >
>> > On Tuesday, August 25, 2015, Jeff Squyres (jsquyres) <
>> jsquy...@cisco.com> wrote:
>> > Is it possible to run-time detect this situation?  E.g., probe the
>> signal handler, or somesuch.
>> >
>> > Rationale: I'd rather have something run-time disabled than not built.
>> >
>> > Would dlopen'ing libinfinipath change actually change its signal
>> handler behavior?
>> >
>> >
>> > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet 
>> wrote:
>> > >
>> > > Folks,
>> > >
>> > > some time ago, some crashes were reported when using java bindings.
>> > > one of them was caused by mca_mtl_psm.so.
>> > > the root cause is that the libinfinipath.so initializer sets its own
>> > > signal handler, which conflicts with the signal handler set by the jvm.
>> > > the only workaround is to disable the psm mtl
>> > > (e.g. mpirun --mca mtl ^psm ...)
>> > > since mpirun --mca mtl_psm_priority 0 ... does not work
>> > > (libinfinipath.so is loaded, so the initializer is run and the signal
>> > > handlers are set)
>> > > so the psm mtl cannot be disabled by the Java MPI_Init()
>> > >
>> > > one option is to document this
>> > > another option is not to build the psm mtl if java bindings are built
>> > > and another option is to revamp mca_mtl_psm.so so it does not link
>> > > with libinfinipath.so
>> > > (use an intermediate component, or dlopen libinfinipath)
>> > >
>> > > any thoughts ?
>> > >
>> > > Cheers,
>> > >
>> > > Gilles
>> > > ___
>> > > devel mailing list
>> > > de...@open-mpi.org
>> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/08/17838.php
>> >
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com
>> > For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/08/17840.php
>> >
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/08/17841.php
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/08/17845.php
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/08/17847.php
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/08/17849.php
>


Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Ralph Castain
We’re looking at this off-list. It would be preferable not to disable PSM if we 
can avoid it.

> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham  wrote:
> 
> What if we modify the mpirun script to include the --mca mtl ^psm tag if java 
> is in the run string?
> 
> -Nathan
> 

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Ralph Castain
Gilles: which version of PSM were you using, and with which cards?


> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham  wrote:
> 
> What if we modify the mpirun script to include the --mca mtl ^psm tag if java 
> is in the run string?
> 
> -Nathan
> 

[OMPI devel] cosmetic misleading mpirun error message

2015-08-25 Thread Cabral, Matias A
Hi,

Playing with the 1.10.0 (just released) build I found a cosmetic misleading 
error message in mpirun. If by mistake you type -hosts (with an extra  "s"), 
the error message complains about an unknown "-o" option that is actually not 
being used. Typing the parameters correctly fixes the issue :)

m> mpirun --allow-run-as-root -hosts m7,m8 -np 2  osu_latency
mpirun: Error: unknown option "-o"
Type 'mpirun --help' for usage.
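
One plausible reading of the "-o" in that message, which is an assumption and not something this thread confirms about mpirun's parser: a single-dash argument is treated as a bundle of one-letter options, so "-hosts" is read as -h -o -s -t -s and the first unrecognized letter is reported. A minimal sketch of that behaviour, with a hypothetical function name and a made-up set of known letters:

```c
#include <string.h>

/* Hypothetical sketch: treat a single-dash argument as a bundle of
 * one-letter options and return the first letter that is not listed in
 * `known`, or '\0' if every letter is recognized (or if the argument is
 * not a single-dash argument at all). */
char first_unknown_short_option(const char *arg, const char *known)
{
    if (arg[0] != '-' || arg[1] == '-') {
        return '\0';
    }
    for (const char *p = arg + 1; *p != '\0'; p++) {
        if (strchr(known, *p) == NULL) {
            return *p;
        }
    }
    return '\0';
}
```

If only "h" is a known letter, "-hosts" yields 'o', which would match the error text above.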

Thanks,
Regards,


_MAC



Re: [OMPI devel] cosmetic misleading mpirun error message

2015-08-25 Thread Jeff Squyres (jsquyres)
Fair point.

I don't know if there's an easy way to fix that, though.


> On Aug 25, 2015, at 6:01 PM, Cabral, Matias A  
> wrote:
> 
> Hi,
> 
> Playing with the 1.10.0 (just released) build I found a cosmetic misleading 
> error message in mpirun. If by mistake you type -hosts (with an extra "s"), 
> the error message complains about an unknown "-o" option that is actually not 
> being used. Typing the parameters correctly fixes the issue :)
> 
> m> mpirun --allow-run-as-root -hosts m7,m8 -np 2  osu_latency
> mpirun: Error: unknown option "-o"
> Type 'mpirun --help' for usage.
> 
> Thanks,
> Regards,
> 
> _MAC
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17854.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*

2015-08-25 Thread Jeff Squyres (jsquyres)
On Aug 25, 2015, at 11:03 AM, George Bosilca  wrote:
> 
> This seems to be the case only with the TKR interface. All the others are 
> either calling the OMPI version directly (mpif-h), or are calling some other 
> internal (or weak symbol function).

Yes, those might need to be updated.  Not it!  (let's let the TKR interface 
die...)

You're right about the mpif-h interface, though -- they call the PMPI versions 
of the functions (through weak symbols).

However, our use of weak symbols might be confusing the tool -- is it 
somehow intercepting our call from ompi_send_f() to PMPI_Send(), for example?  
You might want to step through with a debugger to see what's happening, because 
the debugger should show the name of the symbol that is actually invoked in the 
call stack, even though the source code may make it look like you are in 
"MPI_Send()" (remember: we potentially compile the C code for our functions 
with #defines that turn MPI_Send into PMPI_Send, etc.).
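
A minimal sketch of the mechanism described above, assuming GCC or Clang on an ELF platform such as Linux. The names MPI_Demo_send and PMPI_Demo_send are hypothetical stand-ins so that nothing collides with a real MPI library, and the function body is a placeholder.

```c
/* Compile the implementation under its profiling name: the source says
 * MPI_Demo_send, but after preprocessing the symbol is PMPI_Demo_send. */
#define MPI_Demo_send PMPI_Demo_send
int MPI_Demo_send(int count)
{
    return count * 2;   /* placeholder for the real work */
}
#undef MPI_Demo_send

/* Export the public name as a weak alias for the profiling symbol.
 * A tool can override MPI_Demo_send with its own strong definition and
 * still reach the implementation through PMPI_Demo_send. */
int MPI_Demo_send(int count) __attribute__((weak, alias("PMPI_Demo_send")));
```

With this layout a debugger stopped inside the function reports PMPI_Demo_send, even though the source line reads MPI_Demo_send.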

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
I run on a centos 7 vm, with the OFED that comes with centos
(I will send full details tomorrow)
there is no psm hardware, just the infinipath libs

a first trivial workaround in ompi would be to
putenv("OMPI_MCA_mtl_psm_priority=0")
in the java binding before invoking ompi_mpi_init,
but that cannot work because libinfinipath is dlopen'ed and its signal
handler is set
also, I guess putenv("OMPI_MCA_mtl=^psm") would not work if ompi was
configure'd with --disable-dlopen
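
A minimal sketch of the second variant, assuming it runs before any MCA component has been opened. The function name is hypothetical, and as noted above it probably cannot help in a --disable-dlopen build, where the library is already loaded at startup.

```c
#include <stdlib.h>

/* Hypothetical helper (not an Open MPI function): exclude the psm mtl
 * through the environment before MPI initialization.
 * Returns 0 on success, -1 on failure. */
int exclude_psm_mtl_before_init(void)
{
    /* overwrite = 0 keeps any value the user has already exported. */
    return setenv("OMPI_MCA_mtl", "^psm", 0);
}
```

setenv is used here in place of putenv so that a selection already made by the user is not overwritten.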

Cheers,

Gilles

On Wednesday, August 26, 2015, Ralph Castain  wrote:

> Gilles: what version of PSM were you using? and with which cards?
>
>
> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham wrote:
>
> What if we modify the mpirun script to include the --mca mtl ^psm tag if
> java is in the run string?
>
> -Nathan
>

Re: [OMPI devel] cosmetic misleading mpirun error message

2015-08-25 Thread Gilles Gouaillardet
would it be easier if the option was --host instead of -host ?
I guess changing the cli is not an option for the v1.x series, so what
about adding a -hosts option (as an alias for the -host option) ?
I made the same mistake a few times before; adding an s to host looks more
intuitive to me.
my 0.02 US$

Gilles

On Wednesday, August 26, 2015, Jeff Squyres (jsquyres) 
wrote:

> Fair point.
>
> I don't know if there's an easy way to fix that, though.
>
>
> > On Aug 25, 2015, at 6:01 PM, Cabral, Matias A wrote:
> >
> > Hi,
> >
> > Playing with the 1.10.0 (just released) build I found a cosmetic
> > misleading error message in mpirun. If by mistake you type -hosts (with
> > an extra "s"), the error message complains about an unknown "-o" option
> > that is actually not being used. Typing the parameters correctly fixes
> > the issue :)
> >
> > m> mpirun --allow-run-as-root -hosts m7,m8 -np 2  osu_latency
> > mpirun: Error: unknown option "-o"
> > Type 'mpirun --help' for usage.
> >
> > Thanks,
> > Regards,
> >
> > _MAC
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org 
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/08/17854.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com 
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/08/17855.php


Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Paul Hargrove
Gilles,

Is the conflict over "SIG32"?
If so, I believe setenv PSM_RCVTHREAD=0 in the environment will disable
InfiniPath's use of that signal.

-Paul

On Tue, Aug 25, 2015 at 6:02 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> i run on a centos 7 vm, and with the OFED that comes with centos
> (I will send full details tomorrow)
> there is no psm hardware, just infinipath libs
>
> a first trivial workaround in ompi would be to
> putenv("OMPI_MCA_mtl_psm_priority=0")
> in the java binding before invoking ompi_mpi_init,
> but that cannot work because libinfinipath is dlopen'ed and its signal
> handler is set
> also, I guess putenv("OMPI_MCA_mtl=^psm") would not work if ompi was
> configure'd with --disable-dlopen
>
> Cheers,
>
> Gilles
>
>
> On Wednesday, August 26, 2015, Ralph Castain  wrote:
>
>> Gilles: what version of PSM were you using? and with which cards?
>>
>>
>> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham 
>> wrote:
>>
>> What if we modify the mpirun script to include the --mca mtl ^psm tag if
>> java is in the run string?
>>
>> -Nathan
>>

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Gilles Gouaillardet
Thanks Paul,

I will give it a try

Cheers,

Gilles

On Wednesday, August 26, 2015, Paul Hargrove  wrote:

> Gilles,
>
> Is the conflict over "SIG32"?
> If so, I believe setenv PSM_RCVTHREAD=0 in the environment will disable
> InfiniPath's use of that signal.
>
> -Paul
>
> On Tue, Aug 25, 2015 at 6:02 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> i run on a centos 7 vm, and with the OFED that comes with centos
>> (I will send full details tomorrow)
>> there is no psm hardware, just infinipath libs
>>
>> a first trivial workaround in ompi would be to
>> putenv("OMPI_MCA_mtl_psm_priority=0")
>> in the java binding before invoking ompi_mpi_init,
>> but that cannot work because libinfinipath is dlopen'ed and its signal
>> handler is set
>> also, I guess putenv("OMPI_MCA_mtl=^psm") would not work if ompi was
>> configure'd with --disable-dlopen
>>
>> Cheers,
>>
>> Gilles
>>
>>
>> On Wednesday, August 26, 2015, Ralph Castain wrote:
>>
>>> Gilles: what version of PSM were you using? and with which cards?
>>>
>>>
>>> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham 
>>> wrote:
>>>
>>> What if we modify the mpirun script to include the --mca mtl ^psm tag if
>>> java is in the run string?
>>>
>>> -Nathan
>>>
>>> On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard 
>>> wrote:
>>>
 I'll update the java FAQ.

 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) :

> On Aug 25, 2015, at 10:00 AM, Howard Pritchard 
> wrote:
> >
> > I think rather than trying workarounds of dubious robustness inside
> open mpi we
> >
> > - dicument the issue on either the somewhat aged open mpi website
> faq or add it to a wiki page on github
>
> It should probably be documented in the README and the FAQ.
>
> I'd be against adding user documentation to the wiki -- this would be
> a 3rd place for users to look for information.
>
> > - file a bug against  intel psm
>
> I'd like to hear what they have to say first... :-)
>
> >
> > --
> >
> > sent from my smart phonr so no good type.
> >
> > Howard
> >
> > On Aug 25, 2015 6:02 AM, "Gilles Gouaillardet" <
> gilles.gouaillar...@gmail.com> wrote:
> > i do not know if this can be runtime detected ...
> > note we should report this to intel folks and ask them to advise.
> > ideally, they would provide a way to make sure libinfinipath.so does
> not conflict with the jvm signal handlers.
> >
> > my idea is to dlopen libinfinipath only if java bindings are not
> used.
> >
> > On Tuesday, August 25, 2015, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > Is it possible to run-time detect this situation?  E.g., probe the
> signal handler, or somesuch.
> >
> > Rationale: I'd rather have something run-time disabled than not
> built.
> >
> > Would dlopen'ing libinfinipath change actually change its signal
> handler behavior?
> >
> >
> > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet <
> gil...@rist.or.jp> wrote:
> > >
> > > Folks,
> > >
> > > some time ago, some crashes were reported when using java bindings.
> > > one of them was caused by mca_mtl_psm.so.
> > > the root cause is that the libinfinipath.so initializer sets its own signal
> handler, which
> > > conflicts with the signal handler set by the jvm.
> > > the only workaround is to disable the psm mtl
> > > (e.g. mpirun --mca mtl ^psm ...)
> > > since mpirun --mca mtl_psm_priority 0 ... does not work
> > > (libinfinipath.so is loaded, so the initializer is run and the
> signal handlers are set)
> > > so the psm mtl cannot be disabled by the Java MPI_Init()
> > >
> > > one option is to document this
> > > another option is not to build the psm mtl if java bindings are
> built
> > > and another option is to revamp mca_mtl_psm.so so it does not
> link with libinfinipath.so
> > > (use an intermediate component, or dlopen libinfinipath)
> > >
> > > any thoughts ?
> > >
> > > Cheers,
> > >
> > > Gilles
> > > ___
> > > devel mailing list
> > > de...@open-mpi.org
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/08/17838.php
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/08/17840.php
> >
> > __
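
Jeff's question in the thread above, whether the situation can be detected at run time by probing the signal handler, can be illustrated with a small sketch. It is written in Python purely for illustration; a real check inside Open MPI would be done in C, presumably by querying sigaction() without installing a new action. The two handler functions are hypothetical stand-ins (no JVM or libinfinipath.so is involved), and SIGUSR1 is used only so the example is harmless to run. Note also that signal.getsignal() only sees handlers installed through Python's own signal module.

```python
import signal


def jvm_handler(signum, frame):
    """Hypothetical stand-in for a handler the JVM installed first."""


def library_handler(signum, frame):
    """Hypothetical stand-in for a handler set by a library initializer."""


def handler_replaced(signum, expected):
    """Return True if the handler now installed for signum is not `expected`."""
    return signal.getsignal(signum) is not expected


def demo(signum=signal.SIGUSR1):
    # The "JVM" installs its handler; remember what was there before.
    original = signal.signal(signum, jvm_handler)
    try:
        before = handler_replaced(signum, jvm_handler)
        # The "library initializer" silently overrides the JVM's handler.
        signal.signal(signum, library_handler)
        after = handler_replaced(signum, jvm_handler)
        return before, after
    finally:
        # Restore the previous disposition (None means "not set from Python").
        signal.signal(signum, original if original is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(demo())
```

The probe only tells you that the handler changed, not who changed it, so it would at best let a component warn or disable itself after the fact; it does not prevent the initializer from running.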

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Howard Pritchard
which off-list are we talking about?
very annoying.


2015-08-25 10:38 GMT-06:00 Ralph Castain :

> We’re looking at this off-list. It would be preferable not to disable PSM
> if we can avoid it
>
> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham 
> wrote:
>
> What if we modify the mpirun script to include the --mca mtl ^psm tag if
> java is in the run string?
>
> -Nathan
>
> On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard 
> wrote:
>
>> I'll update the java FAQ.
>>
>> 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres) :
>>
>>> On Aug 25, 2015, at 10:00 AM, Howard Pritchard 
>>> wrote:
>>> >
>>> > I think rather than trying workarounds of dubious robustness inside
>>> open mpi we
>>> >
>>> > - document the issue on either the somewhat aged open mpi website faq
>>> or add it to a wiki page on github
>>>
>>> It should probably be documented in the README and the FAQ.
>>>
>>> I'd be against adding user documentation to the wiki -- this would be a
>>> 3rd place for users to look for information.
>>>
>>> > - file a bug against  intel psm
>>>
>>> I'd like to hear what they have to say first... :-)
>>>
>>> >
>>> > --
>>> >
>>> > sent from my smart phonr so no good type.
>>> >
>>> > Howard
>>> >
>>> > On Aug 25, 2015 6:02 AM, "Gilles Gouaillardet" <
>>> gilles.gouaillar...@gmail.com> wrote:
>>> > i do not know if this can be runtime detected ...
>>> > note we should report this to intel folks and ask them to advise.
>>> > ideally, they would provide a way to make sure libinfinipath.so does
>>> not conflict with the jvm signal handlers.
>>> >
>>> > my idea is to dlopen libinfinipath only if java bindings are not used.
>>> >
>>> > On Tuesday, August 25, 2015, Jeff Squyres (jsquyres) <
>>> jsquy...@cisco.com> wrote:
>>> > Is it possible to run-time detect this situation?  E.g., probe the
>>> signal handler, or somesuch.
>>> >
>>> > Rationale: I'd rather have something run-time disabled than not built.
>>> >
>>> > Would dlopen'ing libinfinipath actually change its signal
>>> handler behavior?
>>> >
>>> >
>>> > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet 
>>> wrote:
>>> > >
>>> > > Folks,
>>> > >
>>> > > some time ago, some crashes were reported when using java bindings.
>>> > > one of them was caused by mca_mtl_psm.so.
>>> > > the root cause is that the libinfinipath.so initializer sets its own signal
>>> handler, which
>>> > > conflicts with the signal handler set by the jvm.
>>> > > the only workaround is to disable the psm mtl
>>> > > (e.g. mpirun --mca mtl ^psm ...)
>>> > > since mpirun --mca mtl_psm_priority 0 ... does not work
>>> > > (libinfinipath.so is loaded, so the initializer is run and the
>>> signal handlers are set)
>>> > > so the psm mtl cannot be disabled by the Java MPI_Init()
>>> > >
>>> > > one option is to document this
>>> > > another option is not to build the psm mtl if java bindings are
>>> built
>>> > > and another option is to revamp mca_mtl_psm.so so it does not link
>>> with libinfinipath.so
>>> > > (use an intermediate component, or dlopen libinfinipath)
>>> > >
>>> > > any thoughts ?
>>> > >
>>> > > Cheers,
>>> > >
>>> > > Gilles
>>> > > ___
>>> > > devel mailing list
>>> > > de...@open-mpi.org
>>> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> > > Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/08/17838.php
>>> >
>>> >
>>> > --
>>> > Jeff Squyres
>>> > jsquy...@cisco.com
>>> > For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>> >
>>> > ___
>>> > devel mailing list
>>> > de...@open-mpi.org
>>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> > Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/08/17840.php
>>> >
>>> > ___
>>> > devel mailing list
>>> > de...@open-mpi.org
>>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> > Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/08/17841.php
>>> > ___
>>> > devel mailing list
>>> > de...@open-mpi.org
>>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> > Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/08/17845.php
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/08/17847.php
>>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link t
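
Nathaniel's suggestion quoted above (add --mca mtl ^psm when java is in the run string) could look roughly like the following Python sketch. The function name add_psm_exclusion is hypothetical, and the sketch is a standalone wrapper over an argument list, not a change to mpirun itself. The only option it uses, --mca mtl ^psm, is the workaround Gilles gives in the original post. Ralph's reply says it would be preferable not to disable PSM at all, so this only illustrates the mechanics of the idea.

```python
import os


def add_psm_exclusion(argv):
    """Return a copy of an mpirun command line with `--mca mtl ^psm` added
    when a `java` launcher appears among the arguments.

    argv[0] is assumed to be the mpirun executable itself.
    """
    exclusion = ["--mca", "mtl", "^psm"]

    # "java is in the run string": match the launcher by basename so that
    # both `java` and `/usr/bin/java` are recognised.
    has_java = any(os.path.basename(arg) == "java" for arg in argv[1:])

    # Do not add the option twice if the user already passed it.
    already_excluded = any(
        argv[i:i + 3] == exclusion for i in range(len(argv) - 2)
    )

    if not has_java or already_excluded:
        return list(argv)
    return [argv[0], *exclusion, *argv[1:]]


if __name__ == "__main__":
    print(add_psm_exclusion(["mpirun", "-np", "2", "java", "Hello"]))
```

Matching on the word java is exactly the kind of heuristic Howard calls a workaround of dubious robustness: it misses a JVM started through a script and silently changes the transport for every Java job.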

Re: [OMPI devel] mca_mtl_psm and java

2015-08-25 Thread Ralph Castain
Sorry - but there are some discussions that cannot and should not take place on 
a public mailing list. As a former corporate person yourself, you should 
understand :-)

> On Aug 25, 2015, at 6:56 PM, Howard Pritchard  wrote:
> 
> which off-list are we talking about?
> very annoying.
> 
> 
> 2015-08-25 10:38 GMT-06:00 Ralph Castain:
> We’re looking at this off-list. It would be preferable not to disable PSM if 
> we can avoid it
> 
>> On Aug 25, 2015, at 9:32 AM, Nathaniel Graham wrote:
>> 
>> What if we modify the mpirun script to include the --mca mtl ^psm tag if 
>> java is in the run string?
>> 
>> -Nathan
>> 
>> On Tue, Aug 25, 2015 at 9:47 AM, Howard Pritchard wrote:
>> I'll update the java FAQ.
>> 
>> 2015-08-25 8:36 GMT-06:00 Jeff Squyres (jsquyres):
>> On Aug 25, 2015, at 10:00 AM, Howard Pritchard wrote:
>> >
>> > I think rather than trying workarounds of dubious robustness inside open 
>> > mpi we
>> >
>> > - document the issue on either the somewhat aged open mpi website faq or
>> > add it to a wiki page on github
>> 
>> It should probably be documented in the README and the FAQ.
>> 
>> I'd be against adding user documentation to the wiki -- this would be a 3rd 
>> place for users to look for information.
>> 
>> > - file a bug against  intel psm
>> 
>> I'd like to hear what they have to say first... :-)
>> 
>> >
>> > --
>> >
>> > sent from my smart phonr so no good type.
>> >
>> > Howard
>> >
>> > On Aug 25, 2015 6:02 AM, "Gilles Gouaillardet" <gilles.gouaillar...@gmail.com> wrote:
>> > i do not know if this can be runtime detected ...
>> > note we should report this to intel folks and ask them to advise.
>> > ideally, they would provide a way to make sure libinfinipath.so does not 
>> > conflict with the jvm signal handlers.
>> >
>> > my idea is to dlopen libinfinipath only if java bindings are not used.
>> >
>> > On Tuesday, August 25, 2015, Jeff Squyres (jsquyres) wrote:
>> > Is it possible to run-time detect this situation?  E.g., probe the signal 
>> > handler, or somesuch.
>> >
>> > Rationale: I'd rather have something run-time disabled than not built.
>> >
>> > Would dlopen'ing libinfinipath actually change its signal handler
>> > behavior?
>> >
>> >
>> > > On Aug 25, 2015, at 4:27 AM, Gilles Gouaillardet wrote:
>> > >
>> > > Folks,
>> > >
>> > > some time ago, some crashes were reported when using java bindings.
>> > > one of them was caused by mca_mtl_psm.so.
>> > > the root cause is that the libinfinipath.so initializer sets its own signal
>> > > handler, which
>> > > conflicts with the signal handler set by the jvm.
>> > > the only workaround is to disable the psm mtl
>> > > (e.g. mpirun --mca mtl ^psm ...)
>> > > since mpirun --mca mtl_psm_priority 0 ... does not work
>> > > (libinfinipath.so is loaded, so the initializer is run and the signal
>> > > handlers are set)
>> > > so the psm mtl cannot be disabled by the Java MPI_Init()
>> > >
>> > > one option is to document this
>> > > another option is not to build the psm mtl if java bindings are built
>> > > and another option is to revamp mca_mtl_psm.so so it does not link with
>> > > libinfinipath.so
>> > > (use an intermediate component, or dlopen libinfinipath)
>> > >
>> > > any thoughts ?
>> > >
>> > > Cheers,
>> > >
>> > > Gilles
>> > > ___
>> > > devel mailing list
>> > > de...@open-mpi.org 
>> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>> > > 
>> > > Link to this post: 
>> > > http://www.open-mpi.org/community/lists/devel/2015/08/17838.php 
>> > > 
>> >
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com 
>> > For corporate legal information go to: 
>> > http://www.cisco.com/web/about/doing_business/legal/cri/ 
>> > 
>> >
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org 
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>> > 
>> > Link to this post: 
>> > http://www.open-mpi.org/community/lists/devel/2015/08/17840.php 
>> > 
>> >
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org 
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>> > 
>> > Link to this post: 
>> > http://www.open-mpi.org/community/lists/