[OMPI devel] osu_mbw_mr error

2014-11-03 Thread Alina Sklarevich
Hi,

On 1.8.4rc1 we observe the following assert in the osu_mbw_mr test when
using the openib BTL.

When compiled in production mode (i.e. no --enable-debug) the test simply
hangs.

When using either the tcp BTL or the cm PML, the benchmark completes
without error.

The command line to reproduce this is:

$ mpirun --bind-to core -display-map -mca btl_openib_if_include mlx5_0:1
-np 2 -mca pml ob1 -mca btl openib,self,sm ./osu_mbw_mr

# OSU MPI Multiple Bandwidth / Message Rate Test v4.4
# [ pairs: 1 ] [ window size: 64 ]
# Size                  MB/s        Messages/s
osu_mbw_mr: ../../../../opal/class/opal_list.h:547: _opal_list_append:
Assertion `0 == item->opal_list_item_refcount' failed.
[vegas15:30395] *** Process received signal ***
[vegas15:30395] Signal: Aborted (6)
[vegas15:30395] Signal code:  (-6)
[vegas15:30395] [ 0] /lib64/libpthread.so.0[0x30bc40f500]
[vegas15:30395] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x30bc0328a5]
[vegas15:30395] [ 2] /lib64/libc.so.6(abort+0x175)[0x30bc034085]
[vegas15:30395] [ 3] /lib64/libc.so.6[0x30bc02ba1e]
[vegas15:30395] [ 4]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x30bc02bae0]
[vegas15:30395] [ 5]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(+0x9087)[0x73f70087]
[vegas15:30395] [ 6]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_alloc+0x403)[0x73f754b3]
[vegas15:30395] [ 7]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_sendi+0xf9e)[0x73f785b4]
[vegas15:30395] [ 8]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xed08)[0x73308d08]
[vegas15:30395] [ 9]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xf8ba)[0x733098ba]
[vegas15:30395] [10]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_isend+0x108)[0x73309a1f]
[vegas15:30395] [11]
/labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/libmpi.so.1(MPI_Isend+0x2ec)[0x77cff5e8]
[vegas15:30395] [12]
/hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400fa4]
[vegas15:30395] [13]
/hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x40167d]
[vegas15:30395] [14] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30bc01ecdd]
[vegas15:30395] [15]
/hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400db9]
[vegas15:30395] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 30395 on node vegas15 exited on
signal 6 (Aborted).
--


Thanks,
Alina.


Re: [OMPI devel] osu_mbw_mr error

2014-11-03 Thread Ralph Castain
Can you please let me know when you fix this? I intend to release 1.8.4 by the 
end of the week. Since Mellanox is the only member with IB, you folks have been 
maintaining this BTL.


> On Nov 3, 2014, at 6:26 AM, Alina Sklarevich  
> wrote:
> 
> Hi,
> 
> On 1.8.4rc1 we observe the following assert in the osu_mbw_mr test when using 
> the openib BTL.
> 
> When compiled in production mode (i.e. no --enable-debug) the test simply 
> hangs.
> 
> When using either the tcp BTL or the cm PML, the benchmark completes without 
> error.
> 
> The command line to reproduce this is:
> 
> $ mpirun --bind-to core -display-map -mca btl_openib_if_include mlx5_0:1 -np 
> 2 -mca pml ob1 -mca btl openib,self,sm ./osu_mbw_mr
> 
> # OSU MPI Multiple Bandwidth / Message Rate Test v4.4
> # [ pairs: 1 ] [ window size: 64 ]
> # Size                  MB/s        Messages/s
> osu_mbw_mr: ../../../../opal/class/opal_list.h:547: _opal_list_append: 
> Assertion `0 == item->opal_list_item_refcount' failed.
> [vegas15:30395] *** Process received signal ***
> [vegas15:30395] Signal: Aborted (6)
> [vegas15:30395] Signal code:  (-6)
> [vegas15:30395] [ 0] /lib64/libpthread.so.0[0x30bc40f500]
> [vegas15:30395] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x30bc0328a5]
> [vegas15:30395] [ 2] /lib64/libc.so.6(abort+0x175)[0x30bc034085]
> [vegas15:30395] [ 3] /lib64/libc.so.6[0x30bc02ba1e]
> [vegas15:30395] [ 4] /lib64/libc.so.6(__assert_perror_fail+0x0)[0x30bc02bae0]
> [vegas15:30395] [ 5] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(+0x9087)[0x73f70087]
> [vegas15:30395] [ 6] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_alloc+0x403)[0x73f754b3]
> [vegas15:30395] [ 7] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_sendi+0xf9e)[0x73f785b4]
> [vegas15:30395] [ 8] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xed08)[0x73308d08]
> [vegas15:30395] [ 9] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xf8ba)[0x733098ba]
> [vegas15:30395] [10] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_isend+0x108)[0x73309a1f]
> [vegas15:30395] [11] 
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/libmpi.so.1(MPI_Isend+0x2ec)[0x77cff5e8]
> [vegas15:30395] [12] 
> /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400fa4]
> [vegas15:30395] [13] 
> /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x40167d]
> [vegas15:30395] [14] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30bc01ecdd]
> [vegas15:30395] [15] 
> /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400db9]
> [vegas15:30395] *** End of error message ***
> --
> mpirun noticed that process rank 0 with PID 30395 on node vegas15 exited on 
> signal 6 (Aborted).
> --
> 
> 
> Thanks,
> Alina.
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16142.php



Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Friedley, Andrew
Thanks Jeff, we're working to verify.  I don't mind the slower behavior on the 
trunk; we are only concerned with the stable release series.  Enabling thread 
safety options on the trunk/master is no problem here.

Did the 'more expensive freelists' change make it to the release series as well?

Andrew

> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff
> Squyres (jsquyres)
> Sent: Saturday, November 1, 2014 4:49 AM
> To: Open MPI Developers List
> Subject: Re: [OMPI devel] enable-smp-locks affects PSM performance
> 
> Andrew --
> 
> A PR went in on the v1.8 branch that removed the locking behavior by
> default:
> 
> https://github.com/open-mpi/ompi-release/pull/67
> 
> Can you confirm that the behavior on the v1.8 nightly tarballs show better
> PSM message rate behavior?
> 
> (http://www.open-mpi.org/nightly/v1.8/ -- 135 and beyond have the fix)
> 
> For master, this was actually exactly the intended result: at a previous OMPI
> dev meeting, we said "let's turn on threading, and see who screams."  You
> screamed.  :-)
> 
> In a conversation with Nathan on Friday, it looks like many things in
> 6ef938de3fa9ca0fed2c5bcb0736f65b0d8803af get more expensive with
> threading enabled -- particularly freelists.  It's quite possible that we 
> should
> change many of these things to be more expensive only a) if
> THREAD_MULTIPLE is active, and/or b) if progress threads are active.
> 
> However, that may not be safe in itself.  Perhaps we should have multiple
> flavors of these types (like freelists) that are both thread safe and thread
> unsafe.  So even if we're in a THREAD_MULTIPLE and/or progress threads are
> active, if you *know* you've got a data structure that does not need the
> added thread-safety protections, you can use the unsafe-but-faster
> versions.
> 
> 
> 
> 
> On Oct 30, 2014, at 12:51 PM, Friedley, Andrew
>  wrote:
> 
> > Hi Howard,
> >
> > No, we do not, just the OMPI default.
> >
> > This post isn't so much about our packaging requirements, as about default
> behavior in upstream Open MPI.  We'd like performance to be good by
> default for anyone compiling their own Open MPI (and using PSM).
> >
> > Andrew
> >
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Howard
> Pritchard
> > Sent: Thursday, October 30, 2014 8:53 AM
> > To: Open MPI Developers
> > Subject: Re: [OMPI devel] enable-smp-locks affects PSM performance
> >
> > Hi Andrew,
> >
> > In your distribution of ompi, do you include versions of ompi to support
> > different MPI thread safety levels?  In particular, do you include a library
> > which supports MPI_THREAD_MULTIPLE?
> > Just trying to better understand the requirements of your ompi package in
> > terms of MPI thread safety.
> >
> > Thanks,
> >
> > Howard
> >
> >
> > 2014-10-30 8:10 GMT-06:00 Friedley, Andrew
> :
> > Hi,
> >
> > I'm reporting a performance (message rate 16%, latency 3%) regression
> when using PSM that occurred between OMPI v1.6.5 and v1.8.1.  I would
> guess it affects other networks too, but I haven't tested.  The problem stems
> from the --enable-smp-locks and --enable-opal-multi-threads options.
> >
> > --enable-smp-locks defaults to enabled and, on x86, causes a 'lock' prefix 
> > to
> be prepended to ASM instructions used by atomic primitives.  Disabling
> removes the 'lock' prefix.
> >
> > In OMPI 1.6.5, --enable-opal-multi-threads defaulted to disabled.  When
> enabled, OPAL would be compiled with multithreading support, which
> included compiling in calls to atomic primitives.  Those atomic primitives, in
> turn, potentially use a lock prefix (controlled by --enable-smp-locks).
> >
> > SVN r29891 on the trunk changed the above.  --enable-opal-multi-threads
> was removed.  CPP macros (#if OPAL_ENABLE_MULTI_THREADS) controlling
> various calls to atomic primitives were removed, effectively changing the
> default behavior to multithreading ON for OPAL.  This change was then
> carried to the v1.7 branch in r29944, Fixes #3983.
> >
> > We can use --disable-smp-locks to make the performance regression go
> away for the builds we ship, but we'd very much prefer if performance was
> good 'out of the box' for people that grab an OMPI tarball and use it with
> PSM.
> >
> > My question is, what's the best way to do that?  It seems obvious to just
> make --disable-smp-locks the default, but I presume the change was done
> on purpose, so I'm looking for community feedback.
> >
> > Thanks,
> >
> > Andrew
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-
> mpi.org/community/lists/devel/2014/10/16130.php
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-
> mpi.org/community/lists/devel/2014/10/1613
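
For reference, a minimal, illustrative sketch of the kind of x86 atomic the
quoted report above describes (GCC-style inline assembly; this is not Open
MPI's actual OPAL atomic macro). Building with --enable-smp-locks corresponds
to emitting the "lock" prefix, which makes the read-modify-write atomic across
CPUs at some cost; --disable-smp-locks drops it:

    #include <stdint.h>

    /* illustrative only */
    static inline void atomic_add_32(volatile int32_t *addr, int32_t delta)
    {
        __asm__ __volatile__("lock; addl %1,%0"
                             : "+m" (*addr)
                             : "ir" (delta)
                             : "memory", "cc");
    }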

Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Jeff Squyres (jsquyres)
On Nov 3, 2014, at 10:31 AM, Friedley, Andrew  wrote:

> Thanks Jeff, we're working to verify.  I don't mind the slower behavior on 
> the trunk; we are only concerned with the stable release series.  Enabling 
> thread safety options on the trunk/master is no problem here.
> 
> Did the 'more expensive freelists' change make it to the release series as 
> well?

Yes.  And all of that should now be reverted.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Dave Goodell (dgoodell)
On Nov 1, 2014, at 3:44 AM, Gilles Gouaillardet  
wrote:

> Hi Dave,
> 
> I am sorry about that, the doc is not to blame here.
> I usually do pull/commit/push in a row to avoid this kind of thing, but I
> screwed up this time ...
> I cannot remember if I did commit/pull/push or if I simply forgot to pull

Most of the time a "pull" won't succeed if you have uncommitted modifications in
your tree, so I'm not sure how pull/commit/push would actually work for you.
Do you stash/unstash in the middle there?  Or are you saying you make all of 
your changes between "pull" and "commit"?  If so, there's always a race there 
that you might occasionally need to resolve with "git rebase" or "git pull 
--rebase" anyway.

> btw, is there a push option to abort if that would make the github history
> non-linear?

No, not really.  There are some options to "pull" to prevent you from creating 
a merge commit, but the fix when you encounter that situation would simply be 
to rebase in some fashion, so you might as well just do that every time.

The best thing to do is to just try to use "git pull --rebase" for any topic 
work (i.e., don't use a bare "git pull" unless you know that you need to 
perform a merge).  A few other alternatives if you don't like that for some 
reason:

1. Set your "pull" default to perform a rebase.  I don't recommend it because
this can lead to confusion if you work on multiple systems and you are not 100%
consistent about setting this behavior.  But here's how to do it (a one-line
example also appears after this list):
http://stevenharman.net/git-pull-with-automatic-rebase

2. "git pull --rebase" can always be substituted by "git fetch ; git rebase".  
You could change your workflow to avoid the "pull" command altogether until it 
all makes more sense to you.  Similarly, "git pull" (which means "git pull 
--no-rebase" by default) can always be substituted by "git fetch ; git merge".

3. View the commit graph before pushing to make sure you're pushing the history 
you think you should be.  A helpful command for this (which you can alias if 
desired) is:

git log --graph --oneline --decorate HEAD '@{u}'

That will show the commit graph that can be traced back from your current 
branch and its tracked upstream branch.  If you see a merge commit where you 
didn't expect one, fix the history before pushing.  If you don't know how to 
fix it, ask the list or google around a bit.
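
A minimal example of option 1, assuming a git new enough to have the
pull.rebase configuration key (roughly 1.7.9 and later); drop --global to set
it for a single repository only:

    git config --global pull.rebase true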

-Dave



Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Jed Brown
"Dave Goodell (dgoodell)"  writes:
> Most of the time a "pull" won't succeed if you have uncommitted
> modifications in your tree, so I'm not sure how pull/commit/push would
> actually work for you.  Do you stash/unstash in the middle there?

Git will happily do the pull/merge despite your dirty tree as long as
none of the dirty files are affected.  Linus says that he usually has
uncommitted changes in his tree when merging.


signature.asc
Description: PGP signature


Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Dave Goodell (dgoodell)
Hi Alex,

Why did you push this "OSHMEM: spml ikrit..." commit twice?  I see it here 
(together with an undesirable merge-of-master commit) and also as 065dc9b4.

-Dave

On Nov 3, 2014, at 2:03 AM, git...@crest.iu.edu wrote:

> This is an automated email from the git hooks/post-receive script. It was
> generated because a ref change was pushed to the repository containing
> the project "open-mpi/ompi".
> 
> The branch, master has been updated
>   via  87dffacc56b4ebcecaa2e65e19c2f813d2a5d078 (commit)
>   via  e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4 (commit)
>  from  065dc9b4deec9cd9500f2fdc6bb53bbf58a9c2f6 (commit)
> 
> Those revisions listed above that are new to this repository have
> not appeared on any other notification email; so we list those
> revisions in full, below.
> 
> - Log -
> https://github.com/open-mpi/ompi/commit/87dffacc56b4ebcecaa2e65e19c2f813d2a5d078
> 
> commit 87dffacc56b4ebcecaa2e65e19c2f813d2a5d078
> Merge: e1cf6f3 065dc9b
> Author: Alex Mikheev 
> Date:   Mon Nov 3 10:02:29 2014 +0200
> 
>Merge branch 'master' of github.com:open-mpi/ompi
> 
>Conflicts:
>   oshmem/mca/spml/ikrit/spml_ikrit_component.c
> 
> 
> 
> https://github.com/open-mpi/ompi/commit/e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4
> 
> commit e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4
> Author: Alex Mikheev 
> Date:   Sun Nov 2 12:41:20 2014 +0200
> 
>OSHMEM: spml ikrit: disable rdmap op DCI pool
> 
>Instead use single pool for both rdma and send receive ops.
> 
> diff --git a/oshmem/mca/spml/ikrit/spml_ikrit_component.c 
> b/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> index 2079640..e021666 100644
> --- a/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> +++ b/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> @@ -92,6 +92,12 @@ static inline int set_mxm_tls()
> {
> char *tls;
> 
> +/* disable dci pull for rdma ops. Use single pool.
> + * Pool size is controlled by MXM_DC_QP_LIMIT 
> + * variable
> + */
> +setenv("MXM_OSHMEM_DC_RNDV_QP_LIMIT", "0", 0);
> +
> tls = getenv("MXM_OSHMEM_TLS");
> if (NULL != tls) {
> return check_mxm_tls("MXM_OSHMEM_TLS");
> 
> 
> ---
> 
> Summary of changes:
> oshmem/mca/spml/ikrit/spml_ikrit_component.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> hooks/post-receive
> -- 
> open-mpi/ompi
> ___
> ompi-commits mailing list
> ompi-comm...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/ompi-commits



Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Dave Goodell (dgoodell)
On Nov 3, 2014, at 10:41 AM, Jed Brown  wrote:

> "Dave Goodell (dgoodell)"  writes:
>> Most of the time a "pull" won't succeed if you have uncommitted
>> modifications in your tree, so I'm not sure how pull/commit/push would
>> actually work for you.  Do you stash/unstash in the middle there?
> 
> Git will happily do the pull/merge despite your dirty tree as long as
> none of the dirty files are affected.  Linus says that he usually has
> uncommitted changes in his tree when merging.

Hmm... you can see how often I create proper merge commits on a dirty tree.  I 
must have been hit in the past by conflicting dirty files.

-Dave



Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Alexander Mikheev
It is an --amend of my previous commit.  When I tried to push the amended
commit, a merge was required.

> -Original Message-
> From: Dave Goodell (dgoodell) [mailto:dgood...@cisco.com]
> Sent: Monday, November 03, 2014 6:47 PM
> To: Alexander Mikheev
> Cc: Open MPI Developers
> Subject: Re: [OMPI commits] Git: open-mpi/ompi branch master updated.
> dev-206-g87dffac
> 
> Hi Alex,
> 
> Why did you push this "OSHMEM: spml ikrit..." commit twice?  I see it here
> (together with an undesirable merge-of-master commit) and also as
> 065dc9b4.
> 
> -Dave
> 
> On Nov 3, 2014, at 2:03 AM, git...@crest.iu.edu wrote:
> 
> > This is an automated email from the git hooks/post-receive script. It
> > was generated because a ref change was pushed to the repository
> > containing the project "open-mpi/ompi".
> >
> > The branch, master has been updated
> >   via  87dffacc56b4ebcecaa2e65e19c2f813d2a5d078 (commit)
> >   via  e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4 (commit)
> >  from  065dc9b4deec9cd9500f2fdc6bb53bbf58a9c2f6 (commit)
> >
> > Those revisions listed above that are new to this repository have not
> > appeared on any other notification email; so we list those revisions
> > in full, below.
> >
> > - Log
> > -
> > https://github.com/open-
> mpi/ompi/commit/87dffacc56b4ebcecaa2e65e19c2f8
> > 13d2a5d078
> >
> > commit 87dffacc56b4ebcecaa2e65e19c2f813d2a5d078
> > Merge: e1cf6f3 065dc9b
> > Author: Alex Mikheev 
> > Date:   Mon Nov 3 10:02:29 2014 +0200
> >
> >Merge branch 'master' of github.com:open-mpi/ompi
> >
> >Conflicts:
> > oshmem/mca/spml/ikrit/spml_ikrit_component.c
> >
> >
> >
> > https://github.com/open-
> mpi/ompi/commit/e1cf6f37baf2b6240ab3aa3a219b88
> > 56cfa2caf4
> >
> > commit e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4
> > Author: Alex Mikheev 
> > Date:   Sun Nov 2 12:41:20 2014 +0200
> >
> >OSHMEM: spml ikrit: disable rdmap op DCI pool
> >
> >Instead use single pool for both rdma and send receive ops.
> >
> > diff --git a/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> > b/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> > index 2079640..e021666 100644
> > --- a/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> > +++ b/oshmem/mca/spml/ikrit/spml_ikrit_component.c
> > @@ -92,6 +92,12 @@ static inline int set_mxm_tls() {
> > char *tls;
> >
> > +/* disable dci pull for rdma ops. Use single pool.
> > + * Pool size is controlled by MXM_DC_QP_LIMIT
> > + * variable
> > + */
> > +setenv("MXM_OSHMEM_DC_RNDV_QP_LIMIT", "0", 0);
> > +
> > tls = getenv("MXM_OSHMEM_TLS");
> > if (NULL != tls) {
> > return check_mxm_tls("MXM_OSHMEM_TLS");
> >
> >
> > --
> > -
> >
> > Summary of changes:
> > oshmem/mca/spml/ikrit/spml_ikrit_component.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >
> > hooks/post-receive
> > --
> > open-mpi/ompi
> > ___
> > ompi-commits mailing list
> > ompi-comm...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/ompi-commits



[OMPI devel] OMPI 1.8.4rc1 issues

2014-11-03 Thread Ralph Castain
Hi folks

Looking at the over-the-weekend MTT reports plus at least one comment on the 
list, we have the following issues to address:

* many-to-one continues to fail. Shall I just assume this is an unfixable 
problem or a bad test and ignore it?

* neighbor_allgather_self segfaults in ompi_request_null or 
coll_base_comm_unselect or comm_activate or..., take your pick

* MPI_IN_PLACE returns zero (issue #259 
https://github.com/open-mpi/ompi/issues/259)

Ralph



Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Jeff Squyres (jsquyres)
Please don't try to amend commits that you have pushed to a public tree.  That 
can really screw up people who have already pulled your previous commit.

Rule of thumb: never, never, never change the history of a public tree.

The *ONE* exception to that is that you can change the history of the branch 
for an unmerged github pull request, because, by definition, your commits are 
not in an authoritative upstream tree yet.  E.g., if you open a PR, get some 
feedback, then you make some changes to the commits (possibly including 
changing history, like squashing multiple commits down into one), and [force] 
push the changes back to your topic branch on that PR.
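
A sketch of that pull-request-only workflow; the remote and branch names used
here (origin, my-topic-branch) are made up for illustration:

    # rework the tip commit on the PR's topic branch
    git commit --amend
    # history changed, so a plain push is rejected; force-push the topic
    # branch only, never a shared branch like master
    git push --force origin my-topic-branch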

Make sense?



On Nov 3, 2014, at 11:50 AM, Alexander Mikheev  wrote:

> It is --amend of my previous commit.  When I tried to push my amended commit, 
> the merge was required. 
> 
>> -Original Message-
>> From: Dave Goodell (dgoodell) [mailto:dgood...@cisco.com]
>> Sent: Monday, November 03, 2014 6:47 PM
>> To: Alexander Mikheev
>> Cc: Open MPI Developers
>> Subject: Re: [OMPI commits] Git: open-mpi/ompi branch master updated.
>> dev-206-g87dffac
>> 
>> Hi Alex,
>> 
>> Why did you push this "OSHMEM: spml ikrit..." commit twice?  I see it here
>> (together with an undesirable merge-of-master commit) and also as
>> 065dc9b4.
>> 
>> -Dave
>> 
>> On Nov 3, 2014, at 2:03 AM, git...@crest.iu.edu wrote:
>> 
>>> This is an automated email from the git hooks/post-receive script. It
>>> was generated because a ref change was pushed to the repository
>>> containing the project "open-mpi/ompi".
>>> 
>>> The branch, master has been updated
>>>  via  87dffacc56b4ebcecaa2e65e19c2f813d2a5d078 (commit)
>>>  via  e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4 (commit)
>>> from  065dc9b4deec9cd9500f2fdc6bb53bbf58a9c2f6 (commit)
>>> 
>>> Those revisions listed above that are new to this repository have not
>>> appeared on any other notification email; so we list those revisions
>>> in full, below.
>>> 
>>> - Log
>>> -
>>> https://github.com/open-
>> mpi/ompi/commit/87dffacc56b4ebcecaa2e65e19c2f8
>>> 13d2a5d078
>>> 
>>> commit 87dffacc56b4ebcecaa2e65e19c2f813d2a5d078
>>> Merge: e1cf6f3 065dc9b
>>> Author: Alex Mikheev 
>>> Date:   Mon Nov 3 10:02:29 2014 +0200
>>> 
>>>   Merge branch 'master' of github.com:open-mpi/ompi
>>> 
>>>   Conflicts:
>>> oshmem/mca/spml/ikrit/spml_ikrit_component.c
>>> 
>>> 
>>> 
>>> https://github.com/open-
>> mpi/ompi/commit/e1cf6f37baf2b6240ab3aa3a219b88
>>> 56cfa2caf4
>>> 
>>> commit e1cf6f37baf2b6240ab3aa3a219b8856cfa2caf4
>>> Author: Alex Mikheev 
>>> Date:   Sun Nov 2 12:41:20 2014 +0200
>>> 
>>>   OSHMEM: spml ikrit: disable rdmap op DCI pool
>>> 
>>>   Instead use single pool for both rdma and send receive ops.
>>> 
>>> diff --git a/oshmem/mca/spml/ikrit/spml_ikrit_component.c
>>> b/oshmem/mca/spml/ikrit/spml_ikrit_component.c
>>> index 2079640..e021666 100644
>>> --- a/oshmem/mca/spml/ikrit/spml_ikrit_component.c
>>> +++ b/oshmem/mca/spml/ikrit/spml_ikrit_component.c
>>> @@ -92,6 +92,12 @@ static inline int set_mxm_tls() {
>>>char *tls;
>>> 
>>> +/* disable dci pull for rdma ops. Use single pool.
>>> + * Pool size is controlled by MXM_DC_QP_LIMIT
>>> + * variable
>>> + */
>>> +setenv("MXM_OSHMEM_DC_RNDV_QP_LIMIT", "0", 0);
>>> +
>>>tls = getenv("MXM_OSHMEM_TLS");
>>>if (NULL != tls) {
>>>return check_mxm_tls("MXM_OSHMEM_TLS");
>>> 
>>> 
>>> --
>>> -
>>> 
>>> Summary of changes:
>>> oshmem/mca/spml/ikrit/spml_ikrit_component.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> 
>>> hooks/post-receive
>>> --
>>> open-mpi/ompi
>>> ___
>>> ompi-commits mailing list
>>> ompi-comm...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/ompi-commits
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16150.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Dave Goodell (dgoodell)
On Nov 3, 2014, at 10:50 AM, Alexander Mikheev  wrote:

> It is --amend of my previous commit.  When I tried to push my amended commit, 
> the merge was required. 

Ah, I just spotted the minor difference between the two commits.  The second 
argument to setenv() was changed from integer zero to a string "0".

In the future, it would be better to just create a new single commit fixing the 
mistake directly instead of amending, merging, and pushing, since that 
difference was not obvious from just looking at the email diffs flying by.
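
For reference, the POSIX setenv(3) prototype shows why the second argument had
to become the string "0" rather than the integer 0: a literal 0 there is a null
pointer, not the text "0". The wrapper function name below is just a stand-in
for illustration:

    #include <stdlib.h>

    /* int setenv(const char *name, const char *value, int overwrite); */

    static void set_mxm_defaults(void)
    {
        /* the corrected call from the commit; the final 0 means "do not
         * overwrite an existing value" */
        setenv("MXM_OSHMEM_DC_RNDV_QP_LIMIT", "0", 0);
    }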

-Dave



Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Friedley, Andrew
Thanks, we verified here that this regression is fixed, and it also appears that a
regression we saw between 1.8.1 and 1.8.3 (but had not yet pinned down) is 
fixed too.  Performance results from the nightly are matching our old v1.6 
numbers.

Andrew

> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff
> Squyres (jsquyres)
> Sent: Monday, November 3, 2014 7:49 AM
> To: Open MPI Developers List
> Subject: Re: [OMPI devel] enable-smp-locks affects PSM performance
> 
> On Nov 3, 2014, at 10:31 AM, Friedley, Andrew 
> wrote:
> 
> > Thanks Jeff, we're working to verify.  I don't mind the slower behavior on
> the trunk; we are only concerned with the stable release series.  Enabling
> thread safety options on the trunk/master is no problem here.
> >
> > Did the 'more expensive freelists' change make it to the release series as
> well?
> 
> Yes.  And all of that should now be reverted.
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-
> mpi.org/community/lists/devel/2014/11/16145.php


Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Jeff Squyres (jsquyres)
w00t.

Now we just need to focus on https://github.com/open-mpi/ompi/issues/258 to fix 
the issue properly...


On Nov 3, 2014, at 2:08 PM, Friedley, Andrew  wrote:

> Thanks, we verified here this regression is fixed and it also appears that a 
> regression we saw between 1.8.1 and 1.8.3 (but had not yet pinned down) is 
> fixed too.  Performance results from the nightly are matching our old v1.6 
> numbers.
> 
> Andrew
> 
>> -Original Message-
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff
>> Squyres (jsquyres)
>> Sent: Monday, November 3, 2014 7:49 AM
>> To: Open MPI Developers List
>> Subject: Re: [OMPI devel] enable-smp-locks affects PSM performance
>> 
>> On Nov 3, 2014, at 10:31 AM, Friedley, Andrew 
>> wrote:
>> 
>>> Thanks Jeff, we're working to verify.  I don't mind the slower behavior on
>> the trunk; we are only concerned with the stable release series.  Enabling
>> thread safety options on the trunk/master is no problem here.
>>> 
>>> Did the 'more expensive freelists' change make it to the release series as
>> well?
>> 
>> Yes.  And all of that should now be reverted.
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-
>> mpi.org/community/lists/devel/2014/11/16145.php
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16154.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
On Mon, Nov 3, 2014 at 8:29 AM, Dave Goodell (dgoodell) 
wrote:

> > btw, is there a push option to abort if that would make the github history
> > non-linear?
>
> No, not really.  There are some options to "pull" to prevent you from
> creating a merge commit, but the fix when you encounter that situation
> would simply be to rebase in some fashion, so you might as well just do
> that every time.
>

The "some options" Dave is referring to is probably
git pull --ff-only
I have this aliased to "git ff" and use it instead of "git pull".
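
One way to set up such an alias, for illustration:

    git config --global alias.ff 'pull --ff-only'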

If your pull would require a merge, this will tell you so and not make any
changes.
As Dave says, "git pull --rebase" is *probably* going to be your next step
if "git pull --ff-only" fails.  However, that is not *always* the case.
Sometimes you may wish to "stash" or create a new branch for the local
modifications.

I prefer "git pull --ff-only" because it allows (some may say "forces") me
to examine the situation before I create non-linear history.  Without it,
imagine what happens when I log in to some machine I seldom use, and there
are local mods from some experiment I had totally forgotten about.
- If I do a plain "git pull" I get a merge I probably didn't want
- If I do "git pull --rebase" then my local mods are (silently unless you
look carefully) rebased on the new tip.
In either of the above cases I may find myself resolving conflicts for
changes I didn't even remember making.

So, I favor "git pull --ff-only" because in the case of no local mods it
just updates my local repo, and otherwise I get to examine the local
changes before I let git merge or rebase them.  If you are familiar enough
with using "stash", you can even choose to ignore the local changes for now
and get on with the task at hand.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
IIRC it was not possible to merge with a dirty tree with git 1.7.
So, Dave, you may have been bitten in those dark days.
-Paul

On Mon, Nov 3, 2014 at 8:49 AM, Dave Goodell (dgoodell) 
wrote:

> On Nov 3, 2014, at 10:41 AM, Jed Brown  wrote:
>
> > "Dave Goodell (dgoodell)"  writes:
> >> Most of the time a "pull" won't succeed if you have uncommitted
> >> modifications in your tree, so I'm not sure how pull/commit/push would
> >> actually work for you.  Do you stash/unstash in the middle there?
> >
> > Git will happily do the pull/merge despite your dirty tree as long as
> > none of the dirty files are affected.  Linus says that he usually has
> > uncommitted changes in his tree when merging.
>
> Hmm... you can see how often I create proper merge commits on a dirty
> tree.  I must have been hit in the past by conflicting dirty files.
>
> -Dave
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16149.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Jed Brown
Paul Hargrove  writes:

> IIRC it was not possible to merge with a dirty tree with git 1.7.

Nope.  This warning was added to git-1.7.0:

  https://github.com/git/git/commit/e330d8ca1a9ec38ce40b0f67123b1dd893f0b31c

Discussion thread:

  http://thread.gmane.org/gmane.comp.version-control.git/136356/focus=136556


signature.asc
Description: PGP signature


Re: [OMPI devel] osu_mbw_mr error

2014-11-03 Thread Nathan Hjelm

I see the problem. The openib btl does not properly handle the following
call sequence (this is an openib btl bug IMHO):

btl_sendi (..., &descriptor);
btl_free (..., descriptor);

The bug is in the message coalescing code and it looks like extra logic
needs to be added to the openib btl's btl_free function for this to work
properly. I am working on a fix now.
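
A minimal sketch of that sequence as a PML might drive it; the argument lists
are elided and this is not the actual ob1/openib code:

    mca_btl_base_descriptor_t *des = NULL;

    /* btl_sendi() tries to send inline; when it cannot, it hands back a
     * descriptor it allocated so the caller can fill it in and send or
     * release it later. */
    int rc = btl->btl_sendi(btl, endpoint, /* ... payload args ... */, &des);
    if (OMPI_SUCCESS != rc && NULL != des) {
        /* With message coalescing enabled, the openib BTL may still hold
         * this fragment on an internal pending list, so btl_free() has to
         * detach it first; otherwise a later append trips the
         * opal_list_item_refcount assertion shown above. */
        btl->btl_free(btl, des);
    }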

-Nathan

On Mon, Nov 03, 2014 at 04:26:10PM +0200, Alina Sklarevich wrote:
>Hi,
>On 1.8.4rc1 we observe the following assert in the osu_mbw_mr test when
>using the openib BTL.
>When compiled in production mode (i.e. no --enable-debug) the test simply
>hangs.
>When using either the tcp BTL or the cm PML, the benchmark completes
>without error.
>The command line to reproduce this is:
>$ mpirun --bind-to core -display-map -mca btl_openib_if_include mlx5_0:1
>-np 2 -mca pml ob1 -mca btl openib,self,sm ./osu_mbw_mr
># OSU MPI Multiple Bandwidth / Message Rate Test v4.4
># [ pairs: 1 ] [ window size: 64 ]
>    # Size                  MB/s        Messages/s
>osu_mbw_mr: ../../../../opal/class/opal_list.h:547: _opal_list_append:
>Assertion `0 == item->opal_list_item_refcount' failed.
>[vegas15:30395] *** Process received signal ***
>[vegas15:30395] Signal: Aborted (6)
>[vegas15:30395] Signal code:  (-6)
>[vegas15:30395] [ 0] /lib64/libpthread.so.0[0x30bc40f500]
>[vegas15:30395] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x30bc0328a5]
>[vegas15:30395] [ 2] /lib64/libc.so.6(abort+0x175)[0x30bc034085]
>[vegas15:30395] [ 3] /lib64/libc.so.6[0x30bc02ba1e]
>[vegas15:30395] [ 4]
>/lib64/libc.so.6(__assert_perror_fail+0x0)[0x30bc02bae0]
>[vegas15:30395] [ 5]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(+0x9087)[0x73f70087]
>[vegas15:30395] [ 6]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_alloc+0x403)[0x73f754b3]
>[vegas15:30395] [ 7]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_sendi+0xf9e)[0x73f785b4]
>[vegas15:30395] [ 8]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xed08)[0x73308d08]
>[vegas15:30395] [ 9]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xf8ba)[0x733098ba]
>[vegas15:30395] [10]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_isend+0x108)[0x73309a1f]
>[vegas15:30395] [11]
>
> /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/libmpi.so.1(MPI_Isend+0x2ec)[0x77cff5e8]
>[vegas15:30395] [12]
>
> /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400fa4]
>[vegas15:30395] [13]
>
> /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x40167d]
>[vegas15:30395] [14]
>/lib64/libc.so.6(__libc_start_main+0xfd)[0x30bc01ecdd]
>[vegas15:30395] [15]
>
> /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400db9]
>[vegas15:30395] *** End of error message ***
>--
>mpirun noticed that process rank 0 with PID 30395 on node vegas15 exited
>on signal 6 (Aborted).
>--
>Thanks,
>Alina.

> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16142.php



pgp0N7VE22Bta.pgp
Description: PGP signature


[OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Not clear if the following failure is Solaris-specific, but it *IS* a
regression relative to 1.8.3.

The system has 2 IPV4 interfaces:
   Ethernet on 172.16.0.119/16
   IPoIB on 172.18.0.119/16

$ ifconfig bge0
bge0: flags=1004843 mtu 1500
index 2
inet 172.16.0.119 netmask  broadcast 172.16.255.255
$ ifconfig p.ibp0
p.ibp0: flags=1001000843
mtu 2044 index 3
inet 172.18.0.119 netmask  broadcast 172.18.255.255

However, I get a message from mca/oob/tcp about not being able to
communicate between these two interfaces ON THE SAME NODE:

$ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun
-mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
[pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).

A process or daemon was unable to complete a TCP connection
to another process:
  Local host:    pcp-j-19
  Remote host:   172.18.0.119
This is usually caused by a firewall on the remote host. Please
check that any firewall (e.g., iptables) has been disabled and
try again.


Let me know what sort of verbose options I should use to gather any
additional info you may need.

-Paul

On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain 
wrote:

> Hi folks
>
> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a head
> start on the testing so we can release by Fri Nov 7th. So please take a
> little time and test the current tarball:
>
> http://www.open-mpi.org/software/ompi/v1.8/
>
> Thanks
> Ralph
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Ralph Castain
Could you please set -mca oob_base_verbose 20? I’m not sure why the connection 
is failing.

Thanks
Ralph

> On Nov 3, 2014, at 5:56 PM, Paul Hargrove  wrote:
> 
> Not clear if the following failure is Solaris-specific, but it *IS* a 
> regression relative to 1.8.3.
> 
> The system has 2 IPV4 interfaces:
>Ethernet on 172.16.0.119/16 
>IPoIB on 172.18.0.119/16 
> 
> $ ifconfig bge0
> bge0: flags=1004843 mtu 1500 index 2
> inet 172.16.0.119 netmask  broadcast 172.16.255.255
> $ ifconfig p.ibp0
> p.ibp0: flags=1001000843 
> mtu 2044 index 3
> inet 172.18.0.119 netmask  broadcast 172.18.255.255
> 
> However, I get a message from mca/oob/tcp about not being able to communicate 
> between these two interfaces ON THE SAME NODE:
> 
> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun -mca 
> btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
> 
> A process or daemon was unable to complete a TCP connection
> to another process:
>   Local host:pcp-j-19
>   Remote host:   172.18.0.119
> This is usually caused by a firewall on the remote host. Please
> check that any firewall (e.g., iptables) has been disabled and
> try again.
> 
> 
> Let me know what sort of verbose options I should use to gather any 
> additional info you may need.
> 
> -Paul
> 
> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain  > wrote:
> Hi folks
> 
> I know 1.8.4 isn’t entirely complete just yet, but I’d like to get a head 
> start on the testing so we can release by Fri Nov 7th. So please take a 
> little time and test the current tarball:
> 
> http://www.open-mpi.org/software/ompi/v1.8/ 
> 
> 
> Thanks
> Ralph
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php 
> 
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov 
> 
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php



Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Ralph,

Requested output is attached.

I have a Linux/x86 system with the same network configuration and will soon
be able to determine if the problem is specific to Solaris.

-Paul


On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain  wrote:

> Could you please set -mca oob_base_verbose 20? I'm not sure why the
> connection is failing.
>
> Thanks
> Ralph
>
> On Nov 3, 2014, at 5:56 PM, Paul Hargrove  wrote:
>
> Not clear if the following failure is Solaris-specific, but it *IS* a
> regression relative to 1.8.3.
>
> The system has 2 IPV4 interfaces:
>Ethernet on 172.16.0.119/16
>IPoIB on 172.18.0.119/16
>
> $ ifconfig bge0
> bge0: flags=1004843 mtu 1500
> index 2
> inet 172.16.0.119 netmask  broadcast 172.16.255.255
> $ ifconfig p.ibp0
> p.ibp0: flags=1001000843
> mtu 2044 index 3
> inet 172.18.0.119 netmask  broadcast 172.18.255.255
>
> However, I get a message from mca/oob/tcp about not being able to
> communicate between these two interfaces ON THE SAME NODE:
>
> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun
> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
> 
> A process or daemon was unable to complete a TCP connection
> to another process:
>   Local host:pcp-j-19
>   Remote host:   172.18.0.119
> This is usually caused by a firewall on the remote host. Please
> check that any firewall (e.g., iptables) has been disabled and
> try again.
> 
>
> Let me know what sort of verbose options I should use to gather any
> additional info you may need.
>
> -Paul
>
> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain 
> wrote:
>
>> Hi folks
>>
>> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a head
>> start on the testing so we can release by Fri Nov 7th. So please take a
>> little time and test the current tarball:
>>
>> http://www.open-mpi.org/software/ompi/v1.8/
>>
>> Thanks
>> Ralph
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
[pcp-j-19:01003] mca: base: components_register: registering oob components
[pcp-j-19:01003] mca: base: components_register: found loaded component tcp
[pcp-j-19:01003] mca: base: components_register: component tcp register 
function successful
[pcp-j-19:01003] mca: base: components_open: opening oob components
[pcp-j-19:01003] mca: base: components_open: found loaded component tcp
[pcp-j-19:01003] mca: base: components_open: component tcp open function 
successful
[pcp-j-19:01003] mca:oob:select: checking available component tcp
[pcp-j-19:01003] mca:oob:select: Querying component [tcp]
[pcp-j-19:01003] oob:tcp: component_available called
[pcp-j-19:01003] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
[pcp-j-19:01003] [[26539,0],0] oob:tcp:init rejecting loopback interface lo0
[pcp-j-19:01003] WORKING INTERFACE 2 KERNEL INDEX 2 FAMILY: V4
[pcp-j-19:01003] [[26539,0],0] oob:tcp:init adding 172.16.0.119 to our list of 
V4 connections
[pcp-j-19:01003] WORKING INTERFACE 3 KERNEL INDEX 3 FAMILY: V4
[pcp-j-19:01003] [[26539,0],0] oob:tcp:init adding 172.18.0.119 to our list of 
V4 connections
[pcp-j-19:01003] [[26539,0],0] TCP STARTUP
[pcp-j-19:01003] [[26539,0],0] attempting to bind to IPv4 port 0
[pcp-j-19:01003] [[26539,0],0] assigned IPv4 port 43391
[pcp-j-19:01003] mca:oob:select: Adding component to end
[pcp-j-19:01003] mca:oob:select: Found 1 active transports
[pcp-j-19:01003] [[26539,0],0]: set_addr to uri 
1739259904.0;tcp://172.16.0.119,172.18.0.119:43391
[pcp-j-19:01003] [[26539,0],0]:set_addr peer [[26539,0],0] is me
[pcp-j-19:01004] mca: base: components_register: registering oob components
[pcp-j-19:01004] mca: base: components_register: found loa

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Ralph Castain
Hmmm…Paul, would you be able to try this with the latest trunk tarball? This 
looks familiar to me, and I wonder if we are just missing a changeset from the 
trunk that fixed the handshake issues we had with failing over from one 
transport to another.

Ralph

> On Nov 3, 2014, at 7:23 PM, Paul Hargrove  wrote:
> 
> Ralph,
> 
> Requested output is attached.
> 
> I have a Linux/x86 system with the same network configuration and will soon 
> be able to determine if the problem is specific to Solaris.
> 
> -Paul
> 
> 
> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain  > wrote:
> Could you please set -mca oob_base_verbose 20? I’m not sure why the 
> connection is failing.
> 
> Thanks
> Ralph
> 
>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove > > wrote:
>> 
>> Not clear if the following failure is Solaris-specific, but it *IS* a 
>> regression relative to 1.8.3.
>> 
>> The system has 2 IPV4 interfaces:
>>Ethernet on 172.16.0.119/16 
>>IPoIB on 172.18.0.119/16 
>> 
>> $ ifconfig bge0
>> bge0: flags=1004843 mtu 1500 index 
>> 2
>> inet 172.16.0.119 netmask  broadcast 172.16.255.255
>> $ ifconfig p.ibp0
>> p.ibp0: flags=1001000843 
>> mtu 2044 index 3
>> inet 172.18.0.119 netmask  broadcast 172.18.255.255
>> 
>> However, I get a message from mca/oob/tcp about not being able to 
>> communicate between these two interfaces ON THE SAME NODE:
>> 
>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun -mca 
>> btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>> 
>> A process or daemon was unable to complete a TCP connection
>> to another process:
>>   Local host:pcp-j-19
>>   Remote host:   172.18.0.119
>> This is usually caused by a firewall on the remote host. Please
>> check that any firewall (e.g., iptables) has been disabled and
>> try again.
>> 
>> 
>> Let me know what sort of verbose options I should use to gather any 
>> additional info you may need.
>> 
>> -Paul
>> 
>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain > > wrote:
>> Hi folks
>> 
>> I know 1.8.4 isn’t entirely complete just yet, but I’d like to get a head 
>> start on the testing so we can release by Fri Nov 7th. So please take a 
>> little time and test the current tarball:
>> 
>> http://www.open-mpi.org/software/ompi/v1.8/ 
>> 
>> 
>> Thanks
>> Ralph
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org 
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>> 
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php 
>> 
>> 
>> 
>> 
>> -- 
>> Paul H. Hargrove  phhargr...@lbl.gov 
>> 
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352 
>> 
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900 
>> ___
>> devel mailing list
>> de...@open-mpi.org 
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>> 
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php 
>> 
> 
> ___
> devel mailing list
> de...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php 
> 
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov 
> 
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16162.php



Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
I ran on Linux with the same network setup and saw no problems.

I noticed something in the output attached to my previous message:
   [pcp-j-19:01003] mca_oob_tcp_accept: accept() failed: Error 0 (0).
which was suspicious to me assuming one or both of those zeros represent
errno.

That made me think of Gilles's recent issues w/ errno on Solaris unless
_REENTRANT was defined.
So, I tried building again after configuring with CFLAGS=-D_REENTRANT
AND THAT DID THE TRICK.
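
A sketch of why the message reads "Error 0 (0)"; this is hypothetical code, not
the actual oob/tcp source. On Solaris, when _REENTRANT (or an equivalent
feature-test macro) is not defined, <errno.h> can expose a plain global errno
rather than the thread-local one, so the value read after a failing accept() in
a threaded daemon can be a stale zero:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>

    static int accept_and_report(int listen_fd)
    {
        int sd = accept(listen_fd, NULL, NULL);
        if (sd < 0) {
            /* If errno resolves to the process-global variable instead of
             * the per-thread one, this can print "Error 0 (0)" even though
             * accept() just failed in this thread. */
            fprintf(stderr, "mca_oob_tcp_accept: accept() failed: %s (%d).\n",
                    strerror(errno), errno);
        }
        return sd;
    }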

-Paul

On Mon, Nov 3, 2014 at 7:23 PM, Paul Hargrove  wrote:

> Ralph,
>
> Requested output is attached.
>
> I have a Linux/x86 system with the same network configuration and will
> soon be able to determine if the problem is specific to Solaris.
>
> -Paul
>
>
> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain 
> wrote:
>
>> Could you please set -mca oob_base_verbose 20? I'm not sure why the
>> connection is failing.
>>
>> Thanks
>> Ralph
>>
>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove  wrote:
>>
>> Not clear if the following failure is Solaris-specific, but it *IS* a
>> regression relative to 1.8.3.
>>
>> The system has 2 IPV4 interfaces:
>>Ethernet on 172.16.0.119/16
>>IPoIB on 172.18.0.119/16
>>
>> $ ifconfig bge0
>> bge0: flags=1004843 mtu 1500
>> index 2
>> inet 172.16.0.119 netmask  broadcast 172.16.255.255
>> $ ifconfig p.ibp0
>> p.ibp0:
>> flags=1001000843 mtu 2044
>> index 3
>> inet 172.18.0.119 netmask  broadcast 172.18.255.255
>>
>> However, I get a message from mca/oob/tcp about not being able to
>> communicate between these two interfaces ON THE SAME NODE:
>>
>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun
>> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>> 
>> A process or daemon was unable to complete a TCP connection
>> to another process:
>>   Local host:pcp-j-19
>>   Remote host:   172.18.0.119
>> This is usually caused by a firewall on the remote host. Please
>> check that any firewall (e.g., iptables) has been disabled and
>> try again.
>> 
>>
>> Let me know what sort of verbose options I should use to gather any
>> additional info you may need.
>>
>> -Paul
>>
>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain 
>> wrote:
>>
>>> Hi folks
>>>
>>> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a
>>> head start on the testing so we can release by Fri Nov 7th. So please take
>>> a little time and test the current tarball:
>>>
>>> http://www.open-mpi.org/software/ompi/v1.8/
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>>>
>>
>>
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>  ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Ralph,

You will see from the message I sent a moment ago that -D_REENTRANT on
Solaris appears to be the problem.
However, I will also try the trunk tarball as you have requested.

-Paul


On Mon, Nov 3, 2014 at 8:53 PM, Ralph Castain  wrote:

> Hmmm...Paul, would you be able to try this with the latest trunk tarball?
> This looks familiar to me, and I wonder if we are just missing a changeset
> from the trunk that fixed the handshake issues we had with failing over
> from one transport to another.
>
> Ralph
>
> On Nov 3, 2014, at 7:23 PM, Paul Hargrove  wrote:
>
> Ralph,
>
> Requested output is attached.
>
> I have a Linux/x86 system with the same network configuration and will
> soon be able to determine if the problem is specific to Solaris.
>
> -Paul
>
>
> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain 
> wrote:
>
>> Could you please set -mca oob_base_verbose 20? I'm not sure why the
>> connection is failing.
>>
>> Thanks
>> Ralph
>>
>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove  wrote:
>>
>> Not clear if the following failure is Solaris-specific, but it *IS* a
>> regression relative to 1.8.3.
>>
>> The system has 2 IPV4 interfaces:
>>Ethernet on 172.16.0.119/16
>>IPoIB on 172.18.0.119/16
>>
>> $ ifconfig bge0
>> bge0: flags=1004843 mtu 1500
>> index 2
>> inet 172.16.0.119 netmask  broadcast 172.16.255.255
>> $ ifconfig p.ibp0
>> p.ibp0:
>> flags=1001000843 mtu 2044
>> index 3
>> inet 172.18.0.119 netmask  broadcast 172.18.255.255
>>
>> However, I get a message from mca/oob/tcp about not being able to
>> communicate between these two interfaces ON THE SAME NODE:
>>
>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun
>> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>> 
>> A process or daemon was unable to complete a TCP connection
>> to another process:
>>   Local host:pcp-j-19
>>   Remote host:   172.18.0.119
>> This is usually caused by a firewall on the remote host. Please
>> check that any firewall (e.g., iptables) has been disabled and
>> try again.
>> 
>>
>> Let me know what sort of verbose options I should use to gather any
>> additional info you may need.
>>
>> -Paul
>>
>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain 
>> wrote:
>>
>>> Hi folks
>>>
>>> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a
>>> head start on the testing so we can release by Fri Nov 7th. So please take
>>> a little time and test the current tarball:
>>>
>>> http://www.open-mpi.org/software/ompi/v1.8/
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>>>
>>
>>
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>  ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16162.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16163.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Ralph Castain
No need - that was one of the things I was looking for. Thanks! I will pursue
the fix.


> On Nov 3, 2014, at 8:56 PM, Paul Hargrove  wrote:
> 
> Ralph,
> 
> You will see from the message I sent a moment ago that -D_REENTRANT on 
> Solaris appears to be the problem.
> However, I will also try the trunk tarball as you have requested.
> 
> -Paul
> 
> 
> On Mon, Nov 3, 2014 at 8:53 PM, Ralph Castain  > wrote:
> Hmmm…Paul, would you be able to try this with the latest trunk tarball? This 
> looks familiar to me, and I wonder if we are just missing a changeset from 
> the trunk that fixed the handshake issues we had with failing over from one 
> transport to another.
> 
> Ralph
> 
>> On Nov 3, 2014, at 7:23 PM, Paul Hargrove > > wrote:
>> 
>> Ralph,
>> 
>> Requested output is attached.
>> 
>> I have a Linux/x86 system with the same network configuration and will soon 
>> be able to determine if the problem is specific to Solaris.
>> 
>> -Paul
>> 
>> 
>> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain > > wrote:
>> Could you please set -mca oob_base_verbose 20? I’m not sure why the 
>> connection is failing.
>> 
>> Thanks
>> Ralph
>> 
>>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove >> > wrote:
>>> 
>>> Not clear if the following failure is Solaris-specific, but it *IS* a 
>>> regression relative to 1.8.3.
>>> 
>>> The system has 2 IPV4 interfaces:
>>>Ethernet on 172.16.0.119/16 
>>>IPoIB on 172.18.0.119/16 
>>> 
>>> $ ifconfig bge0
>>> bge0: flags=1004843 mtu 1500 
>>> index 2
>>> inet 172.16.0.119 netmask  broadcast 172.16.255.255
>>> $ ifconfig p.ibp0
>>> p.ibp0: flags=1001000843 
>>> mtu 2044 index 3
>>> inet 172.18.0.119 netmask  broadcast 172.18.255.255
>>> 
>>> However, I get a message from mca/oob/tcp about not being able to 
>>> communicate between these two interfaces ON THE SAME NODE:
>>> 
>>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun 
>>> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>>> 
>>> A process or daemon was unable to complete a TCP connection
>>> to another process:
>>>   Local host:pcp-j-19
>>>   Remote host:   172.18.0.119
>>> This is usually caused by a firewall on the remote host. Please
>>> check that any firewall (e.g., iptables) has been disabled and
>>> try again.
>>> 
>>> 
>>> Let me know what sort of verbose options I should use to gather any 
>>> additional info you may need.
>>> 
>>> -Paul
>>> 
>>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain >> > wrote:
>>> Hi folks
>>> 
>>> I know 1.8.4 isn’t entirely complete just yet, but I’d like to get a head 
>>> start on the testing so we can release by Fri Nov 7th. So please take a 
>>> little time and test the current tarball:
>>> 
>>> http://www.open-mpi.org/software/ompi/v1.8/ 
>>> 
>>> 
>>> Thanks
>>> Ralph
>>> 
>>> 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org 
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>>> 
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Paul H. Hargrove  phhargr...@lbl.gov 
>>> 
>>> Future Technologies Group
>>> Computer and Data Sciences Department Tel: +1-510-495-2352 
>>> 
>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org 
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>>> 
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php 
>>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org 
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
>> 
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php 
>> 
>> 
>> 
>> 
>> -- 
>> Paul H. Hargrove  phhargr...@lbl.gov 
>> 
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-4