Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Ralph Castain
No need - that was one of the things I was looking for. Thanks! I will pursue the fix > On Nov 3, 2014, at 8:56 PM, Paul Hargrove wrote: > > Ralph, > > You will see from the message I sent a moment ago that -D_REENTRANT on > Solaris appears to be the problem. > However, I will also try the t

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Ralph, You will see from the message I sent a moment ago that -D_REENTRANT on Solaris appears to be the problem. However, I will also try the trunk tarball as you have requested. -Paul On Mon, Nov 3, 2014 at 8:53 PM, Ralph Castain wrote: > Hmmm...Paul, would you be able to try this with the l

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
I ran on Linux with the same network setup and saw no problems. I noticed something in the output attached to my previous message: [pcp-j-19:01003] mca_oob_tcp_accept: accept() failed: Error 0 (0). which was suspicious to me assuming one or both of those zeros represent errno. That made me thi
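
A note on the "Error 0 (0)" line above: if those zeros are errno, the failure was reported with an errno value that no longer (or never) reflected the failed accept() call. The follow-up messages in this thread point at -D_REENTRANT on Solaris; as far as I know, Solaris only gives each thread its own errno when the code is built with that define (or an equivalent threading flag), so a threaded build without it can read a stale or zero value. The minimal sketch below is not Open MPI code. The helper name accept_saving_errno is hypothetical, and it only illustrates the general habit of copying errno immediately after the failing call, before any other library call can overwrite it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (an illustration, not the mca_oob_tcp code):
 * accept one connection on listen_fd. On failure, copy errno into
 * *saved_errno before calling anything else, then report it.
 * Returns the new descriptor, or -1 on failure.
 */
int accept_saving_errno(int listen_fd, int *saved_errno)
{
    /* The peer address is not needed here, so both address arguments are NULL. */
    int fd = accept(listen_fd, NULL, NULL);

    if (fd < 0) {
        /* Capture errno first: fprintf() and strerror() may change it. */
        *saved_errno = errno;
        fprintf(stderr, "accept() failed: %s (%d)\n",
                strerror(*saved_errno), *saved_errno);
        return -1;
    }

    *saved_errno = 0;
    return fd;
}
```

Calling accept_saving_errno(-1, &err) on a POSIX system should fail and leave EBADF in err. A report of 0 after a failed call would suggest that the errno being read is not the one the failing thread set.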

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Ralph Castain
Hmmm…Paul, would you be able to try this with the latest trunk tarball? This looks familiar to me, and I wonder if we are just missing a changeset from the trunk that fixed the handshake issues we had with failing over from one transport to another. Ralph > On Nov 3, 2014, at 7:23 PM, Paul Har

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Ralph, Requested output is attached. I have a Linux/x86 system with the same network configuration and will soon be able to determine if the problem is specific to Solaris. -Paul On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain wrote: > Could you please set -mca oob_base_verbose 20? I'm not sur

Re: [OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Ralph Castain
Could you please set -mca oob_base_verbose 20? I’m not sure why the connection is failing. Thanks Ralph > On Nov 3, 2014, at 5:56 PM, Paul Hargrove wrote: > > Not clear if the following failure is Solaris-specific, but it *IS* a > regression relative to 1.8.3. > > The system has 2 IPV4 inter

[OMPI devel] [1.8.4rc1] REGRESSION on Solaris-11/x86 with two subnets

2014-11-03 Thread Paul Hargrove
Not clear if the following failure is Solaris-specific, but it *IS* a regression relative to 1.8.3. The system has 2 IPV4 interfaces: Ethernet on 172.16.0.119/16 IPoIB on 172.18.0.119/16 $ ifconfig bge0 bge0: flags=1004843 mtu 1500 index 2 inet 172.16.0.119 netmask broadcas

Re: [OMPI devel] osu_mbw_mr error

2014-11-03 Thread Nathan Hjelm
I see the problem. The openib btl does not properly handle the following call sequence (this is an openib btl bug IMHO): btl_sendi (..., &descriptor); btl_free (..., descriptor); The bug is in the message coalescing code and it looks like extra logic needs to be added to the openib btl's btl_fre

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Jed Brown
Paul Hargrove writes: > IIRC it was not possible to merge with a dirty tree with git 1.7. Nope. This warning was added to git-1.7.0: https://github.com/git/git/commit/e330d8ca1a9ec38ce40b0f67123b1dd893f0b31c Discussion thread: http://thread.gmane.org/gmane.comp.version-control.git/136356

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
IIRC it was not possible to merge with a dirty tree with git 1.7. So, Dave, you may have been bitten in those dark days. -Paul On Mon, Nov 3, 2014 at 8:49 AM, Dave Goodell (dgoodell) wrote: > On Nov 3, 2014, at 10:41 AM, Jed Brown wrote: > > > "Dave Goodell (dgoodell)" writes: > >> Most of the

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Paul Hargrove
On Mon, Nov 3, 2014 at 8:29 AM, Dave Goodell (dgoodell) wrote: > > btw, is there a push option to abort if that would make github history > non linear ? > > No, not really. There are some options to "pull" to prevent you from > creating a merge commit, but the fix when you encounter that situati

Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Jeff Squyres (jsquyres)
w00t. Now we just need to focus on https://github.com/open-mpi/ompi/issues/258 to fix the issue properly... On Nov 3, 2014, at 2:08 PM, Friedley, Andrew wrote: > Thanks, we verified here this regression is fixed and it also appears that a > regression we saw between 1.8.1 and 1.8.3 (but had

Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Friedley, Andrew
Thanks, we verified here that this regression is fixed, and it also appears that a regression we saw between 1.8.1 and 1.8.3 (but had not yet pinned down) is fixed too. Performance results from the nightly are matching our old v1.6 numbers. Andrew > -----Original Message----- > From: devel [mailto:

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Dave Goodell (dgoodell)
On Nov 3, 2014, at 10:50 AM, Alexander Mikheev wrote: > It is --amend of my previous commit. When I tried to push my amended commit, > the merge was required. Ah, I just spotted the minor difference between the two commits. The second argument to setenv() was changed from integer zero to a
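
The snippet above is cut off before it says what the second argument became, so the following is only a general illustration of why that argument matters. POSIX declares setenv(const char *name, const char *value, int overwrite), so the second argument is a string. An integer zero in that position is a null pointer, not the text "0". The helper name set_flag_env and the variable name in the usage note are hypothetical and are not taken from the commit.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict -std=c99 */
#endif
#include <stdlib.h>

/*
 * Hypothetical helper: store a boolean flag in the environment as the
 * string "1" or "0". The value passed to setenv() is always a string;
 * passing the integer 0 instead would hand setenv() a null pointer.
 * Returns 0 on success, -1 on failure (as setenv() does).
 */
int set_flag_env(const char *name, int enabled)
{
    const char *value = enabled ? "1" : "0";

    /* Third argument 1: overwrite the variable if it already exists. */
    return setenv(name, value, 1);
}
```

For example, set_flag_env("DEMO_FLAG", 0) leaves getenv("DEMO_FLAG") pointing at the string "0".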

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Jeff Squyres (jsquyres)
Please don't try to amend commits that you have pushed to a public tree. That can really screw up people who have already pulled your previous commit. Rule of thumb: never, never, never change the history of a public tree. The *ONE* exception to that is that you can change the history of the br

[OMPI devel] OMPI 1.8.4rc1 issues

2014-11-03 Thread Ralph Castain
Hi folks Looking at the over-the-weekend MTT reports plus at least one comment on the list, we have the following issues to address: * many-to-one continues to fail. Shall I just assume this is an unfixable problem or a bad test and ignore it? * neighbor_allgather_self segfaults in ompi_reques

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Alexander Mikheev
It is an --amend of my previous commit. When I tried to push my amended commit, a merge was required. > -----Original Message----- > From: Dave Goodell (dgoodell) [mailto:dgood...@cisco.com] > Sent: Monday, November 03, 2014 6:47 PM > To: Alexander Mikheev > Cc: Open MPI Developers > Subject: Re

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Dave Goodell (dgoodell)
On Nov 3, 2014, at 10:41 AM, Jed Brown wrote: > "Dave Goodell (dgoodell)" writes: >> Most of the time a "pull" won't succeed if you have uncommitted >> modifications in your tree, so I'm not sure how pull/commit/push would >> actually work for you. Do you stash/unstash in the middle there? > > Gi

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-206-g87dffac

2014-11-03 Thread Dave Goodell (dgoodell)
Hi Alex, Why did you push this "OSHMEM: spml ikrit..." commit twice? I see it here (together with an undesirable merge-of-master commit) and also as 065dc9b4. -Dave On Nov 3, 2014, at 2:03 AM, git...@crest.iu.edu wrote: > This is an automated email from the git hooks/post-receive script. It w

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Jed Brown
"Dave Goodell (dgoodell)" writes: > Most of the time a "pull" won't succeed if you have uncommitted > modifications in your tree, so I'm not sure how pull/commit/push would > actually work for you. Do you stash/unstash in the middle there? Git will happily do the pull/merge despite your dirty tree

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-198-g68bec0a

2014-11-03 Thread Dave Goodell (dgoodell)
On Nov 1, 2014, at 3:44 AM, Gilles Gouaillardet wrote: > Hi Dave, > > I am sorry about that, the doc is not to be blamed here. > I usually do pull/commit/push in a row to avoid this kind of thing, but I > screwed up this time ... > I cannot remember if I did commit/pull/push or if I simply for

Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Jeff Squyres (jsquyres)
On Nov 3, 2014, at 10:31 AM, Friedley, Andrew wrote: > Thanks Jeff, we're working to verify. I don't mind the slower behavior on > the trunk; we are only concerned with the stable release series. Enabling > thread safety options on the trunk/master is no problem here. > > Did the 'more expen

Re: [OMPI devel] enable-smp-locks affects PSM performance

2014-11-03 Thread Friedley, Andrew
Thanks Jeff, we're working to verify. I don't mind the slower behavior on the trunk; we are only concerned with the stable release series. Enabling thread safety options on the trunk/master is no problem here. Did the 'more expensive freelists' change make it to the release series as well? An

Re: [OMPI devel] osu_mbw_mr error

2014-11-03 Thread Ralph Castain
Can you please let me know when you fix this? I intend to release 1.8.4 by the end of the week. Since Mellanox is the only member with IB, you folks have been maintaining this BTL. > On Nov 3, 2014, at 6:26 AM, Alina Sklarevich > wrote: > > Hi, > > On 1.8.4rc1 we observe the following asser

[OMPI devel] osu_mbw_mr error

2014-11-03 Thread Alina Sklarevich
Hi, On 1.8.4rc1 we observe the following assert in the osu_mbw_mr test when using the openib BTL. When compiled in production mode (i.e. no --enable-debug) the test simply hangs. When using either the tcp BTL or the cm PML, the benchmark completes without error. The command line to reproduce th