Re: [OMPI devel] orte-dvm / orte-submit race condition

2015-10-14 Thread Ralph Castain
Okay, please try the attached patch. It will cause two messages to be output for each job: one indicating the job has been marked terminated, and the other reporting that the completion message was sent to the requestor. Let's see what that tells us. Thanks Ralph On Wed, Oct 14, 2015 at 3:44

Re: [OMPI devel] [OMPI users] fatal error: openmpi-v2.x-dev-415-g5c9b192 and openmpi-dev-2696-gd579a07

2015-10-14 Thread Gilles Gouaillardet
Folks, I made PR #1028 (https://github.com/open-mpi/ompi/pull/1028). It is not 100% clean (so I will not commit it before a review), since opal/mca/pmix/pmix1xx/pmix/configure is now invoked with two CPPFLAGS=... on the command line: - first one comes from the ompi configure command line - second

[hwloc-devel] Create success (hwloc git 1.11.0-91-g010b4b6)

2015-10-14 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.11.0-91-g010b4b6 Start time: Wed Oct 14 21:06:24 EDT 2015 End time: Wed Oct 14 21:08:02 EDT 2015 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git 1.10.1-71-g48f9ddd)

2015-10-14 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.10.1-71-g48f9ddd Start time: Wed Oct 14 21:04:51 EDT 2015 End time: Wed Oct 14 21:06:23 EDT 2015 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git 1.9.1-66-ga20252d)

2015-10-14 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.9.1-66-ga20252d Start time: Wed Oct 14 21:03:05 EDT 2015 End time: Wed Oct 14 21:04:51 EDT 2015 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git dev-811-gdaaf59f)

2015-10-14 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc dev-811-gdaaf59f Start time: Wed Oct 14 21:01:02 EDT 2015 End time: Wed Oct 14 21:02:55 EDT 2015 Your friendly daemon, Cyrador

Re: [OMPI devel] 16 byte real in Fortran

2015-10-14 Thread Larry Baker
The INTEGER*n, LOGICAL*n, REAL*n, etc., syntax has never been legal Fortran. Fortran originally had only INTEGER, REAL, DOUBLE PRECISION, and COMPLEX numeric types. Fortran 90 added the notion of a KIND of numeric, but left unspecified the mapping of numeric KINDs to processor-specific

Re: [OMPI devel] orte-dvm / orte-submit race condition

2015-10-14 Thread Mark Santcroos
Hi Ralph, > On 15 Oct 2015, at 0:26 , Ralph Castain wrote: > Okay, so each orte-submit is reporting job has launched, which means the hang > is coming while waiting to hear the job completed. Are you sure that orte-dvm > believes the job has completed? No, I'm not. > In

Re: [OMPI devel] 16 byte real in Fortran

2015-10-14 Thread Jeff Squyres (jsquyres)
On Oct 14, 2015, at 5:53 PM, Vladimír Fuka wrote: > >> As that ticket notes if REAL*16 <> long double Open MPI should be >> disabling reductions on MPI_REAL16. I can take a look and see if I can >> determine why that is not working as expected. > > Does it really need to

Re: [OMPI devel] orte-dvm / orte-submit race condition

2015-10-14 Thread Ralph Castain
Okay, so each orte-submit is reporting job has launched, which means the hang is coming while waiting to hear the job completed. Are you sure that orte-dvm believes the job has completed? In other words, when you say that you observe the job as completing, are you basing that on some output from

Re: [OMPI devel] 16 byte real in Fortran

2015-10-14 Thread Vladimír Fuka
> As that ticket notes if REAL*16 <> long double Open MPI should be > disabling reductions on MPI_REAL16. I can take a look and see if I can > determine why that is not working as expected. Does it really need to be just disabled when the `real(real128)` is actually equivalent to c_long_double?

Re: [OMPI devel] orte-dvm / orte-submit race condition

2015-10-14 Thread Mark Santcroos
Hi Ralph, > On 14 Oct 2015, at 21:50 , Ralph Castain wrote: > I wonder if they might be getting duplicate process names if started quickly > enough. Do you get the "job has launched" message (orte-submit outputs a > message after orte-dvm responds that the job launched)?

Re: [OMPI devel] orte-dvm / orte-submit race condition

2015-10-14 Thread Ralph Castain
I wonder if they might be getting duplicate process names if started quickly enough. Do you get the "job has launched" message (orte-submit outputs a message after orte-dvm responds that the job launched)? On Wed, Oct 14, 2015 at 12:04 PM, Mark Santcroos wrote: >

[OMPI devel] orte-dvm / orte-submit race condition

2015-10-14 Thread Mark Santcroos
Hi, By hammering on a DVM with orte-submit I can reproducibly make orte-submit not return, but hang instead. The task is executed correctly though. It can be reproduced using the small snippet below. Switching from sequential to "concurrent" execution of the orte-submit calls triggers the effect.

Re: [OMPI devel] Bad performance (20% bandwidth loss) when compiling with GCC 5.2 instead of 4.x

2015-10-14 Thread Jeff Squyres (jsquyres)
On Oct 14, 2015, at 12:48 PM, Nathan Hjelm wrote: > > I think this is from a known issue. Try applying this and run again: > > https://github.com/open-mpi/ompi/commit/952d01db70eab4cbe11ff4557434acaa928685a4.patch The good news is that if this fixes your problem, the fix is

Re: [OMPI devel] Bad performance (20% bandwidth loss) when compiling with GCC 5.2 instead of 4.x

2015-10-14 Thread Nathan Hjelm
I think this is from a known issue. Try applying this and run again: https://github.com/open-mpi/ompi/commit/952d01db70eab4cbe11ff4557434acaa928685a4.patch -Nathan On Wed, Oct 14, 2015 at 06:33:07PM +0200, Paul Kapinos wrote: > Dear Open MPI developer, > > We're puzzled by reproducible

[OMPI devel] Bad performance (20% bandwidth loss) when compiling with GCC 5.2 instead of 4.x

2015-10-14 Thread Paul Kapinos
Dear Open MPI developer, We're puzzled by a reproducible performance (bandwidth) penalty observed when comparing measurements via InfiniBand between two nodes, OpenMPI/1.10.0 compiled with *GCC/5.2* instead of GCC 4.8 and Intel compiler. Take a look at the attached picture of two measurements

Re: [OMPI devel] 16 byte real in Fortran

2015-10-14 Thread Nathan Hjelm
On Wed, Oct 14, 2015 at 02:40:00PM +0100, Vladimír Fuka wrote: > Hello, > > I have a problem with using the quadruple (128bit) or extended > (80bit) precision reals in Fortran. I did my tests with gfortran-4.8.5 > and OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2), but others confirmed > this

[OMPI devel] 16 byte real in Fortran

2015-10-14 Thread Vladimír Fuka
Hello, I have a problem with using the quadruple (128bit) or extended (80bit) precision reals in Fortran. I did my tests with gfortran-4.8.5 and OpenMPI-1.7.2 (preinstalled OpenSuSE 13.2), but others confirmed this behaviour for more recent versions at

Re: [OMPI devel] [OMPI users] fatal error: openmpi-v2.x-dev-415-g5c9b192 and openmpi-dev-2696-gd579a07

2015-10-14 Thread Gilles Gouaillardet
Folks, I was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my configure command line. Here is what happens: opal/mca/pmix/pmix1xx/configure.m4 sets the CPPFLAGS environment variable with -I/tmp and include paths for hwloc and libevent, then opal/mca/pmix/pmix1xx/pmix/configure is