Re: [OMPI devel] 1.8.2rc5 released

2014-08-22 Thread Ralph Castain
Looks okay - good to go

On Aug 22, 2014, at 12:09 PM, Jeff Squyres (jsquyres) wrote:
> No -- most of these were not user-visible, or they were fixes from fixes post-1.8.1.
>
> I think the relevant ones were put in NEWS already. I'm recording a podcast right now -- can you double check?

Re: [OMPI devel] OMPI devel] MPI_Abort does not make mpirun return with the right exit code

2014-08-22 Thread Ralph Castain
I think these are fixed now - at least, your test cases all pass for me

On Aug 22, 2014, at 9:12 AM, Ralph Castain wrote:
> On Aug 22, 2014, at 9:06 AM, Gilles Gouaillardet wrote:
>> Ralph,
>>
>> Will do on Monday
>>
>> About the first test, in my case echo $? returns 0
>
> My "sho

Re: [OMPI devel] 1.8.2rc5 released

2014-08-22 Thread Jeff Squyres (jsquyres)
No -- most of these were not user-visible, or they were fixes from fixes post-1.8.1. I think the relevant ones were put in NEWS already. I'm recording a podcast right now -- can you double check?

On Aug 22, 2014, at 2:42 PM, Ralph Castain wrote:
> Did you update the NEWS with these?
>
> O

Re: [OMPI devel] 1.8.2rc5 released

2014-08-22 Thread Ralph Castain
Did you update the NEWS with these?

On Aug 22, 2014, at 11:33 AM, Jeff Squyres (jsquyres) wrote:
> In the usual location:
>
>    http://www.open-mpi.org/software/ompi/v1.8/
>
> Changes since rc4:
>
> - Add a missing atomics stuff into tarball
> - fortran: add missing bindings for WIN_SYNC, W

[OMPI devel] 1.8.2rc5 released

2014-08-22 Thread Jeff Squyres (jsquyres)
In the usual location:

   http://www.open-mpi.org/software/ompi/v1.8/

Changes since rc4:

- Add a missing atomics stuff into tarball
- fortran: add missing bindings for WIN_SYNC, WIN_LOCK_ALL, WIN_UNLOCK_ALL
- README updates
- usnic: ensure to have a safe destruction of an opal_list_item_t
- rem

Re: [OMPI devel] OMPI devel] MPI_Abort does not make mpirun return with the right exit code

2014-08-22 Thread Ralph Castain
On Aug 22, 2014, at 9:06 AM, Gilles Gouaillardet wrote:
> Ralph,
>
> Will do on Monday
>
> About the first test, in my case echo $? returns 0

My "showcode" is just an alias for the echo

> I noticed this confusing message in your output :
> mpirun noticed that process rank 0 with PID 24382 o

Re: [OMPI devel] OMPI devel] MPI_Abort does not make mpirun return with the right exit code

2014-08-22 Thread Gilles Gouaillardet
Ralph,

Will do on Monday

About the first test, in my case echo $? returns 0

I noticed this confusing message in your output:

   mpirun noticed that process rank 0 with PID 24382 on node bend002 exited on signal 0 (Unknown signal 0).

About the second test, please note my test program return 3; whe

Re: [OMPI devel] MPI_Abort does not make mpirun return with the right exit code

2014-08-22 Thread Ralph Castain
You might want to try again with current head of trunk as something seems off in what you are seeing - more below

On Aug 22, 2014, at 3:12 AM, Gilles Gouaillardet wrote:
> Ralph,
>
> i tried again after the merge and found the same behaviour, though the
> internals are very different.
>
> i

Re: [OMPI devel] 1.8.2rc4 problem: only 32 out of 48 cores are working

2014-08-22 Thread Andrej Prsa
Hi again, I generated a video that demonstrates the problem; for brevity I did not run a full process, but I'm providing the timing below. If you'd like me to record a full process, just let me know -- but as I said in my previous email, 32 procs drop to 1 after about a minute and the computation

Re: [OMPI devel] MPI_Abort does not make mpirun return with the right exit code

2014-08-22 Thread Gilles Gouaillardet
Ralph,

i tried again after the merge and found the same behaviour, though the internals are very different.

i run without any batch manager. From node0:

   mpirun -np 1 --mca btl tcp,self -host node1 ./abort

mpirun exits with exit code zero :-(

short story : i applied pmix.2.patch and that fixed my proble

Re: [OMPI devel] 1.8.2rc4 problem: only 32 out of 48 cores are working

2014-08-22 Thread Andrej Prsa
Hi Ralph, Chris,

You guys are both correct:

(1) The output that I passed along /is/ exemplary of only 32 processors running (provided htop reports things correctly). The job I submitted is the exact same process called 48 times (well, np times), so all procs should take about the same