Re: [OMPI devel] [PATCH] mpirun hangs on ia64

2014-01-23 Thread Paul Hargrove
On Thu, Jan 23, 2014 at 4:14 PM, Ralph Castain wrote: > I put it for 1.7.5 just for completeness - I agree that not many people > will care, but we should reward your hard work! > "The reward of a thing well done is having done it." Ralph Waldo Emerson -- Paul H. Hargrove

[OMPI devel] out-of-date or missing manpages

2014-01-23 Thread Paul Hargrove
The man pages in trunk for the compiler wrappers still make reference to distinct FC and F77 (and FCFLAGS and FFLAGS), though configure no longer honors F77 or FFLAGS: $ grep -e F77 -e FFLAGS INST/share/man/man1/* INST/share/man/man1/mpiCC.1:the user in the CC, CXX, F77, and/or FC environment vari

Re: [OMPI devel] [PATCH] mpirun hangs on ia64

2014-01-23 Thread Ralph Castain
I put it for 1.7.5 just for completeness - I agree that not many people will care, but we should reward your hard work! Thanks Ralph On Jan 23, 2014, at 2:06 PM, Paul Hargrove wrote: > On Thu, Jan 23, 2014 at 1:16 PM, Paul Hargrove wrote: > [snip] > I will retest ASAP and report with, I hope,

[OMPI devel] yet another fortran (documentation) issue

2014-01-23 Thread Paul Hargrove
The following is an issue found when testing with xlf-14.1, which is already known to have problems with the F08 stuff. So, I've configured with "FC=xlf90 --enable-mpi-fortran=usempi". The problem is that mpifort is now a wrapper around xlf90 and thus is assuming F90 free-form input, independent

[OMPI devel] [PATCH] mpirun hangs on ia64

2014-01-23 Thread Paul Hargrove
On Thu, Jan 23, 2014 at 1:16 PM, Paul Hargrove wrote: [snip] > I will retest ASAP and report with, I hope, an attachment to fix both > IA64.asm and ia64/atomic.h > [snip] Eureka!! With the bogus cast removed in both places, I can now run ring_c on linux/ia64. The attached patch is against trunk

Re: [OMPI devel] vader on SGI UV?

2014-01-23 Thread Nathan Hjelm
Well, it *should* work since the Cray and SGI variants are more or less the same. I would have to take a look at their xpmem.h to see if anything is different. -Nathan On Thu, Jan 23, 2014 at 01:38:53PM -0800, Paul Hargrove wrote: > I've answered this one for myself: > NO: the vader BTL d

Re: [OMPI devel] vader on SGI UV?

2014-01-23 Thread Paul Hargrove
I've answered this one for myself: NO: the vader BTL does not build on an SGI UV. However, xpmem support isn't detected at configure time either. So, there is no "problem" here. It might be nice to clarify in README that vader is for Cray's variant of XPMEM only. ++ Everything below this point i

Re: [OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-23 Thread Paul Hargrove
Some progress: I fixed IA64.asm but still saw failures. I realized I'd not checked the ia64/atomic.h file. Lo and behold the origin of the bogus "sxt4" is a pair of improper casts, removed by the following: --- opal/include/opal/sys/ia64/atomic.h~2014-01-23 13:04:03.0 -0800 +++ op
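For readers unfamiliar with the failure mode: "sxt4" is the IA-64 instruction that sign-extends a 4-byte value. The patch itself is cut off above, so the following is only a minimal sketch of this class of bug, not the Open MPI code. All function names are hypothetical, and the narrowing conversion is implementation-defined in C (the comments describe what GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; neither function is taken from atomic.h. */

/* Buggy shape: the (int32_t) cast keeps only the low 32 bits of the
 * 64-bit value, and the implicit conversion back to int64_t then
 * sign-extends them. On IA-64 that widening is what "sxt4" performs. */
static int64_t return_with_bogus_cast(int64_t old_value)
{
    return (int32_t) old_value;
}

/* Fixed shape: the 64-bit value is returned untouched. */
static int64_t return_without_cast(int64_t old_value)
{
    return old_value;
}

/* A compare-and-set reports success when the value it read back
 * equals the value the caller expected. */
static int cmpset_succeeded(int64_t read_back, int64_t expected)
{
    return read_back == expected;
}
```

With a value that does not fit in 32 bits, `cmpset_succeeded(return_with_bogus_cast(v), v)` reports failure even though the underlying swap would have succeeded, which is one way a caller can end up spinning forever.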

[OMPI devel] vader on SGI UV?

2014-01-23 Thread Paul Hargrove
Nathan, Is the vader BTL known to work or not work on an SGI UV (w/ XPMEM support, of course)? I can easily attempt the build, but any test runs would enter a queue that is about 1 week deep. So, I am wondering if the attempt is worth pursuing. Additionally, does one need an explicit "-mca btl se

Re: [OMPI devel] mca_bml_r2_del_btl incorrect memory size reallocation?

2014-01-23 Thread Jeff Squyres (jsquyres)
This function is generally called during MPI_Finalize (i.e., when everything is being torn down). It may also be called during the disconnection of MPI dynamic processes (i.e., we don't need a BTL connection to a given peer anymore because the last communicator containing it has been released).

Re: [OMPI devel] [PATCH] use ORTE_PROC_IS_APP

2014-01-23 Thread Josh Hursey
That should be ok. On Thu, Jan 23, 2014 at 10:17 AM, Ralph Castain wrote: > Sure - no issues with me > > > On Jan 23, 2014, at 7:10 AM, Adrian Reber wrote: > > > Selecting SNAPC requires the information if it is an app or not: > > > > int orte_snapc_base_select(bool seed, bool app); > > > > Th

Re: [OMPI devel] [PATCH] make orte-checkpoint communicate with orterun again

2014-01-23 Thread Josh Hursey
+1 On Thu, Jan 23, 2014 at 10:16 AM, Ralph Castain wrote: > Looks correct to me - you are right in that you cannot release the buffer > until after the send completes. We don't copy the data underneath to save > memory and time. > > > On Jan 23, 2014, at 6:51 AM, Adrian Reber wrote: > > > Foll

Re: [OMPI devel] mca_bml_r2_del_btl incorrect memory size reallocation?

2014-01-23 Thread Ralph Castain
I would think valgrind on the app would be your best bet. If/when you do commit it, please remember to cmr it for the 1.7.5 milestone. Thanks Ralph On Jan 23, 2014, at 12:57 AM, Christoph Niethammer wrote: > Hello > > I think I found a minor memory bug in the bml_r2 code in function > mca_bm

Re: [OMPI devel] [PATCH] use ORTE_PROC_IS_APP

2014-01-23 Thread Ralph Castain
Sure - no issues with me On Jan 23, 2014, at 7:10 AM, Adrian Reber wrote: > Selecting SNAPC requires the information if it is an app or not: > > int orte_snapc_base_select(bool seed, bool app); > > The following patch uses the correct define. Can I commit it like this: > > diff --git a/orte/mca/ess/b

Re: [OMPI devel] [PATCH] make orte-checkpoint communicate with orterun again

2014-01-23 Thread Ralph Castain
Looks correct to me - you are right in that you cannot release the buffer until after the send completes. We don't copy the data underneath to save memory and time. On Jan 23, 2014, at 6:51 AM, Adrian Reber wrote: > Following patch makes orte-checkpoint communicate with orterun again: > > di
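The rule Ralph states (the payload is not copied, so the buffer must outlive the send) can be sketched with a toy non-blocking send. Everything here is hypothetical: `toy_send_nb`, `toy_progress`, and `release_after_send` are illustration names, not the ORTE messaging API, and the "wire" is just a local array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names are hypothetical; this is not the ORTE API. */
typedef void (*send_done_fn)(char *buf);

static struct {
    char *buf;          /* borrowed pointer: the payload is NOT copied */
    size_t len;
    send_done_fn done;  /* called once the data has left the buffer */
} pending;

static char wire[64];      /* stands in for the receiving side */
static int release_count;  /* how many times the callback has run */

/* Queue a send and return immediately. The caller must keep buf
 * alive until done(buf) is invoked. */
static int toy_send_nb(char *buf, size_t len, send_done_fn done)
{
    if (buf == NULL || done == NULL || len > sizeof(wire) || pending.buf != NULL) {
        return -1;
    }
    pending.buf = buf;
    pending.len = len;
    pending.done = done;
    return 0;
}

/* Complete the queued send, then hand the buffer back to its owner. */
static void toy_progress(void)
{
    if (pending.buf == NULL) {
        return;
    }
    char *buf = pending.buf;
    send_done_fn done = pending.done;
    assert(done != NULL);
    memcpy(wire, buf, pending.len);  /* data is read only now */
    pending.buf = NULL;
    done(buf);
}

/* Completion callback: the earliest safe point to free the buffer. */
static void release_after_send(char *buf)
{
    free(buf);
    release_count++;
}
```

Freeing the buffer right after `toy_send_nb` returns would leave `toy_progress` reading freed memory; deferring the release to the callback avoids that without paying for a copy.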

Re: [OMPI devel] trunk and v1.7: xlc and lost atomics patch

2014-01-23 Thread Ralph Castain
Sigh - no idea how that patch went into the 1.6 series without first entering the trunk. Thanks so much for tracking it down! Now in the trunk and cmr'd for 1.7.4 On Jan 22, 2014, at 9:23 PM, Paul Hargrove wrote: > Testing the trunk w/ xlc-11.1 on a linux/ppc64 system I see two failures from

Re: [OMPI devel] build failure in trunk

2014-01-23 Thread Nathan Hjelm
Shoot. Forgot to add the ignore for that component. Will do that now. -Nathan On Thu, Jan 23, 2014 at 08:17:47AM +0200, Mike Dubman wrote: > 06:29:26 make[3]: Leaving directory > `/scrap/jenkins/workspace/hpc-ompi-shmem/label/r-vmb-centos5-u7-x86-64/ompi/mca/bcol/ptpcoll' > 06:29:26 make[2]: L

Re: [OMPI devel] 1.7.4rc: yet another launch failure

2014-01-23 Thread Nathan Hjelm
I agree. A configure option to disable the use of getpwuid would be great as it is one of those functions that can never be static. getpwuid also fails for no particular reason on at least one XC30. -Nathan On Wed, Jan 22, 2014 at 08:57:20PM -0800, Ralph Castain wrote: >Interesting - still, I
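As a rough illustration of what such an option could gate, the sketch below looks up the user name with `getpwuid` unless a compile-time macro turns that off, and otherwise falls back to `$USER` and then the numeric uid. The macro `TOY_DISABLE_GETPWUID` and the function `toy_user_name` are hypothetical names invented for this example; the thread does not say what the option or the fallback would actually look like.

```c
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* TOY_DISABLE_GETPWUID is a hypothetical macro that a configure
 * option could define; it is not an existing Open MPI setting. */
static int toy_user_name(char *out, size_t len)
{
    if (out == NULL || len == 0) {
        return -1;
    }
#ifndef TOY_DISABLE_GETPWUID
    /* Preferred path: ask the password database. */
    struct passwd *pw = getpwuid(getuid());
    if (pw != NULL && pw->pw_name != NULL && pw->pw_name[0] != '\0') {
        snprintf(out, len, "%s", pw->pw_name);
        return 0;
    }
#endif
    /* Fallback 1: the USER environment variable, if set. */
    const char *env = getenv("USER");
    if (env != NULL && env[0] != '\0') {
        snprintf(out, len, "%s", env);
        return 0;
    }
    /* Fallback 2: a name derived from the numeric uid. */
    snprintf(out, len, "uid%lu", (unsigned long) getuid());
    return 0;
}
```

Because the last fallback cannot fail, a lookup error in `getpwuid` degrades to a less friendly name instead of aborting the launch.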

[OMPI devel] [PATCH] use ORTE_PROC_IS_APP

2014-01-23 Thread Adrian Reber
Selecting SNAPC requires the information if it is an app or not: int orte_snapc_base_select(bool seed, bool app); The following patch uses the correct define. Can I commit it like this: diff --git a/orte/mca/ess/base/ess_base_std_app.c b/orte/mca/ess/base/ess_base_std_app.c index dbbb2f4..f3a38f0 100644

[OMPI devel] [PATCH] make orte-checkpoint communicate with orterun again

2014-01-23 Thread Adrian Reber
Following patch makes orte-checkpoint communicate with orterun again: diff --git a/orte/tools/orte-checkpoint/orte-checkpoint.c b/orte/tools/orte-checkpoint/orte-checkpoint.c index 7106342..8539f34 100644 --- a/orte/tools/orte-checkpoint/orte-checkpoint.c +++ b/orte/tools/orte-checkpoint/orte-che

Re: [OMPI devel] trunk: typo in error message

2014-01-23 Thread Jeff Squyres (jsquyres)
Fixed and slated for 1.7.5; thanks. On Jan 23, 2014, at 2:33 AM, Paul Hargrove wrote: > As originally noted in Dec 2011 > (http://www.open-mpi.org/community/lists/devel/2011/12/10169.php) there is a > 1-character typo in generate-asm.pl: > > $ cat -n generate-asm.pl | head -20 > 1 #!/us

Re: [OMPI devel] 1.7.4 status update

2014-01-23 Thread Ralph Castain
woot!!! Thanks Paul and Jeff! On Jan 22, 2014, at 10:22 PM, Paul Hargrove wrote: > > On Wed, Jan 22, 2014 at 7:27 PM, Paul Hargrove wrote: > After the 1.7 tests on the XLF, Open64 and PathScale platforms complete I'll > be testing the trunk on those systems with the compiler-appropriate > --

[OMPI devel] mca_bml_r2_del_btl incorrect memory size reallocation?

2014-01-23 Thread Christoph Niethammer
Hello, I think I found a minor memory bug in the bml_r2 code in function mca_bml_r2_del_btl, but I could not figure out when this function ever gets called. How can I test this function in a proper way? Here is the diff showing the issue: @@ -699,11 +699,11 @@ static int mca_bml_r2_del_btl(mca_btl_
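The diff is cut off here, so the actual change is not visible. As a generic illustration only (not the bml_r2 code), one common way a shrink-on-delete goes wrong is passing an element count to `realloc` where a byte count is required. The sketch below shows the correct sizing; `toy_remove_at` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from Open MPI. Removes items[idx]
 * from a heap-allocated array of *count pointers, shrinks the
 * allocation, and returns the (possibly moved) array. */
static void **toy_remove_at(void **items, size_t *count, size_t idx)
{
    size_t n = *count;
    if (items == NULL || idx >= n) {
        return items;  /* nothing to remove */
    }
    /* Close the gap left by the removed slot. */
    memmove(&items[idx], &items[idx + 1], (n - idx - 1) * sizeof(items[0]));
    n--;
    *count = n;
    if (n == 0) {
        free(items);
        return NULL;
    }
    /* realloc takes BYTES: element count times element size. */
    void **shrunk = realloc(items, n * sizeof(items[0]));
    /* If shrinking fails the old block is still valid, so keep it. */
    return shrunk != NULL ? shrunk : items;
}
```

Running a small driver for a routine like this under valgrind would flag an undersized allocation as soon as the surviving elements are touched.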

[OMPI devel] trunk: typo in error message

2014-01-23 Thread Paul Hargrove
As originally noted in Dec 2011 ( http://www.open-mpi.org/community/lists/devel/2011/12/10169.php) there is a 1-character typo in generate-asm.pl: $ cat -n generate-asm.pl | head -20 1 #!/usr/bin/perl -w 2 3 4 my $asmarch = shift; 5 my $asmformat = shift; 6 my $ba

Re: [OMPI devel] 1.7.4 status update

2014-01-23 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 7:27 PM, Paul Hargrove wrote: > After the 1.7 tests on the XLF, Open64 and PathScale platforms complete > I'll be testing the trunk on those systems with the compiler-appropriate > --enable-mpi-fortran= settings. The following are results (for trunk) for four compilers

[OMPI devel] build failure in trunk

2014-01-23 Thread Mike Dubman
06:29:26 make[3]: Leaving directory `/scrap/jenkins/workspace/hpc-ompi-shmem/label/r-vmb-centos5-u7-x86-64/ompi/mca/bcol/ptpcoll'
06:29:26 make[2]: Leaving directory `/scrap/jenkins/workspace/hpc-ompi-shmem/label/r-vmb-centos5-u7-x86-64/ompi/mca/bcol/ptpcoll'
06:29:26 Making install in mca/bcol

Re: [OMPI devel] Unknown object files in libmpi.a

2014-01-23 Thread Paul Hargrove
Irvanda, Others on this list might have specific knowledge of the objects you listed, but I am going to present a general solution that hopefully will let you find the answers you seek. If you have libmpi.a built from sources configured with --enable-debug, then the source file information is sto

[OMPI devel] trunk and v1.7: xlc and lost atomics patch

2014-01-23 Thread Paul Hargrove
Testing the trunk w/ xlc-11.1 on a linux/ppc64 system I see two failures from "make check". Specifically the atomic_cmpset and atomic_spinlock tests both get segfaults. This is an issue I first reported against 1.5.5rc2 and v1.6. It appears that ticket 3040 was opened at the time of my original