Re: [OMPI devel] v1.7 and trunk: hello_oshmemfh link failure with xlc/ppc32/linux

2014-02-09 Thread Paul Hargrove
The same problem exists on a ppc64 target with a newer xlc (12.1 vs 11.1) but the same xlf (14.1). -Paul

On Sat, Feb 8, 2014 at 5:22 PM, Paul Hargrove wrote:
> Testing the current v1.7 tarball (1.7.5a1r30634), I get a failure when
> building the oshmem examples.
> I've confirmed that the same ...
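
For reference, the oshmem examples being built are small OpenSHMEM hello-world programs; hello_oshmemfh is the Fortran variant. A minimal C sketch of the same shape, assuming the OpenSHMEM 1.0 calls (start_pes/_my_pe) that the 1.7-era oshmem layer exposes:

    /* Minimal OpenSHMEM hello in C; hello_oshmemfh is the Fortran
     * analogue of this. Assumes the OpenSHMEM 1.0 API available in
     * the 1.7-era oshmem layer. */
    #include <stdio.h>
    #include <shmem.h>

    int main(void)
    {
        start_pes(0);                 /* bring up the OpenSHMEM runtime */
        printf("Hello from PE %d of %d\n", _my_pe(), _num_pes());
        return 0;
    }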

[OMPI devel] v1.7.5a1: mpirun failure on ppc/linux (regression vs 1.7.4)

2014-02-09 Thread Paul Hargrove
I have tried building the current v1.7 tarball (1.7.5a1r30639) with gcc on two ppc64/linux machines and one ppc32/linux. All three die in MPI_Init when I try to run ring_c. I've retested 1.7.4 on both ppc64 machines, and thankfully the problem is not present. Each of them at least dies with wha ...
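
For reference, ring_c from Open MPI's examples/ directory is only slightly larger than the following sketch; since the reported crash is inside MPI_Init itself, even a stripped-down program like this would die the same way:

    /* Stripped-down MPI program in the spirit of examples/ring_c.c.
     * The crash reported above happens inside MPI_Init, before any
     * communication takes place. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);   /* dies here on the affected platforms */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }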

Re: [OMPI devel] v1.7.5a1: mpirun failure on ppc/linux (regression vs 1.7.4)

2014-02-09 Thread Mike Dubman
Hi, we get the same crash with gcc and x86_64.

On Sun, Feb 9, 2014 at 10:32 AM, Paul Hargrove wrote:
> I have tried building the current v1.7 tarball (1.7.5a1r30639) with gcc on
> two ppc64/linux machines and one ppc32/linux. All three die in MPI_Init
> when I try to run ring_c.
>
> I've retested ...

Re: [OMPI devel] v1.7.5a1: mpirun failure on ppc/linux (regression vs 1.7.4)

2014-02-09 Thread Paul Hargrove
Oddly, all of my x86-64 platforms are OK. -Paul [Sent from my phone]

On Feb 9, 2014 6:22 AM, "Mike Dubman" wrote:
> Hi,
> we get the same crash with gcc and x86_64.
>
> On Sun, Feb 9, 2014 at 10:32 AM, Paul Hargrove wrote:
>> I have tried building the current v1.7 tarball (1.7.5a1r30639) with g...

Re: [OMPI devel] Update on 1.7.5

2014-02-09 Thread Paul Hargrove
Testing v1.7 w/ oshmem I did have a few problems:
http://www.open-mpi.org/community/lists/devel/2014/02/14056.php
http://www.open-mpi.org/community/lists/devel/2014/02/14057.php
http://www.open-mpi.org/community/lists/devel/2014/02/14059.php
Solaris MPI_Init failures that I have yet to triage ...

Re: [OMPI devel] v1.7.5a1: mpirun failure on ppc/linux (regression vs 1.7.4)

2014-02-09 Thread Paul Hargrove
Below is some info collected from a core generated by running ring_c without mpirun. It looks like a bogus btl_module pointer or a corrupted object is the culprit in this crash. -Paul

Core was generated by `./ring_c '.
Program terminated with signal 11, Segmentation fault.
#0 0x0080c9b990ac ...

[OMPI devel] v1.7.5a1 (regression): MPI_Init crash on ppc/linux and sparc/solaris

2014-02-09 Thread Paul Hargrove
I alluded in an earlier email to problems on SPARC/Solaris that I had yet to triage. It turns out the problem there is the same as on ppc/linux. From a core generated by ring_c run without mpirun:

[1] strcmp(0x100274390, 0x48, 0x100274347, 0x1, 0x8080808080808080, 0x101010101010101), at 0x...
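
The 0x8080808080808080 and 0x101010101010101 values in the argument dump look like the classic word-at-a-time zero-byte-detection constants that optimized strcmp/strlen implementations keep in registers, which fits the reading that strcmp itself was handed a corrupted pointer. A sketch of that trick:

    /* The zero-byte test used by word-at-a-time strcmp/strlen
     * implementations; the two magic constants are the ones visible
     * in the debugger output above. */
    #include <stdint.h>

    static int word_has_zero_byte(uint64_t w)
    {
        const uint64_t lomagic = 0x0101010101010101ULL;
        const uint64_t himagic = 0x8080808080808080ULL;
        /* True iff some byte of w is 0x00: subtracting lomagic borrows
         * into a byte's top bit exactly when that byte was zero (and
         * its own top bit was clear, hence the & ~w). */
        return ((w - lomagic) & ~w & himagic) != 0;
    }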

[OMPI devel] [PATCH] Re: Still having issues w/ opal_path_nfs and EPERM

2014-02-09 Thread Paul Hargrove
I found the source of the problem, and a solution. The following is r30612, in which Jeff thought he had fixed the problem:

--- opal/util/path.c    (revision 30611)
+++ opal/util/path.c    (revision 30612)
@@ -515,12 +515,17 @@
} while (-1 == vfsrc && ESTALE == errno && (0 < --trials));
#en...
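
The surviving hunk shows the shape of the r30612 fix: retry statfs() while it fails with ESTALE. A sketch of that idiom, extended with the EPERM tolerance this thread is about -- the helper name and the exact EPERM handling are illustrative, not the literal patch:

    /* Sketch of the retry idiom visible in the r30612 hunk, plus an
     * illustrative EPERM fallback; not the literal opal/util/path.c
     * code. */
    #include <errno.h>
    #include <sys/vfs.h>

    static int statfs_with_retry(const char *path, struct statfs *buf)
    {
        int vfsrc, trials = 5;
        do {
            vfsrc = statfs(path, buf);
        } while (-1 == vfsrc && ESTALE == errno && (0 < --trials));
        if (-1 == vfsrc && EPERM == errno) {
            /* Some filesystems deny statfs to unprivileged callers;
             * treat that as "type unknown" rather than a hard error. */
            buf->f_type = 0;
            vfsrc = 0;
        }
        return vfsrc;
    }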

[OMPI devel] Compilation error: 'OMPI_MPIHANDLES_DLL_PREFIX' undeclared

2014-02-09 Thread Irvanda Kurniadi
Hi, I'm porting Open MPI to L4/Fiasco. I found these error messages while compiling Open MPI:

error: 'OMPI_MPIHANDLES_DLL_PREFIX' undeclared (first use in this function)
error: 'OMPI_MSGQ_DLL_PREFIX' undeclared (first use in this function)

I found OMPI_MPIHANDLES_DLL_PREFIX in CMakeLists.txt like ...
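
Those two macros normally come out of Open MPI's build configuration (they name the parallel-debugger support DLLs), so a port whose build system never generates them will hit exactly this error. A stopgap sketch -- the fallback strings here are illustrative, the real values come from configure/CMake:

    /* Stopgap for a port whose build system does not define the
     * debugger-DLL name macros; the fallback strings below are
     * illustrative, not the values Open MPI's configury generates. */
    #ifndef OMPI_MPIHANDLES_DLL_PREFIX
    #define OMPI_MPIHANDLES_DLL_PREFIX "libompi_dbg_mpihandles"
    #endif
    #ifndef OMPI_MSGQ_DLL_PREFIX
    #define OMPI_MSGQ_DLL_PREFIX "libompi_dbg_msgq"
    #endif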