[OMPI devel] 1.7.4rc: MIPS64 atomics tests fail

2014-01-21 Thread Paul Hargrove
Building a recent (1.7.4rc2r30303) v1.7 tarball on a (QEMU-emulated) MIPS64 system, I find that the opal atomics tests fail. Applying the "for trunk" patch I attached to ticket #3039 roughly 1 year ago resolves the problems for me. I suppose it would be great if at least one person with real MIPS h

Re: [OMPI devel] callback debugging

2014-01-21 Thread Adrian Reber
orte-checkpoint before communicating with orterun which runs the processes I am trying to checkpoint. The full backtrace: #0 0x769befa0 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81 #1 0x77b45712 in app_coord_init () at ../../../../../orte/mca/snapc/full/s

Re: [OMPI devel] callback debugging

2014-01-21 Thread Adrian Reber
I think I still do not really understand how it works. The barrier on which orte-checkpoint is currently hanging is in app_coord_init(). You are also saying that orte-checkpoint should not be calling a barrier. The backtrace of the point where it is hanging now looks like: #0 0x769befa0

Re: [OMPI devel] callback debugging

2014-01-21 Thread Ralph Castain
That doesn't make any sense - I can't imagine a reason for orte-checkpoint itself to be running a barrier. I wonder if it is selecting the wrong component in snapc? As for the patch, that isn't going to work. The collective id has to be *globally* unique, which means that only orterun can issue

Re: [OMPI devel] 1.7.4rc: MIPS64 atomics tests fail

2014-01-21 Thread Ralph Castain
I dug back and found that your trunk patch still applies, so I committed it and moved it to 1.7.4. So if you wouldn't mind verifying it once the nightly tarball is available, I'd appreciate it. Thanks! Ralph On Jan 20, 2014, at 9:38 PM, Paul Hargrove wrote: > Building a recent (1.7.4rc2r30303

Re: [OMPI devel] callback debugging

2014-01-21 Thread Adrian Reber
Good to know that it does not make any sense. So it is not just me. Looking at the call chain I can see orte_snapc_base_select(ORTE_PROC_IS_HNP, !ORTE_PROC_IS_DAEMON); and the second parameter is used to decide if it is an app or not: int orte_snapc_base_select(bool seed, bool app) in orte/mca/sn

Re: [OMPI devel] callback debugging

2014-01-21 Thread Ralph Castain
That second argument is incorrect - it should be ORTE_PROC_IS_APP (note no !). The problem is that orte-checkpoint is a tool, and so it isn't a daemon - but it is also not an app. On Jan 21, 2014, at 11:56 AM, Adrian Reber wrote: > Good to know that it does not make any sense. So it not just
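The point made above, that a tool is neither a daemon nor an app, can be sketched in a few lines of C. This is only an illustration: the `proc_type_t` struct, its fields, and the two helper functions are hypothetical stand-ins invented here, not the real ORTE macros or the actual `orte_snapc_base_select()` implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the process-type flags discussed in the thread.
 * The real ORTE_PROC_IS_* macros are not reproduced here. */
typedef struct {
    bool is_hnp;     /* orterun (head node process) */
    bool is_daemon;  /* orted */
    bool is_app;     /* application process */
    bool is_tool;    /* e.g. orte-checkpoint */
} proc_type_t;

/* What the call site passed as the "app" argument: !ORTE_PROC_IS_DAEMON. */
static bool app_flag_negated_daemon(proc_type_t p)
{
    return !p.is_daemon;
}

/* What the reply suggests passing instead: ORTE_PROC_IS_APP. */
static bool app_flag_explicit(proc_type_t p)
{
    return p.is_app;
}

/* A tool: not a daemon, but not an app either (assumed flag values). */
static const proc_type_t TOOL = { .is_tool = true };
```

Under these assumptions, `app_flag_negated_daemon(TOOL)` is true while `app_flag_explicit(TOOL)` is false, so the negated-daemon test would treat a tool as an app. That would be consistent with orte-checkpoint ending up in the application-side path (`app_coord_init()`) seen in the backtrace.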

Re: [OMPI devel] callback debugging

2014-01-21 Thread Adrian Reber
Thanks, that helps. Now it actually starts to communicate with the orterun process. This still fails but I will try to fix it. On Tue, Jan 21, 2014 at 12:27:55PM -0800, Ralph Castain wrote: > That second argument is incorrect - it should be ORTE_PROC_IS_APP (note no > !). The problem is that orte

Re: [OMPI devel] 1.7.4rc: MPI_F08_TYPE build failure with AMD's Open64

2014-01-21 Thread Jeff Squyres (jsquyres)
Paul -- I'm sorry, due to craziness and the holiday yesterday, the amended Fortran BIND(C) checks didn't get committed to the v1.7 branch until earlier today. So they'll be in tonight's tarball. It looks to me like the Open64 compilers won't pass the BIND(C) checks, and we should be ok. Can

Re: [OMPI devel] 1.7.4rc: MPI_F08_TYPE build failure with AMD's Open64

2014-01-21 Thread Paul Hargrove
On Tue, Jan 21, 2014 at 1:55 PM, Jeff Squyres (jsquyres) wrote: > Paul -- > > I'm sorry, due to craziness and the holiday yesterday, the amended Fortran > BIND(C) checks didn't get committed to the v1.7 branch until earlier today. > So they'll be in tonight's tarball. > > It looks to me like the

[OMPI devel] 1.7.4 status update

2014-01-21 Thread Ralph Castain
Hi folks I think it is safe to say that we are not going to get a release candidate out tonight - more Fortran problems have surfaced, along with the need to complete the ROMIO review. I have therefore concluded we cannot release 1.7.4 this week. This leaves us with a couple of options: 1. con

Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-21 Thread Jeff Squyres (jsquyres)
I'm glad you did this test; it pulled on a thread which eventually led me to realize that a fix I made on the trunk (and took to v1.7) for gfortran 4.9 was just the Wrong Thing to do. I've now reverted that fix on trunk/v1.7, which should put us in a good position for pathscale. It leaves us w

Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-21 Thread Paul Hargrove
Jeff, Looks like we may be getting closer, but are not quite there: PPFC mpi-f08.lo BIND(C, name="ompi_type_create_hindexed_block_f") ^ pathf95-1690 pathf95: ERROR OMPI_TYPE_CREATE_HINDEXED_BLOCK_F, File = /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x

Re: [OMPI devel] 1.7.4rc: MPI_F08_TYPE build failure with AMD's Open64

2014-01-21 Thread Paul Hargrove
Jeff, Not surprisingly (given their common ancestry), the Open64 fortran compiler is now failing with essentially the same errors as I reported for PathScale-4.0 a few minutes ago. PPFC mpi-f08.lo use :: mpi_f08_types, only : MPI_ADDRESS_KIND ^ openf95-1690 openf90: ERRO