Re: [OMPI devel] [OMPI svn] svn:open-mpi r17941

2008-03-27 Thread Ralph H Castain
Appears fixed with r17992 - at least, it works on TM, slurm (odin), and Mac.

On 3/27/08 11:06 AM, "Ralph H Castain" wrote:
> Found the problem - should have a fix committed soon. Issue is with
> differences in the number of daemons launched by the various plms (whether
> or not procs are launch

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17941

2008-03-27 Thread Ralph H Castain
Found the problem - should have a fix committed soon. Issue is with differences in the number of daemons launched by the various plms (whether or not procs are launched local to mpirun).

On 3/27/08 10:39 AM, "Ralph H Castain" wrote:
> Hmmm...puzzling. It is working fine for me on TM machines a

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17941

2008-03-27 Thread Ralph H Castain
Hmmm...puzzling. It is working fine for me on TM machines and on my Mac. However, Galen reports it borked on alps as well. I'll have to dig a little to check this out and see if there is something missing on those PLMs. Will get back shortly. Sorry for problem.

On 3/27/08 10:28 AM, "Tim Prins"

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17941

2008-03-27 Thread Tim Prins
Unfortunately now with r17988 I cannot run any mpi programs, they seem to hang in the modex.

Tim

Ralph H Castain wrote:
> Thanks Tim - I found the problem and will commit a fix shortly. Appreciate your testing and reporting!
>
> On 3/27/08 8:24 AM, "Tim Prins" wrote:
>> This commit breaks things

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17941

2008-03-27 Thread Ralph H Castain
Thanks Tim - I found the problem and will commit a fix shortly. Appreciate your testing and reporting!

On 3/27/08 8:24 AM, "Tim Prins" wrote:
> This commit breaks things for me. Running on 3 nodes of odin:
>
> mpirun -mca btl tcp,sm,self examples/ring_c
>
> causes a hang. All of the process

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17941

2008-03-27 Thread Tim Prins
This commit breaks things for me. Running on 3 nodes of odin:

mpirun -mca btl tcp,sm,self examples/ring_c

causes a hang. All of the processes are stuck in orte_grpcomm_base_barrier during MPI_Finalize. Not all programs hang, and the ring program does not hang all the time, but fairly often.
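
Because the hang reported above is intermittent, a single successful run says little. One way to get a rough failure rate is to repeat the same command under a timeout and count the runs that never finish. The sketch below is not part of Open MPI or of this thread; the helper name `count_hangs`, the run count, and the timeout are all assumptions chosen for illustration.

```python
import subprocess


def count_hangs(cmd, runs=20, timeout_s=60.0):
    """Run `cmd` `runs` times and return how many runs exceeded `timeout_s`.

    Hypothetical helper for illustration only. A run counts as a hang when
    it has not exited within `timeout_s` seconds; the exit status of runs
    that do finish is ignored.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child on timeout; processes it
            # launched on other nodes may still need manual cleanup.
            hangs += 1
    return hangs


# Example call (assumes mpirun and examples/ring_c exist on the machine):
# count_hangs(["mpirun", "-mca", "btl", "tcp,sm,self", "examples/ring_c"])
```

The timeout should be well above the normal runtime of the program, otherwise slow but healthy runs get counted as hangs.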