slurm 2.3.2

-Andrew

On Tue, Jan 17, 2012 at 6:05 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
> What version of slurm?
>
> Sent from my iPad
>
> On Jan 17, 2012, at 4:36 AM, Andrew Senin <andrew.se...@itseez.com> wrote:
>
>> Hi Ralph,
>>
>> If you want, Mike can provide access to the lab with RHEL 6.0 where we
>> see the problem.
>>
>> Thanks,
>> Andrew Senin
>>
>> On Tue, Jan 17, 2012 at 9:59 AM, Mike Dubman <mike.o...@gmail.com> wrote:
>>> It happens for us on RHEL 6.0
>>>
>>> On Tue, Jan 17, 2012 at 3:46 AM, Ralph Castain <rhc.open...@gmail.com> wrote:
>>>>
>>>> Well, I'm afraid I can't replicate your report. It runs fine for me.
>>>>
>>>> Sent from my iPad
>>>>
>>>> On Jan 16, 2012, at 4:25 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
>>>>
>>>>> Hmmmm....probably a bug. I haven't tested that branch yet. Will take a
>>>>> look.
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On Jan 16, 2012, at 11:56 AM, Andrew Senin <andrew.se...@itseez.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think I've found a bug in the head revision of the OpenMPI 1.5
>>>>>> branch. If it is configured with --disable-debug it crashes in
>>>>>> finalize on the hello_c.c example. Did I miss something?
>>>>>>
>>>>>> Configure options:
>>>>>> ./configure --with-pmi=/usr/ --with-slurm=/usr/ --without-psm --disable-debug --enable-mpirun-prefix-by-default --prefix=/hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install
>>>>>>
>>>>>> Runtime command and output:
>>>>>> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:../lib ./mpirun --mca btl openib,self --npernode 1 --host mir1,mir2 ./hello
>>>>>>
>>>>>> Hello, world, I am 0 of 2
>>>>>> Hello, world, I am 1 of 2
>>>>>> [mir1:05542] *** Process received signal ***
>>>>>> [mir1:05542] Signal: Segmentation fault (11)
>>>>>> [mir1:05542] Signal code: Address not mapped (1)
>>>>>> [mir1:05542] Failing at address: 0xe8
>>>>>> [mir2:10218] *** Process received signal ***
>>>>>> [mir2:10218] Signal: Segmentation fault (11)
>>>>>> [mir2:10218] Signal code: Address not mapped (1)
>>>>>> [mir2:10218] Failing at address: 0xe8
>>>>>> [mir1:05542] [ 0] /lib64/libpthread.so.0() [0x390d20f4c0]
>>>>>> [mir1:05542] [ 1] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8) [0x7f4588cee6a8]
>>>>>> [mir1:05542] [ 2] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32) [0x7f4588cee700]
>>>>>> [mir1:05542] [ 3] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73) [0x7f4588d1beb2]
>>>>>> [mir1:05542] [ 4] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe) [0x7f4588c81eb5]
>>>>>> [mir1:05542] [ 5] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a) [0x7f4588c217c3]
>>>>>> [mir1:05542] [ 6] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59) [0x7f4588c39959]
>>>>>> [mir1:05542] [ 7] ./hello(main+0x69) [0x4008fd]
>>>>>> [mir1:05542] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x390ca1ec5d]
>>>>>> [mir1:05542] [ 9] ./hello() [0x4007d9]
>>>>>> [mir1:05542] *** End of error message ***
>>>>>> [mir2:10218] [ 0] /lib64/libpthread.so.0() [0x3a6dc0f4c0]
>>>>>> [mir2:10218] [ 1] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8) [0x7f409f31d6a8]
>>>>>> [mir2:10218] [ 2] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32) [0x7f409f31d700]
>>>>>> [mir2:10218] [ 3] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73) [0x7f409f34aeb2]
>>>>>> [mir2:10218] [ 4] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe) [0x7f409f2b0eb5]
>>>>>> [mir2:10218] [ 5] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a) [0x7f409f2507c3]
>>>>>> [mir2:10218] [ 6] /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59) [0x7f409f268959]
>>>>>> [mir2:10218] [ 7] ./hello(main+0x69) [0x4008fd]
>>>>>> [mir2:10218] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a6d41ec5d]
>>>>>> [mir2:10218] [ 9] ./hello() [0x4007d9]
>>>>>> [mir2:10218] *** End of error message ***
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun noticed that process rank 0 with PID 5542 on node mir1 exited
>>>>>> on signal 11 (Segmentation fault).
>>>>>> ---------------------------------------------------------------------
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew Senin
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users