I tried to replicate this, but I can't. I'm using the latest 1.6 tarball rather than 1.6.5, so it is possible something was fixed, though I believe we have committed very few changes since that series is about to drop to "deprecated".
First thing I encountered:

configure: WARNING: unrecognized options: --disable-maintainer-mode, --enable-ltdl-convenience

So I removed those - no idea what they even do - but retained the rest of your configure options. I then used your cmd line, substituting "hostname" for "foo", and everything ran just fine on an ssh-based system.

Here's my system info:

Linux bend001 2.6.32-358.18.1.el6.x86_64 #1 SMP Wed Aug 28 17:19:38 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)

On Nov 15, 2013, at 7:24 AM, Sylvestre Ledru <sylves...@debian.org> wrote:

> Hello,
>
> On 02/10/2013 19:34, Jeff Squyres (jsquyres) wrote:
>> On Sep 30, 2013, at 11:05 AM, Sylvestre Ledru <sylves...@debian.org> wrote:
>>
>>> Here are the options list:
>>> configure: running /bin/bash './configure' CFLAGS="-DNDEBUG -g -O2 -Wformat -Werror=format-security -finline-functions -fno-strict-aliasing -pthread" CPPFLAGS=" -I/usr//include -I/usr/include/infiniband -I/usr/include/infiniband" FFLAGS="-g -O2" LDFLAGS=" -L/usr//lib" --enable-shared --disable-static --prefix=/usr --with-mpi=open_mpi --disable-aio --cache-file=/dev/null --srcdir=. --disable-option-checking
>> Hmm -- I'm confused here; it's not possible that you're getting an assertion
>> failure with this configure line, for two reasons:
>>
>> 1. The assert() in question will only be compiled in if you --enable-debug
>> on the configure command line.
>> 2. You supplied -DNDEBUG in CFLAGS, which means you've disabled all assert()s
>>
>> Can you verify that this is the correct configure line that you used to
>> generate that error? Or is something else going on?
>>
> So, I tried with the arguments you sent me in private.
> $ ./configure --prefix=/home/sylvestre/bogus2 --disable-maintainer-mode --disable-dependency-tracking --with-threads=posix --enable-opal-multi-threads --disable-silent-rules --enable-debug --with-devel-headers --with-slurm --with-sge --enable-heterogeneous --disable-vt --enable-mpirun-prefix-by-default --enable-mpi-f77 --enable-mpi-f90 --enable-ltdl-convenience
>
> I am getting something more interesting than a freeze (even if it does
> not mean much to me):
> ./mpirun -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca rmaps_base_verbose 5 -mca ess_base_verbose 5 -c 4 foo
> [merulo:32531] mca:base:select:( ess) Querying component [env]
> [merulo:32531] mca:base:select:( ess) Skipping component [env]. Query failed to return a module
> [merulo:32531] mca:base:select:( ess) Querying component [hnp]
> [merulo:32531] mca:base:select:( ess) Query of component [hnp] set priority to 100
> [merulo:32531] mca:base:select:( ess) Querying component [singleton]
> [merulo:32531] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module
> [merulo:32531] mca:base:select:( ess) Querying component [slave]
> [merulo:32531] mca:base:select:( ess) Query of component [slave] set priority to 0
> [merulo:32531] mca:base:select:( ess) Querying component [slurm]
> [merulo:32531] mca:base:select:( ess) Skipping component [slurm]. Query failed to return a module
> [merulo:32531] mca:base:select:( ess) Querying component [slurmd]
> [merulo:32531] mca:base:select:( ess) Skipping component [slurmd]. Query failed to return a module
> [merulo:32531] mca:base:select:( ess) Querying component [tm]
> [merulo:32531] mca:base:select:( ess) Skipping component [tm]. Query failed to return a module
> [merulo:32531] mca:base:select:( ess) Querying component [tool]
> [merulo:32531] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module
> [merulo:32531] mca:base:select:( ess) Selected component [hnp]
> [merulo:32531] mca:base:select:( plm) Querying component [rsh]
> [merulo:32531] [[INVALID],INVALID] plm:base:rsh_lookup on agent ssh : rsh path NULL
> [merulo:32531] *** Process received signal ***
> [merulo:32531] Signal: Segmentation fault (11)
> [merulo:32531] Signal code: Invalid permissions (2)
> [merulo:32531] Failing at address: (nil)
> [merulo:32531] [ 0] linux-gate.so.1(__kernel_sigtramp+0x7fffffffff886860) [0xa000000000040800]
> [merulo:32531] [ 1] /home/sylvestre/bogus2/lib/openmpi/mca_plm_rsh.so(orte_plm_rsh_component_query+0xae3c0) [0x2000000000867f40]
> [merulo:32531] [ 2] /home/sylvestre/bogus2/lib/libopen-rte.so.4(mca_base_select-0x5dc110) [0x20000000001ddea0]
> [merulo:32531] [ 3] /home/sylvestre/bogus2/lib/libopen-rte.so.4(orte_plm_base_select-0x680cd0) [0x20000000001392f0]
> [merulo:32531] [ 4] /home/sylvestre/bogus2/lib/openmpi/mca_ess_hnp.so(+0x56f0) [0x20000000008316f0]
> [merulo:32531] [ 5] /home/sylvestre/bogus2/lib/libopen-rte.so.4(orte_init-0x72bf10) [0x200000000008e0c0]
> [merulo:32531] [ 6] ./mpirun(orterun+0x1fffffffff84cc80) [0x4000000000006c60]
> [merulo:32531] [ 7] ./mpirun(main+0x1fffffffff84b880) [0x40000000000045e0]
> [merulo:32531] [ 8] /lib/ia64-linux-gnu/libc.so.6.1(__libc_start_main-0x2fcd50) [0x20000000004bd2a0]
> [merulo:32531] [ 9] ./mpirun(_start+0x1fffffffff84a3c0) [0x40000000000043c0]
> [merulo:32531] *** End of error message ***
> Segmentation fault
>
> And the backtrace:
> Program received signal SIGSEGV, Segmentation fault.
> 0x2000000000867f40 in orte_plm_rsh_component_query (module=0x60000fffffffb0d8, priority=0x60000fffffffb0d0) at plm_rsh_component.c:205
> 205 OPAL_OUTPUT_VERBOSE((1, orte_plm_globals.output,
> (gdb) bt
> #0 0x2000000000867f40 in orte_plm_rsh_component_query (module=0x60000fffffffb0d8, priority=0x60000fffffffb0d0) at plm_rsh_component.c:205
> #1 0x20000000001ddea0 in mca_base_select (type_name=0x200000000026e708 "plm", output_id=8, components_available=0x20000000002c5f08 <orte_plm_base>, best_module=0x60000fffffffb0e0, best_component=0x60000fffffffb0e8) at mca_base_components_select.c:76
> #2 0x20000000001392f0 in orte_plm_base_select () at base/plm_base_select.c:46
> #3 0x20000000008316f0 in rte_init () at ess_hnp_module.c:169
> #4 0x200000000008e0c0 in orte_init (pargc=0x60000fffffffb360, pargv=0x60000fffffffb368, flags=4) at runtime/orte_init.c:127
> #5 0x4000000000006c60 in orterun (argc=16, argv=0x60000fffffffb618) at orterun.c:693
> #6 0x40000000000045e0 in main (argc=16, argv=0x60000fffffffb618) at main.c:13
>
>
> Sylvestre
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
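
A side note on the -DNDEBUG point in the quoted exchange, in case it helps anyone reading the archive: in standard C, defining NDEBUG before including <assert.h> makes assert() expand to nothing, so the asserted expression is not even evaluated. The sketch below is a standalone illustration, not Open MPI code; the function names (note_evaluation, evaluations_with_ndebug, evaluations_without_ndebug) are made up for the example, and it says nothing about how Open MPI's own --enable-debug checks are wired up.

```c
/* Illustration only; none of these names come from Open MPI. */

/* Counts how many times an assert() expression was actually evaluated. */
static int evaluations = 0;

static int note_evaluation(void)
{
    evaluations++;
    return 1; /* always true, so an active assert() never aborts */
}

/* Same effect as passing -DNDEBUG in CFLAGS. */
#ifndef NDEBUG
#define NDEBUG
#endif
#include <assert.h>

int evaluations_with_ndebug(void)
{
    evaluations = 0;
    assert(note_evaluation()); /* expands to ((void)0): never evaluated */
    return evaluations;
}

/* <assert.h> may be included again; assert() is redefined to match the
 * current state of NDEBUG each time. */
#undef NDEBUG
#include <assert.h>

int evaluations_without_ndebug(void)
{
    evaluations = 0;
    assert(note_evaluation()); /* active: evaluated exactly once */
    return evaluations;
}
```

Calling evaluations_with_ndebug() returns 0 and evaluations_without_ndebug() returns 1, which is why an assertion failure should not be reachable from a build whose CFLAGS contain -DNDEBUG.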