Ah, I see the "sh: tcp://10.1.25.142,172.31.1.254,10.12.25.142:41686: No such 
file or directory" message now -- I was looking for something like that when I 
replied before and missed it.

I really wish I understood why the heck that is happening; it doesn't seem to 
make sense.  

Matt: Random thought -- is your "srun" a shell script, perchance?  (it 
shouldn't be, but perhaps there's some kind of local override...?)
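
A quick way to check without guessing would be something like:

  $ file `which srun`

which should report either an ELF executable or a shell script.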

Ralph's point on the call today is that it doesn't matter *how* this problem is 
happening.  It *is* happening to real users, and so we need to account for it.

But it really bothers me that we don't understand *how/why* this is happening 
(e.g., is this OMPI's fault somehow?  I don't think so, but then again, we 
don't understand how it's happening).  *Somewhere* in there, a shell is getting 
invoked.  But "srun" shouldn't be invoking a shell on the remote side -- it 
should be directly fork/exec'ing the tokens with no shell interpretation at all.
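
For what it's worth, the symptom is exactly what you'd expect if those tokens 
do pass through a shell somewhere: the unquoted semicolon in the HNP URI 
splits the command in two, and the shell then tries to execute the "tcp://..." 
half as a program.  A quick sketch (URI value copied from Matt's log below; 
"echo" just stands in for orted, and the exact error wording varies by shell):

  $ sh -c 'echo orted -mca orte_hnp_uri 3221946368.0;tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373'
  orted -mca orte_hnp_uri 3221946368.0
  sh: tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373: No such file or directory

Quoting the value keeps it as a single token:

  $ sh -c 'echo orted -mca orte_hnp_uri "3221946368.0;tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373"'
  orted -mca orte_hnp_uri 3221946368.0;tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373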




On Sep 2, 2014, at 7:04 PM, Ralph Castain <r...@open-mpi.org> wrote:

> I can answer that for you right now. The launch of the orted's is what is 
> failing, and they are "silently" failing at this time. The reason is simple:
> 
> 1. We are failing due to truncation of the HNP URI at the first semicolon. 
> This causes the orted to emit an ORTE_ERROR_LOG message and then abort with a 
> non-zero exit status.
> 
> 2. We throw away that error message unless someone adds --debug-daemons, 
> because we redirect the srun output to /dev/null. This is done because slurm 
> spits out other things during our normal operation that confuse users.
> 
> 3. srun detects the non-zero exit status of the orted and aborts the rest of 
> the job.
> 
> So when Matt adds --debug-daemons, he then sees the error messages. When he 
> further adds the oob and plm verbosity, the true error is fully exposed.
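> 
> As a quick illustration of point 1 (URI value taken from Matt's log; this is 
> just a shell sketch, not the actual OMPI code): splitting at the first 
> semicolon leaves the orted with only the part before the semicolon -- i.e., 
> no tcp contact address at all -- which is presumably what trips the "Bad 
> parameter" errors from rml_base_contact.c in the logs:
> 
>   $ uri="3221946368.0;tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373"
>   $ echo "${uri%%;*}"
>   3221946368.0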
> 
> 
> On Sep 2, 2014, at 2:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
>> Matt --
>> 
>> We were discussing this issue on our weekly OMPI engineering call today.
>> 
>> Can you check one thing for me?  With the un-edited 1.8.2 tarball 
>> installation, I see that you're getting no output for commands that you run 
>> -- but also no errors.
>> 
>> Can you verify whether your commands are actually *running*?  E.g., try:
>> 
>> $ cat > script.sh <<EOF
>> #!/bin/sh
>> echo hello world
>> sleep 600
>> echo goodbye world
>> EOF
>> $ chmod +x script.sh
>> $ setenv OMPI_MCA_shmem_mmap_enable_nfs_warning 0
>> $ /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-clean/bin/mpirun 
>> -np 8 script.sh
>> 
>> and then go "ps" on the back-end nodes and see if there is an "orted" 
>> process and N "sleep 600" processes running on them.
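>> 
>> Something like the following on each allocated node should be enough (any 
>> equivalent ps invocation is fine):
>> 
>>   $ ps -ef | egrep 'orted|sleep 600'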
>> 
>> I'm *assuming* you won't see the "hello world" output.
>> 
>> The purpose of this test is to see whether OMPI is just totally erroring 
>> out and not even running your job (which is quite unlikely; OMPI should be 
>> much noisier when this happens), or whether we're simply not seeing the 
>> stdout from the job.
>> 
>> Thanks.
>> 
>> 
>> 
>> On Sep 2, 2014, at 9:36 AM, Matt Thompson <fort...@gmail.com> wrote:
>> 
>>> On that machine, it would be SLES 11 SP1. I think it's soon transitioning 
>>> to SLES 11 SP3.
>>> 
>>> I also use Open MPI on an RHEL 6.5 box (possibly soon to be RHEL 7).
>>> 
>>> 
>>> On Mon, Sep 1, 2014 at 8:41 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> Thanks - I expect we'll have to release 1.8.3 soon to fix this in case 
>>> others have similar issues. Out of curiosity, what OS are you using?
>>> 
>>> 
>>> On Sep 1, 2014, at 9:00 AM, Matt Thompson <fort...@gmail.com> wrote:
>>> 
>>>> Ralph,
>>>> 
>>>> Okay that seems to have done it here (well, minus the usual 
>>>> shmem_mmap_enable_nfs_warning that our system always generates):
>>>> 
>>>> (1033) $ setenv OMPI_MCA_shmem_mmap_enable_nfs_warning 0
>>>> (1034) $ 
>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-debug-patch/bin/mpirun
>>>>  -np 8 ./helloWorld.182-debug-patch.x
>>>> Process    7 of    8 is on borg01w218
>>>> Process    5 of    8 is on borg01w218
>>>> Process    1 of    8 is on borg01w218
>>>> Process    3 of    8 is on borg01w218
>>>> Process    0 of    8 is on borg01w218
>>>> Process    2 of    8 is on borg01w218
>>>> Process    4 of    8 is on borg01w218
>>>> Process    6 of    8 is on borg01w218
>>>> 
>>>> I'll ask the admin to apply the patch locally...and wait for 1.8.3, I 
>>>> suppose.
>>>> 
>>>> Thanks,
>>>> Matt
>>>> 
>>>> On Sun, Aug 31, 2014 at 10:08 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Hmmm....I may see the problem. Would you be so kind as to apply the 
>>>> attached patch to your 1.8.2 code, rebuild, and try again?
>>>> 
>>>> I much appreciate the help. Everyone's system is slightly different, and I 
>>>> think you've uncovered one of those differences.
>>>> Ralph
>>>> 
>>>> 
>>>> 
>>>> On Aug 31, 2014, at 6:25 AM, Matt Thompson <fort...@gmail.com> wrote:
>>>> 
>>>>> Ralph,
>>>>> 
>>>>> Sorry it took me a bit of time. Here you go:
>>>>> 
>>>>> (1002) $ 
>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-debug/bin/mpirun 
>>>>> --leave-session-attached --debug-daemons --mca oob_base_verbose 10 -mca 
>>>>> plm_base_verbose 5 -np 8 ./helloWorld.182-debug.x
>>>>> [borg01w063:03815] mca:base:select:(  plm) Querying component [isolated]
>>>>> [borg01w063:03815] mca:base:select:(  plm) Query of component [isolated] 
>>>>> set priority to 0
>>>>> [borg01w063:03815] mca:base:select:(  plm) Querying component [rsh]
>>>>> [borg01w063:03815] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh 
>>>>> path NULL
>>>>> [borg01w063:03815] mca:base:select:(  plm) Query of component [rsh] set 
>>>>> priority to 10
>>>>> [borg01w063:03815] mca:base:select:(  plm) Querying component [slurm]
>>>>> [borg01w063:03815] [[INVALID],INVALID] plm:slurm: available for selection
>>>>> [borg01w063:03815] mca:base:select:(  plm) Query of component [slurm] set 
>>>>> priority to 75
>>>>> [borg01w063:03815] mca:base:select:(  plm) Selected component [slurm]
>>>>> [borg01w063:03815] plm:base:set_hnp_name: initial bias 3815 nodename hash 
>>>>> 1757783593
>>>>> [borg01w063:03815] plm:base:set_hnp_name: final jobfam 49163
>>>>> [borg01w063:03815] mca: base: components_register: registering oob 
>>>>> components
>>>>> [borg01w063:03815] mca: base: components_register: found loaded component 
>>>>> tcp
>>>>> [borg01w063:03815] mca: base: components_register: component tcp register 
>>>>> function successful
>>>>> [borg01w063:03815] mca: base: components_open: opening oob components
>>>>> [borg01w063:03815] mca: base: components_open: found loaded component tcp
>>>>> [borg01w063:03815] mca: base: components_open: component tcp open 
>>>>> function successful
>>>>> [borg01w063:03815] mca:oob:select: checking available component tcp
>>>>> [borg01w063:03815] mca:oob:select: Querying component [tcp]
>>>>> [borg01w063:03815] oob:tcp: component_available called
>>>>> [borg01w063:03815] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w063:03815] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w063:03815] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>> [borg01w063:03815] [[49163,0],0] oob:tcp:init adding 10.1.24.63 to our 
>>>>> list of V4 connections
>>>>> [borg01w063:03815] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>> [borg01w063:03815] [[49163,0],0] oob:tcp:init adding 172.31.1.254 to our 
>>>>> list of V4 connections
>>>>> [borg01w063:03815] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>> [borg01w063:03815] [[49163,0],0] oob:tcp:init adding 10.12.24.63 to our 
>>>>> list of V4 connections
>>>>> [borg01w063:03815] [[49163,0],0] TCP STARTUP
>>>>> [borg01w063:03815] [[49163,0],0] attempting to bind to IPv4 port 0
>>>>> [borg01w063:03815] [[49163,0],0] assigned IPv4 port 41373
>>>>> [borg01w063:03815] mca:oob:select: Adding component to end
>>>>> [borg01w063:03815] mca:oob:select: Found 1 active transports
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:receive start comm
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_job
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: LAUNCH DAEMONS CALLED
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm creating map
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm add new daemon 
>>>>> [[49163,0],1]
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm assigning new daemon 
>>>>> [[49163,0],1] to node borg01w064
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm add new daemon 
>>>>> [[49163,0],2]
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm assigning new daemon 
>>>>> [[49163,0],2] to node borg01w065
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm add new daemon 
>>>>> [[49163,0],3]
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm assigning new daemon 
>>>>> [[49163,0],3] to node borg01w069
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm add new daemon 
>>>>> [[49163,0],4]
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm assigning new daemon 
>>>>> [[49163,0],4] to node borg01w070
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm add new daemon 
>>>>> [[49163,0],5]
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:setup_vm assigning new daemon 
>>>>> [[49163,0],5] to node borg01w071
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: launching on nodes 
>>>>> borg01w064,borg01w065,borg01w069,borg01w070,borg01w071
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: Set 
>>>>> prefix:/discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-debug
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: final top-level argv:
>>>>>   srun --ntasks-per-node=1 --kill-on-bad-exit --cpu_bind=none --nodes=5 
>>>>> --nodelist=borg01w064,borg01w065,borg01w069,borg01w070,borg01w071 
>>>>> --ntasks=5 orted -mca orte_debug_daemons 1 -mca 
>>>>> orte_leave_session_attached 1 -mca orte_ess_jobid 3221946368 -mca 
>>>>> orte_ess_vpid 1 -mca orte_ess_num_procs 6 -mca orte_hnp_uri 
>>>>> 3221946368.0;tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373 --mca 
>>>>> oob_base_verbose 10 -mca plm_base_verbose 5
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: reset PATH: 
>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-debug/bin:/usr/local/other/SLES11/gcc/4.9.1/bin:/usr/local/other/SLES11.1/git/1.8.5.2/libexec/git-core:/usr/local/other/SLES11.1/git/1.8.5.2/bin:/usr/local/other/SLES11/svn/1.6.17/bin:/usr/local/other/SLES11/tkcvs/8.2.3/gcc-4.3.2/bin:.:/home/mathomp4/bin:/home/mathomp4/cvstools:/discover/nobackup/projects/gmao/share/dasilva/opengrads/Contents:/usr/local/other/Htop/1.0/bin:/usr/local/other/SLES11/gnuplot/4.6.0/gcc-4.3.2/bin:/usr/local/other/SLES11/xpdf/3.03-gcc-4.3.2/bin:/home/mathomp4/src/pdtoolkit-3.16/x86_64/bin:/discover/nobackup/mathomp4/WavewatchIII-GMAO/bin:/discover/nobackup/mathomp4/WavewatchIII-GMAO/exe:/usr/local/other/pods:/usr/local/other/SLES11.1/R/3.1.0/gcc-4.3.4/lib64/R/bin:.:/home/mathomp4/bin:/home/mathomp4/cvstools:/discover/nobackup/projects/gmao/share/dasilva/opengrads/Contents:/usr/local/other/Htop/1.0/bin:/usr/local/other/SLES11/gnuplot/4.6.0/gcc-4.3.2/bin:/usr/local/other/SLES11/xpdf/3.03-gcc-4.3.2/bin:/home/mathomp4/src/pdtoolkit-3.16/x86_64/bin:/discover/nobackup/mathomp4/WavewatchIII-GMAO/bin:/discover/nobackup/mathomp4/WavewatchIII-GMAO/exe:/usr/local/other/pods:/usr/local/other/SLES11.1/R/3.1.0/gcc-4.3.4/lib64/R/bin:/home/mathomp4/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/kde3/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/usr/slurm/bin
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: reset LD_LIBRARY_PATH: 
>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-debug/lib:/usr/local/other/SLES11/gcc/4.9.1/lib64:/usr/local/other/SLES11.1/git/1.8.5.2/lib:/usr/local/other/SLES11/svn/1.6.17/lib:/usr/local/other/SLES11/tkcvs/8.2.3/gcc-4.3.2/lib
>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>> [borg01w065:15893] mca: base: components_register: registering oob 
>>>>> components
>>>>> [borg01w065:15893] mca: base: components_register: found loaded component 
>>>>> tcp
>>>>> [borg01w065:15893] mca: base: components_register: component tcp register 
>>>>> function successful
>>>>> [borg01w065:15893] mca: base: components_open: opening oob components
>>>>> [borg01w065:15893] mca: base: components_open: found loaded component tcp
>>>>> [borg01w065:15893] mca: base: components_open: component tcp open 
>>>>> function successful
>>>>> [borg01w065:15893] mca:oob:select: checking available component tcp
>>>>> [borg01w065:15893] mca:oob:select: Querying component [tcp]
>>>>> [borg01w065:15893] oob:tcp: component_available called
>>>>> [borg01w065:15893] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w065:15893] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w065:15893] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>> [borg01w065:15893] [[49163,0],2] oob:tcp:init adding 10.1.24.65 to our 
>>>>> list of V4 connections
>>>>> [borg01w065:15893] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>> [borg01w065:15893] [[49163,0],2] oob:tcp:init adding 172.31.1.254 to our 
>>>>> list of V4 connections
>>>>> [borg01w065:15893] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>> [borg01w065:15893] [[49163,0],2] oob:tcp:init adding 10.12.24.65 to our 
>>>>> list of V4 connections
>>>>> [borg01w065:15893] [[49163,0],2] TCP STARTUP
>>>>> [borg01w065:15893] [[49163,0],2] attempting to bind to IPv4 port 0
>>>>> [borg01w065:15893] [[49163,0],2] assigned IPv4 port 43456
>>>>> [borg01w065:15893] mca:oob:select: Adding component to end
>>>>> [borg01w065:15893] mca:oob:select: Found 1 active transports
>>>>> [borg01w070:12645] mca: base: components_register: registering oob 
>>>>> components
>>>>> [borg01w070:12645] mca: base: components_register: found loaded component 
>>>>> tcp
>>>>> [borg01w070:12645] mca: base: components_register: component tcp register 
>>>>> function successful
>>>>> [borg01w070:12645] mca: base: components_open: opening oob components
>>>>> [borg01w070:12645] mca: base: components_open: found loaded component tcp
>>>>> [borg01w070:12645] mca: base: components_open: component tcp open 
>>>>> function successful
>>>>> [borg01w070:12645] mca:oob:select: checking available component tcp
>>>>> [borg01w070:12645] mca:oob:select: Querying component [tcp]
>>>>> [borg01w070:12645] oob:tcp: component_available called
>>>>> [borg01w070:12645] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w070:12645] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w070:12645] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>> [borg01w070:12645] [[49163,0],4] oob:tcp:init adding 10.1.24.70 to our 
>>>>> list of V4 connections
>>>>> [borg01w070:12645] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>> [borg01w070:12645] [[49163,0],4] oob:tcp:init adding 172.31.1.254 to our 
>>>>> list of V4 connections
>>>>> [borg01w070:12645] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>> [borg01w070:12645] [[49163,0],4] oob:tcp:init adding 10.12.24.70 to our 
>>>>> list of V4 connections
>>>>> [borg01w070:12645] [[49163,0],4] TCP STARTUP
>>>>> [borg01w070:12645] [[49163,0],4] attempting to bind to IPv4 port 0
>>>>> [borg01w070:12645] [[49163,0],4] assigned IPv4 port 53062
>>>>> [borg01w070:12645] mca:oob:select: Adding component to end
>>>>> [borg01w070:12645] mca:oob:select: Found 1 active transports
>>>>> [borg01w064:16565] mca: base: components_register: registering oob 
>>>>> components
>>>>> [borg01w064:16565] mca: base: components_register: found loaded component 
>>>>> tcp
>>>>> [borg01w064:16565] mca: base: components_register: component tcp register 
>>>>> function successful
>>>>> [borg01w071:14879] mca: base: components_register: registering oob 
>>>>> components
>>>>> [borg01w071:14879] mca: base: components_register: found loaded component 
>>>>> tcp
>>>>> [borg01w064:16565] mca: base: components_open: opening oob components
>>>>> [borg01w064:16565] mca: base: components_open: found loaded component tcp
>>>>> [borg01w064:16565] mca: base: components_open: component tcp open 
>>>>> function successful
>>>>> [borg01w064:16565] mca:oob:select: checking available component tcp
>>>>> [borg01w064:16565] mca:oob:select: Querying component [tcp]
>>>>> [borg01w064:16565] oob:tcp: component_available called
>>>>> [borg01w064:16565] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w064:16565] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w064:16565] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>> [borg01w064:16565] [[49163,0],1] oob:tcp:init adding 10.1.24.64 to our 
>>>>> list of V4 connections
>>>>> [borg01w064:16565] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>> [borg01w064:16565] [[49163,0],1] oob:tcp:init adding 172.31.1.254 to our 
>>>>> list of V4 connections
>>>>> [borg01w064:16565] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>> [borg01w064:16565] [[49163,0],1] oob:tcp:init adding 10.12.24.64 to our 
>>>>> list of V4 connections
>>>>> [borg01w064:16565] [[49163,0],1] TCP STARTUP
>>>>> [borg01w064:16565] [[49163,0],1] attempting to bind to IPv4 port 0
>>>>> [borg01w064:16565] [[49163,0],1] assigned IPv4 port 43828
>>>>> [borg01w064:16565] mca:oob:select: Adding component to end
>>>>> [borg01w069:30276] mca: base: components_register: registering oob 
>>>>> components
>>>>> [borg01w069:30276] mca: base: components_register: found loaded component 
>>>>> tcp
>>>>> [borg01w071:14879] mca: base: components_register: component tcp register 
>>>>> function successful
>>>>> [borg01w069:30276] mca: base: components_register: component tcp register 
>>>>> function successful
>>>>> [borg01w071:14879] mca: base: components_open: opening oob components
>>>>> [borg01w071:14879] mca: base: components_open: found loaded component tcp
>>>>> [borg01w071:14879] mca: base: components_open: component tcp open 
>>>>> function successful
>>>>> [borg01w071:14879] mca:oob:select: checking available component tcp
>>>>> [borg01w071:14879] mca:oob:select: Querying component [tcp]
>>>>> [borg01w071:14879] oob:tcp: component_available called
>>>>> [borg01w069:30276] mca: base: components_open: opening oob components
>>>>> [borg01w069:30276] mca: base: components_open: found loaded component tcp
>>>>> [borg01w069:30276] mca: base: components_open: component tcp open 
>>>>> function successful
>>>>> [borg01w071:14879] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w071:14879] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w071:14879] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>> [borg01w071:14879] [[49163,0],5] oob:tcp:init adding 10.1.24.71 to our 
>>>>> list of V4 connections
>>>>> [borg01w071:14879] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>> [borg01w069:30276] mca:oob:select: checking available component tcp
>>>>> [borg01w069:30276] mca:oob:select: Querying component [tcp]
>>>>> [borg01w069:30276] oob:tcp: component_available called
>>>>> [borg01w071:14879] [[49163,0],5] oob:tcp:init adding 172.31.1.254 to our 
>>>>> list of V4 connections
>>>>> [borg01w071:14879] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>> [borg01w071:14879] [[49163,0],5] oob:tcp:init adding 10.12.24.71 to our 
>>>>> list of V4 connections
>>>>> [borg01w071:14879] [[49163,0],5] TCP STARTUP
>>>>> [borg01w069:30276] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w069:30276] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>> [borg01w069:30276] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>> [borg01w069:30276] [[49163,0],3] oob:tcp:init adding 10.1.24.69 to our 
>>>>> list of V4 connections
>>>>> [borg01w069:30276] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>> [borg01w069:30276] [[49163,0],3] oob:tcp:init adding 172.31.1.254 to our 
>>>>> list of V4 connections
>>>>> [borg01w069:30276] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>> [borg01w069:30276] [[49163,0],3] oob:tcp:init adding 10.12.24.69 to our 
>>>>> list of V4 connections
>>>>> [borg01w069:30276] [[49163,0],3] TCP STARTUP
>>>>> [borg01w071:14879] [[49163,0],5] attempting to bind to IPv4 port 0
>>>>> [borg01w069:30276] [[49163,0],3] attempting to bind to IPv4 port 0
>>>>> [borg01w069:30276] [[49163,0],3] assigned IPv4 port 39299
>>>>> [borg01w064:16565] mca:oob:select: Found 1 active transports
>>>>> [borg01w069:30276] mca:oob:select: Adding component to end
>>>>> [borg01w069:30276] mca:oob:select: Found 1 active transports
>>>>> [borg01w071:14879] [[49163,0],5] assigned IPv4 port 56113
>>>>> [borg01w071:14879] mca:oob:select: Adding component to end
>>>>> [borg01w071:14879] mca:oob:select: Found 1 active transports
>>>>> srun.slurm: error: borg01w064: task 0: Exited with exit code 213
>>>>> srun.slurm: Terminating job step 2347743.3
>>>>> srun.slurm: Job step aborted: Waiting up to 2 seconds for job step to 
>>>>> finish.
>>>>> [borg01w070:12645] [[49163,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/rml_base_contact.c at line 161
>>>>> [borg01w070:12645] [[49163,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> routed_binomial.c at line 498
>>>>> [borg01w070:12645] [[49163,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/ess_base_std_orted.c at line 539
>>>>> [borg01w065:15893] [[49163,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/rml_base_contact.c at line 161
>>>>> [borg01w065:15893] [[49163,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> routed_binomial.c at line 498
>>>>> [borg01w065:15893] [[49163,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/ess_base_std_orted.c at line 539
>>>>> slurmd[borg01w065]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> slurmd[borg01w070]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> [borg01w064:16565] [[49163,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/rml_base_contact.c at line 161
>>>>> [borg01w064:16565] [[49163,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> routed_binomial.c at line 498
>>>>> [borg01w064:16565] [[49163,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/ess_base_std_orted.c at line 539
>>>>> [borg01w069:30276] [[49163,0],3] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/rml_base_contact.c at line 161
>>>>> [borg01w069:30276] [[49163,0],3] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> routed_binomial.c at line 498
>>>>> [borg01w069:30276] [[49163,0],3] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/ess_base_std_orted.c at line 539
>>>>> slurmd[borg01w069]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> [borg01w071:14879] [[49163,0],5] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/rml_base_contact.c at line 161
>>>>> [borg01w071:14879] [[49163,0],5] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> routed_binomial.c at line 498
>>>>> [borg01w071:14879] [[49163,0],5] ORTE_ERROR_LOG: Bad parameter in file 
>>>>> base/ess_base_std_orted.c at line 539
>>>>> slurmd[borg01w071]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> slurmd[borg01w065]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> slurmd[borg01w069]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> slurmd[borg01w070]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> slurmd[borg01w071]: *** STEP 2347743.3 KILLED AT 2014-08-31T09:24:17 WITH 
>>>>> SIGNAL 9 ***
>>>>> srun.slurm: error: borg01w069: task 2: Exited with exit code 213
>>>>> srun.slurm: error: borg01w065: task 1: Exited with exit code 213
>>>>> srun.slurm: error: borg01w071: task 4: Exited with exit code 213
>>>>> srun.slurm: error: borg01w070: task 3: Exited with exit code 213
>>>>> sh: tcp://10.1.24.63,172.31.1.254,10.12.24.63:41373: No such file or 
>>>>> directory
>>>>> [borg01w063:03815] [[49163,0],0] plm:slurm: primary daemons complete!
>>>>> [borg01w063:03815] [[49163,0],0] plm:base:receive stop comm
>>>>> [borg01w063:03815] [[49163,0],0] TCP SHUTDOWN
>>>>> [borg01w063:03815] mca: base: close: component tcp closed
>>>>> [borg01w063:03815] mca: base: close: unloading component tcp
>>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Aug 29, 2014 at 3:18 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> Rats - I also need "-mca plm_base_verbose 5" on there so I can see the 
>>>>> cmd line being executed. Can you add it?
>>>>> 
>>>>> 
>>>>> On Aug 29, 2014, at 11:16 AM, Matt Thompson <fort...@gmail.com> wrote:
>>>>> 
>>>>>> Ralph,
>>>>>> 
>>>>>> Here you go:
>>>>>> 
>>>>>> (1080) $ 
>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2-debug/bin/mpirun 
>>>>>> --leave-session-attached --debug-daemons --mca oob_base_verbose 10 -np 8 
>>>>>> ./helloWorld.182-debug.x
>>>>>> [borg01x142:29232] mca: base: components_register: registering oob 
>>>>>> components
>>>>>> [borg01x142:29232] mca: base: components_register: found loaded 
>>>>>> component tcp
>>>>>> [borg01x142:29232] mca: base: components_register: component tcp 
>>>>>> register function successful
>>>>>> [borg01x142:29232] mca: base: components_open: opening oob components
>>>>>> [borg01x142:29232] mca: base: components_open: found loaded component tcp
>>>>>> [borg01x142:29232] mca: base: components_open: component tcp open 
>>>>>> function successful
>>>>>> [borg01x142:29232] mca:oob:select: checking available component tcp
>>>>>> [borg01x142:29232] mca:oob:select: Querying component [tcp]
>>>>>> [borg01x142:29232] oob:tcp: component_available called
>>>>>> [borg01x142:29232] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x142:29232] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x142:29232] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>>> [borg01x142:29232] [[52298,0],0] oob:tcp:init adding 10.1.25.142 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x142:29232] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>>> [borg01x142:29232] [[52298,0],0] oob:tcp:init adding 172.31.1.254 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x142:29232] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>>> [borg01x142:29232] [[52298,0],0] oob:tcp:init adding 10.12.25.142 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x142:29232] [[52298,0],0] TCP STARTUP
>>>>>> [borg01x142:29232] [[52298,0],0] attempting to bind to IPv4 port 0
>>>>>> [borg01x142:29232] [[52298,0],0] assigned IPv4 port 41686
>>>>>> [borg01x142:29232] mca:oob:select: Adding component to end
>>>>>> [borg01x142:29232] mca:oob:select: Found 1 active transports
>>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>>> [borg01x153:01290] mca: base: components_register: registering oob 
>>>>>> components
>>>>>> [borg01x153:01290] mca: base: components_register: found loaded 
>>>>>> component tcp
>>>>>> [borg01x143:13793] mca: base: components_register: registering oob 
>>>>>> components
>>>>>> [borg01x143:13793] mca: base: components_register: found loaded 
>>>>>> component tcp
>>>>>> [borg01x153:01290] mca: base: components_register: component tcp 
>>>>>> register function successful
>>>>>> [borg01x153:01290] mca: base: components_open: opening oob components
>>>>>> [borg01x153:01290] mca: base: components_open: found loaded component tcp
>>>>>> [borg01x153:01290] mca: base: components_open: component tcp open 
>>>>>> function successful
>>>>>> [borg01x153:01290] mca:oob:select: checking available component tcp
>>>>>> [borg01x153:01290] mca:oob:select: Querying component [tcp]
>>>>>> [borg01x153:01290] oob:tcp: component_available called
>>>>>> [borg01x153:01290] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x153:01290] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x153:01290] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>>> [borg01x153:01290] [[52298,0],4] oob:tcp:init adding 10.1.25.153 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x153:01290] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>>> [borg01x153:01290] [[52298,0],4] oob:tcp:init adding 172.31.1.254 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x153:01290] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>>> [borg01x153:01290] [[52298,0],4] oob:tcp:init adding 10.12.25.153 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x153:01290] [[52298,0],4] TCP STARTUP
>>>>>> [borg01x153:01290] [[52298,0],4] attempting to bind to IPv4 port 0
>>>>>> [borg01x143:13793] mca: base: components_register: component tcp 
>>>>>> register function successful
>>>>>> [borg01x153:01290] [[52298,0],4] assigned IPv4 port 38028
>>>>>> [borg01x143:13793] mca: base: components_open: opening oob components
>>>>>> [borg01x143:13793] mca: base: components_open: found loaded component tcp
>>>>>> [borg01x143:13793] mca: base: components_open: component tcp open 
>>>>>> function successful
>>>>>> [borg01x143:13793] mca:oob:select: checking available component tcp
>>>>>> [borg01x143:13793] mca:oob:select: Querying component [tcp]
>>>>>> [borg01x143:13793] oob:tcp: component_available called
>>>>>> [borg01x143:13793] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x143:13793] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x143:13793] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>>> [borg01x143:13793] [[52298,0],1] oob:tcp:init adding 10.1.25.143 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x143:13793] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>>> [borg01x143:13793] [[52298,0],1] oob:tcp:init adding 172.31.1.254 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x143:13793] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>>> [borg01x143:13793] [[52298,0],1] oob:tcp:init adding 10.12.25.143 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x143:13793] [[52298,0],1] TCP STARTUP
>>>>>> [borg01x143:13793] [[52298,0],1] attempting to bind to IPv4 port 0
>>>>>> [borg01x153:01290] mca:oob:select: Adding component to end
>>>>>> [borg01x153:01290] mca:oob:select: Found 1 active transports
>>>>>> [borg01x143:13793] [[52298,0],1] assigned IPv4 port 44719
>>>>>> [borg01x143:13793] mca:oob:select: Adding component to end
>>>>>> [borg01x143:13793] mca:oob:select: Found 1 active transports
>>>>>> [borg01x144:30878] mca: base: components_register: registering oob 
>>>>>> components
>>>>>> [borg01x144:30878] mca: base: components_register: found loaded 
>>>>>> component tcp
>>>>>> [borg01x144:30878] mca: base: components_register: component tcp 
>>>>>> register function successful
>>>>>> [borg01x144:30878] mca: base: components_open: opening oob components
>>>>>> [borg01x144:30878] mca: base: components_open: found loaded component tcp
>>>>>> [borg01x144:30878] mca: base: components_open: component tcp open 
>>>>>> function successful
>>>>>> [borg01x144:30878] mca:oob:select: checking available component tcp
>>>>>> [borg01x144:30878] mca:oob:select: Querying component [tcp]
>>>>>> [borg01x144:30878] oob:tcp: component_available called
>>>>>> [borg01x144:30878] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x144:30878] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x144:30878] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>>> [borg01x144:30878] [[52298,0],2] oob:tcp:init adding 10.1.25.144 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x144:30878] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>>> [borg01x144:30878] [[52298,0],2] oob:tcp:init adding 172.31.1.254 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x144:30878] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>>> [borg01x144:30878] [[52298,0],2] oob:tcp:init adding 10.12.25.144 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x144:30878] [[52298,0],2] TCP STARTUP
>>>>>> [borg01x144:30878] [[52298,0],2] attempting to bind to IPv4 port 0
>>>>>> [borg01x144:30878] [[52298,0],2] assigned IPv4 port 40700
>>>>>> [borg01x144:30878] mca:oob:select: Adding component to end
>>>>>> [borg01x144:30878] mca:oob:select: Found 1 active transports
>>>>>> [borg01x154:01154] mca: base: components_register: registering oob 
>>>>>> components
>>>>>> [borg01x154:01154] mca: base: components_register: found loaded 
>>>>>> component tcp
>>>>>> [borg01x154:01154] mca: base: components_register: component tcp 
>>>>>> register function successful
>>>>>> [borg01x154:01154] mca: base: components_open: opening oob components
>>>>>> [borg01x154:01154] mca: base: components_open: found loaded component tcp
>>>>>> [borg01x154:01154] mca: base: components_open: component tcp open 
>>>>>> function successful
>>>>>> [borg01x154:01154] mca:oob:select: checking available component tcp
>>>>>> [borg01x154:01154] mca:oob:select: Querying component [tcp]
>>>>>> [borg01x154:01154] oob:tcp: component_available called
>>>>>> [borg01x154:01154] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x154:01154] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x154:01154] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>>> [borg01x154:01154] [[52298,0],5] oob:tcp:init adding 10.1.25.154 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x154:01154] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>>> [borg01x154:01154] [[52298,0],5] oob:tcp:init adding 172.31.1.254 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x154:01154] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>>> [borg01x154:01154] [[52298,0],5] oob:tcp:init adding 10.12.25.154 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x154:01154] [[52298,0],5] TCP STARTUP
>>>>>> [borg01x154:01154] [[52298,0],5] attempting to bind to IPv4 port 0
>>>>>> [borg01x154:01154] [[52298,0],5] assigned IPv4 port 41191
>>>>>> [borg01x154:01154] mca:oob:select: Adding component to end
>>>>>> [borg01x154:01154] mca:oob:select: Found 1 active transports
>>>>>> [borg01x145:02419] mca: base: components_register: registering oob 
>>>>>> components
>>>>>> [borg01x145:02419] mca: base: components_register: found loaded 
>>>>>> component tcp
>>>>>> [borg01x145:02419] mca: base: components_register: component tcp 
>>>>>> register function successful
>>>>>> [borg01x145:02419] mca: base: components_open: opening oob components
>>>>>> [borg01x145:02419] mca: base: components_open: found loaded component tcp
>>>>>> [borg01x145:02419] mca: base: components_open: component tcp open 
>>>>>> function successful
>>>>>> [borg01x145:02419] mca:oob:select: checking available component tcp
>>>>>> [borg01x145:02419] mca:oob:select: Querying component [tcp]
>>>>>> [borg01x145:02419] oob:tcp: component_available called
>>>>>> [borg01x145:02419] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x145:02419] WORKING INTERFACE 2 KERNEL INDEX 1 FAMILY: V4
>>>>>> [borg01x145:02419] WORKING INTERFACE 3 KERNEL INDEX 2 FAMILY: V4
>>>>>> [borg01x145:02419] [[52298,0],3] oob:tcp:init adding 10.1.25.145 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x145:02419] WORKING INTERFACE 4 KERNEL INDEX 4 FAMILY: V4
>>>>>> [borg01x145:02419] [[52298,0],3] oob:tcp:init adding 172.31.1.254 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x145:02419] WORKING INTERFACE 5 KERNEL INDEX 5 FAMILY: V4
>>>>>> [borg01x145:02419] [[52298,0],3] oob:tcp:init adding 10.12.25.145 to our 
>>>>>> list of V4 connections
>>>>>> [borg01x145:02419] [[52298,0],3] TCP STARTUP
>>>>>> [borg01x145:02419] [[52298,0],3] attempting to bind to IPv4 port 0
>>>>>> [borg01x145:02419] [[52298,0],3] assigned IPv4 port 51079
>>>>>> [borg01x145:02419] mca:oob:select: Adding component to end
>>>>>> [borg01x145:02419] mca:oob:select: Found 1 active transports
>>>>>> [borg01x144:30878] [[52298,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/rml_base_contact.c at line 161
>>>>>> [borg01x144:30878] [[52298,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> routed_binomial.c at line 498
>>>>>> [borg01x144:30878] [[52298,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/ess_base_std_orted.c at line 539
>>>>>> srun.slurm: error: borg01x143: task 0: Exited with exit code 213
>>>>>> srun.slurm: Terminating job step 2332583.24
>>>>>> slurmd[borg01x144]: *** STEP 2332583.24 KILLED AT 2014-08-29T13:59:30 
>>>>>> WITH SIGNAL 9 ***
>>>>>> srun.slurm: Job step aborted: Waiting up to 2 seconds for job step to 
>>>>>> finish.
>>>>>> srun.slurm: error: borg01x153: task 3: Exited with exit code 213
>>>>>> [borg01x153:01290] [[52298,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/rml_base_contact.c at line 161
>>>>>> [borg01x153:01290] [[52298,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> routed_binomial.c at line 498
>>>>>> [borg01x153:01290] [[52298,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/ess_base_std_orted.c at line 539
>>>>>> [borg01x143:13793] [[52298,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/rml_base_contact.c at line 161
>>>>>> [borg01x143:13793] [[52298,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> routed_binomial.c at line 498
>>>>>> [borg01x143:13793] [[52298,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/ess_base_std_orted.c at line 539
>>>>>> slurmd[borg01x144]: *** STEP 2332583.24 KILLED AT 2014-08-29T13:59:30 
>>>>>> WITH SIGNAL 9 ***
>>>>>> srun.slurm: error: borg01x144: task 1: Exited with exit code 213
>>>>>> [borg01x154:01154] [[52298,0],5] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/rml_base_contact.c at line 161
>>>>>> [borg01x154:01154] [[52298,0],5] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> routed_binomial.c at line 498
>>>>>> [borg01x154:01154] [[52298,0],5] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/ess_base_std_orted.c at line 539
>>>>>> slurmd[borg01x154]: *** STEP 2332583.24 KILLED AT 2014-08-29T13:59:30 
>>>>>> WITH SIGNAL 9 ***
>>>>>> slurmd[borg01x154]: *** STEP 2332583.24 KILLED AT 2014-08-29T13:59:30 
>>>>>> WITH SIGNAL 9 ***
>>>>>> srun.slurm: error: borg01x154: task 4: Exited with exit code 213
>>>>>> srun.slurm: error: borg01x145: task 2: Exited with exit code 213
>>>>>> [borg01x145:02419] [[52298,0],3] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/rml_base_contact.c at line 161
>>>>>> [borg01x145:02419] [[52298,0],3] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> routed_binomial.c at line 498
>>>>>> [borg01x145:02419] [[52298,0],3] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>> base/ess_base_std_orted.c at line 539
>>>>>> slurmd[borg01x145]: *** STEP 2332583.24 KILLED AT 2014-08-29T13:59:30 
>>>>>> WITH SIGNAL 9 ***
>>>>>> slurmd[borg01x145]: *** STEP 2332583.24 KILLED AT 2014-08-29T13:59:30 
>>>>>> WITH SIGNAL 9 ***
>>>>>> sh: tcp://10.1.25.142,172.31.1.254,10.12.25.142:41686: No such file or 
>>>>>> directory
>>>>>> [borg01x142:29232] [[52298,0],0] TCP SHUTDOWN
>>>>>> [borg01x142:29232] mca: base: close: component tcp closed
>>>>>> [borg01x142:29232] mca: base: close: unloading component tcp
>>>>>> 
>>>>>> Note, if I can get the allocation today, I want to try doing all this on 
>>>>>> a single SandyBridge node, rather than on 6. It might make comparing 
>>>>>> various runs a bit easier!
>>>>>> 
>>>>>> Matt
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Aug 29, 2014 at 12:42 PM, Ralph Castain <r...@open-mpi.org> 
>>>>>> wrote:
>>>>>> Okay, something quite weird is happening here. I can't replicate it using 
>>>>>> the 1.8.2 release tarball on a slurm machine, so my guess is that 
>>>>>> something else is going on.
>>>>>> 
>>>>>> Could you please rebuild the 1.8.2 code with --enable-debug on the 
>>>>>> configure line (assuming you haven't already done so), and then rerun 
>>>>>> that version as before but adding "--mca oob_base_verbose 10" to the cmd 
>>>>>> line?
>>>>>> 
>>>>>> 
>>>>>> On Aug 29, 2014, at 4:22 AM, Matt Thompson <fort...@gmail.com> wrote:
>>>>>> 
>>>>>>> Ralph,
>>>>>>> 
>>>>>>> For 1.8.2rc4 I get:
>>>>>>> 
>>>>>>> (1003) $ 
>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2rc4/bin/mpirun 
>>>>>>> --leave-session-attached --debug-daemons -np 8 ./helloWorld.182.x
>>>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>>>> Daemon [[47143,0],5] checking in as pid 10990 on host borg01x154
>>>>>>> [borg01x154:10990] [[47143,0],5] orted: up and running - waiting for 
>>>>>>> commands!
>>>>>>> Daemon [[47143,0],1] checking in as pid 23473 on host borg01x143
>>>>>>> Daemon [[47143,0],2] checking in as pid 8250 on host borg01x144
>>>>>>> [borg01x144:08250] [[47143,0],2] orted: up and running - waiting for 
>>>>>>> commands!
>>>>>>> [borg01x143:23473] [[47143,0],1] orted: up and running - waiting for 
>>>>>>> commands!
>>>>>>> Daemon [[47143,0],3] checking in as pid 12320 on host borg01x145
>>>>>>> Daemon [[47143,0],4] checking in as pid 10902 on host borg01x153
>>>>>>> [borg01x153:10902] [[47143,0],4] orted: up and running - waiting for 
>>>>>>> commands!
>>>>>>> [borg01x145:12320] [[47143,0],3] orted: up and running - waiting for 
>>>>>>> commands!
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_cmd: received add_local_procs
>>>>>>> [borg01x144:08250] [[47143,0],2] orted_cmd: received add_local_procs
>>>>>>> [borg01x153:10902] [[47143,0],4] orted_cmd: received add_local_procs
>>>>>>> [borg01x143:23473] [[47143,0],1] orted_cmd: received add_local_procs
>>>>>>> [borg01x145:12320] [[47143,0],3] orted_cmd: received add_local_procs
>>>>>>> [borg01x154:10990] [[47143,0],5] orted_cmd: received add_local_procs
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],0]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],2]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],3]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],1]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],5]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],4]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],6]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync+nidmap from 
>>>>>>> local proc [[47143,1],7]
>>>>>>> MPIR_being_debugged = 0
>>>>>>> MPIR_debug_state = 1
>>>>>>> MPIR_partial_attach_ok = 1
>>>>>>> MPIR_i_am_starter = 0
>>>>>>> MPIR_forward_output = 0
>>>>>>> MPIR_proctable_size = 8
>>>>>>> MPIR_proctable:
>>>>>>>   (i, host, exe, pid) = (0, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1647)
>>>>>>>   (i, host, exe, pid) = (1, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1648)
>>>>>>>   (i, host, exe, pid) = (2, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1650)
>>>>>>>   (i, host, exe, pid) = (3, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1652)
>>>>>>>   (i, host, exe, pid) = (4, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1654)
>>>>>>>   (i, host, exe, pid) = (5, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1656)
>>>>>>>   (i, host, exe, pid) = (6, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1658)
>>>>>>>   (i, host, exe, pid) = (7, borg01x142, 
>>>>>>> /home/mathomp4/HelloWorldTest/./helloWorld.182.x, 1660)
>>>>>>> MPIR_executable_path: NULL
>>>>>>> MPIR_server_arguments: NULL
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_cmd: received message_local_procs
>>>>>>> [borg01x144:08250] [[47143,0],2] orted_cmd: received message_local_procs
>>>>>>> [borg01x143:23473] [[47143,0],1] orted_cmd: received message_local_procs
>>>>>>> [borg01x153:10902] [[47143,0],4] orted_cmd: received message_local_procs
>>>>>>> [borg01x154:10990] [[47143,0],5] orted_cmd: received message_local_procs
>>>>>>> [borg01x145:12320] [[47143,0],3] orted_cmd: received message_local_procs
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_cmd: received message_local_procs
>>>>>>> [borg01x143:23473] [[47143,0],1] orted_cmd: received message_local_procs
>>>>>>> [borg01x144:08250] [[47143,0],2] orted_cmd: received message_local_procs
>>>>>>> [borg01x153:10902] [[47143,0],4] orted_cmd: received message_local_procs
>>>>>>> [borg01x145:12320] [[47143,0],3] orted_cmd: received message_local_procs
>>>>>>> Process    2 of    8 is on borg01x142
>>>>>>> Process    5 of    8 is on borg01x142
>>>>>>> Process    4 of    8 is on borg01x142
>>>>>>> Process    1 of    8 is on borg01x142
>>>>>>> Process    0 of    8 is on borg01x142
>>>>>>> Process    3 of    8 is on borg01x142
>>>>>>> Process    6 of    8 is on borg01x142
>>>>>>> Process    7 of    8 is on borg01x142
>>>>>>> [borg01x154:10990] [[47143,0],5] orted_cmd: received message_local_procs
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_cmd: received message_local_procs
>>>>>>> [borg01x144:08250] [[47143,0],2] orted_cmd: received message_local_procs
>>>>>>> [borg01x143:23473] [[47143,0],1] orted_cmd: received message_local_procs
>>>>>>> [borg01x153:10902] [[47143,0],4] orted_cmd: received message_local_procs
>>>>>>> [borg01x154:10990] [[47143,0],5] orted_cmd: received message_local_procs
>>>>>>> [borg01x145:12320] [[47143,0],3] orted_cmd: received message_local_procs
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],2]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],1]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],3]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],0]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],4]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],6]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],5]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_recv: received sync from local 
>>>>>>> proc [[47143,1],7]
>>>>>>> [borg01x142:01629] [[47143,0],0] orted_cmd: received exit cmd
>>>>>>> [borg01x144:08250] [[47143,0],2] orted_cmd: received exit cmd
>>>>>>> [borg01x144:08250] [[47143,0],2] orted_cmd: all routes and children 
>>>>>>> gone - exiting
>>>>>>> [borg01x153:10902] [[47143,0],4] orted_cmd: received exit cmd
>>>>>>> [borg01x153:10902] [[47143,0],4] orted_cmd: all routes and children 
>>>>>>> gone - exiting
>>>>>>> [borg01x143:23473] [[47143,0],1] orted_cmd: received exit cmd
>>>>>>> [borg01x154:10990] [[47143,0],5] orted_cmd: received exit cmd
>>>>>>> [borg01x154:10990] [[47143,0],5] orted_cmd: all routes and children 
>>>>>>> gone - exiting
>>>>>>> [borg01x145:12320] [[47143,0],3] orted_cmd: received exit cmd
>>>>>>> [borg01x145:12320] [[47143,0],3] orted_cmd: all routes and children 
>>>>>>> gone - exiting
>>>>>>> 
>>>>>>> Using the 1.8.2 mpirun:
>>>>>>> 
>>>>>>> (1004) $ 
>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2/bin/mpirun 
>>>>>>> --leave-session-attached --debug-daemons -np 8 ./helloWorld.182.x
>>>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>>>> srun.slurm: cluster configuration lacks support for cpu binding
>>>>>>> [borg01x143:23494] [[47330,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> base/rml_base_contact.c at line 161
>>>>>>> [borg01x143:23494] [[47330,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> routed_binomial.c at line 498
>>>>>>> [borg01x143:23494] [[47330,0],1] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> base/ess_base_std_orted.c at line 539
>>>>>>> srun.slurm: error: borg01x143: task 0: Exited with exit code 213
>>>>>>> srun.slurm: Terminating job step 2332583.4
>>>>>>> [borg01x153:10915] [[47330,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> base/rml_base_contact.c at line 161
>>>>>>> [borg01x153:10915] [[47330,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> routed_binomial.c at line 498
>>>>>>> [borg01x153:10915] [[47330,0],4] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> base/ess_base_std_orted.c at line 539
>>>>>>> [borg01x144:08263] [[47330,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> base/rml_base_contact.c at line 161
>>>>>>> [borg01x144:08263] [[47330,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> routed_binomial.c at line 498
>>>>>>> [borg01x144:08263] [[47330,0],2] ORTE_ERROR_LOG: Bad parameter in file 
>>>>>>> base/ess_base_std_orted.c at line 539
>>>>>>> srun.slurm: Job step aborted: Waiting up to 2 seconds for job step to 
>>>>>>> finish.
>>>>>>> slurmd[borg01x145]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> slurmd[borg01x154]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> slurmd[borg01x153]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> slurmd[borg01x153]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> srun.slurm: error: borg01x144: task 1: Exited with exit code 213
>>>>>>> slurmd[borg01x144]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> slurmd[borg01x144]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> srun.slurm: error: borg01x153: task 3: Exited with exit code 213
>>>>>>> slurmd[borg01x154]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> slurmd[borg01x145]: *** STEP 2332583.4 KILLED AT 2014-08-29T07:16:20 
>>>>>>> WITH SIGNAL 9 ***
>>>>>>> srun.slurm: error: borg01x154: task 4: Killed
>>>>>>> srun.slurm: error: borg01x145: task 2: Killed
>>>>>>> sh: tcp://10.1.25.142,172.31.1.254,10.12.25.142:34169: No such file or 
>>>>>>> directory
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Thu, Aug 28, 2014 at 7:17 PM, Ralph Castain <r...@open-mpi.org> 
>>>>>>> wrote:
>>>>>>> I'm unaware of any changes to the Slurm integration between rc4 and the 
>>>>>>> final release. It sounds like something else might be going on - try 
>>>>>>> adding "--leave-session-attached --debug-daemons" to your 1.8.2 command 
>>>>>>> line and let's see if any errors get reported.
>>>>>>> 
>>>>>>> 
>>>>>>> On Aug 28, 2014, at 12:20 PM, Matt Thompson <fort...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Open MPI List,
>>>>>>>> 
>>>>>>>> I recently encountered an odd bug with Open MPI 1.8.1 and GCC 4.9.1 on 
>>>>>>>> our cluster (reported on this list), and decided to try it with 1.8.2. 
>>>>>>>> However, we seem to be having an issue with Open MPI 1.8.2 and SLURM. 
>>>>>>>> Even weirder, Open MPI 1.8.2rc4 doesn't show the bug. And the bug is: 
>>>>>>>> I get no stdout with Open MPI 1.8.2. That is, HelloWorld doesn't work.
>>>>>>>> 
>>>>>>>> To wit, our sysadmin has two tarballs:
>>>>>>>> 
>>>>>>>> (1441) $ sha1sum openmpi-1.8.2rc4.tar.bz2
>>>>>>>> 7e7496913c949451f546f22a1a159df25f8bb683  openmpi-1.8.2rc4.tar.bz2
>>>>>>>> (1442) $ sha1sum openmpi-1.8.2.tar.gz
>>>>>>>> cf2b1e45575896f63367406c6c50574699d8b2e1  openmpi-1.8.2.tar.gz
>>>>>>>> 
>>>>>>>> I then build each with a script, following the method our sysadmin 
>>>>>>>> usually uses:
>>>>>>>> 
>>>>>>>> #!/bin/sh 
>>>>>>>> set -x
>>>>>>>> export PREFIX=/discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2
>>>>>>>> export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/nlocal/slurm/2.6.3/lib64
>>>>>>>> build() {
>>>>>>>> echo `pwd`
>>>>>>>> ./configure --with-slurm --disable-wrapper-rpath --enable-shared 
>>>>>>>> --enable-mca-no-build=btl-usnic \
>>>>>>>>     CC=gcc CXX=g++ F77=gfortran FC=gfortran \
>>>>>>>>     CFLAGS="-mtune=generic -fPIC -m64" CXXFLAGS="-mtune=generic -fPIC 
>>>>>>>> -m64" FFLAGS="-mtune=generic -fPIC -m64" \
>>>>>>>>     F77FLAGS="-mtune=generic -fPIC -m64" FCFLAGS="-mtune=generic -fPIC 
>>>>>>>> -m64" F90FLAGS="-mtune=generic -fPIC -m64" \
>>>>>>>>     LDFLAGS="-L/usr/nlocal/slurm/2.6.3/lib64" 
>>>>>>>> CPPFLAGS="-I/usr/nlocal/slurm/2.6.3/include" LIBS="-lpciaccess" \
>>>>>>>>    --prefix=${PREFIX} 2>&1 | tee configure.1.8.2.log
>>>>>>>> make 2>&1 | tee make.1.8.2.log
>>>>>>>> make check 2>&1 | tee makecheck.1.8.2.log
>>>>>>>> make install 2>&1 | tee makeinstall.1.8.2.log
>>>>>>>> }
>>>>>>>> echo "calling build"
>>>>>>>> build
>>>>>>>> echo "exiting"
>>>>>>>> 
>>>>>>>> The only difference between the two is '1.8.2' or '1.8.2rc4' in the 
>>>>>>>> PREFIX and log file tees.  Now, let us test. First, I grab some nodes 
>>>>>>>> with slurm:
>>>>>>>> 
>>>>>>>> $ salloc --nodes=6 --ntasks-per-node=16 --constraint=sand 
>>>>>>>> --time=09:00:00 --account=g0620 --mail-type=BEGIN
>>>>>>>> 
>>>>>>>> Once I get my nodes, I run with 1.8.2rc4:
>>>>>>>> 
>>>>>>>> (1142) $ 
>>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2rc4/bin/mpifort 
>>>>>>>> -o helloWorld.182rc4.x helloWorld.F90
>>>>>>>> (1143) $ 
>>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2rc4/bin/mpirun 
>>>>>>>> -np 8 ./helloWorld.182rc4.x
>>>>>>>> Process    0 of    8 is on borg01w044
>>>>>>>> Process    5 of    8 is on borg01w044
>>>>>>>> Process    3 of    8 is on borg01w044
>>>>>>>> Process    7 of    8 is on borg01w044
>>>>>>>> Process    1 of    8 is on borg01w044
>>>>>>>> Process    2 of    8 is on borg01w044
>>>>>>>> Process    4 of    8 is on borg01w044
>>>>>>>> Process    6 of    8 is on borg01w044
>>>>>>>> 
>>>>>>>> Now 1.8.2:
>>>>>>>> 
>>>>>>>> (1144) $ 
>>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2/bin/mpifort -o 
>>>>>>>> helloWorld.182.x helloWorld.F90
>>>>>>>> (1145) $ 
>>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2/bin/mpirun -np 
>>>>>>>> 8 ./helloWorld.182.x
>>>>>>>> (1146) $
>>>>>>>> 
>>>>>>>> No output at all. But, if I take the helloWorld.x from 1.8.2 and run 
>>>>>>>> it with 1.8.2rc4's mpirun:
>>>>>>>> 
>>>>>>>> (1146) $ 
>>>>>>>> /discover/nobackup/mathomp4/MPI/gcc_4.9.1-openmpi_1.8.2rc4/bin/mpirun 
>>>>>>>> -np 8 ./helloWorld.182.x
>>>>>>>> Process    5 of    8 is on borg01w044
>>>>>>>> Process    7 of    8 is on borg01w044
>>>>>>>> Process    2 of    8 is on borg01w044
>>>>>>>> Process    4 of    8 is on borg01w044
>>>>>>>> Process    1 of    8 is on borg01w044
>>>>>>>> Process    3 of    8 is on borg01w044
>>>>>>>> Process    6 of    8 is on borg01w044
>>>>>>>> Process    0 of    8 is on borg01w044
>>>>>>>> 
>>>>>>>> So... any idea what is happening here? There did seem to be a few 
>>>>>>>> SLURM-related changes between the two tarballs involving /dev/null, 
>>>>>>>> but deciphering them is a bit above me.
>>>>>>>> 
>>>>>>>> You can find the ompi_info, build, make, config, etc. logs at these 
>>>>>>>> links (they are ~300 kB, which is over the mailing list limit according 
>>>>>>>> to the Open MPI web page):
>>>>>>>> 
>>>>>>>> https://dl.dropboxusercontent.com/u/61696/OMPI-1.8.2rc4-Output.tar.bz2
>>>>>>>> https://dl.dropboxusercontent.com/u/61696/OMPI-1.8.2-Output.tar.bz2
>>>>>>>> 
>>>>>>>> Thank you for any help and please let me know if you need more 
>>>>>>>> information,
>>>>>>>> Matt
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>>>>>>>> get is one trick: rational thinking. But when you're good and crazy, 
>>>>>>>> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>>>>>>> get is one trick: rational thinking. But when you're good and crazy, 
>>>>>>> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>>>>>> get is one trick: rational thinking. But when you're good and crazy, 
>>>>>> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>>>>> get is one trick: rational thinking. But when you're good and crazy, 
>>>>> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>>>> get is one trick: rational thinking. But when you're good and crazy, 
>>>> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>>> get is one trick: rational thinking. But when you're good and crazy, 
>>> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>>> 
>> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
