When you fork that process off, do you set its process group? Or is it in the 
same process group as the shell script?
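
One quick way to check, assuming a Linux system, is to list the process
group IDs while the job is running:

    ps -o pid,pgid,ppid,args

If the forked process shows the same PGID as the shell script, they are
in the same process group.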

> On Jun 19, 2017, at 10:19 AM, Ted Sussman <ted.suss...@adina.com> wrote:
> 
> If I replace the sleep with an infinite loop, I get the same behavior.  One
> "aborttest" process remains after all the signals are sent.
> 
> On 19 Jun 2017 at 10:10, r...@open-mpi.org wrote:
> 
>> 
>> That is typical behavior when you throw something into "sleep" - not much we
>> can do about it, I think.
>> 
>>    On Jun 19, 2017, at 9:58 AM, Ted Sussman <ted.suss...@adina.com> wrote:
>> 
>>    Hello,
>> 
>>    I have rebuilt Open MPI 2.1.1 on the same computer, including
>>    --enable-debug.
>> 
>>    I have attached the abort test program aborttest10.tgz.  This version
>>    sleeps for 5 sec before calling MPI_ABORT, so that I can check the pids
>>    using ps.
>> 
>>    This is what happens (see run2.sh.out).
>> 
>>    Open MPI invokes two instances of dum.sh.  Each instance of dum.sh
>>    invokes aborttest10.exe.
>> 
>>    Pid    Process
>>    ----------------------
>>    19565  dum.sh
>>    19566  dum.sh
>>    19567  aborttest10.exe
>>    19568  aborttest10.exe
>> 
>>    When MPI_ABORT is called, Open MPI sends SIGCONT, SIGTERM and SIGKILL to
>>    both instances of dum.sh (pids 19565 and 19566).
>> 
>>    ps shows that both the shell processes vanish, and that one of the
>>    aborttest10.exe processes vanishes.  But the other aborttest10.exe
>>    remains and continues until it is finished sleeping.
>> 
>>    Hope that this information is useful.
>> 
>>    Sincerely,
>> 
>>    Ted Sussman
>> 
>> 
>> 
>>    On 19 Jun 2017 at 23:06,  gil...@rist.or.jp  wrote:
>> 
>> 
>>     Ted,
>>     
>>    Some traces are missing because you did not configure with --enable-debug.
>>    I am afraid you have to do it (and you probably want to install that debug
>>    version in another location, since its performance is not good for
>>    production) in order to get all the logs.
>>     
>>    Cheers,
>>     
>>    Gilles
>>     
>>    ----- Original Message -----
>>       Hello Gilles,
>> 
>>       I retried my example, with the same results as I observed before.  The
>>       process with rank 1 does not get killed by MPI_ABORT.
>> 
>>       I have attached to this E-mail:
>> 
>>         config.log.bz2
>>         ompi_info.bz2  (uses ompi_info -a)
>>         aborttest09.tgz
>> 
>>       This testing is done on a computer running Linux 3.10.0.  This is a
>>       different computer than the computer that I previously used for
>>       testing.  You can confirm that I am using Open MPI 2.1.1.
>> 
>>       tar xvzf aborttest09.tgz
>>       cd aborttest09
>>       sh run2.sh
>> 
>>       run2.sh contains the command
>> 
>>       /opt/openmpi-2.1.1-GNU/bin/mpirun -np 2 -mca btl tcp,self --mca odls_base_verbose 10 ./dum.sh
>> 
>>       The output from this run is in aborttest09/run2.sh.out.
>> 
>>       The output shows that the "default" component is selected by odls.
>> 
>>       The only messages from odls are "odls: launch spawning child ..." (two
>>       messages).  There are no messages from odls with "kill", and I see no
>>       SENDING SIGCONT / SIGKILL messages.
>> 
>>       I am not running from within any batch manager.
>> 
>>       Sincerely,
>> 
>>       Ted Sussman
>> 
>>       On 17 Jun 2017 at 16:02, gil...@rist.or.jp wrote:
>> 
>>    Ted,
>> 
>>    i do not observe the same behavior you describe with Open MPI 2.1.1
>> 
>>    # mpirun -np 2 -mca btl tcp,self --mca odls_base_verbose 5 ./abort.sh
>> 
>>    abort.sh 31361 launching abort
>>    abort.sh 31362 launching abort
>>    I am rank 0 with pid 31363
>>    I am rank 1 with pid 31364
>>    --------------------------------------------------------------------------
>>    MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>>    with errorcode 1.
>> 
>>    NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>    You may or may not see output from other processes, depending on
>>    exactly when Open MPI kills them.
>>    --------------------------------------------------------------------------
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc working on WILDCARD
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc checking child process
>>    [[18199,1],0]
>>    [linux:31356] [[18199,0],0] SENDING SIGCONT TO [[18199,1],0]
>>    [linux:31356] [[18199,0],0] odls:default:SENT KILL 18 TO PID 31361
>>    SUCCESS
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc checking child process
>>    [[18199,1],1]
>>    [linux:31356] [[18199,0],0] SENDING SIGCONT TO [[18199,1],1]
>>    [linux:31356] [[18199,0],0] odls:default:SENT KILL 18 TO PID 31362
>>    SUCCESS
>>    [linux:31356] [[18199,0],0] SENDING SIGTERM TO [[18199,1],0]
>>    [linux:31356] [[18199,0],0] odls:default:SENT KILL 15 TO PID 31361
>>    SUCCESS
>>    [linux:31356] [[18199,0],0] SENDING SIGTERM TO [[18199,1],1]
>>    [linux:31356] [[18199,0],0] odls:default:SENT KILL 15 TO PID 31362
>>    SUCCESS
>>    [linux:31356] [[18199,0],0] SENDING SIGKILL TO [[18199,1],0]
>>    [linux:31356] [[18199,0],0] odls:default:SENT KILL 9 TO PID 31361
>>    SUCCESS
>>    [linux:31356] [[18199,0],0] SENDING SIGKILL TO [[18199,1],1]
>>    [linux:31356] [[18199,0],0] odls:default:SENT KILL 9 TO PID 31362
>>    SUCCESS
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc working on WILDCARD
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc checking child process
>>    [[18199,1],0]
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc child [[18199,1],0] is
>>    not alive
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc checking child process
>>    [[18199,1],1]
>>    [linux:31356] [[18199,0],0] odls:kill_local_proc child [[18199,1],1] is
>>    not alive
>> 
>> 
>>    Open MPI did kill both shells, and they were indeed killed as evidenced
>>    by ps
>> 
>>    #ps -fu gilles --forest
>>    UID        PID  PPID  C STIME TTY          TIME CMD
>>    gilles    1564  1561  0 15:39 ?        00:00:01 sshd: gilles@pts/1
>>    gilles    1565  1564  0 15:39 pts/1    00:00:00  \_ -bash
>>    gilles   31356  1565  3 15:57 pts/1    00:00:00      \_ /home/gilles/local/ompi-v2.x/bin/mpirun -np 2 -mca btl tcp,self --mca odls_base
>>    gilles   31364     1  1 15:57 pts/1    00:00:00 ./abort
>> 
>> 
>>    so trapping SIGTERM in your shell and manually killing the MPI task
>>    should work
>>    (as Jeff explained, as long as the shell script is fast enough to do
>>    that between SIGTERM and SIGKILL)
>> 
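>>    for example, a minimal wrapper along these lines (a sketch only: the
>>    binary name is illustrative, and the cleanup has to finish before the
>>    follow-up SIGKILL arrives):
>> 
>>        #!/bin/sh
>>        cleanup() {
>>            # application cleanup goes here, then stop the MPI task
>>            kill -TERM "$child" 2>/dev/null
>>            exit 1
>>        }
>>        trap cleanup TERM
>>        ./aborttest.exe &
>>        child=$!
>>        wait "$child"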
>> 
>>    if you observe a different behavior, please double check your Open MPI
>>    version and post the outputs of the same commands.
>> 
>>    btw, are you running from a batch manager? If yes, which one?
>> 
>>    Cheers,
>> 
>>    Gilles
>> 
>>    ----- Original Message -----
>>    Ted,
>> 
>>    if you
>> 
>>    mpirun --mca odls_base_verbose 10 ...
>> 
>>    you will see which processes get killed and how
>> 
>>    Best regards,
>> 
>> 
>>    Gilles
>> 
>>    ----- Original Message -----
>>    Hello Jeff,
>> 
>>    Thanks for your comments.
>> 
>>    I am not seeing behavior #4 on the two computers that I have tested on,
>>    using Open MPI 2.1.1.
>> 
>>    I wonder if you can duplicate my results with the files that I have
>>    uploaded.
>> 
>>    Regarding what is the "correct" behavior, I am willing to modify my
>>    application to correspond to Open MPI's behavior (whatever behavior the
>>    Open MPI developers decide is best) -- provided that Open MPI does in
>>    fact kill off both shells.
>> 
>>    So my highest priority now is to find out why Open MPI 2.1.1 does not
>>    kill off both shells on my computer.
>> 
>>    Sincerely,
>> 
>>    Ted Sussman
>> 
>>      On 16 Jun 2017 at 16:35, Jeff Squyres (jsquyres) wrote:
>> 
>>    Ted --
>> 
>>    Sorry for jumping in late.  Here's my $0.02...
>> 
>>    In the runtime, we can do 4 things:
>> 
>>    1. Kill just the process that we forked.
>>    2. Kill just the process(es) that call back and identify themselves as
>>       MPI processes (we don't track this right now, but we could add that
>>       functionality).
>>    3. Union of #1 and #2.
>>    4. Kill all processes (to include any intermediate processes that are
>>       not included in #1 and #2).
>> 
>>    In Open MPI 2.x, #4 is the intended behavior.  There may be a bug or two
>>    that need to get fixed (e.g., in your last mail, I don't see offhand why
>>    it waits until the MPI process finishes sleeping), but we should be
>>    killing the process group, which -- unless any of the descendant
>>    processes have explicitly left the process group -- should hit the
>>    entire process tree.
>> 
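>>    (Illustration, not a feature of the runtime: a descendant leaves the
>>    process group if it is started in a new session, e.g. with the
>>    util-linux setsid command -
>> 
>>        setsid ./aborttest10.exe
>> 
>>    a process started this way no longer receives a group-directed kill.)
>> 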
>>    Sidenote: there's actually a way to be a bit more aggressive and do a
>>    better job of ensuring that we kill *all* processes (via creative use
>>    of PR_SET_CHILD_SUBREAPER), but that's basically a future enhancement /
>>    optimization.
>> 
>>    I think Gilles and Ralph made a good point to you: if you want to be
>>    sure to be able to do cleanup after an MPI process terminates (normally
>>    or abnormally), you should trap signals in your intermediate processes
>>    to catch what Open MPI's runtime throws and therefore know that it is
>>    time to clean up.
>> 
>>    Hypothetically, this should work in all versions of Open MPI...?
>> 
>>    I think Ralph made a pull request that adds an MCA param to change the
>>    default behavior from #4 to #1.
>> 
>>    Note, however, that there's a little time between when Open MPI sends
>>    the SIGTERM and the SIGKILL, so this solution could be racy.  If you
>>    find that you're running out of time to cleanup, we might be able to
>>    make the delay between the SIGTERM and SIGKILL be configurable (e.g.,
>>    via MCA param).
>> 
>> 
>> 
>> 
>>    On Jun 16, 2017, at 10:08 AM, Ted Sussman <ted.suss...@adina.com> wrote:
>> 
>>    Hello Gilles and Ralph,
>> 
>>    Thank you for your advice so far.  I appreciate the time that you have
>>    spent to educate me about the details of Open MPI.
>> 
>>    But I think that there is something fundamental that I don't understand.
>>    Consider Example 2 run with Open MPI 2.1.1.
>> 
>>    mpirun --> shell for process 0 --> executable for process 0 --> MPI calls, MPI_Abort
>>           --> shell for process 1 --> executable for process 1 --> MPI calls
>> 
>>    After MPI_Abort is called, ps shows that both shells are running, and
>>    that the executable for process 1 is running (in this case, process 1 is
>>    sleeping).  And mpirun does not exit until process 1 is finished
>>    sleeping.
>> 
>>    I cannot reconcile this observed behavior with the statement
>> 
>>         >     2.x: each process is put into its own process group upon
>>         >     launch. When we issue a "kill", we issue it to the process
>>         >     group. Thus, every child proc of that child proc will
>>         >     receive it. IIRC, this was the intended behavior.
>> 
>>    I assume that, for my example, there are two process groups.  The
>>    process group for process 0 contains the shell for process 0 and the
>>    executable for process 0; and the process group for process 1 contains
>>    the shell for process 1 and the executable for process 1.  So what does
>>    MPI_ABORT do?  MPI_ABORT does not kill the process group for process 0,
>>    since the shell for process 0 continues.  And MPI_ABORT does not kill
>>    the process group for process 1, since both the shell and executable
>>    for process 1 continue.
>> 
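>>    (For reference: from a shell, a signal is directed at a whole process
>>    group by negating the pgid, e.g. kill -TERM -- -19565 would signal
>>    every member of the process group led by pid 19565.)
>> 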
>>    If I hit Ctrl-C after MPI_Abort is called, I get the message
>> 
>>    mpirun: abort is already in progress.. hit ctrl-c again to forcibly terminate
>> 
>>    but I don't need to hit Ctrl-C again because mpirun immediately exits.
>> 
>>    Can you shed some light on all of this?
>> 
>>    Sincerely,
>> 
>>    Ted Sussman
>> 
>> 
>>    On 15 Jun 2017 at 14:44, r...@open-mpi.org wrote:
>> 
>> 
>>    You have to understand that we have no way of knowing who is making MPI
>>    calls - all we see is the proc that we started, and we know someone of
>>    that rank is running (but we have no way of knowing which of the procs
>>    you sub-spawned it is).
>> 
>>    So the behavior you are seeking only occurred in some earlier release
>>    by sheer accident. Nor will you find it portable, as there is no
>>    specification directing that behavior.
>> 
>>    The behavior I've provided is to either deliver the signal to _all_
>>    child processes (including grandchildren etc.), or _only_ the immediate
>>    child of the daemon.  It won't do what you describe - kill the MPI proc
>>    underneath the shell, but not the shell itself.
>> 
>>    What you can eventually do is use PMIx to ask the runtime to
>>    selectively deliver signals to pid/procs for you. We don't have that
>>    capability implemented just yet, I'm afraid.
>> 
>>    Meantime, when I get a chance, I can code an option that will record
>>    the pid of the subproc that calls MPI_Init, and then lets you deliver
>>    signals to just that proc. No promises as to when that will be done.
>> 
>> 
>>          On Jun 15, 2017, at 1:37 PM, Ted Sussman <ted.sussman@adina.com> wrote:
>> 
>>         Hello Ralph,
>> 
>>         I am just an Open MPI end user, so I will need to wait for the
>>         next official release.
>> 
>>         mpirun --> shell for process 0 --> executable for process 0 --> MPI calls
>>                --> shell for process 1 --> executable for process 1 --> MPI calls
>>                                         ...
>> 
>>         I guess the question is, should MPI_ABORT kill the executables or
>>         the shells?  I naively thought that, since it is the executables
>>         that make the MPI calls, it is the executables that should be
>>         aborted by the call to MPI_ABORT.  Since the shells don't make MPI
>>         calls, the shells should not be aborted.
>> 
>>         And users might have several layers of shells in between mpirun
>>         and the executable.
>> 
>>         So now I will look for the latest version of Open MPI that has
>>         the 1.4.3 behavior.
>> 
>>         Sincerely,
>> 
>>         Ted Sussman
>> 
>>          On 15 Jun 2017 at 12:31, r...@open-mpi.org wrote:
>> 
>>         >
>>         > Yeah, things jittered a little there as we debated the "right"
>>         > behavior. Generally, when we see that happening it means that a
>>         > param is required, but somehow we never reached that point.
>>         >
>>         > See if https://github.com/open-mpi/ompi/pull/3704 helps - if so,
>>         > I can schedule it for the next 2.x release if the RMs agree to
>>         > take it
>>         >
>>         > Ralph
>>         >
>>         >     On Jun 15, 2017, at 12:20 PM, Ted Sussman <ted.sussman@adina.com> wrote:
>>          >
>>         >     Thank you for your comments.
>>          >   
>>         >     Our application relies upon "dum.sh" to clean up after the
>>         >     process exits, either if the process exits normally, or if
>>         >     the process exits abnormally because of MPI_ABORT.  If the
>>         >     process group is killed by MPI_ABORT, this clean up will not
>>         >     be performed.  If exec is used to launch the executable from
>>         >     dum.sh, then dum.sh is terminated by the exec, so dum.sh
>>         >     cannot perform any clean up.
>>         >   
>>         >     I suppose that other user applications might work similarly,
>>         >     so it would be good to have an MCA parameter to control the
>>         >     behavior of MPI_ABORT.
>>         >   
>>         >     We could rewrite our shell script that invokes mpirun, so
>>         >     that the cleanup that is now done by dum.sh is done by the
>>         >     invoking shell script after mpirun exits.  Perhaps this
>>         >     technique is the preferred way to clean up after mpirun is
>>         >     invoked.
>>          >   
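>>         >     A minimal sketch of that approach (do_cleanup standing in
>>         >     for our cleanup steps):
>>         >
>>         >         #!/bin/sh
>>         >         # run the MPI job, then clean up unconditionally once
>>         >         # mpirun has returned
>>         >         mpirun -np 2 ./dum.sh
>>         >         status=$?
>>         >         do_cleanup
>>         >         exit $status
>>         >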
>>         >     By the way, I have also tested with Open MPI 1.10.7, and
>>         >     Open MPI 1.10.7 has different behavior than either Open MPI
>>         >     1.4.3 or Open MPI 2.1.1.  In this explanation, it is
>>         >     important to know that the aborttest executable sleeps for
>>         >     20 sec.
>>         >   
>>         >     When running example 2:
>>         >
>>         >     1.4.3: process 1 immediately aborts
>>         >     1.10.7: process 1 doesn't abort and never stops
>>         >     2.1.1: process 1 doesn't abort, but stops after it is
>>         >            finished sleeping
>>         >   
>>         >     Sincerely,
>>         >   
>>         >     Ted Sussman
>>          >   
>>         >     On 15 Jun 2017 at 9:18, r...@open-mpi.org wrote:
>>         >
>>         >     Here is how the system is working:
>>         >
>>         >     Master: each process is put into its own process group upon
>>         >     launch. When we issue a "kill", however, we only issue it to
>>         >     the individual process (instead of the process group that is
>>         >     headed by that child process). This is probably a bug as I
>>         >     don't believe that is what we intended, but set that aside
>>         >     for now.
>>          >   
>>         >     2.x: each process is put into its own process group upon
>>         >     launch. When we issue a "kill", we issue it to the process
>>         >     group. Thus, every child proc of that child proc will
>>         >     receive it. IIRC, this was the intended behavior.
>>          >   
>>         >     It is rather trivial to make the change (it only involves 3
>>         >     lines of code), but I'm not sure of what our intended
>>         >     behavior is supposed to be. Once we clarify that, it is also
>>         >     trivial to add another MCA param (you can never have too
>>         >     many!) to allow you to select the other behavior.
>>         >   
>>         >
>>         >     On Jun 15, 2017, at 5:23 AM, Ted Sussman <ted.sussman@adina.com> wrote:
>>         >   
>>         >     Hello Gilles,
>>         >   
>>         >     Thank you for your quick answer.  I confirm that if exec is
>>         >     used, both processes immediately abort.
>>         >   
>>         >     Now suppose that the line
>>         >
>>         >     echo "After aborttest: OMPI_COMM_WORLD_RANK="$OMPI_COMM_WORLD_RANK
>>         >
>>         >     is added to the end of dum.sh.
>>         >   
>>         >     If Example 2 is run with Open MPI 1.4.3, the output is
>>         >
>>         >     After aborttest: OMPI_COMM_WORLD_RANK=0
>>         >
>>         >     which shows that the shell script for the process with rank
>>         >     0 continues after the abort, but that the shell script for
>>         >     the process with rank 1 does not continue after the abort.
>>         >   
>>         >     If Example 2 is run with Open MPI 2.1.1, with exec used to
>>         >     invoke aborttest02.exe, then there is no such output, which
>>         >     shows that neither shell script continues after the abort.
>>         >   
>>         >     I prefer the Open MPI 1.4.3 behavior because our original
>>         >     application depends upon the Open MPI 1.4.3 behavior.  (Our
>>         >     original application will also work if both executables are
>>         >     aborted, and if both shell scripts continue after the abort.)
>>         >
>>         >     It might be too much to expect, but is there a way to
>>         >     recover the Open MPI 1.4.3 behavior using Open MPI 2.1.1?
>>         >   
>>          >     Sincerely,
>>         >   
>>         >     Ted Sussman
>>         >   
>>         >   
>>          >     On 15 Jun 2017 at 9:50, Gilles Gouaillardet wrote:
>>         >
>>         >     Ted,
>>         >   
>>          >   
>>         >     fwiw, the 'master' branch has the behavior you expect.
>>         >   
>>         >   
>>         >     meanwhile, you can simply edit your 'dum.sh' script and replace
>>          >   
>>         >     /home/buildadina/src/aborttest02/aborttest02.exe
>>          >   
>>         >     with
>>          >   
>>         >     exec /home/buildadina/src/aborttest02/aborttest02.exe
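>>         >
>>         >     i.e. dum.sh then boils down to (shebang assumed):
>>         >
>>         >         #!/bin/sh
>>         >         # exec replaces the shell with the MPI binary, so the
>>         >         # signals Open MPI sends reach the MPI process directly
>>         >         exec /home/buildadina/src/aborttest02/aborttest02.exe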
>>          >   
>>         >   
>>         >     Cheers,
>>         >   
>>         >   
>>         >     Gilles
>>         >   
>>          >   
>>         >     On 6/15/2017 3:01 AM, Ted Sussman wrote:
>>         >     Hello,
>>         >
>>         >     My question concerns MPI_ABORT, indirect execution of
>>         >     executables by mpirun and Open MPI 2.1.1.  When mpirun runs
>>         >     executables directly, MPI_ABORT works as expected, but when
>>         >     mpirun runs executables indirectly, MPI_ABORT does not work
>>         >     as expected.
>>         >   
>>         >     If Open MPI 1.4.3 is used instead of Open MPI 2.1.1,
>>         >     MPI_ABORT works as expected in all cases.
>>         >
>>         >     The examples given below have been simplified as far as
>>         >     possible to show the issues.
>>         >   
>>         >     ---
>>         >   
>>          >     Example 1
>>         >   
>>          >     Consider an MPI job run in the following way:
>>         >   
>>          >     mpirun ... -app addmpw1
>>         >   
>>         >     where the appfile addmpw1 lists two executables:
>>         >   
>>         >     -n 1 -host gulftown ... aborttest02.exe
>>         >     -n 1 -host gulftown ... aborttest02.exe
>>          >   
>>         >     The two executables are executed on the local node
>>         >     gulftown.  aborttest02 calls MPI_ABORT for rank 0, then
>>         >     sleeps.
>>         >   
>>         >     The above MPI job runs as expected.  Both processes
>>         >     immediately abort when rank 0 calls MPI_ABORT.
>>         >   
>>          >     ---
>>         >   
>>          >     Example 2
>>         >   
>>         >     Now change the above example as follows:
>>         >   
>>         >     mpirun ... -app addmpw2
>>         >   
>>         >     where the appfile addmpw2 lists shell scripts:
>>         >   
>>         >     -n 1 -host gulftown ... dum.sh
>>         >     -n 1 -host gulftown ... dum.sh
>>         >   
>>         >     dum.sh invokes aborttest02.exe.  So aborttest02.exe is
>>         >     executed indirectly by mpirun.
>>         >   
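>>         >     (dum.sh is essentially a one-line wrapper around the
>>         >     executable; modulo the shebang, which is assumed here, it
>>         >     reads:
>>         >
>>         >         #!/bin/sh
>>         >         /home/buildadina/src/aborttest02/aborttest02.exe
>>         >
>>         >     so the executable runs as a child of the script's shell.)
>>         >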
>>         >     In this case, the MPI job only aborts process 0 when rank 0
>>         >     calls MPI_ABORT.  Process 1 continues to run.  This behavior
>>         >     is unexpected.
>>         >   
>>         >     ----
>>          >   
>>         >     I have attached all files to this E-mail.  Since there are
>>         >     absolute pathnames in the files, to reproduce my findings,
>>         >     you will need to update the pathnames in the appfiles and
>>         >     shell scripts.  To run example 1,
>>          >   
>>         >     sh run1.sh
>>          >   
>>         >     and to run example 2,
>>         >   
>>         >     sh run2.sh
>>         >   
>>          >     ---
>>         >   
>>         >     I have tested these examples with Open MPI 1.4.3 and 2.0.3.
>>         >     In Open MPI 1.4.3, both examples work as expected.  Open MPI
>>         >     2.0.3 has the same behavior as Open MPI 2.1.1.
>>         >   
>>         >     ---
>>          >   
>>         >     I would prefer that Open MPI 2.1.1 abort both processes,
>>         >     even when the executables are invoked indirectly by mpirun.
>>         >     If there is an MCA setting that is needed to make Open MPI
>>         >     2.1.1 abort both processes, please let me know.
>>          >   
>>         >   
>>         >     Sincerely,
>>         >   
>>         >     Theodore Sussman
>>          >   
>>         >   
>>         >       ---- File information -----------
>>         >         File:  config.log.bz2
>>         >         Date:  14 Jun 2017, 13:35
>>         >         Size:  146548 bytes.
>>          >         Type:  Binary
>>         >   
>>          >   
>>         >       ---- File information -----------
>>         >         File:  ompi_info.bz2
>>         >         Date:  14 Jun 2017, 13:35
>>          >         Size:  24088 bytes.
>>         >         Type:  Binary
>>          >   
>>         >   
>>         >       ---- File information -----------
>>         >         File:  aborttest02.tgz
>>         >         Date:  14 Jun 2017, 13:52
>>         >         Size:  4285 bytes.
>>          >         Type:  Binary
>> 
>>    --
>>    Jeff Squyres
>>    jsquy...@cisco.com
>> 
>> 
>>      ---- File information -----------
>>        File:  aborttest10.tgz
>>        Date:  19 Jun 2017, 12:42
>>        Size:  4740 bytes.
>>        Type:  Binary
>>    <aborttest10.tgz>
>> 
> 
> 
> 

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
