Hello Gilles and Ralph,

Thank you for your advice so far. I appreciate the time that you have spent educating me about the details of Open MPI.
But I think that there is something fundamental that I don't understand.

Consider Example 2 run with Open MPI 2.1.1.

mpirun --> shell for process 0 --> executable for process 0 --> MPI calls, MPI_Abort
       --> shell for process 1 --> executable for process 1 --> MPI calls

After MPI_Abort is called, ps shows that both shells are running, and that the executable for process 1 is running (in this case, process 1 is sleeping). And mpirun does not exit until process 1 is finished sleeping.

I cannot reconcile this observed behavior with the statement

> > 2.x: each process is put into its own process group upon launch. When we issue a
> > "kill", we issue it to the process group. Thus, every child proc of that child proc will
> > receive it. IIRC, this was the intended behavior.

I assume that, for my example, there are two process groups. The process group for process 0 contains the shell for process 0 and the executable for process 0; and the process group for process 1 contains the shell for process 1 and the executable for process 1. So what does MPI_ABORT do?

MPI_ABORT does not kill the process group for process 0, since the shell for process 0 continues. And MPI_ABORT does not kill the process group for process 1, since both the shell and executable for process 1 continue.

If I hit Ctrl-C after MPI_Abort is called, I get the message

mpirun: abort is already in progress.. hit ctrl-c again to forcibly terminate

but I don't need to hit Ctrl-C again because mpirun immediately exits.

Can you shed some light on all of this?

Sincerely,

Ted Sussman

On 15 Jun 2017 at 14:44, r...@open-mpi.org wrote:

>
> You have to understand that we have no way of knowing who is making MPI calls - all we see is the proc that we started, and we know someone of that rank is running (but we have no way of knowing which of the procs you sub-spawned it is).
>
> So the behavior you are seeking only occurred in some earlier release by sheer accident.
> Nor will you find it portable, as there is no specification directing that behavior.
>
> The behavior I've provided is to either deliver the signal to _all_ child processes (including grandchildren etc.), or _only_ the immediate child of the daemon. It won't do what you describe - kill the MPI proc underneath the shell, but not the shell itself.
>
> What you can eventually do is use PMIx to ask the runtime to selectively deliver signals to pid/procs for you. We don't have that capability implemented just yet, I'm afraid.
>
> Meantime, when I get a chance, I can code an option that will record the pid of the subproc that calls MPI_Init, and then lets you deliver signals to just that proc. No promises as to when that will be done.
>
>
> On Jun 15, 2017, at 1:37 PM, Ted Sussman <ted.suss...@adina.com> wrote:
>
> Hello Ralph,
>
> I am just an Open MPI end user, so I will need to wait for the next official release.
>
> mpirun --> shell for process 0 --> executable for process 0 --> MPI calls
>        --> shell for process 1 --> executable for process 1 --> MPI calls
>        ...
>
> I guess the question is, should MPI_ABORT kill the executables or the shells? I naively thought that, since it is the executables that make the MPI calls, it is the executables that should be aborted by the call to MPI_ABORT. Since the shells don't make MPI calls, the shells should not be aborted.
>
> And users might have several layers of shells in between mpirun and the executable.
>
> So now I will look for the latest version of Open MPI that has the 1.4.3 behavior.
>
> Sincerely,
>
> Ted Sussman
>
> On 15 Jun 2017 at 12:31, r...@open-mpi.org wrote:
>
> > Yeah, things jittered a little there as we debated the "right" behavior. Generally, when we see that happening it means that a param is required, but somehow we never reached that point.
> >
> > See if https://github.com/open-mpi/ompi/pull/3704 helps - if so, I can schedule it for the next 2.x release if the RMs agree to take it
> >
> > Ralph
> >
> > On Jun 15, 2017, at 12:20 PM, Ted Sussman <ted.suss...@adina.com> wrote:
> >
> > Thank you for your comments.
> >
> > Our application relies upon "dum.sh" to clean up after the process exits, either if the process exits normally, or if the process exits abnormally because of MPI_ABORT. If the process group is killed by MPI_ABORT, this clean up will not be performed. If exec is used to launch the executable from dum.sh, then dum.sh is terminated by the exec, so dum.sh cannot perform any clean up.
> >
> > I suppose that other user applications might work similarly, so it would be good to have an MCA parameter to control the behavior of MPI_ABORT.
> >
> > We could rewrite our shell script that invokes mpirun, so that the cleanup that is now done by dum.sh is done by the invoking shell script after mpirun exits. Perhaps this technique is the preferred way to clean up after mpirun is invoked.
> >
> > By the way, I have also tested with Open MPI 1.10.7, and Open MPI 1.10.7 has different behavior than either Open MPI 1.4.3 or Open MPI 2.1.1. In this explanation, it is important to know that the aborttest executable sleeps for 20 sec.
> >
> > When running example 2:
> >
> > 1.4.3: process 1 immediately aborts
> > 1.10.7: process 1 doesn't abort and never stops.
> > 2.1.1: process 1 doesn't abort, but stops after it is finished sleeping
> >
> > Sincerely,
> >
> > Ted Sussman
> >
> > On 15 Jun 2017 at 9:18, r...@open-mpi.org wrote:
> >
> > Here is how the system is working:
> >
> > Master: each process is put into its own process group upon launch. When we issue a "kill", however, we only issue it to the individual process (instead of the process group that is headed by that child process).
> > This is probably a bug, as I don't believe that is what we intended, but set that aside for now.
> >
> > 2.x: each process is put into its own process group upon launch. When we issue a "kill", we issue it to the process group. Thus, every child proc of that child proc will receive it. IIRC, this was the intended behavior.
> >
> > It is rather trivial to make the change (it only involves 3 lines of code), but I'm not sure of what our intended behavior is supposed to be. Once we clarify that, it is also trivial to add another MCA param (you can never have too many!) to allow you to select the other behavior.
> >
> >
> > On Jun 15, 2017, at 5:23 AM, Ted Sussman <ted.suss...@adina.com> wrote:
> >
> > Hello Gilles,
> >
> > Thank you for your quick answer. I confirm that if exec is used, both processes immediately abort.
> >
> > Now suppose that the line
> >
> > echo "After aborttest: OMPI_COMM_WORLD_RANK="$OMPI_COMM_WORLD_RANK
> >
> > is added to the end of dum.sh.
> >
> > If Example 2 is run with Open MPI 1.4.3, the output is
> >
> > After aborttest: OMPI_COMM_WORLD_RANK=0
> >
> > which shows that the shell script for the process with rank 0 continues after the abort, but that the shell script for the process with rank 1 does not continue after the abort.
> >
> > If Example 2 is run with Open MPI 2.1.1, with exec used to invoke aborttest02.exe, then there is no such output, which shows that both shell scripts do not continue after the abort.
> >
> > I prefer the Open MPI 1.4.3 behavior because our original application depends upon the Open MPI 1.4.3 behavior. (Our original application will also work if both executables are aborted, and if both shell scripts continue after the abort.)
> >
> > It might be too much to expect, but is there a way to recover the Open MPI 1.4.3 behavior using Open MPI 2.1.1?
> >
> > Sincerely,
> >
> > Ted Sussman
> >
> >
> > On 15 Jun 2017 at 9:50, Gilles Gouaillardet wrote:
> >
> > Ted,
> >
> > fwiw, the 'master' branch has the behavior you expect.
> >
> > meanwhile, you can simply edit your 'dum.sh' script and replace
> >
> > /home/buildadina/src/aborttest02/aborttest02.exe
> >
> > with
> >
> > exec /home/buildadina/src/aborttest02/aborttest02.exe
> >
> > Cheers,
> >
> > Gilles
> >
> > On 6/15/2017 3:01 AM, Ted Sussman wrote:
> > Hello,
> >
> > My question concerns MPI_ABORT, indirect execution of executables by mpirun and Open MPI 2.1.1. When mpirun runs executables directly, MPI_ABORT works as expected, but when mpirun runs executables indirectly, MPI_ABORT does not work as expected.
> >
> > If Open MPI 1.4.3 is used instead of Open MPI 2.1.1, MPI_ABORT works as expected in all cases.
> >
> > The examples given below have been simplified as far as possible to show the issues.
> >
> > ---
> >
> > Example 1
> >
> > Consider an MPI job run in the following way:
> >
> > mpirun ... -app addmpw1
> >
> > where the appfile addmpw1 lists two executables:
> >
> > -n 1 -host gulftown ... aborttest02.exe
> > -n 1 -host gulftown ... aborttest02.exe
> >
> > The two executables are executed on the local node gulftown. aborttest02 calls MPI_ABORT for rank 0, then sleeps.
> >
> > The above MPI job runs as expected. Both processes immediately abort when rank 0 calls MPI_ABORT.
> >
> > ---
> >
> > Example 2
> >
> > Now change the above example as follows:
> >
> > mpirun ... -app addmpw2
> >
> > where the appfile addmpw2 lists shell scripts:
> >
> > -n 1 -host gulftown ... dum.sh
> > -n 1 -host gulftown ... dum.sh
> >
> > dum.sh invokes aborttest02.exe. So aborttest02.exe is executed indirectly by mpirun.
> >
> > In this case, the MPI job only aborts process 0 when rank 0 calls MPI_ABORT. Process 1 continues to run. This behavior is unexpected.
> >
> > ----
> >
> > I have attached all files to this E-mail. Since there are absolute pathnames in the files, to reproduce my findings, you will need to update the pathnames in the appfiles and shell scripts. To run example 1,
> >
> > sh run1.sh
> >
> > and to run example 2,
> >
> > sh run2.sh
> >
> > ---
> >
> > I have tested these examples with Open MPI 1.4.3 and 2.0.3. In Open MPI 1.4.3, both examples work as expected. Open MPI 2.0.3 has the same behavior as Open MPI 2.1.1.
> >
> > ---
> >
> > I would prefer that Open MPI 2.1.1 aborts both processes, even when the executables are invoked indirectly by mpirun. If there is an MCA setting that is needed to make Open MPI 2.1.1 abort both processes, please let me know.
> >
> >
> > Sincerely,
> >
> > Theodore Sussman
> >
> >
> > The following section of this message contains a file attachment
> > prepared for transmission using the Internet MIME message format.
> > If you are using Pegasus Mail, or any other MIME-compliant system,
> > you should be able to save it or view it from within your mailer.
> > If you cannot, please ask your system administrator for assistance.
> >
> > ---- File information -----------
> > File: config.log.bz2
> > Date: 14 Jun 2017, 13:35
> > Size: 146548 bytes.
> > Type: Binary
> >
> > ---- File information -----------
> > File: ompi_info.bz2
> > Date: 14 Jun 2017, 13:35
> > Size: 24088 bytes.
> > Type: Binary
> >
> > ---- File information -----------
> > File: aborttest02.tgz
> > Date: 14 Jun 2017, 13:52
> > Size: 4285 bytes.
> > Type: Binary
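
The two signal-delivery modes discussed in this thread (signalling only the process that was launched, versus signalling the whole process group that it heads) can be tried outside of Open MPI. The following is only a minimal POSIX sketch in Python and is not Open MPI code. The helper name run_demo, the stand-in wrapper command (a shell that starts "sleep 30" and waits, loosely playing the role of dum.sh) and the choice of SIGTERM are all assumptions made for illustration; nothing here says which signal mpirun actually sends.

```python
import os
import select
import signal
import subprocess


def run_demo(kill_group: bool) -> bool:
    """Return True if the shell's child is still running after the signal.

    Hypothetical helper for illustration only; it does not use Open MPI.
    """
    # Stand-in for a wrapper script: the shell starts a long-running child
    # in the background, reports that it has done so, and waits for it.
    # start_new_session=True makes the shell the leader of a new process
    # group, so the process group id equals the shell's pid.
    shell = subprocess.Popen(
        ["sh", "-c", "sleep 30 & echo started; wait"],
        stdout=subprocess.PIPE,
        start_new_session=True,
    )
    pgid = shell.pid
    try:
        # Block until the shell has forked its child.
        shell.stdout.readline()

        if kill_group:
            # Deliver the signal to every process in the group.
            os.killpg(pgid, signal.SIGTERM)
        else:
            # Deliver the signal to the shell only.
            os.kill(shell.pid, signal.SIGTERM)
        shell.wait()

        # The child inherited the write end of the pipe. The read end reports
        # end-of-file only once every process holding the write end has
        # exited, so "nothing readable" means the child is still alive.
        readable, _, _ = select.select([shell.stdout], [], [], 1.0)
        return not readable
    finally:
        # Clean up whatever is left of the process group.
        try:
            os.killpg(pgid, signal.SIGTERM)
        except ProcessLookupError:
            pass
        shell.stdout.close()


if __name__ == "__main__":
    print("signal to shell only, child survives:", run_demo(kill_group=False))
    print("signal to process group, child survives:", run_demo(kill_group=True))
```

In this sketch, signalling only the shell leaves the child under it running, while signalling the process group stops the shell and the child together. Neither mode stops the child while leaving the shell alive, which is the combination asked for earlier in the thread.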
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users