Re: [OMPI users] Propagating SIGINT instead of SIGTERM to children processes

2020-03-16 Thread Ralph Castain via users
Hi Nathan

Sorry for the long, long delay in responding - no reasonable excuse (just busy, 
switching over support areas, etc.). Hopefully, you already found the solution.

You can specify the signals to forward to children using an MCA parameter:

OMPI_MCA_ess_base_forward_signals=SIGINT

should do what you are seeking. You can get a list of these using the 
"ompi_info" program that comes with OpenMPI. In this case, you would find the 
above param with the following help output:

Comma-delimited list of additional signals (names or integers) to forward to 
application processes ["none" => forward nothing]. Signals provided by 
default include SIGTSTP, SIGUSR1, SIGUSR2, SIGABRT, SIGALRM, and SIGCONT
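If the scripts rely on `finally`-style cleanup, forwarding SIGINT is usually all that's needed, since Python's default SIGINT handler raises KeyboardInterrupt. The workaround Nathan mentions in the quoted message (translating SIGTERM into KeyboardInterrupt) can be sketched like this - a minimal, hypothetical example, not code from the thread:

```python
import os
import signal

cleanup_ran = []

def sigterm_to_keyboardinterrupt(signum, frame):
    # Re-raise SIGTERM as the exception SIGINT would have produced,
    # so existing KeyboardInterrupt/finally cleanup paths still run.
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, sigterm_to_keyboardinterrupt)

try:
    try:
        # Simulate mpiexec delivering SIGTERM to this worker process.
        os.kill(os.getpid(), signal.SIGTERM)
    finally:
        # Stand-in for the real teardown code (hypothetical).
        cleanup_ran.append(True)
except KeyboardInterrupt:
    pass

print(cleanup_ran)
```

With the MCA parameter above set, this translation step becomes unnecessary: the children receive SIGINT directly and Python raises KeyboardInterrupt on its own.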


HTH
Ralph

> On Sep 28, 2019, at 4:42 AM, Nathan GREINER via users wrote:
> 
> Dear open-mpi users,
> 
> I am using open-mpi in conjunction with the mpi4py package to run parallel 
> simulations using python on my local machine.
> 
> I use the following idiom:
> 
>mpiexec -np 4 python myscript.py
> 
> When I hit ^C during the execution of the above command, the mpi program is 
> interrupted, and the python programs are also interrupted.
> 
> However, I get no traceback from the python programs, and more 
> problematically, the cleanup functions of these programs are not executed as 
> they should when these programs get interrupted.
> 
> The open-mpi documentation states that: "When orterun (<=> mpiexec <=> 
> mpirun) receives a SIGTERM and SIGINT, it will attempt to kill the entire job 
> by sending all processes in the job a SIGTERM, waiting a small number of 
> seconds, then sending all processes in the job a SIGKILL."
> 
> Thus, the python programs receive a SIGTERM signal instead of the SIGINT 
> signal that they would receive upon hitting ^C during an execution launched 
> with the idiom:
> 
>python myscript.py
> 
> I know that there is a way to make the python programs handle the SIGTERM 
> signal as if it was a SIGINT signal (namely, raising a KeyboardInterrupt), 
> but I would prefer to be able to configure mpiexec to propagate the SIGINT 
> signal it receives instead of sending a SIGTERM signal to its children 
> processes.
> 
> Would you know how this could be achieved?
> 
> Thank you very much for your time and help,
> 
> Nathan GREINER
> 
> PS: I am new to the open-mpi users mailing list: is this the right place and 
> way to ask such a question?
> 
> 




Re: [OMPI users] Interpreting the output of --display-map and --display-allocation

2020-03-16 Thread Ralph Castain via users
FWIW: I have replaced those flags in the display option output with their 
string equivalent to make interpretation easier. This is available in OMPI 
master and will be included in the v5 release.



> On Nov 21, 2019, at 2:08 AM, Peter Kjellström via users wrote:
> 
> On Mon, 18 Nov 2019 17:48:30 +
> "Mccall, Kurt E. \(MSFC-EV41\) via users" 
> wrote:
> 
>> I'm trying to debug a problem with my job, launched with the mpiexec
>> options -display-map and -display-allocation, but I don't know how to
>> interpret the output.   For example,  mpiexec displays the following
>> when a job is spawned by MPI_Comm_spawn():
>> 
>> ==   ALLOCATED NODES   ==
>>n002: flags=0x11 slots=3 max_slots=0 slots_inuse=2 state=UP
>>n001: flags=0x13 slots=3 max_slots=0 slots_inuse=1 state=UP
>> 
>> Maybe the differing "flags" values have bearing on the problem, but I
>> don't know what they mean.   Are the outputs of these two options
>> documented anywhere?
> 
> I don't know of any such specific documentation but the flag values are
> defined in:
> 
> orte/util/attr.h:54 (openmpi-3.1.4)
> 
> The difference between your nodes (bit value 0x2) means:
> 
> #define ORTE_NODE_FLAG_LOC_VERIFIED   0x02   
> 
> // whether or not the location has been verified - used for
> // environments where the daemon's final destination is uncertain
> 
> I do not know what that means exactly but it is not related to pinning
> on or off.
> 
> Seems to indicate a broken launch and/or install and/or environment.
> 
> /Peter K
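Peter's bit-value lookup can be mechanized with a tiny decoder. Only the 0x02 name is confirmed by the thread; the remaining names and values are assumptions drawn from orte/util/attr.h (openmpi-3.1.4) and should be verified against your own source tree:

```python
# Decode the hex "flags" values printed by --display-allocation.
# Only ORTE_NODE_FLAG_LOC_VERIFIED (0x02) is confirmed in the thread;
# the other entries are assumed from orte/util/attr.h (openmpi-3.1.4).
ORTE_NODE_FLAGS = {
    0x01: "ORTE_NODE_FLAG_DAEMON_LAUNCHED",   # assumed
    0x02: "ORTE_NODE_FLAG_LOC_VERIFIED",      # from the thread
    0x04: "ORTE_NODE_FLAG_OVERSUBSCRIBED",    # assumed
    0x08: "ORTE_NODE_FLAG_MAPPED",            # assumed
    0x10: "ORTE_NODE_FLAG_SLOTS_GIVEN",       # assumed
}

def decode_flags(flags):
    """Return the names of the bits set in a node's flags word."""
    return [name for bit, name in sorted(ORTE_NODE_FLAGS.items())
            if flags & bit]

# Kurt's two nodes (flags=0x13 vs flags=0x11) differ only in bit 0x02:
diff = 0x13 ^ 0x11
```

XOR-ing the two flag words isolates exactly the bits that differ, which is how Peter arrived at 0x02.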




[OMPI users] Limits of communicator size and number of parallel broadcast transmissions

2020-03-16 Thread Konstantinos Konstantinidis via users
Hi, I have some questions regarding technical details of MPI collective
communication methods and broadcast:

   - I want to understand when the number of receivers in an MPI_Bcast can
   be a problem slowing down the broadcast. There are a few implementations of
   MPI_Bcast. Consider that of a binary tree. In this case, the sender (root)
   transmits the common message to its two children, each of them to two more,
   and so on. Is it accurate to say that in each level of the tree all
   transmissions happen in parallel, or can only one transmission be done from
   each node? To that end, is there a limit on the number of processes a
   process can broadcast to in parallel?
   - Since each MPI_Bcast is associated with a communicator, is there a
   limit on the number of processes a communicator can have, and if so, what
   is it in Open MPI?
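As a toy model of the first question (not Open MPI's actual implementation - Open MPI selects among several broadcast algorithms at runtime), assume every node that already holds the message forwards it to both of its children in the same round, i.e. all transmissions within a tree level are parallel:

```python
def bcast_rounds(n):
    """Communication rounds for a binary-tree broadcast over ranks
    0..n-1, assuming each holder forwards to both children (ranks
    2i+1 and 2i+2) in a single round."""
    have = {0}          # the root starts with the message
    rounds = 0
    while len(have) < n:
        new = set()
        for i in have:
            for child in (2 * i + 1, 2 * i + 2):
                if child < n:
                    new.add(child)
        have |= new
        rounds += 1
    return rounds
```

Under this model the round count is the tree depth, so it grows logarithmically in the communicator size. If instead each node could send only one message per round, reaching both children would take two rounds per level, roughly doubling the total - which is the crux of the "parallel within a level or not" question.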


Regards,
Kostas


Re: [OMPI users] MPI_Comm_spawn: no allocated resources for the application ...

2020-03-16 Thread Ralph Castain via users
Sorry for the incredibly late reply. Hopefully, you have already managed to 
find the answer.

I'm not sure what your comm_spawn command looks like, but it appears you 
specified the host in it using the "dash_host" info-key, yes? The problem is 
that this is interpreted the same way as the "-host n001.cluster.com" option 
on an mpiexec cmd line - which means that it 
only allocates _one_ slot to the request. If you are asking to spawn two procs, 
then you don't have adequate resources. One way to check is to only spawn one 
proc with your comm_spawn request and see if that works.

If you want to specify the host, then you need to append the number of slots to 
allocate on that host - e.g., "n001.cluster.com:2". 
Of course, you cannot allocate more than the system provided minus the number 
currently in use. There are additional modifiers you can pass to handle 
variable numbers of slots.
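A sketch of how the spawn side might build that spec with mpi4py (a hypothetical wrapper, not code from the thread; the mpi4py import is guarded because the library may not be installed):

```python
def dash_host_spec(host, slots):
    """Build the "host:slots" value Ralph describes for the dash_host
    info-key, e.g. "n001.cluster.com:2" to allocate two slots."""
    if slots < 1:
        raise ValueError("need at least one slot")
    return "%s:%d" % (host, slots)

try:
    from mpi4py import MPI

    def spawn_workers(command, host, nworkers):
        # Pass the slot count along with the host so the spawn request
        # is not limited to a single slot (hypothetical helper).
        info = MPI.Info.Create()
        info.Set("dash_host", dash_host_spec(host, nworkers))
        return MPI.COMM_SELF.Spawn(command, maxprocs=nworkers, info=info)
except ImportError:
    pass  # mpi4py not installed; the string helper above still works
```

Without the ":2" suffix the request is treated like "-host n001.cluster.com" on the command line and gets only one slot, reproducing the error below.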

HTH
Ralph


On Oct 25, 2019, at 5:30 AM, Mccall, Kurt E. (MSFC-EV41) via users wrote:

I am trying to launch a number of manager processes, one per node, and then have
each of those managers spawn, on its own node, a number of workers. For this
example, I have 2 managers and 2 workers per manager. I'm following the
instructions at this link
https://stackoverflow.com/questions/47743425/controlling-node-mapping-of-mpi-comm-spawn
to force one manager process per node.

Here is my PBS/Torque qsub command:

$ qsub -V -j oe -e ./stdio -o ./stdio -f -X -N MyManagerJob -l nodes=2:ppn=3 MyManager.bash

I expect "-l nodes=2:ppn=3" to reserve 2 nodes with 3 slots on each (one slot
for the manager and the other two for the separately spawned workers). The
first argument is a lower-case L, not a one.

Here is my mpiexec command within the MyManager.bash script:

mpiexec --enable-recovery --display-map --display-allocation --mca
mpi_param_check 1 --v --x DISPLAY --np 2 --map-by ppr:1:node MyManager.exe

I expect "--map-by ppr:1:node" to cause OpenMpi to launch exactly one manager
on each node.
When the first worker is spawned via MPI_Comm_spawn(), OpenMpi reports:

==   ALLOCATED NODES   ==
   n002: flags=0x11 slots=3 max_slots=0 slots_inuse=3 state=UP
   n001: flags=0x13 slots=3 max_slots=0 slots_inuse=1 state=UP
=================================================================
--
There are no allocated resources for the application:
  ./MyWorker
that match the requested mapping:
  -host: n001.cluster.com  
 Verify that you have mapped the allocated resources properly for the
indicated specification.
--
[n001:14883] *** An error occurred in MPI_Comm_spawn
[n001:14883] *** reported by process [1897594881,1]
[n001:14883] *** on communicator MPI_COMM_SELF
[n001:14883] *** MPI_ERR_SPAWN: could not spawn processes
In the banner above, it clearly states that node n001 has 3 slots reserved
and only one slot in use at the time of the spawn. Not sure why it reports
that there are no resources for it.

I've tried compiling OpenMpi 4.0 both with and without Torque support, and
I've tried using an explicit host file (or not), but the error is unchanged.
Any ideas?
My cluster is running CentOS 7.4 and I am using the Portland Group C++
compiler.