It’s the same output and the same result:

batch13:~> aprun -n 2 -N 1 hostname
nid00418
nid00419

batch13:~> aprun -n 2 -N 1 -L nid00418,nid00419 hostname
aprun: -L node_list contains an invalid entry
Usage: aprun [global_options] [command_options] cmd1
...

Thanks,
Chris

-----Original Message-----
From: Pritchard Jr., Howard <howa...@lanl.gov> 
Sent: Thursday, July 11, 2024 9:03 AM
To: Borchert, Christopher B ERDC-RDE-ITL-MS CIV 
<christopher.b.borch...@erdc.dren.mil>; Open MPI Users 
<users@lists.open-mpi.org>
Subject: Re: [EXTERNAL] [OMPI users] Invalid -L flag added to aprun

Hi Chris

I wonder if something's messed up with the way ALPS is interpreting node names on the system.

Could you try doing the following:

1. Get a two-node allocation on your cluster.
2. Run: aprun -n 2 -N 1 hostname
3. Take the two hostnames returned, then run: aprun -n 2 -N 1 -L X,Y hostname
   where X is the first hostname returned in step 2 and Y is the second. (A sketch of the full sequence is below.)
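
A minimal sketch of that sequence, assuming an interactive PBS Pro allocation (the select and walltime values are only illustrative):

qsub -I -l select=2:ncpus=1 -l walltime=00:10:00
aprun -n 2 -N 1 hostname                # prints two hostnames, call them X and Y
aprun -n 2 -N 1 -L X,Y hostname         # substitute the actual hostnames for X and Y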

On 7/11/24, 7:55 AM, "Borchert, Christopher B ERDC-RDE-ITL-MS CIV" <christopher.b.borch...@erdc.dren.mil> wrote:


Thanks Howard. Here is what I got.


batch35:/p/work/borchert> mpirun -n 1 -d ./a.out
[batch35:62735] procdir: /p/work/borchert/ompi.batch35.34110/pid.62735/0/0
[batch35:62735] jobdir: /p/work/borchert/ompi.batch35.34110/pid.62735/0
[batch35:62735] top: /p/work/borchert/ompi.batch35.34110/pid.62735
[batch35:62735] top: /p/work/borchert/ompi.batch35.34110
[batch35:62735] tmp: /p/work/borchert
[batch35:62735] sess_dir_cleanup: job session dir does not exist
[batch35:62735] sess_dir_cleanup: top session dir does not exist
[batch35:62735] procdir: /p/work/borchert/ompi.batch35.34110/pid.62735/0/0
[batch35:62735] jobdir: /p/work/borchert/ompi.batch35.34110/pid.62735/0
[batch35:62735] top: /p/work/borchert/ompi.batch35.34110/pid.62735
[batch35:62735] top: /p/work/borchert/ompi.batch35.34110
[batch35:62735] tmp: /p/work/borchert
[batch35:62735] mca: base: components_register: registering framework ras components
[batch35:62735] mca: base: components_register: found loaded component simulator
[batch35:62735] mca: base: components_register: component simulator register function successful
[batch35:62735] mca: base: components_register: found loaded component slurm
[batch35:62735] mca: base: components_register: component slurm register function successful
[batch35:62735] mca: base: components_register: found loaded component tm
[batch35:62735] mca: base: components_register: component tm register function successful
[batch35:62735] mca: base: components_register: found loaded component alps
[batch35:62735] mca: base: components_register: component alps register function successful
[batch35:62735] mca: base: components_open: opening ras components
[batch35:62735] mca: base: components_open: found loaded component simulator
[batch35:62735] mca: base: components_open: found loaded component slurm
[batch35:62735] mca: base: components_open: component slurm open function successful
[batch35:62735] mca: base: components_open: found loaded component tm
[batch35:62735] mca: base: components_open: component tm open function successful
[batch35:62735] mca: base: components_open: found loaded component alps
[batch35:62735] mca: base: components_open: component alps open function successful
[batch35:62735] mca:base:select: Auto-selecting ras components
[batch35:62735] mca:base:select:( ras) Querying component [simulator]
[batch35:62735] mca:base:select:( ras) Querying component [slurm]
[batch35:62735] mca:base:select:( ras) Querying component [tm]
[batch35:62735] mca:base:select:( ras) Query of component [tm] set priority to 100
[batch35:62735] mca:base:select:( ras) Querying component [alps]
[batch35:62735] ras:alps: available for selection
[batch35:62735] mca:base:select:( ras) Query of component [alps] set priority to 75
[batch35:62735] mca:base:select:( ras) Selected component [tm]
[batch35:62735] mca: base: close: unloading component simulator
[batch35:62735] mca: base: close: component slurm closed
[batch35:62735] mca: base: close: unloading component slurm
[batch35:62735] mca: base: close: unloading component alps
[batch35:62735] [[34694,0],0] ras:base:allocate
[batch35:62735] [[34694,0],0] ras:tm:allocate:discover: got hostname nid01243
[batch35:62735] [[34694,0],0] ras:tm:allocate:discover: not found -- added to list
[batch35:62735] [[34694,0],0] ras:tm:allocate:discover: got hostname nid01244
[batch35:62735] [[34694,0],0] ras:tm:allocate:discover: not found -- added to list
[batch35:62735] [[34694,0],0] ras:base:node_insert inserting 2 nodes
[batch35:62735] [[34694,0],0] ras:base:node_insert node nid01243 slots 1
[batch35:62735] [[34694,0],0] ras:base:node_insert node nid01244 slots 1


====================== ALLOCATED NODES ======================
nid01243: flags=0x10 slots=1 max_slots=0 slots_inuse=0 state=UP
nid01244: flags=0x10 slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================
[batch35:62735] plm:alps: final top-level argv:
[batch35:62735] plm:alps: aprun -n 2 -N 1 -cc none -e PMI_NO_PREINITIALIZE=1 -e PMI_NO_FORK=1 -e OMPI_NO_USE_CRAY_PMI=1 -L nid01243,nid01244 orted -mca orte_debug 1 -mca ess_base_jobid 2273705984 -mca ess_base_vpid 1 -mca ess_base_num_procs 3 -mca orte_node_regex batch[2:35],nid[5:1243-1244]@0(3) -mca orte_hnp_uri 2273705984.0;tcp://10.128.8.181:56687
aprun: -L node_list contains an invalid entry


Usage: aprun [global_options] [command_options] cmd1
[: [command_options] cmd2 [: ...] ]
[--help] [--version]


--help Print this help information and exit
--version Print version information
: Separate binaries for MPMD mode
(Multiple Program, Multiple Data)


Global Options:
-b, --bypass-app-transfer
Bypass application transfer to compute node
-B, --batch-args
Get values from Batch reservation for -n, -N, -d, and -m
-C, --reconnect
Reconnect fanout control tree around failed nodes
-D, --debug level
Debug level bitmask (0-7)
-e, --environment-override env
Set an environment variable on the compute nodes
Must use format VARNAME=value
Set multiple env variables using multiple -e args
-P, --pipes pipes
Write[,read] pipes (not applicable for general use)
-p, --protection-domain pdi
Protection domain identifier
-q, --quiet
Quiet mode; suppress aprun non-fatal messages
-R, --relaunch max_shrink
Relaunch application; max_shrink is zero or more maximum
PEs to shrink for a relaunch
-T, --sync-output
Use synchronous TTY
-t, --cpu-time-limit sec
Per PE CPU time limit in seconds (default unlimited)
--wdir wdir
Application working directory (default current directory)
-Z, --zone-sort-secs secs
Perform periodic memory zone sort every secs seconds
-z, --zone-sort
Perform memory zone sort before application launch


Command Options:
-a, --architecture arch
Architecture type (only XT currently supported)
--cc, --cpu-binding cpu_list
CPU binding list or keyword
([cpu#[,cpu# | cpu1-cpu2] | x]...] | keyword)
--cp, --cpu-binding-file file
CPU binding placement filename
-d, --cpus-per-pe depth
Number of CPUs allocated per PE (number of threads)
-E, --exclude-node-list node_list
List of nodes to exclude from placement
--exclude-node-list-file node_list_file
File with a list of nodes to exclude from placement
-F, --access-mode flag
Exclusive or share node resources flag
-j, --cpus-per-cu CPUs
CPUs to use per Compute Unit (CU)
-L, --node-list node_list
Manual placement list (node[,node | node1-node2]...)
-l, --node-list-file node_list_file
File with manual placement list
-m, --memory-per-pe size
Per PE memory limit in megabytes
(default node memory/number of processors)
K|M|G suffix supported (16 == 16M == 16 megabytes)
Add an 'h' suffix to request per PE huge page memory
Add an 's' to the 'h' suffix to make the per PE huge page
memory size strict (required)
--mpmd-env env
Set an environment variable on the compute nodes
for a specific MPMD command
Must use format VARNAME=value
Set multiple env variables using multiple --mpmd-env args
-N, --pes-per-node pes
PEs per node
-n, --pes width
Number of PEs requested
--p-governor governor_name
Specify application performance governor
--p-state pstate
Specify application p-state in kHz
-r, --specialized-cpus CPUs
Restrict this many CPUs per node to specialization
-S, --pes-per-numa-node pes
PEs per NUMA node
--ss, --strict-memory-containment
Strict memory containment per NUMA node
[batch35:62735] [[34694,0],0]:errmgr_default_hnp.c(212) updating exit status to 1
--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------
[batch35:62735] [[34694,0],0] orted:comm:process_commands() Processing Command: ORTE_DAEMON_HALT_VM_CMD
[batch35:62735] Job UNKNOWN has launched
[batch35:62735] [[34694,0],0] Releasing job data for [34694,1]
[batch35:62735] [[34694,0],0] ras:tm:finalize: success (nothing to do)
[batch35:62735] mca: base: close: unloading component tm
[batch35:62735] sess_dir_finalize: proc session dir does not exist
[batch35:62735] sess_dir_finalize: job session dir does not exist
[batch35:62735] sess_dir_finalize: jobfam session dir does not exist
[batch35:62735] sess_dir_finalize: jobfam session dir does not exist
[batch35:62735] sess_dir_finalize: top session dir does not exist
[batch35:62735] sess_dir_cleanup: job session dir does not exist
[batch35:62735] sess_dir_cleanup: top session dir does not exist
[batch35:62735] [[34694,0],0] Releasing job data for [34694,0]
[batch35:62735] sess_dir_cleanup: job session dir does not exist
[batch35:62735] sess_dir_cleanup: top session dir does not exist
exiting with status 1


Chris


-----Original Message-----
From: Pritchard Jr., Howard <howa...@lanl.gov>
Sent: Wednesday, July 10, 2024 12:40 PM
To: Borchert, Christopher B ERDC-RDE-ITL-MS CIV <christopher.b.borch...@erdc.dren.mil>; Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [EXTERNAL] [OMPI users] Invalid -L flag added to aprun


Hi Chris,


Sorry for the delay. I wanted to first double check with my own build of a 
4.1.x branch.


Could you try again with two things:

First, if you didn't configure Open MPI with the --enable-debug option, could you do that and rebuild?
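
For reference, a minimal sketch of such a rebuild, keeping whatever configure options you used for your original build and only adding --enable-debug (the paths and -j value here are illustrative):

cd /path/to/openmpi-4.1.x
./configure --enable-debug --prefix=$HOME/sw/openmpi-4.1.x-debug [...your original configure options...]
make -j 8 && make install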


Then, try setting these environment variables and rerunning your test to see if 
we learn more:


export OMPI_MCA_ras_base_verbose=100
export OMPI_MCA_ras_base_launch_orted_on_hn=1
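
A minimal sketch of the rerun with both variables set, assuming a simple MPI test program ./a.out (the binary name and log filename are just placeholders):

export OMPI_MCA_ras_base_verbose=100
export OMPI_MCA_ras_base_launch_orted_on_hn=1
mpirun -n 1 -d ./a.out 2>&1 | tee ras_debug.log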




On 7/2/24, 8:09 AM, "Borchert, Christopher B ERDC-RDE-ITL-MS CIV" <christopher.b.borch...@erdc.dren.mil> wrote:




Thanks Howard. I don't find that the env var changes the behavior. I'm using PBS Pro.




Chris




-----Original Message-----
From: Pritchard Jr., Howard <howa...@lanl.gov>
Sent: Monday, July 1, 2024 3:43 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Borchert, Christopher B ERDC-RDE-ITL-MS CIV <christopher.b.borch...@erdc.dren.mil>
Subject: Re: [EXTERNAL] [OMPI users] Invalid -L flag added to aprun




Hi Christoph,




First, a big caveat and disclaimer: I'm not sure any Open MPI developers still have access to Cray XC systems, so all I can do is make suggestions.




What's probably happening is that ORTE thinks it is going to fork off the application processes on the head node itself. That isn't going to work for the XC Aries network. I'm not sure what changed in ORTE between 4.0.x and 4.1.x to cause this difference, but could you set the following ORTE MCA parameter and see if the problem goes away?




export ORTE_MCA_ras_base_launch_orted_on_hn=1
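
If it's easier, the same parameter can also be set on the mpirun command line rather than through the environment; a minimal sketch, assuming a simple test binary ./a.out (illustrative only, not a verified fix):

mpirun --mca ras_base_launch_orted_on_hn 1 -n 1 ./a.out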




What batch scheduler is your system using?




Howard




On 7/1/24, 2:11 PM, "users on behalf of Borchert, Christopher B ERDC-RDE-ITL-MS CIV via users" <users-boun...@lists.open-mpi.org> on behalf of users@lists.open-mpi.org wrote:

On a Cray XC (which requires the aprun launcher to get from the batch node to the compute nodes), 4.0.5 works but 4.1.1 and 4.1.6 do not (even on a single node). The newer versions throw this:
--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before communicating 
back to mpirun. This could be caused by a number of factors, including an 
inability to create a connection back to mpirun due to a lack of common network 
interfaces and/or no route found between them. Please check network 
connectivity (including firewalls and network routing requirements).
--------------------------------------------------------------------------

With all three versions, adding -d to mpirun shows that aprun is being called. However, the two newer versions add an invalid -L flag; it doesn't matter whether -L is followed by a batch node name or a compute node name.

4.0.5:
[batch7:78642] plm:alps: aprun -n 1 -N 1 -cc none -e PMI_NO_PREINITIALIZE=1 -e PMI_NO_FORK=1 -e OMPI_NO_USE_CRAY_PMI=1 orted -mca orte_debug 1 -mca ess_base_jobid 3787849728 -mca ess_base_vpid 1 -mca ess_base_num_procs 2 -mca orte_node_regex batch[1:7],[3:132]@0(2) -mca orte_hnp_uri 3787849728.0;tcp://10.128.13.251:34149

4.1.1:
[batch7:75094] plm:alps: aprun -n 1 -N 1 -cc none -e PMI_NO_PREINITIALIZE=1 -e PMI_NO_FORK=1 -e OMPI_NO_USE_CRAY_PMI=1 -L batch7 orted -mca orte_debug 1 -mca ess_base_jobid 4154589184 -mca ess_base_vpid 1 -mca ess_base_num_procs 2 -mca orte_node_regex mpirun,batch[1:7]@0(2) -mca orte_hnp_uri 4154589184.0;tcp://10.128.13.251:56589
aprun: -L node_list contains an invalid entry

4.1.6:
[batch20:43065] plm:alps: aprun -n 1 -N 1 -cc none -e PMI_NO_PREINITIALIZE=1 -e PMI_NO_FORK=1 -e OMPI_NO_USE_CRAY_PMI=1 -L nid00140 orted -mca orte_debug 1 -mca ess_base_jobid 115474432 -mca ess_base_vpid 1 -mca ess_base_num_procs 2 -mca orte_node_regex batch[2:20],nid[5:140]@0(2) -mca orte_hnp_uri 115474432.0;tcp://10.128.1.39:51455
aprun: -L node_list contains an invalid entry

How can I get this -L argument removed?

Thanks, Chris