Okay, so what’s happening is that we auto-detect only 4 cores on that box, and 
since you didn’t provide any further info, we set #slots = #cores. If you want 
to run more processes than that, you can either tell us a number of slots to 
use (e.g., -host mybox:32) or add --oversubscribe to the command line.
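
For example, either of these forms should let the 8-rank run from your first 
transcript go through ("mybox" and "my_app" are placeholders for your actual 
host name and executable):

```shell
# Option 1: explicitly declare 32 slots on the host, overriding the
# core-count default (the ":32" suffix sets the slot count):
mpirun -host mybox:32 -n 8 ./my_app

# Option 2: keep the detected slot count (4) but allow more ranks
# than slots to be started:
mpirun --oversubscribe -n 8 ./my_app
```

The same slot count can also be set persistently via a hostfile line like 
"mybox slots=32" instead of passing it on every command line.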


> On Apr 25, 2017, at 1:31 PM, Eric Chamberland 
> <eric.chamberl...@giref.ulaval.ca> wrote:
> 
> Ok, here it is:
> 
> ===================
> first, with -n 8:
> ===================
> 
> mpirun -mca ras_base_verbose 10 --display-allocation -n 8 echo "Hello"
> 
> [zorg:22429] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL
> [zorg:22429] plm:base:set_hnp_name: initial bias 22429 nodename hash 810220270
> [zorg:22429] plm:base:set_hnp_name: final jobfam 40249
> [zorg:22429] [[40249,0],0] plm:rsh_setup on agent ssh : rsh path NULL
> [zorg:22429] [[40249,0],0] plm:base:receive start comm
> [zorg:22429] mca: base: components_register: registering framework ras 
> components
> [zorg:22429] mca: base: components_register: found loaded component 
> loadleveler
> [zorg:22429] mca: base: components_register: component loadleveler register 
> function successful
> [zorg:22429] mca: base: components_register: found loaded component slurm
> [zorg:22429] mca: base: components_register: component slurm register 
> function successful
> [zorg:22429] mca: base: components_register: found loaded component simulator
> [zorg:22429] mca: base: components_register: component simulator register 
> function successful
> [zorg:22429] mca: base: components_open: opening ras components
> [zorg:22429] mca: base: components_open: found loaded component loadleveler
> [zorg:22429] mca: base: components_open: component loadleveler open function 
> successful
> [zorg:22429] mca: base: components_open: found loaded component slurm
> [zorg:22429] mca: base: components_open: component slurm open function 
> successful
> [zorg:22429] mca: base: components_open: found loaded component simulator
> [zorg:22429] mca:base:select: Auto-selecting ras components
> [zorg:22429] mca:base:select:(  ras) Querying component [loadleveler]
> [zorg:22429] [[40249,0],0] ras:loadleveler: NOT available for selection
> [zorg:22429] mca:base:select:(  ras) Querying component [slurm]
> [zorg:22429] mca:base:select:(  ras) Querying component [simulator]
> [zorg:22429] mca:base:select:(  ras) No component selected!
> [zorg:22429] [[40249,0],0] plm:base:setup_job
> [zorg:22429] [[40249,0],0] ras:base:allocate
> [zorg:22429] [[40249,0],0] ras:base:allocate nothing found in module - 
> proceeding to hostfile
> [zorg:22429] [[40249,0],0] ras:base:allocate parsing default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:22429] [[40249,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22429] [[40249,0],0] ras:base:allocate nothing found in hostfiles - 
> checking for rankfile
> [zorg:22429] [[40249,0],0] ras:base:allocate nothing found in rankfile - 
> inserting current node
> [zorg:22429] [[40249,0],0] ras:base:node_insert inserting 1 nodes
> [zorg:22429] [[40249,0],0] ras:base:node_insert updating HNP [zorg] info to 1 
> slots
> 
> ======================   ALLOCATED NODES   ======================
>        zorg: flags=0x01 slots=1 max_slots=0 slots_inuse=0 state=UP
> =================================================================
> [zorg:22429] [[40249,0],0] plm:base:setup_vm
> [zorg:22429] [[40249,0],0] plm:base:setup_vm creating map
> [zorg:22429] [[40249,0],0] setup:vm: working unmanaged allocation
> [zorg:22429] [[40249,0],0] using default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:22429] [[40249,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22429] [[40249,0],0] plm:base:setup_vm only HNP in allocation
> [zorg:22429] [[40249,0],0] plm:base:setting slots for node zorg by cores
> 
> ======================   ALLOCATED NODES   ======================
>        zorg: flags=0x11 slots=4 max_slots=0 slots_inuse=0 state=UP
> =================================================================
> [zorg:22429] [[40249,0],0] complete_setup on job [40249,1]
> [zorg:22429] [[40249,0],0] plm:base:launch_apps for job [40249,1]
> [zorg:22429] [[40249,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> --------------------------------------------------------------------------
> There are not enough slots available in the system to satisfy the 8 slots
> that were requested by the application:
>  echo
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --------------------------------------------------------------------------
> [zorg:22429] [[40249,0],0] plm:base:orted_cmd sending orted_exit commands
> [zorg:22429] [[40249,0],0] plm:base:receive stop comm
> 
> ===================
> second with -n 4:
> ===================
> (16:31:23) [zorg]:~> mpirun -mca ras_base_verbose 10 --display-allocation -n 
> 4 echo "Hello"
> 
> [zorg:22463] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL
> [zorg:22463] plm:base:set_hnp_name: initial bias 22463 nodename hash 810220270
> [zorg:22463] plm:base:set_hnp_name: final jobfam 40219
> [zorg:22463] [[40219,0],0] plm:rsh_setup on agent ssh : rsh path NULL
> [zorg:22463] [[40219,0],0] plm:base:receive start comm
> [zorg:22463] mca: base: components_register: registering framework ras 
> components
> [zorg:22463] mca: base: components_register: found loaded component 
> loadleveler
> [zorg:22463] mca: base: components_register: component loadleveler register 
> function successful
> [zorg:22463] mca: base: components_register: found loaded component slurm
> [zorg:22463] mca: base: components_register: component slurm register 
> function successful
> [zorg:22463] mca: base: components_register: found loaded component simulator
> [zorg:22463] mca: base: components_register: component simulator register 
> function successful
> [zorg:22463] mca: base: components_open: opening ras components
> [zorg:22463] mca: base: components_open: found loaded component loadleveler
> [zorg:22463] mca: base: components_open: component loadleveler open function 
> successful
> [zorg:22463] mca: base: components_open: found loaded component slurm
> [zorg:22463] mca: base: components_open: component slurm open function 
> successful
> [zorg:22463] mca: base: components_open: found loaded component simulator
> [zorg:22463] mca:base:select: Auto-selecting ras components
> [zorg:22463] mca:base:select:(  ras) Querying component [loadleveler]
> [zorg:22463] [[40219,0],0] ras:loadleveler: NOT available for selection
> [zorg:22463] mca:base:select:(  ras) Querying component [slurm]
> [zorg:22463] mca:base:select:(  ras) Querying component [simulator]
> [zorg:22463] mca:base:select:(  ras) No component selected!
> [zorg:22463] [[40219,0],0] plm:base:setup_job
> [zorg:22463] [[40219,0],0] ras:base:allocate
> [zorg:22463] [[40219,0],0] ras:base:allocate nothing found in module - 
> proceeding to hostfile
> [zorg:22463] [[40219,0],0] ras:base:allocate parsing default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:22463] [[40219,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22463] [[40219,0],0] ras:base:allocate nothing found in hostfiles - 
> checking for rankfile
> [zorg:22463] [[40219,0],0] ras:base:allocate nothing found in rankfile - 
> inserting current node
> [zorg:22463] [[40219,0],0] ras:base:node_insert inserting 1 nodes
> [zorg:22463] [[40219,0],0] ras:base:node_insert updating HNP [zorg] info to 1 
> slots
> 
> ======================   ALLOCATED NODES   ======================
>        zorg: flags=0x01 slots=1 max_slots=0 slots_inuse=0 state=UP
> =================================================================
> [zorg:22463] [[40219,0],0] plm:base:setup_vm
> [zorg:22463] [[40219,0],0] plm:base:setup_vm creating map
> [zorg:22463] [[40219,0],0] setup:vm: working unmanaged allocation
> [zorg:22463] [[40219,0],0] using default hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile
> [zorg:22463] [[40219,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22463] [[40219,0],0] plm:base:setup_vm only HNP in allocation
> [zorg:22463] [[40219,0],0] plm:base:setting slots for node zorg by cores
> 
> ======================   ALLOCATED NODES   ======================
>        zorg: flags=0x11 slots=4 max_slots=0 slots_inuse=0 state=UP
> =================================================================
> [zorg:22463] [[40219,0],0] complete_setup on job [40219,1]
> [zorg:22463] [[40219,0],0] plm:base:launch_apps for job [40219,1]
> [zorg:22463] [[40219,0],0] hostfile: checking hostfile 
> /opt/openmpi-3.x_debug/etc/openmpi-default-hostfile for nodes
> [zorg:22463] [[40219,0],0] plm:base:launch wiring up iof for job [40219,1]
> [zorg:22463] [[40219,0],0] plm:base:launch job [40219,1] is not a dynamic 
> spawn
> Hello
> Hello
> Hello
> Hello
> [zorg:22463] [[40219,0],0] plm:base:orted_cmd sending orted_exit commands
> [zorg:22463] [[40219,0],0] plm:base:receive stop comm
> 
> 
> Thanks!
> 
> Eric
> 
> On 25/04/17 04:00 PM, r...@open-mpi.org wrote:
>> -mca ras_base_verbose 10 --display-allocation
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
