...and your version of Slurm?
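Something like this, run from inside the msub-created session, would tell us both the version and exactly which resource manager variables the shell sees (a rough sketch - adjust to whatever your site actually provides):

$ sinfo -V                                   # reports the Slurm version
$ env | grep -E '^(SLURM|PBS|MOAB)' | sort   # shows which RM variables are set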

On Feb 12, 2014, at 7:19 AM, Ralph Castain <r...@open-mpi.org> wrote:

> What is your SLURM_TASKS_PER_NODE?
>
> On Feb 12, 2014, at 6:58 AM, Adrian Reber <adr...@lisas.de> wrote:
>
>> No, the system has only a few MOAB_* variables and many SLURM_*
>> variables:
>>
>> $BASH  $IFS  $SECONDS  $SLURM_PTY_PORT
>> $BASHOPTS  $LINENO  $SHELL  $SLURM_PTY_WIN_COL
>> $BASHPID  $LINES  $SHELLOPTS  $SLURM_PTY_WIN_ROW
>> $BASH_ALIASES  $MACHTYPE  $SHLVL  $SLURM_SRUN_COMM_HOST
>> $BASH_ARGC  $MAILCHECK  $SLURMD_NODENAME  $SLURM_SRUN_COMM_PORT
>> $BASH_ARGV  $MOAB_CLASS  $SLURM_CHECKPOINT_IMAGE_DIR  $SLURM_STEPID
>> $BASH_CMDS  $MOAB_GROUP  $SLURM_CONF  $SLURM_STEP_ID
>> $BASH_COMMAND  $MOAB_JOBID  $SLURM_CPUS_ON_NODE  $SLURM_STEP_LAUNCHER_PORT
>> $BASH_LINENO  $MOAB_NODECOUNT  $SLURM_DISTRIBUTION  $SLURM_STEP_NODELIST
>> $BASH_SOURCE  $MOAB_PARTITION  $SLURM_GTIDS  $SLURM_STEP_NUM_NODES
>> $BASH_SUBSHELL  $MOAB_PROCCOUNT  $SLURM_JOBID  $SLURM_STEP_NUM_TASKS
>> $BASH_VERSINFO  $MOAB_SUBMITDIR  $SLURM_JOB_CPUS_PER_NODE  $SLURM_STEP_TASKS_PER_NODE
>> $BASH_VERSION  $MOAB_USER  $SLURM_JOB_ID  $SLURM_SUBMIT_DIR
>> $COLUMNS  $OPTERR  $SLURM_JOB_NODELIST  $SLURM_SUBMIT_HOST
>> $COMP_WORDBREAKS  $OPTIND  $SLURM_JOB_NUM_NODES  $SLURM_TASKS_PER_NODE
>> $DIRSTACK  $OSTYPE  $SLURM_LAUNCH_NODE_IPADDR  $SLURM_TASK_PID
>> $EUID  $PATH  $SLURM_LOCALID  $SLURM_TOPOLOGY_ADDR
>> $GROUPS  $POSIXLY_CORRECT  $SLURM_NNODES  $SLURM_TOPOLOGY_ADDR_PATTERN
>> $HISTCMD  $PPID  $SLURM_NODEID  $SRUN_DEBUG
>> $HISTFILE  $PS1  $SLURM_NODELIST  $TERM
>> $HISTFILESIZE  $PS2  $SLURM_NPROCS  $TMPDIR
>> $HISTSIZE  $PS4  $SLURM_NTASKS  $UID
>> $HOSTNAME  $PWD  $SLURM_PRIO_PROCESS  $_
>> $HOSTTYPE  $RANDOM  $SLURM_PROCID
>>
>> On Wed, Feb 12, 2014 at 06:12:45AM -0800, Ralph Castain wrote:
>>> Seems rather odd - since this is managed by Moab, you shouldn't be seeing
>>> SLURM envars at all. What you should see are PBS_* envars, including a
>>> PBS_NODEFILE that actually contains the allocation.
>>>
>>> On Feb 12, 2014, at 4:42 AM, Adrian Reber <adr...@lisas.de> wrote:
>>>
>>>> I tried the nightly snapshot (openmpi-1.7.5a1r30692.tar.gz) on a system
>>>> with slurm and moab. I requested an interactive session using:
>>>>
>>>> msub -I -l nodes=3:ppn=8
>>>>
>>>> and started a simple test case which fails:
>>>>
>>>> $ mpirun -np 2 ./mpi-test 1
>>>> --------------------------------------------------------------------------
>>>> There are not enough slots available in the system to satisfy the 2 slots
>>>> that were requested by the application:
>>>>   ./mpi-test
>>>>
>>>> Either request fewer slots for your application, or make more slots
>>>> available for use.
>>>> --------------------------------------------------------------------------
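
That message usually means mpirun's mapper only found a single slot in the allocation. As a quick cross-check you could take the Slurm/Moab allocation parsing out of the picture and hand mpirun an explicit hostfile - an untested sketch, where hosts.txt is just a scratch file and slots=8 mirrors your nodes=3:ppn=8 request:

$ scontrol show hostnames "$SLURM_JOB_NODELIST" | sed 's/$/ slots=8/' > hosts.txt   # one host per line, 8 slots each
$ mpirun -np 2 --hostfile hosts.txt ./mpi-test 1

If that runs, the allocation itself is fine and the 1.7.5 launcher is misreading it.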
>>>> srun: error: xxxx108: task 1: Exited with exit code 1
>>>> srun: Terminating job step 131823.4
>>>> srun: error: xxxx107: task 0: Exited with exit code 1
>>>> srun: Job step aborted
>>>> slurmd[xxxx108]: *** STEP 131823.4 KILLED AT 2014-02-12T13:30:32 WITH SIGNAL 9 ***
>>>>
>>>> requesting only one core works:
>>>>
>>>> $ mpirun ./mpi-test 1
>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 0 on xxxx106 out of 1: 0.000000
>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 0 on xxxx106 out of 1: 0.000000
>>>>
>>>> using openmpi-1.6.5 works with multiple cores:
>>>>
>>>> $ mpirun -np 24 ./mpi-test 2
>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 0 on xxxx106 out of 24: 0.000000
>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 12 on xxxx106 out of 24: 12.000000
>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 11 on xxxx108 out of 24: 11.000000
>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 18 on xxxx106 out of 24: 18.000000
>>>>
>>>> $ echo $SLURM_JOB_CPUS_PER_NODE
>>>> 8(x3)
>>>>
>>>> I have never used Slurm before, so this could also be a user error on my
>>>> side. But since 1.6.5 works it seems something has changed, and I wanted
>>>> to let you know in case it was not intentional.
>>>>
>>>> Adrian
>>
>> Adrian
>>
>> --
>> Adrian Reber <adr...@lisas.de>            http://lisas.de/~adrian/
>> "Let us all bask in television's warm glowing warming glow." -- Homer Simpson
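
One more data point that would help separate Moab from Slurm: if you are allowed to allocate directly through Slurm (assuming salloc is available to users on that machine), something along these lines in a fresh session

$ salloc -N 3 --ntasks-per-node=8    # same shape as the msub request
$ mpirun -np 24 ./mpi-test 2

would show whether 1.7.5 behaves correctly when the allocation comes straight from Slurm. If that works while the msub -I session does not, the problem is almost certainly in how we pick up the environment that Moab hands us.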