Actually, you can easily use arrays:

arr=( /pipeline/Runs/ProjectFolders/Project_Michael-He/Sample_* )

sdir=${arr[$SLURM_ARRAY_TASK_ID]}    # pick this array task's sample directory
samp=$(basename "$sdir")
fq=$sdir/*gz
cmd="python run_sgRNA.py $fq $samp"
srun -n 1 $cmd
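
For completeness, a minimal sketch of the whole array script, assuming twelve Sample_* directories (array indices 0-11) and four threads per sample; the --array range and the --cpus-per-task value are placeholders to adjust for your data:

#!/bin/bash

#SBATCH --array=0-11             # one array task per Sample_* directory (adjust to the real count)
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4        # CPUs (threads) each sample gets; assumed value
#SBATCH --mem-per-cpu=1G
#SBATCH --time=0-00:30:00
#SBATCH --output=my.stdout.%a    # %a adds the array index so tasks don't overwrite each other's output
#SBATCH --job-name="just_a_test"

arr=( /pipeline/Runs/ProjectFolders/Project_Michael-He/Sample_* )

sdir=${arr[$SLURM_ARRAY_TASK_ID]}      # this task's sample directory
samp=$(basename "$sdir")
fq=$sdir/*gz                           # glob expands when used unquoted below
srun -n 1 -c "$SLURM_CPUS_PER_TASK" python run_sgRNA.py $fq "$samp"

Each array task is its own allocation, so there is no & / wait juggling, and the -c on srun hands the requested CPUs to run_sgRNA.py. The same idea should answer your -c question for the loop version: srun -n 1 -c 4 per step, with --cpus-per-task=4 added to the sbatch header so the allocation is big enough for all twelve steps.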


Cheers,
Gene

--
New Zealand eScience Infrastructure
Centre for eResearch
The University of Auckland
e: g.soudlen...@auckland.ac.nz
p: +64 9 3737599 ext 89834 c: +64 21 840 825 f: +64 9 373 7453
w: www.nesi.org.nz

On 3/08/16 3:33 pm, Lachlan Musicman wrote:
Sometimes we would like to run jobs in parallel without using arrays
because the files aren't well named. But the files are all in the same
folder.

We have written a small script that loops over each file, constructs the
command in question and runs it.

We only want each command to run once, but we would like to launch them
all at the same time - from the same sbatch file if possible. We are
using srun -n 1 to get this result.

The script looks like this

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks=12            # number of samples
#SBATCH --mem-per-cpu=1G
#SBATCH --time=0-00:30:00     # 30 minutes
#SBATCH --output=my.stdout
#SBATCH --job-name="just_a_test"

for sdir in /pipeline/Runs/ProjectFolders/Project_Michael-He/Sample_*; do
        samp=`basename $sdir`
        fq=$sdir/*gz
        cmd="python run_sgRNA.py $fq $samp"
        srun -n 1 $cmd &      # -n 1 to run the command once only
                              # (...how to specify more than one thread per task???)
done

wait



So my questions are:

- is the above the best way to deal with this scenario?

- is getting each srun to use multiple threads (but only run once) just
a matter of using -c in srun? e.g. to execute each of the twelve jobs once,
but with four threads each: srun -n 1 -c 4 $cmd &


Cheers
L.


------
The most dangerous phrase in the language is, "We've always done it this
way."

- Grace Hopper
