Actually, you can easily use arrays:
arr=( /pipeline/Runs/ProjectFolders/Project_Michael-He/Sample_* )
sdir=${arr[$SLURM_ARRAY_TASK_ID]}
samp=$(basename $sdir)
fq=$sdir/*gz
cmd="python run_sgRNA.py $fq $samp"
srun -n 1 $cmd &

Cheers,
Gene

--
New Zealand eScience Infrastructure
Centre for eResearch
The University of Auckland
e: g.soudlen...@auckland.ac.nz
p: +64 9 3737599 ext 89834
c: +64 21 840 825
f: +64 9 373 7453
w: www.nesi.org.nz

On 3/08/16 3:33 pm, Lachlan Musicman wrote:
Sometimes we would like to run jobs in parallel without using arrays because the files aren't well named, but they are all in the same folder. We have written a small script that loops over each file, constructs the command in question and runs it. We only want each command to run once, but we would like to launch them all at the same time - from the same sbatch file if possible. We are using srun -n 1 to get this result. The script looks like this:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=12              # number of samples
#SBATCH --mem-per-cpu=1G
#SBATCH --time=0-00:30:00        # 30 minutes
#SBATCH --output=my.stdout
#SBATCH --job-name="just_a_test"

for sdir in /pipeline/Runs/ProjectFolders/Project_Michael-He/Sample_*; do
    samp=`basename $sdir`
    fq=$sdir/*gz
    cmd="python run_sgRNA.py $fq $samp"
    srun -n 1 $cmd &   # -n 1 to run the command once only (...how to specify more than one thread per task???)
done
wait

So my questions are:
- is the above the best way to deal with this scenario?
- is getting each srun to use multiple threads (but only run once) just a matter of using -c in srun? E.g., to execute each of the twelve jobs once, but with four threads each:

srun -n 1 -c 4 $cmd &

Cheers
L.

------
The most dangerous phrase in the language is, "We've always done it this way." - Grace Hopper
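
For reference, here is a minimal sketch of a complete array-based submission along the lines Gene describes above, with per-task threading added via --cpus-per-task (the -c question). It assumes twelve Sample_* directories (hence --array=0-11); the job name, output file pattern, and the choice of four threads are purely illustrative, and whether run_sgRNA.py can actually use more than one thread depends on the script itself.

#!/bin/bash
#SBATCH --job-name=sgRNA_array
#SBATCH --array=0-11               # one array task per sample; adjust to the number of Sample_* dirs
#SBATCH --ntasks=1                 # each array task runs the command exactly once
#SBATCH --cpus-per-task=4          # threads available to each task
#SBATCH --mem-per-cpu=1G
#SBATCH --time=0-00:30:00
#SBATCH --output=sgRNA_%A_%a.out   # %A = job ID, %a = array index

# pick this task's sample directory by array index
arr=( /pipeline/Runs/ProjectFolders/Project_Michael-He/Sample_* )
sdir=${arr[$SLURM_ARRAY_TASK_ID]}
samp=$(basename "$sdir")
fq=$sdir/*gz

srun python run_sgRNA.py $fq $samp

For the loop-based script, the rough equivalent would be to request the CPUs up front (e.g. keep --ntasks=12 and add --cpus-per-task=4 in the header) and launch each step with srun -n 1 -c 4 --exclusive $cmd &, where --exclusive keeps the simultaneous steps from landing on the same CPUs; treat that as a sketch to test rather than a definitive recipe.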