Hi
I made the change to fsl_sub to make it work for TORQUE
and also for SLURM
(I am using SLURM now, but it should still work for TORQUE).
I hope it helps
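In case it is useful, this is roughly how I drive the modified script. The queue name below is only an example; when these variables are unset the script falls back to METHOD=TORQUE and queue "long":

```shell
# Illustrative values only - export these before any FSL tool calls fsl_sub.
export FSLQUEUE=normal   # cluster queue name (script default is "long")
export METHOD=SLURM      # scheduler backend: SGE, TORQUE, SLURM or NONE
echo "queue=$FSLQUEUE method=$METHOD"
```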
Cheers,
Romain
On 17/03/15 09:00, Matthew George Liptrot wrote:
Hi,
Does anyone have an up-to-date version of fsl_sub modified for the
MOAB/TORQUE/PBS job management interface? (I found a very old one from
Matt Glasser online but was hoping someone might have updated it since).
Thanks in advance,
M@
--
*Matthew George Liptrot*
*Department of Computer Science*
*University of Copenhagen*
&
*Section for Cognitive Systems*
*Department of Applied Mathematics and Computer Science*
*Technical University of Denmark*
http://about.me/matthewliptrot
_______________________________________________
HCP-Users mailing list
[email protected]
http://lists.humanconnectome.org/mailman/listinfo/hcp-users
#!/bin/sh
# Copyright (C) 2007 University of Oxford
# Authors: Dave Flitney & Stephen Smith
# Part of FSL - FMRIB's Software Library
# http://www.fmrib.ox.ac.uk/fsl
# [email protected]
#
# Developed at FMRIB (Oxford Centre for Functional Magnetic Resonance
# Imaging of the Brain), Department of Clinical Neurology, Oxford
# University, Oxford, UK
#
#
# LICENCE
#
# FMRIB Software Library, Release 4.0 (c) 2007, The University of
# Oxford (the "Software")
#
# The Software remains the property of the University of Oxford ("the
# University").
#
# The Software is distributed "AS IS" under this Licence solely for
# non-commercial use in the hope that it will be useful, but in order
# that the University as a charitable foundation protects its assets for
# the benefit of its educational and research purposes, the University
# makes clear that no condition is made or to be implied, nor is any
# warranty given or to be implied, as to the accuracy of the Software,
# or that it will be suitable for any particular purpose or for use
# under any specific conditions. Furthermore, the University disclaims
# all responsibility for the use which is made of the Software. It
# further disclaims any liability for the outcomes arising from using
# the Software.
#
# The Licensee agrees to indemnify the University and hold the
# University harmless from and against any and all claims, damages and
# liabilities asserted by third parties (including claims for
# negligence) which arise directly or indirectly from the use of the
# Software or the sale of any products based on the Software.
#
# No part of the Software may be reproduced, modified, transmitted or
# transferred in any form or by any means, electronic or mechanical,
# without the express permission of the University. The permission of
# the University is not required if the said reproduction, modification,
# transmission or transference is done without financial return, the
# conditions of this Licence are imposed upon the receiver of the
# product, and all original and amended source code is included in any
# transmitted product. You may be held legally responsible for any
# copyright infringement that is caused or encouraged by your failure to
# abide by these terms and conditions.
#
# You are not permitted under this Licence to use this Software
# commercially. Use for which any financial return is received shall be
# defined as commercial use, and includes (1) integration of all or part
# of the source code or the Software into a product for sale or license
# by or on behalf of Licensee to third parties or (2) use of the
# Software or any derivative of it for research with the final aim of
# developing software products for sale or license to a third party or
# (3) use of the Software or any derivative of it for research with the
# final aim of developing non-software products for sale or license to a
# third party, or (4) use of the Software to provide any service to an
# external organisation for which payment is received. If you are
# interested in using the Software commercially, please contact Isis
# Innovation Limited ("Isis"), the technology transfer company of the
# University, to negotiate a licence. Contact details are:
# [email protected] quoting reference DE/1112.
###########################################################################
# Edit this file in order to setup FSL to use your local compute
# cluster.
###########################################################################
###########################################################################
# The following section determines what to do when fsl_sub is called
# by an FSL program. If it finds a local cluster it will pass the
# commands on to the cluster; otherwise it will run the commands
# itself. There are four values for the METHOD variable: "SGE",
# "TORQUE", "SLURM" and "NONE". You should set up the tests to look for whether the calling
# computer can see your cluster setup scripts, and run them (if that's
# what you want, i.e. if you haven't already run them in the user's
# login scripts). Note that these tests look for the environment
# variable SGE_ROOT, which a user can unset if they don't want the
# cluster to be used.
###########################################################################
#rrr I replaced the hard-coded generic queue name with a variable
QUEUE_NAME=$FSLQUEUE
if [ "x$FSLQUEUE" = "x" ];
then
QUEUE_NAME=long
#echo default queue is "long"; override it with the FSLQUEUE variable
fi
if [ "x$METHOD" = "x" ];
then
#METHOD="SLURM"
METHOD="TORQUE"
#METHOD="NONE"
fi
#default walltime in seconds (converted to minutes for SLURM below)
WALLTIME="86400"
if [ $METHOD = SGE ] ; then
if [ "x$SGE_ROOT" = "x" ] ; then
if [ -f /usr/local/share/sge/default/common/settings.sh ] ; then
. /usr/local/share/sge/default/common/settings.sh
elif [ -f /usr/local/sge/default/common/settings.sh ] ; then
. /usr/local/sge/default/common/settings.sh
else
METHOD=NONE
fi
fi
elif [ $METHOD = "TORQUE" ] ; then
if [ "x$FSL_SUB" = "x" ] ; then
WALLTIME="86400" # Default time (seconds) jobs are allowed to run; equivalent to long.q in the default FMRIB SGE setup
MailOpts="ab"
else
echo "running locally" >&2
METHOD=NONE
fi
fi
###########################################################################
# The following auto-decides what cluster queue to use. The calling
# FSL program will probably use the -T option when calling fsl_sub,
# which tells fsl_sub how long (in minutes) the process is expected to
# take (in the case of the -t option, how long each line in the
# supplied file is expected to take). You need to set up the following
# list to map ranges of timings onto your cluster queues - it doesn't
# matter how many you set up; that's up to you.
###########################################################################
map_qname ()
{
if [ $1 -le 20 ] ; then
#queue=veryshort.q
queue=$QUEUE_NAME #There are no separate queues currently
elif [ $1 -le 120 ] ; then
#queue=short.q
queue=$QUEUE_NAME
elif [ $1 -le 1440 ] ; then
#queue=long.q
queue=$QUEUE_NAME
else
#queue=verylong.q
queue=$QUEUE_NAME
fi
#echo "Estimated time was $1 mins: queue name is $queue"
}
###########################################################################
# Don't change the following (but keep scrolling down!)
###########################################################################
POSIXLY_CORRECT=1
export POSIXLY_CORRECT
command=`basename $0`
usage ()
{
cat <<EOF
$command V1.0beta - wrapper for job control systems such as SGE, TORQUE and SLURM
Usage: $command [options] <command>
$command gzip *.img *.hdr
$command -q short.q gzip *.img *.hdr
$command -a darwin regscript rawdata outputdir ...
-T <minutes> Estimated job length in minutes, used to auto-set queue name
-q <queuename> Possible values for <queuename> are "veryshort.q", "short.q",
   "long.q" and "verylong.q". See below for details. Default is "long.q".
-a <arch-name> Architecture [e.g., darwin or lx24-amd64]
-p <job-priority> Lower priority [0:-1024] default = 0
-M <email-address> Who to email, default = `whoami`@fmrib.ox.ac.uk
-j <jid> Place a hold on this task until job jid has completed
-t <filename> Specify a task file of commands to execute in parallel
-N <jobname> Specify jobname as it will appear on queue
-n <nCPUs> Number of CPUs that job will use
-l <logdirname> Where to output logfiles
-m <mailoptions> Change the SGE mail options, see qsub for details
-F Use flags embedded in scripts to set SGE queuing options
-v Verbose mode.
Queues:
There are three batch queues configured on the cluster, each with defined CPU
time limits.
veryshort.q:This queue is for jobs which last under 30mins.
short.q: This queue is for jobs which last up to 2h.
long.q: This queue is for jobs which last less than 24h.
verylong.q: This queue is for jobs which will take longer than 24h CPU time.
There is one slot per node, and jobs on this queue have a nice value
of 5. If jobs enter the short.q queue then items running on this
queue are suspended and resumed on completion of the short.q task.
EOF
exit 1
}
nargs=$#
if [ $nargs -eq 0 ] ; then
usage
fi
set -- `getopt T:q:a:p:M:j:t:N:n:Fvm:l:r $*`
result=$?
if [ $result != 0 ] ; then
echo "What? Your arguments make no sense!"
fi
if [ $nargs -eq 0 ] || [ $result != 0 ] ; then
usage
fi
###########################################################################
# The following sets up the default queue name, which you may want to
# change. It also sets up the basic emailing control.
###########################################################################
#queue=long.q
queue=$QUEUE_NAME
#mailto=`whoami`@fmrib.ox.ac.uk
mailto=`whoami`@uoregon.edu
MailOpts="n"
TORQUEDEPENDANCYMODE="w" # one of "w" (pass dependency via -W depend=) or "l" (via -l depend=)
###########################################################################
# In the following, you might want to change the behaviour of some
# flags so that they prepare the right arguments for the actual
# cluster queue submission program, in our case "qsub".
#
# -a is the cluster submission flag for controlling the required
# hardware architecture (normally not set by the calling program)
#
# -p sets the priority of the job - ignore this if your cluster
# environment doesn't have priority control in this way.
#
# -j tells the cluster not to start this job until cluster job ID $jid
# has completed. You will need this feature.
#
# -t will pass on to the cluster software the name of a text file
# containing a set of commands to run in parallel; one command per
# line.
#
# -N option determines what the command will be called when you list
# running processes.
#
# -l tells the cluster what to call the standard output and standard
# error logfiles for the submitted program.
###########################################################################
if [ -z "$FSLSUBVERBOSE" ] ; then
verbose=0
else
verbose=$FSLSUBVERBOSE;
echo "METHOD=$METHOD : args=$@" >&2
fi
# Can remove after full test
#verbose=1
scriptmode=0
while [ $1 != -- ] ; do
case $1 in
-T)
map_qname $2
WALLTIME=`echo "$2 * 60" | bc`
shift;;
-q)
queue=$2
if [ $queue = "veryshort.q" ] ; then
WALLTIME=1800
queue=$QUEUE_NAME
elif [ $queue = "short.q" ] ; then
WALLTIME=7200
queue=$QUEUE_NAME
elif [ $queue = "long.q" ] ; then
WALLTIME=86400
queue=$QUEUE_NAME
elif [ $queue = "verylong.q" ] ; then
WALLTIME=604800
queue=$QUEUE_NAME
fi
shift;;
-a)
acceptable_arch=no
available_archs=`qhost | tail -n +4 | awk '{print $2}' | sort | uniq`
for a in $available_archs; do
if [ $2 = $a ] ; then
acceptable_arch=yes
fi
done
if [ $acceptable_arch = yes ]; then
sge_arch="-l arch=$2"
else
echo "Sorry arch of $2 is not supported on this SGE configuration!"
echo "Should be one of:" $available_archs
exit 127
fi
shift;;
-p)
sge_priority="-p $2"
shift;;
-M)
mailto=$2
if [ $METHOD = "TORQUE" ] ; then
mailto=`whoami`@pastropdemail.icm-institute.org
fi
shift;;
-j)
jid="$2"
if [[ $jid =~ ^.*[0-9]+.*$ ]] ; then
## 20120706cdt delimiter for my torque is a colon
jid2=$(echo $jid | sed -e 's/,/:/g')
if [ $TORQUEDEPENDANCYMODE = "l" ] ; then
## 20120706cdt apparently depend= is not a -l option
## for some torques
#torque_hold=",depend=afterok:${jid2}.calcul-icm"
torque_hold=",depend=afterok:${jid2}"
slurm_hold=" --depend=afterok:${jid2}"
elif [ $TORQUEDEPENDANCYMODE = "w" ] ; then
#RRR the dependency syntax after an array submission is different
if `echo ${jid2} | grep "\[" 1>/dev/null 2>&1`
then
torque_hold=" -W depend=afterokarray:${jid2}"
slurm_hold=" --depend=afterok:${jid2}"
else
torque_hold=" -W depend=afterok:${jid2}"
slurm_hold=" --depend=afterok:${jid2}"
fi
fi
fi
sge_hold="-hold_jid $jid"
shift;;
-t)
taskfile=$2
tasks=`wc -l $taskfile | awk '{print $1}'`
sge_tasks="-t 1-$tasks"
slurm_tasks="--array=1-$tasks"
shift;;
-N)
JobName=$2;
shift;;
-n)
if [ $METHOD = "SGE" ] ; then
NumCPUs="-pe make $2"
elif [ $METHOD = "TORQUE" ] ; then
NumCPUs=",nodes=1:ppn=$2"
elif [ $METHOD = "SLURM" ] ; then
NumCPUs=" -n $2"
fi
shift;;
-m)
MailOpts=$2;
if [[ $METHOD = "TORQUE" && $MailOpts = "as" ]] ; then
MailOpts=ab
fi
shift;;
-l)
LogOpts="-o $2/log.%A_%a -e $2/err.%A_%a";
LogDir="${2}/";
mkdir -p $2;
shift;;
-F)
scriptmode=1;
;;
-v)
verbose=1
;;
esac
shift # next flag
done
shift
###########################################################################
# Don't change the following (but keep scrolling down!)
###########################################################################
if [ "x$JobName" = x ] ; then
if [ "x$taskfile" != x ] ; then
JobName=`basename $taskfile`
else
JobName=`basename $1`
fi
fi
if [ "x$tasks" != x ] && [ ! -f "$taskfile" ] ; then
echo $taskfile: invalid input!
echo Should be a text file listing all the commands to run!
exit 1
fi
if [ "x$tasks" != "x" ] && [ "x$@" != "x" ] ; then
echo $@
echo Spurious input after parsing command line!
exit 1
fi
case $METHOD in
###########################################################################
# The following is the main call to the cluster, using the "qsub" SGE
# program. If $tasks has not been set then qsub is running a single
# command, otherwise qsub is processing a text file of parallel
# commands.
###########################################################################
SGE)
if [ "x$tasks" = "x" ] ; then
if [ $scriptmode -ne 1 ] ; then
sge_command="qsub -V -cwd -shell n -b y -r y -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch $sge_hold $NumCPUs"
else
sge_command="qsub $LogOpts $sge_arch $sge_hold $NumCPUs"
fi
if [ $verbose -eq 1 ] ; then
echo sge_command: $sge_command >&2
echo executing: $@ >&2
fi
exec $sge_command $@ | awk '{print $3}'
else
sge_command="qsub -V -cwd -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch $sge_hold $sge_tasks $NumCPUs"
if [ $verbose -eq 1 ] ; then
echo sge_command: $sge_command >&2
echo control file: $taskfile >&2
fi
exec $sge_command <<EOF | awk '{print $3}' | awk -F. '{print $1}'
#!/bin/sh
#$ -S /bin/sh
command=\`sed -n -e "\${SGE_TASK_ID}p" $taskfile\`
exec /bin/sh -c "\$command"
EOF
fi
;;
###########################################################################
# The following is the main call to the cluster, using the "qsub" TORQUE
# program. If $tasks has not been set then qsub is running a single
# command, otherwise qsub is processing a text file of parallel
# commands. This script is compatible with MOAB 5.3.7.s15113 and higher.
###########################################################################
TORQUE)
# SGE takes args after the script; TORQUE does not. Tempscript stores the
# command and arguments to be used.
tempscript="$HOME/tempcmd""$RANDOM"
if [ "x$tasks" = "x" ] ; then
echo $@ > $tempscript
if [ $scriptmode -ne 1 ] ; then
#RRR torque_command="qsub -V -d . -b y -r y -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold"
#torque_command="qsub -V -b 10 -r y -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold"
torque_command="qsub -V -q $queue -N $JobName $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold"
else
torque_command="qsub $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold"
fi
if [ $verbose -eq 1 ] ; then
echo torque_command: $torque_command >&2
echo "tempscript: `cat $tempscript`" >&2
fi
exec $torque_command $tempscript | awk '{print $1}' | awk -F. '{print $1}'
rm $tempscript
sleep 2
else
echo "command=\`cat "$taskfile" | head -\$PBS_ARRAYID | tail -1\` ; exec \$command" > $tempscript
#RRR torque_command="qsub -V -d . -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold $sge_tasks"
#torque_command="qsub -V -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold $sge_tasks"
torque_command="qsub -V -q $queue -N $JobName $LogOpts $sge_arch -l walltime=$WALLTIME$NumCPUs$torque_hold $sge_tasks"
if [ $verbose -eq 1 ] ; then
echo torque_command: $torque_command >&2
echo control file: $taskfile >&2
echo sss $sge_tasks >&2
echo "tempscript: `cat $tempscript`" >&2
#echo RRRtempscript: $tempscript >&2
fi
exec $torque_command $tempscript | awk '{print $1}' | awk -F. '{print $1}'
#echo $torque_command $tempscript #| awk '{print $1}' | awk -F. '{print $1}'
rm $tempscript
sleep 2
fi
;;
###########################################################################
# The following is the main call to the cluster, using the "sbatch" SLURM
# program. If $tasks has not been set then sbatch submits a single
# command, otherwise sbatch processes a text file of parallel
# commands.
###########################################################################
SLURM)
#sbatch -t expects walltime in minutes, so convert from seconds
WALLTIME=`echo "$WALLTIME/60"|bc`
# SGE takes args after the script; SLURM does not. Tempscript stores the
# command and arguments to be used.
tempscript="${LogDir:-$HOME/}tempcmd$RANDOM" # fall back to $HOME if no -l logdir was given
if [ "x$tasks" = "x" ] ; then
echo '#!/bin/bash' > $tempscript
echo $@ >> $tempscript
if [ $scriptmode -ne 1 ] ; then
slurm_command="sbatch --export=all --mem=30000 -p $queue --job-name=$JobName $LogOpts $sge_arch -t $WALLTIME $NumCPUs $slurm_hold"
else
slurm_command="sbatch $LogOpts $sge_arch -t $WALLTIME $NumCPUs $slurm_hold"
fi
if [ $verbose -eq 1 ] ; then
echo "slurm_command 1 : $slurm_command" >&2
#echo "tempscript: `cat $tempscript`" >&2
fi
exec $slurm_command $tempscript | awk '{print $4}'
#echo "dollar 1 $1">&2
#rm $tempscript
sleep 2
else
echo '#!/bin/bash' > $tempscript
echo "command=\`cat "$taskfile" | head -\$SLURM_ARRAY_TASK_ID | tail -1\` ; exec \$command" >> $tempscript
slurm_command="sbatch --export=all --mem=7000 -p $queue --job-name=$JobName $LogOpts $sge_arch -t $WALLTIME $NumCPUs $slurm_hold $slurm_tasks"
if [ $verbose -eq 1 ] ; then
echo "slurm_command 2 : $slurm_command" >&2
echo "control file : $taskfile" >&2
echo "sss $slurm_tasks" >&2
echo "tempscript: `cat $tempscript`" >&2
fi
exec $slurm_command $tempscript | awk '{print $4}'
#rm $tempscript
sleep 2
fi
;;
###########################################################################
# Don't change the following - this runs the commands directly if a
# cluster is not being used.
###########################################################################
NONE)
if [ "x$tasks" = "x" ] ; then
if [ $verbose -eq 1 ] ; then
echo executing: $@ >&2
fi
/bin/sh <<EOF1 > ${LogDir}${JobName}.o$$ 2> ${LogDir}${JobName}.e$$
$@
EOF1
else
if [ $verbose -eq 1 ] ; then
echo "Running commands in: $taskfile" >&2
fi
n=1
while [ $n -le $tasks ] ; do
line=`sed -n -e ''${n}'p' $taskfile`
if [ $verbose -eq 1 ] ; then
echo executing: $line >&2
fi
/bin/sh <<EOF2 > ${LogDir}${JobName}.o$$.$n 2> ${LogDir}${JobName}.e$$.$n
$line
EOF2
n=`expr $n + 1`
done
fi
echo $$
;;
esac
###########################################################################
# Done.
###########################################################################
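###########################################################################
# Example invocations (illustrative only - the queue name, file names and
# the "bet" command below are assumptions, not part of this script):
#   FSLQUEUE=normal METHOD=SLURM ./fsl_sub -T 60 -N myjob bet in.nii.gz out
#   FSLQUEUE=normal METHOD=TORQUE ./fsl_sub -t tasks.txt -l logs -N myarray
###########################################################################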