Dear Carles,
Please find attached our slurm.conf. PriorityMaxAge=10 is already set. Please let me know if I am missing anything else.

Regards,
Yogendra
________________________________
From: Carlos Fenoy [mini...@gmail.com]
Sent: Monday, November 11, 2013 3:52 PM
To: slurm-dev
Cc: Pankaj Sharma (WI01 - GIS)
Subject: [slurm-dev] Re: Slurm configuration problem --Age factor not working @ all

Dear Yogendra,

It seems you are missing the PriorityMaxAge parameter. Set it and the Age factor should start working.

Regards,
Carles Fenoy

On Mon, Nov 11, 2013 at 11:04 AM, <yogendra.shar...@wipro.com> wrote:

Hi Team,

We have enabled multifactor priority (PriorityWeightAge and PriorityWeightJobSize), but PriorityWeightAge is not working: our jobs are always scheduled based on job size, and sprio -l always shows a job age of 0. Below are the priorities we have configured. We are using Slurm 2.6.2. For testing we are submitting sleep jobs i minutes. Please help me enable the age factor in our configuration as well.
[root@hpca ~]# sprio -l
  JOBID     USER   PRIORITY        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS   NICE
    102     root        968          0          0        469        500          0      0
    103     root        968          0          0        469        500          0      0
    104     root        968          0          0        469        500          0      0

Below is the part of my slurm.conf
----------------------------------------------------
# TIMERS
#BatchStartTimeout=10
#CompleteWait=0
#EpilogMsgTime=2000
#GetEnvTimeout=2
#HealthCheckInterval=0
#HealthCheckProgram=
InactiveLimit=0
KillWait=10
#MessageTimeout=10
#ResvOverRun=0
MinJobAge=2
#OverTimeLimit=5
SlurmctldTimeout=120
SlurmdTimeout=300
#UnkillableStepTimeout=60
#VSizeFactor=0
Waittime=0
#
#
# SCHEDULING
#DefMemPerCPU=0
FastSchedule=1
#MaxMemPerCPU=0
#SchedulerRootFilter=1
SchedulerTimeSlice=30
PreemptMode=cancel
PreemptType=preempt/partition_prio
#PreemptType=preempt/qos
SchedulerType=sched/backfill
SchedulerPort=7321
#SelectType=select/linear
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
#PreemptMode=GANG
#
#
# JOB PRIORITY
PriorityType=priority/multifactor
#PriorityDecayHalfLife=
#PriorityCalcPeriod=
PriorityFavorSmall=Yes
PriorityMaxAge=10
#PriorityUsageResetPeriod=
PriorityWeightAge=100000
#PriorityWeightFairshare=1000
PriorityWeightJobSize=1000
PriorityWeightPartition=1000
#

--
Regards,
Yogendra

Please do not print this email unless it is absolutely necessary.

The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments.

WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email.
www.wipro.com

--
Carles Fenoy
# slurm.conf file generated by configurator.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=hpca
#ControlAddr=
#BackupController=
#BackupAddr=
#
AuthType=auth/munge
CacheGroups=0
#CheckpointType=checkpoint/none
#CryptoType=crypto/openssl
#DisableRootJobs=NO
#EnforcePartLimits=NO
#Epilog=/opt/epilog
#EpilogSlurmctld=logout
#FirstJobId=1
#MaxJobId=999999
#GresTypes=
#GroupUpdateForce=0
#GroupUpdateTime=600
#JobCheckpointDir=/var/slurm/checkpoint
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
#JobFileAppend=0
JobRequeue=1
#JobSubmitPlugins=1
#KillOnBadExit=0
#Licenses=foo*4,bar
#MailProg=/bin/mail
#MaxJobCount=5000
#MaxStepCount=40000
#MaxTasksPerNode=128
MpiDefault=none
#MpiParams=ports=#-#
#PluginDir=
#PlugStackConfig=
#PrivateData=jobs
ProctrackType=proctrack/pgid
#Prolog=
#PrologSlurmctld=
#PropagatePrioProcess=0
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
ReturnToService=1
#SallocDefaultCommand=
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/tmp/slurmd
SlurmUser=root
#SlurmdUser=
SrunEpilog=/opt/epilog
#SrunProlog=
StateSaveLocation=/var/spool
SwitchType=switch/none
#TaskEpilog=
TaskPlugin=task/none
#TaskPluginParam=
#TaskProlog=
#TopologyPlugin=topology/tree
#TmpFs=/tmp
#TrackWCKey=no
#TreeWidth=
#UnkillableStepProgram=
UsePAM=0
#
#
# TIMERS
#BatchStartTimeout=10
#CompleteWait=0
#EpilogMsgTime=2000
#GetEnvTimeout=2
#HealthCheckInterval=0
#HealthCheckProgram=
InactiveLimit=0
KillWait=10
#MessageTimeout=10
#ResvOverRun=0
MinJobAge=2
#OverTimeLimit=5
SlurmctldTimeout=120
SlurmdTimeout=300
#UnkillableStepTimeout=60
#VSizeFactor=0
Waittime=0
#
#
# SCHEDULING
#DefMemPerCPU=0
FastSchedule=1
#MaxMemPerCPU=0
#SchedulerRootFilter=1
SchedulerTimeSlice=30
PreemptMode=cancel
PreemptType=preempt/partition_prio
#PreemptType=preempt/qos
SchedulerType=sched/backfill
SchedulerPort=7321
#SelectType=select/linear
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
#PreemptMode=GANG
#
#
# JOB PRIORITY
PriorityType=priority/multifactor
#PriorityDecayHalfLife=
#PriorityCalcPeriod=
PriorityFavorSmall=Yes
PriorityMaxAge=10
#PriorityUsageResetPeriod=
PriorityWeightAge=100000
#PriorityWeightFairshare=1000
PriorityWeightJobSize=1000
PriorityWeightPartition=1000
#
#
# LOGGING AND ACCOUNTING
#AccountingStorageEnforce=0
#AccountingStorageHost=
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStoragePort=
AccountingStorageType=accounting_storage/mysql
#AccountingStorageType=none
#AccountingStorageUser=
AccountingStoreJobComment=YES
ClusterName=cluster
#DebugFlags=
DebugFlags=NO_CONF_HASH
#JobCompHost=
#JobCompLoc=
#JobCompPass=
#JobCompPort=
JobCompType=jobcomp/mysql
#JobCompUser=
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
SlurmSchedLogFile=/var/log/slurm/slurmsched.log
SlurmSchedLogLevel=1
#
#
# POWER SAVE SUPPORT FOR IDLE NODES (optional)
#SuspendProgram=
#ResumeProgram=
#SuspendTimeout=
#ResumeTimeout=
#ResumeRate=
#SuspendExcNodes=
#SuspendExcParts=
#SuspendRate=
#SuspendTime=
GresTypes=gpu
#
#
# COMPUTE NODES
NodeName=hpca[0002-0016] CPUs=16 Gres=gpu:2 State=UNKNOWN
PartitionName=GPUNODES Nodes=hpca[0002-0016] shared=exclusive MaxTime=INFINITE Default=YES Priority=10 State=UP
PartitionName=high Nodes=hpca[0002-0016] shared=exclusive MaxTime=INFINITE Default=NO Priority=20 State=UP RootOnly=YES
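
For reference, the AGE column that sprio reports can be estimated by hand. The Python sketch below is a rough model, not the plugin's actual code. It assumes the age factor grows linearly with pending time up to PriorityMaxAge and is then scaled by PriorityWeightAge, and that a bare number such as PriorityMaxAge=10 is read as minutes (my reading of the time-string formats in the slurm.conf man page). The function names are hypothetical, and the sketch ignores how often slurmctld actually refreshes job priorities.

```python
def slurm_time_to_seconds(text):
    """Convert a Slurm-style time string to seconds.

    Assumed formats: min, min:sec, hr:min:sec, days-hr,
    days-hr:min, days-hr:min:sec.
    """
    days = 0
    if "-" in text:
        # Forms with a day prefix: the first field after '-' is hours.
        day_part, text = text.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in text.split(":")]
        parts += [0] * (3 - len(parts))
        hours, minutes, seconds = parts
    else:
        parts = [int(p) for p in text.split(":")]
        if len(parts) == 1:
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:
            hours, minutes, seconds = 0, parts[0], parts[1]
        else:
            hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def age_priority(wait_seconds, max_age_seconds, weight_age):
    """Estimated age contribution: weight * min(wait / max_age, 1)."""
    factor = min(wait_seconds / max_age_seconds, 1.0)
    return round(weight_age * factor)


if __name__ == "__main__":
    # Values taken from the slurm.conf above.
    max_age = slurm_time_to_seconds("10")   # PriorityMaxAge=10
    weight = 100000                         # PriorityWeightAge=100000
    for waited in (0, 60, 300, 600, 3600):
        print(waited, age_priority(waited, max_age, weight))
```

Under this model, a job that has been pending for one minute with the settings above would show an AGE value of 10000, so a column that stays at 0 for jobs that have waited that long suggests the age factor is not being applied or not being refreshed, rather than the weight being too small.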