Dear Slurm users,

It is very useful to be able to view a user's Slurm resource limits and current usage. For example, jobs may be blocked because some resource limit has been exceeded, and it is important to be able to analyze why this occurs.

Several Slurm commands, such as sshare and sacctmgr, can print a number of user limits and, to a lesser extent, the user's current usage. However, their capabilities are very limited.

The showuserlimits tool fills this need by querying the Slurm database for all available user and association limits and current usage. The amount of information in the database is quite extensive, so the showuserlimits tool allows filtering the data and printing only the desired information. An output example is:

$ showuserlimits -u xxx -l GrpTRESRunMins -s cpu
Association (User):
           ClusterName =        niflheim
               Account =        camdvip
              UserName =        xxx, current value or id = 1777
             Partition =        None, current value or id = Any partition
        GrpTRESRunMins =
                     cpu:       Limit = 7000000, current value = 2800752
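
As a rough illustration of how such output might be post-processed, here is a minimal Python sketch that reads a "Limit = ..., current value = ..." line in the format shown above and reports how much of the limit is in use. This is not part of showuserlimits; the names parse_tres_usage and fraction_used are hypothetical, and the sketch assumes both values are plain integers, as they are in this example.

```python
import re

# Sample text copied from the example output above.
SAMPLE = """\
        GrpTRESRunMins =
                     cpu:       Limit = 7000000, current value = 2800752
"""

# Matches lines of the form "<name>: Limit = <int>, current value = <int>".
# Assumption: both values are plain integers, as in the sample.
TRES_LINE = re.compile(
    r"^\s*(\S+):\s+Limit\s*=\s*(\d+),\s*current value\s*=\s*(\d+)\s*$"
)


def parse_tres_usage(text):
    """Return {name: (limit, current)} for every matching line in text."""
    usage = {}
    for line in text.splitlines():
        match = TRES_LINE.match(line)
        if match:
            usage[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return usage


def fraction_used(limit, current):
    """Return current/limit, or NaN when the limit is not positive."""
    return current / limit if limit > 0 else float("nan")


if __name__ == "__main__":
    for name, (limit, current) in parse_tres_usage(SAMPLE).items():
        share = fraction_used(limit, current)
        print(f"{name}: {current} of {limit} ({share:.1%} used)")
```

For the sample above this prints "cpu: 2800752 of 7000000 (40.0% used)". Lines whose values are not integers are simply skipped by this sketch.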

The showuserlimits tool can be downloaded from:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/showuserlimits

The showuserlimits tool is used by the showjob command available from the scripts for managing jobs:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/jobs

If you have comments or suggestions regarding these tools, please send me an email.

Best regards,
Ole

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark
