[slurm-dev] Job submit plugin to improve backfill
At CCNI, we use backfill scheduling on all our systems. However, we have found that users typically do not specify a time limit for their jobs, so the scheduler assumes the maximum from QoS/user limits/partition limits/etc. This really hurts backfilling since the scheduler remains ignorant of short jobs.

Attached is a small patch I wrote containing a job submit plugin and a new error message. The plugin rejects a job submission when it is missing a time limit and provides the user with a clear and distinct error. I've just re-tested and the patch applies and builds cleanly on the slurm-2.5, slurm-2.6, and master branches.

Please let me know if you find this useful, run across problems, or have suggestions/improvements. Thanks.

--
Daniel M. Weeks
Systems Programmer
Computational Center for Nanotechnology Innovations
Rensselaer Polytechnic Institute
Troy, NY 12180
518-276-4458

diff --git a/configure.ac b/configure.ac
index 609534b..beb14cb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -501,6 +503,7 @@ AC_CONFIG_FILES([Makefile
 		 src/plugins/job_submit/logging/Makefile
 		 src/plugins/job_submit/lua/Makefile
 		 src/plugins/job_submit/partition/Makefile
+		 src/plugins/job_submit/require_timelimit/Makefile
 		 src/plugins/launch/Makefile
 		 src/plugins/launch/aprun/Makefile
 		 src/plugins/launch/poe/Makefile
diff --git a/slurm/slurm_errno.h b/slurm/slurm_errno.h
index 7f8bb72..01267c3 100644
--- a/slurm/slurm_errno.h
+++ b/slurm/slurm_errno.h
@@ -257,7 +257,10 @@ enum {
 	ESLURM_JOBS_RUNNING_ON_ASSOC,
 	ESLURM_CLUSTER_DELETED,
 	ESLURM_ONE_CHANGE,
-	ESLURM_BAD_NAME
+	ESLURM_BAD_NAME,
+
+	/* require_timelimit custom errors */
+	ESLURM_MISSING_TIME_LIMIT = 8000
 };
 
 /* look up an errno value */
diff --git a/src/common/slurm_errno.c b/src/common/slurm_errno.c
index 24f5018..28834fd 100644
--- a/src/common/slurm_errno.c
+++ b/src/common/slurm_errno.c
@@ -391,7 +391,11 @@ static slurm_errtab_t slurm_errtab[] = {
 	{ ESLURM_ONE_CHANGE,
 	  "Can only change one at a time" },
 	{ ESLURM_BAD_NAME,
-	  "Unacceptable name given. (No '.' in name allowed)" }
+	  "Unacceptable name given. (No '.' in name allowed)" },
+
+	/* require_timelimit custom errors */
+	{ ESLURM_MISSING_TIME_LIMIT,
+	  "Missing time limit" }
 };
 
 /*
diff --git a/src/plugins/job_submit/Makefile.am b/src/plugins/job_submit/Makefile.am
index e35d4fe..c0cc646 100644
--- a/src/plugins/job_submit/Makefile.am
+++ b/src/plugins/job_submit/Makefile.am
@@ -1,3 +1,3 @@
 # Makefile for job_submit plugins
 
-SUBDIRS = all_partitions cnode defaults logging lua partition
+SUBDIRS = all_partitions cnode defaults logging lua partition require_timelimit
diff --git a/src/plugins/job_submit/require_timelimit/Makefile.am b/src/plugins/job_submit/require_timelimit/Makefile.am
new file mode 100644
index 000..117103a
--- /dev/null
+++ b/src/plugins/job_submit/require_timelimit/Makefile.am
@@ -0,0 +1,13 @@
+# Makefile for job_submit/require_timelimit plugin
+
+AUTOMAKE_OPTIONS = foreign
+
+PLUGIN_FLAGS = -module -avoid-version --export-dynamic
+
+INCLUDES = -I$(top_srcdir) -I$(top_srcdir)/src/common
+
+pkglib_LTLIBRARIES = job_submit_require_timelimit.la
+
+# Job submit require_timelimit plugin.
+job_submit_require_timelimit_la_SOURCES = job_submit_require_timelimit.c
+job_submit_require_timelimit_la_LDFLAGS = $(SO_LDFLAGS) $(PLUGIN_FLAGS)
diff --git a/src/plugins/job_submit/require_timelimit/job_submit_require_timelimit.c b/src/plugins/job_submit/require_timelimit/job_submit_require_timelimit.c
new file mode 100644
index 000..32367d7
--- /dev/null
+++ b/src/plugins/job_submit/require_timelimit/job_submit_require_timelimit.c
@@ -0,0 +1,34 @@
+#include <slurm/slurm.h>
+#include <slurm/slurm_errno.h>
+
+#include "src/slurmctld/slurmctld.h"
+
+const char plugin_name[] = "Require time limit jobsubmit plugin";
+const char plugin_type[] = "job_submit/require_timelimit";
+const uint32_t plugin_version = 100;
+const uint32_t min_plug_version = 100;
+
+int job_submit(struct job_descriptor *job_desc, uint32_t submit_uid)
+{
+	// NOTE: no job id actually exists yet (=NO_VAL)
+
+	if (job_desc->time_limit == NO_VAL) {
+		info("Missing time limit for job by uid:%u", submit_uid);
+		return ESLURM_MISSING_TIME_LIMIT;
+	} else if (job_desc->time_limit == INFINITE) {
+		info("Bad time limit for job by uid:%u", submit_uid);
+		return ESLURM_INVALID_TIME_LIMIT;
+	}
+
+	return SLURM_SUCCESS;
+}
+
+int job_modify(struct job_descriptor *job_desc, struct job_record *job_ptr,
+	       uint32_t submit_uid)
+{
+	if (job_desc->time_limit == INFINITE) {
+		info("Bad replacement time limit for %u", job_desc->job_id);
+		return ESLURM_INVALID_TIME_LIMIT;
+	}
+
+	return SLURM_SUCCESS;
+}
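For anyone wanting to try the plugin after building it, enabling it should just be a matter of naming it in slurm.conf (the plugin name below is inferred from the plugin_type string in the source; adjust if your build installs it differently):

```
# slurm.conf -- load the job submit plugin; the name is the part of
# plugin_type after "job_submit/"
JobSubmitPlugins=require_timelimit
```

After restarting or reconfiguring slurmctld, a submission without an explicit --time should come back with the new "Missing time limit" error instead of silently inheriting the maximum limit.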
[slurm-dev] Re: Job submit plugin to improve backfill
An alternative that we do is to choose very low defaults for people:

    PartitionName=Default DefaultTime=30:00 DefMemPerCPU=512  # plus other options

The disadvantage to this approach is that it doesn't give an obvious error message at submit time. However, it's not hard to figure out what happened when they hit the time limit or the error output says they went over their memory limit.

Ryan

On 06/28/2013 08:29 AM, Daniel M. Weeks wrote:
> At CCNI, we use backfill scheduling on all our systems. However, we
> have found that users typically do not specify a time limit for their
> job so the scheduler assumes the maximum ...

--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University
[slurm-dev] Re: Job submit plugin to improve backfill
Hi Ryan,

Thanks. We had considered this approach but went in a different direction for a couple of reasons. We have a good number of users who script job submissions and may blast out up to several hundred jobs; a user might not realize their jobs are getting cut off until many of them have run, and that's a waste of resources. Also, we have many users who are relatively new to HPC/Slurm and work from guides or tutorials that don't explain things very well. A distinct error message at job submission, rather than a related error after a failure (from the user's perspective), keeps a lot of support emails out of my inbox. Of course I'd like them to learn to use Slurm better, but they usually want to focus on their own research first.

- Dan

On 06/28/2013 11:00 AM, Ryan Cox wrote:
> An alternative that we do is choose very low defaults for people:
>
>     PartitionName=Default DefaultTime=30:00 DefMemPerCPU=512
>
> The disadvantage to this approach is that it doesn't give an obvious
> error message at submit time. ...

--
Daniel M. Weeks
Systems Programmer
Computational Center for Nanotechnology Innovations
Rensselaer Polytechnic Institute
Troy, NY 12180
518-276-4458
[slurm-dev] Re: Job submit plugin to improve backfill
Hello,

Why not enable this functionality by setting DefaultTime=0 in slurm.conf? That would let us set this on a per-partition basis rather than through a job submit plugin. (Unless I'm missing something obvious here.)

Also, currently setting DefaultTime=0 (on 2.5.6 at least) gives the following message:

    # srun -N2 hostname
    srun: error: Unable to create job step: Job/step already completing or completed

I suppose that is the way it should be, but it seems rather illogical to be able to set this at all.

--
Nikita Burtsev
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)

On Friday, June 28, 2013 at 7:25 PM, Daniel M. Weeks wrote:
> Hi Ryan,
>
> Thanks. We had considered this approach but went in a different
> direction for a couple reasons. ...
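For reference, the per-partition variant described above would look something like the following in slurm.conf (partition names, node lists, and times here are made up for illustration):

```
# slurm.conf -- hypothetical per-partition default and maximum time limits
PartitionName=debug  Nodes=n[01-04]  DefaultTime=30:00    MaxTime=2:00:00
PartitionName=batch  Nodes=n[05-64]  DefaultTime=4:00:00  MaxTime=7-00:00:00
```

Jobs submitted without --time would then inherit the partition's DefaultTime rather than the much larger MaxTime, which keeps the backfill scheduler better informed.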
[slurm-dev] Re: Job submit plugin to improve backfill
Another route that could be taken is to set the DefaultTime for a partition to 0; the small patch attached to this email will then reject a job when it has no time limit specified and the partition's default_time limit is 0. I also modified ESLURM_INVALID_TIME_LIMIT to mention that the error might be caused by a missing time limit.

Phil Eckert
LLNL

On 6/28/13 7:29 AM, Daniel M. Weeks <week...@rpi.edu> wrote:
> At CCNI, we use backfill scheduling on all our systems. However, we
> have found that users typically do not specify a time limit for their
> job so the scheduler assumes the maximum ...