Hi guys,

I've been testing job preemption and found a bug in the implementation of the PreemptMode=off option for partitions. The examples at http://slurm.schedmd.com/preempt.html show non-preemptable partitions with PreemptMode=off only where the option is set on the highest-priority partition. In our environment we wanted instead to set up partitions like this:

    PartitionName=priousers [...] Priority=30 PreemptMode=suspend
    PartitionName=users [...] Priority=20 PreemptMode=suspend Shared=FORCE:1
    PartitionName=external-users [...] Priority=10 PreemptMode=off Shared=FORCE:1

So the priority of external-users is the lowest, but their jobs should not be preempted. Without the patch, jobs from the external-users partition were considered for preemption and finally killed by this code (slurmctld/gang.c:671):

    if (rc != SLURM_SUCCESS) {
        rc = job_signal(job_ptr->job_id, SIGKILL, 0, 0, true);
        if (rc == SLURM_SUCCESS)
            info("preempted job %u had to be killed", job_ptr->job_id);
        else {
            info("preempted job %u kill failure %s", job_ptr->job_id, slurm_strerror(rc));
        }
    }

They are killed because they cannot be suspended, requeued, etc., since they are not in a partition with an appropriate PreemptMode.

The needed change is in the select plugin being used. In our case that is cons_res, but from reading the code the problem looks similar in the other plugins. I've changed the condition at plugins/select/cons_res/job_test.c:2326 to:

    if (p_ptr->part_ptr->priority <= jp_ptr->part_ptr->priority &&
        p_ptr->part_ptr->preempt_mode != PREEMPT_MODE_OFF)

This is probably the only change needed to get the correct behaviour (I tested on a workstation with 3 partitions), but I'd also recommend an additional change in how the preemptable_candidates list is created. There is already an implicit check (select_cons_res.c, line 1558) that skips a preemptable candidate whose job has PREEMPT_MODE_OFF:

    mode = slurm_job_preempt_mode(tmp_job_ptr);
    if ((mode != PREEMPT_MODE_REQUEUE) &&
        (mode != PREEMPT_MODE_CHECKPOINT) &&
        (mode != PREEMPT_MODE_CANCEL))
        continue; /* can't remove job */

I think it is more efficient not to include jobs from a partition with PREEMPT_MODE_OFF in the preemptable_candidates list at all, which can be done in plugins/preempt/partition_prio/preempt_partition_prio.c:113:

    if ((job_p->part_ptr == NULL) ||
        (job_p->part_ptr->priority >= job_ptr->part_ptr->priority) ||
        (job_p->part_ptr->preempt_mode == PREEMPT_MODE_OFF)) /* added: check the job's partition preempt mode */
        continue;

I've attached a patch with my changes; it also adds some extra debug output under the appropriate debug flags.
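
To make the intended behaviour concrete, the candidate filter described above can be modelled outside Slurm roughly as follows. This is only a sketch under assumptions, not the attached patch and not Slurm code: struct part, struct job, the SKETCH_MODE_* constants and is_preempt_candidate() are hypothetical names made up for the illustration, and only the three partitions and their priorities come from the configuration above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Slurm's preempt modes; values are illustrative. */
enum sketch_mode { SKETCH_MODE_OFF, SKETCH_MODE_SUSPEND };

/* Hypothetical, reduced partition record. */
struct part {
    const char *name;
    unsigned priority;
    enum sketch_mode preempt_mode;
};

/* Hypothetical, reduced job record. */
struct job {
    unsigned job_id;
    const struct part *part_ptr; /* may be NULL, as in the original check */
};

/* The three partitions from the configuration in this mail. */
static const struct part priousers      = { "priousers",      30, SKETCH_MODE_SUSPEND };
static const struct part users          = { "users",          20, SKETCH_MODE_SUSPEND };
static const struct part external_users = { "external-users", 10, SKETCH_MODE_OFF };

/*
 * Return true if `running` may be preempted to make room for `pending`.
 * A running job is skipped when it has no partition, when its partition
 * priority is not lower than the pending job's, or when its partition
 * has preemption switched off.
 */
static bool is_preempt_candidate(const struct job *running,
                                 const struct job *pending)
{
    assert(running != NULL && pending != NULL && pending->part_ptr != NULL);

    if (running->part_ptr == NULL)
        return false;
    if (running->part_ptr->priority >= pending->part_ptr->priority)
        return false;
    if (running->part_ptr->preempt_mode == SKETCH_MODE_OFF)
        return false; /* the extra check proposed in this mail */
    return true;
}
```

With a pending job in priousers, a running job in users is a candidate, while a running job in external-users is not, even though its partition has the lowest priority.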

I was working on 2.6.7, but a quick review of the code shows that there were no changes in the 14.03-rc1 version.

cheers,
marcin

===================
Marcin Stolarek
Interdisciplinary Centre for Mathematical and Computational Modelling (ICM),
University of Warsaw, Poland