Hello Dani and Thomas,

Thank you both for your responses. They seem well suited to a regular-size HPC cluster.

However, I think I did not specify clearly enough what my cluster looks like and what I'm trying to achieve. Compared to a regular HPC cluster, my testing cluster consists of as few as 5 nodes (each having the same "grand-scale" features, so no IB nodes etc., and differing only in hardware details like CPU, amount of RAM etc., including some MCUs akin to a Raspberry Pi). The purpose of this cluster is to investigate how smart distribution of workloads, based on predetermined performance and energy data, can benefit HPC clusters that consist of heterogeneous systems differing greatly in energy consumption and performance.
It's just a small research project.

I already have all the data I need and now just need to find a way to integrate node selection based on these priority lists into Slurm.

My idea is to write a plugin that, on job submission to Slurm, reads those lists, makes a smart selection (based on different criteria) of which currently available node is best suited, and forwards the job to that node. Partition selection is not needed since I run all nodes in one partition for easier usage. The only information my plugin needs to forward besides the node name is a few small config parameters in the form of environment variables on the target machine.

So far I have made those job requests manually via:
srun -w <nodename> --export=<ev1>,... <cmd>
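
For example, a concrete call (node and variable names are just placeholders for my real data) looks like:
srun -w node03 --export=POWER_MODE=low,FREQ_CAP=1800 ./my_workload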

I would like to add functionality to Slurm so that, upon a simple "srun <cmd>", it performs the node selection and supplies the matching environment variables automatically. Based on my current knowledge of Slurm's architecture, a plugin (either select or sched) seems to be the apparent fit for what I'm trying to achieve.
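
To make that idea more concrete, here is a very rough, untested sketch of how the node-selection part might look if it lived in a job_submit/lua script instead (I'm assuming the req_nodes field is exposed and writable there; the priority-list path is made up and pick_best_node() stands in for my own selection logic; how to forward the environment variables from such a script is something I would still have to figure out):

-- untested sketch of a job_submit/lua script doing the node selection
local PRIORITY_FILE = "/etc/slurm/priority_list.txt"  -- made-up path; one node name per line, best first

local function pick_best_node()
    -- placeholder for my real selection logic: just take the first listed node
    local f = io.open(PRIORITY_FILE, "r")
    if f == nil then return nil end
    local node = f:read("*l")
    f:close()
    return node
end

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- only touch jobs that did not explicitly request a node
    if job_desc.req_nodes == nil or job_desc.req_nodes == "" then
        local node = pick_best_node()
        if node ~= nil then
            job_desc.req_nodes = node
            slurm.log_info("job_submit/lua: routing job to %s", node)
        end
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end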

However, as stated in my first mail, I have not dabbled in plugin development or editing yet, and I would kindly ask for advice from someone more experienced: am I indeed pursuing the correct approach, or would a frontend solution, albeit less elegant, be both easier and a better fit for the purpose of this project?
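
For comparison, the frontend alternative I have in mind would be little more than a wrapper around srun, roughly along these lines (again a hypothetical, untested sketch; the priority file, node names and exported variables are placeholders for my real data):

#!/bin/bash
# smart_srun.sh -- pick a node from my precomputed priority list and forward the command
PRIORITY_FILE=/etc/slurm/priority_list.txt   # made-up path; one node name per line, best first

# take the first node from the list that sinfo currently reports as idle
target=""
while read -r node; do
    if sinfo -N -h -t idle -n "$node" -o "%N" | grep -qx "$node"; then
        target=$node
        break
    fi
done < "$PRIORITY_FILE"

if [ -z "$target" ]; then
    echo "smart_srun: no suitable node available" >&2
    exit 1
fi

# POWER_MODE/FREQ_CAP are placeholders for the per-node parameters from my measurements
exec srun -w "$target" --export=POWER_MODE=low,FREQ_CAP=1800 "$@"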

Thanks again,

M. Wagner




2017-04-04 19:54 GMT+02:00 Thomas M. Payerle <paye...@umd.edu>:

You can define nodes with "features", and then at job submission time
require specific features.

E.g., if you had some nodes with ethernet only, some with QDR InfiniBand, and some with FDR IB, you could define the QDR and FDR nodes to have the feature qdr or fdr, respectively. Then, e.g., a job needing QDR InfiniBand would request the qdr feature when running sbatch. You would probably want to set the WEIGHTS (as dani wrote) so that non-IB nodes are selected preferentially (i.e. when no qdr or fdr feature is requested), so that non-IB jobs are less likely to consume all the IB nodes.
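
For illustration, the relevant slurm.conf entries and job request might look roughly like this (node names and counts made up, other required NodeName parameters omitted; note that nodes with the lowest weight are allocated first):

NodeName=eth[01-08]  Weight=10
NodeName=qdr[01-04]  Feature=qdr  Weight=20
NodeName=fdr[01-04]  Feature=fdr  Weight=20

# job that needs QDR InfiniBand:
sbatch --constraint=qdr job.sh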

This is all or nothing --- i.e. a job requesting feature qdr will NOT run on non-qdr nodes. It won't even run on fdr nodes (IIRC, you cannot ask for either qdr or fdr) --- if sufficient qdr nodes are not available, it just waits until they are. This is in contrast to the WEIGHTS suggestion, which basically says these nodes are preferred, but other nodes can be used if needed. (So using WEIGHTS is useful for preferring non-IB nodes: if no IB feature is requested, the non-IB nodes are preferred, but IB nodes can be used for such jobs if no non-IB nodes are available. Jobs requesting IB features, however, will skip over the non-IB nodes despite their preferential weight, because those nodes lack the requested feature.)


You can have sets of disjoint features (e.g. an sse4 feature if only some of your nodes support SSE4), and jobs can e.g. request fdr AND sse4 (assuming you have nodes meeting those criteria).

I do not believe that there is anything built in to have features selected by QOS, although one could probably add a job_submit lua script which adds features to the job request based on QOS.
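
Something along these lines, perhaps (untested sketch; this assumes the qos and features fields are exposed to the lua script, and "bigmem" is a made-up QOS/feature pair):

-- untested sketch: add a feature to the job request based on the requested QOS
function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.qos == "bigmem" then
        if job_desc.features == nil or job_desc.features == "" then
            job_desc.features = "bigmem"
        else
            job_desc.features = job_desc.features .. "&bigmem"
        end
    end
    return slurm.SUCCESS
end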




On Tue, 4 Apr 2017, dani wrote:

AFAIK, sbatch and friends don't allow "preferences" when submitting jobs to multiple "clusters/partitions". Theoretically you could define a different QOS for each workload, but either that QOS would be valid for all nodes/partitions, or it would be the same as submitting to a single partition (group of nodes).

The other mechanism allowing some preferential treatment of nodes is "WEIGHT", but this one is generic to all jobs - every job would prefer the same weighted nodes - and I don't know of a QOS that can add weight to nodes, to be assigned to both a job and a partition to bridge that gap.

I think building a QOS extension would be the "best" generic solution, but not necessarily the easiest to implement.

On 04/04/2017 18:42, maviko.wag...@fau.de wrote:


Hello everyone,

I'm looking for advice regarding the easiest way to realize the following:

I'm running Slurm to manage a heterogeneous cluster featuring a multitude of different nodes with specific features and hardware details. I'm currently using Slurm 15.08.7 since some of my nodes run pretty old Ubuntu releases (14.04 being the oldest). I also know the power and runtime levels for all possible workloads that could be submitted to my cluster. This data was created via predetermined tests.

Now I would like to utilize this data to choose the best (based on different criteria) available target node for each workload upon submission.
What would be the easiest way to implement said behaviour?
Would writing a custom scheduler plugin be suitable? Or would it be easier to implement a front-end program that receives input similar to Slurm's and forwards commands based on some internal selection algorithms?

I have not looked into tweaking Slurm beyond basic slurm.conf edits and would like advice on which approach seems more suitable... and maybe some starting points.

Thanks in advance,

M. Wagner





Tom Payerle
IT-ETI-EUS                              paye...@umd.edu
4254 Stadium Dr                         (301) 405-6135
University of Maryland
College Park, MD 20742-4111
