Hello everyone,

I'm trying to improve topology awareness on a local Slurm-managed HPC cluster. 
It uses a hierarchical 3-level topology with the topology/tree plugin. However, 
the scheduler does not always confine jobs to the most tightly packed group of 
nodes, it tends to span more switches than necessary for small jobs, and it 
gets slow or overwhelmed by jobs with a high node count.
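
For reference, our topology.conf is laid out roughly like this (the switch and 
node names are made up for this post, but the SwitchName=/Switches=/Nodes= 
syntax follows the topology docs linked below):

    # Level 1: leaf switches, each wired to a block of nodes
    SwitchName=leaf1 Nodes=node[001-018]
    SwitchName=leaf2 Nodes=node[019-036]
    # ... more leaves ...
    # Level 2: aggregation switches over groups of leaves
    SwitchName=agg1 Switches=leaf[1-4]
    SwitchName=agg2 Switches=leaf[5-8]
    # Level 3: core switch at the root
    SwitchName=core Switches=agg[1-2]
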
I'd like to implement something closer to a strict best-fit placement, but I'm 
having trouble understanding which interfaces to hook into in Slurm's topology 
model. I would appreciate a high-level explanation of how the tree and common 
topology components work, how they integrate with the higher-level scheduling 
logic, and what the internal topology model looks like, or some pointers to 
relevant docs discussing this.

I have read the topology guide and the corresponding developer documentation, 
which does note some of the caveats I mentioned. However, it only talks about 
providing a set of weights to the upper scheduling layers in the form of a 
node ranking. I can't see how this ranking encodes the topology or how it is 
actually used. From the function signatures and the C code I can tell this 
much:

topology/tree parses topology.conf and generates a ranking of some kind that 
is passed on to topology/common.
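
My current mental model of that ranking is purely a guess, not something I 
found spelled out in the sources: a depth-first walk of the switch tree that 
numbers nodes so topologically close nodes get adjacent ranks, letting the 
upper layers treat "pick a contiguous run of ranks" as an approximation of 
"pick a tightly packed set of nodes". In C, something like:

    /* Hypothetical sketch of rank generation, NOT Slurm's actual code. */
    #include <stdio.h>

    struct sw {
        struct sw **children;   /* NULL-terminated child switches */
        int *nodes;             /* -1-terminated node indices (leaf only) */
    };

    /* Depth-first walk: nodes under the same leaf get adjacent ranks. */
    static void rank_dfs(const struct sw *s, int rank_of[], int *next)
    {
        if (s->nodes)
            for (const int *n = s->nodes; *n >= 0; n++)
                rank_of[*n] = (*next)++;
        if (s->children)
            for (struct sw **c = s->children; *c; c++)
                rank_dfs(*c, rank_of, next);
    }

    int main(void)
    {
        int n1[] = {0, 2, -1}, n2[] = {1, 3, -1};  /* interleaved on purpose */
        struct sw leaf1 = { NULL, n1 }, leaf2 = { NULL, n2 };
        struct sw *kids[] = { &leaf1, &leaf2, NULL };
        struct sw root = { kids, NULL };
        int rank_of[4], next = 0;

        rank_dfs(&root, rank_of, &next);
        for (int i = 0; i < 4; i++)
            printf("node %d -> rank %d\n", i, rank_of[i]);
        return 0;
    }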

topology/common consumes that ranking and uses its own gres_sched code to 
figure out which nodes can fit a job (possibly pulling info from the select 
and GRES plugins to determine node capabilities).
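
To check whether I'm reading that part right: I picture this as a per-node 
feasibility test along these lines (struct and field names below are invented 
for illustration, not taken from Slurm):

    /* Hypothetical per-node fit test; names and fields are made up. */
    #include <stdio.h>

    struct node_res { int idle_cpus; int idle_gpus; };
    struct job_req  { int cpus_per_node; int gpus_per_node; };

    static int node_can_fit(const struct node_res *n, const struct job_req *j)
    {
        return n->idle_cpus >= j->cpus_per_node &&
               n->idle_gpus >= j->gpus_per_node;
    }

    int main(void)
    {
        struct node_res n = { 16, 2 };
        struct job_req  j = { 8, 4 };
        printf("fits: %d\n", node_can_fit(&n, &j));  /* 0: too few GPUs */
        return 0;
    }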

It is then supposed to apply a best-fit algorithm to fill up vacant cluster 
capacity efficiently, but I can't manage to follow this part, as the logic is 
spread across separate files that I can't link together in my head.
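
For comparison, this is the behaviour I'd like to see, written as a toy 
best-fit over leaf switches (again my own sketch, not derived from the Slurm 
sources): pick the leaf whose idle-node count is sufficient but smallest, and 
only escalate to a higher switch level when no single leaf can hold the job.

    /* Toy best-fit over leaf switches; not Slurm code. */
    #include <stdio.h>

    /* Return the leaf with the smallest sufficient idle-node count,
     * or -1 if no single leaf fits and we must go up a level. */
    static int best_fit_leaf(const int idle[], int nleaf, int needed)
    {
        int best = -1;
        for (int i = 0; i < nleaf; i++)
            if (idle[i] >= needed && (best < 0 || idle[i] < idle[best]))
                best = i;
        return best;
    }

    int main(void)
    {
        int idle[] = { 12, 3, 7, 16 };  /* idle nodes per leaf switch */
        printf("4-node job  -> leaf %d\n", best_fit_leaf(idle, 4, 4));  /* 2 */
        printf("20-node job -> leaf %d\n", best_fit_leaf(idle, 4, 20)); /* -1 */
        return 0;
    }

This way small jobs stay on a single switch and lightly loaded leaves don't 
get fragmented.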

Thanks in advance.

referenced docs: 
<https://slurm.schedmd.com/topology.html> 
<https://hpc.rz.rptu.de/documentation/topology_plugin.html> 
<https://github.com/SchedMD/slurm/tree/master/src/plugins/topology/common>
