Regarding #2, we use Slurm's Fair Tree fairshare algorithm (https://slurm.schedmd.com/fair_tree.html), and it has really exceeded my expectations.

Pete Ruprecht
CU-Boulder Research Computing

On 8/24/17, 11:03 AM, "Patrick Goetz" <pgo...@math.utexas.edu> wrote:

I'm managing what has turned into a very busy cluster and the users are complaining about resource allocation. There appear to be 2 main complaints:

1. When users submit (say) 8 long running single core jobs, it doesn't appear that Slurm attempts to consolidate them on a single node (each of our nodes can accommodate 16 tasks). This becomes problematic for users submitting threaded applications which must run on a single node, as even short jobs can be locked out of every node on the system for days at a time.

2. Users are interested in temporally influenced resource allocation. Quoting one of the users "heavy users in recent past will get fewer resources assigned to them, while users that rarely use the cluster will likely get more jobs run. We believe a good system will track each user base on their usage"

Basically they want to keep track of

User_usage = Sum (Number of cores requested x Running time)

accumulated over, say, a month's time before the counter is reset. I'm not really sure how to implement such a thing, or if it is even possible. It's quite possible that a solution to #1 would effectively solve #2 for them, as #1 is the source of the request for #2.

Thanks.
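
For illustration only, the usage metric described in the quoted message (cores times running time, summed per user over a fixed window) could be sketched in Python as below. This is not how Slurm computes fairshare internally; the multifactor priority plugin keeps its own usage accounting. The function name `usage_by_user`, the tuple layout of the job records, and the sample jobs are all hypothetical, made up just to show the arithmetic.

```python
from collections import defaultdict
from datetime import datetime


def usage_by_user(jobs, window_start, window_end):
    """Sum cores x running time (in core-hours) per user within a window.

    `jobs` is an iterable of (user, cores, start, end) tuples. This record
    layout is an assumption for the sketch, not a Slurm data format.
    """
    usage = defaultdict(float)
    for user, cores, start, end in jobs:
        # Count only the part of the job that overlaps the accounting window.
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            hours = (overlap_end - overlap_start).total_seconds() / 3600.0
            usage[user] += cores * hours
    return dict(usage)


# Hypothetical job records: (user, cores, start, end).
jobs = [
    ("alice", 8, datetime(2017, 8, 1, 0, 0), datetime(2017, 8, 1, 12, 0)),
    ("alice", 1, datetime(2017, 8, 2, 0, 0), datetime(2017, 8, 2, 6, 0)),
    ("bob", 16, datetime(2017, 7, 31, 22, 0), datetime(2017, 8, 1, 2, 0)),
]

# One-month window, after which the counter would start again from zero.
window_start = datetime(2017, 8, 1)
window_end = datetime(2017, 9, 1)

usage = usage_by_user(jobs, window_start, window_end)
for user, core_hours in sorted(usage.items()):
    print(f"{user}: {core_hours:.1f} core-hours")
```

A hard monthly reset like this treats a job from 29 days ago the same as one from yesterday. If recent use should count for more, a decaying weight on older usage would be one alternative to the fixed window.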