Re: [slurm-users] Help with preemtion based on licenses

2019-11-06 Thread Oytun Peksel
Thank you all for your input.
As a newbie in this area, my impression from what you all write is that, for 
most commercial software, a suspend/release-license/reacquire mechanism is not 
feasible.

(Answer to Mark)
What we are using here is an engineering software package called Abaqus. Abaqus 
can use token-based licenses, where the number of tokens depends on the number 
of cores used (and some other things). It checks out the tokens from a flex 
license server on submission, and if the job gets suspended it releases them. 
Another instance can then use the released tokens. If the suspended instance is 
later resumed, it cannot continue unless enough tokens are available.
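
The checkout/release/resume behaviour described above can be sketched as a toy 
model. This is only an illustration, not a real license-server client: the 
TokenPool class, the job names and the token counts are all hypothetical.

```python
class TokenPool:
    """Toy model of a token-based license pool (not a real license client)."""

    def __init__(self, total):
        self.total = total
        self.in_use = {}  # job name -> tokens currently held

    def free(self):
        return self.total - sum(self.in_use.values())

    def checkout(self, job, tokens):
        """Grant tokens if enough are free; return True on success."""
        if tokens > self.free():
            return False
        self.in_use[job] = self.in_use.get(job, 0) + tokens
        return True

    def release(self, job):
        """Return all tokens held by a job (e.g. when it is suspended)."""
        return self.in_use.pop(job, 0)


# Hypothetical numbers: 100 tokens in total.
pool = TokenPool(total=100)
assert pool.checkout("low_prio", 80)        # first job starts
released = pool.release("low_prio")         # job is suspended, tokens go back
assert pool.checkout("high_prio", 60)       # second job can now start
can_resume = pool.checkout("low_prio", released)        # only 40 free: must wait
pool.release("high_prio")                   # second job finishes
can_resume_later = pool.checkout("low_prio", released)  # now enough tokens
```

The point of the sketch is the last three lines: the suspended job is not 
guaranteed to get its tokens back at resume time, only once enough are free.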

I have had no real problems with this mechanism. It works pretty well as long 
as I do not attempt to track licenses with Slurm.

My claim: since Slurm doesn't really integrate with license servers, and license 
handling is pretty much up to the admin, it should not assume that licenses can 
never be released.

Another thing that puzzles me is:
AccountingStorageTRES=license/someSoftware

I would expect this to track the licenses defined either in slurm.conf or in 
sacctmgr, but it does not.

When I run:

scontrol show job

it does not show any licenses in the output:

TRES=cpu=23,mem=23G,node=1,billing=23

Likewise, sacct --format=tres shows just the default trackable resources.
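
A small helper can make this kind of check repeatable. The sketch below only 
splits a TRES string like the one above and looks for entries whose name starts 
with "license/". The second example string is an assumption about how a tracked 
license might appear, not output taken from Slurm.

```python
def parse_tres(tres):
    """Split a TRES string such as 'cpu=23,mem=23G' into a dict."""
    result = {}
    for item in tres.split(","):
        if not item:
            continue
        name, _, value = item.partition("=")
        result[name.strip()] = value.strip()
    return result


def license_entries(tres):
    """Return only the entries whose name starts with 'license/'."""
    return {k: v for k, v in parse_tres(tres).items() if k.startswith("license/")}


# The string reported by scontrol in the message above.
observed = "cpu=23,mem=23G,node=1,billing=23"
print(license_entries(observed))  # prints {}

# Hypothetical: what an entry might look like if the license were tracked.
hoped_for = "cpu=23,mem=23G,node=1,billing=23,license/someSoftware=20"
print(license_entries(hoped_for))
```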


Oytun Peksel
oytun.pek...@semcon.com
Mobile   +46739205917


-Original Message-
From: slurm-users  On Behalf Of Chris 
Samuel
Sent: 7 November 2019 08:03
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Help with preemtion based on licenses

On Wednesday, 6 November 2019 7:36:57 AM PST Oytun Peksel wrote:

> GPU part of the discussion is beyond my knowledge so I assumed it
> would be possible to release it.

If you simply suspend a job then the application does not exit, it will just 
get stopped and so will be holding various resources and file handles open - 
and that will include the GPU and the resources on it.

[...]
> After all software licenses might be the most expensive resource to
> utilize where preemption might sometimes be inevitable.

I think the thing to remember with software licensing systems is that we are 
not the users or customers for that vendor, it's the ISV whose software you are 
using who is their customer.  So their aim is to try and ensure that the ISV 
sells as many licenses for their software as possible.

If you just suspend an application that has checked licenses out, and then use 
some other program to make the license server think it has died and release 
them, then I suspect that when you unsuspend it, it will be very confused: it 
will think it still has these licenses checked out, but the license server 
won't. I suspect that would not lead to a happy program, user or license server.

So for both GPUs and licenses I suspect you really do want either cancel or 
requeue for this.

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA






When you communicate with us or otherwise interact with Semcon, we will process 
personal data that you provide to us or we collect about you, please read more 
in our Privacy Policy.



Re: [slurm-users] Help with preemtion based on licenses

2019-11-06 Thread Chris Samuel
On Wednesday, 6 November 2019 7:36:57 AM PST Oytun Peksel wrote:

> GPU part of the discussion is beyond my knowledge so I assumed it would be
> possible to release it.

If you simply suspend a job then the application does not exit, it will just 
get stopped and so will be holding various resources and file handles open - 
and that will include the GPU and the resources on it.

[...]
> After all software licenses might be the most expensive resource to utilize 
> where preemption might sometimes be inevitable.

I think the thing to remember with software licensing systems is that we are 
not the users or customers for that vendor, it's the ISV whose software you 
are using who is their customer.  So their aim is to try and ensure that the 
ISV sells as many licenses for their software as possible.

If you just suspend an application that has checked licenses out, and then use 
some other program to make the license server think it has died and release 
them, then I suspect that when you unsuspend it, it will be very confused: it 
will think it still has these licenses checked out, but the license server 
won't. I suspect that would not lead to a happy program, user or license server.

So for both GPUs and licenses I suspect you really do want either cancel or 
requeue for this.

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA






Re: [slurm-users] Help with preemtion based on licenses

2019-11-06 Thread Reuti


> On 06.11.2019 at 16:36, Oytun Peksel wrote:
> 
> Thanks for the information Mark.
> 
> I understand. GPU part of the discussion is beyond my knowledge so I assumed 
> it would be possible to release it.
> 
> But as for the licenses it is always possible to leave it to the system 
> admin. It is possible to take care of license release and reacquire using 
> scripts instead of assuming it is not possible. At least there should be an 
> easy configuration option to configure generic or trackable resources to be 
> releasable.

To add some obstacles to Mark's notes:

In the interaction between any queuing system and the license-tracking 
mechanism inside each application, many things could certainly be improved. But 
it starts with the constraint that, to my knowledge, no license daemon offers a 
mechanism to "check and reserve/acquire a license if available" as an atomic 
operation, which would let the queuing system know that a license is available 
and schedule a job to use it. What might come close is to borrow a license 
during a scheduling run and use this information for an upcoming job. But here, 
too, the limitations differ: some vendors allow a borrowed license to be 
released prematurely, while others don't, and one has to wait for the specified 
timeframe to elapse.

Then there is the application itself: when does it check for an available 
license? Just as the application starts, periodically after a certain amount of 
elapsed time, or for each iteration while it's running? Or will it hold the 
license while it's running and only release it when it finishes? And what will 
happen if the application was suspended for some time? When it continues, it 
might discover that there were X minutes without a license daemon response, and 
so it might quit. If one is lucky, results achieved up to this point can still 
be saved.

To make things worse: what type of license is used by a particular 
application? One license per core/thread, per CPU, per job, per machine; or per 
machine per user, or for each group on this machine?

One favorable case could be a job that consists of several instances of a 
program, like a compiler when compiling a large application: the job could be 
stopped exactly when no compiler instance is active but just the job script.

Sure, for some applications it might be possible to script this in some way. So 
in my opinion the first goal for such a proposal would be to get this working 
outside of any queuing system. Stop the application on a local machine with 
SIGSTOP and attempt to use the license with another instance of this 
application, be it on the same or another machine. Often the state of the 
license daemon can be checked, and stopping the application should make the 
counter of available licenses increment again in the license daemon's state 
output.
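
The experiment described above could be scripted roughly as follows. This is a 
sketch under assumptions: the child process is a stand-in that just sleeps 
instead of a licensed application, and the sample status line is only modeled 
on a flex-style "Users of ..." summary, so the regular expression must be 
adapted to whatever the real license daemon prints. In a real test the status 
text would come from the daemon's status command, taken before and while the 
application is stopped.

```python
import os
import re
import signal
import subprocess
import sys

# Assumed format of a status summary line; check your daemon's real output.
USAGE_RE = re.compile(
    r"Users of (\S+):\s+\(Total of (\d+) licenses? issued;\s+"
    r"Total of (\d+) licenses? in use\)"
)


def licenses_in_use(status_text, feature):
    """Return the in-use count for one feature, or None if it is not listed."""
    for name, _issued, used in USAGE_RE.findall(status_text):
        if name == feature:
            return int(used)
    return None


def stop_and_continue(proc, check):
    """SIGSTOP the process, call check() while it is stopped, then SIGCONT."""
    os.kill(proc.pid, signal.SIGSTOP)
    try:
        return check()
    finally:
        os.kill(proc.pid, signal.SIGCONT)


# Stand-in for the licensed application: a child that just sleeps.
app = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

# Hypothetical status text; a real run would query the license daemon here.
sample = ("Users of abaqus:  (Total of 100 licenses issued;  "
          "Total of 20 licenses in use)")
in_use_while_stopped = stop_and_continue(
    app, lambda: licenses_in_use(sample, "abaqus")
)

app.terminate()
app.wait()
```

If the in-use count drops while the application is stopped and a second 
instance can then start, the application is a candidate for suspend-based 
preemption; if not, cancel or requeue is the safer choice.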

-- Reuti


> After all software licenses might be the most expensive resource to utilize  
> where preemption might sometimes be inevitable.
> 
> For now I have no better plan than to dig in the source code to find an easy 
> way to change this behavior.
> 
> Oytun Peksel
> oytun.pek...@semcon.com
> Mobile   +46739205917
> 
> 
> -Original Message-
> From: slurm-users  On Behalf Of Mark 
> Hahn
> Sent: 6 November 2019 16:23
> To: Slurm User Community List 
> Subject: Re: [slurm-users] Help with preemtion based on licenses
> 
>> This does not make sense to me. If gpu is my generic resource why would it 
>> not release the gpu resources if a job is suspended?
> 
> how would that be implemented?  how would the scheduler reach into the 
> application and cause the license to be released and reacquired?
> after all, the license server is otherwise oblivious to whether the job it 
> has granted a license to has been suspended or resumed.
> this applies to other gres as well - for instance GPUs, since there's no 
> mechanism to free up GPU resources allocated to a suspended process.
> 
> *that* is the problem - merely adding and subtracting is not.
> 
> regards, mark hahn.
> 
> 
> 




Re: [slurm-users] Help with preemtion based on licenses

2019-11-06 Thread Oytun Peksel
Thanks for the information Mark.

I understand. The GPU part of the discussion is beyond my knowledge, so I 
assumed it would be possible to release it.

But as for the licenses, it is always possible to leave it to the system admin. 
License release and reacquisition can be handled with scripts, instead of 
assuming it is not possible. At least there should be an easy configuration 
option to mark generic or trackable resources as releasable.

After all, software licenses might be the most expensive resource to utilize, 
where preemption might sometimes be inevitable.

For now I have no better plan than to dig in the source code to find an easy 
way to change this behavior.

Oytun Peksel
oytun.pek...@semcon.com
Mobile   +46739205917


-Original Message-
From: slurm-users  On Behalf Of Mark Hahn
Sent: 6 November 2019 16:23
To: Slurm User Community List 
Subject: Re: [slurm-users] Help with preemtion based on licenses

> This does not make sense to me. If gpu is my generic resource why would it 
> not release the gpu resources if a job is suspended?

how would that be implemented?  how would the scheduler reach into the 
application and cause the license to be released and reacquired?
after all, the license server is otherwise oblivious to whether the job it has 
granted a license to has been suspended or resumed.
this applies to other gres as well - for instance GPUs, since there's no 
mechanism to free up GPU resources allocated to a suspended process.

*that* is the problem - merely adding and subtracting is not.

regards, mark hahn.






Re: [slurm-users] Help with preemtion based on licenses

2019-11-06 Thread Oytun Peksel
OK, I found out that it is possible to preempt on licenses if you define the 
license as a generic resource, for example:
GresTypes=license
NodeName=SomeNode Gres=license:someSoftware:100

And submit the jobs with --gres=license:someSoftware:20

But this does not work with PreemptMode=Suspend. Slurm will requeue or cancel 
the preempted job, but it won't suspend it. There is an interesting paragraph 
on the Gres Scheduling page:

"Jobs will be allocated specific generic resources as needed to satisfy the 
request. If the job is suspended, those resources do not become available for 
use by other jobs."

This does not make sense to me. If gpu is my generic resource why would it not 
release the gpu resources if a job is suspended?



Oytun Peksel
oytun.pek...@semcon.com
Mobile   +46739205917


-Original Message-
From: slurm-users  On Behalf Of Oytun 
Peksel
Sent: 6 November 2019 09:09
To: Slurm User Community List 
Subject: Re: [slurm-users] Help with preemtion based on licenses

Yes, of course no one would expect the resource manager to control the job 
applications to release licenses.
Sometimes licenses are released automatically, or the release can be done by 
scripts.

The desired behavior when using '--license someSoftware@someserver:x':
if there are not enough licenses, a running job should be 
suspended/cancelled/requeued/checkpointed, and Slurm should assume that its 
licenses are released.

In other words, just treat the license resource like any other resource, such as 
CPU and memory. Nothing else. Today a lack of licenses simply leaves the new job 
pending, which disables the preemption mechanism.

The above behavior is observed with the select/cons_tres plugin and the license 
defined as a TRES: "AccountingStorageTres=license/someSoftware"



Oytun Peksel
oytun.pek...@semcon.com
Mobile   +46739205917


-Original Message-
From: slurm-users  On Behalf Of Mark Hahn
Sent: 5 November 2019 16:38
To: Slurm User Community List 
Subject: Re: [slurm-users] Help with preemtion based on licenses

> The limiting factor in our cluster is licenses and I want to have high and 
> low priority jobs where submitting a high priority job will preempt (suspend) 
> a low priority job if all the licenses are already in use.

But what are you expecting to happen?  that Slurm will somehow release the 
license used by the suspended job, and then somehow reacquire the license when 
it is resumed?  I've never heard of that kind of thing even being offered by 
license managers, let alone that level of intimate integration between 
schedulers and license managers.

At most, a scheduler may provide a callout to query the number of free 
licenses, and consider a job eligible to start if its declared usage fits (gres 
in Slurm terms, I think).
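
As an illustration of the "callout plus declared usage" idea, the check can be 
as simple as the sketch below. The function and the job list are hypothetical 
and are not how Slurm implements it; the free-license count is assumed to come 
from some site-specific query against the license server.

```python
def eligible_jobs(free_licenses, pending):
    """Return names of pending jobs whose declared usage fits, in queue order.

    free_licenses: count reported by a (hypothetical) license-query callout.
    pending: list of (job_name, declared_licenses) in priority order.
    """
    started = []
    remaining = free_licenses
    for name, declared in pending:
        if declared <= remaining:
            started.append(name)
            remaining -= declared
    return started


# Hypothetical queue: three jobs declaring 60, 50 and 30 licenses.
queue = [("high", 60), ("medium", 50), ("low", 30)]
print(eligible_jobs(100, queue))  # prints ['high', 'low']
```

Note that nothing here releases licenses held by running jobs: the scheduler 
only decides whether a new job fits into what the callout reports as free.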

regards, mark hahn
--
operator may differ from spokesperson.h...@mcmaster.ca







Re: [slurm-users] Running job using our serial queue

2019-11-06 Thread Marcus Wagner

Hi David,

if I remember right (we have had swap disabled for years now), swapping out 
processes seems to slow down the system overall.
But I know that when the oom_killer does its job (killing processes that exceed 
their memory), the whole system is stalled until it has done its work. This 
might be the issue your users see.


Hwloc at least should help the scheduler decide where to place processes, but 
if I remember right, Slurm has to be built with hwloc support (meaning at least 
hwloc-devel has to be installed).

But this part is more guessing than knowing.

Best
Marcus

On 11/5/19 11:58 AM, David Baker wrote:

Hello,

Thank you for your replies. I double checked that the "task" in, for 
example, taskplugin=task/affinity is optional. In this respect it is 
good to know that we have the correct cgroups setup. So in theory 
users should only disturb themselves; however, in reality we find that 
there is often a knock-on effect on other users' jobs. So, for 
example, users have complained that their jobs sometimes stall. I can 
only vaguely think that something odd is going on at the kernel level, 
perhaps.


One additional thing that I need to ask is: should we have hwloc 
installed on our compute nodes? Does that help? Whenever I check which 
processes are not being constrained by cgroups I only ever find a 
small group of system processes.


Best regards,
David





*From:* slurm-users  on behalf 
of Marcus Wagner 

*Sent:* 05 November 2019 07:47
*To:* slurm-users@lists.schedmd.com 
*Subject:* Re: [slurm-users] Running job using our serial queue
Hi David,

we do it the same way you do.

When the Matlab job asks for one CPU, it only gets one CPU this way. 
That means that all the processes are bound to this one CPU. So 
(theoretically) the user is just disturbing himself if he uses more.


But especially with Matlab, there is more to do. It does not 
suffice to add '-singleCompThread' to the command line. Matlab is not 
the only tool that tries to use all the cores it finds on the node.
The same is true for CPLEX and Gurobi, both often used from Matlab. 
So even if the user sets '-singleCompThread' for Matlab, that does 
not mean at all that the job is only using one CPU.
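
To illustrate the point, here is a small sketch of the difference between the 
number of cores on the node and the number of cores a bound process may 
actually use. It assumes Linux, where os.sched_getaffinity is available; the 
helper names are made up for this example. A tool that sizes its thread pool 
from the node's core count will oversubscribe the one CPU the job was given.

```python
import os


def usable_cpus():
    """CPUs this process may run on (Linux); falls back to the node total."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def oversubscription(threads_started):
    """How many threads compete for each CPU the job is allowed to use."""
    return threads_started / usable_cpus()


node_cpus = os.cpu_count() or 1
print(f"node has {node_cpus} CPUs, this process may use {usable_cpus()}")
print(f"threads per usable CPU if sized by node: {oversubscription(node_cpus):.1f}")
```

Run inside a one-CPU job on a shared node, the second number would be 1 while 
the first stays at the node's full core count.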



Best
Marcus

On 11/4/19 4:14 PM, David Baker wrote:

Hello,

We decided to route all jobs requesting from 1 to 20 cores to our 
serial queue. Furthermore, the nodes controlled by the serial queue 
are shared by multiple users. We did this to try to reduce the level 
of fragmentation across the cluster -- our default "batch" queue 
provides exclusive access to compute nodes.


It looks like the downside of the serial queue is that jobs from 
different users can interact quite badly. To some extent this is an 
education issue -- for example matlab users need to be told to add 
the "-singleCompThread" option to their command line. On the other 
hand I wonder if our cgroups setup is optimal for the serial queue. 
Our *cgroup.conf* contains...


CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"

ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainDevices=yes
TaskAffinity=no

CgroupMountpoint=/sys/fs/cgroup

The relevant cgroup configuration in the *slurm.conf* is...
ProctrackType=proctrack/cgroup
TaskPlugin=affinity,cgroup

Could someone please advise us on the required/recommended cgroup 
setup for the above scenario? For example, should we really set 
"TaskAffinity=yes"? I assume the interaction between jobs (sometimes 
jobs can get stalled) is due to context switching at the kernel 
level, however (apart from educating users) how can we minimise that 
switching on the serial nodes?


Best regards,
David



--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de



Re: [slurm-users] Replacement for FastSchedule since 19.05.3

2019-11-06 Thread Taras Shapovalov
Hi Chris,

Thanks for the answer.
Does this mean that there is no way to get rid of the annoying error messages
in the logs if we need the hardware autodetection (FastSchedule=0)?

error: FastSchedule will be removed in 20.02, as will the
FastSchedule=0 functionality. Please consider removing this from your
configuration now.


The error message suggests that we "consider" this somehow, but I don't see
what we are supposed to do instead.

Best regards,

Taras

On Wed, Nov 6, 2019 at 5:30 AM Chris Samuel  wrote:

> On 5/11/19 6:36 am, Taras Shapovalov wrote:
>
> > Since Slurm 19.05.3 we get an error message that FastSchedule is
> > deprecated. But I cannot find in the documentation what is an
> > alternative option for FastSchedule=0. Do you know how we can do that
> > without using the option since 19.05.3?
>
> There isn't an alternative for FastSchedule=0 from what I can see, it
> seems that it doesn't work properly with cons_tres (which will be
> replacing cons_res) and so is destined for the scrap heap.
>
> See slide 10 of Tim's presentation from this year's Slurm Users Group
> meeting:
>
> https://slurm.schedmd.com/SLUG19/Slurm_20.02_and_Beyond.pdf
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>
>


Re: [slurm-users] Help with preemtion based on licenses

2019-11-06 Thread Oytun Peksel
Yes, of course no one would expect the resource manager to control the job 
applications to release licenses.
Sometimes licenses are released automatically, or the release can be done by 
scripts.

The desired behavior when using '--license someSoftware@someserver:x':
if there are not enough licenses, a running job should be 
suspended/cancelled/requeued/checkpointed, and Slurm should assume that its 
licenses are released.

In other words, just treat the license resource like any other resource, such as 
CPU and memory. Nothing else. Today a lack of licenses simply leaves the new job 
pending, which disables the preemption mechanism.

The above behavior is observed with the select/cons_tres plugin and the license 
defined as a TRES: "AccountingStorageTres=license/someSoftware"



Oytun Peksel
oytun.pek...@semcon.com
Mobile   +46739205917


-Original Message-
From: slurm-users  On Behalf Of Mark Hahn
Sent: 5 November 2019 16:38
To: Slurm User Community List 
Subject: Re: [slurm-users] Help with preemtion based on licenses

> The limiting factor in our cluster is licenses and I want to have high and 
> low priority jobs where submitting a high priority job will preempt (suspend) 
> a low priority job if all the licenses are already in use.

But what are you expecting to happen?  that Slurm will somehow release the 
license used by the suspended job, and then somehow reacquire the license when 
it is resumed?  I've never heard of that kind of thing even being offered by 
license managers, let alone that level of intimate integration between 
schedulers and license managers.

At most, a scheduler may provide a callout to query the number of free 
licenses, and consider a job eligible to start if its declared usage fits (gres 
in Slurm terms, I think).

regards, mark hahn
--
operator may differ from spokesperson.h...@mcmaster.ca


