Re: [slurm-users] Execute scripts on suspend and cancel

2019-10-15 Thread Oytun Peksel
Brian,

Thanks for your response. I am looking into that option. I am a bit confused 
about which signal is sent, though. I thought it was SIGSTOP, not SIGTSTP. And I 
read that you can't really catch or block SIGSTOP (or stop SIGCONT from resuming 
the process), but I am not very good at sysadmin stuff anyway.
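
The difference between the two stop signals can be checked directly. Below is a minimal Python sketch, assuming a Linux/POSIX system; `can_catch` is a hypothetical helper name, not something from Slurm. It simply tries to install a handler for each signal and reports whether the operating system allows it.

```python
import signal


def can_catch(signum):
    """Return True if the OS lets this process install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda s, frame: None)
    except (OSError, ValueError):
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    if previous is not None:
        # Put the original disposition back so the check has no side effects.
        signal.signal(signum, previous)
    return True


for name in ("SIGSTOP", "SIGTSTP", "SIGCONT"):
    print(name, can_catch(getattr(signal, name)))
```

On Linux this should report that SIGSTOP cannot be caught, while SIGTSTP and SIGCONT can. So an application or wrapper can only react to a suspend if a catchable signal such as SIGTSTP is delivered before the process is actually stopped.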

So in the end, these feel like dirty tricks to me. The select/* plugins should 
have mechanisms to run scripts and such before sending signals. But apparently 
there is no such mechanism.

So probably I will dig deeper into what you suggested.

Thanks



Oytun Peksel

oytun.pek...@semcon.com 

+46739205917


From: slurm-users On Behalf Of Brian Andrus
Sent: 15 October 2019 20:58
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Execute scripts on suspend and cancel


It seems that there are some details that would need to be addressed.

A suspend signal is nothing more than sending a SIGTSTP (like hitting Ctrl-Z), 
so the application is still in memory awaiting SIGCONT.

So what should happen when it continues and there are no more licenses? The 
proper place for what you are looking for is in the application itself. If it 
is given a SIGTSTP, it could release the licenses and then check them out again 
when SIGCONT is received.

If you are able to tell your app to release/request a license externally, you 
may want to have a wrapper to do the signal handling until they have it as part 
of their app.

Brian Andrus


On 10/14/2019 4:40 AM, Oytun Peksel wrote:
It is quite weird if Slurm has no mechanism as described. I have been digging 
more into it and someone suggested a workaround using mail notifications. You 
use a script instead of the mail application and catch the event, then use 
sacct to see what is happening.

Two problems with this:

* There is no mail sent with suspend preemption.

* If you use requeue instead, there will be a mail event and you can 
catch it. sacct will flag it as "preempted" so you know it is requeued. But 
then it changes to pending, so you really need to be quick to catch it. 
Also, there is no distinctive flag for resuming.
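
The mail-hook workaround described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the script is assumed to be configured in place of the mail program and to be handed the notification subject, the subject is assumed to contain a `Job_id=<number>` token, and `job_id_from_subject` and `job_state` are hypothetical helper names. The actual subject format should be checked on your Slurm version.

```python
import re
import subprocess


def job_id_from_subject(subject):
    """Extract the job id, assuming the subject contains 'Job_id=<number>'."""
    match = re.search(r"Job_id=(\d+)", subject)
    return match.group(1) if match else None


def job_state(job_id):
    """Ask sacct for the job's current state (allocation line only)."""
    result = subprocess.run(
        ["sacct", "-j", job_id, "-X", "-n", "-P", "-o", "State"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0].strip() if lines else None
```

A hook would call `job_id_from_subject` on the subject it receives and then `job_state` right away, since the state reported by sacct may already have moved on to pending by the time the query runs.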


Does anyone have any other method to execute scripts during preemption?




Oytun Peksel

oytun.pek...@semcon.com 

+46739205917


From: slurm-users On Behalf Of Oytun Peksel
Sent: 11 October 2019 09:10
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Execute scripts on suspend and cancel

Hi,

I was wondering whether there is an option in Slurm to execute custom scripts 
before the suspend signal. What I need to do is tell an application to release 
its licenses before the suspend signal is sent during preemption. I think I went 
through all the documentation but could not find a mechanism like this.

BR
/Oytun


When you communicate with us or otherwise interact with Semcon, we will process 
personal data that you provide to us or we collect about you, please read more 
in our Privacy Policy.


[slurm-users] Slurm User Group 2019 (SLUG19) presentations online, SC19

2019-10-15 Thread Tim Wickberg
Many thanks to all the attendees, and especially to all those who 
presented at the Slurm User Group 2019 meeting in Salt Lake City. Thank 
you to the University of Utah as well for hosting.


I hope to see many of you again at SLUG'20, which will be held at Harvard 
University on September 15-16, 2020.


PDFs of the presentations are online at
http://slurm.schedmd.com/publications.html

For those of you who will be at SC19 in Denver - we hope to see you at 
the Slurm booth (#1571), and at the Slurm "Birds of a Feather" session 
on Thursday, November 21st, from 12:15 - 1:15pm, in rooms 
401/402/403/404. As always, there will be a number of presentations in 
the Slurm booth - please check the display in the booth for the full 
schedule.


- Tim

--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support



Re: [slurm-users] Execute scripts on suspend and cancel

2019-10-15 Thread Brian Andrus

It seems that there are some details that would need to be addressed.

A suspend signal is nothing more than sending a SIGTSTP (like hitting 
Ctrl-Z), so the application is still in memory awaiting SIGCONT.


So what should happen when it continues and there are no more licenses? 
The proper place for what you are looking for is in the application 
itself. If it is given a SIGTSTP, it could release the licenses and then 
check them out again when SIGCONT is received.


If you are able to tell your app to release/request a license 
externally, you may want to have a wrapper to do the signal handling 
until they have it as part of their app.
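
A wrapper along these lines could be sketched in Python as follows. This is only an illustration, not a tested recipe: `install_license_hooks` and `run_wrapped` are hypothetical names, the release and acquire commands are placeholders for whatever your license tooling actually provides, and it assumes the job receives a catchable SIGTSTP before it is really stopped, which should be verified on your Slurm version.

```python
import signal
import subprocess


def install_license_hooks(release, acquire):
    """Call release() on SIGTSTP and acquire() on SIGCONT."""
    signal.signal(signal.SIGTSTP, lambda signum, frame: release())
    signal.signal(signal.SIGCONT, lambda signum, frame: acquire())


def run_wrapped(app_argv, release_cmd, acquire_cmd):
    """Start the application and handle signals until it exits."""
    install_license_hooks(
        lambda: subprocess.call(release_cmd),  # placeholder release command
        lambda: subprocess.call(acquire_cmd),  # placeholder acquire command
    )
    child = subprocess.Popen(app_argv)
    # wait() is retried after a handler runs, so hooks fire while we block here.
    return child.wait()
```

The job script would then start something like `run_wrapped(["./my_app"], ["release-license"], ["acquire-license"])`, where all three commands are made-up names. The release hook has to finish quickly, because the wrapper itself is frozen once an uncatchable stop arrives.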


Brian Andrus


On 10/14/2019 4:40 AM, Oytun Peksel wrote:


It is quite weird if Slurm has no mechanism as described. I have been 
digging more into it and someone suggested a workaround using mail 
notifications. You use a script instead of the mail application and 
catch the event, then use sacct to see what is happening.


Two problems with this:

· There is no mail sent with suspend preemption.

· If you use requeue instead, there will be a mail event and you can 
catch it. sacct will flag it as “preempted” so you know it is 
requeued. But then it changes to pending, so you really need to 
be quick to catch it. Also, there is no distinctive flag for resuming.


Does anyone have any other method to execute scripts during preemption?



*Oytun Peksel*

oytun.pek...@semcon.com 




+46739205917




*From:* slurm-users *On Behalf Of* Oytun Peksel
*Sent:* 11 October 2019 09:10
*To:* slurm-users@lists.schedmd.com
*Subject:* [slurm-users] Execute scripts on suspend and cancel

Hi,

I was wondering whether there is an option in Slurm to execute custom 
scripts before the suspend signal. What I need to do is tell an 
application to release its licenses before the suspend signal is sent 
during preemption. I think I went through all the documentation but could 
not find a mechanism like this.


BR

/Oytun


