I am using Marathon, and from Shuai Lin's answer it still seems that maintenance 
mode is not the right option for me. I don't want Marathon to move the tasks to 
another node (phase 1) without user action (restarting the task), and it should 
also not just kill the tasks (phase 2).

 

To be concrete: I need to update Docker and want to tell users that they need 
to restart their tasks so they are moved to a node with the latest Docker version. 

 

With MESOS-1739 my "first idea" would work.

 

From: Klaus Ma [mailto:klaus1982...@gmail.com] 
Sent: Wednesday, 30 December 2015 13:24
To: user@mesos.apache.org
Subject: Re: make slaves not getting tasks anymore

 

Hi Mike,

 

Which framework are you using? How about the maintenance scheduling feature? My 
understanding is that the framework should not dispatch tasks to an agent 
scheduled for maintenance, so the operator can wait for all tasks to finish 
before taking any action.
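As a rough sketch, a maintenance window can be scheduled through the Mesos operator HTTP endpoint; the hostname, IP, and timestamp below are placeholders, not values from this thread:

```shell
# Schedule maintenance for one machine. Frameworks receive inverse
# offers for it and can stop launching new tasks there, while the
# operator waits for existing tasks to drain.
curl -X POST http://master:5050/maintenance/schedule \
     -H 'Content-Type: application/json' \
     -d '{
           "windows": [{
             "machine_ids": [{"hostname": "agent1", "ip": "10.0.0.1"}],
             "unavailability": {"start": {"nanoseconds": 1451606400000000000}}
           }]
         }'
```

Note that this only announces the window; the machine is not taken down until the operator explicitly starts maintenance.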

 

Regarding "When maintenance is triggered by the operator": that step is for 
when some tasks take too long to finish, so the operator can take action to 
shut them down.

 

For the agent restart with new attributes, there's a JIRA (MESOS-1739) about it.




----

Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer 
Platform Symphony/DCOS Development & Support, STG, IBM GCG 
+86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me

 

On Wed, Dec 30, 2015 at 7:43 PM, Mike Michel <mike.mic...@mmbash.de> wrote:

Hi,

 

I need to update slaves from time to time and am looking for a way to take them 
out of the cluster without killing the running tasks. I need to wait until 
all tasks are done, and during this time no new tasks should be started on the 
slave. My first idea was to set a constraint "status:online" for every task I 
start and then change the attribute of the slave to "offline", restarting the 
slave process while the executor keeps running the tasks. But it seems that if 
you change the attributes of a slave, it cannot reconnect to the cluster 
without an rm -rf /tmp first, which kills all tasks.
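A minimal sketch of that attribute-based approach, with illustrative master address and attribute values (the caveat about changing attributes applies exactly as described above):

```shell
# 1. Start each agent with a schedulable-status attribute:
mesos-slave --master=zk://master:2181/mesos \
            --attributes="status:online"

# 2. In every Marathon app definition, pin tasks to online agents:
#    "constraints": [["status", "LIKE", "online"]]

# 3. To drain a node, restart the agent with:
#    --attributes="status:offline"
#    Caveat: changing attributes makes the agent fail recovery unless
#    its work dir is wiped (e.g. rm -rf /tmp/mesos), which kills the
#    running executors -- the limitation tracked in MESOS-1739.
```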

 

Maintenance mode also does not seem to be an option:

 

"When maintenance is triggered by the operator, all agents on the machine are 
told to shutdown. These agents are subsequently removed from the master which 
causes tasks to be updated as TASK_LOST. Any agents from machines in 
maintenance are also prevented from registering with the master."

 

Is there another way?

 

 

Cheers

 

Mike

 
