[jira] [Created] (MESOS-8933) Stop sending offers from agents in draining mode

2018-05-17 Thread Sagar Sadashiv Patwardhan (JIRA)
Sagar Sadashiv Patwardhan created MESOS-8933:


 Summary: Stop sending offers from agents in draining mode
 Key: MESOS-8933
 URL: https://issues.apache.org/jira/browse/MESOS-8933
 Project: Mesos
  Issue Type: Improvement
Reporter: Sagar Sadashiv Patwardhan


*Background:*

At Yelp, we use mesos to run microservices (Marathon), batch jobs (Chronos and 
custom frameworks), Spark (the Spark mesos framework), etc. We also autoscale 
the number of agents in our cluster based on current demand and some other 
metrics. We use mesos maintenance primitives to gracefully shut down mesos 
agents. 

*Problem:*

When we want to shut down an agent, we first move it into draining mode. This 
allows us to gracefully terminate the microservices and other tasks. But mesos 
continues to send offers from that agent, with unavailability set. Frameworks 
such as Marathon, Chronos, and Spark ignore the unavailability and schedule 
tasks on the agent. To prevent this, we allocate all the available resources on 
that agent to the maintenance role. But this approach is not fool-proof: there 
is still a race between the moment we move the agent into draining mode and the 
moment we allocate all of its available resources to the maintenance role.

*Proposal:*

 It would be nice if mesos stopped sending offers from agents in draining mode. 
Something like this: 
[https://gist.github.com/sagar8192/0b9dbccc908818f8f9f5a18d1f634513] I don't 
know whether this affects the allocator. We could put this behind a 
flag (something like --do-not-send-offers-from-agents-in-draining-mode) and 
make it optional.
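Until something like that flag exists, a framework can defend itself by 
declining any offer that carries an unavailability window. A minimal sketch of 
that filtering, using simplified dict-shaped offers rather than the real Mesos 
v1 scheduler API objects (the shapes here are illustrative assumptions):

```python
# Sketch of framework-side filtering: treat any offer that carries an
# unavailability window (i.e. the agent has a maintenance schedule) as
# unusable and decline it. Offers are simplified dicts, not real API objects.

def partition_offers(offers):
    """Split offers into (usable, draining) based on unavailability."""
    usable, draining = [], []
    for offer in offers:
        if offer.get("unavailability") is not None:
            draining.append(offer)   # agent has a maintenance window: decline
        else:
            usable.append(offer)     # safe to launch tasks here
    return usable, draining

offers = [
    {"id": "o1", "agent_id": "a1", "unavailability": None},
    {"id": "o2", "agent_id": "a2",
     "unavailability": {"start": {"nanoseconds": 1526515200 * 10**9}}},
]
usable, draining = partition_offers(offers)
```

A scheduler would then launch only on `usable` and send DECLINE calls for 
everything in `draining`.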



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-8534) Allow nested containers in TaskGroups to have separate network namespaces

2018-02-22 Thread Sagar Sadashiv Patwardhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373786#comment-16373786
 ] 

Sagar Sadashiv Patwardhan commented on MESOS-8534:
--

Hi Qian Zhang, the reason for using pods is that they provide atomicity 
(all-or-nothing semantics): we want either all of the containers to start or 
none of them. If we put the containers in different pods, we will not be able 
to use the all-or-nothing semantics of pods. We had some discussion about this 
in today's containerizer WG; please check it out when you get a chance. I think 
it will help clarify your questions.

> Allow nested containers in TaskGroups to have separate network namespaces
> -
>
> Key: MESOS-8534
> URL: https://issues.apache.org/jira/browse/MESOS-8534
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Sagar Sadashiv Patwardhan
>Priority: Minor
>  Labels: cni
>
> As per the discussion with [~jieyu] and [~avinash.mesos] , I am going to 
> allow nested containers in TaskGroups to have separate namespaces. I am also 
> going to retain the existing functionality, where nested containers can share 
> namespaces with the parent/root container.
> *Use case:* At Yelp, we have an application called seagull that runs 
> multiple tasks in parallel. It is mainly used for running tests that depend 
> on other containerized internal microservices. It was developed before mesos 
> had support for the docker executor, so it uses a custom executor, which 
> talks directly to the docker daemon on the host and runs a bunch of service 
> containers alongside the process where the tests are executed. Resources for 
> these containers are not accounted for in mesos, and cleaning them up is 
> also a headache. We have a tool called docker-reaper that automatically 
> reaps the orphaned containers once the executor goes away. In addition, we 
> run a few cron jobs that clean up any leftover containers.
> We are in the process of containerizing the process that runs the tests. We 
> also want to delegate lifecycle management of the docker containers to mesos 
> and get rid of the custom executor. We looked at a few alternatives and 
> decided to go with pods because they provide the all-or-nothing (atomicity) 
> semantics that our application needs. But we cannot use pods directly, 
> because all the containers in a pod share the same network namespace. Our 
> service discovery mechanism requires all the containers to have separate 
> IPs. All of our microservices bind to  container port, so we will have 
> port collisions unless we give separate namespaces to all the containers in 
> a pod.
> *Proposal:* I am planning to allow nested containers to have separate 
> namespaces. If the NetworkInfo protobuf for a nested container is not empty, 
> then we will assign separate mnt and network namespaces to that nested 
> container. Otherwise, it will share the network and mount namespaces with 
> the parent/root container.
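To illustrate the proposal, here is a hypothetical sketch of a task group in 
which each nested task carries a non-empty NetworkInfo, which under the 
proposal would request its own network namespace. Field names mirror the Mesos 
protobufs, but the data shapes, task names, and the CNI network name are 
illustrative assumptions, not a real client call:

```python
# Hypothetical JSON-style sketch of the proposed behavior: each task in a
# task group has its own ContainerInfo with a non-empty network_infos list,
# which (under the proposal) would give that nested container a separate
# network namespace instead of sharing the parent's.

def make_task(name, cni_network):
    return {
        "name": name,
        "container": {
            "type": "MESOS",
            # Non-empty NetworkInfo => request a separate network namespace.
            "network_infos": [{"name": cni_network}],
        },
    }

task_group = {"tasks": [make_task("service-a", "cni-net"),
                        make_task("service-b", "cni-net")]}
```

With this shape, each service could bind the same port without collisions, 
since each nested container would get its own IP from the named CNI network.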





[jira] [Commented] (MESOS-8534) Allow nested containers in TaskGroups to have separate network namespaces

2018-02-20 Thread Sagar Sadashiv Patwardhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370501#comment-16370501
 ] 

Sagar Sadashiv Patwardhan commented on MESOS-8534:
--

[~alexr] We can document this behavior and add a check to validate that 
HTTP/TCP health checks are not set for nested containers that request separate 
namespaces (i.e. have NetworkInfos set in their ContainerInfo). If people are 
interested in using an HTTP health check, we can ask them to use a command 
check with `curl` instead, since the two are roughly equivalent. I will open a 
follow-up ticket to fix HTTP and TCP health checks; we just need to find the 
PID of the container and clone the health-check process with the namespaces of 
the target nested container.
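As a sketch of the suggested workaround, the command check below stands in for 
an HTTP health check by shelling out to `curl`. The port, path, and timing 
values are placeholder assumptions; the field names follow the Mesos 
HealthCheck protobuf:

```python
# Sketch of a COMMAND health check that substitutes for an HTTP check by
# running `curl` (which, for command checks on nested containers, executes
# inside the target container's namespaces). Port/path are placeholders.

def curl_health_check(port, path="/status", interval=10.0, timeout=5.0):
    return {
        "type": "COMMAND",
        "interval_seconds": interval,
        "timeout_seconds": timeout,
        "command": {
            "shell": True,
            # --fail makes curl exit non-zero on HTTP errors, so the
            # check's pass/fail tracks the endpoint's status code.
            "value": "curl --fail http://127.0.0.1:%d%s" % (port, path),
        },
    }

check = curl_health_check(8888)
```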

> Allow nested containers in TaskGroups to have separate network namespaces
> -
>
> Key: MESOS-8534
> URL: https://issues.apache.org/jira/browse/MESOS-8534
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Sagar Sadashiv Patwardhan
>Priority: Minor
>  Labels: cni
>
> As per the discussion with [~jieyu] and [~avinash.mesos], I am going to 
> allow nested containers in TaskGroups to have separate namespaces. I am also 
> going to retain the existing functionality, where nested containers can 
> share namespaces with the parent/root container.





[jira] [Comment Edited] (MESOS-8534) Allow nested containers in TaskGroups to have separate network namespaces

2018-02-15 Thread Sagar Sadashiv Patwardhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362737#comment-16362737
 ] 

Sagar Sadashiv Patwardhan edited comment on MESOS-8534 at 2/16/18 1:16 AM:
---

[~alexr] Yes, this will affect both HTTP and TCP health checks. Let me figure 
out what can be done to retain the existing functionality.


was (Author: sagar8192):
[~alexr] Yes, I think this will affect both HTTP and TCP healthchecks. Let me 
figure what can be done to retain the existing functionality.

> Allow nested containers in TaskGroups to have separate network namespaces
> -
>
> Key: MESOS-8534
> URL: https://issues.apache.org/jira/browse/MESOS-8534
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Sagar Sadashiv Patwardhan
>Priority: Minor
>  Labels: cni
>
> As per the discussion with [~jieyu] and [~avinash.mesos], I am going to 
> allow nested containers in TaskGroups to have separate namespaces. I am also 
> going to retain the existing functionality, where nested containers can 
> share namespaces with the parent/root container.





[jira] [Commented] (MESOS-8534) Allow nested containers in TaskGroups to have separate network namespaces

2018-02-15 Thread Sagar Sadashiv Patwardhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366490#comment-16366490
 ] 

Sagar Sadashiv Patwardhan commented on MESOS-8534:
--

I discussed this with [~jieyu] today. Making TCP and HTTP health checks work is 
not straightforward and will require a lot of work. He suggested that we use a 
command check instead. Command checks for nested containers already execute 
commands under the target nested container's namespaces, so we can use `curl 
127.0.0.1:` instead of an HTTP health check. This solution works for our use 
case.

> Allow nested containers in TaskGroups to have separate network namespaces
> -
>
> Key: MESOS-8534
> URL: https://issues.apache.org/jira/browse/MESOS-8534
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Sagar Sadashiv Patwardhan
>Priority: Minor
>  Labels: cni
>
> As per the discussion with [~jieyu] and [~avinash.mesos], I am going to 
> allow nested containers in TaskGroups to have separate namespaces. I am also 
> going to retain the existing functionality, where nested containers can 
> share the parent/root container's namespaces.





[jira] [Commented] (MESOS-8534) Allow nested containers in TaskGroups to have separate network namespaces

2018-02-13 Thread Sagar Sadashiv Patwardhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362737#comment-16362737
 ] 

Sagar Sadashiv Patwardhan commented on MESOS-8534:
--

[~alexr] Yes, I think this will affect both HTTP and TCP health checks. Let me 
figure out what can be done to retain the existing functionality.

> Allow nested containers in TaskGroups to have separate network namespaces
> -
>
> Key: MESOS-8534
> URL: https://issues.apache.org/jira/browse/MESOS-8534
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Sagar Sadashiv Patwardhan
>Priority: Minor
>  Labels: cni
>
> As per the discussion with [~jieyu] and [~avinash.mesos], I am going to 
> allow nested containers in TaskGroups to have separate namespaces. I am also 
> going to retain the existing functionality, where nested containers can 
> share the parent/root container's namespaces.





[jira] [Created] (MESOS-8534) Allow nested containers in TaskGroups to have separate network namespaces

2018-02-01 Thread Sagar Sadashiv Patwardhan (JIRA)
Sagar Sadashiv Patwardhan created MESOS-8534:


 Summary: Allow nested containers in TaskGroups to have separate 
network namespaces
 Key: MESOS-8534
 URL: https://issues.apache.org/jira/browse/MESOS-8534
 Project: Mesos
  Issue Type: Task
Reporter: Sagar Sadashiv Patwardhan


This is a placeholder. I will fill in more details after I have them.





[jira] [Updated] (MESOS-7882) Mesos master rescinds all the in-flight offers from all the registered agents when a new maintenance schedule is posted for a subset of slaves

2017-08-11 Thread Sagar Sadashiv Patwardhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sadashiv Patwardhan updated MESOS-7882:
-
Description: 
We are running mesos 1.1.0 in production. We use a custom autoscaler for 
scaling our mesos cluster up and down. While scaling down the cluster, the 
autoscaler makes a POST request to the mesos master's /maintenance/schedule 
endpoint with a set of slaves to move to maintenance mode. This forces the 
mesos master to rescind all the in-flight offers from *all the slaves* in the 
cluster. If our scheduler accepts one of these offers, then we get a TASK_LOST 
status update back for that task. We also see log lines like these 
(https://gist.github.com/sagar8192/8858e7cb59a23e8e1762a27571824118) in the 
mesos master logs.

After reading the code (ref: 
https://github.com/apache/mesos/blob/master/src/master/master.cpp#L6772), it 
appears that offers are rescinded for all the slaves. I am not sure what the 
expected behavior is here, but it would make more sense if only resources from 
slaves marked for maintenance were reclaimed.

*Experiment:*
To verify that this is actually happening, I checked out the master branch 
(sha: a31dd52ab71d2a529b55cd9111ec54acf7550ded) and added some log lines 
(https://gist.github.com/sagar8192/42ca055720549c5ff3067b1e6c7c68b3). I built 
the binary and started a mesos master and 2 agent processes, and used a basic 
Python framework that launches docker containers on these slaves. I verified 
that there was no existing schedule for any slave using `curl 
10.40.19.239:5050/maintenance/status`, then posted a maintenance schedule for 
one of the slaves 
(https://gist.github.com/sagar8192/fb65170240dd32a53f27e6985c549df0) after 
starting the mesos framework.

*Logs:*
mesos-master: https://gist.github.com/sagar8192/91888419fdf8284e33ebd58351131203
mesos-slave1: https://gist.github.com/sagar8192/3a83364b1f5ffc63902a80c728647f31
mesos-slave2: https://gist.github.com/sagar8192/1b341ef2271dde11d276974a27109426
Mesos framework: 
https://gist.github.com/sagar8192/bcd4b37dba03bde0a942b5b972004e8a

I think mesos should rescind offers and inverse offers only for the slaves 
that are marked for maintenance (draining mode).

  was:
We are running mesos 1.1.0 in production. We use a custom autoscaler for 
scaling our mesos  cluster up and down. While scaling down the cluster, 
autoscaler makes a POST request to mesos master /maintenance/schedule endpoint 
with a set of slaves to move to maintenance mode. This forces mesos master to 
rescind all the in-flight offers from *all the slaves* in the cluster. If our 
scheduler accepts one of these offers, then we get a TASK_LOST status update 
back for that task. We also see such 
(https://gist.github.com/sagar8192/8858e7cb59a23e8e1762a27571824118) log lines 
in mesos master logs.

After reading the code(refs: 
https://github.com/apache/mesos/blob/master/src/master/master.cpp#L6772), it 
appears that offers are getting rescinded for all the slaves. I am not sure 
what is the expected behavior here, but it makes more sense if only resources 
from slaves marked for maintenance are reclaimed.

Experiment:
To verify that it is actually happening, I checked out the master branch(sha: 
a31dd52ab71d2a529b55cd9111ec54acf7550ded ) and added some log 
lines(https://gist.github.com/sagar8192/42ca055720549c5ff3067b1e6c7c68b3). 
Built the binary and started a mesos master and 2 agent processes. Used a basic 
python framework that launches docker containers on these slaves. Verified that 
there is no existing schedule for any slaves using `curl 
10.40.19.239:5050/maintenance/status`. Posted maintenance schedule for one of 
the slaves(https://gist.github.com/sagar8192/fb65170240dd32a53f27e6985c549df0) 
after starting the mesos framework.

Logs:
mesos-master: https://gist.github.com/sagar8192/91888419fdf8284e33ebd58351131203
mesos-slave1: https://gist.github.com/sagar8192/3a83364b1f5ffc63902a80c728647f31
mesos-slave2: https://gist.github.com/sagar8192/1b341ef2271dde11d276974a27109426
Mesos framework: 
https://gist.github.com/sagar8192/bcd4b37dba03bde0a942b5b972004e8a

I think mesos should rescind offers and inverse offers only for those slaves 
that are marked for maintenance(draining mode).


> Mesos master rescinds all the in-flight offers from all the registered agents 
> when a new maintenance schedule is posted for a subset of slaves
> --
>
> Key: MESOS-7882
> URL: https://issues.apache.org/jira/browse/MESOS-7882
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.3.0
> Environment: Ubuntu 14.04 (trusty)
> Mesos master branch.
> SHA: a31dd52ab71d2a529b55cd9111ec54acf7550ded
>Reporter: Sagar Sadashiv Patwardhan
>

[jira] [Created] (MESOS-7882) Mesos master rescinds all the in-flight offers from all the registered agents when a new maintenance schedule is posted for a subset of slaves

2017-08-11 Thread Sagar Sadashiv Patwardhan (JIRA)
Sagar Sadashiv Patwardhan created MESOS-7882:


 Summary: Mesos master rescinds all the in-flight offers from all 
the registered agents when a new maintenance schedule is posted for a subset of 
slaves
 Key: MESOS-7882
 URL: https://issues.apache.org/jira/browse/MESOS-7882
 Project: Mesos
  Issue Type: Bug
  Components: master
Affects Versions: 1.3.0
 Environment: Ubuntu 14.04 (trusty)
Mesos master branch.
SHA: a31dd52ab71d2a529b55cd9111ec54acf7550ded
Reporter: Sagar Sadashiv Patwardhan
Priority: Minor


We are running mesos 1.1.0 in production. We use a custom autoscaler for 
scaling our mesos cluster up and down. While scaling down the cluster, the 
autoscaler makes a POST request to the mesos master's /maintenance/schedule 
endpoint with a set of slaves to move to maintenance mode. This forces the 
mesos master to rescind all the in-flight offers from *all the slaves* in the 
cluster. If our scheduler accepts one of these offers, then we get a TASK_LOST 
status update back for that task. We also see log lines like these 
(https://gist.github.com/sagar8192/8858e7cb59a23e8e1762a27571824118) in the 
mesos master logs.

After reading the code (ref: 
https://github.com/apache/mesos/blob/master/src/master/master.cpp#L6772), it 
appears that offers are rescinded for all the slaves. I am not sure what the 
expected behavior is here, but it would make more sense if only resources from 
slaves marked for maintenance were reclaimed.

Experiment:
To verify that this is actually happening, I checked out the master branch 
(sha: a31dd52ab71d2a529b55cd9111ec54acf7550ded) and added some log lines 
(https://gist.github.com/sagar8192/42ca055720549c5ff3067b1e6c7c68b3). I built 
the binary and started a mesos master and 2 agent processes, and used a basic 
Python framework that launches docker containers on these slaves. I verified 
that there was no existing schedule for any slave using `curl 
10.40.19.239:5050/maintenance/status`, then posted a maintenance schedule for 
one of the slaves 
(https://gist.github.com/sagar8192/fb65170240dd32a53f27e6985c549df0) after 
starting the mesos framework.

Logs:
mesos-master: https://gist.github.com/sagar8192/91888419fdf8284e33ebd58351131203
mesos-slave1: https://gist.github.com/sagar8192/3a83364b1f5ffc63902a80c728647f31
mesos-slave2: https://gist.github.com/sagar8192/1b341ef2271dde11d276974a27109426
Mesos framework: 
https://gist.github.com/sagar8192/bcd4b37dba03bde0a942b5b972004e8a

I think mesos should rescind offers and inverse offers only for the slaves 
that are marked for maintenance (draining mode).
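The maintenance-schedule request described above can be sketched as follows. 
The hostname and window values are placeholders; the payload structure follows 
the Mesos maintenance API (windows of machine_ids plus an unavailability):

```python
# Sketch of the maintenance-schedule payload POSTed to the master's
# /maintenance/schedule endpoint. Hostnames and times are placeholders.

def maintenance_schedule(hostnames, start_ns, duration_ns):
    return {
        "windows": [{
            "machine_ids": [{"hostname": h} for h in hostnames],
            "unavailability": {
                "start": {"nanoseconds": start_ns},
                "duration": {"nanoseconds": duration_ns},
            },
        }]
    }

schedule = maintenance_schedule(["agent-1.example.com"],
                                start_ns=1502409600 * 10**9,
                                duration_ns=3600 * 10**9)

# A real request would POST this JSON, e.g.:
#   curl -X POST http://<master>:5050/maintenance/schedule \
#        -H 'Content-Type: application/json' -d '<schedule JSON>'
```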


