Re: [Proposal] Multiple Containers in Single Mesos Task

meghdoot bhattacharya Mon, 03 Jul 2017 11:55:12 -0700

A month ago we shared, via a tweet, our work on supporting Docker container
pods within a single task (namespace collapse and resource sharing with the
parent Mesos task). It addresses cases where we had to treat Mesos, Docker and
docker-compose as first-class in our ecosystem.
Slides: https://lnkd.in/gK8rNJ8
Video: https://lnkd.in/g5MAsk9
Source: https://github.com/paypal/dce-go

Mesos still provides the most flexible primitives among competing solutions
for building what you need.
I agree with Jie: the native Mesos generic pod integration via task groups and
nested containers should probably have one recommended model, and if that does
not satisfy a need, there are other ways to achieve it.
Thx
From: Yan Xu <xuj...@apple.com>
To: dev <dev@mesos.apache.org>
Sent: Wednesday, June 21, 2017 11:29 AM
Subject: Re: [Proposal] Multiple Containers in Single Mesos Task
   
---
@xujyan <https://twitter.com/xujyan>

On Fri, Jun 16, 2017 at 8:57 AM, Zhitao Li <zhitaoli...@gmail.com> wrote:

> Hi Ben,
>
> Thanks for reading the proposal. There are several motivations, although
> scalability is the primary one:
>
> 1) w.r.t. scalability, it's not only Mesos's own scalability, but also
> many *additional infra tools* which need to integrate with Mesos and
> process *every* task in the cluster: a 2-3x increase in task numbers would
> easily make it harder for these systems to keep up with cluster size;


Have you looked into what the bottleneck is for these tools?
In our experience, what hurts scalability most is not the number of tasks
but the size of the metadata that needs to be processed (per task).
I am interested in seeing if this is still an issue if we improve the APIs
by stripping out unnecessary fields, introducing an API for querying
individual tasks, etc.
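
To make the metadata point concrete, here is a minimal sketch of the kind of field stripping I have in mind, done on the consumer side. The field names, the `KEEP_FIELDS` subset and the example record are made up for illustration and are not the actual Mesos API schema.

```python
import json

# Hypothetical subset of fields a monitoring tool might actually need.
KEEP_FIELDS = ("id", "state", "slave_id")


def strip_task(task, keep=KEEP_FIELDS):
    """Return a copy of `task` containing only the fields listed in `keep`."""
    return {key: task[key] for key in keep if key in task}


def payload_size(tasks):
    """Size in bytes of the JSON encoding of a list of task dicts."""
    return len(json.dumps(tasks).encode("utf-8"))


# Illustrative task record; field names and values are invented for this sketch.
example_task = {
    "id": "web.1",
    "state": "TASK_RUNNING",
    "slave_id": "agent-1",
    "labels": [{"key": "team", "value": "infra"}] * 20,
    "statuses": [{"state": "TASK_RUNNING", "timestamp": 1498000000.0}] * 5,
}
```

Comparing `payload_size([example_task])` with `payload_size([strip_task(example_task)])` gives a rough idea of how much of the per-task cost is metadata the consumer never reads.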


>
> 2) Another thing we are looking at is to provide a more robust and powerful
> upgrade story for a pod of containers. Although such work does not demand
> modeling multiple containers to one task, our internal discussions feel
> that this modeling makes it easier to handle. A couple of things we are
> specifically looking at:
>

How do the following benefit from `additional_containers` instead of tasks?


>
>    - reliable in-place upgrade: while dynamic reservation usually works,
>    it's still non-trivial to provide an exact guarantee that allocator/master
>    will send back offers after a `KILL` in time. This is technically more
>    related to MESOS-1280
>    <https://issues.apache.org/jira/browse/MESOS-1280>.
>

This is interesting to us too; let's sync on this.


>    - automatic rollback upon failed upgrade: similar to above point, it'll
>    be great if the entire scheduler/mesos stack can guarantee an atomic
>    rollback. Right now this depends on availability of entire control plane
>    (scheduler and master) since multiple messages need to be passed.
>    - zero-traffic-loss upgrade: if the workload utilizes primitives like
>    SO_REUSEPORT <https://lwn.net/Articles/542629/>, it should be possible
>    to upgrade a container w/o losing any customer traffic.
>

If you update the task in place, you wouldn't even necessarily need to
restart the process. I assume you are talking about cases where a restart is
required, in which case Mesos would have to guarantee that your task is not
moved to a random host?
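
For reference, the SO_REUSEPORT handoff mentioned in the quoted point can be sketched as below: an "old" and a "new" listener bound to the same port on the same host. This is only a sketch under assumptions: the helper names are hypothetical, it relies on a platform that exposes `socket.SO_REUSEPORT` (e.g. Linux 3.9+), and it does not cover draining, since connections still queued on the old listener can be dropped when it closes.

```python
import socket


def listen_reuseport(port, host="127.0.0.1"):
    """Open a TCP listener with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Every socket sharing the port must set the option before binding.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


def start_old_and_new(host="127.0.0.1"):
    """Bind an 'old' listener on a free port, then a 'new' one on the same port."""
    old = listen_reuseport(0, host)     # port 0: let the kernel pick a free port
    port = old.getsockname()[1]
    new = listen_reuseport(port, host)  # allowed because both sockets set SO_REUSEPORT
    return old, new, port
```

Both listeners only coexist if they run on the same host, which is why the upgraded container must not be rescheduled elsewhere.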


>
> 3) another awkwardness of TaskGroup is that we do not really know how to
> properly size a task within a group because they are isolated by the same
> root container's scope, neither do we really care from a scheduler's
> perspective. Sizing the sum of the containers is far more important than
> sizing each task to us.
>

This is interesting. Right now resources in the tasks within the same group
aren't isolated but they are bundled together anyway.
In the long run when we start isolating them, perhaps we can make task
resources optional if they are launched by a `LaunchGroup` operation?
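
As a rough illustration of sizing the group rather than each task, the sketch below sums per-task resources into one group total. The resource names (`cpus`, `mem`) and the plain-dict representation are assumptions for the example, not the Mesos `Resource` protobuf.

```python
def group_total(task_resources):
    """Sum scalar resources across all tasks in a group.

    `task_resources` is a list of dicts mapping a resource name to a number.
    """
    total = {}
    for resources in task_resources:
        for name, amount in resources.items():
            total[name] = total.get(name, 0) + amount
    return total
```

For example, `group_total([{"cpus": 0.5, "mem": 256}, {"cpus": 1.5, "mem": 512}])` yields the amount the root container would need, regardless of how it is split between tasks.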


> 4) Also, it seems like we cannot add a new "zero resource usage" task to a
> group right now, therefore adding/removing a container has to involve both
> the "scheduling" logic, and the "container upgrade" part.
>

I guess you mean the scheduler shouldn't need to wait for new offers to
simply update the current task, so I think this is the same point as 1)?


>
> The last two points came from internal discussion with our scheduler team.
> I guess they may not be as significant as first two, but I'm just putting
> them on the table.
>
>
> On Thu, Jun 15, 2017 at 2:43 PM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > From reading this, the motivation is that TaskGroup having 1 task per
> > container "could create a scalability issue for a large scale Mesos
> > cluster since many endpoints/operations scale with the total number of
> > Tasks in the cluster."
> >
> > Is that the only motivation here?
> >
> > On Thu, Jun 15, 2017 at 11:45 AM, Charles Raimbert
> > <craimber...@gmail.com> wrote:
> >
> >> Hello All,
> >>
> >> As we are interested in PODs to run colocated containers under the same
> >> grouping, we have been looking at TaskGroup but we have also been
> >> working on a design to allow multiple containers in the same Task.
> >>
> >> Please feel free to write your comments and suggestions on the proposal
> >> draft:
> >> https://docs.google.com/document/d/1Os5tXUJfJ8Op_YBZR7L8hSHqIeO1f9LY2yzKxsOdrwg
> >>
> >> Thanks,
> >> Charles Raimbert & Zhitao Li
> >>
> >
> >
>
>
> --
> Cheers,
>
> Zhitao Li
>
