Thanks for the proposal! Sorry about the late reply on this. My take is that we should avoid introducing APIs that let folks do the same thing in different ways. It looks like the proposal here would allow frameworks to launch pod-like workloads in two different ways (via TaskGroup or via this approach). I'd try to avoid that if possible.
If you are using a custom executor, the "containers" can be encoded in the 'data' field of a single TaskInfo and sent to the executor.

Another benefit of the current TaskGroup approach is health checking. Currently in Mesos, health checks are at the granularity of a Task (not a Container).

- Jie

On Wed, Jun 21, 2017 at 12:28 AM, tommy xiao <xia...@gmail.com> wrote:

> hi zhitao,
>
> In the task definition, one task has only one instance. Introducing
> multiple containers breaks that defined convention, so this needs more
> committer comments.
>
> 2017-06-19 23:46 GMT+08:00 Zhitao Li <zhitaoli...@gmail.com>:
>
> > Hi Tommy,
> >
> > Are you suggesting either replacing Task with Pod, or adding a new
> > concept of Pod? The former would be too destructive to users and I
> > don't see enough value. As for adding a new concept of Pod: tasks have
> > reliable status updates and are modeled all around Mesos, and I feel
> > doing that for another level of concept is not worth it.
> >
> > On Sun, Jun 18, 2017 at 10:18 PM, tommy xiao <xia...@gmail.com> wrote:
> >
> > > It looks like a Pod. How about upgrading the TASK concept to POD?
> > >
> > > 2017-06-16 23:57 GMT+08:00 Zhitao Li <zhitaoli...@gmail.com>:
> > >
> > > > Hi Ben,
> > > >
> > > > Thanks for reading the proposal. There are several motivations,
> > > > although scalability is the primary one:
> > > >
> > > > 1) w.r.t. scalability, it's not only Mesos's own scalability, but
> > > > also many *additional infra tools* which need to integrate with
> > > > Mesos and process *every* task in the cluster: a 2-3x increase in
> > > > task numbers would easily make it harder for these systems to keep
> > > > up with cluster size;
> > > > 2) Another thing we are looking at is providing a more robust and
> > > > powerful upgrade story for a pod of containers. Although such work
> > > > does not demand modeling multiple containers as one task, our
> > > > internal discussions suggest that this modeling makes it easier to
> > > > handle.
> > > > A couple of things we are specifically looking at:
> > > >
> > > > - reliable in-place upgrade: while dynamic reservation usually
> > > >   works, it's still non-trivial to provide an exact guarantee that
> > > >   the allocator/master will send back offers in time after a
> > > >   `KILL`. This is technically more related to MESOS-1280
> > > >   <https://issues.apache.org/jira/browse/MESOS-1280>.
> > > > - automatic rollback upon failed upgrade: similar to the point
> > > >   above, it would be great if the entire scheduler/Mesos stack
> > > >   could guarantee an atomic rollback. Right now this depends on the
> > > >   availability of the entire control plane (scheduler and master),
> > > >   since multiple messages need to be passed.
> > > > - zero-traffic-loss upgrade: if the workload uses primitives like
> > > >   SO_REUSEPORT <https://lwn.net/Articles/542629/>, it should be
> > > >   possible to upgrade a container without losing any customer
> > > >   traffic.
> > > >
> > > > 3) Another awkwardness of TaskGroup is that we do not really know
> > > > how to properly size a task within a group, because the tasks are
> > > > isolated by the same root container's scope; nor do we really care
> > > > from a scheduler's perspective. Sizing the sum of the containers is
> > > > far more important to us than sizing each task.
> > > > 4) Also, it seems we cannot add a new "zero resource usage" task to
> > > > a group right now, so adding or removing a container has to involve
> > > > both the "scheduling" logic and the "container upgrade" part.
> > > >
> > > > The last two points came from internal discussion with our
> > > > scheduler team. I guess they may not be as significant as the first
> > > > two, but I'm just putting them on the table.
> > > >
> > > > On Thu, Jun 15, 2017 at 2:43 PM, Benjamin Mahler <bmah...@apache.org>
> > > > wrote:
> > > >
> > > > > From reading this, the motivation is that TaskGroup having 1 task
> > > > > per container "could create a scalability issue for a large scale
> > > > > Mesos cluster since many endpoints/operations scale with the total
> > > > > number of Tasks in the cluster."
> > > > >
> > > > > Is that the only motivation here?
> > > > >
> > > > > On Thu, Jun 15, 2017 at 11:45 AM, Charles Raimbert
> > > > > <craimber...@gmail.com> wrote:
> > > > >
> > > > >> Hello All,
> > > > >>
> > > > >> As we are interested in PODs to run colocated containers under the
> > > > >> same grouping, we have been looking at TaskGroup but we have also
> > > > >> been working on a design to allow multiple containers in the same
> > > > >> Task.
> > > > >>
> > > > >> Please feel free to write your comments and suggestions on the
> > > > >> proposal draft:
> > > > >> https://docs.google.com/document/d/1Os5tXUJfJ8Op_YBZR7L8hSHqIeO1f9LY2yzKxsOdrwg
> > > > >>
> > > > >> Thanks,
> > > > >> Charles Raimbert & Zhitao Li
> > > >
> > > > --
> > > > Cheers,
> > > >
> > > > Zhitao Li
> > >
> > > --
> > > Deshi Xiao
> > > Twitter: xds2000
> > > E-mail: xiaods(AT)gmail.com
> >
> > --
> > Cheers,
> >
> > Zhitao Li
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
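
P.S. For the custom-executor option mentioned at the top of this reply (encoding the "containers" in the 'data' field of a single TaskInfo), here is a rough sketch of what the encoding could look like. It is only an illustration under assumptions: `ContainerSpec`, `FakeTaskInfo`, `encode_containers`, `decode_containers` and `build_task` are hypothetical names made up for this example, `FakeTaskInfo` is a plain stand-in rather than the real Mesos protobuf, and JSON is just one possible wire format since 'data' is opaque bytes to Mesos.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ContainerSpec:
    """Hypothetical description of one container in the pod-like task."""
    name: str
    image: str
    command: str


@dataclass
class FakeTaskInfo:
    """Stand-in for a Mesos TaskInfo; only the fields used in this sketch."""
    name: str
    task_id: str
    data: bytes = b""


def encode_containers(containers: List[ContainerSpec]) -> bytes:
    """Scheduler side: serialize the container list into opaque bytes."""
    payload = {"containers": [asdict(c) for c in containers]}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_containers(data: bytes) -> List[ContainerSpec]:
    """Executor side: recover the container list from the task's data."""
    payload = json.loads(data.decode("utf-8"))
    return [ContainerSpec(**c) for c in payload["containers"]]


def build_task() -> FakeTaskInfo:
    """Build one task that carries two container specs in its data field."""
    specs = [
        ContainerSpec(name="web", image="example/web:1", command="./serve"),
        ContainerSpec(name="sidecar", image="example/log:1", command="./ship"),
    ]
    return FakeTaskInfo(name="pod-like-task", task_id="task-1",
                        data=encode_containers(specs))


if __name__ == "__main__":
    task = build_task()
    for spec in decode_containers(task.data):
        print(spec.name, spec.image, spec.command)
```

With this shape Mesos still sees a single task, so status updates and health checks stay at task granularity; launching and supervising the individual containers is entirely up to the custom executor.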