Mesos is not only about running stateless microservices to handle HTTP
requests. There are long-running workloads that would benefit from being
rescheduled to a different host without being interrupted, e.g. to
implement dynamic bin packing in the cluster.

As for the networking issues, CRIU has shown that migration is possible
even at the socket level. Regarding IPs moving around, Project Calico
<https://www.projectcalico.org/> offers a way to do that; we tried it with
some homemade modifications, using Docker and OSPF, and it works very
well.

On Fri, Feb 19, 2016 at 11:49 AM, Sharma Podila <spod...@netflix.com> wrote:

> Moving stateless services can be trivial or a non-problem, as others have
> suggested.
> Migrating stateful services becomes a function of migrating the state,
> including any network connections, etc. To think aloud, from a bit of past
> consideration in HPC-like systems: some relied upon the underlying
> systems to support migration (vMotion, etc.), some on 3rd-party libraries
> (was that Meiosys?) that could work on existing application binaries, and
> some on libraries (BLCR
> <http://crd.lbl.gov/departments/computer-science/CLaSS/research/BLCR/>)
> that need support from the application developer. I was involved with
> providing support for BLCR-based applications. One of the challenges was
> the time to checkpoint an application with a large memory footprint, say,
> 100 GB or more, which isn't uncommon in HPC. Incremental checkpointing
> wasn't an option, at least at that point.
> Regardless, Mesos' support for checkpoint-restore would have to consider
> the type of checkpoint-restore being used. I would imagine that the core
> part of the solution would be simple-ish: providing a "workflow" for the
> checkpoint-restore system (sort of: send a signal to start the checkpoint,
> then wait a certain time for it to complete or time out). Relatively less
> simple would be the actual integration of the checkpoint-restore system
> and dealing with its constraints and idiosyncrasies.
>
>
> On Fri, Feb 19, 2016 at 4:50 AM, Dick Davies <d...@hellooperator.net>
> wrote:
>
>> Agreed, vMotion always struck me as something for those monolithic
>> apps with a lot of local state.
>>
>> The industry seems to be moving away from that as fast as its little
>> legs will carry it.
>>
>> On 19 February 2016 at 11:35, Jason Giedymin <jason.giedy...@gmail.com>
>> wrote:
>> > Food for thought:
>> >
>> > One should refrain from monolithic apps. If they're small and stateless
>> > you should be doing rolling upgrades.
>> >
>> > If you find yourself with one container and you can't easily distribute
>> > that workload by just scaling and load balancing then you have a
>> > monolith. Time to enhance it.
>> >
>> > Containers should not be treated like VMs.
>> >
>> > -Jason
>> >
>> > On Feb 19, 2016, at 6:05 AM, Mike Michel <mike.mic...@mmbash.de> wrote:
>> >
>> > The question is whether you really need this when you are moving in the
>> > world of containers/microservices, where it is about building stateless
>> > 12-factor apps, except for databases. Why move a service when you can
>> > just kill it and let the work be done by 10 other containers doing the
>> > same? I remember a talk at DockerCon about containers and live
>> > migration. It was like: "And now that you know how to do it, don't do
>> > it!"
>> >
>> >
>> >
>> > From: Avinash Sridharan [mailto:avin...@mesosphere.io]
>> > Sent: Friday, 19 February 2016 05:48
>> > To: user@mesos.apache.org
>> > Subject: Re: Feature request: move in-flight containers w/o stopping
>> > them
>> >
>> >
>> >
>> > One problem with implementing something like vMotion for Mesos is
>> > addressing seamless movement of network connectivity as well. This
>> > effectively requires moving the IP address of the container across
>> > hosts. If the container shares the host network stack, this won't be
>> > possible, since it would imply moving the host IP address from one host
>> > to another. When a container has its own network namespace, attached to
>> > the host using a bridge, moving across L2 segments might be a
>> > possibility. To move across L3 segments you will need some form of
>> > overlay (VXLAN, maybe?).
>> >
>> >
>> >
>> > On Thu, Feb 18, 2016 at 7:34 PM, Jay Taylor <outtat...@gmail.com> wrote:
>> >
>> > Is this theoretically feasible with Linux checkpoint and restore,
>> > perhaps via CRIU? http://criu.org/Main_Page
>> >
>> >
>> > On Feb 18, 2016, at 4:35 AM, Paul Bell <arach...@gmail.com> wrote:
>> >
>> > Hello All,
>> >
>> >
>> >
>> > Has there ever been any consideration of the ability to move in-flight
>> > containers from one Mesos host node to another?
>> >
>> >
>> >
>> > I see this as analogous to VMware's "vMotion" facility wherein VMs can
>> > be moved from one ESXi host to another.
>> >
>> >
>> >
>> > I suppose something like this could be useful from a load-balancing
>> > perspective.
>> >
>> >
>> >
>> > Just curious if it's ever been considered and if so - and rejected - why
>> > rejected?
>> >
>> >
>> >
>> > Thanks.
>> >
>> >
>> >
>> > -Paul
>> >
>> > --
>> >
>> > Avinash Sridharan, Mesosphere
>> >
>> > +1 (323) 702 5245
>>
>
>
