[ 
https://issues.apache.org/jira/browse/MESOS-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937104#comment-13937104
 ] 

Till Toenshoff commented on MESOS-816:
--------------------------------------

Triggered by a comment on the RR, I would like to open a discussion on the 
following question: does it make sense to use the posix isolation as a 
fallback if the external containerizer does not support "launch" for a 
container? The current RR assumes that such a missing implementation is a 
failure, resulting in termination of that container process.
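
To make the two options concrete, here is a minimal sketch in Python. It is 
not code from the RR: LaunchNotSupported, launch_container and the 
fallback_to_posix switch are hypothetical names chosen only for illustration, 
and the two launch callables stand in for whatever the external containerizer 
and the posix isolation actually do.

```python
from typing import Callable


class LaunchNotSupported(Exception):
    """Hypothetical signal that the external containerizer has no "launch"."""


def launch_container(container_id: str,
                     external_launch: Callable[[str], str],
                     posix_launch: Callable[[str], str],
                     fallback_to_posix: bool) -> str:
    """Try the external containerizer first; optionally fall back to posix."""
    try:
        return external_launch(container_id)
    except LaunchNotSupported:
        if not fallback_to_posix:
            # Behaviour assumed by the current RR: a missing "launch" is a failure.
            raise
        # Behaviour under discussion: let the posix isolation handle the launch.
        return posix_launch(container_id)
```

With fallback_to_posix set to False the exception propagates, which matches 
what the RR does today; with True the container is launched by the posix path 
instead.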

> Allow delegation to shell scripts for isolation
> -----------------------------------------------
>
>                 Key: MESOS-816
>                 URL: https://issues.apache.org/jira/browse/MESOS-816
>             Project: Mesos
>          Issue Type: Improvement
>          Components: isolation, slave
>            Reporter: Jason Dusek
>            Priority: Minor
>         Attachments: mesos-shell-isolator.jpg
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Being able to delegate isolation to shell scripts could make it easier to 
> leverage the machinery provided by the LXC tools, LibVirt, VirtualBox, Docker 
> and similar containerization systems.
> Why go through command line tools for isolation? We have seen many requests 
> for isolation, covering a wide variety of scenarios:
> - Setups requiring multiple versions of the same language (Ruby 1.8, Ruby 
> 1.9).
> - Setups requiring installation and configuration of RPM-packaged 
> applications.
> - Build-and-test setups, where sharing the environment of the host would 
> impact reproducibility.
> - Integration of 3rd party, service-oriented applications.
> - Launching applications with Docker.
> - Launching multiple instances of a Mesos framework that, like Hadoop, has 
> significant system setup and dependencies.
> To cover these and other use cases, it seems reasonable to allow Mesos to 
> delegate to external programs for isolation:
> - It makes it easier to experiment with new containerization tools.
> - It allows for site administrators to customize containerization, or even 
> implement new containerization mechanisms, without impacting their ability to 
> keep pace with Mesos development.
> - Many external programs exist for containerization -- Docker, LXC tools, 
> LibVirt -- which handle a great deal of the book-keeping around finding and 
> efficiently cloning disk images and setting up the guest system (its 
> hostname, TTYs, /dev/*, /proc).
> The scenarios listed above can be understood in terms of three use cases:
> - The containerized system service scenario, wherein an application, 
> installed with RPM or a similar tool, is started and managed by the init 
> system within a container. Percona MySQL is an example of such an application.
> - The containerized application scenario, wherein an application is installed 
> or unpacked and then configured and launched in a single command. For 
> example, running a custom Rails app with bundle install && bundle exec rails.
> - The containerized framework/executor scenario, wherein the application is 
> Spark, Hadoop or another Mesos framework/executor pair.
> One way to achieve this could be to introduce an External Isolator, which 
> works in parallel with the existing process/posix and cgroups isolators. The 
> responsibility of this isolator would be to act as a thin layer to external 
> isolators. Calls for task launching, stopping or any other resource change 
> would be serialized and passed to the external isolators by the Mesos 
> External Isolator. 
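
A rough sketch of the "thin layer" idea described above, in Python. The 
description does not fix a wire format or a calling convention, so both are 
assumptions here: the call is serialized as JSON, the action name is passed 
as the last command line argument, and the request is written to the external 
program's stdin. delegate_call and its parameters are hypothetical names.

```python
import json
import subprocess
import sys
from typing import Any, Dict, List


def delegate_call(program: List[str], action: str, container_id: str,
                  payload: Dict[str, Any]) -> str:
    """Serialize one isolator call and hand it to an external program.

    Assumed convention (not from the proposal): JSON request on stdin,
    action name as the final argument, reply read from stdout.
    """
    request = json.dumps({
        "action": action,
        "container_id": container_id,
        "payload": payload,
    })
    result = subprocess.run(
        program + [action],
        input=request,
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    return result.stdout
```

Any executable that reads the request from stdin could sit behind this, 
whether it is a shell script wrapping the LXC tools or a small Docker wrapper.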
> Allowing for pluggable isolators invites the possibility of having different 
> isolators per task. For applications using containers, it's reasonable that 
> each application or framework can specify a different base image; and this 
> would be an option passed to the corresponding isolator. One can also imagine 
> specialized frameworks that need to disable isolation entirely. For example, 
> a "system backup" framework that would specify a null isolator to allow it to 
> snapshot interesting data on each slave and transfer it to a sanctioned 
> storage location.
> However, having users and framework authors specify isolators would both 
> harm portability and make isolation their problem, no longer something 
> handled transparently by Mesos. Furthermore, it would have the unintended 
> effect of putting them at odds with site administrators, who would also 
> specify isolators -- as a command line option for each slave.
> Allowing tasks to carry a more abstract notion of "container" with them would 
> allow for most application level scenarios we've outlined above.  
> Theoretically, more than one isolator might be able to handle a given 
> container. For example, if the container is specified as an "ISO" and a 
> distro LiveCD is provided, one could imagine a Docker isolator, LXC isolator 
> or VirtualBox isolator handling it. Encouraging users and framework authors 
> to specify a container would be simpler for them than specifying isolator 
> flags, allows them to more clearly document their intent, and reduces the 
> scope for conflict with other parties who have an interest in upgrading and 
> tuning isolation. It also makes applications and command examples more 
> portable, by decoupling the isolation mechanism from the desired container 
> layout (which is, more or less, a chroot with some files in it).
> To this end, we propose adding an optional ContainerInfo to each CommandInfo:
>     message CommandInfo {
>       message ContainerInfo {
>         required bytes image = 1;
>         repeated bytes options = 2;
>       }
>       ...
>       optional ContainerInfo container = 4;
>     }
> The first field of the ContainerInfo should indicate the image, perhaps as a 
> URL. For example:
>     docker:///johncosta/redis
>     iso+http://mirrors.kernel.org/knoppix/KNOPPIX_V7.2.0CD-2013-06-16-EN.iso
>     lxc:///ubuntu
> The scheme of the URL -- recognizable as a string of letters and digits and 
> perhaps plusses, dots and dashes preceding the first `://`, per RFC 3986 -- 
> serves to indicate the type of the container, which isolators can use to 
> determine both what to do with a container and how to obtain it. For the 
> Docker URL type, for example, the absence of a host between the second and 
> third slashes could be interpreted to mean that the image should be fetched 
> from the Docker index or from a locally configured default Docker image 
> server; whereas if a hostname is given, it is treated as the image server to 
> use.
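
The URL handling described above could look roughly like the following Python 
sketch. ImageRef and parse_image_url are hypothetical names, and treating an 
empty host as "use the isolator's default image server" is simply the 
interpretation suggested in the text for Docker URLs, not settled behaviour.

```python
import re
from typing import NamedTuple, Optional

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_IMAGE_URL = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*)://([^/]*)(/.*)?$")


class ImageRef(NamedTuple):
    scheme: str           # container type, e.g. "docker" or "iso+http"
    host: Optional[str]   # None means no host was given in the URL
    path: str             # image name or path, without the leading slash


def parse_image_url(url: str) -> ImageRef:
    """Split a container image URL into scheme, optional host and path."""
    match = _IMAGE_URL.match(url)
    if match is None:
        raise ValueError(f"not a container image URL: {url!r}")
    scheme, host, path = match.groups()
    # Schemes are case-insensitive per RFC 3986, so normalize to lower case.
    return ImageRef(scheme.lower(), host or None, (path or "").lstrip("/"))
```

An isolator would then dispatch on the scheme and, when the host is None, 
fall back to its locally configured default image server.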
> The addition of "options" to the ContainerInfo poses a risk to portability 
> and warrants both explanation and justification. In the case of Docker URLs, 
> for example, it is possible to mount additional filesystems on the Docker 
> command line; and these filesystems can even be indicated by reference to 
> another Docker container by name. Support for this feature is clearly tied to 
> the Docker URL and its meaning.
> When the default isolator for a slave is specified, there may also be a 
> default container specified. It is good for us, then, that the ContainerInfo 
> structure maps cleanly to an array of byte strings, since this is an easy 
> thing to handle from the command line.
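
One possible reading of "maps cleanly to an array of byte strings", sketched 
in Python: the image comes first and the options follow. The function names 
are hypothetical, and the ordering is an assumption, since the description 
does not spell out the layout.

```python
from typing import List, Sequence, Tuple


def container_to_args(image: bytes, options: Sequence[bytes]) -> List[bytes]:
    """Flatten a ContainerInfo-like pair into a single list of byte strings."""
    return [image, *options]


def args_to_container(args: Sequence[bytes]) -> Tuple[bytes, List[bytes]]:
    """Rebuild (image, options) from a flat list; the image is required."""
    if not args:
        raise ValueError("a container needs at least an image")
    return args[0], list(args[1:])
```

A default container given on the slave command line could then be passed as 
plain arguments and turned back into image and options on the other side.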
> Now in practice, how will we use the ContainerInfo? In the three use cases 
> outlined above -- service container, command container and containerized 
> executor -- tasks needing a special container will specify an ExecutorInfo in 
> the TaskInfo and not a bare CommandInfo. The ContainerInfo would then be part 
> of the CommandInfo embedded in the ExecutorInfo.
> To consider a specific case, were the Storm framework packaged in a 
> container, then the same container could be used both for Nimbus and the 
> worker nodes:
> * Nimbus would be launched with a TaskInfo requesting the container and 
> launching Nimbus.
>         TaskInfo {
>           executor = ExecutorInfo {
>             command = CommandInfo {
>               value = "python /opt/storm/bin/storm go"
>               containerInfo = ContainerInfo {
>                 image = "docker:///storm-mesos/latest"
>                 options = [ "-p", "1337:8000" ]
>               }
>             }
>             ...
>           }
>           ...
>         }
> * Nimbus would launch executors with a TaskInfo requesting the very same 
> container, but specifying a different command.
>         TaskInfo {
>           executor = ExecutorInfo {
>             command = CommandInfo {
>               value = "curl -sSfL http://storm.server:1337/conf/storm.yaml -o 
> /opt/storm/conf/storm.yaml && python /opt/storm/bin/storm supervisor 
> storm.mesos.MesosSupervisor"
>               containerInfo = ContainerInfo {
>                 image = "docker:///storm-mesos/latest"
>               }
>             }
>             ...
>           }
>           ...
>         }
> While in the near term we expect container URLs to be pretty specific to the 
> containerization mechanism, let us hope for a glorious future with URLs like 
> `img:///ubuntu-13.04` that point to well-known, portable images.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
