----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28920/#review65000 -----------------------------------------------------------
I haven't had time to complete this review, but I wanted to give you what I have so far. This is all fantastic and I really appreciate you doing this! I'm excited to start using this implementation. You should update configuration-reference.md and deploying-aurora-scheduler.md in the docs dir to explain these changes. You should state minimum mesos slave version as well explain minimum docker version required on the slaves. You've made a bunch of changes to the executor and the runner to make them docker aware. I'd like those two components to not know about docker and I think we could really simplify this patch if we remove all that. I think these changes are wholly unnecessary. We should chat on IRC or Facetime about this. I don't understand how users are managed in this world. Docker images have their own /etc/passwd, so it seems to me that the unix user the mesos task runs as (same as aurora role) needs to exist inside that /etc/passwd. If that's not the case I'm confused and want to understand what user (username and id) the task does run as inside and outside the container. Allowing the aurora user to set volumes is a security nightmare. before docker we're using unix permissions to control security. We basically said: your aurora role is your unix user and that's all the permissions you get on a host. The docker daemon runs as a priviledged user, so now you can tell docker to bind mount in files from all over the system that your unix user didn't previously have permissions to read. So at a minimum, we should have a flag to disable the volumes feature with a big red warning flag in docs telling people what security issues they're signing up for whenever they enable it. api/src/main/thrift/org/apache/aurora/gen/api.thrift <https://reviews.apache.org/r/28920/#comment107880> extraneous api/src/main/thrift/org/apache/aurora/gen/api.thrift <https://reviews.apache.org/r/28920/#comment107879> extraneous examples/jobs/docker/hello_docker.aurora <https://reviews.apache.org/r/28920/#comment107914> I would simplify this example by running an inline bash script rather than a python script that is added to the container. hello_world_proc = Process( name="hello_process", cmdline=""" while true; do echo "hello world!" sleep 2 done """ I'd set the image to "busybox" which is a small container that is used a lot of other docker examples. This means that you can also remove that pkg_checksum trick with is distracting to someone trying to understand how to use docker with aurora and looks at this example. I'd remove examples/jobs/docker/hello_docker.aurora and examples/jobs/docker/Dockerfile examples/jobs/docker/hello_docker.aurora <https://reviews.apache.org/r/28920/#comment107919> remove announcer to simplify example. announcer has nothing to do with docker. examples/jobs/docker/thermos_executor.sh <https://reviews.apache.org/r/28920/#comment107923> I don't think setting LIBPROCESS_IP is necessary here and I don't think you set the second arg to this anywhere in the script. Perhaps this unintentionally snuck into this patch? examples/jobs/docker/thermos_executor.sh <https://reviews.apache.org/r/28920/#comment107929> I'm pretty sure you can do away with this shell script all together. First, I think the executor should have no docker specific knowledge. I have a comment on how to do that around the sandbox stuff (see below). Once the executor doesn't need docker specific flags, just set executorinfo.commandinfo.value to ${MESOS_SANDBOX}./thermos_executor <args> This will work because $MESOS_SANDBOX is only set when we're inside docker (which is, IMHO, a bug in mesos, they should set it in both cases. We should file a jira against mesos for that). src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java <https://reviews.apache.org/r/28920/#comment107896> use $MESOS_SANDBOX rather than /mnt/mesos/sandbox. See https://github.com/apache/mesos/blob/master/docs/docker-containerizer.md which says: "map the sandbox directory into the Docker container and set the directory mapping to the MESOS_SANDBOX environment variable" src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java <https://reviews.apache.org/r/28920/#comment107907> I would do away with this TaskConfig hasProcesses field. You should just use config.isSetExecutorConfig() (and of course not set executor config in the python client when there are no processes.) src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java <https://reviews.apache.org/r/28920/#comment107925> info is too verbose here. src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java <https://reviews.apache.org/r/28920/#comment107903> This is confusing to the user and I think unnecessary. It seems like the user should be able to use whatever wrapper script they want (why does it have to end in .sh?) The requirements of what the wrapper script must do should be explained in the deploying-aurora doc src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java <https://reviews.apache.org/r/28920/#comment107905> To make this log message useful to an operations person reading it, it should explain which job this message pertains to. src/main/python/apache/aurora/config/thrift.py <https://reviews.apache.org/r/28920/#comment107927> s/Task/Job/ task is a mesos construct, job is an aurora construct. src/main/python/apache/aurora/executor/aurora_executor.py <https://reviews.apache.org/r/28920/#comment107928> I think what you're trying to accomplish here with all these changes to the executor can be done with a much more trival and non-docker specific change: just make the DefaultSandbox directory configurable via an env var. Just change SANDBOX_NAME = 'sandbox' to something like this: SANDBOX_NAME = os.getenv('AURORA_SANDBOX') or 'sandbox' Then, of course, you'll have to set that in the protobuf message you send to mesos in executorinfo.commandinfo.value. src/main/python/apache/aurora/executor/thermos_task_runner.py <https://reviews.apache.org/r/28920/#comment107931> I'm lost with these role changes to the thermos_runner. Could you give me an overview of what this does and why it is necessary? src/main/python/apache/thermos/config/schema_base.py <https://reviews.apache.org/r/28920/#comment107932> I don't think name is used anywhere, so we should just remove it. - Jay Buffington On Dec. 11, 2014, 6:16 a.m., Steve Niemitz wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/28920/ > ----------------------------------------------------------- > > (Updated Dec. 11, 2014, 6:16 a.m.) > > > Review request for Aurora, Jay Buffington, Kevin Sweeney, and Bill Farner. > > > Bugs: AURORA-633 > https://issues.apache.org/jira/browse/AURORA-633 > > > Repository: aurora > > > Description > ------- > > This change adds support for launching docker containers through aurora. > These changes are based off of the discussion in > https://issues.apache.org/jira/browse/AURORA-633 > > As of now, a special thermos_executor.sh script is needed to launch the > executor inside docker containers. A sample script is in > examples/jobs/docker, as well as an example aurora file. > > In addition, mesos-slave must be run with `--containerizers=docker,mesos`, > the example upstart config in examples/vagrant/upstart has been updated to > reflect this. > > The thermos root path defaults to /var/run/thermos, however if a different > path is used, it must be passed to the scheduler via > `--thermos_observer_root=<some path>` > > > Diffs > ----- > > Vagrantfile f8b7db8eebdc6a10989de3bc9a2c3e89ce17f5fc > api/src/main/thrift/org/apache/aurora/gen/api.thrift > 5665c69cd7b49c3fd7345074c9f16a3b224496ab > api/src/main/thrift/org/apache/thermos/thermos_internal.thrift > 2c449a491bc5a8ac858ea6487e4cef0591f36f66 > examples/jobs/docker/Dockerfile PRE-CREATION > examples/jobs/docker/hello_docker.aurora PRE-CREATION > examples/jobs/docker/hello_docker.py PRE-CREATION > examples/jobs/docker/thermos_executor.sh PRE-CREATION > examples/vagrant/aurorabuild.sh 69983d0140b76c6869cd04e55d760f3e3a1e4262 > examples/vagrant/upstart/mesos-slave.conf > 512ce7ecf34042ed68dda55efb2dd0415f8469db > src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java > 72c7545e7f16549f6a9ccb5fb74a06f154a7ea94 > src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java > ead9d28100673440168a32d114ecaa15874978a6 > src/main/java/org/apache/aurora/scheduler/base/CommandUtil.java > d885b224ec5a1d529347d84e03ba98ab6734a126 > src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java > 5bf283062c9d119ff91ed45da8b236e36d0fc9aa > src/main/python/apache/aurora/config/thrift.py > ba94ac3c0cbaf3c91eb1a1d86a244ed6fa3b649c > src/main/python/apache/aurora/executor/aurora_executor.py > 636b23d30a897b557eb8c3f8733c90b23cb807ef > src/main/python/apache/aurora/executor/bin/thermos_executor_main.py > 9df9b4b79c0c7d29c5088409bf15c0d32a621df0 > src/main/python/apache/aurora/executor/common/sandbox.py > f47a32b3fefb4a89940b1ddc473b8316ac00df12 > src/main/python/apache/aurora/executor/thermos_task_runner.py > 5e4bd65537d186459003c0b9434f1b769e04f448 > src/main/python/apache/thermos/bin/thermos_runner.py > 647de2771f301b17de33d8b45198c211d2e84367 > src/main/python/apache/thermos/config/schema_base.py > f9143cc1b83143d6147f59d90c79435d055d0518 > src/main/python/apache/thermos/core/runner.py > 8aac6b50c66080abbb5308b367e9f74c487f42e3 > src/main/python/apache/thermos/observer/task_observer.py > cd528dcca3f5a330359cf38005f3a1a0329a4886 > src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java > 5e54364a49a208bd5f19b9649633dc8feca591e9 > > src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java > ddcb511d108220ab5e4efcf3496458f7ab4a20c2 > src/test/python/apache/aurora/executor/test_thermos_executor.py > 503e62f4cac872b14f6985b5bccc3e4dfcf81789 > > Diff: https://reviews.apache.org/r/28920/diff/ > > > Testing > ------- > > > Thanks, > > Steve Niemitz > >