Joshua Cohen created AURORA-1632:
------------------------------------

             Summary: Investigate executor fixes when Mesos 0.30.0 stops passing along environment variables
                 Key: AURORA-1632
                 URL: https://issues.apache.org/jira/browse/AURORA-1632
             Project: Aurora
          Issue Type: Task
          Components: Executor
            Reporter: Joshua Cohen
            Priority: Blocker

In the 0.30.0 release, the Mesos Agent will no longer implicitly pass along its environment variables (see: http://mail-archives.apache.org/mod_mbox/mesos-dev/201603.mbox/%3CCAK7AWaGB24ALh8eb%2BvKMFgc4%2BjmhxZ6ry79HBcKN%2BBt04Sx43A%40mail.gmail.com%3E).

I tested in Vagrant by explicitly setting the {{--executor_environment_variables}} flag on the agent to {{'{}'}} and verified that this does impact us.

Initially we get a permission denied error when trying to fork the runner:

{noformat}
I0310 16:36:21.048671 18103 thermos_task_runner.py:275] Forking off runner with cmdline: /var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/thermos_runner.pex --setuid=vagrant --task_id=vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf --log_to_disk=DEBUG --hostname=192.168.33.7 --thermos_json=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/task.json --sandbox=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/sandbox --log_dir=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c --checkpoint_root=/var/lib/mesos/slaves/aa9f2963-947d-4582-8cec-e694d9d06e79-S0/frameworks/0f9b27e9-6b03-4b5e-9e2f-91eae9ba5c99-0003/executors/thermos-vagrant-test-http_example-0-a905b6d0-79d7-4fff-9cb2-5f5b4a6709cf/runs/56f62331-3ad4-463a-b392-3b80cc664b3c/checkpoints --process_logger_destination=file --port=aurora:31248 --port=http:31248
F0310 16:36:21.057298 18103 aurora_executor.py:80] Task initialization failed: [Errno 13] Permission denied
{noformat}

This error can be addressed with the patch from this pull request: https://github.com/apache/aurora/pull/21.

However, even after applying this patch, processes fail to fork (see attached screenshot).
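
For anyone trying to reason about the effect of an empty executor environment, the following is a minimal sketch of what a child process sees when its parent passes no variables. It is not Aurora or Mesos code. It assumes a POSIX host with Python 3, and the helper name {{child_environment}} and the variable {{EXAMPLE_MARKER}} are hypothetical. Passing {{env={}}} to the child here stands in for starting the agent with {{--executor_environment_variables='{}'}}.

```python
import json
import os
import subprocess
import sys

# Child program: dump the environment it was started with as JSON.
CHILD = "import json, os; print(json.dumps(dict(os.environ)))"


def child_environment(env=None):
    """Run a child interpreter and return the environment it observed.

    env=None -> the child inherits this process's environment.
    env={}   -> the child is started without any inherited variables.
    (Hypothetical helper, for illustration only.)
    """
    out = subprocess.check_output([sys.executable, "-c", CHILD], env=env)
    return json.loads(out.decode("utf-8"))


if __name__ == "__main__":
    # Set a marker in the parent so inheritance is easy to observe.
    os.environ["EXAMPLE_MARKER"] = "1"

    inherited = child_environment()
    empty = child_environment(env={})

    print("marker visible with inherited env:", "EXAMPLE_MARKER" in inherited)
    print("marker visible with empty env:", "EXAMPLE_MARKER" in empty)
    print("PATH visible with empty env:", "PATH" in empty)
```

The sketch only shows that anything a child previously picked up implicitly from its parent has to be supplied explicitly once the parent stops forwarding its environment. Whether a missing variable is what causes the remaining fork failures described above has not been established here.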