[ https://issues.apache.org/jira/browse/MESOS-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph Wu updated MESOS-7027:
-----------------------------
              Sprint: Mesosphere Sprint 50
        Story Points: 5
    Target Version/s: 1.3.0  (was: 1.2.0)
            Priority: Critical  (was: Blocker)

This is technically a regression, as the bug was introduced in this commit (1.2.x):
https://github.com/apache/mesos/commit/e3562845d7a491916df99b1b2c87b8acb4ebab3f

But I'm marking this as a non-blocker because:
* This bug is only encountered in specific installations/configurations of Mesos; particularly when the {{rpath}} of the mesos binaries does not match the installation path; or when {{LD_LIBRARY_PATH}} is explicitly specified on the agent.
** Workaround for command executor is to launch a single-task {{TaskGroup}}
* The fix may require some API fixes/changes, which may take a while to review.

> CommandExecutor ENV overwritten by Docker Image ENV in Unified Containerizer
> ----------------------------------------------------------------------------
>
>                 Key: MESOS-7027
>                 URL: https://issues.apache.org/jira/browse/MESOS-7027
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Kevin Klues
>            Assignee: Joseph Wu
>            Priority: Critical
>              Labels: environment, mesosphere
>
> Using the unified containerizer, if a docker image is provisioned and has
> environment variables set via the ENV directive, those environment variables
> will be inherited by the {{mesos-executor}} process and overwrite similarly
> named environment variables that otherwise would have been inherited from the
> agent.
> This causes problems (for example) in DC/OS when trying to launch tasks based
> off the {{nvidia/cuda}} image. The {{nvidia/cuda}} image explicitly sets
> {{LD_LIBRARY_PATH}} to its own value so that the proper nvidia libraries will
> be available to whatever command is launched inside the container.
> However, DC/OS relies on {{LD_LIBRARY_PATH}} to contain a path to
> {{/opt/mesosphere/lib}} so that all of the mesosphere libraries are available
> to the mesos binaries launched by the agent ({{mesos-containerizer}},
> {{mesos-execute}}, etc.). This is necessary to make sure that any external
> dependencies they might have (e.g. libssl.so) can be resolved at runtime.
> By overwriting the executor's environment with the Docker Image environment,
> {{LD_LIBRARY_PATH}} will not be set properly and {{mesos-execute}} will fail.
> It seems to me, the Docker Image environment should *only* actually overwrite
> the environment of the user process (not its executor). However, this can get
> complicated, because the executor actually is the user process in the case of
> launching a custom executor.
> We need to rethink how the environment is inherited/overwritten through all
> the various processes that get spawned while launching a container as well as
> how to make it work for tasks launched by arbitrary executors.
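
For illustration only, the difference between the reported behavior and the suggested one can be sketched as a plain dictionary overlay. This is not Mesos code: the helper names ({{overlay}}, {{reported_behavior}}, {{suggested_behavior}}) are hypothetical, and the image's {{LD_LIBRARY_PATH}} value is a placeholder, since the report does not state it. Only {{/opt/mesosphere/lib}} is taken from the description above.

```python
from typing import Dict, Tuple

Env = Dict[str, str]


def overlay(base: Env, overrides: Env) -> Env:
    """Return a copy of base with every key in overrides replaced or added."""
    merged = dict(base)
    merged.update(overrides)
    return merged


def reported_behavior(agent_env: Env, image_env: Env) -> Tuple[Env, Env]:
    """Model of the bug: the image ENV is applied to the executor itself,
    and the task then inherits the executor's environment."""
    executor_env = overlay(agent_env, image_env)
    task_env = dict(executor_env)
    return executor_env, task_env


def suggested_behavior(agent_env: Env, image_env: Env) -> Tuple[Env, Env]:
    """Model of the suggestion: the image ENV only overwrites the
    environment of the user process, not that of its executor."""
    executor_env = dict(agent_env)
    task_env = overlay(agent_env, image_env)
    return executor_env, task_env


# Placeholder environments. Only /opt/mesosphere/lib comes from the report;
# the image value is hypothetical.
AGENT_ENV: Env = {"LD_LIBRARY_PATH": "/opt/mesosphere/lib"}
IMAGE_ENV: Env = {"LD_LIBRARY_PATH": "/hypothetical/image/lib"}

if __name__ == "__main__":
    for label, fn in (("reported", reported_behavior),
                      ("suggested", suggested_behavior)):
        executor_env, task_env = fn(AGENT_ENV, IMAGE_ENV)
        print(label, "executor:", executor_env["LD_LIBRARY_PATH"])
        print(label, "task:    ", task_env["LD_LIBRARY_PATH"])
```

Under the reported model the executor loses the agent's library path, which is why its shared-library dependencies can no longer be resolved. The sketch deliberately ignores the custom-executor case, where the executor is the user process and the two environments cannot be separated this simply.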