[ https://issues.apache.org/jira/browse/MESOS-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133497#comment-15133497 ]
Joseph Wu commented on MESOS-4598:
----------------------------------

I agree that we need a centralized fix. I see the following scenarios:
| || Subprocess uses libprocess || Subprocess is something else ||
|| Subprocess sets/inherits the same {{PORT}} by accident | Bind failure -> exit Option #1 above prevents accidental inheritance | Nothing happens (?) |
|| Subprocess sets a different {{PORT}} on purpose | Bind success (?) | Nothing happens (?) |

My thought for a complete fix is the following changes:
* If the {{subprocess}} call gets {{environment = None()}}, we should automatically remove {{LIBPROCESS_PORT}} from the inherited environment.
** I'd prefer not to unset {{LIBPROCESS_PORT}} on initialization because this makes it harder to catch the upper-left error above. Also, the V1 HTTP scheduler library tests will eventually need to re-initialize libprocess between tests.
* The parts of [{{executorEnvironment}}|https://github.com/apache/mesos/blame/master/src/slave/containerizer/containerizer.cpp#L265] dealing with libprocess & libmesos should be refactored into libprocess as a helper. We would use this helper for the Containerizer, Fetcher, and ContainerLogger module.
* If the {{subprocess}} call is given {{LIBPROCESS_PORT == os::getenv("LIBPROCESS_PORT")}}, we can probably return an {{Error}} immediately. Or log a warning and unset the env var locally.

> Logrotate ContainerLogger should not remove IP from environment.
> ----------------------------------------------------------------
>
>                 Key: MESOS-4598
>                 URL: https://issues.apache.org/jira/browse/MESOS-4598
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.27.0
>            Reporter: Joseph Wu
>            Assignee: Joseph Wu
>              Labels: mesosphere
>
> The {{LogrotateContainerLogger}} starts libprocess-using subprocesses.
> Libprocess initialization will attempt to resolve the IP from the hostname.
> If a DNS service is not available, this step will fail, which terminates the
> logger subprocess prematurely.
> Since the logger subprocesses live on the agent, they should use the same
> {{LIBPROCESS_IP}} supplied to the agent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
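
For illustration only, the first and last bullets of the comment above (drop {{LIBPROCESS_PORT}} when the environment is inherited, reject an explicit environment that repeats the parent's port) could be sketched roughly as follows. This is not libprocess code: {{Environment}}, {{ChildEnvironment}}, and {{childEnvironment}} are hypothetical names, {{std::optional}} stands in for stout's {{Option}}/{{None()}}, and the parent environment is passed in as a map rather than read via {{os::getenv}}.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the environment map handed to `subprocess`.
using Environment = std::map<std::string, std::string>;

// Hypothetical result type: exactly one of `environment` or `error` is set.
struct ChildEnvironment {
  std::optional<Environment> environment;
  std::optional<std::string> error;
};

// `parent` is the caller's own environment. `requested` mirrors the
// `environment` argument of `subprocess`; std::nullopt plays the role of
// `None()`, meaning "inherit the parent's environment".
ChildEnvironment childEnvironment(
    const Environment& parent,
    const std::optional<Environment>& requested)
{
  const std::string key = "LIBPROCESS_PORT";

  if (!requested.has_value()) {
    // Inherited environment: remove the port so the child cannot
    // accidentally try to bind the parent's port.
    Environment inherited = parent;
    inherited.erase(key);
    return {inherited, std::nullopt};
  }

  // Explicit environment: a port equal to the parent's would fail to bind
  // in a libprocess-based child, so report an error up front.
  auto child = requested->find(key);
  auto own = parent.find(key);
  if (child != requested->end() &&
      own != parent.end() &&
      child->second == own->second) {
    return {std::nullopt,
            "Subprocess would reuse the parent's LIBPROCESS_PORT=" +
              own->second};
  }

  // A different port (or none at all) is passed through unchanged.
  return {*requested, std::nullopt};
}
```

The sketch takes the "return an {{Error}} immediately" option from the last bullet; the alternative mentioned there (log a warning and unset the variable locally) would replace the error branch with an erase on a copy of the requested map.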