[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ilya Pronin updated MESOS-7795:
-------------------------------
    Shepherd: Yan Xu

> Remove "latest" symlink after agent reboot
> ------------------------------------------
>
>                 Key: MESOS-7795
>                 URL: https://issues.apache.org/jira/browse/MESOS-7795
>             Project: Mesos
>          Issue Type: Improvement
>          Components: agent
>            Reporter: Ilya Pronin
>            Assignee: Ilya Pronin
>            Priority: Minor
>
> Currently, when the agent detects that the host was rebooted, it doesn't
> recover agent info. New agent info is not checkpointed until the agent
> successfully registers with a master. If the agent crashes before
> registering, on restart it will recover the old agent info that was
> checkpointed before the host reboot.
> This can lead to problems. E.g. the agent may flap due to incompatible agent
> info if its resources somehow change after the reboot. Or the use of the old
> agent ID in the reregistration process may cause crashes like MESOS-7432.
> We can remove the "latest" symlink when we detect that the current boot ID is
> different from the checkpointed one, in order to prevent the agent from
> recovering stale info after we checkpoint the new boot ID. Or we can postpone
> boot ID checkpointing until we have checkpointed the new agent info.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
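
For illustration only, the first option in the description (drop the "latest" symlink when the boot ID has changed, then checkpoint the new boot ID) could be sketched roughly as below. This is not the Mesos implementation, which is C++. The function name, the meta directory layout (a boot_id file next to slaves/latest), and passing the current boot ID in as an argument are all assumptions made for the example.

```python
import os
import tempfile


def recover_after_possible_reboot(meta_dir, current_boot_id):
    """Hypothetical helper: returns True if a host reboot was detected."""
    # Assumed layout, for illustration only.
    boot_id_path = os.path.join(meta_dir, "boot_id")
    latest_path = os.path.join(meta_dir, "slaves", "latest")

    checkpointed = None
    if os.path.exists(boot_id_path):
        with open(boot_id_path) as f:
            checkpointed = f.read().strip()

    rebooted = checkpointed is not None and checkpointed != current_boot_id

    if rebooted and os.path.islink(latest_path):
        # Remove the symlink BEFORE checkpointing the new boot ID. If the
        # process crashes between the two steps, the next start still sees
        # the old boot ID and simply repeats the cleanup.
        os.unlink(latest_path)

    # Checkpoint the current boot ID via a temp file and an atomic rename.
    tmp_path = boot_id_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(current_boot_id)
    os.replace(tmp_path, boot_id_path)

    return rebooted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as meta:
        agent_dir = os.path.join(meta, "slaves", "agent-old")
        os.makedirs(agent_dir)
        os.symlink(agent_dir, os.path.join(meta, "slaves", "latest"))
        with open(os.path.join(meta, "boot_id"), "w") as f:
            f.write("boot-1")
        print(recover_after_possible_reboot(meta, "boot-2"))
```

Only the symlink is removed in this sketch; the old agent directory itself is left on disk, so nothing that was checkpointed is destroyed. The second option from the description (delaying the boot ID checkpoint until new agent info is checkpointed) would instead move the boot ID write to a later point and is not shown here.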