[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ilya Pronin updated MESOS-7795:
-------------------------------
    Shepherd: Yan Xu

> Remove "latest" symlink after agent reboot
> ------------------------------------------
>
>                 Key: MESOS-7795
>                 URL: https://issues.apache.org/jira/browse/MESOS-7795
>             Project: Mesos
>          Issue Type: Improvement
>          Components: agent
>            Reporter: Ilya Pronin
>            Assignee: Ilya Pronin
>            Priority: Minor
>
> Currently, when the agent detects that the host was rebooted, it doesn't
> recover agent info. New agent info is not checkpointed until the agent
> successfully registers with a master. If the agent crashes before
> registering, on restart it will recover the old agent info that was
> checkpointed before the host reboot.
> This can lead to problems. E.g. the agent may flap due to incompatible agent
> info if its resources change after the reboot. Or the use of the old
> agent ID in the reregistration process may cause crashes like MESOS-7432.
> We can remove the "latest" symlink when we detect that the current boot ID is
> different from the checkpointed one, in order to prevent the agent from
> recovering stale info after we checkpoint the new boot ID. Or we can postpone
> boot ID checkpointing until we have checkpointed the new agent info.
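
The first option in the description can be sketched roughly as follows. This is an illustration only, not the Mesos implementation (the agent itself is written in C++). The function name `recover_after_possible_reboot` is hypothetical, and the layout it assumes (a `boot_id` file and a `slaves/latest` symlink under the agent's meta directory) is an assumption made for the example. The point it tries to show is the ordering: the symlink is removed before the new boot ID is checkpointed, so a crash between the two steps leaves the old boot ID in place and the reboot is detected again on the next start.

```python
import os


def recover_after_possible_reboot(meta_dir, current_boot_id):
    """Drop the "latest" symlink if the host rebooted since the last checkpoint.

    Returns True if a reboot was detected. Paths are assumptions for this
    sketch: <meta_dir>/boot_id and <meta_dir>/slaves/latest.
    """
    boot_id_path = os.path.join(meta_dir, "boot_id")
    latest_link = os.path.join(meta_dir, "slaves", "latest")

    # Read the previously checkpointed boot ID, if there is one.
    checkpointed = None
    if os.path.exists(boot_id_path):
        with open(boot_id_path) as f:
            checkpointed = f.read().strip()

    rebooted = checkpointed is not None and checkpointed != current_boot_id

    # Step 1: remove the symlink first, so stale agent info cannot be
    # recovered once the new boot ID is on disk. The agent directory the
    # link pointed to is left untouched.
    if rebooted and os.path.islink(latest_link):
        os.unlink(latest_link)

    # Step 2: only now checkpoint the new boot ID, using write-then-rename
    # so a crash never leaves a partially written file.
    if checkpointed != current_boot_id:
        os.makedirs(meta_dir, exist_ok=True)
        tmp_path = boot_id_path + ".tmp"
        with open(tmp_path, "w") as f:
            f.write(current_boot_id)
        os.replace(tmp_path, boot_id_path)

    return rebooted
```

The second option in the description (postponing boot ID checkpointing until the new agent info has been checkpointed) would instead move step 2 to after a successful registration and leave the symlink alone.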



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
