[ 
https://issues.apache.org/jira/browse/MESOS-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-7711:
--------------------------
    Shepherd: James Peach

> Master updates registry for reregistering agents even when they haven't been 
> unreachable
> ----------------------------------------------------------------------------------------
>
>                 Key: MESOS-7711
>                 URL: https://issues.apache.org/jira/browse/MESOS-7711
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>
> During a master failover we observed many registry updates, on average _one 
> per two agents_, as indicated by the log line:
> {noformat:title=}
> I0609 04:46:25.220196 48864 registrar.cpp:550] Successfully updated the 
> registry in 42.904064ms
> {noformat}
> [code|https://github.com/apache/mesos/blob/19a6134d03141dc2cb073a904378c2c129b5138d/src/master/registrar.cpp#L550]
> In this case few agents were ever unreachable, so most of these registry 
> updates are redundant. Each registry update also incurs the time spent 
> applying the operations:
> {noformat:title=}
> I0609 04:46:26.475761 48897 registrar.cpp:493] Applied 1 operations in 
> 11.673082ms; attempting to update the registry
> {noformat}
> [code|https://github.com/apache/mesos/blob/19a6134d03141dc2cb073a904378c2c129b5138d/src/master/registrar.cpp#L493]
> Although these operations do not consume the Master actor's time, every 
> agent reregistration is guarded and delayed by them. This delay could 
> easily be avoided by checking the {{slaves.recovered}} field in 
> {{Master}}.
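
The check suggested in the description might look roughly like the sketch below. It is not the Mesos implementation: `AgentTracker`, `reregister`, and `registryUpdates` are hypothetical names, and the sketch assumes that an agent still present in a `recovered` set was never marked unreachable, so its registry entry needs no change.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical, simplified stand-in for the master's agent bookkeeping.
struct AgentTracker {
  // Agents restored from the registry after a failover that have not
  // reregistered yet (loosely modeled on `slaves.recovered`).
  std::unordered_set<std::string> recovered;

  // Counts how many registry writes this sketch would have issued.
  int registryUpdates = 0;

  // Returns true if a registry update was needed for this agent.
  bool reregister(const std::string& agentId) {
    if (recovered.erase(agentId) > 0) {
      // Assumption: the agent is still admitted in the registry,
      // so there is nothing to update and no write to wait for.
      return false;
    }

    // Otherwise the agent may have been marked unreachable, so a
    // registry update is still required (stand-in for the real write).
    ++registryUpdates;
    return true;
  }
};
```

With this shape, only agents that are absent from the recovered set pay for a registry write; agents that simply reconnect after a failover skip it.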



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
