[ https://issues.apache.org/jira/browse/MESOS-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yan Xu updated MESOS-7711:
--------------------------
    Shepherd: James Peach

> Master updates registry for reregistering agents even when they haven't been unreachable
> ----------------------------------------------------------------------------------------
>
>                 Key: MESOS-7711
>                 URL: https://issues.apache.org/jira/browse/MESOS-7711
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>
> During a master failover we observed many registry updates, on average _one per two agents_, as indicated by the log line
> {noformat:title=}
> I0609 04:46:25.220196 48864 registrar.cpp:550] Successfully updated the registry in 42.904064ms
> {noformat}
> [code|https://github.com/apache/mesos/blob/19a6134d03141dc2cb073a904378c2c129b5138d/src/master/registrar.cpp#L550]
> In this case few agents were ever unreachable, so most of these registry updates are redundant.
> Each registry update also incurs the time spent applying the operations
> {noformat:title=}
> I0609 04:46:26.475761 48897 registrar.cpp:493] Applied 1 operations in 11.673082ms; attempting to update the registry
> {noformat}
> [code|https://github.com/apache/mesos/blob/19a6134d03141dc2cb073a904378c2c129b5138d/src/master/registrar.cpp#L493]
> Although these operations do not consume the Master actor's time, every agent reregistration is guarded and delayed by them. This could easily be avoided by checking the {{slaves.recovered}} field in {{Master}}.
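The check proposed in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Mesos code: `AgentTracker`, `reregister`, and `registryUpdates` are hypothetical names, agent IDs are plain strings, and the sketch assumes that an agent still present in the recovered set is already admitted in the registry and therefore needs no registry write.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the master's agent bookkeeping.
// These are NOT the real Mesos types or method names.
struct AgentTracker {
  // Agents read back from the registry after a failover that have
  // not yet reregistered (plays the role of `slaves.recovered`).
  std::unordered_set<std::string> recovered;

  // Agents that have completed reregistration.
  std::unordered_set<std::string> registered;

  // Counts simulated registry writes, for illustration only.
  int registryUpdates = 0;

  // Returns true if this reregistration required a registry update.
  bool reregister(const std::string& agentId) {
    assert(!agentId.empty());

    // Already reregistered: nothing to do.
    if (registered.count(agentId) > 0) {
      return false;
    }

    // Assumption: a recovered agent was never marked unreachable,
    // so the registry already lists it and no write is needed.
    if (recovered.erase(agentId) > 0) {
      registered.insert(agentId);
      return false;
    }

    // Not recovered: the agent may have been marked unreachable (or
    // is unknown), so the registry has to be updated.
    ++registryUpdates;
    registered.insert(agentId);
    return true;
  }
};
```

Under these assumptions, only agents missing from the recovered set pay the cost of a registry write, so a failover in which few agents were unreachable would trigger correspondingly few updates.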