[ https://issues.apache.org/jira/browse/SLING-10489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504480#comment-17504480 ]
Stefan Egli commented on SLING-10489:
-------------------------------------

* merged [discovery.oak PR#4|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/4]

> Ignore partially started, newly joining instances to avoid disturbing discovery (for a while)
> ---------------------------------------------------------------------------------------------
>
>                 Key: SLING-10489
>                 URL: https://issues.apache.org/jira/browse/SLING-10489
>             Project: Sling
>          Issue Type: Improvement
>          Components: Discovery
>    Affects Versions: Discovery Commons 1.0.24, Discovery Base 2.0.10, Discovery Oak 1.2.34
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>            Priority: Major
>             Fix For: Discovery Base 2.0.12, Discovery Commons 1.0.26, Discovery Oak 1.2.36
>
>          Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Discovery.oak requires that both Oak and Sling are operating normally in order to declare victory and announce a new topology.
> The startup phase is especially tricky in this regard, since there are multiple elements that need to get updated (some are in the Oak layer, some in Sling):
> * lease & clusterNodeId: this is maintained by Oak
> * idMap: this is maintained by IdMapService (Sling)
> * leaderElectionId: this is maintained by OakViewChecker (Sling)
> * syncToken: this is maintained by SyncTokenService (Sling)
> Situations have been seen where Oak starts up fine, but higher-level (e.g. Sling) bundles were not activated within a reasonable amount of time. This led to discovery staying in TOPOLOGY_CHANGING state for longer than expected.
> There should be a mechanism that ignores (suppresses) newly joining instances if they start up only partially. However, after a certain timeout this mechanism should give up.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
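
The suppression behaviour requested in the issue description (ignore a partially started joiner, but only up to a timeout) could be sketched roughly as follows. This is an illustrative sketch only, not the code merged in the PR: the class `JoinerSuppressor`, the method `shouldSuppress`, and the `fullyStarted` flag are hypothetical names, and how "fully started" is derived from the lease, idMap, leaderElectionId and syncToken is left as an assumption for the caller.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the discovery.oak implementation. */
class JoinerSuppressor {
    private final long timeoutMillis;
    // Time (ms) at which each partially started instance was first observed.
    private final Map<Integer, Long> firstSeenPartial = new HashMap<>();

    JoinerSuppressor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Returns true while a partially started instance should be ignored.
     * Assumption: the caller decides what "fully started" means.
     */
    boolean shouldSuppress(int clusterNodeId, boolean fullyStarted, long nowMillis) {
        if (fullyStarted) {
            // Instance completed startup: stop tracking it, never suppress it.
            firstSeenPartial.remove(clusterNodeId);
            return false;
        }
        // Remember when this partially started instance was first seen.
        long firstSeen = firstSeenPartial.computeIfAbsent(clusterNodeId, id -> nowMillis);
        // Suppress only until the timeout elapses, then give up.
        return nowMillis - firstSeen < timeoutMillis;
    }
}
```

In this sketch, once the timeout has elapsed the instance is no longer ignored, so the view would again have to wait for it; passing the current time in as a parameter keeps the logic easy to test.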