[ https://issues.apache.org/jira/browse/SOLR-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153294#comment-15153294 ]
Scott Blum commented on SOLR-8697:
----------------------------------

[~markrmil...@gmail.com] [~erickerickson] I think there is a potential problem with how OverseerTest is constructed, one that perhaps caused us to write some code into LeaderElector in the past that doesn't make any sense for live code.

I'm looking at the implementation of MockZkController.publishState() (it's kind of a beast) and I notice that when it creates an ElectionContext, it never actually adds it to the map, never checks whether one already exists, etc. As a result, MockZkController does something the real ZkController never does -- it tries to register two different election contexts for the same core on the same ZK session.

My question is, what's the right fix? I can either make MockZkController not set up a new electionContext on subsequent invocations, or I could make it simply cancel the previous election context before creating a new one.

> Fix LeaderElector issues
> ------------------------
>
>                 Key: SOLR-8697
>                 URL: https://issues.apache.org/jira/browse/SOLR-8697
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>   Affects Versions: 5.4.1
>            Reporter: Scott Blum
>              Labels: patch, reliability, solrcloud
>
> This patch is still somewhat WIP for a couple of reasons:
> 1) Still debugging test failures.
> 2) This will need more scrutiny from knowledgeable folks!
>
> There are some subtle bugs with the current implementation of LeaderElector,
> best demonstrated by the following test:
> 1) Start up a small single-node solrcloud. It should become Overseer.
> 2) kill -9 the solrcloud process and immediately start a new one.
> 3) The new process won't become overseer. The old process's ZK leader elect
> node has not yet disappeared, and the new process fails to set appropriate
> watches.
>
> NOTE: this is only reproducible if the new node is able to start up and join
> the election quickly.
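
For illustration only, the two candidate fixes raised in the comment above (reuse the existing election context on subsequent invocations, or cancel the previous one before creating a new one) might look roughly like the sketch below. Everything here is an assumption made for the sketch: `ElectionContextRegistry`, `Context`, `getOrCreate`, and `replace` are hypothetical names, not the real Solr `ElectionContext`, `ZkController`, or `MockZkController` API, and a plain map stands in for whatever map the real code keys by core.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the bookkeeping discussed above; not Solr code.
public class ElectionContextRegistry {

    /** Minimal stand-in for an election context that can be cancelled. */
    public static class Context {
        private final String coreName;
        private boolean cancelled = false;

        Context(String coreName) { this.coreName = coreName; }

        void cancel() { cancelled = true; }

        public String getCoreName() { return coreName; }

        public boolean isCancelled() { return cancelled; }
    }

    // One context per core name (assumed key; the real key may differ).
    private final Map<String, Context> contexts = new HashMap<>();

    /** Option 1: do not set up a new context if one already exists. */
    public Context getOrCreate(String coreName) {
        return contexts.computeIfAbsent(coreName, Context::new);
    }

    /** Option 2: cancel the previous context before registering a new one. */
    public Context replace(String coreName) {
        Context fresh = new Context(coreName);
        Context previous = contexts.put(coreName, fresh);
        if (previous != null) {
            previous.cancel();
        }
        return fresh;
    }

    /** Number of cores that currently have a registered context. */
    public int size() { return contexts.size(); }
}
```

Either way, at most one live context per core is registered at a time, which is the invariant the comment says the mock currently violates. Which option is the right fix for the test is the open question.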