shalinmangar opened a new pull request #2004: URL: https://github.com/apache/lucene-solr/pull/2004
This PR is based on the work of @CaoManhDat.

# Description

The shutdown process waits for all replicas/cores to be closed before removing the leader's election node. Closing can take some time because of index flush or merge activity on the leader cores, which delays the election of new leaders. The result is a long window during which indexing requests fail.

# Solution

This PR introduces a `Phaser` that registers and deregisters all update requests. The shutdown process pauses indexing requests, waits for all in-flight indexing requests to complete, removes the election nodes (thus triggering leader election), and then closes all replicas.

# Tests

Since this PR changes Solr's shutdown process, it is difficult to write a unit or integration test inside Solr itself. The change was tested with a test harness outside of Solr. The steps were:

- Create a collection with 1 shard and 3 replicas
- Do heavy indexing while all replicas are ACTIVE
- Shut down the leader node
- Measure the leaderless time (the time during which there is no live/active leader)

| Before the patch | After the patch |
| -- | -- |
| Shutting down time: 100ms, Leader election: 46s | Shutting down time: 12s, Leader election: 3s |
| Shutting down time: 37s, Leader election: 3s | Shutting down time: 31s, Leader election: 3s |
| Shutting down time: 200ms, Leader election: 17s | Shutting down time: 100ms, Leader election: 3s |
| Shutting down time: 200ms, Leader election: 27s | Shutting down time: 4s, Leader election: 3s |

Because the patch waits for all updates to finish before giving up leadership, the shutdown time increases on average, but leader election becomes more stable and consistency improves.

# Checklist

Please review the following and check all that apply:

- [X] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [X] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [X] I have developed this patch against the `master` branch.
- [X] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
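
For readers unfamiliar with the approach described under Solution, the following is a minimal sketch of how a `java.util.concurrent.Phaser` can track in-flight update requests so that shutdown can pause new ones and wait for the rest. It is not the code from this patch: `UpdateGate`, `ShutdownSketch`, their methods, the two `Runnable` hooks, and the 30-second timeout are all hypothetical names and values chosen for illustration. Only `Phaser` and its methods are real JDK APIs.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not from the patch): tracks in-flight update requests.
final class UpdateGate {
    // One party is registered up front on behalf of the shutdown thread.
    private final Phaser phaser = new Phaser(1);
    private volatile boolean paused = false;
    // Phase in which the shutdown party arrived; -1 until shutdown starts.
    private int awaitedPhase = -1;

    // Called when an update request starts. Returns false once shutdown has begun.
    boolean tryEnter() {
        if (paused) {
            return false;
        }
        phaser.register();
        if (paused) {
            // Shutdown started between the check and the registration: back out.
            phaser.arriveAndDeregister();
            return false;
        }
        return true;
    }

    // Called when an update request finishes (only after tryEnter() returned true).
    void exit() {
        phaser.arriveAndDeregister();
    }

    // Rejects new updates, then waits until every admitted update has exited.
    // Returns false on timeout or interruption. Safe to call more than once.
    synchronized boolean pauseAndAwaitInFlight(long timeout, TimeUnit unit) {
        paused = true;
        if (awaitedPhase < 0) {
            // The shutdown party arrives exactly once; the phase advances
            // when the last in-flight request deregisters.
            awaitedPhase = phaser.arrive();
        }
        try {
            phaser.awaitAdvanceInterruptibly(awaitedPhase, timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

// Hypothetical ordering of the shutdown steps described in the PR.
final class ShutdownSketch {
    static void shutdown(UpdateGate gate, Runnable removeElectionNodes, Runnable closeCores) {
        gate.pauseAndAwaitInFlight(30, TimeUnit.SECONDS); // timeout value is an assumption
        removeElectionNodes.run(); // triggers leader election on the other nodes
        closeCores.run();          // the slow flush/merge work now happens after election
    }
}
```

In this sketch an update handler would wrap its work in `if (gate.tryEnter()) { try { ... } finally { gate.exit(); } }` and reject the request otherwise. The point of the ordering in `ShutdownSketch.shutdown` is that the potentially slow core close no longer sits between the last accepted update and the start of leader election.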