shalinmangar opened a new pull request #2004:
URL: https://github.com/apache/lucene-solr/pull/2004


   This PR is based on the work of @CaoManhDat 
   
   # Description
   
   The shutdown process waits for all replicas/cores to be closed before 
removing the election node of the leader. This can take some time due to index 
flush or merge activities on the leader cores and delays new leaders from being 
elected. This leads to a long gap during which indexing requests fail.
   
   # Solution
   
   This PR introduces a Phaser that registers and deregisters all update 
requests. The shutdown process pauses indexing requests, waits for all 
in-flight indexing requests to complete, removes election nodes (thus 
triggering leader election) and then closes all replicas.
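
   The register/deregister/pause sequence described above can be sketched with `java.util.concurrent.Phaser` as follows. This is only an illustration of the idea, not the code in this PR: the class name `UpdateTracker` and its methods are hypothetical, and the sketch assumes each update request calls `tryRegister()` on entry and `deregister()` on exit.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; names are illustrative and not taken from the patch.
final class UpdateTracker {
    // One party is registered up front to represent the shutdown thread,
    // so the phase cannot advance until shutdown itself arrives.
    private final Phaser inFlight = new Phaser(1);
    private volatile boolean paused = false;

    /** Called when an update request starts. Returns false once updates are paused. */
    public boolean tryRegister() {
        if (paused) {
            return false;
        }
        inFlight.register();
        if (paused) {
            // Shutdown began between the check and the registration; back out.
            inFlight.arriveAndDeregister();
            return false;
        }
        return true;
    }

    /** Called when an update request that registered successfully has finished. */
    public void deregister() {
        inFlight.arriveAndDeregister();
    }

    /**
     * Called by the shutdown thread: rejects new updates, then waits for the
     * in-flight ones. Returns true if all of them finished within the timeout.
     */
    public boolean pauseAndAwait(long timeout, TimeUnit unit) {
        paused = true;
        int phase = inFlight.arrive(); // the shutdown party arrives
        try {
            // Returns once every registered update has arrived and deregistered.
            inFlight.awaitAdvanceInterruptibly(phase, timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

   In this sketch the shutdown thread would call `pauseAndAwait(...)` first and only then remove the election nodes and close the replicas, which matches the ordering described above.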
   
   # Tests
   
   Since this PR makes changes to Solr's shutdown process, it is difficult to 
write unit/integration tests inside Solr itself. This was tested by creating 
a test harness outside of Solr. The steps were:
   - Create a Collection with 1 shard, 3 replicas
   - Do heavy indexing when all replicas are ACTIVE
   - Shut down the leader node
   - Measure the leaderless time (time during which there is no live/active 
leader)
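
   The last step, measuring the leaderless time, could be done by polling as in the rough sketch below. It is not the harness that was actually used: `LeaderlessTimer` and the `hasLeader` probe are hypothetical, and the probe is assumed to return true whenever the shard currently has a live/active leader. How that is determined is left to the caller.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper for an external test harness; not part of Solr.
final class LeaderlessTimer {
    /**
     * Polls hasLeader until the leader disappears and then reappears.
     * Returns the length of that leaderless window in milliseconds, or -1
     * if no complete window was observed within maxWaitMillis.
     * The result is only as precise as the poll interval.
     */
    public static long measureLeaderlessMillis(BooleanSupplier hasLeader,
                                               long pollMillis,
                                               long maxWaitMillis) {
        long start = System.nanoTime();
        long budget = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        boolean lost = false; // true once the leader has been seen missing
        long lostAt = 0L;
        while (System.nanoTime() - start < budget) {
            boolean present = hasLeader.getAsBoolean();
            long now = System.nanoTime();
            if (!present && !lost) {
                lost = true;
                lostAt = now; // the leaderless window begins
            } else if (present && lost) {
                return TimeUnit.NANOSECONDS.toMillis(now - lostAt);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }
}
```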
   
   
   Before: shutdown time | Before: leader election | After: shutdown time | After: leader election
   -- | -- | -- | --
   100ms | 46s | 12s | 3s
   37s | 3s | 31s | 3s
   200ms | 17s | 100ms | 3s
   200ms | 27s | 4s | 3s
   
   Because the patch waits for all updates to finish before giving up the 
leadership, the shutdown time increases on average. In return, it makes leader 
election more stable (3s in every run above) and improves consistency.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `master` branch.
   - [X] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   

