[ 
https://issues.apache.org/jira/browse/SOLR-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-8554:
--------------------------------
    Attachment: SOLR-8554.patch

This patch moves force leader and rebalance leader to the Overseer. The patch 
is large mostly because existing code was moved around.

Some additional changes to force leader:
- Changed some INFO/WARN logging to DEBUG in ForceLeader
- ForceLeader never waited for the shard responses to be collected. The patch 
adds that wait as well
- ForceLeader makes use of sliceCmd and implements async as well

Some additional changes for rebalance leader:
- The response object now includes a success message and covers more of the 
cases where errors could happen
- I couldn't get the max_at_once param to work correctly. The problem is that 
once we send a REJOINLEADERELECTION shard request, we wait in 
waitForNodeChange for the changes to complete. We also use the sequence 
number, so we can't delay waitForNodeChange either.

Maybe we should just remove max_at_once? Rebalance leaders has always worked 
one shard at a time: the second shard doesn't start until the first one is 
fully complete. So this parameter is not very useful anyway.

> RebalanceLeader and ForceLeader APIs should be part of 
> OverseerCollectionMessageHandler
> ---------------------------------------------------------------------------------------
>
>                 Key: SOLR-8554
>                 URL: https://issues.apache.org/jira/browse/SOLR-8554
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Varun Thacker
>         Attachments: SOLR-8554.patch
>
>
> I think RebalanceLeader and ForceLeader API calls should be part of the 
> OverseerCollectionMessageHandler.
> There are two reasons for this:
> 1. If the API is implemented within the OverseerCollectionMessageHandler, the 
> Overseer provides us with concurrency guarantees, i.e. two calls against a 
> collection can't be executed simultaneously.
> For example, a delete shard is fired and simultaneously a force leader is 
> fired. Now say the replica that was meant to be deleted becomes the new 
> leader.
> 2. Less important than 1, but if the API is implemented within the 
> OverseerCollectionMessageHandler, we can provide an async option in these 
> APIs.
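
To illustrate the concurrency guarantee in reason 1, here is a minimal, 
self-contained sketch of serializing commands per collection. It is not 
Solr's actual Overseer code: the class PerCollectionSerializer, its methods, 
and the collection name "collection1" are all hypothetical, and a plain 
in-process lock stands in for whatever coordination the Overseer really uses.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not part of Solr.
public class PerCollectionSerializer {
    // One lock per collection name, created lazily.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Runs the command while holding the lock for its collection, so two
    // commands against the same collection never overlap.
    public void run(String collection, Runnable command) {
        ReentrantLock lock = locks.computeIfAbsent(collection, k -> new ReentrantLock());
        lock.lock();
        try {
            command.run();
        } finally {
            lock.unlock();
        }
    }

    // Submits `tasks` commands for the same collection from several threads
    // and returns the largest number observed running at the same time.
    public static int maxConcurrent(int tasks) {
        PerCollectionSerializer serializer = new PerCollectionSerializer();
        AtomicInteger running = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> serializer.run("collection1", () -> {
                int now = running.incrementAndGet();
                max.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for real work such as a delete shard
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            }));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return max.get();
    }
}
```

With this kind of serialization, a delete shard and a force leader on the 
same collection would run one after the other instead of interleaving, which 
is the property the issue asks for.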



