[ 
https://issues.apache.org/jira/browse/SOLR-11083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087041#comment-16087041
 ] 

Shalin Shekhar Mangar commented on SOLR-11083:
----------------------------------------------

[~noble.paul] suggested that if we build a core admin API to unload a replica 
temporarily, i.e. for the next N minutes, then MoveReplica can call that API 
first and then add the new replica. Once the N minutes elapse, the old replica 
will be loaded again, discover that it has been replaced, and promptly unload 
itself. If the overseer fails before the new replica is added, no new replica 
will exist and the old replica will simply come back online.
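
To make the proposed sequence concrete, here is a minimal sketch of it as a toy 
in-memory model. It is not Solr code: ClusterState, OldReplica, 
unloadTemporarily, onTimeout and moveReplica are all hypothetical names, and 
the "replacements" map is only an assumed way for the old replica to find out 
that it has been replaced. The N-minute timer is reduced to an explicit 
onTimeout call.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposal. None of these types or methods are real Solr APIs.
public class TemporaryUnloadSketch {

    // Hypothetical stand-in for shared cluster state: old replica name -> replacement name.
    static final class ClusterState {
        final Map<String, String> replacements = new HashMap<>();
    }

    // Hypothetical stand-in for the core hosted on the source node.
    static final class OldReplica {
        final String name;
        boolean loaded = true;    // currently serving from the shared index
        boolean removed = false;  // unloaded for good

        OldReplica(String name) {
            this.name = name;
        }

        // Unload for a bounded period (the "next N minutes") rather than permanently.
        void unloadTemporarily() {
            loaded = false;
        }

        // Called when the N minutes elapse: load again, then check for a replacement.
        void onTimeout(ClusterState state) {
            loaded = true;
            if (state.replacements.containsKey(name)) {
                // A replacement exists, so this replica unloads itself for good.
                loaded = false;
                removed = true;
            }
        }
    }

    // The move as described: temporary unload first, then add the new replica.
    // overseerFailsMidway simulates an overseer restart between the two steps.
    static void moveReplica(ClusterState state, OldReplica old, String newName,
                            boolean overseerFailsMidway) {
        old.unloadTemporarily();
        if (overseerFailsMidway) {
            return; // the new replica is never created
        }
        state.replacements.put(old.name, newName);
    }

    public static void main(String[] args) {
        ClusterState ok = new ClusterState();
        OldReplica a = new OldReplica("core_node1");
        moveReplica(ok, a, "core_node2", false);
        a.onTimeout(ok);
        System.out.println("move succeeded: loaded=" + a.loaded + " removed=" + a.removed);

        ClusterState failed = new ClusterState();
        OldReplica b = new OldReplica("core_node1");
        moveReplica(failed, b, "core_node2", true);
        b.onTimeout(failed);
        System.out.println("overseer failed: loaded=" + b.loaded + " removed=" + b.removed);
    }
}
```

The point of the design is that the unload is never permanent by itself: the 
old replica only goes away for good after it has seen evidence of a 
replacement, so an overseer restart between the two steps cannot lose it.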

I won't have time to work on this but wanted to write a potential solution here 
in case someone else is interested.

> MoveReplica API can lose replicas for shared file systems on overseer restart 
> if source node is live
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11083
>                 URL: https://issues.apache.org/jira/browse/SOLR-11083
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: hdfs, SolrCloud
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 7.1
>
>
> MoveReplica unloads the old replica and creates a new one for shared file 
> systems. But if the overseer restarts between the two operations then the old 
> replica is lost. It is up to the user to detect the failure (using the request 
> status API) and retry.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

