[ https://issues.apache.org/jira/browse/SOLR-11083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087041#comment-16087041 ]
Shalin Shekhar Mangar commented on SOLR-11083:
----------------------------------------------

[~noble.paul] suggested that if we build a core admin API to unload a replica temporarily, i.e. for the next N minutes, then MoveReplica can use that API first and then add a new replica. Once the N minutes elapse, the old replica will be loaded again, discover that it has been replaced, and promptly unload itself. If the overseer fails, then the new replica won't exist and the old replica will come back online.

I won't have time to work on this but wanted to write a potential solution here in case someone else is interested.

> MoveReplica API can lose replicas for shared file systems on overseer restart if source node is live
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11083
>                 URL: https://issues.apache.org/jira/browse/SOLR-11083
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: hdfs, SolrCloud
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 7.1
>
>
> MoveReplica unloads the old replica and creates a new one for shared file
> systems. But if the overseer restarts between the two operations, then the old
> replica is lost. It is up to the user to detect the failure (using the request
> status API) and retry.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
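
For anyone picking this up, the temporary-unload idea in the comment above can be modelled roughly as follows. This is only a toy sketch of the intended state transitions, not Solr code: the class `MoveReplicaSketch`, its methods, and the replica names are all hypothetical, no such core admin API exists as far as this thread says, and the N-minute timer is reduced to an explicit `onTimeoutElapsed()` call.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the proposal. None of these names are real Solr APIs. */
class MoveReplicaSketch {
    /** Replicas currently serving the shard (a stand-in for cluster state). */
    final Set<String> activeReplicas = new HashSet<>();
    /** Replica that was unloaded temporarily, or null if there is none. */
    String parkedReplica;
    /** Replica created to replace the parked one, or null if the move never completed. */
    String replacement;

    MoveReplicaSketch(String sourceReplica) {
        activeReplicas.add(sourceReplica);
    }

    /** Step 1: unload the source replica for a limited time instead of permanently. */
    void unloadTemporarily(String replica) {
        if (!activeReplicas.remove(replica)) {
            throw new IllegalStateException("not an active replica: " + replica);
        }
        parkedReplica = replica;
    }

    /** Step 2: add the new replica. This step never runs if the overseer fails first. */
    void addReplacement(String newReplica) {
        activeReplicas.add(newReplica);
        replacement = newReplica;
    }

    /** Step 3: what the old replica does once the N minutes have elapsed. */
    void onTimeoutElapsed() {
        if (parkedReplica == null) {
            return; // nothing was parked
        }
        if (replacement == null) {
            // The move never completed, so the old replica comes back online.
            activeReplicas.add(parkedReplica);
        }
        // Otherwise the old replica has been replaced and stays unloaded.
        parkedReplica = null;
    }
}
```

In the normal case the calls are `unloadTemporarily`, `addReplacement`, `onTimeoutElapsed`, and only the new replica remains. If the overseer dies after the first call, `addReplacement` never happens and the timeout brings the old replica back, so the shard does not lose its replica.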