[ https://issues.apache.org/jira/browse/SOLR-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280256#comment-16280256 ]

ASF subversion and git services commented on SOLR-11458:
--------------------------------------------------------

Commit 5df352f6c13a516f014d8a5fd5205964ea35f310 in lucene-solr's branch refs/heads/branch_7_2 from [~ab]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5df352f ]

SOLR-11458: Improve error handling in MoveReplicaCmd to avoid potential loss of data.

> Bugs in MoveReplicaCmd handling of failures
> -------------------------------------------
>
>                 Key: SOLR-11458
>                 URL: https://issues.apache.org/jira/browse/SOLR-11458
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>    Affects Versions: 7.0, 7.0.1, 7.1, master (8.0)
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>             Fix For: 7.2, master (8.0)
>
>         Attachments: SOLR-11458.diff, SOLR-11458.diff
>
>
> Spin-off from SOLR-11449:
> {quote}
> There's a section of code in moveNormalReplica that ensures that we don't
> lose a shard leader during move. There's no corresponding protection in
> moveHdfsReplica, which means that moving a replica that is also a shard
> leader may potentially lead to data loss (eg. when replicationFactor=1).
> Also, there's no rollback strategy when moveHdfsReplica partially fails,
> unlike in moveNormalReplica where the code simply skips deleting the original
> replica - it seems that the code should attempt to restore the original
> replica in this case? When RF=1 and such failure occurs then not restoring
> the original replica means lost shard.
> {quote}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org