[ https://issues.apache.org/jira/browse/SOLR-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SOLR-17821:
----------------------------------
Labels: pull-request-available (was: )
> InstallShardData and Recover do not handle failures gracefully
> --------------------------------------------------------------
>
> Key: SOLR-17821
> URL: https://issues.apache.org/jira/browse/SOLR-17821
> Project: Solr
> Issue Type: Bug
> Components: Backup/Restore
> Reporter: Houston Putman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Whenever a ShardInstall or Recover command succeeds, the shard's ZK terms are
> only updated to ensure that they are no longer zero. This is handled down in
> the InstallCoreData cmd, so if even one core's recover/install succeeds, the
> ZK terms are either all left untouched (if they were non-zero to start with)
> or all set to 1. This does not handle errors gracefully, because a replica
> that failed ends up with the same term as one that succeeded.
>
> What we actually want is to increase the terms of the successful replicas, so
> that the unsuccessful replicas can then recover from the successful ones. If
> the leader was unsuccessful, it should give up leadership because its shard
> term is no longer the highest.
>
> Since shardInstall requires collections to be read-only, we also need to fix
> the issues with read-only and recovery.
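
The intended behaviour described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: ShardTermsSketch and
its methods are hypothetical names, the term map is a plain in-memory Map
keyed by replica name, and none of this is Solr's actual shard-terms code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of per-replica shard terms (not Solr's API). */
final class ShardTermsSketch {

    private ShardTermsSketch() {}

    /**
     * Returns a copy of the terms in which every successful replica is raised
     * to one above the current maximum term. Failed replicas keep their old
     * term, so they end up behind and can recover from the successful ones.
     */
    static Map<String, Long> bumpSuccessful(Map<String, Long> terms, Set<String> successful) {
        Map<String, Long> updated = new HashMap<>(terms);
        if (successful.isEmpty()) {
            return updated; // nothing succeeded, so no replica should move ahead
        }
        long max = 0L;
        for (long term : terms.values()) {
            max = Math.max(max, term);
        }
        long newTerm = max + 1;
        for (String replica : successful) {
            if (updated.containsKey(replica)) {
                updated.put(replica, newTerm);
            }
        }
        return updated;
    }

    /** A leader should step down when some other replica has a strictly higher term. */
    static boolean shouldGiveUpLeadership(Map<String, Long> terms, String leader) {
        long leaderTerm = terms.getOrDefault(leader, 0L);
        for (long term : terms.values()) {
            if (term > leaderTerm) {
                return true;
            }
        }
        return false;
    }
}
```

For example, with three replicas all at term 1, where only core_node2 and
core_node3 succeed, the sketch moves those two to term 2 and leaves
core_node1 at term 1. If core_node1 was the leader,
shouldGiveUpLeadership then returns true for it.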