Houston Putman created SOLR-17821:
-------------------------------------
Summary: InstallShardData and Recover do not handle failures gracefully
Key: SOLR-17821
URL: https://issues.apache.org/jira/browse/SOLR-17821
Project: Solr
Issue Type: Bug
Components: Backup/Restore
Reporter: Houston Putman
When a ShardInstall or Recover command succeeds, the shard's zk terms are
updated only enough to make them non-zero. This is actually handled down in
the InstallCoreData cmd, so if even one core's recover/install succeeds, the
zk terms will either all be left untouched (if the terms were non-zero to
start) or all be set to 1, including the terms of replicas whose
recover/install failed. This does not handle errors gracefully.
What we actually want to do is increase the terms of the successful replicas,
so that the unsuccessful replicas can then recover from the successful ones.
If the leader was unsuccessful, it should give up leadership because its
shard term is no longer the highest.
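
The proposed behavior can be sketched as a small model. This is only an illustration of the idea, written in Python for brevity. It is not Solr's actual shard-terms code, and the function names and replica names (core_node1, etc.) are hypothetical. It assumes that terms are plain integers per replica and that a replica with a term below the highest must recover.

```python
from typing import Dict, Set


def bump_successful_terms(terms: Dict[str, int], succeeded: Set[str]) -> Dict[str, int]:
    """Return new terms in which every successful replica is above all others."""
    if not succeeded:
        # Nothing succeeded, so there is no replica to recover from: leave terms alone.
        return dict(terms)
    new_term = max(terms.values(), default=0) + 1
    return {name: (new_term if name in succeeded else term)
            for name, term in terms.items()}


def needs_recovery(terms: Dict[str, int]) -> Set[str]:
    """Replicas whose term is below the highest term in the shard."""
    highest = max(terms.values(), default=0)
    return {name for name, term in terms.items() if term < highest}


def leader_must_step_down(terms: Dict[str, int], leader: str) -> bool:
    """The leader should give up leadership if its term is no longer the highest."""
    return leader in needs_recovery(terms)


# Hypothetical shard: the install failed on the leader and succeeded elsewhere.
before = {"core_node1": 0, "core_node2": 0, "core_node3": 0}
after = bump_successful_terms(before, succeeded={"core_node2", "core_node3"})
```

In this example only core_node1 is left with a lower term, so it would recover from one of the other two replicas, and if it was the leader it would have to step down.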
Since shardInstall requires collections be read-only, we also need to fix the
issues with read-only and recovery.