[ https://issues.apache.org/jira/browse/SOLR-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496371#comment-14496371 ]
Hari Sekhon commented on SOLR-7394:
-----------------------------------
After checking both of those JIRAs, I believe this is a different issue: both
replicas have already failed recovery, and neither then attempts recovery or
takes leadership again, so both stay down. The shard remains offline even
though the Solr instances on both servers have been restarted.
The suggested JIRAs don't seem to be the same thing, because the exception I've
seen around this was a recovery failure rather than a ZooKeeper session
expiration or tlog replay.
> Shard replicas don't recover after cluster wide restart
> -------------------------------------------------------
>
> Key: SOLR-7394
> URL: https://issues.apache.org/jira/browse/SOLR-7394
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.7.2, 4.10.3
> Environment: HDP 2.2 / HDP Search
> Reporter: Hari Sekhon
> Priority: Critical
> Attachments: 145.solr.log, 146.solr.log, 147.solr.log, 148.solr.log,
> 149.solr.log, 150.solr.log, Solr_cores_not_recovering.png
>
>
> After a cluster-wide restart, some shards never come back online: both
> replicas stay red and do not attempt to become leader after one failed
> recovery attempt. I eventually used the API to request recovery, which
> triggered them to recover and come back online; otherwise the shards stayed
> down indefinitely.
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon
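
The description does not say which API call was used to request recovery. One
plausible candidate is the CoreAdmin REQUESTRECOVERY action, and the sketch
below shows how such a request could be issued for a single core. The host,
port, and core name are hypothetical placeholders, not values taken from this
issue, and the helper names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def recovery_url(base_url: str, core: str) -> str:
    """Build a CoreAdmin REQUESTRECOVERY URL for one core."""
    query = urlencode({"action": "REQUESTRECOVERY", "core": core, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def request_recovery(base_url: str, core: str, timeout: float = 30.0) -> int:
    """Send the recovery request and return the HTTP status code."""
    with urlopen(recovery_url(base_url, core), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical host and core name; substitute the replicas that stayed down.
    print(recovery_url("http://solr-host:8983/solr", "collection1_shard1_replica1"))
```

The request would need to be sent to the node hosting each affected replica,
once per core that stayed down. This only works around the symptom; it does
not explain why the replicas stop retrying after the first failed recovery.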