[ https://issues.apache.org/jira/browse/SOLR-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962310#comment-16962310 ]
Kevin Risden commented on SOLR-11431:
-------------------------------------

I see some of these same failures locally [~psomogyi], so I'm not sure the PR is ready to go.

> Leader candidate cannot become leader if replica responds 500 to PeerSync
> -------------------------------------------------------------------------
>
>                 Key: SOLR-11431
>                 URL: https://issues.apache.org/jira/browse/SOLR-11431
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 7.0
>            Reporter: Mano Kovacs
>            Priority: Major
>         Attachments: SOLR-11431.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a leader candidate does PeerSync to all replicas to download any missing
> updates, it is tolerant of failures. It uses the {{cantReachIsSuccess=true}}
> switch, which treats connection issues, 404 and 503 as success, since replicas
> being DOWN should not affect the process.
> However, if a replica has disk issues, the core initialization might fail, and
> that results in {{500}} instead of {{503}}. A failing replica like that can
> prevent any other replica from becoming the leader.
> Proposing either:
> * Accepting {{500}} as "can't reach" so the leader candidate can go on
> or
> * Changing {{SolrCoreInitializationException}} to return {{503}} instead of
> {{500}}
> ** this might be an API change, however
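For readers skimming the thread, the tolerance rule in the issue description can be sketched roughly as below. This is only an illustration, not Solr's actual PeerSync code: the class `CantReachSketch`, the method `countsAsSuccess`, and the `treat500AsUnreachable` flag are all hypothetical names, and the flag merely models the first proposal (accepting 500 as "can't reach").

```java
public class CantReachSketch {

    /**
     * Hypothetical helper (not part of Solr): decides whether a failed
     * PeerSync request to one replica should still count as success.
     *
     * @param cantReachIsSuccess    the tolerance switch named in the issue
     * @param connectFailed         true if the request never reached the replica
     * @param status                HTTP status returned by the replica, if any
     * @param treat500AsUnreachable assumption: models the first proposal
     */
    static boolean countsAsSuccess(boolean cantReachIsSuccess,
                                   boolean connectFailed,
                                   int status,
                                   boolean treat500AsUnreachable) {
        if (!cantReachIsSuccess) {
            return false; // no tolerance: every failure counts as a failure
        }
        if (connectFailed) {
            return true; // connection issue: replica is presumably DOWN
        }
        if (status == 404 || status == 503) {
            return true; // already tolerated according to the description
        }
        // A failed core initialization surfaces as 500, which is only
        // tolerated if the first proposal is adopted.
        return treat500AsUnreachable && status == 500;
    }

    public static void main(String[] args) {
        // Behavior as described in the issue: 503 is tolerated, 500 is not.
        System.out.println(countsAsSuccess(true, false, 503, false));
        System.out.println(countsAsSuccess(true, false, 500, false));
        // With the first proposal applied, 500 is tolerated as well.
        System.out.println(countsAsSuccess(true, false, 500, true));
    }
}
```

The second proposal would leave a rule like this untouched and instead change the status code returned on the replica side, which is why it is flagged as a possible API change.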