[jira] [Commented] (SOLR-8629) When a prospective leader attempts to sync with it's shard, we should only fail the sync due to peer sync, not necessarily other exceptions.
[ https://issues.apache.org/jira/browse/SOLR-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470109#comment-15470109 ]

Stephan Lagraulet commented on SOLR-8629:

Hi [~markrmil...@gmail.com], did you have the chance to work on this issue?

> When a prospective leader attempts to sync with it's shard, we should only fail the sync due to peer sync, not necessarily other exceptions.
>
>                 Key: SOLR-8629
>                 URL: https://issues.apache.org/jira/browse/SOLR-8629
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>
> Otherwise, one screwed up replica can prevent a leader even if there are 100 other good replicas.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8629) When a prospective leader attempts to sync with it's shard, we should only fail the sync due to peer sync, not necessarily other exceptions.
[ https://issues.apache.org/jira/browse/SOLR-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170697#comment-15170697 ]

Mark Miller commented on SOLR-8629:

This may be as simple as adding code 500 as a success on peersync like we currently do on connect exceptions.

My worry is the same as those exceptions though - it may be a very temporary situation, and the affected node may be the best leader candidate. That is why I've been thinking about SOLR-8753. It would be nice to allow a couple of retries of the possible leaders over time in these situations. I think that may be tricky to do nicely with the current code though.
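The idea in the comment above, counting a code 500 from a replica as a non-failure in the same way connect exceptions are already handled, could be sketched roughly as follows. This is only an illustration under stated assumptions: `SyncOutcomeSketch`, `Outcome`, `classify`, and `leaderSyncSucceeded` are hypothetical names and do not come from the Solr code base, and "code 500" is assumed to mean an HTTP 500 status returned by the replica.

```java
import java.net.ConnectException;
import java.util.List;

// Hypothetical sketch: none of these names are taken from Solr itself.
class SyncOutcomeSketch {

    // How one replica's response counts toward the prospective leader's sync.
    enum Outcome { SYNCED, IGNORED, FAILED }

    // Classify a single replica response.
    // error:      exception raised while contacting the replica, or null
    // httpStatus: status code returned by the replica (assumed to be HTTP)
    // peerSyncOk: whether the replica reported a successful peer sync
    static Outcome classify(Throwable error, int httpStatus, boolean peerSyncOk) {
        // Described in the thread as existing behaviour: an unreachable
        // replica does not fail the sync.
        if (hasCause(error, ConnectException.class)) {
            return Outcome.IGNORED;
        }
        // The proposed change: a server-side error is not a peer sync failure.
        if (httpStatus == 500) {
            return Outcome.IGNORED;
        }
        return peerSyncOk ? Outcome.SYNCED : Outcome.FAILED;
    }

    // Walk the cause chain looking for an exception of the given type.
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }

    // The overall sync fails only if some replica reported a real peer sync failure.
    static boolean leaderSyncSucceeded(List<Outcome> outcomes) {
        return !outcomes.contains(Outcome.FAILED);
    }
}
```

With this classification, one replica that answers with a 500 no longer blocks the election on its own. It does not address the worry raised in the comment that the condition may be temporary and the affected node may still be the best leader candidate.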
[jira] [Commented] (SOLR-8629) When a prospective leader attempts to sync with it's shard, we should only fail the sync due to peer sync, not necessarily other exceptions.
[ https://issues.apache.org/jira/browse/SOLR-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170664#comment-15170664 ]

Mark Miller commented on SOLR-8629:

I think SOLR-8753 will allow us to be a little bit more conservative on some of these issues. Even though elections were originally intended to retry forever, each replica currently gets, in general, only one shot at trying to be the leader.
[jira] [Commented] (SOLR-8629) When a prospective leader attempts to sync with it's shard, we should only fail the sync due to peer sync, not necessarily other exceptions.
[ https://issues.apache.org/jira/browse/SOLR-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127044#comment-15127044 ]

Mark Miller commented on SOLR-8629:

This may be tricky, but I'm not entirely happy with the current situation.