[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935318#comment-17935318 ] Chris M. Hostetter commented on SOLR-17656: --- {quote} So that made me eventually think that it was a client problem and not actually a problem with the node. ... I added the state wait at the beginning because I wasn't entirely sure this was the issue at the time, but now I know that part is useless, it can be removed. {quote} Ha ha ... ok ... my confusion was thinking the "state wait" was the *_important_* part of your fix, and that the client changes were just you doing a little refactoring/cleanup of duplicated code in the same commit. Now that i understand that your changes to what {{SolrClient}} gets used in the test are what *REALLY* made the failures stop, that makes things (a little) less confusing ... maybe the underling issue relates to connection caching in the low level {{org.apache.http.client.HttpClient}} , and the jetty restarts breaking those connections in a way that the client doesn't notice until it tries to send a request? (but i don't really understand the retry weirdness you mentioned) > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935274#comment-17935274 ] Houston Putman commented on SOLR-17656: --- [~hossman] I could never reproduce locally either, but the smoking gun for me was that the node was available (the logs said it was fine, and the collection creation command worked without issue) but the actual client request got the IOException. So that made me eventually think that it was a client problem and not actually a problem with the node. If you look at the logs of the failing tests, they include many of these: {quote} o.a.h.i.e.RetryExec I/O exception (org.apache.http.NoHttpResponseException) caught when processing request to \{s}->https://127.0.0.1:38235: The target server failed to respond o.a.h.i.e.RetryExec Retrying request to \{s}->[https://127.0.0.1:38235|https://127.0.0.1:38235/] {quote} The weird thing here, is that the bad requests are only supposed to be retried if the request is non-admin, but all of these requests being retried seem to be Collections API calls. The weirder thing, is that this same retry logic is not used when the same exception happens during the "/get" call in {{testRealTimeGet()}} , so it fails in the way described above. These NoHttpResponseExceptions happen all over the test, probably because of the amount of restarts being done, but because they get retried almost every time it's not a big deal. I don't know why it's not being retried for the "/get" request, but all I do know is that creating a new client is a safe way to see the test pass, and ultimately the client is not what is being tested here. I added the state wait at the beginning because I wasn't entirely sure this was the issue at the time, but now I know that part is useless, it can be removed. > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933695#comment-17933695 ] ASF subversion and git services commented on SOLR-17656: Commit b81dc4b5f29441806235a51ee87970a93bddeac5 in solr's branch refs/heads/deprecations from Houston Putman [ https://gitbox.apache.org/repos/asf?p=solr.git;h=b81dc4b5f29 ] SOLR-17656: Fix TestPullReplica.testRealTimeGet() > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933573#comment-17933573 ] Chris M. Hostetter commented on SOLR-17656: --- [~houston] - i appreciate your help w/the test – but after reading your change a couple of times now, and looking at the old version vs the new version, and re-running w/your exact seed (and confirming that {{testSkipLeaderRecoveryProperty}} ran before {{testRealTimeGet}}) I still can't for the life of me understand when/why that failures would have happened (in a way that your commit would now reliably prevent it from happening) If the issue is that one of the jetties never restarted (or restarted too slowly), then... * I'd expect {{testSkipLeaderRecoveryProperty}} should have failed as well? .. and if it didn't we shoudl also harden it's assertions ** but even then, i'd expect {{testRealTimeGet}} to timeout or fail when creating it's collection -- not when adding a doc to that collection * But also: {{testSkipLeaderRecoveryProperty}} isn't the first test in this class to stop/restart jetty instances -- so if that's the problem it seems like it could happen in other test methods as well, and the better fix would be to put your "wait for live nodes" logic in a before/after method ...can you help walk me through what exactly the problem was and how your patch fixed it? > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933474#comment-17933474 ] ASF subversion and git services commented on SOLR-17656: Commit b81dc4b5f29441806235a51ee87970a93bddeac5 in solr's branch refs/heads/main from Houston Putman [ https://gitbox.apache.org/repos/asf?p=solr.git;h=b81dc4b5f29 ] SOLR-17656: Fix TestPullReplica.testRealTimeGet() > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933481#comment-17933481 ] ASF subversion and git services commented on SOLR-17656: Commit 00eaa995bb633d8604c6ace363a6e7841563f897 in solr's branch refs/heads/branch_9x from Houston Putman [ https://gitbox.apache.org/repos/asf?p=solr.git;h=00eaa995bb6 ] SOLR-17656: Fix TestPullReplica.testRealTimeGet() (cherry picked from commit b81dc4b5f29441806235a51ee87970a93bddeac5) > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933472#comment-17933472 ] Houston Putman commented on SOLR-17656: --- I think this inadvertently created a bug that when {{testRealTimeGet}} runs after the new {{testSkipLeaderRecoveryProperty}} it can hit: {quote}{{> org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: [https://127.0.0.1:42033/solr]}} {{> at __randomizedtesting.SeedInfo.seed([BBB06E01532385D3:E3DD9B02F1992B1A]:0)}} {{> at app//org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:736)}} {{> at app//org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:260)}} {{> at app//org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:239)}} {{> at app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:279)}} {{> at app//org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:166)}} {{> at app//org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:131)}} {{> at app//org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:147)}} {{> at app//org.apache.solr.cloud.TestPullReplica.testRealTimeGet(TestPullReplica.java:497)}} {quote} {{I'll see if I can fix it quickly.}} > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: main (10.0), 9.9 > > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927300#comment-17927300 ] ASF subversion and git services commented on SOLR-17656: Commit 9b3baab5b460c9a14270642998a610c3bec286c2 in solr's branch refs/heads/branch_9x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=solr.git;h=9b3baab5b46 ] SOLR-17656: New 'skipLeaderRecovery' replica property allows PULL replicas with existing indexes to immediately become ACTIVE (cherry picked from commit e775fd26dbf0d5967c06f226d650bd8a6b896c4e) > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927239#comment-17927239 ] ASF subversion and git services commented on SOLR-17656: Commit e775fd26dbf0d5967c06f226d650bd8a6b896c4e in solr's branch refs/heads/main from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=solr.git;h=e775fd26dbf ] SOLR-17656: New 'skipLeaderRecovery' replica property allows PULL replicas with existing indexes to immediately become ACTIVE > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17656) Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING
[ https://issues.apache.org/jira/browse/SOLR-17656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925817#comment-17925817 ] David Smiley commented on SOLR-17656: - Sounds useful. > Add expert level option to allowe PULL replicas to go ACTIVE w/o RECOVERING > --- > > Key: SOLR-17656 > URL: https://issues.apache.org/jira/browse/SOLR-17656 > Project: Solr > Issue Type: New Feature >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17656-1.patch, SOLR-17656.patch > > > In situations where a Solr cluster undergoes a rolling restart (or some other > "catastrophic" failure situations requiring/causing solr node restarts) there > can be a snowball effect of poor performance (or even solr node crashing) due > to fewer then normal replicas serving query requests while replicas on > restarting nodes are DOWN or RECOVERING – especially if shard leaders are > also affected, and (restarting) replicas first must wait for a leader > election before they can recover (or wait to finish recovery from an > over-worked leader). > For NRT type usecases, RECOVERING is really a necessary evil to ensure every > replicas is up to date before handling NRT requests – but in the case of PULL > replicas, which are expected to routinely "lag" behind their leader, I've > talked to a lot of Solr users w/usecases where they would be happy to have > PULL replicas back online serving "stale" data ASAP, and let normal > IndexFetching "catchup" with the leader later. > I propose we support a new "advanced" replica property that can be set on > PULL replicas by expert level users, to indicate: on (re)init, these replicas > may skip RECOVERING and go directly to ACTIVE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org