Re: Down Replica is elected as Leader (solr v8.7.0)
> Are yours growing always, on all nodes, forever? Or is it one or two who end up in a bad state?

Randomly on some of the shards and some of the followers in the collection. Whichever tlog was open on the follower back when it was still the leader is the one that doesn't stop growing, and that shard had active ingestion at a high rate.

If we then add more shards or do a cluster rebalance, the collection becomes unusable and causes a production query and ingest outage. Very painful manual restoration twice a day.

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Down Replica is elected as Leader (solr v8.7.0)
I've run into this (or a similar) issue in the past (solr6? I don't remember exactly) where tlogs get stuck either growing indefinitely and/or refusing to commit on restart. What I ended up doing was writing a monitor to check the number of tlogs and alert if they got over some limit (100 or whatever), and then I could stay ahead of the issue by rebuilding individual nodes as needed.

Are yours growing always, on all nodes, forever? Or is it one or two who end up in a bad state?

On Tue, Feb 16, 2021 at 3:57 PM mmb1234 wrote:
>
> Looks like the problem is related to tlog rotation on the follower shard.
>
> We did the following for a specific shard.
>
> 0. start solr cloud
> 1. solr-0 (leader), solr-1, solr-2
> 2. rebalance to make solr-1 as preferred leader
> 3. solr-0, solr-1 (leader), solr-2
>
> The tlog file on solr-0 kept on growing infinitely (100s of GBs) until we
> shut the cluster and dropped all shards (manually).
>
> The only way to "restart" tlog rotation on solr-0 (follower) was to issue
> /admin/cores?action=RELOAD&core=x at least twice while the tlog size was
> still small (in MBs).
>
> Also, if a rebalance is issued to select solr-0 as a leader, leader election
> never completes.
>
> solr-0 output after step (3) above:
>
> solr-0
> 2140856 ./data2/mydata_0_e000-/tlog
> 2140712 ./data2/mydata_0_e000-/tlog/tlog.021
>
> solr-1 (leader)
> 35268 ./data2/mydata_0_e000-/tlog
> 35264 ./data2/mydata_0_e000-/tlog/tlog.055
>
> solr-2
> 35256 ./data2/mydata_0_e000-/tlog
> 35252 ./data2/mydata_0_e000-/tlog/tlog.054
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
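For reference, a minimal sketch of such a tlog-count monitor (not the poster's actual script), assuming tlog directories live under /opt/solr/volumes/<dataN>/<core>/tlog as in the paths quoted later in this thread, and a threshold of 100 files:

# Minimal sketch of a tlog-count monitor; paths and threshold are assumptions.
import glob
import os

SOLR_DATA = "/opt/solr/volumes"   # assumed data root; adjust to your layout
THRESHOLD = 100                   # "100 or whatever"

def alert(tlog_dir, count):
    # placeholder: hook into whatever alerting system you already use
    print(f"ALERT: {tlog_dir} has {count} tlog files (limit {THRESHOLD})")

for tlog_dir in glob.glob(os.path.join(SOLR_DATA, "*", "*", "tlog")):
    count = len(os.listdir(tlog_dir))
    if count > THRESHOLD:
        alert(tlog_dir, count)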
Re: Down Replica is elected as Leader (solr v8.7.0)
Looks like the problem is related to tlog rotation on the follower shard.

We did the following for a specific shard.

0. start solr cloud
1. solr-0 (leader), solr-1, solr-2
2. rebalance to make solr-1 as preferred leader
3. solr-0, solr-1 (leader), solr-2

The tlog file on solr-0 kept on growing infinitely (100s of GBs) until we shut the cluster and dropped all shards (manually).

The only way to "restart" tlog rotation on solr-0 (follower) was to issue /admin/cores?action=RELOAD&core=x at least twice while the tlog size was still small (in MBs).

Also, if a rebalance is issued to select solr-0 as a leader, leader election never completes.

solr-0 output after step (3) above:

solr-0
2140856 ./data2/mydata_0_e000-/tlog
2140712 ./data2/mydata_0_e000-/tlog/tlog.021

solr-1 (leader)
35268 ./data2/mydata_0_e000-/tlog
35264 ./data2/mydata_0_e000-/tlog/tlog.055

solr-2
35256 ./data2/mydata_0_e000-/tlog
35252 ./data2/mydata_0_e000-/tlog/tlog.054

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
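For reference, the RELOAD workaround described above can be scripted; a minimal sketch using only the Python standard library, where the host, port and core name are placeholders:

# Sketch only: issue the Core Admin RELOAD twice, as described above.
# Host, port and core name are assumptions; substitute your own.
import urllib.request

core = "mydata_0_e000-_replica_t1"   # placeholder core name
url = f"http://localhost:8983/solr/admin/cores?action=RELOAD&core={core}"

for _ in range(2):   # the workaround needed the reload issued at least twice
    with urllib.request.urlopen(url) as resp:
        print(resp.status, resp.read(200))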
Re: Down Replica is elected as Leader (solr v8.7.0)
We found that for the shard that does not get a leader, the tlog replay did not complete for hours (we don't see the "log replay finished", "creating leader registration node", "I am the new leader" etc. log messages). We are also not sure why the tlogs are tens of GBs (anywhere from 30 to 40 GB). The collection's shards have 3x replicas, TLOG replication and a 10 sec hard commit.

The situation is resulting in a 2x per day outage. The current workaround is to stop ingestion, issue a collection rebalance and/or node restarts, and repeat until the shards are online (a couple of hours each day of manual recovery). Any suggestions or workarounds?

Not sure if we're running into these issues:
https://issues.apache.org/jira/browse/SOLR-13486
https://issues.apache.org/jira/browse/SOLR-14679

Partial log output from both nodes (sorted by time asc):

myapp-data-solr-0 2021-02-12 19:36:05.965 INFO (zkCallback-14-thread-51) [c:mydata s:0_8000-9fff r:core_node3 x:mydata_0_8000-9fff_replica_t1] o.a.s.c.ShardLeaderElectionContext Replaying tlog before become new leader

myapp-data-solr-0 2021-02-12 19:36:05.966 WARN (recoveryExecutor-96-thread-1-processing-n:myapp-data-solr-0.myapp-data-solr-headless:8983_solr x:mydata_0_8000-9fff_replica_t1 c:mydata s:0_8000-9fff r:core_node3) [c:mydata s:0_8000-9fff r:core_node3 x:mydata_0_8000-9fff_replica_t1] o.a.s.u.UpdateLog Starting log replay tlog{file=/opt/solr/volumes/data1/mydata_0_8000-9fff/tlog/tlog.0003525 refcount=2} active=false starting pos=0 inSortedOrder=true

myapp-data-solr-0 2021-02-12 22:13:03.084 INFO (recoveryExecutor-96-thread-1-processing-n:myapp-data-solr-0.myapp-data-solr-headless:8983_solr x:mydata_0_8000-9fff_replica_t1 c:mydata s:0_8000-9fff r:core_node3) [c:mydata s:0_8000-9fff r:core_node3 x:mydata_0_8000-9fff_replica_t1] o.a.s.u.UpdateLog log replay status tlog{file=/opt/solr/volumes/data1/mydata_0_8000-9fff/tlog/tlog.0003525 refcount=3} active=false starting pos=0 current pos=27101174167 current size=33357447222 % read=81.0

myapp-data-solr-0 2021-02-12 22:13:06.602 ERROR (indexFetcher-4092-thread-1) [ ] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: mydata slice: 0_8000-9fff saw state=DocCollection(mydata//collections/mydata/state.json/750)={
  "pullReplicas":"0",
  "replicationFactor":"0",
  "shards":{
    "0_8000-9fff":{
      "range":null,
      "state":"active",
      "replicas":{
        "core_node3":{
          "core":"mydata_0_8000-9fff_replica_t1",
          "base_url":"http://myapp-data-solr-0.myapp-data-solr-headless:8983/solr",
          "node_name":"myapp-data-solr-0.myapp-data-solr-headless:8983_solr",
          "state":"active",
          "type":"TLOG",
          "force_set_state":"false"},
        "core_node5":{
          "core":"mydata_0_8000-9fff_replica_t2",
          "base_url":"http://myapp-data-solr-1.myapp-data-solr-headless:8983/solr",
          "node_name":"myapp-data-solr-1.myapp-data-solr-headless:8983_solr",
          "state":"active",
          "type":"TLOG",
          "force_set_state":"false",
          "property.preferredleader":"true"},
        "core_node6":{
          "core":"mydata_0_8000-9fff_replica_t4",
          "base_url":"http://myapp-data-solr-2.myapp-data-solr-headless:8983/solr",
          "node_name":"myapp-data-solr-2.myapp-data-solr-headless:8983_solr",
          "state":"down",
          "type":"TLOG",
          "force_set_state":"false"}}},

myapp-data-solr-0 2021-02-12 22:45:51.600 ERROR (indexFetcher-4092-thread-1) [ ] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: mydata slice: 0_8000-9fff saw state=DocCollection(mydata//collections/mydata/state.json/754)={
  "pullReplicas":"0",
  "replicationFactor":"0",
  "shards":{
    "0_8000-9fff":{
      "range":null,
      "state":"active",
      "replicas":{
        "core_node3":{
          "core":"mydata_0_8000-9fff_replica_t1",
          "base_url":"http://myapp-data-solr-0.myapp-data-solr-headless:8983/solr",
          "node_name":"myapp-data-solr-0.myapp-data-solr-headless:8983_solr",
          "state":"active",
          "type":"TLOG",
          "force_set_state":"false"},
        "core_node5":{
          "core":"mydata_0_8000-9fff_replica_t2",
          "base_url":"http://myapp-data-solr-1.myapp-data-solr-headless:8983/solr",
          "node_name":"myapp-data-solr-1.myapp-data-solr-headless:8983_solr",
          "state":"down",
          "type":"TLOG",
          "force_set_state":"false",
          "property.preferredleader":"true"},
        "core_node6":{
          "core":"mydata_0_8000-9fff_replica_t4",
          "base_url":"http://myapp-data-solr-2.myapp-data-solr-headless:8983/solr",
          "node_name":"myapp-data-solr-2.myapp-data-solr-headless:8983_solr",
          "state":"down",
          "type":"TLOG",
          "force_set_state":"false"}}},...

myapp-data-solr-1 2021-02-12 22:45:56.600 ERROR (indexFetcher-4092-thread-1) [ ] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: mydata slice: 0_8000-9fff saw
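Since the "log replay status" line above reports a "% read" figure, the progress of a slow or stuck replay can at least be tracked from the log; a rough sketch, where the log path is an assumption and the line format follows the entries quoted above:

# Sketch: report the most recent replay progress per core, based on the
# "o.a.s.u.UpdateLog log replay status ... % read=NN.N" lines shown above.
import re

progress = {}
line_re = re.compile(r"x:(\S+?)\].*log replay status.*% read=([\d.]+)")

with open("/var/solr/logs/solr.log") as log:   # assumed log location
    for line in log:
        m = line_re.search(line)
        if m:
            progress[m.group(1)] = float(m.group(2))   # keep the latest value

for core, pct in sorted(progress.items()):
    print(f"{core}: replay {pct:.1f}% read")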
Re: Down Replica is elected as Leader (solr v8.7.0)
By tracing the output in the log files we see the following sequence.

Overseer role list has POD-1, POD-2, POD-3 in that order. POD-3 has 2 shard leaders. POD-3 restarts.

A) Logs for the shard whose leader moves successfully from POD-3 to POD-1

On POD-1: o.a.s.c.ShardLeaderElectionContext Replaying tlog before become new leader
On POD-1: o.a.s.u.UpdateLog Starting log replay
On POD-1: o.a.s.u.UpdateLog Log replay finished.
On POD-1: o.a.s.c.SolrCore Registered new searcher autowarm time: 0 ms
On POD-1: o.a.s.c.ShardLeaderElectionContextBase Creating leader registration node ... after winning as ...
On POD-1: o.a.s.c.ShardLeaderElectionContext I am the new leader:

B) Logs for the shard whose leader does not move from POD-3 to POD-1

On POD-1: o.a.s.c.ShardLeaderElectionContext Replaying tlog before become new leader
On POD-1: o.a.s.u.UpdateLog Starting log replay...
On POD-1: o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms

< POD-3 is back up at this time >

On POD-3: o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms

It was odd to see no INFO, WARN or ERROR log message after "Starting log replay" on POD-1 for the shard which did not get its leader elected.

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
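One way to surface case (B) automatically is to scan solr.log for replays that start but never finish; a rough sketch, assuming the log format quoted above (the x:<core> field identifies the core), with the log path passed as an argument:

# Rough sketch: flag cores with "Starting log replay" but no "Log replay finished".
import re
import sys

started, finished = set(), set()
core_re = re.compile(r"x:(\S+?)\]")

with open(sys.argv[1] if len(sys.argv) > 1 else "solr.log") as log:
    for line in log:
        m = core_re.search(line)
        if not m:
            continue
        if "Starting log replay" in line:
            started.add(m.group(1))
        elif "Log replay finished" in line:
            finished.add(m.group(1))

for core in sorted(started - finished):
    print(f"replay started but not finished: {core}")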
Re: Down Replica is elected as Leader (solr v8.7.0)
I haven't delved into the exact reason for this, but what generally helps to avoid this situation in a cluster is:

i) During shutdown (in case you need to restart the cluster), let the Overseer node be the last one to shut down.
ii) While restarting, let the Overseer node be the first one to start.
iii) Wait for 5-10 seconds between each subsequent node start.

Hope this helps.

Best,
Rahul

On Thu, Feb 11, 2021 at 12:03 PM mmb1234 wrote:

> Hello,
>
> On reboot of one of the solr nodes in the cluster, we often see a
> collection's shards with
> 1. LEADER replica in DOWN state, and/or
> 2. shard with no LEADER
>
> Output from /solr/admin/collections?action=CLUSTERSTATUS is below.
>
> Even after 5 to 10 minutes, the collection often does not recover. Unclear
> why this is happening and what we can try to prevent or remedy it.
>
> ps: perReplicaState=true in solr v8.8.0 didn't work well because after a
> rebalance all replicas somehow get a "leader:true" status even though
> state.json looked ok.
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 2
>   },
>   "cluster": {
>     "collections": {
>       "datacore": {
>         "pullReplicas": "0",
>         "replicationFactor": "0",
>         "shards": {
>           "__": {
>             "range": null,
>             "state": "active",
>             "replicas": {
>               "core_node1": {
>                 "core": "datacore____replica_t187",
>                 "base_url": "http://solr-0.solr-headless:8983/solr",
>                 "node_name": "solr-0.solr-headless:8983_solr",
>                 "state": "down",
>                 "type": "TLOG",
>                 "force_set_state": "false",
>                 "property.preferredleader": "true",
>                 "leader": "true"
>               },
>               "core_node2": {
>                 "core": "datacore____replica_t188",
>                 "base_url": "http://solr-1.solr-headless:8983/solr",
>                 "node_name": "solr-1.solr-headless:8983_solr",
>                 "state": "active",
>                 "type": "TLOG",
>                 "force_set_state": "false"
>               },
>               "core_node3": {
>                 "core": "datacore____replica_t189",
>                 "base_url": "http://solr-2.solr-headless:8983/solr",
>                 "node_name": "solr-2.solr-headless:8983_solr",
>                 "state": "active",
>                 "type": "TLOG",
>                 "force_set_state": "false"
>               }
>             }
>           },
>           "__j": {
>             "range": null,
>             "state": "active",
>             "replicas": {
>               "core_node19": {
>                 "core": "datacore___j_replica_t187",
>                 "base_url": "http://solr-0.solr-headless:8983/solr",
>                 "node_name": "solr-0.solr-headless:8983_solr",
>                 "state": "down",
>                 "type": "TLOG",
>                 "force_set_state": "false",
>                 "property.preferredleader": "true"
>               },
>               "core_node20": {
>                 "core": "datacore___j_replica_t188",
>                 "base_url": "http://solr-1.solr-headless:8983/solr",
>                 "node_name": "solr-1.solr-headless:8983_solr",
>                 "state": "active",
>                 "type": "TLOG",
>                 "force_set_state": "false"
>               },
>               "core_node21": {
>                 "core": "datacore___j_replica_t189",
>                 "base_url": "http://solr-2.solr-headless:8983/solr",
>                 "node_name": "solr-2.solr-headless:8983_solr",
>                 "state": "active",
>                 "type": "TLOG",
>                 "force_set_state": "false"
>               }
>             }
>           },
>           "__": {
>             "range": null,
>             "state": "active",
>             "replicas": {
>               "core_node4": {
>                 "core": "datacore____replica_t91",
>                 "base_url": "http://solr-0...
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
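To follow this restart ordering you need to know which node currently holds the Overseer role; a small sketch using the Collections API OVERSEERSTATUS action, where the host and port are assumptions and the response's "leader" field should name the current Overseer node:

# Sketch: find the current Overseer so it can be stopped last / started first.
import json
import urllib.request

url = "http://localhost:8983/solr/admin/collections?action=OVERSEERSTATUS&wt=json"
with urllib.request.urlopen(url) as resp:
    status = json.load(resp)

print("current overseer:", status.get("leader"))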
Down Replica is elected as Leader (solr v8.7.0)
Hello,

On reboot of one of the solr nodes in the cluster, we often see a collection's shards with
1. LEADER replica in DOWN state, and/or
2. shard with no LEADER

Output from /solr/admin/collections?action=CLUSTERSTATUS is below.

Even after 5 to 10 minutes, the collection often does not recover. Unclear why this is happening and what we can try to prevent or remedy it.

ps: perReplicaState=true in solr v8.8.0 didn't work well because after a rebalance all replicas somehow get a "leader:true" status even though state.json looked ok.

{
  "responseHeader": {
    "status": 0,
    "QTime": 2
  },
  "cluster": {
    "collections": {
      "datacore": {
        "pullReplicas": "0",
        "replicationFactor": "0",
        "shards": {
          "__": {
            "range": null,
            "state": "active",
            "replicas": {
              "core_node1": {
                "core": "datacore____replica_t187",
                "base_url": "http://solr-0.solr-headless:8983/solr",
                "node_name": "solr-0.solr-headless:8983_solr",
                "state": "down",
                "type": "TLOG",
                "force_set_state": "false",
                "property.preferredleader": "true",
                "leader": "true"
              },
              "core_node2": {
                "core": "datacore____replica_t188",
                "base_url": "http://solr-1.solr-headless:8983/solr",
                "node_name": "solr-1.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              },
              "core_node3": {
                "core": "datacore____replica_t189",
                "base_url": "http://solr-2.solr-headless:8983/solr",
                "node_name": "solr-2.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              }
            }
          },
          "__j": {
            "range": null,
            "state": "active",
            "replicas": {
              "core_node19": {
                "core": "datacore___j_replica_t187",
                "base_url": "http://solr-0.solr-headless:8983/solr",
                "node_name": "solr-0.solr-headless:8983_solr",
                "state": "down",
                "type": "TLOG",
                "force_set_state": "false",
                "property.preferredleader": "true"
              },
              "core_node20": {
                "core": "datacore___j_replica_t188",
                "base_url": "http://solr-1.solr-headless:8983/solr",
                "node_name": "solr-1.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              },
              "core_node21": {
                "core": "datacore___j_replica_t189",
                "base_url": "http://solr-2.solr-headless:8983/solr",
                "node_name": "solr-2.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              }
            }
          },
          "__": {
            "range": null,
            "state": "active",
            "replicas": {
              "core_node4": {
                "core": "datacore____replica_t91",
                "base_url": "http://solr-0...

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
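For what it's worth, the two symptoms above can be detected automatically from CLUSTERSTATUS output like the one quoted; a minimal sketch, where the host and port are assumptions and the JSON shape follows the output above:

# Sketch: parse CLUSTERSTATUS and flag the two symptoms described above:
# a replica marked as leader but in state "down", and a shard with no leader.
import json
import urllib.request

url = "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
with urllib.request.urlopen(url) as resp:
    collections = json.load(resp)["cluster"]["collections"]

for coll, cdata in collections.items():
    for shard, sdata in cdata["shards"].items():
        leaders = [r for r in sdata["replicas"].values() if r.get("leader") == "true"]
        if not leaders:
            print(f"{coll}/{shard}: no leader")
        for r in leaders:
            if r.get("state") != "active":
                print(f"{coll}/{shard}: leader {r['core']} is {r['state']}")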