Dear all,

I have a SolrCloud 8.2.0 cluster with 6 instances on 6 servers (64 GB RAM); each 
instance runs with Xmx = Xms = 30G.

Today most nodes in the cluster went down twice: at 8:00 AM (5 of 6 nodes 
were down) and at 1:00 PM (2 of 6 nodes were down). Yesterday one node went down. 
Most metrics did not increase much, except the thread count. 

Performance one week ago: [images not preserved]

Performance 12 hours ago: [images not preserved]

In the admin UI, some nodes show as dead and others take too long to 
respond. The log files are growing very quickly (log level WARN); here are 
the messages that appear across the cluster:

Logs before servers 4 and 6 went down

- Server 4, before it died:

   + o.a.s.h.RequestHandlerBase java.io.IOException: 
java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms

   + org.apache.solr.client.solrj.SolrServerException: Timeout occured while 
waiting response from server at: 
http://server6:8983/solr/mycollection_shard3_replica_n5/select
        at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:406)
        at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:746)
        at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1274)
        at org.apache.solr.handler.component.HttpShardHandler.request(HttpShardHandler.java:238)
        at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 more
Caused by: java.util.concurrent.TimeoutException
        at org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
        at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:397)
        ... 12 more

 

+ o.a.s.s.HttpSolrCall invalid return code: -1

+ o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp: 1592803662746 
, received timestamp: 1592803796152 , TTL: 120000  

+ o.a.s.s.PKIAuthenticationPlugin Decryption failed , key must be wrong => 
java.security.InvalidKeyException: No installed provider supports this key: 
(null)

+  o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling 
SolrCmdDistributor$Req: cmd=delete{,commitWithin=-1}; node=ForwardNode: 
http://server6:8983/solr/mycollection_shard3_replica_n5/ to 
http://server6:8983/solr/mycollection_shard3_replica_n5/ => 
java.util.concurrent.TimeoutException

+ o.a.s.s.HttpSolrCall 
null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
 Async exception during distributed update: null
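
To put a number on the PKIAuthenticationPlugin timestamp warning above, here is a minimal Python sketch that extracts the two timestamps and the TTL and computes the gap between them. The function name `pki_delay` and the regular expression are my own illustration, not part of Solr. I am assuming the gap is simply the received timestamp minus the request timestamp; whether it comes from clock differences between nodes or from the request waiting a long time, I cannot tell from the log alone.

```python
import re

# Warning text copied from the server 4 log above (joined onto one line).
LINE = ("o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp: "
        "1592803662746 , received timestamp: 1592803796152 , TTL: 120000")

# Hypothetical pattern written for this message format only.
PATTERN = re.compile(
    r"request timestamp:\s*(\d+)\s*,\s*received timestamp:\s*(\d+)\s*,\s*TTL:\s*(\d+)"
)


def pki_delay(line):
    """Return (delay_ms, ttl_ms) for a PKI timestamp warning, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    sent, received, ttl = (int(group) for group in match.groups())
    return received - sent, ttl


delay_ms, ttl_ms = pki_delay(LINE)
print(f"delay={delay_ms} ms, ttl={ttl_ms} ms, over by {delay_ms - ttl_ms} ms")
# prints: delay=133406 ms, ttl=120000 ms, over by 13406 ms
```

So for this entry the request was seen about 133 seconds after it was stamped, roughly 13 seconds past the 120-second TTL.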

 

Server 2: 

 + Max requests queued per destination 3000 exceeded for 
HttpDestination[http://server4:8983]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3[c=4/4,b=4,m=0,i=0]

 +  Max requests queued per destination 3000 exceeded for 
HttpDestination[http://server5:8983]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3[c=4/4,b=4,m=0,i=0]

 

+ Timeout occured while waiting response from server at: 
http://server4:8983/solr/mycollection_shard6_replica_n23/select

+ Timeout occured while waiting response from server at: 
http://server6:8983/solr/mycollection_shard2_replica_n15/select

+   o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: null

Caused by: org.apache.solr.client.solrj.SolrServerException: IOException 
occured when talking to server at: null

Caused by: java.nio.channels.ClosedChannelException

 

Server 6:

 + org.apache.solr.client.solrj.SolrServerException: Timeout occured while 
waiting response from server at: 
http://server6:8983/solr/mycollection_shard2_replica_n15/select

 + org.apache.solr.client.solrj.SolrServerException: Timeout occured while 
waiting response from server at: 
http://server4:8983/mycollection_shard6_replica_n23/select
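
Since the same timeout message repeats on several servers, a small script can tally which target hosts the timeouts point at. This is only a sketch under my own assumptions: the function name `count_timeout_targets` is hypothetical, the sample text is three of the entries pasted above, and real log files may wrap or format the message differently.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the target URL that follows the timeout message quoted in the logs above.
TIMEOUT_URL = re.compile(r"waiting response from server at:\s*(http://\S+)")

# Three timeout entries copied from the logs above, with their line wrapping kept.
SAMPLE = """
+ Timeout occured while waiting response from server at:
http://server4:8983/solr/mycollection_shard6_replica_n23/select
+ Timeout occured while waiting response from server at:
http://server6:8983/solr/mycollection_shard2_replica_n15/select
+ org.apache.solr.client.solrj.SolrServerException: Timeout occured while
waiting response from server at:
http://server6:8983/solr/mycollection_shard3_replica_n5/select
"""


def count_timeout_targets(log_text):
    """Count timeout messages per target host in (possibly line-wrapped) log text."""
    flat = " ".join(log_text.split())  # undo line wrapping before matching
    hosts = Counter()
    for url in TIMEOUT_URL.findall(flat):
        hosts[urlsplit(url).hostname] += 1
    return hosts


print(count_timeout_targets(SAMPLE).most_common())
# prints: [('server6', 2), ('server4', 1)]
```

Run against the full log of each node, a count like this might show whether the timeouts concentrate on one or two servers or are spread evenly.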

 

I searched Google but did not find any clue :( Could you help me find 
the cause? Thank you!


 
