Hi, I had this same problem. In my case the cause was I/O wait in the VM. I migrated my environment to another location and that solved it.
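For anyone who wants to check the same thing, here is a minimal sketch of how I/O wait could be measured on a Linux VM. It assumes `/proc/stat` is readable and uses the standard field order of its aggregate `cpu` line; the function names are made up for this example and do not come from Solr or any monitoring tool. A high value would be consistent with a storage stall like the one described above, but it does not prove one.

```python
import time


def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' line samples."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    # Counters: user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already counted inside user, so only the first 8 are summed.
    total = sum(a[:8]) - sum(b[:8])
    if total <= 0:
        return 0.0
    return 100.0 * (a[4] - b[4]) / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()


def sample_iowait(interval=5.0):
    """Take two samples `interval` seconds apart and return the iowait percentage."""
    first = read_cpu_line()
    time.sleep(interval)
    second = read_cpu_line()
    return iowait_percent(first, second)
```

Calling `sample_iowait()` on each Solr server while the cluster is slow gives a rough reading to compare against a quiet period.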
Actually, the problem is I/O wait on the physical host (even the backup through Veeam is a problem).

Regards

On Sat, Jul 4, 2020 at 12:30, Tran Van Hoan <tranvanhoan...@yahoo.com.invalid> wrote:

> The problem has recurred repeatedly in recent days.
> Today I tried to dump the heap and the threads. Only the thread dump worked; the
> heap dump could not be taken because the Solr instance was hung.
> Almost all threads were blocked.
>
> On Tuesday, June 23, 2020, 10:42:36 PM GMT+7, Tran Van Hoan
> <tranvanhoan...@yahoo.com.invalid> wrote:
>
> I checked the node exporter metrics and saw no network problem.
>
> On Tuesday, June 23, 2020, 8:37:41 PM GMT+7, Tran Van Hoan
> <tranvanhoan...@yahoo.com> wrote:
>
> I checked node exporter: no problem with the OS, hardware, or network.
> I attached images of the Solr metrics for the last 7 days and 12h.
>
> On Tuesday, June 23, 2020, 2:23:05 PM GMT+7, Dario Rigolin
> <dario.rigo...@comperio.it> wrote:
>
> What about a network issue?
>
> On Tue, Jun 23, 2020 at 01:37, Tran Van Hoan
> <tranvanhoan...@yahoo.com.invalid> wrote:
>
> > Dear all,
> >
> > I have a SolrCloud 8.2.0 cluster with 6 instances on 6 servers (64G RAM); each
> > instance has xmx = xms = 30G.
> >
> > Today almost all nodes in the SolrCloud died twice, at 8:00AM (5/6 nodes were
> > down) and at 1:00PM (2/6 nodes were down). Yesterday, one node was down. Most
> > metrics didn't increase much, except threads.
> >
> > Performance one week ago:
> >
> > [image not preserved]
> >
> > Performance 12h ago:
> >
> > [image not preserved]
> >
> > I went to the admin UI; some nodes were dead and some took too long to respond.
> > When checking the log files, I saw that they generate too many entries (log level
> > warning). Here are the logs that appear in the SolrCloud:
> >
> > Log before servers 4 and 6 went down
> >
> > - Server 4 before it died:
> >
> > + o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
> >
> > + org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://server6:8983/solr/mycollection_shard3_replica_n5/select
> >     at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:406)
> >     at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:746)
> >     at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1274)
> >     at org.apache.solr.handler.component.HttpShardHandler.request(HttpShardHandler.java:238)
> >     at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >     at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
> >     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >     ... 1 more
> > Caused by: java.util.concurrent.TimeoutException
> >     at org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
> >     at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:397)
> >     ... 12 more
> >
> > + o.a.s.s.HttpSolrCall invalid return code: -1
> >
> > + o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp: 1592803662746 , received timestamp: 1592803796152 , TTL: 120000
> >
> > + o.a.s.s.PKIAuthenticationPlugin Decryption failed , key must be wrong => java.security.InvalidKeyException: No installed provider supports this key: (null)
> >
> > + o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling SolrCmdDistributor$Req: cmd=delete{,commitWithin=-1}; node=ForwardNode: http://server6:8983/solr/mycollection_shard3_replica_n5/ to http://server6:8983/solr/mycollection_shard3_replica_n5/ => java.util.concurrent.TimeoutException
> >
> > + o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: null
> >
> > - Server 2:
> >
> > + Max requests queued per destination 3000 exceeded for HttpDestination[http://server4:8983]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3[c=4/4,b=4,m=0,i=0]
> >
> > + Max requests queued per destination 3000 exceeded for HttpDestination[http://server5:8983]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3[c=4/4,b=4,m=0,i=0]
> >
> > + Timeout occured while waiting response from server at: http://server4:8983/solr/mycollection_shard6_replica_n23/select
> >
> > + Timeout occured while waiting response from server at: http://server6:8983/solr/mycollection_shard2_replica_n15/select
> >
> > + o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: null
> > Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: null
> > Caused by: java.nio.channels.ClosedChannelException
> >
> > - Server 6:
> >
> > + org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://server6:8983/solr/mycollection_shard2_replica_n15/select
> >
> > + org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: Timeout occured while waiting response from server at: http://server4:8983/mycollection_shard6_replica_n23/select
> >
> > I tried searching Google but didn't find any clue :(! Can you help me find the
> > cause? Thank you!
>
> --
> Dario Rigolin
> Comperio srl - CTO
> Mobile: +39 347 7232652 - Office: +39 0425 471482
> Skype: dario.rigolin