Doss,

See below.

Dominique

On Mon, Aug 10, 2020 at 5:41 PM, Doss <itsmed...@gmail.com> wrote:

> Hi Dominique,
>
> Thanks for your response. Find the details below; please let me know if I
> missed anything.
>
> *- hardware architecture and sizing*
>> CentOS 7, VMs, 4 CPUs, 66GB RAM, 16GB heap, 250GB SSD
>
> *- JVM version / settings*
>> Red Hat, Inc. OpenJDK 64-Bit Server VM, version "14.0.1 14.0.1+7" -
>> default settings, including GC

I don't think I would use JVM version 14. In my opinion, OpenJDK 11 is the
best choice as the LTS version.

> *- Solr settings*
>> softCommit: 15000 (15 sec), autoCommit: 300000 (5 mins)
>>
>> <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
>>   <int name="maxMergeAtOnce">30</int>
>>   <int name="maxMergeAtOnceExplicit">100</int>
>>   <double name="segmentsPerTier">30.0</double>
>> </mergePolicyFactory>
>>
>> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
>>   <int name="maxMergeCount">18</int>
>>   <int name="maxThreadCount">6</int>
>> </mergeScheduler>

You changed a lot of default values. Any specific reasons? It seems very
aggressive!

> *- collections and queries information*
>> One collection, with 4 shards, 3 replicas, 3.5 million records, 150
>> columns, mostly integer fields. Average doc size is 350KB. 0.5 million
>> inserts/updates spread across the whole day (peak time being 6PM to
>> 10PM); selects have not yet started. Once daily we do a delta import of
>> certain multivalued fields with a good amount of data.
>
> *- gc logs or gceasy results*
> The GCeasy report says GC health is good. One server's GC report:
> https://drive.google.com/file/d/1C2SqEn0iMbUOXnTNlYi46Gq9kF_CmWss/view?usp=sharing
> CPU load pattern:
> https://drive.google.com/file/d/1rjRMWv5ritf5QxgbFxDa0kPzVlXdbySe/view?usp=sharing

You have to analyze GC on all nodes! Your heap is very big. Given the full
GC frequency, I don't think you really need such a big heap for indexing
only.
Maybe you will when you start performing queries. Did you check your
network performance? Did you check the Zookeeper logs?

> Thanks,
> Doss.
>
> On Mon, Aug 10, 2020 at 7:39 PM Dominique Bejean <dominique.bej...@eolya.fr> wrote:
>
>> Hi Doss,
>>
>> Seeing a lot of TIMED_WAITING connections occurs with high-TCP-traffic
>> infrastructure, as in a LAMP solution when the Apache server can no
>> longer connect to the MySQL/MariaDB database.
>> In this case, tweaking net.ipv4.tcp_tw_reuse is a possible solution (but
>> never net.ipv4.tcp_tw_recycle, as you suggested in your previous post).
>> This is well explained in this great article:
>> https://vincent.bernat.ch/en/blog/2014-tcp-time-wait-state-linux
>>
>> However, in general and more specifically in your case, I would
>> investigate the root cause of your issue rather than try to find a
>> workaround.
>>
>> Can you provide more information about your use case (we know: 3-node
>> SOLR (8.3.1 NRT) + 3-node Zookeeper ensemble)?
>>
>> - hardware architecture and sizing
>> - JVM version / settings
>> - Solr settings
>> - collections and queries information
>> - gc logs or gceasy results
>>
>> Regards
>>
>> Dominique
>>
>>
>> On Mon, Aug 10, 2020 at 3:43 PM, Doss <itsmed...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > In the Solr 8.3.1 source, I see the following, which I assume could be
>> > the reason for the issue "Max requests queued per destination 3000
>> > exceeded for HttpDestination":
>> >
>> > solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java:
>> >   private static final int MAX_OUTSTANDING_REQUESTS = 1000;
>> > solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java:
>> >   available = new Semaphore(MAX_OUTSTANDING_REQUESTS, false);
>> > solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java:
>> >   return MAX_OUTSTANDING_REQUESTS * 3;
>> >
>> > How can I increase this?
>> >
>> > On Mon, Aug 10, 2020 at 12:01 AM Doss <itsmed...@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > We have a 3-node SOLR (8.3.1 NRT) + 3-node Zookeeper ensemble, and
>> > > now and then we face "Max requests queued per destination 3000
>> > > exceeded for HttpDestination".
>> > >
>> > > After a restart everything starts working fine until the next
>> > > problem. Once a problem has occurred we see a great many
>> > > TIMED_WAITING threads.
>> > >
>> > > Server 1:
>> > > *7722* threads are in TIMED_WAITING
>> > > ("lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@151d5f2f")
>> > > Server 2:
>> > > *4046* threads are in TIMED_WAITING
>> > > ("lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1e0205c3")
>> > > Server 3:
>> > > *4210* threads are in TIMED_WAITING
>> > > ("lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5ee792c0")
>> > >
>> > > Please suggest whether net.ipv4.tcp_tw_reuse=1 will help, or how we
>> > > can increase the 3000 limit?
>> > >
>> > > Sorry, since I haven't got any response to my previous query, I am
>> > > creating this as a new thread.
>> > >
>> > > Thanks,
>> > > Mohandoss.
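For readers following the Http2SolrClient lines quoted earlier in the thread: the cap is a plain java.util.concurrent.Semaphore bounding in-flight requests. Below is a minimal, self-contained sketch of that throttling pattern; the class and method names (RequestThrottle, tryBeginRequest, endRequest) are illustrative, not Solr's actual API.

```java
import java.util.concurrent.Semaphore;

// Sketch of the semaphore-based throttle pattern: a fixed number of
// permits bounds concurrent outstanding requests. When the cap is
// reached, further attempts fail (or queue), which is the kind of
// condition that surfaces as a "Max requests queued per destination
// ... exceeded" style error in the client.
public class RequestThrottle {
    private final Semaphore available;
    private final int maxOutstanding;

    public RequestThrottle(int maxOutstanding) {
        this.maxOutstanding = maxOutstanding;
        // Non-fair semaphore, mirroring new Semaphore(MAX_OUTSTANDING_REQUESTS, false)
        this.available = new Semaphore(maxOutstanding, false);
    }

    /** Try to reserve a slot for one in-flight request; false if the cap is hit. */
    public boolean tryBeginRequest() {
        return available.tryAcquire();
    }

    /** Release the slot once the response (or failure) arrives. */
    public void endRequest() {
        available.release();
    }

    /** Number of requests currently holding a slot. */
    public int outstanding() {
        return maxOutstanding - available.availablePermits();
    }

    public static void main(String[] args) {
        RequestThrottle t = new RequestThrottle(2);
        System.out.println(t.tryBeginRequest()); // true  (1 outstanding)
        System.out.println(t.tryBeginRequest()); // true  (2 outstanding, cap reached)
        System.out.println(t.tryBeginRequest()); // false (cap exceeded)
        t.endRequest();                          // one response arrives
        System.out.println(t.tryBeginRequest()); // true  (slot freed)
    }
}
```

The practical takeaway from the pattern: raising the cap only hides pressure; if requests complete slowly (GC pauses, slow merges, network issues), permits are never released fast enough and the queue fills again.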
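To reproduce counts like the per-server TIMED_WAITING figures quoted above, one can tally thread states straight from jstack output. This is a small hedged helper, assuming the standard HotSpot thread-dump format where each stanza contains a "java.lang.Thread.State: ..." line; the class name ThreadDumpStats is made up for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Counts threads in a given state from jstack/thread-dump text.
// Assumes the usual HotSpot format, e.g.:
//   "qtp1-42" #42 daemon prio=5 ...
//      java.lang.Thread.State: TIMED_WAITING (parking)
public class ThreadDumpStats {
    private static final Pattern STATE_LINE =
        Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

    public static int countInState(String dump, String state) {
        Matcher m = STATE_LINE.matcher(dump);
        int count = 0;
        while (m.find()) {
            if (m.group(1).equals(state)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample =
            "\"qtp1-42\" #42 daemon prio=5\n" +
            "   java.lang.Thread.State: TIMED_WAITING (parking)\n" +
            "\"qtp1-43\" #43 daemon prio=5\n" +
            "   java.lang.Thread.State: RUNNABLE\n" +
            "\"qtp1-44\" #44 daemon prio=5\n" +
            "   java.lang.Thread.State: TIMED_WAITING (sleeping)\n";
        System.out.println(countInState(sample, "TIMED_WAITING")); // prints 2
    }
}
```

Running this against dumps taken a minute apart shows whether the TIMED_WAITING population is stable (idle pool threads parked on a condition, usually harmless) or growing (threads piling up behind a stuck resource).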