Dear Erick,

Forgive my ignorance.
Please find below some of the details you asked for.

*have you looked at the solr logs?*

Sorry, I haven't defined the log4j.properties file, so I don't have Solr logs. Since that requires a Tomcat restart, I am planning to do it at the next restart. But I found the following in the Tomcat log:

18-Nov-2014 11:27:29.028 WARNING [localhost-startStop-2] org.apache.catalina.loader.WebappClassLoader.clearReferencesThreads The web application [/mima] appears to have started a thread named [localhost-startStop-1-SendThread(10.236.149.28:2181)] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
 sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
 sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
 sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
 sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
 org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
 org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

*How big are the cores?*

We have 16 cores, of which only 5 are big ones. The total size of all 16 cores is 10+ GB.

*How many docs in the cores when the problem happens?*

1 core with 163 fields and 33,00,000 documents (index size 2+ GB)
4 cores with 3 fields and approximately 150,00,000 documents (1.2 to 1.5 GB)
The remaining cores have 1,00,000 to 40,00,000 documents

*How much memory are you allocating the JVM?*

5 GB for the JVM; the total RAM available in the systems is 30 GB.

*can you restart Tomcat without a problem?*

This problem is occurring in production, so I have never tried.

Thanks,
Doss.

On Wed, Nov 19, 2014 at 7:55 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> You've really got to provide details for us to say much
> of anything. There are about a zillion things that it could be.
>
> In particular, have you looked at the solr logs? Are there
> any interesting things in them? How big are the cores?
> How much memory are you allocating the JVM? How
> many docs in the cores when the problem happens?
> Before the nodes stop responding, can you restart
> Tomcat without a problem?
>
> You might review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best,
> Erick
>
>
> On Wed, Nov 19, 2014 at 1:04 AM, Doss <itsmed...@gmail.com> wrote:
>
> > I have two node SOLR (4.9.0) cloud with Tomcat (8), Zookeeper. At times
> > SOLR in Node 1 stops responding, to fix the issue I am restarting tomcat in
> > Node 1, but SOLR not starting up, but if I remove the solr cores in both
> > nodes and try restarting it starts working, and then I have to reindex the
> > whole data again. We are using this setup in production because of this
> > issue we are having 1 to 1.30 hours of service down time. Any suggestions
> > would be greatly appreciated.
> >
> > Thanks,
> > Doss.