Doss:

Tomcat often writes things to "catalina.out", so you might check there.
I've often seen logging information from Solr go there by
default.
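
If catalina.out is large, a small script can pull out the lines worth
reading first. Here's a minimal sketch in Python. The function names and
the pattern are my own illustration, not anything Tomcat or Solr provides.
It assumes the log uses the WARNING/SEVERE level names seen in the Tomcat
line quoted below, and that an out-of-memory condition shows up as the
JVM's OutOfMemoryError class name.

```python
import re
from typing import Iterable, Iterator, List

# Assumed markers: Tomcat's WARNING/SEVERE level names (as in the log line
# quoted in this thread) plus the JVM's OutOfMemoryError class name.
INTERESTING = re.compile(r"\b(SEVERE|WARNING)\b|OutOfMemoryError")


def interesting_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines that mention a warning, a severe error, or an OOM."""
    for line in lines:
        if INTERESTING.search(line):
            yield line.rstrip("\n")


def scan_log(path: str) -> List[str]:
    """Return the interesting lines from the log file at `path`."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(interesting_lines(handle))
```

Calling scan_log() with the path to your own catalina.out (it normally
lives in Tomcat's logs directory) gives you a short list to read through.
It only narrows things down; the surrounding lines and stack traces still
need a look in the full file.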

Without having some idea what kinds of problems Solr is
reporting when you see this situation, it's really hard to say.

Here are some things I'd check first, though, in order of what
I _guess_ is most likely:

> There have been anecdotal reports (in fact, I'm trying
to understand the why of it right now) of the suggester
taking a long time to initialize, even if you don't use it!
So if you're not using the suggest component, try
commenting out those sections in solrconfig.xml for
the cores in question. I like this explanation since it
fits with your symptoms, but I don't like it since the
index you are using isn't all that big. So it's something
of a shot in the dark. I expect that the core will
_eventually_ come up, but I've seen reports of 10-15
minutes being required, far beyond my patience! That
said, this would also explain why deleting the index
works.

> OutOfMemory errors. You might be able to attach
JConsole (it ships with the JDK) to the process
and monitor the memory usage. If it's being pushed near
the 5G limit, that's the first thing I'd suspect.

> If you're using the default setup, then the ZooKeeper
timeout (zkClientTimeout) may be too low. I think the default
(not sure whether it's been changed in 4.9) is 15 seconds;
30-60 seconds is usually much better.

Best,
Erick


On Thu, Nov 20, 2014 at 3:47 AM, Doss <itsmed...@gmail.com> wrote:
> Dear Erick,
>
> Forgive my ignorance.
>
> Please find some of the details you required.
>
> *have you looked at the solr logs?*
>
> Sorry, I haven't defined the log4j.properties file, so I don't have Solr
> logs. Since it requires a Tomcat restart, I am planning to do it at the
> next restart.
>
> But I found the following in the Tomcat log:
>
> 18-Nov-2014 11:27:29.028 WARNING [localhost-startStop-2]
> org.apache.catalina.loader.WebappClassLoader.clearReferencesThreads The web
> application [/mima] appears to have started a thread named
> [localhost-startStop-1-SendThread(10.236.149.28:2181)] but has failed to
> stop it. This is very likely to create a memory leak. Stack trace of thread:
>  sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>  sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>  sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>  sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>  sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>  org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
>  org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
>
>
> *How big are the cores?*
>
> We have 16 cores, of which only 5 are big ones. The total size of all 16
> cores is 10+ GB.
>
> *How many docs in the cores when the problem happens?*
>
> 1 core with 163 fields and 33,00,000 documents (Index size 2+ GB)
>  4 cores with 3 fields and has 150,00,000 (approx) documents (1.2 to 1.5 GB)
> remaining cores are 1,00,000 to 40,00,000 documents
>
> *How much memory are you allocating the JVM? *
>
> 5GB for JVM, Total RAM available in the systems is 30 GB
>
> *can you restart Tomcat without a problem?*
>
> This problem is occurring in production, so I have never tried.
>
>
> Thanks,
> Doss.
>
>
> On Wed, Nov 19, 2014 at 7:55 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> You've really got to provide details for us to say much
>> of anything. There are about a zillion things that it could be.
>>
>> In particular, have you looked at the solr logs? Are there
>> any interesting things in them? How big are the cores?
>> How much memory are you allocating the JVM? How
>> many docs in the cores when the problem happens?
>> Before the nodes stop responding, can you restart
>> Tomcat without a problem?
>>
>> You might review:
>> http://wiki.apache.org/solr/UsingMailingLists
>>
>> Best,
>> Erick
>>
>>
>> On Wed, Nov 19, 2014 at 1:04 AM, Doss <itsmed...@gmail.com> wrote:
>> > I have a two-node Solr (4.9.0) cloud with Tomcat (8) and Zookeeper. At
>> > times Solr on Node 1 stops responding. To fix the issue I restart Tomcat
>> > on Node 1, but Solr does not start up. If I remove the Solr cores on
>> > both nodes and restart, it starts working, but then I have to reindex
>> > all the data again. We are using this setup in production, and because
>> > of this issue we are having 1 to 1.30 hours of service downtime. Any
>> > suggestions would be greatly appreciated.
>> >
>> > Thanks,
>> > Doss.
>>
