Re: SOLR not starting after restart 2 node cloud setup

2014-12-02 Thread Doss
Dear Erick,

Thanks for your thoughts, they helped me a lot. On my instances, no Solr logs
are appended to catalina.out.

I have now put a log4j.properties file in place, so Solr logs are captured in
solr.log, and with its help I found the reason for the issue.
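For anyone in a similar spot, here is a minimal Python sketch of pulling the problem lines out of a log such as solr.log. It assumes the logging layout writes the level name (ERROR, WARN and so on) as plain text in each line; the function names and the file name in the usage comment are hypothetical, not part of Solr.

```python
import re
from collections import Counter

# Assumption: the level name appears as a plain word somewhere in each log line.
LEVEL_RE = re.compile(r"\b(ERROR|WARNING|WARN|SEVERE|FATAL)\b")


def problem_lines(lines):
    """Yield (line_number, level, text) for lines that mention a problem level."""
    for number, line in enumerate(lines, start=1):
        match = LEVEL_RE.search(line)
        if match:
            yield number, match.group(1), line.rstrip("\n")


def summarize(lines):
    """Count how many lines were logged at each problem level."""
    return Counter(level for _, level, _ in problem_lines(lines))


# Hypothetical usage against a real file:
#   with open("solr.log", encoding="utf-8", errors="replace") as handle:
#       for number, level, text in problem_lines(handle):
#           print(number, text)
```

This only narrows down where to start reading; the stack trace that follows a flagged line usually carries the actual cause.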

I was starting Tomcat with the option -Dbootstrap_conf=true, which made Solr
look for the core configuration files in the wrong directory. After removing
this option, Solr started without any issues.

I also commented out the suggester component, which made Solr load faster.

Thanks,
Doss.





Re: SOLR not starting after restart 2 node cloud setup

2014-12-02 Thread Erick Erickson
Glad you found a solution!

Best,
Erick


Re: SOLR not starting after restart 2 node cloud setup

2014-11-20 Thread Doss
Dear Erick,

Forgive my ignorance.

Please find below some of the details you asked for.

*have you looked at the solr logs?*

Sorry, I haven't defined the log4j.properties file, so I don't have Solr
logs. Since adding it requires a Tomcat restart, I am planning to do it at
the next restart.

But I found the following in the Tomcat log:

18-Nov-2014 11:27:29.028 WARNING [localhost-startStop-2]
org.apache.catalina.loader.WebappClassLoader.clearReferencesThreads The web
application [/mima] appears to have started a thread named
[localhost-startStop-1-SendThread(10.236.149.28:2181)] but has failed to
stop it. This is very likely to create a memory leak. Stack trace of thread:
 sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
 sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
 sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
 sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
 sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
 org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
 org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)


*How big are the cores?*

We have 16 cores, of which only 5 are big. The total size of all 16
cores is 10+ GB.

*How many docs in the cores when the problem happens?*

1 core with 163 fields and 33,00,000 documents (index size 2+ GB)
4 cores with 3 fields and approximately 150,00,000 documents (1.2 to 1.5 GB)
The remaining cores have 1,00,000 to 40,00,000 documents
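
The document counts above use Indian digit grouping (lakhs), which is easy to misread. As a small illustration, the Python below strips the separators and prints the same figures with thousands grouping; the labels are shorthand invented here for the cores described above.

```python
def parse_grouped_int(text):
    """Convert a comma-grouped number such as '33,00,000' to an int."""
    return int(text.replace(",", ""))


# Figures quoted in the message above; the labels are illustrative only.
counts = {
    "163-field core": parse_grouped_int("33,00,000"),
    "3-field cores (approx)": parse_grouped_int("150,00,000"),
    "smallest remaining core": parse_grouped_int("1,00,000"),
    "largest remaining core": parse_grouped_int("40,00,000"),
}

for label, value in counts.items():
    # The ',' format specifier groups digits in thousands.
    print(f"{label}: {value:,}")
```

That puts the largest figure at 15 million documents and the 163-field core at 3.3 million.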

*How much memory are you allocating the JVM?*

5 GB for the JVM; the total RAM available on the systems is 30 GB.
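
To see how close a JVM runs to a heap limit such as the 5 GB above without a GUI, one option is the JDK's jstat tool. The Python sketch below parses the output of `jstat -gcutil <pid>`; it assumes jstat is on the PATH and that the output is a header row followed by a row of values, and the helper names are hypothetical. Old-generation utilization (column O) is only a rough proxy for overall memory pressure.

```python
import subprocess


def parse_gcutil(output):
    """Turn the header row and last value row of `jstat -gcutil` into {column: float}."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    header, values = rows[0], rows[-1]
    stats = {}
    for name, value in zip(header, values):
        try:
            # Some locales print a decimal comma; normalize before converting.
            stats[name] = float(value.replace(",", "."))
        except ValueError:
            continue  # skip columns that print a placeholder such as "-"
    return stats


def old_gen_percent(pid):
    """Run jstat against a JVM process id and return old-generation utilization in percent."""
    result = subprocess.run(
        ["jstat", "-gcutil", str(pid)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gcutil(result.stdout)["O"]
```

If the old generation stays near 100% and full collections keep running, an OutOfMemory condition is a reasonable suspect.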

*can you restart Tomcat without a problem?*

This problem is occurring in production, so I have never tried.


Thanks,
Doss.





Re: SOLR not starting after restart 2 node cloud setup

2014-11-20 Thread Erick Erickson
Doss:

Tomcat often puts things in catalina.out, so you might check there;
I've often seen logging information from Solr go there by
default.

Without having some idea what kinds of problems Solr is
reporting when you see this situation, it's really hard to say.

Some things I'd check first though, in order of what
I _guess_ is most likely.

1. There have been anecdotal reports (in fact, I'm trying
to understand the why of it right now) of the suggester
taking a long time to initialize, even if you don't use it!
So if you're not using the suggest component, try
commenting out those sections in solrconfig.xml for
the cores in question. I like this explanation since it
fits with your symptoms, but I don't like it since the
index you are using isn't all that big. So it's something
of a shot in the dark. I expect that the core will
_eventually_ come up, but I've seen reports of 10-15
minutes being required, far beyond my patience! That
said, this would also explain why deleting the index
works.

2. OutOfMemory errors. You might be able to attach
JConsole (part of the standard Java stuff) to the process
and monitor the memory usage. If it's being pushed near
the 5G limit that's the first thing I'd suspect.

3. If you're using the default setups, then the ZooKeeper
timeout may be too low. I think the default (not sure
whether it's been changed in 4.9) is 15 seconds; 30-60
is usually much better.
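
As a rough way to check which ZooKeeper timeout a node is configured with, the Python sketch below reads `zkClientTimeout` from solr.xml. It assumes the newer-style solr.xml, where the value sits in an `<int name="zkClientTimeout">` element under `<solrcloud>` and may use Solr's `${property:default}` substitution; older-style files that set it as an attribute on `<cores>` are not handled. The function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches Solr's ${name} or ${name:default} property substitution syntax.
PROPERTY_RE = re.compile(r"^\$\{([^:}]+)(?::([^}]*))?\}$")


def resolve(value, system_props):
    """Resolve ${name:default} against a dict of -D style system properties."""
    match = PROPERTY_RE.match(value.strip())
    if not match:
        return value.strip()
    name, default = match.groups()
    return system_props.get(name, default)


def zk_client_timeout_ms(solr_xml_text, system_props=None):
    """Return the effective zkClientTimeout in milliseconds, or None if it is not set."""
    root = ET.fromstring(solr_xml_text)
    node = root.find("solrcloud/int[@name='zkClientTimeout']")
    if node is None or node.text is None:
        return None
    resolved = resolve(node.text, system_props or {})
    return int(resolved) if resolved else None
```

A result below 30000 (the value is in milliseconds) would be under the 30-60 second range suggested above.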

Best,
Erick





SOLR not starting after restart 2 node cloud setup

2014-11-19 Thread Doss
I have a two-node SolrCloud (Solr 4.9.0) setup with Tomcat 8 and ZooKeeper.
At times Solr on Node 1 stops responding. To fix the issue I restart Tomcat
on Node 1, but Solr does not start up. If I remove the Solr cores on both
nodes and then restart, it starts working, but then I have to reindex all
the data again. We are using this setup in production, and because of this
issue we are having 1 to 1.5 hours of service downtime. Any suggestions
would be greatly appreciated.

Thanks,
Doss.


Re: SOLR not starting after restart 2 node cloud setup

2014-11-19 Thread Erick Erickson
You've really got to provide details for us to say much
of anything. There are about a zillion things that it could be.

In particular, have you looked at the solr logs? Are there
any interesting things in them? How big are the cores?
How much memory are you allocating the JVM? How
many docs in the cores when the problem happens?
Before the nodes stop responding, can you restart
Tomcat without a problem?

You might review:
http://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

