Re: Frequent garbage collections after a day of operation
A wonderful writeup on various memory collection concerns: http://www.lucidimagination.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/

On Fri, Feb 17, 2012 at 12:27 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote:
>> One thing that could fit the pattern you describe would be Solr caches filling up and getting you too close to your JVM or memory limit
>
> This [uncommitted] issue would solve that problem by allowing the GC to collect caches that become too large, though in practice, the cache setting would need to be fairly large for an OOM to occur from them: https://issues.apache.org/jira/browse/SOLR-1513
Frequent garbage collections after a day of operation
Hey everyone,

we're running into some operational problems with our SOLR production setup and were wondering if anyone else is affected or has even solved these problems before. We're running a vanilla SOLR 3.4.0 in several Tomcat 6 instances, so nothing out of the ordinary, but after a day or so of operation we see increased response times from SOLR, up to three times higher on average. During this time we see increased CPU load due to heavy garbage collection in the JVM, which bogs down the whole system, so throughput naturally decreases. Restarting the slaves brings everything back to normal, but that's more of a brute-force solution.

The thing is, we don't know what's causing this, and we don't have much experience with Java stacks since we're for the most part a Rails company. Are Tomcat 6 or SOLR known to leak memory? Is anyone else seeing this, or can you think of a reason for it?

Most of our queries to SOLR involve the DismaxHandler and the spatial search query components. We don't use any custom request handlers so far.

Thanks in advance,
-Matthias

--
Matthias Käppler
Lead Developer API Mobile
Qype GmbH
Großer Burstah 50-52
20457 Hamburg
Telephone: +49 (0)40 - 219 019 2 - 160
Skype: m_kaeppler
Email: matth...@qype.com
Managing Director: Ian Brotherston
Amtsgericht Hamburg HRB 95913
Re: Frequent garbage collections after a day of operation
Make sure each of your Tomcat instances is started with a max heap size such that the heaps together add up to a lot less than the total RAM of your system.

Frequent garbage collection means that your applications request more RAM but your Java VM has no more free heap, so it needs the garbage collector to free memory before the requested new objects can be created. That does not by itself indicate a memory leak, unless you are running a custom EntityProcessor in DIH that runs into an infinite loop and creates huge amounts of schema fields. ;-)

Also, if you are doing hot deploys on Tomcat, you will have to restart the Tomcat instance on a regular basis, as hot deploys DO leak memory after a while. (You might be seeing class undeploy messages in catalina.out and later on OutOfMemory error messages.)

If this is not of any help, you will probably have to provide a bit more information on your Tomcat and SOLR configuration setup.

Chantal

On Thu, 2012-02-16 at 16:22 +0100, Matthias Käppler wrote:
> [...] Are Tomcat 6 or SOLR known to leak memory? Is anyone else seeing this, or can you think of a reason for this? [...]
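To make the heap-sizing advice above concrete, here is a minimal sketch of the arithmetic in Java. The class and method names are hypothetical, and the instance count, `-Xmx` value, RAM size and the 60% ceiling are made-up example numbers, not recommendations from this thread. The ceiling is only meant to reflect that each JVM also uses memory outside its heap and that the operating system needs some RAM for its file cache.

```java
// Hypothetical helper: checks that the combined max heap (-Xmx) of all
// Tomcat instances stays well below the machine's physical RAM.
final class HeapBudget {

    /** Total heap in MB claimed by `instances` JVMs that share the same -Xmx. */
    static long totalHeapMb(int instances, long xmxMbPerInstance) {
        return instances * xmxMbPerInstance;
    }

    /** True if the combined heap is at most `maxFraction` of physical RAM. */
    static boolean fitsInRam(int instances, long xmxMbPerInstance,
                             long ramMb, double maxFraction) {
        return totalHeapMb(instances, xmxMbPerInstance) <= ramMb * maxFraction;
    }

    public static void main(String[] args) {
        // Assumed example: 4 Tomcat instances, -Xmx2048m each, 16 GB of RAM.
        System.out.println("combined heap MB: " + totalHeapMb(4, 2048));
        System.out.println("within budget: " + fitsInRam(4, 2048, 16384, 0.6));

        // Max heap of the JVM running this code, as a sanity check of -Xmx.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("this JVM's max heap MB: " + maxMb);
    }
}
```

Running the last three lines inside each Tomcat JVM (or reading the same value in jconsole) is a quick way to confirm which max heap each instance was actually started with.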
RE: Frequent garbage collections after a day of operation
A couple of thoughts:

We wound up doing a bunch of tuning on the Java garbage collection. However, the pattern we were seeing was periodic, very extreme slowdowns, because we were then using the default garbage collector, which blocks when it has to do a major collection. This doesn't sound like your problem, but it's something to be aware of.

One thing that could fit the pattern you describe would be Solr caches filling up and getting you too close to your JVM or memory limit. For example, if you have large documents, and have defined a large document cache, that might do it.

I found it useful to point jconsole (free with the JDK) at my JVM and watch the pattern of memory usage. If the troughs at the bottom of the GC cycles keep rising, you know you've got something that is continuing to grab more memory and not letting go of it. Now that our JVM is running smoothly, we just see a sawtooth pattern, with the troughs approximately level. When the system is under load, the frequency of the wave rises. Try it and see what sort of pattern you're getting.

-- Bryan

-----Original Message-----
From: Matthias Käppler [mailto:matth...@qype.com]
Sent: Thursday, February 16, 2012 7:23 AM
To: solr-user@lucene.apache.org
Subject: Frequent garbage collections after a day of operation

[...] after a day or so of operation we see increased response times from SOLR, up to 3 times increases on average. During this time we see increased CPU load due to heavy garbage collection in the JVM [...]
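The "rising troughs" check described above can also be done on logged numbers instead of by eye. The following is a minimal sketch, not something from the thread: the class name, the tolerance parameter and the sample series are all hypothetical. It treats a trough as a sample lower than both of its neighbours and reports whether each trough sits clearly above the previous one. Heap usage is read with the standard `java.lang.management` API.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for spotting rising troughs in a heap-usage series.
final class SawtoothCheck {

    /** Current used heap of this JVM in MB, e.g. for periodic sampling. */
    static long currentHeapUsedMb() {
        return ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getUsed() / (1024 * 1024);
    }

    /** Troughs of the series: samples strictly lower than both neighbours. */
    static List<Long> troughs(long[] usedMb) {
        List<Long> result = new ArrayList<Long>();
        for (int i = 1; i < usedMb.length - 1; i++) {
            if (usedMb[i] < usedMb[i - 1] && usedMb[i] < usedMb[i + 1]) {
                result.add(usedMb[i]);
            }
        }
        return result;
    }

    /** True if every trough exceeds the previous one by more than toleranceMb. */
    static boolean troughsKeepRising(long[] usedMb, long toleranceMb) {
        List<Long> t = troughs(usedMb);
        if (t.size() < 2) {
            return false; // not enough GC cycles to judge
        }
        for (int i = 1; i < t.size(); i++) {
            if (t.get(i) - t.get(i - 1) <= toleranceMb) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up samples in MB: a level sawtooth and one with rising troughs.
        long[] healthy = {100, 400, 110, 420, 105, 410, 108, 400};
        long[] suspect = {100, 400, 150, 450, 220, 500, 300, 550};
        System.out.println("healthy rising? " + troughsKeepRising(healthy, 20));
        System.out.println("suspect rising? " + troughsKeepRising(suspect, 20));
        System.out.println("heap used now (MB): " + currentHeapUsedMb());
    }
}
```

In practice the samples would come from whatever you already log (for example values read off jconsole or collected by calling `currentHeapUsedMb()` on a timer), and the tolerance would need tuning to your heap size.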
Re: Frequent garbage collections after a day of operation
On Thu, Feb 16, 2012 at 7:14 PM, Bryan Loofbourrow bloofbour...@knowledgemosaic.com wrote:
> One thing that could fit the pattern you describe would be Solr caches filling up and getting you too close to your JVM or memory limit

This [uncommitted] issue would solve that problem by allowing the GC to collect caches that become too large, though in practice, the cache setting would need to be fairly large for an OOM to occur from them: https://issues.apache.org/jira/browse/SOLR-1513
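As an illustration of the general idea of a cache whose entries the GC may reclaim, here is a minimal sketch built on `java.lang.ref.SoftReference`. This is an assumption about one common technique, not the SOLR-1513 patch and not Solr's cache API; the class name `SoftCache` is hypothetical. The JVM is allowed to clear softly reachable values when memory gets tight, so a lookup can come back empty and the caller has to be ready to reload the value.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache whose values are held through soft references,
// so the garbage collector may reclaim them under memory pressure.
final class SoftCache<K, V> {

    private final ConcurrentMap<K, SoftReference<V>> map =
            new ConcurrentHashMap<K, SoftReference<V>>();

    void put(K key, V value) {
        map.put(key, new SoftReference<V>(value));
    }

    /** Returns the cached value, or null if absent or already collected. */
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The GC cleared the value; drop the stale entry.
            map.remove(key, ref);
        }
        return value;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<String, String>();
        cache.put("doc1", "large document body");
        System.out.println(cache.get("doc1"));
        System.out.println(cache.get("missing"));
    }
}
```

The trade-off is that hit rates become unpredictable once the heap is under pressure, which is one reason a bounded cache size is still worth setting.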