Re: SOLR hangs - update timeout - please help
Working for a week now, no signs of fatigue. Many thanks for all the hints R -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-hangs-update-timeout-please-help-tp3863851p3899004.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR hangs - update timeout - please help
Hi, To anyone still interested in this subject: after disabling the Windows NIO handler in Jetty, SOLR became more stable. It has now been running for 3 days without any hangs or slowdowns. I'll post the next update in a few days.
Re: SOLR hangs - update timeout - please help
Lance, I know there are many variables; that's why I'm asking where to start and what to check. Updates are sent every 5-7 seconds, and each update contains between 1 and 50 docs. A commit is done every time (on each update). Currently queries aren't very frequent, about 1 query every 3-5 seconds, but the system is going to handle much more (if the problem is fixed, of course). The system has a 2-core CPU (virtualized) and 4 GB of memory (SOLR uses about 300 MB). R
Re: SOLR hangs - update timeout - please help
5-7 seconds: there's the problem. If you want documents to be visible for search within that time, you want to use the trunk and near-real-time search. A hard commit does several hard writes to the disk (with the fsync() system call). It does not run smoothly at that rate. It is no surprise that eventually you hit a thread-locking bug.

http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/RealTimeGet
http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/CommitWithin

-- Lance Norskog goks...@gmail.com
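To make the CommitWithin suggestion concrete, here is a minimal sketch of building an XML update message that carries a `commitWithin` attribute on `<add>`, so the client asks Solr to make the documents visible within a time window rather than issuing a hard commit on every update. The field names (`id`, `title`) and the 30-second default are assumptions for illustration only; check the attribute against the update-message documentation for your Solr version.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, commit_within_ms=30000):
    """Build a Solr XML <add> message with a commitWithin attribute.

    docs is a list of dicts mapping field name -> value. The field names
    are hypothetical; use the ones defined in your own schema.
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)  # ElementTree escapes <, > and & for us
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_message([{"id": "doc-1", "title": "hello"}]))
```

The resulting string would be sent as the body of the usual update request, with no separate commit call from the client.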
Re: SOLR hangs - update timeout - please help
That's bad news. If 5-7 seconds is not safe, then what is the safe interval for updates? Near real-time is not for me, as it works only when querying by document Id - this doesn't solve anything in my case. I just want the index to be updated in real time; a 30-40 second delay is acceptable, but not much more than that. Is there anything that can be done, or should I start looking for some other indexing tool?

I'm wondering why there's such terrible performance degradation over time. SOLR runs fine for the first 10-20 hours and updates are extremely fast; then they become slower and slower until eventually they stop executing at all. Is there an issue with garbage collection, index fragmentation, or some internal data structures that can't manage their data effectively when updates are frequent?

Best regards
RG
Re: SOLR hangs - update timeout - please help
Could be garbage collection. Could be larger and larger merges. At some point your commit will cause all segments to be merged. It's likely that what's happening is that you need to hit the magic combination of events, particularly the problem of too many warming searchers. So, look at your log files or the admin page and see what your searcher warmup times are. This provides a lower bound for your commit interval.

I'm guessing you have a single machine that's indexing and searching. Consider a master/slave setup, which will avoid the problem of indexing and search contention. As you say you're going to handle many more queries in the future, this may be required anyway...

NRT does not just search doc IDs; it's intended for this kind of problem, so I believe that is a possibility. But we're talking trunk here, I think.

I _strongly_ encourage you to think about whether such rapid search availability is really required. Often 3-5 minutes is acceptable if you ask, which gives you ample time to avoid this problem. That said, you have a relatively small index here, so you may be able to get away with, say, 30 second commits.

Best
Erick
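The point that searcher warmup time puts a lower bound on the commit interval can be turned into a small calculation. The sketch below takes warmup times you have read off the logs or the admin page and returns a conservative interval; the `headroom` multiplier is purely an assumption for illustration, not a Solr setting or a recommended value.

```python
def min_commit_interval(warmup_times_s, headroom=2.0):
    """Return a conservative commit interval in seconds.

    warmup_times_s: observed searcher warmup times, in seconds.
    headroom: hypothetical safety multiplier (must be >= 1.0), so that a
    new commit is unlikely to arrive while a searcher is still warming.
    """
    if not warmup_times_s:
        raise ValueError("need at least one observed warmup time")
    if headroom < 1.0:
        raise ValueError("headroom below 1.0 would undercut the lower bound")
    # The slowest observed warmup is the lower bound; headroom pads it.
    return max(warmup_times_s) * headroom


if __name__ == "__main__":
    print(min_commit_interval([5.0, 15.0, 30.0]))
```

If the result is longer than the delay the application can tolerate, that is a sign to reduce warming work rather than to commit more often.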
Re: SOLR hangs - update timeout - please help
On 3/29/2012 2:49 AM, Rafal Gwizdala wrote:
That's bad news. If 5-7 seconds is not safe then what is the safe interval for updates? Near real-time is not for me as it works only when querying by document Id - this doesn't solve anything in my case. I just want the index to be updated in real-time, 30-40 seconds delay is acceptable but not much more than that. Is there anything that can be done, or should I start looking for some other indexing tool? I'm wondering why there's such terrible performance degradation over time - SOLR runs fine for first 10-20 hours, updates are extremely fast and then they become slower and slower until eventually they stop executing at all. Is there any issue with garbage collection or index fragmentation or some internal data structures that can't manage their data effectively when updates are frequent?

You've gotten some replies from experts already. I'm nowhere near their caliber, but I do have some things to say about my experiences.

When I do a commit, it can take 30 seconds or longer, although most of the time it's between 5 and 15 seconds. The bulk of that time is spent warming the caches. I have a program that starts updates at the top of every minute, but won't begin checking the time again until the previous update is done. I've checked things carefully, and it's warming the filter cache that takes so much time. The crazy thing is that my autoWarmCount for filterCache is only 4. We have some very, very nasty filter queries.

Are you kicking off these updates every 5-7 seconds even if the previous update has not finished running? You might be able to make things better by only doing the current update if the previous update has finished, which means using the default waitSearcher=true on your commits. You can try other things too: reducing the size of Solr's caches and reducing the autoWarmCount, possibly to zero.

Garbage collection can definitely be a problem, and that can be compounded if the machine does not have enough RAM for the OS to keep a large chunk of your index cached, and/or you have not given enough RAM to the JVM.

As for garbage collection, I have had good luck with the following options added to the java commandline. As you can see, I have an 8GB heap size, which is quite a bit more than my Solr actually needs. Garbage collection is less of a problem if the JVM has plenty of memory to work with, though I understand that if it has too much memory, you start having different problems with GC.

-Xms8192M -Xmx8192M -XX:NewRatio=1 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled

The servers are maxed out at 64GB, and each server is handling three index cores totaling about 60GB, so I can't quite fit all of my index into RAM. I wish I had 256GB per server - Solr would perform much better.

You say your server has 4GB of memory, and that Solr is only using 300MB? I would guess that you need to upgrade to 8GB, 16GB, or more if you can. Then you should give Solr at least 2-3GB of that, leaving the rest to cache your index. With 5 million records, your index is probably several gigabytes in size.

Thanks,
Shawn
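The idea of never starting an update while the previous one is still running can be sketched as a simple sequential loop. This is only an illustration under stated assumptions: `send_update` is a hypothetical callable that blocks until Solr has finished the update and commit (as it would with waitSearcher=true), and the `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.

```python
import time


def run_updates(send_update, interval_s, rounds,
                clock=time.monotonic, sleep=time.sleep):
    """Call send_update() `rounds` times without ever overlapping two calls.

    send_update: hypothetical blocking callable; it must not return until
    the update (and its commit) has completed on the Solr side.
    interval_s: desired spacing between the starts of consecutive updates.
    """
    next_start = clock()
    for _ in range(rounds):
        now = clock()
        if now < next_start:
            sleep(next_start - now)
        started = clock()
        send_update()  # blocks until the previous work is really done
        finished = clock()
        # Keep the requested spacing, but if the update overran the
        # interval, start the next one only once this one has finished.
        next_start = max(started + interval_s, finished)
```

With a 30-second interval, an update that takes 45 seconds simply delays the next one instead of piling a second commit on top of the first.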
Re: SOLR hangs - update timeout - please help
Have you tried using Solr 3.5 with RankingAlgorithm 1.4.1? It has NRT support and is very fast: it updates about 5000 documents in about 490 ms (while updating 1m docs in batches of 5k). You can get more info from here: http://solr-ra.tgels.com/wiki/en/Near_Real_Time_Search_ver_3.x

Regards,
Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org
Re: SOLR hangs - update timeout - please help
On Thu, Mar 29, 2012 at 4:24 AM, Lance Norskog goks...@gmail.com wrote:
5-7 seconds- there's the problem. If you want to have documents visible for search within that time, you want to use the trunk and near-real-time search. A hard commit does several hard writes to the disk (with the fsync() system call). It does not run smoothly at that rate. It is no surprise that eventually you hit a thread-locking bug.

Are you speaking of a JVM bug, or something else? A Lucene bug? A Solr bug?

Rafal, do you have a thread dump of when the update hangs (as opposed to at shutdown?)

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10
Re: SOLR hangs - update timeout - please help
If you must have real-time search, you might look at systems that are designed to do that. MarkLogic isn't free, but it is fast and real-time. You can use their no-charge Express license for development and prototyping: http://developer.marklogic.com/express

OK, back to Solr.

wunder
Search Guy, Chegg
former MarkLogic engineer
Re: SOLR hangs - update timeout - please help
Guys, thanks for all the suggestions. I will be trying them one at a time. IMHO it's too early to give up and look for another tool; I'll try to work on the configuration and see what happens. NRT looks quite promising, and there are also tons of config options to change. For now, I have made the updates less frequent, about once every 30 seconds (but the batches are now bigger, about 150-200 documents per update). I'll see if this makes SOLR more stable or the users more aggressive. Unfortunately I have no resources for experimenting, so I'll keep making small changes to the production system and observing the effects.

Shawn, I have given the JVM about 2 GB of memory, but it's only using 300 MB, so I don't think there's a memory shortage now. The whole index is about 2 GB in size, but I think there aren't enough queries to fill up the cache and make SOLR load everything into memory.

Below I'm pasting the thread dump taken when the update was hung (it's also attached to the first message of this topic).

Best regards,
RG

<solr>
  <core>example</core>
  <system>
    <jvm>
      <version>20.5-b03</version>
      <name>Java HotSpot(TM) 64-Bit Server VM</name>
    </jvm>
    <threadCount>
      <current>31</current>
      <peak>32</peak>
      <daemon>8</daemon>
    </threadCount>
    <threadDump>
      <thread>
        <id>39</id>
        <name>pool-4-thread-1</name>
        <state>WAITING</state>
        <lock>java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@765bc9b8</lock>
        <cpuTime>312,5000ms</cpuTime>
        <userTime>265,6250ms</userTime>
        <stackTrace>
          <line>at sun.misc.Unsafe.park(Native Method)</line>
          <line>at java.util.concurrent.locks.LockSupport.park(Unknown Source)</line>
          <line>at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)</line>
          <line>at java.util.concurrent.DelayQueue.take(Unknown Source)</line>
          <line>at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)</line>
          <line>at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)</line>
          <line>at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)</line>
          <line>at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)</line>
          <line>at java.lang.Thread.run(Unknown Source)</line>
        </stackTrace>
      </thread>
      <thread>
        <id>38</id>
        <name>pool-2-thread-1</name>
        <state>WAITING</state>
        <lock>java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4188bbd</lock>
        <cpuTime>6484,3750ms</cpuTime>
        <userTime>5546,8750ms</userTime>
        <stackTrace>
          <line>at sun.misc.Unsafe.park(Native Method)</line>
          <line>at java.util.concurrent.locks.LockSupport.park(Unknown Source)</line>
          <line>at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)</line>
          <line>at java.util.concurrent.DelayQueue.take(Unknown Source)</line>
          <line>at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)</line>
          <line>at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)</line>
          <line>at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)</line>
          <line>at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)</line>
          <line>at java.lang.Thread.run(Unknown Source)</line>
        </stackTrace>
      </thread>
      <thread>
        <id>37</id>
        <name>DestroyJavaVM</name>
        <state>RUNNABLE</state>
        <cpuTime>4906,2500ms</cpuTime>
        <userTime>4484,3750ms</userTime>
        <stackTrace>
        </stackTrace>
      </thread>
      <thread>
        <id>36</id>
        <name>qtp1033068770-36</name>
        <state>TIMED_WAITING</state>
        <lock>java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@677e2764</lock>
        <cpuTime>134968,7500ms</cpuTime>
        <userTime>114984,3750ms</userTime>
        <stackTrace>
          <line>at sun.misc.Unsafe.park(Native Method)</line>
          <line>at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)</line>
          <line>at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)</line>
          <line>at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:320)</line>
          <line>at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:480)</line>
          <line>at java.lang.Thread.run(Unknown Source)</line>
        </stackTrace>
      </thread>
      <thread>
        <id>35</id>
        <name>qtp1033068770-35</name>
        <state>RUNNABLE</state>
        <cpuTime>147390,6250ms</cpuTime>
        <userTime>126593,7500ms</userTime>
        <stackTrace>
          <line>at sun.management.ThreadImpl.getThreadInfo1(Native Method)</line>
          <line>at sun.management.ThreadImpl.getThreadInfo(Unknown Source)</line>
          <line>at org.apache.jsp.admin.threaddump_jsp._jspService(org.apache.jsp.admin.threaddump_jsp:264)</line>
          <line>at
Re: SOLR hangs - update timeout - please help
More memory is not necessarily better; it can lead to longer, more intense garbage collections that cause things to stop. You might also consider lowering your memory allocation. That said, 2G is really not all that much, so I somewhat doubt it's a problem, but I thought I'd mention it.

Best
Erick
Re: SOLR hangs - update timeout - please help
On Thu, Mar 29, 2012 at 1:50 PM, Rafal Gwizdala rafal.gwizd...@gmail.com wrote:
Below i'm pasting the thread dump taken when the update was hung (it's also attached to the first message of this topic)

Interesting... It looks like there's only one thread in solr code (the one generating the thread dump). The stack trace looks like you switched Jetty to use the NIO connector perhaps? Could you try with the Jetty shipped with Solr (exactly as configured)?

-Yonik
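Checking which threads are actually inside Solr or Lucene code can be done mechanically on a text thread dump. The sketch below assumes jstack-style text (a header line beginning with the quoted thread name, followed by `at ...` frame lines), which is not the format of the admin-page dump pasted in this thread. The sample dump is made up for illustration and is not taken from the reported hang.

```python
import re

# Package prefixes treated as "Solr code" for this sketch (an assumption).
SOLR_PREFIXES = ("org.apache.solr.", "org.apache.lucene.")

# Made-up sample in jstack-like layout, for illustration only.
SAMPLE = '''
"qtp1033068770-36" prio=6 tid=0x01 nid=0x1 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:320)

"qtp1033068770-35" prio=6 tid=0x02 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java)
'''


def threads_in_solr_code(dump_text, prefixes=SOLR_PREFIXES):
    """Return names of threads with at least one frame in `prefixes`."""
    names = []
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        header = re.match(r'"([^"]+)"', line)
        if header:
            current = header.group(1)  # a new thread section begins
        elif current and line.startswith("at ") and line[3:].startswith(prefixes):
            if current not in names:
                names.append(current)
    return names


if __name__ == "__main__":
    print(threads_in_solr_code(SAMPLE))
```

An empty result while update requests are timing out would suggest the requests are stuck before they reach Solr, for example in the servlet container.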
Re: SOLR hangs - update timeout - please help
Yonik, I didn't say there was an update request active at the moment the thread dump was made, only that previous update requests had failed with a timeout. So maybe this is the missing piece. I didn't enable NIO in Jetty; it's probably there by default. Disabling it is the next thing to check. If Solr hangs next time, I'll try to make a thread dump while the update request is waiting for completion.

Best regards
RG
Re: SOLR hangs - update timeout - please help
Oops... my previous replies accidentally went off-list. I'll cut-n-paste below.

OK, so it looks like there is probably no bug here - it's simply that commits can sometimes take a long time and updates were blocked during that time (and would have succeeded eventually except the jetty timeout was not set long enough). Things are better in trunk (4.0) with soft commits and updates that can proceed concurrently with commits.

-Yonik

On Thu, Mar 29, 2012 at 3:11 PM, Rafal Gwizdala rafal.gwizd...@gmail.com wrote:
You're right, this is not default Jetty from Solr - I configured it from scratch and then added Solr. Previously I had autocommit enabled and also did commit on every update so this might also contribute to the problem. Now I disabled it and made the updates less frequent. If the autocommit is allowed to happen together with 'manual' commit on update then there could be simultaneous commits, which now shouldn't happen - there will be at most one update/commit active at a time. Request timeout is default for jetty, but don't know what's that value.
Best regards
RG

I wrote:

On Thu, Mar 29, 2012 at 2:25 PM, Rafal Gwizdala rafal.gwizd...@gmail.com wrote:
Yonik, I didn't say there was an update request active at the moment the thread dump was made, only that previous update requests failed with a timeout. So maybe this is the missing piece. I didn't enable nio with Jetty, probably it's there by default.

Not with the jetty that comes with Solr.

bq. If solr hangs next time I'll try to make a thread dump when the update request is waiting for completion.

Great! We need to see where it's hanging! Also, how long did the request take to time out? Do you have auto-commit enabled?

In the 3x series, updates will block while commits are in progress, so timeouts can happen if they are set too short (and it seems like maybe you aren't using the Jetty from Solr, so the configuration may not be ideal).
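Since updates can block behind a long commit, a timeout that is too short will report a failure even though the update would eventually succeed. The sketch below shows only the client side of that: an HTTP POST with an explicit, generous timeout. The URL in the usage note and the 120-second default are assumptions for illustration, the `opener` parameter exists only so the function can be exercised without a server, and the Jetty-side timeout mentioned above is a separate setting that this code does not touch.

```python
import urllib.request


def post_update(url, xml_body, timeout_s=120.0, opener=urllib.request.urlopen):
    """POST an XML update message and return the HTTP status code.

    timeout_s is the client-side socket timeout in seconds; pick it
    longer than the slowest commit you expect to wait behind.
    """
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with opener(request, timeout=timeout_s) as response:
        return response.status
```

A call might look like `post_update("http://localhost:8983/solr/update", "<commit/>", timeout_s=300.0)`, where the host, port and path are placeholders for your own installation.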
Re: SOLR hangs - update timeout - please help
How often are updates? And when are commits? How many CPUs? How much query load? There are so many variables. Check the mailing list archives and Solr issues; there might be a similar problem already discussed.

Also, attachments do not work with Apache mailing lists. (Well, ok, they work for direct subscribers, but not for indirect subscribers and archive site users.)

On Wed, Mar 28, 2012 at 12:54 AM, Rafal Gwizdala rafal.gwizd...@gmail.com wrote:

Hello, I have SOLR 3.5 running on Windows 2008 64-bit, hosted in a Jetty server. Java:

java version 1.6.0_30
Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)

SOLR is being used as an external index for application data, so it's updated quite frequently (once every few seconds). Currently it holds about 5 million documents, with a few million document updates/inserts every month.

The problem is that after running for some time (about a day) SOLR hangs on update and stops processing further update requests. At the same time, searching works normally. I'm updating SOLR by HTTP requests, and when the server hangs every update request ends with a timeout. What is more, when I try to stop the SOLR service (when the problem above occurs) it hangs too, keeping one CPU core 100% busy, and it never exits - it has to be killed.

I'm attaching two thread dumps taken about 1 minute apart. Can you please take a look at them and suggest what can be done next to find the cause of the problem and fix it?

Best regards
RG

-- Lance Norskog goks...@gmail.com