Re: stuck thread problem?
Hieu: The stack trace you shared is from a search request with an mvel script in a function_score query. I don't think this is causing threads to get stuck; it just pops up here because, in general, queries with scripts take more CPU time. You can easily verify this: if you stop search requests temporarily, the shared stack trace shouldn't show up in the hot threads output anymore. If a node gets slower over time there may be a different issue, for example the JVM slowly running out of memory so that garbage collections take longer.

On 10 September 2014 05:26, Hieu Nguyen wrote:
> Thanks for the response, Martijn! We'll consider upgrading, but it'd be
> great to root cause the issue.
>
> On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
>>
>> Patrick: I have never seen this, but this means the openjdk on FreeBSD
>> doesn't support cpu sampling of threads. Not sure, but maybe you can try
>> with an Oracle jdk/jre? Otherwise jstack should be used in order to
>> figure out what threads of ES get stuck on.
>>
>> Hieu: This doesn't look like the issue that was fixed, since the threads
>> are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an
>> old version and I would recommend upgrading. Also, mvel is deprecated and
>> will only be available via a plugin:
>> https://github.com/elasticsearch/elasticsearch/pull/6571 and
>> https://github.com/elasticsearch/elasticsearch/pull/6610
>>
>> On 6 September 2014 16:12, Hieu Nguyen wrote:
>>
>>> We have seen a similar issue in our cluster (CPU usage and search time
>>> for the master node crept up over a period of one day, until we
>>> restarted). Is there an easy way to confirm that it's indeed the same
>>> issue mentioned here?
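For what it's worth, both checks can be done from the shell; a rough sketch (host, port and the 1.x-style stats URL are assumptions about your setup):

# Hot threads while search traffic is paused: if the mvel/function_score
# frames were only search load, they should disappear from this output.
curl -s 'localhost:9200/_nodes/hot_threads?threads=5'

# Heap usage and cumulative GC times per node; if old-gen collection time
# keeps growing while used heap stays near the limit, memory pressure is
# the likelier culprit than a stuck thread.
curl -s 'localhost:9200/_nodes/stats/jvm?pretty'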
Re: stuck thread problem?
Thanks for the response, Martijn! We'll consider upgrading, but it'd be great to root cause the issue.

On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
>
> Patrick: I have never seen this, but this means the openjdk on FreeBSD
> doesn't support cpu sampling of threads. Not sure, but maybe you can try
> with an Oracle jdk/jre? Otherwise jstack should be used in order to
> figure out what threads of ES get stuck on.
>
> Hieu: This doesn't look like the issue that was fixed, since the threads
> are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an
> old version and I would recommend upgrading. Also, mvel is deprecated and
> will only be available via a plugin:
> https://github.com/elasticsearch/elasticsearch/pull/6571 and
> https://github.com/elasticsearch/elasticsearch/pull/6610
>
> On 6 September 2014 16:12, Hieu Nguyen wrote:
>
>> We have seen a similar issue in our cluster (CPU usage and search time
>> for the master node crept up over a period of one day, until we
>> restarted). Is there an easy way to confirm that it's indeed the same
>> issue mentioned here?
Re: stuck thread problem?
On 9 sept. 2014, at 09:03, Martijn v Groningen wrote:
> Patrick: I have never seen this, but this means the openjdk on FreeBSD
> doesn't support cpu sampling of threads. Not sure, but maybe you can try
> with an Oracle jdk/jre? Otherwise jstack should be used in order to
> figure out what threads of ES get stuck on.

Martijn, thanks for coming back on this. Last weekend I performed a full upgrade of this server (FreeBSD upgrade from 9.2 to 9.3, and recompilation of every installed port). In the process, ES was upgraded, as was OpenJDK. I'll wait and see if the problem comes back or not, and keep you posted.

regards,
Patrick
Re: stuck thread problem?
Patrick: I have never seen this, but this means the openjdk on FreeBSD doesn't support cpu sampling of threads. Not sure, but maybe you can try with an Oracle jdk/jre? Otherwise jstack should be used in order to figure out what threads of ES get stuck on.

Hieu: This doesn't look like the issue that was fixed, since the threads are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an old version and I would recommend upgrading. Also, mvel is deprecated and will only be available via a plugin: https://github.com/elasticsearch/elasticsearch/pull/6571 and https://github.com/elasticsearch/elasticsearch/pull/6610

On 6 September 2014 16:12, Hieu Nguyen wrote:
> We have seen a similar issue in our cluster (CPU usage and search time
> for the master node crept up over a period of one day, until we
> restarted). Is there an easy way to confirm that it's indeed the same
> issue mentioned here?
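If it helps, a minimal jstack session against the ES process could look something like this (the pgrep pattern and file paths are just assumptions about a typical install):

# Find the Elasticsearch JVM's pid and dump all thread stacks to a file.
ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)
jstack "$ES_PID" > /tmp/es-threads-$(date +%s).txt

# If the process doesn't respond, -F forces a dump (run it as the same
# user and with the same JDK as the running process).
jstack -F "$ES_PID" > /tmp/es-threads-forced.txt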
Re: stuck thread problem?
We have seen a similar issue in our cluster (CPU usage and search time for the master node crept up over a period of one day, until we restarted). Is there an easy way to confirm that it's indeed the same issue mentioned here?

Below is the output of our hot threads on this node (version 0.90.12):

   85.8% (857.7ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#3]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.FilteredDocIdSet$1.get(FilteredDocIdSet.java:65)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:178)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.Transport
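For context, frames like ScriptScoreFunction.score and MvelSearchScript.run come from a function_score query with a script-based scoring function; a purely hypothetical example of such a request (index name, query and the trivial script are made up, not taken from this cluster) might look like:

curl -s 'localhost:9200/myindex/_search' -d '{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "script_score": {
        "script": "_score * 2",
        "lang": "mvel"
      }
    }
  }
}'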
Re: stuck thread problem?
Hi,

Epic fail:

[2014-09-05 22:14:00,043][DEBUG][action.admin.cluster.node.hotthreads] [Bloodsport] failed to execute on node [fLwUGA_eSfmLApJ7SyIncA]
org.elasticsearch.ElasticsearchException: failed to detect hot threads
    at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:103)
    at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:43)
    at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: MBean doesn't support thread CPU Time
    at org.elasticsearch.monitor.jvm.HotThreads.innerDetect(HotThreads.java:90)
    at org.elasticsearch.monitor.jvm.HotThreads.detect(HotThreads.java:75)
    at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:101)
    ... 5 more

On 29 août 2014, at 13:54, Martijn v Groningen wrote:
> Hi Patrick,
>
> If this problem happens again then you should execute the hot threads api:
> curl localhost:9200/_nodes/hot_threads
>
> Documentation:
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads
>
> Just pick a node in your cluster and run that command. This is the
> equivalent of running jstack on all the nodes in your cluster.
>
> Martijn
Re: stuck thread problem?
Hi Patrick,

If this problem happens again then you should execute the hot threads api:
curl localhost:9200/_nodes/hot_threads

Documentation:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads

Just pick a node in your cluster and run that command. This is the equivalent of running jstack on all the nodes in your cluster.

Martijn

On 29 August 2014 13:34, Patrick Proniewski wrote:
> Hi,
>
> I don't know how to debug a Java process. I hadn't heard about jstack
> until it was mentioned in this thread. All I know is what I've posted in
> my first message.
>
> I've restarted ES, and currently I have no stuck thread to investigate.
> In the meantime, you can teach me how I should use jstack, so next time
> it happens I'll be ready.

--
Met vriendelijke groet,

Martijn van Groningen
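A small addition that may be useful: if I remember the API correctly, hot_threads accepts a few optional parameters, so you can widen or narrow the sample a bit:

# More threads, a longer sampling interval, and explicit CPU sampling.
curl -s 'localhost:9200/_nodes/hot_threads?threads=10&interval=1s&type=cpu'

# Limit the call to a single node by name or id.
curl -s 'localhost:9200/_nodes/nodename/hot_threads'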
Re: stuck thread problem?
Hi,

I don't know how to debug a Java process. I hadn't heard about jstack until it was mentioned in this thread. All I know is what I've posted in my first message.

I've restarted ES, and currently I have no stuck thread to investigate. In the meantime, you can teach me how I should use jstack, so next time it happens I'll be ready.

On 29 août 2014, at 13:19, Martijn v Groningen wrote:
> Hi Patrick,
>
> Did you see the same stuck thread via jstack or the hot thread api that
> Martin reported? This can only happen if scan search was enabled (by
> setting search_type=scan in a search request).
> If that isn't the case then maybe something else is stuck.
>
> Martijn
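Not authoritative, but one simple workflow for a suspected stuck thread is to take two dumps some minutes apart and compare them; threads showing exactly the same stack in both samples are the suspects (pgrep pattern, sleep time and file names are illustrative):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)
jstack "$ES_PID" > /tmp/es-1.txt
sleep 600
jstack "$ES_PID" > /tmp/es-2.txt
# Threads whose stacks don't change between the two samples are candidates
# for being stuck rather than just busy.
diff /tmp/es-1.txt /tmp/es-2.txt | less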
Re: stuck thread problem?
Hi Patrick,

Did you see the same stuck thread via jstack or the hot thread api that Martin reported? This can only happen if scan search was enabled (by setting search_type=scan in a search request). If that isn't the case then maybe something else is stuck.

Martijn

On 29 August 2014 09:58, Patrick Proniewski wrote:
> Thank you!
>
> On 29 août 2014, at 08:49, Martin Forssen wrote:
>
>> FYI, this turned out to be a real bug. A fix has been committed and will
>> be included in the next release.
>>
>> On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
>>>
>>> I did report it https://github.com/elasticsearch/elasticsearch/issues/7478

--
Met vriendelijke groet,

Martijn van Groningen
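For anyone wondering what "scan search" refers to here, a scan/scroll request in 1.x looks roughly like this (index name, scroll timeout and batch size are made up for the example):

# Start a scan search; it returns a _scroll_id instead of hits.
curl -s 'localhost:9200/myindex/_search?search_type=scan&scroll=1m&size=100' -d '{
  "query": { "match_all": {} }
}'

# Fetch subsequent batches by posting the returned scroll id as the body.
curl -s 'localhost:9200/_search/scroll?scroll=1m' -d 'SCROLL_ID_FROM_PREVIOUS_RESPONSE'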
Re: stuck thread problem?
Thank you!

On 29 août 2014, at 08:49, Martin Forssen wrote:
> FYI, this turned out to be a real bug. A fix has been committed and will
> be included in the next release.
>
> On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
>>
>> I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
Re: stuck thread problem?
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.

On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
>
> I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
Re: stuck thread problem?
I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
Re: stuck thread problem?
It'd be worth raising this as an issue on Github if you are concerned, at least then the ES devs will see it :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 27 August 2014 18:34, Martin Forssen wrote:
> I see the same problem. We are running 1.1.1 on a 13-node cluster (3
> master and 5+5 data). I see stuck threads on most of the data nodes, and
> I had a look around on one of them.
Re: stuck thread problem?
I see the same problem. We are running 1.1.1 on a 13-node cluster (3 master and 5+5 data). I see stuck threads on most of the data nodes, and I had a look around on one of them. Top in thread mode shows:

top - 08:08:20 up 62 days, 18:49, 1 user, load average: 9.18, 13.21, 12.67
Threads: 528 total, 14 running, 514 sleeping, 0 stopped, 0 zombie
%Cpu(s): 39.0 us, 1.5 sy, 0.0 ni, 59.0 id, 0.2 wa, 0.2 hi, 0.0 si, 0.1 st
KiB Mem:  62227892 total, 61933428 used,   294464 free,    65808 buffers
KiB Swap: 61865980 total,    19384 used, 61846596 free. 24645668 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3743 elastic+  20   0  1.151t 0.045t 0.013t S  93.4 78.1  17462:00 java
 3748 elastic+  20   0  1.151t 0.045t 0.013t S  93.4 78.1  17457:55 java
 3761 elastic+  20   0  1.151t 0.045t 0.013t S  93.1 78.1  17455:21 java
 3744 elastic+  20   0  1.151t 0.045t 0.013t S  92.7 78.1  17456:55 java
 1758 elastic+  20   0  1.151t 0.045t 0.013t R   5.9 78.1   3450:01 java
 1755 elastic+  20   0  1.151t 0.045t 0.013t R   5.6 78.1   3450:05 java

So I have four threads consuming way more CPU than the others. The node is only doing a moderate amount of garbage collection. Running jstack I find that all the stuck threads have a stack dump which looks like this:

Thread 3744: (state = IN_JAVA)
 - java.util.HashMap.getEntry(java.lang.Object) @bci=72, line=446 (Compiled frame; information may be imprecise)
 - java.util.HashMap.get(java.lang.Object) @bci=11, line=405 (Compiled frame)
 - org.elasticsearch.search.scan.ScanContext$ScanFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext, org.apache.lucene.util.Bits) @bci=8, line=156 (Compiled frame)
 - org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext, org.apache.lucene.util.Bits) @bci=6, line=45 (Compiled frame)
 - org.apache.lucene.search.FilteredQuery$1.scorer(org.apache.lucene.index.AtomicReaderContext, boolean, boolean, org.apache.lucene.util.Bits) @bci=34, line=130 (Compiled frame)
 - org.apache.lucene.search.IndexSearcher.search(java.util.List, org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) @bci=68, line=618 (Compiled frame)
 - org.elasticsearch.search.internal.ContextIndexSearcher.search(java.util.List, org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) @bci=225, line=173 (Compiled frame)
 - org.apache.lucene.search.IndexSearcher.search(org.apache.lucene.search.Query, org.apache.lucene.search.Collector) @bci=11, line=309 (Interpreted frame)
 - org.elasticsearch.search.scan.ScanContext.execute(org.elasticsearch.search.internal.SearchContext) @bci=54, line=52 (Interpreted frame)
 - org.elasticsearch.search.query.QueryPhase.execute(org.elasticsearch.search.internal.SearchContext) @bci=174, line=119 (Compiled frame)
 - org.elasticsearch.search.SearchService.executeScan(org.elasticsearch.search.internal.InternalScrollSearchRequest) @bci=49, line=233 (Interpreted frame)
 - org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.search.internal.InternalScrollSearchRequest, org.elasticsearch.transport.TransportChannel) @bci=8, line=791 (Interpreted frame)
 - org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.transport.TransportRequest, org.elasticsearch.transport.TransportChannel) @bci=6, line=780 (Interpreted frame)
 - org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run() @bci=12, line=270 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)

The state varies between IN_JAVA and BLOCKED. I took two stack traces 10 minutes apart and they were identical for the suspect threads.

I assume this could be a very long running query, but I wonder if it isn't just stuck. Perhaps we are seeing this issue:
http://stackoverflow.com/questions/17070184/hashmap-stuck-on-get
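As a side note: the dump format above prints the decimal thread id directly, but a plain (non-forced) jstack shows the native id as nid=0x..., so a busy thread id taken from top -H can be looked up by converting it to hex. A rough sketch, using thread id 3744 from the top output above (pgrep pattern and file name are assumptions):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)   # the main java pid
printf 'nid=0x%x\n' 3744     # 3744 (decimal, from top -H) -> nid=0xea0 in the dump
jstack "$ES_PID" > /tmp/es.jstack
grep -B 2 -A 25 'nid=0xea0' /tmp/es.jstack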
Re: stuck thread problem?
On 25 août 2014, at 13:51, Mark Walkom wrote:
> (You should really set Xms and Xmx to be the same.)

Ok, I'll do this next time I restart.

> But it's not faulty, it's probably just GC, which should be visible in
> the logs. How much data do you have in your "cluster"?

NAME                  USED  AVAIL  REFER  MOUNTPOINT
zdata/elasticsearch  4.07G  1.01T  4.07G  /zdata/elasticsearch

It's only daily indices (logstash). Is GC supposed to last more than 3 days? And I can't find any reference to garbage collection in /var/log/elasticsearch/*.

> On 25 August 2014 21:19, Patrick Proniewski wrote:
>
>> Hello,
>>
>> I'm running an ELK install for a few months now, and a few weeks ago
>> I've noticed a strange behavior: ES had some kind of stuck thread
>> consuming 20-70% of a CPU core. It remained unnoticed for days. Then I
>> restarted ES, and it all came back to normal, until it started again 2
>> weeks later to consume CPU doing nothing. New restart, and 2 weeks later
>> same problem.
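If it's of any use: ES only logs collections it considers slow, so an empty grep doesn't necessarily mean no GC is happening. Something along these lines should show any that were flagged (the logger name is from memory, so treat it as an assumption):

# Slow/overhead GC events are logged by the JVM monitor when thresholds are exceeded.
grep -i 'monitor.jvm' /var/log/elasticsearch/*.log

# The gc lines themselves usually carry a [gc] tag and the pause duration.
grep -i '\[gc\]' /var/log/elasticsearch/*.log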
Re: stuck thread problem?
(You should really set Xms and Xmx to be the same.)

But it's not faulty, it's probably just GC, which should be visible in the logs. How much data do you have in your "cluster"?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 25 August 2014 21:19, Patrick Proniewski wrote:
> Hello,
>
> I'm running an ELK install for a few months now, and a few weeks ago
> I've noticed a strange behavior: ES had some kind of stuck thread
> consuming 20-70% of a CPU core. It remained unnoticed for days. Then I
> restarted ES, and it all came back to normal, until it started again 2
> weeks later to consume CPU doing nothing. New restart, and 2 weeks later
> same problem.
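For reference, on an install like the one quoted above that means editing the heap options in the java startup line; a hedged sketch (values are examples, not a sizing recommendation for this box):

# Equal minimum and maximum heap so the JVM never resizes it at runtime,
# plus basic GC logging so collections end up in a file of their own.
-Xms2g -Xmx2g \
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
-Xloggc:/var/log/elasticsearch/gc.log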
stuck thread problem?
Hello,

I'm running an ELK install for a few months now, and a few weeks ago I've noticed a strange behavior: ES had some kind of stuck thread consuming 20-70% of a CPU core. It remained unnoticed for days. Then I restarted ES, and it all came back to normal, until it started again 2 weeks later to consume CPU doing nothing. New restart, and 2 weeks later same problem.

I'm running: ES 1.1.0 on FreeBSD 9.2-RELEASE amd64.

/usr/local/openjdk7/bin/java -Des.pidfile=/var/run/elasticsearch.pid -Djava.net.preferIPv4Stack=true -server -Xms1g -Xmx2g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.config=/usr/local/etc/elasticsearch/elasticsearch.yml -cp /usr/local/lib/elasticsearch/elasticsearch-1.1.0.jar:/usr/local/lib/elasticsearch/*:/usr/local/lib/elasticsearch/sigar/* org.elasticsearch.bootstrap.Elasticsearch

# java -version
openjdk version "1.7.0_51"
OpenJDK Runtime Environment (build 1.7.0_51-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

Here is a sample top output showing the faulty thread:

last pid: 37009;  load averages: 0.79, 0.73, 0.71    up 53+18:11:28  11:48:59
932 processes: 9 running, 901 sleeping, 22 waiting
CPU: 10.8% user, 0.0% nice, 0.2% system, 0.0% interrupt, 89.0% idle
Mem: 4139M Active, 1129M Inact, 8962M Wired, 1596M Free
ARC: 5116M Total, 2135M MFU, 2227M MRU, 12M Anon, 133M Header, 609M Other
Swap: 4096M Total, 4096M Free

  PID USERNAME       PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
../..
74417 elasticsearch   33    0  2791M  2058M uwait   5 153:33 23.97% java{java}  <--
74417 elasticsearch   27    0  2791M  2058M uwait   7  43:02  7.96% java{java}
74417 elasticsearch   27    0  2791M  2058M uwait   6  43:02  7.96% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   1   7:32  2.20% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   5   8:26  1.76% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   7   8:25  1.76% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   5   8:25  1.76% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   5   8:26  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   7   8:26  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   4   8:25  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   6   8:25  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   1   8:25  1.66% java{java}

Nothing to be found in log files...

Any idea?

Patrick