Re: stuck thread problem?
Hieu: The stack trace you shared is from a search request with an mvel script in a function_score query. I don't think this is causing threads to get stuck; it just pops up here because, in general, queries with scripts take more CPU time. You can easily verify this: if you stop search requests temporarily, the shared stack trace shouldn't show up in the hot threads output anymore. If a node gets slower over time there may be a different issue, for example the JVM slowly running out of memory so that garbage collections take longer.

On 10 September 2014 05:26, Hieu Nguyen wrote:
> Thanks for the response, Martijn! We'll consider upgrading, but it'd be
> great to root cause the issue.
>
> On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
>>
>> Patrick: I have never seen this, but this means the openjdk on FreeBSD
>> doesn't support cpu sampling of threads. Not sure, but maybe you can try
>> with an Oracle jdk/jre? Otherwise jstack should be used in order to
>> figure out what threads of ES get stuck on.
>>
>> Hieu: This doesn't look like the issue that was fixed, since the threads
>> are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an
>> old version and I would recommend upgrading. Also, mvel is deprecated and
>> will only be available via a plugin:
>> https://github.com/elasticsearch/elasticsearch/pull/6571 and
>> https://github.com/elasticsearch/elasticsearch/pull/6610
>>
>> On 6 September 2014 16:12, Hieu Nguyen wrote:
>>
>>> We have seen a similar issue in our cluster (CPU usage and search time
>>> for the master node crept up over a period of one day, until we
>>> restarted). Is there an easy way to confirm that it's indeed the same
>>> issue mentioned here?
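For what it's worth, both checks can be done from the shell; a rough sketch (host, port and the 1.x-style stats URL are assumptions about your setup):

# Hot threads while search traffic is paused: if the mvel/function_score
# frames were only search load, they should disappear from this output.
curl -s 'localhost:9200/_nodes/hot_threads?threads=5'

# Heap usage and cumulative GC times per node; if old-gen collection time
# keeps growing while used heap stays near the limit, memory pressure is
# the likelier culprit than a stuck thread.
curl -s 'localhost:9200/_nodes/stats/jvm?pretty'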
Re: stuck thread problem?
Thanks for the response, Martijn! We'll consider upgrading, but it'd be great to root cause the issue.

On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
>
> Patrick: I have never seen this, but this means the openjdk on FreeBSD
> doesn't support cpu sampling of threads. Not sure, but maybe you can try
> with an Oracle jdk/jre? Otherwise jstack should be used in order to
> figure out what threads of ES get stuck on.
>
> Hieu: This doesn't look like the issue that was fixed, since the threads
> are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an
> old version and I would recommend upgrading. Also, mvel is deprecated and
> will only be available via a plugin:
> https://github.com/elasticsearch/elasticsearch/pull/6571 and
> https://github.com/elasticsearch/elasticsearch/pull/6610
>
> On 6 September 2014 16:12, Hieu Nguyen wrote:
>
>> We have seen a similar issue in our cluster (CPU usage and search time
>> for the master node crept up over a period of one day, until we
>> restarted). Is there an easy way to confirm that it's indeed the same
>> issue mentioned here?
Re: stuck thread problem?
On 9 sept. 2014, at 09:03, Martijn v Groningen wrote:
> Patrick: I have never seen this, but this means the openjdk on FreeBSD
> doesn't support cpu sampling of threads. Not sure, but maybe you can try
> with an Oracle jdk/jre? Otherwise jstack should be used in order to
> figure out what threads of ES get stuck on.

Martijn, thanks for coming back on this. Last weekend I performed a full upgrade of this server (FreeBSD upgrade from 9.2 to 9.3, and recompilation of every installed port). In the process, ES was upgraded, as was OpenJDK. I'll wait and see if the problem comes back or not, and keep you posted.

regards,
Patrick
Re: stuck thread problem?
Patrick: I have never seen this, but this means the openjdk on FreeBSD doesn't support cpu sampling of threads. Not sure, but maybe you can try with an Oracle jdk/jre? Otherwise jstack should be used in order to figure out what threads of ES get stuck on.

Hieu: This doesn't look like the issue that was fixed, since the threads are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an old version and I would recommend upgrading. Also, mvel is deprecated and will only be available via a plugin: https://github.com/elasticsearch/elasticsearch/pull/6571 and https://github.com/elasticsearch/elasticsearch/pull/6610

On 6 September 2014 16:12, Hieu Nguyen wrote:
> We have seen a similar issue in our cluster (CPU usage and search time
> for the master node crept up over a period of one day, until we
> restarted). Is there an easy way to confirm that it's indeed the same
> issue mentioned here?
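If it helps, a minimal jstack session against the ES process could look something like this (the pgrep pattern and file paths are just assumptions about a typical install):

# Find the Elasticsearch JVM's pid and dump all thread stacks to a file.
ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)
jstack "$ES_PID" > /tmp/es-threads-$(date +%s).txt

# If the process doesn't respond, -F forces a dump (run it as the same
# user and with the same JDK as the running process).
jstack -F "$ES_PID" > /tmp/es-threads-forced.txt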
Re: stuck thread problem?
We have seen a similar issue in our cluster (CPU usage and search time for the master node crept up over a period of one day, until we restarted). Is there an easy way to confirm that it's indeed the same issue mentioned here?

Below is the output of our hot threads on this node (version 0.90.12):

   85.8% (857.7ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#3]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.FilteredDocIdSet$1.get(FilteredDocIdSet.java:65)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:178)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.Transport
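For context, frames like ScriptScoreFunction.score and MvelSearchScript.run come from a function_score query with a script-based scoring function; a purely hypothetical example of such a request (index name, query and the trivial script are made up, not taken from this cluster) might look like:

curl -s 'localhost:9200/myindex/_search' -d '{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "script_score": {
        "script": "_score * 2",
        "lang": "mvel"
      }
    }
  }
}'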
Re: stuck thread problem?
Hi,

Epic fail:

[2014-09-05 22:14:00,043][DEBUG][action.admin.cluster.node.hotthreads] [Bloodsport] failed to execute on node [fLwUGA_eSfmLApJ7SyIncA]
org.elasticsearch.ElasticsearchException: failed to detect hot threads
    at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:103)
    at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:43)
    at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: MBean doesn't support thread CPU Time
    at org.elasticsearch.monitor.jvm.HotThreads.innerDetect(HotThreads.java:90)
    at org.elasticsearch.monitor.jvm.HotThreads.detect(HotThreads.java:75)
    at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:101)
    ... 5 more

On 29 août 2014, at 13:54, Martijn v Groningen wrote:
> Hi Patrick,
>
> If this problem happens again then you should execute the hot threads api:
> curl localhost:9200/_nodes/hot_threads
>
> Documentation:
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads
>
> Just pick a node in your cluster and run that command. This is the
> equivalent of running jstack on all the nodes in your cluster.
>
> Martijn
Re: stuck thread problem?
Hi Patrick,

If this problem happens again then you should execute the hot threads api:
curl localhost:9200/_nodes/hot_threads

Documentation:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads

Just pick a node in your cluster and run that command. This is the equivalent of running jstack on all the nodes in your cluster.

Martijn

On 29 August 2014 13:34, Patrick Proniewski wrote:
> Hi,
>
> I don't know how to debug a Java process. I hadn't heard about jstack
> until it was mentioned in this thread. All I know is what I've posted in
> my first message.
>
> I've restarted ES, and currently I have no stuck thread to investigate.
> In the meantime, you can teach me how I should use jstack, so next time
> it happens I'll be ready.

--
Met vriendelijke groet,

Martijn van Groningen
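A small addition that may be useful: if I remember the API correctly, hot_threads accepts a few optional parameters, so you can widen or narrow the sample a bit:

# More threads, a longer sampling interval, and explicit CPU sampling.
curl -s 'localhost:9200/_nodes/hot_threads?threads=10&interval=1s&type=cpu'

# Limit the call to a single node by name or id.
curl -s 'localhost:9200/_nodes/nodename/hot_threads'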
Re: stuck thread problem?
Hi,

I don't know how to debug a Java process. I hadn't heard about jstack until it was mentioned in this thread. All I know is what I've posted in my first message.

I've restarted ES, and currently I have no stuck thread to investigate. In the meantime, you can teach me how I should use jstack, so next time it happens I'll be ready.

On 29 août 2014, at 13:19, Martijn v Groningen wrote:
> Hi Patrick,
>
> Did you see the same stuck thread via jstack or the hot thread api that
> Martin reported? This can only happen if scan search was enabled (by
> setting search_type=scan in a search request).
> If that isn't the case then maybe something else is stuck.
>
> Martijn
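Not authoritative, but one simple workflow for a suspected stuck thread is to take two dumps some minutes apart and compare them; threads showing exactly the same stack in both samples are the suspects (pgrep pattern, sleep time and file names are illustrative):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)
jstack "$ES_PID" > /tmp/es-1.txt
sleep 600
jstack "$ES_PID" > /tmp/es-2.txt
# Threads whose stacks don't change between the two samples are candidates
# for being stuck rather than just busy.
diff /tmp/es-1.txt /tmp/es-2.txt | less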
Re: stuck thread problem?
Hi Patrick,

Did you see the same stuck thread via jstack or the hot thread api that Martin reported? This can only happen if scan search was enabled (by setting search_type=scan in a search request). If that isn't the case then maybe something else is stuck.

Martijn

On 29 August 2014 09:58, Patrick Proniewski wrote:
> Thank you!
>
> On 29 août 2014, at 08:49, Martin Forssen wrote:
>
>> FYI, this turned out to be a real bug. A fix has been committed and will
>> be included in the next release.
>>
>> On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
>>>
>>> I did report it https://github.com/elasticsearch/elasticsearch/issues/7478

--
Met vriendelijke groet,

Martijn van Groningen
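For anyone wondering what "scan search" refers to here, a scan/scroll request in 1.x looks roughly like this (index name, scroll timeout and batch size are made up for the example):

# Start a scan search; it returns a _scroll_id instead of hits.
curl -s 'localhost:9200/myindex/_search?search_type=scan&scroll=1m&size=100' -d '{
  "query": { "match_all": {} }
}'

# Fetch subsequent batches by posting the returned scroll id as the body.
curl -s 'localhost:9200/_search/scroll?scroll=1m' -d 'SCROLL_ID_FROM_PREVIOUS_RESPONSE'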
Re: stuck thread problem?
Thank you!

On 29 août 2014, at 08:49, Martin Forssen wrote:
> FYI, this turned out to be a real bug. A fix has been committed and will
> be included in the next release.
>
> On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
>>
>> I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
Re: stuck thread problem?
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.

On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
>
> I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
Re: stuck thread problem?
I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
Re: stuck thread problem?
It'd be worth raising this as an issue on Github if you are concerned, at least then the ES devs will see it :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 27 August 2014 18:34, Martin Forssen wrote:
> I see the same problem. We are running 1.1.1 on a 13-node cluster (3
> master and 5+5 data). I see stuck threads on most of the data nodes, and
> I had a look around on one of them.
Re: stuck thread problem?
I see the same problem. We are running 1.1.1 on a 13-node cluster (3 master and 5+5 data). I see stuck threads on most of the data nodes, and I had a look around on one of them. Top in thread mode shows:

top - 08:08:20 up 62 days, 18:49, 1 user, load average: 9.18, 13.21, 12.67
Threads: 528 total, 14 running, 514 sleeping, 0 stopped, 0 zombie
%Cpu(s): 39.0 us, 1.5 sy, 0.0 ni, 59.0 id, 0.2 wa, 0.2 hi, 0.0 si, 0.1 st
KiB Mem:  62227892 total, 61933428 used,   294464 free,    65808 buffers
KiB Swap: 61865980 total,    19384 used, 61846596 free. 24645668 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3743 elastic+  20   0  1.151t 0.045t 0.013t S  93.4 78.1  17462:00 java
 3748 elastic+  20   0  1.151t 0.045t 0.013t S  93.4 78.1  17457:55 java
 3761 elastic+  20   0  1.151t 0.045t 0.013t S  93.1 78.1  17455:21 java
 3744 elastic+  20   0  1.151t 0.045t 0.013t S  92.7 78.1  17456:55 java
 1758 elastic+  20   0  1.151t 0.045t 0.013t R   5.9 78.1   3450:01 java
 1755 elastic+  20   0  1.151t 0.045t 0.013t R   5.6 78.1   3450:05 java

So I have four threads consuming way more CPU than the others. The node is only doing a moderate amount of garbage collection. Running jstack I find that all the stuck threads have a stack dump which looks like this:

Thread 3744: (state = IN_JAVA)
 - java.util.HashMap.getEntry(java.lang.Object) @bci=72, line=446 (Compiled frame; information may be imprecise)
 - java.util.HashMap.get(java.lang.Object) @bci=11, line=405 (Compiled frame)
 - org.elasticsearch.search.scan.ScanContext$ScanFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext, org.apache.lucene.util.Bits) @bci=8, line=156 (Compiled frame)
 - org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext, org.apache.lucene.util.Bits) @bci=6, line=45 (Compiled frame)
 - org.apache.lucene.search.FilteredQuery$1.scorer(org.apache.lucene.index.AtomicReaderContext, boolean, boolean, org.apache.lucene.util.Bits) @bci=34, line=130 (Compiled frame)
 - org.apache.lucene.search.IndexSearcher.search(java.util.List, org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) @bci=68, line=618 (Compiled frame)
 - org.elasticsearch.search.internal.ContextIndexSearcher.search(java.util.List, org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) @bci=225, line=173 (Compiled frame)
 - org.apache.lucene.search.IndexSearcher.search(org.apache.lucene.search.Query, org.apache.lucene.search.Collector) @bci=11, line=309 (Interpreted frame)
 - org.elasticsearch.search.scan.ScanContext.execute(org.elasticsearch.search.internal.SearchContext) @bci=54, line=52 (Interpreted frame)
 - org.elasticsearch.search.query.QueryPhase.execute(org.elasticsearch.search.internal.SearchContext) @bci=174, line=119 (Compiled frame)
 - org.elasticsearch.search.SearchService.executeScan(org.elasticsearch.search.internal.InternalScrollSearchRequest) @bci=49, line=233 (Interpreted frame)
 - org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.search.internal.InternalScrollSearchRequest, org.elasticsearch.transport.TransportChannel) @bci=8, line=791 (Interpreted frame)
 - org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.transport.TransportRequest, org.elasticsearch.transport.TransportChannel) @bci=6, line=780 (Interpreted frame)
 - org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run() @bci=12, line=270 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)

The state varies between IN_JAVA and BLOCKED. I took two stack traces 10 minutes apart and they were identical for the suspect threads.

I assume this could be a very long running query, but I wonder if it isn't just stuck. Perhaps we are seeing this issue:
http://stackoverflow.com/questions/17070184/hashmap-stuck-on-get
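As a side note: the dump format above prints the decimal thread id directly, but a plain (non-forced) jstack shows the native id as nid=0x..., so a busy thread id taken from top -H can be looked up by converting it to hex. A rough sketch, using thread id 3744 from the top output above (pgrep pattern and file name are assumptions):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)   # the main java pid
printf 'nid=0x%x\n' 3744     # 3744 (decimal, from top -H) -> nid=0xea0 in the dump
jstack "$ES_PID" > /tmp/es.jstack
grep -B 2 -A 25 'nid=0xea0' /tmp/es.jstack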
Re: stuck thread problem?
On 25 août 2014, at 13:51, Mark Walkom wrote:
> (You should really set Xms and Xmx to be the same.)

Ok, I'll do this next time I restart.

> But it's not faulty, it's probably just GC, which should be visible in
> the logs. How much data do you have in your "cluster"?

NAME                  USED  AVAIL  REFER  MOUNTPOINT
zdata/elasticsearch  4.07G  1.01T  4.07G  /zdata/elasticsearch

It's only daily indices (logstash). Is GC supposed to last more than 3 days? And I can't find any reference to garbage collection in /var/log/elasticsearch/*.

> On 25 August 2014 21:19, Patrick Proniewski wrote:
>
>> Hello,
>>
>> I'm running an ELK install for a few months now, and a few weeks ago
>> I've noticed a strange behavior: ES had some kind of stuck thread
>> consuming 20-70% of a CPU core. It remained unnoticed for days. Then I
>> restarted ES, and it all came back to normal, until it started again 2
>> weeks later to consume CPU doing nothing. New restart, and 2 weeks later
>> same problem.
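If it's of any use: ES only logs collections it considers slow, so an empty grep doesn't necessarily mean no GC is happening. Something along these lines should show any that were flagged (the logger name is from memory, so treat it as an assumption):

# Slow/overhead GC events are logged by the JVM monitor when thresholds are exceeded.
grep -i 'monitor.jvm' /var/log/elasticsearch/*.log

# The gc lines themselves usually carry a [gc] tag and the pause duration.
grep -i '\[gc\]' /var/log/elasticsearch/*.log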
Re: stuck thread problem?
(You should really set Xms and Xmx to be the same.)

But it's not faulty, it's probably just GC, which should be visible in the logs. How much data do you have in your "cluster"?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 25 August 2014 21:19, Patrick Proniewski wrote:
> Hello,
>
> I'm running an ELK install for a few months now, and a few weeks ago
> I've noticed a strange behavior: ES had some kind of stuck thread
> consuming 20-70% of a CPU core. It remained unnoticed for days. Then I
> restarted ES, and it all came back to normal, until it started again 2
> weeks later to consume CPU doing nothing. New restart, and 2 weeks later
> same problem.
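For reference, on an install like the one quoted above that means editing the heap options in the java startup line; a hedged sketch (values are examples, not a sizing recommendation for this box):

# Equal minimum and maximum heap so the JVM never resizes it at runtime,
# plus basic GC logging so collections end up in a file of their own.
-Xms2g -Xmx2g \
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
-Xloggc:/var/log/elasticsearch/gc.log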
stuck thread problem?
Hello,

I'm running an ELK install for a few months now, and a few weeks ago I've noticed a strange behavior: ES had some kind of stuck thread consuming 20-70% of a CPU core. It remained unnoticed for days. Then I restarted ES, and it all came back to normal, until it started again 2 weeks later to consume CPU doing nothing. New restart, and 2 weeks later same problem.

I'm running: ES 1.1.0 on FreeBSD 9.2-RELEASE amd64.

/usr/local/openjdk7/bin/java -Des.pidfile=/var/run/elasticsearch.pid -Djava.net.preferIPv4Stack=true -server -Xms1g -Xmx2g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.config=/usr/local/etc/elasticsearch/elasticsearch.yml -cp /usr/local/lib/elasticsearch/elasticsearch-1.1.0.jar:/usr/local/lib/elasticsearch/*:/usr/local/lib/elasticsearch/sigar/* org.elasticsearch.bootstrap.Elasticsearch

# java -version
openjdk version "1.7.0_51"
OpenJDK Runtime Environment (build 1.7.0_51-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

Here is a sample top output showing the faulty thread:

last pid: 37009;  load averages: 0.79, 0.73, 0.71    up 53+18:11:28  11:48:59
932 processes: 9 running, 901 sleeping, 22 waiting
CPU: 10.8% user, 0.0% nice, 0.2% system, 0.0% interrupt, 89.0% idle
Mem: 4139M Active, 1129M Inact, 8962M Wired, 1596M Free
ARC: 5116M Total, 2135M MFU, 2227M MRU, 12M Anon, 133M Header, 609M Other
Swap: 4096M Total, 4096M Free

  PID USERNAME       PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
../..
74417 elasticsearch   33    0  2791M  2058M uwait   5 153:33 23.97% java{java}  <--
74417 elasticsearch   27    0  2791M  2058M uwait   7  43:02  7.96% java{java}
74417 elasticsearch   27    0  2791M  2058M uwait   6  43:02  7.96% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   1   7:32  2.20% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   5   8:26  1.76% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   7   8:25  1.76% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   5   8:25  1.76% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   5   8:26  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   7   8:26  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   4   8:25  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   6   8:25  1.66% java{java}
74417 elasticsearch   22    0  2791M  2058M uwait   1   8:25  1.66% java{java}

Nothing to be found in log files...

Any idea?

Patrick