Hi,

The only thing I see is ES_HEAP_SIZE: with the usual 50%-of-RAM guideline and your 16GB nodes, it can be raised from 6g to 8g. I'm also seeing some swap usage; you could disable swap completely.
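For example (just a sketch, assuming the RPM layout shown in your mail below; 8g is 50% of 16GB):

    # /etc/sysconfig/elasticsearch
    ES_HEAP_SIZE=8g

And to take swap out of the picture (run as root; bootstrap.mlockall is the 1.x setting that asks the JVM to lock the heap in RAM, and it needs a node restart plus a sufficient memlock ulimit):

    swapoff -a    # turns swap off now; also comment out the swap lines in /etc/fstab so it stays off after reboot

    # /etc/elasticsearch/elasticsearch.yml
    bootstrap.mlockall: true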
hth,
A.

On Monday, January 5, 2015 12:48:53 PM UTC+1, Darren Mansell wrote:
>
> Hi all,
>
> We have a 4 node VM dev cluster with 3 data nodes and 1 master node. The 3
> data nodes are taking lots of CPU and the load average is high when the
> servers are sitting idle with no accesses or indexing.
>
> This seemed to be fine before we updated to 1.4.2 before the new year, but
> I can't confirm this is definitely the reason.
>
> The hot threads from each node seem to indicate something to do with the
> filesystem most of the time, e.g.
>
> 87.7% (438.6ms out of 500ms) cpu usage by thread 'elasticsearch[potassium][management][T#4]'
>   2/10 snapshots sharing following 21 elements
>     org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:223)
>     org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:242)
>     org.apache.lucene.store.FileSwitchDirectory.listAll(FileSwitchDirectory.java:87)
>     org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:48)
>     org.elasticsearch.index.store.DistributorDirectory.listAll(DistributorDirectory.java:88)
>     org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:48)
>     org.elasticsearch.common.lucene.Directories.estimateSize(Directories.java:40)
>     org.elasticsearch.index.store.Store.stats(Store.java:216)
>     org.elasticsearch.index.shard.service.InternalIndexShard.storeStats(InternalIndexShard.java:540)
>     org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:134)
>     org.elasticsearch.action.admin.indices.stats.ShardStats.<init>(ShardStats.java:49)
>     org.elasticsearch.indices.InternalIndicesService.stats(InternalIndicesService.java:212)
>     org.elasticsearch.node.service.NodeService.stats(NodeService.java:156)
>     org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:96)
>     org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:44)
>     org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:278)
>     org.elasticsearch.action.support.nodes.TransportNodesOperationAction$NodeTransportHandler.messageReceived(TransportNodesOperationAction.java:269)
>     org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     java.lang.Thread.run(Thread.java:745)
>
> Please also see the image attachment for the high CPU and load.
>
> The VMs are on ESXi 5.5 on 2 * 4 core HT Xeon E5530, so the server hardware
> is pretty heavy.
> The nodes are set up with 16GB RAM, 2 vCPUs, all ES defaults apart from:
>
> [root@potassium ~]# grep -v ^# /etc/elasticsearch/elasticsearch.yml | grep -v ^$
> cluster.name: es-dev
> node.name: "potassium"
> node.master: false
> node.data: true
> path.data: /data/elasticsearch
> marvel.agent.exporter.es.hosts: ["hydrogen:9200"]
>
> and
>
> [root@potassium ~]# grep -v ^# /etc/sysconfig/elasticsearch | grep -v ^$
> ES_HOME=/usr/share/elasticsearch
> ES_HEAP_SIZE=6g
> MAX_OPEN_FILES=65535
> MAX_MAP_COUNT=262144
> LOG_DIR=/var/log/elasticsearch
> DATA_DIR=/var/lib/elasticsearch
> WORK_DIR=/tmp/elasticsearch
> CONF_DIR=/etc/elasticsearch
> CONF_FILE=/etc/elasticsearch/elasticsearch.yml
> ES_USER=elasticsearch
>
> So far I've tried:
>
> - Dropping all data and loading again using logstash
> - Deleting the XFS filesystem and changing to ext4
> - Removing all plugins
> - Leaving it for about 2 weeks in case it was doing background optimisation
> - and various other things
>
> Does anyone have any suggestions about where I should look next, or any
> thoughts about what could be happening? Please let me know if I can pull any
> other info off the nodes or cluster to help diagnose.
>
> Many thanks,
> Darren