LocalManualCache is a component of Guava's LRU cache 
<https://code.google.com/p/guava-libraries/source/browse/guava-gwt/src-super/com/google/common/cache/super/com/google/common/cache/CacheBuilder.java>,
which Elasticsearch uses for both the filter and field data caches. 
Based on your node stats, I'd agree it is field data usage that is 
causing your OOMs.  The circuit breaker helps prevent OOM, but it works on 
a per-request basis.  Individual requests can each pass the breaker because 
they touch small subsets of fields, yet over time the set of fields loaded 
into field data keeps growing until you OOM anyway.
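
To see which fields are actually taking up that space, the per-field 
fielddata stats in the node stats API are handy. A sketch for 1.x 
(fields=* lists every field; narrow it down if the output is too noisy):

$ curl "http://localhost:9200/_nodes/stats/indices/fielddata?fields=*&human&pretty"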

I would prefer to set a field data limit rather than an expiration.  A 
hard limit prevents OOM because the cache simply isn't allowed to grow any 
further.  An expiration does not guarantee that: a burst of activity can 
still fill the heap and OOM before the expiration has a chance to kick in.
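
For example, a minimal sketch for elasticsearch.yml (setting names as in 
the 1.x docs; the percentages are illustrative, not recommendations):

# Evict from the field data cache before it can grow past this cap
indices.fielddata.cache.size: 30%
# Per-request field data breaker (renamed to indices.breaker.fielddata.limit in 1.4)
indices.fielddata.breaker.limit: 60%

The cache size should sit below the breaker limit, so entries get evicted 
under memory pressure rather than requests tripping the breaker.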

-Z

On Wednesday, February 11, 2015 at 12:50:45 PM UTC-5, Wilfred Hughes wrote:
>
> After examining some other nodes that were using a lot of their heap, I 
> think this is actually the field data cache:
>
>
> $ curl "http://localhost:9200/_cluster/stats?human&pretty";
> ...
>     "fielddata": {
>       "memory_size": "21.3gb",
>       "memory_size_in_bytes": 22888612852,
>       "evictions": 0
>     },
>     "filter_cache": {
>       "memory_size": "6.1gb",
>       "memory_size_in_bytes": 6650700423,
>       "evictions": 12214551
>     },
>
> Since this is storing logstash data, I'm going to add the following lines 
> to my elasticsearch.yml and see if I observe a difference once deployed to 
> production.
>
> # Don't hold field data caches for more than a day, since data is
> # grouped by day and we quickly lose interest in historical data.
> indices.fielddata.cache.expire: "1d"
>
>
> On Wednesday, 11 February 2015 16:29:22 UTC, Wilfred Hughes wrote:
>>
>> Hi all
>>
>> I have an ES 1.2.4 cluster which is occasionally running out of heap. I 
>> have ES_HEAP_SIZE=31G and according to the heap dump generated, my biggest 
>> memory users were:
>>
>> org.elasticsearch.common.cache.LocalCache$LocalManualCache 55%
>> org.elasticsearch.indices.cache.filter.IndicesFilterCache 11%
>>
>> and nothing else used more than 1%.
>>
>> It's not clear to me what this cache is. I can't find any references to 
>> ManualCache in the elasticsearch source code, and the docs 
>> (http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/index-modules-fielddata.html) 
>> suggest to me that the circuit breakers should stop requests or reduce 
>> cache usage rather than OOMing.
>>
>> At the moment the heap filled up, the node was actually trying to 
>> index some data:
>>
>> [2015-02-11 08:14:29,775][WARN ][index.translog           ] [data-node-2] 
>> [logstash-2015.02.11][0] failed to flush shard on translog threshold
>> org.elasticsearch.index.engine.FlushFailedEngineException: 
>> [logstash-2015.02.11][0] Flush failed
>>         at 
>> org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:805)
>>         at 
>> org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:604)
>>         at 
>> org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:202)
>>         at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>         at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.IllegalStateException: this writer hit an 
>> OutOfMemoryError; cannot commit
>>         at 
>> org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4416)
>>         at 
>> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2989)
>>         at 
>> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3096)
>>         at 
>> org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3063)
>>         at 
>> org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:797)
>>         ... 5 more
>> [2015-02-11 08:14:29,812][DEBUG][action.bulk              ] [data-node-2] 
>> [logstash-2015.02.11][0] failed to execute bulk item (index) index 
>> {[logstash-2015.02.11][syslog_slurm][1
>> org.elasticsearch.index.engine.CreateFailedEngineException: 
>> [logstash-2015.02.11][0] Create failed for 
>> [syslog_slurm#12UUWk5mR_2A1FGP5W3_1g]
>>         at 
>> org.elasticsearch.index.engine.internal.InternalEngine.create(InternalEngine.java:393)
>>         at 
>> org.elasticsearch.index.shard.service.InternalIndexShard.create(InternalIndexShard.java:384)
>>         at 
>> org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:430)
>>         at 
>> org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:158)
>>         at 
>> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction
>>         at 
>> org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:433)
>>         at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>         at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>         at 
>> org.apache.lucene.util.fst.BytesStore.writeByte(BytesStore.java:83)
>>         at org.apache.lucene.util.fst.FST.<init>(FST.java:286)
>>         at org.apache.lucene.util.fst.Builder.<init>(Builder.java:163)
>>         at 
>> org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock.compileIndex(BlockTreeTermsWriter.java:422)
>>         at 
>> org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:572)
>>         at 
>> org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter$FindBlocks.freeze(BlockTreeTermsWriter.java:547)
>>         at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:214)
>>         at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
>>         at 
>> org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finishTerm(BlockTreeTermsWriter.java:1039)
>>         at 
>> org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:548)
>>         at 
>> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
>>         at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
>>         at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
>>         at 
>> org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
>>         at 
>> org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:465)
>>         at 
>> org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:518)
>>         at 
>> org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:368)
>>         at 
>> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:450)
>>         at 
>> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1537)
>>         at 
>> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1207)
>>         at 
>> org.elasticsearch.index.engine.internal.InternalEngine.innerCreate(InternalEngine.java:459)
>>         at 
>> org.elasticsearch.index.engine.internal.InternalEngine.create(InternalEngine.java:386)
>>         ... 8 more
>>
>> Could anyone clarify what this cache is, or point me towards some docs?
>>
>
