Hi, I am doing all my tests on a 38GB copy of a production index, with ES 1.4.2. I have tried several memory settings and virtual machine sizes, but ES fails to start on a Linux system with 48GB of memory and 32GB allocated to the ES heap.
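For context, this is roughly how the heap is set here (a sketch: the exact startup path is illustrative, and on packaged installs the variable usually lives in /etc/default/elasticsearch or /etc/sysconfig/elasticsearch instead):

```shell
# Sketch: ES 1.x reads the heap size from the ES_HEAP_SIZE
# environment variable before starting the JVM.
export ES_HEAP_SIZE=32g
bin/elasticsearch
```

Note that heap sizes of 32g and above disable compressed object pointers in HotSpot, so values around 30-31g are often recommended instead; I am not sure whether that is a factor here.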
Searching for similar issues, I encountered https://github.com/elasticsearch/elasticsearch/issues/8394 which is still open and looks fairly similar to my problem. The debug output at startup looks like this:

[2015-01-14 12:00:48,710][DEBUG][indices.cluster ] [Saint Elmo] [mailspool][1] creating shard
[2015-01-14 12:00:48,710][DEBUG][index.service ] [Saint Elmo] [mailspool] creating shard_id [1]
[2015-01-14 12:00:48,791][DEBUG][index.deletionpolicy ] [Saint Elmo] [mailspool][1] Using [keep_only_last] deletion policy
[2015-01-14 12:00:48,793][DEBUG][index.merge.policy ] [Saint Elmo] [mailspool][1] using [tiered] merge mergePolicy with expunge_deletes_allowed[10.0], floor_segment[2mb], max_merge_at_once[10], max_merge_at_once_explicit[30], max_merged_segment[5gb], segments_per_tier[10.0], reclaim_deletes_weight[2.0]
[2015-01-14 12:00:48,794][DEBUG][index.merge.scheduler ] [Saint Elmo] [mailspool][1] using [concurrent] merge scheduler with max_thread_count[2], max_merge_count[4]
[2015-01-14 12:00:48,797][DEBUG][index.shard.service ] [Saint Elmo] [mailspool][1] state: [CREATED]
[2015-01-14 12:00:48,797][DEBUG][index.translog ] [Saint Elmo] [mailspool][1] interval [5s], flush_threshold_ops [2147483647], flush_threshold_size [200mb], flush_threshold_period [30m]
[2015-01-14 12:00:48,801][DEBUG][index.shard.service ] [Saint Elmo] [mailspool][1] state: [CREATED]->[RECOVERING], reason [from gateway]
[2015-01-14 12:00:48,801][DEBUG][index.gateway ] [Saint Elmo] [mailspool][1] starting recovery from local ...
[2015-01-14 12:00:48,805][DEBUG][river.cluster ] [Saint Elmo] processing [reroute_rivers_node_changed]: execute
[2015-01-14 12:00:48,805][DEBUG][river.cluster ] [Saint Elmo] processing [reroute_rivers_node_changed]: no change in cluster_state
[2015-01-14 12:00:48,814][INFO ][gateway ] [Saint Elmo] recovered [1] indices into cluster_state
[2015-01-14 12:00:48,814][DEBUG][cluster.service ] [Saint Elmo] processing [local-gateway-elected-state]: done applying updated cluster_state (version: 2)
[2015-01-14 12:00:48,840][DEBUG][index.engine.internal ] [Saint Elmo] [mailspool][1] starting engine
[2015-01-14 12:00:58,406][DEBUG][cluster.service ] [Saint Elmo] processing [routing-table-updater]: execute
[2015-01-14 12:00:58,407][DEBUG][gateway.local ] [Saint Elmo] [mailspool][4]: throttling allocation [[mailspool][4], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300]]]] on primary allocation
[2015-01-14 12:00:58,407][DEBUG][gateway.local ] [Saint Elmo] [mailspool][2]: throttling allocation [[mailspool][2], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300]]]] on primary allocation
[2015-01-14 12:00:58,407][DEBUG][gateway.local ] [Saint Elmo] [mailspool][3]: throttling allocation [[mailspool][3], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300]]]] on primary allocation
[2015-01-14 12:00:58,408][DEBUG][gateway.local ] [Saint Elmo] [mailspool][0]: throttling allocation [[mailspool][0], node[null], [P], s[UNASSIGNED]] to [[[Saint Elmo][gOgAuHo4SXyfyuPpws0Usw][es][inet[/172.16.45.250:9300]]]] on primary allocation
[2015-01-14 12:00:58,408][DEBUG][cluster.service ] [Saint Elmo] processing [routing-table-updater]: no change in cluster_state
[2015-01-14 12:01:31,619][WARN ][index.engine.internal ] [Saint Elmo] [mailspool][1] failed engine [refresh failed]
java.lang.OutOfMemoryError: Java heap space
	at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:187)
	at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:104)
	at org.elasticsearch.index.cache.filter.weighted.WeightedFilterCache$FilterCacheFilterWrapper.getDocIdSet(WeightedFilterCache.java:177)
	at org.elasticsearch.common.lucene.search.OrFilter.getDocIdSet(OrFilter.java:55)
	at org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(ApplyAcceptedDocsFilter.java:46)
	at org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:130)
	at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
	at org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:136)
	at org.apache.lucene.search.QueryWrapperFilter$1.iterator(QueryWrapperFilter.java:59)
	at org.apache.lucene.index.BufferedUpdatesStream.applyQueryDeletes(BufferedUpdatesStream.java:554)
	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:287)
	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)
	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)
	at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:292)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:267)
	at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:257)
	at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:171)
	at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:118)
	at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
	at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
	at org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:225)
	at org.elasticsearch.index.engine.internal.InternalEngine.refresh(InternalEngine.java:796)
	at org.elasticsearch.index.engine.internal.InternalEngine.delete(InternalEngine.java:692)
	at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryOperation(InternalIndexShard.java:798)
	at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:268)
	at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
[2015-01-14 12:01:32,238][DEBUG][index.service ] [Saint Elmo] [mailspool] [1] closing... (reason: [engine failure, message [refresh failed][OutOfMemoryError[Java heap space]]])
[2015-01-14 12:01:32,238][DEBUG][index.shard.service ] [Saint Elmo] [mailspool][1] state: [RECOVERING]->[CLOSED], reason [engine failure, message [refresh failed][OutOfMemoryError[Java heap space]]]
[2015-01-14 12:01:32,315][DEBUG][index.service ] [Saint Elmo] [mailspool] [1] closed (reason: [engine failure, message [refresh failed][OutOfMemoryError[Java heap space]]])

I tried adding a few settings to my elasticsearch.yml, as suggested in the referenced issue:

index.load_fixed_bitset_filters_eagerly: false
index.warmer.enabled: false
indices.breaker.total.limit: 30%

But none of these settings seems to work for me. Our mapping is visible here: http://git.blue-mind.net/gitlist/bluemind/blob/master/esearch/config/templates/mailspool.json

It is used to store a full-text index of emails and uses a parent/child structure: the msgBody type contains the full text of the messages and attachments, while the msg type contains user flags (unread, important, the folder it is stored in, etc.).
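To make the discussion self-contained, here is a simplified sketch of the structure (the field names "content" and "is" are illustrative, not the actual fields from our mailspool.json; the `_parent` declaration is the ES 1.x syntax we use):

```json
{
  "mappings": {
    "msgBody": {
      "properties": {
        "content": { "type": "string" }
      }
    },
    "msg": {
      "_parent": { "type": "msgBody" },
      "properties": {
        "is": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
```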
We use this structure because "msg" is updated often: mails are frequently marked as read or moved. A msgBody can be pretty big, so we don't want to rewrite the whole document when a simple email flag changes.

Does this kind of index structure bring to mind a particular bug or required setting? Is there any rule of thumb for sizing memory relative to index size on disk?

Regards, Thomas.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/580ef748-abe9-44f6-ab4e-e388fe5b7803%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.