Hi, in my setup the Marvel node is separate from the production cluster: the production nodes send their data to the Marvel node. The Marvel node hit an OutOfMemoryError, which brings me to the question: how much heap does it need? I ran it with the default configuration.
In my production cluster I have a load balancer node that holds no data and runs with just 2GB of heap. Because of the Marvel failure, this node was getting timeouts and, for a reason I don't understand, went down. What are the best practices here, and how can I avoid this in the future?

Marvel node:

[2014-04-17 09:13:33,715][WARN ][index.engine.internal ] [Gorilla-Man] [.marvel-2014.04.17][0] failed engine
java.lang.OutOfMemoryError: Java heap space
[2014-04-17 09:13:46,890][ERROR][index.engine.internal ] [Gorilla-Man] [.marvel-2014.04.17][0] failed to acquire searcher, source search_factory
org.apache.lucene.store.AlreadyClosedException: this ReferenceManager is closed
    at org.apache.lucene.search.ReferenceManager.acquire(ReferenceManager.java:98)
...

ES LB node:

[2014-04-17 00:01:00,567][ERROR][marvel.agent.exporter ] [Darkoth] create failure (index:[.marvel-2014.04.16] type: [node_stats]): UnavailableShardsException [[.marvel-2014.04.16][0] [2] shardIt, [0] active : Timeout waiting for [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@5d9be928]
[2014-04-17 06:41:46,975][ERROR][marvel.agent.exporter ] [Darkoth] error connecting to [ip-10-68-145-124.ec2.internal:9200]
java.net.SocketTimeoutException: connect timed out
[2014-04-17 18:53:09,969][DEBUG][action.admin.cluster.node.info] [Darkoth] failed to execute on node [L1f57myxQLK1SSRHRFcvFQ]
java.lang.OutOfMemoryError: Java heap space
[2014-04-17 19:35:05,805][DEBUG][action.search.type ] [Witchfire] [twitter_072013][0], node[5GNeFfbPTGi-1EccVvR7Nw], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@2f94d571] lastShard [true]
org.elasticsearch.transport.RemoteTransportException: [Mauvais][inet[/10.183.42.216:9300]][search/phase/query]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler@4c75d754
    at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)