If you see

[cluster:monitor/nodes/stats[n]] request_id [82300775] timed out after [15000ms]

in the logs, you have a monitoring tool running that cannot complete its requests because it takes longer than 15 seconds to traverse all the data folders on all the nodes.

There are a number of ways to reduce disk traversal time in the data folders:

- switch off monitoring (not really helpful) or reduce the monitoring frequency (maybe helpful, maybe not)
- increase the stats request timeout (if the monitoring tool allows it, though this does not address the cause of the problem)
- monitor only a subset of the indices in your cluster (monitoring tools usually do not have this option)
- reduce the number of segments per node, either by optimizing indices or by adding nodes
- wait for a fix in a future ES release

Have you counted the total number of segments? If the number is high, did you run _optimize with max_num_segments on your indices to reduce the number of segments?

Jörg

On Fri, Jan 9, 2015 at 6:55 AM, Revan007 <drag...@pionix.ro> wrote:

> Hey,
>
> I have been having trouble for a while. I am getting random node disconnects
> and I cannot explain why.
> There is no increase in traffic (search or index) when this is happening,
> it feels so random to me.
> I first thought it could be the aws cloud plugin so I removed it and used
> unicast and pointed directly to my nodes' IPs but that didn't seem to be the
> problem.
> I changed the type of instances, now m3.2xlarge, added more instances,
> made so many modifications in the ES yml config and still nothing.
> Changed java oracle from 1.7 to 1.8, changed CMS collector to G1GC and
> still nothing.
>
> I am out of ideas ... how can I get more info on what is going on?
>
> Here are the logs I can see from master node and the data node
> http://pastebin.com/GhKfRkaa
>
> Current config:
>
> 6 m3.2xlarge, 1 master, 5 data nodes.
> 414 indices, index/day
> 7372 shards.
> 9 shards, 1 replica per index
> 208 million documents, 430 GB
> 15 gb heap size allocated per node
> ES 1.4.2
>
> Current yml config here :
> http://pastebin.com/Nmdr7F6J
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/85cc2abe-da8e-4170-8e7d-a4e01f4a22c3%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
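
For the segment count asked about in the reply, a minimal Python sketch may help. It assumes you have already fetched and parsed the JSON returned by `GET /_segments`, and that it has the ES 1.x shape of indices, then shards, then a list of shard copies each carrying a `segments` mapping. The function names, the sample data, and the `http://localhost:9200` address are illustrative only and do not come from the thread.

```python
# Sketch: count Lucene segments per index from a parsed `GET /_segments` response.
# Assumption: the response has the ES 1.x shape
#   {"indices": {<index>: {"shards": {<shard id>: [<shard copy>, ...]}}}}
# where each shard copy carries a "segments" mapping keyed by segment name.


def count_segments(segments_response):
    """Return {index_name: total segments across all shard copies}."""
    counts = {}
    for index_name, index_info in segments_response.get("indices", {}).items():
        total = 0
        for shard_copies in index_info.get("shards", {}).values():
            for shard_copy in shard_copies:
                total += len(shard_copy.get("segments", {}))
        counts[index_name] = total
    return counts


def optimize_url(base_url, index_name, max_num_segments=1):
    """Build the URL for a POST to _optimize with max_num_segments."""
    return "%s/%s/_optimize?max_num_segments=%d" % (
        base_url.rstrip("/"), index_name, max_num_segments)


# Hypothetical, hand-written sample (not taken from the poster's cluster).
SAMPLE = {
    "indices": {
        "logs-2015.01.08": {"shards": {
            "0": [{"segments": {"_0": {}, "_1": {}, "_2": {}}}],
            "1": [{"segments": {"_0": {}}},
                  {"segments": {"_0": {}, "_1": {}}}],
        }},
        "logs-2015.01.09": {"shards": {
            "0": [{"segments": {"_a": {}}}],
        }},
    }
}

if __name__ == "__main__":
    per_index = count_segments(SAMPLE)
    print(per_index, "total:", sum(per_index.values()))
    # Indices with many segments are candidates for an optimize call.
    print(optimize_url("http://localhost:9200", "logs-2015.01.08"))
```

With daily indices, older indices that no longer receive writes are the natural candidates for `_optimize`, since merging segments on an index that is still being written to is costly and the segments will multiply again.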