The thing is, I don't think it is the monitor plugin. When this happens, my node gets disconnected and the cluster goes into yellow state until it recovers. I am using curator optimize; it is set to 2 segments for indices older than 2 days.
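On the question of counting the total number of segments, here is a minimal sketch of one way it could be done. It assumes the JSON layout returned by `GET /_segments` (an `indices` object, each index holding `shards`, each shard id mapping to a list of shard copies that each carry a `segments` object). The function names and the `http://localhost:9200` address are placeholders for illustration, not taken from my setup.

```python
import json
from urllib.request import urlopen


def count_segments(segments_response):
    """Return (per_index_counts, total) from a parsed GET /_segments body.

    Assumed layout: indices -> <index> -> shards -> <shard id> -> [copies],
    where each copy has a "segments" object keyed by segment name.
    """
    per_index = {}
    for index_name, index_data in segments_response.get("indices", {}).items():
        count = 0
        # Each shard id maps to a list of copies (primary and replicas).
        for shard_copies in index_data.get("shards", {}).values():
            for shard_copy in shard_copies:
                count += len(shard_copy.get("segments", {}))
        per_index[index_name] = count
    return per_index, sum(per_index.values())


def fetch_segments(base_url="http://localhost:9200"):
    """Fetch the segments listing from one node (base_url is a placeholder)."""
    with urlopen(base_url + "/_segments") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Running `count_segments(fetch_segments())` against any node should give a per-index breakdown and a cluster-wide total, which makes it easy to see whether indices newer than 2 days (not yet optimized) account for most of the segments.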
On Saturday, January 10, 2015 at 2:56:52 PM UTC+2, Revan007 wrote:
>
> Hey, thank you for answering, I am using Marvel latest version.
>
> Here is more info about the problem :
>
> https://github.com/elasticsearch/elasticsearch/issues/9212#issuecomment-69292232
>
> On Saturday, January 10, 2015 at 2:50:02 PM UTC+2, Jörg Prante wrote:
>>
>> If you see
>>
>> cluster:monitor/nodes/stats[n]] request_id [82300775] timed out after
>> [15000ms]
>>
>> in the logs, you have a monitor tool running that can not complete
>> requests because it takes longer than 15 seconds to traverse all the data
>> folders on all the nodes.
>>
>> There are a number of methods to reduce disk traversal time in the data
>> folders:
>>
>> - switch off monitoring (not really helpful) or reduce monitor interval
>> (maybe helpful, maybe not)
>>
>> - increase stats request timeout (if monitor tools allow this but this
>> does not solve the cause of the problem)
>>
>> - monitor only an index subset of your cluster (monitor tools usually do
>> not have this option)
>>
>> - reduce number of segments per node -> either by optimizing indices or
>> adding nodes
>>
>> - wait for a fix in a future ES release
>>
>> Have you counted the total number of segments? If the number is high, did
>> you run _optimize with max_num_segments on your indices to reduce the
>> number of segments?
>>
>> Jörg
>>
>> On Fri, Jan 9, 2015 at 6:55 AM, Revan007 <dra...@pionix.ro> wrote:
>>
>>> Hey,
>>>
>>> I am having trouble for some while. I am getting random node disconnects
>>> and I cannot explain why.
>>> There is no increase in traffic ( search or index ) when this is
>>> happening , it feels so random to me .
>>> I first thought it could be the aws cloud plugin so I removed it and
>>> used unicast and pointed directly to my nodes IPs but that didn't seem to
>>> be the problem .
>>> I changed the type of instances, now m3.2xlarge, added more instances,
>>> made so much modifications in ES yml config and still nothing .
>>> Changed java oracle from 1.7 to 1.8 , changed CMS collector to G1GC and
>>> still nothing .
>>>
>>> I am out of ideas ... how can I get more info on what is going on ?
>>>
>>> Here are the logs I can see from master node and the data node
>>> http://pastebin.com/GhKfRkaa
>>>
>>>
>>> Current config:
>>>
>>>
>>> 6 m3.x2large, 1 master, 5 data nodes.
>>> 414 indices, index/day
>>> 7372 shards. 9 shards, 1 replica per index
>>> 208 million documents, 430 GB
>>> 15 gb heap size allocated per node
>>> ES 1.4.2
>>>
>>> Current yml config here :
>>> http://pastebin.com/Nmdr7F6J
>>>
>>
>>

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4cc37e2e-4bbc-483d-bbbe-6cd0138d6689%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.