Re: ES 1.4.2 random node disconnect

joergpra...@gmail.com Sat, 10 Jan 2015 04:50:08 -0800

If you see

cluster:monitor/nodes/stats[n]] request_id [82300775] timed out after [15000ms]


in the logs, a monitoring tool is running whose stats requests cannot
complete, because traversing all the data folders on all the nodes takes
longer than the 15-second timeout.

There are several ways to reduce disk traversal time in the data folders,
or to work around it:

- switch off monitoring (not really helpful) or reduce monitor interval
(maybe helpful, maybe not)

- increase the stats request timeout (if the monitoring tool allows it; this
does not address the cause of the problem)

- monitor only a subset of the indices in your cluster (monitoring tools
usually do not have this option)

- reduce the number of segments per node, either by optimizing indices or
by adding nodes

- wait for a fix in a future ES release

Have you counted the total number of segments? If the number is high, did
you run _optimize with max_num_segments on your indices to reduce the
number of segments?
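
As a rough illustration of the counting step, here is a minimal Python sketch. It assumes the response of GET /_segments has the shape indices -> shards -> list of shard copies -> "segments" map, and the helper names (count_segments, optimize_url, fetch_segments) and the index name in the usage note are made up for this example. Replica copies are counted as well, since they also live in the data folders.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def count_segments(segments_response):
    """Return {index_name: segment_count} from a parsed GET /_segments body.

    Assumed shape: indices -> <index> -> shards -> <shard id> -> list of
    shard copies, each carrying a "segments" map keyed by segment name.
    """
    counts = {}
    for index_name, index_data in segments_response.get("indices", {}).items():
        total = 0
        for shard_copies in index_data.get("shards", {}).values():
            for shard_copy in shard_copies:  # primary and replica copies
                total += len(shard_copy.get("segments", {}))
        counts[index_name] = total
    return counts


def optimize_url(base_url, index_name, max_num_segments=1):
    """Build the URL for a POST to _optimize with max_num_segments."""
    return "{}/{}/_optimize?max_num_segments={}".format(
        base_url.rstrip("/"), quote(index_name, safe=""), max_num_segments
    )


def fetch_segments(base_url):
    """Fetch and parse GET /_segments (needs a reachable cluster)."""
    with urlopen(base_url.rstrip("/") + "/_segments") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like counts = count_segments(fetch_segments("http://localhost:9200")), followed by sum(counts.values()) for the cluster-wide total. An index with a high count could then be optimized by sending a POST to optimize_url("http://localhost:9200", "some-index"). Optimizing is I/O heavy, so with one index per day it is probably best limited to indices that are no longer being written to.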

Jörg

On Fri, Jan 9, 2015 at 6:55 AM, Revan007 <drag...@pionix.ro> wrote:

> Hey,
>
> I have been having trouble for a while. I am getting random node disconnects
> and I cannot explain why.
> There is no increase in traffic (search or index) when this happens;
> it feels completely random to me.
> I first thought it could be the AWS cloud plugin, so I removed it, switched
> to unicast, and pointed directly to my nodes' IPs, but that didn't seem to
> be the problem.
> I changed the instance type (now m3.2xlarge), added more instances, and
> made many modifications to the ES yml config, and still nothing.
> I changed Oracle Java from 1.7 to 1.8 and the CMS collector to G1GC, and
> still nothing.
>
> I am out of ideas... how can I get more info on what is going on?
>
> Here are the logs I can see from the master node and the data node:
> http://pastebin.com/GhKfRkaa
>
>
> Current config:
>
>
> 6 m3.2xlarge, 1 master, 5 data nodes.
> 414 indices, index/day
> 7372 shards. 9 shards, 1 replica per index
> 208 million documents, 430 GB
> 15 GB heap size allocated per node
> ES 1.4.2
>
> Current yml config here :
> http://pastebin.com/Nmdr7F6J
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/85cc2abe-da8e-4170-8e7d-a4e01f4a22c3%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/85cc2abe-da8e-4170-8e7d-a4e01f4a22c3%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

