I saw a few of those which I ultimately attributed to a flaky network connection. That's not going to explain every such episode, but it did solve one of mine.
On Thu, Oct 2, 2014 at 11:27 PM, Shobana Neelakantan <s.shoban...@gmail.com> wrote: > We are seeing the same issue as well. > Were you able to resolve this? > > On Thursday, April 17, 2014 2:38:06 AM UTC+5:30, Vishal Bhasin wrote: >> >> Hello - we are seeing the same issues, were you able to resolve this? >> thanks! >> >> On Sunday, 5 February 2012 16:48:25 UTC-6, Grant wrote: >>> >>> So we seem to be having recurring incidences of ES nodes getting into >>> a very odd state. In this particular case, one node because >>> unresponsive to test polls. I'm not really sure what to make of this, >>> because while this is ongoing, the cluster remains green, but the >>> borked node continues to try and service traffic, which means our app >>> is sporadically failing in the meantime. >>> >>> [UTC Feb 5 22:29:23] error : 'prod_elasticsearch_cluster_health' >>> failed protocol test [HTTP] at INET[10.180.48.216:9200/_cluster/ >>> health] via TCP -- HTTP: Error receiving data -- Resource temporarily >>> unavailable >>> [UTC Feb 5 22:30:28] error : 'prod_elasticsearch_cluster_health' >>> failed protocol test [HTTP] at INET[10.180.48.216:9200/_cluster/ >>> health] via TCP -- HTTP: Error receiving data -- Resource temporarily >>> unavailable >>> [UTC Feb 5 22:31:33] error : 'prod_elasticsearch_cluster_health' >>> failed protocol test [HTTP] at INET[10.180.48.216:9200/_cluster/ >>> health] via TCP -- HTTP: Error receiving data -- Resource temporarily >>> unavailable >>> >>> >>> Here's the corresponding logs from the node in question: >>> >>> Some of this: >>> >>> [2012-02-05 22:32:11,552][INFO ][discovery.zen ] [prod-es- >>> r07] master_left [[prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg][inet[prod-es- >>> r08.ihost.brewster.com/10.180.48.255:9300]] >>> <http://r08.ihost.brewster.com/10.180.48.255:9300%5D%5D>], reason [no >>> longer >>> master] >>> [2012-02-05 22:32:11,552][INFO ][cluster.service ] [prod-es- >>> r07] master {new [prod-es-r04][uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es- >>> r04.ihost.brewster.com/10.180.35.110:9300] >>> <http://r04.ihost.brewster.com/10.180.35.110:9300%5D>], previous >>> [prod-es-r08] >>> [mlrGPzm3QeCm7d_E_Lvozg][inet[prod-es-r08.ihost.brewster.com/ >>> 10.180.48.255:9300]]}, removed {[prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg] >>> [inet[prod-es-r08.ihost.brewster.com/10.180.48.255:9300] >>> <http://prod-es-r08.ihost.brewster.com/10.180.48.255:9300%5D>],}, >>> reason: >>> zen-disco-master_failed ([prod-es-r08][mlrGPzm3QeCm7d_E_Lvozg] >>> [inet[prod-es-r08.ihost.brewster.com/10.180.48.255:9300] >>> <http://prod-es-r08.ihost.brewster.com/10.180.48.255:9300%5D>]) >>> [2012-02-05 22:32:12,557][INFO ][discovery.zen ] [prod-es- >>> r07] master_left [[prod-es-r04][uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es- >>> r04.ihost.brewster.com/10.180.35.110:9300]] >>> <http://r04.ihost.brewster.com/10.180.35.110:9300%5D%5D>], reason [no >>> longer >>> master] >>> [2012-02-05 22:32:12,558][WARN ][discovery.zen ] [prod-es- >>> r07] not enough master nodes after master left (reason = no longer >>> master), current nodes: {[prod-es-r02][uuh4KmeHR-eUeIr7J97zCg] >>> [inet[prod-es-r02.ihost.brewster.com/10.182.14.95:9300] >>> <http://prod-es-r02.ihost.brewster.com/10.182.14.95:9300%5D>],[prod-es-r07] >>> >>> [zqJRs5e6S5eWfL0kVuolJg][inet[prod-es-r07.ihost.brewster.com/ >>> 10.180.48.216:9300]],} >>> [2012-02-05 22:32:12,559][INFO ][cluster.service ] [prod-es- >>> r07] removed {[prod-es-r02][uuh4KmeHR-eUeIr7J97zCg][inet[prod-es- >>> r02.ihost.brewster.com/10.182.14.95:9300] >>> <http://r02.ihost.brewster.com/10.182.14.95:9300%5D>],[prod-es-r04] >>> [uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es-r04.ihost.brewster.com/ >>> 10.180.35.110:9300]],}, reason: zen-disco-master_failed ([prod-es-r04] >>> [uOUyy7p_TBuNEbwmqWF9-w][inet[prod-es-r04.ihost.brewster.com/ >>> 10.180.35.110:9300]]) >>> [2012-02-05 22:32:12,565][WARN ][http.netty ] [prod-es- >>> r07] Caught exception while handling client http traffic, closing >>> connection [id: 0x09be0e53, /10.180.48.216:54645 => / >>> 10.180.48.216:9200] >>> java.lang.IllegalArgumentException: empty text >>> at >>> org.elasticsearch.common.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java: >>> >>> 103) >>> at >>> org.elasticsearch.common.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java: >>> >>> 68) >>> at >>> org.elasticsearch.common.netty.handler.codec.http.HttpRequestDecoder. >>> createMessage(HttpRequestDecoder.java: >>> 81) >>> at >>> org.elasticsearch.common.netty.handler.codec.http. >>> HttpMessageDecoder.decode(HttpMessageDecoder.java: >>> 198) >>> at >>> org.elasticsearch.common.netty.handler.codec.http. >>> HttpMessageDecoder.decode(HttpMessageDecoder.java: >>> 107) >>> at >>> org.elasticsearch.common.netty.handler.codec.replay. >>> ReplayingDecoder.callDecode(ReplayingDecoder.java: >>> 470) >>> at >>> org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder. >>> messageReceived(ReplayingDecoder.java: >>> 443) >>> at >>> org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler. >>> handleUpstream(SimpleChannelUpstreamHandler.java: >>> 80) >>> at >>> org.elasticsearch.common.netty.channel.DefaultChannelPipeline. >>> sendUpstream(DefaultChannelPipeline.java: >>> 564) >>> at org.elasticsearch.common.netty.channel.DefaultChannelPipeline >>> >>> $DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java: >>> 783) >>> at >>> org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java: >>> >>> 81) >>> at >>> org.elasticsearch.common.netty.channel.DefaultChannelPipeline. >>> sendUpstream(DefaultChannelPipeline.java: >>> 564) >>> at >>> org.elasticsearch.common.netty.channel.DefaultChannelPipeline. >>> sendUpstream(DefaultChannelPipeline.java: >>> 559) >>> at >>> org.elasticsearch.common.netty.channel.Channels. >>> fireMessageReceived(Channels.java: >>> 274) >>> at >>> org.elasticsearch.common.netty.channel.Channels. >>> fireMessageReceived(Channels.java: >>> 261) >>> at >>> org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java: >>> >>> 351) >>> at >>> org.elasticsearch.common.netty.channel.socket.nio. >>> NioWorker.processSelectedKeys(NioWorker.java: >>> 282) >>> at >>> org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java: >>> >>> 202) >>> at >>> org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java: >>> >>> 108) >>> at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker >>> >>> $1.run(DeadLockProofWorker.java:44) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java: >>> >>> 1110) >>> at java.util.concurrent.ThreadPoolExecutor >>> $Worker.run(ThreadPoolExecutor.java:603) >>> at java.lang.Thread.run(Thread.java:636) >>> [2012-02-05 22:32:12,814][DEBUG][action.search.type ] [prod-es- >>> r07] Node [Bbaoza_KTP2DJQgxM4JN-A] not available for scroll request >>> [scan;1;5092799:Bbaoza_KTP2DJQgxM4JN-A;1;total_hits:7200;] >>> [2012-02-05 22:32:12,815][DEBUG][action.search.type ] [prod-es- >>> r07] Node [Bbaoza_KTP2DJQgxM4JN-A] not available for scroll request >>> [scan;1;5092799:Bbaoza_KTP2DJQgxM4JN-A;1;total_hits:7200;] >>> [2012-02-05 22:32:14,066][WARN ][http.netty ] [prod-es- >>> r07] Caught exception while handling client http traffic, closing >>> connection [id: 0x2cb594cf, /10.180.48.216:54651 => / >>> 10.180.48.216:9200] >>> >>> >>> >>> >>> >>> Followed by tons of this: >>> >>> >>> [2012-02-05 22:32:24,572][DEBUG][action.search.type ] [prod-es- >>> r07] [contact_documents-33-0][0], node[ar6qMqYnRSm5f0zvpKDirA], [R], >>> s[STARTED]: Failed to execute >>> [org.elasticsearch.action.search.SearchRequest@5c67fa3d] >>> org.elasticsearch.transport.SendRequestTransportException: [prod-es- >>> r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]][search/ >>> phase/query >>> <http://prod-es-r06.ihost.brewster.com/10.180.46.203:9300%5D%5D%5Bsearch/phase/query>] >>> >>> at >>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java: >>> >>> 196) >>> at >>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java: >>> >>> 168) >>> at >>> org.elasticsearch.search.action.SearchServiceTransportAction. >>> sendExecuteQuery(SearchServiceTransportAction.java: >>> 140) >>> at org.elasticsearch.action.search.type.TransportSearchCountAction >>> >>> $AsyncAction.sendExecuteFirstPhase(TransportSearchCountAction.java:74) >>> at org.elasticsearch.action.search.type.TransportSearchTypeAction >>> >>> $BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:205) >>> at org.elasticsearch.action.search.type.TransportSearchTypeAction >>> >>> $BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java: >>> 279) >>> at org.elasticsearch.action.search.type.TransportSearchTypeAction >>> >>> $BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:211) >>> at org.elasticsearch.search.action.SearchServiceTransportAction >>> $2.handleException(SearchServiceTransportAction.java:151) >>> at org.elasticsearch.transport.TransportService >>> $2.run(TransportService.java:199) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java: >>> >>> 1110) >>> at java.util.concurrent.ThreadPoolExecutor >>> $Worker.run(ThreadPoolExecutor.java:603) >>> at java.lang.Thread.run(Thread.java:636) >>> Caused by: org.elasticsearch.transport.NodeNotConnectedException: >>> [prod-es-r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]] >>> Node not connected >>> at >>> org.elasticsearch.transport.netty.NettyTransport. >>> nodeChannel(NettyTransport.java: >>> 636) >>> at >>> org.elasticsearch.transport.netty.NettyTransport. >>> sendRequest(NettyTransport.java: >>> 448) >>> at >>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java: >>> >>> 181) >>> ... 11 more >>> [2012-02-05 22:32:24,572][DEBUG][action.search.type ] [prod-es- >>> r07] [contact_documents-4-0][0], node[ar6qMqYnRSm5f0zvpKDirA], [P], >>> s[STARTED]: Failed to execute >>> [org.elasticsearch.action.search.SearchRequest@718585ec] >>> org.elasticsearch.transport.SendRequestTransportException: [prod-es- >>> r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]][search/ >>> phase/query >>> <http://prod-es-r06.ihost.brewster.com/10.180.46.203:9300%5D%5D%5Bsearch/phase/query>] >>> >>> at >>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java: >>> >>> 196) >>> at >>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java: >>> >>> 168) >>> at >>> org.elasticsearch.search.action.SearchServiceTransportAction. >>> sendExecuteQuery(SearchServiceTransportAction.java: >>> 140) >>> at org.elasticsearch.action.search.type.TransportSearchCountAction >>> >>> $AsyncAction.sendExecuteFirstPhase(TransportSearchCountAction.java:74) >>> at org.elasticsearch.action.search.type.TransportSearchTypeAction >>> >>> $BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:205) >>> at org.elasticsearch.action.search.type.TransportSearchTypeAction >>> >>> $BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java: >>> 279) >>> at org.elasticsearch.action.search.type.TransportSearchTypeAction >>> >>> $BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:211) >>> at org.elasticsearch.search.action.SearchServiceTransportAction >>> $2.handleException(SearchServiceTransportAction.java:151) >>> at org.elasticsearch.transport.TransportService >>> $2.run(TransportService.java:199) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java: >>> >>> 1110) >>> at java.util.concurrent.ThreadPoolExecutor >>> $Worker.run(ThreadPoolExecutor.java:603) >>> at java.lang.Thread.run(Thread.java:636) >>> Caused by: org.elasticsearch.transport.NodeNotConnectedException: >>> [prod-es-r06][inet[prod-es-r06.ihost.brewster.com/10.180.46.203:9300]] >>> Node not connected >>> at >>> org.elasticsearch.transport.netty.NettyTransport. >>> nodeChannel(NettyTransport.java: >>> 636) >>> at >>> org.elasticsearch.transport.netty.NettyTransport. >>> sendRequest(NettyTransport.java: >>> 448) >>> at >>> org.elasticsearch.transport.TransportService.sendRequest(TransportService.java: >>> >>> 181) >>> ... 11 more >>> >>> etc >> >> -- > You received this message because you are subscribed to the Google Groups > "elasticsearch" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to elasticsearch+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/elasticsearch/87767368-7870-4e24-a33f-a618e2c045ba%40googlegroups.com > <https://groups.google.com/d/msgid/elasticsearch/87767368-7870-4e24-a33f-a618e2c045ba%40googlegroups.com?utm_medium=email&utm_source=footer> > . > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAH6s0fybrLioU5OxgMnj4s5GCtzLzj8ZHuU7%2BMgm6%3Dp3KNwuZw%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.