Hi All,

I am new to Elasticsearch, so please bear with me if some of these questions have obvious answers. I am running Elasticsearch 1.4.2 on a cloud VM with Linux Server release 5.9 (Tikanga). Everything was fine until the disk filled up completely; since then I have been getting shard-related errors.
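In case it helps with the diagnosis, this is roughly how I have been checking the cluster state (a minimal sketch using the node address from the log below; _cluster/health and _cat/shards are the stock 1.4 APIs):

    # Overall cluster health: status plus counts of unassigned/initializing shards
    curl -s 'http://10.49.216.121:9200/_cluster/health?pretty'
    # Per-shard view, to see which shards are UNASSIGNED and on which node
    curl -s 'http://10.49.216.121:9200/_cat/shards?v'
    # Confirm that disk space has actually been freed on the data path
    df -h /var/lib/elasticsearch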
This is a master node. Here is the ES error log snippet:

[2015-04-07 00:00:22,417][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:00:22,417][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 00:01:32,489][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 00:01:32,491][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:02:42,561][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 00:02:42,561][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:03:52,632][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:54:05,769][ERROR][marvel.agent.exporter ] [Node_f0cd] create failure (index:[.marvel-2015.04.07] type: [node_stats]): UnavailableShardsException[[.marvel-2015.04.07][0] Primary shard is not active or isn't assigned is a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@20ec107a]
[2015-04-07 01:15:07,070][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 01:15:07,071][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 01:16:17,145][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 02:40:15,177][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,439][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,485][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,527][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,574][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:43:52,567][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 02:43:52,569][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 03:33:20,288][WARN ][netty.channel.DefaultChannelPipeline] An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
    at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
    at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)

I found a possible solution from Bigfoot in this article: http://stackoverflow.com/questions/28540659/elasticsearch-is-not-working-after-full-disk

The answer says: "I got it fixed by moving out of the way the /var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash-2015.02.16/1/translog/translog-1424037601837.recovering but I recon I now lost some events as this file was 40M?"

However, my directory structure is different (/var/lib/elasticsearch/elasticsearch/nodes/0/_state). Can anybody please tell me what I can do about it?

There is also another solution given in this link, probably as a last resort if nothing else works: http://stackoverflow.com/questions/21157466/all-shards-failed
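In case it is useful, this is the sketch I am using to hunt for any leftover ".recovering" translog files like the one in that answer (assuming the /var/lib/elasticsearch/elasticsearch/nodes/0 data path from above; so far I only see the _state directory under it):

    # Show the directory layout under the node's data path, to compare with the answer
    find /var/lib/elasticsearch/elasticsearch/nodes/0 -maxdepth 4 -type d
    # Look for any leftover translog recovery files anywhere under the data path
    find /var/lib/elasticsearch/elasticsearch/nodes/0 -name '*.recovering'

Thanks for looking into it.

Regards,
Abhishek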