Hi All,

I am new to Elasticsearch, so please bear with me if some of these questions have obvious answers. I am running Elasticsearch 1.4.2 on a cloud VM with Linux Server release 5.9 (Tikanga). Everything was fine until the disk filled up completely; since then I have been getting shard-related errors.
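In case it helps with the diagnosis, this is roughly how I have been checking the cluster state (a minimal sketch using the node address from the log below; _cluster/health and _cat/shards are the stock 1.4 APIs):

    # Overall cluster health: status plus counts of unassigned/initializing shards
    curl -s 'http://10.49.216.121:9200/_cluster/health?pretty'
    # Per-shard view, to see which shards are UNASSIGNED and on which node
    curl -s 'http://10.49.216.121:9200/_cat/shards?v'
    # Confirm that disk space has actually been freed on the data path
    df -h /var/lib/elasticsearch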
This is a master node. Here is the ES error log snippet:

[2015-04-07 00:00:22,417][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:00:22,417][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 00:01:32,489][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 00:01:32,491][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:02:42,561][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 00:02:42,561][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:03:52,632][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 00:54:05,769][ERROR][marvel.agent.exporter ] [Node_f0cd] create failure (index:[.marvel-2015.04.07] type: [node_stats]): UnavailableShardsException[[.marvel-2015.04.07][0] Primary shard is not active or isn't assigned is a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@20ec107a]
[2015-04-07 01:15:07,070][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 01:15:07,071][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 01:16:17,145][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 02:40:15,177][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,439][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,485][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,527][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:40:15,574][DEBUG][action.search.type ] [Node_f0cd] All shards failed for phase: [query_fetch]
[2015-04-07 02:43:52,567][ERROR][marvel.agent.exporter ] [Node_f0cd] error sending data to [http://10.49.216.121:9200/.marvel-2015.04.07/_bulk]: SocketTimeoutException[Read timed out]
[2015-04-07 02:43:52,569][DEBUG][action.bulk ] [Node_f0cd] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
[2015-04-07 03:33:20,288][WARN ][netty.channel.DefaultChannelPipeline] An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
    at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
    at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)

I found a possible solution from Bigfoot in this article: http://stackoverflow.com/questions/28540659/elasticsearch-is-not-working-after-full-disk

The answer says: "I got it fixed by moving out of the way the /var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash-2015.02.16/1/translog/translog-1424037601837.recovering but I recon I now lost some events as this file was 40M?"

However, my directory structure is different (/var/lib/elasticsearch/elasticsearch/nodes/0/_state). Can anybody please tell me what I can do about it?

There is also another solution given in this link, probably as a last resort if nothing else works: http://stackoverflow.com/questions/21157466/all-shards-failed
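In case it is useful, this is the sketch I am using to hunt for any leftover ".recovering" translog files like the one in that answer (assuming the /var/lib/elasticsearch/elasticsearch/nodes/0 data path from above; so far I only see the _state directory under it):

    # Show the directory layout under the node's data path, to compare with the answer
    find /var/lib/elasticsearch/elasticsearch/nodes/0 -maxdepth 4 -type d
    # Look for any leftover translog recovery files anywhere under the data path
    find /var/lib/elasticsearch/elasticsearch/nodes/0 -name '*.recovering'

Thanks for looking into it.

Regards,
Abhishek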