Hello, I have upgraded Elasticsearch from version 0.90.3 to version 1.3.4 (OS: Debian Wheezy, installed from the deb package). This is a cluster of 3 nodes using unicast discovery. The upgrade was done with Elasticsearch shut down. After starting the new version, the cluster health is 'yellow' according to the head plugin, but 'red' according to curl on /_cluster/health.
Three indices in the cluster have 3 unassigned shards. The logs on all nodes contain many "Corrupted index" and "sending failed shard for" messages. Would upgrading to version 1.4.2 fix the problem (because of the Lucene issue LUCENE-5975)? Deleting the indices and reloading their data is a last resort.

ES state from the first node:

curl -XGET 'http://127.0.0.1:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "searchcass",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 283,
  "active_shards" : 576,
  "relocating_shards" : 0,
  "initializing_shards" : 3,
  "unassigned_shards" : 3
}

How can I fix it? Please reply.

Regards
Grzesiek

ES log from node 1 (search01):
...
[2014-12-17 11:04:20,176][WARN ][cluster.action.shard ] [search01] [201205][0] received shard failed for [201205][0], node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [_na_], reason [master [search01][HYtX23nPS7uU-DeY-zF6AA][search01][inet[/192.168.199.211:9300]] marked shard as initializing, but shard is marked as failed, resend shard failure]
[2014-12-17 11:04:20,253][WARN ][indices.cluster ] [search01] [201301][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [201301][0] failed to fetch index version after copying it over
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.lucene.index.CorruptIndexException: [201301][0] Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by: CorruptIndexException[did not read all bytes from file: read 9650 vs size 9651 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))] at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353) at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338) at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119) ... 4 more [2014-12-17 11:04:20,279][WARN ][cluster.action.shard ] [search01] [201304][4] received shard failed for [201304][4], node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201304][4] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201304][4] Corrupted index [corrupted_7hrGiX_jTx2KLbQUIAiLpg] caused by: CorruptIndexException[did not read all bytes from file: read 295641 vs size 295642 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]]; ]] [2014-12-17 11:04:20,305][WARN ][cluster.action.shard ] [search01] [201304][4] received shard failed for [201304][4], node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [_na_], reason [master [search01][HYtX23nPS7uU-DeY-zF6AA][search01][inet[/192.168.199.211:9300]] marked shard as initializing, but shard is marked as failed, resend shard failure] [2014-12-17 11:04:20,329][WARN ][cluster.action.shard ] [search01] [201301][0] sending failed shard for [201301][0], node[HYtX23nPS7uU-DeY-zF6AA], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201301][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201301][0] Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by: CorruptIndexException[did not read all bytes from file: read 9650 vs size 9651 (resource: 
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcassandra/nodes/0/indices/201301/0/index/_5f9v_k.del")))]]; ]] [2014-12-17 11:04:20,329][WARN ][cluster.action.shard ] [search01] [201301][0] received shard failed for [201301][0], node[HYtX23nPS7uU-DeY-zF6AA], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201301][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201301][0] Corrupted index [corrupted_cFQBoZ-WTK2sW8mgUUv1vw] caused by: CorruptIndexException[did not read all bytes from file: read 9650 vs size 9651 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))]]; ]] [2014-12-17 11:04:20,331][WARN ][cluster.action.shard ] [search01] [201301][0] received shard failed for [201301][0], node[HYtX23nPS7uU-DeY-zF6AA], [P], s[INITIALIZING], indexUUID [_na_], reason [master [search01][HYtX23nPS7uU-DeY-zF6AA][search01][inet[/192.168.199.211:9300]] marked shard as initializing, but shard is marked as failed, resend shard failure] ... 
ES log from node 2 (search02): [2014-12-17 11:10:11,971][WARN ][cluster.action.shard ] [search02] [201301][0] sending failed shard for [201301][0], node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201301][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201301][0] Corrupted index [corrupted_U1eBtw3YRYKcfuV9ZHPadw] caused by: CorruptIndexException[did not read all bytes from file: read 9650 vs size 9651 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201301/0/index/_5f9v_k.del")))]]; ]] [2014-12-17 11:10:12,258][WARN ][indices.cluster ] [search02] [201205][0] failed to start shard org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [201205][0] failed to fetch index version after copying it over at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152) at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.lucene.index.CorruptIndexException: [201205][0] Corrupted index [corrupted_xCs6wOMpR-G3pbQfUpn-Ww] caused by: CorruptIndexException[did not read all bytes from file: read 205 vs size 206 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))] at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353) at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338) at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119) ... 
4 more [2014-12-17 11:10:12,278][WARN ][indices.cluster ] [search02] [201304][4] failed to start shard org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [201304][4] failed to fetch index version after copying it over at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152) at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.lucene.index.CorruptIndexException: [201304][4] Corrupted index [corrupted_mfMa6wjdT1m6QZ6WUBHKrA] caused by: CorruptIndexException[did not read all bytes from file: read 295641 vs size 295642 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))] at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353) at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338) at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119) ... 
4 more
[2014-12-17 11:10:12,282][WARN ][cluster.action.shard ] [search02] [201205][0] sending failed shard for [201205][0], node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201205][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201205][0] Corrupted index [corrupted_xCs6wOMpR-G3pbQfUpn-Ww] caused by: CorruptIndexException[did not read all bytes from file: read 205 vs size 206 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]]; ]]
[2014-12-17 11:10:12,297][WARN ][cluster.action.shard ] [search02] [201304][4] sending failed shard for [201304][4], node[OWUJ3lZbT5i00JKgrDFUcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201304][4] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201304][4] Corrupted index [corrupted_mfMa6wjdT1m6QZ6WUBHKrA] caused by: CorruptIndexException[did not read all bytes from file: read 295641 vs size 295642 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]]; ]]

ES log from node 3 (search03):
[2014-12-17 11:13:49,541][WARN ][cluster.action.shard ] [search03] [201205][0] sending failed shard for [201205][0], node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201205][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201205][0] Corrupted index [corrupted_weSqXhW_T9Wle8wEHhEnXw] caused by: CorruptIndexException[did not read all bytes from file: read 205 vs size 206 (resource:
BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]]; ]] [2014-12-17 11:13:49,581][WARN ][indices.cluster ] [search03] [201304][4] failed to start shard org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [201304][4] failed to fetch index version after copying it over at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152) at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.lucene.index.CorruptIndexException: [201304][4] Corrupted index [corrupted_7hrGiX_jTx2KLbQUIAiLpg] caused by: CorruptIndexException[did not read all bytes from file: read 295641 vs size 295642 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))] at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353) at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338) at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119) ... 
4 more [2014-12-17 11:13:49,651][WARN ][cluster.action.shard ] [search03] [201304][4] sending failed shard for [201304][4], node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201304][4] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201304][4] Corrupted index [corrupted_7hrGiX_jTx2KLbQUIAiLpg] caused by: CorruptIndexException[did not read all bytes from file: read 295641 vs size 295642 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201304/4/index/_294h_17.del")))]]; ]] [2014-12-17 11:13:49,747][WARN ][indices.cluster ] [search03] [201205][0] failed to start shard org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [201205][0] failed to fetch index version after copying it over at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152) at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.lucene.index.CorruptIndexException: [201205][0] Corrupted index [corrupted_weSqXhW_T9Wle8wEHhEnXw] caused by: CorruptIndexException[did not read all bytes from file: read 205 vs size 206 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))] at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353) at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338) at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119) ... 
4 more [2014-12-17 11:13:49,823][WARN ][cluster.action.shard ] [search03] [201205][0] sending failed shard for [201205][0], node[zygoKW7SR6CwvanVoNrPcw], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[201205][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[201205][0] Corrupted index [corrupted_weSqXhW_T9Wle8wEHhEnXw] caused by: CorruptIndexException[did not read all bytes from file: read 205 vs size 206 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]]; ]]
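
Since the same few messages repeat many times, it may help to condense the logs into a list of the affected files. The following is only a rough sketch in Python: the regular expressions are guessed from the log lines pasted in this message and from nothing else, and `corrupted_files` is a made-up helper name, not anything that ships with Elasticsearch.

```python
import re

# Innermost corruption message, as it appears in the pasted logs.
# ASSUMPTION: the message format is inferred from these logs only.
CORRUPT_RE = re.compile(
    r"did not read all bytes from file: read (\d+) vs size (\d+) "
    r"\(resource:\s*BufferedChecksumIndexInput\(NIOFSIndexInput\("
    r'path="([^"]+)"\)\)\)'
)

# Index name, shard number and file name are taken from the file path.
# ASSUMPTION: paths look like .../indices/<index>/<shard>/index/<file>.
PATH_RE = re.compile(r"/indices/([^/]+)/(\d+)/index/([^/]+)$")


def corrupted_files(log_text):
    """Return sorted, de-duplicated (index, shard, file, read, size) tuples."""
    found = set()
    for read, size, path in CORRUPT_RE.findall(log_text):
        match = PATH_RE.search(path)
        if match is None:
            # Path does not have the expected layout; skip it.
            continue
        index, shard, name = match.groups()
        found.add((index, int(shard), name, int(read), int(size)))
    return sorted(found)


if __name__ == "__main__":
    # One fragment copied from the search02 log above.
    fragment = (
        "CorruptIndexException[did not read all bytes from file: "
        "read 205 vs size 206 (resource: "
        "BufferedChecksumIndexInput(NIOFSIndexInput(path="
        '"/var/lib/elasticsearch/searchcass/nodes/0/indices/201205/0/index/_1ys_3.del")))]'
    )
    for row in corrupted_files(fragment):
        print(row)
```

Run over the three logs above, this should list the same three files on every node (`_1ys_3.del` in 201205, `_5f9v_k.del` in 201301 and `_294h_17.del` in 201304), each reported as one byte shorter than its expected size.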