Re: ES upgrade 0.20.6 to 1.4.2 - CorruptIndexException and FileNotFoundException
Any ideas?

On Wednesday, December 31, 2014 3:35:39 PM UTC+1, Georgeta wrote:

Hi All,

I have a 5-node cluster, which I upgraded from 0.20.6 to 1.4.2. When I start the cluster with shard allocation disabled, it comes up and goes into a yellow state, all good. When I enable shard allocation, the following WARN messages are generated:

INFO || elasticsearch[node1][clusterService#updateTask][T#1] org.elasticsearch.cluster.routing.allocation.decider [node1] updating [cluster.routing.allocation.disable_allocation] from [true] to [false]

[2014-12-31 13:46:26.310 GMT] WARN || elasticsearch[node1][[transport_server_worker.default]][T#4]{New I/O worker #21} org.elasticsearch.cluster.action.shard [node1] [index1][2] received shard failed for [index1][2], node[x6PqV8RMS8eA9GmBMZwjNQ], [P], s[STARTED], indexUUID [_na_], reason [engine failure, message [corrupt file detected source: [recovery phase 1]][RecoverFilesRecoveryException[[index1][2] Failed to transfer [69] files with total size of [6.5mb]]; nested: CorruptIndexException[checksum failed (hardware problem?) : expected=17tw8li actual=1ig9y12 resource=(org.apache.lucene.store.FSDirectory$FSIndexOutput@61297ce5)]; ]]

[2014-12-31 13:46:35.504 GMT] WARN || elasticsearch[node1][[transport_server_worker.default]][T#14]{New I/O worker #31} org.elasticsearch.cluster.action.shard [node1] [index2][0] received shard failed for [index2][0], node[GORnFBrmQLOAvK294MUHgA], [P], s[STARTED], indexUUID [_na_], reason [engine failure, message [corrupt file detected source: [recovery phase 1]][RecoverFilesRecoveryException[[index2][0] Failed to transfer [163] files with total size of [238.1mb]]; nested: CorruptIndexException[checksum failed (hardware problem?) : expected=ptu7cd actual=1jw7kx9 resource=(org.apache.lucene.store.FSDirectory$FSIndexOutput@38c14092)]; ]]

[2014-12-31 13:46:36.777 GMT] WARN || elasticsearch[node1][[transport_server_worker.default]][T#15]{New I/O worker #32} org.elasticsearch.cluster.action.shard [node1] [index2][0] received shard failed for [index2][0], node[GORnFBrmQLOAvK294MUHgA], [P], s[STARTED], indexUUID [_na_], reason [master [node1][8zFPkXuvQQWJvErc458tFA][dw1949demum.int.demandware.com][inet[/127.0.0.1:48003]]{local=false, power_zone=default} marked shard as started, but shard has not been created, mark shard as failed]

[2014-12-31 13:46:36.792 GMT] WARN || elasticsearch[node1][[transport_server_worker.default]][T#14]{New I/O worker #31} org.elasticsearch.cluster.action.shard [node1] [index1][2] received shard failed for [index1][2], node[2mIDLcOcQJO4i73QHb7d6Q], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[index1][2] failed recovery]; nested: EngineCreationFailureException[[index1][2] failed to open reader on writer]; nested: FileNotFoundException[No such file [_5aa.tis]]; ]]

[2014-12-31 13:46:47.261 GMT] WARN || elasticsearch[node1][[transport_server_worker.default]][T#6]{New I/O worker #23} org.elasticsearch.cluster.action.shard [node1] [index1][2] received shard failed for [index1][2], node[x6PqV8RMS8eA9GmBMZwjNQ], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[index1][2] failed to fetch index version after copying it over]; nested: CorruptIndexException[[index1][2] Preexisting corrupted index [corrupted_gExs5fftSwmCWWgUKN6Wbg] caused by: CorruptIndexException[checksum failed (hardware problem?) : expected=17tw8li actual=1ig9y12 resource=(org.apache.lucene.store.FSDirectory$FSIndexOutput@61297ce5)]

org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=17tw8li actual=1ig9y12 resource=(org.apache.lucene.store.FSDirectory$FSIndexOutput@61297ce5)
    at org.elasticsearch.index.store.LegacyVerification$Adler32VerifyingIndexOutput.verify(LegacyVerification.java:73)
    at org.elasticsearch.index.store.Store.verify(Store.java:365)
    at org.elasticsearch.indices.recovery.RecoveryTarget$FileChunkTransportRequestHandler.messageReceived(RecoveryTarget.java:599)
    at org.elasticsearch.indices.recovery.RecoveryTarget$FileChunkTransportRequestHandler.messageReceived(RecoveryTarget.java:536)
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
    Suppressed: org.elasticsearch.transport.RemoteTransportException: [node5][inet[/127.0.0.1:48043]][internal:index/shard/recovery/file_chunk]
    Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=55hiu actual=16i3yt2 resource=(org.apache.lucene.store.FSDirectory$FSIndexOutput@108f1be6
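As an aside on the first log line above: allocation is toggled through the cluster settings API by flipping `cluster.routing.allocation.disable_allocation`. A minimal sketch of building that request body for a 1.x cluster (the helper name is my own; the payload would be PUT to `/_cluster/settings`):

```python
import json

def allocation_settings_body(disable):
    """Build the _cluster/settings payload that toggles the 1.x
    cluster.routing.allocation.disable_allocation setting."""
    return json.dumps({
        "transient": {
            "cluster.routing.allocation.disable_allocation": disable
        }
    })

# Disable allocation before a rolling restart, re-enable afterwards, e.g.:
#   curl -XPUT localhost:9200/_cluster/settings -d '<body>'
print(allocation_settings_body(True))   # before restarting nodes
print(allocation_settings_body(False))  # once all nodes are back up
```

Using a `transient` setting means the override disappears if the whole cluster restarts, which is usually what you want for a temporary maintenance toggle.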
Re: ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException
Thank you :)

On Tuesday, December 30, 2014 3:08:51 PM UTC+1, rcmuir wrote:

Yes. Again, use the latest version (1.4.x). It's very simple.

On Tue, Dec 30, 2014 at 8:53 AM, Georgeta Boanea gio...@gmail.com wrote:

The Lucene bug refers to versions 3.0-3.3, while Elasticsearch 0.20.6 uses Lucene 3.6. Is it the same bug?

On Tuesday, December 30, 2014 2:08:48 PM UTC+1, Robert Muir wrote:

This bug occurs because you are upgrading to an old version of Elasticsearch (1.3.4). Try the latest version, where the bug is fixed: https://issues.apache.org/jira/browse/LUCENE-5975

On Fri, Dec 19, 2014 at 5:40 AM, Georgeta Boanea gio...@gmail.com wrote:
[quoted original message trimmed; identical to the "ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException" post below]

-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/39216d8f-da8e-4793-abcc-dd004586d45f%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
ES upgrade 0.20.6 to 1.4.2 - CorruptIndexException and FileNotFoundException
) at org.apache.lucene.codecs.lucene3x.Lucene3xPostingsFormat.fieldsProducer(Lucene3xPostingsFormat.java:62)
    at org.apache.lucene.index.SegmentCoreReaders.init(SegmentCoreReaders.java:120)
    at org.apache.lucene.index.SegmentReader.init(SegmentReader.java:108)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:144)
    at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:238)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:104)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:422)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:112)
    at org.apache.lucene.search.SearcherManager.init(SearcherManager.java:89)
    at org.elasticsearch.index.engine.internal.InternalEngine.buildSearchManager(InternalEngine.java:1527)
    at org.elasticsearch.index.engine.internal.InternalEngine.start(InternalEngine.java:309)
    ... 6 more

Is there a way to resolve the checksum problem? Any idea why the files are deleted? Any help would be greatly appreciated :)

Thank you,
Georgeta
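On the checksum question: the `expected=17tw8li actual=1ig9y12` strings in these errors appear to be legacy Adler32 checksums printed in base 36 (Java's `Long.toString(value, Character.MAX_RADIX)`). A small sketch to decode and compare a pair, using the values from the log above:

```python
def decode_checksum(s):
    """Decode a checksum string as printed in the Lucene error message.
    These look like base-36 longs (Long.toString(v, Character.MAX_RADIX))."""
    return int(s, 36)

# Values taken from the CorruptIndexException above.
expected = decode_checksum("17tw8li")  # checksum recorded when the file was written
actual = decode_checksum("1ig9y12")    # checksum computed over the bytes received
print(expected == actual)              # -> False
```

A mismatch only confirms that the bytes read differ from what was recorded at write time; it does not tell you whether the source copy, the target copy, or the transfer is at fault.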
Re: ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException
Any ideas?

On Friday, December 19, 2014 11:40:37 AM UTC+1, Georgeta Boanea wrote:
[quoted original message trimmed; identical to the "ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException" post below]
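The symptom throughout this thread is a cluster stuck RED with shards that never become active. A minimal sketch for checking a `_cluster/health` response programmatically (the field names come from the health output quoted in this thread; the function name is my own):

```python
def unhealthy_shards(health):
    """Summarize a _cluster/health response: report status and how many
    shards are not yet active (initializing + unassigned)."""
    pending = health.get("initializing_shards", 0) + health.get("unassigned_shards", 0)
    return "%s: %d shard(s) not active" % (health["status"], pending)

# Values from the cluster health output quoted in this thread:
health = {
    "status": "red",
    "number_of_nodes": 5,
    "number_of_data_nodes": 5,
    "active_primary_shards": 10,
    "active_shards": 20,
    "relocating_shards": 0,
    "initializing_shards": 1,
    "unassigned_shards": 1,
}
print(unhealthy_shards(health))  # red: 2 shard(s) not active
```

In practice the dict would come from `GET /_cluster/health` (e.g. via `json.load(urlopen(...))`); polling this after re-enabling allocation makes it obvious when a shard is stuck rather than merely slow.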
Re: ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException
The Lucene bug refers to versions 3.0-3.3, while Elasticsearch 0.20.6 uses Lucene 3.6. Is it the same bug?

On Tuesday, December 30, 2014 2:08:48 PM UTC+1, Robert Muir wrote:

This bug occurs because you are upgrading to an old version of Elasticsearch (1.3.4). Try the latest version, where the bug is fixed: https://issues.apache.org/jira/browse/LUCENE-5975

On Fri, Dec 19, 2014 at 5:40 AM, Georgeta Boanea gio...@gmail.com wrote:
[quoted original message trimmed; identical to the "ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException" post below]
ES upgrade 0.20.6 to 1.3.4 - CorruptIndexException
Hi All,

After upgrading from ES 0.20.6 to 1.3.4 the following messages occurred:

[2014-12-19 10:02:06.714 GMT] WARN || elasticsearch[es-node-name][generic][T#14] org.elasticsearch.cluster.action.shard [es-node-name] [index-name][3] sending failed shard for [index-name][3], node[qOTLmb3IQC2COXZh1n9O2w], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[index-name][3] failed to fetch index version after copying it over]; nested: CorruptIndexException[[index-name][3] Corrupted index [corrupted_Ackui00SSBi8YXACZGNDkg] caused by: CorruptIndexException[did not read all bytes from file: read 112 vs size 113 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path=path/3/index/_uzm_2.del)))]]; ]]

[2014-12-19 10:02:08.390 GMT] WARN || elasticsearch[es-node-name][generic][T#20] org.elasticsearch.indices.cluster [es-node-name] [index-name][3] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [index-name][3] failed to fetch index version after copying it over
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:152)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.index.CorruptIndexException: [index-name][3] Corrupted index [corrupted_Ackui00SSBi8YXACZGNDkg] caused by: CorruptIndexException[did not read all bytes from file: read 112 vs size 113 (resource: BufferedChecksumIndexInput(NIOFSIndexInput(path=path/3/index/_uzm_2.del)))]
    at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:353)
    at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:338)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:119)
    ... 4 more

Shard [3] of the index remains unallocated and the cluster remains in a RED state.

curl -XGET 'http://localhost:48012/_cluster/health?pretty=true'
{
  "cluster_name" : "cluster-name",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 5,
  "active_primary_shards" : 10,
  "active_shards" : 20,
  "relocating_shards" : 0,
  "initializing_shards" : 1,
  "unassigned_shards" : 1
}

If I optimize the index before the upgrade (curl -XPOST 'http://localhost:48012/index-name/_optimize?max_num_segments=1'), everything is fine. Optimize helps only if it is done before the upgrade; if it is done after the upgrade, the problem remains the same. Any idea why this problem occurs? Is there another way to avoid it? I would like to avoid optimizing in case of large data volumes.

Thank you,
Georgeta
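The workaround described above is a forced merge down to one segment per index before the upgrade, via the 1.x-era `_optimize` endpoint. A sketch of building that request per index (the host, port, and index name are the placeholders from the post; you would POST each URL, e.g. with curl -XPOST):

```python
def optimize_url(base, index, max_num_segments=1):
    """Build the 1.x _optimize URL used as the pre-upgrade workaround
    quoted above. The result is meant to be POSTed to the cluster."""
    return "%s/%s/_optimize?max_num_segments=%d" % (base, index, max_num_segments)

# One call per index, run against the cluster *before* the upgrade:
for index in ["index-name"]:
    print(optimize_url("http://localhost:48012", index))
    # -> http://localhost:48012/index-name/_optimize?max_num_segments=1
```

Merging to a single segment rewrites every segment file, which is presumably why it sidesteps the corruption check: the old 0.20-era files are replaced wholesale before 1.x tries to verify them. For large indices this is an expensive I/O operation, which matches the poster's reluctance to rely on it.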