Re: Failing Replica Shards
Small mistake. Point 1 should read:

1. If the shard had more than one segment, optimizing it down to one segment usually worked.

On Sunday, 30 November 2014 12:00:37 UTC+1, Jakub Podeszwik wrote:
> I've had similar problems. Two things that helped:
> 1. If index had more than one shard then optimizing it to one shard
> usually worked.
> 2. In other case manually copying shard files from node with master shard
> to one of nodes that kept failing.

-- You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bef48895-f1ec-41d3-9f3c-6009723f103b%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
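A minimal sketch of the "optimize to one segment" step, assuming an ES 1.x node reachable at `http://localhost:9200` (the host is an assumption, not something stated in the thread). On ES 1.x the segment merge is exposed through the `_optimize` endpoint with `max_num_segments`; later releases renamed it to `_forcemerge`. Note that the call applies to every shard of the named index rather than to one shard. The helper name `optimize_request` is hypothetical, and the code only builds the request without sending it.

```python
from urllib.parse import quote
from urllib.request import Request


def optimize_request(index, host="http://localhost:9200", max_num_segments=1):
    """Build (but do not send) an ES 1.x _optimize request for one index."""
    # The host default is an assumption; point it at any node of the cluster.
    url = f"{host}/{quote(index, safe='')}/_optimize?max_num_segments={max_num_segments}"
    # An empty body with an explicit POST method is all the endpoint needs.
    return Request(url, data=b"", method="POST")


req = optimize_request("eventlog-2014.07")
print(req.get_method(), req.full_url)
# To actually run the merge: urllib.request.urlopen(req)
```

Merging down to one segment rewrites the shard's files, so it is probably best tried on indices that no longer receive writes, such as the older monthly log indices in this thread.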
Re: Failing Replica Shards
I've had similar problems. Two things that helped:

1. If the index had more than one shard, optimizing it down to one shard usually worked.
2. Otherwise, manually copying the shard files from the node holding the primary shard to one of the nodes that kept failing.

On Sunday, 30 November 2014 00:57:02 UTC+1, David Kleiner wrote:
> Hello Mehmet,
>
> For two indices with problematic shards [...]
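For the manual copy in point 2, the on-disk location of a shard can be read off the paths in the error log: `<data root>/nodes/0/indices/<index>/<shard>`. The sketch below only assembles an `rsync` command and prints it; nothing is executed. It assumes both nodes use the same data root, that the command is run on the node holding the healthy primary, and that the Elasticsearch process on the target node is stopped while files are copied. The host name `target-node` and both helper names are hypothetical.

```python
from pathlib import PurePosixPath


def shard_dir(data_root, index, shard, node_ordinal=0):
    """Directory of one shard, following the layout seen in the logged paths."""
    return PurePosixPath(data_root) / "nodes" / str(node_ordinal) / "indices" / index / str(shard)


def rsync_command(data_root, index, shard, target_host):
    """Build an rsync invocation that copies the shard directory to another node."""
    src = shard_dir(data_root, index, shard)
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    return ["rsync", "-a", f"{src}/", f"{target_host}:{src}/"]


# "target-node" is a placeholder for a node whose replica keeps failing.
cmd = rsync_command("/acc/ES/NBS", "eventlog-2014.07", 0, "target-node")
print(" ".join(cmd))
```

Printing the command first leaves room to check the source and destination paths by hand before anything is overwritten.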
Re: Failing Replica Shards
Hello Mehmet,

For two indices with problematic shards (symptoms: the shard starts recovering, recovery stops, and recovery is then attempted on a different node), I manually changed the replica count to 1, then to 2. With a big index (over 90G, I think), I was never able to recover the dual replica set; thankfully, it was OK to drop it. Upgrading to a more recent ES version helped too.

HTH,

David

On Saturday, November 29, 2014 2:48:45 AM UTC-8, Mehmet Cem Güntürkün wrote:
> Hey David, I have same problem now. Have you found a solution for that
> problem?
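The replica-count change described above can be sketched with the index settings endpoint, which accepts `number_of_replicas` on a live index. The snippet builds the two requests (1 replica, then 2) without sending them. The host `http://localhost:9200`, the example index name, and the helper name `replica_request` are assumptions for illustration.

```python
import json
from urllib.request import Request


def replica_request(index, replicas, host="http://localhost:9200"):
    """Build (but do not send) a settings update that sets the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        f"{host}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Step the replica count to 1 first, then to 2, as described in the message above.
steps = [replica_request("eventlog-2014.07", n) for n in (1, 2)]
for req in steps:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# To apply a step: urllib.request.urlopen(req), then wait for recovery before the next one.
```

Going through 1 before 2 means only one replica per shard is rebuilt from the primary at a time, which seems to be the point of doing it in two steps.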
Re: Failing Replica Shards
Hey David, I have the same problem now. Have you found a solution for it?

On Tuesday, 26 August 2014 23:08:55 UTC+3, David Kleiner wrote:
> In the past couple of days I've been getting a lot of error messages about
> corrupted replica shards. [...]
Failing Replica Shards
Hello,

In the past couple of days I've been getting a lot of error messages about corrupted replica shards. The primary shards come up fast after ES process restart but replicas take a long time to come back. Sometimes it takes a few node restarts to 'kick' the nodes to start replica shards.

ES version is 1.3.1 running on CentOS 6.5 hosted at Softlayer. It's a 3-way cluster with 4 logstash feeders hanging off it.

Here are the errors:

[2014-08-26 15:01:18,682][WARN ][cluster.action.shard ] [log03 / Salvador Dali] [downloader-2014.08][4] received shard failed for [downloader-2014.08][4], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [2vRrb5YlQP6MTVr1chOezg], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[downloader-2014.08][4] Corrupted index [corrupted_SkU0-ZHZRxivSnGczABb_g] caused by: CorruptIndexException[codec footer mismatch: actual footer=-1676705023 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/downloader-2014.08/4/index/_k9a_es090_0.doc"))

[2014-08-26 15:01:18,682][WARN ][cluster.action.shard ] [log03 / Salvador Dali] [eventlog-2014.06][0] received shard failed for [eventlog-2014.06][0], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [jbvChdRrRB6HTutxPvxMmQ], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[eventlog-2014.06][0] Corrupted index [corrupted__712QIBQQqafzpBoQwZtcg] caused by: CorruptIndexException[codec footer mismatch: actual footer=0 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.06/0/index/_1k4x.nvd"))

[2014-08-26 15:01:18,684][WARN ][cluster.action.shard ] [log03 / Salvador Dali] [eventlog-2014.07][0] received shard failed for [eventlog-2014.07][0], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [T4tTXkPjTaCdSVNTjHfOcg], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[eventlog-2014.07][0] Corrupted index [corrupted_OzfNRRGyTIq8a1PRhLYG2w] caused by: CorruptIndexException[codec footer mismatch: actual footer=0 vs expected footer=-1071082520 (resource: NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.07/0/index/_rqf.nvd"))

Thanks,

David
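The log entries above can be picked apart mechanically to list which index, shard, and file each failure points at. The sketch below does that for the third entry and also checks the "expected footer" value: as far as I know, Lucene's `CodecUtil` defines the footer magic as the bitwise complement of its header magic `0x3fd76c17`, which as a signed 32-bit integer is the -1071082520 seen in the log. The regular expression and the helper names are illustrative assumptions, written against the three entries in this thread only.

```python
import re
import struct

# Lucene CodecUtil constants (header magic and its complement used in file footers).
CODEC_MAGIC = 0x3FD76C17
FOOTER_MAGIC = (~CODEC_MAGIC) & 0xFFFFFFFF


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed int, as Java prints it."""
    return struct.unpack(">i", struct.pack(">I", value))[0]


# Pattern written for the "received shard failed" lines quoted in this thread.
LINE_RE = re.compile(
    r"\[(?P<index>[^\]\[]+)\]\[(?P<shard>\d+)\] received shard failed"
    r".*?actual footer=(?P<actual>-?\d+) vs expected footer=(?P<expected>-?\d+)"
    r".*?path=\"(?P<path>[^\"]+)\"",
    re.DOTALL,
)


def parse_failure(line):
    """Return index, shard, footer values and file path, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "index": m["index"],
        "shard": int(m["shard"]),
        "actual": int(m["actual"]),
        "expected": int(m["expected"]),
        "path": m["path"],
    }


SAMPLE = (
    '[2014-08-26 15:01:18,684][WARN ][cluster.action.shard ] [log03 / Salvador Dali] '
    '[eventlog-2014.07][0] received shard failed for [eventlog-2014.07][0], '
    'node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], indexUUID [T4tTXkPjTaCdSVNTjHfOcg], '
    'reason [engine failure, message [corrupted preexisting index]'
    '[CorruptIndexException[[eventlog-2014.07][0] Corrupted index '
    '[corrupted_OzfNRRGyTIq8a1PRhLYG2w] caused by: CorruptIndexException[codec footer '
    'mismatch: actual footer=0 vs expected footer=-1071082520 (resource: '
    'NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.07/0/index/_rqf.nvd"))'
)

info = parse_failure(SAMPLE)
print(info["index"], info["shard"], info["path"])
print("expected footer matches FOOTER_MAGIC:", info["expected"] == as_signed32(FOOTER_MAGIC))
```

An actual footer of 0, as in two of the three entries, means zero bytes were read where the magic should be, which would be consistent with a file that was truncated or only partly written on the replica.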