I've had similar problems. Two things that helped:
1. If the index had more than one shard, optimizing it down to one shard 
usually worked.
2. Otherwise, I manually copied the shard files from the node holding the 
primary shard to one of the nodes that kept failing.
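
A minimal sketch of the first suggestion, assuming that "optimizing" refers to the ES 1.x `_optimize` endpoint with its `max_num_segments` parameter. That endpoint merges Lucene segments inside each shard, so this is one reading of the step rather than something confirmed above. The helper name `build_optimize_request` and the `http://localhost:9200` address are hypothetical, and the function only builds the request without sending it.

```python
import urllib.request


def build_optimize_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a POST to the ES 1.x _optimize endpoint.

    Assumption: base_url points at a reachable node, e.g. a local one
    listening on port 9200.
    """
    url = "{}/{}/_optimize?max_num_segments={}".format(
        base_url.rstrip("/"), index, int(max_num_segments)
    )
    # An empty body is enough; the parameters travel in the query string.
    return urllib.request.Request(url, data=b"", method="POST")
```

To run it against a cluster, pass the returned object to `urllib.request.urlopen`. Building the request separately makes it easy to print the URL and check it before touching a production index.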

On Sunday, 30 November 2014 00:57:02 UTC+1, David Kleiner wrote:
>
> Hello Mehmet,
>
> For two indices with problematic shards (symptoms: a shard starts 
> recovering, recovery stops, and recovery is then attempted on a different 
> node), I manually changed the replica count to 1 and then to 2.  With a big 
> index (over 90G, I think), I was never able to recover the dual replica 
> set; thankfully it was OK to drop it.  Upgrading to a more recent ES 
> version helped too.
>
> HTH,
>
> David
>
> On Saturday, November 29, 2014 2:48:45 AM UTC-8, Mehmet Cem Güntürkün 
> wrote:
>>
>> Hey David, I have the same problem now. Have you found a solution for 
>> it?
>>
>> On Tuesday, 26 August 2014 23:08:55 UTC+3, David Kleiner wrote:
>>>
>>> Hello,
>>>
>>> In the past couple of days I've been getting a lot of error messages 
>>> about corrupted replica shards.  The primary shards come up fast after ES 
>>> process restart but replicas take a long time to come back. Sometimes it 
>>> takes a few node restarts to 'kick' the nodes to start replica shards.
>>>
>>> ES version is 1.3.1 running on CentOS 6.5 hosted at Softlayer.  It's a 
>>> 3-way cluster with 4 logstash feeders hanging off it. 
>>>
>>> Here are the errors:
>>>
>>> [2014-08-26 15:01:18,682][WARN ][cluster.action.shard     ] [log03 / 
>>> Salvador Dali] [downloader-2014.08][4] received shard failed for 
>>> [downloader-2014.08][4], node[l9-BQTHSSF-ElhgpPBZ24w], [R], 
>>> s[INITIALIZING], indexUUID [2vRrb5YlQP6MTVr1chOezg], reason [engine 
>>> failure, message [corrupted preexisting 
>>> index][CorruptIndexException[[downloader-2014.08][4] Corrupted index 
>>> [corrupted_SkU0-ZHZRxivSnGczABb_g] caused by: CorruptIndexException[codec 
>>> footer mismatch: actual footer=-1676705023 vs expected footer=-1071082520 
>>> (resource: 
>>> NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/downloader-2014.08/4/index/_k9a_es090_0.doc"))]]]]
>>> [2014-08-26 15:01:18,682][WARN ][cluster.action.shard     ] [log03 / 
>>> Salvador Dali] [eventlog-2014.06][0] received shard failed for 
>>> [eventlog-2014.06][0], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], 
>>> indexUUID [jbvChdRrRB6HTutxPvxMmQ], reason [engine failure, message 
>>> [corrupted preexisting index][CorruptIndexException[[eventlog-2014.06][0] 
>>> Corrupted index [corrupted__712QIBQQqafzpBoQwZtcg] caused by: 
>>> CorruptIndexException[codec footer mismatch: actual footer=0 vs expected 
>>> footer=-1071082520 (resource: 
>>> NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.06/0/index/_1k4x.nvd"))]]]]
>>> [2014-08-26 15:01:18,684][WARN ][cluster.action.shard     ] [log03 / 
>>> Salvador Dali] [eventlog-2014.07][0] received shard failed for 
>>> [eventlog-2014.07][0], node[l9-BQTHSSF-ElhgpPBZ24w], [R], s[INITIALIZING], 
>>> indexUUID [T4tTXkPjTaCdSVNTjHfOcg], reason [engine failure, message 
>>> [corrupted preexisting index][CorruptIndexException[[eventlog-2014.07][0] 
>>> Corrupted index [corrupted_OzfNRRGyTIq8a1PRhLYG2w] caused by: 
>>> CorruptIndexException[codec footer mismatch: actual footer=0 vs expected 
>>> footer=-1071082520 (resource: 
>>> NIOFSIndexInput(path="/acc/ES/NBS/nodes/0/indices/eventlog-2014.07/0/index/_rqf.nvd"))]]]]
>>>
>>> ----
>>>
>>> Thanks,
>>>
>>> David
>>>
>>
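
The replica-count workaround from David's quoted reply (set replicas to 1, then to 2) could be sketched as follows. It assumes the standard index settings endpoint, `PUT /<index>/_settings` with `index.number_of_replicas`; the helper names and the `http://localhost:9200` address are hypothetical. The code only builds the requests and does not send them, and in practice you would wait for recovery to finish between the two steps.

```python
import json
import urllib.request


def build_replica_request(base_url, index, replicas):
    """Build (but do not send) a PUT that sets index.number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": int(replicas)}})
    url = "{}/{}/_settings".format(base_url.rstrip("/"), index)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def replica_steps(base_url, index, counts=(1, 2)):
    """Return one request per replica count, in the order to apply them."""
    return [build_replica_request(base_url, index, n) for n in counts]
```

Each request can be sent with `urllib.request.urlopen`. Keeping the steps as a list makes it straightforward to pause and check cluster health before moving on to the next count.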
