Thanks for closing the loop.

On Wed, Oct 22, 2014 at 6:01 PM, Nate Folkert <nfolk...@foursquare.com> wrote:
> After disabling compression, I was able to successfully replicate that
> shard, so it looks like we're hitting that bug.  I guess we'll have to upgrade!
>
> Thanks!
> - Nate
>
> On Wednesday, October 22, 2014 5:26:42 PM UTC-4, Robert Muir wrote:
>>
>> Can you try the workaround mentioned here:
>> http://www.elasticsearch.org/blog/elasticsearch-1-3-2-released/
>>
>> and see if it works? If the compression issue is the problem, you can
>> re-enable compression once you upgrade to at least 1.3.2, which has the
>> fix.
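>>
>> If you want to try the workaround first, disabling recovery compression is
>> a dynamic cluster setting, so something like the following should apply it
>> without a restart. This is only a sketch: it assumes the
>> indices.recovery.compress setting and a node reachable at localhost:9200,
>> so adjust both for your cluster (the blog post above has the exact details).
>>
>> import json
>> import urllib.request
>>
>> # Turn off compression for peer recovery as a transient cluster setting
>> # (assumed setting name: indices.recovery.compress; assumed host/port).
>> body = json.dumps(
>>     {"transient": {"indices.recovery.compress": False}}
>> ).encode("utf-8")
>> req = urllib.request.Request(
>>     "http://localhost:9200/_cluster/settings",
>>     data=body,
>>     headers={"Content-Type": "application/json"},
>>     method="PUT",
>> )
>> with urllib.request.urlopen(req) as resp:
>>     print(resp.read().decode("utf-8"))
>>
>> Once the replica recovers, reverting the setting (or upgrading to 1.3.2+)
>> lets you turn compression back on.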
>>
>>
>> On Wed, Oct 22, 2014 at 4:57 PM, Nate Folkert <nfol...@foursquare.com>
>> wrote:
>> > Created and populated a new index on a 1.3.1 cluster.  Primary shards
>> > work
>> > fine.  Updated the index to create several replicas, and three of the
>> > four
>> > shards replicated, but one shard fails to replicate on any node with the
>> > following error (abbreviated some of the hashes for readability):
>> >
>> >>> [2014-10-22 20:31:54,549][WARN ][index.engine.internal    ] [NODENAME] [INDEXNAME][2] failed engine [corrupted preexisting index]
>> >>> [2014-10-22 20:31:54,549][WARN ][indices.cluster          ] [NODENAME] [INDEXNAME][2] failed to start shard
>> >>> org.apache.lucene.index.CorruptIndexException: [INDEXNAME][2] Corrupted index [CORRUPTED] caused by: CorruptIndexException[codec footer mismatch: actual footer=1161826848 vs expected footer=-1071082520 (resource: MMapIndexInput(path="DATAPATH/INDEXNAME/2/index/_7cp.fdt"))]
>> >>>     at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:343)
>> >>>     at org.elasticsearch.index.store.Store.failIfCorrupted(Store.java:328)
>> >>>     at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyInitializingShard(IndicesClusterStateService.java:723)
>> >>>     at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewOrUpdatedShards(IndicesClusterStateService.java:576)
>> >>>     at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:183)
>> >>>     at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:444)
>> >>>     at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
>> >>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> >>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> >>>     at java.lang.Thread.run(Thread.java:745)
>> >>> [2014-10-22 20:31:54,549][WARN ][cluster.action.shard     ] [NODENAME] [INDEXNAME][2] sending failed shard for [INDEXNAME][2], node[NODEID], [R], s[INITIALIZING], indexUUID [INDEXID], reason [Failed to start shard, message [CorruptIndexException[[INDEXNAME][2] Corrupted index [CORRUPTED] caused by: CorruptIndexException[codec footer mismatch: actual footer=1161826848 vs expected footer=-1071082520 (resource: MMapIndexInput(path="DATAPATH/INDEXNAME/2/index/_7cp.fdt"))]]]]
>> >>> [2014-10-22 20:31:54,550][WARN ][cluster.action.shard     ] [NODENAME] [INDEXNAME][2] sending failed shard for [INDEXNAME][2], node[NODEID], [R], s[INITIALIZING], indexUUID [INDEXID], reason [engine failure, message [corrupted preexisting index][CorruptIndexException[[INDEXNAME][2] Corrupted index [CORRUPTED] caused by: CorruptIndexException[codec footer mismatch: actual footer=1161826848 vs expected footer=-1071082520 (resource: MMapIndexInput(path="DATAPATH/INDEXNAME/2/index/_7cp.fdt"))]]]]
>> >
>> >
>> > The index is now stuck in a state where the shards try to replicate on
>> > one set of nodes, hit this failure, and then try to replicate on a
>> > different set of nodes.  I've been looking around to see if anyone's
>> > encountered a similar issue but haven't found anything useful yet.
>> > Does anybody know if this is recoverable, or should I just scrap it and
>> > build a new one?
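>> >
>> > For anyone following along, a quick way to see which copies of the shard
>> > are stuck INITIALIZING or UNASSIGNED is something like the sketch below;
>> > INDEXNAME and localhost:9200 are placeholders for the real index name and
>> > cluster address.
>> >
>> > import urllib.request
>> >
>> > # List shard allocation for the affected index via the _cat API
>> > # (assumed host/port and index name).
>> > url = "http://localhost:9200/_cat/shards/INDEXNAME?v"
>> > with urllib.request.urlopen(url) as resp:
>> >     print(resp.read().decode("utf-8"))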
>> >
>> > - Nate
>> >
