We've restarted that node and it *seemed* to be working its way back to 
normality...

But the LockReleaseFailedException is here to stay:

[2013-12-17 04:43:53,962][WARN ][cluster.action.shard     ] [Porcupine] 
[zapier_legacy][0] sending failed shard for [zapier_legacy][0], 
node[QToCnTWtQLCWySMnbjm2IQ], [P], s[INITIALIZING], indexUUID 
[pzWL-WO_SsaGbuWfn2IQaw], reason [Failed to start shard, message 
[IndexShardGatewayRecoveryException[[zapier_legacy][0] failed recovery]; 
nested: EngineCreationFailureException[[zapier_legacy][0] failed to create 
engine]; nested: LockReleaseFailedException[Cannot forcefully unlock a 
NativeFSLock which is held by another indexer component: 
/var/data/elasticsearch/Rage Against the 
Machine/nodes/0/indices/zapier_legacy/0/index/write.lock]; ]]
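
In case it's useful for anyone digging into this: below is a rough Python 
sketch (untested) for checking whether any live process actually still holds 
that write.lock. It assumes the lock is the fcntl-style file lock that 
Lucene's NativeFSLockFactory takes on Linux, and the path is just the one 
reported in the exception above; run it as a user with read/write access to 
the file. If the probe succeeds the lock looks stale; if it fails, something 
(most likely the ES JVM itself) really does hold it and the file shouldn't 
be touched.

import fcntl
import sys

# Path as reported in the LockReleaseFailedException above.
LOCK_PATH = ("/var/data/elasticsearch/Rage Against the Machine"
             "/nodes/0/indices/zapier_legacy/0/index/write.lock")

with open(LOCK_PATH, "r+") as f:
    try:
        # Non-blocking exclusive lock: fails immediately if another
        # process (e.g. the running Elasticsearch JVM) holds the lock.
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit("write.lock is held by a live process -- leave it alone")
    fcntl.lockf(f, fcntl.LOCK_UN)
    print("no process holds write.lock; it looks stale")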

Any thoughts?

On Monday, December 16, 2013 8:25:52 PM UTC-8, Bryan Helmig wrote:
>
> Here are some logs from the start of the incident:
>
> https://gist.github.com/bryanhelmig/3c17edfe5c4e9065e5a3
>
> And basically these logs over and over:
>
> https://gist.github.com/bryanhelmig/cfb9303bc033a1183701
>
> A little background:
>
> The cluster is 3 nodes on AWS & EBS with 100 shards (50 primaries & 50 
> replicas), and just this single shard (so far) got corrupted (?). We're at 
> about 800 GB of data and we're using routing keys to keep it all (mostly) 
> sane among shards. Here is the topology of the cluster from ES Head:
>
> http://i.imgur.com/zJa9Beh.png
>
> I think it happened as it tried to relocate a shard. Now it refuses to 
> start the engine?
>
> Thanks!
> -bryan
>
