Re: How do people typically handle shard failures in their results?

Nikolas Everett Fri, 20 Jun 2014 04:23:16 -0700

On Fri, Jun 20, 2014 at 7:08 AM, Shay Banon <kim...@gmail.com> wrote:


> If it fails on the primary shard, then a failure is returned. If it
> worked, and a replica failed, then that replica is deemed a failed replica,
> and will get allocated somewhere else in the cluster. Maybe an example of
> where a failure on “all” shards would help here?
>

I think it's more about searches, which can fail on one shard but not
others for all sorts of reasons.  Queue full, unfortunate script, bug, only
one shard had results and the query asked for something weird like using
the postings highlighter when postings aren't stored.  Lots of reasons.

I log the event and move on.  I toyed with outputting a warning to the user
but didn't have time to implement it.  We're pretty diligent with our logs
so we'd notice the log and run it down.
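
A minimal sketch of the log-and-move-on approach, assuming the search
response has already been parsed into a Python dict and carries the usual
`_shards` section (`total`, `successful`, `failed`, and an optional
`failures` list).  The function name `log_shard_failures` and the logger
name are hypothetical, chosen for illustration only; this is not the code
described above.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("search.shards")


def log_shard_failures(response, log=logger):
    """Log partial shard failures in a parsed search response.

    Returns the number of failed shards so the caller can decide whether
    to also warn the user. Results from the healthy shards are untouched.
    """
    shards = response.get("_shards", {})
    failed = shards.get("failed", 0)
    if failed:
        log.warning("search failed on %d of %d shards",
                    failed, shards.get("total", 0))
        # Each entry is assumed to describe one failed shard.
        for failure in shards.get("failures", []):
            log.warning("shard failure: index=%s shard=%s reason=%s",
                        failure.get("index"),
                        failure.get("shard"),
                        failure.get("reason"))
    return failed
```

Returning the count rather than raising keeps the partial results usable,
and leaves room to surface a warning to the user later.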

If the failure is caused by the queue being full on only one node, we'd
likely notice that real quick as Ganglia would lose it.  This happened to
me recently when we put a node without an SSD into a cluster with SSDs.  It
couldn't keep up and dropped a ton of searches.  In our defense, we didn't
know the rest of the cluster had SSDs, so we were doubly surprised.

Nik

