Re: Primary vs. replica shard inconsistencies?

Paul Smith Thu, 30 Jan 2014 16:01:26 -0800

Try to narrow this down to a specific few IDs of results that appear/disappear
depending on whether the primary or the replica shard is searched, and confirm
the behaviour through an explicit GET of each ID with preference=_local on the
host holding the primary shard and on the host holding the replica.  To work
out which shard # a specific ID belongs to, you can run this query:

curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
  "fields" : [],
  "query" : {
    "ids" : {
      "values" : [
        "123456789"
      ]
    }
  },
  "explain" : 1
}
'

In the "values" attribute, place the ID of the item you're after.  Within
the response you'll see the shard ID; use that to identify which host holds
the primary and which holds the replica.  You can then run the GET query with
preference=_local on each of those hosts and see whether the primary or the
replica returns the result.  You will need to work out whether the item that
is 'flappy' (appearing/disappearing depending on the shard being searched) is
supposed to be in the index or not, perhaps by checking the data store that is
the source of the index (is it a database?).
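
For what it's worth, the per-host GET check could be scripted roughly like
this.  It is only a sketch: the host names, index name and type name in the
usage line are made-up placeholders, and the helper function names are mine,
not part of any tool.  Also bear in mind that preference=_local only prefers
a copy held on the node you send the request to, so each request has to go
to a node that really holds a copy of that shard.

```shell
#!/bin/sh
# Sketch only: hosts, index and type are hypothetical placeholders.

# Build the URL for an explicit GET of one document, asking the receiving
# node to prefer a shard copy that it holds locally.
local_get_url() {
  # $1 = host, $2 = index, $3 = type, $4 = document ID
  printf 'http://%s:9200/%s/%s/%s?preference=_local&pretty=1\n' \
    "$1" "$2" "$3" "$4"
}

# Fetch the same ID from each host in turn and print every response, so
# the primary's and the replica's answers can be compared side by side.
compare_hosts() {
  # $1 = index, $2 = type, $3 = document ID, remaining args = hosts
  index=$1; doc_type=$2; id=$3
  shift 3
  for host in "$@"; do
    echo "== $host =="
    curl -s -XGET "$(local_get_url "$host" "$index" "$doc_type" "$id")"
    echo
  done
}
```

Usage would be something like
`compare_hosts myindex mytype 123456789 es-node-1 es-node-2`, where the two
hosts are the ones you identified as holding the primary and the replica.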

We see a very infrequent case where the replica shard does not properly
receive a delete, at least with 0.19.10.  The delete applies successfully to
the primary, but the replica still holds the document and returns it in
search results.  We have loads of insert/update/delete activity and the
number of flappy items is very small, but it is definitely a thing.

See this previous thread:
http://elasticsearch-users.115913.n3.nabble.com/Deleted-items-appears-when-searching-a-replica-shard-td4029075.html

If it is the replica shard that's incorrect (my bet), the way to fix it is
to relocate the replica shard to another host.  The relocation will take
the copy of the primary (correct copy) and recreate a new replica shard,
effectively neutralizing the inconsistency.
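
As a rough illustration of the relocation, here is a sketch that assumes your
Elasticsearch version supports the cluster reroute API with the "move"
command (I have not checked which releases include it, so it may not apply to
0.19.10).  The index name, shard number and node names in the usage line are
hypothetical, and the helper function names are my own.

```shell
#!/bin/sh
# Sketch only: assumes the cluster reroute API and its "move" command are
# available in your version. All names passed in are placeholders.

# Build the JSON body for a single reroute "move" command.
reroute_move_body() {
  # $1 = index, $2 = shard number, $3 = node currently holding the
  # replica, $4 = node the replica should be moved to
  printf '{"commands":[{"move":{"index":"%s","shard":%s,"from_node":"%s","to_node":"%s"}}]}\n' \
    "$1" "$2" "$3" "$4"
}

# Send the move command to the cluster via the given host.
move_shard() {
  # $1 = host to send the request to, remaining args as above
  host=$1
  shift
  curl -s -XPOST "http://$host:9200/_cluster/reroute?pretty=1" \
    -H 'Content-Type: application/json' \
    -d "$(reroute_move_body "$@")"
}
```

For example, `move_shard 127.0.0.1 myindex 3 node-b node-c` would ask the
cluster to move the copy of shard 3 that lives on node-b over to node-c; make
sure node-b is the node holding the replica, not the primary.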

We have written a tool, Scrutineer (https://github.com/aconex/scrutineer),
which can help detect inconsistencies in your cluster.  I also have a tool,
not yet published to GitHub, that can help check these primary/replica
inconsistencies if that would help (you pass it a list of IDs and it'll
check whether they're flappy between the primary & replica or not).  It can
also help automate the rebuilding of just the replica shards by shunting
them around (rather than a full rolling restart of ALL the shards, just the
shard replicas you want).

cheers,

Paul Smith

On 31 January 2014 09:44, Binh Ly <b...@hibalo.com> wrote:

> Xavier, can you post an example of 1 full query and then also show how the
> results of this one query is inconsistent? Just trying to understand what
> is inconsistent. Thanks.
