If it helps at all, I've pushed the flappy item detector tool (*cough*) here:
https://github.com/Aconex/es-flappyitem-detector

We have a simple 3-node cluster, 5 shards, 1 replica, so I'm sure there's code in there that is built around those assumptions, but it should be easy enough to modify to suit your purposes.

cheers,
Paul

On 31 January 2014 11:59, Paul Smith <tallpsm...@gmail.com> wrote:

The flappy detection tool I have connects to the cluster using the standard Java autodiscovery mechanism, works out which shards are involved, and then creates an explicit TransportClient connection to each host, so it would need access to 9300 (the SMILE-based protocol port). Would that help? (Is 9300 accessible from a host that can run Java?)

On 31 January 2014 11:45, <xav...@gaikai.com> wrote:

We have 4 query heads total (esq1.r6, esq2.r6, esq3.r7, esq4.r7). Interestingly, query heads in the same rack give the same results. We don't do deletes at all on these indices, so that shouldn't be an issue. Unfortunately, at the moment I can't do preference=_local while getting the _id(s) directly, because we don't allow access on 9200 on our worker nodes. I might be able to write some code to figure this out, though. Either way, here are my _id results from the different heads.

esq2.r6 gets 28 total results
esq3.r7 gets 9 total results

$ curl -XGET "http://esq2.r6:9200/events/_search?q=sessionId:1390953880&size=100" | jq '.hits.hits[]._id' | sort
"0LcI_px4SZy5ZQkI_V7Qyw"
"1sAGREtMSfK8OIxZErm8RQ"
"6IV2v4TFTr-Gl1eC6hrj0Q"
"6nwMexTHQBmFxfykOgKqWA"
"7hFYs6y-QG6wGYEkoBKmdg"
"9MTM10SeQ2yqWIb08oPnFA"
"aELtGN6DQpmdRlQbr8i0uA"
"AUHUg6k0QZOf_oGjsjSsGA"
"Bo_u1eYGSF2LeU78kbcFZg"
"EWs1K8YsR9-IBSAWK6ld7A"
"Fx4l6_axSGCxpyFm7C7BSQ"
"gpCrAZrNTNezWPfensER3g"
"HAFmGcWuQAylxGjmnZZkSQ"
"HB4Kwz3RSWWH5NHvyH4JMg"
"H-eP-33FREOtq7v0uBPWbQ"
"_IH6W4DoTRmdms0FJNlg4g"
"iK_3TbzcSj2-MbMXip_XFg"
"J4bjPFIcQ1ewrQqjN2qz6Q"
"kfonMDBuR--UIhkyM2cWrg"
"Kr6-9-3uR9Wp2923n-O2NA"
"Nw_9rjwvQ62u-HsuWIm53A"
"QRmY8R2MQemuePb0EkYxWA"
"usloSJzQRzCpOQ8bxKi2vA"
"w9NGEWg-QiivMpjyurYKrA"
"wKy-YzB-TK2lnK86Sx2RBA"
"y2ZmJ-_GRAmi3eHy1y8jzw"
"ZmFj7w4hR5Cvy-owCLmZ1Q"
"ZmlndPBLT-ivuOxm_A7yDA"

$ curl -XGET "http://esq3.r7:9200/events/_search?q=sessionId:1390953880&size=100" | jq '.hits.hits[]._id' | sort
"1sAGREtMSfK8OIxZErm8RQ"
"7hFYs6y-QG6wGYEkoBKmdg"
"aELtGN6DQpmdRlQbr8i0uA"
"Fx4l6_axSGCxpyFm7C7BSQ"
"HAFmGcWuQAylxGjmnZZkSQ"
"H-eP-33FREOtq7v0uBPWbQ"
"QRmY8R2MQemuePb0EkYxWA"
"wKy-YzB-TK2lnK86Sx2RBA"
"y2ZmJ-_GRAmi3eHy1y8jzw"

And here is esq3.r7 with preference=_primary_first:

$ curl -XGET "http://esq3.r7/events/_search?q=sessionId:1390953880&size=100&preference=_primary_first" | jq '.hits.hits[]._id' | sort
"0LcI_px4SZy5ZQkI_V7Qyw"
"1sAGREtMSfK8OIxZErm8RQ"
"6IV2v4TFTr-Gl1eC6hrj0Q"
"6nwMexTHQBmFxfykOgKqWA"
"7hFYs6y-QG6wGYEkoBKmdg"
"9MTM10SeQ2yqWIb08oPnFA"
"aELtGN6DQpmdRlQbr8i0uA"
"AUHUg6k0QZOf_oGjsjSsGA"
"Bo_u1eYGSF2LeU78kbcFZg"
"EWs1K8YsR9-IBSAWK6ld7A"
"Fx4l6_axSGCxpyFm7C7BSQ"
"gpCrAZrNTNezWPfensER3g"
"HAFmGcWuQAylxGjmnZZkSQ"
"HB4Kwz3RSWWH5NHvyH4JMg"
"H-eP-33FREOtq7v0uBPWbQ"
"_IH6W4DoTRmdms0FJNlg4g"
"iK_3TbzcSj2-MbMXip_XFg"
"J4bjPFIcQ1ewrQqjN2qz6Q"
"kfonMDBuR--UIhkyM2cWrg"
"Kr6-9-3uR9Wp2923n-O2NA"
"Nw_9rjwvQ62u-HsuWIm53A"
"QRmY8R2MQemuePb0EkYxWA"
"usloSJzQRzCpOQ8bxKi2vA"
"w9NGEWg-QiivMpjyurYKrA"
"wKy-YzB-TK2lnK86Sx2RBA"
"y2ZmJ-_GRAmi3eHy1y8jzw"
"ZmFj7w4hR5Cvy-owCLmZ1Q"
"ZmlndPBLT-ivuOxm_A7yDA"
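A rough sketch of the comparison code mentioned above, assuming the query heads are reachable on 9200 from wherever this runs and that jq is installed (hostnames and query as in the outputs above):

for host in esq2.r6 esq3.r7; do
  # pull the sorted _id list from each query head
  curl -s "http://$host:9200/events/_search?q=sessionId:1390953880&size=100" \
    | jq -r '.hits.hits[]._id' | sort > "/tmp/ids.$host"
done
# lines unique to either file are the flappy candidates
comm -3 /tmp/ids.esq2.r6 /tmp/ids.esq3.r7

comm prints the IDs seen by only one head in separate columns, so you can tell at a glance which head is dropping results.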
"AUHUg6k0QZOf_oGjsjSsGA" >> "Bo_u1eYGSF2LeU78kbcFZg" >> "EWs1K8YsR9-IBSAWK6ld7A" >> "Fx4l6_axSGCxpyFm7C7BSQ" >> "gpCrAZrNTNezWPfensER3g" >> "HAFmGcWuQAylxGjmnZZkSQ" >> "HB4Kwz3RSWWH5NHvyH4JMg" >> "H-eP-33FREOtq7v0uBPWbQ" >> "_IH6W4DoTRmdms0FJNlg4g" >> "iK_3TbzcSj2-MbMXip_XFg" >> "J4bjPFIcQ1ewrQqjN2qz6Q" >> "kfonMDBuR--UIhkyM2cWrg" >> "Kr6-9-3uR9Wp2923n-O2NA" >> "Nw_9rjwvQ62u-HsuWIm53A" >> "QRmY8R2MQemuePb0EkYxWA" >> "usloSJzQRzCpOQ8bxKi2vA" >> "w9NGEWg-QiivMpjyurYKrA" >> "wKy-YzB-TK2lnK86Sx2RBA" >> "y2ZmJ-_GRAmi3eHy1y8jzw" >> "ZmFj7w4hR5Cvy-owCLmZ1Q" >> "ZmlndPBLT-ivuOxm_A7yDA" >> >> >> On Thursday, January 30, 2014 4:00:49 PM UTC-8, tallpsmith wrote: >> >>> If you can narrow down a specific few IDs of results that >>> appear/disappear based on the primary/replica shard, and confirm through an >>> explicit GET of that ID with the preference=_local on the primary shard & >>> replica for that result. To work out which shard # a specific ID belongs >>> to, you can run this query: >>> >>> curl -XGET '*http://127.0.0.1:9200/_all/_search?pretty=1 >>> <http://127.0.0.1:9200/_all/_search?pretty=1>*' -d ' >>> { >>> "fields" : [], >>> "query" : { >>> "ids" : { >>> "values" : [ >>> "123456789" >>> ] >>> } >>> }, >>> "explain" : 1 >>> } >>> ' >>> >>> where the "values" attribute you place the ID of the item you're after. >>> Within the result response you'l see the shard Id, use that to identify >>> which host is the primary and which is the replica. You can then run the >>> GET query with the preference=_local on each of those hosts and see if the >>> primary or replica shows the result. You will need to understand whether >>> the item that is 'flappy' (appearing/disappearing depending on the shard >>> being searched) is supposed to be in there or not, perhaps checking the >>> data store that is the source of the index (is it a dB?). >>> >>> We have very infrequent case where the replica shard is not properly >>> receiving a delete at least with 0.19.10. The delete successfully applies >>> to the Primary, but the Replica still holds the value and returns it within >>> search results. We have loads of insert/update/delete activity and the >>> number of flappy items is very small, but it is definitely a thing. >>> >>> see this previous thread: http://elasticsearch-users. >>> 115913.n3.nabble.com/Deleted-items-appears-when-searching- >>> a-replica-shard-td4029075.html >>> >>> If it is the replica shard that's incorrect (my bet), the way to fix it >>> is to relocate the replica shard to another host. The relocation will take >>> the copy of the primary (correct copy) and recreate a new replica shard, >>> effectively neutralizing the inconsistency. >>> >>> We have written a tool, Scrutineer (https://github.com/aconex/scrutineer) >>> which can help detect inconsistencies in your cluster. I also have a tool >>> not yet published to github that can help check these Primary/Replica >>> inconsistencies if that would help (you pass a list of IDs to it and it'll >>> check whether they're flappy between the primary & replica or not). It can >>> also help automate the rebuilding of just the replica shards by shunting >>> them around (rather than a full rolling restart of ALL the shards, just the >>> shard replicas you want) >>> >>> cheers, >>> >>> Paul Smith >>> >>> >>> >>> >>> >>> On 31 January 2014 09:44, Binh Ly <bi...@hibalo.com> wrote: >>> >>>> Xavier, can you post an example of 1 full query and then also show how >>>> the results of this one query is inconsistent? 
We have written a tool, Scrutineer (https://github.com/aconex/scrutineer), which can help detect inconsistencies in your cluster. I also have a tool, not yet published to GitHub, that can help check these primary/replica inconsistencies if that would help (you pass it a list of IDs and it'll check whether they're flappy between the primary & replica or not). It can also help automate the rebuilding of just the replica shards by shunting them around (rather than a full rolling restart of ALL the shards, just the shard replicas you want).

cheers,

Paul Smith

On 31 January 2014 09:44, Binh Ly <bi...@hibalo.com> wrote:

Xavier, can you post an example of one full query and then also show how the results of that one query are inconsistent? Just trying to understand what is inconsistent. Thanks.