Erick,

0> The load balancer is out of the picture.

1> When I query with *distrib=false*, I get consistent results, as expected, for the shards that don't have the key, i.e. those shards return no results. However, I just realized that when *distrib=false* is present in the query for the shard that is supposed to contain the key, only the replica of that shard returns the result and the leader does not. It looks like the leader and the replica do not have the same data, and only the replica contains the key for that shard.
2> By "indexing" I mean that this collection is being populated by a web crawler.

So it looks like 1> above points to the leader and the replica being out of sync for at least one shard.

On Thu, Oct 2, 2014 at 11:57 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> bq: Also, the collection is being actively indexed as I query this, could that
> be an issue too?
>
> Not if the documents you're searching aren't being added as you search
> (and all your autocommit intervals have expired).
>
> I would turn off indexing for testing, it's just one more variable
> that can get in the way of understanding this.
>
> Do note that if the problem were endemic to Solr, there would probably
> be a _lot_ more noise out there.
>
> So to recap:
> 0> we can take the load balancer out of the picture altogether.
>
> 1> when you query each shard individually with &distrib=false, every
> replica in a particular shard returns the same count.
>
> 2> when you query without &distrib=false you get varying counts.
>
> This is very strange and not at all expected. Let's try it again
> without indexing going on....
>
> And what do you mean by "indexing" anyway? How are documents being fed
> to your system?
>
> Best,
> Erick@PuzzledAsWell
>
> On Thu, Oct 2, 2014 at 7:32 PM, S.L <simpleliving...@gmail.com> wrote:
> > Erick,
> >
> > I would like to add that the interesting behavior, i.e. point #2 that I
> > mentioned in my earlier reply, happens in all the shards. If this were a
> > distributed search issue, it should not have manifested itself in the
> > shard that contains the key that I am searching for. It looks like the
> > search is just failing as a whole intermittently.
> >
> > Also, the collection is being actively indexed as I query this, could that
> > be an issue too?
> >
> > Thanks.
> >
> > On Thu, Oct 2, 2014 at 10:24 PM, S.L <simpleliving...@gmail.com> wrote:
> >
> >> Erick,
> >>
> >> Thanks for your reply, I tried your suggestions.
> >>
> >> 1. When not using the load balancer, if *I have distrib=false* I get
> >> consistent results across the replicas.
> >>
> >> 2. However, here's the interesting part: while not using the load balancer,
> >> if I *don't have distrib=false*, then when I query a particular node, I get
> >> the same behaviour as if I were using a load balancer, meaning the
> >> distributed search from a node works intermittently. Does this give any
> >> clue?
> >>
> >> On Thu, Oct 2, 2014 at 7:47 PM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >>
> >>> Hmmm, nothing quite makes sense here....
> >>>
> >>> Here are some experiments:
> >>> 1> avoid the load balancer and issue queries like
> >>> http://solr_server:8983/solr/collection/select?q=whatever&distrib=false
> >>>
> >>> the &distrib=false bit will keep SolrCloud from trying to send
> >>> the queries anywhere, they'll be served only from the node you address
> >>> them to.
> >>> that'll help check whether the nodes are consistent. You should be
> >>> getting back the same results from each replica in a shard (i.e. 2 of
> >>> your 6 machines).
> >>>
> >>> Next, try your failing query the same way.
> >>>
> >>> Next, try your failing query from a browser, pointing it at successive
> >>> nodes.
> >>>
> >>> Where is the first place problems show up?
> >>>
> >>> My _guess_ is that your load balancer isn't quite doing what you think, or
> >>> your cluster isn't set up the way you think it is, but those are guesses.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Thu, Oct 2, 2014 at 2:51 PM, S.L <simpleliving...@gmail.com> wrote:
> >>> > Hi All,
> >>> >
> >>> > I am trying to query a 6 node Solr 4.7 cluster with 3 shards and a
> >>> > replication factor of 2.
> >>> >
> >>> > I have fronted these 6 Solr nodes using a load balancer. What I notice is
> >>> > that every time I do a search of the form
> >>> > q=*:*&fq=(id:9e78c064-919f-4ef3-b236-dc66351b4acf) it gives me a result
> >>> > only once in every 3 tries, telling me that the load balancer is
> >>> > distributing the requests between the 3 shards and SolrCloud only returns a
> >>> > result if the request goes to the core that has that id.
> >>> >
> >>> > However, if I do a simple search like q=*:*, I consistently get the right
> >>> > aggregated results back of all the documents across all the shards for
> >>> > every request from the load balancer. Can someone please let me know what
> >>> > this is symptomatic of?
> >>> >
> >>> > Somehow Solr Cloud seems to be doing search query distribution and
> >>> > aggregation for queries of type *:* only.
> >>> >
> >>> > Thanks.
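
The per-replica check discussed in this thread (query every core directly with distrib=false and compare hit counts within each shard) could be scripted roughly as follows. This is only a sketch under assumptions: the host names and core URLs are hypothetical placeholders, the helper names are made up for illustration, and it assumes the standard /select handler returning JSON with a response.numFound field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def core_query_url(core_url, doc_id):
    """Build a non-distributed lookup of one document id against one core."""
    params = urlencode({
        "q": "*:*",
        "fq": f"id:{doc_id}",
        "distrib": "false",  # answer from this core only, no fan-out
        "rows": 0,           # only the hit count is needed
        "wt": "json",
    })
    return f"{core_url.rstrip('/')}/select?{params}"


def num_found(response_text):
    """Extract response.numFound from a Solr JSON response body."""
    return json.loads(response_text)["response"]["numFound"]


def inconsistent_shards(counts_by_shard):
    """Return the shards whose cores disagree on the hit count.

    counts_by_shard maps shard name -> {core URL: numFound}.
    """
    return {
        shard: counts
        for shard, counts in counts_by_shard.items()
        if len(set(counts.values())) > 1
    }


def check(shards, doc_id, fetch=None):
    """Query every core of every shard and report the disagreeing shards.

    shards maps shard name -> list of core URLs (hypothetical layout).
    fetch is any callable taking a URL and returning the response body.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url, timeout=10).read().decode("utf-8")
    counts = {
        shard: {core: num_found(fetch(core_query_url(core, doc_id)))
                for core in cores}
        for shard, cores in shards.items()
    }
    return inconsistent_shards(counts)


if __name__ == "__main__":
    # Hypothetical core URLs; substitute the real leader and replica cores.
    shards = {
        "shard1": ["http://host1:8983/solr/collection_shard1_replica1",
                   "http://host2:8983/solr/collection_shard1_replica2"],
    }
    print(check(shards, "9e78c064-919f-4ef3-b236-dc66351b4acf"))
```

An empty result means every core in each shard reported the same count for that id. A shard that shows up in the result has a leader and replica that disagree, which matches the symptom described at the top of this message. As suggested earlier in the thread, running it with indexing paused removes one more variable.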