How SolrCloud Balance Number of Documents at each Shard?
Is it possible that different shards have different numbers of documents, or does SolrCloud balance them? I ask because I want to understand how Solr calculates the hash value of a document's identifier. Is it possible for the hash function to send more documents to one shard than to the others? (This could cause a bottleneck at some of the SolrCloud leaders.)
Re: How SolrCloud Balance Number of Documents at each Shard?
They won't be exact, but they should be close. Are you seeing some *big* differences?

Otis
--
Solr ElasticSearch Support
http://sematext.com/

On Tue, Apr 16, 2013 at 6:11 PM, Furkan KAMACI furkankam...@gmail.com wrote:
> Is it possible that different shards have different numbers of documents, or does SolrCloud balance them?
Re: How SolrCloud Balance Number of Documents at each Shard?
Hi Otis,

First of all, thanks for your answers. So do you mean that the hashing mechanism routes each document to a random shard? I ask because I am considering putting a load balancer in front of my SolrCloud and manually routing some documents to other shards to avoid a bottleneck.

2013/4/17 Otis Gospodnetic otis.gospodne...@gmail.com
> They won't be exact, but should be close. Are you seeing some *big* differences?
Re: How SolrCloud Balance Number of Documents at each Shard?
Hi,

Routing is not random. Have a look at https://issues.apache.org/jira/browse/SOLR-2341. In short, you shouldn't have to route manually from your app.

Otis

On Tue, Apr 16, 2013 at 6:26 PM, Furkan KAMACI furkankam...@gmail.com wrote:
> So do you mean that the hashing mechanism routes each document to a random shard?
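To illustrate why hash-based routing is deterministic yet gives shard counts that are close but not exact, here is a minimal Python sketch. It is not Solr's code. As far as I understand it, SolrCloud hashes the document's unique key with MurmurHash3 and assigns each shard a range of the 32-bit hash space; the sketch below swaps in the first 4 bytes of SHA-256 as a stand-in hash, and the names `hash32`, `shard_for` and `shard_counts` are hypothetical.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 32  # size of a 32-bit hash space


def hash32(doc_id: str) -> int:
    """Stand-in 32-bit hash (first 4 bytes of SHA-256); NOT Solr's actual hash."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map the hash onto one of num_shards equal, contiguous hash ranges."""
    return hash32(doc_id) * num_shards // HASH_SPACE


def shard_counts(doc_ids, num_shards: int) -> Counter:
    """Count how many documents land on each shard."""
    return Counter(shard_for(doc_id, num_shards) for doc_id in doc_ids)


if __name__ == "__main__":
    # Hypothetical ids; any set of distinct ids behaves similarly.
    ids = [f"doc-{i}" for i in range(100_000)]
    for shard, count in sorted(shard_counts(ids, 4).items()):
        print(f"shard{shard}: {count}")
```

The same id always maps to the same shard, so nothing here is random. The counts differ slightly only because a well-mixed hash spreads ids roughly, not perfectly, evenly across the ranges.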