On 9/2/2015 9:19 AM, scott chu wrote:
> Do you mean I only have to put 10M documents in one index and copy it
> to many slaves in a classic Solr master-slave architecture to provide
> a query service on the internet, and there won't be an obvious drop in
> query performance? But I did add 1M documents into one index on the
> master and provided 2 slaves to serve queries on the internet, and the
> query performance is kinda sad. Why do you say "at 10M documents
> there's rarely a need to shard at all"? Do I provide too few slaves?
> At what number of documents does sharding in SolrCloud become
> necessary?
Lucene has exactly one hard and unbreakable limit, and it is the number of documents you can have in a single index (core/shard for Solr). That limit is just over 2.1 billion documents. The actual limiting factor is the maximum value of an integer in Java. Because deleted documents are counted when this limit is considered, you shouldn't go over 1 billion active documents per shard, but the *practical* recommendation for shard size is much lower than that.

For various reasons, some of which are very technical and boring, the general advice is to not exceed about 100 million documents per shard. Some setups can handle more docs per shard, some require a lot less. There are no quick answers or hard rules. You may have been given this URL before:

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

There are sometimes reasons to shard a very small index. This is the correct path when the index is not very busy and you want to take advantage of a surplus of CPU power to make each query faster. In this situation, you will probably end up with multiple shards per server, so when a query comes in, multiple CPUs on the machine can handle the shard queries in parallel. If the index is handling a lot of requests per second, then you want that CPU power used for handling the load, not speeding up a single query. For high-load situations, one shard per physical server is desirable.

I've appended two quick sketches after my signature: some back-of-the-envelope math on those limits, and a bare-bones example of creating a multi-shard collection.

Thanks,
Shawn
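
First, a rough sketch of the arithmetic behind the numbers above, in Java since that's what Lucene and Solr are written in. The 2.1 billion figure is Java's Integer.MAX_VALUE; the 100 million per shard figure is only the rule-of-thumb ceiling, not a guarantee for any particular setup, and the 10M corpus size is just the one from your question:

    public class ShardMath {
        // Hard Lucene limit: docs per core/shard cannot exceed Integer.MAX_VALUE.
        static final long HARD_LIMIT = Integer.MAX_VALUE;        // 2,147,483,647
        // Rule-of-thumb practical ceiling per shard; adjust for your hardware and queries.
        static final long PRACTICAL_LIMIT = 100_000_000L;

        // Ceiling division: how many shards are needed for a given corpus size.
        static long shardsNeeded(long totalDocs, long docsPerShard) {
            return (totalDocs + docsPerShard - 1) / docsPerShard;
        }

        public static void main(String[] args) {
            long corpus = 10_000_000L;   // the 10M-document index from the question
            System.out.println("Hard limit per shard:       " + HARD_LIMIT);
            System.out.println("Shards at 100M/shard rule:  " + shardsNeeded(corpus, PRACTICAL_LIMIT));
            // Prints 1 shard, which is why 10M documents rarely needs sharding at all.
        }
    }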
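
Second, a bare-bones sketch of creating a SolrCloud collection with more than one shard on a node, so queries can fan out across that node's CPUs, using the Collections API over plain HTTP. The host, port, collection name, and configset name ("mycollection", "myconf") are placeholders I made up for this example, not anything from your setup:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class CreateShardedCollection {
        public static void main(String[] args) throws Exception {
            // Collections API CREATE: 2 shards, 1 replica each, and allow both
            // shards to live on a single node.
            String url = "http://localhost:8983/solr/admin/collections"
                    + "?action=CREATE&name=mycollection"
                    + "&numShards=2&replicationFactor=1&maxShardsPerNode=2"
                    + "&collection.configName=myconf";

            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("GET");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);   // Solr's status response
                }
            }
        }
    }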