On 9/2/2015 9:19 AM, scott chu wrote:
> Do you mean I only have to put 10M documents in one index and copy
> it to many slaves in a classic Solr master-slave architecture to
> provide a querying service on the internet, and it won't noticeably
> degrade query performance? But I did add 1M documents to one index
> on the master with 2 slaves serving queries on the internet, and the
> query performance is kind of sad. Why do you say: "at 10M documents
> there's rarely a need to shard at all?" Do I provide too few slaves?
> What number of documents would call for sharding in SolrCloud?

Lucene has exactly one hard and unbreakable limit, and it is the number
of documents you can have in a single index (core/shard for Solr).  That
limit is just over 2.1 billion documents.  The actual limiting factor is
the maximum value of a signed 32-bit integer in Java (2,147,483,647).
Because deleted documents are
counted when this limit is considered, you shouldn't go over 1 billion
active documents per shard, but the *practical* recommendation for shard
size is much lower than that.
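
To make the arithmetic concrete, here is a minimal sketch of where the
ceiling comes from and how deleted documents eat into it.  The class and
method names are made up for illustration, and it uses the raw Java int
maximum as an approximation; the exact value Lucene enforces may be
slightly lower.

```java
class ShardLimits {
    // Largest value a Java int can hold. Used here as an approximation
    // of the per-index document ceiling (assumption: the exact Lucene
    // value may differ slightly).
    public static final int JAVA_INT_MAX = Integer.MAX_VALUE;

    // Hypothetical helper: how many more documents fit under the int
    // ceiling, given that deleted documents still count until merged away.
    public static long headroom(long liveDocs, long deletedDocs) {
        return (long) JAVA_INT_MAX - (liveDocs + deletedDocs);
    }

    public static void main(String[] args) {
        System.out.println("int ceiling: " + JAVA_INT_MAX);
        // 1 billion live docs plus 1 billion unmerged deletes barely fits.
        System.out.println("headroom: "
                + headroom(1_000_000_000L, 1_000_000_000L));
    }
}
```

With 1 billion live documents, a full reindex that has not yet been
merged could briefly double the count, which is why staying at or below
1 billion active documents leaves a safe margin.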

For various reasons, some of which are very technical and boring, the
general advice is to not exceed about 100 million documents per shard. 
Some setups can handle more docs per shard, some require a lot less. 
There are no quick answers or hard rules.  You may have been given this
URL before:

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
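
If you want a starting point for a back-of-the-envelope estimate, the
rule of thumb above can be sketched as a simple ceiling division.  The
names are hypothetical and the 100 million figure is only the general
guideline, not a limit; real sizing still needs testing with your own
data and queries.

```java
class ShardEstimate {
    // General guideline from the advice above; an assumption to tune,
    // not a hard rule.
    public static final long DOCS_PER_SHARD_GUIDELINE = 100_000_000L;

    // Smallest shard count that keeps every shard at or under the target.
    public static long shardsNeeded(long totalDocs, long docsPerShard) {
        if (totalDocs < 0 || docsPerShard <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        if (totalDocs == 0) {
            return 1; // an empty index still lives in one shard
        }
        return (totalDocs + docsPerShard - 1) / docsPerShard;
    }

    public static void main(String[] args) {
        // 10M documents fits comfortably in a single shard.
        System.out.println(shardsNeeded(10_000_000L, DOCS_PER_SHARD_GUIDELINE));
        // 250M documents would suggest three shards under this guideline.
        System.out.println(shardsNeeded(250_000_000L, DOCS_PER_SHARD_GUIDELINE));
    }
}
```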

There are sometimes reasons to shard a very small index.  This is the
correct path when the index is not very busy and you want to take
advantage of a surplus of CPU power to make each query faster.  In this
situation, you will probably end up with multiple shards per server, so
when a query comes in, multiple CPUs on the machine can handle the shard
queries in parallel.  If the index is handling a lot of requests per
second, then you want that CPU power used for handling the load, not
speeding up a single query.  For high-load situations, one shard per
physical server is desirable.
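
As a rough illustration of the multiple-shards-per-server idea, the
sketch below fans one query out to several in-memory "shards" on separate
threads and adds up the partial counts.  This is not Solr code: the
shards are plain lists, and searchShard and distributedCount are
hypothetical names standing in for what Solr does internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ShardFanOut {
    // Hypothetical stand-in for one shard: counts documents containing the term.
    public static long searchShard(List<String> shardDocs, String term) {
        return shardDocs.stream().filter(d -> d.contains(term)).count();
    }

    // Runs the per-shard searches concurrently, one thread per shard,
    // then sums the partial hit counts.
    public static long distributedCount(List<List<String>> shards, String term) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (List<String> shard : shards) {
                tasks.add(() -> searchShard(shard, term));
            }
            long total = 0;
            for (Future<Long> partial : pool.invokeAll(tasks)) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("shard query failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> shards = Arrays.asList(
                Arrays.asList("solr cloud", "lucene index"),
                Arrays.asList("solr shard", "java int"));
        System.out.println(distributedCount(shards, "solr"));
    }
}
```

On an idle machine the extra threads shorten a single query.  On a busy
one, those same cores are already occupied by other requests, so the
split mostly adds overhead.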

Thanks,
Shawn
