On 4/11/2016 6:31 AM, Bhaumik Joshi wrote:
> We are using solr 5.2.0 and we have Index-heavy (100 index updates per
> sec) and Query-heavy (100 queries per sec) scenario.
>
> *Index stats: *10 million documents and 16 GB index size
>
>  
>
> Which sharding strategy is best suited in above scenario?
>
> Please share reference resources which states detailed comparison of
> single shard over multi shard if any.
>
>  
>
> Meanwhile we did some tests with SolrMeter (Standalone java tool for
> stress tests with Solr) for single shard and two shards.
>
> *Index stats of test solr cloud: *0.7 million documents and 1 GB index
> size.
>
> As observed in test average query time with 2 shards is much higher
> than single shard.
>

On the same hardware, multiple shards will usually be slower than one
shard, especially under a high load.  Sharding can give good results
with *more* hardware, providing more CPU and memory resources.  When the
query load is high, there should be only one core (shard replica)
per server, and Solr works best when it is running on bare metal, not
virtualized.

Handling 100 queries per second will require multiple copies of your
index on separate hardware.  This is a fairly high query load.  There
are installations handling much higher loads, of course.  Those
installations have a LOT of replicas and some way to balance load across
them.

For 10 million documents and 16GB of index, I'm not sure that I would
shard at all, just make sure that each machine has plenty of memory --
probably somewhere in the neighborhood of 24GB to 32GB.  That assumes
that Solr is the only thing running on that server and, if the server
is virtualized, that the physical server's memory is not
oversubscribed.

Regarding your specific numbers:

The low queries per second may be caused by one or more of these
problems, or perhaps something I haven't thought of:  1) your queries
are particularly heavy.  2) updates are interfering by tying up scarce
resources.  3) you don't have enough memory in the machine.

How many documents are in each update request that you are sending?  In
another thread on the list, you have stated that you have a 1 second
maxTime on autoSoftCommit.  This is *way* too low, and a *major* source
of performance issues.  Very few people actually need that level of
latency -- a maxTime measured in minutes may be fast enough, and is much
friendlier for performance.
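
As a rough sketch of what a more relaxed commit setup might look like
inside the <updateHandler> section of solrconfig.xml: the intervals
below (a one minute hard commit that does not open a searcher, and a two
minute soft commit) are illustrative assumptions, not values taken from
your configuration or tuned for your load, so adjust them to the
visibility latency you actually need.

```xml
<!-- Hard commit: flushes index changes to disk and keeps the transaction
     log from growing without bound. openSearcher=false means it does not
     make new documents visible. The 60000 ms value is an assumed example. -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commit: controls how soon new documents become visible to
     searches. 120000 ms (two minutes) is an assumed example, in place
     of the 1000 ms mentioned above. -->
<autoSoftCommit>
  <maxTime>120000</maxTime>
</autoSoftCommit>
```

With settings along these lines, document visibility is governed by the
soft commit interval, while the hard commit only handles durability.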

Thanks,
Shawn
