On 4/15/2014 10:44 PM, Mukesh Jha wrote:
> In my Solr cluster I have multiple shards, each containing
> ~500,000,000 documents, with a total index size of ~1 TB.
> 
> I was just wondering how much more I can keep adding to a shard
> before we reach a tipping point and performance starts to degrade.
> 
> Also, as a best practice, what is the recommended number of docs /
> size per shard?

Vinay has given you my performance problems wiki page, which I created
because I was always telling people the same things about why their Solr
performance sucked.  Erick has given you additional good information.

I think Erick intended to give you this link instead of the wiki page
about XML updates:

http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

The performance tipping point is highly variable, mostly depending on
RAM.  The index details and the nature of the queries will affect this
... but honestly, that mostly comes down to RAM as well.  These details
affect how much heap is required as well as how much of the full index
must be loaded into the OS disk cache for good performance.  The "best
practices" recommendation is to have enough RAM to cache the entire
index ... but getting 1TB of total RAM for your case is *really*
expensive.  It is also probably not a strict requirement.
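
To make that rule of thumb concrete, here's a rough back-of-the-envelope
sketch in Python.  The per-node numbers are hypothetical (I don't know
how many servers you have or how big your heaps are); the only real
figure is the ~1TB total index from your message:

# Rough sizing sketch: how much of the index fits in the OS disk cache?
# Per-node numbers below are hypothetical examples, not recommendations.

def os_cache_estimate(total_ram_gb, solr_heap_gb, other_gb=1):
    """RAM left for the OS disk cache after the JVM heap and other processes."""
    return total_ram_gb - solr_heap_gb - other_gb

index_size_gb = 1024      # ~1 TB of total index, as in your message
nodes = 4                 # hypothetical: four shard servers
ram_per_node_gb = 128     # hypothetical: 128GB of RAM each
heap_per_node_gb = 8      # hypothetical Solr heap size

cache_gb = nodes * os_cache_estimate(ram_per_node_gb, heap_per_node_gb)
print(f"Total OS cache available: {cache_gb} GB")
print(f"Fraction of index cacheable: {cache_gb / index_size_gb:.0%}")

With those made-up numbers, only about 46% of the index can be cached.
Some query patterns are fine with that and others are not, which is why
there is no definitive answer.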

RAM is the most precious resource for large indexes.  CPU speed and
capability are important, but generally only for scaling query load once
you've gotten performance into an acceptable place.  (must remember to
put that in the SolrPerformanceProblems wiki page!)

I've got a dev Solr server that reached the tipping point yesterday.
It's only got 16GB of RAM, and the RAM slots are maxed out.  The Solr
instance has 7GB of heap, and I have a few other java programs on it
that each take a few hundred MB.  It's handling nearly 150GB of index,
with only about 7GB of OS disk cache.  I have never expected its
performance to be stellar.

The dev server normally runs fine, if a bit slow... but I had done an
index rebuild earlier in the day, so the amount of index data on the
machine was up near 200GB instead of below 150GB.  Suddenly it was
taking minutes to do basic update operations instead of a few hundred
milliseconds.  Once I deleted the data in the build cores (swapped with
live cores) and restarted Solr, everything started working OK again.
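
Plugging this dev server's numbers into the same kind of arithmetic
shows why the rebuild pushed it over the edge (treating the other java
programs as roughly 2GB total, which is only an approximation):

# Dev server: 16GB RAM, 7GB Solr heap, other java programs ~2GB total
total_ram_gb = 16
solr_heap_gb = 7
other_gb = 2
cache_gb = total_ram_gb - solr_heap_gb - other_gb  # about 7GB of OS disk cache

for index_gb in (150, 200):   # normal index size vs. during the rebuild
    print(f"{index_gb}GB index: {cache_gb / index_gb:.1%} cacheable")

That works out to under 5% of the index cached in the normal case and
about 3.5% during the rebuild -- a small difference on paper, but enough
to turn sub-second updates into multi-minute ones.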

The tipping point would have been reached much sooner if this were the
hardware I had for production.  The dev server barely sees any query
load.  I doubt this dev server would survive production queries, even if
the indexes were spread across two of them instead of just the one.

In production, the full distributed index is served by two machines that
each have 64GB of RAM.  Those machines right now only need to deal with
about 100GB of total index data (with a 6GB Solr heap on each one), so
there's plenty of RAM between them to cache the entire index.  When I
add the new indexes that are under development, there won't be quite
enough RAM to cache everything, but it will be close enough that it
won't matter.

My current production index is 94 million docs, and the new index that I
am adding is currently at 13 million, expected to grow to 20 million in
the near future.

Thanks,
Shawn
