Thank you Shawn and Toke for the information and links!

No, I was not the one on the #solr IRC channel. :/

Here are the details I have right now:

I'm building/running the operations side of this new SolrCloud cluster. 
It will be in Amazon. The initial cluster I'm planning to start with is 
5 r3.xlarge instances, each using a general purpose SSD EBS volume for 
the SolrCloud related data (this will be separate from the EBS volume 
used by the OS). Each instance has 30.5 GiB RAM (152.5 GiB cluster 
wide) and 4 vCPUs. I'm using Oracle Java 1.8.0_31 and the G1 GC.
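
As a rough sanity check on those figures, here is a small sketch of the 
cluster-wide totals. The 1024 GiB index size is an assumption: it treats 
the "approximately 1 TB" estimate quoted below as 1 TiB, and the variable 
names are only illustrative.

```python
# Back-of-the-envelope totals for the planned 5-node cluster.
INSTANCES = 5
RAM_PER_INSTANCE_GIB = 30.5   # r3.xlarge figure given above
VCPUS_PER_INSTANCE = 4

# Assumption: "approximately 1 TB" of index data is treated as 1024 GiB.
INDEX_SIZE_GIB = 1024.0

total_ram_gib = INSTANCES * RAM_PER_INSTANCE_GIB
total_vcpus = INSTANCES * VCPUS_PER_INSTANCE

# Fraction of the index that would fit in RAM if every byte of RAM were
# available for caching. In practice the JVM heap and the OS take a share,
# so the real fraction would be lower.
ram_to_index_ratio = total_ram_gib / INDEX_SIZE_GIB

print(f"{total_ram_gib} GiB RAM and {total_vcpus} vCPUs cluster wide")
print(f"RAM is roughly {ram_to_index_ratio:.0%} of the estimated index size")
```

Under that assumption, total RAM is only a fraction of the index size, 
which is why the question about keeping all collections in RAM matters.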

The data will be indexed on a separate machine and added to the 
SolrCloud cluster while searching is taking place. Unfortunately I don't 
have numbers at this time on how much data will be indexed. I do know 
that we will have over 2000 collections--some will be small (a few 
hundred documents and only a few megabytes at most), and a few will be 
very large (somewhere in the gigabytes). Our old Solr Master/Slave 
system isn't broken up this way, so we aren't certain exactly how 
things will map out in SolrCloud.

I'll continue researching, but I expect I'll just have to monitor the 
cluster as data gets imported into it and make adjustments as needed.

Ryan

On 4/2/15 12:06 AM, Toke Eskildsen wrote:
> Ryan Steele<ryan.ste...@pgi.com>  wrote:
>> Does a SolrCloud 5.0 cluster need enough RAM across the cluster to load
>> all the collections into RAM at all times?
> Although Shawn is right about us not being able to answer properly, sometimes 
> we can give qualified suggestions and guesses. At least to the direction you 
> should be looking. The quality of the guesses goes up with the amount of 
> information provided and "1TB" is really not much information.
>
> - Are you indexing while searching? How much?
> - How many documents in the index?
> - What is a typical query? What about faceting?
> - How many concurrent queries?
> - Expected median response time?
>
>> I'm building a SolrCloud cluster that may have approximately 1 TB of
>> data spread across the collections.
> We're running a 22TB SolrCloud on a single 16-core server with 256GB RAM. 
> We've also had performance problems serving a 100GB index from a same-size 
> machine.
>
> The one piece of hardware advice I will give is to start with SSDs and scale from 
> there. With present day price/performance, using spinning drives for anything 
> IO-intensive makes little sense.
>
> - Toke Eskildsen
>
>