Thank you Shawn and Toke for the information and links! No, I was not the one on the #solr IRC channel. :/
Here are the details I have right now:

I'm building and running the operations side of this new SolrCloud cluster. It will be in Amazon. The initial cluster I'm planning to start with is 5 r3.xlarge instances, each using a General Purpose SSD EBS volume for the SolrCloud-related data (separate from the EBS volume used by the OS). Each instance has 30.5 GiB RAM (152.5 GiB cluster-wide) and 4 vCPUs. I'm using Oracle Java 1.8.0_31 and the G1 GC.

The data will be indexed on a separate machine and added to the SolrCloud cluster while searching is taking place. Unfortunately I don't have numbers at this time on how much data will be indexed. I do know that we will have over 2000 collections: some will be small (a few hundred documents and only a few megabytes at most), and a few will be very large (somewhere in the gigabytes). Our old Solr master/slave system isn't broken up this way, so we aren't certain exactly how things will map out in SolrCloud.

I'll continue researching, but I expect I'll just have to monitor the cluster as data gets imported into it and make adjustments as needed.

Ryan

On 4/2/15 12:06 AM, Toke Eskildsen wrote:
> Ryan Steele<ryan.ste...@pgi.com> wrote:
>> Does a SolrCloud 5.0 cluster need enough RAM across the cluster to load
>> all the collections into RAM at all times?
>
> Although Shawn is right about us not being able to answer properly, sometimes
> we can give qualified suggestions and guesses. At least to the direction you
> should be looking. The quality of the guesses goes up with the amount of
> information provided and "1TB" is really not much information.
>
> - Are you indexing while searching? How much?
> - How many documents in the index?
> - What is a typical query? What about faceting?
> - How many concurrent queries?
> - Expected median response time?
>
>> I'm building a SolrCloud cluster that may have approximately 1 TB of
>> data spread across the collections.
>
> We're running a 22TB SolrCloud of a single 16-core server with 256GB RAM.
> We've also had performance problems serving a 100GB index from a same-size
> machine.
>
> The one hardware advice I will give is to start with SSDs and scale from
> there. With present day price/performance, using spinning drives for anything
> IO-intensive makes little sense.
>
> - Toke Eskildsen
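
For reference, the RAM figures in this thread can be sanity-checked with a few lines of Python. This is only a rough sketch: the 8 GiB heap per node is a hypothetical placeholder (no heap size has been chosen in this thread), 1 TB is read as 10^12 bytes, and OS overhead is ignored. It estimates how much of a roughly 1 TB index could sit in the OS page cache at once.

```python
GIB = 1024 ** 3  # bytes per GiB
TB = 10 ** 12    # bytes per TB (decimal; an assumption about the "1 TB" figure)


def cluster_ram_gib(nodes, ram_per_node_gib):
    """Total RAM across the cluster, in GiB."""
    return nodes * ram_per_node_gib


def page_cache_fraction(nodes, ram_per_node_gib, heap_per_node_gib, index_bytes):
    """Fraction of the index that fits in RAM left over after the JVM heap.

    Ignores OS and other process overhead, so this is an upper bound.
    """
    cache_bytes = nodes * (ram_per_node_gib - heap_per_node_gib) * GIB
    return cache_bytes / index_bytes


# 5 x r3.xlarge with 30.5 GiB each, as described above.
total = cluster_ram_gib(5, 30.5)

# heap_per_node_gib=8.0 is a made-up value for illustration only.
fraction = page_cache_fraction(5, 30.5, 8.0, 1 * TB)

print(f"cluster RAM: {total} GiB")
print(f"index fraction cacheable: {fraction:.1%}")
```

Under those assumptions only a small part of the index would be cached at any time, which is why the query pattern and the SSD-backed volumes matter more here than total RAM.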