I'll answer the other easy ones ;)

#1 yes, no need for a ton of RAM and tons of cores.
#2 it's not the overhead, it's that zookeeper is sensitive to not hearing from nodes and marking them dead, at least in the Hadoop and HBase world.
#3 yes, the external LB would simply spread the query load over all your Solr 4.0 nodes

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Sep 20, 2012 at 3:37 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
> I'll answer the easy one:
>
> #4 - yes! In fact, it would seem wise in many of these straightforward
> cases like yours to leave standard master/slave as-is for the time being
> even when upgrading to Solr 4. No need to make life more complicated.
> Now, if you did want to have NRT where updates are pushed to the replicas
> as they come in, then that's when the SolrCloud capabilities will come
> into play.
>
> But, if it ain't broke, don't fix it.
>
> Erik
>
> On Sep 20, 2012, at 14:51 , Petersen, Robert wrote:
>
>> Hello solr user group,
>>
>> I am evaluating the new Solr 4.0 beta with an eye to how to fit it into
>> our current solr setup. Our current setup is running on solr 3.6.1 and
>> uses 12 slaves behind a load balancer and a master which we index into,
>> and they all have three cores (now referred to as collections in 4.0
>> eh?) for three disparate types of indexes. All machines are configured
>> with dual quad xeon cpus and 64GB main memory. We've worked hard to keep
>> our index sizes small despite holding millions of documents, so we have
>> no need to shard any of the indexes. Everything is working very well at
>> this time.
>>
>> So to move to solr 4.0, I imagine we'd set -DnumShards=1 and spin up 11
>> replicas, but I'm worried about the statement "For production, it's
>> recommended that you run an external zookeeper ensemble rather than
>> having Solr run embedded zookeeper servers." That means we'd need at
>> least three more machines dedicated to just running zookeeper. So here
>> are my questions:
>>
>> 1. Could the zookeeper servers be smaller commodity servers? Ie They
>>    wouldn't need 64GB of memory and huge CPUs right?
>>
>> 2. Is the overhead of running embedded zookeeper really great enough to
>>    require the external ensemble? Our configuration will be pretty
>>    static, I don't anticipate having to change the zookeeper cluster
>>    once it is set up unless a machine completely dies or something.
>>
>> 3. Can we still use our external load balancer hardware to distribute
>>    queries to the solr 4.0 replicas as we do now with our slave farm?
>>
>> 4. Can solr 4.0 still run in a master-slave configuration if we don't
>>    want to use zookeeper or some of the other cloud features?
>>
>> Thanks,
>>
>> Robert (Robi) Petersen
>> Senior Software Engineer
>> Site Search Specialist
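
On the "at least three more machines" figure in the original question: a ZooKeeper ensemble stays available only while a strict majority of its servers can talk to each other, so three is the smallest ensemble that survives losing one machine. A minimal sketch of that arithmetic follows; the function names are illustrative only and are not part of any ZooKeeper or Solr API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with this many servers."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare a few ensemble sizes side by side.
    for n in (1, 2, 3, 5):
        print(f"{n} server(s): quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this rule a two-server ensemble tolerates no more failures than a single server, which is why ensembles are usually sized at odd numbers such as three or five.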