On 6/6/2013 6:32 PM, Jack Krupansky wrote:
<big snip>
> This would be a lot more of a true "Solr Cloud" than the "cluster"
> support that we have today.
>
> And the "CloudKeeper" itself might be a "traditional" SolrCloud cluster,
> except that it needs to be multi-data center.

I like a lot of what you said in the huge section that I didn't quote.
It inspired a few ideas.

Recently I was thinking about how we might change the names of certain
things in Solr to get rid of historical throwbacks, given that we are
redefining solr.xml and other config files in the dev branches. Your
ideas are similar to something I thought about where you'd have an
abstraction higher than a collection, something for which I couldn't
think of a name.

Another idea: One characteristic of SolrCloud is that the master/slave
model goes away. In some ways, this is a very good thing, but it does
get rid of the ability to index on one set of machines and query on
another. What if we combined master/slave replication with SolrCloud?

What I'm envisioning here is a master cloud with a low replication
factor like 2 or 3, and a slave cloud with a potentially high
replication factor. They would actually be part of the same cloud,
sharing a ZooKeeper ensemble. It would need to support the ability to
split configurations, either with two config sets for one cloud or the
ability to include master and slave configs, similar to how we split
index and query analyzers in the schema.

Related side issue: The fact that SolrCloud uses the standard
replication handler has led to lots of confusion. People look at the
replication section for their cores and are very confused by what they
see there, and when we tell them that SolrCloud's replicas don't
normally use replication, they get REALLY confused. How about we set
aside a dedicated handler name (/cloudreplication, perhaps) for an
internally defined replication handler specific to SolrCloud recovery?

Thanks,
Shawn