On 7/16/2013 3:36 PM, Robert Stewart wrote:
I want to script the creation of N solr cloud instances (on ec2).

But it's not clear to me where I would specify the numShards setting.
From the documentation, I see you can specify it on the "first node" you start up, or
alternatively use the "collections" API to create a new collection, but in that case
you first need at least one running Solr instance. I want to push all Solr instances with similar
configuration onto N instances and just run them with some number of shards preset somehow. Where
can I put the numShards configuration setting?

What I want to do:

1) push the Solr configuration to the zookeeper ensemble using the zkCli command-line tool.
2) create N instances of Solr running on EC2, pointing to the same zookeeper
3) start all Solr instances, which will become a cloud setup with M shards (where
M<N), and N-M replicas.

A minimal redundant SolrCloud cluster consists of two larger machines that run Solr and zookeeper, plus a third smaller machine that runs just zookeeper. This is only the minimum; you can use additional and more powerful servers.

Here is the general way that you should set up a brand new SolrCloud cluster. If anyone spots a problem with this, please don't hesitate to mention it:

1) Set up three hosts running standalone zookeeper, configured as a fully redundant ensemble. This is outside the scope of Solr documentation; please consult the zookeeper site:

http://zookeeper.apache.org

2) Construct a zkHost parameter for your ZK ensemble. An example is below, using the default zookeeper port of 2181. Substitute your own host names and port numbers. The /chroot part is optional, but highly recommended. Use a name that has meaning for your SolrCloud cluster rather than the literal "chroot":

-DzkHost=server1:2181,server2:2181,server3:2181/chroot

By using the /chroot syntax, you can run more than one SolrCloud cluster on your zookeeper ensemble. Just use a different value for each cluster.
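
If you are scripting the setup, the zkHost string can be assembled rather than typed by hand. Here is a minimal Python sketch; the function name build_zk_host and the host names are hypothetical. The only thing taken from the example above is the format: comma-separated host:port pairs with a single chroot suffix at the very end.

```python
def build_zk_host(hosts, chroot="", port=2181):
    """Join ZooKeeper hosts into a zkHost string with one trailing chroot.

    hosts  -- host names, optionally already carrying ":port"
    chroot -- optional path such as "/mycluster"; appended once, at the end
    port   -- port applied to hosts that do not specify one
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    # Add the default port only where the caller did not give one.
    pairs = [h if ":" in h else f"{h}:{port}" for h in hosts]
    # Accept "mycluster" as well as "/mycluster".
    if chroot and not chroot.startswith("/"):
        chroot = "/" + chroot
    return ",".join(pairs) + chroot


zk_host = build_zk_host(["server1", "server2", "server3"], "/chroot")
print("-DzkHost=" + zk_host)
# prints: -DzkHost=server1:2181,server2:2181,server3:2181/chroot
```

Note that the chroot appears once after the last host, not after each host.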

3) Start Solr with the same zkHost parameter on every Solr host, referring to the three zookeeper hosts already set up. You can use the same hosts for Solr as you did for zookeeper.

4) Use the zkcli script in example/cloud-scripts to upload a configuration set to zookeeper using the "upconfig" command. If you aren't using the Solr example or a custom install based on the example, then you'll need to examine the script to figure out how to run the java command manually and have it find the solr and zookeeper jars.
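
As a rough illustration of the upload step, this Python sketch only builds and prints the zkcli command line; it does not run it. The script path, the config directory, and the helper name upconfig_command are assumptions for a stock example install, so adjust them to your layout. The config name "mycfg" is whatever name you want to reference later when creating the collection.

```python
import shlex


def upconfig_command(zk_host, conf_dir, conf_name,
                     script="example/cloud-scripts/zkcli.sh"):
    """Return the argument list for uploading a config set with zkcli."""
    return [
        script,                   # assumed path, relative to the Solr install
        "-cmd", "upconfig",
        "-zkhost", zk_host,       # same string as zkHost, chroot included
        "-confdir", conf_dir,     # local directory holding solrconfig.xml etc.
        "-confname", conf_name,   # name the config set gets in zookeeper
    ]


cmd = upconfig_command(
    "server1:2181,server2:2181,server3:2181/chroot",
    "example/solr/collection1/conf",  # assumed location of the config files
    "mycfg",
)
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually execute it from a script, the same list could be handed to subprocess.run(cmd, check=True).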

5) Use the Collections API to create a collection, referencing the uploaded config set and including additional parameters like numShards. If you have four Solr hosts, the following API call fits exactly, because two shards with two copies each makes four cores, one per host:

http://server:port/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=mycfg
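
For a scripted setup, the same URL can be generated from variables. This is a minimal Python sketch: the helper names and the base URL (including port 8983, the usual Solr default) are assumptions, while the parameter names are the ones in the call above. It also shows the arithmetic behind the four-host example, since numShards times replicationFactor is the number of cores that get created.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor,
                          config_name):
    """Build a Collections API CREATE URL for the given settings."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", num_shards),
        ("replicationFactor", replication_factor),
        ("collection.configName", config_name),
    ]
    return f"{base_url}/admin/collections?{urlencode(params)}"


def cores_needed(num_shards, replication_factor):
    """Total cores created: every shard exists replication_factor times."""
    return num_shards * replication_factor


url = create_collection_url("http://server:8983/solr", "mycollection",
                            2, 2, "mycfg")
print(url)
print(cores_needed(2, 2))  # prints 4, one core for each of four hosts
```

The request can be sent to any one running Solr node in the cluster, for example with curl or a browser.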

Thanks,
Shawn
