Re: CREATE collection bug or feature?
On 6/19/2015 11:15 AM, Jim.Musil wrote:

> I noticed that when I issue the CREATE collection command to the API, it does not automatically put a replica on every live node connected to ZooKeeper.
>
> So, for example, if I have 3 Solr nodes connected to a ZooKeeper ensemble and create a collection like this:
>
> /admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=my_config
>
> It will only create a core on one of the three nodes. I can make it work if I change replicationFactor to 3. When standing up an entire stack using Chef, this all gets a bit clunky. I don't see any option such as ALL that would just create a replica on all nodes regardless of size.
>
> I'm guessing this is intentional, but curious about the reasoning.

If you tell it replicationFactor=1, then you get exactly that -- one copy of your index. I personally think that it would be a violation of something known as the principle of least surprise for Solr to automatically create replicas without being asked to.

I would assume that if you are writing automated tools to build indexes and the servers hosting those indexes, your automation will be able to calculate a reasonable replicationFactor, or calculate the number of hosts to create based on a provided replicationFactor.

A feature to have Solr itself automatically calculate a replicationFactor based on the number of available hosts and the numShards value provided is not a bad idea. Please create a feature request issue in Jira. One way that this might be done is by setting replicationFactor to "auto" or maybe a special number, perhaps 0 or -1.

https://issues.apache.org/jira/browse/SOLR

Thanks,
Shawn
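For anyone scripting this, the calculation Shawn describes might look roughly like the Python sketch below. It is not part of Solr. The function names are made up for illustration, and the capacity rule it assumes (numShards * replicationFactor must not exceed hosts * maxShardsPerNode) should be checked against the Solr version in use.

```python
def replication_factor_for(host_count: int, num_shards: int,
                           max_shards_per_node: int = 1) -> int:
    """Largest replicationFactor that fits on the given hosts.

    Assumption (not from the thread): the cluster can hold at most
    host_count * max_shards_per_node cores for this collection.
    """
    if host_count < 1 or num_shards < 1 or max_shards_per_node < 1:
        raise ValueError("all arguments must be positive integers")
    capacity = host_count * max_shards_per_node
    factor = capacity // num_shards
    if factor < 1:
        raise ValueError("not enough capacity for one replica per shard")
    return factor


def hosts_needed(num_shards: int, replication_factor: int,
                 max_shards_per_node: int = 1) -> int:
    """Smallest number of hosts that can hold every replica of every shard."""
    if num_shards < 1 or replication_factor < 1 or max_shards_per_node < 1:
        raise ValueError("all arguments must be positive integers")
    total_cores = num_shards * replication_factor
    # Ceiling division without importing math.
    return -(-total_cores // max_shards_per_node)
```

With Jim's three nodes and numShards=1, `replication_factor_for(3, 1)` gives 3, which matches the value he found to work.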
Re: CREATE collection bug or feature?
Jim:

This is by design. There's no way to tell Solr to find all the cores available and put one replica on each. In fact, you're explicitly telling it to create one and only one replica, one and only one shard. That is, your collection will have exactly one low-level core. But you realized that...

As to the reasoning: consider heterogeneous collections all hosted on the same Solr cluster. I have big collections, little collections, some with high QPS rates, some not, etc. Having Solr do things like this automatically would make managing this difficult.

Probably the real reason is nobody thought it would be useful in the general case. And I probably concur. Adding a new node to an existing cluster would result in unbalanced clusters, etc.

I suppose a stop-gap would be to query the live_nodes in the cluster and add that count to the URL as the replicationFactor; I don't know how much of a pain that would be, though.

Best,
Erick

On Fri, Jun 19, 2015 at 10:15 AM, Jim.Musil jim.mu...@target.com wrote:

> I noticed that when I issue the CREATE collection command to the API, it does not automatically put a replica on every live node connected to ZooKeeper. [...]
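A rough sketch of the stop-gap Erick mentions, in Python. It assumes the Collections API CLUSTERSTATUS action is available and returns a JSON body with a `cluster.live_nodes` list; the base URL and the helper names are illustrative only, and error handling is left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_cluster_status(solr_base_url: str) -> dict:
    """Ask the Collections API for the cluster status as JSON."""
    query = urlencode({"action": "CLUSTERSTATUS", "wt": "json"})
    with urlopen(f"{solr_base_url}/admin/collections?{query}") as response:
        return json.load(response)


def live_node_count(cluster_status: dict) -> int:
    """Count live nodes; assumes the response has cluster.live_nodes."""
    return len(cluster_status["cluster"]["live_nodes"])


def build_create_url(solr_base_url: str, name: str, config_name: str,
                     node_count: int) -> str:
    """Build a CREATE URL that asks for one replica per live node."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": 1,
        "replicationFactor": node_count,
        "maxShardsPerNode": 1,
        "collection.configName": config_name,
    }
    return f"{solr_base_url}/admin/collections?{urlencode(params)}"
```

A provisioning script could call `fetch_cluster_status("http://localhost:8983/solr")`, pass the result to `live_node_count`, and then request the URL from `build_create_url`. Note that the count is a snapshot: nodes that join later get no replica, which is the imbalance Erick warns about.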
CREATE collection bug or feature?
I noticed that when I issue the CREATE collection command to the API, it does not automatically put a replica on every live node connected to ZooKeeper.

So, for example, if I have 3 Solr nodes connected to a ZooKeeper ensemble and create a collection like this:

/admin/collections?action=CREATE&name=my_collection&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=my_config

It will only create a core on one of the three nodes. I can make it work if I change replicationFactor to 3. When standing up an entire stack using Chef, this all gets a bit clunky. I don't see any option such as ALL that would just create a replica on all nodes regardless of size.

I'm guessing this is intentional, but curious about the reasoning.

Thanks!
Jim
Re: CREATE collection bug or feature?
Thanks as always for the great answers!

Jim

On 6/19/15, 11:57 AM, Erick Erickson erickerick...@gmail.com wrote:

> Jim: This is by design. There's no way to tell Solr to find all the cores available and put one replica on each. [...]