Re: Creating new cluster with existing config in zookeeper
On 3/23/2016 9:43 AM, Robert Brown wrote:
> When going to the admin UI on this new server I can see the shards/replica's of the existing collection, and can even query it, even tho this new server has no cores on it itself.
>
> Is this all expected behaviour?

Yes. There were some bugs early on in the 4.x versions where this *didn't* work, but those were fixed. This functionality was fully intentional.

> Is there any performance gain with what I have at this precise stage? The extra server certainly makes it appear i could balance more load/requests, but I guess the queries are just being forwarded on to the servers with the actual data?

As Erick said, the requests are being handled by the servers that actually host the data. The new node is just acting as a data ferry. There is no performance gain unless you use the ADDREPLICA feature of the collections API to add replicas of your existing shards to the new node, so some of the query load is transferred.

> Am I correct in thinking I can now create a new collection on this host, and begin to build up a new cluster? and they won't interfere with each other at all?
>
> Also, that I'll be able to see both collections when using the admin UI Cloud page on any of the servers in either collection?

You're still mixing up "collection" and "cluster" in your terminology. This is somewhat understandable ... the concepts do have some similarity. Within a single cluster (servers sharing a particular database/chroot in zookeeper), all collections (logical indexes) in the entire cluster will be usable on any machine in the cluster.

Thanks,
Shawn
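For illustration, an ADDREPLICA call of the kind Shawn mentions could be put together roughly like this. It is only a sketch: the base URL, the collection name `products`, the shard name and the node name `newhost:8983_solr` are all made-up values, and the code just builds the request URL without sending anything.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node):
    """Build a Collections API ADDREPLICA request URL (nothing is sent)."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,  # existing collection to extend
        "shard": shard,            # shard that should gain a replica
        "node": node,              # node name of the new, empty server
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# All values below are hypothetical placeholders.
url = add_replica_url(
    "http://localhost:8983/solr", "products", "shard1", "newhost:8983_solr"
)
print(url)
```

One such request per shard would be needed if the new node is meant to take a share of the query load for every shard.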
Re: Creating new cluster with existing config in zookeeper
Thanks all,

I am no doubt confusing things myself - I (rather stupidly) have 5 completely separate clouds, with separate ZK trees - a bad design decision on day one when I thought each config needed a separate ZK tree.

So it could all be simplified a bit, but that's my current view, which is probably sounding confused.

Cheers,
Rob

On 03/23/2016 04:03 PM, Tom Evans wrote:
> On Wed, Mar 23, 2016 at 3:43 PM, Robert Brown wrote:
>> So I setup a new solr server to point to my existing ZK configs.
>>
>> When going to the admin UI on this new server I can see the shards/replica's of the existing collection, and can even query it, even tho this new server has no cores on it itself.
>>
>> Is this all expected behaviour?
>>
>> Is there any performance gain with what I have at this precise stage? The extra server certainly makes it appear i could balance more load/requests, but I guess the queries are just being forwarded on to the servers with the actual data?
>>
>> Am I correct in thinking I can now create a new collection on this host, and begin to build up a new cluster? and they won't interfere with each other at all?
>>
>> Also, that I'll be able to see both collections when using the admin UI Cloud page on any of the servers in either collection?
>
> I'm confused slightly:
>
> SolrCloud is a (singular) cluster of servers, storing all of its state and configuration underneath a single zookeeper path. The cluster contains collections. Collections are tied to a particular config set within the cluster. Collections are made up of 1 or more shards. Each shard is a core, and there are 1 or more replicas of each core.
>
> You can add more servers to the cluster, and then create a new collection with the same config as an existing collection, but it is still part of the same cluster.
>
> Of course, you could think of a set of servers within a cluster as a "logical" cluster if it just serves particular collection, but "cluster" to me would be all of the servers within the same zookeeper tree, because that is where cluster state is maintained.
>
> Cheers
>
> Tom
Re: Creating new cluster with existing config in zookeeper
> Is this all expected behaviour?

Yes. Since each Solr node has access to the entire state, an arbitrary Solr node can figure out where to forward a request for some collection it doesn't host.

> Is there any performance gain with what I have at this precise stage? The extra server certainly makes it appear i could balance more load/requests, but I guess the queries are just being forwarded on to the servers with the actual data?

Not really. The work of searching is done by the Solr nodes hosting the collection.

> Am I correct in thinking I can now create a new collection on this host, and begin to build up a new cluster? and they won't interfere with each other at all?

Yes

> Also, that I'll be able to see both collections when using the admin UI Cloud page on any of the servers in either collection?

Yes

On Wed, Mar 23, 2016 at 8:43 AM, Robert Brown wrote:
> So I setup a new solr server to point to my existing ZK configs.
>
> When going to the admin UI on this new server I can see the shards/replica's of the existing collection, and can even query it, even tho this new server has no cores on it itself.
>
> Is this all expected behaviour?
>
> Is there any performance gain with what I have at this precise stage? The extra server certainly makes it appear i could balance more load/requests, but I guess the queries are just being forwarded on to the servers with the actual data?
>
> Am I correct in thinking I can now create a new collection on this host, and begin to build up a new cluster? and they won't interfere with each other at all?
>
> Also, that I'll be able to see both collections when using the admin UI Cloud page on any of the servers in either collection?
>
> Thanks,
> Rob
>
> On 03/22/2016 04:47 PM, Erick Erickson wrote:
>> The whole _point_ of configsets is to re-use them in multiple collections, so please do!
>>
>> Best,
>> Erick
>>
>> On Tue, Mar 22, 2016 at 5:38 AM, Robert Brown wrote:
>>> Hi,
>>>
>>> Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too?
>>>
>>> I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting.
>>>
>>> Thanks,
>>> Rob
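To make the forwarding idea concrete, here is a toy model of the decision a node can make once it can read the shared cluster state. This is not Solr's actual code or its real state format: the dictionary layout, the function name `plan_request` and all node names are invented for illustration.

```python
def plan_request(cluster_state, collection, local_node):
    """Decide whether local_node hosts part of a collection or must forward.

    cluster_state is a simplified, hypothetical mapping:
    collection name -> shard name -> list of node names holding a replica.
    """
    shards = cluster_state[collection]
    hosting = {node for nodes in shards.values() for node in nodes}
    action = "local" if local_node in hosting else "forward"
    return action, sorted(hosting)


# Hypothetical state: one collection, two shards, three data-bearing nodes.
state = {
    "products": {
        "shard1": ["node1:8983_solr", "node2:8983_solr"],
        "shard2": ["node3:8983_solr"],
    }
}

# node4 is the freshly added server with no cores of its own.
action, targets = plan_request(state, "products", "node4:8983_solr")
print(action, targets)
```

In this model the new node always ends up in the "forward" branch, which matches the answer above: it can accept the request, but the searching happens on the nodes in `targets`.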
Re: Creating new cluster with existing config in zookeeper
On Wed, Mar 23, 2016 at 3:43 PM, Robert Brown wrote:
> So I setup a new solr server to point to my existing ZK configs.
>
> When going to the admin UI on this new server I can see the shards/replica's of the existing collection, and can even query it, even tho this new server has no cores on it itself.
>
> Is this all expected behaviour?
>
> Is there any performance gain with what I have at this precise stage? The extra server certainly makes it appear i could balance more load/requests, but I guess the queries are just being forwarded on to the servers with the actual data?
>
> Am I correct in thinking I can now create a new collection on this host, and begin to build up a new cluster? and they won't interfere with each other at all?
>
> Also, that I'll be able to see both collections when using the admin UI Cloud page on any of the servers in either collection?

I'm confused slightly:

SolrCloud is a (singular) cluster of servers, storing all of its state and configuration underneath a single zookeeper path. The cluster contains collections. Collections are tied to a particular config set within the cluster. Collections are made up of 1 or more shards. Each shard is a core, and there are 1 or more replicas of each core.

You can add more servers to the cluster, and then create a new collection with the same config as an existing collection, but it is still part of the same cluster.

Of course, you could think of a set of servers within a cluster as a "logical" cluster if it just serves particular collection, but "cluster" to me would be all of the servers within the same zookeeper tree, because that is where cluster state is maintained.

Cheers

Tom
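The hierarchy described above can be written down as a small data model. This is only a sketch with invented class, field and node names. It models each replica as the thing that lives on a node as a core, which is how Solr's own terminology usually puts it, so read "each shard is a core" loosely.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    core: str  # name of the physical core backing this replica
    node: str  # server hosting that core


@dataclass
class Shard:
    name: str
    replicas: list = field(default_factory=list)


@dataclass
class Collection:
    name: str
    config_set: str  # several collections may share one config set
    shards: list = field(default_factory=list)


@dataclass
class Cluster:
    zk_chroot: str  # the single ZooKeeper path holding all cluster state
    collections: dict = field(default_factory=dict)

    def nodes(self):
        """All servers hosting at least one core, across every collection."""
        return sorted(
            r.node
            for c in self.collections.values()
            for s in c.shards
            for r in s.replicas
        )


# Hypothetical example: two collections, one shared config set, one cluster.
cluster = Cluster(zk_chroot="/solr")
cluster.collections["old"] = Collection(
    "old", "shared_conf", [Shard("shard1", [Replica("old_s1_r1", "node1")])]
)
cluster.collections["new"] = Collection(
    "new", "shared_conf", [Shard("shard1", [Replica("new_s1_r1", "node2")])]
)
print(cluster.nodes())
```

Even though `old` and `new` sit on different servers here, both belong to the same `Cluster` object because they share one ZooKeeper tree, which is the distinction being made above.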
Re: Creating new cluster with existing config in zookeeper
On 3/22/2016 11:16 AM, Robert Brown wrote:
> Thanks Erick and Shawn, a "collection" is indeed what I meant.
>
> I was under the impression the entire Tree view in the admin GUI was showing everything in ZK, including things like "collections/name/state.json", not just the /configs directory.
>
> The solr.xml file is too isn't it? (I added it to ZK as per the docs), just a bit confusing to see some files/directories from ZK, and some not.

Info you may already know: When you create a new collection using the Collections API, you can give it the name of an existing config with the collection.configName parameter. Changes to that config will affect all collections using it. You'll usually need to reload the collection so it re-reads the config from zookeeper.

The "Tree" view does show you the entire zookeeper database from Solr's point of view. This is the common information available to every Solr server in your entire cloud. It contains information about every collection, all of your uploaded configs, and a few other things.

SolrCloud still requires a fair amount of information at the core level that isn't stored in zookeeper. The index is too big, and we haven't gotten around to the rest of it. We do want to get to the point where everything relevant for a SolrCloud core (except the Lucene index) is stored in zookeeper -- for internal discussions, we call this "ZK as truth":

https://issues.apache.org/jira/browse/SOLR-7269

Thanks,
Shawn
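As a rough sketch of the two calls described above (creating a collection on an existing config, then reloading it after a config change), the request URLs could be built like this. The config-name parameter is spelled `collection.configName` in the Collections API. The base URL, the collection name `products_v2`, the config name `products_conf` and the shard/replica counts are placeholders, and nothing is actually sent.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, params):
    """Build a Collections API request URL from a dict of parameters."""
    return f"{base_url}/admin/collections?{urlencode(params)}"


BASE = "http://localhost:8983/solr"  # hypothetical Solr base URL

# Create a new collection that re-uses a config already in ZooKeeper.
create_url = collections_api_url(BASE, {
    "action": "CREATE",
    "name": "products_v2",                     # hypothetical new collection
    "numShards": 4,                            # can differ from the old one
    "replicationFactor": 2,
    "collection.configName": "products_conf",  # hypothetical existing config
})

# After editing the shared config, reload so the collection re-reads it.
reload_url = collections_api_url(BASE, {"action": "RELOAD", "name": "products_v2"})

print(create_url)
print(reload_url)
```

Because the config is shared, a reload would be needed for every collection that uses it, not only the new one.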
Re: Creating new cluster with existing config in zookeeper
So I setup a new solr server to point to my existing ZK configs.

When going to the admin UI on this new server I can see the shards/replica's of the existing collection, and can even query it, even tho this new server has no cores on it itself.

Is this all expected behaviour?

Is there any performance gain with what I have at this precise stage? The extra server certainly makes it appear i could balance more load/requests, but I guess the queries are just being forwarded on to the servers with the actual data?

Am I correct in thinking I can now create a new collection on this host, and begin to build up a new cluster? and they won't interfere with each other at all?

Also, that I'll be able to see both collections when using the admin UI Cloud page on any of the servers in either collection?

Thanks,
Rob

On 03/22/2016 04:47 PM, Erick Erickson wrote:
> The whole _point_ of configsets is to re-use them in multiple collections, so please do!
>
> Best,
> Erick
>
> On Tue, Mar 22, 2016 at 5:38 AM, Robert Brown wrote:
>> Hi,
>>
>> Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too?
>>
>> I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting.
>>
>> Thanks,
>> Rob
Re: Creating new cluster with existing config in zookeeper
Thanks Erick and Shawn, a "collection" is indeed what I meant.

I was under the impression the entire Tree view in the admin GUI was showing everything in ZK, including things like "collections/name/state.json", not just the /configs directory.

The solr.xml file is too isn't it? (I added it to ZK as per the docs), just a bit confusing to see some files/directories from ZK, and some not.

Thanks for any more insight.

On 03/22/2016 04:57 PM, Shawn Heisey wrote:
> On 3/22/2016 6:38 AM, Robert Brown wrote:
>> Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too?
>>
>> I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting.
>
> When you say "cluster" what exactly do you mean? To me, "cluster" in a Solr context means "a bunch of Solr servers." If this is what you mean, there is nothing built in to copy things from an existing cluster. You *can* run multiple SolrCloud clusters on one Zookeeper ensemble.
>
> If you are actually talking about a *collection* when you say "cluster", then what Erick said is 100% correct.
>
> Thanks,
> Shawn
Re: Creating new cluster with existing config in zookeeper
On 3/22/2016 6:38 AM, Robert Brown wrote:
> Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too?
>
> I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting.

When you say "cluster" what exactly do you mean? To me, "cluster" in a Solr context means "a bunch of Solr servers." If this is what you mean, there is nothing built in to copy things from an existing cluster. You *can* run multiple SolrCloud clusters on one Zookeeper ensemble.

If you are actually talking about a *collection* when you say "cluster", then what Erick said is 100% correct.

Thanks,
Shawn
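Running several SolrCloud clusters on one ZooKeeper ensemble usually comes down to giving each cluster its own chroot path at the end of the zkHost string. A minimal sketch, with made-up host names and chroot names:

```python
def zk_host(ensemble, chroot):
    """Join ZooKeeper hosts and append a single chroot path at the end."""
    if not chroot.startswith("/"):
        raise ValueError("chroot must start with '/'")
    return ",".join(ensemble) + chroot


# Hypothetical three-node ZooKeeper ensemble shared by two Solr clusters.
ensemble = ["zk1:2181", "zk2:2181", "zk3:2181"]

cluster_a = zk_host(ensemble, "/cloud_a")
cluster_b = zk_host(ensemble, "/cloud_b")

print(cluster_a)
print(cluster_b)
```

Solr servers started with the same zkHost string form one cluster and see each other's collections. Servers started with a different chroot form a separate cluster, even though the ZooKeeper machines are the same.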
Re: Creating new cluster with existing config in zookeeper
The whole _point_ of configsets is to re-use them in multiple collections, so please do!

Best,
Erick

On Tue, Mar 22, 2016 at 5:38 AM, Robert Brown wrote:
> Hi,
>
> Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too?
>
> I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting.
>
> Thanks,
> Rob
Creating new cluster with existing config in zookeeper
Hi,

Is it safe to create a new cluster but use an existing config set that's in zookeeper? Or does that config set contain the cluster status too?

I want to (re)-build a cluster from scratch, with a different amount of shards, but not using shard-splitting.

Thanks,
Rob