Re: Usage of CloudSolrServer?
CloudSolrServer uses LBHttpSolrServer by default. CloudSolrServer connects to ZooKeeper and passes the live nodes to LBHttpSolrServer, which then sends requests to the nodes in round-robin order. By the way, do you mean leader instead of master?

2013/7/12 sathish_ix skandhasw...@inautix.co.in:

> Hi, I am using CloudSolrServer to connect to SolrCloud, and I am indexing documents through the SolrJ API using a CloudSolrServer object. Indexing is triggered on the master node of a collection, but when I ask for the status of the load, the response comes from a replica, where the status is null. How can I find out which instance CloudSolrServer is connecting to?
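The flow described in the reply above (take the list of live nodes, then rotate requests across them) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not SolrJ's actual LBHttpSolrServer code: the class `RoundRobinNodes`, its `pick` method, and the node names are hypothetical, and the set of live nodes is passed in by hand rather than read from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a hypothetical round-robin picker, not SolrJ code.
final class RoundRobinNodes {
    private final List<String> nodes;                        // known nodes, fixed order
    private final AtomicInteger next = new AtomicInteger();  // rotation counter

    RoundRobinNodes(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    // Returns the next node in rotation that is currently live,
    // or null if none of the known nodes is live.
    String pick(Set<String> liveNodes) {
        for (int i = 0; i < nodes.size(); i++) {
            int idx = Math.floorMod(next.getAndIncrement(), nodes.size());
            String candidate = nodes.get(idx);
            if (liveNodes.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RoundRobinNodes lb = new RoundRobinNodes(Arrays.asList("node1", "node2", "node3"));
        // Assumption for the demo: node2 is currently down.
        Set<String> live = new HashSet<>(Arrays.asList("node1", "node3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.pick(live));
        }
    }
}
```

Passing the live set on every call reflects the idea that cluster membership can change between requests; a node that drops out is simply skipped on the next pick.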
Re: Usage of CloudSolrServer?
Hi, I am using CloudSolrServer to connect to SolrCloud, and I am indexing documents through the SolrJ API using a CloudSolrServer object. Indexing is triggered on the master node of a collection, but when I ask for the status of the load, the response comes from a replica, where the status is null. How can I find out which instance CloudSolrServer is connecting to?

--
View this message in context: http://lucene.472066.n3.nabble.com/Usage-of-CloudSolrServer-tp4056052p4077471.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Usage of CloudSolrServer?
Hi Shawn,

I am sorry, but what kind of load balancing is that? Does it check whether some leaders are using a lot of CPU or RAM, for example? I think a problem may occur in a scenario like this: if some leaders receive more documents than others (I don't know how it is decided which shard a document goes to), then there will be a bottleneck on those leaders?

2013/4/15 Shawn Heisey s...@elyograg.org:

> It appears that the Solr integration in Nutch currently does not use CloudSolrServer. There is an issue to add it. The mutual dependency on HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses HttpClient 4.
>
> https://issues.apache.org/jira/browse/NUTCH-1377
>
> Until that is fixed, a load balancer would be required for full redundancy for updates with SolrCloud. You don't have to use a load balancer for it to work, but if the Solr server that Nutch is using goes down, then indexing will stop unless you reconfigure Nutch or bring the Solr server back up.
>
> Thanks,
> Shawn
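On the aside above about how a document ends up on a particular shard: as far as I know, SolrCloud's default routing hashes the document's unique key, and each shard owns a contiguous range of the hash space, so documents spread roughly evenly regardless of load. A minimal sketch of that idea follows. It is an assumption-laden illustration, not Solr's code: `HashShardRouter` and `shardFor` are hypothetical names, and `String.hashCode()` stands in for the hash function Solr actually uses.

```java
// Illustrative only: hypothetical hash-range routing, not Solr's router.
final class HashShardRouter {
    private final int numShards;

    HashShardRouter(int numShards) {
        if (numShards < 1) {
            throw new IllegalArgumentException("numShards must be >= 1");
        }
        this.numShards = numShards;
    }

    // Maps a document id to a shard index in [0, numShards).
    int shardFor(String docId) {
        // Reinterpret the signed 32-bit hash as an unsigned value in [0, 2^32).
        long unsigned = docId.hashCode() & 0xFFFFFFFFL;
        // Scale into [0, numShards): each shard owns one contiguous slice
        // of the hash space.
        return (int) ((unsigned * numShards) >>> 32);
    }

    public static void main(String[] args) {
        HashShardRouter router = new HashShardRouter(4);
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            System.out.println(id + " -> shard " + router.shardFor(id));
        }
    }
}
```

Because the shard is a pure function of the id, the same document always lands on the same shard, and no leader is chosen based on CPU or memory usage.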
Re: Usage of CloudSolrServer?
If you are accessing Solr from Java code, you will likely use the SolrJ client to do so. If your users are hitting Solr directly, you should think about whether this is wise - as well as providing them with direct search access, you are also providing them with the ability to delete your entire index with a single command.

SolrJ isn't really a load balancer as such. When SolrJ is used to make a request against a collection, it will ask ZooKeeper for the names of the shards that make up that collection, and for the hosts/cores that make up the set of replicas for those shards. It will then choose one of those hosts/cores for each shard, and send a request to them as a distributed search request.

This has the advantage over traditional load balancing that if you bring up a new node, that node will register itself with ZooKeeper, and thus your SolrJ client(s) will know about it, without any intervention.

Upayavira

On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:

> Hi Shawn,
>
> I am sorry, but what kind of load balancing is that? Does it check whether some leaders are using a lot of CPU or RAM, for example? I think a problem may occur in a scenario like this: if some leaders receive more documents than others (I don't know how it is decided which shard a document goes to), then there will be a bottleneck on those leaders?
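The per-shard selection Upayavira describes (look up the replicas of every shard, then pick one host/core per shard) can be sketched as follows. This is a minimal sketch under stated assumptions rather than SolrJ's implementation: `ReplicaChooser` and `chooseOnePerShard` are hypothetical names, the shard-to-replica map is built by hand instead of being read from ZooKeeper, and the choice is uniformly random, which is itself an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Illustrative only: hypothetical per-shard replica selection, not SolrJ code.
final class ReplicaChooser {
    private final Random random;

    ReplicaChooser(Random random) {
        this.random = random;
    }

    // For every shard, picks one replica (host/core) at random.
    Map<String, String> chooseOnePerShard(Map<String, List<String>> replicasByShard) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : replicasByShard.entrySet()) {
            List<String> replicas = entry.getValue();
            if (replicas.isEmpty()) {
                throw new IllegalStateException("no replica for " + entry.getKey());
            }
            chosen.put(entry.getKey(), replicas.get(random.nextInt(replicas.size())));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Assumed cluster layout for the demo: two shards, hand-written replica lists.
        Map<String, List<String>> cluster = new LinkedHashMap<>();
        cluster.put("shard1", Arrays.asList("hostA/core1", "hostB/core1"));
        cluster.put("shard2", Arrays.asList("hostC/core2", "hostD/core2"));
        System.out.println(new ReplicaChooser(new Random()).chooseOnePerShard(cluster));
    }
}
```

If the map is rebuilt from the cluster state before each request, a newly registered replica becomes eligible on the next call without any client reconfiguration, which is the advantage the message points out.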
Re: Usage of CloudSolrServer?
Thanks for your detailed explanation. However, you said:

> It will then choose one of those hosts/cores for each shard, and send a request to them as a distributed search request.

Is there any document that explains distributed search? What are the criteria for it?

2013/4/16 Upayavira u...@odoko.co.uk:

> If you are accessing Solr from Java code, you will likely use the SolrJ client to do so. If your users are hitting Solr directly, you should think about whether this is wise - as well as providing them with direct search access, you are also providing them with the ability to delete your entire index with a single command.
>
> SolrJ isn't really a load balancer as such. When SolrJ is used to make a request against a collection, it will ask ZooKeeper for the names of the shards that make up that collection, and for the hosts/cores that make up the set of replicas for those shards. It will then choose one of those hosts/cores for each shard, and send a request to them as a distributed search request.
>
> This has the advantage over traditional load balancing that if you bring up a new node, that node will register itself with ZooKeeper, and thus your SolrJ client(s) will know about it, without any intervention.
>
> Upayavira
Re: Usage of CloudSolrServer?
I cannot say that I have researched it, but I have always taken it to be random.

Upayavira

On Tue, Apr 16, 2013, at 12:23 PM, Furkan KAMACI wrote:

> Thanks for your detailed explanation. However, you said:
>
> It will then choose one of those hosts/cores for each shard, and send a request to them as a distributed search request.
>
> Is there any document that explains distributed search? What are the criteria for it?
Usage of CloudSolrServer?
I am reading the Lucidworks Solr Guide, which says in its SolrCloud section:

*Read Side Fault Tolerance*

With earlier versions of Solr, you had to set up your own load balancer. Now each individual node load balances requests across the replicas in a cluster. You still need a load balancer on the 'outside' that talks to the cluster, or you need a smart client. (Solr provides a smart Java Solrj client called CloudSolrServer.)

My system is as follows: I crawl data with Nutch and send it to SolrCloud. Users will search in Solr. What is CloudSolrServer? Should I use it for load balancing, or is it something different?
Re: Usage of CloudSolrServer?
On 4/15/2013 8:05 AM, Furkan KAMACI wrote:

> My system is as follows: I crawl data with Nutch and send it to SolrCloud. Users will search in Solr. What is CloudSolrServer? Should I use it for load balancing, or is it something different?

It appears that the Solr integration in Nutch currently does not use CloudSolrServer. There is an issue to add it. The mutual dependency on HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses HttpClient 4.

https://issues.apache.org/jira/browse/NUTCH-1377

Until that is fixed, a load balancer would be required for full redundancy for updates with SolrCloud. You don't have to use a load balancer for it to work, but if the Solr server that Nutch is using goes down, then indexing will stop unless you reconfigure Nutch or bring the Solr server back up.

Thanks,
Shawn