Re: Usage of CloudSolrServer?

2013-07-12 Thread Furkan KAMACI
CloudSolrServer uses LBHttpSolrServer by default. CloudSolrServer connects
to ZooKeeper and passes the live nodes to LBHttpSolrServer, which then sends
requests to those nodes in round-robin order. By the way, do you mean
leader instead of master?
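
The round-robin behaviour described above can be sketched in plain Java. This is only an illustration and does not use SolrJ: the class name RoundRobinSketch, the method pick, and the node URLs are all made up here. The real LBHttpSolrServer also keeps track of servers that stop responding, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a round-robin balancer; not SolrJ code.
final class RoundRobinSketch {
    private final List<String> liveNodes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> liveNodes) {
        this.liveNodes = liveNodes;
    }

    // Returns the next node in turn, wrapping around at the end of the list.
    String pick() {
        int i = Math.floorMod(next.getAndIncrement(), liveNodes.size());
        return liveNodes.get(i);
    }

    public static void main(String[] args) {
        // The URLs are placeholders for whatever ZooKeeper reports as live.
        RoundRobinSketch lb = new RoundRobinSketch(Arrays.asList(
                "http://host1:8983/solr", "http://host2:8983/solr"));
        for (int n = 0; n < 4; n++) {
            System.out.println(lb.pick()); // host1, host2, host1, host2
        }
    }
}
```

Because consecutive requests go to different nodes, a status request is not guaranteed to reach the node that handled the earlier indexing request.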


Re: Usage of CloudSolrServer?

2013-07-11 Thread sathish_ix
Hi,

I am using CloudSolrServer to connect to SolrCloud, and I am indexing
documents through the SolrJ API with a CloudSolrServer object. Indexing is
triggered on the master node of a collection, but when I try to check the
status of the load, the response comes from a replica, where the status is
null. How can I find out which instance CloudSolrServer is connecting to?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Usage-of-CloudSolrServer-tp4056052p4077471.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Usage of CloudSolrServer?

2013-04-16 Thread Furkan KAMACI
Hi Shawn,

I am sorry, but what kind of load balancing is that? I mean, does it check
whether some leaders are using a lot of CPU or RAM, etc.? I think a problem
may occur in a scenario like this: if some leaders receive more documents
than others (I don't know how it is decided which shard a document goes
to), then there will be a bottleneck on those leaders?
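
On the question of how a document ends up in a particular shard: as far as I know, SolrCloud 4.x hashes the document's unique key and gives each shard a slice of the hash space, so the choice does not depend on CPU or RAM load. A rough sketch of that idea follows. It is an assumption-laden illustration: Solr uses MurmurHash3, while this sketch substitutes String.hashCode(), and the class and method names are invented.

```java
// Hypothetical illustration of hash-range routing; not Solr's actual router.
final class HashRangeRoutingSketch {

    // Splits the 32-bit hash space into numShards equal ranges and returns
    // the index of the range that the document id falls into.
    static int shardFor(String docId, int numShards) {
        // Shift the signed hash into 0 .. 2^32 - 1.
        // String.hashCode() is a stand-in for MurmurHash3 (assumption).
        long hash = (long) docId.hashCode() - Integer.MIN_VALUE;
        long rangeSize = (1L << 32) / numShards;
        int index = (int) (hash / rangeSize);
        // Integer division can leave a small remainder at the very top of
        // the hash space; fold it into the last shard.
        return Math.min(index, numShards - 1);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            System.out.println(id + " -> shard " + shardFor(id, 4));
        }
    }
}
```

With a reasonably uniform hash, documents spread evenly across shards, so one leader receiving far more documents than the others would be unusual unless the ids themselves are skewed.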



Re: Usage of CloudSolrServer?

2013-04-16 Thread Upayavira
If you are accessing Solr from Java code, you will likely use the SolrJ
client to do so. If your users are hitting Solr directly, you should
think about whether this is wise - as well as providing them with direct
search access, you are also providing them with the ability to delete
your entire index with a single command.

SolrJ isn't really a load balancer as such. When SolrJ is used to make a
request against a collection, it will ask ZooKeeper for the names of the
shards that make up that collection, and for the hosts/cores that make
up the set of replicas for those shards.

It will then choose one of those hosts/cores for each shard, and send a
request to them as a distributed search request.

This has the advantage over traditional load balancing that if you bring
up a new node, that node will register itself with ZooKeeper, and thus
your SolrJ client(s) will know about it, without any intervention.

Upayavira
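
The "choose one host/core for each shard" step can be pictured with a small plain-Java sketch. It assumes a random choice and uses invented names throughout (ReplicaPickSketch, pickOnePerShard, the shard and host names); it is not taken from the SolrJ source.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration; the selection policy here is an assumption.
final class ReplicaPickSketch {

    // For every shard, picks one of its replicas at random.
    static Map<String, String> pickOnePerShard(
            Map<String, List<String>> replicasByShard, Random rnd) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : replicasByShard.entrySet()) {
            List<String> replicas = e.getValue();
            chosen.put(e.getKey(), replicas.get(rnd.nextInt(replicas.size())));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Placeholder cluster layout, as a client might read it from ZooKeeper.
        Map<String, List<String>> layout = new LinkedHashMap<>();
        layout.put("shard1", Arrays.asList("host1/core1", "host2/core1"));
        layout.put("shard2", Arrays.asList("host3/core2", "host4/core2"));
        System.out.println(pickOnePerShard(layout, new Random()));
    }
}
```

A node that registers itself later simply shows up as one more entry in the replica list the next time the layout is read, which is the advantage over a statically configured load balancer.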

On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:
 Hi Shawn;
 
 I am sorry but what kind of Load Balancing is that? I mean does it check
 whether some leaders are using much CPU or RAM etc.? I think a problem
 may
 occur at such kind of scenario: if some of leaders getting more documents
 than other leaders (I don't know how it is decided that into which shard
 a
 document will go) than there will be a bottleneck on that leader?
 
 
 2013/4/15 Shawn Heisey s...@elyograg.org
 
  On 4/15/2013 8:05 AM, Furkan KAMACI wrote:
 
  My system is as follows: I crawl data with Nutch and send them into
  SolrCloud. Users will search at Solr.
 
  What is that CloudSolrServer, should I use it for load balancing or is it
  something else different?
 
 
  It appears that the Solr integration in Nutch currently does not use
  CloudSolrServer.  There is an issue to add it.  The mutual dependency on
  HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses
  HttpClient 4.
 
  https://issues.apache.org/**jira/browse/NUTCH-1377https://issues.apache.org/jira/browse/NUTCH-1377
 
  Until that is fixed, a load balancer would be required for full redundancy
  for updates with SolrCloud.  You don't have to use a load balancer for it
  to work, but if the Solr server that Nutch is using goes down, then
  indexing will stop unless you reconfigure Nutch or bring the Solr server
  back up.
 
  Thanks,
  Shawn
 
 


Re: Usage of CloudSolrServer?

2013-04-16 Thread Furkan KAMACI
Thanks for your detailed explanation. However, you said:

"It will then choose one of those hosts/cores for each shard, and send a
request to them as a distributed search request."

Is there any document that explains distributed search? What are the
criteria for that choice?



Re: Usage of CloudSolrServer?

2013-04-16 Thread Upayavira
I cannot say that I have researched it, but I have always taken it to be
random.

Upayavira

On Tue, Apr 16, 2013, at 12:23 PM, Furkan KAMACI wrote:
 Thanks for your detailed explanation. However you said:
 
 It will then choose one of those hosts/cores for each shard, and send a
 request to them as a distributed search request. Is there any document
 that explains of distributed search? What is the criteria for it?
 
 
 2013/4/16 Upayavira u...@odoko.co.uk
 
  If you are accessing Solr from Java code, you will likely use the SolrJ
  client to do so. If your users are hitting Solr directly, you should
  think about whether this is wise - as well as providing them with direct
  search access, you are also providing them with the ability to delete
  your entire index with a single command.
 
  SolrJ isn't really a load balancer as such. When SolrJ is used to make a
  request against a collection, it will ask Zookeeper for the names of the
  shards that make up that collection, and for the hosts/cores that make
  up the set of replicas for those shards.
 
  It will then choose one of those hosts/cores for each shard, and send a
  request to them as a distributed search request.
 
  This has the advantage over traditional load balancing that if you bring
  up a new node, that node will register itself with ZooKeeper, and thus
  your SolrJ client(s) will know about it, without any intervention.
 
  Upayavira
 
  On Tue, Apr 16, 2013, at 08:36 AM, Furkan KAMACI wrote:
   Hi Shawn;
  
   I am sorry but what kind of Load Balancing is that? I mean does it check
   whether some leaders are using much CPU or RAM etc.? I think a problem
   may
   occur at such kind of scenario: if some of leaders getting more documents
   than other leaders (I don't know how it is decided that into which shard
   a
   document will go) than there will be a bottleneck on that leader?
  
  
   2013/4/15 Shawn Heisey s...@elyograg.org
  
On 4/15/2013 8:05 AM, Furkan KAMACI wrote:
   
My system is as follows: I crawl data with Nutch and send them into
SolrCloud. Users will search at Solr.
   
What is that CloudSolrServer, should I use it for load balancing or
  is it
something else different?
   
   
It appears that the Solr integration in Nutch currently does not use
CloudSolrServer.  There is an issue to add it.  The mutual dependency
  on
HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses
HttpClient 4.
   
https://issues.apache.org/**jira/browse/NUTCH-1377
  https://issues.apache.org/jira/browse/NUTCH-1377
   
Until that is fixed, a load balancer would be required for full
  redundancy
for updates with SolrCloud.  You don't have to use a load balancer for
  it
to work, but if the Solr server that Nutch is using goes down, then
indexing will stop unless you reconfigure Nutch or bring the Solr
  server
back up.
   
Thanks,
Shawn
   
   
 


Usage of CloudSolrServer?

2013-04-15 Thread Furkan KAMACI
I am reading the Lucidworks Solr Guide. The SolrCloud section says:

*Read Side Fault Tolerance*
With earlier versions of Solr, you had to set up your own load balancer.
Now each individual node load balances requests across the replicas in a
cluster. You still need a load balancer on the 'outside' that talks to the
cluster, or you need a smart client. (Solr provides a smart Java Solrj
client called CloudSolrServer.)

My system is as follows: I crawl data with Nutch and send it to SolrCloud.
Users will search against Solr.

What is CloudSolrServer? Should I use it for load balancing, or is it
something different?


Re: Usage of CloudSolrServer?

2013-04-15 Thread Shawn Heisey

On 4/15/2013 8:05 AM, Furkan KAMACI wrote:

My system is as follows: I crawl data with Nutch and send them into
SolrCloud. Users will search at Solr.

What is that CloudSolrServer, should I use it for load balancing or is it
something else different?


It appears that the Solr integration in Nutch currently does not use 
CloudSolrServer.  There is an issue to add it.  The mutual dependency on 
HttpClient is holding it up - Nutch uses HttpClient 3, SolrJ 4.x uses 
HttpClient 4.


https://issues.apache.org/jira/browse/NUTCH-1377

Until that is fixed, a load balancer would be required for full 
redundancy for updates with SolrCloud.  You don't have to use a load 
balancer for it to work, but if the Solr server that Nutch is using goes 
down, then indexing will stop unless you reconfigure Nutch or bring the 
Solr server back up.


Thanks,
Shawn
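
The redundancy point above, that indexing stops when the single configured Solr server goes down, can be illustrated with a small failover sketch in plain Java. It shows roughly what a load balancer or smart client adds: trying the next server when one fails. The names FailoverSketch and sendWithFailover and the URLs are hypothetical, and the send function is only a placeholder for a real HTTP update request.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of client-side failover; not Nutch or SolrJ code.
final class FailoverSketch {

    // Tries each URL in order and returns the first successful result.
    // Throws IllegalStateException if every server fails.
    static <T> T sendWithFailover(List<String> urls, Function<String, T> send) {
        RuntimeException last = null;
        for (String url : urls) {
            try {
                return send.apply(url);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next server
            }
        }
        throw new IllegalStateException("no Solr server reachable", last);
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://host1:8983/solr", "http://host2:8983/solr");
        // Simulated sender: host1 is treated as down, host2 accepts the update.
        String result = sendWithFailover(urls, url -> {
            if (url.contains("host1")) {
                throw new RuntimeException("connection refused");
            }
            return "indexed via " + url;
        });
        System.out.println(result); // indexed via http://host2:8983/solr
    }
}
```

With only one URL in the list, the first failure ends the loop and the exception propagates, which matches the single-server situation described above.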