Re: Exposing Solr routing to SolrJ client

Mark Miller Mon, 12 Mar 2012 05:15:50 -0700

Hey Per,

A couple things:


1. Distributed realtime get is coming - I know Yonik was looking at this 
recently but got caught up in some other things.

2. There is a SolrJ client that is aware of the cluster state - it's called 
CloudSolrServer. You give it the ZooKeeper address rather than a node's 
address. Currently it doesn't send directly to the leader, but this is planned 
- it's a little tricky due to lack of access to the schema for hashing, but 
likely coming soon - there is a JIRA issue for it. Clients in other languages 
should be able to do the same thing.
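
To make the hashing part of point 2 concrete, here is a minimal sketch of what 
client-side routing could look like: hash the document id, find the shard 
whose hash range covers it, and send to that shard's leader. Everything in it 
is an assumption for illustration - the class and method names are 
hypothetical, CRC32 is only a stand-in hash, and the even split of the hash 
space is not necessarily how Solr assigns ranges. A real client would read the 
shard ranges and leader URLs from the cluster state in ZooKeeper.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch of client-side routing; not Solr's actual implementation.
class HashRoutingSketch {

    /** One shard: an inclusive hash range and the URL of its current leader. */
    static final class Shard {
        final String name;
        final long min;
        final long max;
        final String leaderUrl;

        Shard(String name, long min, long max, String leaderUrl) {
            this.name = name;
            this.min = min;
            this.max = max;
            this.leaderUrl = leaderUrl;
        }
    }

    /** Stand-in hash (assumption): CRC32 of the UTF-8 id, in [0, 2^32 - 1]. */
    static long hash(String docId) {
        CRC32 crc = new CRC32();
        crc.update(docId.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Splits the 32-bit hash space into contiguous ranges, one per leader. */
    static List<Shard> splitEvenly(List<String> leaderUrls) {
        if (leaderUrls.isEmpty()) {
            throw new IllegalArgumentException("need at least one leader");
        }
        final long space = 1L << 32;
        final int n = leaderUrls.size();
        final long step = space / n;
        List<Shard> shards = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            long min = i * step;
            // The last shard absorbs any remainder of the hash space.
            long max = (i == n - 1) ? space - 1 : (i + 1) * step - 1;
            shards.add(new Shard("shard" + (i + 1), min, max, leaderUrls.get(i)));
        }
        return shards;
    }

    /** Returns the shard whose range covers hash(docId). */
    static Shard route(String docId, List<Shard> shards) {
        long h = hash(docId);
        for (Shard s : shards) {
            if (h >= s.min && h <= s.max) {
                return s;
            }
        }
        throw new IllegalStateException("no shard covers hash " + h);
    }

    public static void main(String[] args) {
        // Placeholder leader URLs; a real client would look these up in ZooKeeper.
        List<Shard> shards = splitEvenly(Arrays.asList(
                "http://host1:8983/solr", "http://host2:8983/solr"));
        Shard target = route("doc-42", shards);
        System.out.println("doc-42 -> " + target.name + " at " + target.leaderUrl);
    }
}
```

The same lookup would serve both updates and realtime gets, since both only 
need the document id to find the right leader.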

- Mark

On Mar 12, 2012, at 5:26 AM, Per Steffensen wrote:

> Hi
> 
> I believe Solr(Cloud) is doing some internal routing of update-requests to 
> make sure documents are stored in the correct core/shard decided by Solr's 
> internal routing algorithm (I believe it basically finds out who is the 
> leader-shard for a given document, using shared information in ZK, info about 
> the collection and hash(document.id)). All nice and cool.
> 
> I also believe realtime-gets are not forwarded internally in Solr through 
> this routing algorithm, and that it is therefore "impossible" to do 
> realtime-gets from a client, because you don't know which core/shard to 
> contact directly, again because you don't know the routing algorithm. If I'm 
> wrong, it would be very helpful to get a few directions on how to do 
> realtime-gets from a client to a system of Solr servers containing many 
> shards and collections. If I'm right, I think it would be very nice if the 
> routing algorithm was somehow exposed to the client (in code reachable from 
> SolrJ) so that you can do realtime-gets from a SolrJ-based client. Whether it 
> is done automatically for you, or the client using SolrJ explicitly needs to 
> call some code to get info about the core to contact, is not so important 
> for now.
> 
> Such a solution would also make it possible to get rid of another 
> performance-related "problem": most update-requests have to be transported 
> among JVMs twice to reach their destination. First from client to some 
> "random" Solr server, and then from this Solr server to the Solr server 
> holding the core involved in the update. If routing information was 
> available to the client, it could route its updates directly to the core 
> involved in the update (the one currently playing the role of leader-shard 
> for the shard to which the routing algorithm maps the document).
> 
> ElasticSearch has a solution to this problem through the use of a "Node 
> Client" (instead of just a "Transport Client"), where a node client is 
> basically a real node in the system that just doesn't store documents, but 
> which has all the logic and shared information, e.g. the routing algorithm, 
> available - 
> http://www.elasticsearch.org/guide/reference/java-api/client.html. It 
> certainly doesn't have to be like that with Solr clients, but it would be 
> nice if routing logic were somehow available to SolrJ so that it can send 
> its updates (and realtime-gets) directly to the correct destination.
> 
> Hope to get some comments on this issue.
> 
> Regards, Per Steffensen
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 

- Mark Miller
lucidimagination.com