Hey Per,

A couple of things:
1. Distributed realtime get is coming. I know Yonik was looking at this recently but got caught up in some other things.

2. There is a SolrJ client that is aware of the cluster state: it's called CloudSolrServer. You give it the ZooKeeper address rather than a node's address. Currently it doesn't send directly to the leader, but this is planned. It's a little tricky due to the lack of access to the schema for hashing, but it is likely coming soon; there is a JIRA issue for it. Clients in other languages should be able to do the same thing.

- Mark

On Mar 12, 2012, at 5:26 AM, Per Steffensen wrote:

> Hi
>
> I believe Solr(Cloud) is doing some internal routing of update requests to
> make sure documents are stored in the correct core/shard decided by Solr's
> internal routing algorithm (I believe it basically finds out who is the
> leader-shard for a given document, using shared information in ZK, info about
> the collection and hash(document.id)). All nice and cool.
>
> I also believe realtime-gets are not forwarded internally in Solr through
> this routing algorithm, and that it therefore is "impossible" to do
> realtime-gets from a client, because you don't know which core/shard to
> contact directly, again because you don't know the routing algorithm. If I'm
> wrong, it would be very helpful to get a few directions on how to do
> realtime-gets from a client to a Solr server system containing many shards
> and collections. If I'm right, I think it would be very nice if the routing
> algorithm was somehow exposed to the client (in code reachable from SolrJ) so
> that you can do realtime-gets from a SolrJ-based client. Whether it should
> be done automatically for you, or whether the client using SolrJ explicitly
> needs to call some code to get info about the core to contact, is not so
> important for now.
>
> Such a solution would also make it possible to get rid of another
> performance-related "problem": most update requests have to be transported
> between JVMs twice to reach their destination. First from the client to some
> "random" Solr server, and then from this Solr server to the Solr server
> holding the core involved in the update. If routing information was available
> to the client, it could make sure to route its updates directly to the core
> involved in the update (the one currently playing the role of leader-shard
> for the shard to which the routing algorithm maps the document).
>
> ElasticSearch has a solution to this problem through the use of a "Node
> Client" (instead of just a "Transport Client"), where a node client is
> basically a real node in the system that just doesn't store documents, but
> which has all the logic and shared information, e.g. the routing algorithm,
> available -
> http://www.elasticsearch.org/guide/reference/java-api/client.html. It
> certainly doesn't have to be like that with Solr clients, but it would be nice
> if routing logic were somehow available to SolrJ so that it can send its
> updates (and realtime-gets) directly to the correct destination.
>
> Hope to get some comments on this issue.
>
> Regards, Per Steffensen
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>

- Mark Miller
lucidimagination.com
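
To illustrate the kind of client-side routing discussed in this thread, here is a minimal sketch in plain Java. It is not Solr's actual routing algorithm and uses no SolrJ classes. The ShardRouter class, the choice of CRC32 as the hash, the equal-sized hash ranges, and the leader URLs are all assumptions made for the example. It only shows the idea: hash the document id, map the hash to a shard, and send the update or realtime-get straight to that shard's leader.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

/** Hypothetical client-side router; NOT Solr's actual routing algorithm. */
final class ShardRouter {
    // One leader URL per shard, in shard order. In a real system this list
    // would come from the cluster state in ZooKeeper (assumption).
    private final List<String> leaderUrls;

    ShardRouter(List<String> leaderUrls) {
        if (leaderUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one shard leader is required");
        }
        this.leaderUrls = new ArrayList<>(leaderUrls);
    }

    /** Hashes the document id to an unsigned 32-bit value (CRC32 is an assumption). */
    static long hash(String documentId) {
        CRC32 crc = new CRC32();
        crc.update(documentId.getBytes(StandardCharsets.UTF_8));
        return crc.getValue(); // always in [0, 2^32 - 1]
    }

    /** Splits the 32-bit hash space into equal contiguous ranges, one per shard. */
    int shardIndexFor(String documentId) {
        // hash < 2^32, so (hash * n) >>> 32 is always in [0, n - 1].
        return (int) ((hash(documentId) * leaderUrls.size()) >>> 32);
    }

    /** Returns the leader that should receive requests for this document id. */
    String leaderFor(String documentId) {
        return leaderUrls.get(shardIndexFor(documentId));
    }

    public static void main(String[] args) {
        // Hypothetical leader URLs, for illustration only.
        ShardRouter router = new ShardRouter(Arrays.asList(
                "http://host1:8983/solr/collection1_shard1",
                "http://host2:8983/solr/collection1_shard2"));
        System.out.println("doc-1 -> " + router.leaderFor("doc-1"));
    }
}
```

A client would call leaderFor(id) before each update or realtime-get and talk to that URL directly, which avoids the extra hop through a "random" node. The sketch only works if the client and the servers agree on the hash function and the shard ranges, which is the schema-access difficulty mentioned above.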