Hi

I believe Solr(Cloud) does some internal routing of update requests to
make sure documents are stored in the correct core/shard, as decided by
Solr's internal routing algorithm (I believe it basically finds out who
is the leader-shard for a given document, using shared information in
ZK, info about the collection, and hash(document.id)). All nice and
cool.

I also believe realtime-gets are not forwarded internally in Solr
through this routing algorithm, and that it is therefore "impossible" to
do realtime-gets from a client: you don't know which core/shard to
contact directly, because you don't know the routing algorithm.

If I'm wrong, a few directions on how to do realtime-gets from a client
against a system of Solr servers containing many shards and collections
would be very helpful. If I'm right, I think it would be very nice if
the routing algorithm were somehow exposed to the client (in code
reachable from SolrJ), so that you can do realtime-gets from a
SolrJ-based client. Whether it is done automatically for you, or the
client using SolrJ explicitly needs to call some code to get info about
the core to contact, is not so important for now.

Such a solution would also make it possible to get rid of another
performance-related "problem": most update requests have to be
transported between JVMs twice to reach their destination, first from
the client to some "random" Solr server, and then from that Solr server
to the Solr server holding the core involved in the update. If routing
information were available to the client, it could route its updates
directly to the core involved in the update (the one currently playing
the role of leader-shard for the shard to which the routing algorithm
maps the document).
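
Roughly, the client-side decision I have in mind could look like the
sketch below. Everything in it is hypothetical: DirectTargetSketch, the
shard-to-leader-URL map (which I assume could be built from the shared
state in ZK), and the fallback URL are names I made up for
illustration, not existing SolrJ API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch only: picks the URL an update or realtime-get should be sent to. */
final class DirectTargetSketch {
    private final Function<String, String> docIdToShard; // the routing algorithm (assumed exposed)
    private final Map<String, String> shardToLeaderUrl;  // assumed to be derived from state in ZK
    private final String fallbackUrl;                     // any Solr server, i.e. today's behaviour

    DirectTargetSketch(Function<String, String> docIdToShard,
                       Map<String, String> shardToLeaderUrl,
                       String fallbackUrl) {
        this.docIdToShard = docIdToShard;
        this.shardToLeaderUrl = new HashMap<>(shardToLeaderUrl);
        this.fallbackUrl = fallbackUrl;
    }

    /** One hop when the leader is known; otherwise fall back and let Solr forward. */
    String targetFor(String docId) {
        String shard = docIdToShard.apply(docId);
        String leaderUrl = shardToLeaderUrl.get(shard);
        return leaderUrl != null ? leaderUrl : fallbackUrl;
    }
}
```

With something like this the second JVM-to-JVM hop only happens in the
fallback case, e.g. when the client does not (yet) know the leader of a
shard.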

ElasticSearch solves this problem with its "Node Client" (as opposed to
the plain "Transport Client"), where a node client is basically a real
node in the system that just doesn't store documents, but which has all
the logic and shared information, e.g. the routing algorithm,
available -
http://www.elasticsearch.org/guide/reference/java-api/client.html. It
certainly doesn't have to be like that with Solr clients, but it would
be nice if the routing logic were somehow available to SolrJ, so that it
can send its updates (and realtime-gets) directly to the correct
destination.

Hope to get some comments on this issue.

Regards, Per Steffensen