Just a spinoff discussion outside of JIRA:

I have noticed that few, if any, of the third-party Solr clients have 
implemented ZK / Cloud support yet.
Why? Part of the reason could be the complexity of having to talk to ZK.

What if we instead added a /solr/cluster REST API that delivers all the info a 
client needs in order to interact with Solr? Then all communication with ZK 
would be handled on the server side, placing less burden on the client.

Imagine if you could simply bootstrap the client API with the URL of one or a 
few Solr nodes, no ZK. It would first consult GET /solr/cluster/state and cache 
the result, perhaps polling for changes with a HEAD request now and then to 
stay up to date.

It would also improve security, since clients would no longer need direct 
access to the ZK servers.
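
To make the idea concrete, here is a rough client-side sketch. Everything in it 
is an assumption for illustration: the ClusterTransport interface, the 
ClusterStateClient class, and the idea that a HEAD request returns some change 
marker (an ETag, say) are all hypothetical. No /solr/cluster endpoint exists 
today.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical transport; a real one would issue HTTP GET/HEAD requests. */
interface ClusterTransport {
    /** Body of GET <nodeUrl>/cluster/state (proposed endpoint, not an existing API). */
    String getState(String nodeUrl) throws IOException;

    /** Change marker from HEAD <nodeUrl>/cluster/state, e.g. an ETag (assumption). */
    String headVersion(String nodeUrl) throws IOException;
}

/** Caches cluster state fetched from any reachable seed node; never talks to ZK. */
class ClusterStateClient {
    private final List<String> seedNodes;
    private final ClusterTransport transport;
    private String cachedState;
    private String cachedVersion;

    ClusterStateClient(List<String> seedNodes, ClusterTransport transport) {
        this.seedNodes = seedNodes;
        this.transport = transport;
    }

    /** Returns the cached state, fetching it on first use. */
    synchronized String state() throws IOException {
        if (cachedState == null) {
            refresh();
        }
        return cachedState;
    }

    /** Cheap poll: re-fetches only if the version marker changed. Returns true if refreshed. */
    synchronized boolean pollForChanges() throws IOException {
        if (cachedState != null && currentVersion().equals(cachedVersion)) {
            return false;
        }
        refresh();
        return true;
    }

    /** Asks each seed node in turn for the version marker until one answers. */
    private String currentVersion() throws IOException {
        IOException last = new IOException("no seed nodes configured");
        for (String node : seedNodes) {
            try {
                return transport.headVersion(node);
            } catch (IOException e) {
                last = e; // node unreachable, try the next seed
            }
        }
        throw last;
    }

    /** Fetches version and state from the first seed node that answers. */
    private void refresh() throws IOException {
        IOException last = new IOException("no seed nodes configured");
        for (String node : seedNodes) {
            try {
                String version = transport.headVersion(node);
                String state = transport.getState(node);
                cachedVersion = version;
                cachedState = state;
                return;
            } catch (IOException e) {
                last = e; // node unreachable, try the next seed
            }
        }
        throw last;
    }
}
```

A client would call state() before routing a request and pollForChanges() on a 
timer. Trying the seed nodes in order keeps the bootstrap working as long as 
one of them is up.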

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 21 Nov 2013, at 11:02, Noble Paul (JIRA) <[email protected]> wrote:

> 
>    [ 
> https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828662#comment-13828662
>  ] 
> 
> Noble Paul commented on SOLR-5474:
> ----------------------------------
> 
> The parent issue is to enable Solr to deal with a large number of collections, 
> and hence we are going to see an explosion in the number of nodes watching ZK. 
> Assuming that there will be many more clients than the nodes themselves, 
> * There will be too many watchers on the main clusterstate
> * If there are multiple state nodes, clients would need to watch those 
> ZK nodes as well
> 
> We want to minimize the load on ZK. Moreover, not all clients need to be 
> aware of all collections and their states. State can be fetched lazily on demand.
> 
> This will be another class extending SolrServer
> 
> {code:java}
> class LazyCloudSolrServer extends SolrServer {
>   // fetches collection state on demand instead of watching ZK
> }
> {code}
> 
>> Have a new mode for SolrJ to not watch any ZKNode
>> -------------------------------------------------
>> 
>>                Key: SOLR-5474
>>                URL: https://issues.apache.org/jira/browse/SOLR-5474
>>            Project: Solr
>>         Issue Type: Sub-task
>>         Components: SolrCloud
>>           Reporter: Noble Paul
>> 
>> In this mode SolrJ would not watch any ZK node.
>> It fetches the state on demand and caches the most recently used n 
>> collections in memory.
>> When a request comes in for a collection 
>> ‘xcoll’,
>> it would first check whether such a collection exists.
>> If yes, it first looks up the details in the local cache for that collection.
>> If not found in the cache, it fetches the node /collections/xcoll/state.json 
>> and caches the information.
>> Any query/update will be sent with an extra query param specifying the 
>> collection name, shard name, role (Leader/Replica), and range (example 
>> _target_=xcoll:shard1:L:80000000-b332ffff). A node would throw an error 
>> (INVALID_NODE) if it does not serve the collection/shard/role/range 
>> combo.
>> If SolrJ gets an INVALID_NODE error, it would invalidate the cache and fetch 
>> fresh state information for that collection (and cache it again).
>> If there is a connection timeout, SolrJ assumes the node is down, 
>> re-fetches the state for the collection, and tries again.
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.1#6144)
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
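
For reference, the lazy-fetch flow described in the quoted issue could look 
roughly like the sketch below. It is not the SOLR-5474 implementation. The 
class LazyStateCache, the InvalidNodeException type, the fetcher function 
(standing in for a read of /collections/<name>/state.json), and the "R" code 
for a replica are all assumptions; the issue only shows "L" in its example. 
The "does the collection exist" check is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical exception standing in for the INVALID_NODE error. */
class InvalidNodeException extends RuntimeException {
}

/** LRU cache of per-collection state, filled on demand (no ZK watches). */
class LazyStateCache {
    private final Map<String, String> cache;
    // Hypothetical loader: would read /collections/<name>/state.json
    private final Function<String, String> fetcher;

    LazyStateCache(final int maxCollections, Function<String, String> fetcher) {
        this.fetcher = fetcher;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxCollections; // evict the least recently used collection
            }
        };
    }

    /** Returns cached state, fetching and caching it on a miss. */
    synchronized String stateFor(String collection) {
        String state = cache.get(collection);
        if (state == null) {
            state = fetcher.apply(collection);
            cache.put(collection, state);
        }
        return state;
    }

    /** Drops the cached state so the next lookup fetches it again. */
    synchronized void invalidate(String collection) {
        cache.remove(collection);
    }

    /** Sends a request using cached state; on INVALID_NODE, re-fetches state and retries once. */
    <T> T send(String collection, Function<String, T> request) {
        try {
            return request.apply(stateFor(collection));
        } catch (InvalidNodeException e) {
            invalidate(collection);
            return request.apply(stateFor(collection));
        }
    }

    /** Builds a _target_ value shaped like the example in the issue ("R" is an assumption). */
    static String targetParam(String collection, String shard, boolean leader, String hashRange) {
        return collection + ":" + shard + ":" + (leader ? "L" : "R") + ":" + hashRange;
    }
}
```

A connection timeout could be handled the same way as INVALID_NODE in send(): 
invalidate, re-fetch, and retry.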

