proper routing (from non-Java client) in solr cloud 5.0.0
Hi all - I've just upgraded my dev install of Solr (cloud) from 4.10 to 5.0. Our client is written in Go, for which I am not aware of an existing Solr client, so we wrote our own.

One tricky bit was the routing logic: if a document has routing prefix X and belongs to collection Y, we need to know which Solr node to connect to. Previously we accomplished this by watching the clusterstate.json file in ZooKeeper - at startup and whenever it changes, the client parses the file contents to build a routing table. However, in 5.0 newly created collections do not show up in clusterstate.json but instead have their own state.json document.

Are there any recommendations for how to handle this from the client? The obvious answer is to watch every collection's state.json document, but we run a lot of collections (~1000 currently, and growing), so I'm concerned about keeping that many watches open at the same time (should I be?). How does the SolrJ client handle this?

Thanks!
- Ian
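The routing table described above can be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not anyone's actual client: the names `Route`, `BuildRoutes`, and `Lookup` are hypothetical, and the JSON field names (`shards`, `range`, `replicas`, `base_url`, `core`, `leader`) are assumed from the layout Solr writes for cluster state. Computing the document's hash from its routing prefix is left out; the sketch takes the hash as an input. Since a per-collection state.json is assumed to be keyed by collection name in the same way as clusterstate.json, the same parser should accept either document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Replica, Shard and Collection mirror only the fields this sketch needs.
type Replica struct {
	BaseURL string `json:"base_url"`
	Core    string `json:"core"`
	Leader  string `json:"leader"` // assumed to be the string "true" on the leader
}

type Shard struct {
	Range    string             `json:"range"` // e.g. "80000000-ffffffff"
	State    string             `json:"state"`
	Replicas map[string]Replica `json:"replicas"`
}

type Collection struct {
	Shards map[string]Shard `json:"shards"`
}

// Route maps one inclusive hash range to the leader core of a shard.
type Route struct {
	Min, Max  int32
	Shard     string
	LeaderURL string
}

// parseRange turns "80000000-ffffffff" into signed 32-bit bounds.
// The hex digits are the unsigned form of a signed 32-bit value.
func parseRange(s string) (int32, int32, error) {
	parts := strings.Split(s, "-")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("malformed range %q", s)
	}
	lo, err := strconv.ParseUint(parts[0], 16, 32)
	if err != nil {
		return 0, 0, err
	}
	hi, err := strconv.ParseUint(parts[1], 16, 32)
	if err != nil {
		return 0, 0, err
	}
	return int32(uint32(lo)), int32(uint32(hi)), nil
}

// BuildRoutes parses a cluster state document and returns, per collection,
// the active shards that have a hash range and a leader, sorted by Min.
func BuildRoutes(data []byte) (map[string][]Route, error) {
	var colls map[string]Collection
	if err := json.Unmarshal(data, &colls); err != nil {
		return nil, err
	}
	table := make(map[string][]Route, len(colls))
	for name, coll := range colls {
		var routes []Route
		for shardName, shard := range coll.Shards {
			if shard.Range == "" || shard.State != "active" {
				continue // no hash range, or shard not serving
			}
			lo, hi, err := parseRange(shard.Range)
			if err != nil {
				return nil, fmt.Errorf("%s/%s: %v", name, shardName, err)
			}
			for _, r := range shard.Replicas {
				if r.Leader == "true" {
					routes = append(routes, Route{
						Min: lo, Max: hi, Shard: shardName,
						LeaderURL: r.BaseURL + "/" + r.Core,
					})
					break
				}
			}
		}
		sort.Slice(routes, func(i, j int) bool { return routes[i].Min < routes[j].Min })
		table[name] = routes
	}
	return table, nil
}

// Lookup returns the route whose range contains hash, if any.
func Lookup(routes []Route, hash int32) (Route, bool) {
	for _, r := range routes {
		if hash >= r.Min && hash <= r.Max {
			return r, true
		}
	}
	return Route{}, false
}
```

A client would call `BuildRoutes` on the bytes read from ZooKeeper each time the document changes, then call `Lookup(table["Y"], hash)` per document. Keeping one small table per collection means a change to one state.json only rebuilds that collection's entry.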
Re: proper routing (from non-Java client) in solr cloud 5.0.0
Hi Ian,

As I understand it, SolrJ does not use ZooKeeper watches for this but instead caches the collection state information (along with a TTL). You can find more information here:

https://issues.apache.org/jira/browse/SOLR-5473
https://issues.apache.org/jira/browse/SOLR-5474

Regards
Hrishikesh

On Tue, Apr 14, 2015 at 8:49 AM, Ian Rose ianr...@fullstory.com wrote:
> How does the SolrJ client handle this?
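The cache-with-TTL idea mentioned in this reply can be sketched in Go as follows. This illustrates the general technique only and is not SolrJ's code: `StateCache`, `NewStateCache`, `Invalidate`, and the `fetch` callback (which would read one collection's state.json from ZooKeeper) are hypothetical names introduced for the example.

```go
package main

import (
	"sync"
	"time"
)

// entry is one cached state document plus the time it was fetched.
type entry struct {
	data    []byte
	fetched time.Time
}

// StateCache keeps per-collection state for at most ttl, so no ZooKeeper
// watch has to stay open for collections that are rarely used.
type StateCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	fetch   func(collection string) ([]byte, error) // hypothetical loader
	entries map[string]entry
}

func NewStateCache(ttl time.Duration, fetch func(string) ([]byte, error)) *StateCache {
	return &StateCache{ttl: ttl, fetch: fetch, entries: make(map[string]entry)}
}

// Get returns the cached state if it is younger than ttl and otherwise
// calls fetch. The lock is held during fetch, which keeps the sketch short
// but serializes concurrent loads.
func (c *StateCache) Get(collection string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[collection]; ok && time.Since(e.fetched) < c.ttl {
		return e.data, nil
	}
	data, err := c.fetch(collection)
	if err != nil {
		return nil, err
	}
	c.entries[collection] = entry{data: data, fetched: time.Now()}
	return data, nil
}

// Invalidate drops a cached entry, for example after a request was sent to
// a node that no longer hosts the expected shard.
func (c *StateCache) Invalidate(collection string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, collection)
}
```

The trade-off compared with watches is staleness: a routing change may go unnoticed for up to one TTL, so a client using this approach would presumably call `Invalidate` and retry when a request lands on the wrong node.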
Re: proper routing (from non-Java client) in solr cloud 5.0.0
Hi Hrishikesh,

Thanks for the pointers - I had not looked at SOLR-5474 (https://issues.apache.org/jira/browse/SOLR-5474) previously. Interesting approach...

I think we will stick with trying to keep ZooKeeper watches open from all clients to all collections for now, but if that starts to be a bottleneck it's good to know the route that SolrJ has chosen...

cheers,
Ian

On Tue, Apr 14, 2015 at 3:56 PM, Hrishikesh Gadre gadre.s...@gmail.com wrote:
> As per my understanding, Solrj does not use Zookeeper watches but instead caches the information (along with a TTL).