[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539501#comment-15539501 ]
Susheel Kumar commented on SOLR-8146:
-------------------------------------

Thank you, Noble. I am going through the changes and will get back to you.

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> -----------------------------------------------------------------------
>
>                 Key: SOLR-8146
>                 URL: https://issues.apache.org/jira/browse/SOLR-8146
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>    Affects Versions: 5.3
>            Reporter: Arcadius Ahouansou
>         Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch,
> SOLR-8146.patch
>
>
> h2. Background
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query, then picking the
> first item from the list.
> This ticket is to allow more flexibility and, to some extent, control over
> which URLs will be picked for queries.
> Note that this is for queries only and would not affect update/delete/admin
> operations.
> h2. Implementation
> The current patch uses a regex pattern and moves to the top of the list of
> URLs only those matching the regex specified by the system property
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it might be good to have Solr nodes tagged with a string
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCloud cluster where every node hosts the same set of
> collections with:
> - multiple large SolrCloud nodes (L) used for production apps and
> - 1 small node (S) in the same cluster with less RAM/CPU, used only for
> manual user queries, data export and other production issue investigation.
> This ticket would allow configuring the applications using SolrJ to query
> only the (L) nodes.
> This use case is similar to the one described in SOLR-5501 raised by
> [~manuel lenormand]
> h3. Minimizing network traffic
>
> For simplicity, let's say that we have a SolrCloud cluster deployed on 2 (or
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client
> VMs querying Solr using SolrJ.
> All Solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1,
> and
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack reads will happen if and only if one of the racks has no
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross-DC deployment, so replace
> rack1/rack2 with DC1/DC2.
> Any comment would be much appreciated.
> Thanks.
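
For illustration, the reordering described under "Implementation" in the quoted issue (shuffle the live URLs, then move those matching the regex to the top) could be sketched roughly as below. This is not the code from SOLR-8146.patch: the class name, the order() helper and the host names are hypothetical, and using a partial match (find()) rather than a full match is an assumption. Only the system property name solr.preferredQueryNodePattern comes from the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;

// Hypothetical sketch; not the implementation in SOLR-8146.patch.
final class PreferredNodeOrdering {

    /**
     * Shuffles the live URLs, then moves the URLs matching the preferred
     * pattern ahead of the others. Non-matching URLs stay in the list so
     * they can still serve as fallbacks.
     */
    static List<String> order(List<String> liveUrls, Pattern preferred, Random random) {
        List<String> shuffled = new ArrayList<>(liveUrls);
        Collections.shuffle(shuffled, random);
        if (preferred == null) {
            return shuffled; // no preference configured: plain random order
        }
        List<String> matching = new ArrayList<>();
        List<String> others = new ArrayList<>();
        for (String url : shuffled) {
            // Assumption: a partial match anywhere in the URL counts.
            if (preferred.matcher(url).find()) {
                matching.add(url);
            } else {
                others.add(url);
            }
        }
        matching.addAll(others);
        return matching;
    }

    public static void main(String[] args) {
        // Property name taken from the issue description.
        String regex = System.getProperty("solr.preferredQueryNodePattern");
        Pattern preferred = (regex == null) ? null : Pattern.compile(regex);

        // Hypothetical node URLs for a two-rack deployment.
        List<String> liveUrls = Arrays.asList(
                "http://rack1-solr1:8983/solr",
                "http://rack2-solr1:8983/solr",
                "http://rack1-solr2:8983/solr",
                "http://rack2-solr2:8983/solr");

        System.out.println(order(liveUrls, preferred, new Random()));
    }
}
```

Running this with -Dsolr.preferredQueryNodePattern=rack1 would list the rack1 URLs first, in random order among themselves, followed by the rack2 URLs. Keeping the non-matching URLs at the tail is what would allow a cross-rack read when no preferred node is available.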