[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505333#comment-16505333 ]

Tomás Fernández Löbbe commented on SOLR-8146:
---------------------------------------------

I just linked some related Jiras: SOLR-11982 adds support for indicating preferred replicas, although this only works on the server side (meaning, when working with multiple shards). SOLR-12217 is the follow-up code to make it work on the client side too.

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> -----------------------------------------------------------------------
>
>                 Key: SOLR-8146
>                 URL: https://issues.apache.org/jira/browse/SOLR-8146
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>    Affects Versions: 5.3
>            Reporter: Arcadius Ahouansou
>            Priority: Major
>         Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Background
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query, then picking the first item from the list.
> This ticket is to allow more flexibility and, to some extent, control over which URLs are picked for queries.
> Note that this is for queries only and would not affect update/delete/admin operations.
> h2. Implementation
> The current patch uses a regex pattern: it moves to the top of the list of URLs only those matching the regex specified by the system property
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCloud cluster where every node hosts the same set of collections, with:
> - multiple large SolrCloud nodes (L) used for production apps and
> - 1 small node (S) in the same cluster, with less RAM/CPU, used only for manual user queries, data export and other production issue investigation.
> This ticket would allow configuring the applications using SolrJ to query only the (L) nodes.
> This use case is similar to the one described in SOLR-5501 raised by [~manuel lenormand]
> h3. Minimizing network traffic
>
> For simplicity, let's say that we have a SolrCloud cluster deployed on 2 (or N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client VMs querying Solr using SolrJ.
> All Solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, and
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack reads will happen if and only if one of the racks has no available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross-DC deployment. So, replace rack1/rack2 by DC1/DC2.
> Any comment would be much appreciated.
> Thanks.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
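
The reordering described under "Implementation" in the quoted issue above (shuffle the live URLs, then move those matching a regex to the top) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the attached SOLR-8146.patch: the class name `PreferredUrlOrdering`, the method `order`, the sample URLs and the default pattern `rack1` are all made up; only the property name `solr.preferredQueryNodePattern` comes from the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of SolrJ or of the attached patch. */
class PreferredUrlOrdering {

    /**
     * Shuffles a copy of the live URLs, then moves the URLs that match the
     * preferred pattern to the front. Non-matching URLs stay at the end of
     * the list, so they remain available as a fallback.
     */
    static List<String> order(List<String> liveUrls, Pattern preferred, Random random) {
        List<String> shuffled = new ArrayList<>(liveUrls);
        Collections.shuffle(shuffled, random);
        if (preferred == null) {
            return shuffled; // no preference configured: plain random order
        }
        List<String> front = new ArrayList<>();
        List<String> back = new ArrayList<>();
        for (String url : shuffled) {
            if (preferred.matcher(url).find()) {
                front.add(url);
            } else {
                back.add(url);
            }
        }
        front.addAll(back);
        return front;
    }

    public static void main(String[] args) {
        // Property name from the issue; the default value "rack1" is an assumption.
        String regex = System.getProperty("solr.preferredQueryNodePattern", "rack1");
        List<String> urls = Arrays.asList(
                "http://rack2-a:8983/solr/c1",
                "http://rack1-a:8983/solr/c1",
                "http://rack2-b:8983/solr/c1",
                "http://rack1-b:8983/solr/c1");
        // The rack1 URLs are printed first, in random order among themselves.
        System.out.println(order(urls, Pattern.compile(regex), new Random()));
    }
}
```

Because the non-matching URLs are kept at the end rather than dropped, a client that walks the list in order would still reach another rack when no preferred node is available, which is the cross-rack fallback the issue asks for.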
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499896#comment-16499896 ]

Edwin Yeo Zheng Lin commented on SOLR-8146:
-------------------------------------------

Hi,

I would like to check: are there other ways to achieve this preferred replica for query/read in CloudSolrClient, or is it still pending implementation?

This feature would be useful, since it would let us use a different replica for query/read than the one used for indexing, to maximize the performance of both.

Edwin
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15844065#comment-15844065 ]

ASF GitHub Bot commented on SOLR-8146:
--------------------------------------

GitHub user susheelks opened a pull request:

    https://github.com/apache/lucene-solr/pull/147

    SOLR-8146 decouple building url list from CloudSolrClient to separate class for…

    I am suggesting decoupling the building of the URL list from CloudSolrClient.sendRequest(..) into a separate class. The advantage is that the URL-list-building part can be unit tested easily. As we implement more routingRules for querying (for example, query only replicas on the same rack, or query only replicas whose mem/cpu/disk utilisation is below a threshold), those can be unit tested easily as well.

    I can add more tests if the approach looks good. Please review.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/susheelks/lucene-solr SOLR-8146

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/147.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #147

commit 852aa685cc626a4a03c649895ae5ccbfb0008887
Author: Susheel Kumar
Date:   2017-01-28T13:33:53Z

    decouple building url list from CloudSolrClient to separate class for better testability
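
The decoupling proposed in the pull request comment above could be sketched as a small standalone class that takes routing rules and a list of live URLs and returns the ordered list, so that it can be tested without a running cluster. This is a guess at the shape of such a class, not the code from the pull request: `QueryUrlListBuilder`, `addRule`, `build`, the sample URLs and the load figures are all hypothetical, and each routing rule is modelled simply as a `Predicate<String>` over the URL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical builder for illustration; names are not taken from the pull request. */
class QueryUrlListBuilder {
    private final List<Predicate<String>> rules = new ArrayList<>();

    /** Rules added earlier have higher priority. */
    QueryUrlListBuilder addRule(Predicate<String> rule) {
        rules.add(rule);
        return this;
    }

    /**
     * Returns the live URLs ordered by the first rule they satisfy.
     * URLs that satisfy no rule are kept at the end as a fallback.
     */
    List<String> build(List<String> liveUrls) {
        List<String> remaining = new ArrayList<>(liveUrls);
        List<String> ordered = new ArrayList<>();
        for (Predicate<String> rule : rules) {
            for (Iterator<String> it = remaining.iterator(); it.hasNext();) {
                String url = it.next();
                if (rule.test(url)) {
                    ordered.add(url);
                    it.remove();
                }
            }
        }
        ordered.addAll(remaining);
        return ordered;
    }

    public static void main(String[] args) {
        // Made-up CPU load per node, standing in for a real utilisation source.
        Map<String, Double> cpuLoad = new HashMap<>();
        cpuLoad.put("http://dc2-a:8983/solr", 0.9);
        cpuLoad.put("http://dc2-b:8983/solr", 0.2);

        List<String> ordered = new QueryUrlListBuilder()
                .addRule(url -> url.contains("dc1"))                      // same data centre first
                .addRule(url -> cpuLoad.getOrDefault(url, 1.0) < 0.5)     // then lightly loaded nodes
                .build(Arrays.asList(
                        "http://dc1-a:8983/solr",
                        "http://dc2-a:8983/solr",
                        "http://dc1-b:8983/solr",
                        "http://dc2-b:8983/solr"));
        System.out.println(ordered);
    }
}
```

Keeping the rules as plain predicates means a unit test only needs a list of strings and a few lambdas, which is the testability benefit the comment argues for.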
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614108#comment-15614108 ]

ASF subversion and git services commented on SOLR-8146:
-------------------------------------------------------

Commit 591984cd0325c387e3b4976e5236eb7c7cd1e93e in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=591984c ]

SOLR-8146: removing the unused class
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614098#comment-15614098 ]

ASF subversion and git services commented on SOLR-8146:
-------------------------------------------------------

Commit e6ce903a76b2fd6bb28dc76805add6b37a7814eb in lucene-solr's branch refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e6ce903 ]

SOLR-8146: removing the unused class
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614085#comment-15614085 ]

Noble Paul commented on SOLR-8146:
----------------------------------

What was done is just a refactoring so that this can be used in SolrJ, so there is no need to mention it in CHANGES.txt.

Yes, there is an unused Snitch class that needs to go.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614065#comment-15614065 ]

Shalin Shekhar Mangar commented on SOLR-8146:
---------------------------------------------

What is the status of this issue? It seems like some code has been committed, but there is no mention of it in CHANGES.txt.

Also, we now have two Snitch classes on master (I did not check the other branches): one in org.apache.solr.cloud.rule.Snitch (which seems to be unused) and another in org.apache.solr.common.cloud.rule.Snitch. Can we please clean this up?
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15558228#comment-15558228 ]

ASF GitHub Bot commented on SOLR-8146:
--------------------------------------

Github user susheelks commented on the issue:

    https://github.com/apache/lucene-solr/pull/66

    Merged routingRule changes after latest refactored code
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539501#comment-15539501 ]

Susheel Kumar commented on SOLR-8146:
-------------------------------------

Thank you, Noble. I am going through the changes and will get back to you.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535876#comment-15535876 ]

Noble Paul commented on SOLR-8146:

[~susheel2...@gmail.com] You can take up my patch and continue with that.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532902#comment-15532902 ]

ASF subversion and git services commented on SOLR-8146:

Commit 063d624cdcf73e0eeb3c11487a76d4c3de7f40dc in lucene-solr's branch refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=063d624 ]

SOLR-8146: refactored the replica rules classes so that they can be accessed from SolrJ
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532879#comment-15532879 ]

ASF subversion and git services commented on SOLR-8146:

Commit 3ab22f6e3a21e748220946aed7bac9bce3c9b332 in lucene-solr's branch refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3ab22f6 ]

SOLR-8146: refactored the replica rules classes so that they can be accessed from SolrJ
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527951#comment-15527951 ]

Susheel Kumar commented on SOLR-8146:

Hello Noble,

Can you please review the pull request and provide feedback on the approach of implementing routingRule, so that I can move forward with it accordingly?

Thanks,
Susheel
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414569#comment-15414569 ]

ASF GitHub Bot commented on SOLR-8146:

GitHub user susheelks opened a pull request:

    https://github.com/apache/lucene-solr/pull/66

    SOLR-8146: Allowing SolrJ CloudSolrClient to have preferred replica for query/read

    This pull request is to get feedback on the approach of implementing routingRule. The unit test is not ready yet: I am facing challenges in how to mock or inject dependencies to simulate a cluster of machines with different IP addresses, in which only the matching ones get added to urlList, which is ultimately passed to LBHttpSolrClient.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/susheelks/lucene-solr SOLR-8146

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/66.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #66

commit e761d7c2e1ecf6ce37eb43bc764897fbed8cdc4e
Author: Kumar, Susheel (CORP)
Date: 2016-08-10T00:42:45Z

    changes for limiting query to shard matching routing rule
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382761#comment-15382761 ]

Susheel Kumar commented on SOLR-8146:

Thanks, Paul. I prefer the routingRule terminology to preferredNodes. The current rules such as cores, freeDisk, host, etc. don't include "rule" in their names, so I wanted to double-check that the name "routingRule" is okay. There is also a similarly named parameter, _route_, for routing keys: https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options. I hope these names all fit together without ambiguity.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382424#comment-15382424 ]

Noble Paul commented on SOLR-8146:

It should also be honored by SolrJ. SolrJ should look for the parameters and identify the nodes right there itself. If the request does reach a node, the first server to receive it should route the request to the right nodes.

The params could be as follows:
{code}
routingRule=ip_1:192&routingRule=ip_2:93
{code}
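A rough sketch of how a client could interpret the routingRule values suggested above. Nothing here is taken from SolrJ or from the patch: the class and method names are hypothetical, only ip_N tags are handled, the fragment numbering (ip_1 as the last octet of the address, ip_4 as the first) is an assumption about how the snitch numbers IP fragments, and requiring every rule to match is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class and method names are not from SolrJ.
class RoutingRuleSketch {

    /** Parses values such as "ip_1:192" into a tag -> expected-value map. */
    static Map<String, String> parse(List<String> routingRuleValues) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String value : routingRuleValues) {
            int colon = value.indexOf(':');
            if (colon <= 0 || colon == value.length() - 1) {
                throw new IllegalArgumentException("Expected tag:value but got " + value);
            }
            rules.put(value.substring(0, colon), value.substring(colon + 1));
        }
        return rules;
    }

    /** Assumption: ip_1 is the last octet of the IPv4 address and ip_4 the first. */
    static String fragment(String ipv4, String tag) {
        if (!tag.startsWith("ip_")) {
            throw new IllegalArgumentException("Only ip_N tags are handled in this sketch: " + tag);
        }
        String[] octets = ipv4.split("\\.");
        int n = Integer.parseInt(tag.substring("ip_".length()));
        return octets[octets.length - n];
    }

    /** Assumption: a node qualifies only if it satisfies every rule. */
    static boolean matches(String ipv4, Map<String, String> rules) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (!rule.getValue().equals(fragment(ipv4, rule.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Keeps only the node addresses that satisfy the rules. */
    static List<String> select(List<String> nodeIps, Map<String, String> rules) {
        List<String> selected = new ArrayList<>();
        for (String ip : nodeIps) {
            if (matches(ip, rules)) {
                selected.add(ip);
            }
        }
        return selected;
    }
}
```

With the two example values, a node at the made-up address 10.0.93.192 would be selected under these assumptions, while 10.0.94.192 would not.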
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382390#comment-15382390 ]

Susheel Kumar commented on SOLR-8146:

Thanks, Noble and Arcadius, for clarifying the status of SOLR-8146.

Hello Noble, I can start working on the patch. I have a question to clarify:

1. In the multi-data-center scenario, the preferredNodes rule may specify different values or ranges depending on which data center the SolrJ client is querying from. So do you see the preferredNodes rule being used during a query operation, like http://localhost:8983/solr/collection1/select?rule=preferredNodes=ip_1:192,ip_2:93

The current Snitches design/implementation is only used in the Admin Collections API (https://cwiki.apache.org/confluence/display/solr/Collections+API) for replica placement, so this would be another usage of Snitches, extending them to query operations.

Thanks,
Susheel
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381513#comment-15381513 ]

Arcadius Ahouansou commented on SOLR-8146:

Hello Susheel.

This ticket is not fully implemented yet. The attached patch is the very first version. It does work, but it relies on passing a regex as a startup parameter to the SolrJ client in the format
{code}
-Dsolr.preferredQueryNodePattern=SOME_REGEX_MATCHING_A_SET_OF_SOLR_NODES
{code}

This approach worked well for us, but:
- it does not look very elegant, and
- it does not integrate well into the current code base.

So a better way to do this is to use the snitch.

Unfortunately, due to changes in priority, I was not able to come back to finish this work.
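For readers who have not opened the patch, the regex approach described in this ticket (shuffle the live URLs, then move those matching the pattern to the top) could look roughly like the following. This is a minimal sketch and not the code from the attached patch: the class and method names are hypothetical, and only the system property name solr.preferredQueryNodePattern comes from the ticket.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;

// Hypothetical helper, not the code from the attached patch.
class PreferredNodeOrdering {

    /**
     * Returns a shuffled copy of urls in which every URL matching the
     * preferred pattern comes before every URL that does not.
     * A null pattern leaves the list simply shuffled.
     */
    static List<String> order(List<String> urls, Pattern preferred, Random random) {
        List<String> shuffled = new ArrayList<>(urls);
        Collections.shuffle(shuffled, random); // keep the existing random spread

        List<String> matching = new ArrayList<>();
        List<String> others = new ArrayList<>();
        for (String url : shuffled) {
            if (preferred != null && preferred.matcher(url).find()) {
                matching.add(url);
            } else {
                others.add(url);
            }
        }
        matching.addAll(others); // non-matching nodes stay available as fallback
        return matching;
    }

    /** Reads the pattern from the system property named in this ticket; null if unset. */
    static Pattern patternFromSystemProperty() {
        String regex = System.getProperty("solr.preferredQueryNodePattern");
        return (regex == null || regex.isEmpty()) ? null : Pattern.compile(regex);
    }
}
```

Because non-matching URLs are kept at the end of the list rather than dropped, a cross-rack or cross-DC read would still be possible when no preferred node is available, which is the fallback behaviour the ticket description asks for.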
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364991#comment-15364991 ]

Susheel Kumar commented on SOLR-8146:

Hello Noble, Arcadius,
Can you please describe exactly how ImplicitSnitch can be used for preferredNodes, and whether anything needs to be done on the SolrJ client to use preferredNodes when querying replicas? I have created a JIRA, https://issues.apache.org/jira/browse/SOLR-9283, to document the exact steps and details for anyone to refer to.
Thanks, Susheel
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090508#comment-15090508 ]

Arcadius Ahouansou commented on SOLR-8146:

The new ticket is SOLR-8522
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087282#comment-15087282 ]

Noble Paul commented on SOLR-8146:

Yeah, open a ticket for enhancing the snitch
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087281#comment-15087281 ]

Arcadius Ahouansou commented on SOLR-8146:

Hello [~noble.paul]
I have been working on support for IP tags such as {{ip_1}}, {{ip_2}}, {{ip_3}} and {{ip_4}} in {{ImplicitSnitch}}. Support for IPv6 is not yet implemented, but I do have some failing tests for it.
Is it OK for me to create a separate ticket/card for the IP tag support?
Thanks.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081176#comment-15081176 ]

Noble Paul commented on SOLR-8146:

It's OK. The tag names should make sense, that is all. Using DC or rack does not necessarily make sense in all cases.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081175#comment-15081175 ]

Arcadius Ahouansou commented on SOLR-8146:

Thank you very much [~noble.paul] for the clarification. Looking at SOLR-6289, maybe there is an overlap between {{ip_2}} vs {{dc}} and {{ip_3}} vs {{rack}}?
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081120#comment-15081120 ]

Noble Paul commented on SOLR-8146:

bq. in order to use the preferredNodes snitch, one will have to add that snitch to the collection. Is this correct?

Well, no. The implicit snitches are available to all collections. A snitch just has to declare that it can provide values for a particular tag.
Using regex is not really possible in the current design. It is only possible to provide discrete values or ranges.
Let's assume an IP address 192.93.255.255. It is possible for a snitch to provide values such as
ip_1 = 192
ip_2 = 93
ip_3 = 255
ip_4 = 255
In this case you can provide a rule which says {{preferredNodes=ip_1:192,ip_2:93}}. This means it will choose only nodes {{192.93.\*.\*}}.
This can be a part of the {{ImplicitSnitch}} itself. The {{ImplicitSnitch}} can provide values for the tags {{ip_1}}, {{ip_2}}, {{ip_3}}, {{ip_4}}, and for IPv6 it can provide values for {{ip_5}} and {{ip_6}} as well.
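To make the tag idea above concrete, here is a minimal sketch of how an IPv4 address could be split into {{ip_1}} .. {{ip_4}} values and checked against a rule such as {{ip_1:192,ip_2:93}}. It is not the {{ImplicitSnitch}} code: the class {{IpTagSketch}} and its methods are hypothetical, the rule parsing is simplified to exact {{tag:value}} pairs, and the tag numbering follows the example in the comment above (the numbering in any shipped implementation may differ).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the ImplicitSnitch implementation.
class IpTagSketch {

    // Splits a dotted IPv4 address into ip_1..ip_4, numbered left to right
    // as in the comment's example (192.93.255.255 gives ip_1 = 192).
    static Map<String, String> ipTags(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        Map<String, String> tags = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            tags.put("ip_" + (i + 1), parts[i]);
        }
        return tags;
    }

    // Returns true when every "tag:value" clause of the rule equals the node's tag.
    // Only discrete values are handled here; ranges are left out of the sketch.
    static boolean matches(Map<String, String> tags, String rule) {
        for (String clause : rule.split(",")) {
            String[] kv = clause.trim().split(":", 2);
            if (kv.length != 2 || !kv[1].equals(tags.get(kv[0]))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> tags = ipTags("192.93.255.255");
        System.out.println(tags);
        System.out.println(matches(tags, "ip_1:192,ip_2:93"));
    }
}
```

Because each clause is an exact comparison on one tag, the rule selects a whole address block (here 192.93.\*.\*) without needing a regex, which is the point made in the comment.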
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080997#comment-15080997 ]

Arcadius Ahouansou commented on SOLR-8146:

Hello [~noble.paul]
Thank you very much for your suggestions.
Regarding {{preferredNodes=hostPattern:}}: if I understand correctly (and correct me if I am wrong), in order to use the preferredNodes snitch, one will have to add that snitch to the collection. Is this correct?
With the current implementation, there is no change at all on the SolrCloud server or collection. All the configuration is on the SolrJ client. This is on purpose, because it is the SolrJ client that needs to choose its preferred servers.
Ideally, with the use of a snitch, we would like to let the client make this choice without having to add anything to the server or collection. How can this be achieved? Any hint will be appreciated.
Thank you very much [~noble.paul]
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066409#comment-15066409 ]

Noble Paul commented on SOLR-8146:

Actually, you can write a snitch which uses a regex predicate, as in
{code} preferredNodes=hostPattern: {code}
As the snitch does not need any extra params, just add it to the list of well-known snitches.
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064862#comment-15064862 ] Arcadius Ahouansou commented on SOLR-8146: -- Thank you very much [~mikemccand] for your help!
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064219#comment-15064219 ] Arcadius Ahouansou commented on SOLR-8146: -- Thank you very much [~noble.paul]. I will have a look into {{snitch}}
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063936#comment-15063936 ] Noble Paul commented on SOLR-8146: -- I see that a regex is used for expressing the affinity. I would rather have something like the replica placement rules and piggyback on the same syntax. Example: {code} preferredNodes=host: {code} You can implement new snitches such as DCAwareSnitch or RackAwareSnitch, add them to the patch, and use rules like
{code}
preferredNodes=dc:DC2
preferredNodes=rack:RACK3
{code}
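To make the rule idea above more concrete, here is a minimal sketch of how a {{tag:value}} rule such as {{dc:DC2}} might be parsed and applied on the client side. Everything in it is an assumption made for illustration: the class `PreferredNodesRule`, its methods, and the per-node tag map (standing in for whatever a snitch would report) are hypothetical and do not come from Solr, SolrJ, or the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a "tag:value" preference rule; not Solr's rule engine.
final class PreferredNodesRule {
    final String tag;
    final String value;

    private PreferredNodesRule(String tag, String value) {
        this.tag = tag;
        this.value = value;
    }

    // Parses a rule of the form "tag:value", for example "dc:DC2" or "rack:RACK3".
    static PreferredNodesRule parse(String rule) {
        int colon = rule.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Expected tag:value, got: " + rule);
        }
        String tag = rule.substring(0, colon).trim();
        String value = rule.substring(colon + 1).trim();
        if (tag.isEmpty() || value.isEmpty()) {
            throw new IllegalArgumentException("Expected tag:value, got: " + rule);
        }
        return new PreferredNodesRule(tag, value);
    }

    // True when the node's tags (assumed to come from a snitch) satisfy the rule.
    boolean matches(Map<String, String> nodeTags) {
        return value.equals(nodeTags.get(tag));
    }

    // Returns node names with matching nodes first; the others follow as a fallback.
    static List<String> order(Map<String, Map<String, String>> tagsByNode, PreferredNodesRule rule) {
        List<String> preferred = new ArrayList<>();
        List<String> others = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : tagsByNode.entrySet()) {
            if (rule.matches(entry.getValue())) {
                preferred.add(entry.getKey());
            } else {
                others.add(entry.getKey());
            }
        }
        preferred.addAll(others);
        return preferred;
    }

    public static void main(String[] args) {
        // Node names and tag values are made up for the demonstration.
        Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
        nodes.put("node1", Collections.singletonMap("dc", "DC1"));
        nodes.put("node2", Collections.singletonMap("dc", "DC2"));
        System.out.println(order(nodes, parse("dc:DC2")));
    }
}
```

Keeping the non-matching nodes at the end of the list, rather than dropping them, fits the use case in the issue description, where cross-rack or cross-DC reads should still happen when no preferred node is available.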
[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read
[ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1538#comment-1538 ] Arcadius Ahouansou commented on SOLR-8146: -- Hello [~elyograg], I thought the initial issue description may have been misleading and unclear, so I have re-edited it. Please let me know if you have any questions. Thanks.