[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2018-06-07 Thread JIRA


[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505333#comment-16505333
 ] 

Tomás Fernández Löbbe commented on SOLR-8146:
-

I just related some related Jiras: SOLR-11982 adds support for indicating 
preferred replicas, although this only works on the server side (meaning, when 
working with multiple shards). SOLR-12217 is the follow up code to make it work 
on the client side too

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
>Priority: Major
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2018-06-04 Thread Edwin Yeo Zheng Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499896#comment-16499896
 ] 

Edwin Yeo Zheng Lin commented on SOLR-8146:
---

Hi,

Would like to check, is there other ways which we can achieve this preferred 
replica for query/read in CloudSolrClient, or is it still pending 
implementation?

This feature is good, so that we can use a different replica for query/read and 
the replica used for indexing, to maximize the performance of both.

Edwin
h1.  

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
>Priority: Major
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2017-01-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15844065#comment-15844065
 ] 

ASF GitHub Bot commented on SOLR-8146:
--

GitHub user susheelks opened a pull request:

https://github.com/apache/lucene-solr/pull/147

SOLR-8146 decouple building url list from CloudSolrClient to separate class 
for…

I am suggesting to decouple building the url list from 
CloudSolrClient.sendRequest(..) to a separate class.  The advantage will be the 
ability to easily write unit test for building the url list part and as we 
implement more routingRules for querying like query only the same rack 
replica's / OR query replica's where mem/cpu/disk utilisation is below a 
threshold can be easily unit tested etc.

I can add more tests if approach looks good. Please review.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/susheelks/lucene-solr SOLR-8146

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #147


commit 852aa685cc626a4a03c649895ae5ccbfb0008887
Author: Susheel Kumar 
Date:   2017-01-28T13:33:53Z

decouple building url list from CloudSolrClient to separate class for 
better testability




> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-10-27 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614108#comment-15614108
 ] 

ASF subversion and git services commented on SOLR-8146:
---

Commit 591984cd0325c387e3b4976e5236eb7c7cd1e93e in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=591984c ]

SOLR-8146: removing the unused class


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-10-27 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614098#comment-15614098
 ] 

ASF subversion and git services commented on SOLR-8146:
---

Commit e6ce903a76b2fd6bb28dc76805add6b37a7814eb in lucene-solr's branch 
refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e6ce903 ]

SOLR-8146: removing the unused class


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-10-27 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614085#comment-15614085
 ] 

Noble Paul commented on SOLR-8146:
--

What is done is just a refactoring so that this can be used in solrj. So, no 
need to mention it in changes.txt.  yeah there is an unused snitch class that 
needs to go

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-10-27 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614065#comment-15614065
 ] 

Shalin Shekhar Mangar commented on SOLR-8146:
-

What is the status of this issue? Seems like some code has been committed but 
there is no mention in CHANGES.txt. Also now we have two Snitch classes on 
master (I did not check the other branches) one in 
org.apache.solr.cloud.rule.Snitch (which seems to be unused) and another in 
org.apache.solr.common.cloud.rule.Snitch. Can we please clean this up?

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-10-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15558228#comment-15558228
 ] 

ASF GitHub Bot commented on SOLR-8146:
--

Github user susheelks commented on the issue:

https://github.com/apache/lucene-solr/pull/66
  
Merged routingRule changes after latest refactored code


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-10-01 Thread Susheel Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539501#comment-15539501
 ] 

Susheel Kumar commented on SOLR-8146:
-

Thank you, Noble. I am going thru the changes and will get back to you.

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-09-30 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535876#comment-15535876
 ] 

Noble Paul commented on SOLR-8146:
--

[~susheel2...@gmail.com] you can take up my patch and continue with that

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch, 
> SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-09-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532902#comment-15532902
 ] 

ASF subversion and git services commented on SOLR-8146:
---

Commit 063d624cdcf73e0eeb3c11487a76d4c3de7f40dc in lucene-solr's branch 
refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=063d624 ]

SOLR-8146: refactored the replica rules classes so that they can be accessed 
from SolrJ


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-09-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532879#comment-15532879
 ] 

ASF subversion and git services commented on SOLR-8146:
---

Commit 3ab22f6e3a21e748220946aed7bac9bce3c9b332 in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3ab22f6 ]

SOLR-8146: refactored the replica rules classes so that they can be accessed 
from SolrJ


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-09-27 Thread Susheel Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527951#comment-15527951
 ] 

Susheel Kumar commented on SOLR-8146:
-

Hello Noble,

Can you please review the pull request and provide feedback on the approach of 
implementing routingRule and accordingly i can move forward with it.

Thanks,
Susheel

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414569#comment-15414569
 ] 

ASF GitHub Bot commented on SOLR-8146:
--

GitHub user susheelks opened a pull request:

https://github.com/apache/lucene-solr/pull/66

SOLR-8146: Allowing SolrJ CloudSolrClient to have preferred replica for 
query/read

This pull request is to get feedback on the approach of implementing 
routingRule. 

The unit test is not ready yet as facing challenges on how to mock/ inject 
dependency to simulate a cluster with different IP addresses machines and only 
matching one gets added to urlList which ultimately gets passed to 
LBHttpSolrClient.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/susheelks/lucene-solr SOLR-8146

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #66


commit e761d7c2e1ecf6ce37eb43bc764897fbed8cdc4e
Author: Kumar, Susheel (CORP) 
Date:   2016-08-10T00:42:45Z

changes for limiting query to shard matching routing rule




> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-07-18 Thread Susheel Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382761#comment-15382761
 ] 

Susheel Kumar commented on SOLR-8146:
-

Thanks, Paul.  I like the routingRule terminology than preferredNodes. The 
current rules like cores, freeDisk, host etc doesn't include "rule" in their 
names, so wanted to double check if "routingRule" name is okay and there is 
similar parameter name _route_ for routing keys 
https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options.
  Hope these names all fit together to avoid any ambiguity. 


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-07-18 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382424#comment-15382424
 ] 

Noble Paul commented on SOLR-8146:
--

It should be also honored by SolrJ . SolrJ should look for the parameters and 
identify the nodes right there itself. If the request indeed reaches a node, 
The first server to receive this request should route the request to the right 
nodes
The params could be as follows
{code}
routingRule=ip_1:192&routingRule=ip_2:93
{code}



> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-07-18 Thread Susheel Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382390#comment-15382390
 ] 

Susheel Kumar commented on SOLR-8146:
-

Thanks, Noble and Arcadius for clarifying the status of SOLR-8146.  


Hello Noble,  I can start working on the patch.  Have a question to clarify

1. For multi-date center scenario, the preferredNodes rule may specify 
different values / ranges depending on, from which data center solrj client is 
querying?  So do you see preferredNodes rule being used during query operation 
like 

http://localhost:8983/solr/collection1/select?rule=preferredNodes=ip_1:192,ip_2:93

The current Snitches design/implementation is only being used in Admin 
Collections API 
(https://cwiki.apache.org/confluence/display/solr/Collections+API) for replica 
placement so this will be another usage of Snitches and extending to query 
operations.

Thanks,
Susheel



> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-07-17 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381513#comment-15381513
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Hello Susheel.

This ticket is not fully implemented yet.

The attached patch is the very first version which does work, but relies on 
passing regex as start up param to the SolrJClient in the format

{code}
-Dsolr.preferredQueryNodePattern=SOME_REGEX_MATCHING_A_SET_OF_SOLR_NODES

{code}
This approached worked well for us but:
- It does not look very elegant and 
- it does not integrate well into the current code base.

so, a better way to do this is to use the snitch.

Unfortunately, due to changes in priority, I was not able to come back to 
finish this work.


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-07-06 Thread Susheel Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364991#comment-15364991
 ] 

Susheel Kumar commented on SOLR-8146:
-

Hello Noble, Arcadius,

Can you please describe how exactly ImplicitSnitch can be used for 
preferredNodes and if there is anything to be done on SolrJ client to use 
preferredNodes for querying replicas?

I have created a JIRA  https://issues.apache.org/jira/browse/SOLR-9283 to 
document the exact steps/details for anyone to refer.

Thanks,
Susheel

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-09 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090508#comment-15090508
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

The new ticket is SOLR-8522

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-07 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087282#comment-15087282
 ] 

Noble Paul commented on SOLR-8146:
--

Yeah, open a ticket for enhancing the snitch

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-07 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087281#comment-15087281
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Hello [~noble.paul]
I have been working on the support for IP tags such as {{ip_1}}, {{ip_2}}, 
{{ip_3}} and {{ip_4}} in {{ImplicitSnitch}}.

Support for IPv6 is not yet implemented but I do have some failing tests for 
that

Is it OK for me to create a different ticket/card for support for the IP tags?

Thanks.

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081176#comment-15081176
 ] 

Noble Paul commented on SOLR-8146:
--

It's OK , the tag names should make sense , that is all using DC or rack does 
not necessarily make sense in all cases

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-04 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081175#comment-15081175
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Thank you very much [~noble.paul] for the clarification.

Looking at SOLR-6289, maybe there is an overlap between {{ip_2}} vs {{dc}} and 
{{ip_3}} vs {{rack}}?

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-04 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081120#comment-15081120
 ] 

Noble Paul commented on SOLR-8146:
--

bq.in order to use the preferredNodes snitch, one will have to add that snitch 
to the collection. Is this correct?

well, no. The implicit snitches are available to all collections. A snitch just 
has to say that I can provide values for a particular tag . 


Using regex is not really possible in the current design . It is only possible 
to provide discrete values. or ranges. 

Lets assume an ip address 192.93.255.255 . It is possible for a Snitch to 
provide values such as
ip_1 = 192
ip_2 = 93
ip_3 = 255
ip_4 = 255

In this case you can provide a rule which says 
{{preferredNodes=ip_1:192,ip_2:93}}
This means it will choose only nodes {{192.93.\*.\*}} 
This can be a part of the {{ImplicitSnitch}} itself. The implicitSnitch can 
provide values for tags {{ip_1}}, {{ip_2}}. {{ip_3}}, {{ip_4}} and for ip v6 it 
can provide values for {{ip_5}} and {{ip_6}} as well

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2016-01-04 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080997#comment-15080997
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Hello [~noble.paul]
Thank you very much for your suggestions.

Regarding: {{preferredNodes=hostPattern:}},

If I understand well ( and correct me if I am wrong), in order to use the 
preferredNodes snitch, one will have to add that snitch to the collection. Is 
this correct?

The way the current implementation works is that there is not change at all on 
the SolrCloud server or collection. 

All the configuration is on the client SolrJ: This is on purpose because it's 
the client SolrJ that needs to choose its preferred servers.
Ideally, with the use of snitch, we would like to let the client make this 
choice without having to add anything to the server or collection.

How can this be achieved? Any hint will be appreciated.

Thank you very my [~noble.paul]



> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2015-12-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066409#comment-15066409
 ] 

Noble Paul commented on SOLR-8146:
--

Actually you can write a snitch which uses a regex predicate as in

{code}
preferredNodes=hostPattern:
{code}

As the Snitch does not need any extra params , just add it to the list of well 
known snitches 

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2015-12-18 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064862#comment-15064862
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Thank you very much [~mikemccand] for your help!

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2015-12-18 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064219#comment-15064219
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Thank you very much [~noble.paul].
I will have a look into {{snitch}}

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2015-12-18 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063936#comment-15063936
 ] 

Noble Paul commented on SOLR-8146:
--

I see that a regex is used for expressing the affinity.

I would rather have something like the replica placement rule and piggy back on 
same syntax

examples 
{code}
preferredNodes=host:
{code}
you can implement new snitches such as DCAwareSnitch or RackAwareSnitch and add 
to the patch and use rules like

{code}
preferredNodes=dc:DC2
prefrredNodes=rack:RACK3
{code}

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

2015-11-10 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1538#comment-1538
 ] 

Arcadius Ahouansou commented on SOLR-8146:
--

Hello [~elyograg]
I thought the initial issue description may have been misleading and unclear.
So, I have re-edited it.

Please let me know in case there is any question.

Thanks.

> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> ---
>
> Key: SOLR-8146
> URL: https://issues.apache.org/jira/browse/SOLR-8146
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 5.3
>Reporter: Arcadius Ahouansou
> Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Backgrouds
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs to query then, picking the 
> first item from the list.
> This ticket is to allow more flexibility and control to some extend which 
> URLs will be picked up for queries.
> Note that this is for queries only and would not affect update/delete/admin 
> operations.
> h2. Implementation
> The current patch uses regex pattern and moves to the top of the list of URLs 
> only those matching the given regex specified by the system property 
> {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it may be good to have Solr nodes tagged with a string 
> pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go 
> can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCLoud cluster where every node host the same set of 
> collections with:  
> - multiple large SolrCLoud nodes (L) used for production apps and 
> - have 1 small node (S) in the same cluster with less ram/cpu used only for 
> manual user queries, data export and other production issue investigation.
> This ticket would allow to configure the applications using SolrJ to query 
> only the (L) nodes
> This use case is similar to the one described in SOLR-5501 raised by [~manuel 
> lenormand]
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have  a SolrSloud cluster deployed on 2 (or 
> N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client 
> VMs querying solr using SolrJ.
> All solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, 
> and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack read will happen if and only if one of the racks has no 
> available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever 
> possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross DC deployment. So, replace 
> rack1/rack2 by DC1/DC2
> Any comment would be very appreciated.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org