[ 
https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206629#comment-14206629
 ] 

ASF GitHub Bot commented on SOLR-6730:
--------------------------------------

GitHub user cpoerschke opened a pull request:

    https://github.com/apache/lucene-solr/pull/104

    select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

    https://issues.apache.org/jira/i#browse/SOLR-6730

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bloomberg/lucene-solr trunk-replica-affinity-feature

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104
    
----
commit 66b56265bdefec7eb814bfb533c0ff19bb1dcdff
Author: Christine Poerschke <cpoersc...@bloomberg.net>
Date:   2014-08-12T10:32:57Z

    solr: select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
    
    This commit also includes changes to reduce SearchHandler's overall use of
    ShardHandler objects.
    
    ---------
    
    solr: select?replicaAffinity=(node|host) support,
    select?replicaAffinity=host&replicaAffinity.hostPriorities=hostA,hostB=1,hostC=2,hostD=2,hostE=3
    prioritisation support
    
    illustration: `4-hosts-x-2-ports=8-instances 8-shards 2-replica system`
    
      http://host1:port1/solr/collection1_shard1_replicaA/
      http://host1:port1/solr/collection1_shard3_replicaA/
    
      http://host1:port2/solr/collection1_shard5_replicaA/
      http://host1:port2/solr/collection1_shard7_replicaA/
    
      http://host2:port1/solr/collection1_shard2_replicaA/
      http://host2:port1/solr/collection1_shard4_replicaA/
    
      http://host2:port2/solr/collection1_shard6_replicaA/
      http://host2:port2/solr/collection1_shard8_replicaA/
    
      http://host3:port1/solr/collection1_shard1_replicaB/
      http://host3:port1/solr/collection1_shard3_replicaB/
    
      http://host3:port2/solr/collection1_shard5_replicaB/
      http://host3:port2/solr/collection1_shard7_replicaB/
    
      http://host4:port1/solr/collection1_shard2_replicaB/
      http://host4:port1/solr/collection1_shard4_replicaB/
    
      http://host4:port2/solr/collection1_shard6_replicaB/
      http://host4:port2/solr/collection1_shard8_replicaB/
    
    A plain `.../select` routes sub-requests to a random selection of Solr
    cores and so could potentially use all 8 JVM instances, e.g.
    
      http://host1:port1/solr/collection1_shard1_replicaA/
      http://host4:port1/solr/collection1_shard2_replicaB/
      http://host3:port1/solr/collection1_shard3_replicaB/
      http://host2:port1/solr/collection1_shard4_replicaA/
      http://host1:port2/solr/collection1_shard5_replicaA/
      http://host4:port2/solr/collection1_shard6_replicaB/
      http://host3:port2/solr/collection1_shard7_replicaB/
      http://host2:port2/solr/collection1_shard8_replicaA/
    
    `.../select?replicaAffinity=node` routes sub-requests to a random
    selection of Solr cores whilst maintaining node affinity, i.e. sub-requests
    that can go to the same Solr instance will go to the same Solr instance, e.g.
    
      http://host1:port1/solr/collection1_shard1_replicaA/
      http://host4:port1/solr/collection1_shard2_replicaB/
      http://host1:port1/solr/collection1_shard3_replicaA/
      http://host4:port1/solr/collection1_shard4_replicaB/
      http://host3:port2/solr/collection1_shard5_replicaB/
      http://host2:port2/solr/collection1_shard6_replicaA/
      http://host3:port2/solr/collection1_shard7_replicaB/
      http://host2:port2/solr/collection1_shard8_replicaA/
    
    `.../select?replicaAffinity=host` routes sub-requests to a random
    selection of Solr cores whilst maintaining host affinity, i.e. sub-requests
    that can go to the same host machine will go to the same host machine, e.g.
    
      http://host1:port1/solr/collection1_shard1_replicaA/
      http://host2:port1/solr/collection1_shard2_replicaA/
      http://host1:port1/solr/collection1_shard3_replicaA/
      http://host2:port1/solr/collection1_shard4_replicaA/
      http://host1:port2/solr/collection1_shard5_replicaA/
      http://host2:port2/solr/collection1_shard6_replicaA/
      http://host1:port2/solr/collection1_shard7_replicaA/
      http://host2:port2/solr/collection1_shard8_replicaA/
    
    `.../select?replicaAffinity=host&replicaAffinity=node` routes sub-requests
    to a random selection of Solr cores whilst maintaining host affinity first
    and node affinity second (the latter only applies if multiple JVMs on a
    given machine contain the same shard).
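
For illustration only, here is a minimal Python sketch of one way such affinity could be achieved; it is not the code in this pull request. It assumes that putting all candidate hosts (or nodes) into a single random order per request, and then sending each shard to its best-ranked candidate, is an acceptable reading of "affinity". The names `affinity_key`, `pick_replicas` and `example_topology`, and the port numbers 8983/8984, are made up for the example.

```python
import random
from urllib.parse import urlsplit


def affinity_key(url, mode):
    """Grouping key for a replica URL: hostname for 'host', host:port for 'node'."""
    parts = urlsplit(url)
    return parts.hostname if mode == "host" else parts.netloc


def pick_replicas(shard_replicas, mode, rng=random):
    """Pick one replica URL per shard, favouring shared hosts or nodes.

    shard_replicas maps a shard name to the list of its live replica URLs.
    """
    # Put all candidate hosts/nodes into one random order for this request ...
    keys = sorted({affinity_key(u, mode)
                   for urls in shard_replicas.values() for u in urls})
    rng.shuffle(keys)
    rank = {key: i for i, key in enumerate(keys)}
    # ... then send every shard to its best-ranked candidate.
    return {
        shard: min(urls, key=lambda u: rank[affinity_key(u, mode)])
        for shard, urls in shard_replicas.items()
    }


def example_topology():
    """The 4-hosts x 2-ports layout from the illustration (port numbers are made up)."""
    topology = {}
    for shard in range(1, 9):
        port = 8983 if shard <= 4 else 8984
        host_a, host_b = ("host1", "host3") if shard % 2 else ("host2", "host4")
        topology[f"shard{shard}"] = [
            f"http://{host_a}:{port}/solr/collection1_shard{shard}_replicaA/",
            f"http://{host_b}:{port}/solr/collection1_shard{shard}_replicaB/",
        ]
    return topology
```

With this layout, `pick_replicas(example_topology(), "host")` always lands on two of the four hosts and `pick_replicas(example_topology(), "node")` on four of the eight instances, which matches the shape of the examples above; which hosts or instances are chosen varies from request to request.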
    
    If `replicaAffinity=host` is requested then optional
    `replicaAffinity.hostPriorities` are supported:
    
    
    `.../select?replicaAffinity=host&replicaAffinity.hostPriorities=hostX=2,hostY=2,hostZ=1`
    will route sub-requests to hostZ (priority 1) for shards that are available
    on that host, and randomly to either hostX or hostY (both priority 2) for
    shards available on those two hosts but not on a priority 1 host.
    
    `replicaAffinity.hostPriorities=hostZ` and
    `replicaAffinity.hostPriorities=hostZ=1` are equivalent.
    
    If host priorities are supplied they can cover just a subset of all hosts:
    preference is given to live nodes on the prioritised hosts, and random
    selections are made for the remaining sub-requests.
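
The priority rules above can be sketched in Python as follows. This is a hedged illustration rather than the parsing or selection code from the pull request: the function names `parse_host_priorities`, `rank_hosts` and `pick_host` are hypothetical, and the sketch assumes that hosts without a listed priority simply rank after all listed ones, with ties broken randomly once per request.

```python
import random


def parse_host_priorities(spec):
    """Parse 'hostX=2,hostY=2,hostZ' into {'hostX': 2, 'hostY': 2, 'hostZ': 1}."""
    priorities = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, value = item.partition("=")
        # A bare host name is treated as priority 1 (hostZ == hostZ=1).
        priorities[host] = int(value) if sep else 1
    return priorities


def rank_hosts(hosts, priorities, rng=random):
    """Order hosts by priority (1 first), unlisted hosts last, ties shuffled."""
    shuffled = sorted(hosts)
    rng.shuffle(shuffled)
    # sorted() is stable, so the shuffle decides the order within one priority.
    ordered = sorted(shuffled, key=lambda h: priorities.get(h, float("inf")))
    return {host: i for i, host in enumerate(ordered)}


def pick_host(shard_hosts, rank):
    """Choose the best-ranked live host that carries the shard."""
    return min(shard_hosts, key=rank.__getitem__)
```

For example, with `rank = rank_hosts(["hostW", "hostX", "hostY", "hostZ"], parse_host_priorities("hostX=2,hostY=2,hostZ=1"))`, a shard available on hostX and hostZ goes to hostZ, while a shard available only on hostW and hostY goes to hostY (hostW being an unlisted, hypothetical host).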
    
    ---------
    
    solr: reduce SearchHandler's overall use of ShardHandler objects (from
    N+1+x to just 1)
    
    before:
     * A search request to an N-shard system constructs N+1+x ShardHandler
       objects in total:
       * 1 object in the receiving solr instance
       * 1 object in each of the N shards that receive an initial sub-request
         (for top ids or top group ids)
       * 1 object in each of x shards that receive a subsequent sub-request
         (for top ids within group or to get fields)
    
    after:
     * A search request to an N-shard system constructs 1 ShardHandler object,
       in the receiving solr instance only.
    
    summary of change:
     * move non-distrib related code fragments from
       HttpShardHandler.checkDistrib to SearchHandler
     * rename ShardHandler.checkDistrib to ShardHandler.prepDistrib (to be
       called for distributed requests only)
     * SearchHandler constructs ShardHandler object only for distributed
       requests

----


> select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-6730
>                 URL: https://issues.apache.org/jira/browse/SOLR-6730
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Christine Poerschke
>
> If no shards parameter is supplied with a select request then sub-requests 
> will go to a random selection of live solr nodes hosting shards for the 
> collection of interest. All sub-requests must complete before results can be 
> collated i.e. the slowest sub-request determines how fast the search 
> completes.
> Use of optional replicaAffinity can reduce the number of JVMs hit by a given 
> search (the more JVMs are hit, the higher the chance of hitting a garbage 
> collection pause in one of many JVMs). Preferentially directing requests to 
> certain areas of the cloud can also be useful for debugging or when some 
> replicas reside on 'faster' machines.
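
As a rough back-of-the-envelope illustration of the garbage collection argument above: if each JVM is assumed to be in a pause with some probability at the moment a sub-request arrives, and pauses are assumed independent across JVMs, the chance that a search hits at least one paused JVM grows with the number of JVMs involved. Both the independence assumption and the function name `p_any_pause` are mine, not part of the issue.

```python
def p_any_pause(p_pause, jvm_count):
    """Chance that at least one of jvm_count JVMs is mid-pause.

    Assumes every JVM is paused independently with probability p_pause.
    """
    if not 0.0 <= p_pause <= 1.0:
        raise ValueError("p_pause must be a probability between 0 and 1")
    if jvm_count < 0:
        raise ValueError("jvm_count must not be negative")
    return 1.0 - (1.0 - p_pause) ** jvm_count
```

With a purely hypothetical per-JVM pause probability of 1%, `p_any_pause(0.01, 8)` is about 7.7% whereas `p_any_pause(0.01, 4)` is about 3.9%, so roughly halving the number of JVMs hit roughly halves the exposure.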



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
