[ https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206629#comment-14206629 ]
ASF GitHub Bot commented on SOLR-6730:
--------------------------------------

GitHub user cpoerschke opened a pull request:

    https://github.com/apache/lucene-solr/pull/104

    select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

    https://issues.apache.org/jira/i#browse/SOLR-6730

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bloomberg/lucene-solr trunk-replica-affinity-feature

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #104

----
commit 66b56265bdefec7eb814bfb533c0ff19bb1dcdff
Author: Christine Poerschke <cpoersc...@bloomberg.net>
Date:   2014-08-12T10:32:57Z

solr: select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support

This commit also includes changes to reduce SearchHandler's overall use of ShardHandler objects.
---------
solr: select?replicaAffinity=(node|host) support, select?replicaAffinity=host&replicaAffinity.hostPriorities=hostA,hostB=1,hostC=2,hostD=2,hostE=3 prioritisation support

illustration: `4-hosts-x-2-ports=8-instances 8-shards 2-replica system`

http://host1:port1/solr/collection1_shard1_replicaA/
http://host1:port1/solr/collection1_shard3_replicaA/
http://host1:port2/solr/collection1_shard5_replicaA/
http://host1:port2/solr/collection1_shard7_replicaA/

http://host2:port1/solr/collection1_shard2_replicaA/
http://host2:port1/solr/collection1_shard4_replicaA/
http://host2:port2/solr/collection1_shard6_replicaA/
http://host2:port2/solr/collection1_shard8_replicaA/

http://host3:port1/solr/collection1_shard1_replicaB/
http://host3:port1/solr/collection1_shard3_replicaB/
http://host3:port2/solr/collection1_shard5_replicaB/
http://host3:port2/solr/collection1_shard7_replicaB/

http://host4:port1/solr/collection1_shard2_replicaB/
http://host4:port1/solr/collection1_shard4_replicaB/
http://host4:port2/solr/collection1_shard6_replicaB/
http://host4:port2/solr/collection1_shard8_replicaB/

`.../select` plain will route sub-requests to a random selection of solr cores and so could potentially use all 8 JVM instances

http://host1:port1/solr/collection1_shard1_replicaA/
http://host4:port1/solr/collection1_shard2_replicaB/
http://host3:port1/solr/collection1_shard3_replicaB/
http://host2:port1/solr/collection1_shard4_replicaA/
http://host1:port2/solr/collection1_shard5_replicaA/
http://host4:port2/solr/collection1_shard6_replicaB/
http://host3:port2/solr/collection1_shard7_replicaB/
http://host2:port2/solr/collection1_shard8_replicaA/

`.../select?replicaAffinity=node` will route sub-requests to a random selection of solr cores whilst maintaining node affinity i.e. sub-requests that can go to the same solr instance will go to the same solr instance e.g.
http://host1:port1/solr/collection1_shard1_replicaA/
http://host4:port1/solr/collection1_shard2_replicaB/
http://host1:port1/solr/collection1_shard3_replicaA/
http://host4:port1/solr/collection1_shard4_replicaB/
http://host3:port2/solr/collection1_shard5_replicaB/
http://host2:port2/solr/collection1_shard6_replicaA/
http://host3:port2/solr/collection1_shard7_replicaB/
http://host2:port2/solr/collection1_shard8_replicaA/

`.../select?replicaAffinity=host` will route sub-requests to a random selection of solr cores whilst maintaining host affinity i.e. sub-requests that can go to the same host machine will go to the same host machine e.g.

http://host1:port1/solr/collection1_shard1_replicaA/
http://host2:port1/solr/collection1_shard2_replicaA/
http://host1:port1/solr/collection1_shard3_replicaA/
http://host2:port1/solr/collection1_shard4_replicaA/
http://host1:port2/solr/collection1_shard5_replicaA/
http://host2:port2/solr/collection1_shard6_replicaA/
http://host1:port2/solr/collection1_shard7_replicaA/
http://host2:port2/solr/collection1_shard8_replicaA/

`.../select?replicaAffinity=host&replicaAffinity=node` will route sub-requests to a random selection of solr cores whilst maintaining first host affinity and secondly node affinity (the latter clearly only applies if multiple JVMs on a given machine contain the same shard).

If `replicaAffinity=host` is requested then optional `replicaAffinity.hostPriorities` are supported:

`.../select?replicaAffinity=host&replicaAffinity.hostPriorities=hostX=2,hostY=2,hostZ=1` will route sub-requests to hostZ (priority 1) for shards that are available on that host, and at random to either hostX or hostY (both priority 2) for shards that are available on those two hosts but not on a priority 1 host.

`replicaAffinity.hostPriorities=hostZ` and `replicaAffinity.hostPriorities=hostZ=1` are equivalent.
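
The host-priority rules above can be illustrated with a small sketch. It is not taken from the patch: the function names `parse_host_priorities` and `pick_hosts` are hypothetical, and the tie-breaking strategy (shuffle the hosts once per request, then order them by priority) is only one assumed way of combining "random among equal priorities" with host affinity. It also assumes every shard has at least one live host.

```python
import random


def parse_host_priorities(spec):
    """Parse e.g. 'hostX=2,hostY=2,hostZ=1' into {host: priority}.

    A bare host name (no '=N') is treated as priority 1, matching the
    stated equivalence of 'hostZ' and 'hostZ=1'.
    """
    priorities = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, value = item.partition("=")
        priorities[host] = int(value) if sep else 1
    return priorities


def pick_hosts(shard_hosts, priorities, rng=random):
    """Pick one host per shard, preferring lower priority numbers.

    shard_hosts maps a shard name to the list of live hosts holding a
    replica of it (hypothetical input shape). Hosts without a priority
    are used only when no prioritised host holds the shard.
    """
    all_hosts = sorted({h for hosts in shard_hosts.values() for h in hosts})
    # Assumed tie-break: one random order per request, so equal-priority
    # hosts are ranked the same way for every shard (host affinity).
    rng.shuffle(all_hosts)
    # Stable sort keeps the shuffled order within each priority level;
    # unprioritised hosts sort last.
    all_hosts.sort(key=lambda h: priorities.get(h, float("inf")))
    rank = {h: i for i, h in enumerate(all_hosts)}
    return {shard: min(hosts, key=rank.__getitem__)
            for shard, hosts in shard_hosts.items()}


if __name__ == "__main__":
    prios = parse_host_priorities("hostX=2,hostY=2,hostZ=1")
    live = {
        "shard1": ["hostX", "hostZ"],  # goes to hostZ (priority 1)
        "shard2": ["hostX", "hostY"],  # hostX or hostY, chosen at random
        "shard3": ["hostY", "hostW"],  # hostY; hostW has no priority
    }
    print(pick_hosts(live, prios))
```

Shuffling before the stable sort is what keeps the choice random between hosts of equal priority while still sending every shard that can use the winning host to that same host.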
If host priorities are supplied they can be just a subset of all hosts; preference will be given to live nodes on the prioritised hosts and random selections will be made for the remaining sub-requests.

---------
solr: reduce SearchHandler's overall use of ShardHandler objects (from N+1+x to just 1)

before:
* A search request to an N-shard system constructs N+1+x ShardHandler objects in total:
  * 1 object in the receiving solr instance
  * 1 object in each of the N shards that receive an initial sub-request (for top ids or top group ids)
  * 1 object in each of x shards that receive a subsequent sub-request (for top ids within group or to get fields)

after:
* A search request to an N-shard system constructs 1 ShardHandler object in the receiving solr instance only.

summary of change:
* move non-distrib related code fragments from HttpShardHandler.checkDistrib to SearchHandler
* rename ShardHandler.checkDistrib to ShardHandler.prepDistrib (to be called for distributed requests only)
* SearchHandler constructs ShardHandler object only for distributed requests

----

> select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-6730
>                 URL: https://issues.apache.org/jira/browse/SOLR-6730
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Christine Poerschke
>
> If no shards parameter is supplied with a select request then sub-requests
> will go to a random selection of live solr nodes hosting shards for the
> collection of interest. All sub-requests must complete before results can be
> collated i.e. the slowest sub-request determines how fast the search
> completes.
> Use of optional replicaAffinity can reduce the number of JVMs hit by a given
> search (the more JVMs are hit, the higher the chance of hitting a garbage
> collection pause in one of many JVMs).
> Preferentially directing requests to
> certain areas of the cloud can also be useful for debugging or when some
> replicas reside on 'faster' machines.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org