[
https://issues.apache.org/jira/browse/SOLR-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583110#comment-15583110
]
Christine Poerschke commented on SOLR-6730:
-------------------------------------------
Noble and I discussed offline the overlaps and differences between SOLR-6730
and SOLR-8146. I will try to summarise here as bullet points; [~noble.paul],
please add or correct if I missed or misunderstood something.
* The use case and motivation for the {{select?replicaAffinity=(node|host)}}
part of SOLR-6730 was to reduce the number of JVMs hit by a given search, since
the more JVMs are hit, the higher the chance that at least one of them is in a
garbage collection pause.
* The use case and motivation for the {{replicaAffinity.hostPriorities=...}}
part of SOLR-6730 was to preferentially direct requests from the same
user/source to certain areas of the cloud.
** The implementation of the {{replicaAffinity.hostPriorities=...}} approach
requires configuration somewhere i.e. a list of which hosts to prioritise.
** No matter where it is stored, maintaining configuration can be cumbersome as
collections and hosts change over time.
* The objective of directing requests from the same user/source to certain
areas of the cloud can be achieved without configuration, and the objective of
reducing the number of JVMs hit by a search can largely be achieved the same
way.
** Approach outline:
*** Two numeric parameters ('seed' and 'mod') are optionally added to each
request.
*** The two parameters 'place' the requests within the cloud, e.g. for
{{mod=9}} any seed between 0 and 8 would be valid and {{seed=6}} would 'place'
the request with the 7th of 9 replicas or, more realistically when there are
only 3 replicas, with the 3rd of 3.
*** seed-plus-mod placement automatically adjusts when the number of replicas
changes, e.g. (seed=2,mod=6) would be 3rd-of-6 or 2nd-of-4 or 2nd-of-3 or
1st-of-2 placement.
*** SOLR-6730 here would likely be abandoned in favour of the approach outlined.
* What is common to SOLR-6730 and SOLR-8146:
** optional parameters would support changing of the existing behaviour
** existing behaviour is maintained if the optional parameters are not supplied
* What is different between SOLR-6730 and SOLR-8146:
** point-of-use of the optional parameter is HttpShardHandler\[Factory\] for
SOLR-6730
** point-of-use of the optional parameter is CloudSolrClient (and
HttpShardHandler\[Factory\]?) for SOLR-8146
* Next steps:
1. SOLR-8332 to factor HttpShardHandler\[Factory\]'s URL shuffling out into a
ReplicaListTransformer class
2. creation of additional ReplicaListTransformer implementations corresponding
to the approach outlined above
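
A minimal sketch of the seed-plus-mod placement outlined above, for
illustration only; it is not the SOLR-6730 or SOLR-8146 implementation. The
summary does not state a formula, so the mapping used here,
floor(seed * replicaCount / mod), is an assumption chosen because it reproduces
every example given in the bullet points. The names {{SeedModPlacement}},
{{placementIndex}} and {{preferPlaced}} are hypothetical, and rotating the
replica list so that the placed replica comes first is likewise just one
possible way a ReplicaListTransformer-style reordering could use the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; not part of Solr.
class SeedModPlacement {

    // Maps (seed, mod) onto a zero-based index among replicaCount replicas.
    // Assumed formula: floor(seed * replicaCount / mod), with 0 <= seed < mod.
    static int placementIndex(int seed, int mod, int replicaCount) {
        if (mod <= 0 || seed < 0 || seed >= mod || replicaCount <= 0) {
            throw new IllegalArgumentException(
                "need mod > 0, 0 <= seed < mod and replicaCount > 0");
        }
        // Use long arithmetic so seed * replicaCount cannot overflow.
        return (int) ((long) seed * replicaCount / mod);
    }

    // Returns a copy of the list rotated so that the placed replica is first;
    // the remaining replicas keep their relative order and act as fallbacks.
    static <T> List<T> preferPlaced(List<T> replicas, int seed, int mod) {
        List<T> copy = new ArrayList<>(replicas);
        if (copy.isEmpty()) {
            return copy;
        }
        Collections.rotate(copy, -placementIndex(seed, mod, copy.size()));
        return copy;
    }

    public static void main(String[] args) {
        // seed=6, mod=9: 7th of 9 replicas, or 3rd of 3 replicas.
        System.out.println(placementIndex(6, 9, 9) + 1); // 7
        System.out.println(placementIndex(6, 9, 3) + 1); // 3

        // seed=2, mod=6: 3rd-of-6, 2nd-of-4, 2nd-of-3, 1st-of-2.
        for (int n : new int[] {6, 4, 3, 2}) {
            System.out.println((placementIndex(2, 6, n) + 1) + " of " + n);
        }

        // The placed replica moves to the front of the list.
        System.out.println(preferPlaced(Arrays.asList("r1", "r2", "r3"), 6, 9));
    }
}
```

Note that a plain {{seed % replicaCount}} would not match the examples (for
seed=2 and 4 replicas it would pick the 3rd rather than the 2nd), which is why
the proportional mapping is used in this sketch.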
> select?replicaAffinity=(node|host) and replicaAffinity.hostPriorities support
> -----------------------------------------------------------------------------
>
> Key: SOLR-6730
> URL: https://issues.apache.org/jira/browse/SOLR-6730
> Project: Solr
> Issue Type: New Feature
> Reporter: Christine Poerschke
> Assignee: Christine Poerschke
> Priority: Minor
>
> If no shards parameter is supplied with a select request then sub-requests
> will go to a random selection of live solr nodes hosting shards for the
> collection of interest. All sub-requests must complete before results can be
> collated i.e. the slowest sub-request determines how fast the search
> completes.
> Use of optional replicaAffinity can reduce the number of JVMs hit by a given
> search (the more JVMs are hit, the higher the chance of hitting a garbage
> collection pause in one of many JVMs). Preferentially directing requests to
> certain areas of the cloud can also be useful for debugging or when some
> replicas reside on 'faster' machines.