[ https://issues.apache.org/jira/browse/SOLR-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526734#comment-14526734 ]
ASF subversion and git services commented on SOLR-6220:
-------------------------------------------------------

Commit 1677614 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1677614 ]

SOLR-6220: setting eol style

> Replica placement strategy for solrcloud
> ----------------------------------------
>
>                 Key: SOLR-6220
>                 URL: https://issues.apache.org/jira/browse/SOLR-6220
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>         Attachments: SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch,
> SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch
>
>
> h1.Objective
> Most cloud-based systems allow users to specify rules for how the
> replicas/nodes of a cluster are allocated. Solr should have a flexible
> mechanism through which we can control the allocation of replicas, or later
> change it, to suit the needs of the system.
> All configurations are on a per-collection basis. The rules are applied
> whenever a replica is created in any of the shards in a given collection during
> * collection creation
> * shard splitting
> * add replica
> * createshard
> There are two aspects to how replicas are placed: snitch and placement.
> h2.snitch
> A snitch determines how the tags of nodes are identified. Snitches are
> configured through the collection create command with the snitch param,
> e.g. snitch=EC2Snitch or snitch=class:EC2Snitch
> h2.ImplicitSnitch
> This is shipped by default with Solr. The user does not need to specify
> {{ImplicitSnitch}} in the configuration. If the tags known to ImplicitSnitch
> are present in the rules, it is used automatically.
> Tags provided by ImplicitSnitch:
> # cores : number of cores in the node
> # disk : disk space available in the node
> # host : host name of the node
> # node : node name
> # D.* : values available from system properties. {{D.key}} means a
> value that is passed to the node as {{-Dkey=keyValue}} during node
> startup.
> It is possible to use rules like {{D.key:expectedVal,shard:*}}
> h2.Rules
> A rule specifies how many replicas of a given shard need to be assigned to
> nodes with the given key-value pairs. These parameters are passed to the
> collection CREATE API as a multivalued parameter "rule". The values are
> saved in the state of the collection as follows:
> {code:Javascript}
> {
>   "mycollection":{
>     "snitch":{
>       "class":"ImplicitSnitch"
>     },
>     "rules":[{"cores":"4-"},
>              {"replica":"1", "shard":"*", "node":"*"},
>              {"disk":">100"}]
>   }
> }
> {code}
> A rule is specified in a pseudo-JSON syntax, which is a map of keys and
> values.
> * Each collection can have any number of rules. As long as the rules do not
> conflict with each other it should be OK; otherwise an error is thrown.
> * In each rule, shard and replica can be omitted.
> ** The default value of replica is {{\*}}, which means ANY, or you can specify
> a count and an operand such as {{<}} (less than) or {{>}} (greater than).
> ** The value of shard can be a shard name, {{\*}} (which means EACH), or
> {{\*\*}} (which means ANY). The default value is {{\*\*}} (ANY).
> * There should be exactly one extra condition in a rule other than {{shard}}
> and {{replica}}.
> * All keys other than {{shard}} and {{replica}} are called tags, and the tags
> are nothing but values provided by the snitch for each node.
> * By default, certain tags such as {{node}}, {{host}}, {{port}} are provided
> by the system implicitly.
> h3.How are nodes picked up?
> Nodes are not picked at random. The rules are first used to sort the nodes
> according to affinity. For example, if there is a rule that says
> {{disk:100+}}, nodes with more disk space are given higher preference. And
> if the rule is {{disk:100-}}, nodes with less disk space are given
> priority.
> If everything else is equal, nodes with fewer cores are given
> higher priority.
> h3.Fuzzy match
> Fuzzy match can be applied when strict matches fail. The values can be
> suffixed with {{~}} to specify fuzziness.
> Example rules:
> {noformat}
> #Example requirement "use only one replica of a shard in a host if possible,
> #if no matches are found, relax that rule".
> rack:*,shard:*,replica:<2~
> #Another example: assign all replicas to nodes with disk space of 100GB or
> #more, or relax the rule if not possible. This ensures that if no node
> #with 100GB disk exists, nodes are picked in order of size, e.g. an
> #85GB node would be picked over an 80GB node.
> disk:>100~
> {noformat}
> Examples:
> {noformat}
> #in each rack there can be max two replicas of A given shard
> rack:*,shard:*,replica:<3
> #in each rack there can be max two replicas of ANY shard
> rack:*,shard:**,replica:2
> rack:*,replica:<3
> #in each node there should be a max one replica of EACH shard
> node:*,shard:*,replica:1-
> #in each node there should be a max one replica of ANY shard
> node:*,shard:**,replica:1-
> node:*,replica:1-
>
> #In rack 738 and shard=shard1, there can be a max 0 replica
> rack:738,shard:shard1,replica:<1
>
> #All replicas of shard1 should go to rack 730
> shard:shard1,replica:*,rack:730
> shard:shard1,rack:730
> #all replicas must be created in a node with at least 20GB disk
> replica:*,shard:*,disk:>20
> replica:*,disk:>20
> disk:>20
> #All replicas should be created in nodes with less than 5 cores
> #In this case ANY and EACH for shard have the same meaning
> replica:*,shard:**,cores:<5
> replica:*,cores:<5
> cores:<5
> #one replica of shard1 must go to node 192.168.1.2:8080_solr
> node:"192.168.1.2:8080_solr", shard:shard1, replica:1
> #No replica of shard1 should go to rack 738
> rack:!738,shard:shard1,replica:*
> rack:!738,shard:shard1
> #No replica of ANY shard should go to rack 738
> rack:!738,shard:**,replica:*
> rack:!738,shard:*
> rack:!738
> {noformat}
> In the
> collection create API, all the placement rules are provided as a
> parameter called rule
> example:
> {noformat}
> snitch=EC2Snitch&rule=shard:*,replica:1,dc:dc1&rule=shard:*,replica:<2,dc:dc3&rule=shard:shard1,replica:,rack:!738}
>
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
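
The rule syntax described in the issue can be illustrated with a small sketch. This is not Solr's parser: the function names (parse_condition, parse_rule, build_create_params) are hypothetical and exist only for illustration. The sketch assumes the defaults stated above (shard defaults to **, replica defaults to *), the prefix operands <, > and !, the trailing ~ for fuzzy match, and the "exactly one tag per rule" constraint. The suffix forms such as 1- and 100+ are left as plain values.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical helpers for illustration only; not part of Solr.
PREFIX_OPERANDS = ("<", ">", "!")


def parse_condition(raw):
    """Split a value such as '<2~' into operand, value and fuzzy flag."""
    raw = raw.strip()
    fuzzy = raw.endswith("~")          # trailing ~ marks a fuzzy condition
    if fuzzy:
        raw = raw[:-1]
    if raw and raw[0] in PREFIX_OPERANDS:
        return {"operand": raw[0], "value": raw[1:], "fuzzy": fuzzy}
    return {"operand": "=", "value": raw, "fuzzy": fuzzy}


def parse_rule(rule):
    """Parse 'key:value,key:value' into a dict, applying the stated defaults."""
    parsed = {"shard": "**", "replica": "*"}   # defaults from the description
    for part in rule.split(","):
        # Split on the first ':' only, so node names like host:port survive.
        key, _, value = part.strip().partition(":")
        key, value = key.strip(), value.strip().strip('"')
        if not key or not value:
            raise ValueError("malformed condition: %r" % part)
        parsed[key] = value
    tags = [k for k in parsed if k not in ("shard", "replica")]
    if len(tags) != 1:
        raise ValueError("expected exactly one tag, got %r" % tags)
    return parsed


def build_create_params(snitch, rules):
    """Encode the snitch and the multivalued 'rule' parameter as a query string."""
    return urlencode({"snitch": snitch, "rule": rules}, doseq=True)


if __name__ == "__main__":
    rule = parse_rule("rack:*,shard:*,replica:<2~")
    print(rule, parse_condition(rule["replica"]))
    query = build_create_params(
        "EC2Snitch", ["shard:*,replica:1,dc:dc1", "shard:*,replica:<2,dc:dc3"]
    )
    print(query)
    print(parse_qs(query)["rule"])
```

The query string is percent-encoded by urlencode, so it will not look identical to the hand-written example in the issue, but it decodes to the same repeated rule values.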