[ 
https://issues.apache.org/jira/browse/SOLR-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520919#comment-16520919
 ] 

Noble Paul edited comment on SOLR-11985 at 6/23/18 1:53 AM:
------------------------------------------------------------

bq. for the collection with 4 replicas. In the collection with 4 replicas, you 
could have 2 replicas on us-east-1a and 2 replicas on us-east-1b. What we 
really want is 1 on each before having the 4th replica on another zone...

In practice, that is what happens: replicas are allotted one at a time, so you 
end up with 1 in each zone, and the remaining one lands in a random zone.

The problem is that if the cluster is already badly distributed, the policy 
won't report any violations.

Once we are done with SOLR-12511, that ceases to be a problem. Your rules will 
look like this:
{code}
{"replica" : "33%", "shard" : "#EACH", "sysprop:region": "us-east-1a"}
{"replica" : "33%", "shard" : "#EACH", "sysprop:region": "us-east-1b"}
{"replica" : "33%", "shard" : "#EACH", "sysprop:region": "us-east-1c"}
{code}

This means the effective policy for a shard with 4 replicas is:
{code}
{"replica" : "1.32", "shard" : "#EACH", "sysprop:region": "us-east-1a"}
{"replica" : "1.32", "shard" : "#EACH", "sysprop:region": "us-east-1b"}
{"replica" : "1.32", "shard" : "#EACH", "sysprop:region": "us-east-1c"}
{code}

This means that any zone with 0 replicas is a violation. 
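As a rough illustration of the point above, here is a minimal Python sketch (not Solr code). It assumes the percentage is applied as replicationFactor * percent / 100, as in the issue description, so 4 * 33 / 100 = 1.32 (roughly a third of the replicas, about 1.33). It also assumes, purely for illustration, that a fractional target is satisfied by either neighbouring whole number (here 1 or 2). The function names are hypothetical.

```python
import math

def effective_replica_target(replication_factor, percent):
    """Per-shard replica target for a percentage rule (assumed formula)."""
    return replication_factor * percent / 100

def zones_in_violation(replicas_per_zone, target):
    """Return zones whose replica count is outside floor(target)..ceil(target).

    Accepting both neighbouring integers is an assumption of this sketch.
    """
    low, high = math.floor(target), math.ceil(target)
    return sorted(
        zone for zone, count in replicas_per_zone.items()
        if not low <= count <= high
    )

target = effective_replica_target(4, 33)

# The badly distributed layout from the quoted comment: 2 + 2 + 0.
skewed = {"us-east-1a": 2, "us-east-1b": 2, "us-east-1c": 0}
# The desired layout: one replica per zone, the 4th in any zone.
balanced = {"us-east-1a": 2, "us-east-1b": 1, "us-east-1c": 1}

print(zones_in_violation(skewed, target))
print(zones_in_violation(balanced, target))
```

Under these assumptions, only the empty zone in the skewed layout is reported, and the balanced layout reports nothing.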


> Allow percentage in replica attribute in policy
> -----------------------------------------------
>
>                 Key: SOLR-11985
>                 URL: https://issues.apache.org/jira/browse/SOLR-11985
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling, SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Noble Paul
>            Priority: Major
>             Fix For: master (8.0), 7.5
>
>         Attachments: SOLR-11985.patch, SOLR-11985.patch
>
>
> Today we can only specify an absolute number in the 'replica' attribute in 
> the policy rules. It'd be useful to write a percentage value to make certain 
> use-cases easier. For example:
> {code:java}
> // Keep a third of the replicas of each shard in the east region
> {"replica" : "<34%", "shard" : "#EACH", "sysprop:region": "east"}
> // Keep two thirds of the replicas of each shard in the west region
> {"replica" : "<67%", "shard" : "#EACH", "sysprop:region": "west"}
> {code}
> Today the above must be represented by different rules for each collection if 
> they have different replication factors. Also if the replication factor 
> changes later, the absolute value has to be changed in tandem. So expressing 
> a percentage removes both of these restrictions.
> This feature means that the value of the attribute {{"replica"}} is only 
> available just in time. We call such values {{"computed values"}}. The 
> computed value for this attribute depends on other attributes as well. 
> Take the following 2 rules:
> {code:java}
> //example 1
> {"replica" : "<34%", "shard" : "#EACH", "sysprop:region": "east"}
> //example 2
> {"replica" : "<34%",  "sysprop:region": "east"}
> {code}
> Assume we have a collection {{"A"}} with 2 shards and {{replicationFactor=3}}.
> *example 1* would mean that the value of replica is computed as
> {{3 * 34 / 100 = 1.02}}
> which means *_for each shard_*, keep fewer than 1.02 replicas in the east 
> availability zone.
>  
> *example 2* would mean that the value of replica is computed as 
> {{3 * 2 * 34 / 100 = 2.04}}
>  
> which means _*for each collection*_, keep fewer than 2.04 replicas in the east 
> availability zone.
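
The arithmetic in the two examples of the issue description can be sketched in Python as follows. This only illustrates the formula as described, not the actual Solr implementation. The function name and the per_shard flag (standing in for the presence of "shard" : "#EACH" in the rule) are hypothetical.

```python
def computed_replica_value(percent, replication_factor, num_shards, per_shard):
    """Resolve a percentage in the "replica" attribute to a number.

    per_shard=True models a rule with "shard": "#EACH": the percentage is
    taken of one shard's replicas. Otherwise it is taken of all replicas
    in the collection (replication_factor * num_shards).
    """
    total = replication_factor if per_shard else replication_factor * num_shards
    return total * percent / 100

# Collection "A": 2 shards, replicationFactor=3, rule value "<34%".
example_1 = computed_replica_value(34, replication_factor=3, num_shards=2, per_shard=True)
example_2 = computed_replica_value(34, replication_factor=3, num_shards=2, per_shard=False)

print(f"example 1 (per shard): fewer than {example_1:.2f} replicas")
print(f"example 2 (per collection): fewer than {example_2:.2f} replicas")
```

Because the value depends on the replication factor and shard count, the same rule resolves to different numbers for different collections, which is why it can only be computed just in time.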


