The meaning of Replication Factor is getting mixed up here. Replication factor is
a number: RF=3 means there are 3 replicas of each shard.
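For example, when creating a collection through the Collections API (the
collection name, host, and shard count below are just placeholders):

  curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=3"

With numShards=2 and replicationFactor=3 you end up with 6 cores in total,
3 per shard.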

> I understand that {"replica": "<7", "node":"#ANY"} may result in two
> replicas of the same shard ending up on the same node. However, the
> other rule should prevent this: {"replica": "<2", "shard": "#EACH",
> "node": "#ANY"}
> So by using both rules, that should mean "no more than six replicas on
> a node, where all the replicas on that node represent distinct
> shards". Right?

Yes, you are right.
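
In other words, the two rules together could be installed as the cluster
policy roughly like this (an untested sketch; host and port are placeholders):

  curl -X POST "http://localhost:8983/api/cluster/autoscaling" \
    -H 'Content-type: application/json' -d '{
      "set-cluster-policy": [
        {"replica": "<2", "shard": "#EACH", "node": "#ANY"},
        {"replica": "<7", "node": "#ANY"}
      ]
    }'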

On Fri, Feb 23, 2018 at 7:17 AM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
> I managed to miss this reply earlier, but:
>
> Shard: A logical segment of a collection
> Replica: A physical core, representing a particular Shard
> Replication Factor (RF): A set of Replicas, such that a single Replica exists 
> for each Shard in a Collection.
> Availability Zone (AZ): A partitioned set of nodes such that a physical or 
> hardware failure in one AZ should not affect another AZ. AZ could mean 
> distinct racks in a data center, or distinct data centers, but I happen to 
> specifically mean the AWS definition here: 
> https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-regions-availability-zones
>
> So an RF2 collection with 2 shards means I have four Replicas in my 
> collection, two shard1 and two shard2. If it's RF3, then I have six: three 
> shard1 and three shard2.
> I'm using "Distinct RF" as a shorthand for "a single replica for every shard 
> in the collection".
> In the RF2 example above, if I have two Availability Zones, I would want a 
> Distinct RF in each AZ. So, a replica for shard1 and shard2 in AZ1, and a 
> replica for shard1 and shard2 in AZ2. I would *not* want, say, both shard1 
> replicas in AZ1 because then a failure of AZ1 could leave me with no replicas 
> for shard1 and an incomplete collection.
> If I had RF6 and two AZs, I would want three Distinct RFs in each AZ. (three 
> replicas for each shard, per AZ)
>
> I understand that {"replica": "<7", "node":"#ANY"} may result in two replicas 
> of the same shard ending up on the same node. However, the other rule should 
> prevent this: {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
> So by using both rules, that should mean "no more than six replicas on a 
> node, where all the replicas on that node represent distinct shards". Right?
>
>
>
> On 2/12/18, 12:18 PM, "Noble Paul" <noble.p...@gmail.com> wrote:
>
>     >>Goal: No node should have more than 6 shards
>
>     This is not possible today
>
>      {"replica": "<7", "node":"#ANY"} , means don't put more than 7
>     replicas of the collection (irrespective of the shards) in a given
>     node
>
>     What do you mean by distinct 'RF'? I think we are mixing up the
>     terminology a bit here
>
>     On Wed, Feb 7, 2018 at 1:38 PM, Jeff Wartes <jwar...@whitepages.com> 
> wrote:
>     > I’ve been messing around with the Solr 7.2 autoscaling framework this 
> week. Some things seem trivial, but I’m also running into questions and 
> issues. If anyone else has experience with this stuff, I’d be glad to hear 
> it. Specifically:
>     >
>     >
>     > Context:
>     > -One collection, consisting of 42 shards, where up to 6 shards can fit 
> on a single node. (which means 7 nodes per Replication Factor)
>     > -Three AZs, each with its own ip_2 value.
>     >
>     > Goals:
>     >
>     > Goal: Fully utilize available nodes.
>     > Cluster Preference: {"maximize": "cores"}
>     >
>     > Goal: No node should have more than one replica of a given shard
>     > Rule: {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
>     >
>     > Goal: No node should have more than 6 shards
>     > Rule: {"replica": "<7", "node":"#ANY"}
>     >
>     > Goal: Where possible, distinct RFs should each exist in an AZ.
>     > (Example1: I’d like 7 nodes with a complete RF in AZ 1 and 7 nodes with 
> a complete RF in AZ 2, and not end up with, say, both shard2 replicas in AZ 1)
>     > (Example2: If I have 14 nodes in AZ 1 and 7 in AZ 2, I should have two 
> full RFs in AZ 1 and one in AZ 2)
>     > Rule: ???
>     >
>     > I could have multiple non-strict rules perhaps? Like:
>     > {"replica": "<2", "shard": "#EACH", "ip_2": "1", "strict":false}
>     > {"replica": "<3", "shard": "#EACH", "ip_2": "1", "strict":false}
>     > {"replica": "<4", "shard": "#EACH", "ip_2": "1", "strict":false}
>     > {"replica": "<2", "shard": "#EACH", "ip_2": "2", "strict":false}
>     > {"replica": "<3", "shard": "#EACH", "ip_2": "2", "strict":false}
>     > {"replica": "<4", "shard": "#EACH", "ip_2": "2", "strict":false}
>     > etc
>     > So having more than one RF in an AZ is a technical “violation”, but if 
> placement minimizes non-strict violations, replicas would tend to get placed 
> correctly.
>     >
>     >
>     > Given a working set of rules, I’m still having trouble with two things:
>     >
>     >   1.  I’ve manually created the “.system” collection, as it didn’t seem 
> to get created automatically. However, autoscaling activity is not getting 
> logged to it.
>     >   2.  I can’t seem to figure out how to scale up.
>     >      *   I’d presumed editing the collection’s “replicationFactor” 
> would do the trick, but it does not.
>     >      *   The “node-up” trigger will serve to replace lost replicas, but 
> won’t otherwise take advantage of additional capacity.
>     >
>     >              i.  There’s a UTILIZENODE command in 7.2, but it appears that’s 
> still something you need to trigger manually.
>     >
>     > Anyone played with this stuff?
>
>
>
>     --
>     -----------------------------------------------------
>     Noble Paul
>
>
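
A note on the UTILIZENODE question above: as of 7.2 it is a manual
Collections API call, along these lines (node name is a placeholder):

  curl "http://localhost:8983/solr/admin/collections?action=UTILIZENODE&node=192.168.1.10:8983_solr"

It moves replicas onto the named node according to the configured policy
and preferences; nothing fires it automatically.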



-- 
-----------------------------------------------------
Noble Paul
