Would this be a reasonable (if very rough) attempt at a cake diagram? https://docs.google.com/drawings/d/1XxLjds0OOm44zOVCMR-cwCJXnTs3C2x257KpCTxI1Ec/edit
Not sure if I managed to show the logical/physical separation clearly enough, but it could be a start.

Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Fri, Jan 4, 2013 at 2:14 PM, Per Steffensen <st...@designware.dk> wrote:

> It was a very good explanation, Jack!
>
> I believe I have heard most of it before, so it is really not new for me. I DO understand that the names "replica" and "replication-factor" CAN be justified, but it requires a long and thorough explanation. And that's the point. A good name for a concept means that:
> * The name is among the first that pop up in your mind when you think about the concept, or at least you can give a very short explanation of why you chose this name for that concept
> * When a (fairly educated) newcomer hears the name for the first time, his first thoughts about the concept it covers are as close as possible to the actual concept
>
> Good metrics for whether or not we have good names must therefore be
> 1) The frequency of questions about the concepts behind the names
> 2) The frequency of wrong usage of names (cases where people actually didn't understand the concept behind the name, didn't ask (1. above) and just used the name for what they thought it meant)
> 3) The length of the explanation of why you chose this name for that concept
>
> Ad 1)
> I counted several questions just this week. In particular I noted "Replica (Replica of _what_?)" in the original post of this thread. Whether we want it or not, newcomers will keep "not getting" the concept of replica or getting it wrong. Why? Because it is a bad name.
> Ad 2)
> I also counted several cases where names were used completely wrong this week.
> Ad 3)
> Take a look at the length of Jack's great post below, and take a look at the length of this mail thread.
>
> I believe we will do better on the metrics if we use node/collection/shard/shard-instance/index instead of node/collection/shard/replica/(core/)index, and use instances-per-shard instead of replication-factor. And say that "core" is the same as a "shard-instance", but typically used in a non/pre-Cloud context. That an index is a physical Lucene thing, and nothing but that. That collections and shards are logical concepts. That a shard-instance is a physical instance of a shard, implemented using a Lucene index persisting its data on physical disk.
>
> My only interest here is to try to pull the project in a good direction. You just get my opinion. Keep it simple and no bullshit.
>
> This entire discussion is great I think, but it probably belongs on the dev list (or maybe in a JIRA).
> I believe Alexandre Rafalovitch got his answer already :-) To the level a clean answer exists at the moment.
>
> Regards, Per Steffensen
>
>
> On 1/4/13 2:54 PM, Jack Krupansky wrote:
>
>> Replication makes perfect sense even if our explanations so far do not.
>>
>> A shard is an abstraction of a subset of the data for a collection.
>>
>> A replica is an instance of the data of the shard and instances of Solr servers that have indicated a readiness to service queries and updates for the data. Alternatively, a replica is a node which has indicated a readiness to receive and serve the data of a shard, but may not have any data at the moment.
>>
>> Let's describe it operationally for SolrCloud: If data comes in to any replica of a shard it will automatically and quickly be "replicated" to all other replicas of the shard. If a new replica of a shard comes up it will be streamed all of the data from another replica of the shard.
>> If an existing replica of a shard restarts or reconnects to the cluster, it will be streamed updates of any new data since it was last updated from another replica of the shard.
>>
>> Replication is simply the process of assuring that all replicas are kept up to date. That's the same abstract meaning as for Master/Slave even though the operational details are somewhat different. The goal remains the same.
>>
>> Replication factor is the number of instances of the data of the shard and instances of Solr servers that can service queries and updates for the data. Alternatively, the replication factor is the number of nodes of the SolrCloud cluster which have indicated a readiness to receive and serve the data of a shard, but may not have any data at the moment.
>>
>> A node is an instance of Solr running in a JVM that has indicated to the ZooKeeper ensemble of a SolrCloud cluster that it is ready to be a replica for a shard of a collection. [The latter part of that is a bit too fuzzy - I'm not sure what the node tells ZooKeeper and who does shard assignment. I mean, does a node explicitly say what shard it wants to be, or is that assigned by ZooKeeper, or is that a node's choice/option? But none of that changes the fact that a node "registers" with ZooKeeper and then somehow becomes a replica for a shard.]
>>
>> A node (instance of a Solr server) can be a replica of shards from multiple collections (potentially multiple shards per collection). A node is not a replica per se, but a container that can serve multiple collections. A node can serve as multiple replicas, each of a different collection.
>>
>> My only interest here on this user list is to understand and explain the terms we have today and that SEEM to be working for the most part, even though we may not have defined them carefully enough and used them consistently enough.
>>
>> If somebody wants to propose an alternative terminology, fine: discuss that on the dev list and/or file a Jira.
>>
>> I won't claim that my definitions are perfect (yet), but perfecting the definitions (for users) should be separated from changing the terms themselves.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Per Steffensen
>> Sent: Friday, January 04, 2013 2:49 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Terminology question: Core vs. Collection vs...
>>
>> On 1/3/13 5:58 PM, Walter Underwood wrote:
>>
>>> A "factor" is multiplied, so multiplying the leader by a replicationFactor of 1 means you have exactly one copy of that shard.
>>>
>>> I think that recycling the term "replication" within Solr was confusing, but it is a bit late to change that.
>>>
>>> wunder
>>>
>> Yes, the term "factor" is not misleading, but the term "replication" is. If we keep calling shard-instances "Replica" I guess "replicaFactor" will be ok - at least much better than "replicationFactor". But it would still be better with e.g. "ShardInstance" and "InstancesPerShard".
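
For readers trying to keep the terms in this thread straight, here is a rough Python sketch of the logical/physical split as described above: collections and shards are logical, a replica (Per's "shard-instance") is a physical copy of a shard hosted on a node, and the replication factor is the number of replicas per shard. The class names, the build_collection and replicas_on_node helpers, and the round-robin placement are all made up for illustration; none of this is Solr code, and it says nothing about how SolrCloud actually assigns replicas to nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    """Physical: one instance of a shard, hosted on a node, backed by one index."""
    shard_name: str
    node_name: str


@dataclass
class Shard:
    """Logical: a subset of the data of a collection."""
    name: str
    replicas: List[Replica] = field(default_factory=list)


@dataclass
class Collection:
    """Logical: a named set of shards."""
    name: str
    shards: List[Shard] = field(default_factory=list)


def build_collection(name, num_shards, replication_factor, node_names):
    """Create num_shards shards, each with replication_factor replicas.

    Placement is plain round-robin over node_names (an illustrative
    assumption only). With fewer nodes than replicas per shard, two
    replicas of one shard can land on the same node.
    """
    if num_shards < 1 or replication_factor < 1 or not node_names:
        raise ValueError("need at least one shard, one replica and one node")
    collection = Collection(name)
    slot = 0
    for s in range(num_shards):
        shard = Shard(f"shard{s + 1}")
        for _ in range(replication_factor):
            node = node_names[slot % len(node_names)]
            shard.replicas.append(Replica(shard.name, node))
            slot += 1
        collection.shards.append(shard)
    return collection


def replicas_on_node(collection, node_name):
    """A node is a container: list every replica of this collection it hosts."""
    return [r for shard in collection.shards
            for r in shard.replicas if r.node_name == node_name]
```

In this toy model, build_collection("books", 2, 2, ["node1", "node2"]) gives two shards with two replicas each, and each node ends up hosting one replica of every shard. With a replication factor of 1 each shard has exactly one copy, which matches Walter's reading of "factor".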