anshumg commented on code in PR #977:
URL: https://github.com/apache/solr/pull/977#discussion_r974655894
##########
dev-docs/shard-split/shard-split.adoc:
##########
@@ -0,0 +1,159 @@
+= Shard Split
+:toc: macro
+:toclevels: 3
+
+The document explains how shard split works in SolrCloud at a high level.
+
+toc::[]
+
+== Background
+Constantly adding new documents to Solr will slow down query performance as index size increases. To handle this, shard split is introduced. Shard split feature works in both Standalone and SolrCloud modes.
+
+Shard is a logical partition of collection, containing a subset of documents from collection. Which shard contains which document depends on the sharding strategy. It is the "router" that determines this -- e.g. "implicit" vs "compositeId" When a document is sent to Solr for indexing, the system first determines which shard the document belongs to and finds a leader of that shard. Then the leader forwards the updates to other replicas.
+
+== Shard States
+Shard can have one of the following states:
+
+* ACTIVE
+** shard receives updates, participates in distributed search.
+* CONSTRUCTION
+** shard receives updates only from the parent shard leader, but doesn’t participate in distributed search.
+** shard is put in that state when shard split operation is in progress or shard is undergoing data restoration.
+* RECOVERY
+** shard receives updates only from the parent shard leader, but doesn’t participate in distributed search.
+** shard is put in that state to create replicas in order to meet collection’s replicationFactor.
+* RECOVERY_FAILED
+** shard doesn’t receive any updates, doesn’t participate in distributed search.
+** shard is put in that state when parent shard leader is not live.
+* INACTIVE
+** shard is put in that state after it has been successfully split.
+
+Detail: Shard is referred to Slice in the codebase context.
+
+== Shard State Transition Diagram
+
+image::images/shard-state-transition-diagram.png[]
+
+== Replica States
+
+Replica is a core, physical partition of index, placed on a node. Replica location is `/var/solr/data`.
+
+Replica can have one of the following states:
+
+* ACTIVE
+** replica is ready to receive updates and queries.
+* DOWN
+** replica is actively trying to move to RECOVERING or ACTIVE state.
+* RECOVERING
+** replica is recovering from leader. This includes peer sync, full replication.
+* RECOVERY_FAILED
+** recovery is not succeeded.
+
+== Replica State Transition Diagram
+
+image::images/replica-state-transition-diagram.png[]
+
+== Simplified Explanation
+
+Before digging into the explanation, let us define a few terminologies which will help us understand the content better. We explicitly say *parent shard* for a shard which will be split. A *sub shard* is a child of parent shard which is a shard after split. An *initial replica* is a first replica/core to be added for a sub shard. An *additional replica* is a replica to be created in order to meet `replicationFactor` of collection.
+
+Splitting a shard will take an existing shard (parent shard) and break it into two pieces which are written into disk as two new shards (sub shards). Behind the scene, original shard's hash range is computed in order to break a shard into two pieces.
+
+Simple Shard Split Steps:
+
+* Sub shards are created in `CONSTRUCTION` state.
+* Initial replica is created for each sub shard.
+* Parent shard leader is “split” (=two new indices of sub shards are created from the parent shard).

Review Comment:
   I think you missed this ?