Andrzej Bialecki  created SOLR-12509:
----------------------------------------

             Summary: Improve SplitShardCmd performance and reliability
                 Key: SOLR-12509
                 URL: https://issues.apache.org/jira/browse/SOLR-12509
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Andrzej Bialecki 
            Assignee: Andrzej Bialecki 


{{SplitShardCmd}} is currently quite complex.

Shard splitting occurs on active shards, which are still being updated, so the 
splitting has to involve several carefully orchestrated steps, making sure that 
new sub-shard placeholders are properly created and visible, and then also 
applying buffered updates to the split leaders and performing recovery on 
sub-shard replicas.

This process could be simplified in cases where collections are not actively 
being updated or can tolerate a little downtime - we could put the shard 
"offline", i.e. disable writing while the splitting is in progress (to avoid 
confusing users, we should disable writing to the whole collection).
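
The "offline" idea can be sketched outside of Solr as a simple write gate. 
This is only an illustration under stated assumptions: the class 
{{CollectionWriteGate}} and its methods are hypothetical names, not part of 
Solr, and a real implementation would have to coordinate the read-only state 
across all nodes of the cluster rather than inside a single JVM.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Solr code. */
public class CollectionWriteGate {
    private boolean readOnly = false;
    private final List<String> docs = new ArrayList<>();

    /** Adds a document unless the collection is currently read-only. */
    public synchronized boolean add(String doc) {
        if (readOnly) {
            return false; // rejected: a split is in progress
        }
        docs.add(doc);
        return true;
    }

    /** Runs the split with writes disabled, re-enabling them even on failure. */
    public void splitOffline(Runnable split) {
        synchronized (this) {
            if (readOnly) {
                throw new IllegalStateException("split already in progress");
            }
            // add() holds the same monitor, so no add is in flight past this point.
            readOnly = true;
        }
        try {
            split.run();
        } finally {
            synchronized (this) {
                readOnly = false;
            }
        }
    }

    public synchronized int size() {
        return docs.size();
    }
}
```

With no updates arriving during the split, there is nothing to buffer and 
replay afterwards, which is where the simplification would come from.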

The actual index splitting could perhaps be improved to use 
{{HardLinkCopyDirectoryWrapper}} for creating a copy of the index by 
hard-linking existing index segments, and then applying deletes to the 
documents that don't belong in a sub-shard. However, the resulting index slices 
that replicas would have to pull would be the same size as the whole shard.
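
The hard-linking step can be sketched with plain {{java.nio.file}} calls. This 
is a minimal illustration, not the Lucene wrapper mentioned above: the class 
{{HardLinkIndexCopy}} and the method {{linkAll}} are hypothetical names, and 
the sketch assumes that source and target live on the same filesystem and 
that the filesystem supports hard links.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; not Solr or Lucene code. */
public class HardLinkIndexCopy {

    /**
     * Hard-links every regular file in sourceDir into targetDir and
     * returns the number of links created. No file content is copied.
     */
    public static int linkAll(Path sourceDir, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        int linked = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(sourceDir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    // createLink(link, existing): the new name points at the same data
                    Files.createLink(targetDir.resolve(file.getFileName()), file);
                    linked++;
                }
            }
        }
        return linked;
    }
}
```

Each sub-shard directory created this way shares its data with the parent 
index, so the copy itself is cheap. Since the linked segment files are not 
rewritten, applying deletes afterwards would not shrink them, which is 
consistent with the size concern noted above.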



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
