Vinayak Borkar created HELIX-80:
-----------------------------------

             Summary: Helix support for splitting partitions for finer repartitioning
                 Key: HELIX-80
                 URL: https://issues.apache.org/jira/browse/HELIX-80
             Project: Apache Helix
          Issue Type: Improvement
          Components: helix-core
    Affects Versions: 0.6.0-incubating
         Environment: All
            Reporter: Vinayak Borkar
            Priority: Minor


Apache Helix expects all partitions of a resource to be created up front and
only moves those partitions around based on instance availability. Currently,
there is no automatic solution for the case where a resource needs to be
repartitioned into a larger number of finer partitions. Here is an example of
when a system might want to repartition a resource:

Imagine I started with a cluster of 5 machines. Originally, say, a resource
was partitioned into 20 partitions and Helix distributed 4 partitions to each
machine. As time progresses, more and more data is loaded into this resource,
making the query response times unbearable. So we add more machines to the
cluster. The partitions distribute evenly onto the machines until the cluster
size reaches 20, at which point each machine holds one partition. Now when we
grow the cluster to, say, 50 machines, no further redistribution is possible
unless we split the existing 20 partitions into at least 50 partitions.
Currently the application has to use its own technique to repartition the
existing partitions. It would be nice if Helix supported this concept as a
first-class citizen.
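
To make the arithmetic in this example concrete, here is a minimal Python
sketch. It assumes a plain round-robin spread of partitions over machines,
which is not necessarily how Helix assigns them, and it ignores replicas. The
function names are made up for illustration and are not part of any Helix API.

```python
def spread(num_partitions, num_machines):
    """Partitions held by each machine under a simple round-robin spread."""
    counts = [0] * num_machines
    for partition in range(num_partitions):
        counts[partition % num_machines] += 1
    return counts


def idle_machines(num_partitions, num_machines):
    """Machines that end up holding no partition at all."""
    return sum(1 for count in spread(num_partitions, num_machines) if count == 0)


def min_split_factor(num_partitions, num_machines):
    """Smallest uniform factor k with k * num_partitions >= num_machines."""
    return -(-num_machines // num_partitions)  # ceiling division


print(spread(20, 5))             # [4, 4, 4, 4, 4]
print(idle_machines(20, 20))     # 0
print(idle_machines(20, 50))     # 30
print(min_split_factor(20, 50))  # 3
```

With 20 partitions and 50 machines, 30 machines stay empty. If every partition
is split by the same factor, the smallest factor that reaches at least 50
partitions is 3, which gives 60 partitions.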

The converse of the repartitioning case is merging partitions when too many
partitions were created in the first place. Helix should handle that case as
well.

The complication in changing the nature of partitions arises when there are
concurrent inserts and the reorganization of partitions cannot be done by
"stopping the world". Care needs to be taken to make sure that concurrent
inserts are neither lost nor applied twice while a reorganization is in
progress.
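
One possible way to meet this requirement, sketched below in Python purely as
an illustration, is to send new inserts directly to the child partitions once
a split has begun, while old records are copied over in small batches with an
idempotent "insert if absent" step. The class, its methods, and the routing
function are all hypothetical; nothing here is an existing Helix feature. The
sketch is single-threaded and does not cover reads; a real implementation
would need locking or atomic operations around each step.

```python
class SplittingPartition:
    """Hypothetical model of one parent partition split into two children."""

    def __init__(self, records):
        self.parent = dict(records)  # records present before the split
        self.children = None         # pair of dicts once a split has begun
        self.route = None            # maps a key to child index 0 or 1

    def begin_split(self, route):
        self.children = ({}, {})
        self.route = route

    def insert(self, key, value):
        if self.children is None:
            self.parent[key] = value
        else:
            # During the split, a new write goes straight to the owning child.
            self.children[self.route(key)][key] = value

    def copy_batch(self, batch_size):
        """Move up to batch_size old records; True once the parent is empty."""
        for _ in range(min(batch_size, len(self.parent))):
            key, value = self.parent.popitem()
            # setdefault never overwrites a newer write made during the split.
            self.children[self.route(key)].setdefault(key, value)
        return not self.parent


part = SplittingPartition({key: "old" for key in range(6)})
part.begin_split(lambda key: key % 2)  # assumed routing: even keys vs odd keys
part.copy_batch(3)                     # the copy is only partly done here
part.insert(6, "new")                  # insert arriving in the middle of it
part.insert(0, "updated")              # overwrite arriving in the middle of it
while not part.copy_batch(3):
    pass
low, high = part.children
print(sorted(low), sorted(high))       # [0, 2, 4, 6] [1, 3, 5]
print(low[0], low[6])                  # updated new
```

Because every record is keyed and the copy step only fills in missing keys, an
insert that arrives during the split is kept exactly once, whether it lands
before or after the copy reaches its key.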

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
