Vinayak Borkar created HELIX-80:
-----------------------------------
Summary: Helix support for splitting partitions for finer
repartitioning
Key: HELIX-80
URL: https://issues.apache.org/jira/browse/HELIX-80
Project: Apache Helix
Issue Type: Improvement
Components: helix-core
Affects Versions: 0.6.0-incubating
Environment: All
Reporter: Vinayak Borkar
Priority: Minor
Apache Helix expects all partitions of a resource to be created up front and
only moves those partitions around based on instance availability. Currently,
there is no automatic solution for the case where a resource needs to be
repartitioned into a larger number of finer partitions. Here is an example of
when a system might want to repartition a resource:
Imagine we start with a cluster of 5 machines. Say a resource was originally
partitioned into 20 partitions and Helix distributed 4 partitions to each
machine. As time progresses, more and more data is loaded into this resource,
making query response times unbearable, so we add more machines to the
cluster. The partitions distribute evenly onto the machines until the cluster
size reaches 20. If we then grow the cluster to, say, 50 machines, no further
redistribution is possible unless we split the existing 20 partitions into at
least 50 partitions. Currently the application has to use its own technique to
repartition existing partitions. It would be nice if Helix supported this
concept as a first-class citizen.
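
To make the arithmetic in this example concrete, here is a minimal sketch in
Python. It assumes a plain round-robin placement, which is not necessarily what
Helix's rebalancer does; the function names distribute and idle_machines are
hypothetical and exist only for this illustration.

```python
def distribute(num_partitions, num_machines):
    """Place partitions round-robin and return the partition count per machine.

    Assumption: round-robin placement, used here only to illustrate the
    counting argument. It is not Helix's actual rebalancing algorithm.
    """
    counts = [0] * num_machines
    for partition in range(num_partitions):
        counts[partition % num_machines] += 1
    return counts


def idle_machines(num_partitions, num_machines):
    """Number of machines that end up holding no partition at all."""
    return sum(1 for c in distribute(num_partitions, num_machines) if c == 0)


if __name__ == "__main__":
    for machines in (5, 20, 50):
        counts = distribute(20, machines)
        print(machines, "machines: max per machine =", max(counts),
              "idle =", idle_machines(20, machines))
```

With 20 partitions, 5 machines hold 4 partitions each and 20 machines hold 1
each, but at 50 machines 30 of them hold nothing, which is the point where only
a split can help.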
The converse of splitting is merging partitions when too many partitions were
created in the first place; Helix should handle that case as well.
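
One possible way to make splits and merges cheap is sketched below. It assumes
keys are assigned to partitions by hash modulo the partition count, and that
the count only ever changes by an integer factor. Under that assumption each
old partition splits into a fixed set of new partitions, and a merge is the
same mapping in reverse. Nothing here is Helix API; stable_hash, partition_of,
children_of and parent_of are hypothetical names.

```python
import zlib


def stable_hash(key):
    """Process-independent hash (Python's built-in hash() is salted per run)."""
    return zlib.crc32(key.encode("utf-8"))


def partition_of(key, num_partitions):
    """Assumed placement rule: hash modulo the current partition count."""
    return stable_hash(key) % num_partitions


def children_of(parent, old_count, factor):
    """New partition ids that old partition `parent` splits into."""
    return [parent + i * old_count for i in range(factor)]


def parent_of(child, old_count):
    """Old partition id that new partition `child` merges back into."""
    return child % old_count


if __name__ == "__main__":
    old_count, factor = 20, 3          # 20 partitions become 60 (at least 50)
    for key in ("alpha", "beta", "gamma"):
        before = partition_of(key, old_count)
        after = partition_of(key, old_count * factor)
        print(key, before, "->", after, "children:",
              children_of(before, old_count, factor))
```

Because (h mod 60) mod 20 equals h mod 20, a key never moves to a child of a
different parent, so each old partition can be split or merged independently
of the others.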
The complication in changing the partitioning arises when there are concurrent
inserts and the reorganization cannot be done by "stopping the world". Care
needs to be taken to ensure that concurrent inserts are neither lost nor
applied twice while a reorganization is in progress.
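
As a rough illustration of one way to meet that requirement, the sketch below
splits a single in-memory partition while inserts continue. It is an
assumption-laden toy, not a proposal for the Helix implementation: records are
assumed to carry a unique key so that replaying an insert is an idempotent
upsert, and the class SplittablePartition, its methods, and the route callback
are all hypothetical.

```python
import threading


class SplittablePartition:
    """Toy partition that can be split without pausing inserts for the copy."""

    def __init__(self):
        self._lock = threading.Lock()
        self.records = {}      # key -> value
        self._log = None       # inserts seen since the split began, else None
        self._route = None     # key -> child index, set when a split begins
        self.children = None   # set at cutover; later inserts go to a child

    def insert(self, key, value):
        with self._lock:
            if self.children is not None:
                # After cutover: write directly to the owning child.
                self.children[self._route(key)][key] = value
                return
            self.records[key] = value
            if self._log is not None:
                # Split in progress: remember the insert for later replay.
                self._log.append((key, value))

    def begin_split(self, fanout, route):
        """Start logging, then copy a snapshot into `fanout` children."""
        with self._lock:
            self._log = []
            self._route = route
            snapshot = dict(self.records)
        children = [{} for _ in range(fanout)]
        for key, value in snapshot.items():   # slow copy, done outside the lock
            children[route(key)][key] = value
        return children

    def finish_split(self, children):
        """Replay logged inserts and switch routing under one short lock."""
        with self._lock:
            for key, value in self._log:
                # Keyed upsert: replaying a record already in the snapshot
                # overwrites it instead of inserting it a second time.
                children[self._route(key)][key] = value
            self._log = None
            self.children = children
```

The only pause is the short critical section in finish_split. Inserts that
arrive during the copy land in both the parent and the log, so they are not
lost, and the keyed upsert keeps the replay from double inserting them.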