Thanks Neha. I'll discard this for now. We can pick it up once replica throttling and the default policies are available and tested.
On Thu, Feb 11, 2016 at 5:45 PM, Neha Narkhede <n...@confluent.io> wrote:

> > 1. Replica Throttling - I agree this is rather important to get done.
> > However, it may also be argued that this problem is orthogonal. We do
> > not have these protections currently, yet we do run partition
> > reassignment fairly often. Having said that, I'm perfectly happy to
> > tackle KIP-46 after this problem is solved. I understand it is actively
> > being discussed in KAFKA-1464.
>
> I think we are saying the same thing here. Replica throttling is required
> to be able to pull off any partition reassignment action. It removes the
> guesswork that comes from picking a batch size that is expressed in terms
> of partition count, which is an annoying hack.
>
> > 2. Pluggable policies - Can you elaborate on the need for pluggable
> > policies in the partition reassignment tool? Even if we make it
> > pluggable to begin with, this needs to ship with a default policy that
> > makes sense for most users. IMO, partition count is the most intuitive
> > default and is analogous to how we stripe partitions for new topics.
>
> Agree about the default. I was arguing for making it pluggable so we make
> it easy to test multiple policies. For instance, partition count is a
> decent one, but I can imagine how one would want a policy that optimizes
> for balancing data sizes.
>
> > 3. Even if the trigger were fully manual (as it is now), we could still
> > have the controller generate the assignment as per a configured policy,
> > i.e. effectively the tool is built into Kafka itself. Following this
> > approach to begin with makes it easier to fully automate in the future
> > since we will only need to automate the trigger later.
>
> I would be much more comfortable adding the capability to move large
> amounts of data to the controller after we are very sure that the default
> policy is well tested and the replica throttling works.
> If so, then it is just a matter of placing the trigger in the controller
> vs. in the tool. But I'm skeptical of adding more things to the already
> messy controller, especially without being sure about how well it works.
>
> Thanks,
> Neha
>
> On Tue, Feb 9, 2016 at 12:53 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hi Neha,
> >
> > Thanks for the detailed reply and apologies for my late response. I do
> > have a few comments.
> >
> > 1. Replica Throttling - I agree this is rather important to get done.
> > However, it may also be argued that this problem is orthogonal. We do
> > not have these protections currently, yet we do run partition
> > reassignment fairly often. Having said that, I'm perfectly happy to
> > tackle KIP-46 after this problem is solved. I understand it is actively
> > being discussed in KAFKA-1464.
> >
> > 2. Pluggable policies - Can you elaborate on the need for pluggable
> > policies in the partition reassignment tool? Even if we make it
> > pluggable to begin with, this needs to ship with a default policy that
> > makes sense for most users. IMO, partition count is the most intuitive
> > default and is analogous to how we stripe partitions for new topics.
> >
> > 3. Even if the trigger were fully manual (as it is now), we could still
> > have the controller generate the assignment as per a configured policy,
> > i.e. effectively the tool is built into Kafka itself. Following this
> > approach to begin with makes it easier to fully automate in the future
> > since we will only need to automate the trigger later.
> >
> > Aditya
> >
> > On Wed, Feb 3, 2016 at 1:57 PM, Neha Narkhede <n...@confluent.io> wrote:
> >
> > > Adi,
> > >
> > > Thanks for the write-up. Here are my thoughts:
> > >
> > > I think you are suggesting a way of automatically restoring a topic’s
> > > replication factor in a specific scenario: the event of permanent
> > > broker failures.
> > > I agree that the partition reassignment mechanism should be used to
> > > add replicas when they are lost to permanent broker failures. But I
> > > think the KIP probably bites off more than we can chew.
> > >
> > > Before we automate detection of permanent broker failures and have
> > > the controller mitigate through automatic data balancing, I’d like to
> > > point out that our current difficulty is not that, but the ability to
> > > generate a workable partition assignment for rebalancing data in a
> > > cluster.
> > >
> > > There are 2 problems with partition rebalancing today:
> > >
> > > 1. Lack of replica throttling for balancing data: In the absence of
> > >    replica throttling, even if you come up with an assignment that
> > >    might be workable, it isn’t practical to kick it off without
> > >    worrying about bringing the entire cluster down. I don’t think the
> > >    hack of moving partitions in batches is effective, as it is at
> > >    best a guess.
> > > 2. Lack of support for policies in the rebalance tool that
> > >    automatically generate a workable partition assignment: There is
> > >    no easy way to generate a partition reassignment JSON file. An
> > >    example of a policy is “end up with an equal number of partitions
> > >    on every broker while minimizing data movement”. There might be
> > >    other policies that make sense; we’d have to experiment.
> > >
> > > Broadly speaking, the data balancing problem comprises 3 parts:
> > >
> > > 1. Trigger: An event that triggers data balancing to take place.
> > >    KIP-46 suggests a specific trigger, and that is permanent broker
> > >    failure. But there might be several other events that make sense:
> > >    cluster expansion, decommissioning brokers, data imbalance.
> > > 2. Policy: Given a set of constraints, generate a target partition
> > >    assignment that can be executed when triggered.
> > > 3. Mechanism: Given a partition assignment, make the state changes
> > >    and actually move the data until the target assignment is
> > >    achieved.
> > >
> > > Currently, the trigger is manual through the rebalance tool, there is
> > > no support for any viable policy today, and we have a built-in
> > > mechanism that, given a policy and upon a trigger, moves data in a
> > > cluster but does not support throttling.
> > >
> > > Given that both the policy and the throttling improvement to the
> > > mechanism are hard problems, and given our past experience of
> > > operationalizing partition reassignment (it required months of
> > > testing before we got it right), I strongly recommend attacking this
> > > problem in stages. I think a more practical approach would be to add
> > > the concept of pluggable policies in the rebalance tool, implement a
> > > practical policy that generates a workable partition assignment upon
> > > triggering the tool, and improve the mechanism to support throttling
> > > so that a given policy can succeed without manual intervention. If we
> > > solved these problems first, the rebalance tool would be much more
> > > accessible to Kafka users and operators.
> > >
> > > Assuming that we do this, the problem that KIP-46 aims to solve
> > > becomes much easier. You can separate the detection of permanent
> > > broker failures (trigger) from the mitigation (the above-mentioned
> > > improvements to data balancing). The latter will be a native
> > > capability in Kafka. Detecting permanent hardware failures is much
> > > more easily done via an external script that uses a simple health
> > > check (Part 1 of KIP-46).
> > >
> > > I agree that it will be great to *eventually* be able to fully
> > > automate both the trigger as well as the policies while also
> > > improving the mechanism.
> > > But I’m highly skeptical of big-bang approaches that go from a
> > > completely manual and cumbersome process to a fully automated one,
> > > especially when that involves large-scale data movement in a running
> > > cluster. Once we stabilize these changes and feel confident that they
> > > work, we can push the policy into the controller and have it
> > > automatically be triggered based on different events.
> > >
> > > Thanks,
> > > Neha
> > >
> > > On Tue, Feb 2, 2016 at 6:13 PM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Hey everyone,
> > > >
> > > > I just created a KIP to discuss automated replica reassignment when
> > > > we lose a broker in the cluster.
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-46%3A+Self+Healing+Kafka
> > > >
> > > > Any feedback is welcome.
> > > >
> > > > Thanks,
> > > > Aditya
> > >
> > > --
> > > Thanks,
> > > Neha
>
> --
> Thanks,
> Neha
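
For context, the pluggable policy Neha describes ("end up with an equal number of partitions on every broker while minimizing data movement") can be prototyped as an external script before any of it lives in the controller. Below is a minimal illustrative sketch in Python; the greedy heuristic and function names are hypothetical and not part of any Kafka tool, though the emitted JSON follows the layout consumed by kafka-reassign-partitions.sh --execute.

```python
import json
from collections import Counter

def rebalance(assignment, brokers):
    """Greedy partition-count policy (illustrative only): even out replica
    counts across brokers while moving as few replicas as possible.
    `assignment` maps (topic, partition) -> list of replica broker ids."""
    load = Counter({b: 0 for b in brokers})
    for replicas in assignment.values():
        for b in replicas:
            load[b] += 1
    total = sum(load.values())
    cap = -(-total // len(brokers))  # ceiling of the ideal per-broker count

    target = {tp: list(r) for tp, r in assignment.items()}
    for tp, replicas in target.items():
        for i, b in enumerate(replicas):
            if load[b] > cap:
                # Move this replica to the least-loaded broker that does
                # not already host a replica of this partition.
                dest = min((x for x in brokers if x not in replicas),
                           key=load.get, default=None)
                if dest is not None and load[dest] < cap:
                    load[b] -= 1
                    load[dest] += 1
                    replicas[i] = dest
    return target

def to_reassignment_json(target):
    # JSON layout accepted by kafka-reassign-partitions.sh --execute
    return json.dumps({
        "version": 1,
        "partitions": [
            {"topic": t, "partition": p, "replicas": r}
            for (t, p), r in sorted(target.items())
        ],
    }, indent=2)

# Toy cluster: broker 1 is overloaded, broker 3 is empty.
current = {
    ("logs", 0): [1], ("logs", 1): [1], ("logs", 2): [1], ("metrics", 0): [2],
}
plan = rebalance(current, brokers=[1, 2, 3])
print(to_reassignment_json(plan))
```

Keeping the policy in a script like this makes it cheap to test alternative policies (e.g. one that weighs partition sizes) before the trigger or the policy is ever pushed into the controller.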