The idea is to bake the functionality of such a tool into Kafka itself. In an ideal world, a Kafka cluster would automatically detect leader and data imbalance and trigger a rebalance operation that leads to optimal performance. I'm not sure whether we have a JIRA for this, though, so feel free to create one.

On Wed, Sep 17, 2014 at 5:51 PM, Alexis Midon <alexis.mi...@airbedandbreakfast.com> wrote:

> we would be very happy to contribute. However a description of the current
> plan and status regarding tooling would be helpful.
> It would speed up the learning curve. You mentioned some jira tickets?
>
> (maybe I should sign up to the developer mailing list and take the
> conversation over there)
>
> On Tue, Sep 16, 2014 at 6:46 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
>
> > Since these tools are so useful, I wonder what it requires (from both
> > Airbnb and Kafka) to merge this into the Kafka project. I think there are
> > a couple of Jiras regarding improved tool usability that this resolves.
> >
> > On Mon, Sep 15, 2014 at 11:45 AM, Alexis Midon
> > <alexis.mi...@airbedandbreakfast.com> wrote:
> > > distribution will be even based on the number of partitions.
> > > It is the same logic as AdminUtils.
> > > see
> > > https://github.com/airbnb/kafkat/blob/master/lib/kafkat/command/reassign.rb#L39
> > >
> > > On Sun, Sep 14, 2014 at 6:05 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > >
> > >> This is great. Thanks for sharing! Does kafkat automatically figure out
> > >> the right reassignment strategy based on even data distribution?
> > >>
> > >> On Wed, Sep 3, 2014 at 12:12 AM, Alexis Midon
> > >> <alexis.mi...@airbedandbreakfast.com> wrote:
> > >>
> > >> > Hi Marcin,
> > >> >
> > >> > A few weeks ago, I did an upgrade to 0.8.1.1 and then augmented the
> > >> > cluster from 3 to 9 brokers. All went smoothly.
> > >> > In a dev environment, we found out that the biggest pain point is to
> > >> > have to deal with the json file and the error-prone command line
> > >> > interface.
> > >> > So to make our life easier, my team mate Nelson [1] came up with kafkat:
> > >> > https://github.com/airbnb/kafkat
> > >> >
> > >> > We now install kafkat on every broker. Note that kafkat does NOT
> > >> > connect to a broker, but to zookeeper. So you can actually use it from
> > >> > any machine.
> > >> >
> > >> > For reassignment, please see:
> > >> > `kafkat reassign [topic] [--brokers <ids>] [--replicas <n>]`
> > >> > It will transparently generate and kick off a balanced assignment.
> > >> >
> > >> > feedback and contributions welcome! Enjoy!
> > >> >
> > >> > Alexis
> > >> >
> > >> > [1] https://github.com/nelgau
> > >> >
> > >> > On Tue, Aug 26, 2014 at 10:27 AM, Marcin Michalski
> > >> > <mmichal...@tagged.com> wrote:
> > >> >
> > >> > > I am running on 0.8.1.1 and I thought that the partition
> > >> > > reassignment tools can do this job. Just was not sure if this is
> > >> > > the best way to do this.
> > >> > > I will try this out in stage env first and will perform the same
> > >> > > in prod.
> > >> > >
> > >> > > Thanks,
> > >> > > marcin
> > >> > >
> > >> > > On Mon, Aug 25, 2014 at 7:23 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> > >> > >
> > >> > > > Marcin, that is a typical task now. What version of Kafka are
> > >> > > > you running?
> > >> > > >
> > >> > > > Take a look at
> > >> > > > https://kafka.apache.org/documentation.html#basic_ops_cluster_expansion
> > >> > > > and
> > >> > > > https://kafka.apache.org/documentation.html#basic_ops_increase_replication_factor
> > >> > > >
> > >> > > > Basically you can do a --generate to get the existing JSON
> > >> > > > topology and with that take the results of "Current partition
> > >> > > > replica assignment" (the first JSON that outputs) and make
> > >> > > > whatever changes (like sed old node for new node and add more
> > >> > > > replicas, which increases the replication factor, whatever you
> > >> > > > want) and then --execute.
> > >> > > >
> > >> > > > With lots of data this takes time so you will want to run
> > >> > > > --verify to see what is in progress... a good thing is to do a
> > >> > > > node at a time (even a topic at a time), however you want to
> > >> > > > manage and wait for it as such.
> > >> > > >
> > >> > > > The "preferred" replica is simply the first one in the list of
> > >> > > > replicas. The kafka-preferred-replica-election.sh just makes
> > >> > > > that replica the leader as this is not automatic yet.
> > >> > > >
> > >> > > > If you are running a version prior to 0.8.1.1 it might make
> > >> > > > sense to upgrade the old nodes first then run reassign to the
> > >> > > > new servers.
> > >> > > >
> > >> > > > /*******************************************
> > >> > > > Joe Stein
> > >> > > > Founder, Principal Consultant
> > >> > > > Big Data Open Source Security LLC
> > >> > > > http://www.stealth.ly
> > >> > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > >> > > > ********************************************/
> > >> > > >
> > >> > > > On Mon, Aug 25, 2014 at 8:59 PM, Marcin Michalski
> > >> > > > <mmichal...@tagged.com> wrote:
> > >> > > >
> > >> > > > > Hi, I would like to migrate my Kafka setup from old servers
> > >> > > > > to new servers. Let's say I have 8 really old servers that
> > >> > > > > have the kafka topics/partitions replicated 4 ways and want
> > >> > > > > to migrate the data to 4 brand new servers and want the
> > >> > > > > replication factor to be 3. I wonder if anyone has ever
> > >> > > > > performed this type of migration?
> > >> > > > >
> > >> > > > > Will auto rebalancing take care of this automatically if I do
> > >> > > > > the following?
> > >> > > > >
> > >> > > > > Let's say I bring old broker id 1 down and start up new
> > >> > > > > server broker id 100, is there a way to migrate all of the
> > >> > > > > data of the topic that had the topic (where broker id 1 was
> > >> > > > > the leader) over to the new broker 100?
> > >> > > > >
> > >> > > > > Or do I need to use *bin/kafka-preferred-replica-election.sh*
> > >> > > > > to reassign the topics/partitions from old broker 1 to broker
> > >> > > > > 100? And then just keep doing the same thing until all of the
> > >> > > > > old brokers are decommissioned?
> > >> > > > >
> > >> > > > > Also, would kafka-preferred-replica-election.sh let me
> > >> > > > > actually lower the number of replicas as well, if I just
> > >> > > > > simply make sure that a given topic/partition was only
> > >> > > > > elected 3 times versus 4?
> > >> > > > >
> > >> > > > > Thanks for your insight,
> > >> > > > > Marcin
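
For anyone following the manual route described in the thread (take the "Current partition replica assignment" JSON, edit it, then --execute), here is a minimal Python sketch of the editing step for the migration Marcin asked about. It assumes the reassignment file has the shape `{"version": 1, "partitions": [{"topic": ..., "partition": ..., "replicas": [...]}]}`. The function name `plan_reassignment`, the topic name `events`, and the broker ids 100 to 103 are hypothetical and only for illustration. The round-robin placement is a simple stand-in for an even distribution and is not claimed to match what kafkat or AdminUtils compute.

```python
import json


def plan_reassignment(current, new_brokers, replication_factor):
    """Build a plan that moves every partition onto new_brokers.

    `current` is the parsed "Current partition replica assignment" JSON.
    Replicas are placed round-robin over new_brokers. The first replica in
    each list is the preferred replica, so rotating the starting broker
    also spreads preferred leaders evenly.
    """
    if replication_factor > len(new_brokers):
        raise ValueError("replication factor exceeds number of brokers")

    # Sort so the plan does not depend on the order of the input JSON.
    partitions = sorted(current["partitions"],
                        key=lambda p: (p["topic"], p["partition"]))

    plan = []
    for index, part in enumerate(partitions):
        replicas = [new_brokers[(index + offset) % len(new_brokers)]
                    for offset in range(replication_factor)]
        plan.append({"topic": part["topic"],
                     "partition": part["partition"],
                     "replicas": replicas})
    return {"version": 1, "partitions": plan}


if __name__ == "__main__":
    # Hypothetical current state: replicated 4 ways on old brokers 1..8.
    current = {
        "version": 1,
        "partitions": [
            {"topic": "events", "partition": 0, "replicas": [1, 2, 3, 4]},
            {"topic": "events", "partition": 1, "replicas": [5, 6, 7, 8]},
        ],
    }
    # Target: four new brokers, replication factor 3.
    print(json.dumps(plan_reassignment(current, [100, 101, 102, 103], 3),
                     indent=2))
```

The printed JSON would be saved to a file and handed to kafka-reassign-partitions.sh with --reassignment-json-file and --execute, followed by --verify until the move completes. Since leadership does not move to the preferred replica automatically, kafka-preferred-replica-election.sh would be run afterwards.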