Re: Kafka elastic no downtime scalability
https://kafka.apache.org/documentation.html#basic_ops_cluster_expansion

~ Joe Stein
http://www.stealth.ly

On Fri, Mar 13, 2015 at 3:05 PM, sunil kalva sambarc...@gmail.com wrote:

Joe,

"Well, I know it is semantic, but right now it can be elastically scaled without downtime; you have to integrate it into your environment for what that means. It has been that way since 0.8.0, imho."

Here, what do you mean by "you have to integrate it into your environment"? How do I achieve an elastically scaled cluster seamlessly?

SunilKalva

On Fri, Mar 13, 2015 at 10:27 PM, Joe Stein joe.st...@stealth.ly wrote:

Well, I know it is semantic, but right now it can be elastically scaled without downtime; you have to integrate it into your environment for what that means. It has been that way since 0.8.0, imho. My point was just another way to do that out of the box... folks do this elastic scaling today with AWS CloudFormation and internal systems they built too. So, it can be done... you just have to do it.

~ Joe Stein
http://www.stealth.ly

On Fri, Mar 13, 2015 at 12:39 PM, Stevo Slavić ssla...@gmail.com wrote:

OK, thanks for the heads up. When reading the Apache Kafka docs, and reading what Apache Kafka can do, I expect it to already be available in the latest general availability release, not what's planned as part of some other project.

Kind regards,
Stevo Slavic.

On Fri, Mar 13, 2015 at 2:32 PM, Joe Stein joe.st...@stealth.ly wrote:

Hey Stevo, "can be elastically and transparently expanded without downtime" is the goal of Kafka on Mesos https://github.com/mesos/kafka, as Kafka has the ability (knobs/levers) to do this but has to be made to do this out of the box. E.g., in Kafka on Mesos, when a broker fails, after the configurable max failover timeout (meaning it is truly deemed a hard failure) a broker (with the same id) will automatically be started on another machine, data replicated, and back in action once that is done, automatically. Lots more features already in there... we are also in progress to auto balance partitions when increasing/decreasing the size of the cluster, and some more goodies too.

~ Joe Stein
http://www.stealth.ly

On Fri, Mar 13, 2015 at 8:43 AM, Stevo Slavić ssla...@gmail.com wrote:

Hello Apache Kafka community,

On the Apache Kafka website home page http://kafka.apache.org/ it is stated that Kafka "can be elastically and transparently expanded without downtime." Is that really true? More specifically, can one just add one more broker, have another partition added for the topic, have the new broker assigned to be the leader for the new partition, have producers correctly write to the new partition, and consumers read from it, with no broker, consumer or producer downtime, no data loss, and no manual action to move data from existing partitions to the new partition?

Kind regards,
Stevo Slavic.
Re: Kafka elastic no downtime scalability
Hi Stevo,

I won't speak for Joe, but what we do is documented in the link that Joe provided:

"Adding servers to a Kafka cluster is easy, just assign them a unique broker id and start up Kafka on your new servers. However these new servers will not automatically be assigned any data partitions, so unless partitions are moved to them they won't be doing any work until new topics are created. So usually when you add machines to your cluster you will want to migrate some existing data to these machines.

"The process of migrating data is manually initiated but fully automated. Under the covers what happens is that Kafka will add the new server as a follower of the partition it is migrating and allow it to fully replicate the existing data in that partition. When the new server has fully replicated the contents of this partition and joined the in-sync replica one of the existing replicas will delete their partition's data.

"The partition reassignment tool can be used to move partitions across brokers. An ideal partition distribution would ensure even data load and partition sizes across all brokers. In 0.8.1, the partition reassignment tool does not have the capability to automatically study the data distribution in a Kafka cluster and move partitions around to attain an even load distribution. As such, the admin has to figure out which topics or partitions should be moved around."

We use a tool called kafkat (https://github.com/airbnb/kafkat) for reassignment and other administrative tasks, and have added brokers and partitions without any problems. The manual part is that you initiate the commands yourself; Kafka takes care of the rest without any interruption to producers and consumers. I also want to make clear that kafkat is not necessary, but it makes things much easier.

Hope this helps clarify your doubts.
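As an illustration of the "admin has to figure out which partitions should be moved" step, here is a minimal sketch that builds a reassignment plan in the JSON layout the partition reassignment tool accepts through its --reassignment-json-file option. The topic name "events", the broker ids, and the round-robin placement are assumptions made for the example; this is a naive spread, not the assignment logic that Kafka or kafkat actually uses.

```python
import json


def round_robin_plan(topic, num_partitions, replication_factor, broker_ids):
    """Spread each partition's replicas over broker_ids in round-robin order.

    Naive illustration only: it ignores current data placement and load.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Consecutive brokers, wrapping around, so replicas of one
        # partition always land on distinct brokers.
        replicas = [broker_ids[(p + r) % len(broker_ids)]
                    for r in range(replication_factor)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster: brokers 1 and 2 existed, broker 3 was just added.
plan = round_robin_plan("events", num_partitions=4, replication_factor=2,
                        broker_ids=[1, 2, 3])
print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and handed to the reassignment tool; the data movement itself is then carried out by Kafka, as described above.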
Chi

On Fri, Mar 13, 2015 at 4:19 PM, Stevo Slavić ssla...@gmail.com wrote:

These features are all nice if one adds new brokers to support additional topics, or to move existing partitions or whole topics to new brokers. The referenced sentence is in the paragraph named "scalability". When I read "expanded" I was thinking of scaling out, extending parallelization capabilities, and parallelism in Kafka is achieved with partitions. So I understood that sentence to mean that it is possible to add more partitions to existing topics at runtime, with no downtime.

I just found in the source that there is an API for adding new partitions to existing topics (see https://github.com/apache/kafka/blob/0.8.2/core/src/main/scala/kafka/admin/AdminUtils.scala#L101 ). I have to try it. I guess it should work at runtime, causing no downtime or data loss, and without moving data from existing partitions to the new partition. Producers and consumers will eventually start writing to and reading from the new partition, and consumers should be able to read previously published messages from old partitions, even messages which, if they were sent again, would end up assigned/written to the new partition.

Kind regards,
Stevo Slavic.
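Stevo's remark about messages that "if they were sent again would end up assigned/written to new partition" can be illustrated with a small sketch. It assumes a hash-modulo partitioner, which is how keyed messages are commonly mapped to partitions; zlib.crc32 stands in for the real hash function, and the key names are made up for the example.

```python
import zlib


def partition_for(key, num_partitions):
    # Stand-in for a hash-mod partitioner. crc32 is used only because it is
    # deterministic across runs; it is not the hash Kafka clients use.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical keys, mapped before and after growing a topic from 4 to 5 partitions.
keys = ["user-%d" % i for i in range(1000)]
before = {k: partition_for(k, 4) for k in keys}
after = {k: partition_for(k, 5) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print("%d of %d keys map to a different partition after 4 -> 5"
      % (len(moved), len(keys)))
```

Under this assumption, messages already written stay where they are, while new messages for many keys go to a different partition, so per-key ordering across the change is something a consumer cannot rely on.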
Re: Kafka elastic no downtime scalability
Hey Stevo, "can be elastically and transparently expanded without downtime" is the goal of Kafka on Mesos https://github.com/mesos/kafka, as Kafka has the ability (knobs/levers) to do this but has to be made to do this out of the box. E.g., in Kafka on Mesos, when a broker fails, after the configurable max failover timeout (meaning it is truly deemed a hard failure) a broker (with the same id) will automatically be started on another machine, data replicated, and back in action once that is done, automatically. Lots more features already in there... we are also in progress to auto balance partitions when increasing/decreasing the size of the cluster, and some more goodies too.

~ Joe Stein
http://www.stealth.ly
Re: Kafka elastic no downtime scalability
OK, thanks for the heads up. When reading the Apache Kafka docs, and reading what Apache Kafka can do, I expect it to already be available in the latest general availability release, not what's planned as part of some other project.

Kind regards,
Stevo Slavic.
Re: Kafka elastic no downtime scalability
Well, I know it is semantic, but right now it can be elastically scaled without downtime; you have to integrate it into your environment for what that means. It has been that way since 0.8.0, imho. My point was just another way to do that out of the box... folks do this elastic scaling today with AWS CloudFormation and internal systems they built too. So, it can be done... you just have to do it.

~ Joe Stein
http://www.stealth.ly