Re: Kafka elastic no downtime scalability

2015-03-13 Thread Joe Stein
https://kafka.apache.org/documentation.html#basic_ops_cluster_expansion

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Fri, Mar 13, 2015 at 3:05 PM, sunil kalva sambarc...@gmail.com wrote:

 Joe,

 "Well, I know it is semantic, but right now it can be elastically scaled
 without downtime, but you have to integrate it into your environment for
 what that means. It has been that way since 0.8.0, imho."

 What do you mean here by "you have to integrate it into your environment"?
 How do I achieve an elastically scaled cluster seamlessly?

 SunilKalva




Re: Kafka elastic no downtime scalability

2015-03-13 Thread Chi Hoang
Hi Stevo,
I won't speak for Joe, but what we do is documented in the link that Joe
provided:
Adding servers to a Kafka cluster is easy, just assign them a unique
broker id and start up Kafka on your new servers. However these new servers
will not automatically be assigned any data partitions, so unless
partitions are moved to them they won't be doing any work until new topics
are created. So usually when you add machines to your cluster you will want
to migrate some existing data to these machines.

The process of migrating data is manually initiated but fully automated.
Under the covers what happens is that Kafka will add the new server as a
follower of the partition it is migrating and allow it to fully replicate
the existing data in that partition. When the new server has fully
replicated the contents of this partition and joined the in-sync replicas,
one of the existing replicas will delete its copy of the partition's data.

The partition reassignment tool can be used to move partitions across
brokers. An ideal partition distribution would ensure even data load and
partition sizes across all brokers. In 0.8.1, the partition reassignment
tool does not have the capability to automatically study the data
distribution in a Kafka cluster and move partitions around to attain an
even load distribution. As such, the admin has to figure out which topics
or partitions should be moved around.

We use a tool called kafkat (https://github.com/airbnb/kafkat) for
reassignment and other administrative tasks, and have added brokers and
partitions without any problems. The manual part is that you initiate the
commands yourself, but Kafka takes care of the rest without any
interruption to producers and consumers. I also want to make clear that
kafkat is not necessary, but it makes things much easier.
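
To make the "admin has to figure out which partitions should be moved" step
concrete, here is a minimal sketch in Python of building a reassignment plan
in the JSON layout that the partition reassignment tool reads. The topic
name, broker ids, and the simple round-robin placement are assumptions
chosen for illustration; this is not what kafkat or Kafka does internally.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Spread each partition's replicas round-robin over broker_ids.

    Hypothetical helper for illustration only; a real plan should also
    take current data sizes and leader placement into account.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Replica r of partition p goes to the broker at offset (p + r).
        replicas = [
            broker_ids[(p + r) % len(broker_ids)]
            for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Assumed example: topic "events", 4 partitions, brokers 1-3 (3 is new).
plan = build_reassignment("events", 4, [1, 2, 3], 2)
print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and handed to
kafka-reassign-partitions.sh via --reassignment-json-file with --execute, as
described in the documentation Joe linked; please check the flags against
the version you run.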

Hope this helps clarify your doubts.

Chi

On Fri, Mar 13, 2015 at 4:19 PM, Stevo Slavić ssla...@gmail.com wrote:

 These features are all nice if one adds new brokers to support additional
 topics, or to move existing partitions or whole topics to new brokers.
 The referenced sentence is in the paragraph named "Scalability". When I
 read "expanded" I was thinking of scaling out, extending parallelization
 capabilities, and parallelism in Kafka is achieved with partitions. So I
 understood that sentence to mean that it is possible to add more partitions
 to existing topics at runtime, with no downtime.

 I just found in the source that there is an API for adding new partitions
 to existing topics (see

 https://github.com/apache/kafka/blob/0.8.2/core/src/main/scala/kafka/admin/AdminUtils.scala#L101
 ). I have to try it. I guess it should work at runtime, causing no
 downtime or data loss, and without moving data from existing partitions to
 the new one. Producers and consumers will eventually start writing to and
 reading from the new partition, and consumers should still be able to read
 previously published messages from the old partitions, even messages which,
 if they were sent again, would end up assigned/written to the new partition.
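
 The last point can be illustrated with a small Python sketch: when the
 partition is chosen as hash(key) modulo the partition count, adding a
 partition changes where many keys land, while messages already written
 stay where they are. The crc32 hash and the "user-N" keys are assumptions
 for illustration and are not Kafka's actual default partitioner.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash (crc32), NOT Kafka's real partitioner; it only shows
    # the effect of taking a hash modulo the partition count.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical keys; compare placement before and after adding a partition.
keys = [f"user-{i}" for i in range(1000)]
moved = [k for k in keys if partition_for(k, 4) != partition_for(k, 5)]
print(f"{len(moved)} of {len(keys)} keys map to a different partition "
      "after growing from 4 to 5 partitions")
```

 So per-key ordering across the old and new partition counts is something
 the application would have to account for if it relies on keyed messages.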


 Kind regards,
 Stevo Slavic.


Re: Kafka elastic no downtime scalability

2015-03-13 Thread Joe Stein
Hey Stevo, "can be elastically and transparently expanded without downtime"
is the goal of Kafka on Mesos https://github.com/mesos/kafka, as Kafka has
the ability (knobs/levers) to do this but has to be made to do this out of
the box.

E.g., in Kafka on Mesos, when a broker fails, then after the configurable max
failover timeout (meaning it is truly deemed a hard failure) a broker (with
the same id) will automatically be started on another machine, with data
replicated and the broker back in action once that is done. Lots more
features are already in there... we are also working on auto-balancing
partitions when increasing/decreasing the size of the cluster, and some more
goodies too.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Fri, Mar 13, 2015 at 8:43 AM, Stevo Slavić ssla...@gmail.com wrote:

 Hello Apache Kafka community,

 On Apache Kafka website home page http://kafka.apache.org/ it is stated
 that Kafka can be elastically and transparently expanded without
 downtime.
 Is that really true? More specifically, can one just add one more broker,
 have another partition added for the topic, have new broker assigned to be
 the leader for new partition, have producers correctly write to the new
 partition, and consumers read from it, with no broker, consumer or producer
 downtime, no data loss, no manual action to move data from existing
 partitions to new partition?

 Kind regards,
 Stevo Slavic.



Re: Kafka elastic no downtime scalability

2015-03-13 Thread Stevo Slavić
OK, thanks for the heads up.

When reading the Apache Kafka docs, and reading what Apache Kafka can do, I
expect it to already be available in the latest general availability release,
not what's planned as part of some other project.

Kind regards,
Stevo Slavic.

 



Re: Kafka elastic no downtime scalability

2015-03-13 Thread Joe Stein
Well, I know it is semantic, but right now it can be elastically scaled
without downtime, but you have to integrate it into your environment for
what that means. It has been that way since 0.8.0, imho.

My point was just another way to do that out of the box... folks do this
elastic scaling today with AWS CloudFormation and internal systems they
built too.

So, it can be done... you just have to do it.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -
