With SimpleConsumer, you will have to handle leader discovery as well as
the ZooKeeper-based rebalancing yourself. You can see an example here -
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
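To make the rebalancing part concrete, here is a self-contained sketch (not
Kafka's actual source - class and consumer ids below are made up for
illustration) of the range-style assignment the high-level consumer performs
for a group: consumer ids are sorted, as their ZooKeeper ephemeral nodes would
be, and the topic's partitions are split into contiguous ranges. A
SimpleConsumer-based client would need to reimplement something like this,
on top of leader discovery and offset tracking:

```java
import java.util.*;

// Illustrative only: range-style partition assignment, roughly what the
// high-level consumer does via ZooKeeper during a rebalance.
public class RangeAssignment {

    // Assign each of numPartitions partitions to one consumer id. Earlier
    // consumers (in sorted order) get the extra partitions when the count
    // does not divide evenly.
    static Map<String, List<Integer>> assign(List<String> consumers,
                                             int numPartitions) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted); // deterministic order, like sorted ZK nodes
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int per = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = per + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) parts.add(next++);
            result.put(sorted.get(i), parts);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three consumers, three partitions: one partition each.
        System.out.println(assign(
            Arrays.asList("kafkacdh-1", "kafkacdh-2", "kafkacdh-3"), 3));
        // One consumer gone: the survivors pick up its partition.
        System.out.println(assign(
            Arrays.asList("kafkacdh-2", "kafkacdh-3"), 3));
    }
}
```

Running this mirrors the ConsumerOffsetChecker output below: with three
consumers each owns one partition, and when one disappears the remaining
two split its work.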

On Wed, Oct 8, 2014 at 11:45 AM, Sharninder <sharnin...@gmail.com> wrote:

> Thanks Gwen. This really helped.
>
> Yes, Kafka is the best thing ever :)
>
> Now how would this be done with the Simple consumer? I'm guessing I'll have
> to maintain my own state in Zookeeper or something of that sort?
>
>
> On Thu, Oct 9, 2014 at 12:01 AM, Gwen Shapira <gshap...@cloudera.com>
> wrote:
>
> > Here's an example (from ConsumerOffsetChecker tool) of 1 topic (t1)
> > and 1 consumer group (flume), each of the 3 topic partitions is being
> > read by a different machine running the flume consumer:
> > Group  Topic  Pid  Offset     logSize    Lag       Owner
> > flume  t1     0    50172068   100210042  50037974  flume_kafkacdh-1.ent.cloudera.com-1412722833783-3d6d80db-0
> > flume  t1     1    49914701   49914701   0         flume_kafkacdh-2.ent.cloudera.com-1412722838536-a6a4915d-0
> > flume  t1     2    54218841   82733380   28514539  flume_kafkacdh-3.ent.cloudera.com-1412722832793-b23eaa63-0
> >
> > If flume_kafkacdh-1 crashed, another consumer will pick up the partition:
> > Group  Topic  Pid  Offset     logSize    Lag       Owner
> > flume  t1     0    59669715   100210042  40540327  flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0
> > flume  t1     1    49914701   49914701   0         flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0
> > flume  t1     2    65796205   82733380   16937175  flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0
> >
> > Then I can start flume_kafkacdh-4 and see things rebalance again:
> > flume  t1     0    60669715   100210042  39540327  flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0
> > flume  t1     1    49914701   49914701   0         flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0
> > flume  t1     2    66829740   82733380   15903640  flume_kafkacdh-4.ent.cloudera.com-1412793053882-9bfddff9-0
> >
> > Isn't Kafka the best thing ever? :)
> >
> > Gwen
> >
> > On Wed, Oct 8, 2014 at 11:23 AM, Gwen Shapira <gshap...@cloudera.com>
> > wrote:
> > > yep. exactly.
> > >
> > > On Wed, Oct 8, 2014 at 11:07 AM, Sharninder <sharnin...@gmail.com>
> > wrote:
> > >> Thanks Gwen.
> > >>
> > >> When you're saying that I can add consumers to the same group, does
> > >> that also hold true if those consumers are running on different
> > >> machines? Or in different JVMs?
> > >>
> > >> --
> > >> Sharninder
> > >>
> > >>
> > >> On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira <gshap...@cloudera.com>
> > wrote:
> > >>
> > >>> If you use the high level consumer implementation, and register all
> > >>> consumers as part of the same group - they will load-balance
> > >>> automatically.
> > >>>
> > >>> When you add a consumer to the group, if there are enough partitions
> > >>> in the topic, some of the partitions will be assigned to the new
> > >>> consumer.
> > >>> When a consumer crashes, once its node in ZK times out, other
> > >>> consumers will get its partitions.
> > >>>
> > >>> Gwen
> > >>>
> > >>> On Wed, Oct 8, 2014 at 10:39 AM, Sharninder <sharnin...@gmail.com>
> > wrote:
> > >>> > Hi,
> > >>> >
> > >>> > I'm not even sure if this is a valid use-case, but I really wanted
> > >>> > to run it by you guys. How do I load balance my consumers? For
> > >>> > example, if my consumer machine is under load, I'd like to spin up
> > >>> > another VM with another consumer process to keep reading messages
> > >>> > off any topic. On similar lines, how do you guys handle consumer
> > >>> > failures? Suppose one consumer process gets an exception and
> > >>> > crashes, is it possible for me to somehow make sure that there is
> > >>> > another process that is still reading the queue for me?
> > >>> >
> > >>> > --
> > >>> > Sharninder
> > >>>
> >
>