With SimpleConsumer, you have to handle leader discovery yourself, as well as ZooKeeper-based rebalancing. You can see an example here: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
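A minimal sketch of the partition-assignment half of that work, written in Python rather than against the Java SimpleConsumer API. It assumes every consumer can see the same list of live group members (for example, from ephemeral nodes you register in ZooKeeper yourself) and computes its own share deterministically. The function name, the shortened consumer ids, and the range-style split are illustrative assumptions, not Kafka API; leader discovery and offset storage are not shown.

```python
def assign_partitions(partition_ids, consumer_ids):
    """Split partitions across live consumers in contiguous ranges.

    Both lists are sorted first, so every consumer that sees the same
    membership computes the same assignment without talking to the others.
    """
    partitions = sorted(partition_ids)
    consumers = sorted(consumer_ids)
    if not consumers:
        return {}
    # Each consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        count = base + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


# Hypothetical ids, shortened from the owners that appear later in this thread.
before = assign_partitions([0, 1, 2], ["cdh-1", "cdh-2", "cdh-3"])
after_crash = assign_partitions([0, 1, 2], ["cdh-2", "cdh-3"])
after_join = assign_partitions([0, 1, 2], ["cdh-2", "cdh-3", "cdh-4"])
```

Each consumer would rerun this whenever the membership list changes, then stop fetching partitions it no longer owns and start fetching the new ones from the offsets it stored.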

On Wed, Oct 8, 2014 at 11:45 AM, Sharninder <sharnin...@gmail.com> wrote:

> Thanks Gwen. This really helped.
>
> Yes, Kafka is the best thing ever :)
>
> Now how would this be done with the Simple consumer? I'm guessing I'll have
> to maintain my own state in Zookeeper or something of that sort?
>
>
> On Thu, Oct 9, 2014 at 12:01 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
>
> > Here's an example (from ConsumerOffsetChecker tool) of 1 topic (t1)
> > and 1 consumer group (flume), each of the 3 topic partitions is being
> > read by a different machine running the flume consumer:
> > Group  Topic  Pid  Offset    logSize    Lag       Owner
> > flume  t1     0    50172068  100210042  50037974  flume_kafkacdh-1.ent.cloudera.com-1412722833783-3d6d80db-0
> > flume  t1     1    49914701  49914701   0         flume_kafkacdh-2.ent.cloudera.com-1412722838536-a6a4915d-0
> > flume  t1     2    54218841  82733380   28514539  flume_kafkacdh-3.ent.cloudera.com-1412722832793-b23eaa63-0
> >
> > If flume_kafkacdh-1 crashed, another consumer will pick up the partition:
> > Group  Topic  Pid  Offset    logSize    Lag       Owner
> > flume  t1     0    59669715  100210042  40540327  flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0
> > flume  t1     1    49914701  49914701   0         flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0
> > flume  t1     2    65796205  82733380   16937175  flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0
> >
> > Then I can start flume_kafkacdh-4 and see things rebalance again:
> > flume  t1     0    60669715  100210042  39540327  flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0
> > flume  t1     1    49914701  49914701   0         flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0
> > flume  t1     2    66829740  82733380   15903640  flume_kafkacdh-4.ent.cloudera.com-1412793053882-9bfddff9-0
> >
> > Isn't Kafka the best thing ever? :)
> >
> > Gwen
> >
> > On Wed, Oct 8, 2014 at 11:23 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > > yep. exactly.
> > >
> > > On Wed, Oct 8, 2014 at 11:07 AM, Sharninder <sharnin...@gmail.com> wrote:
> > >> Thanks Gwen.
> > >>
> > >> When you're saying that I can add consumers to the same group, does that
> > >> also hold true if those consumers are running on different machines? Or in
> > >> different JVMs?
> > >>
> > >> --
> > >> Sharninder
> > >>
> > >>
> > >> On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
> > >>
> > >>> If you use the high level consumer implementation, and register all
> > >>> consumers as part of the same group - they will load-balance
> > >>> automatically.
> > >>>
> > >>> When you add a consumer to the group, if there are enough partitions
> > >>> in the topic, some of the partitions will be assigned to the new
> > >>> consumer.
> > >>> When a consumer crashes, once its node in ZK times out, other
> > >>> consumers will get its partitions.
> > >>>
> > >>> Gwen
> > >>>
> > >>> On Wed, Oct 8, 2014 at 10:39 AM, Sharninder <sharnin...@gmail.com> wrote:
> > >>> > Hi,
> > >>> >
> > >>> > I'm not even sure if this is a valid use-case, but I really wanted to run
> > >>> > it by you guys. How do I load balance my consumers? For example, if my
> > >>> > consumer machine is under load, I'd like to spin up another VM with another
> > >>> > consumer process to keep reading messages off any topic. On similar lines,
> > >>> > how do you guys handle consumer failures? Suppose one consumer process gets
> > >>> > an exception and crashes, is it possible for me to somehow make sure that
> > >>> > there is another process that is still reading the queue for me?
> > >>> >
> > >>> > --
> > >>> > Sharninder
> > >>> >
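
For reference, the Lag column in the ConsumerOffsetChecker output quoted in this thread is the partition's log size minus the group's committed offset. A small sketch that checks this against the first table; the helper name is made up for illustration, and the numbers are copied from that output.

```python
def consumer_lag(log_size, offset):
    """Messages written to the partition that the group has not yet consumed."""
    return log_size - offset


# (partition id, committed offset, log size) from the first quoted table.
rows = [
    (0, 50172068, 100210042),
    (1, 49914701, 49914701),
    (2, 54218841, 82733380),
]
lags = {pid: consumer_lag(log_size, offset) for pid, offset, log_size in rows}
```

A lag of 0, as on partition 1, means the consumer has caught up with everything currently in that partition.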