Re: Load Balancing Consumers or Multiple consumers reading off same topic
What's the output of the ConsumerOffsetChecker tool? Thanks, Jun On Tue, Oct 28, 2014 at 7:31 AM, Natarajan, Murugavel murugavel.natara...@softwareag.com wrote: Hi, I have the following Kafka Setup Number of producer : 1 Number of topics : 1 Number of partitions : 2 Number of consumers : 3 (with same group id) Number of Kafka cluster : none(single Kafka server) Zookeeper.session.timeout : 1000 Producer produces messages without any specific partitioning logic(default partitioning logic). Consumer 1 consumes message continuously. I am abruptly killing consumer 1 and I would except consumer 2 or consumer 3 to consume the messages after the failure of consumer 1. In some cases rebalance occurs and consumer 2 starts consuming messages. This is perfectly fine. But in some cases either consumer 2 or consumer 3 is not at all consuming. I have to manually kill all the consumers and start all three consumers again. Only after this restart consumer 1 starts consuming again. Precisely rebalance is successful in some cases while in some cases rebalance is not successful. Is there any configuration that I am missing. Cheers and Regards, Murugavel .
Re: Load Balancing Consumers or Multiple consumers reading off same topic
Hi, I have the following Kafka Setup Number of producer : 1 Number of topics : 1 Number of partitions : 2 Number of consumers : 3 (with same group id) Number of Kafka cluster : none(single Kafka server) Zookeeper.session.timeout : 1000 Producer produces messages without any specific partitioning logic(default partitioning logic). Consumer 1 consumes message continuously. I am abruptly killing consumer 1 and I would except consumer 2 or consumer 3 to consume the messages after the failure of consumer 1. In some cases rebalance occurs and consumer 2 starts consuming messages. This is perfectly fine. But in some cases either consumer 2 or consumer 3 is not at all consuming. I have to manually kill all the consumers and start all three consumers again. Only after this restart consumer 1 starts consuming again. Precisely rebalance is successful in some cases while in some cases rebalance is not successful. Is there any configuration that I am missing. Cheers and Regards, Murugavel .
Re: Load Balancing Consumers or Multiple consumers reading off same topic
With SimpleConsumer, you will have to handle leader discovery as well as zookeeper based rebalancing. You can see an example here - https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example On Wed, Oct 8, 2014 at 11:45 AM, Sharninder sharnin...@gmail.com wrote: Thanks Gwen. This really helped. Yes, Kafka is the best thing ever :) Now how would this be done with the Simple consumer? I'm guessing I'll have to maintain my own state in Zookeeper or something of that sort? On Thu, Oct 9, 2014 at 12:01 AM, Gwen Shapira gshap...@cloudera.com wrote: Here's an example (from ConsumerOffsetChecker tool) of 1 topic (t1) and 1 consumer group (flume), each of the 3 topic partitions is being read by a different machine running the flume consumer: Group Topic Pid Offset logSize Lag Owner flume t1 0 50172068 100210042 50037974 flume_kafkacdh-1.ent.cloudera.com-1412722833783-3d6d80db-0 flume t1 1 49914701 499147010 flume_kafkacdh-2.ent.cloudera.com-1412722838536-a6a4915d-0 flume t1 2 54218841 8273338028514539 flume_kafkacdh-3.ent.cloudera.com-1412722832793-b23eaa63-0 If flume_kafkacdh-1 crashed, another broker will pick up the partition: Group Topic Pid Offset logSize Lag Owner flume t1 0 59669715 100210042 40540327 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 1 49914701 499147010 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 2 65796205 8273338016937175 flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0 Then I can start flume_kafkacdh-4 and see things rebalance again: flume t1 0 60669715 100210042 39540327 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 1 49914701 499147010 flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0 flume t1 2 66829740 8273338015903640 flume_kafkacdh-4.ent.cloudera.com-1412793053882-9bfddff9-0 Isn't Kafka the best thing ever? :) Gwen On Wed, Oct 8, 2014 at 11:23 AM, Gwen Shapira gshap...@cloudera.com wrote: yep. exactly. On Wed, Oct 8, 2014 at 11:07 AM, Sharninder sharnin...@gmail.com wrote: Thanks Gwen. When you're saying that I can add consumers to the same group, does that also hold true if those consumers are running on different machines? Or in different JVMs? -- Sharninder On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira gshap...@cloudera.com wrote: If you use the high level consumer implementation, and register all consumers as part of the same group - they will load-balance automatically. When you add a consumer to the group, if there are enough partitions in the topic, some of the partitions will be assigned to the new consumer. When a consumer crashes, once its node in ZK times out, other consumers will get its partitions. Gwen On Wed, Oct 8, 2014 at 10:39 AM, Sharninder sharnin...@gmail.com wrote: Hi, I'm not even sure if this is a valid use-case, but I really wanted to run it by you guys. How do I load balance my consumers? For example, if my consumer machine is under load, I'd like to spin up another VM with another consumer process to keep reading messages off any topic. On similar lines, how do you guys handle consumer failures? Suppose one consumer process gets an exception and crashes, is it possible for me to somehow make sure that there is another process that is still reading the queue for me? -- Sharninder
Re: Load Balancing Consumers or Multiple consumers reading off same topic
Thanks Gwen. When you're saying that I can add consumers to the same group, does that also hold true if those consumers are running on different machines? Or in different JVMs? -- Sharninder On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira gshap...@cloudera.com wrote: If you use the high level consumer implementation, and register all consumers as part of the same group - they will load-balance automatically. When you add a consumer to the group, if there are enough partitions in the topic, some of the partitions will be assigned to the new consumer. When a consumer crashes, once its node in ZK times out, other consumers will get its partitions. Gwen On Wed, Oct 8, 2014 at 10:39 AM, Sharninder sharnin...@gmail.com wrote: Hi, I'm not even sure if this is a valid use-case, but I really wanted to run it by you guys. How do I load balance my consumers? For example, if my consumer machine is under load, I'd like to spin up another VM with another consumer process to keep reading messages off any topic. On similar lines, how do you guys handle consumer failures? Suppose one consumer process gets an exception and crashes, is it possible for me to somehow make sure that there is another process that is still reading the queue for me? -- Sharninder
Re: Load Balancing Consumers or Multiple consumers reading off same topic
If you use the high level consumer implementation, and register all consumers as part of the same group - they will load-balance automatically. When you add a consumer to the group, if there are enough partitions in the topic, some of the partitions will be assigned to the new consumer. When a consumer crashes, once its node in ZK times out, other consumers will get its partitions. Gwen On Wed, Oct 8, 2014 at 10:39 AM, Sharninder sharnin...@gmail.com wrote: Hi, I'm not even sure if this is a valid use-case, but I really wanted to run it by you guys. How do I load balance my consumers? For example, if my consumer machine is under load, I'd like to spin up another VM with another consumer process to keep reading messages off any topic. On similar lines, how do you guys handle consumer failures? Suppose one consumer process gets an exception and crashes, is it possible for me to somehow make sure that there is another process that is still reading the queue for me? -- Sharninder
Re: Load Balancing Consumers or Multiple consumers reading off same topic
yep. exactly. On Wed, Oct 8, 2014 at 11:07 AM, Sharninder sharnin...@gmail.com wrote: Thanks Gwen. When you're saying that I can add consumers to the same group, does that also hold true if those consumers are running on different machines? Or in different JVMs? -- Sharninder On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira gshap...@cloudera.com wrote: If you use the high level consumer implementation, and register all consumers as part of the same group - they will load-balance automatically. When you add a consumer to the group, if there are enough partitions in the topic, some of the partitions will be assigned to the new consumer. When a consumer crashes, once its node in ZK times out, other consumers will get its partitions. Gwen On Wed, Oct 8, 2014 at 10:39 AM, Sharninder sharnin...@gmail.com wrote: Hi, I'm not even sure if this is a valid use-case, but I really wanted to run it by you guys. How do I load balance my consumers? For example, if my consumer machine is under load, I'd like to spin up another VM with another consumer process to keep reading messages off any topic. On similar lines, how do you guys handle consumer failures? Suppose one consumer process gets an exception and crashes, is it possible for me to somehow make sure that there is another process that is still reading the queue for me? -- Sharninder
Re: Load Balancing Consumers or Multiple consumers reading off same topic
Here's an example (from ConsumerOffsetChecker tool) of 1 topic (t1) and 1 consumer group (flume), each of the 3 topic partitions is being read by a different machine running the flume consumer: Group Topic Pid Offset logSize Lag Owner flume t1 0 50172068 100210042 50037974 flume_kafkacdh-1.ent.cloudera.com-1412722833783-3d6d80db-0 flume t1 1 49914701 499147010 flume_kafkacdh-2.ent.cloudera.com-1412722838536-a6a4915d-0 flume t1 2 54218841 8273338028514539 flume_kafkacdh-3.ent.cloudera.com-1412722832793-b23eaa63-0 If flume_kafkacdh-1 crashed, another broker will pick up the partition: Group Topic Pid Offset logSize Lag Owner flume t1 0 59669715 100210042 40540327 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 1 49914701 499147010 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 2 65796205 8273338016937175 flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0 Then I can start flume_kafkacdh-4 and see things rebalance again: flume t1 0 60669715 100210042 39540327 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 1 49914701 499147010 flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0 flume t1 2 66829740 8273338015903640 flume_kafkacdh-4.ent.cloudera.com-1412793053882-9bfddff9-0 Isn't Kafka the best thing ever? :) Gwen On Wed, Oct 8, 2014 at 11:23 AM, Gwen Shapira gshap...@cloudera.com wrote: yep. exactly. On Wed, Oct 8, 2014 at 11:07 AM, Sharninder sharnin...@gmail.com wrote: Thanks Gwen. When you're saying that I can add consumers to the same group, does that also hold true if those consumers are running on different machines? Or in different JVMs? -- Sharninder On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira gshap...@cloudera.com wrote: If you use the high level consumer implementation, and register all consumers as part of the same group - they will load-balance automatically. When you add a consumer to the group, if there are enough partitions in the topic, some of the partitions will be assigned to the new consumer. When a consumer crashes, once its node in ZK times out, other consumers will get its partitions. Gwen On Wed, Oct 8, 2014 at 10:39 AM, Sharninder sharnin...@gmail.com wrote: Hi, I'm not even sure if this is a valid use-case, but I really wanted to run it by you guys. How do I load balance my consumers? For example, if my consumer machine is under load, I'd like to spin up another VM with another consumer process to keep reading messages off any topic. On similar lines, how do you guys handle consumer failures? Suppose one consumer process gets an exception and crashes, is it possible for me to somehow make sure that there is another process that is still reading the queue for me? -- Sharninder
Re: Load Balancing Consumers or Multiple consumers reading off same topic
Thanks Gwen. This really helped. Yes, Kafka is the best thing ever :) Now how would this be done with the Simple consumer? I'm guessing I'll have to maintain my own state in Zookeeper or something of that sort? On Thu, Oct 9, 2014 at 12:01 AM, Gwen Shapira gshap...@cloudera.com wrote: Here's an example (from ConsumerOffsetChecker tool) of 1 topic (t1) and 1 consumer group (flume), each of the 3 topic partitions is being read by a different machine running the flume consumer: Group Topic Pid Offset logSize Lag Owner flume t1 0 50172068 100210042 50037974 flume_kafkacdh-1.ent.cloudera.com-1412722833783-3d6d80db-0 flume t1 1 49914701 499147010 flume_kafkacdh-2.ent.cloudera.com-1412722838536-a6a4915d-0 flume t1 2 54218841 8273338028514539 flume_kafkacdh-3.ent.cloudera.com-1412722832793-b23eaa63-0 If flume_kafkacdh-1 crashed, another broker will pick up the partition: Group Topic Pid Offset logSize Lag Owner flume t1 0 59669715 100210042 40540327 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 1 49914701 499147010 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 2 65796205 8273338016937175 flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0 Then I can start flume_kafkacdh-4 and see things rebalance again: flume t1 0 60669715 100210042 39540327 flume_kafkacdh-2.ent.cloudera.com-1412792880818-b4aa6feb-0 flume t1 1 49914701 499147010 flume_kafkacdh-3.ent.cloudera.com-1412792871089-cabd4934-0 flume t1 2 66829740 8273338015903640 flume_kafkacdh-4.ent.cloudera.com-1412793053882-9bfddff9-0 Isn't Kafka the best thing ever? :) Gwen On Wed, Oct 8, 2014 at 11:23 AM, Gwen Shapira gshap...@cloudera.com wrote: yep. exactly. On Wed, Oct 8, 2014 at 11:07 AM, Sharninder sharnin...@gmail.com wrote: Thanks Gwen. When you're saying that I can add consumers to the same group, does that also hold true if those consumers are running on different machines? Or in different JVMs? -- Sharninder On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira gshap...@cloudera.com wrote: If you use the high level consumer implementation, and register all consumers as part of the same group - they will load-balance automatically. When you add a consumer to the group, if there are enough partitions in the topic, some of the partitions will be assigned to the new consumer. When a consumer crashes, once its node in ZK times out, other consumers will get its partitions. Gwen On Wed, Oct 8, 2014 at 10:39 AM, Sharninder sharnin...@gmail.com wrote: Hi, I'm not even sure if this is a valid use-case, but I really wanted to run it by you guys. How do I load balance my consumers? For example, if my consumer machine is under load, I'd like to spin up another VM with another consumer process to keep reading messages off any topic. On similar lines, how do you guys handle consumer failures? Suppose one consumer process gets an exception and crashes, is it possible for me to somehow make sure that there is another process that is still reading the queue for me? -- Sharninder