Hello, I would like to write a multi-threaded consumer for the high-level consumer in Kafka 0.8.1. I have found two ways that seem feasible while keeping the guarantee that messages in a partition are processed in order. I would appreciate any feedback this list has.
Option 1 -------- - Create multiple threads, so each thread has its own ConsumerConnector. - Manually commit offsets in each thread after every N messages. - This was discussed a bit on this list previously. See [1]. ### Questions - Is there a problem with making multiple ConsumerConnectors per machine? - What does it take for ZooKeeper to handle this much load? We have a 3-node ZooKeeper cluster with relatively small machines. (I expect the topic will have about 40 messages per second. There will be 3 consumer groups. That would be 120 commits per second at most, but I can reduce the frequency of commits to make this lower.) ### Extra info Kafka 0.9 will have an entirely different commit API, which will allow one connection to commit offsets per partition, but I can’t wait that long. See [2]. Option 2 -------- - Create one ConsumerConnector, but ask for multiple streams in that connection. Give each thread one stream. - Since there is no way to commit offsets per stream right now, we need to do autoCommit. - This sacrifices the at-least-once processing guarantee, which would be nice to have. See KAFKA-1612 [3]. ### Extra info - There was some discussion in KAFKA-996 about a markForCommit() method so that autoCommit would preserve the at-least-once guarantee, but it seems more likely that the consumer API will just be redesigned to allow commits per partition instead. See [4]. So basically I'm wondering if option 1 is feasible. If not, I'll just do option 2. Of course, let me know if I was mistaken about anything or if there is another design which is better. Thanks in advance. Rafi [1] http://mail-archives.apache.org/mod_mbox/kafka-users/201310.mbox/%3cff142f6b499ae34caed4d263f6ca32901d35a...@extxmb19.nam.nsroot.net%3E [2] https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI [3] https://issues.apache.org/jira/browse/KAFKA-1612 [4] https://issues.apache.org/jira/browse/KAFKA-966