Hi,
I read the page at https://cwiki.apache.org/KAFKA/consumer-group-example.html
and have a few questions (if not too many).
1 When the page says the iterator may block, does that mean hasNext() may block?
2 "Remember, you can only use a single process per Consumer Group."
Does this mean we can run only a single process per node of the cluster for
a consumer group?
Or that there can be only one process in the whole cluster for a consumer group?
Please clarify.
3 Why save the offset to ZooKeeper? Wouldn't it be easier to save it to a local file?
4 When the client exits or crashes, or the leader for a partition changes, duplicate
messages may be replayed. "To help avoid this (replayed duplicate messages),
make sure you provide a clean way for your client to exit instead of assuming
it can be 'kill -9'd."
a. For a client exit, if the client is receiving data at the time, how does it
do a clean exit? How can the client tell the consumer to write the offset to
ZooKeeper before exiting?
b. For a client crash, what can the client do to avoid duplicate messages when
it is restarted? What I can think of is to read the last message from the log
file and ignore the first few duplicate messages received, up to and including
that last-read message. But is it possible for the client to read the log file
directly?
c. When the partition leader changes, is there anything that clients
can do to avoid duplicates?
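
To make the idea in (b) more concrete, here is a rough sketch of what I have in
mind, written in Python only for brevity. It is not Kafka API code. It assumes
each received message carries its partition and offset, and that the client
keeps its own record of the last offset it finished processing per partition
(the names skip_replayed and last_processed are made up for illustration). I
compare offsets rather than message contents, since two distinct messages
could have the same contents.

```python
def skip_replayed(messages, last_processed):
    """Yield only the messages the client has not processed yet.

    messages: iterable of (partition, offset, payload) tuples, in the order
        the consumer delivers them after a restart (assumed shape).
    last_processed: dict mapping partition -> offset of the last message the
        client finished processing before the crash. This is hypothetical
        bookkeeping kept by the client itself, e.g. in a local file.
    """
    for partition, offset, payload in messages:
        # Partitions with no recorded offset are treated as never processed.
        if offset <= last_processed.get(partition, -1):
            continue  # replayed duplicate: already handled before the crash
        yield partition, offset, payload


# Example: the client had processed partition 0 up to offset 42 before crashing.
replayed = [(0, 41, "a"), (0, 42, "b"), (0, 43, "c"), (1, 7, "x")]
fresh = list(skip_replayed(replayed, {0: 42}))
```

This only works if the client records its progress after fully processing each
message. A crash between processing a message and recording it would still
produce one duplicate.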
Thanks.
Libo