Hi,

I was wondering whether there is any difference in the memory footprint of a high-level consumer when:

1. the consumer is live and continuously consuming messages with no backlog, versus
2. the consumer has been down for quite some time and is brought back up to clear the backlog.

My test case with Kafka 0.8.2.1, using a single topic:

Setup: 6 brokers and 3 ZooKeeper nodes
Message size: 1 MB
Producer rate: 100 threads with 1000 messages per thread
No. of partitions in topic: 100
Consumer threads: 100 consumer threads in the same group

I initially started the producer and consumer in the same Java process with a 1 GB heap. The producer could send all the messages to the brokers, but the consumer started throwing OutOfMemory exceptions after consuming 26k messages. Upon restarting the process with a 5 GB heap, the consumer consumed around 4.8k messages before going OOM (while clearing a backlog of around 74k). The rest of the messages got consumed when I bumped the heap up to 10 GB.

On the consumer, I have the default values for fetch.message.max.bytes and queued.max.message.chunks. If the calculation (fetch.message.max.bytes) * (queued.max.message.chunks) * (no. of consumer threads) holds for the consumer, then 1024 * 1024 * 10 * 100 (close to 1 GB) is well below the 5 GB heap allocated. Did I leave something out of this calculation?

Regards,
Kris
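P.S. For reference, here is the back-of-envelope arithmetic behind my estimate. The config values are the defaults I am assuming for 0.8.2.1 (1 MiB fetch size, 10 queued chunks); please correct me if they differ:

```python
# Back-of-envelope consumer memory estimate (assumed values, not measured):
fetch_message_max_bytes = 1024 * 1024  # assumed default fetch.message.max.bytes (1 MiB)
queued_max_message_chunks = 10         # assumed default queued.max.message.chunks
consumer_threads = 100                 # consumer threads in my group

estimated_bytes = (fetch_message_max_bytes
                   * queued_max_message_chunks
                   * consumer_threads)

print(f"{estimated_bytes / 1024**3:.2f} GiB")  # prints "0.98 GiB"
```

So by this formula the buffered-message footprint should stay just under 1 GiB, which is why the OOMs at 5 GB surprised me.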