Hi,

I'm using Kafka as the messaging system in my data pipeline. There are a
couple of producer processes, and Spark Streaming
<https://spark.apache.org/docs/2.2.1/streaming-kafka-0-10-integration.html>
and Druid's Kafka indexing service
<http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html>
consume from Kafka. The indexing service spawns 40 new indexing tasks
(Kafka consumers) every 15 minutes.
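To make the consumer churn concrete: whether the indexing tasks use group
subscription or manual partition assignment, each 15-minute cycle means
roughly 40 new consumer connections to the brokers. Below is a minimal
sketch of the pattern each task amounts to (the bootstrap address, group id
and topic name are placeholders, not my real configuration):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class IndexingTaskSketch {
        public static void main(String[] args) {
            // Placeholder settings -- not my real configuration.
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("group.id", "indexing-task-group");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            // From the brokers' point of view, each task is roughly a consumer
            // that connects, reads for ~15 minutes, then closes and is replaced.
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("topic1"));
                long deadline = System.currentTimeMillis() + 15 * 60 * 1000L;
                while (System.currentTimeMillis() < deadline) {
                    ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
                    // ... hand the records to the indexing task ...
                }
            } // close() drops the connection; the next cycle starts a new consumer
        }
    }

So the brokers see this join/read/leave cycle continuously, 40 consumers at
a time.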

The heap memory used by Kafka stays fairly constant for about an hour,
after which it shoots up to the maximum allocated space. The broker's
garbage collection logs seem to indicate a memory leak in Kafka. The plots
generated from the GC logs are attached.

*Kafka Deployment:*
3 nodes, with 3 topics and 64 partitions per topic

*Kafka Runtime JVM parameters:*
8GB heap memory
1GB swap memory
Using G1GC
MaxGCPauseMillis=20
InitiatingHeapOccupancyPercent=35

*Kafka Versions Used:*
I've tried Kafka versions 0.10.0, 0.11.0.2 and 1.0.0 and see similar
behavior with all of them.

*Questions:*
1) Is this a memory leak on the Kafka side or a misconfiguration of my
Kafka cluster? Does Kafka stably handle a large number of consumers being
added periodically?
2) As a knock-on effect, we also notice Kafka partitions going offline
periodically after some time, with the following error:
    ERROR [ReplicaFetcherThread-18-2], Error for partition [topic1,2] to
broker 2: org.apache.kafka.common.errors.UnknownTopicOrPartitionException:
This server does not host this topic-partition.
(kafka.server.ReplicaFetcherThread)
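If it helps with question 2, when those errors show up I can capture the
partition state with a small probe along these lines (a sketch using the
AdminClient available in 0.11+/1.0.0; the bootstrap address and topic name
are placeholders) and share the leader/ISR output:

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.TopicDescription;
    import org.apache.kafka.common.TopicPartitionInfo;

    public class PartitionStateProbe {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // placeholder address

            try (AdminClient admin = AdminClient.create(props)) {
                Map<String, TopicDescription> topics =
                        admin.describeTopics(Collections.singletonList("topic1")).all().get();
                for (TopicPartitionInfo p : topics.get("topic1").partitions()) {
                    // Print the leader and ISR for each partition at the time
                    // the ReplicaFetcherThread errors are logged.
                    System.out.printf("partition=%d leader=%s isr=%s%n",
                            p.partition(), p.leader(), p.isr());
                }
            }
        }
    }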

Can someone shed some light on the behavior we're seeing in this cluster?

Please let me know if more details are needed to root-cause this behavior.

Thanks in advance.

Avinash
[Attached GC plots: Screen Shot 2018-01-23 at 2.29.04 PM.png, Screen Shot
2018-01-23 at 2.29.21 PM.png]




-- 

Excuse brevity and typos. Sent from mobile device.
