kaushik srinivas created KAFKA-6600:
---------------------------------------

             Summary: Kafka Bytes Out lags behind Kafka Bytes In on all brokers 
when topics replicated with 3 and flume kafka consumer.
                 Key: KAFKA-6600
                 URL: https://issues.apache.org/jira/browse/KAFKA-6600
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.10.0.1
            Reporter: kaushik srinivas


Setup details:

Kafka with 3 brokers (each broker has 10 cores and 32 GB of memory, with a 
12 GB heap).

Created topic with 120 partitions and replication factor 3.

Throughput per broker is ~40k msgs/sec, with bytes in at ~8 MB/sec.

The Flume Kafka source is used as the consumer.
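
For reference, the setup figures above imply roughly the following 
per-message and per-partition numbers. This is a back-of-the-envelope sketch 
in Python, assuming that "8mb/sec" means 8 * 10^6 bytes per second and that 
partition leaders are spread evenly over the 3 brokers. Neither assumption 
has been confirmed by measurement.

```python
# Back-of-the-envelope figures derived from the setup above.
# Assumptions (not measured): 1 MB = 1_000_000 bytes, and partition
# leaders are spread evenly across the brokers.

BROKERS = 3
PARTITIONS = 120
MSGS_PER_SEC_PER_BROKER = 40_000
BYTES_IN_PER_SEC_PER_BROKER = 8 * 1_000_000


def average_message_size(bytes_per_sec, msgs_per_sec):
    """Average message size in bytes."""
    return bytes_per_sec / msgs_per_sec


def leaders_per_broker(partitions, brokers):
    """Leader partitions per broker under an even spread."""
    return partitions / brokers


avg_size = average_message_size(BYTES_IN_PER_SEC_PER_BROKER,
                                MSGS_PER_SEC_PER_BROKER)
leaders = leaders_per_broker(PARTITIONS, BROKERS)
msgs_per_partition = MSGS_PER_SEC_PER_BROKER / leaders

print(f"average message size: {avg_size:.0f} bytes")
print(f"leader partitions per broker: {leaders:.0f}")
print(f"messages per leader partition: {msgs_per_partition:.0f} per second")
```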

Observations:

When the replication factor is kept at 1, bytes out and bytes in stop at 
exactly the same timestamp (i.e. when the producer to Kafka is stopped).

But when the replication factor is increased to 3, bytes out lags behind 
bytes in over time. The Flume Kafka source pulls data slowly, even though 
Flume is configured with very high memory and CPU allocations.
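
One way to put a number on this lag is to compare the last timestamp at 
which each metric was still non-zero. The sketch below is only an 
illustration: the function names are hypothetical, and the sample values are 
made up for the example rather than taken from the brokers in this report. 
It assumes the BytesInPerSec and BytesOutPerSec broker metrics have been 
exported as (timestamp, rate) pairs.

```python
def last_active_timestamp(samples, threshold=0.0):
    """Timestamp of the last sample whose rate exceeds the threshold.

    samples: list of (timestamp_seconds, bytes_per_second) pairs.
    Returns None if no sample exceeds the threshold.
    """
    active = [ts for ts, rate in samples if rate > threshold]
    return max(active) if active else None


def stop_lag_seconds(bytes_in, bytes_out, threshold=0.0):
    """Seconds by which bytes out keeps flowing after bytes in has stopped."""
    last_in = last_active_timestamp(bytes_in, threshold)
    last_out = last_active_timestamp(bytes_out, threshold)
    if last_in is None or last_out is None:
        return None
    return last_out - last_in


# Made-up samples for illustration only (not measurements from this issue).
example_in = [(0, 8e6), (10, 8e6), (20, 0.0), (30, 0.0)]
example_out = [(0, 8e6), (10, 8e6), (20, 8e6), (30, 0.0)]
print(stop_lag_seconds(example_in, example_out))
```

A result of zero would correspond to the replication factor 1 behaviour 
described above, where both metrics stop at the same timestamp.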

 

Tried increasing num.replica.fetchers from the default value of 1 to 10, 20, 
50, etc., and replica.fetch.max.bytes from the default 1 MB to 10 MB and 
20 MB, but the lag did not improve.

The number of under-replicated partitions is observed to be zero in the 
replica manager metrics in JMX.

Kafka brokers were monitored for CPU and memory: CPU usage peaks at 3% of 
total cores, and memory usage is 4 GB (32 GB configured).

The Flume Kafka source has overridden Kafka consumer properties: 
max.partition.fetch.bytes is kept at the default 1 MB and fetch.max.bytes is 
kept at the default 52 MB. The Flume Kafka source batch size is kept at the 
default value of 1000.
 agent.sources.****.kafka.consumer.fetch.max.bytes = 10485760
 agent.sources.****.kafka.consumer.max.partition.fetch.bytes = 10485760
 agent.sources.****.batchSize = 1000
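
As a quick check on the values listed above, the byte settings can be 
converted to MiB. This is a minimal sketch in Python; parse_properties and 
byte_settings_in_mib are hypothetical helper names, not part of Flume or 
Kafka, and the elided "****" source name is kept exactly as given. Both byte 
values appear to work out to 10 MiB, which may be worth comparing against 
the defaults mentioned in the text.

```python
# Convert the byte-valued Flume Kafka source settings to MiB.
# The "****" source name is elided in the report and is kept as-is.

MIB = 1024 * 1024

PROPERTIES = """
agent.sources.****.kafka.consumer.fetch.max.bytes = 10485760
agent.sources.****.kafka.consumer.max.partition.fetch.bytes = 10485760
agent.sources.****.batchSize = 1000
"""


def parse_properties(text):
    """Parse 'key = value' lines into a dict of string keys to int values."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        result[key.strip()] = int(value.strip())
    return result


def byte_settings_in_mib(props):
    """Return {key: size in MiB} for keys ending in '.bytes'."""
    return {k: v / MIB for k, v in props.items() if k.endswith(".bytes")}


props = parse_properties(PROPERTIES)
for key, mib in byte_settings_in_mib(props).items():
    print(f"{key}: {mib:g} MiB")
```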
 

What further tuning is needed to reduce the lag between bytes in and bytes 
out at the Kafka brokers with replication factor 3, or has any configuration 
been missed?

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
