[ 
https://issues.apache.org/jira/browse/KAFKA-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justinwins updated KAFKA-14171:
-------------------------------
    Description: 
 - This looks like a small bug (or improvement), but it has a real impact on 
MM2 performance.

 - When DistributedHerder starts, it calls startServices() --> 
this.worker.start() --> offsetBackingStore.start() --> offsetLog.start(), and 
finally the `KafkaBasedLog` class calls 
`consumer.seekToBeginning(partitions)`. See 
`org.apache.kafka.connect.util.KafkaBasedLog#start` for details.

 - The mm2-offsets topic is retained for 7 days (as defined by 
'retention.ms'). If MM2 replicates many partitions, the mm2-offsets topic can 
grow quite large within those 7 days, and it may take a few minutes or more 
of polling until the consumer reaches the end of the log. This is a very 
CPU-intensive operation, and it causes CPU throttling in the k8s container.

 - I think the mm2-offsets topic, or more specifically the topic backing 
KafkaBasedLog, is a special topic. At the very least, we could set a much 
shorter TTL for it to avoid this problem.
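
 - To make the size argument above concrete, here is a rough sketch of the 
arithmetic. It assumes a steady write rate and purely time-based retention 
(no compaction). The class name, the method name and the rate of 100 records 
per second are hypothetical and only for illustration; they are not 
measurements and not Kafka code.

```java
import java.time.Duration;

/** Back-of-the-envelope estimate of how much an offsets topic must replay at startup. */
public class OffsetReplayEstimate {

    /**
     * Records still on the topic after one full retention window, assuming a
     * steady write rate and purely time-based retention (no compaction).
     */
    static long retainedRecords(long recordsPerSecond, Duration retention) {
        return Math.multiplyExact(recordsPerSecond, retention.getSeconds());
    }

    public static void main(String[] args) {
        long rate = 100; // hypothetical offset records per second, not measured
        long sevenDays = retainedRecords(rate, Duration.ofDays(7));
        long oneHour = retainedRecords(rate, Duration.ofHours(1));
        System.out.println("replayed at startup, 7d retention: " + sevenDays);
        System.out.println("replayed at startup, 1h retention: " + oneHour);
        System.out.println("reduction factor: " + (sevenDays / oneHour));
    }
}
```

Under these assumptions the reduction factor does not depend on the write 
rate: 7 days is 168 hours, so a 1h retention would leave roughly 1/168 of the 
records to read after `seekToBeginning`.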

  was:
- This looks like a small bug (or improvement), but it has a real impact on 
MM2 performance.

- When DistributedHerder starts, it calls startServices() --> 
this.worker.start() --> 

offsetBackingStore.start() --> offsetLog.start(), and finally the 
`KafkaBasedLog` class calls 

`consumer.seekToBeginning(partitions)`. 

See `org.apache.kafka.connect.util.KafkaBasedLog#start` for details.

- The mm2-offsets topic is retained for 7 days (as defined by 
'retention.ms'). If MM2 replicates many partitions, the mm2-offsets topic can 
grow quite large within those 7 days, and it may take a few minutes or more 
of polling until the consumer reaches the end of the log. This is a very 
CPU-intensive operation, and it causes CPU throttling in the k8s container.

- I think the mm2-offsets topic, or more specifically the topic backing 
KafkaBasedLog, is a special topic. At the very least, we could set a much 
shorter TTL for it to avoid this problem.


> mm2-offsets topic should be set retention.ms=1h or less as default
> ------------------------------------------------------------------
>
>                 Key: KAFKA-14171
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14171
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>            Reporter: Justinwins
>            Priority: Major
>
> - This looks like a small bug (or improvement), but it has a real impact on 
> MM2 performance.
>  - When DistributedHerder starts, it calls startServices() --> 
> this.worker.start() --> offsetBackingStore.start() --> offsetLog.start(), 
> and finally the `KafkaBasedLog` class calls 
> `consumer.seekToBeginning(partitions)`. See 
> `org.apache.kafka.connect.util.KafkaBasedLog#start` for details.
>  - The mm2-offsets topic is retained for 7 days (as defined by 
> 'retention.ms'). If MM2 replicates many partitions, the mm2-offsets topic 
> can grow quite large within those 7 days, and it may take a few minutes or 
> more of polling until the consumer reaches the end of the log. This is a 
> very CPU-intensive operation, and it causes CPU throttling in the k8s 
> container.
>  - I think the mm2-offsets topic, or more specifically the topic backing 
> KafkaBasedLog, is a special topic. At the very least, we could set a much 
> shorter TTL for it to avoid this problem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
