I ran into a question while using Kafka recently:

Why does a new replica start fetching from the leader's logStartOffset when 
Kafka runs a partition reassignment (for example, when brokers are added to 
expand the cluster)? If the partition leader stores a large amount of data 
locally, the catch-up reads can saturate its disk I/O. There is a throttling 
mechanism for this, but it is hard to pick an appropriate threshold. If the 
setting is off, the disk can still take a sudden hit.

I have an idea: during a reassignment, the new replica could start fetching 
from either logStartOffset or logEndOffset, chosen by some strategy (I haven't 
worked out the details yet, but it should not be very complicated). For 
example, if the leader's local data volume is small, the replica can start 
directly from logStartOffset so that the reassignment completes quickly. If 
the leader's local data volume is huge, the replica can start from 
logEndOffset instead, and it would be added to the ISR, completing the 
reassignment, only after it has accumulated data covering the retention 
period.
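
To make the idea a bit more concrete, here is a rough sketch of the decision 
logic, written in Python only for brevity. Nothing in it is an existing Kafka 
API or config: LeaderLogInfo, choose_fetch_start, eligible_for_isr and the 
50 GiB threshold are all hypothetical names and values used for illustration.

```python
from dataclasses import dataclass

# Hypothetical size threshold; not an existing Kafka configuration.
LARGE_PARTITION_BYTES = 50 * 1024**3  # 50 GiB


@dataclass
class LeaderLogInfo:
    """Hypothetical snapshot of the leader's local log for one partition."""
    log_start_offset: int
    log_end_offset: int
    local_size_bytes: int


def choose_fetch_start(leader: LeaderLogInfo,
                       threshold: int = LARGE_PARTITION_BYTES) -> int:
    """Pick the offset at which a newly assigned replica starts fetching."""
    if leader.local_size_bytes <= threshold:
        # Small log: copy everything so the reassignment finishes quickly.
        return leader.log_start_offset
    # Huge log: skip the historical data and follow new writes only.
    return leader.log_end_offset


def eligible_for_isr(started_from_end: bool, caught_up: bool,
                     ms_since_start: int, retention_ms: int) -> bool:
    """Decide whether the new replica may join the ISR."""
    if not caught_up:
        return False
    if not started_from_end:
        # Full copy from logStartOffset: being caught up is sufficient.
        return True
    # Started from logEndOffset: wait until the replica has been fetching
    # for a full retention period, so it covers all data the leader retains.
    return ms_since_start >= retention_ms
```

For example, with the default threshold a 1 GiB partition would be copied 
from logStartOffset, while a 200 GiB partition would start from logEndOffset 
and join the ISR only after one retention period has elapsed.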

What do you think about this issue?




best,

hudeqi
