[
https://issues.apache.org/jira/browse/KAFKA-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543833#comment-15543833
]
Gwen Shapira commented on KAFKA-3064:
-------------------------------------
I'm wondering if the addition of throttling of replicas in 0.10.1.0 will help
address this issue?
> Improve resync method to waste less time and data transfer
> ----------------------------------------------------------
>
> Key: KAFKA-3064
> URL: https://issues.apache.org/jira/browse/KAFKA-3064
> Project: Kafka
> Issue Type: Improvement
> Components: controller, network
> Affects Versions: 0.8.2.1, 0.9.0.0
> Reporter: Michael Graff
> Assignee: Neha Narkhede
>
> We have several topics which are large (65 GB per partition) with 12
> partitions. Data rates into each topic vary, but in general each one has its
> own rate.
> After a raid rebuild, we are pulling all the data over to the newly rebuild
> raid. This takes forever, and has yet to complete after nearly 8 hours.
> Here are my observations:
> (1) The Kafka broker seems to pull from all topics on all partitions at the
> same time, starting at the oldest message.
> (2) When you divide total disk bandwidth available across all partitions
> (really, only 48 of which have significant amounts of data, about 65 * 12 =
> 780 GB each topic) the ingest rate of 36 out of 48 of them is higher than the
> available bandwidth.
> (3) The effect of (2) is that one topic SLOWLY catches up, while the other 4
> topics continue to retrieve data at 75% of the bandwidth, just to toss it
> away because the source broker has discarded it already.
> (4) Eventually that one topic catches up, and the remaining bandwidth is
> then divided into the remaining 36 partitions, one group of which starts to
> catch up again.
> What I want to see is a way to say “don’t transfer more than X partitions at
> the same time” and ideally a priority rule that says, “Transfer partitions
> you are responsible for first, then transfer ones you are not. Also,
> transfer these first, then those, but no more than 1 topic at a time”
> What I REALLY want is for Kafka to track the new data (track the head of the
> log) and then ask for the tail in chunks. Ideally this would request from
> the source, “what is the next logical older starting point?” and then start
> there. This way, the transfer basically becomes a file transfer of the log
> stored on the source’s disk. Once that block is retrieved, it moves on to the
> next oldest. This way, there is almost zero waste as both the head and tail
> grow, but the tail runs the risk of losing the final chunk only. Thus,
> bandwidth is not significantly wasted.
> All this changes the ISR check to be is “am I caught up on head AND tail?”
> when the tail part is implied right now.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)