[
https://issues.apache.org/jira/browse/KAFKA-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720917#comment-16720917
]
haiyangyu commented on KAFKA-7690:
----------------------------------
[~huxi_2b] When a topic is created or partitions are added, more than one
partition may be assigned to the same broker, and those partitions may all be
placed on a single disk even though the broker has many disks.
I think the strategy for placing a partition on a disk should first consider
how many partitions of the topic being created are already on the target disk,
and only second the total partition count on that disk. Because topics do not
carry equal amounts of data, we should balance per topic first rather than
balance only the total partition count.
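The ordering described above can be sketched as a two-part sort key. This is only an illustration, not the code in the attached patch: the names `pick_disk` and `assign`, and the representation of a disk as a list of topic names (one entry per hosted partition), are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def pick_disk(disks: Dict[str, List[str]], topic: str) -> str:
    """Return the disk that should receive the next partition of `topic`.

    `disks` maps a disk name to the topics of the partitions it hosts,
    one list entry per partition (a hypothetical, simplified model).
    """
    def sort_key(disk: str) -> Tuple[int, int, str]:
        hosted = disks[disk]
        # 1st: partitions of this topic already on the disk,
        # 2nd: total partitions on the disk,
        # 3rd: disk name, only to make ties deterministic.
        return (hosted.count(topic), len(hosted), disk)

    return min(disks, key=sort_key)


def assign(disks: Dict[str, List[str]], topic: str, count: int) -> List[str]:
    """Place `count` partitions of `topic` one at a time, updating `disks`."""
    placed = []
    for _ in range(count):
        disk = pick_disk(disks, topic)
        disks[disk].append(topic)
        placed.append(disk)
    return placed


if __name__ == "__main__":
    disks = {"disk1": ["other"] * 11, "disk2": ["other"] * 9}
    print(assign(disks, "new-topic", 2))
```

Because the per-topic count comes first in the key, a disk that already holds a partition of the topic is skipped as long as another disk holds fewer, even if that other disk has more partitions in total.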
> Change disk allocation policy for multiple partitions on a broker when topic
> is created
> ---------------------------------------------------------------------------------------
>
> Key: KAFKA-7690
> URL: https://issues.apache.org/jira/browse/KAFKA-7690
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.10.2.0, 1.0.0, 2.0.0
> Reporter: haiyangyu
> Priority: Major
> Attachments: disk_assignent_strategy.patch
>
>
> h3. *Background*
> If the target topic has more partitions than there are brokers when a topic
> is created or partitions are added, at least one broker is assigned more than
> one partition. If the disks on that broker are currently unbalanced, for
> example one disk holds one partition and another holds four because of topic
> deletion or other causes, the new partitions are all placed on a single
> disk. If the target topic carries heavy traffic, this can easily saturate
> that disk's I/O.
> h3. *Improvement strategy*
> When multiple partitions are going to be allocated on a broker, the strategy
> is as follows:
> 1. Calculate the target topic's partition count and the total partition
> count on each disk.
> 2. Sort the disks by target topic partition count in ascending order; if the
> target topic partition counts are equal, sort by the total partition count
> on each disk.
> h3. *Example*
>
> ||disk||target topic partition count||total partition count||
> |disk1|0|11|
> |disk2|0|9|
> When two partitions are assigned to this broker under the original strategy,
> the result is as follows:
> ||disk||target topic partition count||total partition count||
> |disk1|0|11|
> |disk2|2|11|
> Under the new strategy, the result is as follows:
> ||disk||target topic partition count||total partition count||
> |disk1|1|12|
> |disk2|1|10|
> If the topic carries heavy traffic, such as 50 MB/s per partition, the
> original strategy can easily saturate disk2's I/O. With the new strategy,
> the disk I/O load is well balanced.
> h3. *Summary*
> This strategy helps when building a large cluster and creating topics with a
> very large number of partitions: disk I/O will be more balanced.
>
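As a rough check of the example in the quoted description, the sketch below replays the two-disk case under both orderings and converts the result into per-disk write load at 50 MB/s per partition. The "original" strategy is modelled here simply as "fewest total partitions wins", which is what the example tables imply; that model and all names in the code (`place`, `original_key`, `proposed_key`, `load`) are assumptions for illustration.

```python
from typing import Callable, Dict, List

RATE_MB_S = 50  # per-partition write rate taken from the example


def start() -> Dict[str, List[int]]:
    # disk -> [target topic partition count, total partition count]
    return {"disk1": [0, 11], "disk2": [0, 9]}


def original_key(counts: List[int]):
    # Assumed model of the original behaviour: total partition count only.
    return counts[1]


def proposed_key(counts: List[int]):
    # Proposed ordering: target topic count first, then total count.
    return (counts[0], counts[1])


def place(disks: Dict[str, List[int]], n: int, key: Callable) -> Dict[str, List[int]]:
    """Assign n partitions of the target topic one at a time."""
    for _ in range(n):
        name = min(disks, key=lambda d: key(disks[d]))
        disks[name][0] += 1
        disks[name][1] += 1
    return disks


def load(disks: Dict[str, List[int]]) -> Dict[str, int]:
    """Write load in MB/s contributed by the target topic on each disk."""
    return {name: counts[0] * RATE_MB_S for name, counts in disks.items()}


original = place(start(), 2, original_key)
proposed = place(start(), 2, proposed_key)

if __name__ == "__main__":
    print("original:", original, load(original))
    print("proposed:", proposed, load(proposed))
```

Under these assumptions the original ordering puts both new partitions on disk2, so that disk takes the whole 100 MB/s, while the proposed ordering gives each disk one partition and 50 MB/s.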
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)