[ 
https://issues.apache.org/jira/browse/KAFKA-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720917#comment-16720917
 ] 

haiyangyu commented on KAFKA-7690:
----------------------------------

[~huxi_2b] when creating a topic or adding partitions, more than one partition 
may be allocated to a single broker. Those partitions may all be placed on a 
single disk even though the broker has many disks.

I think the strategy for allocating a partition to a disk should first consider 
the number of partitions of the topic being created that are already on the 
target disk, and only second the total partition count across all topics. 
Because topics do not carry equal amounts of data, we should balance each topic 
across disks first instead of balancing only the total partition count.
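
A minimal sketch of the two orderings, to make the comparison concrete. This 
is not Kafka's code and not the attached patch: the Disk class and the 
functions pick_by_total, pick_by_topic_then_total and assign are hypothetical 
names, and it assumes the existing behaviour is "pick the disk with the fewest 
total partitions", as the issue description states. The starting counts are 
taken from the example in the issue description.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    topic_partitions: int  # partitions of the topic being created on this disk
    total_partitions: int  # partitions of all topics on this disk


def pick_by_total(disks):
    """Assumed existing behaviour: the disk with the fewest total partitions wins."""
    return min(disks, key=lambda d: d.total_partitions)


def pick_by_topic_then_total(disks):
    """Proposed behaviour: fewest partitions of this topic, ties broken by total."""
    return min(disks, key=lambda d: (d.topic_partitions, d.total_partitions))


def assign(disks, new_partitions, pick):
    """Place partitions one at a time, updating both counts after each placement."""
    for _ in range(new_partitions):
        disk = pick(disks)
        disk.topic_partitions += 1
        disk.total_partitions += 1
    return [(d.name, d.topic_partitions, d.total_partitions) for d in disks]


def example_disks():
    # Starting state from the issue description: disk1 holds 11 partitions, disk2 holds 9.
    return [Disk("disk1", 0, 11), Disk("disk2", 0, 9)]


original = assign(example_disks(), 2, pick_by_total)
proposed = assign(example_disks(), 2, pick_by_topic_then_total)
print(original)  # [('disk1', 0, 11), ('disk2', 2, 11)]
print(proposed)  # [('disk1', 1, 12), ('disk2', 1, 10)]
```

Sorting on the tuple (topic count, total count) keeps the total count as a 
tie-breaker only, so a hot topic is spread across disks before overall 
balance is considered.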

> Change disk allocation policy for multiple partitions on a broker when topic 
> is created
> ---------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7690
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7690
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.10.2.0, 1.0.0, 2.0.0
>            Reporter: haiyangyu
>            Priority: Major
>         Attachments: disk_assignent_strategy.patch
>
>
> h3. *Background*
> If the target topic has more partitions than there are brokers when a topic 
> is created or partitions are added, one broker will be assigned more than one 
> partition. If the disks are not currently balanced, for example one disk has 
> one partition and another has four because of topic deletion or other 
> causes, the new partitions will all be allocated to a single disk, and if the 
> target topic has heavy traffic, it can easily saturate that disk's I/O.
> h3. *Improvement strategy*
> When multiple partitions are going to be allocated on a broker, the strategy 
> is as follows:
> 1. Calculate the target topic partition count and the total partition count 
> on each disk.
> 2. Sort the disks by target topic partition count in ascending order; if the 
> target topic partition counts are equal, sort by the total partition count 
> on each disk.
> h3. *Example*
>  
> ||disk||target topic partition count||total partition count||
> |disk1|0|11|
> |disk2|0|9|
> When two partitions are assigned to this broker, the original strategy gives 
> the following result:
> ||disk||target topic partition count||total partition count||
> |disk1|0|11|
> |disk2|2|11|
> The new strategy gives the following result:
> ||disk||target topic partition count||total partition count||
> |disk1|1|12|
> |disk2|1|10|
> If the topic has heavy traffic, such as 50MB/s per partition, the original 
> strategy can easily saturate disk2's I/O. With the new strategy, the disk 
> I/O load is well balanced.
> h3. *Summary*
> This strategy helps when building a large cluster and creating a topic with a 
> large number of partitions, because disk I/O will be more balanced.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
