[ 
https://issues.apache.org/jira/browse/FLINK-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794775#comment-16794775
 ] 

Congxian Qiu(klion26) commented on FLINK-11948:
-----------------------------------------------

As the docs say:
{noformat}
To avoid such an unbalanced partitioning, use a round-robin kafka partitioner 
(note that this will cause a lot of network connections between all the Flink 
instances and all the Kafka brokers).{noformat}
Maybe we could have a round-robin partitioner here?
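
A rough sketch of what I have in mind, not tested against Flink. The class name RoundRobinPartitionerSketch is hypothetical, and the class is standalone so it can be run on its own; in the connector this logic would presumably live in a subclass of FlinkKafkaPartitioner<T>, whose open(...) and partition(...) signatures are mirrored here. I am assuming each sink subtask owns its own partitioner instance, so a plain int counter is enough.

```java
// Hypothetical standalone sketch, not part of Flink. It mirrors the open(...) and
// partition(...) signatures of FlinkKafkaPartitioner<T> but does not extend it.
class RoundRobinPartitionerSketch<T> {

    // Position of the next partition to write to. Assumption: one partitioner
    // instance per sink subtask, so no synchronization is needed.
    private int next;

    // Start every subtask at a different offset so that the subtasks do not all
    // write to the same partition first.
    void open(int parallelInstanceId, int parallelInstances) {
        if (parallelInstanceId < 0 || parallelInstances <= 0) {
            throw new IllegalArgumentException("Invalid subtask id or parallelism.");
        }
        this.next = parallelInstanceId;
    }

    // Cycle through all partitions of the target topic, one record at a time.
    int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
        if (partitions == null || partitions.length == 0) {
            throw new IllegalArgumentException("Partitions of the target topic is empty.");
        }
        int chosen = partitions[next % partitions.length];
        next = (next + 1) % partitions.length;
        return chosen;
    }
}
```

With this, every subtask eventually writes to every partition, which is also why the docs warn about the number of connections between Flink instances and Kafka brokers.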

> When kafka sink parallelism<kafka partition num,kafka data distribution 
> unbalance
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-11948
>                 URL: https://issues.apache.org/jira/browse/FLINK-11948
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.6.3, 1.6.4, 1.7.2
>            Reporter: qi quan
>            Priority: Major
>
> The default FlinkFixedPartitioner picks the target partition as 
> partitions[subtaskId % partitions.length]. When the Kafka sink parallelism is 
> lower than the number of Kafka partitions, only the first few Kafka partitions 
> receive data.
> I think this needs to be improved.
> {code:java}
>       @Override
>       public int partition(T record, byte[] key, byte[] value,
>                       String targetTopic, int[] partitions) {
>               Preconditions.checkArgument(
>                       partitions != null && partitions.length > 0,
>                       "Partitions of the target topic is empty.");
>               return partitions[parallelInstanceId % partitions.length];
>       }
> {code}
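
To make the effect of the quoted method concrete, here is a small standalone sketch that counts how many sink subtasks write to each partition under the same selection rule. The class and method names are hypothetical, and the numbers (2 subtasks, 6 partitions) are just an example.

```java
// Hypothetical standalone demo, not part of Flink.
class FixedPartitionerImbalanceDemo {

    // Same selection rule as the quoted partition(...) method, expressed as a
    // position in the partitions array.
    static int fixedPosition(int parallelInstanceId, int partitionCount) {
        return parallelInstanceId % partitionCount;
    }

    // For each position in the partitions array, count the sink subtasks that
    // write to it.
    static int[] writersPerPartition(int sinkParallelism, int partitionCount) {
        int[] writers = new int[partitionCount];
        for (int subtask = 0; subtask < sinkParallelism; subtask++) {
            writers[fixedPosition(subtask, partitionCount)]++;
        }
        return writers;
    }

    public static void main(String[] args) {
        // 2 sink subtasks, 6 Kafka partitions: only the first two get a writer.
        System.out.println(java.util.Arrays.toString(writersPerPartition(2, 6)));
        // prints [1, 1, 0, 0, 0, 0]
    }
}
```

Partitions with a count of 0 never receive data, no matter how many records the sink emits.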



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
