[ https://issues.apache.org/jira/browse/FLINK-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794767#comment-16794767 ]
Yun Tang commented on FLINK-11948:
----------------------------------

I think this is what {{FlinkFixedPartitioner}} is designed to do. You can find the description in its javadoc:
{noformat}
Not all Kafka partitions contain data
To avoid such an unbalanced partitioning, use a round-robin kafka partitioner (note that this will cause a lot of network connections between all the Flink instances and all the Kafka brokers).
{noformat}
This is certainly not a bug but a trade-off.

> When kafka sink parallelism<kafka partition num,kafka data distribution unbalance
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-11948
>                 URL: https://issues.apache.org/jira/browse/FLINK-11948
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.6.3, 1.6.4, 1.7.2
>            Reporter: qi quan
>            Priority: Major
>
> The default FlinkFixedPartitioner picks the target partition from int[] partitions by subtaskId % partitions.length. When the Kafka sink parallelism is lower than the number of Kafka partitions, only the first few Kafka partitions will have data written to them.
> I think it needs to be improved here.
> {code:java}
> @Override
> public int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
>     Preconditions.checkArgument(
>         partitions != null && partitions.length > 0,
>         "Partitions of the target topic is empty.");
>     return partitions[parallelInstanceId % partitions.length];
> }
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
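
For illustration, the behaviour reported in the issue can be reproduced with a small standalone sketch. It only copies the modulo arithmetic from the quoted partition() method; the class name FixedPartitionerSketch and the helper targetPartitions are hypothetical and are not part of Flink.

```java
import java.util.Arrays;

public class FixedPartitionerSketch {

    // Same arithmetic as the partition() method quoted in the issue:
    // a given sink subtask always writes to one fixed partition.
    public static int fixedPartition(int parallelInstanceId, int[] partitions) {
        if (partitions == null || partitions.length == 0) {
            throw new IllegalArgumentException("Partitions of the target topic is empty.");
        }
        return partitions[parallelInstanceId % partitions.length];
    }

    // Hypothetical helper: the partition chosen by each of the sink subtasks
    // 0 .. parallelism-1.
    public static int[] targetPartitions(int parallelism, int[] partitions) {
        int[] targets = new int[parallelism];
        for (int subtask = 0; subtask < parallelism; subtask++) {
            targets[subtask] = fixedPartition(subtask, partitions);
        }
        return targets;
    }

    public static void main(String[] args) {
        int[] partitions = {0, 1, 2, 3, 4, 5};
        // Sink parallelism 2 against 6 partitions: partitions 2..5 never receive data.
        System.out.println(Arrays.toString(targetPartitions(2, partitions))); // [0, 1]
        // Sink parallelism 6 against 6 partitions: every partition has one writer.
        System.out.println(Arrays.toString(targetPartitions(6, partitions))); // [0, 1, 2, 3, 4, 5]
    }
}
```

With a sink parallelism of 2 and 6 partitions, only partitions 0 and 1 are ever selected, which matches the imbalance described in the report and the trade-off documented in the javadoc.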