[ 
https://issues.apache.org/jira/browse/KAFKA-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847521#comment-13847521
 ] 

Dragos Dena edited comment on KAFKA-1183 at 12/13/13 3:38 PM:
--------------------------------------------------------------

Inspecting the {{DefaultEventHandler}} code, I see two cases in which this 
cache is invalidated:
1. Every {{topic.metadata.refresh.interval.ms}} milliseconds. This defaults 
to 10 minutes, which I think is far too infrequent for refreshing this cache 
(see the config example after this list).
2. When {{dispatchSerializedData}} fails to send all messages. I have never 
observed this actually happening.
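
For reference, that interval is an ordinary producer config and can be tuned 
per producer. A minimal sketch, assuming the 0.8 Scala producer API (the 
60000 value below is only illustrative; the default is 600000, i.e. 10 
minutes):

{code}
import java.util.Properties
import kafka.producer.ProducerConfig

val props = new Properties()
props.put("metadata.broker.list", "localhost:9092") // illustrative broker
// Default is 600000 (10 minutes); lowering it makes the producer refresh
// metadata -- and so drop sendPartitionPerTopicCache -- more often.
props.put("topic.metadata.refresh.interval.ms", "60000")
val config = new ProducerConfig(props)
{code}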

I suppose the cache should also be invalidated at the start of the {{handle}} 
method.
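
To make the effect concrete, here is a minimal standalone sketch, assuming a 
simplified model of the cache (the {{StickyPartitionSketch}} object and its 
members are made up for illustration, not the actual producer code):

{code}
import scala.collection.mutable
import scala.util.Random

object StickyPartitionSketch {
  // Simplified stand-in for sendPartitionPerTopicCache: once a random
  // partition is cached for a topic, every later keyless send reuses it
  // until the cache is cleared.
  private val sendPartitionPerTopicCache = mutable.Map.empty[String, Int]

  def getPartition(topic: String, numPartitions: Int): Int =
    sendPartitionPerTopicCache.getOrElseUpdate(topic, Random.nextInt(numPartitions))

  def main(args: Array[String]): Unit = {
    // Current behaviour: five consecutive batches all pick the same partition.
    val sticky = (1 to 5).map(_ => getPartition("my-topic", 8))
    println(s"without invalidation: ${sticky.mkString(", ")}")

    // Proposed behaviour: clearing the cache at the start of each batch
    // (as handle() would) lets every batch pick afresh.
    val spread = (1 to 5).map { _ =>
      sendPartitionPerTopicCache.clear()
      getPartition("my-topic", 8)
    }
    println(s"with per-batch invalidation: ${spread.mkString(", ")}")
  }
}
{code}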



> DefaultEventHandler causes unbalanced distribution of messages across 
> partitions
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-1183
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1183
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.8.0
>            Reporter: Dragos Dena
>            Assignee: Jun Rao
>             Fix For: 0.8.1, 0.9.0
>
>         Attachments: KAFKA-1183-trunk.patch
>
>
> KAFKA-959 introduced an optimisation in {{DefaultEventHandler}} that was 
> intended to send all messages from the same batch to a single partition 
> when no key is specified.
> The problem is that the {{sendPartitionPerTopicCache}} cache, which holds 
> the currently selected partition for each topic, isn't actually invalidated 
> at the start or end of each batch.
> The observed result is that, after the first request picks a random 
> partition, all subsequent messages from that producer land in the same 
> partition. With a large number of producers this averages out, but when 
> the producer count is comparable to the partition count, the distribution 
> becomes noticeably unbalanced.


