[ 
https://issues.apache.org/jira/browse/KAFKA-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726958#comment-13726958
 ] 

Joel Koshy commented on KAFKA-991:
----------------------------------

+1

Committed to 0.8

Minor comments:
- The name "queue size" is unintuitive: it sounds like a number of messages, 
but it is a size in bytes.
- The totalSize > queueSize check should ideally be done before adding the 
message to msgList.
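
To illustrate the second point, here is a minimal sketch of a writer that checks the size limit before appending to the list. It is not the actual KafkaRecordWriter: the class BatchingWriterSketch, the field maxBatchBytes, and the sentBatchSizes bookkeeping are hypothetical names used only for this example, and the real send call is replaced by recording the batch size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the record writer's batching logic (not the real KafkaRecordWriter).
class BatchingWriterSketch {
    private final int maxBatchBytes;                      // the "queue size", expressed in bytes
    private final List<byte[]> msgList = new ArrayList<>();
    private final List<Integer> sentBatchSizes = new ArrayList<>(); // bytes in each flushed batch
    private int totalSize = 0;

    BatchingWriterSketch(int maxBatchBytes) {
        this.maxBatchBytes = maxBatchBytes;
    }

    void write(byte[] value) {
        // Check BEFORE adding: if this message would push the pending batch past the
        // byte limit, or the batch already holds Short.MAX_VALUE messages, send first.
        boolean wouldOverflow = totalSize + value.length > maxBatchBytes;
        boolean tooManyMessages = msgList.size() >= Short.MAX_VALUE;
        if (!msgList.isEmpty() && (wouldOverflow || tooManyMessages)) {
            flush();
        }
        msgList.add(value);
        totalSize += value.length;
    }

    void flush() {
        if (msgList.isEmpty()) {
            return;
        }
        // Assumption: a real writer would hand msgList to the producer here.
        sentBatchSizes.add(totalSize);
        msgList.clear();
        totalSize = 0;
    }

    List<Integer> sentBatchSizes() {
        return sentBatchSizes;
    }
}
```

With the check done first, a flushed batch stays within the limit unless a single message is itself larger than the limit, in which case that message is sent alone.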
                
> Reduce the queue size in hadoop producer
> ----------------------------------------
>
>                 Key: KAFKA-991
>                 URL: https://issues.apache.org/jira/browse/KAFKA-991
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Swapnil Ghike
>            Assignee: Swapnil Ghike
>              Labels: bugs
>             Fix For: 0.8
>
>         Attachments: kafka-991-v1.patch
>
>
> Currently the queue.size in hadoop producer is 10MB. This means that the 
> KafkaRecordWriter will hit the send button on kafka producer after the size 
> of uncompressed queued messages becomes greater than 10MB. (The other 
> condition on which the messages are sent is if their number exceeds 
> Short.MAX_VALUE).
> Considering that the server accepts a (compressed) batch of messages of 
> size up to 1 million bytes minus the log overhead, we should probably reduce 
> the queue size in hadoop producer. We should do two things:
> 1. change max message size on the broker to 1 million + log overhead, because 
> that will make the client message size easy to remember. Right now the 
> maximum number of bytes that can be accepted from a client in a batch of 
> messages is an awkward 999988. (I don't have a stronger reason). We have set 
> fetch size on the consumer to 1MB, which gives us a lot of room even if the 
> log overhead increases in future versions.
> 2. Set the default number of bytes on hadoop producer to 1 million bytes. 
> Anyone who wants higher throughput can override this config using 
> kafka.output.queue.size
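
The size arithmetic in the description can be checked with a short sketch. The 12-byte log overhead is an assumption inferred from the figures quoted above (1,000,000 minus 999,988), and "1MB" for the consumer fetch size is assumed to mean 1024 * 1024 bytes; the class and method names are hypothetical.

```java
// Hypothetical helper that reproduces the size arithmetic from the issue description.
class MessageSizeSketch {
    // Assumption: per-message log overhead, inferred from 1_000_000 - 999_988.
    static final int LOG_OVERHEAD_BYTES = 12;

    // Assumption: the consumer fetch size of "1MB" means 1024 * 1024 bytes.
    static final int CONSUMER_FETCH_BYTES = 1024 * 1024;

    // Largest batch a client can send for a given broker-side max message size.
    static int maxClientBatchBytes(int brokerMaxMessageBytes) {
        return brokerMaxMessageBytes - LOG_OVERHEAD_BYTES;
    }

    // Room left in one consumer fetch after the largest message the broker accepts.
    static int consumerHeadroomBytes(int brokerMaxMessageBytes) {
        return CONSUMER_FETCH_BYTES - brokerMaxMessageBytes;
    }
}
```

Under these assumptions, a broker limit of 1,000,000 bytes leaves the client 999,988 bytes, while raising the limit to 1,000,012 bytes gives the client a round 1,000,000 bytes and still fits within one consumer fetch.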

