[ https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063452#comment-16063452 ]
Chen He commented on KAFKA-3554:
--------------------------------

Thank you for the quick reply, [~becket_qin]. This work is really valuable. It gives us a tool that can exploit a Kafka system's capacity. For example, we can get the lowest latency by using only 1 thread, and by increasing the number of threads we can find the maximum throughput of a Kafka cluster.

Only one question: I applied this patch to the latest Kafka and compared the results with the old ProducerPerformance.java file. I found that with ack=all and snappy compression, with 100M records (100B each), it does not perform as well as the old ProducerPerformance.java file.

> Generate actual data with specific compression ratio and add multi-thread
> support in the ProducerPerformance tool.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-3554
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3554
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.9.0.1
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>             Fix For: 0.11.1.0
>
>
> Currently ProducerPerformance always generates the payload with the same
> bytes. This does not work well for testing compressed data, because the
> payload is extremely compressible no matter how big it is.
> We can make some changes to make it more useful for compressed messages.
> Currently I am generating the payload from integers drawn from a given range.
> By adjusting the range of the integers, we can get different compression
> ratios.
> API-wise, we can either let the user specify the integer range or the expected
> compression ratio (we will do some probing to find the corresponding range for
> the user).
> Besides that, in many cases it is useful to have multiple producer threads
> when the producer threads themselves are the bottleneck. Admittedly, people can
> run multiple ProducerPerformance instances to achieve a similar result, but it
> is still different from the real case where people actually use the producer.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
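
The payload idea in the issue description (filling the record with integers drawn from an adjustable range so that the range controls compressibility) can be illustrated with a minimal sketch. This is not the code from the patch: the class name PayloadSketch and both method names are hypothetical, and java.util.zip.Deflater stands in for the producer's compression codec only so the example runs on the JDK alone. The ratio printed is compressed size divided by original size, so a narrower range should give a smaller number.

```java
import java.nio.ByteBuffer;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical illustration class; not part of Kafka or of the KAFKA-3554 patch.
public class PayloadSketch {

    // Fill a payload of sizeBytes with 4-byte integers drawn uniformly from [0, range).
    // Any trailing bytes that do not fit a whole integer stay zero.
    public static byte[] generatePayload(int sizeBytes, int range, Random random) {
        ByteBuffer buffer = ByteBuffer.allocate(sizeBytes);
        while (buffer.remaining() >= Integer.BYTES) {
            buffer.putInt(random.nextInt(range));
        }
        return buffer.array();
    }

    // Compressed size divided by original size, using Deflater as a stand-in codec.
    public static double compressionRatio(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] chunk = new byte[4096];
        long compressedBytes = 0;
        while (!deflater.finished()) {
            compressedBytes += deflater.deflate(chunk);
        }
        deflater.end();
        return (double) compressedBytes / data.length;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        // Wider ranges carry more entropy per integer, so they compress less.
        for (int range : new int[] {2, 256, 65536, Integer.MAX_VALUE}) {
            byte[] payload = generatePayload(100_000, range, random);
            System.out.printf("range=%d ratio=%.3f%n", range, compressionRatio(payload));
        }
    }
}
```

The "probing" mentioned in the description could then be a search over the range value until the measured ratio is close to the one the user asked for; that step is left out here because the description does not say how the patch does it.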