[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.

2019-02-17 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770599#comment-16770599
 ] 

Matthias J. Sax commented on KAFKA-3554:


Moving all major/minor/trivial tickets that are not merged yet out of the 2.2 
release.

> Generate actual data with specific compression ratio and add multi-thread 
> support in the ProducerPerformance tool.
> --
>
> Key: KAFKA-3554
> URL: https://issues.apache.org/jira/browse/KAFKA-3554
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.1
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Priority: Major
> Fix For: 2.2.0
>
>
> Currently, ProducerPerformance always generates the payload with the same 
> bytes. This does not work well for testing compressed data, because the 
> payload is extremely compressible no matter how big it is.
> We can make some changes to make the tool more useful for compressed messages. 
> Currently I am generating a payload containing integers from a given range. 
> By adjusting the range of the integers, we can get different compression 
> ratios. 
> API-wise, we can either let the user specify the integer range or the expected 
> compression ratio (we will do some probing to find the corresponding range for 
> the user).
> Besides that, in many cases it is useful to have multiple producer threads 
> when the producer threads themselves are the bottleneck. Admittedly, people can 
> run multiple ProducerPerformance instances to achieve a similar result, but that 
> is still different from the real case when people actually use the producer.
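
The range-based payload idea in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class and method names (PayloadSketch, generatePayload, compressionRatio, probeRange) are hypothetical, java.util.zip.Deflater stands in for whichever codec the producer is configured with, and the probing step is a plain binary search that assumes the ratio grows roughly monotonically with the range.

```java
import java.nio.ByteBuffer;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical sketch; none of these names come from the KAFKA-3554 patch.
class PayloadSketch {

    // Fill a payload with 4-byte integers drawn uniformly from [0, range).
    // A small range repeats values often (very compressible); a large range
    // approaches random bytes (barely compressible).
    static byte[] generatePayload(int sizeBytes, int range, Random random) {
        ByteBuffer buffer = ByteBuffer.allocate(sizeBytes);
        while (buffer.remaining() >= Integer.BYTES) {
            buffer.putInt(random.nextInt(range));
        }
        // Any trailing bytes (size not a multiple of 4) are left as zeros.
        return buffer.array();
    }

    // Compressed size divided by original size, measured with Deflater.
    // Assumption: Deflater is only a stand-in for the real producer codec.
    static double compressionRatio(byte[] payload) {
        Deflater deflater = new Deflater();
        deflater.setInput(payload);
        deflater.finish();
        byte[] out = new byte[payload.length * 2 + 64];
        long compressed = 0;
        while (!deflater.finished()) {
            compressed += deflater.deflate(out);
        }
        deflater.end();
        return (double) compressed / payload.length;
    }

    // Probe for the smallest range whose sample payload reaches the target
    // ratio. The same seed is reused for every probe to reduce noise.
    static int probeRange(double targetRatio, int sampleBytes, long seed) {
        int low = 1;
        int high = Integer.MAX_VALUE;
        while (low < high) {
            int mid = low + (high - low) / 2;
            byte[] sample = generatePayload(sampleBytes, mid, new Random(seed));
            if (compressionRatio(sample) < targetRatio) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    public static void main(String[] args) {
        int range = probeRange(0.5, 64 * 1024, 42L);
        byte[] payload = generatePayload(1024, range, new Random());
        System.out.println("range=" + range
                + " ratio=" + compressionRatio(payload));
    }
}
```

Because the ratio depends on the codec and on the payload size, a probe like this would have to be run against the same compression type and record size that the test itself uses.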



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.

2017-11-17 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257400#comment-16257400
 ] 

Jiangjie Qin commented on KAFKA-3554:

[~airbots] Thanks for volunteering to help. The patch needs a rebase, again. I 
guess the reviewers are currently busy. [~ijuma] do you have time to look at 
this patch if I do a rebase? I am not sure whether we need a KIP for this, though. 
Sometimes we submit KIPs for tooling, but sometimes we don't. I am neutral on 
this one. Let me know if you prefer a KIP.

> Generate actual data with specific compression ratio and add multi-thread 
> support in the ProducerPerformance tool.
> --
>
> Key: KAFKA-3554
> URL: https://issues.apache.org/jira/browse/KAFKA-3554
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.1
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 1.1.0
>





[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.

2017-11-08 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245266#comment-16245266
 ] 

Chen He commented on KAFKA-3554:


This is a great feature, [~becket_qin]. What can I do to help get it 
checked in to Kafka? I am happy to contribute my time and effort. 

> Generate actual data with specific compression ratio and add multi-thread 
> support in the ProducerPerformance tool.
> --
>
> Key: KAFKA-3554
> URL: https://issues.apache.org/jira/browse/KAFKA-3554
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.1
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 1.1.0
>





[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.

2017-06-26 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063452#comment-16063452
 ] 

Chen He commented on KAFKA-3554:


Thank you for the quick reply, [~becket_qin]. This work is really valuable. It 
provides us with a tool that can exploit the capacity of a Kafka system. For 
example, we can get the lowest latency by using only 1 thread; at the same time, 
by increasing the thread count, we can find the maximum throughput of a Kafka 
cluster. 

Just one question: I applied this patch to the latest Kafka and compared the 
results with the old ProducerPerformance.java file. I found that if we set 
acks=all with snappy compression, with 100M records (100 B each), it does not 
perform as well as the old ProducerPerformance.java file. 
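
The thread-count trade-off mentioned above (one thread for the lowest latency, more threads to find the maximum throughput) could be measured with a harness along these lines. It is a minimal sketch, not the patch itself: MultiThreadSketch and RecordSender are hypothetical names, and RecordSender is a stand-in for the producer's send call so that the example stays self-contained and does not depend on the Kafka client library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not taken from the KAFKA-3554 patch.
class MultiThreadSketch {

    // Stand-in for a producer's send call; this is not a Kafka API.
    interface RecordSender {
        void send(byte[] payload);
    }

    // Splits totalRecords across numThreads workers, waits for all of them
    // to finish, and returns the observed rate in records per second.
    static double run(RecordSender sender, int numThreads,
                      long totalRecords, byte[] payload) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        AtomicLong sent = new AtomicLong();
        long perThread = totalRecords / numThreads;
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int t = 0; t < numThreads; t++) {
                // The last worker also sends the remainder of the division.
                long quota = (t == numThreads - 1)
                        ? totalRecords - perThread * (numThreads - 1)
                        : perThread;
                futures.add(pool.submit(() -> {
                    for (long i = 0; i < quota; i++) {
                        sender.send(payload);
                        sent.incrementAndGet();
                    }
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // rethrows any failure from a worker
            }
        } finally {
            pool.shutdown();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return sent.get() / seconds;
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[100];
        RecordSender noop = p -> { }; // replace with a real send in practice
        for (int threads : new int[] {1, 2, 4}) {
            double rate = run(noop, threads, 1_000_000L, payload);
            System.out.println(threads + " thread(s): " + rate + " records/s");
        }
    }
}
```

Running the same record count at several thread counts, as in main, is one way to see where throughput stops improving. Whether the workers share one producer instance or each own one is a separate design choice that this sketch leaves open.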

> Generate actual data with specific compression ratio and add multi-thread 
> support in the ProducerPerformance tool.
> --
>
> Key: KAFKA-3554
> URL: https://issues.apache.org/jira/browse/KAFKA-3554
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.1
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 0.11.1.0
>


