Hi Erik,

Thanks for these inputs. I'll implement it.

Regards,
Prabhjot

On Aug 25, 2015 11:53 PM, "Helleren, Erik" <erik.helle...@cmegroup.com> wrote:
> Prabhjot,
> You can’t do it with the producer perf test, but it’s relatively simple to
> implement. The message body includes a timestamp of when your producer
> produces, and the consumer looks at the difference between the
> timestamp in the body and the current timestamp.
>
> Or, if you were looking for ack latency, you can use the producer’s async
> callback to measure latency.
> -Erik
>
> From: Prabhjot Bharaj <prabhbha...@gmail.com>
> Date: Tuesday, August 25, 2015 at 9:22 AM
> To: Erik Helleren <erik.helle...@cmegroup.com>
> Cc: "users@kafka.apache.org" <users@kafka.apache.org>, "d...@kafka.apache.org" <d...@kafka.apache.org>
> Subject: Re: kafka producer-perf-test.sh compression-codec not working
>
> Hi Erik,
>
> Thanks for your inputs.
>
> How can we measure round trip latency using kafka-producer-perf-test.sh ?
>
> or any other tool ?
>
> Regards,
> Prabhjot
>
> On Tue, Aug 25, 2015 at 7:41 PM, Helleren, Erik <erik.helle...@cmegroup.com> wrote:
> Prabhjot,
> When no compression is being used, it should have only a tiny impact on
> performance. But when it is enabled it will make it as though the message
> payload is small and nearly constant, regardless of how large the
> configured message size is.
>
> I think that the answer is that there is room for improvement in the perf
> test, especially where compression is concerned. If you do implement an
> improvement, a patch might be helpful to the community. But something to
> consider is that throughput alone isn’t the only important performance
> measure. Round trip latency is also important.
> Thanks,
> -Erik
>
>
> From: Prabhjot Bharaj <prabhbha...@gmail.com>
> Date: Tuesday, August 25, 2015 at 8:41 AM
> To: Erik Helleren <erik.helle...@cmegroup.com>
> Cc: "users@kafka.apache.org" <users@kafka.apache.org>, "d...@kafka.apache.org" <d...@kafka.apache.org>
> Subject: Re: kafka producer-perf-test.sh compression-codec not working
>
> Hi Erik,
>
> I have put my efforts on the produce side till now. Thanks for making me
> aware that the consumer will decompress automatically.
>
> I'll also consider your point on creating real-life messages.
>
> But I still have one confusion:
>
> Why would the current ProducerPerformance.scala compress an Array of Bytes
> with all zeros ?
> That will anyway give better throughput, correct ?
>
> Regards,
> Prabhjot
>
> On Tue, Aug 25, 2015 at 7:05 PM, Helleren, Erik <erik.helle...@cmegroup.com> wrote:
> Hi Prabhjot,
> There are two important things to know about kafka compression: First,
> decompression happens automatically in the consumer
> (https://cwiki.apache.org/confluence/display/KAFKA/Compression) so you
> should see ascii returned on the consumer side. The best way to see if
> compression has happened that I know of is to actually look at a packet
> capture.
>
> Second, the producer does not compress individual messages, but actually
> batches several sequential messages to the same topic and partition
> together and compresses that compound message.
> (https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-Compression)
> Thus, a fixed string will still see far better compression ratios than a
> ‘typical’ real life message.
>
> Making a real-life-like message isn’t easy, and depends heavily on your
> domain. But a general approach would be to generate messages from randomly
> selected words from a dictionary. And having a dictionary of around a thousand
> large words means there is a reasonable chance of the same words appearing
> multiple times in the same message. Also, words can be nonsense like
> “asdfasdfasdfasdf”, or large words in the language of your choice. The
> goal is for each message to be unique, but still have similar chunks that
> a compression algorithm can detect and compress.
>
> -Erik
>
>
> On 8/25/15, 6:47 AM, "Prabhjot Bharaj" <prabhbha...@gmail.com> wrote:
>
> >Hi,
> >
> >I have been trying to use kafka-producer-perf-test.sh to arrive at certain
> >benchmarks.
> >When I try to run it with --compression-codec values of 1, 2 and 3, I
> >notice increased throughput compared to NoCompressionCodec
> >
> >But, when I checked ProducerPerformance.scala, I saw that
> >`producer.send` is getting data from the method: `generateProducerData`.
> >But, this data is just an empty array of Bytes.
> >
> >Now, as per my basic understanding of compression algorithms, I think a
> >byte sequence of zeros will eventually result in a very small message,
> >because of which I thought I might be observing better throughput.
> >
> >So, in line: 247 of ProducerPerformance.scala, I did this minor code
> >change:-
> >
> >
> >
> >*val message =
> >"qopwr11591UPD113582260001AS1IL1-1N/A1Entertainment1-1an-example.com1-1-1-
> >1-1-1-1-1011413/011413_factor_points_FNC_,LOW,MED_LOW,MED,HIGH,HD,.mp4.csm
> >il/bitrate=11subcategory
> >71Title
> >10^D1-1-111-1-1-1-1-1-111-1-1-1-1-115101-1-1-1-1126112491-1-1-1-1-1-1-1-1-
> >1-1-1-1-1-1-111-1-1-r1VR-11591UPD113582260001AS1IL1-1N/A1Entertainment1-1a
> >n-example.com1-1-1-1-1-1-1-1011413/011413_factor_points_FNC_,LOW,MED_LOW,M
> >ED,HIGH,HD,.mp4.csmil/bitrate=11subcategory
> >71Title
> >10^D1-1-111-1-1-1-1-1-111-1-1-1-1-115101-1-1-1-1126112491-1-1-1-1-1-1-1-1-
> >1-1-1-1-1-1-111-1-1-r1VR-11591UPD113582260001AS1IL1-1N/A1Entertainment1-1a
> >n-example.com1-1-1-1-1-1-1-1011413/011413_factor_points_FNC_,LOW,MED_LOW,M
> >ED,HIGH,HD,.mp4.csmil/bitrate=11subcategory
> >71Title
> >10^D1-1-111-1-1-1-1-1-111-1-1-1-1-115101-1-1-1-1126112491-1-1-1-1-1-1-1-1-
> >1-1-1-1-1-1-111-1-1-"
> >message.getBytes().slice(0,msgSize)*
> >
> >
> >This makes sure that I have a big message, and I can slice that
> >message to the message size passed in the command line options
> >
> >
> >But, the problem is that when I try running the same with
> >--compression-codec values of 1, 2 or 3, I am still seeing ASCII data
> >(i.e. uncompressed one only)
> >
> >
> >I want to ask whether this is a bug. And, using
> >kafka-producer-perf-test.sh, how can I send my own compressed data ?
> >
> >
> >Thanks,
> >
> >Prabhjot
> >
> >
> --
> ---------------------------------------------------------
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
>
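
Erik's suggestion in the thread (build each message from words picked at random out of a dictionary of roughly a thousand words) could be sketched along these lines. This is only an illustration under stated assumptions, not code from ProducerPerformance.scala: the object and method names (`CompressiblePayloads`, `makeDictionary`, `makeMessage`, `gzipSize`) are hypothetical, `java.util.zip` GZIP stands in for whichever codec the producer is configured with, and concatenating 100 messages is only a rough stand-in for the compound message the producer compresses.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream
import scala.util.Random

// Hypothetical helper object; none of these names exist in Kafka.
object CompressiblePayloads {

  // Build a dictionary of `size` nonsense words, each `wordLen` lowercase letters.
  def makeDictionary(size: Int, wordLen: Int, rnd: Random): Vector[String] =
    Vector.fill(size)(Vector.fill(wordLen)(('a' + rnd.nextInt(26)).toChar).mkString)

  // Build one message of exactly `msgSize` bytes from randomly chosen words.
  // The words are pure ASCII, so one character is one byte.
  def makeMessage(dict: Vector[String], msgSize: Int, rnd: Random): Array[Byte] = {
    val sb = new StringBuilder
    while (sb.length < msgSize) sb.append(dict(rnd.nextInt(dict.size))).append(' ')
    sb.toString.getBytes("US-ASCII").take(msgSize)
  }

  // Size in bytes of `data` after GZIP compression.
  def gzipSize(data: Array[Byte]): Int = {
    val bos = new ByteArrayOutputStream()
    val gz = new GZIPOutputStream(bos)
    gz.write(data)
    gz.close()
    bos.size()
  }

  def main(args: Array[String]): Unit = {
    val rnd = new Random(42)
    val dict = makeDictionary(1000, 12, rnd)
    val msgSize = 1000

    // Concatenate 100 messages as a rough stand-in for one compressed batch.
    val batch = Array.fill(100)(makeMessage(dict, msgSize, rnd)).flatten
    // Same number of bytes, all zeros, like the perf test's default payload.
    val zeros = new Array[Byte](batch.length)

    println(s"raw bytes:          ${batch.length}")
    println(s"all zeros, gzipped: ${gzipSize(zeros)}")
    println(s"dictionary, gzipped: ${gzipSize(batch)}")
  }
}
```

Running `main` prints the raw size next to the two compressed sizes. The all-zero payload should shrink to almost nothing, while the dictionary payload still compresses but much less dramatically, which is the gap Erik describes between a fixed payload and a more realistic one.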