Hi Bert

What you are describing could be done, at least partially, with the console
producer. It will read from a file and send each line to the Kafka broker. You
could make a really big file, or alter that code to repeat the input a certain
number of times. The source is pretty readable, so I think that might be an
easier route to take.
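
If editing the console producer turns out to be fiddly, another option is a
small standalone producer that replays a sample file however many times you
like. A rough, untested sketch in Scala using the Java producer client (the
broker address, topic name, file name, and repeat count are just placeholders,
and it assumes the kafka-clients jar is on the classpath):

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
    import scala.io.Source

    object FileReplayProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092")
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)

        // Read the sample file once, then replay it as many times as needed
        // instead of building one enormous input file.
        val lines = Source.fromFile("sample-data.txt").getLines().toVector
        val repeats = 100
        for (_ <- 1 to repeats; line <- lines) {
          producer.send(new ProducerRecord[String, String]("test-topic", line))
        }
        producer.close()
      }
    }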

Daniel.

> On 1/07/2014, at 2:07 am, Bert Corderman <bertc...@gmail.com> wrote:
> 
> Daniel,
> 
> 
> 
> We have the same question. We noticed that the compression tests we ran
> using the built-in performance tester were not realistic. I think the on-disk
> compression ratio was 200:1 (yes, that is two hundred to one). I had planned
> to try editing the producer performance tester source to do the following:
> 
> 
> 
> 1. Add an option to read sample data from a provided text file (the thought
>    would be to use a file with 1-5000 rows, whatever I thought my batch size
>    might be).
>
> 2. Load the sample file into an array.
>
> 3. Change the code that creates each message to pull a random row from the
>    array.
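
A rough sketch of what those three steps could look like (hypothetical names;
the real performance tester code is structured differently, this is only the
shape of the change):

    import scala.io.Source
    import scala.util.Random

    // 1. Read the sample data file once, up front.
    val sampleRows: Array[String] =
      Source.fromFile("sample-data.txt").getLines().toArray

    // 2./3. Wherever the tester currently builds a zero-filled payload,
    // pick a random row from the array instead.
    def nextPayload(): Array[Byte] = {
      val row = sampleRows(Random.nextInt(sampleRows.length))
      row.getBytes("UTF-8")
    }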
> 
> 
> 
> I am also not a Scala developer, so it would take me a little bit to figure
> this out.  This is on hold right now, as I am looking at options for
> compressing the message before sending it to Kafka.  We had originally not
> wanted to do this because we assumed we would not get efficient compression
> ratios on a single message; however, we are also talking about sending
> multiple messages from our application as a single Kafka message.  Our
> concern with using Kafka compression is the overhead of decompression on the
> broker to assign IDs.  Here is a good article that describes this:
> http://geekmantra.wordpress.com/2013/03/28/compression-in-kafka-gzip-or-snappy/
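
For reference, a minimal sketch of the application-side compression idea,
assuming a batch of application messages is simply joined with newlines and
gzipped into a single Kafka message payload (the batching format and function
name are made up; the result would be sent with a byte-array serializer):

    import java.io.ByteArrayOutputStream
    import java.util.zip.GZIPOutputStream

    // Gzip a batch of application messages into one byte[] payload, so the
    // producer sends it uncompressed as far as Kafka is concerned and the
    // broker never has to decompress anything.
    def compressBatch(messages: Seq[String]): Array[Byte] = {
      val out = new ByteArrayOutputStream()
      val gzip = new GZIPOutputStream(out)
      gzip.write(messages.mkString("\n").getBytes("UTF-8"))
      gzip.close()
      out.toByteArray
    }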
> 
> 
> 
> But again, we haven’t decided just yet.  We would like to test and evaluate.
> 
> 
> 
> Bert
> 
> 
> On Mon, Jun 30, 2014 at 2:24 AM, Daniel Compton <d...@danielcompton.net>
> wrote:
> 
>> Hi folks
>> 
>> I was doing some performance testing using the built-in Kafka performance
>> tester, and it seems like it sends messages of size n bytes, but with all
>> bytes having the value 0x0. Is that correct? Reading the source seemed to
>> indicate that too, but I'm not a Scala developer, so I could be wrong.
>> 
>> Would this affect the performance compared to a real-world scenario?
>> Obviously you will get very efficient compression rates, but apart from
>> that, are there likely to be optimisations carried out anywhere between the
>> JVM and the network card that won't hold for messages with non-zero entropy?
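
To illustrate the difference in question, assuming the tester allocates its
payload roughly like the first line below (JVM byte arrays are always
zero-initialised, so such a payload compresses almost perfectly):

    import scala.util.Random

    val messageSize = 1024  // placeholder size

    // A freshly allocated JVM byte array is all 0x0, the best possible case
    // for any compression codec.
    val zeroPayload = new Array[Byte](messageSize)

    // Random bytes are the opposite extreme: essentially incompressible,
    // which is closer to a worst case than to a real workload.
    val randomPayload = new Array[Byte](messageSize)
    Random.nextBytes(randomPayload)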
>> 
>> We're going to test this against our production workload, so it's not a big
>> deal for us, but I wondered if this could give others skewed results.
>> 
>> ---
>> Daniel
