Re: kafka benchmark tests

Yuheng Du Tue, 14 Jul 2015 08:24:44 -0700

Also, I guess setting the target throughput to -1 means let it be as high
as possible?
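
To make the question concrete, here is a minimal sketch of how a negative target could act as "no throttle". This is an assumption about how such a tool might treat the value, not code taken from ProducerPerformance. The names `run_throttled`, `send`, `clock` and `sleep` are hypothetical.

```python
import time


def run_throttled(send, num_records, target_per_sec,
                  clock=time.monotonic, sleep=time.sleep):
    """Call send(i) num_records times and return the elapsed seconds.

    A target_per_sec of zero or below (for example -1) disables throttling.
    """
    start = clock()
    for i in range(num_records):
        send(i)
        if target_per_sec > 0:
            # Time at which the next record is due at the target rate.
            due = start + (i + 1) / target_per_sec
            delay = due - clock()
            if delay > 0:
                sleep(delay)
    return clock() - start
```

With a positive target the loop sleeps whenever it runs ahead of schedule. With -1 the throttling branch is never taken, so the rate is limited only by how fast `send` returns.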


On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du <yuheng.du.h...@gmail.com>
wrote:

> Thanks. Suppose I set acks=1 in the producer config options, as in:
>
> bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> test7 50000000 100 -1 acks=1
> bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> buffer.memory=67108864 batch.size=8196
>
> Does that mean that for each message generated at the producer, the producer
> will wait until the broker sends the ack back before sending another message?
>
> Thanks.
>
> Yuheng
>
> On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy <ku...@nmsworks.co.in>
> wrote:
>
>> Yes. It is a list of Kafka server host/port pairs used to establish the
>> initial connection to the Kafka cluster.
>>
>> https://kafka.apache.org/documentation.html#newproducerconfigs
>>
>> On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du <yuheng.du.h...@gmail.com>
>> wrote:
>>
>> > Does anyone know what bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
>> > means in the following test command?
>> >
>> > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
>> > test7 50000000 100 -1 acks=1
>> > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
>> > buffer.memory=67108864 batch.size=8196
>> >
>> > What is bootstrap.servers? Is it the Kafka server that I am running the
>> > test against?
>> >
>> > Thanks.
>> >
>> > Yuheng
>> >
>> >
>> >
>> >
>> > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava <e...@confluent.io>
>> > wrote:
>> >
>> > > I implemented (nearly) the same basic set of tests in the system test
>> > > framework we started at Confluent and that is going to move into Kafka --
>> > > see the WIP patch for KIP-25 here: https://github.com/apache/kafka/pull/70
>> > > In particular, that test is implemented in benchmark_test.py:
>> > >
>> > > https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8
>> > >
>> > > Hopefully once that's merged people can reuse that benchmark (and add to
>> > > it!) so they can easily run the same benchmarks across different hardware.
>> > > Here are some results from an older version of that test on m3.2xlarge
>> > > instances on EC2 using local ephemeral storage (I think... it's been a while
>> > > since I ran these numbers and I didn't document methodology that
>> > > carefully):
>> > >
>> > > INFO:_.KafkaBenchmark:=================
>> > > INFO:_.KafkaBenchmark:BENCHMARK RESULTS
>> > > INFO:_.KafkaBenchmark:=================
>> > > INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 rec/sec (65.240000 MB/s)
>> > > INFO:_.KafkaBenchmark:Single producer, async 3x replication: 667494.359673 rec/sec (63.660000 MB/s)
>> > > INFO:_.KafkaBenchmark:Single producer, sync 3x replication: 116485.764275 rec/sec (11.110000 MB/s)
>> > > INFO:_.KafkaBenchmark:Three producers, async 3x replication: 1696519.022182 rec/sec (161.790000 MB/s)
>> > > INFO:_.KafkaBenchmark:Message size:
>> > > INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.620000 MB/s)
>> > > INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.750000 MB/s)
>> > > INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.170000 MB/s)
>> > > INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.210000 MB/s)
>> > > INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.310000 MB/s)
>> > > INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
>> > > INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.300000 MB/s)
>> > > INFO:_.KafkaBenchmark:Single consumer: 701031.140000 rec/sec (56.830500 MB/s)
>> > > INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 MB/s)
>> > > INFO:_.KafkaBenchmark:Producer + consumer:
>> > > INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.600000 MB/s)
>> > > INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.600000 MB/s)
>> > > INFO:_.KafkaBenchmark:End-to-end latency: median 2.000000 ms, 99% 4.000000 ms, 99.9% 19.000000 ms
>> > >
>> > > Don't trust these numbers for anything; they were a quick one-off test. I'm
>> > > just pasting the output so you get some idea of what the results might look
>> > > like. Once we merge the KIP-25 patch, Confluent will be running the tests
>> > > regularly and results will be available publicly so we'll be able to keep
>> > > better tabs on performance, albeit for only a specific class of hardware.
>> > >
>> > > For the batch.size question -- I'm not sure the results in the blog post
>> > > actually have different settings; it could be accidental divergence between
>> > > the script and the blog post. The post specifically notes that tuning the
>> > > batch size in the synchronous case might help, but that he didn't do that.
>> > > If you're trying to benchmark the *optimal* throughput, tuning the batch
>> > > size would make sense. Since synchronous replication will have higher
>> > > latency and there's a limit to how many requests can be in flight at once,
>> > > you'll want a larger batch size to compensate for the additional latency.
>> > > However, in practice the increase you see may be negligible. Somebody who
>> > > has spent more time tweaking producer performance may have more insight.
>> > >
>> > > -Ewen
>> > >
>> > > On Mon, Jul 13, 2015 at 10:08 AM, JIEFU GONG <jg...@berkeley.edu> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I was wondering if any of you have done benchmarks on Kafka
>> > > > performance before, and if they or their details (# nodes in cluster,
>> > > > # records / size(s) of messages, etc.) could be shared.
>> > > >
>> > > > For comparison purposes, I am trying to benchmark Kafka against some
>> > > > similar services such as Kinesis or Scribe. Additionally, I was wondering
>> > > > if anyone could shed some light on Jay Kreps' benchmarks that he has
>> > > > openly published here:
>> > > >
>> > > > https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
>> > > >
>> > > > Specifically, I am unsure why he changed the batch.size between his tests
>> > > > of 3x synchronous replication and 3x async replication, as well as why he
>> > > > is seemingly publishing to incorrect topics:
>> > > >
>> > > > Configs:
>> > > > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
>> > > >
>> > > > Any help is greatly appreciated!
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > Jiefu Gong
>> > > > University of California, Berkeley | Class of 2017
>> > > > B.A Computer Science | College of Letters and Sciences
>> > > >
>> > > > jg...@berkeley.edu <elise...@berkeley.edu> | (925) 400-3427
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks,
>> > > Ewen
>> > >
>> >
>>
>
>
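
As a sanity check on the benchmark output quoted above: the MB/s figures on the "Message size" lines look like rec/sec multiplied by the record size and divided by 1024 * 1024. That formula is inferred from the numbers themselves, not from the benchmark code, and `mb_per_sec` is a hypothetical helper written only for this check.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: "MB" in the log means 2**20 bytes


def mb_per_sec(records_per_sec, record_size_bytes):
    """Convert a record rate and a fixed record size into MB/s."""
    return records_per_sec * record_size_bytes / BYTES_PER_MB


# (record size in bytes, rec/sec, reported MB/s), copied from the quoted log.
REPORTED = [
    (10, 1637825.195625, 15.62),
    (100, 605504.877911, 57.75),
    (1000, 90351.817570, 86.17),
    (10000, 8306.180862, 79.21),
    (100000, 978.403499, 93.31),
]

for size, rate, reported in REPORTED:
    # Print the recomputed value next to the reported one for comparison.
    print(size, round(mb_per_sec(rate, size), 2), reported)
```

If the same relationship holds for the other lines, the "Single producer" results would correspond to 100-byte records, which matches the record size passed in the ProducerPerformance command earlier in the thread.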
