Re: kafka benchmark tests
Jiefu,

Have you tried to run benchmark_test.py? I ran it and it asks me for ducktape.services.service:

    yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python benchmark_test.py
    Traceback (most recent call last):
      File "benchmark_test.py", line 16, in <module>
        from ducktape.services.service import Service
    ImportError: No module named ducktape.services.service

Can you help me get it to work, Ewen? Thanks.

best,
Yuheng

On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava e...@confluent.io wrote:

@Jiefu, yes! The patch is functional; I think it's just waiting on a bit of final review after the last round of changes. You can definitely use it for your own benchmarking, and we'd love to see patches for any additional tests we missed in the first pass!

-Ewen

On Tue, Jul 14, 2015 at 10:53 AM, JIEFU GONG jg...@berkeley.edu wrote:

Yuheng,

I would recommend looking here: http://kafka.apache.org/documentation.html#brokerconfigs and scrolling down to get a better understanding of the default settings and what they mean -- it'll tell you what the different options for acks do.

Ewen,

Thank you immensely for your thoughts; they shed a lot of insight into the issue. Though it is understandable that your specific results need to be verified, it seems that the KIP-25 patch is functional and I can use it for my own benchmarking purposes? Is that correct? Thanks again!

On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Also, I guess setting the target throughput to -1 means let it be as high as possible?
Re: kafka benchmark tests
Hi Geoffrey,

Thank you for your helpful information. Do I have to install the virtual machines? I am using a Mac as the test-driver machine, or I can use a Linux machine to run the test driver too. Thanks.

best,
Yuheng

On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson ge...@confluent.io wrote:

Hi Yuheng,

Running these tests requires a tool we've created at Confluent called 'ducktape', which you need to install with the command:

    pip install ducktape==0.2.0

Running the tests locally requires some setup (creation of virtual machines etc.) which is outlined here: https://github.com/apache/kafka/pull/70/files#diff-62f0ff60ede3b78b9c95624e2f61d6c1

The instructions in the quickstart show you how to run the tests on a cluster of virtual machines (on a single host).

Once you have a cluster up and running, you'll be able to run the test you're interested in:

    cd kafka/tests
    ducktape kafkatest/tests/benchmark_test.py

Definitely keep us posted about which parts are difficult, annoying, or confusing about this process and we'll do our best to help.

Thanks,
Geoff
Re: kafka benchmark tests
Hi Yuheng,

Running these tests requires a tool we've created at Confluent called 'ducktape', which you need to install with the command:

    pip install ducktape==0.2.0

Running the tests locally requires some setup (creation of virtual machines etc.) which is outlined here: https://github.com/apache/kafka/pull/70/files#diff-62f0ff60ede3b78b9c95624e2f61d6c1

The instructions in the quickstart show you how to run the tests on a cluster of virtual machines (on a single host).

Once you have a cluster up and running, you'll be able to run the test you're interested in:

    cd kafka/tests
    ducktape kafkatest/tests/benchmark_test.py

Definitely keep us posted about which parts are difficult, annoying, or confusing about this process and we'll do our best to help.

Thanks,
Geoff

On Wed, Jul 15, 2015 at 12:49 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Jiefu,

Have you tried to run benchmark_test.py? I ran it and it asks me for ducktape.services.service:

    yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python benchmark_test.py
    Traceback (most recent call last):
      File "benchmark_test.py", line 16, in <module>
        from ducktape.services.service import Service
    ImportError: No module named ducktape.services.service

Can you help me get it to work, Ewen? Thanks.

best,
Yuheng
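The ImportError quoted above means the Python interpreter could not find the ducktape package, which is why the pip install step comes first and why the test is started through the ducktape command rather than with python directly. A minimal pre-flight check might look like the sketch below. It assumes Python 3, and the helper name is_importable is made up for illustration; it is not part of ducktape or Kafka.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    # find_spec returns None (instead of raising) when a top-level module is absent.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("ducktape"):
        print("ducktape found; start the test with the ducktape command")
    else:
        # Version pin taken from the install command quoted in this thread.
        print("ducktape missing; install it first: pip install ducktape==0.2.0")
```

Running the check with the same interpreter that pip installs into avoids the common case where the package lands in a different Python environment.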
Re: kafka benchmark tests
Hi Yuheng,

Yes, you should be able to run on either Mac or Linux.

The test cluster consists of a test-driver machine and some number of slave machines. Right now, there are roughly two ways to set up the slave machines:

1) Slave machines are virtual machines *on* the test-driver machine.
2) Slave machines are external to the test-driver machine.

Option 1 is the simplest to set up, but yes, it does require installation of the virtual machines on the test-driver machine. The installation of these machines is outlined in the quickstart I mentioned (here is a better link for the test README: https://github.com/confluentinc/kafka/tree/KAFKA-2276/tests).

The tool we're using to bring up the slave virtual machines is called vagrant, so the vagrant steps in the quickstart are really telling you how to install the virtual machines.

Hope that helps!

Cheers,
Geoff

On Wed, Jul 15, 2015 at 12:13 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Hi Geoffrey,

Thank you for your helpful information. Do I have to install the virtual machines? I am using a Mac as the test-driver machine, or I can use a Linux machine to run the test driver too. Thanks.

best,
Yuheng
Re: kafka benchmark tests
Hi Geoffrey,

Thank you for your detailed explanation. It is really helpful.

I am thinking of going with the second way: since I have bare-metal access to all the nodes in the cluster, it's probably better to run real slave machines instead of virtual machines (correct me if I am wrong). Each of my nodes has 256 GB of RAM and 2 TB of disk space. How large will the slave virtual machines be, and how much memory will they take?

Thank you!

best,
Yuheng

On Wed, Jul 15, 2015 at 4:19 PM, Geoffrey Anderson ge...@confluent.io wrote:

Hi Yuheng,

Yes, you should be able to run on either Mac or Linux.

The test cluster consists of a test-driver machine and some number of slave machines. Right now, there are roughly two ways to set up the slave machines:

1) Slave machines are virtual machines *on* the test-driver machine.
2) Slave machines are external to the test-driver machine.

Option 1 is the simplest to set up, but yes, it does require installation of the virtual machines on the test-driver machine. The installation of these machines is outlined in the quickstart I mentioned (here is a better link for the test README: https://github.com/confluentinc/kafka/tree/KAFKA-2276/tests).

The tool we're using to bring up the slave virtual machines is called vagrant, so the vagrant steps in the quickstart are really telling you how to install the virtual machines.

Hope that helps!

Cheers,
Geoff
Re: kafka benchmark tests
Yes. It is a list of Kafka server host/port pairs to use for establishing the initial connection to the Kafka cluster:
https://kafka.apache.org/documentation.html#newproducerconfigs

On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Does anyone know what bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 means in the following test command?

    bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

What is bootstrap.servers? Is it the Kafka server that I am running a test at? Thanks.

Yuheng

On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava e...@confluent.io wrote:

I implemented (nearly) the same basic set of tests in the system test framework we started at Confluent and that is going to move into Kafka -- see the wip patch for KIP-25 here: https://github.com/apache/kafka/pull/70

In particular, that test is implemented in benchmark_test.py: https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8

Hopefully once that's merged people can reuse that benchmark (and add to it!) so they can easily run the same benchmarks across different hardware. Here are some results from an older version of that test on m3.2xlarge instances on EC2 using local ephemeral storage (I think... it's been a while since I ran these numbers and I didn't document methodology that carefully):

    INFO:_.KafkaBenchmark:=
    INFO:_.KafkaBenchmark:BENCHMARK RESULTS
    INFO:_.KafkaBenchmark:=
    INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 rec/sec (65.24 MB/s)
    INFO:_.KafkaBenchmark:Single producer, async 3x replication: 667494.359673 rec/sec (63.66 MB/s)
    INFO:_.KafkaBenchmark:Single producer, sync 3x replication: 116485.764275 rec/sec (11.11 MB/s)
    INFO:_.KafkaBenchmark:Three producers, async 3x replication: 1696519.022182 rec/sec (161.79 MB/s)
    INFO:_.KafkaBenchmark:Message size:
    INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s)
    INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s)
    INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s)
    INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.21 MB/s)
    INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.31 MB/s)
    INFO:_.KafkaBenchmark:Throughput over long run, data memory:
    INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30 MB/s)
    INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec (56.830500 MB/s)
    INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 MB/s)
    INFO:_.KafkaBenchmark:Producer + consumer:
    INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 MB/s)
    INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 MB/s)
    INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99% 4.00 ms, 99.9% 19.00 ms

Don't trust these numbers for anything; they were a quick one-off test. I'm just pasting the output so you get some idea of what the results might look like. Once we merge the KIP-25 patch, Confluent will be running the tests regularly and results will be available publicly so we'll be able to keep better tabs on performance, albeit for only a specific class of hardware.

For the batch.size question -- I'm not sure the results in the blog post actually have different settings; it could be accidental divergence between the script and the blog post. The post specifically notes that tuning the batch size in the synchronous case might help, but that he didn't do that. If you're trying to benchmark the *optimal* throughput, tuning the batch size would make sense. Since synchronous replication will have higher latency and there's a limit to how many requests can be in flight at once, you'll want a larger batch size to compensate for the additional latency. However, in practice the increase you see may be negligible. Somebody who has spent more time fiddling with tweaking producer performance may have more insight.

-Ewen

On Mon, Jul 13, 2015 at 10:08 AM, JIEFU GONG jg...@berkeley.edu wrote:

Hi all,

I was wondering if any of you guys have done benchmarks on Kafka performance before, and if they or their details (# nodes in cluster, # records / size(s) of messages, etc.) could be shared.

For comparison purposes, I am trying to benchmark Kafka against some similar services such as Kinesis or Scribe. Additionally, I was wondering if anyone could shed some insight on Jay Kreps' benchmarks that he has openly published here: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Specifically, I am unsure of why between his tests of 3x synchronous replication and 3x async
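A quick way to read the rec/sec and MB/s pairs in the quoted results: they are consistent with MB/s = rec/sec × record size in bytes / 2^20. The sketch below only redoes that arithmetic on figures copied from the output; the function names are made up for illustration, and treating "MB" as 2^20 bytes is an inference from the numbers, not something stated in the thread.

```python
MIB = 1024 * 1024  # the quoted MB/s figures line up with bytes / 2**20


def mb_per_sec(records_per_sec: float, record_size_bytes: int) -> float:
    """Throughput in MB/s for a given record rate and record size."""
    return records_per_sec * record_size_bytes / MIB


def implied_record_size(records_per_sec: float, mb_s: float) -> float:
    """Record size in bytes implied by a (rec/sec, MB/s) pair."""
    return mb_s * MIB / records_per_sec


if __name__ == "__main__":
    # "Single producer, no replication" is consistent with 100-byte records.
    print(f"{mb_per_sec(684097.470208, 100):.2f} MB/s")
    # The two slowest rows of the "Message size" block imply the largest records.
    for rec_per_sec, mb_s in [(8306.180862, 79.21), (978.403499, 93.31)]:
        print(f"{implied_record_size(rec_per_sec, mb_s):.0f} bytes per record")
```

The same check works on any other row, which makes it a cheap sanity test when comparing runs across hardware.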
Re: kafka benchmark tests
Thanks. If I set acks=1 in the producer config options in

    bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

does that mean that for each message generated at the producer, the producer will wait until the broker sends the ack back, then send another message?

Thanks.

Yuheng

On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy ku...@nmsworks.co.in wrote:

Yes. It is a list of Kafka server host/port pairs to use for establishing the initial connection to the Kafka cluster:
https://kafka.apache.org/documentation.html#newproducerconfigs
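On the acks question: the new Java producer's send() is asynchronous, so it does not pause after every record. Records are grouped into batches, and several requests can be in flight at once. That in-flight limit is why latency matters for throughput, as the batch.size discussion in this thread points out. The back-of-the-envelope model below is only a sketch: it assumes one broker connection and full batches, ignores compression, linger and broker-side limits, and uses made-up latency values. The function name is hypothetical, and the in-flight count of 5 is the documented producer default as far as I know.

```python
def max_throughput_bytes_per_sec(batch_size_bytes: int,
                                 max_in_flight: int,
                                 round_trip_sec: float) -> float:
    """Upper bound when at most max_in_flight batches are unacknowledged at once."""
    if round_trip_sec <= 0:
        raise ValueError("round_trip_sec must be positive")
    return max_in_flight * batch_size_bytes / round_trip_sec


if __name__ == "__main__":
    batch = 8196      # batch.size from the command in this thread
    in_flight = 5     # assumed in-flight request limit
    # Hypothetical round-trip latencies: faster acks vs. slower synchronous replication.
    for label, latency in [("2 ms", 0.002), ("4 ms", 0.004)]:
        bound = max_throughput_bytes_per_sec(batch, in_flight, latency)
        print(f"latency {label}: about {bound / 2**20:.1f} MB/s at most")
    # Doubling the batch size offsets a doubling of latency in this model.
    doubled = max_throughput_bytes_per_sec(2 * batch, in_flight, 0.004)
    print(f"latency 4 ms, doubled batch: about {doubled / 2**20:.1f} MB/s at most")
```

In this simplified model the bound scales linearly with batch size, which is the reasoning behind using a larger batch when acknowledgements take longer; real gains may be much smaller.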
Re: kafka benchmark tests
Also, I guess setting the target throughput to -1 means let it be as high as possible? On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Thanks. If I set the acks=1 in the producer config options in bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? Does that mean for each message generated at the producer, the producer will wait until the broker sends the ack back, then send another message? Thanks. Yuheng On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy ku...@nmsworks.co.in wrote: Yes, A list of Kafka Server host/port pairs to use for establishing the initial connection to the Kafka cluster https://kafka.apache.org/documentation.html#newproducerconfigs On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Does anyone know what is bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 means in the following test command: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? what is bootstrap.servers? Is it the kafka server that I am running a test at? Thanks. Yuheng On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava e...@confluent.io wrote: I implemented (nearly) the same basic set of tests in the system test framework we started at Confluent and that is going to move into Kafka -- see the wip patch for KIP-25 here: https://github.com/apache/kafka/pull/70 In particular, that test is implemented in benchmark_test.py: https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8 Hopefully once that's merged people can reuse that benchmark (and add to it!) so they can easily run the same benchmarks across different hardware. 
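On the -1 question: a non-positive target is usually read as "no limit", so the producer sends as fast as it can. Below is a minimal Python sketch of that kind of rate check. The function name `should_throttle` is made up for illustration, and this models the idea rather than reproducing the tool's actual code.

```python
def should_throttle(records_sent, elapsed_sec, target_per_sec):
    """Return True if the sender is ahead of the target rate and should pause.

    Assumption: a target of zero or below means "unthrottled", which is how
    passing -1 as the target throughput is interpreted in this sketch.
    """
    if target_per_sec <= 0:
        return False  # no limit requested: send as fast as possible
    if elapsed_sec <= 0:
        return False  # no time has passed yet, nothing to compare against
    return records_sent / elapsed_sec > target_per_sec


print(should_throttle(1000, 1.0, -1))   # unthrottled run
print(should_throttle(1000, 1.0, 500))  # ahead of a 500 rec/sec target
```

As for acks=1: it does not make the producer stop and wait after every message. The new producer's send() is asynchronous and records are grouped into batches; acks only controls when the broker answers a request (with acks=1, once the leader has written to its own log, without waiting for followers).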
Re: kafka benchmark tests
Does anyone know what bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 means in the following test command:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

What is bootstrap.servers? Is it the Kafka server that I am running the test against?

Thanks.
Yuheng

On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava e...@confluent.io wrote:

I implemented (nearly) the same basic set of tests in the system test framework we started at Confluent and that is going to move into Kafka -- see the wip patch for KIP-25 here: https://github.com/apache/kafka/pull/70 In particular, that test is implemented in benchmark_test.py: https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8 Hopefully once that's merged people can reuse that benchmark (and add to it!) so they can easily run the same benchmarks across different hardware.
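To make the shape of that command easier to see, here is a small Python sketch that splits ProducerPerformance-style arguments into named fields. The field names (topic, number of records, record size, target records per second, then key=value producer properties) are my reading of the tool's usage message, so treat them as an assumption. The helper `parse_perf_args` and the `example.com` hosts are hypothetical, not part of Kafka.

```python
def parse_perf_args(argv):
    """Split ProducerPerformance-style arguments into named fields.

    Assumed layout: four positional values followed by any number of
    key=value producer properties.
    """
    topic, num_records, record_size, target = argv[:4]
    props = dict(item.split("=", 1) for item in argv[4:])

    # bootstrap.servers is a comma-separated list of host:port pairs that the
    # client contacts first in order to discover the rest of the cluster.
    servers = []
    for entry in props.get("bootstrap.servers", "").split(","):
        if entry:
            host, port = entry.rsplit(":", 1)
            servers.append((host, int(port)))

    return {
        "topic": topic,
        "num_records": int(num_records),
        "record_size": int(record_size),
        "target_records_sec": int(target),
        "props": props,
        "bootstrap": servers,
    }


args = parse_perf_args([
    "test7", "5000", "100", "-1",
    "acks=1",
    # Hypothetical hosts, used only for illustration.
    "bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092",
    "buffer.memory=67108864",
    "batch.size=8196",
])
print(args["bootstrap"])
```

The list does not have to name every broker; it only needs enough reachable entries for the client to make its first connection.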
Re: kafka benchmark tests
@Jiefu, yes! The patch is functional, I think it's just waiting on a bit of final review after the last round of changes. You can definitely use it for your own benchmarking, and we'd love to see patches for any additional tests we missed in the first pass! -Ewen On Tue, Jul 14, 2015 at 10:53 AM, JIEFU GONG jg...@berkeley.edu wrote: Yuheng, I would recommend looking here: http://kafka.apache.org/documentation.html#brokerconfigs and scrolling down to get a better understanding of the default settings and what they mean -- it'll tell you what different options for acks does. Ewen, Thank you immensely for your thoughts, they shed a lot of insight into the issue. Though it is understandable that your specific results need to be verified, it seems that the KIP-25 patch is functional and I can use it for my own benchmarking purposes? Is that correct? Thanks again! On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Also, I guess setting the target throughput to -1 means let it be as high as possible? On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Thanks. If I set the acks=1 in the producer config options in bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? Does that mean for each message generated at the producer, the producer will wait until the broker sends the ack back, then send another message? Thanks. 
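For anyone running the benchmark themselves: result lines of the shape quoted in this thread ("N rec/sec (M MB/s)") can be sanity-checked, because the two figures together imply a record size. A minimal Python sketch, assuming MB means 2^20 bytes (the sample line is consistent with that) and using a made-up helper name, `implied_record_bytes`:

```python
import re

# Matches the "<records> rec/sec (<megabytes> MB/s)" tail of a result line.
RESULT = re.compile(r"([\d.]+) rec/sec \(([\d.]+) MB/s\)")


def implied_record_bytes(line):
    """Return the record size in bytes implied by a result line, or None."""
    match = RESULT.search(line)
    if match is None:
        return None
    rec_per_sec = float(match.group(1))
    mb_per_sec = float(match.group(2))
    # Assumption: 1 MB = 1024 * 1024 bytes.
    return mb_per_sec * 1024 * 1024 / rec_per_sec


sample = ("INFO:_.KafkaBenchmark:Single producer, no replication: "
          "684097.470208 rec/sec (65.24 MB/s)")
print(round(implied_record_bytes(sample)))  # prints 100
```

If the implied size does not match the record size you configured, the run or the log is probably not what you think it is.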
Re: kafka benchmark tests
Yuheng,

I would recommend looking here: http://kafka.apache.org/documentation.html#brokerconfigs and scrolling down to get a better understanding of the default settings and what they mean -- it'll tell you what the different options for acks do.

Ewen,

Thank you immensely for your thoughts; they shed a lot of light on the issue. Though it is understandable that your specific results need to be verified, it seems that the KIP-25 patch is functional and I can use it for my own benchmarking purposes. Is that correct?

Thanks again!

On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Also, I guess setting the target throughput to -1 means let it be as high as possible?

On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Thanks. If I set the acks=1 in the producer config options in bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? Does that mean for each message generated at the producer, the producer will wait until the broker sends the ack back, then send another message? Thanks. Yuheng

On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy ku...@nmsworks.co.in wrote:

Yes, A list of Kafka Server host/port pairs to use for establishing the initial connection to the Kafka cluster https://kafka.apache.org/documentation.html#newproducerconfigs

On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

Does anyone know what is bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 means in the following test command: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? what is bootstrap.servers? Is it the kafka server that I am running a test at? Thanks.
Re: kafka benchmark tests
I implemented (nearly) the same basic set of tests in the system test framework we started at Confluent and that is going to move into Kafka -- see the WIP patch for KIP-25 here: https://github.com/apache/kafka/pull/70

In particular, that test is implemented in benchmark_test.py: https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8

Hopefully once that's merged people can reuse that benchmark (and add to it!) so they can easily run the same benchmarks across different hardware. Here are some results from an older version of that test on m3.2xlarge instances on EC2 using local ephemeral storage (I think... it's been a while since I ran these numbers and I didn't document the methodology that carefully):

INFO:_.KafkaBenchmark:=
INFO:_.KafkaBenchmark:BENCHMARK RESULTS
INFO:_.KafkaBenchmark:=
INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 rec/sec (65.24 MB/s)
INFO:_.KafkaBenchmark:Single producer, async 3x replication: 667494.359673 rec/sec (63.66 MB/s)
INFO:_.KafkaBenchmark:Single producer, sync 3x replication: 116485.764275 rec/sec (11.11 MB/s)
INFO:_.KafkaBenchmark:Three producers, async 3x replication: 1696519.022182 rec/sec (161.79 MB/s)
INFO:_.KafkaBenchmark:Message size:
INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s)
INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s)
INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s)
INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.21 MB/s)
INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.31 MB/s)
INFO:_.KafkaBenchmark:Throughput over long run, data memory:
INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30 MB/s)
INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec (56.830500 MB/s)
INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 MB/s)
INFO:_.KafkaBenchmark:Producer + consumer:
INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 MB/s)
INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 MB/s)
INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99% 4.00 ms, 99.9% 19.00 ms

Don't trust these numbers for anything; they were a quick one-off test. I'm just pasting the output so you get some idea of what the results might look like. Once we merge the KIP-25 patch, Confluent will be running the tests regularly and results will be available publicly so we'll be able to keep better tabs on performance, albeit for only a specific class of hardware.

For the batch.size question -- I'm not sure the results in the blog post actually have different settings; it could be accidental divergence between the script and the blog post. The post specifically notes that tuning the batch size in the synchronous case might help, but that he didn't do that. If you're trying to benchmark the *optimal* throughput, tuning the batch size would make sense. Since synchronous replication will have higher latency and there's a limit to how many requests can be in flight at once, you'll want a larger batch size to compensate for the additional latency. However, in practice the increase you see may be negligible. Somebody who has spent more time tweaking producer performance may have more insight.

-Ewen

On Mon, Jul 13, 2015 at 10:08 AM, JIEFU GONG jg...@berkeley.edu wrote:

Hi all, I was wondering if any of you guys have done benchmarks on Kafka performance before, and if they or their details (# nodes in cluster, # records / size(s) of messages, etc.) could be shared. For comparison purposes, I am trying to benchmark Kafka against some similar services such as Kinesis or Scribe. Additionally, I was wondering if anyone could shed some insight on Jay Kreps' benchmarks that he has openly published here: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Specifically, I am unsure of why between his tests of 3x synchronous replication and 3x async replication he changed the batch.size, as well as why he is seemingly publishing to incorrect topics: Configs: https://gist.github.com/jkreps/c7ddb4041ef62a900e6c Any help is greatly appreciated!

--
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427

--
Thanks,
Ewen
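The batch-size reasoning in this message can be put as rough arithmetic: if at most N requests are in flight and each takes one round trip of latency L to be acknowledged, a single connection cannot move more than about N x batch_bytes / L. The Python sketch below models only that bound. It assumes one batch per request, and every number in it is invented for illustration rather than measured; `throughput_ceiling_mb_s` is a made-up helper name.

```python
def throughput_ceiling_mb_s(batch_bytes, max_in_flight, round_trip_ms):
    """Upper bound on MB/s for one connection under a simple pipelining model.

    Assumptions: each request carries exactly one full batch, and a request
    slot is freed only when its acknowledgement arrives one round trip later.
    """
    bytes_per_sec = max_in_flight * batch_bytes / (round_trip_ms / 1000.0)
    return bytes_per_sec / (1024 * 1024)


# Illustrative values only: two batch sizes, two acknowledgement latencies.
for batch in (8196, 65536):
    for latency_ms in (2.0, 10.0):
        ceiling = throughput_ceiling_mb_s(batch, 5, latency_ms)
        print(batch, latency_ms, round(ceiling, 2))
```

In this model a fivefold increase in latency cuts the ceiling fivefold, and a larger batch raises it proportionally, which is the compensation described above. Whether the ceiling is actually the bottleneck in a real run is a separate question, which is why the gain may turn out to be negligible.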
kafka benchmark tests
Hi all,

I was wondering if any of you guys have done benchmarks on Kafka performance before, and if they or their details (# nodes in cluster, # records / size(s) of messages, etc.) could be shared.

For comparison purposes, I am trying to benchmark Kafka against some similar services such as Kinesis or Scribe. Additionally, I was wondering if anyone could shed some insight on Jay Kreps' benchmarks that he has openly published here: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Specifically, I am unsure of why between his tests of 3x synchronous replication and 3x async replication he changed the batch.size, as well as why he is seemingly publishing to incorrect topics:

Configs: https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

Any help is greatly appreciated!

--
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427