Re: performance test using real data - comparing throughput & latency

Matt Andruff Fri, 15 Sep 2017 17:14:37 -0700

Look, I'm a huge fan of sending identical data and using plain old 'wall
time', averaging a couple of runs to rule out any one-off hiccups.


You can use fancy tools for reporting, but in the real world wall time is
still the most critical factor.  And let's face it, it's also simple to measure.
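
For what it's worth, here is a rough sketch of what I mean by timing a few
runs and averaging them. It assumes you wrap one full pass over the same
dataset in a function; `replay_fixed_dataset` is a made-up placeholder for
your own producer or consumer job, and the helper names are only
illustrative, not part of any Kafka tooling.

```python
import statistics
import time
from typing import Callable, Dict, List


def wall_times(workload: Callable[[], None], runs: int = 3) -> List[float]:
    """Run `workload` `runs` times and return each run's elapsed seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic clock, suited to elapsed time
        workload()
        elapsed.append(time.perf_counter() - start)
    return elapsed


def summarize(elapsed: List[float], records: int) -> Dict[str, float]:
    """Average the runs and derive records/second from the mean wall time."""
    mean_s = statistics.mean(elapsed)
    return {
        "runs": len(elapsed),
        "mean_s": mean_s,
        "min_s": min(elapsed),
        "max_s": max(elapsed),
        "records_per_s": records / mean_s if mean_s > 0 else float("inf"),
    }


def replay_fixed_dataset() -> None:
    # Hypothetical stand-in: replace with the code that sends (or consumes)
    # the same fixed set of records on every run.
    sum(range(100_000))


if __name__ == "__main__":
    times = wall_times(replay_fixed_dataset, runs=3)
    print(summarize(times, records=100_000))
```

Keeping the min and max next to the mean makes an outlier run easy to spot
before you compare one configuration against another.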

I personally would love to hear about your results, if you'd be willing to
share.

On Fri, Sep 15, 2017, 15:01 Revin Chalil <rcha...@expedia.com> wrote:

> Any thoughts on the below will be appreciated. Thanks.
>
>
> On 9/13/17, 5:00 PM, "Revin Chalil" <rcha...@expedia.com> wrote:
>
>     We are testing kafka’s performance with the real prod data and plan to
> test things like the below. We would have producers publishing and
> consumers processing production data on a separate non-prod kafka cluster.
>
>
>       *   Impact of number of Partitions per Topic on throughput and
> latency on Producer & Consumer
>       *   Impact of scaling-up Brokers on throughput and latency
>       *   adding more brokers Vs adding more Disk on existing Brokers. How
> does the network interface usage differ?
>       *   cost of Replication on Throughput and Latency
>       *   impact of Broker vm.swappiness = 60 Vs vm.swappiness = 1
>       *   partitions on a Broker pointing to single Disk Vs multiple Disks
>       *   EXT4 Vs XFS Filesystem on broker
>       *   behavior when Broker “num.io.threads” is
> increased from 8 to higher value
>       *   behavior when Broker “num.network.threads” is increased from 3
> to higher value
>       *   behavior when the data segment size is increased from 1 GB
> (current setting)
>       *   producer “acks = 1” Vs “acks = all” (current setting) impact on
> throughput and latency
>       *   producer sending with Compression enabled (snappy?) Vs sending
> without Compression
>       *   setting producer batch-size (memory based) Vs record-count
> (current setting) per batch sent to Kafka
>       *   impact of message size on throughput
>       *   Consumers fetching records from page-cache Vs fetching records
> from Disk
>
>
>     Ideally, the metrics we would like to compare for each test are
> (please let us know if there is anything else to be compared)
>
>       *   Producer write Throughput
>       *   Producer write Latency (ms)
>       *   Consumption Throughput
>       *   Consumption Latency (ms)
>       *   End-to-end Latency
>
>     What would be the right tools to collect and compare the above metrics
> against different Tests? I have setup kafka-monitor but couldn’t find how
> to track the throughput and latency. Kafka-web-console seems to have some
> of these available? Kafka-Manager? Burrow? Anything else? Thank you.
>
>     Since we are going to use our own producers and consumers, I do not
> think it makes sense to use tools like kafka-consumer-perf-test.sh or
> kafka-producer-perf-test.sh, but please correct if I am wrong.
>
>     Thanks,
>     Revin
>
>
>
>
>
